QUANTIZED LARGE LANGUAGE MODELS FOR MENTAL HEALTH APPLICATIONS: A BENCHMARK STUDY ON EFFICIENCY, ACCURACY AND RESOURCE ALLOCATION

dc.contributor.advisor: Gaur, Dr. Manas
dc.contributor.advisor: Raff, Dr. Edward
dc.contributor.author: Jannumahanti, Aayush
dc.contributor.department: Computer Science and Electrical Engineering
dc.contributor.program: Computer Science
dc.date.accessioned: 2025-02-13T15:35:01Z
dc.date.available: 2025-02-13T15:35:01Z
dc.date.issued: 2024-01-01
dc.description.abstract: Quantization is a technique that compresses numerical representations to reduce storage requirements, though it may sacrifice some precision. Although this lossy compression can improve efficiency, it often comes at the cost of performance. Large Language Models (LLMs) are computationally intensive, posing challenges for users with limited hardware resources. However, advances in fine-tuning and inference techniques such as QLoRA, LLM.int8(), GGUF, GGML, and llama.cpp, together with various quantization schemes (8/4-bit, NF4, FP16/32/64, BF16, bitsandbytes), have democratized access to LLMs by reducing this resource burden. LLM weights are typically stored as floating-point numbers, and quantization reduces the precision of these weights to decrease the model's resource requirements. While this can significantly reduce model size, it may also impact accuracy because the weights are represented with fewer bits: lower-precision quantization yields smaller models but may diminish performance. This research explores the trade-off between weight precision and model accuracy, aiming to better understand the challenges and potential of quantized LLMs in mental health applications. Models are trained sequentially in groups, categorized by family, such as LLaMA, Phi, Mixtral, Hermes, Falcon, Gemma, Qwen, and others. The findings provide critical insights into the viability of using quantized LLMs in sensitive domains like mental health, and they highlight the importance of balancing explanation quality with computational efficiency. This benchmarking effort lays the groundwork for deploying effective and resource-efficient LLMs in mental health applications, ultimately supporting professionals and patients with reliable AI-driven insights. All models were trained and tested with the generous support of the University of Maryland, Baltimore County's High-Performance Computing Facility, which provided GPU-accelerated resources.
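
To make the precision trade-off described in the abstract concrete, the following is a minimal sketch of symmetric per-tensor 8-bit weight quantization in PyTorch. This is an illustrative assumption, not code from the thesis; the thesis itself benchmarks richer schemes such as NF4 and bitsandbytes' LLM.int8(), and the tensor and function names below are hypothetical.

    import torch

    def quantize_int8(w: torch.Tensor):
        # Symmetric per-tensor quantization: one scale factor maps the
        # largest absolute weight onto the int8 range [-127, 127].
        scale = w.abs().max() / 127.0
        q = torch.round(w / scale).clamp(-127, 127).to(torch.int8)
        return q, scale

    def dequantize(q: torch.Tensor, scale: torch.Tensor) -> torch.Tensor:
        # Recover an approximation of the original weights; the residual
        # is the rounding error that can degrade model accuracy.
        return q.to(torch.float32) * scale

    w = torch.randn(1024, 1024)  # stand-in for one FP32 weight matrix
    q, scale = quantize_int8(w)
    err = (w - dequantize(q, scale)).abs().max()
    print(f"int8 storage: {q.nelement()} bytes vs FP32: {4 * w.nelement()} bytes")
    print(f"max reconstruction error: {err.item():.6f}")

The sketch shows both effects the abstract describes: storage drops to a quarter of the FP32 footprint, while the nonzero reconstruction error is the precision loss that can diminish downstream performance.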
dc.format: application/pdf
dc.genre: thesis
dc.identifier: doi:10.13016/m2xnyi-awtw
dc.identifier.other: 12994
dc.identifier.uri: http://hdl.handle.net/11603/37638
dc.language: en
dc.relation.isAvailableAt: The University of Maryland, Baltimore County (UMBC)
dc.relation.ispartof: UMBC Computer Science and Electrical Engineering Department Collection
dc.relation.ispartof: UMBC Theses and Dissertations Collection
dc.relation.ispartof: UMBC Graduate School Collection
dc.relation.ispartof: UMBC Student Collection
dc.rights: This item may be protected under Title 17 of the U.S. Copyright Law. It is made available by UMBC for non-commercial research and education. For permission to publish or reproduce, please see http://aok.lib.umbc.edu/specoll/repro.php or contact Special Collections at speccoll(at)umbc.edu
dc.source: Original File Name: Jannumahanti_umbc_0434M_12994.pdf
dc.subject: Artificial Intelligence (AI)
dc.subject: Deep Learning
dc.subject: Healthcare AI
dc.subject: High-Performance Computing
dc.subject: Machine Learning
dc.subject: Quantized Large Language Models
dc.title: QUANTIZED LARGE LANGUAGE MODELS FOR MENTAL HEALTH APPLICATIONS: A BENCHMARK STUDY ON EFFICIENCY, ACCURACY AND RESOURCE ALLOCATION
dc.type: Text
dcterms.accessRights: Distribution Rights granted to UMBC by the author.

Files

Original bundle

Name: Jannumahanti_umbc_0434M_12994.pdf
Size: 998.21 KB
Format: Adobe Portable Document Format

Name: Supplements.zip
Size: 402.77 KB
Format: Unknown data format

License bundle

Name: Jannumahanti-Aayush_Oopen.pdf
Size: 289.08 KB
Format: Adobe Portable Document Format