QUANTIZED LARGE LANGUAGE MODELS FOR MENTAL HEALTH APPLICATIONS: A BENCHMARK STUDY ON EFFICIENCY, ACCURACY AND RESOURCE ALLOCATION
| dc.contributor.advisor | Gaur, Dr. Manas | |
| dc.contributor.advisor | Raff, Dr. Edward | |
| dc.contributor.author | Jannumahanti, Aayush | |
| dc.contributor.department | Computer Science and Electrical Engineering | |
| dc.contributor.program | Computer Science | |
| dc.date.accessioned | 2025-02-13T15:35:01Z | |
| dc.date.available | 2025-02-13T15:35:01Z | |
| dc.date.issued | 2024-01-01 | |
| dc.description.abstract | Quantization is a technique that compresses numerical representations to reduce space requirements, though it may sacrifice some precision. Although this lossy compression can improve efficiency, it often comes at the cost of performance. Large Language Models (LLMs) are computationally intensive, posing challenges for users with limited hardware resources. However, advances such as QLoRA fine-tuning, LLM.int8(), the GGUF and GGML formats used by llama.cpp, and a range of quantization data types (8- and 4-bit integer, NF4, FP16/32/64, BF16, as implemented in libraries such as bitsandbytes) have democratized access to LLMs by reducing the resource burden. LLM weights are typically stored as floating-point numbers, and quantization reduces the precision of these weights to decrease the model’s resource requirements. While this can significantly reduce model size, it may also impact accuracy due to the compressed representation of the weights: lower-precision quantization yields smaller models but may diminish performance. The findings from this research provide critical insights into the viability of quantized LLMs in sensitive domains such as mental health, and they highlight the importance of balancing explanation quality with computational efficiency. This benchmarking effort lays the groundwork for deploying effective, resource-efficient LLMs in mental health applications, ultimately supporting professionals and patients with reliable AI-driven insights. As the study progresses, models are trained sequentially in groups, categorized by family, such as LLaMA, Phi, Mixtral, Hermes, Falcon, Gemma, Qwen, and others. This research explores the trade-off between weight precision and model accuracy, aiming to better understand the challenges and potential of quantized LLMs in mental health applications. All models were trained and tested with the generous support of the University of Maryland, Baltimore County’s High-Performance Computing Facility, which provided GPU-accelerated resources. (A minimal illustrative sketch of 4-bit quantized loading follows this record.) | |
| dc.format | application/pdf | |
| dc.genre | thesis | |
| dc.identifier | doi:10.13016/m2xnyi-awtw | |
| dc.identifier.other | 12994 | |
| dc.identifier.uri | http://hdl.handle.net/11603/37638 | |
| dc.language | en | |
| dc.relation.isAvailableAt | The University of Maryland, Baltimore County (UMBC) | |
| dc.relation.ispartof | UMBC Computer Science and Electrical Engineering Department Collection | |
| dc.relation.ispartof | UMBC Theses and Dissertations Collection | |
| dc.relation.ispartof | UMBC Graduate School Collection | |
| dc.relation.ispartof | UMBC Student Collection | |
| dc.rights | This item may be protected under Title 17 of the U.S. Copyright Law. It is made available by UMBC for non-commercial research and education. For permission to publish or reproduce, please see http://aok.lib.umbc.edu/specoll/repro.php or contact Special Collections at speccoll(at)umbc.edu | |
| dc.source | Original File Name: Jannumahanti_umbc_0434M_12994.pdf | |
| dc.subject | Artificial Intelligence (AI) | |
| dc.subject | Deep Learning | |
| dc.subject | Healthcare AI | |
| dc.subject | High-Performance Computing | |
| dc.subject | Machine Learning | |
| dc.subject | Quantized Large Language Models | |
| dc.title | QUANTIZED LARGE LANGUAGE MODELS FOR MENTAL HEALTH APPLICATIONS: A BENCHMARK STUDY ON EFFICIENCY, ACCURACY AND RESOURCE ALLOCATION | |
| dc.type | Text | |
| dcterms.accessRights | Distribution Rights granted to UMBC by the author. |
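As a concrete companion to the quantization data types named in the abstract, the following is a minimal sketch of loading a causal LLM with 4-bit NF4 weights using bitsandbytes through Hugging Face Transformers. The checkpoint name is a placeholder and the configuration is an assumption for illustration; the thesis record does not publish the author's actual loading code or benchmarked checkpoints.

```python
# Minimal sketch: 4-bit NF4 quantized loading via bitsandbytes and
# Hugging Face Transformers. The checkpoint below is a placeholder,
# not necessarily one of the thesis's benchmarked models.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

quant_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # store linear-layer weights in 4 bits
    bnb_4bit_quant_type="nf4",              # NormalFloat4 data type
    bnb_4bit_compute_dtype=torch.bfloat16,  # dequantize to BF16 for matmuls
    bnb_4bit_use_double_quant=True,         # also quantize the quantization constants
)

model_id = "meta-llama/Llama-2-7b-hf"  # placeholder checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=quant_config,
    device_map="auto",  # place layers on available GPU(s) automatically
)

# Rough check of the memory footprint after quantization.
print(f"~{model.get_memory_footprint() / 1e9:.1f} GB in memory")
```

This sketch reflects the trade-off the abstract describes: NF4 stores weights in a 4-bit format whose quantization levels follow a normal distribution, shrinking the model roughly fourfold relative to FP16, while computation is still carried out in BF16, so some accuracy loss from the compressed weight representation remains possible.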
Files

License bundle (1 of 1)
- Name: Jannumahanti-Aayush_Oopen.pdf
- Size: 289.08 KB
- Format: Adobe Portable Document Format