QUANTIZED LARGE LANGUAGE MODELS FOR MENTAL HEALTH APPLICATIONS: A BENCHMARK STUDY ON EFFICIENCY, ACCURACY AND RESOURCE ALLOCATION
| dc.contributor.advisor | Gaur, Dr. Manas | |
| dc.contributor.advisor | Raff, Dr. Edward | |
| dc.contributor.author | Jannumahanti, Aayush | |
| dc.contributor.department | Computer Science and Electrical Engineering | |
| dc.contributor.program | Computer Science | |
| dc.date.accessioned | 2025-02-13T15:35:01Z | |
| dc.date.available | 2025-02-13T15:35:01Z | |
| dc.date.issued | 2024-01-01 | |
| dc.description.abstract | Quantization is a technique that compresses numerical representations to reduce space requirements, though it may sacrifice some precision. Although this lossy compression can improve efficiency, it often comes at the cost of performance. Large Language Models (LLMs) are computationally intensive, posing challenges for users with limited hardware resources. However, advances such as QLoRA fine-tuning, LLM.int8(), the GGUF and GGML formats used by llama.cpp, and a range of quantization data types (8- and 4-bit integer, NF4, FP16/32/64, BF16, as implemented in libraries such as bitsandbytes) have democratized access to LLMs by reducing the resource burden. LLM weights are typically stored as floating-point numbers, and quantization reduces the precision of these weights to decrease the model’s resource requirements. While this can significantly reduce model size, it may also impact accuracy due to the compressed representation of the weights: lower-precision quantization yields smaller models but may diminish performance. The findings from this research provide critical insights into the viability of quantized LLMs in sensitive domains such as mental health, and they highlight the importance of balancing explanation quality with computational efficiency. This benchmarking effort lays the groundwork for deploying effective, resource-efficient LLMs in mental health applications, ultimately supporting professionals and patients with reliable AI-driven insights. As the study progresses, models are trained sequentially in groups, categorized by family, such as LLaMA, Phi, Mixtral, Hermes, Falcon, Gemma, Qwen, and others. This research explores the trade-off between weight precision and model accuracy, aiming to better understand the challenges and potential of quantized LLMs in mental health applications. All models were trained and tested with the generous support of the University of Maryland, Baltimore County’s High-Performance Computing Facility, which provided GPU-accelerated resources. (A minimal illustrative sketch of 4-bit quantized loading follows this record.) | |
| dc.format | application/pdf | |
| dc.genre | thesis | |
| dc.identifier | doi:10.13016/m2xnyi-awtw | |
| dc.identifier.other | 12994 | |
| dc.identifier.uri | http://hdl.handle.net/11603/37638 | |
| dc.language | en | |
| dc.relation.isAvailableAt | The University of Maryland, Baltimore County (UMBC) | |
| dc.relation.ispartof | UMBC Computer Science and Electrical Engineering Department Collection | |
| dc.relation.ispartof | UMBC Theses and Dissertations Collection | |
| dc.relation.ispartof | UMBC Graduate School Collection | |
| dc.relation.ispartof | UMBC Student Collection | |
| dc.rights | This item may be protected under Title 17 of the U.S. Copyright Law. It is made available by UMBC for non-commercial research and education. For permission to publish or reproduce, please see http://aok.lib.umbc.edu/specoll/repro.php or contact Special Collections at speccoll(at)umbc.edu | |
| dc.source | Original File Name: Jannumahanti_umbc_0434M_12994.pdf | |
| dc.subject | Artificial Intelligence (AI) | |
| dc.subject | Deep Learning | |
| dc.subject | Healthcare AI | |
| dc.subject | High-Performance Computing | |
| dc.subject | Machine Learning | |
| dc.subject | Quantized Large Language Models | |
| dc.title | QUANTIZED LARGE LANGUAGE MODELS FOR MENTAL HEALTH APPLICATIONS: A BENCHMARK STUDY ON EFFICIENCY, ACCURACY AND RESOURCE ALLOCATION | |
| dc.type | Text | |
| dcterms.accessRights | Distribution Rights granted to UMBC by the author. |
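As a concrete companion to the quantization data types named in the abstract, the following is a minimal sketch of loading a causal LLM with 4-bit NF4 weights using bitsandbytes through Hugging Face Transformers. The checkpoint name is a placeholder and the configuration is an assumption for illustration; the thesis record does not publish the author's actual loading code or benchmarked checkpoints.

```python
# Minimal sketch: 4-bit NF4 quantized loading via bitsandbytes and
# Hugging Face Transformers. The checkpoint below is a placeholder,
# not necessarily one of the thesis's benchmarked models.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

quant_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # store linear-layer weights in 4 bits
    bnb_4bit_quant_type="nf4",              # NormalFloat4 data type
    bnb_4bit_compute_dtype=torch.bfloat16,  # dequantize to BF16 for matmuls
    bnb_4bit_use_double_quant=True,         # also quantize the quantization constants
)

model_id = "meta-llama/Llama-2-7b-hf"  # placeholder checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=quant_config,
    device_map="auto",  # place layers on available GPU(s) automatically
)

# Rough check of the memory footprint after quantization.
print(f"~{model.get_memory_footprint() / 1e9:.1f} GB in memory")
```

This sketch reflects the trade-off the abstract describes: NF4 stores weights in a 4-bit format whose quantization levels follow a normal distribution, shrinking the model roughly fourfold relative to FP16, while computation is still carried out in BF16, so some accuracy loss from the compressed weight representation remains possible.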
Files

License bundle (1 of 1)
- Name: Jannumahanti-Aayush_Oopen.pdf
- Size: 289.08 KB
- Format: Adobe Portable Document Format