Predictive Analytics in Mental Health Leveraging LLM Embeddings and Machine Learning Models for Social Media Analysis

Authors: Radwan, Ahmad; Amarneh, Mohannad; Alawneh, Hussam; Ashqar, Huthaifa; AlSobeh, Anas; Magableh, Aws Abed Al Raheem
Publisher: IGI Global
Date issued: 2024-01-01
Record date: 2024-10-28
Language: en
Type: Text
Extent: 22 pages
Citation: Radwan, Ahmad, Mohannad Amarneh, Hussam Alawneh, Huthaifa I. Ashqar, Anas AlSobeh, and Aws Abed Al Raheem Magableh. “Predictive Analytics in Mental Health Leveraging LLM Embeddings and Machine Learning Models for Social Media Analysis.” International Journal of Web Services Research (IJWSR) 21, no. 1 (January 1, 2024): 1–22. https://doi.org/10.4018/IJWSR.338222.
DOI: https://doi.org/10.4018/IJWSR.338222
Handle: http://hdl.handle.net/11603/36812
License: Attribution 4.0 International, https://creativecommons.org/licenses/by/4.0/

Abstract: The prevalence of stress-related disorders has increased significantly in recent years, necessitating scalable methods to identify affected individuals. This paper proposes a novel approach that combines large language models (LLMs), with a focus on embeddings from OpenAI's generative pre-trained transformer (GPT-3), and machine learning (ML) algorithms to classify social media posts as indicative of stress disorders or not. The aim is to create a preliminary screening tool that leverages online textual data. GPT-3 embeddings transformed posts into vector representations capturing semantic meaning and linguistic nuances. Various models, including support vector machines, random forests, XGBoost, k-nearest neighbors (KNN), and neural networks, were trained on a dataset of more than 10,000 labeled social media posts. The top model, a support vector machine, achieved 83% accuracy in classifying posts displaying signs of stress.
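The embed-then-classify pipeline described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. The `embed` function below is a bag-of-words stand-in for GPT-3 embeddings so that the example runs without an API key. The classifier is a small k-nearest-neighbors vote, one of the model families the abstract lists, not the support vector machine reported as the top model. The posts, labels, and all function names are hypothetical and invented for illustration.

```python
import math
from collections import Counter

# Hypothetical labeled posts, invented for illustration (1 = stress indicated, 0 = not).
POSTS = [
    ("i cannot sleep and feel overwhelmed by work", 1),
    ("deadlines keep piling up and i feel anxious", 1),
    ("so overwhelmed and anxious i cannot focus", 1),
    ("had a lovely picnic in the park today", 0),
    ("great movie night with friends", 0),
    ("enjoyed a lovely walk with friends today", 0),
]

# Toy vocabulary built from the training posts. A real embedding model
# would not need this step; it is only here to keep the sketch offline.
VOCAB = {
    token: index
    for index, token in enumerate(
        sorted({token for text, _ in POSTS for token in text.split()})
    )
}


def embed(text):
    """Stand-in for an embedding call: unit-length bag-of-words vector."""
    vec = [0.0] * len(VOCAB)
    for token in text.lower().split():
        if token in VOCAB:  # tokens outside the toy vocabulary are ignored
            vec[VOCAB[token]] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0  # avoid dividing by zero
    return [v / norm for v in vec]


def cosine(a, b):
    # Both vectors are unit length, so the dot product is the cosine similarity.
    return sum(x * y for x, y in zip(a, b))


def knn_predict(train, query_vec, k=3):
    """Majority vote among the k training vectors most similar to the query."""
    ranked = sorted(train, key=lambda item: cosine(item[0], query_vec), reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Step 1: turn every labeled post into a vector. Step 2: classify new posts.
TRAIN = [(embed(text), label) for text, label in POSTS]


def classify(text, k=3):
    return knn_predict(TRAIN, embed(text), k)


print(classify("i feel anxious and overwhelmed"))  # 1
print(classify("lovely picnic with friends"))      # 0
```

In a setting closer to the one the abstract describes, `embed` would presumably be replaced by a call to an embedding service and `knn_predict` by a library classifier trained on the full labeled dataset. The two-step structure (vectorize, then classify) would stay the same.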