Predictive Analytics in Mental Health Leveraging LLM Embeddings and Machine Learning Models for Social Media Analysis

Date

2024-01-01

Citation of Original Publication

Radwan, Ahmad, Mohannad Amarneh, Hussam Alawneh, Huthaifa I. Ashqar, Anas AlSobeh, and Aws Abed Al Raheem Magableh. “Predictive Analytics in Mental Health Leveraging LLM Embeddings and Machine Learning Models for Social Media Analysis.” International Journal of Web Services Research (IJWSR) 21, no. 1 (January 1, 2024): 1–22. https://doi.org/10.4018/IJWSR.338222.

Rights

Attribution 4.0 International

Abstract

The prevalence of stress-related disorders has increased significantly in recent years, necessitating scalable methods to identify affected individuals. This paper proposes a novel approach that combines large language model (LLM) embeddings, specifically those from OpenAI's generative pre-trained transformer (GPT-3), with machine learning (ML) algorithms to classify social media posts as indicative of stress disorders or not. The aim is to create a preliminary screening tool that leverages online textual data. GPT-3 embeddings transformed posts into vector representations that capture semantic meaning and linguistic nuances. Various models, including support vector machines, random forests, XGBoost, k-nearest neighbors (KNN), and neural networks, were trained on a dataset of more than 10,000 labeled social media posts. The top model, a support vector machine, achieved 83% accuracy in classifying posts displaying signs of stress.
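
The embed-then-classify pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The three-dimensional vectors below are made-up placeholders standing in for GPT-3 embeddings (real embeddings have far more dimensions and would come from an embedding service), the labels are invented, and a small hand-written KNN classifier is used only because it is one of the model families the abstract lists and needs no external library. The function names `cosine_similarity`, `knn_predict`, and `accuracy` are hypothetical.

```python
import math
from collections import Counter


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def knn_predict(train_vectors, train_labels, query, k=3):
    """Label a query vector by majority vote among its k most similar neighbors."""
    ranked = sorted(
        zip(train_vectors, train_labels),
        key=lambda pair: cosine_similarity(pair[0], query),
        reverse=True,  # most similar first
    )
    top_labels = [label for _, label in ranked[:k]]
    return Counter(top_labels).most_common(1)[0][0]


def accuracy(predictions, labels):
    """Fraction of predictions that match the true labels."""
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


# Placeholder "embeddings" (assumption: invented toy values, not GPT-3 output).
train_vectors = [
    [0.9, 0.1, 0.0],
    [0.8, 0.2, 0.1],
    [0.7, 0.3, 0.0],
    [0.1, 0.9, 0.2],
    [0.0, 0.8, 0.3],
    [0.2, 0.7, 0.1],
]
train_labels = ["stress", "stress", "stress",
                "no_stress", "no_stress", "no_stress"]

# Held-out posts, again represented by invented vectors.
test_vectors = [[0.85, 0.15, 0.05], [0.05, 0.95, 0.10]]
test_labels = ["stress", "no_stress"]

predictions = [knn_predict(train_vectors, train_labels, v) for v in test_vectors]
print(predictions, accuracy(predictions, test_labels))
```

In a realistic setting the placeholder vectors would be replaced by embeddings computed for each post, and the classifier would be swapped for a tuned model such as the support vector machine the abstract reports as the top performer. Accuracy on this toy data says nothing about the 83% figure reported in the paper.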