Classification of Social Media Data for Suicidal Ideation

Author/Creator ORCID

Date

2017-01-01

Type of Work

Department

Computer Science and Electrical Engineering

Program

Computer Science

Citation of Original Publication

Rights

This item may be protected under Title 17 of the U.S. Copyright Law. It is made available by UMBC for non-commercial research and education. For permission to publish or reproduce, please see http://aok.lib.umbc.edu/specoll/repro.php or contact Special Collections at speccoll(at)umbc.edu
Distribution Rights granted to UMBC by the author.

Abstract

Social media use continues to grow worldwide, and an ever-growing number of people use social media platforms to update their social circles on their mental health challenges and suicidal ideation in real time. A number of psychological studies show that expressing suicidal thoughts and attempting suicide occur within a matter of hours of each other; automatic detection and analysis of social media posts by vulnerable users therefore serves as a critical, real-time window into their health and safety. In this thesis, we are interested in classifying data from Twitter and Reddit as "Suicidal Risky Expression" or "Non-Risky Expression". We propose a method that includes automatically collecting tweets (Twitter data) and posts (Reddit data) based on suicidal vocabulary, parsing and tokenizing the collected text, passing it through a trained neural network, and segregating it into two classes, "Suicidal data" and "Non-suicidal data". Since this process runs in real time, the classified data can be used to support at-risk users, either by reporting suicidal content to behavioral crisis response teams or by connecting people to mental-health support resources in real time.
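
The stages named in the abstract (vocabulary-based collection, tokenization, scoring by a model, and assignment to one of two classes) can be illustrated with a minimal Python sketch. It is not the thesis's implementation. The term list, the weights, the bias, and the function names are all hypothetical, and a single sigmoid unit with hand-set weights stands in for the trained neural network, whose architecture and training data are not described here.

```python
import math
import re

# Hypothetical screening vocabulary; the thesis's actual term list is not given.
SCREENING_TERMS = {"hopeless", "suicide", "die"}

# Assumed placeholder weights standing in for a trained neural network.
WEIGHTS = {"hopeless": 1.5, "suicide": 2.0, "die": 1.0, "movie": -1.5, "game": -1.5}
BIAS = -1.0


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z']+", text.lower())


def matches_vocabulary(tokens):
    """Collection step: keep only posts that contain a screening term."""
    return any(tok in SCREENING_TERMS for tok in tokens)


def risk_score(tokens):
    """Single sigmoid unit over token weights; a stand-in for the trained model."""
    z = BIAS + sum(WEIGHTS.get(tok, 0.0) for tok in tokens)
    return 1.0 / (1.0 + math.exp(-z))


def classify(text, threshold=0.5):
    """Return a class label, or None if the post would not have been collected."""
    tokens = tokenize(text)
    if not matches_vocabulary(tokens):
        return None
    if risk_score(tokens) >= threshold:
        return "Suicidal Risky Expression"
    return "Non-Risky Expression"
```

Under these assumed weights, a post such as "That movie made me die laughing" passes the vocabulary filter but is scored below the threshold, which illustrates why keyword matching alone is not sufficient and a classification step follows collection.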