A system for collection and analysis of opinions in microblog data: a text mining approach

Date

2013-02-27

Department

Towson University. Department of Computer and Information Sciences

Rights

Copyright protected, all rights reserved.
There are no restrictions on access to this document. An internet release form signed by the author to display this document online is on file with Towson University Special Collections and Archives.

Abstract

Microblogging has become a very popular communication platform among Internet users. Microblogging applications are rich sources of data for text mining, opinion mining, and sentiment analysis, and their services are increasingly used by organizations and political parties for marketing and public relations. Political parties want to know whether people support their programs. Social organizations seek people's opinions on current debates. All of this information can be obtained from microblogging services, because their users post every day about what they like and dislike and share their opinions on many aspects of their lives. In our paper, we focus on using Twitter, the most popular microblogging platform, for text mining and opinion mining, commonly known as sentiment analysis. We propose a system to acquire, manage, manipulate, and analyze microblog data and to report the results. We discuss and apply various text processing techniques for opinion mining, and we apply several machine learning algorithms to determine whether bloggers have an opinion on a particular issue. In this study, we collected over 9,256,819 tweets on the issue of same-sex marriage in Maryland and across the USA. Using Naive Bayes and support vector machine classifiers, we find that we can identify opinionated tweets with accuracies of 90% and 55%, respectively.
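To make the classification step concrete, the following is a minimal sketch of a multinomial Naive Bayes classifier that separates opinionated tweets from neutral ones. It is not the system described in the abstract: the tokenizer, the class name `NaiveBayes`, the label names, and the six training tweets are all hypothetical and chosen only for illustration. The sketch assumes a bag-of-words representation with add-one (Laplace) smoothing.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase a tweet and split it into word tokens, keeping # and @ prefixes."""
    return re.findall(r"[#@]?\w+", text.lower())


class NaiveBayes:
    """Multinomial Naive Bayes with add-one smoothing (illustrative only)."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)       # number of tweets per class
        self.word_counts = defaultdict(Counter)   # token counts per class
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            # Start from the log prior, then add smoothed log likelihoods.
            score = math.log(n_docs / total_docs)
            n_words = sum(self.word_counts[label].values())
            for token in tokenize(text):
                if token not in self.vocab:
                    continue  # ignore tokens never seen in training
                count = self.word_counts[label][token]
                score += math.log((count + 1) / (n_words + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical toy training data; the study's actual tweets are not reproduced here.
train = [
    ("I fully support marriage equality", "opinionated"),
    ("This law is wrong and I oppose it", "opinionated"),
    ("I love that the bill passed", "opinionated"),
    ("Vote scheduled for Tuesday in Annapolis", "neutral"),
    ("Senate committee hearing begins at noon", "neutral"),
    ("Bill text published on the state website", "neutral"),
]
texts, labels = zip(*train)
model = NaiveBayes().fit(texts, labels)

print(model.predict("I oppose this bill"))
print(model.predict("Committee hearing begins Tuesday"))
```

A real pipeline would train on a much larger labeled sample and report accuracy on held-out tweets; the toy data here only shows how word counts per class drive the decision.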