A study on textual contents in online communities and social media using text mining approaches

dc.contributor.advisor: Kim, Yanggon
dc.contributor.author: Hong, Beomseok
dc.contributor.department: Towson University. Department of Computer and Information Sciences [en_US]
dc.date.accessioned: 2018-05-29T20:39:43Z
dc.date.available: 2018-05-29T20:39:43Z
dc.date.issued: 2018-05-29
dc.date.submitted: 2017-12
dc.description: (D. Sc.) -- Towson University, 2017 [en_US]
dc.description.abstract: With the advent of Web 2.0, users have become more interactive, and the volume of user-generated content (UGC) on the web has increased drastically. Among various Web 2.0 applications, we focus on textual content in social media and online question answering communities. Twitter has become one of the fastest growing social media sites and serves as a form of electronic word-of-mouth (eWOM) that affects customers’ buying decisions through shared opinions and information about brands. However, lexical ambiguity is an obstacle to analyzing social media data for online reputation management. The enormous number of tweets makes it impossible for a human to disambiguate them manually. Therefore, we propose an automated company name discrimination method using topic signatures. In our experiment, we found that news articles can be used to extract topic signatures, and that these topic signatures improved the company name discrimination result compared to the baseline. Community Question Answering (CQA) sites are knowledge sharing platforms that allow users to post questions and answer questions asked by other users. There is a time lag between questions and answers: askers need to wait for answers, and some questions are never answered. To address this problem, we propose a weighted question retrieval method using the relationship between titles and descriptions. In our experiment, we found that exploiting the question descriptions improved the ranks of the relevant questions while reducing their recall. Software information sites such as Stack Overflow, Super User, and Ask Ubuntu are specialized CQA sites that accept software-related questions and provide tagging systems. Tagging systems help users organize, search, and explore questions for future use. However, tag explosion and tag synonyms are common problems in tagging systems, because tags are created and added by non-expert users. To mitigate these problems, we propose a tag recommendation method using the highest topic filtering. In our experiment, we observed that our tag recommendation method considerably improved rank-related results and that recommended tags can improve the quality of the questions. [en_US]
dc.description.uri: http://library.towson.edu/digital/collection/etd/id/65136/ [en_US]
dc.format: application/pdf
dc.format.extent: xi, 88 pages [en_US]
dc.genre: dissertations [en_US]
dc.identifier: doi:10.13016/M22B8VF8S
dc.identifier.other: DF2017Hong
dc.identifier.uri: http://hdl.handle.net/11603/10874
dc.language.iso: en_US [en_US]
dc.relation.isAvailableAt: Towson University
dc.title: A study on textual contents in online communities and social media using text mining approaches [en_US]
dc.type: Text [en_US]

Files

Original bundle
Name: DF2017Hong_Redacted.pdf
Size: 1.67 MB
Format: Adobe Portable Document Format
Description: Hong Dissertation
License bundle
Name: license.txt
Size: 1.45 KB
Format: Item-specific license agreed upon to submission