Detecting Spam Blogs: A Machine Learning Approach

Kolari, Pranam; Java, Akshay; Finin, Tim; Oates, Tim; Joshi, Anupam

dc.contributor.author	Kolari, Pranam
dc.contributor.author	Java, Akshay
dc.contributor.author	Finin, Tim
dc.contributor.author	Oates, Tim
dc.contributor.author	Joshi, Anupam
dc.date.accessioned	2018-12-07T20:29:03Z
dc.date.available	2018-12-07T20:29:03Z
dc.date.issued	2006-07-16
dc.description	Proceedings of the 21st National Conference on Artificial Intelligence (AAAI 2006)	en
dc.description.abstract	Weblogs or blogs are an important new way to publish information, engage in discussions, and form communities on the Internet. The Blogosphere has unfortunately been infected by several varieties of spam-like content. Blog search engines, for example, are inundated by posts from splogs – false blogs with machine-generated or hijacked content whose sole purpose is to host ads or raise the PageRank of target sites. We discuss how SVM models based on local and link-based features can be used to detect splogs. We present an evaluation of learned models and their utility to blog search engines, systems that employ techniques differing from those of conventional web search engines. We evaluate the effectiveness of a combination of features, and finally report our informal analysis of a blog search engine index.	en
dc.description.sponsorship	This work is supported by NSF Awards NSF-ITR-IIS-0326460 and NSF-ITR-IDM-0219649.	en
dc.description.uri	https://www.aaai.org/Papers/AAAI/2006/AAAI06-212.pdf	en
dc.format.extent	6 pages	en
dc.genre	conference papers and proceedings preprints	en
dc.identifier	doi:10.13016/M27M0444D
dc.identifier.citation	Pranam Kolari, Akshay Java, Tim Finin, Tim Oates, and Anupam Joshi, Detecting Spam Blogs: A Machine Learning Approach, Proceedings of the 21st National Conference on Artificial Intelligence (AAAI 2006), https://www.aaai.org/Papers/AAAI/2006/AAAI06-212.pdf	en
dc.identifier.uri	http://hdl.handle.net/11603/12192
dc.language.iso	en	en
dc.publisher	AAAI	en
dc.relation.isAvailableAt	The University of Maryland, Baltimore County (UMBC)
dc.relation.ispartof	UMBC Computer Science and Electrical Engineering Department Collection
dc.relation.ispartof	UMBC Faculty Collection
dc.relation.ispartof	UMBC Student Collection
dc.rights	This item is likely protected under Title 17 of the U.S. Copyright Law. Unless on a Creative Commons license, for uses protected by Copyright Law, contact the copyright holder or the author.
dc.subject	blog	en
dc.subject	learning	en
dc.subject	social media	en
dc.subject	spam	en
dc.subject	web spam	en
dc.subject	machine learning	en
dc.subject	UMBC Ebiquity Research Group	en
dc.title	Detecting Spam Blogs: A Machine Learning Approach	en
dc.type	Text	en

Files

Original bundle

Name: 260.pd.pdf
Size: 80.15 KB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 2.56 KB
Format: Item-specific license agreed to upon submission

Collections

UMBC Computer Science and Electrical Engineering Department
UMBC Faculty Collection
UMBC Student Collection