Detecting Spam Blogs: A Machine Learning Approach

dc.contributor.author: Kolari, Pranam
dc.contributor.author: Java, Akshay
dc.contributor.author: Finin, Tim
dc.contributor.author: Oates, Tim
dc.contributor.author: Joshi, Anupam
dc.date.accessioned: 2018-12-07T20:29:03Z
dc.date.available: 2018-12-07T20:29:03Z
dc.date.issued: 2006-07-16
dc.description: Proceedings of the 21st National Conference on Artificial Intelligence (AAAI 2006) [en_US]
dc.description.abstract: Weblogs or blogs are an important new way to publish information, engage in discussions, and form communities on the Internet. The Blogosphere has unfortunately been infected by several varieties of spam-like content. Blog search engines, for example, are inundated by posts from splogs – false blogs with machine generated or hijacked content whose sole purpose is to host ads or raise the PageRank of target sites. We discuss how SVM models based on local and link-based features can be used to detect splogs. We present an evaluation of learned models and their utility to blog search engines; systems that employ techniques differing from those of conventional web search engines. We evaluate the effectiveness of a combination of features, and finally report our informal analysis of a blog search engine index. [en_US]
dc.description.sponsorship: This work is supported by NSF Awards NSF-ITR-IIS-0326460 and NSF-ITR-IDM-0219649 [en_US]
dc.description.uri: https://www.aaai.org/Papers/AAAI/2006/AAAI06-212.pdf [en_US]
dc.format.extent: 6 pages [en_US]
dc.genre: conference papers and proceedings preprints [en_US]
dc.identifier: doi:10.13016/M27M0444D
dc.identifier.citation: Pranam Kolari, Akshay Java, Tim Finin, Tim Oates, and Anupam Joshi, Detecting Spam Blogs: A Machine Learning Approach, Proceedings of the 21st National Conference on Artificial Intelligence (AAAI 2006), https://www.aaai.org/Papers/AAAI/2006/AAAI06-212.pdf [en_US]
dc.identifier.uri: http://hdl.handle.net/11603/12192
dc.language.iso: en_US [en_US]
dc.publisher: AAAI [en_US]
dc.relation.isAvailableAt: The University of Maryland, Baltimore County (UMBC)
dc.relation.ispartof: UMBC Computer Science and Electrical Engineering Department Collection
dc.relation.ispartof: UMBC Faculty Collection
dc.relation.ispartof: UMBC Student Collection
dc.rights: This item is likely protected under Title 17 of the U.S. Copyright Law. Unless on a Creative Commons license, for uses protected by Copyright Law, contact the copyright holder or the author.
dc.subject: blog [en_US]
dc.subject: learning [en_US]
dc.subject: social media [en_US]
dc.subject: spam [en_US]
dc.subject: web spam [en_US]
dc.subject: machine learning [en_US]
dc.subject: UMBC Ebiquity Research Group [en_US]
dc.title: Detecting Spam Blogs: A Machine Learning Approach [en_US]
dc.type: Text [en_US]

Files

Original bundle
Name: 260.pd.pdf
Size: 80.15 KB
Format: Adobe Portable Document Format

License bundle
Name: license.txt
Size: 2.56 KB
Format: Item-specific license agreed upon to submission