Detecting Spam Blogs: A Machine Learning Approach

dc.contributor.author: Kolari, Pranam
dc.contributor.author: Java, Akshay
dc.contributor.author: Finin, Tim
dc.contributor.author: Oates, Tim
dc.contributor.author: Joshi, Anupam
dc.date.accessioned: 2018-12-07T20:29:03Z
dc.date.available: 2018-12-07T20:29:03Z
dc.date.issued: 2006-07-16
dc.description: Proceedings of the 21st National Conference on Artificial Intelligence (AAAI 2006) [en]
dc.description.abstract: Weblogs or blogs are an important new way to publish information, engage in discussions, and form communities on the Internet. The Blogosphere has unfortunately been infected by several varieties of spam-like content. Blog search engines, for example, are inundated by posts from splogs – false blogs with machine-generated or hijacked content whose sole purpose is to host ads or raise the PageRank of target sites. We discuss how SVM models based on local and link-based features can be used to detect splogs. We present an evaluation of learned models and their utility to blog search engines, systems that employ techniques differing from those of conventional web search engines. We evaluate the effectiveness of a combination of features, and finally report our informal analysis of a blog search engine index. [en]
dc.description.sponsorship: This work is supported by NSF Awards NSF-ITR-IIS-0326460 and NSF-ITR-IDM-0219649 [en]
dc.description.uri: https://www.aaai.org/Papers/AAAI/2006/AAAI06-212.pdf [en]
dc.format.extent: 6 pages [en]
dc.genre: conference papers and proceedings preprints [en]
dc.identifier: doi:10.13016/M27M0444D
dc.identifier.citation: Pranam Kolari, Akshay Java, Tim Finin, Tim Oates, and Anupam Joshi, Detecting Spam Blogs: A Machine Learning Approach, Proceedings of the 21st National Conference on Artificial Intelligence (AAAI 2006), https://www.aaai.org/Papers/AAAI/2006/AAAI06-212.pdf [en]
dc.identifier.uri: http://hdl.handle.net/11603/12192
dc.language.iso: en [en]
dc.publisher: AAAI [en]
dc.relation.isAvailableAt: The University of Maryland, Baltimore County (UMBC)
dc.relation.ispartof: UMBC Computer Science and Electrical Engineering Department Collection
dc.relation.ispartof: UMBC Faculty Collection
dc.relation.ispartof: UMBC Student Collection
dc.rights: This item is likely protected under Title 17 of the U.S. Copyright Law. Unless on a Creative Commons license, for uses protected by Copyright Law, contact the copyright holder or the author.
dc.subject: blog [en]
dc.subject: learning [en]
dc.subject: social media [en]
dc.subject: spam [en]
dc.subject: web spam [en]
dc.subject: machine learning [en]
dc.subject: UMBC Ebiquity Research Group [en]
dc.title: Detecting Spam Blogs: A Machine Learning Approach [en]
dc.type: Text [en]

Files

Original bundle

Name: 260.pd.pdf
Size: 80.15 KB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 2.56 KB
Format: Item-specific license agreed upon to submission