Detecting Spam Blogs: A Machine Learning Approach
dc.contributor.author | Kolari, Pranam | |
dc.contributor.author | Java, Akshay | |
dc.contributor.author | Finin, Tim | |
dc.contributor.author | Oates, Tim | |
dc.contributor.author | Joshi, Anupam | |
dc.date.accessioned | 2018-12-07T20:29:03Z | |
dc.date.available | 2018-12-07T20:29:03Z | |
dc.date.issued | 2006-07-16 | |
dc.description | Proceedings of the 21st National Conference on Artificial Intelligence (AAAI 2006) | en_US |
dc.description.abstract | Weblogs, or blogs, are an important new way to publish information, engage in discussions, and form communities on the Internet. The Blogosphere has unfortunately been infected by several varieties of spam-like content. Blog search engines, for example, are inundated by posts from splogs – false blogs with machine-generated or hijacked content whose sole purpose is to host ads or raise the PageRank of target sites. We discuss how SVM models based on local and link-based features can be used to detect splogs. We present an evaluation of learned models and their utility to blog search engines, systems that employ techniques differing from those of conventional web search engines. We evaluate the effectiveness of a combination of features, and finally report our informal analysis of a blog search engine index. | en_US |
dc.description.sponsorship | This work is supported by NSF Awards NSF-ITR-IIS-0326460 and NSF-ITR-IDM-0219649 | en_US |
dc.description.uri | https://www.aaai.org/Papers/AAAI/2006/AAAI06-212.pdf | en_US |
dc.format.extent | 6 pages | en_US |
dc.genre | conference papers and proceedings preprints | en_US |
dc.identifier | doi:10.13016/M27M0444D | |
dc.identifier.citation | Pranam Kolari, Akshay Java, Tim Finin, Tim Oates, and Anupam Joshi, Detecting Spam Blogs: A Machine Learning Approach, Proceedings of the 21st National Conference on Artificial Intelligence (AAAI 2006), https://www.aaai.org/Papers/AAAI/2006/AAAI06-212.pdf | en_US |
dc.identifier.uri | http://hdl.handle.net/11603/12192 | |
dc.language.iso | en_US | en_US |
dc.publisher | AAAI | en_US |
dc.relation.isAvailableAt | The University of Maryland, Baltimore County (UMBC) | |
dc.relation.ispartof | UMBC Computer Science and Electrical Engineering Department Collection | |
dc.relation.ispartof | UMBC Faculty Collection | |
dc.relation.ispartof | UMBC Student Collection | |
dc.rights | This item is likely protected under Title 17 of the U.S. Copyright Law. Unless on a Creative Commons license, for uses protected by Copyright Law, contact the copyright holder or the author. | |
dc.subject | blog | en_US |
dc.subject | learning | en_US |
dc.subject | social media | en_US |
dc.subject | spam | en_US |
dc.subject | web spam | en_US |
dc.subject | machine learning | en_US |
dc.subject | UMBC Ebiquity Research Group | en_US |
dc.title | Detecting Spam Blogs: A Machine Learning Approach | en_US |
dc.type | Text | en_US |