TISA: Topic Independence Scoring Algorithm

dc.contributor.author: Martineau, Justin
dc.contributor.author: Cheng, Doreen
dc.contributor.author: Finin, Tim
dc.date.accessioned: 2023-10-26T19:09:03Z
dc.date.available: 2023-10-26T19:09:03Z
dc.date.issued: 2013-07-13
dc.description: 9th Int. Conf. on Machine Learning and Data Mining (MLDM'13), 2013 [en_US]
dc.description.abstract: Textual analysis using machine learning is in high demand for a wide range of applications, including recommender systems, business intelligence tools, and electronic personal assistants. Some of these applications need to operate over a wide and unpredictable array of topic areas, but current in-domain, domain adaptation, and multi-domain approaches cannot adequately support this need because of their low accuracy on topic areas they were not trained for, slow adaptation speed, or high implementation and maintenance costs. To create a true domain-independent solution, we introduce the Topic Independence Scoring Algorithm (TISA) and demonstrate how to build a domain-independent bag-of-words model for sentiment analysis. This model is the best-performing sentiment model published on the popular 25-category Amazon product reviews dataset. The model is on average 89.6% accurate as measured on 20 held-out test topic areas. This compares very favorably with the 82.28% average accuracy of the 20 baseline in-domain models. Moreover, the TISA model is highly uniformly accurate, with a variance of 5 percentage points, which provides strong assurance that the model will be just as accurate on new topic areas. Consequently, TISA's models are truly domain independent. In other words, they require no changes or human intervention to accurately classify documents in never-before-seen topic areas. [en_US]
dc.description.uri: https://link.springer.com/chapter/10.1007/978-3-642-39712-7_43 [en_US]
dc.format.extent: 15 pages [en_US]
dc.genre: book chapters [en_US]
dc.genre: conference papers and proceedings [en_US]
dc.genre: postprints [en_US]
dc.identifier: doi:10.13016/m2x6jw-gt3v
dc.identifier.citation: Martineau, J.C., Cheng, D., Finin, T. (2013). TISA: Topic Independence Scoring Algorithm. In: Perner, P. (eds) Machine Learning and Data Mining in Pattern Recognition. MLDM 2013. Lecture Notes in Computer Science, vol 7988. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-39712-7_43 [en_US]
dc.identifier.uri: https://doi.org/10.1007/978-3-642-39712-7_43
dc.identifier.uri: http://hdl.handle.net/11603/30407
dc.language.iso: en_US [en_US]
dc.publisher: Springer [en_US]
dc.relation.isAvailableAt: The University of Maryland, Baltimore County (UMBC)
dc.relation.ispartof: UMBC Computer Science and Electrical Engineering Department Collection
dc.relation.ispartof: UMBC Faculty Collection
dc.rights: This item is likely protected under Title 17 of the U.S. Copyright Law. Unless on a Creative Commons license, for uses protected by Copyright Law, contact the copyright holder or the author. [en_US]
dc.title: TISA: Topic Independence Scoring Algorithm [en_US]
dc.type: Text [en_US]
dcterms.creator: https://orcid.org/0000-0002-6593-1792 [en_US]

Files

Original bundle

Name: 705.pdf
Size: 822.76 KB
Format: Adobe Portable Document Format