Comparative Analysis of Data Cleaning Tools
Links to Files
Permanent Link
Author/Creator
Author/Creator ORCID
Date
2018-01-01
Type of Work
Department
Computer Science and Electrical Engineering
Program
Computer Science
Citation of Original Publication
Rights
Distribution Rights granted to UMBC by the author.
Access limited to the UMBC community. Item may possibly be obtained via Interlibrary Loan through a local library, pending author/copyright holder's permission.
This item is likely protected under Title 17 of the U.S. Copyright Law. Unless on a Creative Commons license, for uses protected by Copyright Law, contact the copyright holder or the author.
Abstract
In the information era, working with data has become both a necessary and a useful part of decision making. Most data sets contain impurities that need to be weeded out before any meaningful decision can be made from the data. Hence, handling dirty data is essential to computing. Dealing with dirty data takes up to 90 percent of a data analyst's time and resources. To clean such data sets, adequate tools and techniques have to be used, and many tools are emerging every day. While some of these tools are good, some can be dangerous to sensitive data. This research aims to help researchers and organizations quickly decide on the right tool for data cleaning. We analyze some of the current data cleaning tools and techniques and identify which tool will be useful in different scenarios.