A Comparative Study of Data Cleaning Tools

Date

2019-10-01

Citation of Original Publication

Oni, Samson, Zhiyuan Chen, Susan Hoban, and Onimi Jademi. “A Comparative Study of Data Cleaning Tools.” International Journal of Data Warehousing and Mining (IJDWM) 15, no. 4 (October 1, 2019): 48–65. https://doi.org/10.4018/IJDWM.2019100103.

Rights

This item is likely protected under Title 17 of the U.S. Copyright Law. Unless on a Creative Commons license, for uses protected by Copyright Law, contact the copyright holder or the author.

Abstract

In the information era, data is crucial to decision making. Most data sets contain impurities that must be removed before any meaningful decision can be made from the data. Hence, data cleaning is essential, and it often takes more than 80 percent of a data analyst's time and resources. Adequate tools and techniques must be used for data cleaning. Many data cleaning tools exist, but it is unclear how to choose among them in different situations. This research aims to help researchers and organizations choose the right tools for data cleaning. This article conducts a comparative study of four commonly used data cleaning tools on two real data sets and answers the research question of which tool is most useful in each scenario.