SPIDER: Data Quality & Data Cleaning Project

Mohammad Sadoghi, D. Stantic, and N. Koudas.

In IBM CASCON, 2005.


Data quality is a serious concern in every organization that relies on data. Data is often of poor quality for many reasons, including spelling mistakes, abbreviations, a lack of standards, and inconsistent notation. SPIDER is a declarative data cleaning tool that incorporates a set of algorithms for improving data quality on any relational data source.

