Data Cleansing
Data cleansing is a process whereby missing or invalid data is corrected
before it is loaded into a data warehouse for reporting. This
process is normally carried out using data quality software, such as
that offered by Trillium Software or, for address checking and
verification, QAS. Custom code can also be written to check data for
inconsistencies and errors. The software works by applying a set of
business rules and checks to a specified data set. The software will
then load the data, normally into its own temporary repository, and
check for anomalies. The erroneous data can then be highlighted to the
user and fixed manually, or the
application can cleanse the data automatically. This step in the
load process is particularly important when populating data from
multiple data sources, as there may be inconsistencies in
data dictionary definitions, user entry errors or missing data. Part
of this process may involve the matching of a list of customer
addresses against, for example, a post office address database. This
will ensure that addresses are correct, and that any missing data such
as post codes can be entered into the destination load. This
step should not be underestimated or overlooked. The initial expense of
the software normally pays for itself many times over by providing your
users with accurate information.
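
The process described above can be sketched in a few lines of custom code. The example below is only a minimal illustration, not how Trillium Software or QAS work internally: the rule names, the record fields (name, street, town, post_code) and the small REFERENCE_ADDRESSES lookup, which stands in for a post office address database, are all assumptions made for the sake of the example, and the post codes are invented.

```python
# Hypothetical reference data standing in for a post office address database.
# Keys are (street, town) in normalised form; values are made-up post codes.
REFERENCE_ADDRESSES = {
    ("10 high street", "reading"): "RG1 2AB",
    ("5 station road", "leeds"): "LS1 4DY",
}


def text(record, field):
    """Return a field as a stripped string, treating missing or None as empty."""
    return (record.get(field) or "").strip()


def address_key(record):
    """Lower-case and collapse whitespace so equivalent addresses compare equal."""
    street = " ".join(text(record, "street").lower().split())
    town = " ".join(text(record, "town").lower().split())
    return (street, town)


# Each business rule is a (name, predicate) pair.
# The predicate returns True when the record passes the check.
RULES = [
    ("name is present", lambda r: bool(text(r, "name"))),
    ("post code is present", lambda r: bool(text(r, "post_code"))),
    ("address is known", lambda r: address_key(r) in REFERENCE_ADDRESSES),
]


def check(record):
    """Return the names of all rules that the record fails."""
    return [name for name, passes in RULES if not passes(record)]


def cleanse(record):
    """Automatic cleansing: fill a missing post code from the reference data."""
    cleaned = dict(record)  # work on a copy, leave the source record untouched
    key = address_key(cleaned)
    if not text(cleaned, "post_code") and key in REFERENCE_ADDRESSES:
        cleaned["post_code"] = REFERENCE_ADDRESSES[key]
    return cleaned


def process(records):
    """Split records into those ready to load and those needing manual review."""
    ready, for_review = [], []
    for record in records:
        candidate = cleanse(record)
        failures = check(candidate)
        if failures:
            # Keep the original record with the failed rules highlighted.
            for_review.append((record, failures))
        else:
            ready.append(candidate)
    return ready, for_review
```

Calling process() on a list of customer records returns two lists: records that passed every rule after automatic cleansing, and records paired with the names of the rules they failed, which can be shown to a user for manual correction. Keeping the rules in a single list makes it straightforward to add checks as new data sources are brought into the load.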
Trillium Software
The Trillium
Software solution is built upon the Avellino Discovery product.
Discovery was a flagship product for the Avellino company, and was the
market leader for data profiling and analysis. Avellino was acquired by Trillium Software in 2004.