HIquality Inspect measures data quality and HIquality Transform improves the measurement results
The use of Data Quality Profiling tools to measure data quality is growing rapidly. These tools can be used to assess whether the data in the database still complies with the initial requirements, rules and properties.
An interesting aspect of these measurements is the follow-up. How can deviations from these rules be corrected automatically in an ever-changing business environment?
Human Inference offers HIquality Inspect and HIquality Transform: a suitable and connected software solution for the measurement and improvement of data quality.
HIquality Inspect
Measuring data quality is a prerequisite for the success of any process that uses the data involved. Without this knowledge, initiatives such as CRM, Customer Care, Business Intelligence and ERP are built on a very weak data foundation. Defective data leads to additional, often hidden, costs and lost revenues. The implementation of applications and initiatives eventually slows down, with severe financial and operational consequences.
Knowing the quality of data is important. This means that the properties of the data have to be inspected. These properties may, or may not, be defined in external or company-specific quality rules. The rules determine how the data must be stored in the database. An example of such a property rule is the format of the field 'mobile phone number': (+country code)(0)(9 digits). Another example is the content of the field 'Product status'. This field may only contain 'permanently out of assortment', 'temporarily out of assortment' and 'available'. A third example is the field 'Gender', where only the values 'M' for male, 'F' for female and 'U' for unknown are allowed. In day-to-day practice, however, these rules are often not followed and the database contains all kinds of defective data. The discrepancy between the postulated rules and everyday practice can be attributed to bad habits of the users and/or missing validation methods in the system.
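The three example rules above can be expressed as simple checks. The following is a minimal Python sketch for illustration only, not the way HIquality Inspect works internally. The function names, the record layout (a dictionary per record) and the regular expression are assumptions. In particular, the phone pattern is just one possible reading of '(+country code)(0)(9 digits)': a plus sign, a country code of one to three digits, a literal '(0)' and nine digits.

```python
import re

# Assumed reading of "(+country code)(0)(9 digits)", e.g. "+31(0)612345678".
PHONE_PATTERN = re.compile(r"\+\d{1,3}\(0\)\d{9}")

# Allowed values taken from the rules described in the text.
ALLOWED_PRODUCT_STATUS = {
    "permanently out of assortment",
    "temporarily out of assortment",
    "available",
}
ALLOWED_GENDER = {"M", "F", "U"}


def check_record(record):
    """Return the names of the fields in `record` that violate a rule."""
    violations = []
    # A missing or empty phone number is treated as a violation.
    if not PHONE_PATTERN.fullmatch(record.get("mobile phone number") or ""):
        violations.append("mobile phone number")
    if record.get("Product status") not in ALLOWED_PRODUCT_STATUS:
        violations.append("Product status")
    if record.get("Gender") not in ALLOWED_GENDER:
        violations.append("Gender")
    return violations


def profile(records):
    """Count rule violations per field over a collection of records."""
    counts = {}
    for record in records:
        for field in check_record(record):
            counts[field] = counts.get(field, 0) + 1
    return counts
```

Calling `profile` on a list of records yields a violation count per field, which is the kind of figure a profiling report would summarize.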
Defective data, although seldom identified, has many consequences for operating efficiency and profit. For example, in a direct marketing campaign aimed at all male customers younger than 25, customers whose gender or date of birth is recorded incorrectly may not be reached, which reduces potential turnover. Another example is the interpretation of differently formatted phone numbers. Left unchecked, this eventually leads to an uncontrolled growth of inconsistent, non-exchangeable formats.
Measuring data quality is not only a prerequisite for efficient operational management, but it is also used to determine the condition of data from external data suppliers.
The process of measuring data quality in a precise and careful way is called Data Quality Profiling.
A good data quality profiling tool must
- be able to quickly inspect large amounts of data;
- be easy to use;
- have powerful import and export functions;
- be able to measure the development of data quality over a certain period of time;
- be integrated with a cleansing tool.
Human Inference developed the data quality profiling tool HIquality Inspect to meet exactly these requirements: a product with minimal system requirements and impressive performance and usability.
HIquality Inspect can examine different kinds of files and databases. The user applies various inspections in order to determine different dimensions of data quality.
The users can run these inspections in accordance with their specific investigation requirements.
Inspection results are graphically represented in a report. The results can be exported to different formats, such as HTML, Excel and flat files, making it possible to use the data in other applications.
In addition, it is possible to cluster the various inspections in a profile and to save the accumulated profiles in a project. Consequently, well-organized reports can be created to swiftly gain insight into the improvement or deterioration of the data quality level, with just a few mouse clicks.
HIquality Inspect has an intuitive user interface, with all the necessary support functions and on-line help.
HIquality Inspect is part of the HIquality Product Suite for Data Quality Life Cycle Management. HIquality Inspect is integrated with HIquality Transform to tighten the link between measurement and improvement. Information from the inspection reports can be relayed directly to HIquality Transform in order to carry out the necessary improvement measures.
HIquality Transform
During the measurement process, the undesirable properties of the data are uncovered. The data must now be corrected through the implementation of transformation rules. These rules are not only applied to correct the data according to the valid requirements, but they can also be used to store new data in the desired manner.
The most efficient way to implement transformation rules is to create a central repository for these rules. This way, new rules can be added quickly throughout the entire company without adapting all kinds of applications. This calls for a flexible, open solution that can be deployed in these different applications.
HIquality Transform is the right solution for this task. Using HIquality Transform, the defects discovered by HIquality Inspect are converted into corresponding scripts with transformation rules. These scripts can be used by any application and can therefore be put to use consistently throughout the entire company.
In order to determine adequate transformation rules, knowledge of the database structure (such as field names and field lengths) is needed. With HIquality Transform, this so-called metadata can be directly imported from several files. Manual entry of metadata is also possible, for example to create new fields. It is even more efficient to additionally import the HIquality Inspect reports to have an overview of unwanted values and deviant formats. As an example, the list of valid and invalid values for the field 'Gender' can be directly read by HIquality Transform. This makes building the script a lot easier and reduces the possibility of human error.
This is, of course, merely a simple example. The advantages become more apparent with larger lists and more complex patterns.
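To illustrate how a list of valid and invalid 'Gender' values can feed a transformation rule, here is a minimal Python sketch. It does not reproduce the HIquality Inspect report format or the HIquality Transform script format, neither of which is described here. The function names, the replacement table and the choice to leave unmapped values for manual review are all assumptions.

```python
from collections import Counter

# Valid values as stated by the 'Gender' rule in the text.
VALID_GENDER = {"M", "F", "U"}

# Hypothetical replacement table maintained by the user (keys are normalized).
GENDER_REPLACEMENTS = {"male": "M", "m": "M", "female": "F", "f": "F", "": "U"}


def value_report(values):
    """Split the observed values into valid and invalid, with frequencies."""
    counts = Counter(values)
    valid = {v: n for v, n in counts.items() if v in VALID_GENDER}
    invalid = {v: n for v, n in counts.items() if v not in VALID_GENDER}
    return valid, invalid


def build_lookup(invalid, replacements):
    """Map each invalid value to a valid one where the table allows it.

    Values without a known replacement are returned separately so that a
    person can decide what to do with them.
    """
    lookup, unresolved = {}, []
    for value in invalid:
        key = value.strip().lower()  # normalize before consulting the table
        if key in replacements:
            lookup[value] = replacements[key]
        else:
            unresolved.append(value)
    return lookup, sorted(unresolved)
```

Generating the lookup table from the measured values, instead of typing it by hand, is what reduces the chance of human error; the unresolved list shows which values still need a decision.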
HIquality Transform is equipped with a user-friendly development tool that enables the user to create transformation scripts quickly and easily. Every developer has a set of built-in transformation functions at their disposal. The transformation script is created using drag-and-drop actions. Built-in controls, layout tools and on-line help provide additional user support. All you need to run the development tool is a web browser, which makes it available wherever you are.
Examples of transformation functions are:
- Case Conversion, converts from upper case to lower case and vice versa
- Concatenation, for the merging of fields
- Copy, copies data from one field to another
- Split, to divide data into several fields
- Trim, deletes unnecessary data
- Lookup and Replace, substitutes data by means of automatically or manually provided search tables and expressions.
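The functions listed above can be pictured as small, composable steps. The Python sketch below is an illustration under stated assumptions, not the HIquality Transform function library: all names and signatures are hypothetical, records are plain dictionaries of strings, and 'Trim' is interpreted here as removing surrounding whitespace.

```python
def case_convert(record, field, to="upper"):
    """Case Conversion: put `field` in upper or lower case."""
    value = record[field]
    return {**record, field: value.upper() if to == "upper" else value.lower()}


def concatenate(record, sources, target, separator=" "):
    """Concatenation: merge several fields into `target`."""
    return {**record, target: separator.join(record[s] for s in sources)}


def copy_field(record, source, target):
    """Copy: copy the data of one field to another."""
    return {**record, target: record[source]}


def split_field(record, source, targets, separator=" "):
    """Split: divide one field over several target fields."""
    parts = record[source].split(separator, len(targets) - 1)
    parts += [""] * (len(targets) - len(parts))  # pad when parts are missing
    return {**record, **dict(zip(targets, parts))}


def trim(record, field):
    """Trim: remove surrounding whitespace (assumed meaning of 'unnecessary data')."""
    return {**record, field: record[field].strip()}


def lookup_replace(record, field, table):
    """Lookup and Replace: substitute a value using a search table."""
    return {**record, field: table.get(record[field], record[field])}


def run_script(record, steps):
    """Apply an ordered list of (function, keyword arguments) steps."""
    for function, kwargs in steps:
        record = function(record, **kwargs)
    return record
```

A script is then simply an ordered list of steps, for example trimming a name, splitting it into first and last name, and normalizing a gender value through a lookup table. Each function returns a new record rather than changing its input, which keeps the steps independent of one another.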
HIquality Transform can be used for batch transformations, but also for on-line checks during data import. It can be implemented in different kinds of systems and it is available through a rapidly growing number of connectors, such as MQSeries, JAVA, .NET, SOAP, FILE, and database connectors to Oracle and SQL Server environments.
HIquality Transform is a business service plug-in from our HIquality Product Suite. This increases integration ease and guarantees usability, scalability, reliability and high performance.