Data Quality and Master Data Management with Microsoft SQL Server 2008 R2

Language: English

Author: Dejan Sarka

This book deals with master data. It explains how we can recognize our master data. It stresses the importance of a good data model for data integrity. It shows how we can find areas of bad or suspicious data. It shows how we can proactively enforce better data quality and make an authoritative master data source through a specialized Master Data Management application.

Description 

“If all men were just, there would be no need of valor,” said Agesilaus, Spartan King, 444 BC-360 BC. From this quote alone we can tell that Agesilaus was not too keen on fighting. In fact, Agesilaus never hurt his enemies without just cause, and he never took unfair advantage of them. Nevertheless, the ancient world was just as imperfect as the contemporary world is, and Agesilaus had to fight his share of battles.

If everyone always inserted correct data into a system, there would be no need for proactive constraints or for reactive data cleansing. We could store our data in text files, and maybe the only application we would need would be Notepad. Unfortunately, in real life, things go wrong. People are prone to making errors. Sometimes our customers do not provide us with accurate and timely data. Sometimes an application has a bug and introduces errors into the data. Sometimes end users unintentionally transpose letters or digits. Sometimes we have more than one application in an enterprise, and each application defines the same data slightly differently. (We could continue listing data problems forever.)

A good and suitable data model, like the Relational Model, enforces data integrity through the schema and through constraints. Unfortunately, many developers still do not understand the importance of a good data model. Yet even with an ideal model, we cannot enforce data quality. Data integrity means that the data is in accordance with our business rules; it does not mean that our data is correct.

Not all data is equally important. In an enterprise, we can always find the key data, such as customer data. This key data is the most important asset of a company. We call this kind of data master data.
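The difference between data integrity and data correctness described above can be illustrated with a small sketch. This is only an illustration under stated assumptions: it uses Python's built-in sqlite3 module as a stand-in for SQL Server, and the Customer table, its columns, the sample names, and the birth-year rule are all hypothetical. A CHECK constraint rejects a value that breaks the business rule, but it happily accepts a value with transposed digits that still satisfies the rule.

```python
import sqlite3

# In-memory SQLite database, used here only as a stand-in for SQL Server.
conn = sqlite3.connect(":memory:")

# Hypothetical table; the CHECK constraint encodes an assumed business rule.
conn.execute(
    """
    CREATE TABLE Customer (
        CustomerId INTEGER PRIMARY KEY,
        Name       TEXT    NOT NULL,
        BirthYear  INTEGER NOT NULL CHECK (BirthYear BETWEEN 1900 AND 2010)
    )
    """
)


def try_insert(customer_id, name, birth_year):
    """Return True if the row is accepted, False if a constraint rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO Customer (CustomerId, Name, BirthYear) "
                "VALUES (?, ?, ?)",
                (customer_id, name, birth_year),
            )
        return True
    except sqlite3.IntegrityError:
        return False


# 1791 violates the business rule, so the schema rejects the row.
out_of_range_ok = try_insert(1, "Ann Example", 1791)

# Suppose this customer was really born in 1971 and the digits were
# transposed on entry. 1917 satisfies the rule, so the wrong value is stored.
transposed_ok = try_insert(2, "Bob Example", 1917)

print(out_of_range_ok, transposed_ok)
```

The first insert fails and the second succeeds: the constraint protects integrity, but it cannot know that 1917 is not the customer's real birth year. Finding that kind of error requires the profiling and cleansing work this book describes.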

This book deals with master data. It explains how we can recognize our master data. It stresses the importance of a good data model for data integrity. It shows how we can find areas of bad or suspicious data. It shows how we can proactively enforce better data quality and make an authoritative master data source through a specialized Master Data Management application. It also shows how we can tackle the problems with duplicate master data and the problems with identity mapping across different databases in order to create a unique representation of the master data. For all the tasks mentioned in this book, we use the tools that are available in the Microsoft SQL Server 2008 R2 suite. To achieve our goal of good data quality, nearly every part of the suite turns out to be useful. This is not a beginner’s book. We, the authors, assume that you, the readers, have a good working knowledge of the SQL Server Database Engine, .NET, and other tools from the SQL Server suite.

Achieving good quality of your master data is not an easy task. We hope this book will help you with this task and serve as a guide for practical work and as a reference manual whenever you have problems with master data.