Chris J Powell

Big Data and Data Quality

I am constantly amazed at the number of my clients who allow their storage (both internal and cloud-based) to grow unchecked without any real concern for Data Quality. It is great to harvest, mine and compile as much data as possible for future reference, but at what expense?

Whether you call it Data Quality, Data Governance or Master Data Management, the point is the same: as the amount of stored data grows, it gets harder and harder to ensure that the information is still trustworthy. Worse, duplicated information makes storage grow even faster and adds to your already skyrocketing costs. The push to the Cloud has been great for the pocketbooks of thousands of companies. It began as a cost-cutting measure to save on both labor and the maintenance of complex Data Warehouses, but I have heard from more than one client that the savings they achieved are being eaten away by poor Data Quality.
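To make the duplication point concrete, here is a minimal sketch in Python of how the same customer can end up stored twice. The record layout (`name`, `email`) and the `normalize` and `deduplicate` helpers are made up for illustration; real Master Data Management tools use far more careful matching rules.

```python
import re

def normalize(record):
    """Build a comparison key from a record (hypothetical name/email layout)."""
    # Lowercase the name, drop punctuation, and collapse repeated spaces.
    name = re.sub(r"[^a-z0-9 ]", "", record["name"].lower())
    name = " ".join(name.split())
    # Emails are compared case-insensitively with surrounding spaces removed.
    email = record["email"].strip().lower()
    return (name, email)

def deduplicate(records):
    """Keep the first record seen for each normalized key."""
    seen = {}
    for record in records:
        seen.setdefault(normalize(record), record)
    return list(seen.values())

# Made-up sample data: the first two rows are the same customer typed two ways.
customers = [
    {"name": "Acme Corp.", "email": "Sales@Acme.example"},
    {"name": "acme  corp", "email": "sales@acme.example "},
    {"name": "Globex Inc", "email": "info@globex.example"},
]

unique = deduplicate(customers)
print(len(customers), "records stored,", len(unique), "distinct")
```

Even a crude key like this shows how quickly "more data" can turn out to be the same data, paid for more than once.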

When did we transition from what would now be considered Small Data, with ever-watchful System Admins and Database Administrators standing guard over it, to the Wild West mentality of “Store It All…we will sort it out later!”?

Big Data is only Good Data if it can be used later. If you can’t search it, manipulate it and put it to use…what is the point of heading down the path of Big Data?

According to some smart analysts over at Ovum, bad data is costing US businesses $700 billion every year, and with the Data Stream growing as fast as it is…that number is likely to grow.

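Remember the “fat-finger” market panic of 2010? The story that went around at the time was that a trader typed “billions” instead of “millions” in a sell order, and the Dow Jones Industrial Average dropped nearly 1,000 points within minutes. Regulators later traced the plunge to a large automated sell order rather than a typo, but the lesson holds either way: one bad input can mean Human Error…on a grand scale.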
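A simple sanity check at the point of entry is the kind of safeguard that catches a “billions instead of millions” slip before it goes anywhere. The sketch below is only an illustration: the function name `check_order_quantity`, the `typical_quantity` baseline and the `max_ratio` threshold are all assumptions of mine, not how any real trading system works.

```python
def check_order_quantity(quantity, typical_quantity, max_ratio=100):
    """Return True if the quantity looks plausible, False if it should be held for review.

    typical_quantity and max_ratio are illustrative assumptions, not real limits.
    """
    if quantity <= 0:
        return False
    # Flag anything more than max_ratio times the usual order size.
    return quantity <= typical_quantity * max_ratio

typical = 5_000_000  # hypothetical usual order size

print(check_order_quantity(16_000_000, typical))      # True: within the expected range
print(check_order_quantity(16_000_000_000, typical))  # False: held for a second look
```

A check like this does not fix bad data, but it gives a human the chance to confirm the entry before it is acted on.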

If there is so much Bad Data out there, and products and services like Business Intelligence are built on top of it to guide long-term planning and tactical decisions…well, I know that I would not likely take my future financial advice from a homeless guy on the corner holding a “Spare Change…God Bless” sign. So if we are not going to at least test the quality of the data, why are we pinning so much hope on Data Mining of Bad Data?

That being said…there is hope.

In Education, the Data Quality Campaign is pushing for new mandates that make data in US Education much more usable, and that keep the information focused on Long Term Policy and Strategy as well as Tactical Implementations.

The Enterprise Data Management Council (EDM Council) is another organization that has been built to both promote and elevate the quality of data spinning through the network cables of the Banks and Securities Exchanges around the world.

HIPAA in the US and CIHI in Canada are two good examples of placing the importance of Data Quality ahead of just Big Data.

For me, these three examples come down to this: I would rather have quality data to help identify and track opportunities for my daughter’s education, Good Data to help prevent another global financial collapse, and, well, Life or Death can hang in the balance when it comes to Data Quality at our local hospital.

I am not saying Big Data is bad…I love the fact that in 2012 I can look for and find anything I want to…that is not the issue. What scares me is that the people entering all this data, the Data Entry Clerks, sit at the lowest end of the food chain in IT.

Well Cheers and have a Great Day.

Chris J Powell

