Quality Assurance - How data quality directly affects your business
by Luisa Rey Gomez on Jul 24, 2020 1:15:42 PM
Part 1 of the quality assurance blog series
When your business decisions and customer interactions are based on data that arrives repeatedly from multiple sources and in different formats, there is a risk of failures in both the processes and the data. In this blog, we explain what quality assurance is and when it is required, and we share best practices on how to apply it.
The consequences of poor data quality
Nowadays, raw data is increasingly used to create new information and knowledge. Integrating this data forms the basis for algorithms and machine learning processes. Wrongly matched or enriched data is no longer reliable, and when it is used in business processes, it undermines your ability to make informed judgements and decisions about your business.
That is why it is necessary to monitor and check whether the data is being processed and delivered in the right order. The quality of the data needs to be impeccable before it is used in critical decision processes.
Duplicate data, for example, can negatively impact your business KPIs. Having two or more records for one customer may lead to sending the same person the same campaign multiple times. It affects your email deliverability, personalisation efforts, response rates, campaign results and the overall ROI of your marketing activity.
Wrong business decisions made from incomplete data are incredibly costly. According to Gartner, poor data quality leads to a $15 million average financial impact on organisations per year. Besides the financial implications, running analyses on incomplete or incorrect data also costs your business analysts a lot of time, because they have to find and fix the errors.
What is quality assurance?
Quality assurance, according to ISO 9000, is “part of quality management focused on providing confidence that quality requirements will be fulfilled”. It might seem a difficult task to handle, because many processes transform and store the data, and all of them have to be checked.
Doing all of the monitoring manually is very time-consuming and not 100% reliable. To assure the quality of data, these kinds of checks can be configured to run automatically, with alerts set up in case something goes wrong. Designing and implementing a solution for continuous, automated monitoring means your business is informed immediately when something fails.
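To make the idea of automated checks with alerts more concrete, here is a minimal sketch in Python. It is an illustration only, not the solution described in this blog: the names `CheckResult` and `run_checks` are hypothetical, and the `alert` callback stands in for whatever notification channel (email, chat) is actually used.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class CheckResult:
    name: str
    ok: bool
    detail: str = ""


def run_checks(checks: Dict[str, Callable[[], bool]],
               alert: Callable[[CheckResult], None]) -> List[CheckResult]:
    """Run every check and call `alert` for each one that fails or crashes."""
    results = []
    for name, check in checks.items():
        try:
            ok = bool(check())
            detail = "" if ok else "check returned a failing result"
        except Exception as exc:
            # A check that crashes is reported as a failure instead of
            # stopping the remaining checks.
            ok = False
            detail = f"check raised {type(exc).__name__}: {exc}"
        result = CheckResult(name, ok, detail)
        if not result.ok:
            alert(result)
        results.append(result)
    return results
```

A scheduler could call `run_checks` once a day, passing a function that sends an email as `alert`. Catching exceptions per check is a deliberate choice here: one broken check should not hide the results of the others.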
Consistent, valid and complete data
Your customer data and processes are the foundation of your infrastructure - it’s time to treat them like one. As part of quality assurance, we diagnose issues that affect data stored in a data warehouse such as Google BigQuery. We monitor all data processes, including dataflows, triggers, stored procedures and any other process that touches the data. The goal is to make sure that all operations run successfully and that the data is consistent, valid and complete.
Example user story:
When we started a project with one of our clients, we needed to check whether all processes had completed successfully, based on the data stored in Google Cloud BigQuery. Every evening, the consolidated processes ran over a significant number of tables and affected a massive quantity of data, and some of them, or parts of them, could fail. To check whether the tables were updated daily, we created an automated daily check. This is what the query looked like:
The first versions helped the business to quickly identify whether something had failed. However, after some days, we found out that the check only applied to processes that produce a direct output in BigQuery tables, and not all processes do.
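The original query is not reproduced in this text. As a rough illustration only, a daily freshness check could look something like the sketch below. It assumes the check reads BigQuery's per-dataset `__TABLES__` metadata, which reports `last_modified_time` in milliseconds since the Unix epoch; the project and dataset names, the 24-hour threshold and the function `stale_tables` are all assumptions, not details from the project.

```python
from datetime import datetime, timedelta, timezone
from typing import Dict, List

# Assumed query shape. `my_project.my_dataset` is a placeholder.
FRESHNESS_SQL = """
SELECT table_id, last_modified_time
FROM `my_project.my_dataset.__TABLES__`
"""


def stale_tables(last_modified_ms: Dict[str, int], now: datetime,
                 max_age: timedelta = timedelta(hours=24)) -> List[str]:
    """Return the tables whose last modification is older than `max_age`.

    `last_modified_ms` maps table_id to last_modified_time in milliseconds,
    for example the rows returned by FRESHNESS_SQL.
    """
    cutoff = now - max_age
    stale = []
    for table_id, millis in last_modified_ms.items():
        modified = datetime.fromtimestamp(millis / 1000, tz=timezone.utc)
        if modified < cutoff:
            stale.append(table_id)
    return sorted(stale)
```

Keeping the comparison in a plain function, separate from the query, makes the threshold logic easy to test without a BigQuery connection.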
Thinking outside the box, we designed additional mechanisms to identify possible errors. Using Google Cloud SDK, we created a new kind of check that not only validates data stored in BigQuery, but also verifies whether a file exists in Google Cloud Storage. This way, it was easy to configure the name of the bucket and the file name format to monitor, and to set up an email alert for when the file was not in the specified location.
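A file check of this kind can be sketched as follows. This is a hypothetical illustration, not the code from the project: the bucket name, the file name patterns and the functions `expected_object` and `missing_files` are assumptions. The actual lookup in Cloud Storage is passed in as `object_exists`, so the sketch does not depend on any particular client.

```python
from datetime import date
from typing import Callable, Iterable, List, Tuple

# A file check is a (bucket, file name pattern) pair.
# The pattern uses strftime codes, e.g. "orders_%Y%m%d.csv".
FileCheck = Tuple[str, str]


def expected_object(pattern: str, run_date: date) -> str:
    """Fill the date into the configured file name pattern."""
    return run_date.strftime(pattern)


def missing_files(checks: Iterable[FileCheck], run_date: date,
                  object_exists: Callable[[str, str], bool]) -> List[str]:
    """Return the gs:// paths of expected files that were not found."""
    missing = []
    for bucket, pattern in checks:
        name = expected_object(pattern, run_date)
        if not object_exists(bucket, name):
            missing.append(f"gs://{bucket}/{name}")
    return missing
```

With the `google-cloud-storage` Python client, one option for `object_exists` would be a small wrapper around `client.bucket(bucket).blob(name).exists()`. Any non-empty result from `missing_files` would then trigger the email alert.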
During the execution of the project, we faced different challenges. Probably the most interesting one was defining the best way to do the checks. On the one hand, adding as many options as we could to the configuration table would make it very flexible. On the other hand, it would overload the person doing the configuration. We needed to make it responsive and easy to use as well. In the end, together with the business, we decided which values were important enough to be configurable and left the less important ones at their defaults.
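The trade-off between flexibility and ease of use can be illustrated with a small configuration sketch. The field names and default values below are invented for the example and do not come from the project; the point is only that a few values are required while the rest fall back to defaults.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CheckConfig:
    # Values the person configuring a check must always supply.
    name: str
    target: str          # e.g. a table name or a file name pattern
    alert_email: str
    # Less important values fall back to defaults when left empty.
    check_type: str = "table_freshness"
    max_age_hours: int = 24
    enabled: bool = True


def from_row(row: dict) -> CheckConfig:
    """Build a config from one configuration-table row.

    Columns left empty (None) are dropped so that the defaults apply.
    A missing required column raises TypeError.
    """
    supplied = {key: value for key, value in row.items() if value is not None}
    return CheckConfig(**supplied)
```

For example, `from_row({"name": "orders", "target": "sales.orders", "alert_email": "ops@example.com"})` yields a daily freshness check with a 24-hour threshold without the user having to specify either.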
Adopt quality assurance
In this blog, we’ve looked at the importance of good data quality and how your business depends on it. Once you adopt quality assurance and start proactively monitoring your data and processes to minimise the chance of errors, you will be able to make critical business decisions with confidence.
ABOUT CRYSTALLOIDS
Crystalloids helps companies improve their customer experiences and build marketing technology. Founded in 2006 in the Netherlands, Crystalloids builds crystal-clear solutions that turn customer data into information and knowledge into wisdom. As a leading Google Cloud Partner, Crystalloids combines experience in software development, data science and marketing, making it a one-of-a-kind IT company. Using the Agile approach, Crystalloids ensures that use cases show immediate value to its clients and frees up their time to focus on decision making rather than programming.