Quality assurance - How does it work?
by Luisa Rey Gomez on Aug 3, 2020 9:30:00 AM
Part 2 of the quality assurance blog series
In our first article about quality assurance, we talked about the importance of consistent, valid and complete data. Data is the core requirement for enabling digital business, so being able to rely on its quality when you evaluate the state of your business and make informed decisions is critical. This second part of the quality assurance series describes how the monitoring dataflow works for data stored in Google BigQuery.
Identifying the errors
As a part of quality assurance monitoring in Google Cloud, you can automatically get alerted about anomalies or problems. That way, you can immediately find out if one of your critical data processes goes down and quickly take action.
The process that helps us assure the quality of data is the monitoring dataflow. It tells us whether some information is missing or incomplete, or whether some processes could not be executed as expected. As part of the monitoring process, we can check whether a value in a table is non-null and falls within a given range, or whether it exceeds a certain threshold. We can also check whether an email address is valid and unique, or whether a table is updated every day.
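The kinds of checks listed above can be sketched in a few lines of Python. This is a minimal illustration that runs against in-memory rows rather than BigQuery; the function names, the sample rows and the deliberately simple email pattern are assumptions made for the example, not the actual monitoring implementation.

```python
import re
from datetime import date, timedelta

# Hypothetical, deliberately simple email pattern; real validation is stricter.
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def in_range(rows, field, low, high):
    """True if every row has a non-null value for `field` within [low, high]."""
    return all(r.get(field) is not None and low <= r[field] <= high for r in rows)


def valid_emails(rows, field):
    """True if every value of `field` looks like an email address."""
    return all(
        isinstance(r.get(field), str) and bool(EMAIL_PATTERN.match(r[field]))
        for r in rows
    )


def unique(rows, field):
    """True if no value of `field` occurs more than once."""
    values = [r.get(field) for r in rows]
    return len(values) == len(set(values))


def updated_today(rows, field, today):
    """True if at least one row carries today's date in `field`."""
    return any(r.get(field) == today for r in rows)


# Sample rows standing in for a BigQuery table (made up for illustration).
today = date(2020, 8, 3)
rows = [
    {"age": 34, "email": "anna@example.com", "loaded_on": today},
    {"age": None, "email": "bob@example", "loaded_on": today - timedelta(days=1)},
]
```

With these sample rows, the range and email checks fail because of the second row (a null age and an address without a domain suffix), while the uniqueness and freshness checks pass.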
Data is what you build your business credibility on. Neglecting the quality of your data and processes can have a significant impact on the efficiency and performance of your business.
Monitoring, step by step
The monitoring dataflow performs pre-configured checks on BigQuery tables at regular intervals. Check results are stored and reported at the configured frequency; the report covers both the checks that did not pass and the checks that passed during the last execution. There are three steps in the dataflow:
- Read from BigQuery (configuration table)
- Validate/execute checks
- Write results to BigQuery.
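The three steps above can be sketched as a small pipeline. The outline below rests on several assumptions: the reader, checker and writer are passed in as plain Python callables so that it runs without a BigQuery connection, and every name in it (`run_monitoring`, `check_id`, `status`, `passed`) is hypothetical.

```python
def run_monitoring(read_config, execute_check, write_results):
    """Run one monitoring cycle: read config, execute active checks, store results."""
    # Step 1: read the configuration and keep only the active checks.
    checks = [c for c in read_config() if c["status"] == "active"]
    # Step 2: validate/execute each check.
    results = [
        {"check_id": c["check_id"], "passed": execute_check(c)} for c in checks
    ]
    # Step 3: write the results (to BigQuery in the real dataflow).
    write_results(results)
    return results


# In-memory stand-ins for the configuration table and the results table.
config_table = [
    {"check_id": "orders_not_empty", "status": "active"},
    {"check_id": "legacy_check", "status": "inactive"},
]
results_table = []

results = run_monitoring(
    read_config=lambda: config_table,
    execute_check=lambda check: True,  # placeholder: every check passes
    write_results=results_table.extend,
)
```

Passing the three steps in as functions keeps the outline testable; in a real deployment each one would be backed by a BigQuery read, a query execution and a BigQuery write.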
The first step is to read the configuration table to get all the active checks. The status of each check can easily be set to “active” or “inactive” as needed. For every check we must also validate the frequency, because not all checks need to be executed on every run. The frequency is controlled by a field set on each check.
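One way to picture the frequency validation is a small helper that compares the time since a check last ran with its configured interval. The field names used here (`frequency_hours`, `last_run`) and the helper `is_due` are assumptions for illustration; the article does not specify how the frequency field is encoded.

```python
from datetime import datetime, timedelta


def is_due(check, now):
    """Return True if an active check should run at `now`."""
    if check["status"] != "active":
        return False
    last_run = check.get("last_run")
    if last_run is None:  # never executed before
        return True
    return now - last_run >= timedelta(hours=check["frequency_hours"])


now = datetime(2020, 8, 3, 9, 30)
checks = [
    {"check_id": "daily_load", "status": "active", "frequency_hours": 24,
     "last_run": datetime(2020, 8, 2, 9, 0)},
    {"check_id": "weekly_totals", "status": "active", "frequency_hours": 168,
     "last_run": datetime(2020, 8, 1, 9, 0)},
    {"check_id": "paused_check", "status": "inactive", "frequency_hours": 1,
     "last_run": None},
]

# Only checks that are active and whose interval has elapsed are executed.
due = [c["check_id"] for c in checks if is_due(c, now)]
```

In this example only `daily_load` is due: the weekly check ran two days ago, and the paused check is skipped regardless of its frequency.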
For the checks that apply, the rule is validated in step two and the results are stored in a table in BigQuery. Writing the results is the final step of the monitoring dataflow; after that, a separate scheduled procedure is executed that compiles all the results into a report and emails it to selected recipients from the business.
We use colour coding so that the status of each check result can be identified quickly, but the report can easily be configured to include more. Each customer can request additions based on their preferences and needs, such as the action to take when an error is detected. The colours mean the following:
- Blue - all the checks that could not be executed, also marked as “check_not_executed”. This can happen because the requested table or field doesn’t exist or because the format of the values doesn't match the condition, among other reasons;
- Red - the checks that didn't pass: the validation could be run, but the result doesn’t match the desired value;
- Yellow - the same as red, except that the check is marked as a warning;
- Green - all the checks that could be executed and whose result matches the desired result.
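The colour coding above amounts to a small decision rule, sketched below under stated assumptions. Only the status label “check_not_executed” comes from the report itself; the other labels, the `warning_only` flag and the function names are hypothetical.

```python
from collections import defaultdict

# Colour per status; only "check_not_executed" is named in the article,
# the other status labels are hypothetical.
STATUS_COLOURS = {
    "check_not_executed": "blue",
    "check_failed": "red",
    "check_warning": "yellow",
    "check_passed": "green",
}


def classify(executed, matches_expected, warning_only=False):
    """Map the outcome of a check to one of the four report statuses."""
    if not executed:
        return "check_not_executed"
    if matches_expected:
        return "check_passed"
    return "check_warning" if warning_only else "check_failed"


def group_by_colour(results):
    """Group check ids by report colour."""
    grouped = defaultdict(list)
    for r in results:
        status = classify(
            r["executed"], r["matches_expected"], r.get("warning_only", False)
        )
        grouped[STATUS_COLOURS[status]].append(r["check_id"])
    return dict(grouped)


# Made-up results for illustration.
results = [
    {"check_id": "missing_table", "executed": False, "matches_expected": False},
    {"check_id": "row_count", "executed": True, "matches_expected": False},
    {"check_id": "late_update", "executed": True, "matches_expected": False,
     "warning_only": True},
    {"check_id": "email_unique", "executed": True, "matches_expected": True},
]
report = group_by_colour(results)
```

Grouping by colour mirrors how the emailed report lets a reader scan the blue and red entries first.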
The result value shows relevant information about the outcome of the check and the query that was executed to validate it. With this information, the query can be copied into a console to easily check what went wrong. Corrective actions can be taken internally by the business team that receives the email, or resolved by the development team.
Anomalies or downtime can not only hurt your bottom line but also damage your reputation. Crystalloids provides quality assurance to help you find errors early, before they affect your business.
ABOUT CRYSTALLOIDS
Crystalloids helps companies improve their customer experiences and build marketing technology. Founded in 2006 in the Netherlands, Crystalloids builds crystal-clear solutions that turn customer data into information and knowledge into wisdom. As a leading Google Cloud Partner, Crystalloids combines experience in software development, data science, and marketing, making it a one-of-a-kind IT company. Using the Agile approach, Crystalloids ensures that use cases show immediate value to its clients and frees up their time to focus on decision making rather than programming.