I’m currently working on a project where I need to validate data quality assertions in Looker.
The data is mirrored in a PostgreSQL database and changes frequently (at least once a day). Those changes, deletions in particular, can alter the metric values I need to assert.
Goal: Validate Looker assertion tests to ensure data quality.
Problem: I need to update the assertion values dynamically before running the tests because the data can change frequently, which might alter the metric values.
Idea:
- Run queries on the source tables in BigQuery to retrieve the current assertion values.
- Update the assertion values in Looker at test time, in CI.
- Trigger the Looker assertion tests via API calls (using Spectacles) in CI.
- Validate the test results.
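To make the idea concrete, here is a minimal sketch of the first two steps, under several assumptions: the names `fetch_expected_value` and `render_lookml_test`, the `orders` table, and the `orders.count` field are all hypothetical, and an in-memory SQLite database stands in for the real warehouse so the snippet runs on its own. The CI job would then write the rendered LookML to the test branch and run the Spectacles assert step against it.

```python
import sqlite3  # stand-in for the real warehouse driver (assumption)


def fetch_expected_value(conn, sql):
    """Run a single-value query on a source table and return the scalar."""
    cur = conn.cursor()
    cur.execute(sql)
    row = cur.fetchone()
    return row[0]


def render_lookml_test(test_name, explore, field, expected):
    """Render a LookML data test asserting that `field` equals `expected`.

    `field` is a fully scoped name such as "orders.count" (hypothetical).
    """
    column = field.split(".")[-1]
    return (
        f"test: {test_name} {{\n"
        f"  explore_source: {explore} {{\n"
        f"    column: {column} {{ field: {field} }}\n"
        f"  }}\n"
        f"  assert: {test_name}_matches_source {{\n"
        f"    expression: ${{{field}}} = {expected} ;;\n"
        f"  }}\n"
        f"}}\n"
    )


# Demo with a throwaway table; in CI this would be the real source table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY)")
conn.executemany("INSERT INTO orders (id) VALUES (?)", [(1,), (2,), (3,)])

expected = fetch_expected_value(conn, "SELECT COUNT(*) FROM orders")
lookml = render_lookml_test("order_count", "orders", "orders.count", expected)
print(lookml)
```

One caveat with this design: the expected value and the Looker result are read at different moments, so a data refresh between the two steps could still cause a false failure.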
Questions:
- Is this a good approach to update the test values “on the fly” in a CI/CD pipeline?
- Are there any best practices or alternative methods to achieve this in a more efficient way?
- How can I ensure the accuracy and reliability of the assertions when the data changes frequently?
Thanks!