Data distribution deviation is a widespread pain point for data consumers. It occurs when the distribution of categorical or numerical data changes over time, whether suddenly or gradually, and when these changes go undetected, their downstream impact can be significant.
Let’s dive into a few examples to understand how data distribution changes can affect data users within your organization:
While data teams may rely on manual testing to catch distribution deviation issues, these tests take heavy manual effort to build and to scale across multiple data assets and, more importantly, a considerable time investment to maintain.
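To give a sense of what such a manual test involves, here is a minimal sketch of a hand-written check for a single categorical field. It is an illustration only: the payment-method data, the function names, and the 0.1 threshold are all hypothetical, and total variation distance is just one of several measures a team might pick.

```python
from collections import Counter


def category_shares(values):
    """Return each category's share of the total as a dict."""
    if not values:
        raise ValueError("cannot compute shares of an empty sample")
    counts = Counter(values)
    total = sum(counts.values())
    return {category: count / total for category, count in counts.items()}


def total_variation_distance(reference, current):
    """Half the sum of absolute share differences: 0 = identical, 1 = disjoint."""
    ref_shares = category_shares(reference)
    cur_shares = category_shares(current)
    categories = set(ref_shares) | set(cur_shares)
    return 0.5 * sum(
        abs(ref_shares.get(c, 0.0) - cur_shares.get(c, 0.0)) for c in categories
    )


# Hypothetical data: payment methods recorded last week vs. this week.
last_week = ["card"] * 70 + ["paypal"] * 25 + ["wire"] * 5
this_week = ["card"] * 40 + ["paypal"] * 55 + ["wire"] * 5

THRESHOLD = 0.1  # hand-picked assumption; needs re-tuning for every field

distance = total_variation_distance(last_week, this_week)
deviated = distance > THRESHOLD
print(f"distance={distance:.2f} deviated={deviated}")
```

Even this small check needs a reference sample, a distance measure, and a threshold chosen by hand, and each of those choices has to be repeated and revisited for every field being watched.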
Sifflet now solves this pain.
Introducing distribution deviation monitoring
We are happy to introduce Sifflet’s distribution deviation monitoring. Sifflet leverages advanced statistical models to automatically detect distribution deviation at the field level, based on either a rolling or a fixed time reference. With this new capability, combined with Sifflet’s auto-coverage, data teams can automatically monitor large numbers of datasets of different sizes with a click of a button, so they know when a significant distribution deviation occurs. On top of these features, data teams using Sifflet can leverage field-level lineage to get to the root cause of these anomalies and troubleshoot them efficiently.
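To make the difference between a rolling and a fixed time reference concrete, here is a small sketch for a numerical field. It does not describe Sifflet's models, which are not detailed here. It assumes a two-sample Kolmogorov-Smirnov statistic as the deviation measure, and the `monitor` function, the batches, and the 0.3 threshold are hypothetical.

```python
import bisect


def ks_statistic(reference, current):
    """Two-sample Kolmogorov-Smirnov statistic: largest gap between empirical CDFs."""
    ref_sorted = sorted(reference)
    cur_sorted = sorted(current)
    max_gap = 0.0
    for x in set(ref_sorted) | set(cur_sorted):
        ref_cdf = bisect.bisect_right(ref_sorted, x) / len(ref_sorted)
        cur_cdf = bisect.bisect_right(cur_sorted, x) / len(cur_sorted)
        max_gap = max(max_gap, abs(ref_cdf - cur_cdf))
    return max_gap


def monitor(batches, mode, threshold):
    """Compare each batch to its reference and return the indices that deviate.

    mode="fixed":   the reference is always the first batch.
    mode="rolling": the reference is the batch just before the current one.
    """
    if mode not in ("fixed", "rolling"):
        raise ValueError(f"unknown mode: {mode!r}")
    flagged = []
    for i in range(1, len(batches)):
        reference = batches[0] if mode == "fixed" else batches[i - 1]
        if ks_statistic(reference, batches[i]) > threshold:
            flagged.append(i)
    return flagged


# Hypothetical slow drift: every batch shifts the same values up by 2.
base = list(range(10))
batches = [[v + 2 * k for v in base] for k in range(4)]

fixed_flags = monitor(batches, "fixed", threshold=0.3)      # [2, 3]
rolling_flags = monitor(batches, "rolling", threshold=0.3)  # []
print(fixed_flags, rolling_flags)
```

In this toy data the drift is gradual, so each batch looks close to the one before it and the rolling reference stays quiet, while the fixed reference flags the accumulated change. A sudden jump between two consecutive batches would be flagged under either reference.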
The distribution deviation monitoring feature is available for all of our customers.
This feature comes from our continuous efforts to deliver comprehensive, automated, accurate, and actionable data quality monitoring. Want to learn more about our monitoring capabilities? Reach out for a demo.
Know when data breaks and fix it efficiently by leveraging Sifflet’s end-to-end lineage and automated monitoring and anomaly detection. Read more on how our product works.