Data Dialogs: Why Data Observability is Poised for Robust Growth

Data Dialogs is Sifflet’s exclusive interview series with data thought leaders from around the world. In this edition, guest writer Sanjeev Mohan sat down with our CEO and Co-Founder, Salma Bakouk, to better understand where data observability is heading - and the potential that lies ahead.

Salma Bakouk

SANJEEV:
Let’s start with the basics. Everyone's familiar with software observability—think Datadog, New Relic, Splunk. But data observability hasn’t scaled as quickly. Why the lag?

SALMA:
It’s a question I hear all the time—from investors, from customers. The short version? We're in a different maturity cycle. Software observability exploded when cloud-native apps went mainstream. That was the 2010s. Companies realized they couldn’t operate production systems without real-time visibility.

Data, meanwhile, took longer to become a first-class product. The shift started with Snowflake and the modern data stack, but only recently have businesses begun to treat data as mission-critical. Now, we’re finally seeing the urgency around reliability, security, and governance—but we’re in the early innings. Think 2010 for software, but with a much faster clock thanks to AI.

SANJEEV:
So what is data observability? Is it a product? A capability? A category? Depending on who you ask, you get a different answer.

SALMA:
It’s definitely a category. But it’s also evolving. Most vendor websites define it as monitoring metadata, lineage, freshness, volume, etc. That’s part of it—but I think the more important question is: what can you do with that data?

At Sifflet, we gather observability data—logs, code, pipeline telemetry, metadata—and we put it to work. Today, that means anomaly detection and pipeline monitoring. Tomorrow? It could be cost governance, AI validation, security compliance. We’ve built the plumbing. The use cases are just getting started.

SANJEEV:
That sounds a lot like what happened in software observability. One core tech, many surface-level use cases.

SALMA:
Exactly. Datadog didn’t stop at APM. They expanded horizontally across dev, ops, infra, security. We’re seeing something similar play out in data. And in many ways, data observability is even more critical—because data is inherently volatile. It mutates. It breaks silently. And it’s closer to the business than code ever was.

SANJEEV:
I’ve always said: code behaves, data misbehaves. You write software once and run it a million times. Data, on the other hand, changes constantly—without warning.

SALMA:
Totally. And because data touches everything—from finance reports to ML models—bad data isn’t just an engineering issue. It’s a business risk. That’s why we emphasize context in Sifflet. We don’t just show schema drift—we show you which pipelines matter to revenue, or customer satisfaction, or compliance.

SANJEEV:
Let’s dig into scope. What approaches do you take to monitor data across the stack?

SALMA:
We mix several approaches. Metadata monitoring for freshness, volume, and schema changes. Log-based monitoring for pipeline health and orchestration issues. And finally, sampling and profiling actual data to catch anomalies at the value level. That’s the trifecta. And we use AI to help make this not just powerful—but accessible.
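
The metadata side of that mix (freshness, volume, and schema changes) can be sketched in a few lines of Python. This is an illustrative sketch only, not Sifflet's implementation: `TableSnapshot`, the function names, the 24-hour freshness window, and the z-score threshold of 3 are all assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import mean, stdev


@dataclass
class TableSnapshot:
    """Hypothetical metadata record for one table at one point in time."""
    last_loaded_at: datetime
    row_count: int
    columns: dict[str, str]  # column name -> declared type


def is_stale(snapshot: TableSnapshot, now: datetime,
             max_age: timedelta = timedelta(hours=24)) -> bool:
    """Freshness: flag the table if its last load is older than max_age."""
    return now - snapshot.last_loaded_at > max_age


def volume_is_anomalous(history: list[int], current: int,
                        z_threshold: float = 3.0) -> bool:
    """Volume: flag a row count that sits far from the recent average."""
    if len(history) < 2:
        return False  # not enough history to estimate a spread
    mu, sigma = mean(history), stdev(history)
    if sigma == 0:
        return current != mu
    return abs(current - mu) / sigma > z_threshold


def schema_drift(previous: dict[str, str],
                 current: dict[str, str]) -> dict[str, list[str]]:
    """Schema: report columns that were added, removed, or changed type."""
    shared = previous.keys() & current.keys()
    return {
        "added": sorted(current.keys() - previous.keys()),
        "removed": sorted(previous.keys() - current.keys()),
        "retyped": sorted(c for c in shared if previous[c] != current[c]),
    }
```

All three checks read only metadata, so they stay cheap to run on every table. Value-level anomalies, the third approach mentioned above, would additionally require sampling the data itself.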

SANJEEV:
Speaking of AI—how are you using it inside Sifflet?

SALMA:
We call it “AI for Sifflet” and “Sifflet for AI.” Internally, we’re integrating LLMs to boost user experience—natural language monitor setup, contextual anomaly explanations, and eventually, things like self-healing pipelines. Externally, we’re helping customers observe AI models and monitor unstructured data—think vector stores, document classification, sensitive information tagging. It’s early days, but we’re seeing huge demand.

SANJEEV:
Let’s talk architecture. How is Sifflet deployed?

SALMA:
We’re built on Kubernetes, fully multi-tenant and cloud-agnostic, with privacy at the core. SaaS is our default deployment, but we support VPC and agent-based deployments too. While we don’t offer on-prem today, our agents can monitor legacy systems like Oracle or Hadoop with ease. We connect across the stack: warehouse, lakehouse, cloud, hybrid.

SANJEEV:
Sounds like you’ve thought a lot about where observability sits in the stack.

SALMA:
Yes—and that matters. Observability isn’t a compartment of the data stack. It’s a top-of-stack layer. It should work across everything. So while components like transformation or warehousing need to be deeply specialized, observability wins by being broad. The more platforms we monitor, the better we fulfill our promise: visibility, everywhere.

SANJEEV:
And who actually uses the product?

SALMA:
Everyone. Our UI is built for both technical users and business stakeholders. A popular example is our Chrome extension—it sits on top of BI tools and shows real-time data health to the end user, no SQL needed. But behind the scenes, engineers and architects are using Sifflet for triage, root cause analysis, and impact assessment. It’s truly cross-functional.

SANJEEV:
What’s next for data observability?

SALMA:
Increased scrutiny. Increased stakes. As AI accelerates and regulation tightens, data observability will be a must-have. We’re already seeing adoption in highly regulated industries—financial services, pharma, media. And we’re only just beginning to explore observability for unstructured data and AI systems.

The category is still defining itself. But one thing is clear: if data is a product, observability is how we ensure it works.

Excerpted from a podcast interview between Salma Bakouk and Sanjeev Mohan. [Click here to watch the full conversation.]

November 13, 2024
