Challenge

• Limited Visibility: Cribl had difficulty tracking granular data volume fluctuations and could not detect anomalies.
• Manual Tracking Hurdles: The team relied on a mix of custom scripts and manual tracking to monitor data health, which was inefficient, time-consuming, and not scalable.
• Pipeline Breakages: Identifying and diagnosing broken data pipelines was slow and reactive rather than proactive.

Results at a glance

• Time Savings: 47.9% reduction in weekly time spent monitoring data quality issues.

• Cost Savings: Early estimates indicate savings of over $10,000 before full product implementation.

• Enhanced Data Quality: Immediate improvements in identifying and resolving data issues, leading to more reliable reporting and decision-making.

• User Adoption: Rapid onboarding, completed in a single day.

Introduction

Cribl, the data engine for IT and Security, set out with a clear goal: to improve data quality by implementing a robust data observability tool that delivers measurable metrics for data accuracy, completeness, and timeliness.

With a mission centered on efficient, flexible, and secure telemetry data management, the company recognized that gaining deep visibility into its data pipelines was crucial to sustaining business intelligence and cost control.

The Challenge

Before adopting Revefi, Cribl faced several critical challenges. The company relied on a mix of custom scripts and manual tracking to monitor data health across its Snowflake environment. This approach resulted in limited visibility into granular data volume fluctuations, slow identification of pipeline breakages, and increased operational overhead. It also exposed the company to three broader risks:

  • Data Quality Issues: Inconsistent or missing data could lead to incorrect business decisions.
  • Operational Inefficiencies: Without automation, engineers and analysts spent excessive time monitoring data manually.
  • Potential Customer & Internal Impact: If bad data makes its way to downstream reporting, it could negatively affect both internal teams and external stakeholders.

These issues threatened overall reporting accuracy, increased engineering time, and risked poor decision-making based on unreliable data.

Deciding On The Right Solution:
Cribl’s Evaluation Criteria

Cribl evaluated multiple data observability tools, including Monte Carlo, Great Expectations, and Anomalo, against the following key criteria to ensure data reliability, observability, and cost-efficiency without increasing its operational overhead:

1. Data Freshness

  • Ability to promptly identify delayed or missing data.
  • Accuracy and responsiveness in alerting stakeholders about data freshness issues.

2. Schema Changes & Anomalies

  • Capability to proactively detect unexpected schema modifications.
  • Effectiveness in identifying anomalous patterns and data inconsistencies.

3. Pipeline Performance & Latency

  • Diagnosing performance bottlenecks, slow-running queries, and pipeline failures.
  • Clarity and details provided in monitoring dashboards to gauge performance metrics.

4. Cost Optimization

  • Identification of inefficient queries, redundant processing, or unnecessarily expensive operations.
  • Visibility into data movement and storage costs, highlighting areas for potential savings.

5. Integration & Automation

  • Ease of integration with existing cloud infrastructure, data stack, and workflows.
  • The degree of automation offered for routine monitoring and alerting to minimize manual intervention.
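
To make the first two criteria concrete, here is a minimal Python sketch of the kind of freshness, schema, and volume checks they describe. It is an illustration only, not Revefi's implementation or any tool Cribl evaluated; the function names, the 7-day window, and the z-score threshold are all assumptions chosen for the example.

```python
from datetime import datetime, timedelta
from statistics import mean, stdev


def is_stale(last_loaded_at: datetime, now: datetime, max_lag: timedelta) -> bool:
    """Freshness check: True when the latest load is older than the allowed lag."""
    return now - last_loaded_at > max_lag


def schema_diff(expected: dict[str, str], actual: dict[str, str]) -> dict[str, list[str]]:
    """Schema check: compare two column-name -> type mappings."""
    shared = expected.keys() & actual.keys()
    return {
        "added": sorted(set(actual) - set(expected)),
        "removed": sorted(set(expected) - set(actual)),
        "retyped": sorted(c for c in shared if expected[c] != actual[c]),
    }


def volume_anomalies(daily_rows: list[int], window: int = 7,
                     z_threshold: float = 3.0) -> list[int]:
    """Volume check: indexes of days whose row count deviates sharply
    from the trailing window (hypothetical z-score rule, window >= 2)."""
    flagged = []
    for i in range(window, len(daily_rows)):
        history = daily_rows[i - window:i]
        mu, sigma = mean(history), stdev(history)
        if sigma == 0:
            # Perfectly flat history: any change at all is unexpected.
            if daily_rows[i] != mu:
                flagged.append(i)
        elif abs(daily_rows[i] - mu) / sigma > z_threshold:
            flagged.append(i)
    return flagged


# Example: a sudden drop in row count on day index 8 is flagged.
rows = [1000, 1010, 990, 1005, 995, 1002, 998, 1001, 120, 1003]
print(volume_anomalies(rows))
```

Hand-written checks like these are roughly what "custom scripts" amounts to in practice: each table needs its own thresholds and schedule, which is the manual overhead the Integration & Automation criterion is meant to eliminate.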

Stakeholder Involvement & Role

The project was driven by the insights and technical leadership of Cribl’s Analytics Staff Data Engineering team. Alec Taggart, Cribl’s Staff Data Engineer, whose work spans cloud infrastructure, networking, API development, data pipeline management, and security, played a key role in defining evaluation criteria, executing tests, and collecting data to benchmark potential solutions. Key stakeholders across the Analytics, Finance, and Security teams collaborated throughout the process, ensuring the solution met rigorous technical requirements and aligned with cost-optimization goals.

Implementation with Revefi

After evaluating multiple alternatives, which included Monte Carlo, Great Expectations, and Anomalo, Cribl selected Revefi for its simplicity, ease of use, and quick time-to-value. Revefi offered comprehensive metrics such as data lineage, quality assessments, and anomaly detection. 

Furthermore, the solution enabled deep integration with Snowflake, along with capabilities that support compliance with regulations such as GDPR and CCPA. A single-day onboarding session was all it took to introduce Revefi to the analytics team.

Once integrated, Revefi automated many manual tracking processes, surfacing critical issues like late or missing data and unexpected schema changes in real time.

Results and Impact

Cribl realized value during the POC, before onboarding its team members.

Post-implementation, the impact was both immediate and measurable. Cribl reported a 47.9% reduction in the weekly time spent tracking and understanding data quality issues.

These efficiency gains translated into significant cost savings, with early estimates indicating savings of over $10,000 before full product deployment.

User feedback was enthusiastic:

“Revefi is the Apple of Data Observability; the [others] are Android-like.”

Revefi has become an integral part of Cribl’s daily operations, ensuring that data issues are identified and resolved proactively and ultimately reducing potential impacts on business reporting and decision-making.

Conclusion

By partnering with Revefi, Cribl transformed its data observability approach from a reactive, manually intensive process into a streamlined, automated system. 

Not only did this enhance the quality and reliability of data, but it also freed up valuable engineering resources to focus on strategic initiatives. Cribl’s experience underscores how a well-integrated data observability tool can drive operational efficiencies, reduce costs, and support the broader business mission of maintaining high-quality, trustworthy data.

Key Outcomes
47.9%
reduction in weekly monitoring time
Cribl saw a 47.9% reduction in weekly time spent monitoring data quality issues, $10K+ in cost savings, improved data quality for better decisions, and rapid user adoption, with onboarding completed in a single day.
INDUSTRY
IT/ITeS
SOLUTIONS
Cribl utilized Revefi’s solution for its simplicity, ease of use, quick time-to-value, and its range of comprehensive metrics such as data lineage, quality assessments, and anomaly detection.