Data Observability Benchmark and Key Features

Name, Link, Deployment support, Monitoring framework, Threshold setting, Interface type, High cardinality support, Real-time data monitoring, Automated features, Data sources, Integrations, Alert destinations, Security, Metrics categories tracked, Root cause analysis, Community, GitHub
Bigeye
SaaS, On-premises
Anomaly detection, Pipeline testing
Automated, Manual
No-code
Yes
Automated threshold setting, Automated threshold updating, Automated circuit-breaker, Quarantine bad data
Main data warehouses, Main databases
Gmail, Slack, PagerDuty, APIs
Certified SOC 2 compliant, Data stays in your environment
Freshness, Outliers, Formats, Distribution, Volume, Custom metrics, Nulls & blanks

Soda
Cloud, Open-source
Anomaly detection, Pipeline testing
Automated, Manual
No-code, Command-line tool
Yes
Automated threshold setting, Automated circuit-breaker, Quarantine bad data, Automated threshold updating
Main data warehouses, Main data lakes
Collibra, Tableau, Looker, Alation
Slack, E-mail, Webhooks for alerts & incidents
Data stays in your environment
Freshness, Volume, Formats, Schema, Custom metrics, Nulls & blanks
Automated failed row analysis

https://soda-community.slack.com/join/shared_invite/zt-pf67xl6u-n3wexBNDl71VC6vK8fSPjg

Databand
SaaS, Open-source core
Pipeline testing, Anomaly detection
Manual, Automated
Command-line tool, No-code
Yes
Automated threshold setting, Automated threshold updating
Main data warehouses, Main databases, Main data lakes
Slack, PagerDuty, Opsgenie, Custom
Data stays in your environment
Schema, Formats, Distribution, Custom metrics, Outliers, System, Freshness, Data ingestion rate
Lineage

Monte Carlo
SaaS, Cloud
Anomaly detection, Pipeline testing
Automated, Manual
No-code, Command-line tool
Yes
Automated threshold setting, Automated threshold updating, Automated circuit-breaker, Quarantine bad data
Main data lakes, Main data warehouses, Main databases, 500+
Looker, Tableau, Periscope, Chartio, Alation, Atlan, Amundsen, dbt, Datadog, Mode, Power BI, Prefect, DataHub
Slack, PagerDuty, Webhooks, Opsgenie, Custom, Teams, Mattermost
Certified SOC 2 compliant, HIPAA, GDPR, PCI, CCPA, Data stays in your environment
Freshness, Volume, Distribution, Schema, Outliers, Custom metrics, Correlation across metrics, Nulls & blanks, Formats
Lineage, Correlation across metrics

Cito
SaaS, Cloud, VPC
Anomaly detection
Automated, Manual
No-code, API
Yes
Automated threshold setting, Automated threshold updating
Main data warehouses
Tableau, dbt, Mode, Power BI, Looker, Metabase
Slack, E-mail
Data stays in your environment
Nulls & blanks, Schema, Freshness, Outliers, Distribution, Custom metrics, Formats, Volume
Lineage, SQL code accessibility, Column-level lineage

Great Expectations
Open-source, cloud product coming soon
Pipeline testing, Anomaly detection
Automated, Manual
Command-line tool, Python notebooks, Python library
Yes
Automated threshold setting, Automated threshold updating, Automated circuit-breaker, Quarantine bad data, Auto-resolution
Main data warehouses, Main data lakes, Main databases
Atlan, dbt, Dagster, Astronomer, Prefect, Pandas, Kedro, Flyte, DataHub, Marquez
Slack, PagerDuty, Opsgenie, E-mail
Data stays in your environment
Nulls & blanks, Correlation across metrics, Multivariate feature checks, Outliers, Freshness, Custom metrics, Volume, Distribution, Schema

GitHub, Slack

Sifflet
SaaS, On-premises
Anomaly detection, Pipeline testing
Automated, Manual
No-code, API
Yes
Automated threshold setting, Automated threshold updating
Main data warehouses, SQL Server
Tableau, Looker, Datadog, dbt
Slack, Gmail, APIs, PagerDuty
Data stays in your environment
Freshness, Volume, Outliers, Formats, Distribution, Schema, Nulls & blanks, Custom metrics
Lineage

Validio
SaaS, Deployed in the customer cloud environment
Anomaly detection, Pipeline testing
Automated, Manual
No-code
Yes
Automated threshold setting, Automated threshold updating, Automated circuit-breaker, Auto-resolution
Main data warehouses, Main databases
Slack, PagerDuty, E-mail
Data stays in your environment
Freshness, Distribution, Outliers, Volume, Data ingestion rate, Schema, Formats, Multivariate feature checks

Lightup
SaaS, Managed on-prem, Fully on-prem
Anomaly detection
Automated, Manual
No-code, API
Yes
Automated threshold setting, Automated threshold updating
Main data warehouses, Main databases, Main data lakes
Slack, Teams, PagerDuty, E-mail, APIs, Mattermost, Webhooks, Flock
ISAE 3000 compliant, Data stays in your environment, Certified SOC 2 compliant
Volume, Freshness, Schema, Distribution, Formats, Correlation across metrics, Custom metrics
Correlation across metrics, Lineage

Lantern
SaaS
Anomaly detection
Automated
No-code
Yes
Automated threshold setting
Main data warehouses
Slack, E-mail
Distribution, Volume

Metaplane
SaaS, VPC
Anomaly detection
Automated, Manual
No-code
Yes
Automated threshold setting, Automated threshold updating
Main data warehouses, Main databases
dbt, Looker, Tableau, Mode, Power BI
Slack, PagerDuty, Opsgenie, Teams
Certified SOC 2 compliant, Data stays in your environment
Freshness, Outliers, Distribution, Volume, Schema, Custom metrics, Nulls & blanks, Formats
Lineage, Correlation across metrics

Datafold
SaaS
Anomaly detection
Automated
No-code
Yes
Automated threshold setting
Main data warehouses
Slack, PagerDuty, E-mail, Webhooks
Freshness, Outliers, Distribution

Acceldata
Pipeline testing
Main data warehouses, Main data lakes
Distribution, Schema
Correlation across metrics

Anomalo
SaaS, Deployed in the customer cloud environment
Anomaly detection
Automated
Automated threshold setting
Main data warehouses

Marquez
Open-source
Manual
Command-line tool
Amundsen
Lineage