groundtruth v0.2 — single-binary database monitoring is here. Read the docs →

Data checks without the monitoring circus.

No warehouse, no dbt, no Airflow to babysit. One binary runs your SQL checks with HCL assertions — once or on a schedule — speaks Prometheus, ships an MCP server, and hands back the rows that broke.

Linux & macOS
$ curl -fsSL https://raw.githubusercontent.com/jondot/groundtruth/main/install.sh | sh
or cargo install --git https://github.com/jondot/groundtruth · docker pull ghcr.io/jondot/groundtruth
gt run config.hcl
[PASS] orders_present              1 row(s)
[FAIL] no_orphaned_line_items      3 row(s)
id=3 order_id=999
id=4 order_id=998
id=5 order_id=997
[WARN] table_not_empty[orders]     1 row(s)
[PASS] table_not_empty[line_items] 1 row(s)
[ERROR] deliberately_broken        unknown column: recnt

What's different about groundtruth?

Everything you need to assert your data is healthy, and nothing you don't.

🧮

SQL + HCL assertions

Write checks in plain SQL and assert over row and rows.count. Config errors surface loudly, never silently.

Declarative validation

The validate block adds per-column rules — type, regex, ranges, uniqueness, outliers (IQR/zscore), normality.
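
To illustrate what IQR and z-score outlier rules usually mean, here is a minimal Python sketch. It is not groundtruth's implementation: the function names, the 1.5 IQR multiplier, the z-score threshold of 3.0, and the quartile method are conventional choices assumed for illustration, and groundtruth's own defaults may differ.

```python
import statistics

def iqr_outliers(values, k=1.5):
    """Return values outside Tukey's fences [Q1 - k*IQR, Q3 + k*IQR]."""
    # Assumption: "inclusive" quartiles; other quartile methods shift the fences slightly.
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]

def zscore_outliers(values, threshold=3.0):
    """Return values more than `threshold` population standard deviations from the mean."""
    mean = statistics.fmean(values)
    stdev = statistics.pstdev(values)
    if stdev == 0:
        return []  # all values identical, so nothing can be an outlier
    return [v for v in values if abs(v - mean) / stdev > threshold]
```

The two rules can disagree on small samples: a single extreme value inflates the standard deviation it is measured against, so the IQR rule may flag a row that the z-score rule lets through.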

📡

Pull-first endpoints

/healthz, /metrics, /checks with health-code semantics for k8s probes and uptime monitors.

⏱️

Sustained gating

Schedule each check by interval or cron. With sustained, a check pages only after its failure has persisted for the configured duration, so transient blips do not cause flapping.
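
The idea behind sustained gating can be sketched in a few lines of Python. This illustrates the general technique, not groundtruth's code: the `SustainedGate` class and its method names are hypothetical, and the sketch assumes an alert fires only once a check has failed continuously for the whole window, with any pass resetting the clock.

```python
from datetime import datetime, timedelta

class SustainedGate:
    """Fire only after a failure has held continuously for `window`."""

    def __init__(self, window: timedelta):
        self.window = window
        self.failing_since = None  # time of the first failure in the current streak

    def observe(self, failed: bool, now: datetime) -> bool:
        """Record one check result; return True when the alert should fire."""
        if not failed:
            self.failing_since = None  # any pass resets the streak
            return False
        if self.failing_since is None:
            self.failing_since = now
        return now - self.failing_since >= self.window
```

With a 15-minute window and a check that runs every 5 minutes, a single failed run never fires, and a failure streak interrupted by one pass starts counting again from zero.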

🤖

MCP server built in

Run gt mcp and give any agent list_checks, run_check, and explain_failure. HCL reads cleanly to models.

📦

One ~25 MB binary

No runtime, no agents, nothing to install. Talks to Postgres, SQLite, MySQL, and Trino from one static binary — drop it in a Dockerfile layer.

Fewer moving parts

One binary replaces the warehouse-plus-scheduler-plus-exporter stack most teams bolt together.

Components to run a scheduled data check
  • groundtruth: 1
  • Prometheus + SQL exporter: 3
  • Warehouse + dbt + Airflow: 6+
Moving parts = processes/services you deploy, secure, and keep alive. groundtruth is a single static binary.

Checks in HCL, not YAML

A check is a SQL query plus an assertion. The eval context gives you row, rows.count, and each.value for fan-out.

  • Freshness — is the table still being written to?
  • Validation — declarative per-column rules
  • Fan-out — one block, N checks across a list
  • Sustained gating — page only when it persists
Full HCL reference →
config.hcl
connection "postgres" "main" {
  dsn = env("DATABASE_URL")
}

defaults {
  on    = connection.postgres.main
  every = "5m"
}

# Liveness — page after 15m of failure
check "orders_are_flowing" {
  query = "select count(*) as n from orders where created_at > now() - interval '5m'"
  fail {
    when      = row.n == 0
    sustained = "15m"
  }
  on_fail = notify.webhook.oncall  # assumes a webhook notifier named oncall is defined elsewhere
}

# Fan-out — one block, N checks
check "table_not_empty" {
  for_each = ["orders", "payments"]
  query    = "select count(*) as n from ${each.value}"
  warn     = row.n < 1
}
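
Fan-out can be pictured as a simple expansion step. The Python sketch below shows the general technique rather than groundtruth's implementation: the `expand_check` helper is hypothetical, and the `name[value]` naming is inferred from the sample output earlier on this page (`table_not_empty[orders]`).

```python
def expand_check(name, for_each, query_template):
    """Expand one fan-out block into one (check name, query) pair per value."""
    checks = []
    for value in for_each:
        # Substitute the ${each.value} placeholder used in the HCL example.
        query = query_template.replace("${each.value}", value)
        checks.append((f"{name}[{value}]", query))
    return checks

checks = expand_check(
    "table_not_empty",
    ["orders", "payments"],
    "select count(*) as n from ${each.value}",
)
for check_name, query in checks:
    print(check_name, "->", query)
```

Each expanded check runs and reports independently, which is why one table can warn while another passes.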

groundtruth works with your stack

It holds state and exposes health codes; your existing tooling pulls on its own schedule.

Postgres · SQLite · Prometheus · Kubernetes probes · Better Stack · Pingdom / UptimeRobot · Slack webhooks · MCP agents

Install groundtruth and run your first check

Write a check, run it once, then set it on a schedule. Everything you need is in the docs.