Domain teams own their data products. The platform handles orchestration, quality gates, multi-tenant isolation, and lineage. Declare in YAML. Write logic in SQL or Python. Ship.
You declare. The platform executes.
Data products can include SQL transforms, Python models, or full domain-bounded software logic. The platform handles orchestration, quality, and serving.
# inputs.yaml
source:
  type: cdc
  connection: warehouse-pg
  table: orders
  mode: incremental
  watermark: updated_at
CDC, API polling, file watch, or streaming: declare the source and the platform handles the rest.
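To make `mode: incremental` with `watermark: updated_at` concrete, here is a minimal sketch of watermark-based incremental reads. It is an illustration only: the platform's actual ingestion internals are not described here, `read_incremental` and `demo` are hypothetical names, and an in-memory SQLite table stands in for the `warehouse-pg` connection.

```python
import sqlite3


def read_incremental(conn, table, watermark_col, last_seen):
    """Return rows whose watermark is newer than last_seen, plus the new watermark.

    Assumption: table and column names come from trusted config (inputs.yaml),
    so they are interpolated directly; only the watermark value is a parameter.
    """
    query = (
        f"SELECT * FROM {table} "
        f"WHERE {watermark_col} > ? ORDER BY {watermark_col}"
    )
    cur = conn.execute(query, (last_seen,))
    columns = [d[0] for d in cur.description]
    rows = [dict(zip(columns, r)) for r in cur.fetchall()]
    # Advance the watermark only if something new arrived.
    new_watermark = rows[-1][watermark_col] if rows else last_seen
    return rows, new_watermark


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER, amount REAL, updated_at TEXT)")
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?, ?)",
        [(1, 10.0, "2024-01-01T00:00:00"), (2, 25.0, "2024-01-02T00:00:00")],
    )
    # First run: empty watermark, so every row is picked up.
    first, wm = read_incremental(conn, "orders", "updated_at", "")
    conn.execute("INSERT INTO orders VALUES (3, 5.0, '2024-01-03T00:00:00')")
    # Second run: only the row written after the stored watermark.
    second, wm = read_incremental(conn, "orders", "updated_at", wm)
    return first, second, wm
```

Each run reads only rows newer than the stored watermark, so the second call in `demo()` returns just the newly inserted order.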
# compute.yaml
engine: sql
schedule: "@hourly"
logic:
  entrypoint: logic/transform.sql

# logic/transform.sql
SELECT
  customer_id,
  SUM(amount) AS total_spend,
  COUNT(*) AS order_count
FROM {{ ref('raw_orders') }}
GROUP BY customer_id
SQL, Python, or custom code in logic/ — transforms run on schedule with full dependency tracking.
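One way dependency tracking could work is to read the `{{ ref('...') }}` calls out of each transform and order the runs topologically. The sketch below assumes exactly that; `extract_refs`, `run_order`, and the `top_customers` transform are hypothetical, and the platform's real resolver may differ.

```python
import re
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Matches {{ ref('name') }} with single or double quotes and flexible spacing.
REF_PATTERN = re.compile(r"\{\{\s*ref\(\s*['\"]([^'\"]+)['\"]\s*\)\s*\}\}")


def extract_refs(sql):
    """Return the set of upstream names a SQL transform refers to."""
    return set(REF_PATTERN.findall(sql))


def run_order(transforms):
    """Order transforms so every upstream runs before its dependents.

    `transforms` maps a transform name to its SQL text. Referenced names that
    are not transforms themselves (raw sources) also appear in the result.
    """
    graph = {name: extract_refs(sql) for name, sql in transforms.items()}
    return list(TopologicalSorter(graph).static_order())


transforms = {
    "customer_spend": "SELECT customer_id FROM {{ ref('raw_orders') }}",
    "top_customers": "SELECT * FROM {{ ref('customer_spend') }}",
}
order = run_order(transforms)
```

A cycle between transforms makes `TopologicalSorter` raise `graphlib.CycleError`, which is a convenient place to reject an invalid pipeline before anything runs.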
# serving.yaml
endpoints:
  - intent: lookup
    store: structured
    cache: { ttl: 300 }
  - intent: analytics
    store: analytics
  - intent: realtime
    store: realtime
Declare the access intent (lookup, analytics, time-series, or realtime) and the platform routes each request to the right store.
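Intent-based routing can be pictured as a lookup from declared intent to a backing store, with an optional cache in front. The following is a rough sketch under assumptions: `IntentRouter` is a hypothetical class, stores are modelled as plain callables, and `ttl` is treated as seconds, which the config does not actually state.

```python
import time


class IntentRouter:
    """Route queries to a store by declared intent, with optional TTL caching."""

    def __init__(self, endpoints, stores, clock=time.monotonic):
        # endpoints: list of dicts shaped like the entries in serving.yaml
        # stores: maps a store name to a callable taking a key (assumption)
        self._endpoints = {e["intent"]: e for e in endpoints}
        self._stores = stores
        self._clock = clock
        self._cache = {}  # (intent, key) -> (expires_at, value)

    def query(self, intent, key):
        endpoint = self._endpoints.get(intent)
        if endpoint is None:
            raise KeyError(f"no endpoint declared for intent {intent!r}")
        ttl = endpoint.get("cache", {}).get("ttl")  # assumed to be seconds
        now = self._clock()
        if ttl is not None:
            hit = self._cache.get((intent, key))
            if hit is not None and hit[0] > now:
                return hit[1]  # still fresh, skip the store
        value = self._stores[endpoint["store"]](key)
        if ttl is not None:
            self._cache[(intent, key)] = (now + ttl, value)
        return value
```

Passing the clock in as a parameter keeps cache expiry testable without sleeping.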
# quality.yaml
rules:
  - type: not_null
    columns: [customer_id, total_spend]
  - type: range
    column: total_spend
    min: 0
  - type: custom_sql
    query: "SELECT COUNT(*) = 0 FROM {{table}} WHERE order_count < 0"
sla:
  freshness: 2h
  min_score: 0.95
10 built-in check types plus custom SQL/Python. SLA enforcement, quality scoring, and breach alerts — all declarative.
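As an illustration of how declarative rules might turn into a quality score, the sketch below evaluates `not_null` and `range` rules over in-memory rows. The scoring formula (fraction of rules that pass) is an assumption, since the platform's actual scoring method is not documented here, and `quality_score` and `meets_sla` are hypothetical names.

```python
def quality_score(rules, rows):
    """Return the fraction of rules that pass (assumed scoring formula)."""
    results = []
    for rule in rules:
        if rule["type"] == "not_null":
            ok = all(
                row.get(col) is not None
                for row in rows
                for col in rule["columns"]
            )
        elif rule["type"] == "range":
            lo, hi = rule.get("min"), rule.get("max")
            col = rule["column"]
            # Nulls are left to the not_null rule; only present values are ranged.
            values = [row[col] for row in rows if row.get(col) is not None]
            ok = all(
                (lo is None or v >= lo) and (hi is None or v <= hi)
                for v in values
            )
        else:
            raise ValueError(f"unsupported rule type in this sketch: {rule['type']}")
        results.append(ok)
    # With no rules declared there is nothing to fail.
    return sum(results) / len(results) if results else 1.0


def meets_sla(score, min_score):
    """True when the score satisfies the declared min_score."""
    return score >= min_score
```

Under this assumed formula, one failing rule out of two gives a score of 0.5, which breaches a `min_score` of 0.95.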
Whether you build data products or run the platform, start here.