quality.yaml
Declares quality checks that run after every materialization. If a check with severity: error fails, downstream propagation is blocked.
Example

```yaml
apiVersion: akili/v1
kind: Quality

checks:
  # Tier 1: Declarative (built-in check types)
  - name: revenue_not_null
    type: completeness
    config:
      column: total_revenue
      threshold: 0.99
    severity: error

  - name: data_freshness
    type: freshness
    config:
      column: updated_at
      max_age: 6h
    severity: error

  - name: row_volume
    type: volume
    config:
      min_rows: 50
      max_rows: 100000
    severity: warn

  - name: revenue_positive
    type: custom_expression
    config:
      expression: "total_revenue >= 0"
      threshold: 1.0
    severity: error

  # Tier 2: Custom SQL
  - name: no_duplicate_outlets_per_day
    type: custom_sql
    sql: |
      SELECT COUNT(*) as failures
      FROM {output}
      WHERE (outlet_id, sale_date) IN (
        SELECT outlet_id, sale_date
        FROM {output}
        GROUP BY outlet_id, sale_date
        HAVING COUNT(*) > 1
      )
    severity: error
    expect: "failures = 0"

  # Tier 3: Custom Python
  - name: revenue_distribution_stable
    type: custom_python
    entrypoint: checks/revenue_drift.py
    severity: warn
```

Three Expressiveness Tiers

| Tier | Defined In | Translated To | Use Case |
|---|---|---|---|
| Declarative | YAML type + config | Platform-generated SQL as @asset_check | Standard checks: completeness, freshness, volume, uniqueness, range |
| Custom SQL | Inline sql block | Platform-wrapped SQL as @asset_check | Complex business logic, multi-table assertions |
| Custom Python | External .py file | Platform-wrapped function as @asset_check | Statistical analysis, ML drift detection, custom algorithms |
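
To make the "Translated To" column more concrete, here is a minimal sketch of how a declarative `completeness` check could be rendered into SQL and evaluated. The helper names (`render_completeness_sql`, `run_completeness`), the `daily_sales` table, and the use of SQLite are assumptions for illustration only; the SQL the platform actually generates is not shown in this document.

```python
import sqlite3


def render_completeness_sql(table: str, column: str) -> str:
    """Render SQL returning the fraction of non-null values in `column`.

    Hypothetical helper: identifiers are interpolated directly, so a real
    implementation would need to validate or quote them.
    """
    # COUNT(column) skips NULLs, while COUNT(*) counts every row.
    return (
        f"SELECT CAST(COUNT({column}) AS REAL) / COUNT(*) AS completeness "
        f"FROM {table}"
    )


def run_completeness(conn, table: str, column: str, threshold: float) -> bool:
    """Return True when the non-null fraction is at least `threshold`."""
    (fraction,) = conn.execute(render_completeness_sql(table, column)).fetchone()
    # An empty table yields NULL (division by zero in SQLite); treat it as a failure.
    return fraction is not None and fraction >= threshold


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE daily_sales (outlet_id TEXT, total_revenue REAL)")
conn.executemany(
    "INSERT INTO daily_sales VALUES (?, ?)",
    [("a", 10.0), ("b", 12.5), ("c", None), ("d", 7.0)],
)

passed = run_completeness(conn, "daily_sales", "total_revenue", 0.99)
print(passed)  # 3 of 4 values are non-null, so a 0.99 threshold is not met
```

The same pattern (render a query from `type` and `config`, then compare one scalar against a bound) would plausibly cover `volume`, `range`, and `custom_expression` as well.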
Built-In Check Types (Tier 1)

| Type | Config Fields | What It Checks |
|---|---|---|
| `completeness` | `column`, `threshold` (0.0-1.0) | Fraction of non-null values >= threshold |
| `freshness` | `column` (timestamp), `max_age` (duration) | Most recent value within `max_age` of now |
| `volume` | `min_rows`, `max_rows` (optional) | Row count within bounds |
| `uniqueness` | `columns` (string[]) | No duplicate values for given column combination |
| `range` | `column`, `min`, `max` | All values within [min, max] |
| `accepted_values` | `column`, `values` (string[]) | All values in allowed set |
| `referential` | `column`, `reference_product`, `reference_column` | All values exist in referenced product |
| `custom_expression` | `expression` (SQL boolean), `threshold` | Fraction of rows satisfying expression >= threshold |
| `regex` | `column`, `pattern` | All non-null values match regex |
| `statistical` | `column`, `metric` (mean/stddev/median), `min`, `max` | Aggregate metric within bounds |
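
As an illustration of the `freshness` row, the sketch below parses a `max_age` value and compares it with the most recent timestamp. The duration grammar (an integer followed by `s`, `m`, `h`, or `d`) is an assumption inferred from the `6h` example; this document does not specify the accepted formats, and `parse_duration` and `is_fresh` are hypothetical names.

```python
import re
from datetime import datetime, timedelta, timezone

# Assumed unit suffixes; only "6h" appears in the example configuration.
_UNITS = {"s": "seconds", "m": "minutes", "h": "hours", "d": "days"}


def parse_duration(text: str) -> timedelta:
    """Parse a duration such as '6h' or '90m' into a timedelta."""
    match = re.fullmatch(r"(\d+)([smhd])", text.strip())
    if match is None:
        raise ValueError(f"unsupported duration: {text!r}")
    amount, unit = match.groups()
    return timedelta(**{_UNITS[unit]: int(amount)})


def is_fresh(timestamps, max_age: str, now: datetime) -> bool:
    """Return True when the most recent timestamp is within max_age of `now`."""
    values = [ts for ts in timestamps if ts is not None]
    if not values:
        return False  # no timestamps at all is treated as stale in this sketch
    return now - max(values) <= parse_duration(max_age)


now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
updated_at = [now - timedelta(hours=9), now - timedelta(hours=2), None]

print(is_fresh(updated_at, "6h", now))   # newest value is 2 hours old
print(is_fresh(updated_at, "90m", now))  # 2 hours exceeds 90 minutes
```

Passing `now` explicitly keeps the function deterministic, which makes it easier to exercise in a unit test.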
Custom SQL (Tier 2)

The `sql` block can use `{output}` as a placeholder for the materialized table reference. The platform injects the correct tenant-scoped, partition-scoped table name at execution time.
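The `expect` field is a simple assertion of the form `"column_name operator value"`, where `column_name` is a column returned by the query (for example, `"failures = 0"`). Supported operators: `=`, `<`, `>`, `<=`, `>=`, `!=`.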
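
The two mechanics described here, placeholder substitution and the `expect` assertion, can be sketched as follows. This is an illustrative approximation rather than the platform's implementation: `evaluate_expect` and `run_custom_sql` are hypothetical names, the parser assumes a numeric literal on the right-hand side, and SQLite stands in for the real warehouse. The query is a simplified variant of the duplicate check that counts duplicated key pairs.

```python
import operator
import re
import sqlite3

# The six operators listed for the expect field.
_OPERATORS = {
    "=": operator.eq, "!=": operator.ne,
    "<": operator.lt, "<=": operator.le,
    ">": operator.gt, ">=": operator.ge,
}
# Two-character operators are listed first so "<=" is not read as "<".
_EXPECT = re.compile(r"^\s*(\w+)\s*(<=|>=|!=|=|<|>)\s*(-?\d+(?:\.\d+)?)\s*$")


def evaluate_expect(expect: str, row: dict) -> bool:
    """Evaluate 'column_name operator value' against one result row."""
    match = _EXPECT.match(expect)
    if match is None:
        raise ValueError(f"cannot parse expect clause: {expect!r}")
    column, symbol, literal = match.groups()
    return _OPERATORS[symbol](row[column], float(literal))


def run_custom_sql(conn, sql_template: str, table: str, expect: str) -> bool:
    """Substitute {output}, run the query, and apply the expect assertion."""
    # Plain string replacement; a real system would inject a validated name.
    cursor = conn.execute(sql_template.replace("{output}", table))
    names = [description[0] for description in cursor.description]
    row = dict(zip(names, cursor.fetchone()))
    return evaluate_expect(expect, row)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales_2024_01_01 (outlet_id TEXT, sale_date TEXT)")
conn.executemany(
    "INSERT INTO sales_2024_01_01 VALUES (?, ?)",
    [("o1", "2024-01-01"), ("o1", "2024-01-01"), ("o2", "2024-01-01")],
)

SQL = """
SELECT COUNT(*) AS failures FROM (
    SELECT outlet_id, sale_date FROM {output}
    GROUP BY outlet_id, sale_date
    HAVING COUNT(*) > 1
)
"""
result = run_custom_sql(conn, SQL, "sales_2024_01_01", "failures = 0")
print(result)  # one duplicated (outlet_id, sale_date) pair, so the check fails
```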
Custom Python (Tier 3)

The entrypoint file must expose a function with this signature:

```python
def check(df: pd.DataFrame, context: dict) -> CheckResult:
    """
    Args:
        df: The materialized output as a DataFrame.
        context: Dict with keys: tenant_id, product_name, partition_key,
            previous_df (last successful materialization, or None).

    Returns:
        CheckResult with passed (bool), message (str), metadata (dict).
    """
```

`context["previous_df"]` enables drift detection by comparing the current materialization against the previous one.
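
A drift check built on `previous_df` might look like the sketch below, which compares the mean of `total_revenue` between the current and previous materializations. Several parts are assumptions: the `CheckResult` dataclass is a stand-in with the three fields named in the signature's docstring (the platform's real type and import path are not given here), and `MAX_RELATIVE_SHIFT` is a hypothetical tolerance.

```python
from dataclasses import dataclass, field
from typing import TYPE_CHECKING

if TYPE_CHECKING:  # pandas is needed only for the type annotation
    import pandas as pd


@dataclass
class CheckResult:
    """Stand-in for the platform's CheckResult type (assumed shape)."""
    passed: bool
    message: str
    metadata: dict = field(default_factory=dict)


MAX_RELATIVE_SHIFT = 0.25  # hypothetical tolerance for the change in mean


def check(df: "pd.DataFrame", context: dict) -> CheckResult:
    """Flag a large relative shift in mean total_revenue between runs."""
    previous = context.get("previous_df")
    if previous is None:
        # First run: there is nothing to compare against.
        return CheckResult(True, "no previous materialization; drift check skipped")

    current_mean = float(df["total_revenue"].mean())
    previous_mean = float(previous["total_revenue"].mean())
    metadata = {"current_mean": current_mean, "previous_mean": previous_mean}

    if previous_mean == 0:
        # A relative shift is undefined; pass only if the mean is still zero.
        return CheckResult(current_mean == 0, "previous mean is zero", metadata)

    shift = abs(current_mean - previous_mean) / abs(previous_mean)
    metadata["relative_shift"] = shift
    return CheckResult(
        passed=shift <= MAX_RELATIVE_SHIFT,
        message=f"mean total_revenue shifted by {shift:.1%}",
        metadata=metadata,
    )
```

Returning the computed means in `metadata` keeps the evidence next to the verdict, which helps when a `warn`-level check fires and someone has to decide whether the shift is real.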
Severity and Blocking

| Severity | On Failure |
|---|---|
| `error` | Block downstream serving writes. Emit failure event. Enter retry/DLQ. |
| `warn` | Log warning. Continue serving writes. Emit warning event. |
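
The gating rule in this table reduces to a small decision function: any failed `error` check blocks serving writes, while failed `warn` checks are only reported. The sketch below assumes a hypothetical `Outcome` record and `summarize` helper; event emission and retry/DLQ handling are outside its scope.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Outcome:
    """Result of one executed check (hypothetical shape)."""
    name: str
    severity: str  # "error" or "warn"
    passed: bool


def should_block(outcomes) -> bool:
    """Block serving writes when any error-severity check failed."""
    return any(o.severity == "error" and not o.passed for o in outcomes)


def summarize(outcomes) -> dict:
    """Group failed checks by severity alongside the blocking decision."""
    failed = [o for o in outcomes if not o.passed]
    return {
        "blocked": should_block(outcomes),
        "errors": [o.name for o in failed if o.severity == "error"],
        "warnings": [o.name for o in failed if o.severity == "warn"],
    }


outcomes = [
    Outcome("revenue_not_null", "error", True),
    Outcome("row_volume", "warn", False),
    Outcome("no_duplicate_outlets_per_day", "error", False),
]
print(summarize(outcomes))
```

With only the `row_volume` warning failing, `blocked` would be `False` and serving writes would continue.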
Developer Testing

Products can also declare tests in the `tests` block:

```yaml
tests:
  - name: customer_transform_basic
    type: unit
    description: "Validates basic customer transform with standard inputs"
    fixtures:
      inputs:
        raw_customers: tests/fixtures/raw_customers.csv
        region_lookup: tests/fixtures/region_lookup.csv
    assertions:
      - column: customer_id
        condition: not_null
      - column: lifetime_value
        condition: ">= 0"
```

| Test Type | Purpose | Execution |
|---|---|---|
| `unit` | Validates transform logic with fixture data | `akili test run`, runs in CI before deploy |
| `contract` | Validates interface agreement with upstream | Runs on deploy, fails deployment if violated |
| `canary` | Validates production output against expectations | Runs after each execution, alerts only |
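
To show how the `assertions` of a `unit` test could be applied to a transform's output, here is a small sketch that evaluates `not_null` and comparison conditions row by row. The condition grammar is an assumption based on the two conditions that appear in the example, the inline CSV stands in for the output of a transform run on fixture data, and `value_satisfies` and `failing_rows` are hypothetical names; this is not how `akili test run` is known to work internally.

```python
import csv
import io
import operator

# Two-character symbols come first so ">=" is not matched as ">".
_COMPARATORS = [
    (">=", operator.ge), ("<=", operator.le), ("!=", operator.ne),
    (">", operator.gt), ("<", operator.lt), ("=", operator.eq),
]


def value_satisfies(value, condition: str) -> bool:
    """Check one cell against 'not_null' or '<operator> <number>'."""
    is_null = value is None or value == ""
    if condition == "not_null":
        return not is_null
    for symbol, compare in _COMPARATORS:
        if condition.startswith(symbol):
            bound = float(condition[len(symbol):])
            # A null cell cannot satisfy a numeric comparison.
            return not is_null and compare(float(value), bound)
    raise ValueError(f"unsupported condition: {condition!r}")


def failing_rows(rows, assertions):
    """Return (row_index, column, condition) for every violated assertion."""
    failures = []
    for index, row in enumerate(rows):
        for assertion in assertions:
            column, condition = assertion["column"], assertion["condition"]
            if not value_satisfies(row.get(column), condition):
                failures.append((index, column, condition))
    return failures


# Stand-in for the transform output produced from the fixture inputs.
OUTPUT_CSV = """customer_id,lifetime_value
c1,120.50
c2,0
,35.00
c4,-4.25
"""
rows = list(csv.DictReader(io.StringIO(OUTPUT_CSV)))
assertions = [
    {"column": "customer_id", "condition": "not_null"},
    {"column": "lifetime_value", "condition": ">= 0"},
]
print(failing_rows(rows, assertions))
# [(2, 'customer_id', 'not_null'), (3, 'lifetime_value', '>= 0')]
```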