
Quality & Governance

Akili implements Federated Computational Governance — domain teams own their data product quality and classification, while the platform enforces global policies computationally. Governance has four pillars: lineage, classification, quality enforcement, and SLA management.

```mermaid
flowchart TD
    MAT[Materialization Complete] --> QC[Run Quality Checks]

    subgraph Tiers["Three Expressiveness Tiers"]
        QC --> T1[Tier 1: Declarative YAML]
        QC --> T2[Tier 2: Custom SQL]
        QC --> T3[Tier 3: Custom Python]
    end

    T1 --> SCORE[Calculate Quality Score]
    T2 --> SCORE
    T3 --> SCORE

    SCORE --> GATE{Pass / Fail Gate}

    GATE -->|All blocking checks pass| PROMOTE[Promote to Serving Stores]
    GATE -->|Any blocking check fails| BLOCK[Block Downstream Propagation]

    PROMOTE --> DOWNSTREAM[Publish data.available]
    PROMOTE --> CATALOG[Update Quality Score in Catalog]
    BLOCK --> ALERT[Emit quality.failed Event]
    BLOCK --> STALE[Consumers See Stale-but-Correct Data]
```

Developers declare quality rules in `quality.yaml`; the platform translates them into quality check functions.

| Tier | Developer Writes | Platform Translates To |
|---|---|---|
| Declarative YAML | `type: not_null, column: X` | Quality check with SQL null-count query |
| Custom SQL | `type: custom_sql, sql: "..."` | Quality check that executes SQL and checks assertion |
| Custom Python | `type: custom_python, module: X` | Quality check that imports and calls the function |

Example `quality.yaml`:

```yaml
transform_checks:
  - name: completeness_outlet_id
    type: not_null
    column: outlet_id
    severity: blocking
  - name: revenue_positive
    type: expression
    sql: "SELECT COUNT(*) FROM {table} WHERE total_revenue < 0"
    threshold: 0
    severity: blocking
  - name: row_count_reasonable
    type: volume_anomaly
    threshold: 0.3
    severity: warning
```

  • severity: blocking — The platform blocks downstream materialization. No data is promoted to serving stores.
  • severity: warning — The platform logs a warning; downstream processing continues.
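The blocking/warning gate logic above can be sketched in Python. This is a minimal illustration, not the platform's implementation; the `CheckResult` type and `evaluate_gate` function are hypothetical names chosen for this example.

```python
import dataclasses


@dataclasses.dataclass
class CheckResult:
    name: str
    passed: bool
    severity: str  # "blocking" or "warning"


def evaluate_gate(results):
    """Promote only if every blocking check passed; failed warnings are logged but never block."""
    blocking_failures = [r.name for r in results if r.severity == "blocking" and not r.passed]
    warnings = [r.name for r in results if r.severity == "warning" and not r.passed]
    return {
        "promote": not blocking_failures,
        "blocking_failures": blocking_failures,
        "warnings": warnings,
    }


# Mirrors the example quality.yaml: one blocking check fails, so nothing is promoted.
decision = evaluate_gate([
    CheckResult("completeness_outlet_id", passed=True, severity="blocking"),
    CheckResult("revenue_positive", passed=False, severity="blocking"),
    CheckResult("row_count_reasonable", passed=False, severity="warning"),
])
```

A single failed blocking check is sufficient to hold back the entire materialization, which is what makes the gate a prerequisite rather than advisory.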

This is the structural mechanism behind the platform’s “no bad data served” guarantee. Quality gates are not opt-in — they are prerequisites for data promotion.

Each product’s quality score is a rolling average of check results over the last 30 days:

```
quality_score = (passing_checks / total_checks) * 100
```

Scores are synced to the data catalog and surfaced in both the catalog UI and the developer portal.
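The score formula is a straightforward ratio over the 30-day window. A minimal sketch (the `quality_score` helper is a hypothetical name; the platform's actual aggregation may weight or window results differently):

```python
def quality_score(check_results):
    """Rolling quality score over a window of boolean check results (True = pass)."""
    if not check_results:
        return None  # no checks ran in the window; no score to report
    passing = sum(1 for passed in check_results if passed)
    return (passing / len(check_results)) * 100


# 28 of 30 check runs passed over the last 30 days
score = quality_score([True] * 28 + [False] * 2)
```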

Every data product declares a sensitivity level, ordered by increasing restriction:

```
public -> internal -> confidential -> restricted
```

The output classification of a data product must be greater than or equal to the highest classification of any input. This is enforced at deploy time.

```
raw.orders (internal) + raw.payroll (confidential)
= output MUST be >= confidential
```

This prevents two governance violations:

  1. Classification laundering — Creating a “public” product that reads from “confidential” inputs
  2. Clearance bypass — Building a product using inputs the developer lacks clearance to access

Classification propagation is transitive — if product C depends on B which depends on A (confidential), then C’s classification must be >= confidential even though C only directly references B.
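The "output >= highest input" rule is a max over an ordered set of levels, and transitivity falls out of applying it at every hop. A minimal sketch of the deploy-time check, assuming the four-level ordering above (function and variable names are illustrative, not the platform's API):

```python
# The four sensitivity levels, in increasing order of restriction.
LEVELS = ["public", "internal", "confidential", "restricted"]
RANK = {name: i for i, name in enumerate(LEVELS)}


def minimum_output_classification(input_levels):
    """The output must be at least as restricted as the most restricted input."""
    return max(input_levels, key=RANK.__getitem__)


def validate_classification(output_level, input_levels):
    """Deploy-time check: reject classification laundering."""
    required = minimum_output_classification(input_levels)
    if RANK[output_level] < RANK[required]:
        raise ValueError(f"output '{output_level}' must be at least '{required}'")
    return True


# raw.orders (internal) + raw.payroll (confidential) => at least confidential
required = minimum_output_classification(["internal", "confidential"])
```

Because each product's level already reflects its own inputs, checking only the direct inputs at each deploy is enough to make the property hold transitively across the whole graph.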

Beyond product-level classification, individual columns can declare their own sensitivity using a hierarchical taxonomy:

| Classification | Full Access | Masked Access | Denied |
|---|---|---|---|
| `pii.name` | Original value | SHA-256 hash (truncated) | Column omitted |
| `pii.identifier` | Original value | Last 4 chars, rest `*` | Column omitted |
| `pii.contact` | Original value | `[REDACTED]` | Column omitted |
| `business.confidential` | Original value | N/A (full or nothing) | Column omitted |
| `business.internal` | Always visible | Always visible | Always visible |
| `public` | Always visible | Always visible | Always visible |

The serving layer applies dynamic masking at query time based on consumer clearance. This eliminates the need for derivative products just to strip sensitive columns.
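The masking rules in the table reduce to a per-column function of (classification, clearance). A minimal sketch, assuming the table above; `mask_value` and its access levels are illustrative names, and returning `None` stands in for omitting the column from the result set:

```python
import hashlib


def mask_value(value, classification, access):
    """Apply the masking policy from the taxonomy table.

    access is one of "full", "masked", or "denied".
    Returns None when the column is omitted entirely.
    """
    always_visible = ("business.internal", "public")
    if classification in always_visible:
        return value
    if access == "denied":
        return None  # column omitted from the result set
    if access == "masked":
        if classification == "pii.name":
            return hashlib.sha256(value.encode()).hexdigest()[:12]  # truncated hash
        if classification == "pii.identifier":
            return "*" * (len(value) - 4) + value[-4:]  # last 4 chars visible
        if classification == "pii.contact":
            return "[REDACTED]"
        if classification == "business.confidential":
            return None  # full or nothing
    return value  # full access


masked_id = mask_value("ID-48213-77", "pii.identifier", "masked")
```

Because masking happens at query time, one physical product can serve consumers at every clearance level, which is what removes the need for stripped-down derivative products.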

Metadata flows into the data catalog from four pathways:

| Pathway | When | What |
|---|---|---|
| Manifest Registration | Build-time | Product identity, schemas, classification, access teams |
| Asset Graph | Deploy-time | Dependency edges, external system connections, serving store edges |
| Execution Events | Run-time | Freshness, row count, duration, quality scores, partition status |
| Deployment Lineage | Version tracking | Which manifest version produced which data snapshot |

The sync is one-directional: the execution engine is the source of truth for operational data, while the data catalog provides discovery, search, and visualization.

The lineage graph enables two key queries:

  • “What breaks if X fails?” — Show all downstream dependents
  • “Where does Y come from?” — Show full upstream lineage to source systems
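Both queries are graph traversals over the same edge set, one following edges forward and one backward. A minimal sketch over an adjacency map (the example graph and function names are hypothetical):

```python
from collections import defaultdict

# product -> its direct upstream inputs (illustrative example graph)
UPSTREAM = {
    "finance.revenue_daily": ["raw.orders", "raw.payments"],
    "exec.kpi_dashboard": ["finance.revenue_daily"],
}


def upstream_lineage(product):
    """'Where does Y come from?' — transitive closure up to source systems."""
    seen, stack = set(), [product]
    while stack:
        for dep in UPSTREAM.get(stack.pop(), []):
            if dep not in seen:
                seen.add(dep)
                stack.append(dep)
    return seen


def downstream_dependents(product):
    """'What breaks if X fails?' — invert the edges, then traverse the same way."""
    downstream = defaultdict(list)
    for child, parents in UPSTREAM.items():
        for parent in parents:
            downstream[parent].append(child)
    seen, stack = set(), [product]
    while stack:
        for dep in downstream.get(stack.pop(), []):
            if dep not in seen:
                seen.add(dep)
                stack.append(dep)
    return seen
```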

Each data product has implicit SLA expectations based on its schedule:

| Signal | Threshold | Alert |
|---|---|---|
| Freshness | 2x schedule interval | "Product X is stale" |
| Quality score | < 95% over 24h | "Quality degradation on X" |
| Execution duration | > 3x historical p95 | "Slow execution on X" |
| Availability | Any serving failure | "Serving endpoint down for X" |

A platform sensor monitors these every 5 minutes, emitting alerts via event bus messages to notification channels.
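The sensor's per-product evaluation is a handful of threshold comparisons. A minimal sketch of one evaluation pass, assuming the thresholds in the table (the `sla_alerts` signature and its inputs are illustrative, not the sensor's actual interface):

```python
from datetime import datetime, timedelta


def sla_alerts(product, now, last_success, schedule_interval,
               quality_24h, duration, p95, serving_ok=True):
    """Evaluate the four implicit SLA signals; returns the alert messages to emit."""
    alerts = []
    if now - last_success > 2 * schedule_interval:
        alerts.append(f"Product {product} is stale")
    if quality_24h < 95.0:
        alerts.append(f"Quality degradation on {product}")
    if duration > 3 * p95:
        alerts.append(f"Slow execution on {product}")
    if not serving_ok:
        alerts.append(f"Serving endpoint down for {product}")
    return alerts


# Daily product: fresh (36h < 2 * 24h) and healthy quality, but a slow run.
alerts = sla_alerts(
    "finance.revenue_daily",
    now=datetime(2025, 1, 2, 12, 0),
    last_success=datetime(2025, 1, 1, 0, 0),
    schedule_interval=timedelta(hours=24),
    quality_24h=96.0,
    duration=timedelta(minutes=15),
    p95=timedelta(minutes=4),
)
```

Deriving the freshness threshold from the declared schedule is what makes the SLA "implicit": a daily product is alerted after 48 hours, an hourly one after 2 hours, with no per-product configuration.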

As products are deployed, the platform automatically builds a business ontology — a structured vocabulary of the organization’s data concepts. Concepts emerge from product metadata rather than being imposed top-down.

| Source | Extraction Rule |
|---|---|
| Identity columns | Each unique identity column name becomes a concept |
| Domain names | Each domain becomes a concept |
| Product tags | Tags in `product.yaml` are registered as concept associations |
| Semantic intents | Intent metadata enriches column concepts with operational semantics |

Each extracted concept moves through a review lifecycle:

| State | Meaning |
|---|---|
| `draft` | Auto-extracted, not yet reviewed |
| `proposed` | Submitted for domain approval |
| `accepted` | Approved by domain owner |
| `canonical` | Organization-wide standard term (Published Language) |
| `deprecated` | Retained for historical reference |

The platform supports structured deletion that propagates through the lineage graph:

  1. Request — Deletion request submitted via API
  2. Impact — Platform traverses lineage graph to identify affected products
  3. Plan — Deletion plan: which products, which records, which method
  4. Execute — Position delete files are written in the data lake (non-destructive, auditable)
  5. Verify — Post-deletion verification confirms no residual data
  6. Audit — Permanent audit record (never deleted)

Deletion propagation stops at aggregation boundaries where individual contributions cannot be identified (e.g., SUM, COUNT).
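The impact traversal (step 2) is the same downstream walk used for lineage queries, except it prunes branches at aggregation boundaries. A minimal sketch, assuming each lineage edge is annotated with whether the downstream transform preserves individual records (the graph and `deletion_impact` name are illustrative):

```python
# product -> list of (downstream product, transform kind) edges; illustrative graph
DOWNSTREAM = {
    "raw.customers": [
        ("crm.customer_profile", "row_level"),
        ("finance.customer_count", "aggregation"),  # SUM/COUNT: boundary
    ],
    "crm.customer_profile": [("marketing.segments", "row_level")],
}


def deletion_impact(product):
    """Products that need record-level deletion when `product` receives a deletion request.

    Traversal stops at aggregation boundaries, where individual
    contributions can no longer be identified.
    """
    affected, stack = set(), [product]
    while stack:
        for child, kind in DOWNSTREAM.get(stack.pop(), []):
            if kind == "aggregation":
                continue  # nothing identifiable to delete from here onward
            if child not in affected:
                affected.add(child)
                stack.append(child)
    return affected
```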

Products declare retention in `product.yaml`:

```yaml
retention:
  period: "365d"
  basis: created_at
  review_date: "2026-06-01"
```

The platform evaluates retention daily and emits retention.expired events, but does not auto-delete. Product owners must explicitly trigger deletion or extend the period.
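The daily evaluation reduces to comparing each record's basis column against the declared period. A minimal sketch, assuming day-granularity periods like `"365d"` (the `retention_expired` helper is a hypothetical name; it only flags records for a `retention.expired` event, it never deletes):

```python
from datetime import date, timedelta


def retention_expired(record_date, period, today):
    """True if the record's age exceeds the declared retention period.

    period uses the product.yaml format, e.g. "365d". This sketch only
    supports day-based periods.
    """
    if not period.endswith("d"):
        raise ValueError("only day-based periods supported in this sketch")
    return today - record_date > timedelta(days=int(period[:-1]))


# Record created 2024-01-01 against the example "365d" policy, evaluated 2025-06-01.
expired = retention_expired(date(2024, 1, 1), "365d", today=date(2025, 6, 1))
```

Keeping deletion a separate, explicit step preserves the audit trail: the platform can prove it flagged expiry on time, while the product owner remains accountable for the destructive action.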