Data Lifecycle
Every data product on the Akili platform follows a deterministic lifecycle defined entirely by six declarative manifest files. Developers declare what they want — the platform handles how it executes.
Lifecycle Stages
```mermaid
%%{init: {'flowchart': {'curve': 'basis'}}}%%
flowchart LR
    A[Declare] --> B[Ingest]
    B --> C[Transform]
    C --> D[Quality Gate]
    D --> E[Serve]
    E --> F[Monitor]
    F --> G[Archive]
    D -- "fail (critical)" --> H[DLQ]
    H -- "replay" --> C
```
Stage 1: Declare
The data product starts as six manifest files that fully describe its behavior:
| Manifest | What It Defines |
|---|---|
| `product.yaml` | Identity, ownership, archetype, classification, schedule |
| `inputs.yaml` | Source connections and upstream product dependencies |
| `transform.sql` / `transform.py` | Business logic (SQL or Python) |
| `output.yaml` | Output schema, partitioning, retention |
| `quality.yaml` | Quality checks and SLA definitions |
| `serving.yaml` | Serving intents (analytics, lookup, streaming) |
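A minimal `product.yaml` might look like the following. This is an illustrative sketch only — the key names (`name`, `namespace`, `owner`, `archetype`, `classification`, `schedule`) are assumptions mapped from the table above, not a confirmed schema:

```yaml
# Hypothetical product.yaml sketch; exact keys may differ.
name: daily-orders          # identity
namespace: sales
owner: sales-data-team      # ownership
archetype: aggregate        # archetype
classification: internal    # classification
schedule: "0 6 * * *"       # schedule (cron)
```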
Manifests are validated locally with `akili validate` and registered with the platform via `akili product create`.
```sh
# Validate manifests
akili validate .akili/

# Register the product
akili product create --name daily-orders --namespace sales

# Deploy to the execution engine
akili product deploy daily-orders
```

Stage 2: Ingest
Input ports pull data from external sources or upstream data products. The ingestion strategy is declared in `inputs.yaml`:
| Strategy | Behavior | Use Case |
|---|---|---|
| `cdc` | Continuous change data capture from the database WAL | Real-time from relational databases |
| `incremental` | Cursor-based extraction (`WHERE updated_at > last_run`) | Periodic sync from any SQL source |
| `full_refresh` | Complete dataset extraction on every run | Small reference tables |
| `snapshot_diff` | Full extract with row-level diff against the previous snapshot | Legacy sources without change tracking |
Each ingestion strategy tracks its own state (WAL position, cursor high-water mark, or snapshot hash) to enable exactly-once semantics across failures.
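The cursor mechanics behind the `incremental` strategy can be sketched as follows. This is a simplified illustration of high-water-mark tracking, not the platform's implementation:

```python
def incremental_extract(rows, last_cursor):
    """Return rows newer than the stored high-water mark, plus the new mark.

    Mirrors `WHERE updated_at > last_run`: only rows strictly after the
    cursor are extracted, and the cursor advances to the max value seen,
    so a re-run after a failure never re-emits already-committed rows.
    """
    new_rows = [r for r in rows if r["updated_at"] > last_cursor]
    new_cursor = max((r["updated_at"] for r in new_rows), default=last_cursor)
    return new_rows, new_cursor
```

Persisting `new_cursor` only after the downstream write commits is what makes the strategy safe to retry.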
Stage 3: Transform
The execution engine runs the declared transformation logic. Transforms reference upstream data using the `{{ ref() }}` macro:
```sql
SELECT
    DATE_TRUNC('day', o.order_date) AS date,
    o.region,
    SUM(o.total_amount) AS total_revenue
FROM {{ ref('orders') }} o
GROUP BY 1, 2
```

The platform generates all orchestration code automatically via `akili codegen`. Developers write only business logic — no Dagster code, no boilerplate.
Transforms execute in isolated compute containers with resource limits from `compute.yaml`:
```yaml
compute:
  engine: dagster
  schedule: "0 6 * * *"
  resources:
    cpu: "1"
    memory: 2Gi
  timeout: 1800
  retries: 2
```

Stage 4: Quality Gate
After transformation, quality checks from `quality.yaml` execute. This is a blocking gate — data does not proceed to serving stores until quality is verified.
```mermaid
%%{init: {'flowchart': {'curve': 'basis'}}}%%
flowchart TB
    TRANSFORM[Transform Complete] --> QC[Run Quality Checks]
    QC --> CRITICAL{Critical checks pass?}
    CRITICAL -- yes --> WARNING[Evaluate warnings]
    WARNING --> PROMOTE[Promote to serving]
    CRITICAL -- no --> BLOCK[Block promotion]
    BLOCK --> DLQ[DLQ after retries]
    BLOCK --> ALERT[SLA breach alert]
```
| Severity | Behavior |
|---|---|
| `critical` | Blocks promotion. Consumers continue to see the last good data. |
| `warning` | Logged but does not block. Data is promoted and the issue is flagged. |
Quality scores are recorded per product and per run, enabling trend analysis and regression detection.
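A `quality.yaml` declaring one check at each severity might look like the sketch below. The check types (`not_null`, `row_count_delta`) and key names are illustrative assumptions, not a documented schema:

```yaml
# Hypothetical quality.yaml sketch; check types and keys are illustrative.
checks:
  - name: order_id_not_null
    type: not_null
    column: order_id
    severity: critical       # failure blocks promotion to serving
  - name: row_count_drift
    type: row_count_delta
    threshold: 0.2
    severity: warning        # failure is logged and flagged, data still promoted
sla:
  freshness: "2h"
```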
Stage 5: Serve
Once quality gates pass, data is routed to serving stores based on `serving.yaml`. The platform supports intent-based routing — each serving mode targets a different access pattern:
| Intent | Backing Store | Access Pattern |
|---|---|---|
| `analytics` | StarRocks (via Iceberg federation) | OLAP queries, dashboards, aggregations |
| `lookup` | PostgreSQL | Point lookups by key, low-latency reads |
| `streaming` | Redpanda | Real-time event stream for downstream consumers |
Data is always written to the data lake (Iceberg on Ceph RGW) first, then materialized to serving stores. The lake is the single source of truth; serving stores are derived views.
```yaml
serving:
  modes:
    - intent: analytics
      config:
        engine: starrocks
        materialized: true
        refresh_interval: "1h"
    - intent: lookup
      config:
        engine: postgresql
        cache_ttl: "5m"
```

Stage 6: Monitor
The platform continuously monitors every deployed data product:
- Freshness SLA — is the latest materialization within the configured threshold?
- Quality score — rolling average of quality check pass rates
- Execution health — failure rate, duration trends, retry frequency
- Serving availability — are serving endpoints responsive?
When an SLA is breached, the platform emits an `sla.breach` event and notifies configured channels (email, webhook, PagerDuty).
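The freshness check reduces to a simple comparison. A minimal sketch (illustrative only, not the platform's monitoring code):

```python
from datetime import datetime, timedelta

def freshness_breached(last_materialized: datetime,
                       threshold: timedelta,
                       now: datetime) -> bool:
    """True when the latest materialization is older than the freshness SLA."""
    return now - last_materialized > threshold
```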
```sh
# Check current SLA status
akili governance sla daily-orders

# View execution history
akili run list daily-orders

# Check quality scores
akili governance quality daily-orders
```

Stage 7: Archive
Data products have configurable retention policies. When data exceeds the retention period:
- A `retention.expired` governance event is emitted
- The product owner is notified
- The owner explicitly triggers deletion or extends retention
- Deletion uses position delete files — non-destructive until compaction
The platform never auto-deletes data. This is a deliberate safety mechanism to prevent accidental data loss from misconfigured retention.
Scheduling Modes
Three scheduling modes control when a data product’s pipeline executes:
| Mode | Trigger | Use Case |
|---|---|---|
| `cron` | Time-based schedule (e.g., `"0 6 * * *"`) | Regular batch processing |
| `event` | Upstream `data.available` event | Reactive, data-driven pipelines |
| `adaptive` | Event-driven with debounce and fallback cron | Near-real-time freshness SLAs |
The adaptive mode deploys a per-product trigger agent that consumes event bus messages, debounces them, and fires materializations when freshness would breach the SLA.
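The trigger agent's decision logic can be sketched as below. This is a toy model under stated assumptions (monotonic-time floats, single product), not the agent's actual implementation:

```python
class AdaptiveTrigger:
    """Toy debounce logic for the adaptive scheduling mode (illustrative).

    Upstream events are absorbed while the debounce window is open; a
    materialization fires once the window goes quiet, or as a fallback
    when the freshness SLA would otherwise be breached.
    """

    def __init__(self, debounce_s: float, freshness_sla_s: float):
        self.debounce_s = debounce_s
        self.freshness_sla_s = freshness_sla_s
        self.last_event = None   # timestamp of most recent upstream event
        self.last_run = 0.0      # timestamp of last materialization

    def on_event(self, now: float) -> None:
        """Record an upstream data.available event; resets the debounce window."""
        self.last_event = now

    def should_fire(self, now: float) -> bool:
        if self.last_event is None:
            # No pending events: fire only if freshness would breach the SLA.
            return now - self.last_run >= self.freshness_sla_s
        # Pending events: fire once the window is quiet, or on SLA pressure.
        if now - self.last_event >= self.debounce_s:
            return True
        return now - self.last_run >= self.freshness_sla_s

    def fire(self, now: float) -> None:
        """Mark a materialization as started and clear pending events."""
        self.last_run = now
        self.last_event = None
```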
Failure Handling
At every stage, failures are handled gracefully:
| Failure Point | Behavior |
|---|---|
| Ingestion failure | Retry with exponential backoff, then DLQ |
| Transform error | Retry up to the configured `retries` count, then DLQ |
| Quality gate failure | Block promotion, retry, then DLQ |
| Serving write failure | Retry, circuit breaker opens after threshold |
The dead letter queue (DLQ) captures all failed events with full context for manual inspection and replay. See DLQ Management for details.
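The retry-then-DLQ pattern rests on an exponential backoff schedule. A minimal sketch (the base, factor, and jitter parameters are illustrative assumptions, not platform defaults):

```python
import random

def backoff_delays(base: float = 1.0, factor: float = 2.0,
                   max_retries: int = 5, jitter: float = 0.0):
    """Yield the wait before each retry; after max_retries the event goes to the DLQ.

    delay_n = base * factor**n, plus optional uniform jitter to avoid
    synchronized retry storms across many failing events.
    """
    for attempt in range(max_retries):
        yield base * (factor ** attempt) + random.uniform(0, jitter)
```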
Related
- Orchestration — execution engine mechanics
- Serving Layer — intent-based store routing
- Quality and Governance — quality check details
- Governance Model — classification and retention policies