Patterns & Examples

Do not build a separate product for each report that joins the same sources. Build one aggregate product and expose it through multiple serving intents.

# BAD: Three products with duplicate logic
monthly-revenue-by-store/
monthly-revenue-by-segment/
weekly-top-products/

# GOOD: One product, three serving endpoints
store-performance/
  serving.yaml:
    endpoints:
      - type: analytics   # Powers all three reports
      - type: lookup      # Portal detail pages
      - type: realtime    # Live dashboard

Validate data quality before it enters the transform pipeline:

inputs.yaml
inputs:
  - id: raw-pos-extract
    type: connector
    connector_ref: pg-production
    ingestion_strategy: incremental
    ingestion_config:
      cursor_field: updated_at
      primary_key: transaction_id
    fitness:
      - type: row_count_min
        threshold: 100
      - type: schema_match
        severity: error   # error = block, warn = log and continue
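The severity comment above (error = block, warn = log and continue) can be read as a simple gate in front of the transform pipeline. The sketch below is illustrative only and is not the platform's implementation: the function names `row_count_min`, `schema_match`, and `gate` are hypothetical, and the batch is assumed to be a list of dicts.

```python
import logging

logger = logging.getLogger("fitness")


def row_count_min(rows, threshold):
    """Pass when the batch has at least `threshold` rows."""
    return len(rows) >= threshold


def schema_match(rows, expected_columns):
    """Pass when every row has exactly the expected column names."""
    expected = set(expected_columns)
    return all(set(row) == expected for row in rows)


def gate(results):
    """Decide whether a batch may enter the transform pipeline.

    `results` is a list of (check_name, passed, severity) tuples.
    A failed "error" check blocks the batch; a failed "warn" check is only logged.
    """
    allowed = True
    for name, passed, severity in results:
        if passed:
            continue
        if severity == "error":
            logger.error("fitness check %s failed: blocking batch", name)
            allowed = False
        else:
            logger.warning("fitness check %s failed: continuing", name)
    return allowed
```

With this reading, a batch that fails only `warn` checks still proceeds, while a single failed `error` check stops it before any transform runs.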

Always validate before deploying:

# Fast mode -- schema-only, no cross-refs (<100ms)
akili validate --fast my-product/
# Standard -- full validation (<2s)
akili validate my-product/
# Strict -- full + style checks (<5s)
akili validate --strict my-product/

Validation checks:

  1. JSON Schema validation on each file
  2. Cross-reference checks (quality.yaml columns exist in output.yaml)
  3. Classification propagation (output >= max of input classifications)
  4. Logic file syntax checks
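Checks 2 and 3 can be sketched in a few lines. This is a rough illustration, not the validator's actual code: the classification ordering in `LEVELS` is an assumption (only `internal` appears in this document), and both function names are hypothetical. The inputs are assumed to be the parsed contents of quality.yaml and output.yaml.

```python
# Assumed ordering, least to most sensitive. Only "internal" is confirmed by this page.
LEVELS = ["public", "internal", "confidential", "restricted"]


def classification_ok(output_level, input_levels):
    """Output must be at least as sensitive as the most sensitive input."""
    rank = LEVELS.index
    required = max((rank(level) for level in input_levels), default=0)
    return rank(output_level) >= required


def missing_columns(quality_checks, output_schema):
    """Columns referenced by quality checks but not declared in the output schema."""
    declared = {column["name"] for column in output_schema}
    referenced = set()
    for check in quality_checks:
        config = check.get("config", {})
        if "column" in config:
            referenced.add(config["column"])
        referenced.update(config.get("columns", []))
    return sorted(referenced - declared)
```

An empty result from `missing_columns` means every quality check points at a real output column; a `False` from `classification_ok` means the output is labeled less strictly than one of its inputs.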

Note: The {output} placeholder in custom SQL quality checks is replaced with the correct tenant-scoped, partition-scoped table name at execution time. You never hardcode table names.
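To make the substitution concrete, here is a minimal sketch of how a placeholder could be resolved before a check runs. The function `resolve_output` and the table naming scheme are hypothetical; this page does not document how the platform actually names scoped tables.

```python
def resolve_output(sql, tenant, product, partition):
    """Replace the {output} placeholder with a scoped table name.

    The "<tenant>"."<product>__<partition>" scheme is made up for illustration.
    """
    table = f'"{tenant}"."{product}__{partition}"'
    # str.replace leaves any other braces in the SQL untouched.
    return sql.replace("{output}", table)


check_sql = "SELECT COUNT(*) FROM {output} WHERE session_duration_ms < 0"
print(resolve_output(check_sql, "acme", "user_events", "2024_01_15"))
# SELECT COUNT(*) FROM "acme"."user_events__2024_01_15" WHERE session_duration_ms < 0
```

Because the check text only ever contains the placeholder, the same SQL works unchanged across tenants and partitions.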


Complete Working Example: user-events Source Product


Here are all 6 files for a source-aligned product that captures user interaction events.

product.yaml:

apiVersion: akili/v1
kind: DataProduct
metadata:
  name: user-events
  domain: analytics
  version: 1.0.0
  owner: platform-team
  description: >
    Captures user interaction events from the web application.
    Source of truth for user behavior analytics and session analysis.
  tags:
    - events
    - user-behavior
    - clickstream
  classification: internal

inputs.yaml:

apiVersion: akili/v1
kind: Inputs
inputs:
  - id: webapp-events
    type: connector
    connector_ref: webapp-postgres
    ingestion_strategy: cdc
    ingestion_config:
      primary_key: event_id
    timeout: 2h
    fallback: fail

output.yaml:

apiVersion: akili/v1
kind: Output
schema:
  - name: event_id
    type: uuid
    primary_key: true
    role: identity
    description: Unique event identifier
  - name: user_id
    type: string
    nullable: false
    role: event_key
    description: References the user who triggered the event
  - name: event_type
    type: string
    nullable: false
    description: Type of user interaction
  - name: event_timestamp
    type: timestamp
    nullable: false
    description: When the event occurred (UTC)
  - name: page_url
    type: string
    nullable: true
    description: URL of the page where the event happened
  - name: session_id
    type: uuid
    nullable: true
    description: Session grouping identifier
  - name: session_duration_ms
    type: integer
    nullable: true
    role: measure
    description: Duration of the session in milliseconds
format: parquet
partitioning:
  - field: event_timestamp
    granularity: day

serving.yaml:

apiVersion: akili/v1
kind: Serving
endpoints:
  - type: lookup
    description: Point lookups by event_id for detail pages
    config:
      index_columns:
        - event_id
        - user_id
  - type: analytics
    description: Event analytics for Superset dashboards

quality.yaml:

apiVersion: akili/v1
kind: Quality
checks:
  - name: event_id_complete
    type: completeness
    config:
      column: event_id
      threshold: 1.0
    severity: error
  - name: timestamp_fresh
    type: freshness
    config:
      column: event_timestamp
      max_age: 6h
    severity: error
  - name: valid_event_types
    type: accepted_values
    config:
      column: event_type
      values: [click, view, scroll, submit, navigate, search]
    severity: error
  - name: event_volume
    type: volume
    config:
      min_rows: 100
    severity: warn
  - name: no_duplicate_events
    type: uniqueness
    config:
      columns: [event_id]
    severity: error
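For readers unfamiliar with these check types, the sketch below shows one plausible reading of `completeness`, `uniqueness`, and `accepted_values` over rows held as a list of dicts. The function names are hypothetical and the platform's real checks may differ, for example in how they treat empty batches.

```python
def completeness(rows, column):
    """Fraction of rows where `column` is present and not None (0.0 for an empty batch)."""
    if not rows:
        return 0.0
    return sum(row.get(column) is not None for row in rows) / len(rows)


def is_unique(rows, columns):
    """True when no two rows share the same values for `columns`."""
    keys = [tuple(row.get(column) for column in columns) for row in rows]
    return len(keys) == len(set(keys))


def unaccepted_values(rows, column, values):
    """Distinct values of a non-nullable `column` that are not in the accepted list."""
    return sorted({row[column] for row in rows} - set(values))


rows = [
    {"event_id": "a", "event_type": "click"},
    {"event_id": "a", "event_type": "hover"},
]
print(completeness(rows, "event_id") >= 1.0)              # True
print(is_unique(rows, ["event_id"]))                      # False
print(unaccepted_values(rows, "event_type", ["click"]))   # ['hover']
```

Under this reading, the sample batch would pass `event_id_complete` but fail both `no_duplicate_events` and `valid_event_types`.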

compute.yaml:

apiVersion: akili/v1
kind: Compute
runtime: sql
engine: auto
schedule:
  type: event
resources:
  cpu: "500m"
  memory: "1Gi"
timeout: 30m
entrypoint: logic/transform.sql
retry:
  max_attempts: 3
  backoff: exponential
  initial_delay: 30s
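The retry settings imply a wait schedule between attempts. The sketch below rests on two assumptions that this page does not confirm: that `max_attempts` counts the first run, and that "exponential" means the delay doubles each time. The function name `retry_delays` is hypothetical.

```python
def retry_delays(max_attempts, initial_delay_s, factor=2):
    """Seconds to wait before each retry.

    Assumes max_attempts includes the first run, so there are
    max_attempts - 1 retries, and that the delay grows by `factor` each time.
    """
    return [initial_delay_s * factor ** i for i in range(max_attempts - 1)]


print(retry_delays(3, 30))  # [30, 60]
```

Under these assumptions the configuration above would wait 30 seconds before the second attempt and 60 seconds before the third.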

After writing all 6 files, validate and deploy:

akili validate user-events/
akili deploy user-events/
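In a script or CI job you may want deployment to run only when validation succeeds. The following is a minimal sketch that wraps the two commands above; it assumes `akili` exits with a non-zero status when validation fails, which this page does not state. The function name `validate_then_deploy` and its injectable `run` parameter are illustrative.

```python
import subprocess


def validate_then_deploy(product_dir, run=subprocess.run):
    """Run `akili validate`, then `akili deploy` only if validation exits with 0.

    `run` is injectable so the flow can be exercised without the CLI installed.
    """
    validation = run(["akili", "validate", product_dir])
    if validation.returncode != 0:
        return False
    deployment = run(["akili", "deploy", product_dir])
    return deployment.returncode == 0
```

Calling `validate_then_deploy("user-events/")` returns `True` only when both commands report success.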