Skip to content
GitLab

Troubleshooting

This guide covers common problems you may encounter when using the Akili platform and how to diagnose and resolve them.

Before diving into specific issues, these commands help you assess the current state:

Terminal window
# Platform health check
akili status
# Product deployment status
akili product status <product-name>
# Recent executions for a product
akili run list <product-name>
# Quality check results
akili governance quality <product-name>
# SLA status
akili governance sla <product-name>
# DLQ entries
akili dlq list --product <product-name>

Symptom: All commands fail with 401 Unauthorized.

Error: API returned 401: Token expired or invalid

Resolution:

Terminal window
# Check current auth status
akili auth status
# Re-authenticate with a fresh token
akili auth login <new-token>
# Verify
akili status

Symptom: akili status shows WARN Auth: no token configured.

Resolution:

Terminal window
# Initialize config if needed
akili config init --api-url https://api.akili.io
# Login with your token
akili auth login <token>

Symptom: Commands succeed but return empty results or unexpected data.

Resolution:

Check which tenant your token is scoped to:

Terminal window
akili auth status
# Look at the email/tenant displayed
# If using multiple tenants, verify your profile
akili config show

Symptom: akili product deploy fails with 422 ClassificationLaundering.

Error: Classification violation: product 'daily-report' is classified as
'internal' but depends on 'raw-payroll' which is 'confidential'

Resolution:

Raise the product’s classification to at least match its highest-classified input:

product.yaml
metadata:
classification: confidential # was 'internal', must be >= 'confidential'

Classification propagation is transitive — check the full dependency chain, not just direct inputs.

Symptom: Deploy fails because an upstream product is not deployed.

Error: Upstream product 'raw-orders' is not deployed

Resolution:

Terminal window
# Check upstream product status
akili product status raw-orders
# Deploy the upstream first
akili product deploy raw-orders
# Then deploy your product
akili product deploy daily-summary

Symptom: akili validate or akili product deploy --dry-run reports errors.

Resolution:

Terminal window
# Validate locally to see all errors
akili validate .akili/
# Common fixes:
# - Missing required fields in product.yaml
# - Invalid strategy in inputs.yaml
# - Malformed SQL in quality.yaml checks
# - Unknown serving intent in serving.yaml

Symptom: Execution fails with a SQL error in the execution logs.

Terminal window
# Check recent runs
akili run list daily-orders
# Get details on the failed run
akili run get run-abc123
# View execution steps
akili run steps run-abc123

Common causes:

  • Column name mismatch between inputs.yaml and the actual source schema
  • SQL syntax errors in transform.sql
  • Missing {{ ref() }} macro usage
  • Source schema changed without updating the product

Symptom: Execution times out or is killed.

Resolution: Increase resource limits in compute.yaml:

compute:
resources:
cpu: "2" # was "1"
memory: 4Gi # was 2Gi
timeout: 3600 # was 1800 (30 min -> 60 min)

Symptom: Execution retries are exhausted and the event is in the DLQ.

Terminal window
# Check DLQ for the product
akili dlq list --product daily-orders
# Inspect the failed entry
akili dlq get dlq-entry-abc123

See DLQ Management for detailed replay and recovery procedures.

Symptom: Data is not promoted to serving stores. Consumers see stale data.

Terminal window
# Check quality results
akili governance quality daily-orders
# View quality history for trends
akili governance quality-history daily-orders

Resolution depends on the check type:

Check TypeTypical Fix
not_nullFix the transform to handle NULL values or update the source
uniqueAdd deduplication to the transform
freshnessInvestigate why the pipeline is not running on schedule
row_countCheck if the source has data for the expected period
custom SQLReview the SQL check logic for false positives

Symptom: sla.breach alert or akili governance sla shows a threshold exceeded.

Terminal window
akili governance sla daily-orders
# Check if the pipeline is running
akili run list daily-orders
# Check if there are DLQ entries
akili dlq list --product daily-orders

Common causes of SLA breaches:

  • Pipeline not executing (scheduling issue)
  • Pipeline executing but failing (check DLQ)
  • Pipeline succeeding but quality gate blocking promotion
  • Upstream product delayed (cascading freshness breach)
Terminal window
akili connection test conn-abc123
# FAIL Connection 'production-db': timeout after 30s

Diagnosis:

  1. Network connectivity — is the source reachable from the platform?
  2. Credentials — have the Kubernetes secrets been updated?
  3. Firewall rules — does the source allow connections from the platform’s IP range?
  4. SSL configuration — is the ssl_mode correct?
Terminal window
# Get connection details
akili connection get conn-abc123 --json
# Verify the connection is listed
akili connection list

After rotating credentials in the source system:

Terminal window
# 1. Update the Kubernetes secret (cluster admin task)
# 2. Test the connection
akili connection test conn-abc123
# 3. No product redeployment needed

Symptom: akili status shows FAIL API Health.

FAIL API Health: Connection refused

Possible causes:

  • Platform is down or undergoing maintenance
  • Network configuration changed
  • API URL is incorrect
Terminal window
# Check your configured API URL
akili config show
# Try with an explicit URL
akili status --api-url https://api.akili.io

Symptom: Commands take longer than expected.

Terminal window
# Increase timeout
akili product list --timeout 60
# Check if the issue is specific to one command
akili status # Quick health check

Symptom: retention.expired event in governance events.

Terminal window
akili governance retention daily-orders

Resolution: Either extend the retention period or acknowledge and delete the expired data:

Terminal window
# Extend retention
akili governance set-retention \
--product daily-orders \
--period 730d \
--basis created_at
# Or investigate and approve deletion via the governance workflow

Symptom: Validation warning about colliding column names with canonical concepts.

Resolution: Align your column names with the canonical concepts in the business glossary:

Terminal window
# List concepts in your domain
akili governance concepts daily-orders
# Use canonical names in your output schema

If the issue is not covered here:

  1. Check execution logs: akili run steps <run-id>
  2. Check DLQ entries: akili dlq list --product <name>
  3. Check governance events: akili governance quality <name>
  4. Verify platform health: akili status