
Multi-Tenancy

Akili is a single-deployment, multi-tenant platform. One control plane serves all tenants. Tenant isolation is enforced at the service layer — every query, every storage path, and every event topic is scoped to a single tenant. There is no shared data surface between tenants.

%%{init: {'flowchart': {'curve': 'basis'}}}%%
flowchart TB
    subgraph API["API Layer"]
        JWT["JWT with tenant_id claim"]
        MW["Middleware extracts tenant_id"]
    end

    subgraph Service["Service Layer (Isolation Boundary)"]
        SVC["Every operation scoped by tenant_id"]
    end

    subgraph Storage["Storage Layer"]
        PG["PostgreSQL\nRow-Level Security\ntenant_id on every table"]
        S3["Ceph RGW / S3\nPrefix: /tenant-{id}/..."]
        RP["Redpanda\nTopic: {tenant}.{product}.events"]
        SR["StarRocks\nCatalog: tenant_{slug}_catalog"]
    end

    JWT --> MW --> SVC
    SVC --> PG
    SVC --> S3
    SVC --> RP
    SVC --> SR

These rules are non-negotiable. Every component in the platform enforces them.

Every SQL table in the control plane database includes a tenant_id UUID NOT NULL column. This is the primary isolation key.

CREATE TABLE products (
id UUID PRIMARY KEY,
tenant_id UUID NOT NULL, -- isolation key
name TEXT NOT NULL,
namespace TEXT NOT NULL,
-- ...
);

Row-Level Security (RLS) policies ensure that queries can only see rows belonging to the current tenant:

ALTER TABLE products ENABLE ROW LEVEL SECURITY;
CREATE POLICY tenant_isolation ON products
USING (tenant_id = current_setting('app.tenant_id')::uuid);
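The policy above reads app.tenant_id from the session, so that setting has to be in place before any tenant query runs. A minimal sketch of one way to do it, assuming the service calls PostgreSQL's set_config with is_local = true so the value lasts only for the current transaction. The helper names is_canonical_uuid and tenant_scope_statement are hypothetical and not taken from the Akili codebase.

```rust
/// Returns true if `s` has the canonical 8-4-4-4-12 hexadecimal UUID shape.
fn is_canonical_uuid(s: &str) -> bool {
    let groups: Vec<&str> = s.split('-').collect();
    let lens = [8usize, 4, 4, 4, 12];
    groups.len() == lens.len()
        && groups
            .iter()
            .zip(lens.iter())
            .all(|(g, &n)| g.len() == n && g.chars().all(|c| c.is_ascii_hexdigit()))
}

/// Hypothetical helper: the statement and bind parameter that scope the
/// current transaction to one tenant. The third argument of set_config
/// (`true`) makes the setting transaction-local.
fn tenant_scope_statement(tenant_id: &str) -> Result<(&'static str, String), String> {
    if !is_canonical_uuid(tenant_id) {
        return Err(format!("invalid tenant_id: {:?}", tenant_id));
    }
    Ok((
        "SELECT set_config('app.tenant_id', $1, true)",
        tenant_id.to_lowercase(),
    ))
}
```

If the setting is never set, current_setting('app.tenant_id') raises an error rather than returning rows, so a forgotten scope fails closed.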

At the service layer, every database query includes an explicit tenant_id filter. This is defense-in-depth alongside RLS — even if RLS were misconfigured, the application logic would still scope correctly.

// Service layer always passes tenant_id
let products = repo.list_products(claims.tenant_id).await?;

The repository layer enforces this at the SQL level:

SELECT * FROM products WHERE tenant_id = $1 AND namespace = $2
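The same rule can be illustrated without a database. The sketch below is an in-memory stand-in for the repository, assuming string identifiers instead of UUIDs for brevity. Product and ProductRepo are hypothetical types. The point is that tenant_id is a required argument, so there is no way to call the method unscoped.

```rust
#[derive(Debug, Clone, PartialEq)]
struct Product {
    id: u32,
    tenant_id: String, // isolation key
    name: String,
    namespace: String,
}

/// In-memory stand-in for the products table.
struct ProductRepo {
    rows: Vec<Product>,
}

impl ProductRepo {
    /// Mirrors `WHERE tenant_id = $1 AND namespace = $2`.
    /// There is deliberately no variant of this method without `tenant_id`.
    fn list_products(&self, tenant_id: &str, namespace: &str) -> Vec<&Product> {
        self.rows
            .iter()
            .filter(|p| p.tenant_id == tenant_id && p.namespace == namespace)
            .collect()
    }
}
```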

All object storage paths in Ceph RGW include the tenant identifier as a prefix. Each tenant’s data lives in its own namespace within the object store.

s3://akili-data/tenant-{tenant_id}/products/{product_name}/data/
s3://akili-data/tenant-{tenant_id}/products/{product_name}/staging/
s3://akili-data/tenant-{tenant_id}/products/{product_name}/quality/

The platform never constructs a storage path without the tenant prefix. Cross-tenant access within the bucket is blocked by IAM policies on Ceph RGW.
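One way to make "never without the tenant prefix" hard to violate is to route every path through a single builder that requires a tenant value. This is a sketch under assumptions: TenantId, Area, and product_path are hypothetical names, and the product-name check is an illustrative guard rather than the platform's actual validation.

```rust
/// Hypothetical newtype so a path cannot be built from an arbitrary string.
struct TenantId(String);

/// The three sub-areas shown in the layout above.
enum Area {
    Data,
    Staging,
    Quality,
}

/// Builds a product path that always starts with the tenant prefix.
fn product_path(tenant: &TenantId, product_name: &str, area: Area) -> Result<String, String> {
    // Reject names that could climb out of the tenant prefix.
    if product_name.is_empty() || product_name.contains('/') || product_name.contains("..") {
        return Err(format!("invalid product name: {:?}", product_name));
    }
    let area = match area {
        Area::Data => "data",
        Area::Staging => "staging",
        Area::Quality => "quality",
    };
    Ok(format!(
        "s3://akili-data/tenant-{}/products/{}/{}/",
        tenant.0, product_name, area
    ))
}
```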

Every event topic on the Redpanda message bus includes the tenant identifier:

{tenant_slug}.{product_name}.data.available
{tenant_slug}.{product_name}.execution.started
{tenant_slug}.{product_name}.dlq.execution

Consumer group IDs also include the tenant, preventing cross-tenant message consumption:

{tenant_slug}.{product_name}.consumer-group
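The naming scheme above is simple enough to centralize in a few helpers. The following sketch assumes slugs and product names contain no dots; the function names are hypothetical. The ownership check compares the whole first segment, because a plain prefix match would let the slug acme match topics that belong to acme-corp.

```rust
/// Builds `{tenant_slug}.{product_name}.{event}`, e.g. event = "data.available".
fn topic(tenant_slug: &str, product_name: &str, event: &str) -> String {
    format!("{}.{}.{}", tenant_slug, product_name, event)
}

/// Builds `{tenant_slug}.{product_name}.consumer-group`.
fn consumer_group(tenant_slug: &str, product_name: &str) -> String {
    format!("{}.{}.consumer-group", tenant_slug, product_name)
}

/// True only if the topic's first dot-separated segment is exactly the slug.
fn belongs_to_tenant(topic: &str, tenant_slug: &str) -> bool {
    topic
        .strip_prefix(tenant_slug)
        .map_or(false, |rest| rest.starts_with('.'))
}
```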

There is no API endpoint, CLI command, or internal service method that can access data across tenant boundaries. The tenant boundary is enforced at the service layer:

  • JWT claims carry the tenant_id — the middleware extracts it before any handler runs
  • Service methods accept tenant_id as a required parameter
  • The platform does not support “super-admin” queries that span tenants (platform-level operations use a dedicated default tenant)
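The first two points can be sketched as types. The example assumes JWT signature verification has already happened elsewhere and produced a map of claim values; Claims, AuthError, extract_claims, and handle_list_products are hypothetical names used only for illustration.

```rust
use std::collections::HashMap;

/// Typed claims handed to handlers once the token has been verified.
struct Claims {
    tenant_id: String,
}

#[derive(Debug, PartialEq)]
enum AuthError {
    MissingTenant,
}

/// Hypothetical middleware step: reject the request unless the verified
/// claims contain a non-empty tenant_id.
fn extract_claims(verified: &HashMap<String, String>) -> Result<Claims, AuthError> {
    match verified.get("tenant_id") {
        Some(id) if !id.is_empty() => Ok(Claims { tenant_id: id.clone() }),
        _ => Err(AuthError::MissingTenant),
    }
}

/// Handlers take `Claims`, so they cannot run without a tenant in scope.
fn handle_list_products(claims: &Claims) -> String {
    format!("listing products for tenant {}", claims.tenant_id)
}
```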

Tenants progress through a defined state machine:

%%{init: {'stateDiagram': {'curve': 'basis'}}}%%
stateDiagram-v2
    [*] --> PROVISIONING: tenant create
    PROVISIONING --> ACTIVE: provisioning complete
    ACTIVE --> SUSPENDED: tenant suspend
    SUSPENDED --> ACTIVE: tenant reactivate
    SUSPENDED --> ARCHIVED: tenant archive
    ARCHIVED --> [*]: data purged after retention

| State | Behavior |
|---|---|
| PROVISIONING | Resources being created (DB schema, S3 bucket, Redpanda topics, StarRocks catalog) |
| ACTIVE | Fully operational: all APIs available |
| SUSPENDED | Read-only: no new deployments, no executions, data still accessible |
| ARCHIVED | No access: data retained per retention policy, then purged |
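The lifecycle can be written as a small transition function. This is a sketch, not the platform's implementation: the enum and function names are hypothetical, and only the four transitions drawn in the diagram are allowed.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum TenantState {
    Provisioning,
    Active,
    Suspended,
    Archived,
}

#[derive(Debug, Clone, Copy)]
enum Event {
    ProvisioningComplete,
    Suspend,
    Reactivate,
    Archive,
}

/// Returns the next state, or None if the diagram has no such edge.
fn transition(state: TenantState, event: Event) -> Option<TenantState> {
    use TenantState::*;
    match (state, event) {
        (Provisioning, Event::ProvisioningComplete) => Some(Active),
        (Active, Event::Suspend) => Some(Suspended),
        (Suspended, Event::Reactivate) => Some(Active),
        (Suspended, Event::Archive) => Some(Archived),
        _ => None, // e.g. an ACTIVE tenant cannot be archived directly
    }
}
```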

When a new tenant is created, the platform provisions the following resources:

  1. PostgreSQL schema: tenant-specific tables with RLS policies
  2. Ceph RGW bucket prefix: tenant-{id}/ namespace in the data bucket
  3. Redpanda topic namespace: base topics for the tenant’s event bus
  4. StarRocks catalog: external Iceberg catalog for analytics serving

# Create a tenant
akili tenant create acme-corp --display-name "Acme Corporation"
# Check provisioning status
akili tenant get acme-corp

Each tenant can be assigned resource quotas to prevent any single tenant from consuming disproportionate platform resources:

| Quota | Description | Default |
|---|---|---|
| max_products | Maximum number of data products | 100 |
| max_connections | Maximum external connections | 20 |
| max_storage_gb | Maximum S3 storage in GB | 500 |
| max_compute_cpu | Maximum concurrent compute CPU | 8 |
| max_compute_memory_gb | Maximum concurrent compute memory in GB | 32 |

Quotas are enforced at the service layer before resource creation. When a request would take a tenant over a quota, the API rejects it with a 429 QuotaExceeded error.
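A pre-creation check for one quota might look like the sketch below. The defaults come from the table above, while the Quotas and QuotaExceeded types, their fields, and check_can_create_product are hypothetical names chosen for illustration.

```rust
/// Per-tenant limits; `Default` uses the documented default values.
struct Quotas {
    max_products: u32,
    max_connections: u32,
    max_storage_gb: u32,
    max_compute_cpu: u32,
    max_compute_memory_gb: u32,
}

impl Default for Quotas {
    fn default() -> Self {
        Quotas {
            max_products: 100,
            max_connections: 20,
            max_storage_gb: 500,
            max_compute_cpu: 8,
            max_compute_memory_gb: 32,
        }
    }
}

#[derive(Debug, PartialEq)]
struct QuotaExceeded {
    quota: &'static str,
    limit: u32,
    status: u16, // HTTP status returned to the caller
}

/// Runs before the product is created; fails if one more would exceed the limit.
fn check_can_create_product(quotas: &Quotas, current_products: u32) -> Result<(), QuotaExceeded> {
    if current_products >= quotas.max_products {
        return Err(QuotaExceeded {
            quota: "max_products",
            limit: quotas.max_products,
            status: 429,
        });
    }
    Ok(())
}
```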

  • tenant_id is a UUID, never user-facing in URLs or file paths
  • Tenants also have a slug (human-readable identifier) used in topic names and catalog names
  • The slug is immutable after creation — it is used in Redpanda topic names and StarRocks catalogs, which cannot be renamed
  • Tenant creation is admin-only — there is no self-service tenant provisioning

A default tenant exists for platform-level data that does not belong to any specific customer tenant:

  • Platform metrics and health data
  • Connector registry entries
  • Community contribution catalog
  • System-level configuration

The default tenant follows all the same isolation rules — it is not a “god mode” that bypasses multi-tenancy.