forge.yaml Platform Schema (v3)

Internal — Conservice

This reference is internal to Conservice, for Conservice GitHub users building apps on the greenfield platform.

URL major version v3 is the current forge.yaml schema contract. The headline change from v1: multi-service apps with mixed exposure — each service gets its own expose, image, and dns shape, plus gateway-level authentication and authorization. Single-service apps render identically to v1 (backwards compatible). Old majors stay live forever so apps pinned to v1 keep validating. Machine-readable JSON Schema: forge.yaml.schema.json — drop into .vscode/settings.json yaml.schemas for IDE autocomplete + validation.

What this is
Quick start — minimal forge.yaml
- How to deploy
- Migrating from v1
Common patterns
App repo layout
Top-level fields
Language
Services
- Per-service hostname resolution (v3)
- Per-service image (v3)
- Per-service autoscaling (scaling)
DNS
App config keys
Env vars
Resources
- S3
- SQS
- SNS
- DynamoDB
- EventBridge
- Step Functions
- Bedrock
- Database
  - Database migrations
  - Preview-only seed
- Temporal
- Firehose
- KMS
Cross-app access (per-resource policy + consumes)
Authentication (auth)
Authorization (authz)
Authenticating an MCP server
GitHub access (github)
- Declaring access
- Using a read token
- Using a write token
- Scope, lanes, and environments
- Common errors
Scheduled services (schedule)
Environment opt-out (disabled_envs)
Preview
Image tags
Replicas
Render channel (render_channel)
Canary (deprecated)
Monitoring
- App kind (app_kind)
- SLO tier (slo_tier)
- Monitors (monitors)
Observability (observability)
Reserved naming
Naming patterns (resolved at scaffold)
Schema versioning
- Versioning policy for this doc
Where to file requests
More information

What this is

forge.yaml is the declarative spec for an app on Conservice's greenfield platform. It lives at infra/forge.yaml in your app repo. Forge reads it and renders all the underlying infrastructure (Terraform for AWS resources, Kustomize for K8s manifests, GitHub Actions workflows, ArgoCD apps, Kargo pipelines, Workspace + Identity Center bindings).

You don't write Terraform. You write forge.yaml. Forge takes it from there.

This document is the complete schema reference for forge.yaml — every field, allowed value, naming constraint, and validation rule the schema enforces.

Quick start — minimal forge.yaml

forge_version: 3.0.0
app_name: my-app
team: sre
language: typescript
services:
  - name: api
    port: 8080
    # no `expose:` → ClusterIP-only (in-cluster HTTP, sister-service callable). Add
    # `expose: internal` for VPN-only routing or `expose: public` for internet-facing.
    health_path: /health
    dockerfile: Dockerfile
    context: .

That's enough to scaffold an app: an in-cluster API service (no external or VPN exposure — sister services can call it via mesh) deploying to all platform envs (prev/stg/prod). Add expose: internal to the service for VPN-only routing or expose: public for internet-facing. Add resource declarations under resources: to get S3 buckets, SQS queues, databases, etc. Use disabled_envs: to opt an app out of specific envs (see § Environment opt-out).

Resource-only apps (no runtime code — just S3 + DDB + queues) are allowed: set services: [] and omit language. See § Language.

Multi-service example (v3)

forge_version: 3.0.0
app_name: status-page
team: sre
language: typescript
dns:
  zone: conservice.ai
  required: true
services:
  - name: web                            # primary (services[0]) → status-page.conservice.ai
    port: 3000
    expose: public
    health_path: /health
  - name: dashboard                      # → status-page-dashboard.conservice.ai
    port: 3001
    expose: internal
    health_path: /health
  - name: api                            # → status-page-api.conservice.ai
    port: 8080
    expose: internal
    health_path: /health
  - name: mcp                            # → status-page-mcp.conservice.ai
    port: 8081
    expose: internal
    health_path: /health
authz:
  initial_grants:
    - alice@conservice.com

Each service gets its own Deployment + Service + HTTPRoute. The primary service (services[0]) routes at the bare apex hostname; non-primary exposed services suffix as {app}-{service}. See § Per-service hostname resolution.

How to deploy

1. Save your forge.yaml at infra/forge.yaml in your app repo.
2. Push a branch and open a PR.
3. Platform CI validates the schema, renders manifests, and commits them back to the PR.
4. Review the PR diff: it shows both your source change and the rendered infrastructure effect.
5. Merge to main. ArgoCD syncs the rendered manifests; Kargo promotes through prev → stg → prod.

That's it. You don't run Terraform, kubectl, or any CLI — the platform handles provisioning, DNS, TLS, secrets, and promotion.

Migrating from v1

v3 is backwards compatible — single-service apps render identically. The headline changes:

What's new	Summary
Multi-service apps	`services[]` may now contain multiple entries with mixed `expose:` values. See § Services.
Per-service image	`services[].image` block replaces flat `dockerfile`/`context`. See § Per-service image.
Authorization (authz)	Gateway-level authorization with per-app roles. See § Authorization (authz).
Per-service DNS	`services[].dns.hostname` overrides the default hostname per service. See § Per-service hostname resolution.

To upgrade, change forge_version: 1.0.0 to forge_version: 3.0.0 in your forge.yaml, or run forge_migrate_yaml to auto-upgrade the file shape. The flat dockerfile and context fields still work but are soft-deprecated; migrate to the image block when convenient.
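As an illustration of the image-block migration, here is a minimal sketch of one service before and after a hand edit. It assumes the simplest case, where the flat fields already point at a repo-root Dockerfile with context: . (the service name and port are placeholders). It is a fragment, not a complete forge.yaml, and it is not confirmed output of forge_migrate_yaml.

```yaml
# Before (v1 shape): flat build fields, soft-deprecated in v3 but still accepted.
# forge_version: 1.0.0
# services:
#   - name: api
#     port: 8080
#     health_path: /health
#     dockerfile: Dockerfile
#     context: .

# After (v3 shape): the same service expressed with the image block.
forge_version: 3.0.0
services:
  - name: api
    port: 8080
    health_path: /health
    image:
      dockerfile: Dockerfile   # path relative to repo root
```

If the flat context pointed at a subdirectory, the Dockerfile's COPY paths may need adjusting, since image.dockerfile is repo-root-relative and the flat context is ignored once it is set.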

Common patterns

Complete examples for the most common app shapes. Copy the one closest to your use case and adjust.

Web API with PostgreSQL database

forge_version: 3.0.0
app_name: billing-api
team: billing
language: typescript
dns:
  zone: conservice.ai
  required: true
services:
  - name: api
    port: 8080
    expose: internal
    health_path: /health
resources:
  database:
    main:
      extensions: [uuid-ossp]

Public web app with SQS + DynamoDB

forge_version: 3.0.0
app_name: status-page
team: sre
language: typescript
dns:
  zone: conservice.ai
  required: true
services:
  - name: web
    port: 3000
    expose: public
    health_path: /health
resources:
  sqs:
    events:
      dlq: true
  dynamodb:
    sessions:
      hash_key: id
      ttl_attribute: expires_at

Multi-service app (API + worker)

forge_version: 3.0.0
app_name: ingest-pipeline
team: data
language: python
dns:
  zone: conservice.ai
  required: true
services:
  - name: api
    port: 8080
    expose: internal
    health_path: /health
  - name: worker
    image:
      dockerfile: Dockerfile
      target: worker
resources:
  sqs:
    jobs:
      visibility_timeout: 120
  s3:
    raw-data:
      versioning: true

Resource-only app (no services)

forge_version: 3.0.0
app_name: shared-data
team: data
services: []
resources:
  s3:
    exports: {}
  dynamodb:
    config:
      hash_key: key

AI app with Bedrock + vector DB

forge_version: 3.0.0
app_name: rates-agent
team: ai
language: typescript
dns:
  zone: conservice.ai
  required: true
services:
  - name: api
    port: 8080
    expose: internal
    health_path: /health
resources:
  database:
    main:
      extensions: [vector, uuid-ossp]
  bedrock:
    model_ids:
      - us.anthropic.claude-sonnet-4-20250514-v1:0
      - amazon.titan-embed-text-v2:0

App repo layout

forge.yaml is the source of truth, but the rendered output of the platform lives alongside it in your app repo. Knowing which paths are dev-editable vs CI-generated matters when you read a PR diff or wonder why your hand-edit "disappeared."

forge.yaml does not configure GitHub repository visibility or team access. Forge-created app repos are always private; repo creation and access grants are handled by Forge outside this schema. Do not add a visibility key to forge.yaml.

my-app/
└── infra/
    ├── forge.yaml                 # SOURCE OF TRUTH — dev edits this
    └── deploy/
        ├── patches/{env}/         # dev-editable escape hatch (kustomize patches)
        │   └── *.yaml             # strategic merge or JSON6902
        ├── rendered/{env}/        # CI-generated, do NOT edit
        │   ├── kustomization.yaml # plain rendered manifests
        │   └── *.yaml             # ArgoCD reads from here
        └── overlays/{env}/        # legacy, being deleted

Path	Who edits	What it does
`infra/forge.yaml`	dev (source of truth)	The declarative spec this document describes. Everything below is rendered from it.
`infra/deploy/patches/{env}/*.yaml`	dev (escape hatch)	Kustomize patches layered on top of the base render. Use when you need a hand-edit not yet expressible in the schema (env vars / labels / annotations the platform hasn't surfaced as a field yet). Each patch represents a tracked schema-gap.
`infra/deploy/rendered/{env}/`	CI (platform render workflow)	Plain rendered Kubernetes manifests committed back to the PR branch by the platform's GitHub App on every push. ArgoCD reads from here. Dev hand-edits to this directory get overwritten on the next push.
`infra/deploy/overlays/{env}/`	legacy, being deleted	Pre-2026-05 scaffold output. Apps that haven't migrated still have it; the directory disappears once all apps cut over to the rendered-manifests layout.

The schema itself doesn't change under this layout — the platform renderer still consumes forge.yaml from infra/forge.yaml, validates it against the JSON Schema linked at the top of this page, and emits the rendered output. The only thing that's new is where rendered files land in the repo and who maintains them. Direct forge.yaml edits in a feature branch + PR are the supported flow: CI re-renders on every push, so the PR diff shows BOTH the source change AND the rendered effect before merge.
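For the patches/{env}/ escape hatch described in the table above, a strategic merge patch might look like the sketch below. The file name, the annotation key, and the assumption that the rendered Deployment is named after the service are all hypothetical; check the rendered manifests in your own repo for the real resource names.

```yaml
# infra/deploy/patches/stg/api-annotations.yaml   (hypothetical file name)
# Strategic merge patch: kustomize matches the target by apiVersion, kind,
# and metadata.name, then merges the fields below into the rendered Deployment.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api                      # assumed: rendered Deployment is named after the service
spec:
  template:
    metadata:
      annotations:
        example.conservice.ai/owner-note: "tracked schema gap"   # hypothetical annotation key
```

Because each patch represents a tracked schema gap, a patch like this would normally be paired with a platform request to surface the field in forge.yaml.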

Top-level fields

The root object accepts these keys and rejects unknown ones: a forge_version key mistyped as forgeVersion fails validation instead of silently falling back to a default. Strict-key validation applies recursively, so every nested object below also rejects unknown keys.

Field	Type	Required	Notes
`forge_version`	string	yes	Schema version. `X.Y` or `X.Y.Z` semver.
`app_name`	string	yes	Kebab-case (lowercase + digits + hyphens, start with letter, end with letter/digit), 3-22 chars — the 22 cap keeps every derived identifier (IAM role names, S3 bucket names, the PostgreSQL login role `aws-{app}-db-{tier}`) under its platform limit. Reserved prefixes are rejected — see Reserved app-name prefixes.
`team`	string	yes	Owning team's kebab-case slug. Resolves to `team-{team}@conservice.com` for membership and is stamped on every resource as the `team` AWS tag. AWS access is keyed by team via `aws-team-{team}-{tier}` groups; there is no per-app `aws-{app}-admin/readonly` group. Must match an entry in forge's allowed-teams list.
`domain`	string	no	Business domain (e.g., `billing`, `identity`, `platform`). Used for AWS resource tagging only — does not affect provisioning shape.
`portfolio`	string	no	Portfolio grouping for finance/cost-allocation rollups. Used for AWS resource tagging. Falls back to `team` when unset.
`services`	array	no	Containers Forge builds and deploys. Defaults to `[]` (resource-only app). When non-empty, `language` is required. See § Services.
`language`	enum	conditional	Runtime/language for the app's services. Required when `services` is non-empty. Closed enum: `typescript`, `javascript`, `python`, `go`, `csharp`, `java`, `rust`. See § Language.
`dns`	object	no	DNS exposure config (primary zone + hostname + optional sister aliases). See § DNS.
`app_config_keys`	array of strings	no	UPPER_SNAKE_CASE env var names for dev-managed secrets. See § App config keys.
`resources`	object	no	AWS resources to provision. See § Resources.
`consumes`	object	no	Cross-app resource and service consume declarations. See § Cross-app access.
`auth`	`"none"` or object	no	Authentication. A union: the literal `"none"` (public app, no authentication — served via a private ALB), OR an `auth:` block declaring the access mode (authorization itself is AVP-grant-based — see `authz`). See § Authentication (auth).
`authz`	object	no	Gateway-level authorization. See § Authorization (authz).
`github`	object	no	Opt-in runtime GitHub access via short-lived, Pod-Identity-minted tokens (no stored credential). `{ read?: bool, write?: [code|issues] (exactly one), projects?: read|write }`. Targets/breadth/permissions are server-side and platform-managed; they are never declared here. See § GitHub access (github).
`env_vars`	object	no	Static per-env config injected into the app ConfigMap. See § Env vars.
`disabled_envs`	array of enums	no	Opt the app out of specific platform envs. See § Environment opt-out.
`preview`	object	no	Preview-environment opt-in. See § Preview.
`image_tags`	object	no	Per-env image tags written by Kargo after promotion. See § Image tags.
`replicas`	object	no	Per-env fixed replica counts for non-autoscaled services (keys `stg`/`prod` only — no `prev`). For autoscaling, use the per-service `scaling` block instead. See § Replicas.
`render_channel`	enum	no	One of `general` (default) or `canary`. Selects which renderer pin channel the app's CI workflows track. Supersedes the deprecated `canary` boolean. See § Render channel.
`canary`	boolean	no	DEPRECATED — use `render_channel: canary` instead. Backwards-compatible alias for `render_channel: canary`. When both are set, `render_channel` wins. See § Canary (deprecated).
`app_kind`	enum	no	One of `web-service`, `worker`, `cron`, `batch`. Drives the Datadog monitor catalog gate — selects which kind-specific monitors emit. See § App kind.
`slo_tier`	enum	no	One of `tier-1`, `tier-2`, `tier-3`. Catalog-membership lever: how much of the monitor catalog this app gets. See § SLO tier.
`monitors`	object	no	Datadog monitor routing + paging config (PagerDuty service, Google Chat webhook, per-env routing, extras). See § Monitors.
`observability`	object	no	Per-app observability tuning (e.g. Datadog APM trace sample rate). See § Observability.

environments: is rejected at parse. The root object is strict and has no environments key — a file carrying that block fails validation. Remove the block; use disabled_envs: instead. See § Environment opt-out.

Language

The language field declares the runtime/ecosystem the app's services are written in. Forge uses it to pick the right Dockerfile base image and emit the matching per-language env contract.

language: typescript

Value	Notes
`typescript`	Per-language emitter shipped.
`javascript`	Per-language emitter shipped.
`python`	Per-language emitter shipped.
`go`	Per-language emitter shipped.
`csharp`	Per-language emitter shipped.
`java`	Per-language emitter shipped.
`rust`	Per-language emitter shipped.

Lowercase, no version suffix — the value identifies the ecosystem, not the specific runtime version. language is required when services is non-empty and accepted-but-ignored when services: [] (resource-only apps have no code to language-tag, but round-tripping the field is fine).

Adding a new language requires both a schema-enum bump and a corresponding per-language emitter in the platform renderer — file a platform request.

Services

Every app may declare zero or more services. Forge emits one of three structurally distinct shapes per service, depending on port: and expose: (the two exposed rows below share a shape and differ only in how they are routed):

`port:` set?	`expose:` value	Forge emits
no	(n/a)	Just a Deployment (worker)
yes	(absent)	Deployment + ClusterIP Service (in-cluster HTTP, mesh-callable)
yes	`internal`	Deployment + ClusterIP + HTTPRoute, VPN-only (private gateway, auto-derived from `auth`)
yes	`public`	Deployment + ClusterIP + HTTPRoute, internet-facing (served from a private origin — no public-subnet load balancer). Authenticated public apps are supported today; no-auth public apps are not yet supported.

services:
  - name: api
    port: 8080
    expose: public                       # internet-facing (public)
    health_path: /health
    dockerfile: services/api/Dockerfile
    context: services/api/
    replicas: 2
    env:
      LOG_LEVEL: info
    resources:
      cpu_request: "100m"
      memory_request: "256Mi"
      memory_limit: "512Mi"
  - name: worker                         # no port + no expose → worker-only
    dockerfile: services/worker/Dockerfile
    context: services/worker/

Field	Type	Required	Notes
`name`	string	yes	2-40 chars, kebab-case (`^[a-z][a-z0-9-]*[a-z0-9]$` — lowercase letters, digits, dashes; start with letter, end with letter/digit). Used as the K8s service name + ECR repo suffix. Must be unique within `services[]`.
`port`	integer	conditional	1-65535. Required when `expose:` is set. Omitted = worker (Deployment only).
`expose`	enum	no	`"public"` (internet-facing, served from a private origin — no public-subnet load balancer; authenticated public apps are supported today, no-auth public apps are not yet supported) or `"internal"` (VPN-only). Omitted = ClusterIP-only (in-cluster HTTP, no gateway route). The routing tier is platform-derived from `auth` — nothing to configure. v3: multiple services may have `expose:` set (mixed public + internal). At most ONE service may be `public`; that service must be `services[0]` (the primary). Any number of services may be `internal`.
`dns`	object	no	v3 NEW. Per-service DNS override. `{ hostname?: string }` — `hostname` must match `^[a-z0-9]([a-z0-9.-]*[a-z0-9])?$`. When set, overrides the default hostname for this service's HTTPRoute. App-level `dns.zone` and `dns.required` are shared. See § Per-service hostname resolution.
`image`	object	no	v3 NEW. Per-service build configuration. `{ dockerfile?: string, target?: string, base?: musl|glibc }`. `dockerfile` replaces the flat `dockerfile` field (which is soft-deprecated). `target` enables multi-target Dockerfile builds (`docker build --target {target}`). `base` selects the base-image variant. See § Per-service image.
`auth`	object or `"none"`	no	Per-service auth declaration. `kind: "bearer"` flags a service whose callers can't follow OIDC redirects (MCP, CLI, programmatic clients) and routes it through the OAuth 2.1 bearer path — see § Authenticating an MCP server; the literal `"none"` declares an unauthenticated service surface. Omitted = the service inherits the app-level auth posture (OIDC-cookie for browser-facing apps). Per-user/role access is governed by `authz` grants, not per-service fields. See § Authentication.
`health_path`	string	no	HTTP path the K8s liveness/readiness probes hit. Probes are only emitted when BOTH `health_path` and `port` are set.
`dockerfile`	string	no	Soft-deprecated in v3 — use `image.dockerfile` instead. Path to Dockerfile, relative to `context`. Default: `Dockerfile`. Still accepted for backwards compatibility.
`context`	string	no	Soft-deprecated in v3 — use the `image` block instead. Docker build context, relative to repo root. Default: `.`. When `image.dockerfile` is set, this field is ignored (the Dockerfile path is repo-root-relative, making a separate context unnecessary).
`replicas`	integer	no	Fixed replica count for the Deployment. Non-negative. Default: 2 when `port` is set, 1 otherwise — applied by the platform renderer at emit time, not by the schema. Set `0` to suspend deployment without removing the resources. Mutually exclusive with `scaling` — a service is either fixed-count (`replicas`) or autoscaled (`scaling`), never both. (Not to be confused with the top-level per-env `replicas:` map — see § Replicas.)
`scaling`	object	no	v3 NEW. Per-service horizontal autoscaling (HPA). `{ min_replicas, max_replicas, target_cpu }`. When set, the service is managed by a HorizontalPodAutoscaler in stg + prod instead of a fixed replica count. Requires `resources.cpu_request` and is mutually exclusive with `replicas`. See § Per-service autoscaling.
`schedule`	object	no	Run this service on a schedule (a Kubernetes CronJob) instead of as an always-on Deployment. `{ cron, timezone, concurrency?, active_deadline_seconds? }` — mutually exclusive with everything routed/long-running (`port`, `expose`, `health_path`, `dns`, `scaling`, `replicas`). See § Scheduled services.
`env`	object (string→string)	no	Static env vars baked into the Deployment manifest (UPPER_SNAKE_CASE keys). Reserved prefixes (see Reserved env-var name prefixes) cannot be shadowed here.
`resources`	object	no	K8s resource requests/limits. Keys: `cpu_request`, `memory_request`, `memory_limit`. All three must be set together when the block is present — per-field fallback isn't supported. Omit the block entirely for platform defaults (100m / 256Mi / 512Mi).
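To make the schedule row above concrete, here is a minimal sketch of a scheduled service. The service name, build stage, and values are hypothetical, and the timezone is assumed to be an IANA zone name; § Scheduled services is the authority on accepted values.

```yaml
services:
  - name: nightly-report            # hypothetical service name
    schedule:
      cron: "0 6 * * *"             # five-field cron: 06:00 every day
      timezone: America/Denver      # assumed to be an IANA zone name
      active_deadline_seconds: 1800 # optional field, illustrative value
    image:
      dockerfile: Dockerfile
      target: report                # hypothetical build stage
    # port, expose, health_path, dns, scaling, and replicas are omitted:
    # the table above lists them as mutually exclusive with schedule.
```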

Service-to-service communication

Services within the same app share a Kubernetes namespace. Call a sibling service at http://{service-name}:{port} (e.g. http://api:8080). No DNS suffix or service mesh config needed — Istio ambient handles mTLS transparently.
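A minimal sketch of the sibling-call pattern: a web service is handed the in-namespace address of its api sibling through a static env var. The variable name API_BASE_URL is hypothetical; any UPPER_SNAKE_CASE name that avoids the reserved prefixes should work the same way.

```yaml
services:
  - name: web
    port: 3000
    expose: internal
    health_path: /health
    env:
      # Hypothetical variable name. Value follows http://{service-name}:{port}.
      API_BASE_URL: "http://api:8080"
  - name: api
    port: 8080                      # no expose: ClusterIP-only, callable by siblings
    health_path: /health
```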

Resource ownership

All AWS resources declared in resources: are app-scoped, shared across every service in the app. Every service runs under the same pod IAM role and can access the same S3 buckets, SQS queues, databases, etc. There is no per-service resource isolation within an app.

DNS

Controls the app's primary hostname, the Gateway API listener that routes to it, and any sister hostnames on additional zones.

dns:
  required: true
  zone: conservice.ai
  hostname: my-app.conservice.ai
  aliases:
    - zone: conservice.cloud
      hostname: my-app.conservice.cloud

Field	Type	Required	Notes
`required`	bool	no	Whether DNS records get created. Default: false. Only a service with `expose:` set can satisfy `required: true`.
`zone`	enum	no	Primary zone. One of `conservice.ai`, `conservice.cloud`, `capturis.ai`, `svc.conservice.ai`. `conservice.ai` is the documented default for new internet-facing apps; the others are peer primaries for apps with an audience or brand reason to live there. `svc.conservice.ai` is reserved for AWS infra CNAMEs and is rejected for forge apps by the scaffold-input refine; use `conservice.ai` with `services[].expose: internal` for VPN-only apps.
`hostname`	string	no	Optional explicit hostname override (e.g. `rates-prod.conservice.ai`). Must match `^[a-z0-9]([a-z0-9.-]*[a-z0-9])?$` (lowercase letters, digits, dots, dashes). Default: `{app_name}.{zone}`.
`aliases`	array of objects	no	Sister hostnames on additional zones for the same workload (max 5). Each entry: `{ zone, hostname }` — `hostname` must match `^[a-z0-9]([a-z0-9.-]*[a-z0-9])?$`. Each alias renders an additional HTTPRoute on the same Gateway pointing at the same backend Service (TLS terminates at the NLB via the wildcard cert per TLD). Used for multi-TLD apps like the auth front-door — primary on `conservice.ai`, sisters on `conservice.cloud` and `capturis.ai`.

Per-entry alias rules (enforced at parse time):

- aliases[].zone must differ from the primary dns.zone (aliases are sister hostnames on a different TLD).
- aliases[].zone cannot be svc.conservice.ai.
- aliases[].zone values must be distinct across all entries; two aliases on the same zone would collide on the same Gateway listener.
- aliases[].hostname must end with its declared zone (e.g. auth.conservice.cloud for zone: conservice.cloud).
- aliases requires zone to be set (you can't alias-only without a primary).
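The alias rules above can be read off a concrete block. This sketch uses the multi-TLD auth front-door shape mentioned in the field table; the commented entries are illustrative and each is meant to be read in isolation, as a replacement for one of the valid entries.

```yaml
dns:
  required: true
  zone: conservice.ai
  hostname: auth.conservice.ai
  aliases:
    - zone: conservice.cloud
      hostname: auth.conservice.cloud    # ends with its declared zone
    - zone: capturis.ai
      hostname: auth.capturis.ai         # zone differs from every other entry
    # Entries like these would be rejected at parse time:
    # - zone: conservice.ai              # same as the primary dns.zone
    #   hostname: login.conservice.ai
    # - zone: conservice.cloud           # duplicate alias zone
    #   hostname: sso.conservice.cloud
    # - zone: capturis.ai
    #   hostname: auth.conservice.cloud  # hostname does not end with its zone
    # - zone: svc.conservice.ai          # never allowed as an alias zone
    #   hostname: auth.svc.conservice.ai
```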

Public vs. internal exposure is per-service — set expose: public or expose: internal on the service that should accept external/VPN traffic. See § Services.

Per-service hostname resolution (v3)

When a service has expose: set, Forge resolves its hostname using this precedence (first match wins):

Priority	Source	Example
1	`services[i].dns.hostname` (explicit per-service override)	`auth.conservice.ai`
2	`dns.hostname` (app-level, only when this is the sole exposed service — backwards-compat)	`rates.conservice.ai`
3	Primary service (`services[0]`): `{app_name}.{dns.zone}`	`status-page.conservice.ai`
4	Non-primary service: `{app_name}-{service}.{dns.zone}`	`status-page-dashboard.conservice.ai`

The primary-service convention is position-based: services[0] gets the bare apex hostname. Non-primary exposed services append -{service} to the app name, yielding {app_name}-{service}.{dns.zone}. This means the order of services in services[] matters for public apps.

Tip: If you need two public-facing services (e.g. a web UI + a public API), either use two separate apps or route through a single public service that reverse-proxies to internal siblings.

Constraint: at most one public service, and it must be services[0]. Any number of internal services are allowed at any position. An all-internal app (no public) has no position constraint.
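To see the precedence table applied, here is a sketch of one app with each service annotated by the rule that resolves its hostname. The override hostname status-api.conservice.ai is hypothetical. Rule 2 does not apply here because no app-level dns.hostname is set and more than one service is exposed.

```yaml
app_name: status-page
dns:
  zone: conservice.ai
  required: true
services:
  - name: web                 # rule 3 (primary): status-page.conservice.ai
    port: 3000
    expose: public
  - name: dashboard           # rule 4: status-page-dashboard.conservice.ai
    port: 3001
    expose: internal
  - name: api                 # rule 1: the explicit per-service override wins
    port: 8080
    expose: internal
    dns:
      hostname: status-api.conservice.ai   # hypothetical hostname
  - name: worker              # no expose: no hostname is resolved
```

Swapping web and dashboard in the list would be rejected, since the one public service must be services[0].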

Per-service image (v3)

The image block on each service controls Docker build inputs:

services:
  - name: web
    image:
      dockerfile: Dockerfile        # path relative to repo root
      target: web                    # multi-stage build target
  - name: worker
    image:
      dockerfile: Dockerfile
      target: worker

Field	Type	Notes
`dockerfile`	string	Path to Dockerfile. Default: `Dockerfile`. Replaces the flat `services[].dockerfile` field (soft-deprecated).
`target`	string	Docker `--target` stage name. Enables a single multi-stage Dockerfile serving multiple services. Default: none — the whole Dockerfile builds.
`base`	enum	Base-image variant: `musl` (default, Alpine-class) or `glibc`. Use `glibc` when your service loads glibc-linked native addons (e.g. Temporal's core bridge) — on the musl base those fail at startup with `ERR_DLOPEN_FAILED`.

When image.dockerfile is set, the flat dockerfile and context fields are ignored. The CI build workflow emits a matrix entry per service, each with its own --target.
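The base field from the table above is not shown in the earlier example, so here is a small sketch of a shared multi-stage Dockerfile where only the worker opts into glibc. Stage names are hypothetical and assume the Dockerfile defines matching stages.

```yaml
services:
  - name: web
    port: 3000
    image:
      dockerfile: Dockerfile
      target: web              # hypothetical stage name
      # base omitted: defaults to musl
  - name: worker
    image:
      dockerfile: Dockerfile
      target: worker           # hypothetical stage name
      base: glibc              # for glibc-linked native addons (e.g. Temporal's core bridge)
```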

Per-service autoscaling (scaling)

The scaling block puts a service under a Kubernetes HorizontalPodAutoscaler (HPA) instead of a fixed replica count. The autoscaler holds the fleet at a target average CPU utilization, adding pods as load rises and removing them as it falls.

services:
  - name: api
    port: 8080
    expose: public
    health_path: /health
    resources:
      cpu_request: "250m"            # required when scaling is set
      memory_request: "256Mi"
      memory_limit: "512Mi"
    scaling:
      min_replicas: 2
      max_replicas: 10
      target_cpu: 70

Field	Type	Notes
`min_replicas`	integer	Positive. The floor the autoscaler scales down to — the replica count the Deployment holds under no load. Set ≥ 2 for high availability across rolling updates and single-pod failures.
`max_replicas`	integer	Positive, must be ≥ `min_replicas`. The ceiling the autoscaler scales up to under load. Size it to peak expected traffic divided by per-pod throughput.
`target_cpu`	integer	1-100. Target average CPU utilization as a percent of the pod's CPU request. The autoscaler adds pods when average CPU exceeds this and removes them when it falls below. `70` is a sensible default for CPU-bound services.

The block is strict — unknown keys are rejected.

Validation rules (enforced at parse time):

- scaling requires resources.cpu_request on the same service. target_cpu is a percentage of the CPU request, so without a request the autoscaler has no denominator to compute desired replicas against.
- scaling is mutually exclusive with the per-service replicas field. A service is either autoscaled or fixed-count; declaring both is rejected.
- min_replicas must be ≤ max_replicas.

Behavior: Autoscaling is active in stg and prod only. In those envs Forge emits an autoscaling/v2 HorizontalPodAutoscaler and omits the Deployment's spec.replicas so the HPA owns the count; ArgoCD is configured not to reconcile that field, so it won't fight the autoscaler. Preview environments are not autoscaled — an autoscaled service's Deployment runs at the Kubernetes default of 1 pod in preview. For a fully-autoscaled app the kustomize per-env replicas: overlay disappears; for a mixed app it stays in place for the non-autoscaled services only.
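A sketch of a mixed app, with one autoscaled service and one fixed-count service, may help show how the rules combine. The closing comment works through the standard Kubernetes HPA scaling rule with illustrative numbers; the platform's actual HPA tuning (stabilization windows, tolerances) is not described here.

```yaml
services:
  - name: api                   # autoscaled in stg + prod
    port: 8080
    expose: internal
    health_path: /health
    resources:
      cpu_request: "250m"       # target_cpu is a percentage of this request
      memory_request: "256Mi"
      memory_limit: "512Mi"
    scaling:
      min_replicas: 2
      max_replicas: 10
      target_cpu: 70            # aims for about 175m average CPU per pod (70% of 250m)
    # replicas: 3               # would be rejected: mutually exclusive with scaling
  - name: worker                # fixed-count, no autoscaler
    replicas: 1

# Worked example using the standard HPA rule:
#   desired = ceil(current_replicas * current_utilization / target_utilization)
# 4 pods averaging 90% of their CPU request, with a 70% target:
#   ceil(4 * 90 / 70) = ceil(5.14...) = 6 pods, which is within the 2..10 bounds
```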

Recently added field

The scaling block is a recent addition. Apps still on an older renderer build will fail schema validation at CI render — the older build's strict schema rejects scaling as an unknown key. If your render step rejects scaling as unknown, your app's renderer hasn't picked up the field yet; wait for the platform render version to advance (or opt into the canary channel) before adding the block.

App config keys

Dev-supplied secrets (API tokens, OAuth credentials, third-party keys) flow through app_config_keys:

app_config_keys:
  - STRIPE_API_KEY
  - SENTRY_DSN
  - DATADOG_API_KEY

- Keys are UPPER_SNAKE_CASE and must start with a letter.
- Each key gets a REPLACE_ME placeholder in the {app}/config secret in AWS Secrets Manager on first apply. Devs populate values via GitHub Environment Secrets → External Secrets Operator (ESO) sync (NOT by editing AWS Secrets Manager directly).
- PR sync semantics: app_config_keys changes on a PR (additions / removals) are reflected in the PR's preview environment before merge. On merge to main, the same diff flows through to stg/prod placeholders.
- Reserved prefixes are rejected at scaffold time, before infrastructure rendering; see Reserved env-var name prefixes below for the full list. Variables with those prefixes flow through the platform-managed ConfigMap, not through dev-supplied secrets, so putting them in app_config_keys is a mistake.
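As a quick illustration of the key rules above, the sketch below pairs accepted keys with commented-out keys that would be rejected. The rejected names are invented for the example; DATABASE_* is one of the reserved prefixes named in § Env vars.

```yaml
app_config_keys:
  - STRIPE_API_KEY          # accepted: UPPER_SNAKE_CASE, starts with a letter
  - SENTRY_DSN
  # - stripe_api_key        # rejected: not UPPER_SNAKE_CASE
  # - 3DS_SECRET            # rejected: starts with a digit
  # - DATABASE_URL          # rejected: DATABASE_* is a reserved, platform-managed prefix
```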

Env vars

Static per-environment config variables injected directly into the app's platform ConfigMap. Use for non-secret values that differ per environment (e.g., the app's own public URL). Secrets should go through app_config_keys + GitHub Environment Secrets instead.

env_vars:
  EXTERNAL_HOSTNAME:
    prev: "https://auth.prev.conservice.ai"
    stg: "https://auth.stg.conservice.ai"
    prod: "https://auth.conservice.ai"

Each key is an UPPER_SNAKE_CASE env var name; the value is a map of env name (prev / stg / prod) to string. Only the declared envs receive the value — you may declare a subset (e.g., just stg and prod) if the variable isn't needed in every environment.

- Delivery path: the platform renderer emits the values into the app's {app}-env ConfigMap at kustomize-render time. The pod picks them up via envFrom.
- Reserved prefixes rejected: The same prefixes reserved for platform-managed vars (DATABASE_*, S3_BUCKET_*, AWS_REGION, etc.) are rejected at schema validation. See Reserved env-var name prefixes.
- Typed env contract: Keys declared in env_vars appear in the generated src/env.d.ts and src/lib/env.ts files, so TypeScript apps get compile-time type checking.
- Preview: The prev value is used for all preview environments (per-PR envs inherit the prev entry).
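The subset-of-envs case is not covered by the example above, so here is a sketch that combines a fully declared variable with one declared for stg and prod only. SUPPORT_PORTAL_URL and its example.com values are hypothetical.

```yaml
env_vars:
  EXTERNAL_HOSTNAME:
    prev: "https://auth.prev.conservice.ai"   # also used by every per-PR preview env
    stg: "https://auth.stg.conservice.ai"
    prod: "https://auth.conservice.ai"
  SUPPORT_PORTAL_URL:                          # hypothetical key, declared for a subset of envs
    stg: "https://support.stg.example.com"
    prod: "https://support.example.com"
  # DATABASE_POOL_SIZE:                        # would be rejected: reserved DATABASE_* prefix
  #   prod: "20"
```

In this sketch, preview pods would not receive SUPPORT_PORTAL_URL at all, so code reading it should treat the variable as optional.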

Resources

Optional top-level block declaring AWS resources the app needs. Every resource type is optional. Unknown top-level keys (e.g., a typo'd resources.bedrocks) are rejected.

Naming convention: account-scoped resources (SQS, SNS, EventBridge, Step Functions, DynamoDB, Firehose, KMS, IAM) use {env}-{region}-{app}-{key} — no prefix, since the account ID in every ARN already disambiguates. S3 buckets are the one exception: they retain the conservice- prefix because S3 bucket names are globally namespaced and need a company prefix to prevent collisions across all of AWS.

resources:
  s3:
    chat-history:
      versioning: true
  sqs:
    jobs: {}
    notifications:
      dlq: true
      dlq_retention_seconds: 1209600
  database:
    main:
      extensions: [vector, uuid-ossp]
  bedrock:
    model_ids:
      - us.anthropic.claude-sonnet-4-20250514-v1:0
      - amazon.titan-embed-text-v2:0
  kms:
    token-envelope:
      description: "Envelope-encrypt OAuth tokens before writing to DDB"
      actions: [Encrypt, Decrypt, GenerateDataKey]

Resource key naming (applies to ALL resource types)

Map keys must be:

- 1-64 chars (resource-specific tighter caps noted per type below)
- Lowercase, start with a letter
- Contain only [a-z0-9_-]
- NOT start with pr- (reserved for per-PR ephemeral resources)

The key becomes the resource suffix. Example: s3.chat-history → bucket conservice-{env}-{app}-chat-history.
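A short sketch of how keys map to resource names under the naming convention described earlier in this section. The keys themselves are illustrative, and the rejected keys are listed as comments because the schema would not accept them.

```yaml
resources:
  s3:
    chat-history: {}     # bucket: conservice-{env}-{app}-chat-history
  sqs:
    jobs: {}             # queue: {env}-{region}-{app}-jobs
  dynamodb:
    user_sessions:       # underscores are allowed in keys
      hash_key: id
  # Keys like these would be rejected:
  #   pr-cache           (starts with the reserved pr- prefix)
  #   Chat-History       (contains uppercase letters)
  #   9lives             (does not start with a letter)
```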

Per-resource grant fields (s3 / sqs / dynamodb / eventbridge / database)

These resource types may carry optional grant arrays. They name principals that should get tiered access to this resource only — narrower than a full app-level grant. (sns, stepfunctions, firehoses, and kms don't currently take the array-shaped grants; they participate only in the cross-app access: policy described below. KMS has a parallel-but-nested access.team_grants shape — see § KMS.)

team_grants: [{ team, tier }] — give a team's per-team Permission Set tiered access to this resource. team is the kebab slug (resolves to team-{team}@conservice.com); tier is admin or readonly.
user_grants: [{ email, tier }] — direct-add a single @conservice.com user (use sparingly; team_grants is preferred for code-review auditability).
group_grants: [{ group, tier }] — give a non-team Google group (e.g. conservice-finance@conservice.com) tiered access.

Status of materialization:

database.{name}.team_grants / user_grants — fully wired end-to-end. The platform materializes the per-app PostgreSQL login role and the cross-team rds-db:connect grant on the team's AWS Permission Set. Recipient can psql into the DB and has NO access to the app's other AWS resources.
database.{name}.group_grants — accepted at the schema layer; materializes via the per-team Permission Set enumeration.
s3 / sqs / dynamodb / eventbridge team_grants / user_grants / group_grants — accepted at the schema layer; platform-side consumption ships in a follow-on phase. Declaring entries today has no runtime effect on these resource types yet, but the declaration shape is stable.

Per-resource cross-app policy (`access` / `allowed_teams` / `allowed_apps` / `tags`)

The eight resource kinds that participate in cross-app consumes (s3, sqs, sns, dynamodb, eventbridge, stepfunctions, firehoses, database) each accept an optional cross-app consume policy:

resources:
  s3:
    embeddings-cache:
      access: team
      allowed_teams: [ai]
      tags:
        sensitivity: pii

Field	Type	Notes
`access`	enum	`open` (any app in the org may declare `consumes:` against this resource), `team` (only apps owned by a team in `allowed_teams`), `app` (only apps in `allowed_apps` — most restrictive). Auto-defaults to `team` for `database` and for any resource carrying `tags.sensitivity ∈ {pii, pci, hipaa, soc2}` — declare explicitly to override.
`allowed_teams`	array of team slugs	Required non-empty when `access: team` is set explicitly. Empty array (`[]`) is a parse error — either omit, add an entry, or pick a different `access` level.
`allowed_apps`	array of app names	Required non-empty when `access: app` is set explicitly. Same kebab-case rules as `app_name`.
`tags.sensitivity`	enum	`pii`, `pci`, `hipaa`, `soc2` (auto-default `access: team`), or `public` / `internal` (positive-intent labels, no auto-default). Closed enum — typos like `confidential` fail at parse time.

Consumer-side declarations go in the top-level consumes: block.
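
The auto-default and non-empty rules in the table can be modeled in a few lines. This is a sketch under stated assumptions: the type and function names are hypothetical, and when neither an explicit access nor an auto-default applies it returns undefined, because this section does not state the fallback.

```typescript
type Access = "open" | "team" | "app";
type Sensitivity = "pii" | "pci" | "hipaa" | "soc2" | "public" | "internal";

interface ResourcePolicy {
  access?: Access;
  allowed_teams?: string[];
  allowed_apps?: string[];
  tags?: { sensitivity?: Sensitivity };
}

// Sensitivity labels that trigger the documented auto-default to "team".
const AUTO_TEAM: Sensitivity[] = ["pii", "pci", "hipaa", "soc2"];

// Returns the explicit access level, else the documented auto-default,
// else undefined (the fallback is not specified in this section).
function impliedAccess(kind: string, policy: ResourcePolicy): Access | undefined {
  if (policy.access) return policy.access;
  if (kind === "database") return "team";
  const sensitivity = policy.tags?.sensitivity;
  if (sensitivity !== undefined && AUTO_TEAM.includes(sensitivity)) return "team";
  return undefined;
}

// Collects violations of the explicit-access rules from the table.
function policyErrors(policy: ResourcePolicy): string[] {
  const errors: string[] = [];
  if (policy.allowed_teams !== undefined && policy.allowed_teams.length === 0) {
    errors.push("allowed_teams must not be an empty array");
  }
  if (policy.access === "team" && !policy.allowed_teams?.length) {
    errors.push("access: team requires a non-empty allowed_teams");
  }
  if (policy.access === "app" && !policy.allowed_apps?.length) {
    errors.push("access: app requires a non-empty allowed_apps");
  }
  return errors;
}
```

With the example above (access: team, allowed_teams: [ai]), policyErrors returns an empty list; an s3 entry that only sets tags.sensitivity: pii resolves to team.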

S3 (`resources.s3`)

s3:
  history:
    versioning: true
  uploads: {}

Field	Type	Notes
`versioning`	bool	Enable S3 versioning. Default: true. All platform buckets default versioning ON; set `false` only when you don't want object history.

KMS encryption + public-access-block are always on. Per-bucket team_grants / user_grants / group_grants are accepted (renderer-side consumption is in flight — see above).

Bucket-key cap: 20 chars (enforced at the provisioning layer, not at schema parse). S3's 63-char bucket-name limit minus conservice-{env}-{app}- prefix leaves ~24 chars; cap at 20 for safety.

Resolved bucket name: conservice-{env}-{app}-{key} (e.g., conservice-prod-my-app-history). S3 retains the conservice- prefix because S3 bucket names are globally namespaced.

Emitted env var: S3_BUCKET_{KEY} → e.g. S3_BUCKET_HISTORY containing conservice-prod-my-app-history.
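
To make the naming concrete, here is a small sketch of how a key maps to the resolved bucket name and the emitted env var. Both function names are hypothetical, and the upper-snake mapping (hyphens to underscores, then upper-case) is an assumption inferred from examples such as FIREHOSE_STREAM_WEBHOOK_ARCHIVE elsewhere in this reference.

```typescript
// Sketch of the documented S3 naming; not the platform's renderer.
const S3_BUCKET_NAME_LIMIT = 63; // AWS limit on bucket-name length

function resolvedBucketName(env: string, app: string, key: string): string {
  const bucket = `conservice-${env}-${app}-${key}`;
  if (bucket.length > S3_BUCKET_NAME_LIMIT) {
    throw new Error(`bucket name "${bucket}" exceeds ${S3_BUCKET_NAME_LIMIT} chars`);
  }
  return bucket;
}

// Assumption: the env-var suffix is the key upper-cased with "-" mapped to "_".
function bucketEnvVar(key: string): string {
  return `S3_BUCKET_${key.replace(/-/g, "_").toUpperCase()}`;
}
```

App code should still read the bucket name from the env var rather than rebuilding it; the helper only illustrates why long app names leave little room for the key.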

SQS (`resources.sqs`)

sqs:
  jobs:
    visibility_timeout: 30
    retention_seconds: 1209600
    dlq: true
    max_receive_count: 5

Field	Type	Notes
`dlq`	bool	Provision a Dead-Letter Queue and wire the redrive policy on the main queue. Default: true.
`dlq_retention_seconds`	int	DLQ retention in seconds. Default: 1209600 (14 days, AWS max).
`visibility_timeout`	int	Seconds. Default: 30. Set higher when the consumer's per-message processing time can exceed 30s — otherwise the same message is redelivered while still being processed.
`retention_seconds`	int	Main-queue retention in seconds. Default: 345600 (4 days; AWS max 14 days).
`max_receive_count`	int	DLQ-trigger threshold. Default: 5. Lower for fail-fast; higher when transient retries are normal.

Server-side encryption via SQS-managed keys is always on. The pod role gets Send/Receive/Delete on the queue and its DLQ.

Resolved name: {env}-use1-{app}-{key}-queue.

Emitted env vars: SQS_QUEUE_{KEY} (queue URL) and SQS_QUEUE_{KEY}_ARN → e.g. SQS_QUEUE_JOBS.

SNS (`resources.sns`)

sns:
  events: {}

No type-specific knobs declared today — empty object opts in. The pod role gets sns:Publish on the topic ARN.

Resolved name: {env}-use1-{app}-{key}-topic.

Emitted env var: SNS_TOPIC_{KEY}_ARN → e.g. SNS_TOPIC_EVENTS_ARN.

DynamoDB (`resources.dynamodb`)

dynamodb:
  sessions:
    hash_key: id
    hash_key_type: S
    ttl_attribute: expires_at
    point_in_time_recovery: true
  events:
    hash_key: stream_id
    range_key: event_id
    range_key_type: S
    billing_mode: PAY_PER_REQUEST
    gsi:
      by-status:
        hash_key: status
        range_key: created_at
        projection_type: ALL

Field	Type	Notes
`hash_key`	string	Required. Partition key attribute name.
`hash_key_type`	enum	`S` (string), `N` (number), `B` (binary). Default: `S`.
`range_key`	string	Sort key attribute name.
`range_key_type`	enum	Same set as `hash_key_type`. Default: `S`.
`billing_mode`	enum	`PAY_PER_REQUEST` (default) or `PROVISIONED`.
`gsi`	object map	Global Secondary Indexes. Each entry: `hash_key` (required), `range_key?`, `projection_type?` (`ALL` / `KEYS_ONLY` / `INCLUDE`, default: `ALL`). Key types on GSI attributes inherit from the parent table's `hash_key_type` / `range_key_type` — there are no per-GSI type overrides.
`ttl_attribute`	string	Attribute holding TTL epoch seconds. Enables TTL when set.
`point_in_time_recovery`	bool	Default: true.

KMS-encrypted by default. The auto-emitted pod-role policy covers data-plane read/write/query/scan PLUS dynamodb:DescribeTable — DescribeTable is the canonical no-op control-plane probe for readiness checks (verifies IAM + resource exists without leaking item data), so app /readyz handlers can call it without hitting AccessDenied.

Resolved name: {env}-use1-{app}-{key}.

Emitted env var: DYNAMODB_TABLE_{KEY} → e.g. DYNAMODB_TABLE_SESSIONS containing prod-use1-my-app-sessions.

EventBridge (`resources.eventbridge`)

eventbridge:
  domain:
    rules:
      order-placed:
        pattern:
          source: ["my-app.orders"]
          detail-type: ["OrderPlaced"]
        description: "Fire on new orders"

Field	Type	Notes
`rules`	object map	Each rule: `pattern` (object — EventBridge event pattern, validated server-side at apply time), `description` (string).

Resolved bus name: {env}-use1-{app}-{key}.

Emitted env var: EVENTBRIDGE_BUS_{KEY} → e.g. EVENTBRIDGE_BUS_DOMAIN.

Step Functions (`resources.stepfunctions`)

stepfunctions:
  flow:
    type: STANDARD
    definition: |
      {
        "StartAt": "Hello",
        "States": { "Hello": { "Type": "Pass", "End": true } }
      }
    log_level: ALL
    log_retention_days: 30

Field	Type	Notes
`type`	enum	`STANDARD` (default) or `EXPRESS`.
`definition`	string	Required. ASL JSON definition.
`log_level`	enum	`ALL` / `ERROR` / `FATAL` / `OFF`.
`log_retention_days`	int	CloudWatch log retention.

Key cap: 16 chars (enforced at the provisioning layer — IAM role name {prefix}-sfn-{key}-role hits AWS's 64-char ceiling; not enforced at schema parse).

Resolved name: {env}-use1-{app}-{key}.

Emitted env var: SFN_ARN_{KEY} → e.g. SFN_ARN_FLOW.

Bedrock (`resources.bedrock`)

bedrock:
  model_ids:
    - us.anthropic.claude-sonnet-4-20250514-v1:0
    - amazon.titan-embed-text-v2:0

Field	Type	Notes
`model_ids`	array of strings	Required. Non-empty. AWS Bedrock model IDs. Adds `bedrock:InvokeModel` to the pod role.
`knowledge_bases`	bool	Adds Knowledge Base API permissions. Default: false.
`guardrails`	bool	Adds `bedrock:ApplyGuardrail` permission. Default: false. The per-app guardrail resource itself is not provisioned yet — IAM scope only.

Anthropic models need a region prefix

Bare anthropic.* model IDs are rejected. Use the regional inference profile form: us.anthropic.claude-sonnet-4-20250514-v1:0, not anthropic.claude-sonnet-4-20250514-v1:0.

Model ID validation: Each entry must be a valid Bedrock invocation target — either a versioned foundation-model ID ending in :N (e.g. amazon.titan-embed-text-v2:0) or a cross-region inference profile starting with a region prefix (us. / eu. / apac. / global. / ap.). Anthropic models REQUIRE a regional inference profile prefix — bare anthropic.* IDs are rejected at parse time because AWS Bedrock fails them at invocation with "on-demand throughput isn't supported" (validated 2026-05-09).
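
The acceptance rule can be approximated as follows. This is a sketch of the rule as described here, not the schema's actual implementation; isAcceptedModelId is a hypothetical name.

```typescript
// Region prefixes that mark a cross-region inference profile, per the rule above.
const REGION_PREFIXES = ["us.", "eu.", "apac.", "global.", "ap."];

function isAcceptedModelId(id: string): boolean {
  // A regional inference profile is accepted for any provider.
  if (REGION_PREFIXES.some((prefix) => id.startsWith(prefix))) return true;
  // Bare anthropic.* IDs are rejected even when versioned.
  if (id.startsWith("anthropic.")) return false;
  // Otherwise require a versioned foundation-model ID ending in ":N".
  return /:\d+$/.test(id);
}
```

Under this sketch, us.anthropic.claude-sonnet-4-20250514-v1:0 and amazon.titan-embed-text-v2:0 pass, while anthropic.claude-sonnet-4-20250514-v1:0 fails.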

Common gotcha: the schema accepts model_ids (plural, underscore). The block uses strict-key validation, so any unknown key fails — including the common typos enabled, models, and singular model_id. Presence of the block (non-null) is the opt-in; there's no enabled: true.

Database (`resources.database`)

The simplest database declaration — just a PostgreSQL database with no extras:

database:
  main: {}

Full example with all optional fields:

database:
  main:
    extensions: [vector, uuid-ossp]
    schemas: [app, audit]              # extra schemas owned by the migration role
    connection_limit: 100
    team_grants:
      - team: data
        tier: readonly
    user_grants:
      - user: alice
        tier: admin
    # migrations default ON (managed Liquibase) — omit the field entirely to keep it.
    # Override with a custom command, or set `migrations: false` to opt out:
    migrations:
      command: ["npm", "run", "migrate"]
      runs_on: [prev, stg, prod]
    seed:                              # preview-only fixture seeding (optional)
      command: ["npm", "run", "seed:preview"]

Field	Type	Notes
`extensions`	array of strings	PostgreSQL extensions to enable. Allowlist: `vector` (NOT `pgvector` — the schema rejects `pgvector`; `pgvector` is the project name, `vector` is the extension name), `uuid-ossp`, `pg_trgm`, `hstore`, `citext`, `postgis`, `btree_gist`, `btree_gin`, `unaccent`, `fuzzystrmatch`.
`connection_limit`	int	Per-role PostgreSQL `CONNECTION LIMIT`. Default: unlimited at the module.
`team_grants`	array	Per-DB team grants — `{ team: <slug>, tier: admin\|readonly }`. On apply, the team's per-team Permission Set gains `rds-db:connect` on `arn:aws:rds-db:::dbuser:*/aws-{app}-db-{tier}`. Recipient gets DB-ONLY AWS access (no S3, queues, secrets-other-than-the-DB-secret, console for any other app resource).
`user_grants`	array	Per-DB user grants — `{ user: <google-username>, tier: admin\|readonly }`. Narrow per-user exception adding `rds-db:connect` on `aws-{app}-db-{tier}` for a single individual. `user` is the LEFT side of `@conservice.com` (e.g. `alice`, `bob.smith`) — no `@conservice.com` suffix.
`group_grants`	array	Per-DB Google-group grants — `{ group: <conservice.com group email>, tier }`. For non-team groups (e.g. `conservice-finance@conservice.com`). Materializes via the per-team Permission Set enumeration.
`schemas`	array of strings	Additional PostgreSQL schema names to pre-create in this database (default: none; the app uses `public`). Each name is created by the platform and owned by the migration role, so the managed migration job can create and own objects in it while the runtime service role gets `USAGE` only (no DDL). Use for frameworks that default to a named schema (e.g. EF Core `HasDefaultSchema("app")`). The same schemas are pre-created in per-PR (preview) databases. Each entry must match `^[a-z][a-z0-9_]*$`, be ≤ 63 chars, and must not be `public`, `information_schema`, or any `pg_*` name; those are reserved/system schemas and are rejected at parse.
`migrations`	`false` or object	Managed database migrations. Default ON (the field is optional and defaults to enabled when omitted): forge bakes Liquibase into the app image and runs a migrate Job — a `migration-job.yaml` ArgoCD Sync hook — in every env overlay before app pods start, so schema is bootstrapped before traffic. Set `migrations: false` to turn it OFF — forge then skips both the migrate Job and the Liquibase/JRE image layer (use this when the app manages its own schema or has none). Provide an object `{ command: [string], runs_on?: [env] }` to customize. See § Database migrations below.
`seed`	object	Preview-only fixture-data seed runner. `{ command: [string], runs_on?: [prev] }`. When set, forge emits a `seed-job.yaml` ArgoCD Sync hook into the preview overlay only, running `command` as the app's runtime role after the Deployment is healthy. Preview-only by design — `stg` / `prod` are rejected in `runs_on`. See § Preview-only seed below.
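
The `schemas` naming rule from the table can be sketched as a local check. Assumptions: the pattern is read as ^[a-z][a-z0-9_]*$ (a letter followed by any number of lowercase letters, digits, or underscores), and isAcceptedSchemaName is a hypothetical name, not a forge API.

```typescript
// Assumed pattern: lowercase letter first, then lowercase letters, digits, underscores.
const SCHEMA_NAME_PATTERN = /^[a-z][a-z0-9_]*$/;

function isAcceptedSchemaName(schema: string): boolean {
  if (schema.length > 63 || !SCHEMA_NAME_PATTERN.test(schema)) return false;
  // Reserved/system schemas are rejected.
  if (schema === "public" || schema === "information_schema") return false;
  return !schema.startsWith("pg_");
}
```

So schemas: [app, audit] passes, while public, pg_catalog, and my-schema would be rejected.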

Database tenancy: Aurora is a SHARED cluster across all apps in an env. Each database.{key} declaration creates a logical PostgreSQL database inside the shared cluster.

Resolved DB name: {app_underscored} (hyphens become underscores). Service user: {app_underscored}_svc.

engine is NOT a valid key. Always aurora-postgresql (set at the platform layer). Strict-key validation rejects any unknown key — engine is a particularly common one to accidentally include because it's standard in raw RDS Terraform. Don't.

Emitted env vars: DATABASE_HOST, DATABASE_PORT, DATABASE_NAME, DATABASE_USER — your app reads these from the pod environment. IAM auth (rds_iam) is enabled automatically; the platform provisions the PG roles and IAM bindings. You don't create DB users in forge.yaml.
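
A minimal sketch of reading these variables at startup follows. The helper names are hypothetical, and the env map is passed in explicitly (in a real app it would typically be process.env). Minting the IAM auth token that serves as the password is not shown.

```typescript
type Env = Record<string, string | undefined>;

interface DbConnectionInfo {
  host: string;
  port: number;
  database: string;
  user: string;
}

// Fails fast with the variable name when something is missing.
function requiredVar(env: Env, key: string): string {
  const value = env[key];
  if (!value) throw new Error(`${key} is not set`);
  return value;
}

function dbConnectionInfo(env: Env): DbConnectionInfo {
  const port = Number(requiredVar(env, "DATABASE_PORT"));
  if (!Number.isInteger(port)) throw new Error("DATABASE_PORT is not an integer");
  return {
    host: requiredVar(env, "DATABASE_HOST"),
    port,
    database: requiredVar(env, "DATABASE_NAME"),
    user: requiredVar(env, "DATABASE_USER"),
  };
}
```

Failing at startup on a missing variable surfaces a mis-rendered overlay immediately instead of at the first query.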

Granting another team DB access: add a team_grants entry. That's the entire dev-facing surface. The platform wires the IAM role, Permission Set, and PG login role automatically.

How DB access works under the hood

Two grant surfaces apply, layering from broad → narrow. Both wire to the same per-app PostgreSQL login role aws-{app}-db-{tier}.

1. Team-keyed full AWS access (configured outside forge.yaml) — the owning team and any team granted DB access get full AWS access via per-team Permission Sets (team-{team}-{env_short}-admin / team-{team}-{env_short}-readonly). AWS access is keyed by team, not by app — there is no aws-{app}-admin/readonly group.

2. Per-DB grants on database.{name} — narrow scope, DB-only. These are the declarations developers write in forge.yaml:

tier: admin → full read/write via login role aws-{app}-db-admin. No access to S3, queues, or other AWS resources.
tier: readonly → SELECT-only via login role aws-{app}-db-readonly. Same DB-only narrow scope.

Per-DB grants are the right surface for cross-team data access (e.g. team-data needs read-only access to the billing app's databases but should NOT see S3 or queues). Multiple teams sharing a tier share one cluster-level login role; per-team identity gating happens at the Permission Set layer.

Warning: the fields admin_groups, readonly_groups, admin_users, and readonly_users have been removed and are no longer accepted. Use team_grants / user_grants instead.

Database migrations

Managed migrations are ON by default. When you declare a database and omit the migrations field, forge bakes Liquibase (plus a JRE layer) into the app image and emits a migrate Job — migration-job.yaml, an ArgoCD Sync hook — in every env overlay. The Job re-uses the app image, mints an RDS IAM token, and runs the migration against the per-env (or per-PR) database before app pods start. Sync-wave ordering guarantees the Deployment only rolls once the migrate Job completes, so app code never sees an un-migrated schema. The default command for the bundled Liquibase wrapper is ["infra/database/migrations/migrate.sh"] (mints the IAM token, then runs liquibase update). You don't write any of this — declaring the database is enough to get migrations.

The field is a union of three shapes:

`migrations:` value	Behavior
omitted (default)	Managed migrations ON — forge runs the Liquibase migrate Job and bakes the Liquibase/JRE layer into the image. Nothing to declare.
`false`	Managed migrations OFF — forge skips both the migrate Job and the Liquibase/JRE image layer. Use when the app manages its own schema, or has none.
`{ command, runs_on? }`	Custom runner — replace the default command and/or scope which env overlays run it.

When you supply an object:

Field	Type	Notes
`command`	array of strings	Required, non-empty. Argv array the Job's container runs — each entry is a literal string, no shell expansion. Tool-agnostic: point it at node-pg-migrate, Prisma, Atlas, a `psql` script, etc. (also update the install layer in your Dockerfile to match).
`runs_on`	array of enums	Which env overlays emit the migrate Job. Each entry is `prev`, `stg`, or `prod`. Default (omitted): all three. Scope to a subset (e.g. `[prev, stg]`) to skip prod migrations during a schema-stable window.

# Opt out of managed migrations entirely:
database:
  main:
    migrations: false

# Or supply a custom runner:
database:
  main:
    migrations:
      command: ["npm", "run", "migrate"]
      runs_on: [prev, stg, prod]
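
The three shapes of the field can be modeled as a TypeScript union. This is an illustrative sketch: the type and function names are hypothetical, and the default command is the one quoted in the section above.

```typescript
type EnvName = "prev" | "stg" | "prod";
type Migrations = undefined | false | { command: string[]; runs_on?: EnvName[] };

interface MigratePlan {
  command: string[];
  envs: EnvName[];
}

const ALL_ENVS: EnvName[] = ["prev", "stg", "prod"];
const DEFAULT_COMMAND = ["infra/database/migrations/migrate.sh"];

// Returns which command runs in which env overlays, or null when opted out.
function migratePlan(migrations: Migrations): MigratePlan | null {
  if (migrations === false) return null; // managed migrations OFF
  if (migrations === undefined) return { command: DEFAULT_COMMAND, envs: ALL_ENVS };
  if (migrations.command.length === 0) {
    throw new Error("migrations.command must be non-empty");
  }
  return { command: migrations.command, envs: migrations.runs_on ?? ALL_ENVS };
}
```

Omitting the field yields the default command in all three envs, false yields no Job at all, and an object overrides the command while runs_on narrows the envs.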

Preview-only seed

The optional seed block loads fixture data into preview databases only — fake data must never reach staging or production, so the runner is hard-scoped to prev. When set, forge emits a seed-job.yaml ArgoCD Sync hook into the preview overlay only, running command as the app's runtime service role (fixtures are DML inserts, not DDL — they don't use the migration identity) after the app Deployment is healthy.

Field	Type	Notes
`command`	array of strings	Required, non-empty. Argv array the seed Job runs (e.g. `["npm", "run", "seed:preview"]`). Literal strings, no shell expansion. Make it idempotent (e.g. `INSERT ... ON CONFLICT DO NOTHING`) — ArgoCD re-syncs re-run it.
`runs_on`	array of enums	Only `prev` is accepted; `stg` / `prod` are rejected at parse. Optional, defaults to `[prev]` — the field mostly exists to document intent.

database:
  main:
    seed:
      command: ["npm", "run", "seed:preview"]

Temporal (`resources.temporal`)

temporal:
  retention_days: 30
  api_key_expiry: "2027-05-03T00:00:00Z"

Field	Type	Notes
`retention_days`	int	Workflow history retention. Default: 30.
`api_key_expiry`	string	Required. ISO 8601 timestamp. Pinned at scaffold; runtime re-reads from forge.yaml (deterministic re-render: byte-identical input produces byte-identical output). Rotate by editing this value and re-rendering.

Common gotchas: rejected keys include enabled, namespace, regions, search_attributes, enable_delete_protection. Presence of the block is the opt-in. Namespace is derived from app_name.

Resolved namespace: {app}-{env}.<your-temporal-cloud-namespace>.

Firehose (`resources.firehoses`)

firehoses:
  webhook-archive:
    destination: s3
    bucket: webhook-events  # MUST match a key in resources.s3
    prefix: "events/"
    buffer_size_mb: 5
    buffer_interval_seconds: 300
    compression: GZIP

Field	Type	Notes
`destination`	string	Required, only `s3` today. Redshift / OpenSearch / Splunk are deferred.
`bucket`	string	Required. Must match a key declared in `resources.s3` (cross-validated at parse time).
`prefix`	string	S3 key prefix for delivered records. Default: `""`.
`buffer_size_mb`	int	1-128. Default: 5.
`buffer_interval_seconds`	int	60-900. Default: 300.
`compression`	enum	`UNCOMPRESSED`, `GZIP` (default), `SNAPPY`, `ZIP`, `HADOOP_SNAPPY`.

Key cap: 16 chars (enforced at the provisioning layer — IAM role {prefix}-fh-{key}-role hits 64-char ceiling; not enforced at schema parse).

Resolved name: {env}-use1-{app}-{key}.

Emitted env var: FIREHOSE_STREAM_{KEY} → e.g. FIREHOSE_STREAM_WEBHOOK_ARCHIVE.
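
The field constraints above, including the cross-check of bucket against resources.s3, can be sketched as a local lint. The names here are hypothetical, and the real checks run in the forge schema and provisioning layer.

```typescript
interface FirehoseConfig {
  destination: string;
  bucket: string;
  prefix?: string;
  buffer_size_mb?: number;
  buffer_interval_seconds?: number;
  compression?: string;
}

const COMPRESSIONS = ["UNCOMPRESSED", "GZIP", "SNAPPY", "ZIP", "HADOOP_SNAPPY"];

// Returns a list of violations; s3Keys are the keys declared under resources.s3.
function firehoseErrors(cfg: FirehoseConfig, s3Keys: string[]): string[] {
  const errors: string[] = [];
  const size = cfg.buffer_size_mb ?? 5; // documented default
  const interval = cfg.buffer_interval_seconds ?? 300; // documented default
  const compression = cfg.compression ?? "GZIP"; // documented default
  if (cfg.destination !== "s3") errors.push("destination must be s3");
  if (!s3Keys.includes(cfg.bucket)) {
    errors.push(`bucket "${cfg.bucket}" is not a key in resources.s3`);
  }
  if (!Number.isInteger(size) || size < 1 || size > 128) {
    errors.push("buffer_size_mb must be 1-128");
  }
  if (!Number.isInteger(interval) || interval < 60 || interval > 900) {
    errors.push("buffer_interval_seconds must be 60-900");
  }
  if (!COMPRESSIONS.includes(compression)) {
    errors.push(`unknown compression "${compression}"`);
  }
  return errors;
}
```

The webhook-archive example above passes when resources.s3 declares webhook-events; a bucket that is not declared, or an interval of 30, would each produce one error.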

KMS (`resources.kms`)

Per-app customer-managed KMS keys (CMKs) for app-initiated envelope encryption — e.g., an auth service that envelope-encrypts upstream IdP tokens before writing them to DynamoDB. Distinct from the AWS-managed SSE-KMS that already covers DDB and S3 at rest (alias/aws/dynamodb / alias/aws/s3); that's transparent to the app. The case here is app code calling kms:Encrypt / Decrypt / GenerateDataKey directly against a CMK the app controls.

resources:
  kms:
    token-envelope:
      description: "Envelope-encrypt upstream IdP tokens before DDB writes"
      actions:
        - Encrypt
        - Decrypt
        - GenerateDataKey
        - DescribeKey
      rotation: enabled

Key name (the map key) is kebab-case, 2-20 chars, must start with a letter and end alphanumeric. Same reserved-prefix list as app_name. Pick a name that describes the encryption purpose (token-envelope, secrets), not the resource type.

Field	Type	Notes
`description`	string	Optional human-readable description (shown in the KMS console). When `rotation: disabled` is set, the description should mention the compliance reason — auditor-friendly. Max 8192 chars.
`key_spec`	enum	`SYMMETRIC_DEFAULT` (default; only value accepted today). Asymmetric specs (`RSA_*`, `ECC_*`) are deferred: they need a different action allowlist (`Sign`/`Verify`/`GetPublicKey`) and will arrive in a future release.
`actions`	array of strings	Required, non-empty. Data-plane KMS actions to grant to the pod role on this key. Allowlist: `Encrypt`, `Decrypt`, `GenerateDataKey`, `GenerateDataKeyWithoutPlaintext`, `ReEncryptFrom`, `ReEncryptTo`, `DescribeKey`. Unknown verbs fail at parse time. Key administration verbs (`CreateKey`, `ScheduleKeyDeletion`, `PutKeyPolicy`, ...) are deliberately omitted — key lifecycle is managed by the platform, not the app.
`rotation`	enum	`enabled` (default — annual KMS rotation) or `disabled`. Set `disabled` only with a compliance reason in `description`.
`tags`	object (string→string)	Optional pass-through tags applied to the KMS key. Standard k/v string map; no platform-side validation beyond the type. Use for cost-allocation (`cost_center: ...`) or compliance markers.
`access.team_grants`	array	Per-key team grants (accepted at the schema level; platform-side consumption deferred to a follow-on release — declaring entries today has no runtime effect).

Resolved name: {env}-{region_code}-{app}-{key_name}-key (e.g., prod-use1-auth-service-token-envelope-key). Resolved alias: alias/{env}-{region_code}-{app}-{key_name}-key.

Auto-emitted env vars (KMS_KEY_ is a reserved env-var prefix):

KMS_KEY_{KEY_NAME_UPPER_SNAKE}_ID (e.g. KMS_KEY_TOKEN_ENVELOPE_ID)
KMS_KEY_{KEY_NAME_UPPER_SNAKE}_ARN

App code reads these from the pod environment — never construct a KMS key ID or alias in app code.
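
A short sketch of looking up the emitted variables from a key name follows. The function names are hypothetical, and the upper-snake conversion (hyphens to underscores, then upper-case) is an assumption consistent with the KMS_KEY_TOKEN_ENVELOPE_ID example.

```typescript
type Env = Record<string, string | undefined>;

// Assumption: UPPER_SNAKE means upper-case with "-" replaced by "_".
function kmsEnvVarNames(keyName: string): { id: string; arn: string } {
  const snake = keyName.replace(/-/g, "_").toUpperCase();
  return { id: `KMS_KEY_${snake}_ID`, arn: `KMS_KEY_${snake}_ARN` };
}

// Reads the key ID from the environment instead of constructing it in app code.
function kmsKeyId(env: Env, keyName: string): string {
  const { id } = kmsEnvVarNames(keyName);
  const value = env[id];
  if (!value) throw new Error(`${id} is not set`);
  return value;
}
```

The returned value is what app code would pass as the key identifier to its KMS client.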

Cross-app access (per-resource policy + consumes)

Cross-app resource access is a two-sided contract: producers declare who is allowed to consume each resource (the per-resource access / allowed_teams / allowed_apps / tags fields documented above); consumers declare which producer resources and services they want to use, via the top-level consumes: block.

# Consumer side: my-app declares it wants to read rates' embeddings cache
consumes:
  resources:
    - producer_app: rates
      resource_kind: s3
      resource_key: embeddings-cache
      actions: [read]
  services:
    - producer_app: rates
      service_name: api

Field	Type	Notes
`consumes.resources`	array	Cross-app resource declarations. Each entry: `{ producer_app, resource_kind, resource_key, actions: [string] }`. The resolver matches each entry against the producer's `resources.{kind}.{key}` and the producer's `access` / `allowed_teams` / `allowed_apps` policy at scaffold/modify time.
`consumes.services`	array	Cross-app service declarations. Each entry: `{ producer_app, service_name }`. Data-only today — no Istio AuthorizationPolicy is emitted yet (deferred to a future release). Records intent so future wiring lands without a schema migration.

producer_app follows the same kebab-case rules as app_name (reserved prefixes rejected). resource_kind is one of: s3, sqs, sns, dynamodb, eventbridge, stepfunctions, firehoses, database.

actions: is high-level, not raw IAM verbs — entries like read, write, consume, produce, admin. The action-expansion utility maps each high-level action to per-kind IAM actions at emit time; unknown high-level actions for a kind throw at scaffold/modify time (NOT at schema parse — the schema treats actions as opaque strings to keep the kind/action coupling in one place).

Same-account only today. Cross-account consumes raise cross_account_not_supported at resolve time rather than at parse time.

Preview-environment plumbing: when the consumer is itself running in a per-PR preview environment, the resolver injects the consumer's pr_number into non-S3 ARNs to disambiguate per-PR resources (S3 is per-app, not per-PR, so its ARN doesn't carry the segment).
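
The producer-side check in this two-sided contract can be sketched as below. This models only the access / allowed_teams / allowed_apps decision as described in this reference. The names are hypothetical, and the real resolver also handles action expansion, account checks, and preview ARNs.

```typescript
type Access = "open" | "team" | "app";

interface ProducerPolicy {
  access: Access;
  allowed_teams?: string[];
  allowed_apps?: string[];
}

interface ConsumerApp {
  app: string; // consumer app_name
  team: string; // slug of the team that owns the consumer app
}

// True when the producer's policy permits this consumer to declare consumes.
function mayConsume(policy: ProducerPolicy, consumer: ConsumerApp): boolean {
  switch (policy.access) {
    case "open":
      return true;
    case "team":
      return (policy.allowed_teams ?? []).includes(consumer.team);
    case "app":
      return (policy.allowed_apps ?? []).includes(consumer.app);
  }
}
```

For the embeddings-cache example (access: team, allowed_teams: [ai]), a consumer owned by team ai is permitted and one owned by any other team is not.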

Authentication (`auth`)

The auth field controls who can reach the app at the identity layer — whether the app requires a logged-in @conservice.com user. It pairs with authz: auth is authentication (WHO can reach the app), authz is AVP Cedar authorization (WHO is permitted, and with which roles, once authenticated). See How auth and authz fit together below.

auth is a union — either the literal string "none" or an auth: block.

`auth: "none"` — public, no authentication

auth: "none"

A no-auth app. Forge skips tier-group creation, Auth Service registration, and AVP setup. The app is served via a private ALB (not the Istio gateway) — there is no authentication enforcement. No-auth internal apps are VPN-only; no-auth public apps are not yet supported.

Two constraints apply when auth: "none" (the first is enforced at validation):

authz: cannot be set. A no-auth app has no authenticated principal, so AVP grants are meaningless.
Routing is platform-derived. A no-auth app is automatically kept OFF the authenticated (ext_authz-enforcing) routing tier and served via a private ALB — there is nothing to configure. (A legacy gateway: key is accepted for back-compat and stripped; never set it.)

`auth:` block — authenticated

auth:
  access_mode: restricted
  strict: false
  hidden: false

Field	Type	Required	Notes
`access_mode`	enum	no	`"restricted"` (default) allows only users granted access via AVP Cedar policies (your team is seeded at scaffold; everything after that is managed in auth-portal or via forge's grant tools). `"all"` allows any authenticated `@conservice.com` user — authentication is still enforced, per-user authorization is not.
`strict`	boolean	no	Default `false`. When `true`, the Auth admin UI cannot grant access beyond what's declared — `forge.yaml` is the sole authority. Recommended for prod-critical or Aurora-touching apps.
`hidden`	boolean	no	Default `false`. Hides the app's tile from the Portal dashboard. The app stays reachable at its hostname; it just doesn't appear in the user's tile grid.

Migrating from auth.tiers? The per-app tier-group model (tiers:, self_register:, per-service auth.tier) has been retired — those fields now fail validation. Role and access management moved to authz grants (AVP Cedar): declare authz.initial_grants for scaffold-time seeds and manage everything else in auth-portal. Your app reads the caller's identity and roles from the x-verified-* headers.

How auth and authz fit together

These are two distinct layers, both enforced at the Istio gateway before the request reaches your app:

auth — authentication. Validates the user's Google session. Answers WHO can reach the app at the identity layer.
authz — AVP (AWS Verified Permissions) Cedar authorization. Checks whether the authenticated user holds a grant for this app. Answers WHO is permitted once authentication succeeds. Fail-closed: no grant means 403.

Your app reads the authenticated caller's identity and roles from the injected x-verified-* headers (see Reading user identity and roles in your app) — role logic lives in your code against those headers, not in per-service config.

MCP servers (and any caller that can't follow a browser login redirect) use the per-service auth.kind: bearer field instead of the cookie front door. See § Authenticating an MCP server.

Authorization (`authz`)

All authentication and authorization for apps on the greenfield platform flows through the Istio gateway:

Authentication — validates the user's Google session. Any @conservice.com employee is authenticated. No app code needed.
Authorization — checks if the authenticated user has a grant for this app. Fail-closed: no grant = 403.

You don't write auth code. The platform handles both layers at the gateway before the request reaches your app. If your app needs to know who the user is or what roles they have, read the x-verified-* headers from the request.

How it works

User → Google login → Authorization check → Your app
                        ↓
                ALLOW → request forwarded with x-verified-* headers (identity + roles)
                DENY  → 403 Forbidden (user never reaches your app)

forge.yaml setup

The simplest setup — your team automatically gets access:

services:
  - name: api
    port: 8080
    expose: internal
    # gateway auto-derived: istio (authenticated apps get ext_authz enforcement)
    health_path: /health
dns:
  zone: conservice.ai
authz: {}

That's it. At scaffold time, forge automatically creates a group-based grant for Group::"team-{team}@conservice.com" on Application::"{app_name}" — the Admin role in preview, the Access role in staging/production. Every member of your team can access the app immediately — no individual grants needed.

With additional individual grants (optional — for people outside your team):

authz:
  initial_grants:
    - alice@conservice.com
    - bob@conservice.com

Field	Type	Required	Notes
`initial_grants`	array or per-env map	no	Default: `[]`. Grants seeded at scaffold time, beyond the automatic team grant. Two accepted shapes — see below.

initial_grants accepts either of two shapes:

Legacy flat list — an array of @conservice.com emails. Each listed user is granted the access role in every env (prev / stg / prod).
```
authz:
  initial_grants:
    - alice@conservice.com
    - bob@conservice.com
```
Per-env, per-role map — keys are env names (prev / stg / prod), each mapping a role name to a list of principals. A principal is either a team slug in team-<slug> form or a @conservice.com email.
```
authz:
  initial_grants:
    prev:
      admin: [team-ai]
      access: [alice@conservice.com]
    stg:
      access: [team-ai]
    prod:
      access: [team-ai]
```
Every role key used in the map must be a built-in (access / admin) or a name declared under authz.roles[] — an undeclared-role grant is rejected at parse (it would otherwise be silently dropped and the principal would never get access). Each env is optional; omit envs you don't want to seed.
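
One way to see how the two shapes relate is to normalize the legacy list into the per-env map. This is a sketch with hypothetical names; it assumes the legacy list expands to the access role in all three envs, as described above.

```typescript
type EnvName = "prev" | "stg" | "prod";
type PerEnvGrants = Partial<Record<EnvName, Record<string, string[]>>>;
type InitialGrants = string[] | PerEnvGrants;

const GRANT_ENVS: EnvName[] = ["prev", "stg", "prod"];

// Expands the legacy flat list; a per-env map is returned unchanged.
function normalizeInitialGrants(grants: InitialGrants): PerEnvGrants {
  if (!Array.isArray(grants)) return grants;
  const result: PerEnvGrants = {};
  for (const env of GRANT_ENVS) {
    // Each listed email gets the built-in access role in every env.
    result[env] = { access: grants.slice() };
  }
  return result;
}
```

A flat list with alice@conservice.com is therefore equivalent to a map granting access to that email under prev, stg, and prod.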

Key points:

Auth is automatic. Authenticated apps (the default) auto-derive to the Istio gateway, which runs oauth2-proxy (authentication) + AVP validator (authorization) via ext_authz. Users must log in with Google and have an AVP grant to access the app. No gateway: field needed.
Your team always has access. Forge creates a team-group grant automatically at scaffold time. You don't need initial_grants for your own team members.
initial_grants is scaffold-time only. Entries are seeded once for people outside your team. After that, auth-portal manages all grants (create, revoke, access requests).
No-auth apps use auth: "none". This auto-derives off Istio to a private ALB (no authentication). No-auth internal apps are VPN-only; no-auth public apps are not yet supported.

How authorization works

The platform handles authorization automatically. You declare what you need in forge.yaml; the platform creates the authorization policies and enforces them at the gateway.

Roles

Every app starts with two built-in roles:

Role	Description
Access	Can use the application
Admin	Can manage grants and settings for the app via auth-portal or forge's grant tools

Your team automatically gets the Admin role in preview and the Access role in staging/production at scaffold time. Use auth-portal to grant roles to additional people or groups.

Custom roles

Apps can declare custom roles beyond the built-in Access and Admin:

authz:
  roles:
    - name: editor
      description: "Can edit content but not manage access"
    - name: viewer
      description: "Read-only access to dashboards"

Custom roles appear automatically in auth-portal's grant management UI. Admins can assign them to individuals or groups — no code changes needed.

Reserved role names. The names access, admin, and ping are reserved platform actions and cannot be used as a custom authz.roles[].name — declaring one is rejected at parse. access is the default individually-grantable role; admin is group-scoped (used for auth-portal admin gating) and is not individually grantable; ping is a platform health action. To give a single user elevated access, declare a custom role (e.g. editor) and grant that.

Roles are additive — add a new role to authz.roles and it's available in auth-portal immediately. No re-scaffold needed.
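A quick way to catch a reserved-name collision locally is sketched below. The function name and message format are hypothetical; only the three reserved names come from this document.

```typescript
// Sketch only: flags custom roles that collide with reserved platform actions.
const RESERVED_ROLE_NAMES = new Set(["access", "admin", "ping"]);

function reservedRoleErrors(roles: { name: string; description?: string }[]): string[] {
  return roles
    .filter((r) => RESERVED_ROLE_NAMES.has(r.name))
    .map((r) => `authz.roles: "${r.name}" is a reserved role name`);
}
```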

Reading user identity and roles in your app

The gateway sets these headers on every authenticated + authorized request:

Header	Value	Example
`x-verified-email`	User's email	`alice@conservice.com`
`x-verified-name`	Display name (may be empty)	`Alice Smith`
`x-verified-groups`	Comma-separated group memberships	`team-sre@conservice.com`
`x-verified-roles`	Comma-separated roles the user has on this app	`access,editor`

Example — display user info (TypeScript/Express):

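import express from "express";

const app = express();

app.get("/api/me", (req, res) => {
  // req.get() returns one string (or undefined); req.headers[...] may be string[]
  res.json({
    email: req.get("x-verified-email"),
    name: req.get("x-verified-name") ?? "", // display name may be empty
    roles: req.get("x-verified-roles")?.split(",") ?? [],
  });
});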

Example — gate a feature by role:

app.put("/api/content/:id", (req, res) => {
  // Roles arrive as one comma-separated header set by the gateway
  const roles = req.get("x-verified-roles")?.split(",") ?? [];
  if (!roles.includes("editor")) {
    res.status(403).json({ error: "Editor role required" });
    return;
  }
  // ... update content
});

That's it. No auth libraries, no JWT verification, no session management. The platform handles authentication, authorization, and role resolution before the request reaches your app.
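If several handlers read these headers, a small framework-independent parser keeps the splitting logic in one place. This is a sketch under stated assumptions: the type and function names are hypothetical, and the header bag shape matches Node's `IncomingHttpHeaders` (values may be `string`, `string[]`, or `undefined`).

```typescript
type HeaderBag = Record<string, string | string[] | undefined>;

interface VerifiedIdentity {
  email: string;
  name: string;
  groups: string[];
  roles: string[];
}

// Node may surface a repeated header as string[]; collapse it to one string.
function headerValue(headers: HeaderBag, key: string): string {
  const raw = headers[key];
  return Array.isArray(raw) ? raw.join(",") : raw ?? "";
}

// Split a comma-separated header, trimming whitespace and dropping empties,
// so an empty header yields [] rather than [""].
function splitList(value: string): string[] {
  return value.split(",").map((s) => s.trim()).filter((s) => s.length > 0);
}

function parseVerifiedIdentity(headers: HeaderBag): VerifiedIdentity {
  return {
    email: headerValue(headers, "x-verified-email"),
    name: headerValue(headers, "x-verified-name"),
    groups: splitList(headerValue(headers, "x-verified-groups")),
    roles: splitList(headerValue(headers, "x-verified-roles")),
  };
}

function hasRole(identity: VerifiedIdentity, role: string): boolean {
  return identity.roles.includes(role);
}
```

In an Express handler this would be called as `parseVerifiedIdentity(req.headers)`. The parser trusts the headers as-is, which is safe only because the gateway strips client-supplied copies before the request reaches the app.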

Managing grants after scaffold

Use auth-portal — or forge's grant tools (forge_grant_role / forge_revoke_role / forge_list_grants) — to:

View grants — see who has access to each app and their role (Access / Admin)
Create grants — grant an individual or group access to an app
Revoke grants — remove access
Request access — employees can request access; admins approve (auth-portal only)

Who can manage grants. Grants are authored by app admins (anyone holding the Admin role on the app in the target environment) or platform admins — the same rule in auth-portal and the forge grant tools. Team membership alone grants access to the app, not grant-authoring: your team is seeded Admin in preview (self-service — manage your own preview grants) but Access-only in staging/production, so creating or revoking staging/production grants requires an app admin or a platform admin. Viewing an app's grant list has the same bar as authoring. If a grant call is denied, you don't hold Admin on the app in that environment — ask an app admin or a platform admin, or file an access request in auth-portal.

Grants are per-environment: a grant in staging doesn't automatically exist in production. The team-group grant is created in every env at scaffold time, so your team has access everywhere from day one; initial_grants are seeded at the same time, in each env the map names. Additional grants created via auth-portal are per-env (a grant added in staging must be separately added in prod if needed).

Authenticating an MCP server

If you're building an MCP server (a tool surface that Claude, an IDE, or an agent connects to), declare one field and the platform handles auth. You write zero auth code — no OAuth flow, no JWT verification, no token store. Same contract as a web app, different front door.

The web front door (auth / authz) is browser-shaped: oauth2-proxy sets a session cookie and bounces the user through a Google login redirect. An MCP client is a programmatic HTTP client with no cookie jar and nowhere to follow a redirect. So MCP servers authenticate over the OAuth 2.1 bearer path instead, per the platform's MCP auth model. The mechanics differ, but what reaches your handler is identical: the same x-verified-* headers the cookie path injects.

What to declare

Set auth.kind: bearer on the MCP service. That's the entire dev-facing surface.

forge_version: 3.0.0
app_name: google-sheets-mcp
team: ai
language: typescript
dns:
  zone: conservice.ai
  required: true
services:
  - name: mcp
    port: 8080
    expose: internal          # the bearer gateway is the public entry — the service itself stays off the public edge
    health_path: /health
    auth:
      kind: bearer            # OAuth 2.1 bearer path — no auth code in your app
authz:
  initial_grants:
    prev:
      access: [alice@conservice.com]

kind: bearer flags the service as one whose callers can't follow OIDC redirects. Forge routes it through the dedicated bearer gateway instead of the cookie chain. Everything else in your forge.yaml (resources, DNS, monitors) works the same as any other app.

What forge emits

Declaring auth.kind: bearer drives the scaffold to wire the full OAuth 2.1 resource-server contract on your behalf:

An RFC 9728 discovery document at /.well-known/oauth-protected-resource, the file a compliant MCP client fetches to find the authorization server.
A route through the dedicated bearer gateway to avp-validator Path B (JWT verification), kept off the human-web cookie chain. The gateway strips any client-supplied x-auth-request-* and x-verified-* headers so a caller can't spoof an identity past the validator.
A per-app Cedar authorization policy in AWS Verified Permissions (AVP), the same Application::"{app}" resource model the web authz block uses. Your team is seeded automatically at scaffold time, exactly like a web app (Admin in preview, Access in staging and production).

How a client authenticates

You don't implement any of this — it's what happens at the gateway when an MCP client connects. Knowing the shape helps when you read a 401 or wonder where the identity header comes from:

1. The client calls your server with no token. avp-validator returns 401 with WWW-Authenticate: Bearer resource_metadata="https://<your-host>/.well-known/oauth-protected-resource".
2. The client fetches that discovery document and finds the central platform authorization server.
3. The client does Dynamic Client Registration plus a PKCE authorization-code flow against the platform AS. It receives a token scoped to your server's audience, carrying the caller's Google-group claims.
4. The client retries with Authorization: Bearer <token>.
5. avp-validator verifies the JWT signature and audience, evaluates your app's Cedar policy, and on success injects the identity headers before the request reaches your handler. On failure it returns 401 with the WWW-Authenticate challenge again.
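When debugging a 401 from the first step above, it can help to pull the discovery URL out of the challenge header by hand. The helper below is a hypothetical sketch (compliant MCP clients already do this internally); the function name is an assumption and the parsing covers only the quoted `resource_metadata` parameter shown above.

```typescript
// Extract the resource_metadata URL from a Bearer challenge, or null if absent.
function extractResourceMetadata(wwwAuthenticate: string): string | null {
  // Only Bearer challenges carry the parameter we care about.
  if (!/^\s*Bearer\b/i.test(wwwAuthenticate)) return null;
  const match = /resource_metadata="([^"]+)"/i.exec(wwwAuthenticate);
  return match?.[1] ?? null;
}
```

The returned URL is the RFC 9728 document the client fetches next to locate the platform authorization server.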

Tokens are deliberately short-lived: the access token expires after 15 minutes, and the client holds a 30-day refresh token to renew it silently — a compliant MCP client refreshes before expiry, so your app never handles token renewal.

Caveat — brand-new bearer MCP previews and token exchange

A brand-new auth.kind: bearer service exercised in a per-PR preview could fail step 3 (registration and token exchange) with invalid_target ("resource does not name a registered MCP server"). The preview ingress raised the bearer wall from your PR branch, but the authorization server's registration was emitted only from main, so until your auth.kind: bearer declaration merged, the AS refused to mint tokens for the preview hostname. A platform fix that makes the two emits symmetric has landed; previews hydrated before it deployed may need a fresh hydrate (push a commit to the PR) to pick it up. Steady state (stg/prod, and previews of services whose bearer declaration is already on main) was never affected. If you hit invalid_target on a new bearer MCP preview, it's this gap, not your client.

What your handler reads

On an authorized request, your handler reads the same x-verified-* headers documented in Reading user identity and roles in your app — the bearer path and the cookie path converge on the identical header contract:

Header	Value
`x-verified-email`	Authenticated caller's email
`x-verified-groups`	Comma-separated Google-group memberships
`x-verified-roles`	Comma-separated roles the caller holds on this app

(This is the subset most MCP handlers need — the full header set, including x-verified-name, is in the linked canonical table.)

Read the header. Write tool logic. Don't verify the JWT yourself — the platform already did, and re-verifying in-app re-introduces exactly the per-app auth code this path eliminates.

What you must NOT do

Don't put an MCP server behind auth: "none". That serves it off a private ALB with no authentication. An unauthenticated tool surface is exactly the failure mode the platform's MCP auth model exists to close.
Don't try to use the cookie front door for MCP. A programmatic client can't complete the Google login redirect — it gets a 302 it can't follow. auth.kind: bearer is the only supported MCP auth path.
Don't verify the bearer token in your app. avp-validator does it at the gateway. Read x-verified-* and trust it.
Don't hold MCP session state in a per-pod in-memory map. It split-brains at replicas > 1, which forces the service down to a single replica. Use a shared store or a stateless transport.

forge's own MCP is different

forge's own /mcp runs a standalone authorization server as platform infrastructure — that's not the pattern here. A dev scaffolding a new workload MCP server (google-chat-mcp, google-sheets-mcp, and the like) uses auth.kind: bearer and gets the central platform AS described above. You never stand up your own authorization server.

Machine / agent callers

The flow above is the human-in-the-loop path: a person's Claude or IDE connecting on their behalf, resolving to their Google identity. Laptop agents (dev Claude Code, agent cockpits) are human-attended and use this same path — they authenticate as the dev, with no machine credentials. A pure machine caller — an EKS workload calling an MCP with no human in the loop — does not use client credentials or any stored key: the pod proves its identity with its existing AWS Pod Identity (no stored credential), and the platform mints a short-lived bearer scoped to the target MCP's audience, authorized as a machine principal for the pod's team. Same-team machine access is seeded automatically at scaffold; there is no forge.yaml field to set — cross-team machine access needs a platform-team grant.

GitHub access (`github`)

Your app can read (and, in one lane, write) GitHub at runtime without holding any GitHub credential. You declare the capability in forge.yaml; at runtime your pod mints a short-lived, narrowly-scoped GitHub App installation token from forge — authenticated purely by its Pod Identity AWS role. There is no PAT, no App private key, and nothing in a secret to rotate.

The declaration is a top-level github field; this section is the how-to for using the access once you've declared it — the mint call, what comes back, and how to consume it.

Declaring access

The github block is a top-level, opt-in field. Every sub-field is optional; omit the block entirely for no GitHub access.

github:
  read: true          # READ your own team's forge-created repos
  # write: [issues]   # ONE write lane (see below) — exactly one element
  # projects: read    # org GitHub Projects — read is self-serve; write is SRE-gated

Field	Type	Notes
`read`	boolean	`true` grants a read capability over your own team's forge-created repos. Breadth + permission scope are server-side and platform-managed — you can't name a target.
`write`	array (exactly 1)	`[code]` or `[issues]` — exactly one lane per app (`.min(1).max(1)`; `write: []` is a loud parse error, omit the field to grant nothing). `code` = contents + pull-requests, `issues` = issues only. Targets/breadth/permissions are server-derived from the verified caller — never from this file.
`projects`	enum	`read` = org-wide Projects-V2 read (self-serve; read can't mutate a board). `write` = org-wide board write (records intent only; activates only on an explicit SRE grant). This is a sibling of the `write` lanes above, not one of them.

write includes read for its own repos — GitHub permissions are leveled, so a write token can also read. metadata:read is always included. To grant no write, omit write (an empty write: [] is rejected).

Using a read token

Declaring github: { read: true } provisions a mint endpoint your pod calls at runtime. Your pod authenticates with a presigned sts:GetCallerIdentity request (the standard AWS SigV4 mechanism its Pod-Identity role already gives it) — forge replays it, reads the caller ARN from the STS response, maps it to your team, and mints a token clamped to your team's repos. You never send a secret and you never name a repo.

# From inside your pod. The mint base is https://forge.conservice.ai
# (internal mesh — not on the public edge). The body carries a presigned
# GetCallerIdentity blob (SigV4). IMPORTANT: the presign must include AND
# SIGN the header `x-forge-mint-serverid: forge-github-read` — a default
# STS presign does not, and without it the mint returns 401. Full envelope
# recipe + copy-paste signers: ask the platform team for the forge mint-APIs guide.
curl -sS -X POST "https://forge.conservice.ai/ci/mint-github-read-token" \
  -H 'content-type: application/json' \
  -d "$PRESIGNED_STS_IDENTITY_JSON"

The response is the token and its expiry:

{ "token": "ghs_xxxxxxxxxxxxxxxxxxxx", "expires_at": "2026-07-02T18:30:00Z" }

Consume it like any GitHub installation token — set it as the bearer for the REST API, or as the credential for a clone:

# Enumerate exactly what this token can reach:
curl -sS -H "Authorization: token $TOKEN" \
  https://api.github.com/installation/repositories

# Clone one of them:
git clone "https://x-access-token:$TOKEN@github.com/conservice-ai/<your-team-repo>.git"

import { Octokit } from "@octokit/rest";
const octokit = new Octokit({ auth: token });          // token from the mint call
const { data } = await octokit.request("GET /installation/repositories");
// data.repositories == the repos this token can read (your team's forge repos)

GET /installation/repositories is the authoritative "what can I reach" call — the token is clamped server-side, so this is how you discover its actual reach rather than guessing. The read token carries contents, metadata, pull_requests, checks, and statuses — all read-only — which covers cloning and PR/CI-status polling (e.g. reading a PR or its check-runs).

Using a write token

Same authentication as the read token (a presigned sts:GetCallerIdentity request signed by your Pod-Identity role), a different endpoint:

curl -sS -X POST "https://forge.conservice.ai/ci/mint-github-write-token" \
  -H 'content-type: application/json' \
  -d "$PRESIGNED_STS_IDENTITY_JSON"

The response shape is identical — { token, expires_at } — and the token is scoped to the single lane you declared, clamped to your own team's repos:

issues — mints issues:write only. Open, comment on, and label issues in your team's repos. Live and self-serve today.
code — mints contents:write + pull_requests:write. Push branches and open PRs; merge stays human-gated (branch protection on the target repos — the platform never auto-merges app-token PRs). Live and self-serve — the platform GitHub App's permission ceiling now includes pull_requests:write, closing the earlier mint-422 gap.

The request body may also carry an optional lane field ("code" | "issues" | "projects_read"). Omitted, it defaults to your single declared lane; if your app holds more than one lane (e.g. write: [issues] plus projects: read), the field is required — omitting it returns a 400 ambiguous — name one via the "lane" field. Note that projects: read tokens mint from this same write endpoint, with lane: "projects_read".
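The lane rule can be mirrored client-side so an ambiguous request fails before it leaves the pod. This is a sketch, not the server's logic: the function name is hypothetical, and it returns only the `lane` fragment, which you would merge into the request body alongside the presigned identity blob.

```typescript
type Lane = "code" | "issues" | "projects_read";

// One held lane: the field may be omitted. More than one: it must be named.
function laneField(held: Lane[], wanted?: Lane): { lane?: Lane } {
  if (wanted !== undefined) {
    if (!held.includes(wanted)) throw new Error(`lane "${wanted}" is not declared`);
    return { lane: wanted };
  }
  if (held.length > 1) throw new Error('ambiguous: name one via the "lane" field');
  return {};
}
```

For an app declaring `write: [issues]` plus `projects: read`, `laneField(["issues", "projects_read"], "projects_read")` yields `{ lane: "projects_read" }`.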

Consume the token exactly like the read token (Authorization: token <minted>, or as the git credential). Example — open an issue with octokit:

import { Octokit } from "@octokit/rest";
const octokit = new Octokit({ auth: token });          // token from mint-github-write-token
await octokit.rest.issues.create({
  owner: "conservice-ai",
  repo: "<your-team-repo>",
  title: "Automated report",
  body: "Filed by the app at runtime.",
});

Scope, lanes, and environments

What read reaches: your own team's forge-created repos — not just the calling app's repo, and not fleet-wide. The clamp is derived server-side from your pod-role ARN → team; you cannot name a target repo, and the restricted infra-* control-plane repos are always excluded. (Fleet-wide read is a separate, SRE-gated cross-fleet tier granted case-by-case — not something a forge.yaml field turns on.)

What write reaches: your own team's repos, team-scoped, for the one declared lane. Merge stays human-gated — branch protection on the target repos requires human review, and the platform never auto-merges app-token PRs.

Lane status at a glance:

Lane	Self-serve	Status
`read`	✅	Live
`write: [issues]`	✅	Live
`write: [code]`	✅	Live
`projects: read`	✅	Live (org-wide Projects read; can't mutate a board)
`projects: write`	❌	SRE-gated

Environments: both mint endpoints (read and write) admit preview (prev), staging (stg), and production (prod) pods. (Preview access is a deliberate, ratified tradeoff: a preview pod runs pre-merge code and can mint an own-team-scoped token, so review what your PR code does with it like any other change.) A pod whose role ARN doesn't resolve to a known platform env is refused fail-closed.

TTL: minted tokens are short-lived GitHub installation tokens (~1 hour). Mint on demand, cache within the returned expires_at, and re-mint when it nears expiry — never persist a minted token.
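One way to follow the mint-cache-re-mint guidance is an in-memory cache keyed on `expires_at`. The sketch below is illustrative: `mint` is a hypothetical caller-supplied function standing in for the presigned POST shown earlier, and the 5-minute renewal margin is an assumption, not a platform value.

```typescript
interface MintedToken {
  token: string;
  expires_at: string; // ISO 8601, as returned by the mint endpoint
}

// True while the token has more than skewMs left before expires_at.
function isFresh(t: MintedToken, nowMs: number, skewMs = 5 * 60 * 1000): boolean {
  return Date.parse(t.expires_at) - nowMs > skewMs;
}

// Wraps a mint function with an in-memory cache; nothing is persisted.
function createTokenCache(
  mint: () => Promise<MintedToken>,
  now: () => number = Date.now,
): () => Promise<string> {
  let cached: MintedToken | null = null;
  return async function getToken(): Promise<string> {
    // Reuse the cached token while fresh; otherwise mint a new one.
    const current = cached !== null && isFresh(cached, now()) ? cached : await mint();
    cached = current;
    return current.token;
  };
}
```

Because the cache lives in process memory, each pod mints its own token, which stays consistent with the rule against persisting minted tokens and keeps mint calls well under a per-request rate.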

Common errors

Response	Meaning	Fix
`403 — github read-mint is available only to preview, staging, and production workloads` (or the write analog)	Your pod's role ARN didn't resolve to a known platform env (`prev`/`stg`/`prod`) — the env gate is fail-closed on unrecognized identities.	Mint from a platform pod running under its standard Pod-Identity role.
`401 — server-ID header missing or invalid`	Your presigned STS blob didn't include and sign the `x-forge-mint-serverid: forge-github-read` header — a default STS presign omits it.	Presign with the header in `SignedHeaders` — the platform team's mint-APIs guide has the full envelope recipe and copy-paste signers.
`403 — this app has not declared github: { read: true }` / `has not declared a github: { write: ... } profile`	The capability isn't declared in your `forge.yaml`, or the change hasn't been re-registered.	Add the `github` block and re-scaffold / re-register the deployment so the capability lands.
`403 — no repos resolved for caller`	The server resolved zero repos for your team — e.g. your team has no forge-created repos yet, so there's nothing to scope to (empty clamps are refused, never widened).	Ensure your team owns at least one forge-created repo before minting.
`400 — ambiguous — app holds multiple lanes`	You hold more than one lane (e.g. `write: [issues]` + `projects: read`) and omitted the `lane` body field.	Name the lane explicitly in the request body.
`422` on the code lane	Should no longer occur — the platform GitHub App's permission ceiling includes `pull_requests:write`. Seeing it means the ceiling regressed below the lane's requested permissions.	Escalate to the platform team.
`403 — the 'projects_write' lane is SRE-gated`	`projects: write` is not self-serve.	Request an explicit SRE grant.
`429`	Rate-limited — mints are budgeted per app/team.	Cache the token and re-mint near `expires_at`, not per-request.

Scheduled services (`schedule`)

Any service can run on a schedule instead of as an always-on Deployment: declare services[].schedule and forge renders that service as a Kubernetes CronJob. The container's CMD is the job — it runs to completion and exits 0 on success (nonzero = the run failed). Everything else about the service is normal: its own Dockerfile and ECR repo, the same env vars and secrets (envFrom the same ConfigMap/Secret), and the same pod IAM role — a scheduled service reaches the app's S3 buckets, queues, and database exactly like its always-on siblings.

services:
  - name: nightly-sync
    schedule:
      cron: "0 2 * * *"              # 02:00 daily, evaluated in `timezone`
      timezone: America/Denver       # REQUIRED — no default, wall-clock intent is always explicit
      # concurrency: forbid          # default — skip the tick if the previous run is still going
      # active_deadline_seconds: 1800  # default — kill a hung run after 30 min

app_kind: cron does not do this. That top-level field only classifies which Datadog monitor pack the app gets — it schedules nothing. services[].schedule is what creates the CronJob.

Fields

Field	Type	Required	Notes
`cron`	string	yes	Standard 5-field cron expression (`minute hour day-of-month month day-of-week`, e.g. `0 2 * * *`) or one of `@hourly` / `@daily` / `@weekly` / `@monthly`. Lists (`1,15`), ranges (`1-5`), and range steps (`*/15`, `0-59/15`) are supported. Rejected by design: seconds/Quartz forms (`? L W #`), `@every`, `@yearly`/`@reboot`, bare value steps (`5/2`; write `5-59/2`), and day-of-week `7` (use `0` for Sunday).
`timezone`	string	yes	`UTC` or a Region/City IANA name (`America/Denver`). No default — the intended wall-clock time is always explicit. Raw offsets (`+05:00`), `Local`, and abbreviations (`EST`) are rejected. Zones with daylight-saving time shift the job's UTC firing time twice a year — use `UTC` for a DST-free schedule.
`concurrency`	enum	no	What happens when the previous run is still active at the next tick: `forbid` (default) skips the new run, `allow` runs them concurrently, `replace` cancels the running job and starts fresh.
`active_deadline_seconds`	integer	no	Wall-clock kill switch for a hung run. Default `1800` (30 min); range 60–86400. A run exceeding it is terminated and counted as failed. Size it to the job's worst-case healthy runtime plus headroom.
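The `cron` and `timezone` rules in the table can be approximated locally. The sketch below is not forge's parser: function names are hypothetical, it accepts numeric fields only (named months or weekdays are not covered by this document, so they are rejected here), and the timezone check leans on the runtime's `Intl` time-zone database.

```typescript
const MACROS = new Set(["@hourly", "@daily", "@weekly", "@monthly"]);
// [min, max] per field: minute, hour, day-of-month, month, day-of-week (0 = Sunday; 7 rejected).
const BOUNDS: ReadonlyArray<[number, number]> = [[0, 59], [0, 23], [1, 31], [1, 12], [0, 6]];

function inBounds(n: string, lo: number, hi: number): boolean {
  const v = Number(n);
  return v >= lo && v <= hi;
}

// One list item: "*", "*/n", "a", "a-b", or "a-b/n". A bare "a/n" step is rejected.
function isValidItem(item: string, lo: number, hi: number): boolean {
  const m = /^(?:(\*)|(\d+)(?:-(\d+))?)(?:\/(\d+))?$/.exec(item);
  if (m === null) return false;
  const [, star, start, end, step] = m;
  if (step !== undefined && Number(step) < 1) return false;
  if (star !== undefined) return true;
  if (start === undefined || !inBounds(start, lo, hi)) return false;
  if (end === undefined) return step === undefined; // bare value: no step allowed
  return inBounds(end, lo, hi) && Number(start) <= Number(end);
}

function isValidCron(expr: string): boolean {
  if (MACROS.has(expr)) return true;
  const fields = expr.trim().split(/\s+/);
  if (fields.length !== 5) return false; // also rejects 6-field seconds forms
  return fields.every((field, i) => {
    const bounds = BOUNDS[i];
    if (bounds === undefined) return false;
    return field.split(",").every((item) => isValidItem(item, bounds[0], bounds[1]));
  });
}

function isValidTimezone(tz: string): boolean {
  if (tz === "UTC") return true;
  // Require a Region/City shape first: rejects offsets, "Local", and abbreviations like "EST".
  if (!/^[A-Za-z_]+(?:\/[A-Za-z_-]+)+$/.test(tz)) return false;
  try {
    new Intl.DateTimeFormat("en-US", { timeZone: tz }); // throws RangeError on unknown zones
    return true;
  } catch {
    return false;
  }
}
```

A pass here is a hint, not a guarantee; forge's parse step remains authoritative.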

What a scheduled service cannot declare

A scheduled service has no long-running or routed shape — each of these is a loud parse error on it: port, expose, health_path, dns, scaling, replicas.

When it fires (and when it deliberately doesn't)

Per environment, the schedule is OFF until that env has a promoted image AND a nonzero top-level replicas.{env} count — the same lever that activates Deployments on first promotion. Set replicas.{env}: 0 to pause an env's scheduled runs without touching the config.
Per-PR preview environments never fire scheduled runs — the CronJob ships suspended there, by design. Test the job logic by running the container directly; the schedule only ticks in stg/prod.

Common errors

Symptom	Meaning	Fix
Parse error on `schedule.cron`	The expression uses a rejected form (seconds field, Quartz token, `@every`, bare `5/2` step, day-of-week `7`).	Rewrite in the standard 5-field grammar (`5-59/2`, Sunday = `0`) or use an `@` macro.
Parse error on `schedule.timezone`	Offset/abbreviation/`Local` given, or the field omitted.	Use `UTC` or a Region/City IANA name. The field is required.
Job never fires in an env	That env has no promoted image yet, or `replicas.{env}` is `0`.	Promote once and set a nonzero `replicas.{env}`.
Job never fires in a preview	Previews ship the CronJob suspended — always.	Expected; run the container manually to test job logic.
Runs killed at 30 minutes	Default `active_deadline_seconds` (1800) hit.	Raise it (up to 86400) to worst-case healthy runtime + headroom.
A tick was skipped	Previous run still active and `concurrency: forbid` (default).	Expected under the default; use `allow`/`replace` if overlap is safe/desired.

Environment opt-out (`disabled_envs`)

Apps deploy to all platform envs by default (prev, stg, prod). Use disabled_envs: to opt an app out of one or more — useful for tooling-only apps that don't need a preview env, or single-env utilities.

disabled_envs: [prev]

Field	Type	Notes
`disabled_envs`	array of enums	Each entry must be `prev`, `stg`, or `prod`. Omit the field entirely (or set `[]`) to deploy to all envs.

Per-env resource overrides (different SQS retention in prod vs stg, etc.) aren't supported in this schema today. Tune at the resource level instead.

environments: is rejected at parse — remove the block; use disabled_envs: instead. Earlier schema versions accepted a top-level environments: { prev: {...}, stg: {...}, prod: {...} } block carrying per-env account_id. The current schema is strict and has no environments key, so a file that still carries the block fails validation outright. Account IDs are platform constants the renderer supplies, and the enabled-env list is derived from the platform default minus disabled_envs: — there was never anything dev-meaningful in the old block. Delete it and express env opt-out with disabled_envs:.

Preview

Per-PR ephemeral environment opt-in.

preview:
  enabled: true

Field	Type	Notes
`enabled`	bool	Default: false.
`stale_days`	int	Positive integer. Days of PR inactivity before the scaffolded stale-PR workflow applies the `stale` label. Default: 60 when omitted. To turn the workflow off entirely, omit `preview` or set `preview.enabled: false`.
`close_grace_days`	int	Positive integer. Days after the `stale` label is applied before the PR is auto-closed (which tears down the preview env via the PR-closed trigger). Default: 7 when omitted — i.e. close at 67 days inactive.

When true, every PR gets a preview env at pr-{N}-{app}.prev.conservice.ai (VPN-only). When dns.hostname is set, the leading label honors the override (zone-stripped) — e.g. dns.hostname: demo.conservice.ai → previews at pr-{N}-demo.prev.conservice.ai. Per-PR ephemeral resources (DBs, SQS queues, S3 buckets keyed pr-{N}-{name}) are provisioned automatically per PR. The pod IAM role gains anchored-wildcard ARNs for pr-* resources.
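The hostname rule can be sketched as a small function. The names are hypothetical, and the default zone of `conservice.ai` is an assumption taken from the example above; multi-label overrides are not covered by this document and are passed through unchanged.

```typescript
// Leading label: the app name, or dns.hostname with the zone suffix stripped.
function previewHostname(
  pr: number,
  appName: string,
  hostnameOverride?: string,
  zone = "conservice.ai", // assumed zone; pass your dns.zone if it differs
): string {
  let label = appName;
  if (hostnameOverride !== undefined) {
    const suffix = `.${zone}`;
    label = hostnameOverride.endsWith(suffix)
      ? hostnameOverride.slice(0, -suffix.length)
      : hostnameOverride;
  }
  return `pr-${pr}-${label}.prev.conservice.ai`;
}
```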

An abandoned PR holds its preview environment indefinitely, so forge scaffolds a stale-PR workflow: after stale_days of inactivity it labels the PR stale, and after a further close_grace_days it auto-closes the PR — which fires the per-PR teardown for free. The two values default to 60 and 7 (stale at 60 days, close at 67).
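The two thresholds compose by simple addition, which the sketch below works through. The function name is hypothetical, and it assumes the workflow measures whole days from the last PR activity; the scaffolded workflow's actual evaluation cadence may shift the real dates slightly.

```typescript
const DAY_MS = 24 * 60 * 60 * 1000;

// Returns when a PR would be labeled stale and when it would be auto-closed.
function previewLifecycle(
  lastActivity: Date,
  staleDays = 60,
  closeGraceDays = 7,
): { staleAt: Date; closeAt: Date } {
  const staleAt = new Date(lastActivity.getTime() + staleDays * DAY_MS);
  const closeAt = new Date(staleAt.getTime() + closeGraceDays * DAY_MS);
  return { staleAt, closeAt };
}
```

With the defaults, a PR last touched on 2026-01-01 is labeled stale on 2026-03-02 and closed on 2026-03-09, which is 67 days after the last activity.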

Image tags

Per-env image tags managed by Kargo after each promotion. You don't set these manually — Kargo's argocd-update step writes the promoted commit SHA into forge.yaml after each successful promotion.

image_tags:
  stg: bd55eef2ffe08f1c7ad67b7ac14f1b2d69e1fc9a
  prod: PROMOTION_PENDING-prod

Key	Description
`stg`	Image tag for staging. Written by Kargo after stg promotion.
`prod`	Image tag for production. Written by Kargo after prod promotion.

Before the first Kargo promotion, the value is PROMOTION_PENDING-{env} — a sentinel that renders an obviously-unhealthy state rather than silently pulling latest.

Replicas

Per-env fixed replica counts for the Deployment. This sets a static pod count per environment — it is not autoscaling. For services that should scale with load, use the per-service scaling block instead; an autoscaled service is excluded from this block entirely (its HPA owns the count).

Valid keys are stg and prod only — prev is not accepted (the block is rejected at validation if you add other keys). A new app scaffolds with { stg: 0, prod: 0 }; Kargo bumps the value 0 → 2 on the first promotion to each env, so the app stays provisioned-but-idle until a real image is promoted.

replicas:
  stg: "2"
  prod: 0

Values are integers or quoted strings (YAML scalars — both 2 and "2" work). 0 means the env is provisioned but not running pods (e.g., prod before the first promotion). The platform renderer emits these into the kustomize overlay's Deployment patch.
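Since both `2` and `"2"` are accepted, tooling that reads this block needs to normalize the values. A minimal sketch follows; the function name and error messages are hypothetical, and it enforces only the rules stated above (keys `stg` / `prod`, non-negative integers).

```typescript
// Normalizes replicas values to numbers and rejects keys other than stg/prod.
function normalizeReplicas(replicas: Record<string, number | string>): Record<string, number> {
  const out: Record<string, number> = {};
  for (const [env, raw] of Object.entries(replicas)) {
    if (env !== "stg" && env !== "prod") {
      throw new Error(`replicas: unknown key "${env}"`);
    }
    // Quoted strings must be plain digits; anything else becomes NaN and fails below.
    const n = typeof raw === "number" ? raw : /^\d+$/.test(raw) ? Number(raw) : NaN;
    if (!Number.isInteger(n) || n < 0) {
      throw new Error(`replicas.${env}: expected a non-negative integer`);
    }
    out[env] = n;
  }
  return out;
}
```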

Render channel (`render_channel`)

Selects which renderer pin channel the app's CI workflows track. Supersedes the deprecated canary boolean.

render_channel: canary

Value	Notes
`general`	Default. The app's CI tracks the post-bake-promoted (stable) renderer version. New renderer releases reach `general` apps only after the bake window.
`canary`	The app's CI tracks the advance-on-every-publish renderer version. Reserved for SRE-owned canary apps (`forge-canary-*`); general apps should omit the field or set `general` explicitly. Useful for catching template-rendering regressions before fleet-wide rollout.

Omit for general apps — the default is general. The deprecated canary: true boolean is still accepted as a backwards-compatible alias for render_channel: canary; when both are set, render_channel wins.
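The precedence between the two fields can be written out as a short sketch. The function name is hypothetical; the behavior follows the rule stated above.

```typescript
type RenderChannel = "general" | "canary";

function resolveRenderChannel(cfg: {
  render_channel?: RenderChannel;
  canary?: boolean;
}): RenderChannel {
  // The explicit field wins whenever it is present.
  if (cfg.render_channel !== undefined) return cfg.render_channel;
  // Otherwise the deprecated boolean acts as an alias for "canary".
  return cfg.canary === true ? "canary" : "general";
}
```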

Canary (deprecated)

Deprecated

Use render_channel: canary instead. The canary boolean is retained as a backwards-compatible alias only; new apps and edits should use render_channel. When both fields are set, render_channel wins.

canary: true   # equivalent to: render_channel: canary

When true, the app's CI workflows track the canary renderer channel instead of the general channel. Same semantics as render_channel: canary — kept solely so existing forge.yaml files don't have to be rewritten in lockstep with the schema change.

Monitoring

Three related top-level fields configure the app's Datadog monitor pack. app_kind and slo_tier gate which monitors emit from the default catalog; monitors configures where they route and page. All three are optional at the schema level — an app that targets stg/prod without a monitors: block gets a non-blocking preflight warning rather than a hard failure.

App kind (`app_kind`)

app_kind: web-service

Value	Notes
`web-service`	HTTP-serving app — gets 5xx, latency, and request-rate monitors.
`worker`	Background processor — gets queue-lag and throughput monitors.
`cron`	Scheduled job — gets run-success / missed-run monitors.
`batch`	Batch job — gets completion / duration monitors.

Optional. Drives the Datadog monitor catalog gate — the value selects which kind-specific monitors emit. A typo here means those monitors silently fail to emit, so set it deliberately when the app targets stg/prod.

SLO tier (`slo_tier`)

slo_tier: tier-2

Value	Notes
`tier-1`	Full catalog, including the P3 monitors.
`tier-2`	Default. Drops the most-sensitive P3s.
`tier-3`	P1/P2 only — the internal-tool profile.

Optional. This is a catalog-membership lever: it controls how much of the monitor catalog the app receives, not where monitors route. Apps that need paging behavior on a non-prod env declare that in monitors.routing, not here. When unset, the downstream module defaults to tier-2.

Monitors (`monitors`)

monitors:
  pagerduty_service: billing
  google_chat:
    name: billing-alerts
    space_id: AAAAabcdefg
    token: xyz123_token-value
  dev_channel: alice-debug

Field	Type	Required	Notes
`pagerduty_service`	string	no	Lowercase alphanumeric / underscore / hyphen, 1-64 chars (e.g. `billing`, `forge-runtime`). Routes prod monitors via `@pagerduty-${name}`. Optional — omit only with an explicit `routing.prod` override (e.g. Datadog-event-only routing).
`pagerduty_create_new`	boolean	no	When `true`, Forge auto-PRs a new PagerDuty service; when `false` or omitted, it reuses an existing one. Requires `pagerduty_escalation_policy` when `true`.
`pagerduty_escalation_policy`	string	no	Escalation-policy name, 1-64 chars. Required when `pagerduty_create_new: true` — the auto-PR references the policy to wire the new service. Forge does not create escalation policies; they stay team-owned in the PagerDuty UI.
`google_chat`	object	no	Google Chat webhook config: `name` + `space_id` + `token`. See the shape below. Optional — omit only with an explicit `routing.stg` override (e.g. `routing: { stg: ["datadog-event"] }`).
`dev_channel`	string	no	Kebab-case webhook name that opts the prev env into monitoring. When set, prev monitors route to `@webhook-${dev_channel}`. When omitted, prev emits zero monitors (prev is silent by default).
`routing`	object	no	Per-env routing override. A map keyed by `prev` / `stg` / `prod`; each value is a non-empty array of `@`-handles (e.g. `@pagerduty-billing`). When set for an env, it replaces the default routing for that env (prod → PagerDuty, stg → Google Chat, prev → `dev_channel` or silent). Envs not listed fall back to the default. This is the only surface for multi-handle destinations.
`extra`	array	no	Team-supplied extra monitors, unioned with the default catalog at emit time. See the entry shape below.
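
The conditional requirements in the table can be sketched in one fragment: a new PagerDuty service with its escalation policy, plus a multi-handle prod override. The escalation-policy name is hypothetical, and the handles are built from the `@pagerduty-${name}` and `@webhook-${name}` forms described above.

```yaml
monitors:
  pagerduty_service: billing
  pagerduty_create_new: true
  # Required because pagerduty_create_new is true. The policy must already
  # exist in PagerDuty; Forge does not create it. Hypothetical name.
  pagerduty_escalation_policy: billing-oncall
  google_chat:
    name: billing-alerts
    space_id: AAAAabcdefg
    token: xyz123_token-value
  routing:
    # prod is listed, so this array replaces the default PagerDuty-only routing.
    prod:
      - "@pagerduty-billing"
      - "@webhook-billing-alerts"
    # stg and prev are not listed, so they keep the defaults:
    # stg → Google Chat, prev → silent (dev_channel is unset).
```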

The google_chat block:

Field	Type	Required	Notes
`name`	string	yes	Kebab-case webhook name. Datadog routes to it as `@webhook-${name}`.
`space_id`	string	yes	Google Chat space ID — alphanumeric / underscore / hyphen, 1-64 chars. Plaintext (useless without the org-level API key held in AWS Secrets Manager).
`token`	string	yes	Per-webhook token — alphanumeric / underscore / hyphen, 1-128 chars. Plaintext (defense-in-depth: the token + space pair is useless without the org-level key).

Each entry in extra:

Field	Type	Required	Notes
`name`	string	yes	Human-readable monitor name, 1-200 chars. Shown in the Datadog UI and alert previews.
`type`	enum	yes	A Datadog monitor type literal — `metric alert`, `query alert`, `log alert`, `service check`, `slo alert`, etc. Composite monitors are not permitted.
`query`	string	yes	Raw Datadog query string. Authors own the `env:` / `service:` scoping; the module emits the query verbatim.
`priority`	integer	yes	1-5 (1 = highest, page-worthy; 5 = lowest, FYI). Stamped as the native Datadog priority and duplicated into a `priority:P{n}` tag.
`category`	string	yes	Kebab-case category slug. Tagged as `category:${value}` for filter-view grouping. Reuse a default-catalog category name when the extra is a conceptual peer; otherwise pick an app-specific slug.
`thresholds`	object	yes	`{ critical?, warning?, critical_recovery? }` — numeric threshold values for the monitor's `monitor_thresholds {}` block.
`tags`	array	no	Additional tags appended to the module's common tag set (e.g. `subsystem:invoice-render`).
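
A minimal sketch of one extra entry, assuming a custom error-count metric. The metric name, category slug, and threshold values are hypothetical; only the field names and constraints come from the table above. The routing fields of the monitors block are left out for brevity.

```yaml
monitors:
  # pagerduty_service, google_chat, etc. omitted for brevity
  extra:
    - name: "Invoice render error count is high"
      type: query alert
      # Emitted verbatim, so the env: / service: scoping is written by hand.
      # billing.invoice_render.errors is a hypothetical metric name.
      query: "sum(last_5m):sum:billing.invoice_render.errors{env:prod,service:billing}.as_count() > 50"
      priority: 2                 # 1 = highest, 5 = lowest
      category: invoice-render    # tagged as category:invoice-render
      thresholds:
        critical: 50              # kept equal to the value in the query
        warning: 25
      tags:
        - "subsystem:invoice-render"
```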

Observability (`observability`)

Optional per-app observability tuning. Dev-owned; not machine-written. The block is strict-keyed (unknown keys are rejected at parse).

observability:
  trace_sample_rate_prod: 0.1

Field	Type	Notes
`trace_sample_rate_prod`	number (0.0–1.0)	Datadog APM trace sample rate for the production overlay only. Optional opt-in cost control; omit to use the platform / Datadog default. `prev` and `stg` always sample at 1.0 (full-fidelity traces in lower envs). Sets `DD_TRACE_SAMPLE_RATE` on the prod overlay.

Use this when prod trace volume is creating non-trivial Datadog cost and the app is willing to give up per-request trace fidelity for sampled aggregates. Lower envs are unaffected — debugging-time visibility stays full-fidelity.

Reserved naming

Reserved key prefixes (resource map keys)

pr- is reserved for per-PR ephemeral resources. A static bucket keyed pr-foo would produce {prefix}-pr-foo, colliding with the per-PR wildcard {prefix}-pr-*-*. The schema therefore rejects every pr-* key in s3, sqs, dynamodb, and the other resource maps.
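
As an illustration only, the fragment below contrasts a rejected key with an accepted one. The empty `{}` bodies are placeholders, not a claim about which fields an S3 entry requires; see the S3 section for the real shape.

```yaml
resources:
  s3:
    # pr-uploads: {}   # would be rejected: the pr- prefix is reserved
    uploads: {}        # accepted: no reserved prefix
```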

Reserved env-var name prefixes

These are platform-managed; do NOT include them in app_config_keys or shadow them in services[].env. They're injected automatically:

DATABASE_* — Aurora connection metadata
S3_BUCKET_* — bucket names
SQS_QUEUE_* — queue URLs
SNS_TOPIC_* — topic ARNs
EVENTBRIDGE_BUS_* — bus names
DYNAMODB_TABLE_* — table names
SFN_ARN_* — state machine ARNs
FIREHOSE_STREAM_* — firehose stream names
BEDROCK_* — Bedrock config
KMS_KEY_* — per-app CMK ID + ARN env vars

And the exact-match reserved names (whole name reserved, not a prefix):

AWS_REGION
TEMPORAL_ADDRESS
TEMPORAL_NAMESPACE

Reserved app-name prefixes (forge schema rejection list)

Forge rejects app names starting with these:

csvc- — legacy prefix; retained in the rejection list to prevent resurrection
conservice- — reserved for global-namespace S3 buckets only
aws- — group prefix
infra- — repo prefix for SRE-owned repos
forge- — forge platform itself (carve-out: forge-canary-* allowed for SRE canary fleet)
k8s- — Kubernetes-managed

app- is no longer a reserved app-name prefix. Dev-owned repos use bare names (rates-agent, not app-rates-agent); classification is via a GitHub repo custom property rather than the name. Note that the auth tier-group prefix app-{app}-admins@conservice.com is a separate concept (a Google-group identifier) and is unaffected.

Reserved exact app names (rejected regardless of prefix): forge, argocd, kargo, platform, admin, infrastructure, terraform, staff, workloads, gateway. (auth-service is deliberately not on the list — it is a live forge-managed app with that exact slug.)

The same list is reused to validate resources.kms.<key_name> — pick a logical name that describes the encryption purpose, not a platform prefix.

Naming patterns (resolved at scaffold)

Resource	Pattern	Example
Repo	`conservice-ai/{app}`	`conservice-ai/my-app` (bare names)
Namespace	`{app}`	`my-app`
ECR repo	`apps/{app}-{service}`	`apps/my-app-api`
Aurora DB name	`{app_underscored}`	`my_app`
Aurora service user	`{app_underscored}_svc`	`my_app_svc`
S3 bucket	`conservice-{env}-{app}-{key}`	`conservice-prod-my-app-history`
SQS queue	`{env}-use1-{app}-{key}-queue`	`prod-use1-my-app-jobs-queue`
SNS topic	`{env}-use1-{app}-{key}-topic`	`prod-use1-my-app-events-topic`
EventBridge bus	`{env}-use1-{app}-{key}`	`prod-use1-my-app-domain`
Step Function	`{env}-use1-{app}-{key}`	`prod-use1-my-app-flow`
DynamoDB table	`{env}-use1-{app}-{key}`	`prod-use1-my-app-sessions`
Firehose stream	`{env}-use1-{app}-{key}`	`prod-use1-my-app-events`
KMS key	`{env}-use1-{app}-{key}-key`	`prod-use1-auth-service-token-envelope-key`
KMS alias	`alias/{env}-use1-{app}-{key}-key`	`alias/prod-use1-auth-service-token-envelope-key`
Pod IAM role	`{env}-use1-{app}-pod-role`	`prod-use1-my-app-pod-role`
Secrets path	`{app}/{key}`	`my-app/api-token`
Temporal namespace	`{app}-{env}.<temporal-cloud-namespace>`	`my-app-prod.<temporal-cloud-namespace>`
Per-team Permission Set (Identity Center)	`team-{team}-{env_short}-{tier}`	`team-ai-prod-admin` — kebab-case. Drives team AWS Console access AND per-app DB IAM access (via `rds-db:connect` ARNs in the inline policy, enumerated from the SRE-managed team-to-app mapping).
PG login role (per-app, cluster-scoped)	`aws-{app}-db-{tier}`	`aws-rates-db-admin` — created by the platform when `database.{name}.team_grants` is declared. One role per (app, tier) at the Aurora cluster level; multiple teams sharing a tier share one login role. The `-db-` infix marks it as DB-scoped (the role gates ONLY PostgreSQL access — every other AWS service uses per-team Permission Sets).
PG tier role (per-app, NOLOGIN, holds object grants)	`{app}_admin` / `{app}_readonly`	`rates_admin`, `rates_readonly` — table/sequence + DEFAULT PRIVILEGES on each DB the app owns. The login role above inherits from these.
Authorization entity	`{app}`	`rates-agent` — registered in each env's authorization store at scaffold time. Grants managed via auth-portal.
Kargo project	`{app}`	`my-app` (literal — no env suffix)

App env var names are uniform: {TYPE}_{KEY}, where {KEY} is the resource's map key (the same {key} that appears in the AWS name), upper-cased with hyphens turned into underscores, as the examples here show. A bucket keyed history becomes env var S3_BUCKET_HISTORY, which in prod contains conservice-prod-my-app-history. A KMS key named token-envelope becomes env vars KMS_KEY_TOKEN_ENVELOPE_ID and KMS_KEY_TOKEN_ENVELOPE_ARN. Always read names from env vars; never construct them in app code.
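
The mapping from resource keys to env vars can be sketched as follows. The S3 and KMS names are the ones given in this section; SQS_QUEUE_JOBS is inferred from the SQS_QUEUE_* prefix and the {TYPE}_{KEY} rule. The empty `{}` bodies are placeholders, not valid resource definitions.

```yaml
resources:
  s3:
    history: {}          # env var: S3_BUCKET_HISTORY (prod value: conservice-prod-my-app-history)
  sqs:
    jobs: {}             # env var: SQS_QUEUE_JOBS (inferred name; holds the queue URL)
  kms:
    token-envelope: {}   # env vars: KMS_KEY_TOKEN_ENVELOPE_ID, KMS_KEY_TOKEN_ENVELOPE_ARN
```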

Schema versioning

Two distinct version lines apply to forge.yaml:

forge_version is the YAML schema-contract value users put in their file. Current: 3.0.0. It declares which schema contract the file expects. Forge carries N and N-1 renderers in parallel for a deprecation window; apps on the old version see a deprecation warning on forge_status until they migrate. Use forge_migrate_yaml to auto-upgrade your forge.yaml to the latest shape.
The URL major version (v3) is the schema's published-contract major. It versions the schema contract, not the implementation, so it is independent of the platform's internal implementation versions. The published JSON Schema artifact on this site is regenerated from the same validator the platform runs at parse time, so the file you point your editor at always matches what CI enforces.
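
A minimal sketch of the top of a forge.yaml pinned to the current contract. The forge_version value comes from this section. The schema URL is an assumption, pieced together from the v3 URL path in the versioning policy and the forge.yaml.schema.json filename, so confirm the real location before relying on it. The first line is the inline schema comment understood by the YAML language server, an alternative to the yaml.schemas editor setting.

```yaml
# yaml-language-server: $schema=https://schemas.conservice.ai/forge/v3/forge.yaml.schema.json
# (assumed URL; check the published location of forge.yaml.schema.json)
forge_version: 3.0.0   # schema contract this file expects
```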

Versioning policy for this doc

Each major schema version (v1, v2, v3, ...) gets its own URL path: schemas.conservice.ai/forge/v1/, schemas.conservice.ai/forge/v2/, schemas.conservice.ai/forge/v3/.
Old majors stay live forever. An app pinned to v1 keeps validating against the v1 schema even after v2 ships.
Within a major (v1.0 → v1.x), this doc updates in place. Additive changes (new fields, relaxed constraints) don't bump the major.
Breaking changes (renamed fields, removed fields, tightened constraints) bump the major.

Where to file requests

Need a new resource type (DocumentDB, OpenSearch, etc.)? File a platform request. Don't add raw resource blocks to your repo; the IaC guardrail rejects PRs that contain them.
Need an existing knob exposed (S3 lifecycle, DynamoDB autoscaling, etc.)? File a platform request as well. The platform team adds it to the renderer and to this schema doc.
Need to tune a per-env value that's currently uniform? File a request; per-env override support is a roadmap item.
Found a bug in this doc? Let the Conservice platform team know.

More information

JSON Schema for IDE auto-validation: forge.yaml.schema.json — the machine-readable contract. Auto-published on every schema change; drop it into .vscode/settings.json yaml.schemas for autocomplete + validation.
Need to read the validator source? The schema is generated from a runtime validator; the Conservice platform team can point you at it. The JSON Schema linked above is the same shape and is what your editor and CI validate against.
Questions or access requests — reach the Conservice platform team.
