From 3b68365347dba2eb1d808d2727fdb1315a16599e Mon Sep 17 00:00:00 2001 From: Winrey Date: Sun, 21 Jun 2026 06:33:01 +0800 Subject: [PATCH 1/3] docs: plan aws t9 staging environment --- .../plans/2026-06-21-aws-t9-staging.md | 172 ++++++++++++++++++ .../specs/2026-06-21-aws-t9-staging-design.md | 92 ++++++++++ scripts/verify-aws-staging-config.sh | 43 +++++ 3 files changed, 307 insertions(+) create mode 100644 docs/superpowers/plans/2026-06-21-aws-t9-staging.md create mode 100644 docs/superpowers/specs/2026-06-21-aws-t9-staging-design.md create mode 100755 scripts/verify-aws-staging-config.sh diff --git a/docs/superpowers/plans/2026-06-21-aws-t9-staging.md b/docs/superpowers/plans/2026-06-21-aws-t9-staging.md new file mode 100644 index 0000000..b7f7390 --- /dev/null +++ b/docs/superpowers/plans/2026-06-21-aws-t9-staging.md @@ -0,0 +1,172 @@ +# AWS t9 Staging Implementation Plan + +> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking. + +**Goal:** Add a first-class AWS t9 staging environment for the aHand hub. + +**Architecture:** Reuse the existing `ahand-hub` Terraform module with a new +`infra/envs/staging` stack, and extend the global hub deploy workflow and deploy +script to map the `staging` branch to `ahand-hub-staging`. + +**Tech Stack:** Terraform, AWS ECS/Fargate, SSM Parameter Store, GitHub Actions, +Bash. 
+ +--- + +### Task 1: Add A Staging Config Guard + +**Files:** +- Create: `scripts/verify-aws-staging-config.sh` + +- [ ] **Step 1: Write the failing guard script** + +Create `scripts/verify-aws-staging-config.sh` with checks for: + +- `infra/envs/staging/main.tf` +- module env validation containing `staging` +- `.github/workflows/deploy-hub.yml` triggering on `staging` +- deploy script accepting and mapping `staging` + +- [ ] **Step 2: Run the guard and confirm it fails** + +Run: `bash scripts/verify-aws-staging-config.sh` + +Expected before implementation: non-zero exit and a missing staging message. + +- [ ] **Step 3: Commit the guard** + +Run: + +```bash +git add scripts/verify-aws-staging-config.sh docs/superpowers/specs/2026-06-21-aws-t9-staging-design.md docs/superpowers/plans/2026-06-21-aws-t9-staging.md +git commit -m "docs: plan aws t9 staging environment" +``` + +### Task 2: Add Terraform Staging Stack + +**Files:** +- Create: `infra/envs/staging/backend.tf` +- Create: `infra/envs/staging/main.tf` +- Create: `infra/envs/staging/providers.tf` +- Create: `infra/envs/staging/variables.tf` +- Create: `infra/envs/staging/versions.tf` +- Modify: `infra/modules/ahand-hub/variables.tf` + +- [ ] **Step 1: Add `staging` to module env validation** + +Update the validation list from `["prod", "dev"]` to +`["prod", "dev", "staging"]`. + +- [ ] **Step 2: Copy the dev stack shape into staging** + +Use the same t9 VPC, subnet, Traefik, and RDS values as dev. 
Set: + +- backend key `ahand-hub/envs/staging/terraform.tfstate` +- default tag `Environment = "staging"` +- module `env = "staging"` +- `ecs_cluster_name = "openclaw-hive-dev"` +- `api_domain = "ahand-hub.staging.team9.ai"` +- `gateway_public_url = "https://api.staging.team9.ai"` + +- [ ] **Step 3: Run Terraform validation** + +Run: + +```bash +terraform fmt -check infra/modules/ahand-hub infra/envs/dev infra/envs/staging +terraform -chdir=infra/envs/staging init -backend=false +terraform -chdir=infra/envs/staging validate +``` + +Expected: all commands exit 0. + +- [ ] **Step 4: Commit Terraform changes** + +Run: + +```bash +git add infra/envs/staging infra/modules/ahand-hub/variables.tf +git commit -m "infra: add t9 staging hub stack" +``` + +### Task 3: Wire Staging Deployment + +**Files:** +- Modify: `.github/workflows/deploy-hub.yml` +- Modify: `deploy/hub/deploy.sh` +- Modify: `infra/README.md` + +- [ ] **Step 1: Extend branch mapping** + +Add `staging` to the workflow branch trigger and determine-env block. + +- [ ] **Step 2: Extend deploy script** + +Allow `staging` and set: + +- `ECS_CLUSTER=openclaw-hive-dev` +- `SERVICE_NAME=ahand-hub-staging` +- `API_DOMAIN=ahand-hub.staging.team9.ai` + +- [ ] **Step 3: Update runbook** + +Document prod/dev/staging, t9 profile/account, staging DNS, state key, and +deploy command. + +- [ ] **Step 4: Validate** + +Run: + +```bash +bash scripts/verify-aws-staging-config.sh +ruby -e 'require "yaml"; YAML.load_file(".github/workflows/deploy-hub.yml"); puts "yaml ok"' +bash -n deploy/hub/deploy.sh +``` + +Expected: all commands exit 0. + +- [ ] **Step 5: Commit deploy wiring** + +Run: + +```bash +git add .github/workflows/deploy-hub.yml deploy/hub/deploy.sh infra/README.md scripts/verify-aws-staging-config.sh +git commit -m "ci: deploy hub staging on aws t9" +``` + +### Task 4: Live AWS Staging Bring-Up + +**Files:** +- No source changes expected unless validation reveals missing config. 
+ +- [ ] **Step 1: Inspect Terraform plan** + +Run: + +```bash +terraform -chdir=infra/envs/staging init +terraform -chdir=infra/envs/staging plan +``` + +Expected: planned resources are isolated to `ahand-hub-staging` and +`/ahand-hub/staging/*`. + +- [ ] **Step 2: Apply after plan review** + +Run: `terraform -chdir=infra/envs/staging apply` + +- [ ] **Step 3: Seed runtime secrets** + +Write real values for: + +- `/ahand-hub/staging/DATABASE_URL` +- `/ahand-hub/staging/SENTRY_DSN` + +- [ ] **Step 4: Deploy staging image** + +Push the branch and let the `staging` branch deploy, or run +`./deploy/hub/deploy.sh staging` after pushing an image tagged `staging`. + +- [ ] **Step 5: Verify live service** + +Run AWS ECS checks and `curl -fsS https://ahand-hub.staging.team9.ai/api/health`. diff --git a/docs/superpowers/specs/2026-06-21-aws-t9-staging-design.md b/docs/superpowers/specs/2026-06-21-aws-t9-staging-design.md new file mode 100644 index 0000000..9dd552b --- /dev/null +++ b/docs/superpowers/specs/2026-06-21-aws-t9-staging-design.md @@ -0,0 +1,92 @@ +# AWS t9 Staging Environment Design + +## Goal + +Add a first-class AWS t9 staging environment for the aHand hub, matching the +existing global AWS deployment path instead of only relying on the Qisi staging +runtime. + +## Scope + +This change covers the Rust hub service deployed by `.github/workflows/deploy-hub.yml`. +The global AWS path does not currently deploy the Next.js hub dashboard +container, so dashboard Sentry/runtime settings remain Qisi-only until a global +dashboard deployment path exists. 
+ +## Environment Mapping + +The GitHub branch mapping should be: + +| Branch | Environment | ECS cluster | ECS service | Public host | +|---|---|---|---|---| +| `main` | `prod` | `openclaw-hive` | `ahand-hub-prod` | `ahand-hub.team9.ai` | +| `dev` | `dev` | `openclaw-hive-dev` | `ahand-hub-dev` | `ahand-hub.dev.team9.ai` | +| `staging` | `staging` | `openclaw-hive-dev` | `ahand-hub-staging` | `ahand-hub.staging.team9.ai` | + +Staging uses the t9 account (`149614785083`), the non-production cluster +(`openclaw-hive-dev`), and a separate Terraform stack under +`infra/envs/staging`. + +## Terraform + +Add `infra/envs/staging` as an independent stack with its own backend state: + +- S3 backend bucket: `team9-tfstate` +- State key: `ahand-hub/envs/staging/terraform.tfstate` +- AWS profile: `t9` +- Lock table: `terraform-state-lock` + +The staging stack should call `../../modules/ahand-hub` with: + +- `env = "staging"` +- `ecs_cluster_name = "openclaw-hive-dev"` +- `api_domain = "ahand-hub.staging.team9.ai"` +- The same t9 VPC, subnet, Traefik, and RDS values used by dev +- `gateway_public_url = "https://api.staging.team9.ai"` +- `redis_mode = "create"` + +The shared module's `env` validation must accept `staging`. + +## Runtime Parameters + +Terraform seeds `/ahand-hub/staging/*` parameters the same way as dev/prod. +Operator-seeded values remain out of Terraform state: + +- `/ahand-hub/staging/DATABASE_URL` +- `/ahand-hub/staging/SENTRY_DSN` + +`DATABASE_URL` must point at a dedicated `ahand_hub_staging` database/user in +the staging/non-production RDS instance before the service can run healthily. +`SENTRY_DSN` can be a placeholder initially, but automatic Sentry capture is not +active until a real DSN is written and ECS is redeployed. 
+ +## CI/CD + +`.github/workflows/deploy-hub.yml` should trigger on `staging` and set: + +- `ENV=staging` +- `ECS_CLUSTER=openclaw-hive-dev` +- `SERVICE_NAME=ahand-hub-staging` + +`deploy/hub/deploy.sh` should accept `staging`, select the same cluster/service, +and render `API_DOMAIN=ahand-hub.staging.team9.ai`. + +## Validation + +Local validation should include: + +- A repo check proving staging is wired into Terraform, workflow, and deploy + script. +- YAML parse of `.github/workflows/deploy-hub.yml`. +- Bash syntax check of `deploy/hub/deploy.sh`. +- `terraform fmt -check` for shared module and env stacks. +- `terraform init -backend=false` and `terraform validate` for + `infra/envs/staging`. + +Live validation after merge/apply should include: + +- Terraform plan/apply for `infra/envs/staging` using profile `t9`. +- SSM presence checks for `/ahand-hub/staging/*`. +- ECS service check for `ahand-hub-staging`. +- Staging deploy run from the `staging` branch. +- Health check at `https://ahand-hub.staging.team9.ai/api/health`. 
diff --git a/scripts/verify-aws-staging-config.sh b/scripts/verify-aws-staging-config.sh new file mode 100755 index 0000000..a3dcfcc --- /dev/null +++ b/scripts/verify-aws-staging-config.sh @@ -0,0 +1,43 @@ +#!/usr/bin/env bash +set -euo pipefail + +fail() { + printf 'aws_staging_config=missing: %s\n' "$1" >&2 + exit 1 +} + +require_file() { + [[ -f "$1" ]] || fail "$1" +} + +require_match() { + local pattern="$1" + local file="$2" + rg -q "$pattern" "$file" || fail "$file lacks $pattern" +} + +require_file infra/envs/staging/backend.tf +require_file infra/envs/staging/main.tf +require_file infra/envs/staging/providers.tf +require_file infra/envs/staging/variables.tf +require_file infra/envs/staging/versions.tf + +require_match 'contains\(\["prod", "dev", "staging"\], var\.env\)' infra/modules/ahand-hub/variables.tf +require_match 'branches: \[main, dev, staging\]' .github/workflows/deploy-hub.yml +require_match 'refs/heads/staging' .github/workflows/deploy-hub.yml +require_match 'SERVICE_NAME=ahand-hub-staging' .github/workflows/deploy-hub.yml +require_match 'openclaw-hive-dev' .github/workflows/deploy-hub.yml + +require_match '\[\[ "\$ENV" == "dev" \|\| "\$ENV" == "staging" \|\| "\$ENV" == "prod" \]\]' deploy/hub/deploy.sh +require_match 'SERVICE_NAME="ahand-hub-staging"' deploy/hub/deploy.sh +require_match 'API_DOMAIN="ahand-hub\.staging\.team9\.ai"' deploy/hub/deploy.sh + +require_match 'env[[:space:]]*=[[:space:]]*"staging"' infra/envs/staging/main.tf +require_match 'ecs_cluster_name[[:space:]]*=[[:space:]]*"openclaw-hive-dev"' infra/envs/staging/main.tf +require_match 'api_domain[[:space:]]*=[[:space:]]*"ahand-hub\.staging\.team9\.ai"' infra/envs/staging/main.tf +require_match 'gateway_public_url[[:space:]]*=[[:space:]]*"https://api\.staging\.team9\.ai"' infra/envs/staging/main.tf +require_match 'key[[:space:]]*=[[:space:]]*"ahand-hub/envs/staging/terraform\.tfstate"' infra/envs/staging/backend.tf +require_match 'profile[[:space:]]*=[[:space:]]*"t9"' 
infra/envs/staging/backend.tf +require_match 'Environment[[:space:]]*=[[:space:]]*"staging"' infra/envs/staging/providers.tf + +printf 'aws_staging_config=ok\n' From 333d33783a8aa448ce6e92532bae31b523b304ec Mon Sep 17 00:00:00 2001 From: Winrey Date: Sun, 21 Jun 2026 06:36:52 +0800 Subject: [PATCH 2/3] infra: add t9 staging hub stack --- infra/envs/staging/backend.tf | 10 ++++++++ infra/envs/staging/main.tf | 37 ++++++++++++++++++++++++++++ infra/envs/staging/providers.tf | 12 +++++++++ infra/envs/staging/variables.tf | 29 ++++++++++++++++++++++ infra/envs/staging/versions.tf | 14 +++++++++++ infra/modules/ahand-hub/ecs.tf | 2 +- infra/modules/ahand-hub/variables.tf | 4 +-- 7 files changed, 105 insertions(+), 3 deletions(-) create mode 100644 infra/envs/staging/backend.tf create mode 100644 infra/envs/staging/main.tf create mode 100644 infra/envs/staging/providers.tf create mode 100644 infra/envs/staging/variables.tf create mode 100644 infra/envs/staging/versions.tf diff --git a/infra/envs/staging/backend.tf b/infra/envs/staging/backend.tf new file mode 100644 index 0000000..a8daa1c --- /dev/null +++ b/infra/envs/staging/backend.tf @@ -0,0 +1,10 @@ +terraform { + backend "s3" { + bucket = "team9-tfstate" + key = "ahand-hub/envs/staging/terraform.tfstate" + region = "us-east-1" + profile = "t9" + dynamodb_table = "terraform-state-lock" + encrypt = true + } +} diff --git a/infra/envs/staging/main.tf b/infra/envs/staging/main.tf new file mode 100644 index 0000000..db68644 --- /dev/null +++ b/infra/envs/staging/main.tf @@ -0,0 +1,37 @@ +data "aws_lb" "traefik" { + name = var.traefik_alb_name +} + +module "ahand_hub" { + source = "../../modules/ahand-hub" + + # t9 staging is isolated from dev at ECS/SSM/Redis/IAM while reusing the + # non-production cluster and network shape. 
+ env = "staging" + ecs_cluster_name = "openclaw-hive-dev" + api_domain = "ahand-hub.staging.team9.ai" + openclaw_rds_host = var.openclaw_rds_host + openclaw_rds_security_group_id = var.openclaw_rds_security_group_id + vpc_id = var.vpc_id + subnet_ids = var.subnet_ids + traefik_security_group_id = var.traefik_security_group_id + gateway_public_url = "https://api.staging.team9.ai" + redis_mode = "create" +} + +output "execution_role_arn" { + value = module.ahand_hub.execution_role_arn +} + +output "task_role_arn" { + value = module.ahand_hub.task_role_arn +} + +output "file_ops_bucket_name" { + value = module.ahand_hub.file_ops_bucket_name +} + +output "traefik_lb_dns_name" { + description = "Configure Cloudflare CNAME ahand-hub.staging.team9.ai -> this value (DNS-only, gray cloud)" + value = data.aws_lb.traefik.dns_name +} diff --git a/infra/envs/staging/providers.tf b/infra/envs/staging/providers.tf new file mode 100644 index 0000000..ead109e --- /dev/null +++ b/infra/envs/staging/providers.tf @@ -0,0 +1,12 @@ +provider "aws" { + region = "us-east-1" + profile = "t9" + + default_tags { + tags = { + Environment = "staging" + Service = "ahand-hub" + ManagedBy = "Terraform" + } + } +} diff --git a/infra/envs/staging/variables.tf b/infra/envs/staging/variables.tf new file mode 100644 index 0000000..19f4c99 --- /dev/null +++ b/infra/envs/staging/variables.tf @@ -0,0 +1,29 @@ +variable "vpc_id" { + type = string + default = "vpc-05804f4c4dd8965f3" +} + +variable "subnet_ids" { + type = list(string) + default = ["subnet-0eaffec23bfd7eb63", "subnet-0cdb64bc3cf4c6ee3"] +} + +variable "traefik_alb_name" { + type = string + default = "traefik-dev-nlb" +} + +variable "traefik_security_group_id" { + type = string + default = "sg-0368318519318a4ba" +} + +variable "openclaw_rds_host" { + type = string + default = "openclaw-hive-dev.c89gkagwy37d.us-east-1.rds.amazonaws.com" +} + +variable "openclaw_rds_security_group_id" { + type = string + default = "sg-0b7b9a007a8b5b7a6" +} 
diff --git a/infra/envs/staging/versions.tf b/infra/envs/staging/versions.tf new file mode 100644 index 0000000..e7a6887 --- /dev/null +++ b/infra/envs/staging/versions.tf @@ -0,0 +1,14 @@ +terraform { + required_version = ">= 1.6.0" + + required_providers { + aws = { + source = "hashicorp/aws" + version = "~> 5.40" + } + random = { + source = "hashicorp/random" + version = "~> 3.6" + } + } +} diff --git a/infra/modules/ahand-hub/ecs.tf b/infra/modules/ahand-hub/ecs.tf index fae9597..5bf6b1b 100644 --- a/infra/modules/ahand-hub/ecs.tf +++ b/infra/modules/ahand-hub/ecs.tf @@ -79,7 +79,7 @@ resource "aws_ecs_task_definition" "stub" { task_role_arn = aws_iam_role.task.arn # Placeholder only — deploy-hub.yml registers fresh revisions on every - # push to main/dev with the real image + secrets + Traefik labels. + # push to main/dev/staging with the real image + secrets + Traefik labels. container_definitions = jsonencode([ { name = "ahand-hub" diff --git a/infra/modules/ahand-hub/variables.tf b/infra/modules/ahand-hub/variables.tf index 341fa99..cde6a0f 100644 --- a/infra/modules/ahand-hub/variables.tf +++ b/infra/modules/ahand-hub/variables.tf @@ -2,8 +2,8 @@ variable "env" { description = "Deployment environment" type = string validation { - condition = contains(["prod", "dev"], var.env) - error_message = "env must be 'prod' or 'dev'" + condition = contains(["prod", "dev", "staging"], var.env) + error_message = "env must be 'prod', 'dev', or 'staging'" } } From 7e9f8b0c3dcf6f9e67861361fdccae0895236427 Mon Sep 17 00:00:00 2001 From: Winrey Date: Sun, 21 Jun 2026 06:36:55 +0800 Subject: [PATCH 3/3] ci: deploy hub staging on aws t9 --- .github/workflows/deploy-hub.yml | 8 +++- deploy/hub/deploy.sh | 28 ++++++++----- infra/README.md | 68 ++++++++++++++++++-------------- 3 files changed, 64 insertions(+), 40 deletions(-) diff --git a/.github/workflows/deploy-hub.yml b/.github/workflows/deploy-hub.yml index bceb1c6..62a0aba 100644 --- a/.github/workflows/deploy-hub.yml +++ 
b/.github/workflows/deploy-hub.yml @@ -1,7 +1,7 @@ name: Deploy Hub on: push: - branches: [main, dev] + branches: [main, dev, staging] paths: - "crates/ahand-hub/**" - "crates/ahand-hub-core/**" @@ -10,6 +10,8 @@ on: - "proto/**" - "Cargo.lock" - "deploy/hub/Dockerfile" + - "deploy/hub/deploy.sh" + - "deploy/hub/task-definition.template.json" - ".github/workflows/deploy-hub.yml" permissions: @@ -32,6 +34,10 @@ jobs: echo "ENV=prod" >> "$GITHUB_ENV" echo "ECS_CLUSTER=openclaw-hive" >> "$GITHUB_ENV" echo "SERVICE_NAME=ahand-hub-prod" >> "$GITHUB_ENV" + elif [ "${{ github.ref }}" = "refs/heads/staging" ]; then + echo "ENV=staging" >> "$GITHUB_ENV" + echo "ECS_CLUSTER=openclaw-hive-dev" >> "$GITHUB_ENV" + echo "SERVICE_NAME=ahand-hub-staging" >> "$GITHUB_ENV" else echo "ENV=dev" >> "$GITHUB_ENV" echo "ECS_CLUSTER=openclaw-hive-dev" >> "$GITHUB_ENV" diff --git a/deploy/hub/deploy.sh b/deploy/hub/deploy.sh index abe3fda..a784f11 100755 --- a/deploy/hub/deploy.sh +++ b/deploy/hub/deploy.sh @@ -3,7 +3,7 @@ set -euo pipefail SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)" ENV="${1:-}" -[[ "$ENV" == "dev" || "$ENV" == "prod" ]] || { echo "Usage: $0 {dev|prod}"; exit 1; } +[[ "$ENV" == "dev" || "$ENV" == "staging" || "$ENV" == "prod" ]] || { echo "Usage: $0 {dev|staging|prod}"; exit 1; } AWS_REGION="us-east-1" ACCOUNT_ID="149614785083" @@ -11,15 +11,23 @@ ECR_REGISTRY="${ACCOUNT_ID}.dkr.ecr.${AWS_REGION}.amazonaws.com" ECR_REPO="ahand-hub" GIT_SHA="${GIT_SHA:-$(git rev-parse --short HEAD)}" -if [[ "$ENV" == "prod" ]]; then - ECS_CLUSTER="openclaw-hive" - SERVICE_NAME="ahand-hub-prod" - API_DOMAIN="ahand-hub.team9.ai" -else - ECS_CLUSTER="openclaw-hive-dev" - SERVICE_NAME="ahand-hub-dev" - API_DOMAIN="ahand-hub.dev.team9.ai" -fi +case "$ENV" in + prod) + ECS_CLUSTER="openclaw-hive" + SERVICE_NAME="ahand-hub-prod" + API_DOMAIN="ahand-hub.team9.ai" + ;; + staging) + ECS_CLUSTER="openclaw-hive-dev" + SERVICE_NAME="ahand-hub-staging" + 
API_DOMAIN="ahand-hub.staging.team9.ai" + ;; + dev) + ECS_CLUSTER="openclaw-hive-dev" + SERVICE_NAME="ahand-hub-dev" + API_DOMAIN="ahand-hub.dev.team9.ai" + ;; +esac ECR_IMAGE="${ECR_REGISTRY}/${ECR_REPO}:${ENV}" SSM_PREFIX="arn:aws:ssm:${AWS_REGION}:${ACCOUNT_ID}:parameter/ahand-hub/${ENV}" diff --git a/infra/README.md b/infra/README.md index 2912b39..39b2e28 100644 --- a/infra/README.md +++ b/infra/README.md @@ -3,10 +3,10 @@ Terraform + AWS operations guide for the **ahand-hub** control-plane service. - **Design spec**: `../docs` (ahand) and `../team9/docs/superpowers/specs/2026-04-22-ahand-integration-design.md` -- **AWS account**: `471112576951` +- **AWS account**: `149614785083` (t9) - **Region**: `us-east-1` -- **AWS profile (local)**: `ww` -- **Environments**: prod (`ahand-hub.team9.ai`), dev (`ahand-hub.dev.team9.ai`) +- **AWS profile (local)**: `t9` +- **Environments**: prod (`ahand-hub.team9.ai`), staging (`ahand-hub.staging.team9.ai`), dev (`ahand-hub.dev.team9.ai`) ## Prerequisites @@ -17,8 +17,8 @@ Terraform + AWS operations guide for the **ahand-hub** control-plane service. | Docker | ≥ 24 | Docker Desktop | | `psql` | any | `brew install libpq && brew link --force libpq` | -`aws configure --profile ww` must be set up with credentials for the -`ww-admin` IAM user (or any principal with equivalent IAM / SSM / ECS / RDS +`aws configure --profile t9` must be set up with credentials for a principal +(one with IAM / SSM / ECS / RDS access). 
## Directory layout @@ -28,13 +28,14 @@ infra/ ├── shared/ # account-wide resources (ECR, OIDC deploy role, log group) ├── envs/ │ ├── prod/ # prod stack — module "ahand_hub" { env = "prod" } +│ ├── staging/ # staging stack — module "ahand_hub" { env = "staging" } │ └── dev/ # dev stack — module "ahand_hub" { env = "dev" } └── modules/ └── ahand-hub/ # per-env IAM, SSM, Redis, ECS resources ``` -Each of `shared/`, `envs/prod/`, `envs/dev/` is an independent Terraform -stack with its own `backend.tf` key inside `s3://weightwave-tfstate`. +Each of `shared/`, `envs/prod/`, `envs/staging/`, and `envs/dev/` is an +independent Terraform stack with its own `backend.tf` key. ## First-time apply @@ -51,7 +52,13 @@ terraform init terraform plan terraform apply -# 3. Prod +# 3. Staging +cd ../staging +terraform init +terraform plan +terraform apply + +# 4. Prod cd ../prod terraform init terraform plan @@ -69,6 +76,7 @@ name is known: | Host | Type | Target | Cloudflare mode | |---|---|---|---| | `ahand-hub.team9.ai` | CNAME | `traefik-nlb-9d708d124f9805ad.elb.us-east-1.amazonaws.com` | DNS-only (gray cloud) | +| `ahand-hub.staging.team9.ai` | CNAME | `traefik-dev-nlb-8cda97ce6b37e5e1.elb.us-east-1.amazonaws.com` | DNS-only (gray cloud) | | `ahand-hub.dev.team9.ai` | CNAME | `traefik-dev-nlb-8cda97ce6b37e5e1.elb.us-east-1.amazonaws.com` | DNS-only (gray cloud) | Always DNS-only — Traefik's LetsEncrypt HTTP-01 challenge fails through the @@ -85,7 +93,7 @@ ahand-hub runs on the shared `openclaw-hive-{prod,dev}` RDS instances. The real value is seeded by hand: ```bash -ENV=prod # or dev +ENV=prod # or staging or dev (staging uses the openclaw-hive-dev RDS host, so override RDS_HOST) RDS_HOST=openclaw-hive-$ENV.chq8i2se49qd.us-east-1.rds.amazonaws.com DB_ADMIN=openclaw # RDS master user DB_ADMIN_PASSWORD=... # operator-held password @@ -109,7 +117,7 @@ SQL # 3. 
Build the postgres:// URL and publish to SSM DATABASE_URL="postgres://ahand_hub_${ENV}:${AHAND_PW}@${RDS_HOST}:5432/ahand_hub_${ENV}?sslmode=require" aws ssm put-parameter --name /ahand-hub/$ENV/DATABASE_URL \ - --type SecureString --value "$DATABASE_URL" --overwrite --profile ww + --type SecureString --value "$DATABASE_URL" --overwrite --profile t9 ``` Rotate by repeating with a new password and `ALTER ROLE ... WITH PASSWORD`. @@ -122,9 +130,9 @@ Sentry project is provisioned: ```bash aws ssm put-parameter --name /ahand-hub/prod/SENTRY_DSN \ --type SecureString --value "https://@o.ingest.sentry.io/" \ - --overwrite --profile ww + --overwrite --profile t9 aws ecs update-service --cluster openclaw-hive --service ahand-hub-prod \ - --force-new-deployment --profile ww + --force-new-deployment --profile t9 ``` Other secrets (`JWT_SECRET`, `SERVICE_TOKEN`, `WEBHOOK_SECRET`, @@ -136,34 +144,35 @@ Other secrets (`JWT_SECRET`, `SERVICE_TOKEN`, `WEBHOOK_SECRET`, ```bash # Push to dev branch → deploy-hub.yml auto-deploys to dev +# Push to staging branch → deploy-hub.yml auto-deploys to staging # Push to main branch → deploy-hub.yml auto-deploys to prod -git push origin dev # or main +git push origin dev # or staging or main # Manual deploy from a developer laptop -aws ecr get-login-password --profile ww --region us-east-1 \ - | docker login --username AWS --password-stdin 471112576951.dkr.ecr.us-east-1.amazonaws.com +aws ecr get-login-password --profile t9 --region us-east-1 \ + | docker login --username AWS --password-stdin 149614785083.dkr.ecr.us-east-1.amazonaws.com docker build --platform linux/amd64 \ - -t 471112576951.dkr.ecr.us-east-1.amazonaws.com/ahand-hub:dev . -docker push 471112576951.dkr.ecr.us-east-1.amazonaws.com/ahand-hub:dev -AWS_PROFILE=ww ./deploy/hub/deploy.sh dev + -t 149614785083.dkr.ecr.us-east-1.amazonaws.com/ahand-hub:dev . 
+docker push 149614785083.dkr.ecr.us-east-1.amazonaws.com/ahand-hub:dev +AWS_PROFILE=t9 ./deploy/hub/deploy.sh dev ``` ## View logs ```bash -aws logs tail /ecs/ahand-hub --profile ww --region us-east-1 --since 1h --follow -aws logs tail /ecs/ahand-hub --profile ww --region us-east-1 --since 1h \ +aws logs tail /ecs/ahand-hub --profile t9 --region us-east-1 --since 1h --follow +aws logs tail /ecs/ahand-hub --profile t9 --region us-east-1 --since 1h \ --filter-pattern '{$.level = "error"}' ``` -Log streams are prefixed `ahand-hub-prod-*` and `ahand-hub-dev-*`, so the -single log group serves both environments. +Log streams are prefixed `ahand-hub-prod-*`, `ahand-hub-staging-*`, and +`ahand-hub-dev-*`, so the single log group serves all hub environments. ## Rollback ```bash -aws ecs list-task-definitions --profile ww --family-prefix ahand-hub-prod --sort DESC | head -10 -aws ecs update-service --profile ww --cluster openclaw-hive \ +aws ecs list-task-definitions --profile t9 --family-prefix ahand-hub-prod --sort DESC | head -10 +aws ecs update-service --profile t9 --cluster openclaw-hive \ --service ahand-hub-prod --task-definition ahand-hub-prod:42 --force-new-deployment ``` @@ -180,10 +189,10 @@ aws ecs update-service --profile ww --cluster openclaw-hive \ ### ECS task crashes on startup -1. Logs: `aws logs tail /ecs/ahand-hub --profile ww --since 30m`. +1. Logs: `aws logs tail /ecs/ahand-hub --profile t9 --since 30m`. 2. Likely causes: - `DATABASE_URL` or `REDIS_URL` missing — verify - `aws ssm get-parameters-by-path --path /ahand-hub// --with-decryption --profile ww`. + `aws ssm get-parameters-by-path --path /ahand-hub// --with-decryption --profile t9`. - Migration failure — connect as `ahand_hub_` and inspect the `schema_migrations` table (or Drizzle/SQLx equivalent). - Redis unreachable — confirm the task SG has egress to the Redis SG on 6379. @@ -196,8 +205,8 @@ DNS-only. 
### High RDS connections -`openclaw-hive-{prod,dev}` is shared with other services (folder9, control-plane). -Hub caps its pool in code; confirm by querying `pg_stat_activity`. +`openclaw-hive-dev` is shared by dev and staging. Hub caps its pool in code; +confirm by querying `pg_stat_activity`. ## Terraform @@ -205,9 +214,10 @@ Hub caps its pool in code; confirm by querying `pg_stat_activity`. |---|---|---| | shared | `infra/shared/` | `ahand-hub/shared/terraform.tfstate` | | prod | `infra/envs/prod/` | `ahand-hub/envs/prod/terraform.tfstate` | +| staging | `infra/envs/staging/` | `ahand-hub/envs/staging/terraform.tfstate` | | dev | `infra/envs/dev/` | `ahand-hub/envs/dev/terraform.tfstate` | -State bucket `weightwave-tfstate`, lock table `terraform-state-lock` — both +State bucket `team9-tfstate`, lock table `terraform-state-lock` — both pre-existing and shared with folder9 / other team9 services. ## Preflight audit