cloudflare · Muna-Lombe · Feb 25, 2026 · Feb 26, 2026
diff --git a/.RALPH/README.md b/.RALPH/README.md
@@ -0,0 +1,22 @@
+# .RALPH – moltworker Pattern Library
+
+This is the project-level knowledge base for `moltworker` (the OpenClaw-based Cloudflare
+Worker + Sandbox project). Every validated pattern, recurring problem, and architectural
+decision from this project is logged here.
+
+> For the future **nanoworker** project, see `nanoworker/.RALPH/` — it inherits and
+> extends many of these patterns.
+
+## Rules for agents
+
+1. **Before trying an approach**, check `patterns.md` and `problems.md`.
+2. **After solving a non-trivial problem**, add an entry to `problems.md` (and to `patterns.md` if the fix is reusable).
+3. **After making an architectural decision**, log it in `decisions.md`.
+
+## Files
+
+| File | Purpose |
+|------|---------|
+| `patterns.md` | Validated reusable implementation strategies |
+| `problems.md` | Recurring problems and their confirmed solutions |
+| `decisions.md` | Architectural decisions with rationale |
diff --git a/.RALPH/decisions.md b/.RALPH/decisions.md
@@ -0,0 +1,71 @@
+# Architectural Decisions – moltworker
+
+---
+
+## ADR-001 – Config patcher runs unconditionally on every container boot
+
+**Date**: 2026-03-03  
+**Status**: Accepted
+
+**Context**: Should the Node.js config patcher in `start-openclaw.sh` run only when
+no config exists (i.e. first boot), or unconditionally?
+
+**Decision**: Unconditionally, after any R2 restore and before the gateway starts.
+
+**Rationale**: Running it conditionally means that changing a Cloudflare secret requires
+manually deleting the R2 config to force re-onboard. This is error-prone and was the
+direct cause of PROB-001 and PROB-002 in production. Running it unconditionally means
+`wrangler secret put` + `npm run deploy` is always sufficient to propagate new secret values.
+
+**Trade-offs**: Startup adds a small overhead (~50 ms for the Node.js one-shot). Manual
+in-container edits to patched fields (provider apiKey, channel tokens, gateway token) will
+be overwritten on next restart. This is documented and acceptable.
+
+---
+
+## ADR-002 – Use rclone (not rsync or s3fs) for R2 persistence
+
+**Date**: 2026-03-03  
+**Status**: Accepted
+
+**Context**: The container needs to persist OpenClaw config and workspace to R2 across restarts.
+
+**Decision**: rclone with `--fast-list --s3-no-check-bucket`, not rsync or s3fs mount.
+
+**Rationale**: R2 does not support setting file timestamps. `rsync -a` (which preserves
+timestamps) fails with I/O errors against R2 (PROB-004). rclone works correctly with R2
+by default and does not attempt to set timestamps.
+
+---
+
+## ADR-003 – CF AI Gateway requires `CF_AI_GATEWAY_MODEL` to be explicitly set
+
+**Date**: 2026-03-03  
+**Status**: Accepted
+
+**Context**: Should the config patcher try to infer the model from other config,
+or require an explicit `CF_AI_GATEWAY_MODEL` env var?
+
+**Decision**: Require explicit `CF_AI_GATEWAY_MODEL` (format: `{provider}/{model}`).
+
+**Rationale**: Inferring the model is ambiguous and error-prone. An explicit var makes
+the configuration unambiguous, testable, and easy to change without touching code.
+The format `{provider}/{model}` allows the patcher to construct the correct gateway base URL
+and set the correct `api` mode (`anthropic-messages` vs `openai-completions`).
+
+---
+
+## ADR-004 – `MOLTBOT_GATEWAY_TOKEN` is mapped to `OPENCLAW_GATEWAY_TOKEN` in the container
+
+**Date**: 2026-03-03  
+**Status**: Accepted
+
+**Context**: The Worker-facing secret is named `MOLTBOT_GATEWAY_TOKEN` (worker-level
+naming convention). The OpenClaw container expects `OPENCLAW_GATEWAY_TOKEN`.
+
+**Decision**: `buildEnvVars()` maps `MOLTBOT_GATEWAY_TOKEN` → `OPENCLAW_GATEWAY_TOKEN`.
+The `start-openclaw.sh` script reads `OPENCLAW_GATEWAY_TOKEN` internally.
+
+**Rationale**: Keeps the Worker env namespace decoupled from the container's internal
+naming. If OpenClaw is ever replaced, only `buildEnvVars()` needs to change, not the
+Worker-facing secret name.
diff --git a/.RALPH/patterns.md b/.RALPH/patterns.md
@@ -0,0 +1,121 @@
+# Validated Patterns – moltworker
+
+---
+
+## P-001 – Inline config patcher (always runs on every container boot)
+
+**Date**: 2026-03-03  
+**Context**: OpenClaw reads provider config from `~/.openclaw/openclaw.json`. Secrets live
+in Cloudflare Worker env and must reach the container. The container may have a persisted
+config from R2 which must not be fully overwritten.
+
+**Approach**: In `start-openclaw.sh`, after the R2 restore, run an inline Node.js heredoc
+that reads the existing config, writes/overrides only the sections it owns (provider entry,
+gateway auth, channels), and writes it back. This runs **unconditionally** — not just on
+first boot.
+
+**Location**: `start-openclaw.sh` lines 141–265  
+**Result**: ✅ Validated. Fixes stale R2 config issues (PROB-002). Ensures new secrets
+take effect on next container restart after redeploy.
+
+**Caveats**:
+- Patcher must not write fields that fail OpenClaw's strict config validation (PROB-006).
+- Patcher must be idempotent (running twice produces the same output).
+- Test: run `openclaw status` after patching; non-zero exit = bad config.
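The merge behaviour P-001 describes (override only the owned sections, keep everything else, stay idempotent) can be sketched as below. This is a minimal illustration, not the heredoc in `start-openclaw.sh`: `patchConfig`, `PatchInput`, and the `gateway.auth.token` path are hypothetical names chosen for the example, and only `models.providers.{name}` is taken from the notes in this PR.

```typescript
// Hypothetical shapes; the real openclaw.json schema may differ.
type Json = Record<string, unknown>;

interface PatchInput {
  providerName: string; // e.g. "cf-ai-gw-anthropic"
  provider: Json; // provider entry fully owned by the patcher
  gatewayToken?: string; // value of OPENCLAW_GATEWAY_TOKEN, if set
}

// Treat anything that is not a plain object as an empty object.
function asObject(value: unknown): Json {
  return typeof value === "object" && value !== null && !Array.isArray(value)
    ? (value as Json)
    : {};
}

// Returns a new config: owned sections are overwritten, all other keys are kept.
function patchConfig(existing: Json, input: PatchInput): Json {
  const result: Json = { ...existing };

  const models = asObject(existing.models);
  const providers = asObject(models.providers);
  result.models = {
    ...models,
    providers: { ...providers, [input.providerName]: input.provider },
  };

  // Assumed location for the gateway token; skipped when no token is set.
  if (input.gatewayToken) {
    const gateway = asObject(existing.gateway);
    const auth = asObject(gateway.auth);
    result.gateway = { ...gateway, auth: { ...auth, token: input.gatewayToken } };
  }
  return result;
}
```

Because the function derives every owned field from its input and copies the rest, applying it to its own output yields the same config, which is the idempotency property the caveat asks for.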
+
+---
+
+## P-002 – CF AI Gateway provider injection via config patcher
+
+**Date**: 2026-03-03  
+**Context**: Using Cloudflare AI Gateway as the model provider. Requires building a provider
+entry in `openclaw.json` with a `baseUrl`, `apiKey`, and `models` array.
+
+**Approach**: In the patcher, detect `CF_AI_GATEWAY_MODEL` (format: `{provider}/{model}`).
+Extract the provider prefix and model ID. Build the base URL:
+```
+https://gateway.ai.cloudflare.com/v1/{CF_AI_GATEWAY_ACCOUNT_ID}/{CF_AI_GATEWAY_GATEWAY_ID}/{provider}
+```
+Write a provider entry named `cf-ai-gw-{provider}` with:
+- `baseUrl`: gateway URL
+- `apiKey`: value of `CLOUDFLARE_AI_GATEWAY_API_KEY`
+- `api`: `"anthropic-messages"` for Anthropic provider, `"openai-completions"` otherwise
+- `models`: array with the single specified model
+
+Set `agents.defaults.model.primary` to `cf-ai-gw-{provider}/{modelId}`.
+
+**Location**: `start-openclaw.sh` lines 183–219  
+**Result**: ✅ Validated. This is the working path for CF AI Gateway models.
+
+**Caveats**:
+- For `workers-ai` provider, append `/v1` to the base URL.
+- All four env vars must be set together: `CLOUDFLARE_AI_GATEWAY_API_KEY`,
+  `CF_AI_GATEWAY_ACCOUNT_ID`, `CF_AI_GATEWAY_GATEWAY_ID`, `CF_AI_GATEWAY_MODEL`.
+- `apiKey` must be non-empty — do not write an empty string.
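The derivation in P-002 can be written as one pure function. The sketch below follows the rules stated above (split `{provider}/{model}`, build the gateway URL, pick the `api` mode, append `/v1` for `workers-ai`, refuse to run unless all four variables are set). `buildGatewayProvider` and the `{ id }` shape of each `models` element are assumptions for illustration; the actual patcher is an inline Node.js script and may structure the entry differently.

```typescript
interface GatewayEnv {
  CLOUDFLARE_AI_GATEWAY_API_KEY?: string;
  CF_AI_GATEWAY_ACCOUNT_ID?: string;
  CF_AI_GATEWAY_GATEWAY_ID?: string;
  CF_AI_GATEWAY_MODEL?: string;
}

interface ProviderPatch {
  providerName: string; // key under models.providers
  primaryModel: string; // value for agents.defaults.model.primary
  entry: {
    baseUrl: string;
    apiKey: string;
    api: "anthropic-messages" | "openai-completions";
    models: { id: string }[]; // element shape is an assumption
  };
}

// Returns null when any required variable is missing or the model is malformed.
function buildGatewayProvider(env: GatewayEnv): ProviderPatch | null {
  const apiKey = env.CLOUDFLARE_AI_GATEWAY_API_KEY;
  const accountId = env.CF_AI_GATEWAY_ACCOUNT_ID;
  const gatewayId = env.CF_AI_GATEWAY_GATEWAY_ID;
  const model = env.CF_AI_GATEWAY_MODEL;
  if (!apiKey || !accountId || !gatewayId || !model) return null;

  // Split on the first slash only: model IDs may themselves contain slashes.
  const slash = model.indexOf("/");
  if (slash <= 0 || slash === model.length - 1) return null;
  const provider = model.slice(0, slash);
  const modelId = model.slice(slash + 1);

  let baseUrl = `https://gateway.ai.cloudflare.com/v1/${accountId}/${gatewayId}/${provider}`;
  if (provider === "workers-ai") baseUrl += "/v1";

  return {
    providerName: `cf-ai-gw-${provider}`,
    primaryModel: `cf-ai-gw-${provider}/${modelId}`,
    entry: {
      baseUrl,
      apiKey,
      api: provider === "anthropic" ? "anthropic-messages" : "openai-completions",
      models: [{ id: modelId }],
    },
  };
}
```

Returning `null` instead of a partial entry keeps the caller from ever writing an empty `apiKey`, which is one of the PROB-006 failure modes.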
+
+---
+
+## P-003 – Worker WebSocket proxy with token injection
+
+**Date**: 2026-03-03  
+**Context**: Cloudflare Workers proxy WebSocket connections to Sandbox containers.
+CF Access redirects strip query parameters, losing the `?token=` needed by the gateway.
+
+**Approach**: In the WS proxy handler (`src/index.ts`):
+1. Check if `MOLTBOT_GATEWAY_TOKEN` is set and URL lacks `?token=`.
+2. If so, clone the URL and inject the token as `?token={value}`.
+3. Use the modified URL for `sandbox.wsConnect()`.
+4. Create a `WebSocketPair`, accept both ends, wire `message`/`close`/`error` relays.
+5. Return `new Response(null, { status: 101, webSocket: clientWs })`.
+
+**Location**: `src/index.ts` lines 283–429  
+**Result**: ✅ Validated. Fixes PROB-005.
+
+**Caveats**:
+- WS close reasons must be ≤ 123 bytes (WebSocket spec); truncate if longer.
+- `containerWs` may be null if container not ready; handle gracefully.
+- Error messages from the gateway can be transformed before relaying to the client.
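Two pieces of P-003 are easy to isolate from the proxy handler: injecting the token only when the URL lacks one, and keeping close reasons within the 123-byte limit. The helpers below are a sketch under that reading; `withGatewayToken` and `truncateCloseReason` are hypothetical names, and the call to `sandbox.wsConnect()` and the relay wiring are deliberately left out.

```typescript
// Steps 1-2: add ?token= only if a token is configured and the URL has none.
function withGatewayToken(requestUrl: string, token: string | undefined): URL {
  const url = new URL(requestUrl);
  if (token && !url.searchParams.has("token")) {
    url.searchParams.set("token", token); // value is URL-encoded by searchParams
  }
  return url;
}

// Caveat: a close reason may occupy at most 123 bytes of UTF-8.
// Truncate on character boundaries so a multi-byte character is never split.
function truncateCloseReason(reason: string, maxBytes: number = 123): string {
  const encoder = new TextEncoder();
  if (encoder.encode(reason).length <= maxBytes) return reason;

  let out = "";
  let used = 0;
  for (const ch of reason) {
    const size = encoder.encode(ch).length;
    if (used + size > maxBytes) break;
    out += ch;
    used += size;
  }
  return out;
}
```

Measuring in bytes rather than `reason.length` matters because the limit applies to the encoded payload, so a reason containing non-ASCII text can exceed it with far fewer than 123 characters.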
+
+---
+
+## P-004 – rclone for R2 config sync (not rsync)
+
+**Date**: 2026-03-03  
+**Context**: Container config and workspace must persist across restarts via R2.
+
+**Approach**: Use `rclone` (not `rsync`) with these flags:
+```bash
+rclone sync "$LOCAL_DIR/" "r2:${R2_BUCKET}/{prefix}/" \
+  --transfers=16 --fast-list --s3-no-check-bucket \
+  --exclude='*.lock' --exclude='*.log' --exclude='*.tmp' --exclude='.git/**'
+```
+Background sync loop checks for changed files every 30 s via `find -newer {marker}`.
+
+**Location**: `start-openclaw.sh` lines 270–310  
+**Result**: ✅ Validated. Avoids PROB-004 (timestamp errors on R2).
+
+**Caveats**:
+- Never use `rsync -a` or `rsync --times` against R2.
+- Update the marker file (`touch $MARKER`) after each sync, not before.
+- The sync loop runs in background (`&`); do not wait for it before starting gateway.
+
+---
+
+## P-005 – `buildEnvVars()` — Worker env → container env mapping
+
+**Date**: 2026-03-03  
+**Context**: Worker secrets must be forwarded to the container as process env vars.
+
+**Approach**: A dedicated `buildEnvVars(env: MoltbotEnv): Record<string, string>` function
+in `src/gateway/env.ts` handles all mapping logic:
+- Conditionally includes only vars that are set (no empty strings).
+- Handles provider priority: CF AI Gateway first, then direct Anthropic; the legacy AI Gateway
+  variables, when both are set, override the direct Anthropic key.
+- Maps `MOLTBOT_GATEWAY_TOKEN` → `OPENCLAW_GATEWAY_TOKEN` (container-internal name).
+
+**Location**: `src/gateway/env.ts`  
+**Result**: ✅ Validated. Well-tested (see `src/gateway/env.test.ts`).
+
+**Caveats**:
+- Never log secret values from `buildEnvVars()` output. Log `Object.keys(envVars)` only.
+- Legacy AI Gateway path (`AI_GATEWAY_API_KEY` + `AI_GATEWAY_BASE_URL`) overrides direct
+  Anthropic key when both are set — this is intentional but can be surprising.
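The two simplest rules in P-005 (skip unset or empty values, rename the gateway token) and the logging caveat can be illustrated with a cut-down stand-in. `MoltbotEnvSketch` and `buildEnvVarsSketch` are hypothetical and cover only three variables; the real `buildEnvVars()` in `src/gateway/env.ts` also implements the provider-priority logic, which is not reproduced here.

```typescript
// Reduced, hypothetical version of the Worker env type.
interface MoltbotEnvSketch {
  MOLTBOT_GATEWAY_TOKEN?: string;
  ANTHROPIC_API_KEY?: string;
  CF_AI_GATEWAY_MODEL?: string;
}

function buildEnvVarsSketch(env: MoltbotEnvSketch): Record<string, string> {
  const out: Record<string, string> = {};

  // Include a variable only when it has a non-empty value.
  const set = (name: string, value: string | undefined): void => {
    if (value) out[name] = value;
  };

  set("ANTHROPIC_API_KEY", env.ANTHROPIC_API_KEY);
  set("CF_AI_GATEWAY_MODEL", env.CF_AI_GATEWAY_MODEL);
  // Worker-facing name is mapped to the container-internal name.
  set("OPENCLAW_GATEWAY_TOKEN", env.MOLTBOT_GATEWAY_TOKEN);
  return out;
}

const envVars = buildEnvVarsSketch({ MOLTBOT_GATEWAY_TOKEN: "secret", ANTHROPIC_API_KEY: "" });
// Log names only, never values.
console.log("container env keys:", Object.keys(envVars));
```

In this example the empty `ANTHROPIC_API_KEY` is dropped, so the only key forwarded to the container is `OPENCLAW_GATEWAY_TOKEN`.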
diff --git a/.RALPH/problems.md b/.RALPH/problems.md
@@ -0,0 +1,98 @@
+# Recurring Problems – moltworker
+
+---
+
+## PROB-001 – `"x-api-key header is required"` on model calls
+
+**Date**: 2026-03-03  
+**Symptom**:
+```json
+{ "type": "error", "error": { "type": "authentication_error", "message": "x-api-key header is required" } }
+```
+**Root causes (ordered by likelihood)**:
+
+1. **`CF_AI_GATEWAY_MODEL` not set** — Without this var, the inline Node.js config patcher
+   in `start-openclaw.sh` never creates the `cf-ai-gw-{provider}` provider entry with `apiKey`.
+   Fix: `wrangler secret put CF_AI_GATEWAY_MODEL` (format: `{provider}/{model}`) → redeploy.
+
+2. **API key secret missing from deployed worker** — Key only exists in `.dev.vars`, not
+   set via `wrangler secret put`. Fix: `wrangler secret put ANTHROPIC_API_KEY` → redeploy.
+
+3. **Stale R2 config** — First deploy ran with no key; a keyless provider entry was written to R2.
+   Subsequent boots skip `openclaw onboard` and load the stale config. The inline Node patcher
+   (which always runs) should overwrite this — if it doesn't, check that `CF_AI_GATEWAY_MODEL`
+   is set so the patcher block is triggered.
+
+4. **Two provider entries — agent using the keyless one** — Config has both the stale keyless
+   `cloudflare-ai-gateway` provider AND the correctly keyed `cf-ai-gw-anthropic` provider,
+   but `agents.defaults.model.primary` points to the keyless one. Fix: verify
+   `/debug/container-config` and ensure `agents.defaults.model.primary` matches the entry
+   with a non-empty `apiKey`.
+
+5. **Deploy cancelled (Ctrl-C)** — Secret was set but deploy never completed. Old worker
+   version is still running. Fix: run `npm run deploy` again and let it complete.
+
+**Verification**: `GET /_admin/` is not relevant here. Request `/debug/container-config` and
+inspect `models.providers.{name}.apiKey`; it must be non-empty.
+
+---
+
+## PROB-002 – Stale R2 config not updated after adding new secrets
+
+**Date**: 2026-03-03  
+**Symptom**: After setting new Cloudflare secrets and redeploying, the container behaves as
+if the secrets are not there. `/debug/container-config` shows old values.  
+**Cause**: `start-openclaw.sh` only runs `openclaw onboard` if no config exists. R2-persisted
+config survives redeploy. Onboard is skipped; new secrets are never applied.  
+**Fix**: The inline Node patcher in `start-openclaw.sh` always runs and overwrites provider
+entries from the current env. Ensure the patcher logic covers the field you changed.
+If the patcher doesn't cover it, add it.
+
+---
+
+## PROB-003 – Deploy interrupted by Ctrl-C; new secrets not live
+
+**Date**: 2026-03-03  
+**Symptom**: Secret added via `wrangler secret put` but issue persists after what looks like
+a deploy. `wrangler tail` shows `Has ANTHROPIC_API_KEY: false`.  
+**Cause**: `npm run deploy` was interrupted. The old worker version is still serving.
+`wrangler secret put` succeeds independently of deploy; the worker must be redeployed to
+pick up the new secret.  
+**Fix**: `npm run deploy` — let it run to completion. Verify with `wrangler tail`.
+
+---
+
+## PROB-004 – rsync fails with "Input/output error" on R2
+
+**Date**: 2026-03-03  
+**Symptom**: R2 sync exits non-zero with timestamp-related errors.  
+**Cause**: R2 does not support setting file timestamps. `rsync -a` preserves timestamps
+and fails.  
+**Fix**: Use `rclone sync` with `--transfers=16 --fast-list --s3-no-check-bucket`.
+Never use `rsync -a` or `rsync --times` against R2.
+
+---
+
+## PROB-005 – WebSocket drops immediately after CF Access redirect
+
+**Date**: 2026-03-03  
+**Symptom**: User authenticates via CF Access and is redirected, but WebSocket connections
+fail with code 1006 or 4001.  
+**Cause**: CF Access redirects strip query parameters. `?token=` is lost.  
+**Fix**: In `src/index.ts` WS proxy handler, inject the token server-side before calling
+`sandbox.wsConnect()` — already implemented. Confirm `MOLTBOT_GATEWAY_TOKEN` is set as
+a Worker secret.
+
+---
+
+## PROB-006 – OpenClaw config validation fails after manual edits or patcher bugs
+
+**Date**: 2026-03-03  
+**Symptom**: Gateway fails to start; logs show config parsing/validation error from OpenClaw.  
+**Common causes**:
+- `agents.defaults.model` set to a bare string instead of `{ "primary": "provider/model" }`.
+- Provider entry missing `models` array or `api` field.
+- Channel config containing stale keys from an old backup.
+- Empty string written for `apiKey` (some OpenClaw versions reject this).
+
+**Fix**: Use `/debug/container-config` to inspect the config. Fix `start-openclaw.sh`
+patcher to not write the offending field, or write it correctly.
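The common causes listed for PROB-006 are mechanical enough to check in code before the gateway starts. The function below is a hedged sketch: `findConfigProblems` is a hypothetical helper, it covers only the causes that can be checked without knowing the channel schema (so stale channel keys are not checked), and it is not a substitute for OpenClaw's own validation.

```typescript
type Json = Record<string, unknown>;

// Treat anything that is not a plain object as an empty object.
function asObject(value: unknown): Json {
  return typeof value === "object" && value !== null && !Array.isArray(value)
    ? (value as Json)
    : {};
}

// Returns one human-readable message per detected problem.
function findConfigProblems(config: Json): string[] {
  const problems: string[] = [];

  const defaults = asObject(asObject(config.agents).defaults);
  if (typeof defaults.model === "string") {
    problems.push('agents.defaults.model is a bare string; expected { "primary": "provider/model" }');
  }

  const providers = asObject(asObject(config.models).providers);
  for (const name of Object.keys(providers)) {
    const entry = asObject(providers[name]);
    if (!Array.isArray(entry.models)) problems.push(`${name}: missing models array`);
    if (typeof entry.api !== "string") problems.push(`${name}: missing api field`);
    if (entry.apiKey === "") problems.push(`${name}: apiKey is an empty string`);
  }
  return problems;
}
```

A check like this could be run against the JSON returned by `/debug/container-config`, or at the end of the patcher, to name the offending field before OpenClaw rejects the config.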
diff --git a/.dev.vars.example b/.dev.vars.example
@@ -40,3 +40,8 @@ MOLTBOT_GATEWAY_TOKEN=dev-token-change-in-prod
 # CDP (Chrome DevTools Protocol) configuration for browser automation
 # CDP_SECRET=shared-secret-for-cdp-auth
 # WORKER_URL=https://your-worker.example.com
+
+# Trading bridge (optional)
+# TRADING_ENABLED=true
+# TRADE_BRIDGE_URL=https://trade-bridge.internal
+# TRADE_BRIDGE_HMAC_SECRET=replace-with-shared-secret
diff --git a/Dockerfile b/Dockerfile
@@ -3,6 +3,7 @@ FROM docker.io/cloudflare/sandbox:0.7.0
 # Install Node.js 22 (required by OpenClaw) and rclone (for R2 persistence)
 # The base image has Node 20, we need to replace it with Node 22
 # Using direct binary download for reliability
+
 ENV NODE_VERSION=22.13.1
 RUN ARCH="$(dpkg --print-architecture)" \
     && case "${ARCH}" in \

diff --git a/README.md b/README.md
@@ -181,6 +181,8 @@ https://your-worker.workers.dev/?token=YOUR_TOKEN
 wss://your-worker.workers.dev/ws?token=YOUR_TOKEN
 ```
 
+
+
 **Note:** Even with a valid token, new devices still require approval via the admin UI at `/_admin/` (see Device Pairing above).
 
 For local development only, set `DEV_MODE=true` in `.dev.vars` to skip Cloudflare Access authentication and enable `allowInsecureAuth` (bypasses device pairing entirely).
@@ -438,6 +440,55 @@ The previous `AI_GATEWAY_API_KEY` + `AI_GATEWAY_BASE_URL` approach is still supp
 | `SLACK_APP_TOKEN` | No | Slack app token |
 | `CDP_SECRET` | No | Shared secret for CDP endpoint authentication (see [Browser Automation](#optional-browser-automation-cdp)) |
 | `WORKER_URL` | No | Public URL of the worker (required for CDP) |
+| `TRADING_ENABLED` | No | Set to `true` to enable admin trading controls that call trade-bridge |
+| `TRADE_BRIDGE_URL` | No | Base URL for the trade-bridge service (e.g. private tunnel URL) |
+| `TRADE_BRIDGE_HMAC_SECRET` | No | Shared HMAC secret used to sign outbound requests to trade-bridge |
+
+
+## Trade Bridge Integration
+
+`moltworker` never talks to exchange APIs directly. Instead, the admin routes call an external `trade-bridge` service that is responsible for risk checks and Freqtrade execution.
+
+### Connection Flow
+
+1. Operator calls a protected admin endpoint in this worker (Cloudflare Access auth already enforced for `/api/admin/*`).
+2. Worker checks feature/config gates:
+   - `TRADING_ENABLED` must be `true`
+   - `TRADE_BRIDGE_URL` and `TRADE_BRIDGE_HMAC_SECRET` must be set
+3. Worker signs the outbound request with HMAC-SHA256 using the canonical string:
+   - `{timestamp}.{nonce}.{method}.{path}.{jsonBody}`
+4. Worker sends request to `TRADE_BRIDGE_URL` with these headers:
+   - `X-Molt-Timestamp`
+   - `X-Molt-Nonce`
+   - `X-Molt-Signature`
+   - `X-Molt-Skew-Ms`
+5. `trade-bridge` validates signature + timestamp + nonce replay protection before executing anything.
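Steps 3 and 4 of the connection flow can be sketched with the Web Crypto API that Workers expose. Several details below are assumptions, since the README does not specify them: the signature is hex-encoded, the timestamp is milliseconds since the epoch, and the nonce is a UUID. `canonicalString` and `signRequest` are hypothetical names, and `X-Molt-Skew-Ms` is omitted because its semantics are not described here.

```typescript
interface SignedHeaders {
  "X-Molt-Timestamp": string;
  "X-Molt-Nonce": string;
  "X-Molt-Signature": string;
}

// Canonical string from step 3: {timestamp}.{nonce}.{method}.{path}.{jsonBody}
function canonicalString(
  timestamp: string,
  nonce: string,
  method: string,
  path: string,
  jsonBody: string,
): string {
  return [timestamp, nonce, method, path, jsonBody].join(".");
}

function toHex(buffer: ArrayBuffer): string {
  return Array.from(new Uint8Array(buffer))
    .map((b) => b.toString(16).padStart(2, "0"))
    .join("");
}

// Signs the canonical string with HMAC-SHA256 and returns the request headers.
async function signRequest(
  secret: string,
  method: string,
  path: string,
  jsonBody: string,
  timestamp: string = Date.now().toString(), // assumed unit: milliseconds
  nonce: string = crypto.randomUUID(), // assumed nonce format
): Promise<SignedHeaders> {
  const encoder = new TextEncoder();
  const key = await crypto.subtle.importKey(
    "raw",
    encoder.encode(secret),
    { name: "HMAC", hash: "SHA-256" },
    false,
    ["sign"],
  );
  const message = canonicalString(timestamp, nonce, method, path, jsonBody);
  const mac = await crypto.subtle.sign("HMAC", key, encoder.encode(message));
  return {
    "X-Molt-Timestamp": timestamp,
    "X-Molt-Nonce": nonce,
    "X-Molt-Signature": toHex(mac), // assumed encoding: lowercase hex
  };
}
```

The body must be serialized once and the same string used both for signing and for sending; re-serializing the JSON on either side can reorder keys or change whitespace and invalidate the signature.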
+
+### Admin API -> Trade Bridge API Mapping
+
+| Moltworker endpoint | Bridge endpoint | Purpose |
+|---|---|---|
+| `POST /api/admin/trading/signal` | `POST /signals` | Submit a signed trading signal payload (for example, for `TON/USDT`). |
+| `GET /api/admin/trading/status` | `GET /status` | Read bridge/trading mode and health status. |
+| `POST /api/admin/trading/pause` | `POST /pause` | Pause new trade execution. |
+| `POST /api/admin/trading/kill-switch` | `POST /kill-switch` | Trigger global emergency stop. |
+
+### Example Signal Request
+
+```json
+{
+  "symbol": "TON/USDT",
+  "action": "buy",
+  "strategy": "manual-test",
+  "notional": 25
+}
+```
+
+### Deployment Notes
+
+- Keep `TRADE_BRIDGE_URL` private (Cloudflare Tunnel / WireGuard / private network).
+- Keep `TRADE_BRIDGE_HMAC_SECRET` unique per environment (`local`, `staging`, `prod`).
+- Leave `TRADING_ENABLED` unset or `false` by default; enable only where intended.
 
 ## Security Considerations