collective-memory-repo/shared/long-term/projects/warp-acp-integration.md

# Warp ACP Integration

## Purpose

This document defines the long-term design for routing ACP harnesses through Warp-managed cloud providers. Phase 1 standardizes `opencode`; future phases may extend the same pattern to `gemini cli` and `claude code`.

## Scope

- Control plane: OpenClaw on `mac-5`
- Execution nodes: `mac-6`, `mac-7`
- Current phase: `opencode`
- Future candidates: `gemini cli`, `claude code`

## Core Design Principles

1. **Provider identity and model identity are different things.**
   - Model names can repeat across providers.
   - Environment variables and agent ids must therefore encode the **provider**, not just the model.
2. **Do not rely on `~/.zshrc` for ACP runtime secrets.**
   - ACP / acpx / harness execution may not inherit interactive shell startup files.
   - Secrets should be loaded explicitly.
3. **Secrets and routing rules should be separated.**
   - Secrets live in `~/.openclaw/.env`.
   - Routing, aliases, and model selection live in ACP config, wrapper scripts, and documented policy.
4. **Wrapper scripts are the stable injection point.**
   - Wrapper scripts should explicitly load `~/.openclaw/.env`, validate required variables, and then exec the target harness.
5. **Fallback must be designed up front.**
   - Cloud provider quota exhaustion and rate limiting are expected operating conditions, not edge cases.

## Secret Storage Standard

Primary local secret file:

- `~/.openclaw/.env`

Node-local requirement:

- Any machine that may locally execute a Warp-backed ACP harness must have its own local `~/.openclaw/.env`.
- For the current cluster plan, this means `mac-5`, `mac-6`, and `mac-7` should all be configured.
- Do not assume secrets from `mac-5` will automatically be available when a harness actually runs on `mac-6` or `mac-7`.

Recommended format:

```bash
WARP_INFINI_API_KEY=
WARP_INFINI_BASE_URL=https://cloud.infini-ai.com/maas/coding/v1
WARP_CKIMI_API_KEY=
WARP_CKIMI_BASE_URL=
```

Notes:

- Use plain `KEY=value` lines.
- Do not store secrets in `~/.openclaw/openclaw.json`, `~/.acpx/config.json`, or long-term memory files.
- Restrict file permissions: `chmod 600 ~/.openclaw/.env`.
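
The permission note above can be checked mechanically before a wrapper trusts the file. The following is a minimal sketch, not an established part of the setup: the helper name `env_file_is_private` is hypothetical, and the check assumes either GNU `stat` (`-c '%a'`) or BSD/macOS `stat` (`-f '%Lp'`) is available.

```bash
# Hypothetical helper: succeed only if the file exists with mode 600.
env_file_is_private() {
  _file=$1
  [ -f "$_file" ] || { echo "missing: $_file" >&2; return 1; }
  # GNU stat uses -c '%a'; BSD/macOS stat uses -f '%Lp'.
  _mode=$(stat -c '%a' "$_file" 2>/dev/null || stat -f '%Lp' "$_file")
  [ "$_mode" = "600" ] || { echo "mode $_mode on $_file, expected 600" >&2; return 1; }
}
```

A wrapper could call `env_file_is_private "$HOME/.openclaw/.env" || exit 1` before sourcing the file.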

## Environment Variable Naming Rule

Use the pattern:

- `WARP_<PROVIDER>_API_KEY`
- `WARP_<PROVIDER>_BASE_URL`

Examples:

- `WARP_INFINI_API_KEY`
- `WARP_INFINI_BASE_URL`
- `WARP_CKIMI_API_KEY`
- `WARP_CKIMI_BASE_URL`

Do **not** name variables by model alone, because identical model names may exist on multiple providers.
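
As an illustration of the naming rule, the sketch below derives the variable name from a provider id and looks up its value. Both helper names (`warp_var_name`, `warp_var_value`) are hypothetical, and the sketch assumes provider ids are lowercase alphanumeric strings such as `infini` or `ckimi`.

```bash
# Hypothetical helper: build a variable name from a provider id and a suffix.
# $1 = provider id (e.g. infini), $2 = suffix (API_KEY or BASE_URL)
warp_var_name() {
  printf 'WARP_%s_%s\n' "$(printf '%s' "$1" | tr '[:lower:]' '[:upper:]')" "$2"
}

# Hypothetical helper: print the value of a provider variable, or fail.
warp_var_value() {
  # Validate both inputs first so the eval below only sees safe names.
  case $1 in ''|*[!a-z0-9]*) echo "bad provider id: $1" >&2; return 1 ;; esac
  case $2 in API_KEY|BASE_URL) ;; *) echo "bad suffix: $2" >&2; return 1 ;; esac
  _name=$(warp_var_name "$1" "$2")
  eval "_value=\${$_name:-}"
  [ -n "$_value" ] || { echo "$_name is not set" >&2; return 1; }
  printf '%s\n' "$_value"
}
```

For example, `warp_var_name infini API_KEY` prints `WARP_INFINI_API_KEY`.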

## Agent Naming Standard

For Warp-backed ACP harnesses, use:

- `<harness>-warp-<provider>-<model-family>`

Phase 1 examples for `opencode`:

- `opencode-warp-infini-kimi`
- `opencode-warp-infini-minimax`
- `opencode-warp-infini-glm`

Future examples:

- `gemini-warp-infini-kimi`
- `claude-warp-ckimi-kimi`
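
The naming pattern can be composed and split with plain shell parameter expansion. This is a sketch under assumptions: the helper names are hypothetical, and the split assumes the harness and provider parts contain no `-warp-` or `-` respectively, which holds for the examples listed above.

```bash
# Hypothetical helper: compose an agent id from harness, provider, model family.
warp_agent_id() {
  printf '%s-warp-%s-%s\n' "$1" "$2" "$3"
}

# Hypothetical helper: split an agent id into "harness provider model-family".
# Assumes the provider id itself contains no hyphen.
warp_agent_parts() {
  case $1 in
    ?*-warp-?*-?*) ;;
    *) echo "not a Warp agent id: $1" >&2; return 1 ;;
  esac
  _rest=${1#*-warp-}   # everything after the first "-warp-"
  printf '%s %s %s\n' "${1%%-warp-*}" "${_rest%%-*}" "${_rest#*-}"
}
```

For example, `warp_agent_parts opencode-warp-infini-kimi` prints `opencode infini kimi`.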

## Current Default Model Policy (2026-03-18)

The cluster no longer uses a single "default to each machine's own local model" rule for `opencode` ACP.

Current node-specific default policy:

- `mac-5`: default `opencode` model is `opencode/minimax-m2.5-free`
- `mac-6`: default `opencode` model is `vllm/Qwen3.5-27B` via `http://100.64.0.5:8000/v1`
- `mac-7`: default `opencode` model is `vllm/Qwen3.5-27B` via `http://100.64.0.5:8000/v1`

Operational meaning:

- `mac-5` prefers the already-validated free cloud minimax route for daily ACP stability.
- `mac-6` and `mac-7` prefer the shared local vLLM endpoint instead of their previous per-node local `oMLX` default for `opencode` ACP.
- This rule is specific to current `opencode` defaults; it does not invalidate separate worker/subagent topology docs.

Observed validation status:

- `mac-5`: direct `opencode` and ACP minimal tests succeeded with `opencode/minimax-m2.5-free`
- `mac-6`: ACP minimal test succeeded with `vllm/Qwen3.5-27B`
- `mac-7`: ACP minimal test succeeded with `vllm/Qwen3.5-27B`
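
The node-specific defaults above can be expressed as a small lookup. This is only an illustrative sketch: the helper name `default_opencode_model` is hypothetical, and it assumes the node name passed in matches the names used in this document (`mac-5`, `mac-6`, `mac-7`).

```bash
# Hypothetical helper: print the default opencode model for a node name.
default_opencode_model() {
  case $1 in
    mac-5)       echo "opencode/minimax-m2.5-free" ;;
    mac-6|mac-7) echo "vllm/Qwen3.5-27B" ;;   # served via http://100.64.0.5:8000/v1
    *)           echo "no default opencode model documented for node: $1" >&2; return 1 ;;
  esac
}
```

Failing on unknown nodes, rather than guessing a default, keeps the policy explicit when a new machine joins the cluster.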

## Configuration Layer Responsibilities

### OpenClaw

Responsible for:

- Declaring which ACP agent ids may be invoked
- High-level routing policy

Key file:

- `~/.openclaw/openclaw.json`

### acpx

Responsible for:

- Mapping ACP `agentId` to a concrete command

Key file:

- `~/.acpx/config.json`

### Wrapper scripts

Responsible for:

- Loading `~/.openclaw/.env`
- Validating provider secrets / URLs
- Pinning provider and model selection
- Launching the harness ACP command

Suggested location:

- `~/.local/bin/`

### Harness config (`opencode` in phase 1)

Responsible for:

- Harness-specific provider usage and model invocation details

Key file:

- `~/.config/opencode/opencode.json`

## Wrapper Contract

Every Warp-backed ACP wrapper should:

1. Load `~/.openclaw/.env`
2. Validate that required variables exist
3. Select the target provider
4. Pin the intended model
5. Launch the harness ACP entrypoint
6. Fail fast with a readable error if env/config is missing

Pseudo-flow:

```bash
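#!/usr/bin/env bash
set -euo pipefail

# Load secrets explicitly; set -a exports every variable the file assigns.
set -a
source ~/.openclaw/.env
set +a

# Validate required variables (infini shown); fail fast with a readable error.
: "${WARP_INFINI_API_KEY:?missing in ~/.openclaw/.env}"
: "${WARP_INFINI_BASE_URL:?missing in ~/.openclaw/.env}"

# Pin the model here (harness-specific; not shown).
exec opencode-ai acp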
```
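
Steps 1, 2, and 6 of the contract can also be factored into a reusable function that reports every missing variable at once instead of stopping at the first. This is a sketch only: the helper name `load_and_require` is hypothetical, and the variable names passed to it are assumed to be trusted constants written in the wrapper itself.

```bash
# Hypothetical helper: load an env file, then report every missing variable.
# $1 = env file path, remaining args = required variable names (trusted constants).
load_and_require() {
  _env_file=$1; shift
  [ -r "$_env_file" ] || { echo "cannot read $_env_file" >&2; return 1; }
  set -a            # export everything the file assigns
  . "$_env_file"
  set +a
  _missing=""
  for _name in "$@"; do
    eval "_value=\${$_name:-}"
    [ -n "$_value" ] || _missing="$_missing $_name"
  done
  [ -z "$_missing" ] || { echo "missing in $_env_file:$_missing" >&2; return 1; }
}
```

A wrapper for the `infini` provider might call `load_and_require "$HOME/.openclaw/.env" WARP_INFINI_API_KEY WARP_INFINI_BASE_URL || exit 1` before launching the harness.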

## Fallback Policy

### Why fallback exists

Warp-backed cloud providers may fail due to:

- 5-hour quota exhaustion
- weekly quota exhaustion
- rate limiting / throttling
- transient upstream 5xx
- model retirement or temporary unavailability
- provider auth / billing issues

These are normal operational conditions and must be documented as first-class routing rules.
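
To treat these conditions as routing rules, a wrapper or operator script needs some way to tell provider failures apart from task failures. The sketch below classifies by HTTP status code only. That is an assumption: this document does not specify how each provider reports quota exhaustion, and signals such as model retirement may need provider-specific checks. The helper name `classify_failure` is hypothetical.

```bash
# Hypothetical helper: classify a failure from an HTTP status code.
# Prints "provider" (a routing event) or "task" (not a routing event).
classify_failure() {
  case $1 in
    429)         echo provider ;;  # rate limiting / throttling (Too Many Requests)
    401|402|403) echo provider ;;  # auth or billing problems
    5[0-9][0-9]) echo provider ;;  # transient upstream 5xx
    *)           echo task ;;
  esac
}
```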

### Fallback priority

When a primary provider/model is unavailable, use this order:

1. **Same model (or model family), different provider**
2. **Same provider, adjacent model**
3. **Different provider, adjacent model**

This preserves behavior consistency as much as possible before changing model family.
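
The priority order can be read as a ranking of candidate routes against the primary. The following sketch assigns a tier number to a candidate; the helper name `fallback_tier` is hypothetical, and "adjacent model" is simplified here to "any other model family", since this document does not define adjacency more precisely.

```bash
# Hypothetical helper: rank a candidate route against the primary route.
# Args: primary provider, primary family, candidate provider, candidate family.
fallback_tier() {
  if [ "$1" = "$3" ] && [ "$2" = "$4" ]; then
    echo 0   # same route as the primary, not a fallback
  elif [ "$2" = "$4" ]; then
    echo 1   # same model, different provider
  elif [ "$1" = "$3" ]; then
    echo 2   # same provider, adjacent model
  else
    echo 3   # different provider, adjacent model
  fi
}
```

Sorting candidates by this tier, lowest first, reproduces the order listed above.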

### Example fallback chain

For `opencode-warp-infini-kimi`:

1. primary: `infini / kimi-k2.5`
2. fallback-1: `ckimi / kimi-for-coding`
3. fallback-2: `infini / glm-5`
4. fallback-3: `infini / minimax-m2.5`

For `opencode-warp-ckimi-kimi`:

1. primary: `ckimi / kimi-for-coding`
2. fallback-1: `infini / kimi-k2.5`
3. fallback-2: `infini / glm-5`
4. fallback-3: `infini / minimax-m2.5`

Exact fallback order should be documented per agent as providers are added.
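
A documented chain can be stored as an ordered list and walked one step at a time. The sketch below encodes the `opencode-warp-infini-kimi` chain as a space-separated string of `provider/model` routes; that encoding and the names `INFINI_KIMI_CHAIN` and `next_route` are assumptions for illustration.

```bash
# Chain for opencode-warp-infini-kimi, in priority order (from the list above).
INFINI_KIMI_CHAIN="infini/kimi-k2.5 ckimi/kimi-for-coding infini/glm-5 infini/minimax-m2.5"

# Hypothetical helper: print the route that follows $1 in the chain $2.
# Returns 1 if $1 is the last entry or is not in the chain.
next_route() {
  _found=0
  # $2 is deliberately unquoted so the chain splits on whitespace.
  for _route in $2; do
    if [ "$_found" = 1 ]; then
      echo "$_route"
      return 0
    fi
    [ "$_route" = "$1" ] && _found=1
  done
  return 1
}
```

For example, `next_route infini/kimi-k2.5 "$INFINI_KIMI_CHAIN"` prints `ckimi/kimi-for-coding`, and the call fails once the chain is exhausted.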

### ACP continuation rule after provider limit / request failure

When an ACP task fails mid-run because a provider is rate-limited, quota-exhausted, or otherwise blocking requests, the system should not simply restart the task from scratch.

Use this continuation rule:

1. detect that the failure is provider/request related rather than task-logic related
2. produce a concise task-state summary of what had already been completed, what failed, and what remains
3. start a new `opencode` ACP run using the next Warp fallback agent
4. pass the summary into the new run so the replacement agent can continue instead of redoing everything from scratch

The summary should include, when available:

- original task goal
- completed steps
- files changed or attempted
- command outputs or relevant error lines
- exact provider failure signal (for example rate limit / quota exhaustion)
- remaining work

This behavior is part of the Warp fallback design, especially for long-running coding tasks where provider limits may interrupt execution mid-stream.
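
The summary fields listed above can be rendered into a small markdown document before the replacement run starts. This is a sketch under assumptions: the helper name `render_task_summary` and the heading layout are hypothetical, and how the summary is handed to the new ACP run is left open because this document does not specify it.

```bash
# Hypothetical helper: print a task-state summary as markdown.
# Args: goal, completed steps, files touched, error lines,
#       provider failure signal, remaining work.
# Any argument that is omitted or empty is rendered as "not available".
render_task_summary() {
  printf '## Original task goal\n%s\n\n' "${1:-not available}"
  printf '## Completed steps\n%s\n\n' "${2:-not available}"
  printf '## Files changed or attempted\n%s\n\n' "${3:-not available}"
  printf '## Relevant output or error lines\n%s\n\n' "${4:-not available}"
  printf '## Provider failure signal\n%s\n\n' "${5:-not available}"
  printf '## Remaining work\n%s\n' "${6:-not available}"
}
```

Redirecting the output to a file gives the operator a durable record of the handoff even if the replacement run also fails.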

## Operational Reminder

If runtime errors show quota exhaustion or provider-specific capacity limits during an ACP task, the operator should treat that as a routing event, not just a task failure. The next step is to move to the documented fallback provider/model path.

## Extension Path

After `opencode` stabilizes, the same naming / secret-loading / fallback framework should be reused for:

- `gemini cli`
- `claude code`

The framework should stay provider-centric and wrapper-based even as the harnesses differ.