# Warp ACP Integration

## Purpose

This document defines the long-term design for routing ACP harnesses through Warp-managed cloud providers. Phase 1 standardizes `opencode`; future phases may extend the same pattern to `gemini cli` and `claude code`.

## Scope

- Control plane: OpenClaw on `mac-5`
- Execution nodes: `mac-6`, `mac-7`
- Current phase: `opencode`
- Future candidates: `gemini cli`, `claude code`

## Core Design Principles

1. **Provider identity and model identity are different things.**
   - Model names can repeat across providers.
   - Environment variables and agent ids must therefore encode the **provider**, not just the model.
2. **Do not rely on `~/.zshrc` for ACP runtime secrets.**
   - ACP, acpx, and harness processes may not read interactive shell startup files, so variables exported there may never reach them.
   - Secrets should be loaded explicitly.
3. **Secrets and routing rules should be separated.**
   - Secrets live in `~/.openclaw/.env`.
   - Routing, aliases, and model selection live in ACP config, wrapper scripts, and documented policy.
4. **Wrapper scripts are the stable injection point.**
   - Wrapper scripts should explicitly load `~/.openclaw/.env`, validate required variables, and then exec the target harness.
5. **Fallback must be designed up front.**
   - Cloud provider quota exhaustion and rate limiting are expected operating conditions, not edge cases.
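
Principle 2 can be observed directly. The sketch below is an illustration, not part of the design: it runs a command with an emptied environment via `env -i` to approximate a launcher that never read the interactive shell's exports. The helper names `run_without_shell_env` and `probe_var` are hypothetical, and whether a given ACP launcher behaves this way is an assumption.

```bash
# Hypothetical helper: run a command with only PATH set, roughly what a
# harness sees when no shell startup file has been read on its behalf.
run_without_shell_env() {
  env -i PATH="$PATH" "$@"
}

# Hypothetical probe: print "set" or "unset" for the variable named in $1,
# as seen by a child process started through run_without_shell_env.
probe_var() {
  run_without_shell_env sh -c "[ -n \"\${$1:-}\" ] && echo set || echo unset"
}
```

Calling `probe_var WARP_INFINI_API_KEY` from a shell that exports the key still prints `unset`, which is the situation explicit loading is meant to avoid.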

## Secret Storage Standard

Primary local secret file:

- `~/.openclaw/.env`

Node-local requirement:

- Any machine that may locally execute a Warp-backed ACP harness must have its own local `~/.openclaw/.env`.
- For the current cluster plan, this means `mac-5`, `mac-6`, and `mac-7` should all be configured.
- Do not assume secrets from `mac-5` will automatically be available when a harness actually runs on `mac-6` or `mac-7`.
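
One way to audit the node-local requirement is to ask each node whether the file exists. This is a minimal sketch under stated assumptions: the node names are reachable as SSH hosts, and the helper name `nodes_missing_env` is hypothetical. The remote runner is passed as the first argument so the loop can be exercised without a network.

```bash
# Hypothetical helper: print the nodes that lack a local secret file.
# $1 is the remote runner (normally ssh); the remaining arguments are node names.
nodes_missing_env() {
  runner=$1
  shift
  for node in "$@"; do
    # The single-quoted test runs on the remote side, so $HOME is the node's own.
    # A node that cannot be reached is also reported, since the runner fails.
    if ! "$runner" "$node" 'test -f "$HOME/.openclaw/.env"' </dev/null; then
      echo "$node"
    fi
  done
}
```

For the current cluster plan the call would be `nodes_missing_env ssh mac-5 mac-6 mac-7`; empty output means every node has its own file.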

Recommended format:

```bash
WARP_INFINI_API_KEY=
WARP_INFINI_BASE_URL=https://cloud.infini-ai.com/maas/coding/v1
WARP_CKIMI_API_KEY=
WARP_CKIMI_BASE_URL=
```

Notes:

- Use plain `KEY=value` lines.
- Do not store secrets in `~/.openclaw/openclaw.json`, `~/.acpx/config.json`, or long-term memory files.
- Restrict file permissions: `chmod 600 ~/.openclaw/.env`.
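
The notes above can be checked mechanically. The following is a sketch, not an established tool: both helper names are hypothetical, and treating blank lines as acceptable in the secret file is an assumption.

```bash
# Hypothetical helper: succeed only if $1 is a regular file with mode exactly 600.
env_file_is_private() {
  # find prints the path only when the permission bits match exactly.
  [ -n "$(find "$1" -type f -perm 600 2>/dev/null)" ]
}

# Hypothetical helper: succeed only if every line is KEY=value or blank.
env_file_is_plain() {
  # grep -v selects lines that do NOT match; finding any such line means failure.
  [ -f "$1" ] && ! grep -Eqv '^([A-Za-z_][A-Za-z0-9_]*=.*)?$' "$1"
}
```

A line such as `export WARP_INFINI_API_KEY=...` fails the second check, which keeps the file readable by tools that do not interpret shell syntax.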

## Environment Variable Naming Rule

Use the pattern:

- `WARP_<PROVIDER>_API_KEY`
- `WARP_<PROVIDER>_BASE_URL`

Examples:

- `WARP_INFINI_API_KEY`
- `WARP_INFINI_BASE_URL`
- `WARP_CKIMI_API_KEY`
- `WARP_CKIMI_BASE_URL`

Do **not** name variables by model alone, because identical model names may exist on multiple providers.
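
Because the pattern is regular, a wrapper can derive variable names from a provider id instead of hard-coding them. A minimal sketch, with the helper name `warp_var_name` being hypothetical:

```bash
# Hypothetical helper: build WARP_<PROVIDER>_<SUFFIX> from a provider id and a suffix.
warp_var_name() {
  # Uppercase the provider id so "infini" becomes "INFINI".
  provider_upper=$(printf '%s' "$1" | tr '[:lower:]' '[:upper:]')
  printf 'WARP_%s_%s\n' "$provider_upper" "$2"
}
```

For example, `warp_var_name ckimi BASE_URL` prints `WARP_CKIMI_BASE_URL`.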

## Agent Naming Standard

For Warp-backed ACP harnesses, use:

- `<harness>-warp-<provider>-<model-family>`

Phase 1 examples for `opencode`:

- `opencode-warp-infini-kimi`
- `opencode-warp-infini-minimax`
- `opencode-warp-infini-glm`

Future examples:

- `gemini-warp-infini-kimi`
- `claude-warp-ckimi-kimi`
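
The naming standard can be composed and taken apart with plain parameter expansion. This sketch assumes provider ids never contain a hyphen (true of `infini` and `ckimi`); the helper names are hypothetical.

```bash
# Hypothetical helper: compose an agent id from harness, provider, and model family.
agent_id() {
  printf '%s-warp-%s-%s\n' "$1" "$2" "$3"
}

# Hypothetical helper: extract the provider segment from an agent id.
agent_provider() {
  # Drop everything up to and including "-warp-", then cut at the next hyphen.
  rest=${1#*-warp-}
  printf '%s\n' "${rest%%-*}"
}
```

For example, `agent_provider opencode-warp-infini-glm` prints `infini`, which a wrapper could then use to pick the matching `WARP_<PROVIDER>_*` variables.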

## Current Default Model Policy (2026-03-18)

The cluster no longer uses a single "default to each machine's own local model" rule for `opencode` ACP.

Current node-specific default policy:

- `mac-5`: default `opencode` model is `opencode/minimax-m2.5-free`
- `mac-6`: default `opencode` model is `vllm/Qwen3.5-27B` via `http://100.64.0.5:8000/v1`
- `mac-7`: default `opencode` model is `vllm/Qwen3.5-27B` via `http://100.64.0.5:8000/v1`

Operational meaning:

- `mac-5` prefers the already-validated free cloud minimax route for daily ACP stability.
- `mac-6` and `mac-7` prefer the shared local vLLM endpoint instead of their previous per-node local `oMLX` default for `opencode` ACP.
- This rule is specific to current `opencode` defaults; it does not invalidate separate worker/subagent topology docs.
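
The node-specific defaults listed above could be captured in one lookup so that scripts do not repeat them. A minimal sketch: the function name is hypothetical, and how the result is handed to `opencode` is deliberately left out.

```bash
# Hypothetical helper: print the documented default opencode model for a node.
default_opencode_model() {
  case "$1" in
    mac-5) echo "opencode/minimax-m2.5-free" ;;
    mac-6|mac-7) echo "vllm/Qwen3.5-27B" ;;
    *)
      echo "no default model documented for node: $1" >&2
      return 1
      ;;
  esac
}
```

A script might call `default_opencode_model "$(hostname -s)"`, assuming each machine's short hostname matches its node name.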

Observed validation status:

- `mac-5`: direct `opencode` and ACP minimal tests succeeded with `opencode/minimax-m2.5-free`
- `mac-6`: ACP minimal test succeeded with `vllm/Qwen3.5-27B`
- `mac-7`: ACP minimal test succeeded with `vllm/Qwen3.5-27B`

## Configuration Layer Responsibilities

### OpenClaw

Responsible for:

- Allowing ACP agent ids to be called
- High-level routing policy

Key file:

- `~/.openclaw/openclaw.json`

### acpx

Responsible for:

- Mapping ACP `agentId` to a concrete command

Key file:

- `~/.acpx/config.json`

### Wrapper scripts

Responsible for:

- Loading `~/.openclaw/.env`
- Validating provider secrets / URLs
- Pinning the provider and model selection
- Launching the harness ACP command

Suggested location:

- `~/.local/bin/`

### Harness config (`opencode` in phase 1)

Responsible for:

- Harness-specific provider usage and model invocation details

Key file:

- `~/.config/opencode/opencode.json`

## Wrapper Contract

Every Warp-backed ACP wrapper should:

1. Load `~/.openclaw/.env`
2. Validate that required variables exist
3. Select the target provider
4. Pin the intended model
5. Launch the harness ACP entrypoint
6. Fail fast with a readable error if env/config is missing

Pseudo-flow:

```bash
#!/usr/bin/env bash
set -euo pipefail

# Export every variable defined in the secret file, then stop auto-exporting.
set -a
source "$HOME/.openclaw/.env"
set +a

# validate WARP_<PROVIDER>_API_KEY and WARP_<PROVIDER>_BASE_URL
# select model
exec opencode-ai acp
```
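
The validation step in the pseudo-flow can be sketched as a reusable function. This is one possible shape under stated assumptions, not the wrapper's confirmed implementation: the name `require_warp_provider` is hypothetical, and it only checks that both variables are non-empty, not that the key or URL is valid.

```bash
# Hypothetical helper: fail with a readable message if either variable
# for the given provider id is missing or empty.
require_warp_provider() {
  # Reject ids that are empty or contain characters unsafe to pass to eval.
  case "$1" in
    ''|*[!A-Za-z0-9_]*)
      echo "invalid provider id: '$1'" >&2
      return 2
      ;;
  esac
  provider_upper=$(printf '%s' "$1" | tr '[:lower:]' '[:upper:]')
  rc=0
  for suffix in API_KEY BASE_URL; do
    name="WARP_${provider_upper}_${suffix}"
    # Indirect lookup: read the value of the variable whose name is in $name.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing $name (expected in ~/.openclaw/.env)" >&2
      rc=1
    fi
  done
  return "$rc"
}
```

In a wrapper, `require_warp_provider infini || exit 1` placed after the `set +a` line would stop the launch before the harness starts, reporting every missing variable in one pass.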

## Fallback Policy

### Why fallback exists

Warp-backed cloud providers may fail due to:

- 5-hour quota exhaustion
- weekly quota exhaustion
- rate limiting / throttling
- transient upstream 5xx
- model retirement or temporary unavailability
- provider auth / billing issues

These are normal operational conditions and must be documented as first-class routing rules.

### Fallback priority

When a primary provider/model is unavailable, use this order:

1. **Same model, different provider**
2. **Same provider, adjacent model**
3. **Different provider, adjacent model**

This order keeps behavior as consistent as possible before switching to a different model family.

### Example fallback chain

For `opencode-warp-infini-kimi`:

1. primary: `infini / kimi-k2.5`
2. fallback-1: `ckimi / kimi-for-coding`
3. fallback-2: `infini / glm-5`
4. fallback-3: `infini / minimax-m2.5`

For `opencode-warp-ckimi-kimi-for-coding`:

1. primary: `ckimi / kimi-for-coding`
2. fallback-1: `infini / kimi-k2.5`
3. fallback-2: `infini / glm-5`
4. fallback-3: `infini / minimax-m2.5`

Exact fallback order should be documented per agent as providers are added.
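
The two example chains can be encoded as data so that the next route is looked up rather than remembered. A minimal sketch: the function names are hypothetical, and the compact `provider/model` spelling (no spaces) is a choice made here for easier parsing.

```bash
# Chains copied from the examples above; one provider/model entry per line, primary first.
fallback_chain() {
  case "$1" in
    opencode-warp-infini-kimi)
      printf '%s\n' infini/kimi-k2.5 ckimi/kimi-for-coding infini/glm-5 infini/minimax-m2.5
      ;;
    opencode-warp-ckimi-kimi-for-coding)
      printf '%s\n' ckimi/kimi-for-coding infini/kimi-k2.5 infini/glm-5 infini/minimax-m2.5
      ;;
    *)
      echo "no fallback chain documented for agent: $1" >&2
      return 1
      ;;
  esac
}

# Print the entry that follows $2 in the chain for agent $1.
# Fails if $2 is the last entry or does not appear in the chain.
next_fallback() {
  chain=$(fallback_chain "$1") || return 1
  printf '%s\n' "$chain" | awk -v cur="$2" '
    found { print; hit = 1; exit }
    $0 == cur { found = 1 }
    END { exit (hit ? 0 : 1) }'
}
```

For example, `next_fallback opencode-warp-infini-kimi infini/kimi-k2.5` prints `ckimi/kimi-for-coding`; a non-zero exit status signals that the chain is exhausted.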

### ACP continuation rule after provider limit / request failure

When an ACP task fails mid-run because a provider is rate-limited, quota-exhausted, or otherwise blocking requests, the system should not simply restart from scratch.

Use this continuation rule:

1. detect that the failure is provider/request related rather than task-logic related
2. produce a concise task-state summary of what had already been completed, what failed, and what remains
3. start a new `opencode` ACP run using the next Warp fallback agent
4. pass the summary into the new run so the replacement agent can continue instead of redoing everything from scratch
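
Step 1 needs some way to tell provider failures from task-logic failures. The sketch below matches on error text; the function name is hypothetical and the patterns are illustrative guesses, not signals confirmed for any provider, so they would need tuning against real error output.

```bash
# Hypothetical classifier: succeed when an error message looks provider/request related.
is_provider_failure() {
  # Lowercase the message so matching is case-insensitive.
  msg=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  case "$msg" in
    *"rate limit"*|*"too many requests"*|*quota*|*throttl*) return 0 ;;
    *" 429"*|*" 500"*|*" 502"*|*" 503"*|*" 504"*) return 0 ;;
    *unauthorized*|*billing*) return 0 ;;
    *) return 1 ;;
  esac
}
```

Anything the classifier does not recognize is treated as a task-logic failure, which errs on the side of not rerouting work that would fail again on another provider.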

The summary should include, when available:

- original task goal
- completed steps
- files changed or attempted
- command outputs or relevant error lines
- exact provider failure signal (for example rate limit / quota exhaustion)
- remaining work

This behavior is part of the Warp fallback design, especially for long-running coding tasks where provider limits may interrupt execution mid-stream.
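
The summary fields above map naturally onto a fixed template. A minimal sketch: the function name and the placeholder wording for missing fields are hypothetical, and how the text is handed to the replacement run depends on the ACP client and is not shown.

```bash
# Hypothetical helper: render a handoff summary from six positional arguments:
# goal, completed steps, files, relevant output, failure signal, remaining work.
build_task_summary() {
  cat <<EOF
Task handoff summary
- Original task goal: ${1:-unknown}
- Completed steps: ${2:-none recorded}
- Files changed or attempted: ${3:-none recorded}
- Relevant output or error lines: ${4:-none recorded}
- Provider failure signal: ${5:-unknown}
- Remaining work: ${6:-unknown}
EOF
}
```

Keeping the field order fixed means the replacement agent always finds the failure signal and the remaining work in the same place, even when some fields are empty.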

## Operational Reminder

If runtime errors show quota exhaustion or provider-specific capacity limits during an ACP task, the operator should treat that as a routing event, not just a task failure. The next step is to move to the documented fallback provider/model path.

## Extension Path

After `opencode` stabilizes, the same naming / secret-loading / fallback framework should be reused for:

- `gemini cli`
- `claude code`

The framework should stay provider-centric and wrapper-based even as the harnesses differ.