hotwa 537ee62363 Update opencode default provider to infini kimi-k2.5

2026-03-18 15:43:07 +08:00

Warp ACP Integration

Purpose

This document defines the long-term design for routing ACP harnesses through Warp-managed cloud providers. Phase 1 standardizes opencode; future phases may extend the same pattern to gemini cli and claude code.

Scope

Control plane: OpenClaw on mac-5
Execution nodes: mac-6, mac-7
Current phase: opencode
Future candidates: gemini cli, claude code

Core Design Principles

1. Provider identity and model identity are different things.
   - Model names can repeat across providers.
   - Environment variables and agent ids must therefore encode the provider, not just the model.
2. Do not rely on ~/.zshrc for ACP runtime secrets.
   - ACP / acpx / harness execution may not inherit interactive shell startup files.
   - Secrets should be loaded explicitly.
3. Secrets and routing rules should be separated.
   - Secrets live in ~/.openclaw/.env.
   - Routing / alias / model selection live in ACP config, wrapper scripts, and documented policy.
4. Wrapper scripts are the stable injection point.
   - Wrapper scripts should explicitly load ~/.openclaw/.env, validate required variables, and then exec the target harness.
5. Fallback must be designed up front.
   - Cloud provider quota exhaustion and rate limiting are expected operating conditions, not edge cases.

Secret Storage Standard

Primary local secret file:

~/.openclaw/.env

Node-local requirement:

Any machine that may locally execute a Warp-backed ACP harness must have its own local ~/.openclaw/.env.
For the current cluster plan, this means mac-5, mac-6, and mac-7 should all be configured.
Do not assume secrets from mac-5 will automatically be available when a harness actually runs on mac-6 or mac-7.

Recommended format:

WARP_INFINI_API_KEY=
WARP_INFINI_BASE_URL=https://cloud.infini-ai.com/maas/coding/v1
WARP_CKIMI_API_KEY=
WARP_CKIMI_BASE_URL=

Notes:

Use plain KEY=value lines.
Do not store secrets in ~/.openclaw/openclaw.json, ~/.acpx/config.json, or long-term memory files.
Restrict file permissions: chmod 600 ~/.openclaw/.env.
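
The notes above can be checked mechanically before a wrapper trusts the file. The following is a minimal sketch, not an existing OpenClaw or acpx tool: `check_env_file` is a hypothetical helper name, and treating `#` lines as comments is an assumption about how the file is written.

```shell
#!/bin/sh
# check_env_file PATH: hypothetical helper, not part of OpenClaw or acpx.
# Succeeds only if PATH exists, has mode 600, and contains nothing but
# blank lines, comment lines, and KEY=value lines.
check_env_file() {
    env_file=$1
    if [ ! -f "$env_file" ]; then
        echo "missing secret file: $env_file" >&2
        return 1
    fi
    # find prints the path only when the permission bits are exactly 600.
    if [ -z "$(find "$env_file" -prune -perm 600 -print)" ]; then
        echo "expected mode 600 on $env_file" >&2
        return 1
    fi
    # grep -v selects lines that are NOT blank, a comment, or KEY=value.
    if grep -Evq '^([[:space:]]*|#.*|[A-Za-z_][A-Za-z0-9_]*=.*)$' "$env_file"; then
        echo "malformed line in $env_file" >&2
        return 1
    fi
    return 0
}
```

A wrapper could call `check_env_file "$HOME/.openclaw/.env"` before sourcing the file and stop if it fails.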

Environment Variable Naming Rule

Use the pattern:

WARP_<PROVIDER>_API_KEY
WARP_<PROVIDER>_BASE_URL

Examples:

WARP_INFINI_API_KEY
WARP_INFINI_BASE_URL
WARP_CKIMI_API_KEY
WARP_CKIMI_BASE_URL

Do not name variables by model alone, because identical model names may exist on multiple providers.
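
As an illustration of the naming rule, the sketch below derives variable names from a provider id. `warp_var_name` is a hypothetical helper introduced here for illustration; only the `WARP_<PROVIDER>_<SUFFIX>` pattern comes from this document.

```shell
# warp_var_name PROVIDER SUFFIX: hypothetical helper that builds a variable
# name following the WARP_<PROVIDER>_<SUFFIX> pattern.
warp_var_name() {
    # Upper-case the provider id so "infini" and "INFINI" map to one name.
    provider=$(printf '%s' "$1" | tr '[:lower:]' '[:upper:]')
    printf 'WARP_%s_%s\n' "$provider" "$2"
}
```

Because the provider is part of the name, a model offered by both infini and ckimi resolves to two distinct key variables rather than colliding on one.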

Agent Naming Standard

For Warp-backed ACP harnesses, use:

<harness>-warp-<provider>-<model-family>

Phase 1 examples for opencode:

opencode-warp-infini-kimi
opencode-warp-infini-minimax
opencode-warp-infini-glm

Future examples:

gemini-warp-infini-kimi
claude-warp-ckimi-kimi
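
The naming pattern is regular enough to be parsed by a wrapper or a lint check. The sketch below is one possible way to do that; `parse_agent_id` is a hypothetical helper, and it assumes that provider ids never contain a hyphen (the model family may).

```shell
# parse_agent_id ID: hypothetical helper that splits an agent id of the form
# <harness>-warp-<provider>-<model-family> and prints the three parts
# separated by spaces. Assumption: provider ids contain no hyphen.
parse_agent_id() {
    id=$1
    case $id in
        *-warp-*-*) ;;
        *) echo "not a Warp agent id: $id" >&2; return 1 ;;
    esac
    harness=${id%%-warp-*}     # everything before the first "-warp-"
    rest=${id#*-warp-}         # everything after the first "-warp-"
    provider=${rest%%-*}       # up to the first hyphen
    family=${rest#*-}          # the remainder, hyphens allowed
    if [ -z "$harness" ] || [ -z "$provider" ] || [ -z "$family" ]; then
        echo "incomplete Warp agent id: $id" >&2
        return 1
    fi
    printf '%s %s %s\n' "$harness" "$provider" "$family"
}
```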

Current Default Model Policy (2026-03-18)

The cluster no longer uses a single "default to each machine's own local model" rule for opencode ACP.

Current node-specific default policy:

mac-5: default opencode model is infini/kimi-k2.5 via https://cloud.infini-ai.com/maas/coding/v1
mac-6: default opencode model is vllm/Qwen3.5-27B via http://100.64.0.5:8000/v1
mac-7: default opencode model is vllm/Qwen3.5-27B via http://100.64.0.5:8000/v1
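
For scripts that need this policy at run time, the node-specific defaults above can be written as a lookup. This is a sketch only: `default_opencode_target` is a hypothetical function name, and the authoritative setting remains each node's opencode configuration.

```shell
# default_opencode_target NODE: prints "<provider>/<model> <base-url>" for a
# node, mirroring the 2026-03-18 policy. The function name is hypothetical.
default_opencode_target() {
    case $1 in
        mac-5)
            echo "infini/kimi-k2.5 https://cloud.infini-ai.com/maas/coding/v1" ;;
        mac-6|mac-7)
            echo "vllm/Qwen3.5-27B http://100.64.0.5:8000/v1" ;;
        *)
            echo "no default opencode model documented for node: $1" >&2
            return 1 ;;
    esac
}
```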

Operational meaning:

mac-5 prefers the Infini coding endpoint with kimi-k2.5. The earlier ckimi / kimi-for-coding stopgap was removed because Kimi For Coding currently targets coding-agent products rather than this opencode path.
mac-6 and mac-7 prefer the shared local vLLM endpoint instead of their previous per-node local oMLX default for opencode ACP.
This rule is specific to current opencode defaults; it does not invalidate separate worker/subagent topology docs.

Observed validation status:

mac-5: direct opencode config validation succeeded with infini/kimi-k2.5 (opencode models infini returned infini/kimi-k2.5)
mac-6: ACP minimal test succeeded with vllm/Qwen3.5-27B
mac-7: ACP minimal test succeeded with vllm/Qwen3.5-27B

Configuration Layer Responsibilities

OpenClaw

Responsible for:

Allowing ACP agent ids to be called
High-level routing policy

Key file:

~/.openclaw/openclaw.json

acpx

Responsible for:

Mapping ACP agentId to a concrete command

Key file:

~/.acpx/config.json

Wrapper scripts

Responsible for:

Loading ~/.openclaw/.env
Validating provider secrets / URLs
Pinning provider and model selection
Launching the harness ACP command

Suggested location:

~/.local/bin/

Harness config (`opencode` in phase 1)

Responsible for:

Harness-specific provider usage and model invocation details

Key file:

~/.config/opencode/opencode.json

Wrapper Contract

Every Warp-backed ACP wrapper should:

Load ~/.openclaw/.env
Validate that required variables exist
Select the target provider
Pin the intended model
Launch the harness ACP entrypoint
Fail fast with a readable error if env/config is missing

Pseudo-flow:

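#!/usr/bin/env bash
set -euo pipefail

# Export every variable defined in the secrets file.
set -a
source "$HOME/.openclaw/.env"
set +a

# Fail fast if the provider's key or base URL is unset or empty (infini shown).
: "${WARP_INFINI_API_KEY:?WARP_INFINI_API_KEY is not set}"
: "${WARP_INFINI_BASE_URL:?WARP_INFINI_BASE_URL is not set}"

# Select the model here, then replace this shell with the harness.
exec opencode-ai acp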
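
The load and validate steps of the contract could be factored into reusable functions so that every wrapper fails the same way. The sketch below is written for POSIX sh and is an assumption-laden illustration: `load_warp_env`, `require_warp_provider`, and the error message wording are all hypothetical, and how the model is pinned is left out because it depends on the harness.

```shell
#!/bin/sh
# Hypothetical wrapper helpers; names and messages are illustrative only.

# load_warp_env [FILE]: export every KEY=value pair from the secrets file.
load_warp_env() {
    env_file=${1:-"$HOME/.openclaw/.env"}
    if [ ! -r "$env_file" ]; then
        echo "warp wrapper: cannot read $env_file" >&2
        return 1
    fi
    set -a          # auto-export everything assigned while sourcing
    . "$env_file"
    set +a
}

# require_warp_provider PROVIDER: fail unless both WARP_<PROVIDER>_API_KEY
# and WARP_<PROVIDER>_BASE_URL are set and non-empty.
require_warp_provider() {
    prefix="WARP_$(printf '%s' "$1" | tr '[:lower:]' '[:upper:]')"
    case $prefix in
        *[!A-Z0-9_]*) echo "warp wrapper: bad provider name: $1" >&2; return 1 ;;
    esac
    for suffix in API_KEY BASE_URL; do
        # Indirect lookup of the variable whose name was just built.
        eval "value=\${${prefix}_${suffix}:-}"
        if [ -z "$value" ]; then
            echo "warp wrapper: ${prefix}_${suffix} is missing or empty" >&2
            return 1
        fi
    done
}

# Intended use inside a wrapper (commented out so that sourcing this sketch
# has no side effects):
#   load_warp_env && require_warp_provider infini && exec opencode-ai acp
```

Keeping the provider as an argument means one pair of helpers can serve every `<harness>-warp-<provider>-<model-family>` wrapper.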

Fallback Policy

Why fallback exists

Warp-backed cloud providers may fail due to:

5-hour quota exhaustion
weekly quota exhaustion
rate limiting / throttling
transient upstream 5xx
model retirement or temporary unavailability
provider auth / billing issues

These are normal operational conditions and must be documented as first-class routing rules.

Fallback priority

When a primary provider/model is unavailable, use this order:

1. Same model, different provider
2. Same provider, adjacent model
3. Different provider, adjacent model

This preserves behavior consistency as much as possible before changing model family.
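
The priority order can be expressed as a small ranking function over "provider/model" pairs. This is a sketch under a loud assumption: the document does not define when two models count as the same or adjacent, so the code treats models as "the same" when the text before the first hyphen matches (kimi-k2.5 and kimi-for-coding both give kimi) and treats any other model as adjacent. `fallback_tier` is a hypothetical name.

```shell
# fallback_tier PRIMARY CANDIDATE: both arguments are "provider/model".
# Prints 1, 2 or 3 for the three priority tiers.
# Assumption: model family = model name up to its first hyphen.
fallback_tier() {
    p_provider=${1%%/*}; p_model=${1#*/}; p_family=${p_model%%-*}
    c_provider=${2%%/*}; c_model=${2#*/}; c_family=${c_model%%-*}
    if [ "$1" = "$2" ]; then
        echo "candidate equals primary: $1" >&2
        return 1
    elif [ "$p_family" = "$c_family" ] && [ "$p_provider" != "$c_provider" ]; then
        echo 1      # same model (family), different provider
    elif [ "$p_provider" = "$c_provider" ]; then
        echo 2      # same provider, adjacent model
    else
        echo 3      # different provider, adjacent model
    fi
}
```

Sorting candidates by this tier gives a starting order; the per-agent chains documented for each agent remain the source of truth.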

Example fallback chain

For opencode-warp-infini-kimi:

primary: infini / kimi-k2.5
fallback-1: ckimi / kimi-for-coding
fallback-2: infini / glm-5
fallback-3: infini / minimax-m2.5

For opencode-warp-ckimi-kimi-for-coding:

primary: ckimi / kimi-for-coding
fallback-1: infini / kimi-k2.5
fallback-2: infini / glm-5
fallback-3: infini / minimax-m2.5

Exact fallback order should be documented per agent as providers are added.
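
One way to make these chains usable by tooling is to store them as ordered lists and look up the successor of the entry that just failed. The sketch below hard-codes the two chains above; `fallback_chain` and `next_fallback` are hypothetical helper names, and a real deployment might keep the chains in a config file instead.

```shell
# fallback_chain AGENT: prints the documented chain, primary first.
fallback_chain() {
    case $1 in
        opencode-warp-infini-kimi)
            printf '%s\n' infini/kimi-k2.5 ckimi/kimi-for-coding \
                infini/glm-5 infini/minimax-m2.5 ;;
        opencode-warp-ckimi-kimi-for-coding)
            printf '%s\n' ckimi/kimi-for-coding infini/kimi-k2.5 \
                infini/glm-5 infini/minimax-m2.5 ;;
        *)
            echo "no fallback chain documented for agent: $1" >&2
            return 1 ;;
    esac
}

# next_fallback AGENT CURRENT: prints the entry after CURRENT in the chain,
# or fails when CURRENT is unknown or already the last entry.
next_fallback() {
    chain=$(fallback_chain "$1") || return 1
    found=0
    for entry in $chain; do
        if [ "$found" -eq 1 ]; then
            echo "$entry"
            return 0
        fi
        if [ "$entry" = "$2" ]; then
            found=1
        fi
    done
    echo "no further fallback after $2 for $1" >&2
    return 1
}
```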

ACP continuation rule after provider limit / request failure

When an ACP task fails mid-run because a provider is limited, rate-limited, quota-exhausted, or otherwise request-blocked, the system should not blindly restart the task.

Use this continuation rule:

1. Detect that the failure is provider/request related rather than task-logic related.
2. Produce a concise task-state summary of what had already been completed, what failed, and what remains.
3. Start a new opencode ACP run using the next Warp fallback agent.
4. Pass the summary into the new run so the replacement agent can continue instead of redoing everything from scratch.
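
The detection step could start as a simple text heuristic over the error output. The sketch below is only that: the patterns are illustrative guesses rather than signals taken from any specific provider, a bare three-digit match can misfire on unrelated numbers, and `is_provider_failure` is a hypothetical name.

```shell
# is_provider_failure TEXT: succeeds when TEXT looks like a provider or
# request failure (rate limit, quota, HTTP 429, HTTP 5xx). Heuristic only;
# tune the patterns against real error output before relying on it.
is_provider_failure() {
    printf '%s\n' "$1" | grep -Eiq \
        'rate[ -]?limit|quota|too many requests|(^|[^0-9])(429|5[0-9][0-9])([^0-9]|$)'
}
```

Anything this check rejects would be treated as a task-logic failure and would not trigger a provider switch.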

The summary should include, when available:

original task goal
completed steps
files changed or attempted
command outputs or relevant error lines
exact provider failure signal (for example rate limit / quota exhaustion)
remaining work

This behavior is part of the Warp fallback design, especially for long-running coding tasks where provider limits may interrupt execution mid-stream.

Operational Reminder

If runtime errors show quota exhaustion or provider-specific capacity limits during an ACP task, the operator should treat that as a routing event, not just a task failure. The next step is to move to the documented fallback provider/model path.

Extension Path

After opencode stabilizes, the same naming / secret-loading / fallback framework should be reused for:

gemini cli
claude code

The framework should stay provider-centric and wrapper-based even as the harnesses differ.
