openharmony-mlx/usage.md
2025-10-08 11:11:46 +08:00

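## Preparing model weights
The Metal backend requires weight files in a specific format. There are two ways to obtain them.
### Convert existing weights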
```bash
python gpt_oss/metal/scripts/create-local-model.py -s <model_dir> -d <output_file>
```
### Download pre-converted weights
```bash
huggingface-cli download openai/gpt-oss-120b --include "metal/*" --local-dir gpt-oss-120b/metal/
huggingface-cli download openai/gpt-oss-20b --include "metal/*" --local-dir gpt-oss-20b/metal/
```
Here, "Metal version" refers to the Metal backend implementation of the GPT-OSS models.
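Depending on how `--local-dir` combines with the `metal/*` include pattern, the downloaded checkpoint may land one directory deeper than expected. A minimal sketch for locating it, assuming the checkpoint file is called `model.bin` (the name used in the server command later in this guide); `find_checkpoints` is a hypothetical helper, not part of gpt_oss:

```python
from pathlib import Path


def find_checkpoints(root, filename="model.bin"):
    """Return every file called `filename` under `root`, shallowest first."""
    root = Path(root).expanduser()
    if not root.is_dir():
        return []
    matches = [p for p in root.rglob(filename) if p.is_file()]
    # Sort by depth so the most direct match comes first.
    return sorted(matches, key=lambda p: (len(p.parts), str(p)))


if __name__ == "__main__":
    # Assumed download location, taken from the commands above.
    for path in find_checkpoints("gpt-oss-20b/metal"):
        print(path)
```

Whatever path it prints is the value to pass to `--checkpoint` when starting the server.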
## Environment setup
Requires macOS on Apple Silicon.
1. Prepare the environment
```bash
# Install Xcode and complete the first-launch setup and checks
xcode-select --install
xcrun -find metal || echo "metal not found"
sudo xcode-select -s /Applications/Xcode.app/Contents/Developer
# Open Xcode and install the macOS SDK; if the command-line install fails, install it from the GUI
sudo xcodebuild -license accept
xcodebuild -runFirstLaunch
# Install the Metal shader toolchain
xcodebuild -downloadComponent MetalToolchain
# Verify the installation
xcrun --sdk macosx --find metal
xcrun --sdk macosx --show-sdk-path
# Create the virtual environment
micromamba create -n gptoss python=3.12 -y
micromamba activate gptoss
micromamba install pybind11 -c conda-forge -y
```
2. Run the CMake build
```bash
git clone https://github.com/hotwa/openharmony-mlx.git
cd openharmony-mlx
# Option A: let pip drive the CMake build and install
GPTOSS_BUILD_METAL=1 pip install -e ".[metal]"
# Option B: configure and build manually with CMake
cd gpt_oss/metal
mkdir -p build
cd build
# Point CMake at the pybind11 package installed in the active environment
export pybind11_DIR=$(python -c "import pybind11; print(pybind11.get_cmake_dir())")
cmake -S .. -B . \
-DCMAKE_BUILD_TYPE=Release \
-DGPTOSS_BUILD_PYTHON=ON \
-DPYBIND11_FINDPYTHON=ON \
-Dpybind11_DIR="$pybind11_DIR"
cmake --build . --config Release --parallel
# Equivalent direct call with the Makefile generator (stock macOS has no nproc):
# make -j"$(sysctl -n hw.ncpu)"
ctest --output-on-failure
```
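3. Metal shader compilation
CMake compiles the Metal source files automatically (see `CMakeLists.txt`, lines 16-28).
Each `.metal` file is compiled to an intermediate `.air` file, and the `.air` files are then linked into `default.metallib`.
4. Python extension module build
CMake builds a Python extension module named `_metal`.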
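The shader half of the build (step 3) can be reproduced by hand when debugging a toolchain problem. The sketch below only assembles the two-stage `xcrun` command lines: `metal -c` produces each `.air` file, then `metallib` links them. The source file names are placeholders, and the flags that the project's `CMakeLists.txt` actually passes are not shown in this guide, so treat this as an approximation of the pipeline rather than the project's exact build:

```python
from pathlib import Path


def metal_build_commands(sources, output="default.metallib", sdk="macosx"):
    """Return the xcrun command lines that compile .metal files into one metallib."""
    air_files = [str(Path(src).with_suffix(".air")) for src in sources]
    commands = [
        ["xcrun", "-sdk", sdk, "metal", "-c", str(src), "-o", air]
        for src, air in zip(sources, air_files)
    ]
    # Link all intermediate .air files into a single Metal library.
    commands.append(["xcrun", "-sdk", sdk, "metallib", *air_files, "-o", output])
    return commands


if __name__ == "__main__":
    # Placeholder file names; substitute the project's real .metal sources.
    for cmd in metal_build_commands(["example_a.metal", "example_b.metal"]):
        print(" ".join(cmd))
```

On a Mac with the Metal toolchain installed, each printed command can be run as is, or passed to `subprocess.run(cmd, check=True)`.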
## Installing the Python package
```bash
# Manual install
# Copy the extension module
cp _metal.so /path/to/your/python/site-packages/gpt_oss/metal/
# Copy the Metal library
cp default.metallib /path/to/your/python/site-packages/gpt_oss/metal/
```
```bash
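# Run from the gpt_oss repository root (not metal/build)
cd /path/to/gpt_oss
# Make sure pybind11 and Xcode are ready in this environment
export GPTOSS_BUILD_METAL=1
python -m pip install -e ".[metal]" # editable install: code changes take effect immediately
# Or a regular install
# python -m pip install ".[metal]"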
```
5. Verify that the Metal module is installed correctly
```bash
python -c "import gpt_oss.metal._metal; print('Metal module loaded successfully')"
```
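The one-liner above only proves that the import succeeds. A slightly more detailed check, sketched below, also reports where the module would be loaded from and whether a `default.metallib` sits next to it, which is where the manual copy commands earlier place it. That the runtime looks for the library in that directory is an assumption based on those commands, and `describe_module` is a hypothetical helper:

```python
import importlib.util
from pathlib import Path


def describe_module(name="gpt_oss.metal._metal", sibling="default.metallib"):
    """Report where `name` would be imported from and whether `sibling` sits beside it."""
    try:
        spec = importlib.util.find_spec(name)
    except ImportError:
        spec = None  # a parent package (e.g. gpt_oss) is missing or fails to import
    if spec is None or not spec.origin:
        return {"found": False, "origin": None, "sibling_present": False}
    origin = Path(spec.origin)
    return {
        "found": True,
        "origin": str(origin),
        "sibling_present": (origin.parent / sibling).exists(),
    }


if __name__ == "__main__":
    print(describe_module())
```

If `found` is true but `sibling_present` is false, the extension was probably installed without its Metal library.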
## Starting the server
Download the tokenizer encoding into the cache, then start the server:
```bash
mkdir -p ~/.cache/openai_harmony/
cd ~/.cache/openai_harmony/
wget https://openaipublic.blob.core.windows.net/encodings/o200k_base.tiktoken
export OPENAI_HARMONY_CACHE_DIR=~/.cache/openai_harmony/
chmod 755 ~/.cache/openai_harmony/
python -m gpt_oss.responses_api.serve --inference-backend metal --checkpoint /Volumes/long990max/gpustack_data/openai/gpt-oss-20b/metal/model.bin --host 0.0.0.0 --port 8080
```
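Loading a checkpoint can take a while, so a client started right after the command above may hit a closed port. A small readiness probe, assuming the server listens on the `--port 8080` given above; it only checks that the TCP port accepts connections, not that the model has finished loading:

```python
import socket
import time


def wait_for_port(host="127.0.0.1", port=8080, timeout=60.0, interval=0.5):
    """Return True once host:port accepts a TCP connection, False after `timeout` seconds."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Connection refused or timed out: retry until the deadline passes.
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

For example, `wait_for_port(port=8080, timeout=120)` blocks for up to two minutes and returns a boolean that a launch script can branch on.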
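## Cherry Studio configuration
Add a provider and choose `OpenAI-Response` as its type.
Then add a model with the following settings:
Model ID: gpt-oss-120b
Model name: gpt-oss-120b
Group name: gpt-oss
API address: http://localhost:8080
API key: none
Although requests name `gpt-oss-120b`, the model actually served is gpt-oss-20b. The backend hardcodes the model name as 120b, so requests must use `gpt-oss-120b`.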
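The same settings translate into a plain HTTP call, which is handy for checking the server without a GUI client. The sketch below assumes the server exposes an OpenAI-style `POST /v1/responses` endpoint that accepts `model` and `input` fields (the `/v1` prefix matches the `base_url` in the Codex configuration below). The reply is returned as raw JSON because its exact shape is not described in this guide:

```python
import json
import urllib.request


def build_request(prompt, base_url="http://localhost:8080", model="gpt-oss-120b"):
    """Build (but do not send) a Responses-style POST request."""
    body = json.dumps({"model": model, "input": prompt}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/responses",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request, timeout=300):
    """Send the request and return the decoded JSON reply."""
    with urllib.request.urlopen(request, timeout=timeout) as reply:
        return json.loads(reply.read().decode("utf-8"))


if __name__ == "__main__":
    req = build_request("Say hello in one sentence.")
    print(req.get_method(), req.full_url)
    # print(json.dumps(send(req), indent=2))  # uncomment once the server is running
```

Note that `model` stays `gpt-oss-120b` even when the 20b checkpoint is loaded, for the reason given above.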
## Using with Codex
```bash
vim .codex/config.toml
```
```toml
disable_response_storage = true
show_reasoning_content = true
model = "gpt-5-codex"
[model_providers.local]
name = "local"
base_url = "http://100.64.0.4:8080/v1"
wire_api = "responses"
include_apply_patch_tool = false
[profiles.oss]
model = "gpt-oss-120b"
model_provider = "local"
include_apply_patch_tool = false
[mcp_servers.web-mcp]
url = "https://web-mcp.koyeb.app/sse/04824d01-60c3-4f20-9340-65b60d3e8344"
# If authentication is required, add a bearer_token
# bearer_token = "your-token-here"
startup_timeout_sec = 60
tool_timeout_sec = 120
```