Transformers responses API (#1)

This commit is contained in:
Lysandre Debut
2025-08-05 19:02:16 +02:00
committed by GitHub
parent 0106ce5ba3
commit a601a63cdc
3 changed files with 59 additions and 0 deletions

View File

@@ -273,6 +273,7 @@ You can start this server with the following inference backends:
- `metal` — uses the metal implementation on Apple Silicon only
- `ollama` — uses the Ollama /api/generate API as a inference solution
- `vllm` — uses your installed vllm version to perform inference
- `transformers` — uses your installed transformers version to perform local inference
```bash
usage: python -m gpt_oss.responses_api.serve [-h] [--checkpoint FILE] [--port PORT] [--inference-backend BACKEND]