diff --git a/README.md b/README.md
index b8e8538..f7f515c 100644
--- a/README.md
+++ b/README.md
@@ -63,7 +63,12 @@ print(outputs[0]["generated_text"][-1])
 vLLM recommends using [`uv`](https://docs.astral.sh/uv/) for Python dependency management. You can use vLLM to spin up an OpenAI-compatible webserver. The following command will automatically download the model and start the server.
 
 ```bash
-uv run --with vllm vllm serve openai/gpt-oss-20b
+uv pip install --pre vllm==0.10.1+gptoss \
+    --extra-index-url https://wheels.vllm.ai/gpt-oss/ \
+    --extra-index-url https://download.pytorch.org/whl/nightly/cu128 \
+    --index-strategy unsafe-best-match
+
+vllm serve openai/gpt-oss-20b
 ```
 
 [Learn more about how to use gpt-oss with vLLM.](https://cookbook.openai.com/articles/gpt-oss/run-vllm)