vllm : Failed to infer device type
Running vLLM on a machine without a GPU typically triggers this error. In my case, applying only one of the settings below was not enough; I had to configure both:

1. Set the environment variable CUDA_VISIBLE_DEVICES to an empty string ("").
2. In the command line, pass --device cpu and remove --tensor-parallel-size. For example:

   python3 -m vllm.entrypoints.openai.api_server --port 8080 --model deepseek-ai/DeepSeek-R1 --device cpu --trust-remote-code --max-model-len 4096

References

https://docs.vllm.ai/en/latest/getting_started/troubleshooting.html#failed-to-infer-device-type
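The two fixes can be combined in one small shell snippet. This is a sketch, reusing the port, model name, and flags from the example command; it assumes vLLM is installed with CPU support:

```shell
# Hide all CUDA devices so vLLM does not try to probe for a GPU.
export CUDA_VISIBLE_DEVICES=""

# Start the OpenAI-compatible server on CPU.
# Note: --tensor-parallel-size is omitted, since it only applies to GPU setups.
python3 -m vllm.entrypoints.openai.api_server \
    --port 8080 \
    --model deepseek-ai/DeepSeek-R1 \
    --device cpu \
    --trust-remote-code \
    --max-model-len 4096
```

Because CUDA_VISIBLE_DEVICES is exported, it also applies to any worker processes vLLM spawns, which is why setting it in the shell (rather than only on the command line) helped in my case.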