vLLM: Failed to infer device type

Trying to run vLLM without a GPU will typically land you in this error. To resolve it, configure both of the settings below; setting only one of them did not work for me, so I had to apply both:

1. Set the environment variable CUDA_VISIBLE_DEVICES to an empty string: CUDA_VISIBLE_DEVICES=""

2. On the command line, switch to --device cpu and remove --tensor-parallel-size. For example:

python3 -m vllm.entrypoints.openai.api_server --port 8080 --model deepseek-ai/DeepSeek-R1 --device cpu --trust-remote-code --max-model-len 4096
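
Putting the two settings together, here is a minimal sketch assuming a bash shell; the model name, port, and max length are just the values from the example above, so substitute your own:

export CUDA_VISIBLE_DEVICES=""   # hide all GPUs so vLLM does not try to pick a CUDA device
python3 -m vllm.entrypoints.openai.api_server \
  --port 8080 \
  --model deepseek-ai/DeepSeek-R1 \
  --device cpu \
  --trust-remote-code \
  --max-model-len 4096

Once the server is up, you can sanity-check the OpenAI-compatible endpoint with something like curl http://localhost:8080/v1/models.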

References

https://docs.vllm.ai/en/latest/getting_started/troubleshooting.html#failed-to-infer-device-type


