The specified repository contains sharded GGUF. Ollama does not support this yet

I got this error while trying to run:

ollama run hf.co/unsloth/Kimi-K2-Instruct-GGUF:Q4_K_M "why is the sky blue"

Unfortunately, the only other option is to use llama.cpp to run it. Here is how you can do that. Please note that the notebook may crash with its disk storage full, because Kimi-K2 takes up quite a lot of space: approximately 373 GB.
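Before starting, it is worth a quick sanity check of how much free disk space the runtime actually has, since a default Colab disk is far smaller than 373 GB (this assumes the usual Colab working directory /content):

!df -h /content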

First, install the build dependencies:

!apt-get -qq install build-essential cmake

!git clone https://github.com/ggerganov/llama.cpp
%cd llama.cpp
!cmake -B build
!cmake --build build --config Release
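
As a side note, if your runtime has an NVIDIA GPU you can optionally build with CUDA support instead, so inference can be offloaded to the GPU. The flag below assumes a recent llama.cpp checkout where the CMake option is named GGML_CUDA:

!cmake -B build -DGGML_CUDA=ON
!cmake --build build --config Release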

After the build succeeds, the llama.cpp binaries will be placed under build/bin. You can verify this by printing the server help:


!/content/llama.cpp/build/bin/llama-server -h

And finally, to run it, use the following command:


!/content/llama.cpp/build/bin/llama-server -hf unsloth/Kimi-K2-Instruct-GGUF:Q4_K_M --host 0.0.0.0 --port 8000

And you will see it download the Kimi-K2 model shards and then start serving on port 8000.
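
Once the server is up, you can test it from a separate terminal (or by launching the server in the background), since the `!llama-server` cell will block while it runs. This is a minimal sketch assuming the OpenAI-compatible chat endpoint that llama-server exposes on the port we chose above:

!curl http://localhost:8000/v1/chat/completions -H "Content-Type: application/json" -d '{"messages": [{"role": "user", "content": "why is the sky blue"}]}'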










