minimax - running a MiniMax model on Google Colab - crashes out with a memory issue.

To run a MiniMax LLM on Google Colab, open a notebook and change the runtime type to T4 GPU.
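Before downloading anything, it's worth confirming the runtime actually has a GPU attached. Here's a small sketch using only the standard library (the `gpu_info` helper is just for illustration); it falls back gracefully when no GPU is visible:

```python
# Sanity check: confirm the runtime has an NVIDIA GPU before downloading weights.
import shutil
import subprocess

def gpu_info() -> str:
    """Return nvidia-smi's name/memory line, or a hint if no GPU is visible."""
    if shutil.which("nvidia-smi") is None:
        return "No GPU driver found - set Runtime > Change runtime type > T4 GPU"
    out = subprocess.run(
        ["nvidia-smi", "--query-gpu=name,memory.total", "--format=csv,noheader"],
        capture_output=True, text=True,
    )
    return out.stdout.strip() or "nvidia-smi present but reported no GPUs"

print(gpu_info())
```

On a correctly configured Colab runtime this prints something like "Tesla T4, 15360 MiB".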

Then install the latest version of transformers:

!pip install -U transformers

Next, we specify the model. It is a 32-billion-parameter model, and each checkpoint shard is about 4.9G. Unfortunately, when trying to run this on a free Google Colab it crashes out. 😀
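A quick back-of-envelope calculation shows why the crash is expected: 32 billion parameters at 2 bytes each (fp16/bf16) is several times the roughly 16 GB of VRAM a free-tier T4 offers. A rough sketch (the T4 VRAM figure is an assumption about the free tier):

```python
# Back-of-envelope memory estimate for loading a 32B model on a T4.
PARAMS = 32e9          # 32 billion parameters
BYTES_PER_PARAM = 2    # fp16 / bf16 weights
T4_VRAM_GB = 16        # approximate free-tier T4 allocation

weights_gb = PARAMS * BYTES_PER_PARAM / 1024**3
print(f"Weights alone: ~{weights_gb:.0f} GB vs {T4_VRAM_GB} GB of T4 VRAM")
```

The weights alone come to roughly 60 GB before counting activations or the KV cache, so an out-of-memory crash on a single T4 is unavoidable.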


# Use a pipeline as a high-level helper

from transformers import pipeline

pipe = pipeline("text-generation", model="MiniMaxAI/SynLogic-32B")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)

I then tried using vLLM to run the same model, and it crashed as well. It seems the model simply requires more memory than the free tier provides.

# Install vLLM from pip:
!pip install vllm

# Load and run the model:

!vllm serve "MiniMaxAI/SynLogic-32B"
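vLLM does expose flags that trim memory use, such as --dtype, --max-model-len and --gpu-memory-utilization. Even with them, a 32-billion-parameter model will not fit in a T4's ~16 GB, so this sketch is only worth trying on a runtime with a larger GPU:

```shell
# These vllm serve flags reduce memory pressure, but a 32B model
# still needs far more VRAM than a free-tier T4 provides.
vllm serve "MiniMaxAI/SynLogic-32B" \
    --dtype float16 \
    --max-model-len 2048 \
    --gpu-memory-utilization 0.90
```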




