Qwen3 (0.6B) - running in Google Colab
Trying to run Qwen3 with 0.6 billion parameters on Google Colab, using the code straight from the Hugging Face model card (https://huggingface.co/Qwen/Qwen3-0.6B). The code looks like this:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Qwen/Qwen3-0.6B"

# load the tokenizer and the model
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto"
)

# prepare the model input
prompt = "Give me a short introduction to large language model."
messages = [
    {"role": "user", "content": prompt}
]
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=True  # Switches between thinking and non-thinking modes. Default is True.
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)
```
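The model card's snippet continues from `model_inputs` with generation and with splitting the thinking block off from the final answer; a minimal sketch of that continuation, following the card (token id `151668` is `</think>` per the card's comment):

```python
# conduct text completion
generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=32768
)
# keep only the newly generated tokens, dropping the prompt
output_ids = generated_ids[0][len(model_inputs.input_ids[0]):].tolist()

# parse out the thinking content by finding the last </think> (id 151668)
try:
    index = len(output_ids) - output_ids[::-1].index(151668)
except ValueError:
    index = 0  # no </think> in the output, e.g. with enable_thinking=False

thinking_content = tokenizer.decode(output_ids[:index], skip_special_tokens=True)
content = tokenizer.decode(output_ids[index:], skip_special_tokens=True).strip()

print("thinking content:", thinking_content)
print("content:", content)
```

With `enable_thinking=True` the model emits a `<think>...</think>` block first, so `thinking_content` holds the reasoning trace and `content` holds the final answer.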