Posts

Showing posts from August, 2025

azure aks preventing external resource modifications by enforcing resource group lockdown

Azure AKS can now prevent unwanted external resource modifications via the Azure portal that may cause issues for the cluster later, especially when we need to maintain the cluster. We can do this by registering the feature:

az feature register --namespace "Microsoft.ContainerService" --name "NRGLockdownPreview"

To check whether this feature is already enabled, run the following command:

az feature show --namespace "Microsoft.ContainerService" --name "NRGLockdownPreview"

Then update the cluster to enable the lockdown:

az aks update --name $CLUSTER_NAME --resource-group $RESOURCE_GROUP_NAME --nrg-lockdown-restriction-level ReadOnly

You can also remove the lockdown if you want to. So what type of resources are we talking about here? Now, let's try to delete the public IP for my cluster and see what happens. I get the following error:

The access is denied because of the deny assignment with name 'kubernetes.azure.com: node ...
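Putting the pieces together, here is a minimal sketch in Python (wrapping the same az commands via subprocess) that waits for the preview feature to finish registering and then relaxes the lockdown again. The cluster and resource-group names are hypothetical, and "Unrestricted" as the level that lifts the lockdown is my assumption:

import json
import subprocess
import time

def az(*args):
    # Run an az CLI command and return its parsed JSON output.
    out = subprocess.run(["az", *args, "--output", "json"],
                         capture_output=True, text=True, check=True)
    return json.loads(out.stdout)

# Poll until the preview feature shows as Registered.
while True:
    state = az("feature", "show",
               "--namespace", "Microsoft.ContainerService",
               "--name", "NRGLockdownPreview")["properties"]["state"]
    print("feature state:", state)
    if state == "Registered":
        break
    time.sleep(30)

# Later, lift the lockdown again (hypothetical cluster/resource-group names).
az("aks", "update", "--name", "my-cluster",
   "--resource-group", "my-rg",
   "--nrg-lockdown-restriction-level", "Unrestricted")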

AKS - what happens if you remove the Automatic upgrade scheduler

So what happens if you remove the Automatic upgrade scheduler in your AKS cluster? Firstly, yes, we can remove it. In fact, we can even remove the planned update schedule as well. Once you remove these, if you go and update your AKS cluster version to, say, 1.31.10, you can see that the upgrade process is immediate. As you can see here, I have removed both schedules in my AKS, and my cluster gets updated almost immediately.

error: psycopg2.errors.UndefinedFunction: function similarity(text, text) does not exist

While trying to create a similarity search with Postgres, I bumped into this error with this simple query:

cur.execute("""
    SELECT id, name, category, subcategory,
           (1 - (embedding <=> %s::vector)) AS semantic_score,
           similarity(name, CAST(%s AS text)) AS keyword_score,
           (0.7 * (1 - (embedding <=> %s::vector)) +
            0.3 * similarity(name, CAST(%s AS text))) AS final_score
    FROM items
    WHERE category = %s
    ORDER BY final_score DESC
    LIMIT 5;
""", (query_emb, query, query_emb, query, category))

To resolve this, you need to activate the pg_trgm extension, which is where the similarity() function lives. For some reason, I have to do it every time, and I need to ensure the extension is activated before my query can succeed.
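A minimal sketch of activating the extension from Python before running the query (the connection DSN is a hypothetical placeholder):

import psycopg2

conn = psycopg2.connect("dbname=mydb user=myuser")  # hypothetical DSN
cur = conn.cursor()

# similarity() comes from pg_trgm; create the extension if it is missing.
cur.execute("CREATE EXTENSION IF NOT EXISTS pg_trgm;")
conn.commit()

# Sanity check: this now resolves instead of raising UndefinedFunction.
cur.execute("SELECT similarity('hello', 'help');")
print(cur.fetchone()[0])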

No operator matches the given name and argument types. You might need to add explicit type casts

Hitting this error when running Postgres with the following query in Python:

cur.execute("""
    SELECT name, category, subcategory
    FROM items
    ORDER BY embedding <-> %s
    LIMIT 3;
""", (query_emb,))

Apparently the fix is to cast the parameter explicitly to the vector type:

cur.execute("""
    SELECT name, category, subcategory
    FROM items
    ORDER BY embedding <-> %s::vector
    LIMIT 3;
""", (query_emb,))
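Alternatively, if you are using the pgvector Python package, registering the vector type on the connection teaches psycopg2 how to adapt the embedding directly, so the explicit ::vector cast is no longer needed. A minimal sketch (the DSN and the embedding are hypothetical placeholders):

import numpy as np
import psycopg2
from pgvector.psycopg2 import register_vector

conn = psycopg2.connect("dbname=mydb user=myuser")  # hypothetical DSN
register_vector(conn)  # registers the vector type adapter on this connection

query_emb = np.random.rand(384).astype(np.float32)  # hypothetical embedding
cur = conn.cursor()
cur.execute("""
    SELECT name, category, subcategory
    FROM items
    ORDER BY embedding <-> %s
    LIMIT 3;
""", (query_emb,))
print(cur.fetchall())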

Liquid AI - LFM2-350M

Liquid AI introduced LFM2-350M, a lightweight, efficient model that can be deployed to devices. This makes it quite an interesting model to work with. To get started, you can try it out in Google Colab:

https://colab.research.google.com/#scrollTo=4Ll7XVT78LUA&fileId=https%3A//huggingface.co/LiquidAI/LFM2-350M.ipynb

To give you an idea of how much resource it takes up, here is the total CPU and memory usage. For the 1.2-billion-parameter model, it takes up about 2.4G of storage. Fine tuning the model
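Outside Colab, a minimal sketch of loading the model with Hugging Face transformers (assuming LFM2 follows the standard causal-LM and chat-template interfaces; only the repo id comes from the post):

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "LiquidAI/LFM2-350M"
tokenizer = AutoTokenizer.from_pretrained(model_id)
# A recent transformers release may be needed for LFM2 support.
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)

# Build a chat-style prompt and generate a short reply.
inputs = tokenizer.apply_chat_template(
    [{"role": "user", "content": "What is the capital of France?"}],
    return_tensors="pt", add_generation_prompt=True)
outputs = model.generate(inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))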

getting started with gemini cli

To install the Gemini CLI globally, run the following command:

npm install -g @google/gemini-cli

Then run "gemini" from the command prompt to log in to Google. Once you have successfully logged in, let's start creating some code. You can fire up VSCode, install the "Gemini CLI companion" extension, and then go into its terminal. From the terminal, run gemini. Let's create a simple flappy bird app. To configure your custom commands, follow the guide here: https://cloud.google.com/blog/topics/developers-practitioners/gemini-cli-custom-slash-commands

vscode code completion using cline extension

First we need to install the "Cline" extension. Cline is the sort of tool that allows you to create any app locally on your laptop, with free credits available. Once you have installed Cline, sign up using a GitHub or Google account. Next, all you need to do is fire it up and ask it to create a bunch of apps for you. You can also use other providers like OpenRouter, DeepSeek or Groq. If you're setting these up, you might need to have some credits there.

vscode code completion using continue extension

Continue is an AI code assistant for VSCode that supports different models. In this setup, we are going to set up "Continue" to make your VSCode smarter and let you code faster. To do that, ensure you have VSCode installed, then install the VSCode extensions called Continue and Cline. Apparently the Continue extension has a dependency on the Llama 3.1 model. Once you have that installed, you also need to install Ollama and keep it running. We also need to download a model called "qwen2.5-coder:1.5b-base" - this will be used for the local agent. Normally, the Ollama executable gets installed here: "C:\Users\your-user-path\AppData\Local\Programs\Ollama". Once you have access to that, proceed to download the models by running the following commands:

ollama.exe pull llama3.1:8b
ollama.exe pull qwen2.5-coder:1.5b-base

Continue would be able to auto-detect that. As you can see here, I have ollama running and then have my qwen cod...
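Before Continue tries to auto-detect the models, you can verify that Ollama is reachable and that both models were pulled. A minimal sketch against Ollama's REST API (localhost:11434 is Ollama's default address; this check itself is not from the post):

import requests

# List the locally available models via Ollama's REST API.
resp = requests.get("http://localhost:11434/api/tags")
resp.raise_for_status()
models = [m["name"] for m in resp.json().get("models", [])]

# Confirm the two models Continue will use are present.
for needed in ("llama3.1:8b", "qwen2.5-coder:1.5b-base"):
    print(needed, "OK" if needed in models else "MISSING")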

llama cpp running it in google colab

There are 2 options to do this in Google Colab.

Option 1 (easiest)

With this option, we can just install the relevant package and then run the model. First we need to install the required package using the following command:

!pip install -U llama-cpp-python

Next, we will load the model using the following code:

from llama_cpp import Llama

llm = Llama.from_pretrained(
    repo_id="unsloth/Qwen3-0.6B-GGUF",
    filename="Qwen3-0.6B-IQ4_NL.gguf",
)

We can see the model being downloaded. Then we will run the following to test the model:

llm.create_chat_completion(
    messages=[
        {"role": "user", "content": "What is the capital of France?"}
    ]
)

And the output will look something like this:

Option 2

llama.cpp is a powerful inference engine and if you wanted to get it running in google colab, you co...
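If the response feels slow in Colab, you can stream the tokens as they are generated instead of waiting for the full answer. A minimal sketch reusing the same llm object (stream=True is llama-cpp-python's streaming switch; the chunk layout follows the OpenAI-style delta format it returns):

# Stream the reply incrementally instead of waiting for the full answer.
for chunk in llm.create_chat_completion(
    messages=[{"role": "user", "content": "What is the capital of France?"}],
    stream=True,
):
    delta = chunk["choices"][0]["delta"]
    if "content" in delta:
        print(delta["content"], end="", flush=True)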

The specified repository contains sharded GGUF. Ollama does not support this yet

Getting this error while trying to run:

ollama run hf.co/unsloth/Kimi-K2-Instruct-GGUF:Q4_K_M "why is the sky blue"

Unfortunately, the alternative is to use llama.cpp instead. Here is how you can do that. Please note the notebook crashed out with disk storage full, because Kimi-K2 does take up quite a lot of space - approximately 373G of storage.

First we install the dependencies:

!apt-get -qq install build-essential cmake
!git clone https://github.com/ggerganov/llama.cpp
%cd llama.cpp
!cmake -B build
!cmake --build build --config Release

After you have successfully built it, the llama-server binary will be placed at this path:

!/content/llama.cpp/llama.cpp/build/bin/llama-server -h

And finally, to run it, use the following command:

!/content/llama.cpp/llama.cpp/build/bin/llama-server -hf unsloth/Kimi-K2-Instruct-GGUF:Q4_K_M --host 0.0.0.0 --port 8000

And we will see that it downloads the Kimi-K2 model.
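Once llama-server is up, it exposes an OpenAI-compatible HTTP API, so you can query the model from another cell in the same notebook. A minimal sketch (port 8000 matches the command above; /v1/chat/completions is llama-server's OpenAI-compatible chat route):

import requests

resp = requests.post(
    "http://localhost:8000/v1/chat/completions",
    json={
        "messages": [{"role": "user", "content": "why is the sky blue?"}],
        "max_tokens": 128,
    },
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])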

Gemma 3 RAG sample implementation

RAG can be a better alternative compared to fine-tuning: you can keep using an existing model yet feed it new information at runtime or during a Q&A session. This saves the time spent redeploying and re-testing to ensure everything works correctly. Anyway, here is a hello-world implementation of Gemma 3 with RAG. Go into your Google Colab and run the following code:

# --- Step 1: Install dependencies ---
!apt-get update
!apt-get install -y curl wget gnupg
!curl -fsSL https://ollama.com/install.sh | sh

Next, install all the required dependencies:

!pip install langchain langchain_community faiss-cpu sentence-transformers

Install Ollama:

!curl -fsSL https://ollama.com/install.sh | sh

Run Ollama:

!nohup ollama serve &

Then pull down the gemma3 model:

!ollama pull gemma3

Then you have the following code:

# --- Step 3: Setup LangChain with FAISS ---
#from langchain.vectorstores import FAISS
from langchain_community.vectorstores.faiss import FAISS
fro...
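The excerpt cuts off at the FAISS setup, so here is a minimal end-to-end sketch of the same idea (the toy documents, embedding model name and question are my own placeholders, not from the post):

from langchain_community.vectorstores.faiss import FAISS
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_community.llms import Ollama
from langchain.chains import RetrievalQA

# Embed a few toy documents and index them in FAISS.
docs = [
    "Kuma is a service mesh built on top of the Envoy proxy.",
    "AKS is Azure's managed Kubernetes service.",
]
embeddings = HuggingFaceEmbeddings(model_name="all-MiniLM-L6-v2")
store = FAISS.from_texts(docs, embeddings)

# Answer questions with gemma3 (served by Ollama) over the retrieved context.
llm = Ollama(model="gemma3")
qa = RetrievalQA.from_chain_type(llm=llm, retriever=store.as_retriever())
print(qa.invoke("What is Kuma?")["result"])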

google colab running unsloth gpt oss model 20B parameters

To run the Unsloth gpt-oss model in Google Colab, you can follow the notebook here: https://github.com/mitzenjeremywoo/google-colab-notebooks/blob/main/unsloth_gpt_oss_20b_dynamic2_qtz.ipynb This model doesn't take up too much space but does consume quite a bit of memory. Responses are quite slow, as you can see the memory has reached its limit.

ollama running it in google colab

You can easily run Ollama in Google Colab by following these steps:

!apt-get update
!apt-get install -y curl wget gnupg
!curl -fsSL https://ollama.com/install.sh | sh
!nohup ollama serve &
!ollama pull llama3.2
!ollama run llama3.2 "Why is the sky blue?"

What if you want to use the REST API chat mode? That is supported too:

!curl http://localhost:11434/api/chat -d '{ "model": "llama3.2", "messages": [ { "role": "user", "content": "why is the sky blue?" } ]}'

So is the generate mode:

!curl http://localhost:11434/api/generate -d '{"model": "llama3.2", "prompt":"Why is the sky blue?"}'

An example Google Colab notebook can be found here: https://github.com/mitzenjeremywoo/google-colab-notebooks/blob/main/ollama_in_google_colab.ipynb
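The same chat call can be made from Python instead of curl, which is handy inside a notebook cell. Note that Ollama's /api endpoints stream JSON lines by default, so "stream": False is set here to get a single response object:

import requests

resp = requests.post(
    "http://localhost:11434/api/chat",
    json={
        "model": "llama3.2",
        "messages": [{"role": "user", "content": "why is the sky blue?"}],
        "stream": False,  # one JSON object instead of a stream of lines
    },
)
resp.raise_for_status()
print(resp.json()["message"]["content"])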

kubectl apply does not remove resource

Many believe that kubectl apply will remove resources from a deployment if you remove a section of the YAML. Believe it or not, kubectl apply is used to add or update resources, not delete them. If you want removed manifests to actually be deleted, you have to delete them explicitly with kubectl delete -f, or opt in to pruning via kubectl apply --prune (which requires a label selector).

kuma - exposing your application to the external world

To expose your application to the world, let's use the demo app provided by Kong/Kuma. This means we don't need a port-forward to post a payload to our app; we can reach it via localhost. Run the following command (I assume you have already installed your Kuma control plane) to install the demo app:

kubectl apply -f https://raw.githubusercontent.com/kumahq/kuma-counter-demo/refs/heads/main/k8s/000-with-kuma.yaml

Once you have installed it, run the following so we can see the layout:

kubectl port-forward svc/demo-app -n kuma-demo 5050:5050

The demo app should look like this. Let's now expose it to the external world via port 80, and not 8080 as specified in the documentation. We can do that by creating a MeshGatewayInstance, MeshGateway and MeshHttpRoute. We are also creating the MeshTrafficPermission here. Run the following YAML:

---
apiVersion: kuma.io/v1alpha1
kind: MeshGatewayInstance
metadata:
  name: edge-gateway
  namespace: kuma-demo
spec: ...

kong service mesh - getting metrics from data plane

Since Kong Mesh (Kuma) injects a sidecar into pods in the labelled namespace, we can just run a container and then access the metrics exposed by the Envoy sidecar. Let's do that right now and create a dummy pod that runs an Ubuntu image:

kubectl run -it -n kuma-demo ubuntu-shell --image=ubuntu:22.04 --restart=Never --rm -- bash

Then you can either install curl into that container and curl http://localhost:9901/stats/prometheus, or you can do a port-forward:

kubectl port-forward pod/ubuntu-shell -n kuma-demo 9901:9901

And then load this up in your browser: http://localhost:9901/stats/prometheus Examples of the metrics exposed are shown below:
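With the port-forward in place, you can also scrape and filter the Envoy admin metrics programmatically. A minimal sketch that pulls the Prometheus text format and prints only lines mentioning downstream requests (the substring filter is just an illustrative choice):

import requests

# Fetch Envoy's metrics in Prometheus text format via the port-forward.
text = requests.get("http://localhost:9901/stats/prometheus").text

# Print only non-comment lines for downstream request counters.
for line in text.splitlines():
    if not line.startswith("#") and "downstream_rq" in line:
        print(line)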

kubectl run ubuntu image straight into bash shell mode

This is really one of my favourite commands to use when I troubleshoot issues with pods:

kubectl run -it ubuntu-shell --image=ubuntu:22.04 --restart=Never --rm -- bash

getting control plane metrics from kong service mesh

To get metrics from Kong Mesh, make sure you have installed Kong Mesh; if not, please go here to install it. Next, run the following command to get access to the metrics endpoint. This is a different port compared to 5681 (the UI port):

kubectl port-forward svc/kong-mesh-control-plane -n kong-mesh-system 5680:5680

At this point, go to http://localhost:5680/metrics and you can see a list of all the metrics that are available. Details of the metrics exposed are shown here:

docker mcp with vs code integration

The Docker MCP toolkit lets you access different MCP tools easily without multiple setups or different configuration settings. It just requires one config, and then you can use all those tools. Fire up your VSCode editor. You need to create a folder called .vscode and a file called mcp.json with the following values:

"mcp": {
  "servers": {
    "MCP_DOCKER": {
      "command": "docker",
      "args": [
        "mcp",
        "gateway",
        "run"
      ],
      "type": "stdio"
    }
  }
}

Next, ensure you activate your Chat or Copilot extension by going into View -> Chat. Ensure that you have selected Agent mode -> click on tools. Then you can see a list of those tools appearing for you - as shown here. You are seeing Puppeteer because I have added it. You can add more if you ...

azure private dns zone setup and walkthrough

Setting up a private DNS zone is straightforward. Go to Private DNS zones and click on "create". Then create a new VM in the VNET that it is linked to. You can see that my VM automatically gets a new IP address here. You can create a new A record in there; for example, we create an A record for the VM with the name "db". Then you can use the following command to discover the newly created DNS record. As you can see, if you do a reverse lookup, it works too.
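A quick way to exercise both lookups from a VM inside the linked VNET is a short Python check (the zone name "myzone.private" is a hypothetical placeholder for whatever your private zone is called):

import socket

# Forward lookup: resolve the A record created for the VM.
ip = socket.gethostbyname("db.myzone.private")  # hypothetical zone name
print("db resolves to", ip)

# Reverse lookup: map the private IP back to the hostname.
host, _, _ = socket.gethostbyaddr(ip)
print(ip, "reverse-resolves to", host)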