Posts

Showing posts from 2025

github actions setup and authenticating terraform provider using managed identity

You can use the following YAML to set up the Terraform azurerm provider so that it authenticates using a managed identity. All you need is the following information:

- AZURE_CLIENT_ID - this would be your managed identity's client id
- AZURE_SUBSCRIPTION_ID
- AZURE_TENANT_ID

The permissions block is important; you need it for the login to work.

name: 'Build .Net app'
on: [push, workflow_dispatch]
permissions:
  id-token: write
  contents: read
jobs:
  build-and-deploy:
    runs-on: ubuntu-latest
    steps:
      - name: Azure login
        uses: azure/login@v2
        with:
          client-id: ${{ secrets.AZURE_CLIENT_ID }}
          tenant-id: ${{ secrets.AZURE_TENANT_ID }}
          subscription-id: ${{ secrets.AZURE_SUBSCRIPTION_ID }}
      - name: Azure CLI script
  ...
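The excerpt stops at the Azure CLI step, so here is a hedged sketch of a follow-on Terraform step (my addition, not from the original post): the azurerm provider can pick up the same identity through ARM_* environment variables, with ARM_USE_OIDC telling it to reuse the federated token from azure/login.

      # Hypothetical follow-on step: run terraform with the identity from azure/login.
      # The azurerm provider reads these ARM_* environment variables.
      - name: Terraform apply
        env:
          ARM_CLIENT_ID: ${{ secrets.AZURE_CLIENT_ID }}
          ARM_TENANT_ID: ${{ secrets.AZURE_TENANT_ID }}
          ARM_SUBSCRIPTION_ID: ${{ secrets.AZURE_SUBSCRIPTION_ID }}
          ARM_USE_OIDC: "true"   # use the federated token instead of a client secret
        run: |
          terraform init
          terraform apply -auto-approve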

how to call azure REST API endpoint

The following are the steps you need to call an Azure REST API endpoint. We need to obtain a token and then hit the endpoint using bearer token authorization.

TOKEN=$(az account get-access-token --query accessToken --output tsv)

curl -X GET \
  "https://management.azure.com/subscriptions/$SUBSCRIPTION_ID/resourceGroups/$RESOURCE_GROUP/providers/Microsoft.Sql/servers/$SERVER_NAME?api-version=2024-02-01" \
  -H "Authorization: Bearer $TOKEN"
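The command above assumes three shell variables are already set; a minimal sketch of populating them (the resource group and server names are placeholders, not from the original post):

SUBSCRIPTION_ID=$(az account show --query id --output tsv)  # current subscription
RESOURCE_GROUP=my-resource-group   # hypothetical resource group name
SERVER_NAME=my-sql-server          # hypothetical SQL server name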

azure devops terraform provider - needs the following permission(s) on the resource Users to perform this action: Add Users

While trying to set up my users using the Azure DevOps Terraform provider, I bumped into this error: "needs the following permission(s) on the resource Users to perform this action: Add Users". I then had to update the following permission to get it to work.

llm tool to help with quantization

To decrease the size and memory footprint of machine learning models, a technique called quantization is employed. This method, akin to lossy image compression, converts model weights into lower-precision formats such as 8-bit or 4-bit. While this significantly reduces resource demands, it's important to note that, like image compression, quantization can potentially lead to a reduction in the model's accuracy. Tools that you can use are:

https://github.com/ModelCloud/GPTQModel
https://github.com/casper-hansen/AutoAWQ
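To make the "lossy compression" analogy concrete, here is a toy numpy sketch of symmetric 8-bit quantization (my illustration only; GPTQModel and AutoAWQ use far more sophisticated, calibration-based schemes):

import numpy as np

# Pretend these are model weights.
w = np.random.randn(4, 4).astype(np.float32)

# One scale factor for the whole tensor, mapping the largest weight to 127.
scale = np.abs(w).max() / 127.0

# Quantize: round each weight to the nearest int8 step.
q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)

# Dequantize: this is what the model effectively computes with.
w_hat = q.astype(np.float32) * scale

# The rounding error is the "lossy" part that can reduce accuracy.
print("max absolute error:", np.abs(w - w_hat).max())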

k8s gateway api - setting up gateway and routes to different service in a cluster

To set up the Kubernetes Gateway API in a GKE cluster, you typically have to:

- create the Gateway
- set up the necessary HTTP routes
- deploy your service A and service B (see the sketch of a second route after the YAML below)

Set up the gateway by applying the following YAML:

apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: my-shared-gateway
spec:
  gatewayClassName: gke-l7-regional-external-managed
  listeners:
    - protocol: HTTP # or HTTPS for production
      port: 80 # or 443 for HTTPS
      name: http
      hostname: "*.example.com"

Then create your HTTP route A:

apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: a-route
  namespace: default
spec:
  parentRefs:
  - name: my-shared-gateway
  hostnames:
  - "a.example.com"
  rules:
  - matches:
    - path:
        type: PathPrefix
        value: /
  ...
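The excerpt truncates before the second route, but a matching route for service B would follow the same shape; a sketch under assumed names (b-route, service-b, and the port are my placeholders, not from the original post):

apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: b-route              # hypothetical name
  namespace: default
spec:
  parentRefs:
  - name: my-shared-gateway  # attach to the same shared gateway
  hostnames:
  - "b.example.com"          # assumed hostname for service B
  rules:
  - matches:
    - path:
        type: PathPrefix
        value: /
    backendRefs:
    - name: service-b        # assumed Service name
      port: 8080             # assumed Service port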

gke gateway :- error ensuring load balancer: generic::invalid_argument: Insert: Invalid value for field 'resource.target'

This error comes about when you do not have a proxy-only subnet created for your region when you're trying to deploy a load balancer, or when creating the Gateway resources of the Kubernetes Gateway API. These load balancers use a Google-managed proxy layer, and for internal communication between the proxy and your backend VMs, GCP needs a reserved IP range within your VPC. That's where the proxy-only subnet comes in. You can create it by going here: https://cloud.google.com/load-balancing/docs/proxy-only-subnets#proxy_only_subnet_create You can see that after I applied the fix, my gateway resources started to sync successfully, and an IP was then allocated for my gateway.
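For reference, the gcloud command to create such a subnet looks roughly like this (the subnet name, region, network, and range below are placeholders you would substitute for your own):

gcloud compute networks subnets create proxy-only-subnet \
  --purpose=REGIONAL_MANAGED_PROXY \
  --role=ACTIVE \
  --region=us-central1 \
  --network=default \
  --range=10.129.0.0/23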

gke - deploying multiple llm models with gradio interface

First create a GKE cluster that can support GPUs (required by the Hugging Face text-generation-inference image). Once you have provisioned one, create a secret:

kubectl create secret generic l4-demo \
  --from-literal=HUGGING_FACE_TOKEN=hf_token

Next choose the model of your choice for deployment. In this example, we are going to use Falcon. Please refer to this page here for other models.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: llm
spec:
  replicas: 1
  selector:
    matchLabels:
      app: llm
  template:
    metadata:
      labels:
        app: llm
    spec:
      containers:
      - name: llm
        image: us-docker.pkg.dev/deeplearning-platform-release/gcr.io/huggingface-text-generation-inference-cu121.1-4.ubuntu2204.py310
        resources:
  ...
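The excerpt cuts off at the resources block. For a GPU-backed TGI deployment it usually continues with a GPU request and a node selector; a sketch only (the limit and accelerator values are assumptions, not taken from the original post):

        resources:
          limits:
            nvidia.com/gpu: "1"   # assumed: one GPU per replica
      nodeSelector:
        cloud.google.com/gke-accelerator: nvidia-l4   # assumed L4 GPU node pool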

rust converting from u32 to string and vice versa

Converting an int to a string and vice versa can be challenging, especially for a beginner like me. The following code demonstrates how to convert back and forth.

fn main() {
    // u32 to string
    let a = 32;
    let s = a.to_string();
    println!("{}", s);

    // string to u32
    let a = "32";
    let b: u32 = a.parse().unwrap();
    println!("{}", b);
}

Notice how vscode infers the type here: you can see a.to_string() returns a String, which is stored on the heap. Converting a &str to u32 can be achieved by calling parse().unwrap(). parse() returns a Result<T, E>, which holds either the result or an error; unwrap() then takes the value out (and panics if parsing failed).
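Not in the original post, but if you want to avoid the panic from unwrap(), a sketch of handling the Result explicitly:

fn main() {
    let a = "not-a-number";
    // parse() returns Result<u32, ParseIntError>; match on it instead of unwrapping
    match a.parse::<u32>() {
        Ok(n) => println!("parsed: {}", n),
        Err(e) => println!("could not parse: {}", e),
    }
}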

using rust default value

Rust's Default::default() is really useful for placing default values into a structure or passing them into a function. The example below shows how we use the .. operator to assign default values to the remaining fields when we instantiate the struct/type Point.

#[derive(Debug, Default)]
struct Point {
    x: i32,
    y: i32,
    a: i32,
    isOk: bool,
}

fn main() {
    let s = Point {
        isOk: true,
        ..Default::default()
    };
    println!("{:?}", s);
}

You can also use it to pass a default into a function; for example, the code below prints 0 as the total.

fn print_value(total: u32) {
    print!("{}", total);
}

fn main() {
    print_value(Default::default());
}
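One thing the post doesn't cover (my addition): when the derived all-zero defaults aren't what you want, you can implement Default by hand. A sketch with a hypothetical Config struct:

#[derive(Debug)]
struct Config {
    retries: u32,
    verbose: bool,
}

impl Default for Config {
    // Custom defaults instead of the derived zero values.
    fn default() -> Self {
        Config { retries: 3, verbose: false }
    }
}

fn main() {
    println!("{:?}", Config::default()); // Config { retries: 3, verbose: false }
}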

gcloud - local login account setup

After installing gcloud, we need to set up our user account for testing and development purposes. To do that, run the following commands:

gcloud init
gcloud auth application-default login

dockerfile and $@

The $@ symbol basically expands to the arguments passed from the caller/entry point to the executing script. Let's say for example we have a file called ./hello2.sh:

#!/bin/bash
echo "Hello, World! - Argument: $@"

If we run ./hello2.sh test1 test2 test3 we will get the following output:

Hello, World! - Argument: test1 test2 test3

From a Docker perspective, we can use this to make a highly customizable image. For example, let's say we have this entrypoint.sh with the following contents:

entrypoint.sh

#!/bin/bash
ldconfig 2> /dev/null || echo 'unable to refresh ld cache, not a big deal in most cases'
source /usr/src/.venv/bin/activate
exec text-generation-launcher $@

And in my Dockerfile:

Dockerfile

ENTRYPOINT ["/entrypoint.sh"]
CMD ["--model-id", "tiiuae/falcon-40b-instruct"]

When my container runs, essentially what I am executing is as follows:

/entrypoint.sh --model-id tiiuae/falcon-40b-instruct

which in turn calls

./text-generation-launcher --model-id tiiu...
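One detail worth adding (mine, not the author's): unquoted $@ word-splits arguments that contain spaces, so "$@" is usually the safer form in an exec line. A quick way to see the difference, run as ./args.sh "a b" c:

#!/bin/bash
echo 'unquoted $@:'
for arg in $@; do echo "  [$arg]"; done    # "a b" splits into two words
echo 'quoted "$@":'
for arg in "$@"; do echo "  [$arg]"; done  # "a b" stays one argument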

gke - testing out Mixtral-8x7B-Instruct-v0.1 deployment not successful on a non-gpu pool

To deploy this, it requires a GPU. I was trying to deploy without one and wasn't successful, mainly because the huggingface-text-generation-inference image does not have a CPU-compatible variant. This is the image that is required:

image: us-docker.pkg.dev/deeplearning-platform-release/gcr.io/huggingface-text-generation-inference-cu124.2-3.ubuntu2204.py311

First you need to create a secret and ensure you have accepted the Mistral agreement. If not, head over to Hugging Face and accept the license agreement. Then create your secret:

kubectl create secret generic l4-demo \
  --from-literal=HUGGING_FACE_TOKEN=hf_your-key-here

Then deploy gradio and your model to the k8s cluster. You can find additional detail here. The gradio deployment looks like this:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: gradio
  labels:
    app: gradio
spec:
  strategy:
    type: Recreate
  replicas: 1
  selector:
    m...
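Not in the original excerpt, but when a TGI pod won't come up on a non-GPU pool, standard kubectl commands make the failure obvious (the label and deployment names below are assumptions, not from the post):

kubectl get pods -l app=llm          # assumed label on the model deployment
kubectl describe pod <pod-name>      # look for FailedScheduling on nvidia.com/gpu
kubectl logs deploy/llm              # TGI logs show CUDA / device errors at startup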