Posts

dotnet ??= and array? - getting basic understanding

Working with ??= is really good for cutting down boilerplate code, as in the examples below.

// using ??=
List<string>? arrayList3 = null;
var result5 = arrayList3 ??= new List<string>();

// replacing a tedious if/else statement :)
if (arrayList3 == null)
{
    var result6 = new List<string> { "test" };
}

It also lets us quickly check for null and apply a default value:

string? dummyString = null;
arrayList.Add(dummyString ??= "default");

Working with array?[] seemed confusing to me. I wasn't really sure whether it checks if the array itself is null or if the value at that index is null. It checks whether the array is null. And it will of course still throw an exception if I try to access an invalid index. Additional details are shown here:

// working with array
var arrayList = new List<string>();
arrayList.Add("test");
//var result = arrayList?[2]; // exception
var result2 = a...
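To make those semantics concrete, here is a minimal sketch (the variable names are mine, not from the original example): ?[] only short-circuits when the collection itself is null; a non-null collection with a bad index still throws.

List<string>? maybeNull = null;
var a = maybeNull?[0];   // a is null: the list is null, so the indexer is never evaluated

var items = new List<string> { "test" };
var b = items?[0];       // "test": the list is not null and the index is valid
var c = items?[2];       // throws ArgumentOutOfRangeException: ?[] does not guard the index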

hotchoc graphql with database services

To set up a GraphQL backend that looks up database records, you can use HotChocolate's ServiceAttribute, as shown in the code snippet here:

using System.Collections.Generic;
using HotChocolate;
using HotChocolate.Types;
using HotChocolate.Types.Relay;

namespace Accounts.Types
{
    [QueryType]
    public class Query
    {
        public IEnumerable<User> GetUsers([Service] UserRepository repository) =>
            repository.GetUsers();

        [NodeResolver]
        public User GetUser(int id, [Service] UserRepository repository) =>
            repository.GetUser(id);
    }
}

You can add database customization code later; for a demo example, a UserRepository would look like this.

public class UserRepository
{ ...
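The excerpt above is cut off, so here is a minimal in-memory sketch of what such a repository might look like (the User shape and seed data are assumptions for illustration; a real implementation would query the database):

using System.Collections.Generic;
using System.Linq;

public record User(int Id, string Name);

public class UserRepository
{
    // hypothetical in-memory data standing in for real database rows
    private readonly List<User> _users = new()
    {
        new User(1, "Ada"),
        new User(2, "Grace")
    };

    public IEnumerable<User> GetUsers() => _users;

    public User GetUser(int id) => _users.Single(u => u.Id == id);
}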

dotnet - uninstalling templates

To uninstall a template, first you need to figure out which templates you already have installed. To do that, you can use

dotnet new -u

This lists the templates currently installed (on newer SDKs, dotnet new uninstall with no arguments does the same). Once you have done that, you can use the following command to uninstall the template. On my laptop my templates were still targeting dotnet 6; there's a newer version which supports dotnet 8.

dotnet new uninstall HotChocolate.Templates

graphql spec

Here is the location of the GraphQL spec: https://spec.graphql.org/

github action directory basic detalis

GitHub Actions tends to place code two levels deep: for example, if you called your repository github-action-dotnet-build, it will be checked out into github-action-dotnet-build/github-action-dotnet-build. What if you publish an artifact which you use in another stage - where does the code get downloaded to? It will be placed in the same directory. For example, if you publish an artifact called myartifact, then when you download it, it will be automatically extracted into the default folder (GITHUB_WORKSPACE). In this example I download my artifact, and you can see that it is extracted automatically into GITHUB_WORKSPACE if you do not specify a path. Notice that it is unzipped for you too.
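A rough sketch of that download step (the artifact name here is illustrative): with no path given, actions/download-artifact@v4 extracts the files straight into GITHUB_WORKSPACE.

- name: download artifact
  uses: actions/download-artifact@v4
  with:
    name: myartifact
    # no "path" set, so the files land in $GITHUB_WORKSPACE

- name: show where the files landed
  run: ls -la "$GITHUB_WORKSPACE"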

github actions stage dependency using needs

In GitHub Actions we often need to specify dependencies between stages - to ensure stage B runs only after stage A has completed. To express this relationship we can use "needs", as shown below:

artifact-work:
    runs-on: ubuntu-latest
    needs: build-and-publish
    steps:
    - name: download artifact
      uses: actions/download-artifact@v4
      with:
        name: published-artifact-net
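For completeness, the build-and-publish job that artifact-work depends on would have to upload an artifact under the same name; a minimal sketch (the build steps and output path are assumed) might look like:

build-and-publish:
    runs-on: ubuntu-latest
    steps:
    - name: checkout
      uses: actions/checkout@v4
    # ... build steps would go here ...
    - name: upload artifact
      uses: actions/upload-artifact@v4
      with:
        name: published-artifact-net
        path: ./publish   # assumed output folder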

azurerm terraform authentication using oidc

It is confusing trying to understand which authentication method to use in Terraform, especially with concepts like managed identity and OIDC in the mix. What is the difference, and how do you use them?

OIDC authentication using an Azure DevOps task

A bit of confusion here, I think - this setup ( https://registry.terraform.io/providers/hashicorp/azurerm/latest/docs/guides/service_principal_oidc ) uses whatever identity your service connection is configured with; that can be a managed identity or an app registration. It does NOT mean that if you have a k8s workload identity set up you can use that managed identity directly (bypassing the service connection).

1. Set up the service connection and federate it.

Prereqs

1. Do you need to grant project collection admin to this service account? No, you don't, but you do need to give it BASIC license access.
2. Do you need to federate it to the workload identity? Yes, you do - if you use an Azure DevOps service con...
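For reference, the provider block for this flow looks roughly like the sketch below (the variables are placeholders; exactly how the OIDC token reaches the provider depends on the Azure DevOps task you use):

provider "azurerm" {
  features {}

  use_oidc        = true
  client_id       = var.client_id       # the identity behind the service connection
  tenant_id       = var.tenant_id
  subscription_id = var.subscription_id
}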

Eagle and its uses in LLM (VLLM)

The primary goal of EAGLE is to reduce the computational cost and latency of generating text from LLMs. It achieves this by introducing methods that allow for faster decoding during inference, making it particularly useful for applications requiring real-time or large-scale language processing. If you're looking for a faster, more performant technique for text generation, this can help. In the context of vLLM, this approach can provide faster performance for models served with vLLM. https://docs.vllm.ai/en/latest/getting_started/examples/eagle.html As you can see there, it is configured under speculative_config.
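A minimal sketch of that configuration, assuming a recent vLLM release (the model names and token count mirror the linked example and may differ between versions):

from vllm import LLM, SamplingParams

# EAGLE is enabled through speculative_config: a small draft model proposes
# tokens that the target model then verifies in a single forward pass.
llm = LLM(
    model="meta-llama/Meta-Llama-3-8B-Instruct",
    speculative_config={
        "method": "eagle",
        "model": "yuhuili/EAGLE-LLaMA3-Instruct-8B",
        "num_speculative_tokens": 2,
    },
)

outputs = llm.generate(["The future of AI is"], SamplingParams(temperature=0.8))
print(outputs[0].outputs[0].text)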

vllm - using mistral model and resolving some of its issues

I bumped into a Mistral assertion error while trying to run the vLLM chat example here. Since it was quite confusing what the actual cause was, I turned on

VLLM_LOGGING_LEVEL=debug

and re-ran the code. Then I saw "Forbidden for url: https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.3/resolve/main/params.json". And you guessed it, I had to go to the Mistral Hugging Face model card and accept the license agreement. The log looks like this below:

Once I had accepted the license agreement, I re-ran it and it was working. Then I quickly ran into this error:

ValueError: The model's max seq len (128000) is larger than the maximum number of tokens that can be stored in KV cache (32768). Try increasing `VLLM_CPU_KVCACHE_SPACE` or decreasing `max_model_len` when initializing the engine.

To resolve this, I tried running this from the command line:

vllm serve mistralai/Mistral-7B-Instruct-v0.3 --max-model-len 32768

How to configure this from the LLM class itself?...
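The excerpt cuts off there, but the equivalent knob on the LLM class is the max_model_len argument; a sketch mirroring the CLI flag above:

from vllm import LLM

# cap the context length so it fits in the available KV cache,
# the same effect as the --max-model-len CLI flag
llm = LLM(
    model="mistralai/Mistral-7B-Instruct-v0.3",
    max_model_len=32768,
)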

vllm - curl for completion

Something I need to keep handy when running or testing vLLM. Start up your model using the following command:

vllm serve Qwen/Qwen2.5-1.5B-Instruct

Then run the following to generate up to 100 tokens of output from the model:

curl http://localhost:8000/v1/completions \
    -H "Content-Type: application/json" \
    -d '{
        "model": "Qwen/Qwen2.5-1.5B-Instruct",
        "prompt": "San Francisco is a",
        "max_tokens": 100,
        "temperature": 0
    }'
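If you want the chat-style endpoint instead, the same server also exposes /v1/chat/completions; a sketch using the same model name:

curl http://localhost:8000/v1/chat/completions \
    -H "Content-Type: application/json" \
    -d '{
        "model": "Qwen/Qwen2.5-1.5B-Instruct",
        "messages": [{"role": "user", "content": "Say hello"}],
        "max_tokens": 100
    }'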

vllm error in running sample: "reshape_and_cache_cpu_impl" not implemented for 'Half'

When running one of the basic examples, I hit the error above. To resolve it, you just need to add dtype and set it to "float32", as shown below:

from vllm import LLM, SamplingParams

# Sample prompts.
prompts = [
    "Hello, my name is",
    "The president of the United States is",
    "The capital of France is",
    "The future of AI is",
]

# Create a sampling params object.
sampling_params = SamplingParams(temperature=0.8, top_p=0.95)

def main():
    # Create an LLM. float32 avoids the half-precision CPU kernel that raised the error.
    llm = LLM(model="facebook/opt-125m", dtype="float32")
    # Generate texts from the prompts.
    outputs = llm.generate(prompts, sampling_params)
    for output in outputs:
        print(f"Prompt: {output.prompt!r}, Generated: {output.outputs[0].text!r}")

if __name__ == "__main__":
    main()

Example output after the fix

operator torchvision::nms does not exist when setting up vllm

I was trying to run a basic app in vLLM but ran into the error above. I had a run-in with this some time ago but had forgotten about it. My VM does not have a GPU, and I had just installed the default (non-CPU-only) build of vLLM. If you face a similar issue, follow the CPU installation guide here, and AVOID reusing a previously installed version of vllm. After I completed the setup, I was able to run

vllm serve Qwen/Qwen2.5-1.5B-Instruct

github action publishing nuget packages

To publish NuGet packages for your dotnet application using GitHub Actions, you can use the following YAML; the directory structure in GitHub is pretty straightforward. You will notice that I have configured permissions for packages. This is to ensure we are able to use the default secrets.GITHUB_TOKEN to publish the NuGet packages, so we're not using a PAT token here.

name: .NET

on:
  push:
    branches: [ "main" ]
  pull_request:
    branches: [ "main" ]

permissions:
  packages: write

jobs:
  build-and-publish:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        dotnet-version: [ '8.0' ] # Add all your supported versions
    steps:
    - name: Checkout code
      uses: actions/checkout@v4
    - name: Set up .NET ${{ matrix.dotnet-version }}
      uses: actions/setup-do...
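The excerpt ends mid-step, but the actual publish step typically looks something like this sketch (the package glob and OWNER are placeholders):

    - name: Push package to GitHub Packages
      run: dotnet nuget push ./**/*.nupkg --source "https://nuget.pkg.github.com/OWNER/index.json" --api-key ${{ secrets.GITHUB_TOKEN }} --skip-duplicate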

github actions setup and authenticating terraform provider using managed identity

You can use the following YAML to set up Terraform's azurerm provider to authenticate using a managed identity. All you need is the following information:

- AZURE_CLIENT_ID - this would be your managed identity's client id
- AZURE_SUBSCRIPTION_ID
- AZURE_TENANT_ID

The permissions block is important; you need it for this to work.

name: 'Build .Net app'

on: [ push, workflow_dispatch ]

permissions:
  id-token: write
  contents: read

jobs:
  build-and-deploy:
    runs-on: ubuntu-latest
    steps:
      - name: Azure login
        uses: azure/login@v2
        with:
          client-id: ${{ secrets.AZURE_CLIENT_ID }}
          tenant-id: ${{ secrets.AZURE_TENANT_ID }}
          subscription-id: ${{ secrets.AZURE_SUBSCRIPTION_ID }}
      - name: Azure CLI script
  ...
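The excerpt stops there; for the Terraform steps themselves, the azurerm provider would typically pick up the same identity through ARM_* environment variables, roughly like the sketch below (this assumes the provider's OIDC support; variable names may differ between provider versions):

      - name: Terraform init and apply
        env:
          ARM_CLIENT_ID: ${{ secrets.AZURE_CLIENT_ID }}
          ARM_TENANT_ID: ${{ secrets.AZURE_TENANT_ID }}
          ARM_SUBSCRIPTION_ID: ${{ secrets.AZURE_SUBSCRIPTION_ID }}
          ARM_USE_OIDC: "true"   # assumed: lets azurerm exchange the workflow's OIDC token
        run: |
          terraform init
          terraform apply -auto-approve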

how to call azure REST API endpoint

The following are the steps you need to take to call an Azure REST API endpoint. We need to obtain a token and then hit the endpoint using bearer token authorization.

TOKEN=$(az account get-access-token --query accessToken --output tsv)

curl -X GET \
  "https://management.azure.com/subscriptions/$SUBSCRIPTION_ID/resourceGroups/$RESOURCE_GROUP/providers/Microsoft.Sql/servers/$SERVER_NAME?api-version=2024-02-01" \
  -H "Authorization: Bearer $TOKEN"

azure devops terraform provider - needs the following permission(s) on the resource Users to perform this action: Add Users

While trying to set up my users using the Azure DevOps Terraform provider, I bumped into this error: "needs the following permission(s) on the resource Users to perform this action: Add Users". I then had to update the following permission to get it to work.

llm tool to help with quantization

To decrease the size and memory footprint of machine learning models, a technique called quantization is employed. Akin to lossy image compression, it converts model weights into lower-precision formats such as 8-bit or 4-bit. For example, a 7B-parameter model stored as 16-bit weights takes roughly 14 GB, while a 4-bit quantization of the same model needs only about 3.5 GB plus a small overhead for scaling factors. While this significantly reduces resource demands, it's important to note that, like image compression, quantization can lead to a reduction in the model's accuracy. Tools you can use:

https://github.com/ModelCloud/GPTQModel
https://github.com/casper-hansen/AutoAWQ

k8s gateway api - setting up gateway and routes to different service in a cluster

To set up the k8s Gateway API in a GKE cluster, you typically have to:

- create the gateway
- set up the necessary HTTP routes
- deploy your service A and service B

Set up the gateway by applying the following YAML:

apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: my-shared-gateway
spec:
  gatewayClassName: gke-l7-regional-external-managed
  listeners:
    - protocol: HTTP # Or HTTPS for production
      port: 80 # Or 443 for HTTPS
      name: http
      hostname: "*.example.com"

Create your HTTP route A:

apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: a-route
  namespace: default
spec:
  parentRefs:
  - name: my-shared-gateway
  hostnames:
  - "a.example.com"
  rules:
  - matches:
    - path:
        type: PathPrefix
        value: /
  ...
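The route for service B follows the same shape; a sketch that completes the picture (the Service name and port are assumed):

apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: b-route
  namespace: default
spec:
  parentRefs:
  - name: my-shared-gateway
  hostnames:
  - "b.example.com"
  rules:
  - matches:
    - path:
        type: PathPrefix
        value: /
    backendRefs:
    - name: service-b   # assumed Service name
      port: 80          # assumed Service port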

gke gateway :- error ensuring load balancer: generic::invalid_argument: Insert: Invalid value for field 'resource.target'

This error comes about when you do not have a proxy-only subnet created for your region and you try to deploy a load balancer or create the Gateway API's gateway resources. These load balancers use a Google-managed proxy layer, and for internal communication between the proxies and your backend VMs, GCP needs a reserved IP range within your VPC. That's where the proxy-only subnet comes in. You can create it by following: https://cloud.google.com/load-balancing/docs/proxy-only-subnets#proxy_only_subnet_create After I applied the fix, my gateway resources started to sync successfully, and an IP was then allocated for my gateway.
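As a sketch of that step (the subnet name, network, region and range are placeholders), the subnet just needs the managed-proxy purpose:

gcloud compute networks subnets create proxy-only-subnet \
    --purpose=REGIONAL_MANAGED_PROXY \
    --role=ACTIVE \
    --region=us-central1 \
    --network=my-vpc \
    --range=10.129.0.0/23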

gke - deploying multiple llm models with gradio interface

First create a GKE cluster that can provide GPUs (required by the Hugging Face Text Generation Inference server). Once you have provisioned one, create a secret:

kubectl create secret generic l4-demo \
  --from-literal=HUGGING_FACE_TOKEN=hf_token

Next, choose the model of your choice for deployment. In this example we are going to use Falcon; please refer to this page here for other models.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: llm
spec:
  replicas: 1
  selector:
    matchLabels:
      app: llm
  template:
    metadata:
      labels:
        app: llm
    spec:
      containers:
      - name: llm
        image: us-docker.pkg.dev/deeplearning-platform-release/gcr.io/huggingface-text-generation-inference-cu121.1-4.ubuntu2204.py310
        resources: ...
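The excerpt stops at the container resources. For the Gradio interface (or anything else in the cluster) to reach the model server, the deployment would typically be exposed with a Service along these lines (a sketch; the target port is assumed to match whatever port the TGI container is configured to listen on):

apiVersion: v1
kind: Service
metadata:
  name: llm-service
spec:
  selector:
    app: llm
  ports:
  - protocol: TCP
    port: 80
    targetPort: 8080   # assumed TGI listen port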