vllm : Failed to infer device type
Running vLLM on a machine without a GPU typically triggers this error. In my case, applying only one of the settings below was not enough; I had to configure both:

1. Set the environment variable CUDA_VISIBLE_DEVICES to an empty string ("").
2. In the command line, pass --device cpu and remove --tensor-parallel-size. For example:

   python3 -m vllm.entrypoints.openai.api_server --port 8080 --model deepseek-ai/DeepSeek-R1 --device cpu --trust-remote-code --max-model-len 4096

References

https://docs.vllm.ai/en/latest/getting_started/troubleshooting.html#failed-to-infer-device-type
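The two fixes can be combined in one small shell snippet. This is a sketch, reusing the port, model name, and flags from the example command; it assumes vLLM is installed with CPU support:

```shell
# Hide all CUDA devices so vLLM does not try to probe for a GPU.
export CUDA_VISIBLE_DEVICES=""

# Start the OpenAI-compatible server on CPU.
# Note: --tensor-parallel-size is omitted, since it only applies to GPU setups.
python3 -m vllm.entrypoints.openai.api_server \
    --port 8080 \
    --model deepseek-ai/DeepSeek-R1 \
    --device cpu \
    --trust-remote-code \
    --max-model-len 4096
```

Because CUDA_VISIBLE_DEVICES is exported, it also applies to any worker processes vLLM spawns, which is why setting it in the shell (rather than only on the command line) helped in my case.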