Posts

gke running batch ml workload using redis and spot instances

We can set up a Kubernetes Job to run our machine learning workload, using Redis to manage the job queue and saving checkpoints to Filestore.

First, create the Filestore instance by running the following command:

    gcloud filestore instances create batch-aiml-filestore \
        --zone=australia-southeast2-a \
        --tier=BASIC_HDD \
        --file-share=name="NFSVol",capacity=1TB \
        --network=name="default"

Next, we need the IP address of the Filestore instance so that we can substitute it into our Kubernetes manifest:

    gcloud filestore instances list \
        --project=$PROJECT_ID \
        --zone=australia-southeast2-a

Then we can proceed by cloning the GCP Kubernetes samples repository:

    git clone https://github.com/GoogleCloudPlatform/kubernetes-engine-samples
    cd kubernetes-engine-samples/batch/aiml-workloads
    sed -i "\
      s/ <FILESTORE_IP_ADDRESS> /192.168.147.21...
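The IP lookup and placeholder substitution can also be scripted instead of copying the address by hand. The following is a minimal sketch, not the method from the samples repository: the function names `filestore_ip` and `render_manifest` are hypothetical, and the `--format` expression is an assumption about how the instance's first IP address can be printed with `gcloud`.

```shell
# Hypothetical helper: print the first IP address of a Filestore instance.
# $1 = instance name, $2 = zone. Requires gcloud; the --format expression
# is an assumption and may need adjusting.
filestore_ip() {
  gcloud filestore instances describe "$1" \
    --zone="$2" \
    --format="value(networks.ipAddresses[0])"
}

# Hypothetical helper: read a manifest on stdin and replace every
# <FILESTORE_IP_ADDRESS> placeholder with the address passed as $1.
render_manifest() {
  sed "s/<FILESTORE_IP_ADDRESS>/$1/g"
}
```

For example, `render_manifest "$(filestore_ip batch-aiml-filestore australia-southeast2-a)" < manifest.yaml` would print the manifest with the placeholder filled in, where `manifest.yaml` stands for whichever file contains the placeholder.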

vllm with cpu docker image

https://hub.docker.com/r/openeuler/vllm-cpu 

gke running ollama with qwen3:0.6b example

In this post, we are going to deploy Qwen to an Autopilot cluster using the Ollama Helm chart. This is much better than creating your own Dockerfile and deployment YAML. To get started, run the following commands:

    helm repo add ollama-helm https://otwld.github.io/ollama-helm/
    helm repo update ollama-helm
    helm upgrade --install ollama ollama-helm/ollama \
      --namespace ollama \
      --create-namespace \
      --values values.yaml

The values file would look something like this. We are using qwen3:0.6b, but you can choose another model too.

    ollama:
      gpu:
        enabled: false

      service:
        type: "ClusterIP"
      models:
        pull:
          - qwen3:0.6b
    persistentVolume:
      enabled: true
      size: 10Gi
      storageClass: "standard-rwx"

It takes a bit longer, around 5-10 minutes, to provision the storage and get the pod running. It will automatically download your chosen model. In a way it is way better than baking it in...
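Once the pod is running, the model can be queried through Ollama's HTTP API. The sketch below assumes the Helm release exposes a Service named `ollama` on port 11434 in the `ollama` namespace (an assumption based on the release name and Ollama's default port), and the helper names are hypothetical.

```shell
# Hypothetical helper: build a JSON body for Ollama's /api/generate endpoint.
# $1 = model, $2 = prompt. The prompt is not JSON-escaped, so keep it free
# of double quotes and backslashes.
build_generate_payload() {
  printf '{"model": "%s", "prompt": "%s", "stream": false}' "$1" "$2"
}

# Hypothetical helper: port-forward the Service, send one prompt, then
# stop the port-forward. Service name and port are assumptions.
ask_ollama() {
  kubectl port-forward -n ollama svc/ollama 11434:11434 >/dev/null 2>&1 &
  pf_pid=$!
  sleep 3  # give the port-forward a moment to become ready
  curl -s http://localhost:11434/api/generate \
    -d "$(build_generate_payload "$1" "$2")"
  kill "$pf_pid"
}
```

For example, `ask_ollama qwen3:0.6b "Why is the sky blue?"` would send a single non-streaming request to the model pulled by the chart.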

istio gateway vs kubernetes gateway

Sometimes I get confused by the Gateway resource in Istio and the one in the Kubernetes Gateway API. They do almost the same thing, but not exactly. :D

As you can see here, the API group and short name are different:

    $ kubectl api-resources
    NAME       SHORTNAMES   APIGROUP                      NAMESPACED   KIND
    gateways   gw           networking.istio.io/v1beta1   true         Gateway
    gateways   gtw          networking.k8s.io/v1beta1     true         Gateway

Another clear distinction shows up when working with these two resources:

    # Kubernetes Gateway
    $ kubectl get gtw
    NAME                    CLASS
    multi-cluster-gateway   gke-l7-global-external-managed-mc

    $ kubectl get gateway.networking.x-k8s.io
    NAME                    CLASS
    multi-cluster-gateway   gke-l7-global-external-managed-mc

    # Istio Gateway
    $ kubectl get gw
    NAME               AGE
    bookinfo-gateway   64m

    $ kubectl get gateway.networking.istio.io
    NAME               AGE
    bookinfo-gateway   64m
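Because both kinds are called Gateway, using the fully qualified resource name avoids any ambiguity. Here is a small sketch with hypothetical helper names. The resource names are the ones shown in the output above and may differ on clusters running another Gateway API version.

```shell
# Hypothetical helper: map a flavour ("istio" or "k8s") to the fully
# qualified resource name used in the kubectl commands above.
gateway_resource() {
  case "$1" in
    istio) echo "gateway.networking.istio.io" ;;
    k8s)   echo "gateway.networking.x-k8s.io" ;;
    *)     echo "unknown gateway flavour: $1" >&2; return 1 ;;
  esac
}

# Hypothetical helper: list gateways of one flavour in the current namespace.
list_gateways() {
  resource="$(gateway_resource "$1")" || return 1
  kubectl get "$resource"
}
```

Running `list_gateways istio` is then equivalent to `kubectl get gateway.networking.istio.io`, and `list_gateways k8s` lists the Kubernetes Gateway API objects.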