Posts

Showing posts from October, 2023

k8s autoscaler reaction time to scale up or scale down

You can change the autoscaler's reaction time for a pod by adding this annotation at the pod level: "cluster-autoscaler.kubernetes.io/pod-scale-up-delay": "600s"
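As a rough sketch of where that annotation sits (the pod name and image below are made up, not from the original post):

apiVersion: v1
kind: Pod
metadata:
  name: demo-app                  # hypothetical pod name
  annotations:
    # ask the cluster autoscaler to wait 10 minutes before scaling up for this pod
    cluster-autoscaler.kubernetes.io/pod-scale-up-delay: "600s"
spec:
  containers:
  - name: demo-app
    image: nginx                  # placeholder image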

helm adopt - creating new charts with existing k8s resources

Let's say you have deployed k8s resources into a cluster and you want a way to generate a chart from them and deploy it with helm - helm adopt is a good option: https://github.com/HamzaZo/helm-adopt (based on https://github.com/helm/helm/issues/2730). You need to specify all the resources, like deployments and services, in one go, otherwise you get an error. For example, use the following command:

helm adopt resources deployment:mytest-demo services:mytest-demo --output frontend

instead of:

helm adopt resources deployment:mytest-demo --output frontend
helm adopt resources services:mytest-demo --output frontend

helm - unable to change label as it is immutable

I tested this with helm v3.10 and it doesn't seem to happen; it only happened to me when I was using helm v3.7. Perhaps upgrading the helm runtime version will help.

nextjs 13 upgrade to 14

You can do that using the following command - magic:

npm i next@latest react@latest react-dom@latest eslint-config-next@latest

Unfortunately it is not a straightforward upgrade for me, as I am using the Apollo Server integration package. Anyway, I forced it and will be testing that out.

Nextjs 14 : next dev --turbo

Really fast startup after installing Next.js 14 and starting the dev server with the --turbo option.

class NextRequest extends Request { ^ ReferenceError: Request is not defined

This happens when trying to run Next.js 14 on Node 16 :) After you install a newer version of Node, the error goes away.

use the following to stress test - create memory- and CPU-intensive load for your workload

---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: progrium
spec:
  replicas: 2
  selector:
    matchLabels:
      apptype: backend
  template:
    metadata:
      labels:
        app: progrium
        version: v1
        apptype: backend
    spec:
      containers:
      - image: progrium/stress
        imagePullPolicy: IfNotPresent
        name: progrium
        command: ["stress"]
        args: ["--cpu", "100", "--io", "10", "--vm", "2", "--vm-bytes", "928M"]
        ports:
        - containerPort: 80

aks - how does node auto-scaling work

According to the documentation, scale-down considers a node when its memory and CPU requests fall below 50% - so to keep a node, both CPU and memory need to stay above 50%. I also noticed that you need to have a k8s deployment; if you do not have a deployment, those nodes just keep on throttling. There will be a wait of between 10-30 minutes before you start seeing auto-scaling. You don't necessarily need to get your pods into a "Pending" state for it to scale.

FAQ for how autoscaling works: https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/FAQ.md#how-does-scale-down-work

You can also view the status via kubectl top, which gives an accurate account of what's happening. You can get the status of your node auto-scaler using the following command:

kubectl describe configmap --namespace kube-system cluster-autoscaler-status

Not using a k8s deployment yaml: pods need to be in a Pending state and meet the CPU/memory utilization factor of above 50% to scale. If pods can be scheduled,...

Simulating loads using kubectl

You can try the following command to create some stress load on your nodes:

kubectl run progrium4 --image=progrium/stress -- --cpu 80 --io 10 --vm 2 --vm-bytes 928M

referencing a variable from another stage requires dependsOn to be specified

To use a variable output from another stage, you need a direct (one-level) dependency on that stage. Let's say you have created a 3-stage pipeline that looks like this:

Stage A -> Stage B -> Stage C

Variable X is output in Stage A. Stage B is then able to reference Stage A's variable. Stage C will not be able to reference variable X, because it only depends directly on Stage B.
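A minimal sketch of what this could look like in an Azure DevOps YAML pipeline (assuming that is the system in question; the job and step names below are made up): Stage B declares dependsOn: A and maps the output through stageDependencies, while Stage C has no direct dependency on A and therefore cannot see X.

stages:
- stage: A
  jobs:
  - job: Produce
    steps:
    # isOutput=true exposes the variable to other stages
    - bash: echo "##vso[task.setvariable variable=X;isOutput=true]hello"
      name: setX
- stage: B
  dependsOn: A          # direct dependency on A, so its outputs are visible here
  variables:
    varX: $[ stageDependencies.A.Produce.outputs['setX.X'] ]
  jobs:
  - job: Consume
    steps:
    - bash: echo "X from Stage A is $(varX)"
- stage: C
  dependsOn: B          # only depends on B, so X from Stage A is not available here
  jobs:
  - job: Later
    steps:
    - bash: echo "Stage A outputs cannot be referenced here"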

k8s prevent node from being scaled down

The following annotation can be used to prevent a node from being scaled down: "cluster-autoscaler.kubernetes.io/scale-down-disabled": "true"
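As a rough sketch, this is where the annotation ends up in the node's metadata (the node name below is made up); in practice you would normally set it with kubectl annotate rather than editing the node object by hand:

apiVersion: v1
kind: Node
metadata:
  name: aks-nodepool1-12345678-vmss000000   # hypothetical node name
  annotations:
    # tells the cluster autoscaler that this node must not be scaled down
    cluster-autoscaler.kubernetes.io/scale-down-disabled: "true"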

helm - using conditional if to reliably check for nil value

If you're within a scope:

{{- if and ($.Values.envs) ($.Values.envs.demo) }}
  - name: ok
    value: oktoo
{{- end }}

If you're not within any scope:

{{- if and (.Values.envs) (.Values.envs.demo) }}
  - name: ok
    value: oktoo
{{- end }}
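For context, a guess at the sort of values.yaml these conditions would be checking against (this structure is assumed, not taken from the original post); the block only renders when both envs and envs.demo are set:

# values.yaml (hypothetical)
envs:
  demo: "some-value"   # if envs or envs.demo is missing/nil, the if-block is skipped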

golang - net ListenIP network type

It can be difficult to get right - perhaps the docs could be better. If you try to use net.ListenIP, the network is a string parameter; it should be one of the values listed in the code linked below, but it can also just be "ip4" - and I guess that's what trips a lot of developers up.

ipServer, err := net.ResolveIPAddr(CONN_TYPE, CONN_HOST)
if err != nil {
    fmt.Println("unable to resolve ip address", err.Error())
    os.Exit(1)
}
l, err := net.ListenIP("ip4:icmp", ipServer)
if err != nil {
    fmt.Println("Error listening:", err.Error())
    os.Exit(1)
}
defer l.Close()

For a complete list of options available, you can check this link out: https://github.com/golang/go/blob/9341fe073e6f7742c9d61982084874560dac2014/src/net/lookup.go#L22

keycloak 22.0.3 k8s CRDS

Go and clone the keycloak git repository. Then go into the operator folder and run "mvn package". It will generate all the deployment manifests required. Please make sure you have Maven and JDK 17 installed.

Unhandled exception. System.ArgumentException: The connection string used for an Event Hub client must specify the Event Hubs namespace host, and either a Shared Access Key (both the name and value) or Shared Access Signature to be valid

I got an error that looks like the one below. Basically, you're using a root (namespace-level) access key whose connection string doesn't include the event hub name. You probably need to use another constructor if you're using the root managed connection string.

-----
Unhandled exception. System.ArgumentException: The connection string used for an Event Hub client must specify the Event Hubs namespace host, and either a Shared Access Key (both the name and value) or Shared Access Signature to be valid. The path to an Event Hub must be included in the connection string or specified separately. (Parameter 'connectionString')
   at Azure.Messaging.EventHubs.EventHubsConnectionStringProperties.Validate(String explicitEventHubName, String connectionStringArgumentName)
   at Azure.Messaging.EventHubs.Primitives.EventProcessor`1..ctor(Int32 eventBatchMaximumCount, String consumerGroup, String connectionString, String eventHubName, EventProcessorOptions options)
   at Azure.Messaging.EventHubs.EventProcessorClient..ctor(BlobContain...

updating k8s deployment label can cause future helm deployments to fail

After we tried to update an existing helm deployment's label, it complained about label immutability and then blocked all deployments. Unfortunately, there's no easy fix because helm stores this information in its namespace secrets.

golang : no new variables on left side of :=

It just so happens that my code re-declared the same variable name with :=.

go: cannot determine module path for source directory

  To resolve this  go mod init your-module-name/your-module-subname

mongodb university - free course on index design

  https://learn.mongodb.com/courses/mongodb-indexes

Setting up your ts project quickly for node

npm init -y
npm install typescript --save-dev
npm install @types/node --save-dev
npx tsc --init --rootDir src --outDir dist --esModuleInterop --resolveJsonModule --lib es6,dom --module commonjs

typescript generating typings from your project

First start by creating a normal npm project:

npm init

Then run tsc --init to create the tsconfig.json file. Then add the following to your tsconfig.json:

"declaration": true,

When you export code from your TypeScript files, the relevant typings get generated. Source code can be found here.

html - for and autocomplete replacement for react

Weird but kinda interesting: for = htmlFor, autocomplete = autoComplete

nunjucks removing newline when rendering the template

Normally when applying nunjucks templating we end up with extra newlines, and yaml doesn't like this. To get around it, you can use the whitespace-control syntax below. Notice the extra - (dash) at the end of the opening tag and also at the beginning of the closing tag.

{% if values.addManagedIdentitySupport === "auto" -%}
  workloadIdentityEnabled: true
{%- endif %}

Key Vault access denied when first created - checking for presence of existing Secret

I guess the issue here is that we need a depends_on block to help with the access timing.

data "azurerm_client_config" "current" {}

resource "azurerm_resource_group" "example" {
  name     = "tom-devrg3"
  location = "West Europe"
}

resource "azurerm_key_vault" "example" {
  name                = "tomdevkv3"
  location            = azurerm_resource_group.example.location
  resource_group_name = azurerm_resource_group.example.name
  tenant_id           = data.azurerm_client_config.current.tenant_id
  sku_name            = "premium"
}

resource "azurerm_key_vault_access_policy" "example" {
  key_vault_id = azurerm_key_vault.example.id
  tenant_id    = data.azurerm_client_config.current.tenant_id
  object_id    = data.azurerm_client_config.current.object_id

  secret_permissions = [
    "delete",
    "get",
    "set",
  ]
}

resource "azurerm...

Managed Identities object id

The managed identity object id and application id are the same whether you view them in the Azure portal under Managed Identity or under Microsoft Entra -> Enterprise Applications -> Managed Identity.

az cli - getting details about a managed identity

This is a really handy command to get information about your managed identity or service principal. It returns the displayName and Application Id.

az ad sp list --display-name 'your-managed-identity-name'

This can be very useful compared to az identity show, which requires a resource group - and sometimes we might not know what the resource group is.

service account long-lived token

We can create a service account easily using the following yaml:

apiVersion: v1
kind: ServiceAccount
metadata:
  name: robot
  namespace: default

Next, we can generate a long-lived token:

apiVersion: v1
kind: Secret
metadata:
  name: my-long-lived-secret
  annotations:
    kubernetes.io/service-account.name: robot
type: kubernetes.io/service-account-token

Then we can decode it by using kubectl describe secret/my-long-lived-secret. There's no audience - what cloud resources would be able to allow or deny access based on this service account token? Then use postman to check the token:

curl --location 'http://localhost:8080/api/v1/namespaces/default/serviceaccounts/robot/token' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer eyJhbGciOiJSUzI1NiIsImtpZCI6IlpUSzhiVVRXYVdQN1RoTGgxODVyVTJFSk1jNzNYQ0EtZlVEckt1YnZoWkkifQ.eyJpc3MiOiJrdWJlcm5ldGVzL3NlcnZpY2VhY2NvdW50Iiwia3ViZXJuZXRlcy5...

What sort of permission is required to execute aks get credentials

In general, a person or service principal needs the following permission to be able to run az aks get-credentials successfully:

Microsoft.ContainerService/managedClusters/listClusterUserCredential/action

Getting Google access code via postman

First start by creating a client in the Google console under APIs and Services. On your left, click on Credentials -> Create Credentials -> OAuth client ID, and create a Windows desktop client. Then enter the details for your client and choose Desktop for the "client type". Please note down the client id, which you will be using next.

Then replace client_id below with the actual client id generated, and copy and paste this into your browser:

https://accounts.google.com/o/oauth2/v2/auth?client_id=&response_type=code&scope=openid&redirect_uri=urn:ietf:wg:oauth:2.0:oob

From the browser you then get the authorization code. Then you need to use the following curl command:

curl --location 'https://oauth2.googleapis.com/token' \
--header 'Content-Type: application/x-www-form-urlencoded' \
--data-urlencode 'client_id=<your-client-id>' \
--data-urlencode 'code=<authorization-code-from-previous-steps>...

Principals of type Application cannot validly be used in role assignments

While running my terraform code I ran into this issue. It is really good to know that it is important to differentiate between the id used for authentication and the id used for authorization. The application object id in App Registrations is used for authentication; the enterprise application object id is used for authorization. In short, I was using the wrong id when doing the role assignment. When doing a role assignment, ensure that you're using the Enterprise Application object id.

Microsoft Entra - Authorizing 3rd party OIDC provider using Auth0 token - There is an issue with the key ''. It has both x5t and x5c values, but they do not match. Please make sure the x5t value is the Base64Url-encoded SHA-1 thumbprint of the first certificate in x5c.

I am getting the following error trying to merge an SSL cert with the item I created in Key Vault to get the CSR for the order.

Error information:
CODE: BadParameter
MESSAGE: Something went wrong with the certificate creation.
RAW ERROR: Property x5c has invalid value. X5C must have at least one valid item.

I found the below linked question but the solution does not work, as my cert already has the documented fix and I still get the error. I have put a comment on that question as well as raising this one to try and get a response to the issue.

https://learn.microsoft.com/en-us/answers/questions/713593/got-the-error-while-merging-the-certificate-in-azu.html

So I am looking for a solution so I can import the certificate and create the additional items required for the web site. Sample of my post and the token obtained from Auth0:

curl --location 'https://login.microsoftonline.com/mytenant/oauth2/v2.0/token' \
--header 'Content-Type: application/x-www-form-urlencoded' \
--data-urlencod...

"AADSTS700222: AAD-issued tokens may not be used for federated identity flows

Tokens issued by Azure AD for a different subscription are not supported, and hence cannot be used as part of the federated identity flow.

k8s yaml rendering issues

Symptoms: the rendered yaml has an empty value {} where you expect it to have a value, yet after checking the helm manifest it turns out to be ok. One of the ways I like to debug this is to output the helm manifest to a normal file, say my-deployment.yaml, then use a good editor to view the generated yaml. Normally the issue will be a duplicate field, like a repeated "spec" field. After you have gone through it, try running kubectl apply -f my-deployment.yaml and look for any deployment messages. Chances are the problem is in the generated yaml.

k8s service basic - service selector

What happens if your service selector doesn't match any of your deployment/pod labels? There won't be any endpoints visible.

In terms of the selector, matching just one of the labels is sufficient. For example, this is your deployment label:

  template:
    metadata:
      labels:
        app: httpbin
        version: v1
        apptype: backend

So your service selector just needs to include any one of them:

spec:
  ports:
  - name: http
    port: 8000
    targetPort: 80
  selector:
    #app: httpbin
    #apptype: backend
    version: v1

The same applies to deployment resources.

Azure AD Error codes

This might come in handy in case you're looking for error codes and their definitions: https://learn.microsoft.com/en-us/azure/active-directory/develop/reference-error-codes. Generally, the response provides enough detail that we don't have to look up the code.

Getting token for initiating REST based request to Google vertex AI

  gcloud auth print-access-token

Google generative AI Self Pace Tutorial

https://www.cloudskillsboost.google/focuses/63250?parent=catalog  

AKS - what if you want to scale out certain / specific nodes

 You can use VMSS protection from Azure https://learn.microsoft.com/en-us/azure/virtual-machine-scale-sets/virtual-machine-scale-sets-instance-protection

securityContext in kubernetes

Security contexts in kubernetes are confusing, partly because the same name appears both at deployment.spec.template.spec.securityContext and at deployment.spec.template.spec.containers[].securityContext. As you will see in the yaml below, the securityContext defined at the container level OVERRIDES the securityContext at the pod (template) level. Have a look at the yaml below: you'll notice that we have 2 securityContext blocks. I have provided the link to the docs so you can have some fun there.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
spec:
  selector:
    matchLabels:
      app: nginx
  replicas: 1
  template:
    metadata:
      labels:
        app: nginx
    spec:
      securityContext:
        # https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.27/#podsecurityco...
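The yaml above is cut off; as a minimal sketch of the override behaviour (the user and group IDs below are made up for illustration, not from the original post), the container-level block wins wherever the two overlap:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
spec:
  selector:
    matchLabels:
      app: nginx
  replicas: 1
  template:
    metadata:
      labels:
        app: nginx
    spec:
      # pod-level securityContext: applies to every container by default
      securityContext:
        runAsUser: 1000
        runAsGroup: 3000
      containers:
      - name: nginx
        image: nginx
        # container-level securityContext: overrides overlapping pod-level fields
        securityContext:
          runAsUser: 2000   # this container runs as UID 2000, not 1000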

getting error trying to run strace with kubectl debug mode

kubectl debug is quite limited and won't let you add, for example, SYS_PTRACE (needed to run strace on your app). To resolve this, try adding shareProcessNamespace and SYS_PTRACE in your deployment or pod. For example:

apiVersion: v1
kind: Pod
metadata:
  name: nginx
spec:
  shareProcessNamespace: true
  containers:
  - name: nginx
    image: nginx
  - name: shell
    image: ubuntu
    command: ["sleep", "3600"]
    securityContext:
      capabilities:
        add:
        - SYS_PTRACE
    stdin: true
    tty: true

As you can see, we have a shell container that just runs the ubuntu image. You can install strace and run strace -p PROCESS-ID since these containers are already sharing a process namespace. Also notice the SYS_PTRACE entry - this resolves the strace permission issues.

k8s shared process docs sample - doesn't quite work

In the docs example, the following yaml is given, but when it asks you to run

kubectl attach -it nginx -c shell

it just goes into a black hole. To resolve this, try:

kubectl exec -it nginx -c shell sh

apiVersion: v1
kind: Pod
metadata:
  name: nginx
spec:
  shareProcessNamespace: true
  containers:
  - name: nginx
    image: nginx
  - name: shell
    image: ubuntu
    command: ["sleep", "3600"]
    securityContext:
      capabilities:
        add:
        - SYS_PTRACE
    stdin: true
    tty: true

Unable to use a TTY - input is not a terminal or the right kind of file

I was using the wrong terminal on my Windows machine. Switching to PowerShell resolved it.

AzRoleAssignment for service principal.

Surprisingly, the powershell command to add a role assignment to a service principal can be given the application id via -ApplicationId:

New-AzRoleAssignment -ApplicationId <service-principal-application-id> -RoleDefinitionName "Azure Kubernetes Service Cluster User Role" -Scope "your-resource-scope"

AKS setting up user/group RBAC

Wiring up your Azure AD users to your AKS cluster seems to be a good way to go about managing and securing resources. AKS has some built-in cluster roles that we can use, but I think many would end up creating their own special roles:

- cluster-admin
- admin
- edit
- view

A quick and dirty way to use the existing cluster roles (if you want to test some stuff out):

Setting up by group:

kubectl create clusterrolebinding <name-of-your-cluster-role-binding> --clusterrole=view --group=<Azure AD group object ID>

Setting up by user:

kubectl create clusterrolebinding <name-of-your-cluster-role-binding> --clusterrole=view --user=<Azure AD user object ID>

A more formal way of doing it:

role.yaml

kind: Role
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: dev-user-full-access
  namespace: dev
rules:
- apiGroups: ["", "extensions", "apps"]
  resources: ["*"]
  verbs: ["*"]
-...
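The post is truncated above; as a rough sketch of the step that usually follows (the binding name and the group object ID below are placeholders, not from the original post), the Role would typically be paired with a RoleBinding that points at the Azure AD group:

rolebinding.yaml

kind: RoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: dev-user-access                          # hypothetical binding name
  namespace: dev
subjects:
- kind: Group
  apiGroup: rbac.authorization.k8s.io
  name: "00000000-0000-0000-0000-000000000000"   # placeholder Azure AD group object ID
roleRef:
  kind: Role
  name: dev-user-full-access
  apiGroup: rbac.authorization.k8s.io

Both files would then be applied with kubectl apply -f.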

az cli - newly created service principal unable to login with the following error: ERROR: No subscriptions found for

I followed a Stack Overflow guide: went to my subscription -> IAM -> Add role assignment -> selected my service principal and voila... done!

install az cli in linux

curl -sL https://aka.ms/InstallAzureCLIDeb | sudo bash

or

curl -sL https://aka.ms/InstallAzureCLIDeb | bash

az cli login as a service principal

The following command could be handy to help with az cli - logging in as a service principal.

az login \
  --service-principal \
  --tenant <Tenant-ID> \
  --username <Client-ID> \
  --password <Client-secret> \
  --output table