Posts

Showing posts from September, 2024

AKS turning on network watcher linux extension

Sometimes we need to do a packet capture or some network monitoring on an AKS cluster's running nodes. To do that, we first need to configure the Network Watcher Linux extension.

Go to your AKS cluster's VMSS instance: AKS cluster -> Properties -> Infrastructure resource group -> select your VMSS. Find and install the following extension. Once I installed it, I was able to do my packet capture. Make sure you have checked "Network watcher". Once you kick-start the process, it will start to run. You can then stop it by right-clicking on your newly created job and selecting "Stop" packet capture. The output is a pcap file, which you can open with tools like Wireshark.
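If you prefer the CLI over the portal, a minimal sketch of installing the agent on the node pool's scale set (resource group and VMSS names are placeholders; the extension name and publisher are the standard Network Watcher agent for Linux):

az vmss extension set --resource-group <infra-resource-group> --vmss-name <aks-vmss-name> --name NetworkWatcherAgentLinux --publisher Microsoft.Azure.NetworkWatcher
az vmss update-instances --resource-group <infra-resource-group> --name <aks-vmss-name> --instance-ids "*"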

using network watcher to quickly diagnose VM connection issue

To verify flow connectivity from a VM to Bing or the external internet:

az network watcher test-ip-flow --direction 'outbound' --protocol 'TCP' --local '10.0.0.4:60000' --remote '13.107.21.200:80' --vm 'myVM' --nic 'myVmVMNic' --resource-group 'myResourceGroup' --out 'table'

To verify flow from another remote VM or IP:

az network watcher test-ip-flow --direction 'inbound' --protocol 'TCP' --local '10.0.0.4:80' --remote '10.10.10.10:6000' --vm 'myVM' --nic 'myVmVMNic' --resource-group 'myResourceGroup' --out 'table'

Then you may want to see which NSG rules are being applied:

az network nic list-effective-nsg --resource-group 'myResourceGroup' --name 'myVmVMNic'

Please note that this only applies to VMs. It won't work for a Kubernetes cluster. If you're trying to troubleshoot a Kubernetes cluster, you ...

installing k3s - getting the following error: System has not been booted with systemd as init system (PID 1). Can't operate. Failed to connect to bus: Host is down

Hit this issue trying to get k3s working on Windows Subsystem for Linux (WSL).

Ensure you have the following configuration in your /etc/wsl.conf:

[boot]
systemd=true

From a PowerShell command prompt, you may need to run:

wsl --update
wsl --shutdown

Then restart your WSL and run the k3s install script:

curl -sfL https://get.k3s.io | sh -

And that should be it. Notice the following output from kubectl.
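A quick check after the install (a minimal sketch; k3s bundles kubectl, so no separate kubeconfig setup is assumed):

sudo k3s kubectl get nodes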

azure aks cni overlay with cilium powered network policy - unable to update to calico

I'm getting the error "Cilium dataplane requires network policy cilium" when trying to update my network policy to use Calico. I think the only way around this is to not turn on Cilium during the initial AKS setup, and specify Calico or the Azure network policy instead.
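As a hedged sketch of that approach (resource names are placeholders; the flags are the standard AKS networking options), creating the cluster with Calico from the start would look something like:

az aks create --resource-group myResourceGroup --name myAKSCluster --network-plugin azure --network-plugin-mode overlay --network-policy calico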

Allow Azure services and resources to access this server - does it include azure devops services?

In short, no. "Allow Azure services and resources to access this server" covers Azure services such as virtual machines and Azure App Service. Azure DevOps is NOT part of these services. From the Azure docs: checking "Allow Azure services and resources to access this server" adds an IP-based firewall rule with a start and end IP address of 0.0.0.0.
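For reference, the same rule can be created explicitly from the CLI (a minimal sketch; server and resource group names are placeholders, and AllowAllWindowsAzureIps is the conventional name Azure gives this rule):

az sql server firewall-rule create --resource-group myResourceGroup --server my-sql-server --name AllowAllWindowsAzureIps --start-ip-address 0.0.0.0 --end-ip-address 0.0.0.0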

databricks troubleshooting download file path using dbutils.fs.cp

When downloading a file using the dbutils.fs.cp command, sometimes we might not have a clear understanding of where we are placing the file. For example:

import os

os.environ["UNITY_CATALOG_VOLUME_PATH"] = "databrick-my-store"
os.environ["DATASET_DOWNLOAD_URL"] = "https://health.data.ny.gov/api/views/jxy9-yhdk/rows.csv"
os.environ["DATASET_DOWNLOAD_FILENAME"] = "rows.csv"

dbutils.fs.cp(f"{os.environ.get('DATASET_DOWNLOAD_URL')}", f"{os.environ.get('UNITY_CATALOG_VOLUME_PATH')}/{os.environ.get('DATASET_DOWNLOAD_FILENAME')}")

When it is downloaded, we often try to read it using spark.read.csv, but we need to know the path to our file:

df = spark.read.csv("dbfs:/databrick-my-store/rows.csv", header=True, inferSchema=True)

We can quickly figure out what to pass to spark.read.csv by using the following command. Please note ...
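The snippet is cut off here; a listing call along these lines is the usual way to confirm where the file landed (a minimal sketch, with the path taken from the example above):

display(dbutils.fs.ls("dbfs:/databrick-my-store"))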

using powershell to set folder permission recursively for a user

We can use icacls to set folder permissions for a user with the following command. The /T option applies it to the folder and its subfolders (recursive):

icacls "C:\Path\To\Your\Folder" /grant "Username:(OI)(CI)(M)" /T

Another option would be to use:

$folder = "C:\MyFolder"
$acl = Get-Acl $folder
$rule = New-Object System.Security.AccessControl.FileSystemAccessRule("JohnDoe", "Modify", "ContainerInherit, ObjectInherit", "None", "Allow")
$acl.SetAccessRule($rule)
# Apply to the folder itself
Set-Acl -Path $folder -AclObject $acl
# Apply recursively to everything underneath
Get-ChildItem -Path $folder -Recurse | ForEach-Object {
    $itemAcl = Get-Acl $_.FullName
    $itemAcl.SetAccessRule($rule)
    Set-Acl -Path $_.FullName -AclObject $itemAcl
}
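To double-check the result afterwards (a quick sketch; the folder name follows the example above):

(Get-Acl "C:\MyFolder").Access | Format-Table IdentityReference, FileSystemRights, IsInherited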

linux - check if a file supports 32 or 64 bit

You can run file my-linux-executable to reveal whether it is a 32-bit or 64-bit binary. At the OS level, you can use 'uname -m'.
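Typical output looks something like this (using /bin/ls purely as an example; details will vary by distro):

file /bin/ls
/bin/ls: ELF 64-bit LSB pie executable, x86-64, version 1 (SYSV), dynamically linked ...

uname -m
x86_64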

windows - listing supported tls cipher suites on a windows server

You can run the following command on a Windows machine to check which cipher suites are supported:

Get-TlsCipherSuite | Format-Wide

If you would like to check the cipher suites in the registry:

Get-ItemProperty -Path "HKLM:\SYSTEM\CurrentControlSet\Control\Cryptography\Configuration\Local\SSL\00010002" | Select-Object -ExpandProperty Functions

This by itself is not much use - nobody is going to look at the cipher suites without a reason. It usually comes up as part of a debugging session when we try to connect to another website. The easiest way is to use SSL Labs and enter the domain you would like to connect to: https://www.ssllabs.com/ssltest/index.html

Sometimes the server might not be hosted on the public web, which is why we need:

Nmap:
nmap --script ssl-enum-ciphers -p 443 www.example.com

OpenSSL - can be quite tedious:
openssl s_client -connect example.com:443 -cipher ECDHE-RSA-AES256-GCM-SHA384

Additional info on changing the order of TLS cipher suites - to change the order (not to add new ones) ...
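The post is cut off here; as a hedged sketch (these are the standard TLS cmdlets on recent Windows Server versions, and the suite names are just examples), reordering or removing a suite looks something like:

Enable-TlsCipherSuite -Name "TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384" -Position 0
Disable-TlsCipherSuite -Name "TLS_RSA_WITH_3DES_EDE_CBC_SHA"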

c# int[,] vs int[][]

int[,] declares a multi-dimensional (rectangular) array - an x-by-y array of int in this case. Due to its fixed size, it is usually stored as one contiguous block of memory. For example:

int[,] matrix = {
    { 0, 0, 0, 1 },
    { 0, 1, 0, 0 },
    { 0, 0, 1, 0 },
    { 1, 0, 0, 0 },
    { 0, 0, 0, 0 }
};

You can't do this, because the rows of a rectangular array must all be the same size:

int[,] matrix_error = {
    { 0, 0, 0, 1 },
    { 0, 1 },
    { 0, 0, 1, 0 },
    { 1, 0 },
    { 0, 0, 0, 0 }
};

To traverse it:

int[,] matrix = {
    { 1, 2, 3, 4 },
    { 5, 6, 7, 8 },
    { 9, 10, 11, 12 },
    { 1, 0, 0, 0 },
    { 0, 0, 0, 0 }
};
var m = matrix.GetLength(0);
var n = matrix.GetLength(1);
for (int row = 0; row < m; row++)
{
    for (int col = 0; col < n; col++)
    {
        Console.WriteLine(matrix[row, col]);
    }
}

Also notice how we access elements with matrix[row, col] rather than matrix[row][col].
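For contrast, a minimal sketch of the jagged form, int[][]: each row is its own array, so rows may have different lengths (the sample values here are just for illustration):

int[][] jagged =
{
    new[] { 0, 0, 0, 1 },
    new[] { 0, 1 },
    new[] { 0, 0, 1, 0 }
};

for (int row = 0; row < jagged.Length; row++)
{
    for (int col = 0; col < jagged[row].Length; col++)
    {
        Console.WriteLine(jagged[row][col]);
    }
}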

databricks delta live tables intro

A Delta Live Table is a declarative way to process data in a pipeline. You essentially code this in a notebook and generate a pipeline from it. Let's say you have the following code.

Customers:

from pyspark.sql.functions import *
from pyspark.sql.types import *
import dlt

@dlt.view(
  comment="The customers buying finished products, ingested from /databricks-datasets."
)
def customers():
  return spark.read.csv('/databricks-datasets/retail-org/customers/customers.csv', header=True)

Sales_Orders_Raw:

@dlt.table(
  comment="The raw sales orders, ingested from /databricks-datasets.",
  table_properties={
    "myCompanyPipeline.quality": "bronze",
    "pipelines.autoOptimize.managed": "true"
  }
)
def sales_orders_raw():
  return (
    spark.readStream.format("cloudFiles")
      .option("cloudFiles.schemaLocation", "/tmp/john.odwyer/pythonsalestest") ...
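The snippet above is cut off, but to illustrate the declarative idea, a downstream table can simply refer to the datasets defined earlier. A minimal sketch (the join column is an assumption based on the retail-org sample data):

@dlt.table(comment="Sales orders joined with customer details.")
def sales_orders_cleaned():
  return (
    dlt.read_stream("sales_orders_raw")
      .join(dlt.read("customers"), on="customer_id", how="left")
  )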

databricks connecting to mongodb atlas data source

First we create a notebook, and then we install the required Python library for accessing MongoDB Atlas:

%pip install pymongo

Then we obtain the required connection string from MongoDB Atlas. Next, you set up the data source and connection string by adding the following code:

connectionString = "mongodb+srv://your-user-name:your-password@cluster0.psk0yd7.mongodb.net/?retryWrites=true&w=majority&appName=Cluster0"
database = "sample_supplies"
collection = "sales"

df = spark.read.format("com.mongodb.spark.sql.DefaultSource") \
       .option("database", database) \
       .option("spark.mongodb.input.uri", connectionString) \
       .option("collection", collection) \
       .load()

display(df)

Run the code to get connected, pull out all the data, and display it.

databricks how to install additional java packages

To connect to MongoDB Atlas, you need the proper library and a compute instance set up already. Select the Libraries tab and click on "Install new". Then select Maven and click on "Search packages". You will see a whole list of packages available, from Maven Central or Spark Packages. We are just going to search for "mongo-spark" and select this package.

If you're getting error messages similar to the one shown here, then try provisioning your compute as a single-user personal instance:

"Jars and Maven Libraries on Shared Clusters must be on the allowlist. Failed Libraries: org.mongodb.spark:mongo-spark-connector_2.12:3.0.1: PERMISSION_DENIED: 'org.mongodb.spark:mongo-spark-connector_2.12:3.0.1' is not in the artifact allowlist"

databricks - add data that resides in AWS s3

To connect to a data source you have placed in an S3 bucket, go to Databricks -> Catalog. Then click on "+" to start the integration process and select AWS Quickstart. This will help you create the necessary role and permissions in your AWS stack. Then click on "Next".

Provide your S3 bucket name, then click on "Generate PAT token". Make sure you copy it, and then click on "Launch Quickstart" - you need to paste this token into the AWS CloudFormation stack. The rest of the information will already be populated in the AWS console; all you need to do is paste in your PAT token and click "Next". Once it completes - about 5-10 minutes - refresh your Databricks workspace and you should be able to see all your files there.

windbg memory leak trail

In this scenario, we are trying to explore a dump for memory leaks. Maybe it will be easier to focus on certain commands instead of learning all the windbg commands.

Analyzing memory: open your dump file and then run the following.

!address -summary

This gives a bird's-eye view of what's happening with the memory.

!dumpheap -stat

You can see that we have about 20,000 objects of type Product, taking up about 0.8 MB. If you click on 7ffb81e16330 (the method table address), it will display a whole list of Product instances. Click on any one of them, or run

!dumpobj /d 185b3ff8260

and you will be able to see detailed information about the object. In this case, it contains only a simple name, id and details. To see what is holding on to these objects - that is, what still has a direct reference to an instance - we run !gcroot followed by the address of any Product instance we identified earlier:

!gcroot 0185b3ff3398

Look at the strong handle in the screenshot below. To see what's happening in o...
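Without the clickable links in the UI, the same listing can be produced with a command (a quick sketch; the MT value is the one reported by !dumpheap -stat above):

!dumpheap -mt 7ffb81e16330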