Posts

Showing posts from May, 2025

model context protocol - error - " is not valid JSON {"context":"connection","stack":"SyntaxError: Unexpected token 'm'

This is quite a generic error I hit when trying to start my MCP server with Claude. The only hint I got was that it had something to do with JSON. After some debugging and looking at the logs from Claude, I was able to figure out that the class I was passing to my target API endpoint needed pydantic BaseModel support. Updating the class as follows resolved the issue for me. Note that with pydantic, fields are declared as class attributes rather than assigned in a custom __init__:

    from pydantic import BaseModel

    class PizzaOrder(BaseModel):
        pizza_type: str
        crust_size: str

        def __str__(self):
            return f"{self.crust_size} crust {self.pizza_type} pizza"
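As a quick sanity check that the model now produces valid JSON (a minimal sketch assuming pydantic v2's model_dump_json; the field values here are just examples):

```python
import json
from pydantic import BaseModel

class PizzaOrder(BaseModel):
    # Fields declared as class attributes, the idiomatic pydantic style
    pizza_type: str
    crust_size: str

    def __str__(self):
        return f"{self.crust_size} crust {self.pizza_type} pizza"

order = PizzaOrder(pizza_type="margherita", crust_size="thin")
payload = order.model_dump_json()  # pydantic v2 serialization API
print(payload)    # valid JSON: {"pizza_type":"margherita","crust_size":"thin"}
print(str(order))  # thin crust margherita pizza
```

If the payload round-trips through json.loads without a SyntaxError on the client side, the original "is not valid JSON" symptom should be gone.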

Azure AI foundry model setup and playground

To use AI models like Mistral, ChatGPT, and others in Azure, you need to create and deploy the model first. This assumes you have an existing project in your Azure AI Foundry. By default the catalog is all GPT. Here we will use a Mistral-3B model (which costs less) to set up our playground for fun, since I don't want to use ChatGPT for my model. Go to the Model catalog and select Mistral-3B. Click "Go to Model page" and accept the terms and conditions. Deploy this model; we only have Global Standard here. Once you have deployed it, go to My models, select your deployment, then select "Open in Playground". Then you can interact with it.

meta-llama/Llama-3.2-3B-Instruct supports 8 bit quantization

To run this on Google Colab, ensure you have the right runtime setup and have run the following command:

    !pip install -U bitsandbytes

This updates the existing library to a CUDA-enabled version. Then, to run inference on the model, you can use the following code:

    from transformers import BitsAndBytesConfig
    from transformers import AutoModelForCausalLM, AutoTokenizer

    quant_config = BitsAndBytesConfig(load_in_8bit=True)

    model_name = "meta-llama/Llama-3.2-3B-Instruct"
    model = AutoModelForCausalLM.from_pretrained(
        model_name,
        torch_dtype="auto",
        quantization_config=quant_config
    )
    tokenizer = AutoTokenizer.from_pretrained(model_name)

    text = "hello world"
    model_inputs = tokenizer([text], return_tensors="pt").to("cuda")
    generated_ids = model.generate(**model_inputs, max_length=30)
    tokenizer.batch_decode(generated_ids)[0]

The takeaway is that meta-llama/Llama-3.2-3B-Instruct supports 8-bit ...
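The motivation for 8-bit loading is memory. A rough back-of-the-envelope sketch of the weight footprint (the ~3.2B parameter count is an approximation, and this ignores activations and overhead):

```python
# Rough weight-memory estimate for a ~3.2B-parameter model
params = 3.2e9

fp16_gb = params * 2 / 1e9   # 2 bytes per weight in fp16
int8_gb = params * 1 / 1e9   # 1 byte per weight with 8-bit quantization

print(f"fp16: ~{fp16_gb:.1f} GB, int8: ~{int8_gb:.1f} GB")  # fp16: ~6.4 GB, int8: ~3.2 GB
```

Halving the weight footprint is what makes a 3B model comfortable on a free Colab T4.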

python asyncio hello world

To start using Python asyncio, it is important to get the main entry point right, using asyncio.run(). The example below comes from an MCP client, so MCPClient is defined elsewhere:

    import sys
    import asyncio

    async def main():
        if len(sys.argv) < 2:
            print("Usage: python client.py <path_to_server_script>")
            sys.exit(1)

        client = MCPClient()
        try:
            await client.connect_to_server(sys.argv[1])
            await client.chat_loop()
        finally:
            await client.cleanup()

    if __name__ == "__main__":
        asyncio.run(main())
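Stripped of the MCP specifics, a truly minimal asyncio hello world only needs the async def / await / asyncio.run() trio (a sketch; the coroutine names are made up for illustration):

```python
import asyncio

async def say(msg: str, delay: float) -> str:
    # Suspend this coroutine without blocking the event loop
    await asyncio.sleep(delay)
    return msg

async def main() -> list[str]:
    # Run two coroutines concurrently; results come back in argument order
    return await asyncio.gather(say("hello", 0.01), say("world", 0.02))

if __name__ == "__main__":
    print(asyncio.run(main()))  # ['hello', 'world']
```

asyncio.run() creates the event loop, runs main() to completion, and tears the loop down, which is why it should wrap your single top-level entry point.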

mcp protocol weather server

We are going to build a weather server MCP app and then use Claude Desktop for the integration. We will ask what the weather is; Claude Desktop will then ask whether we allow our server to be used to query weather data, and report back to us. To get started, ensure you create your Python environment using uv. Make sure you use the uv command for this:

    # Create a new directory for our project
    uv init weather
    cd weather

    # Create virtual environment and activate it
    uv venv
    .venv\Scripts\activate

    # Install dependencies
    uv add "mcp[cli]" httpx

Then add a file called weather.py:

    from typing import Any
    import httpx
    from mcp.server.fastmcp import FastMCP

    # Initialize FastMCP server
    mcp = FastMCP("weather")

    # Constants
    NWS_API_BASE = "https://api.weather.gov"
    USER_AGENT = "weather-app/1.0"

    async def make_nws_request(url: str) -> dict[str, Any] | None:
        """Make a request to the NWS API with proper...
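To give a feel for the tool side of the server, here is a hedged sketch of a small pure formatting helper in the spirit of the quickstart; the format_alert name and the feature-dict shape are my assumptions, not verified against the exact tutorial code:

```python
from typing import Any

def format_alert(feature: dict[str, Any]) -> str:
    """Format a weather-alert feature dict into a readable string.

    The 'properties' layout here is a hypothetical sketch of an NWS API
    response, not verified field-by-field.
    """
    props = feature.get("properties", {})
    return (
        f"Event: {props.get('event', 'Unknown')}\n"
        f"Area: {props.get('areaDesc', 'Unknown')}\n"
        f"Severity: {props.get('severity', 'Unknown')}"
    )

sample = {"properties": {"event": "Flood Warning",
                         "areaDesc": "King County",
                         "severity": "Severe"}}
print(format_alert(sample))
```

Keeping helpers like this pure (no I/O) makes the MCP tool functions thin wrappers that are easy to test without hitting the network.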

Model context protocol SDK - how you can integrate with your LLM and give it more context

The SDK can be found here: https://modelcontextprotocol.io/introduction

hotchoc graphql entity framework database integration sample

I have a simple database integration example in GraphQL that uses EF with an in-memory database to demonstrate mutation and query. You can find the example here: https://github.com/mitzenjeremywoo/hot-choc-15-data-service

Llama factory for faster better fine tuning of LLMs

Started looking into this project to see how much I can gain from LLM fine-tuning: https://github.com/hiyouga/LLaMA-Factory

hotchoc - upgrading from v14 to v15 issues with AddTypes() for visual studio

Upgrading from v14 to v15 causes Visual Studio to report an error, but when I run from the command line or use Visual Studio Code, it is fine. Cleaning and rebuilding didn't help, and neither did restarting Visual Studio. You can have a look here: https://github.com/mitzenjeremywoo/hot-choc-15-data-service

dotnet configuration : 'IConfigurationSection' does not contain a definition for 'Get'

To resolve this, we just need to add the relevant package:

    dotnet add package Microsoft.Extensions.Configuration.Binder

python function with variable keyword argument

In Python, functions can accept variable keyword arguments. This makes it easy to pass values into a function without resorting to too many parameters. The examples below show how to pass these in and how to work with the values. Note that unpacking kwargs directly iterates its keys, while kwargs.values() iterates its values:

    # get keys
    def test(**kwargs):
        # unpacking a dict iterates its keys
        a, b = kwargs
        print(a)
        print(b)
        print(kwargs)

    # get values
    def test(**kwargs):
        # unpacking the values
        a, b = kwargs.values()
        print(a)
        print(b)
        print(kwargs)

    a = {"test": "test1", "name": "test2"}
    test(**a)
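A common follow-on pattern is forwarding keyword arguments through to another function unchanged (a minimal sketch; the function names and parameters here are made up for illustration):

```python
def base_request(url: str, timeout: float = 5.0, retries: int = 0) -> dict:
    # Hypothetical worker that just echoes its configuration
    return {"url": url, "timeout": timeout, "retries": retries}

def request_with_defaults(url: str, **kwargs) -> dict:
    # Apply a default, then forward any extra keyword arguments straight through
    kwargs.setdefault("timeout", 10.0)
    return base_request(url, **kwargs)

result = request_with_defaults("https://example.com", retries=3)
print(result)  # {'url': 'https://example.com', 'timeout': 10.0, 'retries': 3}
```

This is why **kwargs shows up so often in wrapper and decorator code: the wrapper does not need to know the wrapped function's full signature.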

python list comprehension

Python dict comprehensions have the following syntax (a list comprehension is the same shape with square brackets and no key):

    {key_expr: value_expr for item in iterable if condition}

So let's look at an example that allows us to get only properties:

    import inspect

    class Employee:
        name: str
        id: int

        def __init__(self, ename: str, eid: int):
            self.name = ename
            self.id = eid

        def hello(self):
            print("employee")

        @property
        def email(self):
            return "kepung@gmail.com"

    e = Employee("Alice", 2)

    properties = [name for name, value in inspect.getmembers(type(e))
                  if isinstance(value, property)]
    print("Properties:", properties)

You can see here that we're trying to get the name only. Running inspect.getmembers(type(e)) returns (name, value) pairs. Then we have an if clause that checks isinstance(value, property).
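For the list-comprehension form the title mentions, here is a minimal example with a filter condition:

```python
# List comprehension: [expr for item in iterable if condition]
squares = [n * n for n in range(10) if n % 2 == 0]
print(squares)  # [0, 4, 16, 36, 64]

# The dict-comprehension equivalent maps each even n to its square
square_map = {n: n * n for n in range(10) if n % 2 == 0}
print(square_map)  # {0: 0, 2: 4, 4: 16, 6: 36, 8: 64}
```

The same expr/for/if shape also gives set comprehensions with curly braces and no key.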

python reflection during runtime

Using Python's reflection features to discover the properties and methods of a class:

    class Employee:
        name: str
        id: int

        def __init__(self, ename: str, eid: int):
            self.name = ename
            self.id = eid

        def hello(self):
            print("employee")

        @property
        def email(self):
            return "kepung@gmail.com"

    e = Employee("Alice", 2)

List all methods and properties:

    print(dir(e))

Outputs:

    ['__annotations__', '__class__', '__delattr__', '__dict__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__getstate__', '__gt__', '__hash__', '__init__', '__init_subclass__', '__le__', '__lt__', '__module__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__...
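The dir() output is dominated by dunder names; filtering those out leaves the interesting members. A small sketch using the same Employee shape as above:

```python
class Employee:
    name: str
    id: int

    def __init__(self, ename: str, eid: int):
        self.name = ename
        self.id = eid

    def hello(self):
        print("employee")

    @property
    def email(self):
        return "kepung@gmail.com"

e = Employee("Alice", 2)

# Keep only names that are not dunders; dir() returns them sorted
public = [n for n in dir(e) if not n.startswith("__")]
print(public)  # ['email', 'hello', 'id', 'name']
```

Note that name and id appear because __init__ set them on the instance; bare class annotations alone do not create attributes.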

python attribute decorator

Creating a decorator in Python is really easy. If you apply hello_world_decorator to any method, it will execute the wrapper first:

    from pydantic import BaseModel

    def hello_world_decorator(func):
        def wrapper(*args, **kwargs):
            print("Hello, world")
            return func(*args, **kwargs)
        return wrapper

    class Employee(BaseModel):
        name: str
        id: int

        @hello_world_decorator
        def hello(self):
            print("employee")

So when hello() is called, you will get the following output:

    Hello, world
    employee
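One refinement worth knowing: without functools.wraps, the decorated method loses its original __name__ and __doc__. A minimal sketch (using a plain class here to keep it dependency-free):

```python
import functools

def hello_world_decorator(func):
    @functools.wraps(func)  # preserve func's __name__ and __doc__ on the wrapper
    def wrapper(*args, **kwargs):
        print("Hello, world")
        return func(*args, **kwargs)
    return wrapper

class Employee:
    @hello_world_decorator
    def hello(self):
        """Greet as an employee."""
        print("employee")

e = Employee()
e.hello()                       # prints: Hello, world / employee
print(Employee.hello.__name__)  # hello (thanks to functools.wraps)
```

Without the functools.wraps line, Employee.hello.__name__ would report "wrapper", which confuses debuggers and documentation tools.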

C# extension method just got easier

We used to write extension methods like this:

    public static class MyExtensions
    {
        public static int WordCount(this string str) =>
            str.Split([' ', '.', '?'], StringSplitOptions.RemoveEmptyEntries).Length;
    }

The new way of writing extension code is this:

    public static class MyExtensions
    {
        extension(string str)
        {
            public int WordCount() =>
                str.Split([' ', '.', '?'], StringSplitOptions.RemoveEmptyEntries).Length;
        }
    }

Magic!

graphql - creating multiple concurrent query

We can create multiple concurrent queries in GraphQL by using aliases in a single request:

    {
      a: blog {
        id
        userInput
      }
      b: blog {
        id
        userInput
      }
    }

The response then contains both aliased results, a and b.

hotchoc - supporting different schema/types for code first and implementation approach

This shows an example of using the implementation-first approach for hotchoc GraphQL. It shows how we can expose different types/schemas using two different approaches, namely:

- implementation first
- code first

In the implementation-first approach, it is straightforward. We set up the root query (that's a prerequisite: you can't have more than one root query for a single or federated GraphQL schema; typically you use extend for this). Then we add the relevant types. "Type" here can be confusing; to me, it means classes containing the methods that retrieve types such as book or author, and hotchoc can automatically infer these types.

    using apollo_gateway.Types;
    using apollo_gateway.Types.CodeFirst;

    var builder = WebApplication.CreateBuilder(args);

    // implementation approach
    builder.Services.AddGraphQLServer().AddApolloFederation().AddQueryType(q => q.Name("Query"))
        .AddType<BookDataReturnDataHelper>().AddType<ProductDataRet...

hotchoc's AddApolloFederation schema vs non-apollo subgraph

You might be wondering what the difference is when you use 'AddApolloFederation', besides resolving a bunch of ambiguous static extensions in your code. One of the biggest differences I noticed is the generated schema, as you can see below, although this is not an apples-to-apples comparison. There's an extra schema definition here:

    schema
      @link(
        url: "https://specs.apollo.dev/federation/v2.6"
        import: ["@tag", "FieldSet"]
      ) {
      query: Query
    }

As you can see, we have "query: Query". There can only be one of these; if you have more than one, we will get a schema definition error. In GraphQL federation, you will see a similar schema definition. Say you have a product and a user subgraph; it would look something like below. Notice that we extend the Query type using extend type Query, so it is really important to keep this construct or setup.

    // product subgraph
    extend type Query { ...

gke - serving model with monitoring

This is a great link for serving an LLM model with monitoring: https://cloud.google.com/kubernetes-engine/docs/tutorials/serve-multihost-gpu

Configuring autoscaling for LLMs: https://cloud.google.com/kubernetes-engine/docs/how-to/machine-learning/inference/autoscaling