Showing posts from 2017

PySpark - Working with JDBC Sqlite database

Spark supports connectivity to a JDBC database. In this session, we are going to see how to connect to a SQLite database. As of this writing, I am using Spark 2.1.1.

First, have your spark-defaults.conf file set up in your conf folder.

Next, fire up your pyspark, then run the following script in your REPL.
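The script is not shown in this excerpt, so here is a minimal sketch of what such a read could look like. It assumes the SQLite JDBC driver jar is already on Spark's classpath (for example via spark-defaults.conf) and that the REPL provides a `spark` session. The database path `/tmp/test.db` and the table name `people` are hypothetical placeholders.

```python
def sqlite_jdbc_options(db_path, table):
    """Build the options for Spark's generic JDBC reader (SQLite flavour)."""
    return {
        "url": "jdbc:sqlite:" + db_path,   # JDBC URL understood by the SQLite driver
        "dbtable": table,                  # table (or subquery) to load
        "driver": "org.sqlite.JDBC",       # driver class shipped in the sqlite-jdbc jar
    }


def read_table(spark, db_path, table):
    """Load one table through JDBC and return its rows as a list of Row objects."""
    df = spark.read.format("jdbc").options(**sqlite_jdbc_options(db_path, table)).load()
    return df.collect()


# In the pyspark REPL (hypothetical path and table name):
# a = read_table(spark, "/tmp/test.db", "people")
```

Keeping the options in a small helper makes it easy to check the URL and driver name without starting Spark at all.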

The returned data is a list of rows, so you probably want to access the fields using :-

>>> a[0].ID
>>> a[0].Name
>>> a[0].Name

Looping through it :-

 for i in a: print(i.Name)

>> mark
>> eric

That's it.

django rest framework - getting started

To get started with Django REST framework, you have to start reading the tutorial from the back, because for some reason these guys like to give out the best implementation at the end.

Anyway, go ahead and create your virtualenv, then install the packages:

pip install django
pip install djangorestframework

# Set up a new project with a single application
django-admin startproject tutorial .  # Note the trailing '.' character
cd tutorial
django-admin startapp quickstart
cd ..

Let's start off with the Request/Response objects, as these remain critical for our development efforts and work with

In the code below, the 2nd line creates a root API for our application, and the "name" properties you see in the 3rd and 4th lines are references we use for reverse lookup, which you will see later.

urlpatterns = [
    url(r'^$', views.api_root),
    url(r'^deploy/$', views.SimpleDeployer.as_view(), name='deploy-list'),
    url(r'^deploy/simple/$', views.SimpleDeployer.as_view(), name='…

Quick start using PySpark

Now that we have configured and started our pyspark, let's go over some common functions that we will be using :-

Let's take a look at our data file.

Assume we started our shell with the following commands :-

from pyspark import SparkContext
sc = SparkContext.getOrCreate()
tf = sc.textFile("j:\\tmp\\data.txt")

filter - keeps the results that match a boolean condition. Example

>>> tf.filter(lambda a: "test" in a).count()

Counts the lines that contain "test".

collect - really useful; it returns a list of all the elements in an RDD

collect is pretty handy especially when you want to see results

>>> tf.filter(lambda a  : "test" in a).collect()
['test11111', 'testing ', 'reest of the world; test11111']

map - returns a new RDD by applying a function to each element. Here I am applying upper case to my lines and returning the results using collect()

>>> …
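To see what filter, map and collect compute without a running Spark shell, here is a small sketch that applies the same functions to a plain Python list. The `lines` list is hypothetical sample data standing in for data.txt, and the helper names are made up for illustration. The same two functions could be passed to `tf.filter(...)` and `.map(...)` in the REPL.

```python
def contains_test(line):
    """Predicate for filter: keep only lines that contain "test"."""
    return "test" in line


def to_upper(line):
    """Transformation for map: upper-case one line."""
    return line.upper()


# Hypothetical stand-in for the lines of data.txt.
lines = ["test11111", "hello world", "testing ", "reest of the world; test11111"]

matched = list(filter(contains_test, lines))  # like tf.filter(contains_test).collect()
count = len(matched)                          # like tf.filter(contains_test).count()
shouted = list(map(to_upper, matched))        # like tf.filter(contains_test).map(to_upper).collect()

print(count)
print(shouted)
```

The difference in Spark is that filter and map are lazy and only run when an action such as count() or collect() is called.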

Pyspark with spark 2.1.0 - Python cannot be version 3.6.

To get started using Spark with Python, you can

1. Install Anaconda Python - which has all the goodies you need.

2. Download Spark and unzip it into a folder.

3. After you have all these set up, issue the following commands (Spark 2.1.0 does not run on Python 3.6, so create a Python 3.5 environment)

conda create -n py35 python=3.5 anaconda

activate py35

4. Go to your Spark installation folder, go to "bin" and run "pyspark".

5. You are probably going to get some exceptions, but you should still be able to run the following script :

from pyspark import SparkContext
sc = SparkContext.getOrCreate()
tf = sc.textFile("j:\\tmp\\data.txt")
tf.count()

Please make sure the path to your "data.txt" is correct.

This setup looks easier than it is. I spent a lot of time today trying to get it up and running.

Search relevance

Search relevance


Demand prediction

Purchase Prediction

- Current competition is

Product recommendation…

Creating your first pypi

To create your first pip-installable package, create a folder and add the following files in there (alternatively, you can download them from here)


index-servers =



Next, create a file that contains some basic information about your package

from distutils.core import setup
setup(
  name = 'kepungmath',
  packages = ['kepungmath'], # this must be the same as the name above
  version = '0.1',
  description = 'A random test lib',
  author = 'Jeremy Woo',
  author_email = '',
  url = '', # use the URL to the github repo
  download_url = '', # I'll explain this in a second
  keywords = ['testing', 'logging', 'example'], # arbitrary…

Getting started with Jupyter - internals

Jupyter notebook is a web-based application that uses web sockets to talk to IPython and execute Python (or other languages like Julia and Haskell). It is really popular when it comes to running machine learning kernels.

Say, for example, you are trying to use numpy to read certain csv files and plot them with matplotlib: you can easily do so using IPython.

Great! So how does it work?

First you need to set up jupyter notebook. I assume that you have installed Anaconda on your system.

First, let's set up our virtual environment

virtualenv notebook

Now activate our newly created environment.

activate notebook
pip install --upgrade setuptools pip
git clone
cd notebook
pip install -e .

From the diagram below, we can see that notebook depends on jupyter_client (purple) to talk to ipython (yellow)

At the end of the cell execute request, the results will be displayed / rendered in the notebook. Communication is achieved via web sockets, and all that code is in javascript and not python. Will consum…

Ruby package ranking by category

If you want to see which Ruby gems the community is using, be sure to check out this link. It has gem downloads by category :)

Installing 2.3.3 Ruby + Rails without tears

To install Ruby 2.3.3, you have to install DevKit.

For 32-bit system follow instructions below :-
a) install Ruby 2.3.3 

b) download and unzip DevKit

c) Fire up your command prompt and make sure Ruby is in your PATH. Go to the folder where you unzipped DevKit and execute the following commands : - ruby dk.rb init and ruby dk.rb install

d) From the command line, type gem install rails

You're done! :)

For 64-bit, please follow the instructions below:
a) Download Ruby from here.

b) Download and unzip DevKit.

c) Fire up your command prompt and make sure Ruby is in your PATH. Go to the folder where you unzipped DevKit and execute the following commands : - ruby dk.rb init and ruby dk.rb install

d) From the command line, type gem install rails

Some typescript key takeaways

1. A module can only have a single default export. Yes only one. Please don't try to use multiple. There is no workaround.

2. You can export a class / interface like this :-

export = ZipCodeValidator;

3. And import it using 

import zip = require("./ZipCodeValidator");

4. Optional loading

Yes, you can define optional loading, but TypeScript does this for you automatically. In nodejs, require.js or system.js, you might have to do it manually.

5. Working with other javascript libraries

This gotta sounds catchy...

6. Generator functions

You can have multiple yield statements like so.

function* generateIt() {
  yield 1;
  yield 2;
  yield 3;
}

var myvar = generateIt();

To get a value out of your generator function, call next() on it.

let myfunc = generateIt();
myfunc.next().value;   // returns 1

docker remove all process

Best command of the day

docker stop $(docker ps -a -q)
docker rm $(docker ps -a -q)

Using R studio with TensorFlow

Check out this link if you're interested to learn more about tensorflow with R studio.

To install it, just type the following command


It also comes with a cool tutorial on MNIST (a tutorial for digit recognition)

Writing your WAMP Server and client in .net using WampSharp

A WAMP server basically allows you to use RPC and web sockets to talk to your client. Yep, you can use standard HTTP like GET/POST as long as you define those on the server.

All right, just to keep things sweet and simple, let's see what simple server code looks like :-

Client code looks like this :-

Run each in a separate C# console app, and you will get something like below :-

service designer guide for http status - check this out !!! :)


Angular2 - No http post or get until you call subscribe.

Sometimes rxjs can be so lazy. Nothing happens until you call 'subscribe'.

When you are making an http get or post request, you might find that there's no outgoing http request. Does the code below look familiar? Well..... the request only goes out once you call subscribe. :/ This is a gotcha for many people.

Javascript / Typescript rest and spread operators

To make it easier, the rest and spread operators basically help to gather all your function parameters together. In the past, we used something like

myfunction.apply(null, myArgumentList)

With rest and spread we can easily do

function log(...a) {
  console.log(a);
}

log(1,2,3,4,5); // outputs [1, 2, 3, 4, 5] array

Notice we're assigning myArguments at the back. 

let myArguments = ['x', 'y', 3]

log(1,2, myArguments); // outputs [1, 2, ['x', 'y', 3]] 2 different array 

log(1,2, ...myArguments); // outputs [1, 2, 'x', 'y', 3] 1 array

Notice we're assigning myArguments at the front. 

log(...myArguments,1,2 ); // outputs [ 'x', 'y', 3, 1, 2] essentially a single array.

Angular2 providers with useClass purpose

You might be wondering what the purpose of the useClass construct is, as shown here

[{ provide: Logger, useClass: Logger }]

This provides a way for Angular2 to find and use the proper class / provider. Think of it as a key / value matching approach. Some people use a mock logger service by specifying useClass: BetterLogger.

Difference between NativeElement and DebugElement in Angular2 test

What is the difference between nativeElement and debugElement in an Angular2 test?

Short answer :-

DebugElement contains methods / functions to query or test an Angular2 component's html elements

NativeElement is the html itself.

Have a look at the diagram below. The first is debug element while the last item is native element itself.


Wanting to find out more about ESNext ....check this link out

typescript async and await

Typescript has it now... async and await.

SqlDependency not firing, you're not the only ones. :)

8 years ago, I tried using SqlDependency and it failed badly. Today I had another go at it for some weird reason.

Anyway, if you're having problems with SqlDependency, you're not alone. Try to use this solution instead. The only thing I don't like about it is that the change response is given in XML.

Try SqlDependencyEx instead.

Typescript allow initializer to create and instantiate an instance

The Partial<T> type transforms all the fields of a type into optional ones.

So we're able to initialize our object instance this way.

Pretty cool eh..

if your ngFor or ngIf not working - You might not have Common Module imported

import { CommonModule } from '@angular/common';

@NgModule({
  imports: [
    RouterModule.forRoot(appRoutes), HttpModule, ReactiveFormsModule, CommonModule
  ],
  declarations: [AddPersonComponent, SearchComponent, ListComponent],
  providers: [PersonService],
  exports: [
    RouterModule
  ]
})

angular2 call service via ngOnInit

If you need to call a service, please do it in ngOnInit (from the OnInit interface) instead of in the constructor. Otherwise, if you are doing unit testing, your code might not work.
As shown in the code below :

class SomeService implements OnInit {
  constructor(private http: Http) { }

  ngOnInit() {
    // this should hit mocked backend
    this.http.get("dummmy url").subscribe(v => {
      console.log('constructor subscribe hit');
    });
  }

  someMethod() {
    // this should hit mocked backend
    this.http.get("anotherurl").subscribe(v => {
      console.log('some method called subscribe hit');
    });
  }
}

angularjs2 common stuff that i always forgets


   providers : []  // providers
The providers array provides a dependency injection layer between the NgModule and its child components. A child component will typically traverse up to its parent to look for a provider - a singleton service or, in plain language, a class with a task, like a news feed service or http service.

If you specify providers in a child component, the service will be instantiated as a separate instance.


BrowserModule: this module provides services for running and opening the browser. What type of services?
Always import BrowserModule in the root AppModule.ts.

@Input and @Output

Passing data using @Input  (Property passing with [] )  and @Output (event passing with ()  )

Custom Directive 

You essentially can still create custom directive in Angular2.


@Pipe - custom, stateless and stateful pipe

c# recursive Fibonacci implementation. Not a very efficient solution.
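The C# listing is not shown in this excerpt. As a stand-in, here is the same naive recursion sketched in Python (not the original C# code). The function name `fib` and the convention fib(0) = 0, fib(1) = 1 are assumptions.

```python
def fib(n):
    """Naive recursive Fibonacci with fib(0) = 0 and fib(1) = 1."""
    if n < 0:
        raise ValueError("n must be non-negative")
    if n < 2:
        return n
    # Each call spawns two more calls, so the same values are recomputed many times.
    return fib(n - 1) + fib(n - 2)


print([fib(i) for i in range(10)])
```

It is not very efficient because the number of calls grows exponentially with n. Caching results or using a simple loop avoids the repeated work.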

javascript function * declaration

Did you know that if you define your javascript function like function *, you're basically creating a generator function, which returns an iterator?

var myIterable = {};
myIterable[Symbol.iterator] = function* () {
  // i want a iterator here! :)
  yield 1;
  yield 2;
  yield 3;
};
[...myIterable] // [1, 2, 3]

Awesome site for REST Api designs .... core security

Enable SSL in ASP.Net Core 

To enable SSL, just add the [RequireHttps] attribute on top of your controller, or use the following code to secure your entire site with SSL.

    if (!_env.IsDevelopment())
        services.AddMvc(options =>
            // assumed completion of the truncated listing: apply RequireHttps as a global filter
            options.Filters.Add(new RequireHttpsAttribute()));

The reason we are detecting whether the environment is development is that IIS Express uses a non-standard HTTPS port for development purposes.


According to OWASP, Unvalidated Redirects and Forwards are among the most common attacks in real life.

With just a single line of code we're able to stop redirection (it must always appear after UseStaticFiles). When we're creating an API, we don't redirect that often. By having UseRedirectValidation, we're able to monitor HTTP redirection and throw an error if it arises.

First off, you need the following nuget packages :-

Install-Package NWebsec.AspNetCo…

dotnet core 1.0.1 upgrading from old xproj files

Did you know that to upgrade xproj to csproj in dotnet core, all you need to do is run

dotnet migrate

However, you need to use vs2017 to work with this new project.

I tried removing and re-adding the new project using vs2015, but it throws MSBuild errors.

vs2017 forcing migration -

If you ever want to force vs2017 to migrate an existing .net core project, all you have to do is right click on the project and choose "reload project". It will automatically migrate all your .xproj files to .csproj. Please install your SDK first, though.

So the bottom line is,

1. no more xproj,  there is only csproj.

2. global.json becomes { "projects" : ["src", "test"] }

Using LiteIde for Golang

I think LiteIde is definitely the editor to use for GoLang. I tried installing different plugins for vscode, but unfortunately they were not so useful.

Go download and try LiteIde.

working with messages in mule

When using mule, it is very common to set and access variables, session variables and properties along our flow. What are the differences between :-

a) variable - data that exists and lasts from the start to the end of a flow unless overwritten. Accessed using #[flowVars]

c) session - a lasting location for storing values. Accessed using #[sessionVars]

d) property - message header information

e) message payload - the mule message that is sent to the user and moves from flow to flow. It could be made up of message inflow (accessed using #[message.inboundProperties]) and message outflow (accessed using #[message.outboundProperties]).

f) message events

The best way to test this out is to create a flow that makes use of these basic mule construct.

Let's start off with variable. From the flow below we grab input from http and save it to a variable. Then we create a choice flow to see if the variable is "reece". If yes, branch up and set result to 111, else 222.

If we have a look at the "set variable"…

How CORS works in plain english?

CORS is used to control access to a remote resource. For example, if we host a web page on one site, we can configure the remote resource and tell it to entertain requests coming from that site.

If you make a request to the remote resource from a different site, you will not be able to do so, because we never configured that to happen.

So we have: our site --> making a GET request to --> the remote resource.

If our site is allowed to make requests to the remote resource, we will get a response that looks like this.

 Request from

=> OPTIONS
- HEADERS -
Origin:
Access-Control-Request-Method: GET

Response from

<= HTTP/1.1 204 No Content
- RESPONSE HEADERS -
Access-Control-Allow-Methods: GET, POST, OPTIONS
Access-Control-Max-Age: 86400
Access-Control-Allow-Headers: Api-Key
Access-Control-Allow-Origin:
Content-Length: 0

From here, we can see that "Access-Control-Allow…
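The decision the remote resource makes for a preflight request can be sketched in a few lines of Python. This is only an illustration of the idea, not how any particular server implements it. The origin `http://app.example`, the function name and the 403 status for rejected requests are all assumptions. The allowed methods, headers and max age mirror the response shown above.

```python
# Hypothetical configuration of the remote resource.
ALLOWED_ORIGINS = {"http://app.example"}       # assumed origin, not from the post
ALLOWED_METHODS = ["GET", "POST", "OPTIONS"]
ALLOWED_HEADERS = ["Api-Key"]
MAX_AGE_SECONDS = 86400


def preflight_response(origin, requested_method):
    """Return (status, headers) for an OPTIONS preflight request."""
    if origin not in ALLOWED_ORIGINS or requested_method not in ALLOWED_METHODS:
        # No CORS headers in the reply, so the browser blocks the real request.
        return 403, {}
    return 204, {
        "Access-Control-Allow-Origin": origin,
        "Access-Control-Allow-Methods": ", ".join(ALLOWED_METHODS),
        "Access-Control-Allow-Headers": ", ".join(ALLOWED_HEADERS),
        "Access-Control-Max-Age": str(MAX_AGE_SECONDS),
        "Content-Length": "0",
    }


status, headers = preflight_response("http://app.example", "GET")
print(status, headers["Access-Control-Allow-Origin"])
```

The browser compares the Access-Control-Allow-Origin value with the page's own origin and only sends the real GET when they match.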

Mule API Gateway vs Mule Runtime

Pretty confusing to me at first, as I would have thought Mule Runtime hosts Mule API Gateway. That's definitely not the case: API Gateway connects with API Manager to enforce and apply policies / settings like throttling, security and CORS on your back end services. These enforcements are based on specific apps.

Mule Runtime is where all your applications get hosted and run. It takes incoming requests, runs the specific flows and returns results.

Tutorial : Mule Creating a simplest flow using Http and Groovy component

In this example, we're going to create the simplest Mule app flow. Our flow basically consist of a Http and a Groovy component which looks like this :-

Using Groovy is optional and entirely up to developers to choose. If you're using it for simple and not so complicated task, then it is fine. Otherwise, Java gives you ability to debug through the code.
Groovy does give you direct access to message which you otherwise need to call getMessage() - [if you are to use java component ].

And here is what our script looks like :-

The user submits a request like this, http://localhost:3004/?username='jeremy'.

If it is jeremy, great, if not it returns invalid user.

For beginners, it's pretty hard to know what message property and types that are available.
Perhaps picture below would give a better way for newbies to work with mule in the future.

Tutorial : Mule Creating a simplest flow using Http and Java component

In this example, we're going to create the simplest Mule app flow. Our flow basically consist of a Http and a Java component which looks like this :-

User basically connects to something like this :- http://localhost:3003/?username=jeremy.

The query parameter called username gets passed into the java class, which returns "jeremy" or "unknow user" depending on the string passed.

Here is what our java class looks like :-

From the code above, our java class implements the mule Callable interface, and we attempt to extract the username parameter from 'http.query.params', which is a Map object. With this, we proceed to get our value by calling the Map's get method.

Well, that's it. Done! :)

Mule - unit testing with munit getting dependencies right

Sometimes trying to get the right dependencies for a Mule Munit test can be a challenge. Not something you wanna wrestle with.

So here is a quick list of dependencies that you need.:-

Complete pom.xml configuration are given below :-

Could not find a declaration file for module 'react-redux'.

If you encounter this error, it is most likely that you don't have typings installed. All you have to do is run the following command and restart Vs Code

npm install --save @types/react-redux

force maven to re-download repository jar / dependencies

This happens a lot in maven development. Somehow your local repository has lost your .jar files and all you get are .lastUpdated files. To force it to re-download all these dependencies, you need to run the following command :-

mvn dependency:purge-local-repository

using mule cxf component

This is by far the coolest way to create a web service, using mule CXF. CXF is a component that allows you to easily create and expose a web service using Mule runtime.

Just create a normal java interface and implement it. Cook up some Mule connector and you're good to publish your web service.

Let's create your interface.

Let's create your class. Notice we're not annotating it with @WebService.

Finish it up with the following Mule script :-

After this, we just have to consume this service and get the returned results.

setting response header connection to close

After some really time-consuming effort trying to set the Connection header to "close" with WebAPI, I finally resorted to good old all-powerful nodejs.

Use the code below and fire away. It is able to accept incoming requests from any port. Done! :)
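The nodejs listing is not included in this excerpt. As a stand-in, here is the same idea sketched with Python's built-in http.server instead of nodejs: a tiny server that answers every GET with "Connection: close". The class and function names are made up for illustration, and binding to 127.0.0.1 is an assumption.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer


class CloseHandler(BaseHTTPRequestHandler):
    """Answers every GET with a small body and asks the client to close the connection."""

    def do_GET(self):
        body = b"ok"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.send_header("Connection", "close")  # the header this post is about
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        # Keep the console quiet; remove this override to see request logs.
        pass


def make_server(port=0):
    """Create the server; port 0 lets the OS pick a free port."""
    return HTTPServer(("127.0.0.1", port), CloseHandler)


# To run it by hand:
# make_server(8080).serve_forever()
```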

using pandas to find values matching certain criteria.

Say you have the following dataframe read into pandas.

Id | F1    | F2      | Class
0  | True | True   | 0
0  | True | True   | 1
0  | True | True   | 0
0  | True | True   | 1

To select all the class with value 0 in a data frame, you can use

data.loc[data['Class'] == 0]

The Class column name is case sensitive. Using data['class'] will not work; it raises a KeyError.
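Here is a small self-contained sketch of that selection. It builds the table above by hand instead of reading it from a file; the variable names are just for illustration.

```python
import pandas as pd

# Rebuild the sample table shown above.
data = pd.DataFrame({
    "Id": [0, 0, 0, 0],
    "F1": [True, True, True, True],
    "F2": [True, True, True, True],
    "Class": [0, 1, 0, 1],
})

# Boolean mask: True for the rows where Class equals 0.
class_zero = data.loc[data["Class"] == 0]
print(class_zero)

# Column names are case sensitive, so the lower-case name is not found.
try:
    data["class"]
except KeyError:
    print("no column named 'class'")
```

The same mask style combines with & and | when more than one condition is needed, as long as each condition is wrapped in parentheses.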

Easy to understand Confusing matrix link

Best link i have found to understand what is confusion matrix. This confusion matrix becomes not so confusing. :)

Mule Http Client Request logging

Best piece of logging i found - literally no code modification. Just add the following code in your log4j configuration file and all your logs will be revealed. :)

<AsyncLogger name="org.mule.module.http.internal.HttpMessageLogger" level="DEBUG" />
<AsyncLogger name="com.ning.http" level="DEBUG" />