Posts

Showing posts from June, 2017

AWSSDK supports .NET Core

I have just been messing around with AWS SNS using the AWS NuGet package, and surprisingly it works on .NET Core.

Yay!

Creating a lambda function which runs C# code

Go to your Amazon console and then choose Lambda.

Next, click on Blank Function.




Click Next when you are given the following screen:



Next, you need to configure the handler. The handler string always starts with the name of your assembly (DLL only, please; an exe is not supported). For more detail, have a look at the diagram below.
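For a C# function the handler string follows the shape ASSEMBLY::TYPE::METHOD. As a rough sketch (in Python, just to make the three parts visible), the helper below splits such a string. The assembly name TestApp and the method Main are taken from the error message in this post; the type name TestApp.Program and the helper names are hypothetical placeholders.

```python
from typing import NamedTuple


class DotnetHandler(NamedTuple):
    assembly: str   # DLL name, without the .dll extension
    type_name: str  # namespace-qualified class name
    method: str     # method that Lambda invokes


def parse_handler(handler: str) -> DotnetHandler:
    """Split an 'ASSEMBLY::TYPE::METHOD' handler string into its parts."""
    parts = handler.split("::")
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"expected ASSEMBLY::TYPE::METHOD, got {handler!r}")
    return DotnetHandler(*parts)


# 'TestApp.Program' is a made-up type name used only for illustration.
print(parse_handler("TestApp::TestApp.Program::Main"))
```

If the string you typed into the console does not split into exactly three non-empty parts, that is worth fixing before looking anywhere else.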

Set the runtime to C#.





If you encounter this error,

{ "errorType": "LambdaException", "errorMessage": "Could not find the LambdaSerializerAttribute on the assembly 'TestApp, Culture=neutral, PublicKeyToken=null' or method 'Main' while attempting to deserialize input data of type 'System.String[]'. To use types other than System.IO.Stream as input/output parameters, the assembly or Lambda function should be annotated with Amazon.Lambda.LambdaSerializerAttribute." }

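it probably means your AWS Lambda function takes an argument whose type Lambda cannot deserialize. The function should not take any arguments other than the types AWS supports (such as System.IO.Stream), unless the assembly or the function is annotated with Amazon.Lambda.LambdaSerializerAttribute, as the message says.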

IPython - getting your custom kernel up on Windows

Creating your custom kernel in IPython requires a pretty easy setup. All you have to do is:

1. Locate IPython's kernel configuration folder. On Windows it is located at C:\Users\Jeremy\.ipython\kernel.

2. Next, create a folder named after your kernel. In my case, I called it "wtf" - I think there is such a language.

3. Edit your kernel.json file so that its content looks like this:
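A minimal sketch of what such a kernel.json might contain, written out from Python. The keys argv, display_name and language are the standard kernel spec fields; the exact iruby command line in argv is an assumption, so check it against your own iruby install. The script writes to a temporary folder so it is safe to run as-is.

```python
import json
import tempfile
from pathlib import Path

# Assumed launch command for iruby; {connection_file} is filled in by Jupyter.
kernel_spec = {
    "argv": ["iruby", "kernel", "{connection_file}"],
    "display_name": "wtf",
    "language": "ruby",
}


def write_kernel_spec(kernel_dir: Path) -> Path:
    """Create the kernel folder if needed and write kernel.json into it."""
    kernel_dir.mkdir(parents=True, exist_ok=True)
    spec_path = kernel_dir / "kernel.json"
    spec_path.write_text(json.dumps(kernel_spec, indent=2))
    return spec_path


# Written to a temporary folder here; point this at your real kernel folder.
with tempfile.TemporaryDirectory() as tmp:
    path = write_kernel_spec(Path(tmp) / "wtf")
    print(path.read_text())
```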




As you can see, we're using iruby as the interpreter. I will definitely fix that in the near future.

A couple of things though: iruby draws on whatever Ruby gems you have installed.

If you modify your gems, those changes will affect the output of this session.

This is just to provide a quick and easy way for you to integrate your language into IPython.


You can check out the folder layout here.






My jupyter notebook looks like this.

PySpark - Working with JDBC Sqlite database

Spark supports connectivity to a JDBC database. In this session, we are going to see how you connect to a SQLite database. As of this writing, I am using Spark 2.1.1.

First, set up your spark-defaults.conf file in your conf folder.




Next, fire up pyspark, then run the following script in your REPL.
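A sketch of what such a script could look like. It assumes the sqlite-jdbc driver jar has already been made visible to Spark through spark-defaults.conf (for example via spark.jars or spark.driver.extraClassPath), and that `spark` is the session object the pyspark shell creates for you. The helper names, the database path and the table name are all hypothetical.

```python
def sqlite_jdbc_options(db_path: str, table: str) -> dict:
    """Build the options Spark's JDBC reader needs for a SQLite file."""
    return {
        "url": f"jdbc:sqlite:{db_path}",
        "dbtable": table,
        "driver": "org.sqlite.JDBC",
    }


def load_rows(spark, db_path: str, table: str) -> list:
    """Read a table through JDBC and return its rows as a Python list."""
    reader = spark.read.format("jdbc").options(**sqlite_jdbc_options(db_path, table))
    return reader.load().collect()


# In the pyspark shell (path and table name are placeholders):
# a = load_rows(spark, "j:/tmp/test.db", "users")
print(sqlite_jdbc_options("j:/tmp/test.db", "users"))
```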




The returned data is a list. You will probably want to access the data like this:


>>> a[0].ID
2
>>> a[0].Name
'mark'



Looping through it:


>>> for i in a: print(i.Name)

mark
eric


That's it.








django rest framework - getting started

To get started with django rest framework, you have to start reading from the back, because for some reason these guys like to give out the best implementation at the end.

Anyway, go ahead and create your virtualenv.


pip install django
pip install djangorestframework

# Set up a new project with a single application
django-admin.py startproject tutorial .  # Note the trailing '.' character
cd tutorial
django-admin.py startapp quickstart
cd ..

Let's start off with the Request/Response objects, as these remain critical for our development efforts, and work with urls.py.

What the 2nd line says is that we are creating a root API for our application, and the "name" properties you see in the 3rd and 4th lines are references that let us do a reverse lookup, which you will see later.




urlpatterns = [
    url(r'^$', views.api_root),
    url(r'^deploy/$', views.SimpleDeployer.as_view(), name='deploy-list'),
    url(r'^deploy/simple/$', views.SimpleDeployer.as_view(), name='…

Quick start using PySpark

Now that we have configured and started pyspark, let's go over some common functions that we will be using:

Let's take a look at our data file.

Assume we started our shell with the following command :-

from pyspark import SparkContext
sc = SparkContext.getOrCreate()
tf = sc.textFile("j:\\tmp\\data.txt")

filter - keeps the results that match a boolean condition. Example:

>>> tf.filter(lambda a  : "test" in a).count()
3

This counts the lines that contain "test".

collect - really useful; it returns a list of all the elements in an RDD.

collect is pretty handy, especially when you want to see results.

>>> tf.filter(lambda a  : "test" in a).collect()
['test11111', 'testing ', 'reest of the world; test11111']


map - returns a new RDD by applying a function to each element. Here I am applying upper case to my lines and returning the results using collect().

>>> …
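The three operations above can be strung together in one helper. This is only a sketch: `summarize` and `contains_test` are made-up names, and the function expects an RDD such as the `tf` created earlier, so nothing Spark-related runs until you call it from the pyspark shell.

```python
def contains_test(line: str) -> bool:
    """Predicate used by filter: keep lines that contain 'test'."""
    return "test" in line


def summarize(rdd) -> dict:
    """Apply filter, count, collect and map to an RDD of text lines."""
    matches = rdd.filter(contains_test)
    return {
        "count": matches.count(),      # number of matching lines
        "lines": matches.collect(),    # matching lines as a Python list
        "upper": matches.map(lambda line: line.upper()).collect(),
    }


# In the pyspark shell:
# summarize(tf)
```

Keep in mind that filter and map are lazy; the work only happens when count or collect is called.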

Pyspark with spark 2.1.0 - Python cannot be version 3.6.

To get started using Spark with Python, you can:

1. Install Anaconda Python - which has all the goodies you need.

2. Download Spark and unzip it into a folder.

3. After you have all this set up, issue the following commands (Spark 2.1.0 does not work with Python 3.6, so use Python 3.5):

conda create -n py35 python=3.5 anaconda

activate py35
4. Go to your Spark installation folder, go to "bin" and run "pyspark".
5. You are probably going to get some exceptions, but you should still be able to run the following script:

from pyspark import SparkContext
sc = SparkContext.getOrCreate()
tf = sc.textFile("j:\\tmp\\data.txt")
tf.count()
Please make sure the path to your "data.txt" is correct.
This setup looks easier than it is. I spent a lot of time today trying to get it up and running.
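Before launching pyspark, it can save time to confirm which interpreter is actually active. A small sketch, assuming the only rule to enforce is the one from this post (Spark 2.1.0 needs a Python older than 3.6); the function name is made up.

```python
import sys


def python_ok_for_spark_2_1_0(version=None) -> bool:
    """Return True when the interpreter is older than Python 3.6."""
    major, minor = (version or sys.version_info)[:2]
    return (major, minor) < (3, 6)


if python_ok_for_spark_2_1_0():
    print("Python %d.%d should work with Spark 2.1.0" % sys.version_info[:2])
else:
    print("Python %d.%d detected; activate the py35 environment first" % sys.version_info[:2])
```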








Search relevance

Search relevance

http://blog.kaggle.com/2016/05/18/home-depot-product-search-relevance-winners-interview-1st-place-alex-andreas-nurlan/

http://blog.kaggle.com/2016/06/15/home-depot-product-search-relevance-winners-interview-2nd-place-thomas-sean-qingchen-nima/

http://blog.kaggle.com/2016/06/01/home-depot-product-search-relevance-winners-interview-3rd-place-team-turing-test-igor-kostia-chenglong/


Claims

http://blog.kaggle.com/2016/05/13/bnp-paribas-cardif-claims-management-winners-interview-1st-place-team-dexters-lab-darius-davut-song/


Demand prediction

http://blog.kaggle.com/2016/02/03/rossmann-store-sales-winners-interview-2nd-place-nima-shahbazi/

http://blog.kaggle.com/2016/09/27/grupo-bimbo-inventory-demand-winners-interviewclustifier-alex-andrey/


Purchase Prediction

- Current competition is https://www.kaggle.com/c/instacart-market-basket-analysis/kernels


http://blog.kaggle.com/2014/07/11/first-place-in-purchase-prediction-challenge/



Product recommendation

http://blog.kaggle.com/…

Creating your first pypi

To create your first pip-installable package, create a folder and add the following files to it (alternatively, you can download them from here).


.pypirc

[distutils]
index-servers =
  pypi
  pypitest

[pypi]
repository=https://pypi.python.org/pypi
username=
password=

[pypitest]
repository=https://testpypi.python.org/pypi
username=
password=
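A quick way to sanity-check the file before uploading is to parse it with Python's configparser and confirm that every entry under index-servers has its own section. This is just a sketch: the function name is made up, and the text is pasted inline here instead of being read from your home folder.

```python
import configparser

PYPIRC_TEXT = """\
[distutils]
index-servers =
  pypi
  pypitest

[pypi]
repository=https://pypi.python.org/pypi
username=
password=

[pypitest]
repository=https://testpypi.python.org/pypi
username=
password=
"""


def check_pypirc(text: str) -> dict:
    """Return {server name: repository URL}, failing on missing sections."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    servers = parser.get("distutils", "index-servers").split()
    missing = [name for name in servers if not parser.has_section(name)]
    if missing:
        raise ValueError(f"no section for index server(s): {missing}")
    return {name: parser.get(name, "repository") for name in servers}


print(check_pypirc(PYPIRC_TEXT))
```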



Next, create a file called setup.py which contains some basic information about your package.

from distutils.core import setup
setup(
  name = 'kepungmath',
  packages = ['kepungmath'], # this must be the same as the name above
  version = '0.1',
  description = 'A random test lib',
  author = 'Jeremy Woo',
  author_email = 'kepung@gmail.com',
  url = 'https://github.com/appcoreopc/kepungmath', # use the URL to the github repo
  download_url = 'https://github.com/appcoreopc/kepungmath/archive/0.1.tar.gz', # I'll explain this in a second
  keywords = ['testing', 'logging', 'example'], # arbitrary…

Getting started with Jupyter - internals

Jupyter notebook is a web-based application that uses web sockets to talk to IPython and execute Python (or other languages like Julia and Haskell). It is really popular when it comes to executing machine learning kernels.

Say, for example, you are trying to use numpy to read certain CSV files and plot them with matplotlib; you can easily do so using IPython.

Great! So how does it work?

First you need to set up Jupyter notebook. I assume that you have installed Anaconda on your system.

First, let's set up our virtual environment.
virtualenv notebook

Now activate our newly created environment.
activate notebook
pip install --upgrade setuptools pip
git clone https://github.com/jupyter/notebook
cd notebook
pip install -e .

From the diagram below, we can see that notebook depends on jupyter_client (purple) to talk to ipython (yellow).
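What travels between the notebook and the kernel are JSON messages. As a rough sketch, the function below builds the kind of execute_request message that asks a kernel to run a cell, following the field layout of the Jupyter messaging protocol. The function name, the "demo" username and the "5.0" version string are illustrative choices, and nothing is actually sent anywhere.

```python
import json
import uuid
from datetime import datetime, timezone


def make_execute_request(code: str, session: str, username: str = "demo") -> dict:
    """Build an execute_request message as a plain dictionary."""
    return {
        "header": {
            "msg_id": uuid.uuid4().hex,   # unique id for this message
            "session": session,           # id shared by one client session
            "username": username,
            "date": datetime.now(timezone.utc).isoformat(),
            "msg_type": "execute_request",
            "version": "5.0",
        },
        "parent_header": {},              # empty: this message starts the exchange
        "metadata": {},
        "content": {
            "code": code,
            "silent": False,
            "store_history": True,
            "user_expressions": {},
            "allow_stdin": False,
            "stop_on_error": True,
        },
    }


msg = make_execute_request("1 + 1", session=uuid.uuid4().hex)
print(json.dumps(msg, indent=2))
```

The kernel's replies carry the request's header as their parent_header, which is how the front end knows which cell a result belongs to.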





At the end of the cell execute request, the results will be displayed / rendered in the notebook. Communication is achieved via web sockets, and all of that code is in JavaScript and not Python. Will consum…