Getting started with Spark on Databricks Cloud



After you sign up for the Databricks Community Edition, the first thing you will usually do is create a notebook.

If you want to use an existing database, you can select it and query its tables with the following commands.

For example, to query a table called airlineflight:


use databricks;

select month from airlineflight where year == 2002


To load the table into a DataFrame, you can use the following command:

%python 

df = sqlContext.sql("Select * from airlineflight")


From this point onwards, you can manipulate the DataFrame using operations such as filter and select. Please refer to the Spark DataFrame documentation for details.


Some quick examples:

%python

df.select("FlightNum").collect()


df.filter(df.DepTime>24).count()

