Hadoop on Docker as a single-node cluster
Start your instance by running the following command:
docker run -it sequenceiq/hadoop-docker:2.7.0 /etc/bootstrap.sh -bash
To reach the NameNode web UI from the host, publish port 50070 as well:
docker run -it -p 50070:50070 sequenceiq/hadoop-docker:2.7.0 /etc/bootstrap.sh -bash
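You can then browse HDFS at the following URL (192.168.99.100 is the usual Docker Toolbox VM address; substitute your own Docker host if it differs):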
http://192.168.99.100:50070/explorer.html#/
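If your Docker host is not at 192.168.99.100, the URL changes accordingly. A small sketch that builds it from a host and port; namenode_url is a hypothetical helper made up for this post, not something shipped with Docker or Hadoop:

```shell
# namenode_url [HOST] [PORT]: print the HDFS explorer URL.
# HOST defaults to localhost, PORT to 50070 (the port published above).
namenode_url() {
  host="${1:-localhost}"
  port="${2:-50070}"
  printf 'http://%s:%s/explorer.html#/\n' "$host" "$port"
}

# Docker Toolbox users can usually get the VM address from `docker-machine ip`.
namenode_url 192.168.99.100
```

To check that the page is actually being served, the printed URL can be passed to curl, for example: curl -sf "$(namenode_url 192.168.99.100)" > /dev/null && echo up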
Next, to prepare for Hive, run the container with c:/tmp/hive on the host mounted at /hive:
docker run -it -p 50070:50070 -v c:/tmp/hive:/hive sequenceiq/hadoop-docker:2.7.0 /etc/bootstrap.sh -bash
With Docker Toolbox, you might need to run the following command instead:
docker run -it -p 50070:50070 -v /hive:/hive sequenceiq/hadoop-docker:2.7.0 /etc/bootstrap.sh -bash
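The docker run variants above differ only in the optional -v mount. As a rough sketch, a helper could print the right command for a given host directory; hadoop_run_cmd and its argument are hypothetical names used only for illustration:

```shell
# hadoop_run_cmd [HOST_DIR]: print the docker run command used in this post.
# HOST_DIR (optional) is the host directory to mount at /hive.
hadoop_run_cmd() {
  host_dir="${1:-}"
  mount=""
  if [ -n "$host_dir" ]; then
    mount=" -v $host_dir:/hive"
  fi
  echo "docker run -it -p 50070:50070$mount sequenceiq/hadoop-docker:2.7.0 /etc/bootstrap.sh -bash"
}

hadoop_run_cmd               # no Hive mount
hadoop_run_cmd c:/tmp/hive   # Hive extracted to c:/tmp/hive on the host
hadoop_run_cmd /hive         # the Docker Toolbox variant
```

The helper only prints; copy the line you need into your terminal once it looks right.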
To submit jobs
$HADOOP_HOME/bin/hadoop jar $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.0.jar grep input output 'dfs[a-z.]+'
$HADOOP_HOME/bin/hdfs dfs -cat output/*
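The grep example job counts how often each string matching the regular expression occurs in the input files. To get a feel for what ends up in output, here is a rough local analogue built from plain shell tools. It is not the Hadoop implementation, the count-then-match output layout is an approximation, and local_grep_count is a hypothetical name:

```shell
# local_grep_count PATTERN FILE...: count each distinct match of PATTERN.
local_grep_count() {
  pattern="$1"
  shift
  # -o prints every match on its own line, -h drops file names, -E enables +.
  grep -ohE "$pattern" "$@" \
    | sort \
    | uniq -c \
    | sort -k1,1nr -k2,2 \
    | awk '{ printf "%s\t%s\n", $1, $2 }'
}

# Made-up sample input, only for trying the function out.
sample="$(mktemp)"
printf 'dfs.replication dfs.name.dir\ndfs.replication\n' > "$sample"
local_grep_count 'dfs[a-z.]+' "$sample"
rm -f "$sample"
```

On the real cluster the same counting is done by the MapReduce job across the files under input in HDFS.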
Turning on Hive (this step is optional)
To enable Hive, download Hive and extract it to c:/tmp/hive. The docker run command above already mounts that directory at /hive in the container.
Exporting variables and creating HDFS directories
export HADOOP_HOME=/usr/local/hadoop-2.7.0
export HIVE_HOME=/hive
$HADOOP_HOME/bin/hadoop fs -mkdir /tmp
$HADOOP_HOME/bin/hadoop fs -mkdir /user
$HADOOP_HOME/bin/hadoop fs -mkdir /user/hive
$HADOOP_HOME/bin/hadoop fs -mkdir /user/hive/warehouse
$HADOOP_HOME/bin/hadoop fs -chmod g+w /tmp
$HADOOP_HOME/bin/hadoop fs -chmod g+w /user/hive/warehouse
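The directory setup above can be condensed with -mkdir -p, which creates missing parents and does not complain if a directory already exists. A minimal sketch, assuming the same HADOOP_HOME; the run helper and the DRY_RUN switch are hypothetical additions so the commands can be previewed before they touch HDFS:

```shell
HADOOP_HOME="${HADOOP_HOME:-/usr/local/hadoop-2.7.0}"

# run CMD...: print CMD when DRY_RUN=1 (the default in this sketch),
# otherwise execute it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

setup_hive_dirs() {
  # Both subcommands accept several paths in one call.
  run "$HADOOP_HOME/bin/hadoop" fs -mkdir -p /tmp /user/hive/warehouse
  run "$HADOOP_HOME/bin/hadoop" fs -chmod g+w /tmp /user/hive/warehouse
}

setup_hive_dirs
```

Inside the container, DRY_RUN=0 setup_hive_dirs would execute the two commands instead of printing them.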
Initialize your metastore
$HIVE_HOME/bin/schematool -initSchema -dbType derby
(This form works. For some reason, running it slightly differently, as below, fails.)
$HIVE_HOME/bin/schematool -dbType-initSchema
Starting Hive
Start Hive using the following command:
$HIVE_HOME/bin/hive
Starting Hive server
$HIVE_HOME/bin/hiveserver2
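hiveserver2 runs in the foreground, so start it in a second terminal or in the background, and expect a short delay before it accepts connections. One way to script the wait is a generic retry loop. This is only a sketch, and wait_for is a hypothetical helper, not part of Hive:

```shell
# wait_for ATTEMPTS DELAY CMD...: run CMD until it succeeds.
# Gives up (non-zero status) after ATTEMPTS tries, sleeping DELAY seconds
# after each failed try.
wait_for() {
  attempts="$1"
  delay="$2"
  shift 2
  i=0
  while [ "$i" -lt "$attempts" ]; do
    if "$@" > /dev/null 2>&1; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}

# Trivial demonstration with a command that always succeeds.
wait_for 3 0 true && echo "ready"

# Possible use inside the container (not executed here):
#   wait_for 30 2 "$HIVE_HOME/bin/beeline" -u jdbc:hive2://localhost:10000 \
#     -n maria_dev -e 'SHOW DATABASES;'
```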
Starting Beeline
$HIVE_HOME/bin/beeline -u jdbc:hive2://localhost:10000 -n maria_dev
Load a sample CSV file into HDFS (run these in the container shell, not in Beeline):
$HADOOP_HOME/bin/hdfs dfs -mkdir /app
$HADOOP_HOME/bin/hdfs dfs -put /hive/sample.csv /app
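The -put above assumes /hive/sample.csv exists. If you do not have one, a tiny two-column file is enough to try things out. The rows below are made-up placeholder data and make_sample_csv is a hypothetical helper; there is no header row, since a plain text table would read it as data:

```shell
# make_sample_csv [PATH]: write a small id,Code CSV file to PATH.
make_sample_csv() {
  out="${1:-sample.csv}"
  cat > "$out" <<'EOF'
1,AAA
2,BBB
3,CCC
EOF
}

# Written to a temp location here; on the host you would target the
# directory that is mounted at /hive (c:/tmp/hive in this post).
make_sample_csv "${TMPDIR:-/tmp}/sample.csv"
cat "${TMPDIR:-/tmp}/sample.csv"
```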
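Then, in Beeline, create a schema and an external table. LOCATION must name the HDFS directory that holds sample.csv, not the file itself:
CREATE SCHEMA IF NOT EXISTS bdp;
CREATE EXTERNAL TABLE IF NOT EXISTS bdp.hv_csv_table
(id STRING, Code STRING)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
STORED AS TEXTFILE
LOCATION 'hdfs://localhost:8020/app';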
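To check the table without opening an interactive session, Beeline can run a single statement with -e. A sketch that prints such a command, assuming the same JDBC URL and user as above; beeline_query_cmd is a hypothetical helper:

```shell
HIVE_HOME="${HIVE_HOME:-/hive}"

# beeline_query_cmd QUERY: print a one-shot Beeline invocation for QUERY.
beeline_query_cmd() {
  printf '%s/bin/beeline -u jdbc:hive2://localhost:10000 -n maria_dev -e "%s"\n' \
    "$HIVE_HOME" "$1"
}

beeline_query_cmd 'SELECT * FROM bdp.hv_csv_table LIMIT 5;'
```

Run the printed line inside the container while hiveserver2 is up; it should list the first rows of sample.csv.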