The following command runs a container with a shared folder (e.g. ~/Desktop/localFolder), listening on port 8888. The localFolder directory is located on the desktop, and you can use it to share files with the virtual machines, IPython notebook included.
docker run -d -p 8888:8888 -p 8080:8080 -v ~/Desktop/localFolder/:/notebooks --name pyspark cineca/spark-ipython
-d      detached mode
-p      port
-v      volume
--name  give a name to the container
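As a rough illustration of how the flags above fit together, here is a minimal Python sketch that assembles the same `docker run` command as an argument list. The helper name `docker_run_command` and its parameters are hypothetical, introduced only for this example; it builds the command and prints it but does not execute anything.

```python
import os
import shlex


def docker_run_command(host_dir, image, name, ports, mount_point="/notebooks"):
    """Return the argv list for a detached `docker run` with one shared folder.

    Hypothetical helper: it only builds the command, it does not run it.
    """
    argv = ["docker", "run", "-d"]               # -d: detached mode
    for port in ports:
        argv += ["-p", f"{port}:{port}"]         # -p: publish host:container port
    host_dir = os.path.expanduser(host_dir)      # expand a leading ~
    argv += ["-v", f"{host_dir}:{mount_point}"]  # -v: mount the shared folder
    argv += ["--name", name, image]              # --name: name of the container
    return argv


cmd = docker_run_command(
    "~/Desktop/localFolder", "cineca/spark-ipython", "pyspark", [8888, 8080]
)
print(shlex.join(cmd))
```

Keeping the command as a list (rather than one string) avoids quoting problems if it is later passed to `subprocess.run`.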
Open the browser at localhost:8888
Open the browser at localhost:8080
On Mac, remember that the actual IP of the VirtualBox VM can be found with boot2docker ip.
If you want to connect through localhost instead, you need the following port forwarding for VBox
(e.g. ports from 8880 to 8890):
for i in {8880..8890}; do
VBoxManage modifyvm "boot2docker-vm" --natpf1 "tcp-port$i,tcp,,$i,,$i";
VBoxManage modifyvm "boot2docker-vm" --natpf1 "udp-port$i,udp,,$i,,$i";
done
VBoxManage modifyvm "boot2docker-vm" --natpf1 "tcp-port8080,tcp,,8080,,8080"
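Each `--natpf1` rule has the form name,protocol,host IP,host port,guest IP,guest port, with the two IP fields left empty here. The loop above can be sketched in Python as follows; the function `natpf_commands` is a hypothetical helper that only generates the command lines so they can be reviewed before being run.

```python
def natpf_commands(ports, vm="boot2docker-vm", protocols=("tcp", "udp")):
    """Build one VBoxManage port-forwarding command per port and protocol.

    Hypothetical helper: returns argv lists, does not call VBoxManage.
    """
    commands = []
    for port in ports:
        for proto in protocols:
            # rule: name,protocol,host ip,host port,guest ip,guest port
            rule = f"{proto}-port{port},{proto},,{port},,{port}"
            commands.append(["VBoxManage", "modifyvm", vm, "--natpf1", rule])
    return commands


commands = natpf_commands(range(8880, 8891))  # ports 8880..8890 inclusive
for argv in commands:
    print(" ".join(argv))
```

Each returned list could then be passed to `subprocess.run(argv, check=True)` on a machine where VBoxManage is installed.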
To get info about the virtual machine where the containers run:
boot2docker info
To change the memory of the virtual machine (i.e. VBox), here to 4096 MB:
VBoxManage modifyvm boot2docker-vm --memory 4096
docker ps                        shows active containers
docker ps -a                     shows all containers
docker restart CONTAINER-ID      restarts a container
docker stop $(docker ps -aq)     stops all containers
docker rm $(docker ps -aq)       removes all containers
When the container is launched, the first command issued is:
IPYTHON_OPTS="notebook --no-browser --ip=0.0.0.0 --port 8888" /usr/local/spark/bin/pyspark cineca/pyspark-ipython
The IPython notebook will already have the SparkContext variable sc.
Type sc.version to see which version is loaded.
To read a file directly from the disk (no HDFS), use explicitly:
sc.textFile("file:///absolute_path to the file/")
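Since the shared folder is mounted at /notebooks inside the container, a file placed in ~/Desktop/localFolder on the host has a different absolute path when seen from the notebook. The sketch below shows one way to derive the `file://` URI to pass to `sc.textFile`. The helper `container_file_uri` and the file name `data/input.txt` are hypothetical, used only for illustration.

```python
import os
import posixpath


def container_file_uri(host_path,
                       host_dir="~/Desktop/localFolder",
                       mount_point="/notebooks"):
    """Map a file in the shared host folder to its file:// URI in the container.

    Hypothetical helper; assumes host_dir is mounted at mount_point via -v.
    """
    host_path = os.path.expanduser(host_path)
    host_dir = os.path.expanduser(host_dir)
    relative = os.path.relpath(host_path, host_dir)
    # Reject files that are not inside the shared folder.
    if relative == os.pardir or relative.startswith(os.pardir + os.sep):
        raise ValueError("file is outside the shared folder")
    # Container paths always use forward slashes.
    container_path = posixpath.normpath(
        posixpath.join(mount_point, *relative.split(os.sep))
    )
    return "file://" + container_path


uri = container_file_uri("~/Desktop/localFolder/data/input.txt")
print(uri)
```

In the notebook, the resulting string would be used as `sc.textFile(uri)`.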
SparkContext.textFile internally calls org.apache.hadoop.mapred.FileInputFormat.getSplits, which in turn uses org.apache.hadoop.fs.FileSystem.getDefaultUri if the URI scheme is absent. This method reads the "fs.defaultFS" parameter of the Hadoop configuration.
If you set the HADOOP_CONF_DIR environment variable, the parameter is usually set to "hdfs://..."; otherwise it defaults to "file:///".
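To make the effect of a missing scheme concrete, here is a simplified Python sketch of the resolution rule. It is not the Hadoop implementation: the function `qualify` is hypothetical, it handles absolute paths only, and the address `hdfs://namenode:9000` is an invented example value for fs.defaultFS.

```python
from urllib.parse import urlparse


def qualify(path, default_fs="file:///"):
    """Mimic how a path without a scheme is resolved against fs.defaultFS.

    Simplified, hypothetical model: absolute paths only.
    """
    if urlparse(path).scheme:
        return path  # explicit scheme: fs.defaultFS is not consulted
    if not path.startswith("/"):
        raise ValueError("this sketch only handles absolute paths")
    fs = urlparse(default_fs)
    return f"{fs.scheme}://{fs.netloc}{path}"


# No Hadoop configuration: the path is read from the local disk.
print(qualify("/notebooks/data.txt"))
# With an HDFS default (hypothetical address), the same path points to HDFS.
print(qualify("/notebooks/data.txt", "hdfs://namenode:9000"))
# An explicit file:// scheme always wins.
print(qualify("file:///notebooks/data.txt", "hdfs://namenode:9000"))
```

This is why writing the `file://` prefix explicitly keeps a notebook working regardless of whether HADOOP_CONF_DIR is set.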