Spark Kernel

The main goal of the Spark Kernel is to provide the foundation for interactive applications to connect to and use Apache Spark.

Overview

The Spark Kernel provides an interface that allows clients to interact with a Spark cluster. Clients can send libraries and snippets of code that are interpreted and run against a preconfigured Spark context. These snippets can do a variety of things:

Define and run Spark jobs of all kinds
Collect results from Spark and push them to the client
Load necessary dependencies for the running code
Start and monitor a stream
...

The kernel's main supported language is Scala, but it is also capable of processing both Python and R. It implements the latest Jupyter message protocol (5.0), so it can easily plug into the 3.x branch of Jupyter/IPython for quick, interactive data exploration.

Try It

A version of the Spark Kernel is deployed as part of the Try Jupyter! site. Select Scala 2.10.4 (Spark 1.4.1) under the New dropdown. Note that this version only supports Scala.

Develop

This project uses make as the entry point for build, test, and packaging. It supports two modes, local and vagrant. The default is local, in which all commands (such as sbt) run on your machine, so you need to install sbt, Jupyter/IPython, and the other development requirements yourself. The second mode uses Vagrant to simplify the development experience: all commands are sent to a Vagrant box that has the necessary dependencies pre-installed. To use vagrant mode, run export USE_VAGRANT=true.
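If you switch between the two modes often, a small wrapper can make the mode explicit for each invocation. The sketch below is not part of the project: make_in_mode and the MAKE_CMD override are hypothetical names, and it assumes that the Makefile reads USE_VAGRANT from the environment and that leaving the variable unset selects local mode.

```shell
# Hypothetical wrapper: run make targets in an explicit mode without
# changing USE_VAGRANT for the rest of the shell session.
# Usage: make_in_mode local|vagrant <target>...
# MAKE_CMD is an assumed override for the command to run (defaults to make).
make_in_mode() {
  mode="$1"
  shift
  case "$mode" in
    vagrant)
      # Set the variable only inside a subshell.
      ( export USE_VAGRANT=true; "${MAKE_CMD:-make}" "$@" )
      ;;
    local)
      # Clear the variable only inside a subshell.
      ( unset USE_VAGRANT; "${MAKE_CMD:-make}" "$@" )
      ;;
    *)
      echo "unknown mode: $mode" >&2
      return 2
      ;;
  esac
}
```

With this in place, make_in_mode vagrant test would run the tests in the Vagrant box, while make_in_mode local test would run them on your machine, regardless of what the surrounding shell has exported.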

To build and interact with the Spark Kernel using Jupyter, run

make dev

This will start a Jupyter notebook server. Depending on your mode, it will be accessible at http://localhost:8888 or http://192.168.44.44:8888. From here you can create notebooks that use the Spark Kernel configured for local mode.
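The server can take a moment to start, so it may help to wait for it before opening a browser. The following is a rough sketch, not something shipped with the project: notebook_url and wait_for_notebook are hypothetical helpers, curl is assumed to be installed, and the mapping of localhost to local mode and 192.168.44.44 to vagrant mode is inferred from the order of the two addresses above.

```shell
# Print the notebook server URL for the current mode.
# Assumption: localhost is the local-mode address and
# 192.168.44.44 is the address of the Vagrant box.
notebook_url() {
  if [ "${USE_VAGRANT:-}" = "true" ]; then
    echo "http://192.168.44.44:8888"
  else
    echo "http://localhost:8888"
  fi
}

# Poll the URL once per second until the server answers
# or the given number of attempts (default 30) runs out.
wait_for_notebook() {
  url="$(notebook_url)"
  attempts="${1:-30}"
  while [ "$attempts" -gt 0 ]; do
    if curl --silent --fail --output /dev/null "$url"; then
      echo "notebook server is up at $url"
      return 0
    fi
    attempts=$((attempts - 1))
    sleep 1
  done
  echo "no response from $url" >&2
  return 1
}
```

A typical use would be to start make dev in one terminal and run wait_for_notebook in another.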

Run the tests with make test.

NOTE: Do not use sbt directly.

Build & Package

To build and package up the Spark Kernel, run

make dist

The resulting package of the kernel will be located at ./dist/spark-kernel-<VERSION>.tar.gz. The uncompressed package is what Jupyter runs when you use make dev.
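To inspect the package by hand, you can locate and unpack it yourself. This is a minimal sketch under stated assumptions: find_kernel_package and unpack_kernel_package are hypothetical helpers, the default target directory ./dist/unpacked is an arbitrary choice, and the layout inside the archive is not described here.

```shell
# Print the path of the packaged kernel in a dist directory (default ./dist).
# Fails unless exactly one spark-kernel-*.tar.gz is present.
find_kernel_package() {
  dist_dir="${1:-./dist}"
  # Expand the glob into the function's positional parameters.
  set -- "$dist_dir"/spark-kernel-*.tar.gz
  if [ "$#" -ne 1 ] || [ ! -f "$1" ]; then
    echo "expected exactly one spark-kernel-*.tar.gz in $dist_dir" >&2
    return 1
  fi
  echo "$1"
}

# Extract the package into a target directory.
# The default target ./dist/unpacked is a hypothetical location.
unpack_kernel_package() {
  pkg="$(find_kernel_package "${1:-./dist}")" || return 1
  target="${2:-./dist/unpacked}"
  mkdir -p "$target" && tar -xzf "$pkg" -C "$target"
}
```

After make dist, calling unpack_kernel_package with no arguments would extract the archive next to it. Matching on the file name pattern avoids hard-coding the version number.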

Version

Our goal is to keep master up to date with the latest version of Spark. When new versions of Spark require code changes, we create a separate branch. The table below shows what is available now.

Branch	Spark Kernel Version	Apache Spark Version
master	0.1.5	1.5.1
branch-0.1.4	0.1.4	1.4.1
branch-0.1.3	0.1.3	1.3.1

Please note that for the most part, new features to Spark Kernel will only be added to the master branch.
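If you script your checkout, the branch table above can be encoded as a simple lookup. This is an illustrative sketch only: branch_for_spark is a hypothetical helper that covers just the three rows listed, and it will go stale whenever the table changes.

```shell
# Map an Apache Spark version to the branch listed in the table above.
# Prints the branch name, or fails for versions the table does not list.
branch_for_spark() {
  case "$1" in
    1.5.1) echo "master" ;;
    1.4.1) echo "branch-0.1.4" ;;
    1.3.1) echo "branch-0.1.3" ;;
    *)
      echo "no branch listed for Spark $1" >&2
      return 1
      ;;
  esac
}
```

For example, git checkout "$(branch_for_spark 1.4.1)" would switch a clone to the branch built against Spark 1.4.1.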

Resources

There is more detailed information available in our Wiki and our Getting Started guide.
