Docker Image

Docker is one of the easiest ways to start working with CDAP without having to manually configure anything. A Docker image with the CDAP SDK pre-installed is available on the Docker Hub for download.

To use the Docker image, you can either use Docker’s Kitematic (on Mac OS X and Windows)—a graphical user interface for running Docker containers—or start the container from a command line.

Docker using Kitematic

Docker Kitematic is available as part of the Docker Toolbox for either Mac OS X or Microsoft Windows. It is a graphical user interface for running Docker containers. Follow these steps to install Kitematic and then download, start, and connect to an instance of CDAP.

  1. Download and install the Docker Toolbox for either Mac OS X or Microsoft Windows.

  2. Start Kitematic. On Mac OS X, it will be installed in /Applications/Docker/Kitematic; on Windows, in Start Menu > Docker > Kitematic.

  3. Once Kitematic has started, search for the CDAP image using the search box at the top of the window. Then click on the repository menu, circled in red here:

  4. Click on the tags button:

  5. Select the desired version. Note that the tag latest is the version most recently uploaded to Docker Hub, which is not necessarily the most current version:

  6. Close the menu by pressing the X in the circle. Press “Create” to download and start the CDAP image. When it has started up, you will see in the logs a message that the CDAP UI is listening on port 9999:

  7. To connect a web browser for the CDAP UI, you’ll need to find the external IP addresses and ports that the Docker instance is exposing. The easiest way to do that is click on the Settings tab, and then the Ports tab:

  8. This shows that the CDAP instance is listening on the internal port 9999 within the Docker instance, while the Docker instance exposes that port on the external IP address and port shown there. The text in blue is a link; clicking it will open it in your system web browser and connect to the CDAP UI:


Docker from a Command Line

  • Docker is available for a variety of platforms. Download and install Docker in your environment by following the platform-specific installation instructions, then verify that Docker is working and has started correctly.

    If you are not running on Linux, you need to start the Docker Virtual Machine (VM) before you can use containers. For example, on Mac OS X:

    $ boot2docker start
    $ boot2docker ip

    or, on Windows:

    > boot2docker start
    > boot2docker ip

    The second command prints the Docker VM’s IP address. You will need to use that address as the host name when either connecting to the CDAP UI or making an HTTP request.

    When you run boot2docker start, it will print a message on the screen such as:

    To connect the Docker client to the Docker daemon, please set:
        export DOCKER_HOST=tcp://
        export DOCKER_CERT_PATH=/Users/.../.boot2docker/certs/boot2docker-vm
        export DOCKER_TLS_VERIFY=1

    It is essential to run these export commands (or command, if only one). Otherwise, subsequent Docker commands will fail because they can’t tell how to connect to the Docker VM.

  • Once Docker has started, pull down the CDAP Docker image from Docker Hub using:

    $ docker pull caskdata/cdap-standalone:3.4.3

    or, on Windows:

    > docker pull caskdata/cdap-standalone:3.4.3

  • Start the Docker CDAP Virtual Machine with:

    $ docker run -t -i -p 9999:9999 -p 10000:10000 caskdata/cdap-standalone:3.4.3

    or, on Windows:

    > docker run -t -i -p 9999:9999 -p 10000:10000 caskdata/cdap-standalone:3.4.3

  • CDAP will start automatically once the CDAP Virtual Machine starts. CDAP’s Software Directory is under /opt/cdap/sdk.

  • Once CDAP starts, it will instruct you to connect to the CDAP UI with a web browser at http://localhost:9999. If you are using a Docker VM, replace localhost with the Docker VM’s IP address that you obtained earlier. Start a browser and enter the address to access the CDAP UI.

  • For a full list of Docker Commands, see the Docker Command Line Documentation.
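
As a rough sketch of the address substitution described above, the following shell function derives the CDAP UI address from the value of DOCKER_HOST. The function name cdap_ui_url is hypothetical and not part of CDAP or Docker. It assumes that DOCKER_HOST has the form tcp://<ip>:<port> when a Docker VM is in use, and that the UI is published on port 9999 as in the docker run command above.

```shell
#!/bin/sh
# Hypothetical helper (not part of CDAP or Docker): build the CDAP UI URL
# from a DOCKER_HOST value such as tcp://<ip>:<port>.
cdap_ui_url() {
    case "$1" in
        tcp://*)
            ui_host=${1#tcp://}      # drop the tcp:// scheme
            ui_host=${ui_host%%:*}   # drop the :port suffix
            ;;
        *)
            # Empty or non-TCP value: assume Docker runs locally (Linux).
            ui_host=localhost
            ;;
    esac
    # Port 9999 is the UI port published by the docker run command above.
    printf 'http://%s:9999\n' "$ui_host"
}

# Example use, once the export commands have been run:
#   cdap_ui_url "${DOCKER_HOST:-}"
```

If DOCKER_HOST is unset or is not a tcp:// address, the sketch falls back to localhost, which matches the case of running Docker directly on Linux.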

Docker and CDAP Applications

Development Environment Setup

Creating an Application

When writing a CDAP application, it’s best to use an integrated development environment that understands the application interface to provide code-completion in writing interface methods.

The best way to start developing a CDAP application is by using the Maven archetype:

$ mvn archetype:generate \
    -DarchetypeGroupId=co.cask.cdap \
    -DarchetypeArtifactId=cdap-app-archetype

or, on Windows:

> mvn archetype:generate ^
    -DarchetypeGroupId=co.cask.cdap ^
    -DarchetypeArtifactId=cdap-app-archetype

This creates a Maven project with all required dependencies, Maven plugins, and a simple application template for the development of your application. You can import this Maven project into your preferred IDE—such as IntelliJ or Eclipse—and start developing your first CDAP application.

For an application that contains a MapReduce program, use -DarchetypeArtifactId=cdap-mapreduce-archetype instead; for Spark, use either cdap-spark-java-archetype or cdap-spark-scala-archetype.
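
The choice of archetype can be sketched as a small lookup. The function name archetype_for and the short program-type labels (app, mapreduce, spark-java, spark-scala) are made up for this illustration; only the archetype artifact IDs come from the text above.

```shell
#!/bin/sh
# Hypothetical helper: map a program type label to the archetype artifact ID.
# The labels on the left are illustrative; the artifact IDs are those named above.
archetype_for() {
    case "$1" in
        app)         echo cdap-app-archetype ;;
        mapreduce)   echo cdap-mapreduce-archetype ;;
        spark-java)  echo cdap-spark-java-archetype ;;
        spark-scala) echo cdap-spark-scala-archetype ;;
        *)
            echo "unknown program type: $1" >&2
            return 1
            ;;
    esac
}

# Example use:
#   mvn archetype:generate \
#       -DarchetypeGroupId=co.cask.cdap \
#       -DarchetypeArtifactId="$(archetype_for mapreduce)"
```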

Using IntelliJ

  1. Open IntelliJ and import the Maven project.
  2. Go to menu File -> Import Project...
  3. Select the pom.xml in the Maven project’s directory.
  4. Select the Import Maven projects automatically and Automatically download: Sources, Documentation boxes in the Import Project from Maven dialog.
  5. Click Next, complete the remaining dialogs, and the new CDAP project will be created and opened.

Using Eclipse

  1. In your Eclipse installation, make sure you have the m2eclipse plugin installed.
  2. Go to menu File -> Import
  3. Enter maven in the Select an import source dialog to filter for Maven options.
  4. Select Existing Maven Projects as the import source.
  5. Browse for the Maven project’s directory.
  6. Click Finish, and the new CDAP project will be imported, created and opened.

Starting and Stopping Standalone CDAP

Use the start-up script in the SDK’s bin directory (cdap.bat on Windows) to start and stop the Standalone CDAP. The location will vary depending on where the CDAP SDK is installed:

$ cd cdap-sdk-3.4.3
$ ./bin/ start
. . .
$ ./bin/ stop

or, on Windows:

> cd cdap-sdk-3.4.3
> .\bin\cdap.bat start
. . .
> .\bin\cdap.bat stop

Note: On Microsoft Windows, there is an issue with the CDAP Standalone scripts when JAVA_HOME is defined as a path with spaces in it. A workaround is to use a definition of JAVA_HOME that does not include spaces, such as C:\PROGRA~1\Java\jdk1.7.0_79\bin or C:\ProgramData\Oracle\Java\javapath.

Note that starting CDAP is not necessary if you use either the Virtual Machine or the Docker image, as they both start the Standalone CDAP automatically on startup.

Once CDAP has started successfully, the CDAP UI is available in a web browser at http://localhost:9999, where you can deploy example applications and interact with CDAP.

Note that in the case of the Docker image, you will need to substitute the Docker VM’s IP address for localhost in the web browser address bar.
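
Because the UI only becomes reachable some time after start-up, a script that needs it may have to wait. A minimal sketch of one way to do that is a generic retry loop. The function name retry and its argument order are assumptions for this illustration and are not part of the CDAP SDK.

```shell
#!/bin/sh
# Hypothetical helper: run a command up to N times, pausing between attempts.
# Usage: retry <attempts> <delay-seconds> <command> [args...]
retry() {
    attempts=$1
    delay=$2
    shift 2
    i=1
    while [ "$i" -le "$attempts" ]; do
        # Succeed as soon as the command succeeds.
        if "$@"; then
            return 0
        fi
        # Pause before the next attempt, but not after the last one.
        if [ "$i" -lt "$attempts" ]; then
            sleep "$delay"
        fi
        i=$((i + 1))
    done
    return 1
}

# Example use: poll the CDAP UI address with curl, 30 attempts 2 seconds apart.
#   retry 30 2 curl -s -f -o /dev/null http://localhost:9999
```

With the Docker image, the address passed to curl would use the Docker VM’s IP address in place of localhost.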

Building and Running CDAP Applications

See Building and Running CDAP Applications for information on accessing the CDAP CLI and CDAP SDK bin utilities, building examples, starting CDAP, and deploying, starting, and stopping applications.