Binary Zip File

The zip file is available in the Downloads section of the Cask website at http://cask.co/downloads/#cdap. Click the tab marked "Sandbox" for the CDAP Sandbox, then use the button there to download the latest version.

The CDAP Sandbox includes the software required for development and a version of CDAP suitable for running on a laptop.

Once downloaded, unzip it to a directory on your machine:

$ unzip cdap-sandbox-5.1.2.zip
> jar xf cdap-sandbox-5.1.2.zip

System Requirements and Dependencies

The CDAP Sandbox runs on Linux, macOS, and Windows, and has these requirements:

  • JDK 8 (required to run CDAP; note that $JAVA_HOME should be set)
  • Node.js (required to run the CDAP UI; we recommend v8.7.0 or later)
  • Apache Maven 3.0+ (required to build CDAP applications)

On Microsoft Windows, you must install the Microsoft Visual C++ 2010 Redistributable Package, which provides the DLLs required to run Hadoop and CDAP. Currently, CDAP is supported only on 64-bit Windows platforms.

Note: On Microsoft Windows, there is a known issue with the CDAP Local Sandbox scripts when CDAP_HOME is defined as a path containing spaces. Until this is addressed, do not use a path with space characters in it for CDAP_HOME.
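
The requirements above can be checked before starting the sandbox. The following is a minimal sketch, not part of the CDAP distribution: the helper names path_has_space and check_cdap_env are hypothetical, and the script only tests the conditions named in this section (JAVA_HOME set, no spaces in CDAP_HOME, and java, node, and mvn on the path). It does not verify tool versions.

```shell
#!/bin/sh
# Sketch only: these helper names are hypothetical, not part of CDAP.

# Succeeds (exit status 0) if the given path contains a space character.
path_has_space() {
  case "$1" in
    *" "*) return 0 ;;
    *) return 1 ;;
  esac
}

# Prints one line per problem found; prints nothing if all checks pass.
check_cdap_env() {
  if [ -z "${JAVA_HOME:-}" ]; then
    echo "JAVA_HOME is not set"
  fi
  if [ -n "${CDAP_HOME:-}" ] && path_has_space "$CDAP_HOME"; then
    echo "CDAP_HOME contains a space: $CDAP_HOME"
  fi
  for tool in java node mvn; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "$tool not found on PATH"
    fi
  done
}
```

Calling check_cdap_env with no arguments prints any problems it finds. The space check mirrors the Windows note above and is harmless on other platforms.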

Node.js Runtime

You can download an appropriate version of Node.js from nodejs.org. We recommend v8.7.0 or later.

To check that Node.js is installed, is on your path, and is an appropriate version, run:

$ node --version
> node --version
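
The version check can also be scripted. This is a minimal sketch under two assumptions: that the recommendation means v8.7.0 or any later release, and that node --version prints a plain vMAJOR.MINOR.PATCH string. The helper version_at_least is hypothetical and is not a Node.js or CDAP tool.

```shell
#!/bin/sh
# Sketch only: version_at_least is a hypothetical helper.
# Succeeds if version $1 (for example "v8.11.3") is at least version $2.
# Assumes both arguments have the form MAJOR.MINOR.PATCH, with an optional leading "v".
version_at_least() {
  have=${1#v}
  want=${2#v}
  for field in 1 2 3; do
    h=$(printf '%s\n' "$have" | cut -d. -f"$field")
    w=$(printf '%s\n' "$want" | cut -d. -f"$field")
    # Compare numerically so that 10 sorts after 8.
    if [ "$h" -gt "$w" ]; then return 0; fi
    if [ "$h" -lt "$w" ]; then return 1; fi
  done
  return 0
}

# Only run the check when node is actually on the PATH.
if command -v node >/dev/null 2>&1; then
  if version_at_least "$(node --version)" 8.7.0; then
    echo "Node.js version is new enough"
  else
    echo "Node.js is older than 8.7.0"
  fi
fi
```

The comparison is numeric per field rather than a string comparison, so v10.0.0 is correctly treated as newer than v8.7.0.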

Development Environment Setup

Creating an Application

When writing a CDAP application, it's best to use an integrated development environment (IDE) that understands the application interface and provides code-completion in writing interface methods.

The best way to start developing a CDAP application is by using the Maven archetype:

$ mvn archetype:generate \
    -DarchetypeGroupId=co.cask.cdap \
    -DarchetypeArtifactId=cdap-app-archetype \
    -DarchetypeVersion=5.1.2 \
    -DartifactId=myExampleApp \
    -DgroupId=org.example.app
> mvn archetype:generate ^
    -DarchetypeGroupId=co.cask.cdap ^
    -DarchetypeArtifactId=cdap-app-archetype ^
    -DarchetypeVersion=5.1.2 ^
    -DartifactId=myExampleApp ^
    -DgroupId=org.example.app

This creates a Maven project with all required dependencies, Maven plugins, and a simple application template for the development of your application (myExampleApp). You can import this Maven project into your preferred IDE—such as IntelliJ or Eclipse—and start developing your first CDAP application.

For an application that contains a MapReduce program, set the archetypeArtifactId to cdap-mapreduce-archetype; for Spark, use either cdap-spark-java-archetype or cdap-spark-scala-archetype.

Note: Replace the artifactId (myExampleApp) and groupId (org.example.app) parameters with your own application name and organization. Do not use co.cask.cdap as the groupId.

Complete examples for each archetype:

$ mvn archetype:generate -DarchetypeGroupId=co.cask.cdap -DarchetypeArtifactId=cdap-app-archetype -DarchetypeVersion=5.1.2
$ mvn archetype:generate -DarchetypeGroupId=co.cask.cdap -DarchetypeArtifactId=cdap-mapreduce-archetype -DarchetypeVersion=5.1.2
$ mvn archetype:generate -DarchetypeGroupId=co.cask.cdap -DarchetypeArtifactId=cdap-spark-java-archetype -DarchetypeVersion=5.1.2
$ mvn archetype:generate -DarchetypeGroupId=co.cask.cdap -DarchetypeArtifactId=cdap-spark-scala-archetype -DarchetypeVersion=5.1.2

When prompted, enter values for the groupId and artifactId parameters. For groupId, enter your own organization; do not use co.cask.cdap. (For the version and package parameters, you can either specify values or accept the Maven defaults.)

Maven supplies a guide to the naming convention used above at https://maven.apache.org/guides/mini/guide-naming-conventions.html.
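
To skip the prompts entirely, the Maven Archetype Plugin accepts -DinteractiveMode=false, in which case groupId and artifactId must be given on the command line. The sketch below assembles such a command for any of the archetypes listed above and refuses co.cask.cdap as the groupId. The function name cdap_archetype_cmd is hypothetical; it only prints the command so that it can be reviewed before it is run.

```shell
#!/bin/sh
# Sketch only: cdap_archetype_cmd is a hypothetical helper, not part of CDAP or Maven.
# Prints a non-interactive "mvn archetype:generate" command line.
#   $1 = archetype artifactId (for example cdap-app-archetype)
#   $2 = your groupId (must not be co.cask.cdap)
#   $3 = your artifactId
cdap_archetype_cmd() {
  if [ "$2" = "co.cask.cdap" ]; then
    echo "groupId must not be co.cask.cdap" >&2
    return 1
  fi
  printf 'mvn archetype:generate -DinteractiveMode=false -DarchetypeGroupId=co.cask.cdap -DarchetypeArtifactId=%s -DarchetypeVersion=5.1.2 -DgroupId=%s -DartifactId=%s\n' "$1" "$2" "$3"
}

# Example: print the command for the basic application archetype.
cdap_archetype_cmd cdap-app-archetype org.example.app myExampleApp
```

With interactive mode off, the version and package parameters fall back to their defaults unless they are also passed as -D options.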

Using IntelliJ

  1. Open IntelliJ and import the Maven project by:
    • If at the starting IntelliJ dialog, click on Import Project; or
    • If an existing project is open, go to the menu item File -> Open...
  2. Navigate to and select the pom.xml in the Maven project's directory.
  3. In the Import Project from Maven dialog, select the Import Maven projects automatically and Automatically download: Sources, Documentation boxes.
  4. Click Next, complete the remaining dialogs, and the new CDAP project will be created and opened.

Using Eclipse

  1. In your Eclipse installation, make sure you have the m2eclipse plugin installed.
  2. Go to menu File -> Import
  3. Enter maven in the Select an import source dialog to filter for Maven options.
  4. Select Existing Maven Projects as the import source.
  5. Browse for the Maven project's directory.
  6. Click Finish, and the new CDAP project will be imported, created and opened.

Running CDAP from within an IDE

As CDAP is an open source project, you can download the source, import it into an IDE, then modify, build, and run CDAP.

To do so, follow these steps:

  1. Install all the prerequisite system requirements for CDAP development.
  2. Either clone the CDAP repo or download a ZIP of the source:
    • Clone the CDAP repository using $ git clone -b v5.1.2 https://github.com/caskdata/cdap.git
    • Download the source as a ZIP from GitHub and unpack the ZIP in a suitable location
  3. In your IDE, install the Scala plugin (for IntelliJ or Eclipse) as there is Scala code in the project.
  4. Open the CDAP project in the IDE as an existing project by finding and opening the cdap/pom.xml.
  5. Resolve dependencies: this can take quite a while, as there are numerous downloads required.
  6. In the case of IntelliJ, you can create a run configuration to run CDAP Sandbox:
    1. Select Run > Edit Configurations...
    2. Add a new "Application" run configuration.
    3. Set "Main class" to be co.cask.cdap.StandaloneMain.
    4. Set "VM options" to -Xmx1024m (for in-memory MapReduce jobs).
    5. Click "OK".
    6. You can now use this run configuration to start an instance of CDAP Sandbox.

This will allow you to start CDAP and access it from either the command line (CLI) or through the HTTP RESTful API. To start the CLI, you can either start it from a shell using the cdap script or run the CLIMain class from the IDE.

If you want to run and develop the UI, you will need to follow additional instructions in the CDAP UI README.

Starting and Stopping CDAP Sandbox

Use the cdap sandbox script (or, if you are using Windows, use cdap.bat sandbox) to start and stop the CDAP Sandbox (the location will vary depending on where the CDAP Sandbox is installed):

$ cd cdap-sandbox-5.1.2
$ ./bin/cdap sandbox start
. . .
$ ./bin/cdap sandbox stop
> cd cdap-sandbox-5.1.2
> .\bin\cdap sandbox start
. . .
> .\bin\cdap sandbox stop

To run Spark2 programs with the CDAP Sandbox, edit the app.program.spark.compat setting in your cdap-site.xml file to be spark2_2.11. When the CDAP Sandbox is using Spark2, Spark1 programs cannot be run. When the CDAP Sandbox is using Spark1, Spark2 programs cannot be run.
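
To confirm which Spark compatibility setting is in effect, the value can be read back from cdap-site.xml. This is a minimal sketch that assumes the file uses the Hadoop-style property layout shown in the comment and that it lives at conf/cdap-site.xml under the sandbox directory; both are assumptions to verify against your installation. The helper site_property is hypothetical.

```shell
#!/bin/sh
# Sketch only: site_property is a hypothetical helper, not part of CDAP.
#
# Assumed layout of the setting inside cdap-site.xml:
#   <property>
#     <name>app.program.spark.compat</name>
#     <value>spark2_2.11</value>
#   </property>

# Prints the <value> of property $2 found in the XML file $1.
# Prints nothing if the property is absent. This is line-based text
# matching, not a real XML parser.
site_property() {
  awk -v key="$2" '
    index($0, "<name>" key "</name>") { found = 1 }
    found && /<value>/ {
      sub(/.*<value>/, "")
      sub(/<\/value>.*/, "")
      print
      exit
    }
  ' "$1"
}

# Example (the path is an assumption):
#   site_property cdap-sandbox-5.1.2/conf/cdap-site.xml app.program.spark.compat
```

Restart the sandbox after editing the file so that the new value is picked up.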

Building and Running CDAP Applications

See Building and Running CDAP Applications for information on accessing the CDAP CLI and CDAP Local Sandbox bin utilities, building examples, starting CDAP, and deploying, starting, and stopping applications.