🔗Virtual Machine Image

To use the Virtual Machine image:

The Standalone CDAP Virtual Machine is configured with the recommended settings for Standalone CDAP:

  • 4 GB of RAM
  • Ubuntu Desktop Linux
  • 40 GB of virtual disk space

All the software that you need to run and develop CDAP applications is pre-installed:

  • The Standalone CDAP SDK is installed under /opt/cdap/sdk and will automatically start when the virtual machine starts.
  • A Java JDK is installed.
  • Maven is installed and configured to work for CDAP.
  • Both IntelliJ IDEA and Eclipse IDE are installed and available through desktop links once the virtual machine has started.
  • Links on the desktop are provided to the CDAP SDK, CDAP UI, CDAP Examples, and CDAP documentation.
  • The Chromium web browser is included. The default page for the CDAP UI, available through a desktop link, is http://localhost:11011/.

No password is required to enter the virtual machine; however, should you need to install or remove software, the admin user and password are both cdap.

🔗Development Environment Setup

🔗Creating an Application

When writing a CDAP application, it's best to use an integrated development environment (IDE) that understands the application interface and provides code completion when you write interface methods.

The best way to start developing a CDAP application is by using the Maven archetype:

$ mvn archetype:generate \
    -DarchetypeGroupId=co.cask.cdap \
    -DarchetypeArtifactId=cdap-app-archetype \
    -DarchetypeVersion=4.1.1 \
    -DartifactId=myExampleApp \
    -DgroupId=org.example.app

This creates a Maven project with all required dependencies, Maven plugins, and a simple application template for the development of your application (myExampleApp). You can import this Maven project into your preferred IDE—such as IntelliJ or Eclipse—and start developing your first CDAP application.

For an application that contains a MapReduce program, set the archetypeArtifactId to cdap-mapreduce-archetype; for Spark, use either cdap-spark-java-archetype or cdap-spark-scala-archetype.

Note: Replace the artifactId (myExampleApp) and groupId (org.example.app) parameters with your own app name and organization. Do not set the groupId to co.cask.cdap.
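The choice of archetype and the groupId restriction described above can be captured in a small wrapper. This is only a sketch: the function names (archetype_for, archetype_command) and the short program-type keywords are made up for illustration and are not part of CDAP or Maven. It prints the command rather than running it, so you can review it first.

```shell
#!/usr/bin/env bash
# Sketch: build the archetype command for a given program type.
# Function names and the type keywords are hypothetical helpers.

CDAP_VERSION="4.1.1"

# Map a short program-type keyword to the archetype artifact named in the text.
archetype_for() {
  case "$1" in
    app)         echo "cdap-app-archetype" ;;
    mapreduce)   echo "cdap-mapreduce-archetype" ;;
    spark-java)  echo "cdap-spark-java-archetype" ;;
    spark-scala) echo "cdap-spark-scala-archetype" ;;
    *)           echo "unknown program type: $1" >&2; return 1 ;;
  esac
}

# Print (not run) the mvn command; refuse the reserved groupId.
archetype_command() {
  local program_type="$1" group_id="$2" artifact_id="$3" archetype
  if [ "$group_id" = "co.cask.cdap" ]; then
    echo "groupId must not be co.cask.cdap" >&2
    return 1
  fi
  archetype="$(archetype_for "$program_type")" || return 1
  printf 'mvn archetype:generate -DarchetypeGroupId=co.cask.cdap -DarchetypeArtifactId=%s -DarchetypeVersion=%s -DartifactId=%s -DgroupId=%s\n' \
    "$archetype" "$CDAP_VERSION" "$artifact_id" "$group_id"
}

# Example: show the command for a MapReduce project.
archetype_command mapreduce org.example.app myExampleApp
```

To run the generated command instead of printing it, you could pass the output to your shell once you have checked it.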

Complete examples for each archetype:

$ mvn archetype:generate -DarchetypeGroupId=co.cask.cdap -DarchetypeArtifactId=cdap-app-archetype -DarchetypeVersion=4.1.1
$ mvn archetype:generate -DarchetypeGroupId=co.cask.cdap -DarchetypeArtifactId=cdap-mapreduce-archetype -DarchetypeVersion=4.1.1
$ mvn archetype:generate -DarchetypeGroupId=co.cask.cdap -DarchetypeArtifactId=cdap-spark-java-archetype -DarchetypeVersion=4.1.1
$ mvn archetype:generate -DarchetypeGroupId=co.cask.cdap -DarchetypeArtifactId=cdap-spark-scala-archetype -DarchetypeVersion=4.1.1

When prompted, enter values for the groupId and artifactId parameters. For the groupId, enter your own organization; do not use co.cask.cdap. (For the version and package parameters, you can either specify values or accept the Maven defaults.)

Maven supplies a guide to the naming convention used above at https://maven.apache.org/guides/mini/guide-naming-conventions.html.

🔗Using IntelliJ

  1. Open IntelliJ and import the Maven project in one of two ways:
    • From the initial IntelliJ dialog, click Import Project; or
    • If a project is already open, go to the menu item File -> Open...
  2. Navigate to and select the pom.xml in the Maven project's directory.
  3. In the Import Project from Maven dialog, select the Import Maven projects automatically and Automatically download: Sources, Documentation boxes.
  4. Click Next, complete the remaining dialogs, and the new CDAP project will be created and opened.

🔗Using Eclipse

  1. In your Eclipse installation, make sure you have the m2eclipse plugin installed.
  2. Go to the menu item File -> Import.
  3. Enter maven in the Select an import source dialog to filter for Maven options.
  4. Select Existing Maven Projects as the import source.
  5. Browse for the Maven project's directory.
  6. Click Finish, and the new CDAP project will be imported and opened.

🔗Running CDAP from within an IDE

As CDAP is an open source project, you can download the source, import it into an IDE, then modify, build, and run CDAP.

To do so, follow these steps:

  1. Install the system prerequisites for CDAP development.
  2. Either clone the CDAP repo or download a ZIP of the source:
    • Clone the CDAP repository using $ git clone -b v4.1.1 https://github.com/caskdata/cdap.git
    • Download the source as a ZIP from GitHub and unpack the ZIP in a suitable location
  3. In your IDE, install the Scala plugin (for IntelliJ or Eclipse) as there is Scala code in the project.
  4. Open the CDAP project in the IDE as an existing project by finding and opening the cdap/pom.xml.
  5. Resolve dependencies: this can take quite a while, as there are numerous downloads required.
  6. Before starting CDAP, disable audit logs by changing the audit.enabled setting in cdap-default.xml to false. Otherwise, due to CDAP-5864, Kafka errors will appear in the logs.
  7. In the case of IntelliJ, you can create a run configuration to run Standalone CDAP:
    1. Select Run > Edit Configurations...
    2. Add a new "Application" run configuration.
    3. Set "Main class" to be co.cask.cdap.StandaloneMain.
    4. Set "VM options" to -Xmx1024m -XX:MaxPermSize=128m (for in-memory MapReduce jobs).
    5. Click "OK".
    6. You can now use this run configuration to start an instance of Standalone CDAP.

With this run configuration, you can start CDAP and access it either from the command line interface (CLI) or through the HTTP RESTful API. To start the CLI, either run the cdap script from a shell or run the CLIMain class from the IDE.

If you want to run and develop the UI, you will need to follow additional instructions in the CDAP UI README.
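Step 6 above (setting audit.enabled to false) can also be scripted. The sketch below assumes that cdap-default.xml uses the Hadoop-style layout of property, name, and value elements; that layout is an assumption here, so check it against your checkout. To stay harmless, the example works on a throwaway sample file, and set_property is a hypothetical helper, not a CDAP tool.

```shell
#!/usr/bin/env bash
# Sketch: flip one property in a Hadoop-style configuration file.
# The sample file stands in for cdap-default.xml (layout is assumed).

conf="$(mktemp)"
cat > "$conf" <<'EOF'
<configuration>
  <property>
    <name>audit.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>other.setting</name>
    <value>true</value>
  </property>
</configuration>
EOF

# usage: set_property FILE NAME VALUE
# Finds the <name> element for NAME, then rewrites the next <value> element.
set_property() {
  awk -v name="$2" -v value="$3" '
    index($0, "<name>" name "</name>") { found = 1 }
    found && /<value>.*<\/value>/ {
      sub(/<value>.*<\/value>/, "<value>" value "</value>")
      found = 0
    }
    { print }
  ' "$1" > "$1.tmp" && mv "$1.tmp" "$1"
}

set_property "$conf" audit.enabled false
cat "$conf"
```

Only the value that follows the matching name is changed; other properties, such as the made-up other.setting, are left alone. Point the function at your real file only after making a backup.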

🔗Starting and Stopping Standalone CDAP

Use the cdap script (located in /opt/cdap/sdk/bin) to start and stop Standalone CDAP:

$ /opt/cdap/sdk/bin/cdap sdk start
  . . .
$ /opt/cdap/sdk/bin/cdap sdk stop

Note that you do not need to start CDAP yourself if you use the Virtual Machine, as it starts Standalone CDAP automatically when it boots.

Once CDAP has started successfully, the CDAP UI is available in a web browser at http://localhost:11011/, where you can deploy example applications and interact with CDAP.
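If you start CDAP from a script, it can help to wait until the UI answers before continuing. The following is a minimal sketch, assuming curl is installed; wait_for_ui is a hypothetical helper and not part of the cdap script, and the retry count and delay are arbitrary defaults.

```shell
#!/usr/bin/env bash
# Sketch: poll a URL until it responds or the attempts run out.
# wait_for_ui is a hypothetical helper; defaults are illustrative.

# usage: wait_for_ui [URL] [ATTEMPTS] [DELAY_SECONDS]
wait_for_ui() {
  local url="${1:-http://localhost:11011/}"
  local attempts="${2:-30}"
  local delay="${3:-2}"
  local i=0
  while [ "$i" -lt "$attempts" ]; do
    # --fail makes curl return non-zero on HTTP errors (status 400 and above).
    if curl --silent --fail --max-time 5 --output /dev/null "$url"; then
      echo "CDAP UI is responding at $url"
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  echo "No response from $url after $attempts attempts" >&2
  return 1
}

# Example (uncomment on a machine where the SDK is installed):
# /opt/cdap/sdk/bin/cdap sdk start && wait_for_ui
```

The function returns a non-zero status when the UI never responds, so a calling script can stop instead of proceeding against a CDAP instance that is not yet up.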

🔗Building and Running CDAP Applications

See Building and Running CDAP Applications for information on accessing the CDAP CLI and CDAP SDK bin utilities, building examples, starting CDAP, and deploying, starting, and stopping applications.