🔗Virtual Machine Image
To use the Virtual Machine image:
- Download and install either Oracle VirtualBox or VMware Player in your environment.
- Download the CDAP Standalone Virtual Machine (Standalone VM) at http://cask.co/downloads/#cdap.
- Import the downloaded .ova file into either VirtualBox or VMware Player.
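If you use VirtualBox, the import step can also be done from a terminal. The following is a minimal sketch, assuming the VBoxManage tool that ships with VirtualBox is on your PATH. The function name import_cdap_vm is hypothetical, and the path to the .ova file is whatever you downloaded.

```shell
#!/bin/sh
# Sketch: import a downloaded .ova appliance into VirtualBox.
# import_cdap_vm is a hypothetical helper name, not part of CDAP.
import_cdap_vm() {
  ova="$1"
  # Reject anything that is not an .ova file before calling VirtualBox.
  case "$ova" in
    *.ova) ;;
    *) echo "expected a .ova file, got: $ova" >&2; return 2 ;;
  esac
  if [ ! -f "$ova" ]; then
    echo "file not found: $ova" >&2
    return 1
  fi
  # VBoxManage import reads the appliance and registers the VM.
  VBoxManage import "$ova"
}
```

Call it with the path of the downloaded file as its only argument. It returns a non-zero status without touching VirtualBox if the file is missing or has the wrong extension.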
The CDAP Standalone Virtual Machine is configured with the recommended settings for Standalone CDAP:
- 4 GB of RAM
- Ubuntu Desktop Linux
- 40 GB of virtual disk space
All the software that you need to run and develop CDAP applications is pre-installed:
- The Standalone CDAP SDK is installed under /opt/cdap/sdk and will automatically start when the virtual machine starts.
- A Java JDK is installed.
- Maven is installed and configured to work for CDAP.
- Both IntelliJ IDEA and Eclipse IDE are installed and available through desktop links once the virtual machine has started.
- Links on the desktop are provided to the CDAP SDK, CDAP UI, and CDAP documentation.
- The Chromium web browser is included. The default page for the CDAP UI, available through a desktop link, is http://localhost:11011/.
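To confirm from a terminal that the tools listed above are actually available, a small check such as the following may help. This is only a sketch: missing_tools is a hypothetical helper name, and it only tests whether each command can be found on the PATH, not which version is installed.

```shell
#!/bin/sh
# Sketch: report which of the given commands are missing from PATH.
# missing_tools is a hypothetical helper name, not part of CDAP.
missing_tools() {
  missing=""
  for tool in "$@"; do
    # command -v succeeds only if the shell can resolve the name.
    if ! command -v "$tool" >/dev/null 2>&1; then
      missing="$missing $tool"
    fi
  done
  # Print the missing names (if any) without the leading space.
  echo "${missing# }"
  # Succeed only when nothing is missing.
  [ -z "$missing" ]
}
```

Inside the virtual machine, calling missing_tools java mvn would be expected to print an empty line and return success, since both are pre-installed.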
No password is required to enter the virtual machine; however, should you need to install or
remove software, the admin user and password are both
🔗Development Environment Setup
🔗Creating an Application
When writing a CDAP application, it's best to use an integrated development environment (IDE) that understands the application interface and provides code-completion in writing interface methods.
The best way to start developing a CDAP application is by using the Maven archetype:
$ mvn archetype:generate \
    -DarchetypeGroupId=co.cask.cdap \
    -DarchetypeArtifactId=cdap-app-archetype \
    -DarchetypeVersion=4.0.0
This creates a Maven project with all required dependencies, Maven plugins, and a simple application template for the development of your application. You can import this Maven project into your preferred IDE—such as IntelliJ or Eclipse—and start developing your first CDAP application.
For an application that contains a MapReduce program, set the archetypeArtifactId to cdap-mapreduce-archetype; for Spark, use either cdap-spark-java-archetype or cdap-spark-scala-archetype.
Note: Replace the groupId parameter (org.example.app) with your own organization, but it must not be replaced with
Complete examples for each archetype:
$ mvn archetype:generate -DarchetypeGroupId=co.cask.cdap -DarchetypeArtifactId=cdap-app-archetype -DarchetypeVersion=4.0.0 -DgroupId=org.example.app
$ mvn archetype:generate -DarchetypeGroupId=co.cask.cdap -DarchetypeArtifactId=cdap-mapreduce-archetype -DarchetypeVersion=4.0.0 -DgroupId=org.example.app
$ mvn archetype:generate -DarchetypeGroupId=co.cask.cdap -DarchetypeArtifactId=cdap-spark-java-archetype -DarchetypeVersion=4.0.0 -DgroupId=org.example.app
$ mvn archetype:generate -DarchetypeGroupId=co.cask.cdap -DarchetypeArtifactId=cdap-spark-scala-archetype -DarchetypeVersion=4.0.0 -DgroupId=org.example.app
Maven supplies a guide to the naming convention used above at https://maven.apache.org/guides/mini/guide-naming-conventions.html.
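Because the four commands above differ only in the archetype name, they can be generated from one template. The following sketch prints the command for a given archetype and groupId. The helper name archetype_command and the short kind names (app, mapreduce, spark-java, spark-scala) are assumptions made for illustration; the Maven properties and version are the ones shown in the examples.

```shell
#!/bin/sh
# Sketch: print the mvn archetype:generate command for one CDAP archetype.
# archetype_command is a hypothetical helper; the version and archetype
# names are taken from the examples in this section.
CDAP_VERSION="4.0.0"

archetype_command() {
  kind="$1"      # app, mapreduce, spark-java, or spark-scala
  group_id="$2"  # your own organization's groupId
  case "$kind" in
    app|mapreduce|spark-java|spark-scala) ;;
    *) echo "unknown archetype kind: $kind" >&2; return 1 ;;
  esac
  # echo joins its arguments with single spaces, giving a one-line command.
  echo "mvn archetype:generate" \
    "-DarchetypeGroupId=co.cask.cdap" \
    "-DarchetypeArtifactId=cdap-${kind}-archetype" \
    "-DarchetypeVersion=${CDAP_VERSION}" \
    "-DgroupId=${group_id}"
}
```

The function only prints the command, so you can review it before pasting it into a shell.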
- Open IntelliJ and import the Maven project.
- Go to menu File -> Import Project...
- Select the pom.xml in the Maven project's directory.
- Select the Import Maven projects automatically and Automatically download: Sources, Documentation boxes in the Import Project from Maven dialog.
- Click Next, complete the remaining dialogs, and the new CDAP project will be created and opened.
- In your Eclipse installation, make sure you have the m2eclipse plugin installed.
- Go to menu File -> Import
- Enter maven in the Select an import source dialog to filter for Maven options.
- Select Existing Maven Projects as the import source.
- Browse for the Maven project's directory.
- Click Finish, and the new CDAP project will be imported, created and opened.
🔗Running CDAP from within an IDE
As CDAP is an open source project, you can download the source, import it into an IDE, then modify, build, and run CDAP.
To do so, follow these steps:
- Install all the prerequisite system requirements for CDAP development.
- Either clone the CDAP repo or download a ZIP of the source:
- Clone the CDAP repository using
$ git clone -b v4.0.0 https://github.com/caskdata/cdap.git
- Download the source as a ZIP from GitHub and unpack the ZIP in a suitable location
- In your IDE, install the Scala plugin (for IntelliJ or Eclipse) as there is Scala code in the project.
- Open the CDAP project in the IDE as an existing project by finding and opening the
- Resolve dependencies: this can take quite a while, as there are numerous downloads required.
- Before starting CDAP, disable audit logs by changing the
false. Otherwise, due to CDAP-5864, Kafka errors will appear in the logs.
- In the case of IntelliJ, you can create a run configuration to run CDAP Standalone:
- Go to menu Run > Edit Configurations...
- Add a new "Application" run configuration.
- Set "Main class" to be
- Set "VM options" to -Xmx1024m -XX:MaxPermSize=128m (for in-memory MapReduce jobs).
- Click "OK".
- You can now use this run configuration to start an instance of CDAP Standalone.
This will allow you to start CDAP and access it either from the command line (CLI) or through the HTTP RESTful API. To start the CLI, you can either start it from a shell using the cdap script or run the CLIMain class from the IDE.
If you want to run and develop the UI, you will need to follow additional instructions in the CDAP UI README.
🔗Starting and Stopping Standalone CDAP
Use the cdap script (located in /opt/cdap/sdk/bin) to start and stop the Standalone CDAP:
$ /opt/cdap/sdk/bin/cdap sdk start
. . .
$ /opt/cdap/sdk/bin/cdap sdk stop
Note that starting CDAP is not necessary if you use the Virtual Machine, as it starts the Standalone CDAP automatically on startup.
Once CDAP is started successfully, in a web browser you will be able to see the CDAP UI running at http://localhost:11011/, where you can deploy example applications and interact with CDAP.
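Startup can take a little while, so a script that starts CDAP may want to wait until the UI answers before continuing. The following is a rough sketch of such a wait loop, assuming curl is installed. The helper names probe_url and wait_for_ui, the default of 30 attempts, and the 2-second delay are all illustrative choices, not CDAP settings.

```shell
#!/bin/sh
# Sketch: poll the CDAP UI until it answers or the attempts run out.
# probe_url and wait_for_ui are hypothetical helper names; curl is assumed
# to be installed.
probe_url() {
  # -s silences progress, -f fails on HTTP errors, -o discards the body.
  curl -sf -o /dev/null "$1"
}

wait_for_ui() {
  url="${1:-http://localhost:11011/}"
  attempts="${2:-30}"
  delay="${3:-2}"
  i=0
  while [ "$i" -lt "$attempts" ]; do
    if probe_url "$url"; then
      echo "CDAP UI is up at $url"
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  echo "CDAP UI did not respond at $url" >&2
  return 1
}
```

A typical use would be to run the start command shown above and then call wait_for_ui with no arguments, which checks http://localhost:11011/ using the default attempt count and delay.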