Cask Data Application Platform

What is the Cask Data Application Platform?

Cask Data Application Platform (CDAP) is the industry’s first Big Data Application Server for Hadoop. It abstracts the complexities of the Hadoop ecosystem and integrates its components (YARN, MapReduce, ZooKeeper, HBase, etc.), enabling developers to build, test, deploy, and manage Big Data applications without having to worry about infrastructure, interoperability, or the complexities of distributed systems.

What is available in the CDAP SDK?

The CDAP SDK comes with:

  • Java and RESTful APIs to build CDAP applications;
  • Standalone CDAP to run the entire CDAP stack in a single Java virtual machine; and
  • Example CDAP applications.

Why should I use Cask Data Application Platform for developing Big Data Applications?

CDAP helps developers quickly develop, test, debug, and deploy Big Data applications. Developers can build and test applications on a laptop, with no need for a distributed environment, and then deploy them to a distributed cluster with the push of a button. The advantages of using CDAP include:

  1. Integrated Framework: CDAP provides an integrated platform that makes it easy to create all the functions of Big Data applications: collecting, processing, storing, and querying data. Data can be collected and stored in both structured and unstructured forms, processed in real time and in batch, and results can be made available for retrieval, visualization, and further analysis.
  2. Simple APIs: CDAP aims to reduce the time it takes to create and implement applications by hiding the complexity of the underlying distributed technologies behind a set of powerful yet simple APIs. You don’t need to be an expert on scalable, highly available system architectures, nor do you need to worry about the low-level Hadoop and HBase APIs.
  3. Full Development Lifecycle Support: CDAP supports developers through the entire application development lifecycle: development, debugging, testing, continuous integration and production. Using familiar development tools like Eclipse and IntelliJ, you can build, test and debug your application right on your laptop with a Standalone CDAP. Utilize the application unit test framework for continuous integration.
  4. Easy Application Operations: Once your Big Data application is in production, CDAP is designed specifically to monitor your applications and scale with your data processing needs: increase capacity with a click of a button without taking your application offline. Use the CDAP UI or RESTful APIs to monitor and manage the lifecycle and scale of your application.

Platforms and Language

What Platforms are Supported by the Cask Data Application Platform SDK?

The CDAP SDK can be run on Mac OS X, Linux or Windows platforms.

What programming languages are supported by CDAP?

CDAP currently supports Java for developing applications.

What Version of Java SDK is Required by CDAP?

The latest release of JDK or JRE version 7 or version 8 must be installed in your environment; we recommend the Oracle JDK.
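
To confirm that the Java runtime on your machine falls within that range, a small check along the following lines may help. This is only a sketch and not part of CDAP: the class name JavaVersionCheck and its methods are hypothetical, and it assumes the standard java.version system property, which reports Java 7 and 8 as "1.7.x" and "1.8.x".

```java
// Hypothetical helper, not part of the CDAP SDK: checks whether a Java
// version string corresponds to Java 7 or Java 8.
class JavaVersionCheck {

    // Parses the major version from a "java.version"-style string such as
    // "1.7.0_80", "1.8.0_45", or "11.0.2".
    static int majorVersion(String version) {
        String[] parts = version.split("[._\\-+]");
        int first = Integer.parseInt(parts[0]);
        // Releases before Java 9 are reported as "1.x", so the second
        // component carries the major version.
        if (first == 1 && parts.length > 1) {
            return Integer.parseInt(parts[1]);
        }
        return first;
    }

    // True only for Java 7 or Java 8, the versions named in this FAQ.
    static boolean isSupported(String version) {
        int major = majorVersion(version);
        return major == 7 || major == 8;
    }

    public static void main(String[] args) {
        String version = System.getProperty("java.version");
        String verdict = isSupported(version) ? "is" : "is not";
        System.out.println("Java " + version + " " + verdict + " version 7 or 8");
    }
}
```

Compile and run it with the same java command you intend to use for CDAP, so that the check reflects the runtime CDAP will actually see.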

What Version of Node.js is Required by CDAP?

Node.js must be a version from v0.10.* through v0.12.*; we recommend v0.12.0.
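
The range can be read as "major version 0, minor version 10, 11, or 12, any patch level". As an illustration only, the sketch below applies that reading to a version string in the form printed by node --version (for example, "v0.12.0"). The class name NodeVersionCheck is hypothetical and is not part of CDAP.

```java
// Hypothetical helper, not part of the CDAP SDK: checks whether a Node.js
// version string lies in the range v0.10.* through v0.12.*.
class NodeVersionCheck {

    // Accepts strings such as "v0.12.0" or "0.10.38"; returns false for
    // anything that cannot be parsed as major.minor[.patch].
    static boolean isSupported(String version) {
        String v = version.trim();
        if (v.startsWith("v")) {
            v = v.substring(1);
        }
        String[] parts = v.split("\\.");
        if (parts.length < 2) {
            return false;
        }
        try {
            int major = Integer.parseInt(parts[0]);
            int minor = Integer.parseInt(parts[1]);
            // Any patch level is accepted within minor versions 10 to 12.
            return major == 0 && minor >= 10 && minor <= 12;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Pass the output of `node --version` as the first argument.
        String version = args.length > 0 ? args[0] : "v0.12.0";
        String verdict = isSupported(version) ? "is" : "is not";
        System.out.println(version + " " + verdict + " within v0.10.* through v0.12.*");
    }
}
```

If no argument is given, the sketch falls back to the recommended v0.12.0 purely as a placeholder.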


I have a Hadoop cluster in my data center. Can I run CDAP on it?

Yes. You can install CDAP on your Hadoop cluster. See Installation and Configuration.

What Hadoop distributions can CDAP run on?

CDAP 3.1.2 has been tested on and supports CDH 5.0.0 through 5.4.4; HDP 2.0, 2.1, and 2.2; MapR 4.1; and Apache Bigtop 0.8.0.

Issues, User Groups, Mailing Lists, and IRC Channel

I’ve found a bug in CDAP. How do I file an issue?

We have a JIRA for filing issues.

What User Groups and Mailing Lists are available about CDAP?

The cdap-user mailing list is primarily for users using the product to develop applications. You can expect questions from users, release announcements, and any other discussions that we think will be helpful to the users.

The cdap-dev mailing list is essentially for developers actively working on the product, and should be used for all our design, architecture and technical discussions moving forward. This mailing list will also receive all JIRA and GitHub notifications.


CDAP IRC Channel: #cdap on irc.freenode.net.