CDAP Developers’ Manual
- Getting Started Developing: A quick, hands-on introduction to developing with CDAP, which guides you through
installing the CDAP SDK, setting up your development environment, starting and stopping CDAP,
and building and running example applications.
- Overview: Covers the overall architecture and technology behind CDAP, including
the abstraction of Data and Applications, CDAP modes and components, and the anatomy
of a Big Data application.
- Building Blocks: Covers the two core abstractions in the Cask Data
Application Platform: Data and Applications. Data abstractions include streams,
datasets, and views. Application abstractions include flows and flowlets, MapReduce, Spark,
workers, workflows, schedules, and services. Details are provided on working with these abstractions to
build Big Data applications.
- Security: CDAP supports securing clusters using perimeter security. Configuration
and client authentication are covered in this section.
- Testing and Debugging: CDAP provides a test framework that you can use with your applications,
plus tools and practices for debugging an application prior to deployment.
- Ingesting Data: CDAP comes with tools for ingesting data into CDAP
using Java, Python, and Ruby APIs,
along with an Apache Flume Sink implementation.
- Data Exploration: Data in CDAP can be explored with ad-hoc SQL-like queries, without writing any code.
Exploration of streams and datasets, along with integration with business intelligence tools, is covered in this section.
- Advanced Topics: Covers advanced topics on CDAP that will be of interest to
developers who want a deeper dive into CDAP, including suggested best practices for
CDAP development, class loading in CDAP, and adding a custom logback to a