CDAP Developers’ Manual

  • Getting Started Developing: A quick, hands-on introduction to developing with CDAP, which guides you through installing the CDAP SDK, setting up your development environment, starting and stopping CDAP, and building and running example applications.
  • Overview: Covers the overall architecture and technology behind CDAP, including the abstraction of Data and Applications, CDAP concepts, components and their interactions, and the anatomy of a Big Data application.
  • Building Blocks: Covers the two core abstractions in the Cask Data Application Platform: **Data** and **Applications**. Data abstractions include Streams and Datasets. Application abstractions include Flows, MapReduce, Spark, Workflows, and Services. Details are provided on working with these abstractions to build Big Data applications.
  • Security: CDAP supports securing clusters using perimeter security. Configuration and client authentication are covered in this section.
  • Testing and Debugging: CDAP has a test framework that developers can use with their applications, plus tools and practices for debugging an application prior to deployment.
  • Ingesting Data: CDAP comes with a number of tools to make a developer’s life easier. These tools help with ingesting data into CDAP using Java, Python, and Ruby APIs, and include an Apache Flume Sink implementation.
  • Data Exploration: Data in CDAP can be explored without writing any code through the use of ad-hoc SQL-like queries. This section covers exploration of streams and datasets, along with integration with business intelligence tools.
  • Advanced Topics: Covers topics of interest to developers who want a deeper dive into CDAP, with presentations on suggested best practices for CDAP development and on adding a custom logback to a CDAP application.