CDAP Developers’ Manual

  • Getting Started Developing: A quick, hands-on introduction to developing with CDAP, which guides you through installing the CDAP SDK, setting up your development environment, starting and stopping CDAP, and building and running example applications.
  • Overview: Covers the overall architecture and technology behind CDAP, including the abstraction of Data and Applications, CDAP modes and components, and the anatomy of a Big Data application.
  • Building Blocks: This section covers the two core abstractions in the Cask Data Application Platform: Data and Applications. Data abstractions include streams, datasets, and views. Application abstraction is accomplished using flows and flowlets, MapReduce, Spark, workers, workflows, schedules, and services. Details are provided on working with these abstractions to build Big Data applications.
  • Metadata: A CDAP capability that automatically captures metadata and lets you see how data is flowing into and out of datasets, streams, and stream views. Audit logging provides a chronological ledger containing evidence of operations or changes on CDAP entities.
  • Pipelines: A capability of CDAP that combines a user interface with back-end services to enable the building, deploying, and managing of data pipelines.
  • Security: CDAP supports securing clusters using perimeter security. Configuration and client authentication are covered in this section.
  • Testing and Debugging: CDAP has a test framework that developers can use with their applications, plus tools and practices for debugging an application prior to deployment.
  • Ingesting Data: CDAP comes with a number of tools to make a developer’s life easier. These tools help with ingesting data into CDAP using Java, Python, and Ruby APIs, and include an Apache Flume Sink implementation.
  • Data Exploration: Data in CDAP can be explored without writing any code through the use of ad-hoc SQL-like queries. Exploration of streams and datasets and integration with business intelligence tools are covered in this section.
  • Advanced Topics: Covers topics of interest to developers who want a deeper dive into CDAP, including adding a custom logback to a CDAP application, suggested best practices for CDAP development, class loading in CDAP, and configuring the program resources and program retry policies of a CDAP application.