CDAP Administration Manual
Covers putting CDAP into production, with components, system requirements, deployment architectures, Hadoop compatibility, installation, configuration, security setup, and operations. Appendices describe the XML files used to configure the CDAP installation and its security configuration.
- Deployment Architectures: Minimal and high availability, highly scalable deployments.
- Hadoop Compatibility: The Hadoop/HBase environment that CDAP requires.
- System Requirements: Hardware, memory, core, and network requirements, software prerequisites, and using CDAP with firewalls.
- Installation: Installation and configuration instructions for either specific distributions using a distribution manager or generic Apache Hadoop clusters using RPM or Debian package managers:
- Cloudera Manager: Installing on CDH (Cloudera Distribution of Apache Hadoop) clusters managed with Cloudera Manager.
- Manual Installation using Packages: Installing on generic Apache Hadoop clusters, CDH (Cloudera Distribution of Apache Hadoop) clusters not managed with Cloudera Manager, or HDP (Hortonworks Data Platform) clusters not managed with Apache Ambari.
- Replication: Covers the replication of CDAP clusters from a master to one or more slave clusters.
- Verification: How to verify the CDAP installation on your Hadoop cluster by using an example application and health checks.
- Upgrading: Instructions for upgrading both CDAP and its underlying Hadoop distribution.
- Security: CDAP supports securing clusters using perimeter security, authorization, impersonation, SSL for system services, and secure storage. This section describes enabling, configuring, and testing security. It also provides example configuration files.
- Logging and Monitoring: CDAP collects logs for all of its internal services and user applications; at the same time, CDAP can be monitored through external systems. Covers log location, logging messages, the Logback configuration for system services and user applications, and CDAP support for logging through the standard SLF4J (Simple Logging Facade for Java) APIs and Logback.
- Metrics: CDAP collects metrics about an application’s behavior and performance.
- Preferences and Runtime Arguments: Flows, MapReduce and Spark programs, services, workers, and workflows can receive runtime arguments.
- Scaling Instances: Covers querying and setting the number of instances of flowlets, services, and workers.
- Resource Guarantees: Providing resource guarantees for CDAP programs in YARN.
- Transaction Service Maintenance: Periodic maintenance of the Transaction Service.
- CDAP UI: The CDAP UI is available for deploying, querying, and managing CDAP.
- Appendix: Minimal cdap-site.xml: Minimal required configuration for a CDAP installation.
- Appendix: cdap-site.xml: Default properties for a CDAP installation.
- Appendix: cdap-security.xml: Default security properties for a CDAP installation.
- Appendix: HBaseDDLExecutor: Example implementation and description for replication.