Hadoop Compatibility

Before installing the CDAP components, you must first install (or have access to) a Hadoop cluster with HBase, HDFS, YARN, and ZooKeeper. Hive and Spark are optional components: Hive is required to enable CDAP's ad-hoc querying capabilities (CDAP Explore), and Spark is required if a CDAP application includes a Spark program.

All CDAP components can be installed on the same boxes as your Hadoop cluster, or on separate boxes that can connect to the Hadoop services.

CDAP depends on these services being present on the cluster. There are core dependencies, which must be running for CDAP system services to operate correctly, and optional dependencies, which may be required for certain functionality or program types.

The host(s) running the CDAP Master service must have the HBase, HDFS, and YARN clients installed: CDAP uses their command-line clients for initialization and reads their connectivity information to reach external service dependencies. If Hadoop system services already run on the same hosts as the CDAP services, these clients will already be installed.
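As a quick pre-installation check, the client requirement above can be sketched in Python. This is only an illustration, not part of CDAP: it assumes the clients are exposed on the PATH under the command names `hbase`, `hdfs`, and `yarn`, and the function name `find_missing_clients` is hypothetical.

```python
import shutil

# Assumption: the HBase, HDFS, and YARN clients are on the PATH under these names.
REQUIRED_CLIENTS = ("hbase", "hdfs", "yarn")


def find_missing_clients(clients=REQUIRED_CLIENTS, which=shutil.which):
    """Return the client commands that cannot be found on the PATH."""
    return [name for name in clients if which(name) is None]


if __name__ == "__main__":
    missing = find_missing_clients()
    if missing:
        print("Missing Hadoop clients: " + ", ".join(missing))
    else:
        print("HBase, HDFS, and YARN clients found on PATH")
```

Running the script on a prospective CDAP Master host lists any client command that could not be located. Finding a command on the PATH does not prove that it is configured to reach the cluster.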

Core Dependencies

  • HBase: For system runtime storage and queues
  • HDFS: The backing file system for distributed storage
  • YARN: For running system services in containers on cluster NodeManagers
  • MapReduce2: For batch operations in workflows and data exploration (included with YARN)
  • ZooKeeper: For service discovery and leader election

Optional Dependencies

  • Hive: For data exploration using SQL queries via the CDAP Explore system service
  • Spark: For running Spark programs within CDAP applications

Hadoop/HBase Environment

For a Distributed CDAP cluster, version 4.1.1, you must install these Hadoop components (see notes following the tables):

Component            Source                                        Supported Versions
Hadoop               various                                       2.0 and higher
HBase                Apache                                        0.98.x and 1.2
                     Cloudera Distribution of Apache Hadoop (CDH)  5.1 through 5.11 (Note 4)
                     Hortonworks Data Platform (HDP)               2.0 through 2.6 (Note 4)
                     MapR                                          4.1 through 5.2 (with Apache HBase)
                     Amazon Hadoop (EMR)                           4.6 through 4.8 (with Apache HBase)
HDFS                 Apache Hadoop                                 2.0.2-alpha through 2.6
                     Cloudera Distribution of Apache Hadoop (CDH)  5.1 through 5.11 (Note 4)
                     Hortonworks Data Platform (HDP)               2.0 through 2.6 (Note 4)
                     MapR                                          4.1 through 5.2 (with MapR-FS)
                     Amazon Hadoop (EMR)                           4.6 through 4.8
YARN and MapReduce2  Apache Hadoop                                 2.0.2-alpha through 2.7
                     Cloudera Distribution of Apache Hadoop (CDH)  5.1 through 5.11 (Note 4)
                     Hortonworks Data Platform (HDP)               2.0 through 2.6 (Note 4)
                     MapR                                          4.1 through 5.2
                     Amazon Hadoop (EMR)                           4.6 through 4.8
ZooKeeper            Apache                                        Version 3.4.3 through 3.4
                     Cloudera Distribution of Apache Hadoop (CDH)  5.1 through 5.11 (Note 4)
                     Hortonworks Data Platform (HDP)               2.0 through 2.6 (Note 4)
                     MapR                                          4.1 through 5.2
                     Amazon Hadoop (EMR)                           4.6 through 4.8
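The distribution ranges listed for the required components can be checked programmatically. The sketch below is a hypothetical helper, not a CDAP tool. It assumes that the ranges are inclusive, that only the major and minor version numbers matter, and that version strings are purely numeric. The short keys `CDH`, `HDP`, `MapR`, and `EMR` are labels chosen for this example.

```python
# Ranges copied from the table above; treated as inclusive at major.minor level
# (an assumption of this sketch).
SUPPORTED_DISTRIBUTIONS = {
    "CDH": ((5, 1), (5, 11)),
    "HDP": ((2, 0), (2, 6)),
    "MapR": ((4, 1), (5, 2)),
    "EMR": ((4, 6), (4, 8)),
}


def parse_major_minor(version):
    """Turn a numeric version string such as '5.11.2' into the tuple (5, 11)."""
    parts = version.split(".")
    return int(parts[0]), int(parts[1])


def is_supported(distribution, version):
    """Check whether a distribution version falls within the listed range."""
    low, high = SUPPORTED_DISTRIBUTIONS[distribution]
    return low <= parse_major_minor(version) <= high
```

Comparing integer tuples rather than strings matters here: as text, "5.9" sorts after "5.11", whereas (5, 9) correctly falls between (5, 1) and (5, 11).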

For a Distributed CDAP cluster, version 4.1.1, you can (optionally) install these Hadoop components, as required:

Component            Source                                        Supported Versions
Hive                 Apache                                        Version 0.12.0 through 1.2.x
                     Cloudera Distribution of Apache Hadoop (CDH)  5.1 through 5.11 (Note 4)
                     Hortonworks Data Platform (HDP)               2.0 through 2.6 (Note 4)
                     MapR                                          4.1 through 5.2
                     Amazon Hadoop (EMR)                           4.6 through 4.8
Spark                Apache                                        Versions 1.2.x through 1.6.x
                     Cloudera Distribution of Apache Hadoop (CDH)  5.1 through 5.11 (Note 4)
                     Hortonworks Data Platform (HDP)               2.0 through 2.6 (Note 4)
                     MapR                                          4.1 through 5.2
                     Amazon Hadoop (EMR)                           4.6 through 4.8

Note 1: The component versions shown in these tables are those that we have tested and are confident are suitable and compatible. Later versions may work, but have not necessarily been tested or confirmed compatible.

Note 2: Certain CDAP components need to reference your Hadoop, YARN, HBase, and Hive cluster configurations by adding those configurations to their class paths.

Note 3: Hive 0.12 is not supported for secure cluster configurations.

Note 4: An upcoming release of CDAP (scheduled for CDAP 4.3) will drop support for all versions older than CDH 5.4.11 or HDP 2.5.0.0 due to an Apache Hadoop Privilege Escalation Vulnerability.
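To see whether an existing cluster would be affected by the change described in Note 4, the two thresholds can be compared against the installed distribution version. This is a minimal sketch under stated assumptions: the function and constant names are hypothetical, version strings are assumed to be purely numeric, and missing trailing components are treated as zero (so "2.5" is read as 2.5.0.0).

```python
# Oldest versions that Note 4 says will remain supported.
MINIMUM_AFTER_DROP = {
    "CDH": (5, 4, 11),
    "HDP": (2, 5, 0, 0),
}


def to_tuple(version):
    """Convert a numeric version string such as '5.4.11' to a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def affected_by_drop(distribution, version):
    """Return True if the version is older than the Note 4 threshold."""
    minimum = MINIMUM_AFTER_DROP.get(distribution)
    if minimum is None:
        # Note 4 names only CDH and HDP; other distributions are not covered.
        return False
    parsed = to_tuple(version)
    # Pad both tuples with zeros so that '2.5' compares equal to '2.5.0.0'.
    width = max(len(parsed), len(minimum))
    parsed += (0,) * (width - len(parsed))
    minimum += (0,) * (width - len(minimum))
    return parsed < minimum
```

For example, `affected_by_drop("CDH", "5.4.10")` evaluates to `True`, while `affected_by_drop("CDH", "5.4.11")` evaluates to `False`.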