Hortonworks Data Platform Certified Developer Exam Practice Test

Page: 1 / 14
Total 108 questions
Question 1

Which HDFS command copies an HDFS file named foo to the local filesystem as localFoo?



Answer : A
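
Whatever the lettered options were, the HDFS shell command that performs this copy takes one of the following equivalent forms (-get and -copyToLocal behave identically for a local destination):

hadoop fs -copyToLocal foo localFoo

hadoop fs -get foo localFoo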


Question 2

When can a reduce class also serve as a combiner without affecting the output of a MapReduce program?



Answer : A

You can use your reducer code as a combiner if the operation performed is commutative and associative.
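
As an illustrative sketch (class and type names are my own, not from the exam), a sum reducer relies only on addition, which is commutative and associative, so the same class can be registered as both the combiner and the reducer:

import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

// Sums the values for each key. Because addition is commutative and
// associative, running this class as a combiner on partial map output
// does not change the final reduce result.
public class SumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
    @Override
    protected void reduce(Text key, Iterable<IntWritable> values, Context context)
            throws IOException, InterruptedException {
        int sum = 0;
        for (IntWritable v : values) {
            sum += v.get();
        }
        context.write(key, new IntWritable(sum));
    }
}

It is wired into a job with job.setCombinerClass(SumReducer.class) alongside job.setReducerClass(SumReducer.class); the framework may run the combiner zero, one or many times, which is exactly why the operation must be commutative and associative.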


Question 3

Which best describes what the map method accepts and emits?



Answer : D

public class Mapper<KEYIN,VALUEIN,KEYOUT,VALUEOUT> extends Object

Maps input key/value pairs to a set of intermediate key/value pairs.

Maps are the individual tasks which transform input records into intermediate records. The transformed intermediate records need not be of the same type as the input records. A given input pair may map to zero or many output pairs.
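
A minimal word-count style mapper (names are illustrative) shows the contract described above: each call accepts one input key/value pair and may emit zero or many intermediate pairs, whose types need not match the input types:

import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

// Input: (byte offset, line of text). Output: (word, 1) for every token,
// so one input pair can produce zero or many intermediate pairs.
public class TokenMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
    private static final IntWritable ONE = new IntWritable(1);
    private final Text word = new Text();

    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        for (String token : value.toString().split("\\s+")) {
            if (!token.isEmpty()) {
                word.set(token);
                context.write(word, ONE);
            }
        }
    }
}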

Question 4

Workflows expressed in Oozie can contain:



Answer : A

An Oozie workflow is a collection of actions (i.e. Hadoop Map/Reduce jobs, Pig jobs) arranged in a control-dependency DAG (Directed Acyclic Graph) that specifies a sequence of actions to execute. This graph is specified in hPDL (an XML Process Definition Language).

hPDL is a fairly compact language, using a limited number of flow-control and action nodes. Control nodes define the flow of execution and include the beginning and end of a workflow (start, end and fail nodes) as well as mechanisms to control the workflow execution path (decision, fork and join nodes).

Note: Oozie is a Java web application that runs in a Java servlet container (Tomcat) and uses a database to store:

Workflow definitions

Currently running workflow instances, including instance states and variables
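
A minimal hPDL sketch (workflow, node and property names are illustrative) with one Map/Reduce action wired into the control-dependency DAG via start, kill and end control nodes:

<workflow-app name="demo-wf" xmlns="uri:oozie:workflow:0.4">
    <start to="mr-node"/>
    <action name="mr-node">
        <map-reduce>
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <configuration>
                <property>
                    <name>mapred.input.dir</name>
                    <value>${inputDir}</value>
                </property>
                <property>
                    <name>mapred.output.dir</name>
                    <value>${outputDir}</value>
                </property>
            </configuration>
        </map-reduce>
        <ok to="end"/>
        <error to="fail"/>
    </action>
    <kill name="fail">
        <message>Map/Reduce action failed: [${wf:errorMessage(wf:lastErrorNode())}]</message>
    </kill>
    <end name="end"/>
</workflow-app>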

Question 5

You need to move a file titled "weblogs" into HDFS. When you try to copy the file, you can't. You know you have ample space on your DataNodes. Which action should you take to relieve this situation and store more files in HDFS?



Answer : C


Question 6

A client application creates an HDFS file named foo.txt with a replication factor of 3. Which best describes the file access rules in HDFS if the file has a single block stored on DataNodes A, B and C?



Answer : D

HDFS keeps three copies of a block on three different DataNodes to protect against true data corruption. HDFS also tries to distribute these three replicas across more than one rack to protect against data availability issues. The fact that HDFS actively monitors any failed DataNode(s), and upon failure detection immediately schedules re-replication of blocks (if needed), implies that three copies of data on three different nodes are sufficient to avoid corrupted files.

Note:

HDFS is designed to reliably store very large files across machines in a large cluster. It stores each file as a sequence of blocks; all blocks in a file except the last block are the same size. The blocks of a file are replicated for fault tolerance. The block size and replication factor are configurable per file. An application can specify the number of replicas of a file. The replication factor can be specified at file creation time and can be changed later. Files in HDFS are write-once and have strictly one writer at any time. The NameNode makes all decisions regarding replication of blocks. HDFS uses a rack-aware replica placement policy. In the default configuration there are a total of 3 copies of a data block on HDFS; 2 copies are stored on DataNodes on the same rack and the 3rd copy on a different rack.
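
As a sketch of the per-file control described above (paths are illustrative), the replication factor can be set when a file is written and changed afterwards from the HDFS shell:

hadoop fs -D dfs.replication=3 -put weblogs /user/hadoop/weblogs

hadoop fs -setrep -w 2 /user/hadoop/weblogs

The -w flag makes -setrep wait until the new replication target is reached.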


Question 7

Which best defines a SequenceFile?



Answer : D

SequenceFile is a flat file consisting of binary key/value pairs.

There are 3 different SequenceFile formats:

Uncompressed key/value records.

Record compressed key/value records - only 'values' are compressed here.

Block compressed key/value records - both keys and values are collected in 'blocks' separately and compressed. The size of the 'block' is configurable.
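
An illustrative sketch (path and record types are assumptions) of writing such binary key/value pairs; the compression behavior corresponding to the three formats above is selected with an extra SequenceFile.Writer.compression(...) option (NONE, RECORD or BLOCK):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.io.Text;

public class SeqFileWriteDemo {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Path path = new Path("demo.seq");  // illustrative path
        // Appends flat binary key/value records; with no compression
        // option this produces the uncompressed format.
        try (SequenceFile.Writer writer = SequenceFile.createWriter(conf,
                SequenceFile.Writer.file(path),
                SequenceFile.Writer.keyClass(Text.class),
                SequenceFile.Writer.valueClass(IntWritable.class))) {
            writer.append(new Text("foo"), new IntWritable(1));
            writer.append(new Text("bar"), new IntWritable(2));
        }
    }
}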

