Hortonworks Data Platform Certified Developer Exam Practice Test

Page: 1 / 14
Total 108 questions

Question 1

Which HDFS command displays the contents of the file x in the user's HDFS home directory?

Answer : C

Question 2

What is a SequenceFile?

Answer : D

Question 3

Your cluster's HDFS block size is 64 MB. You have a directory containing 100 plain text files, each of which is 100 MB in size. The InputFormat for your job is TextInputFormat. How many Mappers will run?

Answer : C
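
The arithmetic behind Question 3 can be sketched as follows. This is a minimal illustration, not Hadoop code: it assumes the split size equals the 64 MB block size (default minimum and maximum split settings), that splits never span files, and that a final remainder is folded into the last split only when it is within a 1.1 slop factor. The function name count_splits is hypothetical.

```python
def count_splits(file_size, split_size, slop=1.1):
    """Count input splits for a single file.

    Assumption: full-size splits are carved off while the remaining
    bytes exceed slop * split_size; whatever is left forms one last split.
    """
    splits = 0
    remaining = file_size
    while remaining / split_size > slop:
        splits += 1
        remaining -= split_size
    if remaining > 0:
        splits += 1
    return splits


MB = 1024 * 1024
block_size = 64 * MB   # HDFS block size from the question
file_size = 100 * MB   # size of each text file
num_files = 100        # files in the input directory

splits_per_file = count_splits(file_size, block_size)
total_mappers = num_files * splits_per_file  # one map task per split

print(splits_per_file, total_mappers)
```

Under these assumptions each 100 MB file yields one 64 MB split and one 36 MB split, and one map task is launched per split.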

Question 4

You want to run Hadoop jobs on your development workstation for testing before you submit them to your production cluster. Which mode of operation in Hadoop allows you to most closely simulate a production cluster while using a single machine?

Answer : C

Question 5

Your client application submits a MapReduce job to your Hadoop cluster. Identify the Hadoop daemon on which the Hadoop framework will look for an available slot to schedule a MapReduce operation.

Answer : D

Question 6

You want to count the number of occurrences of each unique word in the supplied input data. You've decided to implement this by having your mapper tokenize each word and emit a literal value 1, and then have your reducer increment a counter for each literal 1 it receives. After successfully implementing this, it occurs to you that you could optimize this by specifying a combiner. Will you be able to reuse your existing Reducer as your combiner in this case, and why or why not?

Answer : A
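
The combiner question can be explored with a small simulation. This is a plain-Python sketch rather than Hadoop API code, and it assumes the reducer adds up the values it receives; the names mapper, sum_reducer, group, and run_job are hypothetical. It runs the same word count with and without the reducer acting as a per-split combiner.

```python
from collections import defaultdict


def mapper(line):
    # Emit (word, 1) for every token in the line.
    for word in line.split():
        yield word, 1


def sum_reducer(key, values):
    # Assumption: the reducer sums its values, so its output type
    # (word, int) matches its input type.
    yield key, sum(values)


def group(pairs):
    # Group values by key, standing in for the shuffle/sort phase.
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def run_job(splits, combiner=None):
    shuffled = []
    for split in splits:
        map_output = [kv for line in split for kv in mapper(line)]
        if combiner is not None:
            # Pre-aggregate each map task's output locally.
            map_output = [kv for k, vs in group(map_output).items()
                          for kv in combiner(k, vs)]
        shuffled.extend(map_output)
    return dict(kv for k, vs in group(shuffled).items()
                for kv in sum_reducer(k, vs))


splits = [["the cat sat", "the cat"], ["the dog"]]
without_combiner = run_job(splits)
with_combiner = run_job(splits, combiner=sum_reducer)
print(without_combiner == with_combiner)
```

The results match because addition is associative and commutative. Hadoop may apply a combiner zero, one, or several times per map task, so a reducer is only safe to reuse when partial aggregation leaves the final result unchanged. A reducer that counted its input records instead of summing their values would not satisfy this.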

Question 7

You need to create a job that does frequency analysis on input data. You will do this by writing a Mapper that uses TextInputFormat and splits each value (a line of text from an input file) into individual characters. For each one of these characters, you will emit the character as a key and an IntWritable as the value. As this will produce proportionally more intermediate data than input data, which two resources should you expect to be bottlenecks?

Answer : B
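
A rough estimate shows why the intermediate data in Question 7 outgrows the input. The sketch below is illustrative only. It assumes each key is a single ASCII character serialized in about 2 bytes (a 1-byte length prefix plus 1 byte of text) and each value is a 4-byte integer, and it ignores record framing, sorting buffers, and compression. The function name estimate_intermediate is hypothetical.

```python
def estimate_intermediate(line, key_bytes=2, value_bytes=4):
    """Return (input_bytes, records, intermediate_bytes) for one line.

    Assumptions: one record is emitted per character, with the
    per-record key and value sizes given by the parameters.
    """
    input_bytes = len(line.encode("utf-8")) + 1  # +1 for the newline
    records = len(line)                          # one record per character
    intermediate_bytes = records * (key_bytes + value_bytes)
    return input_bytes, records, intermediate_bytes


input_bytes, records, intermediate_bytes = estimate_intermediate("hello world")
print(input_bytes, records, intermediate_bytes)
```

Under these assumptions, every input byte turns into several bytes of map output. That output has to be written to local disk and then copied to the reducers during the shuffle.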
