CompTIA DataX Certification DY0-001 Exam Practice Test

Page: 1 / 14
Total 85 questions
Question 1

Which of the following distance metrics for KNN is best described as a straight line?



Answer : B

Euclidean distance measures the straight-line distance between two points in space, matching the geometric "as-the-crow-flies" notion of distance.
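As a minimal sketch of the idea above, the snippet below computes the Euclidean distance between two points and contrasts it with the Manhattan (grid-path) distance. The function names and the sample points are illustrative assumptions, not part of the exam material.

```python
import math

def euclidean_distance(p, q):
    """Straight-line distance between two points of equal dimension."""
    if len(p) != len(q):
        raise ValueError("points must have the same number of dimensions")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def manhattan_distance(p, q):
    """Sum of absolute coordinate differences (grid path), for contrast."""
    if len(p) != len(q):
        raise ValueError("points must have the same number of dimensions")
    return sum(abs(a - b) for a, b in zip(p, q))

print(euclidean_distance((0, 0), (3, 4)))  # 5.0
print(manhattan_distance((0, 0), (3, 4)))  # 7
```

For the same pair of points, the straight-line distance is shorter than the grid path, which is why Euclidean distance is the one described as "a straight line".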


Question 2

Which of the following modeling tools is appropriate for solving a scheduling problem?



Answer : B

Scheduling problems require finding the best allocation of resources subject to constraints (e.g., time slots, resource availability), which is precisely what constrained optimization algorithms are designed to handle.
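To make the idea of constrained optimization concrete, here is a toy sketch that assigns tasks to time slots at minimum total cost while respecting a feasibility constraint. It uses exhaustive search, which only works for tiny instances; real scheduling tools typically rely on linear, integer, or constraint programming solvers. All names, costs, and the constraint are hypothetical.

```python
from itertools import permutations

def best_schedule(tasks, slots, cost, allowed):
    """Try every one-task-per-slot assignment and keep the cheapest feasible one.

    cost maps (task, slot) to a number; allowed(task, slot) returns True
    when the assignment satisfies the constraints.
    """
    best, best_cost = None, float("inf")
    for perm in permutations(slots, len(tasks)):
        assignment = dict(zip(tasks, perm))
        # Constraint check: discard assignments that break any rule.
        if not all(allowed(t, s) for t, s in assignment.items()):
            continue
        # Objective: total cost of the assignment.
        total = sum(cost[(t, s)] for t, s in assignment.items())
        if total < best_cost:
            best, best_cost = assignment, total
    return best, best_cost

tasks = ["A", "B"]
slots = [9, 10, 11]  # hypothetical hourly slots
cost = {
    ("A", 9): 1, ("A", 10): 2, ("A", 11): 3,
    ("B", 9): 1, ("B", 10): 4, ("B", 11): 2,
}

def allowed(task, slot):
    # Hypothetical constraint: task B is unavailable at 9.
    return not (task == "B" and slot == 9)

schedule, total_cost = best_schedule(tasks, slots, cost, allowed)
print(schedule, total_cost)  # {'A': 9, 'B': 11} 3
```

The structure (an objective to minimize plus constraints that rule out assignments) is what identifies this as a constrained optimization problem, regardless of which solver is used.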


Question 3

The term "greedy algorithms" refers to machine-learning algorithms that:



Answer : D

Greedy algorithms build the solution iteratively by choosing at each step the option that appears best at that moment, without reconsidering earlier choices.
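A classic illustration of this behavior is greedy interval selection: sort activities by finish time and take each one that does not overlap the last pick, never revisiting earlier decisions. The sketch below assumes intervals are given as (start, end) pairs; the function name and sample data are illustrative.

```python
def select_activities(intervals):
    """Greedily pick non-overlapping intervals by earliest finish time."""
    chosen = []
    last_end = float("-inf")
    for start, end in sorted(intervals, key=lambda iv: iv[1]):
        # Locally best choice: the compatible interval that finishes first.
        if start >= last_end:
            chosen.append((start, end))
            last_end = end  # earlier picks are never reconsidered
    return chosen

meetings = [(1, 4), (3, 5), (0, 6), (5, 7), (8, 11), (12, 16)]
print(select_activities(meetings))  # [(1, 4), (5, 7), (8, 11), (12, 16)]
```

For this particular problem the greedy rule happens to yield an optimal answer, but in general a greedy choice is only locally best and may miss the global optimum.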


Question 4

A data analyst wants to save a newly analyzed data set to a local storage option. The data set must meet the following requirements:

Which of the following file types is the best to use?



Answer : B

Parquet is a columnar storage format that automatically includes schema (data types), uses efficient compression to minimize file size, and enables very fast reads for analytic workloads.


Question 5

The most likely concern with a one-feature machine-learning model is high error due to:



Answer : A

A model with only one feature is unlikely to capture the true complexity of the data's underlying relationships, leading to systematic underfitting, i.e., high bias.
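The following sketch illustrates underfitting under a simple assumption: the target is quadratic, but the model is a straight line on a single feature. The data set is synthetic and the helper names are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def mean_squared_error(xs, ys, slope, intercept):
    """Average squared residual of the fitted line."""
    return sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys)) / len(xs)

# Synthetic data: the true relationship is y = x^2, with no noise at all.
xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [x ** 2 for x in xs]

slope, intercept = fit_line(xs, ys)
error = mean_squared_error(xs, ys, slope, intercept)
print(slope, intercept, error)  # 0.0 4.0 12.0
```

Even with noise-free data, the best possible line is flat and leaves a large error. That residual error comes from the model being too simple, which is the signature of high bias rather than high variance.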


Question 6

Which of the following distributions would be best to use for hypothesis testing on a data set with 20 observations?



Answer : D

With only 20 observations and an unknown population variance, the t-distribution (with n - 1 = 19 degrees of freedom) properly accounts for the extra uncertainty in the standard error when performing hypothesis tests.
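As a rough sketch of how the test statistic is formed, the function below computes a one-sample t-statistic and its degrees of freedom. The sample values and the hypothesized mean are made up for illustration; looking up the p-value or critical value for the t-distribution would require a statistics table or a library such as SciPy, which is not shown here.

```python
import math
import statistics

def one_sample_t(sample, mu0):
    """Return (t statistic, degrees of freedom) for H0: population mean == mu0."""
    n = len(sample)
    mean = statistics.mean(sample)
    s = statistics.stdev(sample)       # sample standard deviation (n - 1 denominator)
    standard_error = s / math.sqrt(n)  # estimated from the data, hence the t-distribution
    return (mean - mu0) / standard_error, n - 1

# Hypothetical data set with 20 observations.
observations = list(range(1, 21))
t_stat, df = one_sample_t(observations, mu0=9.0)
print(df)  # 19
```

Because the standard error is estimated from only 20 points, the statistic is compared against a t-distribution with 19 degrees of freedom instead of the standard normal.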


Question 7

A data analyst is analyzing data and would like to build conceptual associations. Which of the following is the best way to accomplish this task?



Answer : A

n-grams capture contiguous sequences of words, revealing which terms co-occur and form meaningful multi-word concepts. By analyzing these frequent word combinations, you directly uncover conceptual associations in the text.
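A minimal sketch of n-gram extraction is shown below, assuming simple whitespace tokenization (real pipelines usually add punctuation handling and stop-word filtering). The sample sentence and function name are illustrative.

```python
from collections import Counter

def ngrams(tokens, n):
    """Return all contiguous n-token sequences as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

text = "machine learning models need data and machine learning needs data"
tokens = text.lower().split()

# Count bigrams to see which word pairs co-occur most often.
bigram_counts = Counter(ngrams(tokens, 2))
print(bigram_counts.most_common(1))  # [(('machine', 'learning'), 2)]
```

The most frequent bigram, "machine learning", surfaces as a multi-word concept, which is the kind of association the explanation describes.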

