Which of the following distance metrics for KNN is best described as a straight line?
Answer : B
Euclidean distance measures the straight-line distance between two points in space, matching the geometric "as-the-crow-flies" notion of distance.
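To make the "straight line" idea concrete, here is a minimal sketch that compares Euclidean distance with Manhattan (city-block) distance on the same pair of points. The function names are illustrative and do not come from any particular KNN library.

```python
import math

def euclidean_distance(p, q):
    """Straight-line distance between two points with the same number of coordinates."""
    if len(p) != len(q):
        raise ValueError("points must have the same number of coordinates")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def manhattan_distance(p, q):
    """City-block distance: sum of absolute coordinate differences."""
    if len(p) != len(q):
        raise ValueError("points must have the same number of coordinates")
    return sum(abs(a - b) for a, b in zip(p, q))

print(euclidean_distance((0, 0), (3, 4)))  # 5.0 (the hypotenuse of a 3-4-5 triangle)
print(manhattan_distance((0, 0), (3, 4)))  # 7   (3 across plus 4 up)
```

The Euclidean result is the direct diagonal, while the Manhattan result follows the grid, which is why only the former is described as a straight line.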
Which of the following modeling tools is appropriate for solving a scheduling problem?
Answer : B
Scheduling problems require finding the best allocation of resources subject to constraints (e.g., time slots, resource availability), which is precisely what constrained optimization algorithms are designed to handle.
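As a rough illustration of scheduling as constrained optimization, the sketch below assigns three workers to three shifts at minimum total cost while respecting one availability constraint. The workers, costs, and constraint are made-up data, and brute-force enumeration stands in for a real solver, which would be needed at any realistic problem size.

```python
from itertools import permutations

# Hypothetical data: COST[worker][shift] is the cost of that assignment.
COST = {
    "ana": {"morning": 1, "evening": 4, "night": 6},
    "ben": {"morning": 2, "evening": 1, "night": 5},
    "cal": {"morning": 3, "evening": 2, "night": 2},
}
# Hard constraint (also hypothetical): these worker/shift pairs are not allowed.
UNAVAILABLE = {("ana", "night")}

def best_schedule(cost, unavailable):
    """Return the feasible one-worker-per-shift assignment with the lowest total cost."""
    workers = list(cost)
    shifts = list(next(iter(cost.values())))
    best, best_cost = None, float("inf")
    for perm in permutations(shifts):
        pairs = list(zip(workers, perm))
        # Skip candidate schedules that violate the availability constraint.
        if any(pair in unavailable for pair in pairs):
            continue
        total = sum(cost[w][s] for w, s in pairs)
        if total < best_cost:
            best, best_cost = dict(pairs), total
    return best, best_cost

schedule, total = best_schedule(COST, UNAVAILABLE)
print(schedule, total)
```

The objective (total cost) and the constraint (availability) are kept separate on purpose: that split into "what to minimize" and "what must hold" is the defining shape of a constrained optimization problem.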
The term "greedy algorithms" refers to machine-learning algorithms that:
Answer : D
Greedy algorithms build the solution iteratively by choosing at each step the option that appears best at that moment, without reconsidering earlier choices.
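The pattern can be sketched with a small non-ML example, coin change, where the locally best choice is always the largest coin that still fits. This is only an illustration of the greedy principle; in machine learning the same pattern appears, for instance, when a decision tree picks the best split at each node without revisiting earlier splits.

```python
def greedy_change(amount, coins):
    """Make change by always taking the largest coin that fits; never backtracks."""
    result = []
    for coin in sorted(coins, reverse=True):
        while amount >= coin:
            amount -= coin
            result.append(coin)
    if amount != 0:
        raise ValueError("greedy strategy could not make exact change")
    return result

print(greedy_change(63, [25, 10, 5, 1]))  # [25, 25, 10, 1, 1, 1]
print(greedy_change(30, [25, 10, 1]))     # [25, 1, 1, 1, 1, 1]
```

The second call shows the usual trade-off: the greedy answer uses six coins, although three 10s would do, because the early choice of 25 is never reconsidered.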
A data analyst wants to save a newly analyzed data set to a local storage option. The data set must meet the following requirements:
Which of the following file types is the best to use?
Answer : B
Parquet is a columnar storage format that automatically includes schema (data types), uses efficient compression to minimize file size, and enables very fast reads for analytic workloads.
The most likely concern with a one-feature, machine-learning model is high error due to:
Answer : A
A model with only one feature is unlikely to capture the true complexity of the data's underlying relationships, leading to systematic underfitting, that is, high bias.
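A small sketch of this effect, on synthetic data invented for the illustration: the target depends on two features, but the model is a least-squares line fitted on the first feature only. The coefficients 2 and 10 and the helper names are assumptions chosen to keep the arithmetic easy to follow.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept on a single feature."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def mse(xs, ys, slope, intercept):
    """Mean squared error of the fitted line on the given data."""
    return sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys)) / len(xs)

# Synthetic data: the true relationship uses two features, y = 2*x1 + 10*x2.
x1 = [0, 1, 2, 3, 0, 1, 2, 3]
x2 = [0, 0, 0, 0, 1, 1, 1, 1]
y = [2 * a + 10 * b for a, b in zip(x1, x2)]

# The one-feature model only ever sees x1.
slope, intercept = fit_line(x1, y)
print(slope, intercept, mse(x1, y, slope, intercept))  # 2.0 5.0 25.0
```

Even with the best possible line on `x1`, every prediction is off by 5 in one direction or the other, because the information carried by `x2` is simply unavailable to the model. More data of the same kind would not reduce that error, which is what distinguishes bias from variance.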
Which of the following distributions would be best to use for hypothesis testing on a data set with 20 observations?
Answer : D
With only 20 observations and an unknown population variance, the t-distribution (with n - 1 = 19 degrees of freedom) properly accounts for the extra uncertainty in the standard error when performing hypothesis tests.
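As a sketch of how this looks in practice, the code below computes a one-sample t statistic for a sample of 20 values using only the standard library. The sample and the null-hypothesis mean are made up, and the critical value 2.093 (two-sided, 5% level, 19 degrees of freedom) is copied from a standard t table rather than computed.

```python
import math
import statistics

# Two-sided 5% critical value for 19 degrees of freedom, taken from a t table.
T_CRIT_DF19 = 2.093

def one_sample_t(sample, mu0):
    """Return (t statistic, degrees of freedom) for H0: population mean == mu0."""
    n = len(sample)
    mean = statistics.mean(sample)
    s = statistics.stdev(sample)  # sample standard deviation (n - 1 in the denominator)
    t = (mean - mu0) / (s / math.sqrt(n))
    return t, n - 1

# Hypothetical sample of 20 observations.
sample = [10 + (i % 5) for i in range(20)]
t_stat, df = one_sample_t(sample, mu0=11)
print(df, abs(t_stat) > T_CRIT_DF19)  # 19 True
```

For comparison, the corresponding normal (z) critical value is about 1.96; the larger t value of 2.093 reflects the extra uncertainty from estimating the standard deviation from only 20 points.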
A data analyst is analyzing data and would like to build conceptual associations. Which of the following is the best way to accomplish this task?
Answer : A
n-grams capture contiguous sequences of words, revealing which terms co-occur and form meaningful multi-word concepts. By analyzing these frequent word combinations, you directly uncover conceptual associations in the text.
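A minimal sketch of the idea, using whitespace tokenization and bigram counts from the standard library. The example sentence is invented, and a real pipeline would typically add proper tokenization and stop-word handling.

```python
from collections import Counter

def ngrams(tokens, n):
    """Return all contiguous n-token sequences as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

text = "machine learning models need data and machine learning needs compute"
tokens = text.lower().split()

# Count bigrams; frequent pairs hint at multi-word concepts.
counts = Counter(ngrams(tokens, 2))
print(counts.most_common(1))  # [(('machine', 'learning'), 2)]
```

Here the only repeated bigram is ("machine", "learning"), which is exactly the kind of multi-word association that n-gram analysis is meant to surface.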