The term "greedy algorithms" refers to machine-learning algorithms that:
Answer : D
Greedy algorithms build the solution iteratively by choosing at each step the option that appears best at that moment, without reconsidering earlier choices.
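The behaviour described above can be illustrated with a small non-ML example. This is only a sketch: the function name greedy_change and the coin sets are hypothetical, chosen to show that a greedy procedure commits to each locally best choice and never revisits it, even when that leads to a suboptimal result.

```python
def greedy_change(amount, denominations):
    """Make change by always taking the largest coin that still fits."""
    coins = []
    for coin in sorted(denominations, reverse=True):
        while amount >= coin:
            amount -= coin
            coins.append(coin)  # committed; this choice is never reconsidered
    return coins


# With standard coins the greedy choice happens to be optimal.
print(greedy_change(63, [25, 10, 5, 1]))

# With coins 4, 3, 1 the greedy result uses three coins (4 + 1 + 1),
# although two coins (3 + 3) would suffice.
print(greedy_change(6, [4, 3, 1]))
```

Greedy learners such as decision-tree induction work in the same spirit: each split is picked because it looks best at that node, and earlier splits are not revised.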
Which of the following distributions would be best to use for hypothesis testing on a data set with 20 observations?
Answer : D
With only 20 observations and an unknown population variance, the t-distribution (with n - 1 = 19 degrees of freedom) properly accounts for the extra uncertainty in the standard error when performing hypothesis tests.
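A one-sample t-test on 20 observations can be sketched as follows, using only the standard library. The sample values and the hypothesized mean of 48 are made up for illustration; the constant 2.093 is the two-sided 5% critical value of the t-distribution with 19 degrees of freedom.

```python
import math
import statistics


def one_sample_t(sample, mu0):
    """Return the t statistic and degrees of freedom for H0: mean == mu0."""
    n = len(sample)
    mean = statistics.mean(sample)
    s = statistics.stdev(sample)      # sample standard deviation (n - 1 denominator)
    standard_error = s / math.sqrt(n)
    return (mean - mu0) / standard_error, n - 1


# Hypothetical data: 20 observations.
sample = [52, 48, 51, 49, 53, 47, 50, 54, 46, 50] * 2
t_stat, df = one_sample_t(sample, mu0=48)

T_CRIT_19 = 2.093  # two-sided 5% critical value for 19 degrees of freedom
reject_h0 = abs(t_stat) > T_CRIT_19
print(f"t = {t_stat:.3f}, df = {df}, reject H0: {reject_h0}")
```

Because the standard deviation is estimated from the sample itself, the statistic is compared against the t-distribution rather than the normal distribution.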
A data scientist built several models that perform about the same but vary in the number of features. Which of the following models should the data scientist recommend for production according to Occam's razor?
Answer : A
According to Occam's razor, when models perform equivalently, you choose the simplest one: in this case, the model that achieves the needed performance with the fewest features.
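One way to apply this rule in code is sketched below. The model names, scores, and the 0.01 tolerance that defines "about the same" are all assumptions made for the example, not values from the question.

```python
def pick_simplest(models, tolerance=0.01):
    """Among models within `tolerance` of the best score, pick the fewest features."""
    best = max(m["score"] for m in models)
    close_enough = [m for m in models if best - m["score"] <= tolerance]
    return min(close_enough, key=lambda m: m["n_features"])


# Hypothetical candidates with validation scores.
candidates = [
    {"name": "model_a", "n_features": 4, "score": 0.912},
    {"name": "model_b", "n_features": 12, "score": 0.915},
    {"name": "model_c", "n_features": 30, "score": 0.917},
    {"name": "model_d", "n_features": 2, "score": 0.780},
]

chosen = pick_simplest(candidates)
print(chosen["name"])
```

Here model_d has the fewest features but is excluded because its score is well below the others; among the remaining, comparable models the one with the fewest features is selected.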
The most likely concern with a one-feature machine-learning model is high error due to:
Answer : A
A model with only one feature is unlikely to capture the true complexity of the data's underlying relationships, leading to systematic underfitting, that is, high bias.
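A toy illustration of this kind of underfitting, under assumed data: the target is exactly x squared, and a model restricted to the single feature x is fitted by ordinary least squares. The helper names fit_line and mse are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def mse(predictions, targets):
    """Mean squared error between predictions and targets."""
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(targets)


# Noise-free data whose true relationship needs a second feature (x squared).
xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [x ** 2 for x in xs]

# One-feature model: only x is available.
slope, intercept = fit_line(xs, ys)
one_feature_mse = mse([slope * x + intercept for x in xs], ys)

# Two-feature model (x and x squared) with the true coefficients 0 and 1.
two_feature_mse = mse([0 * x + 1 * x ** 2 for x in xs], ys)

print(one_feature_mse, two_feature_mse)
```

The one-feature model has a large error even on the data it was fitted to, and collecting more samples would not help; the error comes from the model being too simple, which is the signature of high bias.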
A data scientist is preparing to brief a non-technical audience that is focused on analysis and results. During the modeling process, the data scientist produced the following artifacts:
Which of the following artifacts should the data scientist include in the briefing? (Choose two.)
Answer : A
For a non-technical audience focused on results, polished visualizations (charts and dashboards) and clear, high-level performance metrics (accuracy, precision, recall, F1 score) best convey the key takeaways. The deeper technical details (code documentation, data dictionaries, and algorithm math) should be omitted at this level.
A data scientist wants to predict a person's travel destination. The options are:
Which of the following models would best fit this use case?
Answer : A
You need a supervised multiclass classification model to predict one of the four labeled destinations. Linear Discriminant Analysis is designed for such tasks, finding the linear boundaries that best separate the known destination classes.
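The idea can be sketched with a deliberately simplified, single-feature version of Linear Discriminant Analysis written from scratch. Everything concrete here is an assumption for illustration: the destination labels, the numeric feature (a made-up trip budget), and the function names. Each class gets a linear score based on its mean, a pooled variance shared by all classes, and its prior; the class with the highest score wins.

```python
import math
from collections import defaultdict


def fit_lda_1d(xs, labels):
    """Estimate class means, class priors, and the pooled (shared) variance."""
    groups = defaultdict(list)
    for x, label in zip(xs, labels):
        groups[label].append(x)
    n = len(xs)
    means = {k: sum(v) / len(v) for k, v in groups.items()}
    priors = {k: len(v) / n for k, v in groups.items()}
    within_ss = sum((x - means[k]) ** 2 for k, v in groups.items() for x in v)
    pooled_var = within_ss / (n - len(groups))
    return means, priors, pooled_var


def predict_lda_1d(x, means, priors, pooled_var):
    """Return the class with the largest linear discriminant score."""
    def score(k):
        return (x * means[k] / pooled_var
                - means[k] ** 2 / (2 * pooled_var)
                + math.log(priors[k]))
    return max(means, key=score)


# Hypothetical training data: one feature, four labeled destinations.
budgets = [10, 12, 11, 20, 22, 21, 30, 32, 31, 40, 42, 41]
destinations = ["beach"] * 3 + ["city"] * 3 + ["mountain"] * 3 + ["lake"] * 3

model = fit_lda_1d(budgets, destinations)
print(predict_lda_1d(13, *model))
print(predict_lda_1d(29, *model))
```

With several features the same scores use class mean vectors and a shared covariance matrix, which is what library implementations of LDA provide.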
A data scientist is standardizing a large data set that contains website addresses. A specific string inside some of the web addresses needs to be extracted. Which of the following is the best method for extracting the desired string from the text data?
Answer : A
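A pattern-based approach to this kind of extraction can be sketched with regular expressions. The target string is not specified in the question, so the example assumes, purely for illustration, that the value of a utm_campaign query parameter is wanted; the URLs and the function name are hypothetical.

```python
import re

# Capture the value that follows "utm_campaign=" up to the next "&" or "#".
CAMPAIGN_PATTERN = re.compile(r"[?&]utm_campaign=([^&#]+)")


def extract_campaign(url):
    """Return the campaign value from a URL, or None if it is absent."""
    match = CAMPAIGN_PATTERN.search(url)
    return match.group(1) if match else None


urls = [
    "https://example.com/page?id=7&utm_campaign=spring_sale&ref=x",
    "https://example.com/about",
]
for url in urls:
    print(extract_campaign(url))
```

Returning None for addresses without the pattern makes it easy to apply the function across a large data set where only some rows contain the string.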