SIMULATION
A data scientist needs to determine whether product sales are impacted by other contributing factors. The client has provided the data scientist with sales and other variables in the data set.
The data scientist decides to test potential models that include other information.
INSTRUCTIONS
Part 1
Use the information provided in the table to select the appropriate regression model.
Part 2
Review the summary output and variable table to determine which variable is statistically significant.
If at any time you would like to bring back the initial state of the simulation, please click the Reset All button.

Answer : A
Part 1
Linear regression.
Of the four models, linear regression has the highest R² (0.8), indicating that it explains the greatest proportion of the variance in sales.
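The comparison can be sketched in a few lines of Python. This is a minimal illustration, not the simulation's own output: the sales figures, the predictions, and the names model_a and model_b are hypothetical, and the score is the standard coefficient of determination.

```python
def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_tot = sum((y - mean_actual) ** 2 for y in actual)
    ss_res = sum((y - p) ** 2 for y, p in zip(actual, predicted))
    return 1 - ss_res / ss_tot


# Hypothetical sales values and predictions from two candidate models.
sales = [1.0, 2.0, 3.0, 4.0]
candidates = {
    "model_a": [1.1, 1.9, 3.2, 3.8],
    "model_b": [2.0, 2.0, 3.0, 3.0],
}

# Score every candidate and keep the one that explains the most variance.
scores = {name: r_squared(sales, preds) for name, preds in candidates.items()}
best_model = max(scores, key=scores.get)
print(scores, best_model)
```

The model with the larger score leaves less unexplained variance, which is the criterion applied in Part 1.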

Part 2
Var 4 -- Net operations cost.
Net operations cost has a p-value of essentially 0 (far below 0.05), so it is the only additional predictor that is statistically significant in explaining sales. Neither inventory cost (p ≈ 0.90) nor initial investment (p ≈ 0.23) reaches significance.
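The decision rule is a comparison of each p-value against the 0.05 threshold. The sketch below uses the p-values quoted in the explanation. The value standing in for "essentially 0" is a placeholder assumption, since the summary output itself is not reproduced here.

```python
ALPHA = 0.05  # conventional significance threshold

# p-values as quoted in the explanation above. The 1e-10 entry is an
# assumed stand-in for "essentially 0".
p_values = {
    "inventory cost": 0.90,
    "initial investment": 0.23,
    "net operations cost": 1e-10,
}

# A predictor counts as statistically significant when p < ALPHA.
significant = [name for name, p in p_values.items() if p < ALPHA]
print(significant)
```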

The term "greedy algorithms" refers to machine-learning algorithms that:
Answer : D
Greedy algorithms build the solution iteratively by choosing at each step the option that appears best at that moment, without reconsidering earlier choices.
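A classic illustration of this behavior is making change with the largest coin that still fits. The sketch below is a generic example rather than anything taken from the question. The function name and coin sets are hypothetical, and it assumes a coin of value 1 is available so the amount can always be reached.

```python
def greedy_change(amount, coins):
    """Repeatedly take the largest coin that still fits; never backtrack."""
    picked = []
    for coin in sorted(coins, reverse=True):
        while amount >= coin:
            amount -= coin
            picked.append(coin)
    return picked


# With these denominations the greedy choice happens to be optimal.
us_style = greedy_change(63, [25, 10, 5, 1])

# With these it is not: greedy commits to 4 first and ends with three
# coins, although two coins of value 3 would have been enough.
odd_system = greedy_change(6, [4, 3, 1])
print(us_style, odd_system)
```

The second call shows the cost of never reconsidering an earlier choice: the locally best step does not always lead to the globally best solution.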
A data scientist wants to predict a person's travel destination. The options are:
Which of the following models would best fit this use case?
Answer : A
You need a supervised multiclass classification model to predict one of the four labeled destinations. Linear Discriminant Analysis is designed for such tasks, finding the linear boundaries that best separate the known destination classes.
A data scientist is analyzing a data set with categorical features and would like to make those features more useful when building a model. Which of the following data transformation techniques should the data scientist use? (Choose two.)
Answer : B
One-hot encoding creates binary indicator columns for each category, allowing models to treat nominal categories without implying any order.
Label encoding maps categories to integer labels, which can be useful for tree-based models or when you need a single numeric column (though you must make sure the algorithm does not misread the implied ordering of those integers as meaningful).
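Both encodings can be shown on a toy column using only the standard library. This is a minimal sketch: the colors feature is hypothetical, and in practice a library encoder would usually be used instead.

```python
colors = ["red", "green", "blue", "green"]  # hypothetical nominal feature

# Label encoding: one integer per category (sorted so the mapping is stable).
categories = sorted(set(colors))
label_of = {cat: idx for idx, cat in enumerate(categories)}
label_encoded = [label_of[c] for c in colors]

# One-hot encoding: one 0/1 indicator column per category, no implied order.
one_hot_encoded = [[1 if c == cat else 0 for cat in categories] for c in colors]

print(categories)
print(label_encoded)
print(one_hot_encoded)
```

Note that the integer labels suggest blue < green < red, an ordering that exists only as a side effect of sorting. The one-hot columns carry no such ordering.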
Which of the following modeling tools is appropriate for solving a scheduling problem?
Answer : B
Scheduling problems require finding the best allocation of resources subject to constraints (e.g., time slots, resource availability), which is precisely what constrained optimization algorithms are designed to handle.
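A tiny scheduling instance makes the idea concrete. The sketch below assigns three workers to three shifts at minimum total cost, subject to one availability constraint. The cost matrix and the constraint are made up for illustration. It enumerates every assignment by brute force, which only works at toy scale; a real problem would be handed to a dedicated optimization solver.

```python
from itertools import permutations

# cost[w][s]: hypothetical cost of giving shift s to worker w.
cost = [
    [4, 2, 8],
    [4, 3, 7],
    [3, 1, 6],
]
# Constraint (assumed for illustration): worker 0 cannot take shift 1.
unavailable = {(0, 1)}


def feasible(assignment):
    """assignment[w] is the shift given to worker w."""
    return all((w, s) not in unavailable for w, s in enumerate(assignment))


def total_cost(assignment):
    return sum(cost[w][s] for w, s in enumerate(assignment))


# Objective: minimize total cost over the assignments that satisfy the constraint.
best = min((a for a in permutations(range(3)) if feasible(a)), key=total_cost)
best_cost = total_cost(best)
print(best, best_cost)
```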
Given matrix

Which of the following is A^T (the transpose of A)?
A)

B)

C)

D)

Answer : C
Transposing swaps rows and columns, so the (i, j) entry becomes the (j, i) entry.
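The rule can be checked on any small matrix. The matrix below is a hypothetical 2 x 3 example, not the one from the question, whose entries are not reproduced here.

```python
def transpose(matrix):
    """Return the transpose: row i of the result is column i of the input."""
    return [list(column) for column in zip(*matrix)]


A = [[1, 2, 3],
     [4, 5, 6]]          # hypothetical 2 x 3 matrix
A_T = transpose(A)       # 3 x 2 result
print(A_T)
```

Each entry A[i][j] ends up at A_T[j][i], so a 2 x 3 matrix becomes 3 x 2.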
A data scientist is preparing to brief a non-technical audience that is focused on analysis and results. During the modeling process, the data scientist produced the following artifacts:
Which of the following artifacts should the data scientist include in the briefing? (Choose two.)
Answer : A
For a non-technical audience focused on results, polished visualizations (charts and dashboards) and clear, high-level performance metrics (accuracy, precision, recall, F1 score) best convey the key takeaways. The deeper technical details (code documentation, data dictionaries, and algorithm math) should be omitted at this level.