Snowflake SnowPro Advanced: Data Scientist Certification DSA-C02 Exam Questions

Page: 1 / 14
Total 65 questions
Question 1

Which of the following types of visualization are used for data exploration in data science?



Answer : A, D, E

Types of visualization used for exploration:

* Correlation heatmap

* Class distributions by feature

* Two-Dimensional density plots.

All the visualizations are interactive, as is standard for Plotly.

For more details, refer to the link below:

https://towardsdatascience.com/data-exploration-understanding-and-visualization-72657f5eac41
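
As a rough illustration of the first item, a correlation heatmap is a color-coded rendering of a pairwise correlation matrix. The sketch below computes only that matrix in plain Python and leaves the plotting step out; the column names and values are made-up toy data, and `pearson` and `correlation_matrix` are hypothetical helper names, not part of any library.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

def correlation_matrix(columns):
    """Nested dict of pairwise correlations; this is the data a heatmap colors."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}

# Toy data, invented for illustration only.
data = {
    "height": [150, 160, 170, 180, 190],
    "weight": [50, 58, 69, 80, 92],
    "commute_minutes": [60, 45, 40, 25, 10],
}

matrix = correlation_matrix(data)
for row_name, row in matrix.items():
    print(row_name, [round(value, 2) for value in row.values()])
```

A plotting library would then map each cell of `matrix` to a color; strongly positive and strongly negative pairs stand out at a glance.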


Question 2

Data providers add Snowflake objects (databases, schemas, tables, secure views, etc.) to a share using which of the following options?



Answer : B, C

What is a Share?

Shares are named Snowflake objects that encapsulate all of the information required to share a database.

Data providers add Snowflake objects (databases, schemas, tables, secure views, etc.) to a share using either or both of the following options:

Option 1: Grant privileges on objects to a share via a database role.

Option 2: Grant privileges on objects directly to a share.

You choose which accounts can consume data from the share by adding the accounts to the share.

After a database is created (in a consumer account) from a share, all the shared objects are accessible to users in the consumer account.

Shares are secure, configurable, and controlled completely by the provider account:

* New objects added to a share become immediately available to all consumers, providing real-time access to shared data.

* Access to a share (or any of the objects in a share) can be revoked at any time.
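
The two options can be sketched as the SQL statements a provider might run. The Python below only builds the statement strings and does not connect to Snowflake. The object names (`sales_db`, `public`, `orders`, `sales_share`, `reader_role`) and both helper functions are hypothetical, and the statement shapes follow the `GRANT ... TO SHARE` pattern as commonly documented, so check them against the current Snowflake documentation before use.

```python
def direct_grant_statements(share, database, schema, table):
    """Option 2: grant privileges on objects directly to the share."""
    return [
        f"GRANT USAGE ON DATABASE {database} TO SHARE {share};",
        f"GRANT USAGE ON SCHEMA {database}.{schema} TO SHARE {share};",
        f"GRANT SELECT ON TABLE {database}.{schema}.{table} TO SHARE {share};",
    ]

def database_role_grant_statements(share, database, schema, table, role):
    """Option 1: grant privileges to a database role, then grant that role to the share."""
    qualified_role = f"{database}.{role}"
    return [
        f"CREATE DATABASE ROLE {qualified_role};",
        f"GRANT USAGE ON SCHEMA {database}.{schema} TO DATABASE ROLE {qualified_role};",
        f"GRANT SELECT ON TABLE {database}.{schema}.{table} TO DATABASE ROLE {qualified_role};",
        f"GRANT USAGE ON DATABASE {database} TO SHARE {share};",
        f"GRANT DATABASE ROLE {qualified_role} TO SHARE {share};",
    ]

# Hypothetical object names, for illustration only.
direct = direct_grant_statements("sales_share", "sales_db", "public", "orders")
via_role = database_role_grant_statements(
    "sales_share", "sales_db", "public", "orders", "reader_role"
)

for statement in direct + via_role:
    print(statement)
```

In the database-role variant the share receives the role rather than the individual object privileges.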


Question 3

Which type of machine learning do data scientists generally use for solving classification and regression problems?



Answer : A

Supervised Learning

Overview:

Supervised learning is a type of machine learning that uses labeled data to train machine learning models. In labeled data, the output is already known. The model just needs to map the inputs to the respective outputs.

Algorithms:

Some of the most popularly used supervised learning algorithms are:

* Linear Regression

* Logistic Regression

* Support Vector Machine

* K Nearest Neighbor

* Decision Tree

* Random Forest

* Naive Bayes

Working:

Supervised learning algorithms take labeled inputs and map them to the known outputs, which means you already know the target variable.

Supervised Learning methods need external supervision to train machine learning models. Hence, the name supervised. They need guidance and additional information to return the desired result.

Applications:

Supervised learning algorithms are generally used for solving classification and regression problems.

A few of the top supervised learning applications are weather prediction, sales forecasting, and stock price analysis.
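
To make the idea of learning from labeled data concrete, here is a minimal sketch of one algorithm from the list above, K Nearest Neighbor, used for classification. It is written in plain Python with invented toy points and labels; `knn_predict` is a hypothetical helper name, and a real project would more likely use an established library.

```python
from collections import Counter
from math import dist

def knn_predict(train_points, train_labels, query, k=3):
    """Predict the label of `query` by majority vote among its k nearest labeled points."""
    ranked = sorted(
        zip(train_points, train_labels),
        key=lambda pair: dist(pair[0], query),  # Euclidean distance to the query
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

# Labeled training data (inputs paired with known outputs), invented for illustration.
points = [(1, 1), (1, 2), (2, 1), (8, 8), (8, 9), (9, 8)]
labels = ["small", "small", "small", "large", "large", "large"]

print(knn_predict(points, labels, (2, 2)))
print(knn_predict(points, labels, (7, 9)))
```

The known labels are the supervision: the model never invents categories, it only maps new inputs onto outputs it has already seen.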


Question 4

In a simple linear regression model (one independent variable), if we change the input variable by 1 unit, how much will the output variable change?



Answer : D

What is linear regression?

Linear regression analysis is used to predict the value of a variable based on the value of another variable. The variable you want to predict is called the dependent variable. The variable you are using to predict the other variable's value is called the independent variable.

Linear regression attempts to model the relationship between two variables by fitting a linear equation to observed data. One variable is considered to be an explanatory variable, and the other is considered to be a dependent variable. For example, a modeler might want to relate the weights of individuals to their heights using a linear regression model.

A linear regression line has an equation of the form Y = a + bX, where X is the explanatory variable and Y is the dependent variable. The slope of the line is b, and a is the intercept (the value of Y when X = 0).

With the error term included, the model is Y = a + bX + error.

Neglecting the error, Y = a + bX. If X increases by 1, the new value is a + b(X + 1) = a + bX + b. So Y increases by b, the slope.

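
The slope argument can be checked numerically. The sketch below fits a line by ordinary least squares in plain Python and compares predictions at two inputs one unit apart. The data points are invented so that they lie exactly on a line with intercept 1 and slope 2, and `fit_line` and `predict` are hypothetical helper names.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of Y = a + bX; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    b = (
        sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
        / sum((x - mean_x) ** 2 for x in xs)
    )
    a = mean_y - b * mean_x
    return a, b

# Toy data lying exactly on Y = 1 + 2X (no error term), for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [3, 5, 7, 9, 11]

a, b = fit_line(xs, ys)

def predict(x):
    return a + b * x

# Raising the input by one unit changes the prediction by the slope b.
change = predict(11) - predict(10)
print(a, b, change)
```

The same difference appears for any pair of inputs one unit apart, because the intercept cancels when the two predictions are subtracted.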


Question 5

What is the formula for measuring skewness in a dataset?



Answer : C

Since the normal curve is symmetric about its mean, its skewness is zero. This is a theoretical explanation; for mathematical proofs, you can refer to books or websites that cover the topic in detail.
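
The answer options are not reproduced here, so the sketch below is not necessarily the formula labeled C. It shows one common definition, the moment-based coefficient (third central moment divided by the standard deviation cubed), and illustrates that symmetric data gives a skewness of zero. The sample values are invented and `skewness` is a hypothetical helper name.

```python
def skewness(values):
    """Moment-based skewness: third central moment over the 1.5 power of the variance."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # population variance
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    return m3 / m2 ** 1.5

symmetric = [1, 2, 3, 4, 5]      # mirror image around its mean
right_skewed = [1, 1, 2, 2, 10]  # one large value stretches the right tail

print(skewness(symmetric))
print(skewness(right_skewed))
```

A positive result indicates a longer right tail and a negative result a longer left tail.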


Question 6

Which command is used to install Jupyter Notebook?



Answer : A

Jupyter Notebook is a web-based interactive computational environment.

The command used to install Jupyter Notebook is pip install jupyter.

The command used to start Jupyter Notebook is jupyter notebook.


Question 7

Which one is an incorrect understanding about providers of a direct share?



Answer : D

If you want to provide a share to many accounts, you might want to use a listing or a data exchange.

