Snowflake SnowPro Advanced: Data Scientist Certification DSA-C02 Exam Practice Test

Page: 1 / 14
Total 65 questions
Question 1

Which learning methodology applies the conditional probability of all the variables with respect to the dependent variable?



Answer : A

Supervised learning applies the conditional probability of all the variables with respect to the dependent variable. In general, the conditional probability of a variable is a basic method of estimating statistics for random experiments.

Conditional probability is thus the likelihood of an event or outcome occurring based on the occurrence of some other event or prior outcome. Two events are said to be independent if one event occurring does not affect the probability that the other event will occur.
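The two definitions above can be checked on a small example. The sketch below is illustrative only: it assumes a single roll of a fair six-sided die as the sample space, and the helper names (prob, cond_prob, independent) are hypothetical, not part of any Snowflake API.

```python
from fractions import Fraction

# Assumed sample space: one roll of a fair six-sided die.
OUTCOMES = frozenset(range(1, 7))

def prob(event):
    """P(event) when every outcome is equally likely."""
    return Fraction(len(event & OUTCOMES), len(OUTCOMES))

def cond_prob(a, b):
    """P(a | b) = P(a and b) / P(b); requires P(b) > 0."""
    return prob(a & b) / prob(b)

def independent(a, b):
    """Two events are independent when P(a and b) == P(a) * P(b)."""
    return prob(a & b) == prob(a) * prob(b)

even = {2, 4, 6}
greater_than_3 = {4, 5, 6}
at_most_4 = {1, 2, 3, 4}

print(cond_prob(even, greater_than_3))    # 2/3
print(independent(even, greater_than_3))  # False
print(independent(even, at_most_4))       # True
```

Knowing that the roll is greater than 3 raises the probability of an even roll from 1/2 to 2/3, so those two events are dependent. Knowing that the roll is at most 4 leaves it at 1/2, so those two are independent.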


Question 2

A Data Scientist, acting as a data provider, needs to allow consumers to access all databases and database objects in a share by granting a single privilege on the shared databases. Which one of the following SnowSQL commands used for this task is incorrect?

Assuming:

A database named product_db exists with a schema named product_agg and a table named Item_agg.

The database, schema, and table will be shared with two accounts named xy12345 and yz23456.

1. USE ROLE accountadmin;

2. CREATE DIRECT SHARE product_s;

3. GRANT USAGE ON DATABASE product_db TO SHARE product_s;

4. GRANT USAGE ON SCHEMA product_db.product_agg TO SHARE product_s;

5. GRANT SELECT ON TABLE product_db.product_agg.Item_agg TO SHARE product_s;

6. SHOW GRANTS TO SHARE product_s;

7. ALTER SHARE product_s ADD ACCOUNTS=xy12345, yz23456;

8. SHOW GRANTS OF SHARE product_s;



Answer : C

CREATE SHARE product_s; is the correct SnowSQL command to create a share object. CREATE DIRECT SHARE is not valid syntax.

The remaining commands are correct.

https://docs.snowflake.com/en/user-guide/data-sharing-provider#creating-a-share-using-sql


Question 3

What can a Snowflake Data Scientist do in the Snowflake Marketplace as a consumer?



Answer : A, B, C, D

As a consumer, you can do the following:

* Discover and test third-party data sources.

* Receive frictionless access to raw data products from vendors.

* Combine new datasets with your existing data in Snowflake to derive new business insights.

* Have datasets available instantly and updated continually for users.

* Eliminate the costs of building and maintaining various APIs and data pipelines to load and update data.

* Use the business intelligence (BI) tools of your choice.


Question 4

In a simple linear regression model (one independent variable), if we change the input variable by 1 unit, how much will the output variable change?



Answer : D

What is linear regression?

Linear regression analysis is used to predict the value of a variable based on the value of another variable. The variable you want to predict is called the dependent variable. The variable you are using to predict the other variable's value is called the independent variable.

Linear regression attempts to model the relationship between two variables by fitting a linear equation to observed data. One variable is considered to be an explanatory variable, and the other is considered to be a dependent variable. For example, a modeler might want to relate the weights of individuals to their heights using a linear regression model.

A linear regression line has an equation of the form Y = a + bX, where X is the explanatory variable and Y is the dependent variable. The slope of the line is b, and a is the intercept (the value of y when x = 0).

For linear regression, Y = a + bX + error.

Neglecting the error term, Y = a + bX. If X increases by 1, the new value is a + b(X + 1) = (a + bX) + b, so Y changes by b, the slope of the line.
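The same result can be confirmed numerically. The following is a minimal sketch, assuming a small made-up dataset and an ordinary least-squares fit written by hand; the function names fit_line and predict are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = a + b*x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx            # slope
    a = mean_y - b * mean_x  # intercept
    return a, b

def predict(a, b, x):
    """Model output with the error term neglected."""
    return a + b * x

# Made-up observations, for illustration only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [3.1, 4.9, 7.2, 8.8, 11.0]

a, b = fit_line(xs, ys)

# Raising the input by exactly one unit changes the output by the slope.
change = predict(a, b, 11.0) - predict(a, b, 10.0)
print(abs(change - b) < 1e-9)  # True
```

The starting value of x does not matter: any one-unit increase in the input changes the prediction by b.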



Question 5

Which of the following methods is used for multiclass classification?



Answer : A

Binary vs. Multi-Class Classification

Classification problems are common in machine learning. In most cases, developers prefer a supervised machine-learning approach to predict class labels for a given dataset. Unlike regression, classification involves designing a classifier model and training it to categorize the test dataset. A classification problem can be either binary or multi-class.

As the name suggests, binary classification involves solving a problem with only two class labels. This makes it easy to filter the data, apply classification algorithms, and train the model to predict outcomes. Multi-class classification, on the other hand, applies when there are more than two class labels in the training data. The technique enables developers to categorize the test data into multiple class labels.

That said, while binary classification requires only one classifier model, the one used in the multi-class approach depends on the classification technique. Below are the two models of the multi-class classification algorithm.

One-Vs-Rest Classification Model for Multi-Class Classification

Also known as one-vs-all, the one-vs-rest model is a heuristic method that leverages a binary classification algorithm for multi-class classification. The technique involves splitting a multi-class dataset into multiple binary problems. A binary classifier is then trained on each binary problem, and the prediction is made by the most confident classifier.

For instance, in a multi-class classification problem with the classes red, green, and blue, the dataset can be split into the following binary problems:

Problem one: red vs. green/blue

Problem two: blue vs. green/red

Problem three: green vs. blue/red

The main challenge of this approach is that you must create a model for every class. The three classes above require three models, which can be a problem for large datasets with millions of rows, for slow models such as neural networks, and for datasets with a large number of classes.

The one-vs-rest approach requires each individual model to predict a probability-like score. The class with the largest score is then used as the predicted class. As such, it is commonly used with classification algorithms that naturally predict scores or numerical class membership, such as the perceptron and logistic regression.
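The one-vs-rest procedure described above can be sketched in a few lines. This is a minimal illustration, not a production implementation: it assumes a tiny hand-made, linearly separable 2-D dataset, uses a plain perceptron as the binary classifier, and all function names are hypothetical.

```python
def train_perceptron(points, targets, max_epochs=1000):
    """Train a binary perceptron; targets are +1 or -1. Returns (w1, w2, bias)."""
    w1, w2, bias = 0.0, 0.0, 0.0
    for _ in range(max_epochs):
        mistakes = 0
        for (x1, x2), t in zip(points, targets):
            score = w1 * x1 + w2 * x2 + bias
            if t * score <= 0:  # misclassified (or on the boundary): update
                w1 += t * x1
                w2 += t * x2
                bias += t
                mistakes += 1
        if mistakes == 0:       # every training point is classified correctly
            break
    return w1, w2, bias

def train_one_vs_rest(points, labels):
    """Train one binary model per class: that class (+1) vs. all others (-1)."""
    models = {}
    for cls in sorted(set(labels)):
        targets = [1 if lab == cls else -1 for lab in labels]
        models[cls] = train_perceptron(points, targets)
    return models

def predict(models, point):
    """Return the class whose binary model gives the largest score."""
    x1, x2 = point
    scores = {cls: w1 * x1 + w2 * x2 + b for cls, (w1, w2, b) in models.items()}
    return max(scores, key=scores.get)

# Made-up, linearly separable toy data: three clusters, one per class.
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0),   # red
          (5.0, 0.0), (6.0, 0.0), (5.0, 1.0),   # green
          (0.0, 5.0), (1.0, 5.0), (0.0, 6.0)]   # blue
labels = ["red"] * 3 + ["green"] * 3 + ["blue"] * 3

models = train_one_vs_rest(points, labels)
print(len(models))  # 3
print(all(predict(models, p) == lab for p, lab in zip(points, labels)))  # True
```

Three classes produce three binary models, which matches the cost noted above: the number of models grows with the number of classes.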


Question 6

Which of the following rules are correct when using a data science model created via an external function in Snowflake?



Answer : A, B, C, D

From the perspective of a user running a SQL statement, an external function behaves like any other UDF. External functions follow these rules:

External functions return a value.

External functions can accept parameters.

An external function can appear in any clause of a SQL statement in which other types of UDF can appear. For example:

select my_external_function_2(column_1, column_2)
    from table_1;

select col1
    from table_1
    where my_external_function_3(col2) < 0;

create view view1 (col1) as
    select my_external_function_5(col1)
        from table9;

An external function can be part of a more complex expression:

select upper(zipcode_to_city_external_function(zipcode))
    from address_table;

The returned value can be a compound value, such as a VARIANT that contains JSON.

External functions can be overloaded; two different functions can have the same name but different signatures (different numbers or data types of input parameters).


Question 7

Which one of the following is not a valid option for sharing data in Snowflake?



Answer : B

Options for Sharing in Snowflake

You can share data in Snowflake using one of the following options:

* a Listing, in which you offer a share and additional metadata as a data product to one or more accounts,

* a Direct Share, in which you directly share specific database objects (a share) to another account in your region,

* a Data Exchange, in which you set up and manage a group of accounts and offer a share to that group.

