Which ones are the type of visualization used for Data exploration in Data Science?
Answer : A, D, E
Type of visualization used for exploration:
* Correlation heatmap
* Class distributions by feature
* Two-Dimensional density plots.
All the visualizations are interactive, as is standard for Plotly.
For More details, please refer the below link:
https://towardsdatascience.com/data-exploration-understanding-and-visualization-72657f5eac41
How do you handle missing or corrupted data in a dataset?
Answer : D
Secure Data Sharing do not let you share which of the following selected objects in a database in your account with other Snowflake accounts?
Answer : A
Secure Data Sharing lets you share selected objects in a database in your account with other Snow-flake accounts. You can share the following Snowflake database objects:
Tables
External tables
Secure views
Secure materialized views
Secure UDFs
Snowflake enables the sharing of databases through shares, which are created by data providers and ''imported'' by data consumers.
There are a couple of different types of classification tasks in machine learning, Choose the Correct Classification which best categorized the below Application Tasks in Machine learning?
* To detect whether email is spam or not
* To determine whether or not a patient has a certain disease in medicine.
* To determine whether or not quality specifications were met when it comes to QA (Quality Assurance).
Answer : C
The Supervised Machine Learning algorithm can be broadly classified into Regression and Classification Algorithms. In Regression algorithms, we have predicted the output for continuous values, but to predict the categorical values, we need Classification algorithms.
What is the Classification Algorithm?
The Classification algorithm is a Supervised Learning technique that is used to identify the category of new observations on the basis of training data. In Classification, a program learns from the given dataset or observations and then classifies new observation into a number of classes or groups. Such as, Yes or No, 0 or 1, Spam or Not Spam, cat or dog, etc. Classes can be called as targets/labels or categories.
Unlike regression, the output variable of Classification is a category, not a value, such as 'Green or Blue', 'fruit or animal', etc. Since the Classification algorithm is a Supervised learning technique, hence it takes labeled input data, which means it contains input with the corresponding output.
In classification algorithm, a discrete output function(y) is mapped to input variable(x).
y=f(x), where y = categorical output
The best example of an ML classification algorithm is Email Spam Detector.
The main goal of the Classification algorithm is to identify the category of a given dataset, and these algorithms are mainly used to predict the output for the categorical data.
The algorithm which implements the classification on a dataset is known as a classifier. There are two types of Classifications:
Binary Classifier: If the classification problem has only two possible outcomes, then it is called as Binary Classifier.
Examples: YES or NO, MALE or FEMALE, SPAM or NOT SPAM, CAT or DOG, etc.
Multi-class Classifier: If a classification problem has more than two outcomes, then it is called as Multi-class Classifier.
Example: Classifications of types of crops, Classification of types of music.
Binary classification in deep learning refers to the type of classification where we have two class labels -- one normal and one abnormal. Some examples of binary classification use:
* To detect whether email is spam or not
* To determine whether or not a patient has a certain disease in medicine.
* To determine whether or not quality specifications were met when it comes to QA (Quality Assurance).
For example, the normal class label would be that a patient has the disease, and the abnormal class label would be that they do not, or vice-versa.
As is with every other type of classification, it is only as good as the binary classification dataset that it has -- or, in other words, the more training and data it has, the better it is.
Mark the incorrect statement regarding usage of Snowflake Stream & Tasks?
Answer : D
All are correct except a standard-only stream tracks row inserts only.
A standard (i.e. delta) stream tracks all DML changes to the source object, including inserts, up-dates, and deletes (including table truncates).
What Can Snowflake Data Scientist do in the Snowflake Marketplace as Provider?
Answer : A, B, C, D
All are correct!
About the Snowflake Marketplace
You can use the Snowflake Marketplace to discover and access third-party data and services, as well as market your own data products across the Snowflake Data Cloud.
As a data provider, you can use listings on the Snowflake Marketplace to share curated data offer-ings with many consumers simultaneously, rather than maintain sharing relationships with each indi-vidual consumer. With Paid Listings, you can also charge for your data products.
As a consumer, you might use the data provided on the Snowflake Marketplace to explore and ac-cess the following:
Historical data for research, forecasting, and machine learning.
Up-to-date streaming data, such as current weather and traffic conditions.
Specialized identity data for understanding subscribers and audience targets.
New insights from unexpected sources of data.
The Snowflake Marketplace is available globally to all non-VPS Snowflake accounts hosted on Amazon Web Services, Google Cloud Platform, and Microsoft Azure, with the exception of Mi-crosoft Azure Government. Support for Microsoft Azure Government is planned.
Mark the correct steps for saving the contents of a DataFrame to a Snowflake table as part of Moving Data from Spark to Snowflake?
Answer : C
Moving Data from Spark to Snowflake
The steps for saving the contents of a DataFrame to a Snowflake table are similar to writing from Snowflake to Spark:
1. Use the write() method of the DataFrame to construct a DataFrameWriter.
2. Specify SNOWFLAKE_SOURCE_NAME using the format() method.
3. Specify the connector options using either the option() or options() method.
4. Use the dbtable option to specify the table to which data is written.
5. Use the mode() method to specify the save mode for the content.
Examples
1. df.write
2. .format(SNOWFLAKE_SOURCE_NAME)
3. .options(sfOptions)
4. .option('dbtable', 't2')
5. .mode(SaveMode.Overwrite)
6. .save()