DP-203: Data Engineering on Microsoft Azure Exam Practice Test

Page: 1 / 14
Total 354 questions
Question 1

You have an enterprise data warehouse in Azure Synapse Analytics.

You need to monitor the data warehouse to identify whether you must scale up to a higher service level to accommodate the current workloads.

Which is the best metric to monitor?

More than one answer choice may achieve the goal. Select the BEST answer.



Answer : C


Question 2

You have an Azure subscription that contains an Azure Data Lake Storage account named myaccount1. The myaccount1 account contains two containers named container1 and container2. The subscription is linked to an Azure Active Directory (Azure AD) tenant that contains a security group named Group1.

You need to grant Group1 read access to container1. The solution must use the principle of least privilege. Which role should you assign to Group1?



Answer : A


Question 3

You have an Azure Data Lake Storage account that has a virtual network service endpoint configured.

You plan to use Azure Data Factory to extract data from the Data Lake Storage account. The data will then be loaded to a data warehouse in Azure Synapse Analytics by using PolyBase.

Which authentication method should you use to access Data Lake Storage?



Question 4

You have an Azure Synapse Analytics dedicated SQL pool.

You need to ensure that data in the pool is encrypted at rest. The solution must NOT require modifying applications that query the data.

What should you do?



Answer : B

Transparent Data Encryption (TDE) helps protect against the threat of malicious activity by encrypting and decrypting your data at rest. When you encrypt your database, associated backups and transaction log files are encrypted without requiring any changes to your applications. TDE encrypts the storage of an entire database by using a symmetric key called the database encryption key.
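As a sketch, TDE can be enabled on a dedicated SQL pool with a single T-SQL statement run against the master database; the database name mySampleDataWarehouse below is a placeholder, not part of the question.

```sql
-- Run against the master database of the logical server.
-- 'mySampleDataWarehouse' is an assumed database name.
ALTER DATABASE mySampleDataWarehouse SET ENCRYPTION ON;

-- Verify the encryption state (is_encrypted = 1 when TDE is on).
SELECT name, is_encrypted
FROM sys.databases
WHERE name = 'mySampleDataWarehouse';
```

Because encryption and decryption happen transparently at the storage layer, querying applications need no changes, which is why TDE satisfies the question's constraint.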


https://docs.microsoft.com/en-us/azure/synapse-analytics/sql-data-warehouse/sql-data-warehouse-overviewmanage-security

Question 5

You have an Azure Databricks workspace that contains a Delta Lake dimension table named Table1. Table1 is a Type 2 slowly changing dimension (SCD) table. You need to apply updates from a source table to Table1. Which Apache Spark SQL operation should you use?



Answer : C

Delta Lake can infer the schema of incoming data, which reduces the effort required to manage schema changes. A Type 2 slowly changing dimension (SCD) records every change made to each key in the dimension table. Applying updates requires marking the existing rows for a key as expired (preserving the previous values) and inserting new rows that carry the latest values. Given a source table with the updates and a target table with the dimensional data, SCD Type 2 can be expressed with the merge operation.

Example:

// Implementing an SCD Type 2 update by using the Delta Lake merge function
customersTable
  .as("customers")
  .merge(
    stagedUpdates.as("staged_updates"),
    "customers.customerId = mergeKey")
  .whenMatched("customers.current = true AND customers.address <> staged_updates.address")
  .updateExpr(Map(
    "current" -> "false",
    "endDate" -> "staged_updates.effectiveDate"))
  .whenNotMatched()
  .insertExpr(Map(
    "customerId" -> "staged_updates.customerId",
    "address" -> "staged_updates.address",
    "current" -> "true",
    "effectiveDate" -> "staged_updates.effectiveDate",
    "endDate" -> "null"))
  .execute()


https://www.projectpro.io/recipes/what-is-slowly-changing-data-scd-type-2-operation-delta-table-databricks
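The Type 2 mechanics of that merge can be sketched in plain Python, with no Spark required. The function, record layout, and sample data below are illustrative, not from the exam: matched rows with a changed address are expired, and a new current row is inserted.

```python
from datetime import date

# In-memory sketch of an SCD Type 2 merge (illustrative, not Spark).
# Each row: customerId, address, current, effectiveDate, endDate.
def scd2_merge(customers, staged_updates):
    result = list(customers)
    for upd in staged_updates:
        for row in result:
            # "whenMatched": expire the current row if the address changed.
            if (row["customerId"] == upd["customerId"]
                    and row["current"]
                    and row["address"] != upd["address"]):
                row["current"] = False
                row["endDate"] = upd["effectiveDate"]
        # "whenNotMatched": insert a new current row for the latest values.
        if not any(r["customerId"] == upd["customerId"] and r["current"]
                   and r["address"] == upd["address"] for r in result):
            result.append({
                "customerId": upd["customerId"],
                "address": upd["address"],
                "current": True,
                "effectiveDate": upd["effectiveDate"],
                "endDate": None,
            })
    return result

customers = [{"customerId": 1, "address": "old st", "current": True,
              "effectiveDate": date(2020, 1, 1), "endDate": None}]
updates = [{"customerId": 1, "address": "new st",
            "effectiveDate": date(2021, 6, 1)}]
merged = scd2_merge(customers, updates)
```

After the merge, the table holds both the expired row (with its endDate set) and the new current row, which is exactly the history-preserving behavior that distinguishes Type 2 from a plain UPDATE.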

Question 6

You are designing the folder structure for an Azure Data Lake Storage Gen2 account.

You identify the following usage patterns:

* Users will query data by using Azure Synapse Analytics serverless SQL pools and Azure Synapse Analytics serverless Apache Spark pools.

* Most queries will include a filter on the current year or week.

* Data will be secured by data source.

You need to recommend a folder structure that meets the following requirements:

* Supports the usage patterns

* Simplifies folder security

* Minimizes query times

Which folder structure should you recommend?

Answer choices A through E (folder-structure exhibits, not reproduced here)



Answer : C

Data will be secured by data source -> use the data source as the top-level folder.

Most queries will include a filter on the current year or week -> use /YYYY/WW/ as subfolders.

Common Use Cases

A common use case is to filter data stored in a date (and possibly time) folder structure such as /YYYY/MM/DD/ or /YYYY/MM/YYYY-MM-DD/. As new data is generated/sent/copied/moved to the storage account, a new folder is created for each specific time period. This strategy organises data into a maintainable folder structure.
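The layout above can be sketched as a small path builder; the data source name, the ISO week-numbering choice, and the function itself are illustrative assumptions, not part of the question.

```python
from datetime import date

# Sketch of the recommended layout: /{DataSource}/{YYYY}/{WW}/
# The data source name and ISO week numbering are assumptions.
def folder_path(data_source: str, d: date) -> str:
    year, week, _ = d.isocalendar()  # ISO year and week number
    return f"/{data_source}/{year}/{week:02d}/"

path = folder_path("SalesDB", date(2023, 1, 19))  # "/SalesDB/2023/03/"
```

With this layout, a query filtered on the current week only has to enumerate a single folder, and access control can be applied once at the /SalesDB/ level, which satisfies both the security and query-time requirements.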


Question 7

You are designing an anomaly detection solution for streaming data from an Azure IoT hub. The solution must meet the following requirements:

* Send the output to Azure Synapse.

* Identify spikes and dips in time series data.

* Minimize development and configuration effort.

What should you include in the solution?



Answer : B

You can identify anomalies by routing data via IoT Hub to a built-in ML model in Azure Stream Analytics.
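As a sketch of the low-effort approach, Stream Analytics exposes a built-in AnomalyDetection_SpikeAndDip function that can be applied directly in the query; the input/output names and the temperature column below are assumed, not from the question.

```sql
-- Illustrative Stream Analytics query; '[iothub-input]', '[synapse-output]'
-- and the 'temperature' column are assumed names.
WITH AnomalyDetectionStep AS (
    SELECT
        EventEnqueuedUtcTime AS time,
        CAST(temperature AS float) AS temp,
        AnomalyDetection_SpikeAndDip(CAST(temperature AS float), 95, 120, 'spikesanddips')
            OVER (LIMIT DURATION(second, 120)) AS SpikeAndDipScores
    FROM [iothub-input]
)
SELECT
    time,
    temp,
    CAST(GetRecordPropertyValue(SpikeAndDipScores, 'Score') AS float) AS SpikeAndDipScore,
    CAST(GetRecordPropertyValue(SpikeAndDipScores, 'IsAnomaly') AS bigint) AS IsSpikeAndDipAnomaly
INTO [synapse-output]
FROM AnomalyDetectionStep
```

Because the model is built into the query language and the Synapse output is a native sink, this meets all three requirements without custom ML development.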


https://docs.microsoft.com/en-us/learn/modules/data-anomaly-detection-using-azure-iot-hub/
