Amazon DAS-C01 AWS Certified Data Analytics - Specialty Exam Practice Test

Page: 1 / 14
Total 207 questions
Question 1

A company wants to ingest clickstream data from its website into an Amazon S3 bucket. The streaming data is in JSON format. The data in the S3 bucket must be partitioned by product_id.

Which solution will meet these requirements MOST cost-effectively?



Answer : A


Question 2

A network administrator needs to create a dashboard to visualize continuous network patterns over time in a company's AWS account. Currently, the company has VPC Flow Logs enabled and is publishing this data to Amazon CloudWatch Logs. To troubleshoot networking issues quickly, the dashboard needs to display the new data in near-real time.

Which solution meets these requirements?



Answer : D


Question 3

A company ingests a large set of sensor data in nested JSON format from different sources and stores it in an Amazon S3 bucket. The sensor data must be joined with performance data currently stored in an Amazon Redshift cluster.

A business analyst with basic SQL skills must build dashboards and analyze this data in Amazon QuickSight. A data engineer needs to build a solution to prepare the data for use by the business analyst. The data engineer does not know the structure of the JSON file. The company requires a solution with the least possible implementation effort.

Which combination of steps will create a solution that meets these requirements? (Select THREE.)



Answer : B, D, F


Question 4

A financial services firm is processing a stream of real-time data from an application by using Apache Kafka and Kafka MirrorMaker. These tools run on premises and stream data to Amazon Managed Streaming for Apache Kafka (Amazon MSK) in the us-east-1 Region. An Apache Flink consumer running on Amazon EMR enriches the data in real time and transfers the output files to an Amazon S3 bucket. The company wants to ensure that the streaming application is highly available across AWS Regions with an RTO of less than 2 minutes.

Which solution meets these requirements?



Answer : A


Question 5

A company uses Amazon EC2 instances to receive files from external vendors throughout each day. At the end of each day, the EC2 instances combine the files into a single file, perform gzip compression, and upload the single file to an Amazon S3 bucket. The total size of all the files is approximately 100 GB each day.

When the files are uploaded to Amazon S3, an AWS Batch job runs a COPY command to load the files into an Amazon Redshift cluster.

Which solution will MOST accelerate the COPY process?



Answer : B


Question 6

A machinery company wants to collect data from sensors. A data analytics specialist needs to implement a solution that aggregates the data in near-real time and saves the data to a persistent data store. The data must be stored in nested JSON format and must be queried from the data store with a latency of single-digit milliseconds.

Which solution will meet these requirements?



Question 7

An event ticketing website has a data lake on Amazon S3 and a data warehouse on Amazon Redshift. Two datasets exist: events data and sales data. Each dataset has millions of records.

The entire events dataset is frequently accessed and is stored in Amazon Redshift. However, only the last 6 months of sales data is frequently accessed and is stored in Amazon Redshift. The rest of the sales data is available only in Amazon S3.

A data analytics specialist must create a report that shows the total revenue that each event has generated in the last 12 months. The report will be accessed thousands of times each week.

Which solution will meet these requirements with the LEAST operational effort?



Answer : D

This solution meets the requirements because:

A materialized view is a database object that contains the results of a query. It can be used to improve query performance and reduce data processing costs by caching the query results and refreshing them periodically [1].

The autorefresh option enables Amazon Redshift to automatically refresh materialized views with up-to-date data from its base tables when materialized views are created with or altered to have this option. Amazon Redshift autorefreshes materialized views as soon as possible after base tables change [2].

Amazon Redshift Spectrum enables you to use your existing Amazon Redshift SQL queries to analyze data that is stored in Amazon S3. You can create external tables in your Amazon Redshift cluster and join them with other tables, including materialized views [3].

By creating a materialized view in Amazon Redshift with the autorefresh option, the data analytics specialist can precompute and cache the report query results and keep them updated automatically. This can improve the report performance and reduce the load on the Amazon Redshift cluster.

By using Amazon Redshift Spectrum to include sales data that is older than 6 months, the data analytics specialist can access the data that is stored in Amazon S3 without loading it into Amazon Redshift. This can reduce the storage costs and avoid data duplication.
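
The shape of the report query described above can be sketched locally. This is only an illustration under stated assumptions: an in-memory SQLite database stands in for Amazon Redshift, and the table names (events, sales_recent, sales_archive), the sample rows, and the fixed cutoff date are all hypothetical. In Redshift, sales_archive would correspond to an external table queried through Redshift Spectrum, and the SELECT would be the query that the materialized view precomputes.

```python
import sqlite3

# SQLite is only a local stand-in for Amazon Redshift in this sketch.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE events (event_id INTEGER PRIMARY KEY, event_name TEXT);
-- Stand-in for the sales table kept in Redshift (last 6 months).
CREATE TABLE sales_recent (event_id INTEGER, sale_date TEXT, amount REAL);
-- Stand-in for the Spectrum external table over older sales data in S3.
CREATE TABLE sales_archive (event_id INTEGER, sale_date TEXT, amount REAL);
""")

conn.executemany("INSERT INTO events VALUES (?, ?)",
                 [(1, "Concert"), (2, "Festival")])
conn.executemany("INSERT INTO sales_recent VALUES (?, ?, ?)",
                 [(1, "2023-10-05", 100.0), (2, "2023-11-20", 50.0)])
conn.executemany("INSERT INTO sales_archive VALUES (?, ?, ?)",
                 [(1, "2023-03-15", 40.0),    # within the 12-month window
                  (2, "2022-06-01", 999.0)])  # older than 12 months, excluded

REPORT_SQL = """
SELECT e.event_name, SUM(s.amount) AS total_revenue
FROM events AS e
JOIN (
    SELECT event_id, sale_date, amount FROM sales_recent
    UNION ALL
    SELECT event_id, sale_date, amount FROM sales_archive
) AS s ON s.event_id = e.event_id
WHERE s.sale_date >= :cutoff
GROUP BY e.event_name
ORDER BY e.event_name
"""


def revenue_by_event(connection, cutoff):
    """Return (event_name, total_revenue) rows for sales on or after cutoff."""
    return connection.execute(REPORT_SQL, {"cutoff": cutoff}).fetchall()


# Assume the report is run on 2024-01-01, so the 12-month cutoff is 2023-01-01.
report = revenue_by_event(conn, "2023-01-01")
print(report)  # [('Concert', 140.0), ('Festival', 50.0)]
```

The UNION ALL combines the recent sales with the archived sales before the join, so the 12-month total covers both storage tiers without copying the archived rows into the warehouse.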

