AWS Certified Data Analytics - Specialty (DAS-C01)

Disclaimers:
  • The ExamTopics website is not related to, affiliated with, endorsed by, or authorized by Amazon.
  • Trademarks, certification, and product names are used for reference only and belong to Amazon.

Topic 1 - Exam A

Question #1 Topic 1

A mortgage company has a microservice for accepting payments. This microservice uses the Amazon DynamoDB encryption client with AWS KMS managed keys to encrypt the sensitive data before writing the data to DynamoDB. The finance team should be able to load this data into Amazon Redshift and aggregate the values within the sensitive fields. The Amazon Redshift cluster is shared with other data analysts from different business units. Which steps should a data analyst take to accomplish this task efficiently and securely?

  • A Create an AWS Lambda function to process the DynamoDB stream. Decrypt the sensitive data using the same KMS key. Save the output to a restricted S3 bucket for the finance team. Create a finance table in Amazon Redshift that is accessible to the finance team only. Use the COPY command to load the data from Amazon S3 to the finance table.
  • B Create an AWS Lambda function to process the DynamoDB stream. Save the output to a restricted S3 bucket for the finance team. Create a finance table in Amazon Redshift that is accessible to the finance team only. Use the COPY command with the IAM role that has access to the KMS key to load the data from S3 to the finance table.
  • C Create an Amazon EMR cluster with an EMR_EC2_DefaultRole role that has access to the KMS key. Create Apache Hive tables that reference the data stored in DynamoDB and the finance table in Amazon Redshift. In Hive, select the data from DynamoDB and then insert the output to the finance table in Amazon Redshift.
  • D Create an Amazon EMR cluster. Create Apache Hive tables that reference the data stored in DynamoDB. Insert the output to the restricted Amazon S3 bucket for the finance team. Use the COPY command with the IAM role that has access to the KMS key to load the data from Amazon S3 to the finance table in Amazon Redshift.
Suggested Answer: A
NOTE: Because the payments microservice encrypts the sensitive fields client-side with the DynamoDB Encryption Client, the ciphertext stored in DynamoDB cannot be decrypted by a Redshift COPY command, even with an IAM role that can use the KMS key. The data analyst should create an AWS Lambda function that processes the DynamoDB stream, decrypts the sensitive data with the same KMS key, and saves the output to an S3 bucket restricted to the finance team. A finance table accessible only to the finance team is then created in Amazon Redshift, and the COPY command loads the data from Amazon S3 into that table.
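For reference, below is a minimal sketch of the Lambda piece of option A. The bucket name, object key layout, and the decrypt_item helper (standing in for the DynamoDB Encryption Client configured with the same KMS key as the payments microservice) are hypothetical.

```python
import json
import boto3

s3 = boto3.client("s3")

# Assumed restricted bucket for the finance team.
FINANCE_BUCKET = "finance-restricted-bucket"

def decrypt_item(new_image):
    """Hypothetical helper: in practice this would wrap the DynamoDB
    Encryption Client configured with the same KMS key used by the
    payments microservice."""
    raise NotImplementedError("decrypt the item attributes here")

def lambda_handler(event, context):
    """Triggered by the DynamoDB stream: decrypt new or updated payment
    items and stage them as JSON lines in the restricted bucket."""
    rows = []
    for record in event["Records"]:
        if record["eventName"] in ("INSERT", "MODIFY"):
            plaintext = decrypt_item(record["dynamodb"]["NewImage"])
            rows.append(json.dumps(plaintext))
    if rows:
        key = f"payments/{context.aws_request_id}.json"
        s3.put_object(Bucket=FINANCE_BUCKET, Key=key, Body="\n".join(rows))
    return {"staged_records": len(rows)}
```

The finance-only table in Amazon Redshift would then be populated with a COPY command pointing at that bucket, with access restricted through GRANT to the finance team's group.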
Question #2 Topic 1

A company has developed an Apache Hive script to batch process data stored in Amazon S3. The script needs to run once every day and store the output in Amazon S3. The company tested the script, and it completes within 30 minutes on a small local three-node cluster. Which solution is the MOST cost-effective for scheduling and executing the script?

  • A Create an AWS Lambda function to spin up an Amazon EMR cluster with a Hive execution step. Set KeepJobFlowAliveWhenNoSteps to false and disable the termination protection flag. Use Amazon CloudWatch Events to schedule the Lambda function to run daily.
  • B Use the AWS Management Console to spin up an Amazon EMR cluster with Python, Hue, Hive, and Apache Oozie. Set the termination protection flag to true and use Spot Instances for the core nodes of the cluster. Configure an Oozie workflow in the cluster to invoke the Hive script daily.
  • C Create an AWS Glue job with the Hive script to perform the batch operation. Configure the job to run once a day using a time-based schedule.
  • D Use AWS Lambda layers and load the Hive runtime to AWS Lambda and copy the Hive script. Schedule the Lambda function to run daily by creating a workflow using AWS Step Functions.
Suggested Answer: A
NOTE: Option A is the most cost-effective. The Lambda function launches a transient EMR cluster with a single Hive step; because KeepJobFlowAliveWhenNoSteps is set to false and termination protection is disabled, the cluster terminates itself as soon as the step completes, so the company pays for only about 30 minutes of EMR capacity per day. A CloudWatch Events rule triggers the Lambda function on a daily schedule. Keeping a termination-protected cluster running for Oozie (option B) incurs cost around the clock, AWS Glue jobs run Apache Spark rather than Hive scripts (option C), and Lambda's 15-minute execution limit makes hosting the 30-minute Hive job in Lambda impractical (option D).
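As a rough illustration of option A, the Lambda function could launch the transient cluster with boto3 as sketched below; the release label, instance types and counts, S3 paths, and log location are assumptions, not values from the question.

```python
import boto3

emr = boto3.client("emr")

# Hypothetical location of the Hive script.
HIVE_SCRIPT = "s3://example-bucket/scripts/daily-batch.q"

def lambda_handler(event, context):
    """Launch a transient EMR cluster that runs the Hive script once and
    then terminates itself, so no idle cluster is left running."""
    response = emr.run_job_flow(
        Name="daily-hive-batch",
        ReleaseLabel="emr-6.10.0",               # assumed release
        Applications=[{"Name": "Hive"}],
        LogUri="s3://example-bucket/emr-logs/",  # assumed log bucket
        Instances={
            "InstanceGroups": [
                {"InstanceRole": "MASTER", "InstanceType": "m5.xlarge", "InstanceCount": 1},
                {"InstanceRole": "CORE", "InstanceType": "m5.xlarge", "InstanceCount": 2},
            ],
            # The cluster shuts down as soon as the last step finishes.
            "KeepJobFlowAliveWhenNoSteps": False,
            "TerminationProtected": False,
        },
        Steps=[{
            "Name": "run-hive-script",
            "ActionOnFailure": "TERMINATE_CLUSTER",
            "HadoopJarStep": {
                "Jar": "command-runner.jar",
                "Args": ["hive-script", "--run-hive-script",
                         "--args", "-f", HIVE_SCRIPT],
            },
        }],
        JobFlowRole="EMR_EC2_DefaultRole",
        ServiceRole="EMR_DefaultRole",
    )
    return {"cluster_id": response["JobFlowId"]}
```

A CloudWatch Events (EventBridge) rule with a daily cron expression would then invoke this function.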
Question #3 Topic 1

A company analyzes its data in an Amazon Redshift data warehouse, which currently has a cluster of three dense storage nodes. Due to a recent business acquisition, the company needs to load an additional 4 TB of user data into Amazon Redshift. The engineering team will combine all the user data and apply complex calculations that require I/O intensive resources. The company needs to adjust the cluster's capacity to support the change in analytical and storage requirements. Which solution meets these requirements?

  • A Resize the cluster using elastic resize with dense compute nodes.
  • B Resize the cluster using classic resize with dense compute nodes.
  • C Resize the cluster using elastic resize with dense storage nodes.
  • D Resize the cluster using classic resize with dense storage nodes.
Suggested Answer: C
NOTE: An elastic resize that adds dense storage nodes addresses both requirements: the additional nodes provide the storage capacity for the extra 4 TB of user data and raise the cluster's aggregate I/O and compute in parallel. An elastic resize also completes in minutes with only a brief interruption, whereas a classic resize can leave the cluster in read-only mode for much longer.
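As a sketch of how the resize in option C might be requested with boto3 (the cluster identifier, node type, and target node count are placeholders):

```python
import boto3

redshift = boto3.client("redshift")

# Elastic resize: stay in the dense storage (ds2) family but add nodes,
# which increases both storage capacity and aggregate I/O.
redshift.resize_cluster(
    ClusterIdentifier="analytics-cluster",  # placeholder identifier
    NodeType="ds2.xlarge",
    NumberOfNodes=6,                        # placeholder target size
    Classic=False,                          # False requests an elastic resize
)
```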
Question #4 Topic 1

A telecommunications company is looking for an anomaly-detection solution to identify fraudulent calls. The company currently uses Amazon Kinesis to stream voice call records in a JSON format from its on-premises database to Amazon S3. The existing dataset contains voice call records with 200 columns. To detect fraudulent calls, the solution would need to look at 5 of these columns only. The company is interested in a cost-effective solution using AWS that requires minimal effort and experience in anomaly-detection algorithms. Which solution meets these requirements?

  • A Use an AWS Glue job to transform the data from JSON to Apache Parquet. Use AWS Glue crawlers to discover the schema and build the AWS Glue Data Catalog. Use Amazon Athena to create a table with a subset of columns. Use Amazon QuickSight to visualize the data and then use Amazon QuickSight machine learning-powered anomaly detection.
  • B Use Kinesis Data Firehose to detect anomalies on a data stream from Kinesis by running SQL queries, which compute an anomaly score for all calls and store the output in Amazon RDS. Use Amazon Athena to build a dataset and Amazon QuickSight to visualize the results.
  • C Use an AWS Glue job to transform the data from JSON to Apache Parquet. Use AWS Glue crawlers to discover the schema and build the AWS Glue Data Catalog. Use Amazon SageMaker to build an anomaly detection model that can detect fraudulent calls by ingesting data from Amazon S3.
  • D Use Kinesis Data Analytics to detect anomalies on a data stream from Kinesis by running SQL queries, which compute an anomaly score for all calls. Connect Amazon QuickSight to Kinesis Data Analytics to visualize the anomaly scores.
Suggested Answer: C
NOTE: Option C uses an AWS Glue job to convert the JSON records to Apache Parquet and AWS Glue crawlers to populate the Data Catalog, then trains an anomaly detection model in Amazon SageMaker (for example, with the built-in Random Cut Forest algorithm) on the data in Amazon S3. Because the data is prepared with managed Glue components and the model can rely on a built-in algorithm, the solution remains cost-effective and requires relatively little hands-on experience with anomaly-detection algorithms.
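The Glue transformation shared by options A and C, converting the JSON call records to Parquet while keeping only the five columns the detector needs, might look roughly like the sketch below; the S3 paths and column names are placeholders.

```python
import sys
from awsglue.utils import getResolvedOptions
from awsglue.context import GlueContext
from awsglue.job import Job
from pyspark.context import SparkContext

# Standard AWS Glue job boilerplate.
args = getResolvedOptions(sys.argv, ["JOB_NAME"])
glue_context = GlueContext(SparkContext.getOrCreate())
spark = glue_context.spark_session
job = Job(glue_context)
job.init(args["JOB_NAME"], args)

# Placeholder locations and column names.
SOURCE = "s3://example-bucket/call-records-json/"
TARGET = "s3://example-bucket/call-records-parquet/"
COLUMNS = ["caller_id", "callee_id", "call_start", "duration", "charge"]

# Read the wide JSON records, keep only the columns needed for anomaly
# detection, and write them back to S3 as Parquet.
(spark.read.json(SOURCE)
      .select(*COLUMNS)
      .write.mode("overwrite")
      .parquet(TARGET))

job.commit()
```

A crawler over the Parquet output then keeps the Data Catalog schema current for the downstream anomaly-detection step.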
Question #5 Topic 1

A company uses the Amazon Kinesis SDK to write data to Kinesis Data Streams. Compliance requirements state that the data must be encrypted at rest using a key that can be rotated. The company wants to meet this encryption requirement with minimal coding effort. How can these requirements be met?

  • A Create a customer master key (CMK) in AWS KMS. Assign the CMK an alias. Use the AWS Encryption SDK, providing it with the key alias to encrypt and decrypt the data.
  • B Create a customer master key (CMK) in AWS KMS. Assign the CMK an alias. Enable server-side encryption on the Kinesis data stream using the CMK alias as the KMS master key.
  • C Create a customer master key (CMK) in AWS KMS. Create an AWS Lambda function to encrypt and decrypt the data. Set the KMS key ID in the function's environment variables.
  • D Enable server-side encryption on the Kinesis data stream using the default KMS key for Kinesis Data Streams.
Suggested Answer: B
NOTE: Creating a customer master key (CMK) in AWS KMS, assigning it an alias, and enabling server-side encryption on the Kinesis data stream with that alias meets both requirements: encryption is applied by Kinesis Data Streams itself, so no producer or consumer code changes are needed, and a customer managed CMK supports key rotation. Options A and C require writing and maintaining encryption code, and option D uses the AWS managed key, whose rotation the customer cannot control.
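Enabling server-side encryption on the stream, as option B describes, is a one-time configuration call rather than application code; a minimal sketch with a placeholder stream name and key alias:

```python
import boto3

kinesis = boto3.client("kinesis")

# Placeholder stream name and customer managed CMK alias; key rotation is
# managed on the CMK itself in AWS KMS.
kinesis.start_stream_encryption(
    StreamName="example-data-stream",
    EncryptionType="KMS",
    KeyId="alias/example-cmk",
)
```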