AWS Certified Machine Learning Specialty (MLS-C01)

The AWS Certified Machine Learning Specialty (MLS-C01) were last updated on today.
  • Viewing page 2 out of 57 pages.
  • Viewing questions 6-10 out of 285 questions
Disclaimers:
  • - ExamTopics website is not related to, affiliated with, endorsed or authorized by Amazon.and Azure
  • - Trademarks, certification & product names are used for reference only and belong to Amazon.and Azure

Topic 1 - Exam A

Question #6 Topic 1

A technology startup is using complex deep neural networks and GPU compute to recommend the company's products to its existing customers based upon each customer's habits and interactions. The solution currently pulls each dataset from an Amazon S3 bucket before loading the data into a TensorFlow model pulled from the company's Git repository that runs locally. This job then runs for several hours while continually outputting its progress to the same S3 bucket. The job can be paused, restarted, and continued at any time in the event of a failure, and is run from a central queue. Senior managers are concerned about the complexity of the solution's resource management and the costs involved in repeating the process regularly. They ask for the workload to be automated so it runs once a week, starting Monday and completing by the close of business Friday. Which architecture should be used to scale the solution at the lowest cost?

  • A Implement the solution using AWS Deep Learning Containers and run the container as a job using AWS Batch on a GPU-compatible Spot Instance
  • B Implement the solution using a low-cost GPU-compatible Amazon EC2 instance and use the AWS Instance Scheduler to schedule the task
  • C Implement the solution using AWS Deep Learning Containers, run the workload using AWS Fargate running on Spot Instances, and then schedule the task using the built-in task scheduler
  • D Implement the solution using Amazon ECS running on Spot Instances and schedule the task using the ECS service scheduler
Suggested Answer: C
NOTE:
Question #7 Topic 1

A gaming company has launched an online game where people can start playing for free, but they need to pay if they choose to use certain features. The company needs to build an automated system to predict whether or not a new user will become a paid user within 1 year. The company has gathered a labeled dataset from 1 million users. The training dataset consists of 1,000 positive samples (from users who ended up paying within 1 year) and 999,000 negative samples (from users who did not use any paid features). Each data sample consists of 200 features including user age, device, location, and play patterns. Using this dataset for training, the Data Science team trained a random forest model that converged with over 99% accuracy on the training set. However, the prediction results on a test dataset were not satisfactory Which of the following approaches should the Data Science team take to mitigate this issue? (Choose two.)

  • A Add more deep trees to the random forest to enable the model to learn more features.
  • B Include a copy of the samples in the test dataset in the training dataset.
  • C Generate more positive samples by duplicating the positive samples and adding a small amount of noise to the duplicated data.
  • D Change the cost function so that false negatives have a higher impact on the cost value than false positives.
  • E Change the cost function so that false positives have a higher impact on the cost value than false negatives.
Suggested Answer: CD
NOTE:
Question #8 Topic 1

A retail company is using Amazon Personalize to provide personalized product recommendations for its customers during a marketing campaign. The company sees a significant increase in sales of recommended items to existing customers immediately after deploying a new solution version, but these sales decrease a short time after deployment. Only historical data from before the marketing campaign is available for training. How should a data scientist adjust the solution?

  • A Use the event tracker in Amazon Personalize to include real-time user interactions.
  • B Add user metadata and use the HRNN-Metadata recipe in Amazon Personalize.
  • C Implement a new solution using the built-in factorization machines (FM) algorithm in Amazon SageMaker.
  • D Add event type and event value fields to the interactions dataset in Amazon Personalize.
Suggested Answer: D
NOTE:
Question #9 Topic 1

A Machine Learning Specialist is working with a large company to leverage machine learning within its products. The company wants to group its customers into categories based on which customers will and will not churn within the next 6 months. The company has labeled the data available to the Specialist. Which machine learning model type should the Specialist use to accomplish this task?

  • A Linear regression
  • B Classification
  • C Clustering
  • D Reinforcement learning
Suggested Answer: B
NOTE: The goal of classification is to determine to which class or category a data point (customer in our case) belongs to. For classification problems, data scientists would use historical data with predefined target variables AKA labels (churner/non-churner) ג€" answers that need to be predicted ג€" to train an algorithm. With classification, businesses can answer the following questions: ✑ Will this customer churn or not? ✑ Will a customer renew their subscription? ✑ Will a user downgrade a pricing plan? ✑ Are there any signs of unusual customer behavior? Reference: https://www.kdnuggets.com/2019/05/churn-prediction-machine-learning.html
Question #10 Topic 1

An insurance company is developing a new device for vehicles that uses a camera to observe drivers' behavior and alert them when they appear distracted. The company created approximately 10,000 training images in a controlled environment that a Machine Learning Specialist will use to train and evaluate machine learning models. During the model evaluation, the Specialist notices that the training error rate diminishes faster as the number of epochs increases and the model is not accurately inferring on the unseen test images. Which of the following should be used to resolve this issue? (Choose two.)

  • A Add vanishing gradient to the model.
  • B Perform data augmentation on the training data.
  • C Make the neural network architecture complex.
  • D Use gradient checking in the model.
  • E Add L2 regularization to the model.
Suggested Answer: BE
NOTE: