Data-Engineer-Associate dumps
Customer Rating & Feedback: 5 Stars
98% of exam questions came exactly from these dumps
Exam Overview

Amazon Data-Engineer-Associate Question Answers

AWS Certified Data Engineer - Associate (DEA-C01) Dumps November 2024

Are you tired of looking for a source that keeps you updated on the AWS Certified Data Engineer - Associate (DEA-C01) Exam and offers a collection of affordable, high-quality, and incredibly easy Amazon Data-Engineer-Associate Practice Questions? Then you are in luck, because Salesforcexamdumps.com just updated them. Get ready to become an AWS Certified Data Engineer.

PDF: $24 (was $120)
Test Engine: $34 (was $170)
PDF + Test Engine: $42 (was $210)

Here are the available features of the Amazon Data-Engineer-Associate PDF:

130 questions with answers (updated 08 Nov, 2024)
1 day of study required to pass the exam
100% passing assurance
100% money-back guarantee
Free updates for 3 months
Last 24 Hours Results

96 students passed
92% average marks
94% of questions came from the dumps
4,205 total happy clients

What is Amazon Data-Engineer-Associate?

Amazon Data-Engineer-Associate is the exam you must pass to earn the certification, and the certification rewards candidates who deliver strong results. The AWS Certified Data Engineer credential validates a candidate's expertise in working with data on AWS. In this fast-paced world, a certification is the quickest way to earn your employer's recognition. Take on the AWS Certified Data Engineer - Associate (DEA-C01) Exam and become a certified professional today. Salesforcexamdumps.com is always eager to lend a helping hand by providing approved and accepted Amazon Data-Engineer-Associate Practice Questions. Passing AWS Certified Data Engineer - Associate (DEA-C01) can be your ticket to a better future!

Pass with Amazon Data-Engineer-Associate Braindumps!

Contrary to the belief that certification exams are generally hard to get through, passing AWS Certified Data Engineer - Associate (DEA-C01) is incredibly easy, provided you have access to a reliable resource such as the Salesforcexamdumps.com Amazon Data-Engineer-Associate PDF. We have been in this business long enough to understand where most resources go wrong. Passing the Amazon AWS Certified Data Engineer certification is all about having the right information, so we filled our Amazon Data-Engineer-Associate Dumps with all the data you need to pass. These carefully curated sets of AWS Certified Data Engineer - Associate (DEA-C01) Practice Questions target the most frequently repeated exam questions, so you know they are essential and can deliver passing results. Stop waiting around and order your set of Amazon Data-Engineer-Associate Braindumps now!

We aim to provide all AWS Certified Data Engineer certification candidates with the best resources at minimum rates. You can check out our free demo before downloading to make sure the Amazon Data-Engineer-Associate Practice Questions are what you want. And do not forget about the discount; we always give our customers a little extra.

Why Choose Amazon Data-Engineer-Associate PDF?

Unlike other websites, Salesforcexamdumps.com prioritizes the needs of AWS Certified Data Engineer - Associate (DEA-C01) candidates. Not every Amazon exam candidate has full-time access to the internet, and it is hard to sit in front of a computer screen for too many hours. Are you one of them? We understand, which is why our AWS Certified Data Engineer solution, Amazon Data-Engineer-Associate Question Answers, comes in two formats: PDF and Online Test Engine. The Test Engine is for customers who prefer an online platform with realistic exam simulation; the PDF is for those who like to keep their material close at hand. You can download or print the Amazon Data-Engineer-Associate Dumps with ease.

If you still have questions, our team of experts is available 24/7 to answer them. Just leave us a quick message in the chat box below or email us at [email protected].

Amazon Data-Engineer-Associate Sample Questions

Question # 1

A data engineer needs Amazon Athena queries to finish faster. The data engineer notices that all the files the Athena queries use are currently stored in uncompressed .csv format. The data engineer also notices that users perform most queries by selecting a specific column. Which solution will MOST speed up the Athena query performance? 

A. Change the data format from .csv to JSON format. Apply Snappy compression.
B. Compress the .csv files by using Snappy compression.
C. Change the data format from .csv to Apache Parquet. Apply Snappy compression.
D. Compress the .csv files by using gzip compression.
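For readers who want to see what a columnar conversion like the one named in option C can look like in practice, here is a minimal sketch that runs an Athena CTAS statement through boto3. The database, table, and bucket names are illustrative placeholders, not values from the question.

import boto3

# Minimal sketch: convert a CSV-backed Athena table to Apache Parquet with
# Snappy compression by running a CTAS statement. "analytics_db",
# "raw_csv_table", and the S3 locations are hypothetical names.
athena = boto3.client("athena")

ctas_query = """
CREATE TABLE analytics_db.sales_parquet
WITH (
    format = 'PARQUET',
    parquet_compression = 'SNAPPY',
    external_location = 's3://example-bucket/curated/sales_parquet/'
) AS
SELECT *
FROM analytics_db.raw_csv_table
"""

response = athena.start_query_execution(
    QueryString=ctas_query,
    QueryExecutionContext={"Database": "analytics_db"},
    ResultConfiguration={"OutputLocation": "s3://example-bucket/athena-results/"},
)
print("Started CTAS query:", response["QueryExecutionId"])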


Question # 2

A company stores data in a data lake that is in Amazon S3. Some data that the company stores in the data lake contains personally identifiable information (PII). Multiple user groups need to access the raw data. The company must ensure that user groups can access only the PII that they require. Which solution will meet these requirements with the LEAST effort?

A. Use Amazon Athena to query the data. Set up AWS Lake Formation and create data filters to establish levels of access for the company's IAM roles. Assign each user to the IAM role that matches the user's PII access requirements.
B. Use Amazon QuickSight to access the data. Use column-level security features in QuickSight to limit the PII that users can retrieve from Amazon S3 by using Amazon Athena. Define QuickSight access levels based on the PII access requirements of the users.
C. Build a custom query builder UI that will run Athena queries in the background to access the data. Create user groups in Amazon Cognito. Assign access levels to the user groups based on the PII access requirements of the users.
D. Create IAM roles that have different levels of granular access. Assign the IAM roles to IAM user groups. Use an identity-based policy to assign access levels to user groups at the column level.


Question # 3

A company receives call logs as Amazon S3 objects that contain sensitive customer information. The company must protect the S3 objects by using encryption. The company must also use encryption keys that only specific employees can access. Which solution will meet these requirements with the LEAST effort? 

A. Use an AWS CloudHSM cluster to store the encryption keys. Configure the process that writes to Amazon S3 to make calls to CloudHSM to encrypt and decrypt the objects. Deploy an IAM policy that restricts access to the CloudHSM cluster.
B. Use server-side encryption with customer-provided keys (SSE-C) to encrypt the objects that contain customer information. Restrict access to the keys that encrypt the objects.
C. Use server-side encryption with AWS KMS keys (SSE-KMS) to encrypt the objects that contain customer information. Configure an IAM policy that restricts access to the KMS keys that encrypt the objects.
D. Use server-side encryption with Amazon S3 managed keys (SSE-S3) to encrypt the objects that contain customer information. Configure an IAM policy that restricts access to the Amazon S3 managed keys that encrypt the objects.
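As background on the SSE-KMS technique named in option C, the following is a minimal boto3 sketch; the bucket name, object key, and KMS key ARN are hypothetical placeholders.

import boto3

# Minimal sketch: upload an S3 object encrypted with a customer managed
# AWS KMS key (SSE-KMS). Bucket, key, and KMS key ARN are hypothetical.
s3 = boto3.client("s3")

s3.put_object(
    Bucket="example-call-logs-bucket",
    Key="logs/2024/11/call-log-0001.json",
    Body=b'{"caller": "redacted"}',
    ServerSideEncryption="aws:kms",
    SSEKMSKeyId="arn:aws:kms:us-east-1:111122223333:key/example-key-id",
)
# Reading the object's plaintext now also requires kms:Decrypt on the key,
# which can be limited to specific employees through the key policy or IAM.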


Question # 4

A data engineer needs to maintain a central metadata repository that users access through Amazon EMR and Amazon Athena queries. The repository needs to provide the schema and properties of many tables. Some of the metadata is stored in Apache Hive. The data engineer needs to import the metadata from Hive into the central metadata repository. Which solution will meet these requirements with the LEAST development effort? 

A. Use Amazon EMR and Apache Ranger.
B. Use a Hive metastore on an EMR cluster.
C. Use the AWS Glue Data Catalog.
D. Use a metastore on an Amazon RDS for MySQL DB instance.


Question # 5

A company is planning to use a provisioned Amazon EMR cluster that runs Apache Spark jobs to perform big data analysis. The company requires high reliability. A big data team must follow best practices for running cost-optimized and long-running workloads on Amazon EMR. The team must find a solution that will maintain the company's current level of performance. Which combination of resources will meet these requirements MOST cost-effectively? (Choose two.) 

A. Use Hadoop Distributed File System (HDFS) as a persistent data store.
B. Use Amazon S3 as a persistent data store.
C. Use x86-based instances for core nodes and task nodes.
D. Use Graviton instances for core nodes and task nodes.
E. Use Spot Instances for all primary nodes.


Question # 6

A company wants to implement real-time analytics capabilities. The company wants to use Amazon Kinesis Data Streams and Amazon Redshift to ingest and process streaming data at the rate of several gigabytes per second. The company wants to derive near real-time insights by using existing business intelligence (BI) and analytics tools. Which solution will meet these requirements with the LEAST operational overhead? 

A. Use Kinesis Data Streams to stage data in Amazon S3. Use the COPY command to load data from Amazon S3 directly into Amazon Redshift to make the data immediately available for real-time analysis.
B. Access the data from Kinesis Data Streams by using SQL queries. Create materialized views directly on top of the stream. Refresh the materialized views regularly to query the most recent stream data.
C. Create an external schema in Amazon Redshift to map the data from Kinesis Data Streams to an Amazon Redshift object. Create a materialized view to read data from the stream. Set the materialized view to auto refresh.
D. Connect Kinesis Data Streams to Amazon Kinesis Data Firehose. Use Kinesis Data Firehose to stage the data in Amazon S3. Use the COPY command to load the data from Amazon S3 to a table in Amazon Redshift.
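To illustrate the streaming-ingestion pattern described in option C, here is a hedged sketch that submits the SQL through the Amazon Redshift Data API. The IAM role ARN, stream name, and workgroup name are placeholders; a provisioned cluster would pass ClusterIdentifier instead of WorkgroupName.

import boto3

# Minimal sketch: map a Kinesis data stream into Redshift and create an
# auto-refreshing materialized view over it. All names and ARNs are
# hypothetical.
redshift_data = boto3.client("redshift-data")

sqls = [
    """CREATE EXTERNAL SCHEMA kinesis_schema
       FROM KINESIS
       IAM_ROLE 'arn:aws:iam::111122223333:role/RedshiftStreamingRole'""",
    """CREATE MATERIALIZED VIEW clickstream_mv AUTO REFRESH YES AS
       SELECT approximate_arrival_timestamp,
              partition_key,
              JSON_PARSE(kinesis_data) AS payload
       FROM kinesis_schema."example-click-stream" """,
]

redshift_data.batch_execute_statement(
    WorkgroupName="example-workgroup",
    Database="dev",
    Sqls=sqls,
)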


Question # 7

A company stores details about transactions in an Amazon S3 bucket. The company wants to log all writes to the S3 bucket into another S3 bucket that is in the same AWS Region. Which solution will meet this requirement with the LEAST operational effort? 

A. Configure an S3 Event Notifications rule for all activities on the transactions S3 bucket to invoke an AWS Lambda function. Program the Lambda function to write the event to Amazon Kinesis Data Firehose. Configure Kinesis Data Firehose to write the event to the logs S3 bucket.
B. Create a trail of management events in AWS CloudTrail. Configure the trail to receive data from the transactions S3 bucket. Specify an empty prefix and write-only events. Specify the logs S3 bucket as the destination bucket.
C. Configure an S3 Event Notifications rule for all activities on the transactions S3 bucket to invoke an AWS Lambda function. Program the Lambda function to write the events to the logs S3 bucket.
D. Create a trail of data events in AWS CloudTrail. Configure the trail to receive data from the transactions S3 bucket. Specify an empty prefix and write-only events. Specify the logs S3 bucket as the destination bucket.


Question # 8

A data engineer has a one-time task to read data from objects that are in Apache Parquet format in an Amazon S3 bucket. The data engineer needs to query only one column of the data. Which solution will meet these requirements with the LEAST operational overhead? 

A. Configure an AWS Lambda function to load data from the S3 bucket into a pandas DataFrame. Write a SQL SELECT statement on the DataFrame to query the required column.
B. Use S3 Select to write a SQL SELECT statement to retrieve the required column from the S3 objects.
C. Prepare an AWS Glue DataBrew project to consume the S3 objects and to query the required column.
D. Run an AWS Glue crawler on the S3 objects. Use a SQL SELECT statement in Amazon Athena to query the required column.
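For context on the S3 Select approach in option B, here is a minimal boto3 sketch; the bucket, object key, and column name are hypothetical.

import boto3

# Minimal sketch: use S3 Select to retrieve a single column from a Parquet
# object without downloading the whole file. Names are hypothetical.
s3 = boto3.client("s3")

response = s3.select_object_content(
    Bucket="example-data-bucket",
    Key="exports/orders.snappy.parquet",
    ExpressionType="SQL",
    Expression='SELECT s."order_total" FROM S3Object s',
    InputSerialization={"Parquet": {}},
    OutputSerialization={"CSV": {}},
)

for event in response["Payload"]:
    if "Records" in event:
        print(event["Records"]["Payload"].decode("utf-8"), end="")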


Question # 9

A retail company has a customer data hub in an Amazon S3 bucket. Employees from many countries use the data hub to support company-wide analytics. A governance team must ensure that the company's data analysts can access data only for customers who are within the same country as the analysts. Which solution will meet these requirements with the LEAST operational effort? 

A. Create a separate table for each country's customer data. Provide access to each analyst based on the country that the analyst serves.
B. Register the S3 bucket as a data lake location in AWS Lake Formation. Use the Lake Formation row-level security features to enforce the company's access policies.
C. Move the data to AWS Regions that are close to the countries where the customers are. Provide access to each analyst based on the country that the analyst serves.
D. Load the data into Amazon Redshift. Create a view for each country. Create separate IAM roles for each country to provide access to data from each country. Assign the appropriate roles to the analysts.


Question # 10

A company uses Amazon RDS to store transactional data. The company runs an RDS DB instance in a private subnet. A developer wrote an AWS Lambda function with default settings to insert, update, or delete data in the DB instance. The developer needs to give the Lambda function the ability to connect to the DB instance privately without using the public internet. Which combination of steps will meet this requirement with the LEAST operational overhead? (Choose two.) 

A. Turn on the public access setting for the DB instance.
B. Update the security group of the DB instance to allow only Lambda function invocations on the database port.
C. Configure the Lambda function to run in the same subnet that the DB instance uses.
D. Attach the same security group to the Lambda function and the DB instance. Include a self-referencing rule that allows access through the database port.
E. Update the network ACL of the private subnet to include a self-referencing rule that allows access through the database port.


Question # 11

A company has five offices in different AWS Regions. Each office has its own human resources (HR) department that uses a unique IAM role. The company stores employee records in a data lake that is based on Amazon S3 storage. A data engineering team needs to limit access to the records. Each HR department should be able to access records for only employees who are within the HR department's Region. Which combination of steps should the data engineering team take to meet this requirement with the LEAST operational overhead? (Choose two.) 

A. Use data filters for each Region to register the S3 paths as data locations.
B. Register the S3 path as an AWS Lake Formation location.
C. Modify the IAM roles of the HR departments to add a data filter for each department's Region.
D. Enable fine-grained access control in AWS Lake Formation. Add a data filter for each Region.
E. Create a separate S3 bucket for each Region. Configure an IAM policy to allow S3 access. Restrict access based on Region.


Question # 12

A healthcare company uses Amazon Kinesis Data Streams to stream real-time health data from wearable devices, hospital equipment, and patient records. A data engineer needs to find a solution to process the streaming data. The data engineer needs to store the data in an Amazon Redshift Serverless warehouse. The solution must support near real-time analytics of the streaming data and the previous day's data. Which solution will meet these requirements with the LEAST operational overhead? 

A. Load data into Amazon Kinesis Data Firehose. Load the data into Amazon Redshift.
B. Use the streaming ingestion feature of Amazon Redshift.
C. Load the data into Amazon S3. Use the COPY command to load the data into Amazon Redshift.
D. Use the Amazon Aurora zero-ETL integration with Amazon Redshift.


Question # 13

A company is migrating a legacy application to an Amazon S3 based data lake. A data engineer reviewed data that is associated with the legacy application. The data engineer found that the legacy data contained some duplicate information. The data engineer must identify and remove duplicate information from the legacy application data. Which solution will meet these requirements with the LEAST operational overhead? 

A. Write a custom extract, transform, and load (ETL) job in Python. Use the DataFrame drop_duplicates() function by importing the Pandas library to perform data deduplication.
B. Write an AWS Glue extract, transform, and load (ETL) job. Use the FindMatches machine learning (ML) transform to transform the data to perform data deduplication.
C. Write a custom extract, transform, and load (ETL) job in Python. Import the Python dedupe library. Use the dedupe library to perform data deduplication.
D. Write an AWS Glue extract, transform, and load (ETL) job. Import the Python dedupe library. Use the dedupe library to perform data deduplication.


Question # 14

A company needs to build a data lake in AWS. The company must provide row-level data access and column-level data access to specific teams. The teams will access the data by using Amazon Athena, Amazon Redshift Spectrum, and Apache Hive from Amazon EMR. Which solution will meet these requirements with the LEAST operational overhead? 

A. Use Amazon S3 for data lake storage. Use S3 access policies to restrict data access by rows and columns. Provide data access through Amazon S3.
B. Use Amazon S3 for data lake storage. Use Apache Ranger through Amazon EMR to restrict data access by rows and columns. Provide data access by using Apache Pig.
C. Use Amazon Redshift for data lake storage. Use Redshift security policies to restrict data access by rows and columns. Provide data access by using Apache Spark and Amazon Athena federated queries.
D. Use Amazon S3 for data lake storage. Use AWS Lake Formation to restrict data access by rows and columns. Provide data access through AWS Lake Formation.


Question # 15

A company uses an Amazon Redshift provisioned cluster as its database. The Redshift cluster has five reserved ra3.4xlarge nodes and uses key distribution. A data engineer notices that one of the nodes frequently has a CPU load over 90%. SQL Queries that run on the node are queued. The other four nodes usually have a CPU load under 15% during daily operations. The data engineer wants to maintain the current number of compute nodes. The data engineer also wants to balance the load more evenly across all five compute nodes. Which solution will meet these requirements? 

A. Change the sort key to be the data column that is most often used in a WHERE clause of the SQL SELECT statement.
B. Change the distribution key to the table column that has the largest dimension.
C. Upgrade the reserved node from ra3.4xlarge to ra3.16xlarge.
D. Change the primary key to be the data column that is most often used in a WHERE clause of the SQL SELECT statement.


Question # 16

A company is developing an application that runs on Amazon EC2 instances. Currently, the data that the application generates is temporary. However, the company needs to persist the data, even if the EC2 instances are terminated. A data engineer must launch new EC2 instances from an Amazon Machine Image (AMI) and configure the instances to preserve the data. Which solution will meet this requirement? 

A. Launch new EC2 instances by using an AMI that is backed by an EC2 instance store volume that contains the application data. Apply the default settings to the EC2 instances.
B. Launch new EC2 instances by using an AMI that is backed by a root Amazon Elastic Block Store (Amazon EBS) volume that contains the application data. Apply the default settings to the EC2 instances.
C. Launch new EC2 instances by using an AMI that is backed by an EC2 instance store volume. Attach an Amazon Elastic Block Store (Amazon EBS) volume to contain the application data. Apply the default settings to the EC2 instances.
D. Launch new EC2 instances by using an AMI that is backed by an Amazon Elastic Block Store (Amazon EBS) volume. Attach an additional EC2 instance store volume to contain the application data. Apply the default settings to the EC2 instances.


Question # 17

A data engineer must ingest a source of structured data that is in .csv format into an Amazon S3 data lake. The .csv files contain 15 columns. Data analysts need to run Amazon Athena queries on one or two columns of the dataset. The data analysts rarely query the entire file. Which solution will meet these requirements MOST cost-effectively? 

A. Use an AWS Glue PySpark job to ingest the source data into the data lake in .csv format.
B. Create an AWS Glue extract, transform, and load (ETL) job to read from the .csv structured data source. Configure the job to ingest the data into the data lake in JSON format.
C. Use an AWS Glue PySpark job to ingest the source data into the data lake in Apache Avro format.
D. Create an AWS Glue extract, transform, and load (ETL) job to read from the .csv structured data source. Configure the job to write the data into the data lake in Apache Parquet format.
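As an illustration of the CSV-to-Parquet conversion mentioned in option D, here is a hedged AWS Glue PySpark sketch; the catalog database, table, and output path are placeholder names.

import sys
from awsglue.utils import getResolvedOptions
from awsglue.context import GlueContext
from awsglue.job import Job
from pyspark.context import SparkContext

# Minimal AWS Glue ETL sketch: read the crawled .csv source table and write
# it back to the data lake as Apache Parquet. Names are hypothetical.
args = getResolvedOptions(sys.argv, ["JOB_NAME"])
glue_context = GlueContext(SparkContext())
job = Job(glue_context)
job.init(args["JOB_NAME"], args)

source = glue_context.create_dynamic_frame.from_catalog(
    database="example_lake_db", table_name="raw_csv_table"
)

glue_context.write_dynamic_frame.from_options(
    frame=source,
    connection_type="s3",
    connection_options={"path": "s3://example-bucket/curated/parquet/"},
    format="parquet",
)

job.commit()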


Question # 18

A data engineer uses Amazon Redshift to run resource-intensive analytics processes once every month. Every month, the data engineer creates a new Redshift provisioned cluster. The data engineer deletes the Redshift provisioned cluster after the analytics processes are complete every month. Before the data engineer deletes the cluster each month, the data engineer unloads backup data from the cluster to an Amazon S3 bucket. The data engineer needs a solution to run the monthly analytics processes that does not require the data engineer to manage the infrastructure manually. Which solution will meet these requirements with the LEAST operational overhead? 

A. Use AWS Step Functions to pause the Redshift cluster when the analytics processes are complete and to resume the cluster to run new processes every month.
B. Use Amazon Redshift Serverless to automatically process the analytics workload.
C. Use the AWS CLI to automatically process the analytics workload.
D. Use AWS CloudFormation templates to automatically process the analytics workload.


Question # 19

A financial company wants to use Amazon Athena to run on-demand SQL queries on a petabyte-scale dataset to support a business intelligence (BI) application. An AWS Glue job that runs during non-business hours updates the dataset once every day. The BI application has a standard data refresh frequency of 1 hour to comply with company policies. A data engineer wants to cost optimize the company's use of Amazon Athena without adding any additional infrastructure costs. Which solution will meet these requirements with the LEAST operational overhead? 

A. Configure an Amazon S3 Lifecycle policy to move data to the S3 Glacier Deep Archive storage class after 1 day.
B. Use the query result reuse feature of Amazon Athena for the SQL queries.
C. Add an Amazon ElastiCache cluster between the BI application and Athena.
D. Change the format of the files that are in the dataset to Apache Parquet.
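For readers curious about the query result reuse feature mentioned in option B, here is a hedged boto3 sketch; the workgroup, database, query, and output location are placeholders.

import boto3

# Minimal sketch: run an Athena query while allowing Athena to reuse cached
# results up to 60 minutes old, matching a 1-hour BI refresh policy.
# Workgroup, database, and S3 locations are hypothetical.
athena = boto3.client("athena")

response = athena.start_query_execution(
    QueryString="SELECT region, SUM(amount) AS total FROM trades GROUP BY region",
    QueryExecutionContext={"Database": "finance_db"},
    WorkGroup="bi-workgroup",
    ResultConfiguration={"OutputLocation": "s3://example-bucket/athena-results/"},
    ResultReuseConfiguration={
        "ResultReuseByAgeConfiguration": {"Enabled": True, "MaxAgeInMinutes": 60}
    },
)
print(response["QueryExecutionId"])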


Question # 20

A company uses an Amazon Redshift cluster that runs on RA3 nodes. The company wants to scale read and write capacity to meet demand. A data engineer needs to identify a solution that will turn on concurrency scaling. Which solution will meet this requirement? 

A. Turn on concurrency scaling in workload management (WLM) for Redshift Serverless workgroups.
B. Turn on concurrency scaling at the workload management (WLM) queue level in the Redshift cluster.
C. Turn on concurrency scaling in the settings during the creation of a new Redshift cluster.
D. Turn on concurrency scaling for the daily usage quota for the Redshift cluster.


Question # 21

A company has a production AWS account that runs company workloads. The company's security team created a security AWS account to store and analyze security logs from the production AWS account. The security logs in the production AWS account are stored in Amazon CloudWatch Logs. The company needs to use Amazon Kinesis Data Streams to deliver the security logs to the security AWS account. Which solution will meet these requirements? 

A. Create a destination data stream in the production AWS account. In the security AWS account, create an IAM role that has cross-account permissions to Kinesis Data Streams in the production AWS account.
B. Create a destination data stream in the security AWS account. Create an IAM role and a trust policy to grant CloudWatch Logs the permission to put data into the stream. Create a subscription filter in the security AWS account.
C. Create a destination data stream in the production AWS account. In the production AWS account, create an IAM role that has cross-account permissions to Kinesis Data Streams in the security AWS account.
D. Create a destination data stream in the security AWS account. Create an IAM role and a trust policy to grant CloudWatch Logs the permission to put data into the stream. Create a subscription filter in the production AWS account.


Question # 22

A company is migrating on-premises workloads to AWS. The company wants to reduce overall operational overhead. The company also wants to explore serverless options. The company's current workloads use Apache Pig, Apache Oozie, Apache Spark, Apache HBase, and Apache Flink. The on-premises workloads process petabytes of data in seconds. The company must maintain similar or better performance after the migration to AWS. Which extract, transform, and load (ETL) service will meet these requirements?

A. AWS Glue
B. Amazon EMR
C. AWS Lambda
D. Amazon Redshift


Question # 23

A data engineering team is using an Amazon Redshift data warehouse for operational reporting. The team wants to prevent performance issues that might result from long-running queries. A data engineer must choose a system table in Amazon Redshift to record anomalies when a query optimizer identifies conditions that might indicate performance issues. Which table views should the data engineer use to meet this requirement? 

A. STL_USAGE_CONTROL
B. STL_ALERT_EVENT_LOG
C. STL_QUERY_METRICS
D. STL_PLAN_INFO


Question # 24

A media company wants to improve a system that recommends media content to customers based on user behavior and preferences. To improve the recommendation system, the company needs to incorporate insights from third-party datasets into the company's existing analytics platform. The company wants to minimize the effort and time required to incorporate third-party datasets. Which solution will meet these requirements with the LEAST operational overhead? 

A. Use API calls to access and integrate third-party datasets from AWS Data Exchange.
B. Use API calls to access and integrate third-party datasets from AWS
C. Use Amazon Kinesis Data Streams to access and integrate third-party datasets from AWS CodeCommit repositories.
D. Use Amazon Kinesis Data Streams to access and integrate third-party datasets from Amazon Elastic Container Registry (Amazon ECR).


Question # 25

A company uses an on-premises Microsoft SQL Server database to store financial transaction data. The company migrates the transaction data from the on-premises database to AWS at the end of each month. The company has noticed that the cost to migrate data from the on-premises database to an Amazon RDS for SQL Server database has increased recently. The company requires a cost-effective solution to migrate the data to AWS. The solution must cause minimal downtime for the applications that access the database. Which AWS service should the company use to meet these requirements? 

A. AWS Lambda
B. AWS Database Migration Service (AWS DMS)
C. AWS Direct Connect
D. AWS DataSync


Question # 26

A company has used an Amazon Redshift table that is named Orders for 6 months. The company performs weekly updates and deletes on the table. The table has an interleaved sort key on a column that contains AWS Regions. The company wants to reclaim disk space so that the company will not run out of storage space. The company also wants to analyze the sort key column. Which Amazon Redshift command will meet these requirements? 

A. VACUUM FULL Orders
B. VACUUM DELETE ONLY Orders
C. VACUUM REINDEX Orders
D. VACUUM SORT ONLY Orders


Question # 27

A company extracts approximately 1 TB of data every day from data sources such as SAP HANA, Microsoft SQL Server, MongoDB, Apache Kafka, and Amazon DynamoDB. Some of the data sources have undefined data schemas or data schemas that change. A data engineer must implement a solution that can detect the schema for these data sources. The solution must extract, transform, and load the data to an Amazon S3 bucket. The company has a service level agreement (SLA) to load the data into the S3 bucket within 15 minutes of data creation. Which solution will meet these requirements with the LEAST operational overhead? 

A. Use Amazon EMR to detect the schema and to extract, transform, and load the data into the S3 bucket. Create a pipeline in Apache Spark.
B. Use AWS Glue to detect the schema and to extract, transform, and load the data into the S3 bucket. Create a pipeline in Apache Spark.
C. Create a PySpark program in AWS Lambda to extract, transform, and load the data into the S3 bucket.
D. Create a stored procedure in Amazon Redshift to detect the schema and to extract, transform, and load the data into a Redshift Spectrum table. Access the table from Amazon S3.


Question # 28

A company stores data from an application in an Amazon DynamoDB table that operates in provisioned capacity mode. The workloads of the application have predictable throughput load on a regular schedule. Every Monday, there is an immediate increase in activity early in the morning. The application has very low usage during weekends. The company must ensure that the application performs consistently during peak usage times. Which solution will meet these requirements in the MOST cost-effective way? 

A. Increase the provisioned capacity to the maximum capacity that is currently present during peak load times.
B. Divide the table into two tables. Provision each table with half of the provisioned capacity of the original table. Spread queries evenly across both tables.
C. Use AWS Application Auto Scaling to schedule higher provisioned capacity for peak usage times. Schedule lower capacity during off-peak times.
D. Change the capacity mode from provisioned to on-demand. Configure the table to scale up and scale down based on the load on the table.
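To show what the scheduled-scaling approach in option C can look like, here is a hedged boto3 sketch; the table name, capacity values, and cron expressions are placeholders chosen for illustration.

import boto3

# Minimal sketch: register a DynamoDB table's write capacity as a scalable
# target and schedule higher provisioned capacity for Monday mornings.
# Table name, capacity numbers, and schedules are hypothetical.
autoscaling = boto3.client("application-autoscaling")

autoscaling.register_scalable_target(
    ServiceNamespace="dynamodb",
    ResourceId="table/example-app-table",
    ScalableDimension="dynamodb:table:WriteCapacityUnits",
    MinCapacity=25,
    MaxCapacity=1000,
)

# Scale up before the Monday-morning peak...
autoscaling.put_scheduled_action(
    ServiceNamespace="dynamodb",
    ResourceId="table/example-app-table",
    ScalableDimension="dynamodb:table:WriteCapacityUnits",
    ScheduledActionName="monday-morning-scale-up",
    Schedule="cron(0 6 ? * MON *)",
    ScalableTargetAction={"MinCapacity": 500, "MaxCapacity": 1000},
)

# ...and back down after the peak passes.
autoscaling.put_scheduled_action(
    ServiceNamespace="dynamodb",
    ResourceId="table/example-app-table",
    ScalableDimension="dynamodb:table:WriteCapacityUnits",
    ScheduledActionName="monday-evening-scale-down",
    Schedule="cron(0 20 ? * MON *)",
    ScalableTargetAction={"MinCapacity": 25, "MaxCapacity": 200},
)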


Question # 29

A company is planning to migrate on-premises Apache Hadoop clusters to Amazon EMR. The company also needs to migrate a data catalog into a persistent storage solution. The company currently stores the data catalog in an on-premises Apache Hive metastore on the Hadoop clusters. The company requires a serverless solution to migrate the data catalog. Which solution will meet these requirements MOST cost-effectively? 

A. Use AWS Database Migration Service (AWS DMS) to migrate the Hive metastore into Amazon S3. Configure AWS Glue Data Catalog to scan Amazon S3 to produce the data catalog.
B. Configure a Hive metastore in Amazon EMR. Migrate the existing on-premises Hive metastore into Amazon EMR. Use AWS Glue Data Catalog to store the company's data catalog as an external data catalog.
C. Configure an external Hive metastore in Amazon EMR. Migrate the existing on-premises Hive metastore into Amazon EMR. Use Amazon Aurora MySQL to store the company's data catalog.
D. Configure a new Hive metastore in Amazon EMR. Migrate the existing on-premises Hive metastore into Amazon EMR. Use the new metastore as the company's data catalog.


Question # 30

A company loads transaction data for each day into Amazon Redshift tables at the end of each day. The company wants to have the ability to track which tables have been loaded and which tables still need to be loaded. A data engineer wants to store the load statuses of Redshift tables in an Amazon DynamoDB table. The data engineer creates an AWS Lambda function to publish the details of the load statuses to DynamoDB. How should the data engineer invoke the Lambda function to write load statuses to the DynamoDB table? 

A. Use a second Lambda function to invoke the first Lambda function based on Amazon CloudWatch events.
B. Use the Amazon Redshift Data API to publish an event to Amazon EventBridge. Configure an EventBridge rule to invoke the Lambda function.
C. Use the Amazon Redshift Data API to publish a message to an Amazon Simple Queue Service (Amazon SQS) queue. Configure the SQS queue to invoke the Lambda function.
D. Use a second Lambda function to invoke the first Lambda function based on AWS CloudTrail events.
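To make the EventBridge-based pattern in option B more concrete, here is a hedged sketch of submitting a load through the Redshift Data API with event notifications enabled, plus a Lambda handler that records the status in DynamoDB. Every name, table, and event field access here is a placeholder or best-effort assumption, not taken from the question.

import boto3

# --- Submitting the load (runs wherever the daily load is orchestrated) ---
# WithEvent=True asks the Redshift Data API to emit an event to EventBridge
# when the statement finishes; a rule can route that event to the Lambda
# function. Cluster, database, and SQL are hypothetical.
redshift_data = boto3.client("redshift-data")
redshift_data.execute_statement(
    ClusterIdentifier="example-cluster",
    Database="dev",
    DbUser="loader",
    Sql="COPY sales_daily FROM 's3://example-bucket/daily/' IAM_ROLE DEFAULT",
    StatementName="load-sales_daily",
    WithEvent=True,
)

# --- Lambda handler invoked by the EventBridge rule ---
dynamodb = boto3.resource("dynamodb")
status_table = dynamodb.Table("redshift-load-status")  # hypothetical table

def lambda_handler(event, context):
    detail = event.get("detail", {})
    status_table.put_item(
        Item={
            "statementName": detail.get("statementName", "unknown"),
            "state": detail.get("state", "unknown"),
            "finishedAt": event.get("time", ""),
        }
    )
    return {"recorded": True}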


Question # 31

A data engineer needs to build an extract, transform, and load (ETL) job. The ETL job will process daily incoming .csv files that users upload to an Amazon S3 bucket. The size of each S3 object is less than 100 MB. Which solution will meet these requirements MOST cost-effectively? 

A. Write a custom Python application. Host the application on an Amazon Elastic Kubernetes Service (Amazon EKS) cluster.
B. Write a PySpark ETL script. Host the script on an Amazon EMR cluster.
C. Write an AWS Glue PySpark job. Use Apache Spark to transform the data.
D. Write an AWS Glue Python shell job. Use pandas to transform the data.
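As a sketch of the Python shell approach in option D, here is a hedged example of what such a job script might contain; the bucket names, keys, and the transformation itself are placeholders.

import boto3
import pandas as pd

# Minimal AWS Glue Python shell sketch: read a small daily .csv object from
# S3 with pandas, apply a light transformation, and write the result back.
# Bucket names, keys, and the transformation are hypothetical.
s3 = boto3.client("s3")

obj = s3.get_object(Bucket="example-upload-bucket", Key="incoming/2024-11-08.csv")
df = pd.read_csv(obj["Body"])

# Example transformation: drop empty rows and normalize a column name.
df = df.dropna(how="all").rename(columns={"Order Total": "order_total"})

s3.put_object(
    Bucket="example-curated-bucket",
    Key="curated/2024-11-08.csv",
    Body=df.to_csv(index=False).encode("utf-8"),
)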


Question # 32

A financial company wants to implement a data mesh. The data mesh must support centralized data governance, data analysis, and data access control. The company has decided to use AWS Glue for data catalogs and extract, transform, and load (ETL) operations. Which combination of AWS services will implement a data mesh? (Choose two.) 

A. Use Amazon Aurora for data storage. Use an Amazon Redshift provisioned cluster for data analysis.
B. Use Amazon S3 for data storage. Use Amazon Athena for data analysis.
C. Use AWS Glue DataBrew for centralized data governance and access control.
D. Use Amazon RDS for data storage. Use Amazon EMR for data analysis.
E. Use AWS Lake Formation for centralized data governance and access control.


Question # 33

A company has a frontend ReactJS website that uses Amazon API Gateway to invoke REST APIs. The APIs perform the functionality of the website. A data engineer needs to write a Python script that can be occasionally invoked through API Gateway. The code must return results to API Gateway. Which solution will meet these requirements with the LEAST operational overhead? 

A. Deploy a custom Python script on an Amazon Elastic Container Service (Amazon ECS) cluster.
B. Create an AWS Lambda Python function with provisioned concurrency.
C. Deploy a custom Python script that can integrate with API Gateway on Amazon Elastic Kubernetes Service (Amazon EKS).
D. Create an AWS Lambda function. Ensure that the function is warm by scheduling an Amazon EventBridge rule to invoke the Lambda function every 5 minutes by using mock events.


Question # 34

A company uses Amazon Redshift for its data warehouse. The company must automate refresh schedules for Amazon Redshift materialized views. Which solution will meet this requirement with the LEAST effort? 

A. Use Apache Airflow to refresh the materialized views.
B. Use an AWS Lambda user-defined function (UDF) within Amazon Redshift to refresh the materialized views.
C. Use the query editor v2 in Amazon Redshift to refresh the materialized views.
D. Use an AWS Glue workflow to refresh the materialized views.


Question # 35

A financial services company stores financial data in Amazon Redshift. A data engineer wants to run real-time queries on the financial data to support a web-based trading application. The data engineer wants to run the queries from within the trading application. Which solution will meet these requirements with the LEAST operational overhead? 

A. Establish WebSocket connections to Amazon Redshift.
B. Use the Amazon Redshift Data API.
C. Set up Java Database Connectivity (JDBC) connections to Amazon Redshift.
D. Store frequently accessed data in Amazon S3. Use Amazon S3 Select to run the queries.
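To show what the Data API approach in option B looks like from application code, here is a hedged sketch; the cluster, database, secret, SQL, and parameter values are placeholders.

import time
import boto3

# Minimal sketch: run a query against Amazon Redshift from application code
# through the Redshift Data API, with no persistent JDBC connection.
# Cluster, database, secret, and SQL are hypothetical.
client = boto3.client("redshift-data")

submitted = client.execute_statement(
    ClusterIdentifier="example-cluster",
    Database="finance",
    SecretArn="arn:aws:secretsmanager:us-east-1:111122223333:secret:redshift-app",
    Sql="SELECT symbol, last_price FROM quotes WHERE symbol = :symbol",
    Parameters=[{"name": "symbol", "value": "AMZN"}],
)

statement_id = submitted["Id"]
while client.describe_statement(Id=statement_id)["Status"] not in (
    "FINISHED", "FAILED", "ABORTED"
):
    time.sleep(0.5)

result = client.get_statement_result(Id=statement_id)
for row in result["Records"]:
    print(row)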


Question # 36

A data engineer must orchestrate a series of Amazon Athena queries that will run every day. Each query can run for more than 15 minutes. Which combination of steps will meet these requirements MOST cost-effectively? (Choose two.) 

A. Use an AWS Lambda function and the Athena Boto3 client start_query_execution API call to invoke the Athena queries programmatically.
B. Create an AWS Step Functions workflow and add two states. Add the first state before the Lambda function. Configure the second state as a Wait state to periodically check whether the Athena query has finished using the Athena Boto3 get_query_execution API call. Configure the workflow to invoke the next query when the current query has finished running.
C. Use an AWS Glue Python shell job and the Athena Boto3 client start_query_execution API call to invoke the Athena queries programmatically.
D. Use an AWS Glue Python shell script to run a sleep timer that checks every 5 minutes to determine whether the current Athena query has finished running successfully. Configure the Python shell script to invoke the next query when the current query has finished running.
E. Use Amazon Managed Workflows for Apache Airflow (Amazon MWAA) to orchestrate the Athena queries in AWS Batch.
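Here is a hedged sketch of the start-then-poll pattern described in options C and D, using the Athena Boto3 client as it might appear in a Glue Python shell script; the queries, database, and output location are placeholders.

import time
import boto3

# Minimal sketch: run a list of Athena queries one after another, waiting for
# each to finish before starting the next. Database, output location, and
# the queries themselves are hypothetical.
athena = boto3.client("athena")

QUERIES = [
    "INSERT INTO daily_summary SELECT * FROM staging_orders",
    "INSERT INTO daily_errors SELECT * FROM staging_orders WHERE status = 'ERROR'",
]

def run_and_wait(sql: str) -> None:
    execution_id = athena.start_query_execution(
        QueryString=sql,
        QueryExecutionContext={"Database": "analytics_db"},
        ResultConfiguration={"OutputLocation": "s3://example-bucket/athena-results/"},
    )["QueryExecutionId"]

    while True:
        state = athena.get_query_execution(QueryExecutionId=execution_id)[
            "QueryExecution"]["Status"]["State"]
        if state in ("SUCCEEDED", "FAILED", "CANCELLED"):
            break
        time.sleep(30)  # individual queries can run for more than 15 minutes

    if state != "SUCCEEDED":
        raise RuntimeError(f"Query {execution_id} ended in state {state}")

for query in QUERIES:
    run_and_wait(query)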


Question # 37

A data engineer must use AWS services to ingest a dataset into an Amazon S3 data lake. The data engineer profiles the dataset and discovers that the dataset contains personally identifiable information (PII). The data engineer must implement a solution to profile the dataset and obfuscate the PII. Which solution will meet this requirement with the LEAST operational effort? 

A. Use an Amazon Kinesis Data Firehose delivery stream to process the dataset. Create an AWS Lambda transform function to identify the PII. Use an AWS SDK to obfuscate the PII. Set the S3 data lake as the target for the delivery stream.
B. Use the Detect PII transform in AWS Glue Studio to identify the PII. Obfuscate the PII. Use an AWS Step Functions state machine to orchestrate a data pipeline to ingest the data into the S3 data lake.
C. Use the Detect PII transform in AWS Glue Studio to identify the PII. Create a rule in AWS Glue Data Quality to obfuscate the PII. Use an AWS Step Functions state machine to orchestrate a data pipeline to ingest the data into the S3 data lake.
D. Ingest the dataset into Amazon DynamoDB. Create an AWS Lambda function to identify and obfuscate the PII in the DynamoDB table and to transform the data. Use the same Lambda function to ingest the data into the S3 data lake.


Question # 38

During a security review, a company identified a vulnerability in an AWS Glue job. The company discovered that credentials to access an Amazon Redshift cluster were hard coded in the job script. A data engineer must remediate the security vulnerability in the AWS Glue job. The solution must securely store the credentials. Which combination of steps should the data engineer take to meet these requirements? (Choose two.) 

A. Store the credentials in the AWS Glue job parameters.
B. Store the credentials in a configuration file that is in an Amazon S3 bucket.
C. Access the credentials from a configuration file that is in an Amazon S3 bucket by using the AWS Glue job.
D. Store the credentials in AWS Secrets Manager.
E. Grant the AWS Glue job IAM role access to the stored credentials.
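As an illustration of the Secrets Manager pattern named in options D and E, here is a hedged sketch of how a Glue job script might fetch Redshift credentials at run time instead of hard coding them; the secret name, secret fields, and connection details are placeholders.

import json
import boto3

# Minimal sketch: fetch Redshift credentials from AWS Secrets Manager inside
# a Glue job. The job's IAM role must be granted secretsmanager:GetSecretValue
# on this secret. Secret name and fields are hypothetical.
secrets = boto3.client("secretsmanager")

secret_value = secrets.get_secret_value(SecretId="prod/redshift/etl-user")
credentials = json.loads(secret_value["SecretString"])

jdbc_url = f"jdbc:redshift://{credentials['host']}:{credentials['port']}/{credentials['dbname']}"
connection_options = {
    "url": jdbc_url,
    "user": credentials["username"],
    "password": credentials["password"],
    "dbtable": "public.staging_orders",
    "redshiftTmpDir": "s3://example-bucket/temp/",
}
# connection_options can then be passed to GlueContext read/write calls,
# so no credentials ever appear in the job script itself.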


Question # 39

A company needs to partition the Amazon S3 storage that the company uses for a data lake. The partitioning will use a path of the S3 object keys in the following format: s3://bucket/prefix/year=2023/month=01/day=01. A data engineer must ensure that the AWS Glue Data Catalog synchronizes with the S3 storage when the company adds new partitions to the bucket. Which solution will meet these requirements with the LEAST latency? 

A. Schedule an AWS Glue crawler to run every morning.
B. Manually run the AWS Glue CreatePartition API twice each day.
C. Use code that writes data to Amazon S3 to invoke the Boto3 AWS Glue create_partition API call.
D. Run the MSCK REPAIR TABLE command from the AWS Glue console.
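To show the approach named in option C, here is a hedged boto3 sketch of registering a new partition in the AWS Glue Data Catalog right after the data is written; the database, table, and partition values are placeholders.

import boto3

# Minimal sketch: immediately after writing new objects under
# s3://bucket/prefix/year=2023/month=01/day=01/, register the matching
# partition in the Glue Data Catalog. Names are hypothetical.
glue = boto3.client("glue")

database = "example_lake_db"
table = "events"
year, month, day = "2023", "01", "01"
partition_location = f"s3://bucket/prefix/year={year}/month={month}/day={day}/"

# Reuse the table's storage descriptor so the partition inherits its format.
table_sd = glue.get_table(DatabaseName=database, Name=table)["Table"]["StorageDescriptor"]
partition_sd = dict(table_sd, Location=partition_location)

glue.create_partition(
    DatabaseName=database,
    TableName=table,
    PartitionInput={
        "Values": [year, month, day],
        "StorageDescriptor": partition_sd,
    },
)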


Amazon Data-Engineer-Associate Frequently Asked Questions


When will the beta results be available?

Beta exam results will be available in February 2024.

When will the standard version of the exam be available to take?

The standard version of the exam is available to schedule and take starting March 12, 2024.

When will additional exam prep resources become available?
Are there pre-requisites for the AWS Certified Data Engineer - Associate exam?
No, there are no pre-requisites. The recommended experience prior to taking this exam is the equivalent of 2-3 years in data engineering or data architecture and a minimum of 1-2 years of hands-on experience with AWS services.
How will the AWS Certified Data Engineer - Associate help my career?

This is an in-demand role with a low supply of skilled professionals. AWS Certified Data Engineer - Associate and accompanying prep resources offer you a means to build your confidence and credibility in data engineer, data architect, and other data-related roles.

What certification(s) should I earn next after AWS Certified Data Engineer - Associate?

The AWS Certified Security - Specialty certification is a recommended next step for cloud data professionals to validate their expertise in cloud data security and governance.

How long is this certification valid for?

 This certification is valid for 3 years. Before your certification expires, you can recertify by passing the latest version of this exam.

Are there any savings on exams if I already hold an active AWS Certification?

Yes. Once you earn one AWS Certification, you get a 50% discount on your next AWS Certification exam. You can sign in and access this discount in your AWS Certification Account.

Customer Feedback

What our clients say about Data-Engineer-Associate Practice Test

    Faraz Kashyap     Nov 12, 2024
A heartfelt thank you to Salesforcexamdumps! My success story stands as undeniable proof of the credibility and effectiveness of their Data-Engineer-Associate Exam Tips. These dumps guided me seamlessly through every topic, making the exam appear effortlessly conquerable.
    Shobha Sibal     Nov 11, 2024
Salesforcexamdumps is a hidden gem for Data-Engineer-Associate exam preparation. The offers they have are unbeatable. I love that they offer the study materials in both PDF and test engine formats. Plus, their money-back guarantee shows they stand behind their product. I couldn't be happier with my purchase!
    Nilima Mody     Nov 11, 2024
I stumbled upon Salesforcexamdumps, and it's been a revelation. The Data Engineer Associate content is top-notch. The 80% discount is unbelievable, and the money-back guarantee gives you peace of mind. Don't miss out!
    Roman Martin     Nov 10, 2024
A big thanks to Salesforcexamdumps for covering AWS Cloud Compliance and Security in their Data-Engineer-Associate Test Prep. It's thanks to them that I sailed through the exam effortlessly. Their challenging Data-Engineer-Associate dumps were a true reflection of the actual exam, leading me to pass with flying colors.
    Damian Brown     Nov 10, 2024
Salesforcexamdumps's AWS Certified Data Engineer - Associate (DEA-C01) Certification Questions are an essential resource to explore. The invaluable AWS Certified Data Engineer - Associate (DEA-C01) Exam Insights were a welcomed bonus to their well-structured Data-Engineer-Associate practice test. I am deeply appreciative of the invaluable help and support I received.
    Jaxson Jackson     Nov 09, 2024
I couldn't resist sharing this glowing AWS Certified Data Engineer - Associate (DEA-C01) Review. Salesforcexamdumps has turned my dreams into reality, and today, I stand as a proud AWS-certified professional. This achievement is all thanks to their Data-Engineer-Associate braindumps, along with their invaluable insights and tips, which gave me the unwavering confidence to succeed.
    Ishat Gopal     Nov 09, 2024
Salesforcexamdumps has made my exam preparation a breeze. Their Data Engineer Associate content is comprehensive and well-structured, and the 80% discount and money-back guarantee make it an easy choice. Don't look elsewhere!
    Owen King     Nov 08, 2024
The AWS Certified Data Engineer - Associate (DEA-C01) Learning Path offered by Salesforcexamdumps is your ultimate roadmap to success. With comprehensive coverage of AWS Fundamental Knowledge, passing the exam becomes inevitable. I personally tried and endorsed Salesforcexamdumps's Data-Engineer-Associate question answers. You should definitely give them a try!
    Lucas Mitchell     Nov 08, 2024
The Data-Engineer-Associate braindumps provided by Salesforcexamdumps are nothing short of excellent for AWS Data-Engineer-Associate Exam Preparation. I had complete trust in their originality and relevance throughout the Data-Engineer-Associate Certification Guide. Covering all necessary topics with detailed explanations, I wholeheartedly recommend these to anyone serious about passing.
    Ragini Loke     Nov 07, 2024
I am incredibly impressed with Salesforcexamdumps! The amazing offers they provide make it extremely affordable to access Data-Engineer-Associate exam materials. The PDF and test engine formats are a game-changer, allowing me to study in the way that suits me best. And the icing on the cake is their money-back guarantee, which gave me the confidence to try it out. Highly recommended!
