DAS-C01 dumps
Customer Rating & Feedback: 5 Star

98% of exam questions came directly from these dumps

Amazon DAS-C01 Question Answers

AWS Certified Data Analytics - Specialty Dumps April 2024

Are you tired of looking for a source that will keep you updated on the AWS Certified Data Analytics - Specialty exam and that also offers a collection of affordable, high-quality, and remarkably easy Amazon DAS-C01 Practice Questions? Then you are in luck, because Salesforcexamdumps.com has just updated them! Get ready to become AWS Certified Data Analytics certified.

PDF: $40 (regular price $100)
Test Engine: $56 (regular price $140)
PDF + Test Engine: $72 (regular price $180)

Here are the available features of the Amazon DAS-C01 PDF:

207 questions with answers
Last updated: 24 Apr, 2024
1 day of study required to pass the exam
100% passing assurance
100% money-back guarantee
Free updates for 3 months
Last 24 Hours Results

Students Passed: 95
Average Marks: 91%
Questions From Dumps: 97%
Total Happy Clients: 4688

What is Amazon DAS-C01?

Amazon DAS-C01 is the exam you must pass to earn the AWS Certified Data Analytics - Specialty certification, a credential that rewards candidates who perform well. The AWS Certified Data Analytics certification validates a candidate's expertise in working with Amazon analytics services. In this fast-paced world, a certification is the quickest way to earn your employer's confidence. Take on the AWS Certified Data Analytics - Specialty exam and become a certified professional today. Salesforcexamdumps.com is always eager to lend a helping hand by providing approved and widely accepted Amazon DAS-C01 Practice Questions. Passing AWS Certified Data Analytics - Specialty can be your ticket to a better future!

Pass with Amazon DAS-C01 Braindumps!

Contrary to the belief that certification exams are hard to get through, passing AWS Certified Data Analytics - Specialty is straightforward, provided you have access to a reliable resource such as the Salesforcexamdumps.com Amazon DAS-C01 PDF. We have been in this business long enough to understand where most resources go wrong. Passing the Amazon AWS Certified Data Analytics certification is all about having the right information, so we filled our Amazon DAS-C01 Dumps with all the material you need to pass. These carefully curated sets of AWS Certified Data Analytics - Specialty Practice Questions target the most frequently repeated exam questions, so you know they are essential and can help ensure a passing result. Stop waiting around and order your set of Amazon DAS-C01 Braindumps now!

We aim to provide all AWS Certified Data Analytics certification exam candidates with the best resources at minimal rates. You can check out our free demo before downloading to make sure the Amazon DAS-C01 Practice Questions are what you want. And do not forget about the discount; we always give our customers a little extra.

Why Choose Amazon DAS-C01 PDF?

Unlike other websites, Salesforcexamdumps.com prioritizes the needs of AWS Certified Data Analytics - Specialty candidates. Not every Amazon exam candidate has full-time access to the internet, and it is hard to sit in front of a computer screen for too many hours. Are you one of them? We understand, which is why our Amazon DAS-C01 Question Answers come in two formats: PDF and Online Test Engine. One is for customers who prefer an online platform with realistic exam simulation; the other is for those who like to keep their material close at hand. You can also download or print the Amazon DAS-C01 Dumps with ease.

If you still have questions, our team of experts is available 24/7 to answer them. Just leave us a quick message in the chat box below or email us at [email protected].

Amazon DAS-C01 Sample Questions

Question # 1

A business intelligence (BI) engineer must create a dashboard to visualize how often certain keywords are used in relation to others in social media posts about a public figure. The BI engineer extracts the keywords from the posts and loads them into an Amazon Redshift table. The table displays the keywords and the count corresponding to each keyword. The BI engineer needs to display the top keywords with more emphasis on the most frequently used keywords. Which visual type in Amazon QuickSight meets these requirements?

A. Bar charts
B. Word clouds
C. Circle packing
D. Heat maps


Question # 2

A company uses an Amazon Redshift provisioned cluster for data analysis. The data is not encrypted at rest. A data analytics specialist must implement a solution to encrypt the data at rest. Which solution will meet this requirement with the LEAST operational overhead?

A. Use the ALTER TABLE command with the ENCODE option to update existing columns of the Redshift tables to use LZO encoding.
B. Export data from the existing Redshift cluster to Amazon S3 by using the UNLOAD command with the ENCRYPTED option. Create a new Redshift cluster with encryption configured. Load data into the new cluster by using the COPY command.
C. Create a manual snapshot of the existing Redshift cluster. Restore the snapshot into a new Redshift cluster with encryption configured.
D. Modify the existing Redshift cluster to use AWS Key Management Service (AWS KMS) encryption. Wait for the cluster to finish resizing.
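Note: for readers who want to see what the KMS-based approach in option D looks like in practice, here is a minimal boto3 sketch. The cluster identifier and key ARN are placeholders, not values from the question.

    import boto3

    redshift = boto3.client("redshift", region_name="us-east-1")

    # Turn on encryption for an existing provisioned cluster by associating a KMS key.
    # Redshift migrates the data to an encrypted configuration in the background.
    redshift.modify_cluster(
        ClusterIdentifier="analytics-cluster",  # placeholder
        Encrypted=True,
        KmsKeyId="arn:aws:kms:us-east-1:111122223333:key/EXAMPLE-KEY-ID",  # placeholder
    )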


Question # 3

A company's data science team is designing a shared dataset repository on a Windows server. The data repository will store a large amount of training data that the data science team commonly uses in its machine learning models. The data scientists create a random number of new datasets each day. The company needs a solution that provides persistent, scalable file storage and high levels of throughput and IOPS. The solution also must be highly available and must integrate with Active Directory for access control. Which solution will meet these requirements with the LEAST development effort?

A. Store datasets as files in an Amazon EMR cluster. Set the Active Directory domain for authentication.
B. Store datasets as files in Amazon FSx for Windows File Server. Set the Active Directory domain for authentication.
C. Store datasets as tables in a multi-node Amazon Redshift cluster. Set the Active Directory domain for authentication.
D. Store datasets as global tables in Amazon DynamoDB. Build an application to integrate authentication with the Active Directory domain.
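Note: a rough boto3 sketch of the FSx for Windows File Server setup mentioned in option B. The subnet, security group, and directory IDs are placeholders.

    import boto3

    fsx = boto3.client("fsx", region_name="us-east-1")

    # Create a Multi-AZ FSx for Windows File Server file system joined to an
    # AWS Managed Microsoft AD directory for access control.
    fsx.create_file_system(
        FileSystemType="WINDOWS",
        StorageCapacity=2048,                                # GiB, placeholder sizing
        StorageType="SSD",
        SubnetIds=["subnet-aaaa1111", "subnet-bbbb2222"],    # placeholders
        SecurityGroupIds=["sg-0123456789abcdef0"],           # placeholder
        WindowsConfiguration={
            "ActiveDirectoryId": "d-1234567890",             # placeholder directory ID
            "DeploymentType": "MULTI_AZ_1",
            "PreferredSubnetId": "subnet-aaaa1111",
            "ThroughputCapacity": 512,                       # MB/s, placeholder
        },
    )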


Question # 4

A company is creating a data lake by using AWS Lake Formation. The data that will be stored in the data lake contains sensitive customer information and must be encrypted at rest using an AWS Key Management Service (AWS KMS) customer managed key to meet regulatory requirements. How can the company store the data in the data lake to meet these requirements?

A. Store the data in an encrypted Amazon Elastic Block Store (Amazon EBS) volume. Register the Amazon EBS volume with Lake Formation.
B. Store the data in an Amazon S3 bucket by using server-side encryption with AWS KMS (SSE-KMS). Register the S3 location with Lake Formation.
C. Encrypt the data on the client side and store the encrypted data in an Amazon S3 bucket. Register the S3 location with Lake Formation.
D. Store the data in an Amazon S3 Glacier Flexible Retrieval vault bucket. Register the S3 Glacier Flexible Retrieval vault with Lake Formation.
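Note: a minimal boto3 sketch of the SSE-KMS plus Lake Formation registration described in option B. The bucket name, key ARN, and prefix are placeholders.

    import boto3

    s3 = boto3.client("s3")
    lakeformation = boto3.client("lakeformation")

    # Default-encrypt the data lake bucket with a customer managed KMS key (SSE-KMS).
    s3.put_bucket_encryption(
        Bucket="example-data-lake-bucket",   # placeholder
        ServerSideEncryptionConfiguration={
            "Rules": [{
                "ApplyServerSideEncryptionByDefault": {
                    "SSEAlgorithm": "aws:kms",
                    "KMSMasterKeyID": "arn:aws:kms:us-east-1:111122223333:key/EXAMPLE",  # placeholder
                }
            }]
        },
    )

    # Register the encrypted S3 location with Lake Formation.
    lakeformation.register_resource(
        ResourceArn="arn:aws:s3:::example-data-lake-bucket/raw/",   # placeholder
        UseServiceLinkedRole=True,
    )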


Question # 5

A financial company uses Amazon Athena to query data from an Amazon S3 data lake. Files are stored in the S3 data lake in Apache ORC format. Data analysts recently introduced nested fields in the data lake ORC files and noticed that queries are taking longer to run in Athena. A data analyst discovered that more data than what is required is being scanned for the queries. What is the MOST operationally efficient solution to improve query performance?

A. Flatten nested data and create separate files for each nested dataset.
B. Use the Athena query engine V2 and push the query filter to the source ORC file.
C. Use Apache Parquet format instead of ORC format.
D. Recreate the data partition strategy and further narrow down the data filter criteria.


Question # 6

A company collects data from parking garages. Analysts have requested the ability to run reports in near real time about the number of vehicles in each garage. The company wants to build an ingestion pipeline that loads the data into an Amazon Redshift cluster. The solution must alert operations personnel when the number of vehicles in a particular garage exceeds a specific threshold. The alerting query will use garage threshold values as a static reference. The threshold values are stored in Amazon S3. What is the MOST operationally efficient solution that meets these requirements?

A. Use an Amazon Kinesis Data Firehose delivery stream to collect the data and to deliver the data to Amazon Redshift. Create an Amazon Kinesis Data Analytics application that uses the same delivery stream as an input source. Create a reference data source in Kinesis Data Analytics to temporarily store the threshold values from Amazon S3 and to compare the number of vehicles in a particular garage to the corresponding threshold value. Configure an AWS Lambda function to publish an Amazon Simple Notification Service (Amazon SNS) notification if the number of vehicles exceeds the threshold.
B. Use an Amazon Kinesis data stream to collect the data. Use an Amazon Kinesis Data Firehose delivery stream to deliver the data to Amazon Redshift. Create another Kinesis data stream to temporarily store the threshold values from Amazon S3. Send the delivery stream and the second data stream to Amazon Kinesis Data Analytics to compare the number of vehicles in a particular garage to the corresponding threshold value. Configure an AWS Lambda function to publish an Amazon Simple Notification Service (Amazon SNS) notification if the number of vehicles exceeds the threshold.
C. Use an Amazon Kinesis Data Firehose delivery stream to collect the data and to deliver the data to Amazon Redshift. Automatically initiate an AWS Lambda function that queries the data in Amazon Redshift. Configure the Lambda function to compare the number of vehicles in a particular garage to the corresponding threshold value from Amazon S3. Configure the Lambda function to also publish an Amazon Simple Notification Service (Amazon SNS) notification if the number of vehicles exceeds the threshold.
D. Use an Amazon Kinesis Data Firehose delivery stream to collect the data and to deliver the data to Amazon Redshift. Create an Amazon Kinesis Data Analytics application that uses the same delivery stream as an input source. Use Kinesis Data Analytics to compare the number of vehicles in a particular garage to the corresponding threshold value that is stored in a table as an in-application stream. Configure an AWS Lambda function as an output for the application to publish an Amazon Simple Queue Service (Amazon SQS) notification if the number of vehicles exceeds the threshold.


Question # 7

A company is designing a data warehouse to support business intelligence reporting. Users will access the executive dashboard heavily each Monday and Friday morning for 1 hour. These read-only queries will run on the active Amazon Redshift cluster, which runs on dc2.8xlarge compute nodes 24 hours a day, 7 days a week. There are three queues set up in workload management: Dashboard, ETL, and System. The Amazon Redshift cluster needs to process the queries without wait time. What is the MOST cost-effective way to ensure that the cluster processes these queries?

A. Perform a classic resize to place the cluster in read-only mode while adding an additional node to the cluster.
B. Enable automatic workload management.
C. Perform an elastic resize to add an additional node to the cluster.
D. Enable concurrency scaling for the Dashboard workload queue.
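Note: concurrency scaling is switched on per workload management (WLM) queue through the cluster's parameter group. The sketch below shows one possible shape of that configuration; the parameter group name, query groups, and concurrency values are placeholders and the WLM JSON keys should be checked against the current Redshift documentation.

    import boto3
    import json

    redshift = boto3.client("redshift")

    # Manual WLM configuration with concurrency scaling enabled only for the
    # Dashboard queue; the ETL and default (System) queues are unchanged.
    wlm_config = [
        {"query_group": ["dashboard"], "query_concurrency": 5, "concurrency_scaling": "auto"},
        {"query_group": ["etl"], "query_concurrency": 5},
        {"query_concurrency": 5},   # default queue
    ]

    redshift.modify_cluster_parameter_group(
        ParameterGroupName="bi-cluster-params",   # placeholder
        Parameters=[{
            "ParameterName": "wlm_json_configuration",
            "ParameterValue": json.dumps(wlm_config),
        }],
    )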


Question # 8

A company analyzes historical data and needs to query data that is stored in Amazon S3. New data is generated daily as .csv files that are stored in Amazon S3. The company's data analysts are using Amazon Athena to perform SQL queries against a recent subset of the overall data. The amount of data that is ingested into Amazon S3 has increased to 5 PB over time. The query latency also has increased. The company needs to segment the data to reduce the amount of data that is scanned. Which solutions will improve query performance? (Select TWO.)

A. Configure Athena to use S3 Select to load only the files of the data subset.
B. Create the data subset in Apache Parquet format each day by using the Athena CREATE TABLE AS SELECT (CTAS) statement. Query the Parquet data.
C. Run a daily AWS Glue ETL job to convert the data files to Apache Parquet format and to partition the converted files. Create a periodic AWS Glue crawler to automatically crawl the partitioned data each day.
D. Create an S3 gateway endpoint. Configure VPC routing to access Amazon S3 through the gateway endpoint.
E. Use MySQL Workbench on an Amazon EC2 instance. Connect to Athena by using a JDBC connector. Run the query from MySQL Workbench instead of Athena directly.


Question # 9

A company wants to use a data lake that is hosted on Amazon S3 to provide analytics services for historical data. The data lake consists of 800 tables but is expected to grow to thousands of tables. More than 50 departments use the tables, and each department has hundreds of users. Different departments need access to specific tables and columns. Which solution will meet these requirements with the LEAST operational overhead?

A. Create an IAM role for each department. Use AWS Lake Formation based access control to grant each IAM role access to specific tables and columns. Use Amazon Athena to analyze the data.
B. Create an Amazon Redshift cluster for each department. Use AWS Glue to ingest into the Redshift cluster only the tables and columns that are relevant to that department. Create Redshift database users. Grant the users access to the relevant department's Redshift cluster. Use Amazon Redshift to analyze the data.
C. Create an IAM role for each department. Use AWS Lake Formation tag-based access control to grant each IAM role access to only the relevant resources. Create LF-tags that are attached to tables and columns. Use Amazon Athena to analyze the data.
D. Create an Amazon EMR cluster for each department. Configure an IAM service role for each EMR cluster to access relevant S3 files. For each department's users, create an IAM role that provides access to the relevant EMR cluster. Use Amazon EMR to analyze the data.


Question # 10

A data analyst is designing an Amazon QuickSight dashboard using centralized sales data that resides in Amazon Redshift. The dashboard must be restricted so that a salesperson in Sydney, Australia, can see only the Australia view and a salesperson in New York can see only United States (US) data. What should the data analyst do to ensure the appropriate data security is in place?

A. Place the data sources for Australia and the US into separate SPICE capacity pools.
B. Set up an Amazon Redshift VPC security group for Australia and the US.
C. Deploy QuickSight Enterprise edition to implement row-level security (RLS) to the sales table.
D. Deploy QuickSight Enterprise edition and set up different VPC security groups for Australia and the US.


Question # 11

A gaming company is building a serverless data lake. The company is ingesting streaming data into Amazon Kinesis Data Streams and is writing the data to Amazon S3 through Amazon Kinesis Data Firehose. The company is using 10 MB as the S3 buffer size and is using 90 seconds as the buffer interval. The company runs an AWS Glue ETL job to merge and transform the data to a different format before writing the data back to Amazon S3. Recently, the company has experienced substantial growth in its data volume. The AWS Glue ETL jobs are frequently showing an OutOfMemoryError error. Which solutions will resolve this issue without incurring additional costs? (Select TWO.)

A. Place the small files into one S3 folder. Define one single table for the small S3 files in the AWS Glue Data Catalog. Rerun the AWS Glue ETL jobs against this AWS Glue table.
B. Create an AWS Lambda function to merge small S3 files and invoke it periodically. Run the AWS Glue ETL jobs after successful completion of the Lambda function.
C. Run the S3DistCp utility in Amazon EMR to merge a large number of small S3 files before running the AWS Glue ETL jobs.
D. Use the groupFiles setting in the AWS Glue ETL job to merge small S3 files and rerun the AWS Glue ETL jobs.
E. Update the Kinesis Data Firehose S3 buffer size to 128 MB. Update the buffer interval to 900 seconds.
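Note: for context on the groupFiles setting named in option D, here is a hedged AWS Glue (PySpark) read sketch; the S3 path and group size are placeholders.

    # AWS Glue (PySpark) sketch: read many small JSON files as grouped input so
    # each Spark task processes a larger chunk and the driver tracks fewer files.
    from awsglue.context import GlueContext
    from pyspark.context import SparkContext

    glue_context = GlueContext(SparkContext.getOrCreate())

    dyf = glue_context.create_dynamic_frame.from_options(
        connection_type="s3",
        connection_options={
            "paths": ["s3://example-bucket/firehose-output/"],  # placeholder
            "recurse": True,
            "groupFiles": "inPartition",
            "groupSize": "134217728",   # target roughly 128 MB per group
        },
        format="json",
    )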


Question # 12

A retail company has 15 stores across 6 cities in the United States. Once a month, the sales team requests a visualization in Amazon QuickSight that provides the ability to easily identify revenue trends across cities and stores. The visualization also helps identify outliers that need to be examined with further analysis. Which visual type in QuickSight meets the sales team's requirements?

A. Geospatial chart
B. Line chart
C. Heat map
D. Tree map


Question # 13

A company uses Amazon EC2 instances to receive files from external vendors throughout each day. At the end of each day, the EC2 instances combine the files into a single file, perform gzip compression, and upload the single file to an Amazon S3 bucket. The total size of all the files is approximately 100 GB each day. When the files are uploaded to Amazon S3, an AWS Batch job runs a COPY command to load the files into an Amazon Redshift cluster. Which solution will MOST accelerate the COPY process?

A. Upload the individual files to Amazon S3. Run the COPY command as soon as the files become available.
B. Split the files so that the number of files is equal to a multiple of the number of slices in the Redshift cluster. Compress and upload the files to Amazon S3. Run the COPY command on the files.
C. Split the files so that each file uses 50% of the free storage on each compute node in the Redshift cluster. Compress and upload the files to Amazon S3. Run the COPY command on the files.
D. Apply sharding by breaking up the files so that the DISTKEY columns with the same values go to the same file. Compress and upload the sharded files to Amazon S3. Run the COPY command on the files.
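Note: a minimal sketch of a parallel COPY of the kind the options discuss, issued through the Redshift Data API. The table, bucket prefix, role ARN, and cluster identifier are placeholders.

    import boto3

    redshift_data = boto3.client("redshift-data")

    # COPY loads every object that shares the key prefix in parallel, one file per
    # slice, which is why splitting into a multiple of the slice count speeds up the load.
    copy_sql = """
        COPY vendor_files
        FROM 's3://example-bucket/daily/part_'
        IAM_ROLE 'arn:aws:iam::111122223333:role/RedshiftCopyRole'
        GZIP
        FORMAT AS CSV;
    """

    redshift_data.execute_statement(
        ClusterIdentifier="analytics-cluster",   # placeholder
        Database="dev",
        DbUser="awsuser",
        Sql=copy_sql,
    )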


Question # 14

A bank is building an Amazon S3 data lake. The bank wants a single data repository for customer data needs, such as personalized recommendations. The bank needs to use Amazon Kinesis Data Firehose to ingest customers' personal information, bank accounts, and transactions in near real time from a transactional relational database. All personally identifiable information (PII) that is stored in the S3 bucket must be masked. The bank has enabled versioning for the S3 bucket. Which solution will meet these requirements?

A. Invoke an AWS Lambda function from Kinesis Data Firehose to mask the PII before Kinesis Data Firehose delivers the data to the S3 bucket.
B. Use Amazon Macie to scan the S3 bucket. Configure Macie to discover PII. Invoke an AWS Lambda function from S3 events to mask the PII.
C. Configure server-side encryption (SSE) for the S3 bucket. Invoke an AWS Lambda function from S3 events to mask the PII.
D. Create an AWS Lambda function to read the objects, mask the PII, and store the objects back with the same key. Invoke the Lambda function from S3 events.


Question # 15

A company developed a new voting results reporting website that uses Amazon Kinesis Data Firehose to deliver full logs from AWS WAF to an Amazon S3 bucket. The company is now seeking a solution to perform this infrequent data analysis with data visualization capabilities in a way that requires minimal development effort. Which solution MOST cost-effectively meets these requirements?

A. Use an AWS Glue crawler to create and update a table in the AWS Glue Data Catalog from the logs. Use Amazon Athena to perform ad-hoc analyses. Develop data visualizations by using Amazon QuickSight.
B. Configure Kinesis Data Firehose to deliver the logs to an Amazon OpenSearch Service cluster. Use OpenSearch Service REST APIs to analyze the data. Visualize the data by building an OpenSearch Service dashboard.
C. Create an AWS Lambda function to convert the logs to CSV format. Add the Lambda function to the Kinesis Data Firehose transformation configuration. Use Amazon Redshift to perform a one-time analysis of the logs by using SQL queries. Develop data visualizations by using Amazon QuickSight.
D. Create an Amazon EMR cluster and use Amazon S3 as the data source. Create an Apache Spark job to perform a one-time analysis of the logs. Develop data visualizations by using Amazon QuickSight.


Question # 16

A large ecommerce company uses Amazon DynamoDB with provisioned read capacity and auto scaled write capacity to store its product catalog. The company uses Apache HiveQL statements on an Amazon EMR cluster to query the DynamoDB table. After the company announced a sale on all of its products, wait times for each query have increased. The data analyst has determined that the longer wait times are being caused by throttling when querying the table. Which solution will solve this issue?

A. Increase the size of the EMR nodes that are provisioned.
B. Increase the number of EMR nodes that are in the cluster.
C. Increase the DynamoDB table's provisioned write throughput.
D. Increase the DynamoDB table's provisioned read throughput.


Question # 17

A social media company is using business intelligence tools to analyze data for forecasting. The company is using Apache Kafka to ingest data. The company wants to build dynamic dashboards that include machine learning (ML) insights to forecast key business trends. The dashboards must show recent batched data that is not more than 75 minutes old. Various teams at the company want to view the dashboards by using Amazon QuickSight with ML insights. Which solution will meet these requirements?

A. Replace Kafka with Amazon Managed Streaming for Apache Kafka (Amazon MSK). Use AWS Data Exchange to store the data in Amazon S3. Use SPICE in QuickSight Enterprise edition to refresh the data from Amazon S3 each hour. Use QuickSight to create a dynamic dashboard that includes forecasting and ML insights.
B. Replace Kafka with an Amazon Kinesis data stream. Use AWS Data Exchange to store the data in Amazon S3. Use SPICE in QuickSight Standard edition to refresh the data from Amazon S3 each hour. Use QuickSight to create a dynamic dashboard that includes forecasting and ML insights.
C. Configure the Kafka-Kinesis-Connector to publish the data to an Amazon Kinesis Data Firehose delivery stream. Configure the delivery stream to store the data in Amazon S3 with a max buffer size of 60 seconds. Use SPICE in QuickSight Enterprise edition to refresh the data from Amazon S3 each hour. Use QuickSight to create a dynamic dashboard that includes forecasting and ML insights.
D. Configure the Kafka-Kinesis-Connector to publish the data to an Amazon Kinesis Data Firehose delivery stream. Configure the delivery stream to store the data in Amazon S3 with a max buffer size of 60 seconds. Refresh the data in QuickSight Standard edition SPICE from Amazon S3 by using a scheduled AWS Lambda function. Configure the Lambda function to run every 75 minutes and to invoke the QuickSight API to create a dynamic dashboard that includes forecasting and ML insights.


Question # 18

A company recently created a test AWS account to use for a development environment. The company also created a production AWS account in another AWS Region. As part of its security testing, the company wants to send log data from Amazon CloudWatch Logs in its production account to an Amazon Kinesis data stream in its test account. Which solution will allow the company to accomplish this goal?

A. Create a subscription filter in the production account's CloudWatch Logs to target the Kinesis data stream in the test account as its destination. In the test account, create an IAM role that grants access to the Kinesis data stream and the CloudWatch Logs resources in the production account.
B. In the test account, create an IAM role that grants access to the Kinesis data stream and the CloudWatch Logs resources in the production account. Create a destination data stream in Kinesis Data Streams in the test account with an IAM role and a trust policy that allow CloudWatch Logs in the production account to write to the test account.
C. In the test account, create an IAM role that grants access to the Kinesis data stream and the CloudWatch Logs resources in the production account. Create a destination data stream in Kinesis Data Streams in the test account with an IAM role and a trust policy that allow CloudWatch Logs in the production account to write to the test account.
D. Create a destination data stream in Kinesis Data Streams in the test account with an IAM role and a trust policy that allow CloudWatch Logs in the production account to write to the test account. Create a subscription filter in the production account's CloudWatch Logs to target the Kinesis data stream in the test account as its destination.
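Note: a hedged sketch of the production-account side of the subscription-filter setup the options describe. The log group name and the cross-account destination ARN are placeholders; the destination resource in the test account is assumed to already wrap the Kinesis data stream with its own IAM role and trust policy.

    import boto3

    # Run with production-account credentials.
    logs = boto3.client("logs", region_name="us-east-1")

    # Forward all events from the log group to the cross-account destination,
    # which in turn delivers them to the Kinesis data stream in the test account.
    logs.put_subscription_filter(
        logGroupName="/aws/app/production-logs",   # placeholder
        filterName="ship-to-test-account",
        filterPattern="",                          # empty pattern forwards everything
        destinationArn="arn:aws:logs:us-east-1:222233334444:destination:testKinesisDestination",  # placeholder
    )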


Question # 19

A banking company wants to collect large volumes of transactional data using Amazon Kinesis Data Streams for real-time analytics. The company uses PutRecord to send data to Amazon Kinesis and has observed network outages during certain times of the day. The company wants to obtain exactly-once semantics for the entire processing pipeline. What should the company do to obtain these characteristics?

A. Design the application so it can remove duplicates during processing by embedding a unique ID in each record.
B. Rely on the processing semantics of Amazon Kinesis Data Analytics to avoid duplicate processing of events.
C. Design the data producer so events are not ingested into Kinesis Data Streams multiple times.
D. Rely on the exactly-once processing semantics of Apache Flink and Apache Spark Streaming included in Amazon EMR.


Question # 20

A company uses Amazon Kinesis Data Streams to ingest and process customer behavior information from application users each day. A data analytics specialist notices that its data stream is throttling. The specialist has turned on enhanced monitoring for the Kinesis data stream and has verified that the data stream did not exceed the data limits. The specialist discovers that there are hot shards. Which solution will resolve this issue?

A. Use a random partition key to ingest the records.
B. Increase the number of shards. Split the size of the log records.
C. Limit the number of records that are sent each second by the producer to match the capacity of the stream.
D. Decrease the size of the records that are sent from the producer to match the capacity of the stream.


Question # 21

An online food delivery company wants to optimize its storage costs. The company has been collecting operational data for the last 10 years in a data lake that was built on Amazon S3 by using a Standard storage class. The company does not keep data that is older than 7 years. The data analytics team frequently uses data from the past 6 months for reporting and runs queries on data from the last 2 years about once a month. Data that is more than 2 years old is rarely accessed and is only used for audit purposes. Which combination of solutions will optimize the company's storage costs? (Select TWO.)

A. Create an S3 Lifecycle configuration rule to transition data that is older than 6 months to the S3 Standard-Infrequent Access (S3 Standard-IA) storage class.
B. Create another S3 Lifecycle configuration rule to transition data that is older than 2 years to the S3 Glacier Deep Archive storage class. Create an S3 Lifecycle configuration rule to transition data that is older than 6 months to the S3 One Zone-Infrequent Access (S3 One Zone-IA) storage class.
C. Create another S3 Lifecycle configuration rule to transition data that is older than 2 years to the S3 Glacier Flexible Retrieval storage class.
D. Use the S3 Intelligent-Tiering storage class to store data instead of the S3 Standard storage class.
E. Create an S3 Lifecycle expiration rule to delete data that is older than 7 years.
F. Create an S3 Lifecycle configuration rule to transition data that is older than 7 years to the S3 Glacier Deep Archive storage class.
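Note: a minimal boto3 sketch of the lifecycle transitions and expiration referenced in options A, C, and E. The bucket name, day counts, and rule ID are placeholders.

    import boto3

    s3 = boto3.client("s3")

    # Tier data to Standard-IA after ~6 months, Glacier Flexible Retrieval after
    # ~2 years, and expire it after ~7 years.
    s3.put_bucket_lifecycle_configuration(
        Bucket="example-operational-data-lake",   # placeholder
        LifecycleConfiguration={
            "Rules": [{
                "ID": "tier-and-expire-operational-data",
                "Status": "Enabled",
                "Filter": {"Prefix": ""},          # apply to the whole bucket
                "Transitions": [
                    {"Days": 180, "StorageClass": "STANDARD_IA"},
                    {"Days": 730, "StorageClass": "GLACIER"},   # Glacier Flexible Retrieval
                ],
                "Expiration": {"Days": 2555},      # roughly 7 years
            }]
        },
    )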


Question # 22

A company is using an AWS Lambda function to run Amazon Athena queries against a cross-account AWS Glue Data Catalog. A query returns the following error: HIVE METASTORE ERROR. The error message states that the response payload size exceeds the maximum allowed payload size. The queried table is already partitioned, and the data is stored in an Amazon S3 bucket in the Apache Hive partition format. Which solution will resolve this error?

A. Modify the Lambda function to upload the query response payload as an object into the S3 bucket. Include an S3 object presigned URL as the payload in the Lambda function response.
B. Run the MSCK REPAIR TABLE command on the queried table.
C. Create a separate folder in the S3 bucket. Move the data files that need to be queried into that folder. Create an AWS Glue crawler that points to the folder instead of the S3 bucket.
D. Check the schema of the queried table for any characters that Athena does not support. Replace any unsupported characters with characters that Athena supports.


Question # 23

A large media company is looking for a cost-effective storage and analysis solution for its daily media recordings formatted with embedded metadata. Daily data sizes range between 10-12 TB with stream analysis required on timestamps, video resolutions, file sizes, closed captioning, audio languages, and more. Based on the analysis, processing the datasets is estimated to take between 30-180 minutes depending on the underlying framework selection. The analysis will be done by using business intelligence (BI) tools that can be connected to data sources with AWS or Java Database Connectivity (JDBC) connectors. Which solution meets these requirements?

A. Store the video files in Amazon DynamoDB and use AWS Lambda to extract the metadata from the files and load it to DynamoDB. Use DynamoDB to provide the data to be analyzed by the BI tools.
B. Store the video files in Amazon S3 and use AWS Lambda to extract the metadata from the files and load it to Amazon S3. Use Amazon Athena to provide the data to be analyzed by the BI tools.
C. Store the video files in Amazon DynamoDB and use Amazon EMR to extract the metadata from the files and load it to Apache Hive. Use Apache Hive to provide the data to be analyzed by the BI tools.
D. Store the video files in Amazon S3 and use AWS Glue to extract the metadata from the files and load it to Amazon Redshift. Use Amazon Redshift to provide the data to be analyzed by the BI tools.


Question # 24

A large energy company is using Amazon QuickSight to build dashboards and report the historical usage data of its customers. This data is hosted in Amazon Redshift. The reports need access to all the fact tables' billions of records to create aggregations in real time, grouping by multiple dimensions. A data analyst created the dataset in QuickSight by using a SQL query and not SPICE. Business users have noted that the response time is not fast enough to meet their needs. Which action would speed up the response time for the reports with the LEAST implementation effort?

A. Use QuickSight to modify the current dataset to use SPICE.
B. Use AWS Glue to create an Apache Spark job that joins the fact table with the dimensions. Load the data into a new table.
C. Use Amazon Redshift to create a materialized view that joins the fact table with the dimensions.
D. Use Amazon Redshift to create a stored procedure that joins the fact table with the dimensions. Load the data into a new table.


Question # 25

A data analyst notices the following error message while loading data to an Amazon Redshift cluster: "The bucket you are attempting to access must be addressed using the specified endpoint." What should the data analyst do to resolve this issue?

A. Specify the correct AWS Region for the Amazon S3 bucket by using the REGION option with the COPY command.
B. Change the Amazon S3 object's ACL to grant the S3 bucket owner full control of the object.
C. Launch the Redshift cluster in a VPC.
D. Configure the timeout settings according to the operating system used to connect to the Redshift cluster.
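Note: a short sketch of the REGION option named in option A, shown as a SQL string; the table, bucket, Region, and role ARN are placeholders.

    # Redshift COPY referencing a bucket that lives in a different AWS Region
    # than the cluster, which is the situation that produces this error.
    copy_sql = """
        COPY staging_events
        FROM 's3://example-bucket-in-us-west-2/events/'
        IAM_ROLE 'arn:aws:iam::111122223333:role/RedshiftCopyRole'
        REGION 'us-west-2'
        FORMAT AS CSV;
    """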


Question # 26

A company has a mobile app that has millions of users. The company wants to enhance the mobile app by including interactive data visualizations that show user trends. The data for visualization is stored in a large data lake with 50 million rows. Data that is used in the visualization should be no more than two hours old. Which solution will meet these requirements with the LEAST operational overhead?

A. Run an hourly batch process that renders user-specific data visualizations as static images that are stored in Amazon S3.
B. Precompute aggregated data hourly. Store the data in Amazon DynamoDB. Render the data by using the D3.js JavaScript library.
C. Embed an Amazon QuickSight Enterprise edition dashboard into the mobile app by using the QuickSight Embedding SDK. Refresh data in SPICE hourly.
D. Run Amazon Athena queries behind an Amazon API Gateway API. Render the data by using the D3.js JavaScript library.


Question # 27

A company uses Amazon Redshift for its data warehouse. The company is running an ETL process that receives data in data parts from five third-party providers. The data parts contain independent records that are related to one specific job. The company receives the data parts at various times throughout each day. A data analytics specialist must implement a solution that loads the data into Amazon Redshift only after the company receives all five data parts. Which solution will meet these requirements?

A. Create an Amazon S3 bucket to receive the data. Use S3 multipart upload to collect the data from the different sources and to form a single object before loading the data into Amazon Redshift.
B. Use an AWS Lambda function that is scheduled by cron to load the data into a temporary table in Amazon Redshift. Use Amazon Redshift database triggers to consolidate the final data when all five data parts are ready.
C. Create an Amazon S3 bucket to receive the data. Create an AWS Lambda function that is invoked by S3 upload events. Configure the function to validate that all five data parts are gathered before the function loads the data into Amazon Redshift.
D. Create an Amazon Kinesis Data Firehose delivery stream. Program a Python condition that will invoke a buffer flush when all five data parts are received.


Question # 28

A large marketing company needs to store all of its streaming logs and create near-real-time dashboards. The dashboards will be used to help the company make critical business decisions and must be highly available. Which solution meets these requirements?

A. Store the streaming logs in Amazon S3 with replication to an S3 bucket in a different Availability Zone. Create the dashboards by using Amazon QuickSight.
B. Deploy an Amazon Redshift cluster with at least three nodes in a VPC that spans two Availability Zones. Store the streaming logs and use the Redshift cluster as a source to create the dashboards by using Amazon QuickSight.
C. Store the streaming logs in Amazon S3 with replication to an S3 bucket in a different Availability Zone. Every time a new log is added in the bucket, invoke an AWS Lambda function to update the dashboards in Amazon QuickSight.
D. Store the streaming logs in Amazon OpenSearch Service deployed across three Availability Zones and with three dedicated master nodes. Create the dashboards by using OpenSearch Dashboards.


Question # 29

A data analytics specialist has a 50 GB data file in .csv format and wants to perform a data transformation task. The data analytics specialist is using the Amazon Athena CREATE TABLE AS SELECT (CTAS) statement to perform the transformation. The resulting output will be used to query the data from Amazon Redshift Spectrum. Which CTAS statement should the data analytics specialist use to provide the MOST efficient performance?

A. Option A
B. Option B
C. Option C
D. Option D
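Note: the four option bodies for this question are not reproduced on this page. For orientation only, here is a hedged sketch of what a CTAS statement tuned for downstream Redshift Spectrum reads often looks like (columnar Parquet output, compression, partitioning); the database, table, column, and S3 names are placeholders, not the exam's actual options.

    import boto3

    athena = boto3.client("athena")

    ctas_sql = """
        CREATE TABLE curated.transformed_data
        WITH (
            format = 'PARQUET',
            parquet_compression = 'SNAPPY',
            external_location = 's3://example-curated-bucket/transformed/',
            partitioned_by = ARRAY['dt']
        ) AS
        SELECT col_a, col_b, col_c, dt
        FROM raw.source_csv_table;
    """

    athena.start_query_execution(
        QueryString=ctas_sql,
        QueryExecutionContext={"Database": "curated"},
        ResultConfiguration={"OutputLocation": "s3://example-athena-results/"},  # placeholder
    )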


Question # 30

A financial services company is building a data lake solution on Amazon S3. The company plans to use analytics offerings from AWS to meet user needs for one-time querying and business intelligence reports. A portion of the columns will contain personally identifiable information (PII). Only authorized users should be able to see plaintext PII data. What is the MOST operationally efficient solution that meets these requirements?

A. Define a bucket policy for each S3 bucket of the data lake to allow access to users who have authorization to see PII data. Catalog the data by using AWS Glue. Create two IAM roles. Attach a permissions policy with access to PII columns to one role. Attach a policy without these permissions to the other role.
B. Register the S3 locations with AWS Lake Formation. Create two IAM roles. Use Lake Formation data permissions to grant Select permissions to all of the columns for one role. Grant Select permissions to only columns that contain non-PII data for the other role.
C. Register the S3 locations with AWS Lake Formation. Create an AWS Glue job to create an ETL workflow that removes the PII columns from the data and creates a separate copy of the data in another data lake S3 bucket. Register the new S3 locations with Lake Formation. Grant users the permissions to each data lake based on whether the users are authorized to see PII data.
D. Register the S3 locations with AWS Lake Formation. Create two IAM roles. Attach a permissions policy with access to PII columns to one role. Attach a policy without these permissions to the other role. For each downstream analytics service, use its native security functionality and the IAM roles to secure the PII data.
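Note: a hedged boto3 sketch of a column-level Lake Formation grant of the kind described in option B. The role ARN, database, table, and column names are placeholders.

    import boto3

    lakeformation = boto3.client("lakeformation")

    # Grant one role SELECT on only the non-PII columns of a catalog table;
    # a second, broader grant (not shown) would cover the full-access role.
    lakeformation.grant_permissions(
        Principal={"DataLakePrincipalIdentifier": "arn:aws:iam::111122223333:role/non-pii-analyst"},
        Resource={
            "TableWithColumns": {
                "DatabaseName": "customer_db",               # placeholder
                "Name": "accounts",                          # placeholder
                "ColumnNames": ["account_id", "balance", "open_date"],  # non-PII columns only
            }
        },
        Permissions=["SELECT"],
    )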


Question # 31

A company uses Amazon Connect to manage its contact center. The company uses Salesforce to manage its customer relationship management (CRM) data. The company must build a pipeline to ingest data from Amazon Connect and Salesforce into a data lake that is built on Amazon S3. Which solution will meet this requirement with the LEAST operational overhead?

A. Use Amazon Kinesis Data Streams to ingest the Amazon Connect data. Use Amazon AppFlow to ingest the Salesforce data.
B. Use Amazon Kinesis Data Firehose to ingest the Amazon Connect data. Use Amazon Kinesis Data Streams to ingest the Salesforce data.
C. Use Amazon Kinesis Data Firehose to ingest the Amazon Connect data. Use Amazon AppFlow to ingest the Salesforce data.
D. Use Amazon AppFlow to ingest the Amazon Connect data. Use Amazon Kinesis Data Firehose to ingest the Salesforce data.


Question # 32

A healthcare company ingests patient data from multiple data sources and stores it in an Amazon S3 staging bucket. An AWS Glue ETL job transforms the data, which is written to an S3-based data lake to be queried using Amazon Athena. The company wants to match patient records even when the records do not have a common unique identifier. Which solution meets this requirement?

A. Use Amazon Macie pattern matching as part of the ETL job.
B. Train and use the AWS Glue PySpark filter class in the ETL job.
C. Partition tables and use the ETL job to partition the data on patient name.
D. Train and use the AWS Glue FindMatches ML transform in the ETL job.


Question # 33

A company uses an Amazon EMR cluster with 50 nodes to process operational data and make the data available for data analysts. These jobs run nightly, use Apache Hive with the Apache Tez framework as a processing model, and write results to Hadoop Distributed File System (HDFS). In the last few weeks, jobs are failing and are producing the following error message: "File could only be replicated to 0 nodes instead of 1." A data analytics specialist checks the DataNode logs, the NameNode logs, and network connectivity for potential issues that could have prevented HDFS from replicating data. The data analytics specialist rules out these factors as causes for the issue. Which solution will prevent the jobs from failing?

A. Monitor the HDFSUtilization metric. If the value crosses a user-defined threshold, add task nodes to the EMR cluster.
B. Monitor the HDFSUtilization metric. If the value crosses a user-defined threshold, add core nodes to the EMR cluster.
C. Monitor the MemoryAllocatedMB metric. If the value crosses a user-defined threshold, add task nodes to the EMR cluster.
D. Monitor the MemoryAllocatedMB metric. If the value crosses a user-defined threshold, add core nodes to the EMR cluster.


Question # 34

A company is sending historical datasets to Amazon S3 for storage. A data engineer at the company wants to make these datasets available for analysis using Amazon Athena. The engineer also wants to encrypt the Athena query results in an S3 results location by using AWS solutions for encryption. The requirements for encrypting the query results are as follows:
Use custom keys for encryption of the primary dataset query results.
Use generic encryption for all other query results.
Provide an audit trail for the primary dataset queries that shows when the keys were used and by whom.
Which solution meets these requirements?

A. Use server-side encryption with S3 managed encryption keys (SSE-S3) for the primary dataset. Use SSE-S3 for the other datasets.
B. Use server-side encryption with customer-provided encryption keys (SSE-C) for the primary dataset. Use server-side encryption with S3 managed encryption keys (SSE-S3) for the other datasets.
C. Use server-side encryption with AWS KMS managed customer master keys (SSE-KMS CMKs) for the primary dataset. Use server-side encryption with S3 managed encryption keys (SSE-S3) for the other datasets.
D. Use client-side encryption with AWS Key Management Service (AWS KMS) customer managed keys for the primary dataset. Use S3 client-side encryption with client-side keys for the other datasets.


Question # 35

A bank operates in a regulated environment. The compliance requirements for the country in which the bank operates say that customer data for each state should only be accessible by the bank's employees located in the same state. Bank employees in one state should NOT be able to access data for customers who have provided a home address in a different state. The bank's marketing team has hired a data analyst to gather insights from customer data for a new campaign being launched in certain states. Currently, data linking each customer account to its home state is stored in a tabular .csv file within a single Amazon S3 folder in a private S3 bucket. The total size of the S3 folder is 2 GB uncompressed. Due to the country's compliance requirements, the marketing team is not able to access this folder. The data analyst is responsible for ensuring that the marketing team gets one-time access to customer data for their campaign analytics project, while being subject to all the compliance requirements and controls. Which solution should the data analyst implement to meet the desired requirements with the LEAST amount of setup effort?

A. Re-arrange data in Amazon S3 to store customer data about each state in a different S3 folder within the same bucket. Set up S3 bucket policies to provide marketing employees with appropriate data access under compliance controls. Delete the bucket policies after the project.
B. Load tabular data from Amazon S3 to an Amazon EMR cluster using s3DistCp. Implement a custom Hadoop-based row-level security solution on the Hadoop Distributed File System (HDFS) to provide marketing employees with appropriate data access under compliance controls. Terminate the EMR cluster after the project.
C. Load tabular data from Amazon S3 to Amazon Redshift with the COPY command. Use the built-in row-level security feature in Amazon Redshift to provide marketing employees with appropriate data access under compliance controls. Delete the Amazon Redshift tables after the project.
D. Load tabular data from Amazon S3 to Amazon QuickSight Enterprise edition by directly importing it as a data source. Use the built-in row-level security feature in Amazon QuickSight to provide marketing employees with appropriate data access under compliance controls. Delete Amazon QuickSight data sources after the project is complete.


Question # 36

A company has multiple data workflows to ingest data from its operational databases into its data lake on Amazon S3. The workflows use AWS Glue and Amazon EMR for data processing and ETL. The company wants to enhance its architecture to provide automated orchestration and minimize manual intervention. Which solution should the company use to manage the data workflows to meet these requirements?

A. AWS Glue workflows
B. AWS Step Functions
C. AWS Lambda
D. AWS Batch


Question # 37

A company has a fitness tracker application that generates data from subscribers. The company needs real-time reporting on this data. The data is sent immediately, and the processing latency must be less than 1 second. The company wants to perform anomaly detection on the data as the data is collected. The company also requires a solution that minimizes operational overhead. Which solution meets these requirements?

A. Amazon EMR cluster with Apache Spark streaming, Spark SQL, and Spark's machine learning library (MLlib)
B. Amazon Kinesis Data Firehose with Amazon S3 and Amazon Athena
C. Amazon Kinesis Data Firehose with Amazon QuickSight
D. Amazon Kinesis Data Streams with Amazon Kinesis Data Analytics


Question # 38

A company wants to ingest clickstream data from its website into an Amazon S3 bucket. The streaming data is in JSON format. The data in the S3 bucket must be partitioned by product_id. Which solution will meet these requirements MOST cost-effectively?

A. Create an Amazon Kinesis Data Firehose delivery stream to ingest the streaming data into the S3 bucket. Enable dynamic partitioning. Specify the data field of product_id as one partitioning key.
B. Create an AWS Glue streaming job to partition the data by product_id before delivering the data to the S3 bucket. Create an Amazon Kinesis Data Firehose delivery stream. Specify the AWS Glue job as the destination of the delivery stream.
C. Create an Amazon Kinesis Data Firehose delivery stream to ingest the streaming data into the S3 bucket. Create an AWS Glue ETL job to read the data stream in the S3 bucket, partition the data by product_id, and write the data into another S3 bucket.
D. Create an Amazon Kinesis Data Firehose delivery stream to ingest the streaming data into the S3 bucket. Create an Amazon EMR cluster that includes a job to read the data stream in the S3 bucket, partition the data by product_id, and write the data into another S3 bucket.
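Note: a hedged, trimmed-down sketch of the Firehose dynamic-partitioning setup described in option A. The stream name, bucket ARN, and role ARN are placeholders, and several required delivery-stream settings are omitted for brevity.

    import boto3

    firehose = boto3.client("firehose")

    firehose.create_delivery_stream(
        DeliveryStreamName="clickstream-to-s3",   # placeholder
        DeliveryStreamType="DirectPut",
        ExtendedS3DestinationConfiguration={
            "RoleARN": "arn:aws:iam::111122223333:role/FirehoseDeliveryRole",   # placeholder
            "BucketARN": "arn:aws:s3:::example-clickstream-bucket",             # placeholder
            # The extracted product_id becomes part of the S3 key prefix.
            "Prefix": "product_id=!{partitionKeyFromQuery:product_id}/",
            "ErrorOutputPrefix": "errors/",
            "DynamicPartitioningConfiguration": {"Enabled": True},
            "ProcessingConfiguration": {
                "Enabled": True,
                "Processors": [{
                    "Type": "MetadataExtraction",
                    "Parameters": [
                        {"ParameterName": "MetadataExtractionQuery",
                         "ParameterValue": "{product_id: .product_id}"},
                        {"ParameterName": "JsonParsingEngine", "ParameterValue": "JQ-1.6"},
                    ],
                }],
            },
        },
    )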


Question # 39

A machinery company wants to collect data from sensors. A data analytics specialist needs to implement a solution that aggregates the data in near-real time and saves the data to a persistent data store. The data must be stored in nested JSON format and must be queried from the data store with a latency of single-digit milliseconds. Which solution will meet these requirements?

A. Use Amazon Kinesis Data Streams to receive the data from the sensors. Use Amazon Kinesis Data Analytics to read the stream, aggregate the data, and send the data to an AWS Lambda function. Configure the Lambda function to store the data in Amazon DynamoDB.
B. Use Amazon Kinesis Data Firehose to receive the data from the sensors. Use Amazon Kinesis Data Analytics to aggregate the data. Use an AWS Lambda function to read the data from Kinesis Data Analytics and store the data in Amazon S3.
C. Use Amazon Kinesis Data Firehose to receive the data from the sensors. Use an AWS Lambda function to aggregate the data during capture. Store the data from Kinesis Data Firehose in Amazon DynamoDB.
D. Use Amazon Kinesis Data Firehose to receive the data from the sensors. Use an AWS Lambda function to aggregate the data during capture. Store the data in Amazon S3.


Question # 40

A company is building an analytical solution that includes Amazon S3 as data lake storage and Amazon Redshift for data warehousing. The company wants to use Amazon Redshift Spectrum to query the data that is stored in Amazon S3. Which steps should the company take to improve performance when the company uses Amazon Redshift Spectrum to query the S3 data files? (Select THREE.)

A. Use a columnar storage file format.
B. Partition the data based on the most common query predicates.
C. Split the data into KB-sized files.
D. Keep all files about the same size.
E. Use file formats that are not splittable.
F. Use gzip compression with individual file sizes of 1-5 GB.
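Note: a hedged sketch of Redshift Spectrum DDL that combines a columnar format with partitioning, the two techniques several options reference. The schema, table, columns, role ARN, and S3 locations are placeholders; the two statements would normally be run separately.

    # Redshift Spectrum: external schema backed by the Glue Data Catalog, plus a
    # partitioned external table stored as Parquet.
    create_external_objects_sql = """
        CREATE EXTERNAL SCHEMA spectrum_schema
        FROM DATA CATALOG DATABASE 'datalake_db'
        IAM_ROLE 'arn:aws:iam::111122223333:role/SpectrumRole';

        CREATE EXTERNAL TABLE spectrum_schema.sales (
            order_id BIGINT,
            amount   DOUBLE PRECISION
        )
        PARTITIONED BY (sale_date DATE)
        STORED AS PARQUET
        LOCATION 's3://example-data-lake/sales/';
    """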


Question # 41

An IoT company is collecting data from multiple sensors and is streaming the data to Amazon Managed Streaming for Apache Kafka (Amazon MSK). Each sensor type has its own topic, and each topic has the same number of partitions. The company is planning to turn on more sensors. However, the company wants to evaluate which sensor types are producing the most data so that the company can scale accordingly. The company needs to know which sensor types have the largest values for the following metrics: BytesInPerSec and MessagesInPerSec. Which level of monitoring for Amazon MSK will meet these requirements?

A. DEFAULT level
B. PER TOPIC PER BROKER level
C. PER BROKER level
D. PER TOPIC level


Question # 42

An airline has .csv-formatted data stored in Amazon S3 with an AWS Glue Data Catalog. Data analysts want to join this data with call center data stored in Amazon Redshift as part of a daily batch process. The Amazon Redshift cluster is already under a heavy load. The solution must be managed, serverless, well-functioning, and minimize the load on the existing Amazon Redshift cluster. The solution should also require minimal effort and development activity. Which solution meets these requirements?

A. Unload the call center data from Amazon Redshift to Amazon S3 using an AWS Lambda function. Perform the join with AWS Glue ETL scripts.
B. Export the call center data from Amazon Redshift using a Python shell in AWS Glue. Perform the join with AWS Glue ETL scripts.
C. Create an external table using Amazon Redshift Spectrum for the call center data and perform the join with Amazon Redshift.
D. Export the call center data from Amazon Redshift to Amazon EMR using Apache Sqoop. Perform the join with Apache Hive.


Question # 43

A company plans to store quarterly financial statements in a dedicated Amazon S3 bucket. The financial statements must not be modified or deleted after they are saved to the S3 bucket. Which solution will meet these requirements?

A. Create the S3 bucket with S3 Object Lock in governance mode.
B. Create the S3 bucket with MFA delete enabled.
C. Create the S3 bucket with S3 Object Lock in compliance mode.
D. Create S3 buckets in two AWS Regions. Use S3 Cross-Region Replication (CRR) between the buckets.
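Note: a minimal boto3 sketch of S3 Object Lock in compliance mode as described in option C. The bucket name and retention period are placeholders.

    import boto3

    s3 = boto3.client("s3")

    # Object Lock must be enabled when the bucket is created; compliance mode
    # prevents any user, including the root user, from overwriting or deleting
    # locked object versions during the retention period.
    s3.create_bucket(
        Bucket="example-financial-statements",   # placeholder
        ObjectLockEnabledForBucket=True,
    )

    s3.put_object_lock_configuration(
        Bucket="example-financial-statements",
        ObjectLockConfiguration={
            "ObjectLockEnabled": "Enabled",
            "Rule": {"DefaultRetention": {"Mode": "COMPLIANCE", "Years": 7}},  # example period
        },
    )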


Question # 44

A financial services firm is processing a stream of real-time data from an application by using Apache Kafka and Kafka MirrorMaker. These tools run on premises and stream data to Amazon Managed Streaming for Apache Kafka (Amazon MSK) in the us-east-1 Region. An Apache Flink consumer running on Amazon EMR enriches the data in real time and transfers the output files to an Amazon S3 bucket. The company wants to ensure that the streaming application is highly available across AWS Regions with an RTO of less than 2 minutes. Which solution meets these requirements?

A. Launch another Amazon MSK and Apache Flink cluster in the us-west-1 Region that is the same size as the original cluster in the us-east-1 Region. Simultaneously publish and process the data in both Regions. In the event of a disaster that impacts one of the Regions, switch to the other Region.
B. Set up Cross-Region Replication from the Amazon S3 bucket in the us-east-1 Region to the us-west-1 Region. In the event of a disaster, immediately create Amazon MSK and Apache Flink clusters in the us-west-1 Region and start publishing data to this Region.
C. Add an AWS Lambda function in the us-east-1 Region to read from Amazon MSK and write to a global Amazon DynamoDB table in on-demand capacity mode. Export the data from DynamoDB to Amazon S3 in the us-west-1 Region. In the event of a disaster that impacts the us-east-1 Region, immediately create Amazon MSK and Apache Flink clusters in the us-west-1 Region and start publishing data to this Region.
D. Set up Cross-Region Replication from the Amazon S3 bucket in the us-east-1 Region to the us-west-1 Region. In the event of a disaster, immediately create Amazon MSK and Apache Flink clusters in the us-west-1 Region and start publishing data to this Region. Store 7 days of data in on-premises Kafka clusters and recover the data missed during the recovery time from the on-premises cluster.


Question # 45

A company ingests a large set of sensor data in nested JSON format from different sources and stores it in an Amazon S3 bucket. The sensor data must be joined with performance data currently stored in an Amazon Redshift cluster. A business analyst with basic SQL skills must build dashboards and analyze this data in Amazon QuickSight. A data engineer needs to build a solution to prepare the data for use by the business analyst. The data engineer does not know the structure of the JSON file. The company requires a solution with the least possible implementation effort. Which combination of steps will create a solution that meets these requirements? (Select THREE.)

A. Use an AWS Glue ETL job to convert the data into Apache Parquet format and write to Amazon S3.
B. Use an AWS Glue crawler to catalog the data.
C. Use an AWS Glue ETL job with the ApplyMapping class to un-nest the data and write to Amazon Redshift tables.
D. Use an AWS Glue ETL job with the Relationalize class to un-nest the data and write to Amazon Redshift tables.
E. Use QuickSight to create an Amazon Athena data source to read the Apache Parquet files in Amazon S3.
F. Use QuickSight to create an Amazon Redshift data source to read the native Amazon Redshift tables.


Question # 46

An analytics team uses Amazon OpenSearch Service for an analytics API to be used by data analysts. The OpenSearch Service cluster is configured with three master nodes. The analytics team uses Amazon Managed Streaming for Apache Kafka (Amazon MSK) and a customized data pipeline to ingest and store 2 months of data in an OpenSearch Service cluster. The cluster stopped responding, which is regularly causing timeout requests. The analytics team discovers the cluster is handling too many bulk indexing requests. Which actions would improve the performance of the OpenSearch Service cluster? (Select TWO.)

A. Reduce the number of API bulk requests on the OpenSearch Service cluster and reduce the size of each bulk request.
B. Scale out the OpenSearch Service cluster by increasing the number of nodes.
C. Reduce the number of API bulk requests on the OpenSearch Service cluster, but increase the size of each bulk request.
D. Increase the number of master nodes for the OpenSearch Service cluster.
E. Scale down the pipeline component that is used to ingest the data into the OpenSearch Service cluster.


Question # 47

A global pharmaceutical company receives test results for new drugs from various testing facilities worldwide. The results are sent in millions of 1 KB-sized JSON objects to an Amazon S3 bucket owned by the company. The data engineering team needs to process those files, convert them into Apache Parquet format, and load them into Amazon Redshift for data analysts to perform dashboard reporting. The engineering team uses AWS Glue to process the objects, AWS Step Functions for process orchestration, and Amazon CloudWatch for job scheduling. More testing facilities were recently added, and the time to process files is increasing. What will MOST efficiently decrease the data processing time?

A. Use AWS Lambda to group the small files into larger files. Write the files back to Amazon S3. Process the files using AWS Glue and load them into Amazon Redshift tables.
B. Use the AWS Glue dynamic frame file grouping option while ingesting the raw input files. Process the files and load them into Amazon Redshift tables.
C. Use the Amazon Redshift COPY command to move the files from Amazon S3 into Amazon Redshift tables directly. Process the files in Amazon Redshift.
D. Use Amazon EMR instead of AWS Glue to group the small input files. Process the files in Amazon EMR and load them into Amazon Redshift tables.


Question # 48

A company with a video streaming website wants to analyze user behavior to make recommendations to users in real time. Clickstream data is being sent to Amazon Kinesis Data Streams, and reference data is stored in Amazon S3. The company wants a solution that can use standard SQL queries. The solution must also provide a way to look up pre-calculated reference data while making recommendations. Which solution meets these requirements?

A. Use an AWS Glue Python shell job to process incoming data from Kinesis Data Streams. Use the Boto3 library to write data to Amazon Redshift.
B. Use AWS Glue streaming and Scala to process incoming data from Kinesis Data Streams. Use the AWS Glue connector to write data to Amazon Redshift.
C. Use Amazon Kinesis Data Analytics to create an in-application table based upon the reference data. Process incoming data from Kinesis Data Streams. Use a data stream to write results to Amazon Redshift.
D. Use Amazon Kinesis Data Analytics to create an in-application table based upon the reference data. Process incoming data from Kinesis Data Streams. Use an Amazon Kinesis Data Firehose delivery stream to write results to Amazon Redshift.
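
For context on options C and D, a Kinesis Data Analytics for SQL application can load an S3 object as an in-application reference table that streaming SQL queries can join against. A minimal boto3 sketch; the application name, bucket, role ARN, and column layout are assumptions.

    import boto3

    kda = boto3.client("kinesisanalytics")

    kda.add_application_reference_data_source(
        ApplicationName="clickstream-recommendations",          # assumed application name
        CurrentApplicationVersionId=1,
        ReferenceDataSource={
            "TableName": "CONTENT_REFERENCE",
            "S3ReferenceDataSource": {
                "BucketARN": "arn:aws:s3:::example-reference-bucket",
                "FileKey": "reference/content.csv",
                "ReferenceRoleARN": "arn:aws:iam::111122223333:role/KdaReferenceRole",
            },
            "ReferenceSchema": {
                "RecordFormat": {
                    "RecordFormatType": "CSV",
                    "MappingParameters": {
                        "CSVMappingParameters": {
                            "RecordRowDelimiter": "\n",
                            "RecordColumnDelimiter": ",",
                        }
                    },
                },
                "RecordColumns": [
                    {"Name": "content_id", "SqlType": "VARCHAR(32)"},
                    {"Name": "category", "SqlType": "VARCHAR(64)"},
                ],
            },
        },
    )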


Question # 49

A manufacturing company is storing data from its operational systems in Amazon S3. The company's business analysts need to perform one-time queries of the data in Amazon S3 with Amazon Athena. The company needs to access the Athena service from the on-premises network by using a JDBC connection. The company has created a VPC. Security policies mandate that requests to AWS services cannot traverse the internet. Which combination of steps should a data analytics specialist take to meet these requirements? (Select TWO.)

A. Establish an AWS Direct Connect connection between the on-premises network and the VPC.
B. Configure the JDBC connection to connect to Athena through Amazon API Gateway.
C. Configure the JDBC connection to use a gateway VPC endpoint for Amazon S3.
D. Configure the JDBC connection to use an interface VPC endpoint for Athena.
E. Deploy Athena within a private subnet.
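
For context on option D, an interface VPC endpoint keeps JDBC traffic to Athena on the AWS network. A minimal boto3 sketch; the VPC, subnet, and security group IDs are placeholders.

    import boto3

    ec2 = boto3.client("ec2", region_name="us-east-1")

    # Interface endpoint for Athena inside the company's VPC
    ec2.create_vpc_endpoint(
        VpcEndpointType="Interface",
        VpcId="vpc-0abc1234def567890",
        ServiceName="com.amazonaws.us-east-1.athena",
        SubnetIds=["subnet-0abc1234def567890"],
        SecurityGroupIds=["sg-0abc1234def567890"],
        PrivateDnsEnabled=True,
    )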


Question # 50

A data analyst is using AWS Glue to organize, cleanse, validate, and format a 200 GB dataset. The data analyst triggered the job to run with the Standard worker type. After 3 hours, the AWS Glue job status is still RUNNING. Logs from the job run show no error codes. The data analyst wants to improve the job execution time without overprovisioning. Which actions should the data analyst take?

A. Enable job bookmarks in AWS Glue to estimate the number of data processing units (DPUs). Based on the profiled metrics, increase the value of the executor-cores job parameter.
B. Enable job metrics in AWS Glue to estimate the number of data processing units (DPUs). Based on the profiled metrics, increase the value of the maximum capacity job parameter.
C. Enable job metrics in AWS Glue to estimate the number of data processing units (DPUs). Based on the profiled metrics, increase the value of the spark.yarn.executor.memoryOverhead job parameter.
D. Enable job bookmarks in AWS Glue to estimate the number of data processing units (DPUs). Based on the profiled metrics, increase the value of the num-executors job parameter.
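
For context on option B, a sketch of enabling job metrics and raising the maximum capacity (DPUs) on an existing Glue job through boto3. The job name, role ARN, and script location are placeholders.

    import boto3

    glue = boto3.client("glue")

    # Enable job metrics and raise MaxCapacity for the long-running job
    glue.update_job(
        JobName="cleanse-200gb-dataset",
        JobUpdate={
            "Role": "arn:aws:iam::111122223333:role/ExampleGlueJobRole",
            "Command": {"Name": "glueetl", "ScriptLocation": "s3://example-scripts/cleanse.py"},
            "DefaultArguments": {"--enable-metrics": "true"},
            "MaxCapacity": 20.0,
        },
    )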


Question # 51

A human resources company maintains a 10-node Amazon Redshift cluster to run analytics queries on the company’s data. The Amazon Redshift cluster contains a product table and a transactions table, and both tables have a product_sku column. The tables are over 100 GB in size. The majority of queries run on both tables. Which distribution style should the company use for the two tables to achieve optimal query performance?

A. An EVEN distribution style for both tables 
B. A KEY distribution style for both tables 
C. An ALL distribution style for the product table and an EVEN distribution style for the transactions table 
D. An EVEN distribution style for the product table and a KEY distribution style for the transactions table 
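
For context on option B, distributing both tables on the shared join column collocates matching rows on the same slices. A minimal sketch using the Redshift Data API; the cluster, database, user, and table names are assumptions.

    import boto3

    rsd = boto3.client("redshift-data")

    # Distribute both tables on product_sku so joins avoid redistribution
    rsd.batch_execute_statement(
        ClusterIdentifier="hr-analytics-cluster",
        Database="analytics",
        DbUser="admin",
        Sqls=[
            "ALTER TABLE product ALTER DISTSTYLE KEY DISTKEY product_sku;",
            "ALTER TABLE transactions ALTER DISTSTYLE KEY DISTKEY product_sku;",
        ],
    )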


Question # 52

A large ride-sharing company has thousands of drivers globally serving millions of unique customers every day. The company has decided to migrate an existing data mart to Amazon Redshift. The existing schema includes the following tables. A trips fact table for information on completed rides. A drivers dimension table for driver profiles. A customers fact table holding customer profile information. The company analyzes trip details by date and destination to examine profitability by region. The drivers data rarely changes. The customers data frequently changes. What table design provides optimal query performance?

A. Use DISTSTYLE KEY (destination) for the trips table and sort by date. Use DISTSTYLE ALL for the drivers and customers tables. 
B. Use DISTSTYLE EVEN for the trips table and sort by date. Use DISTSTYLE ALL for the drivers table. Use DISTSTYLE EVEN for the customers table. 
C. Use DISTSTYLE KEY (destination) for the trips table and sort by date. Use DISTSTYLE ALL for the drivers table. Use DISTSTYLE EVEN for the customers table. 
D. Use DISTSTYLE EVEN for the drivers table and sort by date. Use DISTSTYLE ALL for both fact tables. 
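
For context on option A/C-style designs, a minimal DDL sketch showing a DISTKEY and SORTKEY on the large trips fact table, DISTSTYLE ALL on the small, rarely changing drivers table, and DISTSTYLE EVEN on the frequently changing customers table. Table names, columns, and cluster identifiers are assumptions.

    import boto3

    rsd = boto3.client("redshift-data")

    rsd.batch_execute_statement(
        ClusterIdentifier="rideshare-dw",
        Database="analytics",
        DbUser="admin",
        Sqls=[
            # Large fact table: distribute on destination, sort by date
            """CREATE TABLE trips (
                   trip_id BIGINT, trip_date DATE, destination VARCHAR(64),
                   driver_id BIGINT, customer_id BIGINT, fare DECIMAL(10,2))
               DISTKEY (destination) SORTKEY (trip_date);""",
            # Small, rarely changing dimension: replicate to every node
            """CREATE TABLE drivers (driver_id BIGINT, driver_name VARCHAR(128))
               DISTSTYLE ALL;""",
            # Frequently changing table: spread rows evenly
            """CREATE TABLE customers (customer_id BIGINT, customer_name VARCHAR(128))
               DISTSTYLE EVEN;""",
        ],
    )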


Question # 53

An analytics software as a service (SaaS) provider wants to offer its customers business intelligence (BI) reporting capabilities that are self-service. The provider is using Amazon QuickSight to build these reports. The data for the reports resides in a multi-tenant database, but each customer should only be able to access their own data. The provider wants to give customers two user role options:
• Read-only users for individuals who only need to view dashboards
• Power users for individuals who are allowed to create and share new dashboards with other users
Which QuickSight feature allows the provider to meet these requirements?

A. Embedded dashboards 
B. Table calculations 
C. Isolated namespaces 
D. SPICE 
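
For context on option C, QuickSight namespaces isolate each tenant's users and assets, and registered users carry READER or AUTHOR roles. A minimal boto3 sketch; the account ID, namespace, user names, and emails are placeholders.

    import boto3

    qs = boto3.client("quicksight")
    account_id = "111122223333"  # placeholder AWS account ID

    # Each tenant gets an isolated namespace so users cannot see other tenants' assets
    qs.create_namespace(
        AwsAccountId=account_id,
        Namespace="tenant-acme",
        IdentityStore="QUICKSIGHT",
    )

    # Read-only user (views dashboards only)
    qs.register_user(
        AwsAccountId=account_id,
        Namespace="tenant-acme",
        IdentityType="QUICKSIGHT",
        Email="viewer@acme.example",
        UserName="acme-viewer",
        UserRole="READER",
    )

    # Power user (can create and share dashboards)
    qs.register_user(
        AwsAccountId=account_id,
        Namespace="tenant-acme",
        IdentityType="QUICKSIGHT",
        Email="author@acme.example",
        UserName="acme-author",
        UserRole="AUTHOR",
    )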


Question # 54

A software company wants to use instrumentation data to detect and resolve errors to improve application recovery time. The company requires API usage anomalies, like error rate and response time spikes, to be detected in near-real time (NRT). The company also requires that data analysts have access to dashboards for log analysis in NRT. Which solution meets these requirements?

A. Use Amazon Kinesis Data Firehose as the data transport layer for logging data. Use Amazon Kinesis Data Analytics to uncover the NRT API usage anomalies. Use Kinesis Data Firehose to deliver log data to Amazon OpenSearch Service (Amazon Elasticsearch Service) for search, log analytics, and application monitoring. Use OpenSearch Dashboards (Kibana) in Amazon OpenSearch Service (Amazon Elasticsearch Service) for the dashboards. 
B. Use Amazon Kinesis Data Analytics as the data transport layer for logging data. Use Amazon Kinesis Data Streams to uncover NRT monitoring metrics. Use Amazon Kinesis Data Firehose to deliver log data to Amazon OpenSearch Service (Amazon Elasticsearch Service) for search, log analytics, and application monitoring. Use Amazon QuickSight for the dashboards. 
C. Use Amazon Kinesis Data Analytics as the data transport layer for logging data and to uncover NRT monitoring metrics. Use Amazon Kinesis Data Firehose to deliver log data to Amazon OpenSearch Service (Amazon Elasticsearch Service) for search, log analytics, and application monitoring. Use OpenSearch Dashboards (Kibana) in Amazon OpenSearch Service (Amazon Elasticsearch Service) for the dashboards. 
D. Use Amazon Kinesis Data Firehose as the data transport layer for logging data. Use Amazon Kinesis Data Analytics to uncover NRT monitoring metrics. Use Amazon Kinesis Data Streams to deliver log data to Amazon OpenSearch Service (Amazon Elasticsearch Service) for search, log analytics, and application monitoring. Use Amazon QuickSight for the dashboards.
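
For context on option A's Firehose-to-OpenSearch delivery leg, a minimal boto3 sketch using the Elasticsearch-named destination configuration that corresponds to the older service name in this question. The stream name, role ARN, domain ARN, and bucket are placeholders.

    import boto3

    firehose = boto3.client("firehose")

    # Delivery stream sending API logs to the OpenSearch Service domain,
    # with failed documents backed up to S3
    firehose.create_delivery_stream(
        DeliveryStreamName="api-log-delivery",
        DeliveryStreamType="DirectPut",
        ElasticsearchDestinationConfiguration={
            "RoleARN": "arn:aws:iam::111122223333:role/ExampleFirehoseRole",
            "DomainARN": "arn:aws:es:us-east-1:111122223333:domain/api-logs",
            "IndexName": "api-logs",
            "IndexRotationPeriod": "OneDay",
            "S3BackupMode": "FailedDocumentsOnly",
            "S3Configuration": {
                "RoleARN": "arn:aws:iam::111122223333:role/ExampleFirehoseRole",
                "BucketARN": "arn:aws:s3:::example-api-log-backup",
            },
        },
    )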


Question # 55

An advertising company has a data lake that is built on Amazon S3. The company uses AWS Glue Data Catalog to maintain the metadata. The data lake is several years old, and its overall size has increased exponentially as additional data sources and metadata are stored in the data lake. The data lake administrator wants to implement a mechanism to simplify permissions management between Amazon S3 and the Data Catalog to keep them in sync. Which solution will simplify permissions management with minimal development effort?

A. Set AWS Identity and Access Management (IAM) permissions for AWS Glue 
B. Use AWS Lake Formation permissions 
C. Manage AWS Glue and S3 permissions by using bucket policies 
D. Use Amazon Cognito user pools. 
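
For context on option B, AWS Lake Formation centralizes grants on catalog databases and tables instead of managing S3 and IAM policies separately. A minimal boto3 sketch; the database name and role ARN are placeholders.

    import boto3

    lf = boto3.client("lakeformation")

    # Grant SELECT on every table in the data lake database to an analyst role,
    # so access is managed in one place
    lf.grant_permissions(
        Principal={
            "DataLakePrincipalIdentifier": "arn:aws:iam::111122223333:role/ExampleAnalystRole"
        },
        Resource={"Table": {"DatabaseName": "ad_events", "TableWildcard": {}}},
        Permissions=["SELECT"],
    )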


Question # 56

A utility company wants to visualize data for energy usage on a daily basis in Amazon QuickSight. A data analytics specialist at the company has built a data pipeline to collect and ingest the data into Amazon S3. Each day the data is stored in an individual .csv file in an S3 bucket. This is an example of the naming structure: 20210707_data.csv, 20210708_data.csv. To allow for data querying in QuickSight through Amazon Athena, the specialist used an AWS Glue crawler to create a table with the path "s3://powertransformer/20210707_data.csv". However, when the data is queried, it returns zero rows. How can this issue be resolved?

A. Modify the IAM policy for the AWS Glue crawler to access Amazon S3. 
B. Ingest the files again. 
C. Store the files in Apache Parquet format. 
D. Update the table path to "s3://powertransformer/". 
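
For context on option D, pointing the crawler at the bucket prefix rather than a single day's file lets every daily .csv land in one queryable table. A minimal boto3 sketch; the crawler name is an assumption.

    import boto3

    glue = boto3.client("glue")

    # Crawl the whole prefix so new daily files are picked up in the same table
    glue.update_crawler(
        Name="powertransformer-daily-crawler",
        Targets={"S3Targets": [{"Path": "s3://powertransformer/"}]},
    )
    glue.start_crawler(Name="powertransformer-daily-crawler")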


Question # 57

A company using Amazon QuickSight Enterprise edition has thousands of dashboards, analyses, and datasets. The company struggles to manage and assign permissions for granting users access to various items within QuickSight. The company wants to make it easier to implement sharing and permissions management. Which solution should the company implement to simplify permissions management?

A. Use QuickSight folders to organize dashboards, analyses, and datasets. Assign individual users permissions to these folders 
B. Use QuickSight folders to organize dashboards, analyses, and datasets. Assign group permissions by using these folders. 
C. Use AWS IAM resource-based policies to assign group permissions to QuickSight items 
D. Use QuickSight user management APIs to provision group permissions based on dashboard naming conventions 
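
For context on option B, a sketch of creating a shared folder, adding a dashboard to it, and granting a QuickSight group access to the folder. The account ID, folder ID, dashboard ID, group ARN, and the action list are assumptions; consult the QuickSight documentation for the full viewer and owner action sets.

    import boto3

    qs = boto3.client("quicksight")
    account_id = "111122223333"  # placeholder AWS account ID

    # Shared folder for the sales team's assets
    qs.create_folder(
        AwsAccountId=account_id,
        FolderId="sales-team",
        Name="Sales Team",
    )

    # Add an existing dashboard to the folder
    qs.create_folder_membership(
        AwsAccountId=account_id,
        FolderId="sales-team",
        MemberId="example-dashboard-id",
        MemberType="DASHBOARD",
    )

    # Grant a QuickSight group access to the folder in one place
    qs.update_folder_permissions(
        AwsAccountId=account_id,
        FolderId="sales-team",
        GrantPermissions=[{
            "Principal": f"arn:aws:quicksight:us-east-1:{account_id}:group/default/sales-analysts",
            "Actions": ["quicksight:DescribeFolder"],  # extend with the documented viewer actions
        }],
    )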


Question # 58

A company is reading data from various customer databases that run on Amazon RDS. The databases contain many inconsistent fields. For example, a customer record field that is place_id in one database is location_id in another database. The company wants to link customer records across different databases, even when many customer record fields do not match exactly. Which solution will meet these requirements with the LEAST operational overhead?

A. Create an Amazon EMR cluster to process and analyze data in the databases. Connect to the Apache Zeppelin notebook, and use the FindMatches transform to find duplicate records in the data. 
B. Create an AWS Glue crawler to crawl the databases. Use the FindMatches transform to find duplicate records in the data. Evaluate and tune the transform by evaluating performance and results of finding matches. 
C. Create an AWS Glue crawler to crawl the data in the databases. Use Amazon SageMaker to construct Apache Spark ML pipelines to find duplicate records in the data. 
D. Create an Amazon EMR cluster to process and analyze data in the databases. Connect to the Apache Zeppelin notebook, and use Apache Spark ML to find duplicate records in the data. Evaluate and tune the model by evaluating performance and results of finding duplicates.
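
For context on option B, the AWS Glue FindMatches ML transform can be created over a crawled table and then tuned. A minimal boto3 sketch; the transform name, catalog database, table, role ARN, and primary key column are placeholders.

    import boto3

    glue = boto3.client("glue")

    # FindMatches ML transform over the crawled, merged customer table
    glue.create_ml_transform(
        Name="link-customer-records",
        Role="arn:aws:iam::111122223333:role/ExampleGlueMLRole",
        InputRecordTables=[
            {"DatabaseName": "customer_catalog", "TableName": "merged_customers"}
        ],
        Parameters={
            "TransformType": "FIND_MATCHES",
            "FindMatchesParameters": {
                "PrimaryKeyColumnName": "customer_id",
                "PrecisionRecallTradeoff": 0.9,  # lean toward precision over recall
                "EnforceProvidedLabels": False,
            },
        },
    )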


Question # 59

A bank wants to migrate a Teradata data warehouse to the AWS Cloud. The bank needs a solution for reading large amounts of data and requires the highest possible performance. The solution also must maintain the separation of storage and compute. Which solution meets these requirements?

A. Use Amazon Athena to query the data in Amazon S3 
B. Use Amazon Redshift with dense compute nodes to query the data in Amazon Redshift managed storage 
C. Use Amazon Redshift with RA3 nodes to query the data in Amazon Redshift managed storage 
D. Use PrestoDB on Amazon EMR to query the data in Amazon S3 
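
For context on option C, RA3 nodes keep compute local while data lives in Redshift managed storage, so storage and compute scale separately. A minimal boto3 sketch; the cluster name, sizing, and credentials are placeholders (real passwords belong in AWS Secrets Manager).

    import boto3

    redshift = boto3.client("redshift")

    # RA3 cluster backed by Redshift managed storage
    redshift.create_cluster(
        ClusterIdentifier="teradata-migration-dw",
        NodeType="ra3.4xlarge",
        NumberOfNodes=4,
        DBName="dw",
        MasterUsername="admin",
        MasterUserPassword="Example-Passw0rd",  # placeholder; use Secrets Manager
        Encrypted=True,
    )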


Question # 60

A data analyst runs a large number of data manipulation language (DML) queries by using Amazon Athena with the JDBC driver. Recently, a query failed after it ran for 30 minutes. The query returned the following message: java.sql.SQLException: Query timeout. The data analyst does not immediately need the query results. However, the data analyst needs a long-term solution for this problem. Which solution will meet these requirements?

A. Split the query into smaller queries to search smaller subsets of data. 
B. In the settings for Athena, adjust the DML query timeout limit 
C. In the Service Quotas console, request an increase for the DML query timeout 
D. Save the tables as compressed .csv files 
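
For context on option C, the Athena DML query timeout is an adjustable quota that can be raised through Service Quotas. A minimal boto3 sketch; the quota is looked up by name at runtime rather than hard-coded, and the desired value is an assumption.

    import boto3

    quotas = boto3.client("service-quotas")

    # Find the Athena DML query timeout quota, then request an increase
    athena_quotas = quotas.list_service_quotas(ServiceCode="athena")["Quotas"]
    dml_timeout = next(
        q for q in athena_quotas
        if "DML" in q["QuotaName"] and "timeout" in q["QuotaName"].lower()
    )

    quotas.request_service_quota_increase(
        ServiceCode="athena",
        QuotaCode=dml_timeout["QuotaCode"],
        DesiredValue=60.0,  # minutes; adjust to the required runtime
    )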


Question # 61

A bank is using Amazon Managed Streaming for Apache Kafka (Amazon MSK) to populate real-time data into a data lake. The data lake is built on Amazon S3, and data must be accessible from the data lake within 24 hours. Different microservices produce messages to different topics in the cluster. The cluster is created with 8 TB of Amazon Elastic Block Store (Amazon EBS) storage and a retention period of 7 days. The customer transaction volume has tripled recently, and disk monitoring has provided an alert that the cluster is almost out of storage capacity. What should a data analytics specialist do to prevent the cluster from running out of disk space?

A. Use the Amazon MSK console to triple the broker storage and restart the cluster 
B. Create an Amazon CloudWatch alarm that monitors the KafkaDataLogsDiskUsed metric. Automatically flush the oldest messages when the value of this metric exceeds 85% 
C. Create a custom Amazon MSK configuration. Set the log.retention.hours parameter to 48. Update the cluster with the new configuration file 
D. Triple the number of consumers to ensure that data is consumed as soon as it is added to a topic.
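
For context on option C, a sketch of creating a custom MSK configuration that lowers retention to 48 hours and applying it to the existing cluster. The cluster ARN is a placeholder, and the cluster's current version is read from describe_cluster.

    import boto3

    kafka = boto3.client("kafka")

    # Custom configuration that drops retention from 7 days to 48 hours
    config = kafka.create_configuration(
        Name="retention-48h",
        ServerProperties=b"log.retention.hours=48\n",
    )

    cluster_arn = "arn:aws:kafka:us-east-1:111122223333:cluster/example-cluster/abc123"
    current_version = kafka.describe_cluster(ClusterArn=cluster_arn)["ClusterInfo"]["CurrentVersion"]

    # Apply the new configuration revision to the running cluster
    kafka.update_cluster_configuration(
        ClusterArn=cluster_arn,
        ConfigurationInfo={
            "Arn": config["Arn"],
            "Revision": config["LatestRevision"]["Revision"],
        },
        CurrentVersion=current_version,
    )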


Question # 62

An online retail company uses Amazon Redshift to store historical sales transactions. The company is required to encrypt data at rest in the clusters to comply with the Payment Card Industry Data Security Standard (PCI DSS). A corporate governance policy mandates management of encryption keys using an on-premises hardware security module (HSM). Which solution meets these requirements? 

A. Create and manage encryption keys using AWS CloudHSM Classic. Launch an Amazon Redshift cluster in a VPC with the option to use CloudHSM Classic for key management. 
B. Create a VPC and establish a VPN connection between the VPC and the on-premises network. Create an HSM connection and client certificate for the on-premises HSM. Launch a cluster in the VPC with the option to use the on-premises HSM to store keys. 
C. Create an HSM connection and client certificate for the on-premises HSM. Enable HSM encryption on the existing unencrypted cluster by modifying the cluster. Connect to the VPC where the Amazon Redshift cluster resides from the on-premises network using a VPN. 
D. Create a replica of the on-premises HSM in AWS CloudHSM. Launch a cluster in a VPC with the option to use CloudHSM to store keys.
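
For context on option B, Amazon Redshift registers an on-premises HSM through an HSM client certificate and an HSM configuration, and the cluster is then launched with HSM-managed keys. A minimal boto3 sketch; the identifiers, IP address, partition values, certificate text, and cluster settings are placeholders.

    import boto3

    redshift = boto3.client("redshift")

    # Register the on-premises HSM (reachable over the VPN) with Amazon Redshift
    redshift.create_hsm_client_certificate(
        HsmClientCertificateIdentifier="onprem-hsm-client-cert"
    )
    redshift.create_hsm_configuration(
        HsmConfigurationIdentifier="onprem-hsm",
        Description="On-premises HSM reached over VPN",
        HsmIpAddress="10.0.100.20",
        HsmPartitionName="redshift",
        HsmPartitionPassword="example-partition-password",
        HsmServerPublicCertificate="-----BEGIN CERTIFICATE-----...",
    )

    # Launch the cluster with encryption keys managed by the on-premises HSM
    redshift.create_cluster(
        ClusterIdentifier="sales-history-encrypted",
        NodeType="ra3.4xlarge",
        NumberOfNodes=2,
        MasterUsername="admin",
        MasterUserPassword="Example-Passw0rd",  # placeholder; use Secrets Manager
        Encrypted=True,
        HsmClientCertificateIdentifier="onprem-hsm-client-cert",
        HsmConfigurationIdentifier="onprem-hsm",
    )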




Customers Feedback

What our clients say about DAS-C01 Quiz Sheets

    William Martin     Apr 26, 2024
Your team is amazing! The questions were incredibly similar to the actual exam. I got an 870/1000 score. Thank you so much.
    Samantha Davis     Apr 25, 2024
The DAS-C01 exam study materials from Salesforcexamdumps.com are excellent. The PDF dumps and online test engine are well-structured and provide a comprehensive coverage of the exam topics. The questions in the dumps are very similar to the actual exam, which helped me gain confidence and prepare effectively. I was able to pass the DAS-C01 exam with a high score thanks to these study materials. Overall, I highly recommend Salesforcexamdumps.com for anyone preparing for the DAS-C01 exam.
    Christopher     Apr 25, 2024
Very helpful study material. I passed and got 95% marks.
    David Lee     Apr 24, 2024
Thanks to the Dumps of DAS-C01 Exam, my preparation was significantly simplified. I was pleasantly surprised at how easily I was able to obtain and follow the instructions in the PDF file. What's more, the material came with a money-back guarantee, which boosted my confidence even further. Now that I'm certified in Amazon, I'm thrilled. I'm grateful to Salesforcexamdumps.com for making my journey to certification so much smoother.
    Madison     Apr 24, 2024
Despite my lack of study time, I was able to pass DAS-C01 exam with the help of this exam dump and will purchase it again. Thank you so much Salesforcexamdumps.com
    Joseph Wright     Apr 23, 2024
These DAS-C01 Dumps are well-formatted, and the Test Engine software was definitely worth the money I spent. Thanks, guys!
    Sarah Johnson     Apr 23, 2024
Salesforcexamdumps.com has earned my trust and brought me genuine happiness by aiding me in passing my DAS-C01 exam and obtaining my AWS Certified Data Analytics - Specialty certification. I simply obtained their exam study guide in PDF format and began preparing for the exam. I'm grateful to the team for their assistance.
    Williams     Apr 22, 2024
This prep guide acted as a secret cheat code for me to pass my exam easily.
    Daniel Davis     Apr 22, 2024
Passed AWS Certified Data Analytics - Specialty Exam using Practice Test. Thanks Salesforcexamdumps.com Team.
    Victoria Jones     Apr 21, 2024
The DAS-C01 exam dumps are excellent content and very helpful. I studied for 5 days and made sure to get an 85+ passing mark using the online test engine. I booked my certification exam and passed it on my first try. Highly recommended, thanks!
