The latest Amazon DAS-C01 dumps by Lead4Pass help you pass the DAS-C01 exam on your first attempt! Lead4Pass
has released its latest update of the Amazon DAS-C01 VCE dumps and DAS-C01 PDF dumps; the Lead4Pass DAS-C01 exam questions have been updated and the answers corrected!
Get the latest Lead4Pass DAS-C01 dumps with Vce and PDF: https://www.leads4pass.com/das-c01.html (Q&As: 77 dumps)
[Free DAS-C01 PDF] Latest Amazon DAS-C01 Dumps PDF collected by Lead4pass Google Drive:
https://drive.google.com/file/d/11eR0VE7iSQs3rgXd0sa3F7AIxfJDFD_0/
[Lead4pass DAS-C01 Youtube] Amazon DAS-C01 dumps can be viewed on YouTube, shared by Lead4Pass
Latest Amazon DAS-C01 Exam Practice Questions and Answers
QUESTION 1
An airline has .csv-formatted data stored in Amazon S3 with an AWS Glue Data Catalog. Data analysts want to join this
data with call center data stored in Amazon Redshift as part of a daily batch process. The Amazon Redshift cluster is
already under a heavy load. The solution must be managed and serverless, must perform well, and must minimize the load on the
existing Amazon Redshift cluster. The solution should also require minimal effort and development activity.
Which solution meets these requirements?
A. Unload the call center data from Amazon Redshift to Amazon S3 using an AWS Lambda function. Perform the join
with AWS Glue ETL scripts.
B. Export the call center data from Amazon Redshift using a Python shell in AWS Glue. Perform the join with AWS Glue
ETL scripts.
C. Create an external table using Amazon Redshift Spectrum for the call center data and perform the join with Amazon
Redshift.
D. Export the call center data from Amazon Redshift to Amazon EMR using Apache Sqoop. Perform the join with
Apache Hive.
Correct Answer: C
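Answer C can be sketched with a couple of SQL statements run through the Redshift Data API. The cluster identifier, database, IAM role, Glue database, and table names below are illustrative assumptions, not values from the question:

import boto3

redshift_data = boto3.client("redshift-data")

statements = [
    # Expose the Glue Data Catalog database that describes the .csv files in S3
    """
    CREATE EXTERNAL SCHEMA IF NOT EXISTS airline_spectrum
    FROM DATA CATALOG
    DATABASE 'airline_glue_db'
    IAM_ROLE 'arn:aws:iam::123456789012:role/SpectrumRole'
    """,
    # Join the external (S3) data with the call center table already in Redshift
    """
    SELECT cc.agent_id, COUNT(*) AS handled_bookings
    FROM airline_spectrum.flight_bookings fb
    JOIN call_center cc ON cc.booking_id = fb.booking_id
    GROUP BY cc.agent_id
    """,
]

for sql in statements:
    redshift_data.execute_statement(
        ClusterIdentifier="analytics-cluster",  # assumed cluster name
        Database="dev",
        DbUser="analyst",
        Sql=sql,
    )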
QUESTION 2
A team of data scientists plans to analyze market trend data for their company's new investment strategy. The trend
data comes from five different data sources in large volumes. The team wants to utilize Amazon Kinesis to support their
use case. The team uses SQL-like queries to analyze trends and wants to send notifications based on certain significant
patterns in the trends. Additionally, the data scientists want to save the data to Amazon S3 for archival and historical
reprocessing, and use AWS managed services wherever possible. The team wants to implement the lowest-cost
solution.
Which solution meets these requirements?
A. Publish data to one Kinesis data stream. Deploy a custom application using the Kinesis Client Library (KCL) for
analyzing trends, and send notifications using Amazon SNS. Configure Kinesis Data Firehose on the Kinesis data
stream to persist data to an S3 bucket.
B. Publish data to one Kinesis data stream. Deploy Kinesis Data Analytics to the stream for analyzing trends, and
configure an AWS Lambda function as an output to send notifications using Amazon SNS. Configure Kinesis Data
Firehose on the Kinesis data stream to persist data to an S3 bucket.
C. Publish data to two Kinesis data streams. Deploy Kinesis Data Analytics to the first stream for analyzing trends, and
configure an AWS Lambda function as an output to send notifications using Amazon SNS. Configure Kinesis Data
Firehose on the second Kinesis data stream to persist data to an S3 bucket.
D. Publish data to two Kinesis data streams. Deploy a custom application using the Kinesis Client Library (KCL) to the
first stream for analyzing trends, and send notifications using Amazon SNS. Configure Kinesis Data Firehose on the
second Kinesis data stream to persist data to an S3 bucket.
Correct Answer: A
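The consume-and-notify pattern in answer A can be sketched as follows. A production consumer would use the Kinesis Client Library for checkpointing and load balancing; this plain boto3 loop, with an assumed stream name, SNS topic, and threshold, only illustrates the idea:

import json
import time

import boto3

kinesis = boto3.client("kinesis")
sns = boto3.client("sns")

TOPIC_ARN = "arn:aws:sns:us-east-1:123456789012:trend-alerts"  # assumed topic

shard_iterator = kinesis.get_shard_iterator(
    StreamName="market-trends",           # assumed stream name
    ShardId="shardId-000000000000",
    ShardIteratorType="LATEST",
)["ShardIterator"]

while True:
    batch = kinesis.get_records(ShardIterator=shard_iterator, Limit=500)
    for record in batch["Records"]:
        trend = json.loads(record["Data"])
        if abs(trend.get("pct_change", 0)) > 5:   # example "significant pattern"
            sns.publish(TopicArn=TOPIC_ARN, Message=json.dumps(trend))
    shard_iterator = batch["NextShardIterator"]
    time.sleep(1)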
QUESTION 3
A financial services company needs to aggregate daily stock trade data from the exchanges into a data store. The
company requires that data be streamed directly into the data store, but also occasionally allows data to be modified
using SQL. The solution should integrate complex, analytic queries running with minimal latency. The solution must
provide a business intelligence dashboard that enables viewing of the top contributors to anomalies in stock prices.
Which solution meets the company's requirements?
A. Use Amazon Kinesis Data Firehose to stream data to Amazon S3. Use Amazon Athena as a data source for Amazon
QuickSight to create a business intelligence dashboard.
B. Use Amazon Kinesis Data Streams to stream data to Amazon Redshift. Use Amazon Redshift as a data source for
Amazon QuickSight to create a business intelligence dashboard.
C. Use Amazon Kinesis Data Firehose to stream data to Amazon Redshift. Use Amazon Redshift as a data source for
Amazon QuickSight to create a business intelligence dashboard.
D. Use Amazon Kinesis Data Streams to stream data to Amazon S3. Use Amazon Athena as a data source for Amazon
QuickSight to create a business intelligence dashboard.
Correct Answer: D
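Whichever delivery option is chosen, the dashboard side of options A and D comes down to Athena queries that QuickSight issues against the trade data in S3. A minimal sketch of such a query through boto3, with assumed database, table, and result-bucket names:

import boto3

athena = boto3.client("athena")

response = athena.start_query_execution(
    QueryString="""
        SELECT symbol, MAX(ABS(price_change)) AS largest_move
        FROM trades
        GROUP BY symbol
        ORDER BY largest_move DESC
        LIMIT 10
    """,
    QueryExecutionContext={"Database": "trading_db"},
    ResultConfiguration={"OutputLocation": "s3://athena-query-results-bucket/trades/"},
)
print(response["QueryExecutionId"])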
QUESTION 4
A media company wants to perform machine learning and analytics on the data residing in its Amazon S3 data lake.
There are two data transformation requirements that will enable the consumers within the company to create reports:
1.
Daily transformations of 300 GB of data with different file formats landing in Amazon S3 at a scheduled time.
2.
One-time transformations of terabytes of archived data residing in the S3 data lake.
Which combination of solutions cost-effectively meets the company's requirements for transforming the data? (Choose
three.)
A. For daily incoming data, use AWS Glue crawlers to scan and identify the schema.
B. For daily incoming data, use Amazon Athena to scan and identify the schema.
C. For daily incoming data, use Amazon Redshift to perform transformations.
D. For daily incoming data, use AWS Glue workflows with AWS Glue jobs to perform transformations.
E. For archived data, use Amazon EMR to perform data transformations.
F. For archived data, use Amazon SageMaker to perform data transformations.
Correct Answer: BCD
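For the daily transformations in option D, an AWS Glue job is essentially a short PySpark script. The sketch below assumes a catalog database, table, and output bucket that are not part of the question:

import sys

from awsglue.context import GlueContext
from awsglue.job import Job
from awsglue.utils import getResolvedOptions
from pyspark.context import SparkContext

args = getResolvedOptions(sys.argv, ["JOB_NAME"])
glue_context = GlueContext(SparkContext.getOrCreate())
job = Job(glue_context)
job.init(args["JOB_NAME"], args)

# Read the day's raw files through the Data Catalog table
raw = glue_context.create_dynamic_frame.from_catalog(
    database="media_lake", table_name="daily_raw"
)

# Write a columnar (Parquet) copy for the reporting consumers
glue_context.write_dynamic_frame.from_options(
    frame=raw,
    connection_type="s3",
    connection_options={"path": "s3://media-lake-processed/daily/"},
    format="parquet",
)

job.commit()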
QUESTION 5
A company has developed an Apache Hive script to batch process data stored in Amazon S3. The script needs to run
once every day and store the output in Amazon S3. The company tested the script, and it completes within 30 minutes
on a small local three-node cluster.
Which solution is the MOST cost-effective for scheduling and executing the script?
A. Create an AWS Lambda function to spin up an Amazon EMR cluster with a Hive execution step. Set
KeepJobFlowAliveWhenNoSteps to false and disable the termination protection flag. Use Amazon CloudWatch Events
to schedule the Lambda function to run daily.
B. Use the AWS Management Console to spin up an Amazon EMR cluster with Python, Hue, Hive, and Apache Oozie.
Set the termination protection flag to true and use Spot Instances for the core nodes of the cluster. Configure an Oozie workflow in the cluster to invoke the Hive script daily.
C. Create an AWS Glue job with the Hive script to perform the batch operation. Configure the job to run once a day
using a time-based schedule.
D. Use AWS Lambda layers and load the Hive runtime to AWS Lambda and copy the Hive script. Schedule the Lambda
function to run daily by creating a workflow using AWS Step Functions.
Correct Answer: C
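Option C amounts to registering the batch logic as a Glue job and attaching a time-based trigger. The sketch below assumes the Hive logic has been ported to a Glue Spark script; the job name, role, schedule, and script location are placeholders:

import boto3

glue = boto3.client("glue")

glue.create_job(
    Name="daily-hive-batch",
    Role="arn:aws:iam::123456789012:role/GlueJobRole",
    Command={
        "Name": "glueetl",
        "ScriptLocation": "s3://etl-scripts-bucket/daily_batch.py",
        "PythonVersion": "3",
    },
    GlueVersion="3.0",
)

glue.create_trigger(
    Name="daily-hive-batch-schedule",
    Type="SCHEDULED",
    Schedule="cron(0 2 * * ? *)",              # once a day at 02:00 UTC
    Actions=[{"JobName": "daily-hive-batch"}],
    StartOnCreation=True,
)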
QUESTION 6
A healthcare company uses AWS data and analytics tools to collect, ingest, and store electronic health record (EHR)
data about its patients. The raw EHR data is stored in Amazon S3 in JSON format partitioned by the hour, day, and year
and is updated every hour. The company wants to maintain the data catalog and metadata in an AWS Glue Data
Catalog to be able to access the data using Amazon Athena or Amazon Redshift Spectrum for analytics.
When defining tables in the Data Catalog, the company has the following requirements:
1.
Choose the catalog table name and do not rely on the catalog table naming algorithm.
2.
Keep the table updated with new partitions loaded in the respective S3 bucket prefixes.
Which solution meets these requirements with minimal effort?
A. Run an AWS Glue crawler that connects to one or more data stores, determines the data structures, and writes
tables in the Data Catalog.
B. Use the AWS Glue console to manually create a table in the Data Catalog and schedule an AWS Lambda function to
update the table partitions hourly.
C. Use the AWS Glue API CreateTable operation to create a table in the Data Catalog. Create an AWS Glue crawler
and specify the table as the source.
D. Create an Apache Hive catalog in Amazon EMR with the table schema definition in Amazon S3 and update the table
partition with a scheduled job. Migrate the Hive catalog to the Data Catalog.
Correct Answer: B
Reference: https://docs.aws.amazon.com/glue/latest/dg/tables-described.html
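The hourly Lambda function from answer B could look roughly like the following, which registers the newest partition with an Athena DDL statement. The table, database, bucket, and year/day/hour prefix layout are assumptions:

import datetime

import boto3

athena = boto3.client("athena")


def lambda_handler(event, context):
    now = datetime.datetime.utcnow()
    ddl = f"""
        ALTER TABLE ehr_raw ADD IF NOT EXISTS
        PARTITION (year = '{now:%Y}', day = '{now:%j}', hour = '{now:%H}')
        LOCATION 's3://ehr-data-lake/raw/{now:%Y}/{now:%j}/{now:%H}/'
    """
    athena.start_query_execution(
        QueryString=ddl,
        QueryExecutionContext={"Database": "ehr_catalog"},
        ResultConfiguration={"OutputLocation": "s3://athena-query-results-bucket/ddl/"},
    )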
QUESTION 7
A global company has different sub-organizations, and each sub-organization sells its products and services in various
countries. The company's senior leadership wants to quickly identify which sub-organization is the strongest performer
in each country. All sales data is stored in Amazon S3 in Parquet format.
Which approach can provide the visuals that senior leadership requested with the least amount of effort?
A. Use Amazon QuickSight with Amazon Athena as the data source. Use heat maps as the visual type.
B. Use Amazon QuickSight with Amazon S3 as the data source. Use heat maps as the visual type.
C. Use Amazon QuickSight with Amazon Athena as the data source. Use pivot tables as the visual type.
D. Use Amazon QuickSight with Amazon S3 as the data source. Use pivot tables as the visual type.
Correct Answer: C
QUESTION 8
A company wants to optimize the cost of its data and analytics platform. The company is ingesting a number of .csv and
JSON files in Amazon S3 from various data sources. Incoming data is expected to be 50 GB each day. The company is
using Amazon Athena to query the raw data in Amazon S3 directly. Most queries aggregate data from the past 12
months, and data that is older than 5 years is queried infrequently. The typical query scans about 500 MB of data and is
expected to return results in less than 1 minute. The raw data must be retained indefinitely for compliance
requirements.
Which solution meets the company's requirements?
A. Use an AWS Glue ETL job to compress, partition, and convert the data into a columnar data format. Use Athena to
query the processed dataset. Configure a lifecycle policy to move the processed data into the Amazon S3 Standard-Infrequent Access (S3 Standard-IA) storage class 5 years after object creation. Configure a second lifecycle policy to
move the raw data into Amazon S3 Glacier for long-term archival 7 days after object creation.
B. Use an AWS Glue ETL job to partition and convert the data into a row-based data format. Use Athena to query the
processed dataset. Configure a lifecycle policy to move the data into the Amazon S3 Standard-Infrequent Access (S3
Standard-IA) storage class 5 years after object creation. Configure a second lifecycle policy to move the raw data into
Amazon S3 Glacier for long-term archival 7 days after object creation.
C. Use an AWS Glue ETL job to compress, partition, and convert the data into a columnar data format. Use Athena to
query the processed dataset. Configure a lifecycle policy to move the processed data into the Amazon S3 Standard-Infrequent Access (S3 Standard-IA) storage class 5 years after the object was last accessed. Configure a second
lifecycle policy to move the raw data into Amazon S3 Glacier for long-term archival 7 days after the last date the object
was accessed.
D. Use an AWS Glue ETL job to partition and convert the data into a row-based data format. Use Athena to query the
processed dataset. Configure a lifecycle policy to move the data into the Amazon S3 Standard-Infrequent Access (S3
Standard-IA) storage class 5 years after the object was last accessed. Configure a second lifecycle policy to move the
raw data into Amazon S3 Glacier for long-term archival 7 days after the last date the object was accessed.
Correct Answer: C
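The raw-data archival rule that the options share, moving raw objects to Amazon S3 Glacier 7 days after creation, can be sketched with a lifecycle configuration. The bucket name and prefix are assumptions; S3 lifecycle transitions are expressed in days since object creation:

import boto3

s3 = boto3.client("s3")

s3.put_bucket_lifecycle_configuration(
    Bucket="analytics-raw-data",              # assumed bucket
    LifecycleConfiguration={
        "Rules": [
            {
                "ID": "archive-raw-to-glacier",
                "Filter": {"Prefix": "raw/"},
                "Status": "Enabled",
                "Transitions": [{"Days": 7, "StorageClass": "GLACIER"}],
            }
        ]
    },
)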
QUESTION 9
A data analyst is using AWS Glue to organize, cleanse, validate, and format a 200 GB dataset. The data analyst
triggered the job to run with the Standard worker type. After 3 hours, the AWS Glue job status is still RUNNING. Logs
from the job run show no error codes. The data analyst wants to improve the job execution time without
overprovisioning.
Which actions should the data analyst take?
A. Enable job bookmarks in AWS Glue to estimate the number of data processing units (DPUs). Based on the profiled
metrics, increase the value of the executor-cores job parameter.
B. Enable job metrics in AWS Glue to estimate the number of data processing units (DPUs). Based on the profiled
metrics, increase the value of the maximum capacity job parameter.
C. Enable job metrics in AWS Glue to estimate the number of data processing units (DPUs). Based on the profiled
metrics, increase the value of the spark.yarn.executor.memoryOverhead job parameter.
D. Enable job bookmarks in AWS Glue to estimate the number of data processing units (DPUs). Based on the profiled
metrics, increase the value of the num-executors job parameter.
Correct Answer: B
Reference: https://docs.aws.amazon.com/glue/latest/dg/monitor-debug-capacity.html
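Answer B in practice means enabling job metrics and then raising the job's maximum capacity (DPUs) once the metrics show the executors are saturated. A sketch with an assumed job name, role, script location, and capacity value:

import boto3

glue = boto3.client("glue")

glue.update_job(
    JobName="dataset-cleansing-job",
    JobUpdate={
        "Role": "arn:aws:iam::123456789012:role/GlueJobRole",
        "Command": {
            "Name": "glueetl",
            "ScriptLocation": "s3://etl-scripts-bucket/cleanse.py",
        },
        "DefaultArguments": {"--enable-metrics": "true"},  # publish job metrics
        "MaxCapacity": 20.0,  # raised from the default 10 DPUs, based on the metrics
    },
)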
QUESTION 10
A data analyst is designing a solution to interactively query datasets with SQL using a JDBC connection. Users will join
data stored in Amazon S3 in Apache ORC format with data stored in Amazon Elasticsearch Service (Amazon ES) and
Amazon Aurora MySQL.
Which solution will provide the MOST up-to-date results?
A. Use AWS Glue jobs to ETL data from Amazon ES and Aurora MySQL to Amazon S3. Query the data with Amazon
Athena.
B. Use AWS DMS to stream data from Amazon ES and Aurora MySQL to Amazon Redshift. Query the data with
Amazon Redshift.
C. Query all the datasets in place with Apache Spark SQL running on an AWS Glue developer endpoint.
D. Query all the datasets in place with Apache Presto running on Amazon EMR.
Correct Answer: C
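Answer C boils down to Spark SQL joining the sources in place. A sketch, assuming a data lake bucket, an Aurora MySQL endpoint, and credentials that are not part of the question; the Amazon ES index would be read the same way through the elasticsearch-hadoop connector, which is omitted here:

from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("interactive-join").getOrCreate()

# ORC data in S3, read in place
orders = spark.read.orc("s3://data-lake-bucket/orders/")

# Aurora MySQL table, read in place over JDBC
# (requires a MySQL/MariaDB JDBC driver on the Spark classpath)
customers = (
    spark.read.format("jdbc")
    .option("url", "jdbc:mysql://aurora-cluster.cluster-abc123.us-east-1.rds.amazonaws.com:3306/sales")
    .option("dbtable", "customers")
    .option("user", "analyst")
    .option("password", "********")
    .load()
)

orders.createOrReplaceTempView("orders")
customers.createOrReplaceTempView("customers")

spark.sql("""
    SELECT c.customer_id, SUM(o.amount) AS total_spend
    FROM orders o
    JOIN customers c ON o.customer_id = c.customer_id
    GROUP BY c.customer_id
""").show()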
QUESTION 11
A large ride-sharing company has thousands of drivers globally serving millions of unique customers every day. The
company has decided to migrate an existing data mart to Amazon Redshift. The existing schema includes the following
tables.
1.
A trip fact table for information on completed rides.
2.
A driver dimension table for driver profiles.
3.
A customer fact table holding customer profile information.
The company analyzes trip details by date and destination to examine profitability by region. The drivers' data rarely
changes. The customers' data changes frequently.
Which table design provides optimal query performance?
A. Use DISTSTYLE KEY (destination) for the trips table and sort by date. Use DISTSTYLE ALL for the drivers and
customers tables.
B. Use DISTSTYLE EVEN for the trips table and sort by date. Use DISTSTYLE ALL for the drivers table. Use
DISTSTYLE EVEN for the customers table.
C. Use DISTSTYLE KEY (destination) for the trips table and sort by date. Use DISTSTYLE ALL for the drivers table.
Use DISTSTYLE EVEN for the customers table.
D. Use DISTSTYLE EVEN for the drivers table and sort by date. Use DISTSTYLE ALL for both fact tables.
Correct Answer: A
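Answer A expressed as DDL, run through the Redshift Data API; the cluster, column names, and data types are assumptions:

import boto3

redshift_data = boto3.client("redshift-data")

statements = [
    # Fact table: distribute on destination, sort by date
    """
    CREATE TABLE trips (
        trip_id      BIGINT,
        trip_date    DATE         SORTKEY,
        destination  VARCHAR(64)  DISTKEY,
        driver_id    BIGINT,
        customer_id  BIGINT,
        fare         DECIMAL(10, 2)
    )
    """,
    # Small, rarely changing tables: replicate a full copy to every node
    "CREATE TABLE drivers (driver_id BIGINT, profile VARCHAR(1024)) DISTSTYLE ALL",
    "CREATE TABLE customers (customer_id BIGINT, profile VARCHAR(1024)) DISTSTYLE ALL",
]

for sql in statements:
    redshift_data.execute_statement(
        ClusterIdentifier="ridesharing-dw",   # assumed cluster name
        Database="analytics",
        DbUser="admin",
        Sql=sql,
    )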
QUESTION 12
A technology company is creating a dashboard that will visualize and analyze time-sensitive data. The data will come in
through Amazon Kinesis Data Firehose with the buffer interval set to 60 seconds. The dashboard must support near-real-time data.
Which visualization solution will meet these requirements?
A. Select Amazon Elasticsearch Service (Amazon ES) as the endpoint for Kinesis Data Firehose. Set up a Kibana
dashboard using the data in Amazon ES with the desired analyses and visualizations.
B. Select Amazon S3 as the endpoint for Kinesis Data Firehose. Read data into an Amazon SageMaker Jupyter
notebook and carry out the desired analyses and visualizations.
C. Select Amazon Redshift as the endpoint for Kinesis Data Firehose. Connect Amazon QuickSight with SPICE to
Amazon Redshift to create the desired analyses and visualizations.
D. Select Amazon S3 as the endpoint for Kinesis Data Firehose. Use AWS Glue to catalog the data and Amazon
Athena to query it. Connect Amazon QuickSight with SPICE to Athena to create the desired analyses and
visualizations.
Correct Answer: A
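Answer A can be sketched as a Firehose delivery stream with an Amazon ES destination and a 60-second buffer interval; Kibana then reads the index directly. The ARNs, domain, index, and backup bucket below are assumptions:

import boto3

firehose = boto3.client("firehose")

firehose.create_delivery_stream(
    DeliveryStreamName="dashboard-events",
    DeliveryStreamType="DirectPut",
    ElasticsearchDestinationConfiguration={
        "RoleARN": "arn:aws:iam::123456789012:role/FirehoseDeliveryRole",
        "DomainARN": "arn:aws:es:us-east-1:123456789012:domain/dashboard-domain",
        "IndexName": "events",
        "BufferingHints": {"IntervalInSeconds": 60, "SizeInMBs": 5},
        "S3BackupMode": "FailedDocumentsOnly",
        "S3Configuration": {
            "RoleARN": "arn:aws:iam::123456789012:role/FirehoseDeliveryRole",
            "BucketARN": "arn:aws:s3:::firehose-backup-bucket",
        },
    },
)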
QUESTION 13
A company is planning to create a data lake in Amazon S3. The company wants to create tiered storage based on
access patterns and cost objectives. The solution must include support for JDBC connections from legacy clients,
metadata management that allows federation for access control, and batch-based ETL using PySpark and Scala.
Operational management should be limited.
Which combination of components can meet these requirements? (Choose three.)
A. AWS Glue Data Catalog for metadata management
B. Amazon EMR with Apache Spark for ETL
C. AWS Glue for Scala-based ETL
D. Amazon EMR with Apache Hive for JDBC clients
E. Amazon Athena for querying data in Amazon S3 using JDBC drivers
F. Amazon EMR with Apache Hive, using an Amazon RDS with MySQL-compatible backed metastore
Correct Answer: BEF
Reference: https://d1.awsstatic.com/whitepapers/Storage/data-lake-on-aws.pdf
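For component F, the piece that usually needs explicit configuration is pointing the EMR Hive metastore at an external MySQL-compatible Amazon RDS database. A sketch of the hive-site classification that would be passed in the Configurations argument of emr.run_job_flow (or pasted into the console); the endpoint, database, and credentials are assumptions:

import json

hive_metastore_config = [
    {
        "Classification": "hive-site",
        "Properties": {
            "javax.jdo.option.ConnectionURL": (
                "jdbc:mysql://metastore-db.abc123.us-east-1.rds.amazonaws.com:3306/hive"
                "?createDatabaseIfNotExist=true"
            ),
            "javax.jdo.option.ConnectionDriverName": "org.mariadb.jdbc.Driver",
            "javax.jdo.option.ConnectionUserName": "hive",
            "javax.jdo.option.ConnectionPassword": "********",
        },
    }
]

print(json.dumps(hive_metastore_config, indent=2))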
These are the latest updated Amazon DAS-C01 exam questions from the Lead4Pass DAS-C01 dumps! 100% pass the DAS-C01 exam!
Download Lead4Pass DAS-C01 VCE and PDF dumps: https://www.leads4pass.com/das-c01.html (Q&As: 77 dumps)
Get free Amazon DAS-C01 dumps PDF online: https://drive.google.com/file/d/11eR0VE7iSQs3rgXd0sa3F7AIxfJDFD_0/