01. Which offering provides scale-out parallel processing and dramatically accelerates the performance of analytics clusters when integrated with IBM FlashSystem?
a) IBM Cloud Object Storage
b) IBM Spectrum Accelerate
c) IBM Spectrum Scale
d) IBM Spectrum Connect
02. A company is planning to create an Azure SQL database to support a mission-critical application. The application needs to be highly available and must not suffer any performance degradation during maintenance windows.
Which of the following technologies can be used to implement this solution?
(Choose 3)
a) Premium Service Tier
b) Virtual Machine Scale Sets
c) Basic Service Tier
d) SQL Data Sync
e) Always On Availability Groups
f) Zone-redundant configuration
03. The data engineering team manages Azure HDInsight clusters. The team spends a large amount of time creating and destroying clusters daily because most of the data pipeline process runs in minutes.
You need to implement a solution that deploys multiple HDInsight clusters with minimal effort. What should you implement?
a) Azure Databricks
b) Azure Traffic Manager
c) Azure Resource Manager templates
d) Ambari web user interface
04. A company is designing a hybrid solution to synchronize data from an on-premises Microsoft SQL Server database to Azure SQL Database. You must perform an assessment of the databases to determine whether the data will move without compatibility issues.
You need to perform the assessment. Which tool should you use?
a) SQL Server Migration Assistant (SSMA)
b) Microsoft Assessment and Planning Toolkit
c) SQL Vulnerability Assessment (VA)
d) Azure SQL Data Sync
e) Data Migration Assistant (DMA)
05. You are a data engineer for an Azure SQL Database. You write the following SQL statements:
CREATE TABLE Customer (
CustomerID int IDENTITY PRIMARY KEY,
GivenName varchar(100) MASKED WITH (FUNCTION = 'partial(2,"XX",0)') NULL,
SurName varchar(100) NOT NULL,
Phone varchar(12) MASKED WITH (FUNCTION = 'default()')
);
INSERT Customer (GivenName, SurName, Phone) VALUES ('Sammy', 'Jack', '555.111.2222');
SELECT * FROM Customer;
You need to determine what is returned by the SELECT query. What data is returned?
a) 1 SaXX Jack XXX.XXX.2222
b) 1 XXXX Jack XXX.XXX.XXXX
c) 1 xx Jack XXX.XXX.2222
d) 1 SaXX Jack xxxx
06. Each day, a company plans to store hundreds of files in Azure Blob Storage and Azure Data Lake Storage. The company uses the Parquet format.
You must develop a pipeline that meets the following requirements:
- Process data every six hours
- Offer interactive data analysis capabilities
- Offer the ability to process data using solid-state drive (SSD) caching
- Use Directed Acyclic Graph (DAG) processing mechanisms
- Provide support for REST API calls to monitor processes
- Provide native support for Python
- Integrate with Microsoft Power BI
You need to select the appropriate data technology to implement the pipeline. Which data technology should you implement?
a) Azure SQL Data Warehouse
b) HDInsight Apache Storm cluster
c) Azure Stream Analytics
d) HDInsight Apache Hadoop cluster using MapReduce
e) HDInsight Spark cluster
07. You are a data engineer for your company. Your company has an on-premises SQL Server instance that contains 16 databases. Four of the databases require Common Language Runtime (CLR) features. You must be able to manage each database separately because each database has its own resource needs.
You plan to migrate these databases to Azure. You want to migrate the databases through a backup and restore process that uses SQL commands. You need to choose the most appropriate deployment option to migrate the databases.
What should you use?
a) Azure SQL Database with an elastic pool
b) Azure Cosmos DB with the SQL (DocumentDB) API
c) Azure SQL Database managed instance
d) Azure Cosmos DB with the Table API
08. Your company uses Azure Stream Analytics to monitor devices. The company plans to double the number of devices that are monitored.
You need to monitor a Stream Analytics job to ensure that there are enough processing resources to handle the additional load.
Which metric should you monitor?
a) Input Deserialization Errors
b) Early Input Events
c) Late Input Events
d) Watermark delay
09. A company has an Azure SQL data warehouse. They want to use PolyBase to retrieve data from an Azure Blob storage account and ingest it into the Azure SQL data warehouse. The files are stored in Parquet format. The data needs to be loaded into a table called lead2pass_sales.
Which of the following actions need to be performed to implement this requirement?
(Choose 4)
a) Create an external file format that would map to the Parquet-based files
b) Load the data into a staging table
c) Create an external table called lead2pass_sales_details
d) Create an external data source for the Azure Blob storage account
e) Create a master key on the database
f) Configure PolyBase to use the Azure Blob storage account
10. You are migrating a corporate research analytical solution from an internal data center to Azure. 45 TB of research data is currently stored in an on-premises Hadoop cluster. You plan to copy it to Azure Storage.
Your internal data center is connected to your Azure Virtual Network (VNet) with ExpressRoute private peering. The Azure Storage service endpoint is accessible from the same VNet. Corporate policy dictates that the research data cannot be transferred over the public internet.
You need to securely migrate the research data online. What should you do?
a) Transfer the data using Azure Data Factory in distributed copy (DistCopy) mode, with an Azure Data Factory self-hosted Integration Runtime (IR) machine installed in the on-premises datacenter.
b) Transfer the data using Azure Data Factory in native Integration Runtime (IR) mode, with an Azure Data Factory self-hosted IR machine installed on the Azure VNet.
c) Transfer the data using Azure Data Box Heavy devices.
d) Transfer the data using Azure Data Box Disk devices.