BA03 Introduction to Big Data Assignment 1
Subject Code: BA03
Subject Name: INTRODUCTION TO BIG DATA
Component name: ASSIGNMENT 1
Question 1:- To analyze data, Jennifer needs a system that collects, aggregates, and moves large amounts of log data from many different sources to a centralized data store. Which of the following tools would you suggest that Jennifer use?
a) MapReduce
b) ZooKeeper
c) Oozie
d) Flume
Question 2:- How does Hadoop use computing resources?
a) It only distributes data to computing resources.
b) It distributes software to computing resources.
c) It distributes data and computing tasks to computing resources.
d) It creates shared memory for computing resources.
Question 3:- With advances in technology, companies are using different ways of marketing their products and services. New sensors are being used with new marketing campaigns, and this results in new types of data and information. Which element of Big Data is being discussed here?
a) Volume
b) Velocity
c) Variety
d) Both volume and velocity
Question 4:- Which of the following can be tracked using RFID tags?
a) Raw materials
b) Scrap materials
c) Finished goods inventory
d) Insurance fraud
Question 5:- You are the Marketing Head of an organization. You plan to increase your market outreach by converting prospective customers to actual customers. Which of the following analysis approaches would you consider as the best to adopt?
a) Data interpretation
b) Behavioral analytics
c) Data visualization
d) Data collection
Question 6:- What are the two disadvantages of public cloud, compared to in-house analysis?
Latency and risk to data security
Latency and software incompatibility
Higher cost and risk to data security
Question 7:- What is metadata defined as?
Data about data
Pattern framework
Link analysis
Text mining
Question 8:- Sam is seeking a career as a data analyst. Which of these is a key responsibility of a data analyst?
Determine what data means and recommend ways to search the data
Specialize in collecting data from different sources, organizing it in a suitable format, and analyzing it
Design, create, manage and interpret large datasets to achieve business goals
Develop code and images to automate data reports
Question 9:- In the MapReduce framework, map and reduce functions can be run in any order. Do you agree, and why?
Yes, because in functional programming, the order of execution is not important.
Yes, because the functions use KVP as input and output; order is not important.
No, because the output of the map function is the input for the reduce function.
No, because the output of the reduce function is the input for the map function.
Question 10:- Which of the following components of Hadoop provides SQL-like access to structured data and sophisticated Big Data analysis with MapReduce?
Hive
HDFS
HBase
MapReduce
Question 11:- Predictive models based on both historical and real-time data can help which of these businesses to identify suspected cases of fraud in early stages?
Marketing companies
Medical claims companies
Construction companies
CRM based manufacturing companies
Question 12:- Why are Big Data applications susceptible to latency?
The volume of Big Data is too large to be analyzed rapidly.
Big Data may reside in a different location from the application.
Big Data cannot use in-memory computing.
Big Data applications are still in early stages of development.
Question 13:- ABC is a retail organization that conducts its business through e-commerce. The organization offers a customized online shopping experience to its customers with an attractive and responsive Web page user interface. Now the company wants to collect data about customers' activities on the Internet. What would be the best source for such data?
Transactional database
Social media
Weblogs of customers
All of the above
Question 14:- Which of the following is an RFID reader action?
Text mining
Credit notes management
Insurance fraud detection
Inventory management
Question 15:- What could be the biggest challenge for the production or operations unit of an organization?
Determining the data to be used for making business decisions
Determining the best Big Data technology to be used
Securing Big Data initiatives from unauthorized access
Determining the best way to present Big Data findings to enable decision-making
Question 16:- Which of the following options was one of the factors driving the creation of MapReduce?
Increasing processing power of new hardware
Business need for complex analysis of structured data
Increasing number of Web users
Spread of distributed computing
Question 17:- In designing the MapReduce framework, which of the following needs did the engineers consider?
It should be cheap and distributed free of cost.
Processing should expand and contract automatically.
Processing should be stopped in the case of network failure.
Developers should be able to create new languages.
Question 18:- Which of the following describes the map function?
It processes data to create a list of key-value pairs.
It indexes the data to list all the words occurring in it.
It converts a relational database to key-value pairs.
It tracks data across multiple tables and clusters in Hadoop.
Question 19:- Which of the following describes the reduce function?
It analyzes the map function results to show the most frequently occurring values.
It combines the map function results to return a list of the best matches for the query.
It adds the results of the map function to convert the KVP lists to columnar databases.
It processes map function results and creates a new KVP list to answer the query.
Question 20:- How does MapReduce achieve co-location?
a) The scheduler sends the code to the machine where the relevant data resides.
b) The process scheduler distributes data of the same type to machines in the same cluster.
c) The master JobTracker sends map and reduce functions to the same machines or nodes in a cluster.
d) The slave TaskTrackers copy related data and code to adjacent clusters in case of processing failure.
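For reference on the MapReduce questions above (Questions 9 and 16 to 20), the following is a minimal single-machine sketch of the map and reduce steps, using word counting as the example task. It is an illustration only, not Hadoop code: the function names map_words, shuffle, reduce_counts, and word_count are hypothetical, and the grouping step stands in for the shuffle that a real framework performs between the two phases.

```python
from collections import defaultdict


def map_words(line):
    """Map step: emit one (word, 1) key-value pair per word in the line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Group values by key (hypothetical stand-in for the framework's shuffle)."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_counts(key, values):
    """Reduce step: collapse all values for one key into a single pair."""
    return (key, sum(values))


def word_count(lines):
    # The map output is collected first, then grouped, then reduced.
    mapped = [pair for line in lines for pair in map_words(line)]
    grouped = shuffle(mapped)
    return dict(reduce_counts(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    print(word_count(["big data", "Big Data tools"]))
```

In this sketch the reduce step consumes the key-value pairs produced by the map step, so the two phases cannot be swapped.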