Flume is a tool for converting data back and forth between a relational database and the HDFS. False
______ minimizes the number of disk reads necessary to retrieve a row of data. Row-centric storage
A(n) ______ is a process or set of operations in a calculation. Algorithm
Scaling out is keeping the same number of systems, but migrating each system to a larger one. False
When using MapReduce, a _______ function takes a collection of data and sorts and filters it into a set of key-value pairs. Map
______ focuses on filtering data as it enters the system to determine which data to keep and which to discard. Stream processing
______ is a tool for converting data back and forth between a relational database and the HDFS. Sqoop
______ is keeping the same number of systems, but migrating each system to a larger system. Scaling up
______ is NOT one of the “3 Vs” of Big Data. Validation
______ processing occurs when a program runs from beginning to end without any user interaction. Batch
______ was the first SQL on Hadoop application. Impala
A ______ is a programmed function within an object used to manipulate the data in that same object. Method
A block report is used to let the name node know that the data node is still available. False
A column family database is a NoSQL database model that organizes data in key-value pairs with keys mapped to a set of columns in the value component. True
A query in a graph database is called a ______. Traversal
A reduce function takes a collection of key-value pairs with the same key value and summarizes them into a single result. True
A(n) ______ is a tag that is used to associate a collection of nodes as being of the same type or belonging to the same group. Label
Big Data ______. captures data in whatever format it naturally exists
Big Data processing imposes a structure on the data as needed for applications as a part of retrieval and processing. True
By default, Hadoop uses a replication factor of ______. Three
Characteristics that are important in working with data in the relational database model also apply to Big Data. True
Data collected or aggregated around a central topic or entity is said to be ______ aware. Aggregate
Document databases group documents into logical groups called ______. Collections
For a data set to be considered Big Data, it must display only one of the 3 Vs (volume, velocity and variety). False
Graph theory is a mathematical and computer science field that models relationships, or edges, between objects called ______. Nodes
Hadoop is a database that has become the de facto standard for most Big Data storage and processing. False
Hive is a good choice for jobs that require a small subset of data to be returned very quickly. False
In many ways, the issues associated with volume and velocity are the same. True
In MongoDB, the ______ method retrieves objects from a collection that match the restrictions provided. find()
In MongoDB, the ______ method is used to improve the readability of retrieved documents through the use of line breaks and indentation. pretty()
In the context of Big Data, ______ refers to the trustworthiness of a set of data. Veracity
In the context of Big Data, ______ relates to changes in meaning. Variability
Interest in graph databases can be tied to the area of social networks. True
Key-value and document databases are structurally similar. True
Lack of specificity is what leads to ambiguity in defining Big Data. True
Modeling and storing data about relationships is the focus of ______ databases. Graph
Most NoSQL products run only in a Linux or Unix environment. True
Neo4j is a ______ database. Graph
Relational databases rely on unstructured data. False
The ability to graphically present data in a way that makes it understandable is the concept of value. False
The analysis of data to produce actionable results is feedback loop processing. True
The name, MongoDB, comes from the word humongous as its developers intended their new product to support extremely large data sets. True
To query the value component of the pair when using a key-value database, use get or ______. Fetch
Two of the most popular applications to simplify the process of creating MapReduce jobs are Hive and ______. Pig
Under the HDFS system, using a write-once, read-many model simplifies concurrency issues. True
When using a HDFS, a heartbeat is sent every ______ to notify the name node that the data node is still available. 3 seconds
When using a HDFS, the ______ node creates new files by communicating with the ______ node. client; name
When using MapReduce, best practices suggest that the number of mappers on a given node should be ______. 100 or less
Which of the following is NOT a key assumption of the Hadoop Distributed File System? Write many, read-once
Which of the following is NOT one of the standard NoSQL categories? Chart databases
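
The map and reduce questions above can be illustrated with a small word-count sketch. This is plain Python, not Hadoop code: the function names (map_words, shuffle, reduce_counts, word_count) are hypothetical and chosen only for illustration, and the grouping step stands in for the shuffle that a real MapReduce framework would perform between the two phases.

```python
from collections import defaultdict


def map_words(lines):
    """Map step: turn raw lines into (key, value) pairs, one per word."""
    for line in lines:
        for word in line.lower().split():
            yield (word, 1)


def shuffle(pairs):
    """Group values by key (an assumed stand-in for the framework's shuffle)."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_counts(key, values):
    """Reduce step: summarize all values sharing one key into a single result."""
    return (key, sum(values))


def word_count(lines):
    """Run map, group by key, then reduce each group."""
    grouped = shuffle(map_words(lines))
    return dict(reduce_counts(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    print(word_count(["big data", "Big Data tools"]))
```

The map function only emits key-value pairs and the reduce function only sees values that share a key, which mirrors the definitions given in the quiz.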

INF503 QUIZ