Question:

1. Using the examples of distributed join and distributed deadlock, explain why conventional distributed database systems are problematic from a performance perspective. Note that the answer to this question serves as a motivation for NoSQL databases.

2. Pick one type of NoSQL database system from https://en.wikipedia.org/wiki/NoSQL (Column, Document, Key-value, or Graph) and answer how the following three aspects differ from relational databases. Provide your sources, preferably as links, and state clearly when a source speaks only to one specific instance of the NoSQL database type.

a) What is the underlying data structure or data model? How many and what types of constraints are enforced, compared with relational databases? What types of queries are supported, compared with relational databases?

3. Much of the processing at large data companies uses the Hadoop framework or similar proprietary ones. Processing in this framework uses MapReduce, which consists of a map step and a reduce step. In contrast to older models of distributed computing, such as the Message Passing Interface (MPI), data cannot be passed from one mapper to another. Explain why that limitation is important for performance, and what communication between processes does happen under MapReduce. The Hadoop Distributed File System (HDFS) achieves redundancy for data. Compare this with RAID technology. What happens in Hadoop when a mapper aborts?
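To make the map and reduce steps in question 3 concrete, here is a minimal single-process sketch of the MapReduce pattern as a word count. It is not the Hadoop API: the names map_words, shuffle, reduce_counts, and run_job are hypothetical, and the "cluster" is just a Python list of input splits. The point it illustrates is that each mapper call sees only its own split, and the only data movement between stages is the grouping of emitted pairs by key.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_words(split: str) -> Iterator[Tuple[str, int]]:
    """Mapper (hypothetical): reads only its own split and emits (word, 1) pairs."""
    for word in split.lower().split():
        yield word, 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Group emitted values by key; this stands in for the shuffle between map and reduce."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return dict(groups)


def reduce_counts(key: str, values: List[int]) -> Tuple[str, int]:
    """Reducer (hypothetical): combines all values that share one key."""
    return key, sum(values)


def run_job(splits: List[str]) -> Dict[str, int]:
    # Each split is mapped independently; no mapper reads another mapper's output.
    # Because of that, a mapper call could be repeated on its split without
    # affecting the others (an assumption of this sketch, mirroring task re-execution).
    mapped = [pair for split in splits for pair in map_words(split)]
    grouped = shuffle(mapped)
    return dict(reduce_counts(key, values) for key, values in grouped.items())


splits = ["the quick brown fox", "the lazy dog", "the fox"]
counts = run_job(splits)
print(counts)
```

In a real cluster the splits would live on different machines and the shuffle would move data over the network; the sketch keeps everything in memory only to show the shape of the computation.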
