Section 1
25 Marks
Discuss all questions in the context of a mini case study, with at least one paragraph per question.
1. What are the three characteristics of Big Data, and what are the main considerations in processing
Big Data?
2. What is an analytic sandbox, and why is it important?
3. Explain the differences between BI and Data Science.
4. Describe the challenges of the current analytical architecture for data scientists.
5. What are the key skill sets and behavioural characteristics of a data scientist?
Section 2
20 Marks
1. What are the benefits of doing a pilot program before a full-scale rollout of a new analytical
methodology? Discuss this in the context of the mini case study, in at least one paragraph.
2. What kinds of tools need to be used in the following phases, and for which kinds of use
scenarios?
a. Phase 2: Data preparation
b. Phase 4: Model building
3. How many levels does fdata contain in the following R code?
data = c(1,2,2,3,1,2,3,3,1,2,3,3,1)
fdata = factor(data)
4. Two vectors, v1 and v2, are created with the following R code:
v1
v2
What are the results of cbind(v1, v2) and rbind(v1, v2)?
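
Questions 3 and 4 can be checked directly in an R session. The sketch below is one way to do that, not a model answer. The factor is built from the vector given in question 3. The definitions of v1 and v2 did not survive in the question text, so the values used here are hypothetical stand-ins chosen only to show how cbind() and rbind() arrange two equal-length vectors.

```r
# Question 3: build the factor from the vector given in the question.
data <- c(1, 2, 2, 3, 1, 2, 3, 3, 1, 2, 3, 3, 1)
fdata <- factor(data)

levels(fdata)   # the distinct values, stored as character labels
nlevels(fdata)  # how many levels the factor contains

# Question 4: ASSUMPTION - these are hypothetical values, not the
# vectors from the original question, which are missing from the text.
v1 <- c(1, 2, 3)
v2 <- c(4, 5, 6)

by_col <- cbind(v1, v2)  # each vector becomes a column
by_row <- rbind(v1, v2)  # each vector becomes a row

dim(by_col)  # rows x columns after column-binding
dim(by_row)  # rows x columns after row-binding
```

With two vectors of length n, cbind() returns an n-by-2 matrix and rbind() returns a 2-by-n matrix, and in both cases the vector names become the column or row names.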
Section 3
35 Marks
1. An online retailer wants to study the purchase behaviours of its customers. Figure 3-1 shows
the density plot of the purchase sizes (in Rand). What would be your recommendation to
enhance the plot to detect more structures that otherwise might be missed?
At least one paragraph.
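
One common enhancement for a heavily right-skewed density plot is to redraw it on a logarithmic scale. The sketch below illustrates the idea in R. The purchase sizes are synthetic values simulated for this illustration; they are an assumption and are not the data behind Figure 3-1. The variable names are likewise hypothetical.

```r
# ASSUMPTION: synthetic purchase sizes (in Rand) from two simulated
# customer groups. This is illustration data, not the retailer's data.
set.seed(1)
purchases <- c(rlnorm(500, meanlog = log(50),   sdlog = 0.4),
               rlnorm(500, meanlog = log(5000), sdlog = 0.4))

# Kernel density estimates on the raw scale and on the log10 scale.
d_raw <- density(purchases)
d_log <- density(log10(purchases))

# Draw both plots side by side for comparison.
op <- par(mfrow = c(1, 2))
plot(d_raw, main = "Raw scale", xlab = "Purchase size (Rand)")
plot(d_log, main = "Log10 scale", xlab = "log10(purchase size in Rand)")
rug(log10(purchases))  # tick marks showing where observations fall
par(op)                # restore the previous plotting layout
```

On the raw scale a long right tail tends to compress most of the observations into a narrow region near zero, so a log scale may expose features such as a second mode. Adding a rug shows where the individual observations lie beneath the curve.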

