Multiple Choice

1. The Cross-Industry Standard Process for Data Mining (CRISP-DM) consists of six phases. In which of the six phases does data wrangling occur?

a) Deployment

b) Modeling

c) Data understanding

d) Data preparation

2. The partial data set in the table below shows online shopping hours by age and income. Use the min-max transformation to normalize the observation for income.

ID     Income   Age   Online Hours
2201   62,000   48    2
2202   58,000   52    4
2203   53,000   44    5
2204   22,000   28    7
2205   43,000   33    4
2206   48,000   35    3

a) 1

b) 6417

c) 0

d) 0.6997
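As an illustrative sketch (not part of the original question), min-max normalization rescales each value x to (x - min) / (max - min), so the smallest value maps to 0 and the largest to 1. The helper name `min_max` is an assumption made for this example; the income values are those listed in the table for question 2.

```python
# Income values copied from the table in question 2.
incomes = [62000, 58000, 53000, 22000, 43000, 48000]


def min_max(values):
    """Rescale each value to the range [0, 1] using (x - min) / (max - min)."""
    lo, hi = min(values), max(values)
    return [(x - lo) / (hi - lo) for x in values]


normalized = min_max(incomes)

# Print each income next to its normalized value.
for income, value in zip(incomes, normalized):
    print(income, round(value, 4))
```

Under this transformation the largest income in the column always normalizes to 1 and the smallest to 0, whichever rows those happen to be.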

3. The following table is a segment of Loan Data from a bank for car loans. Compute the matching coefficient between Pairs 1 and 4

a) Matching coefficient is 0.50

b) Matching coefficient is 0.25

c) Matching coefficient is 0.75

d) Matching coefficient is 0.40

4. Cameron is performing a study on the IQ of groups in various areas. He has calculated that the average IQ of Group A is 105, with a standard deviation of 10. What is the z-score for someone with an IQ of 98?

a) 0.7

b) -0.7

c) 0.9

d) 0.1
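For reference, a z-score measures how many standard deviations a value lies from the mean: z = (x - mean) / standard deviation. The short sketch below applies that formula to the numbers given in question 4; the function name `z_score` is an assumption made for this example.

```python
def z_score(x, mean, std_dev):
    """Return how many standard deviations x lies from the mean."""
    return (x - mean) / std_dev


# Values from question 4: mean IQ 105, standard deviation 10, observed IQ 98.
z = z_score(98, 105, 10)
print(z)  # -0.7
```

A negative result indicates a value below the group mean.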

5. Calculate the accuracy rate for the following confusion matrix.

Actual Class   Predicted Class 1   Predicted Class 0
Class 1        18                  11
Class 0        11                  60

a) 0.72

b) 0.22

c) 0.78

d) 0.28
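As a hedged illustration, accuracy is the share of all observations that were classified correctly, that is, the diagonal of the confusion matrix divided by the total count. The variable names below are assumptions for this sketch, and Class 1 is treated as the positive class.

```python
# Counts from the confusion matrix in question 5 (Class 1 assumed positive).
true_pos = 18   # actual 1, predicted 1
false_neg = 11  # actual 1, predicted 0
false_pos = 11  # actual 0, predicted 1
true_neg = 60   # actual 0, predicted 0

total = true_pos + false_neg + false_pos + true_neg
accuracy = (true_pos + true_neg) / total
print(accuracy)  # 0.78
```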

6. When a predictive model is made overly complex to fit the quirks of the given sample data, it is called:

a) Oversampling

b) Overfitting

c) Partitioning

d) Distribution

7. The process of dividing a data set into a training, a validation, and an optional test data set is called:

a) Overfitting

b) Oversampling

c) Optional testing

d) Data partitioning
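To make the idea in question 7 concrete, here is a minimal sketch of splitting a data set into training, validation, and test subsets. The 60/20/20 proportions, the fixed seed, and the function name `partition` are all assumptions chosen for illustration; the question itself does not specify them.

```python
import random


def partition(rows, train_frac=0.6, valid_frac=0.2, seed=0):
    """Shuffle rows, then split them into training, validation, and test sets.

    The fractions and seed are illustrative assumptions, not prescribed values.
    Whatever remains after the first two splits becomes the test set.
    """
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    n_train = int(len(rows) * train_frac)
    n_valid = int(len(rows) * valid_frac)
    train = rows[:n_train]
    valid = rows[n_train:n_train + n_valid]
    test = rows[n_train + n_valid:]
    return train, valid, test


train, valid, test = partition(range(100))
print(len(train), len(valid), len(test))  # 60 20 20
```

Shuffling before splitting helps keep each subset representative of the whole data set.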

8. Based on the following confusion matrix for a validation set of 100 observations, where Class 1 reflects the number of targeted respondents who did not purchase services, calculate the specificity rate.

Actual Class   Predicted Class 1   Predicted Class 0
Class 1        18                  11
Class 0        11                  60

a) 62%

b) 78%

c) 84%

d) 84.5%
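As a hedged sketch, specificity is the proportion of actual negatives that the model correctly predicts as negative: TN / (TN + FP). The example below assumes Class 1 is the positive class and Class 0 the negative class; the variable names are illustrative.

```python
# Counts from the confusion matrix in question 8 (Class 0 assumed negative).
true_neg = 60   # actual 0, predicted 0
false_pos = 11  # actual 0, predicted 1

specificity = true_neg / (true_neg + false_pos)
print(round(specificity, 3))  # 0.845
```

If Class 0 were instead treated as the positive class, the same formula would be applied to the Class 1 row.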

9. When using PCA, all of the following are disadvantages except:

a) PCA results are difficult to interpret clearly

b) Components are weighted linear combinations and abstract

c) PCA only works with numerical data

d) PCA significantly increases the dimension of the data

10. Of the following selections, which is not a descriptor of principal component analysis?

a) The first principal component is not suitable for analysis

b) Principal components are uncorrelated variables

c) The first principal component accounts for most of the variability

d) Principal component variables are weighted linear combinations of the original variables
