Question:

When analyzing big data (large data sets with many variables), business researchers often encounter the problem of missing data (e.g., non-response). Typically, an imputation method will be used to substitute in reasonable values (e.g., the mean of the variable) for the missing data. An imputation method that uses "nearest neighbors" as substitutes for the missing data was evaluated in Data & Knowledge Engineering (March 2013). Two quantitative assessment measures of the imputation algorithm are normalized root mean square error (NRMSE) and classification bias. The researchers applied the imputation method to a sample of 3,600 data sets with missing values and determined the NRMSE and classification bias for each data set. The correlation coefficient between the two variables was reported as r = .2838.
a. Conduct a test to determine if the true population correlation coefficient relating NRMSE and bias is positive. Interpret this result practically.
b. A scatterplot for the data (extracted from the journal article) is shown below. Based on the graph, would you recommend using NRMSE as a linear predictor of bias? Explain why your answer does not contradict the result in part a.
[Figure: scatterplot of BIAS versus NRMSE, annotated r = 0.2838]
Step by Step Answer:
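
One way to approach part a is the standard t test for a correlation coefficient, with H0: ρ = 0 against Ha: ρ > 0 and test statistic t = r√(n − 2) / √(1 − r²) on n − 2 degrees of freedom. The sketch below is only an illustration under stated assumptions: the significance level α = 0.05 is not given in the question, the function name `correlation_t_statistic` is made up for this example, and the critical value comes from a normal approximation to the t distribution (reasonable here because there are 3,598 degrees of freedom).

```python
import math
from statistics import NormalDist


def correlation_t_statistic(r: float, n: int) -> float:
    """t statistic for H0: rho = 0, based on n - 2 degrees of freedom."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)


r, n = 0.2838, 3600  # values reported in the question
alpha = 0.05         # assumed significance level; not stated in the question

t_stat = correlation_t_statistic(r, n)

# With 3,598 degrees of freedom the t distribution is nearly standard normal,
# so a normal critical value is used here as an approximation.
critical = NormalDist().inv_cdf(1 - alpha)
reject_h0 = t_stat > critical  # upper-tailed test (Ha: rho > 0)

# Coefficient of determination: share of variation in bias that a
# straight-line relationship with NRMSE would account for.
r_squared = r ** 2

print(f"t = {t_stat:.2f}, critical value = {critical:.3f}, reject H0: {reject_h0}")
print(f"r^2 = {r_squared:.4f}")
```

Under these assumptions the statistic comes out near 17.75, far above the critical value of about 1.645, so the data support a positive population correlation between NRMSE and bias. At the same time r² is only about 0.08, meaning a linear model in NRMSE would account for roughly 8% of the variation in bias. That is why a cautious answer to part b (NRMSE is a weak linear predictor) need not contradict part a: with 3,600 data sets, even a weak correlation is statistically significant.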

Related Book:

Statistics For Business And Economics

ISBN: 9780134506593

13th Edition

Authors: James T. McClave, P. George Benson, Terry Sincich