4. ("Pairs" plot & correlation) Below is a "pairs" plot for four variables, Glucose, Insulin, HOMA and Classification, in the cancer dataset. Note that Classification is treated as a factor/categorical variable (0: healthy controls; 1: cancer patients).

[Figure: pairs plot of Glucose, Insulin, HOMA and Classification. Correlations printed in the upper panels: Glucose row: Corr 0.505 and Corr 0.696; Insulin row: Corr 0.932. Axis ticks: Glucose 100, 150, 200; Insulin 0, 20, 40, 60; HOMA 0, 10, 20.]

(i) According to the plot, does the cancer patient tend to have a higher or lower Glucose level? (Briefly explain the reason.)

(ii) According to the plot, what is the correlation between Glucose and Insulin?

(iii) We want to build a statistical model to predict the response variable Classification. The plot shows that Glucose, Insulin and HOMA all appear to be (slightly) correlated with Classification. If we have to discard one of them, which one would you choose (among Glucose, Insulin and HOMA)? Why? (Open question.)
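For part (iii), one common line of reasoning is to look for the pair of predictors with the largest absolute correlation, since two strongly correlated predictors carry largely overlapping information. The sketch below is only an illustration of that idea in plain Python. The helper names `pearson` and `most_correlated_pair` are hypothetical, and the toy numbers are made up for the example; they are not values from the cancer dataset.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squared deviations (the 1/n factors cancel).
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def most_correlated_pair(columns):
    """Return (name_a, name_b, r) for the pair of columns with the largest |r|."""
    best = None
    for a, b in combinations(columns, 2):
        r = pearson(columns[a], columns[b])
        if best is None or abs(r) > abs(best[2]):
            best = (a, b, r)
    return best


# Made-up toy values for illustration only, NOT taken from the cancer dataset.
toy = {
    "Glucose": [70, 92, 91, 77, 86, 103, 118, 199],
    "Insulin": [2.7, 3.1, 4.5, 3.2, 3.5, 5.6, 6.5, 12.2],
    "HOMA": [0.47, 0.71, 1.01, 0.61, 0.74, 1.4, 1.9, 6.0],
}

name_a, name_b, r = most_correlated_pair(toy)
print(name_a, name_b, round(r, 3))
```

On the real data the same comparison can be read directly off the upper panels of the pairs plot, without computing anything. The code only makes the "find the most redundant pair" step explicit. Which member of that pair to discard is still a judgment call, which is why the question is marked as open.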
