Question: Using R, work with the data set fractures.txt. Myers (1990) presents data on the number of fractures (y) that occur in the upper seams of coal mines in the Appalachian region of western Virginia. Four regressors were reported:

x1 = inner burden thickness (feet), the shortest distance between seam floor and the lower seam;

x2 = percent extraction of the lower previously mined seam;

x3 = lower seam height (feet); and

x4 = time (years) that the mine has been in operation.

y   x1   x2   x3   x4
2   50   70   52   1
1   230  65   42   6.9
0   125  70   45   1
4   75   65   68   0.5
1   70   65   53   0.5
2   65   70   46   3
0   65   60   62   1
0   350  60   54   0.5
4   350  90   54   0.5
4   160  80   38   0
1   145  65   38   10
4   145  85   38   0
1   180  70   42   2
5   43   80   40   0
2   42   85   51   12
5   42   85   51   0
5   45   85   42   0
5   83   85   48   10
0   300  65   68   10
5   190  90   84   6
1   145  90   54   12
1   510  80   57   10
3   65   75   68   5
3   470  90   90   9
2   300  80   165  9
2   275  90   40   4
0   420  50   44   17
1   65   80   48   15
5   40   75   51   15
2   900  90   48   35
3   95   88   36   20
3   40   85   57   10
3   140  90   38   7
0   150  50   44   5
0   80   60   96   5
0   80   85   96   5
0   145  65   72   9
0   100  65   72   9
3   150  80   48   3
2   150  80   48   0
3   210  75   42   2
5   11   75   42   0
0   100  65   60   25
3   50   88   60   20

(a) Read in the data as a data frame. Add a new column called indicator to the data frame. If the y of an observation is above the median of y, indicator = 1; otherwise, indicator = 0.

(b) Explore the data graphically in order to investigate the association between indicator and the other features (x1, x2, x3, x4). Which of the other features seem most likely to be useful in predicting indicator? Scatter-plots and boxplots may be useful tools to answer this question. Describe your findings.

(d) Perform LDA on the training data in order to predict indicator using the variables that seemed most associated with indicator in part (b). What is the misclassification rate (test error) of the model obtained?

(e) Perform kNN on the training data in order to predict indicator using the variables that seemed most associated with indicator in part (b). What is the misclassification rate (test error) of the model obtained?

(f) Perform logistic regression on the training data in order to predict indicator using the variables that seemed most associated with indicator in part (b). What is the misclassification rate (test error) of the model obtained?
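
A minimal sketch of one way to approach parts (a), (b) and (d)-(f) in R follows; it is an illustration under stated assumptions, not a verified solution. The data are embedded as text so the script runs without fractures.txt. Because part (c) does not appear in the question, a random split into 30 training and 14 test rows (seed 1) is assumed. The predictors x2 and x4 are placeholders for whatever part (b) singles out, k = 3 for kNN is an arbitrary choice, and the logistic model uses a 0.5 probability cutoff.

```r
library(MASS)   # lda()
library(class)  # knn()

# (a) The 44 observations from the question, five values per row
# (y x1 x2 x3 x4). Embedded so the sketch runs without the file; if
# fractures.txt has that header row, read.table("fractures.txt",
# header = TRUE) should give the same data frame.
vals <- scan(quiet = TRUE, text = "
  2 50 70 52 1     1 230 65 42 6.9   0 125 70 45 1    4 75 65 68 0.5
  1 70 65 53 0.5   2 65 70 46 3      0 65 60 62 1     0 350 60 54 0.5
  4 350 90 54 0.5  4 160 80 38 0     1 145 65 38 10   4 145 85 38 0
  1 180 70 42 2    5 43 80 40 0      2 42 85 51 12    5 42 85 51 0
  5 45 85 42 0     5 83 85 48 10     0 300 65 68 10   5 190 90 84 6
  1 145 90 54 12   1 510 80 57 10    3 65 75 68 5     3 470 90 90 9
  2 300 80 165 9   2 275 90 40 4     0 420 50 44 17   1 65 80 48 15
  5 40 75 51 15    2 900 90 48 35    3 95 88 36 20    3 40 85 57 10
  3 140 90 38 7    0 150 50 44 5     0 80 60 96 5     0 80 85 96 5
  0 145 65 72 9    0 100 65 72 9     3 150 80 48 3    2 150 80 48 0
  3 210 75 42 2    5 11 75 42 0      0 100 65 60 25   3 50 88 60 20
")
fractures <- as.data.frame(matrix(
  vals, ncol = 5, byrow = TRUE,
  dimnames = list(NULL, c("y", "x1", "x2", "x3", "x4"))
))
fractures$indicator <- as.integer(fractures$y > median(fractures$y))

# (b) Boxplots of each regressor by indicator, plus group medians.
par(mfrow = c(2, 2))
for (v in c("x1", "x2", "x3", "x4")) {
  boxplot(fractures[[v]] ~ fractures$indicator, xlab = "indicator", ylab = v)
}
print(aggregate(cbind(x1, x2, x3, x4) ~ indicator, data = fractures, FUN = median))

# Assumed split (part (c) is not given): 30 training rows, 14 test rows.
set.seed(1)
train_idx <- sample(nrow(fractures), 30)
train <- fractures[train_idx, ]
test  <- fractures[-train_idx, ]
preds <- c("x2", "x4")  # placeholder; use the variables chosen in part (b)
form  <- reformulate(preds, response = "indicator")
test_error <- function(pred) mean(pred != test$indicator)

# (d) LDA
lda_fit <- lda(form, data = train)
lda_err <- test_error(as.integer(as.character(predict(lda_fit, test)$class)))

# (e) kNN with k = 3 (assumed), standardizing with training means and SDs
train_x <- scale(train[, preds])
test_x  <- scale(test[, preds],
                 center = attr(train_x, "scaled:center"),
                 scale  = attr(train_x, "scaled:scale"))
knn_pred <- knn(train_x, test_x, cl = factor(train$indicator), k = 3)
knn_err  <- test_error(as.integer(as.character(knn_pred)))

# (f) Logistic regression with a 0.5 cutoff
logit_fit <- glm(form, data = train, family = binomial)
logit_err <- test_error(as.integer(predict(logit_fit, test, type = "response") > 0.5))

print(c(lda = lda_err, knn = knn_err, logistic = logit_err))
```

With only 14 test observations, each misclassified case moves the test error by about 7 percentage points, so the three rates depend heavily on the seed and the split. Standardizing the kNN inputs matters here because x1 and x4 are on very different scales.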
