Question: Help with coding in R: cyl

Help with coding in R:

cyl<-factor(scan(text= "6 6 4 6 8 6 8 4 4 6 6 8 8 8 8 8 8 4 4 4 4 8 8 8 8 4 4 4 8 6 8 4")) am<-factor(scan(text= "1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 0 0 0 0 0 1 1 1 1 1 1 1"))

1. Using the data `cyl` and `am` (transmission type) from Part II, group vehicles based into 8 cylinder and less than 8 cyl. Test whether there is evidence of association between many cylinders and automatic transmissions. (_Hint:_ use `levels()` to re-level `cyl` and then use `chisq.test()`).

2. The built in dataset `faithful` records the time between eruptions and the length of the prior eruption for 272 inter-eruption intervals (load the data with `data(faithful)`). Examine the distribution of each of these variables with `stem()` or `hist()`. Plot these variables against each other with the length of each eruption (`eruptions`) on the x axis. How would you describe the relationship?

3) Fit a regression of `waiting` as a function of `eruptions`. What can we say about this regression? Compare the distribution of the residuals (`model$resid` where `model` is your lm object) to the distribution of the variables.

4) Is this data well suited to regression? Create a categorical variable from `eruptions` to separate long eruptions from short eruptions (2 groups) and fit a model (ANOVA) of `waiting` based on this. (_Hint:_ use `cut()` to make the categorical variable, and `lm()` to fit the model). How did you choose the point at which to cut the data? How might changing the cutpoint change the results?

Step by Step Solution

There are 3 Steps involved in it

1 Expert Approved Answer
Step: 1 Unlock blur-text-image
Question Has Been Solved by an Expert!

Get step-by-step solutions from verified subject matter experts

Step: 2 Unlock
Step: 3 Unlock

Students Have Also Explored These Related Databases Questions!