Question:
The data are available at the link below:
https://archive.ics.uci.edu/ml/datasets/Student+Performance
Data set description: This data set contains 649 rows of students' personal, family, and other related information, such as sex, age, and family size. There are 30 independent variables in total. The 3 response variables are the grades in either Math or Portuguese: two midterm grades and one final grade. Of the 30 variables, we choose the 25 most important ones.
These 25 variables are:
1 school - student's school (binary: 'GP' - Gabriel Pereira or 'MS' - Mousinho da Silveira)
2 sex - student's sex (binary: 'F' - female or 'M' - male)
3 age - student's age (numeric: from 15 to 22)
7 Medu - mother's education (numeric: 0 - none, 1 - primary education (4th grade), 2 - 5th to 9th grade, 3 - secondary education or 4 - higher education)
8 Fedu - father's education (numeric: 0 - none, 1 - primary education (4th grade), 2 - 5th to 9th grade, 3 - secondary education or 4 - higher education)
9 Mjob - mother's job (nominal: 'teacher', 'health' care related, civil 'services' (e.g. administrative or police), 'at_home' or 'other')
10 Fjob - father's job (nominal: 'teacher', 'health' care related, civil 'services' (e.g. administrative or police), 'at_home' or 'other')
14 studytime - weekly study time (numeric: 1 - <2 hours, 2 - 2 to 5 hours, 3 - 5 to 10 hours, or 4 - >10 hours)
15 failures - number of past class failures (numeric: n if 1<=n<3, else 4)
16 schoolsup - extra educational support (binary: yes or no)
18 paid - extra paid classes within the course subject (Math or Portuguese) (binary: yes or no)
19 activities - extra-curricular activities (binary: yes or no)
21 higher - wants to take higher education (binary: yes or no)
22 internet - Internet access at home (binary: yes or no)
23 romantic - with a romantic relationship (binary: yes or no)
24 famrel - quality of family relationships (numeric: from 1 - very bad to 5 - excellent)
26 goout - going out with friends (numeric: from 1 - very low to 5 - very high)
27 Dalc - workday alcohol consumption (numeric: from 1 - very low to 5 - very high)
28 Walc - weekend alcohol consumption (numeric: from 1 - very low to 5 - very high)
29 health - current health status (numeric: from 1 - very bad to 5 - very good)
30 absences - number of school absences (numeric: from 0 to 93)
Our Hypothesis: The student's sex (sex), parents' education level (Medu, Fedu), weekly study time (studytime), whether the student paid for extra classes (paid), and whether the student is in a romantic relationship (romantic) will significantly influence the student's grades.
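One way to make this hypothesis testable is to write it as a multiple linear regression with one coefficient per predictor and a two-sided t-test on each coefficient. The formulation below is a sketch under assumptions the question does not state: the response G is a single grade (for example the final grade), the yes/no and F/M variables are coded as 0/1 dummies, the ordinal codes Medu, Fedu and studytime enter as numeric scores, and the errors satisfy the usual normal-error conditions.

```latex
% Assumed response: G is one of the three grades (e.g. the final grade).
% sexM, paid, romantic are 0/1 dummies; Medu, Fedu, studytime are numeric codes.
\begin{align*}
G_i &= \beta_0 + \beta_1\,\mathrm{sexM}_i + \beta_2\,\mathrm{Medu}_i + \beta_3\,\mathrm{Fedu}_i
       + \beta_4\,\mathrm{studytime}_i + \beta_5\,\mathrm{paid}_i
       + \beta_6\,\mathrm{romantic}_i + \varepsilon_i, \\
H_{0,j} &:\ \beta_j = 0 \quad\text{vs.}\quad H_{1,j}:\ \beta_j \neq 0, \qquad j = 1,\dots,6, \\
t_j &= \frac{\hat\beta_j}{\mathrm{SE}(\hat\beta_j)} \sim t_{n-7} \quad\text{under } H_{0,j}.
\end{align*}
```

Under this reading, the hypothesis is supported for predictor j when the p-value of t_j falls below the chosen significance level (0.05 is a common choice; the question does not fix one).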
Requirement: Use RStudio to fit a linear regression that tests our hypothesis. In addition to the code, write a one-page, complete and clear description of the analysis you performed. It should be sufficient for someone else to run an R program and reproduce your results, and it should also help people who read your code later. This section should tie your computations to your questions/hypotheses, indicating exactly which results would lead you to which conclusion. You may want to report the key statistics, e.g., t-statistics, p-values, R² and the adjusted R².
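A minimal R sketch of such an analysis follows. It rests on several assumptions that are not given in the question: the data file is student-por.csv (the 649-row, semicolon-separated Portuguese file from the UCI archive) in the working directory, the response is the final grade G3, and the helper name fit_hypothesis_model is purely illustrative.

```r
# Sketch only. Assumptions: G3 (final grade) is the response, and the
# ordinal codes Medu, Fedu, studytime are treated as numeric scores.
fit_hypothesis_model <- function(students) {
  # Binary text fields become factors so lm() builds the dummy variables.
  students$sex      <- factor(students$sex)
  students$paid     <- factor(students$paid)
  students$romantic <- factor(students$romantic)
  lm(G3 ~ sex + Medu + Fedu + studytime + paid + romantic, data = students)
}

# Assumed file name and location; the UCI files use ";" as the separator.
if (file.exists("student-por.csv")) {
  students <- read.csv("student-por.csv", sep = ";")
  fit <- fit_hypothesis_model(students)
  s <- summary(fit)

  print(s$coefficients)    # estimate, std. error, t value, Pr(>|t|) per term
  print(s$r.squared)       # R-squared
  print(s$adj.r.squared)   # adjusted R-squared
  print(s$fstatistic)      # overall F test for the six predictors jointly
  print(confint(fit, level = 0.95))  # 95% confidence intervals
}
```

Each row of the coefficient table corresponds to one part of the hypothesis: a small p-value in the Pr(>|t|) column for a term (for example below 0.05) would count as evidence that the predictor is associated with the grade, while a large p-value would not support that part of the hypothesis. Repeating the fit with G1 or G2 as the response only requires changing the left-hand side of the formula.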
