Question:
For this chapter's exercise, you will use logistic regression to try to predict whether or not young people you know will eventually graduate from college. Complete the following steps:

1. Open a new blank spreadsheet in OpenOffice Calc. At the bottom of the spreadsheet there will be three default tabs labeled Sheet1, Sheet2, and Sheet3. Rename the first one Training and the second one Scoring. You can rename the tabs by double-clicking on their labels. You can delete or ignore the third default sheet.

2. On the Training sheet, starting in cell A1 and going across, create attribute labels for five attributes: ParentGrad, Gender, IncomeLevel, NumSiblings, and Graduated.

3. Copy each of these attribute names except Graduated into the Scoring sheet.

4. On the Training sheet, enter values for each of these attributes for several adults that you know who are at the age that they could have graduated from college by now. These could be family members, friends and neighbors, coworkers or fellow students, etc. Try to do at least […] observations; […] or more would be better. Enter husband and wife couples as two separate observations. Use the following to guide your data entry:
   a. For ParentGrad, enter a 0 if neither of the person's parents graduated from college, a 1 if one parent did, and a 2 if both parents did. If the person's parents went on to earn graduate degrees, you could experiment with making this attribute even more interesting by using it to hold the total number of college degrees earned by the person's parents. For example, if the person represented in the observation had a mother who earned a bachelor's, master's, and doctorate, and a father who earned a bachelor's and a master's, you could enter a 5 in this attribute for that person.
   b. For Gender, enter […] for female and […] for male.
   c. For IncomeLevel, enter a […] if the person lives in a household with an income level that you would consider to be below average, a […] for average, and a […] for above average. You can estimate or generalize. Be sensitive to others when gathering your data; don't snoop too much or risk offending your data subjects.
   d. For NumSiblings, enter the number of siblings the person has.
   e. For Graduated, put Yes if the person has graduated from college and No if they have not.

5. Once you've compiled your Training data set, switch to the Scoring sheet in OpenOffice Calc. Repeat the data entry process for at least […] (more is better) young people between the ages of […] and […] that you know. You will use the training set to try to predict whether or not these young people will graduate from college, and if so, how confident you are in your prediction. Remember, this is your scoring data, so you won't provide the Graduated attribute; you'll predict it shortly.

6. Use the File > Save As menu option in OpenOffice Calc to save your Training and Scoring sheets as CSV files.

7. Import your two CSV files into your RapidMiner repository. Be sure to give them descriptive names. Alternatively, you can simply connect to them using Read CSV operators.

8. Add your two data sets to a new Main Process window. If you have prepared your data well in OpenOffice Calc, you shouldn't have any missing or inconsistent data to contend with, so data preparation should be minimal. Rename the two Retrieve or Read CSV operators so you can tell the difference between your training and scoring data sets.

9. One necessary data preparation step is to add a Set Role operator and define the Graduated attribute as your label in your training data. Alternatively, you can set your Graduated attribute as the label during data import.

10. Add a Logistic Regression operator to your Training stream.

11. Apply your Logistic Regression model to your scoring data and run your model. Evaluate and report your results. Are your confidence percentages interesting? Surprising? Do the predicted Graduation values seem reasonable and consistent with your training data? Does any one independent variable (predictor attribute) seem to be a particularly good predictor of the dependent variable (label or prediction attribute)? If so, why do you think so?
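
For readers who want to see the train-then-score idea outside RapidMiner, the sketch below fits a logistic regression by plain gradient descent and then scores unlabeled rows with a confidence value. It is only an illustration under stated assumptions: it is not the algorithm RapidMiner's Logistic Regression operator uses, every data row is an invented placeholder, the 0/1 Gender codes and 0/1/2 IncomeLevel codes are assumed for the example, and the names `fit`, `score`, `TRAINING`, and `SCORING` are hypothetical.

```python
import math

# Attribute order follows the exercise:
# ParentGrad, Gender, IncomeLevel, NumSiblings, Graduated.
# Every row is an invented placeholder, not a real observation.
# Assumed codes for this sketch: Gender 0/1, IncomeLevel 0/1/2.
TRAINING = [
    (2, 0, 2, 1, "Yes"),
    (2, 1, 1, 2, "Yes"),
    (1, 0, 2, 0, "Yes"),
    (1, 0, 1, 1, "Yes"),
    (1, 1, 1, 3, "No"),
    (0, 0, 0, 4, "No"),
    (0, 1, 1, 2, "No"),
    (0, 1, 0, 5, "No"),
]

# Scoring rows carry no Graduated value; that is what gets predicted.
SCORING = [
    (2, 1, 2, 0),
    (0, 1, 0, 4),
]


def sigmoid(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def linear(weights, x):
    """Intercept (weights[0]) plus the weighted sum of the attributes."""
    return weights[0] + sum(w * v for w, v in zip(weights[1:], x))


def fit(rows, epochs=2000, lr=0.1):
    """Fit weights by batch gradient descent on the log loss."""
    weights = [0.0] * len(rows[0])  # one intercept + one weight per attribute
    for _ in range(epochs):
        grad = [0.0] * len(weights)
        for *x, label in rows:
            y = 1.0 if label == "Yes" else 0.0
            err = sigmoid(linear(weights, x)) - y
            grad[0] += err
            for i, v in enumerate(x):
                grad[i + 1] += err * v
        weights = [w - lr * g / len(rows) for w, g in zip(weights, grad)]
    return weights


def score(weights, x):
    """Return the predicted label and the model's confidence in 'Yes'."""
    p_yes = sigmoid(linear(weights, x))
    return ("Yes" if p_yes >= 0.5 else "No"), p_yes


weights = fit(TRAINING)
for row in SCORING:
    prediction, p_yes = score(weights, row)
    print(row, prediction, f"confidence(Yes)={p_yes:.2f}")
```

Inspecting `weights[1:]` after fitting gives a rough sense of which attribute pushes the prediction hardest, which relates to the exercise's final question. With so few rows, and attributes on different scales, the magnitudes should be read with caution.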
