Question: Apply a proper regression model, either Binary Logistic Regression or Linear Regression

Task 1

1. Download the house data from LMS and save it on your PC in your working directory (folder). The data has 12 columns. Columns 1 to 11 are independent variables (X1 to X11). Column 12 is the dependent variable (Y).
2. (The text of this item is missing from the source.)
3. Remove inconsistent data from some ROWS. For example, "bathrooms" values should be integers only (for example, 1, 2, 4, 5, ... etc). If any row in "bathrooms" contains a real number (1.2, 3.4, 7.3, ... etc), remove the complete row from the dataset. Please check the other COLUMNS carefully. If you think any one of these columns should contain integer values only, remove the row(s) that contain real value(s). Write code to perform the checking and removing process. Do not ask me which one should be checked or removed.
4. Perform the following Data Exploration processes:
   - Median: for all columns.
   - Range: for all columns.
   - Frequency: for the following columns ONLY: "bathrooms", "floors", "condition", "grade".
5. Check if there are any "missing values" in the data. If you find a "missing value", use "mean" to replace the "missing value" for a real value and "min" for an integer value.
6. Check all column value ranges. If you think normalisation is needed, use the "Min-Max" method.
7. Apply a proper regression model, either Binary Logistic Regression or Linear Regression. You need to think about which one best fits this dataset. Use the following for the selected model:
   o Assign random values to all Beta (B0 to B11) parameters. All random values MUST be real values between "0" and "1". Make SURE all parameters (B0 to B11) DO NOT have the same VALUE.
   o Use "Yest" for the predicted (estimated) value. Do not use a different name.
   o Perform the steps of the selected regression model.
   o If you selected Binary Logistic Regression, you need to calculate and print the accuracy. If you selected Linear Regression, you need to calculate and print the Sum of Squared Errors (SSE).
8. Apply the Gradient Descent Algorithm (GDA) to optimise the selected regression model for 500 iterations. Check Algorithm 1 Pseudocode for GDA steps on Page 35 of Lecture Note Week 6. Set Theta to 0.01. Use the following to calculate the partial derivative for the Beta parameters:
   - For B0: B0 = (1 / No. of samples) * (Yest - Y)
   - For the other parameters (B1 to B11), for i = 1 to 11: Bi = (1 / No. of samples) * ((Yest - Y) * xi)
9. Print the final results based on the selected regression model: either accuracy or SSE.
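The cleaning, exploration, missing-value and normalisation steps could be sketched roughly as follows. This is only an illustration under stated assumptions: the real file is not shown in the question, so the rows, the "sqft_living" column, and the choice of which columns count as integer-only are hypothetical, and the helper names are made up for this sketch. For brevity the exploration statistics are computed after the missing values are filled.

```python
from collections import Counter
from statistics import mean, median

# Hypothetical sample rows; the real data would be read from the LMS file.
rows = [
    {"bathrooms": 1.0, "floors": 1.0, "sqft_living": 1180.0},
    {"bathrooms": 2.25, "floors": 2.0, "sqft_living": 2570.0},  # real-valued bathrooms
    {"bathrooms": 2.0, "floors": 1.0, "sqft_living": None},     # missing real value
    {"bathrooms": 3.0, "floors": None, "sqft_living": 1960.0},  # missing integer value
]
# Assumption: these are the columns judged to be integer-only.
INTEGER_COLUMNS = ["bathrooms", "floors"]


def is_whole(value):
    """Missing values are handled later; anything else must be a whole number."""
    return value is None or float(value).is_integer()


def drop_inconsistent(data, int_cols):
    """Remove the complete row if any integer-only column holds a real number."""
    return [r for r in data if all(is_whole(r[c]) for c in int_cols)]


def fill_missing(data, int_cols):
    """Replace missing values: column min for integer columns, mean otherwise."""
    filled = [dict(r) for r in data]
    for col in data[0]:
        present = [r[col] for r in data if r[col] is not None]
        fill = min(present) if col in int_cols else mean(present)
        for r in filled:
            if r[col] is None:
                r[col] = fill
    return filled


def min_max(data):
    """Min-Max normalisation: (x - min) / (max - min) for every column."""
    scaled = [dict(r) for r in data]
    for col in data[0]:
        lo = min(r[col] for r in data)
        span = max(r[col] for r in data) - lo
        for r in scaled:
            r[col] = (r[col] - lo) / span if span else 0.0
    return scaled


clean = fill_missing(drop_inconsistent(rows, INTEGER_COLUMNS), INTEGER_COLUMNS)

# Data exploration: median and range for all columns, frequency for one column.
medians = {c: median(r[c] for r in clean) for c in clean[0]}
ranges = {c: (min(r[c] for r in clean), max(r[c] for r in clean)) for c in clean[0]}
bathroom_freq = Counter(r["bathrooms"] for r in clean)

normalised = min_max(clean)
print(medians, ranges, bathroom_freq, sep="\n")
```

The same frequency count would be repeated for "floors", "condition" and "grade" once those columns are present in the real data.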
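The regression and gradient descent steps might look like the sketch below. It assumes that Y is a continuous value, which the question does not state; under that assumption Linear Regression is the fitting choice and SSE is the metric to print. It also reads "Theta" as the step size and reads the (1 / No. of samples) factor as an average over all rows, both of which are interpretations, since the question shows no summation sign. The data here is hypothetical, with two features instead of eleven, and the function names are made up for this sketch.

```python
import random


def predict(betas, x):
    """Yest = B0 + B1*x1 + ... + Bk*xk for one row."""
    return betas[0] + sum(b * xi for b, xi in zip(betas[1:], x))


def sse(betas, X, Y):
    """Sum of Squared Errors over all rows."""
    return sum((predict(betas, x) - y) ** 2 for x, y in zip(X, Y))


def gradient_descent(X, Y, betas, theta=0.01, iterations=500):
    """Update every Beta once per iteration using the averaged partial derivatives."""
    n = len(X)
    betas = list(betas)
    for _ in range(iterations):
        # (Yest - Y) for every row, computed before any Beta is changed.
        errors = [predict(betas, x) - y for x, y in zip(X, Y)]
        grad_b0 = sum(errors) / n
        grad_bi = [
            sum(e * x[i] for e, x in zip(errors, X)) / n
            for i in range(len(betas) - 1)
        ]
        betas[0] -= theta * grad_b0
        for i, g in enumerate(grad_bi):
            betas[i + 1] -= theta * g
    return betas


random.seed(0)  # fixed seed so the run is repeatable

# Hypothetical, already normalised data (assumption: values lie in [0, 1]).
X = [[0.0, 0.2], [0.5, 0.1], [1.0, 0.9], [0.3, 0.7]]
Y = [0.1, 0.3, 0.9, 0.4]

# Random real values strictly between 0 and 1; sampling without replacement
# guarantees that no two parameters share the same value.
betas = random.sample([i / 1000 for i in range(1, 1000)], len(X[0]) + 1)

sse_before = sse(betas, X, Y)
betas_final = gradient_descent(X, Y, betas, theta=0.01, iterations=500)
sse_after = sse(betas_final, X, Y)

print("SSE before GDA:", sse_before)
print("SSE after GDA:", sse_after)
```

If Y turned out to be a 0/1 label instead, the same loop would apply with a sigmoid wrapped around the prediction and accuracy printed in place of SSE.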
