Question: Using Weka

The data set for the clustering test is for the value of houses in various areas around Boston (USA) based on characteristics of the locale such as proximity to the Charles River and major highways, socioeconomic status, air pollution and other factors.

The attributes in the data set are:
1. Crime rate
2. Industrial Area
3. By Charles River
4. Nitrous Oxide Level
5. Number of rooms
6. Age
7. Distance from cities
8. Tax rate
9. Percent Black Inhabitants
10. Inhabitants in Lower status
11. House value

You will require the Boston data file in CSV format for this test.
1. Start WEKA.
2. Load your data file and carry out any necessary preprocessing on it. Report any preprocessing carried out.

Build clustering models with varying values of k (the number of clusters) using SimpleKMeans for the given data set. From the evaluation metrics generated, which clustering model gave the best result? Explain why. Give an interpretation of the clusters generated from the best model which the City Council social workers can use to gain useful insights into the demographics of that part of the city.
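The comparison of models across values of k can be sketched outside WEKA as follows. This is a minimal illustration, not the WEKA implementation: it uses plain Lloyd's k-means in standard-library Python, min-max scaling as the assumed preprocessing step, and the within-cluster sum of squared errors (the figure WEKA's SimpleKMeans prints in its output) as the evaluation metric. The two-column data set is synthetic and hypothetical; it is not the Boston file, and the helper names (min_max_scale, k_means, best_sse, make_toy_data) are made up for this sketch.

```python
import random


def min_max_scale(rows):
    """Rescale every column to [0, 1] so no attribute dominates the distance."""
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    highs = [max(c) for c in cols]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def k_means(rows, k, seed=0, max_iter=100):
    """Plain Lloyd's algorithm; returns (centroids, labels, sse)."""
    rng = random.Random(seed)
    # Initial centroids are k distinct rows picked at random.
    centroids = [list(r) for r in rng.sample(rows, k)]
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(k), key=lambda j: sq_dist(r, centroids[j]))
            for r in rows
        ]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for j in range(k):
            members = [r for r, lab in zip(rows, labels) if lab == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = [sum(c) / len(members) for c in zip(*members)]
    sse = sum(sq_dist(r, centroids[lab]) for r, lab in zip(rows, labels))
    return centroids, labels, sse


def best_sse(rows, k, restarts=5):
    """Lowest SSE over several random initialisations for a given k."""
    return min(k_means(rows, k, seed=s)[2] for s in range(restarts))


def make_toy_data(seed=42):
    """Synthetic (rooms, house value) pairs in three groups; NOT real data."""
    rng = random.Random(seed)
    centers = [(4.0, 12.0), (6.0, 22.0), (8.0, 40.0)]
    return [
        [rng.gauss(cx, 0.3), rng.gauss(cy, 2.0)]
        for cx, cy in centers
        for _ in range(30)
    ]


if __name__ == "__main__":
    data = min_max_scale(make_toy_data())
    for k in range(1, 7):
        print(k, round(best_sse(data, k), 4))
```

Because the SSE generally keeps falling as k grows, the lowest value alone does not identify the best model. A common heuristic is to pick the k after which the drop in SSE levels off (the "elbow"), and then to read the centroids of that model as cluster profiles, for example a cluster with few rooms and low house value versus one with many rooms and high house value.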
