The Football Bowl Subdivision (FBS) level of the National Collegiate Athletic Association (NCAA) consists of over 100 schools. Most of these schools belong to one of several conferences, or collections of schools, that compete with each other on a regular basis in collegiate sports. Suppose the NCAA has commissioned a study that will propose the formation of conferences based on the similarities of the constituent schools. The file FBS contains data on schools that belong to the FBS. Each row in this file contains information on one school. The variables include football stadium capacity, latitude, longitude, athletic department revenue, endowment, and undergraduate enrollment.
a. Apply k-means clustering with k = 10 using football stadium capacity, latitude, longitude, endowment, and enrollment as variables. Be sure to select Normalize input data, and specify 50 iterations and 10 random starts in Step 2 of the XLMiner k-Means Clustering procedure. Analyze the resultant clusters. Which cluster is the smallest? Which cluster is the least dense (as measured by the average distance in the cluster)? What makes the least dense cluster so diverse?
b. What problems do you see with the plan of defining the school membership of the 10 conferences directly from the 10 clusters?
c. Repeat part a, but this time do not select Normalize input data in Step 2 of the XLMiner k-Means Clustering procedure. Analyze the resultant clusters. Why do they differ from those in part a? Identify the dominating factor(s) in the formation of these new clusters.
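
For readers working outside XLMiner, the contrast between parts a and c can be explored with a minimal sketch in plain Python. It is not the XLMiner procedure: it is a basic k-means (Lloyd's algorithm) with random starts, and it assumes that normalizing means z-scoring each variable. The function names (`zscore_columns`, `kmeans`, `cluster_summary`) and the eight toy rows are hypothetical and are not taken from the FBS file; the sketch uses k = 2 and three variables only to keep the output readable.

```python
import math
import random
import statistics


def zscore_columns(rows):
    """Rescale every column to mean 0 and (population) standard deviation 1."""
    cols = list(zip(*rows))
    means = [statistics.mean(c) for c in cols]
    stdevs = [statistics.pstdev(c) or 1.0 for c in cols]  # guard constant columns
    return [[(v - m) / s for v, m, s in zip(row, means, stdevs)] for row in rows]


def kmeans(rows, k, iterations=50, starts=10, seed=0):
    """Lloyd's algorithm; keeps the random start with the lowest within-cluster SSE."""
    rng = random.Random(seed)
    best = None
    for _ in range(starts):
        # Each start picks k distinct observations as the initial centroids.
        centroids = [list(r) for r in rng.sample(rows, k)]
        labels = None
        for _ in range(iterations):
            new_labels = [
                min(range(k), key=lambda j: math.dist(row, centroids[j]))
                for row in rows
            ]
            if new_labels == labels:  # assignments stable: converged
                break
            labels = new_labels
            for j in range(k):
                members = [r for r, lab in zip(rows, labels) if lab == j]
                if members:
                    centroids[j] = [statistics.mean(c) for c in zip(*members)]
        sse = sum(math.dist(r, centroids[lab]) ** 2 for r, lab in zip(rows, labels))
        if best is None or sse < best[0]:
            best = (sse, labels, centroids)
    return best[1], best[2]


def cluster_summary(rows, labels, centroids):
    """Map cluster index -> (size, average distance of members to the centroid)."""
    summary = {}
    for j, centroid in enumerate(centroids):
        members = [r for r, lab in zip(rows, labels) if lab == j]
        if members:
            avg = sum(math.dist(m, centroid) for m in members) / len(members)
            summary[j] = (len(members), avg)
    return summary


# Hypothetical toy rows: (latitude, longitude, endowment in dollars).
# These values are invented for illustration and are NOT from the FBS file.
schools = [
    [30.0, -85.0, 1.0e9], [31.0, -84.0, 9.0e9],
    [30.5, -86.0, 1.2e9], [29.5, -85.5, 9.3e9],
    [46.0, -120.0, 1.1e9], [45.0, -121.0, 9.1e9],
    [46.5, -119.0, 0.9e9], [45.5, -120.5, 8.8e9],
]

# Part c analogue: cluster the raw values.
raw_labels, raw_centroids = kmeans(schools, k=2)

# Part a analogue: cluster the z-scored values.
scaled = zscore_columns(schools)
norm_labels, norm_centroids = kmeans(scaled, k=2)

print("raw labels:       ", raw_labels)
print("normalized labels:", norm_labels)
print("normalized summary:", cluster_summary(scaled, norm_labels, norm_centroids))
```

On this toy data, the raw run should group schools by endowment, because differences measured in billions of dollars swamp differences measured in degrees, while the normalized run should group them by region. The summary dictionary gives the two quantities part a asks about: cluster size and average distance to the centroid, where a larger average distance indicates a less dense cluster.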