Question:
After seeing a presentation on the power of data mining and hearing about your past consulting work, you are approached to create a model for a university's admissions office. The members of the admissions review board just heard about a technique called "clustering", and they think it would be a good idea to try this technique on the newest batch of applying students for the upcoming academic year. By sorting applicants into groups, they believe it will be easier to then make application and recruiting decisions.
For confidentiality reasons, you are provided with only a few select attributes for each of the university's roughly 4,000 undergraduate applicants.
Attributes
- hsgpa - the applicant's high school GPA (out of 4)
- sat - the applicant's SAT score (out of 1600)
- hsize - the size of the applicant's graduating class
- athlete - whether the student participated in high school athletics for at least 1 year
A) Upload the gpa.csv data to BigML and create a cluster model. Use the default k-means algorithm and set the number of clusters (k) to 3.
In about 5-8 sentences, briefly describe each of the 3 clusters and explain any overarching takeaways about the groups of applicants this university receives.
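As a rough illustration of what a k-means model with k = 3 does with attributes like these, here is a minimal sketch in plain Python. It is not BigML's API or implementation, and it does not read gpa.csv: the applicant rows are randomly generated stand-ins, and the helper names (`make_fake_applicants`, `standardize`, `kmeans`) are hypothetical. The sketch assumes each attribute is standardized before clustering, since otherwise sat (hundreds of points) would dominate hsgpa (a 0-4 scale) in the distance calculation.

```python
import random
import statistics


def make_fake_applicants(n=300, seed=1):
    """Generate made-up rows of [hsgpa, sat, hsize, athlete]. NOT the real gpa.csv."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        hsgpa = round(rng.uniform(2.0, 4.0), 2)   # GPA out of 4
        sat = rng.randrange(800, 1601, 10)        # SAT out of 1600
        hsize = rng.randint(50, 800)              # graduating class size
        athlete = rng.randint(0, 1)               # 1 = played at least 1 year
        rows.append([hsgpa, sat, hsize, athlete])
    return rows


def standardize(rows):
    """Rescale each column to mean 0 and (population) standard deviation 1."""
    cols = list(zip(*rows))
    means = [statistics.fmean(c) for c in cols]
    stdevs = [statistics.pstdev(c) or 1.0 for c in cols]  # avoid divide-by-zero
    return [[(v - m) / s for v, m, s in zip(row, means, stdevs)] for row in rows]


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k=3, iterations=100, seed=0):
    """Basic k-means: assign each point to its nearest centroid, then recompute."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]  # random initial centroids
    labels = [0] * len(points)
    for _ in range(iterations):
        labels = [min(range(k), key=lambda j: sq_dist(p, centroids[j]))
                  for p in points]
        new_centroids = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centroids.append([statistics.fmean(c) for c in zip(*members)])
            else:
                new_centroids.append(centroids[j])  # keep an empty cluster's centroid
        if new_centroids == centroids:  # assignments have stopped changing
            break
        centroids = new_centroids
    return labels, centroids


applicants = make_fake_applicants()
labels, centroids = kmeans(standardize(applicants), k=3)

# Summarize each cluster in the original units: size, then mean of each attribute.
for j in range(3):
    members = [r for r, lab in zip(applicants, labels) if lab == j]
    means = [round(statistics.fmean(c), 2) for c in zip(*members)]
    print(f"cluster {j}: n={len(members)}, mean [hsgpa, sat, hsize, athlete] = {means}")
```

The per-cluster means printed at the end are the kind of summary that the 5-8 sentence description in Part A would draw on. With the real data, the cluster sizes and centroid values reported by BigML would take their place.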
B) Based on your findings in Part A, what type of supervised learning could help further explore the data and these clusters? Do you believe this would provide any meaningful information, and do you have any concerns about a university making application decisions using these types of models (supervised or unsupervised)? Explain why or why not.