Question: A student information system (SIS) intends to include additional ML-based insights into its management reporting suite. One of the CSV data files used to prime the SIS machine learning application has been found to have significant data quality issues.
You have been asked to provide advice and guidance on the steps that need to be taken to clean the data file as part of quality assurance for the ML system.
To complete this assignment:
Carry out the following tasks
Perform a detailed analysis of data quality for the dataset provided (see note), making use of some of the criteria discussed in the week's lectures.
Map out, in outline form, what data cleansing measures would need to be put in place to prepare the dataset for ML work.
Indicate the types of Pandas/Scikit-Learn Python commands that could be used whilst carrying out the measures you suggested in the previous task.
Attempt a cleanup of the dataset using the procedures set out in the two preceding tasks, and report on the degree of success with carrying this out.
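As a starting point for the data quality analysis, a minimal pandas sketch of per-column profiling might look like the following. The helper name `profile_quality`, the sample rows, and the column names are all assumptions made for illustration; the real file is not shown in the question, so the checks would need adapting to its actual headers and date format.

```python
import io
import pandas as pd


def profile_quality(df: pd.DataFrame) -> pd.DataFrame:
    """Summarise completeness and uniqueness for each column (hypothetical helper)."""
    return pd.DataFrame({
        "dtype": df.dtypes.astype(str),
        "missing": df.isna().sum(),
        "missing_pct": (df.isna().mean() * 100).round(1),
        "n_unique": df.nunique(dropna=True),
    })


# Hypothetical sample standing in for the real CSV file.
sample_csv = io.StringIO(
    "id,name,join_date\n"
    "1,Alice,2019-01-05\n"
    "2,Bob,\n"
    "2,Bob,\n"
    "3,,2020-13-01\n"
)
raw = pd.read_csv(sample_csv)

# Completeness and uniqueness per column.
report = profile_quality(raw)

# Uniqueness at row level: count exact duplicate rows.
duplicate_rows = int(raw.duplicated().sum())

# Validity: values that are present but do not parse as a date
# (assumes ISO year-month-day format, which the real file may not use).
parsed = pd.to_datetime(raw["join_date"], format="%Y-%m-%d", errors="coerce")
bad_dates = int((parsed.isna() & raw["join_date"].notna()).sum())

print(report)
print("duplicate rows:", duplicate_rows, "| invalid dates:", bad_dates)
```

The three checks map onto common data quality criteria (completeness, uniqueness, validity); consistency and accuracy checks would need knowledge of the real values.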
Note: The database provided contains IDs, names, locations, job titles, join dates, type of contract, highest academic qualification, major, university, courses taught and major teaching field.
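For a dataset with fields like those listed in the note, a cleanup pass in pandas could be sketched as below. This is only an illustration under stated assumptions: the function name `clean_staff`, the exact header spellings, the sample values, and the ISO date format are hypothetical and are not taken from the actual file.

```python
import io
import pandas as pd


def clean_staff(df: pd.DataFrame) -> pd.DataFrame:
    """Apply basic cleansing steps to a staff table (hypothetical column names)."""
    out = df.copy()

    # Normalise headers: trim, lower-case, and use underscores.
    out.columns = out.columns.str.strip().str.lower().str.replace(" ", "_")

    # Trim text fields and collapse repeated internal whitespace.
    for col in ["name", "location", "contract_type"]:
        out[col] = out[col].str.strip().str.replace(r"\s+", " ", regex=True)

    # Make categorical spellings consistent.
    out["location"] = out["location"].str.title()
    out["contract_type"] = out["contract_type"].str.lower()

    # Parse dates; unparseable values become NaT instead of raising.
    out["join_date"] = pd.to_datetime(
        out["join_date"], format="%Y-%m-%d", errors="coerce"
    )

    # Remove rows without an ID, then keep the first row per ID.
    out = out.dropna(subset=["id"]).drop_duplicates(subset="id", keep="first")

    # Flag missing contract types explicitly rather than guessing a value.
    out["contract_type"] = out["contract_type"].fillna("unknown")

    return out.reset_index(drop=True)


# Hypothetical sample rows standing in for the real CSV file.
sample_csv = io.StringIO(
    "ID, Name ,Location,Contract Type,Join Date\n"
    "101,  Ann  Lee ,london,Full-Time,2018-09-01\n"
    "102,Raj Patel,LONDON,full-time,2019-02-30\n"
    "102,Raj Patel,LONDON,full-time,2019-02-30\n"
    "103,Mia Chen,new york,,2020-01-15\n"
)
cleaned = clean_staff(pd.read_csv(sample_csv))
print(cleaned)
```

Comparing row counts and the number of missing or `NaT` values before and after the pass gives a simple way to report on the degree of success. On the scikit-learn side, `SimpleImputer` (for filling missing values) and `OneHotEncoder` (for encoding categorical fields) are the kind of tools that would typically follow this step before model training.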
