(1) As an analyst, you have received data from a recently concluded survey that included a question on respondents' race (i.e. African, White, Coloured, and Indian). However, you find that about 30% of the data for the Race variable is missing. What would be your best course of action?
Exclude the data for the participants with the missing Race column
Impute the missing Race values
Create another category for the missing values, such as No Race
None of the above
(2) A survey is conducted that includes a question on salary, and you want to find the average salary of all interviewed respondents. You find that there is an outlier in the income data. Which of the following options would you choose?
Remove the outlier from the data before calculating the mean salary
Keep the outlier in the data and calculate the mean salary
None of the above
(3) If your quantitative data is right-skewed, which measure of central tendency (i.e. mean, median, mode) would not be ideal? Please explain why.
(4) Social Surveys Africa has received a project from a retail client to help forecast the average monthly household expenditure on food. On receiving the time-series dataset, which covers every month from 2012 to 2020, you find that the rows are in no particular order, the monthly timestamps are in string format (e.g. 17th Nov 2014), and there are some missing values. How would you go about cleaning the data before forecasting? Detail the data cleaning steps you would follow.
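For question (4), one possible cleaning sequence is sketched below in Python with pandas. It is only an illustration under stated assumptions, not a reference answer. The column names `date` and `food_spend` are hypothetical, since the question does not name them. Snapping each timestamp to the start of its month and filling gaps by linear interpolation are also assumed choices; a seasonal or model-based fill may suit a real forecasting project better.

```python
import pandas as pd


def clean_monthly_series(df, date_col="date", value_col="food_spend"):
    """Return a chronologically ordered, gap-free monthly series.

    date_col and value_col are hypothetical column names.
    """
    out = df.copy()

    # Step 1: strip ordinal suffixes ("17th" -> "17") and parse the strings.
    stripped = out[date_col].str.replace(r"(?<=\d)(st|nd|rd|th)", "", regex=True)
    out[date_col] = pd.to_datetime(stripped, format="%d %b %Y", errors="coerce")

    # Step 2: rows whose timestamp could not be parsed cannot be placed in time.
    out = out.dropna(subset=[date_col])

    # Step 3: snap each observation to its month start (assumption: one value
    # per month is wanted) and average any duplicates within a month.
    out["month"] = out[date_col].dt.to_period("M").dt.to_timestamp()
    series = out.groupby("month")[value_col].mean().sort_index()

    # Step 4: reindex to a complete monthly calendar so absent months
    # show up explicitly as NaN.
    full_range = pd.date_range(series.index.min(), series.index.max(), freq="MS")
    series = series.reindex(full_range)

    # Step 5: fill gaps by linear interpolation (assumed imputation choice).
    return series.interpolate(method="linear")


# Small made-up sample: unordered rows, string dates, one missing value.
raw = pd.DataFrame(
    {
        "date": ["17th Nov 2014", "1st Sep 2014", "3rd Dec 2014", "5th Oct 2014"],
        "food_spend": [300.0, 100.0, 400.0, None],
    }
)
cleaned = clean_monthly_series(raw)
print(cleaned)
```

Reindexing before interpolating matters here: a month that is absent from the file entirely is otherwise invisible, and the forecast model would treat the series as evenly spaced when it is not.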
