Question:
DATA DESCRIPTION
(A FICTITIOUS DATASET DESIGNED FOR THE ASSIGNMENT ONLY)
A consulting firm randomly selected 150 young employees in Tasmania. These selected employees
answered questions and undertook a standard IQ test and a KW test. The KW test examines
respondents' knowledge about the duties in their workplaces and the knowledge about the Australian
and Tasmanian labour markets. Respondents' answers are entered into a spreadsheet where each column
represents a variable. These variables include:
1. wage: monthly earnings in dollars
2. hours: average weekly working hours
3. IQ: IQ score
4. KW: knowledge of work score
5. educ: years of education
6. exper: years of work experience
7. tenure: years with the current employer
8. age: age in years
9. marriage: marital status
10. gender: female or male
11. urban: =Y if lives in urban areas
=N if lives in rural areas
12. sibs: the number of siblings
13. brthord: birth order, e.g. =2 means he/she is the second child in the family.
14. meduc: mother's education
15. feduc: father's education
The missing values are shown by a "." in the cells.
Questions:
1. Read the provided raw data carefully to check whether all respondents have provided
information for each variable. Explain what you have done to manage the missing data. Clearly
indicate the final number of observations (respondents) you will use in the following analysis.
Submit an electronic copy of the Excel spreadsheet of the final dataset together with your
assignment. All your following analysis should be based on this final dataset.
[10 marks]
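One common way to handle the "." cells is listwise deletion: drop every respondent who has at least one missing value. The sketch below only illustrates that idea; the rows in `sample` and the helper name `drop_incomplete` are made up for the example, and other strategies (such as imputation) may be equally defensible.

```python
MISSING = "."  # the dataset marks missing cells with a single dot


def drop_incomplete(rows):
    """Keep only the rows in which no cell equals the missing-value marker."""
    return [
        row for row in rows
        if all(str(value).strip() != MISSING for value in row.values())
    ]


# Hypothetical mini-sample; the real file has 150 rows and 15 columns.
sample = [
    {"wage": "950", "IQ": "104", "meduc": "12"},
    {"wage": "800", "IQ": ".", "meduc": "10"},
    {"wage": "1100", "IQ": "98", "meduc": "."},
]

clean = drop_incomplete(sample)
print(len(sample), "rows before,", len(clean), "rows after")
```

The length of `clean` is the final number of observations to report.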
2. Pick two numerical variables and two categorical variables and then describe each of them
one by one. Use appropriate tables/graphs and numerical measures to help you describe the
distribution of the variables.
[10 marks]
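As a rough illustration of the numerical measures involved, the sketch below summarises one numerical and one categorical variable. The values in `hours` and `gender` are invented for the example, and the helper names are hypothetical; graphs such as histograms or bar charts would still be produced separately (for instance in Excel).

```python
from collections import Counter
from statistics import mean, median, stdev


def describe_numeric(values):
    """Centre, spread, and range of a numerical variable."""
    return {
        "n": len(values),
        "mean": mean(values),
        "median": median(values),
        "sd": stdev(values),  # sample standard deviation (n - 1)
        "min": min(values),
        "max": max(values),
    }


def describe_categorical(values):
    """Frequency and relative frequency of each category."""
    n = len(values)
    return {cat: (count, count / n) for cat, count in Counter(values).items()}


# Invented values, for illustration only.
hours = [40, 38, 45, 40, 50]
gender = ["F", "M", "M", "F", "M"]

print(describe_numeric(hours))
print(describe_categorical(gender))
```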
3. It's often asked what factors relate to IQ score and KW score. Look through your data and
first pick one numerical variable that you think may relate to IQ score. Explain why you
picked this variable. Then use an appropriate graph and an appropriate numerical measure to
discuss the empirical relationship between IQ score and this numerical variable. Repeat the
same exercise for the relationship between KW score and a numerical variable to which you
think KW may relate.
[15 marks]
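A scatter plot paired with the Pearson correlation coefficient is one natural choice for two numerical variables. The sketch below computes the coefficient by hand; the function name `pearson_r` and the toy numbers are assumptions made for illustration, not part of the assignment.

```python
from math import sqrt


def pearson_r(x, y):
    """Sample Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Toy data: years of education against IQ score (invented values).
educ = [10, 12, 12, 14, 16]
iq = [92, 100, 97, 108, 115]
print(round(pearson_r(educ, iq), 3))
```

A value near +1 or -1 indicates a strong linear relationship, while a value near 0 indicates a weak one. It says nothing about causation.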
4. You want to look at the relationship between gender and wages. However, you notice that
gender is a categorical variable and wage is a numerical variable. One way to work on two
different types of variables is to transform one variable to the type of the other. You decide
to generate a categorical variable based on the level of wage, and this categorical variable has
two values, "high" and "low". For example, you choose a threshold value for wage, and if a
respondent's wage is no less than the threshold value, you enter "high" and enter "low"
otherwise.
a. Describe in detail how you have decided the threshold value for generating the new
categorical variable for the level of wage. Then use an appropriate graph to present this
variable. (Hint: you may choose to use an appropriate numerical measure of wage as the
threshold value).
b. Present these two categorical variables together using an appropriate graph, and then
discuss what the graph shows.
[4 marks]
c. Produce a contingency table to present these two categorical variables. Based on the
contingency table, calculate the related (empirical) joint and marginal probabilities. You
may find it helpful to produce another contingency table to show your calculated
probabilities. (Hint: you may need Excel skills -- e.g. use the commands such as "sort" or
"countif" to count the relevant frequencies, or use PivotTable function)
[8 marks]
d. Based on the sample information, calculate the probability of either being a female or
getting a low wage level, and calculate the probability of being a female conditional on
getting a low wage level.
[5 marks]
e. Examine whether the statement "Males tend to receive higher wages than females" is true,
false or inconclusive based on the sample information. Explain your response.
[5 marks]
[Total Marks 28]
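The chain of steps in question 4 (threshold, contingency table, joint and marginal probabilities, then the "or" and conditional probabilities) can be sketched as follows. The median is used as the threshold here purely as one possible choice suggested by the hint; the six data points and the helper names are invented for illustration.

```python
from collections import Counter
from statistics import median


def wage_level(wages):
    """Label each wage 'high' if it is no less than the median, else 'low'."""
    cut = median(wages)
    return ["high" if w >= cut else "low" for w in wages]


def joint_probabilities(gender, level):
    """Empirical joint probability of every (gender, level) cell."""
    n = len(gender)
    return {cell: c / n for cell, c in Counter(zip(gender, level)).items()}


# Invented values, for illustration only.
gender = ["F", "F", "F", "M", "M", "M"]
wages = [700, 800, 1000, 900, 1100, 1200]

level = wage_level(wages)
joint = joint_probabilities(gender, level)

# Marginal probabilities are row or column sums of the joint table.
p_female = sum(p for (g, _), p in joint.items() if g == "F")
p_low = sum(p for (_, lv), p in joint.items() if lv == "low")
p_female_and_low = joint.get(("F", "low"), 0.0)

# Addition rule and conditional probability.
p_female_or_low = p_female + p_low - p_female_and_low
p_female_given_low = p_female_and_low / p_low

print(joint)
print(p_female_or_low, p_female_given_low)
```

Comparing P(high | male) with P(high | female) in the same way gives one basis for judging the statement in part e.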
5. Suppose that the population average (monthly) wage of young employees in Tasmania in
the year before this survey was conducted was $900.
f. Conduct a hypothesis test that the population average wage of young employees in
Tasmania during the year of survey remains the same as in the previous year.
[7 marks]
g. Construct a 95% confidence estimate for the population average wage, and comment
whether the population average wage in the year of survey remains the same as in the
previous year.
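For question 5, the test statistic is (sample mean - 900) divided by the standard error, and the interval is the sample mean plus or minus a critical value times the standard error. The sketch below uses the standard normal distribution for the p-value and critical value, which is only an approximation to the t distribution and is reasonable when the sample is large (as with roughly 150 respondents). The function name and the five demo values are assumptions for illustration.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def one_sample_test(values, mu0, alpha=0.05):
    """Two-sided test of H0: mu = mu0, plus a (1 - alpha) confidence interval.

    Uses the normal approximation; with a small sample the t distribution
    (n - 1 degrees of freedom) should be used for p-value and critical value.
    """
    n = len(values)
    xbar = mean(values)
    se = stdev(values) / sqrt(n)           # standard error of the mean
    stat = (xbar - mu0) / se               # test statistic
    z = NormalDist()
    p_value = 2 * (1 - z.cdf(abs(stat)))   # two-sided p-value
    crit = z.inv_cdf(1 - alpha / 2)        # about 1.96 for alpha = 0.05
    return stat, p_value, (xbar - crit * se, xbar + crit * se)


# Invented wages, far too few for the approximation; shown only for the mechanics.
demo_wages = [880, 920, 900, 940, 860]
stat, p_value, ci = one_sample_test(demo_wages, mu0=900)
print(stat, p_value, ci)
```

The two parts agree by construction: the null value of 900 is rejected at the 5% level exactly when it falls outside the 95% interval.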
6. You want to use the collected data to study what is the most important factor that affects
young employees' wages in Tasmania. Use simple regression analysis to answer the following
questions. (For each regression you run, show the Excel regression output and report the
regression equation. Partial marks for the following questions are assigned to your regression
results.)
a. Do the years of education have a significant impact on the wages? (You need to explain the
choice of the null and the alternative hypotheses.)
[5 marks]
b. Do the IQ scores have a significant impact on the wages? (You need to explain the choice
of the null and the alternative hypotheses.)
[5 marks]
c. Which of the two variables is a better predictor for the wage, years of education or IQ
scores? Explain why.
[3 marks]
d. Do the years of work experience have a significant impact on the wages? (You need to
explain the choice of the null and the alternative hypotheses.)
[3 marks]
e. Do the KW scores have a significant impact on the wages? (You need to explain the choice
of the null and the alternative hypotheses.)
[3 marks]
f. Which of the two variables is a better predictor for the wages, years of work experience or
KW scores? Explain why.
[3 marks]
g. Newspapers often criticize a weak link between wage and education compared with the
link between wage and work experience. Discuss if the criticism is consistent with our data.
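Each part of question 6 runs the same simple regression, wage = b0 + b1 * x, with a different x. The sketch below computes the quantities that the Excel regression output also reports (slope, intercept, R-squared, and the t statistic for the slope); the function name `simple_ols` and the toy data are assumptions for illustration.

```python
from math import sqrt


def simple_ols(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    sse = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    sst = sum((b - mean_y) ** 2 for b in y)
    se_slope = sqrt(sse / (n - 2) / sxx)   # standard error of the slope
    return {
        "slope": slope,
        "intercept": intercept,
        "r_squared": 1 - sse / sst,
        # t statistic for H0: slope = 0, with n - 2 degrees of freedom
        "t_slope": slope / se_slope if se_slope > 0 else float("inf"),
    }


# Toy data (invented): x could be educ, IQ, exper, or KW; y is wage.
fit = simple_ols([1, 2, 3, 4], [3, 5, 7, 10])
print(fit)
```

When comparing two single-predictor models of the same dependent variable, the one with the higher R-squared explains more of the variation in wage, which is one reasonable basis for parts c, f, and g.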
