Question:

Instructions:

1.0 - The Data

We'll be looking at a data set (collected in 2009-10) called the HappyPlanetIndex. It has 143 countries and the following variables:

  • Country: Name of country
  • Region: 1=Latin America, 2=Western nations, 3=Middle East, 4=Sub-Saharan Africa, 5=South Asia, 6=East Asia, 7=former Communist countries
  • Happiness: Score on a 0-10 scale for average level of happiness (10 being happiest)
  • LifeExpectancy: Average life expectancy (in years)
  • Footprint: Ecological footprint - a measure of the (per capita) ecological impact
  • HLY: Happy Life Years - combines life expectancy with well-being
  • HPI: Happy Planet Index (0-100 scale) - combines well-being and ecological impact
  • HPIRank: HPI rank for the country
  • GDPperCapita: Gross Domestic Product (per capita)
  • HDI: Human Development Index
  • Population: Population (in millions)

1.1 - Write some code to take a look at the data.
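One possible way to approach 1.1, sketched in base R so it runs without extra packages. The `hpi_demo` data frame is a hypothetical stand-in with only three of the eleven columns and made-up placeholder values; in the notebook the same functions would be called on `HappyPlanetIndex` instead.

```r
# Stand-in data frame with placeholder values (NOT the real data set).
# In the notebook, replace hpi_demo with HappyPlanetIndex.
hpi_demo <- data.frame(
  Country   = c("A", "B", "C", "D"),
  Region    = c(1, 2, 5, 7),
  Happiness = c(6.1, 7.4, 5.0, 5.8)
)

str(hpi_demo)      # variable names, types, and the first few values
head(hpi_demo, 3)  # first three rows
dim(hpi_demo)      # number of rows (countries) and columns (variables)
```

Each row is one country and each column is one variable, which is worth keeping in mind for question 1.2.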

To measure happiness, the Gallup World Poll used this question: "Please imagine a ladder with steps numbered from zero at the bottom to 10 at the top. The top of the ladder represents the best possible life for you and the bottom of the ladder represents the worst possible life for you. On which step of the ladder would you say you personally feel you stand at this time?"

1.2 - If we asked our class this question today, could we represent that data as a row in this data frame? Why or why not?

1.3 - Today we'll explore three different measures of happiness: Happiness, HLY, and HPI. If you were the leader of a country, which measure of happiness would you be most interested in? Why?

1.4 - Let's learn a little about Bangladesh on these three measures of happiness. Run the code below. Can you tell if it's a "happy" country or if its citizens have many happy years?

In[]:

filter(HappyPlanetIndex, Country == "Bangladesh")

2.0 - Exploring Variation

For today, let's focus on the variable Happiness.

2.1 - Make some visualizations of how countries vary in their happiness.
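A minimal sketch of two common views of variation, a histogram and a boxplot, written in base R. The `happiness` vector is simulated stand-in data (the mean of 6 and standard deviation of 1.2 are made-up placeholders), not the real `Happiness` column.

```r
set.seed(10)
# Made-up stand-in for HappyPlanetIndex$Happiness: 143 values kept on the 0-10 scale.
happiness <- pmin(pmax(rnorm(143, mean = 6, sd = 1.2), 0), 10)

# Histogram: where the countries pile up and how spread out they are.
hist(happiness, breaks = 15, col = "coral2",
     main = "Distribution of happiness (stand-in data)", xlab = "Happiness")

# Boxplot: median, quartiles, and any unusually low or high countries.
boxplot(happiness, horizontal = TRUE, xlab = "Happiness")
```

In the notebook, a call such as `gf_histogram(~ Happiness, data = HappyPlanetIndex)` would likely play the same role as `hist()` here.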

2.2 - This data set is "old" (it's from 2010). If we collected data today, would we get the same distribution? Would anything be different? Would anything be the same?

3.0 - Modeling Variation

3.1 - If we had to predict a randomly selected country's happiness, we could use the mean as a very simple model. Put that number into your visualization above.

3.2 - R is constantly changing. Recently, we asked the developers of mosaic and ggformula to write an easier function for fitting a normal curve for us... and they did it! They called it gf_fitdistr() and it works with density histograms. Modify the following code to fit a normal distribution onto the histogram of happiness.

In[]:

# gf_dhistogram(~ Thumb, data = Fingers, fill = "coral2") %>%
#   gf_fitdistr(color = "darkblue")

3.3 - Add the mean to the histogram above (in blue). Also add Bangladesh's happiness (as a gold colored line).
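The layering described in 3.1 to 3.3 could look roughly like this in base R: a density histogram, a fitted normal curve, and two vertical lines. The data are simulated stand-ins, and `bangladesh <- 5.3` is a placeholder to be replaced with the value returned by the `filter()` call in 1.4.

```r
set.seed(10)
happiness  <- rnorm(143, mean = 6, sd = 1.2)  # made-up stand-in data
bangladesh <- 5.3                             # placeholder; use the value from filter()

m <- mean(happiness)  # the mean as a very simple model
s <- sd(happiness)

# Density histogram, so the bar heights are on the same scale as the normal curve.
hist(happiness, freq = FALSE, breaks = 15, col = "coral2",
     main = "Happiness with a normal model (stand-in data)", xlab = "Happiness")
curve(dnorm(x, mean = m, sd = s), add = TRUE, col = "darkblue", lwd = 2)

abline(v = m, col = "blue", lwd = 2)           # mean in blue
abline(v = bangladesh, col = "gold", lwd = 2)  # Bangladesh in gold
```

With ggformula, piping the histogram into `gf_vline(xintercept = ...)` should give the same kind of vertical lines.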

3.4 - (Everybody Gestures) I'm going to tell you the standard deviation of Happiness. Gesture that standard deviation on the histogram. Is Bangladesh within 1 standard deviation from the mean (aka in zone 1)?

3.5 - (Everybody Gestures) How big would our residual be if we had predicted Bangladesh's happiness with the mean? Is that going to be bigger than or smaller than the standard deviation? Then estimate the number.
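After gesturing, the estimate can be checked with a short calculation. This sketch assumes simulated stand-in data and a placeholder value for Bangladesh; the residual is the observed value minus the model's prediction (here, the mean).

```r
set.seed(10)
happiness  <- rnorm(143, mean = 6, sd = 1.2)  # made-up stand-in data
bangladesh <- 5.3                             # placeholder; use the value from filter()

prediction <- mean(happiness)          # what the simple model predicts for any country
residual   <- bangladesh - prediction  # negative when Bangladesh is below the mean

residual
sd(happiness)

# TRUE when Bangladesh falls within one standard deviation of the mean ("zone 1").
within_one_sd <- abs(residual) < sd(happiness)
within_one_sd
```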

3.6 - (Everybody Gestures) In the histogram, what percentage of countries had lower happiness levels than Bangladesh? Gesture and then estimate the number.

3.7 - What is the likelihood that a randomly selected country will have a lower happiness than Bangladesh? Try to answer this question using the data as a model.
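Using the data as the model means counting: the likelihood is the proportion of observed countries below Bangladesh. A minimal sketch, again with simulated stand-in data and a placeholder value for Bangladesh:

```r
set.seed(10)
happiness  <- rnorm(143, mean = 6, sd = 1.2)  # made-up stand-in data
bangladesh <- 5.3                             # placeholder; use the value from filter()

# The comparison gives TRUE/FALSE per country; the mean of those is a proportion.
prop_lower <- mean(happiness < bangladesh)
prop_lower
```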

3.8 - We all said at the beginning that if we ever collected this data again, all these numbers would probably be different. So next time the data might not look exactly like this. That's why the data usually aren't the best model of the data generating process (DGP).

What is the likelihood that a randomly selected country will have a lower happiness than Bangladesh? Now answer this question using the normal distribution as a model.
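With the normal distribution as the model, the same question becomes an area under the curve to the left of Bangladesh's value. The sketch below uses base R's `pnorm()` on simulated stand-in data with a placeholder value for Bangladesh.

```r
set.seed(10)
happiness  <- rnorm(143, mean = 6, sd = 1.2)  # made-up stand-in data
bangladesh <- 5.3                             # placeholder; use the value from filter()

# Area under the fitted normal curve to the left of Bangladesh's happiness.
p_normal <- pnorm(bangladesh, mean = mean(happiness), sd = sd(happiness))
p_normal
```

This number will usually be close to, but not identical to, the proportion counted directly from the data. If mosaic is loaded, `xpnorm()` should accept the same arguments and also draw the shaded region.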

3.9 - What does the z-score of Bangladesh mean?
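A z-score expresses a residual in units of standard deviations. A short sketch under the same assumptions (simulated stand-in data, placeholder value for Bangladesh):

```r
set.seed(10)
happiness  <- rnorm(143, mean = 6, sd = 1.2)  # made-up stand-in data
bangladesh <- 5.3                             # placeholder; use the value from filter()

# How many standard deviations Bangladesh sits above (+) or below (-) the mean.
z <- (bangladesh - mean(happiness)) / sd(happiness)
z

# On the standard normal scale, the z-score gives the same left-tail area
# as the original value does under the fitted normal model.
pnorm(z)
```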

(BONUS) 4.0 - Is the DGP of happiness even normal?

Many times, students just don't think the normal model is all that important because data look jagged rather than normal to them. It might help to do the following simulation.

The function rnorm is introduced later in the book but it might be helpful here. It will simulate a data point drawn from a normal distribution with a particular mean and standard deviation. First try running the code a few times.

4.1 - Try modifying the code to simulate 143 countries' happiness ratings.

In[]:

Happy_stats <- favstats(~Happiness, data = HappyPlanetIndex)

rnorm(1, mean = Happy_stats$mean, sd = Happy_stats$sd)

In[]:

# Run this code a few times. What do you think each line does?

# What does this do?
sim_Happiness <- rnorm(143, mean = Happy_stats$mean, sd = Happy_stats$sd)

# What does this do?
gf_dhistogram(~ sim_Happiness, color = "magenta3") %>%
  gf_fitdistr()

4.2 - These data are all simulated from a normal distribution. We know they specifically come from a normal distribution. Do samples from a normal distribution always look normal?
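Instead of re-running the cell by hand, several simulated samples can be drawn side by side. This base R sketch uses made-up placeholder values for the mean and standard deviation; in the notebook they would come from `Happy_stats$mean` and `Happy_stats$sd`.

```r
set.seed(2010)
m <- 6    # placeholder for Happy_stats$mean
s <- 1.2  # placeholder for Happy_stats$sd

old_par <- par(mfrow = c(2, 3))  # 2 x 3 grid of plots
for (i in 1:6) {
  sim <- rnorm(143, mean = m, sd = s)  # one simulated "world" of 143 countries
  hist(sim, freq = FALSE, breaks = 15, col = "magenta3",
       main = paste("Simulated sample", i), xlab = "Simulated happiness")
  # The same normal curve that generated every sample.
  curve(dnorm(x, mean = m, sd = s), add = TRUE, col = "darkblue", lwd = 2)
}
par(old_par)  # restore the previous plot layout
```

Every panel comes from the same normal DGP, so any jaggedness is due to sampling variation alone.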

4.3 - Even though the distribution of happiness doesn't look perfectly normal, could the DGP of happiness be roughly normal?
