Question:

We have a data file containing data from the 2005 World Factbook on gross domestic product (GDP) per capita in US$ thousands (gdp) and the percentage of the population who are internet users (intpct) for 213 countries. Here, GDP is based on purchasing power parities to account for between-country differences in price levels. This problem investigates whether there is a linear association between these two variables; in particular, how effective is it to use gdp to predict intpct using simple linear regression? You can read the data into R as follows:

mydata=read.csv("http://www.datadescant.com/stat104/internet.csv")

a) Using R, find the least squares line for the data.

b) Interpret the estimates of the slope and the y-intercept in the context of the problem.

c) Predict the percentage of internet users if GDP per capita is US$20,000 (that is, gdp = 20, since gdp is recorded in US$ thousands).

d) Draw a scatterplot with intpct on the vertical axis and gdp on the horizontal axis, and add the least squares line to the plot.

e) Based on the scatterplot, do you think it is appropriate to use this simple linear regression model in this problem or is the model potentially misleading (and if so, how)?

f) Now fit the model using log(intpct) instead of intpct. Does this appear to be a better model? Why?
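
One possible way to set up parts a), c), d) and f) in R is sketched below. It is not a worked solution: it only assumes that mydata has numeric columns named gdp and intpct, as the question describes. The helper names (fit_linear, predict_intpct, plot_fit, fit_log) are hypothetical and introduced here for illustration, and dropping rows with intpct of 0 before taking logs is an assumption about how such rows, if any exist, might be handled.

```r
# Sketch only. Assumes `mydata` has numeric columns `gdp` (US$ thousands)
# and `intpct` (percent of population), as described in the question.
# All function names below are hypothetical helpers, not part of any package.

# (a) Least squares line: intpct = b0 + b1 * gdp
fit_linear <- function(mydata) {
  lm(intpct ~ gdp, data = mydata)
}

# (c) Predicted intpct at a GDP per capita given in US$ thousands
#     (US$20,000 corresponds to gdp_thousands = 20)
predict_intpct <- function(fit, gdp_thousands) {
  unname(predict(fit, newdata = data.frame(gdp = gdp_thousands)))
}

# (d) Scatterplot with intpct on the vertical axis, gdp on the horizontal
#     axis, and the fitted least squares line added
plot_fit <- function(mydata, fit) {
  plot(intpct ~ gdp, data = mydata,
       xlab = "GDP per capita (US$ thousands)",
       ylab = "Internet users (% of population)")
  abline(fit)
}

# (f) Same predictor, log-transformed response.
#     Assumption: rows with intpct <= 0 are dropped, since log(0) is -Inf
#     and lm() cannot fit a non-finite response.
fit_log <- function(mydata) {
  lm(log(intpct) ~ gdp, data = subset(mydata, intpct > 0))
}

# Example usage, with the data read as in the question:
# mydata <- read.csv("http://www.datadescant.com/stat104/internet.csv")
# fit <- fit_linear(mydata)
# coef(fit)                 # intercept and slope for parts (a) and (b)
# predict_intpct(fit, 20)   # part (c)
# plot_fit(mydata, fit)     # part (d)
# fit2 <- fit_log(mydata)   # part (f)
# plot(fitted(fit2), resid(fit2))  # residual plot for the log model
```

When comparing the two fits in part f), note that R-squared values are not directly comparable because the response variables differ (intpct versus log(intpct)); residual plots for each model are usually a more informative basis for the comparison.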
