STA 108 — Term project

Select an 80% subset of the data for your analysis.
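One way to draw the subset is sketched below. It assumes the 80% is meant to be a simple random sample of the rows, which the handout does not state explicitly. The function names `load_countries` and `split_rows` and the seed value are illustrative choices, not part of the assignment.

```python
import csv
import random


def load_countries(path="countries.csv"):
    """Read the CSV into a list of dicts keyed by column name."""
    with open(path, newline="") as fh:
        return list(csv.DictReader(fh))


def split_rows(rows, frac=0.8, seed=108):
    """Return (subset, holdout): a random `frac` share of rows and the rest.

    Assumption: the subset is a simple random sample without replacement.
    """
    rng = random.Random(seed)   # fixed seed so the same subset is drawn each run
    shuffled = list(rows)       # copy, so the caller's list is left untouched
    rng.shuffle(shuffled)
    cut = round(frac * len(shuffled))
    return shuffled[:cut], shuffled[cut:]


# Typical use, assuming countries.csv is in the working directory:
#   subset, holdout = split_rows(load_countries())
```

Fixing the seed keeps the analysis reproducible; the remaining 20% can be set aside, for example to check predictions later.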

The data set “countries.csv” contains information on the following variables:
• Country – list of countries in the data set
• Code – three-letter country code
• LandArea – land area in square kilometers
• Population – population in millions
• Rural – percentage of population living in rural areas
• Health – percentage of government expenditures directed towards health care
• Internet – percentage of population with internet access
• BirthRate – births per 1000 people
• ElderlyPop – percentage of population at least 65 years old
• LifeExpectancy – average life expectancy in years
• CO2 – CO2 emissions in metric tons per capita
• GDP – Gross Domestic Product per capita
• Cell – Cell phone subscriptions per 100 people

We are interested in finding a parsimonious model to predict life expectancy. Use the tools we
have learned in this course to
1. Build a model with LifeExpectancy as the outcome and any of the remaining variables as
predictors.
2. Carry out a residual analysis to identify:
• Deviations from linearity in any of the predictors
• Possible transformations of predictors
• Possible transformation of the outcome variable
3. Assess the potential for multicollinearity
4. Identify which variables are predictors of LifeExpectancy using suitable model selection
algorithms.
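
The four steps above can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not a prescribed solution: the helper names (`solve`, `ols`, `vif`, `best_subset`) are hypothetical, multicollinearity is measured with variance inflation factors, and the selection step uses an exhaustive search scored by adjusted R², which is only one of several criteria a course might cover (AIC, BIC, Mallows' Cp, and stepwise procedures are common alternatives).

```python
from itertools import combinations


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]   # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def ols(X, y):
    """Least-squares fit with intercept; returns (coefficients, residuals, R^2).

    X is a list of rows of numeric predictors; y must not be constant.
    """
    Z = [[1.0] + list(row) for row in X]            # prepend intercept column
    p = len(Z[0])
    XtX = [[sum(z[i] * z[j] for z in Z) for j in range(p)] for i in range(p)]
    Xty = [sum(z[i] * yi for z, yi in zip(Z, y)) for i in range(p)]
    beta = solve(XtX, Xty)                          # normal equations
    fitted = [sum(b * zi for b, zi in zip(beta, z)) for z in Z]
    resid = [yi - fi for yi, fi in zip(y, fitted)]  # inspect against fitted/predictors
    ybar = sum(y) / len(y)
    sst = sum((yi - ybar) ** 2 for yi in y)
    sse = sum(e ** 2 for e in resid)
    return beta, resid, 1.0 - sse / sst


def vif(X):
    """VIF_j = 1 / (1 - R_j^2), regressing predictor j on the other predictors.

    Fails with a division by zero if a predictor is an exact linear
    combination of the others.
    """
    out = []
    for j in range(len(X[0])):
        target = [row[j] for row in X]
        others = [list(row[:j]) + list(row[j + 1:]) for row in X]
        _, _, r2 = ols(others, target)
        out.append(1.0 / (1.0 - r2))
    return out


def best_subset(X, y, names):
    """Exhaustive search; returns (adjusted R^2, predictor names) of the best model.

    Needs more observations than len(names) + 1.
    """
    n, best = len(y), None
    for size in range(1, len(names) + 1):
        for cols in combinations(range(len(names)), size):
            sub = [[row[c] for c in cols] for row in X]
            _, _, r2 = ols(sub, y)
            adj = 1.0 - (1.0 - r2) * (n - 1) / (n - size - 1)
            if best is None or adj > best[0]:
                best = (adj, [names[c] for c in cols])
    return best
```

The residuals returned by `ols` are what one would plot against the fitted values and against each predictor to look for curvature or non-constant spread; a transformation such as a logarithm can be tried by transforming the relevant column before refitting. Solving the normal equations directly is adequate for a sketch, though statistical software typically uses a more numerically stable decomposition.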
