STA 108 — Term project

Select an 80% subset for your analysis.
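
One way to draw the subset is sketched below, assuming simple random sampling of rows without replacement. The function name `split_indices` and the seed value are hypothetical choices for illustration, not part of the assignment; fixing a seed only makes the selection reproducible.

```python
import random


def split_indices(n_rows, fraction=0.8, seed=108):
    """Return (selected, held_out) row indices for a random subset.

    Assumption: rows are sampled uniformly without replacement.
    The seed is arbitrary; any fixed value makes the split repeatable.
    """
    rng = random.Random(seed)
    k = round(n_rows * fraction)
    selected = sorted(rng.sample(range(n_rows), k))
    selected_set = set(selected)
    held_out = [i for i in range(n_rows) if i not in selected_set]
    return selected, held_out
```

The first list gives the rows to analyze; the second list gives the remaining rows, which are not used in the analysis.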

The data set “countries.csv” contains information on the following variables:
• Country – list of countries in the data set
• Code – three-letter country code
• LandArea – land area in square kilometers
• Population – population in millions
• Rural – percentage of population living in rural areas
• Health – % of government expenditures directed towards health care
• Internet – % of population with internet access
• BirthRate – births per 1,000 people
• ElderlyPop – % of population at least 65 years old
• LifeExpectancy – average life expectancy in years
• CO2 – CO2 emissions in metric tons per capita
• GDP – Gross Domestic Product per capita
• Cell – Cell phone subscriptions per 100 people
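
The variable list above can be read into memory along the following lines. This is a minimal sketch that assumes the CSV header uses exactly the variable names listed and that missing values appear as blanks or non-numeric text such as NA; neither assumption is confirmed by the assignment. The name `load_countries` is hypothetical.

```python
import csv

# Assumed to match the CSV header; taken from the variable list above.
NUMERIC_COLUMNS = [
    "LandArea", "Population", "Rural", "Health", "Internet", "BirthRate",
    "ElderlyPop", "LifeExpectancy", "CO2", "GDP", "Cell",
]


def load_countries(source):
    """Parse rows from an open text file (or any iterable of CSV lines).

    Country and Code stay as strings. Every other column is converted
    to float, or to None when the cell is blank or not numeric.
    """
    rows = []
    for record in csv.DictReader(source):
        row = {"Country": record["Country"], "Code": record["Code"]}
        for name in NUMERIC_COLUMNS:
            text = (record.get(name) or "").strip()
            try:
                row[name] = float(text)
            except ValueError:
                row[name] = None
        rows.append(row)
    return rows


# Typical use, assuming the file is in the working directory:
#     with open("countries.csv", newline="") as f:
#         rows = load_countries(f)
```

Rows containing None in a variable used by the model would need to be dropped or otherwise handled before fitting.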

We are interested in finding a parsimonious model to predict life expectancy. Use the tools we
have learned in this course to:
1. Build a model with LifeExpectancy as the outcome and any of the remaining variables as
   predictors.
2. Carry out a residual analysis to identify:
   • Deviations from linearity in any of the predictors
   • Possible transformations of predictors
   • Possible transformation of the outcome variable
3. Assess the potential for multicollinearity.
4. Identify which variables are predictors of LifeExpectancy using suitable model selection.
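
The four steps above can be sketched as follows, assuming Python with NumPy (the course may use different software). All function names are hypothetical. The variance inflation factor is one common multicollinearity diagnostic, and backward elimination by AIC is only one of several possible selection procedures; neither is prescribed by the assignment.

```python
import numpy as np


def fit_ols(X, y):
    """Least-squares fit of y on X plus an intercept.

    Returns (coefficients, residuals); coefficients[0] is the intercept.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    design = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    residuals = y - design @ beta
    return beta, residuals


def vif(X):
    """Variance inflation factor of each column: 1 / (1 - R_j^2),
    where R_j^2 comes from regressing column j on the other columns."""
    X = np.asarray(X, dtype=float)
    factors = []
    for j in range(X.shape[1]):
        others = np.delete(X, j, axis=1)
        _, resid = fit_ols(others, X[:, j])
        sst = np.sum((X[:, j] - X[:, j].mean()) ** 2)
        r2 = 1.0 - np.sum(resid ** 2) / sst
        factors.append(1.0 / (1.0 - r2))
    return factors


def aic(X, y):
    """AIC up to an additive constant: n * log(SSE / n) + 2p,
    with p counting the intercept and all slopes."""
    beta, resid = fit_ols(X, y)
    n = len(resid)
    return n * np.log(np.sum(resid ** 2) / n) + 2 * len(beta)


def backward_eliminate(X, y, names):
    """Repeatedly drop the predictor whose removal lowers AIC the most;
    stop when no single removal lowers AIC. At least one predictor is kept."""
    X = np.asarray(X, dtype=float)
    keep = list(range(X.shape[1]))
    best = aic(X[:, keep], y)
    while len(keep) > 1:
        trials = [(aic(X[:, [k for k in keep if k != j]], y), j) for j in keep]
        score, drop = min(trials)
        if score >= best:
            break
        best = score
        keep.remove(drop)
    return [names[k] for k in keep]
```

For the residual analysis, the residuals returned by `fit_ols` can be plotted against each predictor and against the fitted values: curvature suggests transforming that predictor, and a funnel shape suggests transforming the outcome. A VIF well above 1 (a common rule of thumb is above 10) indicates that a predictor is largely explained by the others.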

