Modern Prediction and Modeling Methods

RGS, Winter 2004

Instructor: Greg Ridgeway

Monday 3:15-4:45pm

Wednesday 10:00-11:30am

Syllabus

See the syllabus.

Homework

Homework #1, due Monday, 2/16 (notes on HW#1)

Homework #2, due Wednesday, 2/25

Homework #3, due Wednesday, 3/3

Homework #4, due Wednesday, 3/10

Homework #5, due Friday, 3/19

Schedule

Lecture

Topic

Lecture 1 (notes)

1.      Introduction

a.      Introduction to prediction problems and non-parametric regression

b.      Linear least squares

c.      Accuracy and interpretability

d.      k-nearest neighbors

e.      Introduction to the R statistical environment

Lecture 2 (notes)

2.      Review of Lecture 1

a.      Bias-variance decomposition

b.      Logistic regression: natural extensions of the linear model and k-nearest neighbors to 0/1 outcomes

Lecture 3 (notes)

3.      Out-of-sample predictive performance

a.      Cross-validation

4.      Using test datasets

Lecture 4 (notes)

5.      Naïve Bayes classifier

a.      Introduction

b.      Properties

c.      Relationship to logistic regression and linear discriminant analysis

d.      Application to automated medical diagnosis (ACL injuries or chronic whiplash)

Lecture 5 (notes)

6.      Splines

a.      Adding non-linear effects to linear models

b.      Use of natural splines and generalized additive models in R

Lecture 6 (notes, R code)

7.      Variable selection

a.      Why stepwise is not wise

b.      Least angle regression

Lecture 7 (notes)

8.      Regression trees

a.      Development of the methodology

Lecture 8 (R code)

8.      Regression trees, continued

b.      Missing data

c.      Degrees of freedom

d.      Pros and cons

e.      Regression trees in R

Lecture 9 (notes)

9.      Causal modeling with propensity scores

a.      Rubin causal model

b.      Propensity scores

Lecture 10 (notes)

10.     Gradient boosting

a.      Model fitting

b.      Application to the evaluation of drug treatment programs and racial profiling

11.     Wrap-up: loose ends, final topics, and discussion