Statistics How To

Forward Selection

Forward selection is a type of stepwise regression that begins with an empty model and adds variables one at a time. At each forward step, you add the one variable that gives the single best improvement to your model.

It is one of two commonly used methods of stepwise regression. The other, backward elimination, works in nearly the opposite way: you start with a model that includes every candidate variable and remove the extraneous ones one at a time.

General Method Behind Forward Selection

Forward selection typically begins with a model that contains only an intercept. Each candidate variable is tested in turn, and the ‘best’ one, where ‘best’ is determined by a criterion chosen in advance, is added to the model.

As long as the model continues to improve (by that same criterion), the process continues, adding one variable at a time and testing at each step. Once adding more variables no longer improves the model, the process stops.
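The loop described above can be sketched in a few lines of Python with NumPy. This is a minimal illustration, not a reference implementation. The helper names (rss, aic, forward_selection), the choice of AIC as the criterion, and the simulated data are all assumptions made for the example.

```python
import numpy as np

def rss(X, y, cols):
    """Residual sum of squares for a least-squares fit on an intercept plus the given columns."""
    A = np.column_stack([np.ones(len(y))] + [X[:, j] for j in cols])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    return float(resid @ resid)

def aic(X, y, cols):
    """Gaussian AIC up to an additive constant (assumed criterion; lower is better)."""
    n = len(y)
    return n * np.log(rss(X, y, cols) / n) + 2 * (len(cols) + 1)

def forward_selection(X, y, score=aic):
    """Start with the intercept only; add the best variable until the score stops improving."""
    selected = []
    remaining = list(range(X.shape[1]))
    best = score(X, y, selected)
    while remaining:
        # Try each remaining variable on top of the current model.
        trial_score, j = min((score(X, y, selected + [j]), j) for j in remaining)
        if trial_score >= best:
            break  # no candidate improves the model, so stop
        selected.append(j)
        remaining.remove(j)
        best = trial_score
    return selected

# Simulated data (hypothetical): only columns 1 and 4 affect y.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 6))
y = 3.0 * X[:, 1] - 2.0 * X[:, 4] + rng.normal(scale=0.5, size=200)

chosen = forward_selection(X, y)
print(chosen)  # variables in the order they were added
```

Because the criterion is passed in as a function, the same loop works with any other score for which lower is better.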

The criteria used to decide which variable goes in, and when, vary. You could look for the lowest cross-validated error, the lowest p-value, or any of a number of other tests or measures of accuracy.
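As one example of such a criterion, the sketch below scores each candidate variable by k-fold cross-validated mean squared error and picks the one with the lowest score for a single forward step. The function name cv_mse, the choice of five folds, and the simulated data are assumptions for illustration only.

```python
import numpy as np

def cv_mse(X, y, cols, k=5):
    """k-fold cross-validated mean squared error for an intercept plus the given columns."""
    n = len(y)
    A = np.column_stack([np.ones(n)] + [X[:, j] for j in cols])
    # Contiguous folds are fine here because the simulated rows are already in random order.
    folds = np.array_split(np.arange(n), k)
    sq_err = 0.0
    for test_idx in folds:
        train = np.ones(n, dtype=bool)
        train[test_idx] = False
        coef, *_ = np.linalg.lstsq(A[train], y[train], rcond=None)
        resid = y[test_idx] - A[test_idx] @ coef
        sq_err += float(resid @ resid)
    return sq_err / n

# Simulated data (hypothetical): only column 2 affects y.
rng = np.random.default_rng(1)
X = rng.normal(size=(150, 4))
y = 2.0 * X[:, 2] + rng.normal(size=150)

baseline = cv_mse(X, y, [])  # intercept-only model
scores = {j: cv_mse(X, y, [j]) for j in range(X.shape[1])}
best_var = min(scores, key=scores.get)

# The variable is added only if it beats the current model.
print(best_var, scores[best_var] < baseline)
```

A p-value rule would follow the same pattern, with the score replaced by the p-value of the candidate's coefficient and a fixed entry threshold.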

Since stepwise regression tends toward overfitting, it is usually good to have strict criteria for adding any variable. (Overfitting happens when we put in more variables than is actually good for the model. An overfit model typically shows a very close, neat fit to the data used in the regression, but it will be far off for additional data points and poor at predicting new values.)
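The effect is easy to see on simulated data. In the sketch below, only the first variable is related to the response and the other twenty are pure noise. Variables are added in column order rather than by forward selection, to keep the example short. The sample sizes, the function name fit_and_score, and the data are assumptions for illustration.

```python
import numpy as np

def fit_and_score(X_tr, y_tr, X_te, y_te, n_vars):
    """Fit least squares on the first n_vars columns plus an intercept; return (train MSE, test MSE)."""
    A_tr = np.column_stack([np.ones(len(y_tr)), X_tr[:, :n_vars]])
    A_te = np.column_stack([np.ones(len(y_te)), X_te[:, :n_vars]])
    coef, *_ = np.linalg.lstsq(A_tr, y_tr, rcond=None)
    train_mse = float(np.mean((y_tr - A_tr @ coef) ** 2))
    test_mse = float(np.mean((y_te - A_te @ coef) ** 2))
    return train_mse, test_mse

# Simulated data (hypothetical): 30 training rows, 1000 held-out rows, 21 variables.
rng = np.random.default_rng(2)
n_train, n_test, p = 30, 1000, 21
X = rng.normal(size=(n_train + n_test, p))
y = 2.0 * X[:, 0] + rng.normal(size=n_train + n_test)  # only column 0 matters
X_tr, y_tr = X[:n_train], y[:n_train]
X_te, y_te = X[n_train:], y[n_train:]

results = {k: fit_and_score(X_tr, y_tr, X_te, y_te, k) for k in (1, 5, 10, 21)}
for k, (tr, te) in results.items():
    print(f"{k:2d} variables: train MSE {tr:.2f}, test MSE {te:.2f}")
```

The training error can only go down as variables are added, which is why a close fit to the regression data is not, by itself, evidence of a good model. The held-out error is the number to watch.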
