Best Subsets Regression is a method used to help determine which predictor (independent) variables should be included in a multiple regression model.  This method involves examining all of the models created from all possible combinations of the predictor variables. Best Subsets Regression uses R^2 to check for the best model.  Fitting every one of these models by hand would be neither fun nor fast, so the method is normally carried out with a statistical software program.

First, all models that include only one predictor variable are checked, and the two models with the highest R^2 are selected.  Then all models that include exactly two predictor variables are checked, and again the two models with the highest R^2 are chosen.  This process continues, one model size at a time, until the model containing all of the predictor variables has been considered.
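The procedure just described can be sketched in Python.  This is only a minimal illustration, assuming NumPy is available; the function names `r_squared` and `best_subsets`, the predictor names, and the randomly generated data are all hypothetical and are not taken from any particular statistics package.

```python
from itertools import combinations

import numpy as np


def r_squared(X, y):
    """R^2 of a least-squares fit of y on the columns of X plus an intercept."""
    design = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    residuals = y - design @ coef
    ss_res = float(residuals @ residuals)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    return 1.0 - ss_res / ss_tot


def best_subsets(X, y, names, keep=2):
    """For each model size, return the `keep` predictor subsets with the highest R^2."""
    results = {}
    for size in range(1, len(names) + 1):
        scored = []
        for cols in combinations(range(len(names)), size):
            r2 = r_squared(X[:, list(cols)], y)
            scored.append((r2, tuple(names[c] for c in cols)))
        scored.sort(reverse=True)  # highest R^2 first
        results[size] = scored[:keep]
    return results


# Synthetic data, made up purely for illustration (not from the example in the text).
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 4))
y = 2.0 * X[:, 0] - 1.5 * X[:, 2] + rng.normal(scale=0.5, size=30)
names = ["x1", "x2", "x3", "x4"]

table = best_subsets(X, y, names)
for size, models in table.items():
    for r2, subset in models:
        print(size, subset, round(r2, 3))
```

Note that R^2 can never go down when a predictor is added, so R^2 by itself is only a fair comparison between models of the same size.  That is why the sketch ranks the models within each size separately.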

Specific Example:  Assume that during a three-hour period spent outside, a person recorded the temperature, the time spent mowing the lawn, whether there was sun or not (coded as 0 or 1), and their water consumption (the response variable). The experiment was conducted on 7 randomly selected days during the summer.  By using our imaginations we can come up with some other possible predictors that might give us a more accurate model, such as temp^2 and mowing time * temperature.  Now we have 5 possible predictors to include in our model.  With only 7 data points, it would not be wise to include all five.  In fact, including some of these predictors may even decrease the accuracy of the model.  So we are left to decide which predictors to include and which to leave out.
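To get a feel for the size of the problem in this example, the short sketch below counts the candidate models for the 5 predictors.  The predictor labels are hypothetical shorthand chosen here for illustration, and no data values are assumed.

```python
from math import comb

# Hypothetical labels for the five candidate predictors in the example.
predictors = ["temp", "mow_time", "sun", "temp^2", "mow_time*temp"]
n_days = 7  # number of observations in the example

# Number of candidate models of each size k (choose k of the 5 predictors).
counts = {k: comb(len(predictors), k) for k in range(1, len(predictors) + 1)}
total = sum(counts.values())

# Residual degrees of freedom if all five predictors plus an intercept are fit.
residual_df = n_days - len(predictors) - 1

print(counts)       # {1: 5, 2: 10, 3: 10, 4: 5, 5: 1}
print(total)        # 31
print(residual_df)  # 1
```

So there are 31 models to compare, and the full model would leave only 1 residual degree of freedom from the 7 days of data, which is one way to see why including all five predictors would not be wise.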


STATS @ MTSU