Predicting Overall High School Football Team Strength

A Special Report by Paul Katula, Last Updated Sept. 18, 2007


SAN ANTONIO, Texas (Sept. 18, 2007)—The Chicago Voxitatis has derived a mathematical formula for determining how strong an Illinois high school football team is. Although many factors play a role in making a team strong, only certain numerical values are available reliably from Illinois high school football games. Therefore, we expect the statistical power of our estimate to be limited by the small amount of information available about the teams. On the other hand, there are 561 high school football teams in the state of Illinois, representing what we consider a viable data sample for the purposes of this report.

**UPDATE, Sept. 17, 2009:** Two years ago, I came up with the algorithm described in this file. One of the statistics (Δ) has proven accurate about 90 percent of the time, and exceptionally accurate when it is over 25. The other statistic (α) has proven less so, except in games in the very early part of the season, when it has frequently outperformed Δ. Therefore, we will keep both statistics for the 2009 season. In brief, we predict that the team with the positive Δ value will win the game and the team with the negative Δ will lose. The greater the absolute value of the number, the more confident we are about the prediction. Reliability is extremely low, however, when fewer than four games have been played. Our analyses in 2007 and 2008 show that accuracy improves tremendously toward the end of the season.

**UPDATE, Jan. 10, 2010:** In the future, we will report both Δ and α, but we have changed the formula for α based on empirical evidence. The formula is now *mu* = (*a* – 0.5*b*) – (*c* – 0.25*d*). We made this change because the correction factors, *b* and *d*, as described in this initial report, needed to carry less weight in the formula. In support of this change, see our analysis after the fourth week of the fall 2009 season, here.
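As a rough sketch of the revised formula in the update above, the Python function below evaluates *mu* with the two weights exposed as parameters. The function name, the parameter names, and the sample values are assumptions made for illustration; only the weights 0.5 and 0.25 come from the update.

```python
def mu_updated(a, b, c, d, weight_b=0.5, weight_d=0.25):
    """Revised mu: the correction factors b and d carry reduced weight.

    a: points scored by the team
    b: points given up by the team's opponents
    c: points given up by the team
    d: points scored by the team's opponents
    """
    return (a - weight_b * b) - (c - weight_d * d)


# Hypothetical per-game averages, not taken from any real team.
print(mu_updated(28.0, 20.0, 14.0, 16.0))  # (28 - 10) - (14 - 4) = 8.0
```

Setting both weights to 1.0 gives the fully corrected form, *e* – *f*, described later in this report.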

Illinois high schools run the gamut of enrollment, number of schools contributing to coop teams, history, and range of recruiting available to private schools. In Illinois, public high schools are considered in the same playoff contention as private high schools, but a multiplier is applied to enrollment at private high schools that draw from a larger population area. This computed enrollment is what sets a school in a certain playoff “class,” of which there are eight, 1A to 8A.

At the present time, we are not considering enrollment, yet we know enrollment contributes to depth of the bench, which is an important factor in high school football. The current study, performed after the fourth general weekend of football in the 2007 high school season, looks only at points scored. We hypothesize that teams who score more points and hold their opponents to fewer points are stronger than teams who score fewer points and allow more points.

Mathematically, let *a* = points scored by a team and *c* = points given up by a team. Then, since high-scoring teams will tend, on average, to score more points against weaker opponents who allow more points, let *b* = points given up by the team’s opponents and *d* = points scored by the team’s opponents. Correct points scored by subtracting points given up by opponents, and call *e* = *a* – *b*; correct points given up by subtracting points scored by opponents, and call *f* = *c* – *d*.

Our testable hypothesis is that *e* – *f* (corrected points for minus corrected points against) is proportional to the strength of the team. We call this statistic “nu” and base our hypothesis on the knowledge that strong teams tend to score more points and weak teams tend to score fewer points, but when facing strong opponents, they tend to give up more points and score fewer points.
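The arithmetic behind *nu* can be sketched in a few lines of Python. The function name and the sample values are illustrative assumptions; the formula itself follows the definitions of *e* and *f* given above.

```python
def nu(a, b, c, d):
    """Corrected points for minus corrected points against.

    a: points scored by the team
    b: points given up by the team's opponents
    c: points given up by the team
    d: points scored by the team's opponents
    """
    e = a - b  # corrected points scored
    f = c - d  # corrected points given up
    return e - f


# Hypothetical per-game averages, chosen only for illustration.
print(nu(a=30.0, b=20.0, c=10.0, d=15.0))  # e = 10, f = -5, so nu = 15.0
```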

After the third general weekend of high school football in Illinois, we will compute mean values for *a*, *b*, *c*, and *d*, which will give us a value of *nu* for each team. Our hypothesized statistic is the difference between *nu* for a school and *nu* for its opponent; we call it *delta* and represent it with the Greek symbol Δ. After the fourth week, we will determine, using linear regression techniques, how these values and Δ correlate with the actual performance of the teams in their Week 4 games.

That is to say, we will plot the actual victory margin for each team vs. Δ between that team and its opponent in the Week 4 game. If there is a significant correlation between the margin of victory and the Voxitatis Δ statistic, we accept our hypothesis. If not, we reject it and consider further testing.
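One way to carry out such a test is an ordinary least-squares fit of margin on Δ, together with the Pearson correlation coefficient. The sketch below uses only the Python standard library. The function name and the data are made up for illustration; they are not the 2007 results, and the report does not say which software was actually used.

```python
from math import sqrt


def fit_line(deltas, margins):
    """Return (slope, intercept, r) for margins regressed on deltas."""
    n = len(deltas)
    mean_x = sum(deltas) / n
    mean_y = sum(margins) / n
    sxx = sum((x - mean_x) ** 2 for x in deltas)
    syy = sum((y - mean_y) ** 2 for y in margins)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(deltas, margins))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / sqrt(sxx * syy)  # Pearson correlation coefficient
    return slope, intercept, r


# Made-up (delta, margin) pairs. Each game appears twice, once from each
# team's side, so the points are symmetric about the origin.
deltas = [40.0, -40.0, 10.0, -10.0, 5.0, -5.0]
margins = [36.0, -36.0, 14.0, -14.0, -3.0, 3.0]
slope, intercept, r = fit_line(deltas, margins)
print(f"slope={slope:.2f} intercept={intercept:.2f} r={r:.2f}")
```

Because every game contributes a point and its mirror image, both means are zero and the fitted line passes through the origin.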

Following the Week 3 games across the state, we computed the values for *nu* for each team. A simple histogram of the Week 3 *nu* values for the 558 teams in Illinois who played a game in Week 4 is shown below.

As you can see, the histogram is skewed slightly toward the higher *nu* values, and while the mean value is just slightly below 0 (-0.06), the median is 0.086. Since the median is slightly above the mean, there is some evidence that weak teams are bringing the mean value down disproportionately, as evidenced by the small *tail* on the histogram, trailing off to the left. The standard deviation is 14.43, and the quartiles are at -9.89 and 9.75.
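Summary figures like these can be obtained with Python’s `statistics` module, as in the sketch below. The sample values are hypothetical. The choice of the sample standard deviation and of the module’s default quartile method are assumptions, since the report does not say which conventions were used.

```python
from statistics import mean, median, quantiles, stdev


def summarize(nu_values):
    """Mean, median, sample standard deviation, and quartiles of nu."""
    q1, _q2, q3 = quantiles(nu_values, n=4)  # default "exclusive" method
    return {
        "mean": mean(nu_values),
        "median": median(nu_values),
        "stdev": stdev(nu_values),
        "q1": q1,
        "q3": q3,
    }


# Hypothetical nu values with a few very weak teams in the left tail.
sample = [-30.0, -12.0, -4.0, 1.0, 3.0, 6.0, 9.0, 11.0]
stats = summarize(sample)
print(stats)
# A mean below the median hints at a tail trailing off to the left.
print("mean below median:", stats["mean"] < stats["median"])
```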

By plotting *nu* vs. *a*, *b*, *c*, and *d*, we can gain some understanding about how the number of points a team scores contributes to overall strength, from a purely theoretical perspective: We find a correlation coefficient of 0.71 between *nu* and *a*. That is a significant correlation, but it’s not too valuable to us, since *nu* is a computed value based, in part, on *a*. In other words, the two values are not independent. Nor is *nu* independent of *b*, *c*, or *d*. Performing this purely theoretical background analysis, we see that actual *nu* values are much more correlated with the principal values, *a* and *c*, than they are with the correction factors, *b* and *d*. Correlation coefficients for *a*, *b*, *c*, and *d* after Week 3 are .71, .11, -.72, and -.11, respectively. Scatter plots showing each of these variables plotted with *nu* on the dependent axis are shown below.

Now that we have computed *nu* for all 558 football teams who played a game in Week 4, we need to look at the margin of victory in each of the 279 games they played during the weekend of Sept. 14 through Sept. 16, 2007. The perusal of a simple histogram shows 19 games were decided by margins of 3 points or less, 23 games by margins of 4–6, and so on. The histogram of the Week 4 margins of victory is shown below.

Finally, for each game, we find Δ. That is, subtract *nu* for one team from *nu* for the other, and plot the actual margin of victory (or loss) in the Week 4 game vs. that computed value for Δ. The result is shown below. Note that, since Δ and the victory margin are both differences, the ordered pair (Δ, *margin*) = (40, 36) for the winner, say, necessarily means that the losing team will have values of (-40, -36).
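The mirror-image property described above can be checked with a short Python sketch. The function name and the *nu* values and score are hypothetical, chosen so that the result matches the (40, 36) example in the text.

```python
def game_points(nu_team, nu_opponent, score_team, score_opponent):
    """Return the (delta, margin) pair for each side of one game."""
    delta = nu_team - nu_opponent
    margin = score_team - score_opponent
    return (delta, margin), (-delta, -margin)


# Hypothetical game: nu of 25 vs. -15, final score 42-6.
winner, loser = game_points(25.0, -15.0, 42, 6)
print(winner, loser)  # (40.0, 36) (-40.0, -36)
```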

The correlation coefficient is .65, which we consider adequate — not great, but adequate. It is close to the theoretical (and otherwise useless) correlation coefficient between *a* and *nu*, which tends to confirm our hypothesis that factors other than the ability to run up the score have more than a minimal effect on the correlation. The intercept of the line is, of course, 0, and the slope is computed at about 0.9 points per unit Δ.

Although higher correlation coefficients, and thus predictive power of our statistic, may be possible using a different formula for Δ, only a few are simpler than our hypothesized formula. For example, what if we don’t correct *a* or *c* for the strength of a team’s opponents? From a football sense, that doesn’t make sense, but mathematically, we are intrigued. So, we tried it. The scatter plot of what we have called *mu*, with the regression line, is shown below. We have called the difference statistic, *mu* for a team minus *mu* for its opponent, *alpha* and used the Greek symbol α to represent it on the schedule pages.

Here, the correlation coefficient, at least for Week 4, drops slightly to 0.61, and the computed slope is about 0.64. The frequency distribution for *mu* is slightly different from that for *nu*, hence the difference in slope. In looking at the data, we found that α was a better predictor on occasion for the outcome of a game than was our value for Δ. However, on average, Δ is still slightly better. That is, a high positive Δ for a team in a game is a stronger predictor of a victory for that team, on average, than a high positive value of α.

From a purely football perspective, that makes sense. Better teams are tested more against stronger opponents. There probably should be some correction factor included in the computation of any statistic to predict team strength when only points scored are being used as the raw data.

Along those lines, what if we only consider the points scored by the two teams who are playing? It’s not the most logical thing to assume there will be a strong correlation here, because teams that scored a huge number of points against all weak opponents are not as strong as teams who scored somewhat fewer points but against much stronger opponents. However, we are again mathematically curious. The scatter plot is shown below.

The correlation coefficient, 0.545, is only slightly lower than the one obtained using Δ. But the *only* reason this makes sense is that teams who start out the season playing weak teams probably face weak teams in the fourth week as well. That may skew the statistic *a* – *a’* toward the actual Δ value for the two teams playing in any one game.

While we’re exercising our mathematical curiosity, let’s see if the points allowed by the two teams correlate with the eventual outcome of the Week 4 game. The scatter plot is shown below.

Here, the correlation coefficient is -0.526, which is weaker in magnitude. Its absolute value is getting too close to the 0.5 mark, indicating a weaker correlation between this statistic and the eventual outcome of the Week 4 games.

Our analysis may be hindered by several factors. First, only three games, at a maximum, were used to compute *a*, *b*, *c*, and *d* for each team. We will continue our analysis in future weeks this season to determine if including more games into these variables improves the correlation coefficient of Δ.

Second, since *b* and *d* correlate with *nu* much less than *a* and *c* do, it may be necessary to compute coefficients for these four variables in the formula for *nu*. This can be done using the techniques of multiple regression. We will continue this analysis in future weeks of the season and report our findings. It may also be beneficial to attempt a polynomial regression, with higher-order terms. In support of this pursuit, we note that the points on the scatter plot concentrate more strongly around a sinusoidal curve that rises from Quadrant III, has a point of inflection at the origin, and continues into Quadrant I.

Third, outliers may skew results, as in the one game in Week 4 that finished with a score of 77-6. Other teams, which are stronger, may stop running the score up after only, say, 45 points. Using only points in our calculations, the team that runs the score up against a very weak opponent appears stronger than the team who stops after a sufficient margin of victory is attained, even though the latter may actually be much stronger than the former. That is the nature of football, and it is not possible to account for these coaching decisions or other “human factors” in our statistics, given that we don’t measure what each team is “capable of,” simply what they actually did on the field.

And finally, when very strong teams play each other, or very weak teams play each other, the margin of victory is highly uncertain, almost “too close to call.” No matter how much we may pretend that our statistics predict the outcome of games, the score of a game between equally matched teams often comes down to a simple lucky bounce of the football or an intangible quality of the game. Our analysis, we hope, does nothing to reduce the significance of the fundamentals of football, which are shown on the field only, and not in the classroom.

So, how did we do? Examination of the Δ values for the 279 games played in Illinois on the fourth general weekend shows that negative Δ values were predicted for 68 winners and positive Δ values were predicted for 211 winners. This represents about a 76 percent success rate for our computed statistic. Another way to look at it is this: 76 percent of the teams that had a positive Δ value in Week 4 won their games. That also means 76 percent of the teams that had a negative Δ value lost theirs. It works both ways, once for the winning team and once again for the losing team.

In addition, only three teams with Δ values greater than 30 lost their games. There were 38 games in Week 4 with Δ values greater than 30, representing less than a 10 percent chance of losing your game if our computers project a Δ greater than +30. Other odds are presented in the table below.
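The percentages quoted in the last two paragraphs follow directly from the counts given there, as this short Python check shows. The variable and function names are illustrative; the counts are the ones reported above.

```python
def rate(count, total):
    """Fraction of `total` represented by `count`."""
    return count / total


games = 279
correct = 211   # winners that had a positive delta
missed = 68     # winners that had a negative delta
assert correct + missed == games

print(f"overall success: {rate(correct, games):.1%}")  # overall success: 75.6%

big_delta_games = 38   # games with delta greater than 30
big_delta_losses = 3   # favored teams in those games that lost
print(f"upset rate: {rate(big_delta_losses, big_delta_games):.1%}")  # upset rate: 7.9%
```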

| If Δ is | Wk. 4 Win % | Odds of Winning |
|---|---|---|
| 0–4.99 | 58 | 62% |
| 5–9.99 | 68 | 67% |
| 10–14.99 | 75 | 72% |
| 15–19.99 | 87 | 78% |
| 20–24.99 | 71 | 84% |
| 25–29.99 | 97 | 89% |
| 30+ | 92 | 94% |
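A column like “Wk. 4 Win %” could be tallied by grouping games into 5-point Δ bins, as in the Python sketch below. The function names and the sample games are hypothetical, and the bin labels use a plain hyphen. The report does not state how the “Odds of Winning” column was derived, so that column is not reproduced here.

```python
from collections import defaultdict


def delta_bin(delta, width=5.0, top=30.0):
    """Label the bin that |delta| falls in, e.g. '0-4.99' or '30+'."""
    size = abs(delta)
    if size >= top:
        return f"{top:g}+"
    low = (size // width) * width
    return f"{low:g}-{low + width - 0.01:g}"


def win_pct_by_bin(games):
    """games: iterable of (delta, favored_team_won) pairs, one per game."""
    tally = defaultdict(lambda: [0, 0])  # label -> [wins, games played]
    for delta, favored_won in games:
        label = delta_bin(delta)
        tally[label][0] += int(favored_won)
        tally[label][1] += 1
    return {label: 100.0 * wins / played
            for label, (wins, played) in tally.items()}


# Made-up games, not the Week 4 data.
sample_games = [(2.0, True), (3.5, False), (12.0, True), (31.0, True)]
print(win_pct_by_bin(sample_games))
# {'0-4.99': 50.0, '10-14.99': 100.0, '30+': 100.0}
```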