How FiveThirtyEight Is Forecasting The 2016 NCAA Tournament


UPDATE (6:30 p.m. March 18): We’ve updated this post to add information about the excitement index.


Welcome to FiveThirtyEight’s forecasts of the men’s and women’s NCAA basketball tournaments. We’ve been issuing probabilistic March Madness forecasts in some form since 2011, when FiveThirtyEight was just a couple of us writing for The New York Times. While the basics of the system remain the same, we unveil a couple of new wrinkles each year.

Last season, we issued forecasts of the women’s tournament for the first time. Our big change for this year is that we’ll be updating our forecasts not just at the end of each game but also in real time. If a No. 2 seed is losing to a No. 15 seed, you’ll be able to see how that could affect the rest of the bracket, even before the game is over.

Live win probabilities

Our interactive graphic will include a dashboard that shows the score and time remaining in every game as it’s played, as well as the chance that each team will win that game. These probabilities are derived using logistic regression analysis, which lets us plug the current state of a game into a model to produce the probability that either team wins the game. Specifically, we used play-by-play data from the past five seasons of Division I NCAA basketball to fit a model that incorporates:

  • Time remaining in the game
  • Score difference
  • Pre-game win probabilities
  • Which team has possession, with a special adjustment if the team is shooting free throws

These in-game win probabilities won’t account for everything. If a key player has fouled out of a game, for example, his or her team’s chances are likely a bit lower than we’ve listed. There are also a few places where the model experiences momentary uncertainty: In the handful of seconds between the moment when a player is fouled and the free throws that follow, we use the team’s average free-throw percentage. Still, these probabilities ought to do a reasonably good job of showing which games are competitive and which are in the bag.
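To make the idea concrete, here is a minimal sketch of a logistic in-game model built on the four inputs listed above. It is not our fitted model: the coefficients, the way the lead is scaled by time remaining, and the function name `live_win_probability` are all assumptions chosen for illustration, and the free-throw adjustment is left out.

```python
import math

# Hypothetical coefficients, for illustration only. A real model would
# estimate these from play-by-play data with logistic regression.
COEF_LEAD = 0.25        # weight on the time-scaled score difference
COEF_PREGAME = 1.0      # weight on the pre-game log-odds
COEF_POSSESSION = 0.15  # bump for the team holding the ball

REGULATION_SECONDS = 2400.0  # two 20-minute halves


def logit(p):
    """Convert a probability into log-odds."""
    return math.log(p / (1.0 - p))


def live_win_probability(score_diff, seconds_left, pregame_prob, possession):
    """Estimate a team's chance of winning from the current game state.

    score_diff   -- the team's points minus the opponent's points
    seconds_left -- seconds remaining in regulation
    pregame_prob -- the team's win probability before tipoff (0 to 1)
    possession   -- 1 if the team has the ball, -1 if the opponent does,
                    0 if neither (for example, before tipoff)
    """
    frac_left = max(seconds_left, 1.0) / REGULATION_SECONDS
    # Assumption: a lead is worth more as the clock runs down.
    lead_term = COEF_LEAD * score_diff / math.sqrt(frac_left)
    # Assumption: the pre-game forecast fades as the game progresses.
    pregame_term = COEF_PREGAME * logit(pregame_prob) * frac_left
    possession_term = COEF_POSSESSION * possession
    z = lead_term + pregame_term + possession_term
    z = max(-50.0, min(50.0, z))  # keep exp() well inside float range
    return 1.0 / (1.0 + math.exp(-z))


if __name__ == "__main__":
    # A pre-game favorite trailing by 6 with 10 minutes left, ball in hand.
    print(round(live_win_probability(-6, 600, 0.8, 1), 3))
```

With these made-up coefficients, the output at tipoff equals the pre-game probability, and the same lead counts for more late in the game than early, which is the qualitative behavior described above.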

We built a separate in-game probability model for the women’s tournament that works in exactly the same way but uses historical women’s data. Thus, we’ll be updating our forecasts live for both the men’s and women’s tournaments.

Excitement index

Our March Madness “excitement index” (loosely based on Brian Burke’s NFL work) is a measure of how much each team’s chances of winning changed over the course of the game and is a good reference for picking the best games to flip to.

The calculation is simple: It’s the average change in win probability per basket scored, weighted by the amount of time remaining in the game. This means that a late-game basket has more influence on a game’s rating than a basket near the beginning of the game. We give additional weight to changes in win probability in overtime. Ratings range from 0 to 10, except in extreme cases where they can exceed 10.
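As a rough illustration of that calculation, the sketch below takes the win probability after each basket and computes a time-weighted average swing. The specific weights (rising from 1 to 2 over regulation, 3 in overtime), the `scale` constant and the function name are assumptions made for this example, not the values used in our index.

```python
def excitement_index(start_prob, baskets, regulation_seconds=2400.0,
                     overtime_weight=3.0, scale=100.0):
    """Time-weighted average swing in win probability per basket.

    start_prob -- one team's win probability at tipoff
    baskets    -- list of (seconds_elapsed, win_prob_after_basket) pairs,
                  in game order, for the same team as start_prob
    The weights and scale are hypothetical placeholders.
    """
    if not baskets:
        return 0.0

    weighted_swing = 0.0
    total_weight = 0.0
    previous = start_prob
    for seconds_elapsed, prob_after in baskets:
        if seconds_elapsed > regulation_seconds:
            # Overtime baskets get extra weight.
            weight = overtime_weight
        else:
            # Later baskets count more: weight grows from 1 to 2.
            weight = 1.0 + seconds_elapsed / regulation_seconds
        weighted_swing += weight * abs(prob_after - previous)
        total_weight += weight
        previous = prob_after

    # No upper clamp: extreme games are allowed to exceed the usual range.
    return scale * weighted_swing / total_weight


if __name__ == "__main__":
    close_game = [(600, 0.60), (1200, 0.40), (1800, 0.65), (2300, 0.30)]
    blowout = [(600, 0.95), (1200, 0.98), (1800, 0.99), (2300, 0.995)]
    print(excitement_index(0.5, close_game) > excitement_index(0.9, blowout))
```

Because the result is a weighted average, only the relative sizes of the weights matter; a different curve for "time remaining" would change the ratings but not the basic ordering of tight games over blowouts.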

The index isn’t perfect — this year’s play-in game between Holy Cross and Southern was good, but perhaps not deserving of its 9.4 rating. But even if it doesn’t quite capture the difference between a closely contested slog and a Dunk City run to the Sweet 16, it does a nice job of quantifying how tight a game was and how many big shots were hit.

Elo ratings

Otherwise, the methodology for our men’s forecasts is largely the same as last year. But we’ve developed our own computer rating system, Elo, which we include along with the five computer rankings and two human rankings we used previously.

If you’ve followed FiveThirtyEight, you’ll know that we’re big fans of Elo ratings, which we’ve introduced for the NBA, the NFL and other sports. We’ve now applied them to men’s college basketball teams dating back to the 1950s, using game data from ESPN, Sports-Reference.com and other sources.

Our methodology for calculating these Elo ratings is highly similar to the one we use for the NBA. They rely on relatively simple information: specifically, the final score, home-court advantage and the location of each game. (College basketball teams perform significantly worse when they travel a long distance to play a game.) They also account for a team’s conference (at the beginning of each season, a team’s Elo rating is regressed toward the mean of other schools in its conference) and whether the game was an NCAA Tournament game. We’ve found that historically, there are actually fewer upsets in the NCAA Tournament than you’d expect from the difference in teams’ Elo ratings, perhaps because the games are played under better and fairer conditions in the tournament than in the regular season. Our Elo ratings account for this and also weight tournament games slightly more heavily than regular-season ones.
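For readers unfamiliar with Elo, the sketch below shows the textbook version of the system: an expected result from the rating gap, a post-game update and a preseason regression toward a conference mean. The K-factor, the regression weight and the function names are assumptions for illustration, and the sketch leaves out the margin-of-victory, travel and tournament adjustments described above.

```python
def expected_score(rating_a, rating_b, home_advantage=0.0):
    """Standard Elo expectation that team A beats team B.

    home_advantage is added to team A's rating when A is at home;
    the right size for college basketball is not specified here.
    """
    gap = rating_a + home_advantage - rating_b
    return 1.0 / (1.0 + 10 ** (-gap / 400.0))


def update_elo(rating_a, rating_b, a_won, k=20.0, home_advantage=0.0):
    """Return both teams' ratings after one game.

    k is a hypothetical K-factor. Team A gains exactly what team B loses.
    """
    result = 1.0 if a_won else 0.0
    delta = k * (result - expected_score(rating_a, rating_b, home_advantage))
    return rating_a + delta, rating_b - delta


def regress_to_conference_mean(rating, conference_mean, weight=0.5):
    """Pull a preseason rating part of the way toward its conference mean.

    weight is a hypothetical fraction between 0 (no change) and 1.
    """
    return rating + weight * (conference_mean - rating)


if __name__ == "__main__":
    # Elo values listed in this article for Kansas and Austin Peay.
    print(expected_score(2097, 1477))
    print(update_elo(2097, 1477, a_won=True))
```

The number printed for Kansas against Austin Peay comes from the plain Elo formula alone, so it should not be read as our forecast for that game.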

Elo ratings for the 68 teams that qualified for the men’s tournament follow.

TEAM | REGION | SEED | ELO RATING | COMPOSITE RATING | PROBABILITY OF FINAL 4 (%) | PROBABILITY OF CHAMPS (%)
Kansas | South | 1 | 2097 | 94.5 | 45.1 | 19.1
North Carolina | East | 1 | 2075 | 93.9 | 43.6 | 15.0
Virginia | Midwest | 1 | 2052 | 92.5 | 30.4 | 9.8
Michigan State | Midwest | 2 | 2078 | 91.8 | 33.9 | 8.9
Oklahoma | West | 2 | 1972 | 90.0 | 32.0 | 6.8
Villanova | South | 2 | 2045 | 91.3 | 22.4 | 6.4
Kentucky | East | 4 | 2014 | 90.7 | 15.9 | 4.4
West Virginia | East | 3 | 1956 | 89.3 | 16.2 | 3.4
Purdue | Midwest | 5 | 1938 | 88.7 | 13.0 | 2.7
Oregon | West | 1 | 2033 | 88.0 | 22.6 | 2.6
Texas A&M | West | 3 | 1915 | 86.8 | 12.4 | 2.4
Xavier | East | 2 | 1973 | 87.7 | 9.9 | 1.8
Arizona | South | 6 | 1953 | 89.0 | 6.0 | 1.8
Duke | West | 4 | 1910 | 87.3 | 12.1 | 1.7
Maryland | South | 5 | 1876 | 87.4 | 6.3 | 1.3
Indiana | East | 5 | 1938 | 87.4 | 5.8 | 1.1
Miami (FL) | South | 3 | 1933 | 87.1 | 4.9 | 1.0
Iowa State | Midwest | 4 | 1867 | 86.5 | 6.4 | 1.0
Baylor | West | 5 | 1837 | 85.5 | 6.0 | 1.0
Texas | West | 6 | 1788 | 84.7 | 5.9 | 0.9
Utah | Midwest | 3 | 1887 | 86.6 | 5.3 | 0.8
Wichita State | South | 11 | 1893 | 86.6 | 2.7 | 0.7
California | South | 4 | 1871 | 86.5 | 4.0 | 0.7
Iowa | South | 7 | 1904 | 85.9 | 3.2 | 0.6
Vanderbilt | South | 11 | 1846 | 85.6 | 2.4 | 0.5
Gonzaga | Midwest | 11 | 1916 | 86.0 | 3.2 | 0.5
Wisconsin | East | 7 | 1896 | 84.8 | 2.9 | 0.4
Notre Dame | East | 6 | 1832 | 84.4 | 2.6 | 0.3
Connecticut | South | 9 | 1872 | 85.3 | 2.1 | 0.3
Cincinnati | West | 9 | 1794 | 83.7 | 3.2 | 0.3
Butler | Midwest | 9 | 1815 | 84.2 | 2.5 | 0.3
Seton Hall | Midwest | 6 | 1914 | 84.5 | 1.8 | 0.2
Virginia Commonwealth | West | 10 | 1798 | 83.1 | 2.2 | 0.2
Dayton | Midwest | 7 | 1788 | 82.4 | 1.6 | 0.1
Syracuse | Midwest | 10 | 1772 | 82.7 | 1.3 | 0.1
Pittsburgh | East | 10 | 1787 | 82.3 | 1.2 | 0.1
Saint Joseph’s | West | 8 | 1814 | 81.6 | 1.1 | 0.1
Providence | East | 9 | 1824 | 82.5 | 0.8 | 0.1
Northern Iowa | West | 11 | 1751 | 80.2 | 0.8 | <0.1
Stephen F. Austin | East | 14 | 1824 | 81.0 | 0.4 | <0.1
Colorado | South | 8 | 1756 | 81.5 | 0.4 | <0.1
Yale | West | 12 | 1792 | 80.2 | 1.0 | <0.1
Texas Tech | Midwest | 8 | 1777 | 81.3 | 0.4 | <0.1
Tulsa | East | 11 | 1690 | 79.9 | 0.2 | <0.1
Michigan | East | 11 | 1768 | 79.6 | 0.3 | <0.1
Southern California | East | 8 | 1733 | 81.4 | 0.2 | <0.1
Arkansas-Little Rock | Midwest | 12 | 1734 | 78.9 | 0.2 | <0.1
South Dakota State | South | 12 | 1735 | 78.6 | 0.2 | <0.1
Temple | South | 10 | 1730 | 78.5 | 0.2 | <0.1
North Carolina-Wilmington | West | 13 | 1722 | 77.7 | 0.2 | <0.1
Oregon State | West | 7 | 1740 | 77.6 | 0.2 | <0.1
Iona | Midwest | 13 | 1759 | 78.2 | 0.1 | <0.1
Green Bay | West | 14 | 1667 | 76.2 | 0.1 | <0.1
Stony Brook | East | 13 | 1663 | 77.1 | 0.1 | <0.1
Chattanooga | East | 12 | 1610 | 76.6 | <0.1 | <0.1
Hawaii | South | 13 | 1737 | 78.0 | <0.1 | <0.1
Fresno State | Midwest | 14 | 1708 | 76.6 | <0.1 | <0.1
Buffalo | South | 14 | 1613 | 75.7 | <0.1 | <0.1
Cal State Bakersfield | West | 15 | 1635 | 75.0 | 0.1 | <0.1
Middle Tennessee | Midwest | 15 | 1638 | 75.0 | <0.1 | <0.1
North Carolina-Asheville | South | 15 | 1553 | 74.2 | <0.1 | <0.1
Weber State | East | 15 | 1623 | 73.3 | <0.1 | <0.1
Florida Gulf Coast | East | 16 | 1544 | 71.4 | <0.1 | <0.1
Southern | West | 16 | 1392 | 68.0 | <0.1 | <0.1
Austin Peay | South | 16 | 1477 | 68.8 | <0.1 | <0.1
Hampton | Midwest | 16 | 1488 | 68.6 | <0.1 | <0.1
Holy Cross | West | 16 | 1420 | 66.9 | <0.1 | <0.1
Fairleigh Dickinson | East | 16 | 1417 | 66.7 | <0.1 | <0.1
2016 NCAA Tournament team ratings

Note, however, that Elo is still just one of six computer rankings that we use for the men’s tournament. The other five are ESPN’s BPI, Jeff Sagarin’s “predictor” ratings, Ken Pomeroy’s ratings, Joel Sokol’s LRMC ratings, and Sonny Moore’s computer power ratings. In addition, we use two human-generated rating systems: the selection committee’s 68-team “S-Curve”, and a composite of preseason ratings from coaches and media polls. The eight systems — six computer-generated and two human-generated — are weighted equally in coming up with a team’s overall rating.
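The equal weighting amounts to a simple average. The sketch below assumes that each system’s rating has already been converted to a common scale, a step this article does not describe; the function name and the example numbers are made up for illustration.

```python
def composite_rating(system_ratings):
    """Average a team's ratings across systems, weighting each equally.

    system_ratings -- dict mapping a system's name to the team's rating,
                      with every rating already on the same scale
                      (an assumption of this sketch)
    """
    if not system_ratings:
        raise ValueError("need at least one rating system")
    values = list(system_ratings.values())
    return sum(values) / len(values)


if __name__ == "__main__":
    # Hypothetical numbers for one team, not real ratings.
    example_team = {
        "Elo": 91.0, "BPI": 90.0, "Sagarin": 92.0, "Pomeroy": 91.5,
        "LRMC": 89.5, "Moore": 90.5, "S-Curve": 93.0, "Preseason": 88.5,
    }
    print(composite_rating(example_team))
```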

We’ve calculated Elo ratings for men’s teams only. For women’s ratings, we rely on the same composite of ratings systems that we used last year. You can find more about the methodology for our women’s forecasts here.

As has been the case previously, our ratings are also adjusted for travel distance and (for men’s teams only) player injuries. Our injury adjustment has been slightly improved to account for the higher or lower caliber of replacement players on different teams: Stony Brook, for example, won’t be able to replace a star player as easily as Kentucky can.

As a final reminder, these forecasts are probabilistic, something especially important to consider in the men’s tournament this year, when there’s about as much parity among teams as we’ve ever seen. In some sense, every team but the UConn women should be thought of as an underdog to win the tournament this year.

Check out FiveThirtyEight’s 2016 March Madness Predictions.