Baseball’s Most Surprising Seasons, Good And Bad


Which baseball players have had the most surprisingly bad and surprisingly good seasons in recent years? I wondered about this while researching an article on whether spring training performance foreshadows regular-season production.

I calculated the uncertainty in the Marcel forecasting system projection for batting wOBA — a measure of a hitter’s overall offensive production per plate appearance — based on the reliability of the forecast. This gives us a range of expected results and allows us to look at which players’ regular-season performances were the least likely going into the year. It’s a nice way of quantifying unexpectedly good and bad campaigns.

First, the most surprisingly strong seasons in the dataset I used in my article, which extends back to 2006 (minimum 200 plate appearances):

Player Year Age Proj. wOBA Actual wOBA Percentile
Hanley Ramirez 2013 30 .346 .444 99.9
Luke Scott 2006 28 .316 .437 99.9
Mike Napoli 2011 30 .346 .444 99.9
Jose Bautista 2010 30 .325 .421 99.9
Ben Zobrist 2009 28 .311 .413 99.8
Brandon Moss 2012 29 .296 .402 99.8
Jermaine Dye 2006 32 .338 .425 99.7
Jerry Hairston 2008 32 .290 .384 99.7
Justin Morneau 2010 29 .358 .436 99.5
Josh Hamilton 2010 29 .360 .441 99.4
Magglio Ordonez 2007 33 .354 .435 99.4
Justin Ruggiano 2012 30 .311 .409 99.3
Mike Trout 2012 21 .330 .427 99.3
Ryan Raburn 2013 32 .305 .387 99.3
Jacoby Ellsbury 2011 28 .337 .413 99.1
Carlos Quentin 2008 26 .334 .419 99.1
Josh Bard 2006 28 .311 .398 99.1
Jim Thome 2010 40 .359 .430 98.8
Aaron Hill 2012 30 .309 .380 98.8
Jose Bautista 2011 31 .359 .430 98.8
Dioner Navarro 2013 29 .289 .372 98.8
Chris Davis 2013 27 .337 .411 98.7
Melky Cabrera 2012 28 .326 .395 98.7
Carlos Gonzalez 2010 25 .342 .418 98.6
Jason Bartlett 2009 30 .325 .394 98.5
Carlos Ruiz 2012 33 .327 .396 98.4
Scott Spiezio 2006 34 .301 .372 98.2
Ian Desmond 2012 27 .308 .375 98.2
Garrett Atkins 2006 27 .339 .411 98.2
Brent Lillibridge 2011 28 .297 .375 98.1

Before the 2013 season, we would have expected there to be a 50 percent chance that Hanley Ramirez’s wOBA would be above .346. If you’d asked us what the odds were that Ramirez’s wOBA would reach or beat .444, we would have said practically zero — 0.1 percent, to be exact.

In other words, Ramirez’s actual .444 wOBA was an outcome in the 99.9th percentile of his preseason wOBA distribution.
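To make the percentile arithmetic concrete, here is a minimal Python sketch. It assumes the preseason wOBA distribution is normal and centered on the projection; the article does not state the distribution's shape, so that choice is an assumption. The function names (surprise_percentile, implied_sd) are hypothetical. The standard deviation is not given in the text either, so the sketch backs it out from the Ramirez figures rather than claiming to reproduce the original calculation.

```python
from statistics import NormalDist

def surprise_percentile(projected, actual, sd):
    """Percentile of `actual` in a normal distribution centered on `projected`.

    Assumption: the preseason wOBA distribution is normal with spread `sd`.
    """
    return 100 * NormalDist(mu=projected, sigma=sd).cdf(actual)

def implied_sd(projected, actual, percentile):
    """Standard deviation that would place `actual` at the given percentile."""
    z = NormalDist().inv_cdf(percentile / 100)
    return (actual - projected) / z

# Hanley Ramirez, 2013: projected .346, actual .444, listed at the 99.9th percentile.
sd = implied_sd(0.346, 0.444, 99.9)

print(f"implied sd: {sd:.4f}")
print(f"percentile of .346: {surprise_percentile(0.346, 0.346, sd):.1f}")
print(f"percentile of .444: {surprise_percentile(0.346, 0.444, sd):.1f}")
```

Under that normality assumption, the implied spread for Ramirez is a bit more than 30 points of wOBA. It applies to his projection only, since a less reliable forecast would carry a wider distribution.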

Flipping things around, here are the most disappointing seasons of the past eight years:

Player Year Age Proj. wOBA Actual wOBA Percentile
Travis Hafner 2008 31 .387 .270 0.0
Andruw Jones 2008 31 .350 .238 0.0
Tyler Colvin 2011 26 .343 .213 0.0
Tony Pena 2008 27 .299 .175 0.0
Wily Mo Pena 2008 26 .350 .231 0.0
Ryan Raburn 2012 31 .334 .219 0.0
Nick Hundley 2012 29 .326 .209 0.0
Brandon Wood 2010 25 .300 .167 0.0
Reid Brignac 2011 25 .320 .203 0.1
Brian Giles 2009 38 .346 .247 0.1
Chone Figgins 2011 33 .329 .232 0.1
Adam Dunn 2011 32 .364 .269 0.1
Justin Morneau 2011 30 .367 .274 0.2
Alexi Casilla 2007 23 .354 .240 0.2
Adam Moore 2010 26 .340 .226 0.2
Pete Kozma 2013 25 .350 .239 0.2
Rob Brantly 2013 24 .347 .237 0.3
Jeff Francoeur 2013 29 .320 .235 0.3
Martin Maldonado 2013 27 .327 .228 0.4
Tommy Manzella 2010 27 .339 .236 0.4
Jason Bay 2012 34 .331 .247 0.4
B.J. Upton 2013 29 .341 .260 0.4
Andy LaRoche 2008 25 .335 .232 0.5
Mark Kotsay 2007 32 .333 .253 0.5
Adam Kennedy 2007 31 .335 .253 0.5
Ruben Tejada 2013 24 .321 .238 0.5
Mike Lamb 2008 33 .336 .253 0.6
Milton Bradley 2010 32 .372 .290 0.6
J.D. Drew 2011 36 .355 .275 0.6
Clint Barmes 2006 27 .340 .251 0.6

Travis Hafner, if you’ll recall, had been one of the best hitters in baseball in the four years leading up to 2008. That was one of the big reasons why another statistical system for forecasting player performance, FiveThirtyEight founder Nate Silver’s PECOTA, called for the Cleveland Indians to win 91 games that year. Instead, the Indians went 81-81 as Hafner’s wOBA sank to .270, an outcome that seemed almost impossible (hence the 0.0 percentile score).

Keep in mind that this method is based on the Marcel reliability score, which essentially measures how much of a given projection is made up of the league mean and how much belongs to the player’s statistical record. It employs a generic age adjustment, but it does not look at similar players at similar ages, as Silver did with PECOTA.
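The reliability idea can be sketched in a few lines of Python. This follows the commonly described form of Marcel for hitters (the three most recent seasons weighted 5/4/3, with 1,200 plate appearances of league-average performance added as regression); the article does not spell out the parameters it used, so treat those numbers as assumptions. The function names and the sample inputs are hypothetical, and the age adjustment is left out for brevity.

```python
def marcel_reliability(pa_by_season, weights=(5, 4, 3), regression_pa=1200):
    """Share of the projection that comes from the player's own record.

    pa_by_season: plate appearances, most recent season first.
    Assumption: 5/4/3 weights and 1,200 PA of league-average regression.
    """
    weighted_pa = sum(w * pa for w, pa in zip(weights, pa_by_season))
    return weighted_pa / (weighted_pa + regression_pa)

def blend(player_rate, league_rate, reliability):
    """Mix the player's weighted rate with the league mean (no age adjustment)."""
    return reliability * player_rate + (1 - reliability) * league_rate

# Hypothetical everyday player: 600 PA in each of the last three seasons.
regular = marcel_reliability([600, 600, 600])
# Hypothetical newcomer: 200 PA last season and nothing before that.
newcomer = marcel_reliability([200, 0, 0])

print(f"regular reliability:  {regular:.3f}")
print(f"newcomer reliability: {newcomer:.3f}")
print(f"regular projection:   {blend(0.380, 0.320, regular):.3f}")
```

The lower the reliability, the more the projection leans on the league mean and the wider the range of plausible outcomes around it. That is why a given gap between projected and actual wOBA is more surprising for an established regular than for a player with a thin record.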

Hafner, Andruw Jones, Chone Figgins, Adam Dunn and others on the second list hit their early 30s and, rather than declining gradually, completely fell apart. Marcel has no way to determine whether players with certain tendencies or body types are more likely to completely crater, which would affect our confidence intervals. Cleaning up projections on the margins like that is one of the ways a system such as PECOTA is superior to Marcel, even though the returns diminish sharply with extra complexity.