One of the key concepts in sabermetrics is regression to the mean. Dave Bush has a career HR/FB rate of 11.5%. Prior to last night’s start against the Arizona Diamondbacks, his HR/FB rate for the season was below 10% for the first time in his career. After he gave up four straight home runs to the Diamondbacks, that is no longer the case: his rate now sits at 10.7%, quite near his career average.
A common phrase among some of the sabermetrically inclined, particularly on Twitter, would be “regression is a you-know-what, isn’t it?” In this case, regression took the form of an incredibly rare event, four home runs in a row, against a pitcher who is a couple of percentage points above the league average in terms of allowing home runs on fly balls.
That’s not regression to the mean; that’s regression above and beyond the mean. Nobody could have predicted that Bush would give up four home runs in a row, or even four in one game. Predicting it would have been the Gambler’s Fallacy at its finest. From Wikipedia:
The Gambler’s fallacy, also known as the Monte Carlo fallacy (due to its significance in a Monte Carlo casino in 1913) or the fallacy of the maturity of chances, is the belief that if deviations from expected behaviour are observed in repeated independent trials of some random process then these deviations are likely to be evened out by opposite deviations in the future. For example, if a fair coin is tossed repeatedly and tails comes up a larger number of times than is expected, a gambler may incorrectly believe that this means that heads is more likely in future tosses. Such an expectation could be mistakenly referred to as being due. This is an informal fallacy. It is also known colloquially as the law of averages.
Going back to the coin example, if you toss the coin 10 times and it comes up heads seven times and tails three times, you can’t expect it to come up tails the next four times. However, I don’t think anybody would be particularly surprised if it did – after all, your overall results end up as seven heads and seven tails in 14 tosses.
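To put a number on the coin example, here is a minimal Python sketch. It assumes a fair coin and independent tosses; the function names are made up for illustration. It computes the chance of four tails in a row directly, then checks by brute-force enumeration that having already seen seven heads in ten tosses does not change that chance.

```python
from fractions import Fraction
from itertools import product


def prob_next_all_tails(n_next, p_tails=Fraction(1, 2)):
    """Chance that the next n_next independent tosses all land tails."""
    return p_tails ** n_next


def conditional_prob_by_enumeration():
    """Enumerate every 14-toss sequence of a fair coin (all equally likely).

    Keep only the sequences with exactly 7 heads in the first 10 tosses,
    then return the share of those whose last 4 tosses are all tails.
    """
    matching = 0
    favorable = 0
    for seq in product("HT", repeat=14):
        if seq[:10].count("H") == 7:
            matching += 1
            if all(toss == "T" for toss in seq[10:]):
                favorable += 1
    return Fraction(favorable, matching)


print(prob_next_all_tails(4))             # 1/16
print(conditional_prob_by_enumeration())  # 1/16
```

Both values come out to 1/16: the earlier run of heads makes four straight tails neither more nor less likely, which is exactly why expecting it would be a fallacy and seeing it would not be a shock.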
For Bush, the reasonable expectation is that for the rest of the season, he would give up home runs on about 11.5% of his fly balls. We wouldn’t be surprised, however, to come back at the end of the year, look at his season stats, and see a value of 12.0%, as in both 2008 and 2009, in the HR/FB column of his stat sheet. Since the HR/FB value on an individual fly ball has to be either “0” or “1”, the only way to move that HR/FB number up is a few “1”s, and that means a few home runs; possibly four in a row.
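The same arithmetic can be sketched for fly balls. The Python below is an illustration under loud assumptions: it treats every fly ball as an independent trial at the 11.5% career rate quoted above, and the fly-ball counts in the usage lines are invented round numbers, not Bush’s actual totals. The function names are hypothetical as well.

```python
CAREER_HR_PER_FB = 0.115  # career HR/FB rate quoted in the post


def prob_consecutive_hr(rate, n):
    """Chance that n straight fly balls all leave the park.

    Assumes each fly ball is an independent trial with the same rate,
    which is a simplification.
    """
    return rate ** n


def hr_fb_after(hr, fb, extra_hr):
    """HR/FB rate after extra_hr more fly balls, every one a home run."""
    return (hr + extra_hr) / (fb + extra_hr)


# Four straight fly balls becoming home runs at the career rate.
print(f"{prob_consecutive_hr(CAREER_HR_PER_FB, 4):.6f}")

# Invented counts (NOT Bush's real totals): 9 HR on 100 fly balls,
# followed by four more fly balls that all go out.
print(f"{hr_fb_after(9, 100, 4):.3f}")
```

Under those assumptions, any particular run of four fly balls all going out is very unlikely, yet a few extra “1”s are enough to pull a below-average season rate back up toward the career number.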
No, we couldn’t have predicted four home runs in a row, nor should we have, nor should we predict Dave Bush to perform particularly above or below his career average. When you look at it, though, Bush has given up a lot of home runs on fly balls over his career, and this season he hadn’t prior to last night’s start. It would be a fallacy to expect four home runs in one game, let alone four in a row, but really, it’s not terribly shocking.