I’ve been seeing and hearing a lot of talk about how the Brewers offense has been all or nothing this year. It’s summed up best in this article, where Tom Haudricourt of the Journal Sentinel notes that the Brewers had, entering yesterday’s game, scored 64 runs in 5 games (12.8 R/G) and the other 64 runs in the other 20 games (3.2 R/G), while the pitching staff carried a 5.08 team ERA, or roughly 5.50 runs allowed per game. Haudricourt correctly points out that this is a pretty good explanation for why the Brewers carried an abysmal 10-15 record into last night’s game: that run distribution is almost certainly going to lead to 5 wins in the 5 games at 12.8 R/G and a poor record in the other 20.
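For anyone who wants to check the arithmetic, here is a quick sketch of the split in Python. The run and game totals come straight from the paragraph above; the helper name `runs_per_game` is just something I made up for illustration.

```python
def runs_per_game(runs, games):
    """Average runs scored per game."""
    return runs / games

# Totals entering yesterday's game, as cited above.
blowout_runs, blowout_games = 64, 5
other_runs, other_games = 64, 20

blowout_rg = runs_per_game(blowout_runs, blowout_games)
other_rg = runs_per_game(other_runs, other_games)
# The overall rate uses every game, not just one of the two groups.
overall_rg = runs_per_game(blowout_runs + other_runs,
                           blowout_games + other_games)

print(f"5 big games:    {blowout_rg:.2f} R/G")
print(f"Other 20 games: {other_rg:.2f} R/G")
print(f"All 25 games:   {overall_rg:.2f} R/G")
```

The overall figure is the only one of the three that uses every game played, which is why it is the one to quote when describing the offense as a whole.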
However, the claim that this run distribution means the Brewers offense is not as good as the 5.12 R/G it carried into the game is simply not true. This kind of claim is a textbook case of selection bias. I will say that it’s not clear that Haudricourt is making that claim in his post. However, I do know that some people have been interpreting it as such, and so this issue needs addressing.
Selection bias is a statistical bias in which there is an error in choosing the individuals or groups to take part in a scientific study. It is sometimes referred to as the selection effect. The term “selection bias” most often refers to the distortion of a statistical analysis, resulting from the method of collecting samples. If the selection bias is not taken into account then any conclusions drawn may be wrong.
The “selection” we see here is, of course, the selection of the 20 games in which the Brewers have struggled to score runs and the exclusion of the 5 games in which the Brewers just destroyed opposing pitchers to the tune of 12.8 runs per game. We have to look at every data point available to us; arbitrarily throwing away 5 games’ worth of hitting stats distorts the analysis in exactly the same way a badly chosen sample would. Yes, the Brewers’ great hitting came, for the most part, against poor pitching. Still, things tend to average out in the end. I don’t buy the argument that the Pirates simply stopped playing at times in those 17-3 or 20-0 games. The Book found that players don’t tend to play to the score, and even when the game is out of reach, players are still competing for their jobs, pride, etc.
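To see why tossing out the best games is a problem, here is a rough simulation sketch. Everything in it is an assumption on my part: the per-inning scoring weights are made up, the seed is arbitrary, and the function names are hypothetical. The point is only that the "offense" has the exact same talent level in every game, and dropping its 5 best games still makes it look worse.

```python
import random

def mean(values):
    return sum(values) / len(values)

def drop_top_games(scores, n):
    """Return the scores with the n highest-scoring games removed."""
    return sorted(scores)[:len(scores) - n]

rng = random.Random(2010)  # arbitrary fixed seed so the run is repeatable

def simulate_game():
    # Hypothetical offense: 9 innings, each scoring 0-4 runs.
    # The weights are invented for illustration, not fit to real data.
    innings = rng.choices([0, 1, 2, 3, 4], weights=[70, 15, 8, 4, 3], k=9)
    return sum(innings)

scores = [simulate_game() for _ in range(25)]
full_avg = mean(scores)
trimmed_avg = mean(drop_top_games(scores, 5))

print(f"All 25 games:      {full_avg:.2f} R/G")
print(f"Without the top 5: {trimmed_avg:.2f} R/G")
```

Removing the highest values from any list can never raise its average, so the trimmed number will come out at or below the full one no matter how good the simulated offense really is. That is the selection at work, not a flaw in the hitters.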
Similarly, if you remove the second inning from the Brewers’ game against the Dodgers last night, they lose the game 6-2. But, as it happens, the second inning still counts, even though something that doesn’t happen very often happened in that inning. It’s still one event (or in this case, a string of events) in a larger sample, and we can’t simply throw it out because it is different. The Brewers have now scored 139 runs in 26 games, good for 3rd in the NL. That’s 5.35 runs per game. They have a remarkable .359 wOBA, second in the league only to the New York Yankees, and that’s including the pitcher hitting.
The Brewers offense is fine, and they showed it last night.