Suppose that an organization studied videotape of Major League Baseball umpires over a two-week period. They reviewed every play in every game, other than balls and strikes. And they found that, in the 184 games they studied, there were only 47 missed calls -- about one for every four games.
That'd be quite vindicating for the umpires, I would think. Just 0.2 or 0.3 missed calls a game? A 2004 study of NFL games, by contrast, found 40 reversals on challenges initiated by the replay booth in the final two minutes of each half**; extrapolated to the full 60 minutes of every game, that would come to 600 miscalls, or about 2.3 per contest (and not all calls are reviewable). And several NBA insiders that I spoke with for the book chapter that I'm writing about hoops said there were 15 or 20 "questionable" calls a game in their sport.
Indeed, this is exactly what a study by ESPN just found. Baseball umpires very rarely blow calls.
But that's not the framing that ESPN used. Instead, they said that umpires missed 1 in 5 "close" calls, which sounds much more damning. The question, of course, is how one defines a "close" call -- something which is completely arbitrary. If ESPN had used a more expansive definition of a "close" call, perhaps umpires would only have missed 1 in 10 "close" calls, or 1 in 20, rather than 1 in 5. If they used a narrower definition, perhaps the umpires would have missed 1 in 3. All of which tells you nothing about the performance of umpires and a lot about the semantic proclivities of a bunch of research assistants sitting around a conference room somewhere in Bristol.
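To make the denominator problem concrete, here is a rough sketch in Python. It holds the 47 missed calls fixed and backs out how many "close" calls each framing would imply. The implied counts are my own arithmetic from the "1 in N" figures, not numbers reported by ESPN, and the function name is just illustrative.

```python
# Figures from the post: 47 missed calls across 184 games.
MISSED_CALLS = 47
GAMES = 184

def implied_close_calls(missed, one_in_n):
    """Number of 'close' calls implied by a 'missed 1 in N' framing.

    Back-of-envelope only: assumes the ratio is exactly 1 in N.
    """
    return missed * one_in_n

# The per-game rate does not depend on how "close" is defined.
missed_per_game = MISSED_CALLS / GAMES
print(f"Missed calls per game: {missed_per_game:.2f}")

# The "1 in N" rate depends entirely on how many calls get labeled close.
for one_in_n in (3, 5, 10, 20):
    close = implied_close_calls(MISSED_CALLS, one_in_n)
    print(f"1 in {one_in_n:>2} missed -> about {close} 'close' calls "
          f"({close / GAMES:.1f} per game)")
```

The numerator never moves. Only the count of calls someone decided to label "close" does, which is the whole point.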
** The reason it's preferable to look at this statistic, rather than at the number of reversals on coach-initiated challenges, which occur in the first 28 minutes of each half, is that coaches are limited in the number of challenges they may make and are penalized for incorrect ones with the loss of a timeout. Thus, many incorrect calls will go undetected, because it is not worth it for a coach to initiate a challenge, even if there is some likelihood that the call was incorrect. The replay booth has no such restrictions, however, and should therefore provide a more reliable estimate.
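For anyone who wants to check the NFL extrapolation, here is a minimal sketch of the arithmetic. It assumes miscalls occur at the same rate throughout the game as in the booth-reviewed minutes, and it assumes the study covered a 256-game regular season (32 teams playing 16 games each). That game count is my assumption, not something stated in the study as described here.

```python
# From the post: 40 booth-initiated reversals in the final two minutes of each half.
REVERSALS = 40
REVIEWED_MINUTES = 2 * 2   # final two minutes of each of the two halves
GAME_MINUTES = 60          # regulation length of an NFL game

# ASSUMPTION: a 256-game regular season; not stated in the source.
ASSUMED_GAMES = 256

def extrapolate(reversals, reviewed_minutes, game_minutes):
    """Scale reversals from the reviewed window to the full game,
    assuming a uniform miscall rate across all minutes."""
    return reversals * game_minutes / reviewed_minutes

total_miscalls = extrapolate(REVERSALS, REVIEWED_MINUTES, GAME_MINUTES)
per_game = total_miscalls / ASSUMED_GAMES

print(f"Extrapolated miscalls: {total_miscalls:.0f}")
print(f"Per game: {per_game:.1f}")
```

The scale factor is 15, since four reviewed minutes out of sixty is one-fifteenth of the game.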