So, in case you've been in a cave somewhere, the World Cup starts today.
Last summer, I worked in conjunction with ESPN on a project called the Soccer Power Index (SPI), a frankly somewhat complex system to rate international soccer teams that we hope will be superior to other metrics like the FIFA rankings and Elo. Soccer is tough to predict, so take it with the usual grain of salt, but there was a lot of thought put into SPI, as you can see here. Among other things, SPI takes into account the performance in international club play of the members of each team's roster, and looks at the rosters for each game to determine which matches teams were taking seriously and which they weren't.
One of the nice features of SPI is that it was explicitly designed to be predictive. You can take any two teams, plop in their OFF (offense/attack) and DEF (defense) ratings, and it will estimate the probability of a win, loss or draw. ESPN has a cool implementation of this called the SPI match predictor, which you can see at various locations around their World Cup site.
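To make the idea concrete, here is a minimal sketch of how two sets of ratings could be turned into win, draw and loss probabilities. It is not SPI's actual formula, which isn't spelled out here. It assumes, purely for illustration, that OFF and DEF are goals scored and conceded per game against an average team, that the two are combined by a made-up `expected_goals` rule, and that each side's goals follow independent Poisson distributions. The function names and the `avg_goals` value of 1.3 are hypothetical.

```python
import math


def poisson_pmf(k, lam):
    """Probability of exactly k goals when the expected number is lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def expected_goals(off, opp_def, avg_goals=1.3):
    """Hypothetical rule for combining one team's OFF with the opponent's DEF.

    Assumption: both ratings are goals per game against an average team,
    and avg_goals is that average team's scoring rate.
    """
    return off * opp_def / avg_goals


def match_probabilities(off_a, def_a, off_b, def_b, max_goals=10):
    """Return (win, draw, loss) probabilities from team A's point of view."""
    lam_a = expected_goals(off_a, def_b)
    lam_b = expected_goals(off_b, def_a)
    win = draw = loss = 0.0
    # Sum over every scoreline up to max_goals for each side.
    for goals_a in range(max_goals + 1):
        p_a = poisson_pmf(goals_a, lam_a)
        for goals_b in range(max_goals + 1):
            p = p_a * poisson_pmf(goals_b, lam_b)
            if goals_a > goals_b:
                win += p
            elif goals_a == goals_b:
                draw += p
            else:
                loss += p
    # Renormalize to absorb the tiny mass above max_goals.
    total = win + draw + loss
    return win / total, draw / total, loss / total
```

Calling `match_probabilities(2.0, 0.8, 1.2, 1.4)` with made-up ratings returns three numbers that sum to 1, with the first team favored because it has the higher OFF and the lower DEF.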
Obviously, if you have a tool to predict the probabilities for a given match, it doesn't take that much more work to project out the balance of the tournament, given what's happened so far. So, that's what you'll be seeing in the top right-hand corner of the website for the next month or so -- I'll be running 10,000 simulations once a day or so and updating a chart that looks like this:
I hope this will be reasonably intuitive to people; PTS is the number of points that a team is projected to accumulate during group play (3 points are earned for each win and 1 for each draw); GF and GA are its projected goals for and against; Win and Adv are the probabilities that a team wins its group and advances to the knockout stage, respectively. The rightmost two columns look beyond the group stage to the knockout stage; Semi is the probability that a team reaches the Semifinals, and Cup is its chance of coming home with a trophy.
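For the curious, the group-stage columns can be produced by a simulation along the following lines. This is a sketch under stated assumptions rather than the code behind the chart: the goal model (independent Poisson draws from a made-up combination of OFF and DEF), the tiebreakers (points, then goal difference, then goals scored, then a coin flip), the `simulate_group` name and the `avg_goals` value are all hypothetical. It covers only PTS, GF, GA, Win and Adv, not the knockout columns.

```python
import itertools
import math
import random


def sample_poisson(rng, lam):
    """Draw a Poisson-distributed goal count (Knuth's multiplication method)."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


def simulate_group(ratings, n_sims=10_000, seed=0, avg_goals=1.3):
    """Simulate a round-robin group; ratings maps team name -> (OFF, DEF).

    Returns, per team, average PTS, GF and GA plus the share of simulations
    in which the team won the group (Win) or finished in the top two (Adv).
    """
    rng = random.Random(seed)
    names = list(ratings)
    columns = ("PTS", "GF", "GA", "Win", "Adv")
    totals = {name: dict.fromkeys(columns, 0.0) for name in names}
    for _ in range(n_sims):
        pts = dict.fromkeys(names, 0)
        gf = dict.fromkeys(names, 0)
        ga = dict.fromkeys(names, 0)
        for a, b in itertools.combinations(names, 2):
            # Hypothetical rule: own OFF times opponent DEF over the average.
            goals_a = sample_poisson(rng, ratings[a][0] * ratings[b][1] / avg_goals)
            goals_b = sample_poisson(rng, ratings[b][0] * ratings[a][1] / avg_goals)
            gf[a] += goals_a
            ga[a] += goals_b
            gf[b] += goals_b
            ga[b] += goals_a
            if goals_a > goals_b:
                pts[a] += 3          # 3 points for a win
            elif goals_b > goals_a:
                pts[b] += 3
            else:
                pts[a] += 1          # 1 point each for a draw
                pts[b] += 1
        # Assumed tiebreakers: points, goal difference, goals for, coin flip.
        order = sorted(
            names,
            key=lambda t: (pts[t], gf[t] - ga[t], gf[t], rng.random()),
            reverse=True,
        )
        totals[order[0]]["Win"] += 1
        for team in order[:2]:
            totals[team]["Adv"] += 1
        for team in names:
            totals[team]["PTS"] += pts[team]
            totals[team]["GF"] += gf[team]
            totals[team]["GA"] += ga[team]
    return {
        team: {col: value / n_sims for col, value in cols.items()}
        for team, cols in totals.items()
    }
```

Feeding it four teams with invented ratings, say `{"Team A": (2.0, 0.8), "Team B": (1.5, 1.1), "Team C": (1.2, 1.4), "Team D": (0.8, 2.0)}`, gives one row per team in the same shape as the chart. Across the group, the Win column sums to 1 and the Adv column to 2.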
As the World Cup progresses, of course, we'll replace the results of simulated matches with actual ones and the probabilities will change accordingly.
Beyond this, I probably won't be writing a lot about the World Cup. I'm just going to be running the numbers and enjoying it from an Undisclosed Location, where I'm ensconced to make hay on my book project. Hopefully SPI will do pretty well for itself.
A caution: SPI does not account for injuries, of which there have been quite a few, although we're working on a fix for that.