The FIFA World Cup has seen ‘Paul the Octopus’, the famous eight-limbed soothsayer. In this age of AI and machine learning, predicting a World Cup winner has become more refined. Take, for example, Achim Zeileis, Professor of Statistics at the University of Innsbruck. He says his “machine learning algorithm and subsequent simulations are fueled by data, expert knowledge and statistical models”, and he has used them to predict the likely winner of the FIFA World Cup 2026, the biggest edition of the marquee event so far.
What Was the Process Followed to Predict the Winner?
Achim Zeileis says his algorithm proceeds in two steps. “In the first, sophisticated statistical models and expert insight from bookmakers and transfer markets are combined to determine the strengths of all teams and their players. In the second step, a machine learning algorithm decides how to best combine the strength estimates with other information about the teams,” he wrote in The Independent.
“We ran the simulation 100,000 times to determine the tournament’s most likely course. The results show that Spain is the favourite for the title with a winning probability of 14.5%, closely followed by England and France, each at 12.4%, and Germany at 11.2%.”
“Portugal and Argentina also have good chances of winning the title, at 8.9% and 8.2%, respectively.”
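To make the idea of running a simulation 100,000 times concrete, here is a minimal Python sketch of a Monte Carlo tournament simulation. It is not Zeileis's model: the team names, the strength values, the eight-team random-draw knockout bracket and the rule that a team wins with probability proportional to its strength are all assumptions made for illustration.

```python
import random
from collections import Counter

# Hypothetical strengths for eight placeholder teams (not real ratings).
EXAMPLE_STRENGTHS = {
    "Team A": 2.0, "Team B": 1.6, "Team C": 1.5, "Team D": 1.4,
    "Team E": 1.2, "Team F": 1.1, "Team G": 1.0, "Team H": 0.9,
}


def win_probability(strength_a, strength_b):
    """Assumed rule: A beats B with probability proportional to its strength."""
    return strength_a / (strength_a + strength_b)


def simulate_knockout(strengths, rng):
    """Play one single-elimination bracket and return the winner's name.

    Assumes the number of teams is a power of two and the draw is random;
    the real tournament uses a group stage and a fixed bracket.
    """
    teams = list(strengths)
    rng.shuffle(teams)
    while len(teams) > 1:
        next_round = []
        # Pair neighbours in the shuffled list and advance one of each pair.
        for a, b in zip(teams[0::2], teams[1::2]):
            p = win_probability(strengths[a], strengths[b])
            next_round.append(a if rng.random() < p else b)
        teams = next_round
    return teams[0]


def title_probabilities(strengths, runs=100_000, seed=0):
    """Estimate each team's title probability as its share of simulated wins."""
    rng = random.Random(seed)
    wins = Counter(simulate_knockout(strengths, rng) for _ in range(runs))
    return {team: wins[team] / runs for team in strengths}


if __name__ == "__main__":
    estimates = title_probabilities(EXAMPLE_STRENGTHS)
    for team, prob in sorted(estimates.items(), key=lambda kv: -kv[1]):
        print(f"{team}: {prob:.1%}")
```

Each run plays the whole bracket once, and a team's winning probability is simply the fraction of runs in which it lifts the trophy. This is why even the favourite can end up with a figure as modest as 14.5%: many different tournament courses are plausible.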
A Deep Dive
Zeileis says his algorithm draws on four sources of information.
“First, all national matches over the past eight years are the basis for a ‘retrospective’ estimate of the teams’ strengths. Second, a ‘prospective’ strength estimate is obtained from quoted odds of various international bookmakers, reflecting their expert opinions about the upcoming tournament,” he wrote.
“Third, ratings of the individual players are produced based on their contributions to goals at the club and national levels. And finally, the current quality and future potential of the players are reflected in their expected market values. These are available from the Transfermarkt website, which uses a wisdom-of-the-crowd approach to estimate the unknown real market values.”
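The second input, bookmaker odds, can be turned into probabilities with some simple arithmetic. The article does not say how Zeileis does this, so the following Python sketch shows only a basic, commonly used approach: take the inverse of each decimal odd, rescale so the values sum to one (which strips out the bookmaker's margin), and average across bookmakers. The function names and the odds in the usage example are hypothetical.

```python
def implied_probabilities(decimal_odds):
    """Convert one bookmaker's decimal odds into probabilities that sum to 1.

    The raw inverses add up to more than 1 because of the bookmaker's
    margin; dividing by their sum is an assumed, simple way to remove it.
    """
    raw = {team: 1.0 / odds for team, odds in decimal_odds.items()}
    total = sum(raw.values())
    return {team: p / total for team, p in raw.items()}


def consensus(bookmakers):
    """Average the margin-free probabilities over several bookmakers.

    Assumes every bookmaker quotes odds for the same set of teams.
    """
    per_book = [implied_probabilities(book) for book in bookmakers]
    teams = per_book[0].keys()
    return {t: sum(p[t] for p in per_book) / len(per_book) for t in teams}


if __name__ == "__main__":
    # Made-up odds for three placeholder teams from two placeholder bookmakers.
    book_one = {"Team A": 1.8, "Team B": 3.6, "Team C": 3.6}
    book_two = {"Team A": 2.25, "Team B": 2.25, "Team C": 4.5}
    for team, prob in consensus([book_one, book_two]).items():
        print(f"{team}: {prob:.3f}")
```

Averaging several bookmakers smooths out the quirks of any single one, which fits the article's description of odds from various international bookmakers as a forward-looking, expert-based strength estimate.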

