Imagine you’re at a soccer game, and just as the opening whistle blows, the power cuts out. The stadium goes black. Eventually someone rigs up a single spotlight and the game goes on, but the light can only follow the ball. You can see who’s making a pass or a tackle, but as for what the other 21 players are up to, you’re in the dark.
That’s basically what most soccer data looks like: clear information about what’s happening on the ball and a total blank everywhere else. For such a big, messy sport, that can be a problem. No less an authority than Johan Cruyff once said the test of a good player is: “What do you do during those 87 minutes when you don’t have the ball?”
The good news is that new methods are starting to make it possible to see the rest of the pitch in a manageable way, without losing track of what’s happening on the ball. And better data is changing how soccer measures itself.
Traditional event data, which Opta started collecting more than 15 years ago, is a line-by-line record of everything that occurs on the ball in a match. It’s created by humans who watch in real time (or sometimes later for added precision) and log the timestamp, location and type of hundreds of actions, or “events.” This data is the basis for most familiar soccer stats, such as shot counts and possession percentages, and even FiveThirtyEight’s Soccer Power Index. It’s how the sport’s analytics revolution got its start.
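To make "line-by-line record" concrete, here is a minimal sketch of what a few rows of event data might look like and how simple stats fall out of them. The field names, coordinates and sample rows are invented for illustration and are not Opta's actual schema. Real possession figures are calculated in different ways, so the pass-share function below is only a crude stand-in.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """One on-ball action. Field names are hypothetical, not a provider schema."""
    minute: int
    second: int
    team: str
    player: str
    type: str   # e.g. "pass", "shot", "interception"
    x: float    # pitch location, arbitrary units
    y: float


# Invented sample rows, for illustration only.
events = [
    Event(0, 4, "Home", "Player A", "pass", 52.0, 40.0),
    Event(0, 9, "Home", "Player B", "pass", 61.5, 22.0),
    Event(0, 15, "Away", "Player C", "interception", 70.0, 30.0),
    Event(0, 21, "Away", "Player D", "pass", 45.0, 35.0),
    Event(0, 27, "Away", "Player E", "shot", 18.0, 38.0),
    Event(1, 2, "Home", "Player F", "shot", 105.0, 44.0),
    Event(1, 40, "Home", "Player A", "shot", 110.0, 36.0),
]


def shot_counts(events):
    """Count shots per team."""
    return Counter(e.team for e in events if e.type == "shot")


def pass_share(events):
    """Each team's share of all passes: a rough proxy for possession."""
    passes = Counter(e.team for e in events if e.type == "pass")
    total = sum(passes.values())
    if total == 0:
        return {}
    return {team: n / total for team, n in passes.items()}


print(shot_counts(events))
print(pass_share(events))
```

Notice what the rows leave out: nothing in them says where anyone other than the player on the ball was standing.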
But so much of what matters in soccer happens away from the ball, through how players move and defend, continuously creating and denying space. To analyze the game in depth, you need “more football reality inside your data,” said Ted Knutson, the CEO and co-founder of StatsBomb. In 2018, the upstart data provider added “freeze frames” to its event feed to capture the location of defenders and the goalkeeper every time a shot was taken. Gathering off-ball positions helped StatsBomb build a more precise expected goals model to estimate each shot’s chance of scoring, one of the foundational tools in soccer analytics.
To bring that level of detail to the much bigger chunk of the game that happens between the boxes, the company rolled out a new kind of data on Wednesday called StatsBomb 360, which includes freeze frame positions at the moment of every event, not just shots. It’s part of a major new wave of innovation in a field that still does a lot of its work using the same limited information as a decade ago. “The StatsBomb 360 data idea is for sure a remarkable step forward in soccer analytics,” tweeted FC Barcelona head of sports analytics Javier Fernández in February, when the launch was announced. “I’m convinced that this will make high-quality analytics more accessible (and way better datasets!).”
“Contextual” or “augmented” event data like StatsBomb 360 combines traditional human-logged events with off-ball positions captured by computer vision, allowing analysts to measure things they could previously only guess at. For example, coaches often want to know whether a pass broke a defensive line, which requires the locations of players who never touched the ball. The new data may even make it possible to evaluate a player’s choices. “You can essentially say, ‘Hey, I think this passing lane was open, and this player decided to do this instead,’” said Devin Pleuler, head of analytics for Toronto FC. “So for the first time, you can make a real, solid guess at player decision-making.”
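As a rough illustration of the two questions above, the sketch below takes a single freeze frame and asks whether a passing lane was open and whether a pass crossed a defensive line. Everything here is an assumption made for the example, not StatsBomb's or any club's method: the coordinates, the 2-unit gap that counts as "open," the idea that the attacking team plays toward increasing x, and the shortcut of treating a defensive line as the average depth of its players.

```python
import math


def distance_to_segment(p, a, b):
    """Shortest distance from point p to the segment running from a to b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    length_sq = dx * dx + dy * dy
    if length_sq == 0:  # passer and target at the same spot
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp to its endpoints.
    t = ((px - ax) * dx + (py - ay) * dy) / length_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def lane_is_open(passer, target, defenders, min_gap=2.0):
    """True if no defender stands within min_gap of the straight-line pass.

    min_gap is a hypothetical threshold chosen for illustration.
    """
    return all(
        distance_to_segment(d, passer, target) >= min_gap for d in defenders
    )


def breaks_line(pass_start, pass_end, line_defenders):
    """True if the pass starts in front of a defensive line and ends behind it.

    Assumes play toward increasing x and approximates the line by the
    average x position of the defenders in it.
    """
    line_x = sum(x for x, _ in line_defenders) / len(line_defenders)
    return pass_start[0] < line_x < pass_end[0]


# One invented freeze frame: (x, y) positions at the moment of a pass.
passer, target = (50.0, 40.0), (70.0, 40.0)
back_four = [(58.0, 20.0), (60.0, 32.0), (62.0, 48.0), (60.0, 60.0)]

print(lane_is_open(passer, target, back_four))
print(breaks_line(passer, target, back_four))
```

A real model would also account for ball speed, defender movement and how quickly each player could reach the lane, none of which a single still frame can provide.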
Clubs with top analytics staff like Liverpool already do this kind of advanced tactical modeling using tracking data, which relies on cameras positioned around a stadium to capture the location of every player and the ball multiple times per second. Tracking data sees everything in immaculate detail, but it has drawbacks: It’s expensive, unwieldy and often unavailable when clubs are trying to scout new players. “I can’t go set up an eight-camera array in Argentina or wherever,” Pleuler said.
Within the Premier League and MLS, clubs get tracking data through an official partner, Second Spectrum, which got its start in the NBA. Its multi-camera setup delivers real-time information about player movement everywhere on the pitch, producing what Pleuler called “the gold standard: a true, full picture of the game.” Second Spectrum can link its tracking data to event data from companies like Opta’s parent, Stats Perform, or even invent new action categories via “auto-eventing.” “We can create an event for something like an overlapping run, and it’s quite a similar process as for a pass,” said Second Spectrum’s Mike D’Auria. “If a league or club has historical event data, it’s a question of where do they want consistency, versus where do they want to move to a different paradigm?”
To look at leagues outside their own, some clubs turn to a burgeoning middle ground called broadcast tracking, which extracts imperfect but useful tracking data from the version of the game you see on TV. Startups Sportlogiq in Montreal and SkillCorner in Paris have developed techniques to recognize players from shirt numbers and physical features, which allow them to go beyond anonymous positions and measure sprints, defensive positioning and other individual off-ball stats. When joined to an event provider’s on-ball actions, broadcast tracking can create rich contextual event data. But while it’s comparatively affordable and available for any league with a TV deal, broadcast tracking can’t quite see everything.
“We are a little bit at the mercy of the broadcast producer,” said Sportlogiq’s Stephen Foyston. Each time the TV cuts to replay, zooms in too far or clutters the screen with ads or graphics, the algorithms momentarily lose track of the game and the data blinks out. Still, most important actions are usually trackable with a precision that approaches full tracking. “When you look at what’s happening away from the main camera, it’s mostly walking, jogging, low-speed running, stuff that nobody wants to look at,” Foyston said. “The high-intensity stuff, we actually capture it quite close to multi-camera.”
Like Sportlogiq, StatsBomb gets its new 360 data from broadcast video. In some ways, StatsBomb is playing catch-up to broadcast tracking, since its computer vision product can’t yet identify off-ball players or measure how they’re moving. Its advantage is that 360’s freeze frames are part of the same feed as high-quality, human-collected records of passes, shots, dribbles, headers, interceptions and all the other on-ball actions across 38 competitions. “StatsBomb is taking their event data, which is the best on the market right now, and adding tracking-like features to it,” Pleuler said, “so now I can understand if there is an open passing lane, where previously I didn’t have that sort of information.”
As old and new kinds of data converge, soccer is flipping the switch from the dark ages to a floodlit future where analysts will be able to track most of what’s happening on the field anywhere in the world. It could change the way we see the game. “We don’t know everything that we’re going to learn out of it,” Knutson said. “But it seemed like the right thing to do from a football understanding.”