How do we know all those football statistics are true?


Are you into statistics? Are you constantly looking up your team’s xG numbers or your striker’s shot location metrics? If you are, here’s a question: do you trust them to be accurate? And if so, why?

It isn’t uncommon to hear an ex-pro pundit guffawing at the latest statistical measurement concept as though it’s all part of the liberal PC conspiracy to emasculate them, along with quinoa, rosé wine and the word Twitter, but this is a massive industry.

It is rarely said, but many of us like statistics for their own sake, and this applies to football as much as any other walk of life. I like knowing that Osasuna have won more aerials (28.2) this season than any other team in the top five European leagues, and yet have the lowest successful pass percentage (68.7), or that Napoli have the most shots per game (17.6) and Elche the fewest (6.3), every bit as much as I like knowing Ten Years After’s highest charting album on the Billboard 200 is their 1970 release, Cricklewood Green, which peaked at #14. It doesn’t make me enjoy football or music more per se, but that doesn’t matter. The knowledge is, in and of itself, entertaining enough.

Statistics give me a warm fuzzy feeling and I’m not ashamed to say so. Of course, I have absolutely no idea if these numbers (from who state ‘Data sources – Opta Sports, eNetPulse & Getty Images’) are correct and I have no way of checking. I accept them as I find them, but then I’ve nothing invested in their accuracy. But many do.

The industry sells stats as profound insider intel, in a way which often seems to over-inflate their importance. Football is more than data; data is not more than football. But you’d think they had the key to unlocking the greatest mysteries of the universe.

All of them breathlessly promise unrivalled insights.

Stats Perform’s website says: ‘Our award-winning AI team maximises the value of global sports coverage dating back over 40 years by coupling it with machine-learning technology to generate insights for meaningful experiences for fans.’

‘Machine-learning technology’ sounds like they’ve constructed a huge Heath Robinson contraption where you insert a football at one end and out comes a prediction for Colchester United v Mansfield at the other, rather than it actually just being a computer and software.

Okay, it’s all so much Partridge-style corporate BS, but it reflects just how seriously they take what they do and how serious the business of statistics is for football, from gambling to TV, to player recruitment.

People and pundits often try to unearth their own personal massive Koh-i-Noor soccer stat to place in their crown of punditry, to prove the quality of their insight into the game. It seems obligatory these days. And in an era where it has become important to some fans not just to be right, but to be seen to be right in the court of social media, this isn’t so surprising. To some, stats equal intelligence. But as our Prime Minister shows us every day, education isn’t the same thing as intelligence or understanding.

It is remarkable, then, given the value and size of the industry, that there is no independent organisation to assess and judge the veracity of the statistics sold to us. Opta’s website (its parent company is Stats Perform) says it covers over 1,000 leagues and competitions and over 200,000 matches. By any stretch of the imagination, that must be a huge operation. It surely isn’t possible that every single statistic across 200,000 games is 100% accurate, and yet there is no indication of any margin of error. There is no concession to any possible inaccuracy. We accept each and every stat as gospel truth. I’ve never once questioned any stat, have you? Maybe we should.

We are unable to challenge their veracity, so we have to go on trust. And trust is a word most companies in the field use a lot, as you can imagine they would. It isn’t in anyone’s interest to be wrong but mistakes can easily be made and flaws baked into any system, no matter how rigorously policed. And some organisations will just be better than others.

Data companies all say they have their own internal checking procedures to try and ensure that the numbers they sell are correct. They go back and adapt them after games when other info becomes available. But we know internal policing methods have an innate weakness to them: they’re internal. Thus they are potentially subject to all manner of relationship, employment and commercial pressures.

So perhaps it is no surprise that there isn’t an independent OfStat agency to oversee accuracy, standards and procedures. There should be penalties for persistent inaccuracy and industry standards to which all sign up.

Elsewhere in society, it is accepted that in any major industry there should be a body whose purpose it is to define, supervise and maintain standards, to help protect the interests of the customers and workers. This business isn’t just a bit of fun. A lot of money rides on such analysis for those that use the numbers. As everyone from media to clubs to agents wants ever more data in pursuit of success, the accuracy and integrity of that data are crucial. Just because the company says it’s accurate doesn’t mean we should take them at their word. Surely we need third-party confirmation.

Looking at various websites, I note that the data companies do not vie with each other as to who has the most accurate data. I assume this is because to do so would be to undermine the credibility of the artform across the whole industry, and that’s to no-one’s advantage. Yet they’re essentially all selling the same potatoes, so creating a USP for your firm must be hard. You have to invent machine-learning.

Like any currency, the football stat industry relies on us all believing in it for it to continue to exist. The moment we think money is a worthless bit of paper, it has no value at all. The moment we think football stats are not precise is the moment the industry collapses.

Stat culture does get pulled into the orbit of the anti-PC, anti-intellectual, anti-woke mindset, as another thing that’s wrong with football these days, for some reason, though I suspect this is less to do with the science itself and more a hostility to those who embrace it too tightly as a way to explain everything. There is certainly a mutual snootiness.

For me, stats are just part of the fun; whether they’re right or wrong makes little real difference. But the industry needs a reliable guarantee that they’re correct, and until there is an independent body whose task is to ascertain that they are, how can we ever really trust them? Or, perhaps more pertinently, why have we unquestioningly trusted them for so long?

John Nicholson

Johnny’s new book Can We Have Our Football Back YET? What Covid-19 has told us about the Premier League, is an update to his 2019/20 best seller. “A searing account of a year like no other in football.”
