How it works
Methodology
FormSora is built on the principle that numbers need context to be useful. This page explains how we source data, compute statistics, and generate the signals you see across the platform.
Data sourcing
Match-level statistics are sourced from professional football data providers with coverage of the top European leagues. Data is ingested at the player-per-appearance level — meaning each row represents one player in one match, not season totals.
We cover the Premier League, Championship, La Liga, Bundesliga, and Serie A. Coverage for other competitions is limited and may be incomplete.
Per-game averages
All averages are calculated from confirmed appearances only. A player who is named in a squad but does not play any minutes is not counted for that match. This keeps averages honest: matches spent as an unused substitute do not drag down a player's per-game numbers.
Where a specific stat is not recorded for a match (due to data provider coverage gaps), that appearance is excluded from the average for that stat. This means different stats for the same player may have different sample sizes, which we show on each stat page.
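The two exclusion rules above can be sketched in a few lines of Python. This is an illustration only: the Appearance record, its field names, and the convention that a missing dictionary key means "not recorded by the provider" are assumptions made for the example, not FormSora's actual schema.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Appearance:
    """One player in one match (hypothetical schema, for illustration only)."""
    minutes: int
    # Assumption: a stat missing from this dict was not recorded by the provider.
    stats: dict = field(default_factory=dict)


def per_game_average(appearances: list, stat: str) -> tuple:
    """Return (average, sample_size) for one stat.

    Rows with no minutes played and matches where the stat was not
    recorded are both left out of the sample.
    """
    values = [
        a.stats[stat]
        for a in appearances
        if a.minutes > 0 and stat in a.stats
    ]
    if not values:
        return None, 0
    return sum(values) / len(values), len(values)


matches = [
    Appearance(90, {"shots": 4, "tackles": 2}),
    Appearance(0, {}),                       # named in squad, did not play
    Appearance(75, {"shots": 2}),            # tackles not recorded for this match
    Appearance(15, {"shots": 0, "tackles": 1}),
]

shots_avg, shots_n = per_game_average(matches, "shots")        # 2.0 over 3 games
tackles_avg, tackles_n = per_game_average(matches, "tackles")  # 1.5 over 2 games
```

Note how the same player ends up with a sample of 3 for shots and 2 for tackles, which is why the sample size is reported per stat.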
Context signals
A signal highlights where a player's output in a specific context — at home, away, or against a particular tier of opposition — differs meaningfully from their overall season average. The idea is simple: the same player can perform very differently depending on the situation.
Signals are generated by taking the subset of a player's appearances that match a context (for example, home games) and comparing the average output in that subset with the average across all of their appearances. Only contexts with a sufficient number of appearances are considered.
Each signal carries a sample quality rating — Strong, Moderate, or Thin — so you can judge how much weight to put on it. A signal backed by 15 games is more reliable than one backed by 4.
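As a rough sketch, the comparison and the sample quality rating might look like the Python below. The cut-offs (MIN_GAMES, MODERATE_GAMES, STRONG_GAMES) are hypothetical values chosen for the example; the actual thresholds behind Strong, Moderate, and Thin are not stated on this page, and neither is the size of difference that counts as meaningful.

```python
from typing import Optional

# Hypothetical thresholds, for illustration only.
MIN_GAMES = 4        # below this, the context is not considered at all
MODERATE_GAMES = 8
STRONG_GAMES = 15


def sample_quality(n: int) -> str:
    """Rate a sample by the number of appearances behind it."""
    if n >= STRONG_GAMES:
        return "Strong"
    if n >= MODERATE_GAMES:
        return "Moderate"
    return "Thin"


def context_signal(all_values: list, context_values: list) -> Optional[dict]:
    """Compare the average in one context with the overall average.

    Returns None when the context has too few appearances to consider.
    """
    n = len(context_values)
    if n < MIN_GAMES or not all_values:
        return None
    overall = sum(all_values) / len(all_values)
    in_context = sum(context_values) / n
    return {
        "overall_avg": overall,
        "context_avg": in_context,
        "difference": in_context - overall,
        "games": n,
        "quality": sample_quality(n),
    }


# Shots per game: every appearance, and the home subset of those appearances.
all_shots = [3, 1, 4, 0, 2, 5, 1, 4, 0, 4]
home_shots = [3, 4, 5, 4, 4]

signal = context_signal(all_shots, home_shots)
```

Here the home average (4.0) sits well above the overall average (2.4), but with only 5 home games the sketch would rate the signal Thin.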
Opponent quality
Opponents are classified into three tiers based on their current league standing: top third, middle third, and bottom third of the table. This is recalculated regularly as standings change across the season.
This lets us identify players who consistently perform differently against stronger or weaker opposition, which is often a more informative context than home/away alone.
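One way to express the tiering in Python is shown below. How the boundaries fall when the league size is not divisible by three is an assumption here (any remainder goes to the upper tiers); the real boundary rule may differ.

```python
def opponent_tier(position: int, league_size: int) -> str:
    """Map a league position (1 = top of the table) to a tier.

    Assumption: the table is cut into three near-equal bands, with any
    remainder assigned to the upper bands first.
    """
    if not 1 <= position <= league_size:
        raise ValueError("position must be between 1 and league_size")
    base, extra = divmod(league_size, 3)
    top_end = base + (1 if extra > 0 else 0)
    middle_end = top_end + base + (1 if extra > 1 else 0)
    if position <= top_end:
        return "top"
    if position <= middle_end:
        return "middle"
    return "bottom"


# In a 20-team league this sketch gives tiers of 7, 7, and 6 teams.
tiers = [opponent_tier(p, 20) for p in (3, 10, 18)]
```

Because tiers depend on the current table, the same opponent can move between tiers during a season, so the classification would need to be recomputed whenever standings change.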
Form
Player form is derived from recent match output relative to their season average — not from subjective ratings or editorial opinion. A player in “hot form” has been producing numbers above their own season baseline over recent appearances.
Team form is calculated from recent match results (wins, draws, losses) over the last five games.
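Both form calculations can be sketched as follows. The five-appearance window for player form is a hypothetical choice (only team form is stated to use the last five games), and expressing player form as a ratio to the season average is one possible reading of "relative to", not a confirmed formula.

```python
from typing import Optional


def player_form_ratio(values: list, recent_n: int = 5) -> Optional[float]:
    """Recent average divided by season average (values ordered oldest first).

    A ratio above 1.0 means recent output is above the player's own baseline.
    recent_n is a hypothetical window size.
    """
    if not values:
        return None
    season_avg = sum(values) / len(values)
    if season_avg == 0:
        return None  # no baseline to compare against
    recent = values[-recent_n:]
    return (sum(recent) / len(recent)) / season_avg


def team_form(results: list, last_n: int = 5) -> dict:
    """Count wins, draws, and losses over the most recent matches."""
    recent = results[-last_n:]
    return {outcome: recent.count(outcome) for outcome in ("W", "D", "L")}


shots_by_match = [1, 2, 1, 0, 2, 3, 4, 2, 3, 2]   # oldest first
ratio = player_form_ratio(shots_by_match)          # 2.8 recent vs 2.0 season

record = team_form(["W", "L", "W", "D", "W", "W", "L"])
```

In this example the ratio is 1.4, so the player is running above their own baseline, and the team's last five games come out as three wins, one draw, and one loss.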
Referee profiles
Referee statistics (cards per game, foul rates) are calculated across all matches officiated in the current season for tracked leagues. Referees are classified as strict, moderate, or lenient relative to the league average for their competition.
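A minimal sketch of the classification in Python, assuming a simple tolerance band around the league average. The 10% band is a hypothetical value picked for the example; the actual cut-offs between strict, moderate, and lenient are not stated here.

```python
def classify_referee(cards_per_game: float, league_avg: float, band: float = 0.10) -> str:
    """Label a referee relative to the league average for their competition.

    band is a hypothetical tolerance: within 10% either side counts as moderate.
    """
    if league_avg <= 0:
        raise ValueError("league_avg must be positive")
    ratio = cards_per_game / league_avg
    if ratio > 1 + band:
        return "strict"
    if ratio < 1 - band:
        return "lenient"
    return "moderate"


# Against a league average of 4.0 cards per game:
labels = [classify_referee(c, 4.0) for c in (5.0, 4.1, 3.0)]
```

Comparing against the average for the referee's own competition matters because typical card and foul rates can differ between leagues.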
What we don't do
FormSora does not use predictive models, machine learning, or AI-generated analysis. Every number on the platform is a direct calculation from historical match data. We do not generate odds, probability estimates, or betting recommendations. What you see is what the data says — nothing more.
Data freshness
Statistics are updated daily after each match day. There may be a delay of up to 24 hours before a match's stats appear on the platform. Upcoming fixture data is updated regularly but may not always reflect the latest scheduling changes.
Coverage caveats
Not every stat is available for every match or competition. Where coverage is partial, we show a note on the relevant stat page explaining what is and is not included in the calculation. We prefer to show a smaller, accurate sample than a larger, noisy one.
Questions about specific stats or coverage? Browse signals or get in touch.