Methodology

Last updated: April 2026

1. Data Pipeline

We collect historical match data (2015–present) for 14 European leagues from football-data.co.uk. Each match record includes the full-time and half-time scores, shots, cards, and closing odds (Bet365 and the market average).
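
As a rough illustration of the ingestion step, the sketch below parses CSV text into typed match records. The column names (HomeTeam, FTHG, HS, HY, B365CH and so on) are assumed to follow the football-data.co.uk CSV key, the Match class and parse_matches function are hypothetical names, and the sample rows are made up for illustration; this is not the production pipeline.

```python
import csv
import io
from dataclasses import dataclass


@dataclass
class Match:
    home: str
    away: str
    ft_goals: tuple[int, int]      # full-time (home, away)
    ht_goals: tuple[int, int]      # half-time (home, away)
    shots: tuple[int, int]         # (home, away)
    yellow_cards: tuple[int, int]  # (home, away)
    closing_odds: tuple[float, float, float]  # assumed Bet365 closing (home, draw, away)


def parse_matches(csv_text: str) -> list[Match]:
    """Parse CSV text into Match records, skipping rows with missing values."""
    matches = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        try:
            matches.append(Match(
                home=row["HomeTeam"],
                away=row["AwayTeam"],
                ft_goals=(int(row["FTHG"]), int(row["FTAG"])),
                ht_goals=(int(row["HTHG"]), int(row["HTAG"])),
                shots=(int(row["HS"]), int(row["AS"])),
                yellow_cards=(int(row["HY"]), int(row["AY"])),
                closing_odds=(float(row["B365CH"]),
                              float(row["B365CD"]),
                              float(row["B365CA"])),
            ))
        except (KeyError, ValueError, TypeError):
            continue  # incomplete row: drop it rather than impute values
    return matches


# Made-up rows for illustration only (not real results or odds).
SAMPLE = """HomeTeam,AwayTeam,FTHG,FTAG,HTHG,HTAG,HS,AS,HY,AY,B365CH,B365CD,B365CA
Team A,Team B,2,1,1,0,14,9,2,3,1.90,3.60,4.20
Team C,Team D,0,0,0,0,7,11,1,,2.50,3.20,2.90
"""

matches = parse_matches(SAMPLE)
print(len(matches), matches[0].closing_odds)
```

The second sample row is dropped because its away yellow-card field is empty. Dropping incomplete rows is a design choice of this sketch, not a statement about how missing data is handled in practice.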

2. Machine Learning Layer

XGBoost and LightGBM classifiers are trained on 150+ engineered features per match, including Elo ratings, recent form (last 5/10/21 days), head-to-head (H2H) record, home advantage, injury burden, and tactical style indicators. Cross-validated accuracy on the 1X2 market exceeds 55%, against a baseline of 45%.
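
To make one of these features concrete, here is a minimal sketch of a standard Elo update after a single match. The K-factor of 20 and the home-advantage offset of 60 rating points are placeholder values chosen for illustration, and the function names are hypothetical; the exact variant used to build our features may differ.

```python
def expected_home_score(rating_home: float, rating_away: float,
                        home_advantage: float = 60.0) -> float:
    """Expected score (between 0 and 1) for the home side under the Elo model."""
    diff = rating_away - (rating_home + home_advantage)
    return 1.0 / (1.0 + 10.0 ** (diff / 400.0))


def update_elo(rating_home: float, rating_away: float, result: float,
               k: float = 20.0, home_advantage: float = 60.0) -> tuple[float, float]:
    """Return updated (home, away) ratings.

    result is the home side's score: 1.0 for a win, 0.5 for a draw, 0.0 for a loss.
    """
    expected = expected_home_score(rating_home, rating_away, home_advantage)
    delta = k * (result - expected)
    # Zero-sum update: what the home side gains, the away side loses.
    return rating_home + delta, rating_away - delta


new_home, new_away = update_elo(1500.0, 1500.0, result=1.0)
print(round(new_home, 1), round(new_away, 1))
```

The pre-match ratings (not the updated ones) are what would be fed to a classifier as features, so that the feature never contains information from the match being predicted.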

3. Poisson Score Matrix

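The expected goals for the home and away sides, λh and λa, are derived via Supremacy Split from the Over/Under and Asian Handicap lines. The score matrix P(i,j) = Pois(i|λh) × Pois(j|λa), where i and j are the home and away goal counts, is then adjusted with the Dixon-Coles correction for low-scoring results, with ρ calibrated per league via maximum likelihood (MLE) on historical low-score frequencies.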
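
The sketch below shows one way this step can be put together. It assumes the expected total goals and the supremacy (home minus away expected goals) have already been extracted from the Over/Under and Asian Handicap lines, and it treats the split as λh = (total + supremacy)/2 and λa = (total − supremacy)/2, which is our reading of the term rather than a full description of the procedure. The correction factor follows the usual Dixon-Coles form for the four scores 0-0, 0-1, 1-0 and 1-1; the function names, the truncation at 10 goals and the value of ρ in the usage line are illustrative.

```python
import math


def supremacy_split(total_goals: float, supremacy: float) -> tuple[float, float]:
    """Split expected total goals and supremacy into (lam_h, lam_a)."""
    lam_h = (total_goals + supremacy) / 2.0
    lam_a = (total_goals - supremacy) / 2.0
    if lam_h <= 0 or lam_a <= 0:
        raise ValueError("supremacy must be smaller in magnitude than total goals")
    return lam_h, lam_a


def poisson_pmf(k: int, lam: float) -> float:
    return math.exp(-lam) * lam ** k / math.factorial(k)


def dc_tau(i: int, j: int, lam_h: float, lam_a: float, rho: float) -> float:
    """Dixon-Coles adjustment factor for home goals i and away goals j."""
    if i == 0 and j == 0:
        return 1.0 - lam_h * lam_a * rho
    if i == 0 and j == 1:
        return 1.0 + lam_h * rho
    if i == 1 and j == 0:
        return 1.0 + lam_a * rho
    if i == 1 and j == 1:
        return 1.0 - rho
    return 1.0


def score_matrix(lam_h: float, lam_a: float, rho: float,
                 max_goals: int = 10) -> list[list[float]]:
    """Adjusted P(i, j) for 0..max_goals goals each, renormalised after truncation."""
    raw = [[poisson_pmf(i, lam_h) * poisson_pmf(j, lam_a) * dc_tau(i, j, lam_h, lam_a, rho)
            for j in range(max_goals + 1)]
           for i in range(max_goals + 1)]
    total = sum(sum(row) for row in raw)
    return [[p / total for p in row] for row in raw]


def outcome_probs(matrix: list[list[float]]) -> tuple[float, float, float]:
    """Collapse the score matrix into (home win, draw, away win) probabilities."""
    home = sum(p for i, row in enumerate(matrix) for j, p in enumerate(row) if i > j)
    draw = sum(p for i, row in enumerate(matrix) for j, p in enumerate(row) if i == j)
    away = sum(p for i, row in enumerate(matrix) for j, p in enumerate(row) if i < j)
    return home, draw, away


lam_h, lam_a = supremacy_split(total_goals=2.6, supremacy=0.4)
print(outcome_probs(score_matrix(lam_h, lam_a, rho=-0.1)))
```

In this parameterisation a negative ρ moves probability toward 0-0 and 1-1 and away from 0-1 and 1-0, so the draw probability rises relative to the independent Poisson model. The four adjustments cancel in total, which is why the renormalisation only has to absorb the small truncation error at max_goals.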

4. Value Bet Detection

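Edge = model_probability − 1/bookmaker_odds, where bookmaker_odds are decimal odds, so 1/bookmaker_odds is the bookmaker's implied probability with the margin still included. An edge above 3 percentage points (0.03) is flagged as a potential value bet. We do not recommend staking amounts; edge detection is informational only.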
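
As a worked example of the formula, a model probability of 0.55 against decimal odds of 2.00 gives an edge of 0.55 − 0.50 = 0.05, which clears the 3% threshold. The sketch below applies the same check to all three 1X2 outcomes; the function names and the probabilities and odds in the usage lines are made up for illustration.

```python
def edge(model_probability: float, bookmaker_odds: float) -> float:
    """Edge = model probability minus the implied probability of decimal odds."""
    if bookmaker_odds <= 1.0:
        raise ValueError("decimal odds must be greater than 1.0")
    return model_probability - 1.0 / bookmaker_odds


def value_bets(model_probs: dict[str, float], odds: dict[str, float],
               threshold: float = 0.03) -> dict[str, float]:
    """Return {outcome: edge} for outcomes whose edge exceeds the threshold."""
    edges = {outcome: edge(p, odds[outcome]) for outcome, p in model_probs.items()}
    return {outcome: e for outcome, e in edges.items() if e > threshold}


# Illustrative numbers only.
model_probs = {"1": 0.55, "X": 0.25, "2": 0.20}
odds = {"1": 2.00, "X": 3.60, "2": 4.50}
print(value_bets(model_probs, odds))
```

Only the home win is flagged here: the draw and the away win both have a negative edge at these prices.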

5. Entropy & No-Bet Filter

Market entropy is H = −Σ p·ln(p), summed over the three 1X2 probabilities. Matches with entropy > 1.2, or with a negative edge on every outcome, are flagged as "avoid": these are cases where the markets are internally inconsistent or the model has no conviction.
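
A minimal sketch of the filter follows. It assumes the 1X2 probabilities are the bookmaker's implied probabilities normalised to sum to 1, which is our assumption for this example. One point worth checking against the stated threshold: with the natural logarithm, entropy over three outcomes cannot exceed ln 3 ≈ 1.0986, whereas in bits (log base 2) the maximum is about 1.585, so the sketch leaves both the logarithm base and the threshold as parameters. Function names and the sample odds and edges are illustrative.

```python
import math


def normalise(implied: list[float]) -> list[float]:
    """Rescale implied probabilities so they sum to 1 (removes the overround)."""
    total = sum(implied)
    return [p / total for p in implied]


def entropy(probs: list[float], base: float = math.e) -> float:
    """Shannon entropy H = -sum(p * log(p)); zero-probability terms contribute 0."""
    return -sum(p * math.log(p, base) for p in probs if p > 0)


def should_avoid(odds: list[float], edges: list[float],
                 threshold: float = 1.2, base: float = math.e) -> bool:
    """Flag a match if market entropy exceeds the threshold or every edge is negative."""
    probs = normalise([1.0 / o for o in odds])
    high_entropy = entropy(probs, base) > threshold
    no_positive_edge = all(e < 0 for e in edges)
    return high_entropy or no_positive_edge


# Illustrative numbers only.
print(should_avoid([2.00, 3.60, 4.50], [0.05, -0.03, -0.02]))
print(should_avoid([2.00, 3.60, 4.50], [-0.01, -0.03, -0.02]))
```

With the defaults shown (natural logarithm, threshold 1.2) only the all-negative-edge condition can trigger, so the entropy condition only has an effect if the base or the threshold is set accordingly.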