What I would say is that the factors in the two European algorithms are IDENTICAL to the factors in the UK algorithms. When I first told a few people I was going to be looking at the European leagues, they said I’d no doubt find that other factors were more important in these leagues than in the UK leagues. Nope, I didn’t. I did test for other factors but basically, every factor that appears in the UK algorithm appears in the European algorithm.
In a way, this is quite reassuring for me as it means the work I did nearly 2.5 years ago is still valid! When I was looking at factors you can use to build football ratings, I downloaded data from lots of different leagues, dumped it in a spreadsheet and started to analyse it. I didn’t care what league it was from, what time of year it was from and so on. My findings then were that shots on goal and shots on target were the best underlying indicators and that remains the case with these new European leagues I’m looking at.
One big difference I can see between the UK algorithms and the European algorithms is that historical results have no impact in the UK algorithms (it’s a factor in the model but with a weighting of 0 in every algorithm) and yet, in the European algorithm, I can see that there appears to be a correlation between historical results and future results in a fixture. Hence, this factor carries some weighting in the European rating algorithms.
Without giving away the weightings and variables, I’d say short-term form matters more in the UK than in Europe whereas long-term form matters more in Europe. That’s a definite trend I picked up. I’d also say home form matters much more in Europe than in UK Leagues.
The biggest difference by far between the UK and European Leagues is the performance of Home and Away bets. Basically, it’s very easy (almost hard to lose!) to build a rating algorithm to produce a fantastic return on Homes but unfortunately, the better the algorithm is on Homes, the worse it is on Aways!
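To make the idea of a factor that sits in the model with a weighting of 0 a bit more concrete, here’s a minimal sketch. It assumes the rating is a simple weighted sum of factor scores, which is an assumption for illustration only. The factor names, scores and weightings below are all made up and are not the real ones.

```python
from typing import Dict


def rating(factors: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted sum of factor scores (an assumed form, not the real model).

    A factor whose weighting is 0 is still in the model but adds nothing.
    """
    return sum(weights.get(name, 0.0) * value for name, value in factors.items())


# Hypothetical factor scores for one team in one fixture.
factors = {
    "shots_on_goal": 1.2,
    "shots_on_target": 0.8,
    "historical_results": 0.5,
}

# Hypothetical weightings. The only thing taken from the text is that
# historical results carry a weighting of 0 in the UK and more than 0 in Europe.
uk_weights = {"shots_on_goal": 1.0, "shots_on_target": 1.0, "historical_results": 0.0}
euro_weights = {"shots_on_goal": 1.0, "shots_on_target": 1.0, "historical_results": 0.4}

uk_rating = rating(factors, uk_weights)
euro_rating = rating(factors, euro_weights)
print(f"UK-style rating:   {uk_rating:.2f}")
print(f"Euro-style rating: {euro_rating:.2f}")
```

With the same inputs, the only thing that moves the Euro-style rating away from the UK-style one is the non-zero weighting on historical results.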
It took me a fair bit of time to understand this dynamic as when I first asked the program I have (basically a SAS program which takes in the data and runs simulations) to maximise the return on some variables in the algorithm, it was throwing up massive profits on Homes and massive losses on Away selections for some variables! Hence, it was actually optimal to ignore the Aways and concentrate on the Homes.
I did consider having a separate algorithm for Homes and Aways but after discussing it with a few other football analysts, I decided this was far too big a risk. For a start, it would have meant that I could have ended up backing opposing teams in some games, which would have given me a headache. Secondly, it is so different to what I did with the UK algorithms that I would have been going into the unknown.
Therefore, I reined back the Home selections a little and found weightings in the model which ensured I could achieve a decent profit on the Away bets too. Sacrificing a little return on Homes to get a better return overall isn’t a bad thing. Thinking back, I did a similar thing with the UK systems at the beginning, but the opposite way around! All the profits were on the Aways in the UK leagues, so I pulled these back a bit and ended up finding some really good Home bets.
One other change between the original backtesting for the UK systems and the Euro systems is the overround. In the UK systems, I adjusted the draw odds to try to account for the fact that the draw odds I was using are far too low and the overround was too high. If you use a bookie like Pinnacle for draw odds, you can easily beat the draw odds I’m using in the backtesting results. There is NO adjustment in these results though.
What does this mean? Well, it means the historical results for AH bets are probably lower than what can be achieved in a live environment. In hindsight, I wish I’d never adjusted the draw odds in the original backtesting for the UK systems as it made the AH results slightly too good during backtesting. In this case, the AH results are going to look too low, so you can probably see these AH returns as a base minimum for all the Euro systems. Something to be wary of when looking at the results.
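For anyone who wants to see the arithmetic behind this, here’s a rough sketch. The overround is just the sum of the implied probabilities (1 divided by the decimal odds) minus 1. The second function assumes the AH price is built from the Home and Draw odds as a draw-no-bet (AH0) price, which is one standard way of doing it but is my assumption about how the AH results are derived here. The odds are made-up numbers for illustration.

```python
def overround(home_odds: float, draw_odds: float, away_odds: float) -> float:
    """Bookmaker margin on a 1X2 market: sum of implied probabilities minus 1."""
    return 1.0 / home_odds + 1.0 / draw_odds + 1.0 / away_odds - 1.0


def draw_no_bet_odds(home_odds: float, draw_odds: float) -> float:
    """AH0 price built from 1X2 odds (assumed construction).

    Put 1/draw_odds of the stake on the draw so the stake comes back on a
    draw, and the rest on the home win.
    """
    return home_odds * (1.0 - 1.0 / draw_odds)


# Hypothetical prices: same Home and Away odds, two different draw prices.
low_draw = overround(2.00, 3.20, 3.80)
better_draw = overround(2.00, 3.40, 3.80)
print(f"Overround with draw at 3.20: {low_draw:.2%}")
print(f"Overround with draw at 3.40: {better_draw:.2%}")

print(f"AH0 price with draw at 3.20: {draw_no_bet_odds(2.00, 3.20):.3f}")
print(f"AH0 price with draw at 3.40: {draw_no_bet_odds(2.00, 3.40):.3f}")
```

Lower draw odds push the overround up and the AH0 price down, which is why unadjusted draw odds should make the backtested AH returns look worse than what a better draw price would give.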
So, after a few late nights, the first algorithm is complete. How does it look?
Well, similar to what I’ve done with the UK algorithms, I won’t ever publish the results from seasons 2000-2005. Quite simply, it doesn’t add anything to the analysis as the data is 100% backfitted and therefore, it is useless data as a means of projecting the future.
One small change compared to the UK systems is the fact that I am now 7 years away from the last season of fully backfitted data. I’ve made a conscious decision to change the part seasons I’ve used during the backfitting process.
I’ve used 50% of the data from 2006/07 and 2007/08 (which is identical to what I did with the UK algorithms) but I’ve also used 50% of the data from 2010/11. I feel like if I don’t use any more recent data, the data in the model would be too far out of date and therefore, I’m happy to sacrifice a season’s results to ensure I could get a model that works going forward.
What it means is that when you look at results from 2010/11 (as well as the first two seasons), be very wary of them. They are NOT fully backtested results, as 50% of the games in this season were used in the backfitting process. Hence, they are not 100% reliable but then again, they are not 100% backfitted either.
Anyway, caveats aside, here are the results of the first European algorithm.
Similar to what I did when I first showed the UK algorithms to people, here are the results from the 3 seasons which are fully backtested. This should give an indication of what can be achieved going forward (ignoring the fact I now know the single systems won’t achieve these results as they didn’t with the UK bets).
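As a rough illustration of what a part season means, here’s a minimal sketch of splitting the games so that 50% of each part season goes into the backfitting and everything else is held back. It assumes the 50% is a random sample of games, which is only an assumption. The function name, the match records and the "2009/10" season are hypothetical, and the fully backfitted early seasons are left out to keep it short.

```python
import random
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple

Match = Dict[str, object]


def split_part_seasons(
    matches: Iterable[Match],
    part_seasons: Set[str],
    fraction: float = 0.5,
    seed: int = 0,
) -> Tuple[List[Match], List[Match]]:
    """Send a random `fraction` of each part season to the fitting set.

    Games from any other season are held out in full.
    """
    by_season = defaultdict(list)
    for match in matches:
        by_season[match["season"]].append(match)

    rng = random.Random(seed)  # fixed seed so the split is repeatable
    fitted: List[Match] = []
    held_out: List[Match] = []
    for season, games in by_season.items():
        if season not in part_seasons:
            held_out.extend(games)
            continue
        shuffled = games[:]
        rng.shuffle(shuffled)
        cut = int(len(shuffled) * fraction)
        fitted.extend(shuffled[:cut])
        held_out.extend(shuffled[cut:])
    return fitted, held_out


# Hypothetical data: 10 games in a held-out season and 10 in a part season.
matches = [
    {"season": season, "match_id": i}
    for season in ("2009/10", "2010/11")
    for i in range(10)
]
fitted, held_out = split_part_seasons(matches, part_seasons={"2010/11"})
print(len(fitted), "games used for backfitting,", len(held_out), "held out")
```

Any results reported for a part season mix the two groups together, which is why they sit somewhere between backfitted and backtested.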
The backtested results look about 75% of the overall results, so there isn’t a big discrepancy. It was bigger on the UK results! Hence, it’s OK to look at the full set of results and not get drawn into worrying about the results being massively overstated.
Here’s the split by Homes and Aways.
The most interesting aspect of Euro algorithm one is the fact that over 60% of the bets are Home bets! I can’t explain how different this is to the UK algorithms. On UK algorithm one, 60% are Away bets, 77% are Away bets on UK algorithm two and 66% are Away bets on UK algorithm three.
That’s a significant shift from Aways to Homes but it fits in with the comments I made above. It seems much easier to make a profit on Home bets in the European Leagues and this is shining through on Euro algorithm one.
Interestingly, the ROI on Homes on Euro algorithm one is 15.7% and on Aways it is 19.2%. It’s similar to what I found with the UK algorithms. Even though I could have had a much better ROI on the Home bets here, by ensuring I made a profit on the Away bets, I actually end up with a better return on the Aways! I did the same on the UK algorithms where the Homes ended up with fewer bets and a better ROI than the Aways.
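For reference, ROI here is just profit divided by the total amount staked. Below is a minimal sketch of working it out separately for Homes and Aways, assuming level 1pt stakes and decimal odds. The bets are invented purely for illustration and have nothing to do with the actual results.

```python
from collections import defaultdict
from typing import Dict, List


def roi_by_side(bets: List[dict]) -> Dict[str, float]:
    """Return ROI (profit / total staked) for each side, e.g. Home and Away."""
    staked = defaultdict(float)
    profit = defaultdict(float)
    for bet in bets:
        side = bet["side"]
        staked[side] += bet["stake"]
        if bet["won"]:
            profit[side] += bet["stake"] * (bet["odds"] - 1.0)
        else:
            profit[side] -= bet["stake"]
    return {side: profit[side] / staked[side] for side in staked}


# Hypothetical bets at level 1pt stakes (not real results).
bets = [
    {"side": "Home", "odds": 2.00, "stake": 1.0, "won": True},
    {"side": "Home", "odds": 2.10, "stake": 1.0, "won": False},
    {"side": "Home", "odds": 1.80, "stake": 1.0, "won": True},
    {"side": "Away", "odds": 3.50, "stake": 1.0, "won": True},
    {"side": "Away", "odds": 4.00, "stake": 1.0, "won": False},
    {"side": "Away", "odds": 3.20, "stake": 1.0, "won": False},
]

for side, roi in roi_by_side(bets).items():
    print(f"{side}: ROI {roi:.1%}")
```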
Here’s the performance by each League:
Germany appears to be the strongest league with France being the weakest league. Germany does have the fewest bets though, so that may explain the higher ROI we’re seeing.
I think that’s enough of an introduction to algorithm one. Of course, the results above will now become the results for system E1.
The next stage is to filter these results and create system E2.