After writing my latest post yesterday and reading through it again today at lunchtime, it did make me wonder.....What if including the 2006/7 and 2007/8 seasons in my results has been causing me to think things were behaving differently from the past when, in reality, they weren't that different from the last couple of seasons?
As most people know by now, from when I started out on this journey, I've only been quoting the last 4 seasons' results, as the data I used to build the ratings from 2002-2006 is all backfitted data. If I quoted those results, I would be painting a misleading picture, as those years had ROIs of about 40% because I fixed the parameters of the two rating algorithms to maximise these results.
However, as I've mentioned a fair few times now, I also used a sample of data from the 2006/7 and 2007/8 seasons to ensure my model wasn't built only on data so old that it missed any changes in trends. I randomly selected 50% of the games from 2006/7 and 2007/8 to include in my backfitted model and therefore, these years will always be slightly overstated. However, these seasons account for only 1/6 (1/3 * 50%) of my full backfitted results and therefore, I was still fairly confident that the results weren't overstated by much in these seasons.
Admittedly, it would have been much better to just ignore the first two seasons completely and only look at the 2 full seasons of backtested results, but my issue there was that I only had 2 seasons of data and it meant my sample size was only 5,164 games. I noticed a post by Cassini today about someone saying you needed 100,000 bets to be sure of an edge, so I'd be paper trading for the next 20 seasons under this scenario!
Anyway, I thought 5,164 games was too small to draw meaningful conclusions and that's why I've always gone with 4 seasons. 4 seasons gave me 12,037 games and therefore, I was happy to go with this, on the understanding that my first two years were probably slightly inflated, by around 20% I guess (hard to know for sure tbh as it depends how good the backfitting was in these 2 years compared to the backtested results!)
After writing yesterday's post, I could see clearly that the draw % was much lower in the first two seasons and therefore, this would be bringing down my overall draw % average. Hence, maybe this season's draw % wasn't as bad when compared to the last two seasons' backtested results.........
There is a lot of information to glean from this but whatever way you look at this, it shows my worries were unfounded. The one thing that stands out here though is how consistent the 08/09 and 09/10 seasons were. They had a very similar H/A/X % across both years. The draw % was remarkably similar with 26.1% in the first season and 26.4% in the second season.
To put this into perspective, the draw % for this season is 31.3%. That's an increase of nearly 5 percentage points (31.3% against 26.4%), or a relative increase of around 18% on last season.
Over the course of this season (2801 games), we can use the same draw % as last season at 26.4% to get an expected number of draws this season. This would be about 740 draws. What have we had this season? 876 draws. Hence, about 136 draws more than you would have expected, based on last season.
How much does this hurt the systems? Well, for every draw, the systems score -1pt. The average odds for all my bets is around the 7/4 mark (2.75). Using this as a proxy, and assuming each extra draw would otherwise have been a winning bet, we can say that each draw costs the systems 2.75pts (1.75pt profit as against 1pt loss). Hence, the systems have lost around 375pts this season compared to last season.
This season, the systems have ONLY made a profit of 203pts with an ROI of 7.3%. If we add on the missing 375pts or so (sounds a lot when I say it!), then we'd be looking at an ROI of around 21% for this season.
What was the ROI in the 2009/10 season? 21%. What was the ROI in the 2008/09 season? 21%.
I rest my case, your honour.
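For anyone who wants to check the draw arithmetic above for themselves, here is a rough sketch in Python. It only uses the figures quoted in this post, and it leans on two assumptions of mine: every bet is a 1pt level stake (so ROI is simply profit divided by number of bets), and every "extra" draw would otherwise have been a winner at the average odds. The function and variable names are made up for illustration and aren't part of the systems themselves.

```python
def draw_shortfall(bets, last_season_draw_rate, actual_draws,
                   avg_decimal_odds, actual_profit):
    """Estimate how many points the extra draws cost, under the
    assumptions stated above (1pt stakes, each extra draw replaces a win)."""
    # Draws we would have expected at last season's draw rate
    expected_draws = bets * last_season_draw_rate
    extra_draws = actual_draws - expected_draws

    # A win at decimal odds d returns (d - 1) pts profit; a draw loses 1pt.
    # Swinging from a win to a draw therefore costs (d - 1) + 1 = d pts.
    cost_per_draw = (avg_decimal_odds - 1) + 1
    points_lost = extra_draws * cost_per_draw

    adjusted_profit = actual_profit + points_lost
    return {
        "expected_draws": expected_draws,
        "extra_draws": extra_draws,
        "points_lost": points_lost,
        "actual_roi": actual_profit / bets,
        "adjusted_roi": adjusted_profit / bets,
    }


# Figures quoted in this post: 2801 bets, 26.4% draw rate last season,
# 876 draws this season, average odds of 7/4 (2.75), 203pts profit.
result = draw_shortfall(2801, 0.264, 876, 2.75, 203)

for name, value in result.items():
    print(f"{name}: {value:.3f}")
```

The exact output depends on rounding (the 26.4% is itself a rounded figure), so treat the numbers it prints as ballpark estimates rather than anything more precise.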
Incidentally, the attachment also shows the draws by each system for the last 3 seasons. You can see just how badly some systems have been crippled this season by the draw.....
System 7 has a 32% draw rate this season compared to circa 25% for the last two seasons
System 8 has a 30% draw rate this season compared to circa 25% for the last two seasons
System 22 has a 33% draw rate this season compared to circa 28% for the last two seasons
System 23 has a 36% draw rate (50% after Christmas!) this season compared to circa 24% last season
All the combined systems (6-21, 6-22 et al.) have draw rates much higher than the last two seasons as a result of this too
Whatever way you look at the data, this draw effect isn't caused by me backfitting results, having poor ratings, building poor systems etc. It is just short-term variance that has unfortunately lasted for half a season. Long-term though, the work I've done on this project isn't flawed, I hope (and pray!)