Hi everyone, I recently made a post about insider performance massively outperforming the stocks' real performance. While that's still true, I want to highlight a bug that made the numbers way higher than they should have been. Hopefully this also helps those of you who are interested in programming and backtesting your own strategies.

To run this backtest, I was pulling the tickers from my database, sorting and filtering them, then computing the long-term portfolio returns. Note that there's a bunch of code in between that I don't want to reveal fully. What you need to know is that this is where I was calculating the return, and there's an error in it. Can you spot it? (A sketch of both the buggy and corrected calculation is at the bottom of this post.)

I was looking at the top stocks this returned for insider performance and inspecting each one by hand. When I pulled up the second stock, GPOR, I saw that its all-time return is just 178%. So how the hell did insiders achieve a 20,000% return? That definitely seemed off, and it tipped me off that something in the code must be summing additive dollar gains rather than averaging percentage returns.

I ran a quick =CORREL() on the value of insider buying vs. the return (outperformance vs. total_invested), and it came back at 0.99: the smoking gun! That meant returns were almost entirely controlled by the amount invested. We'd expect some correlation, but not a perfect one.

Digging deeper into the code, I found the line! It ASSUMED that the starting value for the insider position was a fixed $1k, so the more insiders invested, the larger the dollar gain and the bigger the "return". In reality, that line should divide the gain by the total amount actually invested, which gives the REAL portfolio return.

I'm going to have to rerun thousands of tickers' worth of backtests, but that's okay! Better to be right than wrong…

Hope this was educational! And a great indication of how easily backtesting can go wrong.
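Since I'm not sharing the real code, here's a minimal sketch of the bug and the fix, with made-up data and hypothetical names (`trades`, `invested`, `current_value`); the actual variables and structure in my code are different:

```python
import pandas as pd

# Hypothetical data: three insider purchases in one stock, each up 178%.
trades = pd.DataFrame({
    "invested":      [1_000, 50_000, 250_000],   # dollars put in
    "current_value": [2_780, 139_000, 695_000],  # worth today (+178% each)
})

total_invested = trades["invested"].sum()
total_gain = (trades["current_value"] - trades["invested"]).sum()

# BUGGY: measures the summed dollar gain against a fixed $1k starting
# value, so bigger positions produce arbitrarily huge "returns".
buggy_return = total_gain / 1_000

# FIXED: measure the gain against the capital actually invested.
real_return = total_gain / total_invested

print(f"buggy: {buggy_return:.0%}  real: {real_return:.0%}")
# buggy: 53578%  real: 178%
```

Even though every position here is up exactly 178%, the buggy version reports a ~53,000% "return" simply because more dollars went in. That's exactly the GPOR-style blowup I was seeing.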

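And here's the sanity check that caught it, the Python equivalent of the =CORREL() I ran in the spreadsheet (the results table is made up for illustration):

```python
import pandas as pd

# Hypothetical backtest output: one row per ticker, with the dollars
# insiders invested and the (buggy) outperformance computed for them.
results = pd.DataFrame({
    "total_invested": [10_000, 120_000, 500_000, 2_000_000],
    "outperformance": [0.9, 11.0, 48.0, 191.0],
})

# Pearson correlation, same as =CORREL() in a spreadsheet. A value
# near 1.0 means the "return" is really just measuring position size.
corr = results["total_invested"].corr(results["outperformance"])
print(f"correlation: {corr:.2f}")
```

If your computed returns correlate almost perfectly with how much money was put in, you're measuring dollars, not performance. Cheap checks like this are worth running before trusting any backtest output.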