SPY benchmark for PyFolio

mads · June 18, 2018, 7:34pm

In your new video you showed a PyFolio report using the SPY benchmark. Traditionally this has been a major headache. My current solution is to download the data explicitly and use it with the workaround below:

    import pandas as pd
    import requests

    # Download the SPY benchmark dataset from IEX (only 5 years)
    r = requests.get('https://api.iextrading.com/1.0/stock/{}/chart/5y'.format('SPY'))
    r.raise_for_status()  # fail early on an HTTP error
    spy = pd.DataFrame(r.json())
    spy.index = pd.DatetimeIndex(spy['date'])
    # Daily percent returns on a UTC index; drop the first row, which is NaN
    spy = spy['close'].sort_index().tz_localize('UTC').pct_change(1).iloc[1:]

Let me know if you found a better way.

Brian · June 18, 2018, 8:42pm

You can specify a benchmark in your Moonshot strategy and it will be returned in the CSV. Pyfolio will then use that benchmark data instead of trying to download SPY data.

I agree this has been a long-time headache and fortunately Quantopian is deprecating external data dependencies in pyfolio. When that's released it will be pulled into QuantRocket. Once that happens, if no Moonshot benchmark is specified, the benchmark portions of the plots will simply be omitted by pyfolio.

mads · June 18, 2018, 9:36pm

Hmm, I get an error:

HTTPError: ('500 Server Error: INTERNAL SERVER ERROR for url: http://houston/moonshot/backtests.csv?strategies=vix&start_date=2018-01-01+00%3A00%3A00', {'status': 'error', 'msg': 'vix BENCHMARK ConId 756733 is not in backtest data'})

I've added SPY to the Universe and refetched data. Any clues?

Brian · June 19, 2018, 12:17am

It could be due to caching. In backtesting, Moonshot caches history query results for up to 24 hours for any given set of history db query parameters. This is to speed up parameter scans and repeated backtest iterations. The design of caching based on history query parameters usually works fine but gets tricked when the query parameters remain the same but the underlying data has changed (as in your case due to adding SPY).

Try changing a parameter that gets passed to the history query, such as start or end date, or you can --force-recreate the moonshot service to clear the cache. If nothing were done, it would sort itself out in 24 hours. In the future, the caching behavior will be better documented and offer more user control.
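
To illustrate why the cache gets tricked, here is a minimal sketch of a cache keyed only by query parameters. It is not Moonshot's actual implementation: `ParamKeyedCache`, `query_history`, and the `universe` list are hypothetical stand-ins, and the 24-hour expiry is left out for brevity.

```python
class ParamKeyedCache:
    """Hypothetical cache keyed only by query parameters (not Moonshot's real code)."""

    def __init__(self, query_fn):
        self._query_fn = query_fn
        self._store = {}

    def get(self, **params):
        # The key depends on the parameters only, never on the underlying data
        key = tuple(sorted(params.items()))
        if key not in self._store:
            self._store[key] = self._query_fn(**params)
        return self._store[key]

    def clear(self):
        # Stand-in for recreating the service, which discards the cache
        self._store.clear()


# Stand-in for the history database: the symbols currently in the universe
universe = ["VIX"]


def query_history(start_date):
    # Return a snapshot of whatever the "database" holds right now
    return list(universe)


cache = ParamKeyedCache(query_history)
first = cache.get(start_date="2018-01-01")   # queried and cached

universe.append("SPY")                       # underlying data changes

stale = cache.get(start_date="2018-01-01")   # same parameters: cached result, no SPY
fresh = cache.get(start_date="2018-01-02")   # changed parameter: new query, has SPY

cache.clear()
after_clear = cache.get(start_date="2018-01-01")  # cache emptied: has SPY
```

In this sketch the second call with an unchanged `start_date` still returns the result from before SPY was added, while changing the parameter or clearing the cache forces a new query.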

Brian · September 17, 2018, 12:38pm

quantrocket/jupyter:1.3.0 contains a Pyfolio update which no longer attempts to plot benchmarks unless benchmark data is provided by the user.