Zipline doesn't support trading breaks in custom calendars

I'm customizing the built-in GLOBEX calendar to add proper support for the daily trading halt with the following lines:

    break_start_times = (
        (None, time(15, 16)),
    )

    break_end_times = (
        (None, time(15, 30)),
    )

This is supported syntax in the original trading-calendars package from Quantopian. However, it appears Zipline doesn't handle trading breaks when it's trying to fetch historical data:

        quantrocket_zipline_1|Traceback (most recent call last):
        quantrocket_zipline_1|  File "sym://qrocket_app_py", line 807, in post
        quantrocket_zipline_1|  File "sym://qrocket_qrzipline_backtest_py", line 167, in backtest_algo
        quantrocket_zipline_1|  File "/opt/conda/lib/python3.6/site-packages/zipline/algorithm.py", line 675, in run
        quantrocket_zipline_1|    for perf in self.get_generator():
        quantrocket_zipline_1|  File "/opt/conda/lib/python3.6/site-packages/zipline/gens/tradesimulation.py", line 205, in transform
        quantrocket_zipline_1|    for capital_change_packet in every_bar(dt):
        quantrocket_zipline_1|  File "/opt/conda/lib/python3.6/site-packages/zipline/gens/tradesimulation.py", line 133, in every_bar
        quantrocket_zipline_1|    handle_data(algo, current_data, dt_to_use)
        quantrocket_zipline_1|  File "/opt/conda/lib/python3.6/site-packages/zipline/utils/events.py", line 218, in handle_data
        quantrocket_zipline_1|    dt,
        quantrocket_zipline_1|  File "/opt/conda/lib/python3.6/site-packages/zipline/utils/events.py", line 237, in handle_data
        quantrocket_zipline_1|    self.callback(context, data)
        quantrocket_zipline_1|  File "/opt/conda/lib/python3.6/site-packages/zipline/algorithm.py", line 485, in handle_data
        quantrocket_zipline_1|    self._handle_data(self, data)
        quantrocket_zipline_1|  File "data_integrity", line 26, in handle_data
        quantrocket_zipline_1|  File "data_integrity", line 32, in check_data
        quantrocket_zipline_1|  File "zipline/_protocol.pyx", line 121, in zipline._protocol.check_parameters.__call__.assert_keywords_and_call (zipline/_protocol.c:3824)
        quantrocket_zipline_1|  File "zipline/_protocol.pyx", line 744, in zipline._protocol.BarData.history (zipline/_protocol.c:9622)
        quantrocket_zipline_1|  File "/opt/conda/lib/python3.6/site-packages/zipline/data/data_portal.py", line 974, in get_history_window
        quantrocket_zipline_1|    field)
        quantrocket_zipline_1|  File "/opt/conda/lib/python3.6/site-packages/zipline/data/data_portal.py", line 906, in _get_history_minute_window
        quantrocket_zipline_1|    minutes_for_window,
        quantrocket_zipline_1|  File "/opt/conda/lib/python3.6/site-packages/zipline/data/data_portal.py", line 1063, in _get_minute_window_data
        quantrocket_zipline_1|    False)
        quantrocket_zipline_1|  File "/opt/conda/lib/python3.6/site-packages/zipline/data/history_loader.py", line 549, in history
        quantrocket_zipline_1|    is_perspective_after)
        quantrocket_zipline_1|  File "/opt/conda/lib/python3.6/site-packages/zipline/data/history_loader.py", line 431, in _ensure_sliding_windows
        quantrocket_zipline_1|    array = self._array(prefetch_dts, needed_assets, field)
        quantrocket_zipline_1|  File "/opt/conda/lib/python3.6/site-packages/zipline/data/history_loader.py", line 595, in _array
        quantrocket_zipline_1|    assets,
        quantrocket_zipline_1|  File "/opt/conda/lib/python3.6/site-packages/zipline/data/dispatch_bar_reader.py", line 121, in load_raw_arrays
        quantrocket_zipline_1|    for t in asset_types if sid_groups[t]}
        quantrocket_zipline_1|  File "/opt/conda/lib/python3.6/site-packages/zipline/data/dispatch_bar_reader.py", line 121, in <dictcomp>
      quantrocket_flightlog_1|2021-05-16 16:54:16 data_integrity: ERROR Error occurred at timestamp '2021-03-01 07:45:00-06:00' in contract 'Future(QF000000026993 [ESH1])'.
        quantrocket_zipline_1|    for t in asset_types if sid_groups[t]}
        quantrocket_houston_1|172.20.0.12 - - [16/May/2021:16:54:16 +0000] "POST /flightlog/handler HTTP/1.1" 200 5 "-" "-"
        quantrocket_zipline_1|  File "/opt/conda/lib/python3.6/site-packages/zipline/data/continuous_future_reader.py", line 280, in load_raw_arrays
        quantrocket_zipline_1|    out[start_loc:end_loc + 1, i] = result
        quantrocket_zipline_1|ValueError: could not broadcast input array from shape (1574) into shape (1561)

I've narrowed the problem down to the _minute_exclusion_tree() function in minute_bars.py (master branch of quantrocket-llc/zipline on GitHub). The exclusion tree it builds only accounts for market opens and closes, so when a trading halt is defined, it doesn't compensate for the shift in the calendar, and the prefetch requests more data than it needs. I'm going to tinker with this a bit more to see if I can implement a fix. I'm posting this in case someone else has already run into it.
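To illustrate the kind of mismatch I mean, here is a toy sketch in plain Python. It is not zipline's code: the session times, the variable names, and the helper are all made up for illustration. It compares the number of minutes a break-aware calendar would expect against what a reader returns when it only knows about opens and closes.

```python
from datetime import datetime, timedelta


def minutes_between(start, end):
    """Return every minute from start to end, both endpoints inclusive."""
    count = int((end - start).total_seconds() // 60) + 1
    return [start + timedelta(minutes=i) for i in range(count)]


# Hypothetical session: opens 08:31, closes 16:00, halt 15:16-15:30 inclusive.
session_open = datetime(2021, 3, 1, 8, 31)
session_close = datetime(2021, 3, 1, 16, 0)
break_start = datetime(2021, 3, 1, 15, 16)
break_end = datetime(2021, 3, 1, 15, 30)

all_minutes = minutes_between(session_open, session_close)
break_minutes = set(minutes_between(break_start, break_end))

# What a break-aware calendar considers tradable.
calendar_minutes = [m for m in all_minutes if m not in break_minutes]

# What a reader returns if it only excludes time outside open/close.
reader_minutes = all_minutes

# The reader hands back more rows than the calendar-sized output expects.
surplus = len(reader_minutes) - len(calendar_minutes)
print(len(calendar_minutes), len(reader_minutes), surplus)
```

The surplus here is simply the length of the break. The real numbers in the traceback depend on the prefetch window and the actual calendar, so this only shows the direction of the mismatch, not its exact size.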

I tinkered with this a bit more and got close, but I have an annoying off-by-one error that I can't seem to get rid of.

  1. In the exchange calendar, add these lines:
    break_start_times = (
        (None, time(15, 16)),
    )
    break_end_times = (
        (None, time(15, 31)),
    )
  1. Replace _minute_exclusion_tree() in minute_bars.py with the following (i.e. no longer use the helper method _minutes_to_exclude() so we can handle breaks inline with the market opens/closes):
    @lazyval
    def _minute_exclusion_tree(self):
        """
        (omitted for brevity)
        """
        itree = IntervalTree()

        for open_time, break_start_time, break_end_time, close_time in zip(
            self._market_opens, self._market_break_starts, self._market_break_ends, self._market_closes
        ):
            if break_start_time is not pd.NaT:
                break_start_pos = self._find_position_of_minute(
                    break_start_time) # note the omission of +1 here
                break_end_pos = self._find_position_of_minute(
                    break_end_time) - 1
                itree[break_start_pos:break_end_pos + 1] = (
                    break_start_pos, break_end_pos)

            close_pos = self._find_position_of_minute(close_time) + 1
            eod_pos = (
                self._find_position_of_minute(open_time)
                + self._minutes_per_day
                - 1
            )
            itree[close_pos:eod_pos + 1] = (close_pos, eod_pos)

        return itree

You'll also need to make __init__ look like this:

        self._schedule = self.calendar.schedule[slicer]
        self._market_opens = self._schedule.market_open
        self._market_open_values = self._market_opens.values.\
            astype('datetime64[m]').astype(np.int64)
        self._market_break_starts = self._schedule.break_start
        self._market_break_start_values = self._market_break_starts.values.\
            astype('datetime64[m]').astype(np.int64)
        self._market_break_ends = self._schedule.break_end
        self._market_break_end_values = self._market_break_ends.values.\
            astype('datetime64[m]').astype(np.int64)
        self._market_closes = self._schedule.market_close
        self._market_close_values = self._market_closes.values.\
            astype('datetime64[m]').astype(np.int64)
  1. Run any algo that fetches minute data. You'll get an exception like this:
        quantrocket_zipline_1|Traceback (most recent call last):
        quantrocket_zipline_1|  File "sym://qrocket_app_py", line 807, in post
        quantrocket_zipline_1|  File "sym://qrocket_qrzipline_backtest_py", line 167, in backtest_algo
        quantrocket_zipline_1|  File "/opt/conda/lib/python3.6/site-packages/zipline/algorithm.py", line 675, in run
        quantrocket_zipline_1|    for perf in self.get_generator():
        quantrocket_zipline_1|  File "/opt/conda/lib/python3.6/site-packages/zipline/gens/tradesimulation.py", line 205, in transform
        quantrocket_zipline_1|    for capital_change_packet in every_bar(dt):
        quantrocket_zipline_1|  File "/opt/conda/lib/python3.6/site-packages/zipline/gens/tradesimulation.py", line 133, in every_bar
        quantrocket_zipline_1|    handle_data(algo, current_data, dt_to_use)
        quantrocket_zipline_1|  File "/opt/conda/lib/python3.6/site-packages/zipline/utils/events.py", line 218, in handle_data
        quantrocket_zipline_1|    dt,
        quantrocket_zipline_1|  File "/opt/conda/lib/python3.6/site-packages/zipline/utils/events.py", line 237, in handle_data
        quantrocket_zipline_1|    self.callback(context, data)
        quantrocket_zipline_1|  File "/opt/conda/lib/python3.6/site-packages/zipline/algorithm.py", line 485, in handle_data
        quantrocket_zipline_1|    self._handle_data(self, data)
        quantrocket_zipline_1|  File "data_integrity", line 36, in handle_data
        quantrocket_zipline_1|  File "data_integrity", line 42, in check_data
        quantrocket_zipline_1|  File "zipline/_protocol.pyx", line 121, in zipline._protocol.check_parameters.__call__.assert_keywords_and_call (zipline/_protocol.c:3824)
        quantrocket_zipline_1|  File "zipline/_protocol.pyx", line 744, in zipline._protocol.BarData.history (zipline/_protocol.c:9622)
        quantrocket_zipline_1|  File "/opt/conda/lib/python3.6/site-packages/zipline/data/data_portal.py", line 974, in get_history_window
        quantrocket_zipline_1|    field)
        quantrocket_zipline_1|  File "/opt/conda/lib/python3.6/site-packages/zipline/data/data_portal.py", line 906, in _get_history_minute_window
        quantrocket_zipline_1|    minutes_for_window,
        quantrocket_zipline_1|  File "/opt/conda/lib/python3.6/site-packages/zipline/data/data_portal.py", line 1063, in _get_minute_window_data
        quantrocket_zipline_1|    False)
        quantrocket_zipline_1|  File "/opt/conda/lib/python3.6/site-packages/zipline/data/history_loader.py", line 549, in history
        quantrocket_zipline_1|    is_perspective_after)
        quantrocket_zipline_1|  File "/opt/conda/lib/python3.6/site-packages/zipline/data/history_loader.py", line 431, in _ensure_sliding_windows
        quantrocket_zipline_1|    array = self._array(prefetch_dts, needed_assets, field)
        quantrocket_zipline_1|  File "/opt/conda/lib/python3.6/site-packages/zipline/data/history_loader.py", line 595, in _array
        quantrocket_zipline_1|    assets,
        quantrocket_zipline_1|  File "/opt/conda/lib/python3.6/site-packages/zipline/data/dispatch_bar_reader.py", line 121, in load_raw_arrays
        quantrocket_zipline_1|    for t in asset_types if sid_groups[t]}
        quantrocket_zipline_1|  File "/opt/conda/lib/python3.6/site-packages/zipline/data/dispatch_bar_reader.py", line 121, in <dictcomp>
        quantrocket_zipline_1|    for t in asset_types if sid_groups[t]}
        quantrocket_zipline_1|  File "/opt/conda/lib/python3.6/site-packages/zipline/data/continuous_future_reader.py", line 286, in load_raw_arrays
        quantrocket_zipline_1|    out[start_loc:end_loc + 1, i] = result
        quantrocket_zipline_1|ValueError: could not broadcast input array from shape (1560) into shape (1561)

This is where I start to get less sure about what's going on. If you print the timestamps generated in _minute_exclusion_tree(), they look correct. If you add the +1 offset to break_start_pos, the timestamps come out wrong but the exception goes away. However, I don't think this is the real fix, because if you inspect the out array in load_raw_arrays(), you'll see there's a NaN that lines up with 15:16:00. In other words, it's not really doing what it's supposed to, because it's trying to fetch data for a minute that should be excluded.
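The off-by-one is easier to see with plain integers. The sketch below is not zipline code; the positions and the helper name are hypothetical. It only shows that moving the start of the excluded range by one position changes the number of returned minutes by exactly one, which is the same size as the 1560 vs. 1561 mismatch. It does not settle which convention the calendar actually uses.

```python
def excluded_positions(first, last):
    """Positions removed when both endpoints are treated as inclusive."""
    return set(range(first, last + 1))


# Hypothetical minute positions within a single day of 200 slots.
pos_1516 = 100             # stands in for the position of 15:16
pos_1530 = pos_1516 + 14   # stands in for the position of 15:30
day = range(200)

# Variant A: exclude 15:16 through 15:30 (break start counted as closed).
variant_a = [p for p in day
             if p not in excluded_positions(pos_1516, pos_1530)]

# Variant B: exclude 15:17 through 15:30 (the "+1" on the break start).
variant_b = [p for p in day
             if p not in excluded_positions(pos_1516 + 1, pos_1530)]

# B returns exactly one more minute than A, and that minute is 15:16.
print(len(variant_a), len(variant_b))
```

Variant B matches the expected length, but the extra row it returns is the 15:16 slot, which would explain why a NaN shows up there if no data exists for that minute.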

Another issue I noticed is that the simulation clock doesn't support trading breaks; handle_data() continues to get dispatched during the break. You can work around this by calling get_calendar() from the algo, then calling is_open_on_minute(get_datetime()) on the calendar object to see if the exchange is actually open (this method correctly handles trading breaks). It's actually this workaround that masks the issue mentioned above with the errant NaN: 15:16:00 returns False, so you can skip that iteration of handle_data().
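A minimal sketch of that workaround, written so it runs without zipline installed. The wrapper name and the stub calendar are hypothetical; in a real algo the calendar would come from get_calendar() and the clock from get_datetime, and the real timestamps are timezone-aware, which this toy ignores.

```python
from datetime import datetime, time


def skip_when_closed(handler, calendar, now):
    """Wrap handler so it returns early whenever the calendar reports
    the exchange as closed at the current simulation minute."""
    def guarded(context, data):
        if not calendar.is_open_on_minute(now()):
            return None  # inside a break (or otherwise closed): do nothing
        return handler(context, data)
    return guarded


class StubCalendar:
    """Hypothetical stand-in for the real calendar object.
    Reports closed from 15:16 through 15:30 inclusive."""
    def is_open_on_minute(self, dt):
        return not (time(15, 16) <= dt.time() <= time(15, 30))


calls = []
current = {"dt": datetime(2021, 3, 1, 15, 15)}


def my_handle_data(context, data):
    # Record the minute at which the real handler actually ran.
    calls.append(current["dt"])


# In an algo: calendar = get_calendar(...), now = get_datetime.
handle_data = skip_when_closed(
    my_handle_data, StubCalendar(), lambda: current["dt"])

handle_data(None, None)                        # 15:15, open: handler runs
current["dt"] = datetime(2021, 3, 1, 15, 16)
handle_data(None, None)                        # 15:16, in the break: skipped
print(calls)
```

Wrapping the handler keeps the calendar check in one place instead of repeating it at the top of every scheduled function.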

At this point, I don't know enough about the internals of zipline to attempt more changes. I can work around all of this by littering my algos with trading calendar details. Given that what I want to do is already "supported" on the calendar side of things, I'm hoping that there's something simple-ish to do in zipline to connect the missing pieces.

Based on the example of the Hong Kong calendar, which has a lunch break from 12:00 - 1:00 PM, the break times are inclusive and thus in the case of GLOBEX should be:

    break_start_times = (
        (None, time(15, 16)),
    )
    break_end_times = (
        (None, time(15, 30)),
    )

Having the simulation clock respect breaks would either require excluding the breaks in zipline.algorithm.TradingAlgorithm._create_clock or skipping break minutes in zipline.gens.sim_engine.__iter__.
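As a rough illustration of the second option, here is a toy minute iterator that skips break minutes. It is not zipline's clock: the generator name, the session window, and the is-open predicate are assumptions for illustration, and the real clock also emits session start/end events that this sketch leaves out.

```python
from datetime import datetime, time, timedelta


def iter_session_minutes(session_open, session_close, is_open_on_minute):
    """Yield each minute from open to close (inclusive), skipping any
    minute for which is_open_on_minute returns False."""
    minute = session_open
    while minute <= session_close:
        if is_open_on_minute(minute):
            yield minute
        minute += timedelta(minutes=1)


def toy_is_open(dt):
    """Hypothetical predicate: closed from 15:16 through 15:30 inclusive."""
    return not (time(15, 16) <= dt.time() <= time(15, 30))


# Hypothetical one-hour window around the halt.
minutes = list(iter_session_minutes(
    datetime(2021, 3, 1, 15, 0),
    datetime(2021, 3, 1, 16, 0),
    toy_is_open,
))

# The minute before the halt is followed directly by the minute after it.
idx = minutes.index(datetime(2021, 3, 1, 15, 15))
print(minutes[idx], minutes[idx + 1])
```

With a clock like this, handle_data() would never be dispatched during the halt, so the algo would not need its own calendar check.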

Optimistically, I would think we can address this so that breaks are supported.

QuantRocket 2.6.0 provides support for trading breaks in Zipline as well as additional futures calendars to choose from.