Volume data inconsistency in Zipline live minute pulls

Hi Brian,

Per the docs, I've been mapping Zipline's 'volume' field to 'VolumeClose' in my aggregate database. This correctly produces the cumulative volume for same-day minute pulls (using data.history()) during live trading.

I run into a problem when the live minute pull spans more than one day (e.g. try a pull with more than 390 rows). The volume for previous days is not cumulative; it is the volume traded in each single minute. My guess is that the pull is wired to combine previous days' volume data from the bundle with the current day's data from the aggregate database, hence the inconsistency.

Ideally this would be fixed to produce consistent data within the same column of the same pull, but if not, it's probably worth mentioning in the docs so people can patch it on their own.

Paul.

Volume in Zipline is not cumulative. For that reason, mapping VolumeClose has not been recommended in the docs for quite some time.

Mapping VolumeClose is described in the docs (pasted below) as the only way to get a non-sampled volume from IBKR. What other option is there if you want minute volume comparable to backtest volume data from the bundle?

My point is that if you map to VolumeClose and pull minute volume across multiple trading days, you'll get an inconsistent data series with some cumulative numbers and others not.


Real-time volume from Interactive Brokers

Real-time data from Interactive Brokers is not tick-by-tick but is sampled at a rate of 250 ms (4 samples per second) for stocks. This means that LastSizeSum will typically not contain the complete trading volume for a given minute but will only reflect the volume of the sampled trades. If this is a problem, an alternate configuration strategy is to collect and use Volume instead, as follows:

  1. In your real-time tick database, collect Volume instead of LastSize.
  2. In your aggregate database, instead of storing the Sum of LastSize, store the Close of Volume.
  3. In before_trading_start, instead of mapping Zipline's volume field to LastSizeSum, map it to VolumeClose.

The downside of this approach is that Interactive Brokers' Volume field provides the cumulative session volume, whereas the volume field in Zipline backtests represents the volume for a single minute. To get the volume for a single minute when using VolumeClose, you can take a .diff() of volume in live trading:

import zipline.api as algo

# get minute volume
volume = data.history(assets, "volume", 20, "1m")

# in live trading, volume comes from VolumeClose which
# is cumulative, so take a diff() to get minute volume
if algo.get_environment("arena") == "trade":
    volume = volume.diff()
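
The snippet above applies diff() to the whole pull, which only works if every row is cumulative. For a pull that spans several days, one possible patch is to diff only the current session's rows and leave earlier rows untouched. The following is a rough sketch, not part of the docs. The helper name normalize_live_volume and its session_date argument are hypothetical. It assumes that the pull is a DataFrame indexed by minute timestamps, that rows from earlier days are already per-minute volume (from the bundle), and that current-day rows are cumulative (from VolumeClose).

```python
import pandas as pd


def normalize_live_volume(volume: pd.DataFrame, session_date) -> pd.DataFrame:
    """Return per-minute volume for a pull that mixes per-minute rows
    (earlier days) with cumulative rows (current session).

    Hypothetical helper: assumes earlier days are already per-minute and
    only the rows dated `session_date` are cumulative.
    """
    session_date = pd.Timestamp(session_date).date()

    # Boolean mask of rows belonging to the current session
    # (index.date uses the index's own timezone, if any).
    is_today = volume.index.date == session_date

    # Work in floats so that NaN can be stored where needed.
    result = volume.astype("float64")

    today = result.loc[is_today]
    minute = today.diff()

    if is_today.any() and not is_today.all():
        # The pull reaches back into earlier days, so today's rows are
        # assumed to start at the session open, where the cumulative
        # value already equals that minute's volume.
        minute.iloc[0] = today.iloc[0]

    # Overwrite only the current-session rows.
    result.loc[is_today] = minute
    return result
```

In live trading this would replace the plain volume.diff() call, e.g. volume = normalize_live_volume(volume, algo.get_datetime().date()) after converting the timestamp to the exchange timezone if needed. For a same-day-only pull the first row remains NaN, as with a plain diff().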