Get_prices handles timezones differently between history and realtime aggregate databases

Noticed that get_prices handles timezones differently depending if you’re querying history or realtime aggregate databases. This poses an issue when trying to use get_prices with both history and realtime databases at the same time; such as with live trading.

Examples:

# Getting prices from history database (the usstock minute bundle) gives prices up to 2:30PM EST
get_prices(["usstock-minute"], sids=['FIBBG00PN9XD40'], start_date="2021-05-21", end_date="2021-05-21 14:30:00", fields=["Close"])
# Getting prices from realtime aggregate database, with the same end date, only gives prices up to 10:30AM EST (basically timezone offset of -04:00 is being applied to an already offset time)
get_prices(["usstock-realtime-minute"], sids=['FIBBG00PN9XD40'], start_date="2021-05-21", end_date="2021-05-21 14:30:00", fields=["MinuteCloseClose"])

Tried specifying the timezone, but that had no effect. The only way I was able to get consistent responses when trying to get_prices from both these databases at the same time was by specify the end_date with the timezone offset applied. See below.

# Get history and realtime prices up to a specific time, with correct timezone applied
get_prices(["usstock-minute", "usstock-realtime-minute"], sids=['FIBBG00PN9XD40'], start_date="2021-05-21", end_date="2021-05-21T14:30:00-04:00", fields=["Close", "MinuteCloseClose"])

My specific problem occured when trying to run a backtest on pricing data from 2021-05-20 to 2021-05-21, and noticed that I wasn’t getting any realtime pricing data back, only history data came through. The realtime data is being used to fill in pre and post market data onto the usstock-minute history prices, so basically I didn’t get back any extended hours data. After troubleshooting I figured out the solution above, which was to specify an ISO-8601 end_date, with -04:00 offset, such as 2021-05-21T20:00:00-04:00 in order to get a correctly populated prices dataframe. While specifying an end-time sort of made sense, I was surprised to find that the -04:00 offset was required.

In short, I would expect the following to work to give me prices from both the history and realtime database through the end of 2021-05-21, without having to specify an end time, or a time with a specific timezone offset.

get_prices(["usstock-minute", "usstock-realtime-minute"], sids=['FIBBG00PN9XD40'], start_date="2021-05-21", end_date="2021-05-21", fields=["Close", "MinuteCloseClose"])

Any help/guidance is appreciated; thanks!

I see two different issues in your post, both of which could be improved.

  1. When the end_date used in a realtime database query consists of a date only with no time, the realtime service should treat it as inclusive, which is what the history and zipline service do. Currently the date is normalized to <date> 00:00:00, so it’s exclusive.
  2. When the start_date/end_date consists of a datetime and you pass a timezone to get_prices (or you’ve set Moonshot.TIMEZONE if using Moonshot), the timezone should be passed along to the realtime service to help it interpret the start_date/end_date parameters. It doesn’t currently do that.

Real-time data is stored in UTC while history databases and Zipline bundles are stored/returned in the exchange timezone. So when querying multiple databases, I do think it’s appropriate and necessary to have to specify a timezone to avoid ambiguity.

Thanks for the response Brian. Agree with you on both points; I think those changes would make the user experience better. In the meantime, I’ll be sure to always include the timezone offset to ensure consistency.

Just for context, my use case is that I use a history database that’s updated each night + a realtime aggregate database within my strategies that run every minute throughout the day. This setup provides a nice, always up to date, data stream that’s always available. I also combine the history and realtime prices into a single set of OHLCV columns which makes things simple for my strategy; it just has a single prices dataframe to worry about.

Thanks brother!