Missing data when ingesting intraday usstock bundle

I am getting a lot of missing values when ingesting the usstock bundle, but the flightlog shows a normal ingestion with no errors.

I tried ingesting the bundle again for a subset of securities that should have pricing data for the entire data collection period (Vanguard sector ETFs), but again I am getting a lot of missing values for all the ETFs. Here is the list of sids for the ETFs:


I looked at a dataframe of the bundle using get_prices and it has a lot of missing values scattered throughout the data at random intervals.

Calculating the share of NaN values for each ETF by running:

from quantrocket import get_prices

prices = get_prices("sector-etfs", fields=["Close"])

# Get the share of NaNs in each column (one column per security)
for security in prices.columns:
    print("{} is missing {}".format(security, prices[security].isna().sum()/len(prices[security])))

Shows a lot of NaN values:

FIBBG000HSWLH9 is missing 0.6369862916400313
FIBBG000HX4281 is missing 0.6794765253214007
FIBBG000HWRGK3 is missing 0.6456332125861212
FIBBG000HTG205 is missing 0.3673968321613751
FIBBG000HSSXW1 is missing 0.36526386817245543
FIBBG000HSZQ76 is missing 0.3505589885645287
FIBBG000HWNSD9 is missing 0.5271645713473968
FIBBG000HX9TN0 is missing 0.6586959301086724
FIBBG000Q89NG6 is missing 0.048358548192343205
FIBBG000HTGPJ4 is missing 0.7229966616947227
FIBBG000HX1FV9 is missing 0.5741608068754883
FIBBG000HR9779 is missing 0.0493685631081753

Any idea why it is not ingesting all the data?

Why do you think these ETFs should have data every minute? They’re not especially liquid. Many stocks don’t trade every minute, even if they trade every day, so gaps in minute data are much more common than in daily data and are entirely expected. You shouldn’t expect a Vanguard sector ETF to be as liquid as something like AAPL (which indeed will have no gaps in its minute data).

Yes, you are right. I assumed that the minute bundle would show the last traded price at every minute, even when the security does not trade in that minute. But I see that it wouldn't make sense to do that, since the file would be unnecessarily large. Thanks.
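For anyone who does want a last-traded-price series, the gaps can be filled after loading the data rather than at ingestion time. Below is a minimal sketch using plain pandas on made-up numbers. The column names `ETF_A` and `ETF_B`, the timestamps, and the prices are all hypothetical, and the frame uses a simple DatetimeIndex for illustration, so the index of a real get_prices result may need adapting.

```python
import numpy as np
import pandas as pd

# Hypothetical minute closes; NaN marks a minute in which no trade occurred.
times = pd.date_range("2023-01-03 09:30", periods=6, freq="min")
closes = pd.DataFrame(
    {
        "ETF_A": [100.0, np.nan, np.nan, 100.5, np.nan, 101.0],
        "ETF_B": [np.nan, 50.0, 50.1, np.nan, np.nan, 50.2],
    },
    index=times,
)

# Share of minutes without a trade, per column
# (equivalent to isna().sum() / len() for each column).
missing_share = closes.isna().mean()
print(missing_share)

# Carry the last traded price forward into minutes with no trade.
# Minutes before the first trade stay NaN because there is nothing to carry.
filled = closes.ffill()
print(filled)
```

Note that a plain `ffill()` also carries the previous session's last price into the next session's opening minutes. Whether that is acceptable depends on the use case.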