[Feature Request] Cash Flow Statement Quarterly Data

The interim data are almost always quarterly, except for the Cash Flow Statement.
In this case the IB Reuters Reports use cumulative periods: 3M, 6M, 9M.

In order to have data that are comparable, if I set ‘quarterly’ in the query, the period must be the same for each record.

This is expecially important for strategies migrated from Quantopian to Quantrocket. In Quantopian are the Fundamental Data always quarterly.

This is probably best handled by an additional parameter to filter by statement lengths. Looking at the entire dataset, the statement lengths can vary widely both for the Cash Flow statements and Income statements. For quarterly statements the statement length might be 3M, 13W, 14W, depending on the company.

I think the best thing would be to standardize the data, in order to get comparable periods for all the companies and the Income and Cash Flow statements.
Instead of interim (or in addition to), we need the new parameter ‘quarterly’.

Another problem with the Cash Flow statements from IB/Reuters is that the interim data are cumulated: 3M, 6M, 9M and 12M. Therefore to get the data only for the last 3 months like in the Income Statement, one has to subtract the value from the previous period (if period > 3M).

I forget to mention, that we also need a Zipline-Extensions, for example
from zipline_extensions.pipeline.data import ReutersQuartelyFinancials

Soon or later, I’d like to migrate my Quantopian algorithms to Quanrocket and as Quantopian uses quartely data for the Fundamentals Pipeline, the new extension ReutersQuartelyFinancials would make such a migration a lot easier.

To get the most recent measure also the timeframe Trailing Twelve Months (TTM) should be added (of course only for the Income and the Cash Flow Statements).

As my algos use maily quarterly and TTMs data, actually QuantRocket is unfortunately no option for me.
Any plan to add support for “homogeneous timeframes” (annual, quartely and TTM) in fundamentals?


It’s likely that we will integrate new datasets, including fundamentals. It’s probably less likely that we would modify the timeframes of the Reuters dataset, as doing so would blur the line between the software (QuantRocket) and the data vendor (Reuters) and I’m not sure blurring that line is a good idea.

I understand your point, but it’s not really necessary to “blur the line”. I mean you can save the Reuters data in the database as they are and then leave the user the option to “normalise” them in the query result (or manage a second db with normalised data to speed up the queries).

Except for the annual data, the interim data (which are the most interessant, because more recent) are practically unusable because of a lot of different periods lenghts and units.

For example in my NYSE db for Income Statement there are 13 differents periods!

f = io.StringIO()
download_reuters_financials([“RTLR”],f, universes=[“nyse”], interim=True)
financials = pd.read_csv(f, parse_dates=[“SourceDate”, “FiscalPeriodEndDate”])
(financials[‘StatementPeriodLength’].map(str) + financials[‘StatementPeriodUnit’]).unique()

array([‘12W’, ‘16W’, ‘3M’, ‘6M’, ‘13W’, ‘12M’, ‘14W’, ‘5M’, ‘4M’, ‘1M’,
‘52W’, ‘2M’, ‘9M’], dtype=object)

For the Cash Flow Statement even worse with 23 different periods!:
['16W', '28W', '40W', '52W', '12M', '3M', '6M', '9M', '26W', '39W', '13W', '12W', '24W', '36W', '53W', '5M', '4M', '7M', '8M', '27W', '10M', '2M', '15M']

Actually is very difficult to perform a simple query like for example “give me the last 4 quarterly result for Total Revenue”.

I think the point should be addressed, because it could be the “added value” provided by Quantrocket.
As a programmer I’ve already written a Java Service that download the Reuters data from IB in XML format and put them in a relational database but the difficult task (and therefore the “added value”) would be in the post-process/homogenization of data. That “added value” would contribute to justify a paid subscription for QuantRocket.

Another use case is the following one: I test my algo in Quantopian because of their free long fundamentals data history and then trade it live with QuantRocket. Actually it’s not possible without a lot of work, because the Quantopian data are quarterly and yours (in the simplest case) annually.

QuantRocket looks very promising to me and I hope you will consider my request constructively.
We can also ask to the other members of the QuantRocket community, which is their opinion about this point.

FYI I understand your concerns… I don’t said it’s an easy task, maybe it’s even not doable…
Below the frequency distribution of the interim timeframes Reuters fundamentals… really crazy :frowning:
Looking forward to hearing your thoughts!

Income Statements
3M 8244
13W 614
6M 395
12W 34
12M 26
14W 16
16W 14
9M 6
4M 3
1M 2
2M 2
52W 1
5M 1
Cash Flow Statements
12M 2629
9M 2279
6M 2021
3M 1830
39W 188
52W 149
13W 145
26W 143
53W 16
40W 11
16W 7
28W 6
12W 4
7M 4
24W 3
36W 3
10M 2
5M 2
8M 2
15M 1
27W 1
2M 1
4M 1

I hear you. Normalization of interim periods isn’t out of the question, but some of the considerations are:

  1. Right now we still have a lot of work to do to support live trading and real time data. In order of priorities, doing anything about interim fundamentals would be at least a number of months away.
  2. Priority depends on demand. If it turns out everyone requires normalized quarterly fundamentals for QuantRocket to be usable, obviously it will have higher priority. If lots of people don’t use fundamentals or can work with annual periods or are willing to normalize within their own code, then it will have lower priority.
  3. Since QuantRocket is not the data vendor, it may be better to crowdsource/opensource the normalization rules; i.e. at the very least let pioneering users explore the data and help influence how to safely and correctly normalize it.
  4. In the end, it may (or may not) be easier/better to integrate a different fundamentals dataset that already has normalized quarterly periods.

Hi Brian,

I really appreciate your approach to the question and I agree with you that implementing the live trading and follow the road map should the remain the priority.

I think, it’s a good idea to opensource the normalization rules: I candidate myself to explore the data and trying to write a ruleset to correctly normalize it.
I’ve already started a jupyter notebook to this aim (I coudn’t attach it to this post)

Let me know how you would like to proceed!

P.S.: In my background I’m mainly a java programmer but enthusiastic to improve my python knowledge

1 Like

Before I can share with you the complete Jupyter Notebook (or shoud I use pastebin.com?), here are the most important code snippets:

# Convert all periods to weeks
cas_weekly = cas.loc[cas['StatementPeriodUnit']=='M']['StatementPeriodLength']
cas_weekly *= 4

def normalise(financials, conid, length=12):
    data = financials[financials['ConId']==conid]
    for coa in data['CoaCode'].unique():
        coa_data = data[data['CoaCode'] == coa].sort_values('FiscalPeriodEndDate', ascending=False)    
        temp = coa_data[['Amount', 'StatementPeriodLength']]
        temp = temp - temp.shift(-1)
        temp = temp[temp['StatementPeriodLength'] == length]
        coa_data = coa_data[coa_data['StatementPeriodLength'] == length]
    data = data[data['StatementPeriodLength'] == length]
    return data

At least for the Cash Flow Statements works quite well, the only problem is for ADR Stocks.
Is there a way to filter them out?

P.S.: I get a SettingWithCopyWarning data.update(coa_data) but I don’t undestand why

Most ADRs have “ADR” in the LongName field in the securities master. So the best way to filter them is to export your universe to CSV, open it in Excel or similar, filter and delete the ADR rows, then re-create your universe from the filtered CSV.

@Costantino it should now be possible to upload .ipynb files.

Here the .ipynb files use to test the normalisation:
Fundamentals CAS.ipynb (262.3 KB)
Fundamentals INC.ipynb (197.0 KB)

After excluding companies with ADR in the name and no sector, there were still a few names that couldn’t be normalised because of missing data. For the Cash Flow Statement:


For the Income Statements:



I’ve just seen the new page about Reuters: https://www.quantrocket.com/docs/data/fundamental/reuters/

There is any update regarding the problem with the differents timeframe discussed in this threads?