Unable to Upload Data to Custom Historical Database

Post Summary

I'm trying to populate a custom historical database with transformed CSV data (Date, Sid, OHLCV) in QuantRocket v2.11.0. Database creation via create_custom_db() works perfectly, but I cannot find any working method to upload the data.

I've systematically tested 6 different approaches:

  • CLI commands (history load, history collect --filenames) → Commands don't exist

  • Direct SQLite access → Blocked by container isolation

  • HTTP API endpoints (/custom, /data, /csv) → All return 404 Not Found

Question: What is the correct method to upload CSV data to a custom historical database in QuantRocket v2.11.0?

More technical details:

Issue Summary

Unable to upload CSV data to a custom historical database in QuantRocket v2.11.0. All documented and attempted methods fail. Database creation succeeds, but no data upload method works.

Expected Behavior: Ability to populate a custom historical database with transformed CSV data (Date, Sid, OHLCV columns).

Actual Behavior: All tested upload methods fail with various errors (404 Not Found, command not found, container isolation errors).


What We're Trying to Do

Objective

Create and populate a custom historical database (parquet-stk-1min) with 10 years of 1-minute OHLCV bar data for 3 symbols.

Data Specifications

  • Source: parquet file containing irregular tick data

  • Transformation: 5-phase pipeline to create uniform 1-minute bars (sketched in code after this list):

    1. Floor timestamps to 1-minute boundaries

    2. Aggregate duplicates using OHLCV rules (first/max/min/last/sum)

    3. Convert timezone from America/New_York to UTC

    4. Reindex to NYSE trading calendar with extended hours (4:00-20:00 ET)

    5. Forward-fill OHLC prices, zero-fill Volume

  • Output: Clean DataFrame with columns: Date (UTC), Sid (Bloomberg format), Open, High, Low, Close, Volume (int64)

  • Data Volume: ~7.3 million rows (3 symbols × 2,520 trading days × 961 bars/day)
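For reference, a minimal pandas sketch of this pipeline. The tick columns Price and Size are hypothetical stand-ins for our parquet schema, and only a single session grid is shown; the real script iterates over the NYSE trading calendar:

```python
import pandas as pd

def to_one_minute_bars(ticks: pd.DataFrame) -> pd.DataFrame:
    """Collapse irregular ticks into uniform 1-minute OHLCV bars.

    Expects a tz-aware America/New_York DatetimeIndex and hypothetical
    columns Price and Size.
    """
    # Phases 1-2: resample floors timestamps to 1-minute boundaries and
    # aggregates duplicates with the first/max/min/last/sum rules
    bars = ticks.resample("1min").agg(
        {"Price": ["first", "max", "min", "last"], "Size": "sum"}
    )
    bars.columns = ["Open", "High", "Low", "Close", "Volume"]
    # Phase 3: convert from exchange time to UTC
    bars = bars.tz_convert("UTC")
    # Phase 4: reindex to the extended-hours grid (4:00-20:00 ET; one
    # session shown, giving the 961 bars/day cited above)
    session = pd.date_range(
        "2020-06-15 04:00", "2020-06-15 20:00",
        freq="1min", tz="America/New_York",
    ).tz_convert("UTC")
    bars = bars.reindex(session)
    # Phase 5: forward-fill OHLC prices, zero-fill Volume
    price_cols = ["Open", "High", "Low", "Close"]
    bars[price_cols] = bars[price_cols].ffill()
    bars["Volume"] = bars["Volume"].fillna(0).astype("int64")
    return bars
```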

Database Configuration


```python
from quantrocket.history import create_custom_db

create_custom_db(
    code='parquet-stk-1min',
    bar_size='1 min',
    columns={
        'Open': 'float',
        'High': 'float',
        'Low': 'float',
        'Close': 'float',
        'Volume': 'int'
    }
)
```

Result: :white_check_mark: Database creation succeeds without errors.
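As a quick sanity check, the configuration can be read back with functions that do exist in v2.11.0 (see the dir(history) listing in Appendix A):

```python
from quantrocket.history import get_db_config, list_databases

print(list_databases())                   # should include 'parquet-stk-1min'
print(get_db_config('parquet-stk-1min'))  # echoes bar_size and columns
```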


Methods Tested and Results

We systematically tested 6 different upload methods. All failed.

Test Setup

  • Created test database: test-upload-method

  • Test dataset: 100 rows (1 symbol, 100 minutes of XYZ data)

  • Schema validated: Date (UTC timezone-aware), Sid (FIBBG*********), OHLCV columns
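For context, a minimal sketch of the generator behind create_test_dataframe(). Values are synthetic, and the default Sid is the redacted placeholder from this post; substitute a real Bloomberg-style identifier:

```python
import numpy as np
import pandas as pd

def create_test_dataframe(n: int = 100, sid: str = "FIBBG*********") -> pd.DataFrame:
    """Generate n minutes of synthetic bars matching the validated schema."""
    rng = np.random.default_rng(0)
    dates = pd.date_range("2020-06-15 13:30", periods=n, freq="1min", tz="UTC")
    close = 300 + rng.normal(0, 0.3, n).cumsum()
    open_ = close + rng.normal(0, 0.1, n)
    return pd.DataFrame({
        "Date": dates,                    # datetime64[ns, UTC]
        "Sid": sid,                       # object
        "Open": open_,
        "High": np.maximum(open_, close) + np.abs(rng.normal(0, 0.2, n)),
        "Low": np.minimum(open_, close) - np.abs(rng.normal(0, 0.2, n)),
        "Close": close,
        "Volume": rng.integers(1000, 10000, n).astype("int64"),
    })
```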

Method 1: CLI quantrocket history load

Attempt:


```shell
quantrocket history load test-upload-method --file /codeload/data/test_upload.csv
```

Result: :x: FAILED


```text
quantrocket history: error: argument subcommand: invalid choice: 'load'
(choose from 'create-custom-db', 'create-edi-db', 'create-ibkr-db',
'create-sharadar-db', 'create-usstock-db', 'list', 'config', 'drop-db',
'collect', 'queue', 'cancel', 'wait', 'sids', 'get')
```

Conclusion: load subcommand does not exist in v2.11.0 CLI.


Method 2: CLI quantrocket history collect --filenames

Attempt:


```shell
quantrocket history collect test-upload-method --filenames /codeload/data/test_upload.csv
```

Result: :x: FAILED


```text
quantrocket: error: unrecognized arguments: --filenames /codeload/data/test_upload.csv
```

Investigation: Checked quantrocket history collect --help. The collect command is for fetching data from vendors (IBKR, Sharadar), not for loading custom CSV files. No --filenames parameter exists.

Conclusion: collect command is not designed for custom data upload.


Method 3: Direct SQLite Access via quantrocket.db

Attempt:


```python
from quantrocket.db import connect_sqlite, insert_or_replace

db_path = '/var/lib/quantrocket/quantrocket.history.test-upload-method.sqlite'
conn = connect_sqlite(db_path)
insert_or_replace(df, 'prices', conn)
```

Result: :x: FAILED


```text
(sqlite3.OperationalError) unable to open database file
(Background on this error at: https://sqlalche.me/e/14/e3q8)
```

Root Cause: Container isolation. The codeload container cannot access the history service container's filesystem at /var/lib/quantrocket/.

Conclusion: Direct SQLite access blocked by QuantRocket's microservices architecture.


Method 4: HTTP API - Multipart Upload to /custom

Attempt:


```python
import requests

csv_data = df.to_csv(index=False)
url = 'http://houston/history/databases/test-upload-method/custom'
files = {'file': ('data.csv', csv_data, 'text/csv')}
response = requests.post(url, files=files, timeout=60)
```

Result: :x: FAILED - HTTP 404 Not Found


```html
<!doctype html>
<html lang=en>
<title>404 Not Found</title>
<h1>Not Found</h1>
<p>The requested URL was not found on the server. If you entered the URL
manually please check your spelling and try again.</p>
```

Conclusion: /custom endpoint does not exist in v2.11.0.


Method 5: HTTP API - Raw CSV POST to /data

Attempt:


```python
import requests

csv_data = df.to_csv(index=False)
url = 'http://houston/history/databases/test-upload-method/data'
response = requests.post(
    url,
    data=csv_data,
    headers={'Content-Type': 'text/csv'},
    timeout=60
)
```

Result: :x: FAILED - HTTP 404 Not Found

Also tried: PUT to /data endpoint → Same 404 error.

Conclusion: /data endpoint does not exist in v2.11.0.


Method 6: HTTP API - Raw CSV POST to /csv

Attempt:


```python
import requests

csv_data = df.to_csv(index=False)
url = 'http://houston/history/databases/test-upload-method/csv'
response = requests.post(
    url,
    data=csv_data,
    headers={'Content-Type': 'text/csv'},
    timeout=60
)
```

CSV Format Sent:


```csv
Date,Sid,Open,High,Low,Close,Volume
2020-06-15 13:30:00+00:00,FIBBG*********,300.248,300.248,299.585,299.585,8266
2020-06-15 13:31:00+00:00,FIBBG*********,299.719,300.305,299.719,299.719,9702
...
```

Result: :x: FAILED - HTTP 404 Not Found

Conclusion: /csv endpoint does not exist in v2.11.0 (despite being suggested in some documentation).


Summary of Failures

| Method | Approach | Error | Root Cause |
|--------|----------|-------|------------|
| 1 | CLI history load | Invalid choice: 'load' | Command doesn't exist in v2.11.0 |
| 2 | CLI history collect --filenames | Unrecognized argument | Not designed for custom data |
| 3 | Direct SQLite via quantrocket.db | Unable to open database file | Container isolation |
| 4 | HTTP POST to /custom | 404 Not Found | Endpoint doesn't exist |
| 5 | HTTP POST/PUT to /data | 404 Not Found | Endpoint doesn't exist |
| 6 | HTTP POST to /csv | 404 Not Found | Endpoint doesn't exist |

Status: :x: NO WORKING METHODS FOUND


Questions for QuantRocket Support

  1. What is the correct method to upload CSV data to a custom historical database in QuantRocket v2.11.0?

  2. Is there a Python API function we should use? We tried:

    • push_history() - ImportError (function doesn't exist in v2.11.0)

    • load_custom_data() - ImportError (function doesn't exist in v2.11.0)

  3. Is there a CLI command we missed? We checked all subcommands of quantrocket history - none appear to support custom data upload.

  4. Is there a specific HTTP endpoint for custom database ingestion? We tested /custom, /data, and /csv - all returned 404.

  5. Has the upload method changed between QuantRocket versions? If so, what is the v2.11.0-specific approach?

  6. Is there documentation for this workflow? We reviewed the platform documentation but could not find clear guidance on uploading to custom historical databases.


Code Artifacts

Test Script

Location: test_db_upload.py

Purpose: Systematically tests all 6 upload methods with a small dataset (100 rows).

Key Functions:

  • create_test_dataframe() - Generates valid test data matching QuantRocket schema

  • test_method_cli_load() - Tests CLI load command

  • test_method_cli_collect() - Tests CLI collect with filenames

  • test_method_sqlite_direct() - Tests direct SQLite access

  • test_method_http_api() - Tests HTTP multipart upload

  • test_method_http_raw_csv() - Tests HTTP raw CSV to /data

  • test_method_http_csv_endpoint() - Tests HTTP to /csv endpoint

Main Implementation Script

Location: create_parquet_database.py

Purpose: Production script for creating and populating parquet-stk-1min database.

Status:

  • :white_check_mark: Data transformation pipeline complete (5 phases, ~7.3M rows generated)

  • :white_check_mark: Database creation successful

  • :x: BLOCKED: Cannot upload data (no working method found)


Environment Details

See Appendix A: System Environment for complete version information.


Appendix A: System Environment

QuantRocket Version


```shell
$ quantrocket version
2.11.0.0
```

Python Environment


```shell
$ python3 --version
Python 3.11.10
```

Key Python Libraries


```text
pandas==2.0.3
numpy==1.24.4
requests==2.31.0
zipline-reloaded==3.0.4
```

Container Information

  • Deployment: QuantRocket Docker-based deployment

  • Container: Running inside codeload service container

  • History Service: Separate microservice container (container isolation prevents direct SQLite access)

  • Houston Gateway: HTTP API gateway at http://houston
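Since every HTTP attempt goes through houston, a useful first check is whether the gateway itself responds on an endpoint known to exist. GET /history/databases is assumed here based on the mapping from the `quantrocket history list` CLI command:

```python
import requests

# Assumed endpoint backing `quantrocket history list`; a 200 here would
# confirm the gateway is reachable and the 404s above are endpoint-specific
response = requests.get('http://houston/history/databases', timeout=10)
print(response.status_code, response.text[:200])
```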

Platform Details


```shell
$ uname -a
Linux 5.15.167.4-microsoft-standard-WSL2 x86_64 GNU/Linux
```

Available QuantRocket CLI Commands


```text
$ quantrocket history --help
Subcommands:
  - create-custom-db
  - create-edi-db
  - create-ibkr-db
  - create-sharadar-db
  - create-usstock-db
  - list
  - config
  - drop-db
  - collect
  - queue
  - cancel
  - wait
  - sids
  - get
```

Note: No load, push, upload, or similar data ingestion commands are available.

Python API Inspection


```python
>>> from quantrocket import history
>>> dir(history)
['cancel_collections', 'collect_history', 'config_db', 'create_custom_db',
 'create_edi_db', 'create_ibkr_db', 'create_sharadar_db', 'create_usstock_db',
 'download_history_file', 'drop_db', 'get_db_config', 'get_history_queue',
 'list_databases', 'wait_for_collections']
```

Missing Functions (that we attempted to use):

  • push_history - Does not exist

  • load_custom_data - Does not exist


Appendix B: Sample Data Format

CSV Format (Generated from DataFrame)


```csv
Date,Sid,Open,High,Low,Close,Volume
2020-06-15 13:30:00+00:00,FIBBG*********,300.248357,300.248357,299.585502,299.585502,8266
2020-06-15 13:31:00+00:00,FIBBG*********,299.719909,300.305185,299.719909,299.719909,9702
2020-06-15 13:32:00+00:00,FIBBG*********,300.373647,300.373647,299.858856,300.373647,1384
2020-06-15 13:33:00+00:00,FIBBG*********,300.305185,300.305185,299.629993,300.305185,1404
2020-06-15 13:34:00+00:00,FIBBG*********,299.989549,300.126869,299.989549,299.989549,9175
```

DataFrame Schema (as sent to upload functions)


```text
Columns: ['Date', 'Sid', 'Open', 'High', 'Low', 'Close', 'Volume']
Dtypes:
  - Date: datetime64[ns, UTC]
  - Sid: object (Bloomberg format, e.g., 'FIBBG*********')
  - Open: float64
  - High: float64
  - Low: float64
  - Close: float64
  - Volume: int64
```
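Before handing the DataFrame to any upload path, a small check against this schema catches silent dtype drift. A sketch; the expected dtypes are exactly those listed above:

```python
import pandas as pd

# Expected dtypes, copied from the schema listing above
EXPECTED_DTYPES = {
    "Date": "datetime64[ns, UTC]",
    "Sid": "object",
    "Open": "float64",
    "High": "float64",
    "Low": "float64",
    "Close": "float64",
    "Volume": "int64",
}

def validate_schema(df: pd.DataFrame) -> None:
    """Raise AssertionError if df deviates from the documented schema."""
    assert list(df.columns) == list(EXPECTED_DTYPES), df.columns.tolist()
    for col, expected in EXPECTED_DTYPES.items():
        actual = str(df[col].dtype)
        assert actual == expected, f"{col}: {actual} != {expected}"
```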

Database Configuration (as created)


```python
{
    'code': 'parquet-stk-1min',
    'bar_size': '1 min',
    'columns': {
        'Open': 'float',
        'High': 'float',
        'Low': 'float',
        'Close': 'float',
        'Volume': 'int'
    }
}
```


Appendix C: Test Script Output

Full test run output (all 6 methods):


```text
**********************************************************************
QUANTROCKET v2.11.0 DATA UPLOAD METHOD TEST
**********************************************************************
Goal: Determine which upload method works for custom historical databases
Test database: test-upload-method
Test data: 100 rows (1 symbol, 100 minutes)
======================================================================
SETUP: Dropping existing test database (if any)
======================================================================
✓ Dropped existing database
======================================================================
SETUP: Creating test database
======================================================================
✓ Created database: test-upload-method
  Schema: Date, Sid, Open, High, Low, Close, Volume
======================================================================
Creating test DataFrame...
======================================================================
✓ Created test DataFrame: 100 rows
  Date range: 2020-06-15 13:30:00+00:00 to 2020-06-15 15:09:00+00:00
  Sid: FIBBG*********
  Schema: ['Date', 'Sid', 'Open', 'High', 'Low', 'Close', 'Volume']
First 5 rows:
                       Date             Sid  ...       Close  Volume
0 2020-06-15 13:30:00+00:00  FIBBG*********  ...  299.585502    8266
1 2020-06-15 13:31:00+00:00  FIBBG*********  ...  299.719909    9702
2 2020-06-15 13:32:00+00:00  FIBBG*********  ...  300.373647    1384
3 2020-06-15 13:33:00+00:00  FIBBG*********  ...  300.305185    1404
4 2020-06-15 13:34:00+00:00  FIBBG*********  ...  299.989549    9175
[5 rows x 7 columns]
[... All 6 methods failed as documented above ...]
======================================================================
TEST RESULTS SUMMARY
======================================================================
cli_load             : ✗ FAILED
cli_collect          : ✗ FAILED
sqlite_direct        : ✗ FAILED
http_api_multipart   : ✗ FAILED
http_api_raw_csv     : ✗ FAILED
http_csv_endpoint    : ✗ FAILED
❌ NO WORKING METHODS FOUND
   This requires further investigation or QuantRocket support
**********************************************************************
TEST COMPLETE
**********************************************************************
```

Answer

Method 3 is the correct method. Are you running those commands from JupyterLab? Any container with volumes: 'db:/var/lib/quantrocket' in docker-compose.yml has access to /var/lib/quantrocket/ (the database directory). For example:

```yaml
  jupyter:
    ..
    volumes:
      - 'db:/var/lib/quantrocket'
```

Try loading the data from a JupyterLab notebook. Alternatively, you can load it from a custom script running in the satellite container.
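Based on this answer, a minimal sketch of Method 3 run from a container that mounts the db volume. The filename pattern and the table name 'prices' are carried over from Method 3 above; verify the table name against the actual schema of the created database:

```python
import pandas as pd
from quantrocket.db import connect_sqlite, insert_or_replace

def load_prices(df: pd.DataFrame, code: str = 'parquet-stk-1min') -> None:
    """Insert a (Date, Sid, OHLCV) DataFrame into a custom history database.

    Must run in a container that mounts the 'db' volume (e.g. jupyter or
    satellite), so /var/lib/quantrocket/ is accessible.
    """
    # Filename pattern taken from Method 3 above
    db_path = f'/var/lib/quantrocket/quantrocket.history.{code}.sqlite'
    conn = connect_sqlite(db_path)
    # Table name 'prices' carried over from Method 3; confirm against schema
    insert_or_replace(df, 'prices', conn)
```

From a JupyterLab notebook, calling load_prices(df) with the transformed DataFrame should then succeed where the same code failed in the codeload container.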