Satellite process mysteriously died

Hey @Brian, I could really use some insight to figure out what's going on here. I've spent hours looking through the logs, and as far as I can see the satellite process I'm using for live trading just stopped.

Here's the setup:

  • Countdown calls Satellite script for live trading every 1 minute
  • Live trading script establishes a FileLock as sometimes the API calls made in this script take longer than 1 minute to execute
  • Once live trading script completes, FileLock is released, and the next Countdown call resumes live trading

Today I noticed live trading got stuck due to a FileLock that was never released. Upon closer inspection, it appears that QuantRocket potentially killed the process that was managing this lock, that might have caused the problem.

I'm not 100%, but what I wanted to know from you is, do the following logs look suspicious in anyway? Specifically where an Broker API request is made, but the response is never logged before the process is killed. I've commented what I consider the relevant parts. Note that I've confirmed the broker api request sent back a successful response, but it looks like the satellite process was killed before it received it.

# FileLock is established, and Broker API Request is fired
2021-09-02 12:34:50 EDT| <14>1 2021-09-02T16:34:50Z 14de65b6c526 quantrocket_satellite_1 28935 - - Broker API Request: BrokerAPIRequest(id='b86881a0', service='broker', action='request', data=BrokerRequest(id='b86881a0', brokerage='tradezero', action='locate_quote', data=Locate(symbol='IPHA', qty=100, price=None, total=None, status=None, message=None)))
2021-09-02 12:35:02 EDT| <14>1 2021-09-02T16:35:02Z 3a4b21e6bb8c quantrocket_houston_1 4696 - - 172.18.0.14 - - [02/Sep/2021:16:35:02 +0000] "GET /master/calendar?exchanges=NASDAQ&in=15min HTTP/1.1" 200 127 "-" "-"
2021-09-02 12:35:02 EDT| <14>1 2021-09-02T16:35:02Z 3a4b21e6bb8c quantrocket_houston_1 4696 - - 172.18.0.14 - - [02/Sep/2021:16:35:02 +0000] "GET /master/calendar?exchanges=NASDAQ&in=15min HTTP/1.1" 200 127 "-" "-"
2021-09-02 12:35:02 EDT| <14>1 2021-09-02T16:35:02Z 3a4b21e6bb8c quantrocket_houston_1 4696 - - 172.18.0.14 - - [02/Sep/2021:16:35:02 +0000] "GET /master/calendar?exchanges=NASDAQ&in=15min HTTP/1.1" 200 127 "-" "-"
2021-09-02 12:35:02 EDT| <14>1 2021-09-02T16:35:02Z 3a4b21e6bb8c quantrocket_houston_1 4696 - - 172.18.0.17 - - [02/Sep/2021:16:35:02 +0000] "GET /license-service/credentials/polygon HTTP/1.1" 200 48 "-" "-"
2021-09-02 12:35:02 EDT| <14>1 2021-09-02T16:35:02Z 3a4b21e6bb8c quantrocket_houston_1 4696 - - 172.18.0.17 - - [02/Sep/2021:16:35:02 +0000] "GET /license-service/credentials/alpaca HTTP/1.1" 200 201 "-" "-"
2021-09-02 12:35:02 EDT| <14>1 2021-09-02T16:35:02Z 3a4b21e6bb8c quantrocket_houston_1 4696 - - 172.18.0.17 - - [02/Sep/2021:16:35:02 +0000] "GET /license-service/credentials/polygon HTTP/1.1" 200 48 "-" "-"
2021-09-02 12:35:02 EDT| <14>1 2021-09-02T16:35:02Z 3a4b21e6bb8c quantrocket_houston_1 4696 - - 172.18.0.17 - - [02/Sep/2021:16:35:02 +0000] "GET /license-service/credentials/alpaca HTTP/1.1" 200 201 "-" "-"
2021-09-02 12:35:03 EDT| <11>1 2021-09-02T16:35:03Z d5635dcde2a3 quantrocket_flightlog_1 3934 - - 2021-09-02 12:35:03 capitalmastery: INFO Trading previous-wide-range strategy.
2021-09-02 12:35:03 EDT| <14>1 2021-09-02T16:35:03Z 3a4b21e6bb8c quantrocket_houston_1 4696 - - 172.18.0.17 - - [02/Sep/2021:16:35:03 +0000] "POST /flightlog/handler HTTP/1.1" 200 5 "-" "-"
2021-09-02 12:35:03 EDT| <11>1 2021-09-02T16:35:03Z d5635dcde2a3 quantrocket_flightlog_1 3934 - - 2021-09-02 12:35:03 capitalmastery: INFO Trading big-gap-initial-turn strategy.
2021-09-02 12:35:03 EDT| <14>1 2021-09-02T16:35:03Z 3a4b21e6bb8c quantrocket_houston_1 4696 - - 172.18.0.17 - - [02/Sep/2021:16:35:03 +0000] "POST /flightlog/handler HTTP/1.1" 200 5 "-" "-"
2021-09-02 12:35:03 EDT| <14>1 2021-09-02T16:35:03Z 3a4b21e6bb8c quantrocket_houston_1 4696 - - 172.18.0.2 - - [02/Sep/2021:16:35:03 +0000] "GET /license-service/credentials/polygon HTTP/1.1" 200 48 "-" "-"
2021-09-02 12:35:03 EDT| <14>1 2021-09-02T16:35:03Z 3a4b21e6bb8c quantrocket_houston_1 4696 - - 172.18.0.2 - - [02/Sep/2021:16:35:03 +0000] "GET /license-service/credentials/alpaca HTTP/1.1" 200 201 "-" "-"
2021-09-02 12:35:03 EDT| <11>1 2021-09-02T16:35:03Z d5635dcde2a3 quantrocket_flightlog_1 3934 - - 2021-09-02 12:35:03 capitalmastery: INFO Strategy.trade(allocations={'DU3088139': 1}, review_date=2021-09-02 12:33:32)
2021-09-02 12:35:03 EDT| <14>1 2021-09-02T16:35:03Z 3a4b21e6bb8c quantrocket_houston_1 4696 - - 172.18.0.2 - - [02/Sep/2021:16:35:03 +0000] "POST /flightlog/handler HTTP/1.1" 200 5 "-" "-"
2021-09-02 12:35:03 EDT| <11>1 2021-09-02T16:35:03Z d5635dcde2a3 quantrocket_flightlog_1 3934 - - 2021-09-02 12:35:03 capitalmastery: INFO Strategy.trade(allocations={'DU3088139': 1}, review_date=2021-09-02 12:33:32)
2021-09-02 12:35:03 EDT| <14>1 2021-09-02T16:35:03Z 3a4b21e6bb8c quantrocket_houston_1 4696 - - 172.18.0.2 - - [02/Sep/2021:16:35:03 +0000] "POST /flightlog/handler HTTP/1.1" 200 5 "-" "-"
2021-09-02 12:35:03 EDT| <14>1 2021-09-02T16:35:03Z 3a4b21e6bb8c quantrocket_houston_1 4696 - - 172.18.0.2 - - [02/Sep/2021:16:35:03 +0000] "GET /master/securities.csv?universes=previous-wide-range HTTP/1.1" 400 71 "-" "-"
2021-09-02 12:35:03 EDT| <14>1 2021-09-02T16:35:03Z 3a4b21e6bb8c quantrocket_houston_1 4696 - - 172.18.0.17 - - [02/Sep/2021:16:35:03 +0000] "GET /moonshot/orders.csv?strategies=previous-wide-range&review_date=2021-09-02+12%3A33%3A32 HTTP/1.1" 200 5 "-" "-"
2021-09-02 12:35:03 EDT| <14>1 2021-09-02T16:35:03Z 3a4b21e6bb8c quantrocket_houston_1 4696 - - 172.18.0.14 - - [02/Sep/2021:16:35:03 +0000] "POST /satellite/commands?cmd=codeload.trade.trade_strategy&params=code%3Aprevious-wide-range&params=review_date%3A2021-09-02+12%3A33%3A32&params=trade_orders%3ATrue HTTP/1.1" 200 90 "-" "-"
2021-09-02 12:35:03 EDT| <11>1 2021-09-02T16:35:03Z d5635dcde2a3 quantrocket_flightlog_1 3934 - - 2021-09-02 12:35:03 capitalmastery: INFO No previous-wide-range candidates to trade at 2021-09-02 12:33:32
2021-09-02 12:35:03 EDT| <11>1 2021-09-02T16:35:03Z d5635dcde2a3 quantrocket_flightlog_1 3934 - - 2021-09-02 12:35:03 capitalmastery: INFO No orders generated while trading previous-wide-range strategy.
2021-09-02 12:35:03 EDT| <14>1 2021-09-02T16:35:03Z 3a4b21e6bb8c quantrocket_houston_1 4696 - - 172.18.0.2 - - [02/Sep/2021:16:35:03 +0000] "POST /flightlog/handler HTTP/1.1" 200 5 "-" "-"
2021-09-02 12:35:03 EDT| <14>1 2021-09-02T16:35:03Z 3a4b21e6bb8c quantrocket_houston_1 4696 - - 172.18.0.17 - - [02/Sep/2021:16:35:03 +0000] "POST /flightlog/handler HTTP/1.1" 200 5 "-" "-"
2021-09-02 12:35:03 EDT| <14>1 2021-09-02T16:35:03Z 3a4b21e6bb8c quantrocket_houston_1 4696 - - 172.18.0.2 - - [02/Sep/2021:16:35:03 +0000] "GET /master/securities.csv?universes=big-gap-initial-turn&exclude_universes=big-gap-initial-turn-blacklist HTTP/1.1" 400 71 "-" "-"
2021-09-02 12:35:03 EDT| <14>1 2021-09-02T16:35:03Z 8432cc066d95 quantrocket_moonshot_1 44543 - - ...The work of process 4285 is done. Seeya!
2021-09-02 12:35:03 EDT| <14>1 2021-09-02T16:35:03Z 3a4b21e6bb8c quantrocket_houston_1 4696 - - 172.18.0.17 - - [02/Sep/2021:16:35:03 +0000] "GET /moonshot/orders.csv?strategies=big-gap-initial-turn&review_date=2021-09-02+12%3A33%3A32 HTTP/1.1" 200 5 "-" "-"

# Broker API response is never logged, but I see the below indicating that the process' work is done?
2021-09-02 12:35:03 EDT| <14>1 2021-09-02T16:35:03Z 14de65b6c526 quantrocket_satellite_1 28935 - - ...The work of process 5654 is done. Seeya!
2021-09-02 12:35:03 EDT| <11>1 2021-09-02T16:35:03Z d5635dcde2a3 quantrocket_flightlog_1 3934 - - 2021-09-02 12:35:03 capitalmastery: INFO No big-gap-initial-turn candidates to trade at 2021-09-02 12:33:32
2021-09-02 12:35:03 EDT| <14>1 2021-09-02T16:35:03Z 3a4b21e6bb8c quantrocket_houston_1 4696 - - 172.18.0.14 - - [02/Sep/2021:16:35:03 +0000] "POST /satellite/commands?cmd=codeload.trade.trade_strategy&params=code%3Abig-gap-initial-turn&params=review_date%3A2021-09-02+12%3A33%3A32&params=trade_orders%3ATrue HTTP/1.1" 200 90 "-" "-"
2021-09-02 12:35:03 EDT| <14>1 2021-09-02T16:35:03Z 3a4b21e6bb8c quantrocket_houston_1 4696 - - 172.18.0.2 - - [02/Sep/2021:16:35:03 +0000] "POST /flightlog/handler HTTP/1.1" 200 5 "-" "-"
2021-09-02 12:35:03 EDT| <11>1 2021-09-02T16:35:03Z d5635dcde2a3 quantrocket_flightlog_1 3934 - - 2021-09-02 12:35:03 capitalmastery: INFO No orders generated while trading big-gap-initial-turn strategy.
2021-09-02 12:35:03 EDT| <14>1 2021-09-02T16:35:03Z 3a4b21e6bb8c quantrocket_houston_1 4696 - - 172.18.0.17 - - [02/Sep/2021:16:35:03 +0000] "POST /flightlog/handler HTTP/1.1" 200 5 "-" "-"
2021-09-02 12:35:03 EDT| <14>1 2021-09-02T16:35:03Z 8432cc066d95 quantrocket_moonshot_1 44543 - - worker 1 killed successfully (pid: 4285)
2021-09-02 12:35:03 EDT| <14>1 2021-09-02T16:35:03Z 8432cc066d95 quantrocket_moonshot_1 44543 - - Respawned uWSGI worker 1 (new pid: 4294)
2021-09-02 12:35:04 EDT| <14>1 2021-09-02T16:35:04Z 3a4b21e6bb8c quantrocket_houston_1 4696 - - 172.18.0.14 - - [02/Sep/2021:16:35:04 +0000] "POST /satellite/commands?cmd=codeload.trade.trade_strategy&params=code%3Abig-gap&params=review_date%3A2021-09-02+12%3A33%3A32&params=trade_orders%3ATrue HTTP/1.1" 200 22 "-" "-"
2021-09-02 12:35:04 EDT| <14>1 2021-09-02T16:35:04Z 14de65b6c526 quantrocket_satellite_1 28935 - - ...The work of process 5651 is done. Seeya!

# Finally the worker for pid 5654 is killed
2021-09-02 12:35:04 EDT| <14>1 2021-09-02T16:35:04Z 14de65b6c526 quantrocket_satellite_1 28935 - - worker 3 killed successfully (pid: 5654)
2021-09-02 12:35:04 EDT| <14>1 2021-09-02T16:35:04Z 14de65b6c526 quantrocket_satellite_1 28935 - - Respawned uWSGI worker 3 (new pid: 5675)
2021-09-02 12:35:04 EDT| <11>1 2021-09-02T16:35:04Z d5635dcde2a3 quantrocket_flightlog_1 3934 - - 2021-09-02 12:35:04 capitalmastery: WARNING Trading for big-gap strategy is already locked; why are we trying to trade when we're already trading?
2021-09-02 12:35:04 EDT| <14>1 2021-09-02T16:35:04Z 14de65b6c526 quantrocket_satellite_1 28935 - - [deadlock-detector] a process holding a robust mutex died. recovering...

For context, here's the code that never finished

# Establish a lock to ensure only one instance of trade, per strategy, is running
lock = FileLock(f'/tmp/{code}.lock', timeout=1) # wait up to 1 second to acquire lock
    
    # Confidently trade strategy with exlusive lock
    try:
        with lock:
            logger.info(f'Trading {code} strategy.')