Quick and dirty way to subset the market

If you have a small account, your strategy can generate more signals than you have capital to cover. Maybe your backtest tearsheet says it holds 600 issues and the orders are all rounded down to … 0 lots.

Say you want to hold 60 issues. You need to create an investment universe that is 1/10th the size. That is a bit of a hassle.

This code gives you a stable, effectively random subset of your existing universe. You can dial the size of your subset up and down by changing one number:

def prices_to_signals(self, prices):
   signal = ... # your secret sauce

   # Divide the universe into 100 groups
   hashes = prices.loc['Open'].apply(lambda _: list(map(lambda h: hash(h) % 100, prices.columns)), axis=1, result_type='broadcast')

   # Pick 10 of them, so now you're playing with 10/100th = 1/10th of the universe
   hash_ok = hashes < 10

   # Mask the signal to that subset
   signal = signal & hash_ok

The hashes = ... line takes your SIDs, runs them through a hash function (here Python's built-in hash), which turns each one into a large integer, and then takes that integer modulo 100 to sort the SIDs into 100 groups (0, 1, … 99).

The hash_ok = ... line says: give me groups 0, 1, … 9. That is (roughly) 10/100ths = 1/10th of the universe, and it is used to mask your signal. (I have a boolean buy signal; for a float signal you could cast hash_ok to float and multiply with *.)
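For the float-signal case, a minimal sketch of the masking step might look like the following. The two small DataFrames and their column names are made up for illustration; in a real strategy, signal and hash_ok would come from your own code.

```python
import pandas as pd

# Hypothetical float signal: rows are dates, columns are securities
signal = pd.DataFrame({"A": [0.5, -0.25], "B": [1.0, 0.75]})

# Hypothetical mask: security A is in the chosen groups, B is not
hash_ok = pd.DataFrame({"A": [True, True], "B": [False, False]})

# Cast the boolean mask to 1.0/0.0 and multiply, zeroing out excluded columns
masked = signal * hash_ok.astype(float)
print(masked)
```

Another option with the same effect is signal.where(hash_ok, 0).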

Because hash gives the same result each time, the grouping is stable, so you can live trade with this and it will not thrash your portfolio.

When your account grows and you want to take more positions, you can do that by changing < 10 to < 15 or whatever. Because this includes the original subset (0, 1, … 9) there’s no extra churn to the rebalancing.
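A quick way to convince yourself of the no-extra-churn property is to check that the smaller subset is contained in the larger one. This is only a sketch: the SIDs are invented, and CRC32 from zlib is used as a stand-in bucket function (an assumption, any deterministic function gives the same nesting).

```python
import zlib

# Made-up security IDs, purely for illustration
sids = [f"SID{i}" for i in range(1000)]

# Assign each SID to one of 100 groups
buckets = {sid: zlib.crc32(sid.encode("utf-8")) % 100 for sid in sids}

# Subset with the original threshold, and with the widened one
small = {sid for sid, b in buckets.items() if b < 10}
large = {sid for sid, b in buckets.items() if b < 15}

# Everything held before is still held; only groups 10..14 are new
added = large - small
print(len(small), len(large), small <= large)
```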

(It would be neat if QuantRocket supported “undercapitalized” accounts. I love Moonshot but it takes some work to get it to avoid churning a small portfolio.)

I just realized that Python 3.3+ salts hash(...) of strings with a random value chosen when the process starts (unless PYTHONHASHSEED is set)! So this is not stable once you restart QuantRocket. But the same idea applies as long as you use a stable hash.
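One way to get a stable hash is a checksum such as CRC32 from the standard library's zlib module, which is not salted. The sketch below is one possible approach rather than a tested Moonshot recipe; stable_bucket and hash_mask are hypothetical helper names, and the example DataFrame stands in for prices.loc['Open'].

```python
import zlib

import numpy as np
import pandas as pd


def stable_bucket(sid, n_buckets=100):
    """Map a security ID to a group in [0, n_buckets) using unsalted CRC32."""
    return zlib.crc32(str(sid).encode("utf-8")) % n_buckets


def hash_mask(prices_field, keep=10, n_buckets=100):
    """Boolean DataFrame shaped like prices_field, True for columns in groups below keep."""
    buckets = np.array([stable_bucket(sid, n_buckets) for sid in prices_field.columns])
    keep_cols = buckets < keep
    # Repeat the per-column flag on every row so the mask lines up with the signal
    return pd.DataFrame(
        np.tile(keep_cols, (len(prices_field.index), 1)),
        index=prices_field.index,
        columns=prices_field.columns,
    )


# Stand-in for prices.loc['Open']: 3 dates by 200 made-up SIDs
example_open = pd.DataFrame(
    1.0,
    index=pd.date_range("2024-01-01", periods=3),
    columns=[f"SID{i}" for i in range(200)],
)
hash_ok = hash_mask(example_open, keep=10)
print(int(hash_ok.iloc[0].sum()), "of", hash_ok.shape[1], "securities kept")
```

Inside prices_to_signals, the mask would then be applied as before with signal & hash_ok. Because CRC32 depends only on the bytes of the SID, the grouping should come out the same after a restart.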