Marketside Chats #4: Costs of replicating an S&P 500 portfolio

What are stock indexes?

(For a stock index, this is the correct plural – not “indices”.)

Wikipedia defines a stock index as “a method of measuring the value of a section of the stock market”. An index includes stocks

  1. that are similar in some meaningful way. Examples: all technology stocks, the biggest 100 stocks trading on the NASDAQ stock exchange, all stocks trading on the New York Stock Exchange.
  2. with meaningful weights. One accepted way is “market-capitalization-weighted”; e.g. Apple has 100x the market value of US Steel, so it will have a 100x bigger weight in such an index. Other schemes are equal-weighted and, in the case of the DJIA (Dow Jones Industrial Average), price-weighted.
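The three weighting schemes can be compared with a small sketch. It borrows the approximate AA and MMM prices and market caps quoted in footnote (*1); the two-stock "index" and the helper name `normalize` are assumptions made purely for illustration.

```python
# Approximate figures for AA and MMM as quoted in footnote (*1).
# The two-stock "index" itself is hypothetical.
stocks = {
    "AA": {"price": 12.0, "market_cap": 13e9},
    "MMM": {"price": 130.0, "market_cap": 90e9},
}


def normalize(raw):
    """Scale raw values so that the resulting weights sum to 1."""
    total = sum(raw.values())
    return {ticker: value / total for ticker, value in raw.items()}


# Market-cap weighting: weight is proportional to company value.
cap_weights = normalize({t: s["market_cap"] for t, s in stocks.items()})
# Price weighting (DJIA style): weight is proportional to share price.
price_weights = normalize({t: s["price"] for t, s in stocks.items()})
# Equal weighting: every stock gets the same weight.
equal_weights = normalize({t: 1.0 for t in stocks})

for name, weights in [
    ("market-cap", cap_weights),
    ("price", price_weights),
    ("equal", equal_weights),
]:
    print(name, {t: round(w, 3) for t, w in weights.items()})
```

With these inputs, MMM gets a larger weight under price weighting than under market-cap weighting, which is the distortion footnote (*1) describes.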

Why do they exist?

Most people use them as reference points. In news & discussions, “the market was up today” typically means the Dow Jones Industrial Average (DJIA) was up. This index includes 30 large stocks, and despite its deficiencies, it is the de facto most used index. Though most stocks are correlated, it is always possible for one subset of the market to go up and another to go down, so there’s never a single definition of “the market is up”.

However, indexes are proprietary products. The creator of an index makes money licensing it to others. In a way, the real reason for their existence is that the creators want to make money. However, to do so, they need to create an index that others will want to mimic. As a counterexample, if I create an index that includes all stocks that start with a vowel, I cannot license it to any mutual fund, because nobody will want to match it.

There are many financial instruments (e.g. mutual funds, and exchange traded funds – ETFs) that attempt to mimic the performance of well-known indexes. The investing public likes those instruments because they provide exposure to a wide, diversified set of stocks. In addition, some are cheap to own, i.e. their expense ratio (annual fees as a %) tends to be low. Finally, some are also cheap to trade: the spread (ask price minus bid price) may be only $0.01 on average for e.g. SPY, an ETF that tracks the S&P 500 index (SPX), which currently trades around $185 – so less than 0.01% of its price.

Market practitioners typically follow SPX, because it covers more stocks (500) than DJIA (30), but also because it uses a more reasonable weighting methodology. (*1)

Index vs fund that matches an index

An index is a calculation. You cannot trade an index directly. What you can trade in most cases is:

  1. financial instruments (ETFs, mutual funds) that attempt to match an index. However, these have fees, and are subject to ‘tracking error’, as it is very difficult to hold all stocks in the index at exactly the right proportions, and to buy or sell stocks when they enter/exit the index while doing it at the same price used in the index calculation. That is, the performance of an ETF matching an index will not be 100% the same as that of the index itself.
  2. futures contracts, which are essentially bets on how high/low an index will go in a certain period of time. These also cannot match an index perfectly, although for different reasons. The chief reason is temporary supply/demand dislocations that may cause them to trade away from their fair value (beyond the scope of this post). Also, futures contracts are harder for retail traders (i.e. you and me) to trade, and they cover fewer indexes than ETFs do.
For instance, SPY is an ETF that you can actually buy and hold in your retail brokerage account.

Replicating a portfolio 

One is said to have a position that is…

  • long N shares, when one owns N shares of a security
  • short N shares, when one owes N shares of a security
  • flat (or zero), if neither is true

A portfolio is a set of positions. A replicating portfolio is a portfolio that behaves (almost) the same. For example, if portfolio A holds $1m of a single security that tracks SPX (e.g. the SPY ETF), and portfolio B holds $1m of the same 500 stocks in SPX & at the same weights (e.g. 50 Apple, 20 GOOG, etc.), then B is a portfolio that replicates A.

Replicating SPX

In some cases, you’d want to replicate an SPX portfolio with all 500 stocks. “Why” is beyond the scope of this article. Ignoring the real-life constraint that you cannot buy fractions of a share, you would accomplish this by splitting your money across all 500 stocks in the same proportion as SPX.
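As a rough sketch of that split, the function below turns a budget and a set of index weights into (fractional) share counts. The three tickers, their weights, and their prices are hypothetical stand-ins for the real 500 constituents, and `replicate` is an illustrative name.

```python
def replicate(budget, weights, prices):
    """Split `budget` across stocks in proportion to their index weights.

    Returns fractional share counts, i.e. it ignores the real-life
    constraint that only whole shares can be bought.
    """
    return {ticker: budget * w / prices[ticker] for ticker, w in weights.items()}


# Hypothetical three-stock index (weights sum to 1) and made-up prices.
weights = {"AAA": 0.5, "BBB": 0.3, "CCC": 0.2}
prices = {"AAA": 100.0, "BBB": 50.0, "CCC": 20.0}

shares = replicate(100_000, weights, prices)
for ticker, count in shares.items():
    print(f"{ticker}: {count:.2f} shares, ${count * prices[ticker]:,.0f}")
```

The dollar amount held in each stock is simply the budget times that stock's weight, so the replicating portfolio has the same proportions as the index.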

Nothing too complicated so far. However, how can one compare the trading costs of buying, say, $100k of SPX vs $100k of the underlying portfolio? If needed here, see marketside chat #1 for the definitions of bid price, ask price, fair value, and half-spread. A reasonable approach is to compare the spread of an instrument that tracks the SPX (e.g. the SPY ETF) (*2) against all 500 spreads for the SPX stocks, weighted by each stock’s relative weight in the index. E.g. if AAPL (Apple) makes up 3% of the SPX index, its spread matters much more in the final result than that of a small stock.

Methodology of comparison

For each individual stock, you need to use some average of its spread over the last few days, as it may jump around.

A basis point is 0.01%. It’s widely used in trading only because it is faster to say “three bips” than “zero point zero three percent”, and because the human brain can process small whole numbers more intuitively.
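The conversion from a dollar spread to basis points is a one-liner. The sketch below applies it to the SPY figures mentioned earlier (a $0.01 spread on a price of about $185); the function name `spread_in_bps` is just an illustrative choice.

```python
def spread_in_bps(spread_dollars, price_dollars):
    """Express a bid-ask spread as basis points (1 bp = 0.01%) of the price."""
    return spread_dollars / price_dollars * 10_000


spy_bps = spread_in_bps(0.01, 185.0)
print(f"{spy_bps:.2f} bps")  # prints "0.54 bps"
```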
SPY has a spread of almost 1 cent (*3), so $0.01 / $185 ~= 0.5 bps. What is the statistical distribution of spreads for the 500 US stocks? Using some unimpressive but expedient awk, we get:
awk '{s=int(100*$5)/100; print( (s<0.1) ? s : "0.10 or more")}' | sort -rg | uniq -c
22 0.10 or more
4 0.09
5 0.08
7 0.07
12 0.06
12 0.05
17 0.04
30 0.03
65 0.02
326 0.01
These are counts of spreads that are rounded down. So 500-326=174 stocks have average spreads that are 2 cents or more.
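For readers who prefer Python to awk, the same bucketing can be sketched as follows. The sample spreads are made up for illustration (the original input file is not shown here), and `bucket_label` is a hypothetical helper that mirrors the awk logic: truncate to whole cents and lump everything from 10 cents up into one bucket.

```python
from collections import Counter


def bucket_label(spread_dollars):
    """Round a spread down to whole cents; group 10 cents and above together."""
    cents = int(spread_dollars * 100)  # int() truncates, like awk's int()
    return "0.10 or more" if cents >= 10 else f"0.{cents:02d}"


# Made-up average spreads in dollars, one per stock.
sample_spreads = [0.0123, 0.0187, 0.0250, 0.0312, 0.1540]

counts = Counter(bucket_label(s) for s in sample_spreads)
for label, count in sorted(counts.items(), reverse=True):
    print(count, label)
```

Because the spreads are rounded down, a stock in the "0.01" bucket has an average spread of at least 1 cent but less than 2 cents.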
However, a more natural way to compare spreads across stocks is to express them as a fraction of stock price. A similar awk command yields (in bps):
17 10 or more
5 9
11 8
23 7
44 6
40 5
87 4
135 3
108 2
30 1
The (unweighted) average of the distribution above is ~4.5 bps (computation not shown). However – and here is a key market insight – the stocks that have high weights in SPX (e.g. AAPL) tend to be large companies, which are heavily traded and have smaller % spreads than average. The correctly weighted spread is therefore 3.2 bps, vs. 0.6 bps for SPY, so ~5x more.
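The gap between the unweighted and the index-weighted average can be illustrated with a toy example. The three stocks, their weights, and their spreads below are hypothetical and chosen only so that the biggest stock has the tightest spread; they are not the data behind the 4.5 bps and 3.2 bps figures.

```python
# Hypothetical index: (ticker, index weight, spread in bps).
toy_index = [
    ("BIG", 0.7, 2.0),
    ("MID", 0.2, 4.0),
    ("SMALL", 0.1, 9.0),
]


def unweighted_average_bps(index):
    """Plain average: every stock's spread counts the same."""
    return sum(spread for _, _, spread in index) / len(index)


def weighted_average_bps(index):
    """Index-weighted average: each spread counts in proportion to its weight."""
    total_weight = sum(weight for _, weight, _ in index)
    return sum(weight * spread for _, weight, spread in index) / total_weight


print(f"unweighted: {unweighted_average_bps(toy_index):.1f} bps")
print(f"weighted:   {weighted_average_bps(toy_index):.1f} bps")
```

When the heavily weighted stocks have the tightest spreads, the weighted average comes out below the unweighted one, which is the effect described above.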
This analysis assumes that one trades a small enough size to avoid market impact (see marketside chat #1). Also, this comparison is not applicable to retail trading, as the costs for an individual retail investor would likely be completely dominated by commissions; that is, the commissions on a few hundred individual stock trades would add up to far more than the cost of buying $1m of SPY in a single trade.
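A back-of-the-envelope sketch shows how commissions can swamp the spread cost for a retail account. It reuses the 0.6 bps and 3.2 bps spreads from above and assumes that buying costs half the spread relative to fair value; the flat $5 commission per trade is a made-up figure, not a quote from any broker.

```python
def trading_cost(notional, spread_bps, n_trades, commission_per_trade):
    """Half-spread cost on the notional plus a flat commission per trade."""
    half_spread_cost = notional * (spread_bps / 2) / 10_000
    return half_spread_cost + n_trades * commission_per_trade


NOTIONAL = 1_000_000   # dollars to invest
COMMISSION = 5.0       # hypothetical flat fee per trade, in dollars

spy_cost = trading_cost(NOTIONAL, 0.6, n_trades=1, commission_per_trade=COMMISSION)
basket_cost = trading_cost(NOTIONAL, 3.2, n_trades=500, commission_per_trade=COMMISSION)

print(f"one SPY trade:    ${spy_cost:,.0f}")
print(f"500 stock trades: ${basket_cost:,.0f}")
```

Under these assumptions nearly all of the basket's cost comes from the 500 commissions rather than from the wider spreads.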

(*1) SPX weighs stocks based on the value of the company they represent. Instead, the DJIA is a straight sum of the nominal prices of stocks, which are somewhat arbitrary. Example: AA (Alcoa) trades around $12, while MMM (3M) trades around $130. However, the market capitalization (stock price times number of shares in the company) is $13b for AA and $90b for MMM. So the market cap of MMM stock is ~7x that of AA, while its stock price (and therefore its weight in DJIA) is 130/12= ~11x > 7x. Using this reasoning, MMM has a higher weight in DJIA than market-cap weighting would imply.
(*2) If e.g. a stock today had an earnings announcement, it would be more volatile, and market participants should demand a bigger price ‘cushion’ when trading it, resulting in higher spreads in the market. Therefore, it is better to take the average over many days, so as to smooth out this effect.
(*3) In the US, stocks priced over $1 must be quoted in prices that are increments of 1 penny. This may be the subject of a future marketside chat, but for now consider the two extreme cases:

  1. If the minimum quote increment were $10^-9, and Alice were bidding $10 + 10^-9, then Bob would have very little to lose by sending an order to buy at $10 + 2*10^-9 (he would be improving the market by only a tiny amount), while if a market order came to sell at the bid, Bob would get to buy (and make the ‘half-spread’) and Alice wouldn’t. The result would be endless “pennying”, i.e. sending orders a tiny bit better than the previous ones.
  2. If the minimum quote increment were $50, then (assuming fair value is around $175), there would be tons of orders to buy at $150 and sell at $200, as everyone would want to buy a $175 stock at the cheaper price of $150. Conversely, few would want to “cross the spread” (e.g. sell at $150 or buy at $200) because that would be giving away too much value. Therefore, there would be very little to no trading activity.