Magic with shell

November 13, 2009

Today was one of those days where you wear a big smile all day. I started with some Java for breakfast, continued with a bit of Python, and then got roped in on a red card. (Tonight I’ll get back to my JavaScript experiments.)

We use Interactive Brokers for all of our execution, and love their quality and speed. Part of the way they achieve such stellar execution is by ensuring that there are enough market makers and volume for the securities that are traded. So-called “junk” stocks are a no-no. To continue providing us with great service, IB and kaChing have been working on defining a “good” list, and have settled on treating stocks on specific OTC exchanges that are cheaper than $5 as junk.

To measure the impact on our users, I set out to count how many portfolios would be affected by such a rule. Our security master, the file containing raw information about securities, is named KachingDaily20091112.csv. First step: count how many securities are on each exchange.

The file is a CSV with quoted data (e.g. company names); the first column holds symbols and the third column holds exchanges.

$ sed -e 's/"[^"]*"//g' KachingDaily20091112.csv | cut -d, -f 3 | sort | uniq -c
   1223 AMEX
    514 NASDAQCM
   1731 NASDAQGM
   1234 NASDAQGS
   3354 NYSE
    862 NYSEARCA
   2616 OTCBB
  19881 OTHEROTC
   3294 PINKSHEETS

The important part is filtering out data that contains commas (,) within quotes (“Super Bank, LLC”), which would otherwise mess up the parsing.
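To see why the sed step matters, here is a minimal sketch on a made-up three-line sample. The symbols and company names are invented for illustration and are not taken from the real security master; only the column layout is assumed to match.

```shell
# Hypothetical sample in the same shape as the security master:
# symbol, quoted company name, exchange.
sample='AAA,"Super Bank, LLC",OTCBB
BBB,"Plain Corp",NYSE
CCC,"Other, Inc.",OTCBB'

# Naive split: a comma inside the quotes shifts the columns,
# so field 3 becomes a fragment of the company name.
naive=$(printf '%s\n' "$sample" | cut -d, -f 3 | sort | uniq -c)

# Dropping the quoted fields first keeps the exchange in field 3.
cleaned=$(printf '%s\n' "$sample" | sed -e 's/"[^"]*"//g' | cut -d, -f 3 | sort | uniq -c)

printf 'naive:\n%s\n\ncleaned:\n%s\n' "$naive" "$cleaned"
```

The naive count reports name fragments such as LLC" as if they were exchanges, while the cleaned count reports one NYSE and two OTCBB.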

Let’s fetch prices now. All our backends are service oriented and we use POST requests to talk to them. For example, we fetch a delayed quote for AAPL from the TickerFeed using

curl -sd "q=GetDelayedQuotes&p0=AAPL" tf0:8081

We can combine this command with the earlier pipeline to get all the prices in one fell swoop:

$ sed -e 's/"[^"]*"//g' KachingDaily20091112.csv | cut -d, -f 1,3 |
    egrep ',(OTCBB|OTHEROTC|PINKSHEETS)$' | while read -r line; do
  symbol=$(echo "$line" | cut -d, -f 1)
  price=$(curl -sd "q=GetDelayedQuotes&p0=$symbol" tf0:8081 | jgrep 0.p)
  echo "$line,$price"
done > otc_with_prices.txt

The jgrep is our internal “JSON grep” tool. That’ll be the topic of another post.
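If you want to try the loop without access to the TickerFeed, one option is to swap the curl and jgrep call for a stub. In the sketch below, fetch_price is a hypothetical stand-in that returns canned prices, and the symbol and exchange pairs are made up. It also uses IFS=, so that read splits each line into its fields directly, which is a small variation on the cut approach used above.

```shell
# Hypothetical stand-in for the curl | jgrep call: returns a canned price,
# or "null" when the symbol has no quote.
fetch_price() {
  case "$1" in
    AAA) echo "0.75" ;;
    BBB) echo "12.40" ;;
    *)   echo "null" ;;
  esac
}

# Made-up symbol,exchange pairs, in the shape produced by the cut -f 1,3 step.
pairs='AAA,OTCBB
BBB,PINKSHEETS
CCC,OTHEROTC'

# IFS=, makes read split each line on commas into the two named variables.
with_prices=$(printf '%s\n' "$pairs" | while IFS=, read -r symbol exchange; do
  echo "$symbol,$exchange,$(fetch_price "$symbol")"
done)

printf '%s\n' "$with_prices"
```

Each output line has the symbol,exchange,price shape that the last step expects, with null marking a missing quote.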

We’re almost done. The last step is to count all securities priced lower than $5.

$ egrep -v ',null$' otc_with_prices.txt |
    sed -e 's/,\([0-9]*\)\.[0-9]*/,\1/' | egrep ',[0-4]$' | wc -l
15316
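The same count can also be done with a numeric comparison instead of truncating the price and matching its first digit. This is an alternative sketch using awk, not the command that produced the number above, and the rows are invented sample data in the symbol,exchange,price shape.

```shell
# Hypothetical rows in the symbol,exchange,price shape of otc_with_prices.txt.
rows='AAA,OTCBB,0.75
BBB,PINKSHEETS,12.40
CCC,OTHEROTC,null
DDD,OTCBB,4.99
EEE,OTHEROTC,5.00
FFF,OTCBB,3'

# Skip rows without a quote, then compare the third field numerically.
# Adding 0 forces awk to treat the field as a number; n+0 prints 0
# rather than an empty string when nothing matches.
under_five=$(printf '%s\n' "$rows" |
  awk -F, '$3 != "null" && $3 + 0 < 5 { n++ } END { print n + 0 }')

echo "$under_five"   # prints 3
```

Counting inside awk avoids the leading whitespace that some wc implementations put in front of the number.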

After that quick tour, my hope is that you’ll be as excited as I am about shell scripting.