Today was one of those days where you wear a big smile all day. I started with some Java for breakfast, continued with a bit of Python, and then got roped in on a red card. (Tonight I’ll get back to my JavaScript experiments.)
We use Interactive Brokers for all of our execution, and love their quality and speed. Part of the way they achieve such stellar execution is by ensuring that there are enough market makers and volume for the securities that are traded. So-called “junk” stocks are a no-no. To continue providing us with great service, IB and kaChing have been working on defining a “good” list, and have settled on a rule: stocks on specific OTC exchanges that are cheaper than $5 are junk.
To measure the impact on our users, I set out to count how many portfolios would be affected by such a rule. Our security master, the file containing raw information about securities, is named KachingDaily20091112.csv. First step: count how many securities are listed on each exchange.
The file is a CSV with quoted data (e.g. company names); symbols are in the first column and exchanges in the third.
$ sed -e 's/"[^"]*"//g' KachingDaily20091112.csv | cut -d, -f 3 | sort | uniq -c
   1223 AMEX
    514 NASDAQCM
   1731 NASDAQGM
   1234 NASDAQGS
   3354 NYSE
    862 NYSEARCA
   2616 OTCBB
  19881 OTHEROTC
   3294 PINKSHEETS
The important part is the sed step, which filters out quoted data: a comma (,) within quotes (“Super Bank, LLC”) would otherwise mess up the parsing.
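To see why the quoted fields have to go before cut runs, here is a minimal sketch on a made-up sample. The rows, the SBNK and XYZ symbols, and the count_by_exchange helper name are invented for illustration; only the layout (symbol first, exchange third) comes from the real file.

```shell
# Hypothetical sample rows: symbol, quoted company name, exchange.
sample='AAPL,"Apple Inc.",NASDAQGS
SBNK,"Super Bank, LLC",OTCBB
XYZ,"XYZ Corp",OTCBB'

# Naive cut: the comma inside "Super Bank, LLC" shifts the columns,
# so the third field of that row is part of the company name.
naive=$(printf '%s\n' "$sample" | cut -d, -f 3 | sort | uniq -c)

# Blank out every quoted field first; the third column is then always the exchange.
count_by_exchange() {
  sed -e 's/"[^"]*"//g' | cut -d, -f 3 | sort | uniq -c
}
clean=$(printf '%s\n' "$sample" | count_by_exchange)

printf 'naive:\n%s\n\nclean:\n%s\n' "$naive" "$clean"
```

The naive count reports a bogus “exchange” made of the tail of the company name, while the cleaned count groups both OTCBB rows together.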
Let’s fetch prices now. All our backends are service-oriented, and we use POST requests to talk to them. For example, we fetch a delayed quote for AAPL from the TickerFeed using:
curl -sd "q=GetDelayedQuotes&p0=AAPL" tf0:8081
We can combine this command with the earlier pipeline to get all the prices in one fell swoop:
$ sed -e 's/"[^"]*"//g' KachingDaily20091112.csv | cut -d, -f 1,3 | egrep ',(OTCBB|OTHEROTC|PINKSHEETS)$' | while read line; do
    symbol=$(echo $line | cut -d, -f 1)
    price=$(curl -sd "q=GetDelayedQuotes&p0=$symbol" tf0:8081 | jgrep 0.p)
    echo "$line,$price"
  done > otc_with_prices.txt
jgrep is our internal “JSON grep” tool. That’ll be the topic of another post.
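Since the TickerFeed and jgrep are internal, here is a sketch of the same loop shape that can be run anywhere. The fetch_price function is a hypothetical stand-in for the curl and jgrep call, with made-up symbols and prices, and append_prices is just a name chosen for this example. It also lets read split the line on the comma instead of calling cut for every symbol.

```shell
# Stand-in for the real quote lookup (curl ... | jgrep 0.p); these prices are made up.
# Unknown symbols yield "null", mirroring the null prices filtered out later.
fetch_price() {
  case "$1" in
    SBNK) echo 3.25 ;;
    XYZ)  echo 12.50 ;;
    *)    echo null ;;
  esac
}

# Read "symbol,exchange" lines on stdin and append a price column to each.
append_prices() {
  while IFS=, read -r symbol exchange; do
    echo "$symbol,$exchange,$(fetch_price "$symbol")"
  done
}

printf 'SBNK,OTCBB\nXYZ,PINKSHEETS\nNOPE,OTHEROTC\n' | append_prices
```

Swapping the body of fetch_price for the real request would leave the rest of the loop unchanged.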
We’re almost done. The last step is to count all securities priced lower than $5.
$ egrep -v ',null$' otc_with_prices.txt | sed -e 's/,\([0-9]*\)\.[0-9]*/,\1/' | egrep ',[0-4]$' | wc -l
15316
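The same count can also be taken by comparing prices numerically in awk rather than truncating them with sed. This is an alternative sketch, not the command used above; count_below is a hypothetical helper name and the sample lines are invented.

```shell
# Count "symbol,exchange,price" lines whose price is below the limit (default 5).
# Lines with a "null" or empty price are skipped.
count_below() {
  awk -F, -v limit="${1:-5}" '
    $3 != "null" && $3 != "" && $3 + 0 < limit { n++ }
    END { print n + 0 }
  '
}

printf 'SBNK,OTCBB,3.25\nXYZ,PINKSHEETS,12.50\nNOPE,OTHEROTC,null\nPENY,OTCBB,0.04\n' | count_below
```

Against the real data this would be run as count_below < otc_with_prices.txt, and passing a different limit (for example count_below 1) changes the threshold without touching any regular expression.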
After that quick tour, my hope is that you’ll be as excited about shell scripting as I am.