A specific trick and a general habit for achieving fast computations

I recently worked on a computation involved in a client-facing projection graph. In order to delight our users, we wanted it to respond instantly to user interaction. However, there are two forces at play that make the calculation complicated: the consistent contributions of recurring deposits and the erratic motion of stock prices. We needed to calculate a variable of the form

\[y_t = \sum_{i=0}^t\sum_{j=0}^t a_ib_je^{k(t-\max(i, j))}\]

for all months $t$ from 0 (today) to $T$ (retirement). This quantity $y_t$ is related to one parameter of the estimated distribution at time $t$. The $a_i$’s and $b_i$’s depend on the deposits made at different time periods, and $k$ depends on the portfolio’s volatility. A direct implementation of this equation would look like:

This would require evaluating $t^2$ terms for each $t$ (because of the nested sum). If we do this many operations for each $t$ from 0 to $T$, we will end up using $O(T^3)$ operations for the final result. We often have $T \approx 500$ months to calculate, making this an unacceptable runtime.

The process for turning this into an $O(T)$ algorithm is a reminder to never redo the same operation, whether implicitly or explicitly. I brought the graph’s recalculation time down to under 0.002 seconds by expanding the formula into recurrence relations.


Optimization 1: Incremental calculation

The formula for $y_t$ may look nasty, but it can be simplified if we unravel it in terms of $y_{t-1}$. That is, if we reuse the information in $y_{t-1}$. To do this, separate the terms with $i=t$ or $j=t$ from the rest:

\[y_t = \sum_{i=0}^{t-1}\sum_{j=0}^{t-1} a_ib_je^{k(t-\max(i, j))} + \sum_{i=0}^t a_ib_te^{k(t-t)} + \sum_{j=0}^{t-1} a_tb_je^{k(t-t)}\]

\[y_t = \sum_{i=0}^{t-1}\sum_{j=0}^{t-1} a_ib_je^{k(t-\max(i, j))} + b_t\sum_{i=0}^t a_i + a_t\sum_{j=0}^{t-1} b_j\]

Then notice that the doubly-nested sum is very similar to $y_{t-1}$. In fact, it is off by only a factor of $e^k$:

\[y_t = e^k\sum_{i=0}^{t-1}\sum_{j=0}^{t-1} a_ib_je^{k(t-1-\max(i, j))} + b_t\sum_{i=0}^t a_i + a_t\sum_{j=0}^{t-1} b_j\]

\[y_t = e^ky_{t-1} + b_t\sum_{i=0}^t a_i + a_t\sum_{j=0}^{t-1} b_j\]

Interpreting this recurrence relation as code gives us:

If we iterate over $t$ from 0 to $T$, we now only need to reuse $y_{t-1}$, do a few miscellaneous operations, and calculate $O(t)$ new terms. That means that in total, we require only $O(T^2)$ operations to calculate all $y_t$.


Optimization 2: Incremental calculation again

We can make this formula even faster by reusing the same trick, this time on each of the remaining summations. If we let

\[A_t = \sum_{i=0}^{t-1}a_i\]

\[B_t = \sum_{j=0}^{t-1}b_j\]

Then we can use recurrence relations to describe $A_t$ and $B_t$:

\[A_t = A_{t-1} + a_{t-1}\]

\[B_t = B_{t-1} + b_{t-1}\]

And we can use these in our formula for $y_t$:

\[y_t = e^ky_{t-1} + A_tb_t + a_tB_t + a_tb_t\]

Interpreting these recurrence relations as code gives:

Now evaluating $y_t, A_t,$ and $B_t$ from $y_{t-1}, A_{t-1},$ and $B_{t-1}$ takes only a constant number of operations! That means we can calculate all $y_t$ in $O(T)$ time. Thanks to this optimization, the runtime is indeed imperceptible (about 1.4ms on my machine when $T = 500$).


Optimization $n$

That’s as fast as we can make this algorithm, but what if we were to use this trick many times on a deeply nested sum? Here we had a doubly-nested sum, but I invite you to investigate what would happen if you had an $n$-times-nested summation of similar form:

\[z_t = \sum_{i_1=0}^t\sum_{i_2=0}^t\ldots\sum_{i_n=0}^ta_{1,i_1}a_{2,i_2}\ldots a_{n,i_n}f(\max(i_1, i_2, \ldots i_n))\]

Each time you do this trick, the algorithm requires one fewer loop but gets more complicated. To be precise, for large $n$ you can change this from an $O(nT^{n+1})$ algorithm to an $O(3^nT)$ algorithm.

The takeaway is that we often compute the same data multiple times in subtle ways, and that we can achieve huge performance gains by identifying and correcting these issues.