Intelligent Metrics Monitoring

August 12, 2016

Here at Wealthfront we have many offline computations running in Spark. In some cases, small changes have caused a job to slow down dramatically; in others, the input has grown and caused the job runtime to increase quickly. We normally check pipeline runtimes manually to make sure jobs are running… Read more

Tips for Unit Testing D3

July 27, 2016

Note: D3 4.0 has been released, but we haven’t upgraded to it yet, so the syntax in the code examples below is written for D3 3.x. However, the same ideas should hold for the newer version. If you’ve ever attempted to write unit tests for D3.js code, you’ve undoubtedly noticed some pain points. Many are due… Read more

Integrating Apache Spark into your existing Hadoop system – Part I

June 22, 2016

As evidenced by our previous blog post, Statistics is Eating the World, data is at the very center of Wealthfront’s values. At Wealthfront, fields ranging from research and analytics to marketing, client services, human resources, and even employee productivity all rely heavily on data in their decision making. Such a variety of data sources and requirements need… Read more

Connecting to an FTPS Server with SSL Session Reuse in Java 7 and 8

June 10, 2016

“Good programmers write good code… Great programmers reuse great code.” Or so I told myself as I snagged an Apache Commons class to connect to a new vendor’s FTPS server. Several hours of debugging later, however, I realized to my dismay that the omnipotent Apache Commons did not support a major security feature required by most modern FTPS servers. This post outlines my process… Read more

How to Make Your Persistent Queues Run Faster Safely

May 04, 2016

What is a Persistent Queue? A persistent queue is a list of objects that persist in the database, waiting to be polled and processed in some way. Usually it is a table with columns for the data, a timestamp of when the object was persisted, and a timestamp of when the object was polled. As opposed to an… Read more

Identifying Non-Heap Class Leaks

April 18, 2016

At Wealthfront, a significant portion of our backend applications are written in Java. The Java Virtual Machine (JVM) uses Garbage Collection (GC) for memory management, which forces us to pay close attention to its characteristics and behavior. Every JVM service we run accumulates JVM and application-level statistics, spools these metrics into a statsd server, and… Read more