Sophia Zhu

Intelligent Metrics Monitoring

Here at Wealthfront we have many offline computations running in Spark. In some cases, small changes have caused a job to slow down dramatically; in others, the input has been growing in size and causing the job runtime to increase quickly. We normally check pipeline runtimes manually to make sure jobs are running […]

August 12, 2016

Emilio Lopez

Tips for Unit Testing D3

Note: D3 4.0 has been released, but we haven’t upgraded to it yet, so the syntax in the code examples below is written for D3 3.x. However, the same ideas should hold for the newer version. If you’ve ever attempted to write unit tests for D3.js code, you’ve undoubtedly noticed some pain points. Many are due […]

July 27, 2016
Yan Yang

Integrating Apache Spark into your existing Hadoop system – Part I

As evidenced by our previous blog post, Statistics is Eating the World, data is at the very center of Wealthfront’s values. At Wealthfront, fields ranging from research and analytics to marketing, client services, human resources, and even employee productivity all rely heavily on data in their decision making. Such a variety of data sources and requirements need […]

June 22, 2016

Luke Hansen

Connecting to an FTPS Server with SSL Session Reuse in Java 7 and 8

“Good programmers write good code… Great programmers reuse great code.” Or so I told myself as I snagged an Apache Commons class to connect to a new vendor’s FTPS server. Several hours of debugging later, however, I realized to my dismay that the omnipotent Apache Commons did not support a major security feature required by most modern FTPS servers. This post outlines my process […]

June 10, 2016
Sen Yu

How to Make Your Persistent Queues Run Faster Safely

What is a Persistent Queue? A persistent queue is a list of objects that persist in the database, waiting to be polled and processed in some way. Usually it is a table with columns for the data, a timestamp of when the object was persisted, and a timestamp of when the object was polled. As opposed to an […]

May 04, 2016
Josh Toft

Identifying Non-Heap Class Leaks

At Wealthfront, a significant portion of our backend applications are written in Java. The Java Virtual Machine (JVM) uses Garbage Collection (GC) for memory management, which forces us to pay close attention to its characteristics and behavior. Every JVM service we run accumulates JVM and application-level statistics, spools these metrics into a statsd server, and […]

April 18, 2016