Tagged in Cascading:
Lessons Learned when Scaling our Data Analytics Platform
Over the past year, we’ve gone from a single Java server running all analytics to a multi-node data pipeline. Along the way, we’ve refined our metrics for all parts of the business: from web and mobile analytics, to investment research, to trading and operations. We build metrics dashboards for everything we do, empowering us to… Read more
Testing Cascading applications
This post explores how we apply our test-driven development philosophy to analytics problems. In particular, it shows how to use test-driven development with Cascading, which we’ve recently started using to drive analytics at Wealthfront. Cascading lets us specify complicated analytics pipelines in Java. It works well for problems that would normally require multiple MapReduce jobs to get a… Read more