End-to-end (E2E) tests play a crucial role in ensuring software quality and, when implemented effectively, can eliminate the need for manual testing (see our engineering principles). However, E2E tests are notoriously slow, since they often involve navigating through multiple screens, making API calls, and running on real devices. In our previous blog post, Speed Up Your Android Tests: Gradle Plugin for Unit Test Filtering, we talked about how we want our testing process to scale with the size of our pull requests rather than the size of our codebase. In this blog post, we’ll extend that test filtering strategy from unit tests to E2E tests to further accelerate our builds.

Background

In a multi-module project like ours, identifying which unit tests to run is straightforward since they are located in the same module as the code they test. E2E tests, however, pose a greater challenge on Android since they’re all grouped together in the androidTest directory. This means that there is no clear-cut way of mapping an E2E test to a feature module. Additionally, E2E tests often cover the interactions between modules, so a change in any one of several modules may require running the same test. While there’s no perfect solution, we’ve developed a simple heuristic based on our project’s structure: by analyzing a test’s imports and comparing them against the modules that have changed, we can determine which tests need to be run.

Technology

Our approach to E2E test filtering is shaped by our existing test infrastructure since these tests run on actual devices. Here is an overview of the tech stack we use for E2E tests.

Firebase Test Lab: Runs our tests on a wide variety of devices hosted in the cloud.
Flank: An open-source test orchestrator which simplifies our Firebase setup. It also supports running a subset of our test suite using the “test-targets” feature.
Gradle: E2E tests are triggered on CI via Gradle. Building on our Gradle plugin work from our previous blog post, we’ll introduce a new plugin to automate E2E test filtering.

Methodology

Our test filtering solution has three main parts:

Creating a heuristic to determine which tests to run.
Formatting the tests in a way that Flank can understand.
Creating a Gradle plugin to generate a Flank configuration for us.

Developing a heuristic for mapping tests to relevant code

Developing a heuristic to determine which tests to run is highly dependent on your project’s structure. In our project, feature modules contain test helper methods that are referenced in our E2E tests. This allows us to create a 1:N mapping between tests and feature modules.

For example, if a test visits our app’s Login Page and Dashboard page, it may reference 2 helpers called onLoginPage and onDashboardPage. The relevant imports in our test file would then look something like this:

import com.wf.test.actions.login.onLoginPage
import com.wf.test.actions.dashboard.onDashboardPage

By parsing these imports, we can infer that this test is testing code in our login and dashboard modules. In other words, if either of those modules change, we should run this test. Therefore, our process will be as follows:

Identify a list of changed modules (check out our previous blog post for details)
Analyze test imports to create a list of associated feature modules.
Compare the two lists. If at least one module appears in both, we run that test.
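The comparison step above can be sketched in a few lines of Kotlin. This is a minimal illustration rather than our production code: the function name selectTests and the shape of its inputs (a map from test class to referenced modules, and a set of changed module names) are assumptions made for the example.

```kotlin
// Hypothetical helper: `modulesByTest` maps each test class to the feature
// modules it references, and `changedModules` is the output of the
// changed-module detection step.
fun selectTests(
    modulesByTest: Map<String, Set<String>>,
    changedModules: Set<String>
): List<String> =
    modulesByTest
        // Keep a test if at least one of its modules has changed.
        .filterValues { modules -> modules.any { it in changedModules } }
        .keys
        .sorted()
```

A test that references both login and dashboard is selected when either module changes, which is the behavior the heuristic calls for.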

Now let’s look at the code. We know the test directory so we iterate through each test file and analyze its imports. Using the open-source library kotlinx-ast, we parse the abstract syntax trees (ASTs) to extract the imports. We can then filter for imports of the pattern com.wf.test.actions.<module>, where <module> represents the feature module. If the module name matches one from our list of changed modules, we include that test. Finally, with some string manipulation, we format the test file paths into package names.
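To make the import analysis concrete, here is a simplified sketch. Note that it swaps the AST parsing for a regular expression so the example stays self-contained, and the function names referencedModules and toClassName are hypothetical. It also assumes test file paths are given relative to the source root and use forward slashes.

```kotlin
// Matches lines such as `import com.wf.test.actions.login.onLoginPage`
// and captures the segment right after `actions.` as the module name.
val ACTION_IMPORT = Regex(
    """^\s*import\s+com\.wf\.test\.actions\.(\w+)\.""",
    RegexOption.MULTILINE
)

// Returns the feature modules referenced by a test file's imports.
fun referencedModules(testSource: String): Set<String> =
    ACTION_IMPORT.findAll(testSource).map { it.groupValues[1] }.toSet()

// Converts a path relative to the source root into a fully qualified class name,
// e.g. com/wf/test/uitests/login/LoginTest.kt -> com.wf.test.uitests.login.LoginTest
fun toClassName(relativePath: String): String =
    relativePath
        .removeSuffix(".kt")
        .replace('/', '.')
```

A regex is easier to read but less robust than an AST: it would, for instance, also match an import statement that sits at the start of a line inside a block comment.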

Converting our tests to Flank test targets

Flank uses test targets to determine which tests to run. In our case, we want to filter at the test class level. This can be done in Flank by writing each target as the keyword class followed by the fully qualified class name, i.e. class test.package.name.

Creating a Gradle Plugin to apply test filtering

Creating a Gradle task for generating test targets

We created a Gradle task called TestTargetsTask which calls our function for generating the test targets.

Creating a Gradle task for generating a Flank config

We created a Gradle task called YamlConfigWriterTask which generates a YAML configuration file used by Flank. The YAML file contains a section called “test-targets” which is where our test targets will go. In the end, we produce some YAML which looks like this:

test-targets:
    - class com.wf.test.uitests.login.LoginTest
    - class com.wf.test.uitests.dashboard.DashboardTest
    - class com.wf.test.uitests.transfer.TransferTest
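
Producing that fragment from a list of class names is mostly string formatting. The sketch below shows one way to do it; renderTestTargets is a hypothetical name, and it renders only the fragment shown above rather than a complete Flank configuration (in a full Flank config file, test-targets sits under the gcloud section).

```kotlin
// Renders fully qualified test class names as a `test-targets` YAML fragment,
// prefixing each entry with `class` so Flank filters at the class level.
fun renderTestTargets(classNames: List<String>): String =
    buildString {
        appendLine("test-targets:")
        classNames.forEach { appendLine("    - class $it") }
    }
```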

Creating a Gradle plugin to run our task

We can now create a Gradle plugin to run YamlConfigWriterTask to generate Flank’s YAML. Since it requires the test targets, this task depends on TestTargetsTask to generate the test targets.
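
As a rough sketch of how the two tasks and the plugin might fit together, consider the following. Only the class names TestTargetsTask and YamlConfigWriterTask come from this post; the property names, task names, file locations, and the FlankPlugin class name are assumptions for illustration. It also assumes a plugin compiled as plain Kotlin against the Gradle API (without the kotlin-dsl plugin, which changes how configuration lambdas are written).

```kotlin
import org.gradle.api.DefaultTask
import org.gradle.api.Plugin
import org.gradle.api.Project
import org.gradle.api.file.RegularFileProperty
import org.gradle.api.provider.SetProperty
import org.gradle.api.tasks.Input
import org.gradle.api.tasks.InputFile
import org.gradle.api.tasks.OutputFile
import org.gradle.api.tasks.TaskAction

abstract class TestTargetsTask : DefaultTask() {
    @get:Input abstract val changedModules: SetProperty<String>
    @get:OutputFile abstract val targetsFile: RegularFileProperty

    @TaskAction
    fun generate() {
        // Placeholder: the import-based heuristic would compute this list
        // from the test sources and `changedModules`.
        val targets: List<String> = emptyList()
        targetsFile.get().asFile.writeText(targets.joinToString("\n"))
    }
}

abstract class YamlConfigWriterTask : DefaultTask() {
    @get:InputFile abstract val targetsFile: RegularFileProperty
    @get:OutputFile abstract val flankConfig: RegularFileProperty

    @TaskAction
    fun write() {
        val targets = targetsFile.get().asFile.readLines().filter { it.isNotBlank() }
        val yaml = buildString {
            appendLine("test-targets:")
            targets.forEach { appendLine("    - class $it") }
        }
        flankConfig.get().asFile.writeText(yaml)
    }
}

class FlankPlugin : Plugin<Project> {
    override fun apply(project: Project) {
        val testTargets = project.tasks.register("testTargets", TestTargetsTask::class.java) { task ->
            task.changedModules.convention(emptySet<String>())
            task.targetsFile.set(project.layout.buildDirectory.file("flank/test-targets.txt"))
        }
        project.tasks.register("writeFlankConfig", YamlConfigWriterTask::class.java) { task ->
            // Wiring the output property as an input carries the task dependency,
            // so writeFlankConfig runs after testTargets without an explicit dependsOn.
            task.targetsFile.set(testTargets.flatMap { it.targetsFile })
            task.flankConfig.set(project.layout.buildDirectory.file("flank/flank.yml"))
        }
    }
}
```

Connecting the tasks through a file property rather than an explicit dependsOn lets Gradle infer the ordering and skip the YAML task when the targets file has not changed.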

Using our Gradle plugin

Finally, we register the Gradle plugin and apply it with apply plugin: "com.wf.flank.app" to automatically start filtering our E2E tests.

Performance

We get it. Talk is cheap. Let’s look at the data to see our improvements! We tracked our E2E test runtimes on PRs against master. To measure the impact of E2E test filtering, we compared runtimes from the weeks before and after enabling it. The graph below categorizes our E2E test builds into 3 buckets, with light colours indicating faster builds and dark colours indicating slower builds. Since enabling test filtering, our slowest builds, those taking more than 60 minutes (brown), have nearly vanished. Previously, most builds took 40-60 minutes (orange), but now 90% finish in 20-40 minutes (yellow). Our average build time has stabilized at around 33 minutes, down from a previous average of 46 minutes with a peak of 68 minutes.

Conclusion

Using our learnings from unit test filtering, we developed a Gradle plugin to filter E2E tests as well. In doing so, we reduced our E2E test times by approximately 13 minutes.

Another welcome side effect of this project was reduced test flakiness. Since E2E tests are more prone to flakes due to device interactions, running fewer tests naturally results in fewer flakes. Depending on your tolerance for flakiness, this can also save CI costs by minimizing the need to rerun flaky tests.

With all that saved time and money, that’s an extra Boba run per day!

Engineering Blog – Wealthfront

Speed up your Android Tests Part 2: End to End Test Filtering