In the Apache Flink community we’re in the process of porting our huge test codebase to JUnit 5. To leverage the new JUnit 5 features as much as we can, I’ve spent some time over the past few days playing around with them.
In this blog post I’ll tell the story of enabling the project to use the new JUnit 5 features, including parallel execution, parametrized tests and extensions, and how they’re going to help us improve our test codebase.
Our test codebase
In Flink we have all the kinds of tests you can think of:
- Simple unit tests with a certain degree of mocking
- Integration tests
- End-to-end tests
- Various other tests for utilities, such as tests for our bash scripts
In most of the codebase, we refer to integration tests as tests that define and run a streaming job but use a mocked sink and source. End-to-end tests are like integration tests, but they use real external systems, such as Kafka, deployed with TestContainers.
In Flink SQL in particular, unsurprisingly, we have a lot of integration tests, because every single feature needs to be “understood” by all the different moving parts of our stack. For example, take the built-in function
COALESCE: it has a runtime implementation, a Table API expression, custom logic for argument type inference and return type inference, and an optimizer rule that removes the call when possible. All of these pieces need to work in harmony, and integration tests usually give us the guarantee that everything fits together.
Another aspect of our tests is that we have a lot of test bases and parametrized test bases, à la JUnit 4. This is due to the organic growth of the project and the effort to standardize certain test aspects, like starting and stopping a Flink
MiniCluster, the embedded Flink cluster used to run test jobs.
In porting to JUnit 5, we want to:
- Have fewer test bases and more extensions, favoring composition over inheritance. This simplifies contributing new tests, as adding a new “capability” to the test suite won’t require new ad-hoc test bases.
- Improve error reporting and test case separation, in order to make the contributor experience nicer both when running tests from the IDE and with Maven
- Speed up the test suite as much as possible
The last point in particular is a hot topic, as today a CI run usually takes between one and a half and two hours, which has a significant impact on the development loop of the project.
Parallel tests are, in my opinion, the killer feature of JUnit 5 and the real incentive to port JUnit 4 tests to JUnit 5.
With JUnit 4 you can parallelize test execution using the build tool, for example using the forked JVM feature of
maven-surefire-plugin. It runs tests in parallel by spawning several JVM processes, each of which gets assigned a split of the overall list of tests to run. JUnit 5, on the other hand, runs all the tests within the same JVM: the test runner manages a thread pool and takes care of assigning test cases to threads.
I think the JUnit 5 approach fits our use case best, as spawning several JVMs is very resource intensive on constrained machines such as CI runners, so it’s a constant source of issues. This is also true for contributors’ machines, as these days just running the browser with 20+ tabs open, Spotify, Slack and the IDE can easily eat up 16 GB of RAM. Plus, JUnit 5 parallel tests work in any IDE without additional configuration, they can help you find thread safety bugs, and the granularity of the execution is easy and flexible to configure.
To start using JUnit 5 parallel tests, we just had to create a file called
junit-platform.properties in our test resources and add the following:
junit.jupiter.execution.parallel.enabled = true
This configuration lets us opt specific tests or classes in to parallel execution. To flag a test class to run in parallel, we need to annotate it with
@Execution(CONCURRENT). Thanks to this configuration we can gradually enable parallel execution only for tests we know are safe.
JUnit 5 offers the ability to configure the granularity of the parallel test execution, e.g. run all test cases from all classes in parallel, run all test cases from a class sequentially but run the test classes in parallel, etc. Check out all the available options in the JUnit 5 documentation.
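As a sketch of one such granularity setting, the following junit-platform.properties would run test classes in parallel while keeping the test cases inside each class sequential. The two mode properties come from the JUnit 5 user guide and are not what we configured in Flink, so treat the values as an illustrative assumption.

```properties
# Turn on the parallel execution engine (same line as above)
junit.jupiter.execution.parallel.enabled = true
# Assumption for illustration: test methods stay on one thread by default...
junit.jupiter.execution.parallel.mode.default = same_thread
# ...while top-level test classes are scheduled concurrently
junit.jupiter.execution.parallel.mode.classes.default = concurrent
```

With defaults like these, an explicit @Execution annotation on a class or method still overrides the configured mode.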
So I decided to try this new feature on a complex parametrized test base. I ended up choosing
CastRulesTest, a parametrized unit test base checking the runtime implementation of the
CAST logic. Because there is no shared state whatsoever, this test suite is embarrassingly parallelizable. Just adding the annotation
@Execution(CONCURRENT) made the whole suite run 3x faster.
Integration tests in parallel
That 3x totally got my attention, so I wanted to try applying the same annotation to integration tests as well. As my next target, I chose
BuiltInFunctionTestBase. As the name implies, this is a parametrized integration test base we use to test the correct behaviour of our built-in functions. We have around 10 classes where we use this test base, for a total of 1200+ integration test cases.
Porting the test suite to JUnit 5
The first thing I had to do was to port the base class and its inheritors to JUnit 5. My initial thought was to use the
@TestFactory feature, a new JUnit 5 feature to spawn dynamic tests, allowing you to group test cases. Think of it as a more powerful
This would have allowed me to have a nice nested view of the tests in the reports. For example, look at
MathFunctionsITCase: by mixing
DynamicContainer I could have achieved a report like:
f0 + 6 = 20: Success
f0 + 6 = 10: Failure
Because the test base itself already had some models to define test cases, including input/output data, query expressions and configuration (see
TestSpec), what I had to do was simply to convert these models to
DynamicContainer. A little refactor of the
testFunction method to wrap the logic into
DynamicTest did it. Now every concrete test class just had to implement the abstract method
getTestSpecs to return the test cases defined with my
TestSpec class, so the final implementation of the
@TestFactory just looked like:
Every implementation class provided a
DynamicContainer, containing a set of tests for a specific built-in function, like
PLUS, and each container had a set of
DynamicTest with the specific test cases for that built-in function, like
f0 + 6 = 20.
Last but not least, to let the query run, I needed the extension to set up
MiniCluster once per class. This is already available in our
flink-test-utils, so with some copy-paste I enabled it:
public static final MiniClusterWithClientExtension MINI_CLUSTER_RESOURCE =
I tried a first run without parallel tests, and everything ran fine. Then I added
@Execution(CONCURRENT), and IntelliJ IDEA welcomed my idea with a long report full of red crosses.
MiniCluster extension first
Looking at the logs, it became evident that the problem was
MiniClusterWithClientExtension, given that several tests were trying to push jobs to a
MiniCluster that had already been shut down.
MiniClusterWithClientExtension was developed by wrapping the JUnit 4
MiniClusterWithClientResource rule in a custom interface defining
after. Then, to register it, you could either use
EachCallbackWrapper to define whether to have one
MiniCluster per test class, or per single test.
In JUnit 4 rules have a
after extension point, and then when you register them, depending on whether you use
@ClassRule, the rule is executed for each test or once per class. In JUnit 5 the user cannot pick whether the extension is used globally or per method: it’s the extension itself that defines where it can hook in the lifecycle.
An example of this shift of concept between JUnit 4 and 5 is provided by the JUnit 5
TestContainers integration, which depending on whether the container field is static or not, decides to share it between test methods or not.
EachCallbackWrapper were circumventing the new JUnit 5 Extension paradigm, bringing back the same semantics of
@ClassRule. And this worked fine, until I tried to use parallel execution.
MiniClusterWithClientExtension#before method was starting
MiniCluster, creating some
ClusterClient and then setting up a thread local for the environment configuration lookup.
The way our
ThreadLocal configuration behaved when using
AllCallbackWrapper didn’t sound right, so I tried a little experiment. Take this simple extension:
public class TestExtension
This simply prints the thread on which each hook is executed, and it also prints the name of the single test in the case of
afterEach. Then I tried to run this test:
And here is the result:
before all: ForkJoinPool-1-worker-1
As I was expecting,
afterEach runs in the same thread of the test, as effectively they just “wrap” the test method, as described here.
In other words:
- There is no guarantee about which thread is used to execute
afterEach are guaranteed to be executed within the same thread as the test
So here was the solution: we just had to set/unset that
ThreadLocal every time in
afterEach, no matter whether the
MiniCluster instance was meant to be per test class or per test method.
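To make the thread affinity concrete, here is a minimal JDK-only sketch of the problem and of the fix. It doesn't use JUnit or Flink at all: CONFIG is a hypothetical stand-in for our thread-local environment configuration, and a single-threaded executor plays the role of a JUnit worker thread. The first method sets the value on the calling thread only (what a once-per-class hook running on a different thread would do), while the second sets and clears it on the worker itself, around the "test" body.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class ThreadLocalHooksSketch {

    // Hypothetical stand-in for the thread-local environment configuration.
    static final ThreadLocal<String> CONFIG = new ThreadLocal<>();

    /** Sets CONFIG on the calling thread only, then reads it from a worker thread. */
    static String readWithSetupOnCallerThread() {
        CONFIG.set("cluster-config");
        try {
            return runOnWorker(CONFIG::get); // the worker never saw the set() above
        } finally {
            CONFIG.remove();
        }
    }

    /** Sets and clears CONFIG on the worker thread itself, around the "test" body. */
    static String readWithSetupOnWorkerThread() {
        return runOnWorker(() -> {
            CONFIG.set("cluster-config");    // what a per-test "before" hook would do
            try {
                return CONFIG.get();         // the test body reads the configuration
            } finally {
                CONFIG.remove();             // what a per-test "after" hook would do
            }
        });
    }

    /** Runs the body on a separate thread and returns its result. */
    private static String runOnWorker(Callable<String> body) {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            return pool.submit(body).get();
        } catch (Exception e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(readWithSetupOnCallerThread()); // prints "null"
        System.out.println(readWithSetupOnWorkerThread()); // prints "cluster-config"
    }
}
```

The remove() in the finally block matters as much as the set(): pool threads are reused, so a value left behind would leak into whatever runs next on that thread.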
I was ready to declare victory, so I did some refactoring of
MiniClusterWithClientExtension to always create one
MiniCluster per test class, I removed the wrapping around
AllCallbackWrapper and then I ran my tests again: everything was still red.
After some investigation, I found out that the
DynamicTest lifecycle hooks don’t run per
DynamicTest instance, while parallelization does apply per instance. For example, for this test class:
You get this output:
before all: ForkJoinPool-1-worker-1
As you see, each
DynamicTest is executed in parallel, as you would expect, but the
afterEach is executed only once per
@TestFactory. Because of this behaviour, the tests were not picking the correct
ThreadLocal, hence failing the test because they could not find the
It seems like there is no solution to this problem, and there is an issue open in the JUnit 5 issue tracker about whether this should be supported or not.
So I had to fall back to the good old
@ParameterizedTest: I just removed the nested
DynamicContainer, and I created my own
DynamicTest-like data structure to wrap a test
Executable and its display name:
/** Single test case. */
And then I kept the same code as before in the test base, but I had to add a method to flatten the previously used
abstract Stream<TestSpec> getTestSpecs();
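The flattening idea can be sketched with the JDK alone. This is not the actual Flink code: TestCase and TestSpec here are heavily simplified, hypothetical versions of the real classes, the local Executable interface is a stand-in for JUnit's, and the MINUS cases and the checks are made up for illustration. The point is only the shape: each spec expands to its test cases, and flatMap turns the nested structure into one flat stream.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class FlattenSpecsSketch {

    /** JDK-only stand-in for JUnit's Executable functional interface. */
    @FunctionalInterface
    interface Executable {
        void execute() throws Throwable;
    }

    /** Single test case: a display name plus the code to run. */
    static final class TestCase {
        final String displayName;
        final Executable executable;

        TestCase(String displayName, Executable executable) {
            this.displayName = displayName;
            this.executable = executable;
        }

        @Override
        public String toString() {
            return displayName; // the name a test report would show
        }
    }

    /** Hypothetical, simplified spec: one built-in function and its test cases. */
    static final class TestSpec {
        final String function;
        final List<TestCase> cases;

        TestSpec(String function, TestCase... cases) {
            this.function = function;
            this.cases = Arrays.asList(cases);
        }

        /** Prefixes each case with the function name, since the nesting is gone. */
        Stream<TestCase> getTestCases() {
            return cases.stream()
                    .map(c -> new TestCase(function + ": " + c.displayName, c.executable));
        }
    }

    /** Flattens specs into a single stream of test cases. */
    static Stream<TestCase> flatten(Stream<TestSpec> specs) {
        return specs.flatMap(TestSpec::getTestCases);
    }

    /** Example data; names and checks are invented for this sketch. */
    static Stream<TestSpec> exampleSpecs() {
        return Stream.of(
                new TestSpec("PLUS",
                        new TestCase("f0 + 6 = 20", () -> check(14 + 6 == 20)),
                        new TestCase("f0 + 1 = 15", () -> check(14 + 1 == 15))),
                new TestSpec("MINUS",
                        new TestCase("f0 - 4 = 10", () -> check(14 - 4 == 10))));
    }

    static void check(boolean condition) {
        if (!condition) {
            throw new AssertionError("check failed");
        }
    }

    /** Display names of the flattened cases, in encounter order. */
    static List<String> flattenedNames() {
        return flatten(exampleSpecs()).map(TestCase::toString).collect(Collectors.toList());
    }

    public static void main(String[] args) throws Throwable {
        for (TestCase testCase : flatten(exampleSpecs()).collect(Collectors.toList())) {
            testCase.executable.execute();
            System.out.println(testCase + ": Success");
        }
    }
}
```

In a JUnit 5 test base, a flat stream like this could serve as the argument source of a parameterized test, with toString() supplying the display name of each invocation.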
The report doesn’t look as nice as with
@TestFactory, but it finally works fine with parallel execution.
Results and conclusion
Before the parallelization, this test suite ran in around 50 seconds from IntelliJ IDEA on my 11th Gen i7 with 16 GB of RAM. After the parallelization, the suite runs in around 20 seconds. Not bad.
I think JUnit 5 is a very powerful tool, but you need to be careful when developing extensions that are supposed to work with parallel tests. These are my gotchas from this experience:
- Make sure thread locals are set in
beforeEach, so they’re set in the right thread, and cleaned up in
afterEach, to avoid next tests reusing that thread to pick up the old
- Be aware that the method
beforeEach could be invoked in parallel on the same instance of the extension, so you need to make sure that access to extension fields is thread safe, or…
- Use the
ExtensionContext.Store, as it’s thread safe, it’s hierarchically organized per hook context, and it’s particularly useful to auto cleanup objects built in the
- Rather than providing accessors in the extension class, use parameter injection implementing
ParameterResolver together with the
ExtensionContext.Store. This simplifies the implementation when storing your extension state in
And last but not least, don’t use
@TestFactory in conjunction with parallel execution if your test requires
ThreadLocal or any other thread-dependent state to be correctly configured.