From 40 Minutes to 4 with Test Parallelization
Last month, we finished a big upgrade for a client. The client had 2 main pain points for their app. The first one was that they were using Rails 2.3 LTS with Ruby 2.5. The next big issue was that the test suite took 40 minutes to run, blocking engineers from merging code into the main branch, and also slowing down the whole feedback loop for every code change.
After finishing the upgrade (we got the application to Rails 8.1.1 and Ruby 3.4.7), we turned our attention to the test suite’s speed and reduced the time it took to run the whole suite (over 10k tests) from 40 minutes to around 4!
The Different Causes of Slow Tests
The Test Runner
The most popular test runners for Rails are currently Minitest and RSpec, and this application was using Test::Unit. Minitest is the default testing framework in new Rails applications and it’s known to be the fastest of the 3.
This, along with other improvements from Ruby and Rails upgrades, already proved beneficial… not on the total run time, but we did notice that the time it took to start the tests improved significantly: when trying to run a single file or a single method, it used to take around 30 seconds of booting the app, and by the time we finished the upgrades and the migration to Minitest, this was reduced to a few seconds.
Running a single test inside a file used to take even longer (we don’t know the exact cause, but it was really slow), whereas now booting the app takes about the same time whether we specify a single file or a file and a line number.
This had a huge impact on the feedback loop when working locally, since running a few specific test files was already better.
But we still had to improve the test suite speed and not just the boot time.
Factories vs Fixtures
This application was already using mostly fixtures for the test data, with a few factories here and there for some dynamic data. We didn’t have much to change, but the topic is worth mentioning.
When we work with factories (for example with the factory_bot gem), if our tests keep persisting similar objects, we perform the same work over and over, and the cost adds up as the suite grows. Imagine a test suite that creates a new User record in every test for authentication purposes, always the same, with database writes every single time.
We can improve those cases with fixtures: records that get created in the database at the beginning of the test suite and can be reused by all tests, reducing the number of database writes and the amount of code executed. This always shows a positive improvement in test speed, and factories can still be used for exceptional cases or more dynamic data that still requires writing to the db. We can also go “all in” and use fixtures for almost everything, but it can be hard to keep them clean and organized.
There are some trade-offs with fixtures though:
- the data is defined in a fixtures file instead of close to the test: this makes it a bit harder to understand what’s the specific state required for a test
- fixtures can be invalid by mistake: since Rails simply inserts the data in the database, if our fixture is not correct, we’ll have an invalid record when reading it from the database (this is really common when models change over time and people forget to update the fixtures to reflect those changes)
- complex associations are hard to set up and maintain: when the fixtures are loaded, Rails inserts records in bulk in the db, without setting up proper associations or join models, and without running callbacks, so setting associations means configuring all the records properly ahead of time, instead of letting Rails handle all that complexity
- fixtures that are only used once can make things hard to maintain with little performance gain
Our experience is that it’s better to combine both Fixtures and Factories in a test suite and use them when they make more sense: use fixtures to generate data that is created constantly by tests, and keep factories for data that is not shared that much or that is really hard to set up properly.
Slow Tests
Sometimes the reason a test suite is slow is a handful of specific tests. In the past we have seen well-meant ideas to speed up tests that were technically correct, replacing a known slow pattern in a single file, but applied to a test that was already fast, so the slowness of the pattern was not noticeable in practice.
As always when talking about performance, it’s important to measure first to find where to focus our efforts. With both RSpec and Minitest, the --profile flag can be passed on the command line to get a list of the 10 slowest tests. With that information we can be more effective and really tackle the tests that will have the highest impact.
With this command we identified some tests that had a sleep(1) call (for historical reasons and some race conditions from when the application used threads) and we could remove them by addressing the actual problem.
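As an illustration only (this is not the client’s code, and the actual fixes were specific to their app), one common way to get rid of a fixed sleep is to wait for the background work itself. The BackgroundExport class below is a hypothetical stand-in; the sketch relies on Thread#join with a timeout, which returns as soon as the thread finishes.

```ruby
# Hypothetical class that does some work on a background thread.
class BackgroundExport
  attr_reader :result

  def start
    @thread = Thread.new { @result = (1..1000).sum }
    self
  end

  # Blocks only until the thread finishes. Thread#join returns nil when the
  # timeout expires first, in which case we fail loudly.
  def wait(timeout = 5)
    @thread.join(timeout) or raise "export did not finish within #{timeout}s"
    self
  end
end

export = BackgroundExport.new.start
export.wait   # instead of sleep(1) and hoping the thread is done
export.result # => 500500
```

The test now waits only as long as the work actually takes, and it fails with a clear message instead of passing or failing by luck when the work is slower than expected.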
Test Parallelization
One of our goals during the upgrade was to migrate to Minitest, not only because we knew it’s faster, but also because we could remove the test-unit and test-unit-rails gems, and, more importantly, we knew we would be able to use Rails’ parallelization feature to split the workload of running the tests and significantly reduce the time.
Randomized Order
From the beginning of the project, the 10 thousand tests were executed in a non-random order. One of the first things we had to solve was making sure the tests could run in random order. This was not essential for the actual upgrades, but it’s generally good practice, and it was really important for the future parallelization: adding or removing tests, or running on machines with a different number of cores, would group the tests differently over time and across computers.
Randomizing the order surfaced order-dependent tests. These issues typically involved tests changing something and not reverting it at the end, or tests that actually depended on changes that “leaked” from other tests that ran before them. We were already using transactional tests so the data in the database was not a problem, but that won’t solve issues caused by, for example, changing I18n.locale in a test and not changing it back to the original value.
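A minimal sketch of the usual remedy for this kind of leak, assuming plain Ruby: change the global value inside a helper that restores the previous value in an ensure clause. AppSettings and with_temporary_locale are hypothetical names used so the example runs without any gems; for the locale case specifically, the i18n gem already ships I18n.with_locale, which follows the same idea.

```ruby
# Stand-in for a process-wide setting such as I18n.locale (hypothetical module).
module AppSettings
  class << self
    attr_accessor :locale
  end
  self.locale = :en
end

# Runs the block with a temporary locale and always restores the previous
# value, even when the block raises (for example on a failed assertion).
def with_temporary_locale(locale)
  previous = AppSettings.locale
  AppSettings.locale = locale
  yield
ensure
  AppSettings.locale = previous
end

with_temporary_locale(:es) do
  # code under test sees :es here
end
# AppSettings.locale is back to :en here
```

Because the restore happens in ensure, a failing test cannot leave the changed value behind for whichever test runs next.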
There’s no single way of solving these issues, as they can be anything: a locale change, a file being deleted, a class constant override, some code loaded explicitly in a test, etc. What helps a lot is using the --seed flag to be able to run tests over and over in the same order.
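To see why a seed makes a failing order reproducible, here is a small sketch in plain Ruby. It is not how Minitest is implemented internally, only the underlying idea: a shuffle driven by a seeded random number generator yields the same order every time. The test names and the run_order helper are made up for the example.

```ruby
# Hypothetical test names; a real runner collects these from the test classes.
TESTS = %w[test_login test_export test_locale test_cleanup test_report].freeze

# Returns the tests shuffled deterministically for a given seed.
def run_order(tests, seed)
  tests.shuffle(random: Random.new(seed))
end

first_run  = run_order(TESTS, 12_345)
second_run = run_order(TESTS, 12_345)
first_run == second_run # same seed, same order
```

Minitest prints the seed it used at the start of every run, so a failing order can be replayed by passing that value back with --seed.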
Once we had the test suite running in randomized order, we could finally enable parallelization: a simple parallelize(workers: :number_of_processors) in the ActiveSupport::TestCase class in our test_helper.rb file. We ran the tests, the output now included a line saying Running 10438 tests in parallel using 14 processes, and that’s it! … in theory.
Race Conditions
The theory is not wrong, though: Rails was running the tests in parallel. But we quickly found more problems, not just new order-dependent combinations we hadn’t caught before, but also tests with side effects that conflicted with other tests running at the same time, a kind of race condition.
Something we noticed right away was that the total run time of the test suite dropped from over 40 minutes to less than 4! But we could not be confident about these results yet: so many tests were failing because of these race conditions that many of them were not running all the way to the end, making them take way less time than they should. Still, it was a good first teaser of the speed improvement!
We found that the main culprit for these race conditions was the heavy use of temporary files by this particular application, and how the tests were creating and deleting files and folders all the time. If two tests add files to (or clear) the tmp/export folder (as an example) before or after they run, then when they run in parallel each one can end up removing a file the other test needs.
The solutions were too specific to the client’s application to describe here, but the problems followed a few common patterns:
- hardcoded directories: either directly in the code or as class constants
- aggressive deletion of files: instead of deleting the files just created, it was clearing complete directories
- pessimistic deletion of files: clearing complete directories in case a file was problematic (instead of relying on proper cleanup of previous tests)
- tests modifying the same file: with more than one test writing and reading a single file before reverting the changes
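One way to address the first three patterns, sketched with plain Ruby and the standard library only: pass the directory in instead of hardcoding it, give every test its own temporary directory, and let cleanup remove only what that test created. Exporter and with_isolated_dir are hypothetical names for the example, not the client’s code.

```ruby
require "tmpdir"

# Hypothetical exporter: the output directory is injected instead of being
# hardcoded (for example as a class constant pointing at tmp/export).
class Exporter
  def initialize(dir)
    @dir = dir
  end

  # Writes the content and returns the path of the created file.
  def export(name, content)
    path = File.join(@dir, name)
    File.write(path, content)
    path
  end
end

# Yields a directory that only the calling test uses. Dir.mktmpdir with a
# block removes that directory (and nothing else) when the block finishes.
def with_isolated_dir(prefix = "export-test")
  Dir.mktmpdir(prefix) do |dir|
    yield dir
  end
end

with_isolated_dir do |dir|
  path = Exporter.new(dir).export("report.csv", "id,name\n1,Ada\n")
  File.read(path) # the file exists only for the duration of this block
end
```

With this shape, two tests running at the same time never share a folder, so neither can delete the other’s files. In a Rails test suite, the parallelize_setup and parallelize_teardown hooks receive the worker number and could be used to build one directory per worker instead of one per test.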
This was the most time-consuming process, because, combined with the randomized order and the fact that there’s no guarantee that 2 given tests are going to run at the same time, it was not always easy to find the correct combinations to reproduce these issues (or even find them in the first place!). We had to run the tests over and over and over again for days while fixing these issues.
But by now, running the tests was no longer tedious: we could run the whole test suite (the 10 thousand tests) in around 4 minutes!
No Minitest? No Problem!
Since the application was using Test::Unit, it was fairly easy to migrate to Minitest: a lot of the syntax is similar, the assertion patterns are similar (even if the assertion methods differ), and we knew we wanted to use the built-in parallelization of Rails. But this is not the case for all clients. When they use RSpec or Cucumber we can’t really use Rails’ parallelization, so we have to use alternative gems. Some tools we found over time that could help are:
- flatware (works with RSpec and Cucumber)
- turbo_tests (only RSpec)
- parallel_tests (works with RSpec, Cucumber, and Test::Unit)
- knapsack_pro (integrates with the CI infra to use more than the CPUs of the current machine)
Conclusion
This change was one of our objectives since the beginning of the project. The first priority was always to upgrade the Ruby and Rails versions, but we kept in mind that we wanted to enable parallelization right after, so we took small steps in that direction during the whole process. Rails’ parallelization feature was introduced in Rails 6, so it can be enabled earlier if that’s the main priority for your app.
All the upgrades and parallelization of the tests really improved the developer experience:
- engineers can now use modern Ruby and Rails features and remove old code and dependencies
- they can find better resources for new features
- all the Ruby and Rails speed improvements made development faster, with better code loading, Ruby speed optimizations, and more
- engineers can finally run the tests locally when needed, instead of always relying on CI for small changes
But the 2 main benefits of the parallelization show up in release speed and infrastructure costs. Before, they could effectively merge code into the main branch only once every 40 minutes, while it can now happen after 5; and if existing PRs had conflicts, those PRs had to be updated and wait another 40 minutes before being ready to merge. Parallelization removed the biggest blocker for engineers to iterate quickly. The whole test suite runs in around 4 minutes (some machines have more or fewer processors), and it’s no longer the bottleneck of the process.
The other benefit relates to costs of the CI infrastructure: CI machines were underutilized for years. Each CI machine has between 14 and 16 processors, but, before parallelization, only one core was used to run the tests while the rest were doing practically nothing. This also led to adding more CI machines to speed up the queue of the PRs waiting for tests to run. Now that the machines are being fully used and for a shorter time, the fleet can be reduced, saving not only time but also costs. And since engineers can now run tests locally, this also reduced the number of elements in the CI queue, allowing the fleet to be even smaller.
Do you need help with your slow tests? We can take a look!