GitHub Actions is becoming one of the major CI providers, benefitting hugely from its tight integration with GitHub's other features. In this post, I'll walk through a feature that is seemingly inconspicuous but can become quite powerful if used right: job strategies, and more precisely, the matrix strategy.
Within GitHub Actions Workflows, everything you want to run needs to be declared as a job with steps. This is great until you have quite similar workflows with only a few variations, such as builds for different versions, or infrastructure-as-code deployments of different services and targets.
The matrix strategy allows you to write a job once and pass in several variants that the job will be run for. You write a baseline set of steps and other job details, and access the details of each variant, such as the current version, via the matrix context.
A static matrix
Let's try to understand the matrix strategy with the simple example of running a build for multiple versions of Node.js.
jobs:
  build:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        node: [10, 12, 14]
    steps:
      # Configures the node version used on GitHub-hosted runners
      - uses: actions/setup-node@v2
        with:
          # The Node.js version to configure
          node-version: ${{ matrix.node }}
In this example, we provided a parameter called node to the matrix, with a list of major versions we want to target. For each of these versions, we will run the job once, setting the matrix context to the current version. We can then access the current node version with ${{ matrix.node }}.
If this sounds abstract, think of it as a for loop:
const matrixNode = [10,12,14]
for (const node of matrixNode) {
  runJob(..., { matrix: { node } })
}
// -> runJob(..., { matrix: { node: 10 } })
// -> runJob(..., { matrix: { node: 12 } })
// -> runJob(..., { matrix: { node: 14 } })
You can also specify multiple matrix parameters for a job:
matrix:
  os: [ubuntu-18.04, ubuntu-20.04]
  node: [10, 12, 14]
# The matrix above generates the following jobs:
# os: ubuntu-18.04 node: 10
# os: ubuntu-18.04 node: 12
# os: ubuntu-18.04 node: 14
# os: ubuntu-20.04 node: 10
# os: ubuntu-20.04 node: 12
# os: ubuntu-20.04 node: 14
With this, GitHub Actions computes every combination of the two operating systems and the three Node.js versions, resulting in six jobs in total.
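To make the combination logic concrete, here is a minimal sketch of how such an expansion could be computed. The helper name expandMatrix is made up for this post and is not part of GitHub Actions; it only mirrors the behavior described above for plain lists of values.

```javascript
// Expand a matrix definition into one parameter set per job.
// `expandMatrix` is a hypothetical helper, not a GitHub Actions API.
function expandMatrix(matrix) {
  // Start with a single empty combination and extend it one key at a time.
  return Object.entries(matrix).reduce(
    (combinations, [key, values]) =>
      combinations.flatMap((combination) =>
        values.map((value) => ({ ...combination, [key]: value }))
      ),
    [{}]
  );
}

const jobs = expandMatrix({
  os: ["ubuntu-18.04", "ubuntu-20.04"],
  node: [10, 12, 14],
});

// Prints one line per generated job, in the same order as the listing above.
for (const job of jobs) {
  console.log(`os: ${job.os} node: ${job.node}`);
}
```

Every additional parameter multiplies the number of jobs, so the total is the product of the list lengths (here 2 x 3 = 6).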
Of course, we had to specify each version we want to run on manually, so this workflow fits best if you don't often change the matrix configuration, or if it's fine to do so manually.
Scopes available in the strategy
Previously, we declared our matrix configuration statically, so for any change, we would have to edit the workflow configuration file. If your matrix configuration is more dynamic, or if you want to use a single source of truth for which jobs to generate, let's check out if there are other ways to pass in our matrix configuration.
Fortunately, the Actions documentation includes a helpful page explaining contexts and their availability. If we look up the strategy key there, we can see that environment variables (the env context) are unfortunately not available to the strategy, but the needs context is. This way, we can chain two jobs together: one that retrieves the matrix configuration, and a second one that declares and uses it to generate a dynamic number of jobs.
Using a previous job's outputs
Checking out the documentation, we found out that you can use a previous job's output as input for a job strategy, including the matrix configuration. We can use this fact to dynamically generate our matrix configuration.
name: build
on: push
jobs:
  job1:
    runs-on: ubuntu-latest
    outputs:
      # This needs to match your step's id and name parameters
      matrix: ${{ steps.set-matrix.outputs.matrix }}
    steps:
      # Important: Do not forget the id!
      - id: set-matrix
        run: echo "::set-output name=matrix::{\"node\":[10, 12, 14]}"
  job2:
    needs: job1
    runs-on: ubuntu-latest
    strategy:
      # This needs to match the first job's name and output parameter
      matrix: ${{fromJSON(needs.job1.outputs.matrix)}}
    steps:
      - run: build
This example shows how we can declare two jobs: a first one that outputs our matrix configuration by using the ::set-output workflow command to set an output parameter (GitHub has since deprecated this command in favor of writing to the $GITHUB_OUTPUT file), and a second job that only runs once the first one completes and uses the output as its strategy.
We pass the matrix configuration as a JSON string, so in the second job, we parse it using the fromJSON function, as the strategy requires objects or arrays to work with.
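As a rough illustration of that round trip, the sketch below serializes a matrix into the single-line command from the example and parses it back. Both helpers are hypothetical and only approximate what happens between the jobs: in the real workflow, the runner stores the text after the second "::" as the output, and fromJSON does the parsing, which is modeled here with JSON.parse.

```javascript
// Hypothetical helpers that mimic the two halves of the hand-off between jobs.

// job1: serialize the matrix into the single-line workflow command from the example.
function formatSetOutput(name, value) {
  return `::set-output name=${name}::${JSON.stringify(value)}`;
}

// job2: approximate fromJSON by parsing the text after the second "::".
function parseSetOutput(line) {
  const match = /^::set-output name=([^:]+)::(.*)$/.exec(line);
  if (match === null) {
    throw new Error(`not a set-output command: ${line}`);
  }
  return { name: match[1], value: JSON.parse(match[2]) };
}

const line = formatSetOutput("matrix", { node: [10, 12, 14] });
console.log(line); // ::set-output name=matrix::{"node":[10,12,14]}

const parsed = parseSetOutput(line);
console.log(parsed.name, parsed.value.node.length);
```

The important detail is that the value travels between the jobs as plain text, which is why the second job has to turn it back into an object before the strategy can use it.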
Our example job pretty much ends up with the same matrix configuration as the previous static example, but this time, we can use environment variables or any command to generate our matrix dynamically.
Example: From environment variables
With the separate preparation step, we can use environment variables to hold our matrix configuration.
name: build
on: push
env:
  MATRIX: "{\"node\":[10, 12, 14]}"
jobs:
  job1:
    runs-on: ubuntu-latest
    outputs:
      # This needs to match your step's id and name parameters
      matrix: ${{ steps.set-matrix.outputs.matrix }}
    steps:
      # Important: Do not forget the id!
      - id: set-matrix
        run: echo "::set-output name=matrix::$MATRIX"
  job2:
    needs: job1
    runs-on: ubuntu-latest
    strategy:
      # This needs to match the first job's name and output parameter
      matrix: ${{fromJSON(needs.job1.outputs.matrix)}}
    steps:
      - run: build
This way, we just need to update the environment variable at the top, instead of moving through the complete workflow and finding places to update.
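Since the variable is edited by hand, a typo in the JSON would only surface when the second job tries to parse it. If you want to catch that earlier, a small check along these lines could run in the first job. This is only a sketch: validateMatrix is a made-up helper, and it assumes the matrix should be a JSON object whose values are all arrays, as in the examples in this post.

```javascript
// Hypothetical pre-flight check for a matrix passed in as a JSON string.
function validateMatrix(json) {
  let matrix;
  try {
    matrix = JSON.parse(json);
  } catch (error) {
    throw new Error(`MATRIX is not valid JSON: ${error.message}`);
  }
  if (matrix === null || typeof matrix !== "object" || Array.isArray(matrix)) {
    throw new Error("MATRIX must be a JSON object");
  }
  for (const [key, values] of Object.entries(matrix)) {
    if (!Array.isArray(values)) {
      throw new Error(`MATRIX.${key} must be an array`);
    }
  }
  return matrix;
}

// In a workflow step this string would come from process.env.MATRIX;
// here we reuse the value from the example above.
const fromEnv = '{"node":[10, 12, 14]}';
const matrix = validateMatrix(fromEnv);
console.log(JSON.stringify(matrix));
```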
Example: Run for all Pulumi stacks
As a final example, we can improve the experience for infrastructure-as-code tooling in CI by using said tools as the source of truth. In this case, we'll use the Pulumi stack files to list all stacks we should run through, and use that as the matrix configuration. And whenever we add a new stack, it'll automatically be included.
job1:
  runs-on: ubuntu-latest
  outputs:
    matrix: ${{ steps.set-matrix.outputs.matrix }}
  steps:
    - uses: actions/checkout@v2
    - id: set-matrix
      run: echo "::set-output name=matrix::$(ls Pulumi.*.yaml | sed s/Pulumi\.// | sed s/\.yaml// | jq -Rsc '. / "\n" - [""]')"
job2:
  name: ${{ matrix.stack }}
  needs: job1
  runs-on: ubuntu-latest
  strategy:
    matrix:
      stack: ${{fromJSON(needs.job1.outputs.matrix)}}
The command can be a bit difficult to read, but it lists all Pulumi stack files in the current directory, removes the prefix and suffix so only the stack name remains, and formats the result as a JSON array in the form that we need in the second job.
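If the shell pipeline is hard to follow, the same transformation can be sketched in a few lines of JavaScript. The function name stackMatrixFromFiles and the sample file names are made up for illustration; the sketch takes a list of file names instead of reading the directory, and it skips files with an empty stack name.

```javascript
// Hypothetical equivalent of the ls | sed | jq pipeline.
function stackMatrixFromFiles(fileNames) {
  const stacks = [];
  for (const fileName of fileNames) {
    // Keep only files shaped like Pulumi.<stack>.yaml and capture <stack>.
    const match = /^Pulumi\.(.+)\.yaml$/.exec(fileName);
    if (match !== null) {
      stacks.push(match[1]);
    }
  }
  // Compact, single-line JSON, like jq's -c flag produces.
  return JSON.stringify(stacks);
}

console.log(
  stackMatrixFromFiles(["Pulumi.yaml", "Pulumi.dev.yaml", "Pulumi.prod.yaml", "README.md"])
);
// ["dev","prod"]
```

The project file Pulumi.yaml is left out because it has no stack name between the prefix and the suffix, which matches what the Pulumi.*.yaml glob selects.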
This way, we run a job for each stack, which is reflected in the job name as well.
With the matrix strategy, you can make your GitHub Actions workflows incredibly dynamic and versatile, using a single source of truth, such as another tool, to generate as many jobs as you need.
One important limit you should take into account is that a job matrix can only generate up to 256 jobs per workflow run. If your use case would result in more than that, you might need to think about a different approach and investigate if GitHub Actions is the best fit for your case.
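When the matrix is generated dynamically, it can be worth checking the job count before handing the configuration over. The sketch below is one way to do that, with made-up helper names; it assumes a matrix made of plain value lists and does not account for include or exclude entries.

```javascript
// The limit quoted above.
const MAX_MATRIX_JOBS = 256;

// The number of jobs is the product of the lengths of all value lists.
function countMatrixJobs(matrix) {
  return Object.values(matrix).reduce((total, values) => total * values.length, 1);
}

// Hypothetical guard: returns the job count or throws if it exceeds the limit.
function assertWithinLimit(matrix) {
  const count = countMatrixJobs(matrix);
  if (count > MAX_MATRIX_JOBS) {
    throw new Error(`matrix would generate ${count} jobs, limit is ${MAX_MATRIX_JOBS}`);
  }
  return count;
}

console.log(
  assertWithinLimit({ os: ["ubuntu-18.04", "ubuntu-20.04"], node: [10, 12, 14] })
); // 6
```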