Required GitHub Actions Jobs in a Monorepo

Two weeks ago, I wrote about enforcing required status checks in GitHub Actions. That was necessary in scenarios with sharding or other ways of running tests in parallel.

Today, I want to extend this to setups with multiple services, which create a different dynamic: when files in a service change, you want to test that service; otherwise, you want to skip its tests.

I’ll walk through a couple of potential solutions that don’t quite work and then go into an approach that connects the required check enforcement with multiple services in a monorepo.

Using a generic no-op workflow

When you need to enforce passing status checks while dealing with multiple services, of which only a subset is modified in a pull request, GitHub recommends adding a generic workflow that runs when the service in question is not modified:

name: ci
on:
  pull_request:
    paths-ignore:
      - 'scripts/**'
      - 'middleware/**'
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - run: 'echo "No build required" '

While this looks good, the approach breaks down on closer inspection:

Since you’ll make the ci action required, GitHub allows merging once any ci job passes or is skipped. Whenever you open a pull request that modifies more than scripts or middleware, the generic workflow will run and succeed, which allows you to merge changes that could potentially break.

No-op workflows per service

Okay, you think, let’s just create generic workflows per service then, that should make sure we run either the actual ci job or the generic replacement if no file was changed, right? Unfortunately, no.

Besides having to set up each service as a required status check, you can still create a pull request that changes both the service and other files. This is possible because paths-ignore does not prevent a workflow from triggering when service files were modified. It merely disregards changes under the listed paths and still triggers on changes to any other path.

This important difference means that both workflows run, the generic one succeeds sooner, and once again you can merge broken code.
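To make the failure mode concrete, here is a minimal sketch of such a pair of workflows for a single service, shown as two files separated by `---`. The file names and the `npm test` command are assumptions for illustration; the `services/api-service/**` path matches the layout used later in this post.

```yaml
# Hypothetical file: .github/workflows/api-service.yaml
# Runs the real tests when at least one changed file is inside the service.
name: api-service
on:
  pull_request:
    paths:
      - 'services/api-service/**'
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: npm test  # placeholder for the real test command
---
# Hypothetical file: .github/workflows/api-service-noop.yaml
# Runs when at least one changed file is OUTSIDE the ignored path,
# even if service files changed in the same pull request.
name: api-service
on:
  pull_request:
    paths-ignore:
      - 'services/api-service/**'
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - run: 'echo "No build required"'
```

A pull request that touches both `services/api-service/index.js` and `README.md` (hypothetical files) satisfies both filters: the first because a service file changed, the second because `README.md` is not ignored. Both workflows start and report a `build` check, and the no-op one will typically finish first.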

All-in-one workflow

Now, we have a rough feeling of what works and what doesn’t: We need to make sure that the generic workflow runs only if no service files changed, and we need a fallback action per service, not once for all services.

Let’s walk through a hypothetical workflow in .github/workflows/api-service.yaml:

name: api-service
on:
  pull_request:
  # note: This runs on all changes, and is expected to do so!

jobs:

First, let’s declare our workflow and make sure it runs whenever we interact with pull requests. We deliberately do not add a file filter here, because we will check for service changes in the next step. This is really important!

change-detection:
  runs-on: ubuntu-latest
  outputs:
    api-service: ${{ steps.changes.outputs.api-service }}
  steps:
    - uses: dorny/paths-filter@v2
      id: changes
      with:
        list-files: shell
        # all files that are part of any workflow
        # basically a summary of "paths" property of all workflow files
        filters: |
          api-service:
            - 'services/api-service/**'
            - '.github/workflows/api-service.yaml'

Now, we’ll go over the changed files and see if anything related to our current service was modified. The result of this check will be stored as an output and used in subsequent steps. We’ll now follow a branching strategy: If, and only if, service files were modified, we will run the service-specific steps like testing, linting, etc. If this was not the case, we will run a dummy step that simply passes, which is necessary to pass the required status check if the service is not modified.

shards:
  needs: change-detection
  if: needs.change-detection.outputs.api-service == 'true'
  runs-on: ubuntu-20.04
  strategy:
    matrix:
      shard: [0, 1, 2, 3]
  steps:
    - uses: actions/checkout@v2

    # just an example, run your tests in parallel here
    - run: npm run test ...

after-shards:
  needs: shards # run after shards
  runs-on: ubuntu-20.04
  if: success() # only run when all shards have passed
  # store success output flag for ci job
  outputs:
    success: ${{ steps.setoutput.outputs.success }}
  steps:
    - id: setoutput
      # write the flag to the step outputs (replaces the deprecated ::set-output command)
      run: echo "success=true" >> "$GITHUB_OUTPUT"

So first, we run the tests under the condition that our api-service was modified in any way. After successfully running the tests in parallel, we launch an after-shards job, which exports a success flag. If the tests failed or were skipped, after-shards will not run, and no output will be present.

dummy-step:
  runs-on: ubuntu-latest
  needs: change-detection
  # runs if service was not changed
  if: needs.change-detection.outputs.api-service == 'false'
  outputs:
    success: ${{ steps.setoutput.outputs.success }}
  steps:
    - id: setoutput
      # write the flag to the step outputs (replaces the deprecated ::set-output command)
      run: echo "success=true" >> "$GITHUB_OUTPUT"

In case the service was not modified, the dummy job simply sets the success flag. Since this job only runs if the tests did not run, and vice versa, we will not end up with prematurely mergeable pull requests.

# step that will be used for required status check
# make sure it has a unique name! this will always run, but
# after the shards & after-shards steps or the dummy step in case
# the service was not modified
ci_api_service:
  runs-on: ubuntu-20.04
  if: always()
  needs: [shards, after-shards, dummy-step]
  steps:
    - run: |
        passed="${{ needs.after-shards.outputs.success || needs.dummy-step.outputs.success }}"
        if [[ $passed == "true" ]]; then
          echo "Shards passed"
          exit 0
        else
          echo "Shards failed"
          exit 1
        fi

Finally, we always run the last job, which checks whether any success output is present. If the tests ran, there is no dummy output, and vice versa, so the final job can never pass while tests are failing. Since only this last job is configured as a required check, GitHub waits until it has completed, so you never end up with skipped jobs that allow merging breaking changes.
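When debugging this setup, it can help to see how each upstream job ended. The sketch below is an optional addition, not part of the workflow above: it assumes an extra step placed before the check in `ci_api_service` and reads the `needs.<job_id>.result` context value, which GitHub Actions sets to `success`, `failure`, `cancelled`, or `skipped`.

```yaml
# Optional extra step for the ci_api_service job (an assumption, not part of the original workflow).
# Prints how each upstream job ended, which makes it easier to see
# whether the tests or the dummy job produced the success flag.
- name: Show upstream job results
  run: |
    echo "shards:       ${{ needs.shards.result }}"
    echo "after-shards: ${{ needs.after-shards.result }}"
    echo "dummy-step:   ${{ needs.dummy-step.result }}"
```

For a pull request that does not touch the service, this should print `skipped` for `shards` and `after-shards` and `success` for `dummy-step`.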

This solution combines hours of research (read: trial and error) on how the job execution system works in combination with required status checks, and I think it’s quite elegant considering it gives us guarantees that all important checks will definitely run.