Apr 17th, 2022

🚏 Routing Traffic for Dynamic Deployments using Traefik

Hey there 👋 I would like to quickly plug a product I am working on to make teams move faster, happier. If you are working on a software product or are interested in features like preview environments, infrastructure secret management, access control and approval workflows for teams, and other topics in cloud resource management, make sure to check it out!

Whenever I needed to route traffic to one of my side project deployments, I typically chose nginx in the past. I knew the setup process, including requesting certificates using an ACME client like Certbot with LetsEncrypt. Over time, however, adding new deployments started getting tedious.

A big issue with the previous stack was that configuration was exclusively file-based. This meant that every time I wanted to add a new site, even if it was just for reverse proxying to a Docker container, I’d have to add the configuration on disk and reload nginx. Furthermore, requesting certificates for dozens of domains becomes slightly chaotic when automatic renewals aren’t enabled.

Ideally, I’d deploy a reverse proxy in a container, making it easier to service, which could detect the deployments on the same machine and automatically manage routing and TLS termination including the provisioning of certificates.

After a bit of searching, I stumbled upon Traefik, which I hadn’t looked at in a long time. After some quick research, I was delighted to see that its architecture matched my requirements one-to-one.

As an alternative to Traefik, I initially looked at using Cloudflare Tunnel, which unfortunately does not seem to support gRPC at the time of writing. gRPC support was a hard requirement for an upcoming project, so I couldn’t go with this approach. I’ll still write a guide on the experience with Cloudflare Tunnel in the coming weeks, simply because it’s super easy to proxy traffic without needing to expose ports or manage certificates.

An introduction to Traefik

Traefik (pronounced traffic) describes itself as an edge router, which automatically discovers the right configuration for your services. It has built-in support for many orchestrators, including Docker, Kubernetes, AWS ECS, etc. which it uses for discovering your deployed applications, creating a dynamic configuration, and removing the need for manually writing down where traffic should be routed to.

I’ve used Traefik primarily with the Docker provider, as this enables me to deploy side project workloads as containers on a simple virtual server, and automatically route traffic to my services.

Using lego, Traefik also implements an ACME client, so you can generate certificates for your services on the fly. This is incredibly convenient, as you can simply deploy a container with the appropriate labels and let Traefik take care of provisioning or renewing a certificate if needed.

In addition to simply routing traffic to services, there are many additional features like middleware to customize request handling (redirect, rate-limit, etc.), access logs, metrics, and tracing using OpenTracing. The Traefik community is quite active, and new versions are usually released multiple times a year.

Detecting deployed Docker workloads

When you want to expose a container to the internet via Traefik, all you need to do is to add a couple of labels. Let’s walk through an example deployment using Docker Compose to understand what’s needed for automatic certificate provisioning!

# modified version of https://doc.traefik.io/traefik/user-guides/docker-compose/acme-http/
version: '3.8'
services:
  traefik:
    image: 'traefik:v2.7'
    container_name: 'traefik'
    # Route to any container, requires that port is exposed!
    # We can assign random ports by not specifying a host port and setting
    # the internal (known) port for Traefik with Docker container labels
    network_mode: 'host'
    command:
      - '--accesslog=true'
      - '--api.insecure=true'

      # Configure Docker provider
      - '--providers.docker=true'
      - '--providers.docker.exposedbydefault=false'
      # Since we run in host network mode, connect to port binding (host) IP/Port instead
      # of internal (container) IP/Port
      - '--providers.docker.usebindportip=true'

      - '--entrypoints.web.address=:80'
      - '--entrypoints.websecure.address=:443'

      # Configure TLS using LetsEncrypt
      - '--certificatesresolvers.main.acme.httpchallenge=true'
      - '--certificatesresolvers.main.acme.httpchallenge.entrypoint=web'
      - '--certificatesresolvers.main.acme.email=<your email>'
      - '--certificatesresolvers.main.acme.storage=/letsencrypt/acme.json'
    volumes:
      - './letsencrypt:/letsencrypt'
      - '/var/run/docker.sock:/var/run/docker.sock:ro'

This is all you need to deploy a single Traefik instance accepting traffic at ports 80 (HTTP) and 443 (HTTPS). The instance is configured to use the Docker provider, but will only include containers that specifically enable Traefik routing. We also run the container in the host network, so that we can reach any container or application on the host that is bound to a port.

For this, we need to use the host network in the container settings, as well as instruct Traefik to use the bind port and IP instead of the internal port and IP of a container. The latter is necessary because the Traefik container will not be in the same network as the other containers and thus is unable to route traffic through internal Docker networking. Using host networking also removes the requirement of publishing any ports, as it uses the same network namespace.

Last but not least, we create a certificate resolver called main that is configured to use the HTTP challenge and listen on the web entrypoint (i.e. on port 80). It stores all details in a bind mount.

To interface with the Docker daemon, we also pass the Docker UNIX socket.
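For reference, the CLI flags above can also be written as a static configuration file. The following is a sketch of what I believe is the equivalent traefik.yml for Traefik v2, not something taken from the setup above. It assumes you mount the file into the container at /etc/traefik/traefik.yml, one of the locations Traefik checks by default. Traefik reads its static configuration from only one source, so use either the flags or the file, not both.

```yaml
# traefik.yml: static configuration mirroring the CLI flags above (sketch)
accessLog: {}            # same as --accesslog=true

api:
  insecure: true         # dashboard on port 8080, not for production

providers:
  docker:
    exposedByDefault: false
    # Route to the host binding (IP/port) instead of the container IP/port
    useBindPortIP: true

entryPoints:
  web:
    address: ':80'
  websecure:
    address: ':443'

certificatesResolvers:
  main:
    acme:
      email: '<your email>'
      storage: '/letsencrypt/acme.json'
      httpChallenge:
        entryPoint: web
```

With this approach, the command section of the Compose service would be dropped and an additional volume such as './traefik.yml:/etc/traefik/traefik.yml:ro' added.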

After setting up Traefik, let’s create a demo service to see how we can route traffic and terminate TLS!

whoami:
  image: 'traefik/whoami'
  container_name: 'traefik-whoami'
  ports:
    - '127.0.0.1::80'
  labels:

This service publishes port 80 of the container on a random host port bound to 127.0.0.1. The reasoning behind this decision is that when you deploy multiple applications on the same host, you might have to come up with port rules to avoid collisions between deployed workloads. An easy workaround is not to set a host port at all, so Docker chooses a random, unassigned port. You can also specify a range of ports if needed. You might wonder how Traefik can route traffic to a port you don’t even know, and that’s a great question, which we’ll answer in a couple of steps!
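As a quick aside on the port range option: the sketch below shows how the same service could pick its host port from a fixed range instead of a fully random one. The range 8000-8010 is an arbitrary value chosen for illustration, and I'm assuming the Compose short port syntax accepts a host range combined with a bind IP here.

```yaml
services:
  whoami:
    image: 'traefik/whoami'
    ports:
      # Docker picks a free host port between 8000 and 8010 (hypothetical range)
      # and maps it to port 80 inside the container
      - '127.0.0.1:8000-8010:80'
```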

Next, we need to add a couple of labels to be picked up by Traefik.

- 'traefik.enable=true'

This label is required as we turned off exposedbydefault in our Docker provider. Without this label, Traefik will simply ignore the container and its configuration labels.

- 'traefik.http.routers.whoami.rule=Host(`<some hostname>`)'
- 'traefik.http.routers.whoami.entrypoints=websecure'

These two labels make sure we create a new router for the whoami service, which picks up all HTTPS traffic (via the websecure entrypoint bound to port 443) to the given host.

- 'traefik.http.routers.whoami.tls.certresolver=main'

To accept encrypted traffic, we need to specify the certificate resolver we configured earlier. Setting this enables automatic certificate provisioning from LetsEncrypt using Traefik’s built-in ACME client with the HTTP challenge.

- 'traefik.http.routers.whoami.service=whoami'
# Route to random port binding associated with internal port 80
- 'traefik.http.services.whoami.loadbalancer.server.port=80'

Next, we need to help the Docker provider find the right IP/port combination to route traffic to. Previously, we opted for a random port to avoid collisions. We also know that Traefik and the other deployments may not share the same Docker network, which is why we simply publish the random port on the host.

Usually, when only one port is exposed, Traefik detects and uses that port. When more than one port is exposed, you need to configure the port manually in the service settings. For this, we bind a new whoami service to the router and configure port 80.

Since we enabled usebindportip, Traefik will not route traffic to the internal IP of the whoami container and port 80, but use the binding instead. To do this, it searches for a binding (IP/port combination on the host) with a matching container (internal) port based on the port setting. This requires us to set the loadbalancer.server.port label.

# Redirect to https
- 'traefik.http.middlewares.redirect-to-https.redirectscheme.scheme=https'
- 'traefik.http.middlewares.redirect-to-https.redirectscheme.permanent=true'

- 'traefik.http.routers.redirect-to-https.entrypoints=web'
- 'traefik.http.routers.redirect-to-https.rule=!PathPrefix(`/.well-known/acme-challenge/`)'
- 'traefik.http.routers.redirect-to-https.middlewares=redirect-to-https'

Finally, we create a new router on the HTTP entrypoint that redirects all traffic to HTTPS, except for requests to the ACME challenge path. This is completely optional, but removes the need to configure such a redirect on all deployed services.
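Putting the fragments from this section together, the complete whoami service could look like the following. Everything here comes from the snippets above; `<some hostname>` remains a placeholder you need to replace with your own domain.

```yaml
services:
  whoami:
    image: 'traefik/whoami'
    container_name: 'traefik-whoami'
    ports:
      # Publish container port 80 on a random host port bound to localhost
      - '127.0.0.1::80'
    labels:
      # Opt in, since exposedbydefault is turned off
      - 'traefik.enable=true'

      # Router for HTTPS traffic to the given host
      - 'traefik.http.routers.whoami.rule=Host(`<some hostname>`)'
      - 'traefik.http.routers.whoami.entrypoints=websecure'
      - 'traefik.http.routers.whoami.tls.certresolver=main'

      # Route to random port binding associated with internal port 80
      - 'traefik.http.routers.whoami.service=whoami'
      - 'traefik.http.services.whoami.loadbalancer.server.port=80'

      # Redirect to https
      - 'traefik.http.middlewares.redirect-to-https.redirectscheme.scheme=https'
      - 'traefik.http.middlewares.redirect-to-https.redirectscheme.permanent=true'
      - 'traefik.http.routers.redirect-to-https.entrypoints=web'
      - 'traefik.http.routers.redirect-to-https.rule=!PathPrefix(`/.well-known/acme-challenge/`)'
      - 'traefik.http.routers.redirect-to-https.middlewares=redirect-to-https'
```

Since the redirect router is not tied to a specific host, it only needs to be declared on one container.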

Accepting gRPC traffic

To forward gRPC traffic, you’ll need to decide whether you want to pass on unencrypted HTTP/2 traffic using the h2c scheme, or if you want to encrypt gRPC traffic between Traefik and your application. Since I’m hosting Traefik on the same machine as the gRPC server, I’ll outline the configuration change you need for unencrypted proxying.

- 'traefik.http.routers.grpc-app.service=grpc-app'
- 'traefik.http.services.grpc-app.loadbalancer.server.scheme=h2c'
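For context, here is a sketch of how these two labels might fit into a full service definition, following the same pattern as the whoami example. The image name, the hostname, and the internal port 50051 are assumptions for illustration; substitute whatever your gRPC server actually uses.

```yaml
services:
  grpc-app:
    image: '<your grpc image>'        # hypothetical placeholder
    ports:
      # Random host port mapped to the assumed internal gRPC port 50051
      - '127.0.0.1::50051'
    labels:
      - 'traefik.enable=true'
      - 'traefik.http.routers.grpc-app.rule=Host(`<some grpc hostname>`)'
      - 'traefik.http.routers.grpc-app.entrypoints=websecure'
      - 'traefik.http.routers.grpc-app.tls.certresolver=main'
      - 'traefik.http.routers.grpc-app.service=grpc-app'
      # Forward unencrypted HTTP/2 (h2c) to the backend
      - 'traefik.http.services.grpc-app.loadbalancer.server.scheme=h2c'
      - 'traefik.http.services.grpc-app.loadbalancer.server.port=50051'
```

Clients still connect to Traefik over TLS on port 443; only the hop between Traefik and the application is unencrypted.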

Checking everything works

Now that you’ve configured Traefik and your services, let’s check that it all works. By adding the --api.insecure=true flag to our Traefik configuration, we exposed a dashboard at port 8080. As the name of the option suggests, this method of exposing the dashboard does not allow for security measures like authentication or encryption, so you shouldn’t use the setting in production.

To test your HTTP applications, just navigate your browser to the endpoint or run a simple curl. Make sure the redirect rule works as expected by sending traffic using the http scheme. For your gRPC services, I recommend using grpcurl. If you cannot reach your endpoint through Traefik, try sending traffic directly to the host binding IP/port combination. In case your application runs without TLS, set the -plaintext option. Also, make sure to provide your proto source files in case your server does not support server reflection.

Thanks for reading this post 🙌

Bruno Scheufler

Software Engineering, Management
