Jul 17, 2020

The Weirdest Error I Ever Debugged

When you work on a software project of any kind, you'll run into situations where you spend hours debugging, picking everything apart to find out why something just doesn't work. In most cases, you'll get to the source relatively quickly, or at least find indicators of where to look next.

I recently encountered a case that was different from the usual things that come up, like malformed SQL statements or other funky topics like race conditions and performance bottlenecks.

Working on my Portfolio API in Go, which is used to manage my newsletter, I was using my Postgres library of choice, pgx, to write some run-of-the-mill CRUD logic. In this case, I attempted to load a subscriber, not caring about cases where the query didn't return any rows. My code looked something like the following:

var isExistingSubscriber bool
err = tx.QueryRow(
  ctx,
  `select true from "subscriber" where "email" = $1;`,
  body.Email,
).Scan(&isExistingSubscriber)
if err != nil {
  // Handle error
  return
}

This is pretty self-explanatory: I ran a query to check whether a subscriber with a certain email address exists and, based on that, flipped a toggle by scanning the true value, which is only returned if the subscriber exists.

One important detail about the QueryRow method, though, is that if the query does not return any rows, calling Scan on the result returns a pgx.ErrNoRows error. So far so good. With this in mind, we'll simply add a special check:

if err != nil && !errors.Is(err, pgx.ErrNoRows) {
  // Handle error
}

Beautiful. I compiled and restarted the service and ran a request to check that everything was working as intended. In this case, no subscriber should be found, so I expected the flag to evaluate to false. The response was a nicely formatted error message. Not what I expected, but fine. I fired up GoLand's trusty debugger and set some breakpoints during and after running the query above.

To my surprise, everything seemed like it should have succeeded: the flag was still set to false, and the query returned the error that no row was to be found. Stepping to the next line, however, execution did not skip the error handling, even though the error we received was exactly the one we were checking for.

I mean sure, using the semi-recent errors.Is (added in Go 1.13) to walk the potential chain of wrapped errors and find an occurrence of our expected error might just not be supported, but the library did export the error as follows:

var ErrNoRows = errors.New("no rows in result set")

With my runtime, errors.Is should have worked completely fine. So let's try something else: how about comparing the error plain and simple?

if err != nil && err != pgx.ErrNoRows {

And it failed again. Plain equality did not work either. Of course, while debugging I could evaluate arbitrary expressions in the current scope, so I ran both the errors.Is check and the equality check in the debugger's evaluator, and both worked.

Matching the error worked in the debugger, but not at runtime. I felt as if I was part of a big joke I didn't get; I couldn't even trust my debugger anymore.

But there was one last way to force it to work: string equality. We saw that the messages did match, so I tried comparing those:

if err != nil && err.Error() != pgx.ErrNoRows.Error() {

While that wasn't particularly pretty, or usable with wrapped errors, it worked. At that point, I was relieved that I hadn't lost my sanity. But that failing comparison bugged me: it should have worked. It was the exact same error that was created in the library, which got returned and which I then checked against itself.

And in the midst of refactoring some other code, it hit me. Multiple major versions of pgx can be consumed as Go modules: you can import either github.com/jackc/pgx or github.com/jackc/pgx/v4, the latter of which I used for accessing the database.

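In the request handler file, however, I imported v3. While the error message itself never changed, each version is a separate package that creates its own ErrNoRows value with its own errors.New call. Comparing values that come from two different packages yields inequality, even when they look identical.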

To demonstrate this case, I've put together a straightforward reproduction of the comparison:

package main

import (
	"errors"
	"fmt"
	v3 "github.com/jackc/pgx"
	v4 "github.com/jackc/pgx/v4"
)

func main() {
	someErr := v4.ErrNoRows

	if errors.Is(someErr, v3.ErrNoRows) {
		fmt.Println("It's a match!")
	}
}

When running this file, you'll never see the message printed. Once you change v3.ErrNoRows to v4.ErrNoRows in the condition, it works as expected.

This was an interesting lesson in understanding Go internals, but I felt like I learned something else as well: if I had checked the imports first, this might have been obvious. When debugging problems, seeing the full picture is key to determining which part needs fixing and why an issue occurred in the first place.

In the end, the reason behind most errors is simple; getting there is the hard part. Often it helps to pair up and get a fresh pair of eyes, to make sure you haven't ruled out the part that turns out to be the true root cause.

Taking a break, maybe doing something completely different, giving your brain some much-needed time to process also helps to find solutions that seemed distant before.


Thanks for reading! As teased in the post, I just launched my brand-new newsletter, so if you want to stay on top of new posts, feel free to subscribe! If you've got any questions, suggestions, or feedback in general, don't hesitate to reach out on Twitter or by mail.