Jan 23rd, 2022

🪆 Important Rules of Embedding Types in Go

Hey there 👋 I would like to quickly plug a product I am working on to make teams move faster, happier. If you are working on a software product or are interested in features like preview environments, infrastructure secret management, access control and approval workflows for teams, and other topics in cloud resource management, make sure to check it out!

While Go doesn’t come with the typical language concepts for inheritance-based type hierarchies, it provides a powerful way to reuse parts of other interfaces and structs: embedding.

Using the example from Effective Go: for interfaces, embedding lets you build a new interface out of existing ones by listing them in its definition.

type Reader interface {
    Read(p []byte) (n int, err error)
}

type Writer interface {
    Write(p []byte) (n int, err error)
}

// ReadWriter is the interface that combines the Reader and Writer interfaces.
type ReadWriter interface {
    Reader
    Writer
}

Creating a ReadWriter interface that embeds other interfaces combines the method sets of both the Reader and the Writer. Within interfaces, you can only embed other interfaces!

A similar idea is to embed structs within structs, using unnamed (anonymous) fields:

// ReadWriter stores pointers to a Reader and a Writer.
// It implements io.ReadWriter.
type ReadWriter struct {
    *Reader  // *bufio.Reader
    *Writer  // *bufio.Writer
}

This example constructs a ReadWriter struct that embeds pointers to a Reader and a Writer. Because the embedded fields are pointers, they are nil in a zero-value ReadWriter: you cannot use methods or fields of the Reader or Writer before both point to valid structs.

Embedding the types instead of giving the fields a name lets you call their methods and access their fields without an extra level of indirection, e.g. readWriter.Read() instead of readWriter.reader.Read().
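Here is a small sketch of that struct in use, assuming the embedded types are bufio's Reader and Writer (as the comments in the snippet above suggest). The echo helper is a hypothetical name introduced only for this example.

```go
package main

import (
	"bufio"
	"bytes"
	"fmt"
)

// ReadWriter embeds pointers to a bufio.Reader and a bufio.Writer.
type ReadWriter struct {
	*bufio.Reader
	*bufio.Writer
}

// echo is a hypothetical helper: it writes a line through the promoted
// Writer methods and reads it back through the promoted Reader methods.
func echo(line string) (string, error) {
	var buf bytes.Buffer
	rw := ReadWriter{
		Reader: bufio.NewReader(&buf),
		Writer: bufio.NewWriter(&buf),
	}
	// WriteString and Flush are promoted from *bufio.Writer.
	if _, err := rw.WriteString(line + "\n"); err != nil {
		return "", err
	}
	if err := rw.Flush(); err != nil {
		return "", err
	}
	// ReadString is promoted from *bufio.Reader.
	return rw.ReadString('\n')
}

func main() {
	out, err := echo("hello")
	fmt.Printf("%q %v\n", out, err) // "hello\n" <nil>

	// In the zero value both embedded pointers are nil, so calling
	// zero.ReadString('\n') here would panic.
	var zero ReadWriter
	fmt.Println(zero.Reader == nil, zero.Writer == nil) // true true
}
```

The embedded field takes the name of its type without the package qualifier, which is why the composite literal uses Reader and Writer as keys.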

There are some important aspects you need to keep in mind when using embedding in your applications, though. In the following, we’ll go over some of the common errors that can be hard to reason about.

Calling methods on the root type

When calling methods that are exposed by embedding anonymous fields on a struct, the receiver is always the inner type that declares the method. In a way, embedding forwards the information about a method existing to the outer type, while forwarding the method invocation back to the inner type that exposed it in the first place.

package main

import (
	"fmt"
)

type Dog struct {
	sound string
}

func (d *Dog) Sound() string {
	return d.sound
}

type Animal struct {
	Dog
	sound string
}

func main() {
	a := Animal{
		sound: "...",
		Dog:   Dog{sound: "woof"},
	}
	fmt.Println(a.Sound()) // <- woof
}

In this example, the method Sound() is promoted from the Dog struct to the Animal struct, so we can call Sound() on an Animal. Doing this, however, forwards the call to the embedded Dog, so the method reads Dog's sound field ("woof") and never sees the sound field declared on Animal!
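The same rule explains why embedding does not give you virtual dispatch. In the sketch below (Base and Outer are hypothetical types, not from the example above), Outer declares its own Name method, yet Describe, which is declared on Base, still calls Base's Name because its receiver is always a Base.

```go
package main

import "fmt"

type Base struct{}

func (Base) Name() string { return "base" }

// Describe is declared on Base, so b is always a Base,
// even when the call is made through an embedding type.
func (b Base) Describe() string { return "I am " + b.Name() }

type Outer struct {
	Base
}

// Name hides Base.Name for callers that go through Outer.
func (Outer) Name() string { return "outer" }

func main() {
	o := Outer{}
	fmt.Println(o.Name())     // outer
	fmt.Println(o.Describe()) // I am base
}
```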

Naming conflicts and hiding fields

When embedding types, you need to keep a couple of rules in mind.

A field or method X hides any other field or method X in a more deeply nested part of the type

Let’s create a short example of embedding types through multiple layers.

package main

import (
	"fmt"
)

type C struct {
	Hello string
}

type B struct {
	C
}

type A struct {
	B
}

func main() {
	a := A{B: B{C: C{Hello: "universe"}}}
	fmt.Println(a.Hello) // <- universe
}

The Hello field in C is embedded by B, which is in turn embedded by A. This way, A receives all fields and methods from B, which receives all fields and methods from C, so Hello is available on A. This case yields the expected universe value.

type A struct {
	B
	Hello string // <- added
}

func main() {
	a := A{Hello: "world", B: B{C: C{Hello: "universe"}}}
	fmt.Println(a.Hello) // <- world
}

Adding a Hello field to a higher layer hides the field in all more deeply nested parts of the type. Running the program again no longer yields universe but world.
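The hidden field is not gone, though. As a short sketch using the same A, B, and C types, the deeper value stays reachable through an explicit selector path; the sample function is a hypothetical helper added only to build the value.

```go
package main

import "fmt"

type C struct {
	Hello string
}

type B struct {
	C
}

type A struct {
	B
	Hello string
}

// sample is a hypothetical helper that builds the value used below.
func sample() A {
	return A{Hello: "world", B: B{C: C{Hello: "universe"}}}
}

func main() {
	a := sample()
	fmt.Println(a.Hello)     // world (the shallowest Hello wins)
	fmt.Println(a.B.Hello)   // universe (promoted from C through B)
	fmt.Println(a.B.C.Hello) // universe (fully qualified path)
}
```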

If the same name appears at the same nesting level, selecting it through the outer type is ambiguous and the program will not compile. If the duplicate name is never used through the outer type, there’s no error.

package main

import (
	"fmt"
)

type B struct {
	Hello string
}

type C struct {
	Hello string
}

type A struct {
	B
	C
}

func main() {
	a := A{B: B{Hello: "world 1"}, C: C{Hello: "world 2"}}
	// fmt.Println(a.Hello) // <- not allowed: ambiguous selector, will not compile
	fmt.Println(a.B.Hello) // <- world 1
	fmt.Println(a.C.Hello) // <- world 2
}

In this case, we have the Hello field as part of two structs that are embedded by our main struct. Trying to access Hello on the root struct A will not compile. When you instead access Hello on one of the embedded structs, everything works as expected.
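To make both halves of the rule concrete, here is a sketch with two hypothetical outer types. Ambiguous compiles because nothing selects Hello through it directly, and Resolved removes the ambiguity by declaring its own Hello at the top level, which hides both embedded fields.

```go
package main

import "fmt"

type B struct {
	Hello string
}

type C struct {
	Hello string
}

// Ambiguous embeds two types that both declare Hello. This is fine
// as long as no code selects Hello directly on an Ambiguous value.
type Ambiguous struct {
	B
	C
}

// Resolved declares Hello at depth zero, hiding both embedded fields.
type Resolved struct {
	B
	C
	Hello string
}

func main() {
	x := Ambiguous{B: B{Hello: "world 1"}, C: C{Hello: "world 2"}}
	fmt.Println(x.B.Hello, x.C.Hello) // world 1 world 2

	y := Resolved{
		B:     B{Hello: "world 1"},
		C:     C{Hello: "world 2"},
		Hello: "world 3",
	}
	fmt.Println(y.Hello) // world 3
}
```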

Marshaling / Unmarshaling

Trying to unmarshal JSON values into structs that embed other types can quickly lead to unexpected behaviour. Let’s use the following example for the remainder of this section:

package main

import (
	"encoding/json"
	"fmt"
)

type Dog struct {
	Color string `json:"color"`
}

type Cat struct {
	Color string `json:"color"`
}

type Animal struct {
	Dog
	Cat
	Kind string `json:"kind"`
}

func main() {
	var a Animal
	err := json.Unmarshal([]byte(`{"kind":"dog","color":"golden"}`), &a)
	if err != nil {
		panic(err)
	}

	fmt.Println(a.Kind)
	fmt.Println(a.Dog.Color)
	fmt.Println(a.Cat.Color)
}

What do you think this produces? Note the duplicate exported Color field with the same JSON struct tag on both the Dog and the Cat struct. Ideally, since the two fields have the same type, we’d like the Color value to be present regardless, right?

Unfortunately, Go prints dog followed by two empty lines and exits. The color value was silently dropped for both Dog and Cat.

This can be explained by reading the documentation on Marshal, which in this case also applies to Unmarshal:

Anonymous struct fields are usually marshaled as if their inner exported fields were fields in the outer struct, subject to the usual Go visibility rules amended as described in the next paragraph. An anonymous struct field with a name given in its JSON tag is treated as having that name, rather than being anonymous. An anonymous struct field of interface type is treated the same as having that type as its name, rather than being anonymous.

The Go visibility rules for struct fields are amended for JSON when deciding which field to marshal or unmarshal. If there are multiple fields at the same level, and that level is the least nested (and would therefore be the nesting level selected by the usual Go rules), the following extra rules apply:

  1. Of those fields, if any are JSON-tagged, only tagged fields are considered, even if there are multiple untagged fields that would otherwise conflict.
  2. If there is exactly one field (tagged or not according to the first rule), that is selected.
  3. Otherwise there are multiple fields, and all are ignored; no error occurs.

This seems to explain our case. Go collected all possible fields from our anonymous struct fields (embedded types), noticed that we had multiple fields of the same name at the same level, both of which were tagged, and ignored all without returning an error.
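The rules are easy to observe on the marshaling side as well. The following sketch is not part of the original example: Conflict, Tagged, Untagged, and Mixed are hypothetical types, and encode is a small helper that returns the JSON as a string. Conflict hits rule 3 (two tagged fields with the same name, both ignored), while Mixed hits rules 1 and 2 (only the tagged field is considered, and it is the only one left).

```go
package main

import (
	"encoding/json"
	"fmt"
)

type Dog struct {
	Color string `json:"color"`
}

type Cat struct {
	Color string `json:"color"`
}

// Conflict embeds two types whose Color fields share the JSON name
// "color" at the same depth, and both are tagged.
type Conflict struct {
	Dog
	Cat
}

// Tagged and Untagged both produce the JSON name "Color",
// but only Tagged declares it through a struct tag.
type Tagged struct {
	Color string `json:"Color"`
}

type Untagged struct {
	Color string
}

type Mixed struct {
	Tagged
	Untagged
}

// encode is a hypothetical helper that marshals v and returns a string.
func encode(v interface{}) string {
	out, err := json.Marshal(v)
	if err != nil {
		return "error: " + err.Error()
	}
	return string(out)
}

func main() {
	c := Conflict{Dog: Dog{Color: "golden"}, Cat: Cat{Color: "black"}}
	fmt.Println(encode(c)) // {}

	m := Mixed{Tagged: Tagged{Color: "a"}, Untagged: Untagged{Color: "b"}}
	fmt.Println(encode(m)) // {"Color":"a"}
}
```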

Let’s modify our example slightly to fix this issue in a simple way.

package main

import (
	"encoding/json"
	"fmt"
)

type Dog struct {
	Color string `json:"color"`
}

type Cat struct {
	Color string `json:"color"`
}

type AnimalBase struct {
	Kind string `json:"kind"`
}

type Animal struct {
	AnimalBase
	Dog
	Cat
}

func (a *Animal) UnmarshalJSON(raw []byte) error {
	var base AnimalBase
	err := json.Unmarshal(raw, &base)
	if err != nil {
		return err
	}

	a.AnimalBase = base

	switch base.Kind {
	case "dog":
		var dog Dog
		err = json.Unmarshal(raw, &dog)
		if err != nil {
			return err
		}
		a.Dog = dog
	case "cat":
		var cat Cat
		err = json.Unmarshal(raw, &cat)
		if err != nil {
			return err
		}
		a.Cat = cat
	}
	return nil
}

func main() {
	var a Animal
	err := json.Unmarshal([]byte(`{"kind": "dog", "color":"golden"}`), &a)
	if err != nil {
		panic(err)
	}

	fmt.Println(a.Kind)
	fmt.Println(a.Dog.Color)
	fmt.Println(a.Cat.Color)
}

Before running the program again, we’ll walk through the changes: The Kind field moved from the Animal struct to the AnimalBase struct, which Animal now embeds. This change is required by the next step: We have implemented the json.Unmarshaler interface by declaring an UnmarshalJSON method with a pointer receiver on Animal. The next time json.Unmarshal(..., &animal) is called, Go will invoke this custom method.

Instead of unmarshaling the complete struct, our strategy is to decode the most specific animal kind based on the Kind value. We do this by unmarshaling the AnimalBase first, which works as AnimalBase is a part of Animal (by the nature of embedding), and the passed value resembles a complete Animal.

What would not have worked is trying to unmarshal into an Animal inside UnmarshalJSON, as this would have caused infinite recursion: When calling Unmarshal on an Animal, Go would have invoked our method, which would have called Unmarshal on an Animal, and so on.
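For comparison, another common way to sidestep this recursion (not the one used in this post) is to decode into a locally defined type that has the same fields but none of the methods. The sketch below uses a hypothetical Pet type and a made-up normalization step purely to show the mechanics.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// Pet is a hypothetical type used only for this illustration.
type Pet struct {
	Kind string `json:"kind"`
}

func (p *Pet) UnmarshalJSON(raw []byte) error {
	// plain has the same fields as Pet but no UnmarshalJSON method,
	// so json.Unmarshal uses the default struct decoding and does
	// not call back into this method.
	type plain Pet
	var tmp plain
	if err := json.Unmarshal(raw, &tmp); err != nil {
		return err
	}
	tmp.Kind = strings.ToLower(tmp.Kind) // example post-processing
	*p = Pet(tmp)
	return nil
}

// decode is a hypothetical helper that unmarshals raw into a Pet.
func decode(raw string) (Pet, error) {
	var p Pet
	err := json.Unmarshal([]byte(raw), &p)
	return p, err
}

func main() {
	p, err := decode(`{"kind":"DOG"}`)
	fmt.Println(p.Kind, err) // dog <nil>
}
```

This trick avoids the recursion but does nothing about conflicting field names, which is why the Animal example decodes into AnimalBase, Dog, and Cat separately.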

With the Kind value available, we can switch over the supported animal kinds, and Unmarshal into the specific structs in case of a match. This completely removes the issue with conflicting names, as the specific structs should not have any fields (or embed structs) with matching field names.

Running the program again yields the following, plus a trailing empty line for the unset Cat.Color:

dog
golden

Much better!


Embedding types is a great way of sharing fields and methods, and it allows you to create constructions like union types. It’s important to be aware of the underlying rules that determine how your program will run, because the result might diverge from what you expect. Switching between TypeScript and Go, I was surprised to find that unmarshaling a union-like data structure with embedded fields of the same name leads to missing values, until I read up on the documentation.


Bruno Scheufler

Software Engineering, Management
