May 19, 2019

Reaching Consensus: GraphQL Input Unions

Reaching consensus in open source projects can be a long-winded process, especially when a feature could be implemented in many different ways. To be clear from the start: in my experience, finding the best solution through consensus in working groups or steering committees is tedious, but it is the best way to move forward and maintain large-scale projects that need this kind of governance. A great example of this workflow, and one I've followed closely, is the proposal to add input union types to the GraphQL specification.

Putting the technical details aside for a second, it's important to note that with the formation of the GraphQL Foundation, hosted by the Linux Foundation, governance of the GraphQL specification, the reference implementation (graphql-js), and basic tooling is slowly shifting away from Facebook. It is moving to a more user-centric group of company representatives and open source maintainers, who will continue to steer progress in every part of the query language and its ecosystem.

Because of the skyrocketing adoption GraphQL has experienced over the last couple of years, new features have to be evaluated carefully before they are added to the specification, since implementations in numerous languages eventually have to follow those changes.

Now back to input unions. To understand them, it helps to start with regular (output) unions, which the spec describes as follows:

GraphQL Unions represent an object that could be one of a list of GraphQL Object types, but provides for no guaranteed fields between those types.

A minimal example of a schema using union types could look like the following:

# The type below is the most important thing here,
# it will either represent a Photo or a Person type
union SearchResult = Photo | Person

type Person {
  name: String
  age: Int
}

type Photo {
  height: Int
  width: Int
}

type SearchQuery {
  firstSearchResult: SearchResult
}

This snippet already shows the benefit of union types: by adding multiple types to a union, you can return more than one type from the same field. If you were to construct a schema without this feature, you'd have to structure it differently, for example by returning an object with nullable results for either a Photo or a Person object.
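On the client, the members of a union are usually told apart with the `__typename` meta field that GraphQL provides on every object type. The following TypeScript snippet is a minimal sketch of that idea, assuming the query selected `__typename` next to the fields of each type; the `describeResult` helper and the sample values are hypothetical.

```typescript
// Shapes mirror the Photo and Person types from the schema above,
// assuming the query also selected the __typename meta field.
type Photo = { __typename: "Photo"; height: number | null; width: number | null };
type Person = { __typename: "Person"; name: string | null; age: number | null };
type SearchResult = Photo | Person;

// Hypothetical helper: narrow the union by its __typename.
function describeResult(result: SearchResult): string {
  if (result.__typename === "Photo") {
    // Only Photo fields are available in this branch.
    return `Photo (${result.width}x${result.height})`;
  }
  // Everything else in the union is a Person.
  return `Person (${result.name}, ${result.age})`;
}

const exampleResult: SearchResult = { __typename: "Photo", height: 600, width: 800 };
const exampleLabel = describeResult(exampleResult); // "Photo (800x600)"
```

Without a union, the same function would have to check two nullable fields and deal with the case where both or neither are set.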

So unions are really useful for output. Now let's think about structuring an interactive web application that includes mutations. Take a basic social network where users can publish and retrieve stored content of multiple media types with a single mutation. Its schema would roughly look like the following:

scalar Date
scalar Url
scalar Json

# Define the post base interface which
# contains fields all posts should inherit
interface PostBase {
  id: ID!
  author: User
  tags: [String!]
  createdAt: Date
}

# An image post contains an image location
# as well as optional metadata
type ImagePost implements PostBase {
  # Interface field implementations
  # are excluded for visual reasons

  imageType: String
  height: Int!
  width: Int!
  url: Url!
  metadata: Json
}

# A text post simply contains text content (duh)
type TextPost implements PostBase {
  # Interface field implementations
  # are excluded for visual reasons

  content: String!
}

# A link post contains a URL
type LinkPost implements PostBase {
  # Interface field implementations
  # are excluded for visual reasons

  url: Url!
}

# This union combines all post types into one
union Post = ImagePost | TextPost | LinkPost

type Query {
  # Querying is easy, all post-related queries will return
  # one of the union types
  post(id: ID!): Post
  recentPosts: [Post]
}

type Mutation {
  # But how do we structure inputs?
  createPost(): Post
}

Ideally, you would want to use a union type for inputs as well, so users can choose which post type they submit. But that's where the difficulty begins: how should servers decide which type an input is? Current proposals range from splitting up input contents using directives like @oneField to clients sending an __inputname field containing the type name inside of the input object.

Comparing Implementations

To understand the difficulties behind implementing input union types, I've selected the three most popular and recent ideas for solving this problem. The first uses the suggested inputUnion keyword and an __inputname field to define the input object's type. The second, tagged unions, bases type detection on fields that are unique to each type. The third is the @oneField (optionally named @taggedUnion) directive, which is another way to implement the tagged union proposal using directives instead of new syntax.

Schema Definitions

__inputname Field

🔗 Related PR

# PostInput input type for text-based posts
input PostInput {
  title: String!
  body: String!
}

# ImageInput input type for image posts
input ImageInput {
  photo: String!
  caption: String
}

# Each media block can be one (and only one) of these types.
inputUnion MediaBlock = PostInput | ImageInput

type Mutation {
  addContent(content: [MediaBlock]!): Post
}

Tagged union

🔗 Related PR

# PostInput input type for text-based posts
input PostInput {
  title: String!
  body: String!
}

# ImageInput input type for image posts
input ImageInput {
  photo: String!
  caption: String
}

# Each media block can be one (and only one) of these types.
input MediaBlock = { post: PostInput! } | { image: ImageInput! }

type Mutation {
  addContent(content: [MediaBlock]!): Post
}

@oneField (@taggedUnion) Directive

🔗 Related PR

# PostInput input type for text-based posts
input PostInput {
  title: String!
  body: String!
}

# ImageInput input type for image posts
input ImageInput {
  photo: String!
  caption: String
}

# Each media block can be one (and only one) of these types.
input MediaBlock @oneField {
  post: PostInput
  image: ImageInput
}

type Mutation {
  addContent(content: [MediaBlock!]!): Post
}

Example Content Mutation

Since all proposals are roughly based on the same schema, a single mutation is enough to add new content. It accepts a content variable whose type is a list of the MediaBlock input union defined above.

mutation AddContent($content: [MediaBlock!]!) {
  addContent(content: $content) {
    id
  }
}

Accepted Inputs

Below you can find variable inputs for $content that will be accepted by the server. The differences between the implementations are easy to spot: the __inputname proposal includes the aforementioned field, whereas the tagged union input has no explicit structure to distinguish between input union types and instead relies on unique, type-specific fields. The input for the @oneField directive differs again, since it uses a descriptive nested object structure.

__inputname Field

[
  {
    "__inputname": "PostInput",
    "title": "Hello",
    "body": "World"
  },
  {
    "__inputname": "ImageInput",
    "photo": "http://graphql.org/img/logo.svg",
    "caption": "Logo"
  }
]
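With this proposal the server can read the member type straight from the input. The TypeScript snippet below is a minimal sketch of that lookup and not taken from the proposal or from graphql-js; the `resolveInputType` helper and the `InputTypeName` alias are hypothetical names.

```typescript
// Members of the MediaBlock input union from the schema above.
type InputTypeName = "PostInput" | "ImageInput";

// Hypothetical server-side helper: read the explicit type name
// and reject anything that is not a member of the union.
function resolveInputType(block: Record<string, unknown>): InputTypeName {
  const name = block["__inputname"];
  if (name === "PostInput") return "PostInput";
  if (name === "ImageInput") return "ImageInput";
  throw new Error(`Unknown or missing __inputname: ${String(name)}`);
}

const inputnameBlocks: Record<string, unknown>[] = [
  { __inputname: "PostInput", title: "Hello", body: "World" },
  { __inputname: "ImageInput", photo: "http://graphql.org/img/logo.svg", caption: "Logo" },
];
const resolvedNames = inputnameBlocks.map((block) => resolveInputType(block));
// resolvedNames is ["PostInput", "ImageInput"]
```

The lookup itself is trivial; the cost of this approach is that every client has to send the extra field.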

Tagged union

[
  {
    "title": "Hello",
    "body": "World"
  },
  {
    "photo": "http://graphql.org/img/logo.svg",
    "caption": "Logo"
  }
]
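The flat inputs above carry no explicit tag, so a server would have to infer the member type from which fields are present. The TypeScript snippet below is a rough sketch of that inference as this article describes it, not code from the proposal; `detectInputType` is a hypothetical helper, and it uses only the required fields of each input type as discriminators.

```typescript
type DetectedType = "PostInput" | "ImageInput";

// Hypothetical helper: infer the union member from type-specific fields.
// title/body only exist on PostInput, photo only exists on ImageInput.
function detectInputType(block: Record<string, unknown>): DetectedType {
  const looksLikePost = "title" in block || "body" in block;
  const looksLikeImage = "photo" in block;
  if (looksLikePost === looksLikeImage) {
    // Either no discriminating field is present, or fields of both types are mixed.
    throw new Error("Input matches none or more than one member of MediaBlock");
  }
  return looksLikePost ? "PostInput" : "ImageInput";
}

const detectedPost = detectInputType({ title: "Hello", body: "World" }); // "PostInput"
const detectedImage = detectInputType({ photo: "http://graphql.org/img/logo.svg" }); // "ImageInput"
```

This only works as long as every member keeps at least one field that no other member has, which is the main constraint of detecting types by shape.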

@oneField (@taggedUnion) Directive

[
  {
    "post": {
      "title": "@oneField directive",
      "body": "..."
    }
  },
  {
    "image": {
      "photo": "https://..."
    }
  }
]
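For the directive-based variant, the check a server needs is that exactly one of the wrapper fields is set. The TypeScript snippet below is a minimal sketch of such a check under the MediaBlock definition above; `unwrapOneField` and `OneFieldResult` are hypothetical names, and a real implementation would live in the server library's input validation.

```typescript
type OneFieldResult = { kind: "post" | "image"; value: unknown };

// Hypothetical helper: enforce the "exactly one field" rule of @oneField
// and return the selected field together with its nested input object.
function unwrapOneField(block: Record<string, unknown>): OneFieldResult {
  // Treat both undefined and null as "not provided".
  const hasPost = block["post"] != null;
  const hasImage = block["image"] != null;
  if (hasPost === hasImage) {
    throw new Error("Exactly one of post or image must be provided");
  }
  return hasPost
    ? { kind: "post", value: block["post"] }
    : { kind: "image", value: block["image"] };
}

const unwrapped = unwrapOneField({ post: { title: "@oneField directive", body: "..." } });
// unwrapped.kind is "post"
```

The field name doubles as the tag here, so no additional metadata is needed and two members can share field names without ambiguity.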

To follow the various threads, I've put together a timeline of issues and pull requests related to input unions and similar proposals below.

A Timeline

2015

2016

2017

Interestingly enough, I couldn't find any discussion around union input types that was specifically collected in an issue or pull request opened in 2017.

2018

2019

As time goes on, we might finally get a proposal with enough momentum to be accepted and integrated into the specification and reference implementations. One recent RFC I'd follow closely if you're interested in how things will turn out is 🔗 Start Input Union RFC document, which is a pragmatic attempt to transform current implementation ideas of input unions into a written document.

I think this writeup should suffice for the time being. Don't forget to check out the threads linked above: there's a lot of progress at the moment, and we might settle on a solution in the near future, with implementations for graphql-js and other reference libraries landing soon!