overwatering.org

If you’ve built a service with an API, the service is under continuous development, and it has consumers, then you’re going to need to version that API. You’ll want the freedom to change existing behaviour as you build new features, and you’ll want to track and retire really old versions.

But how do you version a modern RESTful HTTP API? There are two broad approaches:

  1. Treat the entire API as a single product, and apply a version number to the whole thing. This is useful for services consumed by large numbers of consumers that you have a very limited relationship with. Think external APIs.
  2. Version the documents that the service produces and consumes, effectively treating the data as a collection of separate products. This is effective when there are a smaller number of consumers, and there is a stronger relationship between consumers and producers. Think internal APIs.

API as a Product

This approach incorporates the version number into the URL of the API: /api/v1/transactions. But it requires that all changes across the entire API move in lock-step. Breaking changes will need to be batched up and released at once, as a new version. When a new version is released, it’s a lot of work for consumers to update: URLs everywhere will change, and many documents will have different fields with different meanings.

This approach works very nicely when you are providing an API for external consumers. For one thing, it makes it easy to disable old clients. The batching makes it a poor choice for internal consumers. Decoupling release schedules is usually a big motivation for internal APIs in the first place.

Versioning Documents

Moving the version number into the URL for each resource is another possibility: /api/transactions/v1. There are a few disadvantages with this approach.

  1. As you can see from that URL, it doesn’t really make sense. Is a v1 transaction semantically the same as a v2 transaction?
  2. A relatively isolated breaking change will have rippling effects throughout the API and consumers. The consumer will have to update all URLs that they consume, even if they’re not impacted by the breaking change.
  3. Linking becomes a challenge. How does a resource know which version of another resource to link to? After deploying v2 of transactions, where should the existing v1 of accounts link to?

Instead, approach this problem from first principles.

URLs represent resources that exist in the domain. Links represent relationships between those resources. If the actions that can be performed on a resource change, or the data attached to it changes, it is still the same resource in the domain, so it shouldn’t get a new version number.

Introducing a new resource isn’t a breaking change. Removing a resource is a pretty dramatic action, but it can be handled with status codes: 410 Gone springs to mind. Splitting a resource into multiple resources can be handled by adding links to the new sub-resource to the existing resource.
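Here’s a minimal sketch of the last two cases. Everything in it is hypothetical: the in-memory store, the paths (/api/statements/7, /api/accounts/42) and the shape of the "links" field are invented for illustration, not taken from a real API.

```python
from http import HTTPStatus

# Hypothetical data: paths and fields are invented for illustration.
RETIRED = {"/api/statements/7"}
RESOURCES = {
    "/api/accounts/42": {
        "balance": 1250,
        # The owner details were split out into their own resource;
        # the existing resource simply gains a link to it.
        "links": {"owner": "/api/accounts/42/owner"},
    },
    "/api/accounts/42/owner": {"name": "A. Customer", "links": {}},
}


def get(path):
    """Return (status, body) for a GET on path."""
    if path in RETIRED:
        # The resource used to exist and was deliberately removed.
        return HTTPStatus.GONE, None
    if path not in RESOURCES:
        return HTTPStatus.NOT_FOUND, None
    return HTTPStatus.OK, RESOURCES[path]
```

None of the URLs needed a version number: the retired resource answers 410, and consumers of the account discover the new sub-resource through its link.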

There’s no need to put version numbers in URLs.

The links represent relationships between resources, and actions that can be taken. A link is always either present or absent. The links between resources are generated by the server that provides the resource. Because links describe what can be done with a resource, the consumer always needs to check for their presence before attempting to follow them. We already know that we don’t need to put a version in the link itself. And even though the set of links looks like data, it doesn’t need a version either: a consumer that checks for a link before following it already copes with links being added or removed.
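On the consumer side, that check can be very small. This is only a sketch: the "links" dictionary keyed by relation name is an assumed representation, and follow and fake_fetch are hypothetical helpers, with fake_fetch standing in for a real HTTP call.

```python
def follow(resource, rel, fetch):
    """Follow the link named rel if the server offered it, else return None."""
    href = resource.get("links", {}).get(rel)
    if href is None:
        # The server did not offer this link, so the action is not available.
        return None
    return fetch(href)


# Hypothetical representation of an account, as received from the server.
account = {
    "balance": 1250,
    "links": {"transactions": "/api/accounts/42/transactions"},
}

requested = []


def fake_fetch(href):
    """Stand-in for an HTTP GET; records which URLs were requested."""
    requested.append(href)
    return {"href": href}


transactions = follow(account, "transactions", fake_fetch)
closed = follow(account, "close", fake_fetch)  # link absent: nothing is fetched
```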

The only place that breaking changes can occur is in the data that is provided as the representation of a resource. Fields can be removed, or the meaning or structure of a field can change. This is what should be versioned. Anytime a breaking change to a representation is made, the representation should be assigned a new version.
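To make "breaking" a little more concrete, here’s a rough sketch that compares two descriptions of a representation. The field names and the name-to-type dictionaries are made up for the example. It also assumes that adding a field is not breaking, and it can’t detect a field whose meaning changed while its name and type stayed the same.

```python
def breaking_changes(old_fields, new_fields):
    """List changes that could break a consumer of the old representation.

    Both arguments map a field name to a type name.
    """
    problems = []
    for name, old_type in old_fields.items():
        if name not in new_fields:
            problems.append(f"removed: {name}")
        elif new_fields[name] != old_type:
            problems.append(f"retyped: {name} ({old_type} -> {new_fields[name]})")
    # Fields that only appear in new_fields are ignored (assumed non-breaking).
    return problems


# Hypothetical transaction representations.
old = {"id": "string", "amount": "number", "date": "string"}
new = {"id": "string", "amount": "object", "currency": "string"}

changes = breaking_changes(old, new)
needs_new_version = bool(changes)
```

If the list is non-empty, the representation should get a new version.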

How should this version be transmitted? As a parameter on the content type. For example, for a transaction a suitable content type might be application/vnd.com.myco.transaction+json; version=20180503. I haven’t seen many APIs adopt this, though Django REST framework supports it directly through its AcceptHeaderVersioning scheme.
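Pulling the version back out of such a header is straightforward. The helper below is a deliberately naive sketch: parse_media_type is a hypothetical name, and it doesn’t handle quoted parameter values that contain semicolons.

```python
def parse_media_type(value):
    """Split 'type/subtype; key=value' into (media_type, params)."""
    media_type, *raw_params = [part.strip() for part in value.split(";")]
    params = {}
    for raw in raw_params:
        if not raw:
            continue  # tolerate a trailing semicolon
        key, _, val = raw.partition("=")
        params[key.strip().lower()] = val.strip().strip('"')
    return media_type.lower(), params


media_type, params = parse_media_type(
    "application/vnd.com.myco.transaction+json; version=20180503"
)
```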

There are a number of advantages to this approach.

  • Once a system starts defining content types for the key resources it exchanges, then those content types can be re-used across the system. For example, a transaction is probably used in more than one place.
  • Services and consumers can negotiate over the content type. This means that a service can choose to continue to support an older version for some time. The consumers request the version they want, and the service either produces that version or reports an error: 406 Not Acceptable when it can’t produce the requested version, 415 Unsupported Media Type when it can’t consume the version it was sent. Interestingly, this happens in both directions: a service will both produce and consume a content type.
  • In the above example, +json is included to indicate how the representation is serialised. This can be negotiated over as well. For some resources, there can even be multiple representations. HTTP caches can cope with this: when a response carries a Vary: Accept header, a cache takes the request’s Accept header into account as well as the URL.
  • The API is now described in a way that is close to the domain, which alone is helpful documentation. It’s also really clear that not all application/json documents are the same, and it is therefore really easy to find the documentation for what an API produces.
  • And finally, these content types become a really good basis for contracts between APIs and consumers. A consumer-driven test of an API asserts that the required content type is produced, that it contains the required fields, and that those fields are still of the right type.
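The negotiation described above can be sketched in a few lines. This is only an illustration: the set of supported versions is invented, the function names are hypothetical, and the parsing ignores q-values and wildcards such as */*, which a real server would need to handle.

```python
from http import HTTPStatus

MEDIA_TYPE = "application/vnd.com.myco.transaction+json"
# Hypothetical: the versions this service can still produce and consume.
SUPPORTED_VERSIONS = {"20170110", "20180503"}


def version_of(header_value):
    """Return the version parameter if the value names MEDIA_TYPE, else None."""
    media_type, _, rest = header_value.partition(";")
    if media_type.strip().lower() != MEDIA_TYPE:
        return None
    for param in rest.split(";"):
        key, _, val = param.partition("=")
        if key.strip().lower() == "version":
            return val.strip().strip('"')
    return None


def negotiate_response(accept):
    """Pick the content type to produce, or refuse with 406 Not Acceptable."""
    for candidate in accept.split(","):
        version = version_of(candidate)
        if version in SUPPORTED_VERSIONS:
            return HTTPStatus.OK, f"{MEDIA_TYPE}; version={version}"
    return HTTPStatus.NOT_ACCEPTABLE, None


def check_request_body(content_type):
    """Accept a request body only in a version the service can consume."""
    if version_of(content_type) in SUPPORTED_VERSIONS:
        return HTTPStatus.OK
    return HTTPStatus.UNSUPPORTED_MEDIA_TYPE
```

A consumer that lists two versions in its Accept header gets the first one the service still supports, so the consumer and the service can upgrade on separate schedules.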

In short, version your API. But don’t version the resources, or the links between them. Instead, version the content types produced by and accepted by the API.