Core API Structure

This section covers common structural elements, aiming to be the organizational foundation of your REST-ful API. It ensures that your project cleanly separates resources and their contexts and offers explicit rules for paths, actions, and errors. This structure is designed to be simple, easy to understand, and easy to implement.

A clear and consistent structure brings numerous benefits. It ensures that all APIs follow the same design principles, making them easier to understand and use. Scalability is achieved by maintaining consistent foundational elements, allowing for the seamless addition of new features and services. Standardized patterns for paths, actions, and errors simplify maintenance and updates, reducing bugs and inconsistencies, which enhances maintainability. Improved collaboration among developers is facilitated by clear structure and guidelines, reducing onboarding and code review time. Additionally, a consistent and well-structured API improves ease of interaction, leading to enhanced user satisfaction.

Or, in plain talk: If everything follows the same rules, you can implement it once in a shared library and move on to worrying about your business logic instead.

1 - Resource Entities

The required fields and naming conventions for all resources in the API.

Requirements

Resource APIs should only be expressed in JSON or YAML format. For more details on type negotiation, please refer to our section on Content-Type.

Fields

For all resources, as expressed in request and response payloads, the following fields are required:

| Key | Format | Description |
| --- | --- | --- |
| id | string | A unique identifier for this resource, using either a KSUID or a UUID. |
| created_time | RFC-3339 | The date that this resource was created, in UTC. |
| modified_time | RFC-3339 | The date this resource was last modified. It must be used for Last-Modified style cache requests. |
| etag | string | A base-64 style string as the document’s version identifier. |
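
The required fields can be sketched as a small Python type. This is only an illustration: the names `BaseResource`, `compute_etag`, and `new_resource` are hypothetical, a V4 UUID is picked over a KSUID for convenience, and deriving the etag from a SHA-256 digest of the content is an assumption rather than something this contract prescribes.

```python
import base64
import hashlib
import json
import uuid
from dataclasses import asdict, dataclass
from datetime import datetime, timezone


def rfc3339_utc(moment: datetime) -> str:
    """Format a timezone-aware datetime as an RFC-3339 string in UTC."""
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


@dataclass
class BaseResource:
    """Hypothetical base type carrying the four required fields."""
    id: str
    created_time: str
    modified_time: str
    etag: str = ""


def compute_etag(resource: BaseResource) -> str:
    """Derive a base-64 version identifier from the resource content.

    Hashing the payload is an assumption; any scheme that changes
    whenever the document changes would serve the same purpose.
    """
    payload = asdict(resource)
    payload.pop("etag")  # the etag must not depend on itself
    digest = hashlib.sha256(json.dumps(payload, sort_keys=True).encode()).digest()
    return base64.urlsafe_b64encode(digest).decode().rstrip("=")


def new_resource(now: datetime) -> BaseResource:
    """Create a resource whose created and modified times are both `now`."""
    stamp = rfc3339_utc(now)
    resource = BaseResource(id=str(uuid.uuid4()), created_time=stamp, modified_time=stamp)
    resource.etag = compute_etag(resource)
    return resource
```

Concrete resources would add their own fields on top of this base type and recompute the etag whenever any of them changes.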

Naming Conventions

All request and response fields must follow the same naming conventions:

  • All fields must be snake_case. No golang-style exceptions for acronyms like URL, ID, or similar.
  • All complex types that are expressed using a basic type must be suffixed with the name of the complex type. For example:
    • External IDs should be suffixed with _id (whether they’re UUIDs or not).
    • UUIDs that are not used as external ID references must be suffixed with _uuid.
    • Timestamps must be suffixed with _time.
    • Email addresses must be suffixed with _email.
    • URLs must be suffixed with _url.
  • Do not stutter. Instead of resource.resourceName, use resource.name.
  • When using generic terms that may apply to multiple resources, use the most specific version. For example, project_group and user_group instead of group.
  • When referencing a resource in a third-party system, where that third party system is the source of truth for that resource, prefix the field with the name of that system. Examples include: okta_user_id, ms_client_id, aws_credentials_id.
  • All APIs must consistently use the same fields to mean the same thing, without exceptions.
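
The snake_case rule above lends itself to an automated check, for example in a linter or a schema test. The following is a minimal sketch; the function names and the exact regular expression are assumptions, and the other conventions (suffixes, stuttering, specificity) still need human review.

```python
import re

# Lowercase words separated by single underscores; digits are allowed
# anywhere except as the very first character.
SNAKE_CASE = re.compile(r"^[a-z][a-z0-9]*(_[a-z0-9]+)*$")


def is_snake_case(field_name: str) -> bool:
    """Report whether a field name follows the snake_case convention."""
    return SNAKE_CASE.fullmatch(field_name) is not None


def naming_violations(field_names: list[str]) -> list[str]:
    """Return the field names that are not snake_case, in input order."""
    return [name for name in field_names if not is_snake_case(name)]
```

Running `naming_violations` over the keys of every request and response schema gives a quick list of fields to rename before an API ships.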

Forbidden Fields

The following fields and field groups are forbidden for all resources.

  • links or selfLink, and any other fields prescribed by HATEOAS. We explicitly do not support it.
  • Sensitive data of any form.
  • Internal-use fields and/or debugging information.
  • Binary data is explicitly not permitted as part of a resource.

Validations

  • Email addresses must be validated to ensure they follow a standard format.
  • All URLs must be absolute.
  • Phone numbers must be standardized to a single format, preferably using E.164.
  • All time notation must adhere to RFC-3339 and be expressed in UTC.
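
The validations above could be sketched with the Python standard library as follows. The function names and both regular expressions are assumptions. The email pattern is deliberately loose, and `datetime.fromisoformat` is somewhat more permissive than RFC-3339 on recent Python versions, so treat this as a starting point rather than a complete validator.

```python
import re
from datetime import datetime, timedelta
from urllib.parse import urlsplit

# E.164: a plus sign, a non-zero leading digit, at most 15 digits in total.
E164 = re.compile(r"^\+[1-9]\d{1,14}$")
# Loose email shape: one "@", no whitespace, and a dot in the domain part.
EMAIL = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def is_valid_email(value: str) -> bool:
    return EMAIL.fullmatch(value) is not None


def is_absolute_url(value: str) -> bool:
    """An absolute URL carries both a scheme and a host."""
    parts = urlsplit(value)
    return bool(parts.scheme) and bool(parts.netloc)


def is_e164(value: str) -> bool:
    return E164.fullmatch(value) is not None


def is_rfc3339_utc(value: str) -> bool:
    """Accept timestamps whose offset is UTC ("Z" or "+00:00")."""
    try:
        parsed = datetime.fromisoformat(value.replace("Z", "+00:00"))
    except ValueError:
        return False
    # Naive timestamps have no offset and are rejected as well.
    return parsed.utcoffset() == timedelta(0)
```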

Reasoning

“There are 2 hard problems in computer science: cache invalidation, naming things, and off-by-1 errors.” ― Leon Bambrick

I can’t count how often I’ve asked for a consistent name to be used, just to receive pushback because the owner of the PR has their own views on the topic. In the end we all have the same goal: An easy-to-use system that is legible during our engineering day-to-day. Unfortunately, without a consistent naming pattern, every engineer applies their own personal view on things, and we end up with a patchwork of conflicting standards across our codebase.

This is a usability nightmare. Not only for our engineers, who now have to read most of the code to decipher what a particular field might mean. It’s an even bigger problem for API consumers, who now have to read the documentation to understand what a field might mean. Assuming your documentation is good - if it’s not, then expect your support channels to be overwhelmed with clarifications.

All of this boils down to time, and therefore money. Time your engineers have to spend, time your support team has to waste, time your customers have to invest in understanding your system. All of this can be avoided by simply adhering to this convention.

2 - Error Format

The response format and structure for all error messages.

Error messages are a critical part of the API contract and should be consistent across all resources. They provide the necessary feedback to the client to understand what went wrong, and most importantly how to fix it. In all cases, the message should be actionable, providing the user with the necessary information to resolve the issue. This will reduce the number of support tickets and improve the overall user experience.

An error message needs to solve three problems, all without leaking implementation details to the outside world:

  1. For both engineers and users, it needs to explain what happened.
  2. For users, it needs to provide actionable feedback to self-resolve their issues.
  3. For engineers, it needs to provide detailed information needed for triage.

The error format adopted by this contract adheres to the OAuth2 error response structure, as per RFC 6749. This structure is as follows:

HTTP/1.1 4xx/5xx Message

{
    "error": "an_error_key",
    "error_description": "Text providing additional information, used to assist the client developer in understanding the error.",
    "error_uri": "An optional link to documentation that may assist in resolving this error"
}

| Property | Required | Type | Description |
| --- | --- | --- | --- |
| error | yes | string | A short, unique string which can be used to switch automatic remediation steps in the client. For example, a user may be prompted to add missing information so that the request can be retried quickly. This code should be in snake_case. |
| error_description | yes | string | An actionable, human-readable message of the error that occurred. Actionable, in this context, means that a non-engineer can read it and know how to remediate their error; even if it’s something like “Please wait and try again later.” In the case where an error is not actionable, this should be communicated. |
| error_uri | no | uri | If documentation exists to describe the details of a request and/or feature, you may optionally add a fully qualified URI which a user can read for more information. Please ensure that the existence of this URI is automatically and continuously verified. |

If your site makes use of internationalization, the error description must be appropriately localized as per our Internationalization section.

Guidelines for writing a good error message

  • Proper grammar is important. Use proper capitalization and punctuation. Use a spell checker.
  • Always end your message with a period.
  • Never include the product name in the error.
  • Never use explicit service or technology names, preferring the role they serve. For example, Error reading from redis should instead be Error reading from cache.
  • Always speak in the active voice, not the passive voice.
  • Do not address the actor using the word “You”; simply describe the error.
  • Do not reveal details about the underlying implementation or internal constructs.

Example error strings

  • The field ‘field’ is required.
  • The field ‘field’ may not be longer than 256 characters.
  • The server is busy, please try again later.
  • This request is not authorized.
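
The error format and a couple of the guidelines above can be enforced in a shared helper. This is a sketch under assumptions: `error_body` is a hypothetical name, and only two rules are checked mechanically (a snake_case error key and a trailing period); the remaining guidelines still depend on the author.

```python
import json
import re
from typing import Optional

SNAKE_CASE = re.compile(r"^[a-z][a-z0-9]*(_[a-z0-9]+)*$")


def error_body(error: str, description: str, uri: Optional[str] = None) -> str:
    """Serialize an error payload with the three documented properties."""
    if not SNAKE_CASE.fullmatch(error):
        raise ValueError("error key must be snake_case")
    if not description.endswith("."):
        raise ValueError("error_description must end with a period")
    payload = {"error": error, "error_description": description}
    if uri is not None:
        payload["error_uri"] = uri  # optional, so only emitted when supplied
    return json.dumps(payload)
```

A handler would pair this body with the appropriate 4xx or 5xx status, for example `error_body("field_required", "The field 'name' is required.")`, where `field_required` is an illustrative key.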

3 - Domain Names

A clear and organized domain name strategy, explicitly separating functional areas by using subdomains.

In designing a robust and scalable API, having a clear and organized domain name strategy is imperative.

Requirements

  1. Dedicated Domain for each Project/Product

    • Every distinct project or product must have its own domain or subdomain.
    • Example:
      • A company with a single product would host it at example.com
      • A company with multiple products would host them at product1.example.com, product2.example.com, etc.
  2. Subdomains for each Functional Area

    • Each significant functional area within a project should have its own subdomain.
    • Examples:
      • A product’s data warehouse would be hosted at data.product1.example.com
      • A product’s reporting api would be hosted at reports.product1.example.com
  3. Cross-functional areas must be hosted at the closest shared parent domain

    • If there is an area that applies to multiple products in a domain, it should be hosted within the closest namespace that makes sense.
    • Example:
      • A company-wide authentication service would be hosted at auth.example.com.

Rationale

“Do one thing, and do it well.” – Doug McIlroy

The Unix philosophy of creating small, focused tools that work together to solve complex problems is a guiding principle for API design. By dedicating domains and subdomains to specific functional areas, we can create a clear, modular, and scalable structure that aligns with this philosophy.

By following this approach, you get the following benefits:

  • Composability: Not all products need to re-implement something that’s already been built. You can reuse existing services across different products.
  • Clarity: Developers can quickly locate the API they need.
  • Independence: Each domain can have its own development, CI, and lifecycle.
  • Security: Limiting the scope of potential security breaches, and discouraging use of private back-door endpoints for internal use.
  • Modularity: Teams can work on different domains simultaneously without interference, shortening iteration cycles and increasing agility. Updates, patches, and new features can be rolled out to specific parts of the API without risking the stability of the entire system.
  • Scalability: The structure allows for the seamless addition of new features and services, as the foundational elements remain consistent.

4 - Path Patterns

This section covers rules around paths and resource addressing in APIs.

This section covers rules around paths used in your APIs. It ensures that all paths are consistent, easy to understand, and easy to implement. It’s divided into four sections: Resources, Queries, Rules, and Examples.

Resource Paths

All paths to resources must adhere to this pattern. Components of this pattern are described below.

  • /<version>/<resourcename>/<id>

<version>

The version section is there to allow for multiple versions of the same resource to exist simultaneously as your software goes through various lifecycles. This is a common pattern in API design, and it is important to get right, so use a simple, incrementing integer for versioned paths in your API: v1, v3, v333. If it is necessary to introduce a major breaking change, all paths in that API must increment at the same time, even though they may not have any changed code. This is to assist your lifecycle; by gathering breaking changes all together and implementing them all at once, you can deprecate them on the same schedule.

  • /v1/resourcename
  • /v2/renamed_resource

<resourcename>

A resource name should always be plural and use snake_case.

<id>

If an obvious identifier is not available, such as the ID of an upstream cloud resource or an identifier derived from dependent sources, IDs should be a V4 UUID seeded from a strong random source. This field is only relevant if you are directly addressing an individual resource.

POST   /v1/widgets/
GET    /v1/widgets/{id}
PUT    /v1/widgets/{id}
DELETE /v1/widgets/{id}

Specific requirements are available on our page about CRUD Operations.
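
The resource path pattern can be expressed as a regular expression, which is handy for route tests. The sketch below is an assumption about how one might encode it: `parse_resource_path` is a hypothetical helper, and it cannot check that a resource name is plural or tell an ID apart from a reserved segment such as `query`.

```python
import re

# /<version>/<resourcename>[/<id>]: the version is "v" plus a positive
# integer, the resource name is snake_case, and the id segment is optional.
RESOURCE_PATH = re.compile(
    r"^/v(?P<version>[1-9]\d*)"
    r"/(?P<resource>[a-z][a-z0-9]*(?:_[a-z0-9]+)*)"
    r"(?:/(?P<id>[^/]+))?/?$"
)


def parse_resource_path(path: str):
    """Return (version, resource, id), or None if the path does not fit."""
    match = RESOURCE_PATH.fullmatch(path)
    if match is None:
        return None
    # The id is None when the path addresses the collection itself.
    return int(match["version"]), match["resource"], match["id"]
```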

Query paths

It is difficult to express a complex set of query parameters in a URL’s query string. Therefore, all actions that ask for the creation of a result set - List, Aggregation, Graph, or otherwise - must be POST operations with the query expressed in the request body. A side-effect of this is that a commonly used list endpoint is not used in our API contract, as the POST action on that route is already used for resource creation.

POST /v1/widgets/query
POST /v1/widgets/aggregate

Specific requirements are described in more detail in their respective pages:

API Rules

These are concrete rules to which all paths must adhere.

Sub-resources are undesirable

Sub-resources are a common usability improvement for an API, allowing easy scoping of a result set based on a parent entity. They are permitted if any of the following is true:

  • There is an explicit 1-to-N relationship between the parent and the child resource.

    • Example: There are many book reviews for one book.
    • Example: There are many reviews submitted by a single author.
  • A child resource cannot be uniquely identified without its parent.

    • Example: Street numbers are meaningless without the street they are on.

It is important to consider what a sub-resource URL communicates to an API consumer.

  • Does this sub-resource incorrectly imply a hierarchical relationship?
  • Is the 1-to-N relationship between the parent and child likely to change in the long run?

If the answer to either of the above questions is yes, do not create a sub-resource. Instead, create a top-level resource with its own query endpoints so a user can constrain their result set based on a relationship.

In the majority of cases, a sub-resource is not desirable. Consider the points above carefully before creating one.

Sub-resource path patterns

Note that a resource can have both a child access path and a top-level access path. In this case, the resource name in the path MUST be identical for both the child route and the root resource route, and creation of a child resource via the parent resource’s path MUST use the Location header of the root path.

  • /v1/widgets/<id>/sprockets/<id>
  • /v1/sprockets/<id>
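
Reading the rule above as "a child created through its parent is still announced at its top-level path", the response could be built roughly like this. Both function names and the dictionary-shaped response are assumptions made for the sketch.

```python
def root_location(version: int, resource: str, resource_id: str) -> str:
    """Build the top-level path for a resource, ignoring any parent."""
    return f"/v{version}/{resource}/{resource_id}"


def create_via_parent(parent_path: str, resource: str, resource_id: str) -> dict:
    """Simulate the response for a child created through its parent's path.

    parent_path looks like "/v1/widgets/<id>"; only its version segment
    is reused when building the Location header.
    """
    version = int(parent_path.split("/")[1].lstrip("v"))
    return {"status": 201, "Location": root_location(version, resource, resource_id)}
```

With this reading, a POST to `/v1/widgets/<id>/sprockets/` answers with a Location under `/v1/sprockets/`, so clients always cache a sprocket under a single canonical URL.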

No Business Logic Actions

A REST API is, by design, an expression of the state of a system. Business logic actions can be roughly described as permutation requests against this state, some of which can happen concurrently, while others happen asynchronously. Most importantly, some of these actions cannot be done in parallel.

In order to prevent accidental parallel actions, or the conflation of ‘business logic actions’ and our above sub-resource requirements, it is required that all business logic must be expressed as entity permutations.

A common counterargument from the engineer’s perspective is that it is far easier to build a single-purpose endpoint than a single all-purpose endpoint. This is true, but it forces the API user to build their own complex entity validation and action routing. From their perspective, it is far easier to call a single endpoint for all business operations than implement - and keep track of - multiple ones. Customer usability is more important than engineer convenience.

Example: Running a report

  • Do not call execute on a report…

    • POST /v1/reports/0000000000000000/execute
  • Instead, ask to create a new snapshot

    • POST /v1/reports/0000000000000000/snapshots

In this example, we are separating the concept of a report configuration, and the data generated by the report at a specific point in time.

  • Example API routes for the generated reports
    • POST /v1/reports/0000000000000000/snapshots/
    • POST /v1/reports/0000000000000000/snapshots/query
    • GET /v1/reports/0000000000000000/snapshots/{id}

In this example, we are creating a new, explicit sub-resource that acts as a generated report snapshot for a specific point in time. This allows the user to query the report, and retrieve the data at a specific point in time. The response entity could include such useful status information as the state of the current report (if it is taking some time), the current progress, and when it was started. Furthermore, the creation of a report can be explicitly blocked if another one is already being run.
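
The snapshot approach can be sketched with an in-memory store. Everything here is illustrative: `SnapshotStore`, the `running` and `complete` state names, and the choice of 409 for "another snapshot is already in progress" are assumptions, not requirements of this contract.

```python
import uuid


class SnapshotStore:
    """In-memory stand-in for report snapshot storage (hypothetical)."""

    def __init__(self):
        self._snapshots = {}  # snapshot id -> {"report_id", "state"}

    def create(self, report_id: str):
        """Return (status, headers) for POST /v1/reports/{id}/snapshots/."""
        running = any(
            s["report_id"] == report_id and s["state"] == "running"
            for s in self._snapshots.values()
        )
        if running:
            # 409 is an assumed status for a blocked, concurrent run.
            return 409, {}
        snapshot_id = str(uuid.uuid4())
        self._snapshots[snapshot_id] = {"report_id": report_id, "state": "running"}
        return 201, {"Location": f"/v1/reports/{report_id}/snapshots/{snapshot_id}"}

    def finish(self, snapshot_id: str) -> None:
        """Mark a snapshot as complete so a new one may be requested."""
        self._snapshots[snapshot_id]["state"] = "complete"
```

Because the run is modeled as a resource, the "only one at a time" rule becomes an ordinary creation constraint instead of hidden behavior behind an execute action.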

No Resource Expansion

An individual resource must not expand any resources to which it is linked. Doing so can lead to memory pointer bugs in clients, where multiple instances of a resource are created in memory, and the client has to manage the state of the API.

5 - CRUD Operations

The constraints for all Create, Read, Update, and Delete operations in the system.

The Create/Read/Update/Delete (CRUD) requirements are deliberately optimized for cache usage and conflict control, so that the client has a rich suite of tools at their disposal to manage their local state, and could even rely on their client’s own cache to manage the state of the API.

Create

A create request is a POST request to the root resource path, with the entity to be created in the request body. The entity should not be returned in the response; instead, the response must use a 201 Created status, and indicate in the Location header where the entity can be retrieved (or polled).

Create Request

POST /v1/resourcename/  HTTP/1.1
Content-Type: application/json

{ ... sufficient data ... }

Create Response

HTTP/1.1 201 Created
Location: /v1/resourcename/{id}
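
A framework-agnostic sketch of the create contract follows. The `handle_create` name, the dictionary used as storage, and the tuple-shaped response are assumptions; the point is only that the response carries a Location header and no entity.

```python
import uuid


def handle_create(store: dict, resource_name: str, body: dict):
    """Store the entity and return (status, headers, body) without an entity."""
    entity_id = str(uuid.uuid4())
    # The server assigns the id; anything the client sent for it is overwritten.
    store[entity_id] = {**body, "id": entity_id}
    headers = {"Location": f"/v1/{resource_name}/{entity_id}"}
    return 201, headers, None
```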

Read

A read operation is a GET request to the resource path, with the entity’s ID in the path. The entity should be returned in the response, including all entity version headers as described in Entity Versioning and Conflict Management. Conditional cache headers may be included in the request, and must be respected by the server.

Read Request

GET /v1/resourcename/{id}  HTTP/1.1
Accept: application/json

Read Response

HTTP/1.1 200 OK
Content-Type: application/json
ETag: "..."
Last-Modified: "..."
Cache-Control: max-age=3600
Vary: Accept, Origin

{
    "id": "...",
    "etag": "...",
    ...
}
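
Respecting conditional cache headers on a read might look like the sketch below. It is simplified on purpose: `handle_read` is a hypothetical name, the entity is a plain dictionary with the required `etag` and `modified_time` fields, and only a single strong tag in If-None-Match is handled (no lists, no `*`, no If-Modified-Since).

```python
from datetime import datetime, timezone
from email.utils import format_datetime


def handle_read(entity: dict, if_none_match=None):
    """Return (status, headers, body) for a GET, honoring If-None-Match."""
    modified = datetime.fromisoformat(entity["modified_time"].replace("Z", "+00:00"))
    headers = {
        # ETag header values are quoted strings.
        "ETag": f'"{entity["etag"]}"',
        # Last-Modified uses the HTTP date format, not RFC-3339.
        "Last-Modified": format_datetime(modified.astimezone(timezone.utc), usegmt=True),
    }
    if if_none_match == headers["ETag"]:
        return 304, headers, None  # the client's copy is current; send no body
    return 200, headers, entity
```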

Update

An update operation uses the PUT action to replace all fields in the entity, assuming they may be replaced. If the client sends fields that may not be updated - such as created_time - they should be ignored. As with the GET request, all headers indicating entity version and age must be returned.

Update Request

PUT /v1/resourcename/{id}  HTTP/1.1
Accept: application/json
Content-Type: application/json

{
    "id": "...",
    ...
}

Update Response

HTTP/1.1 200 OK
Content-Type: application/json
ETag: "..."
Last-Modified: "..."
Cache-Control: max-age=3600
Vary: Accept, Origin

{
    "id": "...",
    ...
}
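
The "replace everything, ignore what may not be updated" rule can be sketched as a pure function. Treating `id`, `modified_time`, and `etag` as server-owned alongside `created_time` is an assumption, as are the `apply_put` name and the caller supplying the new timestamp and etag.

```python
# Fields the client may send but the server always decides (assumed set).
SERVER_OWNED = {"id", "created_time", "modified_time", "etag"}


def apply_put(stored: dict, submitted: dict, now: str, new_etag: str) -> dict:
    """Build the replacement entity for a PUT request.

    Mutable fields come only from the submitted body, so anything the
    client omitted is dropped; server-owned fields are silently ignored.
    """
    replacement = {k: v for k, v in submitted.items() if k not in SERVER_OWNED}
    replacement.update(
        id=stored["id"],
        created_time=stored["created_time"],
        modified_time=now,
        etag=new_etag,
    )
    return replacement
```

Validation of the resulting entity would run on the return value before it is persisted, which keeps the whole-entity check in one place.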

Delete

Delete operations are idempotent, and should return a 204 No Content response. The entity should not be returned in the response.

Delete Request

DELETE /v1/resourcename/{id}  HTTP/1.1

Delete Response

HTTP/1.1 204 No Content

No entity needs to be returned.
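
Idempotency here means that repeating the request leaves the server in the same state. A minimal sketch, assuming a dictionary as storage and assuming that a repeated delete also answers 204 (the contract does not say whether a missing entity should instead produce an error):

```python
def handle_delete(store: dict, entity_id: str):
    """Remove the entity if present and return (status, body)."""
    # pop with a default never raises, so a repeat call is harmless.
    store.pop(entity_id, None)
    return 204, None
```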

Reasoning

The entire point of this section of the contract is to make development easy for the front-end/downstream engineers. We do this by focusing on two areas: Controlling the client’s cache using HTTP Semantics from RFC-9110, and outright rejecting PATCH operations as they almost always carry with them complex diffing logic.

Details about how we accomplish cache control in resources are described in Entity Versioning and Conflict Management. As for PATCH operations, I’ve found from experience that they add quite a bit of engineering overhead to both the client and server logic. Since the client will likely already have a copy of the existing entity in memory, creating a new entity for a patch request is a waste of resources as well as a potential source of diffing bugs - especially if the data schema expands.

At the same time, this creates an attack vector for malicious actors, who can easily infer that any PATCH operation must load the rest of the entity from the database before they can validate the resulting change. They can both infer validation logic from measuring the time of their requests (hashing algorithms are particularly vulnerable to timing attacks), and there is an entire class of State Injection attacks that can be performed if we know that an entity will be loaded, and then have a change applied, before being validated.

In short, don’t use PATCH requests. Accept the whole entity, validate it, and then try to apply the change. Benign actors will just send back what they have, and malicious actors won’t be able to get any data.

6 - Standard Endpoints

Every resource server must provide a set of standard endpoints to ensure that clients can discover and interact with the service.

The following endpoints are required for every resource server:

  • Well-Known Resources: /.well-known
  • Public API Descriptor: /openapi.(json|yaml)

/.well-known/

At the root of every resource server, a .well-known directory must exist as per RFC-5785. This directory is there to contain URLs that will never change, and are therefore ‘well-known’ for any client that wishes to perform some method of autodiscovery. Similar to DNS-SD, its purpose is to provide a way to discover a resource server’s configuration without having to reference external documentation.

Creating a .well-known directory within a subdirectory is expressly forbidden. While technically permitted under certain readings of some RFCs, it obfuscates the location of such documents and thus makes autodiscovery difficult at best, and impossible at worst.

/openapi.(json|yaml)

Every resource server must provide its own OpenAPI specification document at a consistent path. This API descriptor must be accurate enough, and well documented enough, to generate SDKs and clients directly from the document.

This endpoint exposes an OpenAPI v3.0.3 document that describes the public API of this version of the microservice. This endpoint must return appropriately formatted content for both application/json and application/yaml. Each new version of the API should be added to this same document; for details on the document structure, please refer to the OpenAPI Specification - Version 3.0.3.