Core API Structure
This section covers the common structural elements that form the organizational foundation of your RESTful
API. It ensures that your project cleanly separates resources and their contexts and offers explicit rules for paths,
actions, and errors. This structure is designed to be simple, easy to understand, and easy to implement.
A clear and consistent structure brings numerous benefits. When all APIs follow the same design principles, they are
easier to understand and use. Consistent foundational elements allow new features and services to be added seamlessly,
which keeps the system scalable. Standardized patterns for paths, actions, and errors simplify maintenance and updates,
reducing bugs and inconsistencies. A clear structure and explicit guidelines also improve collaboration among
developers, cutting down onboarding and code review time. Finally, a consistent, well-structured API is easier to
interact with, which leads to greater user satisfaction.
Or, in plain talk: If everything follows the same rules, you can implement it once in a shared library and
move on to worrying about your business logic instead.
1 - Resource Entities
The required fields and naming conventions for all resources in the API.
Requirements
Resource APIs should only be expressed in JSON or YAML format. For more details on type negotiation, please refer to
our section on Content-Type.
Fields
For all resources, as expressed in request and response payloads, the following fields are required:
| Key | Format | Description |
| --- | --- | --- |
| id | string | A unique identifier for this resource, using either a KSUID or a UUID. |
| created_time | RFC-3339 | The date that this resource was created, in UTC. |
| modified_time | RFC-3339 | The date this resource was last modified. It must be used for Last-Modified style cache requests. |
| etag | string | A base-64 style string as the document’s version identifier. |
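For illustration, a minimal resource containing only the required fields might look like the following; all values shown are placeholders, not real identifiers:

{
  "id": "2NVWo1z4pRi9FDfnDhQik3nXQGF",
  "created_time": "2024-01-15T09:30:00Z",
  "modified_time": "2024-01-16T12:45:00Z",
  "etag": "ZGVhZGJlZWY="
}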
Naming Conventions
All request and response fields must follow the same naming conventions:
- All fields must be snake_case. No golang-style exceptions for acronyms like URL, ID, or similar.
- All complex types which are expressed using a basic type must be suffixed with the name of the complex type. For
  example:
  - External IDs should be suffixed with _id (whether they’re UUIDs or not).
  - UUIDs that are not used as external ID references must be suffixed with _uuid.
  - Timestamps must be suffixed with _time.
  - Email addresses must be suffixed with _email.
  - URLs must be suffixed with _url.
- Do not stutter. Instead of resource.resourceName, use resource.name.
- When using generic terms that may apply to multiple resources, use the most specific version. For example,
  project_group and user_group instead of group.
- When referencing a resource in a third-party system, where that third-party system is the source of truth for that
  resource, prefix the field with the name of that system. Examples include: okta_user_id, ms_client_id,
  aws_credentials_id.
- All APIs must consistently use the same fields to mean the same thing, without exceptions.
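To make the conventions concrete, here is a hypothetical widget payload shown first in a non-conforming form and then rewritten to follow the rules above; the field names and values are purely illustrative:

Non-conforming:
{
  "widgetName": "Primary widget",
  "createdAt": "2024-01-15T09:30:00Z",
  "oktaUser": "00u1abcd2EFGHIJ3k4x5"
}

Conforming:
{
  "name": "Primary widget",
  "created_time": "2024-01-15T09:30:00Z",
  "okta_user_id": "00u1abcd2EFGHIJ3k4x5"
}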
Forbidden Fields
The following fields and field groups are forbidden for all resources:
- links or selfLink, and any other fields prescribed by HATEOAS. We explicitly do not support it.
- Sensitive data of any form.
- Internal-use fields and/or debugging information.
- Binary data is explicitly not permitted as part of a resource.
Validations
- Email addresses must be validated to ensure they follow a standard format.
- All URLs must be absolute.
- Phone numbers must be standardized to a single format, preferably using E.164.
- All time notation must adhere to RFC-3339 and be expressed in UTC.
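As an illustration, a resource that satisfies these validation rules might carry values like the following; the field names are hypothetical, chosen to match the naming conventions above:

{
  "support_email": "help@example.com",
  "homepage_url": "https://example.com/docs",
  "support_phone": "+14155550123",
  "created_time": "2024-01-15T09:30:00Z"
}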
Reasoning
“There are 2 hard problems in computer science: cache invalidation, naming things, and off-by-1 errors.” ― Leon
Bambrick
I can’t count how often I’ve asked for a consistent name to be used, just to receive pushback because the owner of the
PR has their own views on the topic. In the end we all have the same goal: An easy-to-use system that is legible
during our engineering day-to-day. Unfortunately, without a consistent naming pattern, every engineer applies their
own personal view on things, and we end up with a patchwork of conflicting standards across our codebase.
This is a usability nightmare. Not only for our engineers, who now have to read most of the code to decipher what
a particular field might mean. It’s an even bigger problem for API consumers, who now have to read the documentation to
understand what a field might mean. Assuming your documentation is good - if it’s not, then expect your support channels
to be overwhelmed with clarifications.
All of this boils down to time, and therefore money. Time your engineers have to spend, time your support team has to
waste, time your customers have to invest in understanding your system. All of this can be avoided by simply adhering to
this convention.
2 - Error Format
The response format and structure for all error messages.
Error messages are a critical part of the API contract and should be consistent
across all resources.
They provide the necessary feedback to the client to understand what went wrong,
and most importantly how to fix it.
In all cases, the message should be actionable, providing the user with the
necessary information to resolve the issue.
This will reduce the number of support tickets and improve the overall user
experience.
An error message needs to solve three problems, all without leaking
implementation details to the outside world:
- For both engineers and users, it needs to explain what happened.
- For users, it needs to provide actionable feedback to self-resolve their
issues.
- For engineers, it needs to provide detailed information needed for triage.
The error format adopted by this contract adheres to the OAuth2 error response
structure, as
per RFC 6749. This structure
is as follows:
HTTP/1.1 4xx/5xx Message
{
"error": "an_error_key",
"error_description": "Text providing additional information, used to assist the client developer in understanding the error.",
"error_uri": "An optional link to documentation that may assist in resolving this error"
}
| Property | Required | Type | Description |
| --- | --- | --- | --- |
| error | yes | string | A short, unique string which can be used to switch automatic remediation steps in the caller. For example, a user may be prompted to add missing information so that the request can be retried quickly. This code should be in snake_case. |
| error_description | yes | string | An actionable, human-readable message of the error that occurred. Actionable, in this context, means that a non-engineer can read it and know how to remediate their error; even if it’s something like “Please wait and try again later.” In the case where an error is not actionable, this should be clearly communicated. |
| error_uri | no | uri | If documentation exists to describe the details of a request and/or feature, you may optionally add a fully qualified URI which a user can read for more information. Please ensure that the existence of this URI is automatically and continuously verified. |
If your site makes use of internationalization, the error description must be
appropriately localized as per our Internationalization guidelines.
Guidelines for writing a good error message
- Proper grammar is important. Use proper capitalization and punctuation. Use a
  spell checker.
- Always end your message with a period.
- Never include the product name in the error.
- Never use explicit service or technology names, preferring the role they
  serve. For example, “Error reading from redis” should instead be “Error reading from cache”.
- Always speak in the active voice, not the passive voice.
- Do not address the actor using the word “You”; simply describe the error.
- Do not reveal details about the underlying implementation or internal
  constructs.
Example error strings
- The field ‘field’ is required.
- The field ‘field’ may not be longer than 256 characters.
- The server is busy, please try again later.
- This request is not authorized.
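Putting the format and the writing guidelines together, a complete error response for a missing field might look like this; the error key, status code, and documentation URI are illustrative:

HTTP/1.1 400 Bad Request
Content-Type: application/json

{
  "error": "missing_required_field",
  "error_description": "The field ‘name’ is required.",
  "error_uri": "https://docs.example.com/errors/missing_required_field"
}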
3 - Domain Names
A clear and organized domain name strategy, explicitly separating functional areas by using subdomains.
In designing a robust and scalable API, having a clear and organized domain name strategy is imperative.
Requirements
Dedicated Domain for each Project/Product
- Every distinct project or product must have its own domain or subdomain.
- Example:
  - A company with a single product would host it at example.com
  - A company with multiple products would host them at product1.example.com, product2.example.com, etc.
Subdomains for each Functional Area
- Each significant functional area within a project should have its own subdomain.
- Examples:
  - A product’s data warehouse would be hosted at data.product1.example.com
  - A product’s reporting API would be hosted at reports.product1.example.com
Cross-functional areas to be hosted at the closest shared parent domain
- If there is an area that applies to multiple products in a domain, it should be hosted within the closest
  namespace that makes sense.
- Example:
  - A company-wide authentication service would be hosted at auth.example.com.
Rationale
“Do one thing, and do it well.” – Doug McIlroy
The Unix philosophy of creating small, focused tools that work together to solve complex problems is a guiding principle
for API design. By dedicating domains and subdomains to specific functional areas, we can create a clear, modular,
and scalable structure that aligns with this philosophy.
By following this approach, you get the following benefits:
- Composability: Not all products need to re-implement something that’s already been built. You can reuse existing
services across different products.
- Clarity: Developers can quickly locate the API they need.
- Independence: Each domain can have its own development, CI, and lifecycle.
- Security: Limiting the scope of potential security breaches, and discouraging use of private back-door endpoints
for internal use.
- Modularity: Teams can work on different domains simultaneously without interference, increasing iteration cycles
and agility. Updates, patches, and new features can be rolled out to specific parts of the API without risking the
stability of the entire system.
- Scalability: The structure allows for the seamless addition of new features and services, as the foundational
elements remain consistent.
4 - Path Patterns
This section covers rules around paths and resource addressing in APIs.
It ensures that all paths are consistent, easy to understand, and easy to implement. It’s divided into four sections:
Resources, Queries, Rules, and Examples.
Resource Paths
All paths to resources must adhere to this pattern. Components of this pattern are described below.
/<version>/<resource_name>/<id>
<version>
The version section is there to allow for multiple versions of the same resource to exist simultaneously as your
software goes through various lifecycles. This is a common pattern in API design, and it is important to get right, so
use a simple, incrementing integer for versioned paths in your API: v1, v3, v333. If it is necessary to introduce a
major breaking change, all paths in that API must increment at the same time, even though they may not have any changed
code. This is to assist your lifecycle; by gathering breaking changes all together and implementing them all at once,
you can deprecate them on the same schedule.
/v1/resourcename
/v2/renamed_resource
<resource_name>
A resource name should always be plural and use snake_case.
<id>
If an obvious identifier is not available, such as the ID of an upstream cloud resource or an identifier derived from
dependent sources, the ID should be a V4 UUID seeded from a strong random source. This field is only relevant if you
are directly addressing an individual resource.
POST /v1/widgets/
GET /v1/widgets/{id}
PUT /v1/widgets/{id}
DELETE /v1/widgets/{id}
Specific requirements are available on our page about CRUD Operations.
Query paths
It is difficult to express a complex set of query parameters in a URL’s query string. Therefore,
all actions that ask for the creation of a result set - List, Aggregation, Graph, or otherwise - must be
POST operations with the query expressed in the request body. A side-effect of this is that a commonly used
list endpoint is not used in our API contract, as the POST action on that route is already used for resource creation.
POST /v1/widgets/query
POST /v1/widgets/aggregate
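As an illustration, a query request carries its constraints in the request body; the body structure shown here is purely hypothetical, as the actual query format is defined in the dedicated pages referenced below:

POST /v1/widgets/query HTTP/1.1
Content-Type: application/json

{
  "filter": {
    "status": "active"
  },
  "page_size": 50
}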
Specific requirements are described in more detail in their respective pages.
API Rules
These are concrete rules to which all paths must adhere.
Sub-resources are undesirable
Sub-resources are a common usability improvement for an API, allowing easy scoping of a result set based on a parent
entity. They are, however, only permitted in a limited set of circumstances.
It is important to consider what a sub-resource URL communicates to an API consumer.
- Does this subresource incorrectly imply a hierarchical relationship?
- Is the 1-to-N relationship between the parent and child likely to change in the long run?
If the answer to either of the above questions is yes, do not create a sub-resource. Instead, create a top-level
resource with its own query endpoints so a user can constrain their result set based on a relationship.
In the majority of cases, a sub-resource is not desirable. Consider the points above carefully before creating one.
Sub-resource path patterns
Note that a resource can have both a child access path and a top-level access path. In this case, the resource name in
the path MUST be identical for both the child route and the root resource route, and creation of a child resource
via the parent resource’s path MUST use the Location header of the root path.
/v1/widgets/<id>/sprockets/<id>
/v1/sprockets/<id>
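For example, creating a sprocket through its parent widget’s path would still point the Location header at the root sprocket path; the identifiers below are placeholders:

POST /v1/widgets/{widget_id}/sprockets/ HTTP/1.1
Content-Type: application/json

{ ... sufficient data ... }

HTTP 201 Created
Location: /v1/sprockets/{id}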
No Business Logic Actions
A REST API is, by design, an expression of the state of a system. Business logic actions can be roughly described as
permutation requests against this state, some of which can happen concurrently, while others happen asynchronously. Most
importantly, some of these actions cannot be done in parallel.
In order to prevent accidental parallel actions, or the conflation of ‘business logic actions’ and our above
sub-resource requirements, it is required that all business logic must be expressed as entity permutations.
A common counterargument from the engineer’s perspective is that it is far easier to build a single-purpose endpoint
than a single all-purpose endpoint. This is true, but it forces the API user to build their own complex entity
validation and action routing. From their perspective, it is far easier to call a single endpoint for all business
operations than implement - and keep track of - multiple ones. Customer usability is more important than engineer
convenience.
Example: Running a report
In this example, we are separating the concept of a report configuration, and the data generated by the report at a
specific point in time.
- Example API routes for the generated reports
POST /v1/reports/0000000000000000/snapshots/
POST /v1/reports/0000000000000000/snapshots/query
GET /v1/reports/0000000000000000/snapshots/{id}
In this example, we are creating a new, explicit sub-resource that acts as a generated report snapshot for a specific
point in time. This allows the user to query the report, and retrieve the data at a specific point in time.
The response entity could include such useful status information as the state of the current report (if it is taking
some time), the current progress, and when it was started. Furthermore, the creation of a report can be
explicitly blocked if another one is already being run.
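A snapshot entity returned from these routes could, for example, expose that status information alongside the required resource fields; the status fields shown here are illustrative, not prescribed:

{
  "id": "...",
  "created_time": "2024-01-15T09:30:00Z",
  "modified_time": "2024-01-15T09:31:12Z",
  "etag": "...",
  "state": "running",
  "progress_percent": 42
}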
No Resource Expansion
An individual resource must not expand any resources to which it is linked. Doing so can lead to memory pointer bugs in
clients, where multiple instances of a resource are created in memory, and the client has to manage the state of the
API.
5 - CRUD Operations
The constraints for all Create, Read, Update, and Delete operations in the system.
The Create/Read/Update/Delete (CRUD) requirements are deliberately optimized for cache usage and conflict control, so
that the client has a rich suite of tools at their disposal to manage their local state, and can even rely
on their client’s own cache to manage the state of the API.
Create
A create request is a POST request to the root resource path, with the entity to be created in the request body. The
entity should not be returned in the response; instead, the response must use a 201 Created status code, and
indicate in the Location header where the entity can be retrieved (or polled).
Create Request
POST /v1/resourcename/ HTTP/1.1
Content-type: application/json
{ ... sufficient data ... }
Create Response
HTTP 201 Created
Location: /v1/resourcename/{id}
Read
A read operation is a GET request to the resource path, with the entity’s ID in the path. The entity should be returned
in the response, including all entity version headers as described in Entity Versioning and Conflict Management. Conditional cache headers may be included in the request, and must
be respected by the server.
Read Request
GET /v1/resourcename/{id} HTTP/1.1
Accept: application/json
Read Response
HTTP 200 OK
Content-Type: application/json
ETag: "..."
Last-Modified: "..."
Cache-Control: max-age=3600
Vary: Accept, Origin
{
"id": "...",
"eTag": "...",
...
}
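Because conditional cache headers must be respected, a client holding a cached copy can revalidate it instead of re-downloading the entity. A minimal sketch of such an exchange, assuming the server honours If-None-Match as defined in RFC-9110:

Conditional Read Request
GET /v1/resourcename/{id} HTTP/1.1
Accept: application/json
If-None-Match: "..."

Conditional Read Response (unchanged entity)
HTTP 304 Not Modified
ETag: "..."
Cache-Control: max-age=3600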
Update
An update operation uses the PUT action to replace all fields in the entity, assuming they may be replaced. If the
client sends fields that may not be updated - such as created_time - they should be ignored. As with the GET
request, all headers indicating entity version and age must be returned.
Update Request
PUT /v1/resourcename/{id} HTTP/1.1
Accept: application/json
Content-Type: application/json
{
"id": "...",
...
}
Update Response
HTTP 200 OK
Content-Type: application/json
ETag: "..."
Last-Modified: "..."
Cache-Control: max-age=3600
Vary: Accept, Origin
{
"id": "...",
...
}
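Given the contract’s emphasis on conflict control, updates will typically be sent conditionally. The sketch below assumes the server honours If-Match preconditions as defined in RFC-9110; the exact conflict behaviour is specified in Entity Versioning and Conflict Management:

Conditional Update Request
PUT /v1/resourcename/{id} HTTP/1.1
Accept: application/json
Content-Type: application/json
If-Match: "..."

{
  "id": "...",
  ...
}

Conditional Update Response (stale entity version)
HTTP 412 Precondition Failed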
Delete
Delete operations are idempotent, and should return a 204 No Content response. The entity should not be returned in
the response.
Delete Request
DELETE /v1/resourcename/{id} HTTP/1.1
Delete Response
No entity needs to be returned.
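Mirroring the other operations, the response is just the status line described above:

HTTP 204 No Content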
Reasoning
The entire point of this section of the contract is to make development easy for the front-end/downstream engineers.
We do this by focusing on two areas: Controlling the client’s cache using HTTP Semantics
from RFC-9110, and outright rejecting PATCH operations as they almost
always carry with them complex diffing logic.
Details about how we accomplish cache control in resources are described in
Entity Versioning and Conflict Management. As for PATCH
operations, I’ve found from experience that they add quite a bit of engineering overhead to both the client and server
logic. Since the client will likely already have a copy of the existing entity in memory, creating a new entity
for a patch request is a waste of resources as well as a potential source of diffing bugs - especially if the data schema
expands.
At the same time, this creates an attack vector for malicious actors, who can easily infer that any PATCH
operation must load the rest of the entity from the database before they can validate the resulting change. They
can both infer validation logic from measuring the time of their requests (hashing algorithms are particularly
vulnerable to timing attacks), and there is an entire class of State Injection attacks that can be performed if
we know that an entity will be loaded, and then have a change applied, before being validated.
In short, don’t use PATCH requests. Accept the whole entity, validate it, and then try to apply the change. Benign
actors will just send back what they have, and malicious actors won’t be able to get any data.
6 - Standard Endpoints
Every resource server must provide a set of standard endpoints to ensure that clients can discover and interact with the service.
The following endpoints are required for every resource server:
- Well-Known Resources: /.well-known
- Public API Descriptor: /openapi.(json|yaml)
/.well-known/
At the root of every resource server, a .well-known directory must exist as per RFC-5785. This directory is there to
contain URLs that will never change, and which are therefore ‘well-known’ to any client that wishes to perform some
method of autodiscovery. Similar to DNS-SD, its purpose is to provide a way to discover a resource server’s
configuration without having to reference external documentation.
Creating a .well-known directory within a subdirectory is expressly forbidden. While technically permitted
under certain readings of some RFCs, it obfuscates the location of such documents and thus makes
autodiscovery difficult at best, and impossible at worst.
/openapi.(json|yaml)
Every resource server must provide its own OpenAPI specification document at a consistent path. This API descriptor must
be accurate enough, and well documented enough, to generate SDKs and clients directly from the document.
This endpoint exposes an OpenAPI v3.0.3 document that describes the public API of this version of the microservice.
This endpoint must return appropriately formatted content for both application/json and application/yaml. Each new
version of the API should be added to this same document; for details on the document structure, please refer to
the OpenAPI Specification - Version 3.0.3.
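As a sketch of client discovery, the descriptor can be fetched in either representation from the paths defined above; the Accept headers shown here reflect the content negotiation described in the Content-Type section:

GET /openapi.json HTTP/1.1
Accept: application/json

GET /openapi.yaml HTTP/1.1
Accept: application/yaml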