Queries are actions against a collection of resources, which allow your users to filter, sort, and aggregate data in a way that is meaningful to them. This section of the contract defines the structure of these queries, including Filtering, Searching, Pagination, Sorting, and Aggregation.
Queries
- 1: Pagination and Sorting
- 2: Searching and Filtering
- 3: Projection
- 4: List Queries
- 5: Aggregation Queries
1 - Pagination and Sorting
Pagination
There are two ways to paginate a result set: Next/Previous, and Offset/Limit. We understand that most User Interfaces strongly prefer an offset/limit style, as it is more intuitive to human users. For the resource server, however, this style presents a technical challenge at scale, as many data stores (most notably document-based stores) will not know how many documents match any given set of criteria until they have traversed the entire result set.
That’s not to say offset/limit can’t be used. For our purposes, however, we cannot create requirements in this contract which will not themselves scale with the data. Falling back to use cases, it is (from experience) quite rare for a human to page deeply into a result set, while a script loading all results is quite common. As such, we will only support the Next/Previous style of pagination, optimizing for the most frequent use case.
Please note that a proper implementation of Aggregation queries can easily provide the necessary metadata to simulate offset/limit style pagination, if that is a requirement for your use case. It is the responsibility of the client to decide which method is most appropriate for their audience.
Pagination Example
Here, we are submitting a query that will return the first 100 results of a resource.
POST /v1/resource/query HTTP/1.1
{
"start": "....", // Optional ID of the first record to return.
"limit": 100, // The number of results to return, default 100
// ... Sort and filter parameters, as appropriate for the query.
}
As there are more than 100 results, the response will include a Link header with a next link to the next page of results, as per RFC-5988. It also includes an ETag, calculated from the content, and a Last-Modified header, indicating the last-modified record in this set.
HTTP/1.1 200 OK
ETag: "dd2796ae-1a46-4be5-b446-7f8c7a0e8342"
Last-Modified: Wed, 21 Oct 2015 07:28:00 GMT
Link: <https://api.example.com/v1/resource/query?start=....&limit=100>; rel="next"
{
... results as per the query type
}
Property | Relevance | Type | Description |
---|---|---|---|
start | request | string | An optional start index from which the result should be read. This must be the ID of the first record of the result set. |
limit | request | int | An optional number of results to return in the page, with a default of 100. |
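As a minimal client-side sketch of Next/Previous pagination, the snippet below extracts the next link from a response's Link header. The helper names (parse_link_header, next_page_url) are illustrative and not part of this contract, and the parser only handles the simple `<url>; rel="..."` form shown in the example above, not every parameter permitted by RFC-5988.

```python
import re
from typing import Dict, Optional

# Matches one `<target>; rel="relation"` entry of a Link header value.
_LINK_ENTRY = re.compile(r'<([^>]*)>\s*;\s*rel="?([^";,]+)"?')


def parse_link_header(value: str) -> Dict[str, str]:
    """Map each link relation (for example "next") to its target URL."""
    return {rel: url for url, rel in _LINK_ENTRY.findall(value)}


def next_page_url(headers: Dict[str, str]) -> Optional[str]:
    """Return the URL of the next page, or None when this is the last page."""
    value = headers.get("Link")
    if value is None:
        return None
    return parse_link_header(value).get("next")
```

A script loading all results would keep requesting next_page_url(...) until it returns None.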
Sorting
Every resource must choose a human-relevant, intuitive dimension to use as a default sort. For example, a Report service might choose to sort by name, while a Security Violations service may sort by severity or age. An API consumer may then choose to use their own dimensions. They are expressed in order, as below.
Sorting Example
{
"sort": [ // Note that the 'sort' field is optional.
{
"on": "age", // The resource property to sort on.
"order": "ASC|DESC" // "ASC" or "DESC", representing ascending or descending sorts. Default is "ASC".
},
{ // A second sort dimension, after the first is applied.
"on": "name",
"order": "ASC|DESC"
}
]
}
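To illustrate how ordered sort dimensions combine, here is a small in-memory sketch. The function name apply_sort is hypothetical, and it assumes flat property names and mutually comparable values. A real resource server would normally push the ordering down to its data store.

```python
from typing import Any, Dict, List


def apply_sort(records: List[Dict[str, Any]], sort: List[Dict[str, str]]) -> List[Dict[str, Any]]:
    """Order records by each sort dimension, the first dimension taking priority."""
    ordered = list(records)
    # Python's sort is stable, so sorting by the last dimension first and the
    # first dimension last yields the expected multi-level ordering.
    for dimension in reversed(sort):
        descending = dimension.get("order", "ASC").upper() == "DESC"
        ordered.sort(key=lambda record: record[dimension["on"]], reverse=descending)
    return ordered
```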
Sorting inherently conflicts with searching; searching provides its own implicit ordering by relevance, which would be overridden by sort. Therefore, any request that includes both a search and a sort must return a 400 response indicating that they are not compatible. For “Search and sort” style operations, please use wildcards in a filter.
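The rule above could be enforced with a check along these lines. This is only a sketch: the function name is hypothetical, the body keys mirror the error/error_description shape used elsewhere in this contract, and the "invalid_request" code is an assumption rather than a value defined here.

```python
from typing import Any, Dict, Optional, Tuple


def check_search_and_sort(query: Dict[str, Any]) -> Optional[Tuple[int, Dict[str, str]]]:
    """Return a (status, body) rejection when search and sort are combined, else None."""
    if query.get("search") and query.get("sort"):
        return 400, {
            "error": "invalid_request",  # Hypothetical error code.
            "error_description": "'search' and 'sort' are not compatible; "
                                 "use a wildcard filter together with 'sort' instead.",
        }
    return None
```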
2 - Searching and Filtering
Searching is an inherently inclusive operation, whereby the system will add all records whose values closely - but not exactly - match the search string. Filtering, by contrast, constrains the set of records to those that exactly match the provided values, though they may support wildcards. They can operate in tandem, with a filter constraining the field in which a search is applied, but care must be taken that the system can handle the complexity of the combined operation.
Searching
A search string is provided by a user when they are not entirely certain where the result they are seeking is expressed. For example, a word expressed in a search string may exist in a title, a description, or any other property of the resource.
POST /v1/resources/query HTTP/1.1
{
"search": "...."
}
Property | Relevance | Type | Description |
---|---|---|---|
search | request | string | A string by which to search in the set of resources. |
It is left to each resource type to define which fields are included in the search index, and what tokenization method is used to decompose the resource instance and the incoming search expression. In all cases, the result should be a “best fit” match, should be case-insensitive, and should be returned in the order of most relevant first.
Filter
A query may include a filter object, which expresses a tree-like structure of logical filters and their relevant operands. There are two basic types of filter objects: single and multiple. If a search string is also provided, it must only be applied to resources that also match the filters.
If a search string is provided, and the query also includes sorting criteria, the service must return a 400 Bad Request stating that search and sort are not compatible. Searching already includes an implicit sort based on the relevance of each record, which a sort expression would conflict with.
Single Value Operation
Filtering for a single value on a single field looks as follows:
POST /v1/resources/query HTTP/1.1
{
"filters": {
"op": "....", // the logical operation that applies to a single value (see below)
"key": "dot.notation", // The key where the value should be found, using dot-notation for deep nesting.
"value": "some-value" // The input value for the logical operation, if appropriate.
}
}
Multi-Value Operation
Filtering on multiple criteria would expand on the above.
POST /v1/resources/query HTTP/1.1
{
"filters": {
"op": "....", // the logical operation that applies to multiple values.
"values": [
{
// A list of single or multi-value operations.
}
]
}
}
Data Schema
Key | Type | Relevance | Description |
---|---|---|---|
op | string | All | The operation to perform. Case insensitive, see below for a full list of required operations. If not provided, the default value is assumed to be EQ for single values, and OR for multi values. |
key | string | Single Value | The property on which to perform the operation, which may include dot-notation. |
value | string | Single Value | A value to use for single-value operations. |
values | Filter Array | Multi Value | A list of operations. If an empty array is included, no records should match. |
The value property is always a string, though its format may be type specific:
- Dates must be formatted as RFC-3339.
- String values may include the wildcard *, which represents zero or more of any character.
- Large numbers (big.Int) must be expressed as base64 encoded strings.
- Regular Expressions must not include leading and trailing slashes.
Valid Operations
Operation | Meaning | Relevance | Notes |
---|---|---|---|
EQ | Strictly Equals | Single Value | |
NEQ | Not Equals | Single Value | |
GT | Greater Than | Single Value | |
LT | Less Than | Single Value | |
GE | Greater or Equal To | Single Value | |
LE | Less than or Equal To | Single Value | |
REGEX | Regular Expression | Single Value | |
AND | And | Multi Value | All of the provided filters must be true. |
OR | Or | Multi Value | Any of the provided filters must be true. |
XOR | Exclusive or | Multi Value | Exactly one of the provided filters must be true. |
XNOR | All or nothing | Multi Value | Either all of the provided filters must be true, or all must be false. |
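The filter tree and the operations in the table can be evaluated recursively. The following is a rough in-memory sketch, not a reference implementation. It makes several assumptions: a missing key never matches, numeric fields are compared numerically while everything else is compared as a string (which only orders RFC-3339 dates correctly when they share one format and offset), and wildcard handling is left out. The names lookup and matches are illustrative.

```python
import operator
import re
from typing import Any, Callable, Dict

# Single-value comparisons, keyed by operation name.
COMPARISONS: Dict[str, Callable[[Any, Any], bool]] = {
    "EQ": operator.eq, "NEQ": operator.ne,
    "GT": operator.gt, "LT": operator.lt,
    "GE": operator.ge, "LE": operator.le,
}


def lookup(record: Dict[str, Any], key: str) -> Any:
    """Resolve a dot-notation key such as "owner.age"; missing keys yield None."""
    current: Any = record
    for part in key.split("."):
        if not isinstance(current, dict) or part not in current:
            return None
        current = current[part]
    return current


def matches(record: Dict[str, Any], node: Dict[str, Any]) -> bool:
    """Evaluate one filter node (single or multi value) against a record."""
    if "values" in node:
        op = node.get("op", "OR").upper()  # OR is the multi-value default.
        results = [matches(record, child) for child in node["values"]]
        if not results:
            return False  # An empty array matches no records.
        if op == "AND":
            return all(results)
        if op == "OR":
            return any(results)
        if op == "XOR":
            return results.count(True) == 1
        if op == "XNOR":
            return all(results) or not any(results)
        raise ValueError(f"unsupported multi-value operation: {op}")

    op = node.get("op", "EQ").upper()  # EQ is the single-value default.
    actual = lookup(record, node["key"])
    if actual is None:
        return False  # Assumption: a missing key never matches.
    if op == "REGEX":
        return re.search(node["value"], str(actual)) is not None
    if op not in COMPARISONS:
        raise ValueError(f"unsupported single-value operation: {op}")
    # Filter values always arrive as strings; compare numerically for numbers.
    expected: Any = node["value"]
    if isinstance(actual, (int, float)) and not isinstance(actual, bool):
        expected = float(expected)
    else:
        actual = str(actual)
    return COMPARISONS[op](actual, expected)
```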
Wildcards
Wildcards may be used in string values, using simple Glob matching. For more complex queries, use the REGEX operation.
Wildcard Character | Operation |
---|---|
* | Zero or more of any character. |
? | Any single character. |
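One way to implement these two wildcards is to translate the glob into a regular expression. In this sketch * is treated as zero or more characters, following the description of the value property earlier in this section. The function names are illustrative, and case sensitivity is left as-is because this contract does not specify it for filters.

```python
import re


def glob_to_regex(pattern: str) -> re.Pattern:
    """Translate a simple glob (only * and ?) into a compiled regular expression."""
    parts = []
    for char in pattern:
        if char == "*":
            parts.append(".*")  # Zero or more of any character.
        elif char == "?":
            parts.append(".")   # Exactly one character.
        else:
            parts.append(re.escape(char))  # Everything else is literal.
    return re.compile("".join(parts), re.DOTALL)


def glob_match(pattern: str, value: str) -> bool:
    """Return True when the whole value matches the glob pattern."""
    return glob_to_regex(pattern).fullmatch(value) is not None
```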
Examples
String Equality
{
"filters": {
"op": "OR",
"values": [
// EQ is the default
{
"key": "name",
"value": "some_value"
},
{
"key": "name",
"value": "some_other_value"
}
]
}
}
Date Range
{
"filters": {
"op": "AND",
"values": [
{
"op": "GE",
"key": "createdDate",
"value": "1985-04-12T00:00:00Z"
},
{
"op": "LE",
"key": "createdDate",
"value": "1985-04-12T23:59:59Z"
}
]
}
}
3 - Projection
Projection allows a client to specify which fields it is interested in. This permits further optimization of client queries; however, this feature should only be implemented if it is business critical. It is - in the strictest sense of the term - a premature optimization.
With that in mind, a client may add either an include or an exclude list to the query. If both are present, the server should respond with a 400 Bad Request error.
Property | Relevance | Type | Description |
---|---|---|---|
include | request | string array | An optional list of fields to include in response objects. |
exclude | request | string array | An optional list of fields to exclude from response objects. |
POST /v1/resources/query HTTP/1.1
{
// Optional list of fields to include or exclude from the result objects.
"projection": {
// Specify either "include" or "exclude"; providing both results in a 400 Bad Request.
"include": ["fieldName1", "fieldName2"]
}
}
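On the server side, applying a projection to each result object might look like the sketch below. It only handles top-level field names, and the function name and the use of ValueError to signal the 400 Bad Request case are assumptions made for illustration.

```python
from typing import Any, Dict


def apply_projection(resource: Dict[str, Any], projection: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of the resource containing only the projected fields."""
    include = projection.get("include")
    exclude = projection.get("exclude")
    if include is not None and exclude is not None:
        # The caller is expected to translate this into a 400 Bad Request.
        raise ValueError("'include' and 'exclude' cannot both be present")
    if include is not None:
        return {key: value for key, value in resource.items() if key in include}
    if exclude is not None:
        return {key: value for key, value in resource.items() if key not in exclude}
    return dict(resource)
```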
4 - List Queries
In addition to basic CRUD operations, clients frequently list or search within resource sets. This breaks down into two different use cases: that of a UI, where a list with filters and a search box are offered to a user, and that of a machine client, which is usually only interested in a full list of resources.
There are also two RESTful philosophies around resource lists. The first is “Read all resources in a collection”, which is usually implemented as a GET request. The second is “Build a result set based on a query”, which is usually implemented as a POST request. Since we are prescribing a very rich, full-featured query language, it becomes impractical to express all these options in the URL of a GET request, forcing us to adopt the second philosophy.
The Query Path
Since performing a POST request on the root resource is already assigned to creating a resource of that type, we require a dedicated endpoint for querying resources. For generated result sets, we also require a subresource hierarchy to allow for pagination and sorting.
POST /v1/resources/query
GET /v1/resources/query/<result_set_id>
GET /v1/resources/query/<result_set_id>/<page_id>
Requests
Our query endpoints construct their requests using the following three components:
- Filtering and Searching - Which are both optional, and may be combined.
- Pagination and Sorting - Which are both optional.
- Projection - Which is optional.
POST /v1/resources/query HTTP/1.1
{
// Pagination and Sorting as per that spec.
"start": "....",
"limit": ....,
"sort": ....,
// As per our Searching and Filtering spec
"search": "...",
"filters": ....
}
Responses
List responses are a complex topic, as they can be quite large, require sophisticated pagination, and can be time-consuming to generate. As such, we require that all list responses - regardless of implementation - at least pretend to perform background processing to build the result set.
The response to a query request may be one of two types: a direct response, or a deferred response. The direct response is the simplest, and is returned when the result set is already available. It contains the result set as described below, but must also contain the Content-Location header to indicate the actual URL of the provided result set.
HTTP/1.1 200 OK
ETag: "dd2796ae-1a46-4be5-b446-7f8c7a0e8342"
Last-Modified: Wed, 21 Oct 2015 07:28:00 GMT
Content-Location: https://api.example.com/v1/resources/query/<result-set-identifier>/1
{
"results": [
....
]
}
A deferred response is returned when the result set is not yet available, or if you simply want to pretend that it’s not available yet. In this case, the server should return a 201 Created response with a Location header pointing to the first page of the result set.
HTTP/1.1 201 Created
Location: https://api.example.com/v1/resources/query/<result-set-identifier>/1
Once redirected, if the requested page is not yet ready, the server must return a 202 Accepted response with an appropriate Retry-After header, and an error response body that can assist in remediation.
HTTP/1.1 202 Accepted
Retry-After: 30
Cache-Control: no-store
{
"error": "not_ready",
"error_description": "A text description about how much longer it might take."
}
Once the result set is ready, the server should return a 200 OK response with the result set, as well as the following headers:
- ETag - A hash of the result set, used for caching, as described in Entity Versioning.
- Last-Modified - The last-modified date of the most recently modified resource in the result set.
- Cache-Control with the max-age field, to communicate to the client when a result set will be considered stale.
HTTP/1.1 200 OK
ETag: "dd2796ae-1a46-4be5-b446-7f8c7a0e8342"
Last-Modified: Wed, 21 Oct 2015 07:28:00 GMT
Cache-Control: max-age=3600
Link: <https://api.example.com/v1/resources/query/<result-set-identifier>/2>; rel="next"
{
"results": [
....
]
}
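From a client's point of view, the deferred flow amounts to polling the page URL until it stops answering 202 Accepted. The sketch below assumes a caller-supplied fetch function that performs the GET and returns the status, headers, and parsed body. That function, the name wait_for_page, and the max_attempts limit are illustrative and not part of this contract.

```python
import time
from typing import Any, Callable, Dict, Tuple

# (status code, headers, parsed body) for a GET of the given URL.
Response = Tuple[int, Dict[str, str], Any]


def wait_for_page(
    url: str,
    fetch: Callable[[str], Response],
    sleep: Callable[[float], None] = time.sleep,
    max_attempts: int = 10,
) -> Any:
    """Poll a result-set page until it returns 200 OK, honouring Retry-After."""
    for _ in range(max_attempts):
        status, headers, body = fetch(url)
        if status == 200:
            return body
        if status != 202:
            raise RuntimeError(f"unexpected status {status} for {url}")
        # Retry-After is read as a number of seconds, as in the example above;
        # the HTTP-date form of the header is not handled in this sketch.
        sleep(float(headers.get("Retry-After", "1")))
    raise TimeoutError(f"{url} was not ready after {max_attempts} attempts")
```

A client would typically call this with the URL taken from the Location header of the 201 Created response, or from a next link when moving to a later page.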
Empty results
An empty result set should - for the first page - return a 200 OK response with an empty result set, and no Link header.
HTTP/1.1 200 OK
ETag: "dd2796ae-1a46-4be5-b446-7f8c7a0e8342"
Last-Modified: Wed, 21 Oct 2015 07:28:00 GMT
Cache-Control: max-age=3600
{
"results": []
}
Access Rights Violations
Access rights violations come in three types:
- A request has an invalid authorization token.
- A request has not been granted permission to read this resource type.
- A request is constrained to only a limited set of the available resources.
In the first case, the service should respond with a 401 response according to our common errors specification. The second, similarly, should return 403. For all other requests, the result set should be constrained only to the resources which the user is authorized to see. Even if explicitly named in a filter, if a user cannot see that resource, the result set should be empty.
5 - Aggregation Queries
Aggregation queries are a powerful way by which aggregate data can be collected in a single query without a client having to iterate over the entire result set. For those of you familiar with ElasticSearch, this is a simplified version of the Bucket, Metrics, and Pipeline aggregation request format, which expresses the options of each while keeping the contract concise for the user (note that only Buckets and Metrics are supported).
Supporting these kinds of queries can be quite complex, and your API may not even need them, so it’s up to you to decide if they are necessary. Use cases which this might satisfy include:
- Autocompleting tags already used in other documents.
- Showing how many documents exist in a particular result set.
- Gathering averages, sums, and other metrics from a set of resources.
Path
Much like List Queries, aggregation queries follow the “Build a result set based on a query” pattern, but this time using the /aggregate sub-path of the resource’s endpoint. Unlike the list queries, however, there is no need to page the response.
POST /v1/resources/aggregate
GET /v1/resources/aggregate/<result_set_id>
Request and Response Schema
Aggregation queries do not support the same filtering, searching, pagination, or sorting semantics as List Queries. Filtering is applied at the top level of the request and affects all aggregations and sub-aggregations. This ensures consistency and simplifies the query structure. Filters cannot be applied to individual aggregations within the query. Sorting can be applied directly to each aggregation bucket (if appropriate).
Common Fields
Every aggregation request contains the same two fields: filters, which is optional and follows our filtering rules, and aggregations, which is a map of the different aggregations requested of the server. The aggregations each have a type property to inform the server what form of aggregation is requested.
POST /v1/resources/aggregate
{
"aggregations": {
"<bucket_name>": {
"type": "terms",
....
},
"<another_bucket_name>": {
"type": "avg",
....
}
}
}
A response to an aggregation request returns the same map, replacing the query constraints with the results of the requested aggregation. For specific examples of requests and responses, please see the detailed examples below.
HTTP/1.1 200 OK
{
"aggregations": {
"<bucket_name>": {
.... results
},
"<another_bucket_name>": {
.... results
}
}
}
Aggregation Query: terms
A ‘terms’ aggregation query sorts all documents into buckets defined by the provided field’s value, and returns the count of documents in each bucket. The number of terms to return should be provided.
POST /v1/resources/aggregate
{
"aggregations": {
"tags": {
"sort": [], // Optional sorting rules, as per the sort standard.
"type": "terms", // Always `terms`
"field": "tags.name", // The field name to aggregate.
"count": 20 // The total number of terms to return.
}
}
}
The server then must respond with the buckets into which the documents were sorted, along with the count of documents in each bucket. Terms should be sorted according to the sorting rules.
HTTP/1.1 200 OK
{
"aggregations": {
"tags": {
"tags.name": {
"Anchovy": { // The term name
"count": 3 // The number of documents which contain this term
},
"Sardine": { // The term name
"count": 60 // The number of documents which contain this term
}
}
}
}
}
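As an in-memory illustration of what a terms aggregation computes, the sketch below counts how many documents contain each term and shapes the result like the response above. It assumes that tags is a list of objects, that each document counts at most once per term, and that terms are ordered by descending count (then by name) when no sort is supplied. None of these are fixed by this contract, and the function names are hypothetical.

```python
from collections import Counter
from typing import Any, Dict, Iterable, List


def field_values(document: Any, key: str) -> List[Any]:
    """Collect every value found at a dot-notation key, descending into lists."""
    current: List[Any] = [document]
    for part in key.split("."):
        found: List[Any] = []
        for item in current:
            candidates = item if isinstance(item, list) else [item]
            for candidate in candidates:
                if isinstance(candidate, dict) and part in candidate:
                    found.append(candidate[part])
        current = found
    # Flatten a trailing list, e.g. {"tags": ["a", "b"]} queried with "tags".
    values: List[Any] = []
    for item in current:
        values.extend(item if isinstance(item, list) else [item])
    return values


def terms_aggregation(documents: Iterable[Dict[str, Any]], field: str, count: int) -> Dict[str, Any]:
    """Count the documents containing each term and keep the top `count` terms."""
    counter: Counter = Counter()
    for document in documents:
        # Count each term once per document, however often it repeats.
        counter.update(set(field_values(document, field)))
    ranked = sorted(counter.items(), key=lambda item: (-item[1], item[0]))
    return {field: {term: {"count": total} for term, total in ranked[:count]}}
```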
Aggregation Query: sum
A sum aggregation calculates the sum of a numeric field. While itself perhaps not the most useful, it becomes quite powerful when used as a nested aggregation (see examples at the end).
POST /v1/resources/aggregate
{
"filters": {....},
"aggregations": {
"award_points": {
"type": "sum", // Always `sum`
"field": "points" // The field name to calculate the sum of.
}
}
}
The server then must respond with the sum of the requested field, along with the count of documents included in the calculation.
HTTP/1.1 200 OK
{
"aggregations": {
"award_points": {
"count": 10332,
"points": 234000
}
}
}
Aggregation Query: avg
An avg aggregation calculates the average of a numeric field.
POST /v1/resources/aggregate
{
"filters": {....},
"aggregations": {
"rating": {
"type": "avg", // Always `avg`
"field": "stars" // The field name to calculate the average of.
}
}
}
The server then must respond with the average of the requested field, along with the count of documents included in the calculation.
HTTP/1.1 200 OK
{
"aggregations": {
"rating": {
"count": 10332,
"stars": 4.23322555
}
}
}
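The sum and avg responses share one shape, so a single helper can sketch both. This is an in-memory illustration with a hypothetical function name. It assumes that count is the number of documents carrying a numeric value for the field, and that the average of an empty set is reported as None. Neither is stated by this contract.

```python
from typing import Any, Dict, Iterable


def metric_aggregation(documents: Iterable[Dict[str, Any]], kind: str, field: str) -> Dict[str, Any]:
    """Compute a `sum` or `avg` aggregation over a top-level numeric field."""
    values = [doc[field] for doc in documents if isinstance(doc.get(field), (int, float))]
    total = sum(values)
    if kind == "sum":
        result: Any = total
    elif kind == "avg":
        result = total / len(values) if values else None
    else:
        raise ValueError(f"unsupported metric aggregation: {kind}")
    # Mirrors the response shape above: a count plus the aggregated field.
    return {"count": len(values), field: result}
```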
Aggregation Query: range
A range aggregation collects documents into numeric ranges for a specific field. The ranges are defined by the from and to properties. If either is omitted, the range is open-ended.
POST /v1/resources/aggregate
{
"filters": {....},
"aggregations": {
"runners": {
"type": "range", // Always `range`
"field": "pace", // The field name to divide into buckets
"ranges": { // An object of pre-named bucket ranges
"slow": { "from": 0, "to": 7 },
"normal": { "from": 7, "to": 9 },
"fast": { "from": 9 }
}
}
}
}
The server then must respond with the buckets that were requested by the user.
HTTP/1.1 200 OK
{
"aggregations": {
"runners": {
"count": 8700,
"pace": {
"slow": {
"count": 100
},
"normal": {
"count": 8000
},
"fast": {
"count": 600
}
}
}
}
}
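A range aggregation can be sketched in memory as follows. The contract does not say whether the bounds are inclusive, so this example assumes that from is inclusive and to is exclusive, which keeps adjacent ranges such as 0-7 and 7-9 from overlapping. It also assumes the top-level count is the number of documents that fell into at least one bucket. The function name is hypothetical.

```python
from typing import Any, Dict, Iterable


def range_aggregation(
    documents: Iterable[Dict[str, Any]],
    field: str,
    ranges: Dict[str, Dict[str, Any]],
) -> Dict[str, Any]:
    """Count the documents whose field value falls into each named range."""
    buckets = {name: {"count": 0} for name in ranges}
    matched = 0
    for document in documents:
        value = document.get(field)
        if value is None:
            continue  # Documents without the field are not counted.
        in_any_bucket = False
        for name, bounds in ranges.items():
            # A missing bound leaves that side of the range open-ended.
            above = "from" not in bounds or value >= bounds["from"]
            below = "to" not in bounds or value < bounds["to"]
            if above and below:
                buckets[name]["count"] += 1
                in_any_bucket = True
        if in_any_bucket:
            matched += 1
    return {"count": matched, field: buckets}
```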
Other Aggregation Queries
The above is not an exhaustive list of aggregations which your system may support; your use cases may vary, and you can expand on what we’ve provided here at your leisure. We just ask that you let us know of specific use cases, so we can evaluate them for inclusion here.
If you’re looking for inspiration, the ElasticSearch Aggregations documentation can provide some.
Examples
I’ve included some examples below to help you understand how to structure your requests and what to expect in the response.
Items by price
POST /v1/resources/aggregate HTTP/1.1
{
"aggregations": {
"price_range": {
"type": "range",
"field": "price",
"ranges": {
"0-50": { "from": 0, "to": 50 },
"50-100": { "from": 50, "to": 100 },
"100+": { "from": 100 }
}
}
}
}
HTTP/1.1 200 OK
ETag: "dd2796ae-1a46-4be5-b446-7f8c7a0e8342"
{
"aggregations": {
"price_range": {
"0-50": {
"count": 100
},
"50-100": {
"count": 80
},
"100+": {
"count": 60
}
}
}
}
Autocompleting Tag Names
POST /v1/resources/aggregate HTTP/1.1
{
"aggregations": {
"tags": {
"type": "terms",
"field": "tags.name",
"count": 5,
"sort": [
{
"on": "tags.name",
"order": "ASC"
}
]
}
}
}
HTTP/1.1 200 OK
ETag: "dd2796ae-1a46-4be5-b446-7f8c7a0e8342"
{
"aggregations": {
"tags": {
"Anchovy": {
"count": 3
},
"Branzini": {
"count": 80
},
"Cod": {
"count": 60
}
}
}
}
Total Sales by Month
POST /v1/resources/aggregate HTTP/1.1
{
"aggregations": {
"monthly_sales": {
"type": "range",
"field": "closed_date",
"ranges": {
"2019-01": { "from": "2019-01-01", "to": "2019-02-01" },
"2019-02": { "from": "2019-02-01", "to": "2019-03-01" }
},
"aggregations": {
"sales": {
"type": "sum",
"field": "price"
}
}
}
}
}
HTTP/1.1 200 OK
ETag: "dd2796ae-1a46-4be5-b446-7f8c7a0e8342"
{
"aggregations": {
"monthly_sales": {
"count": 180,
"closed_date": {
"2019-01": {
"count": 100,
"sales": 10000
},
"2019-02": {
"count": 80,
"sales": 8000
}
}
}
}
}
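The nested case above, a sum inside each range bucket, can be sketched in memory like this. It is only an illustration with a hypothetical function name. It assumes dates are plain ISO-8601 strings in one format (so string comparison orders them correctly), that from is inclusive and to is exclusive, and that the top-level count covers the documents falling into at least one bucket.

```python
from typing import Any, Dict, Iterable, List


def sales_by_range(
    documents: Iterable[Dict[str, Any]],
    ranges: Dict[str, Dict[str, str]],
) -> Dict[str, Any]:
    """Bucket documents by closed_date, then sum price within each bucket."""
    docs: List[Dict[str, Any]] = list(documents)
    buckets: Dict[str, Dict[str, Any]] = {}
    for name, bounds in ranges.items():
        in_range = [d for d in docs if bounds["from"] <= d["closed_date"] < bounds["to"]]
        # The nested `sum` aggregation runs only over this bucket's documents.
        buckets[name] = {"count": len(in_range), "sales": sum(d["price"] for d in in_range)}
    matched = sum(
        1 for d in docs
        if any(b["from"] <= d["closed_date"] < b["to"] for b in ranges.values())
    )
    return {"count": matched, "closed_date": buckets}
```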