knowledge/technology/tools/JSON Schema.md
2024-03-08 22:22:49 +01:00

---
obj: concept
website: https://json-schema.org
---
# JSON Schema
JSON Schema is a schema specification for [JSON](../files/JSON.md). A schema describes the expected structure of a [JSON](../files/JSON.md) document, and a validator checks documents against it. Some common schemas can be found [here](https://www.schemastore.org/json/).
## `type` Keyword
The `type` keyword restricts a value to a specific type:
```json
{ "type": "string" }
```
Every type has specific keywords that can be used:
### `string`
#### Length
The length of a string can be constrained:
```json
{
"type": "string",
"minLength": 2,
"maxLength": 3
}
```
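As a rough illustration of what the schema above checks, the sketch below counts length the way JSON Schema does, in characters (Unicode code points) rather than bytes. The helper name `length_ok` and its parameters are hypothetical and not part of any library.

```python
def length_ok(value: str, min_length: int = 0, max_length=None) -> bool:
    """Hypothetical helper mirroring `minLength` / `maxLength`."""
    # len() on a Python str counts Unicode code points, not encoded bytes.
    length = len(value)
    if length < min_length:
        return False
    return max_length is None or length <= max_length
```

With `minLength` 2 and `maxLength` 3, `"AB"` and `"ABC"` pass while `"A"` and `"ABCD"` fail.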
#### [Regex](Regex.md)
The `pattern` keyword restricts a string to values that match a particular [regular expression](Regex.md):
```json
{
"type": "string",
"pattern": "^(\\([0-9]{3}\\))?[0-9]{3}-[0-9]{4}$"
}
```
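The behaviour of the pattern above can be approximated with Python's `re` module. This is only a sketch: the helper name `matches_pattern` is hypothetical, and Python's regex dialect is not identical to the ECMA 262 dialect that JSON Schema recommends. Note that `pattern` is not implicitly anchored, which is why the schema spells out `^` and `$` and why the sketch searches rather than requiring a full match.

```python
import re

# The JSON string "^(\\([0-9]{3}\\))?[0-9]{3}-[0-9]{4}$" decodes to this regex.
PHONE_PATTERN = r"^(\([0-9]{3}\))?[0-9]{3}-[0-9]{4}$"

def matches_pattern(value: str, pattern: str = PHONE_PATTERN) -> bool:
    """Hypothetical helper: True if `pattern` matches anywhere in `value`."""
    return re.search(pattern, value) is not None
```

Under these assumptions `"555-1212"` and `"(888)555-1212"` match, while `"(888)555-1212 ext. 532"` and `"(800)FLOWERS"` do not.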
#### Format
The `format` keyword allows for basic semantic identification of certain kinds of string values that are commonly used. For example, because [JSON](../files/JSON.md) doesn't have a "DateTime" type, dates need to be encoded as strings. `format` allows the schema author to indicate that the string value should be interpreted as a date. By default, `format` is just an annotation and does not affect validation.
Optionally, validator implementations can provide a configuration option to enable `format` to function as an assertion rather than just an annotation. That means that validation will fail if, for example, a value with a `date` format isn't in a form that can be parsed as a date. This can allow values to be constrained beyond what the other tools in JSON Schema, including regular expressions, can do.
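To make the difference between annotation and assertion concrete, here is a minimal sketch of what an assertion-mode check for the `date` format might look like. The function name `is_rfc3339_date` is hypothetical, and real validators may be stricter or more lenient. It shows why `format` can go beyond a regular expression: the shape check alone would accept impossible dates such as `2018-02-30`.

```python
import re
from datetime import date

# Shape of an RFC 3339 full-date: four-digit year, two-digit month and day.
DATE_SHAPE = re.compile(r"[0-9]{4}-[0-9]{2}-[0-9]{2}")

def is_rfc3339_date(value: str) -> bool:
    """Hypothetical assertion-mode check for the "date" format."""
    if DATE_SHAPE.fullmatch(value) is None:
        return False
    year, month, day = (int(part) for part in value.split("-"))
    try:
        # Rejects calendar-impossible dates such as month 13 or February 30.
        # Approximation: Python's date type also rejects year 0000.
        date(year, month, day)
    except ValueError:
        return False
    return True
```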
##### Built-in formats
The following is the list of formats specified in the JSON Schema specification.
###### Dates and times
Dates and times are represented in [RFC 3339, section 5.6](https://tools.ietf.org/html/rfc3339#section-5.6). This is a subset of the date format also commonly known as [ISO8601 format](https://www.iso.org/iso-8601-date-and-time-format.html).
- `"date-time"`: Date and time together, for example, `2018-11-13T20:20:39+00:00`.
- `"time"`: Time, for example, `20:20:39+00:00`
- `"date"`: Date, for example, `2018-11-13`.
- `"duration"`: A duration as defined by the [ISO 8601 ABNF for "duration"](https://datatracker.ietf.org/doc/html/rfc3339#appendix-A). For example, `P3D` expresses a duration of 3 days.
###### [Email](../internet/eMail.md) addresses
- `"email"`: Internet email address, see [RFC 5321, section 4.1.2](http://tools.ietf.org/html/rfc5321#section-4.1.2).
- `"idn-email"`: The internationalized form of an Internet email address, see [RFC 6531](https://tools.ietf.org/html/rfc6531).
###### Hostnames
- `"hostname"`: Internet host name, see [RFC 1123, section 2.1](https://datatracker.ietf.org/doc/html/rfc1123#section-2.1).
- `"idn-hostname"`: An internationalized Internet host name, see [RFC5890, section 2.3.2.3](https://tools.ietf.org/html/rfc5890#section-2.3.2.3).
###### IP Addresses
- `"ipv4"`: IPv4 address, according to dotted-quad ABNF syntax as defined in [RFC 2673, section 3.2](http://tools.ietf.org/html/rfc2673#section-3.2).
- `"ipv6"`: IPv6 address, as defined in [RFC 2373, section 2.2](http://tools.ietf.org/html/rfc2373#section-2.2).
###### Resource identifiers
- `"uuid"`: A [Universally Unique Identifier](../linux/UUID.md) as defined by [RFC 4122](https://datatracker.ietf.org/doc/html/rfc4122). Example: `3e4666bf-d5e5-4aa7-b8ce-cefe41c7568a`
- `"uri"`: A universal resource identifier (URI), according to [RFC3986](http://tools.ietf.org/html/rfc3986).
- `"uri-reference"`: A URI Reference (either a URI or a relative-reference), according to [RFC3986, section 4.1](http://tools.ietf.org/html/rfc3986#section-4.1).
- `"iri"`: The internationalized equivalent of a "uri", according to [RFC3987](https://tools.ietf.org/html/rfc3987).
- `"iri-reference"`: The internationalized equivalent of a "uri-reference", according to [RFC3987](https://tools.ietf.org/html/rfc3987).
If the values in the schema have the ability to be relative to a particular source path (such as a link from a webpage), it is generally better practice to use `"uri-reference"` (or `"iri-reference"`) rather than `"uri"` (or `"iri"`). `"uri"` should only be used when the path must be absolute.
###### URI template
- `"uri-template"`: A URI Template (of any level) according to [RFC6570](https://tools.ietf.org/html/rfc6570). If you don't already know what a URI Template is, you probably don't need this value.
###### JSON Pointer
- `"json-pointer"`: A JSON Pointer, according to [RFC6901](https://tools.ietf.org/html/rfc6901). There is more discussion on the use of JSON Pointer within JSON Schema in [Structuring a complex schema](https://json-schema.org/understanding-json-schema/structuring). Note that this should be used only when the entire string contains only JSON Pointer content, e.g. `/foo/bar`. JSON Pointer URI fragments, e.g. `#/foo/bar/` should use `"uri-reference"`.
- `"relative-json-pointer"`: A [relative JSON pointer](https://tools.ietf.org/html/draft-handrews-relative-json-pointer-01).
###### Regular Expressions
- `"regex"`: A [regular expression](Regex.md), which should be valid according to the [ECMA 262](https://www.ecma-international.org/publications-and-standards/standards/ecma-262/) dialect.
### `integer`
The `integer` type is used for integral numbers. JSON does not have distinct types for integers and floating-point values. Therefore, the presence or absence of a decimal point is not enough to distinguish between integers and non-integers. For example, `1` and `1.0` are two ways to represent the same value in JSON. JSON Schema considers that value an integer no matter which representation was used.
To distinguish between the two, use the `integer` type for integral values only and the `number` type for any numeric value, integral or not:
```json
{ "type": "integer"}
{ "type": "number" }
```
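The rule that `1` and `1.0` are both integers can be sketched in Python, assuming the document was parsed with the standard `json` module. The helper names are hypothetical. One Python-specific trap is that `bool` is a subclass of `int`, so booleans have to be excluded explicitly.

```python
import json

def is_json_number(value) -> bool:
    """Hypothetical check for "type": "number"."""
    # bool is a subclass of int in Python, but JSON booleans are not numbers.
    return isinstance(value, (int, float)) and not isinstance(value, bool)

def is_json_integer(value) -> bool:
    """Hypothetical check for "type": "integer"."""
    if not is_json_number(value):
        return False
    # A float with a zero fractional part (1.0) counts as an integer.
    return isinstance(value, int) or value.is_integer()
```

For example, `is_json_integer(json.loads("1.0"))` holds even though Python parses `1.0` as a `float`.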
#### Multiples
Numbers can be restricted to a multiple of a given number, using the `multipleOf` keyword. It may be set to any positive number.
```json
{
"type": "number",
"multipleOf" : 10
}
```
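A naive `value % multiple == 0` check works for the schema above but can fail for fractional multiples because of binary floating-point rounding. The sketch below sidesteps that by going through `decimal.Decimal`. The helper name `is_multiple_of` is hypothetical, and this is one possible approach rather than what any particular validator does.

```python
from decimal import Decimal

def is_multiple_of(value, multiple) -> bool:
    """Hypothetical helper mirroring `multipleOf`."""
    # Converting via str() avoids float artifacts: as floats, 0.3 % 0.1 is not 0.
    remainder = Decimal(str(value)) % Decimal(str(multiple))
    return remainder == 0
```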
#### Range
Ranges of numbers are specified using a combination of the `minimum` and `maximum` keywords (or `exclusiveMinimum` and `exclusiveMaximum` for an exclusive range).
```json
{
"type": "number",
"minimum": 0,
"exclusiveMaximum": 100
}
```
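The difference between inclusive and exclusive bounds can be sketched as follows. The function `in_range` and its parameter names are hypothetical and simply mirror the four keywords.

```python
def in_range(value, minimum=None, maximum=None,
             exclusive_minimum=None, exclusive_maximum=None) -> bool:
    """Hypothetical helper mirroring the four range keywords."""
    if minimum is not None and value < minimum:
        return False
    if maximum is not None and value > maximum:
        return False
    # Exclusive bounds also reject the boundary value itself.
    if exclusive_minimum is not None and value <= exclusive_minimum:
        return False
    if exclusive_maximum is not None and value >= exclusive_maximum:
        return False
    return True
```

With the schema above (`minimum` 0, `exclusiveMaximum` 100), `0` and `99` are accepted while `-1` and `100` are rejected.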
### `object`
Objects are the mapping type in [JSON](../files/JSON.md). They map "keys" to "values". In [JSON](../files/JSON.md), the "keys" must always be strings. Each of these pairs is conventionally referred to as a "property".
```json
{ "type": "object" }
```
#### Properties
The properties (key-value pairs) on an object are defined using the `properties` keyword. The value of `properties` is an object, where each key is the name of a property and each value is a schema used to validate that property. Any property that doesn't match any of the property names in the `properties` keyword is ignored by this keyword.
```json
{
"type": "object",
"properties": {
"number": { "type": "number" },
"street_name": { "type": "string" },
"street_type": { "enum": ["Street", "Avenue", "Boulevard"] }
}
}
```
#### Pattern Properties
Sometimes you want to say that, given a particular kind of property name, the value should match a particular schema. That's where `patternProperties` comes in: it maps regular expressions to schemas. If a property name matches the given [regular expression](Regex.md), the property value must validate against the corresponding schema.
```json
{
"type": "object",
"patternProperties": {
"^S_": { "type": "string" },
"^I_": { "type": "integer" }
}
}
```
#### Additional Properties
The `additionalProperties` keyword controls the handling of extra properties, that is, properties whose names are not listed in the `properties` keyword and do not match any of the regular expressions in the `patternProperties` keyword. By default, any additional properties are allowed.
The value of the `additionalProperties` keyword is a schema that will be used to validate any properties in the instance that are not matched by `properties` or `patternProperties`. Setting the `additionalProperties` schema to `false` means no additional properties will be allowed.
```json
{
"type": "object",
"properties": {
"number": { "type": "number" },
"street_name": { "type": "string" },
"street_type": { "enum": ["Street", "Avenue", "Boulevard"] }
},
"additionalProperties": false
}
```
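Which properties count as "additional" can be sketched like this. The function name is hypothetical, and Python's `re` module stands in for the ECMA 262 dialect, so edge cases may differ. A name is additional only if it is neither declared in `properties` nor matched by a `patternProperties` regular expression.

```python
import re

def additional_property_names(instance: dict, schema: dict) -> list:
    """Hypothetical helper: names that `additionalProperties` would apply to."""
    declared = schema.get("properties", {})
    patterns = schema.get("patternProperties", {})
    return [
        name
        for name in instance
        if name not in declared
        and not any(re.search(pattern, name) for pattern in patterns)
    ]
```

With `"additionalProperties": false`, validation fails as soon as this list is non-empty.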
#### Required Properties
By default, the properties defined by the `properties` keyword are not required. However, one can provide a list of required properties using the `required` keyword.
```json
{
"type": "object",
"properties": {
"name": { "type": "string" },
"email": { "type": "string" },
"address": { "type": "string" },
"telephone": { "type": "string" }
},
"required": ["name", "email"]
}
```
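A small sketch of the `required` check, with a hypothetical helper name. One detail worth noting: a property whose value is `null` is still present, so it satisfies `required` (although it would fail `"type": "string"` in the schema above).

```python
def missing_required(instance: dict, required: list) -> list:
    """Hypothetical helper: required names that are absent from the object."""
    # Presence is about the key, not the value: {"email": None} has "email".
    return [name for name in required if name not in instance]
```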
#### Property names
The names of properties can be validated against a schema, irrespective of their values. This can be useful if you don't want to enforce specific properties, but you want to make sure that the names of those properties follow a specific convention. You might, for example, want to enforce that all names are valid [ASCII](../files/ASCII.md) tokens so they can be used as attributes in a particular programming language.
```json
{
"type": "object",
"propertyNames": {
"pattern": "^[A-Za-z_][A-Za-z0-9_]*$"
}
}
```
#### Size
The number of properties on an object can be restricted using the `minProperties` and `maxProperties` keywords. Each of these must be a non-negative integer.
```json
{
"type": "object",
"minProperties": 2,
"maxProperties": 3
}
```
### `array`
Arrays are used for ordered elements. In [JSON](../files/JSON.md), each element in an array may be of a different type.
```json
{ "type": "array" }
```
#### Items
List validation is useful for arrays of arbitrary length where each item matches the same schema. For this kind of array, set the `items` keyword to a single schema that will be used to validate all of the items in the array.
```json
{
"type": "array",
"items": {
"type": "number"
}
}
```
#### Tuple Validation
Tuple validation is useful when the array is a collection of items where each has a different schema and the ordinal index of each item is meaningful.
```json
{
"type": "array",
"prefixItems": [
{ "type": "number" },
{ "type": "string" },
{ "enum": ["Street", "Avenue", "Boulevard"] },
{ "enum": ["NW", "NE", "SW", "SE"] }
]
}
```
#### Additional Items
The `items` keyword can be used to control whether it's valid to have additional items in a tuple beyond what is defined in `prefixItems`. The value of the `items` keyword is a schema that all additional items must pass in order for the keyword to validate.
```json
{
"type": "array",
"prefixItems": [
{ "type": "number" },
{ "type": "string" },
{ "enum": ["Street", "Avenue", "Boulevard"] },
{ "enum": ["NW", "NE", "SW", "SE"] }
],
"items": false
}
```
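The effect of `"items": false` on a tuple can be sketched as a simple length check. The helper name is hypothetical, and the sketch deliberately ignores the per-position schemas in `prefixItems`. Arrays shorter than the tuple remain valid; only items beyond the prefix are rejected.

```python
def fits_closed_tuple(instance: list, prefix_item_count: int) -> bool:
    """Hypothetical helper: with "items": false, nothing may follow the prefix."""
    # Shorter arrays are fine; prefixItems does not require every position.
    return len(instance) <= prefix_item_count
```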
#### Unevaluated Items
The `unevaluatedItems` keyword is useful mainly when you want to allow or disallow extra items in an array.
`unevaluatedItems` applies to any values not evaluated by an `items`, `prefixItems`, or `contains` keyword. Just as `unevaluatedProperties` affects only **properties** in an object, `unevaluatedItems` affects only **items** in an array.
Watch out! The word "unevaluated" _does not mean_ "not evaluated by `items`, `prefixItems`, or `contains`." "Unevaluated" means "not successfully evaluated", or "does not evaluate to true".
Like with `items`, if you set `unevaluatedItems` to `false`, you can disallow extra items in the array.
```json
{
"prefixItems": [
{ "type": "string" }, { "type": "number" }
],
"unevaluatedItems": false
}
```
#### Contains
While the `items` schema must be valid for every item in the array, the `contains` schema only needs to validate against one or more items in the array.
```json
{
"type": "array",
"contains": {
"type": "number"
}
}
```
#### minContains / maxContains
`minContains` and `maxContains` can be used with `contains` to further specify how many items in the array must match the `contains` schema. The value of each keyword must be a non-negative integer, including zero.
```json
{
"type": "array",
"contains": {
"type": "number"
},
"minContains": 2,
"maxContains": 3
}
```
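The counting behaviour can be sketched as below. The function `contains_ok` and the predicate `is_number` are hypothetical stand-ins for a real validator applying the `contains` subschema. When `minContains` is omitted it defaults to 1, which is why plain `contains` needs at least one match.

```python
def is_number(value) -> bool:
    """Hypothetical stand-in for the subschema {"type": "number"}."""
    return isinstance(value, (int, float)) and not isinstance(value, bool)

def contains_ok(instance: list, matches, min_contains: int = 1,
                max_contains=None) -> bool:
    """Hypothetical helper mirroring contains / minContains / maxContains."""
    count = sum(1 for item in instance if matches(item))
    if count < min_contains:
        return False
    return max_contains is None or count <= max_contains
```

With `minContains` 2 and `maxContains` 3, `["apple", 1]` fails (one match), `["apple", 1, 2, 3]` passes, and `[1, 2, 3, 4]` fails (four matches).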
#### Length
The length of the array can be specified using the `minItems` and `maxItems` keywords. The value of each keyword must be a non-negative integer. These keywords work whether doing list validation or tuple validation.
```json
{
"type": "array",
"minItems": 2,
"maxItems": 3
}
```
#### Uniqueness
A schema can ensure that each of the items in an array is unique. Simply set the `uniqueItems` keyword to `true`.
```json
{
"type": "array",
"uniqueItems": true
}
```
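Uniqueness is judged by JSON equality, which does not quite match Python's `==`. The sketch below is one hedged way to model it, with hypothetical helper names: `1` and `1.0` are the same number, but `true` and `1` are different values even though Python treats `True == 1` as true.

```python
def json_equal(a, b) -> bool:
    """Hypothetical helper: equality as JSON sees it, not as Python does."""
    # Booleans only ever equal booleans (Python would say True == 1).
    if isinstance(a, bool) or isinstance(b, bool):
        return isinstance(a, bool) and isinstance(b, bool) and a == b
    if isinstance(a, list) and isinstance(b, list):
        return len(a) == len(b) and all(json_equal(x, y) for x, y in zip(a, b))
    if isinstance(a, dict) and isinstance(b, dict):
        return a.keys() == b.keys() and all(json_equal(a[k], b[k]) for k in a)
    if isinstance(a, (list, dict)) or isinstance(b, (list, dict)):
        return False
    # Numbers compare by value (1 == 1.0); strings and None compare normally.
    return a == b

def items_are_unique(items: list) -> bool:
    """Hypothetical helper mirroring "uniqueItems": true (quadratic, but simple)."""
    return not any(
        json_equal(items[i], items[j])
        for i in range(len(items))
        for j in range(i + 1, len(items))
    )
```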
### `boolean`
The boolean type matches only two special values: `true` and `false`. Note that values that _evaluate_ to `true` or `false`, such as 1 and 0, are not accepted by the schema.
```json
{ "type": "boolean" }
```
### `null`
When a schema specifies a type of `null`, it has only one acceptable value: `null`.
```json
{ "type": "null" }
```
## Annotations
JSON Schema includes a few keywords that aren't strictly used for validation but are used to describe parts of a schema. None of these "annotation" keywords are required, but they are encouraged for good practice, and can make your schema "self-documenting".
The `title` and `description` keywords must be strings. A "title" will preferably be short, whereas a "description" will provide a more lengthy explanation about the purpose of the data described by the schema.
The `default` keyword specifies a default value. This value is not used to fill in missing values during the validation process. Non-validation tools such as documentation generators or form generators may use this value to give hints to users about how to use a value. However, `default` is typically used to express that if a value is missing, then the value is semantically the same as if the value was present with the default value. The value of `default` should validate against the schema in which it resides, but that isn't required.
The `examples` keyword is a place to provide an array of examples that validate against the schema. This isn't used for validation, but may help with explaining the effect and purpose of the schema to a reader. Each entry should validate against the schema in which it resides, but that isn't strictly required. There is no need to duplicate the `default` value in the `examples` array, since `default` will be treated as another example.
The boolean keywords `readOnly` and `writeOnly` are typically used in an API context. `readOnly` indicates that a value should not be modified. It could be used to indicate that a `PUT` request that changes a value would result in a `400 Bad Request` response. `writeOnly` indicates that a value may be set, but will remain hidden. It could be used to indicate that you can set a value with a `PUT` request, but it would not be included when retrieving that record with a `GET` request.
The `deprecated` keyword is a boolean that indicates that the instance value the keyword applies to should not be used and may be removed in the future.
```json
{
"title": "Match anything",
"description": "This is a schema that matches anything.",
"default": "Default value",
"examples": [
"Anything",
4035
],
"deprecated": true,
"readOnly": true,
"writeOnly": false
}
```
## Enumerated values
The `enum` keyword is used to restrict a value to a fixed set of values. It must be an array with at least one element, where each element is unique.
```json
{
"enum": ["red", "amber", "green"]
}
```
## Constant Values
The `const` keyword is used to restrict a value to a single value.
```json
{
"properties": {
"country": {
"const": "United States of America"
}
}
}
```
## Media: string-encoding non-[JSON](../files/JSON.md) Data
JSON Schema has a set of keywords to describe and optionally validate non-JSON data stored inside [JSON](../files/JSON.md) strings. Since it would be difficult to write validators for many media types, JSON Schema validators are not required to validate the contents of [JSON](../files/JSON.md) strings based on these keywords. However, these keywords are still useful for an application that consumes validated [JSON](../files/JSON.md).
### contentMediaType
The `contentMediaType` keyword specifies the [MIME](../files/MIME.md) type of the contents of a string, as described in [RFC 2046](https://tools.ietf.org/html/rfc2046). There is a list of [MIME types officially registered by the IANA](http://www.iana.org/assignments/media-types/media-types.xhtml), but the set of types supported will be application and operating system dependent. Mozilla Developer Network also maintains a [shorter list of MIME types that are important for the web](https://developer.mozilla.org/en-US/docs/Web/HTTP/Basics_of_HTTP/MIME_types/Complete_list_of_MIME_types).
### contentEncoding
The `contentEncoding` keyword specifies the encoding used to store the contents, as specified in [RFC 2045, section 6.1](https://tools.ietf.org/html/rfc2045) and [RFC 4648](https://datatracker.ietf.org/doc/html/rfc4648).
The acceptable values are `7bit`, `8bit`, `binary`, `quoted-printable`, `base16`, `base32`, and `base64`. If not specified, the encoding is the same as the containing [JSON](../files/JSON.md) document.
Without getting into the low-level details of each of these encodings, there are really only two options useful for modern usage:
- If the content is encoded in the same encoding as the enclosing [JSON](../files/JSON.md) document (which for practical purposes, is almost always [UTF-8](../files/Unicode.md)), leave `contentEncoding` unspecified, and include the content in a string as-is. This includes text-based content types, such as `text/html` or `application/xml`.
- If the content is binary data, set `contentEncoding` to `base64` and encode the contents using [Base64](https://tools.ietf.org/html/rfc4648). This would include many image types, such as `image/png` or audio types, such as `audio/mpeg`.
```json
{
"type": "string",
"contentMediaType": "text/html"
}
```
```json
{
"type": "string",
"contentEncoding": "base64",
"contentMediaType": "image/png"
}
```
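Because validators are not required to check these keywords, the consuming application usually decodes the string itself. A minimal sketch of that step for `"contentEncoding": "base64"`, using Python's standard `base64` module (the function name `decode_base64_content` is hypothetical):

```python
import base64
import binascii

def decode_base64_content(value: str):
    """Hypothetical helper: decoded bytes, or None if `value` is not valid Base64."""
    try:
        # validate=True rejects characters outside the Base64 alphabet
        # instead of silently discarding them.
        return base64.b64decode(value, validate=True)
    except (binascii.Error, ValueError):
        return None
```

The decoded bytes could then be handed to whatever parser matches the declared `contentMediaType`.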
## Conditional Schemas
### dependentRequired
The `dependentRequired` keyword conditionally requires that certain properties must be present if a given property is present in an object. For example, suppose we have a schema representing a customer. If you have their credit card number, you also want to ensure you have a billing address. If you don't have their credit card number, a billing address would not be required. We represent this dependency of one property on another using the `dependentRequired` keyword. The value of the `dependentRequired` keyword is an object. Each entry in the object maps from the name of a property, _p_, to an array of strings listing properties that are required if _p_ is present.
```json
{
"type": "object",
"properties": {
"name": { "type": "string" },
"credit_card": { "type": "number" },
"billing_address": { "type": "string" }
},
"required": ["name"],
"dependentRequired": {
"credit_card": ["billing_address"]
}
}
```
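The one-directional nature of `dependentRequired` can be sketched as follows, using a hypothetical helper. The dependency only triggers when the key property is present, so a billing address without a credit card is still valid.

```python
def missing_dependencies(instance: dict, dependent_required: dict) -> list:
    """Hypothetical helper: properties required by a present property but absent."""
    missing = []
    for trigger, needed in dependent_required.items():
        # The rule applies only when the triggering property is present.
        if trigger in instance:
            missing.extend(name for name in needed if name not in instance)
    return missing
```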
### dependentSchemas
The `dependentSchemas` keyword conditionally applies a subschema when a given property is present. This schema is applied in the same way [allOf](https://json-schema.org/understanding-json-schema/reference/combining#allof) applies schemas. Nothing is merged or extended. Both schemas apply independently.
```json
{
"type": "object",
"properties": {
"name": { "type": "string" },
"credit_card": { "type": "number" }
},
"required": ["name"],
"dependentSchemas": {
"credit_card": {
"properties": {
"billing_address": { "type": "string" }
},
"required": ["billing_address"]
}
}
}
```
### If-Then-Else
The `if`, `then` and `else` keywords allow the application of a subschema based on the outcome of another schema, much like the if/then/else constructs you've probably seen in traditional programming languages.
```json
{
"type": "object",
"properties": {
"street_address": {
"type": "string"
},
"country": {
"default": "United States of America",
"enum": ["United States of America", "Canada"]
}
},
"if": {
"properties": { "country": { "const": "United States of America" } }
},
"then": {
"properties": { "postal_code": { "pattern": "[0-9]{5}(-[0-9]{4})?" } }
},
"else": {
"properties": { "postal_code": { "pattern": "[A-Z][0-9][A-Z] [0-9][A-Z][0-9]" } }
}
}
```
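The branching in the schema above can be traced with a small hand-written sketch. The function name is hypothetical, only the `postal_code` logic is modelled, and Python's `re` stands in for ECMA 262. Two details are easy to miss: an object without a `country` property still satisfies the `if` subschema (because `properties` ignores absent properties, not because of `default`), and the patterns are unanchored, so they match anywhere in the string.

```python
import re

US = "United States of America"
US_ZIP = r"[0-9]{5}(-[0-9]{4})?"
CA_POSTAL = r"[A-Z][0-9][A-Z] [0-9][A-Z][0-9]"

def postal_code_ok(address: dict) -> bool:
    """Hypothetical trace of the if/then/else part of the schema above."""
    # "if": passes when country is absent or equals the US constant.
    if_passes = "country" not in address or address["country"] == US
    pattern = US_ZIP if if_passes else CA_POSTAL
    postal_code = address.get("postal_code")
    # `pattern` ignores missing properties and non-string values.
    if not isinstance(postal_code, str):
        return True
    return re.search(pattern, postal_code) is not None
```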