obj | website |
---|---|
concept | https://json-schema.org |
JSON Schema
JSON Schema is a schema specification for JSON. It can be used to validate that a JSON document conforms to a specific format. Some common schemas can be found here.
type Keyword
The type keyword is used to restrict the type of a property:
{ "type": "string" }
Every type has specific keywords that can be used:
string
Length
The length of a string can be constrained:
{
"type": "string",
"minLength": 2,
"maxLength": 3
}
Regex
The pattern keyword is used to restrict a string to a particular regular expression:
{
"type": "string",
"pattern": "^(\\([0-9]{3}\\))?[0-9]{3}-[0-9]{4}$"
}
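As a rough illustration of how the pattern above behaves, the following Python sketch applies the same expression with the standard re module. It assumes Python's regex syntax agrees with the ECMA 262 dialect for this particular expression, which is not true of every pattern. The helper name matches_pattern is made up for this example. JSON Schema does not anchor patterns implicitly, so the sketch uses re.search and relies on the ^ and $ in the expression itself.

```python
import re

# The JSON string "^(\\([0-9]{3}\\))?[0-9]{3}-[0-9]{4}$" decodes to this regex.
PHONE_PATTERN = r"^(\([0-9]{3}\))?[0-9]{3}-[0-9]{4}$"


def matches_pattern(value, pattern=PHONE_PATTERN):
    """Hypothetical helper: mimic the `pattern` keyword for a single value."""
    if not isinstance(value, str):
        # `pattern` only constrains strings; other types pass this keyword.
        return True
    # Unanchored search, as in JSON Schema; anchors come from the pattern.
    return re.search(pattern, value) is not None


print(matches_pattern("555-1212"))                # True
print(matches_pattern("(888)555-1212"))           # True
print(matches_pattern("(888)555-1212 ext. 532"))  # False
```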
Format
The format keyword allows for basic semantic identification of certain kinds of string values that are commonly used. For example, because JSON doesn't have a "DateTime" type, dates need to be encoded as strings. format allows the schema author to indicate that the string value should be interpreted as a date. By default, format is just an annotation and does not affect validation.
Optionally, validator implementations can provide a configuration option to enable format to function as an assertion rather than just an annotation. That means that validation will fail if, for example, a value with a date format isn't in a form that can be parsed as a date. This can allow values to be constrained beyond what the other tools in JSON Schema, including regular expressions, can do.
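The difference between annotation and assertion can be sketched in Python for the "date" format. This is a minimal illustration, not how any particular validator implements it: the function name check_date_format and the assert_format flag are assumptions made up for this example, and the check leans on Python's datetime.date, which rejects year 0000 even though RFC 3339 permits it.

```python
import re
from datetime import date

# RFC 3339 full-date shape: YYYY-MM-DD with ASCII digits only.
FULL_DATE = re.compile(r"([0-9]{4})-([0-9]{2})-([0-9]{2})")


def check_date_format(value, assert_format=False):
    """Hypothetical helper: treat "format": "date" as annotation or assertion."""
    if not assert_format or not isinstance(value, str):
        # Annotation mode (the default) never fails validation,
        # and format only applies to strings anyway.
        return True
    match = FULL_DATE.fullmatch(value)
    if match is None:
        return False
    try:
        # Reject impossible calendar dates such as month 13.
        date(*map(int, match.groups()))
    except ValueError:
        return False
    return True


print(check_date_format("2018-13-40"))                      # True (annotation only)
print(check_date_format("2018-13-40", assert_format=True))  # False
```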
Built-in formats
The following is the list of formats specified in the JSON Schema specification.
Dates and times
Dates and times are represented in RFC 3339, section 5.6. This is a subset of the date format also commonly known as ISO 8601 format.
- "date-time": Date and time together, for example, 2018-11-13T20:20:39+00:00.
- "time": Time, for example, 20:20:39+00:00.
- "date": Date, for example, 2018-11-13.
- "duration": A duration as defined by the ISO 8601 ABNF for "duration". For example, P3D expresses a duration of 3 days.
Email addresses
- "email": Internet email address, see RFC 5321, section 4.1.2.
- "idn-email": The internationalized form of an Internet email address, see RFC 6531.
Hostnames
- "hostname": Internet host name, see RFC 1123, section 2.1.
- "idn-hostname": An internationalized Internet host name, see RFC5890, section 2.3.2.3.
IP Addresses
- "ipv4": IPv4 address, according to dotted-quad ABNF syntax as defined in RFC 2673, section 3.2.
- "ipv6": IPv6 address, as defined in RFC 2373, section 2.2.
Resource identifiers
- "uuid": A Universally Unique Identifier as defined by RFC 4122. Example: 3e4666bf-d5e5-4aa7-b8ce-cefe41c7568a
- "uri": A universal resource identifier (URI), according to RFC3986.
- "uri-reference": A URI Reference (either a URI or a relative-reference), according to RFC3986, section 4.1.
- "iri": The internationalized equivalent of a "uri", according to RFC3987.
- "iri-reference": The internationalized equivalent of a "uri-reference", according to RFC3987 (new in draft 7).
If the values in the schema have the ability to be relative to a particular source path (such as a link from a webpage), it is generally better practice to use "uri-reference" (or "iri-reference") rather than "uri" (or "iri"). "uri" should only be used when the path must be absolute.
URI template
- "uri-template": A URI Template (of any level) according to RFC6570. If you don't already know what a URI Template is, you probably don't need this value.
JSON Pointer
- "json-pointer": A JSON Pointer, according to RFC6901. There is more discussion on the use of JSON Pointer within JSON Schema in Structuring a complex schema. Note that this should be used only when the entire string contains only JSON Pointer content, e.g. /foo/bar. JSON Pointer URI fragments, e.g. #/foo/bar/, should use "uri-reference".
- "relative-json-pointer": A relative JSON pointer.
Regular Expressions
- "regex": A regular expression, which should be valid according to the ECMA 262 dialect.
integer
The integer type is used for integral numbers. JSON does not have distinct types for integers and floating-point values. Therefore, the presence or absence of a decimal point is not enough to distinguish between integers and non-integers. For example, 1 and 1.0 are two ways to represent the same value in JSON. JSON Schema considers that value an integer no matter which representation was used.
The number type accepts any numeric value, integral or not, while the integer type accepts only integral ones:
{ "type": "integer"}
{ "type": "number" }
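The point about 1 and 1.0 can be checked with a small Python sketch. The helper names is_number and is_integer are made up for this example; the only real API used is json.loads from the standard library, which parses 1 as an int and 1.0 as a float.

```python
import json


def is_number(value):
    """Hypothetical check for {"type": "number"}."""
    # bool is a subclass of int in Python, but JSON booleans are not numbers.
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def is_integer(value):
    """Hypothetical check for {"type": "integer"}: any number with zero fraction."""
    if not is_number(value):
        return False
    return isinstance(value, int) or value.is_integer()


for text in ["1", "1.0", "3.5", "true", '"1"']:
    value = json.loads(text)
    print(text, is_number(value), is_integer(value))
```

Under these assumptions, both 1 and 1.0 count as integers, 3.5 is a number but not an integer, and true and "1" are neither.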
Multiples
Numbers can be restricted to a multiple of a given number, using the multipleOf keyword. It may be set to any positive number.
{
"type": "number",
"multipleOf" : 10
}
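Checking multipleOf with binary floating point can give surprising results for fractional multiples, so the sketch below converts both operands to decimal first. This is one possible approach under that assumption, not what any specific validator does; the name is_multiple_of is hypothetical.

```python
from decimal import Decimal


def is_multiple_of(value, multiple):
    """Hypothetical helper: mimic the `multipleOf` keyword for one value."""
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        # The keyword only applies to numbers.
        return True
    if multiple <= 0:
        raise ValueError("multipleOf must be a positive number")
    # Go through str() so that 0.3 becomes Decimal("0.3"),
    # not the binary approximation of 0.3.
    return Decimal(str(value)) % Decimal(str(multiple)) == 0


print(is_multiple_of(20, 10))    # True
print(is_multiple_of(23, 10))    # False
print(is_multiple_of(0.3, 0.1))  # True
```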
Range
Ranges of numbers are specified using a combination of the minimum and maximum keywords (or exclusiveMinimum and exclusiveMaximum for expressing an exclusive range).
{
"type": "number",
"minimum": 0,
"exclusiveMaximum": 100
}
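The inclusive and exclusive bounds in the schema above can be sketched as plain comparisons. The function and parameter names below are hypothetical and simply mirror the four keywords.

```python
def in_range(value, minimum=None, maximum=None,
             exclusive_minimum=None, exclusive_maximum=None):
    """Hypothetical helper mirroring the four range keywords for one number."""
    if minimum is not None and value < minimum:
        return False
    if exclusive_minimum is not None and value <= exclusive_minimum:
        return False
    if maximum is not None and value > maximum:
        return False
    if exclusive_maximum is not None and value >= exclusive_maximum:
        return False
    return True


def in_example_range(value):
    """Equivalent of {"minimum": 0, "exclusiveMaximum": 100}."""
    return in_range(value, minimum=0, exclusive_maximum=100)


print([n for n in (-1, 0, 10, 99, 100, 101) if in_example_range(n)])  # [0, 10, 99]
```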
object
Objects are the mapping type in JSON. They map "keys" to "values". In JSON, the "keys" must always be strings. Each of these pairs is conventionally referred to as a "property".
{ "type": "object" }
Properties
The properties (key-value pairs) on an object are defined using the properties keyword. The value of properties is an object, where each key is the name of a property and each value is a schema used to validate that property. Any property that doesn't match any of the property names in the properties keyword is ignored by this keyword.
{
"type": "object",
"properties": {
"number": { "type": "number" },
"street_name": { "type": "string" },
"street_type": { "enum": ["Street", "Avenue", "Boulevard"] }
}
}
Pattern Properties
Sometimes you want to say that, given a particular kind of property name, the value should match a particular schema. That's where patternProperties comes in: it maps regular expressions to schemas. If a property name matches the given regular expression, the property value must validate against the corresponding schema.
{
"type": "object",
"patternProperties": {
"^S_": { "type": "string" },
"^I_": { "type": "integer" }
}
}
Additional Properties
The additionalProperties keyword is used to control the handling of extra stuff, that is, properties whose names are neither listed in the properties keyword nor matched by any of the regular expressions in the patternProperties keyword. By default any additional properties are allowed.
The value of the additionalProperties keyword is a schema that will be used to validate any properties in the instance that are not matched by properties or patternProperties. Setting the additionalProperties schema to false means no additional properties will be allowed.
{
"type": "object",
"properties": {
"number": { "type": "number" },
"street_name": { "type": "string" },
"street_type": { "enum": ["Street", "Avenue", "Boulevard"] }
},
"additionalProperties": false
}
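Which properties count as "additional" can be sketched as follows. The helper additional_property_names is hypothetical; it only computes the leftover names and leaves the decision (reject them, or validate them against a schema) to the caller. It assumes the patterns behave the same under Python's re module as under ECMA 262.

```python
import re


def additional_property_names(instance, schema):
    """Return names not covered by `properties` or `patternProperties`."""
    declared = schema.get("properties", {})
    patterns = schema.get("patternProperties", {})
    return [
        name
        for name in instance
        if name not in declared
        and not any(re.search(pattern, name) for pattern in patterns)
    ]


schema = {
    "type": "object",
    "properties": {
        "number": {"type": "number"},
        "street_name": {"type": "string"},
        "street_type": {"enum": ["Street", "Avenue", "Boulevard"]},
    },
    "additionalProperties": False,
}
address = {
    "number": 1600,
    "street_name": "Pennsylvania",
    "street_type": "Avenue",
    "direction": "NW",
}

extra = additional_property_names(address, schema)
print(extra)  # ['direction']
# With "additionalProperties": false, any leftover name makes the instance invalid.
print(schema["additionalProperties"] is not False or not extra)  # False
```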
Required Properties
By default, the properties defined by the properties keyword are not required. However, one can provide a list of required properties using the required keyword.
{
"type": "object",
"properties": {
"name": { "type": "string" },
"email": { "type": "string" },
"address": { "type": "string" },
"telephone": { "type": "string" }
},
"required": ["name", "email"]
}
Property names
The names of properties can be validated against a schema, irrespective of their values. This can be useful if you don't want to enforce specific properties, but you want to make sure that the names of those properties follow a specific convention. You might, for example, want to enforce that all names are valid ASCII tokens so they can be used as attributes in a particular programming language.
{
"type": "object",
"propertyNames": {
"pattern": "^[A-Za-z_][A-Za-z0-9_]*$"
}
}
Size
The number of properties on an object can be restricted using the minProperties and maxProperties keywords. Each of these must be a non-negative integer.
{
"type": "object",
"minProperties": 2,
"maxProperties": 3
}
array
Arrays are used for ordered elements. In JSON, each element in an array may be of a different type.
{ "type": "array" }
Items
List validation is useful for arrays of arbitrary length where each item matches the same schema. For this kind of array, set the items keyword to a single schema that will be used to validate all of the items in the array.
{
"type": "array",
"items": {
"type": "number"
}
}
Tuple Validation
Tuple validation is useful when the array is a collection of items where each has a different schema and the ordinal index of each item is meaningful.
{
"type": "array",
"prefixItems": [
{ "type": "number" },
{ "type": "string" },
{ "enum": ["Street", "Avenue", "Boulevard"] },
{ "enum": ["NW", "NE", "SW", "SE"] }
]
}
Additional Items
The items keyword can be used to control whether it's valid to have additional items in a tuple beyond what is defined in prefixItems. The value of the items keyword is a schema that all additional items must pass in order for the keyword to validate.
{
"type": "array",
"prefixItems": [
{ "type": "number" },
{ "type": "string" },
{ "enum": ["Street", "Avenue", "Boulevard"] },
{ "enum": ["NW", "NE", "SW", "SE"] }
],
"items": false
}
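How prefixItems and items divide the array between them can be sketched in Python. This is a deliberately tiny checker, not a real validator: the helper matches supports only the type (number, string) and enum keywords needed for the example above, and both function names are made up.

```python
def matches(value, schema):
    """Tiny subset checker: boolean schemas, `enum`, and `type` number/string."""
    if schema is True:
        return True
    if schema is False:
        return False
    if "enum" in schema and value not in schema["enum"]:
        return False
    expected = schema.get("type")
    if expected == "number":
        return isinstance(value, (int, float)) and not isinstance(value, bool)
    if expected == "string":
        return isinstance(value, str)
    return True


def validate_tuple(array, prefix_items, items=True):
    """Positions covered by prefixItems use that schema; the rest use `items`."""
    for index, value in enumerate(array):
        subschema = prefix_items[index] if index < len(prefix_items) else items
        if not matches(value, subschema):
            return False
    return True


PREFIX_ITEMS = [
    {"type": "number"},
    {"type": "string"},
    {"enum": ["Street", "Avenue", "Boulevard"]},
    {"enum": ["NW", "NE", "SW", "SE"]},
]

print(validate_tuple([1600, "Pennsylvania", "Avenue", "NW"], PREFIX_ITEMS, items=False))
print(validate_tuple([1600, "Pennsylvania", "Avenue", "NW", "Washington"], PREFIX_ITEMS, items=False))
```

In this sketch the first call succeeds and the second fails, because the fifth element falls through to the false schema in items. An array shorter than prefixItems still passes, since only the positions that are present get checked.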
Unevaluated Items
The unevaluatedItems keyword is useful mainly when you want to add or disallow extra items to an array.
unevaluatedItems applies to any values not evaluated by an items, prefixItems, or contains keyword. Just as unevaluatedProperties affects only properties in an object, unevaluatedItems affects only items in an array.
Watch out! The word "unevaluated" does not mean "not evaluated by items, prefixItems, or contains." "Unevaluated" means "not successfully evaluated", or "does not evaluate to true".
Like with items, if you set unevaluatedItems to false, you can disallow extra items in the array.
{
"prefixItems": [
{ "type": "string" }, { "type": "number" }
],
"unevaluatedItems": false
}
Contains
While the items schema must be valid for every item in the array, the contains schema only needs to validate against one or more items in the array.
{
"type": "array",
"contains": {
"type": "number"
}
}
minContains / maxContains
minContains and maxContains can be used with contains to further specify how many items in the array must match the contains schema. The value of each keyword can be any non-negative integer, including zero.
{
"type": "array",
"contains": {
"type": "number"
},
"minContains": 2,
"maxContains": 3
}
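The counting behind contains, minContains and maxContains can be sketched like this. The names are hypothetical, and the contains subschema is stood in for by a plain Python predicate. The default of 1 for min_contains reflects that contains alone requires at least one match.

```python
def is_number(value):
    """Stand-in for the subschema {"type": "number"}."""
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def contains_count_ok(array, predicate, min_contains=1, max_contains=None):
    """Count the items that match and compare against the bounds."""
    count = sum(1 for item in array if predicate(item))
    if count < min_contains:
        return False
    if max_contains is not None and count > max_contains:
        return False
    return True


# Equivalent of "minContains": 2, "maxContains": 3 from the schema above.
for candidate in (["apple", "orange", 2, 4], ["apple", 1], [1, 2, 3, 4]):
    print(candidate, contains_count_ok(candidate, is_number, 2, 3))
```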
Length
The length of the array can be specified using the minItems and maxItems keywords. The value of each keyword must be a non-negative integer. These keywords work whether doing list validation or tuple validation.
{
"type": "array",
"minItems": 2,
"maxItems": 3
}
Uniqueness
A schema can ensure that each of the items in an array is unique. Simply set the uniqueItems keyword to true.
{
"type": "array",
"uniqueItems": true
}
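Uniqueness is judged by JSON equality, which differs slightly from Python's ==. The sketch below is an illustration under that reading of the specification: numbers compare by mathematical value (so 1 and 1.0 are duplicates), booleans never equal numbers, and objects compare regardless of key order. Both function names are made up for this example.

```python
def json_equal(a, b):
    """Hypothetical JSON-style equality for values produced by json.loads."""
    if isinstance(a, bool) or isinstance(b, bool):
        # In Python True == 1, but in JSON true and 1 are different values.
        return isinstance(a, bool) and isinstance(b, bool) and a == b
    if isinstance(a, (int, float)) and isinstance(b, (int, float)):
        return a == b
    if isinstance(a, list) and isinstance(b, list):
        return len(a) == len(b) and all(json_equal(x, y) for x, y in zip(a, b))
    if isinstance(a, dict) and isinstance(b, dict):
        return a.keys() == b.keys() and all(json_equal(a[k], b[k]) for k in a)
    return type(a) is type(b) and a == b


def items_unique(array):
    """Mimic "uniqueItems": true with a simple pairwise comparison."""
    return not any(
        json_equal(array[i], array[j])
        for i in range(len(array))
        for j in range(i + 1, len(array))
    )


print(items_unique([1, 2, 3, 4, 5]))        # True
print(items_unique([1, 2, 3, 3, 4]))        # False
print(items_unique([1, True]))              # True
```

The pairwise loop is quadratic, which is fine for a sketch; it is used here because JSON values such as objects and arrays are not hashable in Python.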
boolean
The boolean type matches only two special values: true and false. Note that values that evaluate to true or false, such as 1 and 0, are not accepted by the schema.
{ "type": "boolean" }
null
When a schema specifies a type of null, it has only one acceptable value: null.
{ "type": "null" }
Annotations
JSON Schema includes a few keywords that aren't strictly used for validation, but are used to describe parts of a schema. None of these "annotation" keywords are required, but they are encouraged for good practice, and can make your schema "self-documenting".
The title and description keywords must be strings. A "title" will preferably be short, whereas a "description" will provide a more lengthy explanation about the purpose of the data described by the schema.
The default keyword specifies a default value. This value is not used to fill in missing values during the validation process. Non-validation tools such as documentation generators or form generators may use this value to give hints to users about how to use a value. However, default is typically used to express that if a value is missing, then the value is semantically the same as if the value was present with the default value. The value of default should validate against the schema in which it resides, but that isn't required.
The examples keyword is a place to provide an array of examples that validate against the schema. This isn't used for validation, but may help with explaining the effect and purpose of the schema to a reader. Each entry should validate against the schema in which it resides, but that isn't strictly required. There is no need to duplicate the default value in the examples array, since default will be treated as another example.
The boolean keywords readOnly and writeOnly are typically used in an API context. readOnly indicates that a value should not be modified. It could be used to indicate that a PUT request that changes a value would result in a 400 Bad Request response. writeOnly indicates that a value may be set, but will remain hidden. It could be used to indicate you can set a value with a PUT request, but it would not be included when retrieving that record with a GET request.
The deprecated keyword is a boolean that indicates that the instance value the keyword applies to should not be used and may be removed in the future.
{
"title": "Match anything",
"description": "This is a schema that matches anything.",
"default": "Default value",
"examples": [
"Anything",
4035
],
"deprecated": true,
"readOnly": true,
"writeOnly": false
}
Enumerated values
The enum keyword is used to restrict a value to a fixed set of values. It must be an array with at least one element, where each element is unique.
{
"enum": ["red", "amber", "green"]
}
Constant Values
The const keyword is used to restrict a value to a single value.
{
"properties": {
"country": {
"const": "United States of America"
}
}
}
Media: string-encoding non-JSON data
JSON schema has a set of keywords to describe and optionally validate non-JSON data stored inside JSON strings. Since it would be difficult to write validators for many media types, JSON schema validators are not required to validate the contents of JSON strings based on these keywords. However, these keywords are still useful for an application that consumes validated JSON.
contentMediaType
The contentMediaType keyword specifies the MIME type of the contents of a string, as described in RFC 2046. There is a list of MIME types officially registered by the IANA, but the set of types supported will be application and operating system dependent. Mozilla Developer Network also maintains a shorter list of MIME types that are important for the web.
contentEncoding
The contentEncoding keyword specifies the encoding used to store the contents, as specified in RFC 2045, section 6.1 and RFC 4648.
The acceptable values are 7bit, 8bit, binary, quoted-printable, base16, base32, and base64. If not specified, the encoding is the same as the containing JSON document.
Without getting into the low-level details of each of these encodings, there are really only two options useful for modern usage:
- If the content is encoded in the same encoding as the enclosing JSON document (which for practical purposes, is almost always UTF-8), leave contentEncoding unspecified, and include the content in a string as-is. This includes text-based content types, such as text/html or application/xml.
- If the content is binary data, set contentEncoding to base64 and encode the contents using Base64. This would include many image types, such as image/png, or audio types, such as audio/mpeg.
{
"type": "string",
"contentMediaType": "text/html"
}
{
"type": "string",
"contentEncoding": "base64",
"contentMediaType": "image/png"
}
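Since validators are not required to check these keywords, a consuming application typically decodes the string itself. The following Python sketch shows one way to do that for the base64 case using the standard base64 module; the function name decode_base64_content is hypothetical, and the PNG signature check is only an example of what an application might do next.

```python
import base64
import binascii

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def decode_base64_content(value):
    """Return the decoded bytes, or None if the string is not valid Base64."""
    try:
        # validate=True rejects characters outside the Base64 alphabet
        # instead of silently discarding them.
        return base64.b64decode(value, validate=True)
    except (binascii.Error, ValueError):
        return None


# Round trip: an application would normally receive `encoded` inside a JSON string.
encoded = base64.b64encode(PNG_SIGNATURE + b"...").decode("ascii")
decoded = decode_base64_content(encoded)
print(decoded is not None and decoded.startswith(PNG_SIGNATURE))  # True
print(decode_base64_content("not base64!"))                       # None
```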
Conditional Schemas
dependentRequired
The dependentRequired keyword conditionally requires that certain properties must be present if a given property is present in an object. For example, suppose we have a schema representing a customer. If you have their credit card number, you also want to ensure you have a billing address. If you don't have their credit card number, a billing address would not be required. We represent this dependency of one property on another using the dependentRequired keyword. The value of the dependentRequired keyword is an object. Each entry in the object maps from the name of a property, p, to an array of strings listing properties that are required if p is present.
{
"type": "object",
"properties": {
"name": { "type": "string" },
"credit_card": { "type": "number" },
"billing_address": { "type": "string" }
},
"required": ["name"],
"dependentRequired": {
"credit_card": ["billing_address"]
}
}
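The rule "if p is present, these other properties are required" can be sketched directly. The helper name missing_dependents and the sample customer records are made up for this example. Note that the dependency runs one way only: a billing address without a credit card is fine.

```python
def missing_dependents(instance, dependent_required):
    """List properties that are required by a present trigger but absent."""
    missing = []
    for trigger, required in dependent_required.items():
        if trigger in instance:
            missing.extend(name for name in required if name not in instance)
    return missing


DEPENDENT_REQUIRED = {"credit_card": ["billing_address"]}

print(missing_dependents(
    {"name": "John Doe", "credit_card": 5555555555555555},
    DEPENDENT_REQUIRED,
))  # ['billing_address']
print(missing_dependents(
    {"name": "John Doe", "billing_address": "555 Debtor's Lane"},
    DEPENDENT_REQUIRED,
))  # []
```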
dependentSchemas
The dependentSchemas keyword conditionally applies a subschema when a given property is present. This schema is applied in the same way allOf applies schemas. Nothing is merged or extended. Both schemas apply independently.
{
"type": "object",
"properties": {
"name": { "type": "string" },
"credit_card": { "type": "number" }
},
"required": ["name"],
"dependentSchemas": {
"credit_card": {
"properties": {
"billing_address": { "type": "string" }
},
"required": ["billing_address"]
}
}
}
If-Then-Else
The if, then and else keywords allow the application of a subschema based on the outcome of another schema, much like the if/then/else constructs you've probably seen in traditional programming languages.
{
"type": "object",
"properties": {
"street_address": {
"type": "string"
},
"country": {
"default": "United States of America",
"enum": ["United States of America", "Canada"]
}
},
"if": {
"properties": { "country": { "const": "United States of America" } }
},
"then": {
"properties": { "postal_code": { "pattern": "[0-9]{5}(-[0-9]{4})?" } }
},
"else": {
"properties": { "postal_code": { "pattern": "[A-Z][0-9][A-Z] [0-9][A-Z][0-9]" } }
}
}
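The control flow of the schema above can be sketched in Python. This covers only the if/then/else part (not the enum on country), uses made-up names, and assumes the two patterns behave the same under Python's re module. Two details of the schema show up in the sketch: an instance without country satisfies the if subschema, because properties only constrains keys that are present, and the patterns are unanchored, so they match anywhere in the string.

```python
import re

US_ZIP = r"[0-9]{5}(-[0-9]{4})?"
CA_POSTAL = r"[A-Z][0-9][A-Z] [0-9][A-Z][0-9]"


def postal_code_ok(instance):
    """Hypothetical check of the if/then/else portion of the example schema."""
    # `if`: passes when "country" is absent or equals the const value.
    if_passes = (
        "country" not in instance
        or instance["country"] == "United States of America"
    )
    # `then` applies when `if` passes, `else` when it fails.
    pattern = US_ZIP if if_passes else CA_POSTAL
    postal_code = instance.get("postal_code")
    if not isinstance(postal_code, str):
        # Absent or non-string: the `pattern` keyword does not apply.
        return True
    return re.search(pattern, postal_code) is not None


print(postal_code_ok({"country": "United States of America", "postal_code": "20500"}))  # True
print(postal_code_ok({"country": "Canada", "postal_code": "K1M 1M4"}))                  # True
print(postal_code_ok({"country": "Canada", "postal_code": "20500"}))                    # False
print(postal_code_ok({"postal_code": "K1M 1M4"}))                                       # False
```

Because the patterns carry no ^ or $, a value such as x20500x would also pass the United States branch in this sketch; anchoring the patterns would be needed to reject it.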