(lispkit json schema)

Library (lispkit json schema) implements JSON Schema as defined by the 2020-12 Internet Draft specification for validating JSON data.

Overview

A JSON schema is represented by the json-schema type. Schema objects can be loaded from a file or decoded from a string or a bytevector. Schema objects have an identity and are pre-processed and pre-validated. The identity of a JSON schema is defined by a URI in string form. Such a URI is either absolute or relative, and it is either a base URI, i.e. it refers to a top-level schema, or a non-base URI, which refers to a schema nested within another schema via a URI fragment.

The semantics of a schema are defined by its dialect. A schema dialect is again identified by a URI in string form. If a schema does not define a dialect identifier, a default dialect is assumed (currently json-draft2020-default for top-level schema and the dialect of the enclosing schema for nested schema). Schema dialects are represented by a pair consisting of a meta schema URI and a list of enabled vocabularies. Currently, only the meta schema json-draft2020 is supported, with the dialects json-draft2020-default (enabling the default set of vocabularies) and json-draft2020-all (enabling all standard vocabularies).

Schema validation is the process of determining whether a given JSON value matches a given schema. The whole schema validation process is controlled by a json-schema-registry object. Schema registries define:

  • A set of supported dialects with their corresponding URI identities,

  • A default dialect (for schema resources that do not define a dialect themselves),

  • A set of known/loaded schema with their corresponding identities, and

  • Schema resources, providing means for discovering and loading schema which are not loaded yet.

Most of the schema registry functionality is about configuring registry objects by registering supported dialects, inserting available schema and setting up schema resources for automatically discovering new schema. To simplify the API, library (lispkit json schema) provides a parameter object current-schema-registry referring to the current default schema registry. For most use cases, one can simply work with this pre-initialized default by registering schema and schema resources and then using it implicitly in validation calls.

Schema validation is performed via two procedures: json-valid? and json-validate. Both take up to four arguments: the JSON value to validate, the schema against which validation takes place, a default dialect, and a schema registry coordinating the validation process. The dialect and the registry are optional. While json-valid? simply returns a boolean result, json-validate returns a json-validation-result object which encapsulates the output of the validation process.
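The difference between the two procedures can be sketched as follows. This is a minimal example, assuming the pre-initialized default registry is used implicitly and that a schema can be built inline from a JSON object via json-schema and json. The schema age-schema and the tested values are made up for illustration.

```scheme
(import (lispkit base) (lispkit json) (lispkit json schema))

; Illustrative schema (not part of the library): a non-negative integer
(define age-schema
  (json-schema (json '((type . "integer") (minimum . 0)))))

; json-valid? yields only a boolean
(display* "42 valid: " (json-valid? (json 42) age-schema) "\n")
(display* "-1 valid: " (json-valid? (json -1) age-schema) "\n")

; json-validate yields a result object that can be inspected further
(let ((res (json-validate (json -1) age-schema)))
  (display* "valid: " (validation-result-valid? res) "\n")
  (for-each
    (lambda (err) (display* "  - " (caddr err) " at " (cdar err) "\n"))
    (validation-result-errors res)))
```

json-valid? is the natural choice for a plain yes/no check, while json-validate is the one to use when the reasons for a failure matter.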

json-validation-result objects include information about:

  • Whether the validation process succeeded

  • Errors that were encountered during validation (if there are errors, then the validation process failed)

  • Tag annotations denoting what values were deprecated, read-only, or write-only.

  • Format annotations denoting violations of format constraints by string attributes (when the format-annotation vocabulary is enabled, then these violations are automatically turned into errors).

  • Default member values for missing object members.

Workflow

The following code snippet showcases the typical workflow for performing schema validation with library (lispkit json schema). First, a new schema registry is created. Then, a new schema object is loaded from a file and registered with the registry. Next, a JSON value that does not conform to the schema is defined and validated. The validation errors are printed. Finally, a JSON value which conforms to the schema is defined and defaults are printed out.

(import (lispkit base) (lispkit json) (lispkit json schema))
; Make a new JSON schema registry with default dialect `json-draft2020-default`
(define registry (make-schema-registry json-draft2020-default))
; Load a JSON schema from a file
(define person-schema
  (load-json-schema (asset-file-path "person" "" "JSON/Schema/custom")))
; Register the schema with the registry
(schema-registry-register! person-schema registry)

; Define an invalid person
(define person1 (json '(
  (name . "John Doe")
  (email . #("john@doe.com" "john.doe@gmail.com")))))
; Validate `person1` and show the validation errors
(let ((res (json-validate person1 person-schema #t registry)))
  (display* "person1 valid: " (validation-result-valid? res) "\n")
  (for-each
    (lambda (err) (display* "  - " (caddr err) " at " (cdar err) "\n"))
    (validation-result-errors res)))

; Define a valid person
(define person2 (json '(
  (name . "John Doe")
  (birthday . "1983-03-19")
  (address . "12 Main Street, 17445 Noname"))))
(let ((res (json-validate person2 person-schema #t registry)))
  (display* "person2 valid: " (validation-result-valid? res) "\n")
  (for-each
    (lambda (x)
      (if (cadr x)
          (display* "  - " (car x) " exists; default: "
                    (json->string (car (caddr x))) "\n")
          (display* "  - " (car x) " does not exist; default: "
                    (json->string (car (caddr x))) "\n")))
    (validation-result-defaults res)))

Using the default registry

Library (lispkit json schema) defines a default schema registry and initializes parameter object current-schema-registry with it. The default registry is configured such that

  • The schema definitions in the asset directory JSON/Schema/2020-12 are available via the base schema identifier https://json-schema.org/draft/2020-12.

  • The schema definitions in the asset directory JSON/Schema/custom are available via the base schema identifier https://lisppad.app/schema.

With such a setup, it is easy to make new schema available by dropping their definition files into the asset directory JSON/Schema/custom.

If a different object with the same setup is needed, an equivalent schema registry can be created via (make-schema-registry json-draft2020-default #t #t).

JSON schema dialects

Meta schema and vocabularies are specified via URIs. URIs are represented as strings. A schema dialect specifier is a pair whose head refers to the URI of the meta schema and tail refers to a list of URIs for enabled vocabularies. Two dialect specifiers are predefined via the constants json-draft2020-default and json-draft2020-all.
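Since a dialect specifier is described as a pair, it can be taken apart with car and cdr. The following is a small sketch under that assumption. The name my-dialect is hypothetical and only recombines parts of the two predefined constants.

```scheme
(import (lispkit base) (lispkit json schema))

; Head of the pair: URI of the meta schema
(display* "meta schema: " (car json-draft2020-default) "\n")

; Tail of the pair: list of URIs of the enabled vocabularies
(for-each
  (lambda (vocab) (display* "  vocabulary: " vocab "\n"))
  (cdr json-draft2020-default))

; A specifier can be assembled the same way (hypothetical example):
; the meta schema of one constant with the vocabularies of the other
(define my-dialect
  (cons (car json-draft2020-all) (cdr json-draft2020-default)))
```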

URI identifying the meta schema for the 2020-12 Internet Draft specification of JSON schema.

URI identifying the core vocabulary of the 2020-12 Internet Draft specification of JSON schema.

URI identifying the applicator vocabulary of the 2020-12 Internet Draft specification of JSON schema.

URI identifying the unevaluated vocabulary of the 2020-12 Internet Draft specification of JSON schema.

URI identifying the validation vocabulary of the 2020-12 Internet Draft specification of JSON schema.

URI identifying the meta vocabulary of the 2020-12 Internet Draft specification of JSON schema.

URI identifying the format vocabulary of the 2020-12 Internet Draft specification of JSON schema.

URI identifying the content vocabulary of the 2020-12 Internet Draft specification of JSON schema.

URI identifying the format assertion vocabulary of the 2020-12 Internet Draft specification of JSON schema.

URI identifying the deprecated vocabulary of the 2020-12 Internet Draft specification of JSON schema.

Specifier for the schema dialect as defined by the 2020-12 Internet Draft specification of JSON schema, with vocabularies that are enabled by default.

Specifier for the schema dialect as defined by the 2020-12 Internet Draft specification of JSON schema, with all standard vocabularies enabled.

Returns #t if obj is a JSON schema dialect specifier; #f otherwise.

JSON schema registries

Symbol representing the json-schema-registry type. The type-for procedure of library (lispkit type) returns this symbol for all schema registry objects.

This parameter object represents the current default schema registry. Procedures requiring a registry typically make the registry argument optional. If it is not provided, the result of (current-schema-registry) is used instead. The value of current-schema-registry can be overridden with parameterize.
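Overriding the parameter with parameterize might look like the sketch below. The names my-registry and string-schema are illustrative. The example assumes that json-valid? picks up the overridden registry when no registry argument is passed.

```scheme
(import (lispkit base) (lispkit json) (lispkit json schema))

; A separate registry, independent of the pre-initialized default
(define my-registry (make-schema-registry json-draft2020-default))

; Illustrative schema accepting any string
(define string-schema (json-schema (json '((type . "string")))))

; Within the body of parameterize, my-registry acts as the current registry
(parameterize ((current-schema-registry my-registry))
  (display* "valid: " (json-valid? (json "hello") string-schema) "\n"))
```

Outside of the parameterize form, (current-schema-registry) returns the previous registry again.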

Returns #t if obj is a schema registry; #f otherwise.

Returns a new schema registry with dialect being the default schema dialect. If meta? is provided and set to true, a data source will be configured such that the meta schema and vocabularies for the Draft 2020 standard can be referenced. If argument custom is provided, it defines a base URI for schema definitions placed into the asset directory JSON/Schema/custom. If custom is set to #t, the base URI https://lisppad.app/schema/ will be used as a default.

Returns a copy of registry. If argument registry is not provided, a copy of the current value of parameter object current-schema-registry is returned.

Returns a list of URIs identifying the dialects registered for registry. If argument registry is not provided, the current value of parameter object current-schema-registry is used as a default.

Returns a list of URIs identifying the schema registered with registry. If argument registry is not provided, the current value of parameter object current-schema-registry is used as a default.

Adds a new data source to registry. A data source is defined by a directory at path and a base URI for schema available in the directory. If argument registry is not provided, the current value of parameter object current-schema-registry is used as a default.

Registers either a dialect or a schema with registry. If expr is a schema object, the schema is registered under its identifier. If the schema does not define an identifier, an error is signaled. Alternatively, it is possible to specify a pair consisting of a URI as schema identifier and a schema object. If expr specifies a dialect, then this dialect is registered with registry. If argument registry is not provided, the current value of parameter object current-schema-registry is used as a default.

Returns a schema object from registry for the given URI specified via argument ident. First, schema-registry-ref will check if a schema object is available already for ident in registry. If it is, the object gets returned. If it is not, then schema-registry-ref attempts to load the schema from the data sources of registry. If no matching schema is found, #f is returned. If argument registry is not provided, the current value of parameter object current-schema-registry is used as a default.
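Registration and lookup can be combined as in the following sketch. The URI https://example.org/schema/point and the schema itself are hypothetical and chosen for this example only. The pair form of schema-registry-register! is used because the inline schema does not define its own identifier.

```scheme
(import (lispkit base) (lispkit json) (lispkit json schema))

(define registry (make-schema-registry json-draft2020-default))

; Hypothetical identifier and schema, for illustration only
(define point-uri "https://example.org/schema/point")
(define point-schema
  (json-schema (json '((type . "object") (required . #("x" "y"))))))

; Register the schema under an explicit identifier (pair form)
(schema-registry-register! (cons point-uri point-schema) registry)

; Look the schema up again; #f signals that nothing was found
(let ((found (schema-registry-ref point-uri registry)))
  (if found
      (display* "found schema for " point-uri "\n")
      (display* "no schema for " point-uri "\n")))
```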

JSON schema

Symbol representing the json-schema type. The type-for procedure of library (lispkit type) returns this symbol for all JSON schema values.

Returns #t if obj is a JSON schema; #f otherwise.

Returns #t if obj is a boolean JSON schema, i.e. a schema that is either #t or #f; #f otherwise.

Returns a new JSON schema object dependent on argument expr:

  • If expr is #t or #f, a new schema object for a boolean schema is returned.

  • If expr is a JSON object, then it is interpreted as a representation of a schema, and if the coercion to a schema succeeds, a corresponding JSON schema object is returned. Otherwise, an error is signaled.

  • If expr is a JSON schema object, a new copy of this object is returned.

  • In all other cases, (json-schema expr) is implemented via (json-schema (json expr)), i.e. a new schema object is based on a coercion of expr into JSON.
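The cases above can be illustrated with a short sketch. All schema definitions here are made up for the example. The validation calls rely on the pre-initialized default registry.

```scheme
(import (lispkit base) (lispkit json) (lispkit json schema))

; Boolean schema: #t accepts every value, #f rejects every value
(define accept-all (json-schema #t))
(define reject-all (json-schema #f))

; Schema built from a JSON object
(define name-schema
  (json-schema (json '((type . "string") (minLength . 1)))))

; Passing a schema object yields a copy of it
(define name-copy (json-schema name-schema))

; Any other expression is first coerced into JSON
(define count-schema (json-schema '((type . "integer"))))

(display* "accept-all: " (json-valid? (json 1) accept-all) "\n")
(display* "reject-all: " (json-valid? (json 1) reject-all) "\n")
(display* "name-copy:  " (json-valid? (json "Jane") name-copy) "\n")
```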

Loads a JSON schema from a file at path and returns a new JSON schema object. uri is an optional identifier for the loaded schema. If it is provided, it is used as a default for the loaded schema in case it does not define its own identifier.

Returns the identifier of schema as a URI. If schema does not define its own identifier, then #f is returned.

Returns the identifier of the meta schema of schema as a URI. If schema does not define a meta schema, then #f is returned.

Returns the title of schema as a string. If schema does not define a title, then #f is returned.

Returns the description of schema as a string. If schema does not define a description, then #f is returned.

Returns a list of local schema nested within schema. Each element of the list is a pair consisting of a JSON location and a JSON schema object. The location refers to the place within schema where the nested schema was found.

Returns the schema nested within schema at the location to which JSON reference ref refers. If that reference does not point at a location with a valid schema, then #f is returned.

Returns a schema nested within schema with the local schema identifier fragment (a string). This procedure can be used to do a nested schema lookup by name.

Returns a JSON object representing schema.

JSON validation

Returns #t if value json conforms to schema. dialect specifies, if provided, a default dialect overriding the default from registry specifically for schema. If argument registry is not provided, the current value of parameter object current-schema-registry is used instead.

Returns an object encapsulating validation results from the process of validating json against schema. dialect specifies, if provided, a default dialect overriding the default from registry specifically for schema. If argument registry is not provided, the current value of parameter object current-schema-registry is used instead.
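A call with all four arguments might look like the sketch below. It assumes that a dialect specifier such as json-draft2020-all is accepted as the dialect argument and that #t selects the default of the registry, mirroring the workflow example. The schema and the candidate value are illustrative, and whether the two results differ depends on the vocabularies enabled by each dialect.

```scheme
(import (lispkit base) (lispkit json) (lispkit json schema))

(define registry (make-schema-registry json-draft2020-default))

; Illustrative schema with a format constraint
(define contact-schema
  (json-schema (json '((type . "string") (format . "email")))))
(define candidate (json "not-an-email"))

; Validate with the default dialect of the registry
; (assumption: #t selects the default, as in the workflow example)
(define res-default
  (json-validate candidate contact-schema #t registry))

; Validate with all standard vocabularies enabled
; (assumption: a dialect specifier is accepted in this position)
(define res-all
  (json-validate candidate contact-schema json-draft2020-all registry))

(display* "default dialect:  " (validation-result-valid? res-default) "\n")
(display* "all vocabularies: " (validation-result-valid? res-all) "\n")
```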

JSON validation results

Symbol representing the json-validation-result type. The type-for procedure of library (lispkit type) returns this symbol for all JSON validation result objects.

Returns #t if obj is a JSON validation result object; #f otherwise.

Returns #t if the validation results object res does not contain any validation errors; i.e. the validation succeeded. Returns #f otherwise.

Returns three results: the number of errors in validation results object res, the number of tags, and the number of format constraints.

Returns a list of errors from the validation results object res. Each error is represented by the following data structure: ((value . value-location) (schema-rule . rule-location) message), where value is the affected JSON value, found within the validated value at the location that reference value-location refers to. schema-rule is the part of the schema causing the error, with reference rule-location referring to it. message is an error message.

Here is some sample code displaying errors:

(for-each
  (lambda (err)
    (let ((val (caar err))
          (val-loc (cdar err))
          (schema (caadr err))
          (schema-loc (cdadr err))
          (message (caddr err)))
      (display* "  - " message " at " val-loc "\n")))
  (validation-result-errors res))

Returns a list of tagged JSON values from the validation results object res. Each tagged value is represented by the following data structure: ((value . value-location) tag-location taglist), where value is the tagged JSON value, found within the validated value at the location that reference value-location refers to. tag-location refers to the tagging rule in the schema. taglist is a list containing at least one of the following symbols: deprecated, read-only, and write-only.
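The tag entries can be processed with ordinary list operations. The sketch below defines a hypothetical helper deprecated-locations and applies it to hand-written sample entries of the documented shape. The sample data is not library output, and locations are written as plain strings here, whereas the library provides JSON references.

```scheme
(import (lispkit base))

; Collects the value locations of all entries tagged as deprecated.
; Each entry has the shape ((value . value-location) tag-location taglist).
(define (deprecated-locations tags)
  (let loop ((rest tags) (acc '()))
    (cond ((null? rest)
            (reverse acc))
          ((memq 'deprecated (caddr (car rest)))
            (loop (cdr rest) (cons (cdar (car rest)) acc)))
          (else
            (loop (cdr rest) acc)))))

; Hand-written sample entries (hypothetical data)
(define sample-tags
  (list (list (cons "x" "$.nick") "$.properties.nick" '(deprecated))
        (list (cons 7 "$.id") "$.properties.id" '(read-only))
        (list (cons "y" "$.old") "$.properties.old" '(deprecated write-only))))

(display* "deprecated: " (deprecated-locations sample-tags) "\n")
```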

Returns a list of format constraints from the validation results object res. Each format constraint is represented by the following data structure: ((value . value-location) format-location (message . valid?)), where value is the constrained JSON value, found within the validated value at the location that reference value-location refers to. format-location refers to the format constraint in the schema. message is a string describing the format constraint. valid? is () if the constraint has not been checked, #t if the constraint has been checked and is valid, and #f if the constraint has been checked and is invalid.
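Because valid? has three possible states, and the empty list counts as true in Scheme, the unchecked case needs to be tested first. The sketch below shows this with a hypothetical helper format-entry-status and hand-written sample entries. The sample data is not library output, and locations are written as plain strings for brevity.

```scheme
(import (lispkit base))

; Classifies one entry of the shape
; ((value . value-location) format-location (message . valid?))
(define (format-entry-status entry)
  (let ((valid? (cdr (caddr entry))))
    (cond ((null? valid?) 'unchecked)  ; () means: not checked
          (valid?         'valid)      ; #t means: checked and valid
          (else           'invalid)))) ; #f means: checked and invalid

; Hand-written sample entries (hypothetical data)
(define sample-formats
  (list (list (cons "john@doe" "$.email") "$.properties.email" (cons "email" #f))
        (list (cons "1983-03-19" "$.birthday") "$.properties.birthday" (cons "date" #t))
        (list (cons "x" "$.nick") "$.properties.nick" (cons "custom" '()))))

(for-each
  (lambda (entry)
    (display* (cdar entry) ": " (car (caddr entry)) " -> "
              (format-entry-status entry) "\n"))
  sample-formats)
```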

Returns a list of computed defaults from the validation results object res. Each default is represented by the following data structure: (value-location exists? defaults), where value-location is a JSON reference referring to a member with a default. exists? is a boolean value: #t if that member already has a value, #f otherwise. defaults is a list of computed default JSON values for this member.
