Shape Expressions Interface

W3C Editor's Draft 01 April 2025

This version: https://shexspec.github.io/spec/API
Latest published version: https://www.w3.org/TR/shex-api/
Latest editor's draft: https://shexspec.github.io/spec/API
Test suite: https://github.com/shexSpec/shexTest
Editors: Gregg Kellogg (Spec-Ops); Eric Prud'hommeaux (W3C/MIT)
Version control: GitHub repository

Abstract

The Shape Expressions (ShEx) language describes RDF nodes and graph structures. A node constraint describes an RDF node (IRI, blank node or literal) and a shape describes the triples involving nodes in an RDF graph. These descriptions identify predicates and their associated cardinalities and datatypes. ShEx shapes can be used to communicate data structures associated with some process or interface, generate or validate data, or drive user interfaces.

This document defines programming and REST interfaces for instantiating and executing validation services. This includes the use of ShapeMaps to bind RDF data to ShEx schemas. See the Shape Expressions Primer for a non-normative description of shape maps.

Status of This Document

This section describes the status of this document at the time of its publication. Other documents may supersede this document. A list of current W3C publications and the latest revision of this technical report can be found in the W3C technical reports index at https://www.w3.org/TR/.

This document will be presented to the Shape Expressions Community Group.

This document was published by the Shape Expressions Community Group as an Editor's Draft.

Comments regarding this document are welcome. Please send them to public-shex@w3.org (archives).

Publication as an Editor's Draft does not imply endorsement by the W3C Membership. This is a draft document and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to cite this document as other than work in progress.

This document was produced by a group operating under the W3C Patent Policy. W3C maintains a public list of any patent disclosures made in connection with the deliverables of the group; that page also includes instructions for disclosing a patent. An individual who has actual knowledge of a patent which the individual believes contains Essential Claim(s) must disclose the information in accordance with section 6 of the W3C Patent Policy.

This document is governed by the 1 March 2019 W3C Process Document.

Introduction

The Shape Expressions (ShEx) language provides a structural schema for RDF data. This can be used to document APIs or datasets, aid in development of API-conformant messages, minimize defensive programming, guide user interfaces, or anything else that involves a machine-readable description of data organization and typing requirements.

A practical use of ShEx is to test nodes in RDF graphs for conformance with shape expressions. This document defines interfaces for doing that.

To understand the programmatic interface described in this specification and how it is intended to operate in a programming environment, it is useful to have working knowledge of WebIDL [WebIDL]. To understand how ShEx relates to RDF, it is helpful to be familiar with the basic RDF concepts [RDF11-CONCEPTS].

Terminology

The ShEx interface is defined using terms from RDF semantics [rdf11-mt]:

Node: one of IRI, blank node, literal
Graph: a set of Triples of (subject, predicate, object)

Issue 1

For the purposes of the ShExProcessor interface, a promise is an object that represents the eventual result of a single asynchronous operation. Promises are defined in [ECMASCRIPT-6.0].

Conformance

As well as sections marked as non-normative, all authoring guidelines, diagrams, examples, and notes in this specification are non-normative. Everything else in this specification is normative.

Conformance criteria are relevant to authors and authoring tool implementers.

ShapeMap structure

The ShEx specification defines two forms of ShapeMaps:

fixed ShapeMap: a ShapeMap in which all IRIs are absolute, and all shape expression identifiers are either a label that appears in the schema or, when relating to schemas which have a start shape, the value start shape.
annotated ShapeMap: a ShapeMap with the node and shape properties of a fixed ShapeMap and unspecified extra properties.

A result ShapeMap has the following properties:

node: an RDF node or a node selector.
shape: a ShEx shape expression.
status: [default="conformant"] "conformant" | "nonconformant".
reason: [optional] a string stating a reason for failure or success.
appInfo: [optional] application-specific JSON-LD structure

If the status property is absent, the status is assumed to be "conformant". The reason and appInfo properties may also be absent but have no default value.
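The property list and its default can be sketched as a TypeScript type. This is only an illustration: the names `ResultShapeMapEntry` and `effectiveStatus`, the string encoding of nodes and shapes, and the second sample entry are assumptions, not part of this specification.

```typescript
// Hypothetical TypeScript view of one result ShapeMap entry.
type Status = "conformant" | "nonconformant";

interface ResultShapeMapEntry {
  node: string;      // an RDF node or node selector, serialized as a string (assumption)
  shape: string;     // a shape expression label, serialized as a string (assumption)
  status?: Status;   // absent means "conformant"
  reason?: string;   // may be absent; no default value
  appInfo?: unknown; // application-specific JSON-LD structure; no default value
}

// Applies the default described above: a missing status is "conformant".
function effectiveStatus(entry: ResultShapeMapEntry): Status {
  return entry.status ?? "conformant";
}

const entries: ResultShapeMapEntry[] = [
  { node: "<http://data.example/#n1>", shape: "<http://data.example/#s>" },
  {
    node: "<http://data.example/#n2>", // illustrative node, not from the examples in this document
    shape: "<http://data.example/#s>",
    status: "nonconformant",
    reason: "missing :p1",
  },
];

const statuses: Status[] = entries.map(effectiveStatus);
```

Only `status` receives a default; `reason` and `appInfo` stay undefined when absent.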

A node selector is a subset of a SPARQL triple pattern with these restrictions:

V (the set of variables) is either a fresh variable or a known token to identify the focus node.
The focus node token appears in either the subject or the object position.
The predicate position is filled by an IRI (I in the SPARQL definitions).
Each optional focus node f_i in focusNodes, m has a start shapeExpr se, and satisfies(f_i, se, G, m). (Editor's note: adjust to fit in this doc.)

A query ShapeMap is a ShapeMap with only a node, shape and possibly a status property. A query ShapeMap with a status must have a status of "conformant" or "nonconformant".

A ground ShapeMap is a ShapeMap in which all node selector/shape pairs have been replaced by a set of zero or more node/shape pairs. The ShEx validation process takes as input a ground ShapeMap.
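A minimal sketch of how a query ShapeMap might be turned into a ground ShapeMap, assuming an in-memory list of triples and string-encoded terms. The types `Triple`, `TriplePatternSelector`, `QueryPair` and `GroundPair` and the functions `selectNodes` and `ground` are hypothetical; a wildcard position is modelled as `null`.

```typescript
// Illustrative types only; none of these names are defined by this specification.
interface Triple { subject: string; predicate: string; object: string; }

// The focus token sits in the subject or object position; the predicate is an IRI;
// the remaining position is a concrete term or a wildcard (null).
type TriplePatternSelector =
  | { focus: "subject"; predicate: string; object: string | null }
  | { focus: "object"; subject: string | null; predicate: string };

interface QueryPair { node: string | TriplePatternSelector; shape: string; }
interface GroundPair { node: string; shape: string; }

// Returns the distinct nodes of the graph that the selector picks out.
function selectNodes(selector: TriplePatternSelector, graph: Triple[]): string[] {
  const found = new Set<string>();
  for (const t of graph) {
    if (t.predicate !== selector.predicate) continue;
    if (selector.focus === "subject") {
      if (selector.object === null || selector.object === t.object) found.add(t.subject);
    } else {
      if (selector.subject === null || selector.subject === t.subject) found.add(t.object);
    }
  }
  return Array.from(found);
}

// Replaces every node selector/shape pair by zero or more node/shape pairs.
function ground(query: QueryPair[], graph: Triple[]): GroundPair[] {
  const result: GroundPair[] = [];
  for (const pair of query) {
    const nodes = typeof pair.node === "string" ? [pair.node] : selectNodes(pair.node, graph);
    for (const node of nodes) result.push({ node, shape: pair.shape });
  }
  return result;
}

const graph: Triple[] = [
  { subject: "http://data.example/#n1", predicate: "http://data.example/#p2", object: "\"abcd\"@en-us" },
  { subject: "http://data.example/#n2", predicate: "http://data.example/#p2", object: "\"efgh\"@en-us" },
  { subject: "http://data.example/#n1", predicate: "http://data.example/#p3", object: "http://data.example/#n3" },
];

const grounded = ground(
  [
    { node: { focus: "subject", predicate: "http://data.example/#p2", object: "\"abcd\"@en-us" }, shape: "START" },
    { node: { focus: "object", subject: null, predicate: "http://data.example/#p3" }, shape: "START" },
    { node: "http://data.example/#n9", shape: "http://data.example/#s" },
  ],
  graph,
);
```

A selector that matches no triple contributes no pairs, which is why a ground ShapeMap may hold zero pairs for a given selector.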

async Application Programming Interface

This API provides a clean mechanism that enables developers to determine conformance of RDF Graphs to a ShEx Schema.

Each function in the ShEx API defines a set of results and errors. Errors are identified by their error code.

The ShExProcessor Interface

The ShExProcessor interface is the high-level programming structure that developers use to determine conformance of RDF Graphs to a ShEx Schema. All input parameters are static, meaning they are not modified by any defined API operation.

          [Constructor]
          interface ShExProcessor {
            Schema parseSchema(
              (USVString or RDFGraph) schemastring,
              optional ShExParserOptions? options);
            RDFGraph parseData(
              (USVString) datastring,
              optional DataParserOptions? options);
            fixedShapeMap resolveShapeMap(
              queryShapeMap map,
              optional RDFGraph? data);
            resultShapeMap validate(
              Schema schema,
              RDFGraph graph,
              fixedShapeMap shapeMap,
              optional ShExOptions? options);
          };
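The four operations suggest a pipeline: parse the schema, parse the data, resolve the ShapeMap, then validate. The sketch below mirrors the WebIDL as a hypothetical TypeScript interface (`ShExProcessorLike`) and chains the calls in that order. The `stub` object exists only to exercise the call order; it performs no ShEx processing and marks every pair conformant.

```typescript
// Illustrative mirror of the WebIDL above; Schema and Graph are opaque type parameters.
interface ShapeMapPair { node: string; shape: string; status?: "conformant" | "nonconformant"; }

interface ShExProcessorLike<Schema, Graph> {
  parseSchema(schemastring: string): Schema;
  parseData(datastring: string): Graph;
  resolveShapeMap(map: ShapeMapPair[], data?: Graph): ShapeMapPair[];
  validate(schema: Schema, graph: Graph, shapeMap: ShapeMapPair[]): ShapeMapPair[];
}

// Chains the four operations in the order their signatures suggest.
function runValidation<S, G>(
  processor: ShExProcessorLike<S, G>,
  schemaText: string,
  dataText: string,
  query: ShapeMapPair[],
): ShapeMapPair[] {
  const schema = processor.parseSchema(schemaText);
  const graph = processor.parseData(dataText);
  const fixed = processor.resolveShapeMap(query, graph);
  return processor.validate(schema, graph, fixed);
}

// Toy stub (assumption): records calls and reports every pair as conformant.
const calls: string[] = [];
const stub: ShExProcessorLike<string, string[]> = {
  parseSchema: (s) => { calls.push("parseSchema"); return s; },
  parseData: (d) => { calls.push("parseData"); return d.split("\n"); },
  resolveShapeMap: (m) => { calls.push("resolveShapeMap"); return m; },
  validate: (_schema, _graph, m) => {
    calls.push("validate");
    return m.map((pair) => ({ ...pair, status: "conformant" as const }));
  },
};

const results = runValidation(stub, "schema text", "data text", [
  { node: "http://data.example/#n1", shape: "http://data.example/#s" },
]);
```

Because the inputs are never modified, the query ShapeMap passed in can be reused for a second run against different data.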

parseSchema

Parses parseSchema.schemastring into the ShExJ abstract syntax and returns the resulting Schema.

If validate.schema has the form of an IRI, set schema to the result of dereferencing the IRI, treating the result as ShExC, ShExJ, or some other RDF serialization, depending on the content-type of the result, and transforming into the ShExJ abstract syntax for processing.
Otherwise, if a USVString, validate.schema is treated as a string, and parsed into the ShExJ abstract syntax, using format as a hint to determine the content-type.
Otherwise, if an RDF Graph, schema is transformed into the ShExJ abstract syntax for processing.
If validate.schema is parsed as ShExC, and a syntax error is found, reject promise passing syntax error.
If a transformation to the ShExJ abstract syntax does not result in valid ShExJ, reject promise, passing invalid schema.

schemastring: The ShEx Schema, either as a dereferenceable IRI (URL), an inline data string, or an RDF Graph which is transformed into the ShExJ abstract syntax.
options: A set of options to configure parsing.

parseData

Parses parseData.datastring and returns the resulting RDFGraph.

If validate.graph has the form of an IRI, set graph to the result of dereferencing the IRI, treating the result as an RDF Graph serialization, depending on the content-type of the result. Otherwise, validate.graph either is an RDF Graph, or a processor should use other means to transform it into an RDF Graph.

datastring: The RDF data, either as a dereferenceable IRI (URL) or an inline data string, which is parsed into an RDF Graph.
options: A set of options to configure parsing.

resolveShapeMap

Resolves resolveShapeMap.map against resolveShapeMap.data and returns the resulting fixedShapeMap.

Issue 2

parse queryShapeMap string or JSON doc (resolve prefixes?).

Issue 3

Construct a fixedShapeMap by keeping each fixed pair and replacing node selectors with their corresponding nodes.

map: The query ShapeMap to resolve.
data: The RDF Graph against which the node selectors in map are evaluated.

validate

Checks conformance of the given validate.graph with validate.schema using validate.shapeMap according to .

Process schema and graph according to the requirements in , creating shapeResults, a ShapeResults dictionary based on a copy of validate.shapeMap with entries added, as necessary, for each focus node f_i in focusNodes associated with http://www.w3.org/ns/shex#Start as a shapeExprLabel, a BOOL result value, depending on whether the node conforms with the referenced shape(s), and an optional message describing details of conformance. The promise is then fulfilled using shapeResults.
If graph is found not to conform using schema and validate.shapeMap, reject promise, passing non-conforming graph along with shapeResults.

schema: The ShEx Schema, either as a dereferenceable IRI (URL), an inline data string, or an RDF Graph which is transformed into the ShExJ abstract syntax.
graph: The graph to check conformance with, either as a dereferenceable IRI to some RDF serialization, or as an accessible RDF Graph.
shapeMap: A dictionary mapping a node in validate.graph to one or more shapeExprLabel.
options: A set of options to configure the algorithms.

The rest of this follows sections 6.2 - 6.5 in the orig API.

The Application Programming Interface

This API provides a clean mechanism that enables developers to determine conformance of RDF Graphs to a ShEx Schema.

The ShEx API uses Promises to represent the result of the various asynchronous operations. Promises are defined in [ECMASCRIPT-6.0]. General use within specifications can be found in [promises-guide]. When using promises, an error is reported as a rejected promise and a result is reported as a fulfilled promise.

The ShExProcessor Interface

The ShExProcessor interface is the high-level programming structure that developers use to determine conformance of RDF Graphs to a ShEx Schema.

Implementations do not modify the input parameters. If an error is detected, the Promise is rejected, passing a ShExError with the corresponding error code.

          [Constructor]
          interface ShExProcessor {
            Promise validate(
              (USVString or RDFGraph) schema,
              (USVString or RDFGraph) graph,
              ShapeMap shapeMap,
              optional ShExOptions? options);
          };

validate

Checks conformance of the given validate.graph with validate.schema using validate.shapeMap according to .

Create a new Promise promise and return it. The following steps are then executed asynchronously.
If validate.schema has the form of an IRI, set schema to the result of dereferencing the IRI, treating the result as ShExC, ShExJ, or some other RDF serialization, depending on the content-type of the result, and transforming into the ShExJ abstract syntax for processing.
Otherwise, if a USVString, validate.schema is treated as a string, and parsed into the ShExJ abstract syntax, using format as a hint to determine the content-type.
Otherwise, if an RDF Graph, schema is transformed into the ShExJ abstract syntax for processing.
If validate.schema is parsed as ShExC, and a syntax error is found, reject promise passing syntax error.
If a transformation to the ShExJ abstract syntax does not result in valid ShExJ, reject promise, passing invalid schema.
If validate.graph has the form of an IRI, set graph to the result of dereferencing the IRI, treating the result as an RDF Graph serialization, depending on the content-type of the result. Otherwise, validate.graph either is an RDF Graph, or a processor should use other means to transform it into an RDF Graph.
Process schema and graph according to the requirements in , creating shapeResults, a ShapeResults dictionary based on a copy of validate.shapeMap with entries added, as necessary, for each focus node f_i in focusNodes associated with http://www.w3.org/ns/shex#Start as a shapeExprLabel, a BOOL result value, depending on whether the node conforms with the referenced shape(s), and an optional message describing details of conformance. The promise is then fulfilled using shapeResults.
If graph is found not to conform using schema and validate.shapeMap, reject promise, passing non-conforming graph along with shapeResults.

schema: The ShEx Schema, either as a dereferenceable IRI (URL), an inline data string, or an RDF Graph which is transformed into the ShExJ abstract syntax.
graph: The graph to check conformance with, either as a dereferenceable IRI to some RDF serialization, or as an accessible RDF Graph.
shapeMap: A dictionary mapping a node in validate.graph to one or more shapeExprLabel.
options: A set of options to configure the algorithms.
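The promise handling in the steps above can be sketched as follows. This is not the validation algorithm: the `check` callback is a hypothetical stand-in for it, schema and graph parsing are omitted, and the shape of the rejection value (an object holding the `ShExError` and the `shapeResults`) is an assumption about how "passing non-conforming graph along with shapeResults" might look in code.

```typescript
type ShExErrorCode = "invalid schema" | "non-conforming graph" | "syntax error";
interface ShExError { code: ShExErrorCode; message: string | null; }
interface ShapeResult { shape: string; result: boolean; }
type ShapeResults = { [node: string]: ShapeResult[] };

// Hypothetical stand-in for the real conformance check of one node against one shape.
type Checker = (node: string, shape: string) => boolean;

function validateSketch(shapeMap: { [node: string]: string[] }, check: Checker): Promise<ShapeResults> {
  // The promise is returned immediately; the steps run asynchronously afterwards.
  return Promise.resolve().then(() => {
    const shapeResults: ShapeResults = {};
    const nodes = Object.keys(shapeMap);
    for (const node of nodes) {
      const shapes = shapeMap[node] ?? [];
      shapeResults[node] = shapes.map((shape) => ({ shape, result: check(node, shape) }));
    }
    const allConform = nodes.every((n) => (shapeResults[n] ?? []).every((r) => r.result));
    if (!allConform) {
      const error: ShExError = { code: "non-conforming graph", message: null };
      // Throwing inside then() rejects the returned promise with this value.
      throw { error, shapeResults };
    }
    return shapeResults; // fulfils the promise
  });
}
```

A caller would attach both a fulfilment and a rejection handler, for example `validateSketch(map, check).then(onResults, onFailure)`, and read `shapeResults` from either outcome.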

The RDFGraph Type

The RDFGraph type describes an RDF Graph accessed as described in Graph access.

WebIDL
          interface RDFGraph {
            void bgp();
          };

bgp: Performs a SPARQL Basic Graph Pattern query on the graph.

The ShExOptions Type

The ShExOptions type is used to pass various options to the ShExProcessor methods.

          dictionary ShExOptions {
            USVString?             base;
            sequence?   focusNodes;
            USVString              format = "ShExC";
            USVString              processingMode = "shex-2.0";
          };

base: The base IRI to use when parsing validate.schema. If set, this overrides the validate.schema's IRI.
focusNodes: One or more nodes in validate.graph which are checked for conformance against start.
format: One of ShExC or ShExJ.
processingMode: If set to shex-2.0, the implementation has to produce exactly the same results as the algorithms defined in this specification. If set to another value, the ShEx processor is allowed to extend or modify the algorithms defined in this specification to enable application-specific optimizations. The definition of such optimizations is beyond the scope of this specification and thus not defined. Consequently, different implementations may implement different optimizations. Developers must not define modes beginning with shex as they are reserved for future versions of this specification.
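A small sketch of how the defaults and the reserved-prefix rule for processingMode might be applied. The function names `withDefaults` and `isReservedUndefinedMode` are hypothetical, and the element type of focusNodes is assumed to be a string because the dictionary above does not show it.

```typescript
interface ShExOptions {
  base?: string | null;
  focusNodes?: string[] | null; // element type is an assumption
  format?: string;
  processingMode?: string;
}

// Fills in the default values given in the dictionary definition.
function withDefaults(options: ShExOptions = {}) {
  return {
    base: options.base ?? null,
    focusNodes: options.focusNodes ?? null,
    format: options.format ?? "ShExC",
    processingMode: options.processingMode ?? "shex-2.0",
  };
}

// Modes beginning with "shex" are reserved; "shex-2.0" is the only one named in this document.
function isReservedUndefinedMode(mode: string): boolean {
  return mode.slice(0, 4) === "shex" && mode !== "shex-2.0";
}

const effective = withDefaults({ format: "ShExJ" });
```

An application-specific mode such as the made-up `"my-fast-mode"` passes the check, whereas a name like `"shex-fast"` would collide with the reserved prefix.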

The ShapeResults Type

The ShapeResults type is used for reporting the result of conformance checking of an RDF Graph against a schema.

ShapeResults is a dictionary mapping a node in validate.graph to a sequence of ShapeResult entries.

          typedef dictionary ShapeResults;

          dictionary ShapeResult {
            USVString  shape;
            boolean    result;
            USVString? reason;
          };

shape: A shapeExprLabel in validate.schema against which node conformance was checked.
result: The result of the conformance check.
reason: An optional message describing the result of the conformance check.
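As an illustration of consuming this structure, the sketch below walks a ShapeResults dictionary and collects the node/shape pairs that failed. The `Failure` type, the `failures` function, and the sample data are assumptions made for the example.

```typescript
interface ShapeResult { shape: string; result: boolean; reason?: string | null; }
type ShapeResults = { [node: string]: ShapeResult[] };

// Hypothetical flattened view of one failed conformance check.
interface Failure { node: string; shape: string; reason: string | null; }

// Collects every entry whose result is false, keeping the optional reason.
function failures(results: ShapeResults): Failure[] {
  const out: Failure[] = [];
  for (const node of Object.keys(results)) {
    for (const r of results[node] ?? []) {
      if (!r.result) out.push({ node, shape: r.shape, reason: r.reason ?? null });
    }
  }
  return out;
}

const sample: ShapeResults = {
  "http://data.example/#n1": [{ shape: "http://data.example/#s", result: true }],
  "http://data.example/#n2": [{ shape: "http://data.example/#s", result: false, reason: "missing :p1" }],
};

const failed = failures(sample);
```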

Error Handling

This section describes the datatype definitions used within the ShEx API for error handling.

ShExError

The ShExError type is used to report processing errors.

WebIDL
            dictionary ShExError {
              ShExErrorCode code;
              DOMString?      message = null;
            };

code: a string representing the particular error type, as described in the various algorithms in this document.
message: an optional error message containing additional debugging information. The specific contents of error messages are outside the scope of this specification.

ShExErrorCode

The ShExErrorCode represents the collection of valid ShEx error codes.

WebIDL
            enum ShExErrorCode {
                "invalid schema",
                "non-conforming graph",
                "syntax error"
            };

invalid schema: The validate.schema argument to validate is invalid. @gkellogg: This could use some improvement.
non-conforming graph: The validate.graph is found not to conform with validate.schema.
syntax error: Parsing validate.schema as ShExC failed with a syntax error.
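A rejection handler can branch on the error code. The following sketch maps each ShExErrorCode to a human-readable line; the wording of the messages and the function name `describeError` are illustrative assumptions, since this specification leaves message contents undefined.

```typescript
type ShExErrorCode = "invalid schema" | "non-conforming graph" | "syntax error";
interface ShExError { code: ShExErrorCode; message?: string | null; }

// One illustrative description per error code; Record forces every code to be covered.
const DESCRIPTIONS: Record<ShExErrorCode, string> = {
  "invalid schema": "The schema is not valid ShExJ",
  "non-conforming graph": "The graph does not conform to the schema",
  "syntax error": "The ShExC schema could not be parsed",
};

// Appends the optional debugging message when one is present.
function describeError(error: ShExError): string {
  const detail = error.message ? ": " + error.message : "";
  return DESCRIPTIONS[error.code] + detail;
}

const described = describeError({ code: "syntax error", message: "unexpected token" });
```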

ShapeMap syntax

ShapeMaps can be easily transmitted and understood with a specialized syntax.

Example 1: Simple human-syntax ShapeMap

A simple ShapeMap with one node/shape pair might be both a query ShapeMap and a result ShapeMap:

<http://data.example/#n1> @ <http://data.example/#s>, # a simple node/shape pair

Relative and prefixed IRIs in the node position are resolved against the application-defined prefix map and base URL for the data. Likewise, schema IRI forms are resolved against the Schema PREFIX and namespace.

Example 2: ShapeMap with prefixed names

d:n1 @ s:S1, # d: prefix from data, s: from schema

... status, reason, appInfo ...

d:n1 @ s:S1, # d: prefix from data, s: from schema
"foo"^^xsd:string@START!           # "foo" did not match the start shape.
  /"missing :p1"                   # The reason given is "missing :p1".
  $"appinfo":{"myextra1":["..."]} ,# The application provides structural data.
"chat"@en-fr@<http://...S3>?, # validate a literal
{FOCUS :p2 "abcd"@en-us}@START, # validate subjects of :p2 "abcd"
{_ :p3 FOCUS}@START # validate all objects of :p3

The node or node selector is before the '@'.
The shape expression label is immediately after the '@'.
The '!' means node does not conform to the shape expression.
For systems that perform validation asynchronously, '?' means that conformance has not been tested.
The absence of '!' or '?' means that node conforms to the shape expression.
The '/' character introduces a url-encoded reason string.
The '$' character precedes a JSON string which precedes the ':' character and a JSON value. At present, the only string permitted here is "appinfo".
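The rules above can be sketched as a serializer for one pair. It assumes the node and shape are already serialized strings, writes the reason as a double-quoted string as in the example (using JSON string escaping, which is an approximation of the string terminals), and uses the hypothetical status value `"untested"` for the '?' marker. The names `ResultPair`, `serializePair` and `serializeShapeMap` are not defined by this specification.

```typescript
interface ResultPair {
  node: string;  // already serialized, e.g. "<http://data.example/#n1>" or "d:n1"
  shape: string; // "START" or a serialized IRI
  status?: "conformant" | "nonconformant" | "untested"; // "untested" is a hypothetical name for '?'
  reason?: string;
  appInfo?: unknown;
}

// Emits node '@' shape, then the optional status marker, reason and appinfo.
function serializePair(p: ResultPair): string {
  let out = `${p.node}@${p.shape}`;
  if (p.status === "nonconformant") out += "!";
  else if (p.status === "untested") out += "?";
  if (p.reason !== undefined) out += "/" + JSON.stringify(p.reason);
  if (p.appInfo !== undefined) out += '$"appinfo":' + JSON.stringify(p.appInfo);
  return out;
}

// Pairs are separated by commas.
function serializeShapeMap(pairs: ResultPair[]): string {
  return pairs.map(serializePair).join(",\n");
}

const text = serializeShapeMap([
  { node: "d:n1", shape: "s:S1" },
  {
    node: '"foo"^^xsd:string',
    shape: "START",
    status: "nonconformant",
    reason: "missing :p1",
    appInfo: { myextra1: ["..."] },
  },
]);
```

A conformant pair carries no marker, so `d:n1@s:S1` round-trips as both a query and a result.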

ShapeMap grammar

[1]	`shapeMap`	::=	`pair (',' pair)*;`
[2]	`pair`	::=	`nodeSelector shapeSelector status? reason? jsonAttributes?`
[3]	`nodeSelector`	::=	`objectTerm | triplePattern`
[4]	`subjectTerm`	::=	`iri | BLANK_NODE_LABEL`
[5]	`objectTerm`	::=	`subjectTerm | literal`
[6]	`triplePattern`	::=	`'{' "FOCUS" iri (objectTerm | '_') '}' | '{' (subjectTerm | '_') iri "FOCUS" '}'`
[7]	`shapeSelector`	::=	`'@' (iri | "START") | ATSTART | ATPNAME_NS | ATPNAME_LN`
[8]	`status`	::=	`'!' | '?'`
[9]	`reason`	::=	`'/' string`
[10]	`jsonAttributes`	::=	`'$' '"appinfo"' ':' jsonValue`
[11]	`jsonValue`	::=	`'false' | 'null' | 'true' | jsonObject | jsonArray | DOUBLE | STRING_LITERAL2;`
[12]	`jsonObject`	::=	`'{' (jsonMember (',' jsonMember)*)? '}';`
[13]	`jsonMember`	::=	`STRING_LITERAL2 ':' jsonValue;`
[14]	`jsonArray`	::=	`'[' (jsonValue (',' jsonValue)*)? ']';`
[13t]	`literal`	::=	`rdfLiteral | numericLiteral | booleanLiteral`
[16t]	`numericLiteral`	::=	`INTEGER | DECIMAL | DOUBLE`
[65]	`rdfLiteral`	::=	`langString | string ("^^" iri)?`
[134s]	`booleanLiteral`	::=	`"true" | "false"`
[135s]	`string`	::=	`STRING_LITERAL1 | STRING_LITERAL_LONG1 | STRING_LITERAL2 | STRING_LITERAL_LONG2`
[66]	`langString`	::=	`LANG_STRING_LITERAL1 | LANG_STRING_LITERAL_LONG1 | LANG_STRING_LITERAL2 | LANG_STRING_LITERAL_LONG2`
[136s]	`iri`	::=	`IRIREF | prefixedName`
[137s]	`prefixedName`	::=	`PNAME_LN | PNAME_NS`
Terminals
[18t]	<`IRIREF`>	::=	"<" ([^#0000- <>\"{}|^`\\] | UCHAR)* ">"
[140s]	<`PNAME_NS`>	::=	`PN_PREFIX? ":"`
[141s]	<`PNAME_LN`>	::=	`PNAME_NS PN_LOCAL`
[70]	<`ATPNAME_NS`>	::=	`"@" PN_PREFIX? ":"`
[71]	<`ATPNAME_LN`>	::=	`"@" PNAME_NS PN_LOCAL`
[142s]	<`BLANK_NODE_LABEL`>	::=	`"_:" (PN_CHARS_U | [0-9]) ((PN_CHARS | ".")* PN_CHARS)?`
[15]	<`AT_PNAME_NS`>	::=	`'@' PNAME_NS`
[16]	<`AT_PNAME_LN`>	::=	`'@' PNAME_LN`
[17]	<`AT_START`>	::=	`"@START"`
This terminal has precedence over LANGTAG.
[145s]	<`LANGTAG`>	::=	`"@" ([a-zA-Z])+ ("-" ([a-zA-Z0-9])+)*`
[19t]	<`INTEGER`>	::=	`[+-]? [0-9]+`
[20t]	<`DECIMAL`>	::=	`[+-]? [0-9]* "." [0-9]+`
[21t]	<`DOUBLE`>	::=	`[+-]? ([0-9]+ "." [0-9]* EXPONENT | "."? [0-9]+ EXPONENT)`
[155s]	<`EXPONENT`>	::=	`[eE] [+-]? [0-9]+`
[156s]	<`STRING_LITERAL1`>	::=	`"'" ([^'\\\n\r] | ECHAR | UCHAR)* "'"`
[157s]	<`STRING_LITERAL2`>	::=	`'"' ([^\"\\\n\r] | ECHAR | UCHAR)* '"'`
[158s]	<`STRING_LITERAL_LONG1`>	::=	`"'''" ( ("'" | "''")? ([^\\'\\] | ECHAR | UCHAR) )* "'''"`
[159s]	<`STRING_LITERAL_LONG2`>	::=	`'"""' ( ('"' | '""')? ([^\"\\] | ECHAR | UCHAR) )* '"""'`
[73]	<`LANG_STRING_LITERAL1`>	::=	`"'" ([^'\\\n\r] | ECHAR | UCHAR)* "'" LANGTAG`
[74]	<`LANG_STRING_LITERAL2`>	::=	`'"' ([^\"\\\n\r] | ECHAR | UCHAR)* '"' LANGTAG`
[75]	<`LANG_STRING_LITERAL_LONG1`>	::=	`"'''" ( ("'" | "''")? ([^\\'\\] | ECHAR | UCHAR) )* "'''" LANGTAG`
[76]	<`LANG_STRING_LITERAL_LONG2`>	::=	`'"""' ( ('"' | '""')? ([^\"\\] | ECHAR | UCHAR) )* '"""' LANGTAG`
[26t]	<`UCHAR`>	::=	`"\\u" HEX HEX HEX HEX | "\\U" HEX HEX HEX HEX HEX HEX HEX HEX`
[160s]	<`ECHAR`>	::=	`"\\" [tbnrf\\\"\\']`
[164s]	<`PN_CHARS_BASE`>	::=	`[A-Z] | [a-z] | [#00C0-#00D6] | [#00D8-#00F6] | [#00F8-#02FF] | [#0370-#037D] | [#037F-#1FFF] | [#200C-#200D] | [#2070-#218F] | [#2C00-#2FEF] | [#3001-#D7FF] | [#F900-#FDCF] | [#FDF0-#FFFD] | [#10000-#EFFFF]`
[165s]	<`PN_CHARS_U`>	::=	`PN_CHARS_BASE | "_"`
[167s]	<`PN_CHARS`>	::=	`PN_CHARS_U | "-" | [0-9] | [#00B7] | [#0300-#036F] | [#203F-#2040]`
[168s]	<`PN_PREFIX`>	::=	`PN_CHARS_BASE ( (PN_CHARS | ".")* PN_CHARS )?`
[169s]	<`PN_LOCAL`>	::=	`(PN_CHARS_U | ":" | [0-9] | PLX) ( (PN_CHARS | "." | ":" | PLX)* (PN_CHARS | ":" | PLX) )?`
[170s]	<`PLX`>	::=	`PERCENT | PN_LOCAL_ESC`
[171s]	<`PERCENT`>	::=	`"%" HEX HEX`
[172s]	<`HEX`>	::=	`[0-9] | [A-F] | [a-f]`
[173s]	<`PN_LOCAL_ESC`>	::=	`"\\" ( "_" | "~" | "." | "-" | "!" | "$" | "&" | "'" | "(" | ")" | "*" | "+" | "," | ";" | "=" | "/" | "?" | "#" | "@" | "%" )`
[98]	`PASSED TOKENS`	::=	`[ \t\r\n]+ | "#" [^\r\n]*`
