The Shape Expressions (ShEx) language describes RDF nodes and graph structures. A node constraint describes an RDF node (IRI, blank node or literal) and a shape describes the triples involving nodes in an RDF graph. These descriptions identify predicates and their associated cardinalities and datatypes. ShEx shapes can be used to communicate data structures associated with some process or interface, generate or validate data, or drive user interfaces.
This document defines programming and REST interfaces for instantiating and executing validation services. This includes the use of ShapeMaps to bind RDF data to ShEx schemas. See the Shape Expressions Primer for a non-normative description of shape maps.
This document will be presented to the Shape Expressions Community Group.
The Shape Expressions (ShEx) language provides a structural schema for RDF data. This can be used to document APIs or datasets, aid in development of API-conformant messages, minimize defensive programming, guide user interfaces, or anything else that involves a machine-readable description of data organization and typing requirements.
A practical use of ShEx is to test nodes in RDF graphs for conformance with shape expressions. This document defines interfaces for doing that.
To understand the programmatic interface described in this specification and how it is intended to operate in a programming environment, it is useful to have working knowledge of WebIDL [[WebIDL]]. To understand how ShEx relates to RDF, it is helpful to be familiar with the basic RDF concepts [[RDF11-CONCEPTS]].
The ShEx interface is defined using terms from RDF semantics [[!rdf11-mt]]:
For the purposes of the ShExProcessor interface, a promise is an object that represents the eventual result of a single asynchronous operation. Promises are defined in [[ECMASCRIPT-6.0]].
Conformance criteria are relevant to authors and authoring tool implementers. As well as sections marked as non-normative, all authoring guidelines, diagrams, examples, and notes in this specification are non-normative. Everything else in this specification is normative.
The ShEx specification defines two forms of ShapeMaps:
start shape
.node
and shape
properties of a fixed ShapeMap and unspecified extra properties.
A result ShapeMap has the following properties:
If the status property is absent, the status is assumed to be "conformant". The reason and appInfo properties may also be absent but have no default value.
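The defaulting rule above can be sketched in a few lines. This is an illustration only: the `ResultEntrySketch` type and the `effectiveStatus` helper are hypothetical names, and the field names simply mirror the properties mentioned in the text (node, shape, status, reason, appInfo).

```typescript
// Hypothetical representation of one result ShapeMap entry.
type ResultStatus = "conformant" | "nonconformant";

interface ResultEntrySketch {
  node: string;
  shape: string;
  status?: ResultStatus; // absent means "conformant"
  reason?: string;       // no default value
  appInfo?: unknown;     // no default value
}

// Return the status to use for an entry, applying the default for an absent status.
function effectiveStatus(entry: ResultEntrySketch): ResultStatus {
  return entry.status ?? "conformant";
}

const passedEntry: ResultEntrySketch = {
  node: "http://a.example/n1",
  shape: "http://a.example/S1",
};
const failedEntry: ResultEntrySketch = {
  ...passedEntry,
  status: "nonconformant",
  reason: "missing :p1",
};
```

Only status is defaulted; reason and appInfo stay undefined when absent.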
A node selector is a subset of a SPARQL triple pattern with these restrictions:
V (the set of variables) is either a fresh variable or a known token to identify the focus node.
I in the SPARQL definitions).
satisfies(fi, se, G, m).
adjust to fit in this doc
A query ShapeMap is a ShapeMap with only a node, shape and possibly a status property.
A query ShapeMap with a status must have a status of "conformant" or "nonconformant".
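As a rough sketch, the status restriction on a query ShapeMap entry can be checked as follows. The `QueryEntrySketch` type and `hasValidQueryStatus` function are hypothetical and are not part of the interfaces defined in this document.

```typescript
// Hypothetical query ShapeMap entry: a node (or node selector), a shape,
// and possibly a status.
interface QueryEntrySketch {
  node: string;
  shape: string;
  status?: string;
}

const ALLOWED_QUERY_STATUSES = ["conformant", "nonconformant"];

// True when the entry has no status, or one of the two permitted values.
function hasValidQueryStatus(entry: QueryEntrySketch): boolean {
  return (
    entry.status === undefined ||
    ALLOWED_QUERY_STATUSES.indexOf(entry.status) !== -1
  );
}
```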
A ground ShapeMap is a ShapeMap in which all node selector/shape pairs have been replaced by a set of zero or more node/shape pairs. The ShEx validation process takes as input a ground ShapeMap.
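The grounding step can be illustrated with a small sketch. It assumes a graph held as an in-memory array of triples whose terms are plain strings, and a simplified selector with one focus position and an optional wildcard. The names `TripleSketch`, `NodeSelectorSketch` and `groundShapeMap` are hypothetical.

```typescript
// Simplified triple: terms are plain strings in this sketch.
interface TripleSketch { s: string; p: string; o: string; }

// A selector is either a single node or a triple pattern with one focus slot.
// `other: null` stands for the wildcard position.
type NodeSelectorSketch =
  | { kind: "node"; node: string }
  | { kind: "pattern"; focus: "subject" | "object"; predicate: string; other: string | null };

interface SelectorPairSketch { selector: NodeSelectorSketch; shape: string; }
interface GroundPairSketch { node: string; shape: string; }

// Replace every selector/shape pair with zero or more node/shape pairs.
function groundShapeMap(pairs: SelectorPairSketch[], graph: TripleSketch[]): GroundPairSketch[] {
  const out: GroundPairSketch[] = [];
  const seen: { [key: string]: boolean } = {};
  for (const pair of pairs) {
    const selector = pair.selector;
    let nodes: string[];
    if (selector.kind === "node") {
      nodes = [selector.node];
    } else {
      const pat = selector; // narrowed to the pattern variant
      nodes = graph
        .filter(t => t.p === pat.predicate)
        .filter(t => pat.other === null || (pat.focus === "subject" ? t.o : t.s) === pat.other)
        .map(t => (pat.focus === "subject" ? t.s : t.o));
    }
    for (const node of nodes) {
      const key = JSON.stringify([node, pair.shape]);
      if (!seen[key]) {
        seen[key] = true; // avoid emitting the same node/shape pair twice
        out.push({ node, shape: pair.shape });
      }
    }
  }
  return out;
}
```

A pattern that matches no triple contributes no pairs, which is the "zero or more" case in the definition above.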
This API provides a clean mechanism that enables developers to determine conformance of RDF Graphs to a ShEx Schema.
Each function in the ShEx API defines a set of results and errors. Errors are identified by their error code.
The ShExProcessor interface is the high-level programming structure that developers use to determine conformance of RDF Graphs to a ShEx Schema. All input parameters are static, meaning they are not modified by any defined API operation.
http://www.w3.org/ns/shex#Start as a shapeExprLabel, a BOOL result value, depending on whether the node conforms with the referenced shape(s), and an optional message describing details of conformance, and promise is fulfilled using shapeResults.
The rest of this follows sections 6.2 - 6.5 in the orig API.
The ShEx API uses Promises to represent the result of the various asynchronous operations. Promises are defined in [[ECMASCRIPT-6.0]]. General use within specifications can be found in [[promises-guide]]. When using promises, an error is reported as a rejected promise and a result is reported as a fulfilled promise.
The ShExProcessor interface is the high-level programming structure that developers use to determine conformance of RDF Graphs to a ShEx Schema.
It is important to highlight that implementations do not modify the input parameters. If an error is detected, the Promise is rejected with a ShExError carrying the corresponding error code.
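The fulfil/reject convention can be sketched as below. Everything here is a stand-in: `validateSketch` is not the ShExProcessor method, `SketchShExError` only assumes that an error object carries a code, and the error code string is invented for the example.

```typescript
// Hypothetical error object: the text only states that a ShExError has an error code.
interface SketchShExError { code: string; message: string; }

function isSketchShExError(e: unknown): e is SketchShExError {
  return typeof e === "object" && e !== null && "code" in e;
}

// Stand-in for a processor method. It rejects when the schema text is empty
// and otherwise fulfills with an empty result list. Its argument is not modified.
function validateSketch(schemaText: string): Promise<string[]> {
  if (schemaText.trim() === "") {
    const err: SketchShExError = {
      code: "hypothetical-empty-schema", // invented code, for illustration only
      message: "schema is empty",
    };
    return Promise.reject(err);
  }
  const empty: string[] = [];
  return Promise.resolve(empty);
}

// A caller observes results through fulfillment and errors through rejection.
function describeOutcome(schemaText: string): Promise<string> {
  return validateSketch(schemaText).then(
    () => "fulfilled",
    (err: unknown) => (isSketchShExError(err) ? "rejected: " + err.code : "rejected")
  );
}
```

Callers can branch on the error code in the rejection handler rather than parsing the message text.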
http://www.w3.org/ns/shex#Start as a shapeExprLabel, a BOOL result value, depending on whether the node conforms with the referenced shape(s), and an optional message describing details of conformance, and promise is fulfilled using shapeResults.
The RDFGraph type describes an RDF Graph accessed as described in Graph access.
The ShExOptions type is used to pass various options to the ShExProcessor methods.
ShExC or ShExJ.
shex-2.0, the implementation has to produce exactly the same results as the algorithms defined in this specification. If set to another value, the ShEx processor is allowed to extend or modify the algorithms defined in this specification to enable application-specific optimizations. The definition of such optimizations is beyond the scope of this specification and thus not defined. Consequently, different implementations may implement different optimizations. Developers must not define modes beginning with shex as they are reserved for future versions of this specification.
The ShapeResults type is used for reporting the result of conformance checking of an RDF Graph against a schema.
ShapeResults is a dictionary mapping a node in validate.graph to a sequence of ShapeResult entries.
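One way to picture this dictionary is sketched below. The field names `shape`, `result` and `message` are assumptions drawn from the description of a shape label, a boolean result and an optional message. The type and function names are hypothetical.

```typescript
// Hypothetical entry: a shape label, a boolean result and an optional message.
interface ShapeResultSketch {
  shape: string;
  result: boolean;
  message?: string;
}

// Dictionary from a node (as a string key) to its sequence of entries.
type ShapeResultsSketch = { [node: string]: ShapeResultSketch[] };

// Label used when a node was tested against the start shape.
const START_SHAPE_LABEL = "http://www.w3.org/ns/shex#Start";

// List the nodes that failed at least one of the shapes they were tested against.
function nodesWithFailures(results: ShapeResultsSketch): string[] {
  return Object.keys(results).filter(node =>
    (results[node] || []).some(entry => !entry.result)
  );
}
```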
This section describes the datatype definitions used within the ShEx API for error handling.
The ShExError type is used to report processing errors.
The ShExErrorCode represents the collection of valid ShEx error codes.
ShapeMaps can be easily transmitted and understood with a specialized syntax.
Relative and prefixed IRIs in the node position are resolved against the application-defined prefix map and base URL for the data. Likewise, schema IRI forms are resolved against the Schema PREFIX and namespace.
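A minimal sketch of this resolution step follows, assuming the prefix map is a plain object and the token is either an IRI in angle brackets or a prefixed name. `resolveIriToken` is a hypothetical helper. It uses the standard `URL` constructor for relative resolution, which may also normalize the IRI, and it ignores PN_LOCAL escapes.

```typescript
type PrefixMapSketch = { [prefix: string]: string };

// Resolve an IRI token against a base URL (for <...> forms) or a prefix map
// (for prefixed names such as d:n1).
function resolveIriToken(token: string, prefixes: PrefixMapSketch, base: string): string {
  if (token.charAt(0) === "<" && token.charAt(token.length - 1) === ">") {
    // Relative or absolute IRI reference: resolve against the base.
    return new URL(token.slice(1, -1), base).href;
  }
  const colon = token.indexOf(":");
  if (colon === -1) {
    throw new Error("not an IRI token: " + token);
  }
  const prefix = token.slice(0, colon);
  if (!Object.prototype.hasOwnProperty.call(prefixes, prefix)) {
    throw new Error("unknown prefix: " + prefix);
  }
  return prefixes[prefix] + token.slice(colon + 1);
}
```

In use, node tokens would be resolved with the data's prefix map and base, and shape tokens with the schema's.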
... status, reason, appInfo ...
d:n1 @ s:S1, # d: prefix from data, s: from schema
"foo"^^xsd:string@START! # "foo" did not match the start shape.
/"missing :p1" # The reason given is "missing :p1".
$"appinfo":{"myextra1":["..."]} , # The application provides structural data.
"chat"@en-fr@<http://...S3>?, # validate a literal
{FOCUS :p2 "abcd"@en-us}@START, # validate subjects of :p2 "abcd"
{_ :p3 FOCUS}@START # validate all objects of :p3
'!' means node does not conform to the shape expression.
'?' means that conformance has not been tested.
The absence of '!' or '?' means that node conforms to the shape expression.
The '/' character introduces a url-encoded reason string.
The '$' character precedes a JSON string which precedes the ':' character and a JSON value. At present, the only string permitted here is "appinfo".
[1] | shapeMap | ::= | pair (',' pair)*; |
[2] | pair | ::= | nodeSelector shapeSelector status? reason? jsonAttributes? |
[3] | nodeSelector | ::= | objectTerm | triplePattern |
[4] | subjectTerm | ::= | iri | BLANK_NODE_LABEL |
[5] | objectTerm | ::= | subjectTerm | literal |
[6] | triplePattern | ::= | '{' "FOCUS" iri (objectTerm | '_') '}' |
[7] | shapeSelector | ::= | '@' (iri | "START") | AT_START | ATPNAME_NS | ATPNAME_LN |
[8] | status | ::= | '!' | '?' |
[9] | reason | ::= | '/' string |
[10] | jsonAttributes | ::= | '$' '"appinfo"' ':' jsonValue |
[11] | jsonValue | ::= | 'false' | 'null' | 'true' | jsonObject | jsonArray | DOUBLE | STRING_LITERAL2; |
[12] | jsonObject | ::= | '{' (jsonMember (',' jsonMember)*)? '}'; |
[13] | jsonMember | ::= | STRING_LITERAL2 ':' jsonValue; |
[14] | jsonArray | ::= | '[' (jsonValue (',' jsonValue)*)? ']'; |
[13t] | literal | ::= | rdfLiteral | numericLiteral | booleanLiteral |
[16t] | numericLiteral | ::= | INTEGER | DECIMAL | DOUBLE |
[65] | rdfLiteral | ::= | langString | string ("^^" iri)? |
[134s] | booleanLiteral | ::= | "true" | "false" |
[135s] | string | ::= | STRING_LITERAL1 | STRING_LITERAL_LONG1 |
[66] | langString | ::= | LANG_STRING_LITERAL1 | LANG_STRING_LITERAL_LONG1 |
[136s] | iri | ::= | IRIREF | prefixedName |
[137s] | prefixedName | ::= | PNAME_LN | PNAME_NS |
Terminals
[18t] | <IRIREF> | ::= | "<" ([^#0000- <>\"{}|^`\\] | UCHAR)* ">" |
[140s] | <PNAME_NS> | ::= | PN_PREFIX? ":" |
[141s] | <PNAME_LN> | ::= | PNAME_NS PN_LOCAL |
[70] | <ATPNAME_NS> | ::= | "@" PN_PREFIX? ":" |
[71] | <ATPNAME_LN> | ::= | "@" PNAME_NS PN_LOCAL |
[142s] | <BLANK_NODE_LABEL> | ::= | "_:" (PN_CHARS_U | [0-9]) ((PN_CHARS | ".")* PN_CHARS)? |
[15] | <AT_PNAME_NS> | ::= | '@' PNAME_NS |
[16] | <AT_PNAME_LN> | ::= | '@' PNAME_LN |
[17] | <AT_START> | ::= | "@START" |
This terminal has precedence over LANGTAG
[145s] | <LANGTAG> | ::= | "@" ([a-zA-Z])+ ("-" ([a-zA-Z0-9])+)* |
[19t] | <INTEGER> | ::= | [+-]? [0-9]+ |
[20t] | <DECIMAL> | ::= | [+-]? [0-9]* "." [0-9]+ |
[21t] | <DOUBLE> | ::= | [+-]? ([0-9]+ "." [0-9]* EXPONENT | "."? [0-9]+ EXPONENT) |
[155s] | <EXPONENT> | ::= | [eE] [+-]? [0-9]+ |
[156s] | <STRING_LITERAL1> | ::= | "'" ([^'\\\n\r] | ECHAR | UCHAR)* "'" |
[157s] | <STRING_LITERAL2> | ::= | '"' ([^\"\\\n\r] | ECHAR | UCHAR)* '"' |
[158s] | <STRING_LITERAL_LONG1> | ::= | "'''" ( ("'" | "''")? ([^\\'\\] | ECHAR | UCHAR) )* "'''" |
[159s] | <STRING_LITERAL_LONG2> | ::= | '"""' ( ('"' | '""')? ([^\"\\] | ECHAR | UCHAR) )* '"""' |
[73] | <LANG_STRING_LITERAL1> | ::= | "'" ([^'\\\n\r] | ECHAR | UCHAR)* "'" LANGTAG |
[74] | <LANG_STRING_LITERAL2> | ::= | '"' ([^\"\\\n\r] | ECHAR | UCHAR)* '"' LANGTAG |
[75] | <LANG_STRING_LITERAL_LONG1> | ::= | "'''" ( ("'" | "''")? ([^\\'\\] | ECHAR | UCHAR) )* "'''" LANGTAG |
[76] | <LANG_STRING_LITERAL_LONG2> | ::= | '"""' ( ('"' | '""')? ([^\"\\] | ECHAR | UCHAR) )* '"""' LANGTAG |
[26t] | <UCHAR> | ::= | "\\u" HEX HEX HEX HEX |
[160s] | <ECHAR> | ::= | "\\" [tbnrf\\\"\\'] |
[164s] | <PN_CHARS_BASE> | ::= | [A-Z] | [a-z] |
[165s] | <PN_CHARS_U> | ::= | PN_CHARS_BASE | "_" |
[167s] | <PN_CHARS> | ::= | PN_CHARS_U | "-" | [0-9] |
[168s] | <PN_PREFIX> | ::= | PN_CHARS_BASE ( (PN_CHARS | ".")* PN_CHARS )? |
[169s] | <PN_LOCAL> | ::= | (PN_CHARS_U | ":" | [0-9] | PLX) ( (PN_CHARS | "." | ":" | PLX)* (PN_CHARS | ":" | PLX) )? |
[170s] | <PLX> | ::= | PERCENT | PN_LOCAL_ESC |
[171s] | <PERCENT> | ::= | "%" HEX HEX |
[172s] | <HEX> | ::= | [0-9] | [A-F] | [a-f] |
[173s] | <PN_LOCAL_ESC> | ::= | "\\" ( "_" | "~" | "." | "-" | "!" | "$" | "&" | "'" | "(" | ")" | "*" | "+" | "," | ";" | "=" | "/" | "?" | "#" | "@" | "%" ) |
[98] | PASSED TOKENS | ::= | [ \t\r\n]+ |
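To show how the terminals above translate into matchers, the three numeric terminals can be written as anchored regular expressions. This is a sketch covering only INTEGER, DECIMAL and DOUBLE. The `classifyNumeric` helper is a hypothetical name and is not part of the grammar.

```typescript
// Regular expressions transcribed from the INTEGER, DECIMAL, DOUBLE and
// EXPONENT terminals, anchored so that the whole token must match.
const INTEGER_RE = /^[+-]?[0-9]+$/;
const DECIMAL_RE = /^[+-]?[0-9]*\.[0-9]+$/;
const DOUBLE_RE = /^[+-]?(?:[0-9]+\.[0-9]*[eE][+-]?[0-9]+|\.?[0-9]+[eE][+-]?[0-9]+)$/;

type NumericKind = "INTEGER" | "DECIMAL" | "DOUBLE";

// Return the terminal that matches the token, or null if none does.
function classifyNumeric(token: string): NumericKind | null {
  if (INTEGER_RE.test(token)) return "INTEGER";
  if (DECIMAL_RE.test(token)) return "DECIMAL";
  if (DOUBLE_RE.test(token)) return "DOUBLE";
  return null;
}
```

Because each pattern is anchored, the three cases are mutually exclusive and the order of the checks does not affect the result.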