Introduction
This is a position statement for participation to the W3C Workshop on Web Standardization for Graph Data 2019. This position statement lies within the topic of graph query languages, and focuses on the GraphQL [1] query language, and its link to the RDF [2] data model.
We focus on GraphQL, because it provides an easy-to-use and highly popular language for querying graphs. Unfortunately, these queries only have semantics in isolated servers. In order to add meaning across data sources, GraphQL queries can be aligned with the RDF data model.
In the next section, we will list existing techniques for using GraphQL to query RDF data. After that, we conclude with a proposal on a way forward to harmonize these approaches, which we hope to discuss during the workshop.
Existing Approaches
To the best of our knowledge, four solutions exist for querying RDF via GraphQL queries:
- HyperGraphQL [3] (enterprise)
- TopBraid [4] (enterprise)
- Stardog [5] (enterprise)
- GraphQL-LD [6] (academic)
The key properties of these solutions will be discussed hereafter.
HyperGraphQL
HyperGraphQL [3] is an open-source GraphQL interface wrapper around RDF sources, maintained by Semantic Integration Ltd. The main goal of this system is to support federated querying over multiple RDF sources and make it accessible through a single interface.
Each HyperGraphQL instance uses a configuration file and an annotated GraphQL schema. The configuration file is used to define the RDF services to fetch data from. The annotated GraphQL schema is a GraphQL schema that is annotated with GraphQL directives that indicate for each type and argument in which RDF service the type can be fetched, and what their full URI is.
Users can send plain GraphQL queries to an HyperGraphQL instance, and the instance will reply with a GraphQL response that is enhanced with a JSON-LD context [7]. This implicitly transforms the GraphQL response into RDF. Optionally, the server provides different content types to convert the response into alternative RDF serializations.
TopBraid
TopBraid [4] is a commercial set of semantic data governance tools by TopQuadrant. In a recent release, GraphQL was added as a way to query the stored RDF.
TopBraid’s approach is similar to HyperGraphQL. It also uses GraphQL schemas to expose a GraphQL interface over RDF sources. TopBraid’s schemas are automatically generated based on data shape definitions, which can be defined using SHACL [8].
These schemas are more expressive than HyperGraphQL’s schema, as it supports things like SHACL-based constraint filters, full-text-search, aggregations, ordering and field transformations. These features are defined via custom query functions in the schema. In contrast to HyperGraphQL, TopBraid attaches no JSON-LD context to its GraphQL query responses.
Stardog
Stardog [5] is a commercial graph store that recently (as of release 5.3.3) adds GraphQL query support.
Stardog makes GraphQL schemas optional for querying RDF using GraphQL.
If no schema is used, then GraphQL term to RDF conversion will be done using the default namespace in that graph,
but this can be overridden by using the @prefix
directive inside GraphQL queries.
Stardog assumes that the top-level node of any query is assumed to refer to a type. Any deeper query nodes refer to predicate links from the parent node.
Stardog also adds several features to the GraphQL query side. Compared to TopBraid, Stardog uses directives for this, instead of dedicated arguments. It adds support for functionality such as filters and bindings.
Just like TopBraid, no JSON-LD context is attached to its GraphQL query responses.
GraphQL-LD
Finally, GraphQL-LD [6] is an academic approach that is able to convert GraphQL queries to SPARQL [9], and SPARQL results to the corresponding GraphQL query results, independent of a query engine. As an example, GraphQL-LD was implemented in the Comunica engine [10], and can be tried out via a Web interface. Compared to the previously explained approaches, it does not use the GraphQL schema at all, but exploits JSON-LD contexts [7] to handle the conversion of GraphQL terms to RDF terms. This offers a separation of concerns where developers can use simple GraphQL queries, and domain experts can offer the JSON-LD context which semantifies these queries.
GraphQL-LD also supports custom features such as filtering and ordering through directives.
GraphQL query responses in this approach are plain JSON objects, and are compatible with the user-provided JSON-LD context. As such, responses are usable as both JSON and RDF.
Conclusions
In the previous section, we provided an overview of four approaches for querying RDF with GraphQL queries. These approaches are significantly different and mostly incompatible with each other. As such, one GraphQL query that works for one approach will most likely not work in the other approaches, or it can have different semantics.
Therefore, it is important to achieve a better understanding of the different ways of querying RDF with GraphQL. Ideally, and if possible, a consensus should reached on a single way.
In order to do this, we propose the following plan of action:
- Determine the different methods of querying RDF with GraphQL.
- Decompose and categorize the existing approaches in these methods.
- Determine the advantages and disadvantages of each method.
- Propose a standard approach and semantics for querying RDF with GraphQL.
Clarifying the differences and a path to unification will lower the barrier for developers that have knowledge on GraphQL to start querying RDF.