Funding year 2023 / Scholarship Call #18 / Project ID: 6794 / Project: Combining SHACL and Ontologies
My third blog post consisted of a short introduction to core universal models: structures that are general enough to represent all models, without any redundancy. In this post, we discuss capturing the structure of such models in SHACL constraints.
For more information on core structures, please visit (https://www.netidee.at/combining-shacl-and-ontologies/shacl-and-owl-part-3). More discussion on models can be found in (https://www.netidee.at/combining-shacl-and-ontologies/short-introduction-models).
The main theme of this blog series has been to argue why validating SHACL over the core universal model of some RDF data and OWL axioms is a promising approach. However, computing the core universal model is rather expensive and may even loop forever, producing the same structures over and over. The good news is that there is a solution to this: encoding the axioms in the constraints. As a result, the expensive construction of core universal models can be avoided.
Rewriting SHACL
More precisely, based on the original set of constraints C and a set of ontology axioms T, we want to produce a set of SHACL constraints C’ such that testing validation of C’ over the original dataset gives the same result as testing validation of C over the core universal model of T and the same dataset.
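Stated compactly, the goal is the following equivalence. The notation is an assumption made for this post only: G ranges over datasets, core(T, G) stands for the core universal model of T and G, and the symbol ⊨ is read as "validates".

```latex
\forall G:\quad G \models C' \;\iff\; \mathrm{core}(T, G) \models C
```

The left-hand side only mentions the original dataset G, which is exactly what makes C’ attractive.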
To achieve this, we use the fact that we can compute what the core universal model looks like in small pieces. The good successor configuration is a function that takes as input a part of the model and returns exactly which successors are required at that point to obtain the core universal model. Feeding one of these successors into the same function again lets us move further down in the model. In this way, we can represent the possibly infinite core universal model in a finite way. That is, we avoid mentioning repeating structures more than once.
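To make the idea of a finite representation more tangible, here is a minimal Python sketch. It is a toy, not the good successor configuration from [1] and [2]: it assumes that the "part of the model" is just the set of class names of a node, that all axioms have the simple form "every A has an r-successor in B", and it ignores the redundancy removal that makes the real model a core. The names `AXIOMS`, `successors` and `finite_representation` are hypothetical.

```python
from typing import Dict, FrozenSet, List, Set, Tuple

NodeType = FrozenSet[str]            # the class names of one node
Successor = Tuple[str, NodeType]     # (property, type of the successor node)

# Hypothetical toy axioms "every A has an r-successor in B", stored as A -> [(r, B)].
AXIOMS: Dict[str, List[Tuple[str, str]]] = {
    "Person": [("hasParent", "Person")],   # forces an infinite chain of parents
    "Student": [("enrolledIn", "Course")],
}


def successors(node_type: NodeType) -> Set[Successor]:
    """Return the successors that the toy axioms require below a node of this type."""
    required: Set[Successor] = set()
    for class_name in node_type:
        for prop, filler in AXIOMS.get(class_name, []):
            required.add((prop, frozenset({filler})))
    return required


def finite_representation(root: NodeType) -> Dict[NodeType, Set[Successor]]:
    """Apply `successors` repeatedly, but visit every node type only once."""
    table: Dict[NodeType, Set[Successor]] = {}
    todo = [root]
    while todo:
        current = todo.pop()
        if current in table:
            continue  # repeating structure: already described, do not unfold again
        table[current] = successors(current)
        todo.extend(successor_type for _, successor_type in table[current])
    return table


# A node that is both a Person and a Student.
table = finite_representation(frozenset({"Person", "Student"}))
```

Although the chain of parents below a Person never ends, the table stays finite, because the entry for the type {Person} simply points back to itself.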
As one input of the good successor configuration determines the whole tree hanging below in the core universal model, we can also use the good successor configuration to compute, with a certain calculus, which constraints are satisfied where, given the input at the top. In the end, this means we can reduce SHACL constraints to constraints that only consider the data in the original dataset, as this data contains all information we need to determine which structures appear in its core universal model. For the technical details, we refer to [1] and [2].
Now suppose we have obtained this rewritten set of constraints C’ for a particular set of axioms T. Then, for any given dataset, we can consider C’ instead of C and skip the expensive and possibly non-terminating computation of the core universal model. This makes our proposed semantics not only intuitively appealing, but also far more feasible than the approach described before.
Even for a rather simple ontology T, where computing the core universal model is not too hard, our approach may give interesting results. Note that when multiple datasets are considered under the same ontology T, the core universal model has to be computed for every single dataset. The rewriting we propose only has to be performed once on the constraint set C, after which the result can be applied directly to all original datasets, and to any fresh dataset provided later.
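The "rewrite once, validate many times" pattern can be illustrated with a deliberately small Python sketch. Everything in it is an assumption made for illustration: the ontology contains only subclass axioms (so building the model is plain saturation), the single constraint says that every node of a target class must have a given property, and the names `SUBCLASS_OF`, `saturate`, `rewrite` and `validate` are hypothetical. It is not the rewriting of [1] and [2], only an instance of the same idea.

```python
from typing import Dict, List, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, property, object)

# Hypothetical toy ontology T: subclass axioms only.
SUBCLASS_OF: Dict[str, str] = {"Student": "Person", "Professor": "Person"}


def saturate(data: Set[Triple]) -> Set[Triple]:
    """Per-dataset work: add all implied type triples (stand-in for building the model)."""
    result = set(data)
    changed = True
    while changed:
        changed = False
        for subject, prop, obj in list(result):
            if prop == "type" and obj in SUBCLASS_OF:
                implied = (subject, "type", SUBCLASS_OF[obj])
                if implied not in result:
                    result.add(implied)
                    changed = True
    return result


def rewrite(target_class: str) -> Set[str]:
    """One-off work: collect the target class together with all of its subclasses."""
    classes = {target_class}
    changed = True
    while changed:
        changed = False
        for sub, sup in SUBCLASS_OF.items():
            if sup in classes and sub not in classes:
                classes.add(sub)
                changed = True
    return classes


def validate(target_classes: Set[str], required_property: str, data: Set[Triple]) -> bool:
    """Check that every node typed with a target class has the required property."""
    targets = {s for s, p, o in data if p == "type" and o in target_classes}
    having = {s for s, p, _ in data if p == required_property}
    return targets <= having


# The rewriting is computed once ...
rewritten_targets = rewrite("Person")

# ... and then reused for every dataset, without saturating any of them.
datasets: List[Set[Triple]] = [
    {("ann", "type", "Student"), ("ann", "name", "Ann")},
    {("bob", "type", "Professor")},
]
results = [validate(rewritten_targets, "name", data) for data in datasets]
```

In this toy setting, validating the rewritten constraint over the raw data agrees with validating the original constraint over the saturated data, while validating the original constraint over the raw data would miss the violation in the second dataset.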
Implementation
Clearly, a rewriting of constraints as described above is an essential step towards an implementation. However, because the current rewriting captures quite a lot of theory, it is rather heavy and causes an exponential blow-up of the constraints (in the size of the axiom set T and of the constraints). Although this may not even be too bad, it is worth exploring which fragments are powerful enough to capture relevant real-world examples while admitting a much simpler rewriting. Thus, the main open problems in this regard are identifying such fragments and deciding how to simplify the rewriting techniques for them.
References:
[1] S. Ahmetaj, M. Ortiz, A. Oudshoorn, M. Simkus. Reconciling SHACL and Ontologies: Semantics and Validation via Rewriting. ECAI 2023. (Reconciling_SHACL_and_Ontologies_Semantics_and_Val.pdf)
[2] A. Oudshoorn, M. Ortiz, M. Simkus. Reasoning with the Core Chase: the Case of SHACL Validation over ELHI Knowledge Bases. DL 2024. (Reasoning_with_the_core_chase.pdf)
Anouk Michelle Oudshoorn