How do I ensure keys of a key-value mapping are unique in SHACL?

Suppose my data models a key-value mapping. For example, I run a fancy hotel and want to keep track of my guests' orders for every meal. How do I ensure that for each meal I get every guest's order (i.e. all keys are present) and that each guest has only one order (i.e. all keys are unique)?
Example code to get us started:
Shapes:
ex:MealShape
a sh:NodeShape ;
sh:targetClass ex:Meal ;
sh:property [
sh:path ex:orders ;
sh:minCount 2 ;
sh:maxCount 2 ;
sh:node ex:OrderShape ;
] .
ex:OrderShape
a sh:NodeShape ;
sh:targetClass ex:Order ;
sh:property [
sh:path ex:guest ;
# The guest list! Code maintenance should happen here.
sh:in ( "James" "Margaret" ) ;
sh:minCount 1 ;
sh:maxCount 1 ;
] ;
sh:property [
sh:path ex:order ;
sh:datatype xsd:string ;
sh:minCount 1 ;
sh:maxCount 1 ;
] .
Data:
## Guests
ex:james ex:name "James" .
ex:margaret ex:name "Margaret" .
## Meals
### Valid meal
ex:breakfast
a ex:Meal ;
ex:orders [ ex:guest "James" ; ex:order "Eggs" ] ;
ex:orders [ ex:guest "Margaret" ; ex:order "Cereal" ] .
### DESIRED TO BE invalid meal
### currently does not cause a validation result
ex:lunch
a ex:Meal ;
ex:orders [ ex:guest "James" ; ex:order "Salad" ] ;
# Problem: James placed two orders, Maggie placed zero
ex:orders [ ex:guest "James" ; ex:order "Burger" ] .
One solution I am aware of is to use sh:qualifiedValueShape with its sh:qualifiedMinCount and sh:qualifiedMaxCount constraints separately for each key in the list. However, for larger "guest lists" this becomes hard to maintain. In my work, I have a list of roughly 40 keys. With 40 blocks of sh:qualifiedValueShape statements, it becomes impractical to inspect the list (and I have already scripted the generation of those statements in the first place).
I have searched the documentation but could not find a kind of "sh:disjointInScope" or "sh:uniqueFromList" statement I wanted (crucially, the constraint should not force the value to be unique in the entire data graph, since e.g. James might appear in several orders). How do I get the desired behavior in human-readable SHACL code?

If I understand your particular scenario correctly, then this should give you the violation:
ex:MealShape
sh:property [
sh:path ( ex:orders ex:guest ) ;
sh:maxCount 2 ;
sh:minCount 2 ;
] ...
The mechanism here is that a path expression (ex:orders/ex:guest in SPARQL notation) is used to state that there must be exactly two distinct guests per Meal. Because the value nodes of a path form a set, two orders by the same guest count only once. Since the shape also requires exactly two ex:orders per Meal, each guest can only be part of one order. Combined with the sh:in, this makes sure that only the allowed keys are present, and all of them. You do, however, need to keep the sh:minCount and sh:maxCount values in sync with the length of the sh:in list, so I am not sure how manageable that would be.
You can probably further generalize this pattern with the help of SHACL-SPARQL, e.g. to introduce the higher-level constraint components that you are talking about.
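To make the counting argument concrete, here is a minimal Python sketch of what the path-based count checks. It is not a SHACL engine: the dictionary `MEALS`, the set `GUEST_LIST`, and the helpers `path_values` and `meal_conforms` are hypothetical names that only mirror the example data and the idea that the values reached via ex:orders/ex:guest are treated as a set.

```python
# Each meal maps to its orders; each order is a (guest, dish) pair.
# This mirrors the example data from the question, not a real RDF graph.
MEALS = {
    "breakfast": [("James", "Eggs"), ("Margaret", "Cereal")],
    "lunch": [("James", "Salad"), ("James", "Burger")],
}
GUEST_LIST = {"James", "Margaret"}  # plays the role of the sh:in list


def path_values(orders):
    """Values reached via ex:orders/ex:guest: a set, so duplicates collapse."""
    return {guest for guest, _dish in orders}


def meal_conforms(orders, guest_list=GUEST_LIST):
    guests = path_values(orders)
    # Stand-in for sh:minCount / sh:maxCount set to the guest list length.
    exact_count = len(guests) == len(guest_list)
    # Stand-in for sh:in on each order's guest.
    allowed_only = guests <= guest_list
    return exact_count and allowed_only


results = {meal: meal_conforms(orders) for meal, orders in MEALS.items()}
print(results)  # {'breakfast': True, 'lunch': False}
```

Lunch fails because both orders lead to the same guest, so only one distinct value is counted where two are required.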

Related

Simple way to model "inverse cardinality" in SHACL?

We want to transform a UML diagram of an ontology with cardinalities into a SHACL shape to validate if the cardinalities in our data are correct.
Let's say we have Author 1 ---first author---> 1..n Book. The right part is quite easy to model as:
:AuthorShape a sh:NodeShape;
sh:targetClass :Author;
sh:property [sh:path :firstAuthor; sh:minCount 1].
However, now I also want to model the "other end", i.e. that a book cannot have more than one first author:
:FirstAuthorCardinalityOtherEndShape a sh:NodeShape;
sh:targetObjectsOf :firstAuthor;
sh:property [
sh:path [ sh:inversePath :firstAuthor ];
sh:minCount 1;
sh:maxCount 1
];
sh:nodeKind sh:IRI.
However, that looks quite convoluted (8 lines instead of 3) and error-prone (:firstAuthor is mentioned twice). Is there a simpler way to model this?
For example, this could be like this, but sh:inverseMinCount doesn't exist:
:AuthorShape a sh:NodeShape;
sh:targetClass :Author;
sh:property [sh:path :firstAuthor; sh:minCount 1; sh:inverseMinCount 1; sh:inverseMaxCount 1].
The issue that :firstAuthor is mentioned twice can be avoided by attaching the property to Book, e.g.
:BookShape a sh:NodeShape ;
sh:targetClass :Book ;
sh:property [
sh:path [ sh:inversePath :firstAuthor ] ;
sh:maxCount 1 ;
] .
(You already have AuthorShape, so having BookShape would be a perfectly natural thing to do).
In any case, you wouldn't need the sh:minCount 1 because the sh:targetObjectsOf already implies it, although I can see why you would want it from a UML point of view.
And I don't think the design above is much more complex than the forward direction, assuming you're OK with the sh:inversePath overhead, which is unavoidable.

How to create a SHACL rule to infer rdf:type from rdfs:subClassOf

In order to validate my RDF graph against my SHACL validation shapes V, I want to infer some triples to keep my shapes simple. In particular, one of the rules I need to implement is (in pseudocode):
(?s, rdf:type, :X) <-- (?s, rdfs:subClassOf, :Y)
I was trying several implementations, ending up with this triple rule (and its variants):
@prefix sh: <http://www.w3.org/ns/shacl#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix : <http://example.com/ex#> .
:s
a sh:NodeShape ;
sh:targetClass rdfs:Resource ;
sh:rule [
a sh:TripleRule ;
sh:subject sh:this ;
sh:predicate rdf:type ;
sh:object :X ;
sh:condition [ sh:property [ sh:path rdfs:subClassOf ;
sh:hasValue :Y ] ]
] .
However the rule does not infer :A rdf:type :X . for data graph
:A rdfs:subClassOf :Y .
(Executing against https://github.com/TopQuadrant/shacl). It is possible to solve this issue with a SPARQL rule, so my question is whether there is an option to do it through Triple rule as well. Thanks for hints!
Why don't you keep the inference rules and the validation separate? As you've noted, this is possible using SHACL + SPARQL, and it will keep things simpler.
You could use pySHACL and put rules into an ontology file since pySHACL can run ontology rules/inference before applying SHACL validators (see the -i and -e options).
Given the "MAY" in the following quote, the advice in the previous answer by @NicholasCar is solid IMO.
The purpose of answering here is just to corroborate and expand on it with recent experience.
The 2017 W3C SHACL docs regarding Relationship between SHACL and RDFS inferencing:
SHACL implementations MAY, but are not required to, support entailment
regimes. If a shapes graph contains any triple with the predicate
sh:entailment and object E and the SHACL processor does not support E
as an entailment regime for the given data graph then the processor
MUST signal a failure.
(AFAICT the phrase "entailment regime" only refers to SPARQL as standard)
Looking at the section on Property Paths:
SPARQL Property path: rdf:type/rdfs:subClassOf*
SHACL Property path: (rdf:type [ sh:zeroOrMorePath rdfs:subClassOf ] )
In most of the SHACL implementations I've played with basic rdfs type entailment works (obv IFF the rdf:type/rdfs:subClassOf* path is visible to the SHACL validator), so (rdf:type [ sh:zeroOrMorePath rdfs:subClassOf ]) isn't needed explicitly.
The problem comes when you try to stuff advanced paths into the shapes, e.g. following this example to enforce that the graph contains at least one instance of an abstract type:
sh:path [ sh:inversePath ( rdf:type [ sh:zeroOrMorePath rdfs:subClassOf ] ) ] ;
... isn't working for me in a number of SHACL validation implementations.

Assigning patch values from raster data in NetLogo

I am attempting to assign values to patches in NetLogo based upon raster values: 0, 1, and 2. These patches need only relate to the values of my raster, which does display properly using a greyscale, and then 'paint' themselves the colors blue, green, and white, respectively.
This raster data loads fine using the gis extension. Following gis:load-dataset, I attempt to use the apply-raster command and ifelse in order to give options based on the values. I believe I am misusing a boolean operator but very few examples online are as extensive as what I am attempting.
patches-own [value]
; Draws raster dataset (terrain of each Millennium)
to display-terrain
gis:paint terrain 62
ask patches [
(ifelse
value = 0 [
set pcolor blue
]
value = 1 [
set pcolor green
]
; elsecommands
[
set pcolor white
])
]
end
I currently cannot tell if the values are properly assigned and keep receiving the error that 'ifelse expects this to be a command block' so I assume the formatting is incorrect and/or a value association is missing.
Actually, you are using it exactly as the documentation says to use it, but you probably don't have the current version. The multiple choice ifelse is brand new in NetLogo v6.0.4. You need to explicitly include the cf extension and you need the extension name when calling the new ifelse syntax.
Earlier versions of NetLogo won't do this at all. The syntax you have is for v6.1 which has been released only in the last couple of weeks.
Try this for v6.0.4:
extensions [cf]
patches-own [value]
to testme
clear-all
ask patches [ set value one-of [0 1 2] ]
ask patches [
(cf:ifelse
value = 0 [
set pcolor blue
]
value = 1 [
set pcolor green
]
; elsecommands
[
set pcolor white
])
]
end

SPARQL construct/insert query and blank nodes

I'm trying to create a SPARQL query to construct or insert graphs, following the BIBFRAME 2.0 Model, using a personal database with a lot of data. I want to get a result like this:
Subject a bf:Topic, madsrdf:ComplexSubject ;
rdfs:label "Subject" ;
madsrdf:componentList [ a madsrdf:Topic ;
madsrdf:authoritativeLabel "FirstSubject" ] ;
But I do not know how to do it in SPARQL. I tried this query, but I always get a lot of blank nodes (as many as there are records with an empty "?Subject" field in my database):
PREFIX bf: <http://id.loc.gov/ontologies/bibframe/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
prefix madsrdf: <http://www.loc.gov/mads/rdf/v1#>
CONSTRUCT{
?subject a bf:Topic, madsrdf:ComplexSubject ;
rdfs:label ?subject;
madsrdf:componentList [ a madsrdf:Topic ;
madsrdf:authoritativeLabel ?firstsubject ];
} where{ service <http://localhost:.......> {
?registerRow a <urn:Row> ;
OPTIONAL{?registerRow <urn:col:Subject> ?subject ;}
OPTIONAL{?registerRow <urn:col:FirstSubject> ?firstsubject ;}
}
}
@Wences, AKSW already answered you; please read more carefully.
You don't use ?registerRow in the CONSTRUCT part, which is why the template is instantiated once for each row.
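To illustrate where the extra blank nodes come from, here is a rough Python sketch of how a CONSTRUCT template is filled in once per solution row. It is a simulation under stated assumptions, not a SPARQL engine: `ROWS`, `instantiate_template`, and `stray_blank_nodes` are hypothetical names, and the row values are placeholders. It relies on two standard SPARQL behaviours: a `[ ... ]` in the template produces a fresh blank node for every solution, and template triples containing an unbound variable are omitted.

```python
from itertools import count

# Hypothetical solution rows: one complete row and two rows where the
# OPTIONAL patterns matched nothing, so the variables are unbound.
ROWS = [
    {"subject": "ex:Subject", "firstsubject": "FirstSubject"},
    {},
    {},
]


def instantiate_template(rows):
    """Mimic CONSTRUCT template instantiation, once per solution row."""
    fresh = count(1)
    triples = []
    for row in rows:
        # "[ ... ]" in a template yields a new blank node for every solution.
        bnode = f"_:b{next(fresh)}"
        subject = row.get("subject")
        first = row.get("firstsubject")
        # Triples that mention an unbound variable are left out.
        if subject is not None:
            triples.append((subject, "rdf:type", "bf:Topic"))
            triples.append((subject, "madsrdf:componentList", bnode))
        # This triple has no variables, so it is emitted for every row.
        triples.append((bnode, "rdf:type", "madsrdf:Topic"))
        if first is not None:
            triples.append((bnode, "madsrdf:authoritativeLabel", first))
    return triples


def stray_blank_nodes(triples):
    """Blank nodes that no other triple points to."""
    linked = {o for _s, _p, o in triples if o.startswith("_:")}
    subjects = {s for s, _p, _o in triples if s.startswith("_:")}
    return sorted(subjects - linked)


all_rows = instantiate_template(ROWS)
bound_only = instantiate_template([r for r in ROWS if "subject" in r])
print(stray_blank_nodes(all_rows))    # ['_:b2', '_:b3']
print(stray_blank_nodes(bound_only))  # []
```

In this sketch, dropping the rows without a subject before instantiation removes the stray nodes. In the query, the corresponding change would presumably be to make the Subject pattern required instead of OPTIONAL, but that depends on what the rest of the data looks like.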

Assign turtle variables from csv in Netlogo

I want turtles to read and adopt data from a CSV file. I have written the following code. The problem is that even though the data gets loaded, I'm unable to make the individual turtles take on each of the income values. Any assistance would be appreciated.
extensions [csv]
breed [households household]
households-own [income]
globals [income-data]
to setup
load-income-data
setup-households
end
to load-income-data
set income-data []
file-open "income.csv"
while [ not file-at-end? ]
[ set income-data sentence income-data ( file-read-line)
]
user-message "income data loading complete!"
file-close
end
to setup-households
create-households 700
ask one-of households
[ setxy random-xcor random-ycor
set income income-data
]
end
Have a look at the File Input Example in the NetLogo Models Library (Code Examples). You need to use a foreach to loop through the imported values / agents.