Why while querying ontologies we have to load the ontology, also provide its namespace? - namespaces

I wonder why we have to load an ontology, also provide its namespace while querying it? Why loading the ontology is not enough?
To understand my question better, here is a sample code:
g = rdflib.Graph()
g.parse('ppp.owl', format='turtle')
ppp = rdflib.Namespace('http://purl.org/xxx/ont/ppp/')
g.bind('ppp', ppp)
In line 2, we have opened the ontology (ppp.owl), but in line 3 we also provided its namespace. Does namespace show the program how to handle the ontology?
Cheers,
RF

To specify an element over the semantic web you need its URI: Unique Resource Identifier, which is composed of the namespace and the localname. For example, consider Person an RDF class; how would you differentiate the Person DBpedia class http://dbpedia.org/ontology/Person from Person in some other ontology somewhere? you need the namespace http://dbpedia.org/ontology/ and the local name Person. Which both uniquely identify the class.
Now coming back to your specific question, when you query the ontology, you might use multiple namespaces, some namespaces may not be the one of your ontology. You need other namespaces for querying your own ontology, e.g. rdf, rdfs, and owl. As an example, you can rarely write an arbitrary query without rdf:type property, which is included under the rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns> namespace, not your ontology namespace. As a consequence, you need to specify the namespace.
Well, now as you should know why to use a namespace, then we can proceed. Why to repeat the whole string of the namespace each time it is needed? It is nothing more than a prefix string appended to the local names to use in the query, to avoid writing exhaustively the full uri. See the difference between <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> and type.
Edit
As #AKSW says, as a conclusion, there is no need to declare a namespace in order to work with the ontology but it increases the convenience when working quite often with resources whose URI has particular namespace.

Related

SHACL validate existing domain and range definitions?

I want to validate the existing rdfs:domain and rdfs:range statements of an existing ontology and knowledge base with SHACL.
However it seems like it is extremely verbose to do that with SHACL.
Existing Definition
:prop1 a owl:ObjectProperty;
rdfs:domain ?A;
rdfs:range ?B.
In SHACL
:AShape a sh:NodeShape;
sh:targetClass :A;
sh:property [sh:path :prop1];
sh:closed true.
:ADomainShape a sh:NodeShape;
sh:targetSubjectsOf :prop1;
sh:class :A.
:prop1RangeShape a sh:NodeShape;
sh:targetObjectsOf :prop1;
sh:ClassB.
When you have dozens of properties, this adds a large amount of ceremony over something that is already declared in the original domain and range statements.
While it is possible to speed up the process using a script or a multiline editor, it still seems unnecessary to me.
Is there a way to tell a SHACL validator like PySHACL to just validate the existing domain and range statements without requiring all those extra triples?
Let me start by saying that rdfs:domain and rdfs:range are not constraints and don't mean what you imply. They are merely producing inferences. Having said this, many people have in the past used them to "mean" constraints simply because there was no other modeling language. For background, see
https://www.topquadrant.com/owl-blog/
If you don't want to duplicate the RDFS triples as individual SHACL constraints, you can write a single generic SHACL shape that has sh:targetSubjectsOf rdfs:domain (and range) and then uses a SPARQL constraint to check that there is no instance of a class other than the domain class that has values for the given property. The end result would be that all rdfs:domain statements would be checked at once.
But arguably RDFS should be left alone. If you want to use closed-world semantics you should use a language that has been designed for that purpose, i.e. SHACL.
(BTW there is a Discord group in case you want a higher bandwidth to discuss such things https://twitter.com/HolgerKnublauch/status/1461590465304662019)

Namespaces equivalence and deprecation in OWL/RDF

I am creating an ontology based on OWL/RDF/RDFS. My first ontology schema has namespaces as :
#prefix abc: https://example.com/a#
I want to change the namespace the next version of the ontology as
#prefix def: https://example-new.com/b#
But I dont want the previous users of the ontology to be affected at all. I was thinking if there is a way to define equivalent namespaces, and classify that the first name space will be deprecated. I am not sure if there is any provision in the OWL/RDF or even Dublin-core to so do.
Any help is appreciated. Thanks.
You can't do this at the namespace level but you can declare all old classes and properties as equivalent to the new ones; you'd keep the old IRIs in the equivalent axioms and declaration axioms only. Then any third part that uses a reasoner would be able to run queries just as before; parties not using a reasoner would have to rewrite their queries following equivalent axioms (something that they might already be doing for other similar use cases).

IBM Websphere scripting ID 'containment paths' explained (I.e. AdminConfig.getid(...) )

I have been digging through IBM knowledge center and have returned frustrated at the lack of definition to some of their scripting interfaces.
IBM keeps talking about a containment path used in their id definitions and I am only assuming that it corresponds to an xml element within file in the websphere config folder hierachy, but that's only from observation. I haven't found this declared in any documentation as yet.
Also, to find an ID, there is a syntax that needs to be used that will retrieve it, yet I cannot find a reference to the possible values of 'types' used in the AdminConfig.getid(...) function call.
So to summarize I have a couple questions:
What is the correct definition of the id "heirachy" both for fetching an id and for the id itself?
e.g. to get an Id of the server I would say something like: AdminConfig.getid('/Cell:MYCOMPUTERNode01Cell/Node:MYCOMPUTERNode01/Server:server1'), which would give me an id something like: server1(cells/MYCOMPUTERNode01Cell/nodes/MYCOMPUTERNode01/servers/server1|server.xml#server_1873491723429)
What are the possible /type values in retrieving the id from the websphere server?
e.g. in the above example, /Cell, /Node and /Server are examples of type values used in queries.
UPDATE: I have discovered this document that outlines the locations of various configuration files in the configuration directory.
UPDATE: I am starting to think that the configuration files represent complex objects with attributes and nested attributes. not every object with an 'id' is query-able through the AdminConfig.getid(<containment_path>). Object attributes (and this is not attributes in the strict xml sense as 'attributes' could be nested nodes with a simple structure within the parent node) may be queried using AdminConfig.showAttribute(<element_id>, <attribute_name>). This function will either return the string value of the inline attribute of the element itself or a string list representation of the ids of the nested attribute node(s).
UPDATE: I have discovered the AdminConfig.types() function that will display a list of all manipulatible object types in the configuration files and the AdminConfig.parent() function that displays what nodes are considered parents of the specified node.
Note: The AdminConfig.parent() function does not reveal the parents in order of their hierachy, rather it seems to just present a list of parents. e.g. AdminConfig.parent('JDBCProvider') gives us this exact list: 'Cell\nDeployment\nNode\nServer\nServerCluster' and even though Server comes before ServerCluster , running AdminConfig.parent('Server') reveals it's parents as: 'Node\nServerCluster'. Although some elements can have no parents - e.g. Cell - Some elements produce an error when running the parent function - e.g. Deployment.
Due to the apparent lack of a reciprocal AdminConfig.children() function, It appears as though obtaining a full hierachical tree of parents/children lies in calling this function on each of the elements returned from the AdminConfig.parent(<>) call and combining the results. That being said, a trial and error approach is mostly fruitful due to the sometimes obvious ordering.

What are the actual advantages of the visitor pattern? What are the alternatives?

I read quite a lot about the visitor pattern and its supposed advantages. To me however it seems they are not that much advantages when applied in practice:
"Convenient" and "elegant" seems to mean lots and lots of boilerplate code
Therefore, the code is hard to follow. Also 'accept'/'visit' is not very descriptive
Even uglier boilerplate code if your programming language has no method overloading (i.e. Vala)
You cannot in general add new operations to an existing type hierarchy without modification of all classes, since you need new 'accept'/'visit' methods everywhere as soon as you need an operation with different parameters and/or return value (changes to classes all over the place is one thing this design pattern was supposed to avoid!?)
Adding a new type to the type hierarchy requires changes to all visitors. Also, your visitors cannot simply ignore a type - you need to create an empty visit method (boilerplate again)
It all just seems to be an awful lot of work when all you want to do is actually:
// Pseudocode
int SomeOperation(ISomeAbstractThing obj) {
switch (type of obj) {
case Foo: // do Foo-specific stuff here
case Bar: // do Bar-specific stuff here
case Baz: // do Baz-specific stuff here
default: return 0; // do some sensible default if type unknown or if we don't care
}
}
The only real advantage I see (which btw i haven't seen mentioned anywhere): The visitor pattern is probably the fastest method to implement the above code snippet in terms of cpu time (if you don't have a language with double dispatch or efficient type comparison in the fashion of the pseudocode above).
Questions:
So, what advantages of the visitor pattern have I missed?
What alternative concepts/data structures could be used to make the above fictional code sample run equally fast?
For as far as I have seen so far there are two uses / benefits for the visitor design pattern:
Double dispatch
Separate data structures from the operations on them
Double dispatch
Let's say you have a Vehicle class and a VehicleWasher class. The VehicleWasher has a Wash(Vehicle) method:
VehicleWasher
Wash(Vehicle)
Vehicle
Additionally we also have specific vehicles like a car and in the future we'll also have other specific vehicles. For this we have a Car class but also a specific CarWasher class that has an operation specific to washing cars (pseudo code):
CarWasher : VehicleWasher
Wash(Car)
Car : Vehicle
Then consider the following client code to wash a specific vehicle (notice that x and washer are declared using their base type because the instances might be dynamically created based on user input or external configuration values; in this example they are simply created with a new operator though):
Vehicle x = new Car();
VehicleWasher washer = new CarWasher();
washer.Wash(x);
Many languages use single dispatch to call the appropriate function. Single dispatch means that during runtime only a single value is taken into account when determining which method to call. Therefore only the actual type of washer we'll be considered. The actual type of x isn't taken into account. The last line of code will therefore invoke CarWasher.Wash(Vehicle) and NOT CarWasher.Wash(Car).
If you use a language that does not support multiple dispatch and you do need it (I can honoustly say I have never encountered such a situation though) then you can use the visitor design pattern to enable this. For this two things need to be done. First of all add an Accept method to the Vehicle class (the visitee) that accepts a VehicleWasher as a visitor and then call its operation Wash:
Accept(VehicleWasher washer)
washer.Wash(this);
The second thing is to modify the calling code and replace the washer.Wash(x); line with the following:
x.Accept(washer);
Now for the call to the Accept method the actual type of x is considered (and only that of x since we are assuming to be using a single dispatch language). In the implementation of the Accept method the Wash method is called on the washer object (the visitor). For this the actual type of the washer is considered and this will invoke CarWasher.Wash(Car). By combining two single dispatches a double dispatch is implemented.
Now to eleborate on your remark of the terms like Accept and Visit and Visitor being very unspecific. That is absolutely true. But it is for a reason.
Consider the requirement in this example to implement a new class that is able to repair vehicles: a VehicleRepairer. This class can only be used as a visitor in this example if it would inherit from VehicleWasher and have its repair logic inside a Wash method. But that ofcourse doesn't make any sense and would be confusing. So I totally agree that design patterns tend to have very vague and unspecific naming but it does make them applicable to many situations. The more specific your naming is, the more restrictive it can be.
Your switch statement only considers one type which is actually a manual way of single dispatch. Applying the visitor design pattern in the above way will provide double dispatch.
This way you do not necessarily need additional Visit methods when adding additional types to your hierarchy. Ofcourse it does add some complexity as it makes the code less readable. But ofcourse all patterns come at a price.
Ofcourse this pattern cannot always be used. If you expect lots of complex operations with multiple parameters then this will not be a good option.
An alternative is to use a language that does support multiple dispatch. For instance .NET did not support it until version 4.0 which introduced the dynamic keyword. Then in C# you can do the following:
washer.Wash((dynamic)x);
Because x is then converted to a dynamic type its actual type will be considered for the dispatch and so both x and washer will be used to select the correct method so that CarWasher.Wash(Car) will be called (making the code work correctly and staying intuitive).
Separate data structures and operations
The other benefit and requirement is that it can separate the data structures from the operations. This can be an advantage because it allows new visitors to be added that have there own operations while it also allows data structures to be added that 'inherit' these operations. It can however be only applied if this seperation can be done / makes sense. The classes that perform the operations (the visitors) do not know the structure of the data structures nor do they have to know that which makes code more maintainable and reusable. When applied for this reason the visitors have operations for the different elements in the data structures.
Say you have different data structures and they all consist of elements of class Item. The structures can be lists, stacks, trees, queues etc.
You can then implement visitors that in this case will have the following method:
Visit(Item)
The data structures need to accept visitors and then call the Visit method for each Item.
This way you can implement all kinds of visitors and you can still add new data structures as long as they consist of elements of type Item.
For more specific data structures with additional elements (e.g. a Node) you might consider a specific visitor (NodeVisitor) that inherits from your conventional Visitor and have your new data structures accept that visitor (Accept(NodeVisitor)). The new visitors can be used for the new data structures but also for the old data structures due to inheritence and so you do not need to modify your existing 'interface' (the super class in this case).
In my personal opinion, the visitor pattern is only useful if the interface you want implemented is rather static and doesn't change a lot, while you want to give anyone a chance to implement their own functionality.
Note that you can avoid changing everything every time you add a new method by creating a new interface instead of modifying the old one - then you just have to have some logic handling the case when the visitor doesn't implement all the interfaces.
Basically, the benefit is that it allows you to choose the correct method to call at runtime, rather than at compile time - and the available methods are actually extensible.
For more info, have a look at this article - http://rgomes-info.blogspot.co.uk/2013/01/a-better-implementation-of-visitor.html
By experience, I would say that "Adding a new type to the type hierarchy requires changes to all visitors" is an advantage. Because it definitely forces you to consider the new type added in ALL places where you did some type-specific stuff. It prevents you from forgetting one....
This is an old question but i would like to answer.
The visitor pattern is useful mostly when you have a composite pattern in place in which you build a tree of objects and such tree arrangement is unpredictable.
Type checking may be one thing that a visitor can do, but say you want to build an expression based on a tree that can vary its form according to a user input or something like that, a visitor would be an effective way for you to validate the tree, or build a complex object according to the items found on the tree.
The visitor may also carry an object that does something on each node it may find on that tree. this visitor may be a composite itself chaining lots of operations on each node, or it can carry a mediator object to mediate operations or dispatch events on each node.
You imagination is the limit of all this. you can filter a collection, build an abstract syntax tree out of an complete tree, parse a string, validate a collection of things, etc.

Is having a single massive class for all data storage OK?

I have created a class that I've been using as the storage for all listings in my applications. The class allows me to "sign" an object to a listing (which can be created on the fly via the sign() method like so):
manager.sign(myObject, "someList");
This stores the index of the element (using it's unique id) in the newly created or previously created listing "someList" as well as the object in a 2D array. So for example, I might end up with this:
trace(_indexes["someList"][objectId]); // 0 - the object is the first in this list
trace(_instances["someList"]); // [object MyObject]
The class has another two methods:
find(signature:String):Array
This method returns an array via slice() containing all of the elements signed with the given signature.
findFirst(signature:String):Object
This method just returns the first object in a given listing
So to retrieve myObject I can either go:
trace(find("someList")[0]); or trace(findFirst("someList"));
Finally, there is an unsign() function which will remove an object from a given listing. This function basically:
Stores the result of pop() in the specified listing against a variable.
Uses the stored index to quickly replace the specified object with the pop()'d item.
Deletes the stored index for the specified object and updates the index for the pop()'d item.
Through all this, using unsign() will remove an object extremely quickly from a listing of any size.
Now this is all well and good, but I've had some thoughts which are making me consider how good this really is? I mean being able to easily list, remove and access lists of anything I want throughout the application like this is awesome - but is there a catch?
A couple of starting thoughts I have had are:
So far I haven't implemented support for listings that are private and only accessible via a given class.
Memory - this doesn't seem very memory efficient. Then again, neither is creating arrays for everything I want to store individually either. Just seems.. Larger.. Somehow.
Any insights?
I've uploaded the class here in case the above doesn't make much sense: https://projectavian.com/AviManager.as
Your solution seems pretty solid. If you're looking to modify it to be a bit more extensible and handle rights management, you might consider moving all those individually indexed properties to a value object for your AV elements. You could perform operations like "sign" and "unsign" internally in the VOs, or check for access rights. Your management class could monitor the collection of these VOs, pass them around, perform the method calls, and the objects would hold the state in a bit more readable format.
Really, though, this is entering into a coding style discussion. Your method works and it's not particularly inefficient. Just make sure the code is readable, encapsulated, and extensible and you're good.