Schematron: can I group namespaces?

Our content has elements from around 8 different namespaces. The Schematron rule below checks elements in just one of them. I want the rule to check elements in 6 of the namespaces but not in the other 2, and I'm not sure of the best way to do this. I am thinking of grouping the 6 namespaces under one single prefix and then using that prefix. Is this possible?
<pattern>
<rule context="def:para | def:para-text | def:block | def:quote-para | def:source-para | def:note-para" role="warning">
<report test="text()[contains(.,'www.')]">URLs should be marked up with a url tag</report>
</rule>
</pattern>

You can use the following construct:
*:ElementLocalName[namespace-uri()='http://namespace.tld']
If you're using XSLT2, you could declare a variable near the top of your schema like this:
<sch:let name="my6namespaces" value="('http://ns1.com', 'http://ns2.com', 'http://ns3.org', 'http://ns4.com', 'http://ns5.net', 'http://ns6.com')"/>
Then you can use it in your rule:
<pattern>
<rule context="*:para[namespace-uri()=$my6namespaces] | *:para-text[namespace-uri()=$my6namespaces] | *:block[namespace-uri()=$my6namespaces] | *:quote-para[namespace-uri()=$my6namespaces] | *:source-para[namespace-uri()=$my6namespaces] | *:note-para[namespace-uri()=$my6namespaces]" role="warning">
<report test="text()[contains(.,'www.')]">URLs should be marked up with a url tag</report>
</rule>
</pattern>
If it doesn't hurt performance too much, you could shorten the rule context like this:
<pattern>
<rule context="*[local-name()=('para', 'para-text', 'block', 'quote-para', 'source-para', 'note-para') and namespace-uri()=$my6namespaces]" role="warning">
<report test="text()[contains(.,'www.')]">URLs should be marked up with a url tag</report>
</rule>
</pattern>
Note, your rule as currently written will generate a successful-report whenever one of those elements has "www." in it. I think you might have intended to write it as an assert rather than a report, in which case it would generate a failed-assert whenever one of those elements doesn't have "www." in it.
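To make the "local name plus namespace URI" test concrete outside Schematron, here is a minimal Python sketch of the same check using only the standard library. The namespace URIs, the sample element names and the function names are placeholders for illustration, not part of the original schema, and the sketch only mirrors the rule's logic; it is not a replacement for a Schematron processor.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URIs standing in for the namespaces to be checked.
CHECKED_NAMESPACES = {"http://ns1.com", "http://ns2.com"}
CHECKED_LOCAL_NAMES = {
    "para", "para-text", "block", "quote-para", "source-para", "note-para",
}


def split_tag(tag):
    """Split ElementTree's '{uri}local' tag form into (uri, local)."""
    if tag.startswith("{"):
        uri, local = tag[1:].split("}", 1)
        return uri, local
    return "", tag


def find_unmarked_urls(xml_text):
    """Return local names of checked elements whose own text contains 'www.'."""
    root = ET.fromstring(xml_text)
    hits = []
    for el in root.iter():
        uri, local = split_tag(el.tag)
        if uri in CHECKED_NAMESPACES and local in CHECKED_LOCAL_NAMES:
            # Equivalent of text(): the element's direct text nodes only,
            # i.e. its leading text plus the tails of its child elements.
            text_nodes = [el.text or ""] + [child.tail or "" for child in el]
            if any("www." in t for t in text_nodes):
                hits.append(local)
    return hits
```

Text inside a child element (for example an existing url tag) is deliberately not inspected, which matches the behaviour of text() in the rule.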

In XML, you can't map multiple namespace URIs to the same prefix at the same time. The whole point of namespaces is to disambiguate same-named elements (and attributes) from different vocabularies, so the specification provides no way to conflate multiple namespaces.
Untested, but you could match on the elements in any namespace then test only elements that are not in the unwanted namespaces:
<pattern>
<rule context="*:para | *:para-text | *:block | *:quote-para |
*:source-para | *:note-para" role="warning">
<report test="self::*[not(self::a:*) and not(self::b:*)]/text()[contains(.,'www.')]"
>URLs should be marked up with a url tag</report>
</rule>
</pattern>
It's not clear from your question whether same-named elements can appear in multiple namespaces. The code above assumes that the same local name can be used in any namespace, and that the 'a' and 'b' prefixes are mapped to the namespace URIs of the namespaces that you don't want to test.

Related

Exclude elements from Xpath groups

I have a very complex HTML of a website. I want to select multiple groups of elements by relative Xpath. For example:
//div[@class="something"] | //span/div | //div/span/div[@class="otherthing"]
Once I have all elements selected I want to exclude some specific elements within the same xpath.
Let's say the xpath above returned 150 elements as a result. I want to exclude the following 3 xpaths which all point to 1 element each. So the end result should be 147 elements found:
//div[@id="menu"]
//div/span/div[@class="something3"]
//body/span/div/span/div[1]
How can I do this within 1 xpath with Xpath 1?
I have tried the solution here, however this only works while you are selecting one group you would like to exclude from: https://stackoverflow.com/a/74615054/12366148
I have also tried multiple ways to combine the 'not' and 'self' commands, however nothing seems to work. I can't use the except operator unfortunately as that only works in Xpath 2.0.
Given <xsl:variable name="ns1" select='//div[@class="something"] | //span/div | //div/span/div[@class="otherthing"]'/> and <xsl:variable name="ns2" select='//div[@id="menu"] | //div/span/div[@class="something3"] | //body/span/div/span/div[1]'/>, you can use e.g.
<xsl:variable name="ns1-except-ns2" select="$ns1[count(. | $ns2) != count($ns2)]"/>
if my XPath 1 recollection still works.
You haven't used an XSLT tag so I guess you want
(//div[@class="something"] | //span/div | //div/span/div[@class="otherthing"])[count(. | //div[@id="menu"] | //div/span/div[@class="something3"] | //body/span/div/span/div[1]) != count(//div[@id="menu"] | //div/span/div[@class="something3"] | //body/span/div/span/div[1])]
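The count() comparison works because adding a node to a set that already contains it does not change the set's size. A small Python sketch of that reasoning follows, using the limited path support in the standard library's ElementTree; the helper names and the sample markup are made up for illustration, and the paths are simplified versions of the ones in the question.

```python
import xml.etree.ElementTree as ET


def union(root, paths):
    """Node-set union: matches of several paths, de-duplicated by identity."""
    seen = {}
    for path in paths:
        for el in root.findall(path):
            seen[id(el)] = el
    return list(seen.values())


def except_nodes(selected, excluded):
    """Keep each node for which count(. | excluded) != count(excluded)."""
    excluded_ids = {id(el) for el in excluded}
    return [
        el for el in selected
        # The union grows only when el is NOT already among the excluded nodes.
        if len(excluded_ids | {id(el)}) != len(excluded_ids)
    ]
```

Unlike a real XPath engine, this union does not sort the result into document order; it keeps the order in which the paths matched.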

Meta tags in skin from MediaWiki template

Let's say I have a template in my MediaWiki like
<includeonly>
<div id="custom-person">
* <span>Birthday:</span> {{#if: {{{birth date|}}} | <b>{{#ol-time:|{{{birth date}}}}}</b> | — }}
{{#if: {{{full name|}}} | * <span>full name:</span> <b>{{{full name}}}</b>}}
{{#if: {{{birth place|}}} | * <span>birth place:</span> <b>{{{birth place}}}</b>}}
{{#if: {{{age|}}} | * <span> age:</span> <b>{{{age}}}</b>}}
{{#if: {{{nationality|}}} | * <span> nationality:</span> <b>{{{nationality}}}</b>}}
<div class="clear"></div>
</div>
[[Category:Person]]
__NOTOC__
</includeonly>
All these pages are in one Namespace (0).
I need to generate head meta tags with data from this template.
I figured out how to filter such pages and add title tags in my SkinPerson.php
if ( $out->getTitle()->getNamespace() == 0 ) {
$out->addMeta( "description", $out->getPageTitle());
$out->addHeadItem( 'og:description', '<meta property="og:description" content="' . $out->getPageTitle() . '">');
}
But I'm really stuck on how to insert something like {{{full name}}} + {{{age}}} into, say, the 'og:description' tag.
That's simply not possible, and I wonder what your use case is and why you want to do that. First, some explanation of why it is not possible in the way you want to achieve it:
The template is evaluated by a piece of software we call the Parser. The parser generates an HTML representation of your wikitext, including all the templates and so on. The result is then saved in the ParserOutput and probably cached in the ParserCache (so that it does not need to be parsed again every time).
However, the skin, where you want to add the head item, is using the output of the parser directly, so it does not really know about the wikitext (including template parameters) anymore, and really shouldn't.
One possible solution for what you want to achieve is probably to extend the wikitext markup language by providing a tag extension, parsing that during the parsing of the wikitext, and save the values for the head items in the database. During the output of the page you can then retrieve these values from the database again and add them into the head items like you want. See more information about that in the documentation.
There might be other ways, apart from the database, to get information from the parsing time into the output time, which I'm not aware of.

Should webservices expose nested or flat lists?

When designing a webservice, no matter if it's SOAP, XML or JSON: would you prefer flat or nested lists?
Example:
Nested:
<carRequest>
<cars>
<car>
<manufature />
<price />
<description />
</car>
<car>
<manufature />
<price />
<description />
</car>
</cars>
</carRequest>
Flat:
<carRequest>
<car>
<manufature />
<price />
<description />
</car>
<car>
<manufature />
<price />
<description />
</car>
</carRequest>
What's the advantage of one over the other?
There are advantages and disadvantages combined with personal style, tools (their default configurations, limitations or ease of use), the need to support multiple MIME types from a single object representation, etc. I'm not going to go into all of that, since what works for some might not be a good solution for others, but I just want to point out a few things...
Which one seems more natural, the flat elements or the wrapped elements? How do people usually think about repeated elements? For example, <manufature>, <price> and <description> are wrapped in a <car> element. Why? Because they are related and together form a structure. Multiple <car>s are also related and form a structure too: a list of <car>s. Wrapping them is more expressive in your representation and XML schema, and more readable. But of course now we are getting into personal preferences and holy wars...
There is another advantage of the wrapped element. How do you express a list of cars that is empty versus a list of cars that is null?
If the elements are flat and you have no cars then what does this represent when you unmarshall it into an object?
<carRequest>
</carRequest>
Does your request have cars = null or cars = []? You don't know.
If you go with nested elements then cars = null is this:
<carRequest>
</carRequest>
while cars = [] is this:
<carRequest>
<cars>
</cars>
</carRequest>
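As a rough illustration of the null-versus-empty distinction, here is a minimal Python sketch that unmarshals the nested form with the standard library. The function name and the dictionary representation of a car are assumptions made for this example only.

```python
import xml.etree.ElementTree as ET


def parse_cars(xml_text):
    """Return None if the <cars> wrapper is absent, else a (possibly empty) list."""
    root = ET.fromstring(xml_text)
    wrapper = root.find("cars")
    # Compare with None explicitly: an element without children is falsy.
    if wrapper is None:
        return None
    return [
        {child.tag: child.text for child in car}
        for car in wrapper.findall("car")
    ]
```

With the flat layout there is no wrapper to test for, so a consumer can only ever see "zero cars" and cannot tell the two cases apart.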
And since you mentioned SOAP, you might at some point need to consider interoperability across technologies and tools (see Why is it important to be WS-I Basic Profile compliant?), which has rules on what the XML should look like inside the SOAP message. The style called document/literal wrapped is preferred.
This is a broad subject and as a TL;DR I can only think of "choose your poison". I hope my answer is of help to you.

Inline query for listing all pages from a namespace without any subobjects

I need an inline query that lists all pages from a specific namespace, but without listing subobjects specified on these pages.
Restricting results to a namespace is possible like that:
{{#ask: [[ExampleNamespace:+]] }}
But it lists all subobjects, too.
Workarounds:
Specify a category on these pages (subobjects don’t inherit it) and query for the category instead:
{{#ask: [[ExampleCategory]] }}
Specify a property on these pages (and never on the subobjects) and query for the property (with a wildcard value) instead:
{{#ask: [[ExampleProperty::+]] }}
But both workarounds require editing, which I would like to avoid. Is there a better way to solve this?
I'm not sure it's a better way, but array formats and the #arraymap and #arrayunique functions look like a way to trim the SMW subobject fragments and perform a DISTINCT operation. Unfortunately, the solution below has a query result limit issue, described in the notes (at least as far as I understand SMW). In general it may look like the following, and I would appreciate it if someone suggested a nicer solution:
<!-- Fetch all pages from the "Live event" namespace -->
{{#arraydefine: QUERY_RESULT
| {{#ask: [[Live event:+]]
| format = array
| link = none <!-- NOTE: array item link -->
| limit = 10000 <!-- NOTE: limit -->
}}
}}
<!-- Store the mapped result into another array -->
{{#arraydefine: MAPPED_QUERY_RESULT
| {{#arraymap: {{#arrayprint: QUERY_RESULT}}
| ,
| $O <!-- NOTE: array map iterator value -->
| {{#explode: $O <!-- NOTE: explode by hash -->
| #
| 0
}}
}}
| ,
| unique
}}
<!-- Generate links markup -->
{{#arraymap: {{#arrayprint: MAPPED_QUERY_RESULT}}
| ,
| $O
| [[$O]] <!-- NOTE: plain links -->
}}
The notes from the code above:
NOTE: array item link - Not suppressing the links causes the mapper to be more complicated (including parsing HTML <span> tags and class attributes).
NOTE: limit - This is probably the biggest issue here, as the number of subobjects affects the query result. SMW limits query results by default, and the maximum query limit cannot be overridden as far as I know. Having more rows than the limit allows will cause the 'further results' link to appear. Frankly, I have no idea how to work around it nicely.
NOTE: array map iterator value - {{#arraymap}} seems to replace strings in the simplest way, as sed or a simple text editor does. So $O is used as the iterator value placeholder for the formula parameter, trying not to clash with other string tokens.
NOTE: explode by hash - #ask subobject results generate hashed links like PageA#_159c1f213de2fcaf165f2c9c5c56686b. Just getting rid of them. In case you need to strip wiki links, you might also play around with [[ or | (encoded like [<nowiki/>[ and <nowiki>|</nowiki> respectively)
NOTE: plain links - The generated links will have underscores instead of spaces. Unfortunately, [[{{#replace: $O | _ | <nowiki> </nowiki>}}]] didn't work for me -- the underscores are simply consumed for some reason, however this approach is also recommended at the #replace function wiki page.
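The map-then-unique step above is easier to follow outside wikitext. Here is a minimal Python sketch of the same transformation applied to a comma-separated result string; the function name and the sample input are made up for illustration, and keeping first-seen order is a choice of this sketch rather than documented behaviour of the array functions.

```python
def pages_without_subobjects(query_result, separator=","):
    """Strip '#fragment' subobject suffixes and drop duplicate page names."""
    pages = []
    for item in query_result.split(separator):
        # 'PageA#_159c...' -> 'PageA'; plain page names pass through unchanged.
        page = item.strip().split("#", 1)[0]
        if page and page not in pages:
            pages.append(page)
    return pages
```

For example, the input "PageA#_1, PageA, PageB#_2" yields the two page names PageA and PageB.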
Some links:
SMW array result format
SMW configuration
SMW further results
#arraymap:
#explode:
#replace:
Help:List the set of unique values for a property (pay attention to the "Limitations and issues" section)

Are property values in CSS case-sensitive?

I have observed that some CSS property values, such as a font-family declared with quotation marks, may be case-sensitive, while all others are not. But how MUST web browsers and "HTML renderers" interpret them? Is it the same in every CSS context (XML, SVG, etc.) and in all other applications? What do the standards say about it?
Example: Adobe InDesign exported both font-family:'Optima Bold' and font-family:'optima bold'. Can I "normalize to lower case" (e.g. to merge similar classes)?
NOTES
References are incomplete and in conflict:
sitepoint.com/font-family says "Note that font family names may be case sensitive on some operating systems"... Is that valid for XHTML, and has it been updated for HTML5? Is font-family really the only case-sensitive value?
Is it necessary to use lowercase for every Element and attribute , properties in css and xhtml ? indirectly says "... use lowercase for every properties...", and the answers do not contradict it.
Compared with that question and its answers, the point here can perhaps be translated into some (personal) objective considerations:
Is there an (objective!) normative source (a W3C spec for CSS2, CSS3, XHTML1, or HTML5) for this answer?
"Standard font-family unique names" cannot be case-sensitive (otherwise they would cease to be standard)... So the only properties that can justifiably (by sensible arguments) be case-sensitive are:
2.1. X values at url(X), see background, etc. properties;
2.2. content values, example;
2.3. ... more ?? ...
(updating @ÁlvaroG.Vicario's answer and comments, and complementing this answer... This is a Wiki, please edit to enhance)
Example: for CSS3 (and HTML5) there are new explicit rules, as "font-face property must be case-insensitive".[2]
Context
W3C interoperating standards, mainly XML, HTML, CSV and CSS.
CSS general rules
CSS2 (a W3C standard of 2008) fixed basic conventions about "Characters and case", and CSS3 (a W3C standard for 2015) added something more.
By default "all CSS syntax is case-insensitive (...)" [1]
There are exceptions, "(...) except for parts that are not under the control of CSS"[1]
2.1. element names are case-sensitive in XML (including XHTML), but case-insensitive in HTML.
2.2. class names and IDs in selectors are case-sensitive. The case-sensitivity of values of the HTML attributes id and class, of font names, and of URIs lies outside the scope of the CSS specification.
....
The
Case matrix
Exceptions and specific rules (each made explicit in a reference). "YES" indicates that the value is case-sensitive.
Property values:
CSS property | Case-sens. | Reference and notes
------------------|------------|--------------------
%colorVals | NO | [3]
font-family | NO | [2]
%url | YES | ...
content | YES | ...
----------------------------------------------------
%colorVals = color, background, etc.
%url = background-image, etc. that use `url()`, see [7] and notes.
Selector values:
CSS selector | Case-sens. | Reference and notes
------------------|------------|--------------------
id | YES |...
element | YES/NO | ... YES for XML...
class name | YES | [5]
(`~ i` operator) | NO | [6]
----------------------------------------------------
YES/NO = depends on the document language (see ref. and notes).
REFERENCES:
[1] W3C/CSS2/syndata, sec. 4.1.3 Characters and case.
[2] W3C/CSS3-fonts, sec. 5.1 Case sensitivity of font family names
[3] W3C/CSS3-color, sec. 4.1. Basic color keywords
[4] W3C/CSS3-values, sec. 3.1. Pre-defined Keywords
[5] W3C/Selectors, sec. 3. Case sensitivity
[6] W3C/Selectors4, sec. 6.3. Case-sensitivity
[7] RFC 3986 and URL syntax illustration at Wikipedia.
Quotations and notes
A typical URL starts with a domain, which is case-insensitive, but what follows it (the path, query or fragment syntactic components) is case-sensitive. See [7].
"User agents must match these names case insensitively". [2]
The spec for CSS 2 says:
CSS syntax is case-insensitive within the ASCII range (i.e., [a-z] and
[A-Z] are equivalent), except for parts that are not under the control
of CSS. For example, the case-sensitivity of values of the HTML
attributes "id" and "class", of font names, and of URIs lies outside
the scope of this specification. Note in particular that element names
are case-insensitive in HTML, but case-sensitive in XML.
... which makes sense: CSS itself accepts both background-image and BACKGROUND-IMAGE but it has no way to know whether your web server considers LOGO.PNG and logo.png as identical or different resources.
(I've been unable to find the equivalent document for CSS3)
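As a rough sketch of what "normalizing" could look like under the rules quoted above, the following Python function lowercases the property name (ASCII range only) and folds the case of the value only for font-family, leaving every other value untouched. The function name is hypothetical, the handling is deliberately naive (one declaration at a time, no comments or !important), and using casefold() for font names is an approximation of case-insensitive matching chosen for this sketch, not something taken from the specifications.

```python
import string

# CSS syntax is case-insensitive within the ASCII range only.
ASCII_LOWER = str.maketrans(string.ascii_uppercase, string.ascii_lowercase)


def normalize_declaration(declaration):
    """Normalize 'name: value' so that equivalent declarations compare equal."""
    name, _, value = declaration.partition(":")
    name = name.strip().translate(ASCII_LOWER)
    value = value.strip()
    if name == "font-family":
        # Font family names are matched case-insensitively by user agents.
        value = value.casefold()
    # Other values (url(...), content strings, ...) are kept as written.
    return f"{name}:{value}"
```

With this, the two InDesign exports from the question normalize to the same string, while a url(LOGO.PNG) value keeps its original case.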