OneNote - Not able to find all the Property IDs in the Microsoft documentation - onenote

I am parsing OneNote documents.
When I find property IDs from within the OneNote documents, I find most of them in the OpenSpecs document: https://learn.microsoft.com/en-us/openspecs/office_file_formats/ms-one/e9bf7da8-7aab-4668-be5e-e0c421175e3c?redirectedfrom=MSDN
But there are quite a few that seem to appear in OneNote files that I do not see in the Spec:
0x14001c9e
0x14001c9f
0x14001ca0
0x14001ca1
0x14001df9
0x1400348b
0x1c001d61
0x1c001df8
0x1c001dfb
0x1c001dfc
0x1c001e30
0x1c0035b7
0x88001c8e
0x88001cbd
0x88001cdc
0x88001d13
0x88003462
Where do I find information about these properties?
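Even when a property is not listed, its ID can still be split into its parts. The sketch below assumes the PropertyID layout from [MS-ONESTORE] as I recall it (a 26-bit id, a 5-bit type, and a 1-bit boolValue); the type names in the table are written from memory and should be checked against the spec. `decode_property_id` is a hypothetical helper name.

```python
# Type codes as recalled from the [MS-ONESTORE] PropertyID definition (assumption).
PROPERTY_TYPES = {
    0x1: "NoData",
    0x2: "Bool",
    0x3: "OneByteOfData",
    0x4: "TwoBytesOfData",
    0x5: "FourBytesOfData",
    0x6: "EightBytesOfData",
    0x7: "FourBytesOfLengthFollowedByData",
    0x8: "ObjectID",
    0x9: "ArrayOfObjectIDs",
    0xA: "ObjectSpaceID",
    0xB: "ArrayOfObjectSpaceIDs",
    0xC: "ContextID",
    0xD: "ArrayOfContextIDs",
    0x10: "ArrayOfPropertyValues",
    0x11: "PropertySet",
}


def decode_property_id(value: int) -> dict:
    """Split a 32-bit PropertyID into id, type and boolValue."""
    type_code = (value >> 26) & 0x1F          # bits 26..30
    return {
        "id": value & 0x03FFFFFF,             # low 26 bits
        "type": type_code,
        "type_name": PROPERTY_TYPES.get(type_code, "unknown"),
        "bool_value": bool((value >> 31) & 1),  # only meaningful for Bool
    }


if __name__ == "__main__":
    for pid in (0x14001C9E, 0x1C001D61, 0x88001C8E):
        print(hex(pid), decode_property_id(pid))
```

Under that assumption, the 0x14... IDs carry four bytes of data, the 0x1c... IDs carry length-prefixed data, and the 0x88... IDs are booleans whose value is stored in the ID itself, which at least tells a parser how to skip them.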

How to find for the wikipedia links in the infobox templates and other templates, using sql dumps

I want to extract the pages mentioned in the infobox and templates of pages.
E.g. From this page:
https://en.wikipedia.org/wiki/DNA
I want to extract all of the links in the infobox, like: "Genetics", "Introduction to Genetics" etc.
I want to do this using the SQL dumps, ideally without parsing the XML of whole pages, and I don't want to use the APIs.
I could not find a way.
While Pagelinks also includes the links from infoboxes, I cannot find a way to exclude them.
I thought Templatelinks might have that info, but it does not: I could not find the page IDs of the corresponding links in infoboxes.
Where is this information stored?
Or which kind of tables should I look at?
I consulted previous questions:
where can I find the infobox templates used in wiki?
and Mediawiki reference:
https://www.mediawiki.org/wiki/Manual:Templatelinks_table#Schema_summary
but could not find a solution.
That is a sidebar rather than an infobox: https://en.wikipedia.org/wiki/Template:Genetics_sidebar
I don't think there's a way of doing it other than parsing the content of the template to extract the links or using the API: e.g. https://en.wikipedia.org/w/api.php?action=query&prop=links&titles=Template:Genetics%20sidebar&pllimit=100&plnamespace=0
Something like this should also work but it's not returning any results for me:
SELECT * from pagelinks
where pl_title = 'Genetics_sidebar'
and pl_namespace = 0
and pl_from_namespace = 10
https://quarry.wmcloud.org/query/71442
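If you do end up parsing the template content, a rough sketch of the link extraction could look like this. It assumes you already have the template's wikitext as a string; `extract_wikilinks` is a hypothetical helper, and the regex only handles plain `[[Target]]` and `[[Target|label]]` links (no nested templates or parser functions).

```python
import re

# Matches [[Target]], [[Target|label]] and [[Target#Section|label]];
# only the target part before "#" or "|" is captured.
WIKILINK_RE = re.compile(r"\[\[([^\[\]|#]+)(?:#[^\[\]|]*)?(?:\|[^\[\]]*)?\]\]")


def extract_wikilinks(wikitext: str) -> list:
    """Return main-namespace link targets in order of first appearance."""
    seen = []
    for target in WIKILINK_RE.findall(wikitext):
        target = target.strip()
        # Skip File:, Category:, Template: and other prefixed targets.
        if not target or ":" in target:
            continue
        if target not in seen:
            seen.append(target)
    return seen


if __name__ == "__main__":
    sample = (
        "* [[Genetics]]\n"
        "* [[Introduction to genetics|Introduction]]\n"
        "* [[DNA#Structure|structure]]\n"
        "[[File:DNA icon.svg|50px]]"
    )
    print(extract_wikilinks(sample))
```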

convert docx with (ordered) list to html

I'm trying to convert a large docx document with a multi-level ordered list to HTML (see an example of the document here: http://docdro.id/X1oyfBv; you need to download it).
I tried the following things, including:
online converters such as html-cleaner and index.html (which only recognize one layer of the list)
save as html - which creates a horrendous file but still doesn't recognize the ol structure.
saved the file as zip and then opened the xml file, but I don't see an easy way to get the ol structure out of the w:... tags
saving it to google docs and running Omar Alzabir's script
http://omaralzabir.com/wp-content/uploads/2014/05/GoogleDocsEmail.jpg
By the way, if I create a Word file with a multi-level ordered list and convert it, the list is recognized as ol's. But the list in the existing file is not recognized as ol's, even if I 'un-list' it and list it again. So possibly there is something wrong with how the original document was created (?)
Any suggestions are much appreciated :) So are any indications as to why this problem occurs.
Are you asking how to save a Word-doc in HTML format, with multi-level ordered-lists?
Word-HTML has bugs in its multi-level ordered lists. For the list-items, the indentation tends to be incorrect and inconsistent. There's an example here.
Word-HTML has similar bugs in its multi-level unordered lists. An example is here.
I recently wrote a Python program that fixes these bugs in Word's HTML. The program is part of WordWebNav (WWN), which is free and open-source.
WWN is an app that converts a Microsoft-Word document to a usable web-page. It adds some missing features in the Word-HTML web-page (e.g., a navigation pane), and it fixes bugs in the Word-HTML.
You can use pandoc : https://github.com/jgm/pandoc
This is an open source universal command line tool to convert markup source based document files.
You can use it like this:
pandoc -o output.html input.docx
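For the approach of reading the XML directly, here is a minimal sketch of how the list level can be read out of the w:... tags. It assumes the numbering is applied directly on each paragraph through w:numPr (w:ilvl is the zero-based nesting level, w:numId the list instance); paragraphs that get their numbering from a style would be missed, which may be one reason an existing document is not recognized as a list. `list_levels` and `SAMPLE` are hypothetical names for illustration.

```python
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W}


def list_levels(document_xml: str) -> list:
    """Return (level, numId, text) for every paragraph that carries w:numPr."""
    root = ET.fromstring(document_xml)
    items = []
    for p in root.iter(f"{{{W}}}p"):
        num_pr = p.find("w:pPr/w:numPr", NS)
        if num_pr is None:
            continue  # not a directly numbered paragraph
        ilvl = num_pr.find("w:ilvl", NS)
        num_id = num_pr.find("w:numId", NS)
        level = int(ilvl.get(f"{{{W}}}val")) if ilvl is not None else 0
        nid = num_id.get(f"{{{W}}}val") if num_id is not None else None
        text = "".join(t.text or "" for t in p.iter(f"{{{W}}}t"))
        items.append((level, nid, text))
    return items


# Tiny hand-written stand-in for word/document.xml inside the .docx zip.
SAMPLE = (
    f'<w:document xmlns:w="{W}"><w:body>'
    '<w:p><w:pPr><w:numPr><w:ilvl w:val="0"/><w:numId w:val="1"/></w:numPr></w:pPr>'
    "<w:r><w:t>First</w:t></w:r></w:p>"
    '<w:p><w:pPr><w:numPr><w:ilvl w:val="1"/><w:numId w:val="1"/></w:numPr></w:pPr>'
    "<w:r><w:t>Nested</w:t></w:r></w:p>"
    "<w:p><w:r><w:t>Plain paragraph</w:t></w:r></w:p>"
    "</w:body></w:document>"
)

if __name__ == "__main__":
    for level, nid, text in list_levels(SAMPLE):
        print("  " * level + f"[list {nid}] {text}")
```

From the (level, numId) pairs, nested ol/li elements can be opened and closed as the level rises and falls.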

HTML rel="up" attribute?

I'm using mobile template HTML files on a PHPBB forum.
I tested the html for errors at http://validator.w3.org/
The test results showed the following error
Line 24, Column 66: {navlinks.FORUM_NAME}
Bad value up for attribute rel on element a: The string up is not a registered keyword or absolute URL.
Not having heard back from the author and not finding much through a Google search, I'm trying to understand what rel="up" does, if anything constructive.
I can't find any mention of it as an official HTML attribute value:
http://www.w3schools.com/tags/att_link_rel.asp
I'm wondering whether it's safe to just remove rel="up".
The Internet Assigned Numbers Authority (IANA) keeps a list of link relationships. The latest version is from March 21, 2013.
up: Refers to a parent document in a hierarchy of documents.
Unfortunately, despite the fact that this registry was long established, it was decided that HTML5 would not use this registry and would use a Wiki page to list the conforming link types instead.
"up" is listed in a rather insane section marked "dropped without prejudice", which nobody seems to know what to do with, or how to get those link types out of.
It's safe to drop it, but some browsers and browser plugins make use of it. For example, I use a Firefox plugin called "Link Widgets" like this to make use of the link type.
From: http://www.w3.org/MarkUp/html3/dochead.html
REL=Up
When the document forms part of a hierarchy, this link references the immediate parent of the current document.
If this is causing any specific problems or unexpected results, please post your code. Thanks.

wikipedia template data api

I want to download the template source used in a Wikipedia page (basically for generating the display text of a key). So I basically want this info:
http://en.wikipedia.org/w/index.php?title=Template:Infobox%20cricketer&action=edit
for Template:Infobox cricketer
I have found an api for wikipedia called Template data
http://www.mediawiki.org/wiki/Extension:TemplateData
But the example given:
http://en.wikipedia.org/w/api.php?action=templatedata&titles=Template:Stub
does not seem to work.
I think you misunderstood what Extension:TemplateData is for. It's for getting metadata about a template, which only works if that template provides those metadata.
If what you want is the text of the template, you should use prop=revisions&rvprop=content, for example:
http://en.wikipedia.org/w/api.php?action=query&titles=Template:Infobox%20cricketer&prop=revisions&rvprop=content
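A small sketch of how that request could be built and its result unpacked from a script. The parameters rvslots=main, format=json and formatversion=2 are additions of mine, not part of the URL above, and `extract_wikitext` assumes the JSON shape those parameters produce; both function names are hypothetical.

```python
from urllib.parse import urlencode

API_ENDPOINT = "https://en.wikipedia.org/w/api.php"


def build_content_query(title: str) -> str:
    """Build a prop=revisions&rvprop=content query URL for one page title."""
    params = {
        "action": "query",
        "titles": title,
        "prop": "revisions",
        "rvprop": "content",
        "rvslots": "main",        # assumption: request the main content slot
        "format": "json",
        "formatversion": "2",     # assumption: pages returned as a list
    }
    return f"{API_ENDPOINT}?{urlencode(params)}"


def extract_wikitext(response: dict) -> str:
    """Pull the wikitext out of a decoded JSON response (assumed shape)."""
    page = response["query"]["pages"][0]
    return page["revisions"][0]["slots"]["main"]["content"]


if __name__ == "__main__":
    print(build_content_query("Template:Infobox cricketer"))
```

The URL can then be fetched with any HTTP client and the decoded JSON passed to `extract_wikitext`.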

How to view xsd:documentation that is in HTML markup?

I am generating WSDL/XSD for SOAP services from a UML model using IBM Rational Software Architect (RSA). RSA allows you to document the classes and attributes in the model using rich-formatting.
For example, I have the following documentation on a Trailer class:
A wheeled Vehicle that is designed for towing by another
Vehicle. Known subtypes include:
Caravan
BoxTrailer
BoatTrailer
When the UML model is transformed to WSDL/XSD (using the out-of-the-box UML to WSDL transform), the formatting is preserved as HTML markup inside the xsd:documentation element:
<xsd:complexType name="Trailer">
<xsd:annotation>
<xsd:documentation>&lt;p&gt;
A&amp;nbsp;wheeled &lt;strong&gt;Vehicle&lt;/strong&gt; that is designed for&amp;nbsp;towing by another &lt;strong&gt;Vehicle.&lt;/strong&gt; Known
subtypes include:&amp;nbsp;
&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Caravan&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;BoxTrailer&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;BoatTrailer&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;</xsd:documentation>
</xsd:annotation>
</xsd:complexType>
Unfortunately, this is really hard to read and I've been searching (with no luck) for a program that can view WSDL/XSD with documentation in HTML markup.
XmlSpy 2008 can't do it, RSA can't do it (which is a bit surprising, as it generated the XSD in the first place), neither can any web browser I've tried.
I did write a JET template that extracted the documentation from the model and outputted to HTML, and I could probably write some XSLT to do something similar from the XSD, but I was hoping there's a program out there (ideally free) that could view the documentation as HTML.
Essentially, I'd like to be able to tell the consumers of our web service that they can view the WSDL in X program if they want to read the documentation - does anybody know the best solution to this?
Edit:
Thanks for the suggestions, but I think I have a solution! I didn't realise that RSA can export a WSDL to HTML (right-click on WSDL, export, HTML). The generated HTML has a graphical view of each schema element, the documentation for each element, as well as the original source, and everything is hyperlinked together.
Most importantly, the documentation is richly-formatted again! One small caveat is that the &nbsp;'s appear in the HTML output. This seems to be because the ampersand is escaped in the HTML:
&amp;nbsp;
Instead it should be
&nbsp;
I will update my model-to-model transform to ensure that the &nbsp;'s are replaced with real spaces (I don't believe I'll need non-breaking spaces in the documentation), so the generated WSDL/XSD won't ever have them.
I highly doubt that the standard XML/XSD editors can interpret the HTML tags and generate appropriate documentation. Oxygen XML Editor does a decent job of understanding and converting the XML entities (like &lt; etc.), but HTML tags and entities are left as is. Below is the screenshot in design view.
The type of <xs:documentation> is <xs:any>, so you should actually be able to include your documentation without escaping the markup, provided that it is a well-formed XHTML fragment instead of HTML. I guess some XML Schema tools would be able to interpret the embedded XHTML and show it as formatted text.
Do note that if the markup is not escaped, it absolutely must be a well-formed XML fragment, or the documentation element will cause your schema to be malformed. This also applies to HTML entities! If the documentation contains an (unescaped) entity reference (other than the 5 predefined XML entities), then your schema must either contain an external DTD reference or have an embedded DTD that defines the replacement text of that entity. In your case the documentation contains an &nbsp; entity reference. Probably the easiest fix is to replace such entities with the corresponding Unicode character/text or with character references (use &#160; for &nbsp;).
If you have a chance, try to include the documentation without escaping the markup and make sure that it will be well formed. Otherwise you probably need to process the documentation twice: 1) parse the schema and extract documentation 2) parse the documentation text again (possibly as HTML, not XML).
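The two-pass approach could be sketched roughly as follows, using only the Python standard library. It assumes the documentation is escaped HTML sitting directly under xsd:complexType/xsd:annotation; `extract_documentation`, `html_to_text` and `SAMPLE_XSD` are hypothetical names, and the sample is a shortened version of the Trailer type from the question.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

XSD_NS = "http://www.w3.org/2001/XMLSchema"


class _TextCollector(HTMLParser):
    """Collects text nodes; character references such as &nbsp; are decoded."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def extract_documentation(xsd_text: str) -> dict:
    """Pass 1: parse the schema as XML, return {type name: documentation string}."""
    root = ET.fromstring(xsd_text)
    docs = {}
    for ctype in root.iter(f"{{{XSD_NS}}}complexType"):
        doc = ctype.find(f"{{{XSD_NS}}}annotation/{{{XSD_NS}}}documentation")
        if doc is not None and doc.text:
            docs[ctype.get("name")] = doc.text  # XML parser has unescaped it
    return docs


def html_to_text(fragment: str) -> str:
    """Pass 2: parse the documentation as HTML and flatten it to plain text."""
    collector = _TextCollector()
    collector.feed(fragment)
    collector.close()
    # split() also treats the non-breaking space as whitespace.
    return " ".join("".join(collector.chunks).split())


SAMPLE_XSD = """<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:complexType name="Trailer">
    <xsd:annotation>
      <xsd:documentation>&lt;p&gt;A&amp;nbsp;wheeled &lt;strong&gt;Vehicle&lt;/strong&gt;&lt;/p&gt;</xsd:documentation>
    </xsd:annotation>
  </xsd:complexType>
</xsd:schema>"""

if __name__ == "__main__":
    for name, html_doc in extract_documentation(SAMPLE_XSD).items():
        print(name, "->", html_to_text(html_doc))
```

Instead of flattening to text, the string from pass 1 could also be written straight into an HTML page, since after the XML parse it is ordinary HTML again.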
I've tried this with the latest build of QTAssistant and it shows like this in the Schema Help Panel only; I've put a feature request for the grid view, as well as the documentation generator to work the same. Is this what you're expecting?
The help panel shows the annotation of the schema object that is selected in the Graph/Diagram view. To display the help panel press F1.
This issue is fixed in RSA 8.0.4 - which now supports exporting to WSDL/XSD with plain text (as well as an option to sort the schema by type, then name alphabetically!).
To view the documentation in a WSDL/XSD generated from a UML model in prior versions of RSA, the easiest solution is to export the WSDL/XSD as HTML using RSA. You can do this by right-clicking on the WSDL/XSD, selecting export, then selecting HTML.
The generated HTML has a graphical view of each schema element, the documentation for each element, as well as the original source, and everything is hyperlinked together.
Most importantly, the documentation (that's virtually unreadable in the WSDL/XSD) is richly-formatted again! One small caveat is that the &nbsp;'s that RSA's documentation editor inserts also appear in the HTML output. This seems to be because the ampersand is not only escaped in the WSDL/XSD (which is good), but also in the HTML (bad!):
&amp;nbsp;
Instead it should be
&nbsp;
A simple workaround to this is to replace all &nbsp;'s in the WSDL/XSD with real spaces before generating the HTML.