I am trying to understand more about the physical and logical structure of an XML document. The W3C specification describes the physical structure as follows:
An XML document may consist of one or many storage units. These are
called entities;...
So my question is:
What exactly does "storage unit" refer to in this context?
Is it used from the perspective of an XML processor and how it would store and manipulate the XML document in memory or is it referring to a persistent storage used to store the document?
An entity in XML and SGML represents a character stream. It can be an external entity, where the character content is accessed from another file or network (HTTP) stream, or an internal entity, which is part of the literal content of the document in which it's declared and referenced. An internal entity can be declared like this
<!ENTITY e "replacement text for e">
and then used as the &e; entity reference in content like this
<p> some text ... &e; ... other text </p>
such that an XML or SGML processor will replace &e; with replacement text for e. The concept of an entity is also used for other purposes.
As to the second question, the entity concept is related to "storage" of character data in external files or network streams; it doesn't refer to internal memory representations of a markup processor.
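The internal-entity replacement described above can be observed with a short Python sketch. It assumes the standard library's `xml.etree.ElementTree` parser, which expands internal entities declared in the document's own DOCTYPE; the surrounding `<!DOCTYPE p [...]>` wrapper is added here only to make the declaration from the answer parseable.

```python
import xml.etree.ElementTree as ET

# The internal entity "e" is declared in the internal DTD subset,
# then referenced as &e; in the element content.
doc = """<!DOCTYPE p [
  <!ENTITY e "replacement text for e">
]>
<p>some text ... &e; ... other text</p>"""

root = ET.fromstring(doc)

# The parser hands the application the replacement text, not the reference.
print(root.text)
```

The application never sees `&e;` itself, only the character stream the entity stands for.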
Is there currently any syntax that allows for replaceable values in a .json file?
If not, I would propose adding this capability.
After all, this is 2022. Shouldn't we be able to populate values easily without having to write our own parsers?
simple example:
{
"%h" : "some value"
}
This would replace "%h" with the system-specific hostname and would be completely optional: if the "parameter" does not exist, nothing changes in the parse.
And if .json has a specific "parameter" syntax that is native to it, that's fine.
It's just the idea that we could have this option: "%h", or "", or whatever syntax .json would like.
The JSON standard does not allow for dynamic values. It is a very simple text format meant for data interchange. The standard publication describes it like this:
JSON is a lightweight, text-based, language-independent syntax for defining data interchange formats. It was derived from the ECMAScript programming language, but is programming language independent. JSON defines a small set of structuring rules for the portable representation of structured data.
The goal of this specification is only to define the syntax of valid JSON texts. Its intent is not to provide any semantics or interpretation of text conforming to that syntax. It also intentionally does not define how a valid JSON text might be internalized into the data structures of a programming language. There are many possible semantics that could be applied to the JSON syntax and many ways that a JSON text can be processed or mapped by a programming language. Meaningful interchange of information using JSON requires agreement among the involved parties on the specific semantics to be applied. Defining specific semantic interpretations of JSON is potentially a topic for other specifications. Similarly, language mappings of JSON can also be independently specified. For example, ECMA-262 defines mappings between valid JSON texts and ECMAScript’s runtime data structures.
Therefore there is no way to define a template value of any sort in JSON itself. You can achieve such functionality by processing your JSON separately, but how you do that depends on your technology stack and tools. It would also only work within your own project, not in any third-party project that reads the same file.
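As an illustration of "processing your JSON separately", here is a minimal Python sketch. The `%h` token comes from the question; the function name `expand_placeholders` and the rule that tokens are replaced in every string (keys and values alike) are assumptions made for this example, not part of JSON or of any library.

```python
import json
import socket


def expand_placeholders(value, replacements):
    """Recursively replace placeholder tokens in every string of parsed JSON data.

    This is a project-specific convention layered on top of JSON (hypothetical).
    """
    if isinstance(value, str):
        for token, text in replacements.items():
            value = value.replace(token, text)
        return value
    if isinstance(value, list):
        return [expand_placeholders(item, replacements) for item in value]
    if isinstance(value, dict):
        return {
            expand_placeholders(key, replacements): expand_placeholders(item, replacements)
            for key, item in value.items()
        }
    return value  # numbers, booleans and null pass through unchanged


raw = '{"host": "%h", "port": 8080}'
data = expand_placeholders(json.loads(raw), {"%h": socket.gethostname()})
print(data)
```

The file is still ordinary JSON to every other consumer; only code that applies this extra step sees the substituted values.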
We have been using the Dragon View XBRL Parser to read tables, paragraphs and other content from XBRL documents. Now that more companies are switching to filing their financial documents in iXBRL instead of XBRL, we would have to write or obtain a new parser for iXBRL to read its contents. If instead we had a mechanism to convert iXBRL documents to XBRL, we could keep using the existing parser, with few changes, to process iXBRL documents.
In XBRL: instance document is separate and independent of rendering document
In iXBRL: instance document is integrated inline in rendering document
My question is: is there any known or easy way to convert an iXBRL document to XBRL?
Many know what an XBRL document is.
For more details about iXBRL documents, read here: http://www.xbrl.org/Specification/inlineXBRL/CR-2009-11-16/inlineXBRL-background-CR-2009-11-16.html
Differences between XBRL and iXBRL: http://www.datatracks.co.uk/ixbrl-blog/what-is-ixbrl/
There is an open source XSLT-based converter: https://sourceforge.net/projects/inlinexbrl/
But it hasn't been maintained for a long time and would need some updates to support the latest iXBRL version. Still, it's a starting point.
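To give a feel for what such a conversion has to start from, the following Python sketch lists the tagged facts in an inline document. It is not a converter: it assumes well-formed XHTML and the Inline XBRL 1.1 namespace, and it ignores scaling, sign, format transformations, the hidden header section and continuations. The sample document, the `ex:` taxonomy prefix and the function name `list_inline_facts` are all made up for illustration.

```python
import xml.etree.ElementTree as ET

# Inline XBRL 1.1 namespace (older documents may use a different one).
IX = "http://www.xbrl.org/2013/inlineXBRL"

# Hypothetical, heavily simplified inline document.
sample = """<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:ix="http://www.xbrl.org/2013/inlineXBRL"
      xmlns:ex="http://example.com/taxonomy">
  <body>
    <p>Revenue was <ix:nonFraction name="ex:Revenue" contextRef="c1"
        unitRef="u1" decimals="0">1000</ix:nonFraction> this year.</p>
  </body>
</html>"""


def list_inline_facts(xhtml_text):
    """Collect the name, context reference and displayed text of each inline fact."""
    root = ET.fromstring(xhtml_text)
    facts = []
    for tag in ("nonFraction", "nonNumeric"):
        for element in root.iter(f"{{{IX}}}{tag}"):
            facts.append({
                "name": element.get("name"),
                "contextRef": element.get("contextRef"),
                "text": "".join(element.itertext()).strip(),
            })
    return facts


print(list_inline_facts(sample))
```

A real conversion would additionally have to apply the value transformations and emit the contexts and units into a standalone instance document, which is what the XSLT converter above attempts.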
I am not able to figure out from where I can get the URL for the 'scheme' attribute of xbrli:identifier tag in the 'entity' portion of a context definition in an XBRL document. I am not able to find it in the taxonomies or link bases. I have searched the net for hours.
The taxonomy I'm following is IFRS based.
Thanks in advance. :)
Section 4.7.3.1 of the XBRL Specification states:
An identifier element specifies a scheme for identifying business
entities. The required scheme attribute contains the namespace URI of
the identification scheme, providing a framework for referencing
naming authorities. The element content MUST be a token that is a
valid identifier within the namespace referenced by the scheme
attribute. XBRL International is not a naming authority for business
entities. XBRL makes no assumption about the ability of an application
to resolve an identifier that may appear as element content in any
particular scheme.
Therefore, the URI used as the value for the scheme attribute is context specific. In other words, it is not defined by XBRL, but by the context of the usage implied by the taxonomy.
The entity element is a required element in a context. It identifies the entity that is reporting the facts. It contains an identifier element and may include a segment description. The value of the scheme attribute, therefore, means something to the reporting entity rather than to the XBRL specification itself.
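For orientation, this Python sketch reads the scheme and identifier out of a context. The context fragment, the scheme URI `http://www.example.com/company-register` and the identifier `ABC123` are hypothetical placeholders; in practice the value is often prescribed by the regulator or filing programme you report to rather than by the taxonomy files.

```python
import xml.etree.ElementTree as ET

XBRLI = "http://www.xbrl.org/2003/instance"  # XBRL 2.1 instance namespace

# Hypothetical context; the scheme URI and identifier are placeholders.
context_xml = """<xbrli:context xmlns:xbrli="http://www.xbrl.org/2003/instance" id="c1">
  <xbrli:entity>
    <xbrli:identifier scheme="http://www.example.com/company-register">ABC123</xbrli:identifier>
  </xbrli:entity>
  <xbrli:period>
    <xbrli:instant>2012-12-31</xbrli:instant>
  </xbrli:period>
</xbrli:context>"""

context = ET.fromstring(context_xml)
identifier = context.find(f"{{{XBRLI}}}entity/{{{XBRLI}}}identifier")

# The scheme names the identification system; the element text is the
# reporting entity's identifier within that system.
print(identifier.get("scheme"), identifier.text)
```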
This is a near duplicate of "How to reliably hash JavaScript objects?", where someone wants to reliably hash JavaScript objects.
Now that the JSON-LD specification has been validated, I saw that there is a normalization procedure that they advertise as a potential way to normalize a JSON object:
normalize the data using the RDF Dataset normalization algorithm, and then dump the output to normalized NQuads format. The NQuads can then be processed via SHA-256, or similar algorithm, to get a deterministic hash of the contents of the Dataset.
Building a hash of a json object has always been a pain because something like
sha1(JSON.stringify(object))
does not work, or is not guaranteed to work the same way across implementations (the order of the keys is not defined, for example).
Does JSON-LD work as advertised? Is it safe to use it as a universal JSON normalization procedure for hashing objects? Can those objects be standard JSON objects, or do they need some JSON-LD decorations (@context, ...) to be normalized?
Yes, normalization works with JSON-LD, but the objects do need to be given context (via the @context property) in order for them to produce any RDF. It is the RDF that is deterministically output in NQuads format (and that can then be hashed, for example).
If a property in a JSON-LD document is not defined via @context, then it will be dropped during processing. JSON-LD requires that you provide global meaning (semantics) to the properties in your document by associating them with URLs. These URLs may provide further machine-readable information about the meaning of the properties, their range, domain, etc. In this way data becomes "linked" -- you can both understand the meaning of a JSON document from one API in the context of another and you can traverse documents (via HTTP) to find more information.
So the short answer to the main question is "Yes, you can use JSON-LD normalization to build a unique hash for a JSON object", however, the caveat is that the JSON object must be a JSON-LD object, which really constitutes a subset of JSON. One of the main reasons for the invention of the normalization algorithm was for hashing and digitally-signing graphs (JSON-LD documents) for comparison.
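For plain JSON objects that carry no @context, a different and much simpler technique is sometimes enough: serialize with sorted keys and fixed separators, then hash. The Python sketch below shows that approach only. It is not the RDF Dataset normalization algorithm, the function name `stable_hash` is made up, and the result is only guaranteed to be stable among programs that use exactly this serialization.

```python
import hashlib
import json


def stable_hash(obj):
    """Hash a JSON-compatible object independently of dictionary key order."""
    canonical = json.dumps(
        obj,
        sort_keys=True,          # removes the dependence on key order
        separators=(",", ":"),   # removes the dependence on whitespace
        ensure_ascii=False,
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


a = {"name": "x", "size": 1}
b = {"size": 1, "name": "x"}
print(stable_hash(a) == stable_hash(b))
```

Unlike JSON-LD normalization, this says nothing about what the keys mean, and details such as number formatting can still differ between languages.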
I always find this URL (or a similar one) in HTML files, XML, XSD...
Like "http://www.w3.org/2001/XMLSchema" or "http://www.w3.org/2001/XMLSchema-instance"
I always wonder what those URLs mean.
Even offline, the XML or HTML document works without changes. What is the benefit of linking to those URLs?
Thanks
Those URLs do not necessarily point to any website or server. They are a convenient naming mechanism. The idea is that, since every company has a unique website address, using it as the company's namespace name avoids clashes with names chosen by others, which gives better interoperability. Hence the custom.
Namespaces in XML 1.0 Specification
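A small Python sketch makes the "just a name" point concrete. It assumes the standard library's `xml.etree.ElementTree`, which never fetches the namespace URI; the schema fragment and the alternative prefix `foo` are made up for illustration.

```python
import xml.etree.ElementTree as ET

XSD = "http://www.w3.org/2001/XMLSchema"

schema = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="note" type="xs:string"/>
</xs:schema>"""

# Parsing works entirely offline: the URI is only used as part of the name.
root = ET.fromstring(schema)
print(root.tag)

# The prefix is arbitrary; only the URI it is bound to identifies the vocabulary.
other = ET.fromstring(
    '<foo:schema xmlns:foo="http://www.w3.org/2001/XMLSchema"/>'
)
print(other.tag == root.tag)
```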
It's the XML Schema.
An XML schema provides a view of the
document type at a relatively high
level of abstraction.