Change <html xmlns="http://www.w3.org/1999/xhtml"> to https? [duplicate]

Is it better to have an xmlns URI with the https protocol?
For example, this is the way recommended by the manual:
<http xmlns="http://www.springframework.org/schema/security"/>
Is it legal, and is it better, to use it this way?
<http xmlns="https://www.springframework.org/schema/security"/>
Are there XML parsers that try to connect to the address given by the xmlns URI?
Do parsers always download schemas from the xsi:schemaLocation attribute?

The URI is the namespace name, which identifies the namespace.
While, in the case of some URI schemes (like http, https, ftp, etc.), it would be possible to serve the schema (or other related information) at that address, this is "not a goal":
It is not a goal that it be directly usable for retrieval of a schema (if any exists).
(Most URI schemes wouldn’t allow this to begin with, e.g., urn, tag, jabber etc.)
You should specify the URI exactly as documented, as this is what consumers expect and look for (most consumers probably never try to actually retrieve the URI), and XML Names 1.0 is pretty strict about comparing URIs.
All of these would be different namespace names, even if they resolved to the same Web document:
http://www.springframework.org/schema/security
http://www.Springframework.org/schema/security
httP://www.springframework.org/schema/security
http://www.springframework.org/schema/Security
https://www.springframework.org/schema/security
https://www.springframework.ORG/schema/security
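For instance, a namespace-aware DOM parse in Java treats the http and https variants as two unrelated namespaces, and it never opens a network connection to either of them. A minimal sketch (the tiny <http/> documents are just inline strings for illustration):

import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

public class NamespaceCompare {

    // Parse a small document and return the namespace name of its root element.
    static String namespaceOf(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true); // without this, getNamespaceURI() would return null
        Document doc = factory.newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        return doc.getDocumentElement().getNamespaceURI();
    }

    public static void main(String[] args) throws Exception {
        String a = namespaceOf("<http xmlns=\"http://www.springframework.org/schema/security\"/>");
        String b = namespaceOf("<http xmlns=\"https://www.springframework.org/schema/security\"/>");
        System.out.println(a.equals(b)); // false: namespace names are compared character by character
    }
}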

Related

HTTP Basic Authentication in URL supported or deprecated

On a project we spent considerable effort to work around basic authentication (because WebDriver tests depended on it, and WebDriver has no API for basic authentication), and I remember basic authentication in the URL clearly not working, i.e. we could not load http://username:password@url
Just google "basic authentication in url" and you will find tons of people complaining: https://medium.com/@lmakarov/say-goodbye-to-urls-with-embedded-credentials-b051f6c7b6a3
https://www.ietf.org/rfc/rfc3986.txt
Use of the format "user:password" in the userinfo field is deprecated.
Now today I told this quagmire to a friend and he said they are using http://username:password@url style basic authentication in WebDriver tests without any problem.
I went in my current Chrome v71 to a demo page and to my surprise I found it indeed working very well: https://guest:guest@jigsaw.w3.org/HTTP/Basic/
How is this possible?? Are we living in parallel dimensions at the same time? Which one is true: is basic authentication using credentials in the URL supported or deprecated? (Or was this maybe added back to Chrome due to complaints of which I can't find any reference?)
Essentially, deprecated does not mean unsupported.
Which one is true: is basic authentication using credentials in the URL supported or deprecated?
The answer is yes, both are true. It is deprecated, but for the most part (anecdotally) still supported.
From the medium article:
While you would not usually have those hardcoded in a page, when you open a URL like https://user:pass@host and that page makes subsequent requests to resources linked via relative paths, that’s when those resources will also get the user:pass@ part applied to them and banned by Chrome right there.
This means URLs like <img src=./images/foo.png> but not URLs like <a href=/foobar>zz</a>.
The rfc spec states:
Use of the format "user:password" in the userinfo field is
deprecated. Applications should not render as clear text any data
after the first colon (":") character found within a userinfo
subcomponent unless the data after the colon is the empty string
(indicating no password). Applications may choose to ignore or
reject such data when it is received as part of a reference and
should reject the storage of such data in unencrypted form. The
passing of authentication information in clear text has proven to be
a security risk in almost every case where it has been used.
Applications that render a URI for the sake of user feedback, such as
in graphical hypertext browsing, should render userinfo in a way that
is distinguished from the rest of a URI, when feasible. Such
rendering will assist the user in cases where the userinfo has been
misleadingly crafted to look like a trusted domain name
(Section 7.6).
So the use of user:pass@url is discouraged, and this is backed up by specific recommendations and reasons for disabling its use. It also states that apps may opt to reject the userinfo field, but it does not say that apps must reject it.
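If you want to avoid the deprecated userinfo form entirely, the usual alternative is to send the credentials in an Authorization header yourself. A sketch using Java 11's HttpClient against the W3C demo page mentioned above (guest:guest are the demo credentials from that page; adapt to your own endpoint):

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

public class BasicAuthHeader {
    public static void main(String[] args) throws Exception {
        // Build the Basic credentials by hand instead of embedding them in the URL.
        String credentials = Base64.getEncoder()
                .encodeToString("guest:guest".getBytes(StandardCharsets.UTF_8));

        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("https://jigsaw.w3.org/HTTP/Basic/"))
                .header("Authorization", "Basic " + credentials)
                .build();

        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.statusCode()); // 200 if the credentials were accepted
    }
}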

Is "http://" or "mailto:" considered as a namespace in semantic languages such as RDF?

Is it correct to say that http:// or mailto: are namespaces in RDF?
Where can I find a definition of what a namespace is? And can I say that a namespace is a URI?
RDF itself has no notion of namespaces. However, several RDF serialization formats (like RDF/XML and Turtle) use namespaces to abbreviate URIs, for example by using CURIEs. The CURIE spec mandates that the prefix be mapped to an IRI, so you couldn't map it to just "http://".
Found the solution by myself. Here is the answer:
A URI has the following format:
URI = scheme:[//authority]path[?query][#fragment]
A URI must contain a scheme and a path. http and mailto are only schemes and are not URIs on their own because the path is missing.
A namespace is defined by a URI. However, http:// and mailto: are not URIs, and that is why they are not namespaces.
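As a rough illustration using java.net.URI (whose parsing rules are close to, though not identical with, the generic grammar above), complete URIs decompose into a scheme plus further components, while bare strings like http:// or mailto: carry no usable path. The mailto address below is made up:

import java.net.URI;
import java.net.URISyntaxException;

public class UriParts {

    // Print the scheme and path of a candidate string, or the reason it is rejected.
    static void describe(String candidate) {
        try {
            URI uri = new URI(candidate);
            System.out.printf("%-32s scheme=%s path=%s%n",
                    candidate, uri.getScheme(), uri.getPath());
        } catch (URISyntaxException e) {
            System.out.printf("%-32s rejected: %s%n", candidate, e.getReason());
        }
    }

    public static void main(String[] args) {
        describe("http://www.w3.org/1999/xhtml"); // scheme + authority + path
        describe("mailto:john@example.org");      // opaque URI: getPath() returns null
        describe("http://");                      // bare scheme, no path in the input
        describe("mailto:");                      // bare scheme, nothing after the colon
    }
}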

Is it possible to use an invalid (non-existing) URI for a JSON schema definition?

Is it possible to use an invalid (non-existing) URI for a JSON schema definition?
So that I can specify it and use it for versioning, without needing to deploy it anywhere?
A URL is expected to resolve to the resource, so if you say "this is the URL for the schema" then that URL should resolve to the schema.
However, URLs are not the only sort of URI - it sounds like a URN might be what you want. In contrast to a URL (uniform resource locator), a URN (uniform resource name) is an identifier for a resource, but it doesn't carry a generic method to resolve it.
For example, the URN urn:ietf:rfc:2648 is an identifier for RFC 2648, but there isn't a standard way to get the RFC text from just that URN (you'd need some kind of special service that knew about urn:ietf:rfc:... URNs). If you used something like this, then it should (in theory) do what you want.
(You might run into trouble referencing one schema from another if your library is mistakenly assuming all URIs are URLs, but that would be a bug in your library.)
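As a sketch of what that could look like in a JSON Schema document (assuming a draft-07 style schema; the urn:example identifier and the property below are made up purely for illustration), the schema's $id can simply be a URN that encodes your version:

{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "$id": "urn:example:billing:invoice-schema:v2",
  "type": "object",
  "properties": {
    "invoiceId": { "type": "string" }
  }
}

Nothing needs to be served at that identifier; a conforming validator treats it as a name (and as the base for resolving relative $ref values), which matches the point about URNs above.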

How can I safely add user-supplied URLs to my HTML page?

As with any user supplied data, the URLs will need to be escaped and filtered appropriately to avoid all sorts of exploits. I want to be able to
Put user supplied URLs in href attributes. (Bonus points if I don't get screwed if I forget to write the quotes)
...
Forbid malicious URLs such as javascript: stuff or links to evil domain names.
Allow some leeway for the users. I don't want to raise an error just because they forgot to add an http:// or something like that.
Unfortunately, I can't find any "canonical" solution to this sort of problem. The only thing I could find as inspiration is the encodeURI function from JavaScript, but that doesn't help with my second point since it just does simple URL parameter encoding while leaving special characters such as : and / alone.
OWASP provides a list of regular expressions for validating user input, one of which is used for validating URLs. This is as close as you're going to get to a language-neutral, canonical solution.
More likely you'll rely on the URL parsing library of the programming language in use. Or, use a URL parsing regex.
The workflow would be something like the following (a sketch in code follows the list):
Verify the supplied string is a well-formed URL.
Provide a default protocol such as http: when no protocol is specified.
Maintain a whitelist of acceptable protocols (http:, https:, ftp:, mailto:, etc.)
The whitelist will be application-specific. For an address-book app the mailto: protocol would be indispensable. It's hard to imagine a use case for the javascript: and data: protocols.
Enforce a maximum URL length - this keeps the URL working across browsers and prevents attackers from polluting the page with megabyte-length strings. With any luck your URL-parsing library will do this for you.
Encode a URL string for the usage context. (Escaped for HTML output, escaped for use in an SQL query, etc.).
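A sketch of that workflow in Java, using java.net.URI for parsing; the whitelist, the length limit, and the method names are assumptions to be adapted per application:

import java.net.URI;
import java.net.URISyntaxException;
import java.util.Set;

public class UserUrlSanitizer {

    // Assumed policy values; adjust for your application.
    private static final Set<String> ALLOWED_SCHEMES = Set.of("http", "https", "ftp", "mailto");
    private static final int MAX_LENGTH = 2048;

    // Returns a normalized URL string, or null if the input should be rejected.
    static String sanitize(String input) {
        if (input == null) return null;
        String candidate = input.trim();
        if (candidate.length() > MAX_LENGTH) return null;   // enforce a maximum length

        URI uri;
        try {
            uri = new URI(candidate);                        // must be well-formed
        } catch (URISyntaxException e) {
            return null;
        }
        if (uri.getScheme() == null) {                       // default protocol when none given
            candidate = "http://" + candidate;
            try {
                uri = new URI(candidate);
            } catch (URISyntaxException e) {
                return null;
            }
        }
        if (!ALLOWED_SCHEMES.contains(uri.getScheme().toLowerCase())) {
            return null;                                     // whitelist of acceptable protocols
        }
        return uri.toASCIIString();                          // still encode for the output context!
    }

    public static void main(String[] args) {
        System.out.println(sanitize("example.com/page"));    // http://example.com/page
        System.out.println(sanitize("javascript:alert(1)")); // null (rejected)
    }
}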
Forbid malicious URLs such as javascript: stuff or links to evil domain names.
You can utilize the Google Safe Browsing API to check a domain for spyware, spam or other "evilness".
For the first point, regular attribute encoding works just fine (escape characters into HTML entities). Escaping quotes, the ampersand and the angle brackets is OK if attributes are guaranteed to be quoted. Escaping all other non-alphanumeric characters as well will make the attribute safe if it's accidentally left unquoted.
The second point is vague and depends on what you want to do. Just remember to use a whitelist approach instead of a blacklist one; it's possible to use HTML entity encoding and other tricks to get around most simple blacklists.
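For illustration, a minimal Java sketch of that whitelist-style attribute encoding; in practice a vetted encoder library (for example the OWASP Java Encoder) is preferable to a hand-rolled one:

public class AttributeEncoding {

    // Encode everything except letters and digits as a numeric character reference,
    // so the value stays inert even if the attribute ends up unquoted.
    static String escapeAttribute(String value) {
        StringBuilder out = new StringBuilder(value.length());
        for (char c : value.toCharArray()) {
            if (Character.isLetterOrDigit(c)) {
                out.append(c);
            } else {
                out.append("&#").append((int) c).append(';');
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Browsers decode the entities before interpreting the URL, so the link still works.
        String userUrl = "http://example.com/?q=1&lang=\"en\"";
        System.out.println("<a href=" + escapeAttribute(userUrl) + ">link</a>");
    }
}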

SAXParseException: Element type SOAP:Text must be followed by either attribute specifications, ">" or "/>"

I'm attempting to read a response from a web service call in a JUnit test running in Eclipse Galileo. I'm able to successfully receive responses except when the response is a SOAP fault. Then I get the following exception:
org.xml.sax.SAXParseException: Element type "SOAP:Text" must be followed by either attribute specifications, ">" or "/>"
I have validated the XML in LiquidXML Studio against the SOAP 1.2 schema and it checks out.
Here is the XML response that SAX appears to be choking on. It has been stripped to the minimum in an attempt to eliminate anything obvious (I even made sure it didn't have any self closing elements):
<SOAP:Envelope xmlns:SOAP="http://www.w3.org/2003/05/soap-envelope" xmlns:SOAP_ENC="http://schemas.xmlsoap.org/soap/encoding/">
<SOAP:Header>
</SOAP:Header>
<SOAP:Body>
<SOAP:Fault>
<SOAP:Code>
<SOAP:Value>SOAP:Sender</SOAP:Value>
<SOAP:Subcode>
<SOAP:Value>SOAP:Sender</SOAP:Value>
</SOAP:Subcode>
</SOAP:Code>
<SOAP:Reason>
<SOAP:Text xml:lang="">
</SOAP:Text>
</SOAP:Reason>
<SOAP:Node>
</SOAP:Node>
<SOAP:Role>
</SOAP:Role>
<SOAP:Detail>
</SOAP:Detail>
</SOAP:Fault>
</SOAP:Body>
</SOAP:Envelope>
Any help would be appreciated.
It's obviously not recognising 'xml:lang=""' as an attribute. Check with your XSD or XML schema which attributes are valid. Also, you should be using
xml:lang=""
rather than "", although most parsers forgive you for this.
I think the problem is in mapping the SOAP fault XML to its corresponding object.
It turns out the problem was related to a tool I was using to return static string responses to web service requests. The static response XML contained the xml:lang attribute. However, when the tool was returning the static string, it was modifying it on the way out and replacing xml:lang on-the-fly with the fully qualified namespace equivalent {http://www.w3.org/XML/1998/namespace}lang. When this response was received, the SAXParser was choking because it couldn't interpret the fully qualified equivalent.
The tool returning the static responses used a Groovy XML parser as an integral part of sending the response.
The Groovy XmlParser class has a constructor that takes validating and namespaceAware flags, which I had to set to false. So instead of XmlParser(), the tool now calls XmlParser(false, false).
Problem solved.
Thanks for the responses.