First let me give a simple overview of how a page loads, then I'll ask a question about it.
The browser fetches the HTML => parses it => creates nodes => parses the nodes and starts converting them to DOM elements => finds a style node, so it starts creating the CSSOM => when parsing finishes, if there was a style tag, it waits for the CSSOM tree to be constructed => once both are finished it merges the DOM and the CSSOM and fires the DOMContentLoaded event.
So in summary, as soon as the CSSOM is ready the browser starts rendering, and the DOM can be added to incrementally.
This is all fine, but how does the flow go when the browser starts rendering the page before the whole HTML is loaded? (For example, in Node.js you can send partial HTML, wait 2s, and then send more.)
What if there was another style tag at the bottom of the page? Not having received all of the HTML, and with no CSS yet, the browser would start rendering, but from my understanding rendering only occurs after the CSSOM has been completely built.
What happens to a script tag? If the CSS isn't done processing, the script tag isn't executed, and thus parsing also stops. JS is run after the CSSOM is complete.
Things may block the DOMContentLoaded Event, but that does not prevent rendering of the incomplete page. That can be important for very long pages streamed from a slow server.
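To try the slow-server case yourself, here is a minimal Node.js sketch. It assumes a plain http server; streamPage, startServer, the port and the 2 second delay are made-up names and values for illustration, not anything from the question. Whether the first chunk is painted before the second one arrives depends on the browser.

```javascript
const http = require('http');

// Sends the first part of the page immediately and the rest after delayMs.
function streamPage(res, delayMs) {
  res.writeHead(200, { 'Content-Type': 'text/html; charset=utf-8' });
  res.write('<!DOCTYPE html><html><head><title>Streamed</title></head>' +
            '<body><h1>First chunk</h1>');
  setTimeout(() => {
    // A style tag that arrives late applies to content that was sent earlier.
    res.end('<style>h1 { color: red; }</style><p>Second chunk</p></body></html>');
  }, delayMs);
}

// Hypothetical entry point: nothing listens until you call startServer(port).
function startServer(port) {
  return http.createServer((req, res) => streamPage(res, 2000)).listen(port);
}
```

Call startServer(8080), open http://localhost:8080/ and watch the page while the second chunk is still pending.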
Browsers can and do interleave script execution, re-styling, and rendering with document parsing. This can be trivially shown by executing JavaScript in the <head> and querying the DOM: you will see that the document does not have all of its nodes (possibly not even a body element) before the DOMContentLoaded event has fired.
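A small sketch of that experiment, assuming the code is placed in a classic (not async, not deferred) script inside <head>. The snapshot helper is a made-up name; the guard only keeps the file from failing outside a browser.

```javascript
// Summarizes how much of the document exists at the moment of the call.
function snapshot(doc) {
  return {
    readyState: doc.readyState,   // "loading" while the parser is still running
    hasBody: doc.body !== null,   // null until the parser has reached <body>
    elementCount: doc.getElementsByTagName('*').length
  };
}

// Browser only: log once during parsing and once when parsing has finished.
if (typeof document !== 'undefined') {
  console.log('while parsing:', snapshot(document));
  document.addEventListener('DOMContentLoaded', () => {
    console.log('after DOMContentLoaded:', snapshot(document));
  });
}
```

Comparing the two log lines shows how many elements the parser added after the script ran.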
You have to think of document construction more as a stream than sequentially executed blocks that run to completion before the next block starts.
CSSOM construction stops parsing, and thus the execution of subsequent script tags, and it also delays rendering.
Script tags before style tags will execute before the CSS from the later style tags is loaded into the CSSOM.
Style tags that come after script tags will alter the CSSOM. If a script read styles that are later altered, what it read is outdated. Order matters.
Parsing is stopped, not just rendering.
JavaScript blocks parsing because it can modify the document. CSS
can’t modify the document, so it seems like there is no reason for it
to block parsing, right?
However, what if a script asks for style information that hasn’t been
parsed yet? The browser doesn’t know what the script is about to
execute—it may ask for something like the DOM node’s background-color
which depends on the style sheet, or it may expect to access the CSSOM
directly.
Because of this, CSS may block parsing depending on the order of
external style sheets and scripts in the document. If there are
external style sheets placed before scripts in the document, the
construction of DOM and CSSOM objects can interfere with each other.
When the parser gets to a script tag, DOM construction cannot proceed
until the JavaScript finishes executing, and the JavaScript cannot be
executed until the CSS is downloaded, parsed, and the CSSOM is
available.
https://hacks.mozilla.org/2017/09/building-the-dom-faster-speculative-parsing-async-defer-and-preload/
A few important facts:
The DOMContentLoaded event is fired when the document has been fully parsed by the main parser AND the DOM has been completely built.
Any normal script (not async or deferred) effectively blocks DOM construction. The reason is that a script can potentially alter the DOM.
Referencing a stylesheet neither blocks the parser nor blocks DOM construction.
If you add a <script> (be it external or inline) after referencing a stylesheet, the execution (not the fetching) of the script is delayed until the stylesheet has been fetched and parsed, even if the script's fetch finishes sooner. The reason is that the script may depend on the to-be-loaded CSS rules, so the browser has to wait.
Only in this case does CSS block the document parser and DOM construction, indirectly.
When the browser is blocked on a script, a second lightweight parser scans the rest of the markup looking for other resources (e.g. stylesheets, scripts, images) that also need to be retrieved. This is called "pre-loading".
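The ordering rule in these facts can be written down as a toy model. This is not how any browser is implemented. It assumes every resource starts fetching at t = 0 (as if the preload scanner found them all) and that script run time is zero; scriptExecutionTimes and the resource objects are invented for illustration.

```javascript
// resources: document-ordered list of { type, name, readyAt }, where readyAt
// is the time (ms) at which the resource is fetched (and, for CSS, parsed).
function scriptExecutionTimes(resources) {
  let cssReadyAt = 0;   // when every stylesheet seen so far is available
  let parserFreeAt = 0; // when the previous blocking script has run
  const result = [];
  for (const r of resources) {
    if (r.type === 'stylesheet') {
      // A stylesheet does not block the parser by itself.
      cssReadyAt = Math.max(cssReadyAt, r.readyAt);
    } else if (r.type === 'script') {
      // A script waits for its own fetch, for earlier stylesheets,
      // and for earlier scripts.
      const runsAt = Math.max(r.readyAt, cssReadyAt, parserFreeAt);
      parserFreeAt = runsAt;
      result.push({ name: r.name, runsAt });
    }
  }
  return result;
}

const timeline = scriptExecutionTimes([
  { type: 'script', name: 'early.js', readyAt: 100 },
  { type: 'stylesheet', name: 'main.css', readyAt: 500 },
  { type: 'script', name: 'late.js', readyAt: 200 }
]);
console.log(timeline);
```

In this example early.js runs at 100 because no stylesheet precedes it, while late.js is held until 500 even though its own fetch finished at 200.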
I have a small web app with an <a href="> tag that is rendered only at runtime. I mean it does not appear in the HTML page source; it appears only in Chrome developer tools. How can I eliminate a tag like this, or override or edit it?
I tried editing the code via its script. After inspecting for a long time, it turned up document.querySelector("#m360CrA483349594983"), but when I search the entire project there's no sign of it!
demos written can be eliminated for sure!
I even tried enabling Local Overrides, but nothing seems to work; that mysterious tag keeps coming back, saying
"demos written can be eliminated for sure!"
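Once the DOM is ready you can access the element:
<script>
// An inline script runs as soon as the parser reaches it, so wait for
// DOMContentLoaded before looking the element up.
document.addEventListener("DOMContentLoaded", function () {
// the DOM will be available here
var element = document.getElementById("m360CrA483349594983");
console.log(element);
});
</script>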
Add a body onload event listener in order to search for the tag that appears at runtime.
Also, pay attention to the tag's ID value: it seems to be generated anew every time, so you won't be able to find it by a static ID.
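Since the ID cannot be relied on, another option is to watch for insertions rather than search for a known ID. The sketch below uses MutationObserver and assumes it runs early (for example in <head>) so that it is active before the tag is injected. addedElements is a made-up helper name.

```javascript
// Returns the element nodes contained in a batch of MutationRecords.
function addedElements(mutations) {
  const found = [];
  for (const mutation of mutations) {
    for (const node of mutation.addedNodes) {
      if (node.nodeType === 1) found.push(node); // 1 === Node.ELEMENT_NODE
    }
  }
  return found;
}

// Browser only: report every <a> that gets inserted, whatever its ID is.
if (typeof MutationObserver !== 'undefined') {
  const observer = new MutationObserver((mutations) => {
    for (const el of addedElements(mutations)) {
      // The anchor may be the added node itself or sit inside an added subtree.
      const anchors = el.tagName === 'A' ? [el] : el.querySelectorAll('a');
      for (const a of anchors) console.log('anchor inserted at runtime:', a);
    }
  });
  observer.observe(document.documentElement, { childList: true, subtree: true });
}
```

Once the element is logged you can inspect it, remove it with a.remove(), or set a DOM breakpoint on its parent in developer tools to find the script that inserts it.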
Consider the following HTML code:
<script>setTimeout(() => document.querySelector('link').remove(), 0);</script>
<link rel="stylesheet" href="http://localhost:8080/test.php">
where http://localhost:8080/test.php is a link to a simple PHP script which just waits 5 seconds (<?php sleep(5);).
The script removes the link tag as expected, but browser doesn't abort the request to the stylesheet. This doesn't make sense, because when the request is complete, browser doesn't apply the styles anyway. Is this a browser bug, or is there an explanation for this behavior in the specification?
This happens in Chrome and Firefox; I didn't test other browsers.
In a browser, the layout engine always parses the HTML from top to bottom, sequentially. However, the request to get the CSS happens in parallel: because CSS never changes the DOM tree, there is nothing to worry about.
Since style sheets don't change the DOM tree, there is no reason to
wait for them and stop the document parsing
Resource: read Parsing scripts > The order of processing scripts and style sheets
The main reason not to abort the CSS request is because it causes no harm. The effort to abort it would be much more painful.
However, note that:
Webkit blocks scripts only when they try to access certain style
properties that may be affected by unloaded style sheets.
Image: WebKit Layout Engine. Credits - http://taligarsiel.com
I've been reading up on the HTML parser, and your case seems a bit like the chicken-and-egg story, because naturally a <script> is supposed to block parsing, but there are exceptions to this (the async and defer attributes)...
What you are doing is a bit different...
By addressing the document in the script tag, the browser is forced to stop parsing and create a document with the tag you are addressing, without executing the script itself again in the new document, and thus it requests the tag's related content before the DOM is able to remove the element...
the DOM parsing process in the browser consists of a number of steps :
*If the document.write() method was called from script executing inline (i.e., executing because the parser parsed a set of script tags), then this is a reentrant invocation of the parser.
Edit :
A 'script' element is processed as follows:
If the 'script' element's "already processed" flag is true or if the element is not in the document tree, then no action is performed and these steps are ended.
If the 'script' element references external script content, then the external script content using the current value of the 'xlink:href' attribute is fetched. Further processing of the 'script' element is dependent on the external script content, and will block here until the resource has been fetched or is determined to be an invalid IRI reference.
The 'script' element's "already processed" flag is set to true.
If the script content is inline, or if it is external and was fetched successfully, then the script is executed. Note that at this point, these steps may be re-entrant if the execution of the script results in further 'script' elements being inserted into the document.
Note that a load event is dispatched on a 'script' element once it has been processed, unless it referenced external script content with an invalid IRI reference and 'externalResourcesRequired' was set to 'true'.
The Load Event - The event is triggered at the point at which the user agent (Browser) finishes loading the element and any dependent resources (such as images, style sheets, or scripts). In the case the element references a script, the event will be raised only after an attempt to interpret the script has been made. Dependent resources that fail to load will not prevent this event from firing if the element that referenced them is still in the document tree unless they are designated as externalResourcesRequired. The event is independent of the means by which the element was added to DOM tree.
If you are asking why the event loop is built that way, I can't give you a definitive answer, nor can I suggest a better way for it to operate. But in the comments you asked what the specifications say about this condition, and they state that it is due to historical reasons, as quoted below:
HTML 5.1 W3C Recommendation, 1 November 2016
7. Web application APIs
7.1.4.2. Processing model
Some of the algorithms in this specification, for historical reasons, require the user agent to pause while running a task until a condition goal is met. This means running the following steps:
If necessary, update the rendering or user interface of any Document or browsing context to reflect the current state.
Wait until the condition goal is met. While a user agent has a paused task, the corresponding event loop must not run further tasks, and any script in the currently running task must block. User agents should remain responsive to user input while paused, however, albeit in a reduced capacity since the event loop will not be doing anything.
In which case can the DOM generate 2 trees?
I had this question on a test and I said this happens when we have 2 HTML documents in the same web page.
Is this true?
There are a number of ways to do that, depending on how you define "tree".
You can have an <iframe> in your document, but that tree will have its own window, and will not be directly connected to your original tree.
You can have an <html> element inside your HTML (which is invalid HTML, but will still work), but that will actually be a subtree.
You can use DOM APIs to build a detached <html> element.
Simply instantiating a separate Document object, e.g. through DOMParser, an XHR with .responseType = "document" or with the DOMImplementation.createDocument factory method would create independent DOM trees.
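A minimal sketch of the last option, using DOMParser and DOMImplementation.createDocument. The buildIndependentTrees helper and the sample markup are made up for illustration; the window object is passed in so the function itself does not depend on globals.

```javascript
// Creates two Document objects that are independent of the page's own tree.
function buildIndependentTrees(win) {
  const parsed = new win.DOMParser()
    .parseFromString('<p>parsed tree</p>', 'text/html');
  const created = win.document.implementation
    .createDocument('http://www.w3.org/1999/xhtml', 'html', null);
  return { parsed, created };
}

// Browser only: neither document is the page's document.
if (typeof window !== 'undefined') {
  const { parsed, created } = buildIndependentTrees(window);
  console.log(parsed !== document, created !== document);
  // Nodes in the parsed tree belong to that tree, not to the page.
  console.log(parsed.body.firstChild.ownerDocument === parsed);
}
```

Nodes from either tree only show up on the page if you explicitly move them there, for example with document.importNode or document.adoptNode.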
I'm working on a project where we went from XHTML to HTML back to XHTML and there are some definite behavioral changes going back with regards to the page rendering before the CSS loads and scripts that read styles reading them before the CSS loads. Can anyone shed some light on why the following is happening and what can be done about it?
Basically, I have a page with the following structure:
<body>
<!-- Content from Source A -->
<link href="http://a.example.com/style.css" />
<header>...</header>
<!-- Content from Source B -->
<link href="http://b.example.com/style.css" />
<div>...</div>
<!-- Content from Source A -->
<footer>...</footer>
<script src="http://a.example.com/script.js">
/* e.g. */
alert($('header').offset().height);
</script>
</body>
When we were in HTML rendering mode, the page blocks rendering at expected points. When we hit the Source A CSS, rendering pauses (blank screen); when we hit the Source B CSS, rendering pauses (header is visible). When we hit the Source A JavaScript, rendering pauses (full page shown) and the script reads element styles from their rendered state. (In reality, of course, WebKit doesn't stop parsing the DOM or executing JavaScript while the CSS loads, but it does halt execution at the first point where the script needs to read a style.)
When we are in XHTML mode, the page doesn't halt rendering at all and will render the entire page completely unstyled. After that, it appears to process the scripts and stylesheets in the order loaded, or rather it executes the scripts in order but doesn't wait for the stylesheet to load before executing a loaded script. This means that the page will render three times (unformatted, with one stylesheet, and with two stylesheets) and the script may infer completely inaccurate values for element sizes.
Can someone shed light on this? This is happening in all WebKit browsers I've tested, including Chrome 17, Mobile Safari 5, and Android Browser 2.1. Is there any way to ensure HTML render ordering without resorting to the text/html mime type?
WebKit uses libxml2 to handle XML, which sends the parsed XHTML back to WebCore and JavaScriptCore to do the CSS rendering and JavaScript execution.
Stylesheet and script tags link to what's called an external entity in XML terminology. That means they are processed last. The XML spec says:
Except when standalone="yes", they must not process entity declarations or attribute-list declarations encountered after a reference to a parameter entity that is not read, since the entity may have contained overriding declarations; when standalone="yes", processors must process these declarations.
Since standalone="yes" declares that no external markup declarations affect the content of the XML document, this triggers a different processing model.
Link tags are handled differently than xml-stylesheet processing-instructions. The XML stylesheet spec says:
Any links to style sheets that are specified externally to the document (e.g. Link headers in some versions of HTTP [RFC2068]) are considered to create associations that occur before the associations specified by the xml-stylesheet processing instructions. The application is responsible for taking all associations and determining how, if at all, their order affects its processing.
Try commenting out the script tags and converting the link tags to xml-stylesheet processing instructions. Also, try adding standalone="yes" to the XML declaration:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<?xml-stylesheet href="foo.css"?>
In addition, the use of special characters, entities, and XSLT can further complicate the picture, since the processing model differs between HTML and an XML dialect like XHTML:
The range of allowed chars in XML is defined by the XML spec, and
the range is fully checked by libxml2. Not a concern, unless you parse
this for example with an HTML parser and give the preparsed tree to
libxml2 to serialize back. I hope you're not doing this as XSLT is
an XML language and must be parsed by an XML parser.
References
libxml2 Parser Internals
blink-dev => Intent to Deprecate and Remove: XSLT
blink-dev => Security: libxml2 growBuffer integer overflow on 64-bit machines
blink-dev => Stack-buffer-overflow in xmlSerializeHexCharRef
Webkit Title Index
I'm running a content-based website, and I usually use Ajax to dynamically add items to the content list. Every time I update my item structure I have to change my JavaScript to fit the new structure. Is there any way to keep the script stable regardless of changes to the HTML?
Simple, instead of using the DOM to handle your data, process everything upon completion of the ajax request and only then call a function that has all of your data display functionality. Obviously you can't get away from having to change some code somewhere when you for instance rename HTML elements but you can separate concerns so that you only have to touch code in one place.
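One way to read that advice, as a rough sketch: keep every piece of markup knowledge in a single render function and have the Ajax callback just pass data to it. The item fields (title, summary), the class name and the function names are assumptions for illustration, not part of the question.

```javascript
const HTML_ESCAPES = { '&': '&amp;', '<': '&lt;', '>': '&gt;', '"': '&quot;', "'": '&#39;' };

// Escapes text so that item data cannot inject markup.
function escapeHtml(value) {
  return String(value).replace(/[&<>"']/g, (ch) => HTML_ESCAPES[ch]);
}

// The only function that knows the item markup; edit it when the structure changes.
function renderItem(item) {
  return '<li class="item"><h3>' + escapeHtml(item.title) + '</h3>' +
         '<p>' + escapeHtml(item.summary) + '</p></li>';
}

function renderList(items) {
  return items.map(renderItem).join('');
}

// Browser-side glue: fetch the data, then hand it to the display function.
function loadItems(url, container) {
  return fetch(url)
    .then((response) => response.json())
    .then((items) => { container.innerHTML = renderList(items); });
}
```

With this split, a change to the item structure touches renderItem only; loadItems and whatever calls it stay the same.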
I do quite a bit of this in my app, and I follow the same pattern every time:
The view page fires an Ajax call to another page, which I call the "dispatcher". I use this pattern because I want plain text output without header, footer, other JS, etc., so the dispatcher is a simple page that gets the request from the Ajax call, fires the appropriate PHP functions, and echoes the results. In some cases it will return JSON strings, while in others it will return HTML or plain text. For your example, return HTML from your server-side language.
Back in the Ajax success callback, set the inner HTML (.html()) of an element to the returned HTML content. Have your server-side language do the work of assembling the HTML (or even text if you're so inclined), because that takes far less work and overhead.
Not too bad, huh?