Comparing different DOM nodes: which ones are more performant?

Trying to reduce the number of DOM nodes, I did some research on this, but have not found any comparative numbers. For example, is it better to use two DOM elements instead of two pseudo-elements? Or is it better to use 20-character text nodes instead of 10-character DOM elements (assuming an element carries far more extra parameters than a text node; aren't they cached or something)?
The goal is to simplify working with a big DOM tree (mostly a table with some structure in each cell, about 30,000 DOM elements in total).
I've read the W3 specs on pseudo-elements, but did not find any useful info.
So are there any common rules, or can this only be discovered with benchmarks?
I've seen this question as well, but it did not help a lot, as my question is about comparing different kinds of nodes: which should be preferred to improve performance?
And yes, I know about the cell-reusing approach, but it also depends on the complexity of the cells (in IE11, scrolling lags a lot, much more than just rendering the entire structure at once).
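In the absence of published numbers, the practical answer is to benchmark your own markup. A minimal sketch of the kind of measurement meant here (the node counts and cell contents are made up for illustration, not taken from the question):

```typescript
// Rough comparison: time to build and lay out N cells as plain text nodes
// versus N cells wrapped in an extra <span>. Numbers and markup are
// illustrative only; run this against your real cell structure.
function measure(build: (cell: HTMLTableCellElement) => void, n = 10000): number {
  const table = document.createElement("table");
  const start = performance.now();
  for (let i = 0; i < n; i++) {
    const cell = table.insertRow().insertCell();
    build(cell);
  }
  document.body.appendChild(table);
  void table.offsetHeight;               // force layout so it is included in the timing
  const elapsed = performance.now() - start;
  table.remove();
  return elapsed;
}

console.log("text nodes:", measure(cell => { cell.textContent = "some cell text here"; }));
console.log("extra span:", measure(cell => {
  const span = document.createElement("span");
  span.textContent = "some text";
  cell.appendChild(span);
}));
```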

Related

Can too many HTML elements affect page performance?

I wonder if there's a difference between
1.) 10,000 table rows which are visible
2.) 10,000 table rows which are hidden using display: none
What I want to know is: if all 10,000 rows are visible on the page, could that cause the page scrolling to lag?
And if I hide, for example, 9,000 of them, could this reduce the lagging? Thanks, guys.
In general display: none; will save the browser some work.
The browser will start by parsing your HTML and building the so-called DOM (document object model); when all the CSS is received, it will go on to build the CSSOM (CSS object model). Those two combined give the render tree.
With the render tree in hand, the browser will perform a layout step (deciding where each element goes on the screen and how big it will be) and then paint the page on the screen.
When combining the DOM and CSSOM into the render tree, however, the browser will discard all subtrees that are display: none; and thus has less work to do in the layout and paint steps.
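As a rough illustration of that (the row count, function name, and class handling below are mine, not from the question), hiding most rows with display: none keeps them in the DOM but out of the render tree:

```typescript
// Hide all but the first `visible` rows; hidden rows stay in the DOM but
// are skipped when the render tree is built, so layout and paint have far
// fewer boxes to handle. Names and numbers are illustrative.
function hideRowsAfter(table: HTMLTableElement, visible: number): void {
  const rows = table.querySelectorAll("tr");
  rows.forEach((row, index) => {
    row.style.display = index < visible ? "" : "none";
  });
}

// e.g. keep 1,000 of the 10,000 rows on screen
hideRowsAfter(document.querySelector("table")!, 1000);
```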
Great question! It is not too broad; it is not discussed enough.
I'm in the middle of researching this based on a lazy-loading issue I'm having and cross-referencing other websites like Twitter and Reddit feeds.
Lighthouse flags pages with DOM trees that:
Have more than 1,500 nodes total.
Have a depth greater than 32 nodes.
Have a parent node with more than 60 child nodes.
For example, I see that Reddit scores a dismal 26.
"Avoid an excessive DOM size: 1,456 elements" is included as a recommended action.
For Reddit.com, Lighthouse says:
A large DOM will increase memory usage, cause longer style calculations, and produce costly layout reflows. Learn more.
React: Consider using a “windowing” library like react-window to minimize the number of DOM nodes created if you are rendering many repeated elements on the page. Learn more. Also, minimize unnecessary re-renders using shouldComponentUpdate, PureComponent, or React.memo, and skip effects only until certain dependencies have changed if you are using the Effect hook to improve runtime performance.
https://web.dev/dom-size/#how-to-optimize-the-dom-size
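For reference, the "windowing" approach Lighthouse mentions looks roughly like this with react-window (the component name, sizes, and item count are placeholders, not anything from the answers above):

```tsx
import * as React from "react";
import { FixedSizeList } from "react-window";

// Renders only the rows currently scrolled into view, so the DOM stays
// small even for a very long list. Dimensions here are placeholders.
const Feed = ({ items }: { items: string[] }) => (
  <FixedSizeList height={600} width={800} itemCount={items.length} itemSize={35}>
    {({ index, style }) => <div style={style}>{items[index]}</div>}
  </FixedSizeList>
);

export default Feed;
```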
Just ran into this question and wanted to put in my 2 cents as well.
Even though modern browsers are getting smarter about fast rendering and machines are getting faster, the best practice still stands: don't render too many table rows. Use pagination.
It also depends on how you render your table rows. If you're using JS to render them, it will definitely have a negative impact on scroll performance. There are very good JS templating solutions with which you can minimize JS execution overhead. Call me old-school, but I'd rather use less JS rendering on the client.
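A bare-bones version of that pagination idea (the page size, element id, and data shape are made up for the sketch):

```typescript
// Render only the current page of rows instead of all of them.
// Data shape, page size, and the #data table id are illustrative.
function renderPage(rows: string[][], page: number, pageSize = 50): void {
  const tbody = document.querySelector("#data tbody")!;
  tbody.innerHTML = "";                        // drop the previous page's rows
  const start = page * pageSize;
  for (const rowData of rows.slice(start, start + pageSize)) {
    const tr = document.createElement("tr");
    for (const value of rowData) {
      const td = document.createElement("td");
      td.textContent = value;
      tr.appendChild(td);
    }
    tbody.appendChild(tr);
  }
}
```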

What happened to the "Use efficient CSS selectors" rule?

There was a recommendation by Google PageSpeed that asked web developers to Use efficient CSS selectors:
Avoiding inefficient key selectors that match large numbers of elements can speed up page rendering.
Details
As the browser parses HTML, it constructs an internal document tree representing all the elements to be displayed. It then matches elements to styles specified in various stylesheets, according to the standard CSS cascade, inheritance, and ordering rules. In Mozilla's implementation (and probably others as well), for each element, the CSS engine searches through style rules to find a match. The engine evaluates each rule from right to left, starting from the rightmost selector (called the "key") and moving through each selector until it finds a match or discards the rule. (The "selector" is the document element to which the rule should apply.)
According to this system, the fewer rules the engine has to evaluate the better. [...]. After that, for pages that contain large numbers of elements and/or large numbers of CSS rules, optimizing the definitions of the rules themselves can enhance performance as well. The key to optimizing rules lies in defining rules that are as specific as possible and that avoid unnecessary redundancy, to allow the style engine to quickly find matches without spending time evaluating rules that don't apply.
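To make the right-to-left evaluation described in that passage concrete, here is a deliberately simplified sketch (not how any real engine is implemented, and the selectors are invented) of matching a descendant-combinator rule against one element:

```typescript
// Simplified sketch of right-to-left matching for "div.content span.note":
// start from the key (rightmost) compound on the candidate element, then
// walk up the ancestors looking for the remaining compounds.
function matchesRule(element: Element, compounds: string[]): boolean {
  const key = compounds[compounds.length - 1];
  if (!element.matches(key)) return false;     // cheap rejection on the key selector

  let remaining = compounds.length - 2;
  let ancestor = element.parentElement;
  while (ancestor && remaining >= 0) {
    if (ancestor.matches(compounds[remaining])) remaining--;
    ancestor = ancestor.parentElement;
  }
  return remaining < 0;                        // all ancestor compounds were found
}

// A rule whose key matches many elements ("span", "*") forces this ancestor
// walk for every candidate, which is what the old recommendation warned about.
matchesRule(document.querySelector("span.note")!, ["div.content", "span.note"]);
```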
This recommendation has been removed from the current PageSpeed Insights rules. Now I am wondering why this rule was removed. Did browsers become more efficient at matching CSS rules in the meantime? And is this recommendation still valid?
In Feb 2011, Webkit core developer Antti Koivisto made several improvements to CSS selector performance in Webkit.
Antti Koivisto taught the CSS Style Selector to skip over sibling selectors and do faster sorting, which brought some minor improvements, after which he landed two more awesome patches: one which enables ancestor identifier filtering for tree building, halving the remaining time in style matching over a typical page load, and a fast path for simple selectors that speeds up matching by another 50% on some websites.
CSS Selector Performance has changed! (For the better) by Nicole Sullivan runs through these improvements in greater detail. In summary -
According to Antti, direct and indirect adjacent combinators can still be slow, however, ancestor filters and rule hashes can lower the impact as those selectors will only rarely be matched. He also says that there is still a lot of room for webkit to optimize pseudo classes and elements, but regardless they are much faster than trying to do the same thing with JavaScript and DOM manipulations. In fact, though there is still room for improvement, he says:
“Used in moderation pretty much everything will perform just fine from the style matching perspective.”
While browsers are much faster at matching CSS selectors, it's worth reiterating that CSS selectors should still be optimised (e.g. kept as 'flat' as possible) to reduce file sizes and avoid specificity issues.
Here's a thorough article (which is dated early 2014)
I am quoting Benjamin Poulain, a WebKit Engineer who had a lot to say about the CSS selectors performance test:
~10% of the time is spent in the rasterizer.
~21% of the time is spent on the first layout.
~48% of the time is spent in the parser and DOM tree creation.
~8% is spent on style resolution.
~5% is spent on collecting the style – this is what we should be testing and what should take most of the time.
(The remaining time is spread over many many little functions.)
And he continues:
“I completely agree it is useless to optimize selectors upfront, but for completely different reasons:
It is practically impossible to predict the final performance impact of a given selector by just examining the selectors. In the engine, selectors are reordered, split, collected and compiled. To know the final performance of a given selector, you would have to know in which bucket the selector was collected, how it is compiled, and finally what the DOM tree looks like.
All of that is very different between the various engines, making the whole process even less predictable.
The second argument I have against web developers optimizing selectors is that they will likely make things worse. The amount of misinformation about selectors is larger than the correct cross-browser information. The chance of someone doing the right thing is pretty low.
In practice, people discover performance problems with CSS and start removing rules one by one until the problem goes away. I think that is the right way to go about this; it is easy and will lead to the correct outcome.”
There are approaches, like BEM for example, which model the CSS as flat as possible, to minimize DOM-hierarchy dependency and to decouple web components so they can be "moved" around the DOM and still work.
Maybe because writing CSS for CMSes or frameworks is more common now, and in that context it's hard to avoid general CSS selectors, which are used precisely to keep the stylesheet from getting too complex.
Also, modern browsers are really fast at rendering CSS. Even with huge stylesheets on IE9, it did not feel like the rendering was slow. (I must admit I tested on a good computer. Maybe there are benchmarks out there).
Anyway, I think you must write very inefficient CSS to slow down Chrome or Firefox...
There's a two-year-old post on performance: Which CSS selectors or rules can significantly affect front-end layout / rendering performance in the real world?
I like his one-liner conclusion: Anything within the limits of "yeah, this CSS makes sense" is okay.

Storing site data in columns or rows

This is a question about the best practice for storing data from a web page, like texts, image URLs, links, etc.
I have a CMS where you can create web pages. Here you can edit texts and upload images. In the future it would also be nice to "add new elements", add links to a-tags, etc.
I need a robust and flexible solution that also has good performance, both when reading and when writing this data.
Let's say I have 1,000 pages, each with around 25 elements that can be updated and stored in the database.
Alternative 1)
Create a table with one column for each element on these pages, for example columns like:
title_1, title_2, image_1, image_2.
Here we have a set of columns that we can update and then use on the web page.
Alternative 2)
Create 1 table with the columns (id, namespace, page_id, data)
For each element on the page, I add the namespace in association with the page_id to make the data output unique. In the data column I can store any kind of information: text, links, etc.
What do you suggest as a good solution for this issue? I'm of course also open to other alternatives.
Thanks!
I would recommend option two, with the addition of a column identifying the element id and/or type, if the element id is indeed comparable across pages. That is to say, if anchor text (say) is always stored as element id = 4, then you would want that element id so that you could compare anchor texts across multiple documents.
If, on the other hand (and this is the scenario I imagine is more likely), you may have 1-25 elements on a page and each of them could be different (e.g. document one has three anchor texts and four images, document two has one anchor text and no images, etc.), it would make sense to add an element_type_id table that stores a bit of information about the element types. This is assuming that you ever have any interest in comparing (say) images across multiple documents, or anchor texts across multiple documents, etc.
Another thing to consider: if you are likely to see the same element over and over again, it actually makes more sense to effectively parameterize those elements by way of a lookup table. So basically store each (say) unique anchor text in one table and reference its id in your actual data table.
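A rough sketch of that shape, modeled here as TypeScript types rather than SQL for illustration only (every table and field name below is made up, not taken from the question or the answer):

```typescript
// Illustrative model of "Alternative 2" plus the suggested type lookup.
// Names are hypothetical.

interface ElementType {
  id: number;            // e.g. 1 = title, 2 = image, 3 = anchor text
  name: string;
}

interface PageElement {
  id: number;
  pageId: number;        // which page the element belongs to
  namespace: string;     // keeps the element unique within its page
  elementTypeId: number; // points at ElementType, enabling cross-page comparison
  data: string;          // text, image URL, link target, ...
}

// Example row: the second title on page 42
const example: PageElement = {
  id: 1001,
  pageId: 42,
  namespace: "title_2",
  elementTypeId: 1,
  data: "About our company",
};
```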
If I may add one additional thing: SO may not be the best place for the particular question you are asking. I'm not totally sure of that and maybe I'm wrong... but I would poke around the Stack Exchange network and see if other forums deal more closely with the type of question you are asking. At the very least, I'd observe that your question is fairly vague, and the goal of achieving a "robust and flexible solution that also has good performance, both when reading and when writing this data" is not likely to be accomplished simply by asking for advice on SO. There is a LOT that goes into data architecture, and certainly many of the details I would consider important in designing this myself are not present in your question. And if you're not sure what those details are, I am not sure SO is really the best place to set about learning them. I think https://softwareengineering.stackexchange.com/ may be a better fit for this question.
Just my opinion, and I could be wrong. Either way, I would consider learning a bit about database normal forms (http://www.bkent.net/Doc/simple5.htm or Google it) as well as doing a little research on the types of design considerations that go into building a database (an old but still good SO article on that is here: What are the most important considerations when designing a database?)

Unused classes in my HTML

Do unused class names on HTML elements impede rendering performance (when there is no corresponding rule in the stylesheet)?
E.g.: I have a number of game types that each have a fixed number set, and some game types require the number set to be styled differently. The game-type key is added to the parent of all game types to allow the number set to be styled differently for each game type if required, although most use the default styling and as such have unused classes.
Not in terms of active performance. It will only give your stylesheets themselves more data weight to download, and even that might be trivial once you factor in browser caching.
Classes are matched lazily and aren't pulled wholesale into your browser's rendering set. They are only searched for when they are needed. If they are never used, they will not impact the performance of your website.
There's one final note, though: if you use long class chains (.abc .def:nth-child(1) .ghi) with complex selectors, it might take some browsers a bit of time (fractions of milliseconds) to figure out what's happening. You really need to benchmark these situations yourself, and the results may differ strongly per browser.
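If you do want a quick way to benchmark that, timing querySelectorAll is one crude proxy (only a proxy, since real style recalculation involves more than this); the selectors below are placeholders borrowed from the example above:

```typescript
// Crude timing of selector matching cost via querySelectorAll, which
// exercises the selector engine but is only an approximation of the
// matching done during style recalculation. Selectors are placeholders.
function timeSelector(selector: string, iterations = 1000): number {
  const start = performance.now();
  for (let i = 0; i < iterations; i++) {
    document.querySelectorAll(selector);
  }
  return performance.now() - start;
}

console.log("simple class:", timeSelector(".ghi"));
console.log("complex chain:", timeSelector(".abc .def:nth-child(1) .ghi"));
```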
The browser will have to read those, yes, and this will certainly take a little time, but it should not be anywhere near enough time to affect performance in any way you would notice (unless we're talking hundreds or thousands of unused classes maybe). I would consider this a micro-optimization and move on.

Differences in query algorithms between XPath and CSS

I'm wondering why someone would want to use CSS selectors rather than XPath selectors, or vice versa, if they could use either one. I think that understanding the algorithms that process the two languages will resolve my question.
There's a lot of documentation on XPath and CSS selectors individually, but I've found very few comparisons. Also, I don't use CSS selectors that much.
Here's what I've read about the differences. (These three references discuss the use of XPath and CSS selectors in Selenium to query HTML, but my question is general.)
XPath allows traversal from child to parent
CSS selectors have features specific to HTML
CSS selectors are faster when you're using Internet Explorer in Selenium
It looks like CSS selection algorithms are somehow optimized for HTML, but I don't know how.
Is there a paper on how CSS and XPath query algorithms work and how they differ?
Are there other abstract differences between the languages that I'm missing?
The main difference is in how stable the document structure you target is:
XPath is a good query language when the structure matters and/or is stable. You usually specify a path, conditions, an exact offset... It is also a good query language for retrieving a set of similar objects, and because of that it has an intimate relationship with XQuery. Here the document has a stable structure and you must retrieve repeated/similar sections.
CSS selectors suit CSS stylesheets better. These do not care about the document structure, because that changes a lot. Think of one CSS stylesheet applied to all the HTML pages of a website: the content and structure of every page is different. Here CSS selectors are better because of that changing structure. You will notice that access is more tag-based. Most CSS syntax specifies a set of elements, attributes, ids, classes... and not so much their structure. Here you must locate sections that do not have a clear location within the document structure but are marked with certain attributes.
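As a concrete illustration of the two styles (the class names and expressions below are invented for the example), here is the same lookup expressed both ways using the browser's built-in APIs:

```typescript
// CSS: "any <a> with class 'external' inside a <li>"
const viaCss = document.querySelectorAll("li a.external");

// XPath: the same elements. XPath could also step back up to the parent <li>,
// which CSS selectors cannot do.
const viaXPath = document.evaluate(
  "//li//a[contains(concat(' ', normalize-space(@class), ' '), ' external ')]",
  document,
  null,
  XPathResult.ORDERED_NODE_SNAPSHOT_TYPE,
  null
);

for (let i = 0; i < viaXPath.snapshotLength; i++) {
  const anchor = viaXPath.snapshotItem(i) as HTMLAnchorElement;
  console.log(anchor.href);
}
console.log(viaCss.length, viaXPath.snapshotLength); // the counts should match
```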
Update: After a closer look at your question I realized that you are more interested in the current implementations, not the nature of the query languages. In that case I cannot give you the answer you are looking for. I can only suppose that the reason is still that one is more dependent on the structure than the other.
For example, in XPath you must keep track of the structure of the document you are working on. On the other hand, CSS selectors are triggered when a specific tag shows up, and it usually does not matter what came before it. I can imagine that it is much easier to implement a CSS selector algorithm that works as you read a document, while XPath has more cases where you really need the full document and/or strict tracking of what has been read (because the history and background of what you are reading matter more).
Now, do not take my update too seriously. I am only guessing here, because I have some background in language parsing but no actual experience with the languages designed for data querying.