HTML Parsing in web browsers? - html

I’m a beginner and I’m wondering about this question.
Right now I’m looking at WebKit (a web browser engine).
What I want to know is how a web browser handles the HTML data coming in from the network.
E.g. how does it receive the data and parse it?
Very specifically, I want to know about the HTML parser!
If you have the WebKit code base, you can find a part called WebCore.
In WebCore, there is an HTML module.
I think this is where the HTML parser lives.
But it is very tough for me to understand that code without knowing the basics.
So please help me.

You would need some basic understanding of formal language definitions and compilers. Without this knowledge, looking at the parser code for 1000 years is futile.
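To give a flavor of what those basics look like in practice, here is a minimal sketch of the two classic stages, tokenizing and tree building. It is not WebKit's code: `tokenize` and `buildTree` are made-up names, and the sketch assumes well-formed input with no attributes, comments, or entities, whereas a real browser parser also has to recover from malformed markup.

```javascript
// Stage 1: split the character stream into tokens.
// Assumption: well-formed input, no attributes, comments, or entities.
function tokenize(html) {
  const tokens = [];
  let i = 0;
  while (i < html.length) {
    if (html[i] === "<") {
      const end = html.indexOf(">", i);
      if (end === -1) throw new Error("unterminated tag");
      const inner = html.slice(i + 1, end).trim();
      if (inner.startsWith("/")) {
        tokens.push({ type: "endTag", name: inner.slice(1).trim().toLowerCase() });
      } else {
        tokens.push({ type: "startTag", name: inner.toLowerCase() });
      }
      i = end + 1;
    } else {
      let end = html.indexOf("<", i);
      if (end === -1) end = html.length;
      const text = html.slice(i, end);
      // Whitespace-only runs between tags are dropped in this sketch.
      if (text.trim() !== "") tokens.push({ type: "text", value: text });
      i = end;
    }
  }
  return tokens;
}

// Stage 2: a stack of open elements turns the flat token list into a tree.
function buildTree(tokens) {
  const root = { name: "#document", children: [] };
  const stack = [root];
  for (const tok of tokens) {
    const parent = stack[stack.length - 1];
    if (tok.type === "startTag") {
      const node = { name: tok.name, children: [] };
      parent.children.push(node);
      stack.push(node);
    } else if (tok.type === "endTag") {
      if (stack.length === 1 || parent.name !== tok.name) {
        throw new Error("mismatched </" + tok.name + ">");
      }
      stack.pop();
    } else {
      parent.children.push({ name: "#text", value: tok.value });
    }
  }
  return root;
}

const tree = buildTree(tokenize("<p>Hi <b>there</b></p>"));
console.log(JSON.stringify(tree, null, 2));
```

The grammar and automata theory from a compilers book is what tells you why the first stage can be a simple scanner while the second needs a stack.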
I recommend reading this book first:

Related

Understanding the Basics behind Handlebars, Express, and Node.js

I've been struggling to understand Handlebars ever since I was introduced to it in class. I've researched different resources and videos (e.g., YouTube, StackOverflow, etc.) to try and learn more about it, but I still feel like I'm not getting it.
Could somebody please either explain to me what Handlebars is in their own terms or send me resources they found helpful when learning it?
Thanks!
Handlebars.js is a templating engine that lets you mix dynamic data into your HTML code. Templating engines were created because complex projects require a lot of dynamic HTML manipulation. Previously, software developers created new chunks of HTML code and dynamically inserted them into the DOM using JavaScript. This eventually became unwieldy and difficult to maintain. It also led to repetition of code. To solve this, templating engines let you create predefined templates that can be used in multiple locations without repeating the code. Templates are like “macros”; wherever they are used, the code in them is inserted at that place. They also help to keep your HTML out of your JavaScript files, thereby increasing the readability and reusability of your code. For a more comprehensive explanation see this blog post
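To make the "predefined template plus data" idea concrete, here is a small sketch of the mechanism. It is a hand-rolled stand-in and not the Handlebars API: `compileTemplate` and `escapeHtml` are hypothetical helpers that only understand simple `{{name}}` placeholders, while the real library supports much more (blocks, helpers, partials).

```javascript
// Hypothetical stand-in for a templating engine; this is NOT the Handlebars API.
function escapeHtml(value) {
  return String(value)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Turn a template string into a reusable render function.
function compileTemplate(source) {
  return function render(data) {
    // Replace each {{key}} with the escaped value, or nothing if the key is missing.
    return source.replace(/\{\{\s*(\w+)\s*\}\}/g, (match, key) =>
      key in data ? escapeHtml(data[key]) : ""
    );
  };
}

// One template, reused for every record instead of concatenating HTML by hand.
const userCard = compileTemplate('<li class="user">{{name}} ({{email}})</li>');
const users = [
  { name: "Ada", email: "ada@example.com" },
  { name: "Bob <b>", email: "bob@example.com" },
];
const listHtml = "<ul>" + users.map((user) => userCard(user)).join("") + "</ul>";
console.log(listHtml);
```

The markup lives in one string, the data lives in plain objects, and the escaping happens in one place, which is the separation the paragraph above describes.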
The best resource to learn Handlebars.js is their documentation
There is a course on Udemy that uses Handlebars.js, Node, and Express and starts from the basics, I think. This is it.

What is the canonical way of interfacing a DB's id with an HTML id?

So I'm writing a simple web app. It has SQLite as the RDBMS and Python's bottle framework as the server, and the rest is HTML, CSS and JavaScript.
Now, we all know that HTML ids cannot start with a numeral, and using text/strings as the primary key in a database would be hell.
Right now I'm converting the DB id to a string and adding an _ before it so it renders correctly in HTML, and doing the reverse when the DB needs to be updated. But my code doesn't look beautiful.
So what's the standard, accepted way of making these two talk? Where should I implement the translator, in Python or in JavaScript? What background does this problem have in conventional computer science?
Edit: for those asking for more specificity
What is the best way to ensure both model and view work correctly without monkey-coding a to and from function everywhere (a programmer might forget to call it)?
Also, if you were trained in computer science at college or know super academic languages like Java or Haskell, what would be your approach?
It was news to me that ids cannot start with a numeral, although I checked the official reference and you are right.
In practice I have used integers and uuids as element ids for years across many browsers and never had a problem with the browser rendering it or accessing it via JavaScript.
So my answer is just to ignore it.
EDIT
In response to the downvote - just try this
<body>
<p id="1" onclick="alertme()">This is some text</p>
<script>
function alertme(){alert(document.getElementById('1').innerText);}
</script>
</body>
as a plain html file opened in various browsers and using developer tools.
Not one browser complains about it. (Firefox complains there is no html tag or encoding but there you go). Everything works. This is simple practical advice for the real world.
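If you would still rather keep the prefix, the translation described in the question can at least live in one pair of functions, so nobody builds or strips the string by hand. A minimal sketch, where `ID_PREFIX`, `toDomId`, and `fromDomId` are made-up names and the ids are assumed to be non-negative integers:

```javascript
// Hypothetical helpers: the prefix and function names are illustrative only.
const ID_PREFIX = "row_";

// Database id -> element id, e.g. 42 -> "row_42".
function toDomId(dbId) {
  if (!Number.isInteger(dbId) || dbId < 0) {
    throw new TypeError("expected a non-negative integer id");
  }
  return ID_PREFIX + dbId;
}

// Element id -> database id; rejects anything that was not produced by toDomId.
function fromDomId(domId) {
  const match = /^row_(\d+)$/.exec(domId);
  if (!match) throw new TypeError("not a row id: " + domId);
  return Number(match[1]);
}

console.log(toDomId(42));
console.log(fromDomId("row_42"));
```

Because `fromDomId` throws on anything it does not recognize, a forgotten conversion shows up as an error instead of a silent bad update. Another common option is to leave `id` alone and carry the key in a `data-*` attribute, which has no restriction on leading digits.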

Make an HTML to PDF converter

I am pretty new to developing software and am intrigued by the huge world out there! I have a working knowledge of C/C++ and Java. I was thinking of making an application that would convert a webpage to a PDF document. I know there are many solutions available, both online and offline, but I want to develop my own. I googled but couldn't find anything that would help me get started.
I want to know how to go about the conversion process. How do I get started? What languages and technologies are prerequisites for making a converter like this?
Thank You
At the very least, you need to get to the bottom of the following specifications:
HTML specification
CSS specification
JavaScript specification
PDF specification
Moreover, there is a lot of smaller stuff such as fonts, decryption/encryption algorithms, and many other minor but still necessary things.
I think you can imagine that this is quite a long way to get all this working. In fact, the complexity of such software is the reason why so many companies make money in this field.
Anyway, I'd suggest you start with the simple things and grow your software gradually. Start with converting HTML to an image, because it is a bit simpler. Take and parse the HTML, its CSS, and its JavaScript. Clean the HTML. Build the DOM of the HTML document. Apply styles. Go through the DOM and draw the elements to the image.
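The last three steps (DOM, styles, drawing) can be sketched in a few lines. Everything here is a stand-in: `dom` is a hand-written tree instead of parser output, `styles` is a per-tag table instead of real CSS, the 1.5 line height is an assumption, and the "drawing" is just a list of commands that an image or PDF backend would later execute.

```javascript
// Hand-written stand-in for a parsed DOM; a real converter would build this from HTML.
const dom = {
  tag: "body",
  children: [
    { tag: "h1", children: [{ text: "Title" }] },
    { tag: "p", children: [{ text: "First paragraph." }] },
    { tag: "p", children: [{ text: "Second paragraph." }] },
  ],
};

// Stand-in for parsed CSS: per-tag rules only, no selectors or specificity.
const styles = {
  body: { fontSize: 12, marginLeft: 10 },
  h1: { fontSize: 24 },
  p: {},
};

// Walk the tree, inherit styles from the parent, and emit one draw command per text node.
function layout(node, inherited, cursor, commands) {
  if (node.text !== undefined) {
    commands.push({
      op: "drawText",
      text: node.text,
      x: inherited.marginLeft,
      y: cursor.y,
      size: inherited.fontSize,
    });
    cursor.y += inherited.fontSize * 1.5; // assumed line height
    return;
  }
  const style = { ...inherited, ...(styles[node.tag] || {}) };
  for (const child of node.children) layout(child, style, cursor, commands);
}

function render(root) {
  const commands = [];
  layout(root, { fontSize: 12, marginLeft: 0 }, { y: 0 }, commands);
  return commands;
}

const commands = render(dom);
console.log(commands);
```

Keeping the output as plain draw commands means the same layout pass could feed an image writer first and a PDF writer later, which fits the "start simple and grow" advice.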
Good luck!

Flash parser for html

As I was working on this project for a friend of mine who is terrified of changing from HTML to flash, I realized that maybe there could be a bridge between them. So I started working on a flash project that would grab the HTML from his page and parse it to display it in flash. Although I am sure there are resources available for this already, I figured that the experts on SO might be willing to suffer through the logic of one user trying to develop this script.
So basically, I am not asking for an answer, I am asking for some step-by-step direction that could be posted so other people could see the logic behind breaking down this project. I think it would be really useful (not just for me, but for anyone wanting to learn more about objects and oop).
So, much like the thread between primarily Senocular and Rampage, this would be a thread where I would be the student asking the questions in a logical step-by-step manner and someone else (or someones else) could provide guidance.
Let me know if you are interested and I can start by posting what I have already written. We can go from there and I am sure it will prove insightful to anyone who reads it. If no one is interested, or no one has the time or inclination, no problem.
Best wishes,
Jase
Who in their right mind would change from html to flash for displaying a simple website? I don't see the logic behind it, it's more like you are trying too hard. Flash has its function in the web, as well as html does. If it's just for simple displaying, using flash is just the wrong way and won't make your website any better but worse because its loading time will be too long.
A Google search retrieved these:
HTMLWrapper
Groe.org HTMLParser
There is an article about the first on *drawlogic. I think the second's home is on SourceForge here.
Thing is, browsers already do a fine job at parsing html code. Having the flash player parse html files not only does away with any accessibility advantage your markup can offer but it also feels like reinventing the wheel. If you need to display html content, leave it to the browser.
Slightly offtopic - Flashpaper can convert most HTML pages into swf format.
Given properly "disciplined" HTML, you can use the XML parser in the player for the basic parsing. Are you really talking about writing an HTML renderer in Flash though? Or just being able to pull information from HTML dynamically?

What's better: HTML/CSS edited by hand or using design programs? And why? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 4 years ago.
For designing websites, is it better to do it yourself by learning HTML/CSS or to use web design programs? And why?
I've bounced back and forth between hand coding and Dreamweaver in my history as a web developer.
I originally started out hand coding HTML. This was back in the day when table layout was king, and editing nested tables became a real headache. Couple this with a lack of good tools for visualizing hidden elements and this quickly became a nightmare.
I started using Dreamweaver primarily to speed up my table design workflow. Soon, however, Dreamweaver's templating system became a godsend when I started producing static websites that had no server backend. Being able to update one template and have it propagate across the entire static site cut down my cross-page inconsistencies to nearly zero.
More recently, the whole web 2.0 push has got me, and almost everyone else, back into the hand coding game. I found Dreamweaver wasn't really suitable for the compliant designs, since it was heavily table-centric. I find that most of the HTML I write these days is so straightforward and simple there's little need for an editor. Additionally, all my development is now dynamic once again, so there's no need for a static html generating template system anymore.
Learn for yourself so you can figure out how to do things exactly how you want them to be done, and not have to rely on some sort of program to figure it out for you.
Like anything else in technology, learn the core concepts first, and then use a tool to automate the things you have mastered. By doing so, you will gain a better understanding of how everything works together, and you will be able to easily tell when something goes wrong. In this way you will not be bound to any one design tool, and can use whatever works best because you understand the core concepts.
In the words of Richard Feynman,
"That which I cannot create, I do not understand."
They really serve two purposes, and either one is "better" for its purpose.
If you learn to do it by hand, you will:
Have more control over exactly what is happening
Have less extraneous code
Be able to maintain your code more easily
If you use a program, you will:
Be able to design visually
Possibly be able to design more quickly
Not have to learn to write CSS by hand
It really depends on what your goal is.
I prefer HTML/CSS by hand because you have the most control over the code. Most design programs will add additional markup that is not required. Even simple WYSIWYG JavaScript editors add extra markup. Although it is not a huge difference in file size, the additional markup will add up over time. I would also argue that it's easier to maintain code when you know what went into its creation.
Additionally, you'll learn a lot more by taking the time to do it by hand.
Personally, I always edit my HTML/CSS by hand using editors with auto-completion if I can, because that always makes life easier. You should definitely always learn a language as much as you can before you start relying on any program to generate code for you, because most of the time you end up fixing what they gave you.
I tend to do it all by hand.
Doesn't matter what IDE or server-side language I'm using. Markup is markup. Being able to do it rapidly by hand is valuable.
More often than not, you'll have to edit some markup manually. By writing it from scratch, you're already very familiar with the structure of the markup. You don't have to spend any time orienting yourself to the designer-generated markup.
Although it's not necessarily a rule, I've found those who live in the designer to be less sharp in their markup and code craftsmanship.
I prefer the by hand approach. That way you know exactly what you're getting. Plus I haven't found an editor that produces HTML/CSS that doesn't need some tweaking especially if you are targeting multiple browsers.
Doing it by hand. Using design programs tends to insert a lot of extra markup you don't really need, which will just complicate your ability to learn.
If you do it by hand you at least know what was inserted where, and why. Plus there are a lot of good websites out there that can walk you through the basics.
IMO you will still learn using web design programs like Dreamweaver, since you have to look at the source and make it fit your exact desires, and it's quicker. But doing it by hand gives you the "the more you write, the more you learn" effect, which I agree with 100%.
This is a bit vague.
I think that "better" (qualitatively) depends greatly on (1) the competency of the designer, and (2) the sophistication of the application.
Regarding "better" (as in "advisable"): using an application can be a crutch that may fail to save you in all cases. Knowing how to "raw code" html and css is valuable in understanding the limitations of the application and working around those limitations. For that reason alone I suggest knowing how to do it by hand and then keep a sharp eye on the output generated by the application, should you choose to use one.
The absolute best is when you understand what you are doing - you can only do this by coding by hand.
If you don't know HTML or CSS and you use a WYSIWYG editor then how can you be sure everything is right? You can't!
If you have a good understanding of HTML and CSS why would you use a WYSIWYG editor? They make things harder because you can't see the code and extra tags and rules get inserted without you knowing.
Coding by hand is always the best.
Why should you know about XHTML/CSS?
Here are some reasons:
Respect for semantic meaning
DOM compliance (you know the JavaScript mess)
Easier to maintain
Search engine optimization
You still think it takes longer to design/integrate a website?
Think of using vi, Eclipse, Quanta, and probably some others...
By hand is the obvious answer, because your website/application will be, well, better. (And also because, if you use JavaScript, it's good to traverse through the DOM of a document you've written yourself, versus a generated one that you have to examine beforehand.) But that's mostly only because the visual tools that exist today really suck (I'm thinking of Dreamweaver). It's definitely possible to create a good visual editing (WYSIWYG) program that actually generates good HTML/CSS/JavaScript, but nothing even close has come up yet, so right now hand-coding is much, much better.
I'm not going to read the responses, so it's quite possible someone has already said this, but oh well.
First and foremost, you should always write out your HTML / CSS by hand. The reason for this is that no matter how advanced an HTML editor is, it will never be as good as it could / should be. For "good" html / css, you will actually end up writing your page in a different order than what you see.
For example, a page that is displayed like:
________________
|logo          |
|----menu------|
|..............|
|...content....|
|..............|
|....footer....|
----------------
"should" actually flow as follows:
<h1>title of site</h1>
<div id="content">.....</div>
<ul id="menu">....</ul>
<div id="footer">...</div>
which would make an HTML editor simply throw a hissy fit if you tried to do it through the nice pretty GUI. What may be advantageous is to use Expression Web 2 or Visual Studio for its IntelliSense. It may help speed up (or maybe slow down) your learning curve.
I really recommend Transcending CSS Design if you are already familiar with HTML / CSS. Otherwise grab a CSS book first, even over an HTML book. Styling through CSS will teach you proper semantic HTML (or should, anyway).
I like to code by hand because I can keep my code clean and tidy that way. HTML is not very hard anyway.
If you decide to code by hand you will need an editor that supports syntax highlighting, and you will need to validate your code as often as possible to avoid errors (this is good practice anyway). This extension for Firefox will ease your work a lot: users.skynet.be/mgueury/mozilla/