traverse html dom tree and its css style - html

I am looking for a way to adapt web pages for various screen devices.
My idea is to build a proxy server to return a new dom tree and CSS style of the page requested by a client browser.
Can I run javascript on the proxy server (node.js?) to analysis the dom node tree?
Any other ways?
Thanks!

Are you in control of the web pages that you want to adapt for different devices?
If so, you could write different CSS stylesheets for the pages, and apply them to devices with different screen sizes using media queries:
https://developer.mozilla.org/en-US/docs/CSS/Media_queries
http://msdn.microsoft.com/en-us/magazine/gg619395.aspx
http://www.alistapart.com/articles/responsive-web-design/
Edit: as you’re not in control of the web pages, then I guess some sort of proxy server is the way to go.
Building one that works well with most web pages sounds like a huge job though.
To inspect the DOM on the server, I imagine node.js would work fine (I don’t have much node experience, so I’m not 100%), but most languages will have a library (or a few) that allow you to supply HTML as text, and get back a DOM representation of it. Python, for example, has lxml, Beautiful Soup, and others.
The tricky bit would be automatically figuring out how to change the DOM of random pages to make them display better on mobile phones. I don’t know of any libraries that help you figure out what styles apply to a give DOM node, as that only becomes relevant when you’re rendering a web page, which is what browsers do.

Related

Approach to building a GUI for a web application

First, a short disclaimer. I have next to no knowledge about web applications. I come from an iOS background where I exclusively wrote native code, so if you write your answers like I know nothing outside the shallow parts, that would be great.
I'm interested in learning a stack to develop web applications, but I'm not sure what the right way to build the GUI is. I know that a web front end consists of html and CSS to create the display and javascript as the bridge between the back end and the GUI, but I don't know the best way to put something together.
I know in iOS, you can use the Interface Builder (part of xcode that lets you graphically create the xml that describes the display) to create GUI's without any knowledge of how iOS translates the xml to some rendering, or even what is written in your xml files. Is there any analog to the web front end?
I'm mainly just looking for a list of the accepted ways to develop the GUI for a web application. If I have to learn HTML and CSS, so be it, but I'd like to know what my options are and the tradeoffs between each of them.
I can answer shortly stating that (technically) you can design web pages without coding in HTML or CSS, or even Javascript - although, you would be somewhat limited in your creative abilities and applications.
You can read about WYSIWYG html editors on this link, or try out ckeditor (someone said it's good)...
...I think a bit of background will help you reach a correct decision...
so here goes:
The Long Answer
I would start by trying to put the world of web programming and design into concepts that correlate with iOS coding.
If we look at the whole of a web app from an MVC perspective, then the browser is the view, the server is the controller and the database is the model... although this is very simplified.
Just like in iOS, each of these can be (but doesn't have to be) broken down into sub-MVC systems.
Just like any model in MVC, the view (the browser) can talk to the model (the database), but really it shouldn't. that's just bad practice.
If we break down the main-view (the browser) to a sub-MVC system, I would consider the HTML as the model, the CSS as the view and the browser (through links and javascript) as the controller.
It's not all that clear cut, but thinking like this helps me practice better and cleaner coding.
The HTML is the view's model for the web-app - it contains the data to be displayed or used.
HTML is a variation on the XML format and it contains data organized in a similar way to an XML file.
The basic HTML file will contain:
<html>
<head>
</head>
<body>
</body>
</html>
this should look familiar to you if you read any XML.
The CSS (cascading style sheets) is the view - it states HOW the html DATA should be displayed.
if your web app does't have any CSS, it will use the browser's default CSS/styling to be applied on the data in the HTML.
This "language" makes me think more about dictionaries in iOS (I think that's what they're called in Objective C). They have properties and values (like key-value pairs) that determine how the HTML data is displayed (if it's displayed).
They could look something like this:
body
{
color: white;
background-color: black
}
The browser is the web-app's view's controller - it makes it all work together and serves it up to the screen.
Javascript and links help us tell the controller what we want it to do, but it is the browser that acts (and willfully at times).
You can have a whole web app that acts without javascript, using only the default actions offered by links - in which case the browser will usually ask the server (the main controller) what to do.
Javascript helps us move some of the legwork from the server to the client, by allowing us to have a "smarter" controller for our view - just like in iOS.
The issue of the errant main-view / browser
Not all views are created equal, and not all browsers are the same.
Because the browser is used as the controller for the web-app's view, and because some browsers act differently then others, we web coders have the problem of working around someone else's idea of how our view's controller should behave.
You might see us complaining about it quite a lot (especially complaining about Internet Explorer).
These days, this issue is not as big of a problem as it used to be... it's just that some people don't update their computers...
WYSIWYG web editors
There are website builders and editors that try to work like X-Code does, by allowing us to build the website much like we would write word documents.
But, unlike X-Code which codes only the graphical interface of the view, these website builders write the model as well and usually add javascript into the mix.
When we use these tools (which I avoid), the whole MVC model breaks apart.
We can use them as a starting point for dirty work, but then we take their code apart and adjust it to our needs - usually by taking the code we need for the view (CSS) and applying it where we need it (and discarding much of the nonsense they add to the code and the HTML).
To summarize
As you can tell, HTML and CSS (and Javascript) are only a small part of a web app - as they all relate to the main view of the web app.
To write the controller and models for web apps, we use other tools (such as Ruby, PHP, js.node, MySQL and the like).
Coding the HTML and CSS isn't as hard as you might think, although it might be harder then I present it to be.
You can avoid writing code for web apps and use applications that offer WYSIWYG (What You See Is What You Get), but that would limit your abilities and will take away from your control over what you want to create.

HTML : Parse, Analyze, and Reconstruct in mobile site

I am working on a school project for me CS class where we are to find an application that is interesting to us and figure out how it works.
I have picked the technology on dudamobile.com
The app takes a URL and then converts the desktop version of a website into a mobile version.
I think I have a basic understanding out how this works....
Clean up the HTML
Parse the HTML and look for key tags
Store key tags in variables
Apply variables to a premade mobile HTML template
insert custom CSS to fit mobile devices
This is a pretty high level analysis, my question is what specific tools can I use to create something similar for my project? Is my analysis correct, in your opinion?
To me it just looks like they strip out large sections of css and replace it with more mobile friendly and less graphic intensive css. It also seems to choose odd links in navigation to make the top links so some kind of algorithm there (the top links of my site were all the ones I had for mobile devices).
I would start by doing a few tricks on the css files the site uses and strip out background images, changing images over a certain width into block level elements, changing font size, etc. until you get it looking like more of a simple layout and then add the other mobile elements as you see fit (making it readable without zooming is probably the most important though).
If the site uses html5 you can apply different css rules to the navigation section to make it a list of links and move it to the top.
Don't expect it to be a perfect crossover unless the site is moderately simple though.
You would need a back end technology to retrieve and parse the .css file to re-serve it. I would suggest whatever language you're comfortable with. I don't have much experience in PHP but I believe it would be a good fit for that application.

Is there a way to get a list of all the CSS applied to a HTML fragment or page?

I know it's easy to get the CSS that is applied to a single node in HTML, using tools like the Firebug extension for Firefox, etc.
But is there a way to see all the CSS that is in effect on an entire page, or a larger fragment of HTML?
Specifically, we are cleaning up our one extremely large CSS file into smaller modules and would like to find out what CSS is used on a certain page, so we can move all the non-used CSS to another module.
Thank you all! These are the various solutions I've looked at now from your recommendations (collected here for people with the same problem):
Dust-Me Selectors (Firefox Add-on)
This does exactly what I need. Lists used and unused CSS selectors (for the current page, or the entire site, after spidering), and can dump both lists as CSV text. Great.
https://addons.mozilla.org/firefox/addon/5392/
CSS Usage (Firefox Add-on)
https://addons.mozilla.org/firefox/addon/10704/
WARI - Web Application Resource Inspector (Java tool)
Appears to only handle static code, not dynamically generated HTML as is present in most web applications that use ajax – which, unfortunately, makes it useless, at least for me
http://wari.konem.net/
CSS Redundancy Checker (Ruby script, requires Rubygems and Hpricot)
http://code.google.com/p/css-redundancy-checker/
TopStyle (Windows-only application, $79.95 (!!))
http://svanas.dynip.com/topstyle/
These are all cross-platform and free, except for TopStyle.
As far as tools go, you could use the css usage plugin for firebug. It will analyze pages for used css.
Or were you looking for a way to do it more programmaticly?
You can try Dust-Me Selectors, it's add-on for firefox, if you use firebug, as you stated, you may find useful CSS Usage.

Alternatives to HTML for website creation?

It seems the most common aproach to web design is to use HTML/XHTML & CSS in conjunction with other technologies or languages like Javascript or PHP.
On a theoretical level, I'm interested to know what other languages or technologies could be used to build an entire site without using a single HTML tag or CSS style for styling/positioning?
Could a website be made only using XML or PHP alone, including actual styling and positioning?
Presumably Flash sites are till embedded in HTML tags?
Thanks
There are actually several solutions that allow you to nearly completely avoid CSS and HTML.
GWT: Google Web Toolkit
Written in Java and will allow you to build both server and client code in Java. Used to build Google Wave.
Cappuccino and Objective-J:
Objective-J is to JavaScript as Objective-C is to C. It extends JavaScript with many near features, including type-checking, classes and types.
Cappuccino is like Cacoa (Mac OS X GUI toolkit).
Using these two you can build incredibly rich and desktop like webapps. They run mostly on the client side and you can use whatever you want on the server.
A good example is 280slides
SproutCore is similar to Cappuccino, but it uses pure JavaScript instead. Apple is using SproutCore to make me.com.
I should also mention that knowledge to HTML, CSS, JavaScript is a good skill to know, just like understanding your compiler is a good skill.
EDIT:
As said above Adobe Flash can also be used.
You can make a website with out a single html tag. Just give folder read access to all your directories, have sensible file names. From here you user will be able to browse images , read text files, download videos and depending on the content he may or may not come back ever again, but you do achieve the goal of setting up a "website" with out a single line of html or css or any other code for that matter.
:-) :-) :-)
You can host a telnet server with anonymous access and a specialized shell that restricts the user to doing whatever it is you want the site to do. ;)
Lets make the distinction between what is required by the web browser, and what you as a developer use to create that markup.
Remember that HTML nowadays is xml. You could use any markup language you like and convert that to HTML using XML.
eg ASP.NET uses markup such as which is converted on the server to .
As long as the content going down the wire to the browser is HTML, or generates HTML through script, you can use any approach you like.
However these approaches have mostly failed as developers prefer having direct control over the markup. It makes css as well as scripting much easier when you are certain what the html is going to be.
ASP.NET MVC is a product created in response to criticisms leveled at the ASP.NET webforms model.
Also, this is another answer because it's a completely different technology, but you can write an application in XUL and it'll run in Mozilla-based browsers without any HTML.
There's also XML. You can create websites with XML only. A well known one is World Of Warcraft. Check the page source. An XSL is used as stylesheet. There exist even XML based web frameworks like OpenLaszlo. You can let it serve either DHTML or Flash on reqeust out of a single XML template.
The Wt C++ Web Toolkit.
You can write your web application in C++ using Qt-style widgets (input boxes, buttons, tabs etc) and hook up client-side events to C++ code on your server. All without writing any HTML or CSS.
A sample application from their website (you may also want to look at this excellent tutorial):
HelloApplication::HelloApplication(const WEnvironment& env)
: WApplication(env)
{
setTitle("Hello world"); // application title
root()->addWidget(new WText("Your name, please ? ")); // show some text
nameEdit_ = new WLineEdit(root()); // allow text input
nameEdit_->setFocus(); // give focus
WPushButton *b = new WPushButton("Greet me.", root()); // create a button
b->setMargin(5, Left); // add 5 pixels margin
root()->addWidget(new WBreak()); // insert a line break
greeting_ = new WText(root()); // empty text
/* when the button is clicked, call the 'greet' method */
b->clicked().connect(this, &HelloApplication::greet);
}
void HelloApplication::greet()
{
/* set the empty text object greeting_ to greet the name entered */
greeting_->setText("Hello there, " + nameEdit_->text());
}
Curl (requires a browser plugin)
Wikipedia article
A webpage looks like this:
{curl 1.7 applet}
{value
let b:int=99
let song:VBox={VBox}
{while b > 0 do
{song.add b & " bottle(s) of beer on the wall,"}
{song.add b & " bottle(s) of beer."}
{song.add "Take one down, pass it around,"}
set b = b - 1
{song.add b & " bottle(s) of beer on the wall."}
}
song
}
Source
Since browsers view HTML, I'm assuming you mean create a site without ever having to edit/write HTML/CSS. The framework/app environment/whatever taking care of everything for you - yet still allowing you control over the presentation layer.
Seems like that is certainly possible on a theoretical level.
I ran across Noloh (not one line of html) a while back. Was intrigued, but never actually tried it out.
From various places on the Noloh site:
Because NOLOH does not rely on HTML or pages, maintaining complex rich Internet applications is significantly easier than with other methods.
Developing applications with NOLOH only requires using a single, unified language: a superset of PHP that completely maintains all aspects of server-client communication for you!
I think you could build a site entirely in SVG.
The front page of emacsformacosx is almost entirely SVG, for example.
Downsides: It wouldn't be viewable in IE (at least through version 8). And last I looked, text support, like flowing and justification, was weaker in SVG. (You could embed HTML inside an SVG element when you needed sophisticated text features, but that would violate your no-HTML rule.)
You'd probably still want to use CSS with SVG, because it's a good idea there for the same reason it's a good idea with HTML, but it wouldn't be necessary.
A website is always viewed through a browser (at least always if you are human :)). Browsers understand HTML. Whatever the technology - you have to basically render HTML. Even in cases with rich technologies like flash, the flash object that is rendered by a browser plugin is embedded inside the HTML.
In theory it is possible to do it without HTML, but the question becomes how much does the product diverge from the definition of a website...
One really short, simple answer... you can't :D
Flash requires an embed tag, an image requires an embed tag etc, so you'd have to use HTML in some method or another.
PHP is an embedded language, it is used to generate HTML on which the browsers renders, with XML, well technically a browser like Ie or FireFox will render it in it's own way for readability, but I would not class that as a website.
The major developments in the world of web technologies involves the development of HTML and CSS to improve them, there isn't any need for an alternative. In fact we're pushing towards a standard, what point would there be in introducing a new language to negate these standards. The whole IE saga would simply get worse.
Like the others have suggested, you could directly load an image or a flash file, but an image is useless on it's own, and a flash interface throws up loads of problems like SEO, accessibility etc, not least it's very heavy and usually completely misused. In my mind I wouldn't even class this method as a website, it just doesn't tick any of the boxes (IMO).
I think you can have an URL pointing directly at a hosted Flash (SWF) file, I've certainly done this though I don't know if all browsers work.
Anyhow, I tested this when developing MyDinos.
e.g: http://mydinos.com/home.swf
You can use Emscripten and its SDL subset.
You could try using quickstatic. You can code HTML templates from Python3. What is super cool about it is the fact that if you put in a for-loop for a certain item, you can generate that many items (maybe even use it to print items from a directory or quickly serve thousands of links).

Displaying the website's content (html) through the specific browser - is it possible to realize?

I'm interested is there a possibility that could allow to display website's content or to say exactly an HTML through a specific browser installed on the web server?
I mean something like a module for a web server may be, that can display the website's content through the built-in browser, ignoring the clients browser?
If this possibility really exists, so I don't need to adopt my HTML to different browsers.
No, there isn't any web server module that takes control over the client's computer.
Depending on your situation I suppose you could replace the HTML page with Flash, Silverlight or a Java applet if that would make things easier.
Well, you could make some pdfs or images if you really want a specific rendering, but it's really not a website anymore at that point. You could also beg your users to use a particular browser, but they'll probably ignore you. The world spends (wastes) gazillions of dollars on good, standards-compliant HTML and CSS for a reason.
How about a browser in a Java applet?