Do you know a webpage appearance comparator? - html

I need a tool to compare the design of a website. I do not want to compare only the HTML code, but the rendered design.
Is this even possible? Also, is there any open-source program of this kind?
I have searched Google, but so far I have found only one candidate, HTML Match.

In modern webpages the appearance is controlled by various 'things': HTML code, CSS styles and images at least (and JavaScript in some pages). Simple text-based diff programs are not enough because their output can be irrelevant to the webpage's appearance (e.g. cleaning up CSS can show many differences while the rendered webpage remains the same).
For simpler pages, HTML Match mentioned above could do the job. If I had to compare the design of two "complex" pages (including layout, spacing, image and text changes), I would use a two-step approach:
1) Run a diff tool on the HTML sources to highlight the textual content differences. Then modify one of the pages to show the same content as the other (to make the next step more accurate and 'focused' on 'real' layout changes). Of course, this works only with very similar HTML.
2) Load the pages in the same web browser, take screenshots of the rendered output at fixed positions and compare the images (e.g. with ImageMagick). This should show all visual differences in the rendered output.
It is not perfect but should work.
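The first step above (diffing only the textual content) can be sketched in Python with the standard library. This is a minimal sketch, not a finished tool: the names TextExtractor, visible_text and content_diff are made up for illustration, and it assumes that ignoring the bodies of script and style elements is a good enough approximation of "visible text".

```python
import difflib
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects visible text chunks, skipping <script> and <style> bodies."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and not self._skip_depth:
            self.chunks.append(text)


def visible_text(html):
    """Return the text chunks of an HTML string, one list item per chunk."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.chunks


def content_diff(old_html, new_html):
    """Return unified-diff lines for the textual content only (markup and CSS are ignored)."""
    return list(difflib.unified_diff(
        visible_text(old_html), visible_text(new_html),
        fromfile="old", tofile="new", lineterm=""))
```

An empty result means the two pages carry the same text, so any remaining difference found in the screenshot step comes from layout or styling.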
[UPDATE] HTML Match seems dead, see this answer for an alternative solution.

Solution: “compare web pages” tool. (“We've been doing it since 1999. It's free.”)
[Screenshot omitted: example output comparing the pages for TP-Link USB hub models UH700 and UH720]

Under Windows:
http://www.htmlmatch.com/

If you are using KDE, you can use Kompare or KDiff3.
However, if you want to see how your web page looks in different browsers on different operating systems, BrowserShots can be used.

There are these online tools, though they aren't brilliant:
http://www.w3.org/2007/10/htmldiff
http://www.aaronsw.com/2002/diff/
I like the look of daisydiff but have not used it in anger: http://code.google.com/p/daisydiff/

The keyword you're looking for is "diff".
A good program that can show you the differences between two files (HTML markup or other) would be ExamDiff for Windows.

I'm working on one, and I can tell you it's hard and there is nothing on the market. Maybe Google and Bing have something in-house. You can use image comparison tools that identify rectangular regions where the images differ. This is, for example, part of all modern video compression, but you have to do it separately for different regions of the webpage (the nav bar section, the main article, the region filtered by an ad blocker, etc.), because some of them may change while the page is still considered to have the same content.
As I said, it is a very complex problem with no exact solution.
The other option is to go the non-visual way and just compare the resulting computed styles of each HTML element. You have to hack the browser to get access to the layout tree. There is also no official API or existing library/program/hack/patch for it.
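The region-based image comparison described above can be illustrated with a small Python sketch. It assumes the two screenshots have already been decoded into equal-sized grids of pixel values (lists of rows); the function names and the region layout are hypothetical, and a real tool would also need some tolerance for anti-aliasing noise.

```python
def changed_bbox(img_a, img_b):
    """Smallest rectangle (left, top, right, bottom), inclusive, that covers
    every differing pixel, or None if the two images are identical."""
    if len(img_a) != len(img_b) or any(
            len(row_a) != len(row_b) for row_a, row_b in zip(img_a, img_b)):
        raise ValueError("images must have the same dimensions")
    xs, ys = [], []
    for y, (row_a, row_b) in enumerate(zip(img_a, img_b)):
        for x, (pix_a, pix_b) in enumerate(zip(row_a, row_b)):
            if pix_a != pix_b:
                xs.append(x)
                ys.append(y)
    if not xs:
        return None
    return (min(xs), min(ys), max(xs), max(ys))


def changed_regions(img_a, img_b, regions):
    """regions maps a name to an inclusive (left, top, right, bottom) box.
    Returns the names of the regions in which at least one pixel differs."""
    changed = []
    for name, (left, top, right, bottom) in regions.items():
        crop_a = [row[left:right + 1] for row in img_a[top:bottom + 1]]
        crop_b = [row[left:right + 1] for row in img_b[top:bottom + 1]]
        if changed_bbox(crop_a, crop_b) is not None:
            changed.append(name)
    return changed
```

Checking each named region separately lets you ignore areas that are expected to change (ads, for instance) while still flagging a change in the main article.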

You can make a visual comparison with Araxis Merge Pro by capturing the screen output with systems like BrowserStack, Cross Browser or PhantomJS.

Related

How to replicate the CSS design of a website

I have an existing parent website and I have to design a new website with a similar theme and CSS styles.
I do not have access to the parent website's code, so I cannot look into the styling there.
Is there a way I can extract or replicate the CSS style of the website and use it for the new one? I just need to get the same theme going in the new website as well.
I have read that I could use some Adobe tools for this.
Can anyone give a brief idea of how this can be done, or is there a generic procedure to follow when replicating the style?
Replicate the given design using your own, most appropriate markup and CSS rules, and have some sort of QA process that will help you find obvious inconsistencies in appearance and interaction.
Why I don't think you want to copy HTML and CSS from the parent site:
The parent website can change its style later in an unpredictable way. You will have to duplicate these changes. Since you mentioned you don't have access to the codebase, you can't just diff their changes and apply them to your codebase.
Therefore I'd say it's best to ignore the original HTML and CSS, and just follow your eye and have a QA that will carefully test your work for consistency with the original.
(I had to do a similar thing once, and I think it usually isn't required to follow the parent website pixel-perfectly; it just has to be consistent enough to allow painless navigation for the end user.
In cases where pixel-perfect consistency is required it makes more sense to build the additional website off the same codebase as original. You weren't given that possibility, so I doubt that perfection will be requested from you.)
I think you're trying to shoot a fly with a cannon. All the JavaScript/CSS/HTML code is at hand when you view the page source. No advanced tools are needed.
For a better look at minified files, you may try the developer tools provided by modern browsers such as Chrome and Firefox.
You may also just use beautification tools for CSS and HTML like http://www.codebeautifier.com to get a nicely indented document.
Just google HTML or CSS beautification and find the one that fits your needs best. Most of them are free online tools.
The css is probably minified. This question shows ways to unminify it so you can read it.
Browsers such as Firefox and Chrome have a built in Code Inspector tool that will show you which styles are applied to each item. Just right-click on a page element (for example, a paragraph or heading), and select "Inspect Element" from the menu that appears. A toolbar will appear at the bottom of your window. Use the arrow on the toolbar to select different elements to examine. Usually the left side of that tool shows the HTML for that element and the right-side shows the CSS styles applied and the line of the css they come from. You can get a similar tool in IE by pressing the F12 key.
If you have a text editor that allows regular expressions in the search (Dreamweaver has this if you have the Creative Suite), use this search term with the "regular expression" box checked: #[a-fA-F0-9]{3,6}. This will find the hexadecimal values for the colors you need. It says to find the pound sign followed by three to six hexadecimal characters (digits or the letters a to f), which will mostly be color values (e.g. #333 or #333333 for dark grey). It may also bring up some IDs, which you can ignore as you keep searching. You'll also want to search for rgba because colors may be listed that way. Using this in conjunction with the browser's code inspector will help you figure out the colors that are used on different elements. Some things may have background images, so you'll need to use the code inspector to figure that out. The code inspector will also show you how much padding you'll need, widths, etc.
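If you have saved a copy of the stylesheet, the same color search can be scripted. The following is a rough Python sketch; the names HEX_COLOR, RGB_COLOR and css_colors are made up for illustration. The hex pattern here is a little stricter than a plain editor search (exactly three or six hex digits), but an ID selector that happens to consist of hex letters, such as #add, would still be picked up and has to be ignored by hand.

```python
import re
from collections import Counter

# "#" followed by exactly 6 or exactly 3 hex digits, ending at a word boundary.
HEX_COLOR = re.compile(r"#(?:[0-9a-fA-F]{6}|[0-9a-fA-F]{3})\b")
# rgb(...) or rgba(...) functional notation.
RGB_COLOR = re.compile(r"rgba?\([^)]*\)")


def css_colors(css_text):
    """Count the color values found in a CSS string.

    Values are lowercased, and whitespace inside rgb()/rgba() is removed,
    so that equivalent spellings are counted together."""
    found = [match.lower() for match in HEX_COLOR.findall(css_text)]
    found += [re.sub(r"\s+", "", match.lower())
              for match in RGB_COLOR.findall(css_text)]
    return Counter(found)
```

Calling css_colors(text).most_common() lists the most frequently used colors first, which is usually a good starting point for rebuilding the palette.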

Html code well formatted vs one line code

I see more and more pages (e.g. translate.google) where the HTML code is formatted on one line. Is this done to shorten the loading time? Is it the state of the art now?
Thanks
HTML is for browsers. They don't need the extra newlines and spaces that people need.
If the HTML is generated by a program, it takes extra time and code to format it so that humans can read it. So it is easier to output it on one line.
The short answer is Yes: it reduces the size of the download.
Even if that doesn't have much impact on the download speed for the individual user, if the site is serving pages to a lot of users, then the cumulative effect is a significant reduction in the amount of traffic their server has to send.
It's pretty easy to strip the redundant white space out of an HTML document, so it was probably written with the white space intact during development, and then stripped when it was deployed to the live system.
You'll find that Javascript and CSS files are often given the same treatment as well.
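As a minimal sketch of the whitespace stripping described above, here is a deliberately naive Python version. The function name minify_html is hypothetical, and the approach is only safe for simple pages: it would damage the contents of pre and textarea elements and of inline scripts, and removing the space between two inline elements can change how the text renders. Real minifiers handle these cases.

```python
import re


def minify_html(html):
    """Naively shrink an HTML string.

    1. Collapse every run of whitespace (including newlines) to one space.
    2. Drop the whitespace that sits between two tags.
    """
    collapsed = re.sub(r"\s+", " ", html)
    return re.sub(r">\s+<", "><", collapsed).strip()
```

Comparing len(page) with len(minify_html(page)) gives a rough idea of how many bytes the indentation costs on each request.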
As an end user, you shouldn't have any need to look at the raw HTML. If you really want to see how the page is written, don't look at the source; look at the DOM instead, i.e. the tree view of the elements in the HTML page (for visual purposes; the DOM is a lot more than this, but that's what you can see).
You can see this using Firefox's Firebug extension, or the Developer Tools feature in IE8, Chrome or Safari.
Hope that helps.
Just to add (you've got great answers already):
If you want to be state of the art in web performance here is a fantastic resource.
Yahoo's Best Practices for Speeding Up Your Web Site
Good luck!

How can I save a webpage as an image in my rails app?

In my Rails app I need to save some webpages and display them to the user as images. For example, how would I save www.google.com as an image?
There is a command-line utility called CutyCapt that uses the WebKit rendering engine to render HTML pages into various image formats. Maybe this is what you need?
http://cutycapt.sourceforge.net/
This is prohibitively difficult to do in pure Ruby, so you'd want to use an external service for it. Browsershots does it, for example, and it looks like they have an API, although I haven't used it myself. Maybe someone else can chime in with alternative but similar services.
You'll also want to read up on delayed_job or something similar, to make sure you're accessing those page images as a background task and that it doesn't interfere with your actual application.
You can't do it easily (probably can't do it at all).
Each page is just text: HTML data. The view you want to make an image of is a rendered page. The browser renders the page using tons of techniques such as HTML parsing, JavaScript parsing, CSS parsing, font rendering, etc. To take a screenshot of the Google page, you would need to do all the rendering somewhere in memory and then take a screenshot of the rendered page.
That task is almost impossible (nothing is fully impossible).
If you are really eager to devote tons of time to accomplishing that task, you should follow these steps:
1) Find some opensource rendering engine. Firefox would do.
2) Find some way to communicate between ruby-on-rails and that engine.
3) Wire it all together and see the results.
However, I see steps 1 and 2 as nearly impossible.
Firefox addon:
https://addons.mozilla.org/en-US/firefox/addon/1146/

HTML printing - what methods are there to make an html based printout? What are the pros and cons?

I have a report I need to print out in an application I'm usually doing maintenance for. My question, which interests me beyond the scope of this task is, what are the ways to format an HTML page for printing? What are the pros and cons of each?
Note that the page is meant only to be printed. I'm not asking about an HTML page that looks ok also when printed.
Generally speaking, I know I can either rely heavily on <table>s or on <div>s, but I don't know which way to go.
I would also appreciate some resources to get me started, or to help with known problems, in any method you suggest.
Thanks,
Asaf
As you can certainly see, printing and web presentation are two different creatures. The main issue is the bounds of the printed page, which do not exist in a web page. Even if you think you have a page laid out in a manner that will fit a printed page, you still need to deal with the fact that the font you are using may not work or scale correctly on the user's printer.
I know of three ways to deal with this issue:
1) Use fixed-width fonts (like Courier), limit yourself to an 80-column width, and only use font characters: meaning use something like asterisks for borders, etc. This is VERY old school - your reports look simple and old and plain. But they will always print the way you intended.
2) Convert your report to an image. Images can be made to conform to a specific size which can fit on a page. However, you can still have issues due to printer margin settings.
3) Let another application do the work for you. What I mean by this is put your report into a PDF or a spreadsheet. Both PHP and Perl have easy-to-use modules for creating a PDF - with no licensing needed. Perl has a fantastic spreadsheet module. This route takes a little learning up front, but frees you from having to be an expert on printing (which can be a real pain).
In case you DO want to have a page that also looks good when viewed in a browser, consider multiple stylesheets for different media.

Suggestions for making pixel-perfect CSS layouts?

One business goal requires that I make a form on screen that's pixel-perfect. If a user prints this form, it will exactly match the US Government Printing Office version of the form; the printer will produce a (reasonably) scannable copy of this document. The previous solution is PDF, which will only work to a certain point for us.
I'm leaning towards HTML/CSS, and would like suggestions on tools to assist that.
For tools, PixelPerfect in Firefox seems a good start. The target platform for this is (drum roll) IE6, if it helps. The document looks like this.
If HTML/CSS is a complete no-go, Adobe Flex is my next choice.
If pixel perfect printing is the goal, and not even PDF will get you there, you can pretty much give up straight away on printing from the browser. There are waaaay too many variables when rendering on the client side: from different browsers (IE6? Good luck!) to different fonts, to user settings, to A4 vs Letter size paper.
Could I ask why PDF doesn't suit?
I agree that pixel-perfect layouts are very, very hard to achieve with HTML/CSS, particularly with forms. However, I think PDFs can receive input from external web forms, or have text fields that will print once filled out.
Flex outputting to PDF would be a good idea, but I don't think using Flex as a rendering engine will help too much with this.
Another option would be to make the PDF and use a server-side language to customize it with fields from a prior web form, and output the result. (This can easily be done with Ruby/Django/PHP; there are some good PDF libraries out there.)
First, abandon pixels. What you're looking for is a print stylesheet, with everything specified using physical units (cm/inches), font size in pt, etc. What is displayed on the screen, in what font size, and whether it is pixel-perfect or not doesn't seem relevant to your requirement of producing a scannable copy.
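To make the arithmetic behind physical units concrete, here is a small Python sketch based on the fixed ratios CSS defines between absolute units (1in = 2.54cm = 25.4mm = 72pt = 96px). The names UNITS_PER_INCH and to_pt are made up for illustration.

```python
# How many of each unit make up one inch, per the CSS absolute length ratios.
UNITS_PER_INCH = {"in": 1, "cm": 2.54, "mm": 25.4, "pt": 72, "px": 96}


def to_pt(value, unit):
    """Convert a length in the given CSS absolute unit to points."""
    return value / UNITS_PER_INCH[unit] * 72


# US Letter is 8.5in x 11in; A4 is 210mm x 297mm.
letter_pt = (to_pt(8.5, "in"), to_pt(11, "in"))
a4_pt = (to_pt(210, "mm"), to_pt(297, "mm"))
```

Here letter_pt works out to 612 by 792 points, while A4 is narrower and taller, which is one reason a layout tuned for one paper size does not line up on the other.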
The question is now, is IE6 support for physical units and print stylesheets complete enough for that? Given my experience with making print stylesheets for clients, where IE would simply crash during the print process if you looked at it wrong, I'd say not too likely -- not with the complexity of the forms you're dealing with.
If you're worried about the renderer (IE, Acrobat, etc.) screwing up, you could always render the form on the server, and just serve an image to the user.
Dean, check out Prince. Bert Bos and Håkon Wium Lie used it for production of their book on CSS. They explain a bit about it in an A List Apart article.