Designing entire webpages as SVG files - html

Disclaimer
I realize that given the absurdity of the title, this sounds like a troll. However, it's a genuine question. My background involves OpenGL / x86 assembly. I've recently started learning web programming. I really like SVG + CSS, and was wondering -- why do people not design entire webpages in SVG?
Context
SVG provides beautiful primitive: quadratic + cubic bezier curves; lines + filling -- all as vector graphics
SVG provides text
SVG provides affine transformations
Questions
Are there examples of people designing entire websites as a giant SVG file?
If not, what the limitations?
Are there performance hits when using SVG primitives as opposed to divs/tables?

It is possible; for example you can embed HTML fragments in SVG documents in order to get things like hyperlinks.
However, there are some significant disadvantages, at least at present:
Current web browsers treat SVGs as images, and may not present as good a UI to users. For example, I think Firefox doesn't allow the user to select text in SVG files.
You lose separation of content and presentation. While SVG does use CSS, and you can in principle maintain the separation if you edit by hand, you are probably designing the layout together with the content. This has several drawbacks:
As a corollary, it's harder to adapt the resulting page to other formats. Particularly:
What's the behavior when the text size is changed? the document is printed? the window is resized? It's hard to design a complex drawing that supports reflow nicely (and if your drawing isn't complex, you may as well just use HTML+CSS).
Screen reader support: since order is not clear (below), screen readers may give incomprehensible, scrambled output. More basically, screen readers may assume the SVG is an image and not even try to read the text.
SVG is exclusively based on XML, and hence requires pretty strict adherence to the rules. With (X)HTML, you have the option of using the plain HTML serialization. Then many of these rules are relaxed, and browsers are more robust if you feed them bogus input (as opposed to an XML PARSING ERROR if you have a single misplaced >).
Current search engines probably won't index your pages (they'll just treat them as monolithic images). Never mind
Order of the content is not clear. Tools like Inkscape don't need to care about the order elements are output in, as long as they are positioned correctly in the output and the z-order is correct. But if you're making a web page, this does matter, because screen readers don't know for sure which element is semantically first. This isn't an issue if you're only editing the SVG by hand, but the usual SVG tools may scramble your order. With HTML, it's generally clear.
It's difficult to implement fragment identifiers (#id_of_some_element at end of URL) well. The presentation program below uses them, but I think it depends on Javascript (bad for search engines and people with javascript disabled). (I'm not sure about this one)
*̶ ̶I̶t̶'̶s̶ ̶d̶i̶f̶f̶i̶c̶u̶l̶t̶ ̶t̶o̶ ̶c̶o̶n̶v̶e̶r̶t̶ ̶t̶o̶ ̶t̶e̶x̶t̶ ̶(̶f̶o̶r̶ ̶s̶e̶a̶r̶c̶h̶ ̶e̶n̶g̶i̶n̶e̶s̶,̶ ̶s̶c̶r̶e̶e̶n̶ ̶r̶e̶a̶d̶e̶r̶s̶,̶ ̶c̶o̶p̶y̶-̶a̶n̶d̶-̶p̶a̶s̶t̶e̶,̶ ̶e̶t̶c̶.̶)̶,̶ ̶p̶a̶r̶t̶i̶c̶u̶l̶a̶r̶l̶y̶ ̶w̶h̶e̶n̶ ̶g̶e̶n̶e̶r̶a̶t̶e̶d̶ ̶w̶i̶t̶h̶ ̶g̶r̶a̶p̶h̶i̶c̶a̶l̶ ̶t̶o̶o̶l̶s̶.̶ ̶F̶o̶r̶ ̶e̶x̶a̶m̶p̶l̶e̶,̶ ̶h̶t̶t̶p̶:̶/̶/̶w̶w̶w̶.̶a̶f̶r̶o̶k̶a̶d̶a̶n̶s̶.̶c̶o̶m̶/̶P̶a̶g̶e̶_̶0̶1̶.̶s̶v̶g̶ ̶s̶h̶o̶w̶s̶ ̶u̶p̶ ̶i̶n̶ ̶G̶o̶o̶g̶l̶e̶ ̶a̶s̶ ̶̶D̶A̶N̶S̶E̶ ̶E̶T̶ ̶M̶I̶S̶E̶ ̶E̶N̶ ̶F̶O̶R̶M̶E̶ ̶B̶I̶E̶N̶V̶E̶N̶U̶E̶ ̶D̶ ̶a̶ ̶n̶ ̶s̶ ̶e̶ ̶r̶ ̶p̶o̶ ̶u̶ ̶r̶ ̶c̶é̶ ̶l̶ ̶é̶b̶ ̶r̶ ̶e̶ ̶r̶ ̶l̶ ̶a̶ ̶v̶ ̶i̶e̶.̶ ̶E̶ ̶n̶ ̶v̶e̶ ̶l̶ ̶o̶p̶p̶ ̶é̶ ̶e̶ ̶d̶ ̶e̶ ̶t̶ ̶a̶ ̶m̶ ̶b̶ ̶o̶ ̶u̶r̶ ̶s̶ ̶i̶n̶ ̶d̶o̶ ̶c̶ ̶i̶l̶ ̶e̶ ̶s̶ ̶e̶ ̶t̶ ̶i̶ ̶n̶ ̶c̶e̶ ̶s̶s̶ ̶a̶ ̶n̶t̶s̶̶.̶ ̶I̶n̶ ̶c̶o̶n̶t̶r̶a̶s̶t̶,̶ ̶e̶v̶e̶n̶ ̶c̶r̶a̶p̶p̶y̶ ̶W̶Y̶S̶I̶W̶Y̶G̶ ̶H̶T̶M̶L̶ ̶g̶e̶n̶e̶r̶a̶t̶o̶r̶s̶ ̶w̶o̶n̶'̶t̶ ̶s̶p̶l̶i̶t̶ ̶u̶p̶ ̶r̶u̶n̶s̶ ̶o̶f̶ ̶t̶e̶x̶t̶ ̶l̶i̶k̶e̶ ̶t̶h̶a̶t̶.̶
Something to consider is that it is possible to embed SVG elements in XHTML and HTML 5, so you get some of the benefits without throwing off browsers/search engines.
Existing usage
I've heard of its use for presentations (which are closer to drawings in some respects, so some of the above drawbacks don't apply):
"Jessyink is a JavaScript that can be incorporated into an Inkscape SVG image containing several layers. Each layer will be converted into one slide of a presentation."
Another implemetation; info and an example

It is really cool to think that an .svg image file can be used to create a webpage without knowing any html. Considering how messy html standards are, to be able to use an Inkscape .svg file might be a lot easier for people who are naturally artistic. When you think about Inkscape and Openclipart being mostly useful for people who are doing Desktop Publishing, Scrapbooking etc to be able to export a webpage directly from Inkscape would be a powerful tool.
As for 600 years of typesetting we are in the 21st century, and media publishing doesn't have to conform to medieval ideas of formatting. Typesetting does have its palace but we are talking about a powerful experiment that could help 'average Joe' users turn their art into webpages without knowing CSS or HTML.
For reference, the Opera team has done quite a lot of work with using SVG for web pages - http://dev.opera.com/articles/svg/

There is a platform designed specifically to create SVG-based websites: Svija (disclaimer: it is my project).
The advantage of an SVG website is that you can literally do anything you want; you're not limited to rows of rectangles as you are with HTML.
The big issue is accessibility. Right now SVG text is crawled by Google but it can easily be out of order depending on the source document. Also semantic information that would normally be conveyed by HTML tags (H1 is more important than P) is nonexistant.
The main things to realize are that:
you can use external fonts and images in an SVG file
the width and height have to be specified correctly for Microsoft's browsers to display the SVG at the correct size
if you combine different SVG files, conflicting ID's can cause CSS styles to be misapplied
The only program that I have found that can correctly link to images and fonts is Adobe Illustrator. There is a list of all the programs I have found that can create SVG files here, with some information about which programs support which features.

Actually, there are pages that rely heavily on SVG; using a Javascript library such as Raphaël is a common way to do it. Paper.js is also worth a peek, although they chose not to use SVG.

You can indeed make entire web sites with SVG, I've been doing it for years. If you're willing to change how you look at designing pages (Think Cards) then it's just a matter of trial and error.

Related

Developing an entire website with only SVGs

I've been using SVGs for a multitude of things and I was able to get the exact result I was looking for. This got me wondering why we can't build entire websites with them...
I don't mean like replace index.html with index.svg. More like having the basic HTML framework (for meta tags, styelsheets, scripts, etc.) and then giving <body> a single inline <svg> child where everything else happens. (text, layout, scaling, etc.)
My initial assumptions are that the website will fall short when it comes to form functionality, SEO, and accessibility. But then again if those can somehow be circumvented, then powerful SVG features can be used to render lightweight graphical effects that can effectively scale to any device dimensions.
Has this been attempted before and what are the potential pitfalls of "replacing" HTML with SVG? Should form functionality, SEO, and accessibility be a valid concern?
Maybe it is time to answer this question again. Contributions and edits are welcome.
Accessibility
SVG content is visual content. That in itself poses restrictions on accessibility. Nonetheless, the upcoming SVG2 spec has a paragraph on ARIA attributes that tries to give at least a bit of assistance. If you look more closely, you will find that for example a graphics primitive will get a default aria-role="graphics-symbol", which is not really helpful.
Whether screen readers implement any of this I don't know.
SVG gives you the opportunity to add <title> and <desc> tags anywhere. But the standard semantics that HTML5 offers through tags are lost. No header, no sections, no TOC or navigation. Try to imagine what the Firefox Reader View could make out of such a page: nothing.
Even linking to partial content of a page is difficult. While the spec is there, the semantic remains constricted: do you really want to highlight a link target by scaling it up to cover the whole viewport?
SEO in the end is just a specialized aspect of accessibility. Having an HTML <header> segment, a few things might be possible. Maybe someone else can comment on what Google implements.
Responsiveness
Problems start with the current inability of automatically line-wrapping text. Layout facilities are still a future project without working implementations. (While Firefox knows CSS inline-size, it is only implemented for HTML and does not work in SVG.)
CSS has put a lot of effort into providing layout mechanisms like flex and grid layout. In SVG, you lose all these techniques for reordering content in favor of simple automatic scaling.
Interaction
Text input is a solvable problem. You can use HTML inputs wrapped in a <foreignObject> tag, or you can build text input widgets from scratch by capturing keyboard input into a <text> element.
For most other forms of user input SVG might even be the superior platform. For example, clicking on/touching something to mark it and dragging it around/swiping it turn out to be only two aspects of basically the same interaction.
Especially Web components are the perfect complement to SVG for building interactive widgets. (Despite you still have to go through polyfills for compatibility with Firefox.)
Form submission
While HTML form submission for initiating HTTP requests might be absent, the advent of single-page apps has shown that it is possible to employ XHR requests instead.
Conclusion
If your content depends on or benefits from the standard semantics HTML5 offers, SVG is clearly not appropriate.
If you need to change the layout of your page in a responsive way, SVG will not be helpful.
SVG gives a lot of additional possibilities for creative, visually-oriented user interactions. As long as you find fallback solutions for visually-impaired users, your page will gain from it.
Overall, SVG gives you creative possibilities for specialized areas and widgets on your page, but falls short on the basic semantic questions a web page has to answer: How does your page relate to other resources on the web? What on your page is content, what is meta information, what is decoration?
The best of both worlds is to be gained by using a mix-and-match approach: structure your page with HTML, decorate it with SVG graphics, and build interactive/animated widgets with SVG and maybe Web Components.
You can build an entire website with SVG. I do it all the time, using Adobe Illustrator to create the pages.
You can go to my site, ozake.com to see a nice example. Even that is pretty basic compared to what is possible.
At first I did it all by hand, but the repetitive parts were annoying so I built a tool called Svija (svija.love, also SVG only).
Essentially you just put an SVG inside of the html body tag.
There are a few things to be aware of:
Microsoft browsers need exact pixel dimensions to be specified or the SVG will be drawn at the wrong size.
The Safari browser does not smoothly scale fonts (see my answer on this page). The consequence is that if you have two adjacent text blocks the space between them will vary randomly if the SVG size varies.
If you want to have several SVG's (one for the page, one for the footer, one for a menu, and one for a special event, for example), you need to be careful that the internal ID's are different.
Otherwise, a color style from the second SVG could be applied to the first SVG, with unpredictable results.
You will probably need to update the font style definitions inside the SVG's to make them work with web fonts. I use both Google Fonts and uploaded WOFF's.
You can make forms as a separate HTML layer based on the Adobe Illustrator system of coordinates. I just build the form in Illustrator, then copy the location and size of each element into absolutely-positioned HTML elements.
It's easier to build separate SVG's for repetitive elements, as I alluded to above. Rather than having to copy a footer to each SVG, just make the footer a separate SVG that's draw on top of the main page.
I built Svija in part to make sure that each SVG has unique internal ID's and to handle the font naming conversion.
You can link to external fonts and images as you would in HTML.
You can animate anything you want using GSAP.
The site is listed normally by Google, but the text will be in the order that it appears in the SVG. You need to pay attention in constructing the SVG if this is a priority.
To handle accessibility, I put the page text in a separate DIV before the SVG. Theoretically, a page reader should read the text inside an inline SVG but I have been unable to find out if this is the case.
I don't want to come across as trying to push my project. It's totally possible to do it all by hand.

What is the use of HTML5 semantic markup for intranet applications?

As far as I understand, the only real advantage of HTML5 semantic markup is for search engines and web crawlers to interpret the document better.
Since intranet applications have nothing to do with search engines or web crawlers, what are the advantages of using semantic markup in HTML5?
There is no straightforward example to point out, but the website (even intranet) can be consumed by different user agents (on different devices).
You are probably familiar with Skype (and the iOS Safari) making phone-number-like words clickable. In the future I can easily imagine mobile browsers being smarter to assist the user in completing tasks on the page, like importing a clearly indicated contact to the address book.
Screenreaders for blind people?
While there is not a whole lot of immediate benefit for non-disabled people, it is still good practice. Does your company not have any externally facing sites? If it does, do those people not look at internal page code? Good practices spread just like bad ones.
see also: http://en.wikipedia.org/wiki/Semantic_HTML
Simply said there the only rule you have to follow if you create your html documents is that it is valid html, otherwise you will have the problem that the browser would try to correct your broken syntax which may result in defects of the visual representation of your content.
In modern browsers you can use display to given any element - with some limitations e.g. with input element - the same visual look and behavior then any other element.
So if you ask what are the advantages of using semantic markup in HTML5 you should ask, why to use any of the semantic markup if it is possible to have the same result using css.
The short answer is, no one will stop you if it is your own project where you are responsible to - except the client that probably gives you requirements.
It is similar to asking: Language xyz provides comments and there is a syntax for doc-comments, but why should I use them?.
Using the semantics wisely increases the readability and thus maintainability. You are not required to use every possibility of semantics at all costs.
Using them will help you to get into the code again if you haven't looked at it for a longer time, e.g. to distinguish between the elements that encapsulate logical parts and elements that are used for styling. Especially if you use a template engine to create your code or to search for certain elements in multiple files.
Even if you are now the only one who works on the code it may happen that if the project grows that you need other people to work on the code. Or for the situation, you for some reason are not available, someone else needs to maintain your code, a good markup is essential.
Using the correct markup and additions like WAI-ARIA is not only essential for handicapped people, but also allows the browser to recognize the meaning of elements, allowing to e.g. improve the keyboard navigation. Especially in a productive environment where you need to type much, it is often faster to navigate with keyboard then using a trackpad or a mouse.

Why HTML with disabled CSS matters?

When creating a website why should you care for HTML with no style?
Is there any device which will render HTML only (no CSS or JavaScript)?
Do you usually care how your website will display without CSS?
Why is it important?
There are several cases in which websites may be used without styling. As mentioned in the comments, screen readers (such as those used by visually-impaired people) read only content, not styling.
Perhaps more importantly, many search engine spiders (think: Google) read your site without styling. When you view your site without CSS, you will gain a better understanding of how search engines view your content.
And if you are lucky, or your content is particularly geeky, you may get the occasional guru who browses your site via Lynx.
There seem to be a few misconceptions here. First of all and most importantly, screenreaders do take into account CSS and JavaScript. Why? Simply because unlike in the past they are not running on their own, but rather work as addons for existing browsers or include the render engines inline in their own systems.
Does that mean you don't need to concern yourself with screenreaders at all? Sadly that's not the case either. For example, if you add display:table to an element just because you want to vertically align something some screenreaders will actually treat it like a real table (which makes no practical sense). The good part is though that pages are read top-to-bottom, header and menu first (if found) and that adding display:none through javascript to an element will hide the element from the screen reader as well. Now, the following is going to sound really harsh, but except if you're making a real high profile website I wouldn't advice you to concern yourself with this too much. On one hand screenreaders are becoming better and better (try for example the one that's included on your android device if you have a recent version of android) and on the other hand blind people are used to websites being a 'bit' messy. Now, that doesn't mean you should start using flash or otherwise crazy stuff, but it does mean that if you just write a proper website, make your menu a list, make your divisions divs and not tables etc. you should in general be fine. And if you are making a high profile website then you should check out WAI-ARIA.
Now, getting to the search engine part, that's not true either for the big search engines at least. Google does take styling into account. Not all the styling that's unimportant for Google, but it actually will realize which stuff is hidden and analyzes the javascript whether hidden content will be shown (as part of it's anti-SEO work), it will search for links in your javascript and probably lots more I am not aware of. Bing does this to a large extend as well, though for example duckduckgo does not do this too much/at all. Either way, once again, the notion that Google sees your site like lynx does was true in the far past, but by now is invalid.
And if you check your serverlogs you will see that nobody accessed your site through lynx. That's just the reality of life nowadays. In the past (again) people would occasionally use lynx if they only had access to a console, but nowadays it's far easier to pull your phone from your pocket which runs a full web browser.
First part of the answer : 'text based browsers'
Text-based browser list
Alynx
ELinks (active version of Links)
Emacs/W3
Line Mode Browser
Links
Lynx
Net-Tamer
w3m
WebbIE
Second part : 'search engines'
List of semantic search engines
List of search engines
Third part : 'web accessibility' where software helps people with disabilities get access to the web.
It's important to note that for the third part, accessibility, it is
sometimes a legal requirement. For example, in the UK it is illegal to
have a website that is not accessible to blind people. There are
similar requirements for US government services. – slebetman
It's also an applicable law in canada
See this list of tools from w3 for a Complete List of Web Accessibility Evaluation Tools
CSS isn’t an on/off thing. Although CSS may be completely disabled, it is much more common that some of your CSS settings get ignored or overridden. Here is an incomplete list of cases (see my CSS Caveats for some additional details):
Speech-based browsers generally ignore most of CSS, largely because most of CSS is directed towards visual rendering.
So do “text-only browsers” (more accurately, character cell browsers, which render in plain text only using a monospace font but may be able to use colors and bolding).
Search engines generally don’t care about CSS at all.
CSS support varies. The more advanced CSS features you use, the more probable it is that many browsers don’t implement them.
CSS support may be disabled by the user, completely or partially.
User style sheets may interfere with your CSS code or even override them.
Browser settings e.g. on minimum font size may make some of you CSS settings ineffective.
Browsers have bugs. The more complicated CSS techniques you use, the more probable it is that you trigger a bug in some browsers.
An external CSS file (the recommended way of using CSS) may get lost, e.g. a browser may need to wait (perhaps in vain) from a server, or an archiving system may archive an HTML file but fail to archive the CSS file.
Styling may get lost in transfer, e.g. when copying content from a web page to MS Word or Excel or Notepad or email.

Does SVG text generated by Javascript get indexed by search engines?

I am concerned about doing a product animated presentation with SVG. The planed animation is a little too complex to be achieved with regular DOM manipulation (non SVG) and of course canvas is not an alternative since the content has to be indexed by Search engines. The animation is already mocked up and follows a typographic style.
That concern comes from the fact that I don't know if dynamic generated and injected text inside SVG will be indexed by search engines with the same richness other DOM elements will, or if it will be indexed at all!?
Would be good to know if somebody here already managed this situation, in practice, and if the indexing happened as expected (although any good and well documented hypothesis could help). In negative case, alternative solutions are welcome too.
No - it won't be in your page when it is crawled. If you want content to be indexed, serve it up straight from the server.

Apart from <script> tags, what should I strip to make sure user-entered HTML is safe?

I have an app that reprocesses HTML in order to do nice typography. Now, I want to put it up on the web to let users type in their text. So here's the question: I'm pretty sure that I want to remove the SCRIPT tag, plus closing tags like </form>. But what else should I remove to make it totally safe?
Oh good lord you're screwed.
Take a look at this
Basically, there are so many things you want to strip out. Plus, there's stuff that's valid, but could be used in malicious ways. What if the user wants to set their font size smaller on a footnote? Do you care if that get applied to your entire page? How about setting colors? Now all the words on your page are white on a white background.
I would look into the requirements phase again.
Is a markdown-like alternative possible?
Can you restrict access to the final content, reducing risk of exposure? (meaning, can you set it up so the user only screws themselves, and can't harm other people?)
You should take the white-list rather than the black-list approach: Decide which features are desired, rather than try to block any unwanted feature.
Make a list of desired typographic features that match your application. Note that there is probably no one-size-fits-all list: It depends both on the nature of the site (programming questions? teenagers' blog?) and the nature of the text box (are you leaving a comment or writing an article?). You can take a look at some good and useful text boxes in open source CMSs.
Now you have to chose between your own markup language and HTML. I would chose a markup language. The pros are better security, the cons are incapability to add unexpected internet contents, like youtube videos. A good idea to prevent users' rage is adding an "HTML to my-site" feature that translates the corresponding HTML tags to your markup language, and delete all other tags.
The pros for HTML are consistency with standards, extendability to new contents types and simplicity. The big con is code injection security issues. Should you pick HTML tags, try to adopt some working system for filtering HTML (I think Drupal is doing quite a good job in this case).
Instead of blacklisting some tags, it's always safer to whitelist. See what stackoverflow does: What HTML tags are allowed on Stack Overflow?
There are just too many ways to embed scripts in the markup. javascript: URLs (encoded of course)? CSS behaviors? I don't think you want to go there.
There are plenty of ways that code could be sneaked in - especially watch for situations like <img src="http://nasty/exploit/here.php"> that can feed a <script> tag to your clients, I've seen <script> blocked on sites before, but the tag got right through, which resulted in 30-40 passwords stolen.
<iframe>
<style>
<form>
<object>
<embed>
<bgsound>
Is what I can think of. But to be sure, use a whitelist instead - things like <a>, <img>† that are (mostly) harmless.
† Just make sure that any javascript:... / on*=... are filtered out too... as you can see, it can get quite complicated.
I disagree with person-b. You're forgetting about javascript attributes, like this:
<img src="xyz.jpg" onload="javascript:alert('evil');"/>
Attackers will always be more creative than you when it comes to this. Definitely go with the whitelist approach.
MediaWiki is more permissive than this site; yes, it accepts setting colors (even white on white), margins, indents and absolute positioning (including those that would put the text completely out of screen), null, clippings and "display;none", font sizes (even if they are ridiculously small or excessively large) and font-names (even if this is a legacy non-Unicode Symbol font name that will not render text successfully), as opposed to this site which strips out almost everything.
But MediaWiki successifully strips out the dangerous active scripts from CSS (i.e. the behaviors, the onEvent handlers, the active filters or javascript link targets) without filtering completely the style attribute, and bans a few other active elements like object, embed, bgsound.
Both sits are banning marquees as well (not standard HTML, and needlessly distracting).
But MediaWiki sites are patrolled by lots of users and there are policy rules to ban those users that are abusing repeatedly.
It offers support for animated iamges, and provides support for active extensions, such as to render TeX maths expressions, or other active extensions that have been approved (like timeline), or to create or customize a few forms.