Prevent html emails with Style element from affecting entire web page - html

We have a single-page web app that displays emails. Some of the emails we're viewing contain style elements that, when loaded into the DOM, affect our entire app. What's the best way to prevent this from happening? I'm currently removing style elements using the HtmlAgilityPack as shown in the post below, but I'm wondering if there's an easier way.
Regex to remove body tag attributes (C#)

Use iframes. That will put the message into a separate document, and there will be no styling interference.
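As a rough illustration of the iframe approach, here is a minimal server-side sketch in Python. It assumes the email body is already available as a string and that the target browsers support the standard srcdoc and sandbox attributes; the function name email_iframe is made up for this example.

```python
from html import escape


def email_iframe(email_html: str) -> str:
    """Wrap untrusted email HTML in a sandboxed iframe (hypothetical helper)."""
    # quote=True also escapes quotes, so the email markup cannot
    # terminate the srcdoc attribute value early.
    srcdoc = escape(email_html, quote=True)
    # An empty sandbox attribute applies every sandbox restriction,
    # including disabling scripts inside the frame.
    return f'<iframe sandbox="" srcdoc="{srcdoc}"></iframe>'


snippet = email_iframe("<style>body { background: red; }</style><p>Hello</p>")
print(snippet)
```

The style element now lives in the iframe's own document, so its body rule applies to the frame and not to the host page.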

Since you said HTML emails, the only way to make the CSS work is to use inline styles. External CSS will never work in HTML emails.

Related

Injecting a dynamic html to pages - how to not inherit css?

I have an extension that injects HTML as a notification. The problem is that every site renders this HTML differently, since my HTML inherits all of the site's CSS rules.
So I wondered if there's a way to inject this HTML and keep it from rendering differently on every website.
I would personally tag up all parts of my HTML and include inline CSS rules that specify exactly how I want the HTML to appear.
Not including any styling information puts you at the mercy of the designers of each site.

gmail html preview removes css rule with page-break property

We are sending an HTML page as an attachment to our users, which they can use to print. When you view the HTML page using Gmail in a browser, it removes any CSS rule with the page-break-after property.
We want to force a page break for printing.
What is the workaround for this?
Nothing you can do about it. As explained by chipcullen, Gmail strips out all CSS in a web page except what is inline inside the HTML tags, and even with those it does strange things like removing the page-break-after property.
The only workaround I can think of is to keep the HTML file on your server and simply send a link to it in your mail message in place of the attachment.
Great is the mystery surrounding these types of decisions made by Google engineers.
Maybe they want users to go back using Outlook. Well, they almost convinced me.
I can't say about the page-break-after property specifically, but I do know that Gmail does WEIRD things to CSS in HTML emails. For instance, Gmail will strip out any CSS that is either in the <head> or inline with the <body> tag. You will have to apply your rule inline.
From what it sounds like, you're relying more on the browser to render the attachment, and the user is printing from there. Are you sure page-break-after is supported in your test browser?

Best way to display a web page within a web page?

I'm writing a webmail product and some emails have body css that changes the background ... so when I Html.Decode() that emailbody, it's altering the CSS of the entire page.
Is there a good way to contain that problem?
You can make your CSS more specific than the email's rules. For example:
body.body is more specific than .body or body
Any styles in body.body that clash with those in the less specific selectors above will override them. But to stop the styles merging together, you'll need to define every single style.
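To make the specificity comparison concrete, here is a small Python sketch that counts IDs, classes, and type selectors for simple selectors like the ones above. It is deliberately naive: it ignores attribute selectors, pseudo-classes, and pseudo-elements, and the function name specificity is just a label for this example.

```python
import re


def specificity(selector: str) -> tuple[int, int, int]:
    """Return (ids, classes, types) for a simple selector (naive sketch)."""
    ids = len(re.findall(r"#[\w-]+", selector))
    classes = len(re.findall(r"\.[\w-]+", selector))
    # A type selector is a name at the start or right after a combinator.
    types = len(re.findall(r"(?:^|[\s>+~])[a-zA-Z][\w-]*", selector))
    return (ids, classes, types)


for sel in ("body.body", ".body", "body"):
    print(sel, specificity(sel))
```

Tuples compare left to right, so the selector with the larger tuple wins when two rules clash, which is why body.body beats both .body and body.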
Alternatively you can go with rewriting the CSS in the emails, which is the way most webmail/desktop email clients go these days, one way or the other. If you prefix all the rules with #emailMessage, for example, and place the email inside a <div id="emailMessage"></div> tag, all the styles in the email will only apply inside that namespace.
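The prefixing idea can be sketched roughly as follows in Python. This is a regex-based illustration, not a real CSS parser: it assumes flat rules only (no @media blocks, comments, or braces inside strings), and it maps the email's html and body selectors onto the wrapper element itself, which is a design choice of this sketch. The name scope_css is hypothetical.

```python
import re

# Matches one flat rule: a selector list followed by a declaration block.
RULE = re.compile(r"([^{}]+)\{([^{}]*)\}")


def scope_css(css: str, scope: str = "#emailMessage") -> str:
    """Prefix every selector in flat CSS rules with the scope selector."""
    rules = []
    for match in RULE.finditer(css):
        scoped = []
        for sel in match.group(1).split(","):
            sel = sel.strip()
            # The email's html/body rules should style the wrapper div.
            if sel.lower() in ("html", "body"):
                scoped.append(scope)
            else:
                scoped.append(f"{scope} {sel}")
        body = match.group(2).strip()
        rules.append(f"{', '.join(scoped)} {{ {body} }}")
    return "\n".join(rules)


print(scope_css("body { background: black; } p, .note { color: red; }"))
```

For real-world email, where the CSS is often invalid, a proper CSS parser would be a safer base than a regular expression.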
Using an iframe to display emails only introduces more problems based around accessibility, etc etc. Good luck.
The answer to your question is probably "iframe", but in your specific situation, writing a webmail client is going to introduce you to a wonderful new hell called "stripping css from possibly extremely invalid html generated by a large variety of clients that all have their own ideas about what kind of html should be allowed in an email".
Good luck!
A common way is to use an iframe, although I'm not sure this is applicable to your problem.
Basically it loads a different HTML page inside another page, which makes it independent, but it does mean you have two pages to display one email.
http://www.w3schools.com/TAGS/tag_iframe.asp

How to make HTML written by users on a site, not conflict with the site's stylesheets?

I have a website that allows a user to create blog posts. There are some blacklisted tags but most standard HTML tags are acceptable.
However, I'm having issues with how the pages get displayed.
I keep the HTML wrapped in its own div.
I would ultimately like to keep the HTML from the user separate from the main site's stylesheets so it can avoid inheriting styles and screwing up the layout of the originating site where the HTML is being displayed.
So in the end, is there anything I can apply to a div so its contents are quarantined from the rest of the site?
Thanks!
You could use a reset stylesheet to reset the properties for that specific DIV and its children. And on the other side, you'll probably need a CSS parser to adjust the user's stylesheet for that specific DIV.
You can do it in a frame or an iframe. That will keep it separate in every way.
Could you wrap each piece of user-generated content in a div with a class of 'username', in addition to any other class names you may add automatically?
Then they (I assume 'they') can format and style as they please, as long as all their styles are prefixed like so: div.username selector.
Otherwise, you may be able to use iframes.
You could use an iframe to keep them completely separate if you really wanted to be extreme about it.
Or, you could restrict them to only writing inline styles so that they can't affect your page's stylesheet:
1) Strip out any style tags from the HTML the user creates. This way they can't override your styles.
2) Validate their code and either fix or reject things like unclosed tags or elements that shouldn't be inside a div (like head or body tags). Make sure everything gets closed properly so it can't mess up any of your page's HTML after the div that contains it.
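Step 1 could look something like the following Python sketch, built on the standard library's html.parser. It drops style elements and their contents while leaving inline style attributes alone. It does not do the validation from step 2, and it also drops comments and doctype declarations as a side effect. The names StyleStripper and strip_style_elements are made up for this example.

```python
from html.parser import HTMLParser


class StyleStripper(HTMLParser):
    """Re-emit HTML while skipping <style> elements (hypothetical helper)."""

    def __init__(self):
        super().__init__(convert_charrefs=False)
        self.out = []
        self.in_style = False

    def handle_starttag(self, tag, attrs):
        if tag == "style":
            self.in_style = True
        elif not self.in_style:
            # Reuse the original tag text so inline styles survive as-is.
            self.out.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        if tag != "style" and not self.in_style:
            self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if tag == "style":
            self.in_style = False
        elif not self.in_style:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if not self.in_style:
            self.out.append(data)

    def handle_entityref(self, name):
        self.out.append(f"&{name};")

    def handle_charref(self, name):
        self.out.append(f"&#{name};")


def strip_style_elements(html_text: str) -> str:
    parser = StyleStripper()
    parser.feed(html_text)
    parser.close()
    return "".join(parser.out)


cleaned = strip_style_elements(
    '<div><style>body{color:red}</style><p style="color:blue">Hi &amp; bye</p></div>'
)
print(cleaned)
```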

Screen scraping pages that use CSS for layout and formatting...how to scrape the CSS applicable to the html?

I am working on an app for doing screen scraping of small portions of external web pages (not an entire page, just a small subset of it).
So I have the code working perfectly for scraping the HTML, but my problem is that I want to scrape not just the raw HTML, but also the CSS styles used to format the section of the page I am extracting, so I can display it on a new page with its original formatting intact.
If you are familiar with Firebug, it is able to display which CSS styles are applicable to the specific subset of the page you have highlighted, so if I could figure out a way to do that, then I could just use those styles when displaying the content on my new page. But I have no idea how to do this.
Today I needed to scrape Facebook share dialogs to be used as dynamic preview samples in our app builder for facebook apps. I've taken Firebug 1.5 codebase and added a new context menu option "Copy HTML with inlined styles". I've copied their getElementHTML function from lib.js and modified it to do this:
remove class, id and style attributes
remove onclick and similar javascript handlers
remove all data-something attributes
remove explicit hrefs and replace them with "#"
replace all block-level elements with div and inline elements with span (to prevent inheriting styles on the target page)
absolutize relative urls
inline all applied non-default CSS attributes into a brand new style attribute
reduce inline style bloat by considering parent/child style inheritance while traversing up the DOM tree
indent output
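The attribute-level steps from the list above (dropping class, id, style, event handlers, and data attributes, replacing hrefs with "#", and absolutizing relative URLs) can be sketched in Python as shown below. This is not the Firebug code described in the answer, only an illustration of the same rules applied to a list of (name, value) pairs. The function name clean_attrs and the example URL are assumptions, and only src is absolutized here.

```python
from urllib.parse import urljoin


def clean_attrs(attrs, base_url):
    """Apply the attribute clean-up rules to (name, value) pairs (sketch)."""
    cleaned = []
    for name, value in attrs:
        # Drop hooks for the source page's stylesheets.
        if name in ("class", "id", "style"):
            continue
        # Drop onclick-style handlers and data-* attributes.
        if name.startswith("on") or name.startswith("data-"):
            continue
        if name == "href":
            value = "#"  # neutralize explicit links
        elif name == "src" and value:
            value = urljoin(base_url, value)  # absolutize relative URLs
        cleaned.append((name, value))
    return cleaned


example = clean_attrs(
    [("class", "x"), ("onclick", "go()"), ("data-id", "7"),
     ("href", "/share"), ("src", "img/a.png"), ("alt", "A")],
    "https://example.com/app/page.html",
)
print(example)
```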
It works well for simpler pages, but the solution is not 100% robust because of bugs in Firebug (or Firefox?). But it is definitely usable when operated by a web developer who can debug and fix all quirks.
Problems I've found so far:
sometimes clear css property is not emitted (it breaks layout pretty badly)
:hover and other pseudo-classes cannot be captured this way
Firefox keeps only Mozilla-specific CSS properties/values in its model, so for example you lose -webkit-border-radius, because it was skipped by the CSS parser
Anyway, this solution saved me a lot of time. Originally I was manually selecting pieces of their stylesheets and doing manual selection and postprocessing. It was slow, boring and polluted our class namespace. Now I'm able to scrape Facebook markup in minutes instead of hours, and the exported markup does not interfere with the rest of the page.
A good start would be the following: make a pass through the patch of HTML you plan to extract, collecting each element (and its ID/classes/inline styles) to an array. Grab the styles for those element IDs & classes from the page's stylesheets immediately.
Then, from the outermost element(s) in the target patch, work your way up through the rest of the elements in the DOM in a similar fashion, eventually all the way up to the body and HTML elements, comparing against your initial array and collecting any styles that weren't declared within the target patch or its applied styles.
You'll also want to check for any * declarations and grab those as well. Then, make sure when you're reapplying the styles to your eventual output you do so in the right order, as you collected them from low-to-high in the DOM hierarchy and they'll need to be reapplied high-to-low.
A quick hack would be to pull down their CSS file and apply it to the page you are using to display the data. To avoid any interference you could load the page into an IFrame wherever you need to display it. Of course, I have to question the intention of this code. Are you allowed to republish the information you are scraping?
If you have any way to determine the "computed style" then you could effectively throw away the style sheet and, ****gasp****, apply inline styles using all of the computed styles' properties.
But I don't recommend this. It will be very bloated.
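For what it's worth, the bloat can be reduced somewhat by emitting only the properties that differ from a known set of defaults. The Python sketch below assumes the computed styles have already been obtained somehow (for example from a browser) and are passed in as plain dictionaries; the function name inline_style and the sample values are hypothetical.

```python
from html import escape


def inline_style(computed: dict[str, str], defaults: dict[str, str]) -> str:
    """Build a style attribute from the non-default computed properties."""
    kept = {prop: val for prop, val in computed.items() if defaults.get(prop) != val}
    declarations = "; ".join(f"{prop}: {val}" for prop, val in sorted(kept.items()))
    # Escape so quotes in values (e.g. font names) cannot break the attribute.
    return f'style="{escape(declarations, quote=True)}"'


attr = inline_style(
    {"color": "rgb(255, 0, 0)", "display": "block", "float": "none"},
    {"color": "rgb(0, 0, 0)", "display": "block", "float": "none"},
)
print(attr)
```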