How to force some HTML to render inside some other website? - html

I am working with an open source ETL tool (dagster) that allows me to attach some "metadata" to the results of each operation.
Typically the metadata should be text or numbers, but I want to insert an HTML snippet.
My issue is that the HTML is not "rendering" in the tool's web-ui. Here is a quick screenshot of the UI and the source tree:
Any ideas on how I can have it "render" in the UI?

I think the closest you can get to rendering HTML in Dagster is using the markdown metadata type. Rendering HTML is specifically not supported due to the possible vulnerabilities it could introduce.

Related

Get the "real" source code from a website

I've got a problem getting the "real" source code from a website:
http://sirius.searates.com/explorer
Trying it the normal way (view-source:) via Chrome I get a different result than trying it by using inspect elements function. And the code which I can see (using that function) is the one that I would like to have... How is that possible to get this code?
This usually happens because the UI is actually generated by a client-side Javascript utility.
In this case, most of the screen is generated by HighCharts, and a few elements are generated/modified by Bootstrap.
The DOM inspector will always give you the "current" view of the HTML, while the view source gives you the "initial" view. Since view source does not run the Javascript utilities, much of the UI is never generated.
To get the most up-to-date (HTML) source, you can use the DOM inspector to find the root html node, right-click and select "Edit as HTML". Then select-all and copy/paste into your favorite text editor.
Note, though, that this will only give you a snapshot of the page. Most modern web pages are really browser applications and the HTML is just one part of the whole. Copy/pasting the HTML will not give you a fully functional page.
You can get real-time html with this url,bookmark this url:
javascript:document.write('<textarea width="400">'+document.body.innerHTML+'</textarea>');

Render pages in Collection as if they were rendered by wiki engine

I am experimenting with Mediawiki Collection extension, for generating books from my articles (which I find very useful). However it doesn't render everything in the same way my mediawiki instance does.
Namely,
mathjax elements don't render
and pictures don't render as well
For instance, here how things get rendered by wiki
And how they end up in the pdf generated by Collection
I understand the reasons behind this behavior: the rendering is done not by my wiki, but by some external service, which has no idea about my client-side plugins.
My question is: how can I get my wiki render all the pages I want, possibly in HTML with all client-side extensions, and then convert the results to PDF?
When I open a page in "printable" view (with &printable=yes) it renders everything as I want. It could be nice to use that to render multiple pages at once (this is in essence what Collections does...)
Thanks
I ended up creating an extension that takes a list of pages and renders them all to a single page. After that I can just print it
The code is available here
This is how it looks like in PDF now:
Also you can add &printable=yes and it will give you a fully prepared printable version which you can just save as HTML.

Is it possible to get the generated source of a webpage programatically?

As the title states, I am wondering if there is a method to obtain the generated HTML code of a page. Obviously I can inspect the page with web developer tools (browser built-in, or external program) and get it, but I would really like to do it automatically. Perhaps using Fiddler's API it could be possible?
Thanks!
"Source" doesn't get altered by JavaScript after page load, it's the document object model (DOM) generated from the source that gets altered. It is this DOM that is then translated to the GUI, and is altered with every change as long as the page is not re-loaded.
The DOM is not a string of HTML code, it is an in-memory hierarchical object representation of the page. The browser does not maintain an up-to-date, flat-file representation of the DOM as it gets altered, which is why when you "view source" you only ever see what was originally sent to the browser over HTTP.
The node-for-node representation of the page/DOM, in developer tools such as Firebug is the closest you'll get to a re-generation of the source code (AFAIK) without building some new tool yourself.
You may be able to write a script in Python that would take a variable (the URL) and insert it after a command that would download the webpage, such as wget.
Googling it, I have found this to parse HTML files: maybe you could wget the index.HTML and use one of these:
How do you parse and process HTML/XML in PHP?

Display/Render RTF doc in browser display using html textarea or something similar

My web application has an feature wherein preformatted RTF documents are used as templates and the user can select the source of data and then merge with the RTF documents templates to create merged RTF files. The RTF templates have placeholders which get replaced with user selected content. The final doc can either be saved or opened directly if word/wordpad is available on the local users machine.
Now, I have a requirement to display the merged document to the user for confirmation. The user may either print or save the document to the system directly. The display should not be word/wordpad application but should be within the application itself, using textarea or something similar to render the document. Can you please let me know if its possible to render the RTF document in textarea or not. Along with the displayed content, there should be options to print and save the document.If I have to convert the RTF to Html and then display the html content in textarea , please let me know how i can do the conversion and then display the html in the page.
That's a very difficult requirement. First of all, let's dismiss the idea about a <textarea>, because it does not support any formatting at all. All the WYSIWYG editors you've seen out there are based on <iframe>s.
Secondly, no browser can directly display a RTF. You can embed it as an <object>, and some might show it (IE probably will), but I can't say which ones won't. Portable devices almost certainly won't. But you should test this though, maybe it works well enough after all.
Failing that, HTML conversion is also out of question, because RTF has very very many features that cannot be emulated in HTML. There are some converters out there (google), but but they will all come with serious limitations. If you want full support, you will have to do your own rendering via Canvas or Flash or something.
To this end I'd suggest checking out Google Docs. They've gone through all of this hassle and have a rather feature-full engine for displaying most possible documents. I think it was also possible to embed them in your own webapges, though I've never checked it out myself.
Use a <PRE> tag to Display/Render RTF doc in browser.

PdfSharp, GDI+ and HTML printing

I currently have a "PrintingWebService" that I call from an AJAX page with all the information that is needed to construct a highly customized PDF printout using PDF Sharp and the PDFSharp's GDI+ mode, which takes DrawString and other commands that work basically just like GDI+ only they are drawn to the PDF.
I then save the PDF file to a location on the webserver and return the file name from the web service, and the AJAX page opens a new window with the pdf file.
So far, it works well, however, there is one part of my AJAX page that I want to printout and I haven't come up with a solution for yet. I've got a string of the HTML content of a TinyMCE editor that I want to dispay in the bottom part of the PDF page.
I'm looking for some sort of tool I could use for this purpose. Even something opensource that prints to GDI+ I could use by taking the source code and translating it to use PdfSharp's GDI+ (the class names are like XGraphics, with each class having X before the GDI+ name).
If I have to I will limit what HTML can be generated by TinyMCE and write my own renderer, but that will be a big challenge, so I'm looking for other solutions first.
I've stayed away from a printer-friendly page approach because I wanted to construct a page that was a near identical of an existing WinForms printout, using my existing code. With PdfSharp I was able to convert all the code except the text area stuff (which used the RichTextBox and RTF in the WinForms version).
Tony,
I personally have used WebSupergoo's ABCPdf library with much success. You can actually render HTML directly to the PDF and it does fairly well in regards to accuracy.
Another free software that will allow you the flexibility of writing HTML to PDF that I have used in the past with much success is iTextSharp.
Otherwise, I think you'll have to write something to render HTML to GDI.
Either way, you may want to consider using an HttpHandler that you map to using your web.config to generate the PDF file. This will allow for you to render the PDF to a bytestream and then dump it directly to the user (as opposed to having to save each PDF receipt to the web server). It will also allow for you to use the .pdf extension in the page that returns the receipt (PurchaseReceipt.pdf could be mapped to a HttpHandler)... making it more cross-browser friendly. Older versions of Adobe / Browsers will not display correctly if you start throwing a PDF byte stream from an ASPX page.
Hope this helps.