Generate PDF in Go with a dynamic image - html

I am new to Go and trying to figure out how to handle images in templates.
My goal is to generate a barcode and insert it into a template I wrote.
The program already uses go-wkhtmltopdf to generate the PDF, but it does not handle images yet.
My main question is: what's the nicest way to do this?
Should I generate an image in a public directory and then reference it from the src attribute of an img tag?

You might get away with embedding the image data directly into your HTML pages, for example as a base64 data: URI in the src attribute.

Thank you @kostix, I'm pretty close. Now I'm stuck on another problem. I generated a barcode (Code 128) and converted it to base64. When I pass it to my template, the PNG is broken once I open the PDF. But if I take the content of base64BarcodeUrl and paste it directly as the src of my img tag, it works like a charm.
base64BarcodeUrl = "data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAASwAAABkEAAAAABJ+o5fAAABYUlEQVR4nOzSUYrCMBRA0ekw+99yhyLBUhIq2Pt3zo9V40si92/ffx6zba/XY+bxPGaPz8d317Xj/fksq9/P5p/Xr+Ze18/OtZoxO+/svqs51z1W97j7n+6eZ/vMzvvJfb/1+9woeBMWCWGREBYJYZEQFglhkRAWCWGREBYJYZEQFglhkRAWCWGREBYJYZEQFglhkRAWCWGREBYJYZEQFglhkRAWCWGREBYJYZEQFglhkRAWCWGREBYJYZEQFglhkRAWCWGREBYJYZEQFglhkRAWCWGREBYJYZEQFglhkRAWCWGREBYJYZEQFglhkRAWCWGREBYJYZEQFglhkRAWCWGREBYJYZEQFglhkRAWCWGREBYJYZEQFglhkRAWCWGREBYJYZEQFglhkRAWCWGREBYJYZEQFglhkRAWCWGREBYJYZEQFglhkRAWCWGREBYJYZEQFglhkRAWCWGR+A8AAP//9bRAyVZD8C0AAAAASUVORK5C"
Is there an issue I'm not aware of when "injecting" data into a template?

Related

PHP script to loop over a list of URLs and save them as PDFs?

I was wondering if it was possible and what exactly I would need to loop through a list of url links and save those pages as pdfs. I would like to create a script that could do this but not sure how realistic it is.
Example:
www.site1.com
- save pdf locally site1.pdf
www.site2.com
- save pdf locally site2.pdf
It is realistic, although it needs a bit of coding, as saving HTML pages as PDF files is not that simple.
There are libraries like FPDF or mPDF for PHP that can convert an HTML document into a valid PDF. They will not take a 'screenshot' of the page, but rather build the PDF from the HTML tags and the CSS. They even allow swapping the CSS file for a custom one.
(If you want to take a screenshot there is always PHP's imagegrabscreen(), but it only works on a Windows server.)
You will just have to fetch the HTML from your URL:
$html = file_get_contents('http://site1.com');
Convert it to a pdf with one of the mentioned libraries, and save it as a file:
$mpdf = new mPDF();
$mpdf->WriteHTML($html);
$mpdf->Output('site1.pdf', 'F'); // 'F' writes the PDF to a local file instead of sending it to the browser
See:
http://fpdf.org
http://mpdf1.com
(I personally prefer mPDF - it is based on FPDF and has a nice and easy API.)
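For comparison, the loop itself can be sketched in Go. This is not the mPDF approach above: it assumes the wkhtmltopdf command-line tool is installed and on PATH, and the helper pdfName and its naming rule (first host label plus ".pdf") are hypothetical. The sketch only prints the commands it would run.

```go
package main

import (
	"fmt"
	"net/url"
	"os/exec"
	"strings"
)

// pdfName derives a local file name such as "site1.pdf" from a URL.
// The naming rule is an assumption made for this example.
func pdfName(rawURL string) (string, error) {
	u, err := url.Parse(rawURL)
	if err != nil {
		return "", err
	}
	host := strings.TrimPrefix(u.Hostname(), "www.")
	// Keep only the first label: "site1.com" -> "site1".
	if i := strings.Index(host, "."); i >= 0 {
		host = host[:i]
	}
	if host == "" {
		return "", fmt.Errorf("no host in %q", rawURL)
	}
	return host + ".pdf", nil
}

func main() {
	urls := []string{"http://www.site1.com", "http://www.site2.com"}
	for _, raw := range urls {
		name, err := pdfName(raw)
		if err != nil {
			fmt.Println("skipping:", err)
			continue
		}
		// Assumption: wkhtmltopdf takes the page URL and the output file.
		cmd := exec.Command("wkhtmltopdf", raw, name)
		fmt.Println(strings.Join(cmd.Args, " "))
		// Call cmd.Run() here to actually render the page.
	}
}
```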

What is this image stored as?

I want to extract these telephone numbers from the website, either as an image or if possible as a string.
Here is an example from the website: Link
As you can see the telephone number is an image.
However, I can't seem to view the image when I open the image source:
<img src="http://www.callmyname.sg/search/display_phone_number/VUhkVE1WOW5BV1lFWWxSbVhUdFRObGMzQlRBRU9nPT0=">
But when put into html and viewed in a browser, you can see the image fine.
It's a solution to prevent people like you from scraping their website :)
The url http://www.callmyname.sg/search/display_phone_number/VUhkVE1WOW5BV1lFWWxSbVhUdFRObGMzQlRBRU9nPT0= leads to a script that generates the image - probably based on the argument.
VUhkVE1WOW5BV1lFWWxSbVhUdFRObGMzQlRBRU9nPT0=
Since it ends with an equals sign, I tried to decode it as base64:
UHdTMV9nAWYEYlRmXTtTNlc3BTAEOg==
Now it looks even more like base64, so I tried another round:
PwS1_gfbTf];S6W70:
So the result is clearly not plaintext (or it was not base64 to begin with). Plaintext would be ridiculous anyway, since it would let you extract the number this way. They either use some special cipher, or store the numbers in a database with this value as the identifier.
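The two decoding rounds can be reproduced with a short Go sketch. The helper name decodeTwice is made up for this example; the token is the one from the URL above.

```go
package main

import (
	"encoding/base64"
	"fmt"
)

// decodeTwice applies standard base64 decoding two times in a row.
// It returns the intermediate text and the final raw bytes.
func decodeTwice(token string) (string, []byte, error) {
	first, err := base64.StdEncoding.DecodeString(token)
	if err != nil {
		return "", nil, err
	}
	second, err := base64.StdEncoding.DecodeString(string(first))
	if err != nil {
		return string(first), nil, err
	}
	return string(first), second, nil
}

func main() {
	first, second, err := decodeTwice("VUhkVE1WOW5BV1lFWWxSbVhUdFRObGMzQlRBRU9nPT0=")
	if err != nil {
		panic(err)
	}
	fmt.Println(first)
	// %q prints the non-printable bytes as escape sequences.
	fmt.Printf("%q\n", second)
}
```

The second round yields 22 bytes that include control characters, which is why the result does not read as a phone number.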
I don't think you can steal the phone number easily, only using OCR perhaps.
When you visit the URL directly, you will get garbage, since they do not send a proper MIME header:
�PNG IHDR�,���tRNS���7X}4IDATx���_HZo�g�� E��p��l��EHTx!]�DtQ�M�.x3��.dx�*b]Dl"]�D���bQq.B����Z2$��:ȡ�wq��9�s���Cx>W�}���ٳ��ڶ����]���Ǐ�/_���ݿ���ahh���\q����������555�=���*�"�*�*�f�����}uu�e�d2���o����?00p����J%ȴds���BB�˲�`�`0RJy����n�{cc�e�H$b�ۻ����(�~�_����A4�Z��_�V|��J�w�����t:��333.��ƕ������+^����L`���֑��W��3�X�" y���$p'U"��F���y���z&�ioo��萟�*� ����\�L&Sx����p�e���ׯ_R��y�J%�~����|qq��|e�Z%:�J�{��q��nW�ՉD"�J��~�n4��������̔Ty���qF���>BwGa�z����������8��ߡc�f��B�>!�Ub�N�s���|�F�^/B���Lj��i��NfJ��͛D"����� o!t��`����fvv�eم��V���D)�����x���d2966&�n� ^,0O4��(!D��l�h46�-�~��Tً>B�"�Q�>,�P��ok#U \�BU,�P���=G SA+GIEND�B`�
but it's really just an ordinary PNG image:
img http://www.callmyname.sg/search/display_phone_number/VUhkVU5scGlBV1lDWWdFelVEUUhZQWRvQlRZR013PT0=
It's a PNG image, but the server doesn't send the right Content-Type header. It tells your browser that it's an HTML page in UTF-8 encoding, so you just see some garbage (including the letters PNG at the start).
The <img> tag though doesn't know how to display text so it just tries to load it as an image (and with success).
I don't see a way to extract the numbers other than reading the image. Because it contains only digits and will have a similar format every time, maybe you can find a simple way to parse it instead of using a full-fledged OCR library.
It's actually a PNG file, generated by a script before being displayed. You can reference it fine from any other page though, and you should also be able to download it easily (right click, save as ...). Note: I tested this; make sure you save the image with the extension .png and not .html, which it will default to.
<img src="http://www.callmyname.sg/search/display_phone_number/QkNOVE1RODNBV1lDWWdVM1V6ZFZNZ1JyRFQ0Rk1BPT0=">

How to print or view HTML from a TDBGrid?

I was, until now, unable to find or create a good component to print the contents of a TDBGrid, so I wrote a couple of for ... do loops, saved the result to a text file, and opened it right afterwards in Notepad so the user could print or save it from there. Pretty ugly, right?
Now it just came to me that I could use those loops to create HTML code instead, which is more presentable. But how can I use, for example a TWebBrowser or something else to show that result instead of the TDBGrid approach?
And how can I print this HTML (with or without the TWebBrowser, as for example if I still use the TDBGrid to show the report and the HTML approach just if the user wants to print it)?
You can use either
TWebBrowser printing abilities,
Or a pure VCL component like THtmlViewer.
I like THtmlViewer very much, since it doesn't depend on the IE installation, is pretty fast, and has good printing abilities. You can even export to PDF if needed, e.g. using the open-source SynPdf unit.

Pulling out some text from a giant HTML file using Nokogiri/xpath

I am scraping a website and am trying to pull out certain elements from the HTML. In the sites I am scraping, there are script tags with a bunch of info in them; however, there is only one part inside these tags that I am interested in. The line basically looks like:
'image':'http://ut5.example.com/t/231/3_b_643435.jpg',
With some stuff above and below it. Now, this is different for each page source except for obviously the domain and some of the subfolders that store the images.
How would I go about looking through the source for this specific line, and cutting out just the URL? I feel I would need regular expressions, as the URLs are dynamic.
The "gsub" method does something similar to what I want, with its ability to use a /regex/. But I don't want to replace anything; I just want to find that URL in the source code using a /regex/ and copy it.
According to your comments, I guess this is what you're looking for:
var regex = /http.+/;
Example http://jsfiddle.net/Km9ZB/
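A tighter pattern can anchor on the 'image' key and capture only the quoted value, so the trailing quote and comma are not included. The sketch below shows this in Go rather than the Ruby asked about or the JavaScript above; the surrounding script content and the helper name imageURL are made up for illustration.

```go
package main

import (
	"fmt"
	"regexp"
)

// imageRe captures the single-quoted value that follows the 'image' key.
var imageRe = regexp.MustCompile(`'image'\s*:\s*'([^']+)'`)

// imageURL returns the first image URL found in source, if any.
func imageURL(source string) (string, bool) {
	m := imageRe.FindStringSubmatch(source)
	if m == nil {
		return "", false
	}
	return m[1], true
}

func main() {
	// Hypothetical page source; only the 'image' line comes from the question.
	source := `<script>var item = {
'id':'42',
'image':'http://ut5.example.com/t/231/3_b_643435.jpg',
'name':'x'};</script>`

	if u, ok := imageURL(source); ok {
		fmt.Println(u)
	} else {
		fmt.Println("no image URL found")
	}
}
```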

Localizing a Google Chrome Web App

I'm trying to add localization support to a Google Chrome Web App and, while it is easy to define strings for manifest and CSS files, it is somewhat more difficult for HTML pages.
In the manifest and in CSS files I can simply define localization strings like so:
__MSG_name__
but this doesn't work with HTML pages.
I can make a JavaScript function to fire onload that does the job like so:
document.title = chrome.i18n.getMessage("name");
document.querySelector("span.name").innerHTML = chrome.i18n.getMessage("name");
but this seems awfully inefficient. Furthermore, I would like to be able to specify the page metadata (application-name and description), pulling the values from the localization files. What would be the best way of doing all this?
Thanks for your help.
Please refer to this documentation:
http://code.google.com/chrome/extensions/i18n.html
If you want to add localized content within HTML, you would need to do it via JavaScript as you mentioned before. That is the only way you can do it.
chrome.i18n.getMessage("name")
It isn't inefficient to do that, you can place your JavaScript at the end of the document (right before the end body tag) and it will fill up the text with respect to the locale.
I don't know if I understand exactly what you are trying to do, but you could dynamically retrieve the lang attribute of the targeted tag (using .getAttribute("lang") or .lang) and serve the proper values accordingly.