layered pdf page to multiple images - html

I would like to convert a single-page PDF into multiple layered images.
For example, the library pdf2htmlEX converts a PDF to HTML by making a single background PNG image and placing text on top of that image (the text is in Unicode).
I would like to do something similar: separate the background image from the other portions, but with everything in image format.
I have looked into the Poppler libraries and some other solutions, but couldn't find anything useful.
Here is a URL where someone else wants the same thing done:
How do extract text layer and background layer from pdf?

Related

What's the best way to have text inside SVG code?

I have a 4-page leaflet (in perspective view) designed in Illustrator and exported as SVG. There is some text on each page of the leaflet, and it is too heavy for web rendering because the text is converted to paths. So I decided to keep the base leaflet as SVG and find a way to simulate perspective in the accompanying text so that it matches the leaflet's perspective. I found scripts that produce the perspective effect, but as soon as I place the perspective text inline in the SVG code, it behaves very differently. I've made screenshots of the end result I want to achieve so you can see what I mean.
desired end result image
I use the foreignObject tag to embed HTML inside the SVG.
code I wrote and what's happening

Why are images on GitHub in a Markdown-formatted file blurred?

I have uploaded images (JPG and PNG) to GitHub and used them in a Markdown-formatted file here (first 2 images):
https://github.com/vasili111/testRepo/blob/master/github_question.md
The third and fourth images are inserted with an HTML tag.
At the link above, the images are blurred in the browser. But they are not blurred when I open the images directly in the browser, as here:
https://raw.githubusercontent.com/vasili111/testRepo/master/images_for_github/3.png
https://raw.githubusercontent.com/vasili111/testRepo/master/images_for_github/3.jpg
Questions:
Why are the images getting blurred?
How can I use images in Markdown-formatted text without the blur effect?
It's because they are rendered at a different size. The Markdown-formatted ones are scaled to fit the available width; in your case, they are a bit too wide and have to be narrowed down by a few pixels, which causes the blurring effect.
There's probably nothing you can do about it.

Display png images like charts in HTML / CSS

I have made a bunch of charts and tables which I have saved in png format for presentation as stimuli in a web-based experiment created with HTML / CSS / Javascript. How can I get them to look sharp when displayed?
Here's a sample of what they look like now when displayed in the experiment:
As you can see, the lines are jagged and sometimes even thin to vanishing, and the text has similar problems. I guess this is a consequence of the png images' "natural" sizes (about 3500x2500 pix) being larger than their display sizes (about 200px high), but I feel there should be some way to fix this at display time without manually resizing all the images.
Here's some history: these were all made in Excel, then copied to Powerpoint and thence saved as images. Originally I directly saved from Powerpoint, which defaulted to .jpg format and came out fuzzy. Then I tried saving to .emf and used IrfanView to resave as .png. The resulting pngs are extremely sharp when viewed in their natural (large) size through whatever image viewer, but when I embed them in html at a much smaller size, they look pretty bad as shown above.
Do you still have the Excel file? If so, you can:
Export your charts as PDF files in Excel;
Then import the PDFs into a vector program such as Inkscape;
Save as SVG and then reference the SVG files as you would with an image tag (you can also embed them directly).
When imported from PDF they will be vector graphics, so you can edit some points further if needed.
It's hard for a line on the screen to look good when it is less than 1 pixel thick.
So let's say you have elements 3 pixels thick in your image. After resizing from about 2500 pixels to 250 X 250 (a factor of 0.1), they would be 0.3 pixels thick -> not good.
That is what creates the undesirable effects you described at line edges and corners.
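The arithmetic above can be checked with a small sketch. The function names and the 2500 px source size are assumptions for illustration (0.3 px follows from a 3 px line scaled by 250/2500). The code only computes stroke widths; it does not resize any image.

```python
def scaled_thickness(thickness_px: float, natural_px: float, display_px: float) -> float:
    """Return how thick a stroke ends up after the image is scaled.

    thickness_px: stroke width in the original image
    natural_px:   original image size along one axis (assumed 2500 here)
    display_px:   size the image is displayed at along the same axis
    """
    if natural_px <= 0:
        raise ValueError("natural_px must be positive")
    # Multiply before dividing to keep the floating-point result clean.
    return thickness_px * display_px / natural_px


def min_source_thickness(natural_px: float, display_px: float, target_px: float = 1.0) -> float:
    """Stroke width needed in the source so it is still target_px wide on screen."""
    if display_px <= 0:
        raise ValueError("display_px must be positive")
    return target_px * natural_px / display_px


if __name__ == "__main__":
    print(scaled_thickness(3, 2500, 250))
    print(min_source_thickness(2500, 250))
```

Read the other way round, a line would need to be about 10 px thick in a 2500 px source to stay 1 px wide at 250 px, which is why thin chart lines tend to break up at small display sizes.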
To address this problem, I see three potential solutions:
Make another copy of the images at a lower resolution from the original source (such as a screenshot of the Excel charts, or any other feature that lets you get a low-resolution bitmap).
If you have the numerical data displayed on the charts and time to learn a cool technology, you can use a charting library. This way you would get a prettier rendering, because it would be vector drawings. Example: HighCharts
Last and far worse solution: work on the images with an image editor and the appropriate skills to increase the thickness of all sharp elements, like lines, dots, arrows, etc.

How do I convert a website entirely made from frames into HTML?

I am looking to convert a client's website into HTML. I'm relatively new, as my skills are more directed at the front end of websites (design), so I'm quite lost. The website is allegianceglobalinvestigations.com and if you scroll through it, each page has the same URL. How do I create an HTML file/template from this? I'm assuming that since there are 4 pages, I'll end up with 4 files? Do I need to use OCR for the text?
If you view the source it will show you the urls of the other frames. If you view just that url you can get the source for just that frame. You can use that source all together with some changes if you're trying to just "un-framify" the site. I think that was what you were asking.
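As a rough sketch of that first step, the frame URLs can also be pulled out of the frameset source with Python's standard library. The sample markup and the example.com base URL below are made up for illustration; the real frameset source will differ.

```python
from html.parser import HTMLParser
from typing import List
from urllib.parse import urljoin


class FrameFinder(HTMLParser):
    """Collect the src attribute of every <frame> and <iframe> tag."""

    def __init__(self) -> None:
        super().__init__()
        self.sources: List[str] = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag in ("frame", "iframe"):
            src = dict(attrs).get("src")
            if src:
                self.sources.append(src)


def frame_urls(frameset_html: str, base_url: str) -> List[str]:
    """Return absolute URLs of all frames referenced in frameset_html."""
    parser = FrameFinder()
    parser.feed(frameset_html)
    return [urljoin(base_url, src) for src in parser.sources]


# Hypothetical frameset source, standing in for the real page.
SAMPLE = (
    '<frameset rows="80,*">'
    '<frame src="menu.htm" name="menu">'
    '<frame src="content.htm" name="main">'
    "</frameset>"
)

if __name__ == "__main__":
    for url in frame_urls(SAMPLE, "http://example.com/"):
        print(url)
```

Each URL it returns can then be opened on its own to get the source of just that frame.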
There is very little text on there so the only OCR you will need is your eyes and a keyboard if you're trying to use real text on the site.
And yes, you will end up with 4 different files. One for each page.
Good luck with your project, the best way to learn is to dive right in!
This is a frame-based site with a top menu in one frame selecting between four pages in the other frame. The content of each subpage is encoded as a JPEG image in a table.
There are already files for each subpage: content.htm, sis.html, services.htm, and contact.htm. With this low amount of text, you may as well just type the text currently in the images into the body of these files instead of using OCR. Replace everything between <body> and </body> with the text, then use HTML to mark up the content to your liking.
To eliminate the frames, paste the content of the body element from the menu.htm file into the start of the body element of the four subpages.
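With only four subpages this is quick to do by hand, but the same paste step can be sketched in Python. The function names are made up for illustration, and the regular expression assumes simple, well-formed pages with a single body element; it is not a general HTML parser.

```python
import re

# Matches the opening <body ...> tag, its content, and the closing tag.
_BODY_RE = re.compile(r"(<body\b[^>]*>)(.*?)(</body\s*>)", re.IGNORECASE | re.DOTALL)


def body_content(html: str) -> str:
    """Return everything between <body> and </body>, stripped of outer whitespace."""
    match = _BODY_RE.search(html)
    if match is None:
        raise ValueError("no <body> element found")
    return match.group(2).strip()


def prepend_menu(menu_html: str, page_html: str) -> str:
    """Paste the menu's body content at the start of the page's body element."""
    menu = body_content(menu_html)
    match = _BODY_RE.search(page_html)
    if match is None:
        raise ValueError("no <body> element found")
    insert_at = match.end(1)  # position right after the opening <body> tag
    return page_html[:insert_at] + "\n" + menu + "\n" + page_html[insert_at:]


if __name__ == "__main__":
    # Hypothetical stand-ins for menu.htm and contact.htm.
    menu = "<html><body><a href='sis.html'>SIS</a></body></html>"
    page = "<html><body><p>Contact us</p></body></html>"
    print(prepend_menu(menu, page))
```

Links in the menu probably carry a target attribute pointing at the old content frame; those would still need to be removed by hand once the frames are gone.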

PDF to web page

I get a .pdf complete with images, fancy fonts, styles, gradients and what have you. Basically it's handed off to me with the message, "Make me a web page that looks exactly like this." I've tried a few PDF-to-HTML tools and they all look terrible. I figure I've only got 2 options and I hate them both:
Convert the PDF to one big image and use an image map to add the links.
Use the screen copy tool that comes with Acrobat Reader to chop the file up into its parts (buttons, logos, etc.).
She uses Quark to make this PDF. I've never used it, but I hear it is very popular. Are these really my only two options? Someone tell me I'm wrong, please.
Grab what text you can out of the PDF and clean it up. Pull the PDF into Photoshop and slice out the graphical elements you want to use. Rebuild the page using the images and put your text in HTML format.
Make a slice of the gradients and use them as background images with repeat.
Try to explain to your client why the fancy font is unsuitable for this medium.
Edit:
If it's just going to be a screen shot, you might as well just put the PDF up in the first place. At least people can zoom in.
Do not use one big image map. The more content you can convert from image to text, the better (more efficient) your HTML page will be.
Chop up the PDF into parts. Make the logos, etc. images, make text plain text, and make buttons button controls.
Exactly like what Diodeus said, except:
Find the fancy font and check to see how much it will cost to license or buy it. Build two bills and send them to your client, one with the fancy font and one with a standard font. Then see if she wants the fancy font. It will show that you take your job seriously and may get you less strict project conditions.
No, they are not:
Adobe's online PDF-to-HTML service
or
pdftohtml
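For the pdftohtml route, a minimal sketch of calling it from a script. This assumes the Poppler build of pdftohtml is installed and on the PATH; the function names and file names are hypothetical, and the two flags are the ones that tool documents for layout-preserving, single-file output.

```python
import shutil
import subprocess
from typing import List


def build_pdftohtml_command(pdf_path: str, output_base: str) -> List[str]:
    """Build the pdftohtml command line without running it."""
    # -c:        "complex" output that tries to preserve the page layout
    # -noframes: write a single HTML file instead of a frameset
    return ["pdftohtml", "-c", "-noframes", pdf_path, output_base]


def convert(pdf_path: str, output_base: str) -> None:
    """Run pdftohtml, failing early if the tool is not installed."""
    if shutil.which("pdftohtml") is None:
        raise FileNotFoundError("pdftohtml not found on PATH")
    subprocess.run(build_pdftohtml_command(pdf_path, output_base), check=True)


# Example call (hypothetical file names):
# convert("leaflet.pdf", "leaflet")
```

The result is unlikely to match the PDF exactly, so it is probably best treated as a starting point for the manual rebuild described in the other answers.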