How to embed fonts into an EPS file -- and what is the exact definition of "embed"?

My company has created GraphPad Prism, a widely used program for scientists to analyze data and make technical graphs. Often scientists will export graphs from GraphPad Prism for submission to scientific journals. The format most journals want these days is EPS, and we export vector-based EPS files. But fonts are an issue.
We offer an option to "embed" fonts into the EPS file. What we mean by this is that text is converted to outlines or glyphs. These EPS files can be opened on other computers that lack the original fonts. But the journal production people can't edit the text, change font size, etc. when they work on these EPS files.
My programmers tell me that the term "embedding fonts" means exactly what we do -- convert to outlines/glyphs.
The people at a company that does page production for many scientific journals use a different definition of "embed". They want text to remain as text in the EPS file, but for the font definitions to be included in the EPS file. That way they don't need the original fonts, but can tweak spelling, font size, and even change fonts while preparing an EPS image for publication.
My programmers tell me that that second definition of "embed" is an Adobe-specific method not available to us.
So my question is this: Where can we find specifications or example code to let us embed fonts into an EPS file using the second definition (leave text as text and also include the TrueType font definitions)?

"My programmers tell me that the term "embedding fonts" means exactly what we do -- convert to outlines/glyphs."
Your programmers are, in my opinion, mistaken. In the print industry, embedding fonts means embedding the font data as a font, not as a series of vector linework. There are good reasons for keeping fonts as fonts: file size, rendering performance, character hinting, drop-out correction, etc.
"My programmers tell me that that second definition of "embed" is an Adobe-specific method not available to us. "
This is definitely not true and hasn't been since about 1990. The PostScript Language Reference Manual describes in some detail how to create fonts of many kinds, and the 'black and white book' (I can't remember the name offhand) describes how to create quality Type 1 fonts. The various tech notes from Adobe describe how to create Type 2 (CFF) fonts and CIDFonts with outlines of any of the preceding types.
You can also use 'type 42' fonts, which are essentially TrueType outlines. These are not quite the same as TrueType fonts, but they are very similar (the actual glyph descriptions are the same). It seems to me that this is what you want.
You can get an example of TrueType font inclusion by printing a document which uses TrueType fonts to a PostScript printer on FILE: under Windows, but you may find the code hard to follow.
Type 42 fonts are described on p346 of the 3rd edition PLRM, "Section 5.8.2 Type 42 fonts (TrueType)".
More detail is provided in Adobe Tech Note #5012, "The Type 42 Font Format Specification". This document (and many others) is available in PDF format from the Adobe web site.
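To make the second definition of "embed" concrete, here is a minimal Python sketch that writes an EPS file in which the text stays as text (a PostScript string drawn with show) and the font program, if supplied, travels inside the file. The function names, the bounding box and the placeholder font program are assumptions for illustration; producing the actual Type 42 font dictionary from a TrueType file is described in Tech Note #5012 and is not attempted here.

```python
def ps_escape(text):
    """Escape backslashes and parentheses for a PostScript string literal."""
    return text.replace("\\", "\\\\").replace("(", "\\(").replace(")", "\\)")


def build_eps(text, font_name="Helvetica", size=12, font_program=None):
    """Return EPS source that draws `text` as live, editable text.

    `font_program` is assumed to be ready-made PostScript that defines
    `font_name` (for example a Type 42 font); it is copied in verbatim.
    """
    lines = [
        "%!PS-Adobe-3.0 EPSF-3.0",
        "%%BoundingBox: 0 0 200 50",  # illustrative size only
        "%%EndComments",
    ]
    if font_program is not None:
        # The embedded font definition goes here, wrapped in DSC comments.
        lines += [
            "%%BeginResource: font " + font_name,
            font_program,
            "%%EndResource",
        ]
    lines += [
        f"/{font_name} findfont {size} scalefont setfont",
        "10 20 moveto",
        f"({ps_escape(text)}) show",  # the text remains a string, not outlines
        "%%EOF",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(build_eps("Dose (mg/kg)"))
```

Because the label is still a string in the output, production staff can change its spelling or size; with outline conversion the string would be replaced by path operators.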

Related

pdflib - Is there a way to tell if a font supports Chinese

When I used Arial Black to write text, I found it didn't work in Chinese, so I wondered if there was a way to determine whether a font would support Chinese or not
A general note about fonts:
Not all fonts contain all glyphs, so you should expect that you cannot use a single font for the whole Unicode range. You should first find a font that meets your visual requirements and contains the glyphs you need. Many font families have several variants for the different Unicode ranges. It is best to contact your font vendor to get the right font file. Or you can use free but very complete font families, such as Noto.
Of course you can check the glyph availability in a font with PDFlib. This is demonstrated in the PDFlib cookbook topic glyph availability:
gid = (int) p.info_font(font, "glyphid", "unicode=" + t.character);
Based on this, you can determine if a font contains the glyphs you need. Please see also the PDFlib 9.3.0 Tutorial, chapter 6.6.2 "Font-specific Encoding, Unicode, and Glyph Name Queries" for details.
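The same kind of check can be sketched outside PDFlib. The Python example below is not the PDFlib API: it swaps in the third-party fontTools package to read the font's character map, and the helper names and the font path in the usage note are hypothetical.

```python
def missing_chars(cmap, text):
    """Return the characters of `text` with no entry in `cmap`.

    `cmap` maps Unicode code points to glyph names. The result keeps
    first-seen order and contains no duplicates.
    """
    missing = []
    for ch in text:
        if ord(ch) not in cmap and ch not in missing:
            missing.append(ch)
    return missing


def load_cmap(font_path):
    """Read the best Unicode cmap of a font file (needs `pip install fonttools`)."""
    from fontTools.ttLib import TTFont  # imported lazily; third-party package

    return TTFont(font_path).getBestCmap() or {}
```

Usage would look like missing_chars(load_cmap("ariblk.ttf"), "中文"), where the file name is only an example. An empty list means every character of the sample text has a glyph in that font.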
Also related to your question, you might look at the technique of fallback fonts for chaining multiple fonts to get a larger set of glyphs. Please see the PDFlib 9.3.0 Tutorial, chapter 6.4.6 "Fallback Fonts", as well as the included starter sample starter_fallback, which is also available in the PDFlib cookbook.
I have tried 'int error_font = p.load_font("Arial Black", "", "replacementchar=error");'
As soon as an exception occurs, you can no longer use the PDFlib object. Please see the PDFlib 9.3.0 Tutorial, chapter 3.1.1 "Exception Handling" for details on this important topic.
You can find the PDFlib 9.3.0 Tutorial in the "doc" directory of every PDFlib download package, as well as via a separate link on the PDFlib 9.3 download page.

Is It Safe To Use Unicode Literals in HTML?

I am making an application, and I want to add a "HOME" button.
After much struggling with various icon libraries, I stumbled upon this site,
http://graphemica.com/%F0%9F%8F%A0, with this
🏠
A unicode symbol, which is more akin to a letter than an image.
I pasted it into my HTML, and it just worked™.
All this seems a little too easy, though. Are unicode symbols widely supported? Is there some kind of problem with them that leads people to use icon libraries instead?
It depends on what you mean by "safe".
Users need to have a font that contains the glyph, so you must include a suitable font, and in several formats: there is not yet a single format recognized by all of the most-used web browsers.
Additionally, fonts with multiple colours are not fully supported by various systems, so you should think about what you expect users to do (click, select, copy, etc.).
Additionally, every font has its own design, so things can look different between fonts (and therefore between browsers and operating systems). We do not yet have a "Helvetica 'Home'" or a "Times New Roman 'Home'".
All these points could be solved by using a web font with monochrome glyphs (but it could be huge if it includes all Unicode code points plus the usual combinations).
It seems that various recent browsers crash if there are many different glyphs, but usually it should not be a problem.
I also recommend ARIA attributes so that your page can also be used with e.g. screen readers (and braille displays).
Note: on the plus side, the few people who use a text browser can see the HOME symbol better (which is not the case with an image), if anybody still cares about this use case.
Some things you want to make sure you’re doing:
Save your HTML file as UTF-8. In fact, save all text files as UTF-8 unless there’s some reason you can’t.
Put the line <meta charset="utf-8" /> near the top of your HTML file.
Make sure your server isn’t misconfigured to tell all browsers that webpages are in the wrong encoding.
If, somehow, it is and you can’t fix it, fall back on &entities;.
Specify a font stack for your emoji in CSS with a set of fonts that cover nearly every system, perhaps including Apple Color Emoji, Noto Color Emoji, Segoe UI Emoji and Twemoji.
If a free font such as Noto or Symbola contains the emoji you use, you can package it as a WOFF to be sure it will always display the way you want. (As of 2018, Tor browser does not show most emoji correctly by default, but mainstream browsers do.)
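The encoding points in this list can be checked with a few lines of Python. This is a small sketch using only the standard library; the helper name to_entities is made up for the example. It shows the house symbol's code point, its UTF-8 bytes (the same bytes that appear percent-encoded in the graphemica URL above), and the numeric character reference to fall back on when the file cannot be served as UTF-8.

```python
def to_entities(text):
    """Replace every non-ASCII character with a numeric character reference."""
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")


home = "\U0001F3E0"  # the house symbol from the question

if __name__ == "__main__":
    print(f"U+{ord(home):04X}")                   # U+1F3E0
    print(home.encode("utf-8").hex(" ").upper())  # F0 9F 8F A0
    print(to_entities(home + " HOME"))            # &#127968; HOME
```

The output of to_entities is plain ASCII, so it survives a misconfigured server charset unchanged.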
I think using Unicode is a good practice for development, because Unicode characters are essentially part of your operating system, so you don’t need any special library or plugin and you treat them like regular text.
The only problem is that the code can be difficult to read or understand. I think it is not easy to see that &#127968; (🏠) prints a home icon.
Even 8-bit PNGs are faster than font icons.
Image icons can be lightweight but still slow down your site with another HTTP request and time for the image to load. With images you don’t have flexibility over the color and scaling. SVG vector image alternatives are still not faster than plain-text (Unicode characters). Unicode doesn’t require additional HTTP requests and can be made to scale nicely.
If you are developing a website using only simple shapes, you can use unicode UTF-8 symbols as replacement for font icons.
I think :
Almost every developer uses libraries for icons because of code readability, ease of use, and the extra options they offer.
Safe or Not
I can not say whether it is safe or not.
Because Unicode contains such a large number of characters and incorporates the varied writing systems of the world, incorrect usage can expose programs or systems to possible security attacks. This is especially important as more and more products are internationalized. This document describes some of the security considerations that programmers, system analysts, standards developers, and users should take into account, and provides specific recommendations to reduce the risk of problems.
Read about UNICODE SECURITY CONSIDERATIONS
Here are a few precautions to take while doing that. I did some research and found this to be the most helpful for your question. Also, I don't know how you can do it, but credits go to Mr. GOY
Displaying unicode symbols in HTML

Find out what font is used to replace missing unicode characters

I have a JavaScript application that converts text into runes using the Unicode rune chart. The problem is that some fonts do not contain the rune symbols.
Mozilla Firefox simply finds a suitable font and uses it for the runes instead of Goudy Medieval and Times New Roman. Google Chrome is not capable of doing that and displays black boxes instead of runes.
So my question is:
How can I find out which web-safe fonts support these symbols?
Can I find out which font Firefox uses at any point of the document?
Re. 1.: Use a character map program. (On Linux, use e.g. gucharmap, where you can search for the character, and by right-clicking on it (and holding), you can see the font used. You can also switch to other (non-default) fonts in the program and see if the character is present in that font too.)
Re. 2.:
Highlight the text whose font-family you want to determine.
Right-click and select Inspect Element.
In the Developer window that should open, on the right hand side, there should be a Rules column with a bunch of CSS rules. You'll usually have to scroll to the bottom there, and somewhere in there you should find the applicable font-family (inherited or custom-specified) list of which fonts to preferably use.
Compare this list (which may be a single item) with the list of fonts installed on your system. The first matching font between that list and what you have installed would be what Firefox is using.
Use a font-manager program (on Linux this might literally be "font-manager") to get a list of fonts installed on your system.
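For the first question, the character-map check can also be scripted. The following Python sketch counts how many code points of the Unicode Runic block (U+16A0 to U+16FF) each installed font covers. It relies on the third-party fontTools package, and the function names and the font directory in the usage note are assumptions.

```python
RUNIC_BLOCK = range(0x16A0, 0x1700)  # Unicode block "Runic", U+16A0..U+16FF


def runic_count(codepoints):
    """Count how many of the given code points fall inside the Runic block."""
    return sum(1 for cp in codepoints if cp in RUNIC_BLOCK)


def fonts_with_runes(font_paths):
    """Map each font file that contains at least one rune to its rune count.

    Needs `pip install fonttools`; the import is lazy so the pure helper
    above works without it.
    """
    from fontTools.ttLib import TTFont

    result = {}
    for path in font_paths:
        cmap = TTFont(path).getBestCmap() or {}
        count = runic_count(cmap)  # iterating a dict yields its code points
        if count:
            result[path] = count
    return result
```

A call such as fonts_with_runes(glob.glob("/usr/share/fonts/**/*.ttf", recursive=True)) would list candidate fonts on a Linux system; the directory differs per platform.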
if rune is an open type font ligature, you can view the font's open type features a few ways....microsoft typography has a free tool u can download, install, then right click on an open type font, some new tabs will appear with the usual ones...one of them is properties. click on that guy, it'll have a list of all the features.
you can also use photoshop, i know if you select an open type font as the text in use, you can then view its open type features. here's some images of exactly how to do it...sorry for the quality, these are from CS4: http://dev.bowdenweb.com/css/fonts/accessing-open-type-features-in-photoshop.html
so that's how you can tell what features they offer. i'm not sure if runic is a feature itself, or just a design term....that said, quick google search and "Junicode" is a medieval font with the "Junicode is an advanced Unicode font for medieval scholars, including the full range of characters for languages written in the Latin script" http://www.filewatcher.com/d/FreeBSD/8-stable/sparc64/junicode-0.7.6.tbz.1331504.html
but maybe your heart is set on your font....i can't find a lot about the rune chart, sorry. you can search the entire open type font features list....i know microsoft typography has it listed, as does adobe...but neither are great for searches, and i'm also not a fan of their naming conventions, which confuse me even more.

'font-family: Symbol' and Windows-1252

I have a bunch of HTML documents that contain some simple text in Windows-1252 encoding, but throughout the text there are numerous appearances of span elements with font-family: Symbol.
For example:
<span style='font-family:Symbol'>Ñ</span>
Which appears as the greek delta - Δ in the browser.
Google told me that using the Symbol font might show different results on different systems, as it's not actually a well defined font.
Is this really true? Is it "unsafe" to use the Symbol font?
If so, is there any way to reliably convert (on my own system) such symbols in the Symbol font to their Windows-1252 counterparts?
It has always been unsafe to rely on a certain font being installed on all the computers/smartphones/gadgets that visit your site. There are some font embedding techniques that work reasonably well in some modern browsers, but you'd need to repackage the Symbol font, and I doubt the copyright owner allows you to do that.
Of course, most characters in the Symbol font are not in the Windows-1252 encoding but that should not be an issue. You can use the following map to obtain the appropriate HTML entities. However, you'll have to write a script or program using a programming language (HTML is just a markup language).
When using font-family, if none of the listed font faces is found on the client (that is, without web font embedding), the browser falls back to the client's default font, so your users see a different font from the one you intended.
You may want to use UTF-8 encoding and put the delta (Δ) sign directly in your HTML content, or use web font embedding to tell the browser "use the font I want from this".
The problem is that the greek letter you see is just the appearance, the actual letter is something completely different.
I can think of two ways to convert it:
1. Write a script (in your language of choice) that converts each letter to its Greek counterpart. (Ñ => Δ)
2. Take a screenshot of the document/page and use an OCR-program to convert it to Greek text.
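The first option could be sketched in Python as follows. The mapping table is deliberately partial and only lists a few Latin-to-Greek pairs of the Symbol font; it would have to be completed from a full Symbol-to-Unicode mapping before real use. The regular expression assumes the spans look exactly like the example in the question, and all names are made up for illustration.

```python
import re

# Partial table: character stored in the file -> character the Symbol font shows.
# Extend it from a complete Symbol-to-Unicode mapping before relying on it.
SYMBOL_TO_UNICODE = {"a": "α", "b": "β", "D": "Δ", "p": "π"}

# Assumes spans are written exactly as in the question's example.
SPAN_RE = re.compile(
    r"<span style='font-family:Symbol'>(.*?)</span>",
    re.IGNORECASE | re.DOTALL,
)


def symbol_to_unicode(text):
    """Translate Symbol-font characters; unknown characters pass through."""
    return "".join(SYMBOL_TO_UNICODE.get(ch, ch) for ch in text)


def convert_html(html):
    """Replace each Symbol span with the Unicode text it displays."""
    return SPAN_RE.sub(lambda m: symbol_to_unicode(m.group(1)), html)
```

The source files would be read with open(path, encoding="cp1252") and written back as UTF-8, since most of the resulting characters do not exist in Windows-1252.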

Recognizing superscript characters using OCR

I've started a simple project that takes an image containing text with superscripts and uses OCR (currently tesseract) to recognize the superscript characters as well as the normal ones.
For example, we have a chemical equation such as Cl², but when I use tesseract to recognize it, it gives me Cl2 (all in one line).
So, what is the solution for this problem? Is there any other OCR API that has the ability to read superscripts?
Very good question that touches more advanced features of any OCR system.
First of all, make sure you are NOT overlooking functionality that may already be there in your OCR system. Look at your result text not in plain TXT format, but in some kind of rich-text-capable viewer. TXT viewers, such as Notepad on Windows, often do not support superscript/subscript characters, so even if the OCR gave you the correct characters, your viewer could have converted them in order to display them. If you are accessing the text result programmatically, that is less of an issue, because you are supposed to get the proper superscript/subscript character value when accessing it directly. Just note that viewers must support it for you to actually see it. If you have eliminated this possible post-processing conversion and made sure that no superscript or subscript is returned from the OCR, then it probably does not support it.
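When the result is accessed programmatically, the check described above can be done with Python's standard unicodedata module. This is a small sketch with a made-up function name. It only detects dedicated Unicode superscript/subscript characters in plain text; an engine that reports superscripts as formatting in a rich text export would need a different check.

```python
import unicodedata


def script_chars(text):
    """Return (char, kind) pairs for Unicode super/subscript characters."""
    found = []
    for ch in text:
        # Compatibility decompositions are tagged, e.g. "<super> 0032" for ².
        decomp = unicodedata.decomposition(ch)
        if decomp.startswith("<super>"):
            found.append((ch, "super"))
        elif decomp.startswith("<sub>"):
            found.append((ch, "sub"))
    return found


if __name__ == "__main__":
    print(script_chars("Cl\u00b2"))  # OCR kept the superscript
    print(script_chars("Cl2"))       # OCR flattened it to a normal digit
```

An empty list for text that should contain superscripts means the information was lost before it reached your code.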
Just like in this text box, in your original question you tried to give us a superscript character example, but this text box did not accept it even though you could copy/paste it from elsewhere.
Many OCR engines will see a subscript as any other normal character, if they can see it at all. The OCR you use needs to have the technical capability to actually produce superscripts/subscripts. Many of them do, but, not surprisingly, they tend to be commercial OCR systems.
I made a small testcase before answering this letter. I generated an image with a few superscript/subscript examples for my testing (of course EMC2 was the first example that came to mind :) .
You can find my test image here:
www.ocr-it.com/documents/superscript_subscript_test_page.tif
I processed this image through the OCR-IT OCR Cloud 2.0 API using all default settings, but exporting to a rich text format, such as MS Word .DOC.
You can find the result here:
www.ocr-it.com/documents/superscript_subscript_test_page_result.doc
Also note: when you want to extract superscript/subscript characters, pay particular attention to your image quality, more than you would with typical text. Those characters are tiny, and you need sufficient detail and resolution to achieve decent OCR quality. Even images scanned at 300 dpi sometimes have issues with tiny characters because they have too few pixels. If you are considering mobile and digital cameras, that becomes even more important.
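The resolution point can be made concrete with a little arithmetic: one typographic point is 1/72 inch, so the em height of a character in pixels is point size times dpi divided by 72. The Python sketch below uses an assumed 10 pt body text with 6 pt superscripts, which are illustrative numbers and not taken from the test image.

```python
def em_height_px(point_size, dpi):
    """Em height in pixels of type set at `point_size` and scanned at `dpi`."""
    return point_size * dpi / 72  # 72 points per inch


if __name__ == "__main__":
    for dpi in (150, 300, 600):
        body = em_height_px(10, dpi)   # assumed body size
        script = em_height_px(6, dpi)  # assumed superscript size
        print(f"{dpi} dpi: body {body:.1f} px, superscript {script:.1f} px")
```

The actual glyphs are smaller than the em height, so a superscript has noticeably fewer pixels than these upper bounds suggest.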
DISCLOSURE: My specialty is implementing internal OCR solutions for companies of different sizes. My company is WiseTREND. Contact me directly if I can assist with anything further.