convert Miscellaneous Symbols in PDF file - html

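I am trying to create a PDF file with dompdf.
The document contains characters from the Miscellaneous Symbols block,
specifically ☐ (U+2610, entered in two different notations).
They don't show up in the generated PDF; a question mark (?)
replaces the character instead.
I have tried all of the solutions from these questions: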
dompdf special characters
Special characters with dompdf and php

Really this is more of a general "how do I get dompdf to support X character" question. The issue relates to the character set encoding (which relates to the font used).
First, you need to specify a character set encoding that supports the character you specified. In nearly all cases you should encode as UTF-8.
Second, you need a font that supports the particular glyphs you want, and that font must be loaded into dompdf. For the character you've specified you can use the DejaVu fonts (e.g. DejaVu Sans), which are bundled with dompdf starting with v0.6.0.
Try the following:
<!DOCTYPE html>
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8"/>
<style>
* { font-family: DejaVu Sans, sans-serif; }
</style>
</head>
<body>
<p>☐</p>
<p>☐</p>
</body>
</html>
And see here for information on how to install fonts: https://stackoverflow.com/a/24517882/264628

Related

Html file with International Phonetic Alphabet characters and light or extra-light font

I am trying to display International Phonetic Alphabet characters in a UTF-8 html file with a font-weight that is either light or extra-light.
I have minimised my problem as follows.
I looked for a font that can display International Phonetic Characters and has a light or extra-light font style.
My first search was on google fonts, where I found the 'Assistant' font.
When I test this font on Google Fonts with characters of the International Phonetic Alphabet (IPA), it seems to work fine with any font weight.
I prepared the following html file that utilises the font 'Assistant' and displays some International Phonetic Alphabet characters:
<!DOCTYPE html>
<html lang="en">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<title>hair</title>
<link href="https://fonts.googleapis.com/css?family=Assistant:300" rel="stylesheet">
<style>
.ipa {
font-family: 'Assistant';
font-weight: 300;
font-size: 34px;
}
</style>
</head>
<body>
<div class="ipa">hɛər</div>
</body>
</html>
When I view the above html file with Chrome, the "h" and "r" are as expected, but the "ɛ" and "ə" seem to come from another font with a different weight.
I have tried to modify the header by substituting:
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
with:
<meta charset="UTF-8">
<meta http-equiv="Content-Type" content="text/html">
but the result is the same.
My text editor (TextWrangler) has decoding for new files set to Unicode (UTF-8). I have also checked that the file is encoded in UTF-8 as follows:
$ file --mime hair.html
hair.html: text/html; charset=utf-8
Any ideas on how to display International Phonetic Alphabet characters with a light or extra-light font style?
Update 1
I have substituted
<link href="https://fonts.googleapis.com/css?family=Assistant:300" rel="stylesheet">
with:
<link href="https://fonts.googleapis.com/css?family=Assistant:300&subset=all" rel="stylesheet">
Safari 9.1 displays the text correctly.
However, Chrome 52 and Firefox 48 still display the text incorrectly.
At https://developers.google.com/fonts/docs/getting_started#specifying_script_subsets it is mentioned that:
Please note that if a client browser supports unicode-range
(http://caniuse.com/#feat=font-unicode-range) the subset parameter is
ignored; the browser will select from the subsets supported by the
font to get what it needs to render the text.
At http://caniuse.com/#feat=font-unicode-range it is reported that:
Chrome 52 and Firefox 48 support unicode-range and the subset parameter is ignored. (However the text is still incorrectly displayed in my tests).
Safari 9.1 supports partially and the subset parameter is not ignored (the text is correctly displayed in my tests).
Update 2
I have substituted
<link href="https://fonts.googleapis.com/css?family=Assistant:300" rel="stylesheet">
with
<link href="https://fonts.googleapis.com/css?family=Assistant:300&text=hɛər" rel="stylesheet">
The text is now displayed correctly in Chrome 52, Firefox 48 and Safari 9.1.
When I include more characters with the following line:
<link href="https://fonts.googleapis.com/css?family=Assistant:300&text=abcdefghijklmnopqrstuvwxyzɐɑɒɓɔɕɖɗɘəɚɛɜɝɞɟɠɡɢɣɤɥɦɧɨɩɪɫɬɭɮɯɰɱɲɳɴɵɶɷɸɹɺɻɼɽɾɿʀʁʂʃʄʅʆʇʈʉʊʋʌʍʎʏʐʑʒʓʔ" rel="stylesheet">
the text is still displayed correctly.
However when I try to include more of the ipa characters and extensions with the following line:
<link href="https://fonts.googleapis.com/css?family=Assistant:300&text=abcdefghijklmnopqrstuvwxyzɐɑɒɓɔɕɖɗɘəɚɛɜɝɞɟɠɡɢɣɤɥɦɧɨɩɪɫɬɭɮɯɰɱɲɳɴɵɶɷɸɹɺɻɼɽɾɿʀʁʂʃʄʅʆʇʈʉʊʋʌʍʎʏʐʑʒʓʔʕʖʗʘʙʚʛʜʝʞʟʠʡʢʣʤʥʦʧʨʩʪʫʬʭʮʯʰʱʲʳʴʵʶʷʸʹʺʻʼʽʾʿˀˁ˂˃˄˅ˆˇˈˉˊˋˌˍˎˏᴀᴁᴂᴃᴄᴅᴆᴇᴈᴉᴊᴋᴌᴍᴎᴏᴐᴑᴒᴓᴔᴕᴖᴗᴘᴙᴚᴛᴜᴝᴞᴟᴠᴡᴢᴣᴤᴥᴦᴧᴨᴩᴪᴫᴬᴭᴮᴯᴰᴱᴲᴳᴴᴵᴶᴷᴸᴹᴺᴻᴼᴽᴾᴿᵀᵁᵂᵃᵄᵅᵆᵇᵈᵉᵊᵋᵌᵍᵎᵏᵐᵑᵒᵓᵔᵕᵖᵗᵘᵙᵚᵛᵜᵝᵞᵟᵠᵡᵢᵣᵤᵥᵦᵧᵨᵩᵪᵫᵬᵭᵮᵯᵰᵱᵲᵳᵴᵵᵶᵷᵸᵹᵺᵻᵼᵽᵾᵿ" rel="stylesheet">
the text is no longer displayed correctly.
Conclusion
This can be used as a workaround, however it has its limitations and it is not very satisfactory. Hopefully google fonts will provide a better solution.
By default, Google Fonts only returns the subset of the font for Latin characters (the behaviour may vary between browsers: some will actually send a request for all characters in use on the page). See https://developers.google.com/fonts/docs/getting_started#specifying_script_subsets for details.
You can have it send the whole font by adding the subset=all parameter:
<link href="https://fonts.googleapis.com/css?family=Assistant:300&subset=all" rel="stylesheet">
However this is undocumented and even though it currently works, it may break in the future.
An alternative is to use the text parameter to provide a list of all the characters you need (correctly URL-encoded, of course). This can get quite verbose.
This has been discussed in this Google Fonts issue though it hasn't been resolved.
I have tried specifying the ipa subset, but no luck.
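For the text parameter approach mentioned above, the non-ASCII characters have to be percent-encoded as UTF-8 bytes. Here is a minimal Python sketch of building such a URL. The helper name google_fonts_css_url is made up for illustration, and the URL layout is simply copied from the link tags in the question, so treat it as an assumption rather than a documented API contract.

```python
from urllib.parse import quote, quote_plus


def google_fonts_css_url(family: str, weight: int, text: str) -> str:
    """Build a Google Fonts CSS URL that requests only the given characters.

    Hypothetical helper: the URL shape mirrors the <link> tags shown in the
    question and is not taken from any official client library.
    """
    # Drop repeated characters (keeping first-seen order) so the URL stays short.
    unique_chars = "".join(dict.fromkeys(text))
    # quote() percent-encodes the UTF-8 bytes of every non-ASCII character.
    encoded_text = quote(unique_chars, safe="")
    # quote_plus() turns spaces in family names into "+", e.g. "Exo 2" -> "Exo+2".
    return (
        "https://fonts.googleapis.com/css"
        f"?family={quote_plus(family)}:{weight}&text={encoded_text}"
    )


if __name__ == "__main__":
    print(google_fonts_css_url("Assistant", 300, "hɛər"))
```

For "hɛər" this yields text=h%C9%9B%C9%99r, since ɛ (U+025B) and ə (U+0259) encode to the UTF-8 bytes C9 9B and C9 99. The limitation described in the question still applies: a long character list can make the request fail or fall back.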

Special characters 'ű' and 'ő' display problems with embedded font family

I am currently building a multi lingual web site and using a custom font called 'Exo'. This font family contains special characters 'ű' and 'ő' from the Hungarian alphabet. I embedded the 'Exo' font to the website but unfortunately these characters don't display properly. The rest of the text is displayed with the embedded 'Exo' font but these characters are displayed with the default font family.
Here is the character encoding I use:
<!DOCTYPE html>
<html lang="hu">
<head>
<meta charset=utf-8>
What could possibly be the problem?
These characters display properly on http://www.fontsquirrel.com/fonts/exo?q%5Bterm%5D=exo&q%5Bsearch_check%5D=Y where I have downloaded the font family from.
Any help is much appreciated!
Your font is loaded correctly, since you see all the Latin glyphs. But the font you use does not contain the Hungarian subset, which means these special Hungarian glyphs simply don't exist in the font. The CSS font stack then displays the fallback font for the missing glyphs.
Short answer: make sure you use a version of the Exo font that contains all the subsets you need.
You can download a new (full glyph set) OTF from FontSquirrel and use the Webfont generator to create new webfonts like you did before. But now make sure you get the subset settings right...
FontSquirrel has a fine grained subset control. This means you only need to include the language support (subsets) you need. This way loading time will stay at a minimum.
Google commissioned the design and redesign of the Exo font family. They also have an excellent delivery network, so you might consider getting your font directly from the source: http://www.google.com/fonts#UsePlace:use/Collection:Exo+2
Again, you need to load the 'Latin Extended (latin-ext)' to get all needed glyphs. The gauge shows that loading this set of glyphs triples loading time.
Although there are other ways to specify a more restrictive subset, I would go for Latin Extended because then you keep the advantage of the CDN: Exo might already be in your user's browser cache.
I've used Google Fonts service and all seems to be okay.
https://jsfiddle.net/maxim_mazurok/wp1powya/
<!DOCTYPE html>
<html lang="hu">
<head>
<meta charset="UTF-8">
<link href='http://fonts.googleapis.com/css?family=Exo' rel='stylesheet' type='text/css'>
</head>
<body style="font-family: 'Exo', sans-serif; font-size:60px">
'ű' and 'ő'
</body>
</html>

How do I get this unicode character to appear on my website?

I would like this character to appear on my website:
http://www.fileformat.info/info/unicode/char/fdfd/index.htm
I put this tag on the top of my page
<meta http-equiv="Content-Type" content="text/html;charset=UTF-8">
And I tried the codes &#xFDFD; and &#65021;, but it just looks like a box on the site. How can I put this character on my website?
Thanks!
Edit: It shows up correctly on my phone, but not my Mac 10.6.8. How do I know which computers will support this character?
All the methods you have used are correct. Using the character ﷽ as such is correct too, provided that the HTML document is UTF-8 encoded and declared to be UTF-8 encoded. The &#xFDFD; and &#65021; notations work independently of character encodings (that’s one of the main reasons for using them).
However, it fails if none of the fonts in the user’s system contains a glyph for the character. Browsers then typically display a small rectangle to indicate this.
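To illustrate the numeric notations discussed above, here is a small Python sketch that derives the hexadecimal and decimal character references from a character's code point. The function name numeric_references is only illustrative. It shows why the references are encoding-independent (they are pure ASCII), but it does nothing about the font problem.

```python
import html


def numeric_references(char: str):
    """Return the (hexadecimal, decimal) HTML character references for one character."""
    code_point = ord(char)
    # Both forms consist of ASCII characters only, so they survive any file encoding.
    return f"&#x{code_point:X};", f"&#{code_point};"


if __name__ == "__main__":
    hex_ref, dec_ref = numeric_references("\uFDFD")
    print(hex_ref, dec_ref)  # &#xFDFD; &#65021;
    # html.unescape() resolves both references back to the same character.
    print(html.unescape(hex_ref) == html.unescape(dec_ref))  # True
```

Whether the browser can actually draw the resulting character still depends on a font with that glyph being available.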
The font information page for the character at Fileformat.info has incorrect or incomplete information. If you click on the Local Font List link there, you should see the real situation in your system. In my system, a fairly normal Windows 7 with a fairly large collection of free fonts added, none of the fonts listed on the file info page except GNU Unifont actually contains U+FDFD ARABIC LIGATURE BISMILLAH AR-RAHMAN AR-RAHEEM. Perhaps the font support page was created in a system with versions of Arial, Times New Roman, etc., that differ from the normal fonts shipped with e.g. Windows 7.
What’s worse, the GNU Unifont glyph is hardly suitable for any real use. Being a coarse bitmap, it is totally inadequate for rendering a calligraphic character with a large number of details.
Microsoft Uighur has an appropriate glyph, but this font is not free, and there does not seem to be any easily accessible information on terms for getting it legally for use as an embedded font (web font). There is also a font called Universalia, with information available in Russian only and of questionable legal status.
The following image shows examples of the character (in very large font size) in the fonts mentioned.
If you can see the character in iPhone, then obviously the iPhone has a font containing it, but most probably you cannot use that font as embedded.
Unfortunately, this appears to mean that you cannot use the character on a web page so that works for all users, or even the majority of users.
Update: The good news is that Google Fonts: Early Access contains a few fonts that contain the character and can be used as embedded fonts, either as hosted by Google or as hosted on your server. Beware that Early Access fonts are more or less experimental and that the shape of the character might not be suitable for your overall design and style.
In the following examples, I have included a short phrase in normal Arabic letters for comparison.
<style>
@import url(http://fonts.googleapis.com/earlyaccess/amiri.css);
@import url(http://fonts.googleapis.com/earlyaccess/notonaskharabic.css);
@import url(http://fonts.googleapis.com/earlyaccess/notonastaliqurdudraft.css);
</style>
<p><span style="font-family: Amiri">﷽ السلام عليكم</span> (Amiri)
<p><span style="font-family: Noto Naskh Arabic">﷽ السلام عليكم</span> (Noto Naskh Arabic)
<p><span style="font-family: Noto Nastaliq Urdu Draft">﷽ السلام عليكم</span> (Noto Nastaliq Urdu Draft)
Make sure that your code editor saves your file in utf-8 encoding.
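If you are unsure what your editor actually wrote to disk, one rough way to check is to try decoding the raw bytes as UTF-8. This is only a sketch: the function name is_valid_utf8 and the file name page.html are placeholders, and a successful decode only shows that the bytes are valid UTF-8, not that the editor intended that encoding.

```python
from pathlib import Path


def is_valid_utf8(data: bytes) -> bool:
    """Return True if the byte string decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


if __name__ == "__main__":
    # "page.html" is a placeholder; point this at your own file.
    path = Path("page.html")
    if path.exists():
        print(path, "is valid UTF-8:", is_valid_utf8(path.read_bytes()))
```

Pure ASCII files also pass this check, since ASCII is a subset of UTF-8.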
Is there any chance your font does not support the character? I tested with the code below and it works fine.
<!DOCTYPE html>
<html>
<head>
<meta charset="UTF-8">
</head>
<body>
<p style="font-family: arial, helvetica, sans-serif;">﷽</p>
</body>
</html>
EDIT: I loaded an Arial, Helvetica stack and it seems fine.
Also make sure your text editor is saving the file UTF-8 encoded (as mentioned above).

Using Unicode standard symbols in HTML/CSS documents. Do I use Javascript?

I'm an enormous newbie but I want to add a Unicode character to a document. I've tried reading all the Unicode character threads but I am finding it difficult to understand and I am getting a headache trying to soak in all the information.
<doctype html>
<html>
<head>
<meta name="description" content="description" />
<meta http-equiv="content-type" content="text/html;charset=UTF-8" />
<meta name="keywords" content="keywords" />
<title>title</title>
<link rel="stylesheet" href="stylesheet.css" type="text/css" />
</head>
<body>
<div id="idtag">
📗
</div>
</body>
</html>
#idtag {
font-family: Tahoma, Helvetica, "Arial Unicode MS", sans-serif;
}
I am trying to add U+1F4D7 or anything like a book but when I view the page I just see a hollowed out square. I don't think I have the symbol on my PC. I have used Google Fonts before and they just require Javascript? Would I do the same? How would I do this?
The problem with &#x1F4D7 is that it's outside the classic BMP range -- not sure whether there are any fonts at all that have these characters, but they are definitely not common.
Just because a character is defined by Unicode doesn't mean that every font has to have it. Even characters in BMP (the codepoints between 0 and 65535) are not available in all fonts.
There is no way to display a character that's not available in the font that's being used. Unicode test pages that show you what a character "should" look like will send you images :-)
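To make the BMP distinction above concrete, here is a short Python sketch that reports whether a character lies in the BMP and how many UTF-16 code units it needs. The helper name describe_code_point is made up for illustration. It says nothing about font coverage, which is the real obstacle here.

```python
def describe_code_point(char: str) -> dict:
    """Report the code point, BMP membership and UTF-16 length of one character."""
    code_point = ord(char)
    return {
        "code_point": f"U+{code_point:04X}",
        # The BMP covers code points 0 through 65535 (0xFFFF).
        "in_bmp": code_point <= 0xFFFF,
        # Characters outside the BMP need a surrogate pair, i.e. two 16-bit units.
        "utf16_units": len(char.encode("utf-16-be")) // 2,
    }


if __name__ == "__main__":
    print(describe_code_point("\U0001F4D7"))  # GREEN BOOK, outside the BMP
    print(describe_code_point("\u2610"))      # BALLOT BOX, inside the BMP
```

U+1F4D7 comes out as outside the BMP with two UTF-16 units, while U+2610 fits in one. Either way, the character only renders if some available font has a glyph for it.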
The "GREEN BOOK" character you are trying to use seems to be one of the emoji new in Unicode 6.
It is therefore not included in the font-families you are using for display, not even in the very large Arial Unicode MS.
On newer versions of OSX and iOS it is supported through the Apple Color Emoji font. Otherwise I don't know of any fonts that support this character. I guess there are some in Japan but almost certainly none on Google Webfonts.
Google Webfonts therefore won't help you in this case.
But if it's just about showing a book icon, I'd use an image. E.g. from the noun project.
If you need it as a font, you could use Fontello or Pictos Server to generate a font from a selected icon.
For more about using fonts for icons, there was a nice article on CSS-Tricks a few months ago.
Your document is correct in principle, except for the typo on the first line (should be <!doctype html>, with the exclamation mark), and the character is rendered OK on my Firefox. But this happens because Firefox is clever enough to scan the fonts installed on my computer to find a glyph for the character, and I happen to have installed the Symbola font that contains it. IE 9 is not that clever, but even it shows the character if I add Symbola to the font-family list.
So this is a font problem, and a big one. You cannot expect Symbola to be installed (probably less than 1 user out of 10,000 has installed it, regrettably), and it appears to be about the only published free font containing U+1F4D7. The page you refer to has a link to a font support page that says this. It might not be completely up-to-date, but it gives a good picture of font support.
You might consider using a web font for the purpose, but this sounds somewhat disproportionate for displaying just one character, and it seems that the Font Squirrel @font-face generator fails to work with Symbola, for some reason.
See also: Guide to using special characters in HTML.
P.S. The character would appear black and white, though you could use CSS to set a color on it. Cf. the SO question Color in the Unicode standard?

How to make browsers load Japanese fonts for CJK text, instead of Chinese fonts

I have an XHTML1.1 document with a mix of English and Japanese text, with charset indicators lang="jp" and xml:lang="jp" in the opening tag for the <html> element. The actual content is encoded in UTF-8, and this is stated in the content-type as well:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN"
"http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="jp" lang="jp">
<head>
<title>Test page</title>
<meta http-equiv="Content-Type"
content="application/xhtml+xml; charset=utf-8"/>
</head>
<body><div>今</div><div>込</div></body></html>
The XML/HTML specs say that the "lang" attribute is inherited, so the content should end up being rendered with a font that supports Japanese, but instead I'm seeing it use fonts that are intended for Chinese. (Japanese "kanji" are actually subtly different in many cases from the equivalent Chinese "Hanzi", and wildly different for a few common characters.)
For instance, in the above code the top part of the first character should be ˄ with a - under it. If a Chinese font is used instead, this character will invariably instead look like a ˄ with ` underneath. Also, the second character should have a shape that looks like 7\, but when a Chinese font is used it will more often look like a lambda, λ. Neither of these are correct print/screen forms in Japanese.
The question: is there a way to force browsers to pick Japanese fonts for CJK text without writing a CSS rule that just contains a hundred and one font names in the hopes that at least one of them will match what the user has installed?
(Since minimal CJK fonts are along the lines of >4MB, with complete ones more around 15~20MB, relying on an @font-face declaration to ensure the right font gets loaded would be slow.)
I'd like a solution that works in all major browsers.
At least in the modern browsers (in 2022, over a decade after you asked), the lang attribute will work the way you wanted it to, but the lang code you need is ja, not jp as you used in the question. (jp is the ISO 3166 code that identifies Japan, the country, while ja is the BCP 47 code that identifies Japanese, the language.)
We can demonstrate the behaviour in a modern browser with this simple test document:
<!DOCTYPE html>
<html xml:lang="ja" lang="ja">
<title>Test page</title>
<div>今</div>
<div>込</div>
</html>
As shown in the screenshot below (taken in Chrome in Ubuntu), the Japanese forms (kanji) of 今 and 込 get used when we open the document above:
... and if we change ja to zh in our HTML document, the Chinese forms (Hanzi) get used instead:
It similarly works if we use the XHTML from the question (after changing jp to ja), or if we use Firefox, or if we use a Windows or macOS browser.
Note that depending upon the browser and the OS, the use of Japanese glyphs may be implemented by way of language-specific glyphs within a single font (as is the case with the Noto Sans CJK JP font used by default for CJK characters in Chrome on Ubuntu) or by selecting a different default font. For instance, Chrome for Windows will use Microsoft YaHei for lang="zh" text but Yu Gothic for lang="ja" text.
Well, if all else fails, I would explicitly specify common Japanese fonts in the CSS. Look up which fonts are available on which platforms, and create a font stack.
Basically, just select fonts the old-fashioned way, and see if that fixes the problem.