Fixed-width font (usually Courier) in HTML - pre

I'm new to HTML (for front-end work).
I came across the topic of fixed-width fonts (usually Courier) on some web pages such as w3schools. After searching a lot,
I couldn't find a good explanation. Can anyone explain it with an example?

A "fixed width font" is a style of glyphs (a glyph is the human visible representation of a character as displayed on the computer's screen or as printed on paper) such that every character has the same horizontal length.
Fixed width fonts are needed for "ASCII art" displays such as this:
I want to highlight this word. So I use "^" characters on
the next line like this: ^^^^
If the width was not fixed, the "^" characters might not be aligned correctly.
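A minimal Python sketch of the alignment idea above, assuming the output is shown in a fixed-width font (for example inside an HTML pre element). The function name caret_marker is made up for this illustration. It builds the marker line purely by counting characters, which only lines up on screen when every character occupies the same width.

```python
def caret_marker(line: str, word: str) -> str:
    """Return a line of spaces and '^' that sits under `word` in `line`.

    Assumes a fixed-width font: one character occupies one column.
    """
    start = line.find(word)
    if start < 0:
        return ""  # word not present, nothing to mark
    return " " * start + "^" * len(word)


line = "I want to highlight this word."
print(line)
print(caret_marker(line, "word"))
```

In a proportional font the spaces are usually narrower than the letters above them, so the carets would drift to the left of the word.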

Related

Determining height and length of text in pixels given the font

This is mostly an HTML question, but I am interested in extracting information from the HTML using Python 3.
My question is:
Given a font-family, font-weight, and font-style, as well as the text itself and the text-size, how can I determine the height and width of the text?
Namely, I have the coordinates of the upper left corner of the text, and I would like to find its lower right corner. I am ready to manually input the sizes of single letters in my script, if that's what it takes (hopefully not!).
As a base example, I have a font
#f { font-family: sans-serif; font-weight: normal; font-style: normal; }
and a tag
<span id="f" style="font-size:20px;vertical-align:baseline;color:rgba(0,0,0,1);">Hello, world!</span>
I would like to calculate the width and height of the text, in pixels.
I am aware of related questions on the site, but I couldn't find any that would answer my specific question. Feel free to link such a question (if answered), if it exists.
I have a workaround solution to this (not implemented yet). Fonts are (often) loaded from a .ttf (TrueType) file. You can find such files in C:/Windows/Fonts (on Windows; not sure about other operating systems).
Using PIL/Pillow in Python, you can draw the bitmap of any string in a given font, see e.g. this answer. To be exact, you want to use PIL.ImageFont.ImageFont.getmask after initialising an instance of ImageFont with the appropriate .ttf file. Then you can just get the size of the mask and rescale to match the font size.

What character should I use to maintain height of an empty (zero width) string?

I have a string that can potentially be empty, and in that case, I want to substitute it with a special character to maintain the ordinary text height while having zero width. In TeX, this would be called \strut. What is the counterpart for that in HTML? I came up with two candidates: ⁠ and . Should I use one of these?
On modern browsers, any zero-width character will do the job, provided that the browser either knows that the character is zero-width or uses a font that contains an empty glyph for it. But some characters may have effects, depending on the context and on software used to process the HTML file.
U+2060 WORD JOINER has the effect of preventing line break.
U+FEFF ZERO WIDTH NO-BREAK SPACE has the same effect. It is formally deprecated for any use except as Byte Order Mark, but in reality it works more often than WORD JOINER (though there are exceptions).
U+200B ZERO WIDTH SPACE has the effect of allowing a line break even when it would otherwise not be permitted; it’s like SPACE, but with zero width.
Usually the worst-case scenario for characters like this is an old version of IE. Checking in IE 6 shows that U+FEFF and U+200B are OK, but U+2060 shows as a small rectangle (i.e., the browser tries to render the character but finds no glyph for it).
So I’d use &#xFEFF; or &#x200B; depending on whether I’d like to prevent or allow a line break at that point. If it does not matter, &#x200B; is more logical to use.
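As a rough illustration of the answer above, here is a Python sketch that substitutes an HTML numeric character reference for an empty string. The helper name strut_if_empty and the dictionary keys are assumptions made for this example; the code points themselves are the three discussed above.

```python
import unicodedata

# The three candidates discussed above, keyed by their effect on line breaking.
ZERO_WIDTH = {
    "prevent_break": "\u2060",      # WORD JOINER
    "prevent_break_old": "\ufeff",  # ZERO WIDTH NO-BREAK SPACE (also the BOM)
    "allow_break": "\u200b",        # ZERO WIDTH SPACE
}


def strut_if_empty(text: str, allow_break: bool = True) -> str:
    """Return `text`, or an HTML numeric character reference if it is empty."""
    if text:
        return text
    key = "allow_break" if allow_break else "prevent_break_old"
    return f"&#x{ord(ZERO_WIDTH[key]):04X};"


for ch in ZERO_WIDTH.values():
    print(f"U+{ord(ch):04X}", unicodedata.name(ch))
print(strut_if_empty(""))         # &#x200B;
print(strut_if_empty("", False))  # &#xFEFF;
```

Emitting a character reference rather than the raw character keeps the otherwise invisible strut visible in the HTML source.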
I would suggest  or if zero width is not essential or if it is essential you could try the Unicode character ⁠ which is a zero width non-breaking space.

display text as square symbols instead of letters

Is there a simple css way to display text with every letter replaced with a filled square?
My idea was to find a font-family that has squares for all letters, but I didn't find anything like that existing. Google is no friend as it gives hits of posted issues with boxes that appear when fonts fail in some way.
Letters should be displayed as squares, not replaced with squares. Also, I need to be able to control the square fill color with the usual html/css.
I'm fine to use font-face, but am trying to avoid the learning curve for creating my own font.
Update: here is an example:
div.innerHTML = "some arbitrary text".
Should be displayed like this:
"■■■■ ■■■■■■■■■ ■■■■".
@NoobEditor is right, though. Many online font editors are available (e.g. http://fontark.net/farkwp/); you can create such a font family in a few minutes and embed it in your app.
Get a square font, define it in your web page's style, and assign it to an element (a div should work), then put your text there. Voilà.

Why is a trailing punctuation mark rendered at the start with direction:rtl?

This is more a sort of curiosity. While working on a multilingual web application I noticed that certain characters like punctuation marks (!?.;,) at the end of a block element are rendered as if they were placed at the beginning instead when the writing direction is right-to-left (as it is the case for certain Asian languages I do not speak).
In other words, the string
Hello, World!
is rendered as
!Hello, World
when placed in a div block with direction: rtl
This becomes even more evident if the text is split in two parts and given different colors: a contiguous chunk of text at the end is rendered in two separated regions:
http://jsfiddle.net/22Qk9/
What's the point of this behavior? I guess this must be a peculiarity of (all?) right-to-left languages which is automatically handled by the browser, so I don't need to care about it, or should I?
If you want to fix this behavior, add the LRM character (&lrm;, U+200E LEFT-TO-RIGHT MARK) at the end. It's a non-printing character.
Source : http://dotancohen.com/howto/rtl_right_to_left.html
Example : http://jsfiddle.net/yobjj6ed/
The reason is that the exclamation mark “!” has the BiDi class ON (“Other Neutral”), which effectively means that it adapts to the directionality of the surrounding text. In the example, the block's direction is right-to-left, so the mark is placed to the left of the text before it. This is quite correct for languages written right to left: the terminating punctuation mark appears at the end, i.e. on the left.
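The bidi classes mentioned above can be inspected with Python's standard unicodedata module. This is only a small sketch; the sample characters and labels are chosen for illustration.

```python
import unicodedata

# Sample characters: two punctuation marks and three letters.
samples = {
    "!": "exclamation mark",
    "?": "question mark",
    "H": "Latin letter",
    "\u05d0": "Hebrew letter alef",
    "\u0627": "Arabic letter alef",
}

for ch, label in samples.items():
    # bidirectional() returns the Bidi_Class abbreviation, e.g. "ON", "L", "R", "AL"
    print(f"U+{ord(ch):04X} {label}: {unicodedata.bidirectional(ch)}")
```

Letters report a strong class (L, R, or AL), while "!" and "?" report ON, which is why their placement depends on the surrounding direction.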
Normally, you use the CSS code direction: rtl or, preferably, the HTML attribute dir=rtl for texts in a language that is written right to left, and only for them. For them, this behavior is a solution, not a problem.
If you instead use direction: rtl or dir=rtl just for special effects, like making table columns laid out right to left, then you need to consider the implications. For example, in the table case, you would need to set direction to ltr for each cell of the table (unless you want them to be rendered as primarily right to left text).
If you have, say, an English sentence quoted inside a block of Arabic text, then you need to set the directionality of an element containing the English text to ltr, e.g.
<blockquote dir=ltr>Hello, World!</blockquote>
A similar case (just with Arabic inside English text) is discussed as use case 6 in the W3C document What you need to know about the bidi algorithm and inline markup (which has a few oddities, though, like using cite markup for quoted text, against W3C recommendations).
The accepted answer https://stackoverflow.com/a/20799360/477420 works if you can control the markup/CSS around the value. If you have no control over the HTML, the following approach could work.
If you don't know whether the page will be rendered RTL or LTR, but some text is definitely LTR (i.e. English-only), you can wrap the value in LRE/PDF marks to signify that it is an LTR region. The text will then be rendered LTR irrespective of the page's direction.
This works when you have code that renders text without being able to change the markup that determines how it shows up on the page, e.g. when you render the value of a "song title" or "company name" field in some nested child component (or on the server side) with no control over the surrounding HTML elements.
One drawback of this and similar approaches that add marks to the text (like the LRM proposal in this question) is that copying such a value from the resulting HTML page will generally preserve the marks, even though they are invisible and zero-width. In most cases that is fine, but consider whether it is a problem for you.
Approximate sample code (some company names end in "Inc.", and the dot ends up at the beginning when rendered as-is on an RTL page):
// companyName = "Alphabet Inc." should keep its dot at the end, even on an RTL page
if (stringIsDefinitelyAscii(companyName))
{
    // Wrap in LRE (U+202A) ... PDF (U+202C) so the value is rendered left to right
    companyName = "\u202A" + companyName + "\u202C";
}
return companyName;
Details on LRE/PDF symbols can be found in https://unicode.org/reports/tr9/#Explicit_Directional_Embeddings:
LRE U+202A LEFT-TO-RIGHT EMBEDDING
Treat the following text as embedded left-to-right.
PDF U+202C POP DIRECTIONAL FORMATTING
End the scope of the last LRE, RLE, RLO, or LRO.
Some approaches to figure out if string has RTL characters can be found in How to detect whether a character belongs to a Right To Left language?, JavaScript: how to check if character is RTL?, How to detect if a string contains any Right-to-Left character?.
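One possible way to do that check, sketched in Python with the standard unicodedata module. Note that this tests bidi classes rather than "is ASCII" as in the sample code above, and the names has_rtl and wrap_ltr are invented for this example.

```python
import unicodedata

LRE = "\u202a"  # LEFT-TO-RIGHT EMBEDDING
PDF = "\u202c"  # POP DIRECTIONAL FORMATTING
RTL_CLASSES = {"R", "AL"}  # strong right-to-left bidi classes


def has_rtl(text: str) -> bool:
    """True if any character has a strong right-to-left bidi class."""
    return any(unicodedata.bidirectional(ch) in RTL_CLASSES for ch in text)


def wrap_ltr(text: str) -> str:
    """Wrap text in LRE...PDF when it contains no strong RTL characters."""
    return text if has_rtl(text) else LRE + text + PDF


print(repr(wrap_ltr("Alphabet Inc.")))
print(has_rtl("\u05e9\u05dc\u05d5\u05dd"))  # a Hebrew word
```

Text that does contain right-to-left characters is returned unchanged here; how to treat mixed-direction values is a separate decision this sketch does not make.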

How does Zalgo text work?

I've seen weirdly formatted text called Zalgo like below written on various forums. It's kind of annoying to look at, but it really bothers me because it undermines my notion of what a character is supposed to be. My understanding is that a character is supposed to move horizontally across a line and stay within a certain "container". Obviously the Zalgo text is moving vertically and doesn't seem to be restricted to any space.
Is this a bug/flaw/exploit/hack in Unicode? Are these individual characters with weird properties? "What" is happening here?
H̡̫̤̤̣͉̤ͭ̓̓̇͗̎̀ơ̯̗̱̘̮͒̄̀̈ͤ̀͡w͓̲͙͖̥͉̹͋ͬ̊ͦ̂̀̚ ͎͉͖̌ͯͅͅd̳̘̿̃̔̏ͣ͂̉̕ŏ̖̙͋ͤ̊͗̓͟͜e͈͕̯̮̙̣͓͌ͭ̍̐̃͒s͙͔̺͇̗̱̿̊̇͞ ̸̤͓̞̱̫ͩͩ͑̋̀ͮͥͦ̊Z̆̊͊҉҉̠̱̦̩͕ą̟̹͈̺̹̋̅ͯĺ̡̘̹̻̩̩͋͘g̪͚͗ͬ͒o̢̖͇̬͍͇͓̔͋͊̓ ̢͈͙͂ͣ̏̿͐͂ͯ͠t̛͓̖̻̲ͤ̈ͣ͝e͋̄ͬ̽͜҉͚̭͇ͅx͎̬̠͇̌ͤ̓̂̓͐͐́͋͡ț̗̹̝̄̌̀ͧͩ̕͢ ̮̗̩̳̱̾w͎̭̤͍͇̰̄͗ͭ̃͗ͮ̐o̢̯̻̰̼͕̾ͣͬ̽̔̍͟ͅr̢̪͙͍̠̀ͅǩ̵̶̗̮̮ͪ́?̙͉̥̬͙̟̮͕ͤ̌͗ͩ̕͡
The text uses combining characters, also known as combining marks. See section 2.11, Combining Characters, in the Unicode Standard (PDF).
In Unicode, character rendering does not use a simple character cell model where each glyph fits into a box of a given height. Combining marks may be rendered above, below, or inside a base character.
So you can easily construct a character sequence, consisting of a base character and “combining above” marks, of any length, to reach any desired visual height, assuming that the rendering software conforms to the Unicode rendering model. Such a sequence has no meaning of course, and even a monkey could produce it (e.g., given a keyboard with suitable driver).
And you can mix “combining above” and “combining below” marks.
The sample text in the question starts with:
LATIN CAPITAL LETTER H - H
COMBINING LATIN SMALL LETTER T - ͭ
COMBINING GREEK KORONIS - ̓
COMBINING COMMA ABOVE - ̓
COMBINING DOT ABOVE - ̇
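A listing like the one above can be produced with Python's unicodedata module. The sketch below spells out the five listed code points as escapes; the function name describe is just for this example.

```python
import unicodedata


def describe(text: str) -> list[str]:
    """One entry per code point: hex value, Unicode name, combining class."""
    return [
        f"U+{ord(ch):04X} {unicodedata.name(ch, '<unnamed>')} "
        f"(ccc={unicodedata.combining(ch)})"
        for ch in text
    ]


# The five code points listed above, written as escapes.
start = "H\u036d\u0343\u0313\u0307"
for entry in describe(start):
    print(entry)
```

A combining class of 0 marks a base character; the four marks here report class 230, meaning they attach above the base. U+0343 canonically decomposes to U+0313, which is why those two entries look identical when rendered.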
Zalgo text works because of combining characters. These are special characters that modify the character that comes before them. For example:
y + ̆ = y̆ which actually is
y + &#x0306; = y&#x0306;
Since you can stack them one atop the other you can produce the following:
y̆̆̆̆̆̆̆̆̆̆̆̆̆̆̆̆̆̆
which actually is:
y followed by one &#x0306; reference for each stacked mark
The same goes for putting stuff underneath:
y̰̰̰̰̰̰̰̰̰̰̰̰̰̰̰̆̆̆̆̆̆̆̆̆̆̆̆̆̆̆̆̆̆
that in fact is:
y followed by a run of &#x0306; references and a run of &#x0330; references
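The stacking can also be reproduced programmatically. The Python sketch below assumes U+0306 (COMBINING BREVE) and U+0330 (COMBINING TILDE BELOW) as example marks and uses arbitrary counts; stack and strip_marks are names invented here. It also shows the reverse operation, removing all combining marks to recover the base text.

```python
import unicodedata

BREVE_ABOVE = "\u0306"  # COMBINING BREVE, drawn above the base
TILDE_BELOW = "\u0330"  # COMBINING TILDE BELOW, drawn below the base


def stack(base: str, above: int, below: int) -> str:
    """Append `above` breves and `below` tildes to a base character."""
    return base + BREVE_ABOVE * above + TILDE_BELOW * below


def strip_marks(text: str) -> str:
    """Drop every combining mark (general category M*), keeping base characters."""
    return "".join(
        ch for ch in text if not unicodedata.category(ch).startswith("M")
    )


tall = stack("y", above=5, below=3)
print(len(tall))          # 9 code points, displayed as a single stacked glyph
print(strip_marks(tall))  # y
```

Filtering on the general category is a blunt tool: it also removes legitimate accents that are stored as separate combining marks, so it suits cleanup of decorative text better than general text processing.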
In Unicode, the main block of combining diacritics for European languages and the International Phonetic Alphabet is U+0300–U+036F.
More about it here
To produce a list of combining diacritical marks you can use the following script (since links keep on dying)
// Print every mark in U+0300–U+036F (768–879 inclusive) next to its HTML reference
for (let i = 768; i <= 879; i++) {
  console.log(String.fromCodePoint(i) + " &#" + i + ";");
}
Also check em out
Mͣͭͣ̾ Vͣͥͭ͛ͤͮͥͨͥͧ̾