What is the practical purpose of the bidirectional override "bdo"? - html

Before coming here, I tried myself by googling. After I read these two links
http://www.w3schools.com/tags/tag_bdo.asp
http://www.w3schools.com/tags/tryit.asp?filename=tryhtml_bdo
I still don't clearly understand what the practical purpose is.
Thanks in advance for those who shed some light on this.

Pretty straightforward. If you're writing a web page in a default language that is rendered left-to-right, such as English, and you want to include an island of text in another language that is rendered right-to-left, such as a quote in Hebrew, you can use this tag to override the base direction in which the text is written onto the page, in case the bidirectional algorithm is getting it wrong. You need to make sure that the font you're using supports the appropriate character set too, of course.
http://www.w3.org/TR/html40/struct/dirlang.html

I tried the code below, and noticed that it is apparently obsolete, for Hebrew at least:
<!DOCTYPE html>
<html>
<body>
<p>If your browser supports bi-directional override (bdo), the next line will be written from right to left (rtl):</p>
<p>חדשות, ידיעות מהארץ והעולם - עיתון הארץ</p>
<bdo dir="rtl">חדשות, ידיעות מהארץ והעולם - עיתון הארץ</bdo>
</body>
</html>
Both seemed to output the same line, which confused me but prompted a search that led me to the following article:
The bidirectional ordering of text in AbiWord is done automatically,
closely following the Unicode Bidirectional Algorithm (UBA; see the
Unicode Consortium website). The Unicode character set assigns each
character certain directional properties which are then used by the
UBA to order text. Thus, Hebrew or Arabic characters will
automatically be treated as right-to-left, and English characters as
left-to-right. There are some characters that are directionally
ambiguous, and how they are treated by the UBA depends on what
characters are found in their vicinity (this includes all white space
and punctuation characters).
http://fantasai.tripod.com/qref/HTML4/structure/bdo.html
Hope it helps

Related

Why do some strings contain "&nbsp;" and some " ", when my input is the same (" ")?

My problem occurs when I try to use some data/strings in a p-element.
I start off with data like this:
data: function() {
  return {
    reportText: {
      text1: "This is some subject text",
      text2: "This is the conclusion",
    }
  }
}
I use this data as follows in my (vue-)html:
<p> {{ reportText.text1 }} </p>
<p> {{ reportText.text2 }} </p>
In my browser, when I inspect my elements I get to see the following results:
<p>This is some subject text</p>
<p>This is the conclusion</p>
As you can see, there is suddenly a difference: one p element uses &nbsp; and the other uses regular spaces, even though I started off with both strings using only regular spaces. I know &nbsp; and a regular space technically represent the same thing, but the problem with the &nbsp; string is that it gets treated as one large word instead of multiple separate words. This screws up my layout, and I can't solve it with CSS properties (word-wrap etc.).
Other things I have tried:
Tried sanitizing the strings by using .replace("&nbsp;", " "), but that doesn't do anything. I assume this is because it basically is the same, so there is nothing to really replace. That is also why I have to use code formatting on Stack Overflow to make the distinction between &nbsp; and " " visible.
Logged the data from Vue to see if there is any noticeable difference, but I can't see any. If I log the data/reportText, I again only see strings with regular spaces.
So I have the following questions:
Why does this happen? I can't find any logical explanation for why it sometimes uses &nbsp; and sometimes uses regular spaces. It seems random, but I am sure I am missing something.
Is there anything else I could try to follow the path my string takes, so I can see where the transformation from " " to &nbsp; happens?
Per the comments, the solution devised ended up being a simple unicode character replacement targeting the \u00A0 unicode code point (i.e. replacing unicode non-breaking spaces with ordinary spaces):
str.replace(/\u00A0/g, ' ')
Explanation:
JavaScript typically allows the use of Unicode characters in two ways: you can input the rendered character directly, or you can use a Unicode code point (i.e. in the case of JavaScript, a hexadecimal code prefixed with \u like \u00A0). It has no concept of an HTML entity (i.e. a character sequence between a & and ; like &nbsp;).
The inspector tool for some browsers, however, utilizes the HTML concept of the HTML entity and will often display unicode characters using their corresponding HTML entities where applicable. If you check the same source code in Chrome's inspector vs. Firefox's inspector (as of writing this answer, anyway), you will see that Chrome uses HTML entities while Firefox uses the rendered character result. While it's a handy feature to be able to see non-printable unicode characters in the inspector, Chrome's use of HTML entities is only a convenience feature, not a reflection of the actual contents of your source code.
With that in mind, we can infer that your source code contains unicode characters in their fully rendered form. Regardless of the form of your unicode character, the fix is identical: you need to target these unicode space characters explicitly and replace them with ordinary spaces.
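To follow the string and see where the non-breaking spaces actually are, it can help to print code points instead of the rendered text. The following is only a minimal sketch in plain JavaScript; the helper names showCodePoints and normalizeSpaces are made up for illustration and are not part of Vue or any library:

```javascript
// Hypothetical helper: list each character's code point so that
// look-alike characters (U+0020 vs. U+00A0) become distinguishable.
function showCodePoints(str) {
  return Array.from(str, (ch) =>
    'U+' + ch.codePointAt(0).toString(16).toUpperCase().padStart(4, '0')
  ).join(' ');
}

// Hypothetical helper: replace every non-breaking space with an ordinary space.
function normalizeSpaces(str) {
  return str.replace(/\u00A0/g, ' ');
}

// Assumed sample input: two words joined by a non-breaking space.
const suspicious = 'This\u00A0is';
console.log(showCodePoints(suspicious));
// U+0054 U+0068 U+0069 U+0073 U+00A0 U+0069 U+0073
console.log(showCodePoints(normalizeSpaces(suspicious)));
// U+0054 U+0068 U+0069 U+0073 U+0020 U+0069 U+0073
```

Logging the code points at several places (for instance where the data is loaded and just before it is rendered) should show at which step U+00A0 first appears.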

What are the exponent characters (in non-formatted text)? How can I create these exponent characters?

I'm searching for a list of exponents like ¹²³ and so on, and the same for letters. Note that these remain superscripted even in plain text.
Does something like this exist? If not, how can I create those?
(I need them for a website-project)
Unicode versions of superscripted/subscripted characters exist for all ten digits but not for all letters. They remain superscripted/subscripted in a plain-text environment without the need of format tags such as <sup>/<sub>.
However (as of Unicode 14), not all letters have Unicode superscripts. Furthermore, the ones that exist are scattered across different Unicode ranges and are in fact used mainly for phonetic transcription. They are also used for compatibility purposes, especially where the text cannot carry markup superscripts and subscripts.
Exponent characters:
These are mostly used for mathematical and referencing usage.
- ⁰ [U+2070]
- ¹ [U+00B9, Latin-1 Supplement]
- ² [U+00B2, Latin-1 Supplement]
- ³ [U+00B3, Latin-1 Supplement]
- ⁴ [U+2074]
- ⁵ [U+2075]
- ⁶ [U+2076]
- ⁷ [U+2077]
- ⁸ [U+2078]
- ⁹ [U+2079]
- ⁺ [U+207A]
- ⁻ [U+207B]
- ⁼ [U+207C]
- ⁽ [U+207D]
- ⁾ [U+207E]
- ⁿ [U+207F]
- ⁱ [U+2071]
The "linear", "squared", and "cubed" superscripts (¹, ² and ³) are the most familiar and are found in Latin-1 Supplement. All the others are found in Superscripts and Subscripts. For the superscript digits outside Latin-1 Supplement (0 and 4 to 9), add the digit's value to 0x2070 to obtain the code point. See this Wikipedia article and the official Unicode codepage segment.
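The code point layout described above can be sketched in JavaScript as follows. The function name toSuperscriptDigits is made up for illustration, and the sketch only handles the ten digits, not letters or signs:

```javascript
// Superscript 1, 2 and 3 live in Latin-1 Supplement; the other digits
// sit at U+2070 plus the digit's value (U+2070, U+2074..U+2079).
const LATIN1_SUPERSCRIPTS = { 1: '\u00B9', 2: '\u00B2', 3: '\u00B3' };

// Hypothetical helper: convert the ASCII digits in a string to
// their Unicode superscript counterparts; other characters are kept.
function toSuperscriptDigits(text) {
  return text.replace(/[0-9]/g, (digit) => {
    const value = Number(digit);
    return LATIN1_SUPERSCRIPTS[value] || String.fromCodePoint(0x2070 + value);
  });
}

console.log('x' + toSuperscriptDigits('2'));   // x²
console.log('10' + toSuperscriptDigits('45')); // 10⁴⁵
```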
Interesting notes
There are also subtle differences between <sup> superscripts and Unicode superscripts: the Unicode ones are entirely separate code points, and some fonts include professionally designed glyphs for them, whereas <sup> superscripts may look thin.
Compare x² with x<sup>2</sup>, and similarly x⁺ with x<sup>+</sup> (the first of each pair uses Unicode, the second uses markup).
The best solution is to use markup, such as <sup>.
You can't create the characters, but you can format them as superscripts if you are generating HTML.
To find out which ones exist, you just have to use a Unicode character search resource and look for "superscript" to get a listing -
This query, for example:
https://www.fileformat.info/info/unicode/char/search.htm?q=superscript&preview=entity
As you can see, all digits are available (more than once, even), but very few letters.
However, if you intend to generate HTML output, the <sup> tag will work for any text you want, and give the necessary semantic meaning to the text - you can read about it and try it online here: https://developer.mozilla.org/en-US/docs/Web/HTML/Element/sup

What is content in CSS before or after?

.icon-a:before { content: '\e803'; }
.icon-b:before { content: '\e96f'; }
Okay, I know content can be used to render a URL or quotes, but what is happening in the above code?
I came across this code and it is confusing. I tried googling but couldn't find anything.
Any help would be appreciated.
Thanks.
Quoting papiro as suggested here
Put simply, they're Unicode references. The "\e601", for example, is the hex code 0xe601. If you go here: http://unicodelookup.com/#0xe601/1 you'll see that the entry for that character is totally blank. It's in a part of the Unicode character set reserved for "private" use. Meaning icon libraries and the like can place whatever they want in those spots and not have to worry about overriding common characters like those of any of the alphabets of the world or a Chinese character, for instance.
In your case, \e803 refers to the Unicode code point U+E803, which is also in the Private Use Area.
Hope this helps
It depends on the font you are currently using in the parent element. The value is a Unicode character code: the character's hexadecimal code is entered after the \. The glyph that is displayed for it comes from that font.
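To check whether such an escape points into the Private Use Area, the hexadecimal part can be parsed and compared against the range U+E000 to U+F8FF. This is only a sketch: both function names are made up for illustration, and it only handles a single bare hex escape like the ones in the question:

```javascript
// Hypothetical helper: turn a simple CSS escape such as "\e803"
// into its numeric code point.
function cssEscapeToCodePoint(escape) {
  const match = /^\\([0-9a-fA-F]{1,6})$/.exec(escape);
  if (!match) throw new Error('not a simple CSS hex escape: ' + escape);
  return parseInt(match[1], 16);
}

// The Private Use Area of the Basic Multilingual Plane spans U+E000..U+F8FF.
function isPrivateUse(codePoint) {
  return codePoint >= 0xE000 && codePoint <= 0xF8FF;
}

// The backslash is doubled here only because this is a JavaScript string literal.
const codePoint = cssEscapeToCodePoint('\\e803');
console.log(codePoint.toString(16), isPrivateUse(codePoint)); // e803 true
```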

HTML Special Characters for fraction with equal Numerator and Denominator

In my HTML page I have displayed fractions using HTML special characters. My idea is to display 1/3, 2/3 and 3/3.
I have used &frac13; for 1/3 and &frac23; for 2/3, and the special characters are displayed correctly. I took reference from this link: HTML Special Characters
But when I tried using &frac33; for 3/3, it did not work. It is displayed as is, not converted to a special character.
Could someone please tell me what the HTML special character for 3/3 is?
Thank You
<sup>3</sup>&frasl;<sub>3</sub>
Result: 3⁄3
Not all fractions have their own special character. For those fractions (like 3/3) which don't have slanted fraction characters, use the HTML entity &frasl;:
<sup>3</sup>&frasl;<sub>3</sub> = 3⁄3
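If the fractions are generated by script, the same fallback can be sketched in JavaScript: use a precomposed character where one exists and fall back to the fraction slash otherwise. The function name formatFraction is made up for illustration, and the table below is deliberately partial (Unicode defines more fraction characters than these five):

```javascript
// A few precomposed fraction characters (not a complete list).
const VULGAR_FRACTIONS = {
  '1/2': '\u00BD', // ½
  '1/4': '\u00BC', // ¼
  '3/4': '\u00BE', // ¾
  '1/3': '\u2153', // ⅓
  '2/3': '\u2154', // ⅔
};

// Hypothetical helper: return the precomposed character if the table
// has one, otherwise join the numbers with U+2044 FRACTION SLASH.
function formatFraction(numerator, denominator) {
  const key = numerator + '/' + denominator;
  return VULGAR_FRACTIONS[key] || numerator + '\u2044' + denominator;
}

console.log(formatFraction(1, 3)); // ⅓
console.log(formatFraction(3, 3)); // 3⁄3
```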
There is no named (or numeric) character reference for a character representing 3/3, since there simply is no such character.
In theory, the FRACTION SLASH U+2044 “⁄” character (representable as &frasl; in HTML, among other things) can be used between digits to suggest that rendering routines present the combination as a typographic fraction. In practice, only some typesetting programs can do this, and web browsers come nowhere near.
Trying to play with HTML markup and/or CSS to construct something that looks like a typographic fraction (comparable to ½ in appearance) tends to produce messy results, including uneven line spacing.
The practical option is to use just common notations like 2/2. But if you want something like a typographic fraction, you could use MathML with MathJax. More exactly, you would use the mfrac element in MathML with the attribute bevelled="true". Sample code:
<!doctype html>
<title>Fractions with MathJax and MathML</title>
<script src=
"http://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML">
</script>
Here we have the common fraction ½, then
a simulation with HTML and CSS:
<sup>1</sup>⁄<sub>2</sub>.
Note that this tends to create uneven line spacing.
There are some cures to that, but let us see how MathML works:
<math>
<mfrac bevelled="true">
<mn>1</mn>
<mn>2</mn>
</mfrac>
</math>.
Some text here to demonstrate that line spacing has not
been disturbed here.

Bi-directional Browser Title (Hebrew and English characters in Title)

I have an aspx page whose title has both Hebrew and English characters in it, and the order of the words gets messed up.
Is there a way to style the title so the words don't get messed up, or is it an OS problem?
This is what the title should say:
This is what the title actually looks like:
@jleedev already has what seems to be the right answer. Here is a bit of background information on it:
Creating HTML Pages in Arabic, Hebrew and Other Right-to-left Scripts
There are some situations where you may not be able to use the markup described in the previous section. In HTML these include the title element and any attribute value.
In these situations you can use invisible Unicode characters that produce the same results.
To replicate the effect of the markup described in the example above related to nested base directions, we can use pairs of characters to surround the embedded text. The first character is one of U+202B RIGHT-TO-LEFT EMBEDDING (RLE) or U+202A LEFT-TO-RIGHT EMBEDDING (LRE). This corresponds to the markup <span dir="rtl"> or <span dir="ltr">, respectively. The second character is U+202C POP DIRECTIONAL FORMATTING (PDF). This corresponds to the </span> in the markup. Below you can see how to apply this to the previous example.
<p>The title says "&#x202B;...&#x202C;" in Hebrew</p>
Try using the Unicode direction embedding codes:
The result looks like ‫הוספת קובץ JS עורך ישן‬ - Mozilla Firefox, and it's marked up as follows:
&#x202b; (right-to-left embedding)
הוספת קובץ
JS
עורך ישן
&#x202c; (pop directional formatting)
- Mozilla Firefox
(You ought to be able to write the title with <title dir="rtl">, but I couldn't get that to work.)
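If the title is set from script instead of markup, the same pair of characters can be added with a small helper. This is only a sketch: the function name embedRtl is made up for illustration, and the assignment to document.title is shown as a comment because it only works in a browser:

```javascript
const RLE = '\u202B'; // RIGHT-TO-LEFT EMBEDDING
const PDF = '\u202C'; // POP DIRECTIONAL FORMATTING

// Hypothetical helper: surround text with RLE ... PDF so that it is
// laid out as one right-to-left run, even inside left-to-right text.
function embedRtl(text) {
  return RLE + text + PDF;
}

const pageTitle = embedRtl('הוספת קובץ JS עורך ישן');
// In a browser you could then assign it: document.title = pageTitle;

// JSON.stringify shows the string as a quoted literal, which makes it
// easier to confirm the title is wrapped than printing it raw.
console.log(JSON.stringify(pageTitle));
```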
If you use UTF-8 you shouldn't have a problem.
Add the following to your <head> and make sure you are saving the file as UTF-8:
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />