Bidirectional text and numbers - html

I have a website that displays in two languages, English and Farsi. The title of a list item can mix both languages. As long as the title contains only text, it renders correctly with direction:rtl in CSS.
But the catch is that a title can also contain a number, in the middle or at the end (in Farsi, numbers are written and read left to right, the same as in English). No matter where I put the number, it messes up the word order in the title (the number is an ad ID at the end of the title).
To work around this I put &rlm; or &lrm; in front of the ID, but I have to switch between the two depending on which language is chosen.
My working HTML looks like this (&rlm; is what fixes the ID number issue in Farsi):
<h3>
The name of my خدمات باشد is long
<span style="color:#999;">&rlm;#89798798</span>
</h3>
JS FIDDLE: http://jsfiddle.net/WzF2D/
I tried setting direction:ltr on the span that wraps the ID, but it still doesn't work. I also tried unicode-bidi:embed on the h3, with no luck.
How can I solve this with CSS only, without having to rely on &rlm;?

I will assume that the desired rendering uses overall right-to-left writing, even though the text (at least in the example) is mainly English, with some words in Arabic letters inside the sentence. Moreover, I assume that expressions like “#89798798” are to be treated as separate fragments, so that when it appears after an English word, it is not considered as part of English text but set to the left of it, in RTL layout.
Under these (rather astonishing) premises, the CSS solution is to make such a fragment a bidirectionality isolate:
<span style="color:#999; unicode-bidi: isolate">#89798798</span>
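For context, here is a minimal sketch of how the isolate might sit in the question's markup. The class name ad-id is a made-up placeholder, and I am assuming the heading itself is set right-to-left as described in the question. The <bdi> element gives the same kind of isolation in plain HTML, in case a stylesheet change is not convenient.

```html
<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8">
<style>
  /* Overall right-to-left layout for the title, as in the question. */
  h3 { direction: rtl; }
  /* "ad-id" is a hypothetical class name; isolate keeps the ID out of the
     surrounding English run when the bidi algorithm orders the line. */
  .ad-id { color: #999; unicode-bidi: isolate; }
</style>
</head>
<body>
  <h3>
    The name of my خدمات باشد is long
    <span class="ad-id">#89798798</span>
  </h3>
  <!-- Same idea using the bdi element instead of CSS. -->
  <h3>
    The name of my خدمات باشد is long
    <bdi style="color:#999;">#89798798</bdi>
  </h3>
</body>
</html>
```

Older browsers may need a vendor-prefixed value for unicode-bidi, so it is worth checking this in the browsers you have to support.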

Related

Punctuation marks appear on the wrong side inside a right-to-left paragraph tag in HTML

I am trying to mix Hebrew and English within a single right-to-left paragraph tag, but punctuation marks are rendered on the opposite side.
For example, I wish to have the following rendered on the page:
But the double backslash, which should appear at the far left, gets switched with the single backslash at the far right, as you can see from running the following code snippet:
<p dir="rtl">אחת שתיים \\one\two\</p>
I have tried solving the problem using different methods (for example, the unicode-bidi CSS property), but none of my attempts have worked.
Note: Changing the original text inside the paragraph tag to include special unicode characters (such as rlm control characters) or dividing the text within the tag into multiple tags is not an option in my case (I am trying to solve this problem without changing the html structure).
Preferably, I would want to solve this problem using only html or css, but also javascript might be an option, if one can't do it the other ways.

HTML emsp entity exact behavior

I am having trouble finding an exact reference of the behavior of the HTML emsp entity. I have looked at W3.org, MDN, W3schools, and here, but I have not yet found a description that describes its breaking or wrapping behavior in HTML that does not have any special styling applied.
The code below shows an experiment I resorted to, but I am still a bit confused about when it will and will not wrap. Is there a good reference?
<!DOCTYPE html>
<html>
<style>
body {
font-size: 20px; font-family: Courier, fixed;
}
</style>
<body>
<p>Following is some text with some embedded emsp entities.<br>Here is one mid-word: sam ple. <br>And here is one on each side of a dash: lock - step.<br>Then, how about one after a period?<br>Right after the next period is one and then a normal space.  How about the standard space and then the emsp?<br>That sequence follows this sample sentence.  (Note that since the regular space came first, this can cause this text after it to become indented, whereas the emsp-then-regular space occurrence just before will never do that, I think.) As long as I'm looking at them after the end of a sentence, we should try putting just one emsp after a sentence instead of the regular space caracter.<br>I thought that would stick the two sentences together, but it does not do so here. Indeed, this is consistent with its behavior mid-word. Okay, how about multiple occurrences of it?<br>There are 3 in the brackets here: [   ] and so on. I played with that last one a bit and I cannot get it to break after the 3 emsps. Here, they seem to keep their width (they are not combined into one) and they are not breakable, not even either before the first or after the last one. So, I seem to only be able to get the "[" and never the "]" as the first character on a new line.<br>Okay, more extremely, trying brackets around 5 emsp chars, a word, and 5 more emsp chars: [     word     ] .<br>There we seem able to break before "word", but still never before "]". What's going on there?
</p>
<p>From the examples above, I think the mystery around emsp characters is mostly resolved for me.</p>
<p>Consider the standard behavior for a regular space character. Here, first remember that multiple occurrences of regular whitespace characters are all compressed as if they were a single regular space character. Then, of course, the regular space character takes a certain width, and lines are never broken just before the regular space, only after it. And the space normally allocated for a regular space character does not need to be rendered at the right edge of a box.</p>
<p>Similarly, text can break after an emsp character, but will not do so before. It is wider than a regular space character, but mostly behaves like it. Where it differs is if you have multiple emsp entities right next to each other. In that case, no break will occur before, within, or after the group (unless there is whitespace before or after it, in which case the whitespace is the location of the break). But if a set of multiple emsp characters are placed directly between two non-white characters (as in the bracket example above) then they are not compressed and no breaking occurs. That's all I'm thinking of at the moment ...</p>
</body>
</html>
emsp is a white-space character that is one em wide, nominally the width of the letter "M".
I found a brief description at http://opencoder.net/emsp.html
Whether it is a breaking or a non-breaking space, I would say the easiest way to find out is to test it.
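If you do want to test it, a small page like the following might help. It is only a sketch: the class name box and the 12em width are arbitrary choices of mine, and the box is resizable so you can drag it narrower and watch where the lines break. As far as I know, CSS white-space collapsing only applies to ordinary spaces, tabs and line breaks, so a run of em spaces should keep its full width, and the Unicode line breaking rules allow a break after an em space but not before it. Browsers may differ in the details.

```html
<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8">
<title>emsp wrapping test</title>
<style>
  /* A narrow, resizable box makes line breaks easy to provoke. */
  .box {
    width: 12em;
    border: 1px solid #999;
    margin-bottom: 1em;
    overflow: auto;
    resize: horizontal;
  }
</style>
</head>
<body>
  <!-- Three ordinary spaces: collapsed to one by default. -->
  <div class="box">regular spaces: [   ] end of line</div>
  <!-- Three em spaces: expected to keep their width. -->
  <div class="box">em spaces: [&emsp;&emsp;&emsp;] end of line</div>
  <!-- Three no-break spaces, for comparison. -->
  <div class="box">no-break spaces: [&nbsp;&nbsp;&nbsp;] end of line</div>
</body>
</html>
```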

Bidirectional (BiDi) text inside HTML textarea not respecting LRM control character

I'm having a hard time with making BiDi strings work inside an HTML textarea as I'd expect.
This test string contains both Arabic and English, plus sequences of pseudo-tags (<1/>, <2/>), which are composed of neutral-direction characters (<, >, /, numbers) and should inherit their direction from the strong-direction character before them.
Given that these pseudo-tags are positioned after both RTL and LTR text, I need to force the direction of the text by putting one LRM (U+200E, &lrm;) character before each pseudo-tag.
The result is not what I expected:
Note that the textarea has its direction set as follows: dir='rtl'
Tested with both Chrome and FF, none of them seems to work as expected. Am I missing something?
Results on Jsfiddle are even different: https://jsfiddle.net/o7d2ymdc/1/
Unfortunately, displaying these inside a textarea is going to be extremely difficult, if at all possible.
There are several issues at play here, among them the fact that brackets and parentheses are mirrored by the Unicode Bidirectional Algorithm: <span dir="ltr">&lt;</span> is rendered as '<', while <span dir="rtl">&lt;</span> is rendered as '>'. On top of that, "end of string" means different things in RTL and LTR strings.
Your best bet could be using ContentEditable. You can display editable rich text - that is actually html nodes - and essentially isolate your RTL pieces from the HTML markup properly with spans, as if you would have statically displayed it. However, if this textbox allows for custom user-generated text, you may need to come up with a good algorithm that wraps the bidirectional text automatically as the user types, which can be a pretty big challenge.
If this helps, you're not the only one to deal with this. If you edit HTML blocks in Arabic Wikipedia, for example, you will see the exact same problem (which makes editing HTML and wikitext a fairly big challenge)
This problem is also one of the reasons why people prefer a WYSIWYG editor - that has proper contextual and conceptual separation between the markup/style and the text itself.
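To illustrate the contenteditable idea, here is a rough sketch. The Arabic and English sample text is a placeholder of mine, not the string from the question, and wrapping each pseudo-tag in a <bdi dir="ltr"> element is just one possible way to isolate it. How the caret and typing behave inside and around such isolates varies between browsers, so treat this as a starting point only.

```html
<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8">
<style>
  /* Purely cosmetic: make the editable area look like a text box. */
  .editor { border: 1px solid #999; padding: 4px; min-height: 3em; }
</style>
</head>
<body>
  <!-- Editable rich text with an overall right-to-left direction. -->
  <div class="editor" contenteditable="true" dir="rtl">
    هذا نص عربي<bdi dir="ltr">&lt;1/&gt;</bdi>
    and English text<bdi dir="ltr">&lt;2/&gt;</bdi>
  </div>
</body>
</html>
```

Each pseudo-tag is laid out left-to-right on its own, so its brackets are not mirrored and it no longer depends on the text before it.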

Two spaces after every full stop in paragraph using CSS?

How do I put two spaces after every full stop in a paragraph using CSS?
Ah, the old "two-spaces-after-a-period" meme rears its ugly head again.
Two spaces after a period is something that pertains to the typewriter world, or the monospaced font world. We moved beyond it long ago, starting with TeX or even before. The point is not to have one or two space characters after a period, but to have a pleasing amount of space there. Algorithms like TeX go to great length to do so. The algorithms in modern web browsers are still primitive by comparison, but are starting to do better. Consider the following:
You'll see that the space after the period is (slightly) greater than the inter-word space, as it should be.
What about the case of justification? You'd hope the browser would put the extra space between sentences, in preference to putting it between words. And that's what happens:
Anyway, so you want more fine-grained control, to realize your own typographical vision on your web pages. The following has four characters between the sentences:
You could also use spaces of different widths from Unicode to get just the amount of space you want (see Wikipedia article).
So is there any way to do this automatically? CSS has a word-spacing property, but no sentence-spacing property (actually, it's not that easy to figure out what a "sentence" is, even in English, and less so in other languages). Of course, putting more spaces in your HTML is not going to do a thing, since HTML treats any run of white space as a single space. So you're going to have to write some code, or find a plug-in, which traverses the text in your page and inserts markup. Or, add a plug-in or something to your CMS to spit out code which is marked up appropriately. Your alternatives for doing so are:
Add &nbsp; or a combination of different-width Unicode spaces.
As another poster suggested, use span tags with margin.
As a variant on the above, use a <span class="sentence"> element, with a CSS rule like .sentence::after { content: "\2002"; }, where 2002 is the "en-space". This results in:
However, the bottom line is that the web is not a typographical environment, notwithstanding the many worthy efforts to nudge it in that direction. Depending on your goals, you might consider creating your documents in a high-end document preparation environment, and publishing them as PDFs, for example.
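The <span class="sentence"> variant from the list above could be sketched like this. The sample sentences are placeholders, and the markup assumes something (your own code or a CMS plug-in) has already wrapped each sentence in a span.

```html
<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8">
<style>
  /* Append an en space (U+2002) after every marked-up sentence. */
  .sentence::after { content: "\2002"; }
</style>
</head>
<body>
  <p>
    <span class="sentence">This is the first sentence.</span>
    <span class="sentence">This is the second sentence.</span>
    <span class="sentence">This is the third.</span>
  </p>
</body>
</html>
```

The line breaks between the spans collapse to one ordinary space, so each sentence ends up followed by an en space plus a regular space.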
The two spaces concept after a sentence is not "ugly" - in fact, it's just the opposite. Because of modern font kerning as well as the variety of fonts that Web browsers now support, it's sometimes very difficult to determine if a sentence has ended or if there is simply a word that is abbreviated that requires a period, not to mention a look of constant run-on. With 'fat' letters beginning a sentence, such as an upper-case "W", it can appear as though there is actually no space at all. Adding an additional space after a sentence provides readers with clear breaks. However, I get it that it would be quite difficult to create CSS that could "understand" what a sentence is so that it would automatically insert an additional space after each.
You could put your full stop in a span-tag and give it some CSS attributes, like "margin-right: 5px;", if it's only the appearance you are looking for.
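Something along these lines, assuming a class name such as stop (my own placeholder) and the 5px value mentioned above:

```html
<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8">
<style>
  /* Extra room after each full stop; adjust the value to taste. */
  .stop { margin-right: 5px; }
</style>
</head>
<body>
  <p>
    This is the first sentence<span class="stop">.</span>
    This is the second sentence<span class="stop">.</span>
  </p>
</body>
</html>
```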
This can only be done if you wrap each full stop in a tag, such as <span>. For example:
www<span>.</span>google<span>.</span>com
Then the CSS is:
span::after {
  content: "\00a0\00a0"; /* two no-break spaces; two ordinary spaces would collapse into one */
}

How do I add a small number into my html?

I'm having a difficult time doing this, probably because I'm not sure what to even call it. A client wants me to put a small number one in some text at the end of a sentence, which indicates that readers should look for the small 1 at the bottom of the page to read more details about that particular sentence. You see this type of thing a lot on container and supplement labels. Can this be accomplished with HTML? Please let me know if you need clarification.
Kind of like this, except instead of an asterisk, a small one:
*These statements have not been evaluated by the Food and Drug Administration. This product is not intended to diagnose, treat, cure or prevent any disease.
You must be looking for superscript.
<p>This is <sub>subscript</sub> text.</p>
<p>This is <sup>superscript</sup> text.</p>
In practice:
This is subscript text.
This is superscript text.
Use the character “¹” U+00B9 SUPERSCRIPT ONE. If you do not know how to enter it in your authoring environment (in Windows, you can usually enter it by Alt 0185), you can use the entity reference &sup1;.
Unlike <sup>1</sup> (which is what most people would use), this means using a character designed by a typographer to fit the font. The stroke width is correct, and so is the vertical placement. Using sup tends to mess up spacing between lines, among other things.
However, footnote references don’t work that well in HTML documents. Especially if you want to make the reference a link to the footnote itself, a tiny little superscript is very poor usability and accessibility. An expression like “[1]” is much better.
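A sketch of what a linked “[1]” reference might look like. The id values ref-1 and note-1 and the first sentence are placeholders of mine; the footnote text is borrowed from the question.

```html
<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8">
<title>Footnote reference example</title>
</head>
<body>
  <!-- The reference is a normal-sized link, which is easier to see and click. -->
  <p>This product supports general wellness.<a id="ref-1" href="#note-1">[1]</a></p>

  <!-- The footnote itself, with a link back to the place it was cited. -->
  <p id="note-1">
    [1] These statements have not been evaluated by the Food and Drug Administration.
    <a href="#ref-1">Back to text</a>
  </p>
</body>
</html>
```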
I think this problem is about citing a reference. You can do the following:
At the end of a sentence, write the number of the reference as a superscript inside an HTML anchor tag (the references themselves usually sit at the bottom of the page):
The sentence goes here.<a href="#1"><sup>1</sup></a>
Then at the bottom of the page you can write the reference like this:
<a id="1">The reference details go here.</a>