A &nbsp; character is a space which doesn't allow for line breaking.
<p>lorem ipsum here&nbsp;are&nbsp;some&nbsp;words&nbsp;and&nbsp;so on</p>
| lorem ipsum |
| here are some words and so |
| on |
What's the opposite of that? That is, a character which is NOT rendered as a space, but CAN be used for line breaking.
<p>foo supercalifragilisticexpialidocious bar</p>
<!-- put char here ^ and here ^ -->
|foo supercalifragi |
|listicexpiali |
|docious bar |
or with wider size:
|foo supercalifragilisticexpiali |
|docious bar |
I'm aware of the soft-hyphen character, but for my purposes, I specifically do not want a hyphen added at the break.
You want the Unicode character ZERO-WIDTH SPACE (\u200B).
You can get it in HTML with &#8203; or &#x200B;.
Explicit breaks and non-breaks:
LB7 : Do not break before spaces or zero width space.
LB8 : Break before any character following a zero-width space, even if one or more spaces intervene.
http://unicode.org/reports/tr14/
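If the markup is generated by a script, the break opportunities can be inserted programmatically. Below is a minimal sketch, assuming you simply want a possible break after every N characters of a long word; the function name add_break_opportunities and the interval of 12 are made up for illustration and are not part of any library.

```python
ZWSP = "\u200b"  # ZERO WIDTH SPACE: invisible, but allows a line break


def add_break_opportunities(word: str, every: int = 12) -> str:
    """Insert a zero-width space after every `every` characters of `word`."""
    chunks = [word[i:i + every] for i in range(0, len(word), every)]
    return ZWSP.join(chunks)


word = "supercalifragilisticexpialidocious"
broken = add_break_opportunities(word)

# For HTML output, the character can also be written as a numeric reference.
as_html = broken.replace(ZWSP, "&#8203;")
```

The visible text is unchanged; only invisible break points are added, so no hyphen appears at the break.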
There is also the little-known wbr tag, which lets the browser decide whether to break the line or not.
There's a nice page over at quirksmode.org that answers this question quite nicely IMHO. http://www.quirksmode.org/oddsandends/wbr.html
In short: use <wbr /> or &#8203; (or &shy;, but you mentioned you don't want the dash).
use <wbr>.
You can use the CSS3 property called word-wrap:
p.test {word-wrap:break-word;}
Hope it helps!
theres a lot of discussion about this but it has become more or less standard to use
Hello I am trying to compile an EPUB v2.0 with html code extracted from Indesign. I have noticed there are a lot of "special characters" either at the beginning of a paragraph or at the end. For example
<p class="text_indent0px font_size0_8em line_height1_325 margin_bottom1px margin_left0px margin_right0px sans_serif floatleft">E<span class="small_caps">VELYNE</span>&#9;</p>
What is this &#9; and can I either get rid of it or replace it with a "&nbsp;"?
&#9; is the character reference for a tab (ASCII code 9). So I guess the paragraphs were indented with tabs.
If you want to replace them with &nbsp;, then use 4 of them.
That would be a horizontal tab (i.e. the same as using the tab key).
If you want to replace it, I would suggest doing a find/replace using an ePub editor like Sigil (http://sigil-ebook.com/).
&#9; represents the horizontal tab.
Similarly, &#32; represents a space.
To replace &#9; you have to use &nbsp;.
In the HTML encoding &#{number};, {number} is the character code (for this range, the same as the ASCII code). Therefore, &#9; is a tab, which typically collapses to one space in HTML unless you use CSS (or the <pre> tag) to treat it as preformatted text.
Therefore, it's not safe to replace it with a non-breaking or a regular space unless you can guarantee that it's not being displayed as a tab anywhere.
div:first-child {
  white-space: pre;
}
<div>&#9;Test</div>
<div>&#9;Test</div>
<pre>&#9;Test</pre>
See https://developer.mozilla.org/en-US/docs/Web/CSS/white-space and http://ascii.cl/
&nbsp; is the entity used to represent a non-breaking space.
&#32; is the decimal character code of the space we enter using the keyboard spacebar.
&#9; is the decimal character code of the horizontal tab.
&nbsp; and &#32; both represent a space, but &nbsp; is non-breaking (no line break can occur at it) and multiple sequential occurrences will not be collapsed into one, whereas in the same case &#32; will collapse to one space.
= approx. 4 spaces and approx. 8 spaces
There are four character reference schemes in use:
Using decimal character codes (regex-pattern: &#[0-9]+;),
Using hexadecimal character codes (regex-pattern: &#x[a-f0-9]+;),
Using named character codes (regex-pattern: &[a-z]+;),
Using the actual characters (regex-pattern: .).
All these notations are rendered the same way; only the coding style differs. For example, if you need to display a Latin small letter e with diaeresis, you could use any of the conventions below:
&#235; (decimal notation),
&#xEB; (hexadecimal notation),
&euml; (HTML notation),
ë (actual character).
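To check that the different notations really refer to the same character, they can be decoded with a parser. Here is a small sketch using Python's standard html module (html.unescape is a real standard-library function; the variable names are just for illustration):

```python
import html

# Decimal, hexadecimal, named, and literal forms of the same character.
notations = ["&#235;", "&#xEB;", "&euml;", "ë"]
decoded = [html.unescape(n) for n in notations]

# The whitespace references discussed in this thread decode the same way.
tab = html.unescape("&#9;")      # horizontal tab
space = html.unescape("&#32;")   # ordinary space
nbsp = html.unescape("&nbsp;")   # no-break space, U+00A0
```

All four entries of decoded end up as the single character ë, while tab, space, and nbsp are three different characters.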
Likewise, as you said, which should be used: (a) &#9; (decimal notation), (b) &nbsp; (HTML notation), or (c) &#32; (decimal notation)?
From the analogy above, (a), (b) and (c) are notations for three different characters.
For your information, (a) is a horizontal tab, (b) is the non-breaking space, which is &#160; in decimal notation, and (c) is the decimal notation for the normal space character.
Technically, a space at the end of a paragraph is meaningless, so you could simply discard them all. If you still need the whitespace, use it inside <pre> elements, not in <p> or <div>.
Hope this helps...
I've got a phone number in my markdown that tends to break over the end of the line. In HTML I could wrap it in a <nobr> tag to have it keep together properly. What is the correct way to do this in markdown?
You can use the non-breaking hyphen character (&#8209;):
1&#8209;111&#8209;111&#8209;1111
for
1‑111‑111‑1111
Or, if you need the phone number formatted with spaces in between, use the no-break space character (&nbsp;):
1&nbsp;111&nbsp;111&nbsp;1111
for
1 111 111 1111
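If the markdown is produced by a script, the substitution can be done before the text is written out. A minimal sketch, assuming the number only uses ASCII hyphens or spaces as separators; keep_together is a made-up helper name:

```python
NB_HYPHEN = "\u2011"  # NON-BREAKING HYPHEN (&#8209; in HTML)
NBSP = "\u00a0"       # NO-BREAK SPACE (&nbsp; in HTML)


def keep_together(phone: str) -> str:
    """Swap breakable separators for their non-breaking counterparts."""
    return phone.replace("-", NB_HYPHEN).replace(" ", NBSP)


hyphenated = keep_together("1-111-111-1111")
spaced = keep_together("1 111 111 1111")
```

The result looks the same to the reader, but the renderer has no place to wrap the number.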
Apparently I didn't realize you could just embed html in markdown.
<nobr>[1-111-111-1111](tel:11111111)</nobr>
works fine
The 'nobr' tag is non-standard HTML, and while it is supported by browsers for legacy purposes, the correct way is to handle it via CSS.
CSS equivalent:
.nobr { white-space:nowrap; }
In my html page I have displayed fractions using html special character. My idea is to display 1/2, 2/2 and 3/3.
I have used &frac13; for 1/3 and &frac23; for 2/3 and the special characters are displayed correctly. I took reference from this link HTML Special Characters
But when I tried using &frac33; for 3/3 it is not working. It is just displaying as it is, not converting to special character.
Could someone please tell me what the HTML special character for 3/3 is?
Thank You
<sup>3</sup>&frasl;<sub>3</sub>
Result: 3⁄3
Not all fractions have their own special character. For those fractions (like 3/3) which don't have slanted fraction characters, use the HTML entity &frasl;:
<sup>3</sup>&frasl;<sub>3</sub> = 3⁄3
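If the fractions are generated, the two cases can be handled in one place: use the named entity when one exists and fall back to the sup/sub construction otherwise. This is only a sketch; the function name fraction_html is invented, and the lookup table lists just a handful of the named fraction entities that HTML defines.

```python
# A few of the fractions that have a named HTML entity (not exhaustive).
NAMED_FRACTIONS = {
    (1, 2): "&frac12;",
    (1, 3): "&frac13;",
    (2, 3): "&frac23;",
    (1, 4): "&frac14;",
    (3, 4): "&frac34;",
}


def fraction_html(numerator: int, denominator: int) -> str:
    """Return a named entity if known, else a <sup>/<sub> approximation."""
    named = NAMED_FRACTIONS.get((numerator, denominator))
    if named is not None:
        return named
    # Fallback: FRACTION SLASH (&frasl;) between raised and lowered digits.
    return f"<sup>{numerator}</sup>&frasl;<sub>{denominator}</sub>"


one_third = fraction_html(1, 3)
three_thirds = fraction_html(3, 3)
```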
There is no named (or numeric) character reference for a character representing 3/3, since there simply is no such character.
In theory, the FRACTION SLASH U+2044 “⁄” character (representable as &frasl; in HTML, among other things) can be used between digits to suggest that rendering routines present the combination as a typographic fraction. In practice, only some typesetting programs can do this, and web browsers come nowhere near.
Trying to play with HTML markup and/or CSS to construct something that looks like a typographic fraction (comparable to ½ in appearance) tends to produce messy results, including uneven line spacing.
The practical option is to use just common notations like 2/2. But if you want something like a typographic fraction, you could use MathML with MathJax. More exactly, you would use the mfrac element in MathML with the attribute bevelled="true". Sample code:
<!doctype html>
<title>Fractions with MathJax and MathML</title>
<script src=
"http://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML">
</script>
Here we have the common fraction ½, then
a simulation with HTML and CSS:
<sup>1</sup>⁄<sub>2</sub>.
Note that this tends to create uneven line spacing.
There are some cures to that, but let us see how MathML works:
<math>
<mfrac bevelled="true">
<mn>1</mn>
<mn>2</mn>
</mfrac>
</math>.
Some text here to demonstrate that line spacing has not
been disturbed here.
I have used space characters in HTML to give regular spacing in my text, but interestingly some text still does not have regular spacing. Please have a look:
<ul style="margin-left:40px;background-color:#6CF ;padding-left:20px;padding-right:10px;padding-top:10px;padding-bottom:10px; font-size:12px;" >
<li>CS-103 Programming Languages</li>
<li>EL-133 Electronics-I</li>
<li>MT-111 Calculus</li>
<li>CY-105 Applied Chemistry</li>
<li>PH-121 Applied Physics</li>
<li>HS-105 Pakistan Studies | HS-127 Pakistan Studies(for Foreigners)</li>
</ul>
Here is how it looks,
CS-103 Programming Languages
EL-133 Electronics-I
MT-111 Calculus
CY-105 Applied Chemistry
PH-121 Applied Physics
HS-105 Pakistan Studies | HS-127 Pakistan Studies(for Foreigners)
Please help me make all the list elements look the same. Thanks
The text does have regular spaces. The problem is that the font you use is not fixed width, and the length of the course type/number is throwing it off.
Use a table for stuff like that.
Depending on its semantic value, you could also use a definition list.
HTML:
<dl>
<dt>CS-103</dt>
<dd>Programming Languages</dd>
<dt>EL-133</dt>
<dd>Electronics-I</dd>
<dt>MT-111</dt>
<dd>Calculus</dd>
<dt>CY-105</dt>
<dd>Applied Chemistry</dd>
<dt>PH-121</dt>
<dd>Applied Physics</dd>
<dt>HS-105</dt>
<dd>Pakistan Studies | HS-127 Pakistan Studies (for Foreigners)</dd>
</dl>
CSS:
dl {
  overflow: hidden;
}
dt {
  float: left;
  width: 80px;
}
http://jsfiddle.net/SVdTt/
Brad's feedback about inconsistent spacing when using non-monospaced fonts is correct (and there is no \t symbol to use for tabulation in HTML); however, it may be more appropriate to use a definition list here with some styling applied.
Semantics fit perfectly (a term name dt followed by its description dd):
<dl>
<dt>CS-103</dt><dd>Programming Languages</dd>
<dt>EL-133</dt><dd>Electronics-I</dd>
...
</dl>
You will need to choose a monospaced font for them to look the same if I understand correctly.
I have to render some text to a web page. The text is coming from sources outside my control and it is formatted using newlines and tab characters.
New lines (\n) can be replaced by br tags, but what about preserving tabs? A brief search reveals there is no way to directly render tab characters in HTML.
Why not just wrap the content in a <pre> tag? This will handle the \n as well as the \t characters.
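Since the text comes from sources outside your control, it should be escaped before it goes inside the pre element. A minimal sketch in Python (the question does not say which language is in use, so this is an assumption; to_pre is a made-up helper name, html.escape is the standard-library function):

```python
import html


def to_pre(text: str) -> str:
    """Escape markup characters and wrap the text in a <pre> element."""
    # Tabs and newlines are left untouched; <pre> preserves them on display.
    return f"<pre>{html.escape(text)}</pre>"


snippet = to_pre("name\tvalue\n<b>\t1")
```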
An alternative to the non-breaking space would be the em space (&emsp; or &#8195;). It is usually rendered as a longer space, if that is an advantage.
A Quick & Dirty Way
For a quick fix, you can use the xmp tag to stop the browser from collapsing whitespace. The xmp tag contains text that should be rendered uninterpreted (and in a monospaced font).
The problem is that xmp tags have been deprecated since HTML3.2, and have been dropped from the HTML5 spec altogether. In practice, browsers still support xmp tags, so they can still be useful, but not in production.
The Proper Way
Tabs are for tabulating data. The proper way to tabulate data in HTML is to use the table tag. Every line in your original string translates to a row in the table, while each tab in the original string starts a new (left-aligned) cell in the table.
Imagine you had this (tab-aligned) string to begin with:
Spam 1.99
Cheese 2.99
Translated to HTML, that string would look like this:
<table>
<tr> <td> Spam </td> <td> 1.99 </td> </tr>
<tr> <td> Cheese </td> <td> 2.99 </td> </tr>
</table>
Note: If you wrapped the tab-aligned string in xmp tags and styled the HTML table to look like plain text, the rendered results would be the same.
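The translation from a tab-aligned string to a table is mechanical: one row per line, one cell per tab-separated field. Here is a rough sketch in Python, assuming the input uses \n between lines and \t between fields; tsv_to_table is a made-up name and the output is deliberately unstyled.

```python
import html


def tsv_to_table(text: str) -> str:
    """Turn tab-separated lines into an HTML table (one row per line)."""
    rows = []
    for line in text.splitlines():
        # Each tab starts a new cell; escape the cell content for safety.
        cells = "".join(
            f"<td>{html.escape(cell)}</td>" for cell in line.split("\t")
        )
        rows.append(f"<tr>{cells}</tr>")
    return "<table>\n" + "\n".join(rows) + "\n</table>"


table = tsv_to_table("Spam\t1.99\nCheese\t2.99")
```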
Replace \t with &nbsp;.
Each space you want will be a &nbsp;.
As pointed out, this isn't completely correct: it only pretends to be a tab, since HTML doesn't actually format a tab as you would expect.
If you're already replacing line breaks, why not do the same for tabs...?
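// Four non-breaking spaces per tab; the width is an assumption, adjust as needed.
$text = str_replace("\t", '&nbsp;&nbsp;&nbsp;&nbsp;', $text);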
&nbsp;, &ensp;, or &emsp; can be used.
W3 says little about this...
The character entities &ensp; and &emsp; denote an en space and an em space respectively, where an en space is half the point size and an em space is equal to the point size of the current font.
Read More at W3.org for HTML3
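Putting the replacement approach together, here is a small sketch that escapes the incoming text, then substitutes an em space for each tab and a br tag for each newline. It is written in Python for illustration; render_plain_text is a made-up name, and using &emsp; for a tab is a choice, not a rule. Note that this only widens the gap; it does not align columns the way real tab stops would.

```python
import html


def render_plain_text(text: str, tab: str = "&emsp;") -> str:
    """Escape text, then replace tabs and newlines with HTML equivalents."""
    escaped = html.escape(text)  # must happen before inserting any markup
    return escaped.replace("\t", tab).replace("\n", "<br>\n")


rendered = render_plain_text("a\tb\nc")
```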
Read More at W3.org for HTML4
Even more at Wikipedia (about spaces)