Is there an HTML character that, on all (major) browsers (plus IE8 sadly), displays nothing and doesn't add any extra space?
So, an alternative to &nbsp; which doesn't add whitespace to the page, and which won't ever show up as an ugly "unrecognised character" marker or ?.
Why: in my case, I'm trying to work around a problem on an old, proprietary CMS that is removing empty but necessary HTML elements that are required because other parts of the system will fill them dynamically.
Imagine something like (simplified trivial example) <span class="placeholder" data-type="username"></span> which is populated with a user's username if a user is logged in - but this old-school CMS sees it as being empty and removes it.
There seem to be two options that mostly fit the bill. They seem to reliably not show anything when in a <span>, but they (particularly the second option) might have a minor effect on copy/paste and word breaking in some cases.
Zero-width space
&#8203; (aka &#x200B;), which behaves the same as the (now in HTML5) <wbr> - used to make words break at certain points without changing the display of the words.
<h1>This text is full<span>&#8203;</span> of spans with char<span>&#8203;</span>acte<span>&#8203;</span>rs that affe<span>&#8203;</span>ct word brea<span>&#8203;</span>king but don't show up</h1>
<h1>Especially in das super<span>&#8203;</span>doupercrazy<span>&#8203;</span>long<span>&#8203;</span>worden.</h1>
Seems to work fine on modern browsers and IE7+ (not tested on IE6).
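For the CMS problem in the question, the filling-in could be automated before the markup reaches the CMS. Here is a minimal sketch, assuming the HTML is available as a string and that a simple regex is good enough for it (the function name fill_empty_elements is made up for illustration, and attribute values containing > would need a real HTML parser):

```python
import re

# Numeric character reference for U+200B ZERO WIDTH SPACE.
ZWSP_ENTITY = "&#8203;"

# Matches an element with no content at all, e.g. <span class="x"></span>.
# Assumption: attribute values never contain "<" or ">".
EMPTY_ELEMENT = re.compile(
    r"<(?P<tag>[a-zA-Z][a-zA-Z0-9]*)(?P<attrs>[^<>]*)></(?P=tag)>"
)


def fill_empty_elements(html: str) -> str:
    """Put a zero-width space inside every completely empty element."""
    def fill(match: re.Match) -> str:
        tag, attrs = match.group("tag"), match.group("attrs")
        return f"<{tag}{attrs}>{ZWSP_ENTITY}</{tag}>"

    return EMPTY_ELEMENT.sub(fill, html)


if __name__ == "__main__":
    print(fill_empty_elements('<span class="placeholder" data-type="username"></span>'))
```

Elements that already have content are left alone, so running it over a whole template should only touch the placeholders.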
Soft hyphen
&shy; - like a zero-width space but (in theory) adds a hyphen when it breaks a word across a line.
<h1>This text is full<span>&shy;</span> of spans with char<span>&shy;</span>acte<span>&shy;</span>rs that affe<span>&shy;</span>ct word brea<span>&shy;</span>king but don't show up</h1>
<h1>Especially in das super<span>&shy;</span>doupercrazy<span>&shy;</span>long<span>&shy;</span>worden.</h1>
<h1>Example where das superdoupercrazylongword contains no spans.</h1>
Fine on modern browsers and IE7+ (not tested on IE6), though as some comments note there are issues with these turning into regular hyphens when copied and pasted, for example when pasting from Chrome into Notepad on Windows 8.1.
Within a span, it seems to never add a hyphen (but still better to use zero-width spaces if possible).
Edit: I found an older SO answer discussing these as a solution to a different problem which suggests these are robust except for possible copy/paste quirks.
The only other issue with these I could find in research is that apparently some search engines may treat words containing these as being split (e.g. awesome might match searches for awe and some instead of awesome).
There are two characters that are graphic characters but defined to be zero width: U+200B ZERO WIDTH SPACE and U+FEFF ZERO WIDTH NO-BREAK SPACE. The former acts like a space character, so that it is a separator between words and allows line breaking in formatting, whereas the latter explicitly forbids line breaks. It depends on the purpose and context which one you should use. They can be represented in HTML as &#x200B; and &#xFEFF;.
These characters work well in most browsing situations. However, in IE 6, they tend to be rendered as small rectangles, since IE 6 does not know these characters and tries to render them as if they were graphic characters (which lack glyphs).
There are also control characters that are allowed in HTML, such as U+200E LEFT-TO-RIGHT MARK and U+200D ZERO WIDTH JOINER. They have no rendering as such, though they may affect rendering of graphic characters, e.g. by setting writing direction, affecting ligature behavior, etc. Due to the possibility of such effects, it might be risky to use them as “dummy” characters.
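To double-check which code point you are dealing with and how to write it in HTML, something like the following can help. It is only a sketch: the helper name describe is invented here, and it just combines Python's standard unicodedata lookup with a hexadecimal character reference.

```python
import unicodedata

# The characters mentioned above.
CANDIDATES = ["\u200b", "\ufeff", "\u200e", "\u200d"]


def describe(ch: str) -> tuple[str, str, str]:
    """Return (code point, Unicode name, HTML character reference) for one character."""
    code = ord(ch)
    return (f"U+{code:04X}", unicodedata.name(ch), f"&#x{code:X};")


if __name__ == "__main__":
    for ch in CANDIDATES:
        print(*describe(ch), sep="  ")
```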
Chrome, IE, and Safari break lines at hyphens but Firefox doesn't.
Is there any way to make Firefox break lines at hyphens, like other browsers?
Insert the <wbr> tag after the hyphen. This tag is not present in any HTML specification (yet—it is in HTML5 drafts), but it has worked for a long time in browsers.
Firefox automatically treats a hyphen as allowing a line break after it when there are sufficiently many characters around the hyphen. But if you wish to allow line breaks more widely than that, use <wbr>, e.g. pre-<wbr>war.
Not easily. Try inserting a zero-width space (&#8203;) after each hyphen. For example:
a-&#8203;really-&#8203;long-&#8203;hyphenated-&#8203;phrase
This will make Firefox wrap as if there's a space, but it won't visually display that space.
It's easier to implement this if you have something processing your output server-side. Just run hyphens through a quick string replace.
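The server-side string replace could look roughly like this. It is a sketch assuming the text is plain text (not markup, where hyphens inside tags and attribute names such as data-type must be left alone); the function name is made up.

```python
# U+200B ZERO WIDTH SPACE: invisible, but allows a line break.
ZWSP = "\u200b"


def allow_breaks_after_hyphens(text: str) -> str:
    """Insert a zero-width space after every hyphen in plain text."""
    return text.replace("-", "-" + ZWSP)


if __name__ == "__main__":
    result = allow_breaks_after_hyphens("a-really-long-hyphenated-phrase")
    # The result looks the same when printed; repr() shows the inserted characters.
    print(repr(result))
```

Note that it is not idempotent: running it twice inserts two zero-width spaces after each hyphen, so apply it once, at output time.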
I'm writing a simple paragraph in both English and Japanese, using only HTML and CSS. The English text breaks lines normally (when a word doesn't fit on a line anymore, it's pushed to the next one).
With Japanese though, only part of a word is pushed to the next line, not the whole word. I've tried setting word-wrap to break-word and normal, but nothing changes (with the Japanese text).
How do I make whole words in Japanese jump to the next line like it happens in English?
English separates words with spaces, Japanese doesn't.
Whether characters in Japanese form a word or not depends on context. In many cases, looking for certain grammatical (Kana) particles could be used to separate words - but this wouldn't even be close to being reliable.
Essentially, you'd need a Japanese dictionary / understanding of the language to identify where the words start and end - a browser won't know how to do this.
Alternatively, if you know the start and end of the words, you could perhaps wrap each one in a span - then use CSS to ensure each span wraps to a new line as a whole when it doesn't fit.
Japanese has specific rules that are followed when breaking text. They are called 禁則処理 (kinsoku shori). Here is a link explaining the rules. The rules are mostly concerned with special characters. Have a look at any popular Japanese webpage and you will see that multi-character (kana and kanji) words are often split. I often see です split between lines.
Update:
I stumbled across this tool recently. I haven't tried it out yet, but the theory is solid. If someone is looking to improve the line breaks with Japanese text this could be a good solution.
I'm not an expert with Japanese specifically so it's hard for me to tell if things are wrapping correctly, but I just had to solve this problem myself and both word-break: keep-all and white-space: nowrap seemed to solve the issue for me, so those might be worth trying out.
Until browsers are smart enough to do on-the-fly semantic analysis of the language, there are only a couple of options:
1/ Understand enough of the language to be able to group semantic elements in their own, unbreakable DOM elements. Something like (without the line breaks):
<span class="el">私は</span>
<span class="el">キッチンで</span>
<span class="el">パンを</span>
<span class="el">食べました。</span>
Then in CSS, use something like .el { display: inline-block; }. You probably want to do this only for headings and important pieces of text, since it could impact accessibility (i.e. how screen readers interpret the text). The other drawbacks are that 1/ you need to understand the text to know where to add the blocks, and 2/ this obviously only works for static text (and even then, it's still a manual, painstaking process).
2/ Use a tool that does the grouping for you. It could be something on the client side, like TinySegmenter (which segments a bit too much for my taste), or on the server side, with things like Budou, which uses the Google Cloud Natural Language API and ML to analyze your sentences. The downsides (at least for Budou) are that 1/ you need Python (I think I saw a Node.js port somewhere), and 2/ it's not free.
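Whichever way the segments are obtained, turning them into markup is the easy part. A rough sketch, assuming the segmentation has already been done (by hand or by a tool) and is passed in as a list of strings; wrap_segments is a hypothetical helper, and the el class is the one from the CSS rule above:

```python
from html import escape


def wrap_segments(segments: list[str], css_class: str = "el") -> str:
    """Wrap each pre-segmented chunk in its own span, with no whitespace between spans."""
    return "".join(
        f'<span class="{escape(css_class)}">{escape(segment)}</span>'
        for segment in segments
    )


if __name__ == "__main__":
    print(wrap_segments(["私は", "キッチンで", "パンを", "食べました。"]))
```

The spans are joined without line breaks on purpose, so no stray spaces show up between the Japanese chunks.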
Hope this helps!
Try setting the CSS property
line-break:strict;
Check it out here.
How do you solve the problem with soft hyphens on your web pages? In a text there can be long words which you might want to line break with a hyphen. But you do not want the hyphen to show if the whole word is on the same line.
According to comments on this page, <wbr> is a non-standard "tag soup invented by Netscape". It seems like &shy; has its problems with standards compliance as well. There seems to be no way to get a working solution for all browsers.
Which is your way for handling soft hyphens and why did you choose it? Is there a preferred solution or best practice?
See related SO Discussion here.
Unfortunately, &shy;'s support is so inconsistent between browsers that it can't really be used.
QuirksMode is right -- there's no good way to use soft hyphens in HTML right now. See what you can do to go without them.
2013 edit: According to QuirksMode, &shy; now works/is supported on all major browsers.
Feb 2015 summary (partially updated Nov 2017)
They all perform pretty well; &#173; edges it, as Google can still index words containing it.
In browsers: &shy; and &#173; both display as expected in major browsers (even old IE!). <wbr> isn't supported in recent versions of IE (10 or 11) and doesn't work properly in Edge.
When copied and pasted from browsers (tested 2015): works as expected for &shy; and &#173; in Chrome and Firefox on Mac. On Windows (10), it keeps the characters and pastes hard hyphens into Notepad and invisible soft hyphens into applications that support them. IE (Win7) always pastes with hyphens, even in IE10, and Safari (Mac) copies in a way which pastes as hyphens in some applications (e.g. MS Word), but not others.
Find on page works for &shy; and &#173; on all browsers except IE, which only matches exact copied-and-pasted matches (even up to IE11).
Search engines: Google matches words containing &#173; with words typed normally. As of 2017 it appears to no longer match words containing &shy;. Yandex appears to be the same. Bing and Baidu seem to not match either.
Test it
For up-to-date live testing, here are some examples of unique words with soft hyphens.
&shy; - confumbabblicationism - confumbabblicationism
..............................................................................................................confumbabblicationism
..................................................................................................................confumbabblicationism
<wbr> - donfounbabbl<wbr>ication<wbr>ism. This site removes <wbr/> from output. Here's a jsbin.com snippet for testing.
&#173; - eonfulbabblicationism - eonfulbabblicationism
.................................................................................................................eonfulbabblicationism
....................................................................................................................eonfulbabblicationism
Here they are with no shy hyphens (this is for copying and pasting into find-on-page testing; written in a way which won't break the search engine tests):
ZZZconfumbabblicationismZZZdonfounbabblicationismZZZeonfulbabblicationismZZZ
Display across browsers
Success: displaying as a normal word, except where it should break, when it breaks and hyphenates in the specified place.
Failure: displaying unusually, or failing to break in the intended place.
Chrome (40.0.2214.115, Mac): &shy; success, <wbr> success, &#173; success
Firefox (35.0.1, Mac): &shy; success, <wbr> success, &#173; success
Safari (6.1.2, Mac): &shy; success, <wbr> not tested yet, &#173; success
Edge (Windows 10): &shy; success, <wbr> fail (break but no hyphen), &#173; success
IE11 (Windows 10): &shy; success, <wbr> fail (no break), &#173; success
IE10 (Windows 10): &shy; success, <wbr> fail (no break), &#173; success
IE8 (Windows 7): erratic - sometimes none of them work at all and they all just follow CSS word-wrap. Sometimes they all seem to work. Not yet found any clear pattern as to why.
IE7 (Windows 7): &shy; success, <wbr> success, &#173; success
Copy-paste across browsers
Success: copying and pasting the whole word, unhyphenated. (tested on Mac pasting into browser search, MS Word 2011, and Sublime Text)
Failure: pasting with a hyphen, space, line break, or with junk characters.
Chrome (40.0.2214.115, Mac): &shy; success, <wbr> success, &#173; success
Firefox (35.0.1, Mac): &shy; success, <wbr> success, &#173; success
Safari (6.1.2, Mac): &shy; fail into MS Word (pastes all as hyphens), success in other applications; <wbr> fail; &#173; fail into MS Word (pastes all as hyphens), success in other applications
IE10 (Win7): &shy; fail (pastes all as hyphens), <wbr> fail, &#173; fail (pastes all as hyphens)
IE8 (Win7): &shy; fail (pastes all as hyphens), <wbr> fail, &#173; fail (pastes all as hyphens)
IE7 (Win7): &shy; fail (pastes all as hyphens), <wbr> fail, &#173; fail (pastes all as hyphens)
Search engine matching
Updated in November 2017. <wbr> not tested because StackOverflow's CMS stripped it out.
Success: searches on the whole, non-hyphenated word find this page.
Failure: search engines only find this page on searches for the broken segments of the words, or a word with hyphens.
Google: &shy; fails, &#173; succeeds
Bing: &shy; fails, &#173; fails
Baidu: &shy; fails, &#173; fails (can match fragments within longer strings but not the words on their own containing a &shy; or &#173;)
Yandex: &shy; fails, &#173; succeeds (though it's possible it's matching a string fragment like Baidu, not 100% sure)
Find on page across browsers
Success and failure are defined as for search engine matching.
Chrome (40.0.2214.115, Mac): &shy; success, <wbr> success, &#173; success
Firefox (35.0.1, Mac): &shy; success, <wbr> success, &#173; success
Safari (6.1.2, Mac): &shy; success, <wbr> success, &#173; success
IE10 (Win7): &shy; fail (only matches when both contain shy hyphens), <wbr> success, &#173; fail (only matches when both contain shy hyphens)
IE8 (Win7): &shy; fail (only matches when both contain shy hyphens), <wbr> success, &#173; fail (only matches when both contain shy hyphens)
IE7 (Win7): &shy; fail (only matches when both contain shy hyphens), <wbr> success, &#173; fail (only matches when both contain shy hyphens)
There is an ongoing effort to standardize hyphenation in CSS3.
Some modern browsers, notably Safari and Firefox, already support this. Here is a good and up to date reference on browser support.
Once the CSS hyphenation gets implemented universally, that would be the best solution. In the meantime, I can recommend Hyphenator - a JS script that figures out how to hyphenate your text in the way most appropriate for a particular browser.
Hyphenator:
relies on Franklin M. Liang's hyphenation algorithm, commonly known from LaTeX and OpenOffice,
uses CSS3 hyphenation where it is available,
automatically inserts &shy; on most other browsers,
supports multiple languages,
is highly configurable,
gracefully falls back in case javascript is not enabled.
I've used it and it works great!
I use &shy;, inserted manually where necessary.
I always find it a pity that people don't use techniques because there is some (maybe old or strange) browser around which doesn't handle them the way they were specified. I found that &shy; works properly in both recent Internet Explorer and Firefox browsers, and that should be enough. You may include a browser check telling people to use something mature or continue at their own risk if they come around with some strange browser.
Syllabification isn't that easy and I cannot recommend leaving it to some JavaScript. It's a language-specific topic and may need to be carefully reviewed by an editor if you don't want it to make your text irritating. Some languages, such as German, form compound words and are likely to lead to decomposition problems. E.g. Spargelder (German: saved money, pl.) may, by syllabification rules, be wrapped in two places (Spar-gel-der). However, wrapping it in the second position makes the first part show up as Spargel- (German: asparagus), activating a completely misleading concept in the head of the reader, and should therefore be avoided.
And what about the string Wachstube? It could either mean 'guardroom' (Wach-stu-be) or 'tube of wax' (Wachs-tu-be). You will probably find other examples in other languages as well. You should aim to provide an environment in which the editor is supported in creating a well-syllabified text, proof-reading every critical word.
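One way to support that kind of proof-reading is a small, hand-maintained list of approved break points that gets applied mechanically. This is only a sketch of the idea: the APPROVED table, its | notation and the function name are all invented for illustration, and ambiguous words like Wachstube are deliberately left out because they need a decision per occurrence.

```python
import re

# U+00AD SOFT HYPHEN.
SHY = "\u00ad"

# Hypothetical editor-maintained table: "|" marks the only approved break points.
# Spargelder may break as Spar-gelder, but never as Spargel-der.
APPROVED = {
    "Spargelder": "Spar|gelder",
}


def apply_approved_hyphenation(text: str, approved: dict[str, str] = APPROVED) -> str:
    """Insert soft hyphens only at break points an editor has approved."""
    def replace_word(match: re.Match) -> str:
        word = match.group(0)
        pattern = approved.get(word)
        return pattern.replace("|", SHY) if pattern else word

    return re.sub(r"\w+", replace_word, text)


if __name__ == "__main__":
    print(repr(apply_approved_hyphenation("Die Spargelder sind weg")))
```

Words that are not in the table are left untouched, so nothing gets hyphenated without someone having looked at it.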
It is very important to notice that, as of HTML5, <wbr> and &shy; are not supposed to do the same thing!
Soft hyphens
&shy; is a soft hyphen, i.e., U+00AD: SOFT HYPHEN. For example,
innehålls&shy;förteckning
might be rendered as
innehållsförteckning
or as
innehålls-
förteckning
As of today, soft hyphens work in Firefox, Chrome, and Internet Explorer.
The wbr element
The wbr element is a word-break opportunity, which will not display a hyphen if a line break occurs. For example,
ABCDEFG<wbr/>abcdefg
might be rendered as
ABCDEFGabcdefg
or as
ABCDEFG
abcdefg
As of today, this element works in Firefox and Chrome.
The zero-width space entity &#8203; can be used in place of the <wbr> tag reliably on virtually every platform.
Also useful is the word joiner entity, &#8288;, which can be used to prohibit a break. (Insert it between each character of a word, except where you want the break.)
With the two of these, you can do anything.
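As a sketch of how the two could be combined (the function name and the index convention are my own assumptions): put a word joiner between every pair of characters, except at the chosen positions, which get a zero-width space instead.

```python
ZWSP = "\u200b"  # U+200B ZERO WIDTH SPACE: allows a break
WJ = "\u2060"    # U+2060 WORD JOINER: prohibits a break


def break_only_at(word: str, break_positions: list[int]) -> str:
    """Allow line breaks only before the characters at the given indexes."""
    allowed = set(break_positions)
    pieces = []
    for index, ch in enumerate(word):
        if index > 0:
            # Between characters: a break opportunity or an explicit "no break".
            pieces.append(ZWSP if index in allowed else WJ)
        pieces.append(ch)
    return "".join(pieces)


if __name__ == "__main__":
    # Allow a break only between "ABCDEFG" and "abcdefg".
    print(repr(break_only_at("ABCDEFGabcdefg", [7])))
```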
This is a cross-browser solution that I was looking at a little while ago. It runs on the client and uses jQuery:
(function($) {
    // Lets long words wrap by adding invisible break opportunities.
    $.fn.breakWords = function() {
        this.each(function() {
            // Only element nodes are handled.
            if (this.nodeType !== 1) { return; }
            if (this.currentStyle && typeof this.currentStyle.wordBreak === 'string') {
                //Lazy Function Definition Pattern, Peter's Blog
                //From http://peter.michaux.ca/article/3556
                //For IE, which supports word-break natively
                this.runtimeStyle.wordBreak = 'break-all';
            }
            else if (document.createTreeWalker) {
                //Faster Trim in Javascript, Flagrant Badassery
                //http://blog.stevenlevithan.com/archives/faster-trim-javascript
                var trim = function(str) {
                    str = str.replace(/^\s\s*/, '');
                    var ws = /\s/,
                        i = str.length;
                    while (ws.test(str.charAt(--i)));
                    return str.slice(0, i + 1);
                };
                //Lazy Function Definition Pattern, Peter's Blog
                //From http://peter.michaux.ca/article/3556
                //For Opera, Safari, and Firefox
                var dWalker = document.createTreeWalker(this, NodeFilter.SHOW_TEXT, null, false);
                //c is U+200B ZERO WIDTH SPACE
                var node, s, c = String.fromCharCode(8203);
                while (dWalker.nextNode()) {
                    node = dWalker.currentNode;
                    //we need to trim the string, otherwise Firefox will display
                    //incorrect text-indent with space characters
                    s = trim(node.nodeValue).split('').join(c);
                    node.nodeValue = s;
                }
            }
        });
        return this;
    };
})(jQuery);
I suggest using wbr, so the code can be written like this:
<p>这里有一段很长,很长的<wbr
></wbr>文字;这里有一段</p>
Written this way, it won't introduce a space between characters, whereas &#8203; won't prevent the spaces created by line breaks in the source.
I have used the soft hyphen Unicode character successfully in a few desktop and mobile browsers to solve the issue.
The Unicode symbol is \u00AD and is pretty easy to insert into a Python unicode string like s = u'Языки и методы програм\u00ADми\u00ADро\u00ADва\u00ADния'.
Another solution is to insert the Unicode character itself, and the source string will look perfectly ordinary in editors like Sublime Text, Kate, Geany, etc. (the cursor will feel the invisible symbol though).
Hex editors or in-house tools can automate this task easily.
An easy kludge is to use a rare and visible character, like ¦, which is easy to copy and paste, and replace it with a soft hyphen using, e.g., a frontend script in $(document).ready(...). Source code like s = u'Языки и методы про¦гра¦м¦ми¦ро¦ва¦ния'.replace('¦', u'\u00AD') is easier to read than s = u'Языки и методы про\u00ADг\u00ADра\u00ADм\u00ADми\u00ADро\u00ADва\u00ADния'.
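Wrapped up as a small Python 3 helper, the kludge might look like this. It is a sketch: the function names are invented, and it assumes ¦ never occurs as real content in your text. The second function is there in case you need the plain word back (e.g. for searching).

```python
SHY = "\u00ad"   # U+00AD SOFT HYPHEN
MARKER = "¦"     # visible stand-in typed in the source text


def markers_to_soft_hyphens(text: str, marker: str = MARKER) -> str:
    """Replace every visible marker with an invisible soft hyphen."""
    return text.replace(marker, SHY)


def strip_soft_hyphens(text: str) -> str:
    """Remove soft hyphens again, giving back the plain word."""
    return text.replace(SHY, "")


if __name__ == "__main__":
    s = markers_to_soft_hyphens("Языки и методы про¦гра¦м¦ми¦ро¦ва¦ния")
    print(repr(s))
    print(strip_soft_hyphens(s))
```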
Sometimes web browsers seem to be more forgiving if you use the Unicode string rather than the entity.
If you are unlucky and still have to use JSF 1, then the only solution is to use &#173;, since &shy; does not work.
<wbr> and &shy;
Today you can use both.
Use <wbr> to allow a break without adding anything visible.
For example, use it to display long links:
https://stackoverflow.com/questions/226464/soft-hyphen-in-html-wbr-vs-shy
Use &shy; where a hyphen is wanted: when necessary, the text will break at this point and add a hyphen.
Example:
"É impossível para um homem aprender aquilo que ele acha que já sabe."
div{
max-width: 130px;
border-width: 2px;
border-style: dashed;
border-color: #f00;
padding: 10px;
}
<div>https://<wbr>stackoverflow.com<wbr>/questions/226464<wbr>/soft-hyphen-in-<wbr>html-wbr-vs-shy</div>
<div>É impossível para um homem aprender aquilo que ele acha que já sabe.</div>
Keep it simple. A soft hyphen is just a character. Like A or B or 🦊. You don't need a special character to include it, you can just type it (if your computer is set up for that), or copy/paste it from elsewhere.
Like here: Soft Hyphen on unicode explorer
Of course, the character you should copy is invisible, so that makes it a little difficult I guess :) But it still works. Right-click and copy.
You also don't need to render text containing a soft hyphen in any special way (like "DangerouslySetInnerHTML" with React). It is just a character; it works like it should in all relevant browsers worth any amount of salt.
As an example, in the next paragraph I will write the phrase "a really long word with soft hyphens instead of whitespace that should span at least two lines of text no matter how big your screen is". Except I'll put in soft hyphens instead of whitespace. Here goes:
areallylongwordwithsofthyphensinsteadofwhitespacethatshouldspanatleasttwolinesoftextnomatterhowbigyourscreenis