Somehow, in my website, the last word appearing in a contenteditable is marked as misspelled, even when it is the correct spelling.
Here is an example of such an entry with the last word clearly marked as misspelled:
Just in case there is the HTML code surrounding that contenteditable <div> tag.
Note: as mentioned by Carl below, that screenshot shows the HTML as loaded and not the HTML as used once jQuery ran against it. I have a simplified sample below that includes that data. Also it shows a ! at the end of the string, which makes it work right... (Any other character after the word, and the spelling works fine.)
I tried to replicate that problem in jsfiddle without luck, so I create a simplified version of the page with just that one widget:
http://alexis.m2osw.com/contenteditable/ContentEditableProblemSimplified.html
IMPORTANT NOTE: On first load, it works properly because Firefox does not check the spelling of words at that time. You have to click in that field once and the effect happens.
This happens in FireFox v53.0, it has been like that for a while though.
This is driving me absolutely, !&&%&$ insane... it defies everything that I can think of.
THIS character right here... " "
In between these quotes... open google chrome and inspect. You will see its a ... normal right? Now right click and actually view the source of this stack overflow page. It's a regular space... (also, the character I copied was an actual space).
I could understand if it's some kind of rich text editor or something, but in the raw html source is a regular space, so what gives?
Here's just with hitting the space key (which works fine)... " ".
You can even copy it and paste it everywhere and wreak havoc and make chrome put everywhere. Even though whats copied in your clipboard is just a SPACE.
I have these stupid characters show up everywhere randomly in my website and I have no idea where they come from, or WHY is google converting a SPACE into a nbsp;
I have tried inspecting the actual character code and it's a regular space from all things I can find...
Every single method I try shows it as a NORMAL space... so what gives?
If i use ruby and do " ".ord I get 32. If i do it with the broken space I also get 32.
Please help me im losing my mind.
edit: you can prove this... view source on this page and you will see two empty " " like normal. Now look in console and only the one will be a , yet the raw source is identical.
Image for people not using chrome (this is looking at this very post via chrome dev tools):
Here's the HTML of the same text you see when you view source... no nbsp to be found.
When I view this page's source in Internet Explorer, or download it directly from the server and view it in a text editor, the first space character in question is formatted like this in the actual HTML:
THIS character right here... " "
Notice the entity. That is Unicode codepoint U+00A0 NO-BREAK SPACE. Chrome is just being nice and re-formatting it as when inspecting the HTML. But make no mistake, it is a real non-breaking space, not Unicode codepoint U+0020 SPACE like you are expecting. U+00A0 is visually displayed the same as U+0020, but they are semantically different characters.
The second space character in question is formatted like this in the actual HTML:
<p>Here's just with hitting the space key (which works fine)... <code>" "</code>.</p>
So it is Unicode codepoint U+0020 and not U+00A0. Viewing the raw hex data of this page confirms that:
It turns out the two seemingly identical whitespace characters are not the same character.
Behold:
var characters = ["a", "b", "c", "d", " "];
var typedSpace = " ";
var copiedSpace = " ";
alert("Typed: " + characters.indexOf(typedSpace)); // -1
alert("Copied: " + characters.indexOf(copiedSpace)); // 4
alert(typedSpace === copiedSpace); // false
JSFiddle
typedSpace.charCodeAt(0) returns 32, the classic space. Whereas copiedSpace.charCodeAt(0) returns 160, the   AKA character.
The difference between the two is that a whole bunch of repeated after one another will hold their ground and create additional space between them, whereas a whole bunch of repeated characters will squish together into one space.
For instance:
A B results in: A B
A B results in: A B
To convert the character with a character in a string, try this:
.replace(new RegExp(String.fromCharCode(160),"g")," ");
To the people in the future like myself that had to debug this from a high level all the way down to the character codes, I salute you.
Don't get yer knickers in a knot. It's one of those special html characters that we old-school love because we was tort rite.
For many of us, we were taught that a sentence started with a capital letter and ended with a full-stop. But the next sentence is separated from this by TWO spaces.
Good-ol'-HTML doesn't like space(s). If you enter a string of words with 5 spaces between them (using an unintelligent editor like MS Notepad, then html shows it with single spaces.
SO, to get it looking like we old-farts like, we end a sentence with '.&NbSp; Next' This puts two spaces after the full-stop, and looks like '. Next' rather than '. Next'.
Next point is that the real space (32) works as a linebreak, so that's good.
EXCEPT for we old-farts, who HATE to see our name split across a linebreak. That annoys us NO-END.
But, of course, that's where &NbSp; comes in handy again. If you enter 'John&NbSp;Brown', then the html thinks that's a single word, and it displays it just rite for we oldies.
How do these &NbSp; thingies get there? Well, good old Word (and I suspect many intelligent editors) see two spaces and output them as a non-breaking space followed by a normal space.
And when in Word, you can insert a non-breaking space between John and Brown by the key sequence alt-ctrl-space (sorry, you apple-users)
Lesson-over (with the exception that the term &NbSp; needs to be all lowercase - THIS viewer was even converting it)
It is a non breaking space. is the entity used to represent a non-breaking space. It is essentially a standard space, the primary difference being that a browser should not break (or wrap) a line of text at the point that this occupies.
Most likely the character is being inserted by your HTML Editor. Could you give a more specific example in context?
This is not actually an answer to the question but instead a tool that can be used to detect this special white space in the html of the pages of a website so we can proceed to locate and remove it.
The tool what basically does is:
Fetches the content of a URL
Looks for occurrences of chr(194).chr(160) in the HTML contents
Replaces and highlights the ocurrences with something more visible
This way you can actually know where the spaces are and edit your page properly to remove them.
The online version of the tool can be found here:
http://tools.heavydots.com/nbsp-space-char-detect/
A working example can be seen with the url of this question that contains one ocurrence:
http://tools.heavydots.com/nbsp-space-char-detect/?url=http%3A%2F%2Fstackoverflow.com%2Fquestions%2F26962323%2Fwhat-is-this-insane-space-character-google-chrome&highlight=1&hstring=%7BNBSP%7D
There's a Github repo available if someone wants the code to run it locally:
https://github.com/HeavyDots/nbsp-space-char-detect
Hope someone finds it useful, for any feedback there's a comments section on the tool's page.
Updated 5th of January 2017
At our company blog we just wrote a funny post about this annoying white space. You're invited to drop by and read it! :-)
http://heavydots.com/blog/when-the-white-space-became-a-beast
As the previous answers have mentioned, it's a non-breaking space (nbsp). On Macs, this character gets inserted when you accidentally press Alt + Space (most of the time, this happens when entering code that requires Alt for special characters, e.g. [ on a German keyboard layout).
To remap this key combination to a plain ol' SPACE character, you can change your default keybinding as suggested on Apple SE
For whitespace, Press "Alt+0160" which is a character also.
For some reason on just one of my pages I get an empty space that I believe is generated by the bootstrap's library.
I am including the page into a section <section class="container col-md-8"> from another page, and the empty line shows up at the start of the section. On the F12 inspector it is seen as " " and when I select to edit it as HTML, a small red dot appears. What might be the reason for it ? And it is generated only on this page, every other included page in the section is fine.
Here are some screenshots:
Answer from #Holt was provided in a comment (posting as an answer so you can accept it to close the question!)
Your file is encoded in UTF-8 with BOM, change the encoding to UTF-8 without BOM
So it relates to the Byte Order Mark rendering in Chrome, and is a similar issue to:
Invisible character rendered between Twig includes
Chrome apparently shows those dots also in cases like zero-width-space characters, such as in the jsfiddle here
I'm afraid this is highly specific, so please bear with me and read carefully.
The problem:
Open a PDF file, select and copy some text that contains line breaks and paste it into a TinyMCE textarea in the Google Chrome browser. Then delete any line break and insert a space at the same point: the space that is added is non-breaking even though I used a regular "space bar" key stroke in TinyMCE.
How do I know there is a non-breaking space?
You can click the "show invisible characters" button on the first row of my TinyMCE implementation (see link below). Remember that with TinyMCE your must turn that option Off and On again every time you modify the text to see the changes.
The non-breaking spaces will appear in orange, normal spaces appear normally.
What I have found so far:
If I delete the character that comes after the line break and then type that character again, I can insert a normal space. The problem seems to be attached to that character.
If I delete the character occuring before the line break, the problem persists, i.e. when I delete the space and type a new space it is still a non breaking space.
Also when I save the text to the MySQL database, and read it again in TinyMCE, the problem still occurs, which reinforces my impression that the "hidden" character is attached to the letter following the line break (there is no saving on the test page of course).
Replicating it
You could of course try it yourself, but here is my testbed for you: http://www.roseback.com/test/tinymce4.html
I have tested it with many PDF files that we receive from graphic designers, from many products and eras. These PDFs are the files that are used for printing and there is no problem with those files for that use.
I uploaded a sample file here: http://www.roseback.com/test/languedoc.pdf. Test with the first paragraph starting with "Ce film exceptionnel".
However I have also tested random PDF files from the web and replicated the problem every time. So if you try with your own files and can't replicate, that might be interesting.
Environment:
Web page: the page is in HTML5, in UTF-8.
On the original page, the page is served via PHP and the textarea content comes from a MySQL 5.1 DB. The DB connection is set to UTF-8 in PHP, the content of the table and of the text field is in utf8_unicode_ci
On the test page there is no content and no saving, so no DB is involved.
Browser: Chrome. Does not happen in Firefox or Opera (not tested elsewhere)
TinyMCE: version 3 and version 4 (both standard version, not jQuery)
OS: on Windows 7 Pro 64 bit and also on Windows XP Pro 32 bit
I would appreciate any feedback, even simple confirmation / replication of the problem.
Hmm, i think what you observe has something to do with the fact that tinymce inserts non breaking spaces instead of spaces. Tinymce needs to so this in order to avoid that the browser shows more than one space concurrently entered as one single space (this is the default browser behaviour).
You can verify this by inserting more than one space and then have a look at the non-visible characters.
When I use Chrome, Firefox or Opera i have no problem with my website under Desktop computer, but when I use default Android browser (also on Google search preview), right menu does not show up. I checked on W3 validator website, but for index page, it says it cannot be checked:
http://volkangezer.scienceontheweb.net/index.php
For another page:
http://volkangezer.scienceontheweb.net/iletisim.php?dil=en
It shows some errors, but probably they are not the reason for this problem.
My first question is why my index.php page cannot be checked? The both pages have exactly the same encoding and include files.
Second question is, why right menu does not show up?
Thank you.
The validator tells you why it can't check it:
Sorry, I am unable to validate this document because on line 350 it contained one or more bytes that I cannot interpret as utf-8 (in other words, the bytes found are not valid values in the specified Character Encoding). Please check both the content of the file and the character encoding indication.
The error was: utf8 "\xC4" does not map to Unicode
In other words, either your file is screwed up or it is encoded using a character encoding that doesn't match the one it claims to use.
See Character encodings for beginners and the documents it links to for more information on the subject.