Force screen reader to read numbers as individual digits - html

I'm using VoiceOver and I'm trying to get it to read numbers as individual digits. For example, if I have entered 2000, VoiceOver reads out "two thousand". The desired behaviour is for it to read out "two zero zero zero".
My current input element looks something like this:
<input class="some-class" id="some-id" name="some-name"
type="text">
I have tried setting the type attribute to type="number" and type="tel", and adding a style attribute of style="speak:spell-out", but none of them worked.
When I separate the numbers with whitespace like value="2 0 0 0" it works, but of course, you can't expect the user to do this.
I understand there may be a way to do this using javascript, but the solution can not contain javascript in the browser due to business requirements.
Any suggestions?

You shouldn't try to force a particular pronunciation or digit grouping.
Add spaces if grouping has a particular importance or meaning.
Take as the base principle that numbers shouldn't be read by a screen reader differently from how they are presented.
If digits must be separated in a particular way, add spaces, dots, dashes or another separation character.
Conversely, if there are no spaces, there's no special need to read a number digit by digit.
That's quite simple.
You shouldn't force the screen reader to read something in the way you view it yourself.
Other people may not have the same vision as you. Concerning numbers, some people will prefer to read digit by digit, but others will prefer having them grouped by two, three or four, for their ease of reading, writing and memorizing. Their screen reader is normally configured accordingly.
If a given grouping is important, then groups must be separated with spaces or other characters. If there are no separators, then it implicitly means that grouping has no particular importance.
Note that screen readers always give the possibility to read numbers digit by digit, if the user wishes to do so. It is usually not the default.
Reading numbers digit by digit is usually done only for very big numbers (billions), or when mixing digits and letters.
Additionally consider that:
Different screen reader users have different preferences, and from an accessibility standpoint it's generally a bad idea to go against preferences or common defaults
There are several screen readers, and a lot of different voices in many languages; all potentially behave in slightly different ways when reading numbers, and any small change made to tweak it might create more problems than it solves
Screen reader users are used to pronunciation quirks, and they can fix them using a personal dictionary
Screen readers are nowadays not that bad at deciding whether they need to read numbers as a whole, in groups, or digit by digit
So, avoid deciding on a particular grouping or pronunciation. It's a bad idea, and technically perilous anyway.

I understand there may be a way to do this using javascript, but the solution can not contain javascript in the browser due to business requirements.
You tried HTML and CSS, and you can't use JavaScript. Screen readers use the accessibility tree. They do not use CSS, so there's no instruction that tells them to spell out the text. They might choose to spell out some abbreviations or acronyms; that is the screen reader's choice.
Screen reader users are used to hearing numbers read however strangely they come out, and if they want them read character by character they have a shortcut to spell them out on demand.

Related

How to ensure a Screen reader announces capital letters

I'm working on an email that is sent out to customers. The email has an access code that is a mix of Upper and Lower case letters. The problem we are facing is that even with the setting on for reading capital letters as Cap the screen reader does not announce the case of the letters and reads them in a continuous flow. For example, we want it to read the following:
Your one-time access code: U8dFyX
How do I make sure that it says that U, F, X are capital letters?
As a general rule you should not try to influence how a screen reader pronounces things. (#QuentinC has covered this in detail)
However in user testing we found that less experienced screen reader users struggled differentiating between letters by pitch (especially elderly screen reader users who also had a slight hearing impairment).
What we implemented was as follows:-
Display the activation code as normal.
After the activation code have a visually hidden section.
In this section we put 'for clarity your code contains upper and lower case letters as follows:'
We then added each letter to a span that explains what the letter is.
e.g.
<div class="visually-hidden">
<p>for clarity your code contains upper and lower case letters as follows:</p>
<span>Upper Case U</span><span>Number 8</span><span>Lower Case D</span> <!-- etc. -->
</div>
It improved accuracy and recognition for everyone, even more experienced screen reader users benefited in a simulated noisy environment.
There is a problem with this that we eventually worked around.
The site was translated into multiple languages, so we had to translate 'Upper Case' and 'Lower Case' as part of i18n as it was causing confusion.
We eventually switched to a longer string of lower case characters only and made it case insensitive. (8 lower case characters is better than 6 alphanumeric characters anyway: 26^8 = 208,827,064,576 combinations vs 62^6 = 56,800,235,584 combinations, or 64^6 = 68,719,476,736 with a 64-symbol alphabet)
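The arithmetic behind that comparison can be checked with a few lines of Python. This is only a sketch: the alphabet sizes are assumptions (26 lowercase letters, 62 for a-z, A-Z and 0-9, 64 for a base64-style alphabet), and it assumes every character is picked uniformly at random.

```python
# Size of a code space = alphabet_size ** code_length.
# The alphabet sizes below are assumptions made for illustration.
LOWERCASE = 26      # a-z
ALPHANUMERIC = 62   # a-z, A-Z, 0-9
BASE64_STYLE = 64   # a-z, A-Z, 0-9 plus two extra symbols


def code_space(alphabet_size: int, length: int) -> int:
    """Number of distinct codes of the given length."""
    return alphabet_size ** length


if __name__ == "__main__":
    print(f"{code_space(LOWERCASE, 8):,}")     # 208,827,064,576
    print(f"{code_space(ALPHANUMERIC, 6):,}")  # 56,800,235,584
    print(f"{code_space(BASE64_STYLE, 6):,}")  # 68,719,476,736
```

Whichever 6-character alphabet you pick, the 8-character lowercase code space comes out larger.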
As email isn't secure it didn't really matter for anything other than reducing brute force attempts. If you are using it for security then stop!
A better solution
We replaced the above solution after around a year.
Instead we just sent a link within the email that automatically entered the code.
As you mentioned, it is a one-time code, so just apply the same logic to a single-use link, e.g. yoursite.com/code/aU4383H483kdj483Jfdfsk3UF
This way you can have a 256-bit code if you want.
It has the added benefit of not reading an access code out loud in a public space.
As access codes do not offer any security benefit (unless you send them via a second medium such as SMS or WhatsApp message) you might as well send a single use link.
The final benefit of this is that your non screen reader users do not need to enter a code either so it makes it better for everyone!
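For what it's worth, generating such a single-use link is short on the server side. The following is a minimal sketch, not what we actually ran: `BASE_URL` and `make_single_use_link` are hypothetical names, and storing, expiring and invalidating the token after first use is left out.

```python
import secrets

# Hypothetical base URL, modelled on the example path in the answer above.
BASE_URL = "https://yoursite.com/code/"


def make_single_use_link(base_url: str = BASE_URL) -> str:
    """Return a link ending in a random, URL-safe token with 256 bits of entropy."""
    token = secrets.token_urlsafe(32)  # 32 random bytes = 256 bits
    return base_url + token


if __name__ == "__main__":
    # The server would still need to store the token, expire it, and
    # invalidate it after the first successful use (not shown here).
    print(make_single_use_link())
```

`secrets` is used rather than `random` because the token has to be unpredictable.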
You're better off not trying to control this yourself.
When reading generated passwords, activation codes and the like, screen reader users are used to paying attention and reading carefully character by character, so it shouldn't be a big problem.
It's even less of a problem for screen reader users than it is for partially sighted users, who may have difficulty distinguishing 0, O and o, or 1, I and l.
That's by the way why it is recommended to avoid including 0, O, o, 1, I and l in generated passwords and activation codes.
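If you generate the codes yourself, leaving those characters out is straightforward. A minimal sketch, assuming a letters-and-digits alphabet; `AMBIGUOUS`, `ALPHABET` and `generate_code` are illustrative names, not part of any library:

```python
import secrets
import string

# Characters that are easy to confuse visually: zero/O/o and one/I/l.
AMBIGUOUS = set("0Oo1Il")

# Assumed alphabet: ASCII letters and digits minus the ambiguous ones.
ALPHABET = "".join(
    c for c in string.ascii_letters + string.digits if c not in AMBIGUOUS
)


def generate_code(length: int = 8) -> str:
    """Return a random code drawn only from the unambiguous alphabet."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


if __name__ == "__main__":
    print(len(ALPHABET))    # 56 characters remain out of 62
    print(generate_code())
```

Dropping six characters shrinks the alphabet a little, so you may want one extra character of length to compensate.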
You'd better not try to control how it is read, because different screen readers have different ways of reading such strings and announcing capitals.
For example when reading character by character, in order to distinguish upper from lower case letters, the screen reader may:
Say the letter with a higher pitch (the most common)
Say "Capital"
Produce a sound before or after announcing the letter, for example a small beep
The way screen readers split into words when reading such strings as a whole word is heavily screen reader and voice specific.
Most split at capital boundary, some ignore case, some produce sounds, and it may be customized via screen reader or speech synthesizer options.
So, you're better off not trying to control yourself how such strings are or should be pronounced.
Simply let the users do as they are used to do.

Does 1 English letter = 1 Chinese character?

I am a UX designer and we are working on a product where there needs to be a text input field for the user to insert their note. There needs to be a word limit indication, whether they're typing in Traditional Chinese or English.
So my question is:
If the character limit is 15, am I correct to say:
I am in Sweden (11/15 characters)
我在瑞典 (4/15 characters)
I was told that 1 Chinese character counts as a 2-byte code and 1 English letter counts as 1 byte. How does this affect the character limit? I want to make sure my design is as clear as possible for the developers.
So it’s about display size, right? Counting words won’t be useful in that case because a word can be as long as you want.
Counting characters is marginally more useful, but also doesn’t guarantee that the message will fit in the end because different characters have different widths. Just as an example, these four strings all consist of five characters each:
“​​​​​”
“     ”
“WWWWW”
“﷽﷽﷽﷽﷽”
There really is no elegant way to solve this. You’d need to know the precise metrics of the font you’re using and then calculate the visual width of each input.
If you’re fine with a “close enough” solution, you can just use the <input> element’s maxlength attribute. HTML and JavaScript count UTF-16 code units, however, which means that characters in the so-called Basic Multilingual Plane count as 1 and everything else counts as 2.
The Basic Multilingual Plane contains 99% of all characters in common, present-day use, so the vast majority of users probably won’t notice anything wrong. You could do something fancier with JavaScript, but I reckon it’s not really necessary for this kind of task.
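To get a feel for how that counting behaves, here is a small sketch that reproduces it outside the browser. It assumes the limit is measured in UTF-16 code units as described above; `utf16_units` and `code_points` are just illustrative helper names.

```python
def utf16_units(text: str) -> int:
    """Count UTF-16 code units (what maxlength and JavaScript's .length count)."""
    return len(text.encode("utf-16-le")) // 2


def code_points(text: str) -> int:
    """Count Unicode code points (what Python's len() counts)."""
    return len(text)


if __name__ == "__main__":
    # English sentence, Traditional Chinese sentence, one emoji outside the BMP.
    for sample in ["I am in Sweden", "我在瑞典", "😀"]:
        print(utf16_units(sample), code_points(sample))
```

"I am in Sweden" counts as 14 here because the spaces count too, "我在瑞典" counts as 4, and the emoji counts as 2 code units even though it is a single code point.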
Just keep in mind that this approach still won’t guarantee that the user’s input will fit visually on the print-out unless you leave a lot of empty room just in case. Definitely play around with some narrow and wide characters to see how much space they really take up when printed.

Hyphenating arbitrary text automatically

What kinds of challenges are there facing automatic hyphenation? It seems that you could just draw word by word, breaking when the length of the line exceeds the length of the viewport (or whatever we're wrapping our text in), placing hyphens after as many characters as can fit (provided at least two characters fit and the word is at least four characters), skipping words that already contain a hyphen (there's no requirement that words have to be hyphenated).
But I note how Firefox and IE need a dictionary to be able to hyphenate with CSS's hyphens. This seems to imply that there are further issues regarding where we can place hyphens.
What kinds of issues are these? Do any exist in the English language or do they only exist in other languages?
You have these issues in all languages. You can only place a hyphen where meaningful tokens result from the split, as has already been pointed out. You don't want to, for example, split a word like "wr-ong".
Such a token may or may not be a syllable, although in most languages (including English) it is. But the main point is that you cannot pin it down with a few simple rules. You would need to consider a lot of phonology to get a highly accurate result, and these rules vary from language to language.
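To make the problem concrete, here is a minimal sketch of the purely mechanical rule from the question (at least two characters before the hyphen, words of at least four characters, skip words that already contain a hyphen). `naive_hyphenate` is a hypothetical helper written for illustration; it knows nothing about syllables.

```python
from typing import Optional, Tuple


def naive_hyphenate(word: str, room: int) -> Optional[Tuple[str, str]]:
    """Split `word` so that the head plus a hyphen fits in `room` columns.

    Returns (head_with_hyphen, tail), or None if the word is left unsplit.
    """
    if len(word) < 4 or "-" in word:
        return None
    # Reserve one column for the hyphen and keep at least one character
    # for the next line.
    head_len = min(room - 1, len(word) - 1)
    if head_len < 2:
        return None
    return word[:head_len] + "-", word[head_len:]


if __name__ == "__main__":
    print(naive_hyphenate("wrong", 3))       # ('wr-', 'ong')
    print(naive_hyphenate("well-known", 5))  # None
```

With three columns left on the line it happily produces "wr-" and "ong", which is exactly the kind of split a dictionary or syllabification rules are there to prevent.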
With this background, I can see why one would take a dictionary instead, and frankly, being a computational linguist myself, this is also what I would probably opt for.
If you DO want to go for an automatic solution, I would recommend doing some research into the phonology of English syllables, known as syllabification. You might want to start with this article on Wikipedia:
Wikipedia - Syllabification

Accessibility & HTML title tag separators - alternatives to the vertical line (pipe)

I am attempting to make a site a bit more screen reader friendly and in testing I noticed that a common pattern is quite annoying on a screen reader - the site is using a vertical line / pipe character as a separator in the <title> tag (e.g. <title>Page Name | Site Name</title>). When I use VoiceOver as a screen reader to do testing it is read as "Page Name Vertical Line Site Name" which sounds especially odd with the particular title of the site.
What are the best accessible alternatives to the pipe that also have no negative effect on SEO? I've tried a <title>Page Name - Site Name</title> and <title>Page Name · Site Name</title> and they work okay, but I'm afraid they might have gotchas (e.g. reading as 'dash' or 'ampersand m i d d o t semicolon') in some edge case, or cause chaos with SEO. Is there an accepted best practice for this?
The pronunciation of punctuation or special characters varies by screen reader, so there is no optimal choice. While it is true that “vertical line” sounds odd, it’s an oddity that screen reader users are accustomed to, since the “|” is widely used (not so much in title elements, but rather in link lists and other contexts). The use of an en dash “–” might help, as it is a normal punctuation character and might be just ignored or even handled in an advanced way (e.g., a pause followed by raised tone). On the other hand, a comma “,” or a colon “:” might do the same thing, or do better.
It is very unlikely that such choices have any impact on SEO, since search engines generally ignore punctuation and special characters. (They might notice some special characters in some contexts, e.g. distinguishing between C and C++.)
Depending on context and context language, you could also consider using purely verbal expressions, e.g. in English using “of” instead of a separator character. “New products of ACME Corporation” sounds better than “New products | ACME Corporation” (though the latter is in no way wrong). This may have a minor impact on SEO, since search engines may treat even small words like “of” as significant; but this would not matter much, due to the way people write things in search boxes.
Using either a hyphen or a comma will have no effect on the site's SEO ranking.
But shorter titles often get more clicks from search (my own impression, backed by some data below), so consider whether you really need the site's name on every page.
Also, keep your title to around 55 characters for best results in Google. The character count is trickier than it used to be; see http://moz.com/blog/new-title-tag-guidelines-preview-tool for a detailed explanation of some recent changes by Google. The displayed length is actually determined by pixel width, not character count.
See this experiment on PPC CTR based on title lengths: http://danzarrella.com/ppc-ad-line-lengths-and-clickthrough-rates.html# (not SEO, but it correlates with my own experience in organic results).

What are the disadvantages of having tons of &nbsp; entities?

I've been writing a source-to-display converter for a small project. Basically, it takes an input and transforms the input into an output that is displayable by the browser (think Wikipedia-like).
The idea is there, but it isn't like the MediaWiki style, nor is it like the Markdown style. It has a few innovations of its own. For example, when the user types in a chain of spaces, I would presume he wants the spaces preserved. Since HTML ignores spaces by default, I was thinking of converting these chains of spaces into respective &nbsp;s (for example 3 spaces in a row converted to 1 entity).
So what happens is that I can foresee a possibility of a ton of &nbsp; tags per post (and a single page may have multiple posts).
I've been hearing a lot of anti-&nbsp; opinions on the web, but most of it boils down to readability headaches (in this case, the input is supplied by the user; if he decides to make his post unreadable he can do so with any of the other formatting actions supplied) or maintenance headaches (which is not an issue in this case, since it's a converted output).
I'm wondering what the disadvantages are of having tons of &nbsp; tags on a webpage?
You are rendering every space as &nbsp;?
Besides wasting so much bandwidth, this will not allow dynamic line breaking as "nbsp" means "*n*on *b*reaking *sp*ace". This will most probably cause much trouble.
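One common way around the line-breaking problem is to keep one ordinary space in every run, so the browser still has a place to wrap. The following is only a sketch of that idea, not the asker's converter: `preserve_space_runs` is a hypothetical helper, and it assumes the text has already been HTML-escaped.

```python
import re


def preserve_space_runs(text: str) -> str:
    """Turn each run of 2+ spaces into &nbsp; entities plus one normal space.

    The trailing normal space keeps a line-break opportunity in every run;
    single spaces are left untouched.
    """
    def replace_run(match):
        run_length = len(match.group(0))
        return "&nbsp;" * (run_length - 1) + " "

    return re.sub(r" {2,}", replace_run, text)


if __name__ == "__main__":
    print(preserve_space_runs("a   b"))  # a&nbsp;&nbsp; b
    print(preserve_space_runs("a b c"))  # a b c
```

This also keeps the entity count down, since ordinary single spaces between words are never converted.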
If it's just being dumped to a client, it's just a matter of size, and if it's gzipped, it barely matters in terms of network traffic.
It'll slow down rendering, I'm sure, and take up DOM space, but whether or not that matters depends on stuff I don't know about your use case(s). You might be able to achieve the same result in other ways, too; not sure.
&nbsp;s aren't tags, but are character entities like &copy;, &lt;, &gt;, etc.
I'd say that the disadvantages would be readability. When I see a word, I expect the spacing to be constant (unless it is in a block of justified text).
Can you show me a case where you'd need &nbsp;s?
Have you considered trying to figure out what the user, by inserting those spaces, is really trying to achieve? Rather than the how (they want to insert the spaces), the what (if the spaces are at the beginning of a line, they want to indent the text in question).
An example of this is many programming sites convert 4 spaces at the start of a line to a pre+code block.
For your purposes, maybe it should be a <block> block.
The end goal being that of converting the spaces not to what the user (with their limited resources) intended to show up there but, rather, what they meant to convey with it.