Are there any placeholder numbers in a programming context? - language-agnostic

The terms foobar, foo, bar, baz and
qux are sometimes used as placeholder
names (also referred to as
metasyntactic variables) in computer
programming or computer-related
documentation.
...As stated here.
Are there any placeholders for numbers?
Update:
To be more clear, here are some examples where it would be useful to have placeholder numbers:
Credit card numbers
Licence plates
Phone numbers
Bar codes (the actual numbers)
Freestyle (any length and any numbers)

This Wikipedia article lists many magic numbers, in particular the famous DEADBEEF. However, it's often a bad practice to use magic numbers since they don't have a common meaning, so they aren't used as often as 'foo' or 'bar' (except, maybe, hex addressed like DEADBEEF). I tend to use all zeros or 1234567890 for things like credit card numbers / phone numbers in my tests. Occasionally I use 42 just for fun but only even then I make sure it's clear that it's a joke.

I don't know if there are any standard ones, but..
You could do 1337..
Or 42 (the meaning of life)

Related

Force screen reader to read numbers as individual digits

I'm using voice over and I'm trying to get it to read numbers as individual digits, for example if I have inputted 2000, voice over will read out "two thousand". I want the desired behaviour to read out "two zero zero zero".
my current input element looks something like this
<input class="some-class" id="some-id" name="some-name"
type="text">
I have tried setting the type attribute to type="number", type="tel" and adding a style attribute equal to style="speak:spell-out", but non of them worked.
When I separate the numbers with whitespace like value="2 0 0 0" it works, but of course, you can't expect the user to do this.
I understand there may be a way to do this using javascript, but the solution can not contain javascript in the browser due to business requirements.
Any suggestions?
You shouldn't try to force a particular pronunciation or digit grouping.
Add spaces if grouping has a particular importance or meaning.
Take the base principle that numbers shouldn't be read differently by a screen reader than it is presented.
If digits must be separated in a particular way, add spaces, dots, dashes or another separation character.
Conversely, if there's no spaces, there's no special need to absolutely read a number digit by digit.
That's quite simple.
You shouldn't force the screen reader to read something in the way you view it yourself.
Other people may not have the same vision as you. Concerning numbers, some people will prefer to read digit by digit, but others will prefer having them grouped by two, three or four, for their ease of reading, writing and memorizing. Their screen reader is normally configured accordingly.
If a given grouping is important, then groups must be separated with spaces or other characters. If there's no separations, then it implicitly means that grouping has no particular importance.
Note that screen readers always give the possibility to read numbers digit by digit, if the user wish to do so. It is usually not the default.
Reading numbers digit by digit is usually done only for very big numbers (billions), or when mixing digits and letters.
Additionally consider that:
Different screen reader users have different preferences, and accessibility speaking, it's generally a bad idea to go against preferences or common defaults
There are several screen readers, and a lot of different voices in many languages; all potentially behave in slightly different way when reading numbers, and any small change in order to tweak it might create more problems than solve.
Screen reader users are used to pronunciation quirks, and they can fix them using personal dictionary
Screen readers are nowadays not that bad on deciding whether they need to read numbers at a hole, in groups, or digit by digit.
So, avoid deciding a particular grouping or pronunciation. It's a bad idea, and anyway technically perilous.
I understand there may be a way to do this using javascript, but the solution can not contain javascript in the browser due to business requirements.
You tried HTML and CSS and you can't use Javascript. Screenreaders use the Accessibility tree. They do not use CSS, there's no instruction to tell them to spell the text. They might choose to spell some abbreviations while reading some acronym. This is screenreader choice.
Screenreader users are used to ear numbers as strangely as they come and if they want them to be read char by char they have the appropriate shortcut to willingly spell them.

Does 1 English letter = 1 Chinese character?

I am a UX designer and we are working on a product where there needs to be a text input field for the user to insert their note. There needs to be a word limit indication, whether they're typing in Traditional Chinese or English.
So my question is:
If the character limit is 15, am I correct to say:
I am in Sweden (11/15 characters)
我在瑞典 (4/15 characters)
I was told that 1 Chinese character counts as 2-byte code and 1 English letter counts as 1-byte. How does this affect the character limit? I want to make sure my design is clear as possible for the developers.
So it’s about display size, right? Counting words won’t be useful in that case because a word can be as long as you want.
Counting characters is marginally more useful, but also doesn’t guarantee that the message will fit in the end because different characters have different widths. Just as an example, these four strings all consist of five characters each:
“​​​​​”
“     ”
“WWWWW”
“﷽﷽﷽﷽﷽”
There really is no elegant way to solve this. You’d need to know the precise metrics of the font you’re using and then calculate the visual width of each input.
If you’re fine with a “close enough” solution, you can just use the <input> element’s maxlength attribute. HTML and JavaScript count UTF-16 code units, however, which means that characters in the so-called Basic Multilingual Plane count as 1 and everything else counts as 2.
The Basic Multilingual Plane contains 99% of all characters in common, present-day use, so the vast majority of users probably won’t notice anything wrong. You could do something fancier with JavaScript, but I reckon it’s not really necessary for this kind of task.
Just keep in mind that this approach still won’t guarantee that the user’s input will fit visually on the print-out unless you leave a lot of empty room just in case. Definitely play around with some narrow and wide characters to see how much space they really take up when printed.

Hyphenating arbitrary text automatically

What kinds of challenges are there facing automatic hyphenation? It seems that you could just draw word by word, breaking when the length of the line exceeds the length of the viewport (or whatever we're wrapping our text in), placing hyphens after as many characters as can fit (provided at least two characters fit and the word is at least four characters), skipping words that already contain a hyphen (there's no requirement that words have to be hyphenated).
But I note how Firefox and IE need a dictionary to be able to hyphenate with CSS's hyphens. This seems to imply that there are further issues regarding where we can place hyphens.
What kinds of issues are these? Do any exist in the English language or do they only exist in other languages?
You have these issues in all languages. You can only place a hyphen where meaningful tokens result from the split, as has already been pointed out. You don't want to, for example, split a word like "wr-ong".
This may or may not be a syllable, while in most languages (including English) it is. But the main point is that you cannot pin it down as easily just with some simple rules. You would need to consider a lot of phonology to get a highly accurate result, and these rules vary from language to language.
With this background, I can see why one would take a dictionary instead, and frankly, being a computational linguist myself, this is also what I would probably opt for.
If you DO want to go for an automatic solution, I would recommend doing some research in English phonology of syllables, or the so-called syllabification. You might want to start with this article on Wikipedia:
Wikipedia - Syllabification

Pitfalls when performing internationalization / localization with numbers?

When developing an application that will need to work with a variety of localizations, particularly with "right to left" text, is there a possibility of a case where numbers would need to be converted to "right to left" as well?
I'm no language scholar, but I know the RTL languages I am familiar with present their numbers in LTR.
For instance (using google translate):
I have 345 apples.
In Arabic:
لدي 345 التفاح.
So, I have two questions:
Is it possible to run into a language that uses RTL numbers?
How should internationalizing be handled in such cases?
or,
Is the "accepted norm" to just do numbers using Western Arabic characters, read from left to right?
In the big right-to-left scripts - Arabic, Hebrew and Thaana - numbers always run left to right. (When I say "Arabic", I refer to all the languages that are written in the Arabic script - Arabic, Farsi, Urdu, Pasto and many others.)
Hebrew and Thaana always use European digits, the same 0-9 set as English. There's nothing much to do there, because Unicode automatically takes care of ordering the numbers correctly. But see the comments about isolation below.
It's possible to use European digits in Arabic, too; for example, the Arabic Wikipedia uses them. However, very frequently Arabic texts use a different set of digits - https://en.wikipedia.org/wiki/Eastern_Arabic_numerals . It depends on your users' preferences. Notice also, that in the Persian language the digits are slightly different. From the point of view of right-to-left layout they behave pretty much the same way as European digits, although there are slight differences in the behavior of mathematical signs - for example, the minus can go on the other side. There are some subtleties here, but they are mostly edge cases.
In both Hebrew and Arabic you may run into a problem with bidi-isolation. For example, if you have a Hebrew paragraph in which you have an English word, and after the word you have numbers, the numbers will appear to the right of the word, although you may have wanted them to appear on the left. That's how the Unicode bidi algorithm works by default. To resolve such things you can use the Unicode control characters RLM and LRM. If you are using HTML5, you can also use the <bdi> tag for this, as well as the CSS rule "unicode-bidi: isolate". These CSS and HTML5 solutions are quite powerful and elegant, but aren't supported in all browsers yet.
I am aware of one script in which the digits run right-to-left: N'Ko, which is used for some languages of Africa. I actually saw websites written in it, but it is far less common than Hebrew and Arabic.
Finally, if you're using JavaScript, you can use the free jquery.i18n library for automatic number conversion. See https://github.com/wikimedia/jquery.i18n . (Disclaimer: I am one of this library's developers.)
Numbers will generally translate as you have them. Even in languages that read in different directions the Western Arabic numbers are typically recognized by the user.

Can an ID attribute start with colon?

Was viewing the source of Gmail for purely academic purposes and I came across this.
<input id=":3f4"
name="attach"
type="checkbox"
value="13777be311c96bab_13777be311c96bab_0.1_-1"
checked="">
Wonder of wonders, most elements have ids that starts with a :
I always thought the definition for ID attribute was this.
ID and NAME tokens must begin with a letter ([A-Za-z]) and may be
followed by any number of letters, digits ([0-9]), hyphens ("-"),
underscores ("_"), colons (":"), and periods (".").
Or am I missing anything new? I mean is that OK with HTML5?
Permissibility
It is allowed in the latest working draft: http://www.w3.org/TR/html-markup/datatypes.html#common.data.id
Any string, with the following restrictions:
- must be at least one character long
- must not contain any space characters
The spec also notes:
Previous versions of HTML placed greater restrictions on the content
of ID values (for example, they did not permit ID values to begin with
a number).
The definition you quoted appears in the HTML 4 spec.
There is a widely-visited SO thread which visits some of considerations regarding IDs (mainly from an HTML 4 perspective).
Rationale
After thinking about it more, I realized that there are two good questions here:
Why does the spec allow this?
IDs which can contain any character have the potential to break all sorts of things, such as CSS selectors (if proper escaping is not used), Sizzle (which jQuery uses) pattern matches, server IDs (such as ASP.Net web forms use) and IDs which are generated from model properties (such as one might do with a MVC pattern).
All those things aside, I believe a key goal of HTML 5 was to not create restrictions that weren't absolutely necessary (which was a shortcoming of XHTML). Just because a purpose hasn't been identified for something yet doesn't mean that it won't be in the future.
Despite the many things which won't work, certain things work just fine, for example document.getElementById(":foo")
http://jsfiddle.net/Xjast/
As with most things, it is up to the developer to be knowledgeable of the tools that he or she is using.
Why does Google do this?
Obviously this can't be answered conclusively unless you are part of the Gmail team. However, Google heavily minimizes and obfuscates their code; they also manage a huge amount of script, which suggests well-defined conventions.
Here's another thought. What if Google is leveraging the fact that CSS selectors require escaping of certain characters? This would go a long way towards reducing accidental restyling of content contained in an email message.