There is no code for the European Union in ISO 3166 or UN.M49. CLDR states that its own list does not contain a code for the European Union. I've seen the code "EU" used, but I can't find any official list that contains it. Is it in any official list of codes?
As it turns out, it is not in a list per se, but the code EU was officially "reserved" by the ISO 3166 Maintenance Agency to represent the European Union. This is discussed in an old version of the Maintenance Agency FAQ:
You can use EU for the name European Union. Please note that this is
not an official ISO 3166-1 country code. The European Union is not a
country but rather an organization. As such it is not eligible to be
formally included in ISO 3166-1. Recognizing, however, that many users
of ISO 3166-1 have a practical need to encode that name the ISO
3166/MA reserved the two-letter combination EU for the purpose of
identifying the European Union within the framework of ISO 3166-1.
This document is apparently no longer available, although parts of the statement are widely quoted on other Web sites (e.g. on Wikipedia).
Can someone explain to me what a context free grammar is? After looking at the Wikipedia entry and then the Wikipedia entry on formal grammar, I am left utterly and totally befuddled. Would someone be so kind as to explain what these things are?
I am wondering this because I wish to investigate parsing, and also on the side, the limitation of a regex engine.
I'm not sure if these terms are directly programming related, or if they are related more to linguistics in general. If that is the case, I apologize, perhaps this could be moved if so?
A context free grammar is a grammar which satisfies certain properties. In computer science, grammars describe languages; specifically, they describe formal languages.
A formal language is just a set (mathematical term for a collection of objects) of strings (sequences of symbols... very similar to the programming usage of the word "string"). A simple example of a formal language is the set of all binary strings of length three, {000, 001, 010, 011, 100, 101, 110, 111}.
Grammars work by defining transformations you can make to construct a string in the language described by a grammar. Grammars will say how to transform a start symbol (usually S) into some string of symbols. A grammar for the language given before is:
S -> BBB
B -> 0
B -> 1
The way to interpret this is to say that S can be replaced by BBB, and B can be replaced by 0, and B can be replaced by 1. So to construct the string 010 we can do S -> BBB -> 0BB -> 01B -> 010.
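As a rough sketch of the replacement process described above, the following Python snippet expands the leftmost non-terminal repeatedly until only terminals remain. The RULES table and the generate function are names made up for this illustration, and the approach only terminates for grammars that describe a finite language, like this one.

```python
from collections import deque

# The example grammar: each non-terminal maps to its possible replacements.
RULES = {
    "S": ["BBB"],
    "B": ["0", "1"],
}

def generate(start="S"):
    """Return every terminal string derivable from `start`.

    Only suitable for grammars that describe a finite language.
    """
    results = set()
    queue = deque([start])
    while queue:
        form = queue.popleft()
        # Find the leftmost non-terminal, if there is one.
        pos = next((i for i, ch in enumerate(form) if ch in RULES), None)
        if pos is None:
            results.add(form)  # only terminals left: a string in the language
            continue
        for replacement in RULES[form[pos]]:
            queue.append(form[:pos] + replacement + form[pos + 1:])
    return results

print(sorted(generate()))
# ['000', '001', '010', '011', '100', '101', '110', '111']
```

Because every rule here has a single non-terminal on its left side, each step only needs to look at one symbol at a time, without regard to its neighbours.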
A context-free grammar is simply a grammar where the thing that you're replacing (left of the arrow) is a single "non-terminal" symbol. A non-terminal symbol is any symbol you use in the grammar that can't appear in your final strings. In the grammar above, "S" and "B" are non-terminal symbols, and "0" and "1" are "terminal" symbols. Grammars like
S -> AB
AB -> 1
A -> AA
B -> 0
are not context free, since they contain rules like AB -> 1 that have more than one non-terminal symbol on the left.
Language theory is related to the theory of computation, which is the more philosophical side of computer science: it asks which programs are possible to write at all, and which types of problems no algorithm can ever solve.
A regular expression is a way of describing a regular language. A regular language is a language which can be decided by a deterministic finite automaton.
You should read the article on Finite State Machines: http://en.wikipedia.org/wiki/Finite_state_machine
And Regular languages:
http://en.wikipedia.org/wiki/Regular_language
All regular languages are context-free languages, but there are context-free languages that are not regular. A context-free language is the set of all strings generated by a context-free grammar or, equivalently, accepted by a pushdown automaton, which is a finite state machine with a single stack: http://en.wikipedia.org/wiki/Pushdown_automaton#PDA_and_Context-free_Languages
There are more complicated languages that require a Turing machine (any possible program you can write on your computer) to decide if a string is in the language or not.
Language theory is also very related to the P vs. NP problem, and some other interesting stuff.
My third-year Introduction to Computer Science textbook was pretty good at explaining this stuff: Introduction to the Theory of Computation, by Michael Sipser. But it cost me like $160 to buy it new, and it's not very big. Maybe you can find a used copy or a copy at a library; it might help you.
EDIT:
The limitations of Regular Expressions and higher language classes have been researched a ton over the past 50 years or so. You might be interested in the pumping lemma for regular languages. It is a means of proving that a certain language is not regular:
http://en.wikipedia.org/wiki/Pumping_lemma_for_regular_languages
If a language isn't regular it may be context free, which means it can be described by a context-free grammar, or it may be in an even higher language class. You can prove that it's not context free with the pumping lemma for context-free languages, which is similar to the one for regular languages.
A language can even be undecidable, which means even a Turing machine (any program your computer can run) can't be programmed to decide if a string should be accepted as in the language or rejected.
I think the parts you're most interested in are finite state machines (both deterministic and non-deterministic), to see what languages a regular expression can describe, and the pumping lemma, to prove which languages are not regular.
Basically a language isn't regular if it needs some sort of unbounded memory or the ability to count. The language of matching parentheses is not regular, for example, because the machine needs to remember how many parentheses it has opened to know how many it still has to close.
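To make the "ability to count" point concrete, here is a minimal Python sketch of a checker for matching parentheses (the function name balanced is just for illustration). The depth counter can grow without bound, and that is exactly the kind of memory a finite state machine does not have.

```python
def balanced(text):
    """Return True if every '(' in text is matched by a later ')'."""
    depth = 0  # number of parentheses currently open
    for ch in text:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:  # a ')' with nothing left to close
                return False
    return depth == 0  # nothing may remain open at the end

print(balanced("(()())"))  # True
print(balanced("(()"))     # False
print(balanced(")("))      # False
```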
The language of all strings using the letters a and b that contain at least three b's is a regular language; abababa is one string in it.
The language of all strings using the letters a and b that contain more b's than a's is not regular.
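The contrast between these two languages can be sketched in Python. The pattern below is one possible regular expression for "at least three b's" (there are others), while the "more b's than a's" test falls back on counting, which ordinary code can do but a finite state machine cannot. The names AT_LEAST_THREE_BS and more_bs_than_as are made up for this example.

```python
import re

# One possible regular expression for "at least three b's" over {a, b}.
AT_LEAST_THREE_BS = re.compile(r"[ab]*b[ab]*b[ab]*b[ab]*")

def more_bs_than_as(text):
    """Counting needs unbounded memory, so this is done in plain code."""
    return text.count("b") > text.count("a")

print(bool(AT_LEAST_THREE_BS.fullmatch("abababa")))  # True
print(bool(AT_LEAST_THREE_BS.fullmatch("abab")))     # False
print(more_bs_than_as("abb"))                        # True
print(more_bs_than_as("abababa"))                    # False
```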
Also you should note that all finite languages are regular, for example:
The language of all strings fewer than 50 characters long using the letters a and b that contain more b's than a's is regular. Since it is finite, we know it could be described as (b|abb|bab|bba|aabbb|ababb|...) etc., until all the possible combinations are listed.
How do I create a HTML link to an emergency number like 911 or 112?
The RFC says
The
phone number can be represented in either global or local
notation. All phone numbers MUST use the global form unless they
cannot be represented as such.
[Emergency numbers ("911", "112")] cannot be represented in global form and
need to be represented as a local number with a context.
From the local-context section I don't find it easy to understand what a "local-context" is, let alone what the correct one for this case is. It lists domain prefixes like houston.example.com or a numeric prefix like +1, and in one paragraph it says
A context consisting of the initial digits of a global number does
not imply that adding these to the local number will generate a valid
E.164 number. It might do so by coincidence, but this cannot be
relied upon. (For example, "911" should be labeled with the context
"+1", but "+1-911" is not a valid E.164 number.)
But the phrasing of this paragraph is again very confusing.
Is
112
now the correct way of doing it, and the fact that it is not a valid E.164 number is irrelevant?
Or is the fact that it is not a valid E.164 number a problem?
In some other places I see people using
112
And again other people recommend
112
But when I tap that link on Android, the dialer opens with the number
112;746632668398+49
Section 5.1.5 of the RFC, cited in the question, states
A context consisting of the initial digits of a global number does
not imply that adding these to the local number will generate a valid
E.164 number. It might do so by coincidence, but this cannot be
relied upon. (For example, "911" should be labeled with the context
"+1", but "+1-911" is not a valid E.164 number.)
I interpret this to mean that emergency numbers should be labeled with their country-specific prefixes as the context, i.e.
in the US, 911 should be used as 911
in Germany 112 should be used as 112
The rest of the paragraph is about this syntax not being compliant with the E.164 recommendation. As far as I understand, though, E.164 is irrelevant in this context.
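A minimal sketch of this interpretation, assuming the ";phone-context=" parameter syntax that the RFC defines for local numbers. The helper names below are hypothetical, and how a particular dialer treats the parameter is not guaranteed; the Android behaviour described in the question suggests that some dialers do not parse it.

```python
def emergency_tel_uri(number, context):
    """Build a tel: URI for a local number such as an emergency number.

    `context` is the phone-context value: a global-number prefix such as
    "+1", or a domain name such as "houston.example.com".
    """
    return f"tel:{number};phone-context={context}"

def html_link(number, context):
    """Wrap the URI in an HTML anchor that shows the bare number."""
    return f'<a href="{emergency_tel_uri(number, context)}">{number}</a>'

print(emergency_tel_uri("911", "+1"))
# tel:911;phone-context=+1
print(html_link("112", "+49"))
# <a href="tel:112;phone-context=+49">112</a>
```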
I'm not asking about a particular language, but just in general. I know that, for example, #0x or simply 0x is put before the number, or an h is placed after the number, to refer to hexadecimal.
Is there a similar "standard" for binary?
Most of the popular languages don't have a way to enter binary literals. Common Lisp does it using the #b prefix, and IIRC PL/I uses a b suffix. Those are the only ones I can think of off the top of my head that allow it.
I found a page at RosettaCode that describes how to enter integer literals in many different languages, including specifying radix.
b is used to represent binary numbers. Very few languages provide this support.
A good example is Verilog / SystemVerilog, e.g. 4'b0101
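As one more concrete data point beyond the languages named above, Python accepts a 0b prefix, mirroring its 0x prefix for hexadecimal. The short sketch below shows the literal form along with conversion to and from strings.

```python
value = 0b0101             # binary literal with the 0b prefix

print(value)               # 5
print(int("0101", 2))      # 5: parse a binary string with an explicit radix
print(bin(5))              # 0b101
print(format(5, "04b"))    # 0101: zero-padded to four binary digits
print(0x1F == 0b11111)     # True: same number, different notation
```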
How can I find out which type of barcode my sample is? I looked on Wikipedia and there are quite a few types of barcodes; the most common seem to be Code 39 and Code 128.
Is there any library for barcode OCR (Python, Java, C#, Delphi)?
The barcode should encode the time and date of expiration.
EDIT
I need to know how to read and decode the above barcode. These barcodes were generated by a legacy system, and it would be nice if my app could OCR and understand them.
My barcode should contain the date 19.11.2010 15:43
According to this online barcode reader, it is an EAN_13 code for a product with the number 5252235562500.
According to Wikipedia it's a product number for a discount coupon with manufacturer code 25223, family code 556 and coupon code 25.
If there is an expiration date encoded in the data, it's in some custom format encoded into the family code and coupon code. Otherwise you need a lookup table from the manufacturer to determine which coupon has which expiration date.
How about ZXing: http://code.google.com/p/zxing/
There's an excellent barcode reading library named Zebra crossing (zxing) available in Java with ports/wrappers to C#, C++, Ruby, etc.
This particular one is indeed EAN-13 code, which encodes 13 decimal digits [0-9] (2..3 country digits + 9..10 product digits + 1 checksum digit).
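The checksum digit mentioned above can be verified by hand or in a few lines of Python. This is a small sketch of the standard EAN-13 check (weights alternate 1 and 3 over the first twelve digits); the function names are made up for this example.

```python
def ean13_check_digit(first12):
    """Compute the check digit for the first 12 digits of an EAN-13 code."""
    digits = [int(c) for c in first12]
    # Weights alternate 1, 3, 1, 3, ... starting from the leftmost digit.
    total = sum(d * (3 if i % 2 else 1) for i, d in enumerate(digits))
    return (10 - total % 10) % 10

def is_valid_ean13(code):
    """Return True if code is 13 decimal digits with a correct check digit."""
    if len(code) != 13 or any(c not in "0123456789" for c in code):
        return False
    return ean13_check_digit(code[:12]) == int(code[12])

print(is_valid_ean13("5252235562500"))  # True
print(is_valid_ean13("5252235562501"))  # False
```

The number read from the sample passes the check, which is consistent with it being a well-formed EAN-13 code, although that says nothing about what the digits mean.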
The Wikipedia article referenced above seems to refer to "coupon codes" only for UPC12 barcodes, which are slightly different from EAN13 barcodes.
According to the official GS1 site http://gepir.gs1.org/v31/xx/gtin.aspx?Lang=en-US this barcode is not defined as belonging to anyone (or any country), so it is probably used internally by some organization for a custom application.
The GS1 site has a lot of information on barcode standards and formats.
I will be storing a year in a MySQL table: Is it better to store this as a smallint or varchar? I figure that since it's not a full date, that the date format shouldn't be an answer but I'll include that as well.
Smallint? varchar(4)? date? something else?
Examples:
2008
1992
2053
I would use the YEAR(4) column type... but only if the years expected are within the range 1901 to 2155... otherwise, see Gambrinus's answer.
I'd go for smallint. As far as I know, both varchar and date would take more space. My second option would be date.
My own experience is with Oracle, which does not have a YEAR data type, but I have always tried to avoid using numeric data types for elements just because they are comprised only of digits. (So this includes phone numbers, social security numbers, zip codes as well, as additional examples).
My own rule of thumb is to consider what the data is used for. If you will perform mathematical operations on it then store it as a number. If you will perform string functions (eg. "Take the last four characters of the SSN" or "Display the phone number as (XXX) XXX-XXXX") then it's a string.
An additional clue is the requirement to store leading zeroes as part of the number.
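A small Python sketch of this rule of thumb, using made-up sample values: converting a digit string to a number silently drops leading zeroes, while the typical operations on such identifiers are string operations anyway.

```python
zip_code = "02134"             # hypothetical value with a leading zero
as_number = int(zip_code)

print(as_number)                    # 2134: the leading zero is gone
print(str(as_number) == zip_code)   # False: the round trip loses data

def format_phone(digits):
    """Display a 10-digit string as (XXX) XXX-XXXX using slicing only."""
    return f"({digits[:3]}) {digits[3:6]}-{digits[6:]}"

print(format_phone("5551234567"))   # (555) 123-4567
print("123456789"[-4:])             # 6789: "last four characters"
```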
Furthermore, and despite being commonly referred to as a phone "number", they frequently contain letters to indicate the presence of an extension number as a suffix. Similarly, a Standard Book Number potentially ended in an "X" as a "check digit", and an International Standard Serial Number can end with an "X" (despite the ISSN International Centre repeatedly referring to it as an 8-digit code http://www.issn.org/understanding-the-issn/what-is-an-issn/).
Formatting of phone numbers in an international context is tricky, of course, and conforming to E.164 requires that country calling codes are prefixed with a "+".