MySQL stripping off leading zero from integer column - mysql

I have a BIGINT field. When I enter a number such as '05555555555', the leading 0 is stripped and only '5555555555' is inserted.
What data type should I use to prevent this?

You can't. Integer columns (such as BIGINT) do not store leading zeros, which are only part of a number's visual representation.
Rather than attempt to store a leading zero (by using a varchar field), have a view (or whatever) format the integer into a string in the format you require.
If you need to store something that is actually a string in the Domain model (e.g. a phone number), use a string rather than an integer type field.
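The "format the integer on the way out" approach can be sketched in Python as follows. This is only an illustration: the function name and the display width of 11 digits are assumptions taken from the example value in the question, not something the database enforces.

```python
def format_with_leading_zeros(value: int, width: int) -> str:
    """Render a non-negative integer as a zero-padded string of the given width."""
    if value < 0:
        raise ValueError("value must be non-negative")
    # zfill pads on the left with '0' until the string is `width` characters long.
    return str(value).zfill(width)


# The integer read from the BIGINT column has no leading zero...
stored = 5555555555
# ...so the presentation layer adds it back (width 11 is an assumed format).
print(format_with_leading_zeros(stored, 11))
```

This only works when every value has the same known width; if the width varies per row, the zero is real data and belongs in a string column.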

BIGINT and other Integer columns do not store the visual representation of a number, only the number itself in binary form (BIGINT is 8 bytes). 5555555555 is stored as:
00000000 00000000 00000000 00000001 01001011 00100011 00001100 11100011
If the leading zeros are significant to the integrity of your data, you should be using a VARCHAR or CHAR instead of an integer type. Numerical datatypes should only be used for numerical data. US ZIP Codes and phone numbers are NOT numerical data.
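A small Python sketch of the same point: parsing the text as a number discards the leading zero, and the resulting 8-byte value is what gets stored. The byte order shown here is big-endian purely for readability; the exact on-disk encoding depends on the storage engine and is not claimed here.

```python
as_text = "05555555555"
as_number = int(as_text)  # parsing as a number drops the leading zero

# Converting back to text cannot recover the zero: it was never part of the number.
round_tripped = str(as_number)

# The same value as a signed 64-bit integer (8 bytes), printed bit by bit.
raw = as_number.to_bytes(8, byteorder="big", signed=True)
bits = " ".join(f"{byte:08b}" for byte in raw)

print(round_tripped)
print(bits)
```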

bigint stores the data as a number, and 05555555555 and 5555555555 are the same number. You'll need a string type to preserve the leading zero, e.g. varchar with a suitable maximum length.

You might look into altering the field to use UNSIGNED ZEROFILL. The column still stores only the number, but MySQL pads the displayed value with leading zeros up to the column's display width.

The problem is that if you have a big database with hundreds of thousands of rows, a BIGINT is much faster than a VARCHAR field. I had a similar issue with a product database full of European Article Numbers (EAN). Some of those codes start with a leading 0. When I change the column to VARCHAR, certain pages that search for EAN codes take 8 seconds to load; when I change it to BIGINT, they take 2 seconds.
Big difference in speed indeed.

Related

Should Tax Identification Number be INT on MySQL?

I am making a database, and to identify a client, their Tax Identification Number will be the Primary Key.
**Should the Tax Identification Number be VARCHAR, even though it's just numbers?**
I know things like phone numbers are more like addresses than numbers, and therefore should be VARCHAR.
I am not sure if the Tax Identification Number should be treated the same way, and I need to know these things before I work on the database since I am required to make a Entity Relationship Diagram of the database.
If you are sure it's made up purely of digits, why not use INT, which takes less disk space and fewer resources to index? It's not like phone numbers, which may sometimes contain punctuation marks such as a hyphen, e.g. 0064-337881.
By the way, if data masking is required, an implicit conversion is performed automatically, so the INT value is cast to a string, e.g.
select insert(tax_id, 2 , 3, '***') from testtable;
result:
1***56
7***12
9***07
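For readers who want to see what that masking does step by step, here is a rough Python equivalent of the `insert(tax_id, 2, 3, '***')` call. It mimics only the in-range case (1-based position inside the string); the function name and the sample input 123456 are made up for illustration, since the source shows only the masked results.

```python
def mask(value: int, pos: int, length: int, replacement: str) -> str:
    """Replace `length` characters starting at 1-based `pos` with `replacement`."""
    text = str(value)   # the integer is converted to a string first
    start = pos - 1     # SQL positions are 1-based, Python slices are 0-based
    return text[:start] + replacement + text[start + length:]


print(mask(123456, 2, 3, "***"))
```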
In the USA, Tax Identification Numbers are 9 digits, and the first digit is 9. So if you store this as numbers only, it fits within a signed 32-bit integer, which supports values up to 2^31-1, or 2147483647.
But TIN values are commonly formatted with hyphens after the third and fifth digit, i.e. XXX-XX-XXXX. An Employer Identification Number (EIN) is commonly formatted as XX-XXXXXXX with a hyphen after the second digit.
I would therefore use a VARCHAR(11). This allows the common formatting.
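As a sketch of the two formats mentioned above, the following Python helpers take nine bare digits and add the hyphens. The function names and sample digits are hypothetical; the point is only that the formatted TIN is 11 characters long, which is where VARCHAR(11) comes from.

```python
def _require_nine_digits(digits: str) -> None:
    # isascii() guards against non-ASCII digit characters that isdigit() accepts.
    if len(digits) != 9 or not (digits.isascii() and digits.isdigit()):
        raise ValueError("expected exactly 9 ASCII digits")


def format_tin(digits: str) -> str:
    """Format nine digits as XXX-XX-XXXX."""
    _require_nine_digits(digits)
    return f"{digits[:3]}-{digits[3:5]}-{digits[5:]}"


def format_ein(digits: str) -> str:
    """Format nine digits as XX-XXXXXXX."""
    _require_nine_digits(digits)
    return f"{digits[:2]}-{digits[2:]}"


print(format_tin("912345678"))
print(format_ein("123456789"))
```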

smallest storage of integer array in mysql?

I have a table of user entries, and for every entry I have an array of (2-byte) integers to store (15-25, sporadically even more). The array elements will be written and read all at the same time, it is never needed to update or to access them individually. Their order matters. It makes sense to think of this as an array object.
I have many millions of these user entries and want to store this with the minimum possible amount of disk space. I'm however struggling with MySQL's lack of Array datatype.
I've been considering the following options.
Do it the MySQL way. Make a table my_data with columns user_id, data_id and data_int. To make this efficient, one needs an index on user_id, totalling well over 10 bytes per integer.
Store the array in text format. This takes ~6.5 bytes per integer.
Make 35-40 columns ("enough") and have -32768 mean 'empty' (since this value cannot occur in my data). This takes 3.5-4 bytes per integer, but is somewhat ugly (as I have to impose a strict limit on the number of elements in the array).
Is there a better way to do this in MySQL? I know MySQL has an efficient varchar type, so ideally I'd store my 2-byte integers as 2-byte chars in a varchar (or a similar approach with blob), but I'm not sure how to do that. Is this possible? How should this be done?
You could store them as separate SMALLINT NULL columns.
In MyISAM this uses 2 bytes of data + 1 bit of null indicator for each value.
In InnoDB, the null indicators are encoded into the column's field start offset, so they don't take any extra space, and null values are not actually stored in the row data. If the rows are small enough that all the offsets are 1 byte, then this uses 3 bytes for every existing value (1 byte offset, 2 bytes data), and 1 byte for every nonexistent value.
Either of these would be better than using INT with a special value to indicate that it doesn't exist, since that would be 4 bytes of data for every value.
See NULL in MySQL (Performance & Storage)
The best answer was given in the comments, so I'll repost it here with some use-ready code, for further reference.
MySQL has a varbinary type that works really well for this: you can simply use PHP's pack/unpack functions to convert them to and from binary form, and store that binary form in the database using varbinary. Example code for the conversion is below.
function pack24bit($n) { // input: 24-bit unsigned integer, output: binary string of 3 bytes (big-endian)
    $b1 = ($n >> 16) & 0xFF; // most significant byte
    $b2 = ($n >> 8) & 0xFF;  // middle byte
    $b3 = $n & 0xFF;         // least significant byte
    return pack('CCC', $b1, $b2, $b3);
}
function unpack24bit($packed) { // input: binary string of 3 bytes, output: 24-bit integer
    $arr = unpack('C3b', $packed); // yields keys b1, b2, b3
    return ($arr['b1'] << 16) | ($arr['b2'] << 8) | $arr['b3'];
}
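The question itself asks about arrays of 2-byte integers rather than 24-bit values. A minimal Python sketch of the same pack/unpack idea for that case is below; it uses the standard `struct` module, the function names are made up, and big-endian order is an arbitrary choice (any fixed order works as long as reads and writes agree). The resulting bytes are what would go into the VARBINARY column.

```python
import struct


def pack_int16_array(values):
    """Pack signed 16-bit integers into a byte string, 2 bytes per element."""
    # ">" = big-endian, "h" = signed 16-bit; the count prefix repeats the code.
    return struct.pack(f">{len(values)}h", *values)


def unpack_int16_array(blob):
    """Inverse of pack_int16_array: recover the list in its original order."""
    if len(blob) % 2 != 0:
        raise ValueError("blob length must be a multiple of 2")
    count = len(blob) // 2
    return list(struct.unpack(f">{count}h", blob))


values = [15, -32767, 300, 0]
blob = pack_int16_array(values)
print(len(blob), unpack_int16_array(blob))
```

Stored this way, each element costs exactly 2 bytes, plus the length prefix that the variable-length column type adds per row.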

What data type could I use for an ID number that has a length of 13 digits in SQL Server 2008?

Normally, the INTEGER data type would suffice, but in South Africa ID numbers have a length of 13 digits and the INTEGER data type only goes up to 10 digits. I am not fond of using character types like VARCHAR, since that would not restrict the input ID number to integer values only. The only solution I see (other than using VARCHAR) is to use DECIMAL. The only problems I see are that I can't restrict the maximum size as with VARCHAR, and that the input could contain ',' and '.'. Any comments?
Just use BIGINT, it ranges from -9223372036854775808 to 9223372036854775807 which should be enough for your application.
Assuming that you're referring to South African national ID numbers, which according to Wikipedia always have 13 digits, then I would go for CHAR(13) with a CHECK constraint (a CLR user-defined data type might also be an option).
The main reason is that the 'number' is not a number, it's an ID. You can't add, subtract, multiply etc. the values so there is no benefit in using a numeric data type. Furthermore, the ID is composed of components that have their own meaning, so being able to parse them out is presumably important (and easier when using character data types).
In fact, depending on how you use this data, you could also add columns that store the individual components of the ID (DOB, sequence, citizenship), either as computed columns or real columns. This could be convenient for querying and reporting (and indexing), especially if you converted the DOB to a date or datetime column.
I would indeed use VARCHAR with a CHECK that matches the format. You can even be more sophisticated if there is internal validation, e.g. a check digit. Now you are all set for other countries that have an alphabetic character, or if you need to handle a leading zero.
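To illustrate the kind of validation such a CHECK (or application-side check) might perform, here is a hedged Python sketch. It assumes the ID is exactly 13 ASCII digits and that the last digit is a Luhn check digit; that assumption should be verified against the official specification before relying on it. The function names and the sample value are made up for illustration.

```python
def luhn_valid(number: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    # Walk from the rightmost digit; double every second digit.
    for index, char in enumerate(reversed(number)):
        digit = int(char)
        if index % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9  # same as summing the two digits of the product
        total += digit
    return total % 10 == 0


def looks_like_id_number(value: str) -> bool:
    """Format check: 13 ASCII digits plus an (assumed) Luhn check digit."""
    if len(value) != 13 or not (value.isascii() and value.isdigit()):
        return False
    return luhn_valid(value)


print(looks_like_id_number("8001015009087"))
```

Because the value is handled as a string throughout, a leading zero survives and individual components can be sliced out by position.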
I wouldn't use an integer unless it makes sense to do some sort of arithmetic on the field, which is almost certainly not true here.
You could use money as well, although it appears you only get 4 digits after the decimal place. The money type is 8 bytes, giving you a range from -922,337,203,685,477.5808 to 922,337,203,685,477.5807.
declare @num as money
select @num = '1,300,000.45'
select @num
Results in:
1300000.45
The parsing of commas and periods might be dependent on your specific culture settings, although I don't know that for sure.

What is the best column type for a United States ZIP code?

I want to store ZIP codes (within the United States) in a MySQL database. Saving space is a priority. Which is the better option: VARCHAR limited to a maximum length of 6 digits, INT, or MEDIUMINT? The ZIP code will not be used for any calculation. It will be inserted once, updated if required, and retrieved at least once.
Which is the better option here in MySQL: VARCHAR, INT, or MEDIUMINT? Is there anything else you would suggest?
There are a few problems with storing a zip code as a numeric value.
Zip Codes have extensions, meaning they can be 12345-6789. You cannot store a dash in a numeric datatype.
There are many zip codes that start with a zero, if you store it as an int you are going to lose the leading zero.
You do not add/subtract, etc zip codes or use numeric functions with them.
I would store a zip code as a varchar(5) or varchar(10).
As a side note, I am not sure why you would select varchar(6), do you have a reason for selecting an unusual length when standard zip codes are 5 or 10 with the extension?
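A short Python sketch of the two ZIP formats discussed above (5 digits, or 5 digits plus a dash and 4 more), together with the leading-zero loss that an integer column would cause. The pattern and function name are illustrative, and the sample codes are only examples.

```python
import re

# Five digits, optionally followed by a dash and four more digits (ZIP+4).
ZIP_PATTERN = re.compile(r"[0-9]{5}(-[0-9]{4})?")


def is_us_zip(value: str) -> bool:
    """True for strings shaped like 12345 or 12345-6789."""
    return ZIP_PATTERN.fullmatch(value) is not None


print(is_us_zip("02134"), is_us_zip("12345-6789"), is_us_zip("2134"))

# Storing the code as a number silently turns a 5-character code into 4 digits.
print(int("02134"))
```

The longest valid form, 12345-6789, is 10 characters, which matches the varchar(10) suggestion.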
I usually use MEDIUMINT(5) ZEROFILL for 5 digit zip codes. This preserves any leading 0s and it only uses 3 bytes where a VARCHAR(5) would use 6. This assumes that you don't need the extended zip codes, which have a dash and 4 extra numbers. If you were to decide to use a textual type, I would use CHAR(5) instead of VARCHAR(5) since it is better if the data is always 5 characters long.
Zip codes are always 5 characters, hence you would need a CHAR datatype, rather than VARCHAR.
Your options are therefore
CHAR(5)
MEDIUMINT (5) UNSIGNED ZEROFILL
The first takes 5 bytes per zip code.
The second takes only 3 bytes per zip code. The ZEROFILL option is necessary for zip codes with leading zeros.
So, if space is your priority, use the MEDIUMINT.
I would suggest using the VARCHAR data type, because some countries use alphanumeric zip codes while others use purely numeric ones, so an integer cannot be used globally. Also, a zip code may start with zero, like 001101; with an integer data type the leading zeros would be lost, so the actual zip code could not be stored.
I used to live in the Netherlands and know that letters are also possible. So if you have users in different countries, I'd advise you to set it as a varchar(10).
2525 CA, Netherlands <- This points to the exact spot where I used to live. Dutch zip codes work as a kind of coordinate system: combined with the letters at the end, they indicate the exact position.

Allow number to start with ZERO when stored in mysql integer field

I need to store phone numbers starting with 0, but whenever I try to store one in a MySQL table the leading zero is removed, because an integer never actually starts with zero.
How to solve this issue? Do I need to change the field type from Integer to another type?
Change the data type to UNSIGNED ZEROFILL, whatever type you are using (float, int, decimal(6,2), ...); just edit the field to be UNSIGNED ZEROFILL.
Phone numbers are not really numbers in the sense that they aren't ordinal. They're just characters - the fact that they consist of digits is incidental.
Store them in a varchar and move on :D
Phone numbers can contain other symbols for readability too... a regexp for a phone number looks something like [0-9+()*#-]+ (with the hyphen placed last so it is read literally rather than as a range). So you need to use a text field for phone numbers, plus some validation.
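The "text field plus some validation" idea could look roughly like this in Python. The character class covers digits plus the symbols named above, with the hyphen placed last so that it is read literally. This is a loose sanity check under those assumptions, not a real phone-number parser, and the sample numbers are made up.

```python
import re

# Digits plus + ( ) * # and a literal hyphen (last in the class).
PHONE_PATTERN = re.compile(r"[0-9+()*#-]+")


def looks_like_phone(value: str) -> bool:
    """True if the string is non-empty and uses only the allowed characters."""
    return PHONE_PATTERN.fullmatch(value) is not None


print(looks_like_phone("0123456789"))
print(looks_like_phone("+44(0)20-7946-0958"))
print(looks_like_phone("call me"))
```

Since the value stays a string from input to storage, the leading zero in the first example is never at risk.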
You can use data type as varchar to solve this.
Phone numbers aren't integers and you will only end up with problems trying to store them as integers, store them as strings instead.
Yes - numeric fields only store the numeric values, not their formatting (which padding with leading zeroes is). You should either
Change the field type from integer to varchar or char (if # of digits is ALWAYS the same).
Store the number as integer BUT prepend 0 in your presentation layer as needed.
You can also wrap the number you want with a lead zero with a function. I made this function to add lead zero if the "string" is smaller than 2 digits (it was used to add lead zeroes to hours and minutes)
function leadZero($num) {
    if (strlen($num) < 2) {
        return "0" . $num;
    } else {
        return $num;
    }
}
If you have say, a number 2 that you want to output as 02, you'd do leadZero(2);
This will only add a zero IF the number is less than 2 digits long ! For instance leadZero(14); will return 14