What MySQL data type should be used for xsd:anyURI?

I have a parameter that I wish to store in my MySQL database.
Here is the description of the parameter:
Parameter Name: endUserId
Type: xsd:anyURI
Max Length: 256
Description: The format is 'tel:' followed by '+' and then the phone number,
e.g. tel:+22507588125. The endUserId in the URL must be the same value, URL-escaped, i.e. tel%3A%2B22507588125
So what data type is suitable for the parameter 'endUserId' described above?
Thanks

A maximum length of 256 is larger than required. I have never seen a phone number with more than, say, 15 digits. Under reasonable constraints, you could store the phone number as a BIGINT: store just the numeric part, and format the output (prepend "tel:+") only when you read it.
You may also store the international prefix separately, i.e. a phone number would be stored in two numeric columns (one for the international prefix, another for the actual phone number; I am told no phone number has a 0 after its international prefix).
If the max size of 256 is not negotiable, then your only choice is VARCHAR.
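A minimal Python sketch of the "store only the digits, add the prefix on read" idea from the answer above. The function names are made up for illustration, and the 15-digit limit is the assumption stated in the answer, not something the parameter description guarantees.

```python
from urllib.parse import quote, unquote

PREFIX = "tel:+"

def to_stored_number(end_user_id: str) -> int:
    """Strip the 'tel:+' prefix and return the digits as an integer (BIGINT-sized)."""
    uri = unquote(end_user_id)  # accepts both the raw and the URL-escaped form
    if not uri.startswith(PREFIX):
        raise ValueError(f"not a tel: URI: {end_user_id!r}")
    digits = uri[len(PREFIX):]
    if not (digits.isascii() and digits.isdigit()) or len(digits) > 15:
        raise ValueError(f"unexpected phone number: {digits!r}")
    return int(digits)

def to_end_user_id(number: int, escaped: bool = False) -> str:
    """Rebuild the URI from the stored number, optionally URL-escaped."""
    uri = f"{PREFIX}{number}"
    return quote(uri, safe="") if escaped else uri

stored = to_stored_number("tel%3A%2B22507588125")
print(stored)
print(to_end_user_id(stored))
print(to_end_user_id(stored, escaped=True))
```

Note that this only round-trips cleanly when the number has no significant leading zero, which is the constraint the answer mentions.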

Related

Should Tax Identification Number be INT on MySQL?

I am making a database, and to identify a client, their Tax Identification Number will be the Primary Key.
Should the Tax Identification Number be VARCHAR, even though it's just numbers?
I know things like phone numbers are more like addresses than numbers, and therefore should be VARCHAR.
I am not sure if the Tax Identification Number should be treated the same way, and I need to know these things before I work on the database since I am required to make an Entity Relationship Diagram of the database.
If you are sure it consists purely of digits, why not use INT, which takes less disk space and fewer resources to index? It's not like phone numbers, which sometimes contain punctuation marks such as hyphens, e.g. 0064-337881.
By the way, if data masking is required, an implicit conversion is performed automatically, so the INT value is cast to a string, e.g.
select insert(tax_id, 2 , 3, '***') from testtable;
result:
1***56
7***12
9***07
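For readers who want to see what that masking does step by step, here is a rough Python emulation of MySQL's INSERT(str, pos, len, newstr) applied to an integer. The helper name and the sample value 123456 are hypothetical; only the masked shape 1***56 comes from the result above.

```python
def mysql_insert(value, pos: int, length: int, new: str) -> str:
    """Emulate MySQL INSERT(): replace `length` characters starting at 1-based `pos`."""
    s = str(value)  # the integer is converted to a string first, as in the implicit cast
    if pos < 1 or pos > len(s):
        return s  # position outside the string: the original string is returned
    return s[:pos - 1] + new + s[pos - 1 + length:]

print(mysql_insert(123456, 2, 3, "***"))
```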
In the USA, Tax Identification Numbers are 9 digits, and the first digit is 9. Stored as a plain number, such a value (at least 900000000) does not fit in a signed 32-bit integer, which only supports values up to 2^31-1, or 2147483647; it would need INT UNSIGNED (up to 4294967295) or BIGINT.
But TIN values are commonly formatted with hyphens after the third and fifth digit, i.e. XXX-XX-XXXX. An Employer Identification Number (EIN) is commonly formatted as XX-XXXXXXX with a hyphen after the second digit.
I would therefore use a VARCHAR(11). This allows the common formatting.
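A small sketch of why 11 characters is enough, assuming the two hyphenation patterns described above. The function names and the sample digits are invented for illustration.

```python
def format_tin(digits: str) -> str:
    """Format nine digits as XXX-XX-XXXX."""
    if len(digits) != 9 or not (digits.isascii() and digits.isdigit()):
        raise ValueError("expected exactly 9 digits")
    return f"{digits[:3]}-{digits[3:5]}-{digits[5:]}"

def format_ein(digits: str) -> str:
    """Format nine digits as XX-XXXXXXX."""
    if len(digits) != 9 or not (digits.isascii() and digits.isdigit()):
        raise ValueError("expected exactly 9 digits")
    return f"{digits[:2]}-{digits[2:]}"

tin = format_tin("912345678")
ein = format_ein("012345678")
print(tin, len(tin))   # the longest form, 11 characters
print(ein, len(ein))   # the leading zero survives because this is a string
```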

Is Not BigInt Enough To House sha1?

I want to know if BigInt is enough in size.
I have created a registration.php where the user gets emailed an account activation link to click to verify his email so his account gets activated.
Account Activation Link is in this format:
[php]
$account_activation_link =
"http://www.".$site_domain."/".$social_network_name."/activate_account.php?primary_website_email=".$primary_website_email."&account_activation_code=".$account_activation_code."";
[/php]
Account Activation Code is in this format:
$account_activation_code = sha1( (string) mt_rand(5, 30)); //Type Casted the INT to STRING on the 1st parameter of sha1 as it needs to be a STRING.
Now, the following link got emailed:
http://www.myssite.com/folder/activate_account.php?primary_website_email=my.email@gmail.com&account_activation_code=22d200f8670dbdb3e253a90eee5098477c95c23d
Note the account activation code that got generated by sha1:
22d200f8670dbdb3e253a90eee5098477c95c23d
But in my MySQL db, in the "account_activation_code" column, I only see:
"22". The rest of the activation code is missing. Why is that?
The column is set to BigInt. Isn't that enough to hold the code generated by sha1?
What is your suggestion?
Thank You
Hashing methods like SHA-1 produce binary values that are 160 or more bits long, depending on the variant used. The common SHA-256 is 256 bits long. No cryptographic hash will fit in a 64-bit BIGINT field, because 64-bit hashes are uselessly small: you'd have nothing but collisions.
Normally people store hashes as their hex-encoded equivalents in a VARCHAR(255) column. These can be indexed and perform well enough in most situations, especially one where you do periodic lookups based on clicks. From a performance and storage perspective there are no problems here.
Short answer: BIGINT is way too small.
A hash is basically a stream of bits (160 bits in the case of SHA-1). While it's certainly possible to render those bits as a base 2 number and convert it to base 10, you need really big storage to do so (as far as I know it's not common to see integer variables larger than 64 bits) and there aren't obvious advantages. BIGINT is a 64-bit type, thus cannot do the job.
Unless you have a good reason to store it as number, I'd simply go for either a binary column type or its plain-text hexadecimal representation in a good old VARCHAR (the latter tends to be more practical to handle).
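You are trying to store a string in a BigInt. That is your issue. The hex form of a SHA hash is a mix of digits and letters (a-f), not just digits. When MySQL converts such a string to a number it keeps only the leading digits, which is why just "22" was stored. Change the field to a VARCHAR and you'll be fine.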
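To make the sizes concrete, here is a quick Python check of the activation code from the question. The `leading_integer` helper is only an approximation of MySQL's lenient string-to-number conversion (a strict SQL mode would be expected to reject the value rather than truncate it), so treat it as an illustration, not as MySQL's actual algorithm.

```python
import re

BIGINT_SIGNED_MAX = 2**63 - 1

def describe(hex_code: str) -> dict:
    """Report how much room a hex-encoded hash needs."""
    raw = bytes.fromhex(hex_code)
    return {
        "bytes": len(raw),            # suits a BINARY(20) column for SHA-1
        "hex_chars": len(hex_code),   # suits a CHAR(40) or VARCHAR column
        "fits_bigint": int(hex_code, 16) <= BIGINT_SIGNED_MAX,
    }

def leading_integer(text: str) -> int:
    """Keep only the leading digits, roughly what a lenient numeric conversion does."""
    match = re.match(r"\s*([+-]?\d+)", text)
    return int(match.group(1)) if match else 0

code = "22d200f8670dbdb3e253a90eee5098477c95c23d"
print(describe(code))
print(leading_integer(code))
```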

What data type could I use for an ID number that has a length of 13 digits in SQL Server 2008?

Normally, the INTEGER data type would suffice, but being in South Africa the ID numbers have a length of 13 digits and the INTEGER data type only goes up to 10 digits. I am not fond of using character types like VARCHAR since they would not restrict the input ID number to integer values only. The only solution I see (other than using VARCHAR) is to use DECIMAL. The only problems I see are that I can't restrict the max size like in VARCHAR and the data input could have ',' and '.'. Any comments?
Just use BIGINT, it ranges from -9223372036854775808 to 9223372036854775807 which should be enough for your application.
Assuming that you're referring to South African national ID numbers, which according to Wikipedia always have 13 digits, then I would go for CHAR(13) with a CHECK constraint (a CLR user-defined data type might also be an option).
The main reason is that the 'number' is not a number, it's an ID. You can't add, subtract, multiply etc. the values so there is no benefit in using a numeric data type. Furthermore, the ID is composed of components that have their own meaning, so being able to parse them out is presumably important (and easier when using character data types).
In fact, depending on how you use this data, you could also add columns that store the individual components of the ID (DOB, sequence, citizenship), either as computed columns or real columns. This could be convenient for querying and reporting (and indexing), especially if you converted the DOB to a date or datetime column.
I would indeed use VARCHAR with a CHECK that matches the format. You can even be more sophisticated if there is internal validation, e.g. a check digit. Now you are all set for other countries that have an alphabetic character, or if you need to handle a leading zero.
I wouldn't use an integer unless it makes sense to do some sort of arithmetic on the field, which is almost certainly not true here.
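A sketch of the kind of validation a CHECK constraint or application layer could apply to a character-typed ID. Two things here are assumptions not stated in the answers: that the check digit follows the Luhn algorithm, and that the layout is six date-of-birth digits, four sequence digits, then a citizenship digit. The sample number is a commonly quoted test value, not a real person's ID.

```python
import re

def luhn_ok(digits: str) -> bool:
    """Luhn checksum: double every second digit from the right, sum, expect a multiple of 10."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0

def parse_sa_id(id_number: str) -> dict:
    """Validate a 13-digit ID string and split out its (assumed) components."""
    if not re.fullmatch(r"[0-9]{13}", id_number):
        raise ValueError("ID must be exactly 13 digits")
    if not luhn_ok(id_number):
        raise ValueError("check digit mismatch")
    return {
        "birth_yymmdd": id_number[:6],
        "sequence": id_number[6:10],
        "citizenship": id_number[10],
    }

print(parse_sa_id("8001015009087"))
```

Because the value stays a string, an ID for someone born in 2000 (starting with "00") keeps its leading zeros, which an integer column would drop.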
You could use money as well, although it appears you only get 4 digits after the decimal place. The money type is 8 bytes, giving you a range from -922,337,203,685,477.5808 to 922,337,203,685,477.5807.
declare @num as money
select @num = '1,300,000.45'
select @num
Results in:
1300000.45
The parsing of commas and periods might be dependent on your specific culture settings, although I don't know that for sure.

Best data type to store money values in MySQL

I want to store many records in a MySQL database. All of them contain money values, but I don't know how many digits will be inserted for each one.
Which data type do I have to use for this purpose?
VARCHAR or INT (or other numeric data types)?
Since money needs an exact representation don't use data types that are only approximate like float. You can use a fixed-point numeric data type for that like
decimal(15,2)
15 is the precision (total length of value including decimal places)
2 is the number of digits after decimal point
See MySQL Numeric Types:
These types are used when it is important to preserve exact precision, for example with monetary data.
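The difference between approximate and exact types is easy to reproduce outside the database. This Python sketch uses the standard `decimal` module as a stand-in for a DECIMAL(15,2) column; the helper names are made up, and the half-up rounding is an assumption about how the value should be rounded to scale.

```python
from decimal import Decimal, ROUND_HALF_UP

def to_money(value: str, scale: int = 2) -> Decimal:
    """Round an exact decimal string to `scale` digits after the decimal point."""
    quantum = Decimal(1).scaleb(-scale)  # scale=2 gives 0.01
    return Decimal(value).quantize(quantum, rounding=ROUND_HALF_UP)

def fits_precision(amount: Decimal, precision: int = 15) -> bool:
    """True if the value has at most `precision` digits in total."""
    return len(amount.as_tuple().digits) <= precision

print(0.1 + 0.2 == 0.3)                                          # binary floats are approximate
print(to_money("0.10") + to_money("0.20") == to_money("0.30"))   # fixed-point is exact
print(to_money("93.485"))
print(fits_precision(to_money("12345678901234.56")))             # 16 digits do not fit in (15,2)
```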
You can use DECIMAL or NUMERIC; both are the same. From the MySQL manual:
The DECIMAL and NUMERIC types store exact numeric data values. These types are used when it is important to preserve exact precision, for example with monetary data. In MySQL, NUMERIC is implemented as DECIMAL, so the following remarks about DECIMAL apply equally to NUMERIC.
e.g. DECIMAL(10,2)
I prefer to use BIGINT and store the values multiplied by 100, so that they become integers.
For example, a currency value of 93.49 is stored as 9349; when displaying the value we divide by 100. This occupies less storage space.
Caution:
We rarely perform currency * currency multiplication, but if we do, the result must be divided by 100 before storing, so that it returns to the proper precision.
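The integer-cents approach can be sketched like this in Python. The helper names are hypothetical, and the half-up rounding in `multiply_scaled` is my own choice for non-negative values, since the answer only says to divide by 100.

```python
from decimal import Decimal

def to_cents(amount: str) -> int:
    """'93.49' -> 9349, using exact decimal arithmetic for the conversion."""
    return int(Decimal(amount) * 100)

def from_cents(cents: int) -> str:
    """9349 -> '93.49', for display only."""
    return str(Decimal(cents).scaleb(-2))

def multiply_scaled(a_cents: int, b_cents: int) -> int:
    """Multiply two values that are both scaled by 100, then rescale back to cents.

    The raw product is scaled by 100 * 100, hence the division; adding 50 first
    rounds half up for non-negative inputs.
    """
    return (a_cents * b_cents + 50) // 100

print(to_cents("93.49"))
print(from_cents(9349))
print(from_cents(multiply_scaled(9349, 150)))  # 93.49 * 1.50
```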
It depends on your need.
Using DECIMAL(10,2) usually is enough but if you need a little bit more precise values you can set DECIMAL(10,4).
If you work with big values replace 10 with 19.
If your application needs to handle money values up to about a hundred billion, then this should work: 13,2 (maximum 99,999,999,999.99). Values up to a trillion need 15,2.
If you need to comply with GAAP (Generally Accepted Accounting Principles) then use: 13,4
Usually you should sum your money values at 13,4 before rounding off the output to 13,2.
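Why the order matters (sum at four decimals first, round to two afterwards) shows up even with three small line items. The amounts below are invented, and half-up rounding is assumed.

```python
from decimal import Decimal, ROUND_HALF_UP

CENT = Decimal("0.01")

def sum_then_round(items):
    """Keep 4 decimal places while summing, round once at the end."""
    return sum(items).quantize(CENT, rounding=ROUND_HALF_UP)

def round_then_sum(items):
    """Round every item to 2 decimal places first, then sum."""
    return sum(x.quantize(CENT, rounding=ROUND_HALF_UP) for x in items)

line_items = [Decimal("0.3333")] * 3
print(sum_then_round(line_items))  # 0.9999 rounds to 1.00
print(round_then_sum(line_items))  # 0.33 * 3 is 0.99, one cent is lost
```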
At the time this question was asked nobody thought about the Bitcoin price. In the case of BTC, it is probably insufficient to use DECIMAL(15,2). If Bitcoin rises to $100,000 or more, we will need at least DECIMAL(18,9) to support cryptocurrencies in our apps.
DECIMAL(18,9) takes 8 bytes of space in MySQL (4 bytes per 9 digits: 4 for the integer part and 4 for the fractional part).
We use double.
*gasp*
Why?
Because it can represent any 15 digit number with no constraints on where the decimal point is. All for a measly 8 bytes!
So it can represent:
0.123456789012345
123456789012345.0
...and anything in between.
This is useful because we're dealing with global currencies, and double can store the various numbers of decimal places we'll likely encounter.
A single double field can represent amounts up to 999,999,999,999,999 in Japanese yen, 9,999,999,999,999.99 in US dollars and even 9,999,999.99999999 in bitcoins.
If you try doing the same with decimal, you need decimal(30, 15) which costs 14 bytes.
Caveats
Of course, using double isn't without caveats.
However, it's not loss of accuracy as some tend to point out. Even though double itself may not be internally exact to the base 10 system, we can make it exact by rounding the value we pull from the database to its significant decimal places. If needed that is. (e.g. If it's going to be outputted, and base 10 representation is required.)
The caveats are, any time we perform arithmetic with it, we need to normalize the result (by rounding it to its significant decimal places) before:
Performing comparisons on it.
Writing it back to the database.
Another kind of caveat is that, unlike decimal(m, d) where the database will prevent programs from inserting a number with more than m digits, no such validation exists with double. A program could insert a user-entered value of 20 digits and it would end up being silently recorded as an inaccurate amount.
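Both caveats can be reproduced with Python's `float`, which is an IEEE 754 double. The helper names are made up; `survives_double` simply checks whether a decimal string comes back unchanged after a round trip through a double.

```python
from decimal import Decimal

def normalize(value: float, places: int) -> float:
    """Round an arithmetic result to its significant decimal places."""
    return round(value, places)

def survives_double(text: str) -> bool:
    """True if the decimal string is recovered exactly after being stored as a double."""
    return Decimal(repr(float(text))) == Decimal(text)

total = 0.1 + 0.2
print(total == 0.3)                               # raw comparison fails
print(normalize(total, 2) == normalize(0.3, 2))   # comparison after normalizing works

print(survives_double("0.123456789012345"))       # 15 significant digits
print(survives_double("123456789012345.0"))       # 15 significant digits
print(survives_double("12345678901234567890"))    # 20 digits are silently altered
```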
If GAAP Compliance is required or you need 4 decimal places:
DECIMAL(13, 4)
Which supports a max value of:
$999,999,999.9999
Otherwise, if 2 decimal places is enough:
DECIMAL(13,2)
src: https://rietta.com/blog/best-data-types-for-currencymoney-in/
Indeed this relies on the programmer's preferences. I personally use: numeric(15,4) to conform to the Generally Accepted Accounting Principles (GAAP).
Try using
Decimal(19,4)
this usually works with every other DB as well
Storing money as BIGINT multiplied by 100 or more with the reason to use less storage space makes no sense in all "normal" situations.
To stay aligned with GAAP it is sufficient to store currencies in DECIMAL(13,4)
MySQL manual reads that it needs 4 bytes per 9 digits to store DECIMAL.
https://dev.mysql.com/doc/refman/8.0/en/precision-math-decimal-characteristics.html
DECIMAL(13,4) represents 9 digits + 4 fraction digits (decimal places) => 4 + 2 bytes = 6 bytes
compare to 8 bytes required to store BIGINT.
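The byte counts can be computed with a few lines of Python. This follows the rule described in the MySQL manual page linked above: 4 bytes per full group of 9 digits, plus 0 to 4 bytes for the leftover digits, counted separately for the integer and fractional parts. The function names are mine.

```python
# Bytes needed for 0..8 leftover digits (after the full groups of nine).
LEFTOVER_BYTES = [0, 1, 1, 2, 2, 3, 3, 4, 4]

def part_bytes(digits: int) -> int:
    """Storage for one side of the decimal point."""
    return (digits // 9) * 4 + LEFTOVER_BYTES[digits % 9]

def decimal_storage_bytes(precision: int, scale: int) -> int:
    """Storage for DECIMAL(precision, scale)."""
    return part_bytes(precision - scale) + part_bytes(scale)

for m, d in [(13, 4), (19, 4), (30, 15)]:
    print(f"DECIMAL({m},{d}): {decimal_storage_bytes(m, d)} bytes")
```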
There are 2 valid options:
use integer amount of currency minor units (e.g. cents)
represent amount as decimal value of the currency
In both cases you should use decimal data type to have enough significant digits. The difference can be in precision:
even for integer amount of minor units it's better to have extra precisions for accumulators (account for accumulating 10% fees from 1-cent operations)
different currencies have different number of decimals, cryptocurrencies have up to 18 decimals
The number of decimals can change over time due to inflation
Multiply by 10000 and store as BIGINT, like "Currency" in Visual Basic and Office. See https://msdn.microsoft.com/en-us/library/office/gg264338.aspx

MySQL stripping off leading zero from integer column

I have a bigint field which when entering a number such as '05555555555' for example, the 0 is being stripped off and only inserting '5555555555'.
What data type should I use to prevent this?
You can't. Integer columns (bigint's) do not store leading zeros (ie. in a visual representation)
Rather than attempt to store a leading zero (by using a varchar field), have a view (or whatever) format the integer into a string in the format you require.
If you need to store something that is actually a string in the Domain model (e.g. a phone number), use a string rather than an integer type field.
BIGINT and other Integer columns do not store the visual representation of a number, only the number itself in binary form (BIGINT is 8 bytes). 5555555555 is stored as:
00000000 00000000 00000000 00000001 01001011 00100011 00001100 11100011
If the preceding zeros are significant to the integrity of your data, you should be using a VARCHAR or CHAR instead of an integer type. Numerical datatypes should only be used for numerical data. US ZIP Codes and phone numbers are NOT numerical data.
bigint stores the data as a number, and 05555555555 and 5555555555 are the same number. You'll need a string type to preserve the leading zero, e.g. varchar with a suitable maximum length.
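The same behaviour can be seen in any language with integer types. This Python sketch shows the zero disappearing on conversion, the 8-byte form of the number, and padding applied only when formatting for display (the width of 11 is taken from the example value in the question).

```python
def store_as_bigint(text: str) -> int:
    """Convert the input to a number, as an integer column would."""
    return int(text)

def as_bits(number: int) -> str:
    """Big-endian 8-byte representation, one group of bits per byte."""
    return " ".join(f"{byte:08b}" for byte in number.to_bytes(8, "big"))

def display_zero_padded(number: int, width: int) -> str:
    """Re-create the leading zeros at display time, for a known fixed width."""
    return str(number).zfill(width)

stored = store_as_bigint("05555555555")
print(stored)                           # the leading zero is gone
print(as_bits(stored))
print(display_zero_padded(stored, 11))  # only works if every value has the same width
```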
You might look into altering the field to use UNSIGNED ZEROFILL. The column still stores only the number, but MySQL then displays it padded with leading zeros up to the column's display width.
The problem is that if you have a big database with hundreds of thousands of rows, a BIGINT is much faster than a VARCHAR field. I had a similar issue with a product database full of European Article Numbers (EAN). Some of those codes start with a leading 0. When I changed the column to VARCHAR, certain pages that search for EAN codes took 8 seconds to load; when I changed it to BIGINT, they took 2 seconds.
Big difference in speed indeed.