Should Tax Identification Number be INT on MySQL? - mysql

I am making a database, and to identify a client, their Tax Identification Number will be the Primary Key.
**Should the Tax Identification Number be VARCHAR, even though it's just numbers?**
I know things like phone numbers are more like addresses than numbers, and therefore should be VARCHAR.
I am not sure if the Tax Identification Number should be treated the same way, and I need to know these things before I work on the database, since I am required to make an Entity Relationship Diagram of the database.

If you are sure it's made up purely of digits, why not use INT, which takes less disk space and fewer resources to index? It's not like a phone number, which may contain punctuation marks such as a hyphen, e.g. 0064-337881.
By the way, if data masking is required, an implicit conversion is performed automatically, so the INT value is cast to a string, e.g.
select insert(tax_id, 2 , 3, '***') from testtable;
result:
1***56
7***12
9***07

In the USA, Tax Identification Numbers are 9 digits, and the first digit is 9. So if you store this as numbers only, it fits within a signed 32-bit integer, which supports values up to 2^31 - 1, or 2147483647.
But TIN values are commonly formatted with hyphens after the third and fifth digit, i.e. XXX-XX-XXXX. An Employer Identification Number (EIN) is commonly formatted as XX-XXXXXXX with a hyphen after the second digit.
I would therefore use a VARCHAR(11). This allows the common formatting.
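As a rough illustration of why the text form is convenient, here is a minimal Python sketch. The helper names `normalize_tin` and `format_tin` and the sample value are made up for this example; the point is only that a string keeps every digit and the common XXX-XX-XXXX layout fits in 11 characters, while an integer drops a leading zero.

```python
import re

def normalize_tin(raw: str) -> str:
    """Strip hyphens and spaces, then require exactly 9 digits (hypothetical helper)."""
    digits = re.sub(r"[\s-]", "", raw)
    if not re.fullmatch(r"[0-9]{9}", digits):
        raise ValueError(f"not a 9-digit TIN: {raw!r}")
    return digits

def format_tin(digits: str) -> str:
    """Render 9 digits in the common XXX-XX-XXXX layout (11 characters)."""
    return f"{digits[:3]}-{digits[3:5]}-{digits[5:]}"

tin = normalize_tin("012-34-5678")   # made-up sample value
print(tin)                           # text keeps the leading zero
print(int(tin))                      # the integer form loses it
print(format_tin(tin))
```

Whether to store the hyphenated or the bare form is a design choice; storing the bare digits and formatting on output keeps comparisons simple.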

Related

What will be the constraints of the values that can be entered in the column that was declared as FLOAT?

I am building a web app in which a teacher enters the expected results of a scientific experiment. These results must be very accurate; they might look like 0.4933546522886728. After searching for a while, FLOAT seems to be the right data type for storing these answers in the database. FLOAT columns in MySQL can be declared as FLOAT(n, d), where n is the total number of digits in the number and d is the number of digits after the decimal point. I do not know how many digits the teacher will enter, so what would happen if I declared the column as plain FLOAT? What made me think of this is the following quote from the MySQL documentation.
For maximum portability, code requiring storage of approximate numeric data values should use FLOAT or DOUBLE PRECISION with no specification of precision or number of digits.
And what would be the maximum and minimum values that can be entered in this FLOAT column?
I also thought of using VARCHAR to store the exact number that the teacher enters. The number that the student enters would then be trimmed to the same number of decimal places before it is compared with the right answer.
For example if the teacher enters 1.23451 and the student enters 1.4235123, my code will make it 1.42351.
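The trimming idea in the question could be sketched like this in Python. The function name `truncate_to_precision` is hypothetical, and the sketch assumes both values arrive as text exactly as typed.

```python
from decimal import Decimal, ROUND_DOWN

def truncate_to_precision(student: str, teacher: str) -> Decimal:
    """Cut the student's value to as many decimal places as the teacher typed."""
    teacher_value = Decimal(teacher)
    # A Decimal built from text remembers its scale: "1.23451" has exponent -5.
    quantum = Decimal(1).scaleb(teacher_value.as_tuple().exponent)
    # ROUND_DOWN truncates instead of rounding.
    return Decimal(student).quantize(quantum, rounding=ROUND_DOWN)

print(truncate_to_precision("1.4235123", "1.23451"))  # 1.42351
```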
The (n,d) on the end of FLOAT and DOUBLE does not make sense. All it does is cause an extra rounding.
FLOAT provides about 7 significant decimal digits of precision and a modestly big exponent range. 0.4933546522886728 will be stored as about 0.4933546xxxxx, with the extra digits being noise.
That number can be stored in a DOUBLE, with a rounding error after 53 bits (about 16 digits) of precision.
There are very few scientific measurements that need more digits than available in the precision of FLOAT.
You can INSERT ... VALUES ( 0.4933546522886728 ) and put that into a FLOAT. It will get rounded to 24 significant bits. Ditto for 4933546522886.728 . Or 0.0000000004933546522886728 . Or 4.933546522886728e20 or 4.933546522886728e-20 .
Take whatever numbers you are given and simply put them in the INSERT without worrying about precision or scaling.
VARCHAR is the wrong way to go for numbers and dates, unless you want to store the raw input before it has been converted into the internal format.
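To see the effect of 24-bit versus 53-bit precision outside the database, here is a small Python sketch. It assumes that a MySQL FLOAT behaves like a 32-bit IEEE 754 value and a DOUBLE like Python's own 64-bit float; `as_float32` is a helper invented for this example.

```python
import struct

def as_float32(x: float) -> float:
    """Round-trip a 64-bit Python float through 32-bit IEEE 754 storage."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

value = 0.4933546522886728      # the sample value from the question
stored = as_float32(value)

print(f"{value:.16f}")          # 64-bit: what a DOUBLE would keep
print(f"{stored:.16f}")         # 32-bit: only about 7 significant digits agree
print(abs(stored - value))      # size of the rounding error
```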

What is the best data type for ISBN10 and ISBN13 in a MySQL database

For an application I'm currently building I need a database to store books. The schema of the books table should contain the following attributes:
id, isbn10, isbn13, title, summary
What data types should I use for ISBN10 and ISBN13? My first thought was a BIGINT, but I've read some unsubstantiated comments that say I should use a VARCHAR.
You'll want a CHAR/VARCHAR (CHAR is probably the best choice, as you know the length - 10 and 13 characters). Numeric types like INTEGER will remove leading zeroes in ISBNs like 0-684-84328-5.
ISBN numbers should be stored as strings, varchar(17) for instance.
You need 17 characters for ISBN13, 13 numbers plus the hyphens, and 13 characters for ISBN10, 10 numbers plus hyphens.
ISBN10
ISBN10 numbers, though called "numbers", may contain the letter X. The last number in an ISBN number is a check digit that spans from 0-10, and 10 is represented as X. Plus, they might begin with a double 0, such as 0062472100, and as a numeric format, it might get the leading 00 removed once stored.
84-7844-453-X is a valid ISBN10 number, in which 84 means Spain, 7844 is the publisher's number, 453 is the book number and X (i.e. 10) is the control digit. If we remove the hyphens, we can no longer tell the publisher from the book id. Is that really important? It depends on how you will use the number. Bibliographic researchers (I've found myself in that situation) might need it for many reasons that I won't go into here, since it has nothing to do with storing data. I would advise against removing hyphens, but the truth is everyone does it.
ISBN13
ISBN13 faces the same issues regarding meaning, in that, with the hyphens you get 4 blocks of meaningful data, without them, language, publisher and book id would become lost.
Nevertheless, the control digit will only be 0-9; there will never be a letter. But should you feel tempted to store only ISBN13 numbers (since ISBN10 can automatically and without fail be upgraded to ISBN13) and use an integer type for them, you could run into issues in the future. All ISBN13 numbers currently begin with 978 or 979, but in the future a prefix such as 078 might be added.
A light explanation about ISBN13
A deeper explanation of ISBN numbers
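The check-digit rules mentioned in these answers can be sketched in Python as follows. The function names are made up for this example; the ISBN values are the ones quoted in the answers above. The sketch uses the standard weighted sums (weights 10 down to 1, modulo 11, for ISBN10; alternating weights 1 and 3, modulo 10, for ISBN13).

```python
DIGITS = "0123456789"

def clean(isbn: str) -> str:
    """Drop hyphens and spaces; upper-case a trailing x."""
    return isbn.replace("-", "").replace(" ", "").upper()

def is_valid_isbn10(isbn: str) -> bool:
    s = clean(isbn)
    if len(s) != 10 or any(c not in DIGITS for c in s[:9]):
        return False
    if s[9] not in DIGITS + "X":
        return False
    # The last position may be X, which stands for the value 10.
    values = [int(c) for c in s[:9]] + [10 if s[9] == "X" else int(s[9])]
    return sum(w * v for w, v in zip(range(10, 0, -1), values)) % 11 == 0

def isbn10_to_isbn13(isbn: str) -> str:
    """Prefix 978, drop the old check digit, compute the new one."""
    core = "978" + clean(isbn)[:9]
    total = sum(int(c) * (1 if i % 2 == 0 else 3) for i, c in enumerate(core))
    return core + str((10 - total % 10) % 10)

print(is_valid_isbn10("84-7844-453-X"))   # True
print(is_valid_isbn10("0062472100"))      # True
print(isbn10_to_isbn13("0-684-84328-5"))  # 9780684843285
print(int("0062472100"))                  # 62472100: leading zeros are gone
```

Note that both functions work on text; neither the X nor the leading zeros would survive a numeric column.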

What data type could I use for an ID number that has a length of 13 digits in SQL Server 2008?

Normally, the INTEGER data type would suffice, but in South Africa the ID numbers have a length of 13 digits and the INTEGER data type only goes up to 10 digits. I am not fond of using character types like VARCHAR, since they would not restrict the input ID number to integer values only. The only solution I see (other than using VARCHAR) is to use DECIMAL. The only problems I see are that I can't restrict the maximum size as I can with VARCHAR, and that the input data could contain ',' and '.'. Any comments?
Just use BIGINT, it ranges from -9223372036854775808 to 9223372036854775807 which should be enough for your application.
Assuming that you're referring to South African national ID numbers, which according to Wikipedia always have 13 digits, then I would go for CHAR(13) with a CHECK constraint (a CLR user-defined data type might also be an option).
The main reason is that the 'number' is not a number, it's an ID. You can't add, subtract, multiply etc. the values so there is no benefit in using a numeric data type. Furthermore, the ID is composed of components that have their own meaning, so being able to parse them out is presumably important (and easier when using character data types).
In fact, depending on how you use this data, you could also add columns that store the individual components of the ID (DOB, sequence, citizenship), either as computed columns or real columns. This could be convenient for querying and reporting (and indexing), especially if you converted the DOB to a date or datetime column.
I would indeed use VARCHAR with a CHECK that matches the format. You can even be more sophisticated if there is internal validation, e.g. a check digit. Now you are all set for other countries that have an alphabetic character, or if you need to handle a leading zero.
I wouldn't use an integer unless it makes sense to do some sort of arithmetic on the field, which is almost certainly not true here.
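As a sketch of the kind of validation a CHECK constraint or application code might perform, here is a Python version. It rests on assumptions that are not stated in the answers above: that the layout is YYMMDD, then a 4-digit sequence, then a citizenship digit, and that the final digit is a Luhn check digit. The function names and the sample ID are illustrative only.

```python
def luhn_ok(digits: str) -> bool:
    """Luhn checksum: double every second digit from the right, sum, test mod 10."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0

def parse_id(id_number: str) -> dict:
    """Validate a 13-digit ID held as text and slice out its components."""
    if len(id_number) != 13 or any(c not in "0123456789" for c in id_number):
        raise ValueError("expected exactly 13 digits")
    if not luhn_ok(id_number):
        raise ValueError("check digit mismatch")
    # Component positions are an assumption about the ID layout.
    return {
        "dob_yymmdd": id_number[0:6],
        "sequence": id_number[6:10],
        "citizenship": id_number[10],
    }

print(parse_id("8001015009087"))
```

Slicing by position like this is straightforward on a CHAR(13) value and awkward on a BIGINT, which is the practical side of the argument above.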
You could use money as well, although it appears you only get 4 digits after the decimal place. The money type is 8 bytes, giving you a range from -922,337,203,685,477.5808 to 922,337,203,685,477.5807.
declare @num as money
select @num = '1,300,000.45'
select @num
Results in:
1300000.45
The parsing of commas and periods might be dependent on your specific culture settings, although I don't know that for sure.

Best data type to store money values in MySQL

I want to store many records in a MySQL database. All of them contains money values. But I don't know how many digits will be inserted for each one.
Which data type do I have to use for this purpose?
VARCHAR or INT (or other numeric data types)?
Since money needs an exact representation don't use data types that are only approximate like float. You can use a fixed-point numeric data type for that like
decimal(15,2)
15 is the precision (total length of value including decimal places)
2 is the number of digits after decimal point
See MySQL Numeric Types:
These types are used when it is important to preserve exact precision, for example with monetary data.
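The difference between approximate and exact types can be seen outside the database too. This Python sketch uses the standard `decimal` module as a stand-in for a DECIMAL column and the built-in float as a stand-in for FLOAT/DOUBLE; it is an analogy, not MySQL's own arithmetic.

```python
from decimal import Decimal

# Binary floating point cannot represent 0.10 exactly, so errors accumulate.
total_float = sum([0.10] * 3)
print(total_float == 0.3)                 # False

# Decimal values built from text are exact in base 10.
total_dec = sum([Decimal("0.10")] * 3, Decimal("0"))
print(total_dec == Decimal("0.30"))       # True
```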
You can use DECIMAL or NUMERIC; both are the same. From the MySQL manual:
The DECIMAL and NUMERIC types store exact numeric data values. These types are used when it is important to preserve exact precision, for example with monetary data. In MySQL, NUMERIC is implemented as DECIMAL, so the following remarks about DECIMAL apply equally to NUMERIC.
i.e. DECIMAL(10,2)
I prefer to use BIGINT and store the values multiplied by 100, so that they become integers.
For example, to represent a currency value of 93.49, the value is stored as 9349; when displaying the value, we divide by 100. This occupies less storage space.
Caution:
We rarely multiply one currency value by another, but if we do, the result must be divided by 100 before it is stored, so that it returns to the proper scale.
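A minimal Python sketch of this scaled-integer approach follows. The helper names are invented for the example, it handles only non-negative amounts with at most two decimal places, and the integer division in `multiply_scaled` truncates rather than rounds.

```python
def to_cents(amount: str) -> int:
    """'93.49' -> 9349, parsed from text so no float is involved."""
    units, _, frac = amount.partition(".")
    return int(units) * 100 + int((frac + "00")[:2])

def to_display(cents: int) -> str:
    """9349 -> '93.49'."""
    return f"{cents // 100}.{cents % 100:02d}"

def multiply_scaled(a_cents: int, b_cents: int) -> int:
    """The raw product has scale 100 * 100; divide once to get back to cents."""
    return (a_cents * b_cents) // 100

price = to_cents("93.49")
print(price)                                            # 9349
print(to_display(price))                                # 93.49
print(to_display(multiply_scaled(to_cents("2"), price)))  # 186.98
```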
It depends on your need.
Using DECIMAL(10,2) usually is enough but if you need a little bit more precise values you can set DECIMAL(10,4).
If you work with big values replace 10 with 19.
If your application needs to handle money values up to 99,999,999,999.99 (just under one hundred billion), then this should work: DECIMAL(13,2)
If you need to comply with GAAP (Generally Accepted Accounting Principles), then use: DECIMAL(13,4)
Usually you should sum your money values at DECIMAL(13,4) before rounding the output to DECIMAL(13,2).
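Why the order of summing and rounding matters can be shown with a small Python sketch. The line items are made up, and half-up rounding is an assumption chosen for the illustration.

```python
from decimal import Decimal, ROUND_HALF_UP

CENT = Decimal("0.01")
line_items = [Decimal("0.3333")] * 3   # made-up amounts held at 4 decimal places

# Sum at 4 decimal places, round once at the end.
sum_then_round = sum(line_items, Decimal("0")).quantize(CENT, rounding=ROUND_HALF_UP)

# Round every item to 2 decimal places first, then sum.
round_then_sum = sum(
    (x.quantize(CENT, rounding=ROUND_HALF_UP) for x in line_items),
    Decimal("0"),
)

print(sum_then_round)   # 1.00
print(round_then_sum)   # 0.99
```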
At the time this question was asked, nobody was thinking about the price of Bitcoin. In the case of BTC, DECIMAL(15,2) is probably insufficient. If Bitcoin rises to $100,000 or more, we will need at least DECIMAL(18,9) to support cryptocurrencies in our apps.
DECIMAL(18,9) takes 8 bytes of space in MySQL (4 bytes per 9 digits, counted separately for the integer part and the fractional part).
We use double.
*gasp*
Why?
Because it can represent any 15 digit number with no constraints on where the decimal point is. All for a measly 8 bytes!
So it can represent:
0.123456789012345
123456789012345.0
...and anything in between.
This is useful because we're dealing with global currencies, and double can store the various numbers of decimal places we'll likely encounter.
A single double field can represent 999,999,999,999,999 in Japanese yen, 9,999,999,999,999.99 in US dollars and even 9,999,999.99999999 in bitcoins.
If you try doing the same with decimal, you need decimal(30, 15) which costs 14 bytes.
Caveats
Of course, using double isn't without caveats.
However, it's not loss of accuracy as some tend to point out. Even though double itself may not be internally exact to the base 10 system, we can make it exact by rounding the value we pull from the database to its significant decimal places. If needed that is. (e.g. If it's going to be outputted, and base 10 representation is required.)
The caveats are, any time we perform arithmetic with it, we need to normalize the result (by rounding it to its significant decimal places) before:
Performing comparisons on it.
Writing it back to the database.
Another caveat is that, unlike decimal(m, d), where the database prevents programs from inserting a number with more than m digits, no such validation exists with double. A program could insert a user-supplied value of 20 digits, and it would be silently recorded as an inaccurate amount.
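Both caveats can be reproduced with Python's built-in float, which is a 64-bit double. This is only a sketch of the behaviour described above, not the answerer's actual code.

```python
# Caveat 1: normalize (round) before comparing or writing back.
a = 0.1 + 0.2
print(a == 0.3)                        # False: raw doubles differ in the last bits
print(round(a, 2) == round(0.3, 2))    # True: equal once rounded to 2 places

# Caveat 2: a 20-digit value is accepted but silently loses accuracy.
big = 12345678901234567890
print(int(float(big)) == big)          # False
```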
If GAAP Compliance is required or you need 4 decimal places:
DECIMAL(13, 4)
Which supports a max value of:
$999,999,999.9999
Otherwise, if 2 decimal places is enough:
DECIMAL(13,2)
src: https://rietta.com/blog/best-data-types-for-currencymoney-in/
Indeed this relies on the programmer's preferences. I personally use: numeric(15,4) to conform to the Generally Accepted Accounting Principles (GAAP).
Try using
Decimal(19,4)
this usually works with every other DB as well
Storing money as BIGINT multiplied by 100 or more with the reason to use less storage space makes no sense in all "normal" situations.
To stay aligned with GAAP it is sufficient to store currencies in DECIMAL(13,4)
MySQL manual reads that it needs 4 bytes per 9 digits to store DECIMAL.
https://dev.mysql.com/doc/refman/8.0/en/precision-math-decimal-characteristics.html
DECIMAL(13,4) represents 9 digits + 4 fraction digits (decimal places) => 4 + 2 bytes = 6 bytes
compare to 8 bytes required to store BIGINT.
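The storage rule from the MySQL manual page linked above (4 bytes per group of 9 digits, with the integer and fractional parts packed separately and leftover digits taking 1 to 4 bytes) can be turned into a small calculator. The function names are made up for this sketch.

```python
# Bytes needed for the leftover digits (fewer than 9) of either part.
LEFTOVER_BYTES = {0: 0, 1: 1, 2: 1, 3: 2, 4: 2, 5: 3, 6: 3, 7: 4, 8: 4}

def part_bytes(digits: int) -> int:
    """Bytes for one side of the decimal point."""
    return (digits // 9) * 4 + LEFTOVER_BYTES[digits % 9]

def decimal_storage_bytes(m: int, d: int) -> int:
    """Bytes for DECIMAL(m, d): integer part plus fractional part."""
    return part_bytes(m - d) + part_bytes(d)

for m, d in [(13, 4), (15, 2), (18, 9), (19, 4), (30, 15)]:
    print(f"DECIMAL({m},{d}): {decimal_storage_bytes(m, d)} bytes")
```

By this rule DECIMAL(13,4) needs 6 bytes and DECIMAL(19,4) needs 9, against a fixed 8 bytes for BIGINT.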
There are 2 valid options:
use integer amount of currency minor units (e.g. cents)
represent amount as decimal value of the currency
In both cases you should use decimal data type to have enough significant digits. The difference can be in precision:
even for integer amount of minor units it's better to have extra precisions for accumulators (account for accumulating 10% fees from 1-cent operations)
different currencies have different number of decimals, cryptocurrencies have up to 18 decimals
The number of decimals can change over time due to inflation
Source and more caveats and facts.
Multiply by 10000 and store as BIGINT, like "Currency" in Visual Basic and Office. See https://msdn.microsoft.com/en-us/library/office/gg264338.aspx

MySQL Type for Storing a Year: Smallint or Varchar or Date?

I will be storing a year in a MySQL table: Is it better to store this as a smallint or varchar? I figure that since it's not a full date, that the date format shouldn't be an answer but I'll include that as well.
Smallint? varchar(4)? date? something else?
Examples:
2008
1992
2053
I would use the YEAR(4) column type... but only if the years expected are within the range 1901 and 2155... otherwise, see Gambrinus's answer.
I'd go for SMALLINT. As far as I know, VARCHAR would take more space, as would DATE. My second choice would be DATE.
My own experience is with Oracle, which does not have a YEAR data type, but I have always tried to avoid using numeric data types for elements just because they are comprised only of digits. (So this includes phone numbers, social security numbers, zip codes as well, as additional examples).
My own rule of thumb is to consider what the data is used for. If you will perform mathematical operations on it then store it as a number. If you will perform string functions (eg. "Take the last four characters of the SSN" or "Display the phone number as (XXX) XXX-XXXX") then it's a string.
An additional clue is the requirement to store leading zeroes as part of the number.
Furthermore, and despite being commonly referred to as a phone "number", they frequently contain letters to indicate the presence of an extension number as a suffix. Similarly, a Standard Book Number potentially ended in an "X" as a "check digit", and an International Standard Serial Number can end with an "X" (despite the ISSN International Centre repeatedly referring to it as an 8-digit code http://www.issn.org/understanding-the-issn/what-is-an-issn/).
Formatting of phone numbers in an international context is tricky, of course, and conforming to E.164 requires that country calling codes are prefixed with a "+".