I'm a bit confused by the MySQL documentation with regard to the storage requirements for various fields. I'm currently redesigning a database and I'm seeing TINYINT(4) as the data type. Previously I've never given any thought to this, but will this require one byte and just truncate the last digit off the number, or will it actually require 2 bytes and be converted to a SMALLINT internally?
EDIT - I know that the number represents the number of digits that will be displayed, like TINYINT(2) will only show 2 digits or whatever, but what if that number is more than the data type can actually hold?
As you stated correctly, the TINYINT type uses 1 byte of storage for 256 possible integer values: -128 through 127 signed, or 0 through 255 UNSIGNED. See that -128? It takes four characters to display, sign included, and that (along with ZEROFILL) is the reason for the (4). But the column won't get converted to a SMALLINT automatically, so choose your data type accordingly.
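The arithmetic behind that answer can be sketched in a few lines. This is Python rather than SQL so it runs anywhere; the helper name int_range is made up for illustration, and only the one-byte size comes from the answer above.

```python
def int_range(num_bytes, unsigned=False):
    """Return (min, max) for a two's-complement integer of num_bytes bytes."""
    bits = 8 * num_bytes
    if unsigned:
        return 0, 2**bits - 1
    return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1


lo, hi = int_range(1)                     # TINYINT
ulo, uhi = int_range(1, unsigned=True)    # TINYINT UNSIGNED

# Characters needed to print any signed TINYINT, minus sign included
widest = max(len(str(lo)), len(str(hi)))

print(lo, hi, ulo, uhi, widest)
```

The last number printed is the width of "-128", which is where a display width of 4 for a signed TINYINT comes from.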
See this link, this blog deals with the topic (as mentioned in answer here).
I have created a table and I want to store only numbers from 1 to 60 in one of its fields.
What data type should I use for that field? Should I use TINYINT(4)?
"Best" data type is open to interpretation. Here are three options:
numeric(2, 0)
varchar(2)
tinyint(2)
These have different sizes, but that doesn't make them "best" -- except under certain circumstances where storage space is a primary concern. I am guessing that your "numbers" are not really numbers, but are codes of some sort that vary from 1 to 60.
If these are referencing a reference table, then tinyint makes sense as the key, because keys are often numbers. However, I often use int for such keys. The extra three bytes usually have little impact on performance.
If the code is zero-padded (so '01' rather than '1'), then char(2) is the appropriate type. It might take more space, but it accurately represents the value.
If these are indeed numbers -- like addition or multiplication is defined -- then tinyint is definitely the most appropriate type.
Yes, TINYINT is the best option; its range is -128 to 127.
Documentation Link: https://dev.mysql.com/doc/refman/8.0/en/integer-types.html
I was using the MySQL INT(11) data type, but had to change to using CHAR(45) because I was dealing with large integers.
Now the CHAR is allowing empty strings to be submitted rather than return errors like it should. Is there another data type I can use?
You could use BIGINT or BIGINT UNSIGNED.
In any case, you should do some validation at the application level, so that the user sees a meaningful error message when the submitted data is not valid. This holds regardless of the data type used.
I would use BIGINT; check this page: http://dev.mysql.com/doc/refman/5.0/en/numeric-type-overview.html
If INT doesn't have the range you need, you can 'upgrade' to BIGINT.
Generally speaking, if the value is a number, you should stick to the numeric data types; not doing so usually leads to hard-to-find bugs and issues.
A half step would be specifying INT as UNSIGNED (this basically doubles the positive range by not including the negative side of the number line in the data space).
See the documentation for more info on numeric types.
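As a rough illustration of the "upgrade to BIGINT" and "UNSIGNED as a half step" advice, here is a small Python sketch that picks the smallest integer type covering a range. The type names and byte sizes are MySQL's; the function smallest_type and its signed-before-unsigned preference are assumptions made for this example, not anything MySQL does for you.

```python
# (name, storage bytes) for MySQL's integer types, smallest first
INTEGER_TYPES = [
    ("TINYINT", 1),
    ("SMALLINT", 2),
    ("MEDIUMINT", 3),
    ("INT", 4),
    ("BIGINT", 8),
]


def smallest_type(min_value, max_value):
    """Return the smallest type holding [min_value, max_value], or None."""
    for name, size in INTEGER_TYPES:
        bits = 8 * size
        # Try the signed variant first, then UNSIGNED as the "half step"
        if -(2 ** (bits - 1)) <= min_value and max_value <= 2 ** (bits - 1) - 1:
            return name
        if min_value >= 0 and max_value <= 2**bits - 1:
            return name + " UNSIGNED"
    return None  # too large for any integer type


print(smallest_type(1, 60))
print(smallest_type(0, 3_000_000_000))
print(smallest_type(0, 9_999_999_999))
```

A value that returns None here is the case where a non-integer type really is needed; everything else fits a numeric column.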
I seem to see a lot of people arbitrarily assigning large sizes to primary/foreign key fields in their MySQL schemas, such as INT(11) and even BIGINT(20) as WordPress uses.
Now correct me if I'm wrong, but even an INT(4) would support (unsigned) values up to over 4 billion. Change it to INT(5) and you allow for values up to a quadrillion, which is more than you would ever need, unless possibly you're storing geodata at NASA/Google, which I'm sure most of us aren't.
Is there a reason people use such large sizes for their primary keys? Seems like a waste to me...
The size is neither bits nor bytes. It's just the display width, that is used when the field has ZEROFILL specified.
and
INT[(M)] [UNSIGNED] [ZEROFILL] A normal-size integer. The signed range is -2147483648 to 2147483647. The unsigned range is 0 to 4294967295.
See this explanation.
I don't see any good reason to use a number larger than a 32-bit integer for indexing data in normal business-sized databases. Most of them have maybe millions of records (or that order of magnitude).
Is there any performance difference between the decimal(10,0) unsigned type and the int(10) unsigned type?
It may depend on the version of MySQL you are using. See here.
Prior to MySQL 5.0.3, the DECIMAL type was stored as a string and would typically be slower.
However, since MySQL 5.0.3 the DECIMAL type is stored in a binary format so with the size of your DECIMAL above, there may not be much difference in performance.
The main performance issue would have been the amount of space taken up by the different types (with DECIMAL being slower). With MySQL 5.0.3+ this appears to be less of an issue; however, if you will be performing numeric calculations on the values as part of the query, there may be some performance difference. This may be worth testing, as there is no indication in the documentation that I can see.
Edit:
With regard to the int(10) unsigned, I took this at face value as just being a 4-byte int. However, this has a maximum value of 4294967295, which strictly doesn't provide the same range of numbers as a DECIMAL(10,0) unsigned.
As @Unreason pointed out, you would need to use a BIGINT to cover the full range of 10-digit numbers, pushing the size up to 8 bytes.
A common mistake when specifying numeric column types in MySQL is to think that the number in the brackets has an impact on the size of the number you can store. It doesn't. The number range is purely based on the column type and whether it is signed or unsigned. The number in the brackets is for display purposes in results and has no impact on the values stored in the column. It will also have no impact on the display of the results unless you specify the ZEROFILL option on the column as well.
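The behaviour described above can be mimicked in Python. This is only a sketch of the display rule, not MySQL code: the function displayed is hypothetical, and it assumes non-negative values (ZEROFILL columns are unsigned).

```python
def displayed(value, width, zerofill=False):
    """Mimic how a non-negative integer with display width `width` is shown."""
    text = str(value)
    # Without ZEROFILL the width has no visible effect at all.
    # With ZEROFILL the value is left-padded with zeros up to the width,
    # but a value wider than the width is never truncated.
    return text.zfill(width) if zerofill else text


print(displayed(42, 10, zerofill=True))     # padded to 10 characters
print(displayed(12345, 2, zerofill=True))   # wider than 2, shown in full
print(displayed(42, 10))                    # width ignored
```

In every case the stored value is unchanged; only the printed form differs.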
According to the MySQL data type storage requirements, your columns will require:
DECIMAL(10,0): 4 bytes for the first 9 digits and 1 byte for the remaining 10th digit, so five bytes in total (assuming my reading of the documentation is correct).
INT(10): to hold every 10-digit value you will need BIGINT, which is 8 bytes.
The difference is that the decimal is packed, and some operations on such a data type might be slower than on normal INT types, which map directly to machine-represented numbers.
Still I would do your own tests to confirm the above reasoning.
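The byte counts above can be reproduced with a short calculation. The sketch below follows the packing rule in the MySQL storage requirements documentation (4 bytes per group of nine digits, fewer bytes for leftover digits, integer and fractional parts packed separately); the function names are invented for this example.

```python
# Bytes needed for the digits left over after packing groups of nine
LEFTOVER_BYTES = {0: 0, 1: 1, 2: 1, 3: 2, 4: 2, 5: 3, 6: 3, 7: 4, 8: 4, 9: 4}

BIGINT_BYTES = 8


def part_bytes(digits):
    """Bytes for one side of the decimal point."""
    full_groups, leftover = divmod(digits, 9)
    return 4 * full_groups + LEFTOVER_BYTES[leftover]


def decimal_bytes(precision, scale):
    """Storage for DECIMAL(precision, scale); both parts are packed separately."""
    return part_bytes(precision - scale) + part_bytes(scale)


size = decimal_bytes(10, 0)
extra_percent = (BIGINT_BYTES - size) * 100 // size
print(size, BIGINT_BYTES, extra_percent)
```

This only compares storage size; it says nothing about how fast arithmetic on either type is, which still needs to be measured.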
EDIT:
I noticed that I did not elaborate on the obvious point: assuming the above logic is sound, the BIGINT variant needs 60% more space (8 bytes versus 5).
However this does not directly translate to penalties due to the fact that data is normally not written byte by byte. In case of selects/updates of many rows you should see the performance loss/gain, but in case of selecting/updating a small number of rows the filesystem will fetch blocks from the disk(s) which will normally get/write multiple columns anyway.
The size (and speed) of indexes might be more directly impacted.
However, the question on how the packing influences various operations still remains open.
According to this similar question, yes, potentially there is a big performance hit because of difference in the way DECIMAL and INT are treated and threaded into the CPU when doing calculations.
See: Is there a performance hit using decimal data types (MySQL / Postgres)
I doubt such a difference can be performance related at all.
Most performance issues are tied to proper database design and an indexing plan, with server/hardware tuning as the next level.
In which cases would you use which? Is there much of a difference? Which is typically used by persistence engines to store booleans?
A TINYINT is an 8-bit integer value; a BIT field can store between 1 bit, BIT(1), and 64 bits, BIT(64). For boolean values, BIT(1) is pretty common.
From Overview of Numeric Types:
BIT[(M)]
A bit-field type. M indicates the number of bits per value, from 1 to 64. The default is 1 if M is omitted.
This data type was added in MySQL 5.0.3 for MyISAM, and extended in 5.0.5 to MEMORY, InnoDB, BDB, and NDBCLUSTER. Before 5.0.3, BIT is a synonym for TINYINT(1).
TINYINT[(M)] [UNSIGNED] [ZEROFILL]
A very small integer. The signed range is -128 to 127. The unsigned range is 0 to 255.
Additionally consider this:
BOOL, BOOLEAN
These types are synonyms for TINYINT(1). A value of zero is considered false. Non-zero values are considered true.
All these theoretical discussions are great, but in reality, at least if you're using MySQL (and really for SQL Server as well), it's best to stick with non-binary data for your booleans, for the simple reason that it's easier to work with when you're outputting the data, querying and so on. This is especially important if you're trying to achieve interoperability between MySQL and SQL Server (i.e. you sync data between the two), because the handling of the BIT data type differs between them. So in practice you will have far fewer hassles if you stick with a numeric data type. For MySQL I would recommend sticking with BOOL or BOOLEAN, which gets stored as TINYINT(1). Even the way MySQL Workbench and MySQL Administrator display the BIT data type isn't nice (it's a little symbol for binary data). So be practical and save yourself the hassles (and unfortunately I'm speaking from experience).
BIT should only allow 0 and 1 (and NULL, if the field is not defined as NOT NULL). TINYINT(1) allows any value that can be stored in a single byte, -128..127 or 0..255 depending on whether or not it's unsigned (the 1 shows that you intend to only use a single digit, but it does not prevent you from storing a larger value).
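To make the contrast concrete, here is a small Python sketch of what each type accepts. The helpers bit_range, bit_storage_bytes and is_true are made up for illustration; the (M+7)/8 figure is the approximate storage size the MySQL documentation gives for BIT(M).

```python
def bit_range(m):
    """Values a BIT(m) column can hold."""
    if not 1 <= m <= 64:
        raise ValueError("M must be between 1 and 64")
    return 0, 2**m - 1


def bit_storage_bytes(m):
    """Approximate storage for BIT(m): (m + 7) / 8 bytes, rounded down."""
    return (m + 7) // 8


def is_true(value):
    """BOOL/TINYINT(1) convention: zero is false, any non-zero value is true."""
    return value != 0


print(bit_range(1), bit_storage_bytes(1))   # only 0 and 1 fit, in one byte
print(is_true(0), is_true(1), is_true(57))  # TINYINT(1) accepts 57 and treats it as true
```

Both BIT(1) and TINYINT(1) occupy one byte here, so the choice is about which values are permitted and how tools display them, not about space.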
For versions older than 5.0.3, BIT is interpreted as TINYINT(1), so there's no difference there.
BIT has a "this is a boolean" semantic, and some apps will consider TINYINT(1) the same way (due to the way MySQL used to treat it), so apps may format the column as a check box if they check the type and decide upon a format based on that.
Might be wrong but:
TINYINT is an integer between 0 and 255 when unsigned (-128 to 127 when signed)
BIT(1) is either 1 or 0
Therefore, to me, BIT is the choice for booleans
From my experience, BIT has problems on Linux systems (Ubuntu, for example).
I developed my database on Windows, and after I deployed everything on Linux, I had problems with queries that inserted into or selected from tables that had BIT columns.
BIT is not safe for now.
I changed to TINYINT(1) and it worked perfectly. You only need a value to differentiate between 1 and 0, and TINYINT(1) is fine for that.