bit(8) vs tinyint - mysql

In the eyes of the storage engine, is there any difference at all between bit(8) and tinyint?
or bit(16) vs smallint?
or bit(24) vs mediumint?
or bit(32) vs int?
What I want to know is whether they are synonymous, i.e. whether the engine treats one like the other.

First off: I don't know the internals of how the individual engines treat bit fields when running queries. I would be curious whether one of the two column types is faster to index or query.
From a raw storage standpoint, these are the storage requirements for numeric types:
TINYINT = 1 byte
SMALLINT = 2 bytes
MEDIUMINT = 3 bytes
INT, INTEGER = 4 bytes
BIGINT = 8 bytes
FLOAT(p) = 4 bytes if 0 <= p <= 24, 8 bytes if 25 <= p <= 53
FLOAT = 4 bytes
DOUBLE [PRECISION], REAL = 8 bytes
DECIMAL(M,D), NUMERIC(M,D) = Varies; see following discussion
BIT(M) = approximately (M+7)/8 bytes
The exception is the NDBCLUSTER storage engine, which requires 4 bytes per stored record for BIT columns (unless you have multiple BIT columns, which are packed together into that 4-byte minimum).
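To make the comparison in the question concrete, here is a minimal Python sketch that applies the (M+7)/8 formula from the list above, assuming it means integer division. The function name `bit_storage_bytes` and the `INTEGER_BYTES` table are only illustrative. It covers raw storage size only and says nothing about how an engine indexes or queries the column.

```python
# Byte sizes of the integer types, copied from the list above.
INTEGER_BYTES = {"TINYINT": 1, "SMALLINT": 2, "MEDIUMINT": 3, "INT": 4, "BIGINT": 8}


def bit_storage_bytes(m: int) -> int:
    """Approximate storage for BIT(M): (M+7)/8 bytes, using integer division."""
    if not 1 <= m <= 64:
        raise ValueError("BIT(M) accepts M from 1 to 64")
    return (m + 7) // 8


# Compare each BIT width from the question with the matching integer type.
for m, int_type in [(8, "TINYINT"), (16, "SMALLINT"), (24, "MEDIUMINT"), (32, "INT")]:
    print(f"BIT({m}): {bit_storage_bytes(m)} byte(s), "
          f"{int_type}: {INTEGER_BYTES[int_type]} byte(s)")
```

By this formula the pairs from the question occupy the same number of bytes, so any difference between them would have to come from how the type behaves, not from its size.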
Edit:
According to this page on numeric types, TINYINT(1) and BIT were synonymous before version 5.0.3 for MyISAM and before 5.0.5 for MEMORY, InnoDB, BDB, and NDBCLUSTER. This implies they no longer are.

Related

How to set the maximal size of BLOB column in MySQL?

I want to set the maximal size of a BLOB column to be up to 900KB.
Is there a way similar to the syntax of the other String Data types - for example
c CHAR(7),
vc VARCHAR(50),
pic BLOB (???)
to do this?
From the storage requirements section of the manual, the max size of the following fields:
TINYBLOB : L < 2^8 = 256 Bytes
BLOB : L < 2^16 = 65,536 Bytes
MEDIUMBLOB : L < 2^24 = 16,777,216 Bytes
LONGBLOB : L < 2^32 = 4,294,967,296 Bytes
In order to store 900KB, you would need to use a MEDIUMBLOB at the very minimum.
As far as I know, you cannot specify your own size for the field.
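The lookup behind that answer can be sketched in a few lines of Python. The limits are the ones listed above; the helper name `smallest_blob_type` is made up for this example.

```python
# Exclusive upper bounds on the value length L, from the list above.
BLOB_LIMITS = [
    ("TINYBLOB", 2**8),
    ("BLOB", 2**16),
    ("MEDIUMBLOB", 2**24),
    ("LONGBLOB", 2**32),
]


def smallest_blob_type(size_in_bytes: int) -> str:
    """Return the smallest BLOB type whose limit exceeds size_in_bytes."""
    if size_in_bytes < 0:
        raise ValueError("size must not be negative")
    for name, limit in BLOB_LIMITS:
        if size_in_bytes < limit:
            return name
    raise ValueError("value is too large for any BLOB type")


print(smallest_blob_type(900 * 1024))  # MEDIUMBLOB
```

Since the column type only fixes the upper bound, a 900 KB cap would have to be enforced somewhere else, for example in the application.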
With our IBM Database (AS400), it is possible to provide a size to a Blob field.
fieldName Blob(size in bytes)
The maximum size is then visible like any other length.
[Screenshot from DbVisualizer]

MySQL TEXT memory allocation

Does anybody know how MySQL allocates disk space for fields like "TEXT" or "BLOB"?
For example, what happens when I insert a 10 KB string into a "TEXT" column? Is the entire 64 KB allocated, or only 10 KB?
This is explained in the documentation: http://dev.mysql.com/doc/refman/5.7/en/storage-requirements.html
BLOB, TEXT L + 2 bytes, where L < 2^16
MEDIUMBLOB, MEDIUMTEXT L + 3 bytes, where L < 2^24
LONGBLOB, LONGTEXT L + 4 bytes, where L < 2^32
Variable-length string types are stored using a length prefix plus
data. The length prefix requires from one to four bytes depending on
the data type, and the value of the prefix is L (the byte length of
the string). For example, storage for a MEDIUMTEXT value requires L
bytes to store the value plus three bytes to store the length of the
value.
So in short, only the actual data plus the length prefix is stored; the full 64 KB is not allocated.
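As a rough check of the "L + prefix" rule for the 10 KB example from the question, here is a small Python sketch. The function name `stored_bytes` is hypothetical, and it assumes the string is encoded as UTF-8; with another character set the byte length L would differ.

```python
# Length-prefix sizes in bytes, from the table quoted above.
LENGTH_PREFIX_BYTES = {"TEXT": 2, "MEDIUMTEXT": 3, "LONGTEXT": 4}


def stored_bytes(column_type: str, value: str) -> int:
    """Bytes needed for one value: L bytes of data plus the length prefix."""
    data = value.encode("utf-8")  # assumption: UTF-8 column character set
    prefix = LENGTH_PREFIX_BYTES[column_type]
    if len(data) >= 2 ** (8 * prefix):
        raise ValueError(f"value does not fit in {column_type}")
    return len(data) + prefix


ten_kb = "a" * (10 * 1024)
print(stored_bytes("TEXT", ten_kb))  # 10242
```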

Best datatype to store a long number made of 0 and 1

I want to know what's the best datatype to store these:
null
0
/* the length of other numbers is always 7 digits */
0000000
0000001
0000010
0000011
/* and so on */
1111111
I have tested INT and it works. But there may be a better datatype, because all my numbers are made of only the digits 0 and 1. Is there one?
What you are showing are binary numbers
0000000 = 0
0000001 = 2^0 = 1
0000010 = 2^1 = 2
0000011 = 2^0 + 2^1 = 3
So simply store these numbers in an integer data type (which is internally stored with bits as shown of course). You could use BIGINT for this, as recommended in the docs for bitwise operations (http://dev.mysql.com/doc/refman/5.7/en/bit-functions.html).
Here is how to set only flag n (this overwrites the mask, so all other flags are cleared):
UPDATE mytable
SET bitmask = POW(2, n-1)
WHERE id = 12345;
Here is how to add a flag:
UPDATE mytable
SET bitmask = bitmask | POW(2, n-1)
WHERE id = 12345;
Here is how to check a flag:
SELECT *
FROM mytable
WHERE bitmask & POW(2, n-1);
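The same three operations can be tried outside the database. The following Python sketch mirrors the queries above, using `1 << (n - 1)` in place of POW(2, n-1); the function names are made up for illustration.

```python
def flag_mask(n: int) -> int:
    """Mask with only flag n set (flags are numbered from 1), i.e. 2^(n-1)."""
    if n < 1:
        raise ValueError("flags are numbered from 1")
    return 1 << (n - 1)


def set_only_flag(n: int) -> int:
    """Like SET bitmask = POW(2, n-1): every other flag is cleared."""
    return flag_mask(n)


def add_flag(bitmask: int, n: int) -> int:
    """Like SET bitmask = bitmask | POW(2, n-1): existing flags are kept."""
    return bitmask | flag_mask(n)


def has_flag(bitmask: int, n: int) -> bool:
    """Like WHERE bitmask & POW(2, n-1)."""
    return bool(bitmask & flag_mask(n))


mask = set_only_flag(1)       # binary 0000001
mask = add_flag(mask, 2)      # binary 0000011
print(format(mask, "07b"))    # 0000011
print(has_flag(mask, 2), has_flag(mask, 3))  # True False
```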
But as mentioned in the comments: In a relational database you usually use columns and tables to show attributes and relations rather than an encoded flag list.
As you've said in a comment, the values 01 and 1 should not be treated as equivalent (which rules out binary where they would be), so you could just store as a string.
It actually might be more efficient than storing as a byte + offset, since that would take up 9 characters, whereas you need a maximum of 7 characters.
Simply store it as a varchar(7) or whatever the equivalent is in MySQL. No need to be clever about it, especially since you are interested in extracting positional values.
Don't forget to bear in mind that this takes up a lot more storage than storing as a bit(7), since you are essentially storing 7 bytes (or whatever the storage unit is for each level of precision in a varchar), not 7 bits.
If that's not an issue then no need to over-engineer it.
You could convert the binary number to a string, with an additional byte to specify the number of leading zeros.
Example - the representation of 010:
The numeric value in hex is 0x02.
There is one leading zero, so the first byte is 0x01.
The result string is 0x01,0x02.
With the same method, 1010010 should be represented as 0x00,0x52.
Seems to me pretty efficient.
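A minimal Python sketch of this two-byte scheme, assuming inputs of at most 7 binary digits as in the question. The names `encode` and `decode` are hypothetical, and an all-zero input is treated as consisting entirely of leading zeros.

```python
def encode(bits: str) -> bytes:
    """Encode a string of 0/1 digits as (number of leading zeros, numeric value)."""
    if not bits or len(bits) > 7 or set(bits) - {"0", "1"}:
        raise ValueError("expected 1 to 7 binary digits")
    leading_zeros = len(bits) - len(bits.lstrip("0"))
    return bytes([leading_zeros, int(bits, 2)])


def decode(data: bytes) -> str:
    """Rebuild the original digit string from the two bytes."""
    leading_zeros, value = data
    digits = format(value, "b") if value else ""
    return "0" * leading_zeros + digits


print(encode("010").hex())      # 0102
print(encode("1010010").hex())  # 0052
print(decode(encode("0000011")))  # 0000011
```

Because the leading-zero count is stored separately, 01 and 1 produce different encodings, which is the property the question asks for.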
Not sure if it is the best datatype, but you may want to try BIT:
MySQL, PostgreSQL
There are also some useful bit functions in MySQL.

What is the Max limit of mysql text type

I am working on an app that allows people to upload data via PDF files. After reading the PDF with my app, I would also like to store all characters in the PDF, from the first page to the last.
My fear is that a pdf file can be up to 80mb which can contain over 1 billion characters.
Can mysql handle such large amount of characters?
MySQL data storage requirements can be found here: MySQL5 storage requirements
There I find this table (L = length of string):
TINYBLOB, TINYTEXT L + 1 bytes, where L < 2^8 = 256 bytes
BLOB, TEXT L + 2 bytes, where L < 2^16 = 65,536 bytes (64 KB)
MEDIUMBLOB, MEDIUMTEXT L + 3 bytes, where L < 2^24 = 16,777,216 bytes (16 MB)
LONGBLOB, LONGTEXT L + 4 bytes, where L < 2^32 = 4,294,967,296 bytes (4 GB)
So for an 80 MB file, you need a LONGTEXT. For a PDF I would advise a LONGBLOB type, since it is a binary format.
For the record: Eggyal has a point that it is better NOT to store the PDF in the database, but on disk. So I would advise against doing it via the database; if you really need to put it in MySQL, use a LONGBLOB.
Check this link: http://dev.mysql.com/doc/refman/5.7/en/storage-requirements.html
TINYTEXT 255 bytes
TEXT 65,535 bytes ~64kb
MEDIUMTEXT 16,777,215 bytes ~16MB
LONGTEXT 4,294,967,295 bytes ~4GB

MySQL bitwise AND 256-bit binary values

I'm intending on storing a 256-bit long binary value in a MySQL table column.
Which column type should I be using (blob?) such that I can run bitwise operations against it (example of an AND would be ideal).
I don't think you can perform bitwise operations on 256-bit values at the SQL level, as the documentation clearly states that:
MySQL uses BIGINT (64-bit) arithmetic for bit operations, so these operators have a maximum range of 64 bits.
http://dev.mysql.com/doc/refman/5.5/en/bit-functions.html#operator_bitwise-and
As for storing those values, TINYBLOB is possible, but my personal preference would go to simply BINARY(32) (a binary string of 32 bytes -- 256-bits).
While writing this, one trick came to my mind. If we are limited to 64-bit values (BIGINT UNSIGNED), why not store your 256-bit value as four 64-bit words? Not very elegant, but it would work, especially here since you only need bitwise operations:
ABCD32 & WXYZ32 == A8 & W8, B8 & X8, C8 & Y8, D8 & Z8
Very basically:
create table t (a bigint unsigned,
b bigint unsigned,
c bigint unsigned,
d bigint unsigned);
While inserting, 256-bit values have to be "split" into 4 words:
-- Here I use hexadecimal notation for conciseness. you may use b'010....000' if you want
-- Note: for brevity each word below has only 8 hex digits (32 bits); a full 64-bit word has 16
insert into t values (0xFFFFFFFF,
0xFFFF0000,
0xFF00FF00,
0xF0F0F0F0);
You could easily query the whole value (with full 64-bit words, pad each word to 16 hex digits instead of 8):
mysql> select CONCAT(LPAD(HEX(a),8,'0'),
LPAD(HEX(b),8,'0'),
LPAD(HEX(c),8,'0'),
LPAD(HEX(d),8,'0')) from t;
+-------------------------------------------------------------------------------------------------------------------------------------------------------+
| CONCAT(LPAD(HEX(a),8,'0'),
LPAD(HEX(b),8,'0'),
LPAD(HEX(c),8,'0'),
LPAD(HEX(d),8,'0')) |
+-------------------------------------------------------------------------------------------------------------------------------------------------------+
| FFFFFFFFFFFF0000FF00FF00F0F0F0F0 |
+-------------------------------------------------------------------------------------------------------------------------------------------------------+
I used hexadecimal here again, but you could display the value as binary by replacing HEX() with BIN() (and adjusting the LPAD width accordingly).
And last but not least, you could perform bitwise operations on them. Once again, you just have to "split" the operand. Assuming I want to apply the mask 0xFFFFFFFFFFFFFFFF0000000000000000 (written with the same shortened 8-digit words as above) to all values in the table:
update t set a = a & 0xFFFFFFFF,
b = b & 0xFFFFFFFF,
c = c & 0x00000000,
d = d & 0x00000000;
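The word-splitting idea can be checked independently of MySQL. This Python sketch splits a 256-bit integer into four full 64-bit words, ANDs them word by word, and confirms that the result matches a direct 256-bit AND. The helper names are invented for the example, and the sample value and mask are arbitrary.

```python
WORD_BITS = 64
WORD_MASK = (1 << WORD_BITS) - 1


def split_words(value: int) -> tuple:
    """Split a 256-bit value into four 64-bit words, most significant first (a, b, c, d)."""
    if not 0 <= value < 1 << 256:
        raise ValueError("expected an unsigned 256-bit value")
    return tuple((value >> shift) & WORD_MASK for shift in (192, 128, 64, 0))


def join_words(words: tuple) -> int:
    """Inverse of split_words."""
    result = 0
    for word in words:
        result = (result << WORD_BITS) | word
    return result


def and_words(x: tuple, y: tuple) -> tuple:
    """Word-by-word AND, like: a = a & ?, b = b & ?, c = c & ?, d = d & ?"""
    return tuple(a & b for a, b in zip(x, y))


value = int("F0" * 32, 16)             # arbitrary 256-bit sample value
mask = int("FF" * 16 + "00" * 16, 16)  # keep the upper 128 bits only
result = join_words(and_words(split_words(value), split_words(mask)))
print(result == value & mask)  # True
```

This only works so simply because AND, OR and XOR act on each bit independently; operations such as shifts would need to carry bits across word boundaries.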
Looks like blob works with a query like this for the bitwise and:
select id,bin(label & b'01000000010000001000000000000000000') from projects;