Workaround to allow a TEXT column in mysql MEMORY/HEAP table - mysql

I want to use a temporary MEMORY table to store some intermediate data, but I need/want it to support TEXT columns. I had found a workaround involving casting the TEXT to a VARCHAR or something, but like an idiot I didn't write down the URL anywhere I can find now.
Does anyone know how to, for example, copy a table x into a memory table y where x may have TEXT columns? If anyone knows how to cast columns in a "CREATE TABLE y SELECT * FROM x" sorta format, that would definitely be helpful.
Alternatively, it would help if I could create a table that uses the MEMORY engine by default, and "upgrades" to a different engine (the default maybe) if it can't use the MEMORY table (because of text columns that are too big or whatever).

You can specify a SELECT statement after CREATE TEMPORARY TABLE:
CREATE TEMPORARY TABLE NewTempTable
SELECT
a
, convert(b, char(100)) as b
FROM OtherTable
Re comment: it appears that CHAR is limited to 512 bytes, and you can't cast to VARCHAR. If you use TEXT in a temporary table, the table is stored on disk rather than in memory.
What you can try is defining the table explicitly:
CREATE TEMPORARY TABLE NewTempTable (
a int
, b varchar(1024)
)
insert into NewTempTable
select a, b
from OtherTable

You can use varChar(5000). No need to cast. If you have example data, you can use it as a measure. There is 64Kb space.
The length can be specified as a value from 0 to 255 before MySQL 5.0.3, and 0 to 65,535 in 5.0.3 and later versions. The effective maximum length of a VARCHAR in MySQL 5.0.3 and later is subject to the maximum row size (65,535 bytes, which is shared among all columns) and the character set used.

Do you mean CAST(text_column AS CHAR)? Note that you shouldn't need it, MySQL will cast it automatically if the target column is VARCHAR(n).

Related

MySQL performance issue on ~3million rows containing MEDIUMTEXT?

I had a table with 3 columns and 3600K rows. Using MySQL as a key-value store.
The first column id was VARCHAR(8) and set to primary key.The 2nd and 3rd columns were MEDIUMTEXT. When calling SELECT * FROM table WHERE id=00000 MySQL took like 54 sec ~ 3 minutes.
For testing I created a table containing VARCHAR(8)-VARCHAR(5)-VARCHAR(5) where data casually generated from numpy.random.randint. SELECT takes 3 sec without primary key. Same random data with VARCHAR(8)-MEDIUMTEXT-MEDIUMTEXT, the time cost by SELECT was 15 sec without primary key.(note: in second test, 2nd and 3rd column actually contained very short text like '65535', but created as MEDIUMTEXT)
My question is: how can I achieve similar performance on my real data? (or, is it impossible?)
If you use
SELECT * FROM `table` WHERE id=00000
instead of
SELECT * FROM `table` WHERE id='00000'
you are looking for all strings that are equal to an integer 0, so MySQL will have to check all rows, because '0', '0000' and even ' 0' will all be casted to integer 0. So your primary key on id will not help and you will end up with a slow full table. Even if you don't store values that way, MySQL doesn't know that.
The best option is, as all comments and answers pointed out, to change the datatype to int:
alter table `table` modify id int;
This will only work if your ids casted as integer are unique (so you don't have e.g. '0' and '00' in your table).
If you have any foreign keys that references id, you have to drop them first and, before recreating them, change the datatype in the other columns too.
If you have a known format you are storing your values (e.g. no zeros, or filled with 0s up to the length of 8), the second best option is to use this exact format to do your query, and include the ' to not cast it to integer. If you e.g. always fill 0 to 8 digits, use
SELECT * FROM `table` WHERE id='00000000';
If you never add any zeros, still add the ':
SELECT * FROM `table` WHERE id='0';
With both options, MySQL can use your primary key and you will get your result in milliseconds.
If your id column contains only numbers so define it as int , because int will give you better performance ( it is more faster)
Make the column in your table (the one defined as key) integer and retry. Check first performance by running a test within your DB (workbench or simple command line). You should get a better result.
Then, and only if needed (I doubt it though), modify your python to convert from integer to string (and/or vise-versa) when referencing the key column.

mysql maximum row size

I don't know why I'm having this odd behaviour on mysql 5.6.24 , can you help me ? do you think it's a bug
mysql -D database --default_character_set utf8 -e "ALTER TABLE abc_folder ADD COLUMN lev10 varchar(5000);"
ERROR 1118 (42000) at line 1: Row size too large. The maximum row size for the used table type, not counting BLOBs, is 65535. This includes storage overhead, check the manual. You have to change some columns to TEXT or BLOBs
instead
mysql -D database --default_character_set utf8 -e "ALTER TABLE abc_folder ADD COLUMN lev10 varchar(50000);"
In other words a bigger varchar() entry is accepted and correctly working.
Does anybody know what is going on ?
This error means that in your table abc_folder, one single line would be bigger than 65535 bytes without columns of type TEXT or BLOBS.
65535 is the highest number which can be represented by an unsigned 16-bit binary number. And as stated here :
Although InnoDB supports row sizes larger than 65,535 bytes internally, MySQL itself imposes a row-size limit of 65,535 for the combined size of all columns:
mysql> CREATE TABLE t (a VARCHAR(8000), b VARCHAR(10000),
-> c VARCHAR(10000), d VARCHAR(10000), e VARCHAR(10000),
-> f VARCHAR(10000), g VARCHAR(10000)) ENGINE=InnoDB;
ERROR 1118 (42000): Row size too large. The maximum row size for the
used table type, not counting BLOBs, is 65535. You have to change some
columns to TEXT or BLOBs
This means that internally, MySQL is probably using a 16 byte number to store the size of a row.
To correct your error, I would suggest changing VARCHAR(5000) into TEXT or to use multiple tables.
Using TEXT if you want to impose a limit of 5000 bytes on your column, you should use a truncate function when inserting into the table. You can use substr for that.
INSERT INTO abc_folder (lev10) VALUES (SUBSTR('Some Text',0, 5000))

Best solution for saving boolean values and saving cpu and memory on searches

What is the best solution for inserting boolean values on database if you want more query performance and minimum losing of memory on select statement.
For example:
I have a table with 36 fields that 30 of them has boolean values (zero or one) and i need to search records using the boolean fields that just have true values.
SELECT * FROM `myTable`
WHERE
`field_5th` = 1
AND `field_12th` = 1
AND `field_20` = 1
AND `field_8` = 1
Is there any solution?
If you want to store boolean values or flags there are basically three options:
Individual columns
This is reflected in your example above. The advantage is that you will be able to put indexes on the flags you intend to use most often for lookups. The disadvantage is that this will take up more space (since the minimum column size that can be allocated is 1 byte.)
However, if you're column names are really going to be field_20, field_21, etc. Then this is absolutely NOT the way to go. Numbered columns are a sign you should use either of the other two methods.
Bitmasks
As was suggested above you can store multiple values in a single integer column. A BIGINT column would give you up to 64 possible flags.
Values would be something like:
UPDATE table SET flags=b'100';
UPDATE table SET flags=b'10000';
Then the field would look something like: 10100
That would represent having two flag values set. To query for any particular flag value set, you would do
SELECT flags FROM table WHERE flags & b'100';
The advantage of this is that your flags are very compact space-wise. The disadvantage is that you can't place indexes on the field which would help improve the performance of searching for specific flags.
One-to-many relationship
This is where you create another table, and each row there would have the id of the row it's linked to, and the flag:
CREATE TABLE main (
main_id INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
);
CREATE TABLE flag (
main_id INT UNSIGNED NOT NULL,
name VARCHAR(16)
);
Then you would insert multiple rows into the flag table.
The advantage is that you can use indexes for lookups, and you can have any number of flags per row without changing your schema. This works best for sparse values, where most rows do not have a value set. If every row needs all flags defined, then this isn't very efficient.
For performance comparisson you can read a blog post I wrote on the topic:
Set Performance Compare
Also when you ask which is "Best" that's a very subjective question. Best at what? It all really depends on what your data looks like and what your requirements are and how you want to query it.
Keep in mind that if you want to do a query like:
SELECT * FROM table WHERE some_flag=true
Indexes will only help you if few rows have that value set. If most of the rows in the table have some_flag=true, then mysql will ignore indexes and do a full table scan instead.
How many rows of data are you querying over? You can store the boolean values in an integer value and use bit operations to test for them them. It's not indexable, but storage is very well packed. Using TINYINT fields with indexes would pick one index to use and scan from there.

How to store uuid as number?

Based on the answer of question, UUID performance in MySQL, the person who answers suggest to store UUID as a number and not as a string. I'm not so sure how it can be done. Anyone could suggest me something? How my ruby code deal with that?
If I understand correctly, you're using UUIDs in your primary column? People will say that a regular (integer) primary key will be faster , but there's another way using MySQL's dark side. In fact, MySQL is faster using binary than anything else when indexes are required.
Since UUID is 128 bits and is written as hexadecimal, it's very easy to speed up and store the UUID.
First, in your programming language remove the dashes
From 110E8400-E29B-11D4-A716-446655440000 to 110E8400E29B11D4A716446655440000.
Now it's 32 chars (like an MD5 hash, which this also works with).
Since a single BINARY in MySQL is 8 bits in size, BINARY(16) is the size of a UUID (8*16 = 128).
You can insert using:
INSERT INTO Table (FieldBin) VALUES (UNHEX("110E8400E29B11D4A716446655440000"))
and query using:
SELECT HEX(FieldBin) AS FieldBin FROM Table
Now in your programming language, re-insert the dashes at the positions 9, 14, 19 and 24 to match your original UUID. If the positions are always different you could store that info in a second field.
Full example :
CREATE TABLE `test_table` (
`field_binary` BINARY( 16 ) NULL ,
PRIMARY KEY ( `field_binary` )
) ENGINE = INNODB ;
INSERT INTO `test_table` (
`field_binary`
)
VALUES (
UNHEX( '110E8400E29B11D4A716446655440000' )
);
SELECT HEX(field_binary) AS field_binary FROM `test_table`
If you want to use this technique with any hex string, always do length / 2 for the field length. So for a sha512, the field would be BINARY (64) since a sha512 encoding is 128 characters long.
I don't think that its a good idea to use a binary.
Let's say that you want to query some value:
SELECT HEX(field_binary) AS field_binary FROM `test_table`
If we are returning several values then we are calling the HEX function several times.
However, the main problem is the next one:
SELECT * FROM `test_table`
where field_binary=UNHEX('110E8400E29B11D4A716446655440000')
And using a function inside the where, simply ignores the index.
Also
SELECT * FROM `test_table`
where field_binary=x'skdsdfk5rtirfdcv##*#(&##$9'
Could leads to many problems.

Equivalent of varchar(max) in MySQL?

What is the equivalent of varchar(max) in MySQL?
The max length of a varchar is subject to the max row size in MySQL, which is 64KB (not counting BLOBs):
VARCHAR(65535)
However, note that the limit is lower if you use a multi-byte character set:
VARCHAR(21844) CHARACTER SET utf8
Here are some examples:
The maximum row size is 65535, but a varchar also includes a byte or two to encode the length of a given string. So you actually can't declare a varchar of the maximum row size, even if it's the only column in the table.
mysql> CREATE TABLE foo ( v VARCHAR(65534) );
ERROR 1118 (42000): Row size too large. The maximum row size for the used table type, not counting BLOBs, is 65535. This includes storage overhead, check the manual. You have to change some columns to TEXT or BLOBs
But if we try decreasing lengths, we find the greatest length that works:
mysql> CREATE TABLE foo ( v VARCHAR(65532) );
Query OK, 0 rows affected (0.01 sec)
Now if we try to use a multibyte charset at the table level, we find that it counts each character as multiple bytes. UTF8 strings don't necessarily use multiple bytes per string, but MySQL can't assume you'll restrict all your future inserts to single-byte characters.
mysql> CREATE TABLE foo ( v VARCHAR(65532) ) CHARSET=utf8;
ERROR 1074 (42000): Column length too big for column 'v' (max = 21845); use BLOB or TEXT instead
In spite of what the last error told us, InnoDB still doesn't like a length of 21845.
mysql> CREATE TABLE foo ( v VARCHAR(21845) ) CHARSET=utf8;
ERROR 1118 (42000): Row size too large. The maximum row size for the used table type, not counting BLOBs, is 65535. This includes storage overhead, check the manual. You have to change some columns to TEXT or BLOBs
This makes perfect sense, if you calculate that 21845*3 = 65535, which wouldn't have worked anyway. Whereas 21844*3 = 65532, which does work.
mysql> CREATE TABLE foo ( v VARCHAR(21844) ) CHARSET=utf8;
Query OK, 0 rows affected (0.32 sec)
TLDR; MySql does not have an equivalent concept of varchar(max), this is a MS SQL Server feature.
What is VARCHAR(max)?
varchar(max) is a feature of Microsoft SQL Server.
The amount of data that a column could store in Microsoft SQL server versions prior to version 2005 was limited to 8KB. In order to store more than 8KB you would have to use TEXT, NTEXT, or BLOB columns types, these column types stored their data as a collection of 8K pages separate from the table data pages; they supported storing up to 2GB per row.
The big caveat to these column types was that they usually required special functions and statements to access and modify the data (e.g. READTEXT, WRITETEXT, and UPDATETEXT)
In SQL Server 2005, varchar(max) was introduced to unify the data and queries used to retrieve and modify data in large columns. The data for varchar(max) columns is stored inline with the table data pages.
As the data in the MAX column fills an 8KB data page an overflow page is allocated and the previous page points to it forming a linked list. Unlike TEXT, NTEXT, and BLOB the varchar(max) column type supports all the same query semantics as other column types.
So varchar(MAX) really means varchar(AS_MUCH_AS_I_WANT_TO_STUFF_IN_HERE_JUST_KEEP_GROWING) and not varchar(MAX_SIZE_OF_A_COLUMN).
MySql does not have an equivalent idiom.
In order to get the same amount of storage as a varchar(max) in MySql you would still need to resort to a BLOB column type. This article discusses a very effective method of storing large amounts of data in MySql efficiently.
The max length of a varchar is
65535
divided by the max byte length of a character in the character set the column is set to (e.g. utf8=3 bytes, ucs2=2, latin1=1).
minus 2 bytes to store the length
minus the length of all the other columns
minus 1 byte for every 8 columns that are nullable. If your column is null/not null this gets stored as one bit in a byte/bytes called the null mask, 1 bit per column that is nullable.
For Sql Server
alter table prg_ar_report_colors add Text_Color_Code VARCHAR(max);
For MySql
alter table prg_ar_report_colors add Text_Color_Code longtext;
For Oracle
alter table prg_ar_report_colors add Text_Color_Code CLOB;
Mysql Converting column from VARCHAR to TEXT when under limit size!!!
mysql> CREATE TABLE varchars1(ch3 varchar(6),ch1 varchar(3),ch varchar(4000000))
;
Query OK, 0 rows affected, 1 warning (0.00 sec)
mysql> SHOW WARNINGS;
+-------+------+---------------------------------------------+
| Level | Code | Message |
+-------+------+---------------------------------------------+
| Note | 1246 | Converting column 'ch' from VARCHAR to TEXT |
+-------+------+---------------------------------------------+
1 row in set (0.00 sec)
mysql>