I am facing a complete mystery.
I have created a table to store meteorological data. I have one value per hour, since 1979, for every 0.25° of latitude and longitude.
This leaves me with billions of rows in the database.
Following several pieces of advice, I partitioned the table.
I chose to partition by year. This is what it looks like:
CREATE TABLE `MyTable` (
`latitude_100` SMALLINT NOT NULL, -- Smallint is 2 bytes, where float is 4. So we take latitude * 100
`longitude_100` SMALLINT NOT NULL, -- Same logic here
`time` DATETIME NOT NULL,
`final` TINYINT UNSIGNED NOT NULL,
`value` DOUBLE NOT NULL,
PRIMARY KEY (`latitude_100` ASC, `longitude_100` ASC, `time` ASC)
)
PARTITION BY HASH(YEAR(time)) PARTITIONS 45 ; -- This will work until 2023 included
In order to test, I loaded only the data from 2015 to 2021 into the table.
The problem:
All SELECTs from this table are extremely slow.
Even worse, they are sometimes absurdly slow.
For example:
SELECT time, latitude_100, longitude_100, value
FROM MyTable
WHERE latitude_100 BETWEEN 500 AND 2000
AND longitude_100 BETWEEN 11600 AND 12800 AND
YEAR(time) = 1990 ;
Remember that there is NO data for 1990. By looking in the right partition, MySQL should see that immediately, shouldn't it?
EXPLAIN tells me that MySQL will look in every partition, and I do not understand why:
EXPLAIN SELECT time, latitude_100, longitude_100, value
FROM MyTable
WHERE latitude_100 BETWEEN 500 AND 2000
AND longitude_100 BETWEEN 11600 AND 12800 AND
YEAR(time) = 1990 ;
# id, select_type, table, partitions, type, possible_keys, key, key_len, ref, rows, filtered, Extra
1, SIMPLE, MyTable, p0,p1,p2,p3,p4,p5,p6,p7,p8,p9,p10,p11,p12,p13,p14,p15,p16,p17,p18,p19,p20,p21,p22,p23,p24,p25,p26,p27,p28,p29,p30,p31,p32,p33,p34,p35,p36,p37,p38,p39,p40,p41,p42,p43,p44, range, PRIMARY, PRIMARY, 4, , 118295536, 11.11, Using where
When I do
SELECT * FROM information_schema.partitions WHERE TABLE_SCHEMA='MySchema' AND TABLE_NAME = 'MyTable' AND PARTITION_NAME IS NOT NULL
I can see that only 6 partitions contain data; all the others are empty.
The last thing I tried was to phrase the WHERE differently, hoping to take advantage of the index:
SELECT time, latitude_100, longitude_100, value
FROM MyTable
WHERE latitude_100 BETWEEN 500 AND 2000
AND longitude_100 BETWEEN 11600 AND 12800 AND
time BETWEEN "1990-01-01 00:00:00" AND "1990-12-31 23:00:00" AND
YEAR(time) = 1990 ;
But this does not speed up the execution. Only the EXPLAIN output is slightly different (but not in terms of which partitions are read):
# id, select_type, table, partitions, type, possible_keys, key, key_len, ref, rows, filtered, Extra
1, SIMPLE, MyTable, p0,p1,p2,p3,p4,p5,p6,p7,p8,p9,p10,p11,p12,p13,p14,p15,p16,p17,p18,p19,p20,p21,p22,p23,p24,p25,p26,p27,p28,p29,p30,p31,p32,p33,p34,p35,p36,p37,p38,p39,p40,p41,p42,p43,p44, range, PRIMARY, PRIMARY, 9, , 118295536, 1.23, Using where
What am I doing wrong?
Why doesn't MySQL cooperate with partitioning?
Thank you very much !
[Edit]
On the technical side, the database is hosted on AWS RDS. It runs on a "db.t4g.large" instance and uses MySQL 8.0.27.
Do not use PARTITION BY HASH! HASH will fail to do any pruning when using a date range (as you have!). Simply put, the Optimizer is not smart enough to see that your range fits in a single partition. Furthermore, HASH may unnecessarily be lumping two different years into the same partition. Instead, use PARTITION BY RANGE.
I know that RANGE(TO_DAYS(time)) works; perhaps RANGE(YEAR(time)) may work, depending on what version of MySQL you are using; check the specifics.
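To see where each year ends up under the question's PARTITION BY HASH(YEAR(time)) PARTITIONS 45, here is a minimal Python sketch. It assumes MySQL's documented rule that HASH partitioning places a row in partition MOD(expr, number of partitions); the helper name is made up for illustration and is not a MySQL function.

```python
# Sketch: model MySQL HASH partitioning as MOD(expr, num_partitions).
# hash_partition_for_year is a hypothetical helper, not part of MySQL.

NUM_PARTITIONS = 45  # PARTITIONS 45 in the question's CREATE TABLE


def hash_partition_for_year(year: int, num_partitions: int = NUM_PARTITIONS) -> int:
    """Return N such that rows with YEAR(time) = year are stored in partition pN."""
    return year % num_partitions


if __name__ == "__main__":
    for year in (1979, 1990, 2015, 2021, 2023, 2024):
        print(year, "-> p%d" % hash_partition_for_year(year))
```

Under this model 1979 maps to p44 and 1990 to p10, and 2024 wraps around into p44 again, which is why 45 partitions keep the years apart only through 2023. The years are not stored in chronological partition order, so a date range does not correspond to a contiguous set of partitions.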
Hour: With some date arithmetic, you can shrink a 5-byte DATETIME down to a 3-byte MEDIUMINT. (A suitable change to PARTITION BY RANGE would be needed.)
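One possible form of that date arithmetic, sketched in Python: store the number of whole hours since a fixed base instead of a DATETIME. The base of 1979-01-01 00:00 and the function names are assumptions chosen for illustration; the answer does not prescribe them.

```python
from datetime import datetime, timedelta

# Assumption: hours are counted from the start of the dataset.
BASE = datetime(1979, 1, 1)
MEDIUMINT_UNSIGNED_MAX = 2**24 - 1  # 16,777,215, the largest 3-byte unsigned value


def to_hour_number(ts: datetime) -> int:
    """Whole hours elapsed between BASE and ts."""
    delta = ts - BASE
    return delta.days * 24 + delta.seconds // 3600


def from_hour_number(n: int) -> datetime:
    """Inverse of to_hour_number for timestamps that fall on the hour."""
    return BASE + timedelta(hours=n)


if __name__ == "__main__":
    sample = datetime(2021, 12, 31, 23)
    n = to_hour_number(sample)
    print(n, from_hour_number(n), n <= MEDIUMINT_UNSIGNED_MAX)
```

An unsigned 3-byte integer has room for over 1,900 years of hourly values, so the range is not a concern; the cost is that every query has to convert between hour numbers and dates.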
Not enough: Since you are testing with only 7 years of data, my Partitioning suggestion will help only by a factor of 7.
DOUBLE? What are you measuring? DOUBLE takes 8 bytes and gives you about 16 significant digits. Even FLOAT (4 bytes, 7 digits) is likely to be overkill. For temperature (°C), consider DECIMAL(2) or TINYINT (-128..+127) or DECIMAL(4,2); they are 1,1,2 bytes, respectively. Extremes recorded: -89..+57. Note: °F would need one more byte in any INT or DECIMAL encoding. (I would guess that an instrument too close to a volcano or wildfire would fail to transmit data if the temp exceeded 99°C.)
Shrinking the DOUBLE would shrink the dataset size by about 1/3 -- worth the effort.
If you will end up with about 400 billion rows, datatype size is very important.
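The row count and the one-third saving can be checked with a little arithmetic. The sketch below makes assumptions that are not stated in the thread: a global 0.25-degree grid including both poles, 45 years of hourly data, and raw column bytes only (no InnoDB overhead, no indexes).

```python
# Back-of-the-envelope sizing under the assumptions named above.

LATITUDES = int(180 / 0.25) + 1    # 721 grid lines from -90 to +90
LONGITUDES = int(360 / 0.25)       # 1440 grid lines
HOURS_PER_YEAR = 24 * 365.25

# Bytes per column for the table in the question
COLUMN_BYTES = {
    "latitude_100": 2,   # SMALLINT
    "longitude_100": 2,  # SMALLINT
    "time": 5,           # DATETIME
    "final": 1,          # TINYINT
    "value": 8,          # DOUBLE
}


def total_rows(years: float) -> float:
    """Rows needed for one value per grid point per hour."""
    return LATITUDES * LONGITUDES * HOURS_PER_YEAR * years


def row_bytes(value_bytes: int = COLUMN_BYTES["value"]) -> int:
    """Raw bytes per row, optionally with a smaller type for `value`."""
    return sum(COLUMN_BYTES.values()) - COLUMN_BYTES["value"] + value_bytes


if __name__ == "__main__":
    print(f"rows for 45 years: {total_rows(45):.3g}")
    print(f"bytes per row now: {row_bytes()}")
    print(f"bytes per row with a 2-byte value: {row_bytes(2)}")
```

Under these assumptions the table reaches roughly 4 x 10^11 rows, and replacing the 8-byte DOUBLE with a 2-byte type takes each row from 18 to 12 raw bytes, a saving of one third.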
So, let's dig deeper... Please provide
Amount of RAM
SHOW VARIABLES LIKE 'innodb_buffer_pool_size';
Any other SELECTs that you are likely to run, including WHERE clauses other than exactly one year.
How much disk space did your 7 years take? If using MyISAM, I would expect about 1.2TB; if using InnoDB, 3TB.
The lat/lng ranges in the sample Select were relatively small. Is this typical? If so, we may be able to take advantage of it.
ENGINE -- Since this is, I assume, mostly a readonly dataset, it may be a rare case where MyISAM is better. See estimates above; multiply by 6 to get estimates for the 43 years.
Usage -- What will you do with the results of a SELECT like the one you have? If that is the 'only' query, then there are more compact ways to store the data. But they will be more complex to Insert and Select. However, the speed improvement may be worth it. I need to see the various Selects before advising further.
Related
What is a good approach to handle a 3-billion-row table where concurrent reads/writes are very frequent within the last few days?
Linux server, running MySQL v8.0.15.
I have a table that logs device data history. The table needs to retain its data for one year, possibly two. The growth rate is very high: 8,175,000 rows/day (1 month = 245M rows, 1 year = 2.98B rows). If the number of devices grows, the table is expected to be able to handle it.
Reads mostly target the last few days; for data older than a week, the read frequency drops significantly.
There are multiple concurrent connections reading from and writing to this table, and the rows they target are quite close to each other, so deadlocks / lock waits happen, but these have been taken care of (retries, small transaction size).
I am using daily partitioning now, since reads hardly ever span more than one partition. However, there will be too many partitions to retain one year of data. Partitions are created and dropped on a schedule with cron.
CREATE TABLE `table1` (
`group_id` tinyint(4) NOT NULL,
`DeviceId` varchar(10) COLLATE utf8mb4_unicode_ci NOT NULL,
`DataTime` datetime NOT NULL,
`first_log` datetime NOT NULL DEFAULT CURRENT_TIMESTAMP,
`first_res` tinyint(1) NOT NULL DEFAULT '0',
`last_log` datetime DEFAULT NULL,
`last_res` tinyint(1) DEFAULT NULL,
PRIMARY KEY (`group_id`,`DeviceId`,`DataTime`),
KEY `group_id` (`group_id`,`DataTime`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci
/*!50100 PARTITION BY RANGE (to_days(`DataTime`))
(
PARTITION p_20191124 VALUES LESS THAN (737753) ENGINE = InnoDB,
PARTITION p_20191125 VALUES LESS THAN (737754) ENGINE = InnoDB,
PARTITION p_20191126 VALUES LESS THAN (737755) ENGINE = InnoDB,
PARTITION p_20191127 VALUES LESS THAN (737756) ENGINE = InnoDB,
PARTITION p_future VALUES LESS THAN MAXVALUE ENGINE = InnoDB) */
Inserts are performed in batches of ~1500 rows:
INSERT INTO table1(group_id, DeviceId, DataTime, first_res)
VALUES(%s, %s, FROM_UNIXTIME(%s), %s)
ON DUPLICATE KEY UPDATE last_log=NOW(), last_res=values(first_res);
Selects mostly get counts by DataTime or DeviceId, targeting a specific partition.
SELECT DataTime, COUNT(*) ct FROM table1 partition(p_20191126)
WHERE group_id=1 GROUP BY DataTime HAVING ct<50;
SELECT DeviceId, COUNT(*) ct FROM table1 partition(p_20191126)
WHERE group_id=1 GROUP BY DeviceId HAVING ct<50;
So the questions:
According to Rick James's blog, it is not a good idea to have more than 50 partitions in a table, but with monthly partitions there would be 245M rows in one partition. What is the best partition range to use here? Does the advice in RJ's blog still hold with the current MySQL version?
Is it a good idea to leave the table unpartitioned? (The index is working well at the moment.)
Note: I have read this Stack question; having multiple tables is a pain, so unless it is necessary I would rather not split the table. Also, sharding is currently not possible.
First of all, INSERTing 100 records/second is a potential bottleneck. I hope you are using SSDs. Let me see SHOW CREATE TABLE. Explain how the data is arriving (in bulk, one at a time, from multiple sources, etc) because we need to discuss batching the input rows, even if you have SSDs.
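The figure of roughly 100 rows per second follows from the daily rate given in the question. A quick check in Python, assuming the load is spread evenly over the day (real traffic is probably burstier):

```python
ROWS_PER_DAY = 8_175_000   # growth rate stated in the question
BATCH_SIZE = 1500          # approximate batch size stated in the question

rows_per_second = ROWS_PER_DAY / 86_400
rows_per_month = ROWS_PER_DAY * 30
rows_per_year = ROWS_PER_DAY * 365
seconds_between_batches = BATCH_SIZE / rows_per_second

if __name__ == "__main__":
    print(f"{rows_per_second:.1f} rows/s")
    print(f"{rows_per_month:,} rows/month, {rows_per_year:,} rows/year")
    print(f"one {BATCH_SIZE}-row batch every {seconds_between_batches:.1f} s on average")
```

This reproduces the 245M rows per month and 2.98B rows per year quoted in the question, and shows that with batches of about 1500 rows the average is one batch every 16 seconds or so.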
Retention for 1 or 2 years? Yes, PARTITIONing will help, but only with the deleting via DROP PARTITION. Use monthly partitions and use PARTITION BY RANGE(TO_DAYS(DataTime)). (See my blog which you have already found.)
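The VALUES LESS THAN numbers for such partitions are just TO_DAYS() values. As a rough sketch, the Python below mimics TO_DAYS() for ordinary modern dates and prints monthly partition clauses; the function names and the p_YYYYMM naming scheme are assumptions made for illustration.

```python
from datetime import date


def to_days(d: date) -> int:
    """Mimic MySQL TO_DAYS() for Gregorian dates: the day number since year 0."""
    return d.toordinal() + 365


def monthly_partitions(first_year: int, first_month: int, count: int) -> list:
    """Build one 'PARTITION ... VALUES LESS THAN (...)' clause per month."""
    clauses = []
    year, month = first_year, first_month
    for _ in range(count):
        # The first day of the following month is the exclusive upper bound.
        next_year, next_month = (year + 1, 1) if month == 12 else (year, month + 1)
        bound = to_days(date(next_year, next_month, 1))
        clauses.append(f"PARTITION p_{year}{month:02d} VALUES LESS THAN ({bound})")
        year, month = next_year, next_month
    return clauses


if __name__ == "__main__":
    print(to_days(date(2019, 11, 25)))  # boundary of p_20191124 in the question
    print(",\n".join(monthly_partitions(2019, 11, 3)))
```

The first print gives 737753, matching the boundary of p_20191124 in the question's table definition, so the same numbering can be reused when switching from daily to monthly partitions.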
What is the average length of DeviceID? Normally I would not even mention normalizing a VARCHAR(10), but with billions of rows, it is probably worth it.
The PRIMARY KEY you have implies that a device will not provide two values in less than one second?
What do "first" and "last" mean in the column names?
In older versions of MySQL, the number of partitions had impact on performance, hence the recommendation of 50. 8.0's Data Dictionary may have a favorable impact on that, but I have not experimented yet to see if the 50 should be raised.
The size of a partition has very little impact on anything.
In order to judge the indexes, let's see the queries.
Sharding is not possible? Do too many queries need to fetch multiple devices at the same time?
Do you have Summary tables? That is a major way for Data Warehousing to avoid performance problems. (See my blogs on that.) And, if you do some sort of "staging" of the input, the summary tables can be augmented before touching the Fact table. At that point, the Fact table is only an archive; no regular SELECTs need to touch it? (Again, let's see the main queries.)
One table per day (or whatever unit) is a big no-no.
Ingestion via IODKU
For the batch insert via IODKU, consider this:
collect the 1500 rows in a temp table, preferably with a single, 1500-row, INSERT.
massage that data if needed
do one IODKU..SELECT:
INSERT INTO table1 (group_id, DeviceId, DataTime, first_res)
SELECT group_id, DeviceId, DataTime, first_res
FROM tmp_table
-- ON DUPLICATE KEY UPDATE must come after the SELECT
ON DUPLICATE KEY UPDATE
last_log = NOW(), last_res = VALUES(first_res);
If necessary, the SELECT can do some de-duping, etc.
This approach is likely to be significantly faster than 1500 separate IODKUs.
DeviceID
If the DeviceID is always 10 characters and limited to English letters and digits, then make it
CHAR(10) CHARACTER SET ascii
Then pick between COLLATE ascii_general_ci and COLLATE ascii_bin, depending on whether you want case folding or not.
Just for your reference:
I have a large table right now with over 30B rows, growing by 11M rows daily.
The table is an InnoDB table and is not partitioned.
Data older than 7 years is archived to file and purged from the table.
So if your performance is acceptable, partitioning is not necessary.
From a management perspective, a partitioned table is easier to manage; you might partition the data by week. That gives 52 to 104 partitions if you keep the last one or two years of data online.
Hi, I currently have a query which takes 11 seconds to run. I have a report displayed on a website which runs 4 different but similar queries, each of which takes 11 seconds. I don't really want the customer to have to wait almost a minute for all of these queries to run and display the data.
I am using 4 different AJAX requests to call an API to get the data I need. They all start at once, but the queries run one after another. If there were a way to run these queries all at once (in parallel), so that the total load time is only 11 seconds, that would also fix my issue; I don't believe that is possible, though.
Here is the query I am running:
SELECT device_uuid,
day_epoch,
is_repeat
FROM tracking_daily_stats_zone_unique_device_uuids_per_hour
WHERE day_epoch >= 1552435200
AND day_epoch < 1553040000
AND venue_id = 46
AND zone_id IN (102,105,108,110,111,113,116,117,118,121,287)
I can't think of any way to speed this query up at all; below are pictures of the table indexes and the explain statement on this query.
I think the above query is using relevant indexes in the where conditions.
If there is anything you can think of to speed this query up please let me know, I have been working on it for 3 days and can't seem to figure out the problem. It would be great to get the query times down to 5(sec) maximum. If I am wrong about the AJAX issue please let me know as this would also fix my issue.
" EDIT "
I have come across something quite strange which might be causing the issue. When I change the day_epoch range to something smaller (5th - 9th), which returns 130,000 rows, the query time is 0.7 seconds, but when I add one more day to that range (5th - 10th) and it returns over 150,000 rows, the query time is 13 seconds. I have run loads of different ranges and have come to the conclusion that returning more than 150,000 rows has a huge effect on the query time.
Table Definition -
CREATE TABLE `tracking_daily_stats_zone_unique_device_uuids_per_hour` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`day_epoch` int(10) NOT NULL,
`day_of_week` tinyint(1) NOT NULL COMMENT 'day of week, monday = 1',
`hour` int(2) NOT NULL,
`venue_id` int(5) NOT NULL,
`zone_id` int(5) NOT NULL,
`device_uuid` binary(16) NOT NULL COMMENT 'binary representation of the device_uuid, unique for a single day',
`device_vendor_id` int(5) unsigned NOT NULL DEFAULT '0' COMMENT 'id of the device vendor',
`first_seen` int(10) unsigned NOT NULL DEFAULT '0',
`last_seen` int(10) unsigned NOT NULL DEFAULT '0',
`is_repeat` tinyint(1) NOT NULL COMMENT 'is the device a repeat for this day?',
`prev_last_seen` int(10) NOT NULL DEFAULT '0' COMMENT 'previous last seen ts',
PRIMARY KEY (`id`,`venue_id`) USING BTREE,
KEY `venue_id` (`venue_id`),
KEY `zone_id` (`zone_id`),
KEY `day_of_week` (`day_of_week`),
KEY `day_epoch` (`day_epoch`),
KEY `hour` (`hour`),
KEY `device_uuid` (`device_uuid`),
KEY `is_repeat` (`is_repeat`),
KEY `device_vendor_id` (`device_vendor_id`)
) ENGINE=InnoDB AUTO_INCREMENT=450967720 DEFAULT CHARSET=utf8
/*!50100 PARTITION BY HASH (venue_id)
PARTITIONS 100 */
The straightforward solution is to add this query-specific index to the table:
ALTER TABLE tracking_daily_stats_zone_unique_device_uuids_per_hour
ADD INDEX complex_idx (`venue_id`, `day_epoch`, `zone_id`)
WARNING: This ALTER can take a while on a table of this size.
Then tell MySQL to use it in your query:
SELECT device_uuid,
day_epoch,
is_repeat
FROM tracking_daily_stats_zone_unique_device_uuids_per_hour
USE INDEX (complex_idx)
WHERE day_epoch >= 1552435200
AND day_epoch < 1553040000
AND venue_id = 46
AND zone_id IN (102,105,108,110,111,113,116,117,118,121,287)
It is definitely not universal but should work for this particular query.
UPDATE: With a partitioned table you can also gain by selecting a particular partition explicitly. Here the table is partitioned by HASH(venue_id), so just name the partition:
SELECT device_uuid,
day_epoch,
is_repeat
FROM tracking_daily_stats_zone_unique_device_uuids_per_hour
PARTITION (`p46`)
WHERE day_epoch >= 1552435200
AND day_epoch < 1553040000
AND zone_id IN (102,105,108,110,111,113,116,117,118,121,287)
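Here p46 is the partition that holds venue_id = 46: with HASH(venue_id) PARTITIONS 100 and default partition names, a row goes to partition p followed by venue_id MOD 100.
Another trick if you go this way: you can remove AND venue_id = 46 from the WHERE clause, but only if no other venue_id maps to the same partition (venue_id 146, for example, would also land in p46).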
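If the partition name is built in application code, a small helper like the following could be used. This is a sketch that assumes MySQL's MOD-based HASH partitioning and its default partition names p0, p1, ...; the function name is hypothetical.

```python
NUM_PARTITIONS = 100  # PARTITIONS 100 in the table definition


def partition_name(venue_id: int, num_partitions: int = NUM_PARTITIONS) -> str:
    """Name of the HASH(venue_id) partition that stores the given venue."""
    return f"p{venue_id % num_partitions}"


if __name__ == "__main__":
    for venue_id in (46, 146, 100):
        print(venue_id, "->", partition_name(venue_id))
```

Venues 46 and 146 both resolve to p46, so the name equals "p" plus the venue_id only while venue ids stay below the number of partitions, and dropping venue_id from the WHERE clause is only safe in that case.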
What happens if you change the order of conditions? Put venue_id = ? first. The order matters.
Now it first checks all rows for:
- day_epoch >= 1552435200
- then, the remaining set for day_epoch < 1553040000
- then, the remaining set for venue_id = 46
- then, the remaining set for zone_id IN (102,105,108,110,111,113,116,117,118,121,287)
When working with heavy queries, you should always try to make the first "selector" the most effective. You can do that by using a proper index on one column (or a combination of columns) and by making sure that the first selector narrows the set down the most (at least for integers; for strings you need another tactic).
Sometimes a query simply is slow. When you have a lot of data (and/or not enough resources), you just can't really do anything about that. That's where you need another solution: make a summary table. I doubt you show 150,000 rows x 4 to your visitor. You can aggregate it, e.g., hourly or every few minutes, and select from that much smaller table.
Off-topic: Putting an index on everything only slows you down when inserting/updating/deleting. Index as few columns as possible, just the ones you actually filter on (e.g. those used in a WHERE or GROUP BY).
450M rows is rather large. So, I will discuss a variety of issues that can help.
Shrink data A big table leads to more I/O, which is the main performance killer. ('Small' tables tend to stay cached, and not have an I/O burden.)
Any kind of INT, even INT(2), takes 4 bytes. An "hour" can easily fit in a 1-byte TINYINT. That saves over 1GB in the data, plus a similar amount in INDEX(hour).
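The saving is simple arithmetic. A quick check in Python, taking the round figure of 450M rows and counting raw column bytes only:

```python
ROWS = 450_000_000   # approximate row count of the table
INT_BYTES = 4        # any INT(n) is stored in 4 bytes
TINYINT_BYTES = 1

saved_in_data = ROWS * (INT_BYTES - TINYINT_BYTES)

if __name__ == "__main__":
    print(f"{saved_in_data / 1e9:.2f} GB saved in the data")
```

That is 1.35 GB for the column itself; the single-column index on hour stores the same value once per row, so it shrinks by a similar amount.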
If hour and day_of_week can be derived, don't bother having them as separate columns. This will save more space.
Is there some reason to use a 4-byte day_epoch instead of a 3-byte DATE? Or perhaps you do need a 5-byte DATETIME or TIMESTAMP.
Optimal INDEX (take #1)
If it is always a single venue_id, then this is a good first cut at the optimal index:
INDEX(venue_id, zone_id, day_epoch)
First is the constant, then the IN, then a range. The Optimizer does well with this in many cases. (It is unclear whether the number of items in an IN clause can lead to inefficiencies.)
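Why the column order matters can be illustrated with a toy model in Python, where a composite index is just a sorted list of key tuples and a range scan is a pair of binary searches. This is a simplification made up for illustration, not how InnoDB is implemented, and the data set is invented.

```python
from bisect import bisect_left
from itertools import product

# Toy data: 3 venues x 20 zones x 30 days, one row per (venue, zone, day).
ROWS = list(product(range(45, 48), range(100, 120), range(30)))

# Two candidate composite indexes, modeled as sorted lists of key tuples.
IDX_VENUE_ZONE_DAY = sorted(ROWS)
IDX_VENUE_DAY_ZONE = sorted((v, d, z) for v, z, d in ROWS)


def entries_read_venue_zone_day(venue, zones, day_lo, day_hi):
    """One short range scan per zone in the IN list; every entry read matches."""
    read = 0
    for zone in zones:
        start = bisect_left(IDX_VENUE_ZONE_DAY, (venue, zone, day_lo))
        stop = bisect_left(IDX_VENUE_ZONE_DAY, (venue, zone, day_hi))
        read += stop - start
    return read


def entries_read_venue_day_zone(venue, day_lo, day_hi):
    """One range scan over the whole date range; zones must be filtered afterwards."""
    start = bisect_left(IDX_VENUE_DAY_ZONE, (venue, day_lo))
    stop = bisect_left(IDX_VENUE_DAY_ZONE, (venue, day_hi))
    return stop - start


if __name__ == "__main__":
    zones = [102, 105, 108]
    print(entries_read_venue_zone_day(46, zones, 5, 12))
    print(entries_read_venue_day_zone(46, 5, 12))
```

For venue 46, three zones and days 5 through 11, the (venue, zone, day) order reads 21 entries, all of them wanted, while the (venue, day, zone) order reads 140 entries and discards most of them. The gap grows with the number of zones that are not in the IN list.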
Better Primary Key (better index)
With AUTO_INCREMENT, there is probably no good reason to include columns after the auto_inc column in the PK. That is, PRIMARY KEY(id, venue_id) is no better than PRIMARY KEY(id).
InnoDB orders the data's BTree according to the PRIMARY KEY. So, if you are fetching several rows and can arrange for them to be adjacent to each other based on the PK, you get extra performance. (cf "Clustered".) So:
PRIMARY KEY(venue_id, zone_id, day_epoch, -- this order, as discussed above;
id) -- to make sure that the entire PK is unique.
INDEX(id) -- to keep AUTO_INCREMENT happy
And, I agree with DROPping any indexes that are not in use, including the one I recommended above. It is rarely useful to index flags (is_repeat).
UUID
Indexing a UUID can be deadly for performance once the table is really big. This is because of the randomness of UUIDs/GUIDs, leading to ever-increasing I/O burden to insert new entries in the index.
Multi-dimensional
Assuming day_epoch is sometimes multiple days, you seem to have 2 or 3 "dimensions":
A date range
A list of zones
A venue.
INDEXes are 1-dimensional. Therein lies the problem. However, PARTITIONing can sometimes help. I discuss this briefly as "case 2" in http://mysql.rjweb.org/doc.php/partitionmaint .
There is no good way to get 3 dimensions, so let's focus on 2.
You should partition on something that is a "range", such as day_epoch or zone_id.
After that, you should decide what to put in the PRIMARY KEY so that you can further take advantage of "clustering".
Plan A: This assumes you are searching for only one venue_id at a time:
PARTITION BY RANGE(day_epoch) -- see note below
PRIMARY KEY(venue_id, zone_id, id)
Plan B: This assumes you sometimes search for venue_id IN (.., .., ...), hence it does not make a good first column for the PK:
Well, I don't have good advice here; so let's go with Plan A.
The RANGE expression must be numeric. Your day_epoch works fine as is. Changing to a DATE, would necessitate BY RANGE(TO_DAYS(...)), which works fine.
You should limit the number of partitions to 50. (The 81 mentioned above is not bad.) The problem is that "lots" of partitions introduces different inefficiencies; "too few" partitions leads to "why bother".
Note that almost always the optimal PK is different for a partitioned table than the equivalent non-partitioned table.
Note that I disagree with partitioning on venue_id since it is so easy to put that column at the start of the PK instead.
Analysis
Assuming you search for a single venue_id and use my suggested partitioning & PK, here's how the SELECT performs:
Filter on the date range. This is likely to limit the activity to a single partition.
Drill into the data's BTree for that one partition to find the one venue_id.
Hopscotch through the data from there, landing on the desired zone_ids.
For each, further filter based on the date.
The application we are developing writes around 4-5 million rows of data every day, and we need to keep this data for the past 90 days.
The table user_data has the following structure (simplified):
id INT PRIMARY AUTOINCREMENT
dt TIMESTAMP CURRENT_TIMESTAMP
user_id varchar(20)
data varchar(20)
About the application:
Data that is older than 7 days will not be written / updated.
Data is mostly accessed based on user_id (i.e. all queries will have WHERE user_id = XXX)
There are around 13000 users at the moment.
User can still access older data. But, in accessing the older data, we can restrict that he/she can only get the whole day data only and not a time range. (e.g. If a user attempts to get the data for 2016-10-01, he/she will get the data for the whole day and will not be able to get the data for 2016-10-01 13:00 - 2016-10-01 14:00).
At the moment, we are using MySQL InnoDB to store the latest data (i.e. 7 days and newer) and it is working fine and fits in the innodb_buffer_pool.
As for the older data, we created smaller tables in the form of user_data_YYYYMMDD. After a while, we figured that these tables cannot fit into the innodb_buffer_pool and it started to slow down.
We think that, rather than only separating / sharding by date, sharding by user_id as well would be better (i.e. using smaller data sets per user and date, such as user_data_[YYYYMMDD]_[USER_ID]). This will keep each table much smaller (only around 10K rows at most).
After researching around, we have found that there are a few options out there:
Using mysql tables to store per user per date (i.e. user_data_[YYYYMMDD]_[USER_ID]).
Using mongodb collection for each user_data_[YYYYMMDD]_[USER_ID]
Write the old data (json encoded) into [USER_ID]/[YYYYMMDD].txt
The biggest con I see in this is that we will have a huge number of tables/collections/files when we do this (i.e. 13000 x 90 = 1,170,000). I wonder if we are approaching this the right way in terms of future scalability, or if there are other standardized solutions for this.
Scaling a database is a problem unique to each application. Most of the time someone else's approach cannot be reused, because almost every application writes its data in its own way. So you have to figure out how you are going to manage your data.
Having said that, if your data continues to grow, the best solution is sharding, where you distribute the data across different servers. As long as you are bound to a single server, for example by creating different tables, you will hit resource limits such as memory, storage and processing power. Those cannot be increased without limit.
How to distribute the data is something you have to figure out based on your business use cases. As you mentioned, if you are not getting many requests for old data, the best way is to distribute the data by date, for example a DB for 2016 data, a DB for 2015, and so on. Later you may purge or shut down the servers that hold the oldest data.
This is a big table, but not unmanageable.
If user_id + dt is UNIQUE, make it the PRIMARY KEY, and get rid of id, thereby saving space. (More in a minute...)
Normalize user_id to a SMALLINT UNSIGNED (2 bytes) or, to be safer MEDIUMINT UNSIGNED (3 bytes). This will save a significant amount of space.
Saving space is important for speed (I/O) for big tables.
PARTITION BY RANGE(TO_DAYS(dt))
with 92 partitions -- the 90 you need, plus 1 waiting to be DROPped and one being filled. See details here .
ENGINE=InnoDB
to get the PRIMARY KEY clustered.
PRIMARY KEY(user_id, dt)
If this is "unique", then it allows efficient access for any time range for a single user. Note: you can remove the "just a day" restriction. However, you must formulate the query without hiding dt in a function. I recommend:
WHERE user_id = ?
AND dt >= ?
AND dt < ? + INTERVAL 1 DAY
Furthermore,
PRIMARY KEY(user_id, dt, id),
INDEX(id)
Would also be efficient even if (user_id, dt) is not unique. The addition of id to the PK is to make it unique; the addition of INDEX(id) is to keep AUTO_INCREMENT happy. (No, UNIQUE(id) is not required.)
INT --> BIGINT UNSIGNED ??
INT (which is SIGNED) will top out at about 2 billion. That will happen in a very few years. Is that OK? If not, you may need BIGINT (8 bytes vs 4).
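How soon the id runs out can be estimated from the insert rate in the question. A rough Python calculation, assuming a constant 4 to 5 million rows per day and that AUTO_INCREMENT values are never reused after old partitions are dropped:

```python
INT_SIGNED_MAX = 2**31 - 1     # 2,147,483,647
INT_UNSIGNED_MAX = 2**32 - 1   # 4,294,967,295


def years_until_exhausted(max_id: int, rows_per_day: int) -> float:
    """Years of inserts at a constant daily rate before max_id is reached."""
    return max_id / rows_per_day / 365.25


if __name__ == "__main__":
    for rate in (4_000_000, 5_000_000):
        signed = years_until_exhausted(INT_SIGNED_MAX, rate)
        unsigned = years_until_exhausted(INT_UNSIGNED_MAX, rate)
        print(f"{rate:,} rows/day: signed {signed:.1f} years, unsigned {unsigned:.1f} years")
```

At these rates a signed INT lasts between one and one and a half years and INT UNSIGNED less than three, so the choice between INT UNSIGNED and BIGINT deserves attention early on.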
This partitioning design does not care about your 7-day rule. You may choose to keep the rule and enforce it in your app.
BY HASH
will not work as well.
SUBPARTITION
is generally useless.
Are there other queries? If so they must be taken into consideration at the same time.
Sharding by user_id would be useful if the traffic were too much for a single server. MySQL, itself, does not (yet) have a sharding solution.
Try TokuDB engine at https://www.percona.com/software/mysql-database/percona-tokudb
Archive data are great for TokuDB. You will need about six times less disk space to store AND memory to PROCESS your dataset compared to InnoDB or about 2-3 times less than archived myisam.
1 million+ tables sounds like a bad idea. Having sharding via dynamic table naming by the app code at runtime has also not been a favorable pattern for me. My first go-to for this type of problem would be partitioning. You probably don't want 400M+ rows in a single unpartitioned table. In MySQL 5.7 you can even subpartition (but that gets more complex). I would first range partition on your date field, with one partition per day. Index on the user_id. If you are on 5.7 and want to dabble with subpartitioning, I would suggest range partition by date, then hash subpartition by user_id. As a starting point, try 16 to 32 hash buckets. Still index the user_id field.
EDIT: Here's something to play with:
CREATE TABLE user_data (
id INT AUTO_INCREMENT
, dt TIMESTAMP DEFAULT CURRENT_TIMESTAMP
, user_id VARCHAR(20)
, data varchar(20)
, PRIMARY KEY (id, user_id, dt)
, KEY (user_id, dt)
) PARTITION BY RANGE (UNIX_TIMESTAMP(dt))
SUBPARTITION BY KEY (user_id)
SUBPARTITIONS 16 (
PARTITION p1 VALUES LESS THAN (UNIX_TIMESTAMP('2016-10-25')),
PARTITION p2 VALUES LESS THAN (UNIX_TIMESTAMP('2016-10-26')),
PARTITION p3 VALUES LESS THAN (UNIX_TIMESTAMP('2016-10-27')),
PARTITION p4 VALUES LESS THAN (UNIX_TIMESTAMP('2016-10-28')),
PARTITION pMax VALUES LESS THAN MAXVALUE
);
-- View the metadata if you're interested
SELECT * FROM information_schema.partitions WHERE table_name='user_data';
I want to store the changes that I make to my "entity" table. This should work like a log. Currently it is implemented with this table in MySQL:
CREATE TABLE `entitychange` (
`id` int(11) unsigned NOT NULL AUTO_INCREMENT,
`entity_id` int(10) unsigned NOT NULL,
`entitytype` enum('STRING_1','STRING_2','SOMEBOOL','SOMEDOUBLE','SOMETIMESTAMP') NOT NULL DEFAULT 'STRING_1',
`when` TIMESTAMP NOT NULL,
`value` TEXT,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
entity_id = the primary key of my entity table.
entitytype = the field that was changed in the entity table. sometimes only one field is changed, sometimes multiple. one change = one row.
value = the string representation of the "new value" of the field.
Example when changing Field entity.somedouble from 3 to 2, i run those queries:
UPDATE entity SET somedouble = 2 WHERE entity_id = 123;
INSERT INTO entitychange (entity_id,entitytype,value) VALUES (123,'SOMEDOUBLE',2);
I need to select the changes of a specific entity and entitytype of the last 15 days. For example: The last changes with SOMEDOUBLE for entity_id 123 within the last 15 days.
Now, there are two things that i dislike:
All data is stored as TEXT, although most of it isn't really text (less than 1% is); in my case, most values are DOUBLE. Is this a big problem?
The table is getting really, really slow on inserts, since it already has 200 million rows. Currently my server load is up to 10-15 because of this.
My question: How do I address those two "bottlenecks"? I need to scale.
My approaches would be:
Store it like this: http://sqlfiddle.com/#!2/df9d0 (click on browse) - Store the changes in the entitychange table and then store the value according to its datatype in entitychange_[bool|timestamp|double|string]
Use partitioning by HASH(entity_id) - i thought of ~50 partitions.
Should I use another database system, maybe MongoDB?
If I were facing the problem you mentioned, I would design the LOG table like below:
EntityName: (String) Entity that is being manipulated.(mandatory)
ObjectId: Entity that is being manipulated, primary key.
FieldName: (String) Entity field name.
OldValue: (String) Entity field old value.
NewValue: (String) Entity field new value.
UserCode: Application user unique identifier. (mandatory)
TransactionCode: Any operation changing the entities will need to have a unique transaction code (like a GUID) (mandatory). In the case of an update changing multiple fields of an entity, this column will be the key to tracing all changes in the update (transaction).
ChangeDate: Transaction date. (mandatory)
FieldType: enumeration or text showing the field type like TEXT or Double. (mandatory)
With this approach:
Any entity (table) can be traced.
Reports will be readable.
Only changes will be logged.
The transaction code will be the key to detecting the changes made by a single action.
BTW
Store the changes in the entitychange table and then store the value
according to its datatype in entitychange_[bool|timestamp|double|string]
Won't be needed, in the single table you will have changes and data types
Use partitioning by HASH(entity_id)
I would prefer partitioning by ChangeDate, or creating backup tables for rows whose ChangeDate is old enough to be backed up and removed from the main LOG table.
Should I use another database system, maybe MongoDB?
Any database comes with its own pros and cons; you can use this design on any RDBMS.
A useful comparison of document-based databases like MongoDB can be found here.
Hope this helps.
Now I think I understand what you need: a versionable table with a history of the changed records. This could be another way of achieving the same thing, and you could easily run some quick tests to see whether it gives you better performance than your current solution. It's the way the Symfony PHP framework does it in Doctrine with the Versionable plugin.
Keep in mind that there is a unique primary key index on two columns, version and fk_entity.
Also take a look at the values saved. You save a 0 value in the fields which didn't change and the changed value in those that did.
CREATE TABLE `entity_versionable` (
`version` INT(11) UNSIGNED NOT NULL AUTO_INCREMENT,
`fk_entity` INT(10) UNSIGNED NOT NULL,
`str1` VARCHAR(255),
`str2` VARCHAR(255),
`bool1` BOOLEAN,
`double1` DOUBLE,
`date` TIMESTAMP NOT NULL,
PRIMARY KEY (`version`,`fk_entity`)
) ENGINE=INNODB DEFAULT CHARSET=latin1;
INSERT INTO `entity_versionable` (fk_entity, str1, str2, bool1, double1, DATE)
VALUES ("1", "a1", "0", "0", "0", "2013-06-02 17:13:16");
INSERT INTO `entity_versionable` (fk_entity, str1, str2, bool1, double1, DATE)
VALUES ("1", "a2", "0", "0", "0", "2013-06-11 17:13:12");
INSERT INTO `entity_versionable` (fk_entity, str1, str2, bool1, double1, DATE)
VALUES ("1", "0", "b1", "0", "0", "2013-06-11 17:13:21");
INSERT INTO `entity_versionable` (fk_entity, str1, str2, bool1, double1, DATE)
VALUES ("1", "0", "b2", "0", "0", "2013-06-11 17:13:42");
INSERT INTO `entity_versionable` (fk_entity, str1, str2, bool1, double1, DATE)
VALUES ("1", "0", "0", "1", "0", "2013-06-16 17:19:31");
/*Another example*/
INSERT INTO `entity_versionable` (fk_entity, str1, str2, bool1, double1, DATE)
VALUES ("1", "a1", "b1", "0", "0", CURRENT_TIMESTAMP);
SELECT * FROM `entity_versionable` t WHERE
(
(t.`fk_entity`="1") AND
(t.`date` >= (CURDATE() - INTERVAL 15 DAY))
);
Probably another step to improve performance would be to save all history log records in separate tables, one table per month or so. That way you won't have many records in each table, and searching by date will be really fast.
There are three main challenges here:
1. How to store data efficiently, i.e. taking less space and being in an easy-to-use format
2. Managing a big table: archiving, ease of backup and restore
3. Performance optimisation: faster inserts and selects
Storing data efficiently
value field. I would suggest making it VARCHAR(N).
Reasons:
Using N<255 will save 1 byte per row just because of the data type.
Using other data types for this field: fixed types use space whatever the value is, normally 8 bytes per row (datetime, long integer, char(8)), and the other variable-length data types are too big for this field.
Also, the TEXT data type results in performance penalties (from the manual on BLOB and TEXT data types):
Instances of TEXT columns in the result of a query that is processed using a temporary table causes the server to use a table on disk rather than in memory because the MEMORY storage engine does not support those data types. Use of disk incurs a performance penalty, so include BLOB or TEXT columns in the query result only if they are really needed. For example, avoid using SELECT *, which selects all columns.
Each BLOB or TEXT value is represented internally by a separately allocated object. This is in contrast to all other data types, for which storage is allocated once per column when the table is opened.
Basically, TEXT is designed to store big strings and pieces of text, whereas VARCHAR() is designed for relatively short strings.
id field. (updated, thanks to #steve) I agree that this field does not carry any useful information. Use 3 columns for your primary key: entity_id, entitype and when. TIMESTAMP will guarantee pretty well that there will be no duplicates. The same columns will also be used for partitioning/sub-partitioning.
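As a rough sketch of the layout suggested above: the table name and all column types here are assumptions, since the original table definition is not shown. Note that `when` is a reserved word in MySQL, so it has to be quoted with backticks.

```sql
-- Hypothetical table; name and column types are assumed for illustration.
CREATE TABLE entitychange_sketch (
  entity_id INT UNSIGNED     NOT NULL,
  entitype  TINYINT UNSIGNED NOT NULL,
  `when`    TIMESTAMP        NOT NULL DEFAULT CURRENT_TIMESTAMP,
  `value`   VARCHAR(200)     NOT NULL,  -- VARCHAR(N) instead of TEXT
  -- No surrogate id: the three columns together identify a change.
  PRIMARY KEY (entity_id, entitype, `when`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
```

One caveat with this design: TIMESTAMP has one-second resolution by default, so two changes to the same entity and type within the same second would collide on the primary key.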
Table manageability
There are two main options: MERGE tables and partitioning. The MERGE storage engine is based on MyISAM, which is being gradually phased out as far as I understand. Here is some reading on the MERGE Storage Engine.
The main tool is partitioning, and it provides two main benefits:
1. Partition switching (often an instant operation on a large chunk of data) and the rolling-window scenario: insert new data into one table, then instantly switch all of it into an archive table.
2. Storing data in sorted order, which enables partition pruning: querying only those partitions that contain the needed data. MySQL allows sub-partitioning to group data further.
Partitioning by entity_id makes sense. If you need to query data for extended periods of time, or you have some other pattern in querying your table, use that column for sub-partitioning. There is no need for sub-partitioning on all columns of the primary key, unless partitions will be switched at that level.
The number of partitions depends on how big you want the db file for each partition to be. The number of sub-partitions depends on the number of cores, so that each core can search its own partition; N-1 sub-partitions should be OK, so that 1 core can do the overall coordination work.
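To make partition switching and pruning more concrete, here is a minimal sketch. It assumes MySQL 5.6 or later (where ALTER TABLE ... EXCHANGE PARTITION is available); the table, column and partition names are made up, and the sketch partitions by date only, to keep it short.

```sql
-- Hypothetical log table, partitioned by month on the change date.
CREATE TABLE entitychange_part (
  entity_id  INT UNSIGNED NOT NULL,
  changed_at DATETIME     NOT NULL,
  newvalue   VARCHAR(255) NOT NULL,
  PRIMARY KEY (entity_id, changed_at)  -- the partitioning column must be part of every unique key
) ENGINE=InnoDB
PARTITION BY RANGE (TO_DAYS(changed_at)) (
  PARTITION p2013_05 VALUES LESS THAN (TO_DAYS('2013-06-01')),
  PARTITION p2013_06 VALUES LESS THAN (TO_DAYS('2013-07-01')),
  PARTITION pmax     VALUES LESS THAN MAXVALUE
);

-- Archive table: same structure, but not partitioned.
CREATE TABLE entitychange_2013_05 LIKE entitychange_part;
ALTER TABLE entitychange_2013_05 REMOVE PARTITIONING;

-- Partition switching: swap the old partition with the empty archive table.
ALTER TABLE entitychange_part
  EXCHANGE PARTITION p2013_05 WITH TABLE entitychange_2013_05;

-- Partition pruning: the "partitions" column of the plan shows which
-- partitions are read (on MySQL 5.6 and earlier, use EXPLAIN PARTITIONS).
EXPLAIN
SELECT * FROM entitychange_part
WHERE changed_at >= '2013-06-10' AND changed_at < '2013-06-20';
```

Pruning only works when the WHERE clause compares the partitioning column itself against constants; wrapping the column in a function in the WHERE clause generally prevents it.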
Optimisation
Inserts:
Inserts are faster on a table without indexes, so insert a big chunk of data (do your updates), then create the indexes (if possible).
Change TEXT to VARCHAR: it takes some strain off the db engine.
Minimal logging and table locks may help, but they are not often possible to use.
Selects:
TEXT to VARCHAR should definitely improve things.
Have a current table with recent data (the last 15 days), then move it to the archive via partition switching. Here you have the option to partition the current table differently from the archive table (e.g. by date first, then entity_id), and to change the partitioning scheme by moving a small amount (1 day) of data to a temp table and changing its partitioning.
Also consider partitioning by date, since you have many queries on date ranges. Put the usage of your data and its parts first, and then decide which schema will support it best.
And as for your 3rd question, I do not see how use of MongoDB will specifically benefit this situation.
This is called a temporal database, and researchers have been struggling with the best way to store and query temporal data for over 20 years.
Trying to store the EAV data as you are doing is inefficient, in that storing numeric data in a TEXT column uses a lot of space, and your table is getting longer and longer, as you have discovered.
Another option, which is sometimes called Sixth Normal Form (although there are multiple unrelated definitions for 6NF), is to keep an extra table of revisions for each column you want to track temporally. This is similar to the solution posed by #xtrm's answer, but it doesn't need to store redundant copies of columns that haven't changed. But it does lead to an explosion in the number of tables.
I've started to read about Anchor Modeling, which promises to handle temporal changes of both structure and content. But I don't understand it well enough to explain it yet. I'll just link to it and maybe it'll make sense to you.
Here are a couple of books that contain discussions of temporal databases:
Joe Celko's SQL for Smarties, 4th ed.
Temporal Data & the Relational Model, C.J. Date, Hugh Darwen, Nikos Lorentzos
Storing an integer in a TEXT column is a no-go! TEXT is the most expensive type.
I would go as far as creating one log table per field you want to monitor:
CREATE TABLE entitychange_somestring (
entity_id INT NOT NULL, -- not a PRIMARY KEY on its own: each entity has many change rows
ts TIMESTAMP NOT NULL,
newvalue VARCHAR(50) NOT NULL, -- same type as entity.somestring
KEY(entity_id, ts)
) ENGINE=MyISAM;
Partition them, indeed.
Notice I recommend using the MyISAM engine. You do not need transactions for this (these) unconstrained, insert-only table(s).
Why is INSERTing so slow, and what can you do to make it faster?
These are the things I would look at (and roughly in the order I would work through them):
Creating a new AUTO_INCREMENT-id and inserting it into the primary key requires a lock (there is a special AUTO-INC lock in InnoDB, which is held until the statement finishes, effectively acting as a table lock in your scenario). This is not usually a problem as this is a relatively fast operation, but on the other hand, with a (Unix) load value of 10 to 15, you are likely to have processes waiting for that lock to be freed. From the information you supply, I don't see any use in your surrogate key 'id'. See if dropping that column changes performance significantly. (BTW, there is no rule that a table needs a primary key. If you don't have one, that's fine)
InnoDB can be relatively expensive for INSERTs. This is a trade off made to allow additional functionality such as transactions and may or may not be affecting you. Since all your actions are atomic, I see no need for transactions. That said, give MyISAM a try. Note: MyISAM is usually a bad choice for huge tables because it only supports table locking and not record level locking, but it does support concurrent inserts, so it might be a choice here (especially if you do drop the primary key, see above)
You could play with database storage engine parameters. Both InnoDB and MyISAM have options you could change. Some of them have an impact on how TEXT data is actually stored, others have a broader function. One you should specifically look at is innodb_flush_log_at_trx_commit.
TEXT columns are relatively expensive if (and only if) they have non-NULL values. You are currently storing all values in that TEXT column. It is worth giving the following a try: add extra fields value_int and value_double to your table and store those values in the corresponding column. Yes, that will waste some extra space, but might be faster - but this will largely be dependent on the database storage engine and its settings. Please note that a lot of what people think about TEXT column performance is not true. (See my answer to a related question on VARCHAR vs TEXT)
You suggested spreading the information over more than one table. This is only a good idea if your tables are fully independent of one another. Otherwise you'll end up with more than one INSERT operation for any change, and you're more than likely to make things a lot worse. While normalizing data is usually good(tm), it is likely to hurt performance here.
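For the innodb_flush_log_at_trx_commit setting mentioned above, a quick way to experiment looks roughly like this. It is only a sketch: changing the value needs administrative privileges, and it trades durability for speed, so test it before relying on it.

```sql
-- Show the current setting (the default, 1, flushes the redo log to disk at every commit).
SHOW VARIABLES LIKE 'innodb_flush_log_at_trx_commit';

-- 2 = write the log at each commit, but flush it to disk only about once per second.
-- Usually faster for many small INSERTs; about a second of committed
-- transactions can be lost if the OS crashes or the power fails.
SET GLOBAL innodb_flush_log_at_trx_commit = 2;
```

SET GLOBAL does not survive a server restart; to keep the value, it also has to go into the [mysqld] section of the server configuration file.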
What can you do to make SELECTs run fast?
Proper keys. And proper keys. And just in case I forgot to mention: proper keys. You don't specify in detail what your selects look like, but I assume them to be similar to "SELECT * FROM entitychange WHERE entity_id=123 AND ts>...". A single compound index on entity_id and ts should be enough to make this operation fast. Since the index has to be updated with every INSERT, it may be worth trying the performance of both entity_id, ts and ts, entity_id: It might make a difference.
Partitioning. I wouldn't even bring this subject up, if you hadn't asked in your question. You don't say why you'd like to partition the table. Performance-wise it usually makes no difference, provided that you have proper keys. There are some specific setups that can boost performance, but you'll need the proper hardware setup to go along with this. If you do decide to partition your table, consider doing that by either the entity_id or the TIMESTAMP column. Using the timestamp, you could end up with archiving system with older data being put on an archive drive. Such a partitioning system would however require some maintenance (adding partitions over time).
It seems to me that you're not as concerned about query performance as about the raw insert speed, so I won't go into more detail on SELECT performance. If this does interest you, please provide more detail.
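A minimal sketch of the compound index described above, using the table and column names from the assumed example query (entitychange, entity_id, ts); the index name is made up.

```sql
-- Compound index: equality column first, range column second.
ALTER TABLE entitychange ADD INDEX idx_entity_ts (entity_id, ts);

-- Check that the optimizer picks it up: look at the "key" column of the plan.
EXPLAIN
SELECT * FROM entitychange
WHERE entity_id = 123
  AND ts >= NOW() - INTERVAL 15 DAY;
```

To compare against the reversed column order, create a second index on (ts, entity_id) on a copy of the table and measure both INSERT and SELECT times.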
I would advise you to do a lot of in-depth testing, but in my tests I am achieving very good results with both INSERT and SELECT with the table definition I posted before. I will describe my tests in detail so anyone can easily repeat them and check whether they get better results. Back up your data before any test.
I must say that these are only tests, and may not reflect or improve your real case, but it's a good way of learning and probably a way of finding useful information and results.
The advice we have seen here is really good, and you will surely notice a great speed improvement by using a VARCHAR with a defined size instead of TEXT. Although you could gain speed with MyISAM, I would advise against using it for data integrity reasons; stay with InnoDB.
TESTING:
1. Set up the table and INSERT 200 million rows:
CREATE TABLE `entity_versionable` (
`version` INT(11) UNSIGNED NOT NULL AUTO_INCREMENT,
`fk_entity` INT(10) UNSIGNED NOT NULL,
`str1` VARCHAR(255) DEFAULT NULL,
`str2` VARCHAR(255) DEFAULT NULL,
`bool1` TINYINT(1) DEFAULT NULL,
`double1` DOUBLE DEFAULT NULL,
`date` TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
PRIMARY KEY (`version`,`fk_entity`)
) ENGINE=INNODB AUTO_INCREMENT=230297534 DEFAULT CHARSET=latin1
To insert 200+ million rows into a table in about 35 minutes, see my other question, where peterm describes one of the best ways to fill a table. It works perfectly.
Execute the following query 2 times in order to insert 200 million rows of non-random data (change the values on each run to vary the data):
INSERT INTO `entity_versionable` (fk_entity, str1, str2, bool1, double1, DATE)
SELECT 1, 'a1', 238, 2, 524627, '2013-06-16 14:42:25'
FROM
(
SELECT a.N + b.N * 10 + c.N * 100 + d.N * 1000 + e.N * 10000 + f.N * 100000 + g.N * 1000000 + h.N * 10000000 + 1 N FROM
(SELECT 0 AS N UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) a
,(SELECT 0 AS N UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) b
,(SELECT 0 AS N UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) c
,(SELECT 0 AS N UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) d
,(SELECT 0 AS N UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) e
,(SELECT 0 AS N UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) f
,(SELECT 0 AS N UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) g
,(SELECT 0 AS N UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) h
) t;
*Since you already have the original table with 200 million rows of real data, you probably won't need to fill it; just export your table data and schema and import them into a new testing table with the same schema. That way you will run your tests in a new table with your real data, and the improvements you get will also apply to the original one.
2. ALTER the new test table for performance (or use my example above in step 1 to get better results).
Once we have our new test table set up and filled with random data, we should check the advice above and ALTER the table to speed it up:
Change TEXT to VARCHAR(255).
Select and make a good primary key unique index with two or three
columns. Test with version autoincrement and fk_entity in your first
test.
Partition your table if necessary, and check if it improves speed. I
would advise not to partition it in your first tests, in order to
check for real performance gain by changing data types and mysql
configuration. Check the following link for some partition and
improvement tips.
Optimize and repair your table. The index will be rebuilt, which will
speed up searches a lot:
OPTIMIZE TABLE test.entity_versionable;
REPAIR TABLE test.entity_versionable; -- MyISAM only; on InnoDB this just reports that repair is not supported
*Write a script that runs the optimize step every night to keep your index up to date.
3. Improve your MySQL and hardware configuration by carefully reading the following threads. They are worth reading and I'm sure you will get better results.
Easily improve your Database hard disk configuration spending a bit
of money: If possible use a SSD for your main MySQL database, and a
stand alone mechanical hard disk for backup purposes. Set MySQL logs
to be saved on another third hard disk to improve speed in your
INSERTs. (Remember to defragment mechanical hard disks after some
weeks).
Performance links: general&multiple-cores, configuration,
optimizing IO, Debiancores, best configuration,
config 48gb ram..
Profiling a SQL query: How to profile a query, Check for possible bottleneck in a query
MySQL is very memory intensive; use low-latency CL7 DDR3 memory if
possible. A bit off topic, but if your system data is critical, you may want to look at ECC memory, although it's expensive.
4. Finally, test your INSERTs and SELECTs in the test table. In my tests with 200+ million rows of random data and the above table schema, it takes 0.001 seconds to INSERT a new row and about 2 minutes to search and SELECT 100 million rows. It is only a test, but these seem to be good results :)
5. My System Configuration:
Database: MySQL 5.6.10 InnoDB database (test).
Processor: AMD Phenom II 1090T X6 core, 3910Mhz each core.
RAM: 16GB DDR3 1600Mhz CL8.
HD: Windows 7 64bits SP1 in SSD, mySQL installed in SSD, logs written in mechanical hard disk.
We would probably get better results with one of the latest Intel i5 or i7 CPUs, easily overclocked to 4500MHz+, since MySQL only uses one core per SQL statement. The higher the core speed, the faster it will be executed.
6. Read more about MySQL:
O'Reilly High Performance MySQL
MySQL Optimizing SQL Statements
7. Using another database:
MongoDB or Redis will be perfect for this case and probably a lot faster than MySQL. Both are very easy to learn, and both have their advantages:
- MongoDB: MongoDB log file growth
Redis
I would definitely go for Redis. If you learn how to save the log in Redis, it will be the best way to manage the log with insanely high speed:
redis for logging
Keep in mind the following advice if you use Redis:
Redis is written in C and keeps its data in memory. It has several
methods to automatically save the information to disk
(persistence), so you probably won't have to worry about it. (In a disaster
scenario you will end up losing about 1 second of logging.)
Redis is used by a lot of sites that manage terabytes of data;
there are a lot of ways to handle that insane amount of information,
and it means that it's secure (used here at Stack Overflow, Blizzard, Twitter, YouPorn...)
Since your log will be very big, it will need to fit in memory in
order to get speed without having to access the hard disk. You may
save different logs for different dates and keep only some of them in
memory. If you reach the memory limit, you won't get any errors and everything will still work perfectly, but check the Redis FAQ for more information.
I'm totally sure that Redis will be a lot faster for this purpose than
MySQL. You will need to learn how to work with lists and
sets to update data and query/search for data. If you need really advanced query searches, you should go with MongoDB, but for simple date searches like this, Redis will be perfect.
Nice Redis article in Instagram Blog.
At work we have log tables for almost every table due to customer requirements (financial sector).
We have done it this way: two tables (a "normal" table and a log table), and triggers on insert/update/delete of the normal table which store a keyword (I, U, D) and the old record (on update, delete) or the new one (on insert) in the log table.
We have both tables in the same database schema.
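A minimal sketch of that trigger setup in MySQL could look like the following. The table, column and trigger names are hypothetical, and a real log table would mirror every column of the normal table.

```sql
-- Hypothetical "normal" table.
CREATE TABLE account (
  id      INT UNSIGNED  NOT NULL PRIMARY KEY,
  balance DECIMAL(12,2) NOT NULL
) ENGINE=InnoDB;

-- Log table: keyword (I, U, D), time of the change, and a copy of the row.
CREATE TABLE account_log (
  log_id    BIGINT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
  op        CHAR(1)         NOT NULL,
  logged_at TIMESTAMP       NOT NULL DEFAULT CURRENT_TIMESTAMP,
  id        INT UNSIGNED    NOT NULL,
  balance   DECIMAL(12,2)   NOT NULL
) ENGINE=InnoDB;

-- On insert, store the new record.
CREATE TRIGGER account_ai AFTER INSERT ON account
FOR EACH ROW
  INSERT INTO account_log (op, id, balance) VALUES ('I', NEW.id, NEW.balance);

-- On update and delete, store the old record.
CREATE TRIGGER account_au AFTER UPDATE ON account
FOR EACH ROW
  INSERT INTO account_log (op, id, balance) VALUES ('U', OLD.id, OLD.balance);

CREATE TRIGGER account_ad AFTER DELETE ON account
FOR EACH ROW
  INSERT INTO account_log (op, id, balance) VALUES ('D', OLD.id, OLD.balance);
```

Each trigger body is a single statement, so no BEGIN ... END block or DELIMITER change is needed.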
I have a warehouse table that looks like this:
CREATE TABLE Warehouse (
id BIGINT(20) UNSIGNED NOT NULL AUTO_INCREMENT,
eventId BIGINT(20) UNSIGNED NOT NULL,
groupId BIGINT(20) NOT NULL,
activityId BIGINT(20) UNSIGNED NOT NULL,
... many more ids,
"txtProperty1" VARCHAR(255),
"txtProperty2" VARCHAR(255),
"txtProperty3" VARCHAR(255),
"txtProperty4" VARCHAR(255),
"txtProperty5" VARCHAR(255),
... many more of these
PRIMARY KEY ("id"),
KEY "WInvestmentDetail_idx01" ("groupId"),
... several more indices
) ENGINE=INNODB;
Now, the following query spends about 0.8s in query time and 0.2s in fetch time, for a total of about one second. The query returns ~67,000 rows.
SELECT eventId
FROM Warehouse
WHERE accountId IN (10, 8, 13, 9, 7, 6, 12, 11)
AND scenarioId IS NULL
AND insertDate BETWEEN DATE '2002-01-01' AND DATE '2011-12-31'
ORDER BY insertDate;
Adding more ids to the select clause doesn't really change the performance at all.
SELECT eventId, groupId, activityId, insertDate
FROM Warehouse
WHERE accountId IN (10, 8, 13, 9, 7, 6, 12, 11)
AND scenarioId IS NULL
AND insertDate BETWEEN DATE '2002-01-01' AND DATE '2011-12-31'
ORDER BY insertDate;
However, adding a "property" column does change it to 0.6s fetch time and 1.8s query time.
SELECT eventId, txtProperty1
FROM Warehouse
WHERE accountId IN (10, 8, 13, 9, 7, 6, 12, 11)
AND scenarioId IS NULL
AND insertDate BETWEEN DATE '2002-01-01' AND DATE '2011-12-31'
ORDER BY insertDate;
Now to really blow your socks off. Instead of txtProperty1, using txtProperty2 changes the times to 0.8s fetch, 24s query!
SELECT eventId, txtProperty2
FROM Warehouse
WHERE accountId IN (10, 8, 13, 9, 7, 6, 12, 11)
AND scenarioId IS NULL
AND insertDate BETWEEN DATE '2002-01-01' AND DATE '2011-12-31'
ORDER BY insertDate;
The two columns are pretty much identical in the type of data they hold: mostly non-null, and neither are indexed (not that that should make a difference anyways). To be sure the table itself is healthy I ran analyze/optimize against it.
This is really mystifying to me. I can see why adding columns to the select clause only can slightly increase fetch time, but it should not change query time, especially not significantly. I would appreciate any ideas as to what is causing this slowdown.
EDIT - More data points
SELECT * actually outperforms txtProperty2 - 0.8s query, 8.4s fetch. Too bad I can't use it because the fetch time is (expectedly) too long.
The MySQL documentation for the InnoDB engine suggests that if your varchar data doesn't fit on the page (i.e. the node of the b-tree structure), then the information will be referenced on overflow pages. So on your wide Warehouse table, it may be that txtProperty1 is on-page and txtProperty2 is off-page, thus requiring additional I/O to retrieve.
Not too sure as to why the SELECT * is better; it may be able to take advantage of reading data sequentially, rather than picking its way around the disk.
I'll admit that this is a bit of a guess, but I'll give it a shot.
You have id -- the first field -- as the primary key. I'm not 100% sure how MySQL does clustered indexes as far as lookups, but it is reasonable to suspect that, for any given ID, there is some "pointer" to the record with that ID.
It is relatively easy to find the beginnings of fields when all prior fields have fixed width. All your BIGINT(20) fields have a defined size that makes it easy for the db engine to find the field given a pointer to the start of the record; it's a simple calculation. Likewise, the start of the first VARCHAR(255) field is easy to find. After that, though, because the fields are VARCHAR fields, the db engine must take the data into account to find the start of the next field, which is much slower than simply calculating where that field should be. So, for any fields after txtProperty1, you will have this issue.
What would happen if you changed all the VARCHAR(255) fields to CHAR(255) fields? It is very possible that your query will be much faster, albeit at the cost of using the maximum storage for each CHAR(255) field regardless of the data it actually contains.
Fragmented tablespace? Try a null alter table:
ALTER TABLE tbl_name ENGINE=INNODB
Since I am a SQL Server user and not a MySQL guy, this is a long shot. In SQL Server the clustered index is the table. All the table data is stored in the clustered index. Additional indexes store redundant copies of the indexed data sorted in the appropriate sort order.
My reasoning is this. As you add more and more data to the query, the fetch time remains negligible. I presume this is because you are fetching all the data from the clustered index during the query phase and there is effectively nothing left to do during the fetch phase.
The reason the SELECT * works the way it does is because your table is so wide. As long as you are just requesting the key and one or two additional columns, it is best to just get everything during the query. Once you ask for everything, it becomes cheaper to segregate the fetching between the two phases. I am guessing that if you add columns to your query one at a time, you will discover the boundary where the query analyzer switches from doing all of the fetching in the query phase to doing most of the fetching in the fetching phase.
You should post the explain plans of the two queries so we can see what they are.
My guess is that the fast one is using a "Covering index", and the slow one isn't.
This means that the slow one must do 67,000 primary key lookups, which will be very inefficient if the table isn't all in memory (typically requiring 67k IO operations if the table is arbitrarily large and each row in its own page).
In MySQL, EXPLAIN will show "Using index" if a covering index is being used.
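To test the covering-index guess, one could compare the plans before and after adding an index that contains every column the slow query touches. This is only a sketch: the column names come from the question, the index name is made up, and depending on character set and row format a full VARCHAR(255) column may exceed InnoDB's index key length limit.

```sql
-- Index covering the WHERE, ORDER BY and SELECT columns of the slow query.
ALTER TABLE Warehouse
  ADD INDEX idx_cover_prop2 (accountId, scenarioId, insertDate, eventId, txtProperty2);

-- If the index is used as a covering index, the Extra column includes "Using index".
EXPLAIN
SELECT eventId, txtProperty2
FROM Warehouse
WHERE accountId IN (10, 8, 13, 9, 7, 6, 12, 11)
  AND scenarioId IS NULL
  AND insertDate BETWEEN DATE '2002-01-01' AND DATE '2011-12-31'
ORDER BY insertDate;
```

A wide index like this costs extra disk space and slows down writes, so it is worth keeping only if the measured query time really improves.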
I had a similar issue, and creating additional right-sized indexes helped significantly. Using partitioned database tables and tuning the database's RAM also helps.
i.e. add an index to the table for (eventId, txtProperty2)
Note: I noticed that you mentioned "Warehouse". Keep in mind that with a huge database table, some additional delay is to be expected with each added condition.