I have a table that uses the MyISAM engine on my server. There are 10 UPDATE statements per second on average. I found that the MySQL process's disk writes are a lot higher than the theoretical value. After experimenting, I suspect that modifying any column rewrites the entire row of data. The following is an experiment...
My table:
CREATE TABLE `test_update` (
`id` int(11) NOT NULL DEFAULT '0',
`str1` blob,
`str2` blob,
`str3` blob,
`update_time` int(11) DEFAULT '0',
PRIMARY KEY (`id`),
KEY `update_time` (`update_time`)
) ENGINE=MyISAM;
I inserted 100,000 rows of data; each row holds 30 KB of string data (10 KB per blob). After that, I randomly update the update_time column of one row per second:
import random
import time
# cur / conn: an already-open MySQL cursor and connection (setup omitted, as in the post)
end = time.time()
while 1:
    now = int(time.time())
    randomid = random.randint(1, 100000)  # one of the 100,000 inserted rows (assuming ids 1..100000)
    sql = "update test_update set update_time=%d where id=%d" % (now, randomid)
    cur.execute(sql)
    conn.commit()
    slp_t = 1 - (time.time() - end)  # aim for roughly one update per second
    if slp_t > 0:
        time.sleep(slp_t)
    end = time.time()
and iotop shows:
https://i.stack.imgur.com/sJa8y.png
It seems like modifying an int column rewrites the entire row (or even more). Is that true? If the answer is yes, why was it designed like this? What should I do to avoid this waste?
I am trying to copy a table in MySQL (version 5.7.38-1) using the following queries:
CREATE TABLE dest LIKE src;
INSERT INTO dest SELECT * FROM src;
Table dest is created and filled with records from Table src. So far, so good. You would expect the two tables to have roughly the same size. But Table dest has 646M, whereas Table src only has 134M. After the create-step, Table dest is 48K, more or less as expected.
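For reference, one way to compare the on-disk footprint of the two tables, splitting data pages from index pages (this assumes both tables live in the current schema):
SELECT table_name,
       ROUND(data_length / 1024 / 1024)  AS data_mb,
       ROUND(index_length / 1024 / 1024) AS index_mb,
       ROUND(data_free / 1024 / 1024)    AS free_mb
FROM   information_schema.tables
WHERE  table_schema = DATABASE()
  AND  table_name IN ('src', 'dest');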
Engine is InnoDB, default row format is dynamic and compression is on.
I have executed the following to see if it would help but to no avail:
ALTER TABLE dest ROW_FORMAT=COMPRESSED;
OPTIMIZE TABLE dest;
And this is SHOW CREATE TABLE src:
CREATE TABLE `src` (
`meta_id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
`post_id` bigint(20) unsigned NOT NULL DEFAULT '0',
`meta_key` varchar(255) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
`meta_value` longtext COLLATE utf8mb4_unicode_ci,
PRIMARY KEY (`meta_id`),
KEY `post_id` (`post_id`),
KEY `meta_key` (`meta_key`(191))
) ENGINE=InnoDB AUTO_INCREMENT=6046271 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci
I am aware that the mysql version is dated but changing that is outside my scope of control.
Two questions:
What is the reason for this unexpected behavior?
What is the solution to make Table dest smaller?
Thanks for your insights.
Various possibilities. The most likely is the indexes.
What engine? Was compression on? What row format?
Please provide SHOW CREATE TABLE src.
INSERT ... SELECT ... will feed the rows into the table one at a time (but a lot faster than one INSERT statement per row). If the engine is InnoDB, then presumably the src rows are in PRIMARY KEY order, and the optimal order for inserting into dest is also that order. So I would expect the data's B-tree to be effectively 'defragmented'.
Secondary indexes are another matter. They may or may not be efficiently ordered. And the "change buffer" may or may not compensate for the ordering. The resulting B-tree for each secondary index may or may not be 'defragmented'.
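If it helps, one way to see whether the secondary indexes are where the extra space went is to compare per-index sizes for the two tables from the persistent statistics; a rough sketch, assuming persistent stats are enabled (the 5.7 default):
SELECT table_name,
       index_name,
       ROUND(stat_value * @@innodb_page_size / 1024 / 1024) AS size_mb
FROM   mysql.innodb_index_stats
WHERE  database_name = DATABASE()
  AND  table_name IN ('src', 'dest')
  AND  stat_name = 'size'
ORDER  BY table_name, index_name;
If the PRIMARY entries are similar but the post_id or meta_key entries differ a lot, the bloat is in the secondary index B-trees rather than the data.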
What version of mysql/mariadb? I may have a tool to look deeper into the issue.
MySQL seems to be very slow for updates.
A simple update statement is taking more time than MS SQL for the same update call.
Ex:
UPDATE ValuesTbl SET value1 = #value1,
value2 = #value2
WHERE co_id = #co_id
AND sel_date = #sel_date
I have changed some config settings as below
innodb_flush_log_at_trx_commit=2
innodb_buffer_pool_size=10G
innodb_log_file_size=2G
log-bin="foo-bin"
skip-log-bin
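In case it matters, a quick sanity check (not part of the original setup) that the running server actually picked up those values:
SHOW VARIABLES LIKE 'innodb_flush_log_at_trx_commit';
SHOW VARIABLES LIKE 'innodb_buffer_pool_size';
SHOW VARIABLES LIKE 'innodb_log_file_size';
SHOW VARIABLES LIKE 'log_bin';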
This is the create table query
CREATE TABLE `valuestbl` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`sel_date` datetime NOT NULL,
`co_id` int(11) NOT NULL,
`value1` decimal(10,2) NOT NULL,
`value2` decimal(10,2) NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=21621889 DEFAULT CHARSET=latin1;
MySQL version: 8.0 on Windows
The update query takes longer when compared to MS SQL. Is there anything else I need to do to make it faster?
There are no indexes; the ValuesTbl table has a PK, but it isn't used for anything. The id column is a primary key from another table, sel_date is a date field, and there are 2 decimal columns.
If there are no indexes on ValuesTbl then the update has to scan the entire table which will be slow if the table is large. No amount of server tuning will fix this.
A simple update statement is taking more time than MS SQL for the same update call.
The MS SQL server probably has an index on either co_id or sel_date. Or it has fewer rows in the table.
You need to add indexes, like the index of a book, so the database doesn't have to search the whole table. At minimum, an index on co_id will vastly help performance. If there are many rows with different sel_date values per co_id, a compound index on (co_id, sel_date) would help further, as sketched below.
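A minimal sketch of those indexes; the index names here are just placeholders:
-- at minimum (the compound index below makes this one redundant):
-- ALTER TABLE ValuesTbl ADD INDEX idx_co_id (co_id);

-- compound index covering both columns in the WHERE clause
ALTER TABLE ValuesTbl ADD INDEX idx_co_id_sel_date (co_id, sel_date);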
See Use The Index, Luke for an extensive tutorial on indexes.
I have a MyISAM table (on MariaDB) with 7 million rows in it.
CREATE TABLE `mytable` (
`id` bigint(100) unsigned NOT NULL AUTO_INCREMENT,
`x` int(5) unsigned NOT NULL DEFAULT '0',
`y` int(5) unsigned NOT NULL DEFAULT '0',
`value` int(5) unsigned NOT NULL DEFAULT '0',
PRIMARY KEY (`id`)
) ENGINE=MyISAM AUTO_INCREMENT=10152508 DEFAULT CHARSET=utf8 PAGE_CHECKSUM=1
When I do
SELECT * FROM mytable WHERE id = 167880;
it takes around 0.272 sec
When I do
UPDATE mytable SET value = 1 WHERE id = 167880;
it takes randomly from 0.200 to 2.5 sec
I was thinking it's because my table has a lot of rows, but still, it shouldn't take that much time to update a row by its primary key.
Since I did some research before posting, here are the checks I've already done:
No duplicate indexes
No other indexes than the primary key "id"
No triggers
Tried to switch to the InnoDB engine; it was worse (around 6 sec for an update)
Tried to switch to the Aria engine; it was even worse
Already did OPTIMIZE TABLE;
Config is the default config of the latest version of MariaDB (fresh install)
Made all these checks while the DB was not used by anything else, so no heavy reads during the tests
I think that the problem is the data type you are using for the id column.
Using INT rather than BIGINT can make a significant reduction in disk space.
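If the ids really do fit in 32 bits, a rough sketch of that change; note that the ALTER rebuilds the table, and the MAX(id) check is just a precaution added here:
-- confirm the existing values (and AUTO_INCREMENT) fit in INT UNSIGNED (max 4294967295)
SELECT MAX(id) FROM mytable;

-- shrink the column; this rebuilds the table
ALTER TABLE mytable MODIFY id INT UNSIGNED NOT NULL AUTO_INCREMENT;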
Read this article for more details:
http://ronaldbradford.com/blog/bigint-v-int-is-there-a-big-deal-2008-07-18/
Hope it helps
We have a data set that is fairly static in a MySQL database, but the read times are terrible (even with indexes on the columns being queried). The theory is that since rows are stored randomly (or sometimes in order of insertion), the disk head has to scan around to find different rows, even if it knows where they are due to the index, instead of just reading them sequentially.
Is it possible to change the order data is stored in on disk so that it can be read sequentially? Unfortunately, we can't add a ton more RAM at the moment to have all the queries cached. If it's possible to change the order, can we define an order within an order? As in, sort by a certain column, then sort by another column if the first column is equal.
Could this have something to do with the indices?
Additional details: non-relational single-table database with 16 million rows, 1 GB of data total, 512 MB RAM, MariaDB 5.5.30 on Ubuntu 12.04 with a standard hard drive. Also, this is a virtualized machine using OpenVZ, with 2 dedicated cores of an E5-2620 2 GHz CPU.
Create syntax:
CREATE TABLE `Events` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`provider` varchar(10) DEFAULT NULL,
`location` varchar(5) DEFAULT NULL,
`start_time` datetime DEFAULT NULL,
`end_time` datetime DEFAULT NULL,
`cost` int(11) DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `provider` (`provider`),
KEY `location` (`location`),
KEY `start_time` (`start_time`),
KEY `end_time` (`end_time`),
KEY `cost` (`cost`)
) ENGINE=InnoDB AUTO_INCREMENT=16321002 DEFAULT CHARSET=utf8;
Select statement that takes a long time:
SELECT *
FROM `Events`
WHERE `Events`.start_time >= '2013-05-03 23:00:00' AND `Events`.start_time <= '2013-06-04 22:00:00' AND `Events`.location = 'Chicago'
Explain select:
           id: 1
  select_type: SIMPLE
        table: Events
         type: ref
possible_keys: location,start_time
          key: location
      key_len: 18
          ref: const
         rows: 3684
        Extra: Using index condition; Using where
MySQL can only select one index upon which to filter (which makes sense, because having restricted the results using an index it cannot then determine how such restriction has affected other indices). Therefore, it tracks the cardinality of each index and chooses the one that is likely to be the most selective (i.e. has the highest cardinality): in this case, it has chosen the location index, but that will typically leave 3,684 records that must be fetched and then filtered Using where to find those that match the desired range of start_time.
You should try creating a composite index over (location, start_time):
ALTER TABLE Events ADD INDEX (location, start_time)
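Once the index exists, an EXPLAIN of the original query should show it being used for both the equality and the range condition; this is just a verification step, not part of the fix:
EXPLAIN
SELECT *
FROM   Events
WHERE  Events.location = 'Chicago'
  AND  Events.start_time >= '2013-05-03 23:00:00'
  AND  Events.start_time <= '2013-06-04 22:00:00';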
Basically I am monitoring the slowest queries on a website. It turns out they are something like:
INSERT INTO beststat (bestid,period,rawView) VALUES ( 'idX' , 2012 , 1 )
ON DUPLICATE KEY UPDATE rawView = rawView+1
Basically it's a logging table. If the row is already there, it updates rawView with a +1.
beststat is InnoDB, so I have row-level locking, and considering I do a lot of inserts/updates it should be faster than MyISAM.
Anyway, that query shouldn't take so long; maybe there is something else wrong. What could it be?
Of course I have a UNIQUE index on (bestid, period).
Additional Info
This table (beststat) currently has ~1 million records and its size is 68 MB. I have 4 GB RAM and innodb_buffer_pool_size = 104,857,600. MySQL: 5.1.49-3.
CREATE TABLE `beststat` (
`id` int(11) unsigned NOT NULL AUTO_INCREMENT,
`bestid` int(11) unsigned NOT NULL,
`period` mediumint(8) unsigned NOT NULL,
`view` mediumint(8) unsigned NOT NULL DEFAULT '0',
`rawView` mediumint(8) unsigned NOT NULL DEFAULT '0',
PRIMARY KEY (`id`),
UNIQUE KEY `bestid` (`bestid`,`period`)
) ENGINE=InnoDB AUTO_INCREMENT=2020577 DEFAULT CHARSET=utf8
Note: to speed things up a little bit, I could do something like:
UPDATE beststat SET rawView = rawView + 1 WHERE bestid = 'idX' AND period = 2012;
if (mysql_affected_rows() == 0)
    INSERT INTO beststat (bestid, period, rawView) VALUES ('idX', 2012, 1);
So most of the time I would run only the UPDATE. But I would like to understand why the original, more concise query is slow.
I found this interesting article... still reading
When dealing with a big number of rows, I suggest using LOAD DATA INFILE to make the query faster.
To further improve the query time, you can consider using a MEMORY table as well.
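For example, a rough sketch of such a bulk load; the file path, field/line terminators, and column list here are only assumptions:
LOAD DATA INFILE '/tmp/beststat.csv'
INTO TABLE beststat
FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n'
(bestid, period, rawView);
This works best when many rows can be staged in a file and loaded in one pass instead of being applied as individual statements.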