I'm trying to run a more or less simple query on a MySQL table with about 100000 entries (Size: 2319.58MB; avg: 24KB pre row).
This is the query:
SELECT
n0_.id AS id0,
n0_.sentTime AS sentTime1,
n0_.deliveredTime AS deliveredTime2,
n0_.queuedTime AS queuedTime3,
n0_.message AS message4,
n0_.failReason AS failReason5,
n0_.targetNumber AS targetNumber6,
n0_.sid AS sid7,
n0_.targetNumber AS targetNumber8,
n0_.sid AS sid9,
n0_.subject AS subject10,
n0_.targetAddress AS targetAddress11,
n0_.priority AS priority12,
n0_.targetUrl AS targetUrl13,
n0_.discr AS discr14,
n0_.notification_state_id AS notification_state_id15,
n0_.user_id AS user_id16
FROM
notifications n0_
INNER JOIN notification_states n1_ ON n0_.notification_state_id = n1_.id
WHERE
n0_.discr IN (
'twilioCallNotification', 'twilioSMSNotification',
'emailNotification', 'httpNotification'
)
ORDER BY
n0_.id desc
LIMIT
25 OFFSET 0
The query between 100 and 200sec.
This is the create statement of the table:
CREATE TABLE notifications (
id INT AUTO_INCREMENT NOT NULL,
notification_state_id INT DEFAULT NULL,
user_id INT DEFAULT NULL,
sentTime DATETIME DEFAULT NULL,
deliveredTime DATETIME DEFAULT NULL,
queuedTime DATETIME NOT NULL,
message LONGTEXT NOT NULL,
failReason LONGTEXT DEFAULT NULL,
discr VARCHAR(255) NOT NULL,
targetNumber VARCHAR(1024) DEFAULT NULL,
sid VARCHAR(34) DEFAULT NULL,
subject VARCHAR(1024) DEFAULT NULL,
targetAddress VARCHAR(1024) DEFAULT NULL,
priority INT DEFAULT NULL,
targetUrl VARCHAR(1024) DEFAULT NULL,
INDEX IDX_6000B0D350C80464 (notification_state_id),
INDEX IDX_6000B0D3A76ED395 (user_id),
PRIMARY KEY (id)
) DEFAULT CHARACTER SET utf8 COLLATE utf8_unicode_ci ENGINE=InnoDB;
ALTER TABLE notifications ADD CONSTRAINT FK_6000B0D350C80464 FOREIGN KEY (notification_state_id) REFERENCES notification_states (id) ON DELETE CASCADE;
ALTER TABLE notifications ADD CONSTRAINT FK_6000B0D3A76ED395 FOREIGN KEY (user_id) REFERENCES users (id) ON DELETE CASCADE;
These queries are created by Doctrine2 (ORM). We are using MySQL 5.5.31 on a virtual machine (1GB ram, single core) running Debian 6.0.7. Why is the performance of the query so bad? Is it a database design fault or is the vbox just to small to handle this amount of data?
Add an index on discr column :
ALTER TABLE `notifications` ADD INDEX `idx_discr` (`discr` ASC) ;
You're filtering by notifications.discr but don't appear to have that indexed. Have you tried adding an index on it?
Related
I am using mysql with Django. I am trying to count the number of visitor_pages for a specific dealer in a certain amount of time.
I would share the raw sql query that I have obtained from django debug toolbar.
SELECT COUNT(*) AS `__count`
FROM `visitor_page`
INNER JOIN `dealer_visitors`
ON (`visitor_page`.`dealer_visitor_id` = `dealer_visitors`.`id`)
WHERE (`visitor_page`.`date_time` BETWEEN '2021-02-01 05:51:00'
AND '2021-03-21 05:50:00'
AND `dealer_visitors`.`dealer_id` = 15)
The issue is that I have more than 13 million records in the visitor_pages table and about 1.5 million records in the dealer_visitor table. I have already indexed date_time. I am thinking of using a materialized view but before attempting that, I would really appreciate suggestions on how I could improve this query.
visitor_pages schema:
CREATE TABLE `visitor_page` (
`id` int NOT NULL AUTO_INCREMENT,
`date_time` datetime(6) DEFAULT NULL,
`added_at` datetime(6) DEFAULT NULL,
`updated_at` datetime(6) DEFAULT NULL,
`page_id` int NOT NULL,
`dealer_visitor_id` int NOT NULL,
PRIMARY KEY (`id`),
KEY `visitor_page_page_id_246babdf_fk_web_page_id` (`page_id`),
KEY `visitor_page_dealer_visitor_id_e2dddea2_fk_dealer_visitors_id` (`dealer_visitor_id`),
KEY `visitor_page_date_time_06e9e9f5` (`date_time`),
CONSTRAINT `visitor_page_dealer_visitor_id_e2dddea2_fk_dealer_visitors_id` FOREIGN KEY (`dealer_visitor_id`) REFERENCES `dealer_visitors` (`id`),
CONSTRAINT `visitor_page_page_id_246babdf_fk_web_page_id` FOREIGN KEY (`page_id`) REFERENCES `web_page` (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=13626649 DEFAULT CHARSET=latin1;
dealer_visitors schema:
CREATE TABLE `dealer_visitors` (
`id` int NOT NULL AUTO_INCREMENT,
`visit_date` datetime(6) DEFAULT NULL,
`added_at` datetime(6) DEFAULT NULL,
`updated_at` datetime(6) DEFAULT NULL,
`dealer_id` int NOT NULL,
`visitor_id` int NOT NULL,
`type` int DEFAULT NULL,
`notes` longtext,
`location` varchar(100) DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `dealer_visitors_dealer_id_306e2202_fk_dealer_id` (`dealer_id`),
KEY `dealer_visitors_visitor_id_27ae498e_fk_visitor_id` (`visitor_id`),
KEY `dealer_visitors_type_af0f7d79` (`type`),
KEY `dealer_visitors_visit_date_f2b138c9` (`visit_date`),
CONSTRAINT `dealer_visitors_dealer_id_306e2202_fk_dealer_id` FOREIGN KEY (`dealer_id`) REFERENCES `dealer` (`id`),
CONSTRAINT `dealer_visitors_visitor_id_27ae498e_fk_visitor_id` FOREIGN KEY (`visitor_id`) REFERENCES `visitor` (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=1524478 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci;
EXPLAIN ANALYZE the query gives me the following:
EXPLAIN:
For this query:
SELECT COUNT(*) AS `__count`
FROM visitor_page vp JOIN
dealer_visitors dv
ON vp.dealer_visitor_id = dv.id
WHERE vp.date_time BETWEEN '2021-02-01 05:51:00' AND '2021-03-21 05:50:00' AND
dv.dealer_id = 15;
The best indexes are on dealer_visitors(dealer_id, date_time, id) and visitor_page(dealer_visitor_id).
An index only on date helps a bit. But you are retrieving a month's worth of data and that might be a lot of data to process. Having dealer_id as the first column in the index will restrict the data to only the rows for that dealer in that time frame.
Depending on the distribution of the data, the Optimizer might pick one of the tables to start with, or pick the other. So, let's provide optimal indexes for each case:
ON `visitor_page`.`dealer_visitor_id` = `dealer_visitors`.`id`
WHERE `visitor_page`.`date_time` BETWEEN ...
AND `dealer_visitors`.`dealer_id` = 15
Starting with visitor_page:
visitor_page: INDEX(date_time) -- (already exists)
dealer_visitors: (already has PRIMARY KEY(id))
Starting with dealer_visitors:
dealer_visitors: INDEX(dealer_id) -- (already exists)
visitor_page: INDEX(dealer_visitor_id, date_time) -- in this order
and drop dealer_visitors_visitor_id_27ae498e_fk_visitor_id as now being redundant.
The net is to add one index and drop one index.
Materialized view -- It is often best for Data Warehouse reports to build and incrementally maintain a "summary table" (a "materialized view"). The very odd date range (1 month + 20 days - 61 seconds) makes this clumsy to do. Typically it is handy to make the table based on whole days. If you can shift to daily (or hourly), then see http://mysql.rjweb.org/doc.php/summarytables
Something else to check: How much RAM do you have? What does SHOW VARIABLES LIKE 'innodb_buffer_pool_size'; say?
I see that the tables have different charset/collation. This is not a problem for the query in question, but if you have other queries that JOIN on VARCHARs, check that they use the same collation.
CREATE TABLE fa (
book varchar(100) DEFAULT NULL,
PRODUCTION varchar(1000) DEFAULT NULL,
VENDOR_LEVEL varchar(100) DEFAULT NULL,
BOOK_NO int(10) DEFAULT NULL,
UNSTABLE_TIME_PERIOD varchar(100) DEFAULT NULL,
`PERIOD_YEAR` int(10) DEFAULT NULL,
promo_3_visuals_manual_drag int(10) DEFAULT NULL,
BOOK_NO int(10) DEFAULT NULL,
PRODUCT_LEVEL_DIST varchar(100) DEFAULT NULL,
PRODUCT_LEVEL_ACV_TREND varchar(100) DEFAULT NULL,
KEY book (BOOK_NO),
KEY period (PERIOD_YEAR)
) ENGINE=InnoDB DEFAULT CHARSET=utf8
Index we added to column
Index : BOOK_NO and PERIODIC_YEAR has added
we cant add unique nor primary key to both column as it has plenty of duplicate values in it.
There are 46 millions rows.
We tried partitioning to period year and catno for sub partition, but doesn't worked as it is still takes long time
When i run the update query :
update fa set UNSTABLE_TIME_PERIOD = NULL where BOOK_NO = 0 and periodic_year = 201502;
It taking me more than 7 min , how can i OPTIMIZE the query?
Instead of creating 2 different keys, create single composite key for both the columns like:
KEY book_period (BOOK_NO, PERIOD_YEAR)
Also, first filter the records based on the column which will return the small set of records as compare to other.
If you think BOOK_NO will return less number of records as compare to PERIOD_YEAR, Use BOOK_NO first in where clause else use PERIOD_YEAR first and create the key accordingly.
As Álvaro González said, you should use some sort of key (eg. a Primary Key).
Adding a Primary Key:
CREATE TABLE fa (
<your_id>,
{...},
PRIMARY KEY(<your_id>),
{...}
)
or
CREATE TABLE fa (
<your_id> PRIMARY KEY,
{...}
)
It'd be a good idea to make your PRIMARY KEY AUTO_INCREMENT too for convenience, but this is not essenitial.
I have a large live database where around 1000 users are updating 2 or more updates every minute. at the same time there are 4 users are getting reports and adding new items. the main 2 tables contains around 2 Million and 4 Million rows till present.
Queries using these tables are taking too much time, even simple queries like:
"SELECT COUNT(*) FROM MyItemsTable" and "SELECT COUNT(*) FROM MyTransactionsTable"
are taking 10 seconds and 26 seconds
large reports now are taking 15mins !!! toooooo much time.
All the table that I'm using are innodb
is there any way to solve this problem before I read about reputation ??
Thank you in advance for any help
Edit
Here is the structure and indexes of MyItemsTable:
CREATE TABLE `pos_MyItemsTable` (
`itemid` bigint(15) NOT NULL,
`uploadid` bigint(15) NOT NULL,
`itemtypeid` bigint(15) NOT NULL,
`statusid` int(1) NOT NULL,
`uniqueid` varchar(10) DEFAULT NULL,
`referencenb` varchar(30) DEFAULT NULL,
`serialnb` varchar(25) DEFAULT NULL,
`code` varchar(50) DEFAULT NULL,
`user` varchar(16) CHARACTER SET utf8 COLLATE utf8_bin DEFAULT NULL,
`pass` varchar(100) CHARACTER SET utf8 COLLATE utf8_bin DEFAULT NULL,
`expirydate` date DEFAULT NULL,
`userid` bigint(15) DEFAULT NULL,
`insertdate` datetime DEFAULT NULL,
`updateuser` bigint(15) DEFAULT NULL,
`updatedate` datetime DEFAULT NULL,
`counternb` int(1) DEFAULT '0',
PRIMARY KEY (`itemid`),
UNIQUE KEY `referencenb_unique` (`referencenb`),
KEY `MyItemsTable_r04` (`itemtypeid`),
KEY `MyItemsTable_r05` (`uploadid`),
KEY `FK_MyItemsTable` (`statusid`),
KEY `ind_MyItemsTable_serialnb` (`serialnb`),
KEY `uniqueid_key` (`uniqueid`),
KEY `ind_MyItemsTable_insertdate` (`insertdate`),
KEY `ind_MyItemsTable_counternb` (`counternb`),
CONSTRAINT `FK_MyItemsTable` FOREIGN KEY (`statusid`) REFERENCES `MyItemsTable_statuses` (`statusid`),
CONSTRAINT `MyItemsTable_r04` FOREIGN KEY (`itemtypeid`) REFERENCES `itemstypes` (`itemtypeid`) ON DELETE NO ACTION ON UPDATE NO ACTION,
CONSTRAINT `MyItemsTable_r05` FOREIGN KEY (`uploadid`) REFERENCES `uploads` (`uploadid`) ON DELETE NO ACTION ON UPDATE NO ACTION
) ENGINE=InnoDB DEFAULT CHARSET=utf8
Just having few indexes does not mean your tables and queries are optimized.
Try to identify the querties that run the slowest and add specific indexes there.
Selecting * from a huge table .. where you have columns that contain text / images / files
will be aways slow. Try to limit the selection of such fat columns when you don't need them.
future readings:
http://dev.mysql.com/doc/refman/5.0/en/innodb-index-types.html
http://www.xaprb.com/blog/2006/07/04/how-to-exploit-mysql-index-optimizations/
and some more advanced configurations:
http://www.mysqlperformanceblog.com/2006/09/29/what-to-tune-in-mysql-server-after-installation/
http://www.mysqlperformanceblog.com/2007/11/03/choosing-innodb_buffer_pool_size/
source
UPDATE:
try to use composite keys for some of the heaviest queries,
by placing the main fields that are compared in ONE index:
`MyItemsTable_r88` (`itemtypeid`,`statusid`, `serialnb`), ...
this will give you faster results for queries that complare only columns from the index :
SELECT * FROM my_table WHERE `itemtypeid` = 5 AND `statusid` = 0 AND `serialnb` > 500
and extreamlly fast if you search and select values from the index:
SELECT `serialnb` FROM my_table WHERE `statusid` = 0 `itemtypeid` IN(1,2,3);
This are really basic examples you will have to read a bit more and analyze the data for the best results.
I am facing some issue in query execution here is my case :
I have two tables log with 2 lakh records and logrecords with 6 lakh records
Where single log record in log table can have multiple log messages in logrecords table my database schema is as below
log Table
CREATE TABLE `log` (
`logid` varchar(50) NOT NULL DEFAULT '',
`creationtime` bigint(20) DEFAULT NULL,
`serviceInitiatorID` varchar(50) DEFAULT NULL,
PRIMARY KEY (`logid`),
KEY `idx_creationtime_wsc_log` (`creationtime`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1
And logrecords Table
CREATE TABLE `logrecords` (
`logrecordid` varchar(50) NOT NULL DEFAULT '',
`timestamp` bigint(20) DEFAULT NULL,
`message` varchar(8000) DEFAULT NULL,
`loglevel` int(11) DEFAULT NULL,
`logid` varchar(50) DEFAULT NULL,
`indexcolumn` int(11) DEFAULT NULL,
PRIMARY KEY (`logrecordid`),
KEY `indx_logrecordid_message_logid` (`logrecordid`,`message`(767),`logid`),
KEY `logid` (`logid`),
KEY `indx_message` (`message`(767))
) ENGINE=InnoDB DEFAULT CHARSET=latin1
Query created by hibernate is like
select this_.logid as logid4_1_, this_.loglevel as loglevel4_1_, this_.creationtime as creation3_4_1_,this_.serviceInitiatorID as service17_4_1_, this_.logtype as logtype4_1_,logrecord1_.logrecordid as logrecor1_3_0_, logrecord1_.timestamp as timestamp3_0_, logrecord1_.message as message3_0_, logrecord1_.loglevel as loglevel3_0_, logrecord1_.logid as logid3_0_, logrecord1_.indexcolumn as indexcol6_3_0_ from log this_ inner join wsc_logrecords logrecord1_ on this_.logid=logrecord1_.logid where (1=1) and (1=1) and logrecord1_.message like 'SecondMessage' order by this_.creationtime desc limit 25
Which taking around 7313ms to execute
Query Explain is like
But when I execute below query it is taking around 15 min to execute
select count(*) as y0_ from log this_ inner join logrecords logrecord1_ on this_.logid=logrecord1_.logid where (1=1) and (1=1) and lower(logrecord1_.message) like 'SecondMessage' order by this_.creationtime desc limit 25
For above query explain is like
and I am using MySQl database. I think there is some issue in indexing or some other which I am not able to identify
Any solution will be appreciated.
When you use lower(logrecord1_.message) like 'SecondMessage' instead of plain logrecord1_.message like 'SecondMessage' the DB engine will stop using the index on logrecord1_.message.
You can overcome this by creating a function based index that has lower(logrecord1_.message) in place of logrecord1_.message.
Why do I get an error of the form:
Error in query: Duplicate entry '10' for key 1
...when doing an INSERT statement like:
INSERT INTO wp_abk_period (pricing_id, apartment_id) VALUES (13, 27)
...with 13 and 27 being valid id-s for existing pricing and apartment rows, and the table is defined as:
CREATE TABLE `wp_abk_period` (
`id` int(11) NOT NULL auto_increment,
`apartment_id` int(11) NOT NULL,
`pricing_id` int(11) NOT NULL,
`type` enum('available','booked','unavailable') collate utf8_unicode_ci default NULL,
`starts` datetime default NULL,
`ends` datetime default NULL,
`recur_type` enum('daily','weekly','monthly','yearly') collate utf8_unicode_ci default NULL,
`recur_every` char(3) collate utf8_unicode_ci default NULL,
`timedate_significance` char(4) collate utf8_unicode_ci default NULL,
`check_in_times` varchar(255) collate utf8_unicode_ci default NULL,
`check_out_times` varchar(255) collate utf8_unicode_ci default NULL,
PRIMARY KEY (`id`),
KEY `fk_period_apartment1_idx` (`apartment_id`),
KEY `fk_period_pricing1_idx` (`pricing_id`),
CONSTRAINT `fk_period_apartment1` FOREIGN KEY (`apartment_id`) REFERENCES `wp_abk_apartment` (`id`) ON DELETE NO ACTION ON UPDATE NO ACTION,
CONSTRAINT `fk_period_pricing1` FOREIGN KEY (`pricing_id`) REFERENCES `wp_abk_pricing` (`id`) ON DELETE NO ACTION ON UPDATE NO ACTION
) ENGINE=InnoDB AUTO_INCREMENT=10 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci
Isn't key 1 id in this case and having it on auto_increment sufficient for being able to not specify it?
Note: If I just provide an unused value for id, like INSERT INTO wp_abk_period (id, pricing_id, apartment_id) VALUES (3333333, 13, 27) it works fine, but then again, it is set as auto_increment so I shouldn't need to do this!
Note 2: OK, this is a complete "twilight zone" moment: so after running the query above with the huge number for id, things started working normally, no more duplicate entry errors. Can someone explain me WTF was MySQL doing to produce this weird behavior?
It could be that your AUTO_INCREMENT value for the table and the actual values in id column have got out of whack.
This might help:
Step 1 - Get Max id from table
select max(id) from wp_abk_period
Step 2 - Align the AUTO_INCREMENT counter on table
ALTER TABLE wp_abk_period AUTO_INCREMENT = <value from step 1 + 100>;
Step 3 - Retry the insert
As for why the AUTO_INCREMENT has got out of whack I don't know. Added auto_increment after data was in the table? Altered the auto_increment value after data was inserted into the table?
Hope it helps.
I had the same problem and here is my solution :
My ID column had a bad parameter. It was Tinyint, and MySql want to write a 128th line.
Sometimes, your problem you think the bigger you have is only a tiny parameter...
Late to the party, but I just ran into this tonight - duplicate key '472817' and the provided answers didn't help.
On a whim I ran:
repair table wp_abk_period
which output
Number of rows changed from 472816 to 472817
Seems like mysql had the row count wrong, and the issue went away.
My environment:
mysql Ver 14.14 Distrib 5.1.73, for Win64 (unknown)
Create table syntax:
CREATE TABLE `env_events` (
`tableId` int(11) NOT NULL AUTO_INCREMENT,
`deviceId` varchar(50) DEFAULT NULL,
`timestamp` int(11) DEFAULT NULL,
`temperature` float DEFAULT NULL,
`humidity` float DEFAULT NULL,
`pressure` float DEFAULT NULL,
`motion` int(11) DEFAULT NULL,
PRIMARY KEY (`tableId`)
) ENGINE=MyISAM AUTO_INCREMENT=528521 DEFAULT CHARSET=latin1
You can check the current value of the auto_increment with the following command:
show table status
Then check the max value of the id and see if it looks right. If not change the auto_increment value of your table.
When debugging this problem check the table name case sensitivity (especially if you run MySql not on Windows).
E.g. if one script uses upper case to 'CREATE TABLE my_table' and another script tries to 'INSERT INTO MY_TABLE'. These 2 tables might have different contents and different file system locations which might lead to the described problem.