I have a big MySQL database, about 300 MB, with 4 tables: the first is about 200 MB, the second about 80 MB.
There are 150,000 records in the first table and 200,000 in the second.
I use an inner join between them.
The select takes 3 seconds now that I use optimization and indexes (before that it took about 20-30 seconds).
That is a good enough result, but I need more, because the page takes 7-8 seconds to load (3-4 for the select, 1 for a count, 1 for other small queries, and 1-2 for page generation).
So, what should I do? Maybe Postgres would take less time than MySQL? Or maybe it would be better to use memcached, but that could take a lot of memory (there are too many sorting variants).
Maybe somebody has another idea? I would be glad to hear a new one :)
OK, I see we need the queries :)
I renamed the fields of table_1.
CREATE TABLE `table_1` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`field` varchar(2048) DEFAULT NULL,
`field` varchar(2048) DEFAULT NULL,
`field` int(10) unsigned DEFAULT NULL,
`field` text,
`field` text,
`field` text,
`field` varchar(128) DEFAULT NULL,
`field` text,
`field` text,
`field` text,
`field` text,
`field` text,
`field` varchar(128) DEFAULT NULL,
`field` text,
`field` varchar(4000) DEFAULT NULL,
`field` varchar(4000) DEFAULT NULL,
`field` int(10) unsigned DEFAULT '1',
`field` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
`field` text,
`new` tinyint(1) NOT NULL DEFAULT '0',
`applications` varchar(255) DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `indexNA` (`new`,`applications`) USING BTREE
) ENGINE=InnoDB AUTO_INCREMENT=153235 DEFAULT CHARSET=utf8;
CREATE TABLE `table_2` (
`id_record` int(10) unsigned NOT NULL AUTO_INCREMENT,
`catalog_name` varchar(512) NOT NULL,
`catalog_url` varchar(4000) NOT NULL,
`parent_id` int(10) unsigned NOT NULL DEFAULT '0',
`checked` tinyint(1) NOT NULL DEFAULT '0',
`level` int(10) unsigned NOT NULL DEFAULT '0',
`work` int(10) unsigned NOT NULL DEFAULT '0',
`update` int(10) unsigned NOT NULL DEFAULT '1',
`type` int(10) unsigned NOT NULL DEFAULT '0',
`hierarchy` varchar(512) DEFAULT NULL,
`synt` tinyint(1) NOT NULL DEFAULT '0',
PRIMARY KEY (`id_record`,`type`) USING BTREE,
KEY `rec` (`id_record`) USING BTREE
) ENGINE=InnoDB AUTO_INCREMENT=14504 DEFAULT CHARSET=utf8;
CREATE TABLE `table_3` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`id_table_1` int(10) unsigned NOT NULL,
`id_category` int(10) unsigned NOT NULL,
`work` int(10) unsigned NOT NULL DEFAULT '1',
`update` int(10) unsigned NOT NULL DEFAULT '1',
PRIMARY KEY (`id`),
KEY `site` (`id_table_1`,`id_category`) USING BTREE
) ENGINE=InnoDB AUTO_INCREMENT=203844 DEFAULT CHARSET=utf8;
The queries are:
1) get general count (takes less than 1 sec):
SELECT count(table_1.id) FROM table_1
INNER JOIN table_3 ON table_3.id_table_id = table_1.id
INNER JOIN table_2 ON table_2.id_record = table_3.id_category
WHERE ((table_2.type = 0)
AND (table_3.work = 1 AND table_2.work = 1)
AND (table_1.new = 1)) AND 1 IN (table_1.applications)
2) get the list for a page with limit (it takes from 3 to 7 seconds, depending on the count):
SELECT table_1.field, table_1.field, table_1.field, table_1.field, table_2.catalog_name FROM table_1
INNER JOIN table_3 ON table_3.id_table_id = table_1.id
INNER JOIN table_2 ON table_2.id_record = table_3.id_category
WHERE ((table_2.type = 0)
AND (table_3.work = 1 AND table_2.work = 1)
AND (table_1.new = 1)) AND 1 IN (table_1.applications) LIMIT 10 OFFSET 10
Do Not Change DBMS
I would not suggest changing your DBMS; it could be very disruptive. If you have used MySQL-specific queries that are not compatible with Postgres, you might need to redo the whole indexing and so on. Even then, a performance improvement is not guaranteed.
Caching is a Good Option
Caching is a really good idea. It takes load off your DBMS. It is best suited to heavy-read, light-write workloads, since objects then stay in the cache longer. Memcached is a really good and really simple caching mechanism. Rapidly scaling sites (Facebook and the like) make heavy use of memcached to take load off the database.
How to Scale-up Really Big Time
Your data is not all that heavy, so caching would most likely help you. The next step beyond caching is NoSQL-based solutions like Cassandra. We use Cassandra in one of our applications, where we have heavy read and write (50:50) operations and a really large, fast-growing database, and it gives good performance. But I guess in your case, Cassandra is overkill.
But...
Before you dive into any serious changes, I suggest really looking into your indexes. Try scaling vertically. Look into slow queries (search for the slow-query-logging directive). Hopefully, MySQL will be fast enough after optimizing these things and you will not need additional tools.
You should look into indexing tailored to the most frequent/time-consuming queries you run. Check this post on indexing for MySQL.
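As a minimal sketch (thresholds are illustrative, not from the question), the slow query log can be enabled at runtime like this:
SET GLOBAL slow_query_log = 'ON';
SET GLOBAL long_query_time = 1;             -- log anything slower than 1 second
SHOW VARIABLES LIKE 'slow_query_log_file';  -- where the log is being written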
Aside from all the other suggestions others have offered, I've slightly altered the queries; I'm not positive of the performance impact under MySQL. I've added STRAIGHT_JOIN so the optimizer joins the tables in the order written instead of picking an order for you.
Next, I moved the AND conditions into the respective JOIN clauses for tables 2 and 3.
Finally, the join from table 1 to 3 had (in your post)
table_3.id_table_id = table_1.id
instead of
table_3.id_table_1 = table_1.id
Additionally, I can't tell the performance impact, but maybe a stand-alone index on just the new column would help, giving an exact match first without regard to the applications column. I don't know if the compound index is causing an issue, since you are using IN for applications, which is not truly an indexable search basis.
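If you want to experiment with that, a sketch of the suggested stand-alone index (the index name idx_new is made up):
ALTER TABLE table_1 ADD INDEX idx_new (`new`);
-- then re-run EXPLAIN on the queries to see whether the optimizer prefers idx_new over indexNA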
Here are the modified queries:
SELECT STRAIGHT_JOIN
count(table_1.id)
FROM
table_1
JOIN table_3
ON table_1.id = table_3.id_table_1
AND table_3.work = 1
JOIN table_2
ON table_3.id_category = table_2.id_record
AND table_2.type = 0
AND table_2.work = 1
WHERE
table_1.new = 1
AND 1 IN (table_1.applications)
SELECT STRAIGHT_JOIN
table_1.field,
table_1.field,
table_1.field,
table_1.field,
table_2.catalog_name
FROM
table_1
JOIN table_3
ON table_1.id = table_3.id_table_1
AND table_3.work = 1
JOIN table_2
ON table_3.id_category = table_2.id_record
AND table_2.type = 0
AND table_2.work = 1
WHERE
table_1.new = 1
AND 1 IN (table_1.applications)
LIMIT 10 OFFSET 10
You should also optimize your query.
Without a look at the statements, this question can only be answered using theoretical approaches. Just a few ideas to take into consideration...
The SELECT-Statement...
First of all, make sure that your query is as good as it can be. Are there any indexes you might have missed? Are the indexed fields of matching types on both sides of each join, and so on? Can you perhaps narrow the query down so the database has less work to do?
The Query cache...
If your query is repeated pretty often, it might help to use the query cache or, in case you're already using it, to give it more RAM.
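A sketch of inspecting and resizing it (the size is illustrative; note the query cache was removed entirely in MySQL 8.0):
SHOW VARIABLES LIKE 'query_cache%';      -- current configuration
SHOW STATUS LIKE 'Qcache%';              -- hit/miss statistics
SET GLOBAL query_cache_size = 67108864;  -- e.g. 64 MB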
The Hardware...
Of course, different RDBMSs are slower or faster than others, depending on their strengths and weaknesses, but if your query is already optimized into oblivion, the only way to make it faster is to scale up the database server (better CPU, better I/O, and so on, depending on where the bottleneck is).
Other Factors...
If all of this is maxed out, maybe try speeding up the other components (1-2 seconds for page generation looks pretty slow to me).
For all the factors mentioned, there is a huge number of ideas and posts on stackoverflow.com.
That is actually not such a big database, and certainly not too much for your database system. For comparison, the database we are using is currently around 40 GB. It's an MS SQL Server, though, so it's not directly comparable, but there is no dramatic difference between the database systems.
My guess is that you haven't been completely successful in using indexes to speed up the query. You should look at the execution plan for the query and see if you can spot what part of the execution that is taking most of the time.
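For example, the count query from the question could be inspected like this (with the join column typo corrected; the output depends on your data):
EXPLAIN SELECT count(table_1.id) FROM table_1
INNER JOIN table_3 ON table_3.id_table_1 = table_1.id
INNER JOIN table_2 ON table_2.id_record = table_3.id_category
WHERE table_2.type = 0 AND table_3.work = 1 AND table_2.work = 1
AND table_1.new = 1 AND 1 IN (table_1.applications);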
Related
I have 2 MySQL UPDATE query problems on my website.
Problem 1
I run a content site that updates page views for posts when users read them.
Each time I send push notifications, my server times out; when I comment out the UPDATE query that increments the page views, everything returns to normal.
I think this may be the result of hundreds of UPDATE queries trying to update the views on the same row.
The query that updates tablename:
update table set views='$newview' where id=1
Query Explain
id: 1
select_type: SIMPLE
table: new_jobs
type: range
possible_keys: PRIMARY
key: PRIMARY
key_len: 4
ref: NULL
rows: 1
Extra: Using where
The CREATE TABLE for tablename:
CREATE TABLE `tablename` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`company_id` int(11) DEFAULT NULL,
`job_title` varchar(255) DEFAULT NULL,
`slug` varchar(255) DEFAULT NULL,
`advert_date` date DEFAULT NULL,
`expiry_date` date DEFAULT NULL,
`no_deadline` int(1) DEFAULT 0,
`source` varchar(20) DEFAULT NULL,
`featured` int(1) DEFAULT 0,
`views` int(11) DEFAULT 1,
`email_status` int(1) DEFAULT 0,
`draft` int(1) DEFAULT 0,
`created_by` int(11) DEFAULT NULL,
`show_company_name` int(1) DEFAULT 1,
`display_application_method` int(1) DEFAULT 0,
`status` int(1) DEFAULT 1,
`upload_date` datetime DEFAULT NULL,
`country` int(1) DEFAULT 1,
`followers_email_status` int(1) DEFAULT 0,
`og_img` varchar(255) DEFAULT NULL,
`old_id` int(11) DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `new_jobs_company_id_index` (`company_id`),
KEY `new_jobs_views_index` (`views`),
KEY `new_jobs_draft_index` (`draft`),
KEY `new_jobs_country_index` (`country`)
) ENGINE=InnoDB AUTO_INCREMENT=151359 DEFAULT CHARSET=utf8
What is the best way of handling this?
[Scenario 2 removed on request]
Scenario 1. I would expect the update of a 'view' count (or 'click' or 'like' or whatever) to be more like
UPDATE t SET views = views + 1 WHERE id = 123;
I assume you have an index (probably the PRIMARY KEY) on id?
Since there are other things going on with that table, it may be wise to split off the rapidly incrementing counter into a separate table. This would avoid interfering with other queries. You can get other data, plus the counter, by using JOIN .. USING(id).
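A minimal sketch of that split, with hypothetical names:
-- counter lives in its own narrow row, away from the wide job row
CREATE TABLE tablename_views (
  id INT NOT NULL PRIMARY KEY,   -- same id as tablename
  views INT NOT NULL DEFAULT 1
) ENGINE=InnoDB;
-- the hot-path update touches only this table
UPDATE tablename_views SET views = views + 1 WHERE id = 123;
-- reads get other data plus the counter via the join
SELECT t.job_title, v.views
FROM tablename t
JOIN tablename_views v USING(id)
WHERE t.id = 123;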
Scenario 2 does not make sense. It seems to keep the latest date for each email, but what does country mean? Since it seems like more than just a counter, you might want a separate table to log those 3 columns.
Please provide SHOW CREATE TABLE.
There are many things that novices perceive as a "crash". Please describe further -- out of connections, out of disk space, sluggishness, the client gave error message, other operations taking too long, etc. Each has a different remedy.
Query
Are you currently doing the logical equivalent of
BEGIN;
$ct = SELECT views ... FOR UPDATE;
...
UPDATE ... SET views = $ct+1 WHERE ...;
COMMIT;
If so, that is much less efficient than
(with autocommit = ON)
UPDATE ... SET views = views+1 ...;
Note that the first version hangs onto the row longer. If you fail to use FOR UPDATE, you will drop some counts.
Splitting into a separate table sort of forces you to run the UPDATE as its own transaction.
Other
innodb_flush_log_at_trx_commit:
Default is 1, which is secure but leads to at least one I/O operation for each transaction.
2 leads to a flush once a second. During intense times, this is much more efficient, but a crash could lose up to one second's worth of updates. The inaccuracy of the view count due to a rare crash is, in my opinion, acceptable.
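If you accept that tradeoff, the corresponding setting is:
SET GLOBAL innodb_flush_log_at_trx_commit = 2;  -- flushed roughly once per second
-- also put it in my.cnf so it survives a restart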
KEY(views) needs to be updated every time views is changed. But, thanks to the "change buffer", this is unlikely to involve any extra I/O, at least now while you are doing the UPDATE.
INT(1) takes 4 bytes; the (1) has no meaning. Suggest changing to TINYINT (1 byte), thereby saving about 27 bytes per row. (7 columns plus 2 indexes)
country INT(1) -- Is it a flag? What is the meaning? Is it normalized to another table? Using 4 bytes for an id and an extra table when standard abbreviations ('US', 'UK', 'RU', 'IN', etc) would take 2 bytes? Suggest country CHAR(2) CHARACTER SET ascii COLLATE ascii_general_ci.
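An illustrative ALTER for two of the flag columns and country (existing numeric country values would have to be remapped to 2-letter codes first; 'US' as a default is a placeholder):
ALTER TABLE tablename
  MODIFY `featured` TINYINT DEFAULT 0,
  MODIFY `draft` TINYINT DEFAULT 0,
  MODIFY `country` CHAR(2) CHARACTER SET ascii COLLATE ascii_general_ci DEFAULT 'US';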
Indexing flags rarely benefits. Let's see the queries where you think such indexes might be used. And the EXPLAIN SELECT ... for them.
The original question asked where best to set the tx isolation to READ UNCOMMITTED, but after some advice it seems that my initial thought that this was a possible solution was incorrect.
DDL
CREATE TABLE `tblgpslog` (
`GPSLogID` BIGINT(20) NOT NULL AUTO_INCREMENT,
`DTSaved` DATETIME NULL DEFAULT NULL,
`PrimaryAssetID` BIGINT(20) NULL DEFAULT NULL,
`SecondaryAssetID` BIGINT(20) NULL DEFAULT NULL,
`ThirdAssetID` BIGINT(20) NULL DEFAULT NULL,
`JourneyType` CHAR(1) NOT NULL DEFAULT 'B',
`DateStamp` DATETIME NULL DEFAULT NULL,
`Status` VARCHAR(50) NULL DEFAULT NULL,
`Location` VARCHAR(255) NULL DEFAULT '',
`Latitude` DECIMAL(11,8) NULL DEFAULT NULL,
`Longitude` DECIMAL(11,8) NULL DEFAULT NULL,
`GPSFix` CHAR(2) NULL DEFAULT NULL,
`Speed` BIGINT(20) NULL DEFAULT NULL,
`Heading` INT(11) NULL DEFAULT NULL,
`LifeOdometer` BIGINT(20) NULL DEFAULT NULL,
`Extra` VARCHAR(20) NULL DEFAULT NULL,
`BatteryLevel` VARCHAR(5) NULL DEFAULT '--',
`Ignition` TINYINT(4) NOT NULL DEFAULT '1',
`Radius` INT(11) NOT NULL DEFAULT '0',
`GSMLatitude` DECIMAL(11,8) NOT NULL DEFAULT '0.00000000',
`GSMLongitude` DECIMAL(11,8) NOT NULL DEFAULT '0.00000000',
PRIMARY KEY (`GPSLogID`),
UNIQUE INDEX `GPSLogID` (`GPSLogID`),
INDEX `SecondaryUnitID` (`SecondaryAssetID`),
INDEX `ThirdUnitID` (`ThirdAssetID`),
INDEX `DateStamp` (`DateStamp`),
INDEX `PrimaryUnitIDDateStamp` (`PrimaryAssetID`, `DateStamp`, `Status`),
INDEX `Location` (`Location`),
INDEX `DTSaved` (`DTSaved`),
INDEX `PrimaryAssetID` (`PrimaryAssetID`)
)
COLLATE='latin1_swedish_ci'
ENGINE=InnoDB
AUTO_INCREMENT=153076364
;
The original query is as follows
SELECT L.GPSLogID, L.DateStamp, L.Status, Location, Latitude, Longitude, GPSFix, Speed, Heading, LifeOdometer, BatteryLevel, Ignition, L.Extra
FROM tblGPSLog L
WHERE PrimaryAssetID = 183 AND L.GPSLogID > 147694199
ORDER BY DateStamp ASC
LIMIT 100;
"id","select_type","table","type","possible_keys","key","key_len","ref","rows","Extra"
"1","SIMPLE","L","index_merge","PRIMARY,GPSLogID,PrimaryUnitIDDateStamp,PrimaryAssetID","PrimaryAssetID,PRIMARY","9,8",\N,"96","Using intersect(PrimaryAssetID,PRIMARY); Using where; Using filesort"
This gave issues a few months ago, and after a bit of investigation I changed the query to the one below, but that is now behaving very similarly.
EXPLAIN SELECT GPSLogID, DateStamp, tmpA.Status, Location, Latitude, Longitude, GPSFix, Speed, Heading, LifeOdometer, BatteryLevel, Ignition, tmpA.Extra,
PrimaryAssetID FROM (SELECT L.GPSLogID, L.DateStamp, L.Status, Location, Latitude, Longitude, GPSFix, Speed, Heading, LifeOdometer,
BatteryLevel, Ignition, L.Extra, PrimaryAssetID
FROM tblGPSLog L
WHERE L.GPSLogID > 147694199) AS tmpA
WHERE PrimaryAssetID = 183
ORDER BY DateStamp ASC;
"id","select_type","table","type","possible_keys","key","key_len","ref","rows","Extra"
"1","PRIMARY","<derived2>","ALL",\N,\N,\N,\N,"5380842","Using where; Using filesort"
"2","DERIVED","L","range","PRIMARY,GPSLogID","PRIMARY","8",\N,"8579290","Using where"
Thanks for any advice.
Jim
I believe setting tx isolation to READ UNCOMMITTED, will stop the SELECT from locking the table.
Why would you believe that READ UNCOMMITTED will accomplish that?
SELECT is already non-locking by default in all isolation levels except for SERIALIZABLE.
That is, SELECT is always non-locking unless you use FOR UPDATE or FOR SHARE / LOCK IN SHARE MODE. When using SERIALIZABLE isolation level, SELECT is implicitly converted to a locking SELECT FOR SHARE. See https://dev.mysql.com/doc/refman/8.0/en/innodb-transaction-isolation-levels.html
I strongly recommend to never use READ UNCOMMITTED. This is not a good idea, because your transaction can read uncommitted work by other transactions, which means you can read inconsistent data (partially completed transactions), and phantom data (changes from transactions that are eventually rolled back). There is no advantage to doing this, and a potential for queries returning wrong results.
What makes you think locking is the cause of your performance problem? Have you observed an increase in lock time in the slow query log?
It's more common for performance problems to be caused by poor query optimization or not enough system resource.
If your database has become slower after 8+ years, I would guess that the database has grown until the active data set no longer fits in RAM.
Re your comment:
Is there a tool or way to investigate this further? I know the query that's causing the issue, I just can't determine why.
There are many tools and ways to investigate. There are books on this subject like High Performance MySQL, and whole companies devoted to creating performance monitoring tools, like Percona and VividCortex.
I can't guess at a suggestion without knowing more specific details. If you want more help, can you please edit your original question above and add:
The SQL query that is having trouble.
The output of EXPLAIN <query> for the query that's having trouble.
The output of SHOW CREATE TABLE <tablename> for each table referenced by the query. You can run this statement in the MySQL client.
That's for starters.
Your statements
it's rare that a SELECT would hit the table while an INSERT is happening, and even if it does, it wouldn't cause any great issues.
DELETE statements are scheduled once a week only at off peak hours,
equate to "Changing the isolation mode won't help much."
I recommend setting long_query_time=1 and turning on the slowlog. Later, look through the slowlog with pt-query-digest to find the few "worst" queries. Then let's discuss improving them.
More
INDEX `PrimaryUnitIDDateStamp` (`PrimaryAssetID`, `DateStamp`, `Status`),
INDEX `PrimaryAssetID` (`PrimaryAssetID`)
The first of those takes care of the second, so the second is unnecessary.
PRIMARY KEY (`GPSLogID`),
UNIQUE INDEX `GPSLogID` (`GPSLogID`),
A PK is a UNIQUE key, so chuck the second of those. That extra unique index slows down inserts and wastes disk space.
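Concretely, that cleanup might look like:
ALTER TABLE tblgpslog
  DROP INDEX `GPSLogID`,        -- duplicate of the PRIMARY KEY
  DROP INDEX `PrimaryAssetID`;  -- prefix of PrimaryUnitIDDateStamp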
In this, I see no reason to have a query and subquery:
SELECT GPSLogID, DateStamp, tmpA.Status, Location, Latitude,
Longitude, GPSFix, Speed, Heading, LifeOdometer, BatteryLevel,
Ignition, tmpA.Extra, PrimaryAssetID
FROM
( SELECT L.GPSLogID, L.DateStamp, L.Status, Location, Latitude,
Longitude, GPSFix, Speed, Heading, LifeOdometer, BatteryLevel,
Ignition, L.Extra, PrimaryAssetID
FROM tblGPSLog L
WHERE L.GPSLogID > 147694199
) AS tmpA
WHERE PrimaryAssetID = 183
ORDER BY DateStamp ASC;
A pair of DECIMAL(11,8) adds up to 12 bytes, and is overkill for lat&lng. See this for smaller alternatives.
The table has been growing in size, correct? And, after it got so big, performance took a nose dive? Shrinking datatypes to shrink the table is one approach, albeit a temporary fix.
Using intersect(PrimaryAssetID,PRIMARY) -- Almost always, it is better to build a composite index than to use "Index merge intersect".
Although
INDEX `PrimaryAssetID` (`PrimaryAssetID`)
should have been equivalent to
INDEX `PrimaryAssetID` (`PrimaryAssetID`, GPSLogID)
something is preventing it. Suggest you add this 2-column composite index. Perhaps a large percentage of rows have PrimaryAssetID = 183? If convenient, please run SELECT COUNT(*) FROM tblgpslog WHERE PrimaryAssetID = 183
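As a statement (the index name is arbitrary):
ALTER TABLE tblgpslog ADD INDEX `PrimaryAssetID_GPSLogID` (`PrimaryAssetID`, `GPSLogID`);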
Will you be purging 'old' data from this log? If so, the optimal way involves PARTITIONing; see this.
can you please advise why such a query would take so long (literally 20-30 minutes)?
I seem to have proper indexes set up, don't I?
UPDATE `temp_val_import_435` t1, `attr_upc` t2
SET t1.`attr_id` = t2.`id`
WHERE t1.`value` LIKE t2.`upc`
CREATE TABLE `attr_upc` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`upc` varchar(255) NOT NULL,
`last_update` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
PRIMARY KEY (`id`),
UNIQUE KEY `upc` (`upc`),
KEY `last_update` (`last_update`)
) ENGINE=InnoDB AUTO_INCREMENT=102739 DEFAULT CHARSET=utf8
CREATE TABLE `temp_val_import_435` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`attr_id` int(11) DEFAULT NULL,
`translation_id` int(11) DEFAULT NULL,
`source_value` varchar(255) NOT NULL,
`value` varchar(255) DEFAULT NULL,
`count` int(11) NOT NULL,
PRIMARY KEY (`id`),
KEY `core_value_id` (`core_value_id`),
KEY `translation_id` (`translation_id`),
KEY `source_value` (`source_value`),
KEY `value` (`value`),
KEY `count` (`count`)
) ENGINE=InnoDB AUTO_INCREMENT=32768 DEFAULT CHARSET=utf8
Ed Cottrell's solution worked for me. Using = instead of LIKE sped up a smaller test query on 1000 rows by a lot.
I measured 2 ways: one in phpMyAdmin, the other by looking at the time for DOM load (which of course involves other processes).
DOM load went from 44 seconds to 1 second, a 98% reduction.
But the difference in query execution time was much more dramatic, going from 43.4 seconds to 0.0052 seconds, a reduction of 99.988%. Pretty good. I will report back on results from huge datasets.
Use = instead of LIKE. = should be much faster than LIKE; LIKE is only for matching patterns, as in '%something%', which matches anything with "something" anywhere in the text.
If you have this query:
SELECT * FROM myTable where myColumn LIKE 'blah'
MySQL can optimize this by pretending you typed myColumn = 'blah', because it sees that the pattern is fixed and has no wildcards. But what if you have this data in your upc column:
blah
foo
bar
%foo%
%bar
etc.
MySQL can't optimize your query in advance, because it's possible that the text it is trying to match is a pattern, like %foo%. So it has to perform a pattern match of every single value of temp_val_import_435.value against every single value of attr_upc.upc. With a simple = and the indexes you have defined, this is unnecessary, and the query should be dramatically faster.
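Applied to the original statement, the equality version would be:
UPDATE `temp_val_import_435` t1
JOIN `attr_upc` t2 ON t1.`value` = t2.`upc`
SET t1.`attr_id` = t2.`id`;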
In essence you are joining on a LIKE, which is going to be problematic (we would need EXPLAIN output to see whether MySQL is utilizing indexes at all). Try this:
UPDATE `temp_val_import_435` t1
INNER JOIN `attr_upc` t2
ON t1.`value` LIKE t2.`upc`
SET t1.`attr_id` = t2.`id`
I've been working on a small Perl program that works with a table of articles, displaying them to the user if they have not been already read. It has been working nicely and it has been quite speedy, overall. However, this afternoon, the performance has degraded from fast enough that I wasn't worried about optimizing the query to a glacial 3-4 seconds per query. To select articles, I present this query:
SELECT channelitem.ciid, channelitem.cid, name, description, url, creationdate, author
FROM `channelitem`
WHERE ciid NOT
IN (
SELECT ciid
FROM `uninet_channelitem_read`
WHERE uid = '1030'
)
AND (
cid =117
OR cid =308
OR cid =310
)
ORDER BY `channelitem`.`creationdate` DESC
LIMIT 0 , 100
The list of possible cid's varies and could be quite a bit more. In any case, I noted that about 2-3 seconds of the total time to make the query is devoted to "ORDER BY." If I remove that, it only takes about a half second to give me the query back. If I drop the subquery, the performance goes back to normal... but the subquery didn't seem to be problematic until just this afternoon, after working fine for a week or so.
Any ideas what could be slowing it down so much? What might I do to try to get the performance back up to snuff? The table being queried has 45,000 rows. The subquery's table has fewer than 3,000 rows at present.
Update: Incidentally, if anyone has suggestions on how to do multiple queries or some other technique that would be more efficient to accomplish what I am trying to do, I am all ears. I'm really puzzled how to solve the problem at this point. Can I somehow apply the order by before the join to make it apply to the real table and not the derived table? Would that be more efficient?
Here is the latest version of the query, derived from suggestions from @Gordon, below
SELECT channelitem.ciid, channelitem.cid, name, description, url, creationdate, author
FROM `channelitem`
LEFT JOIN (
SELECT ciid, dateRead
FROM `uninet_channelitem_read`
WHERE uid = '1030'
)alreadyRead ON channelitem.ciid = alreadyRead.ciid
WHERE (
alreadyRead.ciid IS NULL
)
AND `cid`
IN ( 6648, 329, 323, 6654, 6647 )
ORDER BY `channelitem`.`creationdate` DESC
LIMIT 0 , 100
Also, I should mention what my db structure looks like with regards to these two tables -- maybe someone can spot something odd about the structure:
CREATE TABLE IF NOT EXISTS `channelitem` (
`newsversion` int(11) NOT NULL DEFAULT '0',
`cid` int(11) NOT NULL DEFAULT '0',
`ciid` int(11) NOT NULL AUTO_INCREMENT,
`description` text CHARACTER SET utf8 COLLATE utf8_unicode_ci,
`url` varchar(222) DEFAULT NULL,
`creationdate` datetime DEFAULT NULL,
`urgent` varchar(10) DEFAULT NULL,
`name` varchar(255) CHARACTER SET utf8 COLLATE utf8_unicode_ci DEFAULT NULL,
`lastchanged` datetime NOT NULL DEFAULT '0000-00-00 00:00:00',
`author` varchar(255) NOT NULL,
PRIMARY KEY (`ciid`),
KEY `newsversion` (`newsversion`),
KEY `cid` (`cid`),
KEY `creationdate` (`creationdate`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1 AUTO_INCREMENT=1638554365 ;
CREATE TABLE IF NOT EXISTS `uninet_channelitem_read` (
`ciid` int(11) NOT NULL,
`uid` int(11) NOT NULL,
`dateRead` datetime NOT NULL,
PRIMARY KEY (`ciid`,`uid`),
KEY `ciid` (`ciid`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1;
It never hurts to try the left outer join version of such a query:
SELECT ci.ciid, ci.cid, ci.name, ci.description, ci.url, ci.creationdate, ci.author
FROM `channelitem` ci left outer join
(SELECT ciid
FROM `uninet_channelitem_read`
WHERE uid = '1030'
) cr
on ci.ciid = cr.ciid
where cr.ciid is null and
ci.cid in (117, 308, 310)
ORDER BY ci.`creationdate` DESC
LIMIT 0 , 100
This query will be faster with an index on uninet_channelitem_read(ciid) and probably on channelitem(cid, ciid, creationdate).
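Sketched as DDL (the first index already exists as KEY ciid in the posted schema; the composite one is new, and its name is arbitrary):
ALTER TABLE channelitem ADD INDEX `cid_ciid_creationdate` (`cid`, `ciid`, `creationdate`);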
The problem could be that you need to create an index on the channelitem table for the creationdate column. Indexes help a database run queries faster. Here is a link about MySQL indexing.
Basically I am monitoring the slowest queries on a website. It turns out they are something like:
INSERT INTO beststat (bestid,period,rawView) VALUES ( 'idX' , 2012 , 1 )
ON DUPLICATE KEY UPDATE rawView = rawView+1
Basically it's a logging table. If the row is already there, it updates rawView with a +1.
beststat is InnoDB, so I have row-level locking, and considering I do a lot of insert-updates it should be faster than MyISAM.
Anyway, that query shouldn't take so long; maybe there is something else wrong. What could it be?
Of course I have a unique index on (bestid, period).
Additional Info
This table (beststat) currently has ~1 million records and its size is 68 MB. I have 4 GB RAM and innodb_buffer_pool_size = 104,857,600. MySQL: 5.1.49-3.
CREATE TABLE `beststat` (
`id` int(11) unsigned NOT NULL AUTO_INCREMENT,
`bestid` int(11) unsigned NOT NULL,
`period` mediumint(8) unsigned NOT NULL,
`view` mediumint(8) unsigned NOT NULL DEFAULT '0',
`rawView` mediumint(8) unsigned NOT NULL DEFAULT '0',
PRIMARY KEY (`id`),
UNIQUE KEY `bestid` (`bestid`,`period`)
) ENGINE=InnoDB AUTO_INCREMENT=2020577 DEFAULT CHARSET=utf8
Note: to speed things up a little, I could do something like:
UPDATE beststat SET rawView = rawView + 1 WHERE bestid = idX AND period = 2012;
if (mysql_affected_rows()==0)
INSERT INTO beststat (bestid,period,rawView) VALUES ('idX',2012,1)
So most of the time I would run only the first query, the UPDATE. But I would like to understand why the first, more concise, query is slow.
I found this interesting article... still reading
When dealing with a big number of rows, I suggest using LOAD DATA INFILE to make the query faster.
To further improve the query time, you can consider using a MEMORY table as well.
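A rough sketch of the bulk-load idea (file path and format are hypothetical; note that REPLACE overwrites colliding rows rather than incrementing them, so this fits rebuilding stats, not the +1 logic):
LOAD DATA INFILE '/tmp/beststat.csv'
REPLACE INTO TABLE beststat
FIELDS TERMINATED BY ','
(bestid, period, rawView);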