How to make this SQL query faster? - mysql

I have the following query:
SELECT DISTINCT `movies_manager_movie`.`id`,
`movies_manager_movie`.`title`,
`movies_manager_movie`.`original_title`,
`movies_manager_movie`.`synopsis`,
`movies_manager_movie`.`keywords`,
`movies_manager_movie`.`release_date`,
`movies_manager_movie`.`rating`,
`movies_manager_movie`.`poster_web_url`,
`movies_manager_movie`.`has_poster`,
`movies_manager_movie`.`number`,
`movies_manager_movie`.`has_sources`,
`movies_manager_movie`.`season_id`,
`movies_manager_movie`.`created`,
`movies_manager_movie`.`updated`,
`movies_manager_moviecache`.`activity_name`
FROM `movies_manager_movie`
LEFT OUTER JOIN `movies_manager_moviecache` ON (`movies_manager_movie`.`id` = `movies_manager_moviecache`.`movie_id`)
WHERE (`movies_manager_movie`.`has_sources` = 1
AND (`movies_manager_moviecache`.`team_member_id` IN (
SELECT U0.`id` FROM `movies_manager_movieteammember` U0
INNER JOIN `movies_manager_movieteammemberactivity` U1 ON (U0.`id` = U1.`team_member_id`)
WHERE U1.`movie_id` = 3588 )
AND `movies_manager_movie`.`number` IS NULL
)
AND NOT (`movies_manager_movie`.`id` = 3588 ))
ORDER BY `movies_manager_moviecache`.`activity_name` DESC LIMIT 3;
This query can take up to 3 seconds and I'm very surprise since I got indexes everywhere and no more than 35 rows in each of my MyIsam tables, using the latest MySQL version.
I cached everything I could but I have at least to run this one 20000 times every day, which is approximately 16 h of waiting for loading. And I'm pretty sure none of my user (nor Google Bot) appreciate a 4 secondes waiting time for each page loading.
What could I do to make it faster ?
I thought about duplicating field from movie to moviecache since the all purpose of movie cache is to denormalize to complex join already.
I tried inlining the subquery to a list of ID but it surprisingly doubled the time of the query.
Tables:
+----------------+--------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+----------------+--------------+------+-----+---------+----------------+
| id | int(11) | NO | PRI | NULL | auto_increment |
| title | varchar(120) | NO | UNI | NULL | |
| original_title | varchar(120) | YES | | NULL | |
| synopsis | longtext | YES | | NULL | |
| keywords | varchar(120) | YES | | NULL | |
| release_date | date | YES | | NULL | |
| rating | int(11) | NO | | NULL | |
| poster_web_url | varchar(255) | YES | | NULL | |
| has_poster | tinyint(1) | NO | | NULL | |
| number | int(11) | YES | | NULL | |
| season_id | int(11) | YES | MUL | NULL | |
| created | datetime | NO | | NULL | |
| updated | datetime | NO | | NULL | |
| has_sources | tinyint(1) | NO | | NULL | |
+----------------+--------------+------+-----+---------+----------------+
+---------------------+--------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+---------------------+--------------+------+-----+---------+----------------+
| id | int(11) | NO | PRI | NULL | auto_increment |
| name | varchar(120) | NO | UNI | NULL | |
| biography | longtext | YES | | NULL | |
| birth_date | date | YES | | NULL | |
| picture_web_url | varchar(255) | YES | | NULL | |
| allocine_link | varchar(255) | YES | | NULL | |
| created | datetime | NO | | NULL | |
| updated | datetime | NO | | NULL | |
| has_picture | tinyint(1) | NO | | NULL | |
| biography_linkyfied | longtext | YES | | NULL | |
+---------------------+--------------+------+-----+---------+----------------+
+----------------+--------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+----------------+--------------+------+-----+---------+----------------+
| id | int(11) | NO | PRI | NULL | auto_increment |
| movie_id | int(11) | NO | MUL | NULL | |
| tag_slug | varchar(100) | YES | MUL | NULL | |
| team_member_id | int(11) | YES | MUL | NULL | |
| cast_rank | int(11) | YES | | NULL | |
| activity_name | varchar(30) | YES | MUL | NULL | |
+----------------+--------------+------+-----+---------+----------------+
Mysql tells me it's a slow query:
# Query_time: 3 Lock_time: 0 Rows_sent: 9 Rows_examined: 454128

Move movies_manager_movieteammemberactivity and movies_manager_movieteammember to your main join statement (so that you're doing a left outer between movies_manager_movie and the inner join product of the other 3 tables). This should speed up your query considerably.

Try this:
SELECT `movies_manager_movie`.`id`,
`movies_manager_movie`.`title`,
`movies_manager_movie`.`original_title`,
`movies_manager_movie`.`synopsis`,
`movies_manager_movie`.`keywords`,
`movies_manager_movie`.`release_date`,
`movies_manager_movie`.`rating`,
`movies_manager_movie`.`poster_web_url`,
`movies_manager_movie`.`has_poster`,
`movies_manager_movie`.`number`,
`movies_manager_movie`.`has_sources`,
`movies_manager_movie`.`season_id`,
`movies_manager_movie`.`created`,
`movies_manager_movie`.`updated`,
(
SELECT `movies_manager_moviecache`.`activity_name`
FROM `movies_manager_moviecache`
WHERE (`movies_manager_movie`.`id` = `movies_manager_moviecache`.`movie_id`
AND (`movies_manager_moviecache`.`team_member_id` IN (
SELECT U0.`id` FROM `movies_manager_movieteammember` U0
INNER JOIN `movies_manager_movieteammemberactivity` U1 ON (U0.`id` = U1.`team_member_id`)
WHERE U1.`movie_id` = 3588 )
AND `movies_manager_movie`.`number` IS NULL
) ) LIMIT 1) AS `activity_name`
FROM `movies_manager_movie`
WHERE (`movies_manager_movie`.`has_sources` = 1
AND NOT (`movies_manager_movie`.`id` = 3588 ))
ORDER BY `activity_name` DESC
LIMIT 3;
Let me know how that performs

Related

MySQL Matching where clause with optional NULL

I have 2 tables - patients, and issuers.
I with to extract entire patients table along with issuer_name from patients and issuers table. Optionally there might be no issuer of patient identifier.
If i do:
select * from patients, issuer_name where patients.issuer_of_patient_identifier=issuer.issuer_id doesn't return anything in case for the corresponding patient table row issuer_of_patient_identifier is NULL.
How do i accomplish this?
mysql> describe patients;
+------------------------------+--------------+------+-----+-------------------+-----------------------------+
| Field | Type | Null | Key | Default | Extra |
+------------------------------+--------------+------+-----+-------------------+-----------------------------+
| patient_id | int(11) | NO | PRI | NULL | auto_increment |
| patient_identifier | varchar(64) | YES | MUL | NULL | |
| issuer_of_patient_identifier | int(11) | YES | MUL | NULL | |
| medical_record_locator | varchar(64) | YES | | NULL | |
| patient_name | varchar(128) | NO | MUL | NULL | |
| birth_date | datetime | YES | | NULL | |
| deceased_date | datetime | YES | | NULL | |
| gender | varchar(16) | YES | | NULL | |
| ethnicity | varchar(45) | YES | | NULL | |
| date_created | datetime | NO | | CURRENT_TIMESTAMP | |
| last_update_date | datetime | YES | | NULL | on update CURRENT_TIMESTAMP |
| last_updated_by | varchar(128) | NO | | NULL | |
+------------------------------+--------------+------+-----+-------------------+-----------------------------+
12 rows in set (0.00 sec)
mysql> describe issuers;
+------------------+--------------+------+-----+-------------------+-----------------------------+
| Field | Type | Null | Key | Default | Extra |
+------------------+--------------+------+-----+-------------------+-----------------------------+
| issuer_id | int(11) | NO | PRI | NULL | auto_increment |
| issuer_name | varchar(64) | NO | MUL | NULL | |
| issuer_uid | varchar(128) | YES | | NULL | |
| issuer_uid_type | varchar(64) | YES | | NULL | |
| date_created | datetime | NO | | CURRENT_TIMESTAMP | |
| last_update_date | datetime | YES | | NULL | on update CURRENT_TIMESTAMP |
| last_updated_by | varchar(128) | NO | | NULL | |
+------------------+--------------+------+-----+-------------------+-----------------------------+
The query is
select * from patients
LEFT JOIN issuer_name ON
patients.issuer_of_patient_identifier = issuer.issuer_id
Without joins:
select * from patients, issuer_name
where patients.issuer_of_patient_identifier = issuer.issuer_id
OR patients.issuer_of_patient_identifier IS NULL
Never use commas in the FROM clause. Always use proper, explicit JOIN syntax. Avoid tutorials that use commas. Challenge instructors that show examples with them.
You want a LEFT JOIN:
select *
from patients p left join
issuer_name i
on p.issuer_of_patient_identifier = i.issuer_id;

MySQL select last row of each one user from a list of users in a table

Im trying to select from a Moodle table the last row of each user in a list.
my query is
SELECT *
FROM mdl_logstore_standard_log
WHERE eventname='\\core\\event\\user_enrolment_created'
AND courseid=34
AND relateduserid IN(120,128)
GROUP BY relateduserid;`
and the table that i use is :
MariaDB [****_*****]> describe mdl_logstore_standard_log;
+-------------------+--------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-------------------+--------------+------+-----+---------+----------------+
| id | bigint(10) | NO | PRI | NULL | auto_increment |
| eventname | varchar(255) | NO | | | |
| component | varchar(100) | NO | | | |
| action | varchar(100) | NO | | | |
| target | varchar(100) | NO | | | |
| objecttable | varchar(50) | YES | | NULL | |
| objectid | bigint(10) | YES | | NULL | |
| crud | varchar(1) | NO | | | |
| edulevel | tinyint(1) | NO | | NULL | |
| contextid | bigint(10) | NO | MUL | NULL | |
| contextlevel | bigint(10) | NO | | NULL | |
| contextinstanceid | bigint(10) | NO | | NULL | |
| userid | bigint(10) | NO | MUL | NULL | |
| courseid | bigint(10) | YES | MUL | NULL | |
| relateduserid | bigint(10) | YES | | NULL | |
| anonymous | tinyint(1) | NO | | 0 | |
| other | longtext | YES | | NULL | |
| timecreated | bigint(10) | NO | MUL | NULL | |
| origin | varchar(10) | YES | | NULL | |
| ip | varchar(45) | YES | | NULL | |
| realuserid | bigint(10) | YES | | NULL | |
+-------------------+--------------+------+-----+---------+----------------+
My Problem with this query is that it give me the first row for every userid in list, and i want the last. i tried order by id desc but nothing changed.
You can try this:
SELECT
L.*
FROM mdl_logstore_standard_log L
INNER JOIN
(
SELECT
relateduserid,
MAX(id) AS max_id
FROM mdl_logstore_standard_log
WHERE eventname='\\core\\event\\user_enrolment_created'
AND courseid=34
AND relateduserid IN(120,128)
GROUP BY relateduserid
)AS t
ON L.id = t.max_id
First getting the maximum auto increment id for those relateduserids then making an inner join between mdl_logstore_standard_log and t table would return your expected result.
Try this but i didnt test it
select * from mdl_logstore_standard_log where eventname='\\core\\event\\user_enrolment_created' and courseid=34 and relateduserid IN(120,128) GROUP BY relateduserid ORDER BY id DESC LIMIT 1;

MySQL Unknown column in where clause?

I have two databases.
One is called INFO with three tables (Stories, Comments, Replies)
Stories has the following fields
+--------------+----------------+------+-----+---------------------+-----------------------------+
| Field | Type | Null | Key | Default | Extra |
+--------------+----------------+------+-----+---------------------+-----------------------------+
| storyID | int(11) | NO | PRI | NULL | |
| originalURL | varchar(500) | YES | | NULL | |
| originalDate | timestamp | NO | | CURRENT_TIMESTAMP | on update CURRENT_TIMESTAMP |
| numDiggs | int(11) | YES | | NULL | |
| numComments | int(11) | YES | | NULL | |
| diggURL | varchar(500) | YES | | NULL | |
| rating | varchar(50) | YES | | NULL | |
| title | varchar(200) | YES | | NULL | |
| summary | varchar(10000) | YES | | NULL | |
| uploaderID | varchar(50) | YES | | NULL | |
| imageURL | varchar(500) | YES | | NULL | |
| category1 | varchar(50) | YES | | NULL | |
| category2 | varchar(50) | YES | | NULL | |
| uploadDate | timestamp | NO | | 0000-00-00 00:00:00 | |
| num | int(11) | YES | | NULL | |
+--------------+----------------+------+-----+---------------------+-----------------------------+
Another database is called Data with one table (User). Fields shown below:
+-------------------+-------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+-------------------+-------------+------+-----+---------+-------+
| userID | varchar(50) | NO | PRI | NULL | |
| numStories | int(11) | YES | | NULL | |
| numComments | int(11) | YES | | NULL | |
| numReplies | int(11) | YES | | NULL | |
| numStoryDiggs | int(11) | YES | | NULL | |
| numCommentReplies | int(11) | YES | | NULL | |
| numReplyDiggs | int(11) | YES | | NULL | |
| numStoryComments | int(11) | YES | | NULL | |
| numStoryReplies | int(11) | YES | | NULL | |
+-------------------+-------------+------+-----+---------+-------+
User.userID is full of thousands of unique names. All other fields are currently NULL. The names in User.userID correspond to the names in Stories.uploaderID.
I need to, for each userID in User, count the number of stories uploaded from (i.e. num) Stories for the corresponding name and insert this value into User.numStories.
The query which I have come up with (which produces an error) is:
INSERT INTO DATA.User(numStories)
SELECT count(num)
FROM INFO.Stories
WHERE INFO.Stories.uploaderID=DATA.User.userID;
The error I get when running this query is
Unknown column 'DATA.User.userID' in 'where clause'
Sorry if this is badly explained. I will try and re-explain if need be.
You aren't creating new entries in the User table, you're updating existing ones. Hence, insert isn't the right syntax here, but rather update:
UPDATE DATA.User u
JOIN (SELECT uploaderID, SUM(num) AS sumNum
FROM INFO.Stories
GROUP BY uploadedID) i ON i.uploaderID = u.userID
SET numStories = sumNum
EDIT:
Some clarification, as requested in the comments.
The inner query sums the num in Stories per uploaderId. The updates statement updates the numStories in User the the calculated sum of the inner query of the matching id.

MySQL : why is left join slower then inner join? Optimization help required

I have a MySQL query which joins between the two table. I need to map call id from first table with second table. Second table may not have the call id, hence I need to left join the tables. Below is the query, it takes around 125 seconds to finish.
select uniqueid, TRANTAB.DISP, TRANTAB.DIAL FROM
closer_log LEFT JOIN
(select call_uniqueId, sum(dispo_duration) as DISP, sum(dialing_duration) as DIAL
from agent_transition_log group by call_uniqueId) TRANTAB
on closer_log.uniqueid=TRANTAB.call_uniqueId;
Here is the explain output of the query with left join.
+----+-------------+----------------------+-------+---------------+----------------------------+---------+------+--------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+----------------------+-------+---------------+----------------------------+---------+------+--------+-------------+
| 1 | PRIMARY | closer_log | index | NULL | uniqueid | 43 | NULL | 37409 | Using index |
| 1 | PRIMARY | <derived2> | ALL | NULL | NULL | NULL | NULL | 32535 | |
| 2 | DERIVED | agent_transition_log | index | NULL | index_agent_transition_log | 43 | NULL | 159406 | |
+----+-------------+----------------------+-------+---------------+----------------------------+---------+------+--------+-------------+
If I do the internal join, then execution time is around 2 seconds.
select uniqueid, TRANTAB.DISP, TRANTAB.DIAL FROM
closer_log JOIN
(select call_uniqueId, sum(dispo_duration) as DISP, sum(dialing_duration) as DIAL
from agent_transition_log group by call_uniqueId) TRANTAB
on closer_log.uniqueid=TRANTAB.call_uniqueId;
Explain output of query with internal join.
+----+-------------+----------------------+-------+------------------------------------+----------------------------+---------+-----------------------+--------+--------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+----------------------+-------+------------------------------------+----------------------------+---------+-----------------------+--------+--------------------------+
| 1 | PRIMARY | <derived2> | ALL | NULL | NULL | NULL | NULL | 32535 | |
| 1 | PRIMARY | closer_log | ref | uniqueid,index_closer_log | index_closer_log | 43 | TRANTAB.call_uniqueId | 1 | Using where; Using index |
| 2 | DERIVED | agent_transition_log | index | NULL | index_agent_transition_log | 43 | NULL | 159406 | |
+----+-------------+----------------------+-------+------------------------------------+----------------------------+---------+-----------------------+--------+--------------------------+
My question is, why is internal join so much faster then left join. Does my query has any logical fault which is causing the slow execution? What are my optimization options. The call ids in both the tables are indexed.
Edit 1) Added table descriptions
mysql> desc agent_transition_log;
+--------------------+----------------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+--------------------+----------------------+------+-----+---------+-------+
| user_log_id | int(9) unsigned | NO | MUL | NULL | |
| event_time | datetime | YES | | NULL | |
| dispoStatus | varchar(6) | YES | | NULL | |
| call_uniqueId | varchar(40) | YES | MUL | NULL | |
| xfer_call_uid | varchar(40) | YES | | NULL | |
| pause_duration | smallint(5) unsigned | YES | | 0 | |
| wait_duration | smallint(5) unsigned | YES | | 0 | |
| dialing_duration | smallint(5) unsigned | YES | | 0 | |
| ring_wait_duration | smallint(5) unsigned | YES | | 0 | |
| talk_duration | smallint(5) unsigned | YES | | 0 | |
| dispo_duration | smallint(5) unsigned | YES | | 0 | |
| park_duration | smallint(5) unsigned | YES | | 0 | |
| rec_duration | smallint(5) unsigned | YES | | 0 | |
| xfer_wait_duration | smallint(5) unsigned | YES | | 0 | |
| logged_in_duration | smallint(5) unsigned | YES | | 0 | |
| sub_status | varchar(6) | YES | | NULL | |
+--------------------+----------------------+------+-----+---------+-------+
16 rows in set (0.00 sec)
mysql> desc closer_log;
+----------------+----------------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+----------------+----------------------+------+-----+---------+----------------+
| closecallid | int(9) unsigned | NO | PRI | NULL | auto_increment |
| lead_id | int(9) unsigned | NO | MUL | NULL | |
| list_id | bigint(14) unsigned | YES | | NULL | |
| campaign_id | varchar(20) | YES | MUL | NULL | |
| call_date | datetime | YES | MUL | NULL | |
| start_epoch | int(10) unsigned | YES | | NULL | |
| end_epoch | int(10) unsigned | YES | | NULL | |
| length_in_sec | int(10) | YES | | NULL | |
| status | varchar(6) | YES | | NULL | |
| phone_code | varchar(10) | YES | | NULL | |
| phone_number | varchar(18) | YES | MUL | NULL | |
| user | varchar(20) | YES | | NULL | |
| comments | varchar(255) | YES | | NULL | |
| processed | enum('Y','N') | YES | | NULL | |
| queue_seconds | decimal(7,2) | YES | | 0.00 | |
| user_group | varchar(20) | YES | | NULL | |
| xfercallid | int(9) unsigned | YES | | NULL | |
| uniqueid | varchar(40) | YES | MUL | NULL | |
| callerid | varchar(40) | YES | | NULL | |
| agent_only | varchar(20) | YES | | | |
| queue_position | smallint(4) unsigned | YES | | 1 | |
| root_uid | varchar(40) | YES | | NULL | |
| parent_uid | varchar(40) | YES | | NULL | |
| extension | varchar(100) | YES | | NULL | |
| alt_dial | varchar(6) | YES | | NULL | |
| talk_duration | smallint(5) unsigned | YES | | 0 | |
| did_pattern | varchar(50) | YES | | NULL | |
+----------------+----------------------+------+-----+---------+----------------+
Left join looks for the fields from left + unmatched entries from right, so it has to check every joined field in the right table which might be NULL (if you don't have an index on the fields for that JOIN, it means the query will check the whole right table every time). Inner join looks only for direct matches, so it might not have to go over the whole table to perform a join (Especially if you join on indexed fields).
By the way, if you only want to display the entries mentioned in agent_transition_log, you don't need join at all:
select call_uniqueId, sum(dispo_duration) as DISP, sum(dialing_duration) as DIAL
from agent_transition_log group by call_uniqueId;
will do the job.
OR if you do want to add the missing entries:
SELECT call_uniqueId, sum(dispo_duration) as DISP, sum(dialing_duration) as DIAL
from agent_transition_log group by call_uniqueId
UNION
SELECT uniqueid as call_uniqueid, NULL as DISP, NULL as DIAL from closer_log
WHERE uniqueid not in (SELECT call_uniqueid FROM agent_transition_log);

MySQL - Grouping and counting

My database structure contains:
actions
+--------------+--------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+--------------+--------------+------+-----+---------+----------------+
| id | int(11) | NO | PRI | NULL | auto_increment |
| placement_id | int(11) | YES | MUL | NULL | |
| lead_id | int(11) | NO | MUL | NULL | |
+--------------+--------------+------+-----+---------+----------------+
placements
+--------------+--------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+--------------+--------------+------+-----+---------+----------------+
| id | int(11) | NO | PRI | NULL | auto_increment |
| publisher_id | int(11) | YES | MUL | NULL | |
| name | varchar(255) | NO | | NULL | |
| status | tinyint(1) | NO | | NULL | |
+--------------+--------------+------+-----+---------+----------------+
leads
+---------+--------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+---------+--------------+------+-----+---------+----------------+
| id | int(11) | NO | PRI | NULL | auto_increment |
| status | varchar(255) | YES | | NULL | |
+---------+--------------+------+-----+---------+----------------+
I would like to retrieve a number of statuses per each placement (groupped):
| placement_id | placement_name | count
+--------------+----------------+-------
| 123 | PlacementOne | 12
| 567 | PlacementTwo | 15
I tried countless times and every query I wrote seems dumb, so I won't even post them here. I'm hopeless. An well commented query (so I can learn) would be much appreciated.
select a.placement_id, p.name as placement_name, count(l.status)
from actions a
inner join placements p on p.id = a.placement_id
inner join leads l on l.id = a.lead_id
group by a.placement_id, p.name