Optimize a SELECT query (creating an index if necessary) - MySQL

I have 2 tables:
chats (id, ..., chat_status_id) // (about 28k records)
chat_messages(id, chat_id, send_date, ...) // (about 1 million records)
I need to get the chats of a certain status together with each chat's latest message.
This is the SELECT I am using. It works in the end, but it's pretty slow:
SELECT c.*,
p1.*
FROM chats c
JOIN chat_messages p1
ON ( c.id = p1.chat_id )
LEFT OUTER JOIN chat_messages p2
ON ( c.id = p2.chat_id
AND ( p1.send_date < p2.send_date
OR ( p1.send_date = p2.send_date
AND p1.id < p2.id ) ) )
WHERE p2.id IS NULL
AND c.chat_status_id = 1
ORDER BY p1.send_date DESC
I do not know how to optimize it.

I would start with a few index updates. First, your WHERE clause filters on the chats table's status, so chat_status_id should come first in the index, with id AFTER it to support the joins. Next, your messages table: its rows are JOINed on chat_id rather than on the table's own unique id, so chat_id should lead that index, while id is carried along for the p1.id < p2.id tie-breaker test. Suggestions as follows.
Table          Index
chats          (chat_status_id, id)
chat_messages  (chat_id, id, send_date)
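As a rough sketch, the two suggested indexes could be created as shown below. The index names are placeholders invented for this example and are not part of the original schema. The EXPLAIN at the end is only there to check whether the optimizer picks the new indexes.

```sql
-- Index names (idx_...) are hypothetical; pick whatever fits your naming scheme.
ALTER TABLE chats
    ADD INDEX idx_chats_status_id (chat_status_id, id);

ALTER TABLE chat_messages
    ADD INDEX idx_messages_chat_id_date (chat_id, id, send_date);

-- Inspect the plan: the "key" column should show the new indexes if they are used.
EXPLAIN
SELECT c.id, p1.id, p1.send_date
FROM chats c
JOIN chat_messages p1 ON c.id = p1.chat_id
WHERE c.chat_status_id = 1;
```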

Give this a try:
SELECT c.*, p1.*
FROM chats c
JOIN chat_messages p1 ON ( c.id = p1.chat_id )
WHERE NOT EXISTS
(
SELECT 1
FROM chat_messages p2
WHERE c.id = p2.chat_id
AND ( p1.send_date < p2.send_date
OR ( p1.send_date = p2.send_date
AND p1.id < p2.id ) )
)
AND c.chat_status_id = 1
ORDER BY p1.send_date DESC
With these indexes:
chats: INDEX(chat_status_id, id)
chat_messages: INDEX(chat_id, send_date, id)
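If the server is MySQL 8.0 or later, the same "latest message per chat" result can also be sketched with a window function instead of the self-join or NOT EXISTS. This is a different technique from the one in the answer above, shown only for comparison. The aliases chat_id, message_id, rn and ranked are made up for the example, and only the columns named in the question are selected.

```sql
-- Requires MySQL 8.0+ (window functions). Sketch only, not benchmarked.
SELECT ranked.*
FROM (
    SELECT c.id AS chat_id,
           c.chat_status_id,
           m.id AS message_id,
           m.send_date,
           -- rn = 1 marks the newest message of each chat;
           -- id breaks ties between equal send_date values
           ROW_NUMBER() OVER (
               PARTITION BY m.chat_id
               ORDER BY m.send_date DESC, m.id DESC
           ) AS rn
    FROM chats c
    JOIN chat_messages m ON m.chat_id = c.id
    WHERE c.chat_status_id = 1
) AS ranked
WHERE ranked.rn = 1
ORDER BY ranked.send_date DESC;
```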

Related

Limit results to 1 row for each group in subquery

As a result of executing this query, two rows may have the same minimum price, but I still need to select just one of them. I understand perfectly well that the standard LIMIT cannot be dispensed with here. I do not know enough to see how to approach this problem. Thank you in advance for your attention.
UPDATE offers t1
SET t1.deleted_at = NOW()
WHERE t1.id
NOT IN
(
SELECT f.id
FROM (
SELECT name, MIN(net_price) as minprice
FROM offers
WHERE
supplier_id = (SELECT id FROM suppliers WHERE name = 'somename')
group BY name
)
AS x inner join (SELECT * FROM offers) AS f ON f.name = x.name AND f.net_price = x.minprice
)
AND
t1.supplier_id = (SELECT id FROM suppliers WHERE name = 'somename');
I'm not sure exactly why, but this works for me:
UPDATE offers t1
SET t1.deleted_at = NOW()
WHERE t1.id
NOT IN
(
SELECT f.id
FROM (
SELECT id, name, MIN(net_price) as minprice
FROM offers
WHERE
supplier_id = (SELECT id FROM suppliers WHERE name = 'somename')
group BY name
)
AS x inner join (SELECT * FROM offers) AS f ON f.id = x.id
)
AND
t1.supplier_id = (SELECT id FROM suppliers WHERE name = 'somename');
Just add id to the subquery's SELECT list and use it in the inner join: f.id = x.id.
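One caveat: id is not part of the GROUP BY, so MySQL may return the id of any row in the group, not necessarily the cheapest one, and the statement is rejected outright when ONLY_FULL_GROUP_BY is enabled. A more deterministic sketch keeps, for each name, the lowest id among the rows that share the minimum price. Choosing MIN(id) as the tie-breaker is an assumption made for this example, and keep_id and keepers are made-up aliases.

```sql
UPDATE offers t1
SET t1.deleted_at = NOW()
WHERE t1.supplier_id = (SELECT id FROM suppliers WHERE name = 'somename')
  AND t1.id NOT IN (
      -- The extra derived table is materialized first, which lets the
      -- subquery read from the same table that is being updated.
      SELECT keep_id
      FROM (
          SELECT MIN(f.id) AS keep_id          -- one row per name, ties broken by lowest id
          FROM offers f
          JOIN (
              SELECT name, MIN(net_price) AS minprice
              FROM offers
              WHERE supplier_id = (SELECT id FROM suppliers WHERE name = 'somename')
              GROUP BY name
          ) AS x ON f.name = x.name AND f.net_price = x.minprice
          WHERE f.supplier_id = (SELECT id FROM suppliers WHERE name = 'somename')
          GROUP BY f.name
      ) AS keepers
  );
```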

SQL query to check if value doesn't exist in another table

I have a SQL query which does most of what I need it to do but I'm running into a problem.
There are 3 tables in total. entries, entry_meta and votes.
I need to get an entire row from entries when competition_id = 420 in the entry_meta table and the ID either doesn't exist in votes or it does exist but the user_id column value isn't 1.
Here's the query I'm using:
SELECT entries.* FROM entries
INNER JOIN entry_meta ON (entries.ID = entry_meta.entry_id)
WHERE 1=1
AND ( ( entry_meta.meta_key = 'competition_id' AND CAST(entry_meta.meta_value AS CHAR) = '420') )
GROUP BY entries.ID
ORDER BY entries.submission_date DESC
LIMIT 0, 25;
The votes table has 4 columns. vote_id, entry_id, user_id, value.
One option I was thinking of was to SELECT entry_id FROM votes WHERE user_id = 1 and include it in an AND clause in my query. Is this acceptable/efficient?
E.g.
AND entries.ID NOT IN (SELECT entry_id FROM votes WHERE user_id = 1)
A left join with an appropriate where clause might be useful:
SELECT
entries.*
FROM
entries
INNER JOIN entry_meta ON (entries.ID = entry_meta.entry_id)
LEFT JOIN votes ON entries.ID = votes.entry_id
WHERE 1=1
AND (
entry_meta.meta_key = 'competition_id'
AND CAST(entry_meta.meta_value AS CHAR) = '420'
AND votes.entry_id IS NULL -- This will remove any entry with votes
)
GROUP BY entries.ID
ORDER BY entries.submission_date DESC
Here's an implementation of Andrew's suggestion to use exists / not exists.
select
e.*
from
entries e
join entry_meta em on e.ID = em.entry_id
where
em.meta_key = 'competition_id'
and cast(em.meta_value as char) = '420'
and (
not exists (
select 1
from votes v
where
v.entry_id = e.ID
)
or exists (
select 1
from votes v
where
v.entry_id = e.ID
and v.user_id != 1
)
)
group by e.ID
order by e.submission_date desc
limit 0, 25;
Note: it's generally not a good idea to wrap a column in a function inside a WHERE clause (for performance reasons, since it can prevent index use), but since you're also joining on IDs you should be OK.
Also, the left join suggestion by Barranka may cause the query to return more rows than you are expecting (assuming that there is a 1:many relationship between entries and votes).
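If the actual intent is "entries that user 1 has not voted on yet" (an assumption about the question, since the wording allows more than one reading), a single NOT EXISTS is enough. The sketch below differs from the query above in one respect: an entry that has votes from both user 1 and other users is excluded here, whereas the OR version would keep it.

```sql
SELECT e.*
FROM entries e
JOIN entry_meta em ON e.ID = em.entry_id
WHERE em.meta_key = 'competition_id'
  AND CAST(em.meta_value AS CHAR) = '420'
  -- keep the entry only when user 1 has no vote row for it
  AND NOT EXISTS (
      SELECT 1
      FROM votes v
      WHERE v.entry_id = e.ID
        AND v.user_id = 1
  )
GROUP BY e.ID
ORDER BY e.submission_date DESC
LIMIT 0, 25;
```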

My Odd SubSelect, Need a LEFT JOIN Improvement

Here is a sample SQL dump: https://gist.github.com/JREAM/99287d033320b2978728
I have a SELECT that grabs a bundle of users.
I then do a foreach loop to attach all the associated tree_processes to that user.
So I end up doing X Queries: users * tree.
Wouldn't it be much more efficient to fetch the two together?
I've thought about doing a LEFT JOIN Subselect, but I'm having a hard time getting it correct.
Below I've done a query to select the correct data in the SELECT, however I would have to do this for all 15 rows and it seems like a TERRIBLE waste of memory.
This is my dirty attempt:
SELECT
s.id,
s.firstname,
s.lastname,
s.email,
(
SELECT tp.id FROM tree_processes AS tp
JOIN tree AS t ON (
t.id = tp.tree_id
)
WHERE subscribers_id = s.id
ORDER BY tp.id DESC
LIMIT 1
) AS newest_tree_id,
#
# Don't want to have to do this below for every row
(
SELECT t.type FROM tree_processes AS tp
JOIN tree AS t ON (
t.id = tp.tree_id
)
WHERE subscribers_id = s.id
ORDER BY tp.id DESC
LIMIT 1
) AS tree_type
FROM subscribers AS s
INNER JOIN scenario_subscriptions AS ss ON (
ss.subscribers_id = s.id
)
WHERE ss.scenarios_id = 1
AND ss.completed != 1
AND ss.purchased_exit != 1
AND deleted != 1
GROUP BY s.id
LIMIT 0, 100
This is my LEFT JOIN attempt, but I am having trouble getting the SELECT values
SELECT
s.id,
s.firstname,
s.lastname,
s.email,
freshness.id,
# freshness.subscribers_id < -- Cant get multiples out of the LEFT join
FROM subscribers AS s
INNER JOIN scenario_subscriptions AS ss ON (
ss.subscribers_id = s.id
)
LEFT JOIN ( SELECT tp.id, tp.subscribers_id AS tp FROM tree_processes AS tp
JOIN tree AS t ON (
t.id = tp.tree_id
)
ORDER BY tp.id DESC
LIMIT 1 ) AS freshness
ON (
s.id = subscribers_id
)
WHERE ss.scenarios_id = 1
AND ss.completed != 1
AND ss.purchased_exit != 1
AND deleted != 1
GROUP BY s.id
LIMIT 0, 100
In the LEFT JOIN you are using 'freshness' as the table alias. In your SELECT you additionally need to state which column(s) you want from it. Since there is only one column (id), you need to add:
freshness.id
to the select clause.
Your ON clause of the left join looks pretty dodgy too. Maybe freshness.id = ss.subscribers_id?
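For what it's worth, here is one way the whole thing could be fetched in a single query. It is only a sketch and rests on a few assumptions that the question does not confirm: subscribers_id is a column of tree_processes, every tree_processes row has a matching tree row, and deleted exists in only one of the joined tables. The alias newest_id is made up for the example.

```sql
-- DISTINCT replaces GROUP BY s.id: every selected column is determined by
-- the subscriber, so duplicates caused by several subscriptions collapse.
SELECT DISTINCT
       s.id,
       s.firstname,
       s.lastname,
       s.email,
       tp.id  AS newest_tree_id,
       t.type AS tree_type
FROM subscribers AS s
INNER JOIN scenario_subscriptions AS ss ON ss.subscribers_id = s.id
-- one row per subscriber holding the id of the newest tree process
LEFT JOIN (
    SELECT subscribers_id, MAX(id) AS newest_id
    FROM tree_processes
    GROUP BY subscribers_id
) AS freshness ON freshness.subscribers_id = s.id
LEFT JOIN tree_processes AS tp ON tp.id = freshness.newest_id
LEFT JOIN tree AS t ON t.id = tp.tree_id
WHERE ss.scenarios_id = 1
  AND ss.completed != 1
  AND ss.purchased_exit != 1
  AND deleted != 1   -- unqualified as in the question; qualify it if it becomes ambiguous
LIMIT 0, 100;
```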
Cheers -

MySQL show the row with the latest Date for each different value in other column?

I'm working with a MySQL query that is supposed to select all messages addressed to or sent by the user. I need to group all messages with the same UID so that I show a single thread for each different user (this means it should eliminate all messages except the last one with the same UID). My problem is that I started using GROUP BY to do it, but sometimes the row that remains is actually an older message instead of the latest.
This is what I was trying:
SELECT `UID`, `Name`, `Text`, `A`.`Date`
FROM `Users`
INNER JOIN (
(
SELECT *, To_UID AS UID FROM `Messages` WHERE `From_UID` = '$userID' AND `To_UID` != '$userID'
)
UNION ALL
(
SELECT *, From_UID AS UID FROM `Messages` WHERE `To_UID` = '$userID' AND `From_UID` != '$userID'
)
) AS A
ON A.UID = Users.ID
GROUP BY UID -- This doesn't work
How can I show only the row with the most recent date per UID?
Use DISTINCT and only use ORDER BY date.
GROUP BY can return an arbitrary row from each group for the non-aggregated columns, which is not often discussed.
You can try something like this:
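select u.ID as UID, u.Name, e.Text, e.date
from Users as u
inner join (
-- each message together with the "other" user of the conversation
select if(b.From_UID = '$userID', b.To_UID, b.From_UID) as UID,
b.Text, b.date
from Messages as b
inner join(
-- latest message date per conversation partner
select if(c.From_UID = '$userID', c.To_UID, c.From_UID) as UID,
max(c.date) as date
from Messages as c
where c.From_UID = '$userID' or c.To_UID = '$userID'
group by UID
) as d on d.date = b.date
and d.UID = if(b.From_UID = '$userID', b.To_UID, b.From_UID)
where b.From_UID = '$userID' or b.To_UID = '$userID'
) as e on e.UID = u.ID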
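Or create a temporary table / stored procedure to make life easier.
Temp table:
create temporary table t as
select if(From_UID = '$userID', To_UID, From_UID) as UID, Messages.*
from Messages
where From_UID = '$userID' or To_UID = '$userID';
-- MySQL cannot refer to the same temporary table twice in one query,
-- so the latest date per UID goes into a second temporary table
create temporary table latest as
select UID, max(date) as date
from t
group by UID;
select u.ID as UID, u.Name, t.Text, t.date
from Users as u
inner join t on t.UID = u.ID
inner join latest on latest.UID = t.UID and latest.date = t.date;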

MySql performance problems

This query takes a whopping 45 seconds to execute. I have indexes on all fields that are being searched.
SELECT SQL_CALC_FOUND_ROWS g.app_group_id, g.id as g_id, p.`id` as form_id,
a.`user` as activity_user,a.`activity` as app_act ,a.*
FROM grouped g
INNER JOIN
(SELECT max(id) as id, app_group_id FROM grouped GROUP BY app_group_id) g1
ON g1.app_group_id = g.app_group_id AND g.id = g1.id
INNER JOIN form p
on p.id = g.id
INNER JOIN
(SELECT a.id, a.date_time, a.user, a.activity FROM log a) a
ON g.id = a.id
WHERE p.agname like '%blahblah%' and p.`save4later` != 'y'
and a.activity = 'APP Submitted' or a.activity = 'InstaQUOTE'
ORDER BY app_group_id DESC limit 0, 100
My EXPLAIN output shows Using temporary; Using filesort.
Indexes are:
activity table: PRIMARY activity_id INDEX date_time INDEX id INDEX activity INDEX user
form table: PRIMARY id INDEX id_md5 INDEX dateadd INDEX dateu INDEX agent_or_underwriter INDEX
grouped table: UNIQUE id INDEX app_group_id INDEX agent_or_underwriter save4later
Any advice is much appreciated
Thank you very much
To start with, try this one:
SELECT
g.app_group_id,
g.id AS g_id,
p.id AS form_id,
a.user AS activity_user,
a.activity AS app_act,
a.id,
a.date_time
FROM grouped g
INNER JOIN
(SELECT MAX(id) AS id, app_group_id FROM grouped GROUP BY app_group_id) g1
ON g1.app_group_id = g.app_group_id AND g.id = g1.id
INNER JOIN form p
ON p.id = g.id
INNER JOIN log a
ON g.id = a.id
WHERE p.agname LIKE '%blahblah%'
AND p.save4later != 'y'
AND a.activity IN('APP Submitted', 'InstaQUOTE')
LIMIT 0, 100
I removed an unnecessary subquery and also removed the ORDER BY. I guess you could do without sorting, and dropping it should speed the query up a lot.
I also removed SQL_CALC_FOUND_ROWS, because, as I mentioned earlier, it should be faster to issue a separate COUNT(*) query.
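A sketch of what that separate count could look like: it reuses the joins and filters of the query above without LIMIT, so it returns the total number of matching rows. The alias total_rows is made up for the example.

```sql
SELECT COUNT(*) AS total_rows
FROM grouped g
INNER JOIN
    (SELECT MAX(id) AS id, app_group_id FROM grouped GROUP BY app_group_id) g1
    ON g1.app_group_id = g.app_group_id AND g.id = g1.id
INNER JOIN form p
    ON p.id = g.id
INNER JOIN log a
    ON g.id = a.id
WHERE p.agname LIKE '%blahblah%'
  AND p.save4later != 'y'
  AND a.activity IN ('APP Submitted', 'InstaQUOTE');
```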
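Your WHERE clause has an OR on a.activity. Without ( ) around both activity tests, AND is applied before OR, so every row with activity 'InstaQUOTE' qualifies regardless of the other conditions, and the query goes through EVERYTHING across the P, G, G1 aliases. I am guessing that might be your bigger issue.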
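The precedence rule can be checked with constants alone. In the small illustration below, the column aliases as_written and with_parentheses are made up for the example.

```sql
-- AND binds tighter than OR, so "x AND y OR z" is read as "(x AND y) OR z".
SELECT (0 AND 0 OR 1)   AS as_written,        -- (0 AND 0) OR 1 evaluates to 1
       (0 AND (0 OR 1)) AS with_parentheses;  -- 0 AND 1 evaluates to 0

-- Applied to the original filter, the intended grouping would be:
--   WHERE p.agname LIKE '%blahblah%'
--     AND p.save4later != 'y'
--     AND (a.activity = 'APP Submitted' OR a.activity = 'InstaQUOTE')
```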
Additionally, I would make sure the form table has an index on the save4later column.
I would update the query like this:
SELECT STRAIGHT_JOIN SQL_CALC_FOUND_ROWS
g.app_group_id,
g.id as g_id,
QualifyPages.Form_id,
a.`user` as activity_user,
a.`activity` as app_act,
a.*
FROM
( select p.ID as Form_ID
from form p
WHERE p.`save4later` != 'y'
AND p.agname like '%blahblah%' ) QualifyPages
JOIN grouped g
on QualifyPages.Form_ID = g.ID
INNER JOIN
( SELECT app_group_id,
max(id) as MaxIDPerGroup
FROM
grouped
GROUP BY
app_group_id ) g1
ON g.app_group_id = g1.app_group_id
AND g.id = g1.MaxIDPerGroup
INNER JOIN
( SELECT a.id, a.date_time, a.user, a.activity
FROM log a
WHERE a.activity = 'APP Submitted'
or a.activity = 'InstaQUOTE' ) a
ON g.id = a.id
ORDER BY
g.app_group_id DESC
limit
0, 100