How can I count the total selected rows before applying LIMIT? - mysql

I have a fairly complex query like this:
SELECT qa.id,
       qa.subject,
       qa.category cat,
       qa.keywords tags,
       qa.body_html,
       qa.amount,
       qa.visibility,
       qa.date_time,
       COALESCE(u.reputation, 'N') reputation,
       COALESCE(CONCAT(u.user_fname, ' ', u.user_lname), 'unknown') NAME,
       COALESCE(u.avatar, 'anonymous.png') avatar,
       (SELECT COALESCE(SUM(vv.value), 0)
        FROM votes vv
        WHERE qa.id = vv.post_id
          AND 15 = vv.table_code) AS total_votes,
       (SELECT COALESCE(SUM(vt.total_viewed), 0)
        FROM viewed_total vt
        WHERE qa.id = vt.post_id
          AND 15 = vt.table_code
        LIMIT 1) AS total_viewed
FROM qanda qa
LEFT JOIN users u
       ON qa.author_id = u.id
      AND qa.visibility = 1
WHERE qa.type = 0 $query_where
ORDER BY $query_order
LIMIT :j, 11;
Note that the $query_where variable contains some other conditions that are created dynamically. Anyway, as you can see, it returns at most 10 posts.
Currently, to count total matched rows, I use another query like this:
SELECT COUNT(amount) paid_qs,
COUNT(*) all_qs
FROM qanda qa
WHERE type = 0 $query_where
I suspect this wastes processing: running two separate queries (each with complex conditions in the WHERE clause) seems like too much.
Is there a way to use one query instead of two?

After running the query, you can retrieve the number of rows it would have returned without the LIMIT by calling the FOUND_ROWS() function.
Reference: MySQL Reference Manual
For this to work, you have to add the SQL_CALC_FOUND_ROWS modifier to your query (SELECT SQL_CALC_FOUND_ROWS ...).
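As a hedged aside: SQL_CALC_FOUND_ROWS and FOUND_ROWS() are deprecated as of MySQL 8.0.17, so on MySQL 8.0 or later a window function is one possible alternative. This is a swapped-in technique, not the method described above. COUNT(*) OVER () is evaluated before LIMIT is applied, so every returned row carries the total. The sketch below uses an in-memory SQLite database (3.25+) purely as a stand-in for MySQL; the sample rows are made up and only a few qanda columns are modelled.

```python
import sqlite3

# In-memory SQLite database used only as a stand-in for MySQL 8.0+.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE qanda (id INTEGER PRIMARY KEY, subject TEXT, amount INTEGER, type INTEGER)"
)

# Hypothetical sample data: 25 posts, every fifth one has a non-NULL amount.
rows = [(i, f"post {i}", 100 if i % 5 == 0 else None, 0) for i in range(1, 26)]
conn.executemany("INSERT INTO qanda VALUES (?, ?, ?, ?)", rows)

# The window aggregates see all rows matching WHERE; LIMIT is applied afterwards.
page = conn.execute(
    """
    SELECT id,
           subject,
           COUNT(*)      OVER () AS all_qs,
           COUNT(amount) OVER () AS paid_qs
    FROM qanda
    WHERE type = 0
    ORDER BY id
    LIMIT 10 OFFSET 0
    """
).fetchall()

print(len(page), page[0])
```

Each row of the page repeats the same all_qs and paid_qs values, so the application can read them from the first row. If the page is empty (an offset past the end), no totals come back and a separate count is still needed.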

Related

Passing argument in LEFT JOIN

I am currently trying to get data from 2 tables with a LEFT JOIN that depends on an unknown value.
I tried using LEFT JOIN but it didn't work.
Here is my code example:
SELECT
cc.shid,
cc.user,
ts.type,
sum(cc.qty1) + sum(cc.qty2) as qty_tot,
COUNT(cc.id) as nb
FROM
content_c cc
LEFT JOIN
(SELECT
s.shid,
s.type
FROM
tab_s s
LIMIT 1
) as ts ON ts.shid = cc.shid
WHERE
cc.time_i like '2019-01%'
GROUP BY
cc.user,
ts.type
That query will never work: ts will contain the first occurrence in tab_s regardless of cc.shid. I wonder if there is a way to do this:
LEFT JOIN
(SELECT
s.shid,
s.type
FROM
tab_s s
WHERE
s.shid = cc.shid
LIMIT 1
) as ts ON ts.shid = cc.shid
Any ideas? Is there a notion of pointers in SQL, or something like it, so that I could use &cc.shid or #cc.shid?
Note that doing the following:
LEFT JOIN tab_s ts ON ts.shid = cc.shid
Will make my request take more than 1 minute to display results. And I cannot set an index on tab_s.shid or cc.shid, as both have multiple occurrences.
Please keep in mind that content_c can have multiple occurrences of cc.shid, which is why I need to take only the first result (LIMIT 1). It's important.
Use a correlated subquery:
SELECT cc.shid, cc.user, cc.type,
       SUM(cc.qty1) + SUM(cc.qty2) as qty_tot,
       COUNT(cc.id) as nb
FROM (SELECT cc.*,
             (SELECT s.type
              FROM tab_s s
              WHERE s.shid = cc.shid
              LIMIT 1
             ) as type
      FROM content_c cc
     ) cc
WHERE cc.time_i >= '2019-01-01' AND
      cc.time_i < '2019-02-01'
GROUP BY cc.shid, cc.user, cc.type;
Notes:
The use of LIMIT with no ORDER BY is suspicious. Why would there be duplicates in the underlying table?
Your date comparisons are bad. Use date/time functions when working with date/time values. Don't use string functions.
The GROUP BY should include all non-aggregated columns in the SELECT.
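The correlated-subquery approach can be tried out on a small data set. The sketch below is only an illustration under stated assumptions: it runs on an in-memory SQLite database as a stand-in for MySQL, the sample rows are invented, and an ORDER BY is added inside the subquery so that LIMIT 1 is deterministic. The inner subquery refers to the outer row through the alias of the table it is nested in (here c.shid), which is the "pointer" the question asks about.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE tab_s (shid INTEGER, type TEXT);
    CREATE TABLE content_c (
        id INTEGER PRIMARY KEY, shid INTEGER, user TEXT,
        qty1 INTEGER, qty2 INTEGER, time_i TEXT
    );
    -- Hypothetical data: shid 1 appears twice in tab_s, so a plain join would double-count.
    INSERT INTO tab_s VALUES (1, 'a'), (1, 'a'), (2, 'b');
    INSERT INTO content_c (shid, user, qty1, qty2, time_i) VALUES
        (1, 'ann', 1, 2, '2019-01-05 10:00:00'),
        (1, 'ann', 3, 4, '2019-01-20 12:00:00'),
        (2, 'bob', 5, 6, '2019-01-07 09:00:00'),
        (2, 'bob', 7, 8, '2019-02-01 00:00:00');
    """
)

result = conn.execute(
    """
    SELECT cc.shid, cc.user, cc.type,
           SUM(cc.qty1) + SUM(cc.qty2) AS qty_tot,
           COUNT(cc.id) AS nb
    FROM (SELECT c.*,
                 -- Correlated scalar subquery: evaluated per outer row via c.shid.
                 (SELECT s.type
                  FROM tab_s s
                  WHERE s.shid = c.shid
                  ORDER BY s.type
                  LIMIT 1) AS type
          FROM content_c c) cc
    WHERE cc.time_i >= '2019-01-01' AND cc.time_i < '2019-02-01'
    GROUP BY cc.shid, cc.user, cc.type
    ORDER BY cc.shid
    """
).fetchall()

print(result)
```

Because the subquery returns at most one value per outer row, duplicate shid values in tab_s do not multiply the rows of content_c, so the sums and counts stay correct.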
As discussed in the question comments, can you please try this script and see whether it meets your requirements? It returns the rows of the "content_c" table aggregated according to the GROUP BY.
SELECT
cc.shid,
cc.user,
ts.type,
sum(cc.qty1) + sum(cc.qty2) as qty_tot,
COUNT(cc.id) as nb
FROM content_c cc
LEFT JOIN
(
SELECT DISTINCT s.shid, s.type FROM tab_s s
) AS ts ON ts.shid = cc.shid
WHERE cc.time_i like '2019-01%'
GROUP BY cc.shid,cc.user,ts.type

MySQL select best (and oldest) performance per athlete, categories

I am trying to build the SQL query from following table (example):
Example of table with name "performances"
This is a table of athletic performances. I want to select the best performance from this table per discipline and set of one or more categories. Each athlete should appear only once in the result, even if their best performance value occurs twice or more in the performances table.
Here is the expected result from the table "performances"
Currently I have this SQL query, but the join with the subquery returns all rows that match the athlete_id and best value:
SELECT
p.athlete_id, p.value
FROM
(SELECT athlete_id, MAX(value) AS best FROM performances
WHERE discipline_id = 32 AND category_id IN (1,3,5,7,9)
GROUP BY athlete_id) f
INNER JOIN performances p
ON p.athlete_id = f.athlete_id AND p.conversion = f.best
ORDER BY p.value DESC, p.created
How can I join only one row for each athlete, namely the one with the oldest created attribute?
To get a single row for each athlete per discipline based on the greatest value, you can do a self left join. To handle ties, where a single athlete has more than one row with the same maximum value, you can use a CASE expression to pick the row with the oldest date:
select a.*
from performances a
left join performances b
on a.discipline_id = b.discipline_id
and a.athlete_id = b.athlete_id
and case when a.value = b.value
then a.created > b.created
else a.value < b.value
end
where b.discipline_id is null
You can further add filters to your WHERE clause:
and a.discipline_id = 32
and a.category_id IN (1,3,5,7,9)
You don't have to use joins, you can do it with a window function:
SELECT
p.athlete_id,
p.value
FROM
(
SELECT
athlete_id,
value,
ROW_NUMBER() over (partition by athlete_id order by value desc, created) rowid
FROM
performances
WHERE
discipline_id = 32 AND
category_id IN (1,3,5,7,9)
) p
where
p.rowid = 1
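To see how the tie-break behaves, here is a small sketch of the ROW_NUMBER() approach. It assumes an in-memory SQLite database (3.25+) as a stand-in for MySQL 8.0+, invented sample rows, and a reduced set of columns. The row-number alias is called rn here, because rowid has a special meaning in SQLite.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE performances (
        id INTEGER PRIMARY KEY, athlete_id INTEGER, discipline_id INTEGER,
        category_id INTEGER, value REAL, created TEXT
    );
    -- Hypothetical data: athlete 1 reached the same best value twice.
    INSERT INTO performances (athlete_id, discipline_id, category_id, value, created) VALUES
        (1, 32, 1, 9.5, '2018-05-01'),
        (1, 32, 3, 9.5, '2017-04-01'),
        (1, 32, 1, 8.0, '2016-01-01'),
        (2, 32, 5, 7.2, '2018-06-01'),
        (2, 32, 2, 9.9, '2018-07-01'),
        (3, 10, 1, 6.0, '2018-01-01');
    """
)

best = conn.execute(
    """
    SELECT p.athlete_id, p.value, p.created
    FROM (SELECT athlete_id, value, created,
                 -- Best value first; among equal values the oldest row wins.
                 ROW_NUMBER() OVER (PARTITION BY athlete_id
                                    ORDER BY value DESC, created) AS rn
          FROM performances
          WHERE discipline_id = 32
            AND category_id IN (1, 3, 5, 7, 9)) p
    WHERE p.rn = 1
    ORDER BY p.value DESC, p.created
    """
).fetchall()

print(best)
```

Athlete 1 has two rows with the best value, and the older one is kept. Athlete 2's higher value is in a category outside the filter, so it is ignored, because the WHERE clause runs before the rows are numbered.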
Thank you a lot, Guys. After your answers I finally found the solution.
SELECT r.* FROM
(SELECT p.athlete_id, p.conversion, MIN(p.created) AS created FROM
(SELECT athlete_id, MAX(conversion) AS best
FROM performances
WHERE discipline_id = 32 AND category_id IN (1,3,5,7,9)
GROUP BY athlete_id) f
INNER JOIN performances p ON p.athlete_id = f.athlete_id AND p.conversion = f.best
GROUP BY p.athlete_id) w INNER JOIN performances r
ON w.athlete_id = r.athlete_id AND w.conversion = r.conversion
AND ((w.created = r.created) OR (w.created IS NULL AND r.created IS NULL))
ORDER BY r.conversion DESC, r.created

Optimization of relatively basic JOIN and GROUP BY query

I have a relatively basic query that fetches the most recent messages per conversation:
SELECT `message`.`conversation_id`, MAX(`message`.`add_time`) AS `max_add_time`
FROM `message`
LEFT JOIN `conversation` ON `message`.`conversation_id` = `conversation`.`id`
WHERE ((`conversation`.`receiver_user_id` = 1 AND `conversation`.`status` != -2)
OR (`conversation`.`sender_user_id` = 1 AND `conversation`.`status` != -1))
GROUP BY `conversation_id`
ORDER BY `max_add_time` DESC
LIMIT 12
The message table contains more than 911000 records and the conversation table around 680000. The execution time for this query varies between 4 and 10 seconds, depending on the load on the server, which is far too long.
Below is a screenshot of the EXPLAIN result:
The cause is apparently the MAX and/or the GROUP BY, because the following similar query only takes 10ms:
SELECT COUNT(*)
FROM `message`
LEFT JOIN `conversation` ON `message`.`conversation_id` = `conversation`.`id`
WHERE (`message`.`status`=0)
AND (`message`.`user_id` <> 1)
AND ((`conversation`.`sender_user_id` = 1 OR `conversation`.`receiver_user_id` = 1))
The corresponding EXPLAIN result:
I have tried adding different indices to both tables without any improvement. For example, I added conv_msg_idx(add_time, conversation_id) on message, which seems to be used according to the first EXPLAIN result, but the query still takes around 10 seconds to execute.
Any help improving the indices or query to get the execution time down would be greatly appreciated.
EDIT:
I have changed the query to use an INNER JOIN:
SELECT `message`.`conversation_id`, MAX(`message`.`add_time`) AS `max_add_time`
FROM `message`
INNER JOIN `conversation` ON `message`.`conversation_id` = `conversation`.`id`
WHERE ((`conversation`.`receiver_user_id` = 1 AND `conversation`.`status` != -2)
OR (`conversation`.`sender_user_id` = 1 AND `conversation`.`status` != -1))
GROUP BY `conversation_id`
ORDER BY `max_add_time` DESC
LIMIT 12
But the execution time is still ~ 6 seconds.
You should create a multiple-column index on the columns that appear in your WHERE clause and on those you want to SELECT (except conversation_id). (reference)
conversation_id should be indexed in both tables.
Try to avoid OR in the SQL query, as it can make fetching slow. Use UNION or another method instead:
SELECT message.conversation_id, MAX(message.add_time) AS max_add_time
FROM message
INNER JOIN conversation ON message.conversation_id = conversation.id
WHERE conversation.sender_user_id = 1 AND conversation.status != -1
GROUP BY conversation_id
UNION
SELECT message.conversation_id, MAX(message.add_time) AS max_add_time
FROM message
INNER JOIN conversation ON message.conversation_id = conversation.id
WHERE conversation.receiver_user_id = 1 AND conversation.status != -2
GROUP BY conversation_id
ORDER BY max_add_time DESC
LIMIT 12
Instead of depending on a single table, message, have two tables: one for message, as you have, plus another, thread, that keeps the status of each thread of messages.
Yes, that requires a little more work when adding a new message -- update a column or two in thread.
But it eliminates the GROUP BY and MAX that are causing grief in this query.
While doing this split, see if some other columns would be better off in the new table.
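A minimal sketch of this idea, under loudly stated assumptions: instead of a new thread table it reuses the existing conversation table and adds a hypothetical last_add_time column, the add_message helper and the index name are invented for illustration, timestamps are plain integers, and an in-memory SQLite database stands in for MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE conversation (
        id INTEGER PRIMARY KEY, sender_user_id INTEGER, receiver_user_id INTEGER,
        status INTEGER, last_add_time INTEGER  -- hypothetical summary column
    );
    CREATE TABLE message (id INTEGER PRIMARY KEY, conversation_id INTEGER, add_time INTEGER);
    CREATE INDEX conv_last_idx ON conversation (last_add_time);
    """
)
conn.executemany(
    "INSERT INTO conversation (id, sender_user_id, receiver_user_id, status) VALUES (?, ?, ?, ?)",
    [(1, 1, 2, 0), (2, 3, 1, 0), (3, 1, 4, -1), (4, 5, 6, 0)],
)


def add_message(conn, conversation_id, add_time):
    """Insert a message and keep conversation.last_add_time in sync (one transaction)."""
    with conn:
        conn.execute(
            "INSERT INTO message (conversation_id, add_time) VALUES (?, ?)",
            (conversation_id, add_time),
        )
        # Only move the summary forward, so late-arriving older messages do not regress it.
        conn.execute(
            "UPDATE conversation SET last_add_time = ? "
            "WHERE id = ? AND (last_add_time IS NULL OR last_add_time < ?)",
            (add_time, conversation_id, add_time),
        )


for conv_id, ts in [(1, 100), (2, 300), (1, 200), (3, 400), (4, 500), (1, 150)]:
    add_message(conn, conv_id, ts)

# No join, GROUP BY or MAX is needed to list the most recent conversations of user 1.
latest = conn.execute(
    """
    SELECT id, last_add_time
    FROM conversation
    WHERE (receiver_user_id = 1 AND status != -2)
       OR (sender_user_id = 1 AND status != -1)
    ORDER BY last_add_time DESC
    LIMIT 12
    """
).fetchall()

print(latest)
```

The trade-off is the one described above: every insert into message costs one extra UPDATE, in exchange for a read query that touches only the conversation table.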
SELECT `message`.`conversation_id`, MAX(`message`.`add_time`) AS `max_add_time`
FROM `message`
INNER JOIN `conversation` ON `message`.`conversation_id` = `conversation`.`id`
WHERE ((`conversation`.`receiver_user_id` = 1 AND `conversation`.`status` != -2)
OR (`conversation`.`sender_user_id` = 1 AND `conversation`.`status` != -1))
GROUP BY `conversation_id`
ORDER BY `max_add_time` DESC
LIMIT 12
You can try an INNER JOIN, if it does not affect your logic.
You can also modify this query to avoid MAX() by using ROW_NUMBER() (window functions require MySQL 8.0 or later):
select * from (
select m.*, row_number() over (partition by conversation_id order by add_time desc) p1
from message m
) t1 where t1.p1 = 1

sql query very slow when another table gets fuller

I have the following query, but as users put more and more items into the "ci_falsepositives" table, it gets really slow.
The ci_falsepositives table contains a reference field from ci_address_book and another reference field from ci_matched_sanctions.
How can I create a new query while still being able to sort on each field?
For example, I still need to be able to sort on "hits" or "matches".
SELECT *, matches - falsepositives AS hits
FROM (SELECT c.*, IFNULL(p.total, 0) AS matches,
(SELECT COUNT(*)
FROM ci_falsepositives n
WHERE n.addressbook_id = c.reference
AND n.sanction_key IN
(SELECT sanction_key FROM ci_matched_sanctions)
) AS falsepositives
FROM ci_address_book c
LEFT JOIN
(SELECT addressbook_id, COUNT(match_id) AS total
FROM ci_matched_sanctions
GROUP BY addressbook_id) AS p
ON c.id = p.addressbook_id
) S
ORDER BY folder asc, wholename ASC
LIMIT 0,15
The problem has to be the SELECT COUNT(*) FROM ci_falsepositives sub-query. That sub-query can be written using an inner join between ci_falsepositives and ci_matched_sanctions, but the optimizer might do that for you anyway. What I think you need to do, though, is make that sub-query into a separate query in the FROM clause of the 'next query out' (that is, SELECT c.*, ...). Probably, that query is being evaluated multiple times - and that's what's hurting you when people add records to ci_falsepositives. You should study the query plan carefully.
Maybe this query will be better:
SELECT *, matches - falsepositives AS hits
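FROM (SELECT c.*, IFNULL(p.total, 0) AS matches, IFNULL(f.falsepositives, 0) AS falsepositives
FROM ci_address_book AS c
-- LEFT JOIN keeps address-book rows that have no false positives
LEFT JOIN (SELECT n.addressbook_id, COUNT(*) AS falsepositives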
FROM ci_falsepositives AS n
JOIN ci_matched_sanctions AS m
ON n.sanction_key = m.sanction_key
GROUP BY n.addressbook_id
) AS f
ON c.reference = f.addressbook_id
LEFT JOIN
(SELECT addressbook_id, COUNT(match_id) AS total
FROM ci_matched_sanctions
GROUP BY addressbook_id) AS p
ON c.id = p.addressbook_id
) AS s
ORDER BY folder asc, wholename ASC
LIMIT 0, 15

Query output differs from the expected output

The query below does what I need:
SELECT assign.from_uid, assign.aid, assign.message, curriculum.asset,
curriculum.title, curriculum.description
FROM assignment assign
INNER JOIN curriculum_topics_assets curriculum
ON assign.nid = curriculum.asset
WHERE assign.to_uid = 13 AND assign.status = 1
GROUP BY assign.from_uid, assign.to_uid, assign.nid
ORDER BY assign.created DESC
Now I need to get the total count of rows in the result. For example, if it displays 5 rows, the output should look like my expected output below. The query I tried is given below.
SELECT count(description) FROM assignment assign
INNER JOIN curriculum_topics_assets curriculum ON assign.nid = curriculum.asset
WHERE assign.to_uid = 13 AND assign.status = 1
GROUP BY assign.from_uid, assign.to_uid, assign.nid
ORDER BY assign.created DESC
My expected o/p:
count(*)
---------
5
My current o/p:
count(*)
---------
6
2
5
6
6
The easiest solution would be to:
1. place your initial GROUP BY query in a subselect
2. select the number of rows retrieved from this subselect
SQL Statement
SELECT COUNT(*)
FROM (
SELECT assign.from_uid
FROM assignment assign
INNER JOIN curriculum_topics_assets curriculum ON assign.nid = curriculum.asset
WHERE assign.to_uid = 13
AND assign.status = 1
GROUP BY
assign.from_uid
, assign.to_uid
, assign.nid
) q
Edit - why the original query doesn't return the required result
It had already prepared what was needed to get the correct result.
Your query without grouping returns a resultset of 25 records (6+2+5+6+6)
From these 25 records, you have 5 unique combinations of from_uid, to_uid, nid
Now you don't want to count how many records each combination has (as you did in your example) but how many unique (distinct anyone?) combinations there are.
One solution to this is the subselect I presented, but the following equivalent statement using a DISTINCT clause might be more comprehensible.
SELECT COUNT(*)
FROM (
SELECT DISTINCT assign.from_uid
, assign.to_uid
, assign.nid
FROM assignment assign
INNER JOIN curriculum_topics_assets curriculum ON assign.nid = curriculum.asset
WHERE assign.to_uid = 13
AND assign.status = 1
) q
Note that my personal preference goes to the GROUP BY solution.
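The difference between counting rows per group and counting the groups themselves can be reproduced with a small sketch. It assumes an in-memory SQLite database as a stand-in for MySQL, leaves out the join to curriculum_topics_assets and the WHERE filter for brevity, and uses invented rows arranged to give the same 6, 2, 5, 6, 6 distribution as in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE assignment (from_uid INTEGER, to_uid INTEGER, nid INTEGER)")

# Hypothetical data: 5 distinct (from_uid, to_uid, nid) combinations, 25 rows in total.
groups = {(1, 13, 10): 6, (2, 13, 10): 2, (2, 13, 11): 5, (3, 13, 12): 6, (4, 13, 13): 6}
for combo, n in groups.items():
    conn.executemany("INSERT INTO assignment VALUES (?, ?, ?)", [combo] * n)

# COUNT with GROUP BY yields one count per group: five rows, not one.
per_group = [
    r[0]
    for r in conn.execute(
        "SELECT COUNT(*) FROM assignment GROUP BY from_uid, to_uid, nid"
    )
]

# Wrapping the grouped query in a subselect counts the groups instead.
n_groups = conn.execute(
    "SELECT COUNT(*) FROM ("
    "  SELECT from_uid FROM assignment GROUP BY from_uid, to_uid, nid"
    ") q"
).fetchone()[0]

# The DISTINCT variant gives the same number.
n_distinct = conn.execute(
    "SELECT COUNT(*) FROM (SELECT DISTINCT from_uid, to_uid, nid FROM assignment) q"
).fetchone()[0]

print(sorted(per_group), n_groups, n_distinct)
```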
To get the number of rows for a query do:
SELECT COUNT(*) as RowCount FROM (--insert other query here--) s
In your example:
SELECT COUNT(*) as RowCount FROM (SELECT a.from_uid
FROM assignment a
INNER JOIN curriculum_topics_assets c ON a.nid = c.asset
WHERE a.to_uid = 13
AND a.status = 1
GROUP BY a.from_uid, a.to_uid, a.nid
) s
Note that I dropped the parts that have no effect on the number of rows to make the query run slightly faster.
You should use COUNT(*) instead of count(description). Look at: http://www.mysqlperformanceblog.com/2007/04/10/count-vs-countcol/