tuning a SQL query for better performance - mysql

I have this SQL statement that works but takes a while to execute.
I have an activity table and I need to find the last activity and the associated user for each id.
SELECT id, date_time, user
FROM activity_log a
WHERE a.date_time = (SELECT MAX(a1.date_time)
FROM activity_log a1
WHERE a.id = a1.id
GROUP BY id)
ORDER BY `id` desc limit 0, 100
I have a non unique index on date_time field and id field.
How can we get a shorter execution time on this query?

What you currently have is a correlated subquery, which requires a computation on each of the rows you return from your outer select.
Instead, return the entire dataset of id and max(date_time) as a subquery and join to that. That requires only 1 trip to the activity_log table to find each max(date_time) and will significantly improve your runtimes.
SELECT a.id, a.date_time, a.user
FROM activity_log a
INNER JOIN (
SELECT id, MAX(date_time) as date_time
FROM activity_log
GROUP BY id) a1
ON a.id = a1.id and a.date_time = a1.date_time
ORDER BY `id` desc limit 0, 100
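To try the join-to-derived-table form on a small data set, here is a minimal sketch. It runs against an in-memory SQLite database as a stand-in for MySQL, so timings will not carry over; the sample rows and the composite index name `idx_id_date` are assumptions for illustration only. A composite index on (id, date_time) is usually what helps a grouped MAX like this one, more than two separate single-column indexes.

```python
import sqlite3

def latest_activity(conn):
    """Return (id, date_time, user) for the newest row of each id."""
    return conn.execute(
        """
        SELECT a.id, a.date_time, a.user
        FROM activity_log a
        INNER JOIN (
            SELECT id, MAX(date_time) AS date_time
            FROM activity_log
            GROUP BY id
        ) a1 ON a.id = a1.id AND a.date_time = a1.date_time
        ORDER BY a.id DESC
        LIMIT 100
        """
    ).fetchall()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE activity_log (id INTEGER, date_time TEXT, user TEXT)")
# Hypothetical composite index; the original post only has separate indexes.
conn.execute("CREATE INDEX idx_id_date ON activity_log (id, date_time)")
conn.executemany(
    "INSERT INTO activity_log VALUES (?, ?, ?)",
    [
        (1, "2020-01-01 10:00:00", "alice"),
        (1, "2020-01-02 09:00:00", "bob"),
        (2, "2020-01-01 08:00:00", "carol"),
        (2, "2019-12-31 23:00:00", "dave"),
    ],
)
rows = latest_activity(conn)
print(rows)
```

If two rows for the same id share the same maximum date_time, both are returned, so the result can contain more than one row per id.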

What happens if you try this:
SELECT id, date_time, user
FROM activity_log a
WHERE EXISTS
(SELECT 1 FROM (SELECT ID,MAX(a1.date_time) maxdate
FROM activity_log a1
GROUP BY ID) a1 WHERE A1.ID=A.ID AND A1.MAXDATE=a.date_time)
ORDER BY `id` desc limit 0, 100

Related

MySQL Delete all records except latest N for each user

I want to keep the latest N records for each user_id and delete the others.
Structure table "tab":
id (auto increment)
user_id
information
If possible, I would like to not delete if a user's number of records is less than N.
Thank you in advance.
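You can use a correlated subquery that counts the newer rows belonging to the same user:
Delete from your_table t
Where N <= (select count(1)
from your_table tt
where tt.user_id = t.user_id
and tt.id > t.id)
Note that MySQL does not allow a DELETE to select from its own target table in a subquery (error 1093), so on MySQL a join form is needed instead.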
You can use a join in a delete:
delete t
from t join
(select t.*, row_number() over (partition by user_id order by id desc) as seqnum
from t
) tt
on tt.id = t.id
where tt.seqnum > N;
This numbers each user's rows from newest to oldest and then deletes those whose number is greater than N.
I should add that this requires MySQL 8+.
EDIT:
In older versions of MySQL, you can use:
delete t
from t join
(select u.user_id,
(select t2.id
from t t2
where t2.user_id = u.user_id
order by t2.id desc
limit 1 offset N
) as nth_id
from (select distinct user_id from t) u
) tt
on tt.user_id = t.user_id
where t.id <= nth_id;
Test the subquery before you run the delete. It should return the (N+1)th newest id for each user, or NULL for users with N or fewer rows.
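The "preview first, then delete" advice can be sketched as follows. This is only an illustration on an in-memory SQLite database standing in for MySQL; the table name `tab` comes from the question, while the sample rows and `N = 2` are assumptions. The SELECT counts, for each row, how many newer rows the same user has, and marks the row for deletion when that count is at least N.

```python
import sqlite3

N = 2  # hypothetical number of rows to keep per user

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tab ("
    "id INTEGER PRIMARY KEY AUTOINCREMENT, user_id INTEGER, information TEXT)"
)
conn.executemany(
    "INSERT INTO tab (user_id, information) VALUES (?, ?)",
    [(1, "a"), (1, "b"), (1, "c"), (1, "d"), (2, "e"), (3, "f"), (3, "g"), (3, "h")],
)

# Step 1: preview the ids that would be deleted.
doomed = [
    row[0]
    for row in conn.execute(
        """
        SELECT t.id
        FROM tab t
        WHERE (SELECT COUNT(*)
               FROM tab tt
               WHERE tt.user_id = t.user_id
                 AND tt.id > t.id) >= ?
        ORDER BY t.id
        """,
        (N,),
    )
]

# Step 2: delete exactly the previewed ids.
conn.executemany("DELETE FROM tab WHERE id = ?", [(i,) for i in doomed])
remaining = [row[0] for row in conn.execute("SELECT id FROM tab ORDER BY id")]
print(doomed, remaining)
```

Users with fewer than N rows never reach a count of N newer rows, so none of their rows are selected.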

error in Max count(*) in SQL with same data

I wrote a SQL query to get users with the largest number of purchases.
SELECT name, count(*) as C
FROM sells
GROUP BY user_id
ORDER BY C
LIMIT 1
But if two users have the same number of purchases, this query cannot detect it. What's the solution?
Try subquery:
SELECT name, count(*) as C
FROM sells
GROUP BY user_id
HAVING C >= ALL
(SELECT count(*)
FROM sells
GROUP BY user_id)
This will work in any sql version, without using LIMIT in a subquery
Write a subquery that gets the maximum count. Then use HAVING to select all the rows with that count.
SELECT name, COUNT(*) AS c
FROM sells
GROUP BY user_id
HAVING c = (SELECT COUNT(*) c
FROM sells
GROUP BY user_id
ORDER BY c DESC
LIMIT 1)
Or this can be done as a join between subqueries:
SELECT t1.*
FROM (SELECT name, COUNT(*) AS c
FROM sells
GROUP BY user_id) AS t1
JOIN (SELECT COUNT(*) AS c
FROM sells
GROUP BY user_id
ORDER BY c DESC
LIMIT 1) AS t2
ON t1.c = t2.c
SELECT name, COUNT(*)
FROM sells
GROUP BY user_id
HAVING COUNT(*) = ( SELECT MAX(C) FROM ( SELECT COUNT(*) AS C FROM sells GROUP BY user_id ) AS t )
You are using LIMIT 1 in the query. It restricts the number of records in the output to one. If you wish to see all the records from the output remove this LIMIT.
If you only need one row that lists every name sharing the highest count, group the per-user counts by the count itself:
SELECT GROUP_CONCAT(name), C
FROM (SELECT name, count(*) as C
FROM sells
GROUP BY user_id) t
GROUP BY C
ORDER BY C DESC
LIMIT 1
This will concatenate the names of all users that share the highest count.
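To check that ties are really returned, the MAX-of-counts idea from the answers above can be run on a few rows. This is a sketch on an in-memory SQLite database rather than MySQL; the sample data is made up, and `name` is added to the GROUP BY on the assumption that each user_id has exactly one name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sells (user_id INTEGER, name TEXT)")
conn.executemany(
    "INSERT INTO sells VALUES (?, ?)",
    [(1, "ann"), (1, "ann"), (2, "bob"), (2, "bob"), (3, "cy")],
)

# Keep every user whose purchase count equals the largest count.
top = conn.execute(
    """
    SELECT name, COUNT(*) AS c
    FROM sells
    GROUP BY user_id, name
    HAVING COUNT(*) = (SELECT MAX(c)
                       FROM (SELECT COUNT(*) AS c
                             FROM sells
                             GROUP BY user_id) AS t)
    ORDER BY name
    """
).fetchall()
print(top)
```

Both users with two purchases come back, while the user with a single purchase is filtered out.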

mysql query that has join and counts

I need help getting the top 5 results and their counts from columns from two different tables in a mysql database joined together.
table1 cols
-------
id, country, timestamp
table2 cols
--------
id, table1_id, reason
The results I'd like to get are the top 5 countries and the number of times each was found between two timestamps, and the top 5 reasons and their counts for all the rows used to generate the first count. There is a one-to-many relationship between table1 and table2. This is stumping me, and I appreciate any insight you could give me.
It's not entirely clear what resultset you want to return.
This may be of some help to you:
SELECT t.country
, COUNT(DISTINCT t.id) AS count_table1_rows
, COUNT(r.id) AS count_table2_rows
, COUNT(*) AS count_total_rows
FROM table1 t
LEFT
JOIN table2 r
ON r.table1_id = t.id
WHERE t.timestamp >= NOW() - INTERVAL 7 DAY
AND t.timestamp < NOW()
GROUP BY t.country
ORDER BY COUNT(DISTINCT t.id) DESC
LIMIT 5
That will return a maximum of 5 rows, one row per country, with counts of rows in table1, counts of rows found in table2, and a count of the total rows returned.
The LEFT keyword specifies an "outer" join operation, such that rows from table1 are returned even if there are no matching rows found in table2.
To get the count for each "reason", associated with each country, you could do something like this:
SELECT t.country
, u.reason
, u.cnt_r
, COUNT(DISTINCT t.id) AS count_table1_rows
FROM table1 t
LEFT
JOIN ( SELECT s.country
, r.reason
, COUNT(*) AS cnt_r
FROM table1 s
JOIN table2 r
ON r.table1_id = s.id
WHERE s.timestamp >= NOW() - INTERVAL 7 DAY
AND s.timestamp < NOW()
GROUP
BY s.country
, r.reason
) u
ON u.country = t.country
WHERE t.timestamp >= NOW() - INTERVAL 7 DAY
AND t.timestamp < NOW()
GROUP
BY t.country
, u.reason
ORDER
BY COUNT(DISTINCT t.id) DESC
, t.country DESC
, u.cnt_r DESC
, u.reason DESC
This query doesn't "limit" the rows being returned. It would be possible to modify the query to have only a subset of the rows returned, but that can get complex. Before taking on the complexity of adding "top 5 within top 5" limits, we want to ensure that the rows returned by the query are a superset of the rows we actually want.
Is this what you want?
select t2.reason, count(*)
from (select t1.country, count(*)
from table1 t1
where timestamp between #STARTTIME and #ENDTIME
group by country
order by count(*) desc
limit 5
) c5 join
table1 t1
on c5.country = t1.country and
t1.timestamp between #STARTTIME and #ENDTIME join
table2 t2
on t2.table1_id = t1.id
group by t2.reason;
The c5 subquery gets the five countries. The other two bring back the data for the final aggregation.
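The "top countries first, then aggregate reasons" shape can be tried out like this. It is a sketch on an in-memory SQLite database standing in for MySQL; the helper name `top_reasons`, the `top_n` parameter, the ISO date strings used as timestamps, and the sample rows are all assumptions for illustration.

```python
import sqlite3

def top_reasons(conn, start, end, top_n=5):
    """Count reasons for rows belonging to the top_n countries in [start, end]."""
    return conn.execute(
        """
        SELECT t2.reason, COUNT(*) AS cnt
        FROM (SELECT country
              FROM table1
              WHERE timestamp BETWEEN :start AND :end
              GROUP BY country
              ORDER BY COUNT(*) DESC
              LIMIT :top_n) c5
        JOIN table1 t1
          ON c5.country = t1.country
         AND t1.timestamp BETWEEN :start AND :end
        JOIN table2 t2
          ON t2.table1_id = t1.id
        GROUP BY t2.reason
        ORDER BY cnt DESC, t2.reason
        """,
        {"start": start, "end": end, "top_n": top_n},
    ).fetchall()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (id INTEGER PRIMARY KEY, country TEXT, timestamp TEXT)")
conn.execute("CREATE TABLE table2 (id INTEGER PRIMARY KEY, table1_id INTEGER, reason TEXT)")
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?, ?)",
    [
        (1, "US", "2024-01-01"), (2, "US", "2024-01-02"),
        (3, "DE", "2024-01-03"),
        (4, "FR", "2024-01-04"), (5, "FR", "2024-01-05"),
        (6, "FR", "2024-02-01"),  # outside the time window
    ],
)
conn.executemany(
    "INSERT INTO table2 VALUES (?, ?, ?)",
    [
        (1, 1, "spam"), (2, 1, "abuse"), (3, 2, "spam"), (4, 3, "spam"),
        (5, 4, "abuse"), (6, 6, "spam"), (7, 5, "spam"),
    ],
)
reasons = top_reasons(conn, "2024-01-01", "2024-01-31", top_n=2)
print(reasons)
```

With `top_n=2`, only US and FR rows inside the window contribute, so the DE row and the out-of-window FR row are not counted. Adding a LIMIT to the outer query would cap the number of reasons as well.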

Subqueries involving multiple columns

I want a query to return values not present in another table. I currently run two queries and filter out the matching pairs in code. I am stuck on the syntax for comparing multiple columns and on where to put the clauses that follow the WHERE.
First query:
SELECT sid, cid
FROM Table2
where used = 0
group by sid, cid
Main query:
SELECT sid, cid, count(1) as cnt
FROM Table1
WHERE ##not any pair of (sid, cid) returned from first query##
GROUP BY sid, cid
HAVING cnt < 20
LIMIT 50
What is a complete main query?
Try:
SELECT t1.sid, t1.cid, count(1) as cnt
FROM Table1 t1
LEFT JOIN Table2 t2
ON t1.sid = t2.sid AND t1.cid = t2.cid AND t2.used = 0
WHERE t2.sid IS NULL AND t2.cid IS NULL
GROUP BY t1.sid, t1.cid
HAVING cnt < 20
LIMIT 50
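Here is a small sketch of that LEFT JOIN ... IS NULL anti-join, run on an in-memory SQLite database as a stand-in for MySQL. The sample rows are invented; the table and column names follow the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Table1 (sid INTEGER, cid INTEGER)")
conn.execute("CREATE TABLE Table2 (sid INTEGER, cid INTEGER, used INTEGER)")
conn.executemany(
    "INSERT INTO Table1 VALUES (?, ?)",
    [(1, 1), (1, 1), (2, 2), (3, 3), (3, 3), (3, 3)],
)
conn.executemany(
    "INSERT INTO Table2 VALUES (?, ?, ?)",
    [(1, 1, 0), (2, 2, 1)],
)

# Pairs that have a used = 0 row in Table2 find a match and are filtered out;
# pairs without such a row keep NULLs on the t2 side and survive the WHERE.
pairs = conn.execute(
    """
    SELECT t1.sid, t1.cid, COUNT(1) AS cnt
    FROM Table1 t1
    LEFT JOIN Table2 t2
      ON t1.sid = t2.sid AND t1.cid = t2.cid AND t2.used = 0
    WHERE t2.sid IS NULL
    GROUP BY t1.sid, t1.cid
    HAVING COUNT(1) < 20
    ORDER BY t1.sid
    LIMIT 50
    """
).fetchall()
print(pairs)
```

The pair (2, 2) is kept because its only Table2 row has used = 1, which the join condition ignores.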

SQL select the last 3 dates from a table

I have a table with lots of fields in mysql
I need a query to return (in the same row!) the last 3 dates (the dates can have large gaps between them)
ie:
2012/01/20
2012/01/18
2012/01/12
2012/01/10
2012/01/04
etc...
Any help will be appreciated
I must get them in the same row!
This is the query I am trying to use with no success:
SELECT a.id, a.thedate, b.id AS id1, b.thedate AS thedate1, c.id AS id2, c.thedate as thedate2
FROM mytable AS a INNER JOIN mytable AS b ON a.id = b.id INNER JOIN mytable AS c ON b.id=c.id
WHERE c.thedate = SELECT MAX(thedate)
EDIT :
SELECT group_concat(date) FROM (SELECT date FROM my_table ORDER BY date DESC LIMIT 3) AS temp
Corrected-
SELECT group_concat(date) FROM ( select date from table_name order by date desc limit 3) as a
SELECT GROUP_CONCAT(a.date )
FROM (
SELECT date
FROM my_table
ORDER BY date DESC
LIMIT 3
) AS a
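A quick way to see the single-row result is the sketch below, run on an in-memory SQLite database instead of MySQL. The table and column names `mytable` and `thedate` are taken from the question's attempt, and the dates are the ones listed there, written in ISO format. The order of the values inside the concatenated string is not guaranteed by the subquery alone; in MySQL it can be pinned with GROUP_CONCAT(thedate ORDER BY thedate DESC).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (id INTEGER PRIMARY KEY, thedate TEXT)")
conn.executemany(
    "INSERT INTO mytable (thedate) VALUES (?)",
    [("2012-01-04",), ("2012-01-10",), ("2012-01-12",), ("2012-01-18",), ("2012-01-20",)],
)

# The inner query picks the three most recent dates;
# the outer query folds them into one comma-separated value.
last_three = conn.execute(
    """
    SELECT GROUP_CONCAT(thedate)
    FROM (SELECT thedate
          FROM mytable
          ORDER BY thedate DESC
          LIMIT 3) AS a
    """
).fetchone()[0]
print(last_three)
```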