I have a MySQL (v5.7.26) query that runs forever. Here is the query:
SELECT
ur.user_id AS user_id,
sum(r.duration) AS total_time,
count(user_id) AS number_of_workouts
FROM user_resource ur
INNER JOIN resource r ON r.id = ur.resource_id
WHERE
ur.status = 1
AND NOT ur.action_date IS NULL
AND ur.user_id IN (
SELECT user_id
FROM user_resource ur2
WHERE ur2.action_date >= now() - INTERVAL 2 DAY
)
AND r.type = 'WORKOUT'
GROUP BY ur.user_id;
I have played with it a bit to understand where the problem is. For testing purposes, I tried breaking it in two:
SELECT user_id
FROM user_resource ur2
WHERE ur2.action_date >= now() - INTERVAL 2 DAY;
That returns (very quickly) a list of user_ids.
When I plug the returned result into the first part of the query, like this:
SELECT
ur.user_id AS user_id,
sum(r.duration) AS total_time,
count(user_id) AS number_of_workouts
FROM user_resource ur
INNER JOIN resource r ON r.id = ur.resource_id
WHERE
ur.status = 1
AND NOT ur.action_date IS NULL
AND ur.user_id IN (1,1,1,4,4,5,6,7,7,7)
AND r.type = 'WORKOUT'
GROUP BY ur.user_id
It runs very fast. My assumption is that the IN (subquery) is the bottleneck.
I was thinking of extracting the subquery to get the user_ids and then using the result as a variable, but I am not sure whether that is a good approach, and I am also having issues with it. This is my attempt:
-- first statement
SET @v1 = (SELECT user_id
FROM user_resource ur2
WHERE ur2.action_date >= now() - INTERVAL 2 DAY);
-- second statement
SELECT
ur.user_id AS user_id,
sum(r.duration) AS total_time,
count(user_id) AS prefixes
FROM user_resource ur
INNER JOIN resource r ON r.id = ur.resource_id
WHERE
ur.status = 1
AND NOT ur.action_date IS NULL
AND ur.user_id IN (@v1)
AND r.type = 'WORKOUT'
GROUP BY ur.user_id
The problem is that the first statement returns an error:
Subquery returns more than 1 row.
The expected result is a list of user_ids, which can contain duplicates, and I need those duplicates for the count.
How can I fix this?
Try EXISTS instead of IN
...
AND EXISTS (SELECT *
FROM user_resource ur2
WHERE ur2.user_id = ur.user_id
AND ur2.action_date >= now() - INTERVAL 2 DAY)
...
and add indexes on user_resource (user_id, action_date), user_resource (status, action_date, user_id) and/or resource (type).
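Put together, the EXISTS rewrite and the suggested indexes might look like the sketch below. The index names are made up for illustration, and it assumes the `type` column lives on `resource` (as `r.type = 'WORKOUT'` in the question suggests). Which index the optimizer actually uses is worth checking with EXPLAIN.

```sql
-- Hypothetical index names; the columns come from the question's query.
CREATE INDEX idx_ur_user_action ON user_resource (user_id, action_date);
CREATE INDEX idx_ur_status_action_user ON user_resource (status, action_date, user_id);
CREATE INDEX idx_r_type ON resource (type);

SELECT
    ur.user_id        AS user_id,
    SUM(r.duration)   AS total_time,
    COUNT(ur.user_id) AS number_of_workouts
FROM user_resource ur
INNER JOIN resource r ON r.id = ur.resource_id
WHERE ur.status = 1
  AND ur.action_date IS NOT NULL
  AND r.type = 'WORKOUT'
  -- Semi-join: true as soon as one recent row exists for this user,
  -- so several recent rows for the same user cannot inflate the totals.
  AND EXISTS (
      SELECT 1
      FROM user_resource ur2
      WHERE ur2.user_id = ur.user_id
        AND ur2.action_date >= NOW() - INTERVAL 2 DAY
  )
GROUP BY ur.user_id;
```

Duplicates in the inner result do not matter here: the count comes from the outer `user_resource` rows, and EXISTS (like IN) only decides whether a user qualifies.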
You could try:
-- first statement
-- note: GROUP_CONCAT output is truncated at group_concat_max_len (default 1024)
SET @v1 = (SELECT GROUP_CONCAT(user_id)
FROM user_resource ur2
WHERE ur2.action_date >= now() - INTERVAL 2 DAY);
-- second statement
SELECT
ur.user_id AS user_id,
sum(r.duration) AS total_time,
count(user_id) AS prefixes
FROM user_resource ur
INNER JOIN resource r ON r.id = ur.resource_id
WHERE ur.status = 1 AND NOT ur.action_date IS NULL AND FIND_IN_SET(ur.user_id, @v1)
AND r.type = 'WORKOUT'
GROUP BY ur.user_id
An additional join can be faster than the subquery. The derived table needs DISTINCT, otherwise users with several recent rows are counted several times:
SELECT
ur.user_id AS user_id,
sum(r.duration) AS total_time,
count(ur.user_id) AS number_of_workouts
FROM user_resource ur
INNER JOIN resource r ON r.id = ur.resource_id
INNER JOIN (
SELECT DISTINCT user_id
FROM user_resource ur2
WHERE ur2.action_date >= now() - INTERVAL 2 DAY
) t ON t.user_id = ur.user_id
WHERE
ur.status = 1
AND NOT ur.action_date IS NULL
AND r.type = 'WORKOUT'
GROUP BY ur.user_id;
Related
I have this SQL query that returns me the whole set:
SELECT s.id, s.status, s.created_at
FROM scores s
WHERE s.account_id IN (
SELECT a.id
FROM accounts a
WHERE a.status = 'ACTIVE' AND a.created_at >= DATE(NOW()) - 3 AND a.tool = 'GLR'
) ORDER BY s.created_at ASC;
...but now I want a subset of that: all the records in accounts that don't have any record in scores.
I've tried these, but no luck so far O:)
-- Using LEFT JOIN
SELECT s.account_id, s.status, s.created_at
FROM scores s
LEFT JOIN accounts a ON s.account_id = a.id
AND a.status = 'ACTIVE'
AND a.created_at >= DATE(NOW()) - 3
AND a.tool = 'GLR'
WHERE s.account_id IS NULL ORDER BY s.created_at ASC;
-- Using EXISTS / NOT EXISTS
SELECT s.account_id, s.status, s.created_at
FROM scores s
WHERE NOT EXISTS (
SELECT 1 FROM accounts a
WHERE s.account_id = a.id
AND a.status = 'ACTIVE'
AND a.created_at >= DATE(NOW()) - 3
AND a.tool = 'GLR'
) ORDER BY s.created_at ASC;
SELECT a.* FROM accounts a LEFT JOIN scores s ON s.account_id = a.id
WHERE a.status = 'ACTIVE' AND a.created_at >= DATE(NOW()) - 3 AND a.tool = 'GLR'
AND s.account_id IS NULL
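The last query is the usual anti-join: start from accounts, LEFT JOIN scores, and keep only the rows where nothing matched. One caveat, sketched below: in MySQL, `DATE(NOW()) - 3` is evaluated as a number (the date as YYYYMMDD minus 3), not as a date three days earlier, so it misbehaves around month boundaries. The version below assumes "created in the last three days" is what was meant and uses INTERVAL arithmetic instead.

```sql
-- Accounts from the last three days that have no row in scores (anti-join).
SELECT a.*
FROM accounts a
LEFT JOIN scores s ON s.account_id = a.id
WHERE a.status = 'ACTIVE'
  AND a.tool = 'GLR'
  AND a.created_at >= CURDATE() - INTERVAL 3 DAY  -- date arithmetic, not numeric
  AND s.account_id IS NULL;                       -- no matching scores row

-- The same result written with NOT EXISTS.
SELECT a.*
FROM accounts a
WHERE a.status = 'ACTIVE'
  AND a.tool = 'GLR'
  AND a.created_at >= CURDATE() - INTERVAL 3 DAY
  AND NOT EXISTS (SELECT 1 FROM scores s WHERE s.account_id = a.id);
```

Both attempts in the question start from scores, which is why they cannot return accounts that have no scores at all.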
I need to count rows grouped by hour and add this as a subquery in the SELECT list, but I get an error on this line: AND DATE(created_at) = T.day_start AND user_id = T.user_id
Here is my query:
SELECT
COUNT(*)
FROM
(
SELECT
HOUR (call_start_at) AS hours,
count(*) AS calls
FROM
calls
WHERE
1
AND user_id = 8
AND call_start_at >= '2016-01-06 00:00:00'
AND call_start_at <= '2016-01-06 23:59:59'
GROUP BY
HOUR (call_start_at)
) AS T1
I tried adding this as a subquery in the SELECT list, but it fails on the marked line, where T.day_start and T.user_id are referenced.
Here is my test:
SELECT
T2.name,
T2.calls,
ROUND(calling_time * 100 / working_time, 2) AS percent,
T2.calling_time,
T2.working_time
FROM
(SELECT
T.name,
(SELECT COUNT(*) FROM calls AS C WHERE DATE(C.created_at) = T.day_start AND C.user_id = T.user_id) AS calls,
(SELECT
COUNT(*)
FROM
(SELECT
HOUR(call_start_at) as hours,
count(*) as calls
FROM
calls
WHERE 1
AND DATE(created_at) = T.day_start AND user_id = T.user_id -- marked line
GROUP BY
HOUR(call_start_at)) as T3
) as row_count,
(SELECT SUM(call_length) FROM calls AS C WHERE DATE(C.created_at) = T.day_start AND C.user_id = T.user_id) AS calling_time,
SUM(T.working_time) AS working_time
FROM
(SELECT
U.username AS name,
U.id AS user_id,
DATE(UW.start) as day_start,
UW.length AS working_time
FROM
users AS U
LEFT JOIN users_worktime AS UW ON UW.user_id = U.id
WHERE 1
AND U.type = 'agent'
AND UW.start >= '2016-01-06 00:00:00'
AND UW.start <= '2016-01-06 23:59:59'
) AS T
GROUP BY
T.name, T.user_id, T.day_start
) AS T2
You can more simply write the query as:
SELECT COUNT(DISTINCT HOUR(call_start_at)) as num
FROM calls
WHERE 1 AND
user_id = 8 AND
call_start_at >= '2016-01-06 00:00:00' AND
call_start_at <= '2016-01-06 23:59:59'
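This will allow you to use the correlation clause: the condition now sits in a plain scalar subquery rather than inside a nested derived table, which MySQL does not allow to reference columns of the outer query.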
Note: This ignores NULL values. I assume that is not a problem (it is easily fixed if it is).
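For instance, the row_count column from the question could become a scalar subquery in the SELECT list, where T is in scope. The sketch below is trimmed down to that one column and reuses the tables, aliases and date range from the question; it has not been run against the asker's schema.

```sql
SELECT
    T.name,
    T.day_start,
    -- Number of distinct hours with at least one call for this user and day.
    (SELECT COUNT(DISTINCT HOUR(C.call_start_at))
     FROM calls AS C
     WHERE DATE(C.created_at) = T.day_start
       AND C.user_id = T.user_id) AS row_count
FROM (
    SELECT
        U.username     AS name,
        U.id           AS user_id,
        DATE(UW.start) AS day_start
    FROM users AS U
    LEFT JOIN users_worktime AS UW ON UW.user_id = U.id
    WHERE U.type = 'agent'
      AND UW.start >= '2016-01-06 00:00:00'
      AND UW.start <= '2016-01-06 23:59:59'
) AS T
GROUP BY T.name, T.user_id, T.day_start;
```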
I have the query below:
SELECT u.*,
(SELECT sum(trs.amount)
FROM transactions trs
WHERE u.id = trs.user AND trs.type = 'Recycle' AND
trs.TIME >= UNIX_TIMESTAMP(CURDATE())
) as amt
FROM (SELECT DISTINCT user_by
FROM xeon_users_rented
) AS xur JOIN
users u
ON xur.user_by = u.username
LIMIT 50
This selects some data from my database, and the query above works fine. However, I would also like to select count(*) from xeon_users_rented where user_by = u.username. This is what I have attempted:
SELECT u.*,
(SELECT sum(trs.amount)
FROM transactions trs
WHERE u.id = trs.user AND trs.type = 'Recycle' AND
trs.TIME >= UNIX_TIMESTAMP(CURDATE())
) as amt,
(SELECT DISTINCT count(*)
FROM xeon_users_rented
WHERE xur.user_by = u.username
) AS ttl
FROM (SELECT DISTINCT user_by
FROM xeon_users_rented
) AS xur JOIN
users u
ON xur.user_by = u.username
LIMIT 50
However, that gives me the total number of rows in xeon_users_rented as ttl, not the number of rows where user_by = u.username.
I think you can do what you want just by tinkering with your subquery a little. That is, change the select distinct to a group by:
SELECT u.*, xur.cnt,
(SELECT sum(trs.amount)
FROM transactions trs
WHERE u.id = trs.user AND trs.type = 'Recycle' AND
trs.TIME >= UNIX_TIMESTAMP(CURDATE())
) as amt
FROM (SELECT user_by, COUNT(*) as cnt
FROM xeon_users_rented
GROUP BY user_by
) xur JOIN
users u
ON xur.user_by = u.username
LIMIT 50;
Some notes:
SELECT DISTINCT is not really necessary, because you can do the same logic using GROUP BY. So, it is more important to understand GROUP BY.
You are using LIMIT with no ORDER BY. That means that you can get a different set of rows each time you run the query. Bad practice.
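To make the LIMIT deterministic, an ORDER BY on a unique key is enough. The sketch below assumes `u.id` is unique in `users` (the question joins on it but does not say so); any other unique or otherwise meaningful ordering would do.

```sql
SELECT u.*, xur.cnt
FROM (SELECT user_by, COUNT(*) AS cnt
      FROM xeon_users_rented
      GROUP BY user_by
     ) xur
JOIN users u ON xur.user_by = u.username
ORDER BY u.id   -- assumed unique, so the same 50 rows come back on every run
LIMIT 50;
```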
I have the following SQL query :
SELECT users.username, users.id, count(tahminler.tahmin)as tahmins_no, m.winnings
FROM users
LEFT JOIN tahminler ON users.id = tahminler.user_id
LEFT JOIN matches_of_comments ON tahminler.match_id = matches_of_comments.match_id
LEFT JOIN (SELECT user_id ,count(result) as winnings from tahminler WHERE result = 1 group by user_id) as m ON m.user_id = users.id
WHERE (MONTH( STR_TO_DATE( matches_of_comments.match_date, '%d.%m.%Y' ) ) = 01 AND YEAR( STR_TO_DATE( matches_of_comments.match_date, '%d.%m.%Y' ) ) = 2014 AND flag=1)
GROUP BY users.id
having count(tahminler.tahmin) > 0
The WHERE clause is not applied to the subquery (m). I do not want to repeat the same clause inside the subquery, since that would make the query more complicated and less optimized. Is there a way to apply this condition to the subquery as well, without repeating it inside?
There is no need to join the same table twice in your query. You can get the result from the query below.
Try this:
SELECT u.username, u.id, COUNT(t.tahmin) AS tahmins_no,
SUM(t.result = 1) AS winnings
FROM users u
LEFT JOIN tahminler t ON u.id = t.user_id
LEFT JOIN matches_of_comments mc ON t.match_id = mc.match_id
WHERE MONTH(STR_TO_DATE(mc.match_date, '%d.%m.%Y')) = 1 AND
YEAR(STR_TO_DATE(mc.match_date, '%d.%m.%Y')) = 2014 AND flag=1
GROUP BY u.id
HAVING tahmins_no > 0
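The `SUM(t.result = 1)` trick relies on MySQL evaluating a comparison to 1 (true), 0 (false) or NULL, so summing it counts the matching rows within each group. A tiny standalone illustration with made-up values:

```sql
-- Four sample rows: 1, 0, 1, NULL.
SELECT SUM(x = 1) AS ones,   -- 2: NULL comparisons are ignored by SUM
       COUNT(*)   AS total   -- 4
FROM (
    SELECT 1 AS x
    UNION ALL SELECT 0
    UNION ALL SELECT 1
    UNION ALL SELECT NULL
) AS t;
```

Because the count now happens over the same filtered rows as `tahmins_no`, the month and year condition applies to the winnings too, with no second pass over `tahminler`.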
SELECT *
FROM users
WHERE id
IN ( 2024 )
AND id NOT IN (
SELECT user_id
FROM `used`
WHERE DATE_SUB( DATE_ADD( CURDATE( ) , INTERVAL 7 DAY ) , INTERVAL 14 DAY ) <= created)
AND id NOT IN (
SELECT user_id
FROM coupon_used
WHERE code = 'XXXXX')
AND id IN (
SELECT user_id
FROM accounts)
The id 2024 exists in the users table, but it also exists in the used table. When I run this query it still returns id 2024, which should have been filtered out. I select specific users and then want to exclude those that appear in the used table, but the query above does not give me the desired result.
The desired result is to select users under the following conditions: take the specific users, check that they are not in the used table and not in the coupon_used table, and require that they are in the accounts table.
I would use left joins for the exclusion conditions and a regular join for the inclusions:
SELECT users.*
FROM users
INNER JOIN accounts ON accounts.user_id = users.id
LEFT JOIN used ON used.user_id = users.id AND DATE_SUB(CURDATE(), INTERVAL 7 DAY) <= used.created
LEFT JOIN coupon_used ON coupon_used.user_id = users.id AND coupon_used.code = 'XXXXX'
WHERE users.id IN (2024) AND used.user_id IS NULL AND coupon_used.user_id IS NULL
I've edited the date manipulation as well; +7 -14 would be -7 :)
I would recommend using a JOIN on accounts and LEFT OUTER JOINs on the other two tables. A JOIN on accounts means the user must be in the accounts table. LEFT OUTER JOINs on coupon_used and used return a row whether or not the user is in those tables. Filtering down to c.user_id IS NULL (and u.user_id IS NULL) then keeps only users that have NO record in that table.
SELECT users.*
FROM users
JOIN accounts ON users.id = accounts.user_id
LEFT OUTER JOIN coupon_used c ON users.id = c.user_id AND c.code = 'XXXXX'
LEFT OUTER JOIN `used` u ON users.id = u.user_id AND DATE_SUB( DATE_ADD( CURDATE( ) , INTERVAL 7 DAY ) , INTERVAL 14 DAY ) <= u.created
WHERE id IN ( 2024 )
AND c.user_id IS NULL
AND u.user_id IS NULL
First, try something like this using joins, which should be easier to read and (depending on the version of MySQL) faster:
SELECT DISTINCT users.*
FROM users
INNER JOIN accounts ON users.id = accounts.user_id
LEFT OUTER JOIN coupon_used ON users.id = coupon_used.user_id AND coupon_used.code = 'XXXXX'
LEFT OUTER JOIN `used` ON users.id = `used`.user_id AND DATE_SUB( DATE_ADD( CURDATE( ) , INTERVAL 7 DAY ) , INTERVAL 14 DAY ) <= `used`.created
WHERE id IN ( 2024 )
AND coupon_used.user_id IS NULL
AND `used`.user_id IS NULL
EDIT - Simplifying the date check:-
SELECT DISTINCT users.*
FROM users
INNER JOIN accounts ON users.id = accounts.user_id
LEFT OUTER JOIN coupon_used ON users.id = coupon_used.user_id AND coupon_used.code = 'XXXXX'
LEFT OUTER JOIN `used` ON users.id = `used`.user_id AND DATE_SUB( CURDATE( ) , INTERVAL 7 DAY ) <= `used`.created
WHERE id IN ( 2024 )
AND coupon_used.user_id IS NULL
AND `used`.user_id IS NULL
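The same conditions can also be written with EXISTS / NOT EXISTS, which is a sketch of an alternative rather than a replacement for the joins above. It needs no DISTINCT, because a user with several accounts rows is still returned once, and unlike NOT IN it is not thrown off by NULL user_id values in the subquery. The seven-day window is taken from the simplified date check; whether this form is faster depends on the MySQL version and the available indexes.

```sql
SELECT users.*
FROM users
WHERE users.id IN (2024)
  -- must have an account
  AND EXISTS (
      SELECT 1 FROM accounts a
      WHERE a.user_id = users.id
  )
  -- must not have used this coupon code
  AND NOT EXISTS (
      SELECT 1 FROM coupon_used c
      WHERE c.user_id = users.id
        AND c.code = 'XXXXX'
  )
  -- must not appear in `used` within the last 7 days
  AND NOT EXISTS (
      SELECT 1 FROM `used` u
      WHERE u.user_id = users.id
        AND u.created >= DATE_SUB(CURDATE(), INTERVAL 7 DAY)
  );
```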