I upgraded our MySQL database from 5.6 to 5.7 and am in the process of fixing some queries that now throw errors. One of the queries I am working on involves a GROUP BY with a COALESCE.
Here is the query (abstracted) that works:
SELECT
MAX(a.id),
a.entered,
count(*) AS teh_count
FROM
a
INNER JOIN
b ON b.id = a.link_to_b_id
INNER JOIN
c ON c.link_to_b_id = b.id
WHERE
b.revision_id > 0
AND
c.terminated_at = '0000-00-00 00:00:00'
AND
a.created_at > date_sub(NOW(), INTERVAL 8 HOUR)
GROUP BY
a.entered
ORDER BY
teh_count DESC
LIMIT
6;
But I need to COALESCE a.entered with c.override, so I tried the following:
SELECT
MAX(a.id),
a.entered,
COALESCE(c.override, a.entered) AS appearance,
count(*) AS teh_count
FROM
a
INNER JOIN
b ON b.id = a.link_to_b_id
INNER JOIN
c ON c.link_link_to_b_id = b.id
WHERE
b.revision_id > 0
AND
c.terminated_at = '0000-00-00 00:00:00'
AND
a.created_at > date_sub(NOW(), INTERVAL 8 HOUR)
GROUP BY
a.entered
ORDER BY
teh_count DESC
LIMIT
6;
But MySQL 5.7 now throws the following error: Expression #2 of SELECT list is not in GROUP BY clause and contains nonaggregated column 'st_core.tuc.code_appearance_override' which is not functionally dependent on columns in GROUP BY clause; this is incompatible with sql_mode=only_full_group_by
I assume I can change the sql_mode, but I'd prefer not to. What the error is telling me makes sense, in that the COALESCE column is not aggregated, so as a test I wrapped it with MAX and it works; however, it seems kind of hacky to me.
Is there a more elegant solution?
You should also include the COALESCE expression in your GROUP BY clause; that's what the error is saying. Though I'm not sure why you are grouping by a different column, a.code_entered?
Your query should look like
SELECT
MAX(a.id),
a.entered,
COALESCE(c.override, a.entered) AS appearance,
count(*) AS teh_count
FROM
a
INNER JOIN
b ON b.id = a.link_to_b_id
INNER JOIN
c ON c.link_link_to_b_id = b.id
WHERE
b.revision_id > 0
AND
c.terminated_at = '0000-00-00 00:00:00'
AND
a.created_at > date_sub(NOW(), INTERVAL 8 HOUR)
GROUP BY
a.entered,
COALESCE(c.override, a.entered)
ORDER BY
teh_count DESC
LIMIT
6;
I think you intend something like this:
SELECT MAX(a.id),
COALESCE(c.override, a.entered) AS appearance,
count(*) AS the_count
FROM a INNER JOIN
b
ON b.id = a.link_to_b_id INNER JOIN
c
ON c.link_link_to_b_id = b.id
WHERE b.revision_id > 0 AND
c.terminated_at = '0000-00-00 00:00:00' AND
a.created_at > date_sub(NOW(), INTERVAL 8 HOUR)
GROUP BY appearance
ORDER BY the_count DESC
LIMIT 6;
This removes a.entered from the SELECT list so there is only one column for grouping. That column can be referenced by its column alias in the GROUP BY.
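To see the grouping behave as described, here is a minimal sketch that runs the same shape of query against an in-memory SQLite database from Python. SQLite is only a stand-in for MySQL here, table b is left out for brevity, and the sample rows are made up. It repeats the COALESCE expression in the GROUP BY, which should be accepted regardless of sql_mode.

```python
import sqlite3

# In-memory database; tables and rows are hypothetical sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (id INTEGER PRIMARY KEY, entered TEXT, link_to_b_id INTEGER);
    CREATE TABLE c (link_to_b_id INTEGER, override TEXT);
    INSERT INTO a VALUES (1, 'X', 10), (2, 'X', 11), (3, 'Y', 12), (4, 'Z', 13);
    INSERT INTO c VALUES (10, NULL), (11, 'Y'), (12, NULL), (13, NULL);
""")

# Group by the same expression that is selected, so every selected
# column is either aggregated or part of the grouping key.
rows = conn.execute("""
    SELECT MAX(a.id) AS max_id,
           COALESCE(c.override, a.entered) AS appearance,
           COUNT(*) AS the_count
    FROM a
    INNER JOIN c ON c.link_to_b_id = a.link_to_b_id
    GROUP BY COALESCE(c.override, a.entered)
    ORDER BY the_count DESC, appearance
""").fetchall()

print(rows)
```

Row 2 has entered = 'X' but an override of 'Y', so it is counted under 'Y' together with row 3.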
Related
Please see attached Tables image.
The question I have:
Find the top 5 occupations that borrowed the most in 2016
The code I have:
select c.occupation, count(*) no_mostborrow
from client c
Inner Join client c on c.clientID = b.clientID
where b.borrowDate >= '2016-01-01' and b.borrowDate < '2017-01-01'
group by c.clientoccupation, c.clientid
order by count(*) asc
limit 5
I feel like I am missing something here but I am not sure what. I am sure I am completely off. Thank you so much for your time.
To answer your question, you only want occupation in the group by. The join needs to be correct, and since you want the occupations that borrowed the most, the counts should be sorted in descending order:
select c.occupation, count(*) as no_mostborrow
from client c join
borrower b
on c.clientid = b.clientid
where b.borrowDate >= '2016-01-01' and b.borrowDate < '2017-01-01'
group by c.clientoccupation
order by count(*) desc
limit 5
I am trying to count views in a time interval. The query below works only without the second condition.
If I take out the AND s.timestamp < 2019-01-31, it works just fine.
SELECT s.category_id, c.name, count(s.category_id) AS ViewCount
FROM stat_product_category s
JOIN category c ON s.category_id = c.id
WHERE s.timestamp > 2019-01-01 AND s.timestamp < 2019-01-31
GROUP BY s.category_id
ORDER By ViewCount Desc
LIMIT 10
You need single quotes around the date values:
SELECT s.category_id, c.name, count(s.category_id) AS ViewCount
FROM stat_product_category s
JOIN category c ON s.category_id = c.id
WHERE s.timestamp > '2019-01-01' AND s.timestamp <'2019-01-31'
GROUP BY s.category_id ,c.name
ORDER By ViewCount Desc
LIMIT 10
Those aren't timestamps - without quotes, you just have a couple of ints you're subtracting. 2019-01-01 is evaluated as 2019-1-1, or 2017. 2019-01-31 is evaluated as 2019-1-31, or 1987. There is no number that's greater than 2017 but smaller than 1987, so you get no results. Surrounding these values with quotes will make them a string literal, and allow the database to perform an implicit conversion to a date:
SELECT s.category_id, c.name, count(s.category_id) AS ViewCount
FROM stat_product_category s
JOIN category c ON s.category_id = c.id
WHERE s.timestamp > '2019-01-01' AND s.timestamp < '2019-01-31'
-- Here ------------^----------^-------------------^----------^
GROUP BY s.category_id
ORDER By ViewCount Desc
LIMIT 10
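The arithmetic in this answer is easy to check. The sketch below uses an in-memory SQLite database from Python as a stand-in for MySQL (an assumption; the two engines compare mixed types differently in detail, but the unquoted values are plain integer subtraction in both). The stat table and its rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Unquoted, the "dates" are just integer subtraction.
unquoted = conn.execute("SELECT 2019-01-01, 2019-01-31").fetchone()

# Hypothetical sample rows with text timestamps.
conn.executescript("""
    CREATE TABLE stat (category_id INTEGER, ts TEXT);
    INSERT INTO stat VALUES
        (1, '2019-01-10 12:00:00'),
        (1, '2019-01-20 08:30:00'),
        (2, '2019-02-05 09:00:00');
""")

# Quoted literals are compared as dates/strings and match the January rows.
quoted_count = conn.execute(
    "SELECT COUNT(*) FROM stat WHERE ts > '2019-01-01' AND ts < '2019-01-31'"
).fetchone()[0]

# Unquoted literals ask for a value above 2017 and below 1987: no match.
unquoted_count = conn.execute(
    "SELECT COUNT(*) FROM stat WHERE ts > 2019-01-01 AND ts < 2019-01-31"
).fetchone()[0]

print(unquoted, quoted_count, unquoted_count)
```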
That script looks good; you just need to put quotes around your dates so that SQL knows where the values start and stop.
SELECT s.category_id, c.name, count(s.category_id) AS ViewCount
FROM stat_product_category s
JOIN category c ON s.category_id = c.id
WHERE s.timestamp > '2019-01-01' AND s.timestamp < '2019-01-31'
GROUP BY s.category_id
ORDER By ViewCount Desc
LIMIT 10
(tested & working!)
Good luck with the project!
The system that I'm currently working on is a legacy system. The first query has no issues by itself, but its result then goes through a loop that retrieves another value for each row using the result of the first query. I tried changing the query from "order by id" to "order by date", since I'm having issues with certain accounts when the table is ordered by id. I also tried to change the query because it's currently very slow. I combined both queries into one, but it takes a long time to execute.
How do I join the two queries together without hurting performance?
/* As mentioned, this query has no issues (first query) */
SELECT
DATE_FORMAT('2016-12-12 00:00:00', '%Y-%m-%d') AS date,
C.id AS account_id,
C.account_no,
C.account_name,
B.amount AS last_topup,
DATE_FORMAT('2016-12-12 23:59:59', '%Y-%m-%d') AS topup_date,
NULL AS balance
FROM (SELECT
account_id,
MAX(date) date
FROM table1
GROUP BY account_id) A
INNER JOIN table1 B
USING (account_id, date)
RIGHT JOIN table2 C ON B.account_id = C.id
ORDER BY C.account_no;
/* Loop query (second query)*/
SELECT
`t`.`balance_after` AS `balance`
FROM table3 `t`
WHERE `t`.`account_id` = '<id from the loop>' AND `t`.`date` <= '2017-07-26 23:59:59'
ORDER BY `t`.`date` DESC;
/* The query that combines both (takes a long time to execute) */
SELECT
DATE_FORMAT('2016-12-12 00:00:00', '%Y-%m-%d') AS date,
C.id AS account_id,
C.account_no,
C.account_name,
B.amount AS last_topup,
DATE_FORMAT('2016-12-12 23:59:59', '%Y-%m-%d') AS topup_date,
D.balance_after AS balance
FROM (SELECT
account_id,
MAX(date) date
FROM table1
GROUP BY account_id) A
INNER JOIN table1 B
USING (account_id, date)
RIGHT JOIN table2 C ON B.account_id = C.id
RIGHT JOIN table3 D ON B.account_id = D.account_id
WHERE D.date <= "2017-07-30 23:59:59"
ORDER BY C.account_no;
I have a calendar and user_result table and I need to join these two queries.
calendar query
SELECT `week`, `date`, `time`, COUNT(*) as count
FROM `calendar`
WHERE `week` = 1
GROUP BY `date`
ORDER BY `date` DESC
and the result is
{"week":"1","date":"2014-08-21","time":"15:30:00","count":"4"}, {"week":"1","date":"2014-08-20","time":"17:30:00","count":"12"}
user_result query
SELECT `date`, SUM(`point`) as score
FROM `user_result`
WHERE `user_id` = 1
AND `date` = '2014-08-20'
and the result is just score 3
My goal is to always show calendar even if the user isn't present in the user_result table, but if he is, SUM his points for that day where calendar.date = user_result.date. Result should be:
{"week":"1","date":"2014-08-21","time":"15:30:00","count":"4","score":"3"}, {"week":"1","date":"2014-08-20","time":"17:30:00","count":"12","score":"0"}
I have tried this query below, but the result is just one row and unexpected count
SELECT c.`week`, c.`date`, c.`time`, COUNT(*) as count, SUM(p.`point`) as score
FROM `calendar` c
INNER JOIN `user_result` p ON c.`date` = p.`date`
WHERE c.`week` = 1
AND p.`user_id` = 1
GROUP BY c.`date`
ORDER BY c.`date` DESC
{"week":"1","date":"2014-08-20","time":"17:30:00","count":"4","score":"9"}
SQL Fiddle
Sorry, I edited my answer after trying it on your SQL Fiddle. If you want to show every date from calendar, use a LEFT JOIN; if you only want the dates that appear in both calendar and result, use an INNER JOIN. Note that in this case the INNER JOIN returns 1 row and the LEFT JOIN returns 2 rows.
SELECT c.`week`, p.user_id, c.`date`, c.`time`, COUNT(*) as count, p.score
FROM `calendar` c
LEFT JOIN
(
SELECT `date`, SUM(`point`) score, user_id
FROM `result`
group by `date`
) p ON c.`date` = p.`date`
WHERE c.`week` = 1
GROUP BY c.`date`
ORDER BY c.`date` DESC
I put a pre-aggregate query (grouped by date) as a subselect for the one person you were interested in, then did a LEFT JOIN to it. Also, your column names week, date and time are (IMO) poor choices, as they look very close to reserved keywords in MySQL. They are not reserved, but they could be confusing.
SELECT
c.week,
c.date,
c.time,
coalesce( OnePerson.PointEntries, 0 ) as count,
coalesce( OnePerson.totPoints, 0 ) as score
FROM
calendar c
LEFT JOIN ( select
r.week,
r.date,
COUNT(*) as PointEntries,
SUM( r.point ) as totPoints
from
result r
where
r.week = 1
AND r.user_id = 1
group by
r.week,
r.date ) OnePerson
ON c.week = OnePerson.week
AND c.date = OnePerson.date
WHERE
c.week = 1
GROUP BY
c.date
ORDER BY
c.date DESC
Posted code to SQLFiddle
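For anyone who wants to try the pre-aggregate idea without the fiddle, here is a small sketch against an in-memory SQLite database from Python. SQLite stands in for MySQL, the rows are invented, and slot_count is a hypothetical alias used in place of count. The points are summed per date before the join, so the calendar count is not multiplied by the number of point rows, and dates without points fall back to 0.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE calendar (week INTEGER, date TEXT, time TEXT);
    CREATE TABLE user_result (user_id INTEGER, date TEXT, point INTEGER);
    INSERT INTO calendar VALUES
        (1, '2014-08-20', '17:30:00'),
        (1, '2014-08-20', '17:30:00'),
        (1, '2014-08-21', '15:30:00');
    INSERT INTO user_result VALUES
        (1, '2014-08-20', 1),
        (1, '2014-08-20', 2),
        (2, '2014-08-21', 5);
""")

# Sum the points for user 1 per date first, then LEFT JOIN so every
# calendar date is kept even when the user has no points that day.
rows = conn.execute("""
    SELECT c.week,
           c.date,
           MIN(c.time) AS time,
           COUNT(*) AS slot_count,
           COALESCE(MAX(p.score), 0) AS score
    FROM calendar c
    LEFT JOIN (
        SELECT date, SUM(point) AS score
        FROM user_result
        WHERE user_id = 1
        GROUP BY date
    ) p ON p.date = c.date
    WHERE c.week = 1
    GROUP BY c.week, c.date
    ORDER BY c.date DESC
""").fetchall()

print(rows)
```

With an INNER JOIN on the raw user_result rows instead, 2014-08-21 would disappear and the count for 2014-08-20 would be doubled.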
I have many tables that log the users' actions on a forum; each log event has its date.
I need a query that gives me all the users that weren't active during the last year.
I have the following query (working query):
SELECT *
FROM (questions AS q
INNER JOIN Answers AS a
INNER JOIN bestAnswerByPoll AS p
INNER JOIN answerThumbRank AS t
INNER JOIN notes AS n
INNER JOIN interestingQuestion AS i ON q.user_id = a.user_id
AND a.user_id = p.user_id
AND p.user_id = t.user_id
AND t.user_id = n.user_id
AND n.user_id = i.user_id)
WHERE DATEDIFF(CURDATE(),q.date)>365
AND DATEDIFF(CURDATE(),a.date)>365
AND DATEDIFF(CURDATE(),p.date)>365
AND DATEDIFF(CURDATE(),t.date)>365
AND DATEDIFF(CURDATE(),n.date)>365
AND DATEDIFF(CURDATE(),i.date)>365
What I'm doing in that query is joining all the tables on the user id and then checking each date column individually to see if it's been more than a year.
I was wondering if there is a way to make it simpler, something like finding the max of all the dates (the latest date) and comparing just that one to the current date.
If you want to get best performance, you cannot use greatest(). Instead do something like this:
SELECT *
FROM questions q
JOIN Answers a ON q.user_id = a.user_id
JOIN bestAnswerByPoll p ON a.user_id = p.user_id
JOIN answerThumbRank t ON p.user_id = t.user_id
JOIN notes n ON t.user_id = n.user_id
JOIN interestingQuestion i ON n.user_id = i.user_id
WHERE q.date < curdate() - interval 1 year
AND a.date < curdate() - interval 1 year
AND p.date < curdate() - interval 1 year
AND t.date < curdate() - interval 1 year
AND n.date < curdate() - interval 1 year
AND i.date < curdate() - interval 1 year
You want to avoid datediff() so that MySQL can do index lookups on the date column comparisons. To make sure the index lookup works, you should create a compound (multi-column) index on (user_id, date) for each of your tables.
In this compound index, the first part (user_id) will be used for faster joins, and the second part (date) will be used for faster date comparisons. If you replace * in your SELECT * with only the columns mentioned above (like user_id only), you might be able to get index-only scans, which will be super-fast.
UPDATE: Unfortunately, MySQL before 8.0 does not support the WITH clause for common table expressions the way PostgreSQL and some other databases do. But you can still factor out the common expression as follows:
SELECT *
FROM questions q
JOIN Answers a ON q.user_id = a.user_id
JOIN bestAnswerByPoll p ON a.user_id = p.user_id
JOIN answerThumbRank t ON p.user_id = t.user_id
JOIN notes n ON t.user_id = n.user_id
JOIN interestingQuestion i ON n.user_id = i.user_id,
(SELECT curdate() - interval 1 year AS year_ago) x
WHERE q.date > x.year_ago
AND a.date > x.year_ago
AND p.date > x.year_ago
AND t.date > x.year_ago
AND n.date > x.year_ago
AND i.date > x.year_ago
In MySQL, you can use the greatest() function:
WHERE DATEDIFF(CURDATE(), greatest(q.date, a.date, p.date, t.date, n.date, i.date)) > 365
This will help with readability. It would not affect performance.
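As a quick sanity check that the single greatest() comparison selects the same rows as the per-column checks, here is a sketch using SQLite from Python. This rests on a few assumptions: SQLite's multi-argument max() stands in for MySQL's GREATEST(), a fixed cutoff replaces CURDATE() so the result is repeatable, and the activity table (one row per already-joined user) is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE activity (user_id INTEGER, q_date TEXT, a_date TEXT);
    INSERT INTO activity VALUES
        (1, '2022-05-01', '2023-01-01'),
        (2, '2022-05-01', '2024-06-01'),
        (3, '2024-03-01', '2024-02-01');
""")

cutoff = "2024-01-01"  # stands in for "one year ago"

# Every date column must be older than the cutoff.
per_column = conn.execute(
    "SELECT user_id FROM activity WHERE q_date < ? AND a_date < ? ORDER BY user_id",
    (cutoff, cutoff),
).fetchall()

# Only the latest date needs to be older than the cutoff.
latest_only = conn.execute(
    "SELECT user_id FROM activity WHERE max(q_date, a_date) < ? ORDER BY user_id",
    (cutoff,),
).fetchall()

print(per_column, latest_only)
```

One difference to keep in mind on the MySQL side: GREATEST() returns NULL as soon as any argument is NULL, so a row with a missing date would be filtered out.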