insert a where condition in an aggregated sql statement - mysql

I've been trying to put a "where" condition in my aggregated SQL statement.
I want to get the sum of qty and display only the products with 30 or fewer (including zero) units sold.
I tried putting the condition in the "on" part, but to no avail:
SELECT p.id, p.names,p.image, coalesce(sum(o.qty), 0) as sum_product_qty
from products p left join orders_item o on p.id = o.product_id
and date(o.datesales) <= curdate() and date(o.datesales) >= curdate() - interval 6 day and
sum_product_qty <= 30 group by p.id, p.names order by sum_product_qty desc
This is my SQL statement without the condition:
SELECT p.id, p.names,p.image, coalesce(sum(o.qty), 0) as sum_product_qty
from products p left join orders_item o on p.id = o.product_id
and date(o.datesales) <= curdate() and date(o.datesales) >= curdate() - interval 6 day
group by p.id, p.names order by sum_product_qty desc

Use HAVING to filter your aggregated groups.
SELECT p.id,
p.names,
p.image,
COALESCE(SUM(o.qty), 0) AS sum_product_qty
FROM products p
LEFT JOIN orders_item o
ON p.id = o.product_id
AND date(o.datesales) <= curdate()
AND date(o.datesales) >= curdate() - interval 6 DAY
GROUP BY p.id,
p.names,
p.image
HAVING COALESCE(SUM(o.qty), 0) <= 30
ORDER BY COALESCE(SUM(o.qty), 0) DESC

Try using a having clause instead.
SELECT p.id, p.names,p.image, coalesce(sum(o.qty), 0) as sum_product_qty
from products p left join orders_item o on p.id = o.product_id
and date(o.datesales) <= curdate() and date(o.datesales) >= curdate() - interval 6 day
group by p.id, p.names, p.image having sum_product_qty <= 30 order by sum_product_qty desc;
A HAVING clause specifies that a SELECT statement should return only the groups whose aggregate values meet the given conditions. It was added to SQL because the WHERE keyword cannot be used with aggregate functions.
WHERE is evaluated row by row, before the rows are grouped, so an aggregate such as SUM() does not exist yet at that point and referencing it there raises an error.
HAVING is evaluated after GROUP BY, so it filters whole groups rather than individual rows.
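The difference between filtering rows and filtering groups can be tried out with a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in for MySQL, leaves out the date filter, and fills the tables with made-up sample rows; the function name low_selling_products is hypothetical.

```python
import sqlite3

def low_selling_products(limit_qty=30):
    """Return (id, names, summed qty) for products whose total qty is <= limit_qty."""
    conn = sqlite3.connect(":memory:")
    # Made-up sample data: apple sold 45 in total, pear sold 10, plum sold nothing.
    conn.executescript("""
        CREATE TABLE products (id INTEGER PRIMARY KEY, names TEXT);
        CREATE TABLE orders_item (product_id INTEGER, qty INTEGER);
        INSERT INTO products VALUES (1, 'apple'), (2, 'pear'), (3, 'plum');
        INSERT INTO orders_item VALUES (1, 20), (1, 25), (2, 10);
    """)
    rows = conn.execute("""
        SELECT p.id, p.names, COALESCE(SUM(o.qty), 0) AS sum_product_qty
        FROM products p
        LEFT JOIN orders_item o ON p.id = o.product_id
        GROUP BY p.id, p.names
        HAVING COALESCE(SUM(o.qty), 0) <= ?   -- applied to each group, after aggregation
        ORDER BY sum_product_qty DESC, p.id
    """, (limit_qty,)).fetchall()
    conn.close()
    return rows

print(low_selling_products())
```

With this sample data the apple group is dropped because its total is 45, while plum is kept with a total of 0 thanks to the LEFT JOIN and COALESCE.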

Related

SQL Condition in ORDER BY clause

I have this query here:
SELECT posts.id, username, cover, audio, title, postDate, commentsDisabled,
MAX(postClicks.clickDate) as clickDate,
COUNT(*) as ClickCount
FROM postClicks INNER JOIN
posts
ON posts.id = postClicks.postid INNER JOIN
users
ON users.id = posts.user
WHERE posts.private = 0
GROUP BY postClicks.postid
ORDER BY ClickCount DESC
LIMIT 5
This query gets me the top 5 results, ordered by the count, which is ClickCount. Each postClicks row in my database has a clickDate. What I am trying to do now is take the 5 results I get back and put them in order by their ClickCount within the past 24 hours. I still need 5 results, but they need to be ordered by the ClickCount within the 24-hour period.
I used to have this in the where clause:
postClicks.clickDate > DATE_SUB(CURDATE(), INTERVAL 1 DAY)
But once the 24-hour period had passed I would not get 5 results, and I need to get 5 results.
My question is, can I put a condition or case in my order by clause?
You cannot put the condition in the ORDER BY of this query, because that would change which rows the LIMIT keeps. Instead, pick the top 5 in a subquery and reorder them in the outer query:
SELECT pc5.*
FROM (SELECT posts.id, username, cover, audio, title, postDate, commentsDisabled,
MAX(postClicks.clickDate) as clickDate,
COUNT(*) as ClickCount,
SUM(postClicks.clickDate > DATE_SUB(CURDATE(), INTERVAL 1 DAY)) as clicks24hours
FROM postClicks INNER JOIN
posts
ON posts.id = postClicks.postid INNER JOIN
users
ON users.id = posts.user
WHERE posts.private = 0
GROUP BY postClicks.postid
ORDER BY ClickCount DESC
LIMIT 5
) pc5
ORDER BY clicks24hours DESC;
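A scaled-down sketch of the same pattern, assuming SQLite (via Python's sqlite3) in place of MySQL, a single made-up postClicks table, and a fixed cutoff string instead of DATE_SUB(CURDATE(), INTERVAL 1 DAY). The function name and the sample rows are hypothetical.

```python
import sqlite3

def top_posts_by_recent_clicks(cutoff, top_n=2):
    """Pick the top_n posts by total clicks, then reorder them by clicks after cutoff."""
    conn = sqlite3.connect(":memory:")
    # Made-up data: post 1 has 3 old clicks, post 2 has 2 recent clicks, post 3 has 1.
    conn.executescript("""
        CREATE TABLE postClicks (postid INTEGER, clickDate TEXT);
        INSERT INTO postClicks VALUES
            (1, '2024-01-01'), (1, '2024-01-02'), (1, '2024-01-03'),
            (2, '2024-01-09'), (2, '2024-01-10'),
            (3, '2024-01-10');
    """)
    rows = conn.execute("""
        SELECT pc.*
        FROM (SELECT postid,
                     COUNT(*) AS ClickCount,
                     SUM(clickDate > ?) AS recentClicks  -- a comparison yields 0 or 1
              FROM postClicks
              GROUP BY postid
              ORDER BY ClickCount DESC
              LIMIT ?) AS pc
        ORDER BY recentClicks DESC
    """, (cutoff, top_n)).fetchall()
    conn.close()
    return rows

print(top_posts_by_recent_clicks('2024-01-08'))
```

The inner query decides which posts survive the LIMIT using the all-time count; the outer ORDER BY only rearranges those survivors, so the number of rows returned does not depend on recent activity.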

MySQL 5.7 | GROUP BY | Non Aggregated Column Error

I upgraded our mysql db from 5.6 to 5.7 and am in the process of fixing some queries which are throwing some errors. One of the queries I am working involves a GROUP BY with a COALESCE.
Here is the query (abstracted) that works:
SELECT
MAX(a.id),
a.entered,
count(*) AS teh_count
FROM
a
INNER JOIN
b ON b.id = a.link_to_b_id
INNER JOIN
c ON c.link_to_b_id = b.id
WHERE
b.revision_id > 0
AND
c.terminated_at = '0000-00-00 00:00:00'
AND
a.created_at > date_sub(NOW(), INTERVAL 8 HOUR)
GROUP BY
a.entered
ORDER BY
teh_count DESC
LIMIT
6;
But I need to COALESCE a.entered with c.override, so I tried the following:
SELECT
MAX(a.id),
a.entered,
COALESCE(c.override, a.entered) AS appearance,
count(*) AS teh_count
FROM
a
INNER JOIN
b ON b.id = a.link_to_b_id
INNER JOIN
c ON c.link_link_to_b_id = b.id
WHERE
b.revision_id > 0
AND
c.terminated_at = '0000-00-00 00:00:00'
AND
a.created_at > date_sub(NOW(), INTERVAL 8 HOUR)
GROUP BY
a.entered
ORDER BY
teh_count DESC
LIMIT
6;
But MySQL 5.7 now throws the following error: Expression #2 of SELECT list is not in GROUP BY clause and contains nonaggregated column 'st_core.tuc.code_appearance_override' which is not functionally dependent on columns in GROUP BY clause; this is incompatible with sql_mode=only_full_group_by
I assume I can change the sql_mode, but I'd prefer not to. What the error is telling me makes sense, in that the COALESCE column is not aggregated, so as a test I wrapped it with MAX and it works. However, it seems kind of hacky to me.
Is there a more elegant solution?
You should also include the COALESCE(c.override, a.entered) expression in your GROUP BY clause; that is what the error is saying. Though I'm not sure why you are grouping by a different column, a.code_entered.
Your query should look like
SELECT
MAX(a.id),
a.entered,
COALESCE(c.override, a.entered) AS appearance,
count(*) AS teh_count
FROM
a
INNER JOIN
b ON b.id = a.link_to_b_id
INNER JOIN
c ON c.link_link_to_b_id = b.id
WHERE
b.revision_id > 0
AND
c.terminated_at = '0000-00-00 00:00:00'
AND
a.created_at > date_sub(NOW(), INTERVAL 8 HOUR)
GROUP BY
a.entered,
COALESCE(c.override, a.entered)
ORDER BY
teh_count DESC
LIMIT
6;
I think you intend something like this:
SELECT MAX(a.id),
COALESCE(c.override, a.entered) AS appearance,
count(*) AS the_count
FROM a INNER JOIN
b
ON b.id = a.link_to_b_id INNER JOIN
c
ON c.link_link_to_b_id = b.id
WHERE b.revision_id > 0 AND
c.terminated_at = '0000-00-00 00:00:00' AND
a.created_at > date_sub(NOW(), INTERVAL 8 HOUR)
GROUP BY appearance
ORDER BY the_count DESC
LIMIT 6;
This removes a.entered from the SELECT list so there is only one column for grouping. That column can be referenced by its column alias in the GROUP BY.
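To see the grouping on the coalesced value in isolation, here is a minimal sketch. It assumes SQLite (through Python's sqlite3) rather than MySQL and collapses the joined tables into one hypothetical table named entries with made-up rows. It repeats the expression in GROUP BY instead of using the alias, which also works in databases that do not accept aliases there.

```python
import sqlite3

def count_by_appearance():
    """Group rows by COALESCE(override, entered) and count each group."""
    conn = sqlite3.connect(":memory:")
    # Made-up rows: ids 1, 2 and 4 all resolve to 'abc' once the override is applied.
    conn.executescript("""
        CREATE TABLE entries (id INTEGER PRIMARY KEY, entered TEXT, override TEXT);
        INSERT INTO entries (entered, override) VALUES
            ('abc', NULL), ('abd', 'abc'), ('xyz', NULL), ('abc', NULL);
    """)
    rows = conn.execute("""
        SELECT MAX(id),
               COALESCE(override, entered) AS appearance,
               COUNT(*) AS the_count
        FROM entries
        GROUP BY COALESCE(override, entered)  -- every non-aggregated SELECT item is grouped
        ORDER BY the_count DESC
    """).fetchall()
    conn.close()
    return rows

print(count_by_appearance())
```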

Getting order totals and sorting them using 1 query

I have the following query, but I am not getting the correct totals because GROUP BY does not seem to add up the total revenue for each user id.
SELECT
users.id,
Count(orders.id) AS total_orders,
SUM(item_orders.item_price * item_orders.quantity) AS total_rev
FROM
orders
LEFT JOIN
users ON orders.user_id = users.id
LEFT JOIN
item_orders ON item_orders.order_id = orders.id
WHERE
orders.date >= '2015-10-06 00:00:00' AND orders.date <= '2016-03-23 23:59:59'
GROUP BY
users.id -- ignores duplicate ids, doesn't sum total_rev for those
ORDER BY
total_rev DESC
LIMIT 0,25
I would like to use WITH ROLLUP to solve this but when I use WITH ROLLUP I can not use ORDER BY and I can't order with GROUP BY with an alias which contains my total per user id.
SELECT
users.id,
Count(orders.id) AS total_orders,
SUM(item_orders.item_price * item_orders.quantity) AS total_rev
FROM
orders
LEFT JOIN
users ON orders.user_id = users.id
LEFT JOIN
item_orders ON item_orders.order_id = orders.id
WHERE
orders.date >= '2015-10-06 00:00:00' AND orders.date <= '2016-03-23 23:59:59'
GROUP BY
users.id, total_rev DESC WITH ROLLUP -- does not work because total_rev is an alias!
LIMIT 0,25
Any suggestions on how I can get this to work? Does it make sense I do this in the DB? I am trying to save resources by letting the DB presort so I can do the paging in my application.
EDIT: As xjstratedgebx mentioned in the comments below the first query works fine and there is no reason for using WITH ROLLUP. The first query can also be shortened like so:
SELECT
orders.user_id,
Count(orders.id) AS total_orders,
SUM(item_orders.item_price * item_orders.quantity) AS total_rev
FROM
orders
LEFT JOIN
item_orders ON item_orders.order_id = orders.id
WHERE
orders.date >= '2015-10-06 00:00:00' AND orders.date <= '2016-03-23 23:59:59'
GROUP BY
orders.user_id
ORDER BY
total_rev DESC
LIMIT 0,25

finding closest date from multiple tables mysql

I have many tables that log the users' actions on a forum; each log event has its date.
I need a query that gives me all the users that weren't active during the last year.
I have the following query (working query):
SELECT *
FROM (questions AS q
INNER JOIN Answers AS a
INNER JOIN bestAnswerByPoll AS p
INNER JOIN answerThumbRank AS t
INNER JOIN notes AS n
INNER JOIN interestingQuestion AS i ON q.user_id = a.user_id
AND a.user_id = p.user_id
AND p.user_id = t.user_id
AND t.user_id = n.user_id
AND n.user_id = i.user_id)
WHERE DATEDIFF(CURDATE(),q.date)>365
AND DATEDIFF(CURDATE(),a.date)>365
AND DATEDIFF(CURDATE(),p.date)>365
AND DATEDIFF(CURDATE(),t.date)>365
AND DATEDIFF(CURDATE(),n.date)>365
AND DATEDIFF(CURDATE(),i.date)>365
What I'm doing in that query is joining all the tables on the user id and then checking each
date column individually to see if it's been more than a year.
I was wondering if there is a way to make it simpler, something like finding the max of all the dates (the latest date) and comparing just that one to the current date.
If you want the best performance, avoid greatest(), because a function wrapped around the date columns cannot use their indexes. Instead do something like this:
SELECT *
FROM questions q
JOIN Answers a ON q.user_id = a.user_id
JOIN bestAnswerByPoll p ON a.user_id = p.user_id
JOIN answerThumbRank t ON p.user_id = t.user_id
JOIN notes n ON t.user_id = n.user_id
JOIN interestingQuestion i ON n.user_id = i.user_id
WHERE q.date < curdate() - interval 1 year
AND a.date < curdate() - interval 1 year
AND p.date < curdate() - interval 1 year
AND t.date < curdate() - interval 1 year
AND n.date < curdate() - interval 1 year
AND i.date < curdate() - interval 1 year
You want to avoid datediff() so that MySQL can use index lookups for the date column comparisons. To make sure the index lookups work, create a compound (multi-column) index on (user_id, date) for each of your tables.
In this compound index, the first part (user_id) will be used for faster joins, and the second part (date) will be used for faster date comparisons. If you replace * in your SELECT * with only the indexed columns (for example user_id only), you might be able to get index-only scans, which will be very fast.
UPDATE: MySQL versions before 8.0 do not support the WITH clause for common table expressions, unlike PostgreSQL and some other databases. You can still factor out the common expression as follows:
SELECT *
FROM questions q
JOIN Answers a ON q.user_id = a.user_id
JOIN bestAnswerByPoll p ON a.user_id = p.user_id
JOIN answerThumbRank t ON p.user_id = t.user_id
JOIN notes n ON t.user_id = n.user_id
JOIN interestingQuestion i ON n.user_id = i.user_id,
(SELECT curdate() - interval 1 year AS year_ago) x
WHERE q.date < x.year_ago
AND a.date < x.year_ago
AND p.date < x.year_ago
AND t.date < x.year_ago
AND n.date < x.year_ago
AND i.date < x.year_ago
In MySQL, you can use the greatest() function:
WHERE DATEDIFF(CURDATE(), greatest(q.date, a.date, p.date, t.date, n.date, i.date)) > 365
This will help with readability. It would not affect performance.

MySQL Join with conditions on join on the fly

I need to know how to do this on the fly.
For example, I have customers who each have invoices with different due dates. I want to select the MAX (most recent) due date in the LEFT JOIN. Currently, when it joins the two tables, it selects the oldest due date, which is not what I want.
SELECT c.customerid, i.datedue
FROM customers c
LEFT JOIN invoice i
ON i.customerid = c.customerid
WHERE i.datedue <= UNIX_TIMESTAMP()
AND c.status!='d'
GROUP BY i.customerid
ORDER BY i.datedue DESC
LIMIT 0, 1000
You need to use the max() function, group by the customer, and order by the aggregated value:
SELECT c.customerid, MAX(i.datedue) AS latest_datedue
FROM customers c LEFT JOIN invoice i ON i.customerid = c.customerid
WHERE i.datedue <= UNIX_TIMESTAMP() and c.status!='d'
GROUP BY c.customerid
ORDER BY latest_datedue DESC
LIMIT 0,1000
This will give you the maximum datedue for each customer.
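A minimal sketch of the per-customer maximum, assuming SQLite (through Python's sqlite3) in place of MySQL, made-up rows, and a now_ts parameter in place of UNIX_TIMESTAMP(). One deliberate difference from the query above, and an assumption on my part about what is wanted: the due-date filter sits in the ON clause, so customers without a matching invoice still appear, with NULL as their latest due date. The function name and the alias latest_datedue are hypothetical.

```python
import sqlite3

def latest_due_dates(now_ts):
    """Return (customerid, latest datedue <= now_ts) for every non-deleted customer."""
    conn = sqlite3.connect(":memory:")
    # Made-up data: customer 1 has three invoices, customer 2 has none, customer 3 is deleted.
    conn.executescript("""
        CREATE TABLE customers (customerid INTEGER PRIMARY KEY, status TEXT);
        CREATE TABLE invoice (customerid INTEGER, datedue INTEGER);
        INSERT INTO customers VALUES (1, 'a'), (2, 'a'), (3, 'd');
        INSERT INTO invoice VALUES (1, 100), (1, 300), (1, 900), (3, 200);
    """)
    rows = conn.execute("""
        SELECT c.customerid, MAX(i.datedue) AS latest_datedue
        FROM customers c
        LEFT JOIN invoice i
               ON i.customerid = c.customerid
              AND i.datedue <= ?        -- filter in ON keeps invoice-less customers
        WHERE c.status != 'd'
        GROUP BY c.customerid
        ORDER BY latest_datedue DESC
    """, (now_ts,)).fetchall()
    conn.close()
    return rows

print(latest_due_dates(500))
```

If customers without invoices should be left out, keeping the date condition in WHERE, as in the answer above, has that effect.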