Yes, this is an assignment. The task was to output two columns, 'first name' and 'last name', subject to the condition:
-A ∪ (B ∩ -C ∩ -(A ∩ -(B ∪ D)))
A: All consumers that didn't shop on Monday and Friday
(time_by_day.the_day)
B: All consumers who bought 'Non-Consumable'
(product_class.product_family)
C: All consumers who bought more than 10 items
(sales_fact_1997.unit_sales) at one time (sales_fact_1997.time_id)
D: Female consumers from Canada (consumer.gender, consumer.country)
This is what I have so far:
SELECT
c.fname,
c.lname
FROM
customer AS c
INNER JOIN sales_fact_1997 AS s ON c.customer_id = s.customer_id
INNER JOIN time_by_day AS t ON s.time_id = t.time_id
INNER JOIN product AS p ON s.product_id = p.product_id
INNER JOIN product_class AS pc ON p.product_class_id = pc.product_class_id
Where
NOT t.the_day in ('Monday', 'Friday') OR
(
pc.product_family = 'Non-Consumable' AND
NOT SUM(s.unit_sales) > 10 AND
NOT (
t.the_day in ('Monday', 'Friday') AND
NOT (
pc.product_family = 'Non-Consumable' OR
(c.country = 'Canada' AND c.gender = 'F')
)
)
)
GROUP BY concat(c.customer_id, s.time_id)
That ended up with an error
#1111 - Invalid use of group function
But I don't know which part of the code is wrong. I'm fairly sure it's the WHERE part, but I don't know what I did wrong.
Condition C is where I'm really struggling. I managed just fine writing a query for C on its own:
SELECT
t.time_id,
c.customer_id,
c.fullname,
round(SUM(s.unit_sales),0) as tot
FROM
customer as c
INNER JOIN sales_fact_1997 as s ON c.customer_id = s.customer_id
INNER JOIN time_by_day as t on s.time_id=t.time_id
GROUP BY concat(c.customer_id, s.time_id)
ORDER BY c.customer_id, t.time_id
But incorporating it into the main query is hard for me.
From reading online, I assume that I should probably use HAVING instead of WHERE.
I would really appreciate it if someone can point me in the right direction.
This is the database that I used.
C: All consumers who bought more than 10 items
(sales_fact_1997.unit_sales) at one time (sales_fact_1997.time_id)
You should use COUNT not SUM.
SELECT time_id,
count(*)
FROM sales_fact_1997
GROUP BY time_id
HAVING COUNT(*) > 10; -- "more than 10", so strictly greater
The count(*) column in the SELECT list is not needed; I left it in just to show the results.
Can you try this and see if it helps:
SELECT c.lname,
c.fname
FROM customer c
INNER JOIN
(
SELECT time_id,customer_id,product_id
FROM sales_fact_1997
GROUP BY time_id,customer_id,product_id
HAVING COUNT(*) > 10
) as s on c.customer_id=s.customer_id
INNER JOIN
(
SELECT time_id,the_day
FROM time_by_day
WHERE the_day
NOT IN ('Monday','Friday')
) as t on s.time_id=t.time_id
INNER JOIN
(
SELECT product_family,product_id
FROM product_class
INNER JOIN product
on product_class.product_class_id=product.product_class_id
WHERE product_family='Non-Consumable'
) pc on s.product_id=pc.product_id
where c.country='Canada' and c.gender ='F' ;
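The #1111 error in the question comes from calling an aggregate inside WHERE, which is evaluated before any grouping happens. Here is a minimal sketch of the difference, using Python's built-in sqlite3 as a stand-in for MySQL and a tiny made-up sales_fact_1997 table (the rows are invented for illustration). It uses SUM as in the question's own query for C; whether SUM or COUNT is the right aggregate depends on what unit_sales actually holds.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sales_fact_1997 (customer_id INTEGER, time_id INTEGER, unit_sales REAL);
INSERT INTO sales_fact_1997 VALUES
  (1, 100, 6), (1, 100, 7),   -- customer 1: 13 units at time 100
  (1, 101, 2),
  (2, 100, 4), (2, 102, 5);   -- customer 2 never exceeds 10 at one time
""")

# Aggregate conditions go in HAVING, which is evaluated after GROUP BY.
big_baskets = conn.execute("""
    SELECT customer_id, time_id, SUM(unit_sales) AS tot
    FROM sales_fact_1997
    GROUP BY customer_id, time_id
    HAVING SUM(unit_sales) > 10
""").fetchall()

# The same condition in WHERE is rejected, because WHERE runs before grouping.
try:
    conn.execute("SELECT customer_id FROM sales_fact_1997 WHERE SUM(unit_sales) > 10")
    where_error = None
except sqlite3.Error as exc:
    where_error = str(exc)

print(big_baskets)
print(where_error)
```

A grouped subquery like this can then be joined back to customer, so that condition C becomes an ordinary per-customer fact that can be negated or combined with the other conditions.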
I have a training_stats table (current due training) and I also have a completed_training table.
What I want to do is query due training with the last completed date from the completed table.
I've nearly got what I want: I get the due training, but each row is duplicated for every completed record (there are many completed records for each current due training), and I only want single rows with the latest completed date.
I've been trying to use MAX, and when I run the MAX query independently, I get the last record. But when the MAX query is in the join, it is returning all completed rows.
This is the query that I am using:
SELECT s.course_stat_id
,o.org_name
,u.id
,u.first_name
,u.last_name
,a.area_id
,a.area_name
,tc.course_id
,tc.course_name
,s.assigned_on
,s.due
,s.pass_mark
,s.completed_on
,completed.complete_training_id
,completed.complete_date
FROM training_stats s
JOIN organisations o ON o.org_id = s.org_id
LEFT JOIN (
SELECT complete_training_id
,user_id
,area_id
,course_id
,max(completed_on) AS complete_date
FROM completed_training
GROUP BY complete_training_id
) completed ON completed.user_id = s.user_id
AND completed.area_id = s.area_id
AND completed.course_id = s.course_id
LEFT JOIN users u ON u.id = s.user_id
LEFT JOIN areas a ON a.area_id = s.area_id
LEFT JOIN training_courses tc ON tc.course_id = s.course_id
WHERE u.active = 1
AND o.active = 1
AND s.assigned = 1
Can you see what I am doing wrong?
I'm not exactly sure of your expected results, but the failure is PROBABLY in your GROUP BY and JOIN. Your GROUP BY is ONLY on the training ID, but you are also pulling user, area and course, as well as the max completion date, for that training ID. Your GROUP BY and JOIN should match on the same unique characteristics.
Without seeing data, I interpret "complete_training_id" as an auto-increment column for that table. If so, there would only ever be one record per ID, and grouping by it collapses nothing.
However, the completed training table can hold, for a single user, area and course, multiple training dates, of which you want the most recent. For example, someone attending college may need to take the same computer class repeatedly as a refresher, so assume all attempts share the same course ID. A person could take it in 2012, 2014 and 2016. You would want the row for that user/area/course showing the 2016 training. So let's look at that first.
select
ct.user_id,
ct.area_id,
ct.course_id,
max(ct.completed_on) AS complete_date
FROM
completed_training ct
GROUP BY
ct.user_id,
ct.area_id,
ct.course_id
Now, for each user, area and course of study, I have one record with the most recent completion date. NOW let's pull the rest of the details. Since you need the completed training ID too, I applied MAX() to that as well in the query below. The ID should by default increase every time a new record is added, so a training completed a year ago would have a lower ID than one completed today. That way you get both the completed ID and its corresponding date for a given user, area and course.
SELECT
s.course_stat_id,
o.org_name,
u.id,
u.first_name,
u.last_name,
a.area_id,
a.area_name,
tc.course_id,
tc.course_name,
s.assigned_on,
s.due,
s.pass_mark,
s.completed_on,
ct.complete_training_id,
ct.complete_date
FROM
training_stats s
JOIN organisations o
ON s.org_id = o.org_id
AND o.active = 1
LEFT JOIN
( select
ct.user_id,
ct.area_id,
ct.course_id,
max(ct.complete_training_id ) as complete_training_id,
max(ct.completed_on) AS complete_date
FROM
completed_training ct
GROUP BY
ct.user_id,
ct.area_id,
ct.course_id ) ct
on s.user_id = ct.user_id
AND s.area_id = ct.area_id
AND s.course_id = ct.course_id
JOIN users u
ON s.user_id = u.id
AND u.active = 1
LEFT JOIN areas a
ON s.area_id = a.area_id
LEFT JOIN training_courses tc
ON s.course_id = tc.course_id
WHERE
s.assigned = 1
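To make the grouping point concrete, here is a small sketch run through Python's built-in sqlite3 as a stand-in for MySQL. The four rows are invented for illustration and mirror the 2012/2014/2016 example above. Grouping by the unique ID leaves every row in place, while grouping by the join keys yields one row per user, area and course.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE completed_training (
    complete_training_id INTEGER PRIMARY KEY,
    user_id INTEGER, area_id INTEGER, course_id INTEGER, completed_on TEXT);
INSERT INTO completed_training VALUES
  (1, 7, 1, 10, '2012-05-01'),
  (2, 7, 1, 10, '2014-05-01'),
  (3, 7, 1, 10, '2016-05-01'),
  (4, 8, 1, 10, '2015-03-15');
""")

# Grouping by the unique ID: every row is its own group, so nothing collapses.
per_id = conn.execute("""
    SELECT user_id, MAX(completed_on)
    FROM completed_training
    GROUP BY complete_training_id
    ORDER BY complete_training_id
""").fetchall()

# Grouping by the join keys: one row per user/area/course with the latest date.
latest = conn.execute("""
    SELECT user_id, area_id, course_id,
           MAX(complete_training_id) AS complete_training_id,
           MAX(completed_on) AS complete_date
    FROM completed_training
    GROUP BY user_id, area_id, course_id
    ORDER BY user_id
""").fetchall()

print(len(per_id), latest)
```

Note that taking MAX() of the ID and MAX() of the date separately only gives a matching pair if IDs grow in step with completion dates, which is the assumption stated above.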
I'm not 100% sure of that. First, run this query. It should list all completed training, with a rnk from 1 (latest) to n (oldest).
SELECT complete_training_id
,user_id
,area_id
,course_id
,completed_on AS complete_date
,@curRank := CASE WHEN @curKey = CONCAT_WS('-', user_id, area_id, course_id) THEN @curRank + 1 ELSE 1 END AS rnk
,@curKey := CONCAT_WS('-', user_id, area_id, course_id) AS cur_key
FROM completed_training, (SELECT @curRank := 0, @curKey := '') vars
ORDER BY user_id, area_id, course_id, completed_on DESC
If so, the answer is:
SELECT s.course_stat_id
,o.org_name
,u.id
,u.first_name
,u.last_name
,a.area_id
,a.area_name
,tc.course_id
,tc.course_name
,s.assigned_on
,s.due
,s.pass_mark
,s.completed_on
,completed.complete_training_id
,completed.complete_date
FROM training_stats s
JOIN organisations o ON o.org_id = s.org_id
LEFT JOIN (
SELECT complete_training_id
,user_id
,area_id
,course_id
,completed_on AS complete_date
,@curRank := CASE WHEN @curKey = CONCAT_WS('-', user_id, area_id, course_id) THEN @curRank + 1 ELSE 1 END AS rnk
,@curKey := CONCAT_WS('-', user_id, area_id, course_id) AS cur_key
FROM completed_training, (SELECT @curRank := 0, @curKey := '') vars
ORDER BY user_id, area_id, course_id, completed_on DESC
) completed ON completed.user_id = s.user_id and completed.rnk = 1
AND completed.area_id = s.area_id
AND completed.course_id = s.course_id
LEFT JOIN users u ON u.id = s.user_id
LEFT JOIN areas a ON a.area_id = s.area_id
LEFT JOIN training_courses tc ON tc.course_id = s.course_id
WHERE u.active = 1
AND o.active = 1
AND s.assigned = 1
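The same "rank the rows, keep rank 1" idea can also be written with a window function instead of user variables. This is a different technique from the one in the answer: ROW_NUMBER() is available in MySQL 8.0 and later, and the sketch below runs it through Python's built-in sqlite3 (which needs SQLite 3.25 or later for window functions). The sample rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE completed_training (
    complete_training_id INTEGER PRIMARY KEY,
    user_id INTEGER, area_id INTEGER, course_id INTEGER, completed_on TEXT);
INSERT INTO completed_training VALUES
  (1, 7, 1, 10, '2012-05-01'),
  (2, 7, 1, 10, '2014-05-01'),
  (3, 7, 1, 10, '2016-05-01'),
  (4, 8, 1, 10, '2015-03-15');
""")

# Rank completions within each user/area/course, newest first, then keep rank 1.
latest = conn.execute("""
    SELECT complete_training_id, user_id, completed_on
    FROM (
        SELECT ct.*,
               ROW_NUMBER() OVER (
                   PARTITION BY user_id, area_id, course_id
                   ORDER BY completed_on DESC
               ) AS rnk
        FROM completed_training ct
    ) AS ranked
    WHERE rnk = 1
    ORDER BY user_id
""").fetchall()

print(latest)
```

Because the ID and the date come from the same ranked row, this variant does not rely on IDs increasing with completion dates.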
The situation is as follows. I need to join an order table with a message table, but I'm only interested in the first message (lowest message ID) of each order. The tables are linked by the order ID.
$result = $this->db->executeS('
SELECT o.*, c.iso_code AS currency, s.name AS shippingMethod, m.message AS note
FROM '._DB_PREFIX_.'orders o
LEFT JOIN '._DB_PREFIX_.'currency c ON c.id_currency = o.id_currency
LEFT JOIN '._DB_PREFIX_.'message m ON m.id_order = o.id_order
LEFT JOIN '._DB_PREFIX_.'carrier s ON s.id_carrier = o.id_carrier
LEFT JOIN jtl_connector_link l ON o.id_order = l.endpointId AND l.type = 4
WHERE l.hostId IS NULL AND o.date_add BETWEEN DATE_SUB(NOW(), INTERVAL 1 WEEK) AND NOW()
GROUP BY o.id_order
HAVING MIN(m.id_message)
LIMIT '.$limit
);
This query works so far. But now orders without a message are missing.
Thank you for your help!
Markus
You want to select several orders and, for each order, its first message. This is generally awkward in MySQL versions before 8.0, which lack window functions (e.g. ROW_NUMBER() OVER). But since you are interested in just one column from the message table, you can use a subquery in the SELECT clause. Orders without a message then get NULL as the note instead of being dropped.
SELECT
o.*,
c.iso_code AS currency,
s.name AS shippingMethod,
(
SELECT m.message
FROM message m
WHERE m.id_order = o.id_order
ORDER BY m.id_message
LIMIT 1
) AS note
FROM orders o
JOIN currency c ON c.id_currency = o.id_currency
JOIN carrier s ON s.id_carrier = o.id_carrier
WHERE o.date_add BETWEEN DATE_SUB(NOW(), INTERVAL 1 WEEK) AND NOW()
AND NOT EXISTS
(
SELECT *
FROM jtl_connector_link l
WHERE l.endpointId = o.id_order
AND l.type = 4
);
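A stripped-down sketch of the scalar-subquery approach, run through Python's built-in sqlite3 as a stand-in for MySQL. Only the two columns that matter are kept, the table prefix is dropped, and the rows are invented. It shows that an order without messages still appears, with a NULL note.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (id_order INTEGER PRIMARY KEY);
CREATE TABLE message (id_message INTEGER PRIMARY KEY, id_order INTEGER, message TEXT);
INSERT INTO orders VALUES (1), (2);               -- order 2 has no messages
INSERT INTO message VALUES (10, 1, 'first'), (11, 1, 'second');
""")

# The correlated subquery runs once per order and returns at most one value,
# so it can neither duplicate nor remove rows of the outer query.
rows = conn.execute("""
    SELECT o.id_order,
           (SELECT m.message
            FROM message m
            WHERE m.id_order = o.id_order
            ORDER BY m.id_message
            LIMIT 1) AS note
    FROM orders o
    ORDER BY o.id_order
""").fetchall()

print(rows)
```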
I am retrieving the result from the four tables above using the following query:
SELECT
(SELECT SUM(CASE when c.Training_Id=1 then 1 else 0 end)
FROM courses c
INNER JOIN enrolled_students es
ON c.Course_Id = es.Course_Id
) STEM,
(SELECT SUM(CASE when c.Training_Id=2 then 1 else 0 end)
FROM courses c
INNER JOIN enrolled_students es ON c.Course_Id = es.Course_Id
) MA,
c.* FROM campus c;
The problem with this query is that two (2) students are in STEM and one (1) student is in MA for Campus_Id 3, but it repeats those counts for all campuses. If a campus has no students, I want it to show '0' (zero).
You need to filter your subselects by Campus_Id. But first you have to use distinct table aliases. Change your last line to ca.* FROM campus ca. Then you can use a where clause in your subselects (WHERE c.Campus_Id = ca.Campus_Id).
SELECT
(SELECT SUM(CASE when c.Training_Id=1 then 1 else 0 end)
FROM courses c
INNER JOIN enrolled_students es
ON c.Course_Id = es.Course_Id
WHERE c.Campus_Id = ca.Campus_Id -- line added
) STEM,
(SELECT SUM(CASE when c.Training_Id=2 then 1 else 0 end)
FROM courses c
INNER JOIN enrolled_students es ON c.Course_Id = es.Course_Id
WHERE c.Campus_Id = ca.Campus_Id -- line added
) MA,
ca.* FROM campus ca; -- line changed
This should solve your problem.
To improve the performance you can also filter your subselects by Training_Id. In the first subselect you only need the rows with Training_Id=1. So you can change your where clause to:
WHERE c.Campus_Id = ca.Campus_Id
AND c.Training_Id = 1
Doing that you can also use COUNT instead of SUM. So your subselect would look like:
SELECT COUNT(1)
FROM courses c
INNER JOIN enrolled_students es ON c.Course_Id = es.Course_Id
WHERE c.Campus_Id = ca.Campus_Id AND c.Training_Id = 1
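One practical difference between the two forms shows up for campuses with no matching rows: SUM over an empty set is NULL, while COUNT is 0. Here is a small sketch using Python's built-in sqlite3 as a stand-in for MySQL. The Student_Id column and all rows are assumptions made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE campus (Campus_Id INTEGER PRIMARY KEY);
CREATE TABLE courses (Course_Id INTEGER PRIMARY KEY, Campus_Id INTEGER, Training_Id INTEGER);
CREATE TABLE enrolled_students (Student_Id INTEGER, Course_Id INTEGER);  -- Student_Id is hypothetical
INSERT INTO campus VALUES (1), (3);                -- campus 1 has no courses
INSERT INTO courses VALUES (100, 3, 1), (101, 3, 2);
INSERT INTO enrolled_students VALUES (1, 100), (2, 100), (3, 101);
""")

rows = conn.execute("""
    SELECT ca.Campus_Id,
           (SELECT SUM(CASE WHEN c.Training_Id = 1 THEN 1 ELSE 0 END)
            FROM courses c
            INNER JOIN enrolled_students es ON c.Course_Id = es.Course_Id
            WHERE c.Campus_Id = ca.Campus_Id) AS stem_sum,
           (SELECT COUNT(*)
            FROM courses c
            INNER JOIN enrolled_students es ON c.Course_Id = es.Course_Id
            WHERE c.Campus_Id = ca.Campus_Id AND c.Training_Id = 1) AS stem_count
    FROM campus ca
    ORDER BY ca.Campus_Id
""").fetchall()

print(rows)
```

So the COUNT form gives the requested 0 for an empty campus directly, whereas the SUM form would need to be wrapped in COALESCE(..., 0).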
To prevent code duplication (your subselects are almost equal) you can join all needed tables and group by Campus_Id:
select
COUNT(CASE WHEN co.Training_Id = 1 THEN es.Course_Id END) STEM,
COUNT(CASE WHEN co.Training_Id = 2 THEN es.Course_Id END) MA,
ca.Campus_Id
from campus ca
left join courses co on co.Campus_Id = ca.Campus_Id
and co.Training_Id in (1, 2) -- filter in ON, not WHERE, so campuses without such courses are kept
left join enrolled_students es on es.Course_Id = co.Course_Id
group by ca.Campus_Id
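A sketch of the single-query variant, run through Python's built-in sqlite3 as a stand-in for MySQL, with invented rows and a hypothetical Student_Id column. In this sketch the Training_Id filter sits in the ON clause and the counts are taken over a column of enrolled_students, so that a campus with no courses, or a course with no students, contributes 0 rather than disappearing or being counted once.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE campus (Campus_Id INTEGER PRIMARY KEY);
CREATE TABLE courses (Course_Id INTEGER PRIMARY KEY, Campus_Id INTEGER, Training_Id INTEGER);
CREATE TABLE enrolled_students (Student_Id INTEGER, Course_Id INTEGER);  -- Student_Id is hypothetical
INSERT INTO campus VALUES (1), (2), (3);
INSERT INTO courses VALUES (100, 3, 1), (101, 3, 2), (102, 2, 1);  -- course 102 has no students
INSERT INTO enrolled_students VALUES (1, 100), (2, 100), (3, 101);
""")

# COUNT(expr) skips NULLs, so unmatched LEFT JOIN rows add nothing to the totals.
rows = conn.execute("""
    SELECT ca.Campus_Id,
           COUNT(CASE WHEN co.Training_Id = 1 THEN es.Course_Id END) AS STEM,
           COUNT(CASE WHEN co.Training_Id = 2 THEN es.Course_Id END) AS MA
    FROM campus ca
    LEFT JOIN courses co
           ON co.Campus_Id = ca.Campus_Id AND co.Training_Id IN (1, 2)
    LEFT JOIN enrolled_students es ON es.Course_Id = co.Course_Id
    GROUP BY ca.Campus_Id
    ORDER BY ca.Campus_Id
""").fetchall()

print(rows)
```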
I have four tables: person, loan, ca and payments.
I would like to get the sum of all payment amounts and cash advance amounts that have the same ID as a loan, joined with a person, for a specific date.
Here is my code, but the sums are calculated incorrectly:
SELECT pd.*,
l.total_loan_amount,
sum(c.ca_total_amount) AS ctot,
sum(p.payment_amount)
FROM personal_data pd
LEFT JOIN loans l
ON pd.id_personal_data = l.id_personal_data
LEFT JOIN ca c
ON l.id_loan = c.id_loan
LEFT JOIN payments p
ON l.id_loan = p.id_loan
WHERE l.loan_date = curDate()
AND (
c.ca_date = curDate()
OR c.ca_date IS NULL
)
AND (
p.payment_date = curDate()
OR p.payment_date IS NULL
)
GROUP BY pd.id_personal_data
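Doing that may return wrong results: joining both ca and payments to loans pairs every cash advance row with every payment row of the same loan, so each amount is counted several times before SUM is applied.
Try using a subquery that pre-aggregates each table you want a total from.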
SELECT pd.*,
l.total_loan_amount,
c.totalCA,
p.totalPayment
FROM personal_data pd
LEFT JOIN loans l
ON pd.id_personal_data = l.id_personal_data
LEFT JOIN
(
SELECT id_loan, SUM(ca_total_amount) totalCA
FROM ca
-- WHERE DATE(ca_date) = DATE(CURDATE()) OR
-- ca_date IS NULL
GROUP BY id_loan
) c ON l.id_loan = c.id_loan
LEFT JOIN
(
SELECT id_loan, SUM(payment_amount) totalPayment
FROM payments
-- WHERE DATE(payment_date) = DATE(CURDATE()) OR
-- payment_date IS NULL
GROUP BY id_loan
) p ON l.id_loan = p.id_loan
WHERE DATE(l.loan_date) = DATE(curDate())
I think the dates on the individual payments and cash advances are irrelevant, because you are looking for their totals based on the date of the loan.
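To see why the original sums come out wrong, here is a small sketch using Python's built-in sqlite3 as a stand-in for MySQL, with invented amounts and only the columns needed. One loan has two cash advances and three payments. Joining both tables directly produces 2 x 3 = 6 rows, which inflates both sums, while pre-aggregating each table first keeps them correct.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE loans (id_loan INTEGER PRIMARY KEY);
CREATE TABLE ca (id_loan INTEGER, ca_total_amount INTEGER);
CREATE TABLE payments (id_loan INTEGER, payment_amount INTEGER);
INSERT INTO loans VALUES (1);
INSERT INTO ca VALUES (1, 100), (1, 200);
INSERT INTO payments VALUES (1, 10), (1, 20), (1, 30);
""")

# Direct joins: each cash advance is repeated per payment and vice versa.
naive = conn.execute("""
    SELECT SUM(c.ca_total_amount), SUM(p.payment_amount)
    FROM loans l
    LEFT JOIN ca c ON c.id_loan = l.id_loan
    LEFT JOIN payments p ON p.id_loan = l.id_loan
    GROUP BY l.id_loan
""").fetchone()

# Pre-aggregated subqueries: at most one row per loan on each side of the join.
fixed = conn.execute("""
    SELECT c.totalCA, p.totalPayment
    FROM loans l
    LEFT JOIN (SELECT id_loan, SUM(ca_total_amount) AS totalCA
               FROM ca GROUP BY id_loan) c ON c.id_loan = l.id_loan
    LEFT JOIN (SELECT id_loan, SUM(payment_amount) AS totalPayment
               FROM payments GROUP BY id_loan) p ON p.id_loan = l.id_loan
""").fetchone()

print(naive, fixed)
```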