I have 3 tables (attendance, allowances and deductions) with some records in attendance.Wage, allowances.Amount, deductions.Amount columns. And I want to "SUM" values in these columns with selected date.
Summed up values must be
attendance.Wage:100
allowances.Amount:150
deductions.Amount:120
but my query returns very different values.
SELECT Name, SUM(attendance.Wage), SUM(allowances.Amount), SUM(deductions.Amount) FROM employees
INNER JOIN attendance USING (EmployeeID)
INNER JOIN allowances USING (EmployeeID)
INNER JOIN deductions USING (EmployeeID)
WHERE MONTH(attendance.Date) = 6 AND YEAR(attendance.Date) = 2020
AND
MONTH(allowances.Date) = 6 AND YEAR(allowances.Date) = 2020
AND
MONTH(deductions.Date) = 6 AND YEAR(deductions.Date) = 2020
GROUP BY employees.EmployeeID;
Output of the query:
attendance.Wage:400
allowances.Amount:900
deductions.Amount:720
Why are the values being multiplied? How can I fix that?
Because you are getting multiple rows from each table and the join is multiplying them.
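A small, self-contained sketch makes the multiplication visible. The `_demo` table names and the sample rows are assumptions chosen to reproduce the totals in the question (3 attendance rows, 2 allowance rows and 2 deduction rows for one employee); they are not the asker's real data.

```sql
-- Hypothetical scratch tables; run in a throwaway database.
CREATE TABLE employees_demo  (EmployeeID INT PRIMARY KEY, Name VARCHAR(50));
CREATE TABLE attendance_demo (EmployeeID INT, `Date` DATE, Wage DECIMAL(10,2));
CREATE TABLE allowances_demo (EmployeeID INT, `Date` DATE, Amount DECIMAL(10,2));
CREATE TABLE deductions_demo (EmployeeID INT, `Date` DATE, Amount DECIMAL(10,2));

INSERT INTO employees_demo VALUES (1, 'Alice');
-- Wages sum to 100
INSERT INTO attendance_demo VALUES
  (1, '2020-06-01', 30), (1, '2020-06-02', 30), (1, '2020-06-03', 40);
-- Allowances sum to 150
INSERT INTO allowances_demo VALUES (1, '2020-06-05', 100), (1, '2020-06-20', 50);
-- Deductions sum to 120
INSERT INTO deductions_demo VALUES (1, '2020-06-10', 70), (1, '2020-06-25', 50);

-- Joining all three tables pairs every attendance row with every
-- allowance row and every deduction row: 3 * 2 * 2 = 12 rows.
SELECT COUNT(*)       AS joined_rows,
       SUM(a.Wage)    AS wage_sum,
       SUM(al.Amount) AS allowance_sum,
       SUM(d.Amount)  AS deduction_sum
FROM employees_demo e
INNER JOIN attendance_demo a  USING (EmployeeID)
INNER JOIN allowances_demo al USING (EmployeeID)
INNER JOIN deductions_demo d  USING (EmployeeID);
-- joined_rows = 12, wage_sum = 400.00, allowance_sum = 900.00, deduction_sum = 720.00
```

Each wage is counted 2 * 2 = 4 times, and each allowance and deduction 3 * 2 = 6 times, which matches the 400 / 900 / 720 reported in the question.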
Without additional information, I would recommend correlated subqueries:
SELECT e.Name,
(SELECT SUM(a.Wage)
FROM attendance a
WHERE a.EmployeeID = e.EmployeeID AND
a.date >= '2020-06-01' AND a.date < '2020-07-01'
),
(SELECT SUM(a.Amount)
FROM allowances a
WHERE a.EmployeeID = e.EmployeeID AND
a.date >= '2020-06-01' AND a.date < '2020-07-01'
),
(SELECT SUM(d.Amount)
FROM deductions d
WHERE d.EmployeeID = e.EmployeeID AND
d.date >= '2020-06-01' AND d.date < '2020-07-01'
)
FROM employees e;
With an index on (EmployeeId, date, amount/wage) in each of the three tables, this should also have better performance than alternatives using explicit aggregations and joins.
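As a sketch of the indexes described above, using the table and column names from the question. The index names are made up for illustration, and whether the indexes actually help depends on the data and the MySQL version.

```sql
-- Hypothetical index names; columns follow the (EmployeeID, date, amount/wage) order.
CREATE INDEX idx_attendance_emp_date_wage   ON attendance (EmployeeID, `Date`, Wage);
CREATE INDEX idx_allowances_emp_date_amount ON allowances (EmployeeID, `Date`, Amount);
CREATE INDEX idx_deductions_emp_date_amount ON deductions (EmployeeID, `Date`, Amount);
```

With the summed column included last, each correlated subquery can in principle be answered from the index alone, without reading the table rows.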
You would need to push the aggregation down in subqueries, otherwise the sums count each value multiple times.
SELECT e.Name, ad.total_attendance, aw.total_allowances, dd.total_deductions
FROM employees e
INNER JOIN (
SELECT EmployeeID, SUM(wage) total_attendance
FROM attendance
WHERE date >= '2020-06-01' and date < '2020-07-01'
GROUP BY EmployeeID
) ad USING (EmployeeID)
INNER JOIN (
SELECT EmployeeID, SUM(amount) total_allowances
FROM allowances
WHERE date >= '2020-06-01' and date < '2020-07-01'
GROUP BY EmployeeID
) aw USING (EmployeeID)
INNER JOIN (
SELECT EmployeeID, SUM(amount) total_deductions
FROM deductions
WHERE date >= '2020-06-01' and date < '2020-07-01'
GROUP BY EmployeeID
) dd USING (EmployeeID)
Related
I have the following scenario. I have three tables (users, sales, sales_details). Users to sales is a 1-to-1 relationship and sales to sales_details is 1-to-many.
I am running a query where I get all the sales for each user by joining all 3 tables without any issue.
Query looks something like this
SELECT s.month as month,u.name as name, s.year as year, s.date as date,sum(sd.qty) as qty,sum(sd.qty*sd.value) as value,s.id as id,sum(sd.stock) as stock,s.currency as currency,s.user as user
FROM sales as s
left join sales_details as sd on s.id = sd.Sales
inner join users as u on s.user = u.Id
group by s.Id
What I want to do now is add an extra field in my query which will be a subquery.
SELECT SUM(total) AS total_yearly
FROM (
SELECT sum(qty) as total
FROM sales
left join sales_details on sales.Id = sales_details.Sales
WHERE ((month <= MONTH(NOW()) and year = YEAR(NOW()))
or (month >= MONTH(Date_add(Now(),interval - 12 month)) and year = YEAR(Date_add(Now(),interval - 12 month))))
and User = **ID OF USER** ) as sub
This query on its own gives me the sales for the user for the past 12 months, while the original query does it per month. I know that the result will be the same on every row for a given user, but I need it for other calculations.
My problem is how I will join the 2 queries so that the subquery will read the user id from the original one.
Thanks in advance!
Group the second query by user, and then join it with the original query.
SELECT s.month as month,u.name as name, s.year as year, s.date as date,
sum(sd.qty) as qty,sum(sd.qty*sd.value) as value,s.id as id,
sum(sd.stock) as stock,s.currency as currency,s.user as user,
us.total
FROM sales as s
left join sales_details as sd on s.id = sd.Sales
inner join users as u on s.user = u.Id
inner join (
SELECT User, sum(qty) as total
FROM sales
left join sales_details on sales.Id = sales_details.Sales
WHERE ((month <= MONTH(NOW()) and year = YEAR(NOW()))
or (month >= MONTH(Date_add(Now(),interval - 12 month)) and year = YEAR(Date_add(Now(),interval - 12 month))))
GROUP BY User) AS us ON s.user = us.user
group by s.Id
The system I'm currently working on is a legacy system. The first query works fine on its own, but its result is then fed through a loop that runs a second query to retrieve another value for each row. I tried changing the query from "order by id" to "order by date", because ordering by id causes issues for certain accounts. I also tried rewriting the query because it is currently very slow. I combined both queries into one, but the combined query takes a long time to execute.
How do I join the two queries together without hurting performance?
/* This query, as mentioned, has no issue (first query) */
SELECT
DATE_FORMAT('2016-12-12 00:00:00', '%Y-%m-%d') AS date,
C.id AS account_id,
C.account_no,
C.account_name,
B.amount AS last_topup,
DATE_FORMAT('2016-12-12 23:59:59', '%Y-%m-%d') AS topup_date,
NULL AS balance
FROM (SELECT
account_id,
MAX(date) date
FROM table1
GROUP BY account_id) A
INNER JOIN table1 B
USING (account_id, date)
RIGHT JOIN table2 C ON B.account_id = C.id
ORDER BY C.account_no;
/* Loop query (second query)*/
SELECT
`t`.`balance_after` AS `balance`
FROM table3 `t`
WHERE `t`.`account_id` = '<id from the loop>' AND `t`.`date` <= '2017-07-26 23:59:59'
ORDER BY `t`.`date` DESC;
/* The query that combines both (takes a long time to execute) */
SELECT
DATE_FORMAT('2016-12-12 00:00:00', '%Y-%m-%d') AS date,
C.id AS account_id,
C.account_no,
C.account_name,
B.amount AS last_topup,
DATE_FORMAT('2016-12-12 23:59:59', '%Y-%m-%d') AS topup_date,
D.balance_after AS balance
FROM (SELECT
account_id,
MAX(date) date
FROM table1
GROUP BY account_id) A
INNER JOIN table1 B
USING (account_id, date)
RIGHT JOIN table2 C ON B.account_id = C.id
RIGHT JOIN table3 D ON B.account_id = D.account_id
WHERE D.date <= "2017-07-30 23:59:59"
ORDER BY C.account_no;
I have this query
SELECT E.employee_id FROM employee E
INNER JOIN timesheet T ON T.`employee_id` = E.`employee_id` AND DATE_FORMAT(T.`date_created`, '%Y-%m') != DATE_FORMAT(NOW(), '%Y-%m')
WHERE T.`employee_id` NOT IN (1,2)
I am trying to get only employees whose ids are not 1 or 2; if the employee_id is 1 or 2, then the month and year should not be the current month and year for that particular employee.
SQL Fiddle
I am not sure if I am doing it the right way.
It sounds as though you need an or relationship between your conditions, not an and relationship. Also, it's preferable to leave the join condition for joining and have all your "business" logic in the where clause - it makes your query easier to read:
SELECT E.employee_id
FROM employee E
INNER JOIN timesheet T ON T.`employee_id` = E.`employee_id`
WHERE T.`employee_id` NOT IN (1,2) OR -- Note the "OR" relationship
DATE_FORMAT(T.`date_created`, '%Y-%m') !=
DATE_FORMAT(NOW(), '%Y-%m')
Try this one
SELECT E.employee_id FROM employee E
INNER JOIN timesheet T ON T.`employee_id` = E.`employee_id`
WHERE T.`employee_id` NOT IN (1,2) OR (DATE_FORMAT(T.`date_created`, '%Y-%m') != DATE_FORMAT(NOW(), '%Y-%m'))
SELECT DISTINCT(id_no), lastname,
(SELECT COUNT(purchasedate) num_of_purch
FROM sales JOIN Artist ON
sales.id = Artist.id_no
WHERE DATE_SUB(CURDATE(),INTERVAL 1
YEAR) <= purchasedate
) AS num_of_purch
FROM Artist
This query returns every artist's id_no and last name along with the total number of purchases, although I want the count to show which purchases belong to which artist. Help in solving this would be greatly appreciated.
EDIT - DISTINCT(id_no) is redundant as it is a primary key.
This shows the number of sales for each artist_id:
SELECT artist.id_no, count(sales.id) as num_of_purch
FROM artist left join sales on sales.id = artist.id_no
WHERE DATE_SUB(CURDATE(), INTERVAL 1 YEAR) <= purchasedate
GROUP BY artist.id_no
To return also the last names, and all of the details:
SELECT art_tot.id_no, art_tot.lastname, art_tot.num_of_purch, sales.*
FROM (SELECT artist.id_no, artist.lastname, count(sales.id) as num_of_purch
FROM artist left join sales on sales.id = artist.id_no
WHERE DATE_SUB(CURDATE(), INTERVAL 1 YEAR) <= purchasedate
GROUP BY artist.id_no, artist.lastname) art_tot
left join sales on art_tot.id_no = sales.id
This should give you artist and number of purchases per artist
select a.id_no, a.lastname, count(s.purchasedate) num_of_purch
from Artist a
join sales s on a.id_no = s.id
where date_sub(curdate(), interval 1 year) <= s.purchasedate
group by a.id_no, a.lastname
You should use a GROUP BY to get the count per artist.
And you should use an outer join to include artists who have no sales within the last year.
SELECT a.id_no, a.lastname, COUNT(s.purchasedate) AS num_of_purch
FROM Artist a
LEFT OUTER JOIN sales s ON s.id = a.id_no
AND s.purchasedate >= CURDATE() - INTERVAL 1 YEAR
GROUP BY a.id_no;
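To illustrate why the date filter belongs in the ON clause of the outer join rather than in WHERE, here is a minimal sketch. The `artist_demo` and `sales_demo` tables and their rows are hypothetical, not the asker's schema.

```sql
-- Hypothetical scratch tables; run in a throwaway database.
CREATE TABLE artist_demo (id_no INT PRIMARY KEY, lastname VARCHAR(50));
CREATE TABLE sales_demo  (id INT, purchasedate DATE);

INSERT INTO artist_demo VALUES (1, 'Recent'), (2, 'Old');
INSERT INTO sales_demo VALUES
  (1, CURDATE() - INTERVAL 10 DAY),   -- artist 1: one sale within the last year
  (2, CURDATE() - INTERVAL 3 YEAR);   -- artist 2: only an old sale

-- Filter in ON: both artists are returned, artist 2 with a count of 0.
SELECT a.id_no, a.lastname, COUNT(s.purchasedate) AS num_of_purch
FROM artist_demo a
LEFT OUTER JOIN sales_demo s ON s.id = a.id_no
  AND s.purchasedate >= CURDATE() - INTERVAL 1 YEAR
GROUP BY a.id_no, a.lastname;

-- Filter in WHERE: the NULL-extended row for artist 2 fails the test,
-- so only artist 1 is returned and the outer join behaves like an inner join.
SELECT a.id_no, a.lastname, COUNT(s.purchasedate) AS num_of_purch
FROM artist_demo a
LEFT OUTER JOIN sales_demo s ON s.id = a.id_no
WHERE s.purchasedate >= CURDATE() - INTERVAL 1 YEAR
GROUP BY a.id_no, a.lastname;
```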
PS: Using DISTINCT(id_no) is meaningless not only because id_no is already a unique key, but because DISTINCT always applies to all columns in the select list, even if you add parentheses to make it look like a function that applies only to one column.
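A quick sketch of that DISTINCT behaviour, using a hypothetical `distinct_demo` table:

```sql
-- Hypothetical scratch table; run in a throwaway database.
CREATE TABLE distinct_demo (a INT, b VARCHAR(10));
INSERT INTO distinct_demo VALUES (1, 'x'), (1, 'y'), (1, 'x');

-- The parentheses only group the expression a; DISTINCT still covers (a, b).
SELECT DISTINCT(a), b FROM distinct_demo;  -- 2 rows: (1, 'x') and (1, 'y')
SELECT DISTINCT a, b  FROM distinct_demo;  -- the same 2 rows
SELECT DISTINCT a     FROM distinct_demo;  -- 1 row: (1)
```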
I am having to set up a query that retrieves the last comment made on a customer, if no one has commented on them for more than 4 weeks. I can make it work using the query below, but for some reason the comment column won't display the latest record. Instead it displays the oldest, however the date shows the newest. It may just be because I'm a noob at SQL, but what exactly am I doing wrong here?
SELECT DISTINCT
customerid, id, customername, user, MAX(date) AS 'maxdate', comment
FROM comments
WHERE customerid IN
(SELECT DISTINCT id FROM customers WHERE pastdue='1' AND hubarea='1')
AND customerid NOT IN
(SELECT DISTINCT customerid FROM comments WHERE DATEDIFF(NOW(), date) <= 27)
GROUP BY customerid
ORDER BY maxdate
The first "WHERE" clause is just ensuring that it shows only customers from a specific area, and that they are "past due enabled". The second makes sure that the customer has not been commented on within the last 27 days. It's grouped by customerid, because that is the number that is associated with each individual customer. When I get the results, everything is right except for the comment column...any ideas?
A join is usually better than a nested query, so use a join instead of the nested query.
A join can improve the speed.
This query should resolve your problem.
SELECT DISTINCT
customerid,id, customername, user, MAX(date) AS 'maxdate', comment
FROM comments inner join customers on comments.customerid = customers.id
WHERE customers.pastdue='1' AND customers.hubarea='1' AND DATEDIFF(NOW(), comments.date) <= 27
GROUP BY customerid
ORDER BY maxdate
I think this will probably do what you are trying to achieve. If you can execute it and report back whether it works, I can tweak it if needed. Logically, it should work, if I have understood your problem correctly :)
SELECT X.customerid, X.maxdate, co.id, c.customername, co.user, co.comment
FROM
(SELECT customerid, MAX(date) AS 'maxdate'
FROM comments cm
INNER JOIN customers cu ON cu.id = cm.customerid
WHERE cu.pastdue='1'
AND cu.hubarea='1'
AND DATEDIFF(NOW(), cm.date) <= 27
GROUP BY customerid) X
INNER JOIN comments co ON X.customerid = co.customerid and X.maxdate = co.date
INNER JOIN customers c ON X.customerid = c.id
ORDER BY X.maxdate
You need a subquery for each case.
SELECT a.*
FROM comments a
INNER JOIN
(
SELECT customerID, max(`date`) maxDate
FROM comments
GROUP BY customerID
) b ON a.customerID = b.customerID AND
a.`date` = b.maxDate
INNER JOIN
(
SELECT DISTINCT ID
FROM customers
WHERE pastdue = 1 AND hubarea = 1
) c ON c.ID = a.customerID
LEFT JOIN
(
SELECT DISTINCT customerid
FROM comments
WHERE DATEDIFF(NOW(), date) <= 27
) d ON a.customerID = d.customerID
WHERE d.customerID IS NULL
The first join gets the latest record for each customer.
The second join keeps only customers from a specific area that are "past due enabled".
The third join, which uses LEFT JOIN, selects all customers that have not been commented on within the last 27 days. Only customers that do not appear in subquery d are kept, because of the condition d.customerID IS NULL.
To make the query shorter: if the customers table already has one unique record per customer, you don't need a subquery on it. Join the table directly and put the conditions in the WHERE clause.
SELECT a.*
FROM comments a
INNER JOIN
(
SELECT customerID, max(`date`) maxDate
FROM comments
GROUP BY customerID
) b ON a.customerID = b.customerID AND
a.`date` = b.maxDate
INNER JOIN customers c
ON c.ID = a.customerID
LEFT JOIN
(
SELECT DISTINCT customerid
FROM comments
WHERE DATEDIFF(NOW(), date) <= 27
) d ON a.customerID = d.customerID
WHERE d.customerID IS NULL AND
c.pastdue = 1 AND
c.hubarea = 1
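To check the shorter query on a small data set, here is a minimal sketch. The `customers_demo` and `comments_demo` tables, their columns and their rows are assumptions made up for illustration; only the join pattern comes from the query above.

```sql
-- Hypothetical scratch tables; run in a throwaway database.
CREATE TABLE customers_demo (id INT PRIMARY KEY, pastdue TINYINT, hubarea TINYINT);
CREATE TABLE comments_demo  (id INT PRIMARY KEY, customerid INT,
                             `date` DATE, `comment` VARCHAR(100));

INSERT INTO customers_demo VALUES (1, 1, 1), (2, 1, 1), (3, 0, 1);
INSERT INTO comments_demo VALUES
  (10, 1, CURDATE() - INTERVAL 60 DAY, 'older'),
  (11, 1, CURDATE() - INTERVAL 40 DAY, 'latest for customer 1'),
  (12, 2, CURDATE() - INTERVAL 50 DAY, 'older'),
  (13, 2, CURDATE() - INTERVAL 5 DAY,  'recent, so customer 2 is excluded'),
  (14, 3, CURDATE() - INTERVAL 90 DAY, 'customer 3 is not past due');

SELECT a.customerid, a.`date`, a.`comment`
FROM comments_demo a
INNER JOIN (
    -- latest comment date per customer
    SELECT customerid, MAX(`date`) AS maxdate
    FROM comments_demo
    GROUP BY customerid
) b ON a.customerid = b.customerid AND a.`date` = b.maxdate
INNER JOIN customers_demo c ON c.id = a.customerid
LEFT JOIN (
    -- customers with a comment in the last 27 days
    SELECT DISTINCT customerid
    FROM comments_demo
    WHERE DATEDIFF(NOW(), `date`) <= 27
) d ON a.customerid = d.customerid
WHERE d.customerid IS NULL
  AND c.pastdue = 1
  AND c.hubarea = 1;
-- one row: customer 1 with the comment 'latest for customer 1'
```

The comment text and the date now come from the same row, because the row is selected by joining on the maximum date instead of mixing MAX(date) with a non-aggregated comment column.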
Some of your selected columns are not contained in either an aggregate function or the GROUP BY clause. For example, suppose you have two rows with the same customer id and the same date but different comment data: how should SQL aggregate these two rows? Under strict SQL rules this generates an error; MySQL with ONLY_FULL_GROUP_BY disabled instead returns the value from an arbitrary row of the group.
Try this:
select customerid, id, customername, user, date, comment from (
select customerid, id, customername, user, date, comment,
-- MySQL user variables are written with @; their evaluation order relative to ORDER BY is not guaranteed
@rank := IF(@current_customer = customerid, @rank + 1, 1) as rnk,
@current_customer := customerid as cur
from comments
where customerid IN
(SELECT DISTINCT id FROM customers WHERE pastdue='1' AND hubarea='1')
AND customerid NOT IN
(SELECT DISTINCT customerid FROM comments WHERE DATEDIFF(NOW(), date) <= 27)
order by customerid, date desc
) ranked where rnk <= 1