I have this table:
SALESMAN | INVOICE | VALUE
1 | 7470 | 10
1 | 7471 | 20
1 | 7472 | 10
2 | 7473 | 40
2 | 7474 | 50
I want a query in order to get this result:
SALESMAN | INVOICE | VALUE | TOTAL_VALUE | TOTAL_ITEMS
1 | 7470 | 10 | 40 | 3
1 | 7471 | 20 | 40 | 3
1 | 7472 | 10 | 40 | 3
2 | 7473 | 40 | 90 | 2
2 | 7474 | 50 | 90 | 2
TOTAL_VALUE is the sum of all VALUE for the same SALESMAN.
TOTAL_ITEMS is the amount of rows with the same SALESMAN.
Is it possible to achieve this in MySQL?
Use GROUP BY in a derived table and join it back to the original rows:
SELECT s.SALESMAN, s.INVOICE, s.VALUE,
xx.TOTAL_VALUE, xx.TOTAL_ITEMS
FROM sales s JOIN
(SELECT SALESMAN,
SUM(VALUE) AS TOTAL_VALUE,
COUNT(value) AS TOTAL_ITEMS
FROM sales
GROUP BY SALESMAN) xx ON s.SALESMAN = xx.SALESMAN;
Try this query:
SELECT SALESMAN, INVOICE , VALUE,
(select sum(VALUE) FROM your_table t1 where t1.SALESMAN = your_table.SALESMAN)
AS TOTAL_VALUE,
(select count(VALUE) FROM your_table t2 where t2.SALESMAN = your_table.SALESMAN)
AS TOTAL_ITEMS
from your_table
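On MySQL 8+ the same per-salesman totals can also be produced with window functions, without a join or a correlated subquery. The sketch below is not from the answers above: it runs the query through Python's built-in sqlite3 module purely so it is self-contained and checkable (SQLite 3.25+ accepts the same window syntax), and it assumes a table named sales filled with the values from the expected result.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL 8+ (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales (SALESMAN INTEGER, INVOICE INTEGER, VALUE INTEGER);
    INSERT INTO sales VALUES
        (1, 7470, 10), (1, 7471, 20), (1, 7472, 10),
        (2, 7473, 40), (2, 7474, 50);
""")

# Each row keeps its own columns; the window aggregates are computed
# over all rows that share the same SALESMAN.
rows = conn.execute("""
    SELECT SALESMAN, INVOICE, VALUE,
           SUM(VALUE) OVER (PARTITION BY SALESMAN) AS TOTAL_VALUE,
           COUNT(*)   OVER (PARTITION BY SALESMAN) AS TOTAL_ITEMS
    FROM sales
    ORDER BY SALESMAN, INVOICE
""").fetchall()

for row in rows:
    print(row)
```

On MySQL versions before 8.0 window functions are not available, so the join or subquery forms remain the way to go there.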
I have a table called transactions which contains sellers and their transactions: sale, waste, and whenever they receive products. The structure is essentially as follows:
seller_id transaction_date quantity reason product unit_price
--------- ---------------- -------- ------ ------- ----------
1 2018-01-01 10 import 1 100.0
1 2018-01-01 -5 sale 1 100.0
1 2018-01-01 -1 waste 1 100.0
2 2018-01-01 -3 sale 4 95.5
I need a daily summary for each seller, including the value of their sales, waste, and starting inventory. The problem is that the starting inventory is a cumulative sum of quantities up to the given day (imports on the given day are also included). I have the following query:
SELECT
t.seller_id,
t.transaction_date,
SUM(t.quantity * t.unit_price) as amount,
t.reason as reason,
(
SELECT SUM(unit_price * quantity) FROM transactions
WHERE seller_id = t.seller_id
AND (transaction_date <= t.transaction_date)
AND (
transaction_date < t.transaction_date
OR reason = 'import'
)
) as opening_balance
FROM transactions t
GROUP BY
t.transaction_date,
t.seller_id,
t.reason
The query works and I get the desired results. However, even after creating indexes for both the outer query and the subquery, it takes far too long (about 30 seconds), because the opening_balance query is a dependent subquery that is recalculated for every row.
How can I optimize or rewrite this query?
Edit: the subquery had a small bug with a missing WHERE condition. I fixed it, but the essence of the question is the same. I created a fiddle with example data to play around with:
https://www.db-fiddle.com/f/ma7MhufseHxEXLfxhCtGbZ/2
The following approach, which uses user-defined variables, can perform better than the correlated subquery. A temp variable carries part of the calculation and also appears in the output; you can ignore that column.
You can try the following query (I can add more explanation if needed):
Query
SELECT dt.reason,
dt.amount,
@bal := CASE
WHEN dt.reason = 'import'
AND @sid <> dt.seller_id THEN dt.amount
WHEN dt.reason = 'import' THEN @bal + @temp + dt.amount
WHEN @sid = 0
OR @sid = dt.seller_id THEN @bal
ELSE 0
end AS opening_balance,
@temp := CASE
WHEN dt.reason <> 'import'
AND @sid = dt.seller_id
AND @td = dt.transaction_date THEN @temp + dt.amount
ELSE 0
end AS temp,
@sid := dt.seller_id AS seller_id,
@td := dt.transaction_date AS transaction_date
FROM (SELECT seller_id,
transaction_date,
reason,
Sum(quantity * unit_price) AS amount
FROM transactions
WHERE seller_id IS NOT NULL
GROUP BY seller_id,
transaction_date,
reason
ORDER BY seller_id,
transaction_date,
Field(reason, 'import', 'sale', 'waste')) AS dt
CROSS JOIN (SELECT @sid := 0,
@td := '',
@bal := 0,
@temp := 0) AS user_vars;
Result (note that I have ordered by seller_id first and then transaction_date)
| reason | amount | opening_balance | temp | seller_id | transaction_date |
| ------ | ------ | --------------- | ----- | --------- | ---------------- |
| import | 1250 | 1250 | 0 | 1 | 2018-12-01 |
| sale | -850 | 1250 | -850 | 1 | 2018-12-01 |
| waste | -100 | 1250 | -950 | 1 | 2018-12-01 |
| import | 950 | 1250 | 0 | 1 | 2018-12-02 |
| sale | -650 | 1250 | -650 | 1 | 2018-12-02 |
| waste | -450 | 1250 | -1100 | 1 | 2018-12-02 |
| import | 2000 | 2000 | 0 | 2 | 2018-12-01 |
| sale | -1200 | 2000 | -1200 | 2 | 2018-12-01 |
| waste | -250 | 2000 | -1450 | 2 | 2018-12-01 |
| import | 750 | 1300 | 0 | 2 | 2018-12-02 |
| sale | -600 | 1300 | -600 | 2 | 2018-12-02 |
| waste | -450 | 1300 | -1050 | 2 | 2018-12-02 |
Do you mean something like this?
SELECT s.* ,@balance:=@balance+(s.quantity*s.unit_price) AS opening_balance FROM (
SELECT t.* FROM transactions t
ORDER BY t.seller_id,t.transaction_date,t.reason
) s
CROSS JOIN ( SELECT @balance:=0) AS INIT
GROUP BY s.transaction_date, s.seller_id, s.reason;
SAMPLE
MariaDB [test]> select * from transactions;
+----+-----------+------------------+----------+------------+--------+
| id | seller_id | transaction_date | quantity | unit_price | reason |
+----+-----------+------------------+----------+------------+--------+
| 1 | 1 | 2018-01-01 | 10 | 100 | import |
| 2 | 1 | 2018-01-01 | -5 | 100 | sale |
| 3 | 1 | 2018-01-01 | -1 | 100 | waste |
| 4 | 2 | 2018-01-01 | -3 | 99.5 | sale |
+----+-----------+------------------+----------+------------+--------+
4 rows in set (0.000 sec)
MariaDB [test]> SELECT s.* ,@balance:=@balance+(s.quantity*s.unit_price) AS opening_balance FROM (
-> SELECT t.* FROM transactions t
-> ORDER BY t.seller_id,t.transaction_date,t.reason
-> ) s
-> CROSS JOIN ( SELECT @balance:=0) AS INIT
-> GROUP BY s.transaction_date, s.seller_id, s.reason;
+----+-----------+------------------+----------+------------+--------+-----------------+
| id | seller_id | transaction_date | quantity | unit_price | reason | opening_balance |
+----+-----------+------------------+----------+------------+--------+-----------------+
| 1 | 1 | 2018-01-01 | 10 | 100 | import | 1000 |
| 2 | 1 | 2018-01-01 | -5 | 100 | sale | 500 |
| 3 | 1 | 2018-01-01 | -1 | 100 | waste | 400 |
| 4 | 2 | 2018-01-01 | -3 | 99.5 | sale | 101.5 |
+----+-----------+------------------+----------+------------+--------+-----------------+
4 rows in set (0.001 sec)
SELECT
t.seller_id,
t.transaction_date,
SUM(quantity) as amount,
t.reason as reason,
quantityImport
FROM transactions t
inner join
(
select sum(ifnull(quantityImport,0)) quantityImport,p.transaction_date,p.seller_id from
( /* subquery: every distinct (seller, date) pair */
select transaction_date ,seller_id
from transactions
group by seller_id, transaction_date
)
as p
left join
( /* subquery: import quantity per seller and date */
select sum(quantity) quantityImport,transaction_date ,seller_id
from transactions
where reason='import'
group by seller_id, transaction_date
) as n
on
p.seller_id=n.seller_id
and
p.transaction_date>=n.transaction_date
group by
p.seller_id,p.transaction_date
) as q
where
t.seller_id=q.seller_id
and
t.transaction_date=q.transaction_date
GROUP BY
t.transaction_date,
t.seller_id,
t.reason;
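On MySQL 8+ a window function is another way to avoid both the dependent subquery and user variables. The sketch below is not taken from any of the answers: it aggregates once per seller and day, keeps a running total per seller, and derives the opening balance as "everything before today plus today's imports". It runs through Python's sqlite3 module only so it is self-contained (SQLite 3.25+ accepts the same syntax); the two rows dated 2018-01-02 are made up for illustration, and unlike the correlated subquery it returns 0 rather than NULL when a seller has no earlier rows and no imports.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL 8+ (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE transactions (
        seller_id INTEGER, transaction_date TEXT, quantity INTEGER,
        reason TEXT, product INTEGER, unit_price REAL
    );
    INSERT INTO transactions VALUES
        (1, '2018-01-01', 10, 'import', 1, 100.0),
        (1, '2018-01-01', -5, 'sale',   1, 100.0),
        (1, '2018-01-01', -1, 'waste',  1, 100.0),
        (2, '2018-01-01', -3, 'sale',   4, 95.5),
        -- hypothetical second day, added only for this example
        (1, '2018-01-02',  4, 'import', 1, 100.0),
        (1, '2018-01-02', -2, 'sale',   1, 100.0);
""")

rows = conn.execute("""
    WITH daily AS (            -- one row per seller and day
        SELECT seller_id, transaction_date,
               SUM(quantity * unit_price) AS day_total,
               SUM(CASE WHEN reason = 'import'
                        THEN quantity * unit_price ELSE 0 END) AS day_import
        FROM transactions
        GROUP BY seller_id, transaction_date
    ),
    balances AS (              -- running total minus today, plus today's imports
        SELECT seller_id, transaction_date,
               SUM(day_total) OVER (PARTITION BY seller_id
                                    ORDER BY transaction_date)
                 - day_total + day_import AS opening_balance
        FROM daily
    )
    SELECT t.seller_id, t.transaction_date, t.reason,
           SUM(t.quantity * t.unit_price) AS amount,
           b.opening_balance
    FROM transactions t
    JOIN balances b
      ON b.seller_id = t.seller_id
     AND b.transaction_date = t.transaction_date
    GROUP BY t.seller_id, t.transaction_date, t.reason, b.opening_balance
    ORDER BY t.seller_id, t.transaction_date, t.reason
""").fetchall()

for row in rows:
    print(row)
```

The running total is computed over one row per seller and day, so the work grows with the number of days rather than with the number of output rows times the table size.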
My data looks like this:
Table Name = sales_orders
Customer_id| Order_id| Item_Id
-------------------------------
1 | 1 | 10
1 | 1 | 24
1 | 1 | 37
1 | 2 | 11
1 | 2 | 15
1 | 3 | 28
2 | 4 | 37
4 | 6 | 10
2 | 7 | 10
However, I need it to look like this:
Customer_id| Order_id| Item_Id |Order_rank
------------------------------------------
1 | 1 | 10 | 1
1 | 1 | 24 | 1
1 | 1 | 37 | 1
1 | 2 | 11 | 2
1 | 2 | 15 | 2
1 | 3 | 28 | 3
2 | 4 | 37 | 1
4 | 6 | 10 | 1
2 | 7 | 10 | 2
Customer_Id is a unique person
Order_id is a unique order
item_id is the product code
To further explain, the first three lines are from Customer #1's first order (order_id = 1), in which this person ordered 3 different items (10, 24, and 37). They then placed another order (order_id = 2) with two other products. The person with customer_id = 2 has 2 unique orders (4 and 7), while the customer with ID 4 has one unique order (order_id = 6).
Essentially, what I need to do is rank these orders by customer_id and order Id, so that I can say "Order_id = 7 is the second order for customer_id = 2, because Order_rank = 2"
The challenge here is that I can't use session variables (e.g. @grp := customer_id) in the MySQL query.
For example, a query such as this is NOT allowed:
SELECT
customer_id,
order_id,
@ss := CASE WHEN @grp = customer_id THEN @ss + 1 ELSE 1 END AS
order_rank,
@grp := customer_id
FROM
(
SELECT
customer_id,
order_id
FROM sales_orders
GROUP BY customer_id, order_id
ORDER BY customer_id, order_id ASC
) AS t_1
CROSS JOIN (SELECT @ss := 0, @grp := NULL)ss
ORDER BY customer_id asc
Thanks for the help!
With a correlated subquery, we can count the distinct order_id values for the same customer_id that are less than or equal to the current row's order_id; that count is the rank.
We need to count unique values because you have multiple rows per order (due to multiple items).
Query
SELECT
t1.Customer_id,
t1.Order_id,
t1.Item_Id,
(SELECT COUNT(DISTINCT t2.Order_id)
FROM sales_orders t2
WHERE t2.Customer_id = t1.Customer_id AND
t2.Order_id <= t1.Order_id
) AS Order_rank
FROM sales_orders AS t1;
Result
| Customer_id | Order_id | Item_Id | Order_rank |
| ----------- | -------- | ------- | ---------- |
| 1 | 1 | 10 | 1 |
| 1 | 1 | 24 | 1 |
| 1 | 1 | 37 | 1 |
| 1 | 2 | 11 | 2 |
| 1 | 2 | 15 | 2 |
| 1 | 3 | 28 | 3 |
| 2 | 4 | 37 | 1 |
| 4 | 6 | 10 | 1 |
| 2 | 7 | 10 | 2 |
You can use a correlated subquery:
select so.*,
(select count(distinct so2.order_id)
from sales_orders so2
where so2.Customer_id = so.Customer_id and
so2.order_id <= so.order_id
) as rank_order
from sales_orders so;
Or in MySQL 8+:
select so.*,
dense_rank() over (partition by Customer_Id order by Order_Id) as rank_order
from sales_orders so;
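To check that the approaches agree, the sketch below runs a COUNT(DISTINCT ...) subquery and the DENSE_RANK() window form against the sample rows from the question, plus a plain COUNT(*) subquery to show why counting rows is not enough when an order has several items. It uses Python's sqlite3 module as a stand-in for MySQL 8+ (an assumption made only to keep the example runnable); neither form needs session variables.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL 8+ (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales_orders (Customer_id INTEGER, Order_id INTEGER, Item_Id INTEGER);
    INSERT INTO sales_orders VALUES
        (1, 1, 10), (1, 1, 24), (1, 1, 37), (1, 2, 11), (1, 2, 15),
        (1, 3, 28), (2, 4, 37), (4, 6, 10), (2, 7, 10);
""")

def ranks_via_subquery(count_expr):
    """Rank each row by counting the customer's orders up to this one."""
    sql = f"""
        SELECT t1.Customer_id, t1.Order_id, t1.Item_Id,
               (SELECT {count_expr}
                FROM sales_orders t2
                WHERE t2.Customer_id = t1.Customer_id
                  AND t2.Order_id <= t1.Order_id) AS Order_rank
        FROM sales_orders t1
        ORDER BY t1.Customer_id, t1.Order_id, t1.Item_Id
    """
    return conn.execute(sql).fetchall()

distinct_rows = ranks_via_subquery("COUNT(DISTINCT t2.Order_id)")
count_star_rows = ranks_via_subquery("COUNT(*)")  # counts item rows, not orders

window_rows = conn.execute("""
    SELECT Customer_id, Order_id, Item_Id,
           DENSE_RANK() OVER (PARTITION BY Customer_id ORDER BY Order_id) AS Order_rank
    FROM sales_orders
    ORDER BY Customer_id, Order_id, Item_Id
""").fetchall()

print([r[3] for r in distinct_rows])
print([r[3] for r in window_rows])
print([r[3] for r in count_star_rows])
```

DENSE_RANK() is used rather than RANK() or ROW_NUMBER() because every item row of the same order must share one rank, with no gaps before the next order.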
How can we SUM amount for each activity only on same date and output a row for each date? This query is not working.
SELECT SUM(amount), type, date FROM table GROUP BY DISTINCT date;
Table
+----+------------+-----------+---------+
| id | date | activity | amount |
+----+------------+-----------+---------+
| 1 | 2017-12-21 | Shopping | 200 |
| 2 | 2017-12-21 | Gift | 240 |
| 3 | 2017-12-23 | Give Away | 40 |
| 4 | 2017-12-24 | Shopping | 150 |
| 5 | 2017-12-25 | Give Away | 120 |
| 6 | 2017-12-25 | Shopping | 50 |
| 7 | 2017-12-25 | Shopping | 500 |
+----+------------+-----------+---------+
Required Result
+------------+-----------+------+-----------+
| date | Shopping | Gift | Give Away |
+------------+-----------+------+-----------+
| 2017-12-21 | 200 | 240 | |
| 2017-12-23 | | | 40 |
| 2017-12-24 | 150 | | |
| 2017-12-25 | 550 | | 120 |
+------------+-----------+------+-----------+
Use:
select `date`,
sum(if (activity='Shopping', amount, null)) as 'Shopping',
sum(if (activity='Gift', amount, null)) as 'Gift',
sum(if (activity='Give Away', amount, null)) as 'Give Away'
from `table`
group by `date`
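The conditional-aggregation pattern can be tried end to end with the sample data from the question. This is a sketch under a few assumptions: it uses Python's sqlite3 module instead of MySQL, writes the condition as a portable CASE expression instead of MySQL's IF(), and calls the table activities (a made-up name, since table itself is a reserved word).

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE activities (id INTEGER, date TEXT, activity TEXT, amount INTEGER);
    INSERT INTO activities VALUES
        (1, '2017-12-21', 'Shopping',  200),
        (2, '2017-12-21', 'Gift',      240),
        (3, '2017-12-23', 'Give Away',  40),
        (4, '2017-12-24', 'Shopping',  150),
        (5, '2017-12-25', 'Give Away', 120),
        (6, '2017-12-25', 'Shopping',   50),
        (7, '2017-12-25', 'Shopping',  500);
""")

# One SUM per output column; a CASE without ELSE yields NULL for other
# activities, and SUM ignores NULLs (an all-NULL group sums to NULL).
rows = conn.execute("""
    SELECT date,
           SUM(CASE WHEN activity = 'Shopping'  THEN amount END) AS shopping,
           SUM(CASE WHEN activity = 'Gift'      THEN amount END) AS gift,
           SUM(CASE WHEN activity = 'Give Away' THEN amount END) AS give_away
    FROM activities
    GROUP BY date
    ORDER BY date
""").fetchall()

for row in rows:
    print(row)
```

Empty cells come back as NULL (None in Python), which matches the blanks in the required result; wrap each SUM in COALESCE(..., 0) if zeros are preferred.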
You can try this. It returns exactly the result you want:
SELECT t.date,
SUM(t.shopping_amount) AS shopping,
SUM(t.gift_amount) AS gift,
SUM(t.give_away_amount) AS give_away
FROM
(
SELECT p.`date`, p.`activity`, p.`amount` AS shopping_amount,
0 AS gift_amount, 0 AS give_away_amount
FROM products p
WHERE p.`activity` = 'Shopping'
UNION ALL
SELECT p.`date`, p.`activity`, 0 AS shopping_amount,
p.amount AS gift_amount, 0 AS give_away_amount
FROM products p
WHERE p.`activity` = 'Gift'
UNION ALL
SELECT p.`date`, p.`activity`, 0 AS shopping_amount,
0 AS gift_amount, p.amount AS give_away_amount
FROM products p
WHERE p.`activity` = 'Give Away'
) t
GROUP BY t.date
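When rows are stacked like this before being summed, the choice between UNION and UNION ALL matters: UNION removes duplicate rows, so two identical transactions on the same day would be counted once. The sketch below illustrates this with two made-up identical Shopping rows in a products table, run through Python's sqlite3 module as a stand-in for MySQL (both are assumptions for the sake of a runnable example).

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE products (date TEXT, activity TEXT, amount INTEGER);
    -- two hypothetical purchases that happen to be identical
    INSERT INTO products VALUES
        ('2017-12-25', 'Shopping', 50),
        ('2017-12-25', 'Shopping', 50);
""")

def total(set_operator):
    """Sum the stacked rows using either UNION or UNION ALL."""
    sql = f"""
        SELECT SUM(amount) FROM (
            SELECT date, activity, amount FROM products WHERE activity = 'Shopping'
            {set_operator}
            SELECT date, activity, amount FROM products WHERE activity = 'Gift'
        ) AS t
    """
    return conn.execute(sql).fetchone()[0]

deduplicated_total = total("UNION")      # identical rows collapse into one
full_total = total("UNION ALL")          # every row is kept

print(deduplicated_total, full_total)
```

UNION ALL also skips the duplicate-elimination step, so it is usually the cheaper of the two.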
You can't pivot values into column headers unless you know all the possible values in advance, as slaasko's answer demonstrates. You can, however, use SQL to return the results in a long form that your display tool (e.g. a BI tool) can pivot:
SELECT SUM(amount), activity, date FROM `table` GROUP BY date, activity;
Problem:
The Employee table holds the salary information in a year.
Write a SQL to get the cumulative sum of an employee's salary over a period of 3 months but exclude the most recent month.
The result should be displayed by 'Id' ascending, and then by 'Month' descending.
Employee table:
| Id | Month | Salary |
|----|-------|--------|
| 1 | 1 | 20 |
| 2 | 1 | 20 |
| 1 | 2 | 30 |
| 2 | 2 | 30 |
| 3 | 2 | 40 |
| 1 | 3 | 40 |
| 3 | 3 | 60 |
| 1 | 4 | 60 |
| 3 | 4 | 70 |
My Code:
SELECT t1.Id, t1.Month,
(SELECT SUM(Salary)
FROM Employee AS t2
WHERE t1.Id = t2.Id
AND t1.Month >= t2.Month) AS Salary
FROM Employee t1
WHERE Month <> (SELECT
MAX(Month)
FROM Employee
GROUP BY t1.Id)
ORDER BY Id, Month DESC;
My Output:
| Id | Month | Salary |
|----|-------|--------|
| 1 | 3 | 90 |
| 1 | 2 | 50 |
| 1 | 1 | 20 |
| 2 | 2 | 50 |
| 2 | 1 | 20 |
| 3 | 3 | 100 |
| 3 | 2 | 40 |
Expected:
| Id | Month | Salary |
|----|-------|--------|
| 1 | 3 | 90 |
| 1 | 2 | 50 |
| 1 | 1 | 20 |
| 2 | 1 | 20 |
| 3 | 3 | 100 |
| 3 | 2 | 40 |
I used MAX() and GROUP BY() functions to exclude the most recent month of each group, but it doesn't work for Id=2.
Is there any advice on how to get rid of the following row?
| 2 | 2 | 50 |
Thanks in advance.
To get the cumulative sum over the last 3 months only, excluding the most recent month per id, you can use:
SELECT t1.Id, t1.Month, SUM(t2.Salary) AS Salary
FROM Employee t1
JOIN Employee t2 ON t1.Id = t2.Id AND t1.Month - t2.Month <= 2 AND t1.Month - t2.Month >= 0
JOIN (SELECT id, MAX(month) as max_mth from Employee GROUP BY id) tmax on tmax.id=t1.id AND tmax.max_mth<>t1.month
GROUP BY t1.Id, t1.Month
ORDER BY t1.Id, t1.Month DESC;
Try this:
SELECT t1.id, t1.month,
(SELECT SUM(salary)
FROM employee t2
WHERE t1.id = t2.id
AND t1.month >= t2.month
AND t1.month - t2.month < 3) AS salary
FROM (
SELECT * FROM employee p
WHERE month <> (select MAX(month)
FROM employee c where c.id = p.id)) t1
ORDER BY id, month desc;
Output is:
+------+-------+--------+
| id | month | salary |
+------+-------+--------+
| 1 | 3 | 90 |
| 1 | 2 | 50 |
| 1 | 1 | 20 |
| 2 | 1 | 20 |
| 3 | 3 | 100 |
| 3 | 2 | 40 |
+------+-------+--------+
The problem was that you were excluding only the last month present across all employees. What I believe you wanted was to exclude the last month present for each employee, even if that month was several months back. This solution builds a derived table with each employee's last month removed and uses it in place of your t1 employee table.
I think this answer is closest to what you were trying to do in your original query:
SELECT t1.id, t1.month,
(SELECT SUM(salary)
FROM employee t2
WHERE t1.id = t2.id
AND t1.month >= t2.month
AND t1.month - t2.month < 3) AS salary
FROM employee t1
WHERE month <> (SELECT MAX(month)
FROM employee t3
WHERE t3.id = t1.id)
ORDER by id, month desc;
On second look, you were actually pretty close. I believe the problem was that the "GROUP BY t1.Id" line doesn't actually group anything: t1.Id is constant within any given subquery, because "t1" is defined in the outermost SELECT statement. Replace it with a WHERE clause, limit the total to 3 months in the SUM() query, and you're there.
Try this query:
SELECT e.Id, e.Month, SUM( e2.Salary ) AS 'Salary'
FROM
Employee AS e
INNER JOIN Employee AS e2
ON e2.Id = e.Id
AND e2.Month <= e.Month
WHERE
e.Month <> ( SELECT MAX( Month ) FROM Employee WHERE Id = e.Id )
GROUP BY
e.Id, e.Month
ORDER BY
e.Id, e.Month DESC
Output is:
+----+-------+--------+
| Id | Month | Salary |
+----+-------+--------+
| 1 | 3 | 90 |
| 1 | 2 | 50 |
| 1 | 1 | 20 |
| 2 | 1 | 20 |
| 3 | 3 | 100 |
| 3 | 2 | 40 |
+----+-------+--------+
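On MySQL 8+ the same result can be written with window functions instead of a self-join. The sketch below is an alternative, not one of the answers above: a RANGE frame of "2 PRECEDING" on Month sums the current month and the two months before it, and MAX(Month) per Id marks the row to drop. It runs through Python's sqlite3 module as a stand-in for MySQL (an assumption for runnability; SQLite needs 3.28+ for numeric RANGE frames), and cum_salary and last_month are names introduced here.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL 8+ (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Employee (Id INTEGER, Month INTEGER, Salary INTEGER);
    INSERT INTO Employee VALUES
        (1, 1, 20), (2, 1, 20), (1, 2, 30), (2, 2, 30), (3, 2, 40),
        (1, 3, 40), (3, 3, 60), (1, 4, 60), (3, 4, 70);
""")

rows = conn.execute("""
    SELECT Id, Month, cum_salary AS Salary
    FROM (
        SELECT Id, Month,
               -- months whose value lies in [Month - 2, Month]
               SUM(Salary) OVER (PARTITION BY Id ORDER BY Month
                                 RANGE BETWEEN 2 PRECEDING AND CURRENT ROW) AS cum_salary,
               -- most recent month of this employee
               MAX(Month) OVER (PARTITION BY Id) AS last_month
        FROM Employee
    ) AS w
    WHERE Month <> last_month
    ORDER BY Id, Month DESC
""").fetchall()

for row in rows:
    print(row)
```

RANGE is used instead of ROWS so that a gap in an employee's months is treated as a missing month rather than being skipped over.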
I am working with a dataset with a similar format to the following:
Table: Account
*-----------*----------*-------------*
| id | amount | date |
*-----------*----------*-------------*
| 1 | 100 | 01/01/2016 |
| 2 | 100 | 01/02/2016 |
| 3 | 100 | 01/03/2016 |
| 4 | 200 | 01/04/2016 |
| 5 | 200 | 01/05/2016 |
| 6 | 200 | 01/06/2016 |
| 7 | 300 | 01/07/2016 |
| 8 | 300 | 01/08/2016 |
| 9 | 300 | 01/09/2016 |
| 10 | 400 | 01/10/2016 |
*-----------*----------*-------------*
I need a query that returns the most recent record for every distinct amount in the table. So, the above table would return
*-----------*----------*-------------*
| id | amount | date |
*-----------*----------*-------------*
| 3 | 100 | 01/03/2016 |
| 6 | 200 | 01/06/2016 |
| 9 | 300 | 01/09/2016 |
| 10 | 400 | 01/10/2016 |
*-----------*----------*-------------*
I am still new to subqueries but I tried the following
SELECT a.id, a.amount, a.date FROM account a WHERE a.date IN (SELECT MAX(date) FROM account)
However, this only returns the row with the latest date overall. How can I get the latest date for every distinct value in the amount column?
If you only need amount:
SELECT amount, MAX(date) from myTable group by amount
If you need more data:
SELECT * from myTable where (amount, date) IN (
SELECT amount, MAX(date) as date from myTable group by amount
)
Or maybe this will run faster:
SELECT * from myTable A WHERE NOT EXISTS (
SELECT 1
FROM myTable B
WHERE A.date < B.date
AND A.amount = B.amount
)
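The NOT EXISTS form can be checked against the sample data, together with a ROW_NUMBER() variant that works on MySQL 8+. This is a sketch under stated assumptions: it uses Python's sqlite3 module as a stand-in for MySQL, names the table account as in the question, and stores the dates as ISO strings (reading 01/03/2016 as 3 January 2016) so that comparing them orders them chronologically.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL 8+ (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE account (id INTEGER, amount INTEGER, date TEXT);
    INSERT INTO account VALUES
        (1, 100, '2016-01-01'), (2, 100, '2016-01-02'), (3, 100, '2016-01-03'),
        (4, 200, '2016-01-04'), (5, 200, '2016-01-05'), (6, 200, '2016-01-06'),
        (7, 300, '2016-01-07'), (8, 300, '2016-01-08'), (9, 300, '2016-01-09'),
        (10, 400, '2016-01-10');
""")

# Keep a row only if no other row with the same amount has a later date.
not_exists_rows = conn.execute("""
    SELECT a.id, a.amount, a.date
    FROM account a
    WHERE NOT EXISTS (
        SELECT 1 FROM account b
        WHERE b.amount = a.amount AND b.date > a.date
    )
    ORDER BY a.id
""").fetchall()

# Number the rows of each amount from newest to oldest and keep number 1.
row_number_rows = conn.execute("""
    SELECT id, amount, date
    FROM (
        SELECT id, amount, date,
               ROW_NUMBER() OVER (PARTITION BY amount ORDER BY date DESC) AS rn
        FROM account
    ) AS ranked
    WHERE rn = 1
    ORDER BY id
""").fetchall()

for row in row_number_rows:
    print(row)
```

The two forms differ on ties: if two rows share the latest date for an amount, NOT EXISTS returns both, while ROW_NUMBER() keeps exactly one of them.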