I have no idea how to write a (MySQL) query for this, though I am sure it is simple for an experienced person.
I have a table which summarizes sold items per day, like:
date        item    quantity
----------  ------  --------
2020-01-15  apple   3
2020-01-15  pear    2
2020-01-15  potato  1
2020-01-14  orange  3
2020-01-14  apple   2
2020-01-14  potato  2
2020-01-13  lemon   5
2020-01-13  kiwi    2
2020-01-13  apple   1
I would like to query the top N sellers for every day, sorted by date DESC and then by quantity DESC. For N = 2 the result would look like:
date        item    quantity
----------  ------  --------
2020-01-15  apple   3
2020-01-15  pear    2
2020-01-14  orange  3
2020-01-14  apple   2
2020-01-13  lemon   5
2020-01-13  kiwi    2
Please tell me how I can limit the number of items returned per date.
First of all, it is not a good idea to use DATE as the name of a column, because DATE is also a MySQL keyword.
You can use @rank := IF(@current = date, @rank + 1, 1) to number the rows within each date. Each time the date changes, the counter restarts at 1.
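SELECT date, item, quantity
FROM (
    SELECT s.date, s.item, s.quantity,
           @rank := IF(@current = s.date, @rank + 1, 1) AS ranking,
           @current := s.date AS current_date_marker
    FROM (
        -- total per item and day, sorted so the counter sees each day's best sellers first
        SELECT date, item, SUM(quantity) AS quantity
        FROM yourtable
        GROUP BY date, item
        ORDER BY date DESC, quantity DESC
    ) s
    CROSS JOIN (SELECT @rank := 0, @current := NULL) init
) t
WHERE t.ranking < 3;  -- N = 2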
You can do this if you are using MySQL 8.0 or later:
SELECT * FROM
(SELECT DATE, ITEM, QUANTITY, ROW_NUMBER() OVER (PARTITION BY DATE ORDER BY QUANTITY DESC) AS order_rank FROM TABLE_NAME) AS R
WHERE order_rank <= 2
I think you can use:
select t.*
from t
where (quantity, item) >= (select t2.quantity, t2.item
                           from t t2
                           where t2.date = t.date
                           order by t2.quantity desc, t2.item desc
                           limit 1 offset 1  -- offset is N - 1
                          );
The only caveat is that each day needs to have at least N items available (although that condition can be added as well). Ties in quantity are broken by item name, in the same direction as the row comparison.
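To try the window-function approach on the sample data from the question, here is a minimal sketch. It runs the query through Python's built-in sqlite3 module as a stand-in for MySQL 8 (SQLite 3.25 or later is assumed for window functions). The table name `sales`, the column name `sale_date` and the helper `top_sellers` are assumptions made for this illustration, and ties are broken alphabetically by item.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL 8 (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (sale_date TEXT, item TEXT, quantity INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [
        ("2020-01-15", "apple", 3), ("2020-01-15", "pear", 2), ("2020-01-15", "potato", 1),
        ("2020-01-14", "orange", 3), ("2020-01-14", "apple", 2), ("2020-01-14", "potato", 2),
        ("2020-01-13", "lemon", 5), ("2020-01-13", "kiwi", 2), ("2020-01-13", "apple", 1),
    ],
)


def top_sellers(n):
    """Return the n best-selling items per day, newest day first."""
    query = """
        SELECT sale_date, item, quantity
        FROM (
            SELECT sale_date, item, quantity,
                   ROW_NUMBER() OVER (
                       PARTITION BY sale_date
                       ORDER BY quantity DESC, item   -- item breaks ties
                   ) AS rn
            FROM sales
        ) AS ranked
        WHERE rn <= ?
        ORDER BY sale_date DESC, quantity DESC, item
    """
    return conn.execute(query, (n,)).fetchall()


for row in top_sellers(2):
    print(row)
```

Passing N as a bound parameter keeps the query text unchanged when the limit changes.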
Related
I would like to ask how to use the FIFO method with MySQL 8 to generate the expected result shown below.
The date of each stock row in the result table must be its stock-in date. Thanks to everyone who can help.
Original_Table
Stock Name  In/out  Quantity  Date
----------  ------  --------  --------
Apple       in      10        1/1/2021
Banana      in      5         1/2/2021
Banana      out     3         1/5/2021
Banana      in      4         1/6/2021
Cherry      in      3         1/6/2021
Cherry      in      4         1/7/2021
Cherry      out     5         1/8/2021
Expected_result
Stock Name  balance  stock_in_date
----------  -------  -------------
Apple       10       1/1/2021
Banana      2        1/2/2021
Banana      4        1/6/2021
Cherry      2        1/7/2021
WITH
cte1 AS (
SELECT name, SUM(quantity * (operation = 'out')) went_out
FROM test
GROUP BY name
),
cte2 AS (
SELECT *, SUM(quantity * (operation = 'in')) OVER (PARTITION BY name ORDER BY operation_date) amount
FROM test
)
SELECT name,
operation_date,
CASE WHEN amount - quantity < went_out
THEN amount - went_out
ELSE quantity
END result
FROM cte1
JOIN cte2 USING (name)
WHERE operation = 'in'
AND went_out < amount
ORDER BY 1,2;
fiddle with query building steps.
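As a quick check of the FIFO query above against the question's data, here is a small sketch that runs it through Python's sqlite3 module in place of MySQL 8 (SQLite 3.25 or later is assumed). The dates are rewritten in ISO format on the assumption that 1/2/2021 means January 2, so that they sort correctly as text. The table and column names follow the answer's query, not the question's headers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE test (name TEXT, operation TEXT, quantity INTEGER, operation_date TEXT)"
)
conn.executemany(
    "INSERT INTO test VALUES (?, ?, ?, ?)",
    [
        ("Apple", "in", 10, "2021-01-01"),
        ("Banana", "in", 5, "2021-01-02"),
        ("Banana", "out", 3, "2021-01-05"),
        ("Banana", "in", 4, "2021-01-06"),
        ("Cherry", "in", 3, "2021-01-06"),
        ("Cherry", "in", 4, "2021-01-07"),
        ("Cherry", "out", 5, "2021-01-08"),
    ],
)

FIFO_QUERY = """
WITH
cte1 AS (  -- total quantity taken out per stock name
    SELECT name, SUM(quantity * (operation = 'out')) went_out
    FROM test
    GROUP BY name
),
cte2 AS (  -- running total of stocked-in quantity per name
    SELECT *, SUM(quantity * (operation = 'in'))
              OVER (PARTITION BY name ORDER BY operation_date) amount
    FROM test
)
SELECT name,
       operation_date,
       CASE WHEN amount - quantity < went_out
            THEN amount - went_out   -- batch partly consumed
            ELSE quantity            -- batch untouched
       END result
FROM cte1
JOIN cte2 USING (name)
WHERE operation = 'in'
  AND went_out < amount              -- skip batches that are fully consumed
ORDER BY 1, 2
"""

balances = conn.execute(FIFO_QUERY).fetchall()
for row in balances:
    print(row)
```

The oldest stock-in batches are consumed first: a batch is dropped once the running total of stock-ins no longer exceeds the total taken out.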
I asked this question yesterday, but I didn't make it clear enough as it seems, so I'm gonna add some information to make everything clear.
Consider the following 2 tables:
0_12_table
ID userID text timestamp
1 1 bla 2020-08-07 10:30:00
2 1 blub 2020-08-06 11:30:00
3 1 abc 2020-08-05 09:20:00
4 1 def 2020-08-04 06:13:00
5 2 ghi 2020-08-02 08:05:00
6 2 abc 2020-08-05 10:20:00
7 3 def 2020-08-04 07:13:00
8 4 ghi 2020-08-02 09:05:00
9 5 jkl 2020-08-07 06:30:00
10 5 mno 2020-08-08 08:32:00
12_24_table:
ID userID text timestamp
1 1 bla 2020-08-07 19:30:00
2 1 blub 2020-08-06 21:30:00
3 1 abc 2020-08-05 19:20:00
4 2 def 2020-08-04 16:13:00
5 2 ghi 2020-08-02 18:05:00
6 2 abc 2020-08-05 20:20:00
7 3 def 2020-08-04 17:13:00
8 4 ghi 2020-08-02 19:05:00
9 5 jkl 2020-08-07 20:13:00
Basically, users can (and are encouraged to) add one entry to the database between 00:00 and 12:00 and one between 12:01 and 23:59.
Now I'd like to reward them for adding consecutive entries. Whenever they miss their timeframe, that "counter" is reset to 0 though...
In the data above, the user with userID 1 would have a streak of 3 days right now (it is 9 AM my time). Once it is past 12:00 noon and he hasn't made another entry, the counter is set to 0 and the streak is over, because he missed adding an entry for the morning.
The users with userIDs 2, 3 and 4 would have no streak at all. The streak is always cancelled when a morning entry or an evening entry is missing.
The user with userID 5 would have a streak of 1, which would increase to 2 once he makes his entry in the 12:01 to 23:59 timeframe.
I hope you understand the logic. The important part is that it does NOT matter if he had a streak of 10 two days ago. Whenever an entry is missing, the streak is reset to 0. So when there is no entry in the morning table by 12:00 noon on a given day, or no entry in the evening table by 23:59, the streak is gone. It always uses today as the reference, so it's really "consecutive entries until today".
The answer that seems to be as close as I got so far is the following:
select min(dte), max(dte), count(*)
from (select dte, (@rn := @rn + 1) as seqnum
      from (select dte
            from ((select date(timestamp) as dte, 1 as morning, 0 as evening
                   from morning
                  ) union all
                  (select date(timestamp) as dte, 0 as morning, 1 as evening
                   from evening
                  )
                 ) me
            group by dte
            having sum(morning) > 0 and sum(evening) > 0
            order by dte
           ) d cross join
           (select @rn := 0) params
     ) me
group by dte - interval seqnum day
order by count(*) desc
limit 1;
However, I haven't introduced the userID there so far, and the biggest problem is that it just takes a past streak, no matter whether there is a gap between it and today. But, as mentioned, the streak always has to use today as the reference.
I hope someone can help me here.
Last important information: I'm using MariaDB 10.1.45, so "WITH" or "ROWNUM()" is not available, updating is not possible right now.
Thanks in advance!
This would really be simpler in a more recent version that uses window functions. But you can adapt the variables to get all streaks for users:
select userid, count(*) as length
from (select userid, dte, (@rn := @rn + 1) as seqnum
      from (select userid, dte
            from ((select userid, date(timestamp) as dte, 1 as morning, 0 as evening
                   from morning
                  ) union all
                  (select userid, date(timestamp) as dte, 0 as morning, 1 as evening
                   from evening
                  )
                 ) me
            group by userid, dte
            having sum(morning) > 0 and sum(evening) > 0
            order by userid, dte
           ) d cross join
           (select @rn := 0) params
     ) me
group by userid, dte - interval seqnum day
order by count(*) desc;
It turns out that a "global" sequence works as well as local sequences for this problem, so the variable use is still simple. The changes are to the group by and order by clauses.
You can then use this as a subquery to get the maximum:
select userid, max(seq)
from (select userid, count(*) as seq
      from (select userid, dte, (@rn := @rn + 1) as seqnum
            from (select userid, dte
                  from ((select userid, date(timestamp) as dte, 1 as morning, 0 as evening
                         from morning
                        ) union all
                        (select userid, date(timestamp) as dte, 0 as morning, 1 as evening
                         from evening
                        )
                       ) me
                  group by userid, dte
                  having sum(morning) > 0 and sum(evening) > 0
                  order by userid, dte
                 ) d cross join
                 (select @rn := 0) params
           ) me
      group by userid, dte - interval seqnum day
     ) u
group by userid;
Note: Users with no streaks would be filtered out. You can put them back in using a left join in the outer query. However, you would really want a table of all users for this, rather than your two separate tables, so I haven't bothered.
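The queries above return every streak, or the longest one, while the question asks for the streak that is still alive today. One way to reason about that rule is to spell it out in plain code first. The following is only a sketch of the rule as I read it from the question, not a replacement for the SQL: the function name `current_streak` and the exact noon cutoff (`hour >= 12`) are assumptions.

```python
from datetime import date, datetime, timedelta


def current_streak(morning_days, evening_days, now):
    """Count consecutive complete days (morning and evening entry) up to today."""
    morning, evening = set(morning_days), set(evening_days)
    today = now.date()
    # The morning window has closed without an entry: the streak is lost.
    if now.hour >= 12 and today not in morning:
        return 0
    # Today only counts once both entries exist; until then it is still pending,
    # so counting starts from yesterday.
    complete_today = today in morning and today in evening
    day = today if complete_today else today - timedelta(days=1)
    streak = 0
    while day in morning and day in evening:
        streak += 1
        day -= timedelta(days=1)
    return streak


# Entry dates per user, taken from the sample tables in the question.
user1_morning = [date(2020, 8, d) for d in (7, 6, 5, 4)]
user1_evening = [date(2020, 8, d) for d in (7, 6, 5)]
user5_morning = [date(2020, 8, 7), date(2020, 8, 8)]
user5_evening = [date(2020, 8, 7)]

nine_am = datetime(2020, 8, 8, 9, 0)
print(current_streak(user1_morning, user1_evening, nine_am))  # user 1
print(current_streak(user5_morning, user5_evening, nine_am))  # user 5
```

In SQL terms this corresponds to keeping only the island of consecutive dates whose last day is today or yesterday, and discarding all older islands.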
I have this table below and want to get the min value of quantity, max value of quantity, first value of quantity and last value of quantity. The new table should be grouped by date with a 1 day interval.
id item quantity date
1 xLvCm 2 2020-01-10 19:15:03
1 UBizL 4 2020-01-10 20:16:41
1 xLvCm 1 2020-01-10 21:21:12
1 xLvCm 3 2020-01-11 11:14:00
1 UBizL 1 2020-01-11 15:01:10
1 moJEe 4 2020-01-12 00:15:50
1 moJEe 1 2020-01-12 02:11:23
1 UBizL 1 2020-01-12 04:16:17
1 KiZoX 3 2020-01-13 10:10:02
1 KiZoX 2 2020-01-13 19:05:40
1 KiZoX 1 2020-01-13 20:14:33
This is the expected table result
min(quantity) max(quantity) first(quantity) last(quantity) date
1 4 2 1 2020-01-10 19:15:03
1 3 3 1 2020-01-11 11:14:00
1 4 4 1 2020-01-12 00:15:50
1 4 3 1 2020-01-13 10:10:02
The SQL query I have tried is
SELECT MIN(quantity), MAX(quantity), FIRST(quantity), LAST(quantity) FROM tablename GROUP BY date
I can't figure out how to include the first and last values of quantity and group by day (like 10, 11, 12, 13) instead of date like (2020-01-10 19:15:03)
It is important to state the database tool you are using because of the different functionality available in each of them. But if you were using Snowflake this is something I would try:
select distinct day(date) as day_of_month,
min(quantity) over (partition by day(date) order by date range between unbounded preceding and UNBOUNDED FOLLOWING) min_quantity,
max(quantity) over (partition by day(date) order by date range between unbounded preceding and UNBOUNDED FOLLOWING) max_quantity ,
last_value(QUANTITY) over (partition by day(date) order by date range BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING ) as last_quantity,
first_value(QUANTITY) over (partition by day(date) order by date range BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING ) as first_quantity
from demo_db.staging.test
It is important to note that this is a costly query. If your table is huge this might take too long.
A common approach to this problem is to use window functions and aggregation. Here is one method:
SELECT date(date), MIN(quantity), MAX(quantity),
MAX(CASE WHEN seqnum_a = 1 THEN quantity END) as first_quantity,
MAX(CASE WHEN seqnum_d = 1 THEN quantity END) as last_quantity
FROM (SELECT t.*,
ROW_NUMBER() OVER (PARTITION BY date(date) ORDER BY date) as seqnum_a,
ROW_NUMBER() OVER (PARTITION BY date(date) ORDER BY date desc) as seqnum_d
FROM tablename t
) t
GROUP BY date(date);
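To see what the two row numbers do, here is a small sketch that runs the same pattern on the question's sample rows through Python's sqlite3 module (standing in for a database with window functions, SQLite 3.25 or later assumed). The timestamp column is renamed to `sold_at` for this illustration to avoid the clash with the `date()` function. That name is an assumption, not part of the original schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tablename (item TEXT, quantity INTEGER, sold_at TEXT)")
conn.executemany(
    "INSERT INTO tablename VALUES (?, ?, ?)",
    [
        ("xLvCm", 2, "2020-01-10 19:15:03"), ("UBizL", 4, "2020-01-10 20:16:41"),
        ("xLvCm", 1, "2020-01-10 21:21:12"), ("xLvCm", 3, "2020-01-11 11:14:00"),
        ("UBizL", 1, "2020-01-11 15:01:10"), ("moJEe", 4, "2020-01-12 00:15:50"),
        ("moJEe", 1, "2020-01-12 02:11:23"), ("UBizL", 1, "2020-01-12 04:16:17"),
        ("KiZoX", 3, "2020-01-13 10:10:02"), ("KiZoX", 2, "2020-01-13 19:05:40"),
        ("KiZoX", 1, "2020-01-13 20:14:33"),
    ],
)

QUERY = """
SELECT date(sold_at) AS day,
       MIN(quantity), MAX(quantity),
       -- seqnum_a = 1 marks the earliest row of the day, seqnum_d = 1 the latest
       MAX(CASE WHEN seqnum_a = 1 THEN quantity END) AS first_quantity,
       MAX(CASE WHEN seqnum_d = 1 THEN quantity END) AS last_quantity
FROM (SELECT t.*,
             ROW_NUMBER() OVER (PARTITION BY date(sold_at) ORDER BY sold_at) AS seqnum_a,
             ROW_NUMBER() OVER (PARTITION BY date(sold_at) ORDER BY sold_at DESC) AS seqnum_d
      FROM tablename t
     ) t
GROUP BY date(sold_at)
ORDER BY day
"""

daily = conn.execute(QUERY).fetchall()
for row in daily:
    print(row)
```

Each CASE expression is non-NULL for exactly one row per day, so wrapping it in MAX simply picks that row's quantity.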
Try this:
select A.minquantity,A.maxquantity,B.firstquantity,C.lastquantity,A.date from (
(select min(quantity) as minquantity,max(quantity) as maxquantity,Date(date) as date
from Test group by Date(date))A
join
(select Date(date) as date,quantity as firstquantity from
Test where date in (select min(date) from Test group by Date(date)))B
on A.date=B.date
join
(select Date(date)as date,quantity as lastquantity from Test
where date in (select max(date) from Test group by Date(date)))C
on A.date=C.date
);
Output:
1 4 2 1 2020-01-10
1 3 3 1 2020-01-11
1 4 4 1 2020-01-12
1 3 3 1 2020-01-13
I have 2 tables in MySQL (InnoDB). The first is an employee table. The other table is the expense table. For simplicity, the employee table contains just id and first_name. The expense table contains id, employee_id (foreign key), amount_spent, budget, and created_time. What I would like is a query that returns, for each employee, the percentage of their budget spent over the most recent X expenses they've registered.
So given the employee table:
| id | first_name
-------------------
1 alice
2 bob
3 mike
4 sally
and the expense table:
| id | employee_id | amount_spent | budget | created_time
----------------------------------------------------------
1 1 10 100 10/18
2 1 50 100 10/19
3 1 0 40 10/20
4 2 5 20 10/22
5 2 10 70 10/23
6 2 75 100 10/24
7 3 50 50 10/25
The query for the last 3 trips would return
|employee_id| first_name | percentage_spent |
--------------------------------------------
1 alice .2500 <----------(60/240)
2 bob .4736 <----------(90/190)
3 mike 1.000 <----------(50/50)
The query for the last 2 trips would return
|employee_id| first_name | percentage_spent |
--------------------------------------------
1 alice .3571 <----------(50/140)
2 bob .5000 <----------(85/170)
3 mike 1.000 <----------(50/50)
It would be nice if the query, as noted above, did not return any employees who have not registered any expenses (sally). Thanks in advance!!
I'd advise converting the data type of created_time to DATETIME in order to get accurate results.
For now, I've assumed that the highest id indicates the most recent expense, as that is what the sample data suggests.
The following query should work (not tested, though):
select t2.employee_id, t1.first_name,
       sum(t2.amount_spent)/sum(t2.budget) as percentage_spent
from employee t1
inner join
  (select temp.* from
     (select e.*, @num := if(@type = employee_id, @num + 1, 1) as row_num,
             @type := employee_id as dummy
      from expense e
      cross join (select @num := 0, @type := null) init
      order by employee_id, id desc) temp
   where temp.row_num <= 3  -- write the value of n here
  ) t2
on t1.id = t2.employee_id
group by t2.employee_id, t1.first_name
;
Feel free to ask if you have any questions.
Hope it helps!
If you are using MySQL 8.0.2 or higher, you can use window functions for it.
SELECT employee_id, first_name, sliding_sum_spent/sliding_sum_budget AS percentage_spent
FROM
(
SELECT employee_id, first_name,
       -- current row plus the 2 rows before it = the last 3 expenses
       SUM(amount_spent) OVER (PARTITION BY employee_id
                               ORDER BY created_time
                               ROWS BETWEEN 2 PRECEDING AND CURRENT ROW) AS sliding_sum_spent,
       SUM(budget) OVER (PARTITION BY employee_id
                         ORDER BY created_time
                         ROWS BETWEEN 2 PRECEDING AND CURRENT ROW) AS sliding_sum_budget,
       ROW_NUMBER() OVER (PARTITION BY employee_id
                          ORDER BY created_time DESC) rn
FROM expense
JOIN employee ON expense.employee_id = employee.id
) t
WHERE t.rn = 1
As mentioned by Harshil, ordering the rows by created_time may be a problem in its current format, so it would be better to store it as a DATE or DATETIME.
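For comparison, here is a sketch of a simpler variant that numbers each employee's expenses from newest to oldest and sums only the first N. It runs on the question's sample data through Python's sqlite3 module as a stand-in for MySQL 8 (SQLite 3.25 or later assumed). Recency is taken from `id`, as in the first answer, and the helper name `percentage_spent` is an assumption for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (id INTEGER PRIMARY KEY, first_name TEXT)")
conn.execute(
    "CREATE TABLE expense (id INTEGER PRIMARY KEY, employee_id INTEGER,"
    " amount_spent INTEGER, budget INTEGER)"
)
conn.executemany(
    "INSERT INTO employee VALUES (?, ?)",
    [(1, "alice"), (2, "bob"), (3, "mike"), (4, "sally")],
)
conn.executemany(
    "INSERT INTO expense VALUES (?, ?, ?, ?)",
    [
        (1, 1, 10, 100), (2, 1, 50, 100), (3, 1, 0, 40),
        (4, 2, 5, 20), (5, 2, 10, 70), (6, 2, 75, 100),
        (7, 3, 50, 50),
    ],
)


def percentage_spent(n):
    """Share of budget spent over each employee's n most recent expenses."""
    query = """
        SELECT e.id, e.first_name,
               -- 1.0 * forces floating-point division in SQLite
               1.0 * SUM(x.amount_spent) / SUM(x.budget) AS percentage_spent
        FROM employee e
        JOIN (
            SELECT employee_id, amount_spent, budget,
                   ROW_NUMBER() OVER (PARTITION BY employee_id ORDER BY id DESC) AS rn
            FROM expense
        ) x ON x.employee_id = e.id
        WHERE x.rn <= ?
        GROUP BY e.id, e.first_name
        ORDER BY e.id
    """
    return [(i, name, round(p, 4)) for i, name, p in conn.execute(query, (n,))]


print(percentage_spent(3))
print(percentage_spent(2))
```

The inner join drops employees without any expenses (sally), which is what the question asks for.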
id originator revenue date
-- ---------- ------- ----------
1 acme 1 2013-09-15
2 acme 0 2013-09-15
3 acme 4 2013-09-14
4 acme 6 2013-09-13
5 acme -6 2013-09-13
6 hello 1 2013-09-15
7 hello 0 2013-09-14
8 hello 2 2013-09-13
9 hello 5 2013-09-14
I have the table above, and I would like to add a ranking column based on the revenue each originator generated over the last 3 days.
The fields should be displayed as below:
originator revenue toprank
---------- ------- -------
hello 8 1
acme 5 2
2) Based on the above data, I would also like to calculate the average revenue according to the following criterion:
If the total revenue for a date is 0 (zero), that date should not be counted when calculating the average.
a) The average for originator acme should be sum of revenue / count(dates where the revenue is non-zero), so (4+1)/2, i.e. 2.5
b) The average for originator hello should be sum of revenue / count(dates where the revenue is non-zero), so (5+2+1)/3, i.e. 2.6666
originator revenue toprank avg(3 days)
---------- ------- ------- -----------
hello 8 1 2.6666
acme 5 2 2.5
To ignore a row when averaging, give AVG a NULL value. The NULLIF function is good for this.
The ranking is problematic in MySQL before 8.0. Those versions don't support the analytic functions that make this a bit easier to do in Oracle, Teradata, etc. The most common workaround is to use a counter variable, and that requires an ordered set of rows, which means the total revenue must be calculated in an inner query.
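SELECT originator, TotalRev, Avg3Days, @rnk := @rnk + 1 AS TopRank
FROM (
    SELECT
        originator,
        SUM(DayRev) AS TotalRev,
        AVG(NULLIF(DayRev, 0)) AS Avg3Days
    FROM (
        -- sum per date first, so a date whose total is 0 is skipped by AVG
        SELECT originator, date, SUM(revenue) AS DayRev
        FROM myTable
        GROUP BY originator, date
    ) Days
    GROUP BY originator
    ORDER BY TotalRev DESC
) Tots, (SELECT @rnk := 0) Ranks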
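The question asks for dates whose total revenue is zero to be left out of the average, so the revenue has to be summed per date before NULLIF is applied. Here is a sketch of that on the question's data, run through Python's sqlite3 module with ROW_NUMBER in place of the counter variable (SQLite 3.25 or later assumed, as a stand-in for MySQL 8). The column name `rev_date` and the aliases are assumptions for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE myTable (id INTEGER, originator TEXT, revenue INTEGER, rev_date TEXT)")
conn.executemany(
    "INSERT INTO myTable VALUES (?, ?, ?, ?)",
    [
        (1, "acme", 1, "2013-09-15"), (2, "acme", 0, "2013-09-15"),
        (3, "acme", 4, "2013-09-14"), (4, "acme", 6, "2013-09-13"),
        (5, "acme", -6, "2013-09-13"), (6, "hello", 1, "2013-09-15"),
        (7, "hello", 0, "2013-09-14"), (8, "hello", 2, "2013-09-13"),
        (9, "hello", 5, "2013-09-14"),
    ],
)

QUERY = """
SELECT originator, total_rev, avg_rev,
       ROW_NUMBER() OVER (ORDER BY total_rev DESC) AS toprank
FROM (
    SELECT originator,
           SUM(day_rev) AS total_rev,
           AVG(NULLIF(day_rev, 0)) AS avg_rev   -- zero-revenue dates become NULL
    FROM (
        SELECT originator, rev_date, SUM(revenue) AS day_rev
        FROM myTable
        GROUP BY originator, rev_date
    ) AS days
    GROUP BY originator
) AS totals
ORDER BY toprank
"""

ranking = [
    (name, total, round(avg, 4), rank)
    for name, total, avg, rank in conn.execute(QUERY)
]
for row in ranking:
    print(row)
```

For acme, 2013-09-13 sums to 6 + (-6) = 0 and is skipped, leaving (1 + 4) / 2 = 2.5 as in the question.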
If you want to get the values for the last 3 days from today's date, try something like this:
SET @rank=0;
select originator, rev, @rank:=@rank+1 AS `rank`, avg
FROM
(select originator, sum(day_rev) as rev,
AVG(NULLIF(day_rev, 0)) as avg
FROM (select originator, date, sum(revenue) as day_rev  -- total per date
      from t1
      WHERE date >= DATE_ADD(CURDATE(), INTERVAL -3 DAY)
      group by originator, date) as d
group by originator
order by 2 desc) as t2;
EDITED:
If you want to get the values for the last 3 days from the nearest date, try this:
SET @rank=0;
select originator, rev, @rank:=@rank+1 AS `rank`, avg
from
(select originator, sum(day_rev) as rev,
AVG(NULLIF(day_rev, 0)) as avg
from (select originator, date, sum(revenue) as day_rev  -- total per date
      from t1
      WHERE date >= DATE_ADD((select max(date) from t1), INTERVAL -3 DAY)
      group by originator, date) as d
group by originator
order by 2 desc) as t2;