MySQL GROUP BY where day <= x
I need some help figuring out the correct SQL statement.
I've got a table with the following structure:
id, product_id, units, timestamp
I want a list containing the overall units per day. A product has at most one record per day.
So my first try was:
SELECT
DATE(timestamp) as day, SUM(units) as overall_units
FROM
tbl
GROUP BY
DATE(timestamp);
Normally this should do it. But sometimes there are days with no record for a product. The units are still in the warehouse, though, so they should be included in the calculation.
For example:
We have 3 products. Cars, pens and wheels.
Records from 2012-10-20:
Cars => 5
Pens => 20
Wheels => 4
Records from 2012-10-21
Cars => 5
Wheels => 6
My query would give the following results:
2012-10-20 => 29
2012-10-21 => 11
But if there's no record for a product on a given day, I want the query to use that product's most recent earlier record.
So it should be:
2012-10-21 => 31
I hope you understand my needs.
SELECT
MAX(DATE(timestamp)) as day, SUM(units) as overall_units
FROM tbl
Update:
SELECT max(day), sum(ou) from
( select DATE(timestamp) as day, SUM(units) as ou
FROM tbl
GROUP BY DATE(timestamp)
) AS per_day;
The inner query will return
2012-10-20 , 29
2012-10-21 , 11
and the final query will return
2012-10-21 , 40
SELECT
dd.ddate AS day
, SUM(t.units) as overall_units
FROM
( SELECT DISTINCT
product_id
FROM
tbl
) AS dp
CROSS JOIN
( SELECT DISTINCT
DATE(timestamp) AS ddate
FROM
tbl
) AS dd
JOIN
tbl AS t
ON t.id =
( SELECT
tt.id
FROM
tbl AS tt
WHERE
tt.product_id = dp.product_id
AND
tt.timestamp < dd.ddate + INTERVAL 1 DAY
ORDER BY tt.timestamp DESC
LIMIT 1
)
GROUP BY
dd.ddate ;
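To check the carry-forward idea against the sample data from the question, here is a minimal sketch that runs essentially the same query through Python's built-in sqlite3 module. It is an assumption-laden stand-in, not MySQL: the product ids 1, 2, 3 for cars, pens and wheels, the 08:00:00 times and the helper name overall_units_per_day are made up for illustration, and the date condition is written as DATE(tt.timestamp) <= dd.ddate because SQLite has no INTERVAL arithmetic.

```python
import sqlite3

# Sample data from the question as (product_id, units, timestamp).
# Product ids and the time of day are assumptions for illustration.
SAMPLE_ROWS = [
    (1, 5, "2012-10-20 08:00:00"),   # cars
    (2, 20, "2012-10-20 08:00:00"),  # pens
    (3, 4, "2012-10-20 08:00:00"),   # wheels
    (1, 5, "2012-10-21 08:00:00"),   # cars
    (3, 6, "2012-10-21 08:00:00"),   # wheels (no pens record this day)
]


def overall_units_per_day(rows):
    """Sum units per day, reusing each product's latest record on or before that day."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE tbl (id INTEGER PRIMARY KEY, product_id INTEGER,"
        " units INTEGER, timestamp TEXT)"
    )
    conn.executemany(
        "INSERT INTO tbl (product_id, units, timestamp) VALUES (?, ?, ?)", rows
    )
    query = """
        SELECT dd.ddate AS day, SUM(t.units) AS overall_units
        FROM (SELECT DISTINCT product_id FROM tbl) AS dp
        CROSS JOIN (SELECT DISTINCT DATE(timestamp) AS ddate FROM tbl) AS dd
        JOIN tbl AS t
          ON t.id = (SELECT tt.id
                     FROM tbl AS tt
                     WHERE tt.product_id = dp.product_id
                       AND DATE(tt.timestamp) <= dd.ddate
                     ORDER BY tt.timestamp DESC
                     LIMIT 1)
        GROUP BY dd.ddate
        ORDER BY dd.ddate
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result
```

Calling overall_units_per_day(SAMPLE_ROWS) should give 29 for 2012-10-20 and 31 for 2012-10-21, because the pens record from the 20th is reused on the 21st.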
I think you should look into the DISTINCT keyword.
The query could be something like: SELECT DISTINCT product_id, tbl.* FROM tbl ORDER BY timestamp DESC; Then use PHP to loop through your results and accumulate the units.
Related
Getting total average between dates
I have a table named sales with the following format.
sale_id | user_id | sale_date | sale_cost
j847bv-6ggd | bd48ta36-cn5x | 2017-01-10 15:43:12 | 30
vf87x2-15gr | bd48ta36-cn5x | 2017-01-05 13:41:16 | 60
3gfd7f-2cdd | 8g4f5ccf-1fet | 2017-01-15 14:10:12 | 100
4bgfd5-12vn | 8g4f5ccf-1fet | 2017-01-20 19:47:14 | 20
b58e32-bf87 | 8g4f5ccf-1fet | 2017-01-20 17:35:13 | 15
bg87db-127g | gr4gg1f4-3gbb | 2017-01-20 12:26:15 | 80
How could I get the average amount that a user (user_id) spends within the first X days after their first purchase? I don't want an average for every user, but a total average across all users. I know that I can get the average using select avg(sale_cost), but I'm unsure how to restrict it to a date period.
You can find the average of the per-user totals within a 10-day range from each user's initial sale date like this:
select avg(sale_cost)
from (
select sum(t.sale_cost) sale_cost
from your_table t
join (
select user_id, min(sale_date) start_date, date_add(min(sale_date), interval 10 day) end_date
from your_table
group by user_id
) t2 on t.user_id = t2.user_id
and t.sale_date between t2.start_date and t2.end_date
group by t.user_id
) t;
It finds the first sale_date and the date 10 days after it for each user. It then joins that with the table to get each user's total within that range, and finally averages those totals.
If you want the average between the overall first sale_date (not each user's) and 10 days after it, use:
select avg(sale_cost)
from (
select sum(t.sale_cost) sale_cost
from your_table t
join (
select min(sale_date) start_date, date_add(min(sale_date), interval 10 day) end_date
from your_table
) t2 on t.sale_date between t2.start_date and t2.end_date
group by t.user_id
) t;
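As a rough way to try the per-user version on the rows from the question, the sketch below runs it through Python's sqlite3 module. This is an assumption, not MySQL: SQLite has no date_add, so datetime(MIN(sale_date), '+N days') is used instead, and the helper name average_early_spend is hypothetical.

```python
import sqlite3

# Rows copied from the question: (sale_id, user_id, sale_date, sale_cost).
SALES = [
    ("j847bv-6ggd", "bd48ta36-cn5x", "2017-01-10 15:43:12", 30),
    ("vf87x2-15gr", "bd48ta36-cn5x", "2017-01-05 13:41:16", 60),
    ("3gfd7f-2cdd", "8g4f5ccf-1fet", "2017-01-15 14:10:12", 100),
    ("4bgfd5-12vn", "8g4f5ccf-1fet", "2017-01-20 19:47:14", 20),
    ("b58e32-bf87", "8g4f5ccf-1fet", "2017-01-20 17:35:13", 15),
    ("bg87db-127g", "gr4gg1f4-3gbb", "2017-01-20 12:26:15", 80),
]


def average_early_spend(sales, days):
    """Average, over all users, of each user's total spend in the
    first `days` days after that user's first purchase."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE sales (sale_id TEXT, user_id TEXT,"
        " sale_date TEXT, sale_cost REAL)"
    )
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?, ?)", sales)
    query = """
        SELECT AVG(total)
        FROM (
            SELECT SUM(t.sale_cost) AS total
            FROM sales AS t
            JOIN (
                SELECT user_id,
                       MIN(sale_date) AS start_date,
                       datetime(MIN(sale_date), ?) AS end_date
                FROM sales
                GROUP BY user_id
            ) AS t2
              ON t.user_id = t2.user_id
             AND t.sale_date BETWEEN t2.start_date AND t2.end_date
            GROUP BY t.user_id
        )
    """
    # SQLite date modifier such as '+10 days' replaces MySQL's INTERVAL 10 DAY.
    (average,) = conn.execute(query, (f"+{days} days",)).fetchone()
    conn.close()
    return average
```

With a 3-day window only each user's first purchase falls inside the range (60, 100 and 80), so the result is 80.0. With a 10-day window the per-user totals are 90, 135 and 80.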
The BETWEEN operator comes in handy whenever you need to check ranges:
SELECT column_name(s) FROM table_name WHERE column_name BETWEEN value1 AND value2;
In this case value1 and value2 will be replaced by your dates, using either
'2011-01-01 00:00:00' AND '2011-01-31 23:59:59'
or
sale_date AND DATE_ADD(sale_date, INTERVAL 10 DAY)
The first way is faster. Note also that the BETWEEN bounds are inclusive.
MySQL loop and multiple LEFT joins
I've got the following code:
SELECT COALESCE(rv.views, 0) as views
FROM ( select 0 as n union all select 1 union all select 2 union all select 3 ) n
LEFT JOIN restaurant_views rv on rv.date = date_add("2015-02-24", interval - n.n day) and restaurant_id = 192
This code gives me the number of views a restaurant had over the last 4 days. I am looking for a similar query to get the number of likes a restaurant had over the last 4 days. This is what I've got so far:
SELECT ( COUNT( DISTINCT a.restaurant_id) + COUNT( DISTINCT d.restaurant_id)) as num_likes
FROM ( select 0 as n union all select 1 union all select 2 union all select 3 ) n
LEFT JOIN apple_likes a on a.vote_date = date_add("2015-02-24", interval - n.n day) and a.restaurant_id = 192
LEFT JOIN android_likes d on d.vote_date = date_add("2015-02-24", interval - n.n day) and d.restaurant_id = 192
The output is not what I'm looking for. What do I have to change to get the number of likes in the last query? (I have checked that the restaurant has likes on all days, so I am positive something is wrong with the query.)
Try this one:
SELECT (COALESCE(a.likes, 0) + COALESCE(d.likes, 0)) as num_likes
FROM ( select 0 as n union all select 1 union all select 2 union all select 3 ) n
LEFT JOIN (
SELECT vote_date, COUNT(*) as likes
FROM apple_likes
WHERE restaurant_id = 192
GROUP BY restaurant_id, vote_date
) as a on a.vote_date = date_add("2015-02-24", interval - n.n day)
LEFT JOIN (
SELECT vote_date, COUNT(*) as likes
FROM android_likes
WHERE restaurant_id = 192
GROUP BY restaurant_id, vote_date
) as d on d.vote_date = date_add("2015-02-24", interval - n.n day)
I can think of a couple of things you might be running into.
Just because somebody views a restaurant, does that mean they actually voted? And if they voted, are Apple and Android the only two devices? What if they are viewing from a browser on a Windows machine?
Date equality. In the restaurant views table, is the date field always at time 00:00:00 (i.e. midnight at the start of the day)? If the timestamps of the votes carry any other time, a comparison of date = date + time is probably failing. What you may need is a comparison of date( vote_date ) = date( date_add( ... )) so that both sides ignore the time component.
That being said, a function on a date column is not going to be fully optimized, even if the restaurant ID is numeric and part of the index key; it would be partially optimized. You may want to add a generic date condition such as AND vote_date >= '2015-02-20' so the index on restaurant and date can be used, then apply DATE( vote_date ) for the actual qualifying of records.
Calculating a Moving Average MySQL?
Good day, I am using the following code to calculate the 9-day moving average.
SELECT SUM(close) FROM tbl WHERE date <= '2002-07-05' AND name_id = 2 ORDER BY date DESC LIMIT 9
But it does not work, because it first calculates over all of the returned rows before the LIMIT is applied. In other words, it sums all the closes on or before that date, not just the last 9. So I need to calculate the SUM from the rows returned by the select, rather than calculate it directly, i.e. select the SUM from the SELECT. How would I go about doing this, and is it very costly, or is there a better way?
If you want the moving average for each date, then try this:
SELECT date, SUM(close),
(select avg(close) from tbl t2
where t2.name_id = t.name_id and datediff(t.date, t2.date) between 0 and 8
) as mvgAvg
FROM tbl t
WHERE date <= '2002-07-05' and name_id = 2
GROUP BY date
ORDER BY date DESC
It uses a correlated subquery to calculate the average over the 9 calendar days ending on each date, which is 9 values when there is one row per day.
Starting from MySQL 8, you should use window functions for this. Using the window RANGE clause, you can create a logical window over an interval, which is very powerful. Something like this:
SELECT date, close, AVG(close) OVER (ORDER BY date DESC RANGE INTERVAL 9 DAY PRECEDING)
FROM tbl
WHERE date <= DATE '2002-07-05' AND name_id = 2
ORDER BY date DESC
For example:
WITH t (date, `close`) AS (
SELECT DATE '2020-01-01', 50 UNION ALL
SELECT DATE '2020-01-03', 54 UNION ALL
SELECT DATE '2020-01-05', 51 UNION ALL
SELECT DATE '2020-01-12', 49 UNION ALL
SELECT DATE '2020-01-13', 59 UNION ALL
SELECT DATE '2020-01-15', 30 UNION ALL
SELECT DATE '2020-01-17', 35 UNION ALL
SELECT DATE '2020-01-18', 39 UNION ALL
SELECT DATE '2020-01-19', 47 UNION ALL
SELECT DATE '2020-01-26', 50
)
SELECT date, `close`, COUNT(*) OVER w AS c, SUM(`close`) OVER w AS s, AVG(`close`) OVER w AS a
FROM t
WINDOW w AS (ORDER BY date DESC RANGE INTERVAL 9 DAY PRECEDING)
ORDER BY date DESC
Leading to:
date |close|c|s |a |
----------|-----|-|---|-------|
2020-01-26| 50|1| 50|50.0000|
2020-01-19| 47|2| 97|48.5000|
2020-01-18| 39|3|136|45.3333|
2020-01-17| 35|4|171|42.7500|
2020-01-15| 30|4|151|37.7500|
2020-01-13| 59|5|210|42.0000|
2020-01-12| 49|6|259|43.1667|
2020-01-05| 51|3|159|53.0000|
2020-01-03| 54|3|154|51.3333|
2020-01-01| 50|3|155|51.6667|
Use something like
SELECT sum(close) as sum, avg(close) as average
FROM (
SELECT (close) FROM tbl WHERE date <= '2002-07-05' AND name_id = 2 ORDER BY date DESC LIMIT 9
) temp
The inner query returns the 9 most recent filtered rows in descending order, and the outer query then sums and averages only those rows. Your original query doesn't work because the sum is calculated first and the LIMIT clause is applied afterwards, to the single aggregated row, so you get the sum of all the matching rows.
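The difference between the two forms is easy to see on a small made-up series. The sketch below uses Python's sqlite3 module as a stand-in for MySQL (an assumption; the name_id filter is left out, and the 12 closing prices 1..12 and the helper name compare_limit_placement are invented for illustration).

```python
import sqlite3
from datetime import date, timedelta


def compare_limit_placement(cutoff="2002-07-05", n=9):
    """Return (naive_sum, nested_sum, nested_avg) for a toy price series."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE tbl (date TEXT, close REAL)")
    # 12 consecutive days ending on 2002-07-05 with closes 1, 2, ..., 12.
    start = date(2002, 6, 24)
    rows = [((start + timedelta(days=i)).isoformat(), i + 1) for i in range(12)]
    conn.executemany("INSERT INTO tbl VALUES (?, ?)", rows)

    # LIMIT applied to the already aggregated single row: sums every match.
    (naive_sum,) = conn.execute(
        "SELECT SUM(close) FROM tbl WHERE date <= ? ORDER BY date DESC LIMIT ?",
        (cutoff, n),
    ).fetchone()

    # LIMIT applied inside the derived table: only the last n rows are summed.
    nested_sum, nested_avg = conn.execute(
        "SELECT SUM(close), AVG(close) FROM ("
        " SELECT close FROM tbl WHERE date <= ? ORDER BY date DESC LIMIT ?"
        ") AS temp",
        (cutoff, n),
    ).fetchone()
    conn.close()
    return naive_sum, nested_sum, nested_avg
```

Here the naive form adds up all 12 closes (78), while the nested form adds up only the last 9 (4 through 12, which is 72, average 8).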
Another technique is to create a helper table of numbers:
CREATE TABLE `tinyint_asc` ( `value` tinyint(3) unsigned NOT NULL default '0', PRIMARY KEY (value) ) ;
INSERT INTO `tinyint_asc` VALUES (0),(1),(2),(3),(4),(5),(6),(7),(8),(9),(10),(11),(12),(13),(14),(15),(16),(17),(18),(19),(20),(21),(22),(23),(24),(25),(26),(27),(28),(29),(30),(31),(32),(33),(34),(35),(36),(37),(38),(39),(40),(41),(42),(43),(44),(45),(46),(47),(48),(49),(50),(51),(52),(53),(54),(55),(56),(57),(58),(59),(60),(61),(62),(63),(64),(65),(66),(67),(68),(69),(70),(71),(72),(73),(74),(75),(76),(77),(78),(79),(80),(81),(82),(83),(84),(85),(86),(87),(88),(89),(90),(91),(92),(93),(94),(95),(96),(97),(98),(99),(100),(101),(102),(103),(104),(105),(106),(107),(108),(109),(110),(111),(112),(113),(114),(115),(116),(117),(118),(119),(120),(121),(122),(123),(124),(125),(126),(127),(128),(129),(130),(131),(132),(133),(134),(135),(136),(137),(138),(139),(140),(141),(142),(143),(144),(145),(146),(147),(148),(149),(150),(151),(152),(153),(154),(155),(156),(157),(158),(159),(160),(161),(162),(163),(164),(165),(166),(167),(168),(169),(170),(171),(172),(173),(174),(175),(176),(177),(178),(179),(180),(181),(182),(183),(184),(185),(186),(187),(188),(189),(190),(191),(192),(193),(194),(195),(196),(197),(198),(199),(200),(201),(202),(203),(204),(205),(206),(207),(208),(209),(210),(211),(212),(213),(214),(215),(216),(217),(218),(219),(220),(221),(222),(223),(224),(225),(226),(227),(228),(229),(230),(231),(232),(233),(234),(235),(236),(237),(238),(239),(240),(241),(242),(243),(244),(245),(246),(247),(248),(249),(250),(251),(252),(253),(254),(255);
After that you can use it like this (myvalue stands for the column you want to average):
select date_add(tbl.date, interval tinyint_asc.value day) as mydate, count(*), sum(myvalue)
from tbl
inner join tinyint_asc on tinyint_asc.value < 30 -- for a 30 day moving average
where date( date_add(tbl.date, interval tinyint_asc.value day ) ) between '2016-01-01' and current_date()
group by mydate
This query is fast:
select date, name_id,
case @i when name_id then @i:=name_id else (@i:=name_id) and (@n:=0) and (@a0:=0) and (@a1:=0) and (@a2:=0) and (@a3:=0) and (@a4:=0) and (@a5:=0) and (@a6:=0) and (@a7:=0) and (@a8:=0) end as a,
case @n when 9 then @n:=9 else @n:=@n+1 end as n,
@a0:=@a1,@a1:=@a2,@a2:=@a3,@a3:=@a4,@a4:=@a5,@a5:=@a6,@a6:=@a7,@a7:=@a8,@a8:=close,
(@a0+@a1+@a2+@a3+@a4+@a5+@a6+@a7+@a8)/@n as av
from tbl, (select @i:=0, @n:=0, @a0:=0, @a1:=0, @a2:=0, @a3:=0, @a4:=0, @a5:=0, @a6:=0, @a7:=0, @a8:=0) a
where name_id=2
order by name_id, date
If you need an average over 50 or 100 values, it's tedious to write, but worth the effort. The speed is close to that of the plain ordered select.
specific status on consecutive days
I have a MySQL table ATT with the columns EMP_ID, ATT_DATE and ATT_STATUS, where ATT_STATUS takes the values 1 (Present), 2 (Absent) and 3 (Weekly-off). I want to find the EMP_IDs that have status 2 on 10 consecutive days within a given date range. Please help.
Please try this:
SELECT EMP_ID FROM (
SELECT
IF((@prevDate!=(q.ATT_DATE - INTERVAL 1 DAY)) OR (@prevEmp!=q.EMP_ID) OR (q.ATT_STATUS != 2), @rownum:=@rownum+1, @rownum:=@rownum) AS rownumber,
@prevDate:=q.ATT_DATE, @prevEmp:=q.EMP_ID, q.*
FROM (
SELECT EMP_ID , ATT_DATE , ATT_STATUS
FROM org_tb_dailyattendance, (SELECT @rownum:=0, @prevDate:='', @prevEmp:=0) vars
WHERE ATT_DATE BETWEEN '2013-01-01' AND '2013-02-15'
ORDER BY EMP_ID, ATT_DATE, ATT_STATUS
) q
) sq
GROUP BY EMP_ID, rownumber
HAVING COUNT(*) >= 10
The logic is to first sort the table by employee id and date, then introduce a row number that increases only if the days are not consecutive, or the employee id differs from the previous one, or the status is not 2. I then group by this row number and check whether each group has at least 10 rows. Those groups should be the employees who were absent for 10 days or more in a row.
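The same streak logic can be sketched outside the database. The following is only an illustration in Python, not a replacement for the query: it assumes the attendance rows are already loaded as (emp_id, att_date, att_status) tuples, and the function name employees_absent_streak is made up.

```python
from datetime import date, timedelta


def employees_absent_streak(records, min_days=10, absent_status=2):
    """Return sorted employee ids with at least `min_days` consecutive absent days.

    records: iterable of (emp_id, att_date, att_status) with att_date a datetime.date.
    """
    flagged = set()
    prev_emp = prev_date = None
    streak = 0
    # Sort by employee, then date, mirroring ORDER BY EMP_ID, ATT_DATE.
    for emp_id, att_date, status in sorted(records, key=lambda r: (r[0], r[1])):
        if status == absent_status:
            continues = (
                streak > 0
                and emp_id == prev_emp
                and att_date == prev_date + timedelta(days=1)
            )
            # Extend the current streak, or start a new one of length 1.
            streak = streak + 1 if continues else 1
            if streak >= min_days:
                flagged.add(emp_id)
        else:
            # Any other status breaks the streak.
            streak = 0
        prev_emp, prev_date = emp_id, att_date
    return sorted(flagged)


def _days(emp_id, start, count, status):
    """Build `count` consecutive daily records for one employee (test helper)."""
    return [(emp_id, start + timedelta(days=i), status) for i in range(count)]


# Made-up sample: only employee 1 has 10 absent days in a row.
SAMPLE_ATT = (
    _days(1, date(2013, 1, 1), 10, 2)
    + _days(2, date(2013, 1, 1), 9, 2)
    + _days(2, date(2013, 1, 10), 1, 1)
    + _days(2, date(2013, 1, 11), 5, 2)
    + _days(3, date(2013, 1, 1), 5, 2)
    + _days(3, date(2013, 1, 8), 5, 2)  # two-day gap with no records
)
```

In the sample, employee 2 is interrupted by a present day and employee 3 by a gap in the dates, so only employee 1 is reported.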
Have you tried something like this?
SELECT EMP_ID, count(*) as absent_count, min(ATT_DATE)
FROM ATT
WHERE ATT_STATUS = 2
GROUP BY EMP_ID
HAVING absent_count >= 10
Note that this counts an employee's absences in total; it does not check that the days are consecutive.
least value in count
I have a table employee(id, dept_id, salary, hire_date, job_id) and have to write the following query: show all the employees who were hired on the day of the week on which the fewest employees were hired. I have written a query, but I am not able to get the least. Please check whether I am on the right track.
select id, WEEKDAY(hire_date)+1 as days, count(WEEKDAY(hire_date)+1) as count from test.employee group by days
This should get you the weekday on which the least number of employees were hired:
SELECT count(id) as `Total`, WEEKDAY(hire_date) as `DoW` FROM test.employee GROUP BY `DoW` ORDER BY `Total` ASC LIMIT 1;
select id from test.employee where WEEKDAY(hire_date) = ( select WEEKDAY(hire_date) from test.employee group by WEEKDAY(hire_date) order by count(id) asc limit 1 )
This should work.
You may try this, as it does not limit the result to one record when several weekdays share the same lowest number of hires. The following is based on sample data.
Query:
-- find minimum id count
SELECT MIN(e.counts) INTO @min
FROM (SELECT COUNT(*) as counts, WEEKDAY(hire_date)+1 as day FROM employee GROUP BY WEEKDAY(hire_date)+1) e ;
-- show weekdays with minimum id counts
SELECT e2.counts as mincount, WEEKDAY(e1.hire_date)+1 as weekday
FROM employee e1
JOIN (SELECT COUNT(id) as counts, WEEKDAY(hire_date)+1 as day FROM employee GROUP BY day HAVING COUNT(*) = @min) e2
ON WEEKDAY(e1.hire_date)+1 = e2.day;
Results:
MINCOUNT | WEEKDAY
1 | 6
1 | 3
1 | 4
1 | 2
select min(id), WEEKDAY(hire_date)+1 as days,count(WEEKDAY(hire_date)+1) as count from test.employee group by days
SELECT * FROM employee
WHERE DAYOFWEEK(hire_date) IN (
SELECT weekday FROM (
SELECT count(*) as bcount, DAYOFWEEK(hire_date) as weekday
FROM employee as a
GROUP BY weekday
HAVING bcount = (
SELECT MIN(tcount) FROM (
SELECT count(*) as tcount, DAYOFWEEK(hire_date) as weekday
FROM employee
GROUP BY weekday
) as t
)
) as q
)
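For comparison, the same "count per weekday, take the minimum, keep ties" idea can be sketched in a few lines of Python. This is only an illustration under assumptions: the employees are given as made-up (id, hire_date) pairs, the function name is hypothetical, and Python's date.weekday() is used, which numbers Monday as 0 like MySQL's WEEKDAY().

```python
from collections import Counter
from datetime import date

# Made-up sample: two Monday hires, two Tuesday hires,
# one Wednesday hire and one Thursday hire.
EMPLOYEES = [
    (1, date(2012, 1, 2)),   # Monday
    (2, date(2012, 1, 9)),   # Monday
    (3, date(2012, 1, 3)),   # Tuesday
    (4, date(2012, 1, 10)),  # Tuesday
    (5, date(2012, 1, 4)),   # Wednesday
    (6, date(2012, 1, 5)),   # Thursday
]


def hired_on_least_common_weekday(employees):
    """Return ids of employees hired on any weekday that has the fewest hires."""
    if not employees:
        return []
    # Count hires per weekday (0 = Monday ... 6 = Sunday).
    counts = Counter(hire_date.weekday() for _, hire_date in employees)
    fewest = min(counts.values())
    # Keep every weekday that ties for the minimum, not just one of them.
    rare_days = {day for day, n in counts.items() if n == fewest}
    return sorted(
        emp_id for emp_id, hire_date in employees
        if hire_date.weekday() in rare_days
    )
```

With the sample above, Wednesday and Thursday tie at one hire each, so employees 5 and 6 are returned.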