Counting messages per day (after 17:00 counts for next day) - mysql

I have two MySQL tables, stats and messages.
stats:
+------------+----------+
| _date      | msgcount |
+------------+----------+
| 2011-01-22 | 2        |
| 2011-01-23 | 4        |
| 2011-01-24 | 0        |
| 2011-01-25 | 1        |
+------------+----------+
messages:
+--------+------------+----------+---------+
| msg_id | _date      | time     | message |
+--------+------------+----------+---------+
| 1      | 2011-01-22 | 06:23:11 | foo bar |
| 2      | 2011-01-22 | 15:17:03 | baz     |
| 3      | 2011-01-22 | 17:05:45 | foobar  |
| 4      | 2011-01-22 | 23:58:13 | barbaz  |
| 5      | 2011-01-23 | 00:06:32 | foo foo |
| 6      | 2011-01-23 | 13:45:00 | bar foo |
| 7      | 2011-01-25 | 02:22:34 | baz baz |
+--------+------------+----------+---------+
I filled in stats.msgcount above to show the desired result; in reality it is still empty. I'm looking for a query that will:
count the number of messages for every stats._date (notice the zero msgcount on 2011-01-24)
treat messages.time as 24-hour time and count all messages AFTER 5 o'clock (17:00:00) for the next day (notice that msg_id 3 and 4 count for 2011-01-23)
update stats.msgcount to hold all counts
I'm especially concerned about the "later than 17:00:00 counts for the next day" part. Is this possible in (My)SQL?

You could use:
UPDATE stats LEFT JOIN
    ( SELECT date(addtime(_date,time) + interval 7 hour) as corrected_date,
             count(*) as message_count
      FROM messages
      GROUP BY corrected_date ) mc
    ON stats._date = mc.corrected_date
SET stats.msgcount = COALESCE( mc.message_count, 0 )
However, this query requires the dates you are interested in to already be in the stats table. If they are not, make _date a primary or unique key (if it isn't one already) and use:
INSERT IGNORE INTO stats(_date,msgcount)
SELECT date(addtime(_date,time) + interval 7 hour) as corrected_date,
       count(*) as message_count
FROM messages
GROUP BY corrected_date

Really, all you're doing is shifting the times by 7 hours. Something like this should work:
UPDATE stats s
SET msgcount = (SELECT COUNT(msg_id) FROM messages m
    -- combine each message's date and time, then keep the messages sent
    -- after 17:00 on the previous day and up to 17:00 on s._date
    WHERE DATE_ADD(m._date, INTERVAL TIME_TO_SEC(m.time) SECOND) > DATE_SUB(s._date, INTERVAL 7 HOUR)
      AND DATE_ADD(m._date, INTERVAL TIME_TO_SEC(m.time) SECOND) <= DATE_ADD(s._date, INTERVAL 17 HOUR));
The basic idea is that it takes each date in your stats table, builds a window running from 7 hours before midnight to 17 hours after it, and counts the messages sent in that range. If you used a DATETIME column instead of separate DATE and TIME columns, you wouldn't need the extra DATE_ADD(..., TIME_TO_SEC) stuff.
There may be a better way to add a date and a time; I didn't see one with a quick look at the MySQL reference documents.
So all you'd need to do is insert a new row in the stats table with a 0 for the msgcount, and run the update command. If you only wanted to update a few days (since the message count probably isn't changing 6 days later), you just need a simple WHERE clause on the update:
UPDATE stats s
SET ...
WHERE s._date BETWEEN '2012-04-03' AND '2012-04-08'
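Both answers rest on the same idea: add 7 hours to each message's timestamp and keep only the date. As a quick sanity check outside the database, here is a minimal Python sketch that applies that shift to the sample rows from the question. The names MESSAGES and corrected_date are made up for this illustration; it only checks the arithmetic and is not a replacement for the UPDATE.

```python
from collections import Counter
from datetime import datetime, timedelta

# Sample rows from the messages table in the question: (msg_id, _date, time).
MESSAGES = [
    (1, "2011-01-22", "06:23:11"),
    (2, "2011-01-22", "15:17:03"),
    (3, "2011-01-22", "17:05:45"),
    (4, "2011-01-22", "23:58:13"),
    (5, "2011-01-23", "00:06:32"),
    (6, "2011-01-23", "13:45:00"),
    (7, "2011-01-25", "02:22:34"),
]


def corrected_date(day, clock):
    """Shift the message timestamp forward 7 hours and keep only the date."""
    stamp = datetime.strptime(f"{day} {clock}", "%Y-%m-%d %H:%M:%S")
    return (stamp + timedelta(hours=7)).date().isoformat()


# Equivalent of GROUP BY corrected_date with COUNT(*).
counts = Counter(corrected_date(day, clock) for _, day, clock in MESSAGES)

# Dates without messages fall back to 0, like COALESCE in the UPDATE.
for day in ("2011-01-22", "2011-01-23", "2011-01-24", "2011-01-25"):
    print(day, counts.get(day, 0))
```

For the sample data this prints counts of 2, 4, 0 and 1, matching the msgcount column in the question. Note that a plain 7-hour shift also moves a message stamped exactly 17:00:00 to the next day.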

How to select rows with the latest date and calculate another field based on the row

I have two tables, vehicle and vehicle_maintenance.
vehicle
--------------------------
| v_id | v_name | v_no   |
--------------------------
| 1    | car1   | car123 |
| 2    | car2   | car456 |
--------------------------
vehicle_maintenance
-----------------------------------------------------
| v_main_id | v_id | v_main_date | v_main_remainder |
-----------------------------------------------------
| 1         | 1    | 2020/10/10  | 1                |
| 2         | 1    | 2020/10/20  | 2                |
| 3         | 2    | 2020/10/04  | 365              |
| 4         | 2    | 2020/10/15  | 5                |
-----------------------------------------------------
I want to get each car's latest maintenance details. For example, car2's latest maintenance date is 2020/10/15, and I want to work out the next maintenance date from the v_main_remainder field: adding 5 days to the maintenance date gives 2020/10/20. I also want to calculate the number of days left until the next maintenance date. If today is 2020/10/10, it should show 10 days left.
Here is my query:
SELECT
v.v_id,
v.v_name,
v.v_no,
max(vm.v_main_date) as renewal_date,
datediff(
DATE_ADD(
max(vm.v_main_date), INTERVAL +vm.v_main_remainder day
),
now()
) as day_left
FROM vehicle as v, vehicle_maintenance as vm
GROUP BY v.v_id
But the problem is that vm.v_main_remainder inside DATE_ADD is taken from the first row rather than from the row with the latest date.
Here is the result:
-----------------------------------------------------------------------
| v_id | v_name | v_no | renewal_date | day_left |
-----------------------------------------------------------------------
| 1 | car1 | car123 | 2020/10/20 | 11 |
-----------------------------------------------------------------------
| 2 | car2 | car456 | 2020/10/15 | 370 |
-----------------------------------------------------------------------
As a starter, your query is obviously missing a join condition between the two tables, so that's a cartesian product. This type of problem is much easier to spot when using explicit joins.
Then: you want to filter on the latest maintenance record per car, so aggregation is not appropriate.
One option uses window functions, available in MySQL 8.0:
select v.v_id, v.v_name, v.v_no, vm.v_main_date as renewal_date,
    datediff(vm.v_main_date + interval vm.v_main_remainder day, current_date) as day_left
from vehicle as v
inner join (
    -- number each vehicle's maintenance rows, newest first
    select vm.*, row_number() over(partition by v_id order by v_main_date desc) rn
    from vehicle_maintenance vm
) as vm on vm.v_id = v.v_id
where vm.rn = 1
Note that I changed now() to current_date, so datediff() works consistently on dates rather than datetimes.
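To make the "latest row per vehicle, then compute from that same row" idea concrete, here is a small Python sketch over the sample data. It is only an illustration of the logic, with today fixed at 2020/10/10 as in the question; MAINTENANCE and days_left are hypothetical names, not part of the schema.

```python
from datetime import date, timedelta

# (v_main_id, v_id, v_main_date, v_main_remainder) from the question.
MAINTENANCE = [
    (1, 1, date(2020, 10, 10), 1),
    (2, 1, date(2020, 10, 20), 2),
    (3, 2, date(2020, 10, 4), 365),
    (4, 2, date(2020, 10, 15), 5),
]


def days_left(rows, today):
    """Return {v_id: (renewal_date, day_left)} from each vehicle's latest row."""
    latest = {}
    for _, v_id, main_date, remainder in rows:
        # Keep date and remainder together so both come from the same row
        # (the row that gets rn = 1 in the window-function query).
        if v_id not in latest or main_date > latest[v_id][0]:
            latest[v_id] = (main_date, remainder)
    return {
        v_id: (main_date, (main_date + timedelta(days=remainder) - today).days)
        for v_id, (main_date, remainder) in latest.items()
    }


result = days_left(MAINTENANCE, today=date(2020, 10, 10))
print(result)
```

With these inputs car2 comes out at 10 days left, the figure the question expects, and car1 at 12 (2020/10/20 plus 2 days).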

SQL calculate timediff between intervals including a time from a separate table

I have 2 different tables called observations and intervals.
observations:
id | type      | start
------------------------------------
1  | classroom | 2017-06-07 16:18:40
2  | classroom | 2017-06-01 15:12:00
intervals:
+----+----------------+--------+------+---------------------+
| id | observation_id | number | task | time |
+----+----------------+--------+------+---------------------+
| 1 | 1 | 1 | 1 | 07/06/2017 16:18:48 |
| 2 | 1 | 2 | 0 | 07/06/2017 16:18:55 |
| 3 | 1 | 3 | 1 | 07/06/2017 16:19:00 |
| 4 | 2 | 1 | 3 | 01/06/2017 15:12:10 |
| 5 | 2 | 2 | 1 | 01/06/2017 15:12:15 |
+----+----------------+--------+------+---------------------+
I want a view that will display:
observation_id | time_on_task (total time in seconds where task = 1)
1 | 13
2 | 5
So I must first check whether the first interval has task = 1. If it does, I must take the difference between that interval's time and the start from the observations table and add it to the total time. From then on, whenever task = 1, I just add the difference between the current interval's time and the previous interval's time.
I know I can use:
select observation_id, TIME_TO_SEC(TIMEDIFF(max(time),min(time)))
from your_table
group by observation_id
to find the total time in the intervals table between all intervals outside of the first one.
But
1. I need to only include interval times where task = 1. (The endtime for the interval is the one listed)
2. Need the timediff between the first interval and initial start (from observations table) if number = 1
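I'm still new to the Stack Overflow community, but you could try the SQL LAG() function (in MySQL it is available from version 8.0). For instance, fetch the previous timestamp in an inner query and take the difference in an outer SELECT statement (the table and column names here are placeholders):
SELECT Col1, Col2, TIMESTAMPDIFF(SECOND, t.prevtime, t.Created_Datetime) AS Difference
FROM ( SELECT Col1, Col2, Created_Datetime,
              LAG(Created_Datetime) OVER (ORDER BY Created_Datetime) AS prevtime
       FROM MyTable
       WHERE SomeCondition ) AS t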
Sorry if it looks goofy, still trying to learn to format code here.
https://explainextended.com/2009/03/12/analytic-functions-optimizing-lag-lead-first_value-last_value/
Hope it helps
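To show how the "previous row" idea maps onto the question's data, here is a small Python sketch. For each interval it takes the previous interval's time, falling back to the observation's start for the first one, and adds the gap only when task = 1. It assumes the interval timestamps are day/month/year and rewrites them in ISO form; STARTS, INTERVALS and time_on_task are names invented for this illustration.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S"

# observations: id -> start
STARTS = {1: "2017-06-07 16:18:40", 2: "2017-06-01 15:12:00"}

# intervals: (observation_id, number, task, time), dates rewritten as ISO.
INTERVALS = [
    (1, 1, 1, "2017-06-07 16:18:48"),
    (1, 2, 0, "2017-06-07 16:18:55"),
    (1, 3, 1, "2017-06-07 16:19:00"),
    (2, 1, 3, "2017-06-01 15:12:10"),
    (2, 2, 1, "2017-06-01 15:12:15"),
]


def time_on_task(starts, intervals):
    """Sum, per observation, the seconds of every interval whose task is 1."""
    totals = {obs_id: 0 for obs_id in starts}
    # The "lag" value for the first interval is the observation's start.
    previous = {obs_id: datetime.strptime(s, FMT) for obs_id, s in starts.items()}
    # Sorting the tuples orders them by observation_id, then by number.
    for obs_id, _number, task, stamp in sorted(intervals):
        current = datetime.strptime(stamp, FMT)
        if task == 1:
            totals[obs_id] += int((current - previous[obs_id]).total_seconds())
        previous[obs_id] = current
    return totals


print(time_on_task(STARTS, INTERVALS))
```

This yields 13 seconds for observation 1 and 5 seconds for observation 2, the values requested in the question.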

MySQL query based on time range, group users, and sum values over a sliding window

I want to create a new Table B based on the information from another existing Table A. I'm wondering if MySQL has the functionality to take into account a range of time and group column A values then only sum up the values in a column B based on those groups in column A.
Table A stores logs of events like a journal for users. There can be multiple events from a single user in a single day. Say hypothetically I'm keeping track of when my users eat fruit and I want to know how many fruit they eat in a week (7days) and also how many apples they eat.
So in Table B I want to count for each entry in Table A, the previous 7 day total # of fruit and apples.
EDIT:
I'm sorry, I oversimplified the information I gave and didn't think my example through.
I initially have only Table A. I'm trying to create Table B from a query.
Assume:
User/id can log an entry multiple times in a day.
sum counts should be for id between date and date - 7 days
fruit column stands for the total # of fruit during the 7 day interval ( apples and bananas are both fruit)
The data doesn't start only at 2013-9-5. It can date back to 2000, and I want to apply the 7-day sliding window over all dates from 2000 to 2013.
The sum count is over a sliding window of 7 days
Here's an example:
Table A:
| id | date-time | apples | banana |
---------------------------------------------
| 1 | 2013-9-5 08:00:00 | 1 | 1 |
| 2 | 2013-9-5 09:00:00 | 1 | 0 |
| 1 | 2013-9-5 16:00:00 | 1 | 0 |
| 1 | 2013-9-6 08:00:00 | 0 | 1 |
| 2 | 2013-9-9 08:00:00 | 1 | 1 |
| 1 | 2013-9-11 08:00:00 | 0 | 1 |
| 1 | 2013-9-12 08:00:00 | 0 | 1 |
| 2 | 2013-9-13 08:00:00 | 1 | 1 |
note: user 1 logged 2 entries on 2013-9-5
The result after the query should be Table B.
Table B
| id | date-time | apples | fruit |
--------------------------------------------
| 1 | 2013-9-5 08:00:00 | 1 | 2 |
| 2 | 2013-9-5 09:00:00 | 1 | 1 |
| 1 | 2013-9-5 16:00:00 | 2 | 3 |
| 1 | 2013-9-6 08:00:00 | 2 | 4 |
| 2 | 2013-9-9 08:00:00 | 2 | 3 |
| 1 | 2013-9-11 08:00:00 | 2 | 5 |
| 1 | 2013-9-12 08:00:00 | 0 | 3 |
| 2 | 2013-9-13 08:00:00 | 2 | 4 |
At 2013-9-12 the sliding window moves and only includes 9-6 to 9-12. That's why id 1 goes from a sum of 2 apples to 0 apples.
You need years in your data to be able to use date arithmetic correctly. I added them.
There's an odd thing in your data. You seem to have multiple log entries for each person for each day. You're assuming an implicit order setting the later log entries somehow "after" the earlier ones. If SQL and MySQL do that, it's only by accident: there's no implicit ordering of rows in a table. Plus if we duplicate date/id combinations, the self join (read on) has lots of duplicate rows and ruins the sums.
So we need to start by creating a daily summary table of your data, like so:
select id, `date`, sum(apples) as apples, sum(banana) as banana
from fruit
group by id, `date`
This summary will contain at most one row per id per day.
Next we need to do a limited cross product self-join, so we get seven days' worth of fruit eating.
select --whatever--
from (
-- summary query --
) as a
join (
-- same summary query once again
) as b
on ( a.id = b.id
and b.`date` between a.`date` - interval 6 day AND a.`date` )
The between clause in the on gives us the seven days (today, and the six days prior). Notice that the table in the join with the alias b is the seven day stuff, and the a table is the today stuff.
Finally, we have to summarize that result according to your specification. The resulting query is this.
select a.id, a.`date`,
sum(b.apples) + sum(b.banana) as fruit_last_week,
a.apples as apple_today
from (
select id, `date`, sum(apples) as apples, sum(banana) as banana
from fruit
group by id, `date`
) as a
join (
select id, `date`, sum(apples) as apples, sum(banana) as banana
from fruit
group by id, `date`
) as b on (a.id = b.id and
b.`date` between a.`date` - interval 6 day AND a.`date` )
group by a.id, a.`date`, a.apples
order by a.`date`, a.id
Here's a fiddle: http://sqlfiddle.com/#!2/670b2/15/0
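The two steps of this answer (daily summary, then a self-join limited to a seven-day window) can be traced by hand with a short Python sketch over the question's Table A. This is only an illustration of the logic under the same assumption as the answer, namely one summary row per id per day with the time of day dropped; LOG and weekly_fruit are hypothetical names.

```python
from collections import defaultdict
from datetime import date, timedelta

# Table A from the question: (id, date, apples, banana); time of day dropped.
LOG = [
    (1, date(2013, 9, 5), 1, 1),
    (2, date(2013, 9, 5), 1, 0),
    (1, date(2013, 9, 5), 1, 0),
    (1, date(2013, 9, 6), 0, 1),
    (2, date(2013, 9, 9), 1, 1),
    (1, date(2013, 9, 11), 0, 1),
    (1, date(2013, 9, 12), 0, 1),
    (2, date(2013, 9, 13), 1, 1),
]


def weekly_fruit(log):
    """Return (id, date, fruit_last_week, apple_today) rows, by date then id."""
    # Step 1: daily summary, one entry per (id, date).
    daily = defaultdict(lambda: [0, 0])
    for user, day, apples, banana in log:
        daily[(user, day)][0] += apples
        daily[(user, day)][1] += banana

    # Step 2: for each summary row "a", add up the rows "b" with the same id
    # whose date lies between a.date - 6 days and a.date.
    rows = []
    for (user, day), (apples_today, _) in daily.items():
        fruit = sum(
            a + b
            for (other, other_day), (a, b) in daily.items()
            if other == user and day - timedelta(days=6) <= other_day <= day
        )
        rows.append((user, day, fruit, apples_today))
    return sorted(rows, key=lambda r: (r[1], r[0]))


for row in weekly_fruit(LOG):
    print(row)
```

For id 1 on 2013-9-12 the window covers only 9-6 to 9-12, so the fruit total drops from 5 to 3, which is the sliding behaviour described in the question.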
Assumptions:
one row per id/date
the counts should be for id between date and date - 7 days
"fruit" = "banana"
the "date" column is actually a date (including year) and not just month/day
then this SQL should do the trick:
INSERT INTO B
-- column order follows Table B: id, date, apples, fruit
SELECT a1.id, a1.date, SUM( a2.apples ), SUM( a2.banana )
FROM (SELECT DISTINCT id, date
      FROM A
      WHERE date > NOW() - INTERVAL 7 DAY
     ) a1
JOIN A a2
  ON a2.id = a1.id
 AND a2.date <= a1.date
 AND a2.date >= a1.date - INTERVAL 7 DAY
GROUP BY a1.id, a1.date
Some questions:
Are the above assumptions correct?
Does table A contain more fruits than just Bananas and Apples? If so, what does the real structure look like?

Only show if date is less than 14 days ago

I have three tables: notices, notices_read and companies. notices contains a list of notices for clients, which are displayed in a web app. notices_read records that a client has clicked and read a notice, so that it is not shown again. companies holds the company info, including the join date. Additionally, I only want a notice to be shown to clients who joined more than 14 days ago.
Everything works bar the 14-days-ago part: if I remove that line, the notice shows correctly depending on whether there is a value in notices_read, but if I add the date line then, whilst there is no error, nothing is returned.
companies
+-----------------+
| id | datestamp |
+-----------------+
| 1 | 2012-12-20 |
| 2 | 2012-12-20 |
| 3 | 2012-11-20 |
| 4 | 2012-11-20 |
+-----------------+
notices_read
+-----------------------------+
| id | company_id | notice_id |
+-----------------------------+
| 1 | 3 | 1 |
+-----------------------------+
notices
+----------------------+
| id | title | active |
+----------------------+
| 1 | title1 | 1 |
| 2 | title2 | 0 |
+----------------------+
Notice 2 should never show as it is not set to active
Notice 1 should not show to company 1 or 2 as they are not 14 days old
Notice 1 should not show to company 3 as it has already been read
Notice 1 should show to company 4 as it has not been read and company 4 is older than 14 days
Here is my query:
Select
notices.description,
notices.id,
notices.title,
notices_read.company_id,
companies.datestamp
From
notices Left Join
notices_read On notices.id = notices_read.dismiss_id Left Join
companies On notices_read.company_id = companies.id
Where
notices.active = 1 And
companies.datestamp <= DATE_SUB(SYSDATE(), Interval 14 Day) And
(notices_read.company_id Is Null Or notices_read.company_id != '$company_id')
If I understood your problem correctly, you only need to use DATE_SUB
DATE_SUB(SYSDATE(), Interval 14 Day)
The full query would be:
Select
notices.description,
notices.id,
notices.title,
notices_read.company_id,
companies.datestamp
From
notices_read Left Join
notices On notices_read.dismiss_id = notices.id Left Join
companies On notices_read.company_id = companies.id
Where
notices.active = 1 And
companies.datestamp <= DATE_SUB(SYSDATE(), Interval 14 Day) And
(notices_read.company_id Is Null Or notices_read.company_id != '$company_id')
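The four rules listed in the question can also be written as a plain predicate, which gives a reference result to compare any version of the query against. The following Python sketch does that for the sample tables. It assumes, purely for illustration, that today is 2012-12-25; COMPANIES, NOTICES, NOTICES_READ and visible_notices are hypothetical names.

```python
from datetime import date, timedelta

# Sample data from the question.
COMPANIES = {
    1: date(2012, 12, 20),
    2: date(2012, 12, 20),
    3: date(2012, 11, 20),
    4: date(2012, 11, 20),
}
NOTICES = {
    1: {"title": "title1", "active": 1},
    2: {"title": "title2", "active": 0},
}
NOTICES_READ = {(3, 1)}  # (company_id, notice_id) pairs


def visible_notices(company_id, today):
    """Return the ids of the notices that should be shown to one company."""
    # Companies that joined less than 14 days ago see nothing.
    if COMPANIES[company_id] > today - timedelta(days=14):
        return []
    # Otherwise show every active notice this company has not read yet.
    return [
        notice_id
        for notice_id, notice in sorted(NOTICES.items())
        if notice["active"] == 1 and (company_id, notice_id) not in NOTICES_READ
    ]


today = date(2012, 12, 25)  # assumed date, not given in the question
for company_id in sorted(COMPANIES):
    print(company_id, visible_notices(company_id, today))
```

Under that assumed date only company 4 gets notice 1, which is the outcome the question describes.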

MySQL - query help select results with latest timestamp grouped by date

I have a table containing the following fields:
date, time, node, result
describing some numeric result for different nodes at different dates and times throughout each day. Typical listing will look something like this:
date | time | node | result
----------------------------------
2011-03-01 | 10:02 | A | 10
2011-03-01 | 11:02 | A | 20
2011-03-02 | 03:13 | A | 23
2011-03-02 | 12:15 | A | 18
2011-03-02 | 13:15 | A | 8
2011-03-01 | 13:12 | B | 2
2011-03-01 | 14:26 | B | 1
2011-03-02 | 08:00 | B | 6
2011-03-02 | 07:22 | B | 3
2011-03-02 | 21:19 | B | 4
I want to form a query that'll get the last result from each day for each node, such that I'd get something like this:
date | time | node | latest
-----------------------------------
2011-03-01 | 11:02 | A | 20
2011-03-01 | 14:26 | B | 1
2011-03-02 | 13:15 | A | 8
2011-03-02 | 21:19 | B | 4
I thought about doing a group by date, node, but then extracting the last value was a mess (I used group_concat( result order by time ) and used SUBSTRING() to get the last value. Baah, I know). Is there a simple way to do this in mysql?
I'm pretty sure I saw a similar request solved very nicely without an INNER JOIN, but I can't find it right now (and it might have been for SQL Server). The following should work nevertheless.
SELECT n.*
FROM Nodes n
INNER JOIN (
SELECT MAX(time) AS Time
, Date
, Node
FROM Nodes
GROUP BY
Date
, Node
) nm ON nm.time = n.time
AND nm.Date = n.Date
AND nm.Node = n.Node
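The query works in two passes: the inner SELECT finds the latest time per (date, node), and the join keeps only the rows whose time equals that maximum. A minimal Python sketch of the same two passes over the question's sample rows follows. ROWS and latest_per_day are names made up for this illustration, and the string comparison relies on the times being zero-padded HH:MM.

```python
# (date, time, node, result) rows from the question.
ROWS = [
    ("2011-03-01", "10:02", "A", 10),
    ("2011-03-01", "11:02", "A", 20),
    ("2011-03-02", "03:13", "A", 23),
    ("2011-03-02", "12:15", "A", 18),
    ("2011-03-02", "13:15", "A", 8),
    ("2011-03-01", "13:12", "B", 2),
    ("2011-03-01", "14:26", "B", 1),
    ("2011-03-02", "08:00", "B", 6),
    ("2011-03-02", "07:22", "B", 3),
    ("2011-03-02", "21:19", "B", 4),
]


def latest_per_day(rows):
    """Return the row with the latest time for each (date, node) pair."""
    # Pass 1 (the inner query): MAX(time) per (date, node).
    max_time = {}
    for day, clock, node, _ in rows:
        key = (day, node)
        if key not in max_time or clock > max_time[key]:
            max_time[key] = clock
    # Pass 2 (the join): keep rows whose time equals their group's maximum.
    return sorted(r for r in rows if r[1] == max_time[(r[0], r[2])])


for row in latest_per_day(ROWS):
    print(row)
```

The four rows printed are the ones listed as the desired output in the question.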
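I would think that you would have to use something like the MAX() function. Sorry, I don't have MySQL, so I can't test, but I would think something like this:
select t.date, t.node, Max(t.time) as time from Nodes t group by t.node, t.date
The aggregate function returns only one row per grouping. Note that result is not selected here: it is not part of the GROUP BY, so its value would not necessarily come from the row with the latest time. To get it, join this back to the table as in the query above.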