MySQL GROUP BY DIV result not consistent

I'm trying to group data into 1-day intervals using GROUP BY ... DIV, as described in this post:
Grouping into interval of 5 minutes within a time range
It looks fine at first glance.
But I noticed an inconsistency when comparing queries over two different (but overlapping) date ranges.
The first uses the range Feb 01 00:00 to Feb 26 00:00;
the second uses the range Feb 20 00:00 to Feb 26 00:00.
The values for Feb 20 differ between the two queries, while the rest (Feb 21 - 25) match.
Any idea what's going on and how to fix it?
Update:
Here's the stored procedure that generates one row of dummy data per minute for February:
DELIMITER $$
CREATE DEFINER=`root`@`127.0.0.1` PROCEDURE `testdata`()
BEGIN
DECLARE gap int;
DECLARE x bigint;
SET gap = 60000;
SET x = 1454265000000;
CREATE TABLE IF NOT EXISTS testdata (
timestamp bigint(20) default NULL,
value int(20) default NULL
)
ENGINE=MyISAM DEFAULT CHARSET=utf8;
WHILE x <= 1456770599000 DO
INSERT INTO testdata(timestamp, value) VALUES (x, FLOOR(RAND() * (270 + 1)) + 30);
SET x = x + gap;
END WHILE;
select x;
END$$
DELIMITER ;
And here are the 2 queries that compare the 2 intervals:
select from_unixtime(timestamp / 1000), count(value) from testdata where timestamp >= 1454265000000 and timestamp <= 1456770599000 group by timestamp div 86400000;
select from_unixtime(timestamp / 1000), count(value) from testdata where timestamp >= 1455906600000 and timestamp <= 1456770599000 group by timestamp div 86400000;
The first query returns 1440 for 2016-02-20. The second query returns two rows for 2016-02-20: 330 at 2016-02-20 00:00:00 and 1440 at 2016-02-20 05:30:00.

The duplication occurs because your server's time zone isn't UTC. Unix timestamps count from midnight UTC, so timestamp DIV 86400000 groups by UTC date, but FROM_UNIXTIME() returns a time in the session time zone. Because FROM_UNIXTIME(timestamp / 1000) is not part of the GROUP BY, MySQL picks an arbitrary row from each group, and that row's date in the server's time zone may differ from its UTC date. As a result, two different UTC date groups can show the same local date.
What you should do is select the start of each UTC day, computed from the grouping expression itself, so you're displaying the same date that you're grouping by.
SELECT FROM_UNIXTIME((TIMESTAMP DIV 86400000) * 86400), COUNT(*)
FROM testdata
WHERE timestamp BETWEEN 1455906600000 and 1456770599000
GROUP BY TIMESTAMP DIV 86400000
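To see where the 330 comes from, the grouping can be replayed outside MySQL. The following is a minimal Python sketch, not part of the original answer. It assumes the server runs at UTC+05:30 (inferred from the 05:30:00 in the question's output); the names DAY_MS, SERVER_TZ, and bucket_counts are made up for illustration.

```python
from collections import Counter
from datetime import datetime, timedelta, timezone

DAY_MS = 86_400_000
# Assumption: the server's time zone is UTC+05:30.
SERVER_TZ = timezone(timedelta(hours=5, minutes=30))


def bucket_counts(start_ms, end_ms, step_ms=60_000):
    """Count one-row-per-minute samples per UTC day,
    mirroring GROUP BY timestamp DIV 86400000."""
    return Counter(ts // DAY_MS for ts in range(start_ms, end_ms + 1, step_ms))


start_ms = 1455906600000  # lower bound of the second query
counts = bucket_counts(start_ms, 1456770599000)
first_bucket = min(counts)

# The first bucket is only the tail end of a UTC day, so it is partly filled.
print(counts[first_bucket], counts[first_bucket + 1])

# Start of that bucket in UTC versus its first row rendered in server time.
print(datetime.fromtimestamp(first_bucket * 86_400, tz=timezone.utc))
print(datetime.fromtimestamp(start_ms // 1000, tz=SERVER_TZ))
```

Under these assumptions the first bucket holds 330 rows that belong to 2016-02-19 in UTC but are displayed as 2016-02-20 in server time, which is the extra row seen in the second query.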

Fetch distinct available dates from table

I have a data table where data is present for every date (for 50 branches) except Saturdays and Sundays. For each of several given dates, I am trying to get the latest date present in the table on or before it.
select distinct BW_DATE from backdated_weekly where BW_DATE <='2021-09-30' or BW_DATE<='2021-09-26' order by BW_DATE desc;
2021-09-30 is present in the table, but 2021-09-26 (a Sunday) is not. The output I am trying to get is
2021-09-30
2021-09-24
But the above query returns every date on or before the given dates.
If the table is known to contain every Monday-Friday date without gaps, simply select the maximum date up to each given date:
SELECT MAX(BW_DATE)
FROM backdated_weekly
WHERE BW_DATE <= '2021-09-30'
UNION
SELECT MAX(BW_DATE)
FROM backdated_weekly
WHERE BW_DATE <= '2021-09-26'
We can also calculate the most recent Monday-Friday date on or before a given date directly, without any table.
WEEKDAY() is the function to use. From the documentation:
Returns the weekday index for date (0 = Monday, 1 = Tuesday, … 6 = Sunday).
SELECT CASE WHEN WEEKDAY('2021-09-30') IN ( 5, 6 ) THEN DATE('2021-09-30') - INTERVAL WEEKDAY('2021-09-30') - 4 DAY ELSE DATE('2021-09-30') END
UNION
SELECT CASE WHEN WEEKDAY('2021-09-26') IN ( 5, 6 ) THEN DATE('2021-09-26') - INTERVAL WEEKDAY('2021-09-26') - 4 DAY ELSE DATE('2021-09-26') END
https://dbfiddle.uk/?rdbms=mysql_8.0&fiddle=477775c0eddbfa733e60bc629a8a68d4
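The CASE expression above can be checked with a few lines of Python, whose date.weekday() uses the same numbering as MySQL's WEEKDAY() (0 = Monday ... 6 = Sunday). This is only an illustrative sketch; the function name last_weekday_on_or_before is hypothetical.

```python
from datetime import date, timedelta


def last_weekday_on_or_before(d):
    """Return d itself for Monday-Friday, else the preceding Friday."""
    wd = d.weekday()  # 0 = Monday ... 6 = Sunday, like MySQL WEEKDAY()
    if wd in (5, 6):
        # Saturday steps back 1 day, Sunday steps back 2 days.
        return d - timedelta(days=wd - 4)
    return d


for given in (date(2021, 9, 30), date(2021, 9, 26)):
    print(given, "->", last_weekday_on_or_before(given))
```

Note that this, like the SQL version, only skips weekends; it does not know about dates that are missing from the table for other reasons.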

MySQL query or procedure to return table from values computed over multiple rows

I have a network-generated MySQL table with the following fields:
IP_SRC, IP_DST, BYTES_IN, BYTES_OUT, START_TIME, STOP_TIME
1.1.1.1 8.8.8.8 1080 540 1580684018 1580684100
8.8.4.4 1.1.1.1 2000 4000 1580597618 1580597800
The TIME values are epoch time ticks and each record is basically one TCP session.
I would like to formulate a query (or procedure) to return a table with the following fields:
DayOfMonth, TotalOutBytes
1 12345
2 83747
3 2389
where DayOfMonth is the last 14 days or a range of last "n" days (or to keep the focus on the main problem assume the values are 1, 2, 3 of Feb 2020). The challenge is to grab all rows from the network table where STOP_TIME falls within the timeticks for DayOfMonth value for that row and sum up the BYTES_OUT to report as TotalOutBytes for that date.
I'm afraid I'm somewhat new to MySQL and hopelessly lost.
Consider:
select
day(from_unixtime(stop_time)) dayOfMonth,
sum(bytes_out) TotalOutBytes
from mytable
where stop_time >= unix_timestamp(current_date - interval 14 day)
group by day(from_unixtime(stop_time))
Rationale:
the where clause uses unix_timestamp() to generate the Unix timestamp corresponding to 14 days before the current date, which we can use to filter the table
from_unixtime() turns an epoch timestamp into a datetime value, then day() gives you the day number for that point in time
you can then aggregate by that value and sum bytes_out per group
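The same grouping logic can be sketched in Python using the two sample rows from the question. This is an illustration only: it interprets the epoch values in UTC, whereas MySQL's from_unixtime() uses the session time zone, and the name bytes_out_per_day is hypothetical.

```python
from collections import defaultdict
from datetime import datetime, timezone

# (START_TIME, STOP_TIME, BYTES_OUT) taken from the sample rows in the question.
rows = [
    (1580684018, 1580684100, 540),
    (1580597618, 1580597800, 4000),
]


def bytes_out_per_day(rows):
    """Sum BYTES_OUT per day of month of STOP_TIME (assumption: UTC)."""
    totals = defaultdict(int)
    for _start, stop, bytes_out in rows:
        day = datetime.fromtimestamp(stop, tz=timezone.utc).day
        totals[day] += bytes_out
    return dict(totals)


print(bytes_out_per_day(rows))
```

Grouping by day of month alone is only unambiguous while the selected range is shorter than a month, which holds for the 14-day window here.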

What's wrong with this simple MySQL syntax? Summing the dates

Among the rest, I've got three columns in my table:
start - timestamp, the default value is CURRENT_TIMESTAMP
duration - datetime, usually 0000-00-07 00:00:00 (one week)
end - timestamp, the default value is 0000-00-00 00:00:00
Here's what I do:
UPDATE `banners` SET `end` = `start` + `duration` WHERE `id` = 93
No errors appear, the id is exact - but the operation doesn't execute, the end field just remains at zeros.
What's wrong? Any quotes, brackets needed? I also tried making the middle field the timestamp type as well with no result.
Very possible, just a little ugly in terms of code...
UPDATE `banners`
SET `end` = FROM_UNIXTIME(UNIX_TIMESTAMP(`start`) + (UNIX_TIMESTAMP(`duration`) - UNIX_TIMESTAMP('1970-01-01 00:00:00')),'%Y-%m-%d %H:%i:%s')
WHERE `id` = 93
...you just need to convert everything to seconds, add the duration to the start, and then convert back to a datetime string for setting :)
You cannot add DATETIME values the same way you add numbers. What's the meaning of April 25, 2016 added to January 5, 2016?
You should store your durations using the smallest time unit that can be used to represent them as integer numbers and use the MySQL DATE_ADD() function instead of the addition.
For example, if duration is 1 WEEK then you can use any of:
UPDATE `banners` SET `end` = DATE_ADD(`start`, INTERVAL 1 WEEK) WHERE `id` = 93
UPDATE `banners` SET `end` = DATE_ADD(`start`, INTERVAL 7 DAY) WHERE `id` = 93
UPDATE `banners` SET `end` = DATE_ADD(`start`, INTERVAL 168 HOUR) WHERE `id` = 93
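The distinction between a point in time and a duration is the same one most languages make. As a rough analogy in Python (not MySQL code, and the start value below is made up), a datetime can only be added to a timedelta, never to another datetime:

```python
from datetime import datetime, timedelta

start = datetime(2016, 4, 25, 12, 0, 0)  # hypothetical `start` value
duration = timedelta(weeks=1)            # a duration, not a calendar date

end = start + duration  # point in time + duration = point in time
print(end)

# The three INTERVAL spellings above describe the same duration.
print(timedelta(weeks=1) == timedelta(days=7) == timedelta(hours=168))
```

Trying start + start in Python raises a TypeError, which mirrors why adding two DATETIME values has no sensible meaning.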
If the duration is usually 1 week, you can use MySQL's DATE_ADD() function:
DATE_ADD(start,INTERVAL 7 DAY)
Hope that helps

Number of e.g. Mondays left in month

How do you most easily calculate how many e.g. Mondays are left in a month using MySQL (counting today)?
Bonus points for a solution that solves it for all days of the week in one query.
Desired output (run on Tuesday August 17th 2010):
dayOfWeek left
1 2 -- Sunday
2 2 -- Monday
3 3 -- Tuesday (yep, including today)
4 2 -- Wednesday
5 2 -- Thursday
6 2 -- Friday
7 2 -- Saturday
Create a date table that contains one row for each day that you care about (say Jan 1 2000 - Dec 31 2099):
create table dates (the_date date primary key);
delimiter $$
create procedure populate_dates (p_start_date date, p_end_date date)
begin
declare v_date date;
set v_date = p_start_date;
while v_date <= p_end_date
do
insert ignore into dates (the_date) values (v_date);
set v_date = date_add(v_date, interval 1 day);
end while;
end $$
delimiter ;
call populate_dates('2000-01-01','2099-12-31');
Then you can run a query like this to get your desired output:
set @date = curdate();
select dayofweek(the_date) as dayOfWeek, count(*) as numLeft
from dates
where the_date >= @date
and the_date < str_to_date(period_add(date_format(@date,'%Y%m'),1),'%Y%m')
group by dayofweek(the_date);
That will exclude days of the week that have 0 occurrences left in the month. If you want to see those you can create another table with the days of the week (1-7):
create table days_of_week (
id tinyint unsigned not null primary key,
name char(10) not null
);
insert into days_of_week (id,name) values (1,'Sunday'),(2,'Monday'),
(3,'Tuesday'),(4,'Wednesday'),(5,'Thursday'),(6,'Friday'),(7,'Saturday');
And query that table with a left join to the dates table:
select w.id, count(d.the_date) as numLeft
from days_of_week w
left outer join dates d on w.id = dayofweek(d.the_date)
and d.the_date >= @date
and d.the_date < str_to_date(period_add(date_format(@date,'%Y%m'),1),'%Y%m')
group by w.id;
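To sanity-check the numbers such a query should produce, the count can be reproduced without a dates table. This is a small Python sketch, not MySQL; the function name weekdays_left is hypothetical, and it uses the same 1 = Sunday ... 7 = Saturday numbering as DAYOFWEEK().

```python
import calendar
from datetime import date


def weekdays_left(today):
    """Map DAYOFWEEK-style numbers (1 = Sunday ... 7 = Saturday) to the
    number of occurrences left in today's month, counting today."""
    last_day = calendar.monthrange(today.year, today.month)[1]
    left = {dow: 0 for dow in range(1, 8)}
    for day in range(today.day, last_day + 1):
        d = date(today.year, today.month, day)
        # isoweekday(): Monday = 1 ... Sunday = 7; shift to Sunday = 1.
        left[d.isoweekday() % 7 + 1] += 1
    return left


print(weekdays_left(date(2010, 8, 17)))
```

For Tuesday August 17th 2010 this gives 3 for Tuesday and 2 for every other day, matching the desired output in the question, and days with no occurrences left are reported as 0.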
I found something.
According to this article on finding the next Monday:
http://www.gizmola.com/blog/archives/99-Finding-Next-Monday-using-MySQL-Dates.html
SELECT DATE_ADD(CURDATE(), INTERVAL (9 - IF(DAYOFWEEK(CURDATE())=1, 8,
DAYOFWEEK(CURDATE()))) DAY) AS NEXTMONDAY;
What we need to do is count the days between the next Monday and the end of the month, and divide by 7.
Update (to include the current day), the result looks like this.
For Monday:
SELECT CEIL( ((DATEDIFF(LAST_DAY(NOW()),DATE_ADD(CURDATE(),
INTERVAL (9 - IF(DAYOFWEEK(CURDATE())=1, 8, DAYOFWEEK(CURDATE()))) DAY)))+1)/7)
+ IF(DAYOFWEEK(CURDATE())=2,1,0)
For Tuesday:
SELECT CEIL( ((DATEDIFF(LAST_DAY(NOW()),DATE_ADD(CURDATE(),
INTERVAL (10 - IF(DAYOFWEEK(CURDATE())=1, 8, DAYOFWEEK(CURDATE()))) DAY)))+1)/7)
+ IF(DAYOFWEEK(CURDATE())=3,1,0)
Have a look at my responses to:
MySQL: Using the dates in a between condition for the results
and
Select all months within given date span, including the ones with 0 values
for a way I think would work nicely, similar to @Walker's above, but without having to do the dayofweek() function within the query, and possibly more flexible too. One of the responses has a link to a SQL dump of my table which can be imported if it helps!

SQL Query to only exclude data of last 24 hours?

I have the following data:
Table Name: NODE
NID | Name | Date
001 | One | 1252587739
Date is a Unix timestamp. I need a query that selects only the nodes whose "Date" is older than 24 hours. Something like this:
SELECT * FROM NODE WHERE Date < NOW() - SOMETHING
Anybody know how to do this?
Does the NOW() - SOMETHING part take into account that the date is stored as a unix timestamp?
Unix timestamp is in seconds. This works with MySQL:
SELECT * FROM NODE WHERE Date < (UNIX_TIMESTAMP(NOW()) - 24*60*60)
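The comparison is plain integer arithmetic on seconds, which a short Python sketch can make explicit. The row comes from the question; the function name older_than_24h and the now parameter are made up for illustration.

```python
import time

DAY_SECONDS = 24 * 60 * 60  # 86400 seconds in 24 hours

# (NID, Name, Date) row from the question; Date is a Unix timestamp in seconds.
nodes = [("001", "One", 1252587739)]


def older_than_24h(rows, now=None):
    """Keep rows whose Date is more than 24 hours before `now`."""
    now = int(time.time()) if now is None else now
    cutoff = now - DAY_SECONDS  # same value as UNIX_TIMESTAMP(NOW()) - 24*60*60
    return [row for row in rows if row[2] < cutoff]


print(older_than_24h(nodes))
```

Passing an explicit now makes the cutoff easy to test; a row dated exactly 24 hours ago is not returned because the comparison is strict.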
where datediff(hh, date, now()) < 24
Going by the definition of "Unix Timestamp = number of seconds from Jan 1, 1970", and based on MS SQL Server (7.0 and up compatible):
SELECT *
from NODE
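where datediff(ss, dateadd(ss, Date, 'Jan 1, 1970'), getdate()) > 86400
The innermost parentheses add the number of seconds to Jan 1 1970 to get the row's datetime in SQL Server format, the outer parentheses get the number of seconds between that date and "now", and 86400 is the number of seconds in 24 hours, so the comparison keeps only rows older than that. (But double-check this; I can't debug it just now, and I might have the function parameter order mixed up.)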
The innermost parenthesis adds the number of seconds to Jan 1 1970 to get the row's datetime in SQL server format, the outer parenthesis gets the number of seconds between that date and "now", and 86400 is the number of seconds in 24 hours. (But double-check this--I can't debug this just now, and I might have the function paramater order mixed.)