Grouping Unix Timestamp by Day Producing Unevenly Spaced Groups - mysql

I'm using a MySQL query to pull a range of datetimes as a Unix Timestamp (because I'll be converting them to Javascript time). I'm grouping by 'FROM_UNIXTIME' as below:
SELECT
UNIX_TIMESTAMP(DateAndTime) as x,
Sum(If(Pass='Pass',1,0)) AS y,
Sum(If(Pass='Fail',1,0)) AS z,
Sum(If(Pass='Fail',1,0))/(Sum(If(Pass='Pass',1,0))+Sum(If(Pass='Fail',1,0))) AS a,
cases.primaryApp
FROM casehistory, cases
WHERE DATE_SUB(CURDATE(),INTERVAL 80 DAY) <= DateAndTime
AND cases.caseNumber = casehistory.caseNumber
AND cases.primaryApp = 'Promo'
GROUP BY FROM_UNIXTIME(x, '%Y-%m-%d')
While I'd expected my timestamps to be returned evenly spaced (that is, same amount of time between each day/group), I get the following series:
1300488140, 1300501520,
1300625099, 1300699980
All the other data from the query is correct, but because the spacing of the timestamps is irregular, a bar chart based on these stamps looks pretty awful. Perhaps I'm doing something wrong in the way I apply the grouping?
Thank you for the reply. My query 'made sense' in that it produced data that could be plotted (the grouping was done on the x alias for the dateandtime value), but the problem was that pulling a Unix timestamp from the database and grouping by day returned a series of timestamps that did not have equal distance between them.
I solved this by pulling only the day (without the time) from the datetime MySQL field, then - in PHP - concatenating an empty time to the date, converting the resulting string to a time, then multiplying the whole shebang by 1000 to return the Javascript time I needed for the charting, like this:
$x = $x . ' 00:00:00';
$x = strtotime($x) * 1000;
The answer put me on the right track; I'll accept it. My chart looks perfect now.
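For anyone not using PHP, the same step can be sketched in Python. This is only an illustration of the idea, not the code I ran: the function name is made up, and it pins each day to midnight UTC, whereas strtotime uses the server's default timezone.

```python
from datetime import datetime, timezone


def day_to_js_millis(day: str) -> int:
    """Turn a 'YYYY-MM-DD' string into JavaScript time (ms since the epoch).

    Assumption: the day starts at midnight UTC.
    """
    midnight = datetime.strptime(day + " 00:00:00", "%Y-%m-%d %H:%M:%S")
    midnight = midnight.replace(tzinfo=timezone.utc)
    return int(midnight.timestamp()) * 1000


# Hypothetical day values as they might come back from the date-only query.
days = ["2011-03-18", "2011-03-19", "2011-03-20"]
millis = [day_to_js_millis(d) for d in days]

# Every gap is exactly one day, which is what the chart needs.
gaps = [b - a for a, b in zip(millis, millis[1:])]
print(gaps)
```

Because every value is anchored to the start of its day, the spacing no longer depends on which row in the group the timestamp happened to come from.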

Question is very confused.
Your SQL statement makes no sense - you are grouping by entities not found in the select statement. And a bar chart plots an ordered set of values - so if there's something funny with the spacing then it's not really a bar chart.
But I think the answer you are looking for is:
SELECT DATE_FORMAT(dateandtime, '%Y-%m-%d') as ondate
, SUM(IF(Pass='Pass',1,0)) AS passed
, SUM(IF(Pass='Fail',1,0)) AS failed
, SUM(IF(Pass='Fail',1,0))
/(SUM(IF(pass='Pass',1,0))+SUM(IF(Pass='Fail',1,0))) AS fail_pct
, cases.primaryapp
FROM casehistory, cases
WHERE DATE_SUB(CURDATE(),INTERVAL 80 DAY) <= dateandtime
AND cases.casenumber = casehistory.casenumber
AND cases.primaryapp = 'Promo'
GROUP BY DATE_FORMAT(dateandtime, '%Y-%m-%d')
ORDER BY 1;
And if you need Unix timestamps, wrap the above in....
SELECT UNIX_TIMESTAMP(STR_TO_DATE(CONCAT(ilv.ondate, ' 00:00:00'), '%Y-%m-%d %H:%i:%s')) AS tstamp
, passed
, failed
, fail_pct
, primaryapp
FROM (
...
) AS ilv
Note that you'll still get anomalies around DST switches.
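To make the DST point concrete, here is a small Python sketch. It assumes a server on US Eastern time, which is purely a guess for illustration, and hard-codes the UTC offsets on either side of the 2011-03-13 switch rather than looking them up.

```python
from datetime import datetime, timedelta, timezone

# Assumption: US Eastern time. Offsets are hard-coded for illustration:
# UTC-5 before the switch on 2011-03-13, UTC-4 after it.
EST = timezone(timedelta(hours=-5))
EDT = timezone(timedelta(hours=-4))

# Local midnights on three consecutive days around the switch.
midnights = [
    datetime(2011, 3, 12, tzinfo=EST),
    datetime(2011, 3, 13, tzinfo=EST),  # DST starts at 02:00 on this day
    datetime(2011, 3, 14, tzinfo=EDT),
]

stamps = [int(m.timestamp()) for m in midnights]
gaps_hours = [(b - a) // 3600 for a, b in zip(stamps, stamps[1:])]
print(gaps_hours)  # [24, 23]
```

The day containing the switch is only 23 hours long, so timestamps taken at local midnight are not perfectly evenly spaced across it.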
C.

Related

MySQL - Date - Timestamp

My table has the below mentioned timestamp
Outcome required: data between 1997 and 1999 morning times i.e. (12:00:01 to 11:59:59)
1997-09-22 18:02:38
1997-10-15 01:26:11
1997-11-03 02:42:40
1997-10-15 01:25:19
1999-10-15 01:25:19
1999-10-15 23:25:19
1998-03-12 20:15:12
1998-02-13 23:52:53
1997-09-23 23:26:01
2000-09-23 23:26:01
I am trying the below query but does not give the right outcome
SELECT * FROM r WHERE ts BETWEEN '1997-01-01 00:00:01' AND '1999-12-31 11:59:59'
I can find the outcome by extracting hours and minutes separately but is there a way where the query is a bit concise?
You need to extract date and time separately to fetch the needed data.
In MySql you can use DATE_FORMAT method to extract same.
Read more here: DATE_FORMAT(date, format)
Your query will be:
SELECT * FROM `r` WHERE DATE_FORMAT(ts, "%Y-%m-%d") BETWEEN '1997-01-01' AND '1999-12-31' AND DATE_FORMAT(ts, "%H:%i:%s") BETWEEN '00:00:01' AND '11:59:59'
If your date is not in DateTime format then you need to convert your string/raw date to date time format using STR_TO_DATE method.
Read more here: STR_TO_DATE(date, format)
You may use STR_TO_DATE function :
SELECT *
FROM r
WHERE ts >= STR_TO_DATE('1997-01-01', '%Y-%m-%d')
AND ts < STR_TO_DATE('2000-01-01', '%Y-%m-%d')
P.S: ts <= '1999-12-31 11:59:59' implicitly means ts < '2000-01-01'
There's no way to specify particular hours of day within the range comparison that spans years. We'd need to add another predicate (condition) to narrow down the rows that match the range scan.
We can use DATE_FORMAT function to get hours, minutes and seconds (formatted with two digits each)
For example, based on the stated specification (only times between 00:00:01 and 11:59:59) we could add something like this:
AND DATE_FORMAT(r.ts,'%H:%i:%s') BETWEEN '00:00:01' AND '11:59:59'
But it seems really strange to be omitting the second right after midnight, and the second immediately before noon. (MySQL DATETIME can have resolution smaller than a second, up to six decimal digits.)
Personally, I'd identify "morning hours" as simply hour values between 0 and 11, like this:
AND DATE_FORMAT(r.ts,'%H') BETWEEN '00' AND '11'
That will include "morning times" before 12:00:01 AM and after 11:59:59 AM. For example, these times would be included by this condition, but be omitted by the first example condition:
00:00:00.555
11:59:59.023
The specification isn't entirely clear... determining whether these times should be included or excluded would help clarify the specification. I suspect the statement of the specification is somewhat garbled, and we really want all "morning times" between midnight and noon.
SELECT r.*
FROM r
WHERE r.ts >= '1997-01-01'
AND r.ts < '2000-01-01'
AND DATE_FORMAT(r.ts,'%H') BETWEEN '00' AND '11'
But it really depends on the definition of "morning hours", whether that first second after midnight is included or excluded.
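As a sanity check on the intended result, the same two-part condition (a half-open date range plus an hour test) can be sketched in Python against the sample rows from the question. This assumes "morning" means hours 0 through 11; the variable and function names are made up for the example.

```python
from datetime import datetime

# Sample timestamps copied from the question.
rows = [
    "1997-09-22 18:02:38",
    "1997-10-15 01:26:11",
    "1997-11-03 02:42:40",
    "1997-10-15 01:25:19",
    "1999-10-15 01:25:19",
    "1999-10-15 23:25:19",
    "1998-03-12 20:15:12",
    "1998-02-13 23:52:53",
    "1997-09-23 23:26:01",
    "2000-09-23 23:26:01",
]

FMT = "%Y-%m-%d %H:%M:%S"
start = datetime(1997, 1, 1)
end = datetime(2000, 1, 1)  # exclusive upper bound, like ts < '2000-01-01'


def is_morning_in_range(ts: datetime) -> bool:
    # Date range first, then the hour-of-day test (24-hour clock, 0..11).
    return start <= ts < end and ts.hour < 12


parsed = [datetime.strptime(r, FMT) for r in rows]
matches = [ts.strftime(FMT) for ts in parsed if is_morning_in_range(ts)]
print(matches)
```

Only the four rows with times between 01:00 and 03:00 survive; the 2000 row fails the date range and the rest fail the hour test.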

Show the value from the nearest timestamp from another table in MYSQL

So I have a table called dash with two columns: value and date.
I have a timestamp variable called localtime.
Both localtime and date are in yyyy-mm-dd hh-mm format.
I need to find the closest timestamp on dash, return the value.
Right now what I have doesn't work.
def convValueChecking(cursor, localtime):
    cursor.execute("SELECT Value, MIN(TIMESTAMPDIFF(MINUTE, DATE, %s)) FROM dash", (localtime))
    value = cursor.fetchall()
Update:
This one seems to work; the order_date > localtime condition is very important, otherwise it looks for the smallest negative number
cursor.execute(
"SELECT order_value, order_date FROM dashboard \
WHERE order_date > %s\
ORDER BY ABS(TIMESTAMPDIFF(MINUTE, %s, order_date))\
LIMIT 1", (localtime, localtime,))
Try replacing the SQL string with:
SELECT
value
FROM
dash
WHERE
TIMEDIFF(Date, %s) IN
(
SELECT
MIN(TIMEDIFF(Date, %s))
FROM
dash
)
MySQL syntax means you need to do SELECT then FROM and then declare the WHERE part.
TIMEDIFF returns the difference between two time or datetime values as a TIME value. TIMESTAMPDIFF returns the difference between two datetime values as an integer in the unit you specify (months, days, seconds, etc.). If you want the closest date you should use TIMEDIFF to work out the smallest difference overall.
NB. The words Value and Date also have special properties in MySQL, so you may want to change your column name.
SQLFiddle.
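If the rows are already in memory, the "closest timestamp" idea can be sketched in plain Python by minimising the absolute difference. The row layout (value, date) follows the question; the function name and sample data are hypothetical.

```python
from datetime import datetime


def nearest_value(rows, localtime):
    """Return the value whose date is closest to localtime.

    rows is a list of (value, date) tuples. Taking the absolute
    difference means both earlier and later rows are considered.
    """
    value, _ = min(
        rows,
        key=lambda row: abs((row[1] - localtime).total_seconds()),
    )
    return value


# Hypothetical contents of the dash table.
rows = [
    (10, datetime(2015, 6, 1, 9, 0)),
    (20, datetime(2015, 6, 1, 12, 0)),
    (30, datetime(2015, 6, 1, 18, 30)),
]

print(nearest_value(rows, datetime(2015, 6, 1, 11, 15)))  # 20
```

Without the absolute value, the minimum would be the most negative difference rather than the closest row, which is the problem the update in the question works around.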

SQL query to select values grouped by hour(col) and weekday(row) based on the timestamp

I have searched SO for this question and found slightly similar posts but was unable to adapt to my needs.
I have a database with server requests since forever, each one with a timestamp, and I'm trying to come up with a query that allows me to create a heatmatrix chart (CCC HeatGrid).
The sql query result must represent the server load grouped by each hour of each weekday.
Like this: Example table
I just need the SQL query, I know how to create the chart.
Thank you,
Those look like "counts" of rows.
One of the issues is "sparse" data, we can address that later.
To get the day of the week ('Sunday','Monday',etc.) returned, you can use the DATE_FORMAT function. To get those ordered, we need to include an integer value 0 through 6, or 1 through 7. We can use an ORDER BY clause on that expression to get the rows returned in the order we want.
To get the "hour" across the top, we can use expressions in the SELECT list that conditionally increments the count.
Assuming your timestamp column is named ts, and assuming you want to pull all rows from the year 2014, we start with something like this:
SELECT DAYOFWEEK(t.ts)
, DATE_FORMAT(t.ts,'%W')
FROM mytable t
WHERE t.ts >= '2014-01-01'
AND t.ts < '2015-01-01'
GROUP BY DAYOFWEEK(t.ts)
ORDER BY DAYOFWEEK(t.ts)
(I need to check the MySQL documentation; WEEKDAY and DAYOFWEEK are really similar, but we want the one that returns the lowest value for Sunday and the highest value for Saturday. I think we want DAYOFWEEK; easy enough to fix later.)
The "trick" now is the columns across the top.
We can extract the "hour" from timestamp using the DATE_FORMAT() function, the HOUR() function, or an EXTRACT() function... take your pick.
The expressions we want are going to return a 1 if the timestamp is in the specified hour, and a zero otherwise. Then, we can use a SUM() aggregate to count up the 1. A boolean expression returns a value of 1 for TRUE and 0 for FALSE.
, SUM( HOUR(t.ts)=0 ) AS `h0`
, SUM( HOUR(t.ts)=1 ) AS `h1`
, SUM( HOUR(t.ts)=2 ) AS `h2`
, '...'
, SUM( HOUR(t.ts)=22 ) AS `h22`
, SUM( HOUR(t.ts)=23 ) AS `h23`
A boolean expression can also evaluate to NULL, but since we have a predicate (i.e. condition in the WHERE clause) that ensures us that ts can't be NULL, that won't be an issue.
The other issue we can encounter (as I mentioned earlier) is "sparse" data. To illustrate that, consider what happens (with our query) if there are no rows that have a ts value for a Monday. What happens is that we don't get a row in the resultset for Monday. If it does happen that a row is "missing" for Monday (or any day of the week), we do know that all of the hourly counts across the "missing" Monday row would all be zero.
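One way to handle the sparse case is to start from a grid of zeros and only increment the cells that have data. Here is a rough Python sketch of that idea applied to rows already fetched from the database; the function name and sample timestamps are made up, and the Sunday-first ordering mirrors DAYOFWEEK.

```python
from datetime import datetime

# Sunday first, to match the ordering DAYOFWEEK gives (1 = Sunday).
DAY_NAMES = [
    "Sunday", "Monday", "Tuesday", "Wednesday",
    "Thursday", "Friday", "Saturday",
]


def heat_grid(timestamps):
    """Count requests per weekday (rows) and hour (columns).

    Every weekday starts with 24 zeros, so a day with no
    requests still appears in the result.
    """
    grid = {name: [0] * 24 for name in DAY_NAMES}
    for ts in timestamps:
        # Python's weekday() is Monday=0..Sunday=6; shift to Sunday=0.
        day_index = (ts.weekday() + 1) % 7
        grid[DAY_NAMES[day_index]][ts.hour] += 1
    return grid


# Hypothetical request timestamps: two on a Sunday, one on a Tuesday.
stamps = [
    datetime(2014, 1, 5, 0, 15),
    datetime(2014, 1, 5, 0, 45),
    datetime(2014, 1, 7, 23, 5),
]

grid = heat_grid(stamps)
print(grid["Sunday"][0], grid["Tuesday"][23], sum(grid["Monday"]))
```

Monday has no rows in the sample data but still comes back as a full row of zeros, which is what the chart needs.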

mysql select, sum, group by date when you have epoch or sysmillis for a time stamp

I have been wrestling around with various time/date manipulations in MySQL and have not figured out how to get this done. I am trying to select a daily sum of a column, volume. My date column contains sysmillis (epoch * 1000). I have tried things such as
SELECT YEAR(from_unixtime(date/1000)) FROM...
and none of what I have tried does the trick. What I want to end up with is a result table that does a sum of all the transactions volume column, for each day. Seems like a pretty simple idea to me, but it just is not working. Is this something that I need to do a nested query to do or should this just be a simple one-liner, that I am just not getting the function right?
SELECT DATE(FROM_UNIXTIME(date/1000)) AS txn_day, SUM(volume)
FROM ...
WHERE ...
GROUP BY txn_day
should do the trick, unless your table structure is wonky.
SELECT SUM(volume) FROM (
SELECT
SUBSTR( FROM_UNIXTIME( ROUND(volumeDate /1000) ), 1, 10) AS dayValue,
volume FROM `volumeTable`
) a
GROUP BY dayValue
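For comparison, the same daily roll-up can be sketched in Python on rows of (sysmillis, volume). This is only an illustration under stated assumptions: the function name and data are hypothetical, and days are cut at UTC midnight, whereas FROM_UNIXTIME uses the session time zone.

```python
from collections import defaultdict
from datetime import datetime, timezone


def daily_volume(rows):
    """Sum volume per calendar day from (epoch_millis, volume) rows.

    Assumption: day boundaries are taken in UTC.
    """
    totals = defaultdict(int)
    for millis, volume in rows:
        # Integer-divide by 1000 to get whole seconds since the epoch.
        day = datetime.fromtimestamp(millis // 1000, tz=timezone.utc).date()
        totals[day.isoformat()] += volume
    return dict(sorted(totals.items()))


# Hypothetical transactions: two on 2011-03-18, one on 2011-03-19.
rows = [
    (1300488140000, 5),
    (1300501520000, 7),
    (1300406400000, 3),
]

totals = daily_volume(rows)
print(totals)
```

The key point is the same as in the SQL: reduce each timestamp to its day first, then group and sum on that day value.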

How do I get the average time from a series of DateTime columns in SQL Server 2008?

Let's say I have a table that contains the following - id and date (just to keep things simple).
It contains numerous rows.
What would my select query look like to get the average TIME for those rows?
Thanks,
Disclaimer: There may be a much better way to do this.
Notes:
You can't use the AVG() function against a DATETIME/TIME
I am casting DATETIME to DECIMAL( 18, 6 ) which appears to yield a reasonably (+- few milliseconds) precise result.
#1 - Average Date
SELECT
CAST( AVG( CAST( TimeOfInterest AS DECIMAL( 18, 6 ) ) ) AS DATETIME )
FROM dbo.MyTable;
#2 - Average Time - Remove Date Portion, Cast, and then Average
SELECT
CAST( AVG( CAST( TimeOfInterest - CAST( TimeOfInterest AS DATE ) AS DECIMAL( 18, 6 ) ) ) AS DATETIME )
FROM dbo.MyTable;
The second example subtracts the date portion of the DATETIME from itself, leaving only the time portion, which is then cast to a decimal for averaging, and back to a DATETIME for formatting. You would need to strip out the date portion (it's meaningless) and the time portion should represent the average time in the set.
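The same "strip the date, average what is left" idea can be sketched in Python to check the logic. The function name and sample values are made up. Note that this is a plain arithmetic mean, so times clustered around midnight (23:50 and 00:10, say) will average to about noon.

```python
from datetime import datetime, timedelta


def average_time_of_day(stamps):
    """Average the time-of-day portion of a list of datetimes.

    Each value is reduced to seconds since its own midnight,
    then the plain arithmetic mean is taken.
    """
    seconds = [
        (s - s.replace(hour=0, minute=0, second=0, microsecond=0)).total_seconds()
        for s in stamps
    ]
    return timedelta(seconds=sum(seconds) / len(seconds))


# Hypothetical rows: different dates, times of 08:00 and 10:00.
stamps = [
    datetime(2012, 1, 1, 8, 0),
    datetime(2012, 3, 3, 10, 0),
]

avg = average_time_of_day(stamps)
print(avg)  # 9:00:00
```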
SELECT CAST(AVG(CAST(ReadingDate AS real) - FLOOR(CAST(ReadingDate as real))) AS datetime)
FROM Rbh
I know that, in at least some of the SQL standards, the value expression (the argument to the AVG() function) isn't allowed to be a datetime value or a string value. I haven't read all the SQL standards, but I'd be surprised if that restriction had loosened over the years.
In part, that's because "average" (or arithmetic mean) of 'n' values is defined to be the sum of the values divided by the 'n'. And the expression '01-Jan-2012 08:00' + '03-Mar-2012 07:53' doesn't make any sense. Neither does '01-Jan-2012 08:00' / 3.
Microsoft products have a history of playing fast and loose with SQL by exposing the internal representation of their date and time data types. Dennis Ritchie would have called this "an unwarranted chumminess with the implementation."
In earlier versions of Microsoft Access (and maybe in current versions, too), you could multiply the date '01-Jan-2012' by the date '03-Mar-2012' and get an actual return value, presumably in units of square dates.
If your dbms supports the "interval" data type, then taking the average is straightforward, and does what you'd expect. (SQL Server doesn't support interval data types.)
create table test (
n interval hour to minute
);
insert into test values
('1:00'),
('1:30'),
('2:00');
select avg(n)
from test;
avg (interval)
--
01:30:00
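The same behaviour is easy to see in a language with a duration type. As a small sketch, Python's timedelta supports both addition and division by a number, so the mean of the three values above falls out directly.

```python
from datetime import timedelta

# The same three intervals inserted into the test table above.
intervals = [
    timedelta(hours=1),
    timedelta(hours=1, minutes=30),
    timedelta(hours=2),
]

# Sum of durations divided by a count is itself a duration.
average = sum(intervals, timedelta()) / len(intervals)
print(average)  # 1:30:00
```

Durations can be summed and divided, which is exactly what points in time cannot sensibly do.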