MySQL Query - Include dates without records - mysql

I have a report that displays a graph. The X axis uses the date from the below query. Where the query returns no date, I am getting gaps and would prefer to return a value. Is there any way to force a date where there are no records?
SELECT
DATE(instime),
CASE
WHEN direction = 1 AND duration > 0 THEN 'Incoming'
WHEN direction = 2 THEN 'Outgoing'
WHEN direction = 1 AND duration = 0 THEN 'Missed'
END AS type,
COUNT(*)
FROM taxticketitem
GROUP BY
DATE(instime),
CASE
WHEN direction = 1 AND duration > 0 THEN 'Incoming'
WHEN direction = 2 THEN 'Outgoing'
WHEN direction = 1 AND duration = 0 THEN 'Missed'
END
ORDER BY DATE(instime)

One possible way is to create a table of dates and LEFT JOIN your table with them. The table could look something like this:
CREATE TABLE `datelist` (
`date` DATE NOT NULL,
PRIMARY KEY (`date`)
);
and filled with all dates between, say Jan-01-2000 through Dec-31-2050 (here is my Date Generator script).
Next, write your query like this:
SELECT datelist.date, COUNT(taxticketitem.id) AS c
FROM datelist
LEFT JOIN taxticketitem ON datelist.date = DATE(taxticketitem.instime)
WHERE datelist.date BETWEEN `2012-01-01` AND `2012-12-31`
GROUP BY datelist.date
ORDER BY datelist.date
LEFT JOIN and counting not null values from right table's ensures that the count is correct (0 if no row exists for a given date).

You would need to have a set of dates to LEFT JOIN your table to it. Unfortunately, MySQL lacks a way to generate it on the fly.
You would need to prepare a table with, say, 100000 consecutive integers from 0 to 99999 (or how long you think your maximum report range would be):
CREATE TABLE series (number INT NOT NULL PRIMARY KEY);
and use it like this:
SELECT DATE(instime) AS r_date, CASE ... END AS type, COUNT(instime)
FROM series s
LEFT JOIN
taxticketitems ti
ON ti.instime >= '2013-01-01' + INTERVAL number DAY
AND ti.instime < '2013-01-01' + INTERVAL number + 1 DAY
WHERE s.number <= DATEDIFF('2013-02-01', '2013-01-01')
GROUP BY
r_date, type

Had to do something similar before.
You need to have a subselect to generate a range of dates. All the dates you want. Easiest with a start date added to a number:-
SELECT DATE_ADD(SomeStartDate, INTERVAL (a.I + b.1 * 10) DAY)
FROM integers a, integers b
Given a table called integers with a single column called i with 10 rows containing 0 to 9 that SQL will give you a range of 100 days starting at SomeStartDate
You can then left join your actual data against that to get the full range.

Related

php - mysql select last inserted record between two dates (every month)

I have a table: box (id autoincrement, net_amount, created_at timestamp);
I need to create a query in php mysql to select the last inserted record every month. Thus to get the net_amount at the end of every month.
I am trying this simple query:
select * from box
where box.created_at < 20055332516028
While the max created_at in my table is 2017-10-14 10:42:30, there is no records when I use the given query, I need to increase the number to get the records!!!
20055332516028 is not a Unix timestamp. You need to get timestamp of the end of the previous month, something like this:
$date = new DateTime();
$date->setDate($date->format('Y'),$date->format('m'),1);
$date->setTime(0,0,0);
$date->sub(new DateInterval('PT1S'));
$endOfMonth = $date->getTimestamp();
and then use it in a query:
select * from box where box.created_at < unix_timestamp(?) order by box.created_at desc limit 1
if i understand your problem your looking for this:-
SELECT b1.*
FROM box b1 LEFT JOIN box b2
ON (b1.name = b2.name AND b1.id < b2.id)
WHERE b1.created_at < PUT YOUR DATE and b2.id IS NULL;
Hope it helps

Generating complex sql tables

I currently have an employee logging sql table that has 3 columns
fromState: String,
toState: String,
timestamp: DateTime
fromState is either In or Out. In means employee came in and Out means employee went out. Each row can only transition from In to Out or Out to In.
I'd like to generate a temporary table in sql to keep track during a given hour (hour by hour), how many employees are there in the company. Aka, resulting table has columns HourBucket, NumEmployees.
In non-SQL code I can do this by initializing the numEmployees as 0 and go through the table row by row (sorted by timestamp) and add (employee came in) or subtract (went out) to numEmployees (bucketed by timestamp hour).
I'm clueless as how to do this in SQL. Any clues?
Use a COUNT ... GROUP BY query. Can't see what you're using toState from your description though! Also, assuming you have an employeeID field.
E.g.
SELECT fromState AS 'Status', COUNT(*) AS 'Number'
FROM StaffinBuildingTable
INNER JOIN (SELECT employeeID AS 'empID', MAX(timestamp) AS 'latest' FROM StaffinBuildingTable GROUP BY employeeID) AS LastEntry ON StaffinBuildingTable.employeeID = LastEntry.empID
GROUP BY fromState
The LastEntry subquery will produce a list of employeeIDs limited to the last timestamp for each employee.
The INNER JOIN will limit the main table to just the employeeIDs that match both sides.
The outer GROUP BY produces the count.
SELECT HOUR(SBT.timestamp) AS 'Hour', SBT.fromState AS 'Status', COUNT(*) AS 'Number'
FROM StaffinBuildingTable AS SBT
INNER JOIN (
SELECT SBIJ.employeeID AS 'empID', MAX(timestamp) AS 'latest'
FROM StaffinBuildingTable AS SBIJ
WHERE DATE(SBIJ.timestamp) = CURDATE()
GROUP BY SBIJ.employeeID) AS LastEntry ON SBT.employeeID = LastEntry.empID
GROUP BY SBT.fromState, HOUR(SBT.timestamp)
Replace CURDATE() with whatever date you are interested in.
Note this is non-optimal as it calculates the HOUR twice - once for the data and once for the group.
Again you are using the INNER JOIN to limit the number of returned row, this time to the last timestamp on a given day.
To me your description of the FromState and ToState seem the wrong way round, I'd expect to doing this based on the ToState. But assuming I'm wrong on that the following should point you in the right direction:
First, I create a "Numbers" table containing 24 rows one for each hour of the day:
create table tblHours
(Number int);
insert into tblHours values
(0),(1),(2),(3),(4),(5),(6),(7),
(8),(9),(10),(11),(12),(13),(14),(15),
(16),(17),(18),(19),(20),(21),(22),(23);
Then for each date in your employee logging table, I create a row in another new table to contain your counts:
create table tblDailyHours
(
HourBucket datetime,
NumEmployees int
);
insert into tblDailyHours (HourBucket, NumEmployees)
select distinct
date_add(date(t.timeStamp), interval h.Number HOUR) as HourBucket,
0 as NumEmployees
from
tblEmployeeLogging t
CROSS JOIN tblHours h;
Then I update this table to contain all the relevant counts:
update tblDailyHours h
join
(select
h2.HourBucket,
sum(case when el.fromState = 'In' then 1 else -1 end) as cnt
from
tblDailyHours h2
join tblEmployeeLogging el on
h2.HourBucket >= el.timeStamp
group by h2.HourBucket
) cnt ON
h.HourBucket = cnt.HourBucket
set NumEmployees = cnt.cnt;
You can now retrieve the counts with
select *
from tblDailyHours
order by HourBucket;
The counts give the number on site at each of the times displayed, if you want during the hour in question, we'd need to tweak this a little.
There is a working version of this code (using not very realistic data in the logging table) here: rextester.com/DYOR23344
Original Answer (Based on a single over all count)
If you're happy to search over all rows, and want the current "head count" you can use this:
select
sum(case when t.FromState = 'In' then 1 else -1) as Heads
from
MyTable t
But if you know that there will always be no-one there at midnight, you can add a where clause to prevent it looking at more rows than it needs to:
where
date(t.timestamp) = curdate()
Again, on the assumption that the head count reaches zero at midnight, you can generalise that method to get a headcount at any time as follows:
where
date(t.timestamp) = "CENSUS DATE" AND
t.timestamp <= "CENSUS DATETIME"
Obviously you'd need to replace my quoted strings with code which returned the date and datetime of interest. If the headcount doesn't return to zero at midnight, you can achieve the same by removing the first line of the where clause.

Update MySQL table with counts from subquery if string match, else set to zero

I have a MySQL table department_members that contains rows with a string field (member_name) and an int field (recent_actions) for every person in a single department. Recent_actions is currently NULL for all rows.
I have another, much larger table company_actions that contains a row for every time someone in the whole company has performed that type of action in the past year. Each row has a member_name, timestamp, and a unique action_id.
I want to update department_members.recent_actions with a count of how many times that member has performed that type of action within the past two weeks. If they haven't performed any actions recently, I want to update department_members.recent_actions with 0.
I've tried various CASE and IF approaches, but I can't get the syntax right.
In pseudocode, this is what I'm trying to do:
UPDATE department_members AS d,
(SELECT COUNT(action_id) AS recent, member_name
FROM company_actions
WHERE timestamp > DATE_SUB(NOW(), INTERVAL 14 DAY) AS tmp
/* then do something like this, only for real: */
IF d.member_name IN tmp(member_names) THEN d.recent_actions = tmp.recent
WHERE d.member_name = tmp.member_name
ELSE IF d.member_name NOT IN tmp(member_names) THEN d.recent_actions = 0
Hopefully that gets across what I'm going for? Any help would be appreciated! Been beating my head against this problem all day.
Join department_members with a subquery that calculates the total number of action_id in table company_actions using LEFT JOIN.
The COALESCE() returns the first non-null value in the params list.
UPDATE department_members a
LEFT JOIN
(
SELECT member_name, COUNT(*) TotalAction
FROM company_actions
WHERE timestamp > DATE_SUB(NOW(), INTERVAL 14 DAY)
GROUP BY member_name
) b ON a.member_name = b.member_name
SET a.action_id = COALESCE(b.TotalAction, 0)

Group by day and still show days without rows?

I have a log table with a date field called logTime. I need to show the number of rows within a date range and the number of records per day. The issue is that i still want to show days that do not have records.
Is it possible to do this only with SQL?
Example:
SELECT logTime, COUNT(*) FROM logs WHERE logTime >= '2011-02-01' AND logTime <= '2011-02-04' GROUP BY DATE(logTime);
It returns something like this:
+---------------------+----------+
| logTime | COUNT(*) |
+---------------------+----------+
| 2011-02-01 | 2 |
| 2011-02-02 | 1 |
| 2011-02-04 | 5 |
+---------------------+----------+
3 rows in set (0,00 sec)
I would like to show the day 2011-02-03 too.
MySQL will not invent rows for you, so if the data is not there, they will naturally not be shown.
You can create a calendar table, and join in that,
create table calendar (
day date primary key,
);
Fill this table with dates (easy with a stored procedure, or just some general scripting), up till around 2038 and something else will likely break unitl that becomes a problem.
Your query then becomes e.g.
SELECT logTime, COUNT(*)
FROM calendar cal left join logs l on cal.day = l.logTime
WHERE day >= '2011-02-01' AND day <= '2011-02-04' GROUP BY day;
Now, you could extend the calendar table with other columns that tells you the month,year, week etc. so you can easily produce statistics for other time units. (and purists might argue the calendar table would have an id integer primary key that the logs table references instead of a date)
In order to accomplish this, you need to have a table (or derived table) which contains the dates that you can then join from, using a LEFT JOIN.
SQL operates on the concept of mathematical sets, and if you don't have a set of data, there is nothing to SELECT.
If you want more details, please comment accordingly.
I'm not sure if this is a problem that should be solved by SQL. As others have shown, this requires maintaining a second table that contains the all of the individual dates of a given time span, which must be updated every time that time span grows (which presumably is "always" if that time span is the current time.
Instead, you should use to inspect the results of the query and inject dates as necessary. It's completely dynamic and requires no intermediate table. Since you specified no language, here's pseudo code:
EXECUTE QUERY `SELECT logTime, COUNT(*) FROM logs WHERE logTime >= '2011-02-01' AND logTime <= '2011-02-04' GROUP BY DATE(logTime);`
FOREACH row IN query result
WHILE (date in next row) - (date in this row) > 1 day THEN
CREATE new row with date = `date in this row + 1 day`, count = `0`
INSERT new row IN query result AFTER this row
ADVANCE LOOP INDEX TO new row (`this row` is now the `new row`)
END WHILE
END FOREACH
Or something like that
DECLARE #TOTALCount INT
DECLARE #FromDate DateTime = GetDate() - 5
DECLARE #ToDate DateTime = GetDate()
SET #FromDate = DATEADD(DAY,-1,#FromDate)
Select #TOTALCount= DATEDIFF(DD,#FromDate,#ToDate);
WITH d AS
(
SELECT top (#TOTALCount) AllDays = DATEADD(DAY, ROW_NUMBER()
OVER (ORDER BY object_id), REPLACE(#FromDate,'-',''))
FROM sys.all_objects
)
SELECT AllDays From d

MySQL: Average interval between records

Assume this table:
id date
----------------
1 2010-12-12
2 2010-12-13
3 2010-12-18
4 2010-12-22
5 2010-12-23
How do I find the average intervals between these dates, using MySQL queries only?
For instance, the calculation on this table will be
(
( 2010-12-13 - 2010-12-12 )
+ ( 2010-12-18 - 2010-12-13 )
+ ( 2010-12-22 - 2010-12-18 )
+ ( 2010-12-23 - 2010-12-22 )
) / 4
----------------------------------
= ( 1 DAY + 5 DAY + 4 DAY + 1 DAY ) / 4
= 2.75 DAY
Intuitively, what you are asking should be equivalent to the interval between the first and last dates, divided by the number of dates minus 1.
Let me explain more thoroughly. Imagine the dates are points on a line (+ are dates present, - are dates missing, the first date is the 12th, and I changed the last date to Dec 24th for illustration purposes):
++----+---+-+
Now, what you really want to do, is evenly space your dates out between these lines, and find how long it is between each of them:
+--+--+--+--+
To do that, you simply take the number of days between the last and first days, in this case 24 - 12 = 12, and divide it by the number of intervals you have to space out, in this case 4: 12 / 4 = 3.
With a MySQL query
SELECT DATEDIFF(MAX(dt), MIN(dt)) / (COUNT(dt) - 1) FROM a;
This works on this table (with your values it returns 2.75):
CREATE TABLE IF NOT EXISTS `a` (
`dt` date NOT NULL
) ENGINE=MyISAM DEFAULT CHARSET=latin1;
INSERT INTO `a` (`dt`) VALUES
('2010-12-12'),
('2010-12-13'),
('2010-12-18'),
('2010-12-22'),
('2010-12-24');
If the ids are uniformly incremented without gaps, join the table to itself on id+1:
SELECT d.id, d.date, n.date, datediff(d.date, n.date)
FROM dates d
JOIN dates n ON(n.id = d.id + 1)
Then GROUP BY and average as needed.
If the ids are not uniform, do an inner query to assign ordered ids first.
I guess you'll also need to add a subquery to get the total number of rows.
Alternatively
Create an aggregate function that keeps track of the previous date, and a running sum and count. You'll still need to select from a subquery to force the ordering by date (actually, I'm not sure if that's guaranteed in MySQL).
Come to think of it, this is a much better way of doing it.
And Even Simpler
Just noting that Vegard's solution is much better.
The following query returns correct result
SELECT AVG(
DATEDIFF(i.date, (SELECT MAX(date)
FROM intervals WHERE date < i.date)
)
)
FROM intervals i
but it runs a dependent subquery which might be really inefficient with no index and on a larger number of rows.
You need to do self join and get differences using DATEDIFF function and get average.