conditionally sum from one table based on information from another table - mysql

I am trying to write an sql query that outputs data for a python script. The python script will eventually push that data to a table so to make things smoother, I decided to cast the output as char.
The data that I have is organized by 15min periods. Data A and data B are stored on one table and have columns start_time (as a datetime), counts A, and counts B. The second table has start_time (as a datetime), and counts C.
What I need is the sum of A, B, and C for each day. However, I want the sums to be conditional: a 15 min period should count toward a sum only when the other two counts are also not NULL for that period. For example, if a "row" for a 15 min period has data for A and B but not C, it would not count in the sum. How do I implement this conditional?
example output:
date| SUM(A) | SUM(B) | SUM(C)
I can write without the conditional like this (new to sql):
SELECT
    DATE('timezone conversion') AS date,
    cast(SUM(p1.COUNT_DATA_A) as char) AS A,
    cast(SUM(p1.COUNT_DATA_B) as char) AS B,
    cast(SUM(p2.COUNT_DATA_C) as char) AS C
FROM
    table_data_A_B AS p1
    LEFT JOIN table_data_C AS p2 ON p1.start_time = p2.start_time
WHERE
    DATE('timezone conversion') >= '2018-03-27'
    AND DATE('timezone conversion') < '2018-03-29'
GROUP BY DATE('timezone conversion')
ORDER BY DATE(p1.start_time) DESC
How would I implement the conditional in this query? I appreciate the help. I am a bit new to stackoverflow, coding and sql in general but I will try my best to be helpful.

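Just test for this in the WHERE clause of the query. Note that requiring p2.COUNT_DATA_C to be non-NULL also discards periods with no matching row in the second table, so the LEFT JOIN effectively behaves like an INNER JOIN.
WHERE DATE('timezone conversion') >= '2018-03-27'
AND DATE('timezone conversion') < '2018-03-29'
AND p1.COUNT_DATA_A IS NOT NULL AND p1.COUNT_DATA_B IS NOT NULL AND p2.COUNT_DATA_C IS NOT NULL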
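Since the result is consumed by a Python script anyway, here is a minimal sketch of the NULL filter in action. It uses Python's built-in sqlite3 module as a stand-in for MySQL, leaves out the timezone conversion and the cast to char, and the sample rows are made up for illustration:

```python
import sqlite3

# Made-up 15 min periods: the 00:15 row has no B, the 00:30 row has no C.
SAMPLE_AB = [
    ("2018-03-27 00:00:00", 1, 2),
    ("2018-03-27 00:15:00", 3, None),
    ("2018-03-27 00:30:00", 5, 6),
    ("2018-03-28 00:00:00", 7, 8),
]
SAMPLE_C = [
    ("2018-03-27 00:00:00", 10),
    ("2018-03-27 00:15:00", 20),
    ("2018-03-28 00:00:00", 30),
]


def daily_sums(rows_ab, rows_c):
    """Sum A, B and C per day, using only periods where all three are non-NULL."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE table_data_A_B "
        "(start_time TEXT, COUNT_DATA_A INTEGER, COUNT_DATA_B INTEGER)"
    )
    conn.execute("CREATE TABLE table_data_C (start_time TEXT, COUNT_DATA_C INTEGER)")
    conn.executemany("INSERT INTO table_data_A_B VALUES (?, ?, ?)", rows_ab)
    conn.executemany("INSERT INTO table_data_C VALUES (?, ?)", rows_c)
    result = conn.execute(
        """
        SELECT DATE(p1.start_time) AS day,
               SUM(p1.COUNT_DATA_A),
               SUM(p1.COUNT_DATA_B),
               SUM(p2.COUNT_DATA_C)
        FROM table_data_A_B AS p1
        LEFT JOIN table_data_C AS p2 ON p1.start_time = p2.start_time
        WHERE p1.COUNT_DATA_A IS NOT NULL
          AND p1.COUNT_DATA_B IS NOT NULL
          AND p2.COUNT_DATA_C IS NOT NULL
        GROUP BY day
        ORDER BY day DESC
        """
    ).fetchall()
    conn.close()
    return result


print(daily_sums(SAMPLE_AB, SAMPLE_C))
```

With this sample data only the 00:00 period of each day survives the filter, so the incomplete 00:15 and 00:30 periods contribute nothing to any of the three sums.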

Related

Generating complex sql tables

I currently have an employee logging sql table that has 3 columns
fromState: String,
toState: String,
timestamp: DateTime
fromState is either In or Out. In means employee came in and Out means employee went out. Each row can only transition from In to Out or Out to In.
I'd like to generate a temporary table in sql to keep track during a given hour (hour by hour), how many employees are there in the company. Aka, resulting table has columns HourBucket, NumEmployees.
In non-SQL code I can do this by initializing the numEmployees as 0 and go through the table row by row (sorted by timestamp) and add (employee came in) or subtract (went out) to numEmployees (bucketed by timestamp hour).
I'm clueless as how to do this in SQL. Any clues?
Use a COUNT ... GROUP BY query. I can't see from your description what you're using toState for, though! I'm also assuming you have an employeeID field.
E.g.
SELECT fromState AS 'Status', COUNT(*) AS 'Number'
FROM StaffinBuildingTable
INNER JOIN (SELECT employeeID AS 'empID', MAX(timestamp) AS 'latest' FROM StaffinBuildingTable GROUP BY employeeID) AS LastEntry ON StaffinBuildingTable.employeeID = LastEntry.empID AND StaffinBuildingTable.timestamp = LastEntry.latest
GROUP BY fromState
The LastEntry subquery produces one row per employeeID, holding the last timestamp for that employee.
The INNER JOIN limits the main table to just the rows that match on both the employeeID and that last timestamp.
The outer GROUP BY produces the count.
SELECT HOUR(SBT.timestamp) AS 'Hour', SBT.fromState AS 'Status', COUNT(*) AS 'Number'
FROM StaffinBuildingTable AS SBT
INNER JOIN (
SELECT SBIJ.employeeID AS 'empID', MAX(timestamp) AS 'latest'
FROM StaffinBuildingTable AS SBIJ
WHERE DATE(SBIJ.timestamp) = CURDATE()
GROUP BY SBIJ.employeeID) AS LastEntry ON SBT.employeeID = LastEntry.empID AND SBT.timestamp = LastEntry.latest
GROUP BY SBT.fromState, HOUR(SBT.timestamp)
Replace CURDATE() with whatever date you are interested in.
Note this is non-optimal as it calculates the HOUR twice - once for the data and once for the group.
Again you are using the INNER JOIN to limit the number of returned rows, this time to the last timestamp on a given day.
To me your descriptions of FromState and ToState seem the wrong way round; I'd expect to be doing this based on the ToState. But assuming I'm wrong on that, the following should point you in the right direction:
First, I create a "Numbers" table containing 24 rows one for each hour of the day:
create table tblHours
(Number int);
insert into tblHours values
(0),(1),(2),(3),(4),(5),(6),(7),
(8),(9),(10),(11),(12),(13),(14),(15),
(16),(17),(18),(19),(20),(21),(22),(23);
Then for each date in your employee logging table, I create a row in another new table to contain your counts:
create table tblDailyHours
(
HourBucket datetime,
NumEmployees int
);
insert into tblDailyHours (HourBucket, NumEmployees)
select distinct
date_add(date(t.timeStamp), interval h.Number HOUR) as HourBucket,
0 as NumEmployees
from
tblEmployeeLogging t
CROSS JOIN tblHours h;
Then I update this table to contain all the relevant counts:
update tblDailyHours h
join
(select
h2.HourBucket,
sum(case when el.fromState = 'In' then 1 else -1 end) as cnt
from
tblDailyHours h2
join tblEmployeeLogging el on
h2.HourBucket >= el.timeStamp
group by h2.HourBucket
) cnt ON
h.HourBucket = cnt.HourBucket
set NumEmployees = cnt.cnt;
You can now retrieve the counts with
select *
from tblDailyHours
order by HourBucket;
The counts give the number on site at each of the times displayed. If you want the number present during the hour in question, we'd need to tweak this a little.
There is a working version of this code (using not very realistic data in the logging table) here: rextester.com/DYOR23344
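The hour-bucket idea above can also be tried without a MySQL server. The following is a rough sketch using Python's built-in sqlite3 module as a stand-in: it builds the buckets in a subquery instead of the tblDailyHours table, uses SQLite's datetime() in place of date_add, and the three log rows are invented for illustration. As in the answer, fromState = 'In' is taken to mean the employee came in.

```python
import sqlite3

# Invented log rows: two arrivals in the morning, one departure after lunch.
SAMPLE_LOG = [
    ("In", "Out", "2018-03-27 08:55:00"),
    ("In", "Out", "2018-03-27 09:10:00"),
    ("Out", "In", "2018-03-27 12:30:00"),
]


def hourly_headcount(log_rows):
    """Return {HourBucket: NumEmployees} for every hour of each logged date."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE tblEmployeeLogging (fromState TEXT, toState TEXT, timeStamp TEXT)"
    )
    conn.executemany("INSERT INTO tblEmployeeLogging VALUES (?, ?, ?)", log_rows)
    conn.execute("CREATE TABLE tblHours (Number INTEGER)")
    conn.executemany("INSERT INTO tblHours VALUES (?)", [(h,) for h in range(24)])
    rows = conn.execute(
        """
        SELECT b.HourBucket,
               -- ELSE 0 keeps buckets with no earlier log rows at zero
               SUM(CASE el.fromState WHEN 'In' THEN 1 WHEN 'Out' THEN -1 ELSE 0 END)
        FROM (SELECT DISTINCT
                     datetime(date(t.timeStamp), '+' || h.Number || ' hours') AS HourBucket
              FROM tblEmployeeLogging t
              CROSS JOIN tblHours h) AS b
        LEFT JOIN tblEmployeeLogging el ON el.timeStamp <= b.HourBucket
        GROUP BY b.HourBucket
        ORDER BY b.HourBucket
        """
    ).fetchall()
    conn.close()
    return dict(rows)


headcount = hourly_headcount(SAMPLE_LOG)
print(headcount["2018-03-27 10:00:00"])
```

Each bucket sums every log row up to and including that hour, so this is a running total over the whole log rather than a per-hour delta.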
Original Answer (Based on a single over all count)
If you're happy to search over all rows, and want the current "head count" you can use this:
select
sum(case when t.FromState = 'In' then 1 else -1 end) as Heads
from
MyTable t
But if you know that there will always be no-one there at midnight, you can add a where clause to prevent it looking at more rows than it needs to:
where
date(t.timestamp) = curdate()
Again, on the assumption that the head count reaches zero at midnight, you can generalise that method to get a headcount at any time as follows:
where
date(t.timestamp) = "CENSUS DATE" AND
t.timestamp <= "CENSUS DATETIME"
Obviously you'd need to replace my quoted strings with code which returned the date and datetime of interest. If the headcount doesn't return to zero at midnight, you can achieve the same by removing the first line of the where clause.

mysql query to select all rows which satisfy a given input range and for multiple inputs

I need to find the number of rows which satisfy the given condition for a list of input values.
Suppose we have a table with columns id,open_date,close_date.
request table
id open_date close_date
1 '2013-04-08' '2013-04-10'
2 '2013-04-11' '2013-04-12'
1 '2013-04-09' '2013-04-12'
1 '2013-04-10' '2013-04-12'
Now I want the count of the rows which satisfy the condition for a given input x (x is one of the values from the input list) such that x > open_date and x < close_date.
This can be done for a single entry x of the list as below:
SELECT count(*) FROM request WHERE '2013-04-10' BETWEEN open_date AND close_date
But how can I do this for all x, where x is an entry from a list ('2013-04-10', '2013-04-12'), etc.?
Though I browsed many posts, I could not find an answer for this type of question. Can this be done in a single select query?
Not recommended: programmatically generate a massive query:
SELECT count(*) FROM request WHERE
'2013-04-10' BETWEEN open_date AND close_date OR
'2013-05-10' BETWEEN open_date AND close_date OR
....
'2015-11-4' BETWEEN open_date AND close_date
Note: this gives you a total. If you want a total for each date, you need to do it the second way.
Recommended, but a fair bit more complex: place the dates in a table with one column, join them against the requests, keep only the pairs that match, and then group by date to count them.
SELECT q.date, COUNT(*) FROM test_dates q, request r
WHERE q.date BETWEEN r.open_date AND r.close_date
GROUP BY q.date;
Demo sqlfiddle: http://sqlfiddle.com/#!9/474bb/3
Edit:
This variant will, as I think was actually requested, count the requests whose range includes any of the test dates. Note that it counts distinct id values, so it equals the number of rows only if id is unique:
SELECT COUNT(*) FROM
(SELECT DISTINCT r.id FROM test_dates q, request r
WHERE q.date BETWEEN r.open_date AND r.close_date) t;
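To see the per-date count on the sample rows from the question, here is a small sketch using Python's built-in sqlite3 module as a stand-in for MySQL. The test_dates table name comes from the answer; the helper function name is hypothetical, and the comma join is written as an explicit JOIN.

```python
import sqlite3

# The four request rows from the question and its two test dates.
SAMPLE_REQUESTS = [
    (1, "2013-04-08", "2013-04-10"),
    (2, "2013-04-11", "2013-04-12"),
    (1, "2013-04-09", "2013-04-12"),
    (1, "2013-04-10", "2013-04-12"),
]
SAMPLE_DATES = ["2013-04-10", "2013-04-12"]


def counts_per_date(requests, test_dates):
    """Count, for each test date, the request rows whose range contains it."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE request (id INTEGER, open_date TEXT, close_date TEXT)")
    conn.execute("CREATE TABLE test_dates (date TEXT)")
    conn.executemany("INSERT INTO request VALUES (?, ?, ?)", requests)
    conn.executemany("INSERT INTO test_dates VALUES (?)", [(d,) for d in test_dates])
    rows = conn.execute(
        """
        SELECT q.date, COUNT(*)
        FROM test_dates q
        JOIN request r ON q.date BETWEEN r.open_date AND r.close_date
        GROUP BY q.date
        ORDER BY q.date
        """
    ).fetchall()
    conn.close()
    return rows


print(counts_per_date(SAMPLE_REQUESTS, SAMPLE_DATES))
```

Because BETWEEN is inclusive, a date equal to open_date or close_date counts as a match. A test date that matches no row drops out of the result entirely; a LEFT JOIN with COUNT(r.id) would report it as 0 instead.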

MySQL Date in where clause

I have a table which contains date (Field Type: Date and Date Format: %Y-%m-%d) as a field. I need to select all the rows from the table for all the years whose date is not between Dec 3rd and Dec 24th.
The table contains month and day as a separate fields.
The result can be obtained by using the following query:
select * from mytable where date not in (select date from mytable where month=12 and day between 3 and 24);
But I'm trying to get the result in a single query like the one below, which returned no rows:
select * from mytable where date not between '%Y-12-03' and '%Y-12-24';
Can it be done in a single query like the above one?
SELECT *
FROM mytable
WHERE MONTH(`date`) <> 12
OR DAY(`date`) NOT BETWEEN 3 AND 24
;
This will give you every row that meets the requirements. I'm sure someone has a faster way of doing this, since this will ignore all indexes and will likely be slow on a large dataset, but it does work and return the data you require, so if no-one can suggest an improvement this will answer your question.
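A quick way to check the boundary dates is the sketch below. It uses Python's built-in sqlite3 module as a stand-in for MySQL, with strftime() playing the role of MONTH() and DAY(); the function name and the test dates are made up for illustration.

```python
import sqlite3


def rows_outside_dec_window(dates):
    """Return the dates that do NOT fall between Dec 3rd and Dec 24th of any year."""
    conn = sqlite3.connect(":memory:")
    conn.execute('CREATE TABLE mytable ("date" TEXT)')
    conn.executemany("INSERT INTO mytable VALUES (?)", [(d,) for d in dates])
    rows = conn.execute(
        """
        SELECT "date"
        FROM mytable
        WHERE CAST(strftime('%m', "date") AS INTEGER) <> 12
           OR CAST(strftime('%d', "date") AS INTEGER) NOT BETWEEN 3 AND 24
        ORDER BY "date"
        """
    ).fetchall()
    conn.close()
    return [r[0] for r in rows]


print(rows_outside_dec_window(
    ["2012-12-02", "2012-12-03", "2012-12-24", "2012-12-25", "2013-06-10", "2013-12-10"]
))
```

The OR is what makes this work: a row is kept if it is outside December, or if it is in December but outside days 3 to 24.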

MS ACCESS Combining two result sets

SELECT c.siteno, a.sitename, a.location, Count(a.status) AS ChargeablePermit
FROM (PermitStatus AS a LEFT JOIN states AS b ON a.status = b.statusheading)
LEFT JOIN Sitedetails AS c ON a.zone = c.compexzone
WHERE b.statusheading like "Chargeable" and a.loaded_date between
(select monthstart from ChargeDate) and (select Monthend from ChargeDate)
GROUP BY a.sitename, c.siteno, a.location;
This query returns me the count of chargeable permits by site
Mar14
Siteno (1) Sitename (site1) Location (location1) Chargeablepermit (30)
These calculations are based on the period determined by the two subselects (i.e. for the month of March 14).
I was wondering if I could change the date range covered by the subselects (i.e. to April 14), subtract one count of chargeable permits from the other across the two result sets, and have that result displayed in one table.
For instance, if April 14 was
April
Siteno (1) Sitename (Site1) Location (Location1) ChargeablePermit (40) Difference (10)
Not in the way you seem to be proposing. You would simply double up your SQL within a UNION query to return the data sets for the 2 periods, and then perform an aggregate on the results:
SELECT SUM(CP) FROM (
SELECT (ChargeablePermit * -1) AS CP FROM ... WHERE dates = Date1
UNION ALL
SELECT ChargeablePermit AS CP FROM ... WHERE dates = Date2
)
Depending on how many records you're dealing with, a UNION like this could be quite slow however. So the other approach would be to turn your SQL into an Append query which inserts the output into a temp table. You would run the query for each period, before running a 2nd query to aggregate the results from the temp table.
Also you should consider using joins to filter your results rather than subqueries.

MySQL Query - Include dates without records

I have a report that displays a graph. The X axis uses the date from the below query. Where the query returns no date, I am getting gaps and would prefer to return a value. Is there any way to force a date where there are no records?
SELECT
DATE(instime),
CASE
WHEN direction = 1 AND duration > 0 THEN 'Incoming'
WHEN direction = 2 THEN 'Outgoing'
WHEN direction = 1 AND duration = 0 THEN 'Missed'
END AS type,
COUNT(*)
FROM taxticketitem
GROUP BY
DATE(instime),
CASE
WHEN direction = 1 AND duration > 0 THEN 'Incoming'
WHEN direction = 2 THEN 'Outgoing'
WHEN direction = 1 AND duration = 0 THEN 'Missed'
END
ORDER BY DATE(instime)
One possible way is to create a table of dates and LEFT JOIN your table with them. The table could look something like this:
CREATE TABLE `datelist` (
`date` DATE NOT NULL,
PRIMARY KEY (`date`)
);
and filled with all dates between, say Jan-01-2000 through Dec-31-2050 (here is my Date Generator script).
Next, write your query like this:
SELECT datelist.date, COUNT(taxticketitem.id) AS c
FROM datelist
LEFT JOIN taxticketitem ON datelist.date = DATE(taxticketitem.instime)
WHERE datelist.date BETWEEN '2012-01-01' AND '2012-12-31'
GROUP BY datelist.date
ORDER BY datelist.date
Using a LEFT JOIN and counting the non-NULL values from the right table ensures that the count is correct (0 if no row exists for a given date).
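As a rough illustration of the date-table approach, the sketch below uses Python's built-in sqlite3 module as a stand-in for MySQL. It fills datelist from a Python loop rather than a generator script, and the function name and sample timestamps are assumptions made for the example.

```python
import sqlite3
from datetime import date, timedelta


def daily_counts(instimes, start, end):
    """Count rows per day between start and end (inclusive), with 0 for empty days."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE datelist (date TEXT NOT NULL PRIMARY KEY)")
    conn.execute("CREATE TABLE taxticketitem (id INTEGER PRIMARY KEY, instime TEXT)")
    # Fill the calendar table with one row per day in the requested range.
    day = date.fromisoformat(start)
    last = date.fromisoformat(end)
    while day <= last:
        conn.execute("INSERT INTO datelist VALUES (?)", (day.isoformat(),))
        day += timedelta(days=1)
    conn.executemany(
        "INSERT INTO taxticketitem (instime) VALUES (?)", [(t,) for t in instimes]
    )
    rows = conn.execute(
        """
        SELECT datelist.date, COUNT(taxticketitem.id) AS c
        FROM datelist
        LEFT JOIN taxticketitem ON datelist.date = DATE(taxticketitem.instime)
        WHERE datelist.date BETWEEN ? AND ?
        GROUP BY datelist.date
        ORDER BY datelist.date
        """,
        (start, end),
    ).fetchall()
    conn.close()
    return rows


print(daily_counts(
    ["2012-01-01 10:00:00", "2012-01-01 11:00:00", "2012-01-03 09:00:00"],
    "2012-01-01",
    "2012-01-04",
))
```

COUNT(taxticketitem.id) rather than COUNT(*) is the important detail: the unmatched days still produce one joined row each, but its id is NULL and so is not counted.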
You would need a set of dates to LEFT JOIN your table to. Unfortunately, MySQL before version 8.0 lacks a built-in way to generate one on the fly (8.0 added recursive common table expressions, which can do it).
You would need to prepare a table with, say, 100000 consecutive integers from 0 to 99999 (or how long you think your maximum report range would be):
CREATE TABLE series (number INT NOT NULL PRIMARY KEY);
and use it like this:
SELECT '2013-01-01' + INTERVAL number DAY AS r_date, CASE ... END AS type, COUNT(instime)
FROM series s
LEFT JOIN
taxticketitem ti
ON ti.instime >= '2013-01-01' + INTERVAL number DAY
AND ti.instime < '2013-01-01' + INTERVAL number + 1 DAY
WHERE s.number <= DATEDIFF('2013-02-01', '2013-01-01')
GROUP BY
r_date, type
Had to do something similar before.
You need to have a subselect to generate a range of dates. All the dates you want. Easiest with a start date added to a number:-
SELECT DATE_ADD(SomeStartDate, INTERVAL (a.i + b.i * 10) DAY)
FROM integers a, integers b
Given a table called integers with a single column called i and 10 rows containing 0 to 9, that SQL will give you a range of 100 days starting at SomeStartDate.
You can then left join your actual data against that to get the full range.
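The digit cross join can be checked with a short sketch. It uses Python's built-in sqlite3 module as a stand-in for MySQL, with SQLite's date() modifier in place of DATE_ADD; the function name is hypothetical and the start date is passed in as a parameter.

```python
import sqlite3


def hundred_days(start):
    """Generate 100 consecutive dates from start by cross joining the digits 0 to 9."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE integers (i INTEGER)")
    conn.executemany("INSERT INTO integers VALUES (?)", [(n,) for n in range(10)])
    rows = conn.execute(
        """
        SELECT date(?, '+' || (a.i + b.i * 10) || ' days') AS d
        FROM integers a, integers b
        ORDER BY a.i + b.i * 10
        """,
        (start,),
    ).fetchall()
    conn.close()
    return [r[0] for r in rows]


days = hundred_days("2013-01-01")
print(days[0], days[-1], len(days))
```

Here a supplies the units digit and b the tens digit, so the 10 x 10 combinations cover offsets 0 to 99 exactly once. Adding a third alias multiplied by 100 would extend the range to 1000 days.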