Average and count with conditions - mysql

The table name is data.
Columns: 'date', 'location', 'fp', 'TV'
The date column holds many different dates, and each date appears on a number of rows. The same is true of location.
I am trying to work out the average of TV for each combination of date and location, counting only the rows where fp = 1, and insert the result into a new column called avgdiff.
So I might have a number of rows with the date 2016-12-08 and location LA, with different numbers under fp and TV. When the date is 2016-12-08 and the location is LA, fp might equal 1 on 4 rows, and TV for those 4 rows might be 7.4, 8.2, 1, -2. So the average will be 3.65.
I think I need to use the avg and count functions with conditions, but I am having a lot of trouble with this. I hope this makes sense.
Thanks

You can query for the average using a GROUP BY:
SELECT `date`, `location`, AVG(`TV`) AS `avgtv`
FROM `data`
WHERE `fp` = 1
GROUP BY `date`, `location`
To update another table with your computed averages (which I strongly recommend against), you can use an UPDATE...JOIN with the above as a subquery:
UPDATE ratings r
JOIN ( /* paste above query here */ ) t
ON t.date = r.date AND t.location = r.location
SET r.avgtv = t.avgtv
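As a quick check, the GROUP BY query from this answer can be run against the numbers in the question. This is only a sketch: SQLite (through Python's sqlite3 module) stands in for MySQL so it runs without a server, and the rows other than the four LA values from the question are made up.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; the SELECT itself is unchanged.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE data (date TEXT, location TEXT, fp INTEGER, TV REAL)")
conn.executemany(
    "INSERT INTO data VALUES (?, ?, ?, ?)",
    [
        ("2016-12-08", "LA", 1, 7.4),   # the four fp = 1 rows from the question
        ("2016-12-08", "LA", 1, 8.2),
        ("2016-12-08", "LA", 1, 1.0),
        ("2016-12-08", "LA", 1, -2.0),
        ("2016-12-08", "LA", 2, 50.0),  # hypothetical row, filtered out by fp = 1
        ("2016-12-09", "NY", 1, 4.0),   # hypothetical second group
    ],
)

rows = conn.execute(
    """
    SELECT date, location, AVG(TV) AS avgtv
    FROM data
    WHERE fp = 1
    GROUP BY date, location
    ORDER BY date, location
    """
).fetchall()

# Map (date, location) -> average, rounded for display.
averages = {(d, loc): round(avg, 2) for d, loc, avg in rows}
print(averages)
```

The fp = 2 row does not affect the LA average because the WHERE clause removes it before grouping.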

If, for any reason, you cannot avoid storing aggregated data in the same table (thereby introducing redundancy and possibly incorrect/not up to date values), do an update statement of the following form:
update data,
(select t2.location, t2.date, avg(t2.TV) as avgTV2
from data t2
where t2.fp = 1
group by t2.location, t2.date) aggValues
set avgTV = avgTV2
where data.location = aggValues.location
and data.date = aggValues.date
and data.fp = 1

Related

Water reading. Subtract latest reading from previous based on dates selected

I have a table of water readings. I need to know how much was consumed by each property between 2 given dates.
Below is what my data looks like.
How can I create a mysql query which will
ask for the dates (say between 08/12/2021 and 08/16/2021) to get the difference in readings
compute the difference
enter the results in a new column "consumed between 8/12 and 8/16".
I am very new to mysql and your assistance is highly appreciated.
Thanks
In this query, you can replace your start and end dates with your application variables or stored procedure parameters.
Find my step-by-step queries here -> db<>fiddle
Try this -
SET @v_start_date = '2021-08-12';
SET @v_end_date = '2021-08-16';
SELECT
    wr_start_table.Unit_No,
    wr_start_table.Date,
    wr_start_table.Reading,
    wr_end_table.Date,
    wr_end_table.Reading,
    wr_end_table.Reading - wr_start_table.Reading AS difference_in_reading
FROM
    (
        -- first reading on or after the start date, per unit
        SELECT wr_start_unit.*
        FROM water_reading wr_start_unit
        INNER JOIN (
            SELECT wr.Unit_No, MIN(wr.Date) AS start_date
            FROM water_reading wr
            WHERE wr.Date >= @v_start_date
            GROUP BY wr.Unit_No
        ) AS wr_start
            ON wr_start.Unit_No = wr_start_unit.Unit_No
            AND wr_start.start_date = wr_start_unit.Date
    ) AS wr_start_table,
    (
        -- last reading on or before the end date, per unit
        SELECT wr_end_unit.*
        FROM water_reading wr_end_unit
        INNER JOIN (
            SELECT wr.Unit_No, MAX(wr.Date) AS end_date
            FROM water_reading wr
            WHERE wr.Date <= @v_end_date
            GROUP BY wr.Unit_No
        ) AS wr_end
            ON wr_end.Unit_No = wr_end_unit.Unit_No
            AND wr_end.end_date = wr_end_unit.Date
    ) AS wr_end_table
WHERE
    wr_start_table.Unit_No = wr_end_table.Unit_No;
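To see the idea on a small scale, here is a sketch that picks the first reading on or after the start date and the last reading on or before the end date for each unit, then subtracts. It is not the query above verbatim: it uses correlated subqueries instead of the two derived tables, SQLite stands in for MySQL, and the sample readings are invented because the original data was not shown.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; the readings below are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE water_reading (Unit_No TEXT, Date TEXT, Reading REAL)")
conn.executemany(
    "INSERT INTO water_reading VALUES (?, ?, ?)",
    [
        ("A1", "2021-08-10", 100.0),
        ("A1", "2021-08-12", 110.0),
        ("A1", "2021-08-16", 125.0),
        ("B2", "2021-08-13", 40.0),
        ("B2", "2021-08-15", 47.5),
        ("B2", "2021-08-20", 60.0),
    ],
)

QUERY = """
SELECT s.Unit_No, s.Date, s.Reading, e.Date, e.Reading,
       e.Reading - s.Reading AS difference_in_reading
FROM water_reading s
JOIN water_reading e ON e.Unit_No = s.Unit_No
WHERE s.Date = (SELECT MIN(Date) FROM water_reading
                WHERE Unit_No = s.Unit_No AND Date >= :start)
  AND e.Date = (SELECT MAX(Date) FROM water_reading
                WHERE Unit_No = e.Unit_No AND Date <= :end)
ORDER BY s.Unit_No
"""

rows = conn.execute(QUERY, {"start": "2021-08-12", "end": "2021-08-16"}).fetchall()

# Map unit -> consumption between the two boundary readings.
consumption = {row[0]: row[5] for row in rows}
print(consumption)
```

Unit B2 has no reading exactly on either date, so the nearest readings inside the range (08-13 and 08-15) are used, which mirrors the MIN/MAX logic of the answer.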

How do I SELECT a MySQL Table value that has not been updated on a given date?

I have a MySQL database named mydb in which I store daily share prices for
423 companies in a table named data. Table data has the following columns:
`epic`, `date`, `open`, `high`, `low`, `close`, `volume`
epic and date being primary key pairs.
I update the data table each day using a csv file which would normally have 423 rows
of data all having the same date. However, on some days prices may not be available
for all 423 companies and data for a particular epic and date pair will
not be updated. In order to determine the missing pair I have resorted
to comparing a full list of epics against the incomplete list of epics using
two simple SELECT queries with different dates and then using a file comparator, thus
revealing the missing epic(s). This is not a very satisfactory solution and so far
I have not been able to construct a query that would identify any epics that
have not been updated for any particular day.
SELECT `epic`, `date` FROM `data`
WHERE `date` IN ('2019-05-07', '2019-05-08')
ORDER BY `epic`, `date`;
Produces pairs of values:
`epic` `date`
"3IN" "2019-05-07"
"3IN" "2019-05-08"
"888" "2019-05-07"
"888" "2019-05-08"
"AA." "2019-05-07"
"AAL" "2019-05-07"
"AAL" "2019-05-08"
Where in this case AA. has not been updated on 2019-05-08. The problem with this is that it is not easy to spot a value that is not a pair.
Any help with this problem would be greatly appreciated.
You could do a COUNT on epic with a GROUP BY epic for the rows in that date range, then select from that result where UpdateCount is less than 2. Forgive me if the syntax on the column names is not quite right, as I work in SQL Server, but the logic of the query should still work for you.
SELECT x.epic
FROM
(
SELECT COUNT(*) AS UpdateCount, epic
FROM data
WHERE date IN ('2019-05-07', '2019-05-08')
GROUP BY epic
) AS x
WHERE x.UpdateCount < 2
Assuming you only want to check the last date uploaded, the following will return every item not updated on 2019-05-08:
SELECT last_updated.epic, last_updated.`date`
FROM (
SELECT epic, MAX(`date`) AS `date` FROM `data`
GROUP BY epic
) AS last_updated
WHERE last_updated.`date` <> '2019-05-08'
ORDER BY last_updated.epic
;
Or, for any upload date, the following compares against the entire table, so you don't rely on '2019-05-07' having a row for every epic. That is, if the epic has ever been in the table, it will be listed when it was not updated on the given date:
SELECT d.epic, max(d.date)
FROM data as d
WHERE d.epic NOT IN (
SELECT d2.epic
FROM data as d2
WHERE d2.date = '2019-05-08'
)
GROUP BY d.epic
ORDER BY d.epic
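A small sketch of the NOT IN query, run against the pairs listed in the question. SQLite stands in for MySQL here, and only the epic and date columns are created because the query uses nothing else.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; only the columns the query needs exist.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE data (epic TEXT, date TEXT, PRIMARY KEY (epic, date))")
conn.executemany(
    "INSERT INTO data VALUES (?, ?)",
    [
        ("3IN", "2019-05-07"), ("3IN", "2019-05-08"),
        ("888", "2019-05-07"), ("888", "2019-05-08"),
        ("AA.", "2019-05-07"),                      # no row for 2019-05-08
        ("AAL", "2019-05-07"), ("AAL", "2019-05-08"),
    ],
)

check_date = "2019-05-08"
missing = conn.execute(
    """
    SELECT d.epic, MAX(d.date)
    FROM data AS d
    WHERE d.epic NOT IN (
        SELECT d2.epic FROM data AS d2 WHERE d2.date = ?
    )
    GROUP BY d.epic
    ORDER BY d.epic
    """,
    (check_date,),
).fetchall()

# Each tuple is (epic, last date it was updated).
print(missing)
```

The second column shows the last date each missing epic was updated, which helps tell a one-day gap from a company that dropped out long ago.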

Generating complex sql tables

I currently have an employee logging sql table that has 3 columns
fromState: String,
toState: String,
timestamp: DateTime
fromState is either In or Out. In means employee came in and Out means employee went out. Each row can only transition from In to Out or Out to In.
I'd like to generate a temporary table in sql that tracks, hour by hour, how many employees are in the company. That is, the resulting table has the columns HourBucket and NumEmployees.
In non-SQL code I can do this by initializing numEmployees to 0, going through the table row by row (sorted by timestamp), and adding (employee came in) or subtracting (went out) from numEmployees (bucketed by timestamp hour).
I'm clueless as to how to do this in SQL. Any clues?
Use a COUNT ... GROUP BY query. I can't see from your description what you're using toState for, though! I'm also assuming you have an employeeID field.
E.g.
SELECT fromState AS 'Status', COUNT(*) AS 'Number'
FROM StaffinBuildingTable
INNER JOIN (SELECT employeeID AS 'empID', MAX(timestamp) AS 'latest' FROM StaffinBuildingTable GROUP BY employeeID) AS LastEntry ON StaffinBuildingTable.employeeID = LastEntry.empID AND StaffinBuildingTable.timestamp = LastEntry.latest
GROUP BY fromState
The LastEntry subquery will produce a list of employeeIDs limited to the last timestamp for each employee.
The INNER JOIN will limit the main table to just the employeeIDs that match both sides.
The outer GROUP BY produces the count.
SELECT HOUR(SBT.timestamp) AS 'Hour', SBT.fromState AS 'Status', COUNT(*) AS 'Number'
FROM StaffinBuildingTable AS SBT
INNER JOIN (
SELECT SBIJ.employeeID AS 'empID', MAX(SBIJ.timestamp) AS 'latest'
FROM StaffinBuildingTable AS SBIJ
WHERE DATE(SBIJ.timestamp) = CURDATE()
GROUP BY SBIJ.employeeID) AS LastEntry ON SBT.employeeID = LastEntry.empID AND SBT.timestamp = LastEntry.latest
GROUP BY SBT.fromState, HOUR(SBT.timestamp)
Replace CURDATE() with whatever date you are interested in.
Note this is non-optimal as it calculates the HOUR twice - once for the data and once for the group.
Again you are using the INNER JOIN to limit the number of returned rows, this time to the last timestamp on a given day.
To me your description of FromState and ToState seems the wrong way round; I'd expect to be doing this based on ToState. But assuming I'm wrong on that, the following should point you in the right direction:
First, I create a "Numbers" table containing 24 rows one for each hour of the day:
create table tblHours
(Number int);
insert into tblHours values
(0),(1),(2),(3),(4),(5),(6),(7),
(8),(9),(10),(11),(12),(13),(14),(15),
(16),(17),(18),(19),(20),(21),(22),(23);
Then for each date in your employee logging table, I create a row in another new table to contain your counts:
create table tblDailyHours
(
HourBucket datetime,
NumEmployees int
);
insert into tblDailyHours (HourBucket, NumEmployees)
select distinct
date_add(date(t.timeStamp), interval h.Number HOUR) as HourBucket,
0 as NumEmployees
from
tblEmployeeLogging t
CROSS JOIN tblHours h;
Then I update this table to contain all the relevant counts:
update tblDailyHours h
join
(select
h2.HourBucket,
sum(case when el.fromState = 'In' then 1 else -1 end) as cnt
from
tblDailyHours h2
join tblEmployeeLogging el on
h2.HourBucket >= el.timeStamp
group by h2.HourBucket
) cnt ON
h.HourBucket = cnt.HourBucket
set NumEmployees = cnt.cnt;
You can now retrieve the counts with
select *
from tblDailyHours
order by HourBucket;
The counts give the number on site at each of the times displayed. If you want the number present during the hour in question, we'd need to tweak this a little.
There is a working version of this code (using not very realistic data in the logging table) here: rextester.com/DYOR23344
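The cumulative idea behind the update (every hour bucket sums +1 or -1 over all log rows at or before it) can be tried out with a sketch like the one below. It collapses the insert and update steps into one SELECT, SQLite stands in for MySQL, the log rows are invented, and the toState column is left out because the count does not use it.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; all log rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tblEmployeeLogging (fromState TEXT, timeStamp TEXT)")
conn.executemany(
    "INSERT INTO tblEmployeeLogging VALUES (?, ?)",
    [
        ("In", "2020-01-01 08:15:00"),
        ("In", "2020-01-01 08:45:00"),
        ("Out", "2020-01-01 09:30:00"),
        ("In", "2020-01-01 10:05:00"),
        ("Out", "2020-01-01 17:10:00"),
        ("Out", "2020-01-01 17:20:00"),
    ],
)

# One bucket per hour of the day, like tblHours crossed with the log dates.
conn.execute("CREATE TABLE tblDailyHours (HourBucket TEXT)")
conn.executemany(
    "INSERT INTO tblDailyHours VALUES (?)",
    [(f"2020-01-01 {h:02d}:00:00",) for h in range(24)],
)

rows = conn.execute(
    """
    SELECT h.HourBucket,
           SUM(CASE el.fromState WHEN 'In' THEN 1
                                 WHEN 'Out' THEN -1
                                 ELSE 0 END) AS NumEmployees
    FROM tblDailyHours h
    LEFT JOIN tblEmployeeLogging el ON h.HourBucket >= el.timeStamp
    GROUP BY h.HourBucket
    ORDER BY h.HourBucket
    """
).fetchall()

# Map bucket -> head count at that instant.
head_count = dict(rows)
for bucket in ("2020-01-01 08:00:00", "2020-01-01 09:00:00", "2020-01-01 10:00:00"):
    print(bucket, head_count[bucket])
```

The explicit ELSE 0 matters with a LEFT JOIN: buckets with no earlier log rows produce a NULL fromState, and a plain ELSE -1 would count that as someone leaving.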
Original Answer (Based on a single over all count)
If you're happy to search over all rows, and want the current "head count" you can use this:
select
sum(case when t.FromState = 'In' then 1 else -1 end) as Heads
from
MyTable t
But if you know that there will always be no-one there at midnight, you can add a where clause to prevent it looking at more rows than it needs to:
where
date(t.timestamp) = curdate()
Again, on the assumption that the head count reaches zero at midnight, you can generalise that method to get a headcount at any time as follows:
where
date(t.timestamp) = "CENSUS DATE" AND
t.timestamp <= "CENSUS DATETIME"
Obviously you'd need to replace my quoted strings with code which returned the date and datetime of interest. If the headcount doesn't return to zero at midnight, you can achieve the same by removing the first line of the where clause.

MS ACCESS Combining two result sets

SELECT c.siteno, a.sitename, a.location, Count(a.status) AS ChargeablePermit
FROM (PermitStatus AS a LEFT JOIN states AS b ON a.status = b.statusheading)
LEFT JOIN Sitedetails AS c ON a.zone = c.compexzone
WHERE b.statusheading like "Chargeable" and a.loaded_date between
(select monthstart from ChargeDate) and (select Monthend from ChargeDate)
GROUP BY a.sitename, c.siteno, a.location;
This query returns the count of chargeable permits by site:
Mar14
Siteno (1) Sitename (site1) Location (location1) Chargeablepermit (30)
These calculations are based on the period determined by the two subselects (i.e. the month of March 14).
I was wondering if I could change the date range covered by the subselects (i.e. to April 14), subtract one count of chargeable permits from the other across the two result sets, and have that result displayed in one table.
For instance, if April 14 was
April
Siteno (1) Sitename (Site1) Location (Location1) ChargeablePermit (40) Difference (10)
Not in the way it seems you are proposing. Instead, you would double up your SQL within a UNION query to return the data sets for the 2 periods, and then perform an aggregate on the results:
SELECT SUM(CP) FROM (
SELECT (ChargeablePermit * -1) AS CP FROM ... WHERE dates = Date1
UNION ALL
SELECT ChargeablePermit AS CP FROM ... WHERE dates = Date2
)
Depending on how many records you're dealing with, a UNION like this could be quite slow however. So the other approach would be to turn your SQL into an Append query which inserts the output into a temp table. You would run the query for each period, before running a 2nd query to aggregate the results from the temp table.
Also you should consider using joins to filter your results rather than subqueries.
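A rough sketch of the UNION ALL approach: the earlier period's counts are negated, the later period's counts are kept, and the outer SUM gives the difference per site. SQLite stands in for Access here, and the permits table with its siteno and loaded_date columns is a made-up simplification of the joined tables in the question.

```python
import sqlite3

# Assumption: SQLite stands in for Access; "permits" is a hypothetical,
# already-filtered table with one row per chargeable permit.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE permits (siteno INTEGER, loaded_date TEXT)")
sample = (
    [(1, f"2014-03-{d:02d}") for d in (3, 10, 17)]            # site 1, March: 3
    + [(1, f"2014-04-{d:02d}") for d in (1, 8, 15, 22, 29)]   # site 1, April: 5
    + [(2, f"2014-03-{d:02d}") for d in (5, 6)]               # site 2, March: 2
    + [(2, "2014-04-09")]                                      # site 2, April: 1
)
conn.executemany("INSERT INTO permits VALUES (?, ?)", sample)

rows = conn.execute(
    """
    SELECT siteno, SUM(CP) AS Difference
    FROM (
        SELECT siteno, -COUNT(*) AS CP
        FROM permits
        WHERE loaded_date BETWEEN :p1_start AND :p1_end
        GROUP BY siteno
        UNION ALL
        SELECT siteno, COUNT(*) AS CP
        FROM permits
        WHERE loaded_date BETWEEN :p2_start AND :p2_end
        GROUP BY siteno
    ) AS both_periods
    GROUP BY siteno
    ORDER BY siteno
    """,
    {
        "p1_start": "2014-03-01", "p1_end": "2014-03-31",
        "p2_start": "2014-04-01", "p2_end": "2014-04-30",
    },
).fetchall()

# Map site -> (later period count) minus (earlier period count).
difference = dict(rows)
print(difference)
```

A negative difference means the site had fewer chargeable permits in the later period.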

MySQL Query - Include dates without records

I have a report that displays a graph. The X axis uses the date from the below query. Where the query returns no date, I am getting gaps and would prefer to return a value. Is there any way to force a date where there are no records?
SELECT
DATE(instime),
CASE
WHEN direction = 1 AND duration > 0 THEN 'Incoming'
WHEN direction = 2 THEN 'Outgoing'
WHEN direction = 1 AND duration = 0 THEN 'Missed'
END AS type,
COUNT(*)
FROM taxticketitem
GROUP BY
DATE(instime),
CASE
WHEN direction = 1 AND duration > 0 THEN 'Incoming'
WHEN direction = 2 THEN 'Outgoing'
WHEN direction = 1 AND duration = 0 THEN 'Missed'
END
ORDER BY DATE(instime)
One possible way is to create a table of dates and LEFT JOIN your table with them. The table could look something like this:
CREATE TABLE `datelist` (
`date` DATE NOT NULL,
PRIMARY KEY (`date`)
);
and filled with all dates between, say Jan-01-2000 through Dec-31-2050 (here is my Date Generator script).
Next, write your query like this:
SELECT datelist.date, COUNT(taxticketitem.id) AS c
FROM datelist
LEFT JOIN taxticketitem ON datelist.date = DATE(taxticketitem.instime)
WHERE datelist.date BETWEEN '2012-01-01' AND '2012-12-31'
GROUP BY datelist.date
ORDER BY datelist.date
Using a LEFT JOIN and counting only the non-NULL values from the right table ensures that the count is correct (0 if no row exists for a given date).
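Here is a small sketch of the date-table approach with a five-day range. SQLite stands in for MySQL, the datelist table is filled from Python rather than from a generator script, and the ticket rows are invented.

```python
import sqlite3
from datetime import date, timedelta

# Assumption: SQLite stands in for MySQL; the ticket rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE datelist (date TEXT NOT NULL PRIMARY KEY)")
conn.execute("CREATE TABLE taxticketitem (id INTEGER PRIMARY KEY, instime TEXT)")

# Fill the calendar table with five consecutive days.
start = date(2012, 1, 1)
conn.executemany(
    "INSERT INTO datelist VALUES (?)",
    [((start + timedelta(days=n)).isoformat(),) for n in range(5)],
)
conn.executemany(
    "INSERT INTO taxticketitem (instime) VALUES (?)",
    [
        ("2012-01-01 09:00:00",),
        ("2012-01-01 14:30:00",),
        ("2012-01-03 11:15:00",),
    ],
)

rows = conn.execute(
    """
    SELECT datelist.date, COUNT(taxticketitem.id) AS c
    FROM datelist
    LEFT JOIN taxticketitem ON datelist.date = DATE(taxticketitem.instime)
    WHERE datelist.date BETWEEN '2012-01-01' AND '2012-01-05'
    GROUP BY datelist.date
    ORDER BY datelist.date
    """
).fetchall()

for day, count in rows:
    print(day, count)
```

COUNT(taxticketitem.id) skips the NULL produced for days without a match, whereas COUNT(*) would report 1 for those days.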
You would need a set of dates to LEFT JOIN your table to. Unfortunately, MySQL versions before 8.0 (which added recursive CTEs) lack a way to generate one on the fly.
You would need to prepare a table with, say, 100000 consecutive integers from 0 to 99999 (or how long you think your maximum report range would be):
CREATE TABLE series (number INT NOT NULL PRIMARY KEY);
and use it like this:
SELECT '2013-01-01' + INTERVAL s.number DAY AS r_date, CASE ... END AS type, COUNT(ti.instime)
FROM series s
LEFT JOIN
taxticketitem ti
ON ti.instime >= '2013-01-01' + INTERVAL s.number DAY
AND ti.instime < '2013-01-01' + INTERVAL s.number + 1 DAY
WHERE s.number <= DATEDIFF('2013-02-01', '2013-01-01')
GROUP BY
r_date, type
Had to do something similar before.
You need a subselect to generate a range of dates covering all the dates you want. The easiest way is a start date added to a number:
SELECT DATE_ADD(SomeStartDate, INTERVAL (a.i + b.i * 10) DAY)
FROM integers a, integers b
Given a table called integers with a single column called i and 10 rows containing 0 to 9, that SQL will give you a range of 100 days starting at SomeStartDate.
You can then left join your actual data against that to get the full range.
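The cross join of the integers table with itself can be sketched as follows. SQLite stands in for MySQL, so its date() function with a '+N days' modifier replaces DATE_ADD, and the start date is an arbitrary choice for illustration.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; date(x, '+N days') replaces DATE_ADD.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE integers (i INTEGER PRIMARY KEY)")
conn.executemany("INSERT INTO integers VALUES (?)", [(n,) for n in range(10)])

some_start_date = "2013-01-01"  # hypothetical start date
rows = conn.execute(
    """
    SELECT date(?, '+' || (a.i + b.i * 10) || ' days') AS d
    FROM integers a, integers b
    ORDER BY a.i + b.i * 10
    """,
    (some_start_date,),
).fetchall()

# a.i supplies the units digit and b.i the tens digit, giving offsets 0..99.
dates = [r[0] for r in rows]
print(len(dates), dates[0], dates[-1])
```

Adding a third alias (c.i * 100) would stretch the range to 1000 days with the same ten-row table.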