MySQL select avg reading every hour even if there is no reading - mysql

I'm having a hard time making a MySQL statement from a Postgres one for a project we are migrating. I won't give the exact use case since it's pretty involved, but I can create a simple comparable situation.
We have a graphing tool that needs fairly raw output of our data in hourly intervals. In Postgres, the SQL would generate a series of dates and hours over a time span, then join a query against that series to get the average for each date and hour. That way we were able to get, for example, the average sales by hour, even if that number is 0.
Here's a table example:
Sales
datetime | sale
2017-12-05 08:34:00 | 10
2017-12-05 08:52:00 | 20
2017-12-05 09:15:00 | 5
2017-12-05 10:22:00 | 10
2017-12-05 10:49:00 | 10
Where something like
SELECT DATE_FORMAT(s.datetime,'%Y%m%d%H') as "byhour", AVG(s.sale) as "avg sales" FROM sales s GROUP BY byhour
would produce
byhour | avg sales
2017120508 | 10
2017120509 | 5
2017120510 | 10
I'd like something that gives me the last 24 hours, even the 0/NULL values like
byhour | avg sales
2017120501 | null
2017120502 | null
2017120503 | null
2017120504 | null
2017120505 | null
2017120506 | null
2017120507 | null
2017120508 | 10
2017120509 | 5
2017120510 | 10
...
2017120600 | null
Does anyone have any ideas how I could do this in MySQL?

Join the result to a table that you know contains all the desired hours.
Something like this:
SELECT
* FROM (
SELECT
DATE_FORMAT(h.datetime, '%Y%m%d%H') AS 'byhour'
FROM
table_that_has_hours h
GROUP BY byhour) hours LEFT OUTER JOIN (
SELECT
DATE_FORMAT(s.datetime, '%Y%m%d%H') AS 'byhour',
AVG(s.sale) AS 'avg sales'
FROM
sales s
GROUP BY byhour) your_stuff ON your_stuff.byhour = hours.byhour
If you don't have a table like that, you can create one, like this:
CREATE TABLE ref (h INT);
INSERT INTO ref (h)
VALUES (0),(1),(2),(3),(4),(5),(6),(7),(8),(9),(10),(11),
(12),(13),(14),(15),(16),(17),(18),(19),(20),(21),(22),(23);
You can then turn each value into an hour, for example by adding it as an hour offset to DATE(NOW()), and format the result with DATE_FORMAT(..., '%Y%m%d%H').
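For the "last 24 hours" case from the question, the ref table can be turned into a series of hour buckets and left-joined to sales. This is a minimal sketch rather than a tested drop-in: it assumes ref holds the values 0 through 23 as created above and that the sales columns are named datetime and sale as in the question. The names hrs and hour_start are made up for illustration.

```sql
-- Build 24 hour buckets ending with the current hour, then attach sales.
SELECT DATE_FORMAT(hrs.hour_start, '%Y%m%d%H') AS byhour,
       AVG(s.sale)                              AS avg_sales  -- NULL when an hour has no rows
FROM (
    -- Truncate NOW() to the top of the hour, then step back r.h hours.
    SELECT CAST(DATE_FORMAT(NOW(), '%Y-%m-%d %H:00:00') AS DATETIME)
           - INTERVAL r.h HOUR AS hour_start
    FROM ref r
) hrs
LEFT JOIN sales s
       ON s.datetime >= hrs.hour_start
      AND s.datetime <  hrs.hour_start + INTERVAL 1 HOUR
GROUP BY hrs.hour_start
ORDER BY hrs.hour_start;
```

Joining on a half-open time range instead of on a formatted string should also leave an index on sales.datetime usable, if one exists.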

Related

Getting records by date is multiples of 30 days

I have the following query to get appointments that need a reminder once a month if they are not done yet. I want to get records that are 30, 60, 90, 120, etc. days in the past from the current date.
SELECT
a.*
FROM
appointments a
WHERE
DATEDIFF(CURDATE(), a.appointment_date) % 30 = 0
Is there another way to achieve this without using DATEDIFF? I want to improve the performance of this query.
OK, let's put the dates and date diffs aside for a moment. Looking at the question, the person is trying to find all appointments in the past that don't necessarily have another one in the future, such as a follow-up appointment with a doctor: "Come back in a month to see how things have changed". This makes me think there is probably a patient ID in the appointments table. So the question probably becomes: look at appointments from 30, 60, or 90 days ago and see whether a corresponding appointment is scheduled in the future. If one is already scheduled, the patient does not need a reminder call to get into the office.
That said, I would start a bit differently: get all patients that DID have an appointment within the last 90 days, and see whether or not they already have a follow-up appointment on the schedule. This way, the office person can contact those patients to get them on the calendar.
Start by getting the most recent appointment for each patient within the last 90 days. If someone had an appointment 90 days ago and a follow-up 59 days ago, then they probably only care about the most recent appointment, to make sure THAT one has a follow-up.
select
a1.patient_id,
max( a1.appointment_date ) MostRecentApnt
from
appointments a1
WHERE
a1.appointment_date > date_sub( curdate(), interval 90 day )
group by
a1.patient_id
Now, from this list, all we care about is how many days have passed since each patient's last appointment. Just use DATEDIFF and sort; you can see the number of days directly. Rather than breaking them into buckets of 30, 60, or 90 days, it is probably just as easy to sort in descending order of days since the last appointment, so the oldest appointments get called first rather than those that just happened. You might even cut the calling list off at, say, 20 days, for patients who still have not made an appointment and are getting close to the expected 30 days.
SELECT
p.LastName,
p.FirstName,
p.Phone,
Last90.Patient_ID,
Last90.MostRecentApnt,
DATEDIFF(CURDATE(), Last90.MostRecentApnt) LastAppointmentDays
FROM
( select
a1.patient_id,
max( a1.appointment_date ) MostRecentApnt
from
appointments a1
WHERE
a1.appointment_date > date_sub( curdate(), interval 90 day )
group by
a1.patient_id ) Last90
-- Guessing you might want patient data to do phone calling
JOIN Patients p
on Last90.Patient_id = p.patient_id
order by
LastAppointmentDays DESC,
p.LastName,
p.FirstName
Sometimes an answer to just the direct question doesn't meet the actual need. Hopefully I am more on target with the ultimate desired outcome. Again, the above implies joining to the patient table so that follow-up calls can be made to schedule an appointment.
You could use the following query, which compares the day of the month of the appointment to the day of the month of today.
We also test whether today is the last day of the month so as to get appointments due at the end of the month. For example, if today is 28 February (not a leap year), we will accept days of the month >= 28, i.e. 29, 30 and 31, which would otherwise be missed.
This method has the same problem as your current system: appointments falling on a weekend will be missed.
select a.*
from appointments a,
(select
day(now()) today,
case when day(now()) = day(last_day(now())) then day(now()) else 99 end lastDay
) days
where day(a.appointment_date) = days.today or day(a.appointment_date) >= days.lastDay;
You just want the appointments for 30 days in the future? Are they stored as DATE? Or DATETIME? Well, this works in either case:
SELECT ...
WHERE appt_date >= CURDATE() + INTERVAL 30 DAY
AND appt_date < CURDATE() + INTERVAL 31 DAY
If you have INDEX(appt_date) (or any index starting with appt_date), the query will be efficient.
Things like DATE() are not "sargable", and prevent the use of an index.
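The same idea can be stretched to the original "30, 60, 90, 120 days in the past" requirement by comparing the bare column against a short list of precomputed dates. This is only a sketch under assumptions: appointment_date is taken to be a DATE column (with DATETIME, each value would need a range like the one above), and the list stops at 120 days purely for illustration.

```sql
-- Each list entry is a constant computed once, so the column itself is
-- left untouched and an index on appointment_date can be used.
SELECT a.*
FROM appointments a
WHERE a.appointment_date IN (
    CURDATE() - INTERVAL 30 DAY,
    CURDATE() - INTERVAL 60 DAY,
    CURDATE() - INTERVAL 90 DAY,
    CURDATE() - INTERVAL 120 DAY
);
```

The trade-off is that the list is finite, so a cut-off has to be chosen for how far back reminders should reach.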
If your goal is to nag customers, I see nothing in your query to prevent nagging everyone over and over. This might need a separate "nag" table, where customers who have satisfied the nag can be removed. Then performance won't be a problem, since the table will be small.
If your primary concern is to speed up this query, we can add an int column that stores a precomputed day number and index it. We then add triggers that calculate the number of days between the appointment date and the start of the Unix epoch, 1970-01-01 (or any other date if you prefer), modulo 30, and store the result in this column.
This will take a small amount of storage space and slow down insert and update operations. That will not be noticeable when we add or modify one appointment at a time, which I suspect is the general case.
When we query the table, we calculate the value for today, which takes very little time as it is only done once, and compare it with the days column, which is very quick because the column is indexed and no per-row calculation is involved.
Finally, we run your current query and look at it using explain to see that, even though we have indexed the column date_, the index cannot be used for that query.
CREATE TABLE appointments (
id INT PRIMARY KEY NOT NULL AUTO_INCREMENT,
date_ date,
days int
);
CREATE INDEX ix_apps_days ON appointments (days);
CREATE PROCEDURE apps_day()
BEGIN
UPDATE appointments SET days = day(date_);
END
CREATE TRIGGER t_apps_insert BEFORE INSERT ON appointments
FOR EACH ROW
BEGIN
SET NEW.days = DATEDIFF(NEW.date_, '1970-01-01') % 30 ;
END;
CREATE TRIGGER t_apps_update BEFORE UPDATE ON appointments
FOR EACH ROW
BEGIN
SET NEW.days = DATEDIFF(NEW.date_, '1970-01-01') % 30 ;
END;
insert into appointments (date_) values ('2022-01-01'),('2022-01-01'),('2022-04-15'),(now());
update appointments set date_ = '2022-01-12' where id = 1;
select * from appointments
id | date_ | days
-: | :--------- | ---:
1 | 2022-01-12 | 14
2 | 2022-01-01 | 3
3 | 2022-04-15 | 17
4 | 2022-04-22 | 24
select
*
from appointments
where DATEDIFF(CURDATE() , '1970-01-01') % 30 = days;
id | date_ | days
-: | :--------- | ---:
4 | 2022-04-22 | 24
explain
select DATEDIFF(CURDATE() , '1970-01-01')
from appointments
where DATEDIFF(CURDATE() , '1970-01-01') = days;
id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra
-: | :---------- | :----------- | :--------- | :--- | :------------ | :----------- | :------ | :---- | ---: | -------: | :----------
1 | SIMPLE | appointments | null | ref | ix_apps_days | ix_apps_days | 5 | const | 1 | 100.00 | Using index
CREATE INDEX ix_apps_date_ ON appointments (date_);
SELECT
a.*
FROM
appointments a
WHERE
DATEDIFF(CURDATE(), a.date_) % 30 = 0
id | date_ | days
-: | :--------- | ---:
4 | 2022-04-22 | 24
explain
SELECT
a.*
FROM
appointments a
WHERE
DATEDIFF(CURDATE(), a.date_) % 30 = 0
id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra
-: | :---------- | :---- | :--------- | :--- | :------------ | :--- | :------ | :--- | ---: | -------: | :----------
1 | SIMPLE | a | null | ALL | null | null | null | null | 4 | 100.00 | Using where

Need an aggregate MySQL select that iterates virtually across date ranges and returns bills

I have a MySQL table named rbsess with columns RBSessID (key), ClientID (int), RBUnitID (int), RentAmt (fixed-point int), RBSessStart (DateTime), and PrevID (int, references to RBSessID).
It's not transactional or linked. What it does track is when a client was moved into a room and what the rent was at the time of move-in. The query to find the rent for a particular client on a particular date is:
SET @DT='Desired date/time'
SET @ClientID=Desired client id
SELECT a.RBSessID
, a.ClientID
, a.RBUnitID
, a.RentAmt
, a.RBSessStart
, b.RBSessStart AS RBSessEnd
, a.PrevID
FROM rbsess a
LEFT
JOIN rbsess b
ON b.PrevID=a.RBSessID
WHERE a.ClientID=@ClientID
AND (a.RBSessStart<=@DT OR a.RBSessStart IS NULL)
AND (b.RBSessStart>@DT OR b.RBSessStart IS NULL);
This will output something like:
+----------+----------+----------+---------+---------------------+-----------+--------+
| RBSessID | ClientID | RBUnitID | RentAmt | RBSessStart | RBSessEnd | PrevID |
+----------+----------+----------+---------+---------------------+-----------+--------+
| 2 | 4 | 1 | 57500 | 2020-11-22 00:00:00 | NULL | 1 |
+----------+----------+----------+---------+---------------------+-----------+--------+
I also have
SELECT * FROM rbsess WHERE rbsess.ClientID=@ClientID AND rbsess.PrevID IS NULL; -- for finding the first move-in date
SELECT TIMESTAMPDIFF(DAY,@DT,LAST_DAY(@DT)) AS CountDays; -- for finding the number of days until the end of the month
SELECT DAY(LAST_DAY(@DT)) AS MaxDays; -- for finding the number of days in the month
SELECT (TIMESTAMPDIFF(DAY,@DT,LAST_DAY(@DT))+1)/DAY(LAST_DAY(@DT)) AS ProRateRatio; -- for finding the ratio to calculate the pro-rated rent for the move-in month
SELECT ROUND(40000*(SELECT (TIMESTAMPDIFF(DAY,@DT,LAST_DAY(@DT))+1)/DAY(LAST_DAY(@DT)) AS ProRateRatio)) AS ProRatedRent; -- for finding a pro-rated rent amount based on a rent amount
I'm having trouble putting all of these together into a single statement that outputs the rent owed, pro-rated or full, for each month in a period defined by a start date and an optional end date. I can add a table of payments received and integrate it afterwards; I'm just having a hard time expressing this seemingly simple real-world concept as a MySQL query. I'm using PHP with a MySQL back end. Temporary tables as intermediary queries are more than acceptable.
Even a nudge would be helpful. I'm not super-experienced with MySQL queries, just your basic CREATE, SELECT, INSERT, DROP, and UPDATE.
Examples as requested by GMB:
//Example data in rbsess table:
+----------+----------+----------+---------+---------------------+--------+
| RBSessID | ClientID | RBUnitID | RentAmt | RBSessStart | PrevID |
+----------+----------+----------+---------+---------------------+--------+
| 1 | 4 | 1 | 40000 | 2020-10-22 00:00:00 | NULL |
| 2 | 4 | 1 | 57500 | 2020-11-22 00:00:00 | 1 |
| 3 | 2 | 5 | 40000 | 2020-11-29 00:00:00 | NULL |
+----------+----------+----------+---------+---------------------+--------+
Expected results would be a list of the rent amounts owed for every month, including pro-rated amounts for partial occupancy in a month, from a date range of months. For example for the example data above for a date range spanning all of the year of 2020 from client with ClientID=4 the query would produce an amount for each month within the range similar to:
Month | Amt
2020-10-1 | 12903
2020-11-1 | 45834
2020-12-1 | 57500

how to select two different date in mysql?

I'm new to databases and MySQL. I'm developing a stock tracking software backend with a MySQL database, and I have a problem with a MySQL query.
I need to track the price change over a certain period. How can I select two different columns from two different rows based on date? For example, I need MySQL to return the 'open' value from date '2019-02-27' and the 'close' value from date '2019-03-01',
and calculate the % difference between the two values (which are DECIMAL).
Is it possible to do this kind of query in MySQL, or should I write a program that sends two queries, one to get 'open' from '2019-02-27' and the other to get 'close' from '2019-03-01'?
Here is the SQL fiddle for my problem: http://sqlfiddle.com/#!9/eb23e3/6
Here is an example table:
symbol | date | open | close | low | high
----------------------------------------------------
HCL | 2019-02-27 | 36.00 | 38.00 | 34.00 | 40.00
HCL | 2019-02-28 | 37.00 | 39.00 | 36.00 | 41.00
HCL | 2019-03-01 | 38.00 | 42.00 | 37.00 | 46.00
How can I get 'open' from date '2019-02-27' and 'close' from date '2019-03-01',
and then calculate the % difference? For example, the (2019-02-27) 'open' value is 36.00 and the (2019-03-01) 'close' value is 42.00, so the percentage difference is +16.6%.
With this you get the 3 needed columns:
select t.*,
concat(
case when t.close > t.open then '+' else '' end,
truncate(100.0 * (t.close - t.open) / t.open, 1)
) percentdif
from (
select
(select open from dailyprice where date = '2019-02-27') open,
(select close from dailyprice where date = '2019-03-01') close
) t
I suggest running two separate queries and then calculating the difference between the returned values.
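If a single round trip is preferred, a self-join is another option alongside the scalar-subquery query above. The sketch below assumes the table is named dailyprice, as in that query, and has the symbol, date, open and close columns shown in the question. The aliases o and c and the output column names are illustrative.

```sql
-- One row per symbol: the opening price on the first date paired with
-- the closing price on the second date.
SELECT o.symbol,
       o.open  AS open_price,
       c.close AS close_price,
       ROUND(100 * (c.close - o.open) / o.open, 1) AS pct_change
FROM dailyprice o
JOIN dailyprice c
  ON c.symbol = o.symbol
WHERE o.date = '2019-02-27'
  AND c.date = '2019-03-01';
```

Joining on symbol keeps the two prices paired per stock if the table holds more than one symbol. Note that ROUND rounds to the nearest tenth, whereas TRUNCATE, as used above, cuts the extra digits off.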

Select a row for every date in the table, no matter the data

I have the following 3 tables in my database:
noobs
id
name
img_url
associations_id
noobs_has_points
noobs_id
points_id
points
id
amount
create_time (as UNIX timestamp)
I want to get a result for every day (such as FROM_UNIXTIME(points.create_time,'%Y-%m-%d')). And in that result I want the noobs.id and his amount of points so SUM(points.amount). So whether a noob has actually scored points on that day doesn't matter, if he did not I would want a row with 0 in there as the amount, so that for every day I get to see how many points each noob scored.
However, I have no idea how to get this result. I have tried some things with left/right (or unioned) joins but I don't get the result I want. Can anyone help me with this?
Example results:
day | points.amount | noobs.id
2015-04-11 | 3 | 1
2015-04-11 | 0 | 2 (no points scored, no entry in database)
2015-04-12 | 0 | 1 (no points scored, no entry in database)
2015-04-12 | 1 | 2
Some sample data from the three tables:
Noobs
id | name | img_url | associations_id
1 | Rien | NULL | 1
2 | Peter| NULL | 1
noobs_has_points
noobs_id | points_id
1 | 1
2 | 3
points
id | amount | create_time
1 | 3 | 1428779292
2 | 1 | 1428805351
Because there may be no data for a given day for a given noob, you need a way to generate date values. Unfortunately, MySQL doesn't have a built-in way to do this. You can code a range into the query with a series of unions as a subquery, but it's ugly and not scalable.
I recommend creating a table to hold date values:
create table dates(_date date not null primary key);
And populating it with lots of dates (say everything from 1970-2020).
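One way to fill such a table, on MySQL 8.0 or later only, is a recursive common table expression. This is a sketch under that version assumption (older versions would need a loop or a numbers table instead). The CTE name seq and its column d are illustrative, and the date range simply mirrors the 1970-2020 example.

```sql
-- The default recursion limit is 1000 rows; 1970 through 2020 needs
-- roughly 18,600, so raise the limit for this session first.
SET SESSION cte_max_recursion_depth = 20000;

INSERT INTO dates (_date)
WITH RECURSIVE seq (d) AS (
    SELECT DATE('1970-01-01')          -- first day of the range
    UNION ALL
    SELECT d + INTERVAL 1 DAY          -- step one day at a time
    FROM seq
    WHERE d < '2020-12-31'             -- stop after the last day
)
SELECT d FROM seq;
```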
Then you can code:
select _date day, coalesce(sum(p.amount), 0) total, n.id
from dates d
cross join noobs n
left join noobs_has_points np on np.noobs_id = n.id
left join points p on p.id = np.points_id
and date(from_unixtime(p.create_time)) = _date
where _date between ? and ?
group by 1, 3
The cross join gives every noob a result for every date in the specified range, while the left joins ensure a zero for days on which the noob scored no points.

Can't figure out a proper MySQL query

I have a table with the following structure:
id | workerID | materialID | date | materialGathered
Different workers contribute different amounts of different material per day. A single worker can only contribute once a day, but not necessarily every day.
What I need to do is to figure out which of them was the most productive and which of them was the least productive, while it is supposed to be measured as AVG() material gathered per day.
I honestly have no idea how to do that, so I'll appreciate any help.
EDIT1:
Some sample data
1 | 1 | 2013-01-20 | 25
2 | 1 | 2013-01-21 | 15
3 | 1 | 2013-01-22 | 17
4 | 1 | 2013-01-25 | 28
5 | 2 | 2013-01-20 | 23
6 | 2 | 2013-01-21 | 21
7 | 3 | 2013-01-22 | 17
8 | 3 | 2013-01-24 | 15
9 | 3 | 2013-01-25 | 19
Doesn't really matter how the output looks, to be honest. Maybe a simple table like that:
workerID | avgMaterialGatheredPerDay
And I didn't really attempt anything because I literally have no idea, haha.
EDIT2:
Any time period that is in the table (from earliest to latest date in the table) is considered.
Material doesn't matter at the moment. Only the arbitrary units in the materialGathered column matter.
As you say in your comments, we look at each worker and consider their average daily output rather than checking who worked most in a given time, so the answer is rather easy: group by workerid to get one result record per worker, and use AVG to get their average amount:
select workerid, avg(materialgathered) as avg_gathered
from work
group by workerid;
Now to the best and worst workers. These can be more than two. So you cannot just take the first or last record, but need to know the maximum and the minimum avg_gathered.
select max(avg_gathered) as max_avg_gathered, min(avg_gathered) as min_avg_gathered
from
(
select avg(materialgathered) as avg_gathered
from work
group by workerid
) as per_worker;
Now join the two queries to get all workers whose average equals the minimum or the maximum:
select worker.*
from
(
select workerid, avg(materialgathered) as avg_gathered
from work
group by workerid
) as worker
inner join
(
select max(avg_gathered) as max_avg_gathered, min(avg_gathered) as min_avg_gathered
from
(
select avg(materialgathered) as avg_gathered
from work
group by workerid
) as per_worker
) as worked on worker.avg_gathered in (worked.max_avg_gathered, worked.min_avg_gathered)
order by worker.avg_gathered;
There are other ways to do this. For example with HAVING avg(materialgathered) IN (select min(avg_gathered)...) OR avg(materialgathered) IN (select max(avg_gathered)...) instead of a join. The join is very effective though, because you need just one select for both min and max.
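The HAVING variant mentioned above could look roughly like the following. It is a sketch that assumes the same work table and column names used in the queries above. The derived-table aliases per_worker_max and per_worker_min are illustrative, and are needed because MySQL requires every derived table to have an alias.

```sql
-- Keep only the workers whose average equals the overall highest or
-- lowest per-worker average.
SELECT workerid, AVG(materialgathered) AS avg_gathered
FROM work
GROUP BY workerid
HAVING AVG(materialgathered) = (
           SELECT MAX(avg_gathered)
           FROM (SELECT AVG(materialgathered) AS avg_gathered
                 FROM work
                 GROUP BY workerid) AS per_worker_max
       )
    OR AVG(materialgathered) = (
           SELECT MIN(avg_gathered)
           FROM (SELECT AVG(materialgathered) AS avg_gathered
                 FROM work
                 GROUP BY workerid) AS per_worker_min
       )
ORDER BY avg_gathered;
```

Compared with the join, this form computes the per-worker averages separately for the maximum and the minimum, which is the extra work the join avoids.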