SQL for last changed value before given index - mysql

My table tracks changes to a value that is otherwise constant.
For example this is what it might look like if resources 1 and 2 start with values of 100 and 50 respectively - then resource 1 increases to 110 on day 5 (but resource 2 is unchanged), then resource 2 changes to 60 on day 7.
resource_id | date_id | value
------------+---------+------
1 | 1 | 100
2 | 1 | 50
1 | 5 | 110
2 | 7 | 60
Is there a simple query to get the value of the resources on a specific day? Something like:
SELECT resource_id, date_id, value FROM Resources WHERE ??? -- day = 6
resource_id | date_id | value
------------+---------+------
1 | 5 | 110
2 | 1 | 50
Note: I don't want interpolation of the values - just the last set value.

I think this should work:
SELECT resources.resource_id,
resources.date_id,
resources.value
FROM resources,
(SELECT resource_id,
Max(date_id) AS date_id
FROM resources
WHERE date_id <= 6
GROUP BY resource_id ) temp
WHERE resources.resource_id = temp.resource_id
AND temp.date_id = resources.date_id
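For anyone who wants to try this without a MySQL server, here is a minimal sketch that runs the same "latest date_id per resource, then join back" idea against an in-memory SQLite database from Python. The SQLite stand-in and the helper name values_on_day are my assumptions, not part of the answer above.

```python
import sqlite3

# In-memory SQLite stands in for MySQL here (an assumption for the demo).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE resources (resource_id INTEGER, date_id INTEGER, value INTEGER)"
)
conn.executemany(
    "INSERT INTO resources VALUES (?, ?, ?)",
    [(1, 1, 100), (2, 1, 50), (1, 5, 110), (2, 7, 60)],
)


def values_on_day(day):
    """Return (resource_id, date_id, value) for the last change on or before `day`."""
    return conn.execute(
        """
        SELECT r.resource_id, r.date_id, r.value
        FROM resources r
        JOIN (SELECT resource_id, MAX(date_id) AS date_id
              FROM resources
              WHERE date_id <= ?
              GROUP BY resource_id) temp
          ON temp.resource_id = r.resource_id
         AND temp.date_id = r.date_id
        ORDER BY r.resource_id
        """,
        (day,),
    ).fetchall()


print(values_on_day(6))  # [(1, 5, 110), (2, 1, 50)]
```

A resource with no row on or before the requested day simply does not appear in the result.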

Related

MySQL Query to get only one entry per interval from database

I have a table with this structure and some sample values:
ID | created | value | person
1 | 1 | 5 | 1
2 | 2 | 2 | 2
3 | 3 | 3 | 3
4 | 4 | 5 | 1
5 | 5 | 1 | 2
6 | 6 | 32 | 3
7 | 7 | 9 | 1
8 | 8 | 34 | 2
10 | 9 | 25 | 3
11 | 11 | 53 | 1
12 | 12 | 52 | 2
13 | 13 | 15 | 3
... etc
The created column will have timestamps. I.e. A number like "1555073978". I just made it incremental to demonstrate that the timestamps will rarely be the same.
So values are stored per person with creation times. Values are added every minute. After a week, this table is quite big. So when I do a query to draw a graph, PHP runs out of memory because the dataset is so huge.
So what I am looking for, is an easy way to do a query on a table like this, so that I get values in smaller intervals.
How would I query this table, so that I get:
- only one value per person per interval
- where interval should be 15 mins, 30 mins, 60 mins etc (i.e. a parameter in the query)
I've started with an approach but don't want to spend too much time on it, in case I am missing a much easier way. My way involves converting the timestamp to YEAR-MONTH-DAY-HOUR, but this will only work for hourly intervals. I am also struggling to make sure that the query returns the MOST RECENT entry PER PERSON for that hour.
Any help would be greatly appreciated.
Assuming your created column is a timestamp and you want the max value per person for every 15-minute interval, you could try
select person, max(value)
from my_table
group by person, FLOOR(UNIX_TIMESTAMP(created )/(15 * 60))
But if created is already stored as a Unix timestamp number (as in your example), you don't need UNIX_TIMESTAMP, and you can use
group by person, FLOOR(created /(15 * 60))
If you want the most recent values for person and interval then you could use
select * from my_table m
inner join (
select person, max(created) max_created
from my_table
group by person, FLOOR(UNIX_TIMESTAMP(created )/(15 * 60))
) t on t.person = m.person and t.max_created = m.created
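Here is a minimal sketch of the "most recent row per person per interval" join with the interval passed as a parameter. It assumes created is an integer Unix timestamp (so no UNIX_TIMESTAMP call), uses in-memory SQLite from Python instead of MySQL, and the helper name latest_per_bucket is made up for the demo. Integer division in SQLite plays the role of FLOOR here.

```python
import sqlite3

# In-memory SQLite stands in for MySQL (an assumption for the demo).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE my_table (id INTEGER, created INTEGER, value INTEGER, person INTEGER)"
)
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?, ?, ?)",
    [
        (1, 1, 5, 1), (2, 2, 2, 2), (3, 3, 3, 3),
        (4, 4, 5, 1), (5, 5, 1, 2), (6, 6, 32, 3),
        (7, 7, 9, 1), (8, 8, 34, 2), (10, 9, 25, 3),
        (11, 11, 53, 1), (12, 12, 52, 2), (13, 13, 15, 3),
    ],
)


def latest_per_bucket(bucket_seconds):
    """Return (person, created, value) of the newest row per person per bucket."""
    return conn.execute(
        """
        SELECT m.person, m.created, m.value
        FROM my_table m
        JOIN (SELECT person, MAX(created) AS max_created
              FROM my_table
              GROUP BY person, created / ?) t  -- integer division = bucket number
          ON t.person = m.person
         AND t.max_created = m.created
        ORDER BY m.created
        """,
        (bucket_seconds,),
    ).fetchall()


# With the toy timestamps 1..13, a 15-minute bucket holds everything,
# so only the newest row per person survives.
print(latest_per_bucket(15 * 60))  # [(1, 11, 53), (2, 12, 52), (3, 13, 15)]
```

Passing 30 * 60 or 60 * 60 gives half-hourly or hourly buckets without changing the query.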

SQL calculate timediff between intervals including a time from a separate table

I have 2 different tables called observations and intervals.
observations:
id | type | start
------------------------------------
1 | classroom | 2017-06-07 16:18:40
2 | classroom | 2017-06-01 15:12:00
intervals:
+----+----------------+--------+------+---------------------+
| id | observation_id | number | task | time |
+----+----------------+--------+------+---------------------+
| 1 | 1 | 1 | 1 | 07/06/2017 16:18:48 |
| 2 | 1 | 2 | 0 | 07/06/2017 16:18:55 |
| 3 | 1 | 3 | 1 | 07/06/2017 16:19:00 |
| 4 | 2 | 1 | 3 | 01/06/2017 15:12:10 |
| 5 | 2 | 2 | 1 | 01/06/2017 15:12:15 |
+----+----------------+--------+------+---------------------+
I want a view that will display:
observation_id | time_on_task (total time in seconds where task = 1)
1 | 13
2 | 5
So I must first check whether the first interval has task = 1; if it does, I must record the difference between that interval's time and the start from the observations table, then add that to the total time. From then on, whenever task = 1, I just add the time difference between the current interval and the previous interval.
I know I can use:
select observation_id, TIME_TO_SEC(TIMEDIFF(max(time),min(time)))
from your_table
group by observation_id
to find the total time in the intervals table between all intervals outside of the first one.
But
1. I need to only include interval times where task = 1. (The endtime for the interval is the one listed)
2. Need the timediff between the first interval and initial start (from observations table) if number = 1
I'm still new to the Stack Overflow community, but you could try the SQL LAG() function (available in MySQL 8.0 and later).
For instance, using an outer SELECT statement:
SELECT t.Col1, t.Col2, TIMESTAMPDIFF(MINUTE, t.prevtime, t.Created_Datetime) AS Difference
FROM ( SELECT Col1, Col2, Created_Datetime,
LAG(Created_Datetime) OVER (ORDER BY Created_Datetime) AS prevtime
FROM MyTable
WHERE SomeCondition ) AS t
Sorry if it looks goofy, still trying to learn to format code here.
https://explainextended.com/2009/03/12/analytic-functions-optimizing-lag-lead-first_value-last_value/
Hope it helps
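To make the LAG() idea concrete for the observations/intervals tables, here is a rough sketch, not a tested MySQL view. It runs on in-memory SQLite from Python (window functions need SQLite 3.25 or newer), and it assumes the interval times are stored in ISO format rather than the dd/mm/yyyy shown in the question. The first interval of each observation has no previous row, so COALESCE falls back to the observation's start.

```python
import sqlite3

# In-memory SQLite stands in for MySQL (an assumption for the demo).
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE observations (id INTEGER, type TEXT, start TEXT);
    CREATE TABLE intervals (id INTEGER, observation_id INTEGER,
                            number INTEGER, task INTEGER, time TEXT);
    INSERT INTO observations VALUES
      (1, 'classroom', '2017-06-07 16:18:40'),
      (2, 'classroom', '2017-06-01 15:12:00');
    INSERT INTO intervals VALUES
      (1, 1, 1, 1, '2017-06-07 16:18:48'),
      (2, 1, 2, 0, '2017-06-07 16:18:55'),
      (3, 1, 3, 1, '2017-06-07 16:19:00'),
      (4, 2, 1, 3, '2017-06-01 15:12:10'),
      (5, 2, 2, 1, '2017-06-01 15:12:15');
    """
)

time_on_task = conn.execute(
    """
    SELECT observation_id, SUM(secs) AS time_on_task
    FROM (
        SELECT i.observation_id,
               i.task,
               -- seconds since the previous interval, or since the
               -- observation start for the first interval
               CAST(strftime('%s', i.time) AS INTEGER)
             - CAST(strftime('%s', COALESCE(
                   LAG(i.time) OVER (PARTITION BY i.observation_id
                                     ORDER BY i.number),
                   o.start)) AS INTEGER) AS secs
        FROM intervals i
        JOIN observations o ON o.id = i.observation_id
    )
    WHERE task = 1            -- filter after LAG so gaps are measured correctly
    GROUP BY observation_id
    ORDER BY observation_id
    """
).fetchall()

print(time_on_task)  # [(1, 13), (2, 5)]
```

The task = 1 filter sits in the outer query on purpose: filtering before LAG() would measure each task = 1 interval against the previous task = 1 row instead of the row immediately before it.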

SQL query SUM() AND GROUP BY

I have a MySQL table like this:
acco_id | room_id | arrival | amount | persons | available
1 | 1 | 2015-19-12 | 3 | 4 | 1
1 | 2 | 2015-19-12 | 1 | 10 | 1
1 | 1 | 2015-26-12 | 4 | 4 | 1
1 | 2 | 2015-26-12 | 2 | 10 | 1
2 | 3 | 2015-19-12 | 2 | 6 | 0
2 | 4 | 2015-19-12 | 1 | 4 | 1
What I'm trying to achieve is a single query with a result like:
acco_id | max_persons_available
1 | 22
2 | 4
I tried using a GROUP BY accommodation_id using a query like:
SELECT
accommodation_id,
SUM(amount * persons) as max_persons_available
FROM
availabilities
WHERE
available = 1
GROUP BY
accommodation_id
The problem is that the result for each acco_id now sums over all arrival dates. When I add arrival to the query, the acco_id's are no longer unique.
Does anyone know a good single SQL query that can use the table indexes?
If I'm understanding the question correctly (the last part is a bit confusing), you want the accommodation id and the numbers you have now, but limited to specific arrival dates.
If so, the following statement should do exactly that. You don't need to put arrival into the SELECT list if you only use it in the WHERE clause; if you did select it, you would also have to add it to the GROUP BY and would end up with non-unique accommodation ids.
SELECT
accommodation_id,
SUM(amount * persons) as max_persons_available
FROM
availabilities
WHERE
available = 1 and arrival >= '2015-12-19' and arrival < '2015-12-26'
GROUP BY
accommodation_id
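As a quick check of the filtered GROUP BY, here is a small sketch using in-memory SQLite from Python. It assumes arrival is stored as an ISO date (YYYY-MM-DD) and that the column is called accommodation_id as in the query; the helper name max_persons is made up. Restricting to the week starting 2015-12-19 reproduces the numbers asked for in the question.

```python
import sqlite3

# In-memory SQLite stands in for MySQL (an assumption for the demo).
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE availabilities (
        accommodation_id INTEGER, room_id INTEGER, arrival TEXT,
        amount INTEGER, persons INTEGER, available INTEGER)
    """
)
conn.executemany(
    "INSERT INTO availabilities VALUES (?, ?, ?, ?, ?, ?)",
    [
        (1, 1, "2015-12-19", 3, 4, 1),
        (1, 2, "2015-12-19", 1, 10, 1),
        (1, 1, "2015-12-26", 4, 4, 1),
        (1, 2, "2015-12-26", 2, 10, 1),
        (2, 3, "2015-12-19", 2, 6, 0),
        (2, 4, "2015-12-19", 1, 4, 1),
    ],
)


def max_persons(first_day, end_day):
    """Sum amount * persons per accommodation for first_day <= arrival < end_day."""
    return conn.execute(
        """
        SELECT accommodation_id,
               SUM(amount * persons) AS max_persons_available
        FROM availabilities
        WHERE available = 1
          AND arrival >= ? AND arrival < ?
        GROUP BY accommodation_id
        ORDER BY accommodation_id
        """,
        (first_day, end_day),
    ).fetchall()


print(max_persons("2015-12-19", "2015-12-26"))  # [(1, 22), (2, 4)]
```

An index on (available, arrival) would be a reasonable starting point for letting the WHERE clause use an index, though that depends on the real data.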
I guess (reading your question) that what you are looking for is this, but I'm not sure, as your question is a bit unclear:
SELECT
accommodation_id,
arrival,
SUM(amount * persons) as max_persons_available
FROM
availabilities
WHERE
available = 1
GROUP BY
accommodation_id, arrival

MySQL query based on time range, group users, and sum values over a sliding window

I want to create a new Table B based on the information from another existing Table A. I'm wondering if MySQL has the functionality to take into account a range of time and group column A values then only sum up the values in a column B based on those groups in column A.
Table A stores logs of events like a journal for users. There can be multiple events from a single user in a single day. Say hypothetically I'm keeping track of when my users eat fruit and I want to know how many fruit they eat in a week (7days) and also how many apples they eat.
So in Table B I want to count for each entry in Table A, the previous 7 day total # of fruit and apples.
EDIT:
I'm sorry, I oversimplified the information I gave and didn't think my example through thoroughly.
Initially I have only Table A. I'm trying to create Table B from a query.
Assume:
User/id can log an entry multiple times in a day.
sum counts should be for id between date and date - 7 days
fruit column stands for the total # of fruit during the 7 day interval ( apples and bananas are both fruit)
The data doesn't only start at 2013-9-5. It can date back 2000 and I want to use the 7 day sliding window over all the dates between 2000 to 2013.
The sum count is over a sliding window of 7 days
Here's an example:
Table A:
| id | date-time | apples | banana |
---------------------------------------------
| 1 | 2013-9-5 08:00:00 | 1 | 1 |
| 2 | 2013-9-5 09:00:00 | 1 | 0 |
| 1 | 2013-9-5 16:00:00 | 1 | 0 |
| 1 | 2013-9-6 08:00:00 | 0 | 1 |
| 2 | 2013-9-9 08:00:00 | 1 | 1 |
| 1 | 2013-9-11 08:00:00 | 0 | 1 |
| 1 | 2013-9-12 08:00:00 | 0 | 1 |
| 2 | 2013-9-13 08:00:00 | 1 | 1 |
note: user 1 logged 2 entries on 2013-9-5
The result after the query should be Table B.
Table B
| id | date-time | apples | fruit |
--------------------------------------------
| 1 | 2013-9-5 08:00:00 | 1 | 2 |
| 2 | 2013-9-5 09:00:00 | 1 | 1 |
| 1 | 2013-9-5 16:00:00 | 2 | 3 |
| 1 | 2013-9-6 08:00:00 | 2 | 4 |
| 2 | 2013-9-9 08:00:00 | 2 | 3 |
| 1 | 2013-9-11 08:00:00 | 2 | 5 |
| 1 | 2013-9-12 08:00:00 | 0 | 3 |
| 2 | 2013-9-13 08:00:00 | 2 | 4 |
At 2013-9-12 the sliding window moves and only includes 9-6 to 9-12. That's why id 1 goes from a sum of 2 apples to 0 apples.
You need years in your data to be able to use date arithmetic correctly. I added them.
There's an odd thing in your data. You seem to have multiple log entries for each person for each day. You're assuming an implicit order setting the later log entries somehow "after" the earlier ones. If SQL and MySQL do that, it's only by accident: there's no implicit ordering of rows in a table. Plus if we duplicate date/id combinations, the self join (read on) has lots of duplicate rows and ruins the sums.
So we need to start by creating a daily summary table of your data, like so:
select id, `date`, sum(apples) as apples, sum(banana) as banana
from fruit
group by id, `date`
This summary will contain at most one row per id per day.
Next we need to do a limited cross product self-join, so we get seven days' worth of fruit eating.
select --whatever--
from (
-- summary query --
) as a
join (
-- same summary query once again
) as b
on ( a.id = b.id
and b.`date` between a.`date` - interval 6 day AND a.`date` )
The between clause in the on gives us the seven days (today, and the six days prior). Notice that the table in the join with the alias b is the seven day stuff, and the a table is the today stuff.
Finally, we have to summarize that result according to your specification. The resulting query is this.
select a.id, a.`date`,
sum(b.apples) + sum(b.banana) as fruit_last_week,
a.apples as apple_today
from (
select id, `date`, sum(apples) as apples, sum(banana) as banana
from fruit
group by id, `date`
) as a
join (
select id, `date`, sum(apples) as apples, sum(banana) as banana
from fruit
group by id, `date`
) as b on (a.id = b.id and
b.`date` between a.`date` - interval 6 day AND a.`date` )
group by a.id, a.`date`, a.apples
order by a.`date`, a.id
Here's a fiddle: http://sqlfiddle.com/#!2/670b2/15/0
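In case the fiddle link goes stale, here is a rough sketch of the same "daily summary, then 7-day self-join" approach on in-memory SQLite from Python. The table and column names (fruit, day) and the ISO date format are assumptions for the demo, and SQLite's date(day, '-6 days') replaces MySQL's interval arithmetic. Unlike the query above, this variant also sums apples over the window, which is my reading of the apples column in Table B.

```python
import sqlite3

# In-memory SQLite stands in for MySQL (an assumption for the demo).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE fruit (id INTEGER, day TEXT, apples INTEGER, banana INTEGER)")
conn.executemany(
    "INSERT INTO fruit VALUES (?, ?, ?, ?)",
    [
        (1, "2013-09-05", 1, 1),
        (2, "2013-09-05", 1, 0),
        (1, "2013-09-05", 1, 0),  # second entry for id 1 on the same day
        (1, "2013-09-06", 0, 1),
        (2, "2013-09-09", 1, 1),
        (1, "2013-09-11", 0, 1),
        (1, "2013-09-12", 0, 1),
        (2, "2013-09-13", 1, 1),
    ],
)

rows = conn.execute(
    """
    WITH daily AS (            -- at most one row per id per day
        SELECT id, day, SUM(apples) AS apples, SUM(banana) AS banana
        FROM fruit
        GROUP BY id, day
    )
    SELECT a.id, a.day,
           SUM(b.apples)                 AS apples_7d,
           SUM(b.apples) + SUM(b.banana) AS fruit_7d
    FROM daily a
    JOIN daily b
      ON b.id = a.id
     AND b.day BETWEEN date(a.day, '-6 days') AND a.day  -- today plus six days back
    GROUP BY a.id, a.day
    ORDER BY a.day, a.id
    """
).fetchall()

for row in rows:
    print(row)
# (1, '2013-09-05', 2, 3)
# (2, '2013-09-05', 1, 1)
# (1, '2013-09-06', 2, 4)
# (2, '2013-09-09', 2, 3)
# (1, '2013-09-11', 2, 5)
# (1, '2013-09-12', 0, 3)
# (2, '2013-09-13', 2, 4)
```

Because the summary is per day, the two entries for id 1 on 2013-09-05 collapse into one row, so the output has one line per id per day rather than one per log entry.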
Assumptions:
one row per id/date
the counts should be for id between date and date - 7 days
"fruit" = "banana"
the "date" column is actually a date (including year) and not just month/day
then this SQL should do the trick:
INSERT INTO B
SELECT a1.id, a1.date, SUM( a2.apples ), SUM( a2.banana )
FROM (SELECT DISTINCT id, date
FROM A
WHERE date > NOW() - INTERVAL 7 DAY
) a1
JOIN A a2
ON a2.id = a1.id
AND a2.date <= a1.date
AND a2.date >= a1.date - INTERVAL 7 DAY
GROUP BY a1.id, a1.date
Some questions:
Are the above assumptions correct?
Does table A contain more fruits than just Bananas and Apples? If so, what does the real structure look like?

Counting messages per day (after 17:00 counts for next day)

I have two MySQL tables: stats (left) and messages (right)
+------------+----------+    +--------+------------+----------+---------+
| _date      | msgcount |    | msg_id | _date      | time     | message |
+------------+----------+    +--------+------------+----------+---------+
| 2011-01-22 |        2 |    |      1 | 2011-01-22 | 06:23:11 | foo bar |
| 2011-01-23 |        4 |    |      2 | 2011-01-22 | 15:17:03 | baz     |
| 2011-01-24 |        0 |    |      3 | 2011-01-22 | 17:05:45 | foobar  |
| 2011-01-25 |        1 |    |      4 | 2011-01-22 | 23:58:13 | barbaz  |
+------------+----------+    |      5 | 2011-01-23 | 00:06:32 | foo foo |
                             |      6 | 2011-01-23 | 13:45:00 | bar foo |
                             |      7 | 2011-01-25 | 02:22:34 | baz baz |
                             +--------+------------+----------+---------+
I filled in stats.msgcount above, but in reality it is still empty. I'm looking for a way to:
count the number of messages for every stats._date (notice the zero msgcount on 2011-01-24)
messages.time is in 24-hour format. All messages AFTER 5 o'clock (17:00:00) should be counted for the next day (notice msg_id 3 and 4 count for 2011-01-23)
update stats.msgcount to hold all counts
I'm especially concerned about the "later than 17:00:00 count for next day" part. Is this possible in (My)SQL?
You could use:
UPDATE stats LEFT JOIN
( SELECT date(addtime(_date,time) + interval 7 hour) as corrected_date,
count(*) as message_count
FROM messages
GROUP BY corrected_date ) mc
ON stats._date = mc.corrected_date
SET stats.msgcount = COALESCE( mc.message_count, 0 )
However, this query requires the dates you are interested in to be in the stats table already. If they are not, make _date a primary or unique key (if it isn't one yet) and use:
INSERT IGNORE INTO stats(_date,msgcount)
SELECT date(addtime(_date,time) + interval 7 hour) as corrected_date,
count(*) as message_count
FROM messages
GROUP BY corrected_date
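To see the 7-hour shift in action on the sample data, here is a minimal sketch using in-memory SQLite from Python. SQLite has no ADDTIME, so the date and time are concatenated and shifted with date(..., '+7 hours') instead; that substitution and the correlated-subquery form of the UPDATE are my assumptions, not the MySQL statements above.

```python
import sqlite3

# In-memory SQLite stands in for MySQL (an assumption for the demo).
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE stats (_date TEXT PRIMARY KEY, msgcount INTEGER);
    CREATE TABLE messages (msg_id INTEGER, _date TEXT, time TEXT, message TEXT);
    INSERT INTO stats (_date) VALUES
      ('2011-01-22'), ('2011-01-23'), ('2011-01-24'), ('2011-01-25');
    INSERT INTO messages VALUES
      (1, '2011-01-22', '06:23:11', 'foo bar'),
      (2, '2011-01-22', '15:17:03', 'baz'),
      (3, '2011-01-22', '17:05:45', 'foobar'),
      (4, '2011-01-22', '23:58:13', 'barbaz'),
      (5, '2011-01-23', '00:06:32', 'foo foo'),
      (6, '2011-01-23', '13:45:00', 'bar foo'),
      (7, '2011-01-25', '02:22:34', 'baz baz');
    """
)

# Adding 7 hours pushes anything from 17:00:00 onwards past midnight,
# so taking the date part afterwards assigns it to the next day.
conn.execute(
    """
    UPDATE stats
    SET msgcount = (
        SELECT COUNT(*)
        FROM messages m
        WHERE date(m._date || ' ' || m.time, '+7 hours') = stats._date
    )
    """
)

counts = conn.execute("SELECT _date, msgcount FROM stats ORDER BY _date").fetchall()
print(counts)
# [('2011-01-22', 2), ('2011-01-23', 4), ('2011-01-24', 0), ('2011-01-25', 1)]
```

COUNT(*) returns 0 when no message maps to a date, so no COALESCE is needed in this form.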
Really, all you're doing is shifting the times by 7 hours. Something like this should work:
UPDATE stats s
SET msgcount = (SELECT COUNT(msg_id) FROM messages m
WHERE DATE_ADD(m._date, INTERVAL TIME_TO_SEC(m.time) SECOND) >= DATE_SUB(s._date, INTERVAL 7 HOUR)
AND DATE_ADD(m._date, INTERVAL TIME_TO_SEC(m.time) SECOND) < DATE_ADD(s._date, INTERVAL 17 HOUR));
The basic idea is that it takes each date in your stats table, adjusts it by 7 hours, and looks for messages sent in that range. If you used a DATETIME column instead of separate DATE and TIME columns, you wouldn't need the extra DATE_ADD(..., TIME_TO_SEC) stuff.
There may be a better way to add a date and a time, I didn't see one with a quick look at the MySQL reference documents.
So all you'd need to do is insert a new row in the stats table with a 0 for the msgcount, and run the update command. If you only wanted to update a few days (since the message count probably isn't changing 6 days later) you just need a simple where clause on the update:
UPDATE stats s
SET ...
WHERE s._date BETWEEN '2012-04-03' AND '2012-04-08'