Suppose I have a table like this:
id user_id activity_id start_time duration
1 1 1 2015-12-02 12:24:22 00:17:25
1 1 2 2015-12-02 12:25:22 00:17:25
1 1 3 2015-12-02 12:26:22 00:17:25
1 1 4 2015-12-02 12:26:22 00:17:25
1 1 4 2015-12-02 12:27:22 00:17:25
1 1 4 2015-12-02 12:29:22 00:17:25
1 1 4 2015-12-02 12:33:22 00:17:25
Now suppose I need a query that counts the rows whose start_time values are within 3 minutes of each other. For example:
12:24:22 is 4: it counts 24, 25 and 26
12:25:22 is 2: it counts 25 and 24
12:25:22 is 4: it counts 25, 26 and 27
12:26:22 is 4: it counts 24, 25 and 26
12:26:22 is 3: it counts 26 and 27, i.e. each row gets the count of rows within the nearby 3 minutes.
(In reality my interval will be 10 minutes.)
For every row I need to count the rows that lie within the interval around it. I have seen solutions such as
MySQL GROUP BY DateTime +/- 3 seconds, but I don't understand how to apply them to my situation. Can you please give me some hints on how to approach this?
Thanks in advance.
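How about something like:
SELECT t1.start_time, COUNT(*) FROM `table` AS t1
CROSS JOIN `table` AS t2
WHERE ABS(TIMESTAMPDIFF(MINUTE, t1.start_time, t2.start_time)) < 10
GROUP BY t1.start_time;
Note that this counts rows whose start_time is within 10 minutes on BOTH sides of the stated time, and that each row also matches itself. (table stands in for your table name; as a reserved word it has to be quoted with backticks.)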
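SELECT a.*, COUNT(a.start_time) AS TOT
FROM `your_table` AS a
INNER JOIN `your_table` AS b
WHERE ABS(TIME_TO_SEC(a.start_time) - TIME_TO_SEC(b.start_time)) < 180
GROUP BY a.start_time
Here 180 is 3 minutes expressed in seconds (180 sec). Note that TIME_TO_SEC() only looks at the time-of-day part, so this assumes all rows fall on the same date.
Jonathan Twite's answer works too.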
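To sanity-check what the self-join answers above compute, here is a minimal Python sketch (not part of either answer) that counts, for every row, how many rows start less than 3 minutes away on either side. The name `count_within` and the in-memory list are assumptions made for illustration; the timestamps are the ones from the question's table.

```python
from datetime import datetime, timedelta

# start_time values copied from the question's sample table
START_TIMES = [
    "2015-12-02 12:24:22",
    "2015-12-02 12:25:22",
    "2015-12-02 12:26:22",
    "2015-12-02 12:26:22",
    "2015-12-02 12:27:22",
    "2015-12-02 12:29:22",
    "2015-12-02 12:33:22",
]

def count_within(start_times, window):
    """For each timestamp, count the timestamps (itself included) that are
    strictly less than `window` away on either side."""
    parsed = [datetime.strptime(s, "%Y-%m-%d %H:%M:%S") for s in start_times]
    return [sum(1 for other in parsed if abs(t - other) < window) for t in parsed]

counts = count_within(START_TIMES, timedelta(minutes=3))
for start, n in zip(START_TIMES, counts):
    print(start, n)
```

Unlike GROUP BY start_time, this keeps one count per row, so the two 12:26:22 rows are reported separately. Because it compares full datetimes rather than only the time of day, it also behaves sensibly for rows on different dates.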
I have a table
DAY 1
ID  amount  DATE
1   10      12-02-2020
2   15      12-02-2020
3   20      12-02-2020
4   25      12-02-2020
I summed the amount on day one, which comes to 70.
The next day I have a few more rows: some amounts were UPDATED and some rows were APPENDED.
The new table looks like this:
DAY 2
ID  amount  DATE
1   10      12-02-2020
2   20      13-02-2020
3   20      12-02-2020
4   25      12-02-2020
5   30      13-02-2020
6   35      14-02-2020
As you can see, ID 2 has an updated amount of 20 (previously 15),
and there is new data for dates 13 and 14 on IDs 5 and 6.
Can I run a query that processes only the changed data and adds it to the
previous sum,
i.e. 30 + 35 + 5 (as ID 2 only increased by 5 from its last value),
total = 70?
The main goal is to process only the changed data.
This will very much depend on how the historical data is provided.
This example requires an additional Day column in the historical data table AND a MySQL version that supports LAG() (e.g. MySQL v8+ OR MariaDB 10.3+). Let's assume the historical data table can look like this:
ID  Amount  Date        Day
1   10      2020-02-12  1
2   15      2020-02-12  1
3   20      2020-02-12  1
4   25      2020-02-12  1
1   10      2020-02-12  2
2   20      2020-02-13  2
3   20      2020-02-12  2
4   25      2020-02-12  2
5   30      2020-02-13  2
6   35      2020-02-14  2
.. then maybe a query like this:
SELECT Day,
SUM(amount) AS Total,
SUM(amount)-LAG(SUM(amount)) OVER (ORDER BY Day) AS diff
FROM historical_data
GROUP BY Day
ORDER BY Day;
Or (for MariaDB):
SELECT Day, Total,
Total-LAG(Total) OVER (ORDER BY Day) AS Diff
FROM
(SELECT Day,
SUM(amount) AS Total
FROM historical_data
GROUP BY Day) A;
This will return a result like:
Day  Total  diff
1    70     NULL
2    140    70
I was following an example from this site on how to use LAG() to fetch the value from the previous row and subtract it from the SUM(amount) value for that day.
Here's a demo fiddle of the experiment.
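For readers who want to see the same arithmetic outside the database, here is a small Python sketch of what the query does: sum the amounts per Day, then subtract the previous day's total, which is the role LAG() plays. The function name `totals_with_diff` is an assumption for illustration; the rows are the ones from the historical table above.

```python
from collections import defaultdict

# (ID, Amount, Date, Day) rows from the historical data table above
HISTORICAL_DATA = [
    (1, 10, "2020-02-12", 1),
    (2, 15, "2020-02-12", 1),
    (3, 20, "2020-02-12", 1),
    (4, 25, "2020-02-12", 1),
    (1, 10, "2020-02-12", 2),
    (2, 20, "2020-02-13", 2),
    (3, 20, "2020-02-12", 2),
    (4, 25, "2020-02-12", 2),
    (5, 30, "2020-02-13", 2),
    (6, 35, "2020-02-14", 2),
]

def totals_with_diff(rows):
    """Return (day, total, diff) tuples; diff is None for the first day,
    mirroring the NULL that LAG() yields when there is no previous row."""
    totals = defaultdict(int)
    for _id, amount, _date, day in rows:
        totals[day] += amount
    result = []
    previous = None
    for day in sorted(totals):
        total = totals[day]
        diff = None if previous is None else total - previous
        result.append((day, total, diff))
        previous = total
    return result

for row in totals_with_diff(HISTORICAL_DATA):
    print(row)
```

The diff of 70 for Day 2 is the same 30 + 35 + 5 the question asks for, because unchanged rows cancel out when the two daily totals are subtracted.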
Basically I am trying to calculate shots received in golf for various four balls, here is my data:-
DatePlayed PlayerID HCap Groups Hole01 Hole02 Hole03 Shots
----------------------------------------------------------------------
2018-11-10 001 15 2 7 3 6
2018-11-10 004 20 1 7 4 6
2018-11-10 025 20 2 7 4 5
2018-11-10 047 17 1 8 3 6
2018-11-10 048 20 2 8 4 6
2018-11-10 056 17 1 6 3 5
2018-11-10 087 18 1 7 3 5
I want to retrieve the above lines with an additional column, calculated from the value in the Groups column: (the player's handicap - the lowest handicap in the group) x .75.
I can achieve it with a GROUP BY, but then I need to aggregate everything. Is there a way to return the rows as above? Here is the query that returns the value:
SELECT
PlayerID,
MIN(Handicap),
MIN(Hole01) AS Hole01,
MIN(Hole02) AS Hole02,
MIN(Hole03) AS Hole03,
MIN(CourseID) AS CourseID,
Groups,
ROUND(
MIN((Handicap -
(SELECT MIN(Handicap) FROM Results AS t
WHERE DatePlayed='2018-11-10 00:00:00' AND t.Groups=Results.Groups)) *.75))
AS Shots
FROM
Results
WHERE
Results.DatePlayed='2018-11-10 00:00:00'
GROUP BY
DatePlayed, Groups, PlayerID
.
PlayerID  MIN(Handicap)  Hole01  Hole02  Hole03  CourseID  Groups  Shots
------------------------------------------------------------------------
4         20             7       4       6       1         1       2
47        17             8       3       6       1         1       0
56        17             6       3       5       1         1       0
87        18             7       3       5       1         1       1
1         15             7       3       6       1         2       0
25        20             7       4       5       1         2       4
48        20             8       4       6       1         2       4
Sorry about the formatting, I really couldn't see how to get my table in here. Any help will be much appreciated. I am using the latest MySQL from Ubuntu 18.04.
Not an answer; too long for a comment...
First off, I happily know nothing about golf, so what follows might not be optimal, but it must, at least, be a step in the right direction...
A normalized schema might look something like this...
rounds
round_id DatePlayed PlayerID HCap Groups
1 2018-11-10 1 15 2
2 2018-11-10 4 20 1
round_detail
round_id hole shots
1 1 7
1 2 3
1 3 6
2 1 7
2 2 4
2 3 6
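As a rough illustration of how the wide rows map onto those two tables, here is a Python sketch that splits each row into one rounds entry and one round_detail entry per hole. The function name `normalize` and the tuple layout are assumptions; round_id is simply assigned in input order.

```python
# Wide rows as in the question:
# (DatePlayed, PlayerID, HCap, Groups, Hole01, Hole02, Hole03)
WIDE_ROWS = [
    ("2018-11-10", 1, 15, 2, 7, 3, 6),
    ("2018-11-10", 4, 20, 1, 7, 4, 6),
]

def normalize(wide_rows):
    """Split wide rows into (rounds, round_detail) lists of tuples."""
    rounds, round_detail = [], []
    for round_id, row in enumerate(wide_rows, start=1):
        date_played, player_id, hcap, group, *holes = row
        rounds.append((round_id, date_played, player_id, hcap, group))
        # One detail row per hole, numbered from 1
        for hole, shots in enumerate(holes, start=1):
            round_detail.append((round_id, hole, shots))
    return rounds, round_detail

rounds, round_detail = normalize(WIDE_ROWS)
```

With this layout, adding an 18th hole means more rows rather than more columns.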
Hi guys, I have found the solution: I just need to drop the MIN immediately after the ROUND in the expression, so the query no longer needs a GROUP BY.
SELECT
PlayerID,
Handicap,
Hole01,
Hole02,
Hole03,
CourseID,
Groups,
ROUND((Handicap -
(SELECT MIN(Handicap) FROM Results AS t
WHERE DatePlayed='2018-11-10 00:00:00'
AND t.Groups=Results.Groups))
*.75) AS Shots
FROM
Results
WHERE
Results.DatePlayed='2018-11-10 00:00:00'
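To double-check the Shots column by hand, here is a Python sketch of the same calculation: find the lowest handicap per group, subtract it from each player's handicap, multiply by .75 and round. The name `shots_received` is an assumption, and so is the rounding mode: half-up is used on the assumption that Handicap is an integer column, in which case it should agree with MySQL's ROUND() for these non-negative values.

```python
from decimal import Decimal, ROUND_HALF_UP

# (PlayerID, Handicap, Groups) taken from the question's sample rows
PLAYERS = [
    (1, 15, 2), (4, 20, 1), (25, 20, 2), (47, 17, 1),
    (48, 20, 2), (56, 17, 1), (87, 18, 1),
]

def shots_received(players, factor=Decimal("0.75")):
    """Map PlayerID -> ROUND((handicap - lowest handicap in group) * factor)."""
    # Lowest handicap per group (the correlated subquery in the SQL)
    lowest = {}
    for _pid, hcap, group in players:
        lowest[group] = min(hcap, lowest.get(group, hcap))
    shots = {}
    for pid, hcap, group in players:
        raw = (hcap - lowest[group]) * factor
        shots[pid] = int(raw.quantize(Decimal("1"), rounding=ROUND_HALF_UP))
    return shots

for pid, shots in sorted(shots_received(PLAYERS).items()):
    print(pid, shots)
```

The values match the Shots column in the question's result, e.g. player 25 gets (20 - 15) x .75 = 3.75, rounded to 4.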
I'm learning Hive these days and have run into a problem.
I have a table called SAMPLE:
USER_ID PRODUCT_ID NUMBER
1 3 20
1 4 30
1 2 25
1 6 50
1 5 40
2 1 10
2 3 15
2 2 40
2 5 30
2 3 35
How can I use Hive to group the table by user_id, order the records within each group by NUMBER in descending order, and keep at most 3 records per group?
The result I want to have is like:
USER_ID PRODUCT_ID NUMBER(optional column)
1 6 50
1 5 40
1 4 30
2 2 40
2 3 35
2 5 30
or
USER_ID PRODUCT_IDs
1 [6,5,4]
2 [2,3,5]
Could someone help me?
Thanks very much!
Try this (the ordering is placed inside the window, because an ORDER BY in an inner subquery is not guaranteed to carry over to ROW_NUMBER()):
select user_id, product_id, number
from (
select user_id, product_id, number,
ROW_NUMBER() over (partition by user_id order by number desc) as RNUM
from SAMPLE
) t
where RNUM <= 3
output
1 6 50
1 5 40
1 4 30
2 2 40
2 3 35
2 5 30
Windowing functions such as ROW_NUMBER() need Hive 0.11 or later; let me know if your version is older.
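If it helps to see the logic outside Hive, here is a Python sketch of the same "top 3 per user" idea: group by user_id, sort each group by NUMBER descending, keep the first three. The function names are assumptions for illustration; the rows are the question's SAMPLE table.

```python
from collections import defaultdict

# (USER_ID, PRODUCT_ID, NUMBER) rows from the question's SAMPLE table
SAMPLE = [
    (1, 3, 20), (1, 4, 30), (1, 2, 25), (1, 6, 50), (1, 5, 40),
    (2, 1, 10), (2, 3, 15), (2, 2, 40), (2, 5, 30), (2, 3, 35),
]

def top_n_per_user(rows, n=3):
    """Return up to n rows per user_id, highest NUMBER first."""
    by_user = defaultdict(list)
    for user_id, product_id, number in rows:
        by_user[user_id].append((product_id, number))
    result = []
    for user_id in sorted(by_user):
        ranked = sorted(by_user[user_id], key=lambda r: r[1], reverse=True)
        result.extend((user_id, p, num) for p, num in ranked[:n])
    return result

def top_products(rows, n=3):
    """Second requested shape: user_id -> list of product ids."""
    products = defaultdict(list)
    for user_id, product_id, _number in top_n_per_user(rows, n):
        products[user_id].append(product_id)
    return dict(products)

print(top_n_per_user(SAMPLE))
print(top_products(SAMPLE))
```

Ties on NUMBER would be broken by input order here; the SQL version makes no such promise unless a tie-breaker column is added to the ORDER BY.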
My table votes contains votes that have been made by users at different times:
id item_id position user_id created_at
1 2 0 1 11/21/2013 11:27
26 1 1 1 11/21/2013 11:27
27 3 2 1 11/21/2013 11:27
42 2 2 1 12/7/2013 2:20
41 3 1 1 12/7/2013 2:20
40 1 0 1 12/7/2013 2:20
67 2 2 1 12/13/2013 1:13
68 1 1 1 12/13/2013 1:13
69 3 0 1 12/13/2013 1:13
84 2 0 1 12/28/2013 2:29
83 3 2 1 12/28/2013 2:29
82 1 1 1 12/28/2013 2:29
113 3 0 1 1/17/2014 22:08
114 1 1 1 1/17/2014 22:08
115 2 2 1 1/17/2014 22:08
138 2 0 1 1/20/2014 16:49
139 1 1 1 1/20/2014 16:49
140 3 2 1 1/20/2014 16:49
141 1 1 11 1/20/2014 16:51
142 3 2 11 1/20/2014 16:51
143 2 0 11 1/20/2014 16:51
I need to tally the results on a monthly basis but here's the tricky part: the start/end of the month does not necessarily fall on the first day of the month. So if the votes are due on the 10th day of every month, I need a vote that was cast on the 10th to be in a different group from a vote that was cast on the 11th. Using the data above, I want to get three groups:
Group 1: 6 votes (11/21 and 12/7)
Group 2: 6 votes (12/13, 12/28)
Group 3: 9 votes (1/17, 1/20)
I've tried a lot of approaches but to no avail. This is my query right now:
select created_at, ADDDATE(DATE_FORMAT(created_at, '%Y-%m-01'),interval 10 day) as duedate,count("id") from votes where list_id = 2 group by duedate
I am getting group sizes of 3, 9, and 9, not 6, 6 and 9. Any help you can provide would be much appreciated. Thanks.
Your query is close. You just need to subtract 10 days from the date, so that a vote cast on the 10th falls into the previous month while a vote cast on the 11th lands on the 1st:
select created_at, date_sub(created_at, interval 10 day) as duedate,
count(id)
from votes
where list_id = 2
group by duedate;
date_format() converts a date to a string. There is no need to convert a date value to a character value for this query.
EDIT:
To group by month:
select date_format(date_sub(created_at, interval 10 day), '%Y-%m') as YYYYMM,
count(id)
from votes
where list_id = 2
group by YYYYMM;
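To check the shift-and-group idea against the sample data, here is a small Python sketch: move every timestamp back by a fixed number of days, then count per shifted year-month. The name `votes_per_period` and the `shift_days` parameter are assumptions for illustration. In general, shifting by N days makes a new group start on day N+1 of each month.

```python
from collections import Counter
from datetime import datetime, timedelta

# created_at values from the question, one entry per vote row
VOTES = (
    ["11/21/2013 11:27"] * 3 + ["12/7/2013 2:20"] * 3 +
    ["12/13/2013 1:13"] * 3 + ["12/28/2013 2:29"] * 3 +
    ["1/17/2014 22:08"] * 3 + ["1/20/2014 16:49"] * 3 +
    ["1/20/2014 16:51"] * 3
)

def votes_per_period(created_at, shift_days):
    """Count votes per 'YYYY-MM' after moving each timestamp back by shift_days."""
    shift = timedelta(days=shift_days)
    periods = Counter()
    for text in created_at:
        moved = datetime.strptime(text, "%m/%d/%Y %H:%M") - shift
        periods[moved.strftime("%Y-%m")] += 1
    return dict(periods)

print(votes_per_period(VOTES, 10))
```

For the sample data this gives groups of 6, 6 and 9 votes, as the question expects. The shift amount only matters for votes cast right around the due date, so it is worth testing with dates on the 9th, 10th and 11th before relying on it.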
I have a database with workers, stations and sessions. A session describes which worker was at which station at which time. I managed to build a query that gives me the duration of the overlap for each pair of sessions.
SELECT
sA.station_id,
sA.worker_id AS worker1,
sB.worker_id AS worker2,
SEC_TO_TIME(
TIME_TO_SEC(LEAST(sA.end,sB.end)) - TIME_TO_SEC(GREATEST(sA.start,sB.start))
) AS overlap
FROM
`sessions` AS sA,
`sessions` AS sB
WHERE
sA.station_id = sB.station_id
AND
sA.station_id = 6
AND (
sA.start BETWEEN sB.start AND sB.end
OR
sA.end BETWEEN sB.start AND sB.end
)
With this query I get a result like this:
station_id worker1 worker2 overlap
6 1 1 09:00:00
6 2 1 02:30:00
6 5 1 00:00:00
6 1 1 09:00:00
6 2 1 01:30:00
6 3 1 09:00:00
...
6 12 3 02:00:00
6 14 3 01:00:00
6 17 3 02:00:00
...
What I would like now is to sum up the overlap for every combination of worker1 and worker2 to get the overall overlap duration.
I tried different ways of using SUM() and GROUP BY, but I never got the result I wanted.
SELECT
...
SEC_TO_TIME(
SUM(TIME_TO_SEC(LEAST(sA.end,sB.end)) - TIME_TO_SEC(GREATEST(sA.start,sB.start)))
) AS overlap
...
#has as result
station_id worker1 worker2 overlap
6 1 1 838:59:59
#in combination with
GROUP BY
worker1
#i get
station_id worker1 worker2 overlap
6 1 1 532:30:00
6 2 1 -33:00:00
6 3 1 270:30:00
6 5 1 598:30:00
6 6 1 542:00:00
6 7 1 508:00:00
6 8 5 53:00:00
6 9 1 54:30:00
6 10 1 310:00:00
6 11 1 -108:00:00
6 12 1 593:30:00
6 14 1 97:30:00
6 15 1 -53:30:00
6 17 1 293:30:00
The last result is close, but I am still missing a lot of combinations. I also don't understand why the combination 8 - 5 is displayed.
Thanks for your help (and for taking the time to read).
Aaargh, sorry for my stupidity, the solution was fairly simple:
....
SUM(((UNIX_TIMESTAMP(LEAST(sA.end,sB.end))-UNIX_TIMESTAMP(GREATEST(sA.start,sB.start)))/3600))
...
GROUP BY station_id, worker1, worker2
ORDER BY worker1, worker2
I switched to using timestamps and converting the result to hours with /3600, because my former approach with TIME_TO_SEC and SEC_TO_TIME only used the TIME part of the DATETIME field and therefore produced some wrong numbers. With MySQL 5.5 I could use TO_SECONDS, but unfortunately my server is still running 5.1.
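Here is a Python sketch of the same idea, summing overlap hours per (station, worker1, worker2) from full datetimes. The function name `overlap_hours` and the four sample sessions are made up for illustration (the first pair crosses midnight, which is where a time-of-day-only subtraction goes wrong). Note one deliberate difference from the query above: instead of the BETWEEN test, the sketch keeps any pair whose computed overlap is positive.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical sessions: (station_id, worker_id, start, end)
SESSIONS = [
    (6, 1, datetime(2013, 1, 1, 22, 0), datetime(2013, 1, 2, 2, 0)),
    (6, 2, datetime(2013, 1, 1, 23, 0), datetime(2013, 1, 2, 1, 0)),
    (6, 1, datetime(2013, 1, 3, 8, 0), datetime(2013, 1, 3, 12, 0)),
    (6, 2, datetime(2013, 1, 3, 10, 0), datetime(2013, 1, 3, 11, 30)),
]

def overlap_hours(sessions):
    """Sum overlap in hours per (station_id, worker1, worker2).
    Like the self-join, a session is also paired with itself."""
    totals = defaultdict(float)
    for st_a, w_a, start_a, end_a in sessions:
        for st_b, w_b, start_b, end_b in sessions:
            if st_a != st_b:
                continue
            # LEAST(end) - GREATEST(start), computed on full datetimes
            seconds = (min(end_a, end_b) - max(start_a, start_b)).total_seconds()
            if seconds > 0:
                totals[(st_a, w_a, w_b)] += seconds / 3600
    return dict(totals)

for key, hours in sorted(overlap_hours(SESSIONS).items()):
    print(key, hours)
```

Grouping by all three keys is what gives one row per worker combination; grouping by worker1 alone collapses them, which would explain the missing combinations in the earlier attempt.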