Double ORDER BY sort with UNION statement - mysql

(
SELECT *
FROM (
SELECT d
FROM myTable
WHERE id = "4h"
AND d < "2011-12-08 12:00:00"
ORDER BY d DESC
LIMIT 10
)tmp
ORDER BY d ASC
)
UNION (
SELECT d
FROM myTable
WHERE id = "4h"
AND d >= "2011-12-08 12:00:00"
ORDER BY d ASC
LIMIT 10
)
I'm trying to get the 10 results before and the 10 results after a particular datetime for a given id by using two SELECT statements and a UNION. The first SELECT uses ORDER BY d DESC to get the 10 preceding rows, and I then wrap it in a second ORDER BY d ASC so that all the results come out in ascending order, but for some reason this does not work.
Here is what I get currently for a result:
d
2011-12-08 08:00:00
2011-12-08 04:00:00
2011-12-08 00:00:00
2011-12-07 20:00:00
2011-12-07 16:00:00
2011-12-07 12:00:00
2011-12-07 08:00:00
2011-12-07 04:00:00
2011-12-07 00:00:00
2011-12-06 20:00:00 <- These top 10 results should be in ASC order!
2011-12-08 12:00:00
2011-12-08 16:00:00
2011-12-08 20:00:00
2011-12-09 00:00:00
2011-12-09 04:00:00
2011-12-09 08:00:00
2011-12-09 12:00:00
2011-12-09 16:00:00
2011-12-09 20:00:00
2011-12-11 20:00:00
And here is what I want:
d
2011-12-06 20:00:00
2011-12-07 00:00:00
2011-12-07 04:00:00
2011-12-07 08:00:00
2011-12-07 12:00:00
2011-12-07 16:00:00
2011-12-07 20:00:00
2011-12-08 00:00:00
2011-12-08 04:00:00
2011-12-08 08:00:00
2011-12-08 12:00:00
2011-12-08 16:00:00
2011-12-08 20:00:00
2011-12-09 00:00:00
2011-12-09 04:00:00
2011-12-09 08:00:00
2011-12-09 12:00:00
2011-12-09 16:00:00
2011-12-09 20:00:00
2011-12-11 20:00:00

(
SELECT d
FROM myTable
WHERE id = '4h' AND d < '2011-12-08 12:00:00'
ORDER BY d DESC
LIMIT 10
) UNION ALL (
SELECT d
FROM myTable
WHERE id = '4h' AND d >= '2011-12-08 12:00:00'
ORDER BY d ASC
LIMIT 10
)
ORDER BY d ASC
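The reason this works: an ORDER BY inside a parenthesized SELECT only decides which rows LIMIT keeps, while the order of the combined result is set by the ORDER BY placed after the last parenthesis. Below is a minimal sketch of the same idea run from Python. It assumes SQLite as a stand-in for MySQL (SQLite does not accept parenthesized SELECTs in a UNION, so each half is wrapped in a subquery instead), and the sample rows are hypothetical 4-hourly timestamps.

```python
import sqlite3
from datetime import datetime, timedelta

# Assumption: SQLite stands in for MySQL; table/column names follow the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE myTable (id TEXT, d TEXT)")

# Hypothetical sample data: one row every 4 hours starting 2011-12-06 00:00:00.
start = datetime(2011, 12, 6, 0, 0, 0)
rows = [
    ("4h", (start + timedelta(hours=4 * i)).strftime("%Y-%m-%d %H:%M:%S"))
    for i in range(40)
]
conn.executemany("INSERT INTO myTable VALUES (?, ?)", rows)

pivot = "2011-12-08 12:00:00"

# Inner ORDER BY + LIMIT pick the 10 nearest rows on each side;
# the trailing ORDER BY sorts the combined 20 rows.
query = """
SELECT d FROM (
    SELECT d FROM myTable WHERE id = ? AND d < ? ORDER BY d DESC LIMIT 10
)
UNION ALL
SELECT d FROM (
    SELECT d FROM myTable WHERE id = ? AND d >= ? ORDER BY d ASC LIMIT 10
)
ORDER BY d ASC
"""
result = [r[0] for r in conn.execute(query, ("4h", pivot, "4h", pivot))]

for d in result:
    print(d)
```

UNION ALL is enough here because the two halves cannot overlap (d < pivot versus d >= pivot), so no duplicate removal is needed.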

Related

Group By 3 columns (JobId, StartTime, EndTime) for continuous days in MySQL

I want to group by JobId, StartTime and EndTime, but only for continuous days. If a specific row doesn't form part of a range, it should be discarded. The Ids should also be pivoted into a single column per grouping.
Id  Date        StartTime  EndTime   JobId
1   2021-08-23  08:30:00   19:00:00  1
2   2021-08-24  08:30:00   19:00:00  1
3   2021-08-24  12:30:00   14:30:00  2
4   2021-08-24  15:30:00   19:00:00  1
5   2021-08-25  08:30:00   19:00:00  1
6   2021-08-25  12:30:00   14:30:00  2
7   2021-08-25  15:45:00   19:00:00  1
8   2021-08-26  08:30:00   09:30:00  1
9   2021-08-26  15:30:00   19:00:00  1
10  2021-08-26  10:30:00   11:00:00  1
11  2021-08-26  12:00:00   14:30:00  1
12  2021-08-27  08:30:00   09:30:00  1
13  2021-08-27  11:00:00   11:15:00  1
14  2021-08-27  11:30:00   14:30:00  1
15  2021-08-28  08:30:00   09:30:00  1
In the sample data above, 3 groupings form such a continuous range:
Range 1 consists of Ids 1, 2 & 5: 2021-08-23 to 2021-08-25, 08:30:00 to 19:00:00
Range 2 consists of Ids 3 & 6: 2021-08-24 to 2021-08-25, 12:30:00 to 14:30:00
Range 3 consists of Ids 8, 12 & 15: 2021-08-26 to 2021-08-28, 08:30:00 to 09:30:00
The end result should be:
JobId  StartDate   EndDate     StartTime  EndTime   Ids
1      2021-08-23  2021-08-25  08:30:00   19:00:00  1,2,5
2      2021-08-24  2021-08-25  12:30:00   14:30:00  3,6
1      2021-08-26  2021-08-28  08:30:00   09:30:00  8,12,15
I am using MySQL 8.0.23.
Assuming that the combination of JobId, `Date`, StartTime and EndTime is unique, you may use:
SELECT JobId,
MIN(`Date`) StartDate,
MAX(`Date`) EndDate,
StartTime,
EndTime,
GROUP_CONCAT(Id) Ids
FROM test
GROUP BY JobId,
StartTime,
EndTime
HAVING COUNT(*) > 1
AND DATEDIFF(EndDate, StartDate) = COUNT(*) - 1
ORDER BY StartDate, StartTime
https://dbfiddle.uk/?rdbms=mysql_8.0&fiddle=fce8590f72ac1d50cd9e89add3ed01e7
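The HAVING clause is what enforces "continuous days": with one row per day, a group of n rows is an unbroken run exactly when its last date is n - 1 days after its first. Note that a (JobId, StartTime, EndTime) combination with a gap anywhere is dropped entirely rather than split. Here is a rough sketch of the same check run from Python, assuming SQLite as a stand-in for MySQL, so julianday() arithmetic replaces DATEDIFF(); the rows are the sample data from the question.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; julianday() replaces DATEDIFF().
conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE test (Id INTEGER PRIMARY KEY, "Date" TEXT, '
    "StartTime TEXT, EndTime TEXT, JobId INTEGER)"
)
rows = [
    (1, "2021-08-23", "08:30:00", "19:00:00", 1),
    (2, "2021-08-24", "08:30:00", "19:00:00", 1),
    (3, "2021-08-24", "12:30:00", "14:30:00", 2),
    (4, "2021-08-24", "15:30:00", "19:00:00", 1),
    (5, "2021-08-25", "08:30:00", "19:00:00", 1),
    (6, "2021-08-25", "12:30:00", "14:30:00", 2),
    (7, "2021-08-25", "15:45:00", "19:00:00", 1),
    (8, "2021-08-26", "08:30:00", "09:30:00", 1),
    (9, "2021-08-26", "15:30:00", "19:00:00", 1),
    (10, "2021-08-26", "10:30:00", "11:00:00", 1),
    (11, "2021-08-26", "12:00:00", "14:30:00", 1),
    (12, "2021-08-27", "08:30:00", "09:30:00", 1),
    (13, "2021-08-27", "11:00:00", "11:15:00", 1),
    (14, "2021-08-27", "11:30:00", "14:30:00", 1),
    (15, "2021-08-28", "08:30:00", "09:30:00", 1),
]
conn.executemany("INSERT INTO test VALUES (?, ?, ?, ?, ?)", rows)

query = """
SELECT JobId,
       MIN("Date") AS StartDate,
       MAX("Date") AS EndDate,
       StartTime,
       EndTime,
       GROUP_CONCAT(Id) AS Ids
FROM test
GROUP BY JobId, StartTime, EndTime
HAVING COUNT(*) > 1
   AND CAST(julianday(MAX("Date")) - julianday(MIN("Date")) AS INTEGER) = COUNT(*) - 1
ORDER BY StartDate, StartTime
"""

# GROUP_CONCAT order is not guaranteed, so the ids are sorted in Python.
result = [
    (job, start, end, st, et, sorted(int(i) for i in ids.split(",")))
    for job, start, end, st, et, ids in conn.execute(query)
]

for row in result:
    print(row)
```

Ids 4 and 9 share the same JobId and times but are two days apart, so their group fails the date-span check and is discarded.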

Spark scala joining with subquery with limit

I need to join two tables on fake_id, but table 2 can contain more than one matching record per fake_id, so I need to match only the record where table2.end_time >= table1.event_time and table2.start_time <= table1.event_time.
If more than one record in table 2 matches this condition, I need to consider only the latest one by updated_time.
Here is what I tried:
spark.sql("select t1.fake_id, t1.attribute_1,t1.event_time,t22.end_time from table1 t1 left outer join (
select fake_id, end_time from table2 t2 where t2.fake_id=t1.fake_id and t2.end_time >= t1.event_time and t2.start_time <= t1.event_time order by t2.updated_time desc limit 1)
as t22 on t1.fake_id=t22.fake_id")
For the above statement, Spark throws an error about an unknown column t1.fake_id.
Table.1 -
---------------------------------------------------------------------------
fake_id attribute_1 event_time
---------------------------------------------------------------------------
1 attr_val_11 2020-08-01 05:00:00
2 attr_val_12 2020-08-01 15:00:00
3 attr_val_31 2020-08-03 07:00:00
4 attr_val_41 2020-08-01 05:00:00
Table.2 -
---------------------------------------------------------------------------
fake_id start_time end_time updated_time
---------------------------------------------------------------------------
1 2020-08-01 02:00:00 2020-08-01 08:00:00 2020-08-01 00:00:00
2 2020-08-01 04:00:00 2020-08-01 23:00:00 2020-08-01 00:00:00
3 2020-08-03 02:00:00 2020-08-03 08:00:00 2020-08-03 08:00:00
3 2020-08-03 05:00:00 2020-08-03 10:00:00 2020-08-03 12:00:00
3 2020-08-04 05:00:00 2020-08-04 10:00:00 2020-08-04 12:00:00
4 2020-08-01 08:00:00 2020-08-01 18:00:00 2020-08-01 18:00:00
4 2020-08-01 02:00:00 2020-08-01 05:00:00 2020-08-01 22:00:00
Result :
----------------------------------------------------------------------------------------------
fake_id attribute_1 event_time start_time end_time
----------------------------------------------------------------------------------------------
1 attr_val_11 2020-08-01 05:00:00 2020-08-01 02:00:00 2020-08-01 08:00:00
2 attr_val_12 2020-08-01 15:00:00 2020-08-01 04:00:00 2020-08-01 23:00:00
3 attr_val_31 2020-08-03 07:00:00 2020-08-03 05:00:00 2020-08-03 10:00:00
4 attr_val_41 2020-08-01 05:00:00 2020-08-01 02:00:00 2020-08-01 05:00:00
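Join with BETWEEN, number the matching rows with row_number() ordered by updated_time descending, and keep the first row of each partition, which is the one with the latest updated_time.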
spark.sql('''
select
fake_id,
attribute_1,
event_time,
start_time,
end_time
from (
select
t1.fake_id,
t1.attribute_1,
t1.event_time,
t2.start_time,
t2.end_time,
row_number() OVER (PARTITION BY t1.fake_id, t1.attribute_1 ORDER BY t2.updated_time DESC) as rank
from
table1 t1
left join
table2 t2
on
t1.fake_id = t2.fake_id and
t1.event_time between t2.start_time and t2.end_time) t
where
rank = 1
order by
fake_id
''').show()
+-------+-----------+-------------------+-------------------+-------------------+
|fake_id|attribute_1| event_time| start_time| end_time|
+-------+-----------+-------------------+-------------------+-------------------+
| 1|attr_val_11|2020-08-01 05:00:00|2020-08-01 02:00:00|2020-08-01 08:00:00|
| 2|attr_val_12|2020-08-01 15:00:00|2020-08-01 04:00:00|2020-08-01 23:00:00|
| 3|attr_val_31|2020-08-03 07:00:00|2020-08-03 05:00:00|2020-08-03 10:00:00|
| 4|attr_val_41|2020-08-01 05:00:00|2020-08-01 02:00:00|2020-08-01 05:00:00|
+-------+-----------+-------------------+-------------------+-------------------+
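For anyone who wants to try the "latest match per row" pattern without a Spark session, here is a small sketch that runs an equivalent query through Python's built-in sqlite3 module. This is an assumption-laden stand-in, not the Spark code itself: it needs an SQLite build with window functions (3.25 or newer), the alias rn replaces rank, and event_time is added to the partition so that two table1 rows sharing a fake_id would still be ranked separately.

```python
import sqlite3

# Assumption: SQLite (>= 3.25, for window functions) stands in for Spark SQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (fake_id INTEGER, attribute_1 TEXT, event_time TEXT)")
conn.execute(
    "CREATE TABLE table2 (fake_id INTEGER, start_time TEXT, end_time TEXT, updated_time TEXT)"
)
conn.executemany("INSERT INTO table1 VALUES (?, ?, ?)", [
    (1, "attr_val_11", "2020-08-01 05:00:00"),
    (2, "attr_val_12", "2020-08-01 15:00:00"),
    (3, "attr_val_31", "2020-08-03 07:00:00"),
    (4, "attr_val_41", "2020-08-01 05:00:00"),
])
conn.executemany("INSERT INTO table2 VALUES (?, ?, ?, ?)", [
    (1, "2020-08-01 02:00:00", "2020-08-01 08:00:00", "2020-08-01 00:00:00"),
    (2, "2020-08-01 04:00:00", "2020-08-01 23:00:00", "2020-08-01 00:00:00"),
    (3, "2020-08-03 02:00:00", "2020-08-03 08:00:00", "2020-08-03 08:00:00"),
    (3, "2020-08-03 05:00:00", "2020-08-03 10:00:00", "2020-08-03 12:00:00"),
    (3, "2020-08-04 05:00:00", "2020-08-04 10:00:00", "2020-08-04 12:00:00"),
    (4, "2020-08-01 08:00:00", "2020-08-01 18:00:00", "2020-08-01 18:00:00"),
    (4, "2020-08-01 02:00:00", "2020-08-01 05:00:00", "2020-08-01 22:00:00"),
])

# Rank every matching table2 row per table1 row, newest updated_time first,
# then keep only rank 1. The LEFT JOIN keeps table1 rows that have no match.
query = """
SELECT fake_id, attribute_1, event_time, start_time, end_time
FROM (
    SELECT t1.fake_id     AS fake_id,
           t1.attribute_1 AS attribute_1,
           t1.event_time  AS event_time,
           t2.start_time  AS start_time,
           t2.end_time    AS end_time,
           ROW_NUMBER() OVER (
               PARTITION BY t1.fake_id, t1.attribute_1, t1.event_time
               ORDER BY t2.updated_time DESC
           ) AS rn
    FROM table1 t1
    LEFT JOIN table2 t2
      ON t1.fake_id = t2.fake_id
     AND t1.event_time BETWEEN t2.start_time AND t2.end_time
) ranked
WHERE rn = 1
ORDER BY fake_id
"""
matches = conn.execute(query).fetchall()

for row in matches:
    print(row)
```

The original attempt failed because a derived table in the FROM clause cannot refer to columns of another table in the same FROM clause; moving the conditions into the join's ON clause and ranking afterwards avoids that correlation.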

Trying to get 24 hours of data from SQL

I am not very skilled at SQL so hopefully someone here can help me out.
I have a date_of_post column in my table which looks like this (example) 2015-08-31 11:00:00.
I use the INTERVAL 1 DAY to get the last 24 hours. However it returns more than the last 24 hours it seems. This is the query I use to fetch my data
SELECT DATE_ADD(date(t.date_of_post),
INTERVAL hour(t.date_of_post) HOUR) AS dateTime,
count(*) as entries
FROM `soc_stat` t
WHERE `main_tag` = 'morgenmad'
AND t.date_of_post > DATE_SUB(CURDATE(), INTERVAL 1 DAY)
GROUP BY date(t.date_of_post), hour(t.date_of_post)
And it returns the following:
2015-08-31 11:00:00 = 11
2015-08-31 12:00:00 = 2
2015-08-31 13:00:00 = 3
2015-08-31 14:00:00 = 3
2015-08-31 15:00:00 = 1
2015-08-31 16:00:00 = 3
2015-08-31 17:00:00 = 2
2015-08-31 19:00:00 = 1
2015-09-01 04:00:00 = 1
2015-09-01 05:00:00 = 3
2015-09-01 06:00:00 = 9
2015-09-01 07:00:00 = 33
2015-09-01 08:00:00 = 38
2015-09-01 09:00:00 = 29
2015-09-01 10:00:00 = 13
2015-09-01 11:00:00 = 12
2015-09-01 12:00:00 = 6
2015-09-01 13:00:00 = 5
I don't understand why 11:00:00, 12:00:00 and 13:00:00 exist in 2015-08-31 and 2015-09-01. Shouldn't it only return the last 24 hours?
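CURDATE() returns only today's date, so in a datetime comparison it acts as the beginning of today (00:00:00). DATE_SUB(CURDATE(), INTERVAL 1 DAY) is therefore midnight at the start of yesterday, not 24 hours ago. Replace CURDATE() with NOW().
A visual might help. If you use aliases, stick with them throughout. When you use aggregate functions like COUNT, group by all non-aggregate columns.
For me, NOW() is currently 2015-09-01 08:47:00.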
create table soc_stat
( id int auto_increment primary key,
main_tag varchar(20) not null,
date_of_post datetime not null
);
truncate table soc_stat;
insert soc_stat (main_tag,date_of_post) values ('morgenmad','2015-09-02 11:00:00');
insert soc_stat (main_tag,date_of_post) values ('morgenmad','2015-09-01 11:00:00');
insert soc_stat (main_tag,date_of_post) values ('morgenmad','2015-09-01 09:00:00');
insert soc_stat (main_tag,date_of_post) values ('morgenmad','2015-09-01 08:00:00');
insert soc_stat (main_tag,date_of_post) values ('morgenmad','2015-09-01 07:00:00');
insert soc_stat (main_tag,date_of_post) values ('morgenmad','2015-08-31 09:00:00');
insert soc_stat (main_tag,date_of_post) values ('morgenmad','2015-08-31 08:00:00');
insert soc_stat (main_tag,date_of_post) values ('morgenmad','2015-08-31 07:00:00');
SELECT date(t.date_of_post) dt, hour(t.date_of_post) hr,count(*) as entries
FROM `soc_stat` t
WHERE t.`main_tag` = 'morgenmad'
AND t.date_of_post between DATE_SUB(now(), INTERVAL 1 DAY) and now()
GROUP BY dt,hr
order by dt desc, hr desc;
+------------+------+---------+
| dt | hr | entries |
+------------+------+---------+
| 2015-09-01 | 8 | 1 |
| 2015-09-01 | 7 | 1 |
| 2015-08-31 | 9 | 1 |
+------------+------+---------+
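To see the difference between the two cutoffs outside the database, the sketch below mimics them with Python's datetime module. It is only an illustration under assumptions: "now" is fixed at the 2015-09-01 08:47:00 mentioned above, the posts are the eight rows inserted above, and the variable names are made up for this example.

```python
from datetime import datetime, timedelta

# Assumption: "now" is pinned to the time quoted in the answer.
now = datetime(2015, 9, 1, 8, 47, 0)

# CURDATE() carries no time part, so it compares as midnight today.
curdate = now.replace(hour=0, minute=0, second=0, microsecond=0)

cutoff_curdate = curdate - timedelta(days=1)  # DATE_SUB(CURDATE(), INTERVAL 1 DAY)
cutoff_now = now - timedelta(days=1)          # DATE_SUB(NOW(), INTERVAL 1 DAY)

posts = [
    datetime(2015, 9, 2, 11), datetime(2015, 9, 1, 11),
    datetime(2015, 9, 1, 9), datetime(2015, 9, 1, 8),
    datetime(2015, 9, 1, 7), datetime(2015, 8, 31, 9),
    datetime(2015, 8, 31, 8), datetime(2015, 8, 31, 7),
]

# Original filter: everything after midnight yesterday, with no upper bound.
since_curdate = [p for p in posts if p > cutoff_curdate]

# Corrected filter: BETWEEN (now - 1 day) AND now, inclusive on both ends.
last_24h = sorted(p for p in posts if cutoff_now <= p <= now)

print("CURDATE cutoff:", cutoff_curdate, "->", len(since_curdate), "rows")
print("NOW cutoff:    ", cutoff_now, "->", len(last_24h), "rows")
```

With these rows the CURDATE()-based filter lets all eight posts through, including the ones dated after "now", while the NOW()-based window keeps only the three that appear in the result table.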

How to Group the duplicate items in MySQL separately

I have a request table:
user_id no:of_mach time_start req_time
11 3 2012-12-12 09:00:00 2012-12-11 09:00:00
12 4 2012-12-14 08:00:00 2012-12-14 06:00:00
13 4 2012-12-12 09:00:00 2012-12-12 02:00:00
14 2 2013-12-12 07:00:00 2012-12-12 03:00:00
15 2 2012-12-14 08:00:00 2012-12-14 05:00:00
From the above table, I need to get the req_time of the users who have requested the same time_start.
The duplicate time_start are
2012-12-12 09:00:00 by user_id 11,13.
2012-12-14 08:00:00 by user_id 12,15.
Now, each of their request times is different.
I want a query that gives me the result as:
req_time of user requested for the time_start 2012-12-12 09:00:00 are:-
2012-12-11 09:00:00
2012-12-12 02:00:00
req_time of user requested for the time_start 2012-12-14 08:00:00 are:-
2012-12-14 06:00:00
2012-12-14 05:00:00
I have used a query:-
SELECT req_time FROM user_req WHERE user_id IN (SELECT o.user_id FROM user_req o INNER JOIN ( SELECT time_start, COUNT( * ) AS dupeCount FROM user_req GROUP BY time_start HAVING COUNT( * ) >1)oc ON o.time_start = oc.time_start) ORDER BY req_time ASC;
And this prints all the req_time together for all the duplicate time_start values..
The output will be :-
2012-12-11 09:00:00
2012-12-12 02:00:00
2012-12-14 06:00:00
2012-12-14 05:00:00
Can I have a query that groups the req_time values by each duplicate time_start, as explained above?
I can then call it from Java and use it in my program.
Please help.
Try this:
select * from user_req where time_start in
(select time_start
from user_req
group by time_start
having count(time_start) > 1)
order by time_start, req_time
This returns the rows whose time_start occurs more than once in the table, ordered by time_start and then req_time. To show only those two columns, replace the select * with the appropriate column names.
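Because the rows come back sorted by time_start, the caller can split them into one group per duplicated time_start in a single pass. The sketch below shows that step in Python with SQLite standing in for MySQL; both are assumptions made for a runnable illustration (the question's client is Java), and the no:of_mach column is left out because the query does not need it.

```python
import sqlite3
from itertools import groupby

# Assumption: SQLite stands in for MySQL; only the columns the query uses are created.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE user_req (user_id INTEGER, time_start TEXT, req_time TEXT)")
conn.executemany("INSERT INTO user_req VALUES (?, ?, ?)", [
    (11, "2012-12-12 09:00:00", "2012-12-11 09:00:00"),
    (12, "2012-12-14 08:00:00", "2012-12-14 06:00:00"),
    (13, "2012-12-12 09:00:00", "2012-12-12 02:00:00"),
    (14, "2013-12-12 07:00:00", "2012-12-12 03:00:00"),
    (15, "2012-12-14 08:00:00", "2012-12-14 05:00:00"),
])

query = """
SELECT time_start, req_time
FROM user_req
WHERE time_start IN (
    SELECT time_start
    FROM user_req
    GROUP BY time_start
    HAVING COUNT(time_start) > 1
)
ORDER BY time_start, req_time
"""
rows = conn.execute(query).fetchall()

# groupby only merges adjacent rows, which is safe here because the
# query already orders by time_start.
grouped = {
    time_start: [req_time for _, req_time in group]
    for time_start, group in groupby(rows, key=lambda r: r[0])
}

for time_start, req_times in grouped.items():
    print(time_start, "->", req_times)
```

The same single-pass grouping can be done in Java by starting a new list whenever time_start changes while iterating over the result set.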

How to get the latest datetime from multiple same dates in MySQL

How do I get the latest datetime when several rows fall on the same date in MySQL?
SELECT start_time FROM times WHERE start_time BETWEEN '2013-01-27' AND '2013-02-02' ORDER BY start_time
This outputs:
2013-01-27 00:00:00
2013-01-28 09:00:00
2013-01-29 00:00:00
2013-01-30 09:00:00
2013-01-31 00:00:00
2013-02-01 09:00:00
2013-02-01 21:00:00
2013-02-02 00:00:00
I want all of this in the output, except that for 2013-02-01 I want only the latest datetime,
so the output would look like this:
2013-01-27 00:00:00
2013-01-28 09:00:00
2013-01-29 00:00:00
2013-01-30 09:00:00
2013-01-31 00:00:00
2013-02-01 21:00:00 <<<<<<<<
2013-02-02 00:00:00
SELECT MAX(start_time) AS latest_start_time
FROM times
WHERE start_time BETWEEN '2013-01-27 00:00:00' AND '2013-02-02 23:59:59'
GROUP BY DATE(start_time)
ORDER BY latest_start_time
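The idea is to group on the date part only and take the maximum datetime within each day. A quick way to check it against the sample rows from the question is the sketch below, which assumes SQLite as a stand-in for MySQL (both provide a DATE() function that strips the time part) and sorts by an alias for the aggregate, here called latest.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; rows are the ones listed in the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE times (start_time TEXT)")
conn.executemany("INSERT INTO times VALUES (?)", [
    ("2013-01-27 00:00:00",),
    ("2013-01-28 09:00:00",),
    ("2013-01-29 00:00:00",),
    ("2013-01-30 09:00:00",),
    ("2013-01-31 00:00:00",),
    ("2013-02-01 09:00:00",),
    ("2013-02-01 21:00:00",),
    ("2013-02-02 00:00:00",),
])

# One output row per calendar day: the largest datetime of that day.
query = """
SELECT MAX(start_time) AS latest
FROM times
WHERE start_time BETWEEN '2013-01-27 00:00:00' AND '2013-02-02 23:59:59'
GROUP BY DATE(start_time)
ORDER BY latest
"""
latest_per_day = [r[0] for r in conn.execute(query)]

for value in latest_per_day:
    print(value)
```

Ordering by the aggregate's alias rather than the raw column keeps the query valid when MySQL runs with ONLY_FULL_GROUP_BY enabled.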