My database contains a single table with many data columns. The following simplified representation shows the columns I am currently interested in:
Table Playlist:
id  metadata             lastplayed
===============================================
1   All Night            1571324631
2   Perfect Replacement  1571324767
3   One More Day         1571324952
4   Stay Awake           1571325184
5   Perfect Replacement  1571325386
6   All Night            1571325771
7   Close Enemies        1571326422
I already have a View which groups the metadata, so I can see all single occurrences of the songs and when they were last played (epoch seconds).
View 'Music' (desired result):
id  metadata             lastplayed  count (*)
==============================================================
3   One More Day         1571324952  1
4   Stay Awake           1571325184  1
5   Perfect Replacement  1571325386  2
6   All Night            1571325771  2
7   Close Enemies        1571326422  1
The column "count" does not yet exist in the View, and I would like to include it via the existing SQL script that creates the View:
CREATE VIEW `Music` AS
SELECT
t1.`id`,
t1.`metadata`,
t1.`lastplayed`
FROM Playlist t1
INNER JOIN
(
SELECT `metadata`, MAX(`lastplayed`) AS `timestamp`
FROM Playlist
GROUP BY `metadata`
) t2
ON t1.`metadata` = t2.`metadata` AND t1.`lastplayed` = t2.`timestamp`
ORDER BY t1.`Id` ASC
My problem is where and how to put the COUNT(metadata) AS count line to get the desired result. When I add it to the top-level SELECT, the result collapses to a single row with one song and the count of all rows.
Put it in the inner select
CREATE VIEW `Music` AS
SELECT
t1.`id`,
t1.`metadata`,
t1.`lastplayed`,
t2.count
FROM Playlist t1
INNER JOIN
(
SELECT `metadata`, MAX(`lastplayed`) AS `timestamp`, COUNT(*) AS count
FROM Playlist
GROUP BY `metadata`
) t2
ON t1.`metadata` = t2.`metadata` AND t1.`lastplayed` = t2.`timestamp`
ORDER BY t1.`Id` ASC
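To check the inner-select approach against the sample data from the question, here is a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in for MySQL, and the aliases last_ts and play_count are my own names, not part of the original script.

```python
import sqlite3

# In-memory SQLite database as a stand-in for the MySQL database (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Playlist (id INTEGER PRIMARY KEY, metadata TEXT, lastplayed INTEGER)"
)
conn.executemany(
    "INSERT INTO Playlist VALUES (?, ?, ?)",
    [
        (1, "All Night", 1571324631),
        (2, "Perfect Replacement", 1571324767),
        (3, "One More Day", 1571324952),
        (4, "Stay Awake", 1571325184),
        (5, "Perfect Replacement", 1571325386),
        (6, "All Night", 1571325771),
        (7, "Close Enemies", 1571326422),
    ],
)

# The count is computed in the grouped subquery and carried out through the join.
# last_ts and play_count are illustrative alias names.
conn.execute(
    """
    CREATE VIEW Music AS
    SELECT t1.id, t1.metadata, t1.lastplayed, t2.play_count
    FROM Playlist t1
    INNER JOIN (
        SELECT metadata, MAX(lastplayed) AS last_ts, COUNT(*) AS play_count
        FROM Playlist
        GROUP BY metadata
    ) t2 ON t1.metadata = t2.metadata AND t1.lastplayed = t2.last_ts
    """
)

music_rows = conn.execute("SELECT * FROM Music ORDER BY id").fetchall()
for row in music_rows:
    print(row)
```

The ordering is applied when reading from the view rather than inside it, which keeps the view definition free of presentation concerns.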
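You just need a simple aggregation, grouping by the metadata column. Note that MAX(id) only matches the desired result if id increases along with lastplayed, as it does in the sample data: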
CREATE OR REPLACE VIEW `Music` AS
SELECT MAX(id) AS id, `metadata`, MAX(`lastplayed`) AS lastplayed, COUNT(*) AS count
FROM Playlist
GROUP BY `metadata`
This is my query.
select trackingbatches.batchnumber, requireddate, jobno, frames.frame_id, frame_no, groupdesc, finishdesc, finish2desc, BOUGHTINFRAME, NAME, MAX(statustimestamp) from JOBQUOTEHEADER
right join trackingbatches on JOBQUOTEHEADER.header_id=trackingbatches.header_id
right join frames on trackingbatches.header_Id=frames.header_id
right join trackingstagesettings on trackingbatches.status=trackingstagesettings.stage_id
where requireddate between current_date-1 and current_date
GROUP BY 1,2,3,4,5,6,7,8,9,10
ORDER BY JOBNO
However, I do not want to group by 'name'. I want the query to select the latest 'statustimestamp', but because it also groups by name, it returns one row per stage for the same frame, identical in every other detail. name refers to the stage of the frame in our factory.
BATCHNUMBER  REQUIREDDATE  JOBNO   FRAME_ID  FRAME_NO  GROUPDESC     FINISHDESC  FINISH2DESC  BOUGHTINFRAME  NAME    STATUSTIMESTAMP
5079         01.09         5STAR1  1         1         INT CASEMENT  STD WHITE   N/A          0              CUT     16.08.2021
5079         01.09         5STAR1  1         1         INT CASEMENT  STD WHITE   N/A          0              LOADED  02.09.2021
As you can see from the two records above, this is the same frame, but I only want one instance of it in my results: the one with the latest status, which is the second row with 'LOADED' in the field called name. So I want the max statustimestamp for the latest instance of the frame, but because the query also groups by the field 'name', I can't get just the latest instance by itself.
You can rank the rows of your current query (where you see duplicates), something like the following:
ROW_NUMBER() assigns a row number to each row per batchnumber, ordered by statustimestamp descending.
Once the rank exists, filter on rank = 1 to get the latest row per batchnumber.
This way you don't need to worry about NAME in the GROUP BY, because it is not in the PARTITION BY clause of ROW_NUMBER() [which ranks by batchnumber alone; add more columns there, such as frame_id, if you need one row per something finer]
WITH data AS (
select trackingbatches.batchnumber, requireddate, jobno, frames.frame_id, frame_no, groupdesc, finishdesc, finish2desc, BOUGHTINFRAME, NAME, MAX(statustimestamp) AS STATUSTIMESTAMP from JOBQUOTEHEADER
right join trackingbatches on JOBQUOTEHEADER.header_id=trackingbatches.header_id
right join frames on trackingbatches.header_Id=frames.header_id
right join trackingstagesettings on trackingbatches.status=trackingstagesettings.stage_id
where requireddate between current_date-1 and current_date
GROUP BY 1,2,3,4,5,6,7,8,9,10
),
rank_data_by_date AS (
SELECT
data.*
,
ROW_NUMBER() OVER(PARTITION BY BATCHNUMBER ORDER BY STATUSTIMESTAMP DESC) AS RANK_
FROM data
)
SELECT * FROM rank_data_by_date WHERE RANK_ = 1
ORDER BY JOBNO
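Here is a small runnable sketch of the ROW_NUMBER() idea, using Python's sqlite3 module (SQLite 3.25 or newer is needed for window functions). The table frame_status is a hypothetical, flattened stand-in for the joined result, and the third data row is made up to show a second frame. I partition by batchnumber and frame_id here, on the assumption that one row per frame is what is wanted.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical flattened table standing in for the four-table join in the question.
conn.execute(
    "CREATE TABLE frame_status ("
    "batchnumber INTEGER, frame_id INTEGER, name TEXT, statustimestamp TEXT)"
)
conn.executemany(
    "INSERT INTO frame_status VALUES (?, ?, ?, ?)",
    [
        (5079, 1, "CUT", "2021-08-16"),
        (5079, 1, "LOADED", "2021-09-02"),
        (5079, 2, "CUT", "2021-08-17"),  # made-up second frame for illustration
    ],
)

# Rank the rows of each frame from newest to oldest, then keep the newest one.
latest_rows = conn.execute(
    """
    WITH ranked AS (
        SELECT batchnumber, frame_id, name, statustimestamp,
               ROW_NUMBER() OVER (
                   PARTITION BY batchnumber, frame_id
                   ORDER BY statustimestamp DESC
               ) AS rank_
        FROM frame_status
    )
    SELECT batchnumber, frame_id, name, statustimestamp
    FROM ranked
    WHERE rank_ = 1
    ORDER BY batchnumber, frame_id
    """
).fetchall()

for row in latest_rows:
    print(row)
```

Because name is not part of the PARTITION BY, the stage of the newest row simply comes along with it.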
I have 3 tables, one for website connection errors, one for successful website connections and another with name/location of each specific website.
Table 1 has WebsiteClass_ID, Website_ID and Error_Date
Table 2 has WebsiteClass_ID, Website_ID and Success_Date
Table 3 has WebsiteClass_ID, Website Name and Location
I need to return the rate of error by WebsiteClass_ID by Website_ID per day. To do this, I need the count of errors per WebsiteClass_ID, Website_ID and Date from Table 1 and the count of successful connections per WebsiteClass_ID, Website_ID and Date from Table 2. I still need to return Website Name and Location from table 3 as well. The date field is different in Table 1 than it is in Table 2.
I can easily get the count for each in two different queries but would prefer to accomplish this in one query to avoid extra work in Excel. I created the two individual queries below but do not know how to merge them.
#QUERY
#TITLE-WEBSEROR
#SUBJECT-WEBSITE ERRORS PER DAY BY CLASS AND ID
SELECT
A.WEBSITE_CLASS_ID AS WEBSITE_CLASS_ID
,A.WEBSITE_ID AS WEBSITE_ID
,A.ERROR_DATE AS DATE_OF_ERROR
,COUNT(A.EVENT_NAME) AS NUMBER_OF_ERRORS
,B.NAME AS WEBSITE_NAME
,B.LOCATION AS COMPANY_LOCATION
FROM
&DATABASE..ERRORS A
,&DATABASE..DETAILS B
WHERE
A.WEBSITE_ID = B.WEBSITE_ID
GROUP BY A.WEBSITE_CLASS_ID, A.WEBSITE_ID, A.ERROR_DATE, B.NAME, B.LOCATION
#QUERY
#TITLE-WEBSCNFM
#SUBJECT-SUCCESSFUL CONNECTIONS PER DAY BY CLASS AND ID
SELECT
C.WEBSITE_CLASS_ID AS WEBSITE_CLASS_ID
,C.WEBSITE_ID AS WEBSITE_ID
,DATE(C.SUCCESS_DATE) AS SUCCESSFUL_CONNECTION
,COUNT(C.SUCCESS) AS COUNT_SUCCESS_CNCTN
,B.NAME AS WEBSITE_NAME
,B.LOCATION AS COMPANY_LOCATION
FROM
&DATABASE..SUCCESS C
,&DATABASE..DETAILS B
WHERE
C.WEBSITE_ID = B.WEBSITE_ID
GROUP BY C.WEBSITE_CLASS_ID, C.WEBSITE_ID, DATE(C.SUCCESS_DATE), B.NAME, B.LOCATION
Data Sample:
Table 1: Errors
Table 2: Success
Table 3: Details
Expected Results :
Website_Class_ID  Website_ID  Date of Error or Success  Count of Errors  Count of Success  Website Name  Website Location
ClassB            ID 2        12/1/2019                 3                5                 Website #1    USA
ClassC            ID 3        12/2/2019                 1                6                 Website #2    Canada
SELECT
`Errors$`.WEBSITE_CLASS_ID
,`Errors$`.WEBSITE_ID
,`Errors$`.ERROR_DATE
,COUNT(`Errors$`.EVENT_NAME)
,`Details$`.NAME
,`Details$`.LOCATION
FROM
`D:\mike\SnapCommerce Case Study\Data.xlsx`.`Errors$` `Errors$`
INNER JOIN `D:\mike\SnapCommerce Case Study\Data.xlsx`.`Details$`
`Details$`
ON `Details$`.WEBSITE_ID = `Errors$`.WEBSITE_ID
GROUP BY `Errors$`.WEBSITE_CLASS_ID, `Errors$`.WEBSITE_ID,
`Errors$`.ERROR_DATE, `Details$`.NAME, `Details$`.LOCATION
UNION
SELECT
`Success$`.WEBSITE_CLASS_ID
,`Success$`.WEBSITE_ID
,DATE(`Success$`.SUCCESS_DATE)
,COUNT(`Success$`.SUCCESS)
,`Details$`.NAME
,`Details$`.LOCATION
FROM
`D:\mike\SnapCommerce Case Study\Data.xlsx`.`Success$` `Success$`
INNER JOIN `D:\mike\SnapCommerce Case Study\Data.xlsx`.`Details$`
`Details$`
ON `Details$`.WEBSITE_ID = `Success$`.WEBSITE_ID
GROUP BY `Success$`.WEBSITE_CLASS_ID, `Success$`.WEBSITE_ID,
`Success$`.SUCCESS_DATE, `Details$`.NAME, `Details$`.LOCATION
To stack the two result sets vertically, you can use UNION; this also eliminates duplicate rows.
If you need to keep the duplicates, use UNION ALL.
SELECT
A.WEBSITE_CLASS_ID AS WEBSITE_CLASS_ID
,A.WEBSITE_ID AS WEBSITE_ID
,A.ERROR_DATE AS DATE_OF_ERROR
,COUNT(A.EVENT_NAME) AS NUMBER_OF_ERRORS
,B.NAME AS WEBSITE_NAME
,B.LOCATION AS COMPANY_LOCATION
FROM
&DATABASE..ERRORS A
INNER JOIN &DATABASE..DETAILS B
ON A.WEBSITE_ID = B.WEBSITE_ID
GROUP BY A.WEBSITE_CLASS_ID, A.WEBSITE_ID, A.ERROR_DATE, B.NAME, B.LOCATION
UNION
SELECT
C.WEBSITE_CLASS_ID AS WEBSITE_CLASS_ID
,C.WEBSITE_ID AS WEBSITE_ID
,DATE(C.SUCCESS_DATE) AS SUCCESSFUL_CONNECTION
,COUNT(C.SUCCESS) AS COUNT_SUCCESS_CNCTN
,B.NAME AS WEBSITE_NAME
,B.LOCATION AS COMPANY_LOCATION
FROM
&DATABASE..SUCCESS C
INNER JOIN
&DATABASE..DETAILS B
ON C.WEBSITE_ID = B.WEBSITE_ID
GROUP BY C.WEBSITE_CLASS_ID, C.WEBSITE_ID, DATE(C.SUCCESS_DATE), B.NAME, B.LOCATION
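If the goal is the side-by-side layout from the expected results (one row per class, website and day with both counts), one possible approach is to stack the raw rows with UNION ALL and aggregate afterwards. This is only a sketch in Python's sqlite3, standing in for the real database. The table layouts, event names and timestamps are assumptions; only the counts and website details come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Assumed table layouts based on the column lists in the question.
conn.executescript(
    """
    CREATE TABLE errors  (website_class_id TEXT, website_id INTEGER, error_date TEXT, event_name TEXT);
    CREATE TABLE success (website_class_id TEXT, website_id INTEGER, success_date TEXT, success INTEGER);
    CREATE TABLE details (website_id INTEGER, name TEXT, location TEXT);
    """
)
# Made-up rows chosen to reproduce the counts in the expected results.
conn.executemany(
    "INSERT INTO errors VALUES (?, ?, ?, ?)",
    [("ClassB", 2, "2019-12-01", "TIMEOUT")] * 3
    + [("ClassC", 3, "2019-12-02", "TIMEOUT")] * 1,
)
conn.executemany(
    "INSERT INTO success VALUES (?, ?, ?, ?)",
    [("ClassB", 2, "2019-12-01 09:15:00", 1)] * 5
    + [("ClassC", 3, "2019-12-02 14:30:00", 1)] * 6,
)
conn.executemany(
    "INSERT INTO details VALUES (?, ?, ?)",
    [(2, "Website #1", "USA"), (3, "Website #2", "Canada")],
)

# Tag every raw row as an error or a success, stack them, then sum the tags per day.
combined_rows = conn.execute(
    """
    SELECT u.website_class_id, u.website_id, u.event_date,
           SUM(u.is_error)   AS number_of_errors,
           SUM(u.is_success) AS number_of_successes,
           d.name, d.location
    FROM (
        SELECT website_class_id, website_id, error_date AS event_date,
               1 AS is_error, 0 AS is_success
        FROM errors
        UNION ALL
        SELECT website_class_id, website_id, DATE(success_date), 0, 1
        FROM success
    ) u
    JOIN details d ON d.website_id = u.website_id
    GROUP BY u.website_class_id, u.website_id, u.event_date, d.name, d.location
    ORDER BY u.event_date
    """
).fetchall()

for row in combined_rows:
    print(row)
```

UNION ALL matters here: plain UNION would collapse identical tagged rows and undercount. Days that have only errors or only successes still appear, with 0 in the other column.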
I am trying to list several products on a page. My query returns multiples of the same product and I am trying to figure out how to limit it to one only with my query.
The primary key on the first table that we will call table_one is ID.
The second table has a column of ProductID that references the primary key on table_one.
My query brings back multiple rows for my ProductID equal to 6 below. I just want one result to be brought back, BUT I still want all of my data in DateReserved on table_two to be queried. I'm pretty sure I need to add one more thing to my query, but I have not had much luck.
The results I want back are as follows.
ID  Productname  Quantity  Image          Date Reserved  SumQuantity
6   productOne   6         'image.jpg'    03-31-2013     3
7   productTwo   1         'product.jpg'  04-04-2013     1
Here is my first table. table_one
ID  Productname  Quantity  Image
6   productOne   6         'image.jpg'
7   productTwo   1         'product.jpg'
Here is my second table. table_two
ID  ProductID  DateReserved  QuantityReserved
1   6          03-31-2013    3
2   6          04-07-2013    2
3   7          04-04-2013    1
Here is my query that I am trying to use.
SELECT *
FROM `table_one`
LEFT JOIN `table_two`
ON `table_one`.`ID` = `table_two`.`ProductID`
WHERE `table_one`.`Quantity` > 0
OR `table_two`.`DateReserved` + INTERVAL 5 DAY <= '2013-03-27'
ORDER BY ProductName
Sorry for posting another answer, but as it seems my first try on it was not so good ;)
To only get one result row per reservation you need to sum them up somehow.
First I suggest you explicitly select the columns you want back in your result and don't use "*".
I suggest you try something like this:
SELECT
`table_one`.`ID`, `table_one`.`Productname`, `table_one`.`Image`, `table_one`.`Quantity`,
`table_two`.`ProductID`, SUM(`table_two`.`QuantityReserved`)
FROM
`table_one`
LEFT JOIN
`table_two` ON `table_one`.`ID` = `table_two`.`ProductID`
WHERE
`table_one`.`Quantity` > 0
OR `table_two`.`DateReserved` + INTERVAL 5 DAY <= '2013-03-27'
GROUP BY `table_one`.`ID`
ORDER BY ProductName
As you see I used "SUM" to get a combined quantity. This is called aggregation, and the "GROUP BY" helps you get rid of multiple occurrences of the same ProductID.
One problem that you have now is that you will have to get the reservation date from a separate query (well, at least I am unsure right now how you would get it into the same query)
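One way the reservation date might be pulled into the same query is with another aggregate, for example MIN(DateReserved) for the earliest reservation. The sketch below runs against the sample tables using Python's sqlite3 as a stand-in for MySQL. The dates are stored as ISO strings, and the DateReserved condition from the original WHERE clause is left out for brevity; both are my simplifications.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE table_one (ID INTEGER PRIMARY KEY, Productname TEXT, Quantity INTEGER, Image TEXT);
    CREATE TABLE table_two (ID INTEGER PRIMARY KEY, ProductID INTEGER, DateReserved TEXT, QuantityReserved INTEGER);
    """
)
conn.executemany(
    "INSERT INTO table_one VALUES (?, ?, ?, ?)",
    [(6, "productOne", 6, "image.jpg"), (7, "productTwo", 1, "product.jpg")],
)
conn.executemany(
    "INSERT INTO table_two VALUES (?, ?, ?, ?)",
    [(1, 6, "2013-03-31", 3), (2, 6, "2013-04-07", 2), (3, 7, "2013-04-04", 1)],
)

# One row per product: earliest reservation date plus the summed reserved quantity.
product_rows = conn.execute(
    """
    SELECT o.ID, o.Productname, o.Quantity, o.Image,
           MIN(t.DateReserved)     AS FirstDateReserved,
           SUM(t.QuantityReserved) AS SumQuantity
    FROM table_one o
    LEFT JOIN table_two t ON o.ID = t.ProductID
    WHERE o.Quantity > 0
    GROUP BY o.ID, o.Productname, o.Quantity, o.Image
    ORDER BY o.Productname
    """
).fetchall()

for row in product_rows:
    print(row)
```

Note that the sum for productOne comes out as 5 (3 + 2), not the 3 shown in the desired output, because both reservations are counted. If only the earliest reservation's quantity is wanted, a different query is needed.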
Since you are using MySQL
LIMIT <NUMBER>
should do exactly what you want. Insert it after your ORDER BY clause, but you should probably also add one more ordering criterion, so you can be sure that you always get the row you wanted and not just some "random" row ;)
So without further ordering your query would look like this:
SELECT
*
FROM `table_one`
LEFT JOIN `table_two` ON `table_one`.`ID` = `table_two`.`ProductID`
WHERE
`table_one`.`Quantity` > 0
OR `table_two`.`DateReserved` + INTERVAL 5 DAY <= '2013-03-27'
ORDER BY ProductName
LIMIT 1
here some more description about that
SELECT a.member_id,a.member_name,a.gender,a.amount,b.trip_id,c.location
FROM tbl_member a
LEFT JOIN (SELECT trip_id, MAX(amount) as amount FROM tbl_member GROUP BY trip_id ) b ON a.trip_id= b.trip_id
LEFT JOIN tbl_trip c ON a.trip_id=c.trip_id
ORDER BY member_name
Here is a simplified version of my table:
group price spec
a 1 .
a 2 ..
b 1 ...
b 2
c .
. .
. .
I'd like to produce a result like this: (I'll refer to this as result_table)
price_a |spec_a |price_b |spec_b |price_c ...|total_cost
1 |. |1 |.. |... |
(min) (min) =1+1+...
Basically I want to:
1. select the rows containing the min price within each group
2. combine columns into a single row
I know this can be done using several queries and/or combined with some non-SQL processing on the results, but I suspect that there may be better solutions.
The reason that I want to do task 2 (combine columns into a single row)
is because I want to do something like the following with the result_table:
select *,
(result_table.total_cost + table1.price + table2.price) as total_combined_cost
from result_table
right join table1
right join table2
This may be too much to ask for, so here are some other thoughts on the problem:
Instead of trying to combine multiple rows(task 2), store them in a temporary table
(which would be easier to calculate the total_cost using sum)
Feel free to drop any thoughts, don't have to be complete answer, I feel it's brilliant enough if you have an elegant way to do task 1 !
==Edited/Added 6 Feb 2012==
The goal of my program is to identify best combinations of items with minimal cost (and preferably possess higher utilitarian value at the same time).
Considering @ypercube's comment about a large number of groups, a temporary table seems to be the only feasible solution. It was also pointed out that there is no pivoting function in MySQL (although it can be implemented, it's not necessary to perform such an operation).
Okay, after studying @Johan's answer, I'm thinking about something like this for task 1:
select * from
(
select * from
result_table
order by price asc
) as ordered_table
group by `group`
;
Although it looks dodgy, it seems to work.
==Edited/Added 7 Feb 2012==
Since more than one combination may produce the same min value, I have modified my answer:
select result_table.* from
(
select * from
(
select * from
result_table
order by price asc
) as ordered_table
group by `group`
) as single_min_table
inner join result_table
on result_table.`group` = single_min_table.`group`
and result_table.price = single_min_table.price
;
However, I have just realised that there is another problem I need to deal with:
I cannot ignore the spec entirely, since there is a provider property: items from different providers may or may not be able to be assembled together. To be safe (and to simplify my problem), I have decided to combine items from the same provider only, so the problem becomes:
For example if I have an initial table like this(with only 2 groups and 2 providers):
id group price spec provider
1 a 1 . x
2 a 2 .. y
3 a 3 ... y
4 b 1 ... y
5 b 2 x
6 b 3 z
I need to combine
id group price spec provider
1 a 1 . x
5 b 2 x
and
2 a 2 .. y
4 b 1 ... y
Record (id 6) can be eliminated from the choices since its provider does not have all the groups available.
So it's not necessary to select only the min of each group; rather, the goal is to select one item from each group so that for each provider I have a minimal combined cost.
You cannot pivot in MySQL, but you can group results together.
The GROUP_CONCAT function will give you a result like this:
column A column B column c column d
groups specs prices sum(price)
a,b,c some,list,xyz 1,5,7 13
Here's a sample query:
(The query assumes you have a primary (or unique) key called id defined on the target table).
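SELECT
GROUP_CONCAT(a.`group` ORDER BY a.`group`) as `groups`
,GROUP_CONCAT(a.spec ORDER BY a.`group`) as specs
,GROUP_CONCAT(a.price ORDER BY a.`group`) as prices
,SUM(a.price) as total_of_min_prices
FROM
( SELECT t.price, t.spec, t.`group` FROM table1 t
WHERE t.id IN
-- lowest id among the rows that carry each group's minimum price
(SELECT MIN(c.id) FROM table1 c
JOIN (SELECT `group`, MIN(price) AS min_price FROM table1 GROUP BY `group`) m
ON m.`group` = c.`group` AND m.min_price = c.price
GROUP BY c.`group`)
) AS a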
See: http://dev.mysql.com/doc/refman/5.0/en/group-by-functions.html
Producing the total_cost only:
SELECT SUM(min_price) AS total_cost
FROM
( SELECT MIN(price) AS min_price
FROM TableX
GROUP BY `group`
) AS grp
If a result set with the minimum prices returned in rows (not in columns) per group is fine, then your problem is of the greatest-n-per-group type. There are various methods to solve it. Here's one:
SELECT tg.grp,
tm.price AS min_price,
tm.spec
FROM
( SELECT DISTINCT `group` AS grp
FROM TableX
) AS tg
JOIN
TableX AS tm
ON
tm.PK = -- PK stands for the Primary Key of the table
( SELECT tmin.PK
FROM TableX AS tmin
WHERE tmin.`group` = tg.grp
ORDER BY tmin.price ASC
LIMIT 1
)
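For the provider constraint added in the question's later edit (combine only items from the same provider, and drop providers that do not cover every group), a sketch along these lines might work. It uses Python's sqlite3 as a stand-in for MySQL. The table name items and the column name grp are my own choices; the rows are the sample data from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# items and grp are illustrative names; grp avoids quoting the reserved word "group".
conn.execute(
    "CREATE TABLE items (id INTEGER PRIMARY KEY, grp TEXT, price INTEGER, spec TEXT, provider TEXT)"
)
conn.executemany(
    "INSERT INTO items VALUES (?, ?, ?, ?, ?)",
    [
        (1, "a", 1, ".", "x"),
        (2, "a", 2, "..", "y"),
        (3, "a", 3, "...", "y"),
        (4, "b", 1, "...", "y"),
        (5, "b", 2, None, "x"),
        (6, "b", 3, None, "z"),
    ],
)

# Step 1: cheapest price per provider and group.
# Step 2: keep only providers that offer every group, and sum their cheapest prices.
provider_costs = conn.execute(
    """
    WITH cheapest AS (
        SELECT provider, grp, MIN(price) AS min_price
        FROM items
        GROUP BY provider, grp
    )
    SELECT provider, SUM(min_price) AS total_cost
    FROM cheapest
    GROUP BY provider
    HAVING COUNT(*) = (SELECT COUNT(DISTINCT grp) FROM items)
    ORDER BY total_cost, provider
    """
).fetchall()

for row in provider_costs:
    print(row)
```

Provider z is filtered out by the HAVING clause because it only offers group b. On MySQL versions without CTE support, the cheapest subquery can be inlined as a derived table instead.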
Suppose I have a table with 3 columns:
id (PK, int)
timestamp (datetime)
title (text)
I have the following records:
1, 2010-01-01 15:00:00, Some Title
2, 2010-01-01 15:00:02, Some Title
3, 2010-01-02 15:00:00, Some Title
I need to do a GROUP BY records that are within 3 seconds of each other. For this table, rows 1 and 2 would be grouped together.
There is a similar question here: Mysql DateTime group by 15 mins
I also found this: http://www.artfulsoftware.com/infotree/queries.php#106
I don't know how to convert these methods into something that will work for seconds. The trouble with the method on the SO question is that it seems to me that it would only work for records falling within a bin of time that starts at a known point. For instance, if I were to get FLOOR() to work with seconds, at an interval of 5 seconds, a time of 15:00:04 would be grouped with 15:00:01, but not grouped with 15:00:06.
Does this make sense? Please let me know if further clarification is needed.
EDIT: For the set of numbers, {1, 2, 3, 4, 5, 6, 7, 50, 51, 60}, it seems it might be best to group them {1, 2, 3, 4, 5, 6, 7}, {50, 51}, {60}, so that each grouping row depends on if the row is within 3 seconds of the previous. I know this changes things a bit, I'm sorry for being wishywashy on this.
I am trying to fuzzy-match logs from different servers. Server #1 may log an item, "Item #1", and Server #2 will log that same item, "Item #1", within a few seconds of server #1. I need to do some aggregate functions on both log lines. Unfortunately, I only have title to go on, due to the nature of the server software.
I'm using Tom H.'s excellent idea but doing it a little differently here:
Instead of finding all the rows that are the beginnings of chains, we can find all times that are the beginnings of chains, then go back and find the rows that match the times.
Query #1 here should tell you which times are the beginnings of chains by finding which times do not have any times below them but within 3 seconds:
SELECT DISTINCT a.Timestamp
FROM Table a
LEFT JOIN Table b
ON (b.Timestamp >= a.Timestamp - INTERVAL 3 SECOND
AND b.Timestamp < a.Timestamp)
WHERE b.Timestamp IS NULL
And then for each row, we can find the largest chain-starting timestamp that is less than our timestamp with Query #2:
SELECT Table.id, MAX(StartOfChains.TimeStamp) AS ChainStartTime
FROM Table
JOIN ([query #1]) StartOfChains
ON Table.Timestamp >= StartOfChains.TimeStamp
GROUP BY Table.id
Once we have that, we can GROUP BY it as you wanted.
SELECT COUNT(*) -- or whatever
FROM Table
JOIN ([query #2]) GroupingQuery
ON Table.id = GroupingQuery.id
GROUP BY GroupingQuery.ChainStartTime
I'm not entirely sure this is distinct enough from Tom H's answer to be posted separately, but it sounded like you were having trouble with implementation, and I was thinking about it, so I thought I'd post again. Good luck!
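To see the three chained queries working together, here is a small sketch on the number set from the question's edit, {1, 2, 3, 4, 5, 6, 7, 50, 51, 60}, treated as timestamps in seconds. It uses Python's sqlite3 as a stand-in for MySQL, with CTEs in place of the nested subqueries. The table name events and the column names are my own.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# events, ts, start_ts and chain_start are illustrative names.
conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, ts INTEGER)")
timestamps = [1, 2, 3, 4, 5, 6, 7, 50, 51, 60]
conn.executemany(
    "INSERT INTO events VALUES (?, ?)",
    list(enumerate(timestamps, start=1)),
)

chain_counts = conn.execute(
    """
    WITH chain_starts AS (
        -- Query #1: times with no earlier time in the preceding 3 seconds
        SELECT DISTINCT a.ts AS start_ts
        FROM events a
        LEFT JOIN events b ON b.ts >= a.ts - 3 AND b.ts < a.ts
        WHERE b.ts IS NULL
    ),
    row_chain AS (
        -- Query #2: for each row, the latest chain start at or before it
        SELECT e.id, MAX(s.start_ts) AS chain_start
        FROM events e
        JOIN chain_starts s ON e.ts >= s.start_ts
        GROUP BY e.id
    )
    -- Query #3: aggregate per chain
    SELECT chain_start, COUNT(*) AS rows_in_chain
    FROM row_chain
    GROUP BY chain_start
    ORDER BY chain_start
    """
).fetchall()

print(chain_counts)
```

The three chains found correspond to the grouping {1..7}, {50, 51}, {60} described in the question.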
Now that I think that I understand your problem, based on your comment response to OMG Ponies, I think that I have a set-based solution. The idea is to first find the start of any chains based on the title. The start of a chain is going to be defined as any row where there is no match within three seconds prior to that row:
SELECT
MT1.my_id,
MT1.title,
MT1.my_time
FROM
My_Table MT1
LEFT OUTER JOIN My_Table MT2 ON
MT2.title = MT1.title AND
(
MT2.my_time < MT1.my_time OR
(MT2.my_time = MT1.my_time AND MT2.my_id < MT1.my_id)
) AND
MT2.my_time >= MT1.my_time - INTERVAL 3 SECOND
WHERE
MT2.my_id IS NULL
Now we can assume that any non-chain starters belong to the chain starter that appeared before them. Since MySQL (before version 8.0) doesn't support CTEs, you might want to throw the above results into a temporary table, as that would save you the multiple joins to the same subquery below.
SELECT
SQ1.my_id,
COUNT(*) -- You didn't say what you were trying to calculate, just that you needed to group them
FROM
(
SELECT
MT1.my_id,
MT1.title,
MT1.my_time
FROM
My_Table MT1
LEFT OUTER JOIN My_Table MT2 ON
MT2.title = MT1.title AND
(
MT2.my_time < MT1.my_time OR
(MT2.my_time = MT1.my_time AND MT2.my_id < MT1.my_id)
) AND
MT2.my_time >= MT1.my_time - INTERVAL 3 SECOND
WHERE
MT2.my_id IS NULL
) SQ1
INNER JOIN My_Table MT3 ON
MT3.title = SQ1.title AND
MT3.my_time >= SQ1.my_time
LEFT OUTER JOIN
(
SELECT
MT1.my_id,
MT1.title,
MT1.my_time
FROM
My_Table MT1
LEFT OUTER JOIN My_Table MT2 ON
MT2.title = MT1.title AND
(
MT2.my_time < MT1.my_time OR
(MT2.my_time = MT1.my_time AND MT2.my_id < MT1.my_id)
) AND
MT2.my_time >= MT1.my_time - INTERVAL 3 SECOND
WHERE
MT2.my_id IS NULL
) SQ2 ON
SQ2.title = SQ1.title AND
SQ2.my_time > SQ1.my_time AND
SQ2.my_time <= MT3.my_time
WHERE
SQ2.my_id IS NULL
GROUP BY
SQ1.my_id
This would look much simpler if you could use CTEs or if you used a temporary table. Using the temporary table might also help performance.
Also, there will be issues with this if you can have timestamps that match exactly. If that's the case then you will need to tweak the query slightly to use a combination of the id and the timestamp to distinguish rows with matching timestamp values.
EDIT: Changed the queries to handle exact matches by timestamp.
Warning: Long answer. This should work, and is fairly neat, except for one step in the middle where you have to be willing to run an INSERT statement over and over until it doesn't do anything, since we can't do recursive CTE things in MySQL (before 8.0).
I'm going to use this data as the example instead of yours:
id Timestamp
1 1:00:00
2 1:00:03
3 1:00:06
4 1:00:10
Here is the first query to write:
SELECT a.id as aid, b.id as bid
FROM Table a
JOIN Table b
ON (ABS(TIMESTAMPDIFF(SECOND, a.Timestamp, b.Timestamp)) <= 3)
It returns:
aid bid
1 1
1 2
2 1
2 2
2 3
3 2
3 3
4 4
Let's create a nice table to hold those things that won't allow duplicates:
CREATE TABLE
Adjacency
( aid INT(11)
, bid INT(11)
, PRIMARY KEY (aid, bid) -- important for later
)
Now the challenge is to find something like the transitive closure of that relation.
To do so, let's find the next level of links. By that I mean, since we have 1 2 and 2 3 in the Adjacency table, we should add 1 3:
INSERT IGNORE INTO Adjacency(aid,bid)
SELECT adj1.aid, adj2.bid
FROM Adjacency adj1
JOIN Adjacency adj2
ON (adj1.bid = adj2.aid)
This is the non-elegant part: You'll need to run the above INSERT statement over and over until it doesn't add any rows to the table. I don't know if there is a neat way to do that.
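One way to automate the repetition is to drive the INSERT from application code and stop as soon as the row count stops growing. Here is a minimal sketch in Python's sqlite3, as a stand-in for MySQL. SQLite spells it INSERT OR IGNORE, and the example timestamps are stored as plain second offsets (0, 3, 6, 10); the events table and its column names are my own.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Example data from above, with 1:00:00 .. 1:00:10 stored as second offsets (assumption).
conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, ts INTEGER)")
conn.executemany("INSERT INTO events VALUES (?, ?)", [(1, 0), (2, 3), (3, 6), (4, 10)])

conn.execute("CREATE TABLE Adjacency (aid INTEGER, bid INTEGER, PRIMARY KEY (aid, bid))")
# Direct links: every pair of rows within 3 seconds of each other.
conn.execute(
    """
    INSERT INTO Adjacency (aid, bid)
    SELECT a.id, b.id FROM events a JOIN events b ON ABS(a.ts - b.ts) <= 3
    """
)


def count_pairs():
    """Return the current number of (aid, bid) pairs."""
    return conn.execute("SELECT COUNT(*) FROM Adjacency").fetchone()[0]


# Repeat the "next level of links" insert until a round adds nothing new.
rounds = 0
while True:
    before = count_pairs()
    conn.execute(
        """
        INSERT OR IGNORE INTO Adjacency (aid, bid)
        SELECT adj1.aid, adj2.bid
        FROM Adjacency adj1
        JOIN Adjacency adj2 ON adj1.bid = adj2.aid
        """
    )
    rounds += 1
    if count_pairs() == before:
        break

# Collect each row's neighbors and reduce them to the distinct groups.
neighbors = {}
for aid, bid in conn.execute("SELECT aid, bid FROM Adjacency ORDER BY aid, bid"):
    neighbors.setdefault(aid, []).append(bid)
groups = sorted({tuple(bids) for bids in neighbors.values()})

print(rounds, groups)
```

Comparing row counts before and after each round avoids relying on how a particular driver reports ignored rows. Each round at least doubles the path length covered, so the loop ends after roughly log2 of the longest chain length, plus one final round that adds nothing.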
Once this is over, you will have a transitively-closed relation like this:
aid bid
1 1
1 2
1 3 --added
2 1
2 2
2 3
3 1 --added
3 2
3 3
4 4
And now for the punchline:
SELECT aid, GROUP_CONCAT( bid ) AS Neighbors
FROM Adjacency
GROUP BY aid
returns:
aid Neighbors
1 1,2,3
2 1,2,3
3 1,2,3
4 4
So
SELECT DISTINCT Neighbors
FROM (
SELECT aid, GROUP_CONCAT( bid ) AS Neighbors
FROM Adjacency
GROUP BY aid
) Groupings
returns
Neighbors
1,2,3
4
Whew!
I like @Chris Cunningham's answer, but here's another take on it.
First, my understanding of your problem statement (correct me if I'm wrong):
You want to look at your event log as a sequence, ordered by the time of the event,
and partition it into groups, defining the boundary as being an interval of
more than 3 seconds between two adjacent rows in the sequence.
I work mostly in SQL Server, so I'm using SQL Server syntax. It shouldn't be too difficult to translate into MySQL SQL.
So, first our event log table:
--
-- our event log table
--
create table dbo.eventLog
(
id int not null ,
dtLogged datetime not null ,
title varchar(200) not null ,
primary key nonclustered ( id ) ,
unique clustered ( dtLogged , id ) ,
)
Given the above understanding of the problem statement, the following query should give you the upper and lower bounds of your groups. It's a simple, nested select statement with 2 group by clauses to collapse things:
The innermost select defines the upper bound of each group. That upper boundary defines a group.
The outer select defines the lower bound of each group.
Every row in the table should fall into one of the groups so defined, and any given group may well consist of a single date/time value.
[edited: the upper bound is the lowest date/time value where the interval is more than 3 seconds]
select dtFrom = min( t.dtFrom ) ,
dtThru = t.dtThru
from ( select dtFrom = t1.dtLogged ,
dtThru = min( t2.dtLogged )
from dbo.EventLog t1
left join dbo.EventLog t2 on t2.dtLogged >= t1.dtLogged
and datediff(second,t1.dtLogged,t2.dtLogged) > 3
group by t1.dtLogged
) t
group by t.dtThru
You could then pull rows from the event log and tag them with the group to which they belong thus:
select *
from ( select dtFrom = min( t.dtFrom ) ,
dtThru = t.dtThru
from ( select dtFrom = t1.dtLogged ,
dtThru = min( t2.dtLogged )
from dbo.EventLog t1
left join dbo.EventLog t2 on t2.dtLogged >= t1.dtLogged
and datediff(second,t1.dtLogged,t2.dtLogged) > 3
group by t1.dtLogged
) t
group by t.dtThru
) period
join dbo.EventLog t on t.dtLogged >= period.dtFrom
and t.dtLogged <= coalesce( period.dtThru , t.dtLogged )
order by period.dtFrom , period.dtThru , t.dtLogged
Each row is tagged with its group via the dtFrom and dtThru columns returned. You could get fancy and assign an integral row number to each group if you want.
Simple query:
SELECT * FROM time_history GROUP BY ROUND(UNIX_TIMESTAMP(time_stamp)/3);