SQL Group By Date Conflicts - mysql

I have a table with columns start_date and end_date. What we need to do is select everything and group the rows by date conflicts for each object_id.
A date conflict is when a row's start date and/or end date passes through another row's date range. For instance, here are some examples of conflicts:
Row 1 has dates 1st through the 5th, Row 2 has dates 2nd through the 3rd.
Row 1 has dates 2nd through the 5th, Row 2 has dates 1st through the 3rd.
Row 1 has dates 2nd through the 5th, Row 2 has dates 3rd through the 6th.
Row 1 has dates 2nd through the 5th, Row 2 has dates 1st through the 7th.
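All four cases reduce to one test: two ranges conflict when each one starts no later than the other ends. A minimal Python sketch of that test, assuming inclusive endpoints (the question does not say whether ranges that merely touch should count) and using a hypothetical helper name `overlaps`:

```python
def overlaps(start_a, end_a, start_b, end_b):
    """Return True when two inclusive day ranges share at least one day."""
    return start_a <= end_b and start_b <= end_a


# The four example pairs listed above, as (row 1, row 2) day ranges.
examples = [
    ((1, 5), (2, 3)),
    ((2, 5), (1, 3)),
    ((2, 5), (3, 6)),
    ((2, 5), (1, 7)),
]
results = [overlaps(*row1, *row2) for row1, row2 in examples]
print(results)
```

Every pair above evaluates to a conflict, while a pair such as days 1 to 5 and days 6 to 8 does not.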
So for example, if we have some sample data (assume the numbers are just days of the month for simplicity):
id | object_id | start_date | end_date
1 | 1 | 1 | 5
2 | 1 | 2 | 4
3 | 1 | 6 | 8
4 | 2 | 2 | 3
What I would expect to see is this:
object_id | start_date | end_date | numconflicts
1 | <na> | <na> | 2
1 | 6 | 8 | 0 or null
2 | 2 | 3 | 0 or null
And for a second test case, here is some sample data:
id | object_id | start_date | end_date
1 | 1 | 1 | 5
2 | 1 | 2 | 4
3 | 1 | 6 | 8
4 | 2 | 2 | 3
5 | 2 | 4 | 5
6 | 1 | 2 | 3
7 | 1 | 10 | 12
8 | 1 | 11 | 13
And for the second test case, this is what I would expect to see as output:
object_id | start_date | end_date | numconflicts
1 | <na> | <na> | 3
1 | 6 | 8 | 0 or null
2 | 2 | 3 | 0 or null
2 | 4 | 5 | 0 or null
1 | <na> | <na> | 2
Yes, I will need some way of differentiating the first and second groupings (the first and last rows), but I haven't quite figured that out. The goal is to view this list, and then when you click on a group of conflicts you can view all of the conflicts in that group.
My first thought was to attempt some GROUP BY CASE ... clause, but I couldn't wrap my head around it.
The language I am using to call MySQL is PHP, so if someone knows of a PHP-loop solution rather than a large MySQL query, I am all ears.
Thanks in advance.
Edit: Added in primary Keys to provide a little less confusion.
Edit: Added in a Test case 2 to provide some more reasoning.

This query finds the number of duplicates:
select od1.object_id, od1.start_date, od1.end_date, sum(od2.id is not null) as dups
from object_date od1
left join object_date od2
on od2.object_id = od1.object_id
and od2.end_date >= od1.start_date
and od2.start_date <= od1.end_date
and od2.id != od1.id
group by 1,2,3;
You can use this query as the basis of a query that gives you exactly what you asked for (see below for output).
select
object_id,
case dups when 0 then start_date else '<na>' end as start_date,
case dups when 0 then end_date else '<na>' end as end_date,
sum(dups) as dups
from (
select od1.object_id, od1.start_date, od1.end_date, sum(od2.id is not null) as dups
from object_date od1
left join object_date od2
on od2.object_id = od1.object_id
and od2.end_date >= od1.start_date
and od2.start_date <= od1.end_date
and od2.id != od1.id
group by 1,2,3) x
group by 1,2,3;
Note that I have used an id column to distinguish the rows. You could instead replace od2.id != od1.id with tests that at least one of the other columns differs, but that only makes sense with a unique index across all the other columns, and having an id column is a good idea anyway.
Here's a test using your data:
create table object_date (
id int primary key auto_increment,
object_id int,
start_date int,
end_date int
);
insert into object_date (object_id, start_date, end_date)
values (1,1,5),(1,2,4),(1,6,8),(2,2,3);
Output of first query when run against this sample data:
+-----------+------------+----------+------+
| object_id | start_date | end_date | dups |
+-----------+------------+----------+------+
| 1 | 1 | 5 | 1 |
| 1 | 2 | 4 | 1 |
| 1 | 6 | 8 | 0 |
| 2 | 2 | 3 | 0 |
+-----------+------------+----------+------+
Output of second query when run against this sample data:
+-----------+------------+----------+------+
| object_id | start_date | end_date | dups |
+-----------+------------+----------+------+
| 1 | 6 | 8 | 0 |
| 1 | <na> | <na> | 2 |
| 2 | 2 | 3 | 0 |
+-----------+------------+----------+------+
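For the second test case, where one object_id has two separate groups of conflicts, a loop in application code can keep the groups apart. The sketch below is not the query above; it swaps in a sort-and-sweep over each object's rows, written in Python rather than PHP. It assumes inclusive ranges and treats conflicts as transitive (if A overlaps B and B overlaps C, all three land in one group). The function name `conflict_groups` is hypothetical.

```python
from collections import defaultdict


def conflict_groups(rows):
    """Cluster (id, object_id, start, end) rows by overlapping date ranges.

    Returns a list of (object_id, [row ids]) clusters; a cluster holding
    a single id is a row without conflicts.
    """
    by_object = defaultdict(list)
    for row_id, object_id, start, end in rows:
        by_object[object_id].append((start, end, row_id))

    groups = []
    for object_id in sorted(by_object):
        current_ids, current_end = [], None
        for start, end, row_id in sorted(by_object[object_id]):
            if current_ids and start <= current_end:
                # Starts before the running cluster ends: same cluster.
                current_ids.append(row_id)
                current_end = max(current_end, end)
            else:
                if current_ids:
                    groups.append((object_id, current_ids))
                current_ids, current_end = [row_id], end
        groups.append((object_id, current_ids))
    return groups


# Second test case from the question: (id, object_id, start_date, end_date).
rows = [
    (1, 1, 1, 5), (2, 1, 2, 4), (3, 1, 6, 8), (4, 2, 2, 3),
    (5, 2, 4, 5), (6, 1, 2, 3), (7, 1, 10, 12), (8, 1, 11, 13),
]
groups = conflict_groups(rows)
for object_id, ids in groups:
    print(object_id, ids)
```

Each cluster carries the ids of its rows, so the list view can show one line per cluster and the detail view can load the rows by id.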

Oracle : This could be done with a subquery in a group by CASE statement.
https://forums.oracle.com/forums/thread.jspa?threadID=2131172
MySQL: You could create a view that lists all the conflicting pairs:
select distinct a1.appt, a2.appt from appointment a1, appointment a2 where a1.appt <> a2.appt and a1.start < a2.end and a1.end > a2.start
and then simply do a count(*) on that view.

Something like the following should work:
select T1.object_id, T1.start_date, T1.end_date, count(T1.object_id) as numconflicts
from T1
inner join T2 on T1.start_date between T2.start_date and T2.end_date
inner join T3 on T1.end_date between T3.start_date and T3.end_date
group by T1.object_id
I might be off a little bit, but it should help you get started.
Edit: Indented it properly

Related

Sum of Counted records that calculated using "group by" with condition and "group by"

I'm sorry for the fuzzy title of this question.
I have 2 tables in my database and want to count the records of first_table using "group by" on a foreign key id that appears in a column of second_table (which stores ids as a comma-separated list like "1,2,3,4,5").
id | name | fk_id
1 | john | 1
2 | mike | 1
3 | jane | 2
4 | tailor | 1
5 | jane | 3
6 | tailor | 5
7 | jane | 4
8 | tailor | 5
9 | jane | 5
10 | tailor | 5
id | name | fk_ids | s_fk_id
1 | xxx | 1,5,6 | 1
2 | yyy | 2,3 | 1
3 | zzz | 9 | 1
4 | www | 7,8 | 1
I wrote the following query, but it does not work properly and displays wrong numbers.
I WANT TO:
1-Count records in first_table group by "fk_id"
2-Sum the counted records which exists in "fk_ids"
3-Display the sum result (sum of related counts) grouped by id.
In the query below, the symbol ' stands for a backtick (`).
select sum(if(FIND_IN_SET('fk_id', 'fk_ids')>0,'count',0)) 'sum', 'count', 'from'.'fk_id', 'second_table'.* FROM 'second_table'
LEFT JOIN
(
SELECT 'fk_id', count(*) 'count'
FROM 'first_table'
group BY 'fk_id'
) AS 'from'
ON FIND_IN_SET('fk_id', 'fk_ids')>0
WHERE 'second_table'.'s_fk_id'=1
GROUP BY 'id'
ORDER by 'count' DESC
These tables hold a lot of data and we have no plans to change the structure.
Edit:
Desired output:
id | name | sum
1 | xxx | 7 (3+4+0)
2 | yyy | 2 (1+1)
3 | zzz | 0 (0)
4 | www | 0 (0+0)
After two days off I came back to work and found out that the FIND_IN_SET function does not work properly on a string that contains spaces.
The problem was that I had overlooked the spaces too (the same issue as in this question).
Finally, this query worked:
select sum(`count`) `sum`, `count`, `from`.`fk_id`, `second_table`.* FROM `second_table`
LEFT JOIN
(
SELECT `fk_id`, count(*) `count`
FROM `first_table`
group BY `fk_id`
) AS `from`
ON FIND_IN_SET(`fk_id`, replace(`fk_ids`,' ',''))>0
WHERE `second_table`.`s_fk_id`=1
GROUP BY `id`
ORDER by `count` DESC
And the magic is replace(fk_ids,' ','')
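The reason the spaces matter is that FIND_IN_SET compares whole comma-separated elements exactly, so an element stored as " 5" never equals "5". The Python function below is only a rough emulation written for illustration (the name `find_in_set` is hypothetical and NULL handling is ignored); it is not MySQL's implementation.

```python
def find_in_set(needle, csv):
    """Rough emulation of MySQL FIND_IN_SET: 1-based position, or 0 if absent."""
    items = csv.split(",")
    return items.index(needle) + 1 if needle in items else 0


with_spaces = find_in_set("5", "1, 5, 6")                      # element is " 5"
without_spaces = find_in_set("5", "1, 5, 6".replace(" ", ""))  # element is "5"
print(with_spaces, without_spaces)
```

The first call finds nothing and the second finds the value in position 2, which mirrors the effect of wrapping fk_ids in replace().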

Latest datetime from unique mysql index

I have a table. It has a pk of id and an index of [service, check, datetime].
id service check datetime score
---|-------|-------|----------|-----
1 | 1 | 4 |4/03/2009 | 399
2 | 2 | 4 |4/03/2009 | 522
3 | 1 | 5 |4/03/2009 | 244
4 | 2 | 5 |4/03/2009 | 555
5 | 1 | 4 |4/04/2009 | 111
6 | 2 | 4 |4/04/2009 | 322
7 | 1 | 5 |4/05/2009 | 455
8 | 2 | 5 |4/05/2009 | 675
Given service 2, I need to select, for each unique check, the row that has the max date. So my result would look like this table.
id service check datetime score
---|-------|-------|----------|-----
6 | 2 | 4 |4/04/2009 | 322
8 | 2 | 5 |4/05/2009 | 675
Is there a short query for this? The best I have is this, but it returns too many checks. I just need each unique check at its latest datetime.
SELECT * FROM table where service=?;
First you need to find the latest date for each check:
SELECT `check`, MAX(`datetime`)
FROM YourTable
WHERE `service` = 2
GROUP BY `check`
Then join back to get the rest of the data.
SELECT Y.*
FROM YourTable Y
JOIN ( SELECT `check`, MAX(`datetime`) as m_date
FROM YourTable
WHERE `service` = 2
GROUP BY `check`) as `filter`
ON Y.`check` = `filter`.`check`
AND Y.`datetime` = `filter`.m_date
WHERE Y.`service` = 2
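To check the join-back pattern against the sample rows, here is a small sketch that runs it on an in-memory SQLite database from Python. It is an illustration, not a MySQL test: the table name `scores` is made up, the dates are rewritten as ISO strings so that MAX() orders them correctly, and the join matches on `check` and the latest `datetime`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE scores ("
    "id INTEGER PRIMARY KEY, service INT, `check` INT, `datetime` TEXT, score INT)"
)
conn.executemany(
    "INSERT INTO scores VALUES (?, ?, ?, ?, ?)",
    [
        (1, 1, 4, "2009-04-03", 399), (2, 2, 4, "2009-04-03", 522),
        (3, 1, 5, "2009-04-03", 244), (4, 2, 5, "2009-04-03", 555),
        (5, 1, 4, "2009-04-04", 111), (6, 2, 4, "2009-04-04", 322),
        (7, 1, 5, "2009-04-05", 455), (8, 2, 5, "2009-04-05", 675),
    ],
)

# Inner query: latest datetime per check for the service.
# Outer query: join back to fetch the full row for each of those dates.
latest = conn.execute(
    """
    SELECT y.id, y.service, y.`check`, y.`datetime`, y.score
    FROM scores AS y
    JOIN (SELECT `check`, MAX(`datetime`) AS m_date
          FROM scores
          WHERE service = ?
          GROUP BY `check`) AS f
      ON y.`check` = f.`check` AND y.`datetime` = f.m_date
    WHERE y.service = ?
    ORDER BY y.id
    """,
    (2, 2),
).fetchall()
print(latest)
```

The outer WHERE on service is still needed, because another service can have a row with the same check and date.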

a query that returns a single row for each foreign key

I have a table of routines. In this table, I have the column "grade" (which is not mandatory) and the column "date". I also have a number of days and an array of user ids. For each id in the array, I need a query that returns the last routine that has a non-null value in the "grade" column and datediff(current_date,date) >= number_of_days, and then averages all of these values.
e.g.
today = 2014/10/15
number_of_days = 10
ids(1,3)
routines
id | type | date | grade | user_id
1 | 1 | 2014-10-10 | 3 | 1
2 | 1 | 2014-10-04 | 3 | 1
3 | 1 | 2014-10-01 | 3 | 1
4 | 1 | 2014-09-24 | 2 | 1
5 | 1 | 2014-10-10 | 2 | 2
6 | 1 | 2014-10-04 | 3 | 2
7 | 1 | 2014-10-01 | 3 | 2
8 | 1 | 2014-09-24 | 1 | 2
9 | 1 | 2014-10-10 | 1 | 3
10 | 1 | 2014-10-04 | 1 | 3
11 | 1 | 2014-10-01 | 1 | 3
12 | 1 | 2014-09-24 | 1 | 3
In this case, my query would return the average of the "grade" values of rows #2 and #10.
I think you're saying that you want to consider rows having non-null values in the grade column, a date at least a given number of days before the current date, and one of a given set of user_ids. Among those rows, for each user_id you want to choose the row with the latest date, and compute an average of the grade columns for those rows.
I will assume that you cannot have any two rows with the same user_id and date, both with non-null grades, else the question you want to ask does not have a well-defined answer.
A query along these lines should do the trick:
SELECT AVG(r.grade) AS average_grade
FROM
(SELECT user_id, MAX(date) AS date
FROM routines
WHERE grade IS NOT NULL
AND DATEDIFF(CURDATE(), date) >= 10
AND user_id IN (1,3)
GROUP BY user_id) AS md
JOIN routines r
ON r.user_id = md.user_id AND r.date = md.date
Note that in principle you need a grade IS NOT NULL condition on both the inner and the outer query to select the correct rows to average, but in practice AVG() ignores nulls, so you don't actually have to filter out the extra rows in the outer query.
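The same selection can be traced by hand in application code, which makes it easy to confirm which rows feed the average. This is a sketch in Python rather than a translation of the SQL: the function name `average_latest_grade` is hypothetical, and "today" is passed in as 2014-10-15 (from the question) instead of being read from the clock.

```python
from datetime import date


def average_latest_grade(routines, user_ids, min_age_days, today):
    """Average the grade of each user's latest graded routine that is old enough."""
    latest = {}  # user_id -> (date, grade) of the newest qualifying routine
    for _id, _type, day, grade, user_id in routines:
        if grade is None or user_id not in user_ids:
            continue
        if (today - day).days < min_age_days:
            continue
        if user_id not in latest or day > latest[user_id][0]:
            latest[user_id] = (day, grade)
    if not latest:
        return None
    return sum(grade for _, grade in latest.values()) / len(latest)


# Sample rows from the question: (id, type, date, grade, user_id).
dates = [date(2014, 10, 10), date(2014, 10, 4), date(2014, 10, 1), date(2014, 9, 24)]
grades = {1: [3, 3, 3, 2], 2: [2, 3, 3, 1], 3: [1, 1, 1, 1]}
routines = [
    (4 * (user - 1) + i + 1, 1, dates[i], grades[user][i], user)
    for user in (1, 2, 3)
    for i in range(4)
]
average = average_latest_grade(routines, {1, 3}, 10, date(2014, 10, 15))
print(average)
```

With the sample data this picks row #2 (grade 3) for user 1 and row #10 (grade 1) for user 3, as the question expects.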

Mysql Queue Start/End time

I have one problem that I can't resolve.
I have 2 tables.
Table 1:
ID | Time
1 | 08:12:54
2 | 08:15:40
3 | 09:30:01
4 | 10:15:15
5 | 10:56:12
6 | 11:00:03
Table 2:
ID | Name| Previous | Current
1 | Queue | null | 11
2 | Queue | 11 | 19
3 | Queue | 19 | 11
3 | List | null | 11
4 | Queue | 11 | 16
4 | List | null | 11
5 | Queue | null | 15
6 | Queue | 15 | 19
The result wanted:
NumberQueue | Start | End
11 | 08:12:54 | 08:15:40
19 | 08:15:40 | 09:30:01
11 | 09:30:01 | 10:15:15
15 | 10:56:12 | 11:00:03
...
...
The Previous and Current fields hold queue numbers: Previous is the queue the item left and Current is the queue it entered. For each queue, I want to know the start time and the end time.
I would like a single query that produces this result. Help me. :(
Regards.
SELECT t1outer.ID, t1outer.Time AS start, (
SELECT Time FROM Table1 AS t1inner
WHERE t1inner.ID > t1outer.ID
ORDER BY ID ASC LIMIT 1
) AS end, Table2.Previous, Table2.Current
FROM Table1 AS t1outer
LEFT JOIN Table2 USING (ID);
This select statement should provide the information you need:
SELECT Current AS Number, t1out.Time AS Start, (
SELECT Time FROM Table1 AS t1in
WHERE t1in.ID > t1out.ID
ORDER BY ID ASC LIMIT 1
) AS End FROM Table2
LEFT JOIN Table1 AS t1out USING (ID)
WHERE Table2.Name = 'Queue';
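The idea behind the correlated subquery is that a queue entry ends when the next ID's time begins. A minimal Python sketch of the same pairing, using the sample rows from the question and a hypothetical function name `queue_intervals`:

```python
def queue_intervals(times, changes):
    """Pair each Queue row with its own time and the time of the next ID."""
    ids = sorted(times)
    # Map every ID to the time recorded for the ID that follows it.
    next_time = {a: times[b] for a, b in zip(ids, ids[1:])}
    return [
        (current, times.get(row_id), next_time.get(row_id))
        for row_id, name, _previous, current in changes
        if name == "Queue"
    ]


# Table 1: ID -> Time
times = {
    1: "08:12:54", 2: "08:15:40", 3: "09:30:01",
    4: "10:15:15", 5: "10:56:12", 6: "11:00:03",
}
# Table 2: (ID, Name, Previous, Current)
changes = [
    (1, "Queue", None, 11), (2, "Queue", 11, 19), (3, "Queue", 19, 11),
    (3, "List", None, 11), (4, "Queue", 11, 16), (4, "List", None, 11),
    (5, "Queue", None, 15), (6, "Queue", 15, 19),
]
intervals = queue_intervals(times, changes)
for number, start, end in intervals:
    print(number, start, end)
```

The last entry has no following ID, so its end is empty, just as the subquery returns NULL for it.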

mysql select ordernumber by group

I'm trying to do something like 'select groupwise maximum', but I'm looking for groupwise order number.
so with a table like this
briefs
----------
id_brief | id_case | date
1 | 1 | 06/07/2010
2 | 1 | 04/07/2010
3 | 1 | 03/07/2010
4 | 2 | 18/05/2010
5 | 2 | 17/05/2010
6 | 2 | 19/05/2010
I want a result like this
briefs result
----------
id_brief | id_case | dateOrder
1 | 1 | 3
2 | 1 | 2
3 | 1 | 1
4 | 2 | 2
5 | 2 | 1
6 | 2 | 3
I think I want to do something like described here MySQL - Get row number on select, but I don't know how I would reset the variable for each id_case.
This will give you how many records are there with this id_case value and a date less than or equal to this date value.
SELECT t1.id_brief,
t1.id_case,
COUNT(t2.id_brief) AS dateOrder
FROM yourtable AS t1
LEFT JOIN yourtable AS t2 ON t2.id_case = t1.id_case AND t2.date <= t1.date
GROUP BY t1.id_brief
MySQL is permissive about which columns can be selected in a query that uses GROUP BY. With a stricter DBMS you may need GROUP BY t1.id_brief, t1.id_case.
I strongly advise you to have the right indexes on the table:
CREATE INDEX filter1 ON yourtable (id_case, date)
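The counting self-join can be tried against the sample rows with an in-memory SQLite database from Python. This is a sketch under a few assumptions: the table is named `briefs`, the dates are stored as ISO strings so that `<=` compares them chronologically (the dd/mm/yyyy strings in the question would not sort correctly as text), and the count is taken over t2.id_brief.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE briefs (id_brief INTEGER PRIMARY KEY, id_case INT, date TEXT)"
)
conn.executemany(
    "INSERT INTO briefs VALUES (?, ?, ?)",
    [
        (1, 1, "2010-07-06"), (2, 1, "2010-07-04"), (3, 1, "2010-07-03"),
        (4, 2, "2010-05-18"), (5, 2, "2010-05-17"), (6, 2, "2010-05-19"),
    ],
)
conn.execute("CREATE INDEX filter1 ON briefs (id_case, date)")

# For each brief, count the briefs of the same case dated on or before it.
ranked = conn.execute(
    """
    SELECT t1.id_brief, t1.id_case, COUNT(t2.id_brief) AS dateOrder
    FROM briefs AS t1
    LEFT JOIN briefs AS t2
      ON t2.id_case = t1.id_case AND t2.date <= t1.date
    GROUP BY t1.id_brief, t1.id_case
    ORDER BY t1.id_brief
    """
).fetchall()
print(ranked)
```

Two briefs of the same case with the same date would receive the same dateOrder, since each one counts the other.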