I have a table structure like this:
CREATE TABLE yourTable (
`Source` VARCHAR(20),
`Destination` VARCHAR(20),
`Distance` Integer
);
INSERT INTO yourTable
(`Source`, `Destination`, `Distance`)
VALUES
('Buffalo', 'Rochester', 2200),
('Yonkers', 'Syracuse', 1400),
('Cheektowaga', 'Schenectady', 600),
('Rochester', 'Buffalo', 2200);
How can I return only unique records? For example, 'Buffalo' and 'Rochester' appear in both row 1 and row 4 (in opposite directions), so only one of those rows should be returned.
I tried the query below, but it swaps the source and destination values for row 3, returning Schenectady, Cheektowaga instead of Cheektowaga, Schenectady:
SELECT DISTINCT GREATEST(Source, Destination) as Source, LEAST(Source, Destination) AS Destination, Distance
FROM yourTable
Use two queries that you combine with UNION. One query returns the rows that are already unique, the other removes the duplicate from the rows that are duplicated in the other direction.
SELECT t1.Source, t1.Destination, t1.Distance
FROM yourTable AS t1
LEFT JOIN yourTable AS t2 ON t1.Source = t2.Destination AND t1.Destination = t2.Source
WHERE t2.Source IS NULL
UNION ALL
SELECT GREATEST(Source, Destination) AS s, LEAST(Source, Destination) AS d, MAX(Distance) AS Distance
FROM yourTable
GROUP BY s, d
HAVING COUNT(*) > 1
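As a cross-check of the idea above, here is a minimal sketch that runs on SQLite through Python's sqlite3 module rather than on MySQL. It swaps the UNION for a single NOT EXISTS filter: a row is kept if it is in alphabetical order or if no reversed row exists, so the original Source/Destination orientation is preserved. The function name unique_routes is hypothetical, and exact same-direction duplicates are not handled.

```python
import sqlite3

ROWS = [
    ("Buffalo", "Rochester", 2200),
    ("Yonkers", "Syracuse", 1400),
    ("Cheektowaga", "Schenectady", 600),
    ("Rochester", "Buffalo", 2200),
]

def unique_routes(rows):
    """Return one row per unordered (Source, Destination) pair."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE yourTable (Source TEXT, Destination TEXT, Distance INTEGER)"
    )
    conn.executemany("INSERT INTO yourTable VALUES (?, ?, ?)", rows)
    # Keep a row when it is already in alphabetical order, or when the
    # table holds no row going in the opposite direction.
    query = """
        SELECT t1.Source, t1.Destination, t1.Distance
        FROM yourTable AS t1
        WHERE t1.Source <= t1.Destination
           OR NOT EXISTS (SELECT 1
                          FROM yourTable AS t2
                          WHERE t2.Source = t1.Destination
                            AND t2.Destination = t1.Source)
        ORDER BY t1.Source
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result

routes = unique_routes(ROWS)
for route in routes:
    print(route)
```

With the sample data, the Rochester to Buffalo row is dropped and the other three rows come back unchanged.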
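Try this (it relies on ONLY_FULL_GROUP_BY being disabled, and groups by both ends of the pair so that unrelated routes sharing one city are not merged):
select * from yourTable group by least(source,destination), greatest(source,destination);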
I am trying to do an INSERT ... SELECT. The SELECT needs some aliased values for its calculations, but I don't need all of them in the insert. I only need field0, total_sum and hard_coded_val; the other columns exist just to feed the calculations.
Is there any way to either ignore the other values or specify the VALUES() in the INSERT ... SELECT?
INSERT INTO table(field0,total_sum,hard_coded_val)
SELECT s.*, sum1+sum2 AS total_sum, 'hard_coded_val' FROM
(SELECT t.*, (fielda+fieldb)*2 AS sum1, (fieldc+fieldd)/4 AS sum2 from
(SELECT field0,
sum(IF(field1 = 1, totalcount,0)) AS fielda,
sum(IF(field1 = 2, totalcount,0)) AS fieldb,
sum(IF(field1 = 3, totalcount,0)) AS fieldc,
sum(IF(field1 = 4,totalcount,0)) AS fieldd
from source_table GROUP BY field0)
t ORDER BY sum1 DESC)
s ORDER BY total_sum DESC
You just need to limit the columns you're returning. s.* returns every column of the derived table s, so select only the columns that the INSERT lists.
INSERT INTO table(field0,total_sum,hard_coded_val)
SELECT s.field0, sum1+sum2 AS total_sum, 'hard_coded_val' FROM
...
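To see the column-limiting fix end to end, here is a minimal sketch on SQLite via Python's sqlite3 module. The table name target_table and the sample rows are assumptions made for illustration, CASE replaces MySQL's IF(), and the ORDER BY clauses are left out because they do not affect what is inserted. Note that SQLite does integer division on integers, unlike MySQL's / operator.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE source_table (field0 TEXT, field1 INTEGER, totalcount INTEGER);
    CREATE TABLE target_table (field0 TEXT, total_sum INTEGER, hard_coded_val TEXT);

    INSERT INTO source_table VALUES
        ('a', 1, 10), ('a', 2, 5), ('a', 3, 8), ('a', 4, 4),
        ('b', 1, 1), ('b', 3, 4);

    -- The outer SELECT returns exactly the three columns the INSERT lists;
    -- fielda..fieldd, sum1 and sum2 stay inside the derived tables.
    INSERT INTO target_table (field0, total_sum, hard_coded_val)
    SELECT s.field0, s.sum1 + s.sum2 AS total_sum, 'hard_coded_val'
    FROM (SELECT t.field0,
                 (t.fielda + t.fieldb) * 2 AS sum1,
                 (t.fieldc + t.fieldd) / 4 AS sum2
          FROM (SELECT field0,
                       SUM(CASE WHEN field1 = 1 THEN totalcount ELSE 0 END) AS fielda,
                       SUM(CASE WHEN field1 = 2 THEN totalcount ELSE 0 END) AS fieldb,
                       SUM(CASE WHEN field1 = 3 THEN totalcount ELSE 0 END) AS fieldc,
                       SUM(CASE WHEN field1 = 4 THEN totalcount ELSE 0 END) AS fieldd
                FROM source_table
                GROUP BY field0) t) s;
""")

inserted = conn.execute(
    "SELECT field0, total_sum, hard_coded_val FROM target_table "
    "ORDER BY total_sum DESC"
).fetchall()
conn.close()
print(inserted)
```

For group 'a' the sketch computes (10 + 5) * 2 + (8 + 4) / 4 = 33, and for group 'b' it computes (1 + 0) * 2 + (4 + 0) / 4 = 3.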
I have the following columns in a table called meetings: meeting_id - int, start_time - time, end_time - time. Assuming that this table has data for one calendar day only, what is the minimum number of rooms I need to accommodate all the meetings? Room size and the number of people attending don't matter.
Here's the solution:
select * from
(select t.start_time,
t.end_time,
count(*) - 1 overlapping_meetings,
count(*) minimum_rooms_required,
group_concat(distinct concat(y.start_time,' to ',t.end_time)
separator ' // ') meeting_details from
(select 1 meeting_id, '08:00' start_time, '09:15' end_time union all
select 2, '13:20', '15:20' union all
select 3, '10:00', '14:00' union all
select 4, '13:55', '16:25' union all
select 5, '14:00', '17:45' union all
select 6, '14:05', '17:45') t left join
(select 1 meeting_id, '08:00' start_time, '09:15' end_time union all
select 2, '13:20', '15:20' union all
select 3, '10:00', '14:00' union all
select 4, '13:55', '16:25' union all
select 5, '14:00', '17:45' union all
select 6, '14:05', '17:45') y
on t.start_time between y.start_time and y.end_time
group by start_time, end_time) z;
My question: is there anything wrong with this answer? Even if there isn't, can someone share a better one?
Let's say you have a table called meetings with the columns id, start_time and end_time.
Then you can use this query to get the minimum number of meeting rooms required to accommodate all meetings:
select max(minimum_rooms_required)
from (select count(*) minimum_rooms_required
from meetings t
left join meetings y on t.start_time >= y.start_time and t.start_time < y.end_time group by t.id
) z;
This is clearer and simpler, and it works fine.
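The query works because the number of simultaneous meetings can only reach a new peak at the moment some meeting starts, so it is enough to count, for each meeting, how many meetings are running at its start time. Below is a minimal sketch that runs the same query on SQLite through Python's sqlite3 module, using the sample meetings from the question. The function name rooms_needed is hypothetical, and times are assumed to be zero-padded 'HH:MM' strings so that text comparison matches time order.

```python
import sqlite3

SAMPLE = [
    (1, "08:00", "09:15"),
    (2, "13:20", "15:20"),
    (3, "10:00", "14:00"),
    (4, "13:55", "16:25"),
    (5, "14:00", "17:45"),
    (6, "14:05", "17:45"),
]

def rooms_needed(meetings):
    """Run the self-join query and return the peak number of concurrent meetings."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE meetings (id INTEGER PRIMARY KEY, start_time TEXT, end_time TEXT)"
    )
    conn.executemany("INSERT INTO meetings VALUES (?, ?, ?)", meetings)
    query = """
        SELECT MAX(minimum_rooms_required)
        FROM (SELECT COUNT(*) AS minimum_rooms_required
              FROM meetings t
              LEFT JOIN meetings y
                ON t.start_time >= y.start_time
               AND t.start_time < y.end_time
              GROUP BY t.id) z
    """
    (rooms,) = conn.execute(query).fetchone()
    conn.close()
    return rooms

rooms = rooms_needed(SAMPLE)
print(rooms)  # 4: meetings 2, 4, 5 and 6 are all running at 14:05
```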
Meetings can overlap, so GROUP BY start_time, end_time can't figure this out.
Not every algorithm can be expressed in SQL, or at least doing so may be grossly inefficient.
I would use a general-purpose programming language for the computation and leave the database to what it is good at: being a data repository.
Build an array of 1440 entries (one per minute of the day), initialized to 0.
For each meeting:
    For each minute in the meeting (excluding the last minute):
        increment that element of the array.
The largest element in the array is the number of rooms needed.
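The steps above can be sketched in Python as follows. This is only an illustration under stated assumptions: times are 'HH:MM' strings within a single day, a meeting occupies its room from its start minute up to but not including its end minute, and the names to_minutes and min_rooms are made up for the example. The sample meetings are the ones from the question.

```python
MINUTES_PER_DAY = 24 * 60  # 1440

MEETINGS = [
    ("08:00", "09:15"),
    ("13:20", "15:20"),
    ("10:00", "14:00"),
    ("13:55", "16:25"),
    ("14:00", "17:45"),
    ("14:05", "17:45"),
]

def to_minutes(hhmm):
    """Convert 'HH:MM' to minutes since midnight."""
    hours, minutes = hhmm.split(":")
    return int(hours) * 60 + int(minutes)

def min_rooms(meetings):
    """Count occupied rooms per minute and return the busiest minute's count."""
    occupancy = [0] * MINUTES_PER_DAY
    for start, end in meetings:
        # range() excludes the end minute, so back-to-back meetings share a room.
        for minute in range(to_minutes(start), to_minutes(end)):
            occupancy[minute] += 1
    return max(occupancy)

rooms = min_rooms(MEETINGS)
print(rooms)  # 4
```

The cost is proportional to the total number of meeting minutes, which is small for a single day of meetings.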
CREATE TABLE [dbo].[Meetings](
[id] [int] NOT NULL,
[Starttime] [time](7) NOT NULL,
[EndTime] [time](7) NOT NULL) ON [PRIMARY]
GO
Sample data set:
INSERT INTO Meetings VALUES (1,'8:00','09:00')
INSERT INTO Meetings VALUES (2,'8:00','10:00')
INSERT INTO Meetings VALUES (3,'10:00','11:00')
INSERT INTO Meetings VALUES (4,'11:00','12:00')
INSERT INTO Meetings VALUES (5,'11:00','13:00')
INSERT INTO Meetings VALUES (6,'13:00','14:00')
INSERT INTO Meetings VALUES (7,'13:00','15:00')
To Find Minimum number of rooms required run the below query:
create table #TempMeeting
(
id int, Starttime time, EndTime time, MeetingRoomNo int, Rownumber int
)
insert into #TempMeeting
select id, Starttime, EndTime, 0 as MeetingRoomNo,
ROW_NUMBER() over (order by Starttime asc) as Rownumber
from Meetings
-- local variables take the @ prefix (the # prefix is for temporary tables)
declare @RowCounter int
select top 1 @RowCounter = Rownumber from #TempMeeting order by Rownumber
while @RowCounter <= (select count(*) from #TempMeeting)
begin
-- mark one later, still unmarked meeting that starts at or after the current meeting's end
update #TempMeeting set MeetingRoomNo = 1
where Rownumber = (select top 1 Rownumber from #TempMeeting
where Rownumber > @RowCounter
and Starttime >= (select top 1 EndTime from #TempMeeting where Rownumber = @RowCounter)
and MeetingRoomNo = 0)
set @RowCounter = @RowCounter + 1
end
-- meetings that were never marked each need a room of their own
select count(*) from #TempMeeting where MeetingRoomNo = 0
Consider a table meetings with columns id, start_time and end_time. The following query should then give the correct answer.
with mod_meetings as (select id, to_timestamp(start_time, 'HH24:MI')::TIME as start_time,
to_timestamp(end_time, 'HH24:MI')::TIME as end_time from meetings)
select CASE when max(a_cnt)>1 then max(a_cnt)+1
when max(a_cnt)=1 and max(b_cnt)=1 then 2 else 1 end as rooms
from
(select count(*) as a_cnt, a.id, count(b.id) as b_cnt from mod_meetings a left join mod_meetings b
on a.start_time>b.start_time and a.start_time<b.end_time group by a.id) join_table;
Sample DATA:
DROP TABLE IF EXISTS meeting;
CREATE TABLE "meeting" (
"meeting_id" INTEGER NOT NULL UNIQUE,
"start_time" TEXT NOT NULL,
"end_time" TEXT NOT NULL,
PRIMARY KEY("meeting_id")
);
INSERT INTO meeting values (1,'08:00','14:00');
INSERT INTO meeting values (2,'09:00','10:30');
INSERT INTO meeting values (3,'11:00','12:00');
INSERT INTO meeting values (4,'12:00','13:00');
INSERT INTO meeting values (5,'10:15','11:00');
INSERT INTO meeting values (6,'12:00','13:00');
INSERT INTO meeting values (7,'10:00','10:30');
INSERT INTO meeting values (8,'11:00','13:00');
INSERT INTO meeting values (9,'11:00','14:00');
INSERT INTO meeting values (10,'12:00','14:00');
INSERT INTO meeting values (11,'10:00','14:00');
INSERT INTO meeting values (12,'12:00','14:00');
INSERT INTO meeting values (13,'10:00','14:00');
INSERT INTO meeting values (14,'13:00','14:00');
Solution:
DROP VIEW IF EXISTS Final;
CREATE VIEW Final AS SELECT time, group_concat(event), sum(num) num from (
select start_time time, 's' event, 1 num from meeting
union all
select end_time time, 'e' event, -1 num from meeting)
group by 1
order by 1;
select max(room) AS Min_Rooms_Required FROM (
select
a.time,
sum(b.num) as room
from
Final a
, Final b
where a.time >= b.time
group by a.time
order by a.time
);
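The view above turns every meeting into a +1 event at its start and a -1 event at its end, and the final query takes a running total of those events. The same sweep can be sketched in plain Python, as a cross-check rather than a replacement. The function name min_rooms_sweep is hypothetical, times are assumed to be zero-padded 'HH:MM' strings, and the data is the sample inserted above.

```python
MEETING_ROWS = [
    ("08:00", "14:00"), ("09:00", "10:30"), ("11:00", "12:00"),
    ("12:00", "13:00"), ("10:15", "11:00"), ("12:00", "13:00"),
    ("10:00", "10:30"), ("11:00", "13:00"), ("11:00", "14:00"),
    ("12:00", "14:00"), ("10:00", "14:00"), ("12:00", "14:00"),
    ("10:00", "14:00"), ("13:00", "14:00"),
]

def min_rooms_sweep(meetings):
    """Sweep over start (+1) and end (-1) events, tracking the running total."""
    events = []
    for start, end in meetings:
        events.append((start, 1))
        events.append((end, -1))
    # Tuples sort by time first; at equal times -1 sorts before +1,
    # so a room is released before it is handed to the next meeting.
    events.sort()
    rooms_in_use = 0
    peak = 0
    for _, delta in events:
        rooms_in_use += delta
        peak = max(peak, rooms_in_use)
    return peak

peak_rooms = min_rooms_sweep(MEETING_ROWS)
print(peak_rooms)  # 9, reached at 12:00
```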
Here's an explanation of gashu's nicely working code (or, put differently, a non-code explanation of how to solve the problem in any language).
First, renaming the variable 'minimum_rooms_required' to 'overlap' would make the whole thing much easier to understand, because for each start or end time we want to know the number of ongoing meetings that overlap. Once we have found the maximum, we know there is no way to get by with fewer rooms than that, because those meetings all overlap.
By the way, I think there might be a mistake in the code. It should check whether t.start_time or t.end_time is between y.start_time and y.end_time. Counterexample: meeting 1 starts at 8:00 and ends at 11:00, and meeting 2 starts at 10:00 and ends at 12:00.
(I'd post this as a comment on gashu's answer, but I don't have enough reputation.)
I'd go for the LEAD() analytic function:
select
sum(needs_room_ind) as min_rooms
from (
select
id,
start_time,
end_time,
case when lead(start_time,1) over (order by start_time asc) between start_time
and end_time then 1 else 0 end as needs_room_ind
from
meetings
) a
In my opinion, the way to do it is to take, at the moment each meeting_id starts, the difference between the number of meetings that have started and the number that have ended (assuming meetings start and end on time).
My code looks like this:
with alpha as
(
select a.meeting_id,a.start_time,
count(distinct b.meeting_id) ttl_meeting_start_before,
count(distinct c.meeting_id) ttl_meeting_end_before
from meeting a
left join
(
select meeting_id,start_time from meeting
) b
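on a.start_time >= b.start_time -- >= so the meeting itself and simultaneous starts are counted
left join
(
select meeting_id,end_time from meeting
) c
on a.start_time >= c.end_time -- a meeting ending exactly at this start has freed its room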
group by a.meeting_id,a.start_time
)
select max(ttl_meeting_start_before-ttl_meeting_end_before) max_meeting_room
from alpha
Right now I just have
INSERT INTO MY_TABLE (VAL1, VAL2)
SELECT VAL1, VAL2 FROM OTHER_TABLE;
However, if MY_TABLE already has the values (1, 2), I don't want it to insert (2, 1) when (2, 1) is in OTHER_TABLE.
Is there a way to do this here, or even when creating the tables?
I have tried to ALTER the table and create a UNIQUE constraint, but that doesn't catch duplicates in reversed order.
Try this.
SELECT VAL1,
VAL2
FROM OTHER_TABLE a
WHERE NOT EXISTS (SELECT 1
FROM my_table b
WHERE ( a.val1 = b.val2
AND a.val2 = b.val1 )
OR ( a.val1 = b.val1
AND a.val2 = b.val2 ))
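To show the filter doing its job inside the INSERT, here is a minimal sketch on SQLite via Python's sqlite3 module. The sample rows are assumptions made for illustration. The sketch only filters against rows already present in MY_TABLE; it does not attempt to deduplicate mirrored pairs that both live in OTHER_TABLE.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE MY_TABLE (VAL1 INTEGER, VAL2 INTEGER);
    CREATE TABLE OTHER_TABLE (VAL1 INTEGER, VAL2 INTEGER);
    INSERT INTO MY_TABLE VALUES (1, 2);
    INSERT INTO OTHER_TABLE VALUES (2, 1), (1, 2), (3, 4);
""")

# Insert only the rows whose pair is not yet in MY_TABLE in either order.
conn.execute("""
    INSERT INTO MY_TABLE (VAL1, VAL2)
    SELECT a.VAL1, a.VAL2
    FROM OTHER_TABLE a
    WHERE NOT EXISTS (SELECT 1
                      FROM MY_TABLE b
                      WHERE (a.VAL1 = b.VAL2 AND a.VAL2 = b.VAL1)
                         OR (a.VAL1 = b.VAL1 AND a.VAL2 = b.VAL2))
""")

pairs = conn.execute(
    "SELECT VAL1, VAL2 FROM MY_TABLE ORDER BY VAL1, VAL2"
).fetchall()
conn.close()
print(pairs)  # [(1, 2), (3, 4)]
```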
If there is already a UNIQUE constraint on the target table, you should be able to use the IGNORE modifier to skip rows that would violate it.
http://dev.mysql.com/doc/refman/5.5/en/insert.html
INSERT IGNORE INTO MY_TABLE (VAL1, VAL2)
SELECT VAL1, VAL2 FROM OTHER_TABLE
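A plain UNIQUE constraint on (VAL1, VAL2) treats (1, 2) and (2, 1) as different rows, so on its own it does not block the reversed pair. One way around this, sketched here on SQLite through Python's sqlite3 module and not on MySQL, is a unique index over the normalized pair (smaller value, larger value), combined with SQLite's INSERT OR IGNORE. The index name uq_pair and the sample rows are assumptions, and expression indexes need a reasonably recent SQLite.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE MY_TABLE (VAL1 INTEGER, VAL2 INTEGER);
    -- Uniqueness is enforced on the pair in normalized order,
    -- so (1, 2) and (2, 1) collide.
    CREATE UNIQUE INDEX uq_pair ON MY_TABLE (min(VAL1, VAL2), max(VAL1, VAL2));
    CREATE TABLE OTHER_TABLE (VAL1 INTEGER, VAL2 INTEGER);
    INSERT INTO MY_TABLE VALUES (1, 2);
    INSERT INTO OTHER_TABLE VALUES (2, 1), (3, 4), (4, 3);
""")

# Rows that would violate uq_pair are skipped instead of raising an error.
conn.execute("""
    INSERT OR IGNORE INTO MY_TABLE (VAL1, VAL2)
    SELECT VAL1, VAL2 FROM OTHER_TABLE
""")

stored = conn.execute("SELECT VAL1, VAL2 FROM MY_TABLE").fetchall()
conn.close()
print(stored)
```

Because the index is checked row by row, this also rejects mirrored pairs that both come from OTHER_TABLE in the same statement.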
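We can place a condition while creating the table itself. Note that many databases, MySQL included, reject subqueries inside CHECK constraints, so this only works where the database supports it:
create table my_table
(
val1 int,
val2 int,
check ((val1, val2) not in (
select my_table.val1, my_table.val2 from other_table, my_table where my_table.val1 = other_table.val2 and my_table.val2 = other_table.val1
union
select mt1.val1, mt1.val2 from my_table mt1, my_table mt2 where mt1.val1 = mt2.val2 and mt1.val2 = mt2.val1))
);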
I create a temporary table #tbl(account, last_update). I have the following two inserts from different sources (which could be tables in different databases) that insert each account with its last update date. For example:
create table #tbl ([account] numeric(18, 0), [last_update] datetime)
insert into #tbl(account , last_update)
select table1.account, max(table1.last_update)
from table1 join…
group by table1.account
insert into #tbl(account , last_update)
select table2.account, max(table2.last_update)
from table2 join…
group by table2.account
The problem is that this can put duplicate accounts into #tbl. I either have to avoid the duplicates during each insert or remove them after both inserts. Also, if an account has two different last_update values, I want #tbl to keep the latest one. How do I achieve this conditional insert, and which approach will perform better?
Do you think you could rewrite your query to something like:
create table #tbl ([account] numeric(18, 0), [last_update] datetime)
insert into #tbl(account , last_update)
select theaccount, MAX(theupdate) from
(
select table1.account AS theaccount, table1.last_update AS theupdate
from table1 join…
UNION ALL
select table2.account AS theaccount, table2.last_update AS theupdate
from table2 join…
) AS tmp GROUP BY theaccount
The UNION ALL builds a single derived table that combines the records of table1 and table2. From there you can treat it like a regular table, which means you can find the latest last_update for each account with a GROUP BY.
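Here is a minimal sketch of that UNION ALL plus GROUP BY pattern, run on SQLite through Python's sqlite3 module instead of SQL Server. The elided joins from the question are left out, tbl stands in for #tbl, and the sample rows are assumptions made for illustration. Dates are stored as ISO 'YYYY-MM-DD' strings so that MAX picks the latest one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (account INTEGER, last_update TEXT);
    CREATE TABLE table2 (account INTEGER, last_update TEXT);
    CREATE TABLE tbl (account INTEGER, last_update TEXT);

    INSERT INTO table1 VALUES
        (100, '2020-01-01'), (100, '2020-03-01'), (200, '2020-02-01');
    INSERT INTO table2 VALUES
        (100, '2020-05-01'), (300, '2020-04-01');

    -- Combine both sources first, then keep one row per account
    -- carrying its latest update.
    INSERT INTO tbl (account, last_update)
    SELECT theaccount, MAX(theupdate)
    FROM (SELECT account AS theaccount, last_update AS theupdate FROM table1
          UNION ALL
          SELECT account AS theaccount, last_update AS theupdate FROM table2) AS tmp
    GROUP BY theaccount;
""")

latest = conn.execute(
    "SELECT account, last_update FROM tbl ORDER BY account"
).fetchall()
conn.close()
print(latest)
```

Account 100 appears in both sources and ends up with a single row holding '2020-05-01', so no cleanup pass is needed after the insert.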
insert into #tbl(account , last_update)
select account, last_update
from
(
select a.* from #table1 a where
last_update in( select top 1 last_update from #table1 b
where
a.account = b.account
order by last_update desc)
UNION
select a.* from #table2 a where
last_update in( select top 1 last_update from #table2 b
where
a.account = b.account
order by last_update desc)
) AS tmp