I need to join two tables on one common column, but I want to maintain a one-to-one relation on the other two columns. For example:
table_1
ID_C ID_ROW_C OPT
C 1 10
C 2 10
table_2
ID_F ID_ROW_F OPT
F 3 10
F 4 10
My query:
select *
from table_1, table_2
where table_1.OPT=table_2.OPT
result
ID_C ID_ROW_C OPT ID_F ID_ROW_F
C 1 10 F 3
C 1 10 F 4
C 2 10 F 3
C 2 10 F 4
desired result:
ID_C ID_ROW_C OPT ID_F ID_ROW_F
C 1 10 F 4
C 2 10 F 3
or
ID_C ID_ROW_C OPT ID_F ID_ROW_F
C 1 10 F 3
C 2 10 F 4
How can I do this?
What you need to do is use JOIN.
SELECT * FROM table_1
JOIN table_2
ON table_1.OPT = table_2.OPT
More info from the MySQL manual: https://dev.mysql.com/doc/refman/5.0/en/join.html
And a relevant Stack Overflow discussion on the different types of JOINs: What's the difference between INNER JOIN, LEFT JOIN, RIGHT JOIN and FULL JOIN?
Since you're not providing any rule to relate the columns, you're getting exactly what you're supposed to get: All the rows of both tables that fulfill the relation.
However, you can create an "artificial" condition to get what you want... it's not pretty, but it will work:
select t1.id_c, t1.id_row_c
, t1.opt
, t2.id_f, t2.id_row_f
from
(
select @r_id_1 := (case
when @prev_opt_1 = table_1.opt then @r_id_1 + 1
else 1
end) as r_id
, table_1.*
, @prev_opt_1 := table_1.opt as new_opt_1
from (select @r_id_1 := 0, @prev_opt_1 := 0) as init_1
, table_1
order by table_1.opt, table_1.id_row_c
) as t1
inner join (
select @r_id_2 := (case
when @prev_opt_2 = table_2.opt then @r_id_2 + 1
else 1
end) as r_id
, table_2.*
, @prev_opt_2 := table_2.opt as new_opt_2
from (select @r_id_2 := 0, @prev_opt_2 := 0) as init_2, table_2
order by table_2.opt, table_2.id_row_f
) as t2 on t1.opt = t2.opt and t1.r_id = t2.r_id
See the result at SQL Fiddle.
The explanation
Let's take the first subquery:
select @r_id_1 := (case
when @prev_opt_1 = table_1.opt then @r_id_1 + 1
else 1
end) as r_id
, table_1.*
, @prev_opt_1 := table_1.opt as new_opt_1
from (select @r_id_1 := 0, @prev_opt_1 := 0) as init_1
, table_1
order by table_1.opt, table_1.id_row_c
In the from clause for this query, I'm declaring two user variables and initializing them to zero. The @r_id_1 variable will increase by one if the previous value of @prev_opt_1 is equal to the current value of opt, or reset to 1 if the value is different. The variable @prev_opt_1 will take the value of the opt column after the @r_id_1 variable is set. This means that, for each opt value, the @r_id_1 variable will have an increasing value.
The second subquery does exactly the same for the other table.
Finally, the outer-most query will join both subqueries using opt and the increasing Id.
Take the time to understand what's going on behind the scenes (execute each subquery separately and see what happens).
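If you want to experiment with the idea without a MySQL server at hand, here is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in. It is not the query above: it swaps the user variables for ROW_NUMBER(), which (as an assumption about your environment) needs MySQL 8+ or SQLite 3.25+. The function name pair_rows_one_to_one is made up for this illustration.

```python
import sqlite3


def pair_rows_one_to_one():
    """Number the rows of each table within every opt value, then join on opt and that number."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE table_1 (id_c TEXT, id_row_c INTEGER, opt INTEGER);
        CREATE TABLE table_2 (id_f TEXT, id_row_f INTEGER, opt INTEGER);
        INSERT INTO table_1 VALUES ('C', 1, 10), ('C', 2, 10);
        INSERT INTO table_2 VALUES ('F', 3, 10), ('F', 4, 10);
    """)
    rows = conn.execute("""
        SELECT t1.id_c, t1.id_row_c, t1.opt, t2.id_f, t2.id_row_f
        FROM (
            -- r_id restarts at 1 for every distinct opt value
            SELECT table_1.*,
                   ROW_NUMBER() OVER (PARTITION BY opt ORDER BY id_row_c) AS r_id
            FROM table_1
        ) AS t1
        INNER JOIN (
            SELECT table_2.*,
                   ROW_NUMBER() OVER (PARTITION BY opt ORDER BY id_row_f) AS r_id
            FROM table_2
        ) AS t2 ON t1.opt = t2.opt AND t1.r_id = t2.r_id
        ORDER BY t1.id_row_c
    """).fetchall()
    conn.close()
    return rows


for row in pair_rows_one_to_one():
    print(row)
```

With the sample data from the question, each row of table_1 is paired with exactly one row of table_2 instead of with every row that shares the same opt.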
As I said, this solution is "artificial": it's a way to get what you need, but to avoid this kind of dirty and fairly complex solution, you need to rethink your tables and make them easier to relate to each other.
Hope this helps
Related
I have a table with 3 columns:
ID,
Cancellation_Policy_Type
Cancellation_Policy_Hours.
The query I would like to get to will allow me to select:
the min Cancellation_Policy_Hours which correspond to the Free Cancellation (if exists)
if the above doesn't exist for the specific ID, then I want to check if there is a partially refundable
if none of the above exist, then check if there is No Refundable.
The below query is not correct but it may give a better idea about what I am trying to achieve:
IF (SELECT ID, Cancellation_Policy_Type, MIN(Cancellation_Policy_Hours) from MYTABLE WHERE Cancellation_Policy_Type = 'Free Cancellation') IS NOT NULL)
THEN (SELECT ID, Cancellation_Policy_Type, MIN(Cancellation_Policy_Hours) from MYTABLE WHERE Cancellation_Policy_Type = 'Free Cancellation')
ELSEIF (SELECT ID, Cancellation_Policy_Type, MIN(Cancellation_Policy_Hours) from MYTABLE WHERE Cancellation_Policy_Type = 'Free Cancellation') IS NULL AND (SELECT ID, Cancellation_Policy_Type, MIN(Cancellation_Policy_Hours from MYTABLE WHERE Cancellation_Policy_Type = 'Partially Refundable') IS NOT NULL Then (SELECT ID, Cancellation_Policy_Type, MIN(Cancellation_Policy_Hours) from MYTABLE WHERE Cancellation_Policy_Type = 'Partially Refundable')
ELSEIF (SELECT ID, Cancellation_Policy_Type, MIN(Cancellation_Policy_Hours) from MYTABLE WHERE Cancellation_Policy_Type = 'Free Cancellation') IS NULL AND (SELECT ID, Cancellation_Policy_Type, MIN(Cancellation_Policy_Hours) from MYTABLE WHERE Cancellation_Policy_Type = 'Partially Refundable') IS NULL THEN (SELECT ID, Cancellation_Policy_Type, MIN(Cancellation_Policy_Hours) from MYTABLE WHERE Cancellation_Policy_Type = 'No Refundable')
END
Below you will find an example of my dataset:
This is the table which contains all data regarding the cancellation policies of every single ID:
ID   Cancellation_Policy_Type   Cancellation_Policy_Hours
---------------------------------------------------------
1    No Refundable              17520
1    Partially Refundable       168
1    Free Cancellation          96
2    No Refundable              17520
2    Partially Refundable       336
2    Free Cancellation          48
3    No Refundable              17520
3    Partially Refundable       336
4    No Refundable              17520
Below is the desired result: a table which contains other pieces of information (including production) plus 2 columns that repeat, for every single ID, the best available cancellation policy type and hours:
ID   Most Flexible Cancellation Type   Most Flexible Cancellation Hours   Other Columns (including buckets)
-----------------------------------------------------------------------------------------------------------
1    Free Cancellation                 96                                 a
1    Free Cancellation                 96                                 b
1    Free Cancellation                 96                                 c
2    Free Cancellation                 48                                 a
2    Free Cancellation                 48                                 b
2    Free Cancellation                 48                                 c
3    Partially Refundable              336                                a
3    Partially Refundable              336                                b
3    Partially Refundable              336                                c
4    No Refundable                     17520                              a
4    No Refundable                     17520                              b
4    No Refundable                     17520                              c
SELECT
a.ID
, Most_Flexible_Policy_Type
, Most_Flexible_Cancellation_Hours
, a.BookingWindowBuckets
FROM Production a
LEFT JOIN Property b on a.ID = b.ID
GROUP BY
1,2,3,4
Thank you
I understand that, for each row in production, you want to bring from table property the cancellation policy with the least policy hours.
We can do this with window functions to rank the policies of each id, and a join to production:
select d.id,
p.cancellation_policy_type most_flexible_cancellation_type,
p.cancellation_policy_hours most_flexible_cancellation_hours
from production d
inner join (
select p.*, row_number() over(partition by id order by cancellation_policy_hours) rn
from property p
) p on p.id = d.id
where rn = 1
You have not given enough information to be able to help you with any degree of confidence that it is the "right" way but (guessing here) you could try -
SELECT
ID,
MIN(IF(Cancellation_Policy_Type = 'Free Cancellation', Cancellation_Policy_Hours, NULL)) AS minFreeCancellation,
MIN(IF(Cancellation_Policy_Type = 'Partially Refundable', Cancellation_Policy_Hours, NULL)) AS minPartiallyRefundable,
MIN(IF(Cancellation_Policy_Type = 'No Refundable', Cancellation_Policy_Hours, NULL)) AS minNoRefundable
FROM MYTABLE
WHERE ID = ?
GROUP BY ID;
If you provide an example in the form of full table structure (CREATE statement), some sample data, some stats and the desired outcome you are more likely to get the "right" answer. The above example of conditional aggregation is unlikely to be the best way to do it but it will probably provide what you are looking for. Depending on your dataset, just running the separate queries may be the best solution.
UPDATE
Here is one way of doing this using a window function (MySQL 8) -
WITH Properties (ID, Cancellation_Policy_Type, Cancellation_Policy_Hours, Most_Flexible) AS (
SELECT
ID, Cancellation_Policy_Type, Cancellation_Policy_Hours,
ROW_NUMBER() OVER (PARTITION BY ID ORDER BY
CASE Cancellation_Policy_Type
WHEN 'Free Cancellation' THEN 0
WHEN 'Partially Refundable' THEN 1
WHEN 'No Refundable' THEN 2
END ASC, Cancellation_Policy_Hours ASC
)
FROM Property
)
SELECT
a.ID,
b.Cancellation_Policy_Type,
b.Cancellation_Policy_Hours,
a.BookingWindowBuckets
FROM Production a
LEFT JOIN Properties b on a.ID = b.ID
WHERE b.Most_Flexible = 1;
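To check the ranking logic against the sample data from the question, here is a small sketch that runs the same kind of ROW_NUMBER() ordering through Python's sqlite3 module (SQLite 3.25+ assumed as a stand-in for MySQL 8). It only covers the Property side; the join to Production is left out because the question does not show that table's contents. The function name most_flexible_policies is hypothetical.

```python
import sqlite3


def most_flexible_policies():
    """Return the best policy per ID: type priority first, then the fewest hours."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE Property (
            ID INTEGER,
            Cancellation_Policy_Type TEXT,
            Cancellation_Policy_Hours INTEGER
        );
        INSERT INTO Property VALUES
            (1, 'No Refundable', 17520),
            (1, 'Partially Refundable', 168),
            (1, 'Free Cancellation', 96),
            (2, 'No Refundable', 17520),
            (2, 'Partially Refundable', 336),
            (2, 'Free Cancellation', 48),
            (3, 'No Refundable', 17520),
            (3, 'Partially Refundable', 336),
            (4, 'No Refundable', 17520);
    """)
    rows = conn.execute("""
        WITH ranked AS (
            SELECT ID, Cancellation_Policy_Type, Cancellation_Policy_Hours,
                   ROW_NUMBER() OVER (
                       PARTITION BY ID
                       ORDER BY CASE Cancellation_Policy_Type
                                    WHEN 'Free Cancellation' THEN 0
                                    WHEN 'Partially Refundable' THEN 1
                                    WHEN 'No Refundable' THEN 2
                                END,
                                Cancellation_Policy_Hours
                   ) AS Most_Flexible
            FROM Property
        )
        SELECT ID, Cancellation_Policy_Type, Cancellation_Policy_Hours
        FROM ranked
        WHERE Most_Flexible = 1
        ORDER BY ID
    """).fetchall()
    conn.close()
    return rows


for row in most_flexible_policies():
    print(row)
```

IDs 1 and 2 come back with their Free Cancellation rows, ID 3 falls back to Partially Refundable and ID 4 to No Refundable, which is the fallback order described in the question.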
As a beginner with SQL, I’m ok to do simple tasks but I’m struggling right now with multiple nested queries.
My problem is that I have 3 tables like this:
a Case table:
id nd date username
--------------------------------------------
1 596 2016-02-09 16:50:03 UserA
2 967 2015-10-09 21:12:23 UserB
3 967 2015-10-09 22:35:40 UserA
4 967 2015-10-09 23:50:31 UserB
5 580 2017-02-09 10:19:43 UserA
a Value table:
case_id labelValue_id Value Type
-------------------------------------------------
1 3633 2731858342 X
1 124 ["864","862"] X
1 8981 -2.103 X
1 27 443 X
... ... ... ...
2 7890 232478 X
2 765 0.2334 X
... ... ... ...
and a Label table:
id label
----------------------
3633 Value of W
124 Value of X
8981 Value of Y
27 Value of Z
Obviously, I want to join these tables. So I can do something like this:
SELECT *
from Case, Value, Label
where Case.id= Value.case_id
and Label.id = Value.labelValue_id
but I get pretty much everything whereas I would like to be more specific.
What I want is to do some filtering on the Case table and then use the resulting id's to join the two other tables. I'd like to:
Filter the Case.nd's such that if there are several instances of the same nd, take the oldest one,
Limit the number of nd's in the query. For example, I want to be able to join the tables for just 2, 3, 4 etc... different nd.
Use this query to make a join on the Value and Label table.
For example, the output of the queries 1 and 2 would be:
id nd date username
--------------------------------------------
1 596 2016-02-09 16:50:03 UserA
2 967 2015-10-09 21:12:23 UserB
if I ask for 2 different nd. The nd 967 appears several times but we take the oldest one.
In fact, I think I found out how to do all these things but I can't/don't know how to merge them.
To select the oldest nd, I can do something like:
select min((date)), nd,id
from Case
group by nd
Then, to limit the number of nd in the output, I found this (based on this and that) :
select *,
@num := if(@type <> t.nd, @num + 1, 1) as row_number,
@type := t.nd as dummy
from(
select min((date)), nd,id
from Case
group by nd
) as t
group by t.nd
having row_number <= 2 -- number of output
It works but I feel it's getting slow.
Finally, when I try to make a join with this subquery and with the two other tables, the processing keeps going on for ever.
During my research, I could find answers for every part of the problem but I can't merge them. Also, for the "counting" problem, where I want to limit the number of nd, I feel my approach is kind of far-fetched.
I realize this is a long question but I think I'm missing something and I wanted to give as much detail as possible.
To filter the case table so that only the oldest row of each nd remains:
select * from `case` c
where date = (Select min(date) from `case`
where nd = c.nd)
Then just join this to the other tables:
select * from `case` c
join value v on v.Case_id = c.Id
join label l on l.Id = v.labelValue_id
where date = (Select min(date) from `case`
where nd = c.nd)
To limit it to a certain number of records, there is a MySQL-specific clause; I think it's called LIMIT:
select * from `case` c
join value v on v.Case_id = c.Id
join label l on l.Id = v.labelValue_id
where date = (Select min(date) from `case`
where nd = c.nd)
Limit 4 -- <=== will limit return result set to 4 rows
If you only want records for the top N values of nd, then the LIMIT goes in a subquery restricting which values of nd to retrieve:
select * from `case` c
join value v on v.Case_id = c.Id
join label l on l.Id = v.labelValue_id
where date = (Select min(date) from `case`
where nd = c.nd)
and nd In (select distinct nd from `case`
order by nd desc Limit N)
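For anyone who wants to try the "oldest row per nd, for only N values of nd" step in isolation, here is a rough sketch using Python's sqlite3 module. It makes a few assumptions that are not in the question: the table is renamed to cases (CASE is a reserved word), the nd limit is applied through a joined derived table rather than IN (... LIMIT N) (some MySQL versions reject LIMIT inside an IN subquery), and the function name oldest_cases is invented for the example.

```python
import sqlite3


def oldest_cases(n):
    """Keep the oldest row of each nd, restricted to the n highest nd values."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE cases (id INTEGER, nd INTEGER, date TEXT, username TEXT);
        INSERT INTO cases VALUES
            (1, 596, '2016-02-09 16:50:03', 'UserA'),
            (2, 967, '2015-10-09 21:12:23', 'UserB'),
            (3, 967, '2015-10-09 22:35:40', 'UserA'),
            (4, 967, '2015-10-09 23:50:31', 'UserB'),
            (5, 580, '2017-02-09 10:19:43', 'UserA');
    """)
    rows = conn.execute("""
        SELECT c.id, c.nd, c.date, c.username
        FROM cases AS c
        JOIN (
            -- the derived table decides which nd values are allowed through
            SELECT DISTINCT nd FROM cases ORDER BY nd DESC LIMIT ?
        ) AS picked ON picked.nd = c.nd
        WHERE c.date = (SELECT MIN(c2.date) FROM cases AS c2 WHERE c2.nd = c.nd)
        ORDER BY c.id
    """, (n,)).fetchall()
    conn.close()
    return rows


for row in oldest_cases(2):
    print(row)
```

The rows returned here can then be joined to the value and label tables in the same way as in the queries above.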
So finally, here is what worked well for me:
select *
from (
select *
from Case
join (
select nd as T_ND, date as T_date
from Case
where nd in (select distinct nd from Case)
group by T_ND Limit 5 -- <========= Limit of nd's
) as t
on Case.nd = t.T_ND
where date = (select min(date)
from Case
where nd = t.T_ND)
) as subquery
join Value
on Value.context_id = subquery.id
join Label
on Label.id = Value.labelValue_id
Thank you @charlesbretana for putting me on the right track :).
SELECT *
FROM a
WHERE a.re_id = 3443499
AND a.id IN
(
SELECT b.rsp_id FROM b
WHERE b.f_id = 9
GROUP BY b.rsp_id
HAVING FIND_IN_SET(16, GROUP_CONCAT(b.o_id)) > 0
AND FIND_IN_SET(15, GROUP_CONCAT(b.o_id)) > 0
UNION
SELECT b.rsp_id FROM b
WHERE b.f_id = 4
GROUP BY b.rsp_id
HAVING FIND_IN_SET(5, GROUP_CONCAT(b.o_id)) > 0
)
ORDER BY id DESC
Here the "f_id" values are the keys of an array, and each key's values are the ones passed as the first parameter of the "FIND_IN_SET" function.
For example
9=>(
16,
15
),
4=>(
5
)
Sample data for the two columns f_id and o_id in table b:
f_id o_id
9 15
9 18
9 23
4 5
3 8
The gist of this answer is that the current query does not run. So, fix the syntax and ask another question.
First, you could write the query so it is syntactically correct. The query will fail as written, because the first subquery returns at least two rows and the second only one.
Second, use UNION ALL instead of UNION, unless you specifically want to incur the overhead of removing duplicates.
Third, the ORDER BY will generate an error.
Fourth, the GROUP_CONCAT() is dangerous and unnecessary.
I'm not 100% sure this is the intention, but I would start with a query like this:
SELECT a.id, a.re_id
FROM a
WHERE a.re_id = 3443499 AND
a.id IN (SELECT b.rsp_id
FROM b
WHERE b.f_id = 9
GROUP BY b.rsp_id
HAVING MAX(b.o_id = 16) > 0 AND
MAX(b.o_id = 15) > 0
)
UNION ALL
SELECT b.rsp_id, NULL
FROM b
WHERE b.f_id = 4
GROUP BY b.rsp_id
HAVING MAX(b.o_id = 5) > 0
ORDER BY id;
Then, if you want this optimized, I would suggest asking another question, along with relevant information about the table structures and current performance.
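To see the HAVING MAX(condition) > 0 idea in action without GROUP_CONCAT(), here is a small sketch run through Python's sqlite3 module, which also evaluates a comparison to 0 or 1. The rows in table b (including the rsp_id values) are invented for the illustration, and only the subquery part is shown, not the outer query on table a. The function name matching_rsp_ids is hypothetical.

```python
import sqlite3


def matching_rsp_ids():
    """Find rsp_ids that have both o_id 15 and 16 under f_id 9, or o_id 5 under f_id 4."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE b (rsp_id INTEGER, f_id INTEGER, o_id INTEGER);
        -- made-up sample rows
        INSERT INTO b VALUES
            (1, 9, 15), (1, 9, 16), (1, 9, 23),
            (2, 9, 15), (2, 4, 5),
            (3, 9, 16), (3, 9, 18);
    """)
    rows = conn.execute("""
        SELECT rsp_id FROM b
        WHERE f_id = 9
        GROUP BY rsp_id
        -- each MAX() is 1 only if at least one row in the group satisfies the comparison
        HAVING MAX(o_id = 16) > 0 AND MAX(o_id = 15) > 0
        UNION
        SELECT rsp_id FROM b
        WHERE f_id = 4
        GROUP BY rsp_id
        HAVING MAX(o_id = 5) > 0
        ORDER BY rsp_id
    """).fetchall()
    conn.close()
    return [r[0] for r in rows]


print(matching_rsp_ids())
```

In this made-up data, rsp_id 1 qualifies through the f_id = 9 branch, rsp_id 2 through the f_id = 4 branch, and rsp_id 3 is rejected because it has 16 but not 15.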
I have the following table (call it trans):
issue_id: state_one: state_two: timer:
1 A B 1
1 B C 3
2 A B 2
2 B C 4
2 C D 7
I'd like to get the difference in 'timer' between consecutive rows, but only those with the same issue_id.
Expected result:
issue_id: state_one: state_two: timer: time_diff:
1 B C 3 2
2 B C 4 2
2 C D 7 3
When taking the time difference between two rows, I'd like the result displayed next to the later row.
If we only had one, time-ordered issue in the table, the following code works fine:
select
X.issue_id,
X.timer as X_timer,
Y.timer as Y_timer,
(X.timer - Y.timer) as time_diff
from trans X
cross join trans Y
where Y.timer in (
select
max(Z.timer)
from trans Z
where Z.timer < X.timer);
I want to generalize this approach to handle MANY issues with time-ordered state changes.
My idea was to add the following condition, but it only works if consecutive events belong to the same issue (not the case in the real world):
... where Z.timer < X.timer)
and X.issue_id = Y.issue_id;
Question: In MySQL, can I do this iteratively (i.e. calculate differences for issue_id=1, then for issue_id=2, and so on)? A function or subquery?
Other strategies? Constraint: I have read-only privileges. I truly appreciate the help!
EDIT: I added expected output, added a row to my example table, and clarified.
select
issue_id, (MAX(timer)-MIN(timer)) as diff from trans
group by issue_id
Assuming timer or (issue_id,timer) is PRIMARY...
SELECT a.*, a.timer-MAX(b.timer)
FROM trans a
JOIN trans b
ON b.issue_id = a.issue_id
AND b.timer < a.timer
GROUP
BY a.issue_id
, a.timer;
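Here is a quick way to try the self-join approach on the sample data from the question, sketched with Python's sqlite3 module as a stand-in for MySQL. The query follows the same idea (join each row to the earlier rows of the same issue and subtract the latest of them), but it groups by every selected column of a so that it does not depend on how the engine treats non-aggregated columns. The function name timer_diffs is made up.

```python
import sqlite3


def timer_diffs():
    """For each row, subtract the latest earlier timer within the same issue_id."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE trans (issue_id INTEGER, state_one TEXT, state_two TEXT, timer INTEGER);
        INSERT INTO trans VALUES
            (1, 'A', 'B', 1),
            (1, 'B', 'C', 3),
            (2, 'A', 'B', 2),
            (2, 'B', 'C', 4),
            (2, 'C', 'D', 7);
    """)
    rows = conn.execute("""
        SELECT a.issue_id, a.state_one, a.state_two, a.timer,
               a.timer - MAX(b.timer) AS time_diff
        FROM trans a
        JOIN trans b
          ON b.issue_id = a.issue_id
         AND b.timer < a.timer          -- only earlier rows of the same issue
        GROUP BY a.issue_id, a.state_one, a.state_two, a.timer
        ORDER BY a.issue_id, a.timer
    """).fetchall()
    conn.close()
    return rows


for row in timer_diffs():
    print(row)
```

The first row of each issue has no earlier row to join to, so the inner join drops it, which matches the expected result in the question.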
Select * from #Temp
Select T1.Issuerid,T1.stateone,T1.statetwo,MAX(T1.timer)-MIN(T2.timer) as Time_Diff from #Temp T1
left join #Temp T2 on
T1.issuerid=T2.IssuerId
group by T1.Issuerid,T1.stateone,T1.statetwo
Please Give me Reply
I have started learning MySQL and I'm having a problem with JOIN.
I have two tables: purchase and sales
purchase
--------------
p_id date p_cost p_quantity
---------------------------------------
1 2014-03-21 100 5
2 2014-03-21 20 2
sales
--------------
s_id date s_cost s_quantity
---------------------------------------
1 2014-03-21 90 9
2 2014-03-22 20 2
I want these two tables to be joined where purchase.date=sales.date to get one of the following results:
Option 1:
p_id date p_cost p_quantity s_id date s_cost s_quantity
------------------------------------------------------------------------------
1 2014-03-21 100 5 1 2014-03-21 90 9
2 2014-03-21 20 2 NULL NULL NULL NULL
NULL NULL NULL NULL 2 2014-03-22 20 2
Option 2:
p_id date p_cost p_quantity s_id date s_cost s_quantity
------------------------------------------------------------------------------
1 2014-03-21 100 5 NULL NULL NULL NULL
2 2014-03-21 20 2 1 2014-03-21 90 9
NULL NULL NULL NULL 2 2014-03-22 20 2
The main problem lies in the 2nd row of the first result. I don't want the values
2014-03-21, 90, 9 again in row 2... I want NULL instead.
I don't know whether it is possible to do this. I would be grateful if anyone could help me out.
I tried using left join
SELECT *
FROM sales
LEFT JOIN purchase ON sales.date = purchase.date
output:
s_id date s_cost s_quantity p_id date p_cost p_quantity
1 2014-03-21 90 9 1 2014-03-21 100 5
1 2014-03-21 90 9 2 2014-03-21 20 2
2 2014-03-22 20 2 NULL NULL NULL NULL
but I want 1st 4 values of 2nd row to be NULL
Since there are no common table expressions or full outer joins to work with, the query will have some duplication, and it needs to use a left join unioned with a right join instead:
SELECT p_id, p.date p_date, p_cost, p_quantity,
s_id, s.date s_date, s_cost, s_quantity
FROM (
SELECT *,(SELECT COUNT(*) FROM purchase p1
WHERE p1.date=p.date AND p1.p_id<p.p_id) rn FROM purchase p
) p LEFT JOIN (
SELECT *,(SELECT COUNT(*) FROM sales s1
WHERE s1.date=s.date AND s1.s_id<s.s_id) rn FROM sales s
) s
ON s.date=p.date AND s.rn=p.rn
UNION
SELECT p_id, p.date p_date, p_cost, p_quantity,
s_id, s.date s_date, s_cost, s_quantity
FROM (
SELECT *,(SELECT COUNT(*) FROM purchase p1
WHERE p1.date=p.date AND p1.p_id<p.p_id) rn FROM purchase p
) p RIGHT JOIN (
SELECT *,(SELECT COUNT(*) FROM sales s1
WHERE s1.date=s.date AND s1.s_id<s.s_id) rn FROM sales s
) s
ON s.date=p.date AND s.rn=p.rn
An SQLfiddle to test with.
In a general sense, what you're looking for is called a FULL OUTER JOIN, which is not directly available in MySQL. Instead you only get LEFT JOIN and RIGHT JOIN, which you can UNION together to get essentially the same result. For a very thorough discussion on this subject, see Full Outer Join in MySQL.
If you need help understanding the different ways to JOIN a table, I recommend A Visual Explanation of SQL Joins.
The way this is different from a regular FULL OUTER JOIN is that you're only including any particular row from either table at most once in the JOIN result. The problem being, if you have one purchase record and two sales records on a particular day, which sales record is the purchase record associated with? What is the relationship you're trying to represent between these two tables?
It doesn't sound like there's any particular relationship between purchase and sales records, except that some of them happened to take place on the same day. In which case, you're using the wrong tool for the job. If all you want to do is display these tables side by side and line the rows up by date, you don't need a JOIN at all. Instead, you should SELECT each table separately and do your formatting with some other tool (or manually).
Here's another way to get the same result, but the EXPLAIN for this is horrendous; and performance with large sets is going to be atrocious.
This is essentially two queries UNIONed together. The first query is essentially "purchase LEFT JOIN sales", the second query is essentially "sales ANTI JOIN purchase".
Because there is no foreign key relationship between the two tables, other than rows matching on date, we have to "invent" a key we can join on; we use user variables to assign ascending integer values to each row within a given date, so we can match row 1 from purchase to row 1 from sales, etc.
I wouldn't normally generate this type of result using SQL; it's not a typical JOIN operation, in the sense of how we traditionally join tables.
But, if I had to produce the specified resultset using MySQL, I would do it like this:
SELECT p.p_id
, p.p_date
, p.p_cost
, p.p_quantity
, s.s_id
, s.s_date
, s.s_cost
, s.s_quantity
FROM ( SELECT @pl_i := IF(pl.date = @pl_prev_date,@pl_i+1,1) AS i
, @pl_prev_date := pl.date AS p_date
, pl.p_id
, pl.p_cost
, pl.p_quantity
FROM purchase pl
JOIN ( SELECT @pl_i := 0, @pl_prev_date := NULL ) pld
ORDER BY pl.date, pl.p_id
) p
LEFT
JOIN ( SELECT @sr_i := IF(sr.date = @sr_prev_date,@sr_i+1,1) AS i
, @sr_prev_date := sr.date AS s_date
, sr.s_id
, sr.s_cost
, sr.s_quantity
FROM sales sr
JOIN ( SELECT @sr_i := 0, @sr_prev_date := NULL ) srd
ORDER BY sr.date, sr.s_id
) s
ON s.s_date = p.p_date
AND s.i = p.i
UNION ALL
SELECT p.p_id
, p.p_date
, p.p_cost
, p.p_quantity
, s.s_id
, s.s_date
, s.s_cost
, s.s_quantity
FROM ( SELECT @sl_i := IF(sl.date = @sl_prev_date,@sl_i+1,1) AS i
, @sl_prev_date := sl.date AS s_date
, sl.s_id
, sl.s_cost
, sl.s_quantity
FROM sales sl
JOIN ( SELECT @sl_i := 0, @sl_prev_date := NULL ) sld
ORDER BY sl.date, sl.s_id
) s
LEFT
JOIN ( SELECT @pr_i := IF(pr.date = @pr_prev_date,@pr_i+1,1) AS i
, @pr_prev_date := pr.date AS p_date
, pr.p_id
, pr.p_cost
, pr.p_quantity
FROM purchase pr
JOIN ( SELECT @pr_i := 0, @pr_prev_date := NULL ) prd
ORDER BY pr.date, pr.p_id
) p
ON p.p_date = s.s_date
AND p.i = s.i
WHERE p.p_date IS NULL
ORDER BY COALESCE(p_date,s_date),COALESCE(p_id,s_id)
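For comparison, the same "invented key" idea can be sketched more compactly where window functions and common table expressions are available (an assumption: MySQL 8+ or, as used here through Python's sqlite3 module, SQLite 3.25+). This is not the query above, just the same two-part shape (purchase LEFT JOIN sales, then sales ANTI JOIN purchase) with ROW_NUMBER() in place of user variables. The function name side_by_side is made up for the example.

```python
import sqlite3


def side_by_side():
    """Line up purchase and sales rows by date and per-date row number."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE purchase (p_id INTEGER, date TEXT, p_cost INTEGER, p_quantity INTEGER);
        CREATE TABLE sales (s_id INTEGER, date TEXT, s_cost INTEGER, s_quantity INTEGER);
        INSERT INTO purchase VALUES (1, '2014-03-21', 100, 5), (2, '2014-03-21', 20, 2);
        INSERT INTO sales VALUES (1, '2014-03-21', 90, 9), (2, '2014-03-22', 20, 2);
    """)
    rows = conn.execute("""
        WITH p AS (
            SELECT p_id, date AS p_date, p_cost, p_quantity,
                   ROW_NUMBER() OVER (PARTITION BY date ORDER BY p_id) AS i
            FROM purchase
        ),
        s AS (
            SELECT s_id, date AS s_date, s_cost, s_quantity,
                   ROW_NUMBER() OVER (PARTITION BY date ORDER BY s_id) AS i
            FROM sales
        )
        SELECT * FROM (
            -- every purchase row, with the sales row of the same date and position if any
            SELECT p.p_id, p.p_date, p.p_cost, p.p_quantity,
                   s.s_id, s.s_date, s.s_cost, s.s_quantity
            FROM p LEFT JOIN s ON s.s_date = p.p_date AND s.i = p.i
            UNION ALL
            -- sales rows that found no purchase partner (anti join)
            SELECT p.p_id, p.p_date, p.p_cost, p.p_quantity,
                   s.s_id, s.s_date, s.s_cost, s.s_quantity
            FROM s LEFT JOIN p ON p.p_date = s.s_date AND p.i = s.i
            WHERE p.p_id IS NULL
        ) AS u
        ORDER BY COALESCE(p_date, s_date), COALESCE(p_id, s_id)
    """).fetchall()
    conn.close()
    return rows


for row in side_by_side():
    print(row)
```

With the sample data this reproduces "Option 1" from the question: the second purchase on 2014-03-21 gets NULLs on the sales side, and the sale on 2014-03-22 gets NULLs on the purchase side.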