Can you help me write my MySQL join query? Here is what I have so far:
SELECT * FROM table1 LEFT JOIN table2 ON table2.id IN
(table1.comma_separated_ids) WHERE table1.id = [some id]
where table1.comma_separated_ids is a VARCHAR column containing a list of comma separated IDs (integers) that relate to IDs in table2.
The above query returns only one row when it should return one row for every ID in table1.comma_separated_ids that has a matching row in table2.
What I'm actually trying to do is a little more complex but it's hard to explain so I'm starting here. Any help?
In MySQL, you cannot pass a comma-separated list as a single argument to IN. The whole value is treated as one string, not as a list of values.
You can use find_in_set():
SELECT *
FROM table1 LEFT JOIN
table2
ON find_in_set(table2.id, table1.comma_separated_ids)
WHERE table1.id = XXX;
However, the bigger issue is that you are storing ids in a comma-separated list. These should be in a separate junction table, with one row per id. It is bad enough to store lists in strings; storing integer ids as strings is even worse.
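A minimal sketch of the junction-table layout suggested above. It uses Python's built-in sqlite3 module as a stand-in for MySQL so that it runs without a server; the join itself is plain SQL. The junction table name table1_table2, its columns, and the label column are assumptions made for illustration, not names from the question.

```python
import sqlite3

# SQLite stands in for MySQL here so the sketch runs anywhere.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER PRIMARY KEY);
    CREATE TABLE table2 (id INTEGER PRIMARY KEY, label TEXT);
    -- Hypothetical junction table: one row per (table1 row, table2 id) pair.
    CREATE TABLE table1_table2 (
        table1_id INTEGER NOT NULL REFERENCES table1(id),
        table2_id INTEGER NOT NULL REFERENCES table2(id),
        PRIMARY KEY (table1_id, table2_id)
    );
    INSERT INTO table1 VALUES (1), (2);
    INSERT INTO table2 VALUES (10, 'a'), (20, 'b'), (30, 'c');
    -- Replaces comma_separated_ids = '10,30' on table1 row 1.
    INSERT INTO table1_table2 VALUES (1, 10), (1, 30);
""")


def related_rows(table1_id):
    """Return (table1.id, table2.id, table2.label) for one table1 row."""
    return conn.execute(
        """
        SELECT t1.id, t2.id, t2.label
        FROM table1 t1
        LEFT JOIN table1_table2 j ON j.table1_id = t1.id
        LEFT JOIN table2 t2 ON t2.id = j.table2_id
        WHERE t1.id = ?
        ORDER BY t2.id
        """,
        (table1_id,),
    ).fetchall()


print(related_rows(1))  # one row per related table2 row
print(related_rows(2))  # a single row with NULLs, because of the LEFT JOIN
```

With this layout the join condition is an ordinary equality on integer columns, so it can use indexes, which a string search over a list cannot.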
I don't know if this is possible, but can MySQL do a subselect and retrieve multiple records?
Here is my simplified query:
SELECT table1.*,
(
SELECT table2.*
FROM Table2 table2
WHERE table2.key_id = table1.key_id
)
FROM Table1 table1
Basically, Table2 has X amount of records that I need to pull back in the query and I don't want to have to run a secondary query (for instance get the results from Table1 and then loop over those results and then get all the results from Table2).
Thanks.
No. The subquery in the SELECT clause is called a scalar subquery. A scalar subquery has two important properties:
It can only retrieve one column.
It can only retrieve zero or one rows.
A scalar subquery -- as its name implies -- substitutes for a scalar value in an expression. If the subquery returns no rows, the value used in the expression is NULL.
In your case, you can use a LEFT JOIN instead:
SELECT t1.*, t2.*
FROM Table1 t1 LEFT JOIN
Table2 t2
ON t2.key_id = t1.key_id;
Note that table aliases are a good thing. However, they should make the query simpler, so repeating the table name is not a big win.
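To illustrate what the LEFT JOIN returns, here is a small sketch. It runs the join through Python's sqlite3 module instead of MySQL so that it is self-contained, and the title and note columns and the sample rows are made up for the example.

```python
import sqlite3

# SQLite stands in for MySQL; the JOIN syntax is the same.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Table1 (key_id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE Table2 (id INTEGER PRIMARY KEY, key_id INTEGER, note TEXT);
    INSERT INTO Table1 VALUES (1, 'first'), (2, 'second');
    -- Two Table2 rows belong to key_id 1, none to key_id 2.
    INSERT INTO Table2 VALUES (100, 1, 'x'), (101, 1, 'y');
""")

rows = conn.execute("""
    SELECT t1.key_id, t1.title, t2.id, t2.note
    FROM Table1 t1
    LEFT JOIN Table2 t2 ON t2.key_id = t1.key_id
    ORDER BY t1.key_id, t2.id
""").fetchall()

for row in rows:
    print(row)
```

Each Table1 row is repeated once per matching Table2 row, and a Table1 row without matches still appears once, with NULL (None in Python) in the Table2 columns.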
MySQL can do a subquery that returns multiple rows or multiple columns, but it's not valid to do that in a scalar context.
You're putting a subquery in a scalar context. In other words, in the select-list, a subquery must return one column and one row (or zero rows), because it will be used for one item on the respective row as it uses the select-list to build a result.
I have a set of 15 tables with a large number of columns. I know how the joins work, but many of the tables have similar column names.
For example:
Select *
from table1
join table2 on table1.id = table2.id
I get back columns:
id|id|name|name
etc... I don't know which columns correspond to which tables.
What I'd like to get returned is:
table1.id|table2.id|table1.name|table2.name
I realize that I could spell out the select statement like:
select table1.id 'table1.id', table2.id 'table2.id'
But there are hundreds of column names, and dozens of tables, so this would be impractical and it seems like something that should be easy to do.
Basically what I want to do can be put neatly in the following two steps.
Execute a query on a particular table. I then get the result set.
One of the columns in the result set, say 'id', has only numbers in it. I want to take this 'id' in every row of the result set, join it with another table called ID_NAMES, and replace the individual ids with the corresponding names from ID_NAMES.
In a sense, this is like performing a post SQL query on the result set I obtain from executing a pre SQL query.
Is there any way I can accomplish this?
SELECT table1.id, table2.name FROM table1 INNER JOIN table2 on table1.id = table2.id
Try the query above.
Refer at JNevill
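One way to read the "pre query, then post query" idea is a derived table: the first query becomes a subquery in the FROM clause and is joined to ID_NAMES in the same statement. The sketch below assumes that ID_NAMES has id and name columns and invents an orders table for the first query; it runs on Python's sqlite3 module rather than MySQL so that it is self-contained.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical source table for the first query.
    CREATE TABLE orders (order_no INTEGER PRIMARY KEY, id INTEGER, amount INTEGER);
    -- Assumed shape of ID_NAMES: one name per id.
    CREATE TABLE ID_NAMES (id INTEGER PRIMARY KEY, name TEXT);
    INSERT INTO orders VALUES (1, 7, 50), (2, 8, 20), (3, 7, 5);
    INSERT INTO ID_NAMES VALUES (7, 'alice'), (8, 'bob');
""")

rows = conn.execute("""
    SELECT r.order_no, n.name, r.amount
    FROM (
        -- The "pre" query: any SELECT that produces an id column.
        SELECT order_no, id, amount FROM orders WHERE amount >= 20
    ) AS r
    INNER JOIN ID_NAMES n ON n.id = r.id   -- the "post" step: swap id for name
    ORDER BY r.order_no
""").fetchall()

print(rows)
```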
I have read a few posts on SO on how to delete duplicates by comparing a table with another instance of itself. However, I don't want to delete the duplicates; I want to compare them.
E.g., I have the fields "id", "sold_price", "bruksareal", "kommunenr", "Gårdsnr", "Bruksnr", "Festenr", "Seksjonsnr". All fields are int.
I want to identify the rows that are duplicates/identical (the same bruksareal, kommunenr, gårdsnr, bruksnr,festenr and seksjonsnr). If identical then I want to give these rows a unique reference number.
I believe this will make it easier to identify the rows that I later want to compare on other fields (e.g. "sold_price", "sold_date", etc.).
I'm open to suggestions if you believe my approach is wrong...
Perform a join on the table to itself across all fields, then use an EXISTS query, such as:
Update Table1
Set reference = UUID()
Where exists (
Select tb1.id
from Table1 tb1 inner join Table1 tb2 on
tb1.Field1 = tb2.Field1 AND
tb1.Field2 = tb2.Field2 AND
etc
Where tb1.Id = Table1.Id
And tb1.Id != tb2.Id
)
Actually, you can simplify this with just a join. In MySQL, the join goes before SET (UPDATE ... FROM is SQL Server syntax):
UPDATE Table1
INNER JOIN Table1 tb2 ON
Table1.Field1 = tb2.Field1 AND
Table1.Field2 = tb2.Field2 AND
etc
SET Table1.reference = UUID()
WHERE Table1.Id != tb2.Id
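The self-join condition can be checked on its own before running any UPDATE. The following sketch only selects the ids that the join would touch. It uses Python's sqlite3 module as a stand-in for MySQL, with two placeholder fields and made-up rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Table1 (Id INTEGER PRIMARY KEY, Field1 INTEGER, Field2 INTEGER);
    -- Rows 1 and 2 are identical on both fields, and so are rows 4 and 5.
    INSERT INTO Table1 VALUES
        (1, 10, 20), (2, 10, 20), (3, 10, 99), (4, 5, 5), (5, 5, 5);
""")

duplicate_ids = [row[0] for row in conn.execute("""
    SELECT DISTINCT tb1.Id
    FROM Table1 tb1
    INNER JOIN Table1 tb2
        ON tb1.Field1 = tb2.Field1
       AND tb1.Field2 = tb2.Field2
    WHERE tb1.Id != tb2.Id        -- a row must not match itself
    ORDER BY tb1.Id
""")]

print(duplicate_ids)
```

Note that = never matches NULL against NULL, so rows whose compared fields contain NULL will not be reported as duplicates by this join.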
Depending on where you want to do that, I would go for a hash implementation. For every insert, calculate a hash of the relevant columns (in a trigger, maybe). After that you should be able to find out very easily which rows are duplicated. If you index the hash column, the queries should be pretty fast, but remember that it is still not an int column, so it will get a little slower over time.
After this you can do whatever you please with the duplicated records, without very expensive queries on the database.
Later edit: Make sure that you convert NULL values into some defined value, since some MySQL functions, such as MD5, return NULL if the operand is NULL. The same goes for CONCAT: if any operand is NULL, it returns NULL. CONCAT_WS behaves differently: it skips NULL arguments after the separator instead of returning NULL.
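The hashing idea, including the NULL handling, can be sketched outside the database. This is an illustration in Python rather than a MySQL trigger; the sentinel string, the separator, and the shortened sample rows are assumptions made for the example.

```python
import hashlib
from collections import defaultdict

# Hypothetical sentinel; pick a value that cannot occur in the real data.
NULL_MARKER = "<NULL>"


def row_hash(values):
    """MD5 over the compared columns, with None mapped to a sentinel first."""
    parts = [NULL_MARKER if v is None else str(v) for v in values]
    return hashlib.md5("|".join(parts).encode("utf-8")).hexdigest()


# (id, bruksareal, kommunenr, gardsnr): a shortened, made-up sample
rows = [
    (1, 80, 301, 12),
    (2, 80, 301, 12),
    (3, 80, 301, None),
    (4, 80, 301, None),
    (5, 95, 301, 12),
]

# Group row ids by the hash of their compared columns.
groups = defaultdict(list)
for row_id, *fields in rows:
    groups[row_hash(fields)].append(row_id)

duplicates = sorted(ids for ids in groups.values() if len(ids) > 1)
print(duplicates)
```

Mapping None to a sentinel keeps rows 3 and 4 in the same group, and the separator keeps column boundaries intact so that, for example, (1, 23) and (12, 3) do not produce the same input string.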
I have a table that compares the competitiveness of airline routes in the United States. Some of the fields in the table are id, route_id1, route_id2, airline_id1, airline_id2, sources_airport_id, and destination_airport_id.
This table is the result of self-joining the routes table, which consists of route maps.
As a result, the table contains records that are effectively duplicates.
For example,
route1 is competitive with route2 because they have the same source_airport and destination_airport but a different airline_id. But I have two records: one comparing route1 to route2 and one comparing route2 to route1. They are the same comparison, just ordered differently.
I've tried to fetch the duplicates by self-joining:
SELECT t1.*
FROM routes AS t1, routes AS t2
WHERE t1.route_id1 = t2.route_id2 AND t1.route_id2 = t2.route_id1
But this query just returns as many records as there are in the table.
How do I get rid of the "duplicate" data?
Thanks in advance.
The problem is that you have no condition to separate t1 and t2. First you'll get duplicates where t1 and t2 are swapped. Secondly, if any rows have route_id1 = route_id2, you'll get those rows too, in both t1 and t2 of the result set.
The simplest way to get around this would be:
SELECT t1.* FROM routes AS t1, routes AS t2
WHERE t1.route_id1 = t2.route_id2 AND t1.route_id2 = t2.route_id1
AND t2.id > t1.id
The added criterion is that one row must have a larger id than the other. This means that t1, as returned, will always be the row with the lower id. You can of course replace it with a < or swap the parameters to get the row with the upper id.
That will get rid of most of the duplicates. If you have proper duplicates too in the database, those will create some duplicate rows in the result set of the above query. The reason is that a "duplicate" might be detected as being a "duplicate" of two different corresponding rows, which in turn are actual duplicates of each other.
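The effect of the extra id comparison can be seen on a tiny sample. The sketch below runs the same self-join with and without the added condition. It uses Python's sqlite3 module as a stand-in for MySQL, and the four sample rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE routes (id INTEGER PRIMARY KEY, route_id1 INTEGER, route_id2 INTEGER);
    -- Rows 1 and 2 are the same comparison in both orders; so are rows 3 and 4.
    INSERT INTO routes VALUES (1, 11, 22), (2, 22, 11), (3, 33, 44), (4, 44, 33);
""")

QUERY = """
    SELECT t1.id, t1.route_id1, t1.route_id2
    FROM routes AS t1, routes AS t2
    WHERE t1.route_id1 = t2.route_id2 AND t1.route_id2 = t2.route_id1
    {extra}
    ORDER BY t1.id
"""

# Without the tie-breaker every row finds its mirror image, so all rows come back.
without_filter = conn.execute(QUERY.format(extra="")).fetchall()
# With it, only the lower-id row of each mirrored pair is returned.
with_filter = conn.execute(QUERY.format(extra="AND t2.id > t1.id")).fetchall()

print(len(without_filter), with_filter)
```

The first result has as many rows as the table, which matches the behaviour described in the question; the second keeps one row per mirrored pair.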
In the SELECT, use the actual names of the fields together with the DISTINCT clause instead of using t1.*.
In the list of fields, make sure you do not include the airline_id columns, as those differ between rows and would stop the records from counting as duplicates.
Have you tried using "SELECT DISTINCT t1.* FROM ..."?