If I run an inline SELECT subquery, I can filter the rows to line up with the root query, e.g.:
SELECT A.Field1, A.Field2, (SELECT B.Field1 FROM tblB as B WHERE B.AID = A.ID ORDER BY B.DateAdded LIMIT 1)
FROM tblA as A
But if I try to move that SELECT subquery to a joined subquery, I can't use the correlated WHERE criteria (WHERE B.AID = A.ID), and there's no way to limit tblB's rows to only those matching tblA's row.
So what is the correct way to modify the query so I can select B.Field1, B.Field2, etc. when dealing with a 1:M relationship?
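One common approach, sketched below on the assumption that the wanted tblB row per tblA row is the earliest by DateAdded (as in the ORDER BY above), is to join a derived table that finds each AID's first DateAdded and then join tblB back on it. The alias firstB is made up for illustration:
-- Minimal sketch; assumes (AID, DateAdded) uniquely identifies a tblB row
SELECT A.Field1, A.Field2, B.Field1, B.Field2
FROM tblA AS A
JOIN (SELECT AID, MIN(DateAdded) AS FirstAdded
      FROM tblB
      GROUP BY AID) AS firstB ON firstB.AID = A.ID
JOIN tblB AS B ON B.AID = firstB.AID AND B.DateAdded = firstB.FirstAdded;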
SELECT breakgame, Streak,
       ((SELECT (maxGameId - gameId) as gameGap
         FROM game_result
         WHERE game_result.breakgame >= kokopam.game_streak.breakgame
         ORDER BY gameId DESC LIMIT 1) / Streak) as nowWeight
FROM kokopam.game_streak,
     (SELECT max(gameId) as maxGameId FROM game_result ORDER BY gameId DESC LIMIT 1) maxGameId
WHERE breakgame >= 2
How can I change this query to use a JOIN?
Please help me.
First of all, you should have a condition in the WHERE clause that states the ID the rows share.
That said, the comma-separated FROM method you're using works the same as an INNER JOIN.
Select *
From tableA a, tableB b
Where a.id=b.id
Is the same as
Select *
From tableA a
Inner join tableB b on b.id=a.id
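Applied to the query in the question, a hedged rewrite with explicit joins might look like the sketch below. It assumes the gameGap you want is based on the highest gameId whose breakgame is at least the streak row's breakgame; the aliases s, m, gs, gr, and lg are made up:
SELECT s.breakgame, s.Streak,
       (m.maxGameId - lg.lastGameId) / s.Streak AS nowWeight
FROM kokopam.game_streak AS s
JOIN (SELECT MAX(gameId) AS maxGameId FROM game_result) AS m
JOIN (SELECT gs.breakgame, MAX(gr.gameId) AS lastGameId
      FROM (SELECT DISTINCT breakgame FROM kokopam.game_streak) AS gs
      JOIN game_result AS gr ON gr.breakgame >= gs.breakgame
      GROUP BY gs.breakgame) AS lg ON lg.breakgame = s.breakgame
WHERE s.breakgame >= 2;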
I could help you a bit more if you specify what you were trying to do in the query and the columns that the tables have.
I have two tables with some data (> 300,000 rows), and this simple query is taking ~1 second.
Any idea to make it faster?
SELECT a.*
FROM a
INNER JOIN b on (a.b_id = b.id)
WHERE b.some_int_column = 2
ORDER BY a.id DESC
LIMIT 0,10
Both a.b_id and b.some_int_column have indexes. Also, a.id and b.id are integer primary keys.
When I run an EXPLAIN, it says it uses the some_int_column index first, with temporary and filesort.
If I run this same query but order by b.id ASC instead, it takes ~0.2 ms (I know this is because in that case I'm ordering by the first table in the EXPLAIN output), but I really need to order by table a.
Is there something I am missing?
For this query:
SELECT a.*
FROM a INNER JOIN
     b
     ON a.b_id = b.id
WHERE b.some_int_column = 2
ORDER BY a.id DESC
LIMIT 0, 10;
The optimal indexes are likely to be b(some_int_column, id) and a(b_id, id).
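As a minimal sketch of creating those indexes (the index names here are made up):
CREATE INDEX idx_b_some_int_id ON b (some_int_column, id);
CREATE INDEX idx_a_b_id_id ON a (b_id, id);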
You might find that this version has better performance with these indexes:
SELECT a.*
FROM a
WHERE EXISTS (SELECT 1
              FROM b
              WHERE a.b_id = b.id AND b.some_int_column = 2
             )
ORDER BY a.id DESC
LIMIT 0, 10;
For this query, the indexes should be a(id, b_id) and b(id, some_int_column).
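Again as a sketch with made-up index names, those would be:
CREATE INDEX idx_a_id_b_id ON a (id, b_id);
CREATE INDEX idx_b_id_some_int ON b (id, some_int_column);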
SELECT a.*
FROM b
INNER JOIN a on (b.id = a.b_id)
WHERE b.some_int_column = 2
ORDER BY a.id DESC
LIMIT 0,10
Try this. Because you are filtering on a column in table b, not a column in table a, this may reduce the volume of data read. Depending on the SQL optimizer, the original query may match up all records in the join and then filter out those where some_int_column = 2; with the tables reversed, the optimizer may only match records from table b to table a that already satisfy the = 2 condition in the WHERE clause.
If I have the following two tables:
Table "a" with 2 columns: id (int) [Primary Index], column1 [Indexed]
Table "b" with 3 columns: id_table_a (int),condition1 (int),condition2 (int) [all columns as Primary Index]
I can run the following query to select rows from Table a where Table b condition1 is 1
SELECT a.id FROM a WHERE EXISTS (SELECT 1 FROM b WHERE b.id_table_a=a.id && condition1=1 LIMIT 1) ORDER BY a.column1 LIMIT 50
With a couple hundred million rows in both tables this query is very slow. If I do:
SELECT a.id FROM a INNER JOIN b ON a.id=b.id_table_a && b.condition1=1 ORDER BY a.column1 LIMIT 50
It is pretty much instant but if there are multiple matching rows in table b that match id_table_a then duplicates are returned. If I do a SELECT DISTINCT or GROUP BY a.id to remove duplicates the query becomes extremely slow.
Here is an SQLFiddle showing the example queries: http://sqlfiddle.com/#!9/35eb9e/10
Is there a way to make a join without duplicates fast in this case?
*Edited to show that INNER instead of LEFT join didn't make much of a difference
*Edited to show moving condition to join did not make much of a difference
*Edited to add LIMIT
*Edited to add ORDER BY
You can try an INNER JOIN with DISTINCT:
SELECT distinct a.id
FROM a INNER JOIN b ON a.id=b.id_table_a AND b.condition1=1
But when using DISTINCT on SELECT *, be sure the column list doesn't include a unique id, which would return the wrong result; in that case use
SELECT distinct col1, col2, col3 ....
FROM a INNER JOIN b ON a.id=b.id_table_a AND b.condition1=1
You could also add a composite index that also uses condition1, e.g.: KEY(id, condition1).
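A sketch of that suggestion, assuming the composite index belongs on table b's join column (the index name is made up):
CREATE INDEX idx_b_join_cond1 ON b (id_table_a, condition1);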
If you can, you could also run
ANALYZE TABLE table_name;
on both tables.
Another technique is to try reversing the lead table:
SELECT distinct a.id
FROM b INNER JOIN a ON a.id=b.id_table_a AND b.condition1=1
This uses the most selective table to lead the query.
With this, the index usage appears different: http://sqlfiddle.com/#!9/35eb9e/15 (the last query adds a "Using where").
# USING DISTINCT TO REMOVE DUPLICATES without col and order
EXPLAIN
SELECT DISTINCT a.id
FROM a
INNER JOIN b ON a.id=b.id_table_a AND b.condition1=1
;
It looks like I found the answer.
SELECT a.id FROM a
INNER JOIN b ON
b.id_table_a=a.id &&
b.condition1=1 &&
b.condition2=(select b.condition2 from b WHERE b.id_table_a=a.id && b.condition1=1 LIMIT 1)
ORDER BY a.column1
LIMIT 5;
I don't know if there is a flaw in this or not; please let me know if so. If anyone has a way to compress this somehow, I will gladly accept your answer.
SELECT id FROM a INNER JOIN b ON a.id=b.id_table_a AND b.condition1=1
Move the condition into the ON clause of the join; that way the index on table b can be used for filtering. Also, use INNER JOIN instead of LEFT JOIN.
Then you should have fewer results that have to be grouped.
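A hedged completion of that idea, adding back the de-duplication, ordering, and limit from the question:
SELECT a.id
FROM a
INNER JOIN b ON a.id = b.id_table_a AND b.condition1 = 1
GROUP BY a.id          -- removes the duplicates produced by the 1:M join
ORDER BY a.column1     -- valid alongside GROUP BY a.id since a.id is the primary key
LIMIT 50;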
Wrap the fast version in a query that handles de-duping and limit:
SELECT DISTINCT * FROM (
    SELECT a.id, a.column1   -- column1 must be selected here so the outer ORDER BY can see it
    FROM a
    JOIN b ON a.id = b.id_table_a && b.condition1 = 1
) x
ORDER BY column1
LIMIT 50
We know the inner query is fast. The de-duping and ordering has to happen somewhere. This way it happens on the smallest rowset possible.
See SQLFiddle.
Option 2:
Try the following:
Create indexes as follows:
create index a_id_column1 on a(id, column1)
create index b_id_table_a_condition1 on b(id_table_a, condition1)
These are covering indexes: ones that contain all the columns you need for the query, which in turn means that index-only access to the data can produce the result.
Then try this:
SELECT * FROM (
    SELECT a.id, MIN(a.column1) column1
    FROM a
    JOIN b ON a.id = b.id_table_a
        AND b.condition1 = 1
    GROUP BY a.id) x
ORDER BY column1
LIMIT 50
Use your fast query in a subselect and remove the duplicates in the outer select:
SELECT DISTINCT sub.id
FROM (
SELECT a.id
FROM a
INNER JOIN b ON a.id=b.id_table_a && b.condition1=1
WHERE b.id_table_a > :offset
ORDER BY a.column1
LIMIT 50
) sub
Because of the duplicate removal you might get fewer than 50 rows. Just repeat the query until you have enough rows: start with :offset = 0, and use the last ID from the previous result as :offset in each following query.
If you know your statistics, you can also use two limits. The limit in the inner query should be high enough that it returns 50 distinct rows with a probability that is high enough for you.
SELECT DISTINCT sub.id
FROM (
SELECT a.id
FROM a
INNER JOIN b ON a.id=b.id_table_a && b.condition1=1
ORDER BY a.column1
LIMIT 1000
) sub
LIMIT 50
For example: if you have an average of 10 duplicates per ID, LIMIT 1000 in the inner query will return an average of 100 distinct rows, and it is very unlikely that you get fewer than 50.
If the condition2 column is a boolean, you know that you can have a maximum of two duplicates, in which case LIMIT 100 in the inner query would be enough.
Here's my SQL statement:
SELECT
    tA.a1, GROUP_CONCAT(tB.b2) AS b2
FROM
    tableA tA
LEFT JOIN
    tableB tB ON tA.a2 = tB.b1
WHERE
    CONCAT(tA.a1, b2) LIKE '%somestring%'
GROUP BY tA.a1;
I get an SQL error saying something along the lines of "unknown column name b2 in WHERE".
SELECT
    tA.a1, GROUP_CONCAT(tB.b2) AS b2
FROM
    tableA tA
LEFT JOIN
    tableB tB ON tA.a2 = tB.b1
GROUP BY tA.a1
HAVING
    CONCAT(tA.a1, b2) LIKE '%somestring%';
You can't use aliases in the WHERE clause, and in your case that would be senseless anyway, because WHERE filters rows before they are grouped, while GROUP_CONCAT() collects rows that have already been grouped.
You may do that, for example, with a subquery:
SELECT *
FROM
    (SELECT
         tA.a1 AS ta1, GROUP_CONCAT(tB.b2) AS b2
     FROM
         tableA tA
     LEFT JOIN
         tableB tB ON tA.a2 = tB.b1
     GROUP BY tA.a1) AS grouped
WHERE
    CONCAT(ta1, grouped.b2) LIKE '%somestring%'
For filtering on aggregate functions, use HAVING instead of WHERE:
select a, group_concat(b) as b_aggregate from
tbl
where concat(a,b) like "%somestring%" -- not aggregate
group by a
having concat(a, group_concat(b)) like "%somestring%" -- aggregate
I have a query like this:
SELECT q,COUNT(x),y,
(SELECT i FROM (SELECT q,w FROM tableA WHERE conds)
JOIN tableC ON (cond)
WHERE id = t.q)
FROM (SELECT q,w FROM tableA WHERE conds) t
JOIN tableB
GROUP BY q
The subquery (SELECT q,w FROM tableA WHERE conds) returns several hundred rows. After the GROUP BY q there are around 20 rows left.
The subquery (SELECT i FROM (SELECT q,w FROM tableA WHERE conds) JOIN tableC WHERE id = t.q) uses exactly the same subquery inside it as the one above, but then selects only a fraction of the results based on which q value is currently being grouped.
My problem seems to be this: the performance is too slow because I can't seem to put the WHERE id = t.q inside the (SELECT q,w FROM tableA WHERE conds) subquery. I can only guess that the subquery is run for every unique value of q, producing hundreds of rows each time, and the WHERE clause then has to be applied to an un-indexed temporary table. I think I need to apply the WHERE before the full join.
Any ideas please?
This query could produce the same results, but so much information is missing from the question that no one can be sure:
Select
    q,
    count(x),
    y,
    i
From
    tableA a
        inner join
    tableC c
        on cond and c.id = a.q
        cross join -- is this an inner join?
    tableB b
Where
    conds
Group By
    q,
    y,
    i