Slow SQL query with LEFT JOIN


I have already read similar questions, but they did not help me.
I have this query:
SELECT `login`,
`photo`,
`username`,
`user`.`id`,
`name`,
`msg_info`
FROM `user`
LEFT JOIN `friends`
ON `friends`.`child` = `user`.`fb_id`
WHERE `friends`.`parent` = '1111'
ORDER BY `msg_info` DESC
which takes 0.7411 seconds (sometimes more).
It returns 158 rows in total (I can limit that, but the query is still slow).
The friends and user tables each have more than 200,000 rows.
What can I do to make the query faster?
Thank you!

As the comments pointed out, your left join is really not different than the following inner join query:
SELECT
login,
photo,
username,
u.id,
name,
msg_info
FROM user u
INNER JOIN friends f
ON f.child = u.fb_id
WHERE
f.parent = '1111'
ORDER BY
msg_info DESC;
We can try adding an index to the friends table on (parent, child, name, msg_info, ...). I am not sure which other columns belong to friends, but the basic idea is to create an index on parent, to speed up the WHERE clause, and hopefully take advantage of low cardinality on the parent column. Then, we include the child column to speed up the join. We also include all the other columns in the select clause to let the index cover the other columns we need.
CREATE INDEX idx ON friends (parent, child, name, msg_info, ...);
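The equivalence of the two joins can be checked on a toy data set. The following is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in for MySQL; the sample rows, the index name idx_friends_parent_child, and the assumption that msg_info lives in friends are all invented for illustration and are not taken from the question.

```python
import sqlite3

# SQLite stands in for MySQL here; all rows below are invented sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE user (id INTEGER PRIMARY KEY, fb_id TEXT, login TEXT);
    -- Assumption: msg_info is a column of friends (the question does not say).
    CREATE TABLE friends (parent TEXT, child TEXT, msg_info INTEGER);
    -- Composite index: parent serves the WHERE filter, child serves the join.
    CREATE INDEX idx_friends_parent_child ON friends (parent, child);
    INSERT INTO user (fb_id, login) VALUES ('a', 'alice'), ('b', 'bob'), ('c', 'carol');
    INSERT INTO friends VALUES ('1111', 'a', 5), ('1111', 'b', 9), ('2222', 'c', 1);
""")

# LEFT JOIN whose WHERE clause filters on the right-hand table.
left_join_rows = conn.execute("""
    SELECT u.login, f.msg_info
    FROM user u
    LEFT JOIN friends f ON f.child = u.fb_id
    WHERE f.parent = '1111'
    ORDER BY f.msg_info DESC
""").fetchall()

# The same query written as an INNER JOIN.
inner_join_rows = conn.execute("""
    SELECT u.login, f.msg_info
    FROM user u
    INNER JOIN friends f ON f.child = u.fb_id
    WHERE f.parent = '1111'
    ORDER BY f.msg_info DESC
""").fetchall()

print(left_join_rows)
print(inner_join_rows)
```

Both queries return the same rows because the WHERE condition on f.parent rejects the NULL-extended rows that the LEFT JOIN would otherwise keep. Whether MySQL actually uses the index should be confirmed with EXPLAIN on the real tables.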

As @MrVimes suggested, sometimes adding a condition to the JOIN clause can make a big difference:
SELECT login, photo, username, u.id, name, msg_info
FROM user u
INNER JOIN friends f ON f.child = u.fb_id AND f.parent = '1111'
ORDER BY msg_info DESC;
Assuming, of course, all your PK and FKs are properly defined and indexed.

Related

MySQL optimize a union-query by using a join-query instead

I have 3 tables: one for users, one for their incoming payments, and one for their outgoing payments. I want to display all incoming and outgoing payments in a single result set. I can do this with multiple selects and a union, but it seems cumbersome, and I suspect it's slow due to the subqueries, because the tables are extremely large (though I am using indexes). Is there a faster way to achieve this? Maybe using a full outer join?
Here is a simplified version of the schema with some example data:
create table users (
id int auto_increment,
name varchar(20),
primary key (id)
) engine=InnoDB;
insert into users (name) values ('bob'),('fred');
create table user_incoming_payments (
user_id int,
funds_in int
) engine=InnoDB;
insert into user_incoming_payments
values (1,100),(1,101),(1,102),(1,103),
(2,200),(2,201),(2,202),(2,203);
create table user_outgoing_payments (
user_id int,
funds_out int
) engine=InnoDB;
insert into user_outgoing_payments
values (1,100),(1,101),(2,200),(2,201);
And here is the ugly looking query which generates the result I want for user bob:
select * from (
(select u.name, i.funds_in, 0 as 'funds_out' from users u
inner join user_incoming_payments i on u.id = i.user_id)
union
(select u.name, 0 as 'funds_in', o.funds_out from users u
inner join user_outgoing_payments o on u.id = o.user_id)
) a where a.name = 'bob'
order by a.funds_in asc, a.funds_out asc;
And here is as close as I can get to doing the same thing with joins. It's not correct, though, because I want this result set to look the same as the previous one, and I wasn't sure how to use a full outer join:
select *
from users u
right join user_incoming_payments i on u.id = i.user_id
right join user_outgoing_payments o on u.id = o.user_id
where u.name = 'bob';

MySQL doesn't support FULL OUTER JOIN. Even if it did support it, I don't think you would want that, as it would introduce a semi-cartesian product... with each row from incoming_ matching every row in outgoing_, creating extra rows.
If there were four rows from incoming_ and six rows from outgoing_, the set produced by a join operation would contain 24 rows.
This really looks more like you want a set concatenation operation. That is, you have two separate sets that you want to concatenate together. That's not a JOIN operation. That's a UNION ALL set operation.
SELECT ... FROM ...
UNION ALL
SELECT ... FROM ...
It looks like you don't want to remove duplicates in this scenario: if there are multiple rows in incoming_ with the same value of funds_in, I don't think you want to drop any of them. In that case, use the UNION ALL set operator, which does not check for and remove duplicate rows.
The UNION operator does remove duplicate rows, which (again) I don't think you want.
The derived table isn't necessary.
And MySQL doesn't "push" the predicate from the outer query into the inline view. That means MySQL is going to materialize a derived table with all incoming and outgoing payments for all users, and then the outer query is going to look through that to find the rows. Until the most recent versions of MySQL, no indexes were created on derived tables.
See the answer from Strawberry for an example of a more efficient query.
With the small example set, indexes aren't going to make any difference. With a large set, however, you are going to want to add appropriate covering indexes.
Also, with queries like this, I tend to include a discriminator column that tells me which query returned a row.
(
SELECT 'i' AS src
, ...
FROM ...
)
UNION ALL
(
SELECT 'o' AS src
, ...
FROM ...
)
ORDER BY ...
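To make the difference between UNION and UNION ALL concrete, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for MySQL. It reuses the table names from the question, but the rows are hypothetical: bob is given two identical incoming payments of 100 so that duplicate removal becomes visible. The src column is the discriminator described above.

```python
import sqlite3

# SQLite stands in for MySQL; the payment rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE user_incoming_payments (user_id INT, funds_in INT);
    CREATE TABLE user_outgoing_payments (user_id INT, funds_out INT);
    INSERT INTO users (name) VALUES ('bob'), ('fred');
    -- bob receives 100 twice: two separate payments with equal amounts.
    INSERT INTO user_incoming_payments VALUES (1, 100), (1, 100), (1, 101), (2, 200);
    INSERT INTO user_outgoing_payments VALUES (1, 100), (2, 200);
""")

# The filter on u.name sits inside each branch, so no derived table is needed.
QUERY = """
    SELECT 'i' AS src, u.name AS name, i.funds_in AS funds_in, 0 AS funds_out
    FROM users u JOIN user_incoming_payments i ON u.id = i.user_id
    WHERE u.name = 'bob'
    {op}
    SELECT 'o' AS src, u.name AS name, 0 AS funds_in, o.funds_out AS funds_out
    FROM users u JOIN user_outgoing_payments o ON u.id = o.user_id
    WHERE u.name = 'bob'
    ORDER BY funds_in, funds_out
"""

union_rows = conn.execute(QUERY.format(op="UNION")).fetchall()
union_all_rows = conn.execute(QUERY.format(op="UNION ALL")).fetchall()

print(len(union_rows), union_rows)
print(len(union_all_rows), union_all_rows)
```

UNION collapses the two identical incoming rows into one, so a real payment silently disappears from the result; UNION ALL keeps both.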

With this model, I'd probably write that query as follows, but I doubt it makes much difference...
select u.name
, i.funds_in
, 0 funds_out
from users u
join user_incoming_payments i
on u.id = i.user_id
where u.name = 'bob'
union all
select u.name
, 0 funds_in
, o.funds_out
from users u
join user_outgoing_payments o
on u.id = o.user_id
where u.name = 'bob'
order
by funds_in asc
, funds_out asc;
However, note that there's no PK here, which may prove problematic.
If it were me, I'd have one table for transactions, which would include a transaction_id PK, a timestamp for each transaction, and a column to record whether a value was a credit or a debit.
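A single-table design along those lines might look like the sketch below, written against Python's built-in sqlite3 module rather than MySQL. Apart from transaction_id, every column name here (created_at, direction, amount) is a hypothetical choice made for illustration.

```python
import sqlite3

# SQLite stands in for MySQL; column names other than transaction_id are assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE transactions (
        transaction_id INTEGER PRIMARY KEY,
        user_id INTEGER NOT NULL REFERENCES users (id),
        created_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,
        -- One column records whether the row is a credit or a debit.
        direction TEXT NOT NULL CHECK (direction IN ('credit', 'debit')),
        amount INTEGER NOT NULL
    );
    INSERT INTO users (name) VALUES ('bob'), ('fred');
    INSERT INTO transactions (user_id, direction, amount) VALUES
        (1, 'credit', 100), (1, 'credit', 101), (1, 'debit', 100), (2, 'credit', 200);
""")

# All of bob's payments come from one table, so no UNION is required.
bob_rows = conn.execute("""
    SELECT t.transaction_id, t.direction, t.amount
    FROM transactions t
    JOIN users u ON u.id = t.user_id
    WHERE u.name = 'bob'
    ORDER BY t.transaction_id
""").fetchall()

print(bob_rows)
```

With this layout the incoming and outgoing payments come back from a single join, and the primary key gives every row an identity.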

Best way to structure SQL queries with many inner joins?

I have an SQL query that needs to perform multiple inner joins, as follows:
SELECT DISTINCT adv.Email, adv.Credit, c.credit_id AS creditId, c.creditName AS creditName, a.Ad_id AS adId, a.adName
FROM placementlist pl
INNER JOIN
(SELECT Ad_id, List_id FROM placements) AS p
ON pl.List_id = p.List_id
INNER JOIN
(SELECT Ad_id, Name AS adName, credit_id FROM ad) AS a
ON ...
(few more inner joins)
My question is the following: How can I optimize this query? I was under the impression that, even though the way I currently query the database creates small temporary tables (the inner SELECT statements), it would still be preferable to performing an inner join on the unaltered tables, as they could have about 10,000 - 100,000 entries (not millions). However, I was told that this is not the best way to go about it, and I did not have the opportunity to ask what the recommended approach would be.
What would be the best approach here?

Using derived tables such as
INNER JOIN (SELECT Ad_id, List_id FROM placements) AS p
is not recommended. Let the dbms find out by itself what values it needs from
INNER JOIN placements AS p
instead of telling it (again) by more or less forcing it to create a view on the table with just the two columns. (And using the plain table name is much more readable.)
With SQL you mainly say what you want to see, not how this is going to be achieved. (Well, of course this is just a rule of thumb.) So if no other columns except Ad_id and List_id are used from table placements, the dbms will find its best way to handle this. Don't try to make it use your way.
The same is true of the IN clause, by the way, where you often see WHERE col IN (SELECT DISTINCT colx FROM ...) instead of simply WHERE col IN (SELECT colx FROM ...). This does exactly the same, but with DISTINCT you tell the dbms "make your subquery's rows distinct before looking for col". But why would you want to force it to do so? Why not have it use just the method the dbms finds most appropriate?
Back to derived tables: Use them when they really do something, especially aggregations, or when they make your query more readable.
Moreover,
SELECT DISTINCT adv.Email, adv.Credit, ...
doesn't look too good either. Yes, sometimes you need SELECT DISTINCT, but usually you wouldn't. Most often it is just a sign that you haven't thought your query through.
An example: you want to select clients that bought product X. In SQL you would say: where a purchase of X EXISTS for the client. Or: where the client is IN the set of the X purchasers.
select * from clients c where exists
(select * from purchases p where p.clientid = c.clientid and product = 'X');
Or
select * from clients where clientid in
(select clientid from purchases where product = 'X');
You don't say: Give me all combinations of clients and X purchases and then boil that down so I just get each client once.
select distinct c.*
from clients c
join purchases p on p.clientid = c.clientid and product = 'X';
Yes, it is very easy to just join all tables needed and then just list the columns to select and then just put DISTINCT in front. But it makes the query kind of blurry, because you don't write the query as you would word the task. And it can make things difficult when it comes to aggregations. The following query is wrong, because you multiply money earned with the number of money-spent records and vice versa.
select
sum(money_spent.value),
sum(money_earned.value)
from user
join money_spent on money_spent.userid = user.userid
join money_earned on money_earned.userid = user.userid;
And the following may look correct, but is still incorrect (it only works when the values happen to be unique):
select
sum(distinct money_spent.value),
sum(distinct money_earned.value)
from user
join money_spent on money_spent.userid = user.userid
join money_earned on money_earned.userid = user.userid;
Again: You would not say: "I want to combine each purchase with each earning and then ...". You would say: "I want the sum of money spent and the sum of money earned per user". So you are not dealing with single purchases or earnings, but with their sums. As in
select
(select sum(value) from money_spent where money_spent.userid = user.userid),
(select sum(value) from money_earned where money_earned.userid = user.userid)
from user;
Or:
select
spent.total,
earned.total
from user
join (select userid, sum(value) as total from money_spent group by userid) spent
on spent.userid = user.userid
join (select userid, sum(value) as total from money_earned group by userid) earned
on earned.userid = user.userid;
So you see, this is where derived tables come into play.
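The row multiplication described above is easy to reproduce. Here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for MySQL; the table names follow the answer, while the sample values (two purchases, three equal earnings) are invented so that both the plain SUM and the SUM(DISTINCT ...) variants go wrong.

```python
import sqlite3

# SQLite stands in for MySQL; the sample values are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE user (userid INTEGER PRIMARY KEY);
    CREATE TABLE money_spent (userid INT, value INT);
    CREATE TABLE money_earned (userid INT, value INT);
    INSERT INTO user VALUES (1);
    INSERT INTO money_spent VALUES (1, 10), (1, 20);          -- true total: 30
    INSERT INTO money_earned VALUES (1, 5), (1, 5), (1, 5);   -- true total: 15
""")

# Joining both detail tables pairs every spent row with every earned row.
JOINS = """
    FROM user
    JOIN money_spent ON money_spent.userid = user.userid
    JOIN money_earned ON money_earned.userid = user.userid
"""

naive = conn.execute(
    "SELECT SUM(money_spent.value), SUM(money_earned.value)" + JOINS
).fetchone()

with_distinct = conn.execute(
    "SELECT SUM(DISTINCT money_spent.value), SUM(DISTINCT money_earned.value)" + JOINS
).fetchone()

# Aggregate each table first, then join the per-user totals.
pre_aggregated = conn.execute("""
    SELECT spent.total, earned.total
    FROM user
    JOIN (SELECT userid, SUM(value) AS total FROM money_spent GROUP BY userid) spent
      ON spent.userid = user.userid
    JOIN (SELECT userid, SUM(value) AS total FROM money_earned GROUP BY userid) earned
      ON earned.userid = user.userid
""").fetchone()

print(naive, with_distinct, pre_aggregated)
```

The plain join inflates both sums, SUM(DISTINCT ...) happens to fix the spent total but collapses the three equal earnings into one, and only the pre-aggregated derived tables return the true totals.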

Joining two mysql tables where clause may not exist

OK, that title hardly describes the issue, but I'm stuck on how to solve this, so I'll just try to describe it.
I have two MySQL tables: listings and userVotes.
listings has the fields (id, description, votes).
userVotes has the fields (userId, listingsId).
I'm trying to join these tables so that the result is (id, description, votes, userId, listingsId, vote).
So the statement would normally be "select * join on id = listings id" (paraphrased).
But this led to problems, because I still need the listing to show even if there is no userVote for it.
So I changed it to "select * left join", and that allowed me to retrieve the listing even if there was no userVote associated with it.
But this led to another problem: I need the constraint "where userId = 'foo'" on the result, and this doesn't work because, again, it leaves out the listings with no userId.
So essentially I need the statement:
"Select * from listings l left join userVote u on l.id = u.listingId where if exists userId = 'foo'"
Is it doable?

You can move the condition into the ON clause of the left join so that it does not restrict the resulting rows:
SELECT *
FROM listings l
LEFT JOIN userVote u
ON l.id = u.listingId
AND u.userId = 'foo'
Any restriction you put in the WHERE clause removes rows from the result, while a restriction in the ON clause of a left join only sets the userVote columns to NULL when there is no match.
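The difference between the two placements can be seen on a small example. This is a minimal sketch using Python's built-in sqlite3 module as a stand-in for MySQL; the table and column names follow the query above, and the rows are invented.

```python
import sqlite3

# SQLite stands in for MySQL; the listings and votes are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE listings (id INTEGER PRIMARY KEY, description TEXT, votes INT);
    CREATE TABLE userVote (userId TEXT, listingId INT);
    INSERT INTO listings VALUES (1, 'first', 3), (2, 'second', 0), (3, 'third', 1);
    INSERT INTO userVote VALUES ('foo', 1), ('bar', 1), ('bar', 3);
""")

# Condition in the ON clause: every listing is kept, unmatched ones get NULL.
on_rows = conn.execute("""
    SELECT l.id, u.userId
    FROM listings l
    LEFT JOIN userVote u ON l.id = u.listingId AND u.userId = 'foo'
    ORDER BY l.id
""").fetchall()

# Condition in the WHERE clause: listings without a vote by 'foo' are dropped.
where_rows = conn.execute("""
    SELECT l.id, u.userId
    FROM listings l
    LEFT JOIN userVote u ON l.id = u.listingId
    WHERE u.userId = 'foo'
    ORDER BY l.id
""").fetchall()

print(on_rows)
print(where_rows)
```

With the condition in ON, listings 2 and 3 still appear with a NULL userId; with the condition in WHERE, only listing 1 survives.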

MySQL select rows that do not have matching column in other table

I can't seem to figure this out so far. I am trying to join two tables and only select the rows in table A that do not have a matching column in table B. For example, let's assume we have a users table and a sent table.
users table has the following columns: id, username
sent table has the following columns: id, username
I want to select all rows from users where username does not exist in sent table. So, if tom is in users and in sent he will not be selected. If he is in users but not in sent he will be selected. I tried this but it didn't work at all:
SELECT pooltest.name,senttest.sentname
FROM pooltest,senttest
WHERE pooltest.name != senttest.sentname

Typically, you would use NOT EXISTS for this type of query
SELECT p.Name
FROM pooltest p
WHERE NOT EXISTS (SELECT s.sentname
FROM senttest s
WHERE s.sentname = p.Name)
An alternative would be to use a LEFT OUTER JOIN and check for NULL
SELECT p.Name
FROM pooltest p
LEFT OUTER JOIN senttest s ON s.sentname = p.Name
WHERE s.sentname IS NULL
Note that the implicit join syntax you are using is considered obsolete and should be replaced with an explicit join.
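Both anti-join forms can be compared side by side. The following is a minimal sketch using Python's built-in sqlite3 module as a stand-in for MySQL; the table and column names (pooltest.name, senttest.sentname) come from the query in the question, and the rows are invented.

```python
import sqlite3

# SQLite stands in for MySQL; the names in the tables are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE pooltest (name TEXT);
    CREATE TABLE senttest (sentname TEXT);
    INSERT INTO pooltest VALUES ('tom'), ('ann'), ('joe');
    INSERT INTO senttest VALUES ('tom');
""")

# Anti-join with NOT EXISTS: keep pool rows that have no matching sent row.
not_exists_rows = conn.execute("""
    SELECT p.name
    FROM pooltest p
    WHERE NOT EXISTS (SELECT 1 FROM senttest s WHERE s.sentname = p.name)
    ORDER BY p.name
""").fetchall()

# Anti-join with LEFT JOIN: unmatched pool rows have NULL on the sent side.
left_join_rows = conn.execute("""
    SELECT p.name
    FROM pooltest p
    LEFT OUTER JOIN senttest s ON s.sentname = p.name
    WHERE s.sentname IS NULL
    ORDER BY p.name
""").fetchall()

print(not_exists_rows)
print(left_join_rows)
```

Both queries return ann and joe, the two names that were never sent.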

Try this SQL:
SELECT users.username
FROM users
LEFT JOIN sent ON sent.username = users.username
WHERE sent.username IS NULL;
If sent.id refers to the same user as users.id, the better way in my opinion would be:
SELECT users.username
FROM users
LEFT JOIN sent ON sent.id = users.id
WHERE sent.id IS NULL;
Both id fields would be indexed (as primary keys, I would have thought), so this query would be better optimised than the first one I suggested.
However, you may find that my first suggestion fits better; it depends on the requirements of your application.

Maybe this one can help you.
I had the same problem and solved it using this query:
INSERT INTO tbl1 (id, name) SELECT id, name FROM tbl2 WHERE name NOT IN (SELECT name FROM tbl1);
I hope this solves your problem.
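One caveat with NOT IN is worth checking before relying on it: if the subquery returns a NULL, the NOT IN comparison is never true and no rows come back. The sketch below shows this with Python's built-in sqlite3 module as a stand-in for MySQL; the rows are invented, and the NULL name in tbl1 is a deliberate assumption to trigger the behaviour.

```python
import sqlite3

# SQLite stands in for MySQL; the NULL name in tbl1 is deliberate.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tbl1 (id INT, name TEXT);
    CREATE TABLE tbl2 (id INT, name TEXT);
    INSERT INTO tbl1 VALUES (1, 'tom'), (2, NULL);
    INSERT INTO tbl2 VALUES (1, 'tom'), (3, 'ann');
""")

# 'ann' NOT IN ('tom', NULL) evaluates to NULL, so the row is filtered out.
not_in_rows = conn.execute("""
    SELECT id, name FROM tbl2
    WHERE name NOT IN (SELECT name FROM tbl1)
""").fetchall()

# NOT EXISTS compares row by row and is not affected by the NULL.
not_exists_rows = conn.execute("""
    SELECT t2.id, t2.name FROM tbl2 t2
    WHERE NOT EXISTS (SELECT 1 FROM tbl1 t1 WHERE t1.name = t2.name)
""").fetchall()

print(not_in_rows)
print(not_exists_rows)
```

If the compared column is declared NOT NULL, the two forms return the same rows; otherwise NOT EXISTS, or an extra IS NOT NULL filter inside the subquery, is the safer choice.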

Mysql group concat on double join

I have a user table from which I want all values, so I have this query:
SELECT tbl_user.* FROM tbl_user
Now I want one additional column in this result which shows all roles this user has, (or nothing if there are no roles for the user). The role information comes from two additional tables.
The first table contains these two values: userid, roleid
The second table contains roleid and role_name.
So the group concat needs to get all role names based on the roleid's in table1.
I have tried several different ways to do this, but I don't succeed. Either I get only one result with several times the same rolename, or no result at all.
Thanks for your help
Michael

Update: added LEFT JOIN for users with no role.
SELECT
tbl_user.*,
GROUP_CONCAT(role_name) AS roles
FROM
tbl_user LEFT JOIN tbl_roles ON tbl_user.userid = tbl_roles.userid
LEFT JOIN tbl_rolenames ON tbl_roles.roleid = tbl_rolenames.roleid
GROUP BY tbl_user.userid
Note that MySQL will permit a GROUP BY on fewer columns than appear in the SELECT list in total, but in other RDBMS you would need to explicitly list out the columns in tbl_user and include them in the GROUP BY, or do an additional self join against tbl_user to get the remaining columns from that table.
Something like:
SELECT
urole.userid,
uall.username,
uall.name,
uall.othercols,
urole.roles
FROM
tbl_user uall JOIN (
SELECT
tbl_user.userid,
GROUP_CONCAT(role_name) AS roles
FROM
tbl_user LEFT JOIN tbl_roles ON tbl_user.userid = tbl_roles.userid
LEFT JOIN tbl_rolenames ON tbl_roles.roleid = tbl_rolenames.roleid
GROUP BY tbl_user.userid
) urole ON uall.userid = urole.userid
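As a quick check of the grouped query, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for MySQL (SQLite also provides GROUP_CONCAT). The table names follow the answer, the username column and all rows are invented, and both joins are written as LEFT JOIN so that a user without roles is kept.

```python
import sqlite3

# SQLite stands in for MySQL; users and roles below are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tbl_user (userid INTEGER PRIMARY KEY, username TEXT);
    CREATE TABLE tbl_roles (userid INT, roleid INT);
    CREATE TABLE tbl_rolenames (roleid INTEGER PRIMARY KEY, role_name TEXT);
    INSERT INTO tbl_user VALUES (1, 'michael'), (2, 'guest');
    INSERT INTO tbl_rolenames VALUES (10, 'admin'), (20, 'editor');
    -- michael has two roles, guest has none.
    INSERT INTO tbl_roles VALUES (1, 10), (1, 20);
""")

# Both joins are LEFT JOINs; an inner join on tbl_rolenames would drop 'guest'.
rows = conn.execute("""
    SELECT u.userid, u.username, GROUP_CONCAT(n.role_name) AS roles
    FROM tbl_user u
    LEFT JOIN tbl_roles r ON u.userid = r.userid
    LEFT JOIN tbl_rolenames n ON r.roleid = n.roleid
    GROUP BY u.userid
    ORDER BY u.userid
""").fetchall()

for row in rows:
    print(row)
```

The user without roles comes back with NULL in the roles column. The order of names inside the concatenated string is not guaranteed; in MySQL it can be fixed with GROUP_CONCAT(role_name ORDER BY role_name).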
