How to group in MySQL - mysql

I am trying to group rows in my tables in MySQL
I have the following database setup:
I would like to get the event with the highest id which has a specific user_id (such as '3') either as the organiser.user_id or the helper.user_id.
I am not sure how to go around doing this as I am fairly new to MySQL.
Any help would be appreciated.

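You can join either event_helpers or event_organisers to the events table, filter on the specific user_id, order by event id descending, and take the top row. The query below covers helpers only; the same query against event_organisers covers organisers.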
select
events.* from events
inner join event_helpers on (event_helpers.event_id=events.id)
where event_helpers.user_id=3
order by events.id desc limit 1

The following should do the trick. As there's no example data I cannot vouch for it entirely, and I haven't actually run the code against a database, so it could contain minor errors.
In short, I'm doing the following:
select all from the events table.
join the organisers and helpers tables on event_id
limit results to rows where either joined user_id is 3
order rows descending (highest first)
limit the result to retrieve only one
SELECT *
FROM events
LEFT JOIN event_organisers ON event_organisers.event_id = events.id
LEFT JOIN event_helpers ON event_helpers.event_id = events.id
WHERE event_organisers.user_id = 3
OR event_helpers.user_id = 3
ORDER BY events.id DESC
LIMIT 1;
This should get you the result you're looking for. Or at least it should get you going.

Such lookups are usually done with IN or EXISTS.
select *
from events
where id in (select event_id from event_helpers where user_id = 3)
or id in (select event_id from event_organisers where user_id = 3)
order by id desc
limit 1;
IN or EXISTS often perform better than joins (though I don't expect this to be the case with the given task), but queries with an OR condition are often rather slow, because indexes might not be used as would be desired. Another option that may or may not perform better:
select *
from
(
select *
from events
where id = (select max(event_id) from event_helpers where user_id = 3)
union all
select *
from events
where id = (select max(event_id) from event_organisers where user_id = 3)
) newest_two
order by id desc
limit 1;
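As a rough check of the IN-based lookup above, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for MySQL. The schema is an assumption (the original diagram is not available), so the table layout, the name column and the sample rows are all hypothetical.

```python
import sqlite3

# Hypothetical schema: guessed from the queries in the answers,
# because the original database diagram is not available.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE events (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE event_organisers (event_id INTEGER, user_id INTEGER);
    CREATE TABLE event_helpers (event_id INTEGER, user_id INTEGER);

    INSERT INTO events VALUES (1, 'a'), (2, 'b'), (3, 'c'), (4, 'd');
    INSERT INTO event_organisers VALUES (1, 3), (4, 7);
    INSERT INTO event_helpers VALUES (2, 3), (3, 3), (4, 8);
""")

def latest_event_for(user_id):
    """Return the highest-id event where user_id is organiser or helper."""
    return conn.execute(
        """
        SELECT id, name
        FROM events
        WHERE id IN (SELECT event_id FROM event_helpers WHERE user_id = ?)
           OR id IN (SELECT event_id FROM event_organisers WHERE user_id = ?)
        ORDER BY id DESC
        LIMIT 1
        """,
        (user_id, user_id),
    ).fetchone()

latest = latest_event_for(3)
print(latest)  # (3, 'c'): user 3 organises event 1 and helps on events 2 and 3
```

A user with no matching rows in either table gets None back, which is the case the join-based answers also have to handle.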

Related

Mysql multiple select statements in single query with if condition

I have a table users with id, name, type, active, ...
and another table orders with orderid, userid, ...
I want to update the orders table like this:
UPDATE orders SET userid=(SELECT id FROM users WHERE type="some" and active=1)
My problem is that if
SELECT id FROM users WHERE type="some" and active=1
doesn't return any result, I want to use
SELECT id FROM users WHERE type="some" limit 0,1
i.e. the first result.
I could do this easily in any language like PHP or Python, but I only have access to the MySQL server, so I cannot do that.
How can I do this in pure SQL in a single query?
I tried an IF statement but it is not working.
Here is one method using ORDER BY:
UPDATE orders o
SET userid = (SELECT u.id
FROM users u
WHERE u.type = 'some'
ORDER BY active DESC
LIMIT 1
);
This assumes that active only takes on the values 0 and 1. If there are other values, use ORDER BY (active = 1) DESC.
Performance should be fine with an index on users(type, active, id).
Another method uses aggregation and COALESCE():
UPDATE orders o
SET userid = (SELECT COALESCE(MAX(CASE WHEN active = 1 THEN u.id END),
MAX(u.id)
)
FROM users u
WHERE u.type = 'some'
);
I would expect the ORDER BY to be a wee bit faster, but sometimes MySQL surprises me with aggregations in subqueries. That said, if you have very few rows for a given type, the performance difference may not be noticeable.
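The fallback behaviour of the ORDER BY variant can be sketched with Python's sqlite3 module standing in for MySQL. The helper assign_user and the sample rows are hypothetical, and the extra u.id tiebreaker is an addition of this sketch (it only makes the chosen row deterministic).

```python
import sqlite3

def assign_user(users):
    """Run the ORDER BY variant of the UPDATE against the given users rows.

    users is a list of (id, type, active) tuples; returns orders.userid values.
    """
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE users (id INTEGER PRIMARY KEY, type TEXT, active INTEGER);
        CREATE TABLE orders (orderid INTEGER PRIMARY KEY, userid INTEGER);
        INSERT INTO orders VALUES (1, NULL), (2, NULL);
    """)
    conn.executemany("INSERT INTO users VALUES (?, ?, ?)", users)
    # Active users sort first; if there are none, the first inactive
    # user of the same type is picked instead.
    conn.execute("""
        UPDATE orders
        SET userid = (SELECT u.id
                      FROM users u
                      WHERE u.type = 'some'
                      ORDER BY (u.active = 1) DESC, u.id
                      LIMIT 1)
    """)
    rows = conn.execute("SELECT userid FROM orders ORDER BY orderid")
    return [row[0] for row in rows]

with_active = assign_user([(1, "some", 0), (2, "some", 1), (3, "other", 1)])
without_active = assign_user([(1, "some", 0), (2, "some", 0), (3, "other", 1)])
print(with_active)     # [2, 2]
print(without_active)  # [1, 1]
```

If no user of the given type exists at all, the subquery yields NULL and every order's userid is set to NULL, so a real UPDATE may want a guard for that case.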

MySQL: combination of LEFT JOIN and ORDER BY is slow

There are two tables: posts (~5,000,000 rows) and relations (~8,000 rows).
posts columns:
-------------------------------------------------
| id | source_id | content | date (int) |
-------------------------------------------------
relations columns:
---------------------------
| source_id | user_id |
---------------------------
I wrote a MySQL query for getting 10 most recent rows from posts which are related to a specific user:
SELECT p.id, p.content
FROM posts AS p
LEFT JOIN relations AS r
ON r.source_id = p.source_id
WHERE r.user_id = 1
ORDER BY p.date DESC
LIMIT 10
However, it takes ~30 seconds to execute it.
I already have indexes at relations for (source_id, user_id), (user_id) and for (source_id), (date), (date, source_id) at posts.
EXPLAIN results:
How can I optimize the query?
Your WHERE clause renders your outer join a mere inner join (because in an outer-joined pseudo record user_id will always be null, never 1).
If you really want this to be an outer join, then the join is completely superfluous: every record in posts is returned whether or not it has a match in relations. Your query would then be
select id, content
from posts
order by `date` desc limit 10;
If you don't really want an outer join but require a match in relations, then this is a test for existence in a table, which calls for an EXISTS or IN clause:
select id, content
from posts
where source_id in
(
select source_id
from relations
where user_id = 1
)
order by `date` desc
limit 10;
There should be an index on relations(user_id, source_id) - in this order, so we can select user_id 1 first and get an array of all desired source_id which we then look up.
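Of course you also need an index on posts(source_id), which you probably have already, as source_id is an ID. You can even speed things up with a composite covering index such as posts(source_id, date, id, content), so the table itself doesn't have to be read anymore because all the information needed is in the index already. This only works if content is a short column that can be indexed in full; MySQL can index a TEXT column only by prefix, and a prefix index cannot cover the query.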
UPDATE: Here is the related EXISTS query:
select id, content
from posts p
where exists
(
select *
from relations r
where r.user_id = 1
and r.source_id = p.source_id
)
order by `date` desc
limit 10;
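The point about the WHERE clause turning the outer join into an inner join can be checked on a tiny data set. This is a sketch using Python's sqlite3 module as a stand-in for MySQL; the sample rows are made up, and relations is assumed to hold each (source_id, user_id) pair at most once (otherwise the join forms would return duplicates that EXISTS does not).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE posts (id INTEGER PRIMARY KEY, source_id INTEGER,
                        content TEXT, date INTEGER);
    CREATE TABLE relations (source_id INTEGER, user_id INTEGER);

    INSERT INTO posts VALUES
        (1, 10, 'p1', 100), (2, 20, 'p2', 200),
        (3, 10, 'p3', 300), (4, 30, 'p4', 400);
    INSERT INTO relations VALUES (10, 1), (20, 2), (30, 2);
""")

# Original form: the filter on r.user_id discards every NULL-extended row.
left_join = conn.execute("""
    SELECT p.id FROM posts AS p
    LEFT JOIN relations AS r ON r.source_id = p.source_id
    WHERE r.user_id = 1
    ORDER BY p.date DESC LIMIT 10
""").fetchall()

# Plain inner join with the same filter.
inner_join = conn.execute("""
    SELECT p.id FROM posts AS p
    JOIN relations AS r ON r.source_id = p.source_id
    WHERE r.user_id = 1
    ORDER BY p.date DESC LIMIT 10
""").fetchall()

# Existence test, as in the EXISTS query above.
exists_form = conn.execute("""
    SELECT p.id FROM posts AS p
    WHERE EXISTS (SELECT 1 FROM relations AS r
                  WHERE r.user_id = 1 AND r.source_id = p.source_id)
    ORDER BY p.date DESC LIMIT 10
""").fetchall()

print(left_join)  # [(3,), (1,)]
```

All three forms return the same rows here, so the choice between them is about how the optimizer executes them, not about the result.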
You could put an index on the date column of the posts table, I believe that will help the order-by speed.
You could also try reducing the number of results before ordering with some additional WHERE conditions. For example, if you know that there will likely be ten records with the correct user_id today, you could limit the date to just today (or N days back, depending on your actual data).
Try This
SELECT p.id, p.content FROM posts AS p
WHERE p.source_id IN (SELECT source_id FROM relations WHERE user_id = 1)
ORDER BY p.date DESC
LIMIT 10
I'd consider the following :-
Firstly, you only want the 10 most recent rows from posts which are related to a user. So, an INNER JOIN should do just fine.
SELECT p.id, p.content
FROM posts AS p
JOIN relations AS r
ON r.source_id = p.source_id
WHERE r.user_id = 1
ORDER BY p.date DESC
LIMIT 10
The LEFT JOIN is needed only if you want to fetch the records which do not have a relations mapping. A LEFT JOIN can result in a full table scan of the left table, which as per your info contains ~5,000,000 rows. This could be the root cause of your slow query.
You can also move the WHERE condition into the ON clause; for an inner join the result is the same.
SELECT p.id, p.content
FROM posts AS p
JOIN relations AS r
ON (r.source_id = p.source_id AND r.user_id = 1)
ORDER BY p.date DESC
LIMIT 10
I would try with a composite index on relations :
INDEX source_user (user_id,source_id)
and change the query to this :
SELECT p.id, p.content
FROM posts AS p
INNER JOIN relations AS r
ON ( r.user_id = 1 AND r.source_id = p.source_id )
ORDER BY p.date DESC
LIMIT 10

Joining on "greater than" returning more than one row for left table

I have a query.
SELECT * FROM users LEFT JOIN ranks ON ranks.minPosts <= users.postCount
This returns a row for every rank that matches. By using GROUP BY users.id I get one row per user id.
However, when the rows are grouped I only get the first row. I would instead like the row with the highest value of ranks.minPosts.
Is there a way to do this? Also, would it be faster (use fewer resources) to just use two different queries?
Assuming there is only one column in ranks that you want, you can do this using a correlated subquery:
SELECT u.*,
(select r.minPosts
from ranks r
where r.minPosts <= u.PostCount
order by minPosts desc
limit 1
) as minPosts
FROM users u;
If you need the entire row from ranks, then join it back in:
SELECT ur.*, r.*
FROM (SELECT u.*,
(select r.minPosts
from ranks r
where r.minPosts <= u.PostCount
order by minPosts desc
limit 1
) as minPosts
FROM users u
) ur join
ranks r
on ur.minPosts = r.minPosts;
(The * is for convenience; you should list out the columns you want.)
Because you're using mysql, this will work:
SELECT * FROM (
SELECT *, users.id user_id
FROM users
LEFT JOIN ranks ON ranks.minPosts <= users.postCount
ORDER BY ranks.minPosts DESC
) x
GROUP BY user_id
With ONLY_FULL_GROUP_BY disabled, MySQL in practice returns the first row encountered for each unique group, so if you first order the data, then use the non-standard grouping behaviour, you'll get the row you want.
Disclaimer:
Although this has worked in practice on older versions, the MySQL documentation says not to rely on it: the values chosen for non-aggregated columns are indeterminate. Since MySQL 5.7.5, ONLY_FULL_GROUP_BY is enabled by default and this query is rejected unless that mode is turned off, so later releases of MySQL may not continue to behave in this way.
What we'd really like to do would be to order the rows by ranks.minPosts before the group by. Unfortunately MySQL doesn't support that without using a subquery of some form.
If the ranks are already ordered by their ids then you can extract the id by selecting MAX(ranks.id), and if they're not, you can still get the highest ranks.minPosts by selecting MAX(ranks.minPosts). However, it would be nice to be able to get the entire record. I guess you're left with the subquery solution, which is as follows:
SELECT <fields> FROM users LEFT JOIN
(SELECT * FROM ranks ORDER BY minPosts DESC) as r
ON r.minPosts <= users.postCount GROUP BY users.id
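The correlated-subquery approach from the first answer does not depend on the non-standard GROUP BY behaviour. A minimal sketch follows, using Python's sqlite3 module as a stand-in for MySQL; the ranks.title column and the sample rows are hypothetical, and minPosts is assumed to be unique within ranks.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, postCount INTEGER);
    CREATE TABLE ranks (id INTEGER PRIMARY KEY, title TEXT, minPosts INTEGER);

    INSERT INTO users VALUES (1, 5), (2, 10), (3, 250);
    INSERT INTO ranks VALUES (1, 'newbie', 0), (2, 'member', 10),
                             (3, 'veteran', 100);
""")

# For each user, find the highest minPosts not exceeding postCount,
# then join back to ranks to fetch the whole rank row.
user_ranks = conn.execute("""
    SELECT ur.id, ur.postCount, r.id, r.title
    FROM (SELECT u.*,
                 (SELECT r.minPosts
                  FROM ranks r
                  WHERE r.minPosts <= u.postCount
                  ORDER BY r.minPosts DESC
                  LIMIT 1) AS minPosts
          FROM users u) ur
    JOIN ranks r ON ur.minPosts = r.minPosts
    ORDER BY ur.id
""").fetchall()

for row in user_ranks:
    print(row)
```

Because the outer join back to ranks is an inner join, a user whose postCount is below every minPosts would be dropped; a LEFT JOIN there would keep such users with NULL rank columns.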

How to sort groups in MySQL join operator?

In my sql I have this query
SELECT * FROM threads t
JOIN (
SELECT c.*
FROM comments c
WHERE c.thread_id = t.id
ORDER BY date_sent
ASC LIMIT 1
) d ON t.id = d.thread_id
ORDER By d.date_sent DESC
Basically I have two tables, threads and comments. Comments have a foreign key to the threads table. I want to get the earliest comment row for each thread row. Threads should have at least 1 comment; if a thread has none, it shouldn't be included.
In my query above, I do a select on threads, and then I join it with a subquery. I want to use t.id, where t is the table selected outside the brackets. Inside the brackets I create a new result set containing the comments for the current thread. I do the sorting and limiting there.
Then afterwards, I sort it again, so it's earliest on top. However, when I run this, it gives error #1054 - Unknown column 't.id' in 'where clause'.
Does anyone know what's wrong here?
Thanks
The unknown column t.id is due to the fact that the alias t is unknown inside the subquery, but indeed it isn't needed anyway since you join it in the ON clause.
Instead of a LIMIT 1, use a MIN(date_sent) aggregate grouped by thread_id in the subquery. Be careful also using SELECT * in a join query, if columns in both tables have the same names; better to list the columns explicitly.
SELECT
/* List the columns you explicitly need here rather than *
if there is any name overlap (like `id` for example) */
t.*,
c.*
FROM
threads t
/* join threads against the subquery returning only thread_id and earliest date_sent */
INNER JOIN (
SELECT thread_id, MIN(date_sent) AS firstdate
FROM comments
GROUP BY thread_id
) earliest ON t.id = earliest.thread_id
/* then join the subquery back against the full comments table to get the other columns
in that table. The join is done on both thread_id and the date_sent timestamp */
INNER JOIN comments c
ON earliest.thread_id = c.thread_id
AND earliest.firstdate = c.date_sent
ORDER BY c.date_sent DESC
Michael's answer is correct. This is another answer that follows more the form of your query. You can do what you want as a correlated subquery and then join in the additional information:
SELECT *
FROM (SELECT t.*,
(SELECT c.id
FROM comments c
WHERE c.thread_id = t.id
ORDER BY c.date_sent ASC
LIMIT 1
) as earliestcommentid
FROM threads t
) t JOIN
comments c
on t.earliestcommentid = c.id
ORDER By c.date_sent DESC;
It is possible that this has better performance, because it does not require aggregating all the data. However, for performance, you would want an index on comments(thread_id, date_sent, id).
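The MIN(date_sent) approach from the first answer can be tried on a small data set. This is a sketch with Python's sqlite3 module standing in for MySQL; the title and body columns and all sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE threads (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE comments (id INTEGER PRIMARY KEY, thread_id INTEGER,
                           body TEXT, date_sent INTEGER);

    INSERT INTO threads VALUES (1, 't1'), (2, 't2'), (3, 't3');
    INSERT INTO comments VALUES
        (1, 1, 'first on t1', 10), (2, 1, 'later on t1', 20),
        (3, 2, 'first on t2', 15), (4, 2, 'later on t2', 30);
""")

# Step 1: earliest date_sent per thread. Step 2: join back to comments
# on (thread_id, date_sent) to recover the full comment row.
earliest_comments = conn.execute("""
    SELECT t.id, c.id, c.body, c.date_sent
    FROM threads t
    INNER JOIN (SELECT thread_id, MIN(date_sent) AS firstdate
                FROM comments
                GROUP BY thread_id) earliest
            ON t.id = earliest.thread_id
    INNER JOIN comments c
            ON c.thread_id = earliest.thread_id
           AND c.date_sent = earliest.firstdate
    ORDER BY c.date_sent DESC
""").fetchall()

for row in earliest_comments:
    print(row)
```

Thread 3 has no comments and is left out by the inner joins. If two comments in one thread share the same earliest date_sent, this form returns both, whereas the LIMIT 1 variant picks just one.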

Using UNION, JOIN and ORDER by to Merge 2 identical tables

I need to combine 2 identical tables (posts and posts2) to display a single list sorted by id.
Previously we worked with only 1 table, but we now use a second table (posts2) to store the new data from a certain id onwards.
This is the query I used when I worked with 1 table (posts), and it works fine.
select posts.id_usu,posts.id_cat,posts.titulo,posts.html,posts.slug,posts.fecha,hits.id,hits.hits,usuarios.id,usuarios.usuario,posts.id
From posts
Join hits On posts.id = hits.id
Join usuarios On posts.id_usu = usuarios.id
where posts.id_cat='".$catid."' order by posts.id desc
Now I tried to apply this query to a UNION of the 2 tables, but I don't know at what point to place the JOINs. I tried several ways, but each gives a MySQL error. The following query merges the 2 tables and orders by id, but still needs the JOINs added.
select * from (
SELECT posts.id,posts.id_usu,posts.id_cat,posts.titulo,posts.html,posts.slug,posts.fecha
FROM posts where id_cat='6' ORDER BY id
)X
UNION ALL
SELECT posts2.id,posts2.id_usu,posts2.id_cat,posts2.titulo,posts2.html,posts2.slug,posts2.fecha FROM posts2 where id_cat='4' ORDER BY id DESC limit 20
I need to add this to the above query:
Join hits On posts.id = hits.id
Join usuarios On posts.id_usu = usuarios.id
Thanks in advance guys.
If you want the same query as your first one, but this time over the union of posts and your identical table posts2, you can do it like this:
select
p.id_usu,p.id_cat,p.titulo,p.html,p.slug,p.fecha
,hits.id,hits.hits,usuarios.id,usuarios.usuario
from (
(select
id_usu,id_cat,titulo,html,slug,fecha ,id
From posts
where id_cat='".$catid."' order by id desc limit 20)
UNION ALL
(select
id_usu,id_cat,titulo,html,slug,fecha ,id
From posts2
where id_cat='".$catid."' order by id desc limit 20)
) p
Join hits On p.id = hits.id
Join usuarios On p.id_usu = usuarios.id
order by p.id desc limit 20
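The shape of this answer (UNION ALL inside a derived table, joins outside) can be sketched with Python's sqlite3 module as a stand-in for MySQL. The sample rows are made up and only a few of the columns are kept. Two deliberate differences from the query above: the category id is passed as a bound parameter rather than interpolated into the string, and the per-branch ORDER BY ... LIMIT is dropped because SQLite does not accept it on the members of a compound SELECT.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE posts  (id INTEGER PRIMARY KEY, id_usu INTEGER,
                         id_cat INTEGER, titulo TEXT);
    CREATE TABLE posts2 (id INTEGER PRIMARY KEY, id_usu INTEGER,
                         id_cat INTEGER, titulo TEXT);
    CREATE TABLE hits (id INTEGER PRIMARY KEY, hits INTEGER);
    CREATE TABLE usuarios (id INTEGER PRIMARY KEY, usuario TEXT);

    INSERT INTO posts  VALUES (1, 1, 6, 'old a'), (2, 2, 6, 'old b'),
                              (3, 1, 4, 'other cat');
    INSERT INTO posts2 VALUES (4, 2, 6, 'new a'), (5, 1, 6, 'new b');
    INSERT INTO hits VALUES (1, 10), (2, 20), (3, 30), (4, 40), (5, 50);
    INSERT INTO usuarios VALUES (1, 'ana'), (2, 'luis');
""")

def posts_in_category(cat_id):
    """Merge posts and posts2 for one category, then attach hits and user."""
    return conn.execute(
        """
        SELECT p.id, p.titulo, hits.hits, usuarios.usuario
        FROM (
            SELECT id, id_usu, id_cat, titulo FROM posts  WHERE id_cat = ?
            UNION ALL
            SELECT id, id_usu, id_cat, titulo FROM posts2 WHERE id_cat = ?
        ) p
        JOIN hits ON p.id = hits.id
        JOIN usuarios ON p.id_usu = usuarios.id
        ORDER BY p.id DESC
        LIMIT 20
        """,
        (cat_id, cat_id),
    ).fetchall()

merged = posts_in_category(6)
for row in merged:
    print(row)
```

The joins are written once against the merged alias p, which is the part the question was missing.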