Limit selected results by unique selected IDs when using left joins - mysql

I have a table users and some other tables like images and products
Table users:
user_id user_name
1 andrew
2 lutz
3 sophie
4 michael
5 peter
6 oscor
7 anton
8 billy
9 henry
10 jon
Tables images:
user_id img_type img_url
1 0 url1
1 1 url4
2 0 url5
7 0 url7
8 0 url8
9 1 url9
Table Products
user_id prod_id
1 5
1 55
2 555
8 5555
9 5
9 55
I use this kind of SELECT:
SELECT * FROM
(SELECT user.user_id,user.user_name, img.img_type, prod.prod_id FROM
users
LEFT JOIN images img ON img.user_id = users.user_id
LEFT JOIN products prod ON prod.user_id = users.user_id
WHERE user.user_id <= 5) AS users
ORDER BY user.user_id ASC
The result should be the following output. Due to performance improvements, I use ORDER BY and an inner select. If I put a LIMIT 5 within the inner or outer select, things won't work. MySQL will hard LIMIT the results to 5. However I need the LIMIT of 5 (pagination) found unique user_id results which would lead to 9 in this case.
Can I use maybe an if-statement to push an array with found user_id and break/finish up the select when the array consist of 5 UIDs? Or can I modify somehow the select?
user_id user_name img_type prod_id
1 andrew 0 5
1 andrew 1 5
1 andrew 0 55
1 andrew 1 55
2 lutz 0 5
2 lutz 0 55
3 sophie null null
4 michael null null
5 peter null null
results: 9

LIMIT 5 and user_id <= 5 do not necessarily give you the same results. One reason: There are multiple rows (after the JOINs) for user_id = 1. This is because there can be multiple images and/or multiple products for a given 'user'.
So, first decide which you want.
LIMIT without ORDER BY gives you an arbitrary set of rows. (Yeah, it is somewhat predictable, but you should not depend on it.)
ORDER BY + LIMIT usually implies gathering all the potentially relevant rows, sorting them, then doing the "limit". There are sometimes ways around this sluggishness.
LEFT leads to the NULLs you got; did you want that?
What do you want pagination to do if you are displaying 5 items per page, but user 1 has 6 images? You need to think about this edge case before we can help you with a solution. Maybe you want all of user 1 on a page, even if it exceeds 5? Maybe you want to break in the middle of '1'; but then we need an unambiguous way to know where to continue from for the next page.
Probably any viable solution will not use nested SELECTs. As you are finding out, it leads to "errors". Think of it this way: First find all the rows you need to display on all the pages, then carve out 5 for the current page.
Here are some more musings on pagination: http://mysql.rjweb.org/doc.php/pagination

Related

MySQL: Place rows that have parent_id filled right after the parent id

I have a table with 3 columns comment_id, comment and parent_comment_id. The query is always limited to 10 and I have to return within those 10 results all comment_id followed by rows that have parent_comment_id of previous comment_id.
Example: comment_id 2 has answer in comment_id 20 that has parent_comment_id set as 2. I need result to look like this:
comment_id ... parent_comment_id
2 0
20 2
3 0
4 0
5 0
15 5
...
Is there some nice way to order it? Here is sample data: http://sqlfiddle.com/#!9/70120c/5
If I understand your question correctly, I think this query will do what you want. I modified your fiddle slightly to add another comment response to comment 1 (new version here). The ordering is a little tricky, the first term orders the comment and its replies together, and then the second term orders the comments within that group (so that 1 and its replies 3 & 15 are sorted in that order). The DISTINCT is required to prevent multiple output rows for a comment which has multiple replies.
SELECT DISTINCT c.comment_id,c.comment,c.parent_comment_id
FROM comments c
LEFT JOIN comments r ON c.comment_id = r.parent_comment_id
ORDER BY IF(c.parent_comment_id=0, c.comment_id, c.parent_comment_id), c.comment_id
LIMIT 10;
Output:
comment_id comment parent_comment_id
1 The earth is flat 0
3 Response to 1 1
15 Another response to 1 1
2 One hundred angels can dance on the head of a pin 0
14 Response to 2 2
4 The earth is like a ball. 0
6 Response to 4 4
5 The earth is like a ball. 0
7 The earth is like a ball. 0
8 The earth is like a ball. 0

MySQL custom sort by subquery impacts performance

I have 1 table from which I return search results and display them in a a specific order. This example is an exact, simplified version of my db structure: http://www.java2s.com/Code/SQL/Select-Clause/Orderbyvaluefromsubquery.htm
and here is my current code, which works but heavily impacts performance to a large extend because of the subquery used:
SELECT * FROM `table` AS p1
WHERE CONCAT(title,artist,creator,version) LIKE '%searchInput%'
ORDER BY
(SELECT
MAX(`rating`) FROM `table` AS p2 WHERE p1.setId=p2.setId
) DESC
the above code searches and sorts the result sets by the highest rating in the set and that all rows from the same set are kept together, for example:
id setId rating title,artist,etc...
1 1 5
2 1 5
3 2 7
4 1 6
5 2 1
6 3 3
would sort to:
id setId rating title,artist,etc...
3 2 7
5 2 1
4 1 6
1 1 5
2 1 5
6 3 3
Currently it takes around 8.5sec to query 1000 rows and over half a minute for a large amount of rows, is there any way to improve the performance or would it be better to fetch all the results and sort them in PHP memory?
Help is much appreciated
You can probably speed things up a bit by separating the LIKEs:
SELECT p1.* FROM `table` AS p1
WHERE (title LIKE '%searchInput%')
OR (artist LIKE '%searchInput%')
OR (creator LIKE '%searchInput%')
OR (version LIKE '%searchInput%')
ORDER BY
(SELECT MAX(`rating`) FROM `table` AS p2 WHERE p1.setId=p2.setId) DESC
You could also try to
CREATE INDEX tbl_ndx ON table(setId, rating)
to improve sorting performances.

MYSQL Can WHERE IN default to ALL if no rows returned

Have a existing table of results like this;
race_id race_num racer_id place
1 0 32 2
1 1 32 3
1 2 32 1
1 3 32 6
1 0 44 2
1 1 44 2
1 2 44 2
1 3 44 2
etc...
Have lots of PHP scripts that access this table output the results in a nice format.
Now I have a case where I need to output the results for only certain race_nums.
So I have created this table races_included.
race_view race_id race_num
Day 1 1 0
Day 1 1 1
Day 2 1 2
Day 2 1 3
And can use this query to get the right results.
SELECT racer_id, place from results WHERE race_id=1
AND race_num IN
(SELECT race_num FROM races_included WHERE race_id='1' AND race_view='Day 1')
This is great but I only need this feature for a few races and to have it work in a compatible mode for the simple case show all races. I need to add alot of rows to the races_included table. Like
race_view race_id race_num
All 1 0
All 1 1
All 1 2
All 1 3
95% of my races don't use the daily feature.
So I am looking for a way to change the query so that if for race 1 there are no records in the races_included table it defaults to all races. In addition I need it to be close the same execution speed as the query without the IN clause, because this query Or variations of it are used a lot.
One way that does work is to redefine the table as races_excluded and use NOT IN. This works great but is a pain to manage the table when races are added or deleted.
Is there a simple way to use EXISTS and IN in tandem as a subquery to get the desired results? Or some other neat trick I am missing.
To clarify I have found a working but very slow solution.
SELECT * FROM race_results WHERE race_id=1
AND FIND_IN_SET(race_num, (SELECT IF((SELECT Count(*) FROM races_excluded
WHERE rid=1>0),(SELECT GROUP_CONCAT(rnum) FROM races_excluded
WHERE rid=1 AND race_view='Day 1' GROUP BY rid),race_num)))
It basically checks if any records exists for that race_id and if not return a set equal to the current race_num and if yes returns a list of included race nums.
You can do this by using or in the subquery:
SELECT racer_id, plac
from results
WHERE race_id = 1 AND
race_num IN (SELECT race_num
FROM races_included
WHERE race_id = '1' AND (race_view = 'Day 1' or raw_view = 'ANY')
);

Compare rows and get percentage

I found it hard to find a fitting title. For simplicity let's say I have the following table:
cook_id cook_rating
1 2
1 1
1 3
1 4
1 2
1 2
1 1
1 3
1 5
1 4
2 5
2 2
Now I would like to get an output of 'good' cooks. A good cook is someone who has a rating of at least 70% of 1, 2 or 3, but not 4 or 5.
So in my example table, the cook with id 1 has a total of 10 ratings, 7 of which have type 1, 2 and 3. Only three have type 4 or 5. Therefore the cook with id 1 would be a 'good' cook, and the output should be the cook's id with the number of good ratings.
cook_id cook_rating
1 7
The cook with id 2, however, doesn't satisfy my condition, therefore should not be listed at all.
select cook_id, count(cook_rating) - sum(case when cook_rating = 4 OR cook_rating = 5 then 1 else 0 end) as numberOfGoodRatings from cook
where cook_rating in (1,2,3,4,5)
group by cook_id
order by numberOfGoodRatings desc
However, this doesn't take into account the fact that there might be more 4 or 5 than good ratings, resulting in negative outputs. Plus, the requirement of at least 70% is not included.
You can get this with a comparison in your HAVING clause. If you must have just the two columns in the result set, this can be wrapped as a sub-select select cook_id, positive_ratings FROM (...)
SELECT
cook_id,
count(cook_rating < 4 OR cook_rating IS NULL) as positive_ratings,
count(*) as total_ratings
FROM cook
GROUP BY cook_id
HAVING (positive_ratings / total_ratings) >= 0.70
ORDER BY positive_ratings DESC
Edit Note that count(cook_rating < 4) is intended to only count rows where the rating is less than 4. The MySQL documentation says that count will only count non-null rows. I haven't tested this to see if it equates FALSE with NULL but I would be surprised it it doesn't. Worst case scenario we would need to wrap that in an IF(cook_rating < 4, 1,NULL).
I suggest you change a little your schema to make this kind of queries trivial.
Suppose you add 5 columns to your cook table, to simply count the number of each ratings :
nb_ratings_1 nb_ratings_2 nb_ratings_3 nb_ratings_4 nb_ratings_5
Updating such a table when a new rating is entered in DB is trivial, just as would be recomputing those numbers if having redundancy makes you nervous. And it makes all filterings and sortings fast and easy.

MySQL select unique values from a relationship table

I am currently working on a MYSQL project which has a few fairly standard many to many relationship.
I've shown a greatly simplified relationship table to aid my example.
table: book_authors
ID BOOKS AUTHORS
1 1 4
2 1 5
3 4 4
4 4 5
5 2 6
6 2 1
7 2 5
8 3 6
9 3 5
10 3 1
12 5 2
13 6 2
14 7 5
What I'm looking to achieve is to be able to select all books which have specified authors, and only get the books which match all of the supplied authors. The number of authors requested each time is also variable.
So if i'm looking for all books written by author 4 and author 5, I'll get result of books 1 and 4 only. If im looking for books written by author 5 only, i'll get book 7 only.
I don't think there's a very smooth way to do this, but if performance isn't a big concern, you can exploit MySQL's GROUP_CONCAT function, and write this:
SELECT books
FROM book_authors
WHERE books IN
( SELECT books
FROM book_authors
WHERE authors = 4
)
GROUP
BY books
HAVING GROUP_CONCAT(authors ORDER BY authors) = '4,5'
;
(The WHERE clause isn't even needed for correct results, but it makes the query less gratuitously expensive.)
If it's a one off you could do:
not very repeatable though.
select distinct
ba1.books
from
book_authors ba1
join
book_authors ba2
on ba1.id<>ba2.id and ba1.books=ba2.books
where
ba1.authors=5
and ba2.authors=4
edit: forgot part of the join