Optimize MySQL Subquery - mysql

I'm rather new to MySQL and am trying to simplify this statement:
SELECT DISTINCT p.user_id, a.artist_id, a.artist_name,
(SELECT COUNT(*) FROM plays WHERE user_id = p.user_id AND artist_id = a.artist_id) as count
FROM plays as p
LEFT OUTER JOIN artists AS a
ON p.artist_id = a.artist_id;
This accomplishes what I need but painfully slowly. There simply must be some way to do this in a more efficient manner. To give you an idea of the schema:
artists
artist_id artist_name
1 ArtistA
2 ArtistB
3 ArtistC
4 ArtistD
plays
user_id artist_id
1 1
1 2
1 2
2 4
2 4
3 3
And I'm trying to make a table like this:
plays per artist by user
user_id artist_id artist_name count
1 1 ArtistA 1
1 2 ArtistB 2
2 4 ArtistD 2
3 3 ArtistC 1
Granted, I'm working with several hundred thousand rows of data. I wasn't able to find anything on SO pertaining to this particular case, but any resources/instruction would be hugely appreciated.
Thanks!

Yes, it is called a simple aggregation:
SELECT p.user_id, a.artist_id, a.artist_name, COUNT(*) as cnt
FROM artists a JOIN
plays p
ON p.artist_id = a.artist_id
GROUP BY p.user_id, a.artist_id, a.artist_name;
Because your aggregation has fields from both tables, it seems that you really want a match between the two tables. I changed the LEFT JOIN to an inner join.
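For anyone who wants to check the aggregation against the sample data from the question, here is a minimal sketch. It uses Python's sqlite3 module with an in-memory database as a stand-in for MySQL (that substitution is my assumption, not something from the question); the ORDER BY is added only to make the output order predictable.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE artists (artist_id INTEGER PRIMARY KEY, artist_name TEXT);
CREATE TABLE plays (user_id INTEGER, artist_id INTEGER);
INSERT INTO artists VALUES (1,'ArtistA'),(2,'ArtistB'),(3,'ArtistC'),(4,'ArtistD');
INSERT INTO plays VALUES (1,1),(1,2),(1,2),(2,4),(2,4),(3,3);
""")

# One row per (user, artist) pair, with the number of plays for that pair.
rows = conn.execute("""
SELECT p.user_id, a.artist_id, a.artist_name, COUNT(*) AS cnt
FROM artists a
JOIN plays p ON p.artist_id = a.artist_id
GROUP BY p.user_id, a.artist_id, a.artist_name
ORDER BY p.user_id, a.artist_id
""").fetchall()

for row in rows:
    print(row)
```

Each play is read once and grouped, instead of running a separate COUNT(*) subquery for every row of plays.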

Are there indexes on any of your tables? You'll probably want an index on artist_id on your plays table, if you don't already.
Also, I assume that artist_id on artists is a primary key, but if not, you'll want to do that too.
See https://dev.mysql.com/doc/refman/5.0/en/mysql-indexes.html for details.
It might help to provide the output of DESC SELECT DISTINCT p.user_id, a.artist_id, a.artist_name,
(SELECT COUNT(*) FROM plays WHERE user_id = p.user_id AND artist_id = a.artist_id) as count
FROM plays as p
LEFT OUTER JOIN artists AS a
ON p.artist_id = a.artist_id; to check whether your query is using indexes or not.
Having said that, you should switch to @gordon-linoff's query too.
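As a rough illustration of checking whether an index is picked up, here is a small sketch using Python's sqlite3 module. SQLite and its EXPLAIN QUERY PLAN are stand-ins here (my assumption); MySQL's EXPLAIN/DESC output looks different, and the index name idx_plays_artist is made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE plays (user_id INTEGER, artist_id INTEGER);
INSERT INTO plays VALUES (1,1),(1,2),(1,2),(2,4),(2,4),(3,3);
-- Hypothetical index name; the column is the one suggested above.
CREATE INDEX idx_plays_artist ON plays (artist_id);
""")

# Ask the planner how it would run a lookup by artist_id.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT COUNT(*) FROM plays WHERE artist_id = ?", (2,)
).fetchall()

# The last column of each plan row is the human-readable detail text.
plan_text = " ".join(str(row[-1]) for row in plan)
uses_index = "idx_plays_artist" in plan_text
print(plan_text)
print("index used:", uses_index)
```

In MySQL the equivalent check would be to look at the key column of the EXPLAIN output for your real query.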

Related

How to select two columns(foreign key) value and make its match with a primary key (they have the same primary key) MYSQL

I have 2 tables, team and matchs.
Team:
ID Team_name
1 PSG
2 OM
ID is a Primary key
Matchs
ID_team_home ID_team_away goal_team_home goal_team_away
1 2 5 4
2 1 6 1
ID_team_home and ID_team_away are foreign keys.
The result I am aiming for is ONE query that doesn't create a table but just selects the sum of all the goals for each team:
Team_name Team_goals
PSG 6
OM 10
I have tried many solutions: I have used SUM, JOIN, CASE WHEN, IF and subqueries, and nothing worked. Please help.
Most of the time it just sums the two rows and gives me a totally inaccurate answer.
SELECT team.team_name, SUM(matchs.goal_team_home) AS BPe,
CASE WHEN matchs.ID_team_home = team.id THEN SUM(matchs.goal_team_home)
WHEN matchs.ID_team_away = team.id THEN SUM(matchs.goal_team_away) END AS test
FROM matchs, team
WHERE matchs.ID_team_home = team.id OR matchs.ID_team_away = team.id
GROUP BY team.team_name
ORDER BY test
We can use a union approach combined with an outer aggregation:
SELECT t1.Team_name, SUM(t2.goal_team) AS Team_goals
FROM Team t1
INNER JOIN
(
SELECT ID_team_home AS ID_team, goal_team_home AS goal_team FROM Matchs
UNION ALL
SELECT ID_team_away, goal_team_away FROM Matchs
) t2
ON t2.ID_team = t1.ID
GROUP BY t1.Team_name
ORDER BY t1.Team_name;
The union brings all teams/goals inline into just two columns. We then aggregate by team on that intermediate result to get the team goals.
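Here is a quick way to try the union approach on the sample rows from the question. This is only a sketch: it runs the query through Python's sqlite3 module with an in-memory database instead of MySQL (my assumption), and the table is named matchs as in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE team (ID INTEGER PRIMARY KEY, Team_name TEXT);
CREATE TABLE matchs (ID_team_home INTEGER, ID_team_away INTEGER,
                     goal_team_home INTEGER, goal_team_away INTEGER);
INSERT INTO team VALUES (1,'PSG'),(2,'OM');
INSERT INTO matchs VALUES (1,2,5,4),(2,1,6,1);
""")

# Stack home and away goals into (team, goals) pairs, then sum per team.
totals = conn.execute("""
SELECT t1.Team_name, SUM(t2.goal_team) AS Team_goals
FROM team t1
INNER JOIN (
    SELECT ID_team_home AS ID_team, goal_team_home AS goal_team FROM matchs
    UNION ALL
    SELECT ID_team_away, goal_team_away FROM matchs
) t2 ON t2.ID_team = t1.ID
GROUP BY t1.Team_name
ORDER BY t1.Team_name
""").fetchall()
print(totals)
```

UNION ALL matters here: plain UNION would drop duplicate (team, goals) pairs and undercount a team that scored the same number of goals in two matches.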
Join team to matchs with a LEFT join (just in case there is a team without any row in matchs) and use conditional aggregation:
SELECT t.Team_name,
SUM(
CASE t.ID
WHEN m.ID_team_home THEN m.goal_team_home
WHEN m.ID_team_away THEN m.goal_team_away
ELSE 0
END
) Team_goals
FROM team t LEFT JOIN matchs m
ON t.ID IN (m.ID_team_home, m.ID_team_away)
GROUP BY t.ID
ORDER BY Team_goals;
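To see what the LEFT join buys you, here is a small sketch of the conditional aggregation with one extra team that has no matches. The team 'Lyon' is a made-up row for illustration, and the query runs on SQLite through Python's sqlite3 module rather than on MySQL (both are my assumptions).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE team (ID INTEGER PRIMARY KEY, Team_name TEXT);
CREATE TABLE matchs (ID_team_home INTEGER, ID_team_away INTEGER,
                     goal_team_home INTEGER, goal_team_away INTEGER);
-- 'Lyon' is a hypothetical team with no rows in matchs.
INSERT INTO team VALUES (1,'PSG'),(2,'OM'),(3,'Lyon');
INSERT INTO matchs VALUES (1,2,5,4),(2,1,6,1);
""")

# For each joined match, pick the goals column that belongs to this team.
goals = conn.execute("""
SELECT t.Team_name,
       SUM(CASE t.ID
             WHEN m.ID_team_home THEN m.goal_team_home
             WHEN m.ID_team_away THEN m.goal_team_away
             ELSE 0
           END) AS Team_goals
FROM team t LEFT JOIN matchs m
  ON t.ID IN (m.ID_team_home, m.ID_team_away)
GROUP BY t.ID
ORDER BY Team_goals
""").fetchall()
print(goals)
```

The team without matches still appears, with 0 goals, because the ELSE 0 branch handles the all-NULL row produced by the LEFT join.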

how to fetch songs based on multiple conditions from joined tables

I have two tables songs and song_clubs. The schema is below:-
songs schema
id available_for song_name status
1 all Song 1 1
2 selection Song 2 1
3 selection Song 3 1
song_clubs schema
song_id club_id
2 1
2 2
3 2
Now I want to fetch the songs for club id 1 together with the songs that are available for all clubs.
My expected output is like below:-
id available_for song_name
1 all Song 1
2 selection Song 2
I have tried the query below:
select id,available_for,song_name from songs
JOIN
song_clubs
on song_clubs.song_id = songs.id
WHERE songs.status =1 and song_clubs.club_id=1 or songs.available_for ='all'
But it's only returning one entry, the selection-based one.
You can do it with EXISTS:
SELECT s.id, s.available_for, s.song_name
FROM songs s
WHERE s.status =1 AND (
s.available_for = 'all'
OR EXISTS (SELECT 1 FROM song_clubs c WHERE c.club_id = 1 AND c.song_id = s.id))
or with the operator IN:
SELECT id, available_for, song_name
FROM songs
WHERE status =1 AND (
available_for = 'all'
OR id IN (SELECT song_id FROM song_clubs WHERE club_id = 1))
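If you want to try the EXISTS version on the sample rows, here is a minimal sketch. It uses Python's sqlite3 module with an in-memory database in place of MySQL (my assumption), passes the club id as a parameter, and adds an ORDER BY only to fix the output order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE songs (id INTEGER PRIMARY KEY, available_for TEXT,
                    song_name TEXT, status INTEGER);
CREATE TABLE song_clubs (song_id INTEGER, club_id INTEGER);
INSERT INTO songs VALUES (1,'all','Song 1',1),
                         (2,'selection','Song 2',1),
                         (3,'selection','Song 3',1);
INSERT INTO song_clubs VALUES (2,1),(2,2),(3,2);
""")

# Active songs that are open to everyone or explicitly linked to the club.
club_songs = conn.execute("""
SELECT s.id, s.available_for, s.song_name
FROM songs s
WHERE s.status = 1 AND (
      s.available_for = 'all'
   OR EXISTS (SELECT 1 FROM song_clubs c
              WHERE c.club_id = ? AND c.song_id = s.id))
ORDER BY s.id
""", (1,)).fetchall()
print(club_songs)
```

Because songs is never joined to song_clubs, a song linked to several clubs cannot show up more than once.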
Two things.
Use parentheses to group WHERE conditions; otherwise AND binds more tightly than OR, so a AND b OR c is evaluated as (a AND b) OR c.
Use LEFT JOIN to avoid losing items from your first table that don't match any items in your second table.
This should work (https://www.db-fiddle.com/f/6dAz91ejhe8AbGECFDihbu/0)
SELECT id,available_for,song_name
FROM songs
LEFT JOIN song_clubs ON songs.id = song_clubs.song_id
WHERE songs.status = 1
AND (song_clubs.club_id=1 or songs.available_for ='all')
ORDER BY id;
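The effect of the parentheses is easiest to see with an inactive song that is available for all clubs. The sketch below adds such a row (Song 4 with status 0 is made up for the example) and compares both WHERE clauses; it runs on SQLite through Python's sqlite3 module instead of MySQL, which is my assumption, though AND also takes precedence over OR there.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE songs (id INTEGER PRIMARY KEY, available_for TEXT,
                    song_name TEXT, status INTEGER);
CREATE TABLE song_clubs (song_id INTEGER, club_id INTEGER);
-- Song 4 is a hypothetical inactive song (status = 0) available for all.
INSERT INTO songs VALUES (1,'all','Song 1',1),
                         (2,'selection','Song 2',1),
                         (3,'selection','Song 3',1),
                         (4,'all','Song 4',0);
INSERT INTO song_clubs VALUES (2,1),(2,2),(3,2);
""")

base = """
SELECT songs.id
FROM songs
LEFT JOIN song_clubs ON songs.id = song_clubs.song_id
WHERE {condition}
ORDER BY songs.id
"""

# Without parentheses: parsed as (status = 1 AND club_id = 1) OR available_for = 'all'.
loose = "songs.status = 1 AND song_clubs.club_id = 1 OR songs.available_for = 'all'"
# With parentheses: status = 1 must hold for every returned row.
strict = "songs.status = 1 AND (song_clubs.club_id = 1 OR songs.available_for = 'all')"

loose_ids = [r[0] for r in conn.execute(base.format(condition=loose))]
strict_ids = [r[0] for r in conn.execute(base.format(condition=strict))]
print(loose_ids, strict_ids)
```

The unparenthesized filter lets the inactive song through, because the status check only applies to the club_id branch.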
You can use this answer too:
SELECT DISTINCT id, available_for, song_name FROM songs, song_clubs
WHERE (song_clubs.song_id = songs.id AND songs.status = 1 AND song_clubs.club_id = 1) OR (songs.available_for = 'all');
Here I use a cross join (the comma syntax) to produce every combination of rows, and then DISTINCT collapses the duplicates so that you get only the required 2 rows.
Note: this query does not perform well on large tables, so it is better to use EXISTS or LEFT JOIN. The other answers are better for performance; this one is just another way to do it.

How to write SELECT query from 2 different tables based on result from 3rd table, while keeping order?

I have 3 tables where one table has 3 columns with foreign keys to the other two tables.
table album_posters_albums-
+---------+---------+---------+
| album_id|poster_id|albums_id|
+---------+---------+---------+
| 49 | 167 | NULL |
| 49 | NULL | 45 |
+---------+---------+---------+
album_id and albums_id reference the album table, and poster_id references the poster table.
I need to
SELECT * FROM poster
WHERE poster_id IN (
SELECT poster_id
FROM album_poster_albums
WHERE album_id=49);
If the poster_id IS NULL:
SELECT * FROM album
WHERE album_id IN (
SELECT albums_id
FROM album_poster_albums
WHERE album_id=49);
The problem is I need to keep the posters and albums in the same order as they occur in the album_posters_albums table.
I was sending a query to get the list of ids, then looping through each result and querying the db to get either the poster or album but that is obviously very inefficient when I should be able to do it in one query.
It sounds like you want to use INNER JOINs:
SELECT album.*, poster.*
FROM album_poster_albums
INNER JOIN album ON album_poster_albums.albums_id = album.album_id
INNER JOIN poster ON album_poster_albums.poster_id = poster.poster_id
WHERE album_poster_albums.album_id = 49
Based on your comment about one row with a poster and one row with an album, UNION ALL might be what you're looking for. (We'd need to see more details about the tables and a few more rows to understand the ordering part.) This should give you an album row then a poster row for each album id.
Caveats: The number and the orders of columns in the album and poster tables must be the same. Also, the data types of those columns must be the same or compatible. (I haven't used a UNION, or UNION ALL, in a very long time.)
SELECT * FROM (
SELECT album.*
FROM album_poster_albums
INNER JOIN album ON album_poster_albums.albums_id = album.album_id
WHERE album_poster_albums.album_id = 49
UNION ALL
SELECT poster.*
FROM album_poster_albums
INNER JOIN poster ON album_poster_albums.poster_id = poster.poster_id
WHERE album_poster_albums.album_id = 49
) AS t
ORDER BY album_id
SET @rowId := 0;
SELECT * FROM (SELECT @rowId := @rowId + 1 AS row_num, t.album_id AS parent_album_id, album.*
FROM album_poster_albums t
INNER JOIN album ON album.album_id = t.albums_id
WHERE t.album_id = 49
UNION
SELECT @rowId := @rowId + 2, s.album_id, poster.*
FROM album_poster_albums s
INNER JOIN poster ON poster.poster_id = s.poster_id
WHERE s.album_id = 49) T
ORDER BY row_num, parent_album_id
I decided to create a new table with an auto increment field, based on @beltouche's comment. My MySQL is pretty rusty and I thought there might be a way using CASE or IFNULL. I didn't need the unique id previously with how I wrote the queries.
In hindsight the solution is obvious.
SELECT * FROM (SELECT albums.*, 1 AS type, t.id
FROM album_poster_album t
INNER JOIN albums ON albums.album_id = t.albums_id
WHERE t.album_id = 49
UNION ALL
SELECT poster.*, 2 AS type, s.id
FROM album_poster_album s
INNER JOIN poster ON poster.posterID = s.poster_id
WHERE s.album_id = 49) T
ORDER BY T.id
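For reference, here is a small sketch of that final query shape on made-up data. The column lists (a title column on both albums and poster), the sample rows, and the use of SQLite through Python's sqlite3 module instead of MySQL are all assumptions on my part; only the idea of ordering by the auto increment id of the link table comes from the solution above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical, simplified schemas: both item tables have (id, title).
CREATE TABLE albums (album_id INTEGER PRIMARY KEY, title TEXT);
CREATE TABLE poster (posterID INTEGER PRIMARY KEY, title TEXT);
CREATE TABLE album_poster_album (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    album_id INTEGER, poster_id INTEGER, albums_id INTEGER);
INSERT INTO albums VALUES (45,'Nested album'),(49,'Parent album');
INSERT INTO poster VALUES (167,'Poster A'),(168,'Poster B');
INSERT INTO album_poster_album (album_id, poster_id, albums_id)
VALUES (49,167,NULL),(49,NULL,45),(49,168,NULL);
""")

# type marks the source table (1 = album, 2 = poster); id restores link order.
items = conn.execute("""
SELECT * FROM (
    SELECT albums.album_id AS item_id, albums.title, 1 AS type, t.id
    FROM album_poster_album t
    INNER JOIN albums ON albums.album_id = t.albums_id
    WHERE t.album_id = 49
    UNION ALL
    SELECT poster.posterID, poster.title, 2 AS type, s.id
    FROM album_poster_album s
    INNER JOIN poster ON poster.posterID = s.poster_id
    WHERE s.album_id = 49
) T
ORDER BY T.id
""").fetchall()
print(items)
```

The posters and the nested album come back interleaved in the order their link rows were inserted, and the type column tells the application which table each row came from.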

Select the first image from multiple images using JOIN mysql

I have three tables: food, fav_food and food_image. The food table holds details about foods, food_image holds multiple images for a single food, and the fav_food table holds each user's favorite food ids.
food:
f_id | description
food_image
f_id | img_url | rank
fav_food
user_id | f_id
Here's what I tried:
SELECT food.f_id,
food.description,
img.minimgrank,
i.img_url AS profile_photo
FROM fav_food
INNER JOIN food
ON fav_food.f_id = food.f_id
LEFT OUTER JOIN(SELECT f_id,
Min(rank) AS minImgRank
FROM food_image
GROUP BY f_id) img
ON img.f_id = food.f_id
JOIN food_image i
ON i.rank = img.minimgrank
WHERE fav_food.user_id = ?
Now I need a query that will show the favorite foods of a user with description and image. Although there are multiple images, I need to select the single image with the minimum rank (suppose the ranks are 1, 2, 3; then the image with rank 1 will be selected). So my question is how to write a faster query to achieve my goal?
Consider joining the three tables and after that using a window function such as min(rank) to get the results.
select *
from (
select a.user_id
,b.img_url
,b.rank
,min(b.rank) over(partition by a.user_id, a.f_id) as rnk
from fav_food a
join food_image b
on a.f_id=b.f_id
join food c
on a.f_id=c.f_id
)x
where x.rank=x.rnk
This is a situation where a correlated subquery should have better performance:
SELECT f.f_id, f.description,
fi.rank, fi.img_url AS profile_photo
FROM fav_food ff JOIN
food f
ON ff.f_id = f.f_id LEFT JOIN
food_image fi
ON fi.f_id = f.f_id AND
fi.rank = (SELECT MIN(fi2.rank) FROM food_image fi2 WHERE fi2.f_id = fi.f_id)
WHERE ff.user_id = ? ;
For performance, you want to be sure that you have an index on food_image(f_id, rank), and fav_food(user_id, f_id), as well as indexes on the primary keys.
Why is this faster than the GROUP BY version? First, indexes can be used for the correlated query, but probably won't be for the JOIN after the GROUP BY.
Second, the GROUP BY needs to process all the data in the images table. This only needs to process the favorite foods for the given user.
Finally, ROW_NUMBER() is an option in MySQL 8+. However, this is still likely to be as fast or faster than that solution (based on my experience with other databases).
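Here is a sketch of the correlated subquery version together with the two suggested indexes, run on made-up rows. The sample data, the index names, and the use of SQLite through Python's sqlite3 module in place of MySQL are assumptions for illustration. Note that RANK is a reserved word in recent MySQL 8 releases, so the rank column may need backticks there.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE food (f_id INTEGER PRIMARY KEY, description TEXT);
CREATE TABLE food_image (f_id INTEGER, img_url TEXT, rank INTEGER);
CREATE TABLE fav_food (user_id INTEGER, f_id INTEGER);
-- Hypothetical index names for the indexes suggested above.
CREATE INDEX idx_food_image_f_rank ON food_image (f_id, rank);
CREATE INDEX idx_fav_food_user_f ON fav_food (user_id, f_id);
-- Made-up sample rows; food 3 has no image at all.
INSERT INTO food VALUES (1,'Pizza'),(2,'Soup'),(3,'Salad');
INSERT INTO food_image VALUES (1,'pizza_b.jpg',2),(1,'pizza_a.jpg',1),(2,'soup.jpg',3);
INSERT INTO fav_food VALUES (7,1),(7,3),(8,2);
""")

# For each favorite food of the user, keep only the image with the lowest rank.
favorites = conn.execute("""
SELECT f.f_id, f.description, fi.rank, fi.img_url AS profile_photo
FROM fav_food ff
JOIN food f ON ff.f_id = f.f_id
LEFT JOIN food_image fi
       ON fi.f_id = f.f_id
      AND fi.rank = (SELECT MIN(fi2.rank) FROM food_image fi2
                     WHERE fi2.f_id = fi.f_id)
WHERE ff.user_id = ?
ORDER BY f.f_id
""", (7,)).fetchall()
print(favorites)
```

Thanks to the LEFT JOIN, a favorite food without any image is still returned, with NULL in the image columns.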

Mysql query in drupal database - groupwise maximum with duplicate data

I'm working on a mysql query in a Drupal database that pulls together users and two different cck content types. I know people ask for help with groupwise maximum queries all the time... I've done my best but I need help.
This is what I have so far:
# the artists
SELECT
users.uid,
users.name AS username,
n1.title AS artist_name
FROM users
LEFT JOIN users_roles ur
ON users.uid=ur.uid
INNER JOIN role r
ON ur.rid=r.rid
AND r.name='artist'
LEFT JOIN node n1
ON n1.uid = users.uid
AND n1.type = 'submission'
WHERE users.status = 1
ORDER BY users.name;
This gives me data that looks like:
uid username artist_name
1 foo Joe the Plumber
2 bar Jane Doe
3 baz The Tooth Fairy
Also, I've got this query:
# artwork
SELECT
n.nid,
n.uid,
a.field_order_value
FROM node n
LEFT JOIN content_type_artwork a
ON n.nid = a.nid
WHERE n.type = 'artwork'
ORDER BY n.uid, a.field_order_value;
Which gives me data like this:
nid uid field_order_value
1 1 1
2 1 3
3 1 2
4 2 NULL
5 3 1
6 3 1
Additional relevant info:
nid is the primary key for an Artwork
every Artist has one or more Artworks
valid data for field_order_value is NULL, 1, 2, 3, or 4
field_order_value is not necessarily unique per Artist - an Artist could have 4 Artworks all with field_order_value = 1.
What I want is the row with the minimum field_order_value from my second query joined with the artist information from the first query. In cases where the field_order_value is not valuable information (either because the Artist has used duplicate values among their Artworks or left that field NULL), I would like the row with the minimum nid from the second query.
The Solution
Using divide and conquer as a strategy and mysql views as a technique, and referencing this article about groupwise maximum queries, I solved my problem.
Create the View
# artists and artworks all in one table
CREATE VIEW artists_artwork AS
SELECT
users.uid,
users.name AS artist,
COALESCE(n1.title, 'Not Yet Entered') AS artist_name,
n2.nid,
a.field_image_fid,
COALESCE(a.field_order_value, 1) AS field_order_value
FROM users
LEFT JOIN users_roles ur
ON users.uid=ur.uid
INNER JOIN role r
ON ur.rid=r.rid
AND r.name='artist'
LEFT JOIN node n1
ON n1.uid = users.uid
AND n1.type = 'submission'
LEFT JOIN node n2
ON n2.uid = users.uid
AND n2.type = 'artwork'
LEFT JOIN content_type_artwork a ON n2.nid = a.nid
WHERE users.status = 1;
Query the View
SELECT
a2.uid,
a2.artist,
a2.artist_name,
a2.nid,
a2.field_image_fid,
a2.field_order_value
FROM (
SELECT
uid,
MIN(field_order_value) AS field_order_value
FROM artists_artwork
GROUP BY uid
) a1
JOIN artists_artwork a2
ON a2.nid = (
SELECT
nid
FROM artists_artwork a
WHERE a.uid = a1.uid
AND a.field_order_value = a1.field_order_value
ORDER BY
uid ASC, field_order_value ASC, nid ASC
LIMIT 1
)
ORDER BY artist;
A simple solution to this can be to create views in your database that can then be joined together. This is especially useful if you often want to see the intermediate data in the same way in some other place. While it is possible to mash everything together into one huge query, I sometimes just take the divide and conquer approach.
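To make the view approach easier to follow, here is a stripped-down sketch of the same pattern: a view that fills in missing order values, then a groupwise minimum with nid as the tie breaker. The single artwork table is a simplification I made up to stand in for the Drupal tables, the rows are the sample artwork rows from the question, and the query runs on SQLite through Python's sqlite3 module rather than MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Simplified stand-in for the node / content_type_artwork join.
CREATE TABLE artwork (nid INTEGER PRIMARY KEY, uid INTEGER, field_order_value INTEGER);
INSERT INTO artwork VALUES (1,1,1),(2,1,3),(3,1,2),(4,2,NULL),(5,3,1),(6,3,1);

-- The view treats a missing order value as 1, as in the solution above.
CREATE VIEW artists_artwork AS
SELECT uid, nid, COALESCE(field_order_value, 1) AS field_order_value
FROM artwork;
""")

# Per artist: find the minimum order value, then pick the lowest nid that has it.
picked = conn.execute("""
SELECT a2.uid, a2.nid, a2.field_order_value
FROM (
    SELECT uid, MIN(field_order_value) AS field_order_value
    FROM artists_artwork
    GROUP BY uid
) a1
JOIN artists_artwork a2
  ON a2.nid = (
        SELECT a.nid
        FROM artists_artwork a
        WHERE a.uid = a1.uid
          AND a.field_order_value = a1.field_order_value
        ORDER BY a.nid ASC
        LIMIT 1
     )
ORDER BY a2.uid
""").fetchall()
print(picked)
```

Artist 3 has two artworks with the same order value, and the LIMIT 1 with ORDER BY nid is what settles that tie.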