i have been trying to use this particular query to find the top three most used musical keys in songs and show all the songs made using those musical keys so far the problem is that i'm using in operator with a subquery and it doesn't work!!
i have read that instead of in using join is preferable but since i haven't used any joins i am not able to use that with the query!! plaeae help!!!
SELECT `Key_Name`,`Song_Title`
FROM `musicalkey_record`,`musical_keys`,`record`
WHERE `record`.`Record_ID`=`musicalkey_record`.`Record_ID`
AND `musical_keys`.`Key_ID`=`musicalkey_record`.`Key_ID`
AND `Key_Name` IN (SELECT `Key_Name` FROM `musicalkey_record`,`musical_keys`,`record`
WHERE `record`.`Record_ID`=`musicalkey_record`.`Record_ID`
AND `musical_keys`.`Key_ID`=`musicalkey_record`.`Key_ID` GROUP BY `Key_Name` ORDER BY
COUNT(`Song_Title`) DESC LIMIT 3) ORDER BY `Key_Name`;
query with joins but without subquery:
SELECT `Key_Name`,`Song_Title` FROM `musical_keys` INNER JOIN `musicalkey_record` ON
`musical_keys`.`Key_ID`=`musicalkey_record`.`Key_ID`
INNER JOIN `record` ON `record`.`Record_ID`=`musicalkey_record`.`Record_ID` AND `
Key_Name` IN ('4F','Circle of fifths','C-Major') ORDER BY `Key_Name`;
This is a simplification of Barmar's approach, reducing the number of joins:
SELECT mk.Key_Name, Song_Title
FROM musicalkey_record mr JOIN
musical_keys mk
ON mk.Key_ID = mr.Key_ID JOIN
record r
ON r.Record_ID = mr.Record_ID JOIN
(SELECT Key_ID
FROM musicalkey_record mr
GROUP BY Key_ID
ORDER BY COUNT(*) DESC
LIMIT 3
) top3
ON mr.Key_ID = top3.Key_ID
ORDER BY mk.Key_Name;
I may be wrong, but it seems all those joins are not necessary. You want to count song records per musical key. So join musical_keys with musicalkey_record and count. To get the song names you would have to join with record, too, and use wm_concat to get the song names in one string.
SELECT *
FROM
(
SELECT mk.Key_Name, GROUP_CONCAT(r.Song_Title) as Song_Titles
FROM musical_keys mk
LEFT JOIN musicalkey_record mkr ON mkr.Key_ID = mk.Key_ID
LEFT JOIN record r ON r.Record_ID = mkr.Record_ID
GROUP BY mk.Key_Name
ORDER BY COUNT(*) DESC
LIMIT 3
) dummy
ORDER BY Key_Name;
EDIT: If you want to show all equally ranking records, i.e. at least three, but more if record four or more have the same count as record three, then you would have to get the top three, look up the third place and then select again to get all records with at least that count.
SELECT mk.Key_Name, GROUP_CONCAT(r.Song_Title) as Song_Titles
FROM musical_keys mk
LEFT JOIN musicalkey_record mkr ON mkr.Key_ID = mk.Key_ID
LEFT JOIN record r ON r.Record_ID = mkr.Record_ID
GROUP BY mk.Key_Name
HAVING COUNT(*) >=
(
SELECT MIN(cnt)
FROM
(
SELECT COUNT(*) as cnt
FROM musical_keys mk
LEFT JOIN musicalkey_record mkr ON mkr.Key_ID = mk.Key_ID
GROUP BY mk.Key_Name
ORDER BY COUNT(*) DESC
LIMIT 3
) dummy
)
ORDER BY Key_Name;
I suppose that the Key_Name is unique in musical_keys? Then you can even remove musical_keys from the inner select altogether and only select from musicalkey_record grouping by mkr.Key_ID instead of mk.Key_Name. Thus the query is even shorter.
SELECT mk.Key_Name, Song_Title
FROM musicalkey_record AS mr
JOIN musical_keys AS mk ON mk.Key_ID = mr.Key_ID
JOIN record AS r ON r.Record_ID = mr.Record_ID
JOIN (SELECT Key_Name
FROM FROM musicalkey_record AS mr
JOIN musical_keys AS mk ON mk.Key_ID = mr.Key_ID
JOIN record AS r ON r.Record_ID = mr.Record_ID
GROUP BY Key_Name
ORDER BY COUNT(Song_Title) DESC
LIMIT 3) AS top3 ON mk.Key_Name = top3.Key_Name
ORDER BY mk.Key_Name
Related
I have a query that displays the total amount of matches won by individual teams in the database,
select t.name, count(*) 'Matches_Won'
from test.team t
inner join test.match_scores m on m.winner = t.id
group by t.name
order by Matches_Won desc;
Another query prints out the total number of Matches Played by the Individual Teams in the database,
select t.name, count(*) 'Matches_PLayed' from test.team t inner join test.results r on r.home_team = t.id or r.away_team = t.id group by t.name
order by Matches_Played desc;
Now, I am trying to combine these two queries, I want a table with three columns,
Team Name
Matches Played
Matches Won
I tried to union the two queries, but it didn't work. Anyone who can guide me on this?
This is the Team Table
This is the Match Scores Table. In this table, the columns "Home Team", "Away Team" represents the Goals scored by respective team and the "Winner" is the foreign key, referring to the Team Table.
This is the Results Table, where home_team and away_team are foreign keys referring to the Teams Table.
Union unions two result sets. You want to have both results, something like a merged.
The following statement isn't proved, but it should show you the idea of the solution. You have to join both, the results and the played games in one query.
select t.name, count(distinct m.id) 'Matches_Won', count(distinct r.id) 'Matches_PLayed'
from test.team t
left join test.match_scores m on m.winner = t.id
left join test.results r on r.home_team = t.id or r.away_team = t.id
group by t.name
order by Matches_Won, Matches_Played desc;
can you join them based on team_name instead of union?
select
matchesplayed.name ,Matches_Won,Matches_PLayed
from
(
select t.name, count(*) 'Matches_PLayed' from test.team t inner join test.results r on r.home_team = t.id or r.away_team = t.id group by t.name
) matchesplayed
LEFT JOIN (select t.name, count(*) 'Matches_Won'
from test.team t
inner join test.match_scores m on m.winner = t.id
group by t.name ) matchesown
ON matchesown.name = matchesplayed.name
order by 1
Now, if a team looses all the matches, they will also show up with null in matches won column(left join is used for that).
I have three tables, which looks like this
forts
|id|lat|lon|
fort_sightings
|id|fort_id|team|
fort_raids
|id|fort_id|raid_level|
I need a query that fetches all the rows from forts and then select the latest information from fort_sightings and fort_raids, if any. There might be several rows where fort_id has the same value, so I need the latest information.
Currently, I have this, which might not be the prettiest
SELECT
*
FROM
forts c
LEFT JOIN fort_sightings o ON o.id = (
SELECT
id
FROM
fort_sightings
WHERE
fort_id = c.id
ORDER BY
id DESC
LIMIT 1
)
LEFT JOIN fort_raids r ON r.id = (
SELECT
id
FROM
fort_raids
WHERE
fort_id = c.id
ORDER BY
id DESC
LIMIT 1
)
But it's painfully slow, the query takes over 10 seconds. There's only ~350 rows in forts, so it really shouldn't take this long. I believe it's from all the SELECT queries in the JOIN, but I don't know any alternative.
EXPLAIN
This is your query:
SELECT *
FROM forts c LEFT JOIN
fort_sightings o
ON o.id = (SELECT fs2.id
FROM fort_sightings fs2
WHERE fs.fort_id = c.id
ORDER BY fs2.id DESC
LIMIT 1
) LEFT JOIN
fort_raids r
ON r.id = (SELECT fr2.id
FROM fort_raids fr2
WHERE fr2.fort_id = c.id
ORDER BY fr2.id DESC
LIMIT 1
);
I find it strangely structured. But, see if this works:
fort_sightings(fort_id, id)
fort_raids(fort_id, id)
It is important that the fort_id be the first key in the indexes.
If this doesn't work, then you might need to modify the query.
Aside from the index recommendation from Gordon, you are doing a correlated subquery which means that for every record in the forts table, you are re-running the query to the sightings and raids. I would change slightly to pull all max() per fort from each THEN join... like
SELECT
c.id,
c.lat,
c.lon,
fs2.team,
fr2.raid_level
FROM
forts c
LEFT JOIN
( select fs.fort_id, max( fs.id ) as MaxSightID
from fort_sightings fs
group by fs.fort_id ) S
on c.id = s.fort_id
LEFT JOIN fort_sightings fs2
on s.MaxSightID = fs2.id
LEFT JOIN
( select fr.fort_id, max( fr.id ) as MaxRaidID
from fort_raids fs
group by fr.fort_id ) R
on c.ID = r.fort_id
LEFT JOIN fort_raids fr2
on r.MaxRaidID = fr2.id
The second part of the left-joins goes back to the original raid and sighting tables to pull the corresponding team and raid level in the final results if any such are found.
If you need the most efficient query, you should add columns latest_fort_sighting_id latest_fort_raid_id to table forts. MySQL does not have powerful features such as Materialized Views or Hash Joins like PostgreSQL, we need to handle them manually. Do not forget using transaction for updates.
If you limit range of forts, alternatively, you can run optimized query only using LEFT JOIN.
select - SQL join: selecting the last records in a one-to-many relationship - Stack Overflow
SELECT forts.*, fs1.team, fr1.raid_level FROM forts
LEFT JOIN fort_sightings fs1 ON fs1.fort_id = forts.id
LEFT JOIN fort_sightings fs2 ON fs2.fort_id = forts.id AND fs1.id < fs2.id
LEFT JOIN fort_raids fr1 ON fr1.fort_id = forts.id
LEFT JOIN fort_raids fr2 ON fr2.fort_id = forts.id AND fr1.id < fr2.id
WHERE fs2.id IS NULL AND fr2.id IS NULL AND forts.id > 5 ORDER BY forts.id LIMIT 5;
I'm creating a system that allows a user to search a database of photo albums images for a keyword, it's working great, the only issue is that I'm ordering relevancy by the amount of times that keyword appears in an album. I'm doing this using:
SELECT collections_ids.collection_id
FROM `keywords`
INNER JOIN collections_ids ON keywords.id = collections_ids.photo_id
WHERE keywords.`keyword` = 'trees'
GROUP BY collection_id
ORDER BY COUNT(*) DESC
As said, this works great.
The only issue is, when this is included in an "WHERE IN" query, it loses it's order and is returned randomly. For clarity, here is the query:
SELECT collections.id,
collections.title
images.img_small
FROM `collections`
INNER JOIN images ON images.id = collections.cover_photo
WHERE collections.`id` IN
(SELECT collections_ids.collection_id
FROM `keywords`
INNER JOIN collections_ids ON keywords.id = collections_ids.photo_id
WHERE keywords.`keyword` = 'trees'
GROUP BY collection_id
ORDER BY COUNT(*) DESC)
I've tried researching, and people have suggested using the FIELD function, but I don't see that working in this context.
Any suggestions?
you can use sub query as join and take it count(*) as order by like below
SELECT collections.id,
collections.title
images.img_small
FROM `collections`
INNER JOIN images ON images.id = collections.cover_photo
INNER JOIN
(SELECT distinct collections_ids.collection_id As collection_id,COUNT(*) as total
FROM `keywords`
INNER JOIN collections_ids ON keywords.id = collections_ids.photo_id
WHERE keywords.`keyword` = 'trees'
GROUP BY collection_id
) as A
ON A.collection_id =collections.collection_id
order by A.total
I don't know much about query optimization but I know the order in which queries get executed
FROM clause
WHERE clause
GROUP BY clause
HAVING clause
SELECT clause
ORDER BY clause
This the query I had written
SELECT
`main_table`.forum_id,
my_topics.topic_id,
(
SELECT MAX(my_posts.post_id) FROM my_posts WHERE my_topics.topic_id = my_posts.topic_id
) AS `maxpostid`,
(
SELECT my_posts.admin_user_id FROM my_posts WHERE my_topics.topic_id = my_posts.topic_id ORDER BY my_posts.post_id DESC LIMIT 1
) AS `admin_user_id`,
(
SELECT my_posts.user_id FROM my_posts WHERE my_topics.topic_id = my_posts.topic_id ORDER BY my_posts.post_id DESC LIMIT 1
) AS `user_id`,
(
SELECT COUNT(my_topics.topic_id) FROM my_topics WHERE my_topics.forum_id = main_table.forum_id ORDER BY my_topics.forum_id DESC LIMIT 1
) AS `topicscount`,
(
SELECT COUNT(my_posts.post_id) FROM my_posts WHERE my_topics.topic_id = my_posts.topic_id ORDER BY my_topics.topic_id DESC LIMIT 1
) AS `postcount`,
(
SELECT CONCAT(admin_user.firstname,' ',admin_user.lastname) FROM admin_user INNER JOIN my_posts ON my_posts.admin_user_id = admin_user.user_id WHERE my_posts.post_id = maxpostid ORDER BY my_posts.post_id DESC LIMIT 1
) AS `adminname`,
(
SELECT forum_user.nick_name FROM forum_user INNER JOIN my_posts ON my_posts.user_id = forum_user.user_id WHERE my_posts.post_id = maxpostid ORDER BY my_posts.post_id DESC LIMIT 1
) AS `nickname`,
(
SELECT CONCAT(ce1.value,' ',ce2.value) AS fullname FROM my_posts INNER JOIN customer_entity_varchar AS ce1 ON ce1.entity_id = my_posts.user_id INNER JOIN customer_entity_varchar AS ce2 ON ce2.entity_id=my_posts.user_id WHERE (ce1.attribute_id = 1) AND (ce2.attribute_id = 2) AND my_posts.post_id = maxpostid ORDER BY my_posts.post_id DESC LIMIT 1
) AS `fullname`
FROM `my_forums` AS `main_table`
LEFT JOIN `my_topics` ON main_table.forum_id = my_topics.forum_id
WHERE (forum_status = '1')
And now I want to know if there is any way to optimize it ? Because all the logic is written in Select section not From, but I don't know how to write the same logic in From section of the query ?
Does it make any difference or both are same ?
Thanks
Correlated subqueries should really be a last resort, they often end up being executed RBAR, and given that a number of your subqueries are very similar, trying to get the same result using joins is going to result in a lot less table scans.
The first thing I note is that all of your subqueries include the table my_posts, and most contain ORDER BY my_posts.post_id DESC LIMIT 1, those that don't have a count with no group by so the order and limit are redundant anyway, so my first step would be to join to my_posts:
SELECT *
FROM my_forums AS f
LEFT JOIN my_topics AS t
ON f.forum_id = t.forum_id
LEFT JOIN
( SELECT topic_id, MAX(post_id) AS post_id
FROM my_posts
GROUP BY topic_id
) AS Maxp
ON Maxp.topic_id = t.topic_id
LEFT JOIN my_posts AS p
ON p.post_id = Maxp.post_id
WHERE forum_status = '1';
Here the subquery just ensures you get the latest post per topic_id. I have shortened your table aliases here for my convenience, I am not sure why you would use a table alias that is longer than the actual table name?
Now you have the bulk of your query you can start adding in your columns, in order to get the post count, I have added a count to the subquery Maxp, I have also had to add a few more joins to get some of the detail out, such as names:
SELECT f.forum_id,
t.topic_id,
p.post_id AS `maxpostid`,
p.admin_user_id,
p.user_id,
t2.topicscount,
maxp.postcount,
CONCAT(au.firstname,' ',au.lastname) AS adminname,
fu.nick_name AS nickname
CONCAT(ce1.value,' ',ce2.value) AS fullname
FROM my_forums AS f
LEFT JOIN my_topics AS t
ON f.forum_id = t.forum_id
LEFT JOIN
( SELECT topic_id,
MAX(post_id) AS post_id,
COUNT(*) AS postcount
FROM my_posts
GROUP BY topic_id
) AS Maxp
ON Maxp.topic_id = t.topic_id
LEFT JOIN my_posts AS p
ON p.post_id = Maxp.post_id
LEFT JOIN admin_user AS au
ON au.admin_user_id = p.admin_user_id
LEFT JOIN forum_user AS fu
ON fu.user_id = p.user_id
LEFT JOIN customer_entity_varchar AS ce1
ON ce1.entity_id = p.user_id
AND ce1.attribute_id = 1
LEFT JOIN customer_entity_varchar AS ce2
ON ce2.entity_id = p.user_id
AND ce2.attribute_id = 2
LEFT JOIN
( SELECT forum_id, COUNT(*) AS topicscount
FROM my_topics
GROUP BY forum_id
) AS t2
ON t2.forum_id = f.forum_id
WHERE forum_status = '1';
I am not familiar with your schema so the above may need some tweaking, but the principal remains - use JOINs over sub-selects.
The next stage of optimisation I would do is to get rid of your customer_entity_varchar table, or at least stop using it to store things as basic as first name and last name. The Entity-Attribute-Value model is an SQL antipattern, if you added two columns, FirstName and LastName to your forum_user table you would immediately lose two joins from your query. I won't get too involved in the EAV vs Relational debate as this has been extensively discussed a number of times, and I have nothing more to add.
The final stage would be to add appropriate indexes, you are in the best decision to decide what is appropriate, I'd suggest you probably want indexes on at least the foreign keys in each table, possibly more.
EDIT
To get one row per forum_id you would need to use the following:
SELECT f.forum_id,
t.topic_id,
p.post_id AS `maxpostid`,
p.admin_user_id,
p.user_id,
MaxT.topicscount,
maxp.postcount,
CONCAT(au.firstname,' ',au.lastname) AS adminname,
fu.nick_name AS nickname
CONCAT(ce1.value,' ',ce2.value) AS fullname
FROM my_forums AS f
LEFT JOIN
( SELECT t.forum_id,
COUNT(DISTINCT t.topic_id) AS topicscount,
COUNT(*) AS postCount,
MAX(t.topic_ID) AS topic_id
FROM my_topics AS t
INNER JOIN my_posts AS p
ON p.topic_id = p.topic_id
GROUP BY t.forum_id
) AS MaxT
ON MaxT.forum_id = f.forum_id
LEFT JOIN my_topics AS t
ON t.topic_ID = Maxt.topic_ID
LEFT JOIN
( SELECT topic_id, MAX(post_id) AS post_id
FROM my_posts
GROUP BY topic_id
) AS Maxp
ON Maxp.topic_id = t.topic_id
LEFT JOIN my_posts AS p
ON p.post_id = Maxp.post_id
LEFT JOIN admin_user AS au
ON au.admin_user_id = p.admin_user_id
LEFT JOIN forum_user AS fu
ON fu.user_id = p.user_id
LEFT JOIN customer_entity_varchar AS ce1
ON ce1.entity_id = p.user_id
AND ce1.attribute_id = 1
LEFT JOIN customer_entity_varchar AS ce2
ON ce2.entity_id = p.user_id
AND ce2.attribute_id = 2
WHERE forum_status = '1';
I'm looking for help using sum() in my SQL query:
SELECT links.id,
count(DISTINCT stats.id) as clicks,
count(DISTINCT conversions.id) as conversions,
sum(conversions.value) as conversion_value
FROM links
LEFT OUTER JOIN stats ON links.id = stats.parent_id
LEFT OUTER JOIN conversions ON links.id = conversions.link_id
GROUP BY links.id
ORDER BY links.created desc;
I use DISTINCT because I'm doing "group by" and this ensures the same row is not counted more than once.
The problem is that SUM(conversions.value) counts the "value" for each row more than once (due to the group by)
I basically want to do SUM(conversions.value) for each DISTINCT conversions.id.
Is that possible?
I may be wrong but from what I understand
conversions.id is the primary key of your table conversions
stats.id is the primary key of your table stats
Thus for each conversions.id you have at most one links.id impacted.
You request is a bit like doing the cartesian product of 2 sets :
[clicks]
SELECT *
FROM links
LEFT OUTER JOIN stats ON links.id = stats.parent_id
[conversions]
SELECT *
FROM links
LEFT OUTER JOIN conversions ON links.id = conversions.link_id
and for each link, you get sizeof([clicks]) x sizeof([conversions]) lines
As you noted the number of unique conversions in your request can be obtained via a
count(distinct conversions.id) = sizeof([conversions])
this distinct manages to remove all the [clicks] lines in the cartesian product
but clearly
sum(conversions.value) = sum([conversions].value) * sizeof([clicks])
In your case, since
count(*) = sizeof([clicks]) x sizeof([conversions])
count(*) = sizeof([clicks]) x count(distinct conversions.id)
you have
sizeof([clicks]) = count(*)/count(distinct conversions.id)
so I would test your request with
SELECT links.id,
count(DISTINCT stats.id) as clicks,
count(DISTINCT conversions.id) as conversions,
sum(conversions.value)*count(DISTINCT conversions.id)/count(*) as conversion_value
FROM links
LEFT OUTER JOIN stats ON links.id = stats.parent_id
LEFT OUTER JOIN conversions ON links.id = conversions.link_id
GROUP BY links.id
ORDER BY links.created desc;
Keep me posted !
Jerome
Jeromes solution is actually wrong and can produce incorrect results!!
sum(conversions.value)*count(DISTINCT conversions.id)/count(*) as conversion_value
let's assume the following table
conversions
id value
1 5
1 5
1 5
2 2
3 1
the correct sum of value for distinct ids would be 8.
Jerome's formula produces:
sum(conversions.value) = 18
count(distinct conversions.id) = 3
count(*) = 5
18*3/5 = 9.6 != 8
For an explanation of why you were seeing incorrect numbers, read this.
I think that Jerome has a handle on what is causing your error. Bryson's query would work, though having that subquery in the SELECT could be inefficient.
Use the following query:
SELECT links.id
, (
SELECT COUNT(*)
FROM stats
WHERE links.id = stats.parent_id
) AS clicks
, conversions.conversions
, conversions.conversion_value
FROM links
LEFT JOIN (
SELECT link_id
, COUNT(id) AS conversions
, SUM(conversions.value) AS conversion_value
FROM conversions
GROUP BY link_id
) AS conversions ON links.id = conversions.link_id
ORDER BY links.created DESC
I use a subquery to do this. It eliminates the problems with grouping.
So the query would be something like:
SELECT COUNT(DISTINCT conversions.id)
...
(SELECT SUM(conversions.value) FROM ....) AS Vals
How about something like this:
select l.id, count(s.id) clicks, count(c.id) clicks, sum(c.value) conversion_value
from (SELECT l.id id, l.created created,
s.id clicks,
c.id conversions,
max(c.value) conversion_value
FROM links l
LEFT JOIN stats s ON l.id = s.parent_id
LEFT JOIN conversions c ON l.id = c.link_id
GROUP BY l.id, l.created, s.id, c.id) t
order by t.created
This will do the trick, just divide the sum with the count of conversation id which are duplicate.
SELECT a.id,
a.clicks,
SUM(a.conversion_value/a.conversions) AS conversion_value,
a.conversions
FROM (SELECT links.id,
COUNT(DISTINCT stats.id) AS clicks,
COUNT(conversions.id) AS conversions,
SUM(conversions.value) AS conversion_value
FROM links
LEFT OUTER JOIN stats ON links.id = stats.parent_id
LEFT OUTER JOIN conversions ON links.id = conversions.link_id
GROUP BY conversions.id,links.id
ORDER BY links.created DESC) AS a
GROUP BY a.id
Select sum(x.value) as conversion_value,count(x.clicks),count(x.conversions)
FROM
(SELECT links.id,
count(DISTINCT stats.id) as clicks,
count(DISTINCT conversions.id) as conversions,
conversions.value,
FROM links
LEFT OUTER JOIN stats ON links.id = stats.parent_id
LEFT OUTER JOIN conversions ON links.id = conversions.link_id
GROUP BY conversions.id) x
GROUP BY x.id
ORDER BY x.created desc;
I believe this will give you the answer that you are looking for.