How can I use MySQL to COUNT with a LEFT JOIN? - mysql

How can I use MySQL to count with a LEFT JOIN?
I have two tables. The Loves table sometimes has no ratings for a photo, so I think a LEFT JOIN is needed, but I also need a COUNT.
Photos
id name src
1 car bmw.jpg
2 bike baracuda.jpg
Loves (picid is foreign key with photos id)
id picid ratersip
4 1 81.0.0.0
6 1 84.0.0.0
7 2 81.0.0.0
Here the user can only rate one image with their IP.
I want to combine the two tables in order of the highest rating. New table
Combined
id name src picid
1 car bmw.jpg 1
2 bike baracuda.jpg 2
(bmw is highest rated)
My MySQL code:
SELECT * FROM photos
LEFT JOIN ON photos.id=loves.picid
ORDER BY COUNT (picid);
My PHP Code: (UPDATED AND ADDED - Working Example...)
$sqlcount = "SELECT p . *
FROM `pics` p
LEFT JOIN (
SELECT `loves`.`picid`, count( 1 ) AS piccount
FROM `loves`
GROUP BY `loves`.`picid`
)l ON p.`id` = l.`picid`
ORDER BY coalesce( l.piccount, 0 ) DESC";
$pics = mysql_query($sqlcount);

MySQL allows you to group by just the id column:
select
p.*
from
photos p
left join loves l on
p.id = l.picid
group by
p.id
order by
count(l.picid) desc
That being said, MySQL can be slow when grouping a joined result, so you can try computing the loves count in a subquery and joining to that instead:
select
p.*
from
photos p
left join (select picid, count(1) as piccount from loves group by picid) l on
p.id = l.picid
order by
coalesce(l.piccount, 0) desc
I don't have a MySQL instance to test out which is faster, so test them both.
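As a rough way to compare the two queries above, here is a minimal sketch that runs both against the sample data from the question. It assumes an in-memory SQLite database (via Python's sqlite3) as a stand-in for MySQL, and it adds a third, unrated photo (a made-up row) to show why the LEFT JOIN matters.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; table names follow the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE photos (id INTEGER PRIMARY KEY, name TEXT, src TEXT);
    CREATE TABLE loves  (id INTEGER PRIMARY KEY, picid INTEGER, ratersip TEXT);
    INSERT INTO photos VALUES (1, 'car', 'bmw.jpg'), (2, 'bike', 'baracuda.jpg'),
                              (3, 'boat', 'yacht.jpg');  -- hypothetical unrated photo
    INSERT INTO loves VALUES (4, 1, '81.0.0.0'), (6, 1, '84.0.0.0'), (7, 2, '81.0.0.0');
""")

# Variant 1: LEFT JOIN, then GROUP BY the photo id.
group_by_rows = conn.execute("""
    SELECT p.name, COUNT(l.picid) AS piccount
    FROM photos p
    LEFT JOIN loves l ON p.id = l.picid
    GROUP BY p.id
    ORDER BY piccount DESC
""").fetchall()

# Variant 2: pre-aggregate loves in a subquery, then LEFT JOIN to it.
subquery_rows = conn.execute("""
    SELECT p.name, COALESCE(l.piccount, 0) AS piccount
    FROM photos p
    LEFT JOIN (SELECT picid, COUNT(1) AS piccount FROM loves GROUP BY picid) l
           ON p.id = l.picid
    ORDER BY piccount DESC
""").fetchall()

print(group_by_rows)
print(subquery_rows)
```

Both variants keep the unrated photo with a count of 0. Which one is faster on a real MySQL server depends on the data and indexes, so timing them there is still the only way to know.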

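You can do this with a subquery. Count loves.picid rather than *, so that photos without any loves get 0 instead of 1:
SELECT id, name, src FROM (
SELECT photos.id, photos.name, photos.src, count(loves.picid) as the_count
FROM photos
LEFT JOIN loves ON photos.id=loves.picid
GROUP BY photos.id
) t
ORDER BY the_count DESC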

select
p.ID,
p.name,
p.src,
PreSum.LoveCount
from
Photos p
left join ( select L.picid,
count(*) as LoveCount
from
Loves L
group by
L.PicID ) PreSum
on p.id = PreSum.PicID
order by
PreSum.LoveCount DESC

I believe you just need to join the data and do a count(*) in your select. Make sure you specify which table you want to use for ambiguous columns. Also, don't forget to add a GROUP BY clause when you do a count(*). Here is an example query that I run on MS SQL.
Select CmsAgentInfo.LOGID, LOGNAME, hCmsAgent.SOURCEID, count(*) as COUNT from hCmsAgent
LEFT JOIN CmsAgentInfo on hCmsAgent.logid=CmsAgentInfo.logid
where SPLIT = '990'
GROUP BY CmsAgentInfo.LOGID, LOGNAME, hCmsAgent.SOURCEID
The results from this will look something like this.
77615 SMITH, JANE 1 36
29422 DOE, JOHN 1 648
Hope that helps. Good Luck.
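One thing to watch when combining count(*) with a LEFT JOIN: count(*) counts rows, so a left-side row with no match still counts as 1, while count(column) skips the NULLs the join produces. A small sketch of the difference, using the photos and loves tables from the question in an in-memory SQLite database (an assumption made for runnability; the third photo is made up):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE photos (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE loves  (id INTEGER PRIMARY KEY, picid INTEGER);
    INSERT INTO photos VALUES (1, 'car'), (2, 'bike'), (3, 'boat');  -- 'boat' has no loves
    INSERT INTO loves VALUES (4, 1), (6, 1), (7, 2);
""")

counts = conn.execute("""
    SELECT p.name,
           COUNT(*)        AS star_count,  -- counts joined rows, including the NULL-padded one
           COUNT(l.picid)  AS col_count    -- counts only rows where a love exists
    FROM photos p
    LEFT JOIN loves l ON p.id = l.picid
    GROUP BY p.id, p.name
    ORDER BY p.id
""").fetchall()

for row in counts:
    print(row)
```

For the unrated photo, star_count is 1 and col_count is 0, which is why counting a column from the right-hand table is usually what you want here.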

Related

Return the company with most film in a genre

I am working on a project at my university where I need to write a database query. I want the query to return the company with the most movies in a given genre. At the moment I have the query below, but it only returns one company, and there can be more than one with the same count.
SELECT CompanyID, CategoryID, COUNT(*) as NumberOfMovies
FROM Movie
NATURAL JOIN CategoryFilm
NATURAL JOIN Category
NATURAL JOIN Company
GROUP BY CategoryID, CompanyID
Order by NumberOfMovies DESC LIMIT 1
I believe I will need a "having" in here.
Please try this. The LIMIT 1 is probably the cause, since it returns only the first retrieved record:
SELECT CompanyID, CategoryID, COUNT(*) as NumberOfMovies
FROM Movie
NATURAL JOIN CategoryFilm
NATURAL JOIN Category
NATURAL JOIN Company
GROUP BY CategoryID, CompanyID
Order by NumberOfMovies DESC
I assume by "category" you mean "genre" -- or that they are the same thing.
Do not use NATURAL JOIN. It does not even use properly declared foreign key relationships, instead relying merely on name similarity between tables. It is dangerous because the columns used are not specified and can introduce hard-to-debug errors. I often refer to it as an "abomination" because it does not take table declarations into account.
If you have a given category, then I would expect a WHERE clause:
SELECT c.company_id, COUNT(*) as NumberOfMovies
FROM Movie m JOIN
CategoryFilm cf
ON cf.movie_id = m.movie_id JOIN
Company c
ON c.company_id = m.company_id
WHERE cf.category_id = ?
GROUP BY c.company_id
ORDER BY NumberOfMovies DESC
LIMIT 1;
If you want to allow ties, you can use window function rank():
select *
from (
select
co.companyID,
ca.categoryID,
count(*) NumberOfMovies,
rank() over(partition by ca.categoryID order by count(*) desc) rn
from movie m
inner join categoryFilm cf on cf.movieID = m.movieID
inner join category ca on ca.categoryID = cf.categoryID
inner join company co on co.companyID = m.companyID
group by co.companyID, ca.categoryID
) t
where rn = 1
order by categoryID
This gives you the top company for each and every category, ties included. If you want to filter on a given category, you can just add a where clause to the inner query.
Side note: do not use natural joins: they are error-prone. I rewrote the query to use inner joins instead (I made a few assumptions on the relations).
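To illustrate how rank() keeps ties, here is a minimal sketch using an in-memory SQLite database (3.25 or later, for window functions) in place of MySQL. The table and column names follow the query above, the category and company tables are left out for brevity, and the rows are invented so that two companies tie.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # needs SQLite 3.25+ for window functions
conn.executescript("""
    CREATE TABLE movie (movieID INTEGER PRIMARY KEY, companyID INTEGER);
    CREATE TABLE categoryFilm (movieID INTEGER, categoryID INTEGER);
    -- companies 1 and 2 each have two movies in category 10; company 3 has one
    INSERT INTO movie VALUES (1, 1), (2, 1), (3, 2), (4, 2), (5, 3);
    INSERT INTO categoryFilm VALUES (1, 10), (2, 10), (3, 10), (4, 10), (5, 10);
""")

top_companies = conn.execute("""
    SELECT companyID, categoryID, NumberOfMovies
    FROM (
        SELECT m.companyID, cf.categoryID, COUNT(*) AS NumberOfMovies,
               RANK() OVER (PARTITION BY cf.categoryID
                            ORDER BY COUNT(*) DESC) AS rn
        FROM movie m
        JOIN categoryFilm cf ON cf.movieID = m.movieID
        GROUP BY m.companyID, cf.categoryID
    ) t
    WHERE rn = 1               -- every company sharing the top count has rank 1
    ORDER BY categoryID, companyID
""").fetchall()

print(top_companies)
```

Companies 1 and 2 both come back, whereas ORDER BY ... LIMIT 1 would return only one of them.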

top 10 scorers by season

How can I get the top 10 scorers by season?
It should show last season's top 10 scorers.
I've tried a LEFT JOIN, but the result is broken: it shows 2 players and counts all goals for the first player.
My sqlfiddle:
http://sqlfiddle.com/#!9/b5d0a78/1
You got it almost right.
You want to group match_goals by player ID (match_player_id), but then you should not select goal_minute or any other per goal data.
After grouping by player, you can add a column for COUNT(match_player_id), which gives the number of goals. You can also use this column to order the results.
Your joins and conditions are correct I think.
EDIT
I think your schema needs a few tweaks: check this http://sqlfiddle.com/#!9/f5a75b/2
Basically create direct relations in the match_players and match_goals to the other tables.
I think the query you want looks like this:
SELECT p.*, count(*) as num_goals
FROM match_goals g INNER JOIN
match_players p
ON g.match_player_id = p.id INNER JOIN
matches m
ON m.id = p.match_id
WHERE p.is_deleted = 0 AND
g.is_own_goal = 0 AND
m.seasion_id = <last season id>
GROUP BY p.id
ORDER BY num_goals DESC
LIMIT 10;
Note that the teams table is not needed. The SELECT p.* is allowed because p.id (the GROUP BY key) is unique.
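A minimal sketch of the group-then-count idea, assuming a simplified, hypothetical version of the tables (the real schema lives in the linked fiddle, and the season filter is left out). It uses an in-memory SQLite database and LIMIT 2 instead of 10 to keep the sample small.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical, simplified tables; the real schema is in the linked fiddle.
    CREATE TABLE match_players (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE match_goals (id INTEGER PRIMARY KEY, match_player_id INTEGER,
                              goal_minute INTEGER, is_own_goal INTEGER);
    INSERT INTO match_players VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
    INSERT INTO match_goals VALUES
        (1, 1, 12, 0), (2, 1, 40, 0), (3, 1, 77, 0),
        (4, 2, 5, 0),  (5, 2, 63, 0),
        (6, 3, 30, 0), (7, 3, 88, 1);   -- the own goal is excluded below
""")

# One row per player: no per-goal columns (such as goal_minute) in the SELECT.
top_scorers = conn.execute("""
    SELECT p.name, COUNT(*) AS num_goals
    FROM match_goals g
    JOIN match_players p ON g.match_player_id = p.id
    WHERE g.is_own_goal = 0
    GROUP BY p.id, p.name
    ORDER BY num_goals DESC
    LIMIT 2
""").fetchall()

print(top_scorers)
```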

Query to select random values with inner join on three tables

I have a database with three tables:
person: id, bio, name
book: id, id_person, title, info
file: id, id_book, location
Other information: book has about 50,000 rows and file has about 300,000 rows.
What I'm trying to do is to select 12 different authors and select just one book and from that book select location from the table file.
What I tried is the following:
SELECT DISTINCT(`person`.`id`), `person`.`name`, `book`.`id`, `book`.`title`, `book`.`info`, `file`.`location`
FROM `person`
INNER JOIN `book`
ON `book`.`id_person` = `person`.`id`
INNER JOIN `file`
ON `file`.`id_book` = `book`.`id`
LIMIT 12
I have learned that DISTINCT does not work the way one might expect. Or am I missing something? The above code returns several books from the same author and then moves on to the next one, which is NOT what I want. I want 1 book from each of 12 different authors.
What would be the correct way to retrieve this information from the database? Also, I want to retrieve 12 random people, not people that are stored in consecutive order in the database. I could not formulate any query with rand() since I couldn't even get different authors.
I use MariaDB, and I would appreciate any help, especially help that lets me do this with good performance.
In MySQL, you can do this, in practice, using GROUP BY
SELECT p.`id`, p.`name`, b.`id`, b.`title`, b.`info`, f.`location`
FROM `person` p INNER JOIN
`book` b
ON b.`id_person` = p.`id` INNER JOIN
`file` f
ON f.id_book = b.id
GROUP BY p.id
ORDER BY rand()
LIMIT 12;
However, this is not guaranteed to return the non-id values from the same row (although it does in practice). And, although the authors are random, the books and locations are not.
The SQL Query to do this consistently is a bit more complicated:
SELECT p.`id`, p.`name`, b.`id`, b.`title`, b.`info`,
(SELECT f.location
FROM file f
WHERE f.id_book = b.id
ORDER BY rand()
LIMIT 1
) as location
FROM (SELECT p.*,
(SELECT b.id
FROM book b
WHERE b.id_person = p.id
ORDER BY rand()
LIMIT 1
) as book_id
FROM person p
ORDER BY rand()
LIMIT 12
) p INNER JOIN
book b
ON b.id = p.book_id ;
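Here is a small sketch of the same nested-subquery shape, run against an in-memory SQLite database as a stand-in for MariaDB (so random() replaces rand()). The sample rows are invented, and it picks 3 authors instead of 12.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE book (id INTEGER PRIMARY KEY, id_person INTEGER, title TEXT);
    CREATE TABLE file (id INTEGER PRIMARY KEY, id_book INTEGER, location TEXT);
""")
# Invented data: 5 authors, 2 books each, 2 files per book.
for pid in range(1, 6):
    conn.execute("INSERT INTO person VALUES (?, ?)", (pid, f"author{pid}"))
    for n in range(2):
        bid = pid * 10 + n
        conn.execute("INSERT INTO book VALUES (?, ?, ?)", (bid, pid, f"book{bid}"))
        for k in range(2):
            conn.execute("INSERT INTO file (id_book, location) VALUES (?, ?)",
                         (bid, f"/files/{bid}_{k}"))

picks = conn.execute("""
    SELECT p.id, b.id, b.id_person,
           (SELECT f.location FROM file f WHERE f.id_book = b.id
            ORDER BY random() LIMIT 1) AS location      -- one random file of that book
    FROM (SELECT p.id,
                 (SELECT b.id FROM book b WHERE b.id_person = p.id
                  ORDER BY random() LIMIT 1) AS book_id  -- one random book per author
          FROM person p
          ORDER BY random()
          LIMIT 3) p                                     -- three random authors
    JOIN book b ON b.id = p.book_id
""").fetchall()

for row in picks:
    print(row)
```

Each run returns different rows, but every row has a distinct author, a book by that author, and a file belonging to that book. Authors without any book would drop out at the inner join, so the result can hold fewer rows than the limit.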

Order by with distinct has an impact on performance

I have a data set of about 0.6 million rows. I am trying to sort it by timestamp, and because of a one-to-many relationship I have to use an INNER JOIN and DISTINCT.
My query is below:
SELECT DISTINCT p.id, s.subject, p.joining_time
FROM profile p
INNER JOIN profile_subject ps ON p.id=ps.profile_id
LEFT JOIN subject s ON ps.subject_id=s.id
ORDER BY p.joining_time LIMIT 20;
This query takes almost 28 seconds,
but without the ORDER BY clause it takes only 0.11 seconds.
How can I improve this query and still get the desired result?
My simplest suggestion is to put an index on profile(joining_time). Then select a certain number of the most recent in a subquery. For instance, if you are pretty confident that the top 20 rows you want are within the most recent 100 records in profile, then you can try this:
SELECT DISTINCT p.id, s.subject, p.joining_time
FROM (SELECT p.id, p.joining_time
FROM profile p
ORDER BY p.joining_time
LIMIT 100
) p INNER JOIN
profile_subject ps
ON p.id = ps.profile_id LEFT JOIN
subject s
ON ps.subject_id = s.id
ORDER BY p.joining_time
LIMIT 20;
I would also suggest that you remove the DISTINCT keyword. Unless you have duplicate subjects for one profile, it is not necessary. Similarly, it is hard to believe that the LEFT JOIN is necessary. In a well-structured database, there would be no subject_id values in profile_subject that are not in subject. So, try this:
SELECT p.id, s.subject, p.joining_time
FROM (SELECT p.id, p.joining_time
FROM profile p
ORDER BY p.joining_time
LIMIT 100
) p INNER JOIN
profile_subject ps
ON p.id = ps.profile_id JOIN
subject s
ON ps.subject_id = s.id
ORDER BY p.joining_time
LIMIT 20;
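As a sanity check on the pre-limiting idea, this sketch builds invented data in an in-memory SQLite database (standing in for MySQL) and confirms that limiting profile first gives the same top 20 rows as sorting the full join. The index name is made up, joining_time is stored as a plain integer, and s.subject is added to the ORDER BY only to make the comparison deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE profile (id INTEGER PRIMARY KEY, joining_time INTEGER);
    CREATE TABLE subject (id INTEGER PRIMARY KEY, subject TEXT);
    CREATE TABLE profile_subject (profile_id INTEGER, subject_id INTEGER);
    CREATE INDEX idx_profile_joining_time ON profile (joining_time);
    INSERT INTO subject VALUES (1, 'math'), (2, 'physics');
""")
for pid in range(1, 1001):
    # joining_time is unique per profile, so the sort order is unambiguous
    conn.execute("INSERT INTO profile VALUES (?, ?)", (pid, 5000 - pid))
    conn.execute("INSERT INTO profile_subject VALUES (?, 1)", (pid,))
    if pid % 2 == 0:
        conn.execute("INSERT INTO profile_subject VALUES (?, 2)", (pid,))

# Sort the whole joined result, then take 20 rows.
full_sort = conn.execute("""
    SELECT p.id, s.subject, p.joining_time
    FROM profile p
    JOIN profile_subject ps ON p.id = ps.profile_id
    JOIN subject s ON ps.subject_id = s.id
    ORDER BY p.joining_time, s.subject
    LIMIT 20
""").fetchall()

# Take the 100 earliest profiles via the index first, then join and sort those.
prelimited = conn.execute("""
    SELECT p.id, s.subject, p.joining_time
    FROM (SELECT id, joining_time FROM profile
          ORDER BY joining_time LIMIT 100) p
    JOIN profile_subject ps ON p.id = ps.profile_id
    JOIN subject s ON ps.subject_id = s.id
    ORDER BY p.joining_time, s.subject
    LIMIT 20
""").fetchall()

print(full_sort == prelimited)
```

The two results match only because the 100 pre-selected profiles produce at least 20 joined rows. If many profiles had no subjects, the inner limit would need to be larger.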

MySQL "Distinct" join super slow

I have the following query which gives me the right results. But it's super slow.
What makes it slow is the
AND a.id IN (SELECT id FROM st_address GROUP BY element_id)
part. The query should show from which countries we get how many orders.
A person can have multiple addresses, but in this case we only want one,
because otherwise the order is counted multiple times. Maybe there is a better way to achieve this? A distinct join on the person or something?
SELECT cou.title_en, COUNT(co.id), SUM(co.price) AS amount
FROM customer_order co
JOIN st_person p ON (co.person_id = p.id)
JOIN st_address a ON (co.person_id = a.element_id AND a.element_type_id = 1)
JOIN st_country cou ON (a.country_id = cou.id)
WHERE order_status_id != 7 AND a.id IN (SELECT id FROM st_address GROUP BY element_id)
GROUP BY cou.id
Have you tried to replace the IN with an EXISTS?
AND EXISTS (SELECT 1 FROM st_address b WHERE a.id = b.id)
The EXISTS part should stop the subquery as soon as the first row matching the condition is found. I have read conflicting comments on whether this actually happens, though, so you might add a LIMIT 1 in there to see if you get any gain.
I found a faster solution. The trick is a join with a sub query:
JOIN (SELECT element_id, country_id, id FROM st_address WHERE element_type_id = 1 GROUP BY
This is the complete query:
SELECT cou.title_en, COUNT(o.id), SUM(o.price) AS amount
FROM customer_order o
JOIN (SELECT element_id, country_id, id FROM st_address WHERE element_type_id = 1 GROUP BY element_id) AS a ON (o.person_id = a.element_id)
JOIN st_country cou ON (a.country_id = cou.id)
WHERE o.order_status_id != 7
GROUP BY cou.id
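To show why reducing st_address to one row per person matters, here is a sketch with invented rows in an in-memory SQLite database (standing in for MySQL). It is a variation on the query above, not the same query: it picks the address with the lowest id per person explicitly through MIN(id), because selecting non-grouped columns with GROUP BY element_id leaves the choice of row to MySQL and is rejected when ONLY_FULL_GROUP_BY is enabled. The element_type_id and order_status_id filters are left out for brevity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer_order (id INTEGER PRIMARY KEY, person_id INTEGER, price REAL);
    CREATE TABLE st_address (id INTEGER PRIMARY KEY, element_id INTEGER, country_id INTEGER);
    CREATE TABLE st_country (id INTEGER PRIMARY KEY, title_en TEXT);
    INSERT INTO st_country VALUES (1, 'Germany'), (2, 'France');
    -- person 1 has two addresses, person 2 has one
    INSERT INTO st_address VALUES (1, 1, 1), (2, 1, 1), (3, 2, 2);
    INSERT INTO customer_order VALUES (1, 1, 10.0), (2, 1, 20.0), (3, 2, 5.0);
""")

# Joining all addresses directly counts person 1's orders once per address.
naive = conn.execute("""
    SELECT cou.title_en, COUNT(o.id), SUM(o.price)
    FROM customer_order o
    JOIN st_address a ON o.person_id = a.element_id
    JOIN st_country cou ON a.country_id = cou.id
    GROUP BY cou.id, cou.title_en
    ORDER BY cou.title_en
""").fetchall()

# Reduce st_address to one row per person before joining.
deduped = conn.execute("""
    SELECT cou.title_en, COUNT(o.id), SUM(o.price)
    FROM customer_order o
    JOIN (SELECT a.element_id, a.country_id
          FROM st_address a
          JOIN (SELECT element_id, MIN(id) AS first_id
                FROM st_address GROUP BY element_id) f
            ON a.id = f.first_id) a ON o.person_id = a.element_id
    JOIN st_country cou ON a.country_id = cou.id
    GROUP BY cou.id, cou.title_en
    ORDER BY cou.title_en
""").fetchall()

print(naive)
print(deduped)
```

In the naive result, Germany shows 4 orders worth 60.0. After reducing to one address per person it shows the real 2 orders worth 30.0.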