SQL Query does not print 0 on UNION - mysql

I am new to SQL queries and I am working on a query to print the COUNT of movies an actor has appeared in, including a total of 0 if they have not appeared in any movies.
I am getting all results as expected except if an actor has not appeared in any movies then they do not appear in the list.
Here is what I have so far -
SELECT a.actor_name, count(*)
FROM ACTOR a
JOIN CAST_MEMBER c ON (a.actor_name = c.actor_name)
GROUP BY a.actor_name
UNION
SELECT c2.actor_name, 0
FROM ACTOR a2 JOIN CAST_MEMBER c2 ON (a2.actor_name = c2.actor_name)
WHERE a2.actor_name NOT IN
(
SELECT a3.actor_name
FROM ACTOR a3
JOIN CAST_MEMBER c3 ON (a3.actor_name = c3.actor_name)
)
EXPECTED OUTPUT
ACTOR_NAME COUNT(*)
-------------------------------------------------- ----------
Amy Adams 1
Brad Pitt 4
Christian Bale 1
Jennifer Aniston 0
Jennifer Lawrence 2
Leonardo DiCaprio 2
OUTPUT I HAVE
ACTOR_NAME COUNT(*)
-------------------------------------------------- ----------
Amy Adams 1
Brad Pitt 4
Christian Bale 1
Jennifer Lawrence 2
Leonardo DiCaprio 2
I would really appreciate it if someone could guide me in the right direction.
Thank you for all the help.

SELECT a.actor_name, count(*)
FROM ACTOR a
LEFT JOIN CAST_MEMBER c ON (a.actor_name = c.actor_name)
GROUP BY a.actor_name;
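You need to use a LEFT OUTER JOIN in order to show all rows from table ACTOR. Note that COUNT(*) also counts the NULL-filled row that an unmatched actor gets, so such an actor shows 1; use COUNT(c.actor_name) if you want 0 there.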
An outer join is a join that displays data from the same rows an inner
join does, but also adds data from rows that don’t necessarily have
matches in all the tables that are joined together. There are three
types of outer joins—LEFT, RIGHT, and FULL.
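To see how a LEFT JOIN interacts with COUNT, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for MySQL. The three sample rows and the movie_title column are assumptions made up for illustration; only the ACTOR and CAST_MEMBER table names and the actor_name column come from the question.

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL tables (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ACTOR (actor_name TEXT PRIMARY KEY);
    -- movie_title is a hypothetical column, added only to make rows distinct
    CREATE TABLE CAST_MEMBER (actor_name TEXT, movie_title TEXT);
    INSERT INTO ACTOR VALUES ('Amy Adams'), ('Brad Pitt'), ('Jennifer Aniston');
    INSERT INTO CAST_MEMBER VALUES
        ('Amy Adams', 'Movie A'),
        ('Brad Pitt', 'Movie A'),
        ('Brad Pitt', 'Movie B');
""")

query = """
    SELECT a.actor_name,
           COUNT(*)            AS count_star,     -- counts every joined row
           COUNT(c.actor_name) AS count_matches   -- skips NULLs from the outer join
    FROM ACTOR a
    LEFT JOIN CAST_MEMBER c ON a.actor_name = c.actor_name
    GROUP BY a.actor_name
    ORDER BY a.actor_name
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

The actor without any cast rows still appears once in the joined result, which is why COUNT(*) reports 1 for her while COUNT(c.actor_name) reports 0.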

ANSWER WITHOUT THE UNION
ACTOR_NAME COUNT(*)
-------------------------------------------------- ----------
Amy Adams 1
Brad Pitt 4
Christian Bale 1
Jennifer Aniston 1
Jennifer Lawrence 2
Leonardo DiCaprio 2
ANSWER WITH THE UNION
ACTOR_NAME COUNT(*)
-------------------------------------------------- ----------
Amy Adams 1
Brad Pitt 4
Christian Bale 1
Jennifer Aniston 0
Jennifer Lawrence 2
Leonardo DiCaprio 2
My error was in the joins. Now I have a better understanding of them.

Related

SQL query for duplicate rows based on 2 columns

I have 3 tables movie, rating and reviewer
movie has 4 columns movieID, title, year, director
rating has 4 columns reviewerID, movieID, stars, ratingDate
reviewer has 2 columns reviewerID, name
How do I query the reviewers who rated the same movie more than once and gave it a higher rating on the second review?
This is my attempt at a query to find rows with duplicate values in 2 columns (meaning the movie has been rated by the same reviewer more than once); I then somehow need to find the reviewers who gave more stars on the second review.
SELECT reviewer.name, movie.title, rating.stars, rating.ratingDate
FROM rating
INNER JOIN reviewer ON reviewer.reviewerID = rating.reviewerID
INNER JOIN movie ON movie.movieID = rating.movieID
WHERE rating.reviewerID IN (SELECT rating.reviewerID FROM rating GROUP BY rating.reviewerID, rating.movieID HAVING COUNT(*) > 1)
ORDER BY reviewer.name, rating.ratingDate;
movie table
| movieID | Title                   | Year | Director         |
| ------- | ----------------------- | ---- | ---------------- |
| 101     | Gone with the Wind      | 1939 | Victor Fleming   |
| 102     | Star Wars               | 1977 | George Lucas     |
| 103     | The Sound of Music      | 1965 | Robert Wise      |
| 104     | E.T.                    | 1982 | Steven Spielberg |
| 105     | Titanic                 | 1997 | James Cameron    |
| 106     | Snow White              | 1937 | null             |
| 107     | Avatar                  | 2009 | James Cameron    |
| 108     | Raiders of the Lost Ark | 1981 | Steven Spielberg |
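rating table
| reviewerID | movieID | Stars | ratingDate |
| ---------- | ------- | ----- | ---------- |
| 201        | 101     | 2     | 2011-01-22 |
| 201        | 101     | 4     | 2011-01-27 |
| 202        | 106     | 4     | null       |
| 203        | 103     | 2     | 2011-01-20 |
| 203        | 108     | 4     | 2011-01-12 |
| 203        | 108     | 2     | 2011-01-30 |
| 204        | 101     | 3     | 2011-01-09 |
| 205        | 103     | 3     | 2011-01-27 |
| 205        | 104     | 2     | 2011-01-22 |
| 205        | 108     | 4     | null       |
| 206        | 107     | 3     | 2011-01-15 |
| 206        | 106     | 5     | 2011-01-19 |
| 207        | 107     | 5     | 2011-01-20 |
| 208        | 104     | 3     | 2011-01-02 |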
reviewer table
| reviewerID | Name             |
| ---------- | ---------------- |
| 201        | Sarah Martinez   |
| 202        | Daniel Lewis     |
| 203        | Brittany Harris  |
| 204        | Mike Anderson    |
| 205        | Chris Jackson    |
| 206        | Elizabeth Thomas |
| 207        | James Cameron    |
| 208        | Ashley White     |
Expected result
| Reviewer       | Title              |
| -------------- | ------------------ |
| Sarah Martinez | Gone with the Wind |
EDIT: I am using MySQL version 8.0.29
Use:
select re.Name,mo.Title
FROM (
select reviewerID,movieID,ratingDate,Stars
from rating r
where exists (select 1
from rating r1
where r1.reviewerID=r.reviewerID
and r.movieID=r1.movieID
and r.ratingDate>r1.ratingDate
and r.Stars>r1.Stars
)) as t1
inner join movie mo on t1.movieID=mo.movieID
inner join reviewer re on t1.reviewerID=re.reviewerID
https://dbfiddle.uk/?rdbms=mysql_8.0&fiddle=0c5d850ee3393b054d9af4c4ac241d96
The key part is the EXISTS condition:
where exists (select 1
from rating r1
where r1.reviewerID=r.reviewerID
and r.movieID=r1.movieID
and r.ratingDate>r1.ratingDate
and r.Stars>r1.Stars
)
It keeps only the ratings for which the same reviewer has an earlier rating (by ratingDate) of the same movie with fewer Stars.
This way we don't need a WHERE ... IN subquery on rating plus a separate join with rating.
You can try the LEAD window function to get the next Stars value for each reviewerID and movieID pair (which identifies repeated ratings), ordered by ratingDate,
then compare it with the current value to find a newer rating with more stars than the older one.
SELECT DISTINCT r.Name,m.Title
FROM (
SELECT reviewerID,
movieID,
Stars,
LEAD(Stars) OVER(PARTITION BY reviewerID, movieID ORDER BY ratingDate) n_start
FROM rating
) t1
INNER JOIN movie m ON t1.movieID = m.movieID
INNER JOIN reviewer r ON r.reviewerID = t1.reviewerID
WHERE Stars < t1.n_start
The sample data in the sqlfiddle was provided by @ErgestBasha.
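Both approaches above (EXISTS and LEAD) can be checked side by side. The sketch below uses Python's built-in sqlite3 module as a stand-in for MySQL 8 and loads only a subset of the question's rows; the subset and the next_stars alias are choices made for this illustration. Window functions need SQLite 3.25 or newer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE movie (movieID INTEGER, title TEXT);
    CREATE TABLE reviewer (reviewerID INTEGER, name TEXT);
    CREATE TABLE rating (reviewerID INTEGER, movieID INTEGER,
                         stars INTEGER, ratingDate TEXT);
    -- Subset of the question's sample data
    INSERT INTO movie VALUES (101, 'Gone with the Wind'),
                             (103, 'The Sound of Music'),
                             (108, 'Raiders of the Lost Ark');
    INSERT INTO reviewer VALUES (201, 'Sarah Martinez'),
                                (203, 'Brittany Harris'),
                                (205, 'Chris Jackson');
    INSERT INTO rating VALUES
        (201, 101, 2, '2011-01-22'),
        (201, 101, 4, '2011-01-27'),
        (203, 103, 2, '2011-01-20'),
        (203, 108, 4, '2011-01-12'),
        (203, 108, 2, '2011-01-30'),
        (205, 108, 4, NULL);
""")

# EXISTS: keep a rating if an earlier, lower rating by the same
# reviewer for the same movie exists.
exists_rows = conn.execute("""
    SELECT re.name, mo.title
    FROM rating r
    JOIN movie mo ON mo.movieID = r.movieID
    JOIN reviewer re ON re.reviewerID = r.reviewerID
    WHERE EXISTS (SELECT 1
                  FROM rating r1
                  WHERE r1.reviewerID = r.reviewerID
                    AND r1.movieID = r.movieID
                    AND r1.ratingDate < r.ratingDate
                    AND r1.stars < r.stars)
""").fetchall()

# LEAD: look at the next rating (by date) within each reviewer/movie pair.
lead_rows = conn.execute("""
    SELECT DISTINCT re.name, mo.title
    FROM (SELECT reviewerID, movieID, stars,
                 LEAD(stars) OVER (PARTITION BY reviewerID, movieID
                                   ORDER BY ratingDate) AS next_stars
          FROM rating) t
    JOIN movie mo ON mo.movieID = t.movieID
    JOIN reviewer re ON re.reviewerID = t.reviewerID
    WHERE t.stars < t.next_stars
""").fetchall()

print(exists_rows)
print(lead_rows)
```

Brittany Harris rated movie 108 twice as well, but her second rating was lower, so neither query returns her. Rows with a NULL ratingDate never satisfy the date comparison in the EXISTS version.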

Find what genre is most frequent in each age category

I am thinking about making age groups per decade and finding out which genre is most frequent in each. It is more difficult than I expected, but here is what I have tried:
One table, called sell_log, looks like this:
id id_film id_cust
1 2 2
2 3 4
3 1 5
4 4 3
5 5 1
6 2 4
7 2 3
8 3 1
9 5 3
Second, here is a table of films with their ids and genres:
id_film genres
1 comedy
2 fantasy
3 sci-fi
4 drama
5 thriller
And the third table, customers, is this:
id_cust date_of_birth_cust
1 1992-03-12
2 1999-06-25
3 1986-01-14
4 1985-09-18
5 1992-05-19
This is the code I did:
select id_cust,date_of_birth_cust,
CASE
WHEN date_of_birth_cust > 1980-01-01 and date_of_birth_cust < 1990-01-01 then ##show genre##
WHEN date_of_birth_cust > 1990-01-01 and date_of_birth_cust < 2000-01-01 then ##show genre##
ELSE ##show genre##
END
from purchases
INNER JOIN (
select id_cust
FROM sell_log
group by id_cust
) customer.id_cust = sell_log.id_cust
What would the correct form be, in your opinion?
Expected results, for example:
for each age group, find the genre that was bought most often and show it for that group.
ages most frequent genre
from 1980 to 1990 comedy
from 1990 to 2000 fantasy
rest ages drama
Update:
doing the code in the answer gives this:
ages most_frequent_genre
from 1980 to 1989 Comedy
from 1990 to 1999 Thriller
from 1990 to 1999 Action
from 1990 to 1999 Comedy
rest Comedy
What am I doing wrong?
You can use a CTE to get the results per age and genre and then use it to get the maximum number of purchases per age. Finally join again to the CTE:
with cte as (
select
CASE
WHEN year(c.date_of_birth_cust) between 1980 and 1989 then 'from 1980 to 1989'
WHEN year(c.date_of_birth_cust) between 1990 and 1999 then 'from 1990 to 1999'
ELSE 'rest'
END ages,
f.genres,
count(*) counter
from sell_log s
inner join films f on f.id_film = s.id_film
inner join customers c on c.id_cust = s.id_cust
group by ages, f.genres
)
select c.ages, c.genres most_frequent_genre
from cte c inner join (
select c.ages, max(counter) counter
from cte c
group by c.ages
) g on g.ages = c.ages and g.counter = c.counter
order by c.ages
See the demo.
If your data contains ties, all tied genres will appear in the results.
Results:
| ages | most_frequent_genre |
| ----------------- | ------------------- |
| from 1980 to 1989 | fantasy |
| from 1990 to 1999 | comedy |
| rest | fantasy |
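To see how ties behave, here is a sketch that runs the same CTE idea against the three tables exactly as posted in the question (the demo's data may differ from those). It uses Python's built-in sqlite3 module as a stand-in for MySQL, so the year is extracted with strftime instead of YEAR(); the films and customers table names follow the answer's query, and the labelled CTE is a helper name introduced for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sell_log (id INTEGER, id_film INTEGER, id_cust INTEGER);
    CREATE TABLE films (id_film INTEGER, genres TEXT);
    CREATE TABLE customers (id_cust INTEGER, date_of_birth_cust TEXT);
    INSERT INTO sell_log VALUES (1,2,2),(2,3,4),(3,1,5),(4,4,3),(5,5,1),
                                (6,2,4),(7,2,3),(8,3,1),(9,5,3);
    INSERT INTO films VALUES (1,'comedy'),(2,'fantasy'),(3,'sci-fi'),
                             (4,'drama'),(5,'thriller');
    INSERT INTO customers VALUES (1,'1992-03-12'),(2,'1999-06-25'),
                                 (3,'1986-01-14'),(4,'1985-09-18'),
                                 (5,'1992-05-19');
""")

query = """
    WITH labelled AS (      -- one row per sale, tagged with an age group
        SELECT CASE
                 WHEN CAST(strftime('%Y', c.date_of_birth_cust) AS INTEGER)
                      BETWEEN 1980 AND 1989 THEN 'from 1980 to 1989'
                 WHEN CAST(strftime('%Y', c.date_of_birth_cust) AS INTEGER)
                      BETWEEN 1990 AND 1999 THEN 'from 1990 to 1999'
                 ELSE 'rest'
               END AS ages,
               f.genres
        FROM sell_log s
        JOIN films f ON f.id_film = s.id_film
        JOIN customers c ON c.id_cust = s.id_cust
    ),
    cte AS (                -- purchases per age group and genre
        SELECT ages, genres, COUNT(*) AS counter
        FROM labelled
        GROUP BY ages, genres
    )
    SELECT c.ages, c.genres AS most_frequent_genre
    FROM cte c
    JOIN (SELECT ages, MAX(counter) AS counter
          FROM cte
          GROUP BY ages) g
      ON g.ages = c.ages AND g.counter = c.counter
    ORDER BY c.ages, c.genres
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

With the posted rows, the 1980s group has a single winner, while every genre bought by the 1990s group was bought exactly once, so all four come back. If only one row per group is wanted, some tie-breaking rule has to be chosen explicitly.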

Get movies in order of maximum common genres, then keywords

I have the following table Movies:
id | title | year
315 Harry Potter and the Deathly Hallows: Part 2 2011
407 Cinderella 2015
826 The Shape of Water 2017
799 Enchanted 2007
523 How to Train Your Dragon 2010
618 Crazy Rich Asians 2018
and the table Genres:
movie_id | genre
315 adventure
315 fantasy
315 mystery
315 drama
407 drama
407 fantasy
826 drama
826 thriller
826 adventure
826 horror
799 fantasy
799 comedy
799 romance
523 drama
523 fantasy
618 romance
618 comedy
and the table keyword:
movie_id | keyword
315 magic
315 wizards
315 witch
315 friendship
315 abuse
407 prince
407 fairy tale
407 magic
407 poor girl
407 abuse
826 scientist
826 mute
826 friendship
799 musical
799 magic
799 witch
799 friendship
523 viking
523 boy
523 fire
618 singapore
618 wedding
618 money
I am trying to construct a query which outputs all the movies which have genres in common to a given movie. If there are movies which have the same number of common genres, then I want to rank those movies by the order of maximum common keywords.
E.g. If the movie was 'Harry Potter and the Deathly Hallows: Part 2', then the output of the query would be:
title | genre_frequency | keyword_frequency
Cinderella 2 2
The Shape of Water 2 1
How to Train Your Dragon 2 0
Enchanted 1 3
Movies that don't have any genres that are common with the specified movie are not included in the output (e.g. Crazy Rich Asians).
I have two queries that can give me the genre_frequency and keyword_frequency.
select m.*, genre_frequency from movie m
join (
select m.id, count(*) as genre_frequency
from movie m
join genre g on m.id=g.movie_id
where g.genre in (select g1.genre
from genre g1
where g1.movie_id=315)
group by m.id
) f
on m.id=f.id
where m.id <> 315
order by f.genre_frequency desc;
select m.*, keyword_frequency from movie m
join (
select m.id, count(*) as keyword_frequency
from movie m
join keyword k on m.id=k.movie_id
where k.keyword in (select k1.keyword
from keyword k1
where k1.movie_id=315)
group by m.id
) f
on m.id=f.id
where m.id <> 315
order by f.keyword_frequency desc;
The problem is that I want to combine the two queries above into a single query so that I can get the output table shown above. I am not sure how I can do this. Any insights are appreciated.
You can use UNION ALL to combine the Genres and keyword tables, adding a grp column that marks which table each row came from, and then use conditional aggregation to count each part separately.
Query #1
select m.title,
count(CASE WHEN t1.grp = 'g' THEN 1 END) as genre_frequency,
count(CASE WHEN t1.grp = 'k' THEN 1 END) as keyword_frequency
from Movies m
join (
SELECT movie_id,genre name,'g' grp
FROM Genres
UNION ALL
SELECT movie_id,keyword,'k' grp
FROM keyword
) t1 on m.id=t1.movie_id
where (t1.name in (select g1.genre
from Genres g1
where g1.movie_id=315) or
t1.name in (select k1.keyword
from keyword k1
where k1.movie_id=315))
AND m.id <> 315
group by m.title;
| title | genre_frequency | keyword_frequency |
| ------------------------ | --------------- | ----------------- |
| Cinderella | 2 | 2 |
| Enchanted | 1 | 3 |
| How to Train Your Dragon | 2 | 0 |
| The Shape of Water | 2 | 1 |
View on DB Fiddle
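The conditional-aggregation idea can also produce the ranking asked for in the question. The sketch below is one possible variant, not the answer's exact query: it pairs each IN check with its grp value, adds a HAVING clause to drop movies with no common genre, and sorts by both frequencies. It runs on Python's built-in sqlite3 module as a stand-in for MySQL; the :target parameter name is an assumption for this illustration.

```python
import sqlite3

movies = [(315, "Harry Potter and the Deathly Hallows: Part 2", 2011),
          (407, "Cinderella", 2015), (826, "The Shape of Water", 2017),
          (799, "Enchanted", 2007), (523, "How to Train Your Dragon", 2010),
          (618, "Crazy Rich Asians", 2018)]
genres = {315: ["adventure", "fantasy", "mystery", "drama"],
          407: ["drama", "fantasy"],
          826: ["drama", "thriller", "adventure", "horror"],
          799: ["fantasy", "comedy", "romance"],
          523: ["drama", "fantasy"],
          618: ["romance", "comedy"]}
keywords = {315: ["magic", "wizards", "witch", "friendship", "abuse"],
            407: ["prince", "fairy tale", "magic", "poor girl", "abuse"],
            826: ["scientist", "mute", "friendship"],
            799: ["musical", "magic", "witch", "friendship"],
            523: ["viking", "boy", "fire"],
            618: ["singapore", "wedding", "money"]}

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Movies (id INTEGER, title TEXT, year INTEGER);
    CREATE TABLE Genres (movie_id INTEGER, genre TEXT);
    CREATE TABLE keyword (movie_id INTEGER, keyword TEXT);
""")
conn.executemany("INSERT INTO Movies VALUES (?, ?, ?)", movies)
conn.executemany("INSERT INTO Genres VALUES (?, ?)",
                 [(m, g) for m, gs in genres.items() for g in gs])
conn.executemany("INSERT INTO keyword VALUES (?, ?)",
                 [(m, k) for m, ks in keywords.items() for k in ks])

query = """
    SELECT m.title,
           COUNT(CASE WHEN t.grp = 'g' THEN 1 END) AS genre_frequency,
           COUNT(CASE WHEN t.grp = 'k' THEN 1 END) AS keyword_frequency
    FROM Movies m
    JOIN (SELECT movie_id, genre AS name, 'g' AS grp FROM Genres
          UNION ALL
          SELECT movie_id, keyword, 'k' FROM keyword) t
      ON t.movie_id = m.id
    WHERE m.id <> :target
      AND ((t.grp = 'g' AND t.name IN
              (SELECT genre FROM Genres WHERE movie_id = :target))
        OR (t.grp = 'k' AND t.name IN
              (SELECT keyword FROM keyword WHERE movie_id = :target)))
    GROUP BY m.id, m.title
    -- drop movies that only share keywords, not genres
    HAVING COUNT(CASE WHEN t.grp = 'g' THEN 1 END) > 0
    ORDER BY genre_frequency DESC, keyword_frequency DESC
"""
ranked = conn.execute(query, {"target": 315}).fetchall()
for row in ranked:
    print(row)
```

Tying each IN check to its grp value keeps a genre name from being counted as a keyword match (or the reverse) if the two vocabularies ever overlap.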
The query below first gets all movies and inner joins them with the movies that have genres in common with the movie you are looking for. This gets rid of any movies that share no genres with the target movie.
I am using your genre frequency query as a derived table in this case. I also removed the IN clause from the WHERE clause and used another inner join, which may perform better.
The second derived table, joined with LEFT JOIN, is the query you used to get the keyword frequency. The same logic applies as for the genre frequency table; the only difference is the LEFT JOIN, since two movies can have genres in common but no keywords.
Notice the IFNULL calls in the SELECT clause, so that we return 0 if no common keywords are found.
At the end, we just sort first by genre frequency and then by keyword frequency, in descending order.
select m.title, IFNULL(g_fq.genre_frequency,0) AS genre_frequency,
IFNULL(k_fq.keyword_frequency,0) AS keyword_frequency
FROM movie m
INNER JOIN
(select m.id as movie_id, genre_frequency from movie m
join (
select m.id, count(*) as genre_frequency
from movie m
join genre g on m.id=g.movie_id
INNER JOIN
(select g1.genre
from genre g1
where g1.movie_id=315) as a on a.genre=g.genre
group by m.id
) f
on m.id=f.id
where m.id <> 315
) as g_fq ON m.id=g_fq.movie_id
LEFT JOIN
(
select m.id as movie_id, keyword_frequency from movie m
join (
select m.id, count(*) as keyword_frequency
from movie m
join keyword k on m.id=k.movie_id
INNER JOIN
(select k1.keyword
from keyword k1
where k1.movie_id=315) as b on b.keyword=k.keyword
group by m.id
) f
on m.id=f.id
where m.id <> 315
) as k_fq on m.id=k_fq.movie_id
order by IFNULL(g_fq.genre_frequency,0) DESC,IFNULL(k_fq.keyword_frequency,0) DESC

SQL Inner Join statement not giving desired results

Hopefully someone can give me a hand here. I have the following two tables:
Table: locations
location_id user_id city state
1 1 Los Angeles CA
2 1 New York NY
3 1 Chicago IL
4 2 Dallas TX
5 3 Denver CO
6 4 Miami FL
7 5 Atlanta GA
Table: events
event_id user_id event_name event_date
1 1 My Event 1 2017-02-01
2 2 My Event 2 2017-03-01
3 3 My Event 3 2017-04-01
4 4 My Event 4 2017-05-01
5 5 My Event 5 2017-06-01
I am running the following query:
SELECT e.event_id, e.user_id, e.event_name, e.event_date,
l.user_id, l.city, l.state
FROM events e
INNER JOIN locations l
ON e.user_id = l.user_id
ORDER BY e.event_date ASC
I am trying to get JUST the records in the events table, but also pull the corresponding city and state that match the user_id that both tables have in common. The output should be:
event_id user_id event_name event_date city state
1 1 My Event 1 2017-02-01 Los Angeles CA
2 2 My Event 2 2017-03-01 Dallas TX
3 3 My Event 3 2017-04-01 Denver CO
4 4 My Event 4 2017-05-01 Miami FL
5 5 My Event 5 2017-06-01 Atlanta GA
Can anyone point me to my error in the SQL statement?
You never gave us the logic for deciding which location to choose for a given user. One approach would be to take the minimum location_id associated with a given user:
SELECT t1.*,
COALESCE(t2.city, 'NA'),
COALESCE(t2.state, 'NA')
FROM events t1
LEFT JOIN locations t2
ON t1.user_id = t2.user_id
INNER JOIN
(
SELECT user_id, MIN(location_id) AS min_location_id
FROM locations
GROUP BY user_id
) t3
ON t2.user_id = t3.user_id AND
t2.location_id = t3.min_location_id
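The "minimum location_id per user" rule can be tried out with the question's own data. This is a sketch on Python's built-in sqlite3 module as a stand-in for MySQL; the first_loc alias is a name chosen for this illustration. Because the join to the per-user minimum filters on location columns, events without any location would drop out either way, so the sketch uses plain inner joins.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE locations (location_id INTEGER, user_id INTEGER,
                            city TEXT, state TEXT);
    CREATE TABLE events (event_id INTEGER, user_id INTEGER,
                         event_name TEXT, event_date TEXT);
    INSERT INTO locations VALUES
        (1, 1, 'Los Angeles', 'CA'), (2, 1, 'New York', 'NY'),
        (3, 1, 'Chicago', 'IL'),     (4, 2, 'Dallas', 'TX'),
        (5, 3, 'Denver', 'CO'),      (6, 4, 'Miami', 'FL'),
        (7, 5, 'Atlanta', 'GA');
    INSERT INTO events VALUES
        (1, 1, 'My Event 1', '2017-02-01'), (2, 2, 'My Event 2', '2017-03-01'),
        (3, 3, 'My Event 3', '2017-04-01'), (4, 4, 'My Event 4', '2017-05-01'),
        (5, 5, 'My Event 5', '2017-06-01');
""")

query = """
    SELECT e.event_id, e.user_id, e.event_name, e.event_date, l.city, l.state
    FROM events e
    JOIN (SELECT user_id, MIN(location_id) AS min_location_id
          FROM locations
          GROUP BY user_id) first_loc        -- one location id per user
      ON first_loc.user_id = e.user_id
    JOIN locations l
      ON l.location_id = first_loc.min_location_id
    ORDER BY e.event_date ASC
"""
event_rows = conn.execute(query).fetchall()
for row in event_rows:
    print(row)
```

Each event now appears once, and user 1 is paired with Los Angeles only because that row happens to have the lowest location_id. Any other rule (latest location, a primary flag) would need its own column to select on.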
That's not possible as stated. The issue is that user 1 has 3 rows in the locations table, so which city to select is the big question.
1 1 Los Angeles CA
2 1 New York NY
3 1 Chicago IL

multiple subquery result into main query, select in select

OK, I have 2 tables
user_id login_history
1 2011-01-01
1 2011-01-02
1 2011-03-05
1 2011-04-05
1 2011-06-07
2 2011-01-01
2 2011-01-02
3 2011-03-05
3 2011-04-05
3 2011-06-07
user_id user_details
1 Jack
2 Jeff
3 Irin
What kind of query can I use to get a result like
1. Jack 2011-01-01 2011-01-02 2011-03-05
2. Jeff 2011-01-01 2011-01-02
3. Irin 2011-03-05 2011-04-05 2011-06-07
Basically I want the latest 3 records from table one, joined with table 2.
The query I used gets me the list below, with one record per row:
Jack ,2011-01-01
Jack ,2011-01-02
Jack ,2011-03-05
Jeff ,2011-01-01
Jeff ,2011-01-02
Irin ,2011-03-05
Irin ,2011-04-05
Irin ,2011-06-07
Please help
select t2.user_details,
substring_index(group_concat(login_history order by login_history separator ' '),' ',3) as recents
from table_2 as t2
left join table_1 as t1
on t1.user_id = t2.user_id
group by t2.user_id
In your example you list the first three records, not the last three. If you need the latest ones, just add DESC to the ORDER BY clause within GROUP_CONCAT.
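For the "latest three per user" reading, here is a sketch of a different technique from the GROUP_CONCAT query above: ROW_NUMBER() ranks each user's logins from newest to oldest, and the rows are then folded into one list per user in application code. It uses Python's built-in sqlite3 module as a stand-in for MySQL 8 (window functions need SQLite 3.25 or newer); the ranked alias and the rn column are names chosen for this illustration, and unlike the LEFT JOIN above, the inner join here omits users with no logins.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table_1 (user_id INTEGER, login_history TEXT);
    CREATE TABLE table_2 (user_id INTEGER, user_details TEXT);
    INSERT INTO table_1 VALUES
        (1, '2011-01-01'), (1, '2011-01-02'), (1, '2011-03-05'),
        (1, '2011-04-05'), (1, '2011-06-07'),
        (2, '2011-01-01'), (2, '2011-01-02'),
        (3, '2011-03-05'), (3, '2011-04-05'), (3, '2011-06-07');
    INSERT INTO table_2 VALUES (1, 'Jack'), (2, 'Jeff'), (3, 'Irin');
""")

rows = conn.execute("""
    SELECT t2.user_details, ranked.login_history
    FROM table_2 AS t2
    JOIN (SELECT user_id, login_history,
                 ROW_NUMBER() OVER (PARTITION BY user_id
                                    ORDER BY login_history DESC) AS rn
          FROM table_1) AS ranked
      ON ranked.user_id = t2.user_id
    WHERE ranked.rn <= 3                 -- keep the three newest logins
    ORDER BY t2.user_id, ranked.login_history DESC
""").fetchall()

# Fold the vertical rows into one list of dates per user.
latest = {}
for name, login in rows:
    latest.setdefault(name, []).append(login)

for name, logins in latest.items():
    print(name, " ".join(logins))
```

Jeff has only two logins, so his list is simply shorter than the others.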