I have a table
Films
id release_year category_id rating
1 2015 1 8
2 2015 2 8.5
3 2015 3 9
4 2016 2 8.2
5 2016 1 8.4
6 2017 2 7
I want to add a new column "avg_better_films" that holds the average rating of all better-rated films from the same release year.
the output should be
id release_year category_id rating avg_better_films
1 2015 1 8 8.75
2 2015 2 8.5 9
3 2015 3 9 Not Available
4 2016 2 8.2 8.4
5 2016 1 8.4 Not Available
6 2017 2 7 Not Available
As you can see, when a film has the best rating in its release year,
the column shows "Not Available".
Do you know how to get this output in MySQL?
We can use a Correlated Subquery to calculate the Average rating for the "better films" for the same year.
Subquery would return null in case of no better films for a specific film and year.
We can then use Coalesce() function to handle the case when there is no "better film" found for the year.
Thanks to @Strawberry in the comments: we need to add + 0 to the result of the subquery so that MySQL treats it as a number.
Try the following:
SELECT
t1.id,
t1.release_year,
t1.category_id,
t1.rating,
COALESCE(
(
SELECT AVG(t2.rating)
FROM Films AS t2
WHERE t2.rating > t1.rating AND -- higher rating films
t2.release_year = t1.release_year -- films from same year
) + 0,
'Not Available' -- handle null result of subquery
) AS avg_better_films
FROM Films AS t1
SQL Fiddle: http://sqlfiddle.com/#!9/4960ea/3
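The correlated subquery above can be tried without a MySQL server. The following is only a sketch that uses Python's built-in sqlite3 module as a stand-in, so it is an assumption that SQLite behaves closely enough for this query; the `+ 0` is dropped because it addresses MySQL's typing, not SQLite's. The names `FILMS`, `QUERY` and `avg_better_films()` are made up for the illustration.

```python
import sqlite3

# Sample rows from the question: (id, release_year, category_id, rating)
FILMS = [
    (1, 2015, 1, 8.0),
    (2, 2015, 2, 8.5),
    (3, 2015, 3, 9.0),
    (4, 2016, 2, 8.2),
    (5, 2016, 1, 8.4),
    (6, 2017, 2, 7.0),
]

QUERY = """
SELECT t1.id,
       COALESCE(
           (SELECT AVG(t2.rating)
            FROM Films AS t2
            WHERE t2.rating > t1.rating                 -- higher rated films
              AND t2.release_year = t1.release_year),   -- from the same year
           'Not Available'                              -- no better film found
       ) AS avg_better_films
FROM Films AS t1
ORDER BY t1.id
"""


def avg_better_films():
    """Load the sample rows into an in-memory database and run the query."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE Films (id INTEGER PRIMARY KEY, release_year INTEGER, "
        "category_id INTEGER, rating REAL)"
    )
    conn.executemany("INSERT INTO Films VALUES (?, ?, ?, ?)", FILMS)
    rows = conn.execute(QUERY).fetchall()
    conn.close()
    return rows


for film_id, avg in avg_better_films():
    print(film_id, avg)
```

Films 3, 5 and 6 have no better film in their year, so the subquery yields NULL for them and COALESCE substitutes the text.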
You can try using a scalar subquery:
select a.*, (select avg(b.rating) from Films b where b.release_year = a.release_year and b.rating > a.rating) as avg_film_Rating
from Films a
This is tricky, because you need to output the results as a string, not a number. If any value is a string, all must be.
I would recommend a correlated subquery that looks like:
SELECT f.*,
(SELECT COALESCE(FORMAT(AVG(f2.rating), 2), 'Not Available')
FROM Films f2
WHERE f2.rating > f.rating AND
f2.release_year = f.release_year
) AS avg_better_films
FROM Films f
Notes:
Table aliases should be abbreviations for the table name.
If no rows match the WHERE clause in the subquery, then AVG() returns NULL. Hence, the COALESCE() can be in the subquery.
FORMAT() is used to convert the average rating to a string.
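The point that the whole column must be a string can be checked with a small sketch. It again uses Python's sqlite3 module as a stand-in for MySQL, which is an assumption; since SQLite has no MySQL-style FORMAT(), the sketch swaps in CAST(ROUND(...) AS TEXT), so the number of decimals shown differs from what FORMAT(..., 2) would give. `string_averages()` is a hypothetical helper name.

```python
import sqlite3

# Sample rows from the question: (id, release_year, category_id, rating)
FILMS = [
    (1, 2015, 1, 8.0),
    (2, 2015, 2, 8.5),
    (3, 2015, 3, 9.0),
    (4, 2016, 2, 8.2),
    (5, 2016, 1, 8.4),
    (6, 2017, 2, 7.0),
]

# COALESCE sits inside the subquery: an aggregate without GROUP BY always
# returns exactly one row, whose AVG() is NULL when nothing matched.
QUERY = """
SELECT f.id,
       (SELECT COALESCE(CAST(ROUND(AVG(f2.rating), 2) AS TEXT), 'Not Available')
        FROM Films AS f2
        WHERE f2.rating > f.rating
          AND f2.release_year = f.release_year
       ) AS avg_better_films
FROM Films AS f
ORDER BY f.id
"""


def string_averages():
    """Return (id, avg_better_films) pairs where the second item is text."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE Films (id INTEGER PRIMARY KEY, release_year INTEGER, "
        "category_id INTEGER, rating REAL)"
    )
    conn.executemany("INSERT INTO Films VALUES (?, ?, ?, ?)", FILMS)
    rows = conn.execute(QUERY).fetchall()
    conn.close()
    return rows


for film_id, avg in string_averages():
    print(film_id, repr(avg))
```

Every value in the second column comes back as a Python str, which mirrors the single string type the answer describes.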
Related
I'm using MySQL. I have two tables: one holds movie types, and the other holds movie ratings with timestamps. I want to join the two tables on movie id to compute the average rating for each movie type. I'm trying to extract only the movie types that have at least 10 ratings per film, counting only ratings made in December, ordered from highest to lowest average rating.
Table 'types'
movieId type
1 Drama
2 Adventure
3 Comedy
... ...
Table 'ratings'
movieId rating timestamp
1 1 851786086
2 1.5 1114306148
1 2 1228946388
3 2 850723898
1 2.5 1167422234
2 2.5 1291654669
1 3 851345204
2 3 944978286
3 3 965088579
3 3 1012598088
1 3.5 1291598726
1 4 1291779829
1 4 850021197
2 4 945362514
1 4.5 1072836909
1 5 881166397
1 5 944892273
2 5 1012598088
... ... ...
Expected result (number of ratings >= 10 and rating given in December):
type Avg_Rating
Drama 3.45
I'm trying to write the query like below, but I'm not able to execute it (there are around 10 thousand rows in the original table).
Where should I adjust my query?
SELECT DISTINCT T.type, AVG(R.rating) FROM types AS T
INNER JOIN ratings AS R ON T.movieId = R.movieId
WHERE R.timestamp LIKE (
SELECT FROM_UNIXTIME(R.timestamp,'%M') AS Month FROM ratings
GROUP BY Month
HAVING Month = 'December')
GROUP BY T.type
HAVING COUNT(R.rating) >=10
ORDER BY AVG(R.rating) DESC;
I see two problems:
timestamp LIKE - what's that supposed to do?
and
an inner query with GROUP BY but without any aggregation. Perhaps you meant WHERE? In any case you don't need it at all: just do the same December check directly on timestamp, without LIKE and without a subquery.
SELECT DISTINCT T.type, AVG(R.rating) FROM
types AS T INNER JOIN ratings AS R
ON T.movieId = R.movieId
WHERE FROM_UNIXTIME(R.timestamp,'%M') = 'December'
GROUP BY T.type
HAVING COUNT(R.rating) >=10
ORDER BY AVG(R.rating) DESC;
You can try the following query.
SELECT DISTINCT T.type, AVG(R.rating) FROM types AS T
INNER JOIN ratings AS R ON T.movieId = R.movieId
GROUP BY T.type
HAVING
COUNT(R.rating) >= 10 -- have 10 or more rating records
AND SUM(MONTH(FROM_UNIXTIME(R.timestamp)) = 12) > 0 -- have at least one rating in December
ORDER BY AVG(R.rating) DESC;
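The two answers above do not mean the same thing: filtering in WHERE averages only the December ratings, while the HAVING version averages all ratings of any type that has at least one December rating. A small sketch of the difference follows, using Python's sqlite3 module as a stand-in for MySQL (an assumption), with SQLite's strftime(..., 'unixepoch') in place of FROM_UNIXTIME(). The data is invented for the illustration and is not the data from the question.

```python
import sqlite3
from datetime import datetime, timezone


def ts(year, month, day):
    """Unix timestamp for midnight UTC on the given date."""
    return int(datetime(year, month, day, tzinfo=timezone.utc).timestamp())


def build_db():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE types (movieId INTEGER, type TEXT)")
    conn.execute("CREATE TABLE ratings (movieId INTEGER, rating REAL, timestamp INTEGER)")
    conn.executemany("INSERT INTO types VALUES (?, ?)", [(1, "Drama"), (2, "Comedy")])
    ratings = []
    # Drama: ten December ratings (five 3.0 and five 4.0) plus two June ratings.
    for day in range(1, 11):
        ratings.append((1, 3.0 if day % 2 else 4.0, ts(2010, 12, day + 5)))
    ratings += [(1, 1.0, ts(2010, 6, 15)), (1, 1.0, ts(2011, 6, 15))]
    # Comedy: twelve ratings of 2.0, only one of them in December.
    ratings.append((2, 2.0, ts(2010, 12, 15)))
    for day in range(1, 12):
        ratings.append((2, 2.0, ts(2010, 3, day + 5)))
    conn.executemany("INSERT INTO ratings VALUES (?, ?, ?)", ratings)
    return conn


# December check in WHERE: only December ratings are counted and averaged.
WHERE_VERSION = """
SELECT T.type, AVG(R.rating) AS avg_rating
FROM types AS T
INNER JOIN ratings AS R ON T.movieId = R.movieId
WHERE strftime('%m', R.timestamp, 'unixepoch') = '12'
GROUP BY T.type
HAVING COUNT(R.rating) >= 10
ORDER BY avg_rating DESC
"""

# December check in HAVING: all ratings are counted and averaged.
HAVING_VERSION = """
SELECT T.type, AVG(R.rating) AS avg_rating
FROM types AS T
INNER JOIN ratings AS R ON T.movieId = R.movieId
GROUP BY T.type
HAVING COUNT(R.rating) >= 10
   AND SUM(strftime('%m', R.timestamp, 'unixepoch') = '12') > 0
ORDER BY avg_rating DESC
"""


def run(query):
    conn = build_db()
    rows = conn.execute(query).fetchall()
    conn.close()
    return rows


print(run(WHERE_VERSION))
print(run(HAVING_VERSION))
```

With this data the WHERE version returns only Drama, while the HAVING version also returns Comedy, so which one is right depends on what "ratings made in December" is meant to restrict.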
I need to find the most popular name per year from the below data based on the combined total count for a name each year. Note there can be multiple entries per year (as seen below).
ID person_name total_count person_year
1 MIKE 1 2006
2 MIKE 2 2007
3 MIKE 4 2007
4 MIKE 3 2008
5 TED 1 2006
6 TED 2 2007
7 TED 4 2008
8 TED 7 2008
9 MOOKIE 1 2006
10 MOOKIE 12 2006
11 MOOKIE 5 2007
12 MOOKIE 3 2008
The SQL I need to write would produce the below result:
person_name max_value person_year
MOOKIE 13 2006
MIKE 6 2007
TED 11 2008
Creating the SUM table is easy:
SELECT id, person_name,SUM(total_count) AS sum_count, person_year FROM temp_table GROUP BY person_name, person_year;
This gives me the Sum count per year for each name.
The problem is any MAX logic I write doesn't carry the associated NAME with the selected MAX when I group by YEAR. I've tried numerous variations and none of them work. I would have thought the below would work, but the NAME is mismatched:
SELECT id, person_name, MAX(sum_count) AS max_count, person_year FROM
(SELECT id, person_name, SUM(total_count) AS sum_count, person_year FROM temp_table GROUP BY person_name, person_year) AS PC
GROUP BY person_year;
It returns:
1 MIKE 13 2006
2 MIKE 6 2007
4 MIKE 11 2008
So I don't know how to map the selected MAX grouped by YEAR to the proper name... That's the only piece I'm missing.
Any help on this would be appreciated.
First write a query to get the total for each name in each year:
SELECT person_name, person_year, SUM(total_count) AS count
FROM temp_table
GROUP BY person_name, person_year
Then use that as a CTE in a query to find the row with the max value for each year:
WITH counts AS (
SELECT person_name, person_year, SUM(total_count) AS count
FROM temp_table
GROUP BY person_name, person_year)
SELECT c1.*
FROM counts AS c1
JOIN (
SELECT person_year, MAX(count) AS max_count
FROM counts
GROUP BY person_year) AS c2
ON c1.person_year = c2.person_year AND c1.count = c2.max_count
This follows the same pattern as this answer, it simply uses a CTE instead of a real table.
If you're using MySQL 5.x, define a view instead of using the CTE.
Without a CTE or view, you have to substitute the entire subquery everywhere that counts appears above.
SELECT c1.*
FROM (
SELECT person_name, person_year, SUM(total_count) AS count
FROM temp_table
GROUP BY person_name, person_year) AS c1
JOIN (
SELECT person_year, MAX(count) AS max_count
FROM (
SELECT person_name, person_year, SUM(total_count) AS count
FROM temp_table
GROUP BY person_name, person_year) AS x
GROUP BY person_year) AS c2
ON c1.person_year = c2.person_year AND c1.count = c2.max_count
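To check the CTE approach against the sample data, here is a minimal sketch using Python's sqlite3 module as a stand-in for MySQL 8 (an assumption; both support WITH). The aggregate alias is renamed from `count` to `total` purely to avoid confusion with the COUNT() function, and `most_popular_per_year()` is a hypothetical helper name.

```python
import sqlite3

# Sample rows from the question: (ID, person_name, total_count, person_year)
ROWS = [
    (1, "MIKE", 1, 2006), (2, "MIKE", 2, 2007), (3, "MIKE", 4, 2007),
    (4, "MIKE", 3, 2008), (5, "TED", 1, 2006), (6, "TED", 2, 2007),
    (7, "TED", 4, 2008), (8, "TED", 7, 2008), (9, "MOOKIE", 1, 2006),
    (10, "MOOKIE", 12, 2006), (11, "MOOKIE", 5, 2007), (12, "MOOKIE", 3, 2008),
]

QUERY = """
WITH counts AS (
    -- total per name per year
    SELECT person_name, person_year, SUM(total_count) AS total
    FROM temp_table
    GROUP BY person_name, person_year
)
SELECT c1.person_name, c1.total AS max_value, c1.person_year
FROM counts AS c1
JOIN (
    -- highest total per year
    SELECT person_year, MAX(total) AS max_total
    FROM counts
    GROUP BY person_year
) AS c2
  ON c1.person_year = c2.person_year AND c1.total = c2.max_total
ORDER BY c1.person_year
"""


def most_popular_per_year():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE temp_table (id INTEGER PRIMARY KEY, person_name TEXT, "
        "total_count INTEGER, person_year INTEGER)"
    )
    conn.executemany("INSERT INTO temp_table VALUES (?, ?, ?, ?)", ROWS)
    rows = conn.execute(QUERY).fetchall()
    conn.close()
    return rows


for name, max_value, year in most_popular_per_year():
    print(name, max_value, year)
```

Because the join matches on the maximum value, a year in which two names tie for the top total would produce two rows for that year.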
I have these data
idhouse year
7 2016
2 2018
2 2017
3 2017
4 2015
14 2003
3 2018
5 2018
4 2017
4 2018
I want to count the number of houses that appear in both of two given years.
I tried with a MySQL SELECT but it didn't work.
How should I do it?
EDITED
Sorry for my bad explanation.
I have only one mysql table.
Filtering by 2017 and 2018 and counting the houses, I should get these matches:
idhouse year
7 2016
2 2018*
2 2017*
3 2017*
4 2015
14 2003
3 2018*
5 2018
4 2017*
4 2018*
And the SELECT should show 3.
I assume a house can only appear once in a year. Try this:
SELECT
COUNT(*) nb_houses
FROM (SELECT house
FROM yourTable
GROUP BY house
HAVING COUNT(*)>1) A;
See this run on SQL Fiddle.
Assuming a PK on (house,year), if you just want to know how many houses are listed more than once, you can do this...
SELECT COUNT(DISTINCT x.house) total
FROM my_table x
JOIN my_table y
ON y.house = x.house
AND y.year <> x.year;
I'm assuming you're building a web app with this question.
SELECT house, COUNT(year) AS count_year FROM yourTable GROUP BY house HAVING COUNT(year) = 2
Using your data above, the result will be
house | count_year
2 | 2
3 | 2
Then, if you are using server-side scripting like PHP, use mysqli_num_rows to get the number of rows.
If you use another language, adjust the approach accordingly to get the number of rows.
With this Select I get it:
select count(*) from (select house from yourTable Where year = 2018 and house in (select house from yourTable where year = '2017')) A;
But can we improve this Select in terms of efficiency?
You can try here
Thanks.
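One possible alternative to the IN subquery is a single grouped pass that keeps the houses having both years. The sketch below compares the two on the sample data, using Python's sqlite3 module as a stand-in for MySQL (an assumption). It only shows that both return the same count; whether the grouped form is actually faster depends on the indexes and the MySQL version, so EXPLAIN on the real table is the way to find out.

```python
import sqlite3

# Sample rows from the question: (house, year)
ROWS = [(7, 2016), (2, 2018), (2, 2017), (3, 2017), (4, 2015),
        (14, 2003), (3, 2018), (5, 2018), (4, 2017), (4, 2018)]

# The asker's query: houses listed in 2018 that are also listed in 2017.
IN_SUBQUERY = """
SELECT COUNT(*) FROM (
    SELECT house FROM yourTable
    WHERE year = 2018
      AND house IN (SELECT house FROM yourTable WHERE year = 2017)
) AS A
"""

# Alternative: keep only the two years, then require both per house.
GROUP_HAVING = """
SELECT COUNT(*) FROM (
    SELECT house FROM yourTable
    WHERE year IN (2017, 2018)
    GROUP BY house
    HAVING COUNT(DISTINCT year) = 2
) AS A
"""


def count_houses(query):
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE yourTable (house INTEGER, year INTEGER)")
    conn.executemany("INSERT INTO yourTable VALUES (?, ?)", ROWS)
    (count,) = conn.execute(query).fetchone()
    conn.close()
    return count


print(count_houses(IN_SUBQUERY), count_houses(GROUP_HAVING))
```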
In MySQL I'm tasked with a big dataset, with data from 1970 to 2010.
I want to check for consistency: check whether each instance occurs at least once per year. I took a snippet from 1970-1972 as an example to demonstrate my problem.
input:
id year counts
-- ---- ---------
1 1970 1
1 1971 1
2 1970 3
2 1971 8
2 1972 1
3 1970 4
expected:
id 1970-1972
-- ----------
1 no
2 yes
3 no
I thought about counting within the date range and then picking out those that had 3 counts: 1970, 1971, 1972. The following query doesn't force the check on each point in the range though.
select id, count(*)
from table1
WHERE (year BETWEEN '1970' AND '1972') AND `no_counts` >= 1
group by id
What to do?
You can use GROUP BY with CASE / inline if.
Using CASE:
select id, CASE WHEN COUNT(distinct year) = 3 THEN 'yes' ELSE 'no' END "1970-72"
from abc
WHERE year between 1970 and 1972
GROUP BY id
Using inline IF:
select id, IF(COUNT(distinct year) = 3, 'yes', 'no') "1970-72"
from abc
WHERE year between 1970 and 1972
GROUP BY id
You can use a having clause with distinct count:
select `id`
from `table1`
where `year` between '1970' and '1972'
group by id
having count(distinct `year`) = 3
Do you expect this?
select id, count(*)
from table1
WHERE (year BETWEEN '1970' AND '1972')
group by id
having count(distinct year) = 3
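The answers above all rely on COUNT(DISTINCT year) matching the number of years in the range. For the full 1970 to 2010 dataset that number can be computed instead of hard-coded. The following sketch does this with Python's sqlite3 module as a stand-in for MySQL (an assumption); `check_range()` is a hypothetical helper, and the `?` placeholders are sqlite3's parameter style.

```python
import sqlite3

# Sample rows from the question: (id, year, counts)
ROWS = [(1, 1970, 1), (1, 1971, 1), (2, 1970, 3),
        (2, 1971, 8), (2, 1972, 1), (3, 1970, 4)]

QUERY = """
SELECT id,
       CASE WHEN COUNT(DISTINCT year) = ? THEN 'yes' ELSE 'no' END AS complete
FROM table1
WHERE year BETWEEN ? AND ?
GROUP BY id
ORDER BY id
"""


def check_range(first_year, last_year):
    """Report per id whether every year in [first_year, last_year] is present."""
    expected_years = last_year - first_year + 1
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE table1 (id INTEGER, year INTEGER, counts INTEGER)")
    conn.executemany("INSERT INTO table1 VALUES (?, ?, ?)", ROWS)
    rows = conn.execute(QUERY, (expected_years, first_year, last_year)).fetchall()
    conn.close()
    return rows


print(check_range(1970, 1972))
```

Note that an id with no rows at all inside the range is filtered out by the WHERE clause, so it does not appear with 'no'; it simply does not appear.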
I have a big view called: how_many_per_month
name_of_product | how_many_bought | year | month
p1 20 2012 1
p2 7 2012 1
p1 10 2012 2
p2 5 2012 2
p1 3 2012 3
p2 20 2012 3
p3 66 2012 3
How can I write a MySQL query that gets only the first few occurrences of products p1, p2, p3 at once?
To get it one product at a time for the first 3 months I can write:
SELECT name_of_product , sum(how_many_bought) FROM
(SELECT name_of_product, how_many_bought FROM `how_many_per_month`
WHERE name_of_product= 'p1' LIMIT 3) t
How can I do this for all products at once, so that the result for the first month only looks like:
p1 20
p2 7
p3 66
For two months:
p1 30
p2 12
p3 66
The problem is that some products are released in different months, and I have to compute statistics on how many of them were sold in the first month, first 3 months, 6 months, and 1 year, divided by the total.
Example using union
select
name_of_product,
sum(how_many_bought) as bought,
"first month" as period
from how_many_per_month
where month = 1
group by name_of_product
union
select
name_of_product,
sum(how_many_bought) as bought,
"first 2 month" as period
from how_many_per_month
where month <= 2
group by name_of_product
union
select
name_of_product,
sum(how_many_bought) as bought,
"first 6 month" as period
from how_many_per_month
where month <= 6
group by name_of_product
union
select
name_of_product,
sum(how_many_bought) as bought,
"first 12 month" as period
from how_many_per_month
where month <= 12
group by name_of_product
Demo: http://www.sqlfiddle.com/#!2/788ea/11
The results differ slightly from what you expected. Are you sure you wrote them down correctly? If you need a faster query, you can use GROUP BY with CASE, as I've already said.
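The difference comes from p3, which first appears in month 3: filtering on the calendar month skips it, while the question counts months from each product's own first month. One way to get the numbers the question expects is to compute each product's first period and sum the rows within n months of it. This is only a sketch of that idea, not the answer's method; it uses Python's sqlite3 module as a stand-in for MySQL (an assumption), a plain table in place of the view, and the made-up names `first_period` and `first_n_months()`.

```python
import sqlite3

# Sample rows from the question: (name_of_product, how_many_bought, year, month)
ROWS = [
    ("p1", 20, 2012, 1), ("p2", 7, 2012, 1),
    ("p1", 10, 2012, 2), ("p2", 5, 2012, 2),
    ("p1", 3, 2012, 3), ("p2", 20, 2012, 3), ("p3", 66, 2012, 3),
]

QUERY = """
SELECT h.name_of_product, SUM(h.how_many_bought) AS bought
FROM how_many_per_month AS h
JOIN (
    -- first month on sale per product, as a running month number
    SELECT name_of_product, MIN(year * 12 + month) AS first_period
    FROM how_many_per_month
    GROUP BY name_of_product
) AS s ON s.name_of_product = h.name_of_product
WHERE (h.year * 12 + h.month) - s.first_period < ?   -- within the first n months
GROUP BY h.name_of_product
ORDER BY h.name_of_product
"""


def first_n_months(n):
    """Total bought per product during its own first n months on sale."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE how_many_per_month (name_of_product TEXT, "
        "how_many_bought INTEGER, year INTEGER, month INTEGER)"
    )
    conn.executemany("INSERT INTO how_many_per_month VALUES (?, ?, ?, ?)", ROWS)
    rows = conn.execute(QUERY, (n,)).fetchall()
    conn.close()
    return rows


print(first_n_months(1))
print(first_n_months(2))
```

Using year * 12 + month keeps the offset correct across a year boundary, which a filter on month alone would not.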
I'm not quite sure what you're trying to achieve as the description of your question is a bit unclear. From what I've read so far, I understand you want to show the total of how many ITEM_X, ITEM_Y, ITEM_Z were sold for the past 1,3,6 months.
Based on the data you've provided, I've created this sqlfiddle that sums all results and groups them by item. This is the query:
SELECT
name_of_product,
sum(how_many_bought) as how_many_bought
FROM how_many_per_month
WHERE year = 2012
AND month BETWEEN 1 AND 3
GROUP BY name_of_product
-- NOTE: Not specifying a year will include all months
-- between 1 and 3 for all years. Remove the year condition
-- in case you need that effect.
In the example above the database will sum all sold items between months 1 and 3 (inclusive) for 2012. When you execute this query in your application, just change the range in the BETWEEN X AND Y and you'll be good to go.
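Changing the range from the application is usually done with bound parameters rather than by editing the SQL text. Here is a minimal sketch using Python's sqlite3 module as a stand-in for MySQL (an assumption), with a plain table in place of the view. `totals()` is a hypothetical helper, and the `?` placeholder is sqlite3's style; MySQL drivers may use a different placeholder syntax.

```python
import sqlite3

# Sample rows from the question: (name_of_product, how_many_bought, year, month)
ROWS = [
    ("p1", 20, 2012, 1), ("p2", 7, 2012, 1),
    ("p1", 10, 2012, 2), ("p2", 5, 2012, 2),
    ("p1", 3, 2012, 3), ("p2", 20, 2012, 3), ("p3", 66, 2012, 3),
]

QUERY = """
SELECT name_of_product, SUM(how_many_bought) AS how_many_bought
FROM how_many_per_month
WHERE year = ?
  AND month BETWEEN ? AND ?
GROUP BY name_of_product
ORDER BY name_of_product
"""


def totals(year, first_month, last_month):
    """Sum sales per product for the given year and inclusive month range."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE how_many_per_month (name_of_product TEXT, "
        "how_many_bought INTEGER, year INTEGER, month INTEGER)"
    )
    conn.executemany("INSERT INTO how_many_per_month VALUES (?, ?, ?, ?)", ROWS)
    rows = conn.execute(QUERY, (year, first_month, last_month)).fetchall()
    conn.close()
    return rows


print(totals(2012, 1, 3))
```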
Additional tip:
Avoid sub-queries where you can, or use them only as a last resort (when there's simply no other way to do it). They can be significantly slower than plain queries and even joins, especially on older MySQL versions. Usually a sub-query can be rewritten as a join.
SELECT
hmpm.name_of_product , SUM(hmpm.how_many_bought)
FROM (
SELECT name_of_product
FROM how_many_per_month
/* WHERE ... */
/* ORDER BY ... */
) sub
INNER JOIN how_many_per_month hmpm
ON hmpm.name_of_product = sub.name_of_product
GROUP BY hmpm.name_of_product
/* LIMIT ... */
MySQL does not support LIMIT in IN/ALL/ANY/SOME subqueries (a derived table like the one above can use it), so you need an ordering and a condition. And why not have an id_of_product field?