More concise SQL query involving MAX() - mysql

inventory
+------------------+-------------------+------------+
| DVD | replacement_price | stock |
+------------------+-------------------+------------+
| Pi | 9.99 | 500 |
| Dune | 29.99 | 100 |
| Heathers | 4.99 | 20 |
| Jaws | 19.99 | 500 |
| Mulholland_Drive | 39.99 | 50 |
| Waking_Life | 29.99 | 200 |
+------------------+-------------------+------------+
rented
+-----------------+-----------+------------------+
| subscriber | queue_nbr | DVD |
+-----------------+-----------+------------------+
| Bob | 1 | Mulholland_Drive |
| Bob | 2 | Jaws |
| Chey | 1 | Pi |
| Chey | 2 | Heathers |
| Jamie | 2 | Mulholland_Drive |
| Jamie | 4 | Dune |
| Jamie | 1 | Jaws |
| Jamie | 3 | Waking_Life |
| Nora | 4 | Jaws |
| Nora | 2 | Mulholland_Drive |
| Nora | 3 | Dune |
| Nora | 1 | Waking_Life |
+-----------------+-----------+------------------+
I want to return ONLY the subscriber(s) with the priciest movie queue (think Netflix DVD replacement costs if you lost all the movies you had out at a given time). I've used MAX() rather than TOP, LIMIT or ROWNUM because the query needs to be as db-independent as possible and must return multiple subscribers in the event of a tie. Using the tables above, the result should be
+---------+
| highest |
+---------+
| Jamie |
| Nora |
+---------+
After much searching and experimentation, I've come up with code that works, but it seems to my novice eyes bloated and inefficient, both in quantity of code and execution.
Would anyone mind refactoring and explaining your code?
My code:
SELECT z.subscriber highest
FROM
(SELECT MAX(price) max_price
FROM (
SELECT subscriber_name subscriber, SUM(replacement_price) price
FROM inventory i
INNER JOIN rented r
ON i.DVD = r.DVD
GROUP BY subscriber
) x
) y
INNER JOIN
(
SELECT subscriber_name subscriber, SUM(replacement_price) price
FROM inventory i
INNER JOIN rented r
ON i.DVD = r.DVD
GROUP BY subscriber
) z
ON z.price = y.max_price

If you want to return only those with the max total, then you could use the following which works in both MySQL and SQL Server. It is not any more concise than your current query though:
select subscriber
from inventory i
inner join rented r
on i.dvd = r.dvd
group by subscriber
having sum(replacement_price) = (select max(TotalCost)
from
(
select sum(replacement_price) TotalCost
from inventory i
inner join rented r
on i.dvd = r.dvd
group by subscriber
) p);
If you are using SQL Server, then I would suggest implementing windowing functions, similar to this:
select subscriber
from
(
select subscriber,
rank() over(order by sum(replacement_price) desc) rnk
from inventory i
inner join rented r
on i.dvd = r.dvd
group by subscriber
) src
where rnk = 1
See SQL Fiddle with Demo
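For anyone who wants to try the HAVING/MAX version without a MySQL server, here is a minimal sketch that runs it against the sample tables from the question. It assumes an in-memory SQLite database via Python's sqlite3 module, and it stores prices as integer cents (my assumption, not the question's schema) so that comparing summed totals for equality is exact.

```python
import sqlite3

# In-memory SQLite stands in for MySQL here (an assumption for portability).
# Prices are integer cents (also an assumption) so SUM(...) = MAX(...) is exact.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE inventory (dvd TEXT PRIMARY KEY, replacement_price INTEGER, stock INTEGER);
CREATE TABLE rented (subscriber TEXT, queue_nbr INTEGER, dvd TEXT);
INSERT INTO inventory VALUES
  ('Pi', 999, 500), ('Dune', 2999, 100), ('Heathers', 499, 20),
  ('Jaws', 1999, 500), ('Mulholland_Drive', 3999, 50), ('Waking_Life', 2999, 200);
INSERT INTO rented VALUES
  ('Bob', 1, 'Mulholland_Drive'), ('Bob', 2, 'Jaws'),
  ('Chey', 1, 'Pi'), ('Chey', 2, 'Heathers'),
  ('Jamie', 2, 'Mulholland_Drive'), ('Jamie', 4, 'Dune'),
  ('Jamie', 1, 'Jaws'), ('Jamie', 3, 'Waking_Life'),
  ('Nora', 4, 'Jaws'), ('Nora', 2, 'Mulholland_Drive'),
  ('Nora', 3, 'Dune'), ('Nora', 1, 'Waking_Life');
""")

# Keep only the groups whose total equals the largest per-subscriber total.
query = """
SELECT r.subscriber
FROM inventory i
INNER JOIN rented r ON i.dvd = r.dvd
GROUP BY r.subscriber
HAVING SUM(i.replacement_price) = (
    SELECT MAX(total_cost)
    FROM (
        SELECT SUM(i2.replacement_price) AS total_cost
        FROM inventory i2
        INNER JOIN rented r2 ON i2.dvd = r2.dvd
        GROUP BY r2.subscriber
    ) p
)
ORDER BY r.subscriber
"""
highest = [row[0] for row in conn.execute(query)]
print(highest)
```

With the sample data, Jamie and Nora tie at the top, so both come back. If the prices were stored as floating-point values, the equality test in HAVING could miss a tie, which is why an exact type (DECIMAL in MySQL, or integer cents as here) seems the safer choice.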

SELECT z.subscriber
FROM(
SELECT RANK() OVER(ORDER BY SUM(replacement_price) DESC) subscriber_rank,
r.subscriber subscriber,
SUM(replacement_price) totalReplacementPrice
FROM inventory i
INNER JOIN rented r ON i.dvd = r.DVD
GROUP BY subscriber
) z
WHERE z.subscriber_rank = 1
Some of the column names in your query differ from those in your sample tables, so I've used the column names given in the demo tables. I use the RANK function in the inner query to rank all of the subscribers by the sum of replacement_price. Then I select the row(s) where the rank is 1.
RANK is available in both MS SQL Server and Oracle. To go much further than that, as @bluefeet says, you will need to give more detail about which database you are targeting.

Related

Bring all data from a table with joins with where clause that may not exist in the other table

I'm having a hard time setting up a query(select). Database is not my specialty, so I'm turning to the experts. Let me show what I need.
companies
| id | name |
|----|----------|
| 1 | Company1 |
| 2 | Company2 |
| 3 | Company3 |
company_server (companies 1--N company_server, company_server N--1 servers)
| company | server |
|---------|--------|
| 1 | 1 |
| 2 | 1 |
| 3 | 2 |
| 3 | 3 |
servers
| id | name |
|----|---------|
| 1 | Server1 |
| 2 | Server2 |
| 3 | Server3 |
| 4 | Server4 |
print (servers 1--N print)
| id | page | copy | date | server |
|----|------|------|-----------|--------|
| 1 | 2 | 3 | 2020-1-11 | 1 |
| 2 | 1 | 6 | 2020-1-12 | 3 |
| 3 | 4 | 5 | 2020-1-13 | 4 |
| 4 | 5 | 3 | 2020-1-15 | 2 |
| 5 | 3 | 4 | 2020-1-15 | 4 |
| 6 | 1 | 2 | 2020-1-16 | 3 |
| 7 | 2 | 2 | 2020-1-16 | 4 |
What do I need?
Example: where date between CAST('2020-1-12' AS DATE) AND CAST('2020-1-15' AS DATE), grouped by servers.id
| companies | server | sum | percent
------------------------------------------------------------------------------------
| company1,company2 | server1 | sum(page*copy) = 0 or null | 0 or NULL
| company3 | server2 | sum(page*copy) = 15 | 28.30
| company3 | server3 | sum(page*copy) = 6 | 11.32
| NULL | server4 | sum(page*copy) = 32 | 60.38
A few notes:
I need this query for MySQL.
Every company is linked to at least one server.
I need the result grouped by server, so every company linked to that server must be concatenated with commas.
If no company has been registered for a server yet, NULL should be shown.
The sum(page * copy) must be shown as zero or NULL (I don't care which) when there was no printing in the date range.
The percentage should be calculated over the date range entered, not over all records in the database.
The date field is stored as a MySQL DATE.
Experts, thank you in advance for your help. I currently solve this problem with at least three queries to the database, but I am convinced it could be done with just one.
I added a fiddle. Sorry, I'm still learning how to use this.
https://www.db-fiddle.com/f/dXej7QCPe9iDopfYd1SfVh/2
Below is the query that more or less shows how far I got. Notice that 'server4' disappeared along the way because there are no values for it in print in the period searched. I also have the total for the period, but I cannot calculate the percentage.
I'm stuck.
select
*
from
(select
sum(p.copy * p.page) as sum1,
s.name as s_name,
s.id as s_id
from
print p
join servers s on s.id = p.server
where p.date between cast('2020-1-12' as date) and cast('2020-1-15' as date)
group by s.id) as t1
join company_server cs on cs.server = t1.s_id
right join companies c on c.id = cs.company
cross join(
select
sum(p1.copy * p1.page) sum2
from
print p1
where p1.date between cast('2020-1-12' as date) and cast('2020-1-15' as date)
) as c;
I wrote this query before you added the fiddle, so some of my column names may differ from yours. Anyway, this is my solution; I hope it helps.
select group_concat(c.name separator ',') as name_company,
ss.name,
sum_print as sum,
(sum_print/total) * 100 as percentage
from companies c
inner join company_server cs on c.id = cs.company
-- keep every server, even one with no company linked to it
right join servers ss on ss.id = cs.server
left join
(
select server, sum(page*copy) as sum_print from print
where date between CAST('2020-1-12' AS DATE) AND CAST('2020-1-15' AS DATE)
group by server
) tmp on tmp.server = ss.id
cross join
(select sum(page*copy) as total from print where date between CAST('2020-1-12' AS DATE) AND CAST('2020-1-15' AS DATE)) tmp2
group by ss.id, ss.name, sum_print, total
Group and concat by comma, using GROUP_CONCAT .
You can reference this image for JOIN clause.
https://i.stack.imgur.com/6cioZ.png
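To check the idea against the sample data, here is a rough sketch using Python's sqlite3 module with an in-memory database. A few things in it are my assumptions rather than part of the question: dates are stored as ISO text ('2020-01-12') so that BETWEEN compares correctly in SQLite, the query starts from servers and uses LEFT JOINs instead of a RIGHT JOIN (older SQLite versions lack RIGHT JOIN), and the percentage is rounded to two decimals.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE companies (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE servers (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE company_server (company INTEGER, server INTEGER);
CREATE TABLE print (id INTEGER PRIMARY KEY, page INTEGER, copy INTEGER,
                    date TEXT, server INTEGER);
INSERT INTO companies VALUES (1,'Company1'),(2,'Company2'),(3,'Company3');
INSERT INTO servers VALUES (1,'Server1'),(2,'Server2'),(3,'Server3'),(4,'Server4');
INSERT INTO company_server VALUES (1,1),(2,1),(3,2),(3,3);
INSERT INTO print VALUES
  (1,2,3,'2020-01-11',1),(2,1,6,'2020-01-12',3),(3,4,5,'2020-01-13',4),
  (4,5,3,'2020-01-15',2),(5,3,4,'2020-01-15',4),(6,1,2,'2020-01-16',3),
  (7,2,2,'2020-01-16',4);
""")

# Start from servers so that every server appears, then attach companies,
# the per-server sum for the range, and the grand total for the range.
query = """
SELECT GROUP_CONCAT(c.name, ',') AS companies,
       s.name AS server,
       t.sum_print,
       ROUND(100.0 * t.sum_print / tot.total, 2) AS percent
FROM servers s
LEFT JOIN company_server cs ON cs.server = s.id
LEFT JOIN companies c ON c.id = cs.company
LEFT JOIN (
    SELECT server, SUM(page * copy) AS sum_print
    FROM print
    WHERE date BETWEEN ? AND ?
    GROUP BY server
) t ON t.server = s.id
CROSS JOIN (
    SELECT SUM(page * copy) AS total
    FROM print
    WHERE date BETWEEN ? AND ?
) tot
GROUP BY s.id, s.name, t.sum_print, tot.total
ORDER BY s.id
"""
date_range = ("2020-01-12", "2020-01-15")
rows = conn.execute(query, date_range + date_range).fetchall()
for row in rows:
    print(row)
```

Server1 comes back with both companies but a NULL sum and percentage, and Server4 comes back with a NULL company list, which matches the result the question asks for.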

Get one single record when existing duplicates

I have an ingredients translations table this form (some columns have been removed for simplicity, but still required in the result)
| id | name | ingredient_id | language |
| 1 | Water | 11 | en |
| 2 | Bell pepper | 12 | en |
| 3 | Sweet pepper | 12 | en |
I'm trying to build a query to retrieve just one single ingredient translation per ingredient like this (expected result)
| id | name | ingredient_id |
| 1 | Water | 11 |
| 2 | Bell pepper | 12 |
So far now I'm trying to do it with this query
select it1.*
from ingredient_translations it1
left outer join ingredient_translations it2
on it1.ingredient_id = it2.ingredient_id
and it1.id < it2.id
where it1.language = 'es'
but it's not giving the expected results :/
I'm using PostgreSQL, though I was trying to do this using joins so I can devise a cross-DB (PostgreSQL - MySQL) solution.
Please, any insight will be appreciated!!! :D
WITH CustomerCTE AS (
SELECT t1.*, ROW_NUMBER() OVER (PARTITION BY ingredient_id ORDER BY id) AS RN
FROM ingredient_translations t1
)
SELECT * FROM CustomerCTE WHERE RN = 1
ORDER BY id;
Use ROW_NUMBER() over partition.
Query
select id,name,ingredient_id,language from
(
select id,name,ingredient_id,language,
row_number() over
(
partition by ingredient_id
order by id
) rn
from tbl_Name
)t
where t.rn < 2;
SQL Fiddle
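As a quick check of the ROW_NUMBER() approach on the sample rows, here is a small sketch using Python's sqlite3 module. It assumes the bundled SQLite is version 3.25 or newer (window functions are not available before that), and it filters on language = 'en' because that is the language in the sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ingredient_translations (
    id INTEGER PRIMARY KEY, name TEXT, ingredient_id INTEGER, language TEXT);
INSERT INTO ingredient_translations VALUES
  (1, 'Water', 11, 'en'),
  (2, 'Bell pepper', 12, 'en'),
  (3, 'Sweet pepper', 12, 'en');
""")

# Number the translations within each ingredient by id, then keep the first.
query = """
SELECT id, name, ingredient_id
FROM (
    SELECT id, name, ingredient_id,
           ROW_NUMBER() OVER (PARTITION BY ingredient_id ORDER BY id) AS rn
    FROM ingredient_translations
    WHERE language = 'en'
) t
WHERE t.rn = 1
ORDER BY id
"""
rows = conn.execute(query).fetchall()
print(rows)
```

Ordering by id inside the partition keeps the lowest id per ingredient, so 'Bell pepper' wins over 'Sweet pepper' as in the expected result.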

mysql subquery runs indefinitely

I have two tables. I am doing a join and wish to return a query with multiple codenames listed for each GenEx prescription brand. However, it looks like the way I'm doing the join causes it to time out.
Drugs:
ID | GenEx | CodeName | Desc
----------------------------
1 | Cipro | Dolvo |
2 | Ludavil | Ymir |
3 | Cipro | Alpha |
Medicine:
ID | GenEx | Price |
----------------------------
1 | Cipro | 4.99 |
2 | Ludavil | 12.99 |
3 | Benazol | 5.00 |
I wish to return:
1. GenEx->Cipro, CodeName=>Dolvo,Alpha, Price->4.99
2. GenEx->Ludavil, CodeName=>Ymir, Price->12.99
My query, which never completes:
SELECT GenEx, Price,
GROUP_CONCAT(CodeName) as CodeName
FROM (`Drugs` d)
JOIN `Medicine` m ON `m`.`GenEx` = `d`.`GenEx`
WHERE GenEx
IN (
SELECT DISTINCT GenEx
FROM Drugs
WHERE codeName IN ('Alpha')
)
GROUP BY `GenEx`;
I have now updated the answer as per the last update to the question.
Try this code:
SELECT d.`GenEx`, m.`Price`,
GROUP_CONCAT(d.`CodeName`) as CodeName
FROM Drugs d
JOIN Medicine m
ON m.`GenEx` = d.`GenEx`
AND d.`GenEx`
IN (
SELECT DISTINCT `GenEx`
FROM Drugs
WHERE CodeName IN ('Alpha')
)
GROUP BY d.`GenEx`, m.`Price`;
And let me know what you get now.
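Here is a minimal sketch of that query run against the sample rows, using Python's sqlite3 module with an in-memory database rather than MySQL (an assumption made so it runs anywhere). The Desc column is left out because it is empty in the sample and DESC is a reserved word, and the IN filter is written in a WHERE clause, which is equivalent for an inner join.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Drugs (ID INTEGER PRIMARY KEY, GenEx TEXT, CodeName TEXT);
CREATE TABLE Medicine (ID INTEGER PRIMARY KEY, GenEx TEXT, Price REAL);
INSERT INTO Drugs VALUES (1, 'Cipro', 'Dolvo'), (2, 'Ludavil', 'Ymir'), (3, 'Cipro', 'Alpha');
INSERT INTO Medicine VALUES (1, 'Cipro', 4.99), (2, 'Ludavil', 12.99), (3, 'Benazol', 5.00);
""")

# Every column is qualified with its table alias, and Price is grouped
# alongside GenEx so the select list is valid under strict GROUP BY rules.
query = """
SELECT d.GenEx, m.Price, GROUP_CONCAT(d.CodeName) AS CodeNames
FROM Drugs d
JOIN Medicine m ON m.GenEx = d.GenEx
WHERE d.GenEx IN (
    SELECT GenEx FROM Drugs WHERE CodeName IN ('Alpha')
)
GROUP BY d.GenEx, m.Price
"""
rows = conn.execute(query).fetchall()
print(rows)
```

Only Cipro has the code name 'Alpha', so a single row comes back with both of Cipro's code names concatenated. The order of names inside the concatenated string is not guaranteed unless you ask for one.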

Mysql count records grouped by ID in multiple tables

I'm developing an application integrated with Facebook. This application can be embedded in an FB page as a tab app.
Using the FB SDK, the feeds of the page are stored in the Feeds table.
Page fans may have liked and commented on feeds posted by the page.
Users' likes are stored in the Likes table and users' comments are stored in the Comments table.
I want to get the total count (likes count + comments count) for each user.
SQL Fiddle : http://sqlfiddle.com/#!2/ecb37/10/0
Table : Feeds
| ID | POST_ID |
|----|---------------------------------|
| 56 | 150348635024244_795407097185058 |
| 55 | 150348635024244_795410940518007 |
| 54 | 150348635024244_795414953850939 |
| 53 | 150348635024244_797424133650021 |
| 52 | 150348635024244_797455793646855 |
| 51 | 150348635024244_798997120159389 |
| 50 | 150348635024244_798997946825973 |
Table : Likes
SELECT user_id, COUNT(*) FROM likes GROUP by user_id
| USER_ID | LIKECOUNT |
|------------------|-----------|
| 913403225356462 | 4 |
| 150348635024244 | 3 |
| 356139014550882 | 2 |
| 753274941400012 | 2 |
| 1559751687580867 | 1 |
Table : Comments
SELECT user_id, COUNT(*) FROM comments GROUP by user_id
| USER_ID | COMMENTSCOUNT |
|-----------------|---------------|
| 150348635024244 | 2 |
| 356139014550882 | 2 |
| 913403225356462 | 2 |
Result should be like this
| POINTS | LIKESCOUNT | COMMENTSCOUNT | USER_ID |
|--------|------------|---------------|-----------------|
| 6 | 4 | 2 | 913403225356462 |
| 5 | 3 | 2 | 150348635024244 |
| 4 | 2 | 2 | 356139014550882 |
| 2 | 2 | 0 | 753274941400012 |
| 1 | 1 | 0 |1559751687580867 |
I tried this query, but the count for each user is wrong.
SELECT COUNT(likes.user_id)+COUNT(comments.user_id) as points, likes.user_id FROM `likes`
LEFT JOIN comments ON likes.user_id = comments.user_id
LEFT JOIN feeds ON likes.post_id = feeds.post_id
WHERE likes.post_id LIKE '153548635024244%'
GROUP BY likes.user_id
ORDER BY points DESC
The two queries are unrelated and a join is useless. Use a UNION ALL:
SELECT user_id, sum(n) from (
SELECT user_id, COUNT(*) n FROM likes GROUP by user_id
UNION ALL
SELECT user_id, COUNT(*) FROM comments GROUP by user_id
) x
GROUP BY user_id
UNION ALL is needed instead of just UNION, because UNION removes duplicates and would cause incorrect results for the edge case of the two subqueries yielding the same counts.
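The sample data in the question happens to contain that edge case: user 356139014550882 has 2 likes and 2 comments. Here is a small sketch that shows the difference, using Python's sqlite3 module with an in-memory database. The tables hold only a user_id column (a simplification of mine), with rows generated to match the counts shown in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE likes (user_id INTEGER)")
conn.execute("CREATE TABLE comments (user_id INTEGER)")

# Per-user row counts copied from the question's Likes and Comments tables.
like_counts = {913403225356462: 4, 150348635024244: 3, 356139014550882: 2,
               753274941400012: 2, 1559751687580867: 1}
comment_counts = {150348635024244: 2, 356139014550882: 2, 913403225356462: 2}
for table, counts in (("likes", like_counts), ("comments", comment_counts)):
    for user_id, n in counts.items():
        conn.executemany(f"INSERT INTO {table} VALUES (?)", [(user_id,)] * n)


def points(union_keyword):
    """Sum the per-table counts per user, combining them with the given keyword."""
    sql = f"""
        SELECT user_id, SUM(n) AS points FROM (
            SELECT user_id, COUNT(*) AS n FROM likes GROUP BY user_id
            {union_keyword}
            SELECT user_id, COUNT(*) FROM comments GROUP BY user_id
        ) x
        GROUP BY user_id
    """
    return dict(conn.execute(sql).fetchall())


with_union_all = points("UNION ALL")
with_union = points("UNION")
print(with_union_all[356139014550882], with_union[356139014550882])
```

With UNION ALL the user gets 4 points. With plain UNION the two identical (user_id, 2) rows collapse into one and the user gets only 2.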
The simple way to get what you want is to use count(distinct). But that will likely have lousy performance. Instead, use correlated subqueries:
SELECT (COUNT(*) +
(select COUNT(c.user_id) from comments c where c.user_id = l.user_id)
) as points, l.user_id
FROM likes l
WHERE l.post_id LIKE '153548635024244%'
GROUP BY l.user_id
ORDER BY points DESC;
I'm not sure what the feeds table is for. However, your version of the query creates a Cartesian product between the different tables. If you have a lot of activity for a given user, that would be very bad for performance.

Sorting a dataset with a group by statement

I am writing a query into a database that tracks the results of athletic competitions. My database has an athletes table:
| id | first_name | last_name | Gender |
| 1 | Sam | Johnson | m |
| 2 | Adam | Jones | m |
and a results table
| id | time | athlete_id
| 1 | 1302 | 1
| 2 | 1420 | 1
| 3 | 1491 | 2
| 4 | 1541 | 2
| 5 | 0 | 1
I want to retrieve all the athletes and only their fastest result. I have a query like this
select a.id as aid, a.`first`, a.`last`, r.`id` as `rid`, min(r.`time`) as `time`
FROM athletes a, results r
WHERE
r.athlete_id=a.id AND
r.time > 0
GROUP BY a.id
ORDER BY r.time
So far my query does limit the result to the fastest time, but it's not sorting by the time correctly. I also tried adding second reference to the results table
select a.id as aid, a.`first`, a.`last`, r.`id` as `rid`, r.`time`
FROM athletes a, results r, results r2
WHERE
r.athlete_id=a.id AND
r2.athlete_id=a.id AND
r.time > 0
r1.time < r2.time
ORDER BY r.time
but that caused an out-of-memory error. The results table has over a million entries and the athletes table has over 15,000. So the question remains: is there an efficient way of sorting the grouped records, or should I have the PHP script remove results as the record set is looped?
Try
SELECT q.athlete_id aid, a.first, a.last, r.id rid, q.`time`
FROM
(SELECT athlete_id, MIN(`time`) `time`
FROM results
WHERE `time` > 0
GROUP BY athlete_id) q
JOIN results r
ON q.athlete_id = r.athlete_id
AND q.`time` = r.`time`
JOIN athletes a
ON q.athlete_id = a.id
ORDER BY q.`time`
Output:
| AID | FIRST | LAST | RID | TIME |
--------------------------------------
| 1 | Sam | Johnson | 1 | 1302 |
| 2 | Adam | Jones | 3 | 1491 |
SQLFiddle
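To reproduce that output locally, here is a minimal sketch using Python's sqlite3 module and an in-memory database in place of MySQL (an assumption). It uses the first_name and last_name column names from the athletes table in the question, and best_time is an alias I introduced for readability.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE athletes (id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT, gender TEXT);
CREATE TABLE results (id INTEGER PRIMARY KEY, time INTEGER, athlete_id INTEGER);
INSERT INTO athletes VALUES (1, 'Sam', 'Johnson', 'm'), (2, 'Adam', 'Jones', 'm');
INSERT INTO results VALUES (1, 1302, 1), (2, 1420, 1), (3, 1491, 2), (4, 1541, 2), (5, 0, 1);
""")

# Step 1 (subquery q): fastest non-zero time per athlete.
# Step 2: join back to results to recover the id of that result row,
# and to athletes for the names; sort on the aggregated time.
query = """
SELECT q.athlete_id AS aid, a.first_name, a.last_name, r.id AS rid, q.best_time
FROM (
    SELECT athlete_id, MIN(time) AS best_time
    FROM results
    WHERE time > 0
    GROUP BY athlete_id
) q
JOIN results r ON r.athlete_id = q.athlete_id AND r.time = q.best_time
JOIN athletes a ON a.id = q.athlete_id
ORDER BY q.best_time
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Sorting on the aggregated value from the subquery is what fixes the ordering problem in the original query. On a table with over a million results, an index on (athlete_id, time) would likely help both the grouping and the join back, though I have not measured it.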