MySQL JOIN/AGGREGATE function output - mysql

I have two tables in my database:
select * from marks;
select * from subjects;
I need to find the id of the student who got the highest marks in each subject, along with the subject name, i.e., the result set should have 3 columns:
student_id | subject_name | maximum_marks
1 | PHYSICS | 97.5
2 | CHEMISTRY | 98.5
Please help me write the query for the above result set
This is what I've tried so far
select m.student_id, s.subject_name, max(m.marks) as maximum_marks from
marks m inner join subjects s
on m.subject_id=s.subject_id
group by m.subject_id;
OUTPUT:

SQL Fiddle Demo
select m.student_id, s.subject_name, m.max_marks
from subjects s join (
select student_id,subject_id, max(marks) as max_marks
from marks
group by student_id,subject_id
order by 3 desc
) as m
on s.subject_id = m.subject_id
group by s.subject_id
Schema and sample data (the query above assumes ONLY_FULL_GROUP_BY is disabled):
CREATE TABLE IF NOT EXISTS `marks` (
`student_id` int(6) NOT NULL,
`subject_id` int(6) NOT NULL,
`marks` float NOT NULL
) DEFAULT CHARSET=utf8;
CREATE TABLE IF NOT EXISTS `subjects` (
`subject_id` int(6) NOT NULL,
`subject_name` varchar(10) NOT NULL
) DEFAULT CHARSET=utf8;
INSERT INTO `marks` (`student_id`, `subject_id`, `marks`) VALUES
(1,1,97.5),(1,2,92.5),
(2,1,90.5),(2,2,98.5),
(3,1,90.5),(3,2,67.5),
(4,1,80.5),(4,2,97.5);
INSERT INTO `subjects` (`subject_id`, `subject_name`) VALUES
(2,"Chemistry"),(1,"Physics");

I've found a slightly better solution. This is a common use case for correlated subqueries, and the output can be achieved without a GROUP BY:
select m1.student_id, m1.subject_id, m1.marks, s.subject_name
from marks m1 inner join subjects s
on m1.subject_id=s.subject_id
where m1.marks=
(select max(marks) from marks m2 where m1.subject_id=m2.subject_id);
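A minimal sketch for checking the correlated subquery above against the sample data from this thread. It assumes Python's sqlite3 module, with SQLite standing in for MySQL, so column types are simplified; the variable name top_marks is only illustrative.

```python
import sqlite3

# Assumption: SQLite replaces MySQL so the sketch runs without a server.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE marks (student_id INTEGER, subject_id INTEGER, marks REAL);
    CREATE TABLE subjects (subject_id INTEGER, subject_name TEXT);
    INSERT INTO marks VALUES
        (1,1,97.5),(1,2,92.5),(2,1,90.5),(2,2,98.5),
        (3,1,90.5),(3,2,67.5),(4,1,80.5),(4,2,97.5);
    INSERT INTO subjects VALUES (2,'Chemistry'),(1,'Physics');
""")

# For each row, keep it only if its marks equal the maximum for its subject.
top_marks = conn.execute("""
    SELECT m1.student_id, s.subject_name, m1.marks AS maximum_marks
    FROM marks m1
    INNER JOIN subjects s ON m1.subject_id = s.subject_id
    WHERE m1.marks = (SELECT MAX(m2.marks) FROM marks m2
                      WHERE m2.subject_id = m1.subject_id)
    ORDER BY m1.subject_id
""").fetchall()

for row in top_marks:
    print(row)
```

If two students tie for the top mark in a subject, this form returns both rows, which the GROUP BY attempts cannot do.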

Related

Improve query of getting row with max value

I have the following tables:
CREATE TABLE IF NOT EXISTS `Customers` (
`id` INT AUTO_INCREMENT,
`name` VARCHAR(20) NOT NULL,
PRIMARY KEY(`id`)
);
CREATE TABLE IF NOT EXISTS `Orders` (
`id` INT AUTO_INCREMENT,
`id_cust` INT NOT NULL,
`descr` VARCHAR(40),
`price` INT NOT NULL,
PRIMARY KEY(`id`),
FOREIGN KEY(`id_cust`) REFERENCES `Customers`(`id`)
);
One customer can have many orders. I want to get the id_cust and the order total of the customer who paid the most (a single person).
My query:
SELECT cust, max_orders_sum
FROM
(
(
SELECT MAX(orders_sum) AS max_orders_sum
FROM (
SELECT o.id_cust AS cust, SUM(o.price) AS orders_sum
FROM Orders AS o
GROUP BY o.id_cust
) AS same_query0
) AS step1
INNER JOIN
(
SELECT o.id_cust AS cust, SUM(o.price) AS orders_sum
FROM Orders AS o
GROUP BY o.id_cust
) AS same_query1
ON (step1.max_orders_sum = same_query1.orders_sum)
);
Main problem:
As you can see, it contains the same subquery twice: same_query0 and same_query1. Is there any way to get rid of the duplication?
Or, if you know a better way to reach my goal, please share.
I found one simple solution:
SELECT o.id_cust AS cust, SUM(o.price) AS orders_sum
FROM Orders AS o
GROUP BY o.id_cust
ORDER BY orders_sum DESC LIMIT 1;
But this is not a direct way to solve the problem.
I don't think you can do much better than what you've already done, unless you create a view like
create view v_cust_tot as
select id_cust, sum(price) as cust_tot
from Orders
group by id_cust
With that you'd be able to rewrite your query like this
select id_cust, cust_tot
from v_cust_tot
where cust_tot = (select max(cust_tot) from v_cust_tot)
This would only make the query more compact; I think performance would be the same, as the execution plan would be almost identical.
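As a rough illustration of the view approach, the sketch below runs it through Python's sqlite3 module (SQLite standing in for MySQL, which is an assumption). The order rows are hypothetical and chosen so that two customers tie for the highest total, which also shows how the result differs from the ORDER BY ... LIMIT 1 query in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Orders (
        id INTEGER PRIMARY KEY,
        id_cust INTEGER NOT NULL,
        price INTEGER NOT NULL
    );
    -- Hypothetical rows: customers 1 and 2 both total 30, customer 3 totals 5.
    INSERT INTO Orders (id_cust, price) VALUES (1,10),(1,20),(2,30),(3,5);
    CREATE VIEW v_cust_tot AS
        SELECT id_cust, SUM(price) AS cust_tot FROM Orders GROUP BY id_cust;
""")

# The view holds the per-customer totals once; the query reads it twice.
top_customers = conn.execute("""
    SELECT id_cust, cust_tot
    FROM v_cust_tot
    WHERE cust_tot = (SELECT MAX(cust_tot) FROM v_cust_tot)
    ORDER BY id_cust
""").fetchall()

# The LIMIT 1 variant returns a single row even when totals are tied.
limit_one = conn.execute("""
    SELECT id_cust, SUM(price) AS orders_sum
    FROM Orders
    GROUP BY id_cust
    ORDER BY orders_sum DESC LIMIT 1
""").fetchall()

print(top_customers)
print(limit_one)
```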
Another nice solution:
select id_cust, sum(price) from orders group by id_cust having sum(price) =
(select max(prc) from
(select sum(price) as prc from orders group by id_cust) as tb);

Difficult MySQL Query to Determine Rank For a Parent Based on Aggregated Child Data

I need a single query that can provide me with the rank of a parent among all parents. That rank is determined by the sum of the score field from the parent's children.
Here is the MYSQL table setup:
CREATE TABLE `parent` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`name` varchar(30) NOT NULL,
PRIMARY KEY (`id`)
);
CREATE TABLE `child` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`parent_id` int(11) DEFAULT NULL,
`name` varchar(30) NOT NULL,
`score` decimal(6,4) DEFAULT NULL,
PRIMARY KEY (`id`)
);
I can easily find the rank of a particular child among all children using:
SELECT c1.id,c1.name,c1.score,COUNT(c2.id) AS 'rank'
FROM child c1
JOIN child c2
ON c1.score <= c2.score
WHERE c1.id=3;
What I can't figure out is how to find the rank of a particular parent among all parents based on the aggregate sum of the score field of each parent's children. The SQL query I'm looking for should return the rank of the parent based on passing in its ID.
Thanks in advance for your help!
If there is no index on parent_id, try adding one there. It should speed up the joins.
I believe the following will get you what you want:
-- Count the parents whose total score is at least the target parent's total.
SELECT COUNT(*) AS `rank` FROM (
SELECT SUM(score) AS parentScore
FROM child
GROUP BY parent_id
HAVING parentScore >= (SELECT SUM(score) FROM child WHERE parent_id = _PARENT_ID)
) AS scores
The subquery will only execute once because it is not dependent on the enclosing query. This should minimize trips through the table. This does not account for ties in scores. If you want to handle that, you can select distinct scores.
select p.id, p.name, ps.score, count(ps2.score) as 'rank'
from parent p
join (
select pp.id as 'id', sum(c.score) as 'score'
from parent pp
join child c on pp.id = c.parent_id
group by pp.id
) as ps on p.id = ps.id
join (
select pp.id as 'id', sum(c.score) as 'score'
from parent pp
join child c on pp.id = c.parent_id
group by pp.id
) as ps2 on ps.score <= ps2.score
where p.id = 3;
and you may need at the end:
group by p.id, p.name, ps.score
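A small sketch of the counting idea behind both answers: a parent's rank is the number of parents whose summed child score is at least as high as its own. It assumes Python's sqlite3 module (SQLite in place of MySQL), hypothetical child rows, and a helper named parent_rank that is not part of the original thread.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE child (
        id INTEGER PRIMARY KEY,
        parent_id INTEGER,
        name TEXT,
        score REAL
    );
    -- Hypothetical scores: parent 1 totals 9.0, parent 2 totals 7.5, parent 3 totals 4.0.
    INSERT INTO child (parent_id, name, score) VALUES
        (1,'a',5.0),(1,'b',4.0),(2,'c',7.5),(3,'d',1.5),(3,'e',2.5);
""")

def parent_rank(parent_id):
    """Return the rank of one parent (1 = highest summed child score)."""
    row = conn.execute("""
        SELECT COUNT(*) FROM (
            SELECT SUM(score) AS parent_score
            FROM child
            GROUP BY parent_id
            HAVING SUM(score) >= (SELECT SUM(score) FROM child
                                  WHERE parent_id = ?)
        ) AS scores
    """, (parent_id,)).fetchone()
    return row[0]

for pid in (1, 2, 3):
    print(pid, parent_rank(pid))
```

With tied totals, every tied parent gets the same (lower) rank under this counting rule.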

GROUP BY with MAX date field - erratic results

Have a table containing form data. Each row contains a section_id and field_id. There are 50 distinct fields for each section. As users update an existing field, a new row is inserted with an updated date_modified. This keeps a rolling archive of changes.
The problem is that I'm getting erratic results when pulling the most recent set of fields to display on a page.
I've narrowed down the problem to a couple of fields, and have recreated a portion of the table in question on SQLFiddle.
Schema:
CREATE TABLE IF NOT EXISTS `cTable` (
`section_id` int(5) NOT NULL,
`field_id` int(5) DEFAULT NULL,
`content` text,
`user_id` int(11) NOT NULL,
`date_modified` datetime NOT NULL,
KEY `section_id` (`section_id`),
KEY `field_id` (`field_id`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1;
This query shows all previously edited rows for field_id 39. There are five rows returned:
SELECT cT.*
FROM cTable cT
WHERE
cT.section_id = 123 AND
cT.field_id=39;
Here's what I'm trying to do to pull the most recent row for field_id 39. No rows returned:
SELECT cT.*
FROM cTable cT
INNER JOIN (
SELECT field_id, MAX(date_modified) AS date_modified
FROM cTable GROUP BY field_id
) AS max USING (field_id, date_modified)
WHERE
cT.section_id = 123 AND
cT.field_id=39;
Record Count: 0;
If I try the same query on a different field_id, say 54, I get the correct result:
SELECT cT.*
FROM cTable cT
INNER JOIN (
SELECT field_id, MAX(date_modified) AS date_modified
FROM cTable GROUP BY field_id
) AS max USING (field_id, date_modified)
WHERE
cT.section_id = 123 AND
cT.field_id=54;
Record Count: 1;
Why would same query work on one field_id, but not the other?
In the subquery that computes the maxima, you need to GROUP BY section_id, field_id. Grouping by field_id alone ignores the section id that you filter on, so the maximum date can come from a different section:
SELECT cT.*
FROM cTable cT
INNER JOIN (
SELECT section_id,field_id, MAX(date_modified) AS date_modified
FROM cTable GROUP BY section_id,field_id
) AS max
ON(max.field_id =cT.field_id
AND max.date_modified=cT.date_modified
AND max.section_id=cT.section_id
)
WHERE
cT.section_id = 123 AND
cT.field_id=39;
See Fiddle Demo
You are looking for the max(date_modified) per field_id. But you should look for the max(date_modified) per field_id where the section_id is 123. Otherwise you may find a date for which you find no match later.
SELECT cT.*
FROM cTable cT
INNER JOIN (
SELECT field_id, MAX(date_modified) AS date_modified
FROM cTable
WHERE section_id = 123
GROUP BY field_id
) AS max USING (field_id, date_modified)
WHERE
cT.section_id = 123 AND
cT.field_id=39;
Here is the SQL fiddle: http://www.sqlfiddle.com/#!2/0cefd8/19.
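To make the failure mode concrete, here is a sketch with hypothetical rows in which another section holds the newest edit of field 39. It assumes Python's sqlite3 module (SQLite in place of MySQL), a trimmed-down cTable, and the subquery alias latest instead of max.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE cTable (
        section_id INTEGER,
        field_id INTEGER,
        content TEXT,
        date_modified TEXT
    );
    -- Hypothetical rows: section 456 holds the newest edit of field 39.
    INSERT INTO cTable VALUES
        (123, 39, 'old',   '2014-01-01 10:00:00'),
        (123, 39, 'newer', '2014-02-01 10:00:00'),
        (456, 39, 'other', '2014-03-01 10:00:00');
""")

# Maximum per field_id only: the newest date belongs to section 456,
# so the section 123 filter leaves nothing.
global_max = conn.execute("""
    SELECT cT.content
    FROM cTable cT
    INNER JOIN (SELECT field_id, MAX(date_modified) AS date_modified
                FROM cTable GROUP BY field_id) AS latest
        USING (field_id, date_modified)
    WHERE cT.section_id = 123 AND cT.field_id = 39
""").fetchall()

# Maximum per (section_id, field_id): the join matches within the section.
per_section = conn.execute("""
    SELECT cT.content
    FROM cTable cT
    INNER JOIN (SELECT section_id, field_id, MAX(date_modified) AS date_modified
                FROM cTable GROUP BY section_id, field_id) AS latest
        ON latest.section_id = cT.section_id
       AND latest.field_id = cT.field_id
       AND latest.date_modified = cT.date_modified
    WHERE cT.section_id = 123 AND cT.field_id = 39
""").fetchall()

print(global_max)
print(per_section)
```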

MySQL one to many relationship: GROUP_CONCAT or JOIN or both?

I need help with a MySQL query. I have three tables:
`product_category` (
`id` int(11) NOT NULL auto_increment,
`name` varchar(255) NOT NULL,
PRIMARY KEY (`id`)
);
`order_products` (
`id` int(11) NOT NULL auto_increment,
`name` varchar(255) NOT NULL,
`qty` int(11) NOT NULL,
`unit_price` decimal(11,2) NOT NULL,
`category` int(11) NOT NULL,
`order_id` int (11) NOT NULL,
PRIMARY KEY (`id`)
);
`orders` (
`id` int(11) NOT NULL auto_increment,
`date` varchar(255) NOT NULL,
PRIMARY KEY (`id`)
);
I had a query that calculated subtotal of all orders by product category (in a certain date range). It looked like this:
SELECT
SUM(op.unit_price * op.qty) as amt,
c.name as category_name
FROM
order_products op,
product_category c,
orders o
WHERE
op.category = c.id
AND
op.order_id = o.id
AND
o.date > 'xxxxxxx'
GROUP BY
c.id
This works great, but now I want to add the individual products and their subtotals to each row so that I get a result like this:
c.name|amt|op.name (1) - op.subtotal (1), op.name (2), op.subtotal (2), etc....
I figured out using GROUP_CONCAT that I could get the names to show up pretty easily by adding:
GROUP_CONCAT(op.name) as product_name
to the SELECT clause, but for the life of me I can't figure out how to get the subtotal for each product to show up next to the product name. I have a feeling it involves a combination of joins or CONCAT nested inside GROUP_CONCAT, but nothing I've tried has worked. Any ideas?
@piotrm had an idea that seemed like it should work (below), but for some reason it returns the following:
SELECT `subtotals`
FROM `product_category`
WHERE `c`.`category_name` = 'Fragrance'AND.`amt` = '23164.50'AND.`subtotals` = CAST( 0x6c6f76656c696c7920454454202d203631302e30302c20466f72657665726c696c79202e313235206f7a2045445020726f6c6c657262616c6c202d20313831372e35302c20666f72657665726c696c79206361636865706f74202d2039302e30302c20666f72657665726c696c7920312f38206f756e63652070617266756d206f696c20726f6c6c657262616c6c202d20313833302e30302c20666f72657665726c696c792070657266756d696e6720626f6479206c6f74696f6e202d203938312e30302c20666f72657665726c696c7920332e34206f756e6365206561752064652070617266756d207370726179202009202d203535382e30302c20666f72657665726c696c79205363656e74656420566f746976652043616e646c65202d203132302e30302c20454450202620426f6f6b20736574202d203334332e30302c20666f72657665726c696c7920332e34206f756e6365206561752064652070617266756d207370726179202d2031363831352e3030 AS
BINARY ) ;
As soon as I take out s.subtotal from the original SELECT clause it pulls the correct product names. The JOIN query pulls the products out correctly with their associated category_id and subtotals. I just can't get the two to CONCAT together without creating this mess here. Any other thoughts?
Solution
@piotrm's query was basically right, except GROUP_CONCAT is looking for a collection of strings. So the final query looked like this:
SELECT c.name AS category_name,
SUM( s.subtotal ) AS amt,
GROUP_CONCAT( CONCAT(s.name, ' - ', cast(s.subtotal as char) ) SEPARATOR ', ') AS subtotals
FROM
product_category c
JOIN
(SELECT op.category, op.name, sum(op.qty*op.unit_price) AS subtotal
FROM order_products op
JOIN orders o ON o.id = op.order_id
WHERE o.date > '0'
GROUP BY op.category, op.name ) s
ON s.category = c.id
GROUP BY c.name
Guessing from your query there is also order_id field in your order_products table you didn't mention in the table definition. Your query should then look like:
SELECT c.name AS category_name,
SUM( s.subtotal ) AS amt,
GROUP_CONCAT( CONCAT(s.name, ' - ', s.subtotal ) SEPARATOR ', ' ) AS subtotals
FROM
product_category c
JOIN
( SELECT op.category, op.name, sum(op.qty*op.unit_price) AS subtotal
FROM order_products op
JOIN orders o ON o.id = op.order_id
WHERE o.date > '2012-03-31'
GROUP BY op.category, op.name ) s
ON s.category = c.id
GROUP BY c.name
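The two-level aggregation above (per-product subtotals first, then one concatenated string per category) can be sketched as follows. This assumes Python's sqlite3 module, so the SQLite spellings group_concat(expr, separator) and || replace MySQL's GROUP_CONCAT ... SEPARATOR and CONCAT. The rows are hypothetical, prices are whole numbers, and the orders table and date filter are left out for brevity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE product_category (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE order_products (
        id INTEGER PRIMARY KEY,
        name TEXT,
        qty INTEGER,
        unit_price INTEGER,
        category INTEGER,
        order_id INTEGER
    );
    -- Hypothetical data: one category, two products spread over two orders.
    INSERT INTO product_category VALUES (1, 'Fragrance');
    INSERT INTO order_products (name, qty, unit_price, category, order_id) VALUES
        ('candle', 2, 10, 1, 1), ('candle', 1, 10, 1, 2), ('lotion', 3, 5, 1, 1);
""")

# Inner query: one subtotal per (category, product).
# Outer query: sum the subtotals and join "name - subtotal" strings per category.
rows = conn.execute("""
    SELECT c.name AS category_name,
           SUM(s.subtotal) AS amt,
           group_concat(s.name || ' - ' || s.subtotal, ', ') AS subtotals
    FROM product_category c
    JOIN (SELECT category, name, SUM(qty * unit_price) AS subtotal
          FROM order_products
          GROUP BY category, name) s
      ON s.category = c.id
    GROUP BY c.name
""").fetchall()

category_name, amt, subtotals = rows[0]
print(category_name, amt, subtotals)
```

The order of the items inside the concatenated string is not guaranteed here, so any check should compare the parts rather than the whole string.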
Your db schema is quite weird though, orders table looks like it could be removed and that date moved to order_products, because for every order_products row you have reference to orders table. Usually it is the other way - there are many orders for every product referenced by product_id field in the orders table. Also date column in orders is of type varchar - why not date or datetime?
Try:
SELECT
SUM(op.unit_price * op.qty) as amt,
c.name as category_name,
group_concat(concat(op.name, '-', op.qty, ',' ,
op.unit_price*op.qty) separator '|')
...

Aggregate function not working as expected with subquery

Having some fun with MySQL by asking it difficult questions.
Essentially I have a table full of transactions, and from that I want to determine, for each available product (productid), who (userid) has bought the most of it. The type in the where clause refers to the transaction type, 1 being a purchase.
I have a subquery that on its own returns a list of the summed products bought by each person, and it works well by itself. From this I am trying to pick the max of the summed quantities and group by product, which is a pretty straightforward aggregate. Unfortunately it's giving me funny results! The userid does not correspond correctly to the reported max productid sales.
select
`userid`, `productid`, max(`sumqty`)
from
(select
`userid`, `productid`, sum(`qty`) as `sumqty`
from
`txarchive`
where
`type` = 1
group by `userid`,`productid`) as `t1`
group by `productid`
I have removed all the inner joins to give more verbal results as they don't change the logic of it all.
Here is the structure of tx if you are interested.
id bigint(20) #transaction id
UserID bigint(20) #user id, links to another table.
ProductID bigint(20) #product id, links to another table.
DTG datetime #date and time of transaction
Price decimal(19,4) #price per unit for this transaction
QTY int(11) #QTY of products for this transaction
Type int(11) #transaction type, from purchase to payment etc.
info bigint(20) #information string id, links to another table.
*edit
Working final query (it's biggish):
select
`username`, `productname`, max(`sumqty`)
from
(select
concat(`users`.`firstname`, ' ', `users`.`lastname`) as `username`,
`products`.`name` as `productname`,
sum(`txarchive`.`qty`) as `sumqty`
from
`txarchive`
inner join `users` ON `txarchive`.`userid` = `users`.`id`
inner join `products` ON `txarchive`.`productid` = `products`.`id`
where
`type` = 1
group by `productname`,`username`
order by `productname`,`sumqty` DESC) as `t1`
group by `productname`
order by `sumqty` desc
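Not the best solution (not even guaranteed to work 100% of the time, because it relies on MySQL returning the first row of each group for the non-aggregated userid column):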
select
`userid`, `productid`, max(`sumqty`)
from
( select
`userid`, `productid`, sum(`qty`) as `sumqty`
from
`txarchive`
where
`type` = 1
group by
`productid`
, `userid`
order by
`productid`
, `sumqty` DESC
) as `t1`
group by
`productid`
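A sketch of an alternative that does not depend on row order inside a group: compute each user's total per product, compute the maximum total per product, and join the two so that userid always comes from the row that actually holds the maximum. This is a swapped-in technique, not the poster's query. It assumes Python's sqlite3 module (SQLite in place of MySQL), a trimmed-down txarchive, and hypothetical rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE txarchive (userid INTEGER, productid INTEGER, qty INTEGER, type INTEGER);
    -- Hypothetical purchases (type 1) plus one non-purchase row (type 2).
    INSERT INTO txarchive VALUES
        (1, 10, 2, 1), (1, 10, 3, 1), (2, 10, 4, 1),
        (1, 20, 1, 1), (2, 20, 6, 1), (2, 20, 9, 2);
""")

top_buyers = conn.execute("""
    SELECT t.userid, t.productid, t.sumqty
    FROM (SELECT userid, productid, SUM(qty) AS sumqty
          FROM txarchive
          WHERE type = 1
          GROUP BY userid, productid) AS t
    JOIN (SELECT productid, MAX(sumqty) AS maxqty
          FROM (SELECT userid, productid, SUM(qty) AS sumqty
                FROM txarchive
                WHERE type = 1
                GROUP BY userid, productid) AS per_user
          GROUP BY productid) AS m
      ON m.productid = t.productid AND m.maxqty = t.sumqty
    ORDER BY t.productid
""").fetchall()

for row in top_buyers:
    print(row)
```

Users who tie for the top quantity of a product all appear in the result.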