Aggregate function not working as expected with subquery

Having some fun with MySQL by asking it difficult questions.
Essentially I have a table full of transactions, and from it I want to determine, for each available product (productid), which user (userid) has bought the most. The type in the where clause refers to the transaction type, 1 being a purchase.
I have a subquery that on its own returns the summed quantity of each product bought by each person, and it works well by itself. From this I am trying to pick the max of the summed quantities and group by product, which should be a pretty straightforward aggregate. Unfortunately it's giving me funny results! The userid returned is not the user who actually has the reported max sumqty for that productid.
select
`userid`, `productid`, max(`sumqty`)
from
(select
`userid`, `productid`, sum(`qty`) as `sumqty`
from
`txarchive`
where
`type` = 1
group by `userid`,`productid`) as `t1`
group by `productid`
I have removed all the inner joins (they only make the results more readable) as they don't change the logic of it all.
Here is the structure of tx if you are interested.
id bigint(20) #transaction id
UserID bigint(20) #user id, links to another table.
ProductID bigint(20) #product id, links to another table.
DTG datetime #date and time of transaction
Price decimal(19,4) #price per unit for this transaction
QTY int(11) #QTY of products for this transaction
Type int(11) #transaction type, from purchase to payment etc.
info bigint(20) #information string id, links to another table.
Edit:
Working final query (it's biggish):
select
`username`, `productname`, max(`sumqty`)
from
(select
concat(`users`.`firstname`, ' ', `users`.`lastname`) as `username`,
`products`.`name` as `productname`,
sum(`txarchive`.`qty`) as `sumqty`
from
`txarchive`
inner join `users` ON `txarchive`.`userid` = `users`.`id`
inner join `products` ON `txarchive`.`productid` = `products`.`id`
where
`type` = 1
group by `productname`,`username`
order by `productname`,`sumqty` DESC) as `t1`
group by `productname`
order by `sumqty` desc

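Not the best solution (not even guaranteed to work 100% of the time, because MySQL is free to return the userid from any row of each group, regardless of the ORDER BY in the subquery):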
select
`userid`, `productid`, max(`sumqty`)
from
( select
`userid`, `productid`, sum(`qty`) as `sumqty`
from
`txarchive`
where
`type` = 1
group by
`productid`
, `userid`
order by
`productid`
, `sumqty` DESC
) as `t1`
group by
`productid`
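For a result that does not depend on which row MySQL happens to pick, one option is to join the per-user totals back to the per-product maximum. The sketch below is only an illustration under assumptions, not the asker's final query: it runs the SQL through Python's sqlite3 module on a tiny made-up txarchive table, and the sample rows and the alias names (totals, best, maxqty) are hypothetical. The SQL itself uses only plain derived tables, so it should carry over to MySQL.

```python
import sqlite3

# In-memory stand-in for the MySQL table; the rows are invented for the demo.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE txarchive (userid INTEGER, productid INTEGER, qty INTEGER, type INTEGER);
INSERT INTO txarchive VALUES (1, 100, 2, 1);
INSERT INTO txarchive VALUES (1, 100, 3, 1);
INSERT INTO txarchive VALUES (2, 100, 4, 1);
INSERT INTO txarchive VALUES (1, 200, 1, 1);
INSERT INTO txarchive VALUES (2, 200, 6, 1);
INSERT INTO txarchive VALUES (2, 100, 50, 2);  -- not a purchase, must be ignored
""")

TOP_BUYERS_SQL = """
SELECT totals.userid, totals.productid, totals.sumqty
FROM (
    -- purchases summed per user and product
    SELECT userid, productid, SUM(qty) AS sumqty
    FROM txarchive
    WHERE type = 1
    GROUP BY userid, productid
) AS totals
JOIN (
    -- the largest of those sums for each product
    SELECT productid, MAX(sumqty) AS maxqty
    FROM (
        SELECT userid, productid, SUM(qty) AS sumqty
        FROM txarchive
        WHERE type = 1
        GROUP BY userid, productid
    ) AS per_user
    GROUP BY productid
) AS best
  ON best.productid = totals.productid
 AND best.maxqty = totals.sumqty
ORDER BY totals.productid
"""

top_buyers = conn.execute(TOP_BUYERS_SQL).fetchall()
for userid, productid, sumqty in top_buyers:
    print(userid, productid, sumqty)
```

If two users tie for the top quantity of a product, this returns one row for each of them, which is usually preferable to silently dropping one.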

Related

A column returns null value in SQL statement containing multiple SELECTs

I inherited some code from a previous developer. I intend to rebuild the application from scratch, but I have to add some functionality before I proceed.
Firstly, it's a 62-column table that has to do with accounting, and I also have to fetch values from different tables with a single call to get the values I need before insertion.
Let's say I need to make an insertion into table dailysales and I need to get values from tables a, b, c and d at the same time.
I already have an SQL statement for fetching these values, and it works fine except that one particular column keeps returning NULL.
Here's my code:
SELECT `gds_pnr_ref`, `transaction_date`,
(SELECT `lastname` FROM `a` WHERE `id` = `staff` LIMIT 1) as `lastname`,
(SELECT `firstname` FROM `a` WHERE `id` = `staff` LIMIT 1) as `firstname`,
(SELECT `department_name` FROM `b` WHERE `id` = `staff_department` LIMIT 1) as `department`,
(SELECT `name` FROM `b` WHERE `memo_serial` = '$some_value' LIMIT 1) as `pax_name`,
(SELECT `customer_name` FROM `c` WHERE `id` = `customer_name` LIMIT 1) as `customer`,
travel_product,
(SELECT `vendor_name` FROM `c` WHERE `id` = `vendor` LIMIT 1) as `vendor`
FROM `d` WHERE `id` = '$some_value' LIMIT 1
The column (SELECT customer_name FROM c WHERE id = customer_name LIMIT 1) as customer always returns NULL, but when I run it independently it gives me the appropriate value.
I'm very much open to a better solution for going about this.

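You should always qualify column names in a query. In the failing subquery, the unqualified customer_name most likely resolves to the column in c (the innermost table) rather than the one in d, so the condition compares c.id with c.customer_name and never matches. Presumably, you intend something like this: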
SELECT d.`gds_pnr_ref`, d.`transaction_date`,
(SELECT a.`lastname` FROM `a` WHERE a.`id` = d.`staff` LIMIT 1) as `lastname`,
(SELECT a.`firstname` FROM `a` WHERE a.`id` = d.`staff` LIMIT 1) as `firstname`,
(SELECT b.`department_name` FROM `b` WHERE b.`id` = d.`staff_department` LIMIT 1) as `department`,
(SELECT b.`name` FROM `b` WHERE b.`memo_serial` = ? LIMIT 1) as `pax_name`,
(SELECT c.`customer_name` FROM `c` WHERE c.`id` = d.`customer_name` LIMIT 1) as `customer`,
d.travel_product,
(SELECT c.`vendor_name` FROM `c` WHERE c.`id` = d.`vendor` LIMIT 1) as `vendor`
FROM `d`
WHERE d.`id` = ?
LIMIT 1;
I have to guess where the columns come from -- so this might not be 100% correct.
Notice that I also replaced the string variables with the ? placeholder. This is a reminder that you should be using parameters for such values.
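The name-resolution problem can be reproduced on a small scale. The following is a minimal sketch using Python's sqlite3 module as a stand-in for MySQL (both resolve an unqualified column against the innermost query first). The two-column versions of tables c and d and their sample rows are assumptions made up for the demo, and the ? placeholder is bound the same way the answer recommends.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical cut-down tables: both have a column called customer_name.
CREATE TABLE c (id INTEGER PRIMARY KEY, customer_name TEXT);
CREATE TABLE d (id INTEGER PRIMARY KEY, customer_name INTEGER);
INSERT INTO c VALUES (1, 'Acme Travel');
INSERT INTO d VALUES (5, 1);  -- d.customer_name holds the id of a row in c
""")

# Unqualified: inside the subquery, customer_name means c.customer_name,
# so the condition is effectively c.id = c.customer_name and finds nothing.
UNQUALIFIED = """
SELECT (SELECT customer_name FROM c WHERE id = customer_name LIMIT 1) AS customer
FROM d
WHERE id = ?
"""

# Qualified: the subquery is explicitly correlated with the outer row of d.
QUALIFIED = """
SELECT (SELECT c.customer_name FROM c WHERE c.id = d.customer_name LIMIT 1) AS customer
FROM d
WHERE d.id = ?
"""

unqualified_result = conn.execute(UNQUALIFIED, (5,)).fetchone()[0]
qualified_result = conn.execute(QUALIFIED, (5,)).fetchone()[0]
print(unqualified_result, qualified_result)
```

Run on its own, the subquery has no outer d to shadow, which is consistent with the asker seeing the expected value only when testing it independently with a literal id.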

Thanks guys, but this is the query I finally went with; it returns all the values I need, with none of them NULL.
SELECT `a`.`currency`,
`a`.`vendor_name`,
CONCAT(`c`.`lastname`, ' ', `c`.`firstname`) AS `actioned_by`,
`e`.`department_name` AS `department`,
`f`.`customer_name` AS `customer`,
`g`.`currency_name` AS `fl_currency`,
`b`.`name`,
`b`.`nuc`,
`b`.`tax`,
`b`.`comm` AS `comm_percen`,
`b`.`comm_tax` AS `comm_tax_value`,
`b`.`actual_comm`,
`b`.`service_charge`,
`b`.`dip`,
SUM(`b`.`vendor`) AS payable,
`b`.`charge` AS receivable
FROM ((((((`d`
INNER JOIN `b` ON `d`.`id` = `b`.`memo_serial`)
INNER JOIN `a` ON `d`.`vendor` = `a`.`id`)
INNER JOIN `c` ON `d`.`staff` = `c`.`id`)
INNER JOIN `e` ON `d`.`staff_department` = `e`.`id`)
INNER JOIN `f` ON `d`.`customer_name` = `f`.`id`)
INNER JOIN `g` ON `a`.`currency` = `g`.`id`) WHERE `d`.`id` = '$some_value'
Using subqueries had some limitations, like when I needed to pull out multiple entries for a particular foreign key in a particular table: it kept returning the first row's value only. So I ended up using INNER JOIN to pull from 6 different tables to get my results, and it's way neater.

Improve query of getting row with max value

I have the following tables:
CREATE TABLE IF NOT EXISTS `Customers` (
`id` INT AUTO_INCREMENT,
`name` VARCHAR(20) NOT NULL,
PRIMARY KEY(`id`)
);
CREATE TABLE IF NOT EXISTS `Orders` (
`id` INT AUTO_INCREMENT,
`id_cust` INT NOT NULL,
`descr` VARCHAR(40),
`price` INT NOT NULL,
PRIMARY KEY(`id`),
FOREIGN KEY(`id_cust`) REFERENCES `Customers`(`id`)
);
One customer can have many orders. I want to get the id_cust and the order total of the customer who paid the most (one person).
My query:
SELECT cust, max_orders_sum
FROM
(
(
SELECT MAX(orders_sum) AS max_orders_sum
FROM (
SELECT o.id_cust AS cust, SUM(o.price) AS orders_sum
FROM Orders AS o
GROUP BY o.id_cust
) AS same_query0
) AS step1
INNER JOIN
(
SELECT o.id_cust AS cust, SUM(o.price) AS orders_sum
FROM Orders AS o
GROUP BY o.id_cust
) AS same_query1
ON (step1.max_orders_sum = same_query1.orders_sum)
);
Main problem:
As you can see, it has two identical parts: same_query0 and same_query1. Is there any way to get rid of the duplication?
Or if you know a better way to reach my goal, please share.
I found one simple solution:
SELECT o.id_cust AS cust, SUM(o.price) AS orders_sum
FROM Orders AS o
GROUP BY o.id_cust
ORDER BY orders_sum DESC LIMIT 1;
But this is not a direct way to solve the problem.

I don't think you can do much better than what you've already done, unless you create a view like
create view v_cust_tot as
select id_cust, sum(price) as cust_tot
from Orders
group by id_cust
With that you'd be able to rewrite your query like this
select id_cust, cust_tot
from v_cust_tot
where cust_tot = (select max(cust_tot) from v_cust_tot)
This would only improve the compactness of the query; I think performance would be the same, as the execution plan would be almost identical.

Another nice solution:
select id_cust, sum(price) from orders group by id_cust having sum(price) =
(select max(prc) from
(select sum(price) as prc from orders group by id_cust) as tb);
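A common table expression is another way to write the per-customer sum only once without creating a view. This is a sketch under assumptions: MySQL supports WITH from version 8.0 onward, so it does not help on older servers, and here the SQL is run through Python's sqlite3 module with made-up rows. The names totals and cust_tot are illustrative, not taken from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE Orders (
    id INTEGER PRIMARY KEY,
    id_cust INTEGER NOT NULL REFERENCES Customers(id),
    descr TEXT,
    price INTEGER NOT NULL
);
-- Invented sample data: customer 2 has the largest total (50).
INSERT INTO Customers VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cid');
INSERT INTO Orders (id_cust, descr, price) VALUES
    (1, 'a', 10), (1, 'b', 20), (2, 'c', 50), (3, 'd', 5);
""")

TOP_CUSTOMER_SQL = """
WITH totals AS (
    -- the aggregation is written once and referenced twice below
    SELECT id_cust, SUM(price) AS cust_tot
    FROM Orders
    GROUP BY id_cust
)
SELECT id_cust, cust_tot
FROM totals
WHERE cust_tot = (SELECT MAX(cust_tot) FROM totals)
"""

top_customers = conn.execute(TOP_CUSTOMER_SQL).fetchall()
print(top_customers)
```

Unlike ORDER BY ... LIMIT 1, this form returns every customer who shares the highest total when there is a tie.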

MySQL: How do I select fields from multiple tables for insert to a third table?

I am implementing a time tracking solution for our small company. I have this query that does an insert into a table through a Perl script. The basic query works fine, but I have two inputs, project_id and category_id, that I need to use to select the id from another table for the insert.
INSERT INTO `time_entries` (`project_id`, `user_id`, `category_id`, `start`)
SELECT a.`project_id`, a.`user_id`, a.`category_id`, a.`start` FROM
(SELECT
(SELECT `id` FROM `projects` WHERE `title` = $scanin[0]) `project_id`,
$scanin[1] `user_id`,
(SELECT `id` FROM `categories` WHERE `barcode` = $scanin[2]) `category_id`,
NOW() `start`) a
WHERE NOT EXISTS
( SELECT 1 FROM `time_entries` WHERE `project_id` = (SELECT `id` FROM `projects` WHERE `title` = $scanin[0])
AND `user_id` = $scanin[1]
AND `category_id` = $scanin[2]
AND `end` = '0000-00-00 00:00:00')
It works fine if I am selecting from one table for the insert but obviously won't work with two tables. Is it even possible to do this? I am pretty good with simple SQL statements, but this is complex and joins have always been a problem for me. I just don't do a lot of it.
time_entries    projects    categories
------------    --------    ----------
id              id          id
project_id      title       barcode
user_id
category_id
start
end

That is a pretty obfuscated query, with way too many unnecessary nested queries.
First let's clean it up, assuming the following database structure:
projects categories time_entries
-------- ---------- ------------
id id id
cat_id title project_id
usr_id user_id
title user_id
category_id
start
end
We can simplify your query to a more developer-friendly version:
INSERT INTO `time_entries` (`project_id`, `user_id`, `category_id`, `start`)
SELECT project_id, user_id, category_id, now()
FROM projects JOIN categories ON projects.cat_id = categories.id
WHERE project_id NOT IN (
SELECT project_id
FROM time_entries
WHERE title = $scanin[0]
AND `user_id` = $scanin[1]
AND `category_id` = $scanin[2]
AND `end` = '0000-00-00 00:00:00'
)
OK, so now it should be much easier to add another table as you requested by using the following pattern:
INSERT INTO time_e... , column_n, column_n_plus_1
....
FROM proj....
JOIN table_n on id_n = project_id
JOIN table_n_plus_one on id_n_plus_one = project_id
....

Adding an index gave me the behavior I was looking for.
ALTER TABLE `time_entries` ADD UNIQUE `unique_record_index`(`project_id`,`user_id`,`category_id`,`end`)
Thanks for the help!
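For the original question of pulling ids from two lookup tables in a single insert, a join can feed INSERT ... SELECT directly. The sketch below works from the question's schema (projects.title, categories.barcode) and is run through Python's sqlite3 module, so several details are assumptions: CURRENT_TIMESTAMP stands in for NOW(), the `end` column is assumed to default to the zero-date marker used in the question, the clock_in helper and the sample rows are hypothetical, and the NOT EXISTS check compares the looked-up category id rather than the raw barcode.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE projects (id INTEGER PRIMARY KEY, title TEXT);
CREATE TABLE categories (id INTEGER PRIMARY KEY, barcode TEXT);
CREATE TABLE time_entries (
    id INTEGER PRIMARY KEY,
    project_id INTEGER,
    user_id INTEGER,
    category_id INTEGER,
    start TEXT,
    `end` TEXT DEFAULT '0000-00-00 00:00:00'  -- assumed marker for an open entry
);
INSERT INTO projects (id, title) VALUES (10, 'Website');
INSERT INTO categories (id, barcode) VALUES (20, 'CAT-001');
""")

INSERT_SQL = """
INSERT INTO time_entries (project_id, user_id, category_id, start)
SELECT p.id, :user_id, c.id, CURRENT_TIMESTAMP
FROM projects AS p
JOIN categories AS c ON c.barcode = :barcode
WHERE p.title = :title
  AND NOT EXISTS (
      -- skip the insert while the same user has an open entry
      SELECT 1
      FROM time_entries AS t
      WHERE t.project_id = p.id
        AND t.user_id = :user_id
        AND t.category_id = c.id
        AND t.`end` = '0000-00-00 00:00:00'
  )
"""

def clock_in(title, user_id, barcode):
    """Insert an open time entry unless one already exists (hypothetical helper)."""
    conn.execute(INSERT_SQL, {"title": title, "user_id": user_id, "barcode": barcode})

clock_in("Website", 7, "CAT-001")
clock_in("Website", 7, "CAT-001")  # second scan finds the open entry and inserts nothing

entries = conn.execute(
    "SELECT project_id, user_id, category_id FROM time_entries"
).fetchall()
print(entries)
```

Binding the scanned values as parameters also avoids interpolating $scanin values into the SQL string.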

GROUP BY with MAX date field - erratic results

I have a table containing form data. Each row contains a section_id and field_id. There are 50 distinct fields for each section. As users update an existing field, a new row is inserted with an updated date_modified. This keeps a rolling archive of changes.
The problem is that I'm getting erratic results when pulling the most recent set of fields to display on a page.
I've narrowed down the problem to a couple of fields, and have recreated a portion of the table in question on SQLFiddle.
Schema:
CREATE TABLE IF NOT EXISTS `cTable` (
`section_id` int(5) NOT NULL,
`field_id` int(5) DEFAULT NULL,
`content` text,
`user_id` int(11) NOT NULL,
`date_modified` datetime NOT NULL,
KEY `section_id` (`section_id`),
KEY `field_id` (`field_id`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1;
This query shows all previously edited rows for field_id 39. There are five rows returned:
SELECT cT.*
FROM cTable cT
WHERE
cT.section_id = 123 AND
cT.field_id=39;
Here's what I'm trying to do to pull the most recent row for field_id 39. No rows returned:
SELECT cT.*
FROM cTable cT
INNER JOIN (
SELECT field_id, MAX(date_modified) AS date_modified
FROM cTable GROUP BY field_id
) AS max USING (field_id, date_modified)
WHERE
cT.section_id = 123 AND
cT.field_id=39;
Record Count: 0;
If I try the same query on a different field_id, say 54, I get the correct result:
SELECT cT.*
FROM cTable cT
INNER JOIN (
SELECT field_id, MAX(date_modified) AS date_modified
FROM cTable GROUP BY field_id
) AS max USING (field_id, date_modified)
WHERE
cT.section_id = 123 AND
cT.field_id=54;
Record Count: 1;
Why would same query work on one field_id, but not the other?

In the subquery that computes the maximum dates, you need to GROUP BY section_id, field_id. Using just GROUP BY field_id ignores the section id, which is the column you are filtering on, so the maximum can come from a different section.
SELECT cT.*
FROM cTable cT
INNER JOIN (
SELECT section_id,field_id, MAX(date_modified) AS date_modified
FROM cTable GROUP BY section_id,field_id
) AS max
ON(max.field_id =cT.field_id
AND max.date_modified=cT.date_modified
AND max.section_id=cT.section_id
)
WHERE
cT.section_id = 123 AND
cT.field_id=39;

You are looking for the max(date_modified) per field_id. But you should look for the max(date_modified) per field_id where the section_id is 123. Otherwise you may find a date for which you find no match later.
SELECT cT.*
FROM cTable cT
INNER JOIN (
SELECT field_id, MAX(date_modified) AS date_modified
FROM cTable
WHERE section_id = 123
GROUP BY field_id
) AS max USING (field_id, date_modified)
WHERE
cT.section_id = 123 AND
cT.field_id=39;
Here is the SQL fiddle: http://www.sqlfiddle.com/#!2/0cefd8/19.
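The effect described in both answers can be shown with three rows. This is a small sketch using Python's sqlite3 module in place of MySQL, with invented data: field 39 exists in sections 123 and 456, and the newest edit overall belongs to section 456. The alias latest and the trimmed-down column list are assumptions for the demo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE cTable (
    section_id INTEGER NOT NULL,
    field_id INTEGER,
    content TEXT,
    date_modified TEXT NOT NULL
);
INSERT INTO cTable VALUES (123, 39, 'old',   '2014-01-01 10:00:00');
INSERT INTO cTable VALUES (123, 39, 'new',   '2014-02-01 10:00:00');
INSERT INTO cTable VALUES (456, 39, 'other', '2014-03-01 10:00:00');
""")

# Grouping by field_id alone: the max date for field 39 comes from section 456,
# so no row of section 123 can match it.
BROKEN = """
SELECT cT.content
FROM cTable cT
INNER JOIN (
    SELECT field_id, MAX(date_modified) AS date_modified
    FROM cTable
    GROUP BY field_id
) AS latest USING (field_id, date_modified)
WHERE cT.section_id = 123 AND cT.field_id = 39
"""

# Grouping by section_id and field_id: each section gets its own max date.
FIXED = """
SELECT cT.content
FROM cTable cT
INNER JOIN (
    SELECT section_id, field_id, MAX(date_modified) AS date_modified
    FROM cTable
    GROUP BY section_id, field_id
) AS latest USING (section_id, field_id, date_modified)
WHERE cT.section_id = 123 AND cT.field_id = 39
"""

broken_rows = conn.execute(BROKEN).fetchall()
fixed_rows = conn.execute(FIXED).fetchall()
print(broken_rows, fixed_rows)
```

This also suggests why field 54 appeared to work in the question: presumably its newest row happened to belong to section 123.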

MySQL one to many relationship: GROUP_CONCAT or JOIN or both?

I need help with a MySQL query. I have three tables:
`product_category` (
`id` int(11) NOT NULL auto_increment,
`name` varchar(255) NOT NULL,
PRIMARY KEY (`id`)
);
`order_products` (
`id` int(11) NOT NULL auto_increment,
`name` varchar(255) NOT NULL,
`qty` int(11) NOT NULL,
`unit_price` decimal(11,2) NOT NULL,
`category` int(11) NOT NULL,
`order_id` int (11) NOT NULL,
PRIMARY KEY (`id`)
);
`orders` (
`id` int(11) NOT NULL auto_increment,
`date` varchar(255) NOT NULL,
PRIMARY KEY (`id`)
);
I had a query that calculated subtotal of all orders by product category (in a certain date range). It looked like this:
SELECT
SUM(op.unit_price * op.qty) as amt,
c.name as category_name
FROM
order_products op,
product_category c,
orders o
WHERE
op.category = c.id
AND
op.order_id = o.id
AND
o.date > 'xxxxxxx'
GROUP BY
c.id
This works great, but now I want to add the individual products and their subtotals to each row so that I get a result like this:
c.name|amt|op.name (1) - op.subtotal (1), op.name (2), op.subtotal (2), etc....
I figured out using GROUP_CONCAT that I could get the names to show up pretty easily by adding:
GROUP_CONCAT(op.name) as product_name
to the SELECT clause, but for the life of me I can't figure out how to get the subtotal for each product to show up next to the product name. I have a feeling it involves a combination of joins or CONCAT nested inside GROUP_CONCAT, but nothing I've tried has worked. Any ideas?
@piotrm had an idea that seemed like it should work (below), but for some reason it returns the following:
SELECT `subtotals`
FROM `product_category`
WHERE `c`.`category_name` = 'Fragrance'AND.`amt` = '23164.50'AND.`subtotals` = CAST( 0x6c6f76656c696c7920454454202d203631302e30302c20466f72657665726c696c79202e313235206f7a2045445020726f6c6c657262616c6c202d20313831372e35302c20666f72657665726c696c79206361636865706f74202d2039302e30302c20666f72657665726c696c7920312f38206f756e63652070617266756d206f696c20726f6c6c657262616c6c202d20313833302e30302c20666f72657665726c696c792070657266756d696e6720626f6479206c6f74696f6e202d203938312e30302c20666f72657665726c696c7920332e34206f756e6365206561752064652070617266756d207370726179202009202d203535382e30302c20666f72657665726c696c79205363656e74656420566f746976652043616e646c65202d203132302e30302c20454450202620426f6f6b20736574202d203334332e30302c20666f72657665726c696c7920332e34206f756e6365206561752064652070617266756d207370726179202d2031363831352e3030 AS
BINARY ) ;
As soon as I take out s.subtotal from the original SELECT clause it pulls the correct product names. The JOIN query pulls the products out correctly with their associated category_id and subtotals. I just can't get the two to CONCAT together without creating this mess here. Any other thoughts?
Solution
@piotrm's query was basically right, except GROUP_CONCAT is looking for a collection of strings. So the final query looked like this:
SELECT c.name AS category_name,
SUM( s.subtotal ) AS amt,
GROUP_CONCAT( CONCAT(s.name, ' - ', cast(s.subtotal as char) ) SEPARATOR ', ') AS subtotals
FROM
product_category c
JOIN
(SELECT op.category, op.name, sum(op.qty*op.unit_price) AS subtotal
FROM order_products op
JOIN orders o ON o.id = op.order_id
WHERE o.date > '0'
GROUP BY op.category, op.name ) s
ON s.category = c.id
GROUP BY c.name

Guessing from your query, there is also an order_id field in your order_products table that you didn't mention in the table definition. Your query should then look like:
SELECT c.name AS category_name,
SUM( s.subtotal ) AS amt,
GROUP_CONCAT( CONCAT(s.name, ' - ', s.subtotal ) SEPARATOR ', ' ) AS subtotals
FROM
product_category c
JOIN
( SELECT op.category, op.name, sum(op.qty*op.unit_price) AS subtotal
FROM order_products op
JOIN orders o ON o.id = op.order_id
WHERE o.date > '2012-03-31'
GROUP BY op.category, op.name ) s
ON s.category = c.id
GROUP BY c.name
Your db schema is quite weird though. The orders table looks like it could be removed and the date moved to order_products, because every order_products row has a reference to the orders table. Usually it is the other way around: there are many orders for every product, referenced by a product_id field in the orders table. Also, the date column in orders is of type varchar. Why not date or datetime?

Try:
SELECT
SUM(op.unit_price * op.qty) as amt,
c.name as category_name,
group_concat(concat(op.name, '-', op.qty, ',' ,
op.unit_price*op.qty) separator '|')
...
