WHY don't aggregate functions work, unless using GROUP BY statement?

WHY don't aggregate functions work, unless using GROUP BY statement? - mysql

To calculate the price of invoices (that have *invoice item*s in a separate table and linked to the invoices), I had written this query:
SELECT `i`.`id`, SUM(ii.unit_price * ii.quantity) invoice_price
FROM (`invoice` i)
JOIN `invoiceitem` ii
ON `ii`.`invoice_id` = `i`.`id`
WHERE `i`.`user_id` = '$user_id'
But it only resulted ONE row.
After research, I got that I had to have GROUP BY i.id at the end of the query. With this, the results were as expected.
From my opinion, even without GROUP BY i.id, nothing is lost and it should work well!
Please in some simple sentences tell me...
Why should I always use the additional!!! GROUP BY i.id, What is lost without it, and maybe as the most functioning question, How should I remember that I have lost the additional GROUP BY?!

You have to include the group by because there are many IDs that went into the sum. If you don't specify it then MySQL just picks the first one, and sums across the entire result set. GroupBy tells MySQL to sum (or generically aggregate) for each Grouped By Entity.

Why should I always use GROUP BY?
SUM() and others are Aggregate Functions. Their very nature requires that they be used in combination with GROUP BY.
What is lost without it?
From the documentation:
If you use a group function in a statement containing no GROUP BY clause, it is equivalent to grouping on all rows.
In the end, there is nothing to remember, as these are GROUP BY aggregate functions. You will quickly tell from the result that you have forgotten GROUP BY when the result includes the entire result set (incorrectly), instead of your grouped subsets.

Related

SQL Query: Joining on a SUM()

I'm trying to run a query that sums the value of items and then JOIN on the value of that SUM.
So in the below code, the Contract_For is what I'm trying to Join on, but I'm not sure if that's possible.
SELECT `items_value`.`ContractId` as `Contract`,
`items_value`.`site` as `SiteID`,
SUM(`items_value`.`value`) as `Contract_For`,
`contractitemlists`.`Text` as `Contracted_Text`
FROM items_value
LEFT JOIN contractitemlists ON (`items_value`.`Contract_For`) = `contractitemlists`.`Ref`;
WHERE `items_value`.`ContractID`='2';
When I've face similar issues in the past, I've just created a view that holds the SUM, then joined to that in another view.
At the moment, the above sample is meant to work for just one dummy value, but it's intended to be stored procedure, where the user selects the ContractID. The error I get at the moment is 'Unknown Column items_value.Contract_For

You cannot use aliases or aggregate using expressions from the SELECT clause anywhere but HAVING and ORDER BY*; you need to make the first "part" a subquery, and then JOIN to that.
It might be easier to understand, though a bit oversimplified and not precisely correct, if you look at it this way as far as order of evaluation goes...
FROM (Note: JOIN is only within a FROM)
WHERE
GROUP BY
SELECT
HAVING
ORDER BY
In actual implementation, "under the hood", most SQL implementations actually use information from each section to optimize other sections (like using some where conditions to reduce records JOINed in a FROM); but this is the conceptual order that must be adhered to.
*In some versions of MSSQL, you cannot use aliases from the SELECT in HAVING or ORDER BY either.
Your query needs to be something like this:
SELECT s.*
, `cil`.`Text` as `Contracted_Text`
FROM (
SELECT `iv`.`ContractId` as `Contract`
, `iv`.`site` as `SiteID`
, SUM(`iv`.`value`) as `Contract_For`
FROM items_value AS iv
WHERE `iv`.`ContractID`='2'
) AS s
LEFT JOIN contractitemlists AS cil ON `s`.`Contract_For` = cil.`Ref`
;
But as others have mentioned, the lack of a GROUP BY is something to be looked into; as in "what if there are multiple site values."

using count and suppress/ignore group by

Is it possible to have count in the select clause with a group by which is suppressed in the count? I need the count to ignore the group by clause
I got this query which is counting the total entries. The query is generic generated and therefore I can't make any comprehensive changes like subqueries etc.
In some specific cases a group by is needed to retrieve the correct rows and because of this the group by can't be removed
SELECT count(dv.id) num
FROM `data_voucher` dv
LEFT JOIN `data_voucher_enclosure` de ON de.data_voucher_id=dv.id
WHERE IF(de.id IS NULL,0,1)=0
GROUP BY dv.id

Is it possible to have count in the select clause with a group by which is suppressed in the count? I need the count to ignore the group by clause
well, the answer to your question is simply you can't have an aggregate that works on all the results, while having a group by statement. That's the whole purpose of the group by to create groups that change the behaviour of aggregates:
The GROUP BY clause causes aggregations to occur in groups (naturally) for the columns you name.
cf this blog post which is only the first result I found on google on this topic.
You'd need to redesign your query, the easiest way being to create a subquery, or a hell of a jointure. But without the schema and a little context on what you want this query to do, I can't give you an alternative that works.
I just can tell you that you're trying to use a hammer to tighten a screw...

Have found an alternative where COUNT DISTINCT is used
SELECT count(distinct dv.id) num
FROM `data_voucher` dv
LEFT JOIN `data_voucher_enclosure` de ON de.data_voucher_id=dv.id
WHERE IF(de.id IS NULL,0,1)=0

MySQL Group By and Order by not playing nicely together

I have read a few post on this, but not seeming to be able to fix my problem.
I am calling two database queries to populate two array's that run along side by side of each other, but they aren't matching, as the order that they come out is different. I believe i have something to do with the Group By, and this may require a sub query, but again a little lost...
Query 1:
SELECT count(bids_bid.total_bid), bidtime_bid, users_usr.company_usr, users_usr.id_usr
FROM bids_bid
INNER JOIN users_usr
ON bids_bid.user_bid = users_usr.id_usr
WHERE auction_bid = 36
GROUP BY user_bid
ORDER BY bidtime_bid ASC
Query 2:
SELECT auction_bid, user_bid, bidtime_bid, bids_bid.total_bid
FROM bids_bid
WHERE auction_bid = 36
ORDER BY bidtime_bid ASC
Even though the 'Order by' is the same the results aren't matching. The users are coming out in a different sequence.
I hope this makes sense, and thanks in advance.
* Update *
I just wanted to add a bit of clarity on what the output I want is. I need to only show 1 result by one user (user_bid) the second query show all users rows. I only need the first one to show the first row entered for each user. So if I could order before the the group and by min date, that would be ace...

It's to be expected. You're fetching fields that are NOT involved in the grouping, and are not part of an aggregate function. MySQL allows such things, but generally the results of the ungrouped/unaggregated functions can be wonky.
Because MySQL is free to chose WHICH of the potentially multiple 'free' rows to choose for the actual result row, you will get different results. Generally it picks the first-encountered 'free choice' result, but that's not defined/guaranteed.

You use grouping when you want unique results in result set according to some
group id (column name). usually grouping is used with aggregate functions such as
(min, max,count,sum..).
Ordering or inner query is nothing to do with result set, i suggest read some introductory
tutorials about grouping and think/treat Sql as a set based language and most of the set theory is applied on sql you'll be fine.

So I was complicating issues that I didn't need to. The solution I found was before.
SELECT users_usr.company_usr,
users_usr.id_usr,
bids_bid.bidtime_bid, min(bidtime_bid) as minbid FROM bids_bid INNER JOIN users_usr ON bids_bid.user_bid = users_usr.id_usr
WHERE auction_bid = 36
GROUP BY id_usr
ORDER BY minbid ASC
Thanks everyone for making me look (try) harder...

How does adding GROUP BY or DISTINCT give the same result set?

SELECT unit.id,
unit.unit_name,
unit.description,
unit.category_id,
city.name,
mealbase.name AS mealbase_name,
unit.province_id,
unit.rooms,
unit.max_people,
unit.thumblocation,
prices.normal_price,
prices.holiday_price
FROM jos_units AS unit,
jos_prices AS prices,
jos_cities AS city,
jos_meal_basis AS mealbase
WHERE prices.unit_id = unit.id
AND city.id = unit.city_id
AND unit.published = 1
AND unit.mealbasis_id = mealbase.id
When I run this query It gives me redundant result set as below.
But If I add
SELECT DISTINCT unit.id Instead of SELECT unit.id at the beginning Or
GROUP BY unit.unit.id at the end. It gives me correct result set as below.
My issue is What's wrong with my query(join above gives redundant result even I have corrected joined them)? Why does the adding SELECT DISTINCT unit.id or GROUP BY unit.unit.id is same for the query(which fixes the issue) here? (DISTINCT AND GROUP BY are different functionalities)
Given that I know adding `SELECT DISTINCT unit.id will remove the redundant results but how does the adding one of the two snippet gives same result set? Obviously SELECT DISTINCT unit.id should remove redundant rows by how does the GROUP BY do it?

Basically you are grouping the results without using an aggregation function (using a COUNT, or a MAX, for examples), thus you get the aggregate row in the same way you would obtain it by selecting DISTINCT objects. If you don't need to aggregate them, DISTINCT is the right thing to do.

join above gives redundant result even I have corrected joined them
why is it?
Thats because of how your tables:
jos_units.
jos_prices.
jos_cities.
jos_meal_basis.
are related to each others.
It seems like you have one to many or many to many relations between those tables. For instance, for each record in the jos_meal_basis, each meal has a unit, so many meals might be measured by the same unit, then when joining the two tables you will get redundant units because of this. The same with other tables.

Your combination in the first query, ie
(unit.id,
unit.unit_name,
unit.description,
unit.category_id,
city.name,
mealbase.name AS mealbase_name,
unit.province_id,
unit.rooms,
unit.max_people,
unit.thumblocation,
prices.normal_price,
prices.holiday_price) has duplicates and so you are getting more than 1 rows for the same combination.
When you use distinct clause or group by it removes duplicates in your above combination. Hope this helps you.

GROUP BY is primarily used if you want to use aggregate or group functions. For example if you wanted to find the number of rows that match you could do
SELECT
id
, COUNT(id) num_rows
FROM
...
GROUP BY id
because the COUNT is an aggregate function you need to group by the other columns. If you aren't doing any aggregate functions, GROUP BY is essentially just aggregating the rows up (if that's the way you've written it) causing only one row - the same as DISTINCT.

mysql max query then JOIN?

I have followed the tutorial over at tizag for the MAX() mysql function and have written the query below, which does exactly what I need. The only trouble is I need to JOIN it to two more tables so I can work with all the rows I need.
$query = "SELECT idproducts, MAX(date) FROM results GROUP BY idproducts ORDER BY MAX(date) DESC";
I have this query below, which has the JOIN I need and works:
$query = ("SELECT *
FROM operators
JOIN products
ON operators.idoperators = products.idoperator JOIN results
ON products.idProducts = results.idproducts
ORDER BY drawndate DESC
LIMIT 20");
Could someone show me how to merge the top query with the JOIN element from my second query? I am new to php and mysql, this being my first adventure into a computer language I have read and tried real hard to get those two queries to work, but I am at a brick wall. I cannot work out how to add the JOIN element to the first query :(
Could some kind person take pity on a newb and help me?

Try this query.
SELECT
*
FROM
operators
JOIN products
ON operators.idoperators = products.idoperator
JOIN
(
SELECT
idproducts,
MAX(date)
FROM results
GROUP BY idproducts
) AS t
ON products.idproducts = t.idproducts
ORDER BY drawndate DESC
LIMIT 20

JOINs function somewhat independently of aggregation functions, they just change the intermediate result-set upon which the aggregate functions operate. I like to point to the way the MySQL documentation is written, which hints uses the term 'table_reference' in the SELECT syntax, and expands on what that means in JOIN syntax. Basically, any simple query which has a table specified can simply expand that table to a complete JOIN clause and the query will operate the same basic way, just with a modified intermediate result-set.
I say "intermediate result-set" to hint at the mindset which helped me understand JOINS and aggregation. Understanding the order in which MySQL builds your final result is critical to knowing how to reliably get the results you want. Generally, it starts by looking at the first row of the first table you specify after 'FROM', and decides if it might match by looking at 'WHERE' clauses. If it is not immediately discardable, it attempts to JOIN that row to the first JOIN specified, and repeats the "will this be discarded by WHERE?". This repeats for all JOINs, which either add rows to your results set, or remove them, or leaves just the one, as appropriate for your JOINs, WHEREs and data. This process builds what I am referring to when I say "intermediate result-set". Somewhere between starting and finishing your complete query, MySQL has in it's memory a potentially massive table-like structure of data which it built using the process I just described. Only then does it begin to aggregate (GROUP) the results according to your criteria.
So for your query, it depends on what specifically you are going for (not entirely clear in OP). If you simply want the MAX(date) from the second query, you can simply add that expression to the SELECT clause and then add an aggregation spec to the end:
SELECT *, MAX(date)
FROM operators
...
GROUP BY idproducts
ORDER BY ...
Alternatively, you can add the JOIN section of the second query to the first.

We Keep Coding

html mysql json google-apps-script actionscript-3 ms-access google-chrome google-maps reporting-services sql-server-2008

WHY don't aggregate functions work, unless using GROUP BY statement? - mysql

You have to include the group by because there are many IDs that went into the sum. If you don't specify it then MySQL just picks the first one, and sums across the entire result set. GroupBy tells MySQL to sum (or generically aggregate) for each Grouped By Entity.

Related

SQL Query: Joining on a SUM()

using count and suppress/ignore group by

MySQL Group By and Order by not playing nicely together

How does adding GROUP BY or DISTINCT give the same result set?

mysql max query then JOIN?

Categories

Resources