using count and suppress/ignore group by - mysql

Is it possible to have a COUNT in the select clause that ignores the GROUP BY clause of the same query?
I have this query, which counts the total number of entries. The query is generated generically, so I can't make any far-reaching changes such as adding subqueries.
In some specific cases a GROUP BY is needed to retrieve the correct rows, so the GROUP BY can't be removed:
SELECT count(dv.id) num
FROM `data_voucher` dv
LEFT JOIN `data_voucher_enclosure` de ON de.data_voucher_id=dv.id
WHERE IF(de.id IS NULL,0,1)=0
GROUP BY dv.id

Well, the short answer to your question is that you can't have an aggregate that works on all the results while also having a GROUP BY. That is the whole purpose of GROUP BY: it creates groups that change the behaviour of aggregates:
The GROUP BY clause causes aggregations to occur in groups (naturally) for the columns you name.
(See this blog post, which is only the first result I found on Google on this topic.)
You'd need to redesign your query, the easiest way being to use a subquery, or one hell of a join. But without the schema and a little context on what you want this query to do, I can't give you an alternative that works.
I can only tell you that you're trying to use a hammer to tighten a screw...

I have found an alternative that uses COUNT(DISTINCT ...) and drops the GROUP BY:
SELECT count(distinct dv.id) num
FROM `data_voucher` dv
LEFT JOIN `data_voucher_enclosure` de ON de.data_voucher_id=dv.id
WHERE IF(de.id IS NULL,0,1)=0
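To see the difference between the two queries, here is a minimal sketch. It assumes SQLite (via Python's sqlite3) as a stand-in for MySQL, a made-up three-row data set, and `de.id IS NULL` in place of the equivalent `IF(de.id IS NULL,0,1)=0`, since `IF()` is MySQL-specific.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL; the rows are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE data_voucher (id INTEGER PRIMARY KEY);
    CREATE TABLE data_voucher_enclosure (
        id INTEGER PRIMARY KEY,
        data_voucher_id INTEGER
    );
    INSERT INTO data_voucher (id) VALUES (1), (2), (3);
    -- Only voucher 2 has an enclosure.
    INSERT INTO data_voucher_enclosure (id, data_voucher_id) VALUES (10, 2);
""")

# Shared FROM/WHERE part: vouchers without any enclosure.
base = """
    FROM data_voucher dv
    LEFT JOIN data_voucher_enclosure de ON de.data_voucher_id = dv.id
    WHERE de.id IS NULL
"""

# With GROUP BY, COUNT is computed once per group: one row per voucher.
grouped = conn.execute(
    "SELECT COUNT(dv.id) AS num" + base + "GROUP BY dv.id"
).fetchall()

# Without GROUP BY, COUNT(DISTINCT ...) yields a single total.
total = conn.execute(
    "SELECT COUNT(DISTINCT dv.id) AS num" + base
).fetchall()

print(grouped)  # one row per matching voucher
print(total)    # a single row holding the overall count
```

The grouped query returns as many rows as there are matching vouchers, so the "total" only shows up as the number of rows, not as a value.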

Filtering using WHERE for a subquery

I consider myself a pretty well-versed developer, but this one has me stumped.
The actual use case is somewhat more complicated than this (I have built a data-view framework that allows you to filter and search data), but at its simplest...
Why can't I do something like this?
SELECT
fundraisers.id,
(
SELECT
count(*)
FROM
transactions
WHERE
transactions.fundraiser_id = fundraisers.id) AS total
FROM
fundraisers
WHERE
total > 331
ORDER BY
total DESC
I've also tried doing it as a subquery JOIN instead, but it never seems to return the right count of transactions for the row.
I'm aware I can successfully use HAVING to do this, but I need it to be part of the WHERE clause so that I can combine it with other filters using the right AND/OR conditions.
Any help is appreciated! Thanks folks.
You can use a derived table, in other words a subquery in the FROM clause instead of the select-list.
SELECT t.fundraiser_id, t.total
FROM
fundraisers AS f
JOIN (
SELECT fundraiser_id, COUNT(*) AS total
FROM transactions
GROUP BY fundraiser_id
) AS t ON t.fundraiser_id = f.id
WHERE
t.total > 331
ORDER BY
t.total DESC;
The reason you can't refer to an alias in the WHERE clause is that the conditions of the WHERE clause are evaluated before the expressions in the select-list are evaluated. This is a good thing, because you wouldn't want the select-list to be evaluated for potentially millions of rows that would be filtered out by the conditions anyway. Evaluating the select-list only for rows that are included in the result helps improve performance.
But it means the results of those expressions in the select-list, and their aliases, aren't available to be referenced by conditions in the WHERE clause.
The workaround I show in my example above produces the results in a subquery, which happens before the WHERE clause gets to filter the results.
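As a rough illustration of the derived-table approach, here is a small sketch. It assumes SQLite (through Python's sqlite3) instead of MySQL, invented sample rows, and a much smaller threshold than 331. The extra `OR f.id = ?` condition is hypothetical and only there to show that `t.total` can be mixed freely with other WHERE filters.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL; the data is invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE fundraisers (id INTEGER PRIMARY KEY);
    CREATE TABLE transactions (
        id INTEGER PRIMARY KEY,
        fundraiser_id INTEGER
    );
    INSERT INTO fundraisers (id) VALUES (1), (2), (3);
    -- Fundraiser 1 has 3 transactions, 2 has 2, 3 has 1.
    INSERT INTO transactions (fundraiser_id)
    VALUES (1), (1), (1), (2), (2), (3);
""")

# The count is computed in the derived table t, so the outer WHERE
# can treat t.total like any ordinary column.
rows = conn.execute(
    """
    SELECT t.fundraiser_id, t.total
    FROM fundraisers AS f
    JOIN (
        SELECT fundraiser_id, COUNT(*) AS total
        FROM transactions
        GROUP BY fundraiser_id
    ) AS t ON t.fundraiser_id = f.id
    WHERE t.total > ? OR f.id = ?
    ORDER BY t.total DESC
    """,
    (2, 3),
).fetchall()

print(rows)
```

Note that an inner JOIN drops fundraisers with no transactions at all; a LEFT JOIN with `COALESCE(t.total, 0)` would be one way to keep them, if that matters for your filters.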
I don't know exactly what you want to select, but try this:
select f.id, count(*) as total
FROM fundraisers f
join transactions t on t.fundraiser_id = f.id
GROUP BY f.id
-- an aggregate alias cannot be filtered in WHERE, so HAVING is used here
HAVING total > 331
ORDER BY total DESC

WHY don't aggregate functions work, unless using GROUP BY statement?

To calculate the price of invoices (that have *invoice item*s in a separate table and linked to the invoices), I had written this query:
SELECT `i`.`id`, SUM(ii.unit_price * ii.quantity) invoice_price
FROM (`invoice` i)
JOIN `invoiceitem` ii
ON `ii`.`invoice_id` = `i`.`id`
WHERE `i`.`user_id` = '$user_id'
But it returned only ONE row.
After some research, I learned that I had to add GROUP BY i.id at the end of the query. With this, the results were as expected.
In my opinion, even without GROUP BY i.id, nothing is lost and it should work well!
Please tell me, in a few simple sentences...
Why should I always use the additional GROUP BY i.id? What is lost without it? And, maybe the most practical question: how should I remember that I have left out the additional GROUP BY?
You have to include the GROUP BY because many IDs went into the sum. If you don't specify it, MySQL just picks one id (typically the first it encounters) and sums across the entire result set; with the ONLY_FULL_GROUP_BY SQL mode enabled, it rejects the query instead. GROUP BY tells MySQL to sum (or, more generally, aggregate) separately for each grouped entity.
Why should I always use GROUP BY?
SUM() and the others are aggregate functions. By their very nature they are meant to be used in combination with GROUP BY; without it, the whole result set is treated as a single group.
What is lost without it?
From the documentation:
If you use a group function in a statement containing no GROUP BY clause, it is equivalent to grouping on all rows.
In the end, there is nothing to remember, as these are GROUP BY aggregate functions. You will quickly tell from the result that you have forgotten GROUP BY when the result includes the entire result set (incorrectly), instead of your grouped subsets.
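A quick way to see the "grouping on all rows" behaviour is the sketch below. It assumes SQLite (via Python's sqlite3) rather than MySQL, invented invoice data, and integer prices for simplicity.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL; the data is invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE invoice (id INTEGER PRIMARY KEY, user_id INTEGER);
    CREATE TABLE invoiceitem (
        id INTEGER PRIMARY KEY,
        invoice_id INTEGER,
        unit_price INTEGER,
        quantity INTEGER
    );
    INSERT INTO invoice (id, user_id) VALUES (1, 7), (2, 7);
    INSERT INTO invoiceitem (invoice_id, unit_price, quantity)
    VALUES (1, 10, 2), (1, 5, 1), (2, 3, 4);
""")

join = """
    FROM invoice i
    JOIN invoiceitem ii ON ii.invoice_id = i.id
    WHERE i.user_id = ?
"""

# No GROUP BY: all joined rows form one group, so exactly one row comes back.
ungrouped = conn.execute(
    "SELECT SUM(ii.unit_price * ii.quantity)" + join, (7,)
).fetchall()

# GROUP BY i.id: one sum per invoice.
per_invoice = conn.execute(
    "SELECT i.id, SUM(ii.unit_price * ii.quantity)" + join
    + "GROUP BY i.id ORDER BY i.id",
    (7,),
).fetchall()

print(ungrouped)
print(per_invoice)
```

The single ungrouped value is the grand total over both invoices, which is why the original query "only returned one row".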

mysql max query then JOIN?

I have followed the tutorial over at tizag for the MAX() mysql function and have written the query below, which does exactly what I need. The only trouble is I need to JOIN it to two more tables so I can work with all the rows I need.
$query = "SELECT idproducts, MAX(date) FROM results GROUP BY idproducts ORDER BY MAX(date) DESC";
I have this query below, which has the JOIN I need and works:
$query = ("SELECT *
FROM operators
JOIN products
ON operators.idoperators = products.idoperator JOIN results
ON products.idProducts = results.idproducts
ORDER BY drawndate DESC
LIMIT 20");
Could someone show me how to merge the top query with the JOIN element from my second query? I am new to PHP and MySQL, this being my first adventure into a computer language. I have read and tried really hard to get those two queries to work together, but I have hit a brick wall. I cannot work out how to add the JOIN element to the first query :(
Could some kind person take pity on a newb and help me?
Try this query. The MAX(date) column gets an alias so that the outer query can sort by it:
SELECT
    *
FROM
    operators
    JOIN products
        ON operators.idoperators = products.idoperator
    JOIN
    (
        -- latest result date per product
        SELECT
            idproducts,
            MAX(date) AS max_date
        FROM results
        GROUP BY idproducts
    ) AS t
        ON products.idproducts = t.idproducts
ORDER BY t.max_date DESC
LIMIT 20
JOINs function somewhat independently of aggregation functions; they just change the intermediate result-set upon which the aggregate functions operate. I like to point to the way the MySQL documentation is written, which uses the term 'table_reference' in the SELECT syntax and expands on what that means in the JOIN syntax. Basically, any simple query which has a table specified can expand that table to a complete JOIN clause, and the query will operate in the same basic way, just with a modified intermediate result-set.
I say "intermediate result-set" to hint at the mindset which helped me understand JOINs and aggregation. Understanding the order in which MySQL builds your final result is critical to knowing how to reliably get the results you want. Generally, it starts by looking at the first row of the first table you specify after 'FROM', and decides if it might match by looking at the 'WHERE' clauses. If the row cannot be discarded immediately, it attempts to JOIN that row to the first JOIN specified, and repeats the "will this be discarded by WHERE?" check. This repeats for all JOINs, which either add rows to your result set, remove them, or leave just the one, as appropriate for your JOINs, WHEREs and data. This process builds what I am referring to when I say "intermediate result-set". Somewhere between starting and finishing your complete query, MySQL has in its memory a potentially massive table-like structure of data which it built using the process I just described. Only then does it begin to aggregate (GROUP) the results according to your criteria.
So for your query, it depends on what specifically you are going for (not entirely clear in OP). If you simply want the MAX(date) from the second query, you can simply add that expression to the SELECT clause and then add an aggregation spec to the end:
SELECT *, MAX(date)
FROM operators
...
GROUP BY idproducts
ORDER BY ...
Alternatively, you can add the JOIN section of the second query to the first.
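For what it's worth, here is a small sketch of the "JOIN against the MAX() subquery" variant. It assumes SQLite (via Python's sqlite3) rather than MySQL and invented data; the `name` and `title` columns and the `latest` alias are hypothetical and only there to make the output readable.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL; name/title are made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE operators (idoperators INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE products (
        idproducts INTEGER PRIMARY KEY,
        idoperator INTEGER,
        title TEXT
    );
    CREATE TABLE results (
        idresults INTEGER PRIMARY KEY,
        idproducts INTEGER,
        date TEXT
    );
    INSERT INTO operators VALUES (1, 'Op A');
    INSERT INTO products VALUES (1, 1, 'P1'), (2, 1, 'P2');
    INSERT INTO results (idproducts, date)
    VALUES (1, '2012-01-01'), (1, '2012-03-01'), (2, '2012-02-01');
""")

# The derived table t holds one row per product with its latest date;
# joining it keeps exactly one row per product in the outer result.
rows = conn.execute("""
    SELECT o.name, p.title, t.latest
    FROM operators o
    JOIN products p ON o.idoperators = p.idoperator
    JOIN (
        SELECT idproducts, MAX(date) AS latest
        FROM results
        GROUP BY idproducts
    ) AS t ON p.idproducts = t.idproducts
    ORDER BY t.latest DESC
    LIMIT 20
""").fetchall()

for row in rows:
    print(row)
```

Aggregating inside the subquery first means the outer query needs no GROUP BY of its own, so every selected column is well defined.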

MySQL Joins, Group By, and Ordering the Group By Choice

Is it possible to order the GROUP BY chosen results of a MySQL query w/out using a subquery? I'm finding that, with my large dataset, the subquery adds a significant amount of load time to my query.
Here is a similar situation: how to sort order of LEFT JOIN in SQL query?
This is my code that works, but it takes way too long to load:
SELECT tags.contact_id, n.last
FROM tags
LEFT JOIN ( SELECT * FROM names ORDER BY timestamp DESC ) n
ON (n.contact_id=tags.contact_id)
WHERE tags.tag='$tag'
GROUP BY tags.contact_id
ORDER BY n.last ASC;
I can get a fast result doing a simple join w/ a table name, but the "group by" command gives me the first row of the joined table, not the last row.
I'm not really sure what you're trying to do. Here are some of the problems with your query:
- Selecting n.last, although it is neither in the GROUP BY clause nor an aggregate value. Although MySQL allows this, it's really not a good idea to take advantage of it.
- Needlessly sorting a table before joining, instead of just joining.
- The subquery isn't really doing anything.
I would suggest carefully writing down the desired query results, i.e. "I want the contact id and latest date for each tag" or something similar. It's possible that will lead to a natural, easy-to-write and semantically correct query that is also more efficient than what you showed in the OP.
To answer the question "is it possible to order a GROUP BY query": yes, it's quite easy, and here's an example:
select a, b, sum(c) as `c sum`
from <table_name>
group by a,b
order by `c sum`
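The same pattern as a runnable sketch, assuming SQLite (via Python's sqlite3) instead of MySQL, a hypothetical table `t` with invented rows, and the alias `c_sum` instead of a backtick-quoted name:

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL; table t is hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (a TEXT, b TEXT, c INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [("x", "p", 5), ("x", "p", 1), ("y", "q", 2), ("x", "q", 10)],
)

# ORDER BY runs after grouping, so it can sort on the aggregate's alias.
rows = conn.execute("""
    SELECT a, b, SUM(c) AS c_sum
    FROM t
    GROUP BY a, b
    ORDER BY c_sum
""").fetchall()

print(rows)
```

This sorts the groups themselves; it does not control which underlying row a non-aggregated column is taken from within a group.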
You are doing a LEFT JOIN on contact ID, which implies you want all tag contacts REGARDLESS of finding a match in the names table. Is that really the case, or will a contact ID in the tags table ALWAYS have a matching record in "names"? Additionally, about your column "n.last": is this the person's last name, or the last time something was done (which I would assume is actually the timestamp)?
So, that being said, I would just do a simple direct join:
SELECT DISTINCT
t.contact_id,
n.last
FROM
tags t
JOIN names n
ON t.contact_id = n.contact_id
WHERE
t.tag = '$tag'
ORDER BY
n.last ASC

MySQL - What is the difference between GROUP BY and DISTINCT? [duplicate]

Possible Duplicate:
Is there any difference between Group By and Distinct
What's the difference between GROUP BY and DISTINCT in a MySQL query?
Duplicate of
Is there any difference between GROUP BY and DISTINCT
This has already been discussed here.
If you still want an answer, read on.
GROUP BY and DISTINCT each have their own use.
DISTINCT is used to filter unique records out of the records that satisfy the query criteria.
The GROUP BY clause is used to group the data on which the aggregate functions operate, and the output is returned based on the columns in the GROUP BY clause. It has its own limitations: for example, all the columns in the select list, apart from the aggregate functions, have to be part of the GROUP BY clause.
So even though you can get the same data returned by DISTINCT and by GROUP BY, it's better to use DISTINCT when you are not aggregating. See the example below:
select col1,col2,col3,col4,col5,col6,col7,col8,col9 from table group by col1,col2,col3,col4,col5,col6,col7,col8,col9
can be written as
select distinct col1,col2,col3,col4,col5,col6,col7,col8,col9 from table
It makes your life easier when you have many columns in the select list. But at the same time, if you need to display sum(col10) along with the above columns, then you will have to use GROUP BY. In that case DISTINCT will not work.
e.g.
select col1,col2,col3,col4,col5,col6,col7,col8,col9,sum(col10) from table group by col1,col2,col3,col4,col5,col6,col7,col8,col9
Hope this helps.
DISTINCT works only on the entire row. Don't be misled into thinking SELECT DISTINCT(A), B does something different. This is equivalent to SELECT DISTINCT A, B.
On the other hand GROUP BY creates a group containing all the rows that share each distinct value in a single column (or in a number of columns, or arbitrary expressions). Using GROUP BY you can use aggregate functions such as COUNT and MAX. This is not possible with DISTINCT.
If you want to ensure that all rows in your result set are unique and you do not need to aggregate then use DISTINCT.
For anything more advanced you should use GROUP BY.
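A small sketch of these points, assuming SQLite (via Python's sqlite3) rather than MySQL and a hypothetical `table1` with invented columns `a`, `b`, `c`. The results are sorted in Python so that the comparison does not depend on row order.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL; table1 is hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (a INTEGER, b INTEGER, c INTEGER)")
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?, ?)",
    [(2, 1, 10), (1, 1, 5), (2, 1, 7), (2, 2, 1)],
)

# DISTINCT applies to the whole row; the parentheses around a change nothing.
distinct_rows = sorted(conn.execute(
    "SELECT DISTINCT a, b FROM table1").fetchall())
paren_rows = sorted(conn.execute(
    "SELECT DISTINCT(a), b FROM table1").fetchall())

# GROUP BY on the same columns returns the same set of rows...
grouped_rows = sorted(conn.execute(
    "SELECT a, b FROM table1 GROUP BY a, b").fetchall())

# ...but only GROUP BY lets you add an aggregate per group.
with_sum = conn.execute(
    "SELECT a, b, SUM(c) FROM table1 GROUP BY a, b ORDER BY a, b"
).fetchall()

print(distinct_rows)
print(with_sum)
```

Here the three plain queries give the same three (a, b) pairs, and only the last one can report the per-pair sum of `c`.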
Another difference, which applies only to MySQL (and only to versions before 8.0, where implicit GROUP BY sorting was removed), is that GROUP BY also implies an ORDER BY unless you specify otherwise. Here's what can happen if you use DISTINCT:
SELECT DISTINCT a FROM table1
Results:
2
1
But using GROUP BY the results will come in sorted order:
SELECT a FROM table1 GROUP BY a
Results:
1
2
Because it does not sort, DISTINCT is faster in the cases where you can use either. Note: if you don't need the sorting with GROUP BY, you can add ORDER BY NULL to improve performance.
Care to look at the docs:
http://dev.mysql.com/doc/refman/5.0/en/group-by-functions.html
and
http://dev.mysql.com/doc/refman/5.0/en/select.html