Simplify select with count - mysql

I need to get some values from a database and count all rows.
I wrote this code:
SELECT author, alias, (select COUNT(*) from registry WHERE status='OK' AND type='H1') AS count
FROM registry
WHERE status='OK' AND type='H1'
It works, but how can I simplify this code? Both WHERE conditions are identical.

If the query is returning the resultset you need, with the "total" count of rows (independent of author and alias), with the same exact value for "count" repeated on each row, we could rewrite the query like this:
SELECT t.author
, t.alias
, s.count
FROM registry t
CROSS
JOIN ( SELECT COUNT(*) AS `count`
FROM registry c
WHERE c.status='OK'
AND c.type='H1'
) s
WHERE t.status='OK'
AND t.type='H1'
I don't know if that's any simpler, but to someone reading the statement, I think it makes it more clear what resultset is being returned.
(I also tend to favor avoiding any subquery in the SELECT list, unless there is a specific reason to add one.)
The resultset from this query is a bit odd. But absent any example data, expected output or any specification other than the original query, we're just guessing. The query in my answer replicates the results from the original query, in a way that's more clear.
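To check that the CROSS JOIN rewrite really does replicate the original result, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for MySQL. The sample rows are made up, and the alias `total` is used in place of `count` purely for illustration; only the table and column names come from the question.

```python
import sqlite3

# Hypothetical sample data; only the table and column names come from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE registry (author TEXT, alias TEXT, status TEXT, type TEXT)")
conn.executemany(
    "INSERT INTO registry VALUES (?, ?, ?, ?)",
    [
        ("ann", "a1", "OK", "H1"),
        ("bob", "b1", "OK", "H1"),
        ("cid", "c1", "KO", "H1"),  # filtered out by status
        ("dee", "d1", "OK", "H2"),  # filtered out by type
    ],
)

# Original form: scalar subquery in the SELECT list.
original = conn.execute(
    """
    SELECT author, alias,
           (SELECT COUNT(*) FROM registry
            WHERE status = 'OK' AND type = 'H1') AS total
    FROM registry
    WHERE status = 'OK' AND type = 'H1'
    ORDER BY author
    """
).fetchall()

# Rewritten form: the count is computed once in a derived table and cross joined.
rewritten = conn.execute(
    """
    SELECT t.author, t.alias, s.total
    FROM registry t
    CROSS JOIN (SELECT COUNT(*) AS total
                FROM registry c
                WHERE c.status = 'OK' AND c.type = 'H1') s
    WHERE t.status = 'OK' AND t.type = 'H1'
    ORDER BY t.author
    """
).fetchall()
```

Both queries return the same rows, each carrying the same total, which is the "odd" but faithful resultset described above.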

try this:
SELECT author, alias, count(1) AS count
FROM registry
WHERE status='OK' AND type='H1'
group by author, alias

The status and type are the criteria used to extract the rows from the table. You cannot get much simpler than this.

Try using a group by,
SELECT author, alias, COUNT(*) AS COUNT
FROM registry
WHERE STATUS='OK' AND TYPE='H1'
GROUP BY author, alias
Depending on how your table is set up, you may actually need to do this to get the data you are looking for.
Some good reading for this: https://dev.mysql.com/doc/refman/5.0/en/group-by-functions.html
However, if you are wanting the count to be simply ALL the records with that criteria, what you have may be simplest already.
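The difference between the GROUP BY version and the "count of everything" version is easy to see on a small table. This is only a sketch with invented rows, run through Python's sqlite3 module rather than MySQL:

```python
import sqlite3

# Hypothetical rows: ann/a1 appears twice, bob/b1 once, all matching the filter.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE registry (author TEXT, alias TEXT, status TEXT, type TEXT)")
conn.executemany(
    "INSERT INTO registry VALUES (?, ?, ?, ?)",
    [
        ("ann", "a1", "OK", "H1"),
        ("ann", "a1", "OK", "H1"),
        ("bob", "b1", "OK", "H1"),
    ],
)

# GROUP BY: one row per (author, alias), each with its own count.
per_group = conn.execute(
    """
    SELECT author, alias, COUNT(*) AS n
    FROM registry
    WHERE status = 'OK' AND type = 'H1'
    GROUP BY author, alias
    ORDER BY author, alias
    """
).fetchall()

# No GROUP BY: a single total over every matching row.
total = conn.execute(
    "SELECT COUNT(*) FROM registry WHERE status = 'OK' AND type = 'H1'"
).fetchone()[0]
```

Here the grouped query yields counts of 2 and 1, while the ungrouped count is 3, so the two approaches answer different questions.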

Related

How to GROUP BY on a GROUP_CONCAT(DISTINCT xxx) as yyyy

I am trying to GROUP BY a MYSQL request on a GROUP_CONCAT. The trio of values that is generated by this GROUP_CONCAT is the only unique identifier that I have to describe the group I want to apply the GROUP BY.
When I do the following :
SELECT [...] GROUP_CONCAT(DISTINCT xxx) as supsku
[...]
GROUP BY supsku
it says :
Can't group on 'supsku'
Thanks a lot
One way to go is to try a subselect:
SELECT t.* FROM (
SELECT [...] GROUP_CONCAT(DISTINCT xxx) as supsku
[...]
) t
GROUP BY supsku
You can't group by a column whose contents don't exist until after the groups are formed. That's a chicken-and-egg problem.
By analogy, suppose I ask you to scratch off some lottery tickets, but scratch them only if the total value of the winning tickets is more than $100? Obviously, you can't know what the winning values are before you scratch the lottery tickets, so you can't know if you should scratch them or not.
The answer from @MKhalidJunaid shows part of the solution: using a subquery to produce a partial result with the strings formed into groups. Then embed that as a derived table subquery to be further processed by an outer query with a GROUP BY.
But the problem with that solution is that we don't know how to group the strings in the inner subquery. Without a valid GROUP BY in the subquery, the default is to treat the whole table as one group, and therefore GROUP_CONCAT will return one row with one string.
So you need to think about defining your problem better. There must be some other grouping criterion you have in mind.
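As a rough illustration of the point about needing an inner grouping criterion, the sketch below uses Python's sqlite3 module with a made-up `items` table and a hypothetical `order_id` column standing in for "some other grouping criterion". SQLite's group_concat is used as an approximation of MySQL's GROUP_CONCAT; each order holds a single distinct sku so the concatenated string does not depend on concatenation order.

```python
import sqlite3

# Hypothetical table: order_id plays the role of the missing grouping criterion.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (order_id INTEGER, sku TEXT)")
conn.executemany(
    "INSERT INTO items VALUES (?, ?)",
    [(1, "A"), (1, "A"), (2, "A"), (3, "C")],
)

# No GROUP BY in the subquery: the whole table is one group, so one row comes back.
whole_table = conn.execute(
    "SELECT group_concat(DISTINCT sku) AS supsku FROM items"
).fetchall()

# Inner query groups by order_id to build the strings;
# the outer query can then GROUP BY the finished string.
per_string = conn.execute(
    """
    SELECT t.supsku, COUNT(*) AS orders
    FROM (SELECT order_id, group_concat(DISTINCT sku) AS supsku
          FROM items
          GROUP BY order_id) t
    GROUP BY t.supsku
    ORDER BY t.supsku
    """
).fetchall()
```

With multi-value groups you would also want a deterministic order inside the concatenation (MySQL allows ORDER BY inside GROUP_CONCAT), otherwise equal sets could produce different strings.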

Scope of COUNT(DISTINCT ..) when used with GROUP BY

I'm doing something like follows (Example, getting distinct people named "Mark" by State):
Select count(distinct FirstName) FROM table
GROUP BY State
I think the group by query organization is done first, such that the distinct is only relative to each "group by"? Basically, can "Mark" show up as a "distinct" count in each group? This would "scope" my distinct expression to the group by rows only, I believe...
This may actually depend on where DISTINCT is used. For example, SELECT DISTINCT COUNT( would be different than SELECT COUNT(DISTINCT.
In this case, it will work as you want and get a count of distinct names in each group (even if the names are not distinct across groups).
Your understanding is correct. Group by says, essentially, to take a group of rows and aggregate them into one row (based on the criteria). All aggregation functions -- including count(distinct) -- summarize values in this group.
As a note, you are using the word "scope". Just so you know, this has a particular meaning in SQL: it refers to the portions of the query where a column or table alias is understood by the compiler.
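A quick way to convince yourself of this is a toy table where "Mark" lives in two states. The table name `people` and the rows are assumptions made up for this sketch, which runs on Python's sqlite3 module:

```python
import sqlite3

# Hypothetical data: Mark appears twice in NY and once in CA.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (FirstName TEXT, State TEXT)")
conn.executemany(
    "INSERT INTO people VALUES (?, ?)",
    [("Mark", "NY"), ("Mark", "NY"), ("Ann", "NY"), ("Mark", "CA")],
)

# DISTINCT is evaluated inside each group, so Mark is counted once per state.
per_state = conn.execute(
    """
    SELECT State, COUNT(DISTINCT FirstName) AS names
    FROM people
    GROUP BY State
    ORDER BY State
    """
).fetchall()

# Without GROUP BY the whole table is one group: Mark and Ann only.
overall = conn.execute("SELECT COUNT(DISTINCT FirstName) FROM people").fetchone()[0]
```

The per-state counts add up to 3 even though there are only 2 distinct names overall, because "Mark" is distinct within each group separately.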

Passing query result into subquery

SELECT alert,
(select created_at from alerts
WHERE alert = #ALERT ORDER BY created_at desc LIMIT 1)
AS latest FROM alerts GROUP BY alert;
I am having an issue with the above query where I would like to pass in each alert into the subquery so that I have a column called latest which displays the latest alert for each group of alerts. How should I do this?
This is called a correlated subquery. To make it work, you need table aliases:
SELECT a1.alert,
(select a2.created_at
from alerts a2
WHERE a2.alert = a1.alert
ORDER BY a2.created_at desc
LIMIT 1
) AS latest
FROM alerts a1
GROUP BY a1.alert;
Table aliases are a good habit to get into, because they often make the SQL more readable. It is also a good idea to use table aliases with column references, so you easily know where the column is coming from.
EDIT:
If you really want the latest, you can get it by simply doing:
select alert, max(created_at)
from alerts
group by alert;
If you are trying to get the latest created_at date for each group of alerts, there is a simpler way.
SELECT
alert,
MAX(created_at) AS latest
FROM
alerts
GROUP BY
alert;
I would do the following
SELECT
alert_group_name,
MAX(created_at) AS latest
FROM
alerts A
GROUP BY
alert_group_name;
For a correlated subquery, you need to reference an expression from the outer query.
The best way to do that is to assign an alias to the table on the outer query, and then reference that in the inner query. Best practice is to assign an alias to EVERY table reference, and qualify EVERY column reference.
All that needs to be done to "fix" your query is to replace the reference to "#ALERT" with a reference to the alert column from the table on the outer query.
In our shop, that statement would be formatted something like this:
SELECT a.alert
, (SELECT l.created_at
FROM alerts l
WHERE l.alert = a.alert
ORDER BY l.created_at DESC
LIMIT 1
) AS latest
FROM alerts a
GROUP
BY a.alert
Not because it's easier to write that way, but because it's easier to read and understand what the statement is doing.
The correlated subquery approach can be efficient when only a small number of rows is returned (a very restrictive WHERE clause on the outermost query). But in general, correlated subqueries in the SELECT list can make for what we refer to in our shop as an LDQ, a "light dimming query".
In our shop, if we needed the resultset returned by that query, that statement would likely be rewritten as:
SELECT a.alert
, MAX(a.created_at) AS latest
FROM alerts a
GROUP
BY a.alert
And we'd definitely have an index defined ON alerts(alert,created_at) (or an index with additional columns after those first two.)
(I don't anticipate any cases where this statement would return a different result.)
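To compare the correlated subquery with the MAX() rewrite, here is a small sketch on Python's sqlite3 module. The alert names and timestamps are invented, and created_at is stored as ISO-formatted text, which is an assumption of this example rather than anything stated in the question.

```python
import sqlite3

# Hypothetical rows; created_at is ISO text so string order matches time order.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE alerts (alert TEXT, created_at TEXT)")
conn.executemany(
    "INSERT INTO alerts VALUES (?, ?)",
    [
        ("disk", "2024-01-01 10:00:00"),
        ("disk", "2024-01-03 09:00:00"),
        ("cpu", "2024-01-02 08:00:00"),
    ],
)

# Correlated subquery: the inner query references a.alert from the outer query.
correlated = conn.execute(
    """
    SELECT a.alert,
           (SELECT l.created_at
            FROM alerts l
            WHERE l.alert = a.alert
            ORDER BY l.created_at DESC
            LIMIT 1) AS latest
    FROM alerts a
    GROUP BY a.alert
    ORDER BY a.alert
    """
).fetchall()

# Plain aggregate: MAX() per group, no subquery needed.
aggregated = conn.execute(
    """
    SELECT a.alert, MAX(a.created_at) AS latest
    FROM alerts a
    GROUP BY a.alert
    ORDER BY a.alert
    """
).fetchall()
```

On this data both forms return the same latest timestamp per alert, which matches the expectation that the rewrite does not change the result.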

using count and suppress/ignore group by

Is it possible to have count in the select clause with a group by which is suppressed in the count? I need the count to ignore the group by clause
I have this query, which counts the total entries. The query is generated generically, and therefore I can't make any comprehensive changes like subqueries etc.
In some specific cases a group by is needed to retrieve the correct rows and because of this the group by can't be removed
SELECT count(dv.id) num
FROM `data_voucher` dv
LEFT JOIN `data_voucher_enclosure` de ON de.data_voucher_id=dv.id
WHERE IF(de.id IS NULL,0,1)=0
GROUP BY dv.id
Well, the answer to your question is simply that you can't have an aggregate that works on all the results while also having a GROUP BY clause. That's the whole purpose of GROUP BY: to create groups that change the behaviour of aggregates:
The GROUP BY clause causes aggregations to occur in groups (naturally) for the columns you name.
cf this blog post which is only the first result I found on google on this topic.
You'd need to redesign your query, the easiest way being to create a subquery, or one hell of a join. But without the schema and a little context on what you want this query to do, I can't give you an alternative that works.
I can only tell you that you're trying to use a hammer to tighten a screw...
Have found an alternative where COUNT DISTINCT is used
SELECT count(distinct dv.id) num
FROM `data_voucher` dv
LEFT JOIN `data_voucher_enclosure` de ON de.data_voucher_id=dv.id
WHERE IF(de.id IS NULL,0,1)=0
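The effect of dropping the GROUP BY in favour of COUNT(DISTINCT ...) can be seen in the sketch below, run on Python's sqlite3 module with made-up rows. SQLite has no IF() function, so the filter is written as `de.id IS NULL`, which is assumed here to be equivalent to the original `IF(de.id IS NULL,0,1)=0`.

```python
import sqlite3

# Hypothetical data: vouchers 1 and 3 have no enclosure, voucher 2 has one.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE data_voucher (id INTEGER PRIMARY KEY)")
conn.execute(
    "CREATE TABLE data_voucher_enclosure (id INTEGER PRIMARY KEY, data_voucher_id INTEGER)"
)
conn.executemany("INSERT INTO data_voucher VALUES (?)", [(1,), (2,), (3,)])
conn.execute("INSERT INTO data_voucher_enclosure VALUES (10, 2)")

# With GROUP BY dv.id: one row per voucher, each reporting a count of 1.
grouped = conn.execute(
    """
    SELECT COUNT(dv.id) AS num
    FROM data_voucher dv
    LEFT JOIN data_voucher_enclosure de ON de.data_voucher_id = dv.id
    WHERE de.id IS NULL
    GROUP BY dv.id
    """
).fetchall()

# Without GROUP BY, COUNT(DISTINCT dv.id) gives a single total.
total = conn.execute(
    """
    SELECT COUNT(DISTINCT dv.id) AS num
    FROM data_voucher dv
    LEFT JOIN data_voucher_enclosure de ON de.data_voucher_id = dv.id
    WHERE de.id IS NULL
    """
).fetchall()
```

The grouped query returns two rows of 1, whereas the COUNT(DISTINCT) form returns the single total of 2 that the question was after.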

MySQL: Include COUNT of SELECT Query Results as a Column (Without Grouping)

I have a simple report sending framework that basically does the following things:
It performs a SELECT query, it makes some text-formatted tables based on the results, it sends an e-mail, and it performs an UPDATE query.
This system is a generalization of an older one, in which all of the operations were hard coded. However, in pushing all of the logic of what I'd like to do into the SELECT query, I've run across a problem.
Before, I could get most of the information for my text tables by saying:
SELECT Name, Address FROM Databas.Tabl WHERE Status='URGENT';
Then, when I needed an extra number for the e-mail, also do:
SELECT COUNT(*) FROM Databas.Tabl WHERE Status='URGENT' AND TimeLogged='Noon';
Now, I no longer have the luxury of multiple SELECT queries. What I'd like to do is something like:
SELECT Tabl.Name, Tabl.Address, COUNT(Results.UID) AS Totals
FROM Databas.Tabl
LEFT JOIN Databas.Tabl Results
ON Tabl.UID = Results.UID
AND Results.TimeLogged='Noon'
WHERE Status='URGENT';
This, at least in my head, says to get a total count of all the rows that were SELECTed and also have some conditional.
In reality, though, this gives me the "1140 - Mixing of GROUP columns with no GROUP columns illegal if no GROUP BY" error. The problem is, I don't want to GROUP BY. I want this COUNT to redundantly repeat the number of results that SELECT found whose TimeLogged='Noon'. Or I want to remove the AND clause and include, as a column in the result of the SELECT statement, the number of results that that SELECT statement found.
GROUP BY is not the answer, because that causes it to get the COUNT of only the rows that have the same value in some column. And COUNT might not even be the way to go about this, although it's what comes to mind. FOUND_ROWS() won't do the trick, since it needs to be part of a secondary query, and I only get one (plus there's no LIMIT involved), and ROW_COUNT() doesn't seem to work since it's a SELECT statement.
I may be approaching it from the wrong angle entirely. But what I want to do is get COUNT-type information about the results of a SELECT query, as well as all the other information that the SELECT query returned, in one single query.
=== Here's what I've got so far ===
SELECT Tabl.Name, Tabl.Address, Results.Totals
FROM Databas.Tabl
LEFT JOIN (SELECT COUNT(*) AS Totals, 0 AS Bonus
FROM Databas.Tabl
WHERE TimeLogged='Noon'
GROUP BY NULL) Results
ON 0 = Results.Bonus
WHERE Status='URGENT';
This does use sub-SELECTs, which I was initially hoping to avoid, but now realize that hope may have been foolish. Plus it seems like the COUNTing SELECT sub-queries will be less costly than the main query since the COUNT conditionals are all on one table, but the real SELECT I'm working with has to join on multiple different tables for derived information.
The key realizations are that I can GROUP BY NULL, which will return a single result so that COUNT(*) will actually catch everything, and that I can force a correlation to this column by just faking a Bonus column with 0 on both tables.
It looks like this is the solution I will be using, but I can't actually accept it as an answer until tomorrow. Thanks for all the help.
SELECT Tabl.Name, Tabl.Address, Results.Totals
FROM Databas.Tabl
LEFT JOIN (SELECT COUNT(*) AS Totals, 0 AS Bonus
FROM Databas.Tabl
WHERE TimeLogged='Noon'
GROUP BY NULL) Results
ON 0 = Results.Bonus
WHERE Status='URGENT';
I figured this out thanks to ideas generated by multiple answers, although it's not actually the direct result of any one. Why this does what I need has been explained in the edit of the original post, but I wanted to be able to resolve the question with the proper answer in case anyone else wants to perform this silly kind of operation. Thanks to all who helped.
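For anyone who wants to experiment with the idea, here is a simplified variant of the derived-table approach, sketched on Python's sqlite3 module. It swaps the LEFT JOIN on a fake Bonus column and the GROUP BY NULL for a plain CROSS JOIN against a one-row count subquery. It is not the exact query above: the `Databas.` schema prefix is dropped, the rows are invented, and the count subquery filters on both Status and TimeLogged as in the second query from the question.

```python
import sqlite3

# Hypothetical rows; column names follow the question, values are made up.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Tabl (UID INTEGER, Name TEXT, Address TEXT, Status TEXT, TimeLogged TEXT)"
)
conn.executemany(
    "INSERT INTO Tabl VALUES (?, ?, ?, ?, ?)",
    [
        (1, "Ann", "1 Main St", "URGENT", "Noon"),
        (2, "Bob", "2 Oak Ave", "URGENT", "Dawn"),
        (3, "Cy", "3 Elm Rd", "NORMAL", "Noon"),
        (4, "Dee", "4 Pine Ln", "URGENT", "Noon"),
    ],
)

# The derived table always yields exactly one row, so the CROSS JOIN
# simply repeats its Totals value on every URGENT row.
rows = conn.execute(
    """
    SELECT t.Name, t.Address, r.Totals
    FROM Tabl t
    CROSS JOIN (SELECT COUNT(*) AS Totals
                FROM Tabl
                WHERE Status = 'URGENT' AND TimeLogged = 'Noon') r
    WHERE t.Status = 'URGENT'
    ORDER BY t.UID
    """
).fetchall()
```

Because an aggregate without GROUP BY returns a single row even when nothing matches, the join never drops or duplicates rows from the main query.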
You could probably do a union instead. You'd have to add a column to the original query and select 0 in it, then UNION that with your second query, which returns a single column. To do that, the second query must also select empty fields to match the first.
SELECT 0 AS Cnt, Name, Address FROM Databas.Tabl WHERE Status='URGENT'
UNION ALL
SELECT COUNT(*) AS Cnt, '' AS Name, '' AS Address FROM Databas.Tabl WHERE Status='URGENT' AND TimeLogged='Noon';
It's a bit of a hack, but what you're trying to do isn't ideal...
Does this do what you need?
SELECT Tabl.Name ,
Tabl.Address ,
COUNT(Results.UID) AS GrandTotal,
COUNT(CASE WHEN Results.TimeLogged='Noon' THEN 1 END) AS NoonTotal
FROM Databas.Tabl
LEFT JOIN Databas.Tabl Results
ON Tabl.UID = Results.UID
WHERE Status ='URGENT'
GROUP BY Tabl.Name,
Tabl.Address
WITH ROLLUP;
The API you're using to access the database should be able to report to you how many rows were returned - say, if you're running perl, you could do something like this:
my $sth = $dbh->prepare("SELECT Name, Address FROM Databas.Tabl WHERE Status='URGENT'");
my $rv = $sth->execute();
my $rows = $sth->rows;
I don't believe grouping by Tabl.id would mess up the results. Give it a try and see if that's what you want.