func.count(distinct(...)) does not give the same result as distinct().count()


I have a column with null entries, e.g. the possible values in this column are None, 1, 2, 3
When I count the number of unique entries in the column with session.query(func.count(distinct(Entry.col))).scalar() I get back '3'.
But when I perform the count with session.query(Entry.col).distinct().count(), I get back '4'.
Why does the latter method count the None, but the first doesn't?

In the first case, the resulting query will look like this:
SELECT COUNT(DISTINCT(col)) FROM Entry
COUNT(expr) only counts non-NULL values, so the NULL entry is skipped and the result is 3.
In the second case, however, the query is different, as shown in the docs:
SELECT count(1) AS count_1 FROM (
SELECT DISTINCT(col) FROM Entry
) AS anon_1
This just counts the rows returned by the SELECT DISTINCT subquery, which is 4 because NULL is included in the output of a DISTINCT query.
The reason is simple: the purpose of query.count() is to return the number of rows the query would have returned if it were run without the count. The method doesn't give you control over which columns are counted; that's what func.count(...) is for.

MySQL COUNT(col) doesn't count NULL values, so if you count by a field that contains NULLs, those rows are not counted.
DISTINCT returns one row per different value, and NULL is one of those rows, so counting the DISTINCT result includes it.
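The difference can be reproduced without SQLAlchemy by running the two generated statements directly. The following is a minimal sketch that assumes an in-memory SQLite database as a stand-in for MySQL (both skip NULL in COUNT(expr)); the helper name count_both_ways is hypothetical.

```python
import sqlite3


def count_both_ways(values):
    """Return (COUNT(DISTINCT col), number of rows from SELECT DISTINCT col)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Entry (col INTEGER)")
    conn.executemany("INSERT INTO Entry (col) VALUES (?)", [(v,) for v in values])

    # Aggregate form, as emitted for func.count(distinct(Entry.col)):
    # COUNT(expr) skips NULL.
    aggregate = conn.execute("SELECT COUNT(DISTINCT col) FROM Entry").fetchone()[0]

    # Wrapped form, as emitted for query(Entry.col).distinct().count():
    # counts every row of the DISTINCT subquery, including the NULL row.
    wrapped = conn.execute(
        "SELECT COUNT(1) FROM (SELECT DISTINCT col FROM Entry) AS anon_1"
    ).fetchone()[0]

    conn.close()
    return aggregate, wrapped


print(count_both_ways([None, 1, 2, 3]))  # (3, 4)
```

If the NULL should be excluded from the second form as well, filtering it out before calling distinct() (for example with a WHERE col IS NOT NULL condition) makes both counts agree.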

Related

SQL query which includes COUNT(*) in its SELECT clause confuses me

I'm a newbie in SQL, trying to find my way through.
I have the following diagram:
and I'm being requested to
"Produce a list of number of items from each product which was ordered
in June 2004. Assume there's a function MONTH() and YEAR()"
The given solution is:
SELECT cat_num, COUNT(*)
FROM ord_rec AS O, include AS I
WHERE O.ord_num = I.ord_num AND
MONTH(O.ord_date) = 6 AND
YEAR(O.ord_date) = 2004
GROUP BY cat_num;
What I'm confused about is the COUNT(*). (specifically the asterisk within).
Does it COUNT all rows that are returned from the given query? So the asterisk refers to all of the returned ROWS? or am I far off?
Is it any different than having:
SELECT cat_num, COUNT(cat_num)
Thanks!

The COUNT(*) function returns the number of rows in the result set of the SELECT statement (per group, when GROUP BY is used). It counts every row: rows with NULLs, duplicates, and non-NULL values alike.
The COUNT(cat_num) function returns the number of rows in which cat_num is not NULL.
Consider an example:
Block   Range
A       1-10
A       10-1
B       (NULL)
B       (NULL)
B       (NULL)
For this data, using the query:
SELECT
COUNT(*),
COUNT(t.`Block`),
COUNT(t.`Range`)
FROM
`test_table` t
You'll obtain these results:
count(*)   count(t.Block)   count(t.Range)
5          5                2
I hope that clears your confusion.
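The example above can be checked with a short script. This is a sketch that assumes SQLite in place of MySQL, so the identifiers are quoted with double quotes rather than backticks; the function name count_variants is hypothetical.

```python
import sqlite3


def count_variants():
    """Return (COUNT(*), COUNT(Block), COUNT(Range)) for the sample data."""
    conn = sqlite3.connect(":memory:")
    conn.execute('CREATE TABLE test_table ("Block" TEXT, "Range" TEXT)')
    rows = [
        ("A", "1-10"),
        ("A", "10-1"),
        ("B", None),
        ("B", None),
        ("B", None),
    ]
    conn.executemany("INSERT INTO test_table VALUES (?, ?)", rows)

    # COUNT(*) counts rows; COUNT(column) counts non-NULL values in that column.
    result = conn.execute(
        'SELECT COUNT(*), COUNT(t."Block"), COUNT(t."Range") FROM test_table t'
    ).fetchone()
    conn.close()
    return result


print(count_variants())  # (5, 5, 2)
```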

The COUNT(*) function returns the number of rows produced by a query. It counts duplicate rows and rows that contain NULL values.
Overall, you can use *, ALL, DISTINCT, or some expression with COUNT to count either all of the rows or only the rows that satisfy some condition, depending on the arguments you pass to the COUNT() function.
Possible parameters for COUNT()
When * is used with COUNT(), all records (rows) are counted, even if some of their fields are NULL, but COUNT(column_name) does not count a record if that column is NULL.
Resources here.

multiple count in join

I am trying to get the count of ids from one table with a left join in a MySQL query. It works well when I have one count, but when I try to add an additional count, the result of the second count is the same as the first. How can I fix this query to have two counts?
Note: the 1st count result should be based on the join condition.
The 2nd count result should be the overall count, not based on the join.

SELECT COUNT(*)
counts all rows.
SELECT COUNT(column_name)
counts just the values that are not NULL in that particular column.
So in your case your first count should be COUNT(a column from your joined table) and your second count should be COUNT(*).
For special cases you can also use boolean expressions. For example
SELECT SUM(my_column = 'foo')
counts just the values where the value in my_column is foo, because the boolean expression returns 1 if true and 0 otherwise.
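A small sketch of the three counting styles in one LEFT JOIN query. The parent and child tables, their columns, and the function name join_counts are hypothetical, and SQLite is assumed as a stand-in for MySQL (both evaluate a comparison to 1 or 0).

```python
import sqlite3


def join_counts():
    """Return (matched join rows, all join rows, rows where kind = 'foo')."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE parent (id INTEGER PRIMARY KEY, kind TEXT);
        CREATE TABLE child (id INTEGER PRIMARY KEY, parent_id INTEGER);
        INSERT INTO parent VALUES (1, 'foo'), (2, 'bar'), (3, 'foo');
        INSERT INTO child VALUES (10, 1), (11, 1);
        """
    )
    row = conn.execute(
        """
        SELECT COUNT(c.id),          -- only rows where the join found a child
               COUNT(*),             -- every row of the join result
               SUM(p.kind = 'foo')   -- rows where the boolean expression is 1
        FROM parent p
        LEFT JOIN child c ON c.parent_id = p.id
        """
    ).fetchone()
    conn.close()
    return row


print(join_counts())  # (2, 4, 3)
```

The join produces four rows: two matches for parent 1, plus one NULL-padded row each for parents 2 and 3. COUNT(c.id) ignores the NULL-padded rows, while COUNT(*) does not.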

Why does HAVING MAX() return a different value than SELECT MAX()?

I have a table log that contains, among others, a DateTime column called TimeOfLog and a foreign key Logger_ID.
What I was trying to do was get the newest entry per Logger_ID.
SELECT l.TimeOfLog AS TimeOfLog, l.Logger_ID AS Logger_ID
FROM `log` `l`
GROUP BY l.Logger_ID
HAVING MAX(l.TimeOfLog)
this however returns more or less a random TimeOfLog belonging to that Logger_ID. If I then run
SELECT MAX(l.TimeOfLog) AS TimeOfLog, l.Logger_ID AS Logger_ID
FROM `log` `l`
GROUP BY l.Logger_ID
I get the expected, newest, result. However, I'm pretty sure the Logger_ID is not the one belonging to that TimeOfLog.
Why is that/What am I misunderstanding here?

To get the maximum row, don't think group by; think filtering. Here is one method:
select l.*
from log l
where l.timeoflog = (select max(l2.timeoflog)
from log l2
where l2.logger_id = l.logger_id
);
If you just want the maximum time, then aggregation is appropriate:
select logger_id, max(timeoflog)
from log l
group by logger_id;
You have the expression:
HAVING MAX(l.TimeOfLog)
This just checks that the maximum is not 0 or NULL.
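The filtering approach can be tried out as follows. This is a sketch assuming SQLite instead of MySQL, with timestamps stored as ISO-8601 text so that MAX() orders them correctly; the id column and the sample rows are made up for illustration.

```python
import sqlite3


def newest_per_logger():
    """Return the full newest row for each Logger_ID."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE log (id INTEGER PRIMARY KEY, Logger_ID INTEGER, TimeOfLog TEXT);
        INSERT INTO log VALUES
            (1, 1, '2020-01-01 08:00:00'),
            (2, 1, '2020-01-03 09:30:00'),
            (3, 2, '2020-01-02 12:00:00');
        """
    )
    # Keep a row only if its time equals the maximum time of its own logger.
    rows = conn.execute(
        """
        SELECT l.id, l.Logger_ID, l.TimeOfLog
        FROM log l
        WHERE l.TimeOfLog = (SELECT MAX(l2.TimeOfLog)
                             FROM log l2
                             WHERE l2.Logger_ID = l.Logger_ID)
        ORDER BY l.Logger_ID
        """
    ).fetchall()
    conn.close()
    return rows


print(newest_per_logger())
# [(2, 1, '2020-01-03 09:30:00'), (3, 2, '2020-01-02 12:00:00')]
```

Because the whole row is selected by the filter, every returned column belongs to the same log entry, which is what the GROUP BY variants in the question could not guarantee.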

You are misunderstanding how GROUP BY and HAVING work.
GROUP BY puts all rows that have the same values in the specified columns together into one group. If you select a column that is not mentioned in GROUP BY without wrapping it in an aggregate function, you get the value from an arbitrary row of the group.
If you use an aggregate function like MAX(), the function is applied to all rows of the group and its result is selected.
HAVING is a filter similar to WHERE, but while WHERE is applied before grouping, the HAVING filter is applied after grouping.
You can use aggregate functions there. A correct use of HAVING might be, for example:
SELECT column
FROM table
GROUP BY column
HAVING COUNT(*) > 1
This query would only select values of column that are present more than once.
In your example, HAVING MAX(l.TimeOfLog) is true for every group whose maximum TimeOfLog is neither NULL nor zero, so it won't filter anything.
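To illustrate HAVING as a post-grouping filter, here is a minimal sketch of the HAVING COUNT(*) > 1 pattern. It assumes SQLite and a hypothetical items table with a name column; the function name repeated_values is also made up.

```python
import sqlite3


def repeated_values(names, min_count=2):
    """Return the values that occur at least min_count times."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE items (name TEXT)")
    conn.executemany("INSERT INTO items VALUES (?)", [(n,) for n in names])

    # The grouping happens first; HAVING then discards whole groups.
    rows = conn.execute(
        """
        SELECT name
        FROM items
        GROUP BY name
        HAVING COUNT(*) >= ?
        ORDER BY name
        """,
        (min_count,),
    ).fetchall()
    conn.close()
    return [name for (name,) in rows]


print(repeated_values(["a", "a", "b", "c", "c", "c"]))  # ['a', 'c']
```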

MSSql ISNULL query

select ISNULL(c.name,'any') from (select Name from Orders where ID = '123') c
select ISNULL((select Name from Orders where ID = '123'),'any')
The Orders table has two columns
1. ID
2. Name
and data in Orders is
ID Name
121 abc
124 def
The first query does not return any rows, whereas the second query returns 'any' as its result. What is the difference?

The first form uses a subquery as a table source, in its FROM clause; it can return between zero and many rows.
For each of the rows that the subquery returns, the ISNULL expression is evaluated. But if the subquery returned no rows, then the final output contains no rows.
The second form uses a SELECT without a FROM clause - which will always produce a result set containing exactly one row. It then also uses a scalar subquery (by introducing a subquery in a location where a scalar value is expected) - that has to either produce zero or one results. If the subquery produces zero results, then NULL is substituted.
So, the differences between the two are that the first query can return between zero and many rows, and the ISNULL expression is evaluated for each row. Whereas the second query always produces exactly one row, and if the subquery returned multiple results, an error is produced.
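The zero-rows versus one-row behaviour can be reproduced outside SQL Server. The sketch below assumes SQLite and uses COALESCE in place of T-SQL's ISNULL, which SQLite does not provide; the function name isnull_forms is hypothetical. It does not cover the multiple-results error, since SQLite handles that case differently from SQL Server.

```python
import sqlite3


def isnull_forms():
    """Return the result sets of the derived-table form and the scalar form."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE Orders (ID INTEGER, Name TEXT);
        INSERT INTO Orders VALUES (121, 'abc'), (124, 'def');
        """
    )
    # Form 1: subquery as a table source. No matching row means no output row,
    # so COALESCE is never evaluated.
    derived = conn.execute(
        "SELECT COALESCE(c.Name, 'any') "
        "FROM (SELECT Name FROM Orders WHERE ID = 123) AS c"
    ).fetchall()

    # Form 2: SELECT without FROM always yields one row; the empty scalar
    # subquery evaluates to NULL, which COALESCE replaces.
    scalar = conn.execute(
        "SELECT COALESCE((SELECT Name FROM Orders WHERE ID = 123), 'any')"
    ).fetchall()
    conn.close()
    return derived, scalar


print(isnull_forms())  # ([], [('any',)])
```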

MySQL: Count two things in one query?

I have a "boolean" column in one of my tables (value is either 0 or 1).
I need to get two counts: The number of rows that have the boolean set to 0 and the number of rows that have it set to 1. Currently I have two queries: One to count the 1's and the other to count the 0's.
Is MySQL traversing the entire table when counting rows with a WHERE condition? I'm wondering if there's a single query that would allow two counters based on different conditions?
Or is there a way to get the total count along side the WHERE conditioned count? This would be enough as I'd only have to subtract one count from the other (due to the boolean nature of the column). There are no NULL values.
Thanks.

You could group your records by your boolean column and get count for each group.
SELECT bool_column, COUNT(bool_column) FROM your_table
WHERE your_conditions
GROUP BY bool_column
This will obviously work not only for bool columns but also with other data types if you need that.

Try this one:
SELECT
SUM(your_field) as positive_count,
SUM(IF(your_field, 0, 1)) as negative_count
FROM thetable

If they are all either 0 or 1 and you don't mind two rows as the result, you can group by that field and do a count like so:
select field, count(field)
from table
group by field

A simple GROUP BY clause should do the trick:
SELECT boolField, COUNT(boolField)
FROM myTable
GROUP BY boolField
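The two approaches from the answers above, side by side. This is a sketch assuming SQLite rather than MySQL; since MySQL's IF() may not be available there, the negative count is written with a CASE expression instead. The function name flag_counts is hypothetical.

```python
import sqlite3


def flag_counts(flags):
    """Return ((ones, zeros), [(value, count), ...]) for a 0/1 column."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE thetable (your_field INTEGER)")
    conn.executemany("INSERT INTO thetable VALUES (?)", [(f,) for f in flags])

    # Single-row variant: one pass over the table, two conditional sums.
    single_row = conn.execute(
        """
        SELECT SUM(your_field) AS positive_count,
               SUM(CASE WHEN your_field THEN 0 ELSE 1 END) AS negative_count
        FROM thetable
        """
    ).fetchone()

    # Grouped variant: one result row per distinct value.
    grouped = conn.execute(
        """
        SELECT your_field, COUNT(your_field)
        FROM thetable
        GROUP BY your_field
        ORDER BY your_field
        """
    ).fetchall()
    conn.close()
    return single_row, grouped


print(flag_counts([1, 0, 1, 1, 0]))  # ((3, 2), [(0, 2), (1, 3)])
```

The grouped variant omits a value entirely when no row has it, whereas the single-row variant reports 0 for it, which may matter to the calling code.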
