How to express count(distinct) with subquery in MySQL? - mysql

A query returns a certain number. The query is:
select
count(distinct case when (A or B or C) and D then table_a.field1 else null end)
from table_a
left join table_b on table_b.x = table_a.y
group by table_a.y
;
where A, B, C and D are given conditions. Now, written in this form:
select
sum((select count(1) from table_b where table_b.x = table_a.y and ((A or B or C) and D) ))
from table_a
left join table_b on table_b.x = table_a.y
group by table_a.y
;
the result does not match the one we got with count(distinct).
What is the correct way of writing count(distinct) with a subquery?

It's not at all clear why you need a subquery. You still have the JOIN, so that subquery is potentially going to be "counting" the same rows multiple times.
If you want to get the number of distinct values of field1 in table_a that meet a set of criteria (on table_a), then you don't really need a subquery on table_b to get that. At least, I don't see any way that you can get that result using a subquery on table_b.
Here's an example that returns an equivalent result:
select (select sum(1) as mycount
from ( select a.field1
from table_a a
left join table_b on table_b.x = a.y
where a.y = t.y
and ( (A or B or C) and D )
and a.field1 IS NOT NULL
group by a.field1
) s
) as mycount
from table_a t
group by t.y
That's really the only way I know to get something equivalent to a COUNT(DISTINCT expr). You've got to do a SELECT expr FROM ... WHERE expr IS NOT NULL GROUP BY expr, and then count the rows it returns. In this case, you could use either a COUNT(1) or a SUM(1).
(I'm not at all sure that answers the question you were asking, but it's my best shot at it.)
(We note that in your original query, you have a GROUP BY table_a.y, so that query can return multiple rows, each with its own count.)
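To make the "group by the expression, then count the rows" idea concrete, here is a minimal sketch. It runs against an in-memory SQLite database through Python's sqlite3 module rather than MySQL, and the sample rows and the `flag` column (standing in for the `(A or B or C) and D` condition) are made up for illustration.

```python
import sqlite3

# In-memory SQLite database; the data and the flag column are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table_a (y INTEGER, field1 TEXT, flag INTEGER);
    CREATE TABLE table_b (x INTEGER);
    INSERT INTO table_a VALUES
        (1, 'p', 1), (1, 'p', 1), (1, 'q', 1), (1, 'r', 0),
        (2, 's', 1), (2, NULL, 1);
    INSERT INTO table_b VALUES (1), (1), (2);
""")

# COUNT(DISTINCT ...) ignores NULLs and the duplicates created by the join.
distinct_counts = conn.execute("""
    SELECT a.y, COUNT(DISTINCT CASE WHEN a.flag = 1 THEN a.field1 END)
    FROM table_a a
    LEFT JOIN table_b b ON b.x = a.y
    GROUP BY a.y
    ORDER BY a.y
""").fetchall()

# Same numbers without DISTINCT: collapse to one row per (y, field1),
# drop NULLs, then count the remaining rows per y.
grouped_counts = conn.execute("""
    SELECT s.y, COUNT(*)
    FROM (
        SELECT a.y, a.field1
        FROM table_a a
        LEFT JOIN table_b b ON b.x = a.y
        WHERE a.flag = 1
          AND a.field1 IS NOT NULL
        GROUP BY a.y, a.field1
    ) s
    GROUP BY s.y
    ORDER BY s.y
""").fetchall()

print(distinct_counts)
print(grouped_counts)
```

One difference to keep in mind: a value of y with no qualifying rows shows up with a count of 0 in the first query but is missing entirely from the second, because the WHERE clause removes all of its rows.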

Related

How do I return 0 when my sql returns with no rows?

select a, count(b)
from table1 where b in ( select distinct b from table2)
and table1.dated>=DATE('yy/mm/dd')
group by a;
In the above SQL, when count(b) > 0 it returns rows, but when the count is 0 no rows are returned.
I tried UNION, NULLIF() and SELECT(SELECT()) as something, but nothing worked.
I was expecting to get 0 returned if the count is equal to 0.
https://www.db-fiddle.com/#&togetherjs=2AkxeMUrPF
You could use:
select table1.a, count(DISTINCT table2.b)
from table1
LEFT JOIN table2
ON table1.b = table2.b
AND table1.dated>=DATE('yy/mm/dd') -- this comparison is simply incorrect
group by table1.a
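The difference between filtering with IN and using an outer join can be seen on a small data set. This is only a sketch: it uses an in-memory SQLite database via Python's sqlite3 instead of MySQL, the rows are invented, and the `cutoff` date is a placeholder for whatever date the real query compares against.

```python
import sqlite3

# Hypothetical sample data in an in-memory SQLite database.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (a TEXT, b INTEGER, dated TEXT);
    CREATE TABLE table2 (b INTEGER);
    INSERT INTO table1 VALUES
        ('x', 1, '2020-05-01'), ('x', 2, '2020-06-01'),
        ('y', 3, '2020-05-01'), ('z', 1, '2019-01-01');
    INSERT INTO table2 VALUES (1), (2);
""")
cutoff = "2020-01-01"  # placeholder date, ISO format so text comparison works

# WHERE ... IN removes non-matching rows before grouping,
# so groups without matches never appear.
filtered = conn.execute("""
    SELECT a, COUNT(b)
    FROM table1
    WHERE b IN (SELECT DISTINCT b FROM table2)
      AND dated >= ?
    GROUP BY a
    ORDER BY a
""", (cutoff,)).fetchall()

# LEFT JOIN keeps every table1 row; unmatched rows carry NULL for t2.b,
# which COUNT ignores, so those groups are reported with 0.
outer_joined = conn.execute("""
    SELECT t1.a, COUNT(DISTINCT t2.b)
    FROM table1 t1
    LEFT JOIN table2 t2
      ON t1.b = t2.b
     AND t1.dated >= ?
    GROUP BY t1.a
    ORDER BY t1.a
""", (cutoff,)).fetchall()

print(filtered)
print(outer_joined)
```

Note that the date condition has to stay in the ON clause; moving it to WHERE would filter the unmatched rows out again.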
We can ensure that a query returns a row by having the query guarantee it.
Here's an example that retrieves exactly one row from an inline view i.
Then an outer join to another inline view s that gets a distinct list of values.
And then another outer join to table1.
SELECT t.a
     , COUNT(t.b) AS cnt
  FROM ( SELECT 1 AS n ) i
  LEFT
  JOIN ( SELECT DISTINCT r.b
           FROM table2 r
       ) s
    ON 1=1  -- always true: keeps the single row from i even when s is empty
  LEFT
  JOIN table1 t
    ON t.b = s.b
   AND t.dated >= ...
 GROUP
    BY i.n
     , t.a
If inline view s returns no rows, the query should return
a cnt
---- ---
NULL 0

how to group by a field that has both select and count

I am wondering how to group by a field that has both a select count() and a count() expression. I know that we have to put all selected fields in the group by, but it won't let me do so because of the second count() in that field.
create table C as(
select a.id, a.date_id,
(select count(b.hits)*1.00 where b.hits >= '9')/count(b.hits) AS percent **<--error here
from A a join B b
on a.id = b.id
group by 1,2,3) with no data primary index(id);
This is my error:
[SQLState HY000] GROUP BY and WITH...BY clauses may not contain
aggregate functions. Error Code: 3625
When I add a select to the second count in the third line, I only get 1 or 0, which is not right.
((select count(b.hits)*1.00 where b.hits >= '9')/(select count(b.hits))) AS percent
Do I need to do a self join instead, or is there any way I can just use nested queries?
You need to fix the group by. But, you can probably simplify the query as:
create table C as
select a.id, a.date_id,
avg(case when b.hits >= 9 then 1.00 else 0 end) as percent
from A a join
B b
on a.id = b.id
group by a.id, a.date_id
with no data primary index(id);
It looks like you only need to group on 2 columns, not 3, plus you shouldn't need a sub-select:
create table C as(
select a.id, a.date_id,
SUM(CASE WHEN b.hits >= '9' THEN 1.00 ELSE 0 END)/COUNT(b.hits) AS percent
from A a join B b
on a.id = b.id
group by 1,2) with no data primary index(id);
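Here is a small sketch of the conditional-aggregation idea, including the integer-division trap that produces only 0 or 1. It runs on an in-memory SQLite database through Python's sqlite3, not on the original database, and the rows are made up; the point is only the arithmetic.

```python
import sqlite3

# Hypothetical data: four hits rows for one (id, date_id), two of them >= 9.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A (id INTEGER, date_id INTEGER);
    CREATE TABLE B (id INTEGER, hits INTEGER);
    INSERT INTO A VALUES (1, 20200101);
    INSERT INTO B VALUES (1, 9), (1, 10), (1, 5), (1, 3);
""")

# Integer numerator and denominator: 2 / 4 is truncated to 0.
truncated = conn.execute("""
    SELECT a.id, a.date_id,
           SUM(CASE WHEN b.hits >= 9 THEN 1 ELSE 0 END) / COUNT(b.hits)
    FROM A AS a
    JOIN B AS b ON a.id = b.id
    GROUP BY a.id, a.date_id
""").fetchall()

# A decimal in the CASE branch makes the division fractional.
fractional = conn.execute("""
    SELECT a.id, a.date_id,
           SUM(CASE WHEN b.hits >= 9 THEN 1.0 ELSE 0 END) / COUNT(b.hits)
    FROM A AS a
    JOIN B AS b ON a.id = b.id
    GROUP BY a.id, a.date_id
""").fetchall()

# Averaging the same 1.0/0 flag gives the same ratio in one step.
averaged = conn.execute("""
    SELECT a.id, a.date_id,
           AVG(CASE WHEN b.hits >= 9 THEN 1.0 ELSE 0 END)
    FROM A AS a
    JOIN B AS b ON a.id = b.id
    GROUP BY a.id, a.date_id
""").fetchall()

print(truncated, fractional, averaged)
```

In all three queries the grouping is on the two plain columns only; the aggregate expression never appears in the GROUP BY.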

MySQL trying to reuse results of subquery in an efficient way

I have a query like this:
SELECT q,COUNT(x),y,
(SELECT i FROM (SELECT q,w FROM tableA WHERE conds)
JOIN tableC ON (cond)
WHERE id = t.q)
FROM (SELECT q,w FROM tableA WHERE conds) t
JOIN tableB
GROUP BY q
The subquery (SELECT q,w FROM tableA WHERE conds) returns several hundred rows. After the GROUP BY q there is around 20 rows left.
The subquery (SELECT i FROM (SELECT q,w FROM tableA WHERE conds) join tableC WHERE id = t.q) uses inside of it the exactly same subquery as the one above, but then also selects a fraction of the results based on which q value is currently being grouped.
My problem seems to be this: the performance is too slow because I can't seem to put the WHERE id = t.q inside the (SELECT q,w FROM tableA WHERE conds) subquery. I can only guess that for every unique value of q the query is run, produces hundreds of rows, and then has to apply the WHERE clause to an un-indexed temporary table. I think I need to perform the WHERE before the full join.
Any ideas please?
This query could produce the same results, but so much information is missing from the question, who can be sure?
Select
q,
count(x),
y,
i
From
tableA a
inner join
tableC c
on cond and c.id = a.q
cross join -- is this an inner join?
tableB b
Where
conds
Group By
q,
y,
i

SQL add rows count from a second table to the main query

I'm trying to improve a (not so much) simple query:
I need to retrieve every row from Table A.
Then join Table A with Table B so I get all the data I need.
At the same time, I need to add an extra column with the count() from Table C.
Something like:
SELECT a.*,
(SELECT Count(*)
FROM table_c c
WHERE c.a_id = a.id) AS counter,
b.*
FROM table_a a
LEFT JOIN table_b b
ON b.a_id = a.id
This works, OK, but in reality I'm just making 2 queries, and I need to improve this so it only does one (if it's even possible).
Does anyone know how I can achieve that?
The simplest approach is likely to just move the correlated sub-query into a derived table (a sub-query in the FROM clause) and join to it.
NOTE: Many optimisers deal with correlated sub-queries extremely effectively. Your example query could be perfectly reasonable.
SELECT
a.*,
b.*,
c.row_count
FROM
table_a a
LEFT JOIN
table_b b
ON b.a_id = a.id
LEFT JOIN
(
SELECT
a_id,
Count(*) row_count
FROM
table_c
GROUP BY
a_id
)
c
ON c.a_id = a.id
Another note: SQL is a declarative expression; it is not executed directly but translated into a plan using nested loops, hash joins, etc. Do not assume that having two queries is a bad thing. In this case my example may significantly reduce the number of reads compared to a single query followed by GROUP BY and COUNT(DISTINCT).
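To check that the two forms agree, here is a minimal sketch on an in-memory SQLite database via Python's sqlite3. The `name` and `info` columns and all rows are invented. It also shows one detail worth knowing: with the joined derived table, an `a` row without any `table_c` rows gets NULL rather than 0, so the sketch wraps the count in COALESCE.

```python
import sqlite3

# Hypothetical schema and rows; only id / a_id come from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table_a (id INTEGER, name TEXT);
    CREATE TABLE table_b (a_id INTEGER, info TEXT);
    CREATE TABLE table_c (a_id INTEGER);
    INSERT INTO table_a VALUES (1, 'one'), (2, 'two'), (3, 'three');
    INSERT INTO table_b VALUES (1, 'b1'), (2, 'b2');
    INSERT INTO table_c VALUES (1), (1), (1), (2);
""")

# Correlated sub-query in the select list, as in the question.
correlated = conn.execute("""
    SELECT a.id,
           (SELECT COUNT(*) FROM table_c c WHERE c.a_id = a.id) AS counter,
           b.info
    FROM table_a a
    LEFT JOIN table_b b ON b.a_id = a.id
    ORDER BY a.id
""").fetchall()

# Pre-aggregated derived table joined once; COALESCE turns the NULL
# for unmatched rows into 0 so both queries return the same values.
derived = conn.execute("""
    SELECT a.id,
           COALESCE(c.row_count, 0) AS counter,
           b.info
    FROM table_a a
    LEFT JOIN table_b b ON b.a_id = a.id
    LEFT JOIN (
        SELECT a_id, COUNT(*) AS row_count
        FROM table_c
        GROUP BY a_id
    ) c ON c.a_id = a.id
    ORDER BY a.id
""").fetchall()

print(correlated)
print(derived)
```

Which of the two is faster depends on the optimiser and the data, so it is worth comparing the actual plans before rewriting a query that already works.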
Try this:
SELECT
tmp.*,
SUM(IF(c.a_id IS NULL,0,1)) as counter
FROM (
SELECT
a.id as aid,
b.id as bid,
a.*,
b.*
FROM
table_a a
LEFT JOIN table_b b
ON b.a_id = a.id
) as tmp
LEFT JOIN table_c c
ON c.a_id = tmp.aid
GROUP BY
tmp.aid,
tmp.bid

How to count number of records as well get the records from the query?

I have 3 tables A, B and C. In the stored procedure, I have used a query to get the result, but I also want the total number of records returned by that query.
Is this possible? I tried using something like this:
Select count(*)
from (
select A.Name,B.Address,C.grade
from A,B,C
where A.id=B.id
AND B.Tlno=C.tlno
)
But this is not working.
(1) Stop using old-style x,y,z joins.
SELECT A.Name,B.Address,C.grade
FROM dbo.A
INNER JOIN dbo.B ON A.id = B.id
INNER JOIN dbo.C ON B.Tlno = C.tlno;
(2) You can add a COUNT(*) OVER () to the entire result set. This is somewhat wasteful because it returns the count on every row:
SELECT A.Name, B.Address, C.grade, row_count = COUNT(*) OVER ()
FROM dbo.A
INNER JOIN dbo.B ON A.id = B.id
INNER JOIN dbo.C ON B.Tlno = C.tlno;
You can use a windowing function:
select A.Name,
B.Address,
C.grade,
count(*) over () as total_count
from A,B,C
where A.id=B.id
AND B.Tlno=C.tlno
This will return the total count in each and every row, though (it will be the same number for all rows).
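An alternative would be to use the @@ROWCOUNT function. It reports the number of rows affected by the previous statement, so read it in a separate statement immediately after the query rather than inside the query's select list:
SELECT A.Name, B.Address, C.grade
FROM dbo.A
INNER JOIN dbo.B ON A.id = B.id
INNER JOIN dbo.C ON B.Tlno = C.tlno;
SELECT @@ROWCOUNT AS total_count;
Unlike the windowing function, this returns the count once, as a separate result, instead of on every row. I'm curious whether there is a performance difference between the two... (I don't have SHOWPLAN permission at my current client, unfortunately.)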
Use a table variable, as below:
declare @num table (accname varchar(200),subnet varchar(200))
insert into @num(accname,subnet) Select a.accountname,s.subnet from tbl_accounts a,tbl_accountsubnet s where a.accountid=s.accountid
select COUNT(*) from @num;
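For comparison, here is a sketch of the count-over-a-derived-table form from the question with an alias on the derived table, which SQL Server requires. It runs on an in-memory SQLite database through Python's sqlite3 rather than SQL Server, with invented rows and the alias `t` chosen for illustration, and it checks the count against the number of rows the plain query returns.

```python
import sqlite3

# Hypothetical rows; table and column names follow the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A (id INTEGER, Name TEXT);
    CREATE TABLE B (id INTEGER, Address TEXT, Tlno INTEGER);
    CREATE TABLE C (tlno INTEGER, grade TEXT);
    INSERT INTO A VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO B VALUES (1, '1 Main St', 100), (2, '2 Oak Ave', 200);
    INSERT INTO C VALUES (100, 'A'), (200, 'B'), (200, 'C');
""")

base_query = """
    SELECT A.Name, B.Address, C.grade
    FROM A
    INNER JOIN B ON A.id = B.id
    INNER JOIN C ON B.Tlno = C.tlno
"""

# The records themselves.
rows = conn.execute(base_query).fetchall()

# The total, counted over the same query used as an aliased derived table.
total = conn.execute(
    "SELECT COUNT(*) FROM (" + base_query + ") AS t"
).fetchone()[0]

print(rows)
print(total)
```

When the caller fetches all the rows anyway, counting them on the client side (here `len(rows)`) avoids running the query a second time.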