Counting in Many to Many Relations in SQL - mysql

I have tables A and B, and their many-to-many relation in table AB.
Select A.id, AB.bId FROM A LEFT JOIN AB on A.id = AB.aId
gives
A1--B1
A1--B2
A2--B3
A3--NULL
A4--B4
I want to find the total number of distinct A's and the total number of distinct A's having a non-null B. E.g. for the table above, the numbers are 4 and 3. What I actually need is the ratio 3/4 = 0.75.
Can I do this in one optimal query?

Since count() does not count null, you could:
select count(distinct A.id) as DistinctA
, count(distinct case
when AB.bId is not null then A.id
end) as DistinctAHavingNotNullB
from A
left join
AB
on A.id = AB.aId
Note that a case without else returns null when no when clause matches.
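To check the numbers from the question, here is a minimal sketch that runs the same query against the sample rows. It uses Python's sqlite3 with an in-memory database as a stand-in for MySQL (an assumption for illustration; the table contents are reconstructed from the join output shown in the question).

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A  (id  TEXT PRIMARY KEY);
    CREATE TABLE AB (aId TEXT, bId TEXT);
    INSERT INTO A  VALUES ('A1'), ('A2'), ('A3'), ('A4');
    INSERT INTO AB VALUES ('A1','B1'), ('A1','B2'), ('A2','B3'), ('A4','B4');
""")

# CASE without ELSE yields NULL for A3, and COUNT skips NULLs.
distinct_a, distinct_a_with_b = conn.execute("""
    SELECT COUNT(DISTINCT A.id),
           COUNT(DISTINCT CASE WHEN AB.bId IS NOT NULL THEN A.id END)
    FROM A
    LEFT JOIN AB ON A.id = AB.aId
""").fetchone()

ratio = distinct_a_with_b / distinct_a
print(distinct_a, distinct_a_with_b, ratio)  # 4 3 0.75
```

The ratio is computed client-side here; in SQL you would divide the two COUNT expressions in the same SELECT.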

Related

Join based on conditions

I have a long Hive query with 10 joins and many conditions. Below are three of the conditions:
1) If id is not equal to XFG or GHT, use field sid
join ABC_Tables on sid
join CDE_Tables on sid
2) If id is equal to XFG or GHT, Tested is null, use field pid
join ABC_Tables on kid
join CDE_Tables on kid
3) If id is equal to XFG or GHT, Tested is not null, use field pid
join ABC_Tables on kid
join CDE_Tables on kid
What I am doing now:
select 1 conditions
union all
select 2 conditions
union all
select 3 conditions
Am I doing this right? What is an alternative to the approach above?
Your conditions can be part of the ON clause of the join. Hive allows equality and inequality comparisons against constants there, so (ID!='XFG') and (ID!='GHT') and (a.PID=b.PID) is an allowed join condition. a.ID not in ('XFG', 'GHT') and a.sid=b.sid should also work. Note that joining the same table twice needs a distinct alias for each join:
select *
from a
left join b b1 on a.ID not in ('XFG', 'GHT') and a.sid=b1.sid
left join b b2 on a.ID in ('XFG', 'GHT') and Tested is null and a.pid=b2.pid
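A small sketch of how such conditional ON clauses behave, run in SQLite through Python rather than Hive (an assumption: the semantics shown are plain SQL). The column layout, the sample rows, the label column, the aliases b1/b2, and putting tested on table a are all hypothetical.

```python
import sqlite3

# SQLite stands in for Hive here; schema and rows are made up for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (id TEXT, sid INTEGER, pid INTEGER, tested TEXT);
    CREATE TABLE b (sid INTEGER, pid INTEGER, label TEXT);
    INSERT INTO a VALUES ('AAA', 1, 10, NULL),
                         ('XFG', 2, 20, NULL),
                         ('GHT', 3, 30, 'y');
    INSERT INTO b VALUES (1, 99, 'by_sid'),
                         (99, 20, 'by_pid'),
                         (3, 30, 'unused');
""")

# Each row of a can match through at most one of the two joins,
# because the constant conditions on a.id are mutually exclusive.
rows = conn.execute("""
    SELECT a.id, COALESCE(b1.label, b2.label)
    FROM a
    LEFT JOIN b AS b1 ON a.id NOT IN ('XFG', 'GHT') AND a.sid = b1.sid
    LEFT JOIN b AS b2 ON a.id IN ('XFG', 'GHT') AND a.tested IS NULL
                     AND a.pid = b2.pid
    ORDER BY a.id
""").fetchall()

print(rows)  # [('AAA', 'by_sid'), ('GHT', None), ('XFG', 'by_pid')]
```

The GHT row stays unmatched because its tested value is not null, so a third join (or a third branch) would be needed to cover condition 3 from the question.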

How to get count of two fields from two different table with grouping a field from another table in mysql

I have three tables projects, discussions, and comments.
I have tried it like this:
SELECT p.PRO_Name, COUNT( d.DIS_Id ) AS nofdisc, COUNT( c.COM_Id ) AS nofcom
FROM projects p
LEFT JOIN discussions d ON p.PRO_Id = d.PRO_Id
LEFT JOIN comments c ON d.DIS_Id = c.DIS_Id
GROUP BY p.PRO_Name LIMIT 0 , 30
But it's taking all the rows from discussions and the count of comments is the same as the count of discussions.
count counts the number of non-null values of the given parameter. The join you have will create a row per comment, where both dis_id and com_id are not null, so their counts would be the same. Since these are IDs, you could just count the distinct number of occurrences to get the response you'd want:
(EDIT: Added an order by clause as per the request in the comments)
SELECT p.PRO_Name,
COUNT(DISTINCT d.DIS_Id) AS nofdisc,
COUNT(DISTINCT c.COM_Id) AS nofcom
FROM projects p
LEFT JOIN discussions d ON p.PRO_Id = d.PRO_Id
LEFT JOIN comments c ON d.DIS_Id = c.DIS_Id
GROUP BY p.PRO_Name
ORDER BY 2,3
LIMIT 0 , 30
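To see the difference between the two counts, here is a rough sketch using Python's sqlite3 as a stand-in for MySQL. The table and column names follow the question; the sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE projects    (PRO_Id INTEGER, PRO_Name TEXT);
    CREATE TABLE discussions (DIS_Id INTEGER, PRO_Id INTEGER);
    CREATE TABLE comments    (COM_Id INTEGER, DIS_Id INTEGER);
    INSERT INTO projects    VALUES (1, 'P1'), (2, 'P2');
    -- P1 has two discussions; only the first one has comments (three of them).
    INSERT INTO discussions VALUES (10, 1), (11, 1);
    INSERT INTO comments    VALUES (100, 10), (101, 10), (102, 10);
""")

rows = conn.execute("""
    SELECT p.PRO_Name,
           COUNT(d.DIS_Id)          AS plain_disc,
           COUNT(c.COM_Id)          AS plain_com,
           COUNT(DISTINCT d.DIS_Id) AS nofdisc,
           COUNT(DISTINCT c.COM_Id) AS nofcom
    FROM projects p
    LEFT JOIN discussions d ON p.PRO_Id = d.PRO_Id
    LEFT JOIN comments c    ON d.DIS_Id = c.DIS_Id
    GROUP BY p.PRO_Name
    ORDER BY p.PRO_Name
""").fetchall()

print(rows)  # [('P1', 4, 3, 2, 3), ('P2', 0, 0, 0, 0)]
```

The plain count of discussions is 4 because discussion 10 appears once per comment; the distinct count gives the expected 2.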

MySQL distinct sum without subqueries

I have two main tables bills and billing_items:
bills
_______
id
..
amount
balance
billing_items
______________
id
...
bill_id
I need to get the sum of the bill amount and balance based on certain criteria in the billing_items table (the table references other tables of interest)
when I use the below query I get duplicates:
select sum(b.amount), sum(b.balance)
from bills b left join billing_items bi
on b.id=bi.bill_id;
I can't use subqueries such as the one below because of the ORM I'm using (subqueries not supported):
select sum(a) from
(select b.amount as a, b.balance
from bills b left join billing_items bi
on b.id=bi.bill_id group by b.id) t;
The criteria on billing_items and its referencing tables are omitted here, but I do need to reference billing_items.
Not sure whether your ORM allows user variables in your SQL. If it does, you could try this. Basically it orders by bi.bill_id and only adds the amount when a new bill_id appears:
select sum(IF (@prevBillId IS NULL OR @prevBillId != bi.bill_id,b.amount,0)) as sumAmount,
sum(IF (@prevBillId IS NULL OR @prevBillId != bi.bill_id,b.balance,0)) as sumBalance,
@prevBillId:=bi.bill_id
from bills b left join billing_items bi
on b.id=bi.bill_id
ORDER BY bi.bill_id;
see this sqlFiddle
Hopefully this will work...
So your issue is that the left join to billing_items produces multiple rows per bill, so sum(b.amount) is multiplied by the number of matching rows. The solution: divide each sum by the row count.
select b.id, sum(b.amount) / count(*) as amount, sum(b.balance) / count(*) as balance
from bills b left join billing_items bi
on b.id=bi.bill_id
group by b.id;
Give that a try...hoping it works out.
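A quick sketch of the row multiplication and of the divide-by-count idea, using Python's sqlite3 in place of MySQL. The sample bills and items are invented, and both sums are divided here. Note that the grouped query returns one row per bill, so the grand total still has to be added up afterwards (done client-side below).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE bills         (id INTEGER, amount REAL, balance REAL);
    CREATE TABLE billing_items (id INTEGER, bill_id INTEGER);
    INSERT INTO bills VALUES (1, 100, 40), (2, 50, 10), (3, 20, 5);
    -- Bill 1 has three items, bill 2 has one, bill 3 has none.
    INSERT INTO billing_items VALUES (1, 1), (2, 1), (3, 1), (4, 2);
""")

# Naive join: bill 1 is counted three times.
naive = conn.execute("""
    SELECT SUM(b.amount), SUM(b.balance)
    FROM bills b LEFT JOIN billing_items bi ON b.id = bi.bill_id
""").fetchone()

# Dividing by the row count per bill undoes the multiplication.
per_bill = conn.execute("""
    SELECT b.id, SUM(b.amount) / COUNT(*), SUM(b.balance) / COUNT(*)
    FROM bills b LEFT JOIN billing_items bi ON b.id = bi.bill_id
    GROUP BY b.id
    ORDER BY b.id
""").fetchall()

total_amount = sum(amount for _, amount, _ in per_bill)
total_balance = sum(balance for _, _, balance in per_bill)

print(naive)                        # (370.0, 135.0)
print(per_bill)                     # [(1, 100.0, 40.0), (2, 50.0, 10.0), (3, 20.0, 5.0)]
print(total_amount, total_balance)  # 170.0 55.0
```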

Left Join One Column in Table 1 to Two Columns in Table 2

I have two tables, one is a "phone number" table and the other is a "calls" table. The calls table has two columns of interest: the originating number column (c.orig) and the terminating number column (c.term).
I'm trying to write a MySQL query that will return all records in the call table where NEITHER the c.orig number or the c.term number exist in the numbers table (the "n.num" column in the numbers table).
Here is my SQL query:
SELECT
c.id, c.date, c.orig, c.term, c.duration
FROM calls as c
LEFT JOIN numbers as n ON (n.num = c.orig AND n.num = c.term)
WHERE
c.period = '2012-08' AND
n.num IS NULL
GROUP BY c.call_id
ORDER BY c.call_id
LIMIT 0,300
Any ideas?
Here is some further clarification:
------------------------------
table: numbers
nid num
1 111-222-3333
2 222-333-4444
3 333-444-5555
------------------------------
------------------------------
table: calls
id orig term
1 333-444-5555 999-999-9999
2 999-999-9999 111-222-3333
3 222-333-4444 999-999-9999
4 888-888-8888 999-999-9999
5 777-777-7777 999-999-9999
------------------------------
Call IDs 1, 2, and 3 have at least one of the two numbers (orig or term) that can be found in the numbers table.
Call IDs 4 and 5 are cases where neither of the two phone numbers is in the numbers table. Those are the records that I'm trying to find: records where neither phone number is found in the numbers table.
You should join the numbers table twice for this:
SELECT
c.id, c.date, c.orig, c.term, c.duration
FROM calls as c
LEFT JOIN numbers as n
ON (n.num = c.orig)
LEFT JOIN numbers m
ON m.num = c.term
WHERE
c.period = 'date here' AND
m.num IS NULL
-- GROUP BY c.call_id
ORDER BY c.call_id
LIMIT 0,300
A question: I removed your GROUP BY clause because I don't see any aggregate function. What else do you want to do?
UPDATE 1
Based on your examples above, try this edited version. You wrote:
"Call IDs 4 and 5 are cases where neither of the two phone numbers
is in the numbers table. Those are the records that I'm trying to
find: records where neither phone number is found in the numbers table."
SELECT a.*
FROM calls a
LEFT JOIN numbers b
ON a.orig = b.num
LEFT JOIN numbers c
ON a.term = c.num
WHERE b.num IS NULL AND
c.num IS NULL
SQLFiddle Demo
Hope this makes sense.
You should keep in mind that, logically, the condition in your ON clause is applied to every single row, not to the lot of them, so you can't really have any single n.num be equal to both c.orig and c.term, unless, of course, the two are the same number (which in your particular case would make no sense).
So, what you actually need to be checking there is whether n.num is equal to either c.orig or c.term. That is, just replace AND with OR and you are done:
SELECT
c.id, c.date, c.orig, c.term, c.duration
FROM calls as c
LEFT JOIN numbers as n ON (n.num = c.orig OR n.num = c.term)
WHERE
c.period = '2012-08' AND
n.num IS NULL
GROUP BY c.call_id
ORDER BY c.call_id
LIMIT 0,300
The question already uses the word EXIST. Well, this is exactly why EXISTS exists:
SELECT c.id, c.date, c.orig, c.term, c.duration
FROM calls c
WHERE NOT EXISTS ( SELECT *
FROM numbers a
WHERE a.num = c.orig
)
AND NOT EXISTS ( SELECT *
FROM numbers b
WHERE b.num = c.term
);
The two subqueries could even be combined into one:
SELECT c.id, c.date, c.orig, c.term, c.duration
FROM calls c
WHERE NOT EXISTS ( SELECT *
FROM numbers ab
WHERE ab.num = c.orig
OR ab.num = c.term
);
NOTE: I omitted the period-condition and the LIMIT, which are both irrelevant to the actual problem.
BTW: "date" is a reserved word in SQL. Better not use it as a column name.
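For comparison, a minimal sketch that runs three of the variants from this thread (two left joins, a single OR join, and NOT EXISTS) against the sample data from the question. It uses Python's sqlite3 as a stand-in for MySQL; only the columns needed for the check are created, and the dictionary keys are just labels.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE numbers (nid INTEGER PRIMARY KEY, num TEXT);
    CREATE TABLE calls   (id INTEGER PRIMARY KEY, orig TEXT, term TEXT);
    INSERT INTO numbers VALUES (1, '111-222-3333'),
                               (2, '222-333-4444'),
                               (3, '333-444-5555');
    INSERT INTO calls VALUES (1, '333-444-5555', '999-999-9999'),
                             (2, '999-999-9999', '111-222-3333'),
                             (3, '222-333-4444', '999-999-9999'),
                             (4, '888-888-8888', '999-999-9999'),
                             (5, '777-777-7777', '999-999-9999');
""")

queries = {
    "two_left_joins": """
        SELECT c.id FROM calls c
        LEFT JOIN numbers a ON a.num = c.orig
        LEFT JOIN numbers b ON b.num = c.term
        WHERE a.num IS NULL AND b.num IS NULL
        ORDER BY c.id
    """,
    "or_join": """
        SELECT c.id FROM calls c
        LEFT JOIN numbers n ON (n.num = c.orig OR n.num = c.term)
        WHERE n.num IS NULL
        ORDER BY c.id
    """,
    "not_exists": """
        SELECT c.id FROM calls c
        WHERE NOT EXISTS (SELECT * FROM numbers ab
                          WHERE ab.num = c.orig OR ab.num = c.term)
        ORDER BY c.id
    """,
}

# Run every variant and collect the matching call ids.
results = {name: [row[0] for row in conn.execute(sql)]
           for name, sql in queries.items()}
print(results)  # every variant returns [4, 5]
```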

MySQL left join counts

I have a left join to a table and want to count columns from it, after grouping by a column of the parent table:
SELECT * , COUNT(list.id) AS listcount, COUNT(uploads.id) AS uploadcount
FROM members
LEFT JOIN lists ON members.id= list.mid
LEFT JOIN uploads ON members.id= uploads.mid
GROUP BY members.id
Assume that a user can have either lists or uploads, depending on the type of user. Is the above query good enough then? If not, why?
Or do I have to use this query?
SELECT * , l.listcount, u.uploadcount
FROM members
LEFT JOIN (select count(lists.id) as listscount,mid from lists group by mid) as l
on l.mid = m.id
LEFT JOIN (select count(uploads.id) as uploadscount
,mid from uploads group by mid) as u on u.mid = m.id
GROUP BY members.id
Or correlated subqueries?
SELECT *,
(select count(lists.id) as listscount from lists as l where l.mid = m.id
group by mid) as listcount
(select count(uploads.id) from uploads as u where u.mid = m.id
group by mid) as uploadscount
FROM members
GROUP BY members.id
And which is best solution?
The alias m for members is missing in query 2 and 3. Otherwise they should give the same numbers.
Query 2 (fixed) will perform fastest.
Query 1 is different in that it will give a higher number for uploads, if there are cases of multiple lists per member. After joining to lists, there will be multiple rows for a member too, which will increase the count for uploads. So query 1 is probably wrong.
Also, NULL values are not counted. The manual informs:
COUNT(expr)
Returns a count of the number of non-NULL values of expr in the rows
retrieved by a SELECT statement. The result is a BIGINT value.
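Here is a small sketch of why query 1 overcounts when a member has rows in both tables, compared with the pre-aggregated subqueries of query 2. It runs in Python's sqlite3 as a stand-in for MySQL; the sample rows are invented, the aliases are filled in, and the COALESCE calls are my addition so that members without rows show 0 instead of NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE members (id INTEGER PRIMARY KEY);
    CREATE TABLE lists   (id INTEGER PRIMARY KEY, mid INTEGER);
    CREATE TABLE uploads (id INTEGER PRIMARY KEY, mid INTEGER);
    INSERT INTO members VALUES (1), (2);
    -- Member 1: two lists and three uploads. Member 2: one upload only.
    INSERT INTO lists   VALUES (1, 1), (2, 1);
    INSERT INTO uploads VALUES (1, 1), (2, 1), (3, 1), (4, 2);
""")

# Query 1: both joins multiply each other's rows (2 x 3 = 6 for member 1).
query1 = conn.execute("""
    SELECT m.id, COUNT(l.id), COUNT(u.id)
    FROM members m
    LEFT JOIN lists l   ON m.id = l.mid
    LEFT JOIN uploads u ON m.id = u.mid
    GROUP BY m.id
    ORDER BY m.id
""").fetchall()

# Query 2 (with aliases fixed): aggregate each child table before joining.
query2 = conn.execute("""
    SELECT m.id, COALESCE(l.listcount, 0), COALESCE(u.uploadcount, 0)
    FROM members m
    LEFT JOIN (SELECT mid, COUNT(id) AS listcount
               FROM lists GROUP BY mid) AS l ON l.mid = m.id
    LEFT JOIN (SELECT mid, COUNT(id) AS uploadcount
               FROM uploads GROUP BY mid) AS u ON u.mid = m.id
    ORDER BY m.id
""").fetchall()

print(query1)  # [(1, 6, 6), (2, 0, 1)]
print(query2)  # [(1, 2, 3), (2, 0, 1)]
```

If a member really can only ever have lists or uploads, never both, query 1 gives the same numbers as query 2, as member 2 shows.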