I am trying to link two tables on a similar column. I need to find out how many values differ between table1.column1 and table2.column1:
My current query:
SELECT i10_descr.i10_code, gems_pcsi9.i10_code
FROM i10_descr INNER JOIN gems_pcsi9 ON i10_descr.i10_code = gems_pcsi9.i10_code
ORDER BY i10_descr.i10_code;
I know this query shows the matching codes of each table: I cannot figure out how to COUNT the missing/different codes in the tables.
Also, I have to compute the ratio of codes.
Any help, tips, or direction is much appreciated.
Thanks
You could use an anti-join pattern to get a list of i10_code that exist in one table, but not the other. For example:
SELECT i.i10_code
FROM i10_descr i
LEFT
JOIN gems_pcsi9 g
ON g.i10_code = i.i10_code
WHERE g.i10_code IS NULL
ORDER BY i.i10_code
If you just want a count, you could use COUNT(i.i10_code) and/or COUNT(DISTINCT i.i10_code) in the SELECT list and remove the ORDER BY clause.
To get the i10_code in the gems table that aren't in the i10 table, you'd do the same thing but invert the query so that gems is the "driving" table. e.g.
SELECT COUNT(DISTINCT g.i10_code) AS cnt_diff
FROM gems_pcsi9 g
LEFT
JOIN i10_descr i
ON i.i10_code = g.i10_code
WHERE i.i10_code IS NULL
If you want to combine the number of differences, you can combine the two queries by making them inline views:
SELECT d.cnt_diff + e.cnt_diff AS total_diff
FROM (
SELECT COUNT(DISTINCT g.i10_code) AS cnt_diff
FROM gems_pcsi9 g
LEFT
JOIN i10_descr i
ON i.i10_code = g.i10_code
WHERE i.i10_code IS NULL
) d
CROSS
JOIN (
SELECT COUNT(DISTINCT i.i10_code) AS cnt_diff
FROM i10_descr i
LEFT
JOIN gems_pcsi9 g
ON g.i10_code = i.i10_code
WHERE g.i10_code IS NULL
) e
NOTE: the COUNT aggregate will omit NULL values. The query would need to be tweaked if you also wanted to "count" rows that had NULL values for i10_code. You'd use COUNT(DISTINCT ) if you want just a number of distinct values that are different. A COUNT() would give a number of rows. These two results would be different if you had multiple rows with the same i10_code value.
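To make the difference between the three counts concrete, here is a minimal sketch that runs the anti-join through Python's sqlite3 module. The sample codes are made up for illustration, and SQLite only stands in for whatever database you are actually using.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE i10_descr  (i10_code TEXT);
    CREATE TABLE gems_pcsi9 (i10_code TEXT);
    -- hypothetical sample data: 'A02' is duplicated and one code is NULL
    INSERT INTO i10_descr  VALUES ('A01'), ('A02'), ('A02'), ('A03'), (NULL);
    INSERT INTO gems_pcsi9 VALUES ('A01'), ('B01');
""")

# Anti-join: rows of i10_descr that have no partner in gems_pcsi9.
row = conn.execute("""
    SELECT COUNT(*)                   AS cnt_rows,      -- every unmatched row
           COUNT(i.i10_code)          AS cnt_non_null,  -- skips the NULL code
           COUNT(DISTINCT i.i10_code) AS cnt_distinct   -- 'A02' counted once
      FROM i10_descr i
      LEFT JOIN gems_pcsi9 g ON g.i10_code = i.i10_code
     WHERE g.i10_code IS NULL
""").fetchone()

print(row)
```

With this data the unmatched rows are A02, A02, A03 and the NULL row, so the three counts come out as 4, 3 and 2.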
To get a "ratio" of codes, assuming that at this point the "differences" don't matter, you get a count of codes from each table. The queries to do that could be used as inline views:
SELECT d.cnt / e.cnt AS ratio_cnt_g_over_cnt_i
, d.cnt AS cnt_g
, e.cnt AS cnt_i
FROM (
SELECT COUNT(DISTINCT g.i10_code) AS cnt
FROM gems_pcsi9 g
) d
CROSS
JOIN (
SELECT COUNT(DISTINCT i.i10_code) AS cnt
FROM i10_descr i
) e
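One thing to watch with the ratio: in MySQL the / operator returns a decimal, but several other engines (SQLite, PostgreSQL and SQL Server among them) truncate when both operands are integers. The sketch below, run through Python's sqlite3 module with made-up sample codes, shows the truncation and one possible workaround. Multiplying by 1.0 and wrapping the divisor in NULLIF are my additions, not part of the query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE i10_descr  (i10_code TEXT);
    CREATE TABLE gems_pcsi9 (i10_code TEXT);
    -- hypothetical sample data: 4 distinct codes vs. 2 distinct codes
    INSERT INTO i10_descr  VALUES ('A01'), ('A02'), ('A03'), ('A04');
    INSERT INTO gems_pcsi9 VALUES ('A01'), ('A01'), ('B01');
""")

row = conn.execute("""
    SELECT d.cnt / e.cnt                   AS int_ratio,  -- integer division here
           1.0 * d.cnt / NULLIF(e.cnt, 0)  AS ratio,      -- forced to floating point
           d.cnt                           AS cnt_g,
           e.cnt                           AS cnt_i
      FROM (SELECT COUNT(DISTINCT g.i10_code) AS cnt FROM gems_pcsi9 g) d
     CROSS JOIN
           (SELECT COUNT(DISTINCT i.i10_code) AS cnt FROM i10_descr i) e
""").fetchone()

print(row)
```

The NULLIF turns a zero divisor into NULL, so an empty i10_descr table yields a NULL ratio instead of a division-by-zero error on engines that raise one.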
An alternative method is to use union all with aggregation:
select in_i10descr, in_gems_pcsi9, count(*) as numcodes
from (select code, max(in_i10descr) as in_i10descr, max(in_gems_pcsi9) as in_gems_pcsi9
from ((select i10_descr.i10_code as code, 1 as in_i10descr, 0 as in_gems_pcsi9
from i10_descr
) union all
(select gems_pcsi9.i10_code, 0, 1
from gems_pcsi9
)
) t
group by code
) c
group by in_i10descr, in_gems_pcsi9;
This will calculate counts of things in each table separately and in both tables.
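Here is a small sketch of the union all approach, run through Python's sqlite3 module on made-up sample codes. SQLite does not accept parentheses around the branches of a UNION ALL, so they are dropped here; otherwise the query follows the same shape.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE i10_descr  (i10_code TEXT);
    CREATE TABLE gems_pcsi9 (i10_code TEXT);
    -- hypothetical sample data
    INSERT INTO i10_descr  VALUES ('A01'), ('A02'), ('A02'), ('A03');
    INSERT INTO gems_pcsi9 VALUES ('A01'), ('B01');
""")

rows = conn.execute("""
    SELECT in_i10descr, in_gems_pcsi9, COUNT(*) AS numcodes
      FROM (SELECT code,
                   MAX(in_i10descr)   AS in_i10descr,
                   MAX(in_gems_pcsi9) AS in_gems_pcsi9
              FROM (SELECT i10_code AS code, 1 AS in_i10descr, 0 AS in_gems_pcsi9
                      FROM i10_descr
                    UNION ALL
                    SELECT i10_code, 0, 1
                      FROM gems_pcsi9) t
             GROUP BY code) c          -- one row per distinct code, with two flags
     GROUP BY in_i10descr, in_gems_pcsi9
     ORDER BY in_i10descr, in_gems_pcsi9
""").fetchall()

for in_i10, in_gems, numcodes in rows:
    print(in_i10, in_gems, numcodes)
```

Each output row is a flag combination: (0, 1) is "only in gems_pcsi9", (1, 0) is "only in i10_descr" and (1, 1) is "in both". With this data the counts are 1, 2 and 1 respectively.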
Related
I have 3 tables with following columns.
Table: A with column: newColumnTyp1, typ2
Table: B with column: typ2, tableC_id_fk
Table: C with column: id, typ1
I wanted to update values in A.newColumnTyp1 from C.typ1 by following logic:
if A.typ2=B.typ2 and B.tableC_id_fk=C.id
the values must be distinct, if any of the conditions above gives multiple results then should be ignored. For example A.typ2=B.typ2 may give multiple result in that case it should be ignored.
edit:
the values must be distinct, if any of the conditions above gives multiple results then take only one value and ignore rest. For example A.typ2=B.typ2 may give multiple result in that case just take any one value and ignore rest because all the results from A.typ2=B.typ2 will have same B.tableC_id_fk.
I have tried:
SELECT DISTINCT C.typ1, B.typ2
FROM C
LEFT JOIN B ON C.id = B.tableC_id_fk
LEFT JOIN A ON B.typ2= A.typ2
it gives me a result of table with two columns typ1,typ2
My logic was, I will then filter this new table and compare the type2 value with A.typ2 and update A.newColumnTyp1
I thought of something like this but was a failure:
update A set newColumnTyp1= (
SELECT C.typ1 from
SELECT DISTINCT C.typ1, B.typ2
FROM C
LEFT JOIN B ON C.id = B.tableC_id_fk
LEFT JOIN A ON B.typ2= A.type2
where A.typ2=B.typ2);
I am thinking of an updateable CTE and window functions:
with cte as (
select a.newColumnTyp1, c.typ1, count(*) over(partition by a.typ2) cnt
from a
inner join b on b.typ2 = a.typ2
inner join c on c.id = b.tableC_id_fk
)
update cte
set newColumnTyp1 = typ1
where cnt = 1
Update: if the columns have the same name, then alias one of them:
with cte as (
select a.typ1, c.typ1 typ1c, count(*) over(partition by a.typ2) cnt
from a
inner join b on b.typ2 = a.typ2
inner join c on c.id = b.tableC_id_fk
)
update cte
set typ1 = typ1c
where cnt = 1
I think I would approach this as:
update a
set newColumnTyp1 = bc.min_typ1
from (select b.typ2, min(c.typ1) as min_typ1, max(c.typ1) as max_typ1
from b join
c
on b.tableC_id_fk = c.id
group by b.typ2
) bc
where bc.typ2 = a.typ2 and
bc.min_typ1 = bc.max_typ1;
The subquery determines whether typ1 is always the same. If so, it is used for updating.
I should note that you might want the most common value assigned, instead of requiring unanimity. If that is what you want, then you can ask another question.
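For anyone who wants to try the "update only when typ1 is unanimous" rule without a server at hand, here is a rough sketch using Python's sqlite3 module. It swaps the UPDATE ... FROM form for a correlated subquery, since older SQLite versions do not support UPDATE ... FROM. The sample rows and the placeholder value 'old' are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE c (id INTEGER, typ1 TEXT);
    CREATE TABLE b (typ2 TEXT, tableC_id_fk INTEGER);
    CREATE TABLE a (typ2 TEXT, newColumnTyp1 TEXT);
    -- hypothetical sample data
    INSERT INTO c VALUES (1, 'x'), (2, 'y');
    INSERT INTO b VALUES ('p', 1), ('p', 1),   -- 'p' always maps to 'x'
                         ('q', 1), ('q', 2),   -- 'q' is ambiguous
                         ('r', 2);             -- 'r' maps to 'y'
    INSERT INTO a VALUES ('p', 'old'), ('q', 'old'), ('r', 'old'), ('s', 'old');
""")

conn.execute("""
    UPDATE a
       SET newColumnTyp1 = (SELECT MIN(c.typ1)
                              FROM b JOIN c ON c.id = b.tableC_id_fk
                             WHERE b.typ2 = a.typ2)
     WHERE a.typ2 IN (SELECT b.typ2
                        FROM b JOIN c ON c.id = b.tableC_id_fk
                       GROUP BY b.typ2
                      HAVING MIN(c.typ1) = MAX(c.typ1))  -- unanimous typ1 only
""")

rows = conn.execute("SELECT typ2, newColumnTyp1 FROM a ORDER BY typ2").fetchall()
print(rows)
```

Rows 'p' and 'r' are updated, while 'q' (two different typ1 values) and 's' (no match in b) keep their old value.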
Count non-null values directly from select statement (not using where) on a left joined table
I need count(*) as comments to count non-null values only. An inner join is not a solution, because it would exclude content with zero comments from count(distinct (t1.postId)) as no_of_content.
select t1.tagId as tagId, count(distinct (t1.postId)) as no_of_content, count(*) as comments
from content_created as t1
left join comment_created as t2
on t1.postId=t2.postId
where
( (t1.tagId = "S2036623" )
or (t1.tagId = "S97422" )
)
group BY 1
Posting sample data would help us answer this, but you can change your count expression to:
COUNT(CASE WHEN t2.postId IS NOT NULL THEN 1 END) as comments
Count only counts non-null values. What you need to do is reference the right hand side table's column explicitly. So instead of saying count(*) use count(right_joined_table.join_key).
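As a quick check of that point against the tables from the question, here is a sketch run through Python's sqlite3 module. The rows are made up, and string literals use single quotes because SQLite treats double quotes as identifiers first.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE content_created (tagId TEXT, postId INTEGER);
    CREATE TABLE comment_created (postId INTEGER);
    -- hypothetical sample data: post 1 has three comments, posts 2 and 3 have none
    INSERT INTO content_created VALUES ('S2036623', 1), ('S2036623', 2), ('S97422', 3);
    INSERT INTO comment_created VALUES (1), (1), (1);
""")

rows = conn.execute("""
    SELECT t1.tagId,
           COUNT(DISTINCT t1.postId) AS no_of_content,
           COUNT(*)                  AS all_rows,   -- also counts unmatched posts
           COUNT(t2.postId)          AS comments    -- only rows with a real comment
      FROM content_created AS t1
      LEFT JOIN comment_created AS t2 ON t1.postId = t2.postId
     WHERE t1.tagId IN ('S2036623', 'S97422')
     GROUP BY t1.tagId
     ORDER BY t1.tagId
""").fetchall()

print(rows)
```

For tag S97422 the single post has no comments, so count(*) still reports 1 while count(t2.postId) reports 0.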
Here's a full example using BigQuery:
with left_table as (
select num
from unnest(generate_array(1,10)) as num
), right_table as (
select num
from unnest(generate_array(2,10,2)) as num
)
select
count(*) as total_rows,
count(l.num) as left_table_counts,
count(r.num) as non_null_counts
from left_table as l
left outer join right_table as r
on l.num = r.num
This gives you the following results: total_rows = 10, left_table_counts = 10, non_null_counts = 5.
I have the SQL command:
SELECT
vinculo.id,
data start,
count(*) title
from
atendimento_regulacao
join vinculo on vinculo.id = atendimento_regulacao.vinculo_id
where data = '2019-07-02'
group by vinculo.usuario_id, atendimento_regulacao.data
The result is empty because no record exists where data = '2019-07-02'.
How can I still show the id, as below?
id | start | title
1 | |
You can use a CROSS JOIN to generate the rows and LEFT JOIN to bring in the results:
select v.id, d.dte as start, count(ar.vinculo_id) as num_title
from (select '2019-07-02' as dte) d cross join
vinculo v left join
atendimento_regulacao ar
on v.id = ar.vinculo_id and ar.data = d.dte
group by v.id, d.dte;
If you really want to aggregate by v.usuario_id, then include it in both the select and group by.
Notes:
The structure of the query easily extends to multiple dates.
The GROUP BY uses the same columns in the SELECT.
Table aliases make the query easier to write and to read.
Qualify all column references in a query that has more than one table reference.
The COUNT() uses a column from ar so it can return 0.
For the specific case of a single date, you can drop the CROSS JOIN and put the date filter directly in the ON clause:
select v.id, '2019-07-02' as start,
count(ar.vinculo_id) as num_title
from vinculo v left join
atendimento_regulacao ar
on v.id = ar.vinculo_id and ar.data = '2019-07-02'
group by v.id;
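To see the zero counts appear, here is a small sketch of the CROSS JOIN plus LEFT JOIN pattern, run through Python's sqlite3 module. The rows are made up, and a second date is added to the derived table only to illustrate how the pattern extends to multiple dates.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE vinculo (id INTEGER, usuario_id INTEGER);
    CREATE TABLE atendimento_regulacao (vinculo_id INTEGER, data TEXT);
    -- hypothetical sample data: nothing at all on 2019-07-02
    INSERT INTO vinculo VALUES (1, 10), (2, 20);
    INSERT INTO atendimento_regulacao VALUES (2, '2019-07-01'), (2, '2019-07-01');
""")

rows = conn.execute("""
    SELECT v.id, d.dte AS start, COUNT(ar.vinculo_id) AS num_title
      FROM (SELECT '2019-07-01' AS dte
            UNION ALL
            SELECT '2019-07-02') d          -- one row per date of interest
     CROSS JOIN vinculo v                   -- every (date, vinculo) pair exists
      LEFT JOIN atendimento_regulacao ar
        ON ar.vinculo_id = v.id AND ar.data = d.dte
     GROUP BY v.id, d.dte
     ORDER BY v.id, d.dte
""").fetchall()

for row in rows:
    print(row)
```

Every vinculo shows up for every date, with a count of 0 where atendimento_regulacao has no matching rows.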
Use a RIGHT JOIN and replace your count with the expression below; otherwise the query shows zero whenever there is nothing to count.
SELECT v.id, a.data start,
NULLIF(COUNT(a.vinculo_id), 0) title
FROM atendimento_regulacao a
RIGHT JOIN vinculo v
ON v.id = a.vinculo_id
AND a.data = '2019-07-02'
GROUP BY v.usuario_id, a.data;
I have 3 interconnected tables, and I want to select columns from two of them and a count from the third. If anyone knows how to do this, any hint would be appreciated.
Below is the sql i tried, but the count is getting repeated
SELECT distinct p.p_id, p.p_f6, p.p_l4,m.m_id, (
SELECT COUNT(*)
FROM ttokens t where t.pdetail_id = p.pdetail_id
) AS token_count
FROM tparking p,ttokens t LEFT join ttokens_md m ON t.trefn_id = m.trefn_id
WHERE t.pdetail_id = p.pdetail_id
You can try using a JOIN with a subquery to get the count instead of a subquery in the SELECT list.
SELECT p.p_id, p.p_f6, p.p_l4, m.m_id, c.cnt
FROM tparking p
JOIN (
SELECT pdetail_id, COUNT(*) cnt
FROM ttokens
GROUP BY pdetail_id
) c ON c.pdetail_id = p.pdetail_id
JOIN ttokens t ON t.pdetail_id = p.pdetail_id
LEFT JOIN ttokens_md m ON t.trefn_id = m.trefn_id
Note
I would use explicit JOIN syntax instead of a comma-separated table list with the join condition in the WHERE clause; the comma style is outdated.
I am working on 2 problems for homework, and after many hours I have just about solved them both. The last issue is that both of my queries come back with doubled numerical values instead of the correct ones.
Here is what I have:
SELECT SUM(P.AMT_PAID) AS TOTAL_PAID, C.CITATION_ID, C.DATE_ISSUED, SUM(V.FINE_CHARGED) AS TOTAL_CHARGED
FROM PAYMENT P, CITATION C, VIOLATION_CITATION V
WHERE V.CITATION_ID = C.CITATION_ID
AND C.CITATION_ID = P.CITATION_ID
GROUP BY C.CITATION_ID;
and my other one:
SELECT C.CITATION_ID, C.DATE_ISSUED, SUM(V.FINE_CHARGED) AS TOTAL_CHARGED, SUM(P.AMT_PAID) AS TOTAL_PAID, SUM(V.FINE_CHARGED) - SUM(P.AMT_PAID) AS TOTAL_OWED
FROM (CITATION C)
LEFT JOIN VIOLATION_CITATION V
ON V.CITATION_ID = C.CITATION_ID
LEFT JOIN PAYMENT P
ON P.CITATION_ID = C.CITATION_ID
GROUP BY C.CITATION_ID
ORDER BY TOTAL_OWED DESC;
I am sure there is just something that I am overlooking. If someone else could kindly tell me where I went awry it would be a great help.
Select Sum(P.Amt_Paid) As Total_Paid, C.Citation_Id
, C.Date_Issued, Sum(V.Fine_Charged) As Total_Charged
From Payment P
Join Citation C
On C.Citation_Id = P.Citation_Id
Join Violation_Citation V
On V.Citation_Id = C.Citation_Id
Group By C.Citation_Id
First, you should use the JOIN syntax instead of using the comma-delimited list of tables. It makes it easier to read, more standardized and will help prevent problems by overlooking a filtering clause.
Second, the most likely reason for having a sum that is too large is due to the join to the VIOLATION_CITATION table. If you remove the Group By and columns with aggregate functions, you will likely see that P.AMT_PAID is repeated for each instance of VIOLATION_CITATION. Perhaps, the following will solve the problem:
Select Coalesce(PaidByCitation.TotalAmtPaid,0) As Total_Paid
, C.Citation_Id, C.Date_Issued
, Coalesce(ViolationByCitation.TotalCharged,0) As Total_Charged
, Coalesce(ViolationByCitation.TotalCharged,0)
- Coalesce(PaidByCitation.TotalAmtPaid,0) As Total_Owed
From Citation As C
Left Join (
Select P.Citation_Id, Sum( P.Amt_Paid ) As TotalAmtPaid
From Payment As P
Group By P.Citation_Id
) As PaidByCitation
On PaidByCitation.Citation_Id = C.Citation_Id
Left Join (
Select V.Citation_Id, Sum( V.Fine_Charged ) As TotalCharged
From Violation_Citation As V
Group By V.Citation_Id
) As ViolationByCitation
On ViolationByCitation.Citation_Id = C.Citation_Id
The Coalesce calls ensure that if a left join returns no rows for a given Citation_Id value, the resulting Null is replaced with zero.
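To illustrate why the sums double and how pre-aggregating avoids it, here is a minimal sketch run through Python's sqlite3 module. The amounts are made up and the tables are cut down to the columns that matter. Citation 1 has two violations and two payments, so a direct join produces 2 x 2 = 4 rows for it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE citation (citation_id INTEGER);
    CREATE TABLE violation_citation (citation_id INTEGER, fine_charged INTEGER);
    CREATE TABLE payment (citation_id INTEGER, amt_paid INTEGER);
    -- hypothetical sample data
    INSERT INTO citation VALUES (1), (2);
    INSERT INTO violation_citation VALUES (1, 50), (1, 30), (2, 40);
    INSERT INTO payment VALUES (1, 20), (1, 10);
""")

# Direct join: each payment row is repeated once per violation row and vice versa.
naive = conn.execute("""
    SELECT c.citation_id, SUM(v.fine_charged), SUM(p.amt_paid)
      FROM citation c
      LEFT JOIN violation_citation v ON v.citation_id = c.citation_id
      LEFT JOIN payment p            ON p.citation_id = c.citation_id
     GROUP BY c.citation_id
     ORDER BY c.citation_id
""").fetchall()

# Pre-aggregated: each derived table has at most one row per citation.
fixed = conn.execute("""
    SELECT c.citation_id,
           COALESCE(v.total_charged, 0),
           COALESCE(p.total_paid, 0),
           COALESCE(v.total_charged, 0) - COALESCE(p.total_paid, 0) AS total_owed
      FROM citation c
      LEFT JOIN (SELECT citation_id, SUM(fine_charged) AS total_charged
                   FROM violation_citation GROUP BY citation_id) v
        ON v.citation_id = c.citation_id
      LEFT JOIN (SELECT citation_id, SUM(amt_paid) AS total_paid
                   FROM payment GROUP BY citation_id) p
        ON p.citation_id = c.citation_id
     ORDER BY c.citation_id
""").fetchall()

print(naive)
print(fixed)
```

For citation 1 the direct join reports 160 charged and 60 paid, exactly double the real totals of 80 and 30 that the pre-aggregated version returns.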