I am working on two homework problems, and after many hours I have just about solved them both. The last issue is that both of my queries come back with doubled numerical values instead of the correct ones.
Here is what I have:
SELECT SUM(P.AMT_PAID) AS TOTAL_PAID, C.CITATION_ID, C.DATE_ISSUED, SUM(V.FINE_CHARGED) AS TOTAL_CHARGED
FROM PAYMENT P, CITATION C, VIOLATION_CITATION V
WHERE V.CITATION_ID = C.CITATION_ID
AND C.CITATION_ID = P.CITATION_ID
GROUP BY C.CITATION_ID;
and my other one:
SELECT C.CITATION_ID, C.DATE_ISSUED, SUM(V.FINE_CHARGED) AS TOTAL_CHARGED, SUM(P.AMT_PAID) AS TOTAL_PAID, SUM(V.FINE_CHARGED) - SUM(P.AMT_PAID) AS TOTAL_OWED
FROM (CITATION C)
LEFT JOIN VIOLATION_CITATION V
ON V.CITATION_ID = C.CITATION_ID
LEFT JOIN PAYMENT P
ON P.CITATION_ID = C.CITATION_ID
GROUP BY C.CITATION_ID
ORDER BY TOTAL_OWED DESC;
I am sure there is just something that I am overlooking. If someone else could kindly tell me where I went awry it would be a great help.
Select Sum(P.Amt_Paid) As Total_Paid, C.Citation_Id
, C.Date_Issued, Sum(V.Fine_Charged) As Total_Charged
From Payment P
Join Citation C
On C.Citation_Id = P.Citation_Id
Join Violation_Citation V
On V.Citation_Id = C.Citation_Id
Group By C.Citation_Id
First, you should use the JOIN syntax instead of the comma-delimited list of tables. It is easier to read, more standardized, and helps prevent problems caused by overlooking a filtering clause.
Second, the most likely reason for a sum that is too large is the join to the VIOLATION_CITATION table. If you remove the Group By and the columns with aggregate functions, you will likely see that P.AMT_PAID is repeated for each matching row of VIOLATION_CITATION (and each fine is repeated for each payment). Perhaps the following will solve the problem:
Select Coalesce(PaidByCitation.TotalAmtPaid,0) As Total_Paid
, C.Citation_Id, C.Date_Issued
, Coalesce(ViolationByCitation.TotalCharged,0) As Total_Charged
, Coalesce(ViolationByCitation.TotalCharged,0)
- Coalesce(PaidByCitation.TotalAmtPaid,0) As Total_Owed
From Citation As C
Left Join (
Select P.Citation_Id, Sum( P.Amt_Paid ) As TotalAmtPaid
From Payment As P
Group By P.Citation_Id
) As PaidByCitation
On PaidByCitation.Citation_Id = C.Citation_Id
Left Join (
Select V.Citation_Id, Sum( V.Fine_Charged ) As TotalCharged
From Violation_Citation As V
Group By V.Citation_Id
) As ViolationByCitation
On ViolationByCitation.Citation_Id = C.Citation_Id
Coalesce ensures that if a left join returns no rows for a given Citation_Id value, the resulting Null is replaced with zero.
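To see the effect on concrete numbers, here is a minimal sketch that runs both forms of the query against an in-memory SQLite database from Python. The sample rows are invented for illustration, and SQLite stands in for whatever database the homework uses, so treat it as a demonstration of the row multiplication rather than of the real data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE citation (citation_id INTEGER PRIMARY KEY, date_issued TEXT);
    CREATE TABLE violation_citation (citation_id INTEGER, fine_charged REAL);
    CREATE TABLE payment (citation_id INTEGER, amt_paid REAL);
    -- Invented data: citation 1 has two violations and two payments,
    -- citation 2 has one violation and no payment.
    INSERT INTO citation VALUES (1, '2011-01-01'), (2, '2011-01-02');
    INSERT INTO violation_citation VALUES (1, 100), (1, 50), (2, 75);
    INSERT INTO payment VALUES (1, 30), (1, 20);
""")

# Joining both child tables first yields 2 x 2 = 4 rows for citation 1,
# so every fine and every payment is summed twice.
naive = conn.execute("""
    SELECT c.citation_id, SUM(v.fine_charged), SUM(p.amt_paid)
    FROM citation c
    LEFT JOIN violation_citation v ON v.citation_id = c.citation_id
    LEFT JOIN payment p ON p.citation_id = c.citation_id
    GROUP BY c.citation_id
    ORDER BY c.citation_id
""").fetchall()

# Aggregating each child table on its own before joining keeps one row
# per citation on each side, so nothing is multiplied.
fixed = conn.execute("""
    SELECT c.citation_id,
           COALESCE(v.total_charged, 0),
           COALESCE(p.total_paid, 0)
    FROM citation c
    LEFT JOIN (SELECT citation_id, SUM(fine_charged) AS total_charged
               FROM violation_citation GROUP BY citation_id) v
           ON v.citation_id = c.citation_id
    LEFT JOIN (SELECT citation_id, SUM(amt_paid) AS total_paid
               FROM payment GROUP BY citation_id) p
           ON p.citation_id = c.citation_id
    ORDER BY c.citation_id
""").fetchall()

print(naive)  # citation 1 shows 300 charged and 100 paid (both doubled)
print(fixed)  # citation 1 shows 150 charged and 50 paid
```

With this data the naive query also returns NULL for the unpaid citation, which is the case Coalesce covers in the second form.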
How can I merge these two left joins: http://sqlfiddle.com/#!9/1d2954/69/0
SELECT d.`id`, (adcount + bdcount)
FROM `docs` d
LEFT JOIN
(
SELECT da.`doc_id`, COUNT(da.`doc_id`) AS adcount FROM `docs_scod_a` da
INNER JOIN `scod_a` a ON a.`id` = da.`scod_a_id`
WHERE a.`ver_a` IN ('AA', 'AB')
GROUP BY da.`doc_id`
) ad ON ad.`doc_id` = d.`id`
LEFT JOIN
(
SELECT db.`doc_id`, COUNT(db.`doc_id`) AS bdcount FROM `docs_scod_b` db
INNER JOIN `scod_b` b ON b.`id` = db.`scod_b_id`
WHERE b.`ver_b` IN ('BA', 'BB')
GROUP BY db.`doc_id`
) bd ON bd.`doc_id` = d.`id`
into a single left join, just to make it easier to use in my code, without making it any slower?
Let me first emphasize that your method of doing the calculation is the better method. You have two separate dimensions and aggregating them separately is often the most efficient method for doing the calculation. It is also the most scalable method.
That said, your query should be equivalent to this version:
SELECT d.id,
count(distinct a.id),
count(distinct b.id)
FROM docs d left join
docs_scod_a da
ON da.doc_id = d.id LEFT JOIN
scod_a a
ON a.id = da.scod_a_id AND a.ver_a IN ('AA', 'AB') LEFT JOIN
docs_scod_b db
ON db.doc_id = d.id LEFT JOIN
scod_b b
ON b.id = db.scod_b_id AND b.ver_b IN ('BA', 'BB')
GROUP BY d.id
ORDER BY d.id;
This query is more expensive than it looks, because the COUNT(DISTINCT) incurs additional overhead compared to COUNT().
And here is the SQL Fiddle.
And, because LEFT JOIN can return NULL values, your query is more correctly written as:
SELECT d.`id`, COALESCE(adcount, 0) + COALESCE(bdcount, 0)
If you were having problems with the results, this small change might fix those problems.
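A small sketch of why the COALESCE matters, run through Python's sqlite3 module. The tables docs_a and docs_b are simplified, hypothetical stand-ins for the joined mapping tables in the question; only the NULL behaviour is the point.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE docs (id INTEGER PRIMARY KEY);
    CREATE TABLE docs_a (doc_id INTEGER);  -- hypothetical, simplified
    CREATE TABLE docs_b (doc_id INTEGER);  -- hypothetical, simplified
    INSERT INTO docs VALUES (1), (2);
    INSERT INTO docs_a VALUES (1), (1), (2);
    INSERT INTO docs_b VALUES (1);          -- doc 2 has no "b" rows
""")

template = """
    SELECT d.id, {expr}
    FROM docs d
    LEFT JOIN (SELECT doc_id, COUNT(*) AS adcount
               FROM docs_a GROUP BY doc_id) ad ON ad.doc_id = d.id
    LEFT JOIN (SELECT doc_id, COUNT(*) AS bdcount
               FROM docs_b GROUP BY doc_id) bd ON bd.doc_id = d.id
    ORDER BY d.id
"""

# For doc 2 the second LEFT JOIN finds nothing, bdcount is NULL,
# and NULL plus anything is NULL.
without_coalesce = conn.execute(
    template.format(expr="adcount + bdcount")).fetchall()

# COALESCE turns the missing side into 0 before adding.
with_coalesce = conn.execute(
    template.format(expr="COALESCE(adcount, 0) + COALESCE(bdcount, 0)")).fetchall()

print(without_coalesce)  # doc 2 comes back as None
print(with_coalesce)     # doc 2 comes back as 1
```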
Performance may be a big problem, depending on sizes of each table. It appears to be an "inflate-deflate" situation since it first "inflates" the number of rows via JOIN, then "deflates" via GROUP BY. The formulation below avoids inflation-deflation.
But first, if I understand this subquery correctly, this
SELECT da.`doc_id`, COUNT(da.`doc_id`) AS adcount
FROM `docs_scod_a` da
INNER JOIN `scod_a` a ON a.`id` = da.`scod_a_id`
WHERE a.`ver_a` IN ('AA', 'AB')
GROUP BY da.`doc_id`
can be rewritten as
SELECT `doc_id`,
( SELECT COUNT(*)
FROM `scod_a`
WHERE `id` = da.`scod_a_id`
AND `ver_a` IN ('AA', 'AB')
) AS adcount
FROM `docs_scod_a` AS da
If that is correct, then the entire query becomes
SELECT d.id,
( SELECT COUNT(*)
FROM docs_scod_a ds
JOIN scod_a s ON s.id = ds.scod_a_id
WHERE ds.doc_id = d.id
AND s.ver_a IN ('AA', 'AB')
) +
( SELECT COUNT(*)
FROM docs_scod_b ds
JOIN scod_b s ON s.id = ds.scod_b_id
WHERE ds.doc_id = d.id
AND s.ver_b IN ('BA', 'BB')
)
FROM docs AS d
Which needs these indexes:
docs_scod_a: (doc_id, scod_a_id), (scod_a_id, doc_id)
docs_scod_b: (doc_id, scod_b_id), (scod_b_id, doc_id)
scod_a: (ver_a, id)
scod_b: (ver_b, id)
docs: -- presumably has PRIMARY KEY(id)
Note the lack of GROUP BY.
docs_scod_a smells like a many-to-many mapping table. I recommend you follow the tips here.
(No COALESCE is needed since COUNT will simply return zero.)
(I don't know whether my version is better (faster or whatever) than Gordon's, nor whether my indexes will help his formulation.)
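For anyone who wants to check the correlated-subquery form on a small data set, here is a sketch using Python's sqlite3 module. The table and column names follow the question; the rows are invented, and SQLite is only a stand-in for MySQL, so it says nothing about relative performance.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE docs (id INTEGER PRIMARY KEY);
    CREATE TABLE scod_a (id INTEGER PRIMARY KEY, ver_a TEXT);
    CREATE TABLE scod_b (id INTEGER PRIMARY KEY, ver_b TEXT);
    CREATE TABLE docs_scod_a (doc_id INTEGER, scod_a_id INTEGER);
    CREATE TABLE docs_scod_b (doc_id INTEGER, scod_b_id INTEGER);
    -- Invented rows: doc 1 matches two 'a' versions and one 'b' version,
    -- doc 2 only links to non-matching versions, doc 3 has no links.
    INSERT INTO docs VALUES (1), (2), (3);
    INSERT INTO scod_a VALUES (1, 'AA'), (2, 'AB'), (3, 'AC');
    INSERT INTO scod_b VALUES (1, 'BA'), (2, 'BC');
    INSERT INTO docs_scod_a VALUES (1, 1), (1, 2), (1, 3), (2, 3);
    INSERT INTO docs_scod_b VALUES (1, 1), (2, 2);
""")

# Each COUNT(*) runs once per docs row; a doc with no matching rows
# gets 0 from COUNT, so no COALESCE and no GROUP BY are needed.
rows = conn.execute("""
    SELECT d.id,
           ( SELECT COUNT(*)
               FROM docs_scod_a ds
               JOIN scod_a s ON s.id = ds.scod_a_id
              WHERE ds.doc_id = d.id
                AND s.ver_a IN ('AA', 'AB')
           ) +
           ( SELECT COUNT(*)
               FROM docs_scod_b ds
               JOIN scod_b s ON s.id = ds.scod_b_id
              WHERE ds.doc_id = d.id
                AND s.ver_b IN ('BA', 'BB')
           )
    FROM docs AS d
    ORDER BY d.id
""").fetchall()

print(rows)
```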
I have written an SQL statement that, besides all the other columns, should return the number of comments and the number of likes of a certain post. It works perfectly as long as I don't also try to get the number of times the post has been shared. When I add the share count, it returns a wrong number of likes, which seems to be some combination of the number of shares and likes. Here is the code:
SELECT
[...],
count(CS.commentId) as shares,
count(CL.commentId) as numberOfLikes
FROM
(SELECT *
FROM accountSpecifics
WHERE institutionId= '{$keyword['id']}') `AS`
INNER JOIN
account A ON A.id = `AS`.accountId
INNER JOIN
comment C ON C.accountId = A.id
LEFT JOIN
commentLikes CL ON C.commentId = CL.commentId
LEFT JOIN
commentShares CS ON C.commentId = CS.commentId
GROUP BY
C.time
ORDER BY
year, month, hour, month
Could you also tell me if you think this is an efficient SQL statement or if you would do it differently? thank you!
Do this instead:
SELECT
[...],
(select count(*) from commentShares CS where C.commentId = CS.commentId) as shares,
(select count(*) from commentLikes CL where C.commentId = CL.commentId) as numberOfLikes
FROM
(SELECT *
FROM accountSpecifics
WHERE institutionId= '{$keyword['id']}') `AS`
INNER JOIN account A ON A.id = `AS`.accountId
INNER JOIN comment C ON C.accountId = A.id
GROUP BY C.time
ORDER BY year, month, hour, month
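If you use JOINs, you get back one combined result set in which every like row is paired with every share row of the same comment. COUNT(some field) simply counts the rows where that field is not NULL, so both counts are computed over the same multiplied rows and come out the same, and in this case wrong. Subqueries are what you need here. Good luck!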
EDIT: as posted below, count(distinct something) can also work, but it's making the database do more work than necessary for the answer you want to end up with.
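Here is a small sketch of the difference, run with Python's sqlite3 module on invented data. The likeId and shareId columns are assumptions added for illustration (the question does not show the schema of the two tables); they matter because COUNT(DISTINCT ...) only gives the right answer when the counted column is unique per like or share row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE comment (commentId INTEGER PRIMARY KEY);
    -- likeId / shareId are hypothetical surrogate keys.
    CREATE TABLE commentLikes (likeId INTEGER PRIMARY KEY, commentId INTEGER);
    CREATE TABLE commentShares (shareId INTEGER PRIMARY KEY, commentId INTEGER);
    INSERT INTO comment VALUES (1);
    INSERT INTO commentLikes (commentId) VALUES (1), (1), (1);  -- 3 likes
    INSERT INTO commentShares (commentId) VALUES (1), (1);      -- 2 shares
""")

def counts(select_list):
    """Run the two-LEFT-JOIN query with the given SELECT list."""
    return conn.execute(f"""
        SELECT {select_list}
        FROM comment C
        LEFT JOIN commentLikes CL ON C.commentId = CL.commentId
        LEFT JOIN commentShares CS ON C.commentId = CS.commentId
        GROUP BY C.commentId
    """).fetchone()

# 3 likes x 2 shares = 6 joined rows, so both plain counts are 6.
naive = counts("COUNT(CL.commentId), COUNT(CS.commentId)")

# DISTINCT on a per-row key undoes the multiplication.
distinct_row_key = counts("COUNT(DISTINCT CL.likeId), COUNT(DISTINCT CS.shareId)")

# DISTINCT on the join column itself collapses to 1 per comment.
distinct_join_key = counts("COUNT(DISTINCT CL.commentId), COUNT(DISTINCT CS.commentId)")

# Correlated subqueries never see the multiplied rows at all.
subqueries = conn.execute("""
    SELECT (SELECT COUNT(*) FROM commentLikes CL WHERE CL.commentId = C.commentId),
           (SELECT COUNT(*) FROM commentShares CS WHERE CS.commentId = C.commentId)
    FROM comment C
""").fetchone()

print(naive, distinct_row_key, distinct_join_key, subqueries)
```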
Quick fix:
SELECT
[...],
count(DISTINCT CS.commentId) as shares,
count(DISTINCT CL.commentId) as numberOfLikes
Better approach:
SELECT [...]
, Coalesce(shares.numberOfShares, 0) As numberOfShares
, Coalesce(likes.numberOfLikes , 0) As numberOfLikes
FROM [...]
LEFT
JOIN (
SELECT commentId
, Count(*) As numberOfShares
FROM commentShares
GROUP
BY commentId
) As shares
ON shares.commentId = c.commentId
LEFT
JOIN (
SELECT commentId
, Count(*) As numberOfLikes
FROM commentLikes
GROUP
BY commentId
) As likes
ON likes.commentId = c.commentId
I am trying to link two tables that share a similar column. I need to find out how many values differ between table1.column1 and table2.column1:
My current query:
SELECT i10_descr.i10_code, gems_pcsi9.i10_code
FROM i10_descr INNER JOIN gems_pcsi9 ON i10_descr.i10_code = gems_pcsi9.i10_code
ORDER BY i10_descr.i10_code;
I know this query shows the matching codes of each table: I cannot figure out how to COUNT the missing/different codes in the tables.
Also, I have to compute the ratio of codes.
Any help, tips, or direction is much appreciated.
Thanks
You could use an anti-join pattern to get a list of i10_code that exist in one table, but not the other. For example:
SELECT i.i10_code
FROM i10_descr i
LEFT
JOIN gems_pcsi9 g
ON g.i10_code = i.i10_code
WHERE g.i10_code IS NULL
ORDER BY i.i10_code
If you just want a count, you could use COUNT(i.i10_code) and/or COUNT(DISTINCT i.i10_code) in the SELECT list and remove the ORDER BY clause.
To get the i10_code in the gems table that aren't in the i10 table, you'd do the same thing but invert the query so that gems is the "driving" table. e.g.
SELECT COUNT(DISTINCT g.i10_code) AS cnt_diff
FROM gems_pcsi9 g
LEFT
JOIN i10_descr i
ON i.i10_code = g.i10_code
WHERE i.i10_code IS NULL
If you want to combine the number of differences, you can combine the two queries by making them inline views:
SELECT d.cnt_diff + e.cnt_diff AS total_diff
FROM (
SELECT COUNT(DISTINCT g.i10_code) AS cnt_diff
FROM gems_pcsi9 g
LEFT
JOIN i10_descr i
ON i.i10_code = g.i10_code
WHERE i.i10_code IS NULL
) d
CROSS
JOIN (
SELECT COUNT(DISTINCT i.i10_code) AS cnt_diff
FROM i10_descr i
LEFT
JOIN gems_pcsi9 g
ON g.i10_code = i.i10_code
WHERE g.i10_code IS NULL
) e
NOTE: the COUNT aggregate will omit NULL values. The query would need to be tweaked if you also wanted to "count" rows that had NULL values for i10_code. You'd use COUNT(DISTINCT ) if you want just a number of distinct values that are different. A COUNT() would give a number of rows. These two results would be different if you had multiple rows with the same i10_code value.
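The difference between COUNT and COUNT(DISTINCT) in the anti-join can be sketched as follows, using Python's sqlite3 module. The code values are invented; the table and column names come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE i10_descr (i10_code TEXT);
    CREATE TABLE gems_pcsi9 (i10_code TEXT);
    -- Invented codes: 'D' appears twice in gems_pcsi9.
    INSERT INTO i10_descr VALUES ('A'), ('B'), ('C');
    INSERT INTO gems_pcsi9 VALUES ('B'), ('D'), ('D'), ('E');
""")

# Anti-join: keep gems rows for which the LEFT JOIN found no partner.
row_count, distinct_count = conn.execute("""
    SELECT COUNT(g.i10_code), COUNT(DISTINCT g.i10_code)
    FROM gems_pcsi9 g
    LEFT JOIN i10_descr i ON i.i10_code = g.i10_code
    WHERE i.i10_code IS NULL
""").fetchone()

# Same pattern with the tables swapped.
(missing_from_gems,) = conn.execute("""
    SELECT COUNT(DISTINCT i.i10_code)
    FROM i10_descr i
    LEFT JOIN gems_pcsi9 g ON g.i10_code = i.i10_code
    WHERE g.i10_code IS NULL
""").fetchone()

# Rows D, D, E are unmatched: three rows but only two distinct codes.
print(row_count, distinct_count, missing_from_gems)
```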
To get a "ratio" of codes, assuming that at this point the "differences" don't matter, you get a count of codes from each table. The queries that do that could be used as inline views:
SELECT d.cnt / e.cnt AS ratio_cnt_g_over_cnt_i
, d.cnt AS cnt_g
, e.cnt AS cnt_i
FROM (
SELECT COUNT(DISTINCT g.i10_code) AS cnt
FROM gems_pcsi9 g
) d
CROSS
JOIN (
SELECT COUNT(DISTINCT i.i10_code) AS cnt
FROM i10_descr i
) e
An alternative method is to use union all with aggregation:
select in_i10descr, in_gems_pcsi9, count(*) as numcodes
from (select code, max(in_i10descr) as in_i10descr, max(in_gems_pcsi9) as in_gems_pcsi9
from ((select i10_descr.i10_code as code, 1 as in_i10descr, 0 as in_gems_pcsi9
from i10_descr
) union all
(select gems_pcsi9.i10_code, 0, 1
from gems_pcsi9
)
) t
group by code
) c
group by in_i10descr, in_gems_pcsi9;
This will calculate counts of things in each table separately and in both tables.
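As a rough check of what the union all version returns, here is a sketch on invented codes using Python's sqlite3 module. One adjustment is an assumption about the stand-in engine: SQLite does not accept parentheses around the individual branches of a UNION ALL, so they are dropped here; the logic is otherwise the same.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE i10_descr (i10_code TEXT);
    CREATE TABLE gems_pcsi9 (i10_code TEXT);
    -- Invented codes: B is in both tables, D is duplicated in gems_pcsi9.
    INSERT INTO i10_descr VALUES ('A'), ('B'), ('C');
    INSERT INTO gems_pcsi9 VALUES ('B'), ('D'), ('D'), ('E');
""")

rows = conn.execute("""
    SELECT in_i10descr, in_gems_pcsi9, COUNT(*) AS numcodes
    FROM (SELECT code,
                 MAX(in_i10descr) AS in_i10descr,
                 MAX(in_gems_pcsi9) AS in_gems_pcsi9
          FROM (SELECT i10_code AS code, 1 AS in_i10descr, 0 AS in_gems_pcsi9
                FROM i10_descr
                UNION ALL
                SELECT i10_code, 0, 1
                FROM gems_pcsi9
               ) t
          GROUP BY code
         ) c
    GROUP BY in_i10descr, in_gems_pcsi9
""").fetchall()

# Key: (in_i10descr, in_gems_pcsi9) flags; value: number of distinct codes.
summary = {(in_i, in_g): n for in_i, in_g, n in rows}
print(summary)
```

Because the inner query groups by code first, duplicated codes such as 'D' are counted once, so the result is a count of distinct codes per combination of flags.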
I will try to explain things as much as I can.
I have following query to fetch records from different tables.
SELECT
p.p_name,
p.id,
cat.cat_name,
p.property_type,
p.p_type,
p.address,
c.client_name,
p.price,
GROUP_CONCAT(pr.price) AS c_price,
pd.land_area,
pd.land_area_rp,
p.tagline,
p.map_location,
r.id,
p.status,
co.country_name,
p.`show`,
u.name,
p.created_date,
p.updated_dt,
o.type_id,
p.furnished,
p.expiry_date
FROM
property p
LEFT OUTER JOIN region AS r
ON p.district_id = r.id
LEFT OUTER JOIN country AS co
ON p.country_id = co.country_id
LEFT OUTER JOIN property_category AS cat
ON p.cat_id = cat.id
LEFT OUTER JOIN property_area_details AS pd
ON p.id = pd.property_id
LEFT OUTER JOIN sc_clients AS c
ON p.client_id = c.client_id
LEFT OUTER JOIN admin AS u
ON p.adminid = u.id
LEFT OUTER JOIN sc_property_orientation_type AS o
ON p.orientation_type = o.type_id
LEFT OUTER JOIN property_amenities_details AS pad
ON p.id = pad.property_id
LEFT OUTER JOIN sc_commercial_property_price AS pr
ON p.id = pr.property_id
WHERE p.id > 0
AND (
p.created_date > DATE_SUB(NOW(), INTERVAL 1 YEAR)
OR p.updated_dt > DATE_SUB(NOW(), INTERVAL 1 YEAR)
)
AND p.p_type = 'sale'
Everything works fine if I exclude GROUP_CONCAT(pr.price) AS c_price, from the above query. But when I include it, the query returns just one result. My intention in using GROUP_CONCAT is to fetch the comma-separated prices from table sc_commercial_property_price that match the property id, in this case p.id. If records for the property exist in sc_commercial_property_price, they should be fetched in comma-separated form along with the other columns. If not, the column should be blank. What am I doing wrong here?
I will try to explain again if my problem is not clear. Thanks in advance
GROUP_CONCAT is an aggregation function. When you include it, you are telling SQL that there is an aggregation. Without a GROUP BY, only one row is returned, as in:
select count(*)
from table
The query that you have is acceptable syntax in MySQL when the ONLY_FULL_GROUP_BY mode is off (it is on by default since MySQL 5.7.5), but it is rejected by most other databases. The query does not automatically group by the columns with no functions. Instead, it returns an arbitrary value for each of them. You could imagine a function ANY, so your query is:
select any(p.p_name) as p_num, any(p.tagline) as tagline, . . .
To fix this, put all your current variables in a group by clause:
GROUP BY
p.p_name,
p.id,
cat.cat_name,
p.property_type,
p.p_type,
p.address,
c.client_name,
p.price,
pd.land_area,
pd.land_area_rp,
p.tagline,
p.map_location,
r.id,
p.status,
co.country_name,
p.`show`,
u.name,
p.created_date,
p.updated_dt,
o.type_id,
p.furnished,
p.expiry_date
Most people who write SQL think it is good form to include all the group by variables in the group by clause, even though MySQL does not necessarily require this.
Add GROUP BY clause enumerating whatever you intend to have separate rows for. What happens now is that it picks some value for each result column and group_concats every pr.price.
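The one-row behaviour is easy to reproduce on a cut-down version of the schema. The sketch below uses Python's sqlite3 module with invented rows and only two of the tables; SQLite is a stand-in that, like MySQL without ONLY_FULL_GROUP_BY, tolerates the ungrouped columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE property (id INTEGER PRIMARY KEY, p_name TEXT);
    CREATE TABLE sc_commercial_property_price (property_id INTEGER, price INTEGER);
    -- Invented rows: property 3 has no price records.
    INSERT INTO property VALUES (1, 'Shop'), (2, 'Office'), (3, 'Land');
    INSERT INTO sc_commercial_property_price VALUES (1, 100), (1, 120), (2, 300);
""")

# No GROUP BY: the whole join is one group, so a single row comes back
# with every price concatenated and an arbitrary property attached.
ungrouped = conn.execute("""
    SELECT p.id, p.p_name, GROUP_CONCAT(pr.price) AS c_price
    FROM property p
    LEFT OUTER JOIN sc_commercial_property_price AS pr
      ON p.id = pr.property_id
""").fetchall()

# GROUP BY the property columns: one row per property. COALESCE turns
# the NULL for a property without prices into a blank string.
grouped = conn.execute("""
    SELECT p.id, p.p_name, COALESCE(GROUP_CONCAT(pr.price), '') AS c_price
    FROM property p
    LEFT OUTER JOIN sc_commercial_property_price AS pr
      ON p.id = pr.property_id
    GROUP BY p.id, p.p_name
    ORDER BY p.id
""").fetchall()

print(len(ungrouped))  # a single row
for row in grouped:
    print(row)         # one row per property
```

The order of the values inside each concatenated string is not guaranteed unless the database is told how to order them.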
I'm having an odd problem with the following query. As it stands it works correctly:
the count part in it gets me the number of comments on a given 'hintout'.
I'm trying to add a similar count that gets the number of 'votes' for each hintout. Below is the query:
SELECT h.*
, h.permalink AS hintout_permalink
, hi.hinter_name
, hi.permalink
, hf.user_id AS followed_hid
, ci.city_id, ci.city_name, co.country_id, co.country_name, ht.thank_id
, COUNT(hc.comment_id) AS commentsCount
FROM hintouts AS h
INNER JOIN hinter_follows AS hf ON h.hinter_id = hf.hinter_id
INNER JOIN hinters AS hi ON h.hinter_id = hi.hinter_id
LEFT JOIN cities AS ci ON h.city_id = ci.city_id
LEFT JOIN countries as co ON h.country_id = co.country_id
LEFT JOIN hintout_thanks AS ht ON (h.hintout_id = ht.hintout_id
AND ht.thanker_user_id = 1)
LEFT JOIN hintout_comments AS hc ON hc.hintout_id = h.hintout_id
WHERE hf.user_id = 1
GROUP BY h.hintout_id
I tried to add the following to the select part:
COUNT(ht2.thanks_id) AS thanksCount
and the following on the join:
LEFT JOIN hintout_thanks AS ht2 ON h.hintout_id = ht2.hintout_id
but the weird thing, for which I could not find any answers or solutions,
is that the moment I add this additional part, the count for comments gets ruined (I get wrong and weird numbers), and I get the same number for the thanks.
I couldn't understand why or how to fix it, and I'm avoiding using nested queries,
so any help or pointers would be greatly appreciated!
ps: this might have been posted twice, but I can't find the previous post
When you add
LEFT JOIN hintout_thanks AS ht2 ON h.hintout_id = ht2.hintout_id
the number of rows increases: each row of hc is repeated once for every matching row of ht2, so it is counted more than once in COUNT(hc.comment_id).
You can replace
COUNT(hc.comment_id) -- counts duplicates
/* with */
COUNT(DISTINCT hc.comment_id) -- only counts unique ids
to count each id only once.
On values that are not unique, like co.country_name, COUNT(DISTINCT ...) will not work, because it only counts the distinct values (if all your results are in the USA, the count will be 1).
Quassnoi has solved the whole count problem by putting the counts in a sub-select, so that the extra rows caused by all those joins do not influence those counts.
SELECT h.*, h.permalink AS hintout_permalink, hi.hinter_name,
hi.permalink, hf.user_id AS followed_hid,
ci.city_id, ci.city_name, co.country_id, co.country_name,
ht.thank_id,
COALESCE(
(
SELECT COUNT(*)
FROM hintout_comments hci
WHERE hci.hintout_id = h.hintout_id
), 0) AS commentsCount,
COALESCE(
(
SELECT COUNT(*)
FROM hintout_thanks hti
WHERE hti.hintout_id = h.hintout_id
), 0) AS thanksCount
FROM hintouts AS h
JOIN hinter_follows AS hf
ON hf.hinter_id = h.hinter_id
JOIN hinters AS hi
ON hi.hinter_id = h.hinter_id
LEFT JOIN
cities AS ci
ON ci.city_id = h.city_id
LEFT JOIN
countries as co
ON co.country_id = h.country_id
LEFT JOIN
hintout_thanks AS ht
ON ht.hintout_id = h.hintout_id
AND ht.thanker_user_id=1
WHERE hf.user_id = 1