Get COUNT() AND COUNT(DISTINCT()) values across 2 tables

I am trying to get a COUNT(*) from a joined table "Table2" where the dates are common PLUS a COUNT(DISTINCT(CID)) from "Table1" and grouped by the common year-month:
Table1
cid | date
----|---------
321 | 2016-01
423 | 2016-01
324 | 2016-01
546 | 2015-12

Table2
id  | dateEnq
----|---------
3   | 2016-01
6   | 2016-01
24  | 2015-12
36  | 2015-12
MySQL query:
SELECT COUNT( DISTINCT (t1.cid) ) AS users,
SUBSTR(DATE(t1.date),1,7) AS month,
COUNT(t2.dateEnq) AS enquiries
FROM table1 t1
INNER JOIN table2 t2 ON SUBSTR(t2.dateEnq, 1, 7 ) = SUBSTR(t1.date, 1, 7 )
GROUP BY SUBSTR(DATE(t1.date),1,7)
This is the result I get, but the enquiries values are far too high. I think the query is not counting the rows from Table2 correctly; they should be something like 3, 10, 25 per row.
How do I get the monthly count from Table2?
users | month | enquiries|
------|----------|-----------
7237 | 2015-10 | 8374 |
12597 | 2015-11 | 30066 |
12980 | 2015-12 | 15514 |
11305 | 2016-01 | 128169 |

Your "counts" are inflated because it's a partial cross product. For a given month, every row from t1 is matched to every row for that same month from t2.
One option is to get the counts before you do the join. As an example, using inline views:
SELECT c1.users
, c1.month
, c2.enquiries
FROM ( SELECT COUNT(DISTINCT t1.cid) AS `users`
, SUBSTR(DATE(t1.date),1,7) AS `month`
FROM table1 t1
GROUP BY SUBSTR(DATE(t1.date),1,7)
) c1
JOIN ( SELECT COUNT(t2.dateEnq) AS `enquiries`
, SUBSTR(DATE(t2.dateEnq),1,7) AS `month`
FROM table2 t2
GROUP BY SUBSTR(DATE(t2.dateEnq),1,7)
) c2
ON c2.month = c1.month
ORDER BY c1.month
This isn't the only way (or even necessarily the best way) to get this result. There are other query patterns that will achieve equivalent results.
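To see the inflation concretely, here is a minimal sketch that loads the question's sample rows into an in-memory SQLite database from Python and compares the direct join with the pre-aggregated version. SQLite is only a stand-in for MySQL here (an assumption made so the example is easy to run); the sample dates are already in year-month form, so the DATE() wrapper is left out.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (an assumption of this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (cid INTEGER, date TEXT);
    CREATE TABLE table2 (id INTEGER, dateEnq TEXT);
    INSERT INTO table1 VALUES (321, '2016-01'), (423, '2016-01'),
                              (324, '2016-01'), (546, '2015-12');
    INSERT INTO table2 VALUES (3, '2016-01'), (6, '2016-01'),
                              (24, '2015-12'), (36, '2015-12');
""")

# Direct join: each table1 row pairs with every table2 row of the same month.
naive = conn.execute("""
    SELECT SUBSTR(t1.date, 1, 7)  AS month,
           COUNT(DISTINCT t1.cid) AS users,
           COUNT(t2.dateEnq)      AS enquiries
      FROM table1 t1
      JOIN table2 t2
        ON SUBSTR(t2.dateEnq, 1, 7) = SUBSTR(t1.date, 1, 7)
     GROUP BY SUBSTR(t1.date, 1, 7)
     ORDER BY month
""").fetchall()

# Aggregate each table on its own first, then join one row per month.
fixed = conn.execute("""
    SELECT c1.month, c1.users, c2.enquiries
      FROM ( SELECT SUBSTR(date, 1, 7) AS month, COUNT(DISTINCT cid) AS users
               FROM table1
              GROUP BY SUBSTR(date, 1, 7) ) c1
      JOIN ( SELECT SUBSTR(dateEnq, 1, 7) AS month, COUNT(dateEnq) AS enquiries
               FROM table2
              GROUP BY SUBSTR(dateEnq, 1, 7) ) c2
        ON c2.month = c1.month
     ORDER BY c1.month
""").fetchall()

print(naive)  # 2016-01 reports 6 enquiries: 3 users x 2 enquiries
print(fixed)  # 2016-01 reports the actual 2 enquiries
```

COUNT(DISTINCT ...) hides the duplication on the users side, which is why only the enquiries column looks wrong in the direct join.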
If there's a possibility to have zero enquiries for a given month, then to get that zero count returned, we can tweak that query so it's an outer join, and then replace any NULL value (for a missing row) with a zero in the outer query:
SELECT c1.users
, c1.month
, IFNULL(c2.enquiries,0) AS `enquiries`
FROM ( SELECT COUNT(DISTINCT t1.cid) AS `users`
, SUBSTR(DATE(t1.date),1,7) AS `month`
FROM table1 t1
GROUP BY SUBSTR(DATE(t1.date),1,7)
) c1
LEFT
JOIN ( SELECT COUNT(t2.dateEnq) AS `enquiries`
, SUBSTR(DATE(t2.dateEnq),1,7) AS `month`
FROM table2 t2
GROUP BY SUBSTR(DATE(t2.dateEnq),1,7)
) c2
ON c2.month = c1.month
ORDER BY c1.month
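As a quick check of the outer-join variant, the following sketch again uses an in-memory SQLite database from Python as a stand-in for MySQL (an assumption). The 2016-02 row with cid 700 is made up for illustration; it gives a month that has a user but no enquiries.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (cid INTEGER, date TEXT);
    CREATE TABLE table2 (id INTEGER, dateEnq TEXT);
    -- 2016-02 has a user but no enquiries (hypothetical row for illustration).
    INSERT INTO table1 VALUES (321, '2016-01'), (700, '2016-02');
    INSERT INTO table2 VALUES (3, '2016-01'), (6, '2016-01');
""")

rows = conn.execute("""
    SELECT c1.month,
           c1.users,
           IFNULL(c2.enquiries, 0) AS enquiries  -- NULL from the missing row becomes 0
      FROM ( SELECT SUBSTR(date, 1, 7) AS month, COUNT(DISTINCT cid) AS users
               FROM table1
              GROUP BY SUBSTR(date, 1, 7) ) c1
      LEFT JOIN
           ( SELECT SUBSTR(dateEnq, 1, 7) AS month, COUNT(dateEnq) AS enquiries
               FROM table2
              GROUP BY SUBSTR(dateEnq, 1, 7) ) c2
        ON c2.month = c1.month
     ORDER BY c1.month
""").fetchall()

print(rows)  # the 2016-02 row is kept, with an enquiries count of 0
```

With an inner join the 2016-02 row would disappear from the result entirely.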

Related

How to Select First Date, Previous Date, Latest Date where first date is higher than a reference date

I want to SELECT the latest date, the second-latest date and the first date from table1, considering only dates later than a reference date found in table2. That reference date should itself be the latest one in table2. I have some attempted solutions, but they return no output when there is only 1 matching record in table1. Example of the tables:
table1
Reg ID | DateOfAI | byTechnician
2GP001 | 2015-01-13 | 31
2GP001 | 2015-02-18 | 31
2GP001 | 2017-11-10 | 45
2GP001 | 2017-11-30 | 32
2GP044 | 2017-11-30 | 28
2GP001 | 2017-12-23 | 32
table2
Reg ID | DateOfCalving | DryOffDate
2GP001 | 2016-01-14 |
2GP070 | 2016-01-14 |
2GP065 | 2017-04-08 |
2GP001 | 2017-04-12 |
my expected output would be:
Reg ID | LatestDateOfCalving | 1stDateOfAI | PreviousAIDate | LatestAIDate
2GP001 | 2017-04-12 | 2017-11-10 | 2017-11-30 | 2017-12-23
I have searched everywhere, to the moon and back, and still no luck. These are the queries that I have used.
The first:
SELECT b.actualDam,COUNT(x.actualDam) AS ilanba, max(b.breedDate) AS huli, max(x.breedDate) AS nex,MIN(x.breedDate) AS una,IFNULL(c.calvingDate,NULL) AS nganak,r.*,h.herdID,a.animalID,a.regID, IFNULL(a.dateOfBirth,NULL) AS buho
FROM x_animal_breeding_rec b
LEFT JOIN x_animal_calving_rec c ON b.recID=c.brecID
LEFT JOIN x_herd_animal_rel r ON b.actualDam=r.animal
LEFT JOIN x_herd h ON r.herd=h.herdID
LEFT JOIN x_animal_main_info a ON b.actualDam=a.animalID
JOIN x_animal_breeding_rec x ON b.actualDam = x.actualDam AND x.breedDate < b.breedDate
WHERE h.herdID = ? AND x.mateType = ? AND x.recFlag = ? GROUP BY b.actualDam
and the Second one that I've tried is this code:
SELECT b.recID
, b.actualDam
, b.breedDate
, min(b.breedDate) AS una
, max(b.breedDate) AS huli
, COUNT(b.actualDam) AS sundot
, b.mateType
, b.recFlag
, a.animalID
, a.regID
, h.*
FROM
( SELECT c.recID, c.actualDam
, c.breedDate
, c.mateType
, c.recFlag
, CASE WHEN @prev=c.recID THEN @i:=@i+1 ELSE @i:=1 END i
, @prev:=c.recID prev
FROM x_animal_breeding_rec c
, ( SELECT @prev:=null,@i:=0 ) vars
ORDER BY c.recID,c.breedDate DESC
) b
LEFT JOIN x_animal_main_info a ON b.actualDam=a.animalID
LEFT JOIN x_herd_animal_rel h ON b.actualDam=h.animal
WHERE i <= 2 GROUP BY b.actualDam HAVING h.herd = ? AND b.mateType = ? AND b.recFlag = ? ORDER BY b.breedDate DESC
Another problem is that the first solution returns a wrong count. The second solution returns a correct count, but the wrong dates. I hope you can give me an idea. Thanks in advance.

The following query answers your question:
SELECT
RegID,
LatestDateOfCalving,
MIN(DateOfAI) AS 1stDateOfAI,
REPLACE(SUBSTRING_INDEX(GROUP_CONCAT(DateOfAI ORDER BY DateOfAI DESC), ',', 2), CONCAT(MAX(DateOfAI), ','), '') AS PreviousAIDate,
MAX(DateOfAI) AS LatestAIDate
FROM (
SELECT
t1.RegID,
LatestDateOfCalving,
DateOfAI,
IF(DateOfAI >= LatestDateOfCalving, 1, 0) AS dates
FROM table1 AS t1
INNER JOIN (
SELECT
RegID,
MAX(DateOfCalving) AS LatestDateOfCalving
FROM table2 GROUP BY RegID
) AS tt2 ON t1.RegID = tt2.RegID) AS x
WHERE dates = 1
GROUP BY RegID
HAVING COUNT(dates) >= 3;
Output:
+--------+---------------------+-------------+----------------+--------------+
| RegID | LatestDateOfCalving | 1stDateOfAI | PreviousAIDate | LatestAIDate |
+--------+---------------------+-------------+----------------+--------------+
| 2GP001 | 2017-04-12 | 2017-11-10 | 2017-11-30 | 2017-12-23 |
+--------+---------------------+-------------+----------------+--------------+
In the subquery we select RegID and LatestDateOfCalving from table2 to get a reference date per animal. We then join it to table1 and flag each record according to whether DateOfAI is greater than or equal to LatestDateOfCalving (IF(DateOfAI >= LatestDateOfCalving, 1, 0)).
The outer query (SELECT RegID, LatestDateOfCalving, MIN(DateOfAI) AS 1stDateOfAI, MAX(DateOfAI) AS LatestAIDate, ...) keeps only the records where DateOfAI is on or after LatestDateOfCalving (WHERE dates = 1, 1 being the flag value when the condition is true) and only the groups that have at least 3 such records (HAVING COUNT(dates) >= 3).
To extract PreviousAIDate, the outer query uses the REPLACE(SUBSTRING_INDEX(GROUP_CONCAT(...))) construct: GROUP_CONCAT builds a comma-separated list of dates in descending order, SUBSTRING_INDEX keeps the first two entries, and REPLACE removes the latest date together with its trailing comma.
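The string manipulation is easier to follow step by step. The sketch below mimics the three MySQL functions with plain Python string operations on the dates from the question; `previous_date` is a hypothetical helper name, not part of the query.

```python
# AI dates on or after the latest calving date for 2GP001, taken from the question.
dates = ["2017-11-10", "2017-11-30", "2017-12-23"]


def previous_date(dates):
    """Mimic REPLACE(SUBSTRING_INDEX(GROUP_CONCAT(...), ',', 2), CONCAT(MAX(...), ','), '')."""
    latest = max(dates)                                    # MAX(DateOfAI)
    concatenated = ",".join(sorted(dates, reverse=True))   # GROUP_CONCAT(DateOfAI ORDER BY DateOfAI DESC)
    first_two = ",".join(concatenated.split(",")[:2])      # SUBSTRING_INDEX(..., ',', 2)
    return first_two.replace(latest + ",", "")             # REPLACE(..., CONCAT(MAX(DateOfAI), ','), '')


print(previous_date(dates))           # second-latest date
print(previous_date(["2017-12-23"]))  # a single date comes back unchanged
```

The second call shows what the expression would do if the HAVING COUNT(dates) >= 3 filter were relaxed: with only one qualifying record there is no comma to strip, so PreviousAIDate would simply repeat the latest date. Note also that in MySQL the GROUP_CONCAT result is truncated at group_concat_max_len (1024 by default), which does not matter here because only the first two entries are used.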

Sum columns from two tables in sql

I have two tables, one is the cost table and the other is the payment table, the cost table contains the cost of product with the product name.
Cost Table
id | cost | name
1 | 100 | A
2 | 200 | B
3 | 200 | A
Payment Table
pid | amount | costID
1 | 10 | 1
2 | 20 | 1
3 | 30 | 2
4 | 50 | 1
Now I have to sum the total cost by name, and also sum the total amount of payments by costID, like the table below
totalTable
name | sum(cost) | sum(amount) |
A | 300 | 80 |
B | 200 | 30 |
However I have been working my way around this using the query below but I think I am doing it very wrong.
SELECT
b.name,
b.sum(cost),
a.sum(amount)
FROM
`Payment Table` a
LEFT JOIN
`Cost Table` b
ON
b.id=a.costID
GROUP by b.name,a.costID
I would be grateful if somebody would help me with my queries or better still an idea as to how to go about it. Thank you

This should work:
select t2.name, sum(t2.cost), coalesce(sum(t1.amount), 0) as amount
from (
select id, name, sum(cost) as cost
from `Cost`
group by id, name
) t2
left join (
select costID, sum(amount) as amount
from `Payment`
group by CostID
) t1 on t2.id = t1.costID
group by t2.name

You need to do each calculation in a separate query and then join the results together.
The first one is straightforward.
For the second one, you need to get the name associated with each payment via its costID.
SELECT C.`name`, C.`sum_cost`, COALESCE(P.`sum_amount`,0 ) as `sum_amount`
FROM (
SELECT `name`, SUM(`cost`) as `sum_cost`
FROM `Cost`
GROUP BY `name`
) C
LEFT JOIN (
SELECT `Cost`.`name`, SUM(`Payment`.`amount`) as `sum_amount`
FROM `Payment`
JOIN `Cost`
ON `Payment`.`costID` = `Cost`.`id`
GROUP BY `Cost`.`name`
) P
ON C.`name` = P.`name`
OUTPUT
| name | sum_cost | sum_amount |
|------|----------|------------|
| A | 300 | 80 |
| B | 200 | 30 |

A couple of issues. For one thing, the column references should be qualified, not the aggregate functions.
This is invalid:
table_alias.SUM(column_name)
Should be:
SUM(table_alias.column_name)
This query should return the first two columns you are looking for:
SELECT c.name AS `name`
, SUM(c.cost) AS `sum(cost)`
FROM `Cost Table` c
GROUP BY c.name
ORDER BY c.name
When you introduce a join to another table, like Payment Table, where costid is not UNIQUE, you have the potential to produce a (partial) Cartesian product.
To see what that looks like, to see what's happening, remove the GROUP BY and the aggregate SUM() functions, and take a look at the detail rows returned by a query with the join operation.
SELECT c.id AS `c.id`
, c.cost AS `c.cost`
, c.name AS `c.name`
, p.pid AS `p.pid`
, p.amount AS `p.amount`
, p.costid AS `p.costid`
FROM `Cost Table` c
LEFT
JOIN `Payment Table` p
ON p.costid = c.id
ORDER BY c.id, p.pid
That's going to return:
c.id | c.cost | c.name | p.pid | p.amount | p.costid
1 | 100 | A | 1 | 10 | 1
1 | 100 | A | 2 | 20 | 1
1 | 100 | A | 4 | 50 | 1
2 | 200 | B | 3 | 30 | 2
3 | 200 | A | NULL | NULL | NULL
Notice that we are getting three copies of the id=1 row from Cost Table.
So, if we modified that query, adding a GROUP BY c.name, and wrapping c.cost in a SUM() aggregate, we're going to get an inflated value for total cost.
To avoid that, we can aggregate the amount from the Payment Table, so we get only one row for each costid. Then when we do the join operation, we won't be producing duplicate copies of rows from Cost.
Here's a query to aggregate the total amount from the Payment Table, so we get a single row for each costid.
SELECT p.costid
, SUM(p.amount) AS tot_amount
FROM `Payment Table` p
GROUP BY p.costid
ORDER BY p.costid
That would return:
costid | tot_amount
1 | 80
2 | 30
We can use the results from that query as if it were a table, by making that query an "inline view". In this example, we assign an alias of v to the query results. (In the MySQL vernacular, an "inline view" is called a "derived table".)
SELECT c.name AS `name`
, SUM(c.cost) AS `sum_cost`
, IFNULL(SUM(v.tot_amount),0) AS `sum_amount`
FROM `Cost Table` c
LEFT
JOIN ( -- inline view to return total amount by costid
SELECT p.costid
, SUM(p.amount) AS tot_amount
FROM `Payment Table` p
GROUP BY p.costid
ORDER BY p.costid
) v
ON v.costid = c.id
GROUP BY c.name
ORDER BY c.name
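For anyone who wants to confirm the numbers, here is a small sketch that runs both the direct join and the inline-view version against the sample data. It uses an in-memory SQLite database from Python as a stand-in for MySQL, and the tables are named cost and payment rather than `Cost Table` and `Payment Table`; both choices are assumptions made to keep the example short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE cost (id INTEGER, cost INTEGER, name TEXT);
    CREATE TABLE payment (pid INTEGER, amount INTEGER, costid INTEGER);
    INSERT INTO cost VALUES (1, 100, 'A'), (2, 200, 'B'), (3, 200, 'A');
    INSERT INTO payment VALUES (1, 10, 1), (2, 20, 1), (3, 30, 2), (4, 50, 1);
""")

# Direct join: the id=1 cost row is repeated once per matching payment.
naive = conn.execute("""
    SELECT c.name, SUM(c.cost) AS sum_cost, IFNULL(SUM(p.amount), 0) AS sum_amount
      FROM cost c
      LEFT JOIN payment p ON p.costid = c.id
     GROUP BY c.name
     ORDER BY c.name
""").fetchall()

# Inline view: payments are collapsed to one row per costid before the join.
fixed = conn.execute("""
    SELECT c.name, SUM(c.cost) AS sum_cost, IFNULL(SUM(v.tot_amount), 0) AS sum_amount
      FROM cost c
      LEFT JOIN ( SELECT p.costid, SUM(p.amount) AS tot_amount
                    FROM payment p
                   GROUP BY p.costid ) v
        ON v.costid = c.id
     GROUP BY c.name
     ORDER BY c.name
""").fetchall()

print(naive)  # cost for A is inflated to 500 (100 counted three times, plus 200)
print(fixed)  # cost for A is the expected 300
```

The payment totals agree in both versions because each payment row appears only once in the join; only the cost side is duplicated.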

SQL query, SELECT not in more than 2 records on other table join

So I have these two tables:
1. private_information
| account_no | name | adrress |
-----------------------------------
| 123 | andrew | alberque |
| 234 | melissa| california|
| 456 | matthew| newark |
and then the 2nd table is transaction:
| account_no | transaction_num |
----------------------------------
| 123 | 989890808 |
| 123 | 234247827 |
| 123 | 123621472 |
| 123 | 457465745 |
| 234 | 435446545 |
So I want to make this select condition:
SELECT *
FROM private_information a
JOIN transaction b ON a.account_no=b.account_no
WHERE ( <= 2 records in transaction table)
An account should be shown only if it has no more than 2 records in the transaction table, so account_no = 123 should not show.

Add a join with a subquery that counts the number of transactions for each account.
SELECT p.*, t1.*
FROM private_information AS p
JOIN transaction AS t1 ON p.account_no = t1.account_no
JOIN (SELECT account_no
FROM transaction
GROUP BY account_no
HAVING COUNT(*) <= 2) AS t2 ON p.account_no = t2.account_no
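A minimal sketch of this pattern on the sample data, run through an in-memory SQLite database from Python (a stand-in for MySQL, as an assumption). The transaction table is named transactions here to sidestep keyword handling in SQLite. The second query is an optional variation, not part of the answer above: it lists accounts rather than transactions and uses a LEFT JOIN so that accounts with zero transactions also qualify.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE private_information (account_no INTEGER, name TEXT, adrress TEXT);
    CREATE TABLE transactions (account_no INTEGER, transaction_num INTEGER);
    INSERT INTO private_information VALUES
        (123, 'andrew', 'alberque'), (234, 'melissa', 'california'), (456, 'matthew', 'newark');
    INSERT INTO transactions VALUES
        (123, 989890808), (123, 234247827), (123, 123621472),
        (123, 457465745), (234, 435446545);
""")

# Join against the set of accounts that have at most 2 transactions.
with_rows = conn.execute("""
    SELECT p.account_no, p.name, t1.transaction_num
      FROM private_information AS p
      JOIN transactions AS t1 ON p.account_no = t1.account_no
      JOIN ( SELECT account_no
               FROM transactions
              GROUP BY account_no
             HAVING COUNT(*) <= 2 ) AS t2
        ON p.account_no = t2.account_no
""").fetchall()

# Variation: one row per account, keeping accounts with no transactions at all.
including_zero = conn.execute("""
    SELECT p.account_no, p.name, COUNT(t.account_no) AS n
      FROM private_information AS p
      LEFT JOIN transactions AS t ON t.account_no = p.account_no
     GROUP BY p.account_no, p.name
    HAVING COUNT(t.account_no) <= 2
     ORDER BY p.account_no
""").fetchall()

print(with_rows)       # only melissa's single transaction
print(including_zero)  # melissa (1 transaction) and matthew (0 transactions)
```

Whether account 456, which has no transactions, should appear depends on how "not more than 2 records" is meant; the inner joins drop it, the LEFT JOIN variation keeps it.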

Here we go
SELECT *
FROM private_information a JOIN
( SELECT count(account_no) as counted,
account_no
FROM transaction
GROUP BY account_no
HAVING counted <= 2 ) b
ON a.account_no = b.account_no

select AA.account_no, AA.name, AA.adrress, CC.transaction_num from
(
select account_no, name, adrress from private_information
) AA inner join
( select count(*) as cc, account_no from transaction group by account_no having cc <= 2
) BB ON BB.account_no = AA.account_no inner join
(
select account_no, transaction_num from transaction
) CC ON CC.account_no = AA.account_no

MySQL count rows within the same intervals to eachother

I have a table where one column is the date:
+----------+---------------------+
| id | date |
+----------+---------------------+
| 5 | 2012-12-10 10:12:37 |
+----------+---------------------+
| 4 | 2012-12-10 09:09:55 |
+----------+---------------------+
| 3 | 2012-12-09 21:12:35 |
+----------+---------------------+
| 2 | 2012-12-09 20:15:07 |
+----------+---------------------+
| 1 | 2012-12-09 20:01:42 |
+----------+---------------------+
What I need is to count the rows that are, for example, within 3 hours of each other. In this example I want to group the top row with the 2nd row, and the 3rd row with the 4th and 5th rows. So my output should be like this:
+----------+---------------------+---------+
| id | date | count |
+----------+---------------------+---------+
| 5 | 2012-12-10 10:12:37 | 2 |
+----------+---------------------+---------+
| 3 | 2012-12-09 21:12:35 | 3 |
+----------+---------------------+---------+
How could I do this?

I think you need a self-join for this:
select t.id, t.date, COUNT(t2.id)
from t left outer join
t t2
on t.date between t2.date - interval 3 hour and t2.date + interval 3 hour
group by t.id, t.date
(This is untested code so it might have a syntax error.)
If you are trying to divide everything into 3-hour intervals, you can do something like:
select max(t.date), t.id, count(*)
from (select t.*,
-- INTERVAL is a reserved word in MySQL, so the bucket column uses a different alias
(date(date)*100 + floor(hour(date)/3)*3) as bucket
from t
) t
group by bucket
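The two readings of the question give different groupings, which a short Python sketch on the sample rows makes visible. The first part reproduces the fixed 3-hour buckets (date plus hour rounded down to a multiple of 3). The second part is one possible interpretation of "within 3 hours of each other", namely chaining rows while the gap to the previous row is at most 3 hours; that interpretation and the helper name `chain_groups` are assumptions, not something stated in the answer.

```python
from collections import Counter
from datetime import datetime, timedelta

rows = [
    (5, "2012-12-10 10:12:37"),
    (4, "2012-12-10 09:09:55"),
    (3, "2012-12-09 21:12:35"),
    (2, "2012-12-09 20:15:07"),
    (1, "2012-12-09 20:01:42"),
]
parsed = [(row_id, datetime.strptime(text, "%Y-%m-%d %H:%M:%S")) for row_id, text in rows]

# Fixed buckets: same idea as date(date)*100 + floor(hour(date)/3)*3.
buckets = Counter((when.date(), when.hour // 3 * 3) for _, when in parsed)


def chain_groups(parsed, max_gap=timedelta(hours=3)):
    """Group rows in time order, starting a new group when the gap exceeds max_gap."""
    groups = []
    for row_id, when in sorted(parsed, key=lambda r: r[1]):
        if groups and when - groups[-1][-1][1] <= max_gap:
            groups[-1].append((row_id, when))
        else:
            groups.append([(row_id, when)])
    # Report the latest row of each group and the group size, newest group first.
    return [(g[-1][0], g[-1][1], len(g)) for g in reversed(groups)]


chained = chain_groups(parsed)

print(sorted(buckets.values()))             # fixed buckets split the evening rows 2 + 1
print([(i, str(d), n) for i, d, n in chained])  # gap-based chaining gives groups of 2 and 3
```

On this data the fixed buckets put the 21:12 row in a different bucket from the two 20:xx rows, so they do not reproduce the expected output, while the gap-based chaining returns ids 5 and 3 with counts 2 and 3.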

I am not sure how to do this in MySQL, but I was able to build a set of queries in SQL Server 2005 that produce the intended results. Here is the working sample. It may be overly complex, but that is how I was able to get the desired result:
WITH BaseData AS
(
SELECT 5 AS ID, '2012-12-10 10:12:37' AS Date
UNION ALL
SELECT 4 AS ID, '2012-12-10 09:09:55' AS Date
UNION ALL
SELECT 3 AS ID, '2012-12-09 21:12:35' AS Date
UNION ALL
SELECT 2 AS ID, '2012-12-09 20:15:07' AS Date
UNION ALL
SELECT 1 AS ID, '2012-12-09 20:01:42' AS Date
),
BaseDataWithRowNum AS
(
SELECT ID,DATE, ROW_NUMBER() OVER (ORDER BY Date DESC) AS RowNum
FROM BaseData
),
InterRelatedDates AS
(
SELECT B1.RowNum AS RowNum1,B2.RowNum AS RowNum2
FROM BaseDataWithRowNum B1
INNER JOIN BaseDataWithRowNum B2
ON B1.Date BETWEEN B2.Date AND DATEADD(hh,3,B2.Date)
AND B1.RowNum < B2.RowNum
AND B1.ID != B2.ID
),
InterRelatedDatesWithinMultipleGroups AS
(
SELECT G1.RowNum1,G2.RowNum2
FROM InterRelatedDates G1
LEFT JOIN InterRelatedDates G2
ON G1.RowNum2 = G2.RowNum2
AND G1.RowNum1 != G2.RowNum1
)
SELECT BN.ID,
BN.Date,
CountExcludingOriginalGrouppingRecord +1 AS C
FROM
(
SELECT RowNum1 AS RowNum,COUNT(1) AS CountExcludingOriginalGrouppingRecord
FROM
(
-- If a row was used in only one group then it is ok. use as it is
SELECT D1.RowNum1
FROM InterRelatedDatesWithinMultipleGroups AS D1
WHERE D1.RowNum2 IS NULL
UNION ALL
-- In case a row was selected in two groups, choose the one with higher date
SELECT Min(D1.RowNum1)
FROM InterRelatedDatesWithinMultipleGroups AS D1
WHERE D1.RowNum2 IS NOT NULL
GROUP BY D1.RowNum2
) T
GROUP BY RowNum1
) T2
INNER JOIN BaseDataWithRowNum BN
ON BN.RowNum = T2.RowNum

SUM a pair of COUNTs from two tables based on a time variable

Been searching for an answer to this for the better part of an hour without much luck. I have two regional tables laid out with the same column names and I can put out a result list for either table based on the following query (swap Table2 for Table1):
SELECT Table1.YEAR, FORMAT(COUNT(Table1.id),0) AS Total
FROM Table1
WHERE Table1.variable='Y'
GROUP BY Table1.YEAR
Ideally I'd like to get a result that gives me a total sum of the counts by year, so instead of:
| REGION 1 | | REGION 2 |
| YEAR | Total | | YEAR | Total |
| 2010 | 5 | | 2010 | 1 |
| 2009 | 2 | | 2009 | 3 |
| | | | 2008 | 4 |
I'd have:
| MERGED |
| YEAR | Total |
| 2010 | 6 |
| 2009 | 5 |
| 2008 | 4 |
I've tried a variety of JOINs and other ideas but I think I'm caught up on the SUM and COUNT issue. Any help would be appreciated, thanks!

SELECT `YEAR`, FORMAT(SUM(`count`), 0) AS `Total`
FROM (
SELECT `Table1`.`YEAR`, COUNT(*) AS `count`
FROM `Table1`
WHERE `Table1`.`variable` = 'Y'
GROUP BY `Table1`.`YEAR`
UNION ALL
SELECT `Table2`.`YEAR`, COUNT(*) AS `count`
FROM `Table2`
WHERE `Table2`.`variable` = 'Y'
GROUP BY `Table2`.`YEAR`
) AS `union`
GROUP BY `YEAR`

You should use a UNION ALL:
SELECT
t.YEAR,
COUNT(*) as TOTAL
FROM (
SELECT *
FROM Table1
UNION ALL
SELECT *
FROM Table2
) t
WHERE t.variable='Y'
GROUP BY t.YEAR;

Select year, sum(counts) from (
SELECT Table1.YEAR, FORMAT(COUNT(Table1.id),0) AS Total
FROM Table1
WHERE Table1.variable='Y'
GROUP BY Table1.YEAR
UNION ALL
SELECT Table2.YEAR, FORMAT(COUNT(Table2.id),0) AS Total
FROM Table2
WHERE Table2.variable='Y'
GROUP BY Table2.YEAR ) GROUP BY year

To improve upon Shehzad's answer:
SELECT YEAR, FORMAT(SUM(counts),0) AS total FROM (
SELECT Table1.YEAR, COUNT(Table1.id) AS counts
FROM Table1
WHERE Table1.variable='Y'
GROUP BY Table1.YEAR
UNION ALL
SELECT Table2.YEAR, COUNT(Table2.id) AS counts
FROM Table2
WHERE Table2.variable='Y'
GROUP BY Table2.YEAR ) AS newTable GROUP BY YEAR
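A small sketch to check this pattern against the numbers in the question, using an in-memory SQLite database from Python as a stand-in for MySQL (an assumption). The rows are generated to match the per-region counts shown above, plus one hypothetical variable='N' row that should be ignored. FORMAT() is left out of the sketch because it is a MySQL function; in MySQL it returns a string, which is why it belongs on the outer SUM rather than on the inner counts.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Table1 (id INTEGER PRIMARY KEY, YEAR INTEGER, variable TEXT)")
conn.execute("CREATE TABLE Table2 (id INTEGER PRIMARY KEY, YEAR INTEGER, variable TEXT)")

# Region 1: 2010 x5, 2009 x2. Region 2: 2010 x1, 2009 x3, 2008 x4.
conn.executemany("INSERT INTO Table1 (YEAR, variable) VALUES (?, 'Y')",
                 [(2010,)] * 5 + [(2009,)] * 2)
conn.executemany("INSERT INTO Table2 (YEAR, variable) VALUES (?, 'Y')",
                 [(2010,)] * 1 + [(2009,)] * 3 + [(2008,)] * 4)
# Hypothetical row that the WHERE clause should filter out.
conn.execute("INSERT INTO Table1 (YEAR, variable) VALUES (2008, 'N')")

merged = conn.execute("""
    SELECT YEAR, SUM(counts) AS total
      FROM ( SELECT YEAR, COUNT(id) AS counts
               FROM Table1
              WHERE variable = 'Y'
              GROUP BY YEAR
             UNION ALL
             SELECT YEAR, COUNT(id) AS counts
               FROM Table2
              WHERE variable = 'Y'
              GROUP BY YEAR ) AS newTable
     GROUP BY YEAR
     ORDER BY YEAR DESC
""").fetchall()

print(merged)  # one row per year with the two regional counts added together
```

UNION ALL matters here: a plain UNION would remove duplicate (YEAR, counts) rows, so two regions with the same count for the same year would be counted only once.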
