mysql multiple left join and group by main table - mysql

I have the following scenario. An area has multiple territories, a territory has multiple addresses, and an address is visited multiple times in a month. I want to generate a monthly report per area (how many times an area has been visited). I have written the query, but the result set contains fewer areas than expected because some addresses were not visited. I have the following structure:
tables
areas: id|name (180 rows) //name is unique
territories: id|name|area_id (1k rows)
addresses: id|name|territory_id (80k rows)
visiting_addresses: id|address_id|date|status (1M+ rows) //status => 1 = visited, 2 = pending
My query is following.
select ar.id as area_id, ar.name as area,
sum(case when va.status = 1 then 1 else 0 end) as visited,
sum(case when va.status = 2 then 1 else 0 end) as pending,
count(va.id) as total
from areas ar
left join territories t on t.area_id=ar.id
left join addresses a on a.territory_id=t.id
left join visiting_addresses va on va.address_id=a.id
where month(va.date) = '01'
and year(va.date)='2020'
group by ar.id
The areas table contains 180 areas, but the result set shows only 144. Where is my mistake, and what is the explanation? The missing areas are the ones that have no visits.

Your WHERE clause effectively converts the LEFT JOIN with visiting_addresses into an INNER JOIN: for areas without a matching visit, va.date is NULL, so the WHERE conditions reject those rows. And since it is the rightmost table in a chain of LEFT JOINs, all the joins end up behaving like INNER JOINs. To prevent that, you should move the corresponding conditions from the WHERE clause to the ON clause:
select ar.id as area_id, ar.name as area,
sum(case when va.status = 1 then 1 else 0 end) as visited,
sum(case when va.status = 2 then 1 else 0 end) as pending,
count(va.id) as total
from areas ar
left join territories t on t.area_id=ar.id
left join addresses a on a.territory_id=t.id
left join visiting_addresses va
on va.address_id=a.id
and month(va.date) = '01'
and year(va.date)='2020'
group by ar.id
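The difference between filtering in WHERE and filtering in ON can be reproduced on a tiny data set. The sketch below is only an illustration: it uses an in-memory SQLite database rather than MySQL, and a simplified, hypothetical two-table schema (areas and visits) instead of the four tables from the question.

```python
import sqlite3

# Hypothetical two-table miniature of the schema; SQLite stands in for MySQL.
SETUP = """
CREATE TABLE areas (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE visits (id INTEGER PRIMARY KEY, area_id INTEGER, date TEXT);
INSERT INTO areas VALUES (1, 'North'), (2, 'South'), (3, 'East');
-- North was visited in January 2020, South only in December 2019, East never.
INSERT INTO visits VALUES (1, 1, '2020-01-15'), (2, 2, '2019-12-30');
"""


def january_counts(filter_in_on):
    """Count January 2020 visits per area, filtering in ON or in WHERE."""
    keyword = "AND" if filter_in_on else "WHERE"
    conn = sqlite3.connect(":memory:")
    conn.executescript(SETUP)
    rows = conn.execute(f"""
        SELECT ar.name, COUNT(v.id)
        FROM areas ar
        LEFT JOIN visits v ON v.area_id = ar.id
        {keyword} v.date >= '2020-01-01' AND v.date < '2020-02-01'
        GROUP BY ar.id
        ORDER BY ar.id
    """).fetchall()
    conn.close()
    return rows


print(january_counts(filter_in_on=False))  # only areas with January visits
print(january_counts(filter_in_on=True))   # every area, zeros included
```

With the filter in WHERE, South and East disappear from the result; with the filter in ON, they are kept with a count of zero.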
But since you have a lot of rows, I would rather run two queries. First, get only the areas with addresses from the last month, using inner joins. You should, though, change your conditions on va.date so that they can use an index:
select ar.id as area_id, ar.name as area,
sum(case when va.status = 1 then 1 else 0 end) as visited,
sum(case when va.status = 2 then 1 else 0 end) as pending,
count(va.id) as total
from areas ar
join territories t on t.area_id=ar.id
join addresses a on a.territory_id=t.id
join visiting_addresses va on va.address_id=a.id
where va.date >= '2020-01-01'
and va.date < '2020-02-01'
group by ar.id
Make sure you have an index on visiting_addresses(date) or even better on visiting_addresses(date, address_id, status).
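If the report month is a parameter, the half-open date range used in the WHERE clause above can be computed in application code. This is a minimal sketch; month_bounds is a hypothetical helper name, not something from the question.

```python
from datetime import date


def month_bounds(year, month):
    """Return (start, end) ISO dates so that start <= d < end covers the month."""
    start = date(year, month, 1)
    # The exclusive upper bound is the first day of the following month.
    if month == 12:
        end = date(year + 1, 1, 1)
    else:
        end = date(year, month + 1, 1)
    return start.isoformat(), end.isoformat()


start, end = month_bounds(2020, 1)
print(start, end)
```

The two strings can then be bound as parameters for va.date >= ? and va.date < ?, which keeps the column itself free of function calls such as month() and year().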
Then get all areas with a simple
select ar.id as area_id, ar.name as area
from areas ar
and add missing areas to the first result while setting visited, pending and total to zero (in application code).
The INNER JOIN should be much faster, because now the engine can start reading only the necessary rows from the visiting_addresses using an index for the WHERE conditions.
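The merging step in application code could look roughly like this. It is a sketch that assumes both result sets have been fetched as lists of dictionaries keyed by the column aliases used above; merge_with_zeros is a hypothetical helper name.

```python
def merge_with_zeros(all_areas, counted):
    """Return one row per area, using zeros for areas missing from `counted`."""
    by_id = {row["area_id"]: row for row in counted}
    merged = []
    for area in all_areas:
        row = by_id.get(area["area_id"])
        if row is None:
            # Area had no visits in the period: fill the counters with zeros.
            row = {**area, "visited": 0, "pending": 0, "total": 0}
        merged.append(row)
    return merged


all_areas = [{"area_id": 1, "area": "North"}, {"area_id": 2, "area": "South"}]
counted = [{"area_id": 1, "area": "North", "visited": 3, "pending": 1, "total": 4}]
for row in merge_with_zeros(all_areas, counted):
    print(row)
```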
You could also use a more complex but single query. The idea is to use a LEFT JOIN with a pre-aggregated subquery:
select ar.id as area_id, ar.name as area,
coalesce(visited, 0) as visited,
coalesce(pending, 0) as pending,
coalesce(total, 0) as total
from areas ar
left join (
select t.area_id,
sum(case when va.status = 1 then 1 else 0 end) as visited,
sum(case when va.status = 2 then 1 else 0 end) as pending,
count(va.id) as total
from territories t
join addresses a on a.territory_id=t.id
join visiting_addresses va on va.address_id=a.id
where va.date >= '2020-01-01'
and va.date < '2020-02-01'
group by t.area_id
) x on x.area_id = ar.id
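To check the pre-aggregated LEFT JOIN on a small scale, the following sketch builds the four tables from the question in an in-memory SQLite database (an assumption made for the sake of a runnable example; the question is about MySQL) and fills them with a few made-up rows.

```python
import sqlite3

SCHEMA_AND_DATA = """
CREATE TABLE areas (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE territories (id INTEGER PRIMARY KEY, name TEXT, area_id INTEGER);
CREATE TABLE addresses (id INTEGER PRIMARY KEY, name TEXT, territory_id INTEGER);
CREATE TABLE visiting_addresses (
    id INTEGER PRIMARY KEY, address_id INTEGER, date TEXT, status INTEGER);
-- Made-up sample rows: only North has visits, East has no territory at all.
INSERT INTO areas VALUES (1, 'North'), (2, 'South'), (3, 'East');
INSERT INTO territories VALUES (1, 'T1', 1), (2, 'T2', 2);
INSERT INTO addresses VALUES (1, 'A1', 1), (2, 'A2', 2);
INSERT INTO visiting_addresses VALUES
    (1, 1, '2020-01-05', 1),
    (2, 1, '2020-01-20', 2),
    (3, 1, '2019-12-31', 1);
"""

REPORT = """
SELECT ar.id, ar.name,
       COALESCE(x.visited, 0), COALESCE(x.pending, 0), COALESCE(x.total, 0)
FROM areas ar
LEFT JOIN (
    SELECT t.area_id,
           SUM(CASE WHEN va.status = 1 THEN 1 ELSE 0 END) AS visited,
           SUM(CASE WHEN va.status = 2 THEN 1 ELSE 0 END) AS pending,
           COUNT(va.id) AS total
    FROM territories t
    JOIN addresses a ON a.territory_id = t.id
    JOIN visiting_addresses va ON va.address_id = a.id
    WHERE va.date >= '2020-01-01' AND va.date < '2020-02-01'
    GROUP BY t.area_id
) x ON x.area_id = ar.id
ORDER BY ar.id
"""


def monthly_report():
    """Run the report against the sample data and return all rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA_AND_DATA)
    rows = conn.execute(REPORT).fetchall()
    conn.close()
    return rows


for row in monthly_report():
    print(row)
```

Every area appears once, and the December visit is excluded from North's totals because the date filter is applied inside the subquery.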

Try moving the logic in the WHERE clause to the ON clause of the appropriate join:
SELECT
ar.id AS area_id,
ar.name AS area,
COUNT(CASE WHEN va.status = 1 THEN 1 END) AS visited,
COUNT(CASE WHEN va.status = 2 THEN 1 END) AS pending,
COUNT(va.id) AS total
FROM areas ar
LEFT JOIN territories t ON t.area_id = ar.id
LEFT JOIN addresses a ON a.territory_id = t.id
LEFT JOIN visiting_addresses va ON va.address_id = a.id AND
va.date >= '2020-01-01' AND va.date < '2020-02-01'
GROUP BY
ar.id;
Note that selecting the name field while only aggregating by id is valid in MySQL, assuming that id is a unique field (such as the primary key) in the areas table.
You may also try adding the following index to the visiting_addresses table:
CREATE INDEX date_idx ON visiting_addresses (address_id, date, status);
This might help speed up the join to this table.

Related

MySQL upgrade from left join to something similar as FULL join

Sorry for asking this here, but I need help and Google is not being nice.
I have the following table Products
SELECT
COUNT(CASE when core.kits.Location = core.suppliers.id THEN 1 END) as total,
COUNT(CASE when core.kits.cp = 1 THEN 1 END) as used,
core.suppliers.id, core.suppliers.name, core.suppliers.email,
core.suppliers.cperson, core.suppliers.adress, core.suppliers.phone
FROM core.kits
LEFT join core.suppliers on core.kits.Location = core.suppliers.id
WHERE core.suppliers.id is not null
AND banned=0
GROUP BY core.suppliers.id
ORDER BY name ASC
LIMIT 1000 OFFSET 0
but it does not give me all the suppliers, with zeros for the ones that do not appear in kits.
Then, if I do
SELECT
COUNT(CASE when core.kits.Location = core.suppliers.id THEN 1 END) as total,
COUNT(CASE when core.kits.cp = 1 THEN 1 END) as used,
core.suppliers.id, core.suppliers.name, core.suppliers.email,
core.suppliers.cperson, core.suppliers.adress, core.suppliers.phone
FROM core.suppliers
LEFT join core.kits on core.suppliers.id = core.kits.Location
WHERE core.suppliers.id is not null
AND banned=0
GROUP BY core.suppliers.id
ORDER BY name ASC
LIMIT 1000 OFFSET 0
I get all suppliers and correct numbers but the query takes 8 seconds instead of 1s. Any ideas how can I get all the suppliers with the count of stocks in 1s?
cheers.
If you want all the suppliers, even those that do not appear in kits, you should do a LEFT join of suppliers to kits:
SELECT COUNT(k.Location) AS total,
COUNT(CASE WHEN k.cp = 1 THEN 1 END) AS used,
s.id, s.name, s.email, s.cperson, s.adress, s.phone
FROM core.suppliers s LEFT JOIN core.kits k
ON k.Location = s.id
WHERE banned=0
GROUP BY s.id
ORDER BY s.name ASC
LIMIT 1000 OFFSET 0;
I assume that core.suppliers.id is the primary key of suppliers, so that the condition:
core.suppliers.id is not null
is not needed.
Also, if the column banned is contained in the table kits, then the condition should be moved to the ON clause:
ON k.Location = s.id AND k.banned=0
and the WHERE clause should be removed.

shows mysql records twice because of inner joining

In the query below, there are 13 Mentors but the result shows 26, and there are 5 SchoolSupervisors but the result shows 10, which is wrong. It happens because of the evidence: there are 2 evidence files, so the Mentor and SchoolSupervisor values are doubled.
Please help me out.
Query:
select t.c_id,t.province,t.district,t.cohort,t.duration,t.venue,t.v_date,t.review_level, t.activity,
SUM(CASE WHEN pr.p_association = "Mentor" THEN 1 ELSE 0 END) as Mentor,
SUM(CASE WHEN pr.p_association = "School Supervisor" THEN 1 ELSE 0 END) as SchoolSupervisor,
(CASE WHEN count(file_id) > 0 THEN "Yes" ELSE "No" END) as evidence
FROM review_m t , review_attndnce ra
LEFT JOIN participant_registration AS pr ON pr.p_id = ra.p_id
LEFT JOIN review_files AS rf ON rf.training_id = ra.c_id
WHERE 1=1 AND t.c_id = ra.c_id
group by t.c_id, ra.c_id order by t.c_id desc
You may perform the aggregations in a separate subquery, and then join to it:
SELECT
t.c_id,
t.province,
t.district,
t.cohort,
t.duration,
t.venue,
t.v_date,
t.review_level,
t.activity,
pr.Mentor,
pr.SchoolSupervisor,
rf.evidence
FROM review_m t
INNER JOIN review_attndnce ra
ON t.c_id = ra.c_id
LEFT JOIN
(
SELECT
p_id,
COUNT(CASE WHEN p_association = 'Mentor' THEN 1 END) AS Mentor,
COUNT(CASE WHEN p_association = 'School Supervisor' THEN 1 END) AS SchoolSupervisor
FROM participant_registration
GROUP BY p_id
) pr
ON pr.p_id = ra.p_id
LEFT JOIN
(
SELECT
training_id,
CASE WHEN COUNT(file_id) > 0 THEN 'Yes' ELSE 'No' END AS evidence
FROM review_files
GROUP BY training_id
) rf
ON rf.training_id = ra.c_id
ORDER BY
t.c_id DESC;
Note that this also fixes another problem your query had, which was that you were selecting many columns which did not appear in the GROUP BY clause. Under this refactor, there is nothing wrong with your current select, because the aggregations take place in separate subqueries.
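The doubling itself is easy to reproduce. The sketch below uses an in-memory SQLite database (an assumption; the question is about MySQL) with made-up rows: one training, three attendees and two evidence files. Note that it is a variation on the idea rather than the exact query above: only review_files is pre-aggregated, and the participant counts are still grouped per training.

```python
import sqlite3

SETUP = """
CREATE TABLE review_attndnce (c_id INTEGER, p_id INTEGER);
CREATE TABLE participant_registration (p_id INTEGER PRIMARY KEY, p_association TEXT);
CREATE TABLE review_files (file_id INTEGER PRIMARY KEY, training_id INTEGER);
-- Made-up sample: training 1 has 2 mentors, 1 supervisor and 2 evidence files.
INSERT INTO participant_registration VALUES
    (1, 'Mentor'), (2, 'Mentor'), (3, 'School Supervisor');
INSERT INTO review_attndnce VALUES (1, 1), (1, 2), (1, 3);
INSERT INTO review_files VALUES (10, 1), (11, 1);
"""

# Joining the raw files table repeats every attendee once per file.
NAIVE = """
SELECT ra.c_id,
       SUM(CASE WHEN pr.p_association = 'Mentor' THEN 1 ELSE 0 END),
       SUM(CASE WHEN pr.p_association = 'School Supervisor' THEN 1 ELSE 0 END)
FROM review_attndnce ra
LEFT JOIN participant_registration pr ON pr.p_id = ra.p_id
LEFT JOIN review_files rf ON rf.training_id = ra.c_id
GROUP BY ra.c_id
"""

# Collapsing the files to one row per training first avoids the fan-out.
PRE_AGGREGATED = """
SELECT ra.c_id,
       SUM(CASE WHEN pr.p_association = 'Mentor' THEN 1 ELSE 0 END),
       SUM(CASE WHEN pr.p_association = 'School Supervisor' THEN 1 ELSE 0 END),
       CASE WHEN COALESCE(MAX(rf.n_files), 0) > 0 THEN 'Yes' ELSE 'No' END
FROM review_attndnce ra
LEFT JOIN participant_registration pr ON pr.p_id = ra.p_id
LEFT JOIN (
    SELECT training_id, COUNT(*) AS n_files
    FROM review_files
    GROUP BY training_id
) rf ON rf.training_id = ra.c_id
GROUP BY ra.c_id
"""


def run(sql):
    """Execute one query against a fresh copy of the sample data."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SETUP)
    rows = conn.execute(sql).fetchall()
    conn.close()
    return rows


print(run(NAIVE))           # counts doubled by the two files
print(run(PRE_AGGREGATED))  # counts match the attendance rows
```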
Try adding this to the WHERE part of your query:
AND pr.p_id IS NOT NULL AND rf.training_id IS NOT NULL
You can add a GROUP BY on pr.p_id to remove the duplicate records there. Since there is currently no grouping on pr, there might be multiple records with the same p_id for the same ra:
group by t.c_id, ra.c_id, pr.p_id order by t.c_id desc

Unable to get left outer join result in mysql query

SELECT
BB.NAME BranchName,
VI.NAME Village,
COUNT(BAC.CBSACCOUNTNUMBER) 'No.Of Accounts',
SUM(BAC.CURRENTBALANCE) SumOfAmount,
SUM(CASE
WHEN transactiontype = 'C' THEN amount
ELSE 0
END) AS CreditTotal,
SUM(CASE
WHEN transactiontype = 'D' THEN amount
ELSE 0
END) AS DebitTotal,
SUM(CASE
WHEN transactiontype = 'C' THEN amount
WHEN transactiontype = 'D' THEN - 1 * amount
ELSE 0
END) AS CurrentBalance
FROM
CUSTOMER CU,
APPLICANT AP,
ADDRESS AD,
VILLAGE VI,
BANKBRANCH BB,
BANKACCOUNT BAC
LEFT OUTER JOIN
accounttransaction ACT ON BAC.CBSACCOUNTNUMBER = ACT.BANKACCOUNT_CBSACCOUNTNUMBER
AND ACT.TRANDATE <= '2013-03-21'
AND BAC.ACCOUNTOPENINGDATE < '2013-03-21'
AND ACT.BANKACCOUNT_CBSACCOUNTNUMBER IS NOT NULL
WHERE
CU.CODE = AP.CUSTOMER_CODE
AND BAC.ENTITY = 'CUSTOMER'
AND BAC.ENTITYCODE = CU.CODE
AND AD.ENTITY = 'APPLICANT'
AND AD.ENTITYCODE = AP.CODE
AND AD.VILLAGE_CODE = VI.CODE
AND AD.STATE_CODE = VI.STATE_CODE
AND AD.DISTRICT_CODE = VI.DISTRICT_CODE
AND AD.BLOCK_CODE = VI.BLOCK_CODE
AND AD.PANCHAYAT_CODE = VI.PANCHAYAT_CODE
AND CU.BANKBRANCH_CODE = BB.CODE
AND BAC.CBSACCOUNTNUMBER IS NOT NULL
AND ACT.TRANSACTIONTYPE IS NOT NULL
GROUP BY BB.NAME , VI.NAME;
Here is my information:
I have two tables, bankaccount and accounttransaction.
When an account is created, a row is added to the bankaccount table, and when a transaction is made, a row for the respective account number is added to the accounttransaction table. I want to display, per branch, the count of all account numbers that exist in bankaccount, whether or not they appear in the accounttransaction table.
I'm guessing that the problem you have is that you are not getting results for accounts that do not have data in your accounttransaction table, even though you are using a LEFT JOIN. If that is true, the reason is because your join condition includes AND ACT.BANKACCOUNT_CBSACCOUNTNUMBER IS NOT NULL, which defeats the LEFT JOIN. You also have two conditions in your WHERE clause that I bet should not be there.
You should learn to use explicit join syntax in your coding. Your code will be much clearer if you do that; it separates the join conditions from the WHERE clause, which does a very different thing. I took a stab at re-writing your query as an illustration:
SELECT
BB.NAME BranchName,
VI.NAME Village,
COUNT(BAC.CBSACCOUNTNUMBER) 'No.Of Accounts',
SUM(BAC.CURRENTBALANCE) SumOfAmount,
SUM(ACT.CurrentBalance) CurrentBalance,
SUM(ACT.DebitTotal) DebitTotal,
SUM(ACT.CreditTotal) CreditTotal
FROM CUSTOMER CU
JOIN APPLICANT AP
ON AP.CUSTOMER_CODE = CU.CODE
JOIN ADDRESS AD
ON AD.ENTITYCODE = AP.CODE
JOIN VILLAGE VI
ON VI.CODE = AD.VILLAGE_CODE
AND VI.STATE_CODE = AD.STATE_CODE
AND VI.DISTRICT_CODE = AD.DISTRICT_CODE
AND VI.BLOCK_CODE = AD.BLOCK_CODE
AND VI.PANCHAYAT_CODE = AD.PANCHAYAT_CODE
JOIN BANKBRANCH BB
ON BB.CODE = CU.BANKBRANCH_CODE
JOIN BANKACCOUNT BAC
ON BAC.ENTITYCODE = CU.CODE
LEFT OUTER JOIN (
SELECT BANKACCOUNT_CBSACCOUNTNUMBER,
SUM(CASE
WHEN transactiontype = 'C' THEN amount
ELSE 0
END) AS CreditTotal,
SUM(CASE
WHEN transactiontype = 'D' THEN amount
ELSE 0
END) AS DebitTotal,
SUM(CASE
WHEN transactiontype = 'C' THEN amount
WHEN transactiontype = 'D' THEN - 1 * amount
ELSE 0
END) AS CurrentBalance
FROM accounttransaction
WHERE TRANDATE <= '2013-03-21'
GROUP BY BANKACCOUNT_CBSACCOUNTNUMBER
) ACT
ON ACT.BANKACCOUNT_CBSACCOUNTNUMBER = BAC.CBSACCOUNTNUMBER
AND BAC.ACCOUNTOPENINGDATE < '2013-03-21'
WHERE BAC.ENTITY = 'CUSTOMER'
AND AD.ENTITY = 'APPLICANT'
GROUP BY BB.NAME , VI.NAME;
I removed this line from the LEFT JOIN condition
AND ACT.BANKACCOUNT_CBSACCOUNTNUMBER IS NOT NULL
And I removed these two lines from the WHERE clause
AND BAC.CBSACCOUNTNUMBER IS NOT NULL
AND ACT.TRANSACTIONTYPE IS NOT NULL
If that does not solve your problem, please revise your question to explain further.
UPDATE: Based on comments, the query is revised to calculate the debit, credit, and current balance by account using a derived table.
Also note the placement of the BAC.ACCOUNTOPENINGDATE < '2013-03-21' condition on left join. As written, this will return all accounts regardless of the opening date. If you want to only show accounts that were opened before that date, this condition should be moved to the WHERE clause.

Unable to get exact result using left outer join in mysql banking system?

SELECT
BB.NAME BranchName,
VI.NAME Village,
COUNT(BAC.CBSACCOUNTNUMBER) 'No.Of Accounts',
SUM(BAC.CURRENTBALANCE) SumOfAmount,
SUM(CASE
WHEN transactiontype = 'C' THEN amount
ELSE 0
END) AS CreditTotal,
SUM(CASE
WHEN transactiontype = 'D' THEN amount
ELSE 0
END) AS DebitTotal,
SUM(CASE
WHEN transactiontype = 'C' THEN amount
WHEN transactiontype = 'D' THEN - 1 * amount
ELSE 0
END) AS CurrentBalance
FROM CUSTOMER CU
JOIN APPLICANT AP
ON AP.CUSTOMER_CODE = CU.CODE
JOIN ADDRESS AD
ON AD.ENTITYCODE = AP.CODE
JOIN VILLAGE VI
ON VI.CODE = AD.VILLAGE_CODE
AND VI.STATE_CODE = AD.STATE_CODE
AND VI.DISTRICT_CODE = AD.DISTRICT_CODE
AND VI.BLOCK_CODE = AD.BLOCK_CODE
AND VI.PANCHAYAT_CODE = AD.PANCHAYAT_CODE
JOIN BANKBRANCH BB
ON BB.CODE = CU.BANKBRANCH_CODE
JOIN BANKACCOUNT BAC
ON BAC.ENTITYCODE = CU.CODE
LEFT OUTER JOIN accounttransaction ACT
ON ACT.BANKACCOUNT_CBSACCOUNTNUMBER= BAC.CBSACCOUNTNUMBER
AND ACT.TRANDATE <= '2013-07-01'
AND BAC.ACCOUNTOPENINGDATE < '2013-07-01'
WHERE BAC.ENTITY = 'CUSTOMER'
AND AD.ENTITY = 'APPLICANT'
GROUP BY BB.NAME,VI.NAME;
Here, one branch from the BANKBRANCH table has 263 accounts. When I execute the above query with the LEFT OUTER JOIN, the count increases to 293, which is wrong because only 263 accounts were opened under that branch.
If I remove the LEFT OUTER JOIN, the result is 263 for that branch; when I include it, the count increases to 293. Please help me find the problem.
This is a continuation of the question below:
http://stackoverflow.com/questions/17277899/unable-to-get-left-outer-join-result-in-mysql-query/17279769#17279769
This part:
LEFT OUTER JOIN accounttransaction ACT
ON ACT.BANKACCOUNT_CBSACCOUNTNUMBER= BAC.CBSACCOUNTNUMBER
AND ACT.TRANDATE <= '2013-07-01'
AND BAC.ACCOUNTOPENINGDATE < '2013-07-01'
allows more than one row from the accounttransaction table to be matched for each account, which is why the row count increases from 263 to 293. A LEFT JOIN does not implicitly limit the joined data to a single match.
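A small reproduction of this effect, as a sketch only: it uses an in-memory SQLite database and a hypothetical two-table reduction of the schema (accounts and account_tx), with three accounts of which one has two transactions.

```python
import sqlite3

SETUP = """
CREATE TABLE accounts (acct_no TEXT PRIMARY KEY, branch TEXT);
CREATE TABLE account_tx (id INTEGER PRIMARY KEY, acct_no TEXT, amount INTEGER);
-- Made-up sample: A1 has two transactions, A2 has one, A3 has none.
INSERT INTO accounts VALUES ('A1', 'Main'), ('A2', 'Main'), ('A3', 'Main');
INSERT INTO account_tx VALUES (1, 'A1', 100), (2, 'A1', 50), (3, 'A2', 70);
"""

QUERY = """
SELECT a.branch,
       COUNT(a.acct_no)          AS joined_rows,
       COUNT(DISTINCT a.acct_no) AS accounts
FROM accounts a
LEFT JOIN account_tx t ON t.acct_no = a.acct_no
GROUP BY a.branch
"""


def branch_counts():
    """Return (branch, joined row count, distinct account count) per branch."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SETUP)
    rows = conn.execute(QUERY).fetchall()
    conn.close()
    return rows


print(branch_counts())
```

COUNT(a.acct_no) counts one row per matched transaction, while COUNT(DISTINCT a.acct_no) counts each account once. A per-account column summed after such a join is inflated in the same way, which is why aggregating the transactions in a derived table before joining is usually the safer option.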

MYSQL How to count elements grouped by type

I have a problem with a query:
I have a list of stores, each of these stores has members and there are various categories of membership (Bronze, silver, gold ...)
The tables are: 'shops', 'members', 'membership_cards'.
shops: id, name
members: id, shops_id, membership_id, first_name, last_name
membership_cards: id, description
I need to extract the count of members for each store, grouped by membership. Can I do this without using a server-side language?
The final result should be something like:
Store's name, n°bronze members, n°silver_members, n°gold_members ....
Based on what you provided, you want a query like:
select s.shopid,
sum(case when c.cardtype = 'Bronze' then 1 else 0 end) as Bronze,
sum(case when c.cardtype = 'Silver' then 1 else 0 end) as Silver,
sum(case when c.cardtype = 'Gold' then 1 else 0 end) as Gold
from shops s
left outer join members m on s.shopid = m.shopid
left outer join cards c on c.memberid = m.memberid
group by s.shopid
If you want to know the number of members, rather than of cards in each group (if members can have more than one card), then replace the sum() expression with:
count(case when c.cardtype = 'Bronze' then m.memberid end)
Without knowing your database schema, it's a bit hard to answer that question, but something like the following should do the job:
SELECT shops.name,
SUM(CASE WHEN membership_cards.category = 'Bronze' THEN 1 ELSE 0 END) AS Bronze,
SUM(CASE WHEN membership_cards.category = 'Silver' THEN 1 ELSE 0 END) AS Silver,
SUM(CASE WHEN membership_cards.category = 'Gold' THEN 1 ELSE 0 END) AS Gold
FROM shops
INNER JOIN members
ON shops.id = members.shopid
INNER JOIN membership_cards
ON members.id = membership_cards.memberid
GROUP BY shops.name
Just change the column names to the names you are using.
SELECT B.name,A.Bronze,A.Silver,A.Gold
FROM
(
SELECT S.id,
SUM(IF(IFNULL(C.cardtype,'')='Bronze',1,0)) Bronze,
SUM(IF(IFNULL(C.cardtype,'')='Silver',1,0)) Silver,
SUM(IF(IFNULL(C.cardtype,'')='Gold' ,1,0)) Gold
FROM shops S
LEFT JOIN members M ON S.id = M.shops_id
LEFT JOIN membership_cards C ON M.membership_id = C.id
GROUP BY S.id
) A
INNER JOIN shops B USING (id);
I used the IFNULL function in case any member has no cards.
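For reference, here is a runnable sketch of the conditional-count approach using the table and column names from the question. It runs on an in-memory SQLite database instead of MySQL, and it assumes that membership_cards.description holds the tier names 'Bronze', 'Silver' and 'Gold'; the sample rows are made up.

```python
import sqlite3

SETUP = """
CREATE TABLE shops (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE membership_cards (id INTEGER PRIMARY KEY, description TEXT);
CREATE TABLE members (
    id INTEGER PRIMARY KEY, shops_id INTEGER, membership_id INTEGER,
    first_name TEXT, last_name TEXT);
-- Made-up sample: Alpha has two bronze members and one gold, Beta has none.
INSERT INTO shops VALUES (1, 'Alpha'), (2, 'Beta');
INSERT INTO membership_cards VALUES (1, 'Bronze'), (2, 'Silver'), (3, 'Gold');
INSERT INTO members VALUES
    (1, 1, 1, 'Ann', 'A'), (2, 1, 1, 'Bob', 'B'), (3, 1, 3, 'Cid', 'C');
"""

QUERY = """
SELECT s.name,
       COUNT(CASE WHEN c.description = 'Bronze' THEN m.id END) AS bronze,
       COUNT(CASE WHEN c.description = 'Silver' THEN m.id END) AS silver,
       COUNT(CASE WHEN c.description = 'Gold'   THEN m.id END) AS gold
FROM shops s
LEFT JOIN members m ON m.shops_id = s.id
LEFT JOIN membership_cards c ON c.id = m.membership_id
GROUP BY s.id, s.name
ORDER BY s.id
"""


def members_per_tier():
    """Return (shop name, bronze, silver, gold) for every shop."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SETUP)
    rows = conn.execute(QUERY).fetchall()
    conn.close()
    return rows


for row in members_per_tier():
    print(row)
```

COUNT ignores NULLs, so a CASE without an ELSE branch counts only the matching members, and shops without members still appear with zeros thanks to the LEFT JOINs.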