How to show all rows in a query result even when the counts are empty - MySQL

I am counting rows from my database, but the result only shows categories that have data; categories with no matching rows are not displayed. How do I display both the empty and the non-empty categories?
The result of my query looks like this:
pendidikan| Male | Female | Total
----------+------+--------+------
SD | 3 | 4 | 7
SMP | 2 | 1 | 3
SMA | 1 | 3 | 4
S1 | 10 | 1 | 11
But I want the result to look like this:
pendidikan| Male | Female | Total
----------+------+--------+------
SD | 3 | 4 | 7
SMP | 2 | 1 | 3
SMA | 1 | 3 | 4
S1 | 10 | 1 | 11
S2 | 0 | 0 | 0
S3 | 0 | 0 | 0
I want to show the categories that have no data as well. This is my query:
SELECT a.NamaStatusPendidikan, COUNT(c.IDPencaker) as total,
count(case when c.JenisKelamin='0' then 1 end) as laki,
count(case when c.JenisKelamin='1' then 1 end) as cewe
FROM msstatuspendidikan as a JOIN mspencaker as c ON
a.IDStatusPendidikan = c.IDStatusPendidikan JOIN
mspengalaman as d ON c.IDPencaker = d.IDPencaker
WHERE d.StatusPekerjaan = '0' AND c.RegisterDate
BETWEEN '2019-01-01' AND '2019-03-01' GROUP BY a.IDStatusPendidikan

Try running this query:
SELECT sp.NamaStatusPendidikan,
COUNT(g.IDPencaker) as total,
COALESCE(SUM( g.IDPencaker IS NOT NULL AND p.JenisKelamin = 0 ), 0) as laki,
COALESCE(SUM( g.IDPencaker IS NOT NULL AND p.JenisKelamin = 1 ), 0) as cewe
FROM msstatuspendidikan sp LEFT JOIN
mspencaker p
ON sp.IDStatusPendidikan = p.IDStatusPendidikan AND
p.RegisterDate BETWEEN '2019-01-01' AND '2019-03-01' LEFT JOIN
mspengalaman g
ON g.IDPencaker = p.IDPencaker AND
g.StatusPekerjaan = 0
GROUP BY sp.IDStatusPendidikan;
Notes:
The JOINs have been replaced with LEFT JOINs.
Filtering conditions on all but the first table have been moved to the ON clauses.
The meaningless table aliases have been replaced with table abbreviations, so the query is easier to read.
Things that look like numbers probably are numbers, so I removed the single quotes.
The counts are simplified, using the fact that MySQL treats booleans as numbers in a numeric context.
The total uses COUNT(g.IDPencaker) rather than COUNT(*), so an education level with no matching rows shows 0 instead of 1; the gender sums check g.IDPencaker IS NOT NULL for the same reason.
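The effect of moving the filter into the ON clause can be checked on a toy data set. The following is only a minimal sketch that uses Python's built-in sqlite3 module instead of MySQL; the tables `edu` and `person` and their columns are hypothetical stand-ins for msstatuspendidikan and mspencaker, and the third table is left out to keep it short.

```python
import sqlite3

# Hypothetical, simplified schema: edu ~ msstatuspendidikan, person ~ mspencaker.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE edu (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE person (id INTEGER PRIMARY KEY, edu_id INTEGER,
                         gender INTEGER, registered TEXT);
    INSERT INTO edu VALUES (1, 'SD'), (2, 'S1'), (3, 'S2');
    INSERT INTO person VALUES
        (1, 1, 0, '2019-01-10'),
        (2, 1, 1, '2019-02-01'),
        (3, 2, 0, '2019-02-15'),
        (4, 3, 1, '2018-12-31');  -- outside the date range
""")

# The date filter lives in the ON clause, so 'S2' survives the LEFT JOIN
# even though its only person falls outside the range.
QUERY = """
    SELECT e.name,
           COUNT(p.id) AS total,
           COALESCE(SUM(p.gender = 0), 0) AS male,
           COALESCE(SUM(p.gender = 1), 0) AS female
    FROM edu e
    LEFT JOIN person p
           ON p.edu_id = e.id
          AND p.registered BETWEEN '2019-01-01' AND '2019-03-01'
    GROUP BY e.id, e.name
    ORDER BY e.id
"""
rows = conn.execute(QUERY).fetchall()
for row in rows:
    print(row)
```

If the date condition were moved to a WHERE clause instead, the 'S2' row would be filtered out again, which is the behaviour the question describes.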

SQL Query to conditionally retrieve records based on a key value list

I have an API that communicates with a database with the following tables:
QUESTS
+----------+------------+
| quest_id | quest_name |
+----------+------------+
| 1 | Q001 |
| 2 | Q002 |
| 3 | Q003 |
| 4 | Q004 |
| 5 | Q005 |
+----------+------------+
SKILLS
+----------+------------+
| skill_id | skill_name |
+----------+------------+
| 1 | S001 |
| 2 | S002 |
| 3 | S003 |
| 4 | S004 |
| 5 | S005 |
+----------+------------+
SKILL_PREREQUISITES
+----------+-----------------------+-------------+
| quest_id | prerequisite_skill_id | skill_value |
+----------+-----------------------+-------------+
| 1 | 2 | 50 |
| 1 | 1 | 45 |
| 4 | 2 | 25 |
| 4 | 3 | 60 |
| 5 | 4 | 50 |
+----------+-----------------------+-------------+
Quests correspond to levels in a game, and the skills are acquired by players while playing the game. The SKILL_PREREQUISITES table maintains the skill prerequisites a player needs to satisfy before being able to participate in a quest.
Problem
An endpoint of the API receives a list of skills a player has, along with the skill level (skill_value) of the corresponding skill like so:
[
{
"key": 1, //skill ID
"value": 45 //skill value
},
{
"key": 2,
"value": 60
}
...
]
My use case is to take these values and query the database for the list of quests the player is eligible to participate in, based on both the skill_id and the skill_value.
Example
Assume the API receives the following skill-skill value mapping:
skill-value map: [{1,50}, {2,60}, {4,50}]
Based on this, the player with the above skill set can participate in quest 1 and 5 but not in 4 since 4 requires skill 3 with 60 points.
Attempt at a solution
I have managed to write a query (thanks to a previous question I posted!) to identify the quests that correspond to the skill IDs, but I have no idea how to filter this further within the query, based on the skill value.
I am wondering if this is even possible at this point and whether I might have to do further processing on the server to get the required result.
Fiddle for my attempt at a solution: https://www.db-fiddle.com/f/s2umHS1wz3Q8ibwUhKDas6/1
The challenge here is that you need to pass the values received by the API to your SQL statement as input, and the number of comparisons has to be generated dynamically from that input.
I am not familiar with Node.js, so my solution only covers the required SQL statements; you will need to write the remaining part yourself.
First, parse the API payload and store the values in a data structure.
Then create a temporary table from your Node.js code to hold these input values:
CREATE TEMPORARY TABLE Input (id INT, value INT);
Insert the data from that data structure into this table.
Now run the following query and you'll get what you want:
SELECT skp.quest_id
FROM SKILL_PREREQUISITES skp
GROUP BY quest_id
HAVING COUNT(skp.quest_id) =
( SELECT COUNT(quest_id)
FROM Input i
JOIN SKILL_PREREQUISITES sp
ON sp.prerequisite_skill_id = i.id
AND sp.skill_value <= i.value
WHERE skp.quest_id = sp.quest_id
)
You can phrase the query by comparing both the skill and the value:
SELECT q.quest_id
FROM QUESTS q LEFT JOIN
SKILL_PREREQUISITES p
ON p.quest_id = q.quest_id
GROUP BY q.quest_id
HAVING SUM( p.prerequisite_skill_id = 1 and 50 >= p.skill_value) > 0 AND
SUM( p.prerequisite_skill_id = 2 and 60 >= p.skill_value) > 0 AND
SUM( p.prerequisite_skill_id = 4 and 50 >= p.skill_value) > 0 ;
Because you need quests that have prerequisites, a LEFT JOIN is not necessary. In fact, no JOIN is necessary at all:
SELECT p.quest_id
FROM SKILL_PREREQUISITES p
WHERE p.prerequisite_skill_id IN (1, 2, 4)
GROUP BY p.quest_id
HAVING SUM( p.prerequisite_skill_id = 1 and 50 >= p.skill_value) > 0 AND
SUM( p.prerequisite_skill_id = 2 and 60 >= p.skill_value) > 0 AND
SUM( p.prerequisite_skill_id = 4 and 50 >= p.skill_value) > 0 ;
The filtering before the GROUP BY is optional, but it might improve performance.
EDIT:
I think I answered the wrong question above. I think you want all the quests that are available given the skills the user has. That would be:
SELECT q.quest_id
FROM QUESTS q LEFT JOIN
SKILL_PREREQUISITES p
ON p.quest_id = q.quest_id
GROUP BY q.quest_id
HAVING COALESCE(SUM( (p.prerequisite_skill_id = 1 and 50 >= p.skill_value) +
(p.prerequisite_skill_id = 2 and 60 >= p.skill_value) +
(p.prerequisite_skill_id = 4 and 50 >= p.skill_value)
), 0) = COUNT(p.prerequisite_skill_id);
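Both answers can be combined so that the query text does not have to grow with the number of skills. The sketch below is one possible way to do it, shown with Python's built-in sqlite3 module rather than Node.js and MySQL; the function name `eligible_quests` and the temporary table `player_skill` are hypothetical names chosen for this illustration. A quest counts as eligible when every one of its prerequisites is matched by a player skill of sufficient value.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE QUESTS (quest_id INTEGER PRIMARY KEY, quest_name TEXT);
    CREATE TABLE SKILL_PREREQUISITES (quest_id INTEGER,
                                      prerequisite_skill_id INTEGER,
                                      skill_value INTEGER);
    INSERT INTO QUESTS VALUES (1,'Q001'),(2,'Q002'),(3,'Q003'),(4,'Q004'),(5,'Q005');
    INSERT INTO SKILL_PREREQUISITES VALUES
        (1, 2, 50), (1, 1, 45), (4, 2, 25), (4, 3, 60), (5, 4, 50);
""")

def eligible_quests(conn, skills):
    """skills maps skill ID -> skill value, e.g. parsed from the API payload."""
    # Hypothetical temp table holding the player's skills for this request.
    conn.execute("CREATE TEMP TABLE IF NOT EXISTS player_skill "
                 "(skill_id INTEGER PRIMARY KEY, skill_value INTEGER)")
    conn.execute("DELETE FROM player_skill")
    conn.executemany("INSERT INTO player_skill VALUES (?, ?)", skills.items())
    rows = conn.execute("""
        SELECT q.quest_id
        FROM QUESTS q
        LEFT JOIN SKILL_PREREQUISITES p ON p.quest_id = q.quest_id
        LEFT JOIN player_skill s
               ON s.skill_id = p.prerequisite_skill_id
              AND s.skill_value >= p.skill_value
        GROUP BY q.quest_id
        -- every prerequisite row must have found a satisfying skill row
        HAVING COUNT(p.prerequisite_skill_id) = COUNT(s.skill_id)
        ORDER BY q.quest_id
    """).fetchall()
    return [r[0] for r in rows]

print(eligible_quests(conn, {1: 50, 2: 60, 4: 50}))
```

With the sample data this returns quests 1 and 5 as in the question, plus quests 2 and 3, which have no prerequisites at all. Starting from SKILL_PREREQUISITES instead of QUESTS would leave those two out.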

Ordering issue when using SQL variable

I run this query:
SELECT stockcarddetail.id, stockcarddetail.date, stockcarddetail.quantity, stockcarddetail.pricePerItem
FROM Stockcard
LEFT JOIN staff
ON staff.branchId = stockcard.branchId
LEFT JOIN stockcarddetail
ON stockcarddetail.stockcardId = stockcard.id
WHERE staff.username = 'jemmy.h'
AND stockcarddetail.quantity > 0
AND stockcard.productId = '98924a5f-6afb-11e7-8dd4-2c56dcbcb038'
ORDER BY date ASC
and get the result below:
id | date | quantity| pricePerItem
50 | 2017-10-15 | 10.00 | 10000.00
1 | 2017-10-18 | 20.00 | 10000.00
Then I need to calculate the cumulative quantity in the order above, so I run this query:
SELECT a.*, @tot:=@tot + a.quantity FROM
(SELECT @tot:= 0)b
JOIN
(SELECT stockcarddetail.id, stockcarddetail.date, stockcarddetail.quantity, stockcarddetail.pricePerItem
FROM Stockcard
LEFT JOIN staff
ON staff.branchId = stockcard.branchId
LEFT JOIN stockcarddetail
ON stockcarddetail.stockcardId = stockcard.id
WHERE staff.username = 'jemmy.h'
AND stockcarddetail.quantity > 0
AND stockcard.productId = '98924a5f-6afb-11e7-8dd4-2c56dcbcb038'
ORDER BY date ASC) a
Then I got this result:
id | date | quantity| pricePerItem | @tot
1 | 2017-10-18 | 20.00 | 10000.00 | 20
50 | 2017-10-15 | 10.00 | 10000.00 | 30
However, the result that I want is like this:
id | date | quantity| pricePerItem | @tot
50 | 2017-10-15 | 10.00 | 10000.00 | 10
1 | 2017-10-18 | 20.00 | 10000.00 | 30
How can I get the expected result?
EDIT
Simplified version of the problem can be found here: http://sqlfiddle.com/#!9/f6ad91/3
From what I understand from you, you want the cumulative total for each entry.
I suggest ditching the variable and relying on a subquery instead:
SELECT
scd.id,
scd.date,
scd.quantity,
scd.pricePerItem,
(SELECT SUM(scd1.quantity) FROM StockcardDetail AS scd1 WHERE scd1.stockcardId = scd.stockcardId AND scd1.date <= scd.date) AS total
FROM Stockcard
LEFT JOIN staff ON staff.branchId = stockcard.branchId
LEFT JOIN stockcarddetail AS scd ON scd.stockcardId = stockcard.id
WHERE staff.username = 'jemmy.h'
AND scd.quantity > 0
AND stockcard.productId = '98924a5f-6afb-11e7-8dd4-2c56dcbcb038'
ORDER BY scd.date ASC
The idea behind this is to make it select the sum of all entries prior (including the current one) for each entry.
As far as I understand, your query should already produce the expected output. If it doesn't, another possible solution is to initialize the variable in a separate statement (without the JOIN):
SET @tot:= 0;
SELECT
stockcarddetail.id,
stockcarddetail.date,
stockcarddetail.quantity,
stockcarddetail.pricePerItem,
@tot:=@tot + stockcarddetail.quantity as Total
FROM Stockcard
LEFT JOIN staff ON staff.branchId = stockcard.branchId
LEFT JOIN stockcarddetail ON stockcarddetail.stockcardId = stockcard.id
WHERE staff.username = 'jemmy.h' AND stockcarddetail.quantity > 0 AND stockcard.productId = '98924a5f-6afb-11e7-8dd4-2c56dcbcb038'
ORDER BY date ASC
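Where window functions are available (MySQL 8.0 and later, SQLite 3.25 and later), a running total can also be written without user variables or a correlated subquery. This is a minimal sketch using Python's built-in sqlite3 module; the single table `detail` is a hypothetical, cut-down stand-in for stockcarddetail with the two rows from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE detail (id INTEGER PRIMARY KEY, date TEXT, quantity INTEGER);
    INSERT INTO detail VALUES (1, '2017-10-18', 20), (50, '2017-10-15', 10);
""")

# SUM(...) OVER (ORDER BY ...) accumulates in the stated order, independent
# of how the rows happen to be read; id breaks ties between equal dates.
rows = conn.execute("""
    SELECT id, date, quantity,
           SUM(quantity) OVER (ORDER BY date, id) AS total
    FROM detail
    ORDER BY date, id
""").fetchall()
for row in rows:
    print(row)
```

Because the ordering is part of the window definition, the result does not depend on the order in which a derived table happens to be evaluated, which is what goes wrong with the variable-based query.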

Mysql search on multiple join results

I have a table "products" and a table that stores some attributes of each product:
zd_products
----------
|ID|title|
----------
| 1| Test|
| 2| Prod|
| 3| Colr|
zd_product_attached_attributes
------------------
|attrid|pid|value|
------------------
|1 | 1 | A |
|2 | 1 | 10 |
|3 | 1 | AB |
|1 | 2 | B |
|2 | 2 | 22 |
|3 | 2 | BB |
|1 | 3 | A |
|2 | 3 | 10 |
|3 | 3 | CC |
I want to find in zd_products only the products that have certain attribute values, for example:
Get the product when the attribute 1 is A and the attribute 3 is AB
Get the product when the attribute 2 is 10 and the attribute 3 is CC
etc
How can I do this using a join?
Oh, the Joys of the EAV model!
One way is to use a separate JOIN operation for each attribute value. For example:
SELECT p.id
, p.title
FROM zd_products p
JOIN zd_product_attached_attributes a1
ON a1.pid = p.id
AND a1.attrid = 1
AND a1.value = 'A'
JOIN zd_product_attached_attributes a3
ON a3.pid = p.id
AND a3.attrid = 3
AND a3.value = 'AB'
With appropriate indexes, that's likely going to be the most efficient approach. This isn't the only query that will return the specified result, but this one does make use of JOIN operations.
Another, less intuitive approach
If id is unique in the zd_products table, and we have guarantee that the (attrid,pid,value) tuple is unique in the zd_product_attached_attributes table, then this:
SELECT p.id
, p.title
FROM zd_products p
JOIN zd_product_attached_attributes a
ON a.pid = p.id
AND ( (a.attrid = 1 AND a.value = 'A')
OR (a.attrid = 3 AND a.value = 'AB')
)
GROUP
BY p.id
, p.title
HAVING COUNT(1) > 1
will return an equivalent result. The latter query is of a form that is particularly suitable for matching two criteria out of three, where we don't need a match on ALL of the attributes, but just some of them. For example, finding a product that matches any two of:
color = 'yellow'
size = 'bigger'
special = 'on fire'
And of course there are other approaches that don't make use of a JOIN.
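The GROUP BY / HAVING form lends itself to being generated from a list of criteria. The following is a rough sketch of that idea using Python's built-in sqlite3 module rather than MySQL; the helper name `find_products` and its `min_matches` parameter are hypothetical, and it assumes each (pid, attrid) pair occurs at most once.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE zd_products (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE zd_product_attached_attributes (attrid INTEGER, pid INTEGER, value TEXT);
    INSERT INTO zd_products VALUES (1, 'Test'), (2, 'Prod'), (3, 'Colr');
    INSERT INTO zd_product_attached_attributes VALUES
        (1, 1, 'A'), (2, 1, '10'), (3, 1, 'AB'),
        (1, 2, 'B'), (2, 2, '22'), (3, 2, 'BB'),
        (1, 3, 'A'), (2, 3, '10'), (3, 3, 'CC');
""")

def find_products(conn, criteria, min_matches=None):
    """Return (id, title) rows matching at least min_matches of the
    (attrid, value) pairs in criteria; by default all pairs must match."""
    if not criteria:
        return []
    if min_matches is None:
        min_matches = len(criteria)
    # One "(attrid = ? AND value = ?)" term per criterion; values are bound
    # as parameters, only the placeholders are put into the SQL text.
    condition = " OR ".join(["(a.attrid = ? AND a.value = ?)"] * len(criteria))
    params = [x for pair in criteria for x in pair]
    sql = f"""
        SELECT p.id, p.title
        FROM zd_products p
        JOIN zd_product_attached_attributes a ON a.pid = p.id
        WHERE {condition}
        GROUP BY p.id, p.title
        HAVING COUNT(*) >= ?
        ORDER BY p.id
    """
    return conn.execute(sql, params + [min_matches]).fetchall()

print(find_products(conn, [(1, 'A'), (3, 'AB')]))
print(find_products(conn, [(1, 'A'), (3, 'AB')], min_matches=1))
```

Passing `min_matches=1` gives the "any of the criteria" behaviour, while the default requires every criterion to match.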
FOLLOWUP
Q: And what if I want to do the same but using the OR operator? I mean get the record ONLY if attribute 1 is A or attribute 2 is AB, otherwise don't select it.
A: A query of the form like the second one in my answer (above) is more conducive to the OR condition.
If you want XOR (exclusive OR), where one of the attributes has a matching value but the other one doesn't, just change the HAVING COUNT(1) > 1 to HAVING COUNT(1) = 1. Only rows from products that find one "matching" row in the attributes table will be returned. To match exactly 2 (out of several), HAVING COUNT(1) = 2, etc.
A query like the first one in my answer can be modified to use OUTER joins, to find matches, and then do a conditional test in the WHERE clause, to determine if a match was found.
SELECT p.id
, p.title
FROM zd_products p
LEFT
JOIN zd_product_attached_attributes a1
ON a1.pid = p.id
AND a1.attrid = 1
AND a1.value = 'A'
LEFT
JOIN zd_product_attached_attributes a3
ON a3.pid = p.id
AND a3.attrid = 3
AND a3.value = 'AB'
WHERE a1.pid IS NOT NULL
OR a3.pid IS NOT NULL
I've just added the LEFT keyword, to specify an outer join; rows from products will be returned with matching rows from a1 and a3, along with rows from products that don't have any matching rows found in a1 or a3.
The WHERE clause tests a column from a1 and a3 to see whether a matching row was returned. If a matching row was found in a1, we are guaranteed that the pid column from a1 will be non-NULL. That column will be returned as NULL only if a matching row was not found.
If we replaced the OR with an AND, we'd be negating the "outerness" of both joins, making it essentially equivalent to the first query above.
To get an XOR type operation (exclusive OR) where we find one matching attribute but not the other, we could change the WHERE clause to read:
WHERE (a1.pid IS NOT NULL AND a3.pid IS NULL)
OR (a3.pid IS NOT NULL AND a1.pid IS NULL)
Use a pivot
You can do this type of query using a pivot. As far as I know, MySQL doesn't have a native, built in pivot, but you can achieve this by transposing the rows and columns of your zd_product_attached_attributes table using:
SELECT pid,
MAX(CASE WHEN attrid = 1 THEN value END) `attrid_1`,
MAX(CASE WHEN attrid = 2 THEN value END) `attrid_2`,
MAX(CASE WHEN attrid = 3 THEN value END) `attrid_3`
FROM zd_product_attached_attributes
GROUP BY pid
This will pivot your table as shown:
+--------+-----+-------+         +-----+----------+----------+----------+
| attrid | pid | value |         | pid | attrid_1 | attrid_2 | attrid_3 |
+--------+-----+-------+         +-----+----------+----------+----------+
| 1      | 1   | A     |         | 1   | A        | 10       | AB       |
| 2      | 1   | 10    |   =>    | 2   | B        | 22       | BB       |
| 3      | 1   | AB    |         | 3   | A        | 10       | CC       |
| 1      | 2   | B     |         +-----+----------+----------+----------+
| 2      | 2   | 22    |
| 3      | 2   | BB    |
| 1      | 3   | A     |
| 2      | 3   | 10    |
| 3      | 3   | CC    |
+--------+-----+-------+
So you can select the products id and title using:
SELECT id, title FROM zd_products
LEFT JOIN
(
SELECT pid,
MAX(CASE WHEN attrid = 1 THEN value END) `attrid_1`,
MAX(CASE WHEN attrid = 2 THEN value END) `attrid_2`,
MAX(CASE WHEN attrid = 3 THEN value END) `attrid_3`
FROM zd_product_attached_attributes
GROUP BY pid
) AS attrib_search
ON id = pid
WHERE ( attrid_1 = 'A' AND attrid_3 = 'AB' )
OR ( attrid_2 = '10' AND attrid_3 = 'CC' )
Note: You can use this type of query when you have guaranteed uniqueness on (pid, attrid)
(thanks @spencer7593)
I haven't tested this, but I think it should work (it returns a product when either condition matches):
select title
from zd_products p
join zd_product_attached_attributes a ON a.pid = p.id
where ( attrid = 1 and value = 'A' )
or ( attrid = 3 and value = 'AB' );
If you want to tack on more "searches", you can append more lines similar to the last one (i.e. more "or" conditions).

nested query & transaction

Update #1: query gives me syntax error on Left Join line (running the query within the left join independently works perfectly though)
SELECT b1.company_id, ((sum(b1.credit)-sum(b1.debit)) as 'Balance'
FROM MyTable b1
JOIN CustomerInfoTable c on c.id = b1.company_id
#Filter for Clients of particular brand, package and active status
where c.brand_id = 2 and c.status = 2 and c.package_id = 3
LEFT JOIN
(
SELECT b2.company_id, sum(b2.debit) as 'Current_Usage'
FROM MyTable b2
WHERE year(b2.timestamp) = '2012' and month(b2.timestamp) = '06'
GROUP BY b2.company_id
)
b3 on b3.company_id = b1.company_id
group by b1.company_id;
Original Post:
I keep track of debits and credits in the same table. The table has the following schema:
| company_id | timestamp | credit | debit |
| 10 | MAY-25 | 100 | 000 |
| 11 | MAY-25 | 000 | 054 |
| 10 | MAY-28 | 000 | 040 |
| 12 | JUN-01 | 100 | 000 |
| 10 | JUN-25 | 150 | 000 |
| 10 | JUN-25 | 000 | 025 |
As my result, I want to see:
| Grouped by: company_id | Balance* | Current_Usage (in June) |
| 10 | 185 | 25 |
| 12 | 100 | 0 |
| 11 | -54 | 0 |
Balance: Calculated by (sum(credit) - sum(debits))* - timestamp does not matter
Current_Usage: Calculated by sum(debits) - but only for debits in JUN.
The problem: If I filter by JUN timestamp right away, it does not calculate the balance of all time but only the balance of any transactions in June.
How can I calculate the current usage by month but the balance over all transactions in the table? I have everything working, except that the JUN filter in my code applies to the balance as well as to the current usage calculation:
SELECT b.company_id, ((sum(b.credit)-sum(b.debit))/1024/1024/1024/1024) as 'BW_remaining', sum(b.debit/1024/1024/1024/1024/28*30) as 'Usage_per_month'
FROM mytable b
#How to filter this only for the current_usage calculation?
WHERE month(a.timestamp) = 'JUN' and a.credit = 0
#Group by company in order to sum all entries for balance
group by b.company_id
order by b.balance desc;
What you need here is a join with a subquery that filters by month.
SELECT T1.company_id,
((sum(T1.credit)-sum(T1.debit))/1024/1024/1024/1024) as 'BW_remaining',
MAX(T3.DEBIT_PER_MONTH)
FROM MYTABLE T1
LEFT JOIN
(
SELECT T2.company_id, SUM(T2.debit) AS DEBIT_PER_MONTH
FROM MYTABLE T2
WHERE month(T2.timestamp) = 6
GROUP BY T2.company_id
)
T3 ON T1.company_id = T3.company_id
GROUP BY T1.company_id
I haven't tested the query. The point I am trying to make is how you can join your existing query with a subquery to get the usage per month.
Alright, thanks to @Kshitij I got it working. In case somebody else runs into the same issue, this is how I solved it:
SELECT b1.company_id, (sum(b1.credit)-sum(b1.debit)) as 'Balance',
(
SELECT sum(b2.debit)
FROM MYTABLE b2
WHERE b2.company_id = b1.company_id and year(b2.timestamp) = '2012' and month(b2.timestamp) = '06'
GROUP BY b2.company_id
) AS 'Usage_June'
FROM MYTABLE b1
#Group by company in order to add sum of all zones the company is using
group by b1.company_id
order by Usage_June desc;
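Another option is conditional aggregation, which reads the table once and applies the month filter only inside the usage sum. This is a minimal sketch with Python's built-in sqlite3 module instead of MySQL; the table name `ledger`, the column name `ts` and the ISO date strings are assumptions made for the illustration, with the year 2012 taken from the query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ledger (company_id INTEGER, ts TEXT, credit INTEGER, debit INTEGER);
    INSERT INTO ledger VALUES
        (10, '2012-05-25', 100, 0),
        (11, '2012-05-25', 0, 54),
        (10, '2012-05-28', 0, 40),
        (12, '2012-06-01', 100, 0),
        (10, '2012-06-25', 150, 0),
        (10, '2012-06-25', 0, 25);
""")

# The balance sums every row; the CASE restricts only the usage column
# to June, so no WHERE clause is needed.
rows = conn.execute("""
    SELECT company_id,
           SUM(credit) - SUM(debit) AS balance,
           SUM(CASE WHEN ts >= '2012-06-01' AND ts < '2012-07-01'
                    THEN debit ELSE 0 END) AS usage_june
    FROM ledger
    GROUP BY company_id
    ORDER BY company_id
""").fetchall()
for row in rows:
    print(row)
```

The half-open date range is used here instead of YEAR() and MONTH() so the same condition works unchanged in both SQLite and MySQL.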

MySql: Multiple Left Join giving wrong output

I'm having a little trouble using multiple left joins in a query. Some of the tables have a one-to-one relationship with the left table and some have a one-to-many relationship. The query looks like this:
Select
files.filename,
coalesce(count(distinct case
when dm_data.weather like '%clear%' then 1
end),
0) as clear,
coalesce(count(distinct case
when dm_data.weather like '%lightRain%' then 1
end),
0) as lightRain,
coalesce(count(case
when kc_data.type like '%bicycle%' then 1
end),
0) as bicycle,
coalesce(count(case
when kc_data.type like '%bus%' then 1
end),
0) as bus,
coalesce(count(case
when kpo_data.movement like '%walking%' then 1
end),
0) as walking,
coalesce(count(case
when kpo_data.type like '%pedestrian%' then 1
end),
0) as pedestrian
from
files
left join
dm_data ON dm_data.id = files.id
left join
kc_data ON kc_data.id = files.id
left join
kpo_data ON kpo_data.id = files.id
where
files.filename in (X, Y, Z, ........)
group by files.filename;
Here, the dm_data table has a one-to-one relationship with the 'files' table (that's why I'm using 'Distinct'), whereas kc_data and kpo_data have a one-to-many relationship with the 'files' table (kc_data and kpo_data can have 10 to 20 rows for one files.id). This query works fine.
The problem arises when I add another left join with another one-to-many table, pd_markings (which can have hundreds of rows for one files.id).
Select
files.filename,
coalesce(count(distinct case
when dm_data.weather like '%clear%' then 1
end),
0) as clear,
coalesce(count(distinct case
when dm_data.weather like '%lightRain%' then 1
end),
0) as lightRain,
coalesce(count(case
when kc_data.type like '%bicycle%' then 1
end),
0) as bicycle,
coalesce(count(case
when kc_data.type like '%bus%' then 1
end),
0) as bus,
coalesce(count(case
when kpo_data.movement like '%walking%' then 1
end),
0) as walking,
coalesce(count(case
when kpo_data.type like '%pedestrian%' then 1
end),
0) as pedestrian,
coalesce(count(case
when pd_markings.movement like '%walking%' then 1
end),
0) as walking -- newly added column
from
files
left join
dm_data ON dm_data.id = files.id
left join
kc_data ON kc_data.id = files.id
left join
kpo_data ON kpo_data.id = files.id
left join
pd_markings ON pd_markings.id = files.id -- newly added join
where
files.filename in (X, Y, Z, ........)
group by files.filename;
Now all the values become multiples of each other. Any ideas?
Note that the first two columns return 1 or 0. That is actually the desired result, as one-to-one tables will only have either 1 or 0 rows for any files.id, so if I don't use 'Distinct' the resulting value is wrong (I guess because of the other tables, which return more than one row for the same files.id). Unfortunately, my tables don't have their own unique ID columns, except for the 'files' table.
You need to flatten the results of your query in order to obtain the right counts.
You said you have one-to-many relationships from your files table to other tables.
If SQL had a LOOKUP keyword instead of cramming everything into JOIN, it would be easy to tell whether the relation between table A and table B is one-to-one; JOIN by itself suggests one-to-many. But I digress. Anyway, I infer that files is one-to-many against dm_data, and that files against kc_data is one-to-many too. LEFT JOIN is another hint that the relationship between the first and the second table is one-to-many, although it is not definitive, since some coders write every join as a LEFT JOIN. There is nothing wrong with the LEFT JOINs in your query as such, but when a query chains several one-to-many tables, the rows of each table are repeated against the rows of the others and the counts will be wrong.
from
files
left join
dm_data ON dm_data.id = files.id
left join
kc_data ON kc_data.id = files.id
So, given that files is one-to-many against dm_data and also one-to-many against kc_data, we can conclude that chaining those joins and grouping them in one monolithic query is what goes wrong.
As an example, suppose you have three tables, app (files), ios_app (dm_data) and android_app (kc_data), and this is the data for ios:
test=# select * from ios_app order by app_code, date_released;
ios_app_id | app_code | date_released | price
------------+----------+---------------+--------
1 | AB | 2010-01-01 | 1.0000
3 | AB | 2010-01-03 | 3.0000
4 | AB | 2010-01-04 | 4.0000
2 | TR | 2010-01-02 | 2.0000
5 | TR | 2010-01-05 | 5.0000
(5 rows)
And this is the data for your android:
test=# select * from android_app order by app_code, date_released;
android_app_id | app_code | date_released | price
----------------+----------+---------------+---------
1 | AB | 2010-01-06 | 6.0000
2 | AB | 2010-01-07 | 7.0000
7 | MK | 2010-01-07 | 7.0000
3 | TR | 2010-01-08 | 8.0000
4 | TR | 2010-01-09 | 9.0000
5 | TR | 2010-01-10 | 10.0000
6 | TR | 2010-01-11 | 11.0000
(7 rows)
If you merely use this query:
select x.app_code,
count(i.date_released) as ios_release_count,
count(a.date_released) as android_release_count
from app x
left join ios_app i on i.app_code = x.app_code
left join android_app a on a.app_code = x.app_code
group by x.app_code
order by x.app_code
The output will be wrong:
app_code | ios_release_count | android_release_count
----------+-------------------+-----------------------
AB | 6 | 6
MK | 0 | 1
PM | 0 | 0
TR | 8 | 8
(4 rows)
You can think of chained joins as a Cartesian product: if there are 3 matching rows in the first table and 2 in the second, the output will have 6 rows.
Here's the visualization. There are 2 repeating android AB rows for every ios AB row, and there are 3 ios AB rows, so what is the count when you do COUNT(ios_app.date_released)? It becomes 6; the same goes for COUNT(android_app.date_released), which is also 6. Likewise there are 4 repeating android TR rows for every ios TR row and 2 TR rows in ios, which gives a count of 8.
app_code | ios_release_date | android_release_date
----------+------------------+----------------------
AB | 2010-01-01 | 2010-01-06
AB | 2010-01-01 | 2010-01-07
AB | 2010-01-03 | 2010-01-06
AB | 2010-01-03 | 2010-01-07
AB | 2010-01-04 | 2010-01-06
AB | 2010-01-04 | 2010-01-07
MK | | 2010-01-07
PM | |
TR | 2010-01-02 | 2010-01-08
TR | 2010-01-02 | 2010-01-09
TR | 2010-01-02 | 2010-01-10
TR | 2010-01-02 | 2010-01-11
TR | 2010-01-05 | 2010-01-08
TR | 2010-01-05 | 2010-01-09
TR | 2010-01-05 | 2010-01-10
TR | 2010-01-05 | 2010-01-11
(16 rows)
So what you should do is flatten each result before you join it to the other tables and queries.
If your database supports CTEs, please use them. They are very neat and self-documenting:
with ios_app_release_count_list as
(
select app_code, count(date_released) as ios_release_count
from ios_app
group by app_code
)
,android_release_count_list as
(
select app_code, count(date_released) as android_release_count
from android_app
group by app_code
)
select
x.app_code,
coalesce(i.ios_release_count,0) as ios_release_count,
coalesce(a.android_release_count,0) as android_release_count
from app x
left join ios_app_release_count_list i on i.app_code = x.app_code
left join android_release_count_list a on a.app_code = x.app_code
order by x.app_code;
Whereas if your database has no CTE support, like MySQL before version 8.0, you should do this instead:
select x.app_code,
coalesce(i.ios_release_count,0) as ios_release_count,
coalesce(a.android_release_count,0) as android_release_count
from app x
left join
(
select app_code, count(date_released) as ios_release_count
from ios_app
group by app_code
) i on i.app_code = x.app_code
left join
(
select app_code, count(date_released) as android_release_count
from android_app
group by app_code
) a on a.app_code = x.app_code
order by x.app_code
That query and the CTE-style query will show the correct output:
app_code | ios_release_count | android_release_count
----------+-------------------+-----------------------
AB | 3 | 2
MK | 0 | 1
PM | 0 | 0
TR | 2 | 4
(4 rows)
Live test
Incorrect query: http://www.sqlfiddle.com/#!2/9774a/2
Correct query: http://www.sqlfiddle.com/#!2/9774a/1
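The difference between the two queries can also be reproduced locally. The sketch below uses Python's built-in sqlite3 module rather than MySQL or PostgreSQL and loads the sample rows shown above; the variable names `naive` and `flattened` are just labels for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE app (app_code TEXT PRIMARY KEY);
    CREATE TABLE ios_app (app_code TEXT, date_released TEXT);
    CREATE TABLE android_app (app_code TEXT, date_released TEXT);
    INSERT INTO app VALUES ('AB'), ('MK'), ('PM'), ('TR');
    INSERT INTO ios_app VALUES
        ('AB', '2010-01-01'), ('AB', '2010-01-03'), ('AB', '2010-01-04'),
        ('TR', '2010-01-02'), ('TR', '2010-01-05');
    INSERT INTO android_app VALUES
        ('AB', '2010-01-06'), ('AB', '2010-01-07'), ('MK', '2010-01-07'),
        ('TR', '2010-01-08'), ('TR', '2010-01-09'),
        ('TR', '2010-01-10'), ('TR', '2010-01-11');
""")

# Chained joins: each ios row is repeated for every android row and vice versa.
naive = conn.execute("""
    SELECT x.app_code, COUNT(i.date_released), COUNT(a.date_released)
    FROM app x
    LEFT JOIN ios_app i ON i.app_code = x.app_code
    LEFT JOIN android_app a ON a.app_code = x.app_code
    GROUP BY x.app_code ORDER BY x.app_code
""").fetchall()

# Flattened: aggregate each child table to one row per app_code, then join.
flattened = conn.execute("""
    SELECT x.app_code, COALESCE(i.n, 0), COALESCE(a.n, 0)
    FROM app x
    LEFT JOIN (SELECT app_code, COUNT(date_released) AS n
               FROM ios_app GROUP BY app_code) i ON i.app_code = x.app_code
    LEFT JOIN (SELECT app_code, COUNT(date_released) AS n
               FROM android_app GROUP BY app_code) a ON a.app_code = x.app_code
    ORDER BY x.app_code
""").fetchall()

print(naive)
print(flattened)
```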
I question your use of DISTINCT here. The way it is written, the CASE expression returns either 1 or NULL, which means a COUNT(DISTINCT ...) over it will only ever return 0 or 1.
I assume you have unique ID columns in each of your tables. You can change the CASE to return the ID value and then count the distinct IDs. If your join returns the same pd_markings row multiple times, a distinct count on the ID will return only the number of distinct rows.
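As a rough illustration of the distinct-ID idea, the sketch below reuses the app / ios_app / android_app sample data from the previous answer, since those tables do have ID columns. It runs on Python's built-in sqlite3 module rather than MySQL, and it only works when each child table has a unique ID, which the question says is not the case for the original tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE app (app_code TEXT PRIMARY KEY);
    CREATE TABLE ios_app (ios_app_id INTEGER PRIMARY KEY, app_code TEXT);
    CREATE TABLE android_app (android_app_id INTEGER PRIMARY KEY, app_code TEXT);
    INSERT INTO app VALUES ('AB'), ('MK'), ('PM'), ('TR');
    INSERT INTO ios_app VALUES (1, 'AB'), (3, 'AB'), (4, 'AB'), (2, 'TR'), (5, 'TR');
    INSERT INTO android_app VALUES
        (1, 'AB'), (2, 'AB'), (7, 'MK'), (3, 'TR'), (4, 'TR'), (5, 'TR'), (6, 'TR');
""")

# The joins still multiply rows, but COUNT(DISTINCT id) collapses the
# repeats back to one per child row.
rows = conn.execute("""
    SELECT x.app_code,
           COUNT(DISTINCT i.ios_app_id) AS ios_release_count,
           COUNT(DISTINCT a.android_app_id) AS android_release_count
    FROM app x
    LEFT JOIN ios_app i ON i.app_code = x.app_code
    LEFT JOIN android_app a ON a.app_code = x.app_code
    GROUP BY x.app_code
    ORDER BY x.app_code
""").fetchall()
for row in rows:
    print(row)
```

To count only rows that meet a condition, the ID can be wrapped in a CASE expression, for example COUNT(DISTINCT CASE WHEN ... THEN id END).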