Consolidate SQL query with 2 subselects with same where clause - mysql

My app uses a scores table with locationId, scoreDateTime, score, and comments columns. Users can score a location and optionally submit comments. A small data set might look like the following:
mysql> select locationId, scoreDateTime, score, comments from scores;
+-----------------------------+-------------------------+-------+--------------------------------+
| locationId | scoreDateTime | score | comments |
+-----------------------------+-------------------------+-------+--------------------------------+
| ChIJqZyf8O8F44kRbNWHQkDkpGQ | 2016-04-17 17:30:32.899 | 3 | asdfasf |
| ChIJqZyf8O8F44kRbNWHQkDkpGQ | 2016-04-17 18:28:46.221 | 3 | |
| ChIJqZyf8O8F44kRbNWHQkDkpGQ | 2016-04-17 18:29:56.395 | 3 | safasf |
| ChIJqZyf8O8F44kRbNWHQkDkpGQ | 2016-04-17 18:32:10.358 | 3 | |
| ChIJqZyf8O8F44kRbNWHQkDkpGQ | 2016-04-17 18:49:32.262 | 3 | |
| ChIJqZyf8O8F44kRbNWHQkDkpGQ | 2016-04-17 18:50:33.693 | 3 | |
| ChIJqZyf8O8F44kRbNWHQkDkpGQ | 2016-04-17 19:13:58.456 | 3 | |
| ChIJqZyf8O8F44kRbNWHQkDkpGQ | 2016-04-17 19:28:10.435 | 3 | asdfasf |
| ChIJqZyf8O8F44kRhatfHL4GYe0 | 2016-04-17 23:20:28.857 | 3 | aasdfasfsfsd |
| ChIJqZyf8O8F44kRhatfHL4GYe0 | 2016-04-17 23:22:55.254 | 3 | asdfasfasfsafasfsfasf asdfasfd |
| ChIJqZyf8O8F44kRhatfHL4GYe0 | 2016-04-17 23:40:37.106 | 3 | |
| ChIJpbSR1a4I44kRemEzTpniis8 | 2016-04-19 11:17:41.836 | 5 | adfgadf |
| ChIJF1LAoqgI44kR5EWvRqJPUN4 | 2016-04-19 11:17:52.536 | 4 | |
+-----------------------------+-------------------------+-------+--------------------------------+
I'd like to build a single query that will get the following for each location:
a score count from the last X hours
a comment count from the last Y days
the latest scoreDateTime (or NULL) for any comments in the last Y days
My motivation is to show locations, their recent score counts, their historical comment counts, and their latest comment datetime (or null). This will give me the recent running score counts and the hotness of the comment trail.
The following query works. However, the duplicated locationId list is going to be much longer in production. QUESTION: Is there a performant way to consolidate the two locationId lists, i.e. the repeated 'locationId in (...)' clauses?
select
x.locationId, count1, count2, count3, count4, count5, IFNULL(commentCount,0) as commentCount, lastCommentDateTime
from
( select
locationId,
sum(if (score = 1, 1, 0)) count1,
sum(if (score = 2, 1, 0)) count2,
sum(if (score = 3, 1, 0)) count3,
sum(if (score = 4, 1, 0)) count4,
sum(if (score = 5, 1, 0)) count5
from
scores
where
scoreDateTime > '2016-04-16 21:38:51.843' and
locationId in (
'ChIJqZyf8O8F44kRbNWHQkDkpGQ',
'ChIJqZyf8O8F44kRhatfHL4GYe0',
'ChIJCes00a4I44kRKG8zB4KvYTM',
'ChIJP-eRLq8I44kRKU6VOpTXqTM',
'ChIJpbSR1a4I44kRemEzTpniis8',
'ChIJF1LAoqgI44kRip2l7rjO2g4',
'ChIJF1LAoqgI44kR5EWvRqJPUN4',
'ChIJF1LAoqgI44kRRD_ZvPUmrGA',
'ChIJjweq4h0G44kRWoCPQKPdrPM',
'ChIJf2tVDB4G44kRTYjhl3sjm8M',
'ChIJ_Vg4giEG44kRq2nvtjEn8yA',
'ChIJP00qFSMG44kRyKcy2f_S12o'
)
group by locationId
) as x
left join
( select
locationId,
count(comments) as commentCount,
max(scoreDateTime) as lastCommentDateTime
from
scores
where
comments != "" and
scoreDateTime > '2016-01-16 00:00:00.000' and
locationId in (
'ChIJqZyf8O8F44kRbNWHQkDkpGQ',
'ChIJqZyf8O8F44kRhatfHL4GYe0',
'ChIJCes00a4I44kRKG8zB4KvYTM',
'ChIJP-eRLq8I44kRKU6VOpTXqTM',
'ChIJpbSR1a4I44kRemEzTpniis8',
'ChIJF1LAoqgI44kRip2l7rjO2g4',
'ChIJF1LAoqgI44kR5EWvRqJPUN4',
'ChIJF1LAoqgI44kRRD_ZvPUmrGA',
'ChIJjweq4h0G44kRWoCPQKPdrPM',
'ChIJf2tVDB4G44kRTYjhl3sjm8M',
'ChIJ_Vg4giEG44kRq2nvtjEn8yA',
'ChIJP00qFSMG44kRyKcy2f_S12o'
)
group by locationId
) as y
on x.locationId = y.locationId;
The results look like the following:
mysql> source ../../query3.sql
+-----------------------------+--------+--------+--------+--------+--------+--------------+-------------------------+
| locationId | count1 | count2 | count3 | count4 | count5 | commentCount | lastCommentDateTime |
+-----------------------------+--------+--------+--------+--------+--------+--------------+-------------------------+
| ChIJF1LAoqgI44kR5EWvRqJPUN4 | 0 | 0 | 0 | 1 | 0 | 0 | NULL |
| ChIJpbSR1a4I44kRemEzTpniis8 | 0 | 0 | 0 | 0 | 1 | 1 | 2016-04-19 11:17:41.836 |
| ChIJqZyf8O8F44kRbNWHQkDkpGQ | 0 | 0 | 8 | 0 | 0 | 3 | 2016-04-17 19:28:10.435 |
| ChIJqZyf8O8F44kRhatfHL4GYe0 | 0 | 0 | 3 | 0 | 0 | 2 | 2016-04-17 23:22:55.254 |
+-----------------------------+--------+--------+--------+--------+--------+--------------+-------------------------+

It looks like the only differences between your two queries are the scoreDateTime and comments criteria. One way to combine the queries is to move these conditions into your select using conditional aggregation.
Also, MySQL evaluates booleans to 1 or 0, so you can simplify your sum calls by removing the if statements.
select
locationId,
sum(score = 1 and scoreDateTime > '2016-04-16 21:38:51.843') count1,
sum(score = 2 and scoreDateTime > '2016-04-16 21:38:51.843') count2,
sum(score = 3 and scoreDateTime > '2016-04-16 21:38:51.843') count3,
sum(score = 4 and scoreDateTime > '2016-04-16 21:38:51.843') count4,
sum(score = 5 and scoreDateTime > '2016-04-16 21:38:51.843') count5,
sum(comments != "") commentCount,
max(case when comments != "" then scoreDateTime end) as lastCommentDateTime
from
scores
where
scoreDateTime > '2016-01-16 00:00:00.000' and
locationId in (
'ChIJqZyf8O8F44kRbNWHQkDkpGQ',
'ChIJqZyf8O8F44kRhatfHL4GYe0',
'ChIJCes00a4I44kRKG8zB4KvYTM',
'ChIJP-eRLq8I44kRKU6VOpTXqTM',
'ChIJpbSR1a4I44kRemEzTpniis8',
'ChIJF1LAoqgI44kRip2l7rjO2g4',
'ChIJF1LAoqgI44kR5EWvRqJPUN4',
'ChIJF1LAoqgI44kRRD_ZvPUmrGA',
'ChIJjweq4h0G44kRWoCPQKPdrPM',
'ChIJf2tVDB4G44kRTYjhl3sjm8M',
'ChIJ_Vg4giEG44kRq2nvtjEn8yA',
'ChIJP00qFSMG44kRyKcy2f_S12o'
)
group by locationId
This query can take advantage of a composite index on (locationId, scoreDateTime).
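To see the conditional aggregation in action, here is a minimal sketch that runs the same query shape against an in-memory SQLite database from Python. SQLite stands in for MySQL here (both evaluate comparisons to 1 or 0), and the short location ids 'A' and 'B', the sample rows, and the cutoff variable names are made up for illustration.

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL scores table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE scores (locationId TEXT, scoreDateTime TEXT, score INTEGER, comments TEXT)"
)
# Hypothetical sample rows: location ids are shortened to 'A' and 'B'.
conn.executemany(
    "INSERT INTO scores VALUES (?, ?, ?, ?)",
    [
        ("A", "2016-04-17 17:30:32.899", 3, "first"),
        ("A", "2016-04-17 18:28:46.221", 3, ""),
        ("A", "2016-02-01 09:00:00.000", 5, "older comment"),
        ("B", "2016-04-19 11:17:52.536", 4, ""),
    ],
)

score_cutoff = "2016-04-16 21:38:51.843"    # "last X hours" window
comment_cutoff = "2016-01-16 00:00:00.000"  # "last Y days" window

# The WHERE clause applies the wider (comment) window once; the narrower
# score window moves inside each SUM as a boolean condition.
query = """
SELECT locationId,
       SUM(score = 3 AND scoreDateTime > :score_cutoff) AS count3,
       SUM(score = 4 AND scoreDateTime > :score_cutoff) AS count4,
       SUM(score = 5 AND scoreDateTime > :score_cutoff) AS count5,
       SUM(comments != '') AS commentCount,
       MAX(CASE WHEN comments != '' THEN scoreDateTime END) AS lastCommentDateTime
FROM scores
WHERE scoreDateTime > :comment_cutoff
  AND locationId IN ('A', 'B')
GROUP BY locationId
ORDER BY locationId
"""
result = conn.execute(
    query, {"score_cutoff": score_cutoff, "comment_cutoff": comment_cutoff}
).fetchall()

for row in result:
    print(row)
```

The older score of 5 for location 'A' falls outside the score window, so it does not add to count5, but its comment is still counted. One difference from the original two-subquery version: a location whose only rows lie between the two cutoffs would still appear here, with all score counts at 0.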

Related

How to count in a range of result in mysql

I have a record table, and I need to compute statistics for every month.
Here is a test table:
+----+------+----------+----------+------+
| id | name | grade1   | grade2   | time |
+----+------+----------+----------+------+
|  1 | a    |        1 |        1 |    1 |
|  2 | a    |        0 |        1 |    1 |
|  3 | a    |        1 |        2 |    2 |
|  4 | b    |        1 |        2 |    2 |
|  5 | a    |        1 |        1 |    2 |
+----+------+----------+----------+------+
5 rows in set (0.01 sec)
The time column represents the month (in the real table it is a timestamp).
I need the total number of rows with grade1 >= 1 and with grade2 >= 1 in every month.
So I want to get a result like this:
+----+------+----------+----------+----------+----------+------+
| id | name | grade1_m1| grade2_m1| grade1_m2| grade2_m2| time |
+----+------+----------+----------+----------+----------+------+
| 13 | a    |        1 |        2 |     null |     null |    1 |
| 14 | a    |     null |     null |        2 |        2 |    2 |
| 15 | b    |     null |     null |        1 |        1 |    2 |
+----+------+----------+----------+----------+----------+------+
3 rows in set (0.00 sec)
Pseudo-SQL for what I want would look like this:
select
count(grade1 where time=1 and grade1 >= 1) as grade1_m1,
count(grade2 where time=1 and grade2 >= 1) as grade2_m1,
count(grade1 where time=2 and grade1 >= 1) as grade1_m2,
count(grade2 where time=2 and grade2 >= 1) as grade2_m2,
-- ... 12 months' statistics
from test
group by name
In fact, I got it working, but with temporary tables, as follows:
select
count(if(m1.grade1>=1, 1, null)) as grade1_m1,
count(if(m1.grade2>=1, 1, null)) as grade2_m1,
count(if(m2.grade1>=1, 1, null)) as grade1_m2,
count(if(m2.grade2>=1, 1, null)) as grade2_m2,
-- ...
from test
left join
(select * from test where time = 1) as m1
on m1.id = test.id
left join
(select * from test where time = 2) as m2
on m2.id = test.id
-- ...
group by name
But this SQL is far too long. This test table is just a simplified version; in the real situation, I printed my SQL and it took up two screens in Chrome. So I am looking for a simpler way to do it.
Your original version is almost there. You need case, and sum() is more appropriate:
select name,
sum(case when time=1 and grade1 >= 1 then grade1 end) as grade1_m1,
sum(case when time=1 and grade2 >= 1 then grade2 end) as grade2_m1,
sum(case when time=2 and grade1 >= 1 then grade1 end) as grade1_m2,
sum(case when time=2 and grade2 >= 1 then grade2 end) as grade2_m2,
-- ... 12 months' statistics
from test
group by name
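Note that summing the grade adds up grade values, while the question asks for the number of rows with a grade of at least 1. The two differ as soon as a grade is larger than 1. A minimal sketch of the difference, using the test table from the question in an in-memory SQLite database (SQLite is a stand-in for MySQL here; the alias names grade2_m2_count and grade2_m2_sum are made up for the comparison):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE test (id INTEGER, name TEXT, grade1 INTEGER, grade2 INTEGER, time INTEGER)"
)
# The five rows of the test table from the question.
conn.executemany(
    "INSERT INTO test VALUES (?, ?, ?, ?, ?)",
    [
        (1, "a", 1, 1, 1),
        (2, "a", 0, 1, 1),
        (3, "a", 1, 2, 2),
        (4, "b", 1, 2, 2),
        (5, "a", 1, 1, 2),
    ],
)

# COUNT(CASE ... THEN 1 END) counts matching rows;
# SUM(CASE ... THEN grade2 END) adds up the grade values instead.
query = """
SELECT name,
       COUNT(CASE WHEN time = 1 AND grade1 >= 1 THEN 1 END) AS grade1_m1,
       COUNT(CASE WHEN time = 1 AND grade2 >= 1 THEN 1 END) AS grade2_m1,
       COUNT(CASE WHEN time = 2 AND grade1 >= 1 THEN 1 END) AS grade1_m2,
       COUNT(CASE WHEN time = 2 AND grade2 >= 1 THEN 1 END) AS grade2_m2_count,
       SUM(CASE WHEN time = 2 AND grade2 >= 1 THEN grade2 END) AS grade2_m2_sum
FROM test
GROUP BY name
ORDER BY name
"""
result = conn.execute(query).fetchall()

for row in result:
    print(row)
```

For name 'a' in month 2 the count is 2 while the sum is 3, because one of the two grade2 values is 2. The count variant matches the numbers in the desired result, although it returns 0 rather than null for months without matching rows.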

Join two table and count, avoid zero if record is not available in second table

I have following tables products and tests.
select id,pname from products;
+----+---------+
| id | pname |
+----+---------+
| 1 | prd1 |
| 2 | prd2 |
| 3 | prd3 |
| 4 | prd4 |
+----+---------+
select pname,testrunid,testresult,time from tests;
+--------+-----------+------------+-------------+
| pname | testrunid | testresult | time |
+--------+-----------+------------+-------------+
| prd1 | 800 | PASS | 2017-10-02 |
| prd1 | 801 | FAIL | 2017-10-16 |
| prd1 | 802 | PASS | 2017-10-02 |
| prd1 | 803 | NULL | 2017-10-16 |
| prd1 | 804 | PASS | 2017-10-16 |
| prd1 | 805 | PASS | 2017-10-16 |
| prd1 | 806 | PASS | 2017-10-16 |
+--------+-----------+------------+-------------+
I would like to count test results per product and, if there is no result available for a product, just show a zero for it, something like the following table:
+--------+------------+-----------+----------------+---------------+
| pname | total_pass | total_fail| pass_lastweek | fail_lastweek |
+--------+------------+-----------+----------------+---------------+
| prd1 | 5 | 1 | 3 | 1 |
| prd2 | 0 | 0 | 0 | 0 |
| prd3 | 0 | 0 | 0 | 0 |
| prd4 | 0 | 0 | 0 | 0 |
+--------+------------+-----------+----------------+---------------+
I have tried different queries like the following, which only returns a row for one product and is incomplete:
SELECT pname, count(*) as pass_lastweek FROM tests where testresult = 'PASS' AND time
>= '2017-10-11' and pname in (select pname from products) group by pname;
+-------------+---------------+
| pname | pass_lastweek |
+-------------+---------------+
| prd1 | 3 |
+-------------+---------------+
It looks so basic, but I am still unable to write it. Any ideas?
Use conditional aggregation. The COUNT function ignores NULL values and returns 0 when there is nothing to count, so there is no need to handle products without results separately.
select p.pname,
count(case when testresult = 'PASS' then 1 end) as total_pass,
count(case when testresult = 'FAIL' then 1 end) as total_fail,
count(case when testresult = 'PASS' and time >= curdate() - INTERVAL 6 DAY then 1 end) as pass_lastweek,
count(case when testresult = 'FAIL' and time >= curdate() - INTERVAL 6 DAY then 1 end) as fail_lastweek
from products p
left join tests t on t.pname = p.pname
group by p.id, p.pname
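As a quick check of the LEFT JOIN plus COUNT(CASE ...) pattern, here is a minimal sketch that loads the products and tests rows from the question into an in-memory SQLite database (a stand-in for MySQL). To keep the result reproducible, curdate() is replaced by the fixed date '2017-10-11' taken from the question's own attempt; that substitution is an assumption made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (id INTEGER, pname TEXT)")
conn.execute(
    "CREATE TABLE tests (pname TEXT, testrunid INTEGER, testresult TEXT, time TEXT)"
)
conn.executemany(
    "INSERT INTO products VALUES (?, ?)",
    [(1, "prd1"), (2, "prd2"), (3, "prd3"), (4, "prd4")],
)
conn.executemany(
    "INSERT INTO tests VALUES (?, ?, ?, ?)",
    [
        ("prd1", 800, "PASS", "2017-10-02"),
        ("prd1", 801, "FAIL", "2017-10-16"),
        ("prd1", 802, "PASS", "2017-10-02"),
        ("prd1", 803, None, "2017-10-16"),
        ("prd1", 804, "PASS", "2017-10-16"),
        ("prd1", 805, "PASS", "2017-10-16"),
        ("prd1", 806, "PASS", "2017-10-16"),
    ],
)

week_start = "2017-10-11"  # fixed stand-in for the curdate()-based cutoff

# LEFT JOIN keeps products without tests; their joined columns are NULL,
# every CASE yields NULL, and COUNT of only NULLs is 0.
query = """
SELECT p.pname,
       COUNT(CASE WHEN t.testresult = 'PASS' THEN 1 END) AS total_pass,
       COUNT(CASE WHEN t.testresult = 'FAIL' THEN 1 END) AS total_fail,
       COUNT(CASE WHEN t.testresult = 'PASS' AND t.time >= :week_start THEN 1 END) AS pass_lastweek,
       COUNT(CASE WHEN t.testresult = 'FAIL' AND t.time >= :week_start THEN 1 END) AS fail_lastweek
FROM products p
LEFT JOIN tests t ON t.pname = p.pname
GROUP BY p.id, p.pname
ORDER BY p.id
"""
result = conn.execute(query, {"week_start": week_start}).fetchall()

for row in result:
    print(row)
```

The rows produced match the desired table from the question: prd1 gets 5, 1, 3, 1 and the three products without tests get zeros.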
Generally, you need to LEFT JOIN the first table with the second one before you group. The join will give you a row for each product (even if there are no test results to join it to; INNER JOIN would exclude products with no associated tests) + an additional row for each test result (beyond the first). Then you can group them.
SELECT products.*, tests.* FROM products
LEFT JOIN tests ON products.pname = tests.pname
GROUP BY products.id
Also, I would strongly recommend using a product_id column in the tests table, rather than using pname (if a products.pname changes, your whole DB breaks unless you also update the pname field in kind for every test result). The general query would then look like this:
SELECT products.*, tests.* FROM products
LEFT JOIN tests ON products.id = tests.product_id
GROUP BY products.id
I used two queries: the inner one does the conditional count and the outer one changes any null values into 0:
select pname,
case when total_pass is null then 0 else total_pass end as total_pass,
case when total_fail is null then 0 else total_fail end as total_fail,
case when pass_lastweek is null then 0 else pass_lastweek end as pass_lastweek,
case when fail_lastweek is null then 0 else fail_lastweek end as fail_lastweek from (
select products.pname,
count(case when testresult = 'PASS' then 1 end) as total_pass,
count(case when testresult = 'FAIL' then 1 end) as total_fail,
count(case when testresult = 'PASS' and time >= current_date - INTERVAL 7 DAY then 1 end) as pass_lastweek,
count(case when testresult = 'FAIL' and time >= current_date - INTERVAL 7 DAY then 1 end) as fail_lastweek
from products
left join tests on tests.pname = products.pname
group by 1 ) t1

SQL Help: How come the total from this query is different than a summation query?

This query does a group by on lead_source_id:
SELECT ch.lead_source_id,
Count(DISTINCT ch.repurchased_date)
FROM customers_history ch
WHERE ch.repurchased_date >= '2014-04-01'
AND ch.repurchased_date < '2014-05-01'
AND ch.lead_source_id IS NOT NULL
GROUP BY ch.lead_source_id;
And this query totals the records in the table:
SELECT Count(DISTINCT( repurchased_date ))
FROM customers_history
INNER JOIN (SELECT DISTINCT( customer_id ) AS xcid
FROM customers_history
WHERE repurchased_date >= '2014-04-01'
AND repurchased_date < '2014-05-01'
AND lead_source_id IS NOT NULL) AS Temp
ON Temp.xcid = customer_id
WHERE repurchased_date >= '2014-04-01'
AND repurchased_date < '2014-05-01'
AND lead_source_id IS NOT NULL;
On our production data, the totals from Query1 come to 7963, but the second query prints 7905. Why the difference and how can we fix our queries?
Here's our table layout:
+--------+-------------+----------------+---------------------+--------+
| id | customer_id | lead_source_id | repurchased_date | Rating |
+--------+-------------+----------------+---------------------+--------+
| 422923 | 420450 | 4 | 2014-04-14 09:16:48 | Warm |
| 422924 | 420450 | 4 | 2014-04-14 09:16:48 | Cold |
| 422956 | 420450 | 4 | 2014-04-14 09:16:49 | Hot |
| 422933 | 420451 | 37 | 2014-04-14 09:18:41 | Hot |
| 422938 | 420452 | 1 | 2014-04-10 20:50:30 | Hot |
| 422984 | 420452 | 1 | 2014-04-12 20:50:30 | Warm |
| 422940 | 420453 | 47 | 2014-04-14 09:20:27 | Hot |
+--------+-------------+----------------+---------------------+--------+
EDIT
To answer some of the possibilities about nulls:
select count(id) from customers_history where customer_id is null: 0
select count(id) from customers_history where lead_source_id is null: 5103
select count(id) from customers_history where repurchased_date is null: 0
The most obvious conclusion is that some lead_source_ids share values of repurchased_date.
Another possibility is that you have NULL values for customer_id and the second filters these out.
The third possibility is that NULL values of lead_source_id are adding additional values in the first query.
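The first possibility can be illustrated with a minimal sketch in an in-memory SQLite database (a stand-in for MySQL). The first four rows follow the table layout above; the last row, where lead source 37 reuses a timestamp already seen for lead source 4, is hypothetical and added only to show the effect.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers_history (customer_id INTEGER, lead_source_id INTEGER, repurchased_date TEXT)"
)
conn.executemany(
    "INSERT INTO customers_history VALUES (?, ?, ?)",
    [
        (420450, 4, "2014-04-14 09:16:48"),
        (420450, 4, "2014-04-14 09:16:48"),
        (420450, 4, "2014-04-14 09:16:49"),
        (420451, 37, "2014-04-14 09:18:41"),
        (420454, 37, "2014-04-14 09:16:48"),  # hypothetical: timestamp shared with lead source 4
    ],
)

# Distinct dates per lead source: a shared timestamp is counted once in EACH group.
per_group = conn.execute(
    """
    SELECT lead_source_id, COUNT(DISTINCT repurchased_date)
    FROM customers_history
    GROUP BY lead_source_id
    ORDER BY lead_source_id
    """
).fetchall()
sum_of_groups = sum(count for _, count in per_group)

# Distinct dates over the whole table: the shared timestamp is counted once in total.
overall = conn.execute(
    "SELECT COUNT(DISTINCT repurchased_date) FROM customers_history"
).fetchone()[0]

# Counting distinct (lead_source_id, repurchased_date) pairs reproduces the grouped total.
pairs = conn.execute(
    """
    SELECT COUNT(*) FROM (
        SELECT DISTINCT lead_source_id, repurchased_date FROM customers_history
    ) AS d
    """
).fetchone()[0]

print(per_group, sum_of_groups, overall, pairs)
```

Here the per-group counts add up to 4 while the overall distinct count is 3, the same kind of gap as between 7963 and 7905. Whether this is what happens in the production data would still need to be checked there.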

complex MySQL query wrong results

I am trying to build a complex MySQL query, but it's returning wrong results...
SELECT
b.name AS batch_name,
b.id AS batch_id,
COUNT(DISTINCT s.id)
AS total_students,
COALESCE( SUM(s.open_bal), 0 )
AS open_balance,
SUM( COALESCE(i.reg_fee, 0)
+ COALESCE(i.tut_fee, 0)
+ COALESCE(i.other_fee, 0)
) AS gross_fee,
SUM( COALESCE(i.discount, 0) )
AS discount,
COALESCE( SUM(s.open_bal), 0 )
+ SUM( COALESCE(i.reg_fee, 0)
+ COALESCE(i.tut_fee, 0)
+ COALESCE(i.other_fee, 0)
)
- SUM( COALESCE(i.discount, 0) )
AS net_payable,
SUM( COALESCE(r.reg_fee, 0)
+ COALESCE(r.tut_fee, 0)
+ COALESCE(r.other_fee, 0)
) AS net_recieved,
( COALESCE( SUM(s.open_bal), 0 )
+ SUM( COALESCE(i.reg_fee, 0)
+ COALESCE(i.tut_fee, 0)
+ COALESCE(i.other_fee, 0)
)
- SUM( COALESCE(i.discount, 0) )
)
- ( SUM( COALESCE(r.reg_fee, 0)
+ COALESCE(r.tut_fee, 0)
+ COALESCE(r.other_fee, 0)
)
)
AS balance_due
FROM batches b
LEFT JOIN students s ON s.batch = b.id
LEFT JOIN invoices i ON i.student_id = s.id
LEFT JOIN recipts r ON r.student_id = s.id
WHERE s.inactive = 0
GROUP BY b.name, b.id;
Returns following results...
| batch_name | total_students | open_bal | gross_fee | discount | net_payable | net_recieved | due_balance |
+------------+-----------------+----------+-----------+----------+-------------+--------------+-------------+
| MS | 6 | 10000 | 0 | 0 | 10000 | 101000 | -91000 |
+------------+-----------------+----------+-----------+----------+-------------+--------------+-------------+
batches table
| id | name |
+-----+------+
| 9 | Ms |
+-----+------+
Students table
| id | open_bal | batch | inactive |
+-----+----------+-------+----------+
| 44 | -16000 | 9 | 0 |
+-----+----------+-------+----------+
| 182 | 9000 | 9 | 0 |
+-----+----------+-------+----------+
| 184 | -36000 | 9 | 0 |
+-----+----------+-------+----------+
| 185 | 19000 | 9 | 0 |
+-----+----------+-------+----------+
| 186 | 9000 | 9 | 0 |
+-----+----------+-------+----------+
| 187 | 4000 | 9 | 0 |
+-----+----------+-------+----------+
Invoices Table
| id | student_id | reg_fee | tut_fee | other_fee | net_payable | discount |
+------+------------+---------+---------+-----------+-------------+----------+
| | | | | | | |
+------+------------+---------+---------+-----------+-------------+----------+
No invoices are available for above students id.
Recipts table
| id | student_id | reg_fee | tut_fee | other_fee | status |
+------+------------+---------+---------+-----------+------------+
| 8 | 44 | 0 | 0 | 1500 | confirmed |
+------+------------+---------+---------+-----------+------------+
| 277 | 44 | 0 | 50000 | 0 | confirmed |
+------+------------+---------+---------+-----------+------------+
| 26 | 182 | 0 | 0 | 1500 | confirmed |
+------+------------+---------+---------+-----------+------------+
| 424 | 182 | 0 | 15000 | 0 | confirmed |
+------+------------+---------+---------+-----------+------------+
| 468 | 182 | 0 | 15000 | 0 | confirmed |
+------+------------+---------+---------+-----------+------------+
| 36 | 185 | 0 | 0 | 1500 | confirmed |
+------+------------+---------+---------+-----------+------------+
| 697 | 185 | 0 | 15000 | 0 | confirmed |
+------+------------+---------+---------+-----------+------------+
| 66 | 187 | 0 | 0 | 1500 | confirmed |
+------+------------+---------+---------+-----------+------------+
Expected results using above sql query and tables...
| batch_name | total_students | open_bal | gross_fee | discount | net_payable | net_recieved | due_balance |
+------------+-----------------+----------+-----------+----------+-------------+--------------+-------------+
| MS | 6 | -11000 | 0 | 0 | 10000 | 101000 | -112000 |
+------------+-----------------+----------+-----------+----------+-------------+--------------+-------------+
You still haven't provided full information: no batches table, and not even the recipts table. Anyway, I assume we don't care what's in the batches table; let's say it's just the name and id. Your receipts table has multiple rows for the same student. Because of the JOINs, each row from the other tables is repeated once per matching receipt, so you SUM() values several times that should be summed just once, e.g. open_balance. That is probably where the problem is. I'd say you have to move the info that you need from the receipts table into subqueries, but I'm not sure you've shown us the entirety of your DB. Try removing the receipts table from the query and check the results again. If that's it, you should see what to do from there on, or at least be able to give us more info.
EDIT:
The query should be:
SELECT
b.name AS batch_name,
b.id AS batch_id,
COUNT(DISTINCT s.id)
AS total_students,
COALESCE( SUM(s.open_bal), 0 )
AS open_balance,
SUM( COALESCE(i.reg_fee, 0)
+ COALESCE(i.tut_fee, 0)
+ COALESCE(i.other_fee, 0)
) AS gross_fee,
SUM( COALESCE(i.discount, 0) )
AS discount,
COALESCE( SUM(s.open_bal), 0 )
+ SUM( COALESCE(i.reg_fee, 0)
+ COALESCE(i.tut_fee, 0)
+ COALESCE(i.other_fee, 0)
)
- SUM( COALESCE(i.discount, 0) )
AS net_payable,
SUM((SELECT SUM(COALESCE(recipts.reg_fee, 0)
+ COALESCE(recipts.tut_fee, 0)
+ COALESCE(recipts.other_fee, 0)) FROM recipts WHERE recipts.student_id = s.id))
AS net_recieved,
( COALESCE( SUM(s.open_bal), 0 )
+ SUM( COALESCE(i.reg_fee, 0)
+ COALESCE(i.tut_fee, 0)
+ COALESCE(i.other_fee, 0)
)
- SUM( COALESCE(i.discount, 0) )
)
- SUM((SELECT SUM(COALESCE(recipts.reg_fee, 0)
+ COALESCE(recipts.tut_fee, 0)
+ COALESCE(recipts.other_fee, 0)) FROM recipts WHERE recipts.student_id = s.id))
AS balance_due
FROM batches b
LEFT JOIN students s ON s.batch = b.id
LEFT JOIN invoices i ON i.student_id = s.id
WHERE s.inactive = 0
GROUP BY b.name, b.id;
This sums each student's rows in the receipts table, however many there are, and returns a single value per student. Removing the join to the receipts table removes the duplicated rows from the other tables, so the calculations should now be correct.
One more thing: you've got s.inactive = 0 in the WHERE clause; make sure it does not affect these calculations in a way you don't intend.
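The row multiplication can be reproduced with the students and recipts rows from the question. The sketch below uses an in-memory SQLite database as a stand-in for MySQL, leaves out the batches and invoices tables for brevity, and fixes the problem with a pre-aggregated derived table, which is a variant of the correlated subquery shown above rather than the same query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students (id INTEGER, open_bal INTEGER, batch INTEGER, inactive INTEGER)")
conn.execute(
    "CREATE TABLE recipts (id INTEGER, student_id INTEGER, reg_fee INTEGER, tut_fee INTEGER, other_fee INTEGER)"
)
conn.executemany(
    "INSERT INTO students VALUES (?, ?, ?, ?)",
    [(44, -16000, 9, 0), (182, 9000, 9, 0), (184, -36000, 9, 0),
     (185, 19000, 9, 0), (186, 9000, 9, 0), (187, 4000, 9, 0)],
)
conn.executemany(
    "INSERT INTO recipts VALUES (?, ?, ?, ?, ?)",
    [(8, 44, 0, 0, 1500), (277, 44, 0, 50000, 0), (26, 182, 0, 0, 1500),
     (424, 182, 0, 15000, 0), (468, 182, 0, 15000, 0), (36, 185, 0, 0, 1500),
     (697, 185, 0, 15000, 0), (66, 187, 0, 0, 1500)],
)

# Direct join: a student with N receipts appears N times, so open_bal is added N times.
naive = conn.execute(
    """
    SELECT SUM(s.open_bal),
           SUM(COALESCE(r.reg_fee, 0) + COALESCE(r.tut_fee, 0) + COALESCE(r.other_fee, 0))
    FROM students s
    LEFT JOIN recipts r ON r.student_id = s.id
    WHERE s.inactive = 0
    """
).fetchone()

# Pre-aggregate receipts to one row per student before joining.
fixed = conn.execute(
    """
    SELECT SUM(s.open_bal),
           SUM(COALESCE(r.total, 0))
    FROM students s
    LEFT JOIN (
        SELECT student_id, SUM(reg_fee + tut_fee + other_fee) AS total
        FROM recipts
        GROUP BY student_id
    ) AS r ON r.student_id = s.id
    WHERE s.inactive = 0
    """
).fetchone()

print("naive:", naive)
print("fixed:", fixed)
```

The direct join yields an open balance of 10000, the wrong figure reported in the question, while the pre-aggregated version yields the expected -11000. Net received is 101000 in both cases, because each receipt row still appears exactly once.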
P.S. How come you don't know what a sub query is and you end up writing stuff like that?
I have got the solution: I was joining lots of queries together, and that's why some results were doubling. Thanks.

mysql wrong results on big query

Please help me fix this MySQL query so that it returns correct results...
The data sets for the tables are as follows...
students
| id | name | batch | discount | open_bal | inactive |
+----+-------+-------+----------+----------+----------+
| 1 | Ash | 19 | 0 | -5000 | 0 |
+----+-------+-------+----------+----------+----------+
| 2 | Tuh | 15 | 0 | 0 | 0 |
+----+-------+-------+----------+----------+----------+
invoices
| id | invoice_num | student_id | reg_fee | tut_fee | other_fee | discount |
+------+-------------+------------+---------+---------+-----------+----------+
| 1 | 2011/1 | 1 | 5000 | 0 | 0 | 0 |
+------+-------------+------------+---------+---------+-----------+----------+
| 137 | 2011/137 | 1 | 15000 | 0 | 0 | 0 |
+------+-------------+------------+---------+---------+-----------+----------+
| 169 | 2011/169 | 2 | 15000 | 0 | 0 | 0 |
+------+-------------+------------+---------+---------+-----------+----------+
recipts
| id | recipt_num | student_id | reg_fee | tut_fee | other_fee | status |
+------+-------------+------------+---------+---------+-----------+------------+
| 264 | 2011/264 | 1 | 0 | 15000 | 0 | confirmed |
+------+-------------+------------+---------+---------+-----------+------------+
| 18 | 2011/18 | 2 | 0 | 5250 | 0 | confirmed |
+------+-------------+------------+---------+---------+-----------+------------+
| 251 | 2011/251 | 2 | 4650 | 0 | 0 | pending |
+------+-------------+------------+---------+---------+-----------+------------+
batches
| id | name |
+-----+----------+
| 19 | S.T-11 |
+-----+----------+
| 15 | Mc/11-13 |
+-----+----------+
I want to produce a report per batch with the following columns...
Batch id - batch id from batches table
Batch Name - batch name from batches table
Total Students - count(s.id) from students table group by batch
Opening Bal - sum(s.openbal) from students table
Gross Fee - sum(reg_fee+tut_fee+other_fee) from invoices table
Discount - sum(i.discount) from invoices table
Net Payable - (openbal + grossfee) - discount
Net Received - sum(reg_fee+tut_fee+other_fee) from recipts table where r.status = 'confirmed'
Due Balance - Net Payable - Net Received
expected report
| batch_id | batch_name | total_students | opening_bal | gross_fee | discount | net_payable | net_recieved | due_balance |
+----------+------------+----------------+-------------+-----------+----------+-------------+--------------+-------------+
| 15 | Mc/11-13 | 1 | 0 | 15000 | 0 | 15000 | 5250 | 9750 |
+----------+------------+----------------+-------------+-----------+----------+-------------+--------------+-------------+
| 19 | S.T-11 | 1 | -5000 | 20000 | 0 | 15000 | 15000 | 0 |
+----------+------------+----------------+-------------+-----------+----------+-------------+--------------+-------------+
I have tried using the following query, but it's giving wrong results.
SELECT b.name AS batch_name,
b.id AS batch_id,
COUNT( s.id ) AS total_students,
COALESCE( s.open_bal, 0 ) AS open_balance,
COALESCE( sum( i.reg_fee + i.tut_fee + i.other_fee ) , 0 ) AS gross_fee,
COALESCE( s.discount, 0 ) ,
COALESCE( sum( i.reg_fee + i.tut_fee + i.other_fee ) , 0 ) -
COALESCE( s.discount, 0 ) AS net_payable,
COALESCE( sum( r.reg_fee + r.tut_fee + r.other_fee ) , 0 ) AS net_recieved,
COALESCE( s.discount, 0 ) ,
COALESCE( sum( i.reg_fee + i.tut_fee + i.other_fee ) , 0 ) -
COALESCE( s.discount, 0 ) -
COALESCE( sum( r.reg_fee + r.tut_fee + r.other_fee ) , 0 )
AS due_balance
FROM batches b
LEFT JOIN students s ON s.batch = b.id
LEFT JOIN invoices i ON i.student_id = s.id
LEFT JOIN recipts r ON r.student_id = s.id
WHERE s.inactive =0 and r.status = 'confirmed'
GROUP BY b.name;
Please help me rewrite this query...
Speaking of SQL, this line is almost certainly wrong:
GROUP BY b.name;
The GROUP BY should contain every element of the select list that is not an aggregate expression.
Try the query using:
GROUP BY b.name,b.id,COALESCE(s.open_bal,0), COALESCE(s.discount,0);
When the GROUP BY expression is incomplete, MySQL (unless ONLY_FULL_GROUP_BY is enabled) accepts the query anyway and takes the ungrouped columns from an arbitrary row of each group. This avoids rejecting the query but produces highly unpredictable results, especially if your query is complex.
If you do not need a distinct result row for each s.open_bal and s.discount, then maybe you do not need these (duplicated) values in the select.
I did not take the time to analyze the complete query, but your needs seem quite complex. I would say divide and conquer, KISS (Keep It Simple, Stupid): write several queries you fully understand instead of one huge query. This applies especially if the requirements for some of the results differ (some working on details, some on aggregates, and some on different aggregates, etc.), as you might need window functions (the "partition by" keyword), which MySQL did not offer before version 8.0.
Maybe you should also try to fix your sums as in this example:
COALESCE( sum( i.reg_fee + i.tut_fee + i.other_fee ) , 0 ) -- bad
sum( COALESCE(i.reg_fee,0) + COALESCE(i.tut_fee,0) + COALESCE(i.other_fee,0) ) -- good
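The difference between the two forms only shows up when one of the fee columns is NULL. A minimal sketch with an in-memory SQLite database (a stand-in for MySQL); the NULL tut_fee in the first row is hypothetical, since the invoices shown in the question contain no NULLs:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE invoices (student_id INTEGER, reg_fee INTEGER, tut_fee INTEGER, other_fee INTEGER)")
conn.executemany(
    "INSERT INTO invoices VALUES (?, ?, ?, ?)",
    [
        (1, 5000, None, 0),  # hypothetical NULL fee
        (1, 15000, 0, 0),
    ],
)

# Outer COALESCE: 5000 + NULL + 0 is NULL, so SUM skips the whole first row.
# Inner COALESCE: each NULL becomes 0 before the addition, so the row counts.
outer_coalesce, inner_coalesce = conn.execute(
    """
    SELECT COALESCE(SUM(reg_fee + tut_fee + other_fee), 0),
           SUM(COALESCE(reg_fee, 0) + COALESCE(tut_fee, 0) + COALESCE(other_fee, 0))
    FROM invoices
    """
).fetchone()

print(outer_coalesce, inner_coalesce)
```

With the outer COALESCE the first invoice is silently dropped and the total is 15000; with COALESCE applied per column the total is 20000.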