I want to optimize a query by using a UNION as a subquery, but I'm not sure how to construct it. I'm using MySQL 5.
Here is the original query:
SELECT Parts.id
FROM Parts_Category, Parts
LEFT JOIN Image ON Parts.image_id = Image.id
WHERE
(
(
Parts_Category.category_id = '508' OR
Parts_Category.main_category_id ='508'
) AND
Parts.id = Parts_Category.Parts_id
) AND
Parts.status = 'A'
GROUP BY
Parts.id
What I want to do is replace the ( Parts_Category.category_id = '508' OR Parts_Category.main_category_id = '508' ) part with the union below. That way I can drop the GROUP BY clause and use straight column indexes, which should improve performance. The Parts and Parts_Category tables contain half a million records each, so any gain would be great.
(
SELECT * FROM
(
(SELECT Parts_id FROM Parts_Category WHERE category_id = '508')
UNION
(SELECT Parts_id FROM Parts_Category WHERE main_category_id = '508')
)
as Parts_id
)
Can anybody give me a clue on how to rewrite it? I've tried for hours but can't get it, as I'm fairly new to MySQL.
SELECT Parts.id
FROM (
SELECT parts_id
FROM Parts_Category
WHERE Parts_Category.category_id = '508'
UNION
SELECT parts_id
FROM Parts_Category
WHERE Parts_Category.main_category_id = '508'
) pc
JOIN Parts
ON Parts.id = pc.parts_id
AND Parts.status = 'A'
LEFT JOIN
Image
ON Image.id = Parts.image_id
Note that MySQL can use Index Merge, so you can also rewrite your query like this:
SELECT Parts.id
FROM (
SELECT DISTINCT parts_id
FROM Parts_Category
WHERE Parts_Category.category_id = '508'
OR Parts_Category.main_category_id = '508'
) pc
JOIN Parts
ON Parts.id = pc.parts_id
AND Parts.status = 'A'
LEFT JOIN
Image
ON Image.id = Parts.image_id
This will be more efficient if you have the following indexes:
Parts_Category (category_id, parts_id)
Parts_Category (main_category_id, parts_id)
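As a rough check that the two rewrites above select the same parts, here is a minimal sketch. It assumes an in-memory SQLite database as a stand-in for MySQL (so it says nothing about MySQL's Index Merge or timing), and the sample rows and index names are made up for illustration.

```python
import sqlite3

# Assumption: SQLite in memory stands in for MySQL; the data is invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Parts (id INTEGER PRIMARY KEY, status TEXT, image_id INTEGER);
CREATE TABLE Parts_Category (Parts_id INTEGER, category_id INTEGER, main_category_id INTEGER);
-- The two indexes suggested above (index names are hypothetical).
CREATE INDEX ix_pc_category ON Parts_Category (category_id, Parts_id);
CREATE INDEX ix_pc_main_category ON Parts_Category (main_category_id, Parts_id);

INSERT INTO Parts VALUES (1, 'A', NULL), (2, 'A', NULL), (3, 'I', NULL), (4, 'A', NULL);
INSERT INTO Parts_Category VALUES
  (1, 508, 100),
  (1, 200, 508),
  (2, 300, 508),
  (3, 508, 508),
  (4, 300, 100);
""")

# Variant 1: UNION of two single-column lookups; UNION removes duplicates.
union_sql = """
SELECT Parts.id
FROM (
    SELECT Parts_id FROM Parts_Category WHERE category_id = 508
    UNION
    SELECT Parts_id FROM Parts_Category WHERE main_category_id = 508
) pc
JOIN Parts ON Parts.id = pc.Parts_id AND Parts.status = 'A'
ORDER BY Parts.id
"""

# Variant 2: one scan with OR, de-duplicated with DISTINCT.
distinct_sql = """
SELECT Parts.id
FROM (
    SELECT DISTINCT Parts_id
    FROM Parts_Category
    WHERE category_id = 508 OR main_category_id = 508
) pc
JOIN Parts ON Parts.id = pc.Parts_id AND Parts.status = 'A'
ORDER BY Parts.id
"""

union_ids = [row[0] for row in conn.execute(union_sql)]
distinct_ids = [row[0] for row in conn.execute(distinct_sql)]
print(union_ids, distinct_ids)
```

Part 1 matches through both columns but appears only once in either result, which is why neither form needs the outer GROUP BY.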
I'm not sure how to make the following SQL query more efficient. Right now, the query is taking 8 - 12 seconds on a pretty fast server, but that's not close to fast enough for a Website when users are trying to load a page with this code on it. It's looking through tables with many rows, for instance the "Post" table has 717,873 rows. Basically, the query lists all Posts related to what the user is following (newest to oldest).
Is there a way to make it faster by only getting the last 20 results total based on PostTimeOrder?
Any help would be much appreciated or insight on anything that can be done to improve this situation. Thank you.
Here's the full SQL query (lots of nesting):
SELECT DISTINCT p.Id, UNIX_TIMESTAMP(p.PostCreationTime) AS PostCreationTime, p.Content AS Content, p.Bu AS Bu, p.Se AS Se, UNIX_TIMESTAMP(p.PostCreationTime) AS PostTimeOrder
FROM Post p
WHERE (p.Id IN (SELECT pc.PostId
FROM PostCreator pc
WHERE (pc.UserId IN (SELECT uf.FollowedId
FROM UserFollowing uf
WHERE uf.FollowingId = '100')
OR pc.UserId = '100')
))
OR (p.Id IN (SELECT pum.PostId
FROM PostUserMentions pum
WHERE (pum.UserId IN (SELECT uf.FollowedId
FROM UserFollowing uf
WHERE uf.FollowingId = '100')
OR pum.UserId = '100')
))
OR (p.Id IN (SELECT ssp.PostId
FROM SStreamPost ssp
WHERE (ssp.SStreamId IN (SELECT ssf.SStreamId
FROM SStreamFollowing ssf
WHERE ssf.UserId = '100'))
))
OR (p.Id IN (SELECT psm.PostId
FROM PostSMentions psm
WHERE (psm.StockId IN (SELECT sf.StockId
FROM StockFollowing sf
WHERE sf.UserId = '100' ))
))
UNION ALL
SELECT DISTINCT p.Id AS Id, UNIX_TIMESTAMP(p.PostCreationTime) AS PostCreationTime, p.Content AS Content, p.Bu AS Bu, p.Se AS Se, UNIX_TIMESTAMP(upe.PostEchoTime) AS PostTimeOrder
FROM Post p
INNER JOIN UserPostE upe
on p.Id = upe.PostId
INNER JOIN UserFollowing uf
on (upe.UserId = uf.FollowedId AND (uf.FollowingId = '100' OR upe.UserId = '100'))
ORDER BY PostTimeOrder DESC;
Changing your p.Id IN (...) predicates to EXISTS predicates with correlated subqueries may help. Also, since both halves of your UNION ALL query pull from the Post table and may return nearly identical records, you might be able to combine the two into one query by left outer joining to UserPostE and adding upe.PostId IS NOT NULL as an OR condition in the WHERE clause. UserFollowing will still inner join to UserPostE. If you want the same Post record twice, once with upe.PostEchoTime and once with p.PostCreationTime as the PostTimeOrder, you'll need to keep the UNION ALL.
SELECT
DISTINCT -- <<=- May not be needed
p.Id
, UNIX_TIMESTAMP(p.PostCreationTime) AS PostCreationTime
, p.Content AS Content
, p.Bu AS Bu
, p.Se AS Se
, UNIX_TIMESTAMP(coalesce( upe.PostEchoTime
, p.PostCreationTime)) AS PostTimeOrder
FROM Post p
LEFT JOIN UserPostE upe
INNER JOIN UserFollowing uf
on (upe.UserId = uf.FollowedId AND
(uf.FollowingId = '100' OR
upe.UserId = '100'))
on p.Id = upe.PostId
WHERE upe.PostID is not null
or exists (SELECT 1
FROM PostCreator pc
WHERE pc.PostId = p.ID
and ( pc.UserId = '100'
or exists (SELECT 1
FROM UserFollowing uf
WHERE uf.FollowedId = pc.UserID
and uf.FollowingId = '100'))
)
OR exists (SELECT 1
FROM PostUserMentions pum
WHERE pum.PostId = p.ID
and ( pum.UserId = '100'
or exists (SELECT 1
FROM UserFollowing uf
WHERE uf.FollowedId = pum.UserId
and uf.FollowingId = '100'))
)
OR exists (SELECT 1
FROM SStreamPost ssp
WHERE ssp.PostId = p.ID
and exists (SELECT 1
FROM SStreamFollowing ssf
WHERE ssf.SStreamId = ssp.SStreamId
and ssf.UserId = '100')
)
OR exists (SELECT 1
FROM PostSMentions psm
WHERE psm.PostId = p.ID
and exists (SELECT 1
FROM StockFollowing sf
WHERE sf.StockId = psm.StockId
and sf.UserId = '100' )
)
ORDER BY PostTimeOrder DESC
The from section could alternatively be rewritten to also use an existence clause with a correlated sub query:
FROM Post p
LEFT JOIN UserPostE upe
on p.Id = upe.PostId
and ( upe.UserId = '100'
or exists (select 1
from UserFollowing uf
where uf.FollowedId = upe.UserID
and uf.FollowingId = '100'))
Turn IN ( SELECT ... ) into a JOIN .. ON ... (see below)
Turn OR into UNION (see below)
Some of the tables are many-to-many mappings, such as SStreamFollowing? If so, follow the tips in http://mysql.rjweb.org/doc.php/index_cookbook_mysql#many_to_many_mapping_table
Example of IN:
SELECT ssp.PostId
FROM SStreamPost ssp
WHERE (ssp.SStreamId IN (
SELECT ssf.SStreamId
FROM SStreamFollowing ssf
WHERE ssf.UserId = '100' ))
-->
SELECT ssp.PostId
FROM SStreamPost ssp
JOIN SStreamFollowing ssf ON ssp.SStreamId = ssf.SStreamId
WHERE ssf.UserId = '100'
The big WHERE with all the INs becomes something like
JOIN ( ( SELECT pc.PostId AS id ... )
UNION ( SELECT pum.PostId ... )
UNION ( SELECT ssp.PostId ... )
UNION ( SELECT psm.PostId ... ) )
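To make that outline concrete, here is a minimal sketch of the "UNION of ids, then JOIN to Post" shape. It assumes SQLite in memory as a stand-in for MySQL, uses integer timestamps, invents a handful of rows, and covers only the PostCreator and PostUserMentions branches; the stream and stock branches would presumably follow the same pattern.

```python
import sqlite3

# Assumption: SQLite in memory stands in for MySQL; schema is cut down and data is invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Post (Id INTEGER PRIMARY KEY, PostCreationTime INTEGER, Content TEXT);
CREATE TABLE PostCreator (PostId INTEGER, UserId INTEGER);
CREATE TABLE PostUserMentions (PostId INTEGER, UserId INTEGER);
CREATE TABLE UserFollowing (FollowingId INTEGER, FollowedId INTEGER);

INSERT INTO Post VALUES (1, 1000, 'own post'), (2, 2000, 'by followed user'),
                        (3, 3000, 'by stranger'), (4, 4000, 'mentions followed user');
INSERT INTO PostCreator VALUES (1, 100), (2, 200), (3, 300), (4, 300);
INSERT INTO PostUserMentions VALUES (4, 200), (2, 100);
INSERT INTO UserFollowing VALUES (100, 200);
""")

# Each OR branch becomes its own SELECT; UNION removes duplicate post ids,
# so the outer query needs no DISTINCT.
feed_sql = """
SELECT p.Id, p.PostCreationTime AS PostTimeOrder
FROM (
    SELECT pc.PostId AS id
    FROM PostCreator pc
    WHERE pc.UserId = :uid
    UNION
    SELECT pc.PostId
    FROM PostCreator pc
    JOIN UserFollowing uf ON uf.FollowedId = pc.UserId
    WHERE uf.FollowingId = :uid
    UNION
    SELECT pum.PostId
    FROM PostUserMentions pum
    WHERE pum.UserId = :uid
    UNION
    SELECT pum.PostId
    FROM PostUserMentions pum
    JOIN UserFollowing uf ON uf.FollowedId = pum.UserId
    WHERE uf.FollowingId = :uid
) ids
JOIN Post p ON p.Id = ids.id
ORDER BY PostTimeOrder DESC
LIMIT 20
"""

feed = conn.execute(feed_sql, {"uid": 100}).fetchall()
feed_ids = [row[0] for row in feed]
print(feed_ids)
```

Post 2 qualifies twice (created by a followed user and mentioning user 100) but is returned once, and post 3 is excluded because nothing links it to user 100.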
Get what you can out of those suggestions, then come back for more advice if you still need it. And bring SHOW CREATE TABLE with you.
I have this below:
SELECT a.* FROM ( SELECT
asset_liability_income_expenditure_tbl.a_l_code,
SUM(mainaccount_a_2017.amount), mainaccount_a_2017.dr_cr_action
FROM `mainaccount_a_2017` LEFT JOIN chart_of_account
ON (
chart_of_account.joint_account_numbers =
mainaccount_a_2017.joint_account_number
)
LEFT JOIN asset_liability_income_expenditure_tbl
ON (
asset_liability_income_expenditure_tbl.a_l_code =
chart_of_account.account_type
)
WHERE asset_liability_income_expenditure_tbl.a_l_code = 'FA'
AND mainaccount_a_2017.dr_cr_action = 'DR' UNION
SELECT asset_liability_income_expenditure_tbl.a_l_code,
SUM(mainaccount_b_2017.amount),
mainaccount_b_2017.dr_cr_action
FROM `mainaccount_b_2017`
LEFT JOIN chart_of_account ON (
chart_of_account.joint_account_numbers =
mainaccount_b_2017.joint_account_number
)
LEFT JOIN asset_liability_income_expenditure_tbl ON (
asset_liability_income_expenditure_tbl.a_l_code =
chart_of_account.account_type
)
WHERE asset_liability_income_expenditure_tbl.a_l_code = 'FA'
AND mainaccount_b_2017.dr_cr_action = 'DR'
) AS a
It works, but it displays either one empty row at the top and the sum below, or vice versa. I tried LIMIT 1, but then I cannot fetch the SUM(amount) when it comes out in row 2; and if I don't apply any limit, I only get the result when SUM(amount) comes out in row 1. I don't know what I'm missing. Please kindly assist. Thanks.
I figured it out. I had to flip something. Results below:
SELECT SUM(a) FROM
(
SELECT SUM(mainaccount_a_2017.amount)
AS a FROM `mainaccount_a_2017`
LEFT JOIN chart_of_account ON (chart_of_account.joint_account_numbers = mainaccount_a_2017.joint_account_number)
LEFT JOIN asset_liability_income_expenditure_tbl ON (asset_liability_income_expenditure_tbl.a_l_code = chart_of_account.account_type)
WHERE asset_liability_income_expenditure_tbl.a_l_code = 'FA'
AND mainaccount_a_2017.dr_cr_action = 'DR'
UNION ALL
SELECT SUM(mainaccount_b_2017.amount)
AS a FROM `mainaccount_b_2017`
LEFT JOIN chart_of_account ON (chart_of_account.joint_account_numbers = mainaccount_b_2017.joint_account_number)
LEFT JOIN asset_liability_income_expenditure_tbl ON (asset_liability_income_expenditure_tbl.a_l_code = chart_of_account.account_type)
WHERE asset_liability_income_expenditure_tbl.a_l_code = 'FA' AND mainaccount_b_2017.dr_cr_action = 'DR'
) a
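A small sketch of why the outer SUM over UNION ALL is the safer shape here. It assumes SQLite in memory as a stand-in for MySQL, and the tables account_a and account_b are hypothetical, simplified versions of the two mainaccount tables.

```python
import sqlite3

# Assumption: SQLite in memory stands in for MySQL; tables and rows are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE account_a (amount INTEGER, dr_cr_action TEXT);
CREATE TABLE account_b (amount INTEGER, dr_cr_action TEXT);
INSERT INTO account_a VALUES (60, 'DR'), (40, 'DR'), (5, 'CR');
INSERT INTO account_b VALUES (100, 'DR');
""")


def total(set_operator):
    """Sum the two per-table DR subtotals, combined with the given set operator."""
    sql = f"""
    SELECT SUM(a) FROM (
        SELECT SUM(amount) AS a FROM account_a WHERE dr_cr_action = 'DR'
        {set_operator}
        SELECT SUM(amount) AS a FROM account_b WHERE dr_cr_action = 'DR'
    ) t
    """
    return conn.execute(sql).fetchone()[0]


# Both subtotals happen to be 100. UNION treats them as duplicate rows and
# keeps one; UNION ALL keeps both.
total_union = total("UNION")
total_union_all = total("UNION ALL")

# An aggregate with no matching rows still returns one row holding NULL,
# which is a likely source of an "empty row" in the un-summed result.
empty_sum = conn.execute(
    "SELECT SUM(amount) FROM account_a WHERE dr_cr_action = 'XX'"
).fetchall()

print(total_union, total_union_all, empty_sum)
```

The outer SUM ignores NULL subtotals, so wrapping the UNION ALL also hides the empty row instead of relying on row order or LIMIT.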
I have the following query which is actually within a stored procedure, but I removed it as there is too much going on inside the stored procedure. Basically this is the end result which takes ages (more than a minute) to run and I know the reason why - as you will also see from looking at the result of the explain - but I just cannot get it sorted.
Just to quickly explain what this query is doing. It is fetching all products from companies that are "connected" to the company where li.nToObjectID = 37. The result also returns some other information about the other companies like its name, company id, etc.
SELECT DISTINCT
SQL_CALC_FOUND_ROWS
p.id,
p.sTitle,
p.sTeaser,
p.TimeStamp,
p.ExpiryDate,
p.InStoreDate,
p.sCreator,
p.sProductCode,
p.nRetailPrice,
p.nCostPrice,
p.bPublic,
c.id as nCompanyID,
c.sName as sCompany,
m.id as nMID,
m.sFileName as sHighResFileName,
m.nSize,
(
Select sName
FROM tblBrand
WHERE id = p.nBrandID
) as sBrand,
(
Select t.sFileName
FROM tblThumbnail t
where t.nMediaID = m.id AND
t.sType = "thumbnail"
) as sFileName,
(
Select t.nWidth
FROM tblThumbnail t
where t.nMediaID = m.id AND
t.sType = "thumbnail"
) as nWidth,
(
Select t.nHeight
FROM tblThumbnail t
where t.nMediaID = m.id AND
t.sType = "thumbnail"
) as nHeight,
IF (
(
SELECT COUNT(id) FROM tblLink
WHERE
sType = "company"
AND sStatus = "active"
AND nToObjectID = 37
AND nFromObjectID = u.nCompanyID
),
1,
0
) AS bLinked
FROM tblProduct p
INNER JOIN tblMedia m
ON (
m.nTypeID = p.id AND
m.sType = "product"
)
INNER JOIN tblUser u
ON u.id = p.nUserID
INNER JOIN tblCompany c
ON u.nCompanyID = c.id
LEFT JOIN tblLink li
ON (
li.sType = "company"
AND li.sStatus = "active"
AND li.nToObjectID = 37
AND li.nFromObjectID = u.nCompanyID
)
WHERE c.bActive = 1
AND p.bArchive = 0
AND p.bActive = 1
AND NOW() <= p.ExpiryDate
AND (
li.id IS NOT NULL
OR (
li.id IS NULL
AND p.bPublic = 1
)
)
ORDER BY p.TimeStamp DESC
LIMIT 0, 52
The output of EXPLAIN is here (sorry, I couldn't get the formatting right): http://i60.tinypic.com/2hdqjgj.png
And lastly the number of rows for all the tables in this query:
tblProducts: 5392
tblBrand: 194
tblCompany: 368
tblUser: 416
tblMedia: 5724
tblLink: 24800
tblThumbnail: 22207
So I have 2 questions:
1. Is there another way of writing this query which might potentially speed it up?
2. What index combination do I need for tblProducts so that not all the rows are searched through?
UPDATE 1
This is the new query after removing the subqueries and making use of left joins instead:
SELECT DISTINCT
SQL_CALC_FOUND_ROWS
p.id,
p.sTitle,
p.sTeaser,
p.TimeStamp,
p.ExpiryDate,
p.InStoreDate,
p.sCreator,
p.sProductCode,
p.nRetailPrice,
p.nCostPrice,
p.bPublic,
c.id as nCompanyID,
c.sName as sCompany,
m.id as nMID,
m.sFileName as sHighResFileName,
m.nSize,
brand.sName as sBrand,
thumb.sFilename,
thumb.nWidth,
thumb.nHeight,
IF (
(
SELECT COUNT(id) FROM tblLink
WHERE
sType = "company"
AND sStatus = "active"
AND nToObjectID = 37
AND nFromObjectID = u.nCompanyID
),
1,
0
) AS bLinked
FROM tblProduct p
INNER JOIN tblMedia m
ON (
m.nTypeID = p.id AND
m.sType = "product"
)
INNER JOIN tblUser u
ON u.id = p.nUserID
INNER JOIN tblCompany c
ON u.nCompanyID = c.id
LEFT JOIN tblLink li
ON (
li.sType = "company"
AND li.sStatus = "active"
AND li.nToObjectID = 37
AND li.nFromObjectID = u.nCompanyID
)
LEFT JOIN tblBrand AS brand
ON brand.id = p.nBrandID
LEFT JOIN tblThumbnail AS thumb
ON (
thumb.nMediaID = m.id
AND thumb.sType = 'thumbnail'
)
WHERE c.bActive = 1
AND p.bArchive = 0
AND p.bActive = 1
AND NOW() <= p.ExpiryDate
AND (
li.id IS NOT NULL
OR (
li.id IS NULL
AND p.bPublic = 1
)
)
ORDER BY p.TimeStamp DESC
LIMIT 0, 52;
UPDATE 2
ALTER TABLE tblThumbnail ADD INDEX (nMediaID,sType) USING BTREE;
ALTER TABLE tblMedia ADD INDEX (nTypeID,sType) USING BTREE;
ALTER TABLE tblProduct ADD INDEX (bArchive,bActive,ExpiryDate,bPublic,TimeStamp) USING BTREE;
After doing the above changes the explain showed that it is now only searching through 1464 rows on tblProduct instead of 5392.
That's a big query with a lot going on. It's going to take a few steps of work to optimize it. I will take the liberty of just presenting a couple of steps.
First step. Can you get rid of SQL_CALC_FOUND_ROWS and still have your program work correctly? If so, do that. When you specify SQL_CALC_FOUND_ROWS it sometimes means the server has to delay sending you the first row of your resultset until the last row is available.
Second step. Refactor the dependent subqueries to be JOINs instead.
Here's how you might approach that. Part of your query looks like this...
SELECT DISTINCT SQL_CALC_FOUND_ROWS
p.id,
...
c.id as nCompanyID,
...
m.id as nMID,
...
( /* dependent subquery to be removed */
Select sName
FROM tblBrand
WHERE id = p.nBrandID
) as sBrand,
( /* dependent subquery to be removed */
Select t.sFileName
FROM tblThumbnail t
where t.nMediaID = m.id AND
t.sType = "thumbnail"
) as sFileName,
( /* dependent subquery to be removed */
Select t.nWidth
FROM tblThumbnail t
where t.nMediaID = m.id AND
t.sType = "thumbnail"
) as nWidth,
( /* dependent subquery to be removed */
Select t.nHeight
FROM tblThumbnail t
where t.nMediaID = m.id AND
t.sType = "thumbnail"
) as nHeight,
...
Try this instead. Notice how the brand and thumbnail dependent subqueries disappear. You had three dependent subqueries for the thumbnail; they can disappear into a single JOIN.
SELECT DISTINCT SQL_CALC_FOUND_ROWS
p.id,
...
brand.sName AS sBrand,
thumb.sFilename,
thumb.nWidth,
thumb.nHeight,
...
FROM tblProduct p
INNER JOIN tblMedia AS m ON (m.nTypeID = p.id AND m.sType = 'product')
... (other table joins) ...
LEFT JOIN tblBrand AS brand ON brand.id = p.nBrandID
LEFT JOIN tblThumbnail AS thumb ON (thumb.nMediaID = m.id AND thumb.sType = 'thumbnail')
I used LEFT JOIN rather than INNER JOIN so MySQL will present NULL values if the joined rows are missing.
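The following sketch compares the three thumbnail subqueries with a single LEFT JOIN on a tiny data set. It assumes SQLite in memory as a stand-in for MySQL, keeps only the columns needed for the comparison, and uses invented rows.

```python
import sqlite3

# Assumption: SQLite in memory stands in for MySQL; schema is cut down and data is invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tblMedia (id INTEGER PRIMARY KEY, sFileName TEXT);
CREATE TABLE tblThumbnail (nMediaID INTEGER, sType TEXT, sFileName TEXT,
                           nWidth INTEGER, nHeight INTEGER);
INSERT INTO tblMedia VALUES (1, 'a.jpg'), (2, 'b.jpg');
INSERT INTO tblThumbnail VALUES (1, 'thumbnail', 'a_t.jpg', 80, 60),
                                (1, 'preview', 'a_p.jpg', 400, 300);
""")

# Original shape: one dependent subquery per thumbnail column.
subquery_sql = """
SELECT m.id,
       (SELECT t.sFileName FROM tblThumbnail t
         WHERE t.nMediaID = m.id AND t.sType = 'thumbnail') AS sFileName,
       (SELECT t.nWidth FROM tblThumbnail t
         WHERE t.nMediaID = m.id AND t.sType = 'thumbnail') AS nWidth,
       (SELECT t.nHeight FROM tblThumbnail t
         WHERE t.nMediaID = m.id AND t.sType = 'thumbnail') AS nHeight
FROM tblMedia m
ORDER BY m.id
"""

# Refactored shape: a single LEFT JOIN supplies all three columns.
left_join_sql = """
SELECT m.id, thumb.sFileName, thumb.nWidth, thumb.nHeight
FROM tblMedia m
LEFT JOIN tblThumbnail AS thumb
       ON thumb.nMediaID = m.id AND thumb.sType = 'thumbnail'
ORDER BY m.id
"""

# For contrast: INNER JOIN drops media rows that have no thumbnail.
inner_join_sql = left_join_sql.replace("LEFT JOIN", "JOIN")

subquery_rows = conn.execute(subquery_sql).fetchall()
left_join_rows = conn.execute(left_join_sql).fetchall()
inner_join_rows = conn.execute(inner_join_sql).fetchall()
print(left_join_rows, inner_join_rows)
```

Media 2 has no thumbnail, so the LEFT JOIN returns it with NULLs just as the subqueries did, while the INNER JOIN would silently remove it. The equivalence holds only while each media row has at most one 'thumbnail' row.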
Edit
You're using a join pattern that looks like this:
JOIN sometable AS s ON (s.someID = m.id AND s.sType = 'string')
You seem to do this for a few tables. You probably can speed up the JOIN operations by creating compound indexes in those tables. For example, try adding the following index to tblThumbnail: (sType, nMediaID). You can do that with this DDL statement.
ALTER TABLE tblThumbnail ADD INDEX (sType, nMediaID) USING BTREE
You can do similar things to other tables with the same join pattern.
The view below does not return correct results. I am trying to get the sum of Picked, Printed and Scanned grouped by Plan_Id and PartNum. I need to return the correct totals regardless of whether there are corresponding records in the child tables. I know how to do it if I use three different views and join them, but how do I do it all in a single view? Any help appreciated.
SELECT
`prod_plan`.`Prp_ProdPlanId` AS `PlanId`,
`prod_plan`.`Prp_PartNum` AS `PartNum`,
sum(`prod_plan`.`Prp_Picked`) AS `Picked`,
sum(`printed`.`PtQty`) AS `Printed`,
sum(`scanned`.`PtQty`) AS `Scanned`
FROM
(
(
`prod_plan`
LEFT JOIN `product_trans` `printed` ON (
(
(
`printed`.`PtPlanId` = `prod_plan`.`Prp_ProdPlanId`
)
AND (
`printed`.`PtPartNum` = `prod_plan`.`Prp_PartNum`
)
)
)
)
LEFT JOIN `product_trans` `scanned` ON (
(
(
`scanned`.`PtPlanId` = `prod_plan`.`Prp_ProdPlanId`
)
AND (
`scanned`.`PtPartNum` = `prod_plan`.`Prp_PartNum`
)
)
)
)
WHERE
(
(
`printed`.`PtPart` = 'Barcode Print'
)
AND (
`scanned`.`PtPart` = 'Barcode Scan'
)
)
GROUP BY
`prod_plan`.`Prp_ProdPlanId`,
`prod_plan`.`Prp_PartNum`
You need to check PtPart in the ON clauses. Otherwise, you won't get the rows with no matches in the child tables, because those columns will be NULL.
SELECT
`prod_plan`.`Prp_ProdPlanId` AS `PlanId`,
`prod_plan`.`Prp_PartNum` AS `PartNum`,
sum(`prod_plan`.`Prp_Picked`) AS `Picked`,
sum(`printed`.`PtQty`) AS `Printed`,
sum(`scanned`.`PtQty`) AS `Scanned`
FROM `prod_plan`
LEFT JOIN `product_trans` `printed`
ON `printed`.`PtPlanId` = `prod_plan`.`Prp_ProdPlanId`
AND `printed`.`PtPartNum` = `prod_plan`.`Prp_PartNum`
AND `printed`.`PtPart` = 'Barcode Print'
LEFT JOIN `product_trans` `scanned`
ON `scanned`.`PtPlanId` = `prod_plan`.`Prp_ProdPlanId`
AND `scanned`.`PtPartNum` = `prod_plan`.`Prp_PartNum`
AND `scanned`.`PtPart` = 'Barcode Scan'
GROUP BY
`prod_plan`.`Prp_ProdPlanId`,
`prod_plan`.`Prp_PartNum`
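Here is a minimal sketch of the difference between filtering a LEFT JOIN's child table in WHERE and filtering it in ON. It assumes SQLite in memory as a stand-in for MySQL, uses only the printed join for brevity, and the rows are invented.

```python
import sqlite3

# Assumption: SQLite in memory stands in for MySQL; data is invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE prod_plan (Prp_ProdPlanId INTEGER, Prp_PartNum TEXT, Prp_Picked INTEGER);
CREATE TABLE product_trans (PtPlanId INTEGER, PtPartNum TEXT, PtPart TEXT, PtQty INTEGER);
INSERT INTO prod_plan VALUES (1, 'P-1', 10), (2, 'P-2', 7);
INSERT INTO product_trans VALUES (1, 'P-1', 'Barcode Print', 4),
                                 (1, 'P-1', 'Barcode Scan', 3);
""")

# Filter in WHERE: for plan 2 the joined columns are NULL, the comparison
# is not true, and the whole row is discarded.
where_sql = """
SELECT pp.Prp_ProdPlanId, pp.Prp_PartNum, SUM(pp.Prp_Picked), SUM(printed.PtQty)
FROM prod_plan pp
LEFT JOIN product_trans printed
       ON printed.PtPlanId = pp.Prp_ProdPlanId
      AND printed.PtPartNum = pp.Prp_PartNum
WHERE printed.PtPart = 'Barcode Print'
GROUP BY pp.Prp_ProdPlanId, pp.Prp_PartNum
ORDER BY pp.Prp_ProdPlanId
"""

# Filter in ON: the condition only decides which child rows match, so
# plan 2 survives with a NULL total.
on_sql = """
SELECT pp.Prp_ProdPlanId, pp.Prp_PartNum, SUM(pp.Prp_Picked), SUM(printed.PtQty)
FROM prod_plan pp
LEFT JOIN product_trans printed
       ON printed.PtPlanId = pp.Prp_ProdPlanId
      AND printed.PtPartNum = pp.Prp_PartNum
      AND printed.PtPart = 'Barcode Print'
GROUP BY pp.Prp_ProdPlanId, pp.Prp_PartNum
ORDER BY pp.Prp_ProdPlanId
"""

where_rows = conn.execute(where_sql).fetchall()
on_rows = conn.execute(on_sql).fetchall()
print(where_rows)
print(on_rows)
```

This sample has at most one matching transaction per plan. With several print or scan rows per plan, joining both child aliases in one query would still multiply rows before the SUMs are taken.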
Here is what I have done. Below are 3 views that work correctly.
v_trans_printed:
SELECT
product_trans.PtPlanId AS PlanId,
product_trans.PtLot AS Lot,
product_trans.PtPartNum AS PartNum,
Sum(product_trans.PtQty) AS Printed
FROM
product_trans
WHERE
product_trans.PtPart = 'Barcode Print'
GROUP BY
product_trans.PtPlanId,
product_trans.PtLot,
product_trans.PtPartNum
v_trans_scanned:
SELECT
product_trans.PtPlanId AS PlanId,
product_trans.PtLot AS Lot,
product_trans.PtPartNum AS PartNum,
Sum(product_trans.PtQty) AS Scanned
FROM
product_trans
WHERE
product_trans.PtPart = 'Barcode Scan'
GROUP BY
product_trans.PtPlanId,
product_trans.PtLot,
product_trans.PtPartNum
And here I put them all together. This returns the correct results:
SELECT
prod_plan.Prp_ProdPlanId AS PlanId,
prod_plan.Prp_Lot AS Lot,
prod_plan.Prp_PartNum AS PartNum,
Sum(prod_plan.Prp_Picked) AS Picked,
Printed.Printed AS Printed,
Scanned.Scanned AS Scanned
FROM
prod_plan
LEFT JOIN v_trans_printed AS Printed ON Printed.PlanId = prod_plan.Prp_ProdPlanId AND Printed.Lot = prod_plan.Prp_Lot AND Printed.PartNum = prod_plan.Prp_PartNum
LEFT JOIN v_trans_scanned AS Scanned ON Scanned.PlanId = prod_plan.Prp_ProdPlanId AND Scanned.Lot = prod_plan.Prp_Lot AND Scanned.PartNum = prod_plan.Prp_PartNum
GROUP BY
prod_plan.Prp_ProdPlanId,
prod_plan.Prp_Lot,
prod_plan.Prp_PartNum
But it would be better if I could use one view.
Note: I left out Lot originally by accident, but it didn't affect the results with the sample data set.
I have multiple tables for a project (sessions, charges and payments).
To get the sessions I'm doing the following:
SELECT
sess.file_id, SUM(sess.rate * sess.length) AS total
FROM
sess
WHERE sess.sessionDone = 1
GROUP BY sess.file_id
This returns the amount that a specific student should pay.
I also have another table, "charges":
SELECT
file_charges.file_id, SUM(file_charges.price) AS total_charges
FROM
file_charges
GROUP BY file_charges.file_id
And finally the payment query :
SELECT
file_payments.file_id, SUM(file_payments.paymentAmount) AS total_payment
FROM
file_payments
GROUP BY file_payments.file_id
Can I combine those three so that I get:
Total = Payments - (Sessions + Charges)
Note that the total could be negative: a file_id might exist in sessions and charges but not in payments, and there could be a payment without any sessions or charges.
Edit : http://sqlfiddle.com/#!2/a90d9
One issue that needs to be addressed is whether one of these queries can be the "driver" in cases where we don't have rows for a given file_id returned by one or more of the queries (e.g. there might be rows from sess, but none from file_payments). If we want to be sure to include every possible file_id that appears in any of the queries, we can get a list of all possible file_id with a query like this:
SELECT ss.file_id FROM sess ss
UNION
SELECT fc.file_id FROM file_charges fc
UNION
SELECT fp.file_id FROM file_payments fp
(NOTE: The UNION operator will remove any duplicates)
To get the specified resultset, we can use that query, along with "left joins" of the other three original queries. The outline of the query will be:
SELECT a.file_id, p.total_payment - ( s.total + c.total_charges)
FROM a
LEFT JOIN s ON s.file_id = a.file_id
LEFT JOIN c ON c.file_id = a.file_id
LEFT JOIN p ON p.file_id = a.file_id
ORDER BY a.file_id
In that statement a is a standin for the query that gets the set of all file_id values (as shown above). The s, c and p are standins for your three original queries, on sess, file_charges and file_payments, respectively.
If any of the file_id values is "missing" from any of the queries, we are going to need to substitute a zero for the missing value. We can use the IFNULL function to handle that for us.
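A tiny sketch of why the substitution matters: any arithmetic involving NULL yields NULL, so one missing subtotal would blank out the whole result. It assumes SQLite in memory as a stand-in for MySQL (both provide IFNULL), and the literal amounts are invented.

```python
import sqlite3

# Assumption: SQLite in memory stands in for MySQL; the amounts are invented.
conn = sqlite3.connect(":memory:")

# Payments = 100, sessions missing (NULL), charges = 20.
without_ifnull = conn.execute("SELECT 100 - (NULL + 20)").fetchone()[0]

# IFNULL(x, 0) replaces a missing subtotal with zero before the arithmetic.
with_ifnull = conn.execute(
    "SELECT IFNULL(100, 0) - (IFNULL(NULL, 0) + IFNULL(20, 0))"
).fetchone()[0]

print(without_ifnull, with_ifnull)
```

COALESCE(x, 0) would behave the same way and is the standard SQL spelling, if portability matters.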
This query should return the specified resultset:
SELECT a.file_id
, IFNULL(p.total_payment,0) - ( IFNULL(s.total,0) + IFNULL(c.total_charges,0)) AS t
FROM ( -- all possible values of file_id
SELECT ss.file_id FROM sess ss
UNION
SELECT fc.file_id FROM file_charges fc
UNION
SELECT fp.file_id FROM file_payments fp
) a
LEFT
JOIN ( -- the amount that a specific student should pay
SELECT sess.file_id, SUM(sess.rate * sess.length) AS total
FROM sess
WHERE sess.sessionDone = 1
GROUP BY sess.file_id
) s
ON s.file_id = a.file_id
LEFT
JOIN ( -- charges
SELECT file_charges.file_id, SUM(file_charges.price) AS total_charges
FROM file_charges
GROUP BY file_charges.file_id
) c
ON c.file_id = a.file_id
LEFT
JOIN ( -- payments
SELECT file_payments.file_id, SUM(file_payments.paymentAmount) AS total_payment
FROM file_payments
GROUP BY file_payments.file_id
) p
ON p.file_id = a.file_id
ORDER BY a.file_id
(The EXPLAIN for this query is not going to be pretty, with four derived tables. On really large sets, performance may be horrendous. But the resultset returned should meet the specification.)
Beware of queries that JOIN all three tables together... that will likely give incorrect results when there are (for example) two (or more) rows for the same file_id in the file_payments table.
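To illustrate that warning, here is a minimal sketch comparing a direct join with pre-aggregated derived tables. It assumes SQLite in memory as a stand-in for MySQL, leaves out sess for brevity, and the amounts are invented.

```python
import sqlite3

# Assumption: SQLite in memory stands in for MySQL; data is invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE file_charges (file_id INTEGER, price INTEGER);
CREATE TABLE file_payments (file_id INTEGER, paymentAmount INTEGER);
INSERT INTO file_charges VALUES (1, 30), (1, 20);
INSERT INTO file_payments VALUES (1, 100), (1, 50);
""")

# Direct join: 2 charge rows x 2 payment rows = 4 joined rows, so every
# amount is counted twice before SUM runs.
joined_sql = """
SELECT c.file_id, SUM(c.price), SUM(p.paymentAmount)
FROM file_charges c
JOIN file_payments p ON p.file_id = c.file_id
GROUP BY c.file_id
"""

# Aggregate each table to one row per file_id first, then join.
aggregated_sql = """
SELECT c.file_id, c.total_charges, p.total_payment
FROM (SELECT file_id, SUM(price) AS total_charges
      FROM file_charges GROUP BY file_id) c
JOIN (SELECT file_id, SUM(paymentAmount) AS total_payment
      FROM file_payments GROUP BY file_id) p
  ON p.file_id = c.file_id
"""

joined_rows = conn.execute(joined_sql).fetchall()
aggregated_rows = conn.execute(aggregated_sql).fetchall()
print(joined_rows, aggregated_rows)
```

The true totals are 50 in charges and 150 in payments; the direct join reports double that, which is the reason for aggregating inside the derived tables.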
There are other approaches to getting an equivalent result set, but the query above answers the question: "how can i get the results of these queries joined together into a total".
Using correlated subqueries
Here's another approach, using correlated subqueries in the SELECT list...
SELECT a.file_id
, IFNULL( ( SELECT SUM(file_payments.paymentAmount) FROM file_payments
WHERE file_payments.file_id = a.file_id )
,0)
- ( IFNULL( ( SELECT SUM(sess.rate * sess.length) FROM sess
WHERE sess.file_id = a.file_id )
,0)
+ IFNULL( ( SELECT SUM(file_charges.price) FROM file_charges
WHERE file_charges.file_id = a.file_id )
,0)
) AS tot
FROM ( -- all file_id values
SELECT ss.file_id FROM sess ss
UNION
SELECT fc.file_id FROM file_charges fc
UNION
SELECT fp.file_id FROM file_payments fp
) a
ORDER BY a.file_id
try this
SELECT sess.file_id, SUM(file_payments.paymentAmount) - (SUM(sess.rate * sess.length)+SUM(file_charges.price)) as total_payment FROM sess , file_charges , file_payments
WHERE sess.sessionDone = 1
GROUP BY total_payment
EDIT.
SELECT a.file_id
, IFNULL(p.total_payment,0) - ( IFNULL(s.total,0) + IFNULL(c.total_charges,0)) AS tot
FROM (
SELECT ss.file_id FROM sess ss
UNION
SELECT fc.file_id FROM file_charges fc
UNION
SELECT fp.file_id FROM file_payments fp
) a
LEFT JOIN (
SELECT sess.file_id, SUM(sess.rate * sess.length) AS total
FROM sess
WHERE sess.sessionDone = 1
GROUP BY sess.file_id
) s
ON s.file_id = a.file_id
LEFT JOIN (
SELECT file_charges.file_id, SUM(file_charges.price) AS total_charges
FROM file_charges
GROUP BY file_charges.file_id
) c
ON c.file_id = a.file_id
LEFT JOIN (
SELECT file_payments.file_id, SUM(file_payments.paymentAmount) AS total_payment
FROM file_payments
GROUP BY file_payments.file_id
) p
ON p.file_id = a.file_id
ORDER BY a.file_id