I currently have this working using a Sub-query, but as the DB grows this will become HUGELY inefficient. I'm wondering if there is a more efficient way to do what I need to do without sub-queries?
I need to have my final output look like so:
Question, Answer, Responses, Charts included in Response Count
Did this work?, N/A, 26, 30
Did this work?, Yes, 4, 30
This is my current query:
SELECT
bq_text,
ba_a,
bq_id,
COUNT(ba_a) AS ba_aC,
(SELECT COUNT(*) FROM board_done_sheet WHERE sd_b_id = bs.bs_id AND sd_sub = 1) AS sd_chartnumC
FROM board_done_sheet AS sh
LEFT JOIN board_done bd
ON (bd.bd_id = sh.sd_bd_id)
LEFT JOIN boardsubs bs
ON (bd.bd_b_id = bs.bs_id)
LEFT JOIN b_q_answers ba
ON (sh.sd_s_id = ba.ba_s_id)
LEFT JOIN bsquestions bq
ON (bq.bq_id = ba.ba_q_id)
LEFT JOIN multiples m
ON (ba.ba_m_id = m.m_id)
LEFT JOIN users u
ON (u.us_id = bd.bd_d_id)
LEFT JOIN profiles p
ON (p.p_u_id = bd.bd_d_id)
LEFT JOIN users rev
ON (rev.us_id = bd.bd_rev)
WHERE sd_sub = '1' AND bq_text <> 'Date' AND bq_id = 380
GROUP BY bs_id, bq_text, ba_a
That works perfectly. The problem is that it relies on sub-queries, which will get less efficient as time goes by.
I'm just wondering if there is a better, more efficient way to produce that summed field without them.
Presumably the subquery you're concerned about is the one in your top-level SELECT.
That is easy to refactor so it isn't evaluated repeatedly.
Just JOIN it to the rest of the tables. You'll want this sort of thing:
SELECT
bq_text, ...
COUNT(ba_a) AS ba_aC,
countup.countup AS sd_chartnumC
FROM board_done_sheet AS sh
LEFT JOIN board_done bd
ON (bd.bd_id = sh.sd_bd_id)
...
LEFT JOIN users rev
ON (rev.us_id = bd.bd_rev)
JOIN (
SELECT COUNT(*) AS countup , sd_b_id
FROM board_done_sheet
WHERE sd_sub = 1
GROUP BY sd_b_id
) AS countup ON countup.sd_b_id = bs.bs_id
WHERE sd_sub = '1'
AND bq_text <> 'Date'
AND bq_id = 380
GROUP BY bs_id, bq_text, ba_a
The countup subquery generates a summary table of counts and ids, and then joins it to the other tables.
A JOIN cascade of this complexity may become inefficient for other reasons as your table grows if you don't structure your indexes correctly.
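To make the refactoring concrete, here is a minimal sketch that runs both shapes against a toy dataset and checks that they give the same counts. It uses Python's sqlite3 module as a stand-in for MySQL, and the two-table schema is a hypothetical, heavily simplified version of the one in the question, so treat it as an illustration rather than a drop-in replacement.

```python
import sqlite3

# Hypothetical, heavily simplified schema: the real query joins many more tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE boardsubs (bs_id INTEGER PRIMARY KEY);
    CREATE TABLE board_done_sheet (
        sd_id INTEGER PRIMARY KEY, sd_b_id INTEGER, sd_sub INTEGER);
    INSERT INTO boardsubs VALUES (1), (2);
    INSERT INTO board_done_sheet (sd_b_id, sd_sub) VALUES
        (1, 1), (1, 1), (1, 0), (2, 1);
""")

# Original shape: a correlated subquery in the SELECT list.
correlated = conn.execute("""
    SELECT bs.bs_id,
           (SELECT COUNT(*) FROM board_done_sheet
             WHERE sd_b_id = bs.bs_id AND sd_sub = 1) AS sd_chartnumC
      FROM boardsubs AS bs
     ORDER BY bs.bs_id
""").fetchall()

# Refactored shape: count once per sd_b_id in a derived table, then join it.
joined = conn.execute("""
    SELECT bs.bs_id, countup.countup AS sd_chartnumC
      FROM boardsubs AS bs
      JOIN (SELECT sd_b_id, COUNT(*) AS countup
              FROM board_done_sheet
             WHERE sd_sub = 1
             GROUP BY sd_b_id) AS countup
        ON countup.sd_b_id = bs.bs_id
     ORDER BY bs.bs_id
""").fetchall()

print(correlated)
print(joined)
```

One difference to keep in mind: the inner JOIN drops any board that has no matching sheets, whereas the correlated subquery reports 0 for it. If those rows matter, a LEFT JOIN combined with COALESCE(countup.countup, 0) keeps them.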
Related
I have this query, which gives the correct result, but I am sure there is another way to write it, since the same conditions are repeated.
Can anybody help me reduce the complexity of the query?
The query, with these MySQL parameters:
SELECT avai.account_visit_account_info_pk AS Account_ID,
mb.NAME AS Client_Name,
mb.fullname AS Client_Full_Name,
avai.account_name AS Account_Name,
mc.NAME AS Asset_City,
Format(( bfd.finance_value ), 'en_IN') AS Reserve_Price,
Format(( bfd.finance_value ) * 10 / 100, 'en_IN') AS EMD_Value,
Ifnull(Concat(CASE
WHEN mpc.parent = 4 THEN 'Residential'
WHEN mpc.parent = 5 THEN 'Commercial'
WHEN mpc.parent = 6 THEN 'Industrial'
WHEN mpc.parent = 7 THEN 'Agricultural'
END, '/', mpc.category_name), mpc.category_name) Asset_Category,
Concat(ud.first_name, ' ', ud.last_name) AS ADM_Name,
Concat(udd.first_name, ' ', udd.last_name) AS MKT_Name,
mcc.NAME AS ADM_City,
ms.NAME AS ADM_State,
mz.NAME AS ADM_Zone,
bec.e_auction_from AS Auction_Date,
bfdd.finance_value AS Sold_Price
FROM account_branch_visit abv
JOIN mst_product_category mpc
ON mpc.mst_product_category_pk = abv.mst_product_category_pk
JOIN mst_bank mb
ON abv.mst_bank_pk = mb.mst_bank_pk
JOIN banking_financial_details bfd
ON abv.account_branch_visit_pk = bfd.account_branch_visit_pk
AND bfd.mst_financial_pk IN ( 33 )
LEFT JOIN banking_financial_details bfdd
ON abv.account_branch_visit_pk = bfdd.account_branch_visit_pk
AND bfdd.mst_financial_pk IN ( 38 )
JOIN mst_city mc
ON mc.mst_city_pk = avai.mst_city_pk
JOIN mst_city mcc
ON mcc.mst_city_pk = avai.mst_city_pk
JOIN mst_state ms
ON ms.mst_state_pk = mcc.mst_state_pk
JOIN mst_zone mz
ON mz.mst_zone_pk = ms.mst_zone_pk
JOIN case_allocation ca
ON ca.account_branch_visit_pk = avai.account_branch_visit_pk
AND ca.mst_activity_pk = 21
JOIN case_allocation caa
ON caa.account_branch_visit_pk = avai.account_branch_visit_pk
AND caa.mst_activity_pk = 18
JOIN user_detail ud
ON ud.user_detail_pk = ca.assignedto
JOIN user_detail udd
ON udd.user_detail_pk = caa.assignedto
JOIN banking_event_calender bec
ON bec.account_branch_visit_pk = avai.account_branch_visit_pk
AND ( abv.closed_reasons_pk IS NULL
OR abv.closed_reasons_pk = 16 )
AND abv.isdeleted = '0'
WHERE avai.account_branch_visit_pk = '1301';
I do not know the exact intent of the query, so I will stick to some technical observations without really understanding your data model or goal. The SELECT clause lists the columns you presumably need, so what I'm looking for are duplicate table joins. Some of them are necessary, some are not.
banking_financial_details
You JOIN this table once and LEFT JOIN it once, with different conditions. You use both of them, so I assume this is necessary.
mst_city
This join is clearly an unnecessary duplicate:
JOIN mst_city mc
ON mc.mst_city_pk = avai.mst_city_pk
JOIN mst_city mcc
ON mcc.mst_city_pk = avai.mst_city_pk
Remove the second JOIN and its ON clause from the above, and replace every use of mcc with mc in the query.
case_allocation
You join this table twice, but with different ids and you then join the corresponding user_detail to both and both user_detail instances are being used, so this is probably necessary.
user_detail
Since this duplicated join seems to be used in the select, it's probably necessary.
Summary
We have found an unnecessary join that can be removed. Further shortening of the query may be possible, but we would need to know more about your task and database to determine further improvements.
I added indexes as well, but it is still taking 13 sec.
I added a compound index for all the columns that I've used here:
SELECT carrierbil2_.IDENTITY AS col_0_0_,
carrier4_.CARRIER_NAME AS col_1_0_,
carrier4_.IDENTITY AS col_2_0_,
carrier4_.CARRIER_ID AS col_3_0_,
shipmentor0_.EXTERNAL_REFERENCE_ID AS col_4_0_,
invoicedet5_.INVOICE_NUMBER AS col_5_0_,
shipmentca1_.CARRIER_REFERENCE_NUMBER AS col_6_0_,
SUM(shipmentco9_.RATED_COST) AS col_7_0_,
SUM(shipmentco9_.COST) AS col_8_0_,
invoice6_.TOTAL_PAID_AMOUNT AS col_9_0_,
invoice6_.INVOICE_GENERATED_DATE AS col_10_0_,
shipmentor0_.ACTUAL_SHIP_DATE AS col_11_0_,
bolstatus15_.BOL_STATUS_ID AS col_12_0_,
shipmentlo10_.LOCATION_NAME AS col_13_0_,
country11_.COUNTRY_NAME AS col_14_0_,
postal14_.POSTAL_CODE AS col_15_0_,
state12_.STATE_NAME AS col_16_0_,
city13_.CITY_NAME AS col_17_0_,
shipmentlo16_.LOCATION_NAME AS col_18_0_,
country17_.COUNTRY_NAME AS col_19_0_,
postal20_.POSTAL_CODE AS col_20_0_,
state18_.STATE_NAME AS col_21_0_,
city19_.CITY_NAME AS col_22_0_,
shipmentor0_.IDENTITY AS col_23_0_,
shipmentca1_.IDENTITY AS col_24_0_,
shipmentno7_.NOTE AS col_25_0_
FROM
SHIPMENT_ORDER shipmentor0_
INNER JOIN
SHIPMENT_CARRIER shipmentca1_ ON shipmentor0_.SHIPMENT_ORDER_ID = shipmentca1_.SHIPMENT_ORDER_ID
AND (shipmentca1_.IS_DELETED = 0)
LEFT OUTER JOIN
CARRIER_BILL_DETAILS carrierbil2_ ON shipmentca1_.SHIPMENT_CARRIER_ID = carrierbil2_.SHIPMENT_CARRIER_ID
LEFT OUTER JOIN
CARRIER_BILLS carrierbil3_ ON carrierbil2_.CARRIER_BILL_ID = carrierbil3_.CARRIER_BILL_ID
INNER JOIN
CARRIER carrier4_ ON shipmentca1_.CARRIER_ID = carrier4_.CARRIER_ID
LEFT OUTER JOIN
INVOICE_DETAILS invoicedet5_ ON shipmentor0_.SHIPMENT_ORDER_ID = invoicedet5_.SHIPMENT_ORDER_ID
LEFT OUTER JOIN
INVOICE invoice6_ ON invoicedet5_.INVOICE_ID = invoice6_.INVOICE_ID
LEFT OUTER JOIN
SHIPMENT_NOTES shipmentno7_ ON shipmentor0_.SHIPMENT_ORDER_ID = shipmentno7_.SHIPMENT_ORDER_ID
AND (shipmentno7_.NOTE_TYPE = 4)
LEFT OUTER JOIN
SHIPMENT_COST shipmentco8_ ON shipmentor0_.SHIPMENT_ORDER_ID = shipmentco8_.SHIPMENT_ID
LEFT OUTER JOIN
SHIPMENT_COST_DETAILS shipmentco9_ ON shipmentco8_.SHIPMENT_COST_ID = shipmentco9_.SHIPMENT_COST_ID
AND (shipmentco9_.IS_DELETED = 0)
LEFT OUTER JOIN
SHIPMENT_LOCATION shipmentlo10_ ON shipmentor0_.ORIGIN_ID = shipmentlo10_.SHIPMENT_LOCATION_ID
AND (shipmentlo10_.LOCATION_TYPE_ID = 3)
LEFT OUTER JOIN
COUNTRY country11_ ON shipmentlo10_.COUNTRY_ID = country11_.COUNTRY_ID
LEFT OUTER JOIN
STATE state12_ ON shipmentlo10_.STATE_ID = state12_.STATE_ID
LEFT OUTER JOIN
CITY city13_ ON shipmentlo10_.CITY_ID = city13_.CITY_ID
LEFT OUTER JOIN
POSTAL postal14_ ON shipmentlo10_.POSTAL_ID = postal14_.POSTAL_ID
LEFT OUTER JOIN
BOL_STATUS bolstatus15_ ON shipmentor0_.ORDER_STATUS = bolstatus15_.BOL_STATUS_ID
LEFT OUTER JOIN
SHIPMENT_LOCATION shipmentlo16_ ON shipmentor0_.DESTINATION_LOCATION_ID = shipmentlo16_.SHIPMENT_LOCATION_ID
AND (shipmentlo16_.LOCATION_TYPE_ID = 4)
LEFT OUTER JOIN
COUNTRY country17_ ON shipmentlo16_.COUNTRY_ID = country17_.COUNTRY_ID
LEFT OUTER JOIN
STATE state18_ ON shipmentlo16_.STATE_ID = state18_.STATE_ID
LEFT OUTER JOIN
CITY city19_ ON shipmentlo16_.CITY_ID = city19_.CITY_ID
LEFT OUTER JOIN
POSTAL postal20_ ON shipmentlo16_.POSTAL_ID = postal20_.POSTAL_ID
CROSS JOIN
CLIENT client21_
WHERE
shipmentor0_.CLIENT_ID = client21_.CLIENT_ID
AND bolstatus15_.SEQUENCE_ID >= 700
AND (carrierbil3_.IS_APPROVED = 0
OR carrierbil3_.IS_APPROVED IS NULL)
AND (carrierbil3_.IS_DELETED = 0
OR carrierbil3_.IS_DELETED IS NULL)
AND (carrierbil2_.IS_DELETED = 0
OR carrierbil2_.IS_DELETED IS NULL)
AND (shipmentor0_.IS_DELETED = 0
OR shipmentor0_.IS_DELETED IS NULL)
GROUP BY invoice6_.INVOICE_GENERATED_DATE , shipmentca1_.IDENTITY , invoicedet5_.INVOICE_NUMBER , invoice6_.TOTAL_PAID_AMOUNT , shipmentca1_.CARRIER_REFERENCE_NUMBER , carrier4_.CARRIER_ID , CAST(carrier4_.IDENTITY AS SIGNED) , carrier4_.CARRIER_NAME , CAST(carrierbil2_.IDENTITY AS SIGNED) , shipmentor0_.SHIPMENT_ORDER_ID , shipmentno7_.NOTE , shipmentor0_.EXTERNAL_REFERENCE_ID , shipmentlo10_.LOCATION_NAME , country11_.COUNTRY_NAME , postal14_.POSTAL_CODE , state12_.STATE_NAME , city13_.CITY_NAME , shipmentlo16_.LOCATION_NAME , country17_.COUNTRY_NAME , postal20_.POSTAL_CODE , state18_.STATE_NAME , city19_.CITY_NAME , shipmentor0_.IDENTITY
ORDER BY shipmentor0_.SHIPMENT_ORDER_ID DESC;
The indexes are mostly useless because of OR, as in
AND (carrierbil3_.IS_APPROVED = 0
OR carrierbil3_.IS_APPROVED IS NULL)
The simple way to fix that is to pick either 0 or NULL to represent the flag. Then make sure all the data is consistent, and change the WHERE to just check for the one case.
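As a rough sketch of that cleanup, the example below normalizes a nullable flag to 0 and then filters on the single value. It runs on SQLite through Python's sqlite3 module rather than MySQL, and the carrier_bills table is a hypothetical cut-down stand-in for CARRIER_BILLS. It only covers NULLs that are stored in the table; NULLs produced by an unmatched LEFT JOIN row are a separate matter.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical stand-in for CARRIER_BILLS with a nullable flag.
    CREATE TABLE carrier_bills (
        carrier_bill_id INTEGER PRIMARY KEY, is_approved INTEGER);
    INSERT INTO carrier_bills (is_approved) VALUES (0), (NULL), (1), (NULL);
""")

# Current shape: two representations of "not approved" force an OR.
before = conn.execute("""
    SELECT COUNT(*) FROM carrier_bills
     WHERE is_approved = 0 OR is_approved IS NULL
""").fetchone()[0]

# One-time cleanup: make 0 the only representation of "not approved".
conn.execute("UPDATE carrier_bills SET is_approved = 0 WHERE is_approved IS NULL")

# After the cleanup a single equality test selects the same rows.
after = conn.execute(
    "SELECT COUNT(*) FROM carrier_bills WHERE is_approved = 0"
).fetchone()[0]

print(before, after)
```

Declaring the column NOT NULL with a default afterwards would keep new rows consistent with the cleaned-up data.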
Do you really mean
CROSS JOIN
CLIENT client21_
That is likely to be a performance-killer and generate a huge resultset.
Never mind. You have the join condition in the WHERE clause. Please use ON for relations and WHERE for filtering.
WHERE
shipmentor0_.CLIENT_ID = client21_.CLIENT_ID
I see a mixture of LEFT JOIN and JOIN. Check that the LEFT JOINs really need to be LEFT; that is, that the 'right' table might actually have missing rows.
To discuss further, please provide EXPLAIN SELECT ....
Eschew over-normalization:
You have 5 tables to describe a location (name, country, postal, state, city). Instead, I recommend a single table with those 5 columns. This, alone, would get rid of 8 JOINs.
CAST(carrier4_.IDENTITY AS SIGNED) -- Can't you fix the datatype to be SIGNED, or allow the value to be UNSIGNED?
But perhaps the main performance-killer is the "explode-implode" syndrome. First, it does a lot of JOINs, building a huge intermediate table, then it collapses that by doing GROUP BY. The remedy is
SELECT ...
FROM ( SELECT SUM(...), SUM(...) FROM ... GROUP BY ... ) AS a
JOIN ((whatever else is needed));
That is, first devise a minimal "derived table" that does the GROUP BY (and/or ORDER BY and/or LIMIT). Then see what else is needed to complete the query (namely all the normalization lookups).
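The following sketch illustrates the derived-table pattern on toy data. It uses SQLite through Python's sqlite3 module instead of MySQL, and the three tables are hypothetical, cut-down versions of the ones in the question. The first query joins everything and then groups; the second aggregates in a small derived table first and joins the lookup afterwards.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical, cut-down versions of the tables in the question.
    CREATE TABLE shipment_order (shipment_order_id INTEGER PRIMARY KEY, ref TEXT);
    CREATE TABLE shipment_cost (shipment_order_id INTEGER, cost REAL);
    CREATE TABLE shipment_notes (shipment_order_id INTEGER, note TEXT);
    INSERT INTO shipment_order VALUES (1, 'A'), (2, 'B');
    INSERT INTO shipment_cost VALUES (1, 10), (1, 5), (2, 7);
    INSERT INTO shipment_notes VALUES (1, 'n1'), (1, 'n2');
""")

# "Explode-implode": join everything, then collapse with GROUP BY.
# Order 1 has 2 cost rows x 2 note rows = 4 intermediate rows.
exploded = conn.execute("""
    SELECT o.shipment_order_id, SUM(c.cost)
      FROM shipment_order AS o
      LEFT JOIN shipment_cost AS c ON c.shipment_order_id = o.shipment_order_id
      LEFT JOIN shipment_notes AS n ON n.shipment_order_id = o.shipment_order_id
     GROUP BY o.shipment_order_id
     ORDER BY o.shipment_order_id
""").fetchall()

# Aggregate first in a minimal derived table, then join the lookups to it.
aggregated = conn.execute("""
    SELECT o.shipment_order_id, o.ref, c.total_cost
      FROM (SELECT shipment_order_id, SUM(cost) AS total_cost
              FROM shipment_cost
             GROUP BY shipment_order_id) AS c
      JOIN shipment_order AS o ON o.shipment_order_id = c.shipment_order_id
     ORDER BY o.shipment_order_id DESC
""").fetchall()

print(exploded)
print(aggregated)
```

In this toy data the join-then-group version also doubles the total for order 1, because each cost row is repeated once per note. That is a side effect this query shape can have in general, on top of the larger intermediate result.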
After you have acted on most of my comments, we can discuss whether you have the optimal indexes. (It is premature to do so now.) If so, please start a new Question; it would be too much clutter to add to this one.
First of all, that's a lot of joins. However, the main reason your query is taking a significant time is that you're adding an ORDER BY clause. You need to figure out a way to avoid it, or maybe come up with a different strategy.
I have the following query
SELECT custconcompany, custconfirstname, custconlastname, custconemail, custconphone, shipaddress1, shipaddress2, shipcity, stateabbrv, shipzip, countryname, websitecheck.formfieldfieldvalue websitevalue, excludecheck.formfieldfieldvalue excludevalue
FROM obcisc_customers
JOIN ( (obcisc_shipping_addresses JOIN obcisc_countries
ON obcisc_shipping_addresses.shipcountryid = obcisc_countries.countryid)
LEFT JOIN obcisc_country_states
ON obcisc_shipping_addresses.shipstateid = obcisc_country_states.stateid
LEFT JOIN obcisc_formfieldsessions websitecheck
ON obcisc_shipping_addresses.shipformsessionid = websitecheck.formfieldsessioniformsessionid
LEFT JOIN obcisc_formfieldsessions excludecheck
ON obcisc_shipping_addresses.shipformsessionid = excludecheck.formfieldsessioniformsessionid)
ON obcisc_customers.customerid = obcisc_shipping_addresses.shipcustomerid
WHERE custgroupid = 11
AND websitecheck.formfieldfieldid = 24
AND excludecheck.formfieldfieldid = 30
AND excludecheck.formfieldfieldvalue != 'a:1:{i:0;s:3:"Yes";}'
ORDER BY shipstate, shipcity
This works great except I also need it to return rows where "excludecheck.formfieldfieldid=30" does not exist... right now it's not returning them
When writing a LEFT JOIN, any criteria on the table you're joining with should be put into the ON clause. If you put it into the WHERE clause, you'll filter out the results in the first table that don't have a matching row in the second table, because the NULL value that comes from the outer join will not match the criteria.
SELECT custconcompany, custconfirstname, custconlastname, custconemail, custconphone, shipaddress1, shipaddress2, shipcity, stateabbrv, shipzip, countryname, websitecheck.formfieldfieldvalue websitevalue, excludecheck.formfieldfieldvalue excludevalue
FROM obcisc_customers
JOIN obcisc_shipping_addresses
ON obcisc_customers.customerid = obcisc_shipping_addresses.shipcustomerid
JOIN obcisc_countries
ON obcisc_shipping_addresses.shipcountryid = obcisc_countries.countryid
LEFT JOIN obcisc_country_states
ON obcisc_shipping_addresses.shipstateid = obcisc_country_states.stateid
LEFT JOIN obcisc_formfieldsessions websitecheck
ON obcisc_shipping_addresses.shipformsessionid = websitecheck.formfieldsessioniformsessionid
AND websitecheck.formfieldfieldid = 24
LEFT JOIN obcisc_formfieldsessions excludecheck
ON obcisc_shipping_addresses.shipformsessionid = excludecheck.formfieldsessioniformsessionid
AND excludecheck.formfieldfieldid = 30
AND excludecheck.formfieldfieldvalue != 'a:1:{i:0;s:3:"Yes";}'
WHERE custgroupid = 11
ORDER BY shipstate, shipcity
Another way to do it is by putting (excludecheck.formfieldfieldid = 30 OR excludecheck.formfieldfieldid IS NULL) in the WHERE clause. But this is more verbose and also I believe it's harder for MySQL to optimize, especially if you have several tables you're joining like this.
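Here is a small sketch of the difference between filtering in WHERE and filtering in ON. It runs on SQLite through Python's sqlite3 module rather than MySQL, with hypothetical, shortened table and column names and plain 'Yes'/'No' values instead of the serialized strings. The third query is one possible way, under those assumptions, to also drop the rows whose field 30 value is 'Yes' while keeping rows that have no field 30 at all.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical, simplified tables; names are shortened for the example.
    CREATE TABLE addresses (id INTEGER PRIMARY KEY, session_id INTEGER);
    CREATE TABLE formfields (session_id INTEGER, field_id INTEGER, value TEXT);
    INSERT INTO addresses VALUES (1, 100), (2, 200), (3, 300);
    -- Session 200 has no row for field 30; session 300 answered 'Yes'.
    INSERT INTO formfields VALUES
        (100, 30, 'No'), (100, 24, 'x'), (200, 24, 'y'), (300, 30, 'Yes');
""")

# Criterion in WHERE: the NULL-extended row for address 2 is filtered out.
in_where = conn.execute("""
    SELECT a.id, f.value
      FROM addresses AS a
      LEFT JOIN formfields AS f ON f.session_id = a.session_id
     WHERE f.field_id = 30
     ORDER BY a.id
""").fetchall()

# Criterion in ON: address 2 is kept, with NULL for the missing field.
in_on = conn.execute("""
    SELECT a.id, f.value
      FROM addresses AS a
      LEFT JOIN formfields AS f
        ON f.session_id = a.session_id AND f.field_id = 30
     ORDER BY a.id
""").fetchall()

# Keep "field missing" rows but exclude rows whose value is 'Yes'.
excluded = conn.execute("""
    SELECT a.id, f.value
      FROM addresses AS a
      LEFT JOIN formfields AS f
        ON f.session_id = a.session_id AND f.field_id = 30
     WHERE f.value IS NULL OR f.value <> 'Yes'
     ORDER BY a.id
""").fetchall()

print(in_where)
print(in_on)
print(excluded)
```

A value test placed in the ON clause only decides whether a right-hand row matches; it never removes the left-hand row. That is why the last query keeps the value test in WHERE and allows NULL explicitly.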
I have the following SQL query
SELECT
DISTINCT
count("SiteTree_Live"."ID")
FROM
"SiteTree_Live"
LEFT JOIN "Page_Live" ON "Page_Live"."ID" = "SiteTree_Live"."ID"
LEFT JOIN "TourPage_Live" ON "TourPage_Live"."ID" = "SiteTree_Live"."ID"
LEFT JOIN "DepartureDate" ON "DepartureDate"."TourID" = "SiteTree_Live"."ID"
WHERE
("SiteTree_Live"."Locale" = 'en_AU')
AND ("SiteTree_Live"."ClassName" IN ('TourPage'))
AND ("DepartureDate"."DepartureDate" LIKE '2012-11%')
but it is producing a wrong count. The result this query is supposed to return should not be more than 245, but currently it is returning about 4569.
That is because of the JOIN on the "DepartureDate" table: the query returns the expected result when I remove that join.
What modification do I need to make to my query so that it matches "SiteTree_Live"."ID" against "DepartureDate"."TourID" while counting each "SiteTree_Live"."ID" only once, rather than once per departure date?
Any suggestions welcomed :)
THE ANSWER
SELECT
COUNT(DISTINCT SiteTree_Live.ID)
FROM
"SiteTree_Live" LEFT JOIN "Page_Live" ON "Page_Live"."ID" = "SiteTree_Live"."ID"
LEFT JOIN "TourPage_Live" ON "TourPage_Live"."ID" = "SiteTree_Live"."ID"
LEFT JOIN "DepartureDate" ON "DepartureDate"."TourID" = "SiteTree_Live"."ID"
WHERE
("SiteTree_Live"."Locale" = 'en_AU')
AND ("SiteTree_Live"."ClassName" IN ('TourPage'))
AND ("DepartureDate"."DepartureDate" LIKE '2013-03%')
Seems to give me the right result. Thanks for the tip, @Michael Berkowski.
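For anyone wondering why moving DISTINCT inside the COUNT matters, here is a minimal sketch on toy data. It uses SQLite through Python's sqlite3 module rather than MySQL, and the tours and departures tables are hypothetical simplifications of SiteTree_Live and DepartureDate.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical, simplified tables.
    CREATE TABLE tours (id INTEGER PRIMARY KEY);
    CREATE TABLE departures (tour_id INTEGER, departure_date TEXT);
    INSERT INTO tours VALUES (1), (2);
    INSERT INTO departures VALUES
        (1, '2013-03-05'), (1, '2013-03-12'), (1, '2013-03-19'),
        (2, '2013-03-07');
""")

join_sql = """
      FROM tours AS t
      LEFT JOIN departures AS d ON d.tour_id = t.id
     WHERE d.departure_date LIKE '2013-03%'
"""

# DISTINCT applied to the single result row: still counts every joined row.
row_count = conn.execute("SELECT DISTINCT COUNT(t.id)" + join_sql).fetchone()[0]

# DISTINCT applied inside the aggregate: counts each tour once.
tour_count = conn.execute("SELECT COUNT(DISTINCT t.id)" + join_sql).fetchone()[0]

print(row_count, tour_count)
```

SELECT DISTINCT de-duplicates the output rows, and an aggregate without GROUP BY only produces one row, so it has nothing to remove. COUNT(DISTINCT ...) de-duplicates the values before they are counted.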
Minor correction: if DepartureDate is a date type, then LIKE '2013-03%' will force it to be coerced into a character type (this is a MySQL feature). As a result, any indexes on DepartureDate will not be used, IIRC. Better to use a plain range query:
SELECT
COUNT(DISTINCT stl.ID)
FROM
SiteTree_Live stl
LEFT JOIN
DepartureDate dd ON dd.TourID = stl.ID
WHERE
stl.Locale = 'en_AU'
AND stl.ClassName = 'TourPage'
AND dd.DepartureDate >= '2013-03-01'
AND dd.DepartureDate < '2013-04-01'
;
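If the month comes from application code, the half-open range can be computed rather than typed by hand. The sketch below is one way to do that in Python; month_bounds is a hypothetical helper name, the departures table is a simplified stand-in, and the query runs on SQLite (which stores these dates as ISO text) rather than MySQL.

```python
import sqlite3
from datetime import date


def month_bounds(year, month):
    """Return (first day of the month, first day of the next month) as ISO strings."""
    start = date(year, month, 1)
    end = date(year + 1, 1, 1) if month == 12 else date(year, month + 1, 1)
    return start.isoformat(), end.isoformat()


conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical, simplified table with an index on the date column.
    CREATE TABLE departures (tour_id INTEGER, departure_date TEXT);
    CREATE INDEX idx_departure_date ON departures (departure_date);
    INSERT INTO departures VALUES
        (1, '2013-02-28'), (1, '2013-03-01'), (2, '2013-03-31'), (3, '2013-04-01');
""")

start, end = month_bounds(2013, 3)

# Half-open range: includes the first day of the month, excludes the next month.
in_march = conn.execute(
    "SELECT COUNT(DISTINCT tour_id) FROM departures "
    "WHERE departure_date >= ? AND departure_date < ?",
    (start, end),
).fetchone()[0]

print(start, end, in_march)
```

Using >= for the lower bound and < for the upper bound avoids having to know the last day of the month, and it stays correct if the column later gains a time component.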
Do this (you have a bunch of unneeded joins):
SELECT
COUNT(DISTINCT SiteTree_Live.ID)
FROM
`SiteTree_Live`
LEFT JOIN
`DepartureDate` ON `DepartureDate`.`TourID` = `SiteTree_Live`.`ID`
WHERE
`SiteTree_Live`.`Locale` = 'en_AU'
AND `SiteTree_Live`.`ClassName` = 'TourPage'
AND `DepartureDate`.`DepartureDate` LIKE '2013-03%'
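You could also do a GROUP BY. Note that this returns one row per SiteTree_Live.ID (each holding the number of matching departure dates), so the answer is the number of rows returned: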
SELECT
COUNT(SiteTree_Live.ID)
FROM
`SiteTree_Live`
LEFT JOIN
`DepartureDate` ON `DepartureDate`.`TourID` = `SiteTree_Live`.`ID`
WHERE
`SiteTree_Live`.`Locale` = 'en_AU'
AND `SiteTree_Live`.`ClassName` = 'TourPage'
AND `DepartureDate`.`DepartureDate` LIKE '2013-03%'
GROUP BY
SiteTree_Live.ID
I need to run a second lookup inside one query. So far I did it like this and then grouped by the userid field, which works. But without the grouping it returns far too many rows, and I was wondering whether all of those rows are actually fetched first and only then collapsed by the grouping, which would put extra load on the MySQL server.
SELECT mn.userid, user_table.first_name, user_table.last_name, employer_info.emp_name, emp2.emp_name AS emp2name
FROM main as mn
LEFT JOIN position_info ON position_info.pos_id = mn.position
LEFT JOIN employer_info ON employer_info.emp_id = position_info.emp_id
LEFT JOIN position_info AS pos2 ON pos2.pos_id = mn.position2
LEFT JOIN employer_info AS emp2 ON emp2.emp_id = pos2.emp_id
WHERE mn.type = 31 or mn.type = 3
GROUP BY mn.userid
Would this way of building the query be more resource friendly?
SELECT mn.userid, user_table.first_name, user_table.last_name, employer_info.emp_name, emp2.emp_name AS emp2name
FROM main as mn
LEFT JOIN position_info ON position_info.pos_id = mn.position
LEFT JOIN employer_info ON employer_info.emp_id = position_info.emp_id
LEFT JOIN employer_info AS emp2 ON emp2.emp_id = (
SELECT emp_id FROM position_info WHERE pos_id = mn.position2
)
WHERE mn.type = 31 or mn.type = 3
GROUP BY mn.userid
The two queries are almost the same length, but the second returns far fewer rows when not grouped. So is it better to do it the first or the second way?
P.S. Don't pay attention to the details of the code; that's not the question.