How to eliminate UNION for better performance in MySQL

Hi, I want to fetch records based on different conditions. I used UNION, which works fine but takes more than 15 seconds. How can we eliminate the UNION or make the query faster?
QUERY:
(SELECT p.professional_id,
p.company_name,
pbt.name AS professional_business_type_name,
pbtm.kukun_url,
p.kukun_score,
cc.year_founded,
p.contractor_category,
p.permit_data_count,
p.cost_range_code,
fpcr.cost_min_value,
fpcr.cost_max_value
FROM professional p
INNER JOIN company_contact cc
ON cc.company_contact_id = p.company_contact_id
INNER JOIN professional_business_type_map AS pbtm
ON pbtm.professional_id = p.professional_id
INNER JOIN of_professional_business_type_organization AS pbt
ON pbt.professional_business_type_organization_id =
pbtm.professional_business_type_organization_id
INNER JOIN f_professional_cost_range fpcr
ON fpcr.cost_range_code = p.cost_range_code
WHERE p.professional_id != 262100
AND cc.company_city_id = 5229
AND pbt.professional_business_type_organization_id = 2
AND p.cost_range_code = 4
ORDER BY p.kukun_score DESC
LIMIT 5)
UNION
(SELECT p.professional_id,
p.company_name,
pbt.name AS professional_business_type_name,
pbtm.kukun_url,
p.kukun_score,
cc.year_founded,
p.contractor_category,
p.permit_data_count,
p.cost_range_code,
fpcr.cost_min_value,
fpcr.cost_max_value
FROM professional p
INNER JOIN company_contact cc
ON cc.company_contact_id = p.company_contact_id
INNER JOIN professional_business_type_map AS pbtm
ON pbtm.professional_id = p.professional_id
INNER JOIN of_professional_business_type_organization AS pbt
ON pbt.professional_business_type_organization_id =
pbtm.professional_business_type_organization_id
INNER JOIN f_professional_cost_range fpcr
ON fpcr.cost_range_code = p.cost_range_code
WHERE p.professional_id != 262100
AND cc.company_city_id = 5229
AND pbt.professional_business_type_organization_id = 2
ORDER BY p.kukun_score DESC
LIMIT 5)
UNION
(SELECT p.professional_id,
p.company_name,
pbt.name AS professional_business_type_name,
pbtm.kukun_url,
p.kukun_score,
cc.year_founded,
p.contractor_category,
p.permit_data_count,
p.cost_range_code,
fpcr.cost_min_value,
fpcr.cost_max_value
FROM professional p
INNER JOIN company_contact cc
ON cc.company_contact_id = p.company_contact_id
INNER JOIN professional_business_type_map AS pbtm
ON pbtm.professional_id = p.professional_id
INNER JOIN of_professional_business_type_organization AS pbt
ON pbt.professional_business_type_organization_id =
pbtm.professional_business_type_organization_id
INNER JOIN f_professional_cost_range fpcr
ON fpcr.cost_range_code = p.cost_range_code
WHERE p.professional_id != 262100
AND cc.company_city_id = 5229
ORDER BY p.kukun_score DESC
LIMIT 5)
UNION
(SELECT p.professional_id,
p.company_name,
pbt.name AS professional_business_type_name,
pbtm.kukun_url,
p.kukun_score,
cc.year_founded,
p.contractor_category,
p.permit_data_count,
p.cost_range_code,
fpcr.cost_min_value,
fpcr.cost_max_value
FROM professional p
INNER JOIN company_contact cc
ON cc.company_contact_id = p.company_contact_id
INNER JOIN professional_business_type_map AS pbtm
ON pbtm.professional_id = p.professional_id
INNER JOIN of_professional_business_type_organization AS pbt
ON pbt.professional_business_type_organization_id =
pbtm.professional_business_type_organization_id
INNER JOIN f_professional_cost_range fpcr
ON fpcr.cost_range_code = p.cost_range_code
WHERE p.professional_id != 262100
AND cc.company_state_id = 5
ORDER BY p.kukun_score DESC
LIMIT 5)
LIMIT 5;

(Likely bug) You need ORDER BY p.kukun_score DESC before the UNION's final LIMIT 5; without it, nothing guarantees which 5 of the combined rows are returned. Today MySQL may perform the parts of the UNION sequentially, combine them, and then apply the LIMIT. A future version might, for example, run the SELECTs in parallel, thereby jumbling the results.
Hence, if you want the rows from the first SELECT to be delivered first, you must add a column and ORDER BY it.
( SELECT 1 AS sequence, ... )
UNION ALL
( SELECT 2 AS sequence, ... )
...
ORDER BY sequence, kukun_score DESC
LIMIT 5
Also, UNION is the same as UNION DISTINCT, which adds a de-dup pass to the operation. That is, the semantics require evaluating all the SELECTs.
These INDEXes may help:
cc: (company_state_id, company_contact_id, year_founded)
cc: (company_city_id, company_contact_id, year_founded)
fpcr: (cost_range_code, cost_max_value, cost_min_value)
Those indexes will be "covering" and optimal for the SQL you have.
Some benefit will come from moving fpcr out of the UNION. That is, first UNION all the other tables, then JOIN to fpcr to get the two columns from it. This will speed things up because it needs to reach into that table only 5 times, instead of thousands of times (however many rows are in the 4 temporary tables).
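The sequence-column idea above can be sketched on a toy table. This is only an illustration under stated assumptions: the table pro_demo, its columns city_id and state_id, and the alias tier are hypothetical stand-ins for the asker's schema, and the per-branch limit is 3 instead of 5 to keep the data small. Because the sequence column makes otherwise identical rows differ, the sketch uses UNION ALL and removes duplicates with GROUP BY, keeping the best (lowest) tier for each professional.

```sql
-- Hypothetical, simplified table; not the asker's real schema.
CREATE TABLE pro_demo (
  professional_id INT PRIMARY KEY,
  city_id         INT NOT NULL,
  state_id        INT NOT NULL,
  kukun_score     INT NOT NULL
);

INSERT INTO pro_demo VALUES
  (1, 5229, 5, 90),
  (2, 5229, 5, 70),
  (3, 9999, 5, 95),
  (4, 9999, 5, 60);

-- Tier 1: same city. Tier 2: same state (the fallback).
SELECT professional_id, kukun_score, MIN(sequence) AS tier
FROM (
    (SELECT 1 AS sequence, professional_id, kukun_score
       FROM pro_demo
      WHERE city_id = 5229
      ORDER BY kukun_score DESC
      LIMIT 3)
    UNION ALL
    (SELECT 2 AS sequence, professional_id, kukun_score
       FROM pro_demo
      WHERE state_id = 5
      ORDER BY kukun_score DESC
      LIMIT 3)
) AS tiers
GROUP BY professional_id, kukun_score   -- de-dup across tiers
ORDER BY tier, kukun_score DESC         -- explicit ordering before the final LIMIT
LIMIT 3;
-- Returns professionals 1 and 2 (tier 1), then 3 (tier 2).
```

In the real query, the JOIN to fpcr could be applied to this outer result, so that only the final few rows reach into that table.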

Related

MySQL: identify which table a row came from in double LEFT OUTER JOIN with UNION ALL

My below query works, but there are two things I want to get from the query that I don't know how to do.
How to tell which LEFT JOIN the final returned row is coming from?
Is it possible to also return the total count from each LEFT JOIN?
SELECT * FROM (
(SELECT ch.user_ID, ch.clID FROM clubHistory AS ch
LEFT OUTER JOIN clubRaffleWinners AS cr ON
ch.user_ID = cr.user_ID
AND cr.cID=1157
AND cr.rafID=18
AND cr.chDate1 = '2022-06-04'
WHERE ch.cID=1157
AND ch.crID=1001
AND ch.ceID=1167
AND ch.chDate = '2022-06-04'
AND cr.user_ID IS NULL
GROUP BY ch.user_ID )
UNION ALL
(SELECT cu.user_ID, cu.clID FROM clubUsers AS cu
LEFT OUTER JOIN clubRaffleWinners AS cr1 ON
cu.user_ID = cr1.user_ID
AND cr1.cID=1157
AND cr1.rafID=18
AND cr1.chDate1 = '2022-06-04'
WHERE cu.cID=1157
AND cu.crID=1001
AND cu.ceID=1167
AND cu.calDate = '2022-06-04'
AND cr1.user_ID IS NULL
GROUP BY cu.user_ID )
) as winner ORDER BY RAND() LIMIT 1 ;
In my two left join select statements I tried:
(SELECT ch.user_ID as chUserID, ch.clID FROM clubHistory AS ch
and
(SELECT cu.user_ID as cuUserID, cu.clID FROM clubUsers AS cu
But every single result, after dozens and dozens of tries, comes back as user_ID or chUserID. When I remove the ORDER BY RAND() LIMIT 1, the only two columns that come back are user_ID, clID or chUserID, clID, even though the combined result is the full list from both tables. Is this even possible?
And for #2 above, is it possible to extract the total counts from each LEFT JOIN, with and without the final ORDER BY RAND() LIMIT 1?
For #1, add an extra column containing a value that identifies which subquery of the UNION the row came from. (A UNION takes its column names from the first SELECT, so aliasing user_ID differently in each branch cannot tell the branches apart.)
SELECT * FROM (
(SELECT 'history' AS which, ch.user_ID, ch.clID FROM clubHistory AS ch
LEFT OUTER JOIN clubRaffleWinners AS cr ON
ch.user_ID = cr.user_ID
AND cr.cID=1157
AND cr.rafID=18
AND cr.chDate1 = '2022-06-04'
WHERE ch.cID=1157
AND ch.crID=1001
AND ch.ceID=1167
AND ch.chDate = '2022-06-04'
AND cr.user_ID IS NULL
GROUP BY ch.user_ID )
UNION ALL
(SELECT 'users' AS which, cu.user_ID, cu.clID FROM clubUsers AS cu
LEFT OUTER JOIN clubRaffleWinners AS cr1 ON
cu.user_ID = cr1.user_ID
AND cr1.cID=1157
AND cr1.rafID=18
AND cr1.chDate1 = '2022-06-04'
WHERE cu.cID=1157
AND cu.crID=1001
AND cu.ceID=1167
AND cu.calDate = '2022-06-04'
AND cr1.user_ID IS NULL
GROUP BY cu.user_ID )
) as winner ORDER BY RAND() LIMIT 1 ;
Please only ask one question at a time.
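Once a discriminator column such as which exists, one possible way to get per-branch totals is to group on it. The following is a minimal sketch, not part of the original answer: history_demo and users_demo are hypothetical stand-ins for the two filtered branches, with the LEFT JOIN filtering omitted for brevity.

```sql
-- Hypothetical stand-ins for the two branches of the UNION ALL.
CREATE TABLE history_demo (user_ID INT NOT NULL, clID INT NOT NULL);
CREATE TABLE users_demo   (user_ID INT NOT NULL, clID INT NOT NULL);

INSERT INTO history_demo VALUES (1, 10), (2, 10);
INSERT INTO users_demo   VALUES (3, 10);

-- Count the candidate rows contributed by each branch.
SELECT which, COUNT(*) AS total
FROM (
    SELECT 'history' AS which, user_ID, clID FROM history_demo
    UNION ALL
    SELECT 'users'   AS which, user_ID, clID FROM users_demo
) AS candidates
GROUP BY which;
-- history -> 2, users -> 1
```

The counts have to come from a separate query (or a subquery) without ORDER BY RAND() LIMIT 1, since that clause reduces the result to a single row.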

How can I optimize this mysql query

I would like to ask, how can I optimize this query:
select
h.jmeno hrac,
n1.url hrac_url,
t.nazev tym,
n2.url tym_url,
ss.pocet_zapasu zapasy,
ss.pocet_minut minuty,
s.celkem_golu goly,
s.zk,
s.ck
from
hraci h
left join
(
select
hrac_id,
tym_id,
count(minut_celkem) pocet_zapasu,
sum(minut_celkem) pocet_minut
from
statistiky_stridani ss
join
zapasy z
on z.id = ss.zapas_id
join
souteze s
on s.id = z.soutez_id
join
souteze_nazev sn
on sn.id = s.soutez_id
where
s.rocnik_id = 2
group by
hrac_id
) ss on ss.hrac_id = h.id
left join
(
select
hrac_id,
tym_id,
sum(typ_id = 1 or typ_id = 3) as celkem_golu,
sum(typ_id = 4) as zk,
sum(typ_id = 5) as ck
from
statistiky st
join
zapasy z
on z.id = st.zapas_id
join
souteze s
on s.id = z.soutez_id
join
souteze_nazev sn
on sn.id = s.soutez_id
where
s.rocnik_id = 2
group by
hrac_id
) s on s.hrac_id = h.id
join
navigace n1
on n1.id = h.nav_id
join
tymy t
on t.id = ss.tym_id
join
navigace n2
on n2.id = t.nav_id
order by
s.celkem_golu desc
limit
10
The query takes about 1.5 to 2 seconds. For example, table statistiky_stridani has about 500,000 rows and statistiky about 250,000 rows.
This returns EXPLAIN:
Thank you for your help
Don't use LEFT JOIN instead of JOIN unless you really need the empty rows.
Try to reformulate because JOIN ( SELECT ... ) JOIN ( SELECT ... ) optimizes poorly.
Please do not use the same alias (s) for two different tables; it confuses the reader.
Add the composite index INDEX(rocnik_id, soutez_id) to souteze.
LEFT JOIN ... JOIN ... -- Please add parentheses to show whether the JOIN should be before doing the LEFT JOIN or after:
either
FROM ...
LEFT JOIN ( ... JOIN ... )
or
FROM ( ... LEFT JOIN ... )
JOIN ...
It may make a big difference in how the Optimizer performs the query, which may change the speed.
There may be more suggestions; work through those and ask again (if it is still "too slow").
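As an illustration of the composite-index suggestion, here is a minimal sketch. The table souteze_demo is a hypothetical, cut-down version of souteze containing only the columns the query touches, and the index name idx_rocnik_soutez is made up.

```sql
-- Hypothetical, reduced copy of souteze.
CREATE TABLE souteze_demo (
  id        INT PRIMARY KEY,
  soutez_id INT NOT NULL,
  rocnik_id INT NOT NULL
);

-- Filter column first (rocnik_id), then the column used for the next join.
ALTER TABLE souteze_demo
  ADD INDEX idx_rocnik_soutez (rocnik_id, soutez_id);

-- Check whether the optimizer picks the index for the filter used in the query.
EXPLAIN
SELECT id, soutez_id
FROM souteze_demo
WHERE rocnik_id = 2;
```

Whether the optimizer actually uses the index depends on table size and statistics, so the EXPLAIN output should be checked against the real data.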

SQL - Multiple many-to-many relations filtering SELECT

These are my tables:
Cadastros (id, nome)
Convenios (id, nome)
Especialidades (id, nome)
Facilidades (id, nome)
And the join tables:
cadastros_convenios
cadastros_especialidades
cadastros_facilidades
The table I'm querying for: Cadastros
I'm using MySQL.
The system will allow the user to select multiple "Convenios", "Especialidades" and "Facilidades". Think of each of these tables as a different type of "tag". The user will be able to select multiple "tags" of each type.
What I want is to select only the results in Cadastros table that are related with ALL the "tags" from the 3 different tables provided. Please note it's not an "OR" relation. It should only return the row from Cadastros if it has a matching link table row for EVERY "tag" provided.
Here is what I have so far:
SELECT Cadastro.*, Convenio.* FROM Cadastros AS Cadastro
INNER JOIN cadastros_convenios AS CadastrosConvenio ON(Cadastro.id = CadastrosConvenio.cadastro_id)
INNER JOIN Convenios AS Convenio ON (CadastrosConvenio.convenio_id = Convenio.id AND Convenio.id IN(2,3))
INNER JOIN cadastros_especialidades AS CadastrosEspecialidade ON (Cadastro.id = CadastrosEspecialidade.cadastro_id)
INNER JOIN Especialidades AS Especialidade ON(CadastrosEspecialidade.especialidade_id = Especialidade.id AND Especialidade.id IN(1))
INNER JOIN cadastros_facilidades AS CadastrosFacilidade ON (Cadastro.id = CadastrosFacilidade.cadastro_id)
INNER JOIN Facilidades AS Facilidade ON(CadastrosFacilidade.facilidade_id = Facilidade.id AND Facilidade.id IN(1,2))
GROUP BY Cadastro.id
HAVING COUNT(*) = 5;
I'm using the HAVING clause to try to filter the results based on the number of times it shows (meaning the number of times it has been successfully "INNER JOINED"). So in every case, the count should be equal to the number of different filters I added. So if I add 3 different "tags", the count should be 3. If I add 5 different tags, the count should be 5 and so on. It works fine for a single relation (a single pair of inner joins). When I add the other 2 relations it starts to lose control.
EDIT
Here is something that I believe is working (thanks @Tomalak for pointing out the solution with sub-queries):
SELECT Cadastro.*, Convenio.*, Especialidade.*, Facilidade.* FROM Cadastros AS Cadastro
INNER JOIN cadastros_convenios AS CadastrosConvenio ON(Cadastro.id = CadastrosConvenio.cadastro_id)
INNER JOIN Convenios AS Convenio ON (CadastrosConvenio.convenio_id = Convenio.id)
INNER JOIN cadastros_especialidades AS CadastrosEspecialidade ON (Cadastro.id = CadastrosEspecialidade.cadastro_id)
INNER JOIN Especialidades AS Especialidade ON(CadastrosEspecialidade.especialidade_id = Especialidade.id)
INNER JOIN cadastros_facilidades AS CadastrosFacilidade ON (Cadastro.id = CadastrosFacilidade.cadastro_id)
INNER JOIN Facilidades AS Facilidade ON(CadastrosFacilidade.facilidade_id = Facilidade.id)
WHERE
(SELECT COUNT(*) FROM cadastros_convenios WHERE cadastro_id = Cadastro.id AND convenio_id IN(1, 2, 3)) = 3
AND
(SELECT COUNT(*) FROM cadastros_especialidades WHERE cadastro_id = Cadastro.id AND especialidade_id IN(3)) = 1
AND
(SELECT COUNT(*) FROM cadastros_facilidades WHERE cadastro_id = Cadastro.id AND facilidade_id IN(2, 3)) = 2
GROUP BY Cadastro.id
But I'm concerned about performance. It looks like these 3 sub-queries in the WHERE clause are gonna be over-executed...
Another solution
It joins subsequent tables only if the previous joins were a success (if no rows match one of the joins, the next joins are gonna be joining an empty result-set) (thanks @DRapp for this one)
SELECT STRAIGHT_JOIN
Cadastro.*
FROM
( SELECT Qualify1.cadastro_id
from
( SELECT cc1.cadastro_id
FROM cadastros_convenios cc1
WHERE cc1.convenio_id IN (1, 2, 3)
GROUP by cc1.cadastro_id
having COUNT(*) = 3 ) Qualify1
JOIN
( SELECT ce1.cadastro_id
FROM cadastros_especialidades ce1
WHERE ce1.especialidade_id IN( 3 )
GROUP by ce1.cadastro_id
having COUNT(*) = 1 ) Qualify2
ON (Qualify1.cadastro_id = Qualify2.cadastro_id)
JOIN
( SELECT cf1.cadastro_id
FROM cadastros_facilidades cf1
WHERE cf1.facilidade_id IN (2, 3)
GROUP BY cf1.cadastro_id
having COUNT(*) = 2 ) Qualify3
ON (Qualify2.cadastro_id = Qualify3.cadastro_id) ) FullSet
JOIN Cadastros AS Cadastro
ON FullSet.cadastro_id = Cadastro.id
INNER JOIN cadastros_convenios AS CC
ON (Cadastro.id = CC.cadastro_id)
INNER JOIN Convenios AS Convenio
ON (CC.convenio_id = Convenio.id)
INNER JOIN cadastros_especialidades AS CE
ON (Cadastro.id = CE.cadastro_id)
INNER JOIN Especialidades AS Especialidade
ON (CE.especialidade_id = Especialidade.id)
INNER JOIN cadastros_facilidades AS CF
ON (Cadastro.id = CF.cadastro_id)
INNER JOIN Facilidades AS Facilidade
ON (CF.facilidade_id = Facilidade.id)
GROUP BY Cadastro.id
Emphasis mine
"It should only return the row from Cadastros if it has a matching row for EVERY "tag" provided."
"where there is a matching row"-problems are easily solved with EXISTS.
EDIT After some clarification, I see that using EXISTS is not enough. Comparing the actual row counts is necessary:
SELECT
*
FROM
Cadastros c
WHERE
(SELECT COUNT(*) FROM cadastros_convenios WHERE cadastro_id = c.id AND convenio_id IN (2,3)) = 2
AND
(SELECT COUNT(*) FROM cadastros_especialidades WHERE cadastro_id = c.id AND especialidade_id IN (1)) = 1
AND
(SELECT COUNT(*) FROM cadastros_facilidades WHERE cadastro_id = c.id AND facilidade_id IN (1,2)) = 2
The indexes on the link tables should be (cadastro_id, tag id column) for this query, for example (cadastro_id, convenio_id) on cadastros_convenios.
Depending on the size of the tables, WHERE-based subqueries that run a test on every row can significantly hurt performance. I have restructured the query in a way that might help, but only you would be able to confirm. The premise is to start from a first derived table of distinct IDs that meet the first criterion, join that set to the next qualifying criterion, and then join to the final one. Once that has been determined, use that result to join to your main table and its subsequent links to get the details you are expecting. You also had an overall GROUP BY on the ID, which will eliminate all other nested entries found in the supporting detail tables.
All that said, let's take a look at this scenario. Start with the table that would be expected to have the lowest result set, and join it to the next and the next. If the IDs in cadastros_convenios that match all the criteria are IDs 1-100, great: we know that, at most, we'll have 100 IDs.
Now, these 100 entries are immediately joined to the 2nd qualifying criterion, which, say, matches only every other one. For simplicity, we are now matched on 50 of the 100.
Finally, join to the 3rd qualifier based on the 50 that qualified, and you get 30 entries. So, within these 3 queries you are now filtered down to 30 entries, with all the qualifying criteria handled up front. Now join to Cadastros and then the subsequent tables for the details, based only on the 30 that qualified.
Since your original query would eventually try every ID against the criteria, why not pre-qualify up front with one query, get just those that hit, and then move on?
SELECT STRAIGHT_JOIN
Cadastro.*,
C.*,
E.*,
F.*
FROM
( SELECT Qualify1.cadastro_id
from
( SELECT cc1.cadastro_id
FROM cadastros_convenios cc1
WHERE cc1.convenio_id IN (1, 2, 3)
GROUP by cc1.cadastro_id
having COUNT(*) = 3 ) Qualify1
JOIN
( SELECT ce1.cadastro_id
FROM cadastros_especialidades ce1
WHERE ce1.especialidade_id IN( 3 )
GROUP by ce1.cadastro_id
having COUNT(*) = 1 ) Qualify2
ON Qualify1.cadastro_id = Qualify2.cadastro_id
JOIN
( SELECT cf1.cadastro_id
FROM cadastros_facilidades cf1
WHERE cf1.facilidade_id IN (2, 3)
GROUP BY cf1.cadastro_id
having COUNT(*) = 2 ) Qualify3
ON Qualify2.cadastro_id = Qualify3.cadastro_id ) FullSet
JOIN Cadastros AS Cadastro
ON FullSet.cadastro_id = Cadastro.id
INNER JOIN cadastros_convenios AS CC
ON Cadastro.id = CC.cadastro_id
INNER JOIN Convenios AS C
ON CC.convenio_id = C.id
INNER JOIN cadastros_especialidades AS CE
ON Cadastro.id = CE.cadastro_id
INNER JOIN Especialidades AS E
ON CE.especialidade_id = E.id
INNER JOIN cadastros_facilidades AS CF
ON Cadastro.id = CF.cadastro_id
INNER JOIN Facilidades AS F
ON CF.facilidade_id = F.id
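The pre-qualification step can be tried in isolation. Below is a minimal sketch with two hypothetical link tables (the _demo names and the sample rows are assumptions, not the asker's data). It relies on each (cadastro_id, tag id) pair being unique in its link table, which the primary keys enforce here; without that guarantee, COUNT(DISTINCT ...) would be the safer test.

```sql
-- Hypothetical link tables; the composite primary keys prevent duplicate pairs.
CREATE TABLE cadastros_convenios_demo (
  cadastro_id INT NOT NULL,
  convenio_id INT NOT NULL,
  PRIMARY KEY (cadastro_id, convenio_id)
);
CREATE TABLE cadastros_facilidades_demo (
  cadastro_id   INT NOT NULL,
  facilidade_id INT NOT NULL,
  PRIMARY KEY (cadastro_id, facilidade_id)
);

INSERT INTO cadastros_convenios_demo VALUES
  (10, 1), (10, 2), (10, 3),   -- has all three convenios
  (20, 1), (20, 2),            -- missing convenio 3
  (30, 1), (30, 2), (30, 3);   -- has all three convenios
INSERT INTO cadastros_facilidades_demo VALUES
  (10, 2), (10, 3),            -- has both facilidades
  (20, 2), (20, 3),
  (30, 2);                     -- missing facilidade 3

-- Keep only the cadastros that have ALL requested tags of each type.
SELECT q1.cadastro_id
FROM ( SELECT cadastro_id
         FROM cadastros_convenios_demo
        WHERE convenio_id IN (1, 2, 3)
        GROUP BY cadastro_id
       HAVING COUNT(*) = 3 ) AS q1
JOIN ( SELECT cadastro_id
         FROM cadastros_facilidades_demo
        WHERE facilidade_id IN (2, 3)
        GROUP BY cadastro_id
       HAVING COUNT(*) = 2 ) AS q2
  ON q2.cadastro_id = q1.cadastro_id;
-- Returns only cadastro_id 10.
```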

How to simplify my SQL query

I have this query, but it takes about 15 seconds to finish. How can I simplify it to get the same result in less time? My problem is that I need all of this data at once.
SELECT * FROM (
SELECT c.client_id, c.client_name, c.client_bpm,
c.client_su_name, c.client_maxbpm, s.bpm,
s.timestamp, m.mesure_id, ms.currentT
FROM tbl_clients c, tbl_meting m, tbl_sensor_meting s,
tbl_magsens_meting ms
WHERE c.client_id = m.client_id
AND (m.mesure_id = s.id_mesure
OR m.mesure_id = ms.id_mesure)
AND m.live =1
ORDER BY s.timestamp DESC
) AS mesure
GROUP BY mesure.client_id
I think the problem may be the OR condition in your WHERE clause. You seem to be trying to join to one table or another, which you can't do. So I've replaced it with LEFT JOINs, so that when no related record exists in one of those tables, its columns simply come back as NULL.
I also took out your GROUP BY, as I don't think it was required.
SELECT c.client_id, c.client_name, c.client_bpm,
c.client_su_name, c.client_maxbpm, s.bpm,
s.timestamp, m.mesure_id, ms.currentT
FROM tbl_clients c
JOIN tbl_meting m ON m.client_id = c.client_id
LEFT JOIN tbl_sensor_meting s ON s.id_mesure = m.mesure_id
LEFT JOIN tbl_magsens_meting ms ON ms.id_mesure = m.mesure_id
WHERE m.live = 1
ORDER BY s.timestamp DESC
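To see how the LEFT JOIN behaves when a measurement has no sensor row, here is a minimal sketch. The tables meting_demo and sensor_demo are hypothetical, stripped-down versions of tbl_meting and tbl_sensor_meting.

```sql
-- Hypothetical, reduced tables.
CREATE TABLE meting_demo (mesure_id INT PRIMARY KEY, live TINYINT NOT NULL);
CREATE TABLE sensor_demo (id_mesure INT NOT NULL, bpm INT NOT NULL);

INSERT INTO meting_demo VALUES (1, 1), (2, 1);
INSERT INTO sensor_demo VALUES (1, 72);        -- measurement 2 has no sensor row

SELECT m.mesure_id, s.bpm
FROM meting_demo AS m
LEFT JOIN sensor_demo AS s ON s.id_mesure = m.mesure_id
WHERE m.live = 1
ORDER BY m.mesure_id;
-- (1, 72) and (2, NULL): the unmatched measurement is kept, with bpm NULL.
```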

MySQL LIMIT in a Correlated Subquery

I have a correlated subquery that will return a list of quantities, but I need the highest quantity, and only the highest. So I tried to introduce an order by and a LIMIT of 1 to achieve this, but MySQL throws an error stating it doesn't yet support limits in subqueries. Any thoughts on how to work around this?
SELECT Product.Name, ProductOption.Name, a.Qty, a.Price, SheetSize.UpgradeCost,
FinishType.Name, FinishOption.Name, FinishTierPrice.Qty, FinishTierPrice.Price
FROM `Product`
JOIN `ProductOption`
ON Product.idProduct = ProductOption.Product_idProduct
JOIN `ProductOptionTier` AS a
ON a.ProductOption_idProductOption = ProductOption.idProductOption
JOIN `PaperSize`
ON PaperSize.idPaperSize = ProductOption.PaperSize_idPaperSize
JOIN `SheetSize`
ON SheetSize.PaperSize_idPaperSize = PaperSize.idPaperSize
JOIN `FinishOption`
ON FinishOption.Product_idProduct = Product.idProduct
JOIN `FinishType`
ON FinishType.idFinishType = FinishOption.Finishtype_idFinishType
JOIN `FinishTierPrice`
ON FinishTierPrice.FinishOption_idFinishOption = FinishOption.idFinishOption
WHERE Product.idProduct = 1
AND FinishTierPrice.idFinishTierPrice IN (SELECT FinishTierPrice.idFinishTierPrice
FROM `FinishTierPrice`
WHERE FinishTierPrice.Qty <= a.Qty
ORDER BY a.Qty DESC
LIMIT 1)
This is a variation of the greatest-n-per-group problem that comes up frequently.
You want the single row from FinishTierPrice (call it p1), matching the FinishOption and with the greatest Qty, but still less than or equal to the Qty of the ProductOptionTier.
One way to do this is to try to match a second row (p2) from FinishTierPrice that would have the same FinishOption and a greater Qty. If no such row exists (use an outer join and test that it's NULL), then the row found by p1 is the greatest.
SELECT Product.Name, ProductOption.Name, a.Qty, a.Price, SheetSize.UpgradeCost,
FinishType.Name, FinishOption.Name, p1.Qty, p1.Price
FROM `Product`
JOIN `ProductOption`
ON Product.idProduct = ProductOption.Product_idProduct
JOIN `ProductOptionTier` AS a
ON a.ProductOption_idProductOption = ProductOption.idProductOption
JOIN `PaperSize`
ON PaperSize.idPaperSize = ProductOption.PaperSize_idPaperSize
JOIN `SheetSize`
ON SheetSize.PaperSize_idPaperSize = PaperSize.idPaperSize
JOIN `FinishOption`
ON FinishOption.Product_idProduct = Product.idProduct
JOIN `FinishType`
ON FinishType.idFinishType = FinishOption.Finishtype_idFinishType
JOIN `FinishTierPrice` AS p1
ON p1.FinishOption_idFinishOption = FinishOption.idFinishOption
AND p1.Qty <= a.Qty
LEFT OUTER JOIN `FinishTierPrice` AS p2
ON p2.FinishOption_idFinishOption = FinishOption.idFinishOption
AND p2.Qty <= a.Qty AND (p2.Qty > p1.Qty OR (p2.Qty = p1.Qty
AND p2.idFinishTierPrice > p1.idFinishTierPrice))
WHERE Product.idProduct = 1
AND p2.idFinishTierPrice IS NULL
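The same outer-join technique can be tried on a single toy table. This is a sketch under stated assumptions: tier_price_demo and its columns are hypothetical, and the literal 300 stands in for a.Qty from ProductOptionTier.

```sql
-- Hypothetical price tiers for one finish option.
CREATE TABLE tier_price_demo (
  id        INT PRIMARY KEY,
  option_id INT NOT NULL,
  qty       INT NOT NULL,
  price     DECIMAL(10,2) NOT NULL
);

INSERT INTO tier_price_demo VALUES
  (1, 7, 100, 10.00),
  (2, 7, 250,  8.00),
  (3, 7, 500,  6.00);

-- Find the tier with the greatest qty that is still <= 300.
SELECT p1.id, p1.qty, p1.price
FROM tier_price_demo AS p1
LEFT JOIN tier_price_demo AS p2
       ON p2.option_id = p1.option_id
      AND p2.qty <= 300
      AND (p2.qty > p1.qty
           OR (p2.qty = p1.qty AND p2.id > p1.id))  -- tie-break on id
WHERE p1.option_id = 7
  AND p1.qty <= 300
  AND p2.id IS NULL;     -- no better row exists, so p1 is the greatest
-- Returns the 250-unit tier (id 2).
```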