I have been working on optimizing the following query. I have already optimized it as far as I can on my side; can it be optimized further?
select distinct name ad_type
from dim_ad_type x where exists ( select 1
from sum_adserver_dimensions sum
left join dim_ad_tag_map on dim_ad_tag_map.id=sum.ad_tag_map_id and dim_ad_tag_map.client_id=sum.client_id
left join dim_site on dim_site.id = dim_ad_tag_map.site_id
left join dim_geo on dim_geo.id = sum.geo_id
left join dim_region on dim_region.id=dim_geo.region_id
left join dim_device_category on dim_device_category.id=sum.device_category_id
left join dim_ad_unit on dim_ad_unit.id=dim_ad_tag_map.ad_unit_id
left join dim_monetization_channel on dim_monetization_channel.id=dim_ad_tag_map.monetization_channel_id
left join dim_os on dim_os.id = sum.os_id
left join dim_ad_type on dim_ad_type.id = dim_ad_tag_map.ad_type_id
left join dim_integration_type on dim_integration_type.id = dim_ad_tag_map.integration_type_id
where sum.client_id = 50
and dim_ad_type.id=x.id
)
order by 1
Your query, although the joins are correct, is bloated overall. You use the dim_ad_type table on the outside just to check that it exists on the inside as well. All those left joins have NO bearing on the final outcome, so why are they even there? I would simplify by reversing the logic. Tracing your INNER query back to the same dim_ad_type table, I find the direct line is sum -> dim_ad_tag_map -> dim_ad_type. Just run that.
select distinct
dat.name Ad_Type
from
sum_adserver_dimensions sum
join dim_ad_tag_map tm
on sum.ad_tag_map_id = tm.id
and sum.client_id = tm.client_id
join dim_ad_type dat
on tm.ad_type_id = dat.id
where
sum.client_id = 50
order by
1
Your query was running through ALL dim_ad_types, then finding all the sums just to find those that matched. Run it directly instead: start with the one client, then go straight through with JOINs.
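To check on a small scale that the two forms return the same rows, here is a minimal sketch using Python's sqlite3 module. The sample rows and the trimmed-down column lists are assumptions for illustration, not the real schema, and a toy data set says nothing about the relative speed on the production tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical, trimmed-down versions of the three tables on the direct path.
conn.executescript("""
    CREATE TABLE dim_ad_type (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE dim_ad_tag_map (id INTEGER, client_id INTEGER, ad_type_id INTEGER);
    CREATE TABLE sum_adserver_dimensions (ad_tag_map_id INTEGER, client_id INTEGER);

    INSERT INTO dim_ad_type VALUES (1, 'banner'), (2, 'video'), (3, 'native');
    INSERT INTO dim_ad_tag_map VALUES (10, 50, 1), (11, 50, 2), (12, 99, 3);
    INSERT INTO sum_adserver_dimensions VALUES (10, 50), (10, 50), (11, 50), (12, 99);
""")

# Original shape: outer dim_ad_type probed with a correlated EXISTS.
exists_form = """
    SELECT DISTINCT name AS ad_type
    FROM dim_ad_type x
    WHERE EXISTS (
        SELECT 1
        FROM sum_adserver_dimensions s
        LEFT JOIN dim_ad_tag_map tm
               ON tm.id = s.ad_tag_map_id AND tm.client_id = s.client_id
        LEFT JOIN dim_ad_type dat ON dat.id = tm.ad_type_id
        WHERE s.client_id = 50 AND dat.id = x.id
    )
    ORDER BY 1
"""

# Simplified shape: start from the client's rows and join straight through.
join_form = """
    SELECT DISTINCT dat.name AS ad_type
    FROM sum_adserver_dimensions s
    JOIN dim_ad_tag_map tm
      ON s.ad_tag_map_id = tm.id AND s.client_id = tm.client_id
    JOIN dim_ad_type dat ON tm.ad_type_id = dat.id
    WHERE s.client_id = 50
    ORDER BY 1
"""

exists_rows = conn.execute(exists_form).fetchall()
join_rows = conn.execute(join_form).fetchall()
print(exists_rows)
print(join_rows)
```

Both queries list only the ad types reachable from client 50's rows. On the real data, comparing the two EXPLAIN plans is the way to confirm which one the optimizer handles better.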
I have 7 tables to work with inside a query:
tb_posts, tb_spots, users, tb_sports, tb_spot_types, tb_users_sports, tb_post_media
This is the query I am using:
SELECT po.id_post AS id_post,
po.description_post as description_post,
sp.id_spot as id_spot,
po.date_post as date_post,
u.id AS userid,
u.user_type As tipousuario,
u.username AS username,
spo.id_sport AS sportid,
spo.sport_icon as sporticon,
st.logo_spot_type as spottypelogo,
sp.city_spot AS city_spot,
sp.country_spot AS country_spot,
sp.latitud_spot as latitudspot,
sp.longitud_spot as longitudspot,
sp.short_name AS spotshortname,
sp.verified_spot AS spotverificado,
u.profile_image AS profile_image,
sp.verified_spot_by as spotverificadopor,
uv.id AS spotverificador,
uv.user_type AS spotverificadornivel,
pm.media_type AS mediatype,
pm.media_file AS mediafile,
GROUP_CONCAT(tus.user_sport_sport) sportsdelusuario,
GROUP_CONCAT(logosp.sport_icon) sportsdelusuariologos,
GROUP_CONCAT(pm.media_file) mediapost,
GROUP_CONCAT(pm.media_type) mediaposttype
FROM tb_posts po
LEFT JOIN tb_spots sp ON po.spot_post = sp.id_spot
LEFT JOIN users u ON po.uploaded_by_post = u.id
LEFT JOIN tb_sports spo ON sp.sport_spot = spo.id_sport
LEFT JOIN tb_spot_types st ON sp.type_spot = st.id_spot_type
LEFT JOIN users uv ON sp.verified_spot_by = uv.id
LEFT JOIN tb_users_sports tus ON tus.user_sport_user = u.id
LEFT JOIN tb_sports logosp ON logosp.id_sport = tus.user_sport_sport
LEFT JOIN tb_post_media pm ON pm.media_post = po.id_post
WHERE po.status = 1
GROUP BY po.id_post,uv.id
I am having problems with some of the GROUP_CONCAT groups:
GROUP_CONCAT(tus.user_sport_sport) sportsdelusuario is giving me the right items but repeated, all items twice
GROUP_CONCAT(logosp.sport_icon) sportsdelusuariologos is giving me the right items but repeated, all items twice
GROUP_CONCAT(pm.media_file) mediapost is giving me the right items but repeated four times
GROUP_CONCAT(pm.media_type) mediaposttype is giving me the right items but repeated four times
I can put here all tables structures if you need them.
Multiple one-to-many relations JOINed in one query have a multiplicative effect on aggregation results: each row from one child table is repeated once for every matching row from the other. The standard solution is subqueries:
You can change
GROUP_CONCAT(pm.media_type) mediaposttype
...
LEFT JOIN tb_post_media pm ON pm.media_post = po.id_post
to
pm.mediaposttype
...
LEFT JOIN (
SELECT media_post, GROUP_CONCAT(media_type) AS mediaposttype
FROM tb_post_media
GROUP BY media_post
) AS pm ON pm.media_post = po.id_post
If tb_post_media is very big, and the po.status = 1 condition in the outer query would significantly reduce the results of the subquery, it can be worth replicating the original join within the subquery to filter down its results.
Similarly, the correlated version I mentioned in the comments can also be more performant if the outer query has relatively few results. (Calculating the GROUP_CONCAT() for each row individually can cost less than calculating it for all rows at once if you would only actually use very few of the results of the latter.)
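Or just add DISTINCT to all the GROUP_CONCAT calls, e.g., GROUP_CONCAT(DISTINCT pm.media_type). Note that DISTINCT also collapses values that are legitimately repeated, for example two media files of the same type.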
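The fan-out and the subquery fix can be reproduced on a toy schema. The sketch below uses Python's sqlite3 module, and the tables post, post_media and post_tag are hypothetical stand-ins for the real ones. SQLite does not guarantee the order of values inside GROUP_CONCAT, so the results are sorted before printing.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical tables: one post with two media rows and three tag rows.
conn.executescript("""
    CREATE TABLE post (id INTEGER PRIMARY KEY);
    CREATE TABLE post_media (post_id INTEGER, media_file TEXT);
    CREATE TABLE post_tag (post_id INTEGER, tag TEXT);

    INSERT INTO post VALUES (1);
    INSERT INTO post_media VALUES (1, 'a.jpg'), (1, 'b.jpg');
    INSERT INTO post_tag VALUES (1, 'surf'), (1, 'skate'), (1, 'ski');
""")

# Two one-to-many joins: 2 media rows x 3 tag rows = 6 joined rows per post,
# so every media file shows up three times in the aggregate.
fanned_out = conn.execute("""
    SELECT COUNT(*), GROUP_CONCAT(m.media_file)
    FROM post p
    LEFT JOIN post_media m ON m.post_id = p.id
    LEFT JOIN post_tag t ON t.post_id = p.id
    GROUP BY p.id
""").fetchone()

# Aggregating each child table in its own derived table avoids the fan-out.
pre_aggregated = conn.execute("""
    SELECT m.files, t.tags
    FROM post p
    LEFT JOIN (SELECT post_id, GROUP_CONCAT(media_file) AS files
               FROM post_media GROUP BY post_id) m ON m.post_id = p.id
    LEFT JOIN (SELECT post_id, GROUP_CONCAT(tag) AS tags
               FROM post_tag GROUP BY post_id) t ON t.post_id = p.id
""").fetchone()

joined_row_count = fanned_out[0]
repeated_files = sorted(fanned_out[1].split(","))
files = sorted(pre_aggregated[0].split(","))
tags = sorted(pre_aggregated[1].split(","))
print(joined_row_count, repeated_files)
print(files, tags)
```

The first query sees six joined rows for the single post, while the second returns each file and each tag exactly once.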
I'm trying to create a SQL query that uses one table to count the number of blade servers our company has in each chassis and groups those, while joining it with chassis information from another table.
However, one of the chassis has no blades in it, so its name does not appear in the blade inventory table. Using an INNER JOIN creates a table that doesn't contain that chassis in any capacity. A LEFT JOIN achieves the same effect, but a RIGHT JOIN gives me an extra row with a null value for the chassis name.
I'm guessing this is because the non-existence of that chassis name in the first table is being given precedence over the second, but I'm not sure how to correct that. My query, as of now, looks like this:
SELECT e.EnclosureName, e.PDUName, q.Blades, r.Serial#
FROM bladeinventory.table e JOIN
(
SELECT EnclosureName,COUNT(*) Blades
FROM bladeinventory.table
GROUP BY EnclosureName
) q ON e.EnclosureName = q.EnclosureName
LEFT JOIN chassisinventory.table r
ON e.EnclosureName = r.EnclosureName
GROUP BY e.EnclosureName, e.PDUName, q.Blades, r.Serial#
Is it possible to edit this in such a way that the name of the chassis with 0 blades is actually generated by the query?
Just pull the name from the chassisinventory table. I'll use coalesce(), just in case you switch the order of the joins (again):
SELECT COALESCE(r.EnclosureName, e.EnclosureName) as EnclosureName, e.PDUName, q.Blades, r.Serial#
FROM bladeinventory.table e JOIN
(SELECT EnclosureName,COUNT(*) Blades
FROM bladeinventory.table
GROUP BY EnclosureName
) q
ON e.EnclosureName = q.EnclosureName LEFT JOIN
chassisinventory.table r
ON e.EnclosureName = r.EnclosureName
GROUP BY COALESCE(r.EnclosureName, e.EnclosureName), e.PDUName, q.Blades, r.Serial#;
You can also use the code below, which starts from the chassis table and uses CASE to turn a missing blade count into 0; it is simpler and just as effective:
SELECT e.EnclosureName, r.PDUName,
case when q.Blades IS NULL then 0
else q.Blades end Blades,
e.Serial#
FROM chassisinventory.table e
LEFT OUTER JOIN bladeinventory.table r on e.EnclosureName = r.EnclosureName
LEFT OUTER JOIN (SELECT EnclosureName,COUNT(*) Blades
FROM bladeinventory.table
GROUP BY EnclosureName
) q on e.EnclosureName = q.EnclosureName
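The core idea in both answers is to drive the query from the chassis table so that an empty chassis still produces a row. A minimal sketch with Python's sqlite3 module follows; the table names chassis and blade, the Serial column, and the sample rows are assumptions for illustration, and PDUName is left out to keep it short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical stand-ins for chassisinventory.table and bladeinventory.table.
conn.executescript("""
    CREATE TABLE chassis (EnclosureName TEXT PRIMARY KEY, Serial TEXT);
    CREATE TABLE blade (BladeName TEXT, EnclosureName TEXT);

    INSERT INTO chassis VALUES ('enc-a', 'S1'), ('enc-b', 'S2'), ('enc-empty', 'S3');
    INSERT INTO blade VALUES ('b1', 'enc-a'), ('b2', 'enc-a'), ('b3', 'enc-b');
""")

# Start from chassis (every enclosure appears once), then LEFT JOIN the
# per-enclosure blade counts; COALESCE turns a missing count into 0.
rows = conn.execute("""
    SELECT c.EnclosureName, c.Serial, COALESCE(q.Blades, 0) AS Blades
    FROM chassis c
    LEFT JOIN (SELECT EnclosureName, COUNT(*) AS Blades
               FROM blade
               GROUP BY EnclosureName) q ON q.EnclosureName = c.EnclosureName
    ORDER BY c.EnclosureName
""").fetchall()
print(rows)
```

Because the join goes to the pre-aggregated counts rather than to the raw blade rows, each chassis appears exactly once, including the empty one.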
My aim is to do exactly what a LEFT OUTER JOIN is meant to do in the 4th Venn diagram of the usual SQL join diagrams, i.e. return the rows of the left table that have no match in the right table.
My query isn't returning any rows at all, when in fact it should be returning every row in Consultant_Memberships minus the one that is stored in Consultant_Memberships_List.
Please see the SQL Fiddle for an easier understanding:
SELECT *
FROM consultant_memberships
LEFT OUTER JOIN consultant_memberships_list
ON consultant_memberships.`id` =
consultant_memberships_list.membership_id
WHERE consultant_memberships_list.consultant_id = $id
AND consultant_memberships_list.membership_id IS NULL
The query is using '5' as an ID for demonstration purposes to try and pick out the correct rows.
Your current query is basically doing an INNER JOIN because of the consultant_id = 5 condition in the WHERE clause. I believe you actually want to use:
SELECT *
FROM consultant_memberships m
LEFT OUTER JOIN consultant_memberships_list l
ON m.`id` = l.membership_id
AND l.consultant_id = 5
WHERE l.membership_id IS NULL;
See SQL Fiddle with Demo
Use
SELECT *
FROM consultant_memberships
LEFT Outer JOIN consultant_memberships_list
ON consultant_memberships_list.membership_id = consultant_memberships.`id`
and consultant_memberships_list.consultant_id = 5
where consultant_memberships_list.membership_id IS NULL;
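The WHERE condition in your original query, "consultant_memberships_list.consultant_id = 5", was defeating the LEFT OUTER JOIN: it rejects the unmatched rows, whose consultant_id is NULL, so the join behaves like an INNER JOIN.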
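The difference between filtering in WHERE and filtering in ON is easy to see on a few rows. The following sketch uses Python's sqlite3 module; the title column and the sample data are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE consultant_memberships (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE consultant_memberships_list (consultant_id INTEGER, membership_id INTEGER);

    INSERT INTO consultant_memberships VALUES (1, 'gold'), (2, 'silver'), (3, 'bronze');
    -- Consultant 5 holds membership 1; consultant 7 holds membership 2.
    INSERT INTO consultant_memberships_list VALUES (5, 1), (7, 2);
""")

# Filter in WHERE: unmatched rows have consultant_id NULL, so "= 5" rejects
# them, and the matched rows fail the IS NULL test. Nothing survives.
where_filter = conn.execute("""
    SELECT m.id
    FROM consultant_memberships m
    LEFT OUTER JOIN consultant_memberships_list l ON m.id = l.membership_id
    WHERE l.consultant_id = 5 AND l.membership_id IS NULL
""").fetchall()

# Filter in ON: only consultant 5's rows can match, and every membership
# without such a match is kept with NULLs, which IS NULL then selects.
on_filter = conn.execute("""
    SELECT m.id
    FROM consultant_memberships m
    LEFT OUTER JOIN consultant_memberships_list l
           ON m.id = l.membership_id AND l.consultant_id = 5
    WHERE l.membership_id IS NULL
    ORDER BY m.id
""").fetchall()

print(where_filter)
print(on_filter)
```

The first query returns no rows, matching the behaviour described in the question; the second returns memberships 2 and 3, the ones consultant 5 does not hold.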
I want to replace the subquery with a join, if possible.
SELECT `fftenant_farmer`.`person_ptr_id`, `fftenant_surveyanswer`.`text_value`
FROM `fftenant_farmer`
INNER JOIN `fftenant_person`
ON (`fftenant_farmer`.`person_ptr_id` = `fftenant_person`.`id`)
LEFT OUTER JOIN `fftenant_surveyanswer`
ON fftenant_surveyanswer.surveyquestion_id = 1
AND fftenant_surveyanswer.`surveyresult_id` IN (SELECT y.`surveyresult_id` FROM `fftenant_farmer_surveyresults` y WHERE y.farmer_id = `fftenant_farmer`.`person_ptr_id`)
I tried:
SELECT `fftenant_farmer`.`person_ptr_id`, `fftenant_surveyanswer`.`text_value`#, T5.`text_value`
FROM `fftenant_farmer`
INNER JOIN `fftenant_person`
ON (`fftenant_farmer`.`person_ptr_id` = `fftenant_person`.`id`)
LEFT OUTER JOIN `fftenant_farmer_surveyresults`
ON (`fftenant_farmer`.`person_ptr_id` = `fftenant_farmer_surveyresults`.`farmer_id`)
LEFT OUTER JOIN `fftenant_surveyanswer`
ON (`fftenant_farmer_surveyresults`.`surveyresult_id` = `fftenant_surveyanswer`.`surveyresult_id`)
AND fftenant_surveyanswer.surveyquestion_id = 1
But that gave me one record per farmer per survey result for that farmer. I only want one record per farmer as returned by the first query.
A join may be faster on most RDBMSs, but the real reason I asked this question is that I just can't seem to formulate a join to replace the subquery, and I want to know if it's even possible.
You could use DISTINCT or GROUP BY, as mvds and Brilliand suggest, but I think it's closer to the query's design intent to change the last join to an inner join and elevate its precedence, so that it is evaluated before the outer join:
SELECT farmer.person_ptr_id, surveyanswer.text_value
FROM fftenant_farmer AS farmer
INNER
JOIN fftenant_person AS person
ON person.id = farmer.person_ptr_id
LEFT
OUTER
JOIN
( fftenant_farmer_surveyresults AS farmer_surveyresults
INNER
JOIN fftenant_surveyanswer AS surveyanswer
ON surveyanswer.surveyresult_id = farmer_surveyresults.surveyresult_id
AND surveyanswer.surveyquestion_id = 1
)
ON farmer_surveyresults.farmer_id = farmer.person_ptr_id
Broadly speaking, this will end up giving the same results as the DISTINCT or GROUP BY approach, but in a more principled, less ad hoc way, IMHO.
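To see what elevating the inner join changes, the sketch below compares the chained LEFT JOINs with the inner-join-first form on a toy data set, using Python's sqlite3 module. The shortened table names and the sample rows are assumptions, and the grouped join is written as a derived table, which is intended to be equivalent to the parenthesized join above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical, shortened versions of the fftenant_* tables.
conn.executescript("""
    CREATE TABLE farmer (id INTEGER PRIMARY KEY);
    CREATE TABLE farmer_surveyresults (farmer_id INTEGER, surveyresult_id INTEGER);
    CREATE TABLE surveyanswer (surveyresult_id INTEGER, surveyquestion_id INTEGER, text_value TEXT);

    INSERT INTO farmer VALUES (1), (2);
    -- Farmer 1 has two survey results; only result 100 answers question 1.
    INSERT INTO farmer_surveyresults VALUES (1, 100), (1, 101);
    INSERT INTO surveyanswer VALUES (100, 1, 'maize'), (101, 2, 'n/a');
""")

# Chained LEFT JOINs: one row per farmer per survey result, including
# results that have no answer to question 1.
chained = conn.execute("""
    SELECT f.id, sa.text_value
    FROM farmer f
    LEFT JOIN farmer_surveyresults fs ON fs.farmer_id = f.id
    LEFT JOIN surveyanswer sa
           ON sa.surveyresult_id = fs.surveyresult_id AND sa.surveyquestion_id = 1
""").fetchall()

# Inner join first, then LEFT JOIN the combined result to farmer: survey
# results without a question-1 answer no longer produce rows of their own.
grouped = conn.execute("""
    SELECT f.id, x.text_value
    FROM farmer f
    LEFT JOIN (SELECT fs.farmer_id, sa.text_value
               FROM farmer_surveyresults fs
               JOIN surveyanswer sa
                 ON sa.surveyresult_id = fs.surveyresult_id
                AND sa.surveyquestion_id = 1) x ON x.farmer_id = f.id
    ORDER BY f.id
""").fetchall()

print(chained)
print(grouped)
```

Here the chained form yields three rows and the grouped form two. A farmer with several survey results that each answer question 1 would still get several rows in the grouped form, just as in the original subquery version.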
Use SELECT DISTINCT or GROUP BY to remove the duplicate entries.
Changing your attempt as little as possible:
SELECT DISTINCT `fftenant_farmer`.`person_ptr_id`, `fftenant_surveyanswer`.`text_value`#, T5.`text_value`
FROM `fftenant_farmer`
INNER JOIN `fftenant_person`
ON (`fftenant_farmer`.`person_ptr_id` = `fftenant_person`.`id`)
LEFT OUTER JOIN `fftenant_farmer_surveyresults`
ON (`fftenant_farmer`.`person_ptr_id` = `fftenant_farmer_surveyresults`.`farmer_id`)
LEFT OUTER JOIN `fftenant_surveyanswer`
ON (`fftenant_farmer_surveyresults`.`surveyresult_id` = `fftenant_surveyanswer`.`surveyresult_id`)
AND fftenant_surveyanswer.surveyquestion_id = 1
"the real reason I asked this question is I just can't seem to formulate a join to replace the subquery and I want to know if it's even possible"
Then consider a much simpler example to begin with e.g.
SELECT *
FROM T1
WHERE id IN (SELECT id FROM T2);
This is known as a semi join and if desired may be re-written using (among other possibilities) a JOIN with a SELECT clause to a) project only from the 'outer' table, and b) return only DISTINCT rows:
SELECT DISTINCT T1.*
FROM T1
JOIN T2 USING (id);
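A small sketch with Python's sqlite3 module shows why the DISTINCT matters. The label column and the sample rows are assumptions for illustration; T2 deliberately contains a duplicate id.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE T1 (id INTEGER, label TEXT);
    CREATE TABLE T2 (id INTEGER);

    INSERT INTO T1 VALUES (1, 'one'), (2, 'two'), (3, 'three');
    INSERT INTO T2 VALUES (1), (1), (3);   -- id 1 appears twice on purpose
""")

# Semi join written with IN: each T1 row appears at most once.
in_rows = conn.execute(
    "SELECT * FROM T1 WHERE id IN (SELECT id FROM T2) ORDER BY 1"
).fetchall()

# Plain join: the duplicate id in T2 duplicates the T1 row.
plain_join = conn.execute(
    "SELECT T1.* FROM T1 JOIN T2 USING (id) ORDER BY 1"
).fetchall()

# Join plus DISTINCT, projecting only T1's columns.
distinct_join = conn.execute(
    "SELECT DISTINCT T1.* FROM T1 JOIN T2 USING (id) ORDER BY 1"
).fetchall()

print(in_rows)
print(plain_join)
print(distinct_join)
```

The DISTINCT form matches the IN form here because the rows of T1 are themselves unique. If T1 held exact duplicate rows, DISTINCT would collapse them while IN would keep them.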
Hi, I am writing a query to get some product info, but something strange is going on: the first query returns its result set fast (0.1272 s), but the second (note that I just added one column) takes forever to complete (28-35 s). Does anyone know what is happening?
query 1
SELECT
p.partnumberp,
p.model,
p.descriptionsmall,
p.brandname,
sum(remainderint) stockint
from
inventario_dbo.inventoryindetails ind
left join purchaseorders.product p on (p.partnumberp = ind.partnumberp)
left join inventario_dbo.inventoryin ins on (ins.inventoryinid= ind.inventoryinid)
group by partnumberp, projectid
query 2
SELECT
p.partnumberp,
p.model,
p.descriptionsmall,
p.brandname,
p.descriptiondetail,
sum(remainderint) stockint
from
inventario_dbo.inventoryindetails inda
left join purchaseorders.product p on (p.partnumberp = inda.partnumberp)
left join inventario_dbo.inventoryin ins on (ins.inventoryinid= inda.inventoryinid)
group by partnumberp, projectid
You shouldn't group by some columns and then select other columns unless you use aggregate functions. Only p.partnumberp and sum(remainderint) make sense here. You're doing a huge join and select and then the results for most rows just end up getting discarded.
You can make the query much faster by doing an inner select first and then joining that to the remaining tables to get your final result for the last few columns.
The inner select should look something like this:
select p.partnumberp, projectid, sum(remainderint) stockint
from inventario_dbo.inventoryindetails ind
left join purchaseorders.product p on (p.partnumberp = ind.partnumberp)
left join inventario_dbo.inventoryin ins on (ins.inventoryinid = ind.inventoryinid)
group by partnumberp, projectid
After the join:
select T1.partnumberp, T1.projectid, p2.model, p2.descriptionsmall, p2.brandname, T1.stockint
from
(select p.partnumberp, projectid, sum(remainderint) stockint
from inventario_dbo.inventoryindetails ind
left join purchaseorders.product p on (p.partnumberp = ind.partnumberp)
left join inventario_dbo.inventoryin ins on (ins.inventoryinid = ind.inventoryinid)
group by partnumberp, projectid) T1
left join purchaseorders.product p2 on (p2.partnumberp = T1.partnumberp)
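The shape of that rewrite, aggregate first and only then join for the descriptive columns, can be sketched on a toy schema with Python's sqlite3 module. The simplified tables below are assumptions (projectid and the inventoryin join are left out), and a data set this small cannot show the speed difference, only that the totals come out right.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical, simplified versions of the product and detail tables.
conn.executescript("""
    CREATE TABLE product (partnumberp TEXT PRIMARY KEY, model TEXT, descriptiondetail TEXT);
    CREATE TABLE inventoryindetails (partnumberp TEXT, remainderint INTEGER);

    INSERT INTO product VALUES ('P1', 'M-1', 'long text 1'), ('P2', 'M-2', 'long text 2');
    INSERT INTO inventoryindetails VALUES ('P1', 5), ('P1', 7), ('P2', 1);
""")

# The derived table t groups only the narrow detail rows; the wide product
# columns are fetched once per part number after the aggregation.
rows = conn.execute("""
    SELECT t.partnumberp, p.model, p.descriptiondetail, t.stockint
    FROM (SELECT partnumberp, SUM(remainderint) AS stockint
          FROM inventoryindetails
          GROUP BY partnumberp) t
    LEFT JOIN product p ON p.partnumberp = t.partnumberp
    ORDER BY t.partnumberp
""").fetchall()
print(rows)
```

Adding a wide column such as descriptiondetail to the outer SELECT then only touches one product row per group instead of being carried through the whole grouping step.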
Is descriptiondetail a really large column? Sounds like it could be a lot of text compared to the other fields based on its name, so maybe it just takes a lot more time to read from disk, but if you could post the schema detail for the purchaseorders.product table or maybe the average length of that column that would help.
Otherwise I would try running the query a few times to see whether you consistently get the same timings. It could just be load on the database server at the time you got the slower result.