How can this SQL query be optimized? (Running on MySQL) - mysql

I need help rewriting an SQL query that takes 26 seconds to run on a MySQL server.
The query is:
select
c.countries_name,
c.country_id,
(SELECT
count(1)
FROM
`fav_country`
WHERE
`country_id`=c.country_id
and device_id='".$device_id."'
) as isFav,
c.image,
c.countries_iso_code,
s.country
from
station s
left join countries c on c.country_id=s.country
where
isactive=:isactive
group by
s.country
I have tried rewriting it with two left/right joins, but to no avail.
Basically, we have three tables (countries, fav_country and station); the common field is the country id (countries.country_id, fav_country.country_id and station.country).
Thanks in advance!

This is the query you want optimized:
select c.countries_name, c.country_id,
(select count(1)
from fav_country fc
where fc.country_id = c.country_id and
fc.device_id = ? -- use a parameter!
) as isFav,
c.image, c.countries_iso_code
from station s left join
countries c
on c.country_id = s.country
where s.isactive = :isactive
group by s.country;
First, avoiding the outer aggregate is very helpful. I would replace it with exists:
select c.countries_name, c.country_id,
(select count(1)
from fav_country fc
where fc.country_id = c.country_id and
fc.device_id = ?
) as isFav,
c.image, c.countries_iso_code
from countries c
where exists (select 1
from station s
where s.country = c.country_id and
s.isactive = :isactive
);
Then for this query, you want indexes on:
station(country, isactive)
fav_country(country_id, device_id).
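To check that the EXISTS rewrite returns the same rows as the GROUP BY version, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for MySQL. The sample rows and the index names are made up for illustration; only the table and column names come from the question.

```python
import sqlite3

# Hypothetical miniature of the three tables; the rows are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE countries (country_id INTEGER PRIMARY KEY, countries_name TEXT);
    CREATE TABLE station (id INTEGER PRIMARY KEY, country INTEGER, isactive INTEGER);
    CREATE TABLE fav_country (country_id INTEGER, device_id TEXT);

    INSERT INTO countries VALUES (1, 'France'), (2, 'Japan'), (3, 'Peru');
    INSERT INTO station (country, isactive) VALUES (1, 1), (1, 1), (2, 1), (3, 0);
    INSERT INTO fav_country VALUES (2, 'dev-1'), (1, 'dev-2');

    -- The two indexes suggested in the answer (index names are assumptions).
    CREATE INDEX idx_station_country_active ON station (country, isactive);
    CREATE INDEX idx_fav_country_device ON fav_country (country_id, device_id);
""")

# Original shape: join to station, then collapse duplicates with GROUP BY.
group_by_sql = """
    SELECT c.countries_name, c.country_id,
           (SELECT COUNT(1) FROM fav_country fc
             WHERE fc.country_id = c.country_id
               AND fc.device_id = :device_id) AS isFav
      FROM station s
      LEFT JOIN countries c ON c.country_id = s.country
     WHERE s.isactive = :isactive
     GROUP BY s.country
     ORDER BY c.country_id
"""

# Rewritten shape: one row per country, station checked with EXISTS.
exists_sql = """
    SELECT c.countries_name, c.country_id,
           (SELECT COUNT(1) FROM fav_country fc
             WHERE fc.country_id = c.country_id
               AND fc.device_id = :device_id) AS isFav
      FROM countries c
     WHERE EXISTS (SELECT 1 FROM station s
                    WHERE s.country = c.country_id
                      AND s.isactive = :isactive)
     ORDER BY c.country_id
"""

# Bound parameters instead of string concatenation.
params = {"device_id": "dev-1", "isactive": 1}
old_rows = conn.execute(group_by_sql, params).fetchall()
new_rows = conn.execute(exists_sql, params).fetchall()
print(old_rows == new_rows, new_rows)
```

One difference to keep in mind: the LEFT JOIN version would also return a row of NULLs for stations whose country has no match in countries, while the EXISTS version drops those. On this sample data there are no such stations, so both queries agree.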

Related

MySQL error #1111 - Invalid use of group function

Yes, this is an assignment. So the task was to output two columns of 'first name' and 'last name' with conditions:
-A ∪ (B ∩ -C ∩ -(A ∩ -(B ∪ D)))
A: All consumers that didn't shop on Monday and Friday
(time_by_day.the_day)
B: All consumers who bought 'Non-Consumable'
(product_class.product_family)
C: All consumers who bought more than 10 items
(sales_fact_1997.unit_sales) at one time (sales_fact_1997.time_id)
D: Female consumers from Canada (consumer.gender, consumer.country)
This is what I got so far
SELECT
c.fname,
c.lname
FROM
customer AS c
INNER JOIN sales_fact_1997 AS s ON c.customer_id = s.customer_id
INNER JOIN time_by_day AS t ON s.time_id = t.time_id
INNER JOIN product AS p ON s.product_id = p.product_id
INNER JOIN product_class AS pc ON p.product_class_id = pc.product_class_id
Where
NOT t.the_day in ('Monday', 'Friday') OR
(
pc.product_family = 'Non-Consumable' AND
NOT SUM(s.unit_sales) > 10 AND
NOT (
t.the_day in ('Monday', 'Friday') AND
NOT (
pc.product_family = 'Non-Consumable' OR
(c.country = 'Canada' AND c.gender = 'F')
)
)
)
GROUP BY concat(c.customer_id, s.time_id)
That ended up with an error
#1111 - Invalid use of group function
But I don't know which part of the code is wrong. I'm pretty sure that it's probably the WHERE part. But I don't know what I did wrong.
Condition C is where I'm really struggling. I manage just fine making a query of C
SELECT
t.time_id,
c.customer_id,
c.fullname,
round(SUM(s.unit_sales),0) as tot
FROM
customer as c
INNER JOIN sales_fact_1997 as s ON c.customer_id = s.customer_id
INNER JOIN time_by_day as t on s.time_id=t.time_id
GROUP BY concat(c.customer_id, s.time_id)
ORDER BY c.customer_id, t.time_id
But trying to incorporate it into the main code is hard for me.
Reading online I assume that I should probably use HAVING instead of WHERE.
I would really appreciate it if someone can point me in the right direction.
This is the database that I used.
C: All consumers who bought more than 10 items
(sales_fact_1997.unit_sales) at one time (sales_fact_1997.time_id)
You should use COUNT not SUM.
SELECT time_id,
count(*)
FROM sales_fact_1997
GROUP BY time_id
HAVING COUNT(*)>=10 ;
The count(*) in the select list is not needed; I left it in just to show the results.
Can you try this and see if it helps:
SELECT c.lname,
c.fname
FROM customer c
INNER JOIN
(
SELECT time_id,customer_id,product_id
FROM sales_fact_1997
GROUP BY time_id,customer_id,product_id
HAVING COUNT(*)>=10
) as s on c.customer_id=s.customer_id
INNER JOIN
(
SELECT time_id,the_day
FROM time_by_day
WHERE the_day
NOT IN ('Monday','Friday')
) as t on s.time_id=t.time_id
INNER JOIN
(
SELECT product_family,product_id
FROM product_class
INNER JOIN product
on product_class.product_class_id=product.product_class_id
WHERE product_family='Non-Consumable'
) pc on s.product_id=pc.product_id
where c.country='Canada' and c.gender ='F' ;
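On the #1111 error itself: MySQL raises it when an aggregate such as SUM() appears in WHERE, because WHERE filters individual rows before any grouping happens. Aggregate conditions belong in HAVING. Below is a minimal sketch of that rule using Python's built-in sqlite3 module as a stand-in for MySQL (SQLite rejects the same construct, with a different error message). The sales table and its rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales (customer_id INTEGER, time_id INTEGER, unit_sales INTEGER);
    INSERT INTO sales VALUES (1, 100, 6), (1, 100, 7), (1, 101, 2), (2, 100, 3);
""")

# An aggregate inside WHERE is rejected: WHERE sees one row at a time,
# before the groups exist.
try:
    conn.execute("""
        SELECT customer_id, time_id
          FROM sales
         WHERE SUM(unit_sales) > 10
         GROUP BY customer_id, time_id
    """)
    where_failed = False
except sqlite3.OperationalError:
    where_failed = True

# HAVING runs after grouping, so the aggregate is allowed there.
big_baskets = conn.execute("""
    SELECT customer_id, time_id, SUM(unit_sales) AS total
      FROM sales
     GROUP BY customer_id, time_id
    HAVING SUM(unit_sales) > 10
""").fetchall()

print(where_failed, big_baskets)
```

Grouping by the two columns directly, rather than by concat(customer_id, time_id), also avoids collisions: customer 1 at time 23 and customer 12 at time 3 concatenate to the same string.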

MySQL query with GROUP BY and JOIN

Good afternoon,
I'm trying to get some information from my MySQL database, but I can't get the query to return what I need. I have tried a lot of different approaches and none of them have worked. I hope you can spot the problem, because I'm very close to finding the solution but something is missing:
MySQL query:
SELECT b.id, b.tipo_perfil, round(avg(b.edad)), COUNT(c.zona), c.zona
FROM analizador_datos_usuario AS a
INNER JOIN analizador_datos_perfil AS b ON (a.id_usuario = b.id_perfil)
INNER JOIN analizador_datos_perfil_historial AS c ON (b.id = c.id_perfil)
WHERE a.id_usuario=21
GROUP BY b.tipo_perfil, c.zona
ORDER BY b.tipo_perfil ASC, count(c.zona) DESC
This query gives me the following information:
Table (in red it's what I need):
Kind regards,
Try this:
SELECT b.tipo_perfil, round(avg(b.edad)), COUNT(distinct c.zona), group_concat(distinct b.id separator ' ') as id_list, group_concat(distinct c.zona separator ' ') as zona_list
FROM analizador_datos_usuario AS a
INNER JOIN analizador_datos_perfil AS b ON (a.id_usuario = b.id_perfil)
INNER JOIN analizador_datos_perfil_historial AS c ON (b.id = c.id_perfil)
WHERE a.id_usuario=21
GROUP BY b.tipo_perfil
ORDER BY b.tipo_perfil ASC, count(distinct c.zona) DESC
I think you are getting the result that is displayed, and you want only the rows shown in red.
Try this modified query:
SELECT b.id, b.tipo_perfil, round(avg(b.edad)), COUNT(c.zona) counted_zone, c.zona
FROM analizador_datos_usuario AS a
INNER JOIN analizador_datos_perfil AS b ON (a.id_usuario = b.id_perfil)
INNER JOIN analizador_datos_perfil_historial AS c ON (b.id = c.id_perfil)
WHERE a.id_usuario=21
GROUP BY b.tipo_perfil, c.zona
Having MAX(counted_zone)
ORDER BY b.tipo_perfil ASC, counted_zone DESC

Trying to add one last SUM() column to my query in SQL Server 2008

I have the first query, which is producing correct results. I need to add the sum of the values, grouped by surveyid, as a last column. I can't insert Sum(c.value) into the first query because it is an aggregate function. The second query below computes the correct sums. I know there's pivot functionality, but I'm not sure whether it can be used here. I do realize that there will be repetition, but that's okay.
'first query
SELECT
A.PATIENTID, B.STUDENTNUMBER, c.surveyid,
convert(varchar, A.CreatedDate, 107),
C.QuestionID, C.Value, D.Question
FROM
dbo.Survey A, dbo.Patient B, [dbo].[SurveyQuestionAnswer] C, [dbo].[LookupQuestions] D
WHERE
A.PATIENTID = B.ID
and c.SurveyID = A.ID
and c.QuestionID = d.ID
and c.questionid <> 10
ORDER BY
A.PATIENTID
'second query
select
c.surveyid,SUM(c.value) as scores
from
dbo.SurveyQuestionAnswer c
group by
c.SurveyID
order by
SurveyID '---not important
You can use SUM if you add the OVER clause. In this case:
SELECT
A.PATIENTID, B.STUDENTNUMBER, c.surveyid,
convert(varchar, A.CreatedDate, 107),
C.QuestionID, C.Value, D.Question,
SUM(c.Value) OVER(PARTITION BY c.surveyid) scores
FROM
dbo.Survey A
INNER JOIN dbo.Patient B
ON A.PATIENTID = B.ID
INNER JOIN [dbo].[SurveyQuestionAnswer] C
ON c.SurveyID = A.ID
INNER JOIN [dbo].[LookupQuestions] D
ON c.QuestionID = d.ID
WHERE
c.questionid <> 10
ORDER BY
A.PATIENTID
You could use something like this:
SELECT
s.PATIENTID, p.STUDENTNUMBER, sqa.surveyid,
CONVERT(varchar, s.CreatedDate, 107),
sqa.QuestionID, sqa.Value, lq.Question,
Scores = (SELECT SUM(Value) FROM dbo.SurveyQuestionAnswer s2 WHERE s2.SurveyID = s.ID)
FROM
dbo.Survey s
INNER JOIN
dbo.Patient p ON s.PatientID = p.ID
INNER JOIN
[dbo].[SurveyQuestionAnswer] sqa ON sqa.SurveyID = s.ID
INNER JOIN
[dbo].[LookupQuestions] lq ON sqa.QuestionID = lq.ID
WHERE
sqa.questionid <> 10
ORDER BY
s.PATIENTID
By putting the SUM(...) in a subquery, you get that sum as a single value per row, and the outer query doesn't need a GROUP BY clause.
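The two answers are not quite interchangeable. A window function is evaluated after the WHERE clause, so SUM(...) OVER only adds up the rows that survive the questionid <> 10 filter, while the correlated subquery sums every answer of the survey, as the asker's second query does. Here is a minimal sketch of the difference using Python's built-in sqlite3 module as a stand-in for SQL Server (window functions need SQLite 3.25 or later). The answers table and its rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE answers (survey_id INTEGER, question_id INTEGER, value INTEGER);
    INSERT INTO answers VALUES (1, 1, 5), (1, 2, 3), (1, 10, 100), (2, 1, 4);
""")

# Window version: the sum covers only rows left after the WHERE filter.
window_rows = conn.execute("""
    SELECT survey_id, question_id, value,
           SUM(value) OVER (PARTITION BY survey_id) AS scores
      FROM answers
     WHERE question_id <> 10
     ORDER BY survey_id, question_id
""").fetchall()

# Subquery version: the inner query has no filter, so question 10 is included.
subquery_rows = conn.execute("""
    SELECT a.survey_id, a.question_id, a.value,
           (SELECT SUM(a2.value) FROM answers a2
             WHERE a2.survey_id = a.survey_id) AS scores
      FROM answers a
     WHERE a.question_id <> 10
     ORDER BY a.survey_id, a.question_id
""").fetchall()

print(window_rows)
print(subquery_rows)
```

If the total should exclude question 10, the window version already does that; if it should include it, use the subquery, or add the same filter to the subquery to make the two agree.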

MySql on what cols should I put indexes?

I have this query:
SELECT Concat(f.name, ' ', f.parent_names) AS FullName,
stts.name AS 'Status',
u.name AS Unit,
city.name AS City,
(SELECT Group_concat(c.mobile1)
FROM contacts c
WHERE c.id = f.husband_id
OR c.id = f.wife_id) AS MobilePhones,
f.phone AS HomePhone,
f.contact_initiation_date AS InitDate,
f.status_change_date AS StatusChangeDate,
cmt.created_at AS CommentDate,
cmt.comment AS LastComment,
f.reconnection_date AS ReconnectionDate,
(SELECT Group_concat(t.name, ' ')
FROM taggings tgs
JOIN tags t
ON tgs.tag_id = t.id
WHERE tgs.taggable_type = 'family'
AND tgs.taggable_id = f.id) AS HandlingStatus
FROM families f
JOIN categories stts
ON f.family_status_cat_id = stts.id
JOIN units u
ON f.unit_id = u.id
JOIN categories city
ON f.main_city_cat_id = city.id
LEFT JOIN comments cmt
ON f.last_comment_id = cmt.id
WHERE 1 = 0
OR ( u.is_busy = 1 )
OR ( f.family_status_cat_id = 1423 )
OR ( f.family_status_cat_id = 1422
AND f.status_change_date BETWEEN '2011-03-21' AND '2012-03-13' )
My problem is very specific. It is regarding the line:
SELECT GROUP_CONCAT( c.mobile1 )
FROM contacts c
WHERE c.id = f.husband_id
OR c.id = f.wife_id
) AS MobilePhones
When I use EXPLAIN, it seems that this query is bad. I get for this table (c = contacts): 38307 rows.
On what columns should I put the index according to the query?
I tried mobile1 - but no improvement (BTW - family_id is indexed in the contacts table).
I attach the image of the explain result:
Or maybe someone can help me optimize the query...
Index any column you'll be searching on to speed up lookups. Keep in mind that primary keys are already indexed.
Well, it seems that using the GROUP_CONCAT is the problem.
I just separated the wife's and husband's mobile numbers into two different columns.
At first I thought that using GROUP_CONCAT would be faster, but that proved to be VERY WRONG.
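The two-column approach described here can be sketched as two LEFT JOINs against contacts, each of which is a plain primary-key lookup. This is only an illustration using Python's built-in sqlite3 module as a stand-in for MySQL; the sample rows and the HusbandMobile/WifeMobile aliases are assumptions, and the real query would keep its other joins and filters.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE contacts (id INTEGER PRIMARY KEY, mobile1 TEXT);
    CREATE TABLE families (id INTEGER PRIMARY KEY, name TEXT,
                           husband_id INTEGER, wife_id INTEGER);
    INSERT INTO contacts VALUES (1, '555-0101'), (2, '555-0102'), (3, '555-0103');
    INSERT INTO families VALUES (10, 'Cohen', 1, 2), (11, 'Levi', 3, NULL);
""")

# One join per spouse; each ON clause matches on the contacts primary key.
# LEFT JOIN keeps families where one of the two ids is NULL.
rows = conn.execute("""
    SELECT f.name,
           h.mobile1 AS HusbandMobile,
           w.mobile1 AS WifeMobile
      FROM families f
      LEFT JOIN contacts h ON h.id = f.husband_id
      LEFT JOIN contacts w ON w.id = f.wife_id
     ORDER BY f.id
""").fetchall()

print(rows)
```

If a single combined column is still wanted, the two values can be concatenated in the select list without going back to a correlated subquery.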
Just out of curiosity, how does this version of the subquery perform?
SELECT GROUP_CONCAT( c.mobile1 )
FROM contacts c
WHERE c.id IN(f.husband_id, f.wife_id)
) AS MobilePhones

How to optimize this nested SQL query

Here is the database schema:
[redacted]
I'll describe what I'm doing with the query below:
Innermost query: Select all the saleIds satisfying the WHERE conditions
Middle query: Select all the productIds that were a part of the saleId
Outermost query: SUM the products.cost and select the vendors.name.
And here is the SQL query I came up with:
SELECT vendors.name AS Company
, SUM(products.cost) AS Revenue
FROM
products
INNER JOIN sold_products
ON (products.productId = sold_products.productId)
INNER JOIN vendors
ON (products.vendorId = vendors.vendorId)
WHERE sold_products.productId IN (
SELECT sold_products.productId
FROM
sold_products
WHERE sold_products.saleId IN (
SELECT sales.saleId
FROM
markets
INNER JOIN vendors
ON (markets.vendorId = vendors.vendorId)
INNER JOIN sales_campaign
ON (sales_campaign.marketId = markets.marketId)
INNER JOIN packet_headers
ON (sales_campaign.packetHeaderId = packet_headers.packetHeaderId)
INNER JOIN packet_details
ON (packet_details.packetHeaderId = packet_headers.packetHeaderId)
INNER JOIN sales
ON (sales.packetDetailsId = packet_details.packetDetailsId)
WHERE vendors.customerId=60
)
)
GROUP BY Company
ORDER BY Revenue DESC;
Any help in optimizing this?
Since you are only using inner joins, you can normally simplify the query to something like this:
SELECT ve.name AS Company
, SUM(pr.cost) AS Revenue
FROM products pr
, sold_products sp
, vendors ve
, markets ma
, sales_campaign sc
, packet_headers ph
, packet_details pd
, sales sa
Where pr.productId = sp.productId
And pr.vendorId = ve.vendorId
And ve.vendorId = ma.vendorId
And sc.marketId = ma.marketId
And sc.packetHeaderId = ph.packetHeaderId
And pd.packetHeaderId = ph.packetHeaderId
And sa.packetDetailsId = pd.packetDetailsId
And sp.saleId = sa.saleId -- tie each sold product to a qualifying sale
And ve.customerId = 60
GROUP BY Company
ORDER BY Revenue DESC;
Please check whether this works and whether it is faster, and let me know.
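Before swapping the nested IN query for a flat join, it is worth checking that both return the same totals, because they do not always. The nested form counts every sale of a product that appears in at least one qualifying sale, while a flat join that links sold_products to sales counts only the qualifying sales. Here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for the real database. The schema is heavily simplified and hypothetical (the real one is redacted), with customerId placed directly on sales to keep it short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE products (productId INTEGER PRIMARY KEY, cost INTEGER);
    CREATE TABLE sales (saleId INTEGER PRIMARY KEY, customerId INTEGER);
    CREATE TABLE sold_products (saleId INTEGER, productId INTEGER);

    INSERT INTO products VALUES (1, 10);
    INSERT INTO sales VALUES (100, 60), (101, 60), (200, 99);
    -- Product 1 was sold three times; only two sales belong to customer 60.
    INSERT INTO sold_products VALUES (100, 1), (101, 1), (200, 1);
""")

# Nested shape from the question: filter on productId, not on saleId.
nested_total = conn.execute("""
    SELECT SUM(p.cost)
      FROM products p
      JOIN sold_products sp ON p.productId = sp.productId
     WHERE sp.productId IN (
           SELECT sp2.productId FROM sold_products sp2
            WHERE sp2.saleId IN (SELECT saleId FROM sales WHERE customerId = 60))
""").fetchone()[0]

# Flat shape: sold_products is joined to the qualifying sales themselves.
flat_total = conn.execute("""
    SELECT SUM(p.cost)
      FROM products p
      JOIN sold_products sp ON p.productId = sp.productId
      JOIN sales sa ON sa.saleId = sp.saleId
     WHERE sa.customerId = 60
""").fetchone()[0]

print(nested_total, flat_total)
```

Which of the two totals is the intended revenue depends on the requirements, so compare both forms on real data before judging the rewrite on speed alone.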