MySQL Joining three tables takes much time to execute - mysql

My SELECT query below, which joins three tables (products, descriptions, images), takes 9 seconds to return just 10 rows!
SELECT `products`.id, `products`.serialNumber, `products`.title,
`descriptions`.description1, `descriptions`.description2, `descriptions`.description3, `descriptions`.description4,
`products`.price, `products`.colors, `products`.category, `products`.available, `products`.status,
GROUP_CONCAT(DISTINCT `images`.file_name ORDER BY `images`.id) AS images
FROM `products`
INNER JOIN `descriptions`
ON `products`.id = `descriptions`.product_id
LEFT JOIN `images`
ON `products`.id = `images`.product_id
WHERE `products`.status = 1
GROUP BY `products`.id
ORDER BY `products`.id DESC
LIMIT 10;
After some research, I arrived at the query below, which joins only two tables and takes just one second to return the 10 rows. How can I add the third table (descriptions) to it? Thank you.
SELECT *,
( SELECT group_concat(`images`.file_name)
FROM `images`
) AS images
FROM `products`
JOIN `images` ON `products`.id = `images`.product_id
WHERE `products`.status = 1
GROUP BY `products`.id
ORDER BY `products`.id DESC
LIMIT 10

For better performance, make sure you have:
a composite index on table products, columns (status, id)
an index on table descriptions, column product_id
an index on table images, column product_id
Anyway, you could join the three tables by aggregating the images in a subquery and joining that subquery:
SELECT *, my_images.aggr_images
FROM `products`
INNER JOIN `descriptions` ON `products`.id = `descriptions`.product_id
LEFT JOIN ( SELECT product_id, group_concat(`images`.file_name) aggr_images
FROM `images`
GROUP BY product_id
) AS my_images on my_images.product_id = `products`.id
WHERE `products`.status = 1
ORDER BY `products`.id DESC
LIMIT 10
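As a rough, runnable illustration of the derived-table approach above, here is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in for MySQL. The reduced column set, the index names, and the sample rows are assumptions made for the demo, and timings on SQLite say nothing about MySQL performance.

```python
import sqlite3


def build_demo_db():
    """Create a tiny in-memory schema (hypothetical subset of the real tables)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE products (id INTEGER PRIMARY KEY, title TEXT, status INTEGER);
        CREATE TABLE descriptions (product_id INTEGER, description1 TEXT);
        CREATE TABLE images (id INTEGER PRIMARY KEY, product_id INTEGER, file_name TEXT);
        -- Indexes suggested in the answer (names are made up for this demo).
        CREATE INDEX idx_products_status_id ON products (status, id);
        CREATE INDEX idx_descriptions_product ON descriptions (product_id);
        CREATE INDEX idx_images_product ON images (product_id);
        INSERT INTO products VALUES (1, 'Lamp', 1), (2, 'Chair', 0), (3, 'Desk', 1);
        INSERT INTO descriptions VALUES (1, 'A lamp'), (2, 'A chair'), (3, 'A desk');
        INSERT INTO images VALUES
            (1, 1, 'lamp-a.jpg'), (2, 1, 'lamp-b.jpg'), (3, 2, 'chair.jpg');
    """)
    return conn


def fetch_products_with_images(conn, limit=10):
    """Join products and descriptions to a pre-aggregated image list."""
    sql = """
        SELECT p.id, p.title, d.description1, i.aggr_images
        FROM products AS p
        INNER JOIN descriptions AS d ON d.product_id = p.id
        LEFT JOIN (
            -- One row per product, so the outer query needs no GROUP BY.
            SELECT product_id, GROUP_CONCAT(file_name) AS aggr_images
            FROM images
            GROUP BY product_id
        ) AS i ON i.product_id = p.id
        WHERE p.status = 1
        ORDER BY p.id DESC
        LIMIT ?
    """
    return conn.execute(sql, (limit,)).fetchall()


if __name__ == "__main__":
    for row in fetch_products_with_images(build_demo_db()):
        print(row)
```

Because the images are collapsed to one row per product before the join, a product without images still appears, with NULL in the image column.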

Related

How to limit list by using HAVING in MySQL?

I am trying to limit a result set to 5 rows per merchant_id by using HAVING in MySQL 5.7. Unfortunately this does not seem to work, and I cannot figure out why.
My SQL query joins three tables together and identifies the categories in which the manufacturer has a listing. I want to limit this list to 5 per merchant_id:
SELECT
mcs.CAT_ID
FROM tbl1 mc
INNER JOIN tbl2 mcs ON mc.ID = mcs.CAT_ID
INNER JOIN tbl3 p ON mcs.ARTICLE_ID = p.SKU
WHERE
p.MANUFACTURER_ID =18670
group by
mc.merchant_ID, mcs.CAT_ID
HAVING
COUNT(mc.merchant_id) < 5
I was reading on SO that having gets executed without looking at the where statement, but what would be the right way to limit this list?
You didn't provide tables schema and dummy data, so I can't be sure about the exact query, but I'd use the following approach:
SELECT
mc.merchant_id, t.CAT_ID
FROM tbl1 mc
INNER JOIN (
SELECT mcs.CAT_ID
FROM tbl2 AS mcs
WHERE mc.ID = mcs.CAT_ID
AND EXISTS (
SELECT 'x'
FROM tbl3 AS p
WHERE p.SKU = mcs.ARTICLE_ID
AND p.MANUFACTURER_ID = 18670
)
LIMIT 5
) as t
;
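With the subquery in the join, I select all the CAT_IDs related to that mc.ID that have a listing for the selected manufacturer (18670), limited to 5 rows. In this way the limit of 5 is applied to each merchant_id. Note that a derived table may reference mc.ID from the same FROM clause only if it is declared LATERAL, which MySQL supports from version 8.0.14; MySQL 5.7 rejects the query as written.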
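The HAVING clause in the question filters whole (merchant_ID, CAT_ID) groups by their row count, so it cannot cap the number of categories per merchant. One alternative that avoids window functions and lateral joins is a correlated COUNT subquery. The sketch below is a swapped-in technique, not the answer's query, and it runs on Python's built-in sqlite3 module rather than MySQL. The `merchant_cats` table is hypothetical: it stands for the distinct (merchant_id, cat_id) pairs that the three-table join would produce.

```python
import sqlite3


def build_demo_db():
    """Hypothetical table of distinct (merchant_id, cat_id) pairs."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE merchant_cats (
            merchant_id INTEGER,
            cat_id INTEGER,
            PRIMARY KEY (merchant_id, cat_id)
        );
        INSERT INTO merchant_cats VALUES
            (1, 10), (1, 11), (1, 12),
            (2, 20);
    """)
    return conn


def top_categories_per_merchant(conn, per_merchant=5):
    """Keep at most `per_merchant` categories (lowest cat_id first) per merchant."""
    sql = """
        SELECT a.merchant_id, a.cat_id
        FROM merchant_cats AS a
        WHERE (
            -- How many categories of the same merchant sort before this one?
            SELECT COUNT(*)
            FROM merchant_cats AS b
            WHERE b.merchant_id = a.merchant_id
              AND b.cat_id < a.cat_id
        ) < ?
        ORDER BY a.merchant_id, a.cat_id
    """
    return conn.execute(sql, (per_merchant,)).fetchall()


if __name__ == "__main__":
    print(top_categories_per_merchant(build_demo_db(), per_merchant=2))
```

A row is kept only if fewer than `per_merchant` rows of the same merchant precede it, which is what applies the limit per merchant rather than to the whole result.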

Optimizing MySQL Join and Group By with intermediate table

Simplifying but I have three tables:
users (user_id, team_id)
results (user_id, result)
user_signups (user_id, team_id, event_id)
results.user_id is a foreign key.
The tables have a large number of rows. If I do
select sum(result)
from results
inner join users on users.id = results.user_id
group by team_id
It is fast. "Explain" has results with 150k rows, users with 1 row.
If I do
select sum(result)
from results
inner join user_signups on user_signups.user_id = results.user_id
where event_id = 1
group by team_id
It is very slow (from 1 second to 14). "Explain" has results with 28 rows, user_signups with 5345 rows.
Things I have tried:
A unique index on event_id and user_id on user_signups.
An index on event_id, user_id, team_id on user_signups.
Rewriting as
select sum(result)
from results
inner join (select * from user_signups where event_id = 1) user_signups on user_signups.user_id = results.user_id
group by team_id
Rewriting as
select sum(result)
from results
inner join users on users.id = results.user_id
inner join user_signups on user_signups.user_id = users.id
where event_id = 1
group by user_signups.team_id
Any other suggestions?
By grouping on the team_id, I assume that you want one row for each record in results.
Is this what you're looking for?
SELECT *, sum(result) FROM results
LEFT JOIN users ON (users.user_id=results.user_id)
LEFT JOIN user_signups ON (user_signups.user_id=users.user_id)
GROUP BY table.field
From here, you can group on whatever you like. This structure assumes that most of your data will be present in the results table and will join users to the results table and user_signups to the users table.
Create a multicolumn index on (event_id, user_id, team_id) in the user_signups table and try running the following query.
If this doesn't work, post your EXPLAIN output here.
select sum(result)
from results
inner join (select event_id, user_id, team_id from user_signups where event_id = 1) user_signups
on user_signups.user_id = results.user_id
group by team_id

Mysql very slow subquery optimizing

I am building a SQL query over a large data set, but the query is too slow.
I've got 3 tables; movies, movie_categories, skipped_movies
The movies table is normalized and I am trying to query a movie based on a category while excluding ids from skipped_movies table.
However, I am trying to use WHERE IN and WHERE NOT IN in my query.
movies table has approx. 2 million rows (id, name, score)
movie_categories approx. 5 million (id, movie_id, category_id)
skipped_movies has approx. 1k rows (id, movie_id, user_id)
When the skipped_movies table is very small (10 to 20 rows), the query is quite fast (about 40 to 50 ms), but when the table grows to around 1k rows, the query takes around 7 to 8 seconds.
This is the query I'm using.
SELECT SQL_NO_CACHE * FROM `movies` WHERE `id` IN (SELECT `movie_id` FROM `movie_categories` WHERE `category_id` = 1) AND `id` NOT IN (SELECT `movie_id` FROM `skipped_movies` WHERE `user_id` = 1) AND `score` <= 9 ORDER BY `score` DESC LIMIT 1;
I've tried every approach that came to mind, but this was the fastest one. I even tried the EXISTS method, to no avail.
I'm using the SQL_NO_CACHE just for testing.
And I guess that the ORDER BY statement is running very slow.
Assuming that (movie_id, category_id) is unique in the movie_categories table, I'd get the specified result using join operations rather than subqueries.
To exclude "skipped" movies, an anti-join pattern would suffice... that's a left outer join to find matching rows in skipped_movies, and then a predicate in the WHERE clause to exclude any matches found, leaving only rows that didn't have a match.
SELECT SQL_NO_CACHE m.*
FROM movies m
JOIN movie_categories c
ON c.movie_id = m.id
AND c.category_id = 1
LEFT JOIN skipped_movies s
ON s.movie_id = m.id
AND s.user_id = 1
WHERE s.movie_id IS NULL
AND m.score <= 9
ORDER BY m.score DESC
LIMIT 1
And appropriate indexes will likely improve performance...
... ON movie_categories (category_id, movie_id)
... ON skipped_movies (user_id, movie_id)
Most IN/NOT IN queries can be expressed using JOIN/LEFT JOIN, which usually gives the best performance.
Convert your query to use joins:
SELECT m.*
FROM movies m
JOIN movie_categories mc ON m.id = mc.movie_id AND mc.category_id = 1
LEFT JOIN skipped_movies sm ON m.id = sm.movie_id AND sm.user_id = 1
WHERE sm.movie_id IS NULL
AND score <= 9
ORDER BY score DESC
LIMIT 1
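To see the anti-join pattern behave as described, here is a small sketch on Python's built-in sqlite3 module, used only as a stand-in for MySQL. The sample rows and the index names are invented for the demo; the table and column names follow the question.

```python
import sqlite3


def build_demo_db():
    """Tiny in-memory copy of the three tables with made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE movies (id INTEGER PRIMARY KEY, name TEXT, score REAL);
        CREATE TABLE movie_categories (id INTEGER PRIMARY KEY, movie_id INTEGER, category_id INTEGER);
        CREATE TABLE skipped_movies (id INTEGER PRIMARY KEY, movie_id INTEGER, user_id INTEGER);
        -- Composite indexes in the column order suggested above (names are hypothetical).
        CREATE INDEX idx_mc_category_movie ON movie_categories (category_id, movie_id);
        CREATE INDEX idx_sm_user_movie ON skipped_movies (user_id, movie_id);
        INSERT INTO movies VALUES (1, 'A', 9.0), (2, 'B', 8.5), (3, 'C', 9.5), (4, 'D', 8.0);
        INSERT INTO movie_categories (movie_id, category_id) VALUES (1, 1), (2, 1), (3, 1), (4, 2);
        INSERT INTO skipped_movies (movie_id, user_id) VALUES (1, 1), (2, 2);
    """)
    return conn


def best_unskipped_movie(conn, category_id, user_id, max_score):
    """Highest-scored movie in a category that the user has not skipped."""
    sql = """
        SELECT m.id, m.name, m.score
        FROM movies AS m
        JOIN movie_categories AS mc
          ON mc.movie_id = m.id AND mc.category_id = ?
        LEFT JOIN skipped_movies AS sm
          ON sm.movie_id = m.id AND sm.user_id = ?
        WHERE sm.movie_id IS NULL      -- anti-join: keep rows with no skip match
          AND m.score <= ?
        ORDER BY m.score DESC
        LIMIT 1
    """
    return conn.execute(sql, (category_id, user_id, max_score)).fetchone()


if __name__ == "__main__":
    print(best_unskipped_movie(build_demo_db(), category_id=1, user_id=1, max_score=9))
```

The user filter sits in the ON clause of the LEFT JOIN, so skips by other users never match and therefore never hide a movie.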
Your query seems all right; it just needs a small tweak. Replace * with the names of the columns you actually need from the table. Selecting only those columns can make the query somewhat faster, since SELECT * reads and returns every column.

MySQL query to calculate percentage of total column

How to convert this result:
Group | Sum
Services | 11120.99
Vendas | 3738.00
Into:
Group | Sum
Services | 74.84
Vendas | 25.16
That is, the second displays the results as percentages of total.
This is what I tried:
SELECT categories.cat AS 'Group', SUM(atual) AS 'Sum'
FROM `table1` INNER JOIN
categories
ON table1.category_id=categories.id
GROUP BY categoria
You can join an ungrouped total onto the grouped sums and divide each group's sum by it. This way the total is computed only once, which keeps the runtime down. Since the total is a single row with nothing to match on, a CROSS JOIN is used (MySQL requires an ON clause for a LEFT JOIN).
SELECT cat, sum_atual, sum_atual * 100 / total_atual AS percent_atual
FROM
( SELECT categories.cat AS cat, SUM(atual) AS sum_atual
FROM `table1`
JOIN categories ON table1.category_id=categories.id
GROUP BY categoria
) t
CROSS JOIN
( SELECT SUM(atual) AS total_atual
FROM `table1`
) t1
SELECT categories.cat AS categoria,
SUM(atual) * 100 / (select sum(atual) from table1) AS percentages
FROM `table1`
INNER JOIN categories ON table1.category_id=categories.id
GROUP BY categoria
You can do this several ways. One is to just use a subquery in the select clause. As written below, this assumes that the category_id column in table1 always matches categories:
SELECT c.categoria AS "Group", SUM(t1.atual) AS "Sum",
SUM(t1.atual) / (SELECT SUM(t2.atual) FROM table1 t2) as "Percent"
FROM `table1` t1 INNER JOIN
categories c
ON t1.category_id = c.id
GROUP BY c.categoria;
I changed the group by clause as well. It is a good idea for the group by and select to use the same columns. And I added table aliases to all the column references, another good practice.
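The scalar-subquery variant can be checked against the figures in the question with a short sketch on Python's built-in sqlite3 module, used here as a stand-in for MySQL. The way the Services total is split across two rows is an assumption for the demo, and the grouping uses the `cat` column because the question does not show the full schema.

```python
import sqlite3


def build_demo_db():
    """Sample data whose per-category sums match the question's figures."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE categories (id INTEGER PRIMARY KEY, cat TEXT);
        CREATE TABLE table1 (category_id INTEGER, atual REAL);
        INSERT INTO categories VALUES (1, 'Services'), (2, 'Vendas');
        -- Services sums to 11120.99, Vendas to 3738.00.
        INSERT INTO table1 VALUES (1, 10000.00), (1, 1120.99), (2, 3738.00);
    """)
    return conn


def category_percentages(conn):
    """Return (category, percent of grand total) rounded to two decimals."""
    sql = """
        SELECT c.cat,
               SUM(t.atual) * 100.0 / (SELECT SUM(atual) FROM table1) AS percent
        FROM table1 AS t
        INNER JOIN categories AS c ON t.category_id = c.id
        GROUP BY c.cat
        ORDER BY c.cat
    """
    return [(cat, round(percent, 2)) for cat, percent in conn.execute(sql)]


if __name__ == "__main__":
    print(category_percentages(build_demo_db()))
```

The scalar subquery has no reference to the outer query, so it yields the same grand total for every group.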

How can I make a WHERE clause only apply to the right table in a left join?

I have two tables.
TableA: field_definitions
field_id, field_type, field_length, field_name, field_desc, display_order, field_section, active
TableB: user_data
response_id, user_id, field_id, user_response
I need a query that will return all rows from table A and, if they exist, matching rows from table B based on a particular user_id.
Here is what I have so far...
SELECT field_definitions. * , user_data.user_response
FROM field_definitions
LEFT JOIN user_data
USING ( field_id )
WHERE (
user_data.user_id =8
OR user_data.user_id IS NULL
)
AND field_definitions.field_section =1
AND field_definitions.active =1
ORDER BY display_order ASC
This only works if table B has zero rows or matching rows for the user_id in the WHERE clause. If table B has rows with matching field_id but not user_id, I get zero returned rows.
Essentially, once rows in table B exist for user X, the query no longer returns rows from table A when searching for user Z responses and none are found.
I need the result to always contain rows from table A even if there are no matching rows in B with the correct user_id.
You can move those constraints from the WHERE clause to the ON clause (which first requires that you change the USING clause into an ON clause: ON clauses are much more flexible than USING clauses). So:
SELECT field_definitions.*,
user_data.user_response
FROM field_definitions
LEFT JOIN user_data
ON user_data.field_id = field_definitions.field_id
AND user_data.user_id = 8
WHERE field_definitions.field_section = 1
AND field_definitions.active = 1
ORDER BY field_definitions.display_order ASC
;
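The difference between filtering in the ON clause and filtering in the WHERE clause can be reproduced with a small sketch on Python's built-in sqlite3 module, used as a stand-in for MySQL. The sample rows and the reduced column list are assumptions for the demo; the function names are hypothetical.

```python
import sqlite3


def build_demo_db():
    """Two active fields, one inactive field, and answers from users 7 and 8."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE field_definitions (
            field_id INTEGER PRIMARY KEY, field_name TEXT,
            display_order INTEGER, field_section INTEGER, active INTEGER);
        CREATE TABLE user_data (
            response_id INTEGER PRIMARY KEY, user_id INTEGER,
            field_id INTEGER, user_response TEXT);
        INSERT INTO field_definitions VALUES
            (1, 'name', 1, 1, 1), (2, 'email', 2, 1, 1), (3, 'old', 3, 1, 0);
        INSERT INTO user_data VALUES (1, 7, 1, 'Zed'), (2, 8, 2, 'x@example.com');
    """)
    return conn


def fields_filtered_in_on(conn, user_id):
    """User filter in ON: every active field is returned, matched or not."""
    sql = """
        SELECT fd.field_id, fd.field_name, ud.user_response
        FROM field_definitions AS fd
        LEFT JOIN user_data AS ud
          ON ud.field_id = fd.field_id AND ud.user_id = ?
        WHERE fd.field_section = 1 AND fd.active = 1
        ORDER BY fd.display_order ASC
    """
    return conn.execute(sql, (user_id,)).fetchall()


def fields_filtered_in_where(conn, user_id):
    """User filter in WHERE (as in the question): fields answered only by others vanish."""
    sql = """
        SELECT fd.field_id, fd.field_name, ud.user_response
        FROM field_definitions AS fd
        LEFT JOIN user_data AS ud ON ud.field_id = fd.field_id
        WHERE (ud.user_id = ? OR ud.user_id IS NULL)
          AND fd.field_section = 1 AND fd.active = 1
        ORDER BY fd.display_order ASC
    """
    return conn.execute(sql, (user_id,)).fetchall()


if __name__ == "__main__":
    conn = build_demo_db()
    print(fields_filtered_in_on(conn, 8))
    print(fields_filtered_in_where(conn, 8))
```

With the filter in WHERE, the `name` field joins to user 7's row, which is then discarded because its user_id is neither 8 nor NULL. With the filter in ON, that row simply fails to match and the field is kept with a NULL response.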
Conceptually, the join is performed first and then the where clause is applied to the virtual resultset. If you want to filter one table first, you have to code that as a sub-select inside the join. Something along these lines:
SELECT
field_definitions. * ,
user8.user_response
FROM
field_definitions
LEFT JOIN (select * from user_data where user_id=8 or user_id is null) as user8
USING ( field_id )
WHERE
field_definitions.field_section =1
AND field_definitions.active =1
ORDER BY display_order ASC
You can move the WHERE clause inside as follows
SELECT field_definitions. * , user_data.user_response
FROM (
select * from
field_definitions
WHERE field_definitions.field_section =1
AND field_definitions.active =1 ) as field_definitions
LEFT JOIN (
select * from
user_data
where user_data.user_id =8
OR user_data.user_id IS NULL ) as user_data
USING ( field_id )
ORDER BY display_order ASC
A literal translation of the spec:
SELECT field_definitions. * , '{{MISSING}}' AS user_response
FROM field_definitions
UNION
SELECT field_definitions. * , user_data.user_response
FROM field_definitions
NATURAL JOIN user_data
WHERE user_data.user_id = 8;
However, I suspect that you don't really want "all rows from table A".