Update query optimization with large data

I have a products table, a categories table, and a pivot table named product_catalog. I need to update product_catalog so that categories with fewer than five products are removed, and the products in those redundant categories move to their parent categories. I have written a query for this, but product_catalog has 55213277 records and the query takes a long time to run.
It is essentially a nested query, and it has to be rerun until no category with fewer than five products is left.
Here is the SQL query I tested.
Can you propose an optimized solution?
UPDATE product_catalogT AS C
INNER JOIN
(SELECT
COUNT(*) AS tp, catalog_id cid, g.parent_id pid
FROM
product_catalog AS p
LEFT JOIN catalog AS g ON p.catalog_id = g.id
Where g.parent_id <> 0
GROUP BY catalog_id
HAVING tp < 5)
AS A ON C.catalog_id = A.cid
SET
C.catalog_id = A.pid

Here's a little less writing, but for performance we'd need to see your tables, indexes, and the EXPLAIN, as mentioned.
UPDATE product_catalogT C
JOIN
( SELECT p.catalog_id AS cid, g.parent_id AS pid
FROM product_catalog p
JOIN catalog g
ON p.catalog_id = g.id
WHERE g.parent_id <> 0
GROUP
BY p.catalog_id, g.parent_id
HAVING COUNT(*) < 5
) A
ON C.catalog_id = A.cid
SET C.catalog_id = A.pid
Also, I might mention that this seems like a rather strange request.
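The question also mentions rerunning the statement until no category with fewer than five products remains. A minimal sketch of that loop follows, using Python's sqlite3 as a stand-in for MySQL so it runs anywhere. The sample rows, the helper name roll_up_small_categories, and the scratch table small_cat are made up for illustration, and the UPDATE uses a correlated subquery because SQLite lacks MySQL's UPDATE ... JOIN. On a table with tens of millions of rows you would probably also want an index on product_catalog(catalog_id), which is an assumption about your schema.

```python
import sqlite3

def roll_up_small_categories(conn, min_products=5):
    """Move products out of small non-root categories until none remain.

    Returns the number of passes that changed at least one row.
    """
    # Hypothetical scratch table holding the small categories of one pass.
    conn.execute(
        "CREATE TEMP TABLE IF NOT EXISTS small_cat "
        "(cid INTEGER PRIMARY KEY, pid INTEGER)"
    )
    passes = 0
    while True:
        # Aggregate once per pass instead of once per updated row.
        conn.execute("DELETE FROM small_cat")
        conn.execute(
            """
            INSERT INTO small_cat (cid, pid)
            SELECT p.catalog_id, g.parent_id
            FROM product_catalog p
            JOIN catalog g ON p.catalog_id = g.id
            WHERE g.parent_id <> 0
            GROUP BY p.catalog_id, g.parent_id
            HAVING COUNT(*) < ?
            """,
            (min_products,),
        )
        cur = conn.execute(
            """
            UPDATE product_catalog
            SET catalog_id = (SELECT pid FROM small_cat
                              WHERE cid = product_catalog.catalog_id)
            WHERE catalog_id IN (SELECT cid FROM small_cat)
            """
        )
        if cur.rowcount == 0:
            conn.commit()
            return passes
        passes += 1

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE catalog (id INTEGER PRIMARY KEY, parent_id INTEGER NOT NULL);
CREATE TABLE product_catalog (product_id INTEGER, catalog_id INTEGER);
INSERT INTO catalog VALUES (1, 0), (2, 1), (3, 2);
""")
# Six products in the root category 1, two in category 2, two in category 3.
rows = [(pid, 1) for pid in range(1, 7)] + [(7, 2), (8, 2), (9, 3), (10, 3)]
conn.executemany("INSERT INTO product_catalog VALUES (?, ?)", rows)

passes = roll_up_small_categories(conn)
remaining = conn.execute(
    "SELECT catalog_id, COUNT(*) FROM product_catalog GROUP BY catalog_id"
).fetchall()
```

In this sample the first pass moves category 3 into 2 and category 2 into 1, and a second pass moves the newly small category 2 into 1. Note that the sketch does not deduplicate a product that ends up in the same parent category twice.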

Related

MySQL-Simple Query take 11 seconds to return results

I have a MySQL query that takes the products from a table (6k+ records) and joins another table that holds the images for those products (5.5k+ records); in the WHERE clause I select only the products from specific vendors. Below is the query.
SELECT *
from products
LEFT JOIN products_images ON (products.id = products_images.product_id)
WHERE products.vendor_id in (
SELECT id
FROM vendors
WHERE status=1 )
I need help with optimizing this query. Thanks !!

EXISTS often produces better execution plans:
SELECT *
FROM products p LEFT JOIN
products_images pi
ON p.id = pi.product_id
WHERE EXISTS (SELECT 1
FROM vendors v
WHERE p.vendor_id = v.id AND v.status = 1
);
You also want indexes on products_images(product_id) and vendors(id, status).
If I assume (reasonably) that id is the primary key in vendors, then you can also use a JOIN:
SELECT p.*, pi.*
FROM products p JOIN
vendors v
ON p.vendor_id = v.id LEFT JOIN
products_images pi
ON p.id = pi.product_id
WHERE v.status = 1;
For this version, the best indexes are probably: vendors(status, id), products(vendor_id, id), products_images(product_id).
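To see that the EXISTS and JOIN versions select the same rows when vendors.id is unique, here is a small self-contained check. It uses Python's sqlite3 module in place of MySQL with invented sample rows; the index names are hypothetical, and SQLite's planner is not MySQL's, so this checks results only, not speed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE vendors (id INTEGER PRIMARY KEY, status INTEGER);
CREATE TABLE products (id INTEGER PRIMARY KEY, vendor_id INTEGER);
CREATE TABLE products_images (id INTEGER PRIMARY KEY, product_id INTEGER);
-- Hypothetical index names; the column lists follow the answer's advice.
CREATE INDEX idx_vendors_status ON vendors (status, id);
CREATE INDEX idx_products_vendor ON products (vendor_id, id);
CREATE INDEX idx_images_product ON products_images (product_id);
INSERT INTO vendors VALUES (1, 1), (2, 0);
INSERT INTO products VALUES (10, 1), (11, 1), (12, 2);
INSERT INTO products_images VALUES (100, 10), (101, 10);
""")

exists_sql = """
SELECT p.id, pi.id
FROM products p
LEFT JOIN products_images pi ON p.id = pi.product_id
WHERE EXISTS (SELECT 1 FROM vendors v
              WHERE p.vendor_id = v.id AND v.status = 1)
ORDER BY p.id, pi.id
"""
join_sql = """
SELECT p.id, pi.id
FROM products p
JOIN vendors v ON p.vendor_id = v.id
LEFT JOIN products_images pi ON p.id = pi.product_id
WHERE v.status = 1
ORDER BY p.id, pi.id
"""
exists_rows = conn.execute(exists_sql).fetchall()
join_rows = conn.execute(join_sql).fetchall()
```

Product 12 belongs to the inactive vendor and is dropped by both queries, while product 11 has no image and still appears with a NULL image id.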

SQL inner join and on multiple rows

So, I am creating a music database.
I am using three tables (files, categories, categories_assignments).
I want to be able to select a file that is in multiple categories (e.g. a song that is both pop and rock).
I have already made the OR variant (included below for reference):
SELECT DISTINCT `files`.`filename` FROM `files`
INNER JOIN `categories_assignments`
ON `files`.`id` = `categories_assignments`.`fileid`
INNER JOIN `categories`
ON `categories_assignments`.`catid` = `categories`.`id`
WHERE `categories`.`name` = 'rock' OR `categories`.`name`='pop';

This is a "set-within-sets" problem -- you are looking for songs that have a set of categories. I like to solve this using group by and having:
SELECT f.filename
FROM files f JOIN
categories_assignments ca
ON f.id = ca.fileid JOIN
categories c
ON ca.catid = c.id
WHERE c.name IN ('rock', 'pop')
GROUP BY f.filename
HAVING COUNT(*) = 2;
Notes:
Table aliases make the query easier to write and to read.
I don't see a need to put backticks around every identifier. That just makes the query harder to read.
You should use IN instead of multiple OR comparisons.
If you are learning SQL, then SELECT DISTINCT is almost never useful. Learn to use GROUP BY first.
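A small runnable check of the GROUP BY / HAVING approach, using Python's sqlite3 in place of MySQL with made-up rows. The helper name files_in_all is hypothetical. It uses COUNT(DISTINCT c.name), a slight variation on the query above, which still gives the right answer if the same category is accidentally assigned to a file twice.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE files (id INTEGER PRIMARY KEY, filename TEXT);
CREATE TABLE categories (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE categories_assignments (fileid INTEGER, catid INTEGER);
INSERT INTO files VALUES
    (1, 'both.mp3'), (2, 'rock_only.mp3'), (3, 'pop_twice.mp3');
INSERT INTO categories VALUES (1, 'rock'), (2, 'pop'), (3, 'jazz');
-- File 3 has the 'pop' assignment duplicated on purpose.
INSERT INTO categories_assignments VALUES
    (1, 1), (1, 2), (1, 3), (2, 1), (3, 2), (3, 2);
""")

def files_in_all(conn, names):
    """Return filenames assigned to every category in names (no repeats)."""
    placeholders = ", ".join("?" for _ in names)
    sql = f"""
        SELECT f.filename
        FROM files f
        JOIN categories_assignments ca ON f.id = ca.fileid
        JOIN categories c ON ca.catid = c.id
        WHERE c.name IN ({placeholders})
        GROUP BY f.filename
        HAVING COUNT(DISTINCT c.name) = ?
        ORDER BY f.filename
    """
    return [row[0] for row in conn.execute(sql, (*names, len(names)))]

matches = files_in_all(conn, ["rock", "pop"])
```

With HAVING COUNT(*) = 2, pop_twice.mp3 would also match because its duplicated assignment produces two rows; counting distinct names avoids that.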

Group by the file and take only those groups having both categories
SELECT f.filename
FROM files f
INNER JOIN categories_assignments ca ON f.id = ca.fileid
INNER JOIN categories c ON ca.catid = c.id
WHERE c.name in ('rock', 'pop')
GROUP BY f.filename
HAVING count(c.name) = 2

Best way to write this query? Several JOINS

I have this query (below). While it does work, I am wondering if it is the best approach, as it will be going against thousands of records. I will try to explain it as best I can.
SELECT i.*,
p.file AS item_pic,
i_f.id AS favorite_id,
COALESCE(f.favorite_count, 0) AS favorite_count,
COALESCE(b.num_buys, 0) AS num_buys,
COALESCE(c.comment_count, 0) AS comment_count
FROM items i
INNER JOIN (SELECT file,
item_id
FROM item_pics
ORDER BY item_pics.id ASC) AS p
ON p.item_id = i.id
LEFT JOIN (SELECT COUNT(*) AS favorite_count,
item_id
FROM item_favorites
GROUP BY item_id) AS f
ON f.item_id = i.id
LEFT JOIN (SELECT COUNT(*) AS num_buys,
item_id
FROM purchases
GROUP BY item_id) AS b
ON b.item_id = i.id
LEFT JOIN (SELECT COUNT(*) AS comment_count,
item_id
FROM comments
GROUP BY item_id) AS c
ON c.item_id = i.id
LEFT JOIN item_favorites AS i_f
ON i.id = i_f.item_id
AND i_f.userid = '14'
GROUP BY i.id
LIMIT 0, 20
So we are selecting the items in the database. The first join is for a picture (items have multiple pictures, but I only want one).
The next join is for the favorite count. Each time a user favorites something, a row with some info is added to the favorites table, so I am just trying to get the total number of favorites for that item.
Next up is the number of purchases for this item, which works much the same as favorites.
After that come the comments. Again, this is just like the purchases and favorites counts.
The last join is to see whether the logged-in user (id 14) has favorited this item; if not, I use COALESCE to return 0.
Like I said, this all works correctly, but it takes a few seconds to load on a table of about 6700 items and about 180K rows in the purchases table, even though only 20 are loaded at a time (I do a scrolling/load similar to Facebook/Twitter). Indexes have been properly set up on all tables. Once this is complete/correct, I would like to know how to limit results to purchases in the last seven days and order by number of purchases (num_buys).

I suppose you want the first picture (lowest id), and that pictures are required, whereas everything else is optional.
I guess you're using subqueries because you think joining on uncorrelated subqueries (hitting the joined tables just once) will be faster than correlated subqueries or a plain JOIN. However, you end up having to look up the records twice, and the second lookup (for the actual join) doesn't get to use an index because derived (temporary) tables don't have indexes.
Try normal JOINs:
SELECT i.*,
p.file AS item_pic,
COALESCE(i_f.id, 0) AS favorite_id,
COUNT(f.item_id) AS favorite_count,
COUNT(b.item_id) AS num_buys,
COUNT(c.item_id) AS comment_count
FROM items i
STRAIGHT_JOIN item_pics p
ON p.item_id = i.id
LEFT JOIN item_pics p2
ON p2.item_id = i.id
AND p2.id < p.id
LEFT JOIN item_favorites f
ON f.item_id = i.id
LEFT JOIN purchases b
ON b.item_id = i.id
LEFT JOIN comments c
ON c.item_id = i.id
LEFT JOIN item_favorites AS i_f
ON i_f.item_id = i.id
AND i_f.userid = '14'
WHERE p2.id IS NULL
GROUP BY i.id
LIMIT 20
The double join on pictures, together with WHERE p2.id IS NULL, is an anti-join that retrieves the picture with the lowest id.
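One thing to check with the plain-JOIN version: when favorites, purchases and comments are all joined to the same item, each COUNT(...) sees the product of the row counts rather than each table's own count. The sketch below reproduces this with Python's sqlite3 and invented rows, and compares it with COUNT(DISTINCT ...). It assumes each child table has its own id column, which the question only shows for item_favorites.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE items (id INTEGER PRIMARY KEY);
CREATE TABLE item_favorites (id INTEGER PRIMARY KEY, item_id INTEGER);
CREATE TABLE purchases (id INTEGER PRIMARY KEY, item_id INTEGER);
INSERT INTO items VALUES (1);
-- Item 1 has two favorites and three purchases.
INSERT INTO item_favorites (item_id) VALUES (1), (1);
INSERT INTO purchases (item_id) VALUES (1), (1), (1);
""")

query = """
SELECT i.id,
       COUNT(f.item_id)     AS fav_plain,
       COUNT(b.item_id)     AS buys_plain,
       COUNT(DISTINCT f.id) AS fav_distinct,
       COUNT(DISTINCT b.id) AS buys_distinct
FROM items i
LEFT JOIN item_favorites f ON f.item_id = i.id
LEFT JOIN purchases b ON b.item_id = i.id
GROUP BY i.id
"""
row = conn.execute(query).fetchone()
```

With 2 favorites and 3 purchases the join yields 6 rows for the item, so both plain counts come out as 6, while the DISTINCT counts give 2 and 3.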

Stop duplication of data in left join

I have a query that selects data from several tables using LEFT JOINS. The problem is data is being duplicated.
Here's the query
SELECT
A.ID,
T.T_ID,
T.name,
T.pic,
T.timestamp AS T_ts,
(SELECT COUNT(*) FROM track_plays WHERE T_ID = T.T_ID) AS plays,
(SELECT COUNT(*) FROM track_downloads WHERE T_ID = T.T_ID) AS downloads,
S.S_ID,
S.status,
S.timestamp AS S_ts,
G.G_ID,
G.gig_name,
G.date_time,
G.lineup,
G.price,
G.currency,
G.pic AS G_pic,
G.ticket,
G.venue,
G.timestamp AS G_ts
FROM artists A
LEFT JOIN TRACKS T
ON T.ID = A.ID
LEFT JOIN STATUS S
ON S.ID = A.ID
LEFT JOIN GIGS G
ON G.ID = A.ID
WHERE A.ID = '$ID'
ORDER BY S_ts, G_ts, T_ts DESC LIMIT 20
The problem is that data is duplicated if one of the tables in the join has more rows than another. So if tracks has 1 row, status has 2, and gigs has none, you would get the data from tracks doubled.
I have tried using GROUP BY A.ID, but that eliminates data. In the example above, only one row of status would be shown.
I've also tried GROUP_CONCAT, but I am unsure about that function, so I can't tell you much.
Using SELECT DISTINCT has the same effect as just the GROUP BY A.ID.

Assuming that artists -> gigs and artists -> tracks are 1-N mappings, you have two choices (both of which were covered in the comments on your original post):
1) Specify which of the N rows you want to get back to achieve a 1-1 map:
FROM artists A
LEFT JOIN TRACKS T ON T.ID = A.ID AND T.<SOMETHING> = SOMETHING
LEFT JOIN STATUS S ON S.ID = A.ID
LEFT JOIN GIGS G ON G.ID = A.ID AND G.<SOMETHING> = SOMETHING
2) Do the joins as you wrote and get multiple entries for tracks and gigs and then pivot them in your calling application. Generally you'd put an ORDER BY clause in the query and check for the same artist key and pivot the list.
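For option 2, the pivot in the calling application might look like the following sketch. It is written in Python with hand-made tuples standing in for the joined result set; the function name pivot_artist_rows and the three-column row shape are assumptions for illustration, not part of the original query.

```python
from collections import OrderedDict

def pivot_artist_rows(rows):
    """Collapse duplicated join rows into one record per artist.

    Each row is (artist_id, track_id, status_id); None marks a missing
    child row from a LEFT JOIN.
    """
    artists = OrderedDict()
    for artist_id, track_id, status_id in rows:
        record = artists.setdefault(
            artist_id, {"tracks": [], "statuses": []}
        )
        # Keep each child id once, in first-seen order.
        if track_id is not None and track_id not in record["tracks"]:
            record["tracks"].append(track_id)
        if status_id is not None and status_id not in record["statuses"]:
            record["statuses"].append(status_id)
    return artists

# One track and two statuses: the join repeats the track on both rows.
joined = [(7, 101, 201), (7, 101, 202)]
result = pivot_artist_rows(joined)
```

The same idea extends to gigs by adding another column and another list per artist.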

MySQL query, dealing with active and inactive products

I have been facing a problem for a few hours without getting anywhere. Maybe anyone can help me out.
I have the following query, which shows the top sellers. The status of the product (active or not) is saved in b.Article_Status (0=inactive, 1=active).
How do I drop from the result list the products which have no active product in the product family at the moment? A product should still be shown if an old article was ordered (and so is in the table order_items) and is now inactive, while the active one has not been ordered yet.
The current query looks as follows. I already found a solution which works when the currently active product has been ordered once, but the problem remains for the case mentioned above.
SELECT count( a.order_itemid ) AS numOrders, c.Product_ID, c.Product_Name, d.producer_name
FROM order_items a
LEFT OUTER JOIN product_article b ON b.Article_ID = a.order_itemid
LEFT OUTER JOIN product c ON b.Article_Productid = c.Product_ID
LEFT OUTER JOIN producer d ON c.Product_Producer = d.producer_id
GROUP BY c.Product_ID
ORDER BY `numOrders` DESC

The solution was a WHERE EXISTS subquery:
SELECT count( a.order_itemid ) AS numOrders, c.Product_ID, c.Product_Name, d.producer_name
FROM order_items a
LEFT OUTER JOIN product_article b ON b.Article_ID = a.order_itemid
LEFT OUTER JOIN product c ON b.Article_Productid = c.Product_ID
LEFT OUTER JOIN producer d ON c.Product_Producer = d.producer_id
WHERE EXISTS (SELECT * FROM product_article x WHERE c.Product_ID = x.Article_Productid AND x.Article_Status = 1)
GROUP BY c.Product_ID
ORDER BY `numOrders` DESC
LIMIT 5
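To illustrate what the EXISTS filter does, here is a small check using Python's sqlite3 in place of MySQL, with invented rows and a trimmed-down column set (the producer table is left out). Product 1 has an old, inactive article that was ordered plus a new active one that was not; product 2 has only an inactive article.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE product (Product_ID INTEGER PRIMARY KEY, Product_Name TEXT);
CREATE TABLE product_article (
    Article_ID INTEGER PRIMARY KEY,
    Article_Productid INTEGER,
    Article_Status INTEGER
);
CREATE TABLE order_items (order_itemid INTEGER);
INSERT INTO product VALUES (1, 'still sold'), (2, 'discontinued');
-- Article 10 is inactive, 11 is its active successor, 20 is inactive.
INSERT INTO product_article VALUES (10, 1, 0), (11, 1, 1), (20, 2, 0);
-- Only the inactive articles were ever ordered.
INSERT INTO order_items VALUES (10), (10), (20);
""")

top_sellers = conn.execute("""
SELECT COUNT(a.order_itemid) AS numOrders, c.Product_ID, c.Product_Name
FROM order_items a
LEFT OUTER JOIN product_article b ON b.Article_ID = a.order_itemid
LEFT OUTER JOIN product c ON b.Article_Productid = c.Product_ID
WHERE EXISTS (SELECT * FROM product_article x
              WHERE c.Product_ID = x.Article_Productid
                AND x.Article_Status = 1)
GROUP BY c.Product_ID, c.Product_Name
ORDER BY numOrders DESC
LIMIT 5
""").fetchall()
```

Product 1 is kept with its two orders of the old article because the family has an active article, while product 2 is filtered out.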
