I have an existing, populated MySQL table with three columns: product_id, category_id and catalog_id.
Each catalog has multiple categories in it and each category has multiple products in it. Categories belonging to the same catalog should have exactly the same products in them, but unfortunately they don't in certain cases.
I need to identify the missing products in each category. Missing means that a product exists in at least one other category that belongs to the same catalog but doesn't exist in that particular category.
So the result I need to get out of this is a list of product_id/category_id pairs that are missing and need to be added.
How do I accomplish this in MySQL?
I tried creating a table populated by the distinct product_id and catalog_id pairs to get all the products for each catalog and then join that with the main table, but I am not sure what type of join to perform.
Any MySQL experts willing to help?
Update:
Based on a request, here is the create table SQL (this is a simplified version of the actual scenario):
create table product (
product_id bigint not null,
category_id bigint not null,
catalog_id bigint not null
);
Update 2:
Clarification: Every category that belongs to the same catalog must have the same exact products in it as all the other categories that belong to the same catalog. If a product is in one category and not in another category that belongs to the same catalog, then it is missing and needs to be identified as a product_id/category_id pair.
Update 3:
Per another request, here is sample data:
insert into product (product_id, category_id, catalog_id) values (1, 1, 1);
insert into product (product_id, category_id, catalog_id) values (2, 1, 1);
insert into product (product_id, category_id, catalog_id) values (3, 1, 1);
insert into product (product_id, category_id, catalog_id) values (1, 2, 1);
insert into product (product_id, category_id, catalog_id) values (3, 2, 1);
In this case the pair of product_id 2 and category_id 2 would be identified as part of the result. This is because categories 1 and 2 belong to the same catalog (1) and category 2 has a missing product, namely product_id 2.
You can do it using the following query:
SELECT s1.product_id, s1.category_id
FROM (
SELECT t1.product_id, t2.category_id, t1.catalog_id
FROM (
SELECT DISTINCT product_id, catalog_id
FROM product) AS t1
CROSS JOIN (
SELECT DISTINCT category_id, catalog_id
FROM product) AS t2
WHERE t1.catalog_id = t2.catalog_id ) AS s1
LEFT JOIN product AS s2
ON s1.catalog_id = s2.catalog_id AND
s1.category_id = s2.category_id AND
s1.product_id = s2.product_id
WHERE s2.product_id IS NULL
Explanation:
This query:
SELECT DISTINCT product_id, catalog_id
FROM product
gives you a list of all distinct products per catalog:
product_id catalog_id
-----------------------
1 1
2 1
3 1
If you perform a CROSS JOIN of the above to all distinct categories per catalog:
SELECT t1.product_id, t2.category_id, t2.catalog_id
FROM (
SELECT DISTINCT product_id, catalog_id
FROM product) AS t1
CROSS JOIN (
SELECT DISTINCT category_id, catalog_id
FROM product) AS t2
WHERE t1.catalog_id = t2.catalog_id
you get:
product_id category_id catalog_id
----------------------------------
1 1 1
1 2 1
2 1 1
2 2 1
3 1 1
3 2 1
The above is a comprehensive set containing the full list of product_id per category_id per catalog_id.
All you have to do now is find the product_id, category_id pairs that are missing from your table. You can do that with a LEFT JOIN, as in the initial query.
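The steps above can be checked end to end with a small script. This is only a sketch: it assumes an in-memory SQLite database as a stand-in for MySQL (the query uses syntax both engines accept), loads the sample rows from the question, and the names `MISSING_PAIRS_SQL` and `missing_pairs` are made up for illustration.

```python
import sqlite3

# Assumption: SQLite (in memory) stands in for MySQL here.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE product ("
    " product_id BIGINT NOT NULL,"
    " category_id BIGINT NOT NULL,"
    " catalog_id BIGINT NOT NULL)"
)
# Sample rows from the question: category 2 lacks product 2.
conn.executemany(
    "INSERT INTO product (product_id, category_id, catalog_id) VALUES (?, ?, ?)",
    [(1, 1, 1), (2, 1, 1), (3, 1, 1), (1, 2, 1), (3, 2, 1)],
)

# Every (product, category) combination per catalog, minus the rows that exist.
MISSING_PAIRS_SQL = """
SELECT s1.product_id, s1.category_id
FROM (
    SELECT t1.product_id, t2.category_id, t1.catalog_id
    FROM (SELECT DISTINCT product_id, catalog_id FROM product) AS t1
    CROSS JOIN (SELECT DISTINCT category_id, catalog_id FROM product) AS t2
    WHERE t1.catalog_id = t2.catalog_id
) AS s1
LEFT JOIN product AS s2
    ON s1.catalog_id = s2.catalog_id
   AND s1.category_id = s2.category_id
   AND s1.product_id = s2.product_id
WHERE s2.product_id IS NULL
ORDER BY s1.category_id, s1.product_id
"""

missing_pairs = conn.execute(MISSING_PAIRS_SQL).fetchall()
print(missing_pairs)  # (product_id, category_id) pairs to add
```

With the sample data the only pair reported is product_id 2 in category_id 2, which matches the expected result stated in the question.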
You can also do this in an optimized way:
Hitesh> select * from product;
+------------+-------------+------------+
| product_id | category_id | catalog_id |
+------------+-------------+------------+
| 1 | 1 | 1 |
| 2 | 1 | 1 |
| 3 | 1 | 1 |
| 1 | 2 | 1 |
| 3 | 2 | 1 |
| 4 | 2 | 1 |
| 5 | 2 | 1 |
| 1 | 2 | 2 |
| 2 | 1 | 2 |
+------------+-------------+------------+
9 rows in set (0.00 sec)
Hitesh>
SELECT product_id, category_id, catalog_id
FROM
(SELECT DISTINCT p1.product_id, p2.category_id, p1.catalog_id
FROM product p1 JOIN product p2 ON p1.catalog_id=p2.catalog_id) tmp
WHERE NOT EXISTS (SELECT 1 FROM product
WHERE category_id = tmp.category_id AND
product_id=tmp.product_id AND
catalog_id=tmp.catalog_id);
+------------+-------------+------------+
| product_id | category_id | catalog_id |
+------------+-------------+------------+
| 4 | 1 | 1 |
| 5 | 1 | 1 |
| 2 | 2 | 1 |
| 2 | 2 | 2 |
| 1 | 1 | 2 |
+------------+-------------+------------+
5 rows in set (0.00 sec)
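The NOT EXISTS variant can be reproduced with the nine rows shown in the session above. A minimal sketch, assuming SQLite in place of MySQL; the inner alias `p`, the ORDER BY clause and the variable names are additions for the sake of a deterministic example.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE product ("
    " product_id INTEGER NOT NULL,"
    " category_id INTEGER NOT NULL,"
    " catalog_id INTEGER NOT NULL)"
)
# The nine rows from the session above, as (product_id, category_id, catalog_id).
conn.executemany(
    "INSERT INTO product VALUES (?, ?, ?)",
    [(1, 1, 1), (2, 1, 1), (3, 1, 1), (1, 2, 1), (3, 2, 1),
     (4, 2, 1), (5, 2, 1), (1, 2, 2), (2, 1, 2)],
)

NOT_EXISTS_SQL = """
SELECT product_id, category_id, catalog_id
FROM (
    SELECT DISTINCT p1.product_id, p2.category_id, p1.catalog_id
    FROM product p1
    JOIN product p2 ON p1.catalog_id = p2.catalog_id
) tmp
WHERE NOT EXISTS (
    SELECT 1 FROM product p
    WHERE p.category_id = tmp.category_id
      AND p.product_id = tmp.product_id
      AND p.catalog_id = tmp.catalog_id
)
ORDER BY catalog_id, category_id, product_id
"""

missing_rows = conn.execute(NOT_EXISTS_SQL).fetchall()
for row in missing_rows:
    print(row)
```

The same five rows come back as in the session output, only in a different order because of the added ORDER BY.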
Related
I have a table named 'products' and another table named 'rates' that has a one-to-many relation with the 'products' table. For each product I have two rows in the 'rates' table, and for each product I want to update a boolean column named 'index' to 1 in the 'rates' table.
I used this query:
UPDATE ( SELECT
products.id AS productId,
products.name ,
X.`index` AS `index`,
x.id AS rateId,
x.price, x.discount
FROM products JOIN ( SELECT rates.*
FROM rates
) AS x
WHERE products.id = x.product_id
GROUP BY products.id
) AS y
SET y.index = 1
but I got this error message:
SQL Error (1288) the target table y of the update is not updatable
I'm new to MySQL and I don't know where my mistake is. Thank you for helping.
Products Table
| id | name
| 1 | chair
| 2 | bench
Rates Table
| id | product_id | index | value
| 1 | 1 | 0 | xx ==> index = 1
| 2 | 1 | 0 | yy
| 3 | 2 | 0 | zz ==> index = 1
| 4 | 2 | 0 | tt
I want to update the index column in rates to 1 for each product, as marked above.
It looks like you want to update the "first" row in rates for each product_id. If so, you can self-join the table with an aggregate query that computes the minimum id per product_id:
update rates r
inner join (select product_id, min(id) id from rates group by product_id) r1
on r1.id = r.id
set r.index = 1
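To see which rows this touches, here is a small sketch on the sample data. It assumes SQLite instead of MySQL, and SQLite has no UPDATE ... JOIN, so the sketch swaps in an equivalent `WHERE id IN (subquery)` form. That form is only for illustration: MySQL rejects a subquery on the update target in this position with error 1093, which is one reason to prefer the join above.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL.
conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE rates ('
    ' id INTEGER PRIMARY KEY,'
    ' product_id INTEGER,'
    ' "index" INTEGER,'
    ' value TEXT)'
)
# Sample rows from the question; every "index" starts at 0.
conn.executemany(
    "INSERT INTO rates VALUES (?, ?, 0, ?)",
    [(1, 1, "xx"), (2, 1, "yy"), (3, 2, "zz"), (4, 2, "tt")],
)

# Flag the row with the smallest id for each product_id.
conn.execute(
    'UPDATE rates SET "index" = 1 '
    "WHERE id IN (SELECT MIN(id) FROM rates GROUP BY product_id)"
)

flags = conn.execute('SELECT id, "index" FROM rates ORDER BY id').fetchall()
print(flags)
```

Rows 1 and 3 end up flagged, matching the arrows in the question's Rates table.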
I have a problem getting the IDs that are used in rows of other tables. I have three tables: categories, articles and video. The articles and video tables each have a column that holds a category ID. Here is an example:
Table Categories :
id | category_name
------------------
1 | News
2 | Sports
3 | Art
4 | Horror
Table Articles :
id | category_id | title
----------------------------------
1 | 1 | title content 1
2 | 1 | title content 2
3 | 3 | title content 3
4 | 3 | title content 4
5 | 2 | title content 5
Table Video :
id | category_id | video_title
------------------------------
1 | 1 | video title 1
2 | 2 | video title 2
3 | 3 | video title 3
I want to count how many times each category ID is used in the two other tables, like this:
Category ID 1 is use 3 times
Category ID 2 is use 2 times
Category ID 3 is use 3 times
Category ID 4 is use 0 times
What query do I need to use to get data like that? Any help would be appreciated. Thanks in advance.
First, combine the articles table and the video table with UNION ALL in a subquery, then use an outer join and the COUNT function.
SELECT Concat('Category ID ', c.id, ' is use ', Count(t.category_id ), ' times')
FROM categories c
LEFT JOIN (SELECT category_id
FROM articles
UNION ALL
SELECT category_id
FROM video) t
ON c.id = t.category_id
GROUP BY c.id
SQLFIDDLE:http://sqlfiddle.com/#!9/92cbd0e/12
[Results]:
| Concat('Category ID ', t.id, ' is use ', t.cnt, ' times') |
|------------------------------------------------------------|
| Category ID 1 is use 3 times |
| Category ID 2 is use 2 times |
| Category ID 3 is use 3 times |
| Category ID 4 is use 0 times |
NOTE
COUNT(col) does not count rows where col is NULL, whereas COUNT(*) counts every row.
For example, here is a sample script:
CREATE TABLE T(
col int
);
INSERT INTO T VALUES (NULL);
INSERT INTO T VALUES (1);
SELECT COUNT(col) FROM T; -- RESULT = 1
SELECT COUNT(*) FROM T; -- RESULT = 2
sample sqlfiddle:http://sqlfiddle.com/#!9/e2bba7/2
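The following sketch runs the LEFT JOIN over the UNION ALL subquery on the question's sample data and puts COUNT(t.category_id) next to COUNT(*), so the effect described in the note is visible. It assumes SQLite in place of MySQL; the column aliases `used` and `star_count` are made up for this example.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE categories (id INTEGER PRIMARY KEY, category_name TEXT);
CREATE TABLE articles (id INTEGER PRIMARY KEY, category_id INTEGER, title TEXT);
CREATE TABLE video (id INTEGER PRIMARY KEY, category_id INTEGER, video_title TEXT);
INSERT INTO categories VALUES (1, 'News'), (2, 'Sports'), (3, 'Art'), (4, 'Horror');
INSERT INTO articles VALUES
    (1, 1, 'title content 1'), (2, 1, 'title content 2'),
    (3, 3, 'title content 3'), (4, 3, 'title content 4'),
    (5, 2, 'title content 5');
INSERT INTO video VALUES
    (1, 1, 'video title 1'), (2, 2, 'video title 2'), (3, 3, 'video title 3');
""")

USAGE_SQL = """
SELECT c.id,
       COUNT(t.category_id) AS used,       -- skips the NULL from an unmatched row
       COUNT(*)             AS star_count  -- counts the unmatched row as well
FROM categories c
LEFT JOIN (SELECT category_id FROM articles
           UNION ALL
           SELECT category_id FROM video) t
       ON c.id = t.category_id
GROUP BY c.id
ORDER BY c.id
"""

usage = conn.execute(USAGE_SQL).fetchall()
for category_id, used, star_count in usage:
    print(f"Category ID {category_id} is use {used} times (COUNT(*) = {star_count})")
```

For category 4 the LEFT JOIN yields a single row whose t.category_id is NULL, so COUNT(t.category_id) reports 0 while COUNT(*) reports 1.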
You can use, for example, this query:
Select category_id, count(1) - 1
From (
Select category_id From video
Union All Select category_id From articles
Union All Select id From Categories) As t
Group By category_id
Okay, so I have the following structure in my product_to_store table:
+----------------------+--------------------+
| product_id - int(11) | store_id - int(11) |
+----------------------+--------------------+
+----------------------+--------------------+
| 1000 | 0 |
| 1000 | 6 |
| 1005 | 0 |
| 1010 | 0 |
| 1010 | 6 |
...
Basically, I need to have a store_id of value 6 for every product_id. For example, the product with product_id 1005 only has a record with store_id 0. I want it to have another record/row where store_id is equal to 6. The products with IDs 1000 and 1010 are as they should be (each has a record with store_id equal to 6).
I tried to run the following query in order to insert a new row where only product_id is set:
INSERT INTO `product_to_store`
(product_id) SELECT `product_id`
FROM `product_to_store`
WHERE product_id != 6
And then consider running another query to update all rows where store_id is null with the value of 6. However, I get:
#1062 - Duplicate entry '1000-0' for key 'PRIMARY'.
Any way in which I can accomplish this without having to use a loop in PHP or something rather unpractical?
INSERT INTO `product_to_store` (product_id,store_id)
SELECT DISTINCT p1.product_id, 6 as store_id
FROM
product_to_store p1
LEFT JOIN product_to_store x
ON p1.product_id = x.product_id
AND x.store_id = 6
WHERE
x.product_id IS NULL;
http://sqlfiddle.com/#!9/01ac7a
You basically want to insert products where there doesn't already exist a product with store_id of 6.
INSERT INTO `product_to_store` (product_id,store_id)
SELECT DISTINCT product_id, 6 as store_id
FROM product_to_store p1
WHERE NOT EXISTS (SELECT 1 FROM product_to_store p2 WHERE p2.store_id = 6
AND p2.product_id = p1.product_id)
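A quick way to convince yourself that the statement adds only the missing rows is to run it on the sample data. The sketch below assumes SQLite in place of MySQL and a composite primary key on (product_id, store_id), which the duplicate-entry error in the question suggests but the question does not actually show.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL, and the primary key
# covers (product_id, store_id).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE product_to_store ("
    " product_id INTEGER NOT NULL,"
    " store_id INTEGER NOT NULL,"
    " PRIMARY KEY (product_id, store_id))"
)
conn.executemany(
    "INSERT INTO product_to_store VALUES (?, ?)",
    [(1000, 0), (1000, 6), (1005, 0), (1010, 0), (1010, 6)],
)

# Insert a store_id 6 row only for products that do not have one yet.
conn.execute("""
INSERT INTO product_to_store (product_id, store_id)
SELECT DISTINCT p1.product_id, 6
FROM product_to_store p1
WHERE NOT EXISTS (SELECT 1 FROM product_to_store p2
                  WHERE p2.store_id = 6
                    AND p2.product_id = p1.product_id)
""")

store6 = [row[0] for row in conn.execute(
    "SELECT product_id FROM product_to_store WHERE store_id = 6 ORDER BY product_id")]
total_rows = conn.execute("SELECT COUNT(*) FROM product_to_store").fetchone()[0]
print(store6, total_rows)
```

Only product 1005 gains a new row, so no duplicate-key error is raised for 1000 and 1010.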
I am trying to combine multiple selects in one query to use as little data as possible.
I have this sql table (example)
id category status
1 test1 A
2 test2 B
3 test1 A
4 test3 B
5 test1 C
First of all, I want to select how many rows there are with the same category.
SELECT category, COUNT(category) FROM test GROUP BY category
Then I would like to count the statuses in each category. I would do this with this query.
SELECT status, COUNT(status) FROM test WHERE category = 'test1' GROUP BY STATUS
So I want one column with the total, and then the count of each status within each category.
Can I somehow combine these? Is that even possible, or do I just have to accept that I need to fetch the data multiple times to get the right result?
You can try to GROUP BY category and by status and use WITH ROLLUP to get aggregate values:
SELECT category, status, count(*)
FROM test
GROUP BY category, status WITH ROLLUP
The result will look like this:
category | status | count(*)
----------+--------+----------
test1 | A | 2
test1 | C | 1
test1 | NULL | 3
test2 | B | 1
test2 | NULL | 1
test3 | B | 1
test3 | NULL | 1
NULL | NULL | 5
If you ignore the rows containing NULLs, the rest is the regular GROUP BY category, status. There are 2 entries having category = 'test1' AND status = 'A', one entry having category = 'test1' AND status = 'C' and so on.
The third row of the result (category = 'test1', status = NULL, count(*) = 3) summarizes the rows having category = 'test1'. It computes count(*) for all the rows having category = 'test1' no matter what value they have in column status. The summary rows for category = 'test2' and category = 'test3' are computed in the same way.
The last row is the summary for the entire table. count(*) = 5 includes all the rows, no matter what value they have in columns category and status.
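To make the three levels of summary rows concrete, the sketch below rebuilds the same result by hand. WITH ROLLUP is MySQL syntax and the sketch assumes SQLite, which does not support it, so the rollup is emulated by stacking the three grouping levels with UNION ALL. This is a swapped-in technique for illustration, not what MySQL does internally.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL; ROLLUP is emulated.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test (id INTEGER PRIMARY KEY, category TEXT, status TEXT)")
conn.executemany(
    "INSERT INTO test VALUES (?, ?, ?)",
    [(1, "test1", "A"), (2, "test2", "B"), (3, "test1", "A"),
     (4, "test3", "B"), (5, "test1", "C")],
)

ROLLUP_EMULATION_SQL = """
SELECT category, status, COUNT(*) FROM test GROUP BY category, status
UNION ALL
SELECT category, NULL, COUNT(*) FROM test GROUP BY category  -- per-category summary
UNION ALL
SELECT NULL, NULL, COUNT(*) FROM test                        -- grand total
"""

rollup_rows = conn.execute(ROLLUP_EMULATION_SQL).fetchall()
for row in rollup_rows:
    print(row)  # NULL shows up as None
```

The rows match the table above, although their order is not guaranteed here because the emulation has no ORDER BY.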
You can run your second query for all categories at once like this:
mysql> select category, status, count(*) from foo group by category, status;
+----------+--------+----------+
| category | status | count(*) |
+----------+--------+----------+
| test1 | A | 2 |
| test1 | C | 1 |
| test2 | B | 1 |
| test3 | B | 1 |
+----------+--------+----------+
4 rows in set (0.39 sec)
And then you could compute the category-wide count by summing up all its rows. If you really want that too as part of the same query, you could do this:
mysql> select foo.category, status, count(*), cat_count
-> from foo
-> inner join (select category, count(*) cat_count from foo group by category) x
-> on x.category = foo.category
-> group by foo.category, status;
+----------+--------+----------+-----------+
| category | status | count(*) | cat_count |
+----------+--------+----------+-----------+
| test1 | A | 2 | 3 |
| test1 | C | 1 | 3 |
| test2 | B | 1 | 1 |
| test3 | B | 1 | 1 |
+----------+--------+----------+-----------+
4 rows in set (0.00 sec)
Unfortunately, MySQL versions before 8.0 do not support window functions.
One way would be to get your status counts for each category in one query:
SELECT
category,
status,
COUNT(*) AS status_count
FROM
test
GROUP BY
category, status
And then INNER JOIN the per-category counts to it, like this:
SELECT
a.*, b.category_count
FROM (
SELECT
category,
status,
COUNT(*) AS status_count
FROM
test
GROUP BY
category, status
) a
INNER JOIN ( SELECT category, COUNT(*) AS category_count FROM test GROUP BY category ) b ON
a.category = b.category
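Here is a runnable sketch of the two-aggregate join on the question's five rows. It assumes SQLite in place of MySQL; the ORDER BY and the variable names are additions so the output is deterministic.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test (id INTEGER PRIMARY KEY, category TEXT, status TEXT)")
conn.executemany(
    "INSERT INTO test VALUES (?, ?, ?)",
    [(1, "test1", "A"), (2, "test2", "B"), (3, "test1", "A"),
     (4, "test3", "B"), (5, "test1", "C")],
)

JOINED_COUNTS_SQL = """
SELECT a.category, a.status, a.status_count, b.category_count
FROM (SELECT category, status, COUNT(*) AS status_count
      FROM test
      GROUP BY category, status) a
INNER JOIN (SELECT category, COUNT(*) AS category_count
            FROM test
            GROUP BY category) b
        ON a.category = b.category
ORDER BY a.category, a.status
"""

joined_counts = conn.execute(JOINED_COUNTS_SQL).fetchall()
for row in joined_counts:
    print(row)  # (category, status, status_count, category_count)
```

Each row carries both the count for its status and the total for its category, so a single round trip returns everything the question asks for.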
I will give a small example of my problem.
I have two tables in my database.
"Car" with the columns: "id", "name".
"Seats" with the columns: "cid", "weight".
The weight of a seat refers to the order of the seats. A car can have from 0 to n seats.
My problem is that I have seats with the same weight on a car.
I need to do an update for every car where two or more seats have the same weight.
CAR
--------------------------
id | name
--------------------------
1 | ford
--------------------------
SEATS
-------------------------------
cid | name | weight
-------------------------------
1 | Seat1 | 7
1 | Seat2 | 1
1 | Seat3 | 7
1 | Seat4 | 3
1 | Seat5 | 2
1 | Seat6 | 3
1 | Seat N | N
-------------------------------
And I need to have:
(cid is the car id)
SEATS
-------------------------------
cid | name | weight
-------------------------------
1 | Seat1 | 0
1 | Seat2 | 1
1 | Seat3 | 2
1 | Seat4 | 3
1 | Seat5 | 4
1 | Seat6 | 5
1 | Seat N | N
-------------------------------
The query will run on a very big database and will affect many rows, so it needs to be fast.
I have done the join on the tables, but I don't know how to write the complete update.
The INNER JOIN is needed for this.
You can use user-defined variables to generate a serial number for your weight column:
update t
join (
select `cid`, `name`, @rank:= @rank + 1 `rank`
from t join
(select @rank:=-1) t1
order by `cid`, `name`
) t2
using(`cid`, `name`)
set weight = t2.`rank`
Or, if you need the serial weight to restart for each car, you can do it like this:
update t
join (
select `cid`, `name`,
@rank:=case when @group = cid
then @rank + 1
else 0 end `rank`,
@group:=cid
from t join
(select @rank:=-1, @group:= 0) t1
order by `cid`, `name`
) t2
using(`cid`, `name`)
set weight = t2.`rank`
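The per-car numbering can be checked on a small data set. The sketch below assumes SQLite, which has no user-defined variables, so it swaps in a correlated subquery that counts the seats of the same car whose name sorts earlier. The table name `seats` and the second car are hypothetical additions, and on a very large table this form is likely slower than the variable-based update above.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL; a correlated subquery
# replaces the @rank / @group variables.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE seats (cid INTEGER, name TEXT, weight INTEGER)")
conn.executemany(
    "INSERT INTO seats VALUES (?, ?, ?)",
    [(1, "Seat1", 7), (1, "Seat2", 1), (1, "Seat3", 7),
     (1, "Seat4", 3), (1, "Seat5", 2), (1, "Seat6", 3),
     (2, "Seat1", 4), (2, "Seat2", 4)],  # car 2 is made up for the example
)

# New weight = number of seats in the same car whose name sorts before this one.
# Names compare as text, so "Seat10" would sort before "Seat2".
conn.execute("""
UPDATE seats
SET weight = (SELECT COUNT(*)
              FROM seats AS s2
              WHERE s2.cid = seats.cid
                AND s2.name < seats.name)
""")

renumbered = conn.execute(
    "SELECT cid, name, weight FROM seats ORDER BY cid, name").fetchall()
for row in renumbered:
    print(row)
```

Weights come out as 0 through 5 for car 1 and restart at 0 for car 2, which is the per-group behaviour the second query aims for.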