Update field based on duplicate values - mysql

I'm trying to update the value of a column based on duplicates in my database, but I'm struggling to find the correct query.
Here's my db structure:
id | product_id | order_id | parent_id
--------------------------------------
1 1 1 SMITH1
2 1 2 SMITH1
3 2 3 BLOGGS1
4 2 4 BLOGGS1
I want to update the order_id to be the same, where we have duplicates of both product_id AND parent_id.
Currently I've found the duplicates like so:
SELECT
*,
COUNT(parent_id),
COUNT(product_id)
FROM
mytable
GROUP BY parent_id, product_id
HAVING COUNT(parent_id) > 1
AND COUNT(product_id) > 1
But I'm now struggling with the update/join to set the order_id values the same (can be minimum order_id value preferably).
Any help appreciated!

Join the table to the minimum value of order_id for each combination of product_id and parent_id:
update mytable t inner join (
select product_id, parent_id, min(order_id) order_id
from mytable
group by product_id, parent_id
having min(order_id) <> max(order_id)
) tt on tt.product_id = t.product_id and tt.parent_id = t.parent_id
set t.order_id = tt.order_id;
The HAVING clause skips any combination of product_id and parent_id that has only one order_id, which prevents unnecessary updates.
Results:
| id | product_id | order_id | parent_id |
| --- | ---------- | -------- | --------- |
| 1 | 1 | 1 | SMITH1 |
| 2 | 1 | 1 | SMITH1 |
| 3 | 2 | 3 | BLOGGS1 |
| 4 | 2 | 3 | BLOGGS1 |
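To check the result above without a MySQL server, here is a minimal sketch using Python's built-in sqlite3 module. It is not the answer's query: SQLite has no UPDATE ... JOIN, so the sketch swaps in a correlated subquery (which MySQL itself rejects when it reads the table being updated). It also rewrites every row rather than skipping groups that are already consistent. The function name collapse_order_ids is made up for illustration.

```python
import sqlite3

# Sample rows from the question: (id, product_id, order_id, parent_id)
ROWS = [
    (1, 1, 1, "SMITH1"),
    (2, 1, 2, "SMITH1"),
    (3, 2, 3, "BLOGGS1"),
    (4, 2, 4, "BLOGGS1"),
]


def collapse_order_ids(rows):
    """Set order_id to the group minimum for each (product_id, parent_id)."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE mytable ("
        "id INTEGER PRIMARY KEY, product_id INTEGER, "
        "order_id INTEGER, parent_id TEXT)"
    )
    con.executemany("INSERT INTO mytable VALUES (?, ?, ?, ?)", rows)
    # SQLite-flavoured stand-in for the MySQL UPDATE ... INNER JOIN.
    con.execute(
        """
        UPDATE mytable
        SET order_id = (
            SELECT MIN(t2.order_id)
            FROM mytable AS t2
            WHERE t2.product_id = mytable.product_id
              AND t2.parent_id = mytable.parent_id
        )
        """
    )
    result = con.execute(
        "SELECT id, product_id, order_id, parent_id FROM mytable ORDER BY id"
    ).fetchall()
    con.close()
    return result


if __name__ == "__main__":
    for row in collapse_order_ids(ROWS):
        print(row)
```

Because every row is set to its group's minimum, the outcome does not depend on the order in which rows are updated.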

Related

single table parent child relationship query

I have written a query to get the items from the table that don't have any child items. It works, but it is very slow.
Any better/easier/optimized way to write the same thing?
select distinct id, (select count(i.item_id) from order_item as i where i.parent_item_id = o.item_id) as c
from order_item as o
where product_type = 'bundle'
having c = 0
order by id desc
limit 10;
Here are a few of the fields, to give an idea of the structure:
Table: order_item
Columns:
item_id PK
order_id
parent_item_id
product_id
product_type
item_id | order_id | parent_item_id | product_id | product_type
-----------------------------------------------------------------
1 | 1 | null | 1 | bundle
2 | 1 | 1 | 2 | simple
3 | 1 | 1 | 3 | simple
4 | 1 | null | 4 | bundle
5 | 2 | null | 1 | bundle
6 | 2 | 5 | 2 | simple
7 | 2 | 5 | 3 | simple
The query should return only the 4th item.
Try the query below. Also consider creating indexes on PARENT_ITEM_ID and ITEM_ID.
SELECT OI.*
FROM ORDER_ITEM OI
LEFT JOIN ORDER_ITEM OI2
ON OI2.PARENT_ITEM_ID = OI.ITEM_ID
WHERE OI.PRODUCT_TYPE = 'bundle' AND OI2.PARENT_ITEM_ID IS NULL
I would suggest not exists:
select oi.*
from order_item oi
where oi.product_type = 'bundle' and
not exists (select 1
from order_item oi2
where oi2.parent_item_id = oi.item_id and oi2.product_type = 'bundle'
)
order by id desc
limit 10;
For performance, you want an index on order_item(parent_item_id, product_type).
Note: I'm not sure you want the product_type filter in the subquery, but it is the logic your query is using.
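To see how the two answers behave on the sample rows, here is a rough sketch using Python's sqlite3 module (SQLite stands in for MySQL, and the helper name item_ids is made up). It runs the LEFT JOIN version, a NOT EXISTS version without the product_type filter in the subquery, and the NOT EXISTS version with that filter, so the effect of the filter mentioned in the note is visible.

```python
import sqlite3

# Sample rows: (item_id, order_id, parent_item_id, product_id, product_type)
ITEMS = [
    (1, 1, None, 1, "bundle"),
    (2, 1, 1, 2, "simple"),
    (3, 1, 1, 3, "simple"),
    (4, 1, None, 4, "bundle"),
    (5, 2, None, 1, "bundle"),
    (6, 2, 5, 2, "simple"),
    (7, 2, 5, 3, "simple"),
]

# Anti-join: keep bundles for which the LEFT JOIN found no child row.
LEFT_JOIN = """
    SELECT oi.item_id
    FROM order_item oi
    LEFT JOIN order_item oi2 ON oi2.parent_item_id = oi.item_id
    WHERE oi.product_type = 'bundle' AND oi2.parent_item_id IS NULL
    ORDER BY oi.item_id
"""

# NOT EXISTS without the product_type filter inside the subquery.
NOT_EXISTS = """
    SELECT oi.item_id
    FROM order_item oi
    WHERE oi.product_type = 'bundle'
      AND NOT EXISTS (SELECT 1 FROM order_item oi2
                      WHERE oi2.parent_item_id = oi.item_id)
    ORDER BY oi.item_id
"""

# NOT EXISTS with the filter: only 'bundle' children count as children.
NOT_EXISTS_FILTERED = """
    SELECT oi.item_id
    FROM order_item oi
    WHERE oi.product_type = 'bundle'
      AND NOT EXISTS (SELECT 1 FROM order_item oi2
                      WHERE oi2.parent_item_id = oi.item_id
                        AND oi2.product_type = 'bundle')
    ORDER BY oi.item_id
"""


def item_ids(sql):
    """Load the sample rows into an in-memory database and run one query."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE order_item (item_id INTEGER PRIMARY KEY, order_id INTEGER, "
        "parent_item_id INTEGER, product_id INTEGER, product_type TEXT)"
    )
    con.executemany("INSERT INTO order_item VALUES (?, ?, ?, ?, ?)", ITEMS)
    ids = [row[0] for row in con.execute(sql)]
    con.close()
    return ids


if __name__ == "__main__":
    print(item_ids(LEFT_JOIN))            # [4]
    print(item_ids(NOT_EXISTS))           # [4]
    print(item_ids(NOT_EXISTS_FILTERED))  # [1, 4, 5]
```

On this data the children are all 'simple', so the filtered subquery never matches and every bundle comes back; dropping the filter gives the single row the question asks for.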

using HAVING to filter results based on a reference row

I have the following table:
+---------+--------------+----------+
| item_id | location_id | price |
+---------+--------------+----------+
| 1 | 1 | 100 |
| 1 | 1 | 250 |
| 1 | 2 | 50 |
| 2 | 1 | 250 |
| 2 | 1 | 1000 |
| 3 | 1 | 1000 |
| 3 | 2 | 100 |
+---------+--------------+----------+
I can reduce this down to the minimum values using this query
SELECT
item_id, location_id, MIN(price) AS Price
from
table
GROUP BY item_id , location_id
This gets me
+---------+--------------+----------+
| item_id | location_id | price |
+---------+--------------+----------+
| 1 | 1 | 100 |
| 1 | 2 | 50 |
| 2 | 1 | 250 |
| 3 | 1 | 1000 |
| 3 | 2 | 100 |
+---------+--------------+----------+
I want to reduce this further. I am using the rows with a location_id of 1 as reference rows. For each row that has an item_id matching a reference row's item_id but a different location_id, I want to compare that row's price with the reference row's price. If the price is lower than the reference row's price, I want to filter that row out.
My final result should include the reference row for each item id and any rows that met the criteria of the price being lower than the reference row price.
I have a hunch that I can use the HAVING clause to do this but I am having trouble compiling the statement. How should I construct the HAVING statement?
Thanks in advance
No, HAVING can't help you here. HAVING is for filtering on an aggregate result, such as the value of MIN().
For example:
select id,min(price) from table where date = '2016-3-18' group by id having min(price) = 50
This returns only the groups whose MIN(price) is 50.
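As a quick illustration of that point, here is a small sketch that runs a HAVING filter on the question's sample rows through Python's sqlite3 module. The table name prices and the function items_with_min_price are assumptions made for the example (the question's table is just called "table", which is a reserved word), and the grouping is by item_id rather than the id and date columns of the example above.

```python
import sqlite3

# Sample rows from the question: (item_id, location_id, price)
PRICES = [
    (1, 1, 100), (1, 1, 250), (1, 2, 50),
    (2, 1, 250), (2, 1, 1000),
    (3, 1, 1000), (3, 2, 100),
]


def items_with_min_price(target):
    """Return (item_id, min_price) for items whose cheapest price equals target."""
    con = sqlite3.connect(":memory:")
    # 'prices' is a hypothetical table name used only in this sketch.
    con.execute("CREATE TABLE prices (item_id INTEGER, location_id INTEGER, price INTEGER)")
    con.executemany("INSERT INTO prices VALUES (?, ?, ?)", PRICES)
    rows = con.execute(
        """
        SELECT item_id, MIN(price) AS min_price
        FROM prices
        GROUP BY item_id
        HAVING MIN(price) = ?   -- filter applied after aggregation
        ORDER BY item_id
        """,
        (target,),
    ).fetchall()
    con.close()
    return rows


if __name__ == "__main__":
    print(items_with_min_price(50))  # [(1, 50)]
```

The condition is evaluated once per group, after MIN() has been computed, which is why it cannot compare one row against another row of the same group.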
Back to your case: there are several ways to do that.
1. left join
select a.item_id,a.location_id,a.price
from table a
left join table b
on a.location_id = b.location_id and a.price > b.price
where b.price is null
2. exists
select a.item_id,a.location_id,a.price
from table a
where exists(
select 1 from
(select location_id,min(price)as price from table group by location_id)b
where a.location_id = b.location_id and a.price = b.price
)
Normally I would recommend using EXISTS.
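For reference, here is a minimal sketch that runs both queries above on the question's sample rows using Python's sqlite3 module. The table name prices and the helper run are assumptions for the example. As written, both queries compare rows within the same location_id, so on this data they return the cheapest row for each location; whether that matches the filtering the question describes is not confirmed here.

```python
import sqlite3

# Sample rows from the question: (item_id, location_id, price)
PRICES = [
    (1, 1, 100), (1, 1, 250), (1, 2, 50),
    (2, 1, 250), (2, 1, 1000),
    (3, 1, 1000), (3, 2, 100),
]

# 1. LEFT JOIN: keep rows for which no cheaper row exists at the same location.
LEFT_JOIN = """
    SELECT a.item_id, a.location_id, a.price
    FROM prices a
    LEFT JOIN prices b
      ON a.location_id = b.location_id AND a.price > b.price
    WHERE b.price IS NULL
    ORDER BY a.item_id, a.location_id, a.price
"""

# 2. EXISTS: keep rows whose price equals the minimum for their location.
EXISTS = """
    SELECT a.item_id, a.location_id, a.price
    FROM prices a
    WHERE EXISTS (
        SELECT 1
        FROM (SELECT location_id, MIN(price) AS price
              FROM prices GROUP BY location_id) b
        WHERE a.location_id = b.location_id AND a.price = b.price
    )
    ORDER BY a.item_id, a.location_id, a.price
"""


def run(sql):
    """Load the sample rows into an in-memory database and run one query."""
    con = sqlite3.connect(":memory:")
    # 'prices' is a hypothetical table name used only in this sketch.
    con.execute("CREATE TABLE prices (item_id INTEGER, location_id INTEGER, price INTEGER)")
    con.executemany("INSERT INTO prices VALUES (?, ?, ?)", PRICES)
    rows = con.execute(sql).fetchall()
    con.close()
    return rows


if __name__ == "__main__":
    print(run(LEFT_JOIN))  # [(1, 1, 100), (1, 2, 50)]
    print(run(EXISTS))     # [(1, 1, 100), (1, 2, 50)]
```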

MySQL Group by and Having clause issue

I'm trying to get the correct records, but for some reason I have an issue, I guess in the HAVING clause. Can anyone please help me out?
CASE 1:
Trying to select rows where order_id = 1 (New Order) but should not have more than 1 record with the same order id
CASE 2:
Select rows where order_id = 2 (Printed Order), but also select new orders and apply CASE 1 to them. In other words, the query should select where order_id = 2 OR where order_id = 1 (if order_id = 1, there should not be more than 1 record with the same order id)
I've a table where:
order_id = id of the order
status_id = different status id e.g 1 = New, 2 = Printed, 3 = Processing etc...
status_change_by = id of the admin who change the order status from new to printed to processing...
order_id | status_id | status_change_by
1 | 1 | (NULL)
1 | 2 | 12
1 | 3 | 12
2 | 1 | (NULL)
3 | 1 | (NULL)
4 | 1 | (NULL)
1 | 4 | 13
5 | 1 | (NULL)
3 | 2 | (NULL)
Here's my simple mySQL query:
SELECT * from order_tracking
where status_id = 1
group by order_id
having count(order_id) <= 2;
I even created an SQL Fiddle for reference. Please check whether I'm doing something wrong, or whether I need a more complex query with CASE or IF statements.
http://sqlfiddle.com/#!2/16936/3
If this link doesn't work, please create one by this code:
CREATE TABLE order_tracking
(
track_id int auto_increment primary key,
order_id int (50),
status_id int(20),
status_changed_by varchar(30)
);
Here's the insertion:
INSERT INTO order_tracking
(order_id, status_id, status_changed_by)
VALUES
(1,1,''),
(1,2,12),
(1,3,12),
(2,1,''),
(3,1,''),
(4,1,''),
(1,4,13),
(5,1,''),
(3,2,'');
A quick response would be appreciated!
Thanks for your time.
Desired result:
Case 1, which is quite simple, should give a result like this: only new orders with no more than 1 record
Order_id | status_id | status_changed_by
2 | 1 | (NULL)
4 | 1 | (NULL)
3 | 1 | (NULL)
Case 2 result:
Order_id | status_id | status_changed_by
1 | 4(max id)| (NULL)
2 | 1 | (NULL)
4 | 1 | (NULL)
3 | 2(max id)| (NULL)
It seems, by reading between the lines of your question, that you want to display the highest numerical value of status for each order id. That is, it seems that your orders progress from status 1 to 2 to 3 and so forth.
Here's how you do that. First, you determine which is the highest status for each order, as follows:
SELECT MAX(status_id) AS status_id,
order_id
FROM order_tracking
GROUP BY order_id
This query gives you one row for each order_id showing the maximum value of the status id.
Then, you use it as a subquery and join to your original table like so. http://sqlfiddle.com/#!2/16936/11/0
SELECT o.order_id, o.status_id, o.status_changed_by
FROM order_tracking AS o
JOIN (
SELECT MAX(status_id) AS status_id,
order_id
FROM order_tracking
GROUP BY order_id
) AS m ON o.order_id = m.order_id AND o.status_id = m.status_id
ORDER BY o.order_id
This will give you a nice result with the highest status for each order.
| ORDER_ID | STATUS_ID | STATUS_CHANGED_BY |
|----------|-----------|-------------------|
| 1 | 4 | 13 |
| 2 | 1 | |
| 3 | 2 | |
| 4 | 1 | |
| 5 | 1 | |
Please note: If you were to put an autoincrementing ID column into your order_tracking table, things might work better for you. You'd be able to get the most recently INSERTed status for each order_id rather than the numerically highest status. This would be a very helpful change to your table layout, in my opinion. You'd do that like this:
SELECT o.order_id, o.status_id, o.status_changed_by
FROM order_tracking AS o
JOIN (
SELECT MAX(id) AS id,
order_id
FROM order_tracking
GROUP BY order_id
) AS m ON o.id = m.id
ORDER BY o.order_id
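Here is a minimal sketch of both variants, run against the question's sample rows with Python's sqlite3 module. SQLite stands in for MySQL, and the function names are made up. The second query uses track_id, the auto-increment column from the question's CREATE TABLE, in place of the id column written above; that substitution is an assumption.

```python
import sqlite3

# Rows from the question's INSERT: (order_id, status_id, status_changed_by)
TRACKING = [
    (1, 1, ""), (1, 2, "12"), (1, 3, "12"),
    (2, 1, ""), (3, 1, ""), (4, 1, ""),
    (1, 4, "13"), (5, 1, ""), (3, 2, ""),
]


def open_db():
    """Create the order_tracking table in memory and load the sample rows."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE order_tracking ("
        "track_id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "order_id INTEGER, status_id INTEGER, status_changed_by TEXT)"
    )
    con.executemany(
        "INSERT INTO order_tracking (order_id, status_id, status_changed_by) "
        "VALUES (?, ?, ?)",
        TRACKING,
    )
    return con


def highest_status(con):
    """Row with the numerically highest status_id for each order."""
    return con.execute(
        """
        SELECT o.order_id, o.status_id, o.status_changed_by
        FROM order_tracking AS o
        JOIN (SELECT MAX(status_id) AS status_id, order_id
              FROM order_tracking GROUP BY order_id) AS m
          ON o.order_id = m.order_id AND o.status_id = m.status_id
        ORDER BY o.order_id
        """
    ).fetchall()


def latest_inserted(con):
    """Most recently inserted row for each order, using track_id."""
    return con.execute(
        """
        SELECT o.order_id, o.status_id, o.status_changed_by
        FROM order_tracking AS o
        JOIN (SELECT MAX(track_id) AS track_id, order_id
              FROM order_tracking GROUP BY order_id) AS m
          ON o.track_id = m.track_id
        ORDER BY o.order_id
        """
    ).fetchall()


if __name__ == "__main__":
    con = open_db()
    print(highest_status(con))
    print(latest_inserted(con))
```

On this data the two queries agree because statuses were inserted in increasing order; they would differ if an order were ever moved back to a lower status.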

Mysql query with multiple conditions on FK

I have a slight problem with a MySQL query. I have two tables:
bioshops
+------------+-------------+
| bioshop_id | name |
+------------+-------------+
| 1 | Bioshop1 |
| 2 | Bioshop2 |
+------------+-------------+
bioshop_have_product
+----+-----------------+--------------+
| id | bioshop_id | product_id |
+----+-----------------+--------------+
| 1 | 1 | 1 |
| 2 | 1 | 2 |
| 3 | 2 | 1 |
| 4 | 2 | 3 |
+----+-----------------+--------------+
The tables are much more complex, but this is the important structure. product_id in bioshop_have_product is also an FK. I need to select the bioshops which contain all the products that I ask for. Example:
if I need bioshops with product 1 it should return Bioshop1 and Bioshop2 with all products
if I need bioshops with product 1 and 2 it should return Bioshop1 with all products
My query is:
SELECT bs.name AS name,
bs.bioshop_id AS bioshop_id,
bshp.id AS id,
bshd.product_id AS product_id
FROM bioshops bs
JOIN bioshop_have_product bshp
ON bs.bioshop_id = bshp.bioshop_id
WHERE (bshp.bioshop_id = bs.bioshop_id AND bshp.product_id = '1')
AND (bshp.bioshop_id = bs.bioshop_id AND bshp.product_id = '2')
but this returns nothing and I want it to return Bioshop1 because only Bioshop1 contains both products.
You can try something like this:
SELECT bs.name AS name,
bs.bioshop_id AS bioshop_id,
bshp.id AS id,
bshp.product_id AS product_id
FROM bioshops bs
JOIN bioshop_have_product bshp
ON bs.bioshop_id = bshp.bioshop_id AND
(SELECT COUNT(*) FROM bioshop_have_product WHERE product_id IN (1, 2) AND bs.bioshop_id = bioshop_id) = X
where X should be equal to the number of different products you want to check, for instance 2 in your second case.
SELECT bioshop_id
FROM bioshop_have_product
WHERE product_id IN (1,2)
GROUP
BY bioshop_id
HAVING COUNT(*) = 2;
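The GROUP BY / HAVING COUNT approach above can be sketched as a small parameterised helper, run here on the question's sample rows with Python's sqlite3 module. The names open_db and shops_with_all are made up for the example. As a small variation on the query above, the sketch uses COUNT(DISTINCT product_id), which guards against a shop listing the same product twice; that is an assumption about the data, not something the question states.

```python
import sqlite3

# Sample rows from the question: (id, bioshop_id, product_id)
HAVE = [(1, 1, 1), (2, 1, 2), (3, 2, 1), (4, 2, 3)]


def open_db():
    """Create bioshop_have_product in memory and load the sample rows."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE bioshop_have_product "
        "(id INTEGER PRIMARY KEY, bioshop_id INTEGER, product_id INTEGER)"
    )
    con.executemany("INSERT INTO bioshop_have_product VALUES (?, ?, ?)", HAVE)
    return con


def shops_with_all(con, product_ids):
    """Return the bioshop_ids that stock every product in product_ids."""
    wanted = sorted(set(product_ids))
    if not wanted:
        return []
    # One placeholder per requested product; values are bound, not interpolated.
    placeholders = ", ".join("?" for _ in wanted)
    sql = f"""
        SELECT bioshop_id
        FROM bioshop_have_product
        WHERE product_id IN ({placeholders})
        GROUP BY bioshop_id
        HAVING COUNT(DISTINCT product_id) = ?
        ORDER BY bioshop_id
    """
    return [row[0] for row in con.execute(sql, (*wanted, len(wanted)))]


if __name__ == "__main__":
    con = open_db()
    print(shops_with_all(con, [1]))     # [1, 2]
    print(shops_with_all(con, [1, 2]))  # [1]
```

To list every product of the matching shops, as the question asks, the returned ids could be joined back to bioshop_have_product.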

Getting COUNT while ignoring GROUP BY

I have the following table: ProductSales
+-------+-----------+--------+-----------+
|prod_id|customer_id|order_id|supplier_id|
+-------+-----------+--------+-----------+
| 1 | 1 | 1 | 1 |
+-------+-----------+--------+-----------+
| 2 | 4 | 2 | 2 |
+-------+-----------+--------+-----------+
| 3 | 1 | 1 | 1 |
+-------+-----------+--------+-----------+
| 4 | NULL | NULL | Null |
+-------+-----------+--------+-----------+
| 5 | 1 | 1 | 2 |
+-------+-----------+--------+-----------+
| 6 | 4 | 7 | 1 |
+-------+-----------+--------+-----------+
| 7 | 1 | 1 | 3 |
+-------+-----------+--------+-----------+
I have a SELECT query:
SELECT customer_id AS customer, count(*) AS prod_count
, count(DISTINCT order_id) as orders
FROM ProductSales
WHERE supplier_id=1
GROUP BY customer_id
HAVING customer_id<>'NULL'
This produces the result:
+--------+----------+------+
|customer|prod_count|orders|
+--------+----------+------+
| 1 | 2 | 1 |
+--------+----------+------+
| 4 | 1 | 1 |
+--------+----------+------+
What I have been trying to achieve and getting nowhere is to add a fourth column in my results to show the number of order_ids that belong only to the current supplier for each customer:
+--------+----------+------+-------------+
|customer|prod_count|orders|Unique Orders|
+--------+----------+------+-------------+
| 1 | 2 | 1 | 0 | } Order '1' is connected with two supplier_ids
+--------+----------+------+-------------+
| 4 | 1 | 1 | 1 | } Order '2' is connected to only one supplier_id
+--------+----------+------+-------------+
(This gets more complex when there are more orders per customer associated with far more suppliers).
I thought I was close with:
SELECT t1.user_id, count(DISTINCT t1.prod_id) AS prod_count
, count(DISTINCT t1.order_id) as orders
, IF(count(DISTINCT t3.supplier_id)>1,0,1) AS Unique_Orders
FROM ProductSales AS t1
LEFT JOIN `order` AS t2 ON t1.order_id=t2.order_id
LEFT JOIN ProductSales AS t3 ON t2.order_id=t3.order_id
WHERE t1.supplier_id=1
GROUP BY t1.customer_id
HAVING t1.customer_id<>'NULL'
The orders table stated above is related to ProductSales only by order_id.
Which shows my Customers, Products(total), Orders(total) but the Unique Orders shows if there are unique orders (0) or not (1), I understand the logic of the IF statement and it does what I expect. It's working out how to find the number of unique orders which is baffling me.
The table is established and can't be changed.
Any suggestions?
Unique orders can be defined as
SELECT OrderID
FROM yourtable
GROUP BY OrderID
Having COUNT(Distinct SupplierID) = 1
So try
SELECT
customer_id AS customer,
count(*) AS prod_count,
count(DISTINCT productsales.order_id) as orders,
COUNT(distinct uqo) AS unique_orders
FROM ProductSales
left join
(
SELECT Order_ID uqo
FROM Productsales
GROUP BY Order_ID
Having COUNT(Distinct supplier_id) = 1
) uniqueorders
on ProductSales.order_id = uniqueorders.uqo
WHERE supplier_id=1
GROUP BY customer_id
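To check this against the sample table, here is a minimal sketch that runs the query through Python's sqlite3 module (SQLite standing in for MySQL). The function name unique_order_counts and the unique_orders alias are made up for the example, the stray period after prod_count is written as a comma, and an ORDER BY is added so the output order is fixed.

```python
import sqlite3

# Sample rows from the question: (prod_id, customer_id, order_id, supplier_id)
SALES = [
    (1, 1, 1, 1),
    (2, 4, 2, 2),
    (3, 1, 1, 1),
    (4, None, None, None),
    (5, 1, 1, 2),
    (6, 4, 7, 1),
    (7, 1, 1, 3),
]


def unique_order_counts(supplier_id):
    """Per customer: product count, order count, and orders with one supplier."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE ProductSales (prod_id INTEGER PRIMARY KEY, "
        "customer_id INTEGER, order_id INTEGER, supplier_id INTEGER)"
    )
    con.executemany("INSERT INTO ProductSales VALUES (?, ?, ?, ?)", SALES)
    rows = con.execute(
        """
        SELECT customer_id AS customer,
               COUNT(*) AS prod_count,
               COUNT(DISTINCT ProductSales.order_id) AS orders,
               COUNT(DISTINCT uqo) AS unique_orders
        FROM ProductSales
        LEFT JOIN (
            -- orders that involve exactly one supplier
            SELECT order_id AS uqo
            FROM ProductSales
            GROUP BY order_id
            HAVING COUNT(DISTINCT supplier_id) = 1
        ) uniqueorders ON ProductSales.order_id = uniqueorders.uqo
        WHERE supplier_id = ?
        GROUP BY customer_id
        ORDER BY customer_id
        """,
        (supplier_id,),
    ).fetchall()
    con.close()
    return rows


if __name__ == "__main__":
    for row in unique_order_counts(1):
        print(row)
```

For supplier 1 this gives (1, 2, 1, 0) and (4, 1, 1, 1), which matches the table the question is aiming for.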