MySQL denormalized data: finding and deleting duplicates

I have example data
ID DAY ORDER TIME PRODUCT
1 1 1 1 1
2 1 1 1 2
3 1 1 1 3
4 1 2 2 1
5 1 2 2 2
6 1 2 2 3
7 1 2 *3* 1
8 1 2 *3* 2
9 1 2 *3* 3
I want to prevent multiple orders at different times on the same day. If I set a unique index on DAY, ORDER, TIME, I will not be able to insert the same combination more than once anyway, but what I want is to disallow different TIME values for the same DAY and ORDER. Is this possible with MySQL?
How can I find all records where there are multiple different TIME values for the same DAY and ORDER, and delete them?
In this case I would like to delete records 7, 8 and 9 with an SQL query, because they are a duplicate ORDER that was inserted.
I don't want to normalize the table; I will stick with this database structure.
Thank you very much

You can use delete with a join clause to find the duplicates and delete them:
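-- MySQL needs the target table named after DELETE, and backticks
-- (not double quotes) around the reserved word order
delete t
from t join
(select day, `order`, min(time) as tokeeptime
from t
group by day, `order`
) tokeep
on t.day = tokeep.day and t.`order` = tokeep.`order` and t.time <> tokeep.tokeeptime;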

DELETE a
FROM tableName a
INNER JOIN
(
SELECT a.DAY, a.ORDER, MAX(a.TIME) Time
FROM tableName a
GROUP BY a.DAY, a.ORDER
HAVING COUNT(DISTINCT TIME) > 1
) b ON a.DAY = b.DAY AND
a.Order = b.Order AND
a.Time = b.Time
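For anyone who wants to try the idea without a MySQL server, here is a minimal sketch using Python's built-in sqlite3 module with the sample data from the question. It is an assumption-laden stand-in, not the MySQL statement itself: SQLite has no DELETE ... JOIN, so the delete is written with a correlated subquery instead, and the table name t is just a placeholder. It first lists the (DAY, ORDER) pairs that have more than one distinct TIME, then keeps only the earliest TIME for each pair.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "order" is quoted because it is a reserved word (SQLite uses double quotes).
conn.execute(
    'CREATE TABLE t (id INTEGER PRIMARY KEY, day INT, "order" INT, time INT, product INT)'
)
rows = [
    (1, 1, 1, 1, 1), (2, 1, 1, 1, 2), (3, 1, 1, 1, 3),
    (4, 1, 2, 2, 1), (5, 1, 2, 2, 2), (6, 1, 2, 2, 3),
    (7, 1, 2, 3, 1), (8, 1, 2, 3, 2), (9, 1, 2, 3, 3),
]
conn.executemany("INSERT INTO t VALUES (?, ?, ?, ?, ?)", rows)

# Step 1: find (day, order) pairs that appear with more than one distinct time.
conflicts = conn.execute(
    'SELECT day, "order", COUNT(DISTINCT time) FROM t '
    'GROUP BY day, "order" HAVING COUNT(DISTINCT time) > 1'
).fetchall()

# Step 2: delete every row whose time is not the earliest for its (day, order).
conn.execute(
    'DELETE FROM t WHERE time <> ('
    ' SELECT MIN(k.time) FROM t AS k'
    ' WHERE k.day = t.day AND k."order" = t."order")'
)
remaining_ids = [r[0] for r in conn.execute("SELECT id FROM t ORDER BY id")]
print(conflicts, remaining_ids)
```

Running the SELECT on its own first is a cheap way to review what would be removed before executing the DELETE.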

Related

MySQL UPDATE with GROUP and ORDER

I'm trying to make an update on a table so that it can increment the values on 1 column depending on another's order.
Here's how it'd go
ID GROUP_ID ORDER(Desired) ORDER(NOW)
1 1 1 2
2 1 2 3
3 1 3 1
4 2 1 2
5 2 2 1
6 3 1 1
7 3 2 1
8 3 3 2
So what I need is for each ID, to update the ORDER column so it can be consecutive, starting from 1, within each GROUP_ID.
I have found some solutions to similar problems regarding the updates and orders, but none that uses multiple orders for groups within the same table.
Hope I illustrated the problem right. Thanks in advance
You can do it by "ranking" the rows again. MySQL versions before 8.0 don't support window functions, but you can achieve the same result with a join and a count like this:
UPDATE YourTable t
INNER JOIN(SELECT s.id,s.group_id,count(*) as cnt
FROM YourTable s
INNER JOIN YourTable ss
ON(s.group_id = ss.group_id and s.id >= ss.id)
GROUP BY s.id,s.group_id) tt
ON (t.id = tt.id and t.group_id = tt.group_id)
SET t.order = tt.cnt
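The counting trick can be checked quickly with a small sketch. This one assumes SQLite through Python's sqlite3 module rather than MySQL, so the UPDATE uses a correlated subquery instead of UPDATE ... JOIN, and the names your_table and ord are placeholders (ord avoids the reserved word ORDER). The rank of a row is simply the number of rows in the same group whose id is less than or equal to its own.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE your_table (id INTEGER PRIMARY KEY, group_id INT, ord INT)")
# (id, group_id, current order) taken from the ORDER(NOW) column of the question.
conn.executemany(
    "INSERT INTO your_table VALUES (?, ?, ?)",
    [(1, 1, 2), (2, 1, 3), (3, 1, 1), (4, 2, 2),
     (5, 2, 1), (6, 3, 1), (7, 3, 1), (8, 3, 2)],
)

# Renumber: count the rows of the same group with an id <= the current row's id.
conn.execute(
    "UPDATE your_table SET ord = ("
    " SELECT COUNT(*) FROM your_table AS s"
    " WHERE s.group_id = your_table.group_id AND s.id <= your_table.id)"
)
result = conn.execute(
    "SELECT id, group_id, ord FROM your_table ORDER BY id"
).fetchall()
print(result)
```

The subquery never reads the column being updated, so the numbering does not depend on the order in which rows are processed.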

MySQL - add column to table and insert "tag" if order is from new customer

I have simple table:
Order_ID Client_ID Date Order_Status
1 1 01/01/2015 3
2 2 05/01/2015 3
3 1 06/01/2015 3
4 2 10/01/2015 3
5 1 12/01/2015 4
6 1 05/02/2015 3
I want to identify orders from new customers, i.e. orders placed in the same month in which that customer made their first order with Order_Status = 3.
So the output table should look like this:
Order_ID Client_ID Date Order_Status Order_from_new_customer
1 1 01/01/2015 3 yes
2 2 05/01/2015 3 yes
3 1 06/01/2015 3 yes
4 2 10/01/2015 3 yes
5 1 12/01/2015 4 NULL
6 1 05/02/2015 3 no
I wasn't able to successfully figure out the query. Thanks a lot for any help.
Join with a subquery that gets the date of the first order by each customer.
SELECT o.*, IF(MONTH(o.date) = MONTH(f.date) AND YEAR(o.date) = YEAR(f.date),
'yes', 'no') AS order_from_new_customer
FROM orders AS o
JOIN (SELECT Client_ID, MIN(date) AS date
FROM orders
WHERE Order_Status = 3
GROUP BY Client_ID) AS f
ON o.Client_ID = f.Client_ID
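A variant of this join that also returns NULL for orders whose status is not 3 can be sketched as follows. Several things here are assumptions: it runs on SQLite via Python's sqlite3 module, so strftime('%Y-%m', ...) stands in for MySQL's MONTH() and YEAR(); the dates in the question are read as day/month/year and stored as ISO strings; and the lowercase column names are placeholders.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (order_id INT, client_id INT, date TEXT, order_status INT)"
)
# Sample rows from the question, dates rewritten as YYYY-MM-DD (assumed DD/MM/YYYY).
conn.executemany("INSERT INTO orders VALUES (?, ?, ?, ?)", [
    (1, 1, "2015-01-01", 3), (2, 2, "2015-01-05", 3), (3, 1, "2015-01-06", 3),
    (4, 2, "2015-01-10", 3), (5, 1, "2015-01-12", 4), (6, 1, "2015-02-05", 3),
])

query = """
SELECT o.order_id,
       CASE WHEN o.order_status <> 3 THEN NULL
            WHEN strftime('%Y-%m', o.date) = strftime('%Y-%m', f.first_date) THEN 'yes'
            ELSE 'no'
       END AS order_from_new_customer
FROM orders AS o
JOIN (SELECT client_id, MIN(date) AS first_date   -- first status-3 order per client
      FROM orders
      WHERE order_status = 3
      GROUP BY client_id) AS f
  ON o.client_id = f.client_id
ORDER BY o.order_id
"""
tags = conn.execute(query).fetchall()
print(tags)
```

Comparing year and month together in one formatted string avoids matching January of one year against January of another.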
Use a CASE statement along with a SELF JOIN like below
select t1.*,
case when t1.Order_Status = 3 and MONTH(t1.`date`) = 1 then 'yes'
when t1.Order_Status = 3 and MONTH(t1.`date`) <> 1 then 'no'
else null end as Order_from_new_customer
from order_table t1 join order_table t2
on t1.Order_ID < t2.Order_ID
and t1.Client_ID = t2.Client_ID;
If your order table gets big, the solutions from Rahul and Barmar will tend to get slow.
I would hope your shop will get many orders and you will run into performance trouble ;-). So I would suggest marking the very first order of a new customer with a tinyint column, and when you have the comfort of a tinyint, you could code it like:
0 : unknown
1 : very first order
2 : order in first month
3 : order in "grown-up" mode.
The very first order is probably easy to mark: everyone loves a bright new customer enough to store this event somehow during the first order. The other orders can be identified in a background job or cron job by their "0" for unknown, or you can mark your old customers and store the "3" on their orders.
The result-set can be achieved without any table-join or subquery:
select
-- the assignment is parenthesized because := has the lowest operator precedence
if(Order_Status<>3,null,if((@first_date:=if(@prev_client_id!=Client_ID,month(date),@first_date))=month(date),"yes","no")) as Order_from_new_customer
,Order_ID,Client_ID,date,Order_Status,@prev_client_id:=client_id
from
t1,
(select @prev_client_id:="",@first_date:="")t
order by Client_ID ,date
One extra column is added for the computation, and an ORDER BY clause is used.
Verify result at http://sqlfiddle.com/#!9/83c29f/24

MySQL order a table by groups

I have a table with the fields id, group, left, level and createdAt.
Every row belongs to a group. The row with level = 0 is the group "leader".
I want to sort the table by the leaders' date, and within each group sort the rows by left. For example, in this table:
Id - Group - Left - Level - CreatedAt
1 1 1 0 00:10
2 1 2 1 00:20
3 2 1 0 00:00
4 1 3 1 00:30
5 2 2 1 00:40
The order should be:
Id - Group - Left - Level - CreatedAt
3 2 1 0 00:00
5 2 2 1 00:40
1 1 1 0 00:10
2 1 2 1 00:20
4 1 3 1 00:30
Row 3 is the group leader with the earliest createdAt, so it should come first, followed by the rest of its group ordered by left. After that comes row 1, the leader with the next-earliest createdAt, followed by its group ordered by left, and so on.
I hope I explained it clear enough.
Thanks!
Essentially, you need to join your table with the leader's time:
SELECT my_table.*
FROM my_table NATURAL JOIN (
SELECT my_table.Group, MIN(my_table.CreatedAt) AS LeaderTime
FROM my_table
WHERE my_table.Level = 0
GROUP BY my_table.Group
) t
ORDER BY t.LeaderTime, my_table.Left
A UNIQUE constraint on (Group, Level) would guarantee an unambiguous leader for every group, but you cannot have one because your example contains two records in Group = 1 with Level = 1. If you can guarantee a single leader per group some other way, you can avoid the grouping operation:
SELECT my_table.*
FROM my_table JOIN my_table AS leader
ON leader.Group = my_table.Group AND leader.Level = 0
ORDER BY leader.CreatedAt, my_table.Left
In order to sort on the group leader's createdAt, you can join each row to the row describing the group leader, then do the sorting:
SELECT t1.*
FROM my_table t1
INNER JOIN my_table t2
ON t1.Group = t2.Group AND t2.Level = 0
ORDER BY t2.CreatedAt, t1.Left;
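To see the self-join ordering on the sample rows, here is a small sketch that assumes SQLite through Python's sqlite3 module. The columns are renamed to grp, lft and created_at (hypothetical names) only to sidestep the reserved words GROUP and LEFT; the join and ORDER BY are otherwise the same idea as above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE my_table (id INT, grp INT, lft INT, level INT, created_at TEXT)"
)
conn.executemany("INSERT INTO my_table VALUES (?, ?, ?, ?, ?)", [
    (1, 1, 1, 0, "00:10"), (2, 1, 2, 1, "00:20"), (3, 2, 1, 0, "00:00"),
    (4, 1, 3, 1, "00:30"), (5, 2, 2, 1, "00:40"),
])

# Attach each row to its group's leader (level = 0), then sort by the
# leader's creation time first and by lft within the group second.
ordered_ids = [r[0] for r in conn.execute(
    "SELECT t1.id FROM my_table AS t1"
    " JOIN my_table AS leader ON leader.grp = t1.grp AND leader.level = 0"
    " ORDER BY leader.created_at, t1.lft"
)]
print(ordered_ids)
```

If two leaders could share the same creation time, adding the group column as a second sort key would keep their groups from interleaving.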

Access Totals Query Not Necessarily Returning First Record

I have a table of data like this:
id user_id A B C
=====================
1 15 1 2 3
2 15 1 2 5
3 20 1 3 9
4 20 1 3 7
I need to remove duplicate user ids and keep the record that sorts lowest when sorting by A then B then C. So using the above table, I set up a temp query (qry_temp) that simply does the sort--first on user_id, then on A, then on B, then on C. It returns the following:
id user_id A B C
====================
1 15 1 2 3
2 15 1 2 5
4 20 1 3 7
3 20 1 3 9
Then I wrote a Totals Query based on qry_temp that just had user_id (Group By) and then id (First), and I assumed this would return the following:
user_id id
===========
15 1
20 4
But it doesn't seem to do that--instead it appears to be just returning the lowest id in a group of duplicate user ids (so I get 1 and 3 instead of 1 and 4). Shouldn't the Totals query use the order of the query it's based upon? Is there a property setting in the query that might impact this or another way to get what I need? If it helps, here is the SQL:
SELECT qry_temp.user_id, First(qry_temp.ID) AS FirstOfID
FROM qry_temp
GROUP BY qry_temp.user_id;
You need a different type of query, for example:
SELECT tmp.id,
tmp.user_id,
tmp.a,
tmp.b,
tmp.c
FROM tmp
WHERE (( ( tmp.id ) IN (SELECT TOP 1 id
FROM tmp t
WHERE t.user_id = tmp.user_id
ORDER BY t.a,
t.b,
t.c,
t.id) ));
Here tmp is the name of your table. First, Last, Min and Max do not depend on a sort order; in relational databases, sort orders are quite ephemeral.
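The same "pick the lowest-sorting row per user" idea can be tried outside Access with this sketch. It assumes SQLite via Python's sqlite3 module, where LIMIT 1 plays the role of Access's TOP 1; everything else mirrors the sample table from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tmp (id INT, user_id INT, a INT, b INT, c INT)")
conn.executemany("INSERT INTO tmp VALUES (?, ?, ?, ?, ?)", [
    (1, 15, 1, 2, 3), (2, 15, 1, 2, 5), (3, 20, 1, 3, 9), (4, 20, 1, 3, 7),
])

# For each row, check whether it is the first row of its user when that
# user's rows are sorted by a, b, c (id breaks any remaining ties).
kept = conn.execute(
    "SELECT tmp.user_id, tmp.id FROM tmp"
    " WHERE tmp.id = (SELECT t.id FROM tmp AS t"
    "                 WHERE t.user_id = tmp.user_id"
    "                 ORDER BY t.a, t.b, t.c, t.id LIMIT 1)"
    " ORDER BY tmp.user_id"
).fetchall()
print(kept)
```

The explicit ORDER BY inside the subquery is what makes the choice deterministic, which an aggregate over an already sorted query does not guarantee.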

Get the last 2 rows of a table while grouping one of the column. MySQL

Consider Facebook. Facebook displays the latest 2 comments of any status. I want to do something similar.
I have a table with e.g. status_id, comment_id, comment and timestamp.
Now I want to fetch the latest 2 comments for each status_id.
Currently I am first doing a GROUP_CONCAT of all columns, group by status_id and then taking the SUBSTRING_INDEX with -2.
This fetches the latest 2 comments, however the GROUP_CONCAT of all the records for a status_id is an overhead.
SELECT SUBSTRING_INDEX(GROUP_CONCAT('~', comment_id,
'~', comment,
'~', timestamp)
SEPARATOR '|~|'),
'|~|', -2)
FROM commenttable
GROUP BY status_id;
Can you help me with better approach?
My table looks like this -
status_id comment_id comment timestamp
1 1 xyz1 3 hour
1 2 xyz2 2 hour
1 3 xyz3 1 hour
2 4 xyz4 2 hour
2 6 xyz6 1 hour
3 5 xyz5 1 hour
So I want the output as -
1 2 xyz2 2 hour
1 3 xyz3 1 hour
2 4 xyz4 2 hour
2 6 xyz6 1 hour
3 5 xyz5 1 hour
Here is a great answer I came across:
-- keep a row when fewer than 2 comments of the same status are newer
select status_id, comment_id, comment, timestamp
from commenttable
where (
select count(*) from commenttable as f
where f.status_id = commenttable.status_id
and f.timestamp > commenttable.timestamp
) < 2;
This is not very efficient (O(n^2)), but it's a lot more efficient than concatenating strings and using substrings to isolate your desired result. Some would say that resorting to string operations instead of native database indexing robs you of the benefits of using a database in the first place.
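Here is a minimal sketch of the counting approach for the "latest two comments per status" problem, assuming SQLite through Python's sqlite3 module. The integer ts column and its values are made up for illustration (a larger value means a newer comment), since the question only shows relative ages. A row is kept when fewer than two rows with the same status_id have a later ts.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE commenttable (status_id INT, comment_id INT, comment TEXT, ts INT)"
)
# ts is a hypothetical stand-in for the real timestamp; larger = more recent.
conn.executemany("INSERT INTO commenttable VALUES (?, ?, ?, ?)", [
    (1, 1, "xyz1", 1), (1, 2, "xyz2", 2), (1, 3, "xyz3", 3),
    (2, 4, "xyz4", 2), (2, 6, "xyz6", 3), (3, 5, "xyz5", 3),
])

latest_two = conn.execute(
    "SELECT c.status_id, c.comment_id FROM commenttable AS c"
    " WHERE (SELECT COUNT(*) FROM commenttable AS f"
    "        WHERE f.status_id = c.status_id AND f.ts > c.ts) < 2"
    " ORDER BY c.status_id, c.ts"
).fetchall()
print(latest_two)
```

Changing the 2 in the comparison to another number returns that many latest comments per status; ties on ts can return extra rows, so a unique tiebreaker such as comment_id may be worth adding.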
After some struggle I found this solution -
The following gives me the row_id -
SELECT a.status_id,
a.comments_id,
COUNT(*) AS row_num
FROM comments a
JOIN comments b
ON a.status_id = b.status_id AND a.comments_id >= b.comments_id
GROUP BY a.status_id , a.comments_id
ORDER BY row_num DESC
This gives me the total rows -
SELECT com.status_id, COUNT(*) total
FROM comments com
GROUP BY com.status_id
In the where clause of the main select -
row_num = total OR row_num = total - 1
This gives the latest 2 rows. You can modify the where clause to fetch more than 2 latest rows.