Delete rows having exactly same values - mysql

Please consider the following table.
Table name: mytable
model_id | event_name | time_of_event
---------+------------+--------------------
9        | CREATE     | 2016-01-01 00:00:00
9        | UPDATE     | 2016-01-01 01:00:00
9        | DELETE     | 2016-01-01 02:00:00
3        | CREATE     | 2016-01-01 03:00:00   DUPLICATE
3        | CREATE     | 2016-01-01 03:00:00   DUPLICATE delete this
3        | DELETE     | 2016-01-01 04:00:00
How can I delete the 5th entry from the table above, i.e. delete one of two rows that have exactly the same values?
In the example above, no column is unique.
Please keep in mind that the database could be huge, and I don't want to recreate the table or republish the data with distinct values into it.
-- Use the code below to create the example above
CREATE TABLE mytable(
model_id integer,
event_name varchar(7),
time_of_event timestamp
);
INSERT INTO mytable
(model_id, event_name, time_of_event)
VALUES
(9, 'CREATE', '2016-01-01 00:00:00'),
(9, 'UPDATE', '2016-01-01 01:00:00'),
(9, 'DELETE', '2016-01-01 02:00:00'),
(3, 'CREATE', '2016-01-01 03:00:00'),
(3, 'CREATE', '2016-01-01 03:00:00'),
(3, 'DELETE', '2016-01-01 04:00:00');
SELECT * FROM mytable;

Try a helper table that contains the duplicated rows, but only one copy of each:
With this scenario: ...
CREATE TABLE mytable(
model_id integer,event_name VARCHAR(8),time_of_event TIMESTAMP)
;
INSERT INTO mytable
-- your input data ...
SELECT 9,'CREATE',TIMESTAMP '2016-01-01 00:00:00'
UNION ALL SELECT 9,'UPDATE',TIMESTAMP '2016-01-01 01:00:00'
UNION ALL SELECT 9,'DELETE',TIMESTAMP '2016-01-01 02:00:00'
UNION ALL SELECT 3,'CREATE',TIMESTAMP '2016-01-01 03:00:00'
UNION ALL SELECT 3,'CREATE',TIMESTAMP '2016-01-01 03:00:00'
UNION ALL SELECT 3,'DELETE',TIMESTAMP '2016-01-01 04:00:00'
;
Create your helper table like so:
CREATE TABLE helper AS
SELECT
model_id
, event_name
, time_of_event
FROM mytable
GROUP BY
model_id
, event_name
, time_of_event
HAVING COUNT(*) > 1;
Then use the helper table to delete. Note that this deletes every copy of each duplicated row, not only the extra ones:
DELETE FROM mytable
WHERE(model_id,event_name,time_of_event) IN (
SELECT model_id,event_name,time_of_event FROM helper
);
And finally, insert all the rows from the helper table back in again:
INSERT INTO mytable
SELECT * FROM helper;
COMMIT; -- if your connection is not auto-commit ...
But I'd like to add that, for most database systems, the other approach (creating a new table from SELECT DISTINCT * FROM old_table) is the faster alternative as soon as the affected rows amount to around 20 to 25 % of the total row count.
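As a rough sketch of that alternative in MySQL, the distinct rows could be copied into a new table that is then swapped in. The names mytable_dedup and mytable_old are made up for illustration, and the sketch assumes there is enough disk space for a second copy and that nothing writes to mytable while the copy runs.

```sql
-- Hypothetical table names: mytable_dedup, mytable_old.
-- Create an empty table with the same structure as the original.
CREATE TABLE mytable_dedup LIKE mytable;

-- Copy each distinct row exactly once.
INSERT INTO mytable_dedup
SELECT DISTINCT model_id, event_name, time_of_event
FROM mytable;

-- Swap the tables in a single rename, then drop the old data.
RENAME TABLE mytable TO mytable_old,
             mytable_dedup TO mytable;
DROP TABLE mytable_old;
```

Keeping mytable_old around until the new table has been checked is a cheap safety net; the DROP can be run later.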

Having two or more rows with identical values is a sign of very bad design. I suppose model_id is the table's primary key. I wonder how you ended up in this situation.
"I don't want to recreate or republish data with distinct values into the table."
One possible solution is to add (not recreate/republish) a column with unique values to your table, and then delete the duplicate rows you want.
ALTER TABLE mytable ADD COLUMN MyTableID INT FIRST;
You need to fill this column with unique values:
SET @i := 0;
UPDATE MyTable SET MyTableID = @i:=(@i+1) WHERE 1=1;
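If the counter-variable UPDATE is awkward (assigning user variables inside expressions is deprecated in MySQL 8.0), one alternative sketch is to let MySQL number the rows itself by adding the column as an AUTO_INCREMENT key. This would replace both the ALTER TABLE and the UPDATE above. It assumes the table has no primary key yet, which is the case for the sample table.

```sql
-- Adds the helper column and fills the existing rows with 1, 2, 3, ...
-- Assumption: mytable has no primary key yet.
ALTER TABLE mytable
    ADD COLUMN MyTableID INT NOT NULL AUTO_INCREMENT PRIMARY KEY FIRST;
```

The order in which existing rows receive their numbers is not guaranteed, which does not matter here because the duplicates are interchangeable.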
Next, you can write the following query:
SELECT
MT.MyTableID, MT.model_id, MT.event_name, MT.time_of_event
FROM
MyTable MT,
(SELECT model_id, event_name, time_of_event, COUNT(*)
FROM MyTable
GROUP BY model_id, event_name, time_of_event
HAVING COUNT(*) > 1
) TmpTable
WHERE
MT.model_id = TmpTable.model_id
AND MT.event_name = TmpTable.event_name
AND MT.time_of_event = TmpTable.time_of_event
;
Result:
MyTableID | model_id | event_name | time_of_event
----------+----------+------------+--------------------
4         | 3        | CREATE     | 2016-01-01 03:00:00
5         | 3        | CREATE     | 2016-01-01 03:00:00
You can now proceed with the deletion of duplicate rows:
DELETE FROM MyTable WHERE MyTableID IN (5 /*, the ones you wish */);
If you have too many duplicate values and you can't afford to delete them manually, you can do it like this:
DELETE FROM MyTable WHERE MyTableID IN (
SELECT
MT.MyTableID
FROM
(SELECT * FROM MyTable) AS MT,
(SELECT model_id, event_name, time_of_event, COUNT(*)
FROM MyTable
GROUP BY model_id, event_name, time_of_event
HAVING COUNT(*) > 1
) AS TmpTable
WHERE
MT.model_id = TmpTable.model_id
AND MT.event_name = TmpTable.event_name
AND MT.time_of_event = TmpTable.time_of_event
) LIMIT /* The number of duplicate rows - 1 */;
The -1 is to preserve one row of the duplicates. If you want to delete them all, remove the LIMIT clause.
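The LIMIT approach fits a single group of duplicates. When several different rows are duplicated, a rule per group is needed. A minimal sketch, assuming MyTableID has been filled with unique values as described above: a self-join deletes every row for which an identical row with a smaller MyTableID exists, so the copy with the lowest MyTableID survives in each group. The table is written as mytable here, matching the CREATE TABLE statement in the question.

```sql
-- Keep the lowest MyTableID in each group of identical rows.
-- <=> is MySQL's NULL-safe equality, so rows containing NULLs are matched too.
DELETE dup
FROM mytable AS dup
JOIN mytable AS orig
  ON  dup.model_id      <=> orig.model_id
  AND dup.event_name    <=> orig.event_name
  AND dup.time_of_event <=> orig.time_of_event
  AND dup.MyTableID     >   orig.MyTableID;

-- Optional: remove the helper column once the duplicates are gone.
ALTER TABLE mytable DROP COLUMN MyTableID;
```

On a huge table the self-join benefits from an index on the compared columns; without one, each row is compared against a full scan.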

Related

mysql union Merge different columns

I want to remove the NULL values and move the values from yesterday up, so that both counts appear in the same row.
But I don't know how to do it.
Full SQL:
(SELECT
COUNT(1) toDay, NULL AS yesterDay
FROM
bas_user
WHERE UNIX_TIMESTAMP(user_datetime) BETWEEN UNIX_TIMESTAMP(
DATE_FORMAT(CURDATE(), '%Y-%m-%d %H:%i:%s')
)
AND UNIX_TIMESTAMP(NOW())
GROUP BY HOUR(user_datetime))
UNION
(SELECT
NULL AS toDay,COUNT(1) yesterDay
FROM
bas_user
WHERE UNIX_TIMESTAMP(user_datetime) BETWEEN UNIX_TIMESTAMP(
DATE_SUB(
DATE_FORMAT(CURDATE(), '%Y-%m-%d %H:%i:%s'),
INTERVAL 1 DAY
)
)
AND UNIX_TIMESTAMP(DATE_SUB(NOW(), INTERVAL 1 DAY))
GROUP BY HOUR(user_datetime)
)

In order to merge the two result sets, you need a join key. For example, assume user_id is the join key of both.
-- Step 1
create table user_today (
user_id int,
today_count int);
create table user_yesterday (
user_id int,
yesterday_count int);
insert into user_today values (101, 10), (102, 20), (103, 30);
insert into user_yesterday values (102, 25), (103, 35), (104, 45);
-- Step 2
select COALESCE(t.user_id, y.user_id) as user_id,
t.today_count,
y.yesterday_count
from user_today t
left
join user_yesterday y
using (user_id)
union
select COALESCE(y.user_id, t.user_id) as user_id,
t.today_count,
y.yesterday_count
from user_yesterday y
left
join user_today t
using (user_id);
Result:
user_id|today_count|yesterday_count|
-------+-----------+---------------+
    101|         10|               |
    102|         20|             25|
    103|         30|             35|
    104|           |             45|
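For the query in the question, the natural join key is the hour. Instead of joining two result sets, a different technique, conditional aggregation, can produce both counts in one pass. The sketch below assumes user_datetime is a DATETIME or TIMESTAMP column; the alias hr is made up for illustration. It relies on MySQL evaluating a comparison to 1 or 0, so SUM counts the matching rows.

```sql
-- One row per hour, with today's and yesterday's counts side by side.
SELECT HOUR(user_datetime)             AS hr,
       SUM(user_datetime >= CURDATE()) AS toDay,
       SUM(user_datetime <  CURDATE()) AS yesterDay
FROM bas_user
WHERE user_datetime BETWEEN CURDATE() AND NOW()
   OR user_datetime BETWEEN CURDATE() - INTERVAL 1 DAY
                        AND NOW() - INTERVAL 1 DAY
GROUP BY hr
ORDER BY hr;
```

An hour that has rows on only one of the two days shows 0 in the other column rather than NULL.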

Keep the latest 7 records and delete all other query issue

I have a mysql table
CREATE TABLE IF NOT EXISTS `mytable` (
`i_contact_id` int(16) NOT NULL AUTO_INCREMENT,
`s_contact_name` char(48) NOT NULL,
`ts_contact_scraped` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP COMMENT 'Date Time when contact is last scraped.',
PRIMARY KEY (`i_contact_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 AUTO_INCREMENT=1 ;
INSERT INTO `mytable` (`i_contact_id`, `s_contact_name`, `ts_contact_scraped`) VALUES
(1, 'aaaa', '2018-07-27 02:30:30'),
(2, 'bbbb', '2017-03-28 04:13:08'),
(3, 'cccc', '2017-03-12 03:52:57'),
(4, 'dddd', '2017-04-18 07:13:34'),
(5, 'eeee', '2018-05-29 15:22:23'),
(6, 'ffff', '2018-02-23 13:27:24'),
(7, 'gggg', '2016-10-17 22:50:24'),
(8, 'hhhh', '2018-07-20 14:02:14'),
(9, 'iiii', '2020-03-24 10:56:02');
I want to keep the 7 latest rows and delete all older rows based on the ts_contact_scraped field, but it doesn't work properly.
Here is my delete query
DELETE FROM `mytable`
WHERE i_contact_id <= (
SELECT i_contact_id
FROM (
SELECT i_contact_id
FROM `mytable`
ORDER BY ts_contact_scraped DESC
LIMIT 1 OFFSET 7
) foo
)
My original table has more than 1100000 rows, I want to run above query periodically using PHP to purge oldest rows, there is some other logic involved so I want to delete the oldest rows based on ts_contact_scraped field.
When I run this query on my original table it deletes more than expected rows.
Here is a fiddle:
http://sqlfiddle.com/#!9/9414e2/1/0

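You can use JOIN. Join on ts_contact_scraped rather than i_contact_id, and use <= so that the row at offset 7 and every older row are deleted:
DELETE t
FROM `mytable` t JOIN
(SELECT ts_contact_scraped
FROM `mytable`
ORDER BY ts_contact_scraped DESC
LIMIT 1 OFFSET 7
) tt
ON t.ts_contact_scraped <= tt.ts_contact_scraped;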

In your DELETE statement you are relying on a higher ts_contact_scraped also meaning a higher i_contact_id. In your example, at least, this is not the case.
So stick to ts_contact_scraped instead:
DELETE FROM `mytable`
WHERE ts_contact_scraped <= (
SELECT ts_contact_scraped
FROM (
SELECT ts_contact_scraped
FROM `mytable`
ORDER BY ts_contact_scraped DESC
LIMIT 1 OFFSET 7
) foo
);
Here is your altered fiddle: http://sqlfiddle.com/#!9/610cb4/1
(If there can be duplicate ts_contact_scraped, though, things will get more complicated.)
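If ties are possible, one way to make the cut-off deterministic is to rank the rows with a tie-breaker. A sketch assuming MySQL 8.0 or later (window functions), with i_contact_id as the tie-breaker; that choice is an assumption, not something the question specifies.

```sql
-- Rank rows from newest to oldest; ties on the timestamp are broken by id.
-- Everything ranked after position 7 is deleted.
DELETE t
FROM mytable AS t
JOIN (
    SELECT i_contact_id,
           ROW_NUMBER() OVER (ORDER BY ts_contact_scraped DESC,
                                       i_contact_id DESC) AS rn
    FROM mytable
) AS ranked
  ON ranked.i_contact_id = t.i_contact_id
WHERE ranked.rn > 7;
```

For the sample data this removes the rows with i_contact_id 3 and 7, the two oldest timestamps.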

DELETE t1.*
FROM `mytable` t1
LEFT JOIN ( SELECT i_contact_id
FROM `mytable`
ORDER BY ts_contact_scraped DESC
LIMIT 7 ) t2 ON t1.i_contact_id = t2.i_contact_id
WHERE t2.i_contact_id IS NULL;
or
DELETE t1.*
FROM `mytable` t1, ( SELECT ts_contact_scraped
FROM `mytable`
ORDER BY ts_contact_scraped DESC
LIMIT 1 OFFSET 7 ) t2
WHERE t1.ts_contact_scraped <= t2.ts_contact_scraped;

MySQL INSERT multiple rows if certain values don't exist

I have the following query which works correctly:
INSERT INTO Events (user_ID, event_type, event_creation_datetime, unit_ID)
SELECT 10, 'user_other_unit_moved', now(), 8383
FROM Events
WHERE NOT EXISTS (SELECT event_ID FROM Events WHERE event_type = 'user_other_unit_moved' AND unit_ID = 8383)
LIMIT 1;
What the query does is check to see if a row exists in my Events table that matches the event type and unit ID I wish to INSERT. If it finds an existing record, it does not proceed with the INSERT. However, if it does not find a record then it proceeds with the INSERT.
This is the structure of my Events table:
CREATE TABLE `Events` (
`event_ID` int(11) NOT NULL,
`user_ID` int(11) NOT NULL,
`event_type` varchar(35) NOT NULL,
`event_creation_datetime` datetime NOT NULL,
`unit_ID` int(10) UNSIGNED NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
ALTER TABLE `Events`
ADD PRIMARY KEY (`event_ID`),
ADD KEY `unit_ID` (`unit_ID`);
ALTER TABLE `Events`
MODIFY `event_ID` int(11) NOT NULL AUTO_INCREMENT;
COMMIT;
The problem I have is trying to get the above query to work correctly when trying to INSERT multiple rows. I know how to INSERT multiple rows using comma delimited VALUES, but no matter what I try I get syntax errors. Here is the query I have been playing with:
INSERT INTO Events (user_ID, event_type, event_creation_datetime, unit_ID)
VALUES (
(SELECT 10, 'user_other_unit_moved', now(), 8383
FROM Events
WHERE NOT EXISTS (SELECT event_ID FROM Events WHERE event_type = 'user_other_unit_moved' AND unit_ID = 8383)
LIMIT 1)),
(SELECT 10, 'user_other_unit_moved', now(), 8380
FROM Events
WHERE NOT EXISTS (SELECT event_ID FROM Events WHERE event_type = 'user_other_unit_moved' AND unit_ID = 8380)
LIMIT 1))
);
However, no matter what I try (inserting, removing parentheses etc.) I get either the generic "You have an error in your SQL syntax;" or "Operand should contain only 1 column".
I have also tried this alternative based on other StackOverflow posts:
INSERT IGNORE INTO Events (event_ID, user_ID, event_type, event_creation_datetime, unit_ID)
VALUES
(SELECT (SELECT event_ID FROM Events WHERE event_type = 'user_other_unit_moved' AND unit_ID = 8383), 10, 'user_other_unit_moved', now(), 8383),
(SELECT (SELECT event_ID FROM Events WHERE event_type = 'user_other_unit_moved' AND unit_ID = 8383), 10, 'user_other_unit_moved', now(), 8383);
But this fails with "Can't specify target table for update in FROM clause" even if I try to return results using temporary tables.
Is it just an error with my syntax, or am I trying to do something not possible with the way my query is laid out? And if it's just an error, how would I write the query so that it works as I've intended? Note that I do not want to use multi-queries - I want the query to work as one statement.
Thanks,
Arj

Don't use VALUES; use INSERT ... SELECT instead, without selecting FROM Events.
Then combine the rows with UNION ALL.
This code works for MySQL 5.6:
INSERT INTO Events (user_ID, event_type, event_creation_datetime, unit_ID)
SELECT *
FROM (
SELECT 10 user_ID, 'user_other_unit_moved' event_type,
now() event_creation_datetime, 8383 unit_ID
UNION ALL
SELECT 10, 'user_other_unit_moved', now(), 8380
) t
WHERE NOT EXISTS (
SELECT 1 FROM Events e
WHERE e.event_type = t.event_type AND e.unit_ID = t.unit_ID
);
This code works for MySQL 5.7+:
INSERT INTO Events (user_ID, event_type, event_creation_datetime, unit_ID)
SELECT * FROM (
SELECT 10, 'user_other_unit_moved', now(), 8383
WHERE NOT EXISTS (SELECT 1 FROM Events WHERE event_type = 'user_other_unit_moved' AND unit_ID = 8383)
UNION ALL
SELECT 10, 'user_other_unit_moved', now(), 8380
WHERE NOT EXISTS (SELECT 1 FROM Events WHERE event_type = 'user_other_unit_moved' AND unit_ID = 8380)
) t;
And this for MySQL 8.0+:
INSERT INTO Events (user_ID, event_type, event_creation_datetime, unit_ID)
SELECT 10, 'user_other_unit_moved', now(), 8383
WHERE NOT EXISTS (SELECT 1 FROM Events WHERE event_type = 'user_other_unit_moved' AND unit_ID = 8383)
UNION ALL
SELECT 10, 'user_other_unit_moved', now(), 8380
WHERE NOT EXISTS (SELECT 1 FROM Events WHERE event_type = 'user_other_unit_moved' AND unit_ID = 8380);

Although you can write this with just union all:
INSERT INTO Events (user_ID, event_type, event_creation_datetime, unit_ID)
SELECT x.user_id, x.event_type, now(), x.unit_id
FROM (SELECT 10 as user_id, 8383 as unit_id, 'user_other_unit_moved' as event_type
) x
WHERE NOT EXISTS (SELECT 1 FROM Events e2 WHERE e2.event_type = x.event_type AND e2.unit_ID = x.unit_ID)
UNION ALL
SELECT x.user_id, x.event_type, now(), x.unit_id
FROM (SELECT 10 as user_id, 8380 as unit_id, 'user_other_unit_moved' as event_type
) x
WHERE NOT EXISTS (SELECT 1 FROM Events e2 WHERE e2.event_type = x.event_type AND e2.unit_ID = x.unit_ID);
I suspect there is a better way. If a unit_ID can have only one row for each event type, then you should specify that using a unique constraint or index:
CREATE UNIQUE INDEX unq_events_unit_id_event_type ON Events (unit_ID, event_type);
It is better to have the database ensure integrity. In particular, your version is subject to race conditions, and to duplicates being inserted within the same statement.
Then you can use ON DUPLICATE KEY UPDATE to prevent duplicate rows:
INSERT INTO Events (user_ID, event_type, event_creation_datetime, unit_ID)
VALUES (10, 'user_other_unit_moved', now(), 8383),
(10, 'user_other_unit_moved', now(), 8380)
ON DUPLICATE KEY UPDATE unit_ID = VALUES(unit_ID);
The update actually does nothing (because unit_ID already has that value). But it does prevent an error and a duplicate row from being inserted.
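On MySQL 8.0.19 and later, the same no-op update can be written with a row alias instead of the VALUES() function, which is deprecated in newer 8.0 releases. A sketch under the assumption that the unique index on (unit_ID, event_type) exists; the alias new_row is made up for illustration.

```sql
-- Assumes a unique index on Events (unit_ID, event_type).
-- new_row is a hypothetical alias for the row being inserted.
INSERT INTO Events (user_ID, event_type, event_creation_datetime, unit_ID)
VALUES (10, 'user_other_unit_moved', NOW(), 8383),
       (10, 'user_other_unit_moved', NOW(), 8380) AS new_row
ON DUPLICATE KEY UPDATE unit_ID = new_row.unit_ID;
```

Rows whose (unit_ID, event_type) pair already exists are left unchanged, and the others are inserted, all in one statement.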

MySQL multiple COUNTs

I have a table like this:
Fiddle: http://sqlfiddle.com/#!2/44d9e/14
CREATE TABLE IF NOT EXISTS `mytable` (
`id` int(11) unsigned NOT NULL AUTO_INCREMENT,
`user_id` int(20) NOT NULL,
`money_earned` int(20) NOT NULL,
PRIMARY KEY (`id`)
) ;
INSERT INTO mytable (user_id,money_earned) VALUES ("111","10");
INSERT INTO mytable (user_id,money_earned) VALUES ("111","6");
INSERT INTO mytable (user_id,money_earned) VALUES ("111","40");
INSERT INTO mytable (user_id,money_earned) VALUES ("222","45");
INSERT INTO mytable (user_id,money_earned) VALUES ("222","1");
INSERT INTO mytable (user_id,money_earned) VALUES ("333","5");
INSERT INTO mytable (user_id,money_earned) VALUES ("333","19");
I need to know how many rows the table has, how many different users there are, and how many times each user has earned.
I need this result:
TOTAL_ROWS: 7
TOTAL_INDIVIDUAL_USERS: 3
USER_ID USER_TIMES
111 3
222 2
333 2

Is your problem that you want the total as well? If so, then you can get this using rollup:
SELECT coalesce(cast(user_id as char(20)), 'TOTAL USER_TIMES'),
COUNT(*) as times
FROM mytable
GROUP BY user_id with rollup;
You can get the user counts in a separate column with this trick:
SELECT coalesce(cast(user_id as char(20)), 'TOTAL USER_TIMES'),
COUNT(*) as times, count(distinct user_id) as UserCount
FROM mytable
GROUP BY user_id with rollup;
You realize that a SQL query just returns a table of values. You are asking for very specific formatting, which is typically done better at the application level. That said, you can get close to what you want with something like this:
select user, times
from ((SELECT 3 as ord, cast(user_id as char(20)) as user, COUNT(*) as times
FROM mytable
GROUP BY user_id
)
union all
(select 1, 'Total Rows', count(*)
from mytable
)
union all
(select 2, 'Total Users', count(distinct user_id)
from mytable
)
) t
order by ord;
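If a single result set with extra columns is acceptable, another sketch (assuming MySQL 8.0 or later, since it relies on window functions) repeats both totals next to every per-user count. The column aliases are made up to mirror the names in the question.

```sql
-- Window functions run over the grouped result, i.e. one row per user.
SELECT user_id,
       COUNT(*)              AS user_times,
       SUM(COUNT(*)) OVER () AS total_rows,             -- 7 for the sample data
       COUNT(*) OVER ()      AS total_individual_users  -- 3 for the sample data
FROM mytable
GROUP BY user_id
ORDER BY user_id;
```

The application can read the totals from any row and the per-user counts from all of them.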

I think this could be a typo: you are trying to SUM your COUNT(*) alias times. Simply replace it with money_earned:
SELECT user_id,
COUNT(*) AS 'times',
SUM(money_earned) AS 'sum_money'
FROM mytable GROUP BY user_id;

MySQL query, MAX() + GROUP BY

Daft SQL question. I have a table like so ('pid' is auto-increment primary col)
CREATE TABLE theTable (
`pid` INT UNSIGNED PRIMARY KEY AUTO_INCREMENT,
`timestamp` TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
`cost` INT UNSIGNED NOT NULL,
`rid` INT NOT NULL
) Engine=InnoDB;
Actual table data:
INSERT INTO theTable (`pid`, `timestamp`, `cost`, `rid`)
VALUES
(1, '2011-04-14 01:05:07', 1122, 1),
(2, '2011-04-14 00:05:07', 2233, 1),
(3, '2011-04-14 01:05:41', 4455, 2),
(4, '2011-04-14 01:01:11', 5566, 2),
(5, '2011-04-14 01:06:06', 345, 1),
(6, '2011-04-13 22:06:06', 543, 2),
(7, '2011-04-14 01:14:14', 5435, 3),
(8, '2011-04-14 01:10:13', 6767, 3)
;
I want to get the PID of the latest row for each rid (1 result per unique RID). For the sample data, I'd like:
pid | MAX(timestamp) | rid
-----------------------------------
5 | 2011-04-14 01:06:06 | 1
3 | 2011-04-14 01:05:41 | 2
7 | 2011-04-14 01:14:14 | 3
I've tried running the following query:
SELECT MAX(timestamp),rid,pid FROM theTable GROUP BY rid
and I get:
max(timestamp) ; rid; pid
----------------------------
2011-04-14 01:06:06; 1 ; 1
2011-04-14 01:05:41; 2 ; 3
2011-04-14 01:14:14; 3 ; 7
The PID returned is always the first occurrence of a PID for an RID (row / pid 1 is the first time rid 1 is used, row / pid 3 is the first time RID 2 is used, row / pid 7 is the first time rid 3 is used). Although the query returns the max timestamp for each rid, the pids are not the pids that belong to those timestamps in the original table. What query would give me the results I'm looking for?

(Tested in PostgreSQL 9.something)
Identify the rid and timestamp.
select rid, max(timestamp) as ts
from test
group by rid;
1 2011-04-14 18:46:00
2 2011-04-14 14:59:00
Join to it.
select test.pid, test.cost, test.timestamp, test.rid
from test
inner join
(select rid, max(timestamp) as ts
from test
group by rid) maxt
on (test.rid = maxt.rid and test.timestamp = maxt.ts)

select *
from (
select `pid`, `timestamp`, `cost`, `rid`
from theTable
order by `timestamp` desc
) as mynewtable
group by mynewtable.`rid`
order by mynewtable.`timestamp`
Hope I helped !

SELECT t.pid, t.cost, t.timestamp, t.rid
FROM test as t
JOIN (
SELECT rid, max(timestamp) AS maxtimestamp
FROM test GROUP BY rid
) AS tmax
ON t.rid = tmax.rid and t.timestamp = tmax.maxtimestamp

I created an index on rid and timestamp.
SELECT test.pid, test.cost, test.timestamp, test.rid
FROM theTable AS test
LEFT JOIN theTable maxt
ON maxt.rid = test.rid
AND maxt.timestamp > test.timestamp
WHERE maxt.rid IS NULL
Showing rows 0 - 2 (3 total, Query took 0.0104 sec)
This method will select all the desired values from theTable (test), left joining itself (maxt) on all timestamps higher than the one on test with the same rid. When the timestamp is already the highest one on test there are no matches on maxt - which is what we are looking for - values on maxt become NULL. Now we use the WHERE clause maxt.rid IS NULL or any other column on maxt.
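The index mentioned above could be created as follows; the index name is made up for illustration. Prefixing the query with EXPLAIN is one way to check whether MySQL actually uses it.

```sql
-- Hypothetical index name; covers the join columns rid and timestamp.
CREATE INDEX idx_thetable_rid_timestamp ON theTable (rid, `timestamp`);

-- Inspect the execution plan of the anti-join query.
EXPLAIN
SELECT test.pid, test.cost, test.timestamp, test.rid
FROM theTable AS test
LEFT JOIN theTable AS maxt
       ON maxt.rid = test.rid
      AND maxt.timestamp > test.timestamp
WHERE maxt.rid IS NULL;
```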

You could also have subqueries like that:
SELECT ( SELECT MIN(t2.pid)
FROM test t2
WHERE t2.rid = t.rid
AND t2.timestamp = maxtimestamp
) AS pid
, MAX(t.timestamp) AS maxtimestamp
, t.rid
FROM test t
GROUP BY t.rid
But this way, you'll need one more subquery if you want cost included in the shown columns, etc.
So, the GROUP BY and JOIN approach is a better solution.
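On MySQL 8.0 or later, a window function is another option that returns whole rows without a second join. This is a sketch, not something any of the answers here tested; the alias rn and the pid tie-breaker are assumptions added for illustration.

```sql
-- Number the rows of each rid from newest to oldest and keep the first one.
SELECT pid, `timestamp`, cost, rid
FROM (
    SELECT pid, `timestamp`, cost, rid,
           ROW_NUMBER() OVER (PARTITION BY rid
                              ORDER BY `timestamp` DESC, pid DESC) AS rn
    FROM theTable
) AS ranked
WHERE rn = 1
ORDER BY rid;
```

For the sample data this returns pid 5 for rid 1, pid 3 for rid 2 and pid 7 for rid 3, which matches the result asked for in the question.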

If you want to avoid a JOIN, and a higher pid always means a later timestamp (which is not the case in the sample data above), you can use:
SELECT pid, rid FROM theTable t1 WHERE t1.pid IN ( SELECT MAX(t2.pid) FROM theTable t2 GROUP BY t2.rid);

Try:
select pid,cost, timestamp, rid from theTable order by timestamp DESC limit 2;
