I want to delete data that is more than 2 years old. The field name is Date and its type is varchar(255).
delete from <table_name> where <Field> like '%2022';
The query runs for a very long time, but no data is deleted.
I have checked and tried the query. You can try:
DELETE From <datatable> WHERE <date> LIKE '%2022';
DELETE From post WHERE date LIKE '%2022'; #Example
Could you provide the database or a screenshot? I tried the query and found no issue: https://www.db-fiddle.com/f/syhtgVyEcSPcHRXBXHLtor/0
If the primary key (probably id) and the date column are correlated, meaning a bigger id corresponds to a later date (the date column here is of type varchar; thanks to P.Salmon for pointing this out), then
I think you can delete by primary key (normally the id column), for example:
select id from table where date > '2020' order by id asc limit 1;
-- assume this id = 123456789, then delete the rows that were created before that row
delete from table where id < 123456789;
If there is no correlation, I have some ideas:
Create a new column called created_at of type year/date/datetime/timestamp (date or year will probably do) to store the actual year, date, or datetime. Use it to replace the varchar date column, probably create an index on created_at, and delete using the new column.
If there is an index on date (varchar), the leading % in the LIKE clause prevents the server from using it, so the query is a full table scan for sure. Could you instead enumerate every date, like '01-01-2020', '01-02-2020', and delete rows one date at a time with a script? That way you at least get to use the index.
If there are too many rows, like 10 years' worth or even more, is it possible to migrate just the data from the last 2 years to a new table and drop the old table?
Write a script that fetches 10000 rows at a time, starting from the beginning of the primary key, deletes those that are over 2 years old, and then fetches the next 10000:
last_id = 0
select * from table where id > last_id order by id asc limit 10000;
last_id = [last id of the query]
delete from table where id in (xxx);
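The batching idea above could look roughly like the sketch below. It is written in Python with the standard sqlite3 module standing in for MySQL so that it runs on its own. The table name post, the column names id and date, the 'DD-MM-YYYY' text format, and the function name delete_old_rows are all assumptions for illustration, so adjust them to your schema.

```python
import sqlite3
from datetime import datetime


def delete_old_rows(conn, cutoff, batch_size=10000):
    """Walk the table in primary-key order and delete rows dated before cutoff."""
    last_id = 0
    deleted = 0
    while True:
        rows = conn.execute(
            "SELECT id, date FROM post WHERE id > ? ORDER BY id ASC LIMIT ?",
            (last_id, batch_size),
        ).fetchall()
        if not rows:
            break
        last_id = rows[-1][0]  # resume after the last id seen in this batch
        # The date column is text, so parse it in the script before comparing.
        # Assumption: values look like '31-12-2020' (day-month-year).
        old_ids = [
            (row_id,)
            for row_id, text in rows
            if datetime.strptime(text, "%d-%m-%Y") < cutoff
        ]
        conn.executemany("DELETE FROM post WHERE id = ?", old_ids)
        conn.commit()  # one short transaction per batch
        deleted += len(old_ids)
    return deleted
```

Each batch touches at most batch_size rows and looks them up by primary key, so no single statement has to scan or lock the whole table.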
Related
Is it possible to delete a row and return a value of the deleted row?
Example:
DELETE FROM table where time <= -1 week
SELECT all id that were deleted
If you want to keep track of the operations in the database, you could consider using journal tables. There's a question here on SO about this.
A journal table mirrors the main table and is usually populated by a trigger that records the operation performed (update, delete). It stores the "old values" (the current values you can always get from the main table).
If implemented so, you could then SELECT from the journal table and have exactly what you needed.
Trying to give you an example:
Table USER
CREATE TABLE USER (
id INT,
name VARCHAR(255) -- the length is illustrative
)
Table USER_JN
CREATE TABLE USER_JN (
id INT,
name VARCHAR(255),
operation VARCHAR(255)
)
Then, for every operation you can populate the USER_JN and have a history of all changes (you should not have constraints in it).
If you delete, your operation column would have the delete value and you could use your select to check that.
It's not exactly "selecting the deleted row", but a way to make it possible.
Hope it's somehow useful.
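To make the trigger part concrete, here is a minimal sketch in Python using the standard sqlite3 module as a stand-in for MySQL. The trigger name user_after_delete and the helper names make_journaled_db and deleted_ids are made up for this example, and MySQL's trigger syntax differs in small details, so treat it as an outline rather than a drop-in solution.

```python
import sqlite3


def make_journaled_db():
    """Create USER, USER_JN and a trigger that journals every deleted row."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE user (id INTEGER PRIMARY KEY, name VARCHAR(255));
        CREATE TABLE user_jn (id INTEGER, name VARCHAR(255), operation VARCHAR(16));

        -- Copy the old values into the journal whenever a row is deleted.
        CREATE TRIGGER user_after_delete AFTER DELETE ON user
        FOR EACH ROW
        BEGIN
            INSERT INTO user_jn (id, name, operation)
            VALUES (OLD.id, OLD.name, 'delete');
        END;
    """)
    return conn


def deleted_ids(conn):
    """Return the ids of all rows the journal recorded as deleted."""
    rows = conn.execute(
        "SELECT id FROM user_jn WHERE operation = 'delete' ORDER BY id"
    )
    return [r[0] for r in rows]
```

After any DELETE on user, deleted_ids(conn) lists the ids that were removed, which is the "select the deleted rows" step from the question.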
SELECT id FROM table WHERE time <= -1 week
and then simply
DELETE FROM table WHERE time <= -1 week
I would not search a non-indexed column twice. You should use a variable like:
SELECT id INTO @tID FROM table WHERE time <= -1 week;
DELETE FROM table WHERE id = @tID
You may then use the variable @tID as you wish. Note that SELECT ... INTO a variable only works when the query returns a single row.
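When more than one row matches, the same select-then-delete idea can be driven from a script. A rough sketch in Python, with sqlite3 standing in for MySQL: the table name events, the ISO-formatted text time column, and the function name delete_and_return_ids are assumptions for illustration only.

```python
import sqlite3


def delete_and_return_ids(conn, cutoff):
    """Delete rows with time <= cutoff and return the ids that were removed."""
    with conn:  # commits on success, rolls back if an exception is raised
        ids = [
            r[0]
            for r in conn.execute(
                "SELECT id FROM events WHERE time <= ? ORDER BY id", (cutoff,)
            )
        ]
        # Delete by primary key so the time column is only searched once.
        conn.executemany("DELETE FROM events WHERE id = ?", [(i,) for i in ids])
    return ids
```

On a busy MySQL server you would probably want the SELECT to lock the rows it reads (for example with FOR UPDATE inside the transaction) so that the two statements see the same set of rows.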
What is the best approach for my scenario?
I have a table with nearly 20,000,000 records, which basically stores what users have done on the site.
id -> primary int 11 auto increment
user_id -> index int 11 not null
create_date -> ( no index yet ) date-time not null
it has other columns but seems irrelevant to name them here
I know I must put an index on create_date, but should it be a single-column index or a two-column index, and which column should come first in the two-column index (given the large number of records)?
By the way, the query that I'm now using looks like this:
select max(id) -- in here I'm selecting actions that users have done, after this date, since date is today
from table t
where
t.create_date >= '2014-12-29 00:00:00'
group by t.user_id
Could you edit your question to include the EXPLAIN output of your SELECT? Meanwhile, you can try this:
Partition the table on your date field create_date.
Build your index with the most restrictive criterion first. I think that in your case create_date + user_id will work better:
CREATE INDEX index_name
ON table_name ( create_date , user_id );
I started with this question: is my large mysql table destined for failure?
The answer that I found from that question was satisfactory. I have a table with 22 million rows that I would like to grow to about 100 million. At this time, the table minute_data structure is like this:
A problem that I am having is as follows. I need to execute this query:
select datediff(date,now()) from minute_data where symbol = "CSCO" order by date desc limit 1;
Which is very fast ( < 1 sec ) when the table contains the value "CSCO". The problem is, sometimes I will query for a symbol that is not in the table already. When I execute a query like this for, say, symbol = "ABCD":
select datediff(date,now()) from minute_data where symbol = "ABCD" order by date desc limit 1;
Then the query takes a LONG TIME... like forever ( 180 seconds ).
One way I can get around this is by making sure that the table contains the symbol I am looking for before I execute the query. The fastest way I have found to do this is the following query, which I only use to check whether the table minute_data contains the symbol or not. Basically I just need it to return a boolean value so I know if the symbol is in the table:
select count(1) from minute_data where symbol = "CSCO";
This query takes over 30 seconds to return 1 value, which is way too long for my liking, since the query above, which actually returns a datediff calculation, takes less than 1 second.
The symbol column is part of the primary key, so I thought it should be possible to figure out very quickly whether a value exists there.
What am I doing wrong? Is there a fast way to do what I want to do? Should I change the structure of the data to optimize performance?
Thank You!
UPDATE
I think I found a good solution to this problem. From the answer below by LastCoder, I did the following:
1) Created a new table called minute_data_2 with the exact same definition as minute_data.
2)
ALTER TABLE minute_data_2 ADD PRIMARY KEY (symbol, date);
3)
INSERT IGNORE INTO minute_data_2 SELECT * FROM minute_data;
4)
DROP TABLE minute_data;
5) Rename minute_data_2 to minute_data
Now I am seeing blindingly fast speed: the same query that I described above as taking more than 180 seconds now completes in .001 seconds. Amazing.
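For anyone who wants to try the rebuild on a small copy first, here is a rough sketch of the same steps in Python, with the standard sqlite3 module standing in for MySQL. The price column and the helper names rebuild_with_primary_key and has_symbol are invented for the example, since the real table structure is not shown here. It also includes a cheap existence check that returns a boolean.

```python
import sqlite3


def rebuild_with_primary_key(conn):
    """Copy minute_data into a table keyed on (symbol, date), then swap names."""
    conn.executescript("""
        CREATE TABLE minute_data_2 (
            symbol TEXT NOT NULL,
            date   TEXT NOT NULL,
            price  REAL,               -- hypothetical payload column
            PRIMARY KEY (symbol, date)
        );
        -- SQLite spells MySQL's INSERT IGNORE as INSERT OR IGNORE.
        INSERT OR IGNORE INTO minute_data_2
            SELECT symbol, date, price FROM minute_data;
        DROP TABLE minute_data;
        ALTER TABLE minute_data_2 RENAME TO minute_data;
    """)


def has_symbol(conn, symbol):
    """Return True if at least one row exists for the symbol."""
    row = conn.execute(
        "SELECT EXISTS(SELECT 1 FROM minute_data WHERE symbol = ?)", (symbol,)
    ).fetchone()
    return bool(row[0])
```

With symbol as the leading column of the primary key, the EXISTS lookup can stop at the first matching key instead of counting every row for that symbol.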
Did you try using EXISTS (...)?
select datediff(date,now()) from minute_data
where EXISTS(SELECT * FROM minute_data WHERE symbol = "CSCO")
AND symbol = "CSCO" order by date desc limit 1;
Even though symbol is a primary key, it seems you have the timestamp as a PK as well which makes me think you are using a COMPOSITE pk which means the ordering is by timestamp then symbol. You may want to put separate index on symbol, if all you have is a composite one where timestamp is first.
I think it is better to make a table named symbols and add a reference to that table in your minute_data table:
symbols:
symbol_id (INT, Primary Key, Auto Increment)
symbol_text (VARCHAR)
minute_data:
key_col (BIGINT, Primary Key, Auto Increment)
symbol_id (INT, Index)
other_field
Use InnoDB as the table type so that you can add references (foreign keys).
Try to avoid duplicate entries in your tables.
If I added a record yesterday and one today, how do I order the results of 'select * from table_name' so that the entry added today comes first, followed by the older ones?
You'll need to timestamp and order by it, or order by a field with auto increment or similar.
If your primary key field is an auto-incremented integer, then you can do the following:
SELECT * FROM table_name ORDER BY pk_column DESC
If you're not using an auto-incremented integer for your primary key, then you'll need to do as Andre suggested and timestamp your rows.
Unless records are deleted, they are stored right in the order they were inserted.
If records are deleted, new records are not necessarily stored in insertion order. You then need to explicitly order by an "auto-incremented" ID field or a timestamp or something similar (if your table structure provides any of these).
I have a table with many rows, but they are out of order. I'm using the field "id" as the primary key. I also have a "date" field, which is a datetime field.
How could I reindex the table so that the entries are id'd in chronological order according to the date field?
How about something like a simple query using a variable:
SET @ROW = 0;
UPDATE `tbl_example` SET `id` = @ROW := @ROW+1 ORDER BY `fld_date` ASC;
This will number your rows 1, 2, 3, 4, 5, etc. in order of your date.
The way I would do it is to create a new table with an auto-increment index and select all of your old table into it, ordering by date. You can then remove your old table.
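A small sketch of that copy-and-swap approach, in Python with the standard sqlite3 module standing in for MySQL. The table name entries, the note column, and the function name renumber_by_date are assumptions for the example. The new ids follow the dates because the rows are inserted in date order.

```python
import sqlite3


def renumber_by_date(conn):
    """Rebuild the table so that ids are assigned in chronological order."""
    conn.executescript("""
        CREATE TABLE entries_sorted (
            id   INTEGER PRIMARY KEY AUTOINCREMENT,
            date TEXT NOT NULL,
            note TEXT                  -- hypothetical payload column
        );
        -- Leave id out of the column list so it is generated on insert.
        INSERT INTO entries_sorted (date, note)
            SELECT date, note FROM entries ORDER BY date ASC;
        DROP TABLE entries;
        ALTER TABLE entries_sorted RENAME TO entries;
    """)
```

Keep in mind that any other table referring to the old ids would need to be updated as well, which this sketch does not attempt.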
Why do you want the sequence of IDs to correlate with the dates? It sounds like you want to do ORDER BY id and have the rows come back in date order. If you want rows in date order, just use ORDER BY date instead.
Values in an autoincrement ID column should be treated as arbitrary. Relying on your IDs being in date order is a bad idea.
The following SQL snippet should do what you want.
ALTER TABLE test_table ADD COLUMN id2 int unsigned not null;
SET @a:=0;
UPDATE test_table SET id2=@a:=@a+1 ORDER BY `date`;
ALTER TABLE test_table DROP id;
ALTER TABLE test_table CHANGE id2 id int UNSIGNED NOT NULL AUTO_INCREMENT,
ADD PRIMARY KEY (id);
Keep in mind that you can never guarantee the order of an auto-incremented column once you start inserting and removing data, so you shouldn't be relying on any order except that which you specify using ORDER BY in your queries. This is an expensive operation that you are doing, as it requires indexes to be completely re-created, so I wouldn't suggest doing it often.
You can use ALTER TABLE t ORDER BY col;
The allowed syntax of ORDER BY is as in SELECT statements.
I had to do something similar. The best way to do it was the following (you can run it as one SQL query if you want, but bear in mind that this is a slow and very resource-intensive operation):
BE SURE TO MAKE A BACKUP OF YOUR TABLE, INCLUDING STRUCTURE AND DATA BEFORE STARTING THIS QUERY!
ALTER TABLE your_table ADD COLUMN temp_id INT UNSIGNED NOT NULL;
SET @a:=0;
UPDATE your_table SET temp_id=@a:=@a+1 ORDER BY `date` ASC;
ALTER TABLE your_table DROP id;
ALTER TABLE your_table CHANGE temp_id id INT UNSIGNED NOT NULL AUTO_INCREMENT, ADD PRIMARY KEY (id);
ALTER TABLE your_table CHANGE COLUMN id id INT UNSIGNED NOT NULL AUTO_INCREMENT FIRST;
Just don't forget to replace "your_table" with the name of your table, and to adjust the ORDER BY columns.
Here is what you are doing, step by step:
First you add a new column named "temp_id" (make sure it's not a name you're already using);
Next you set a temporary variable to 0 (or to whatever value you want your IDs to start after);
Then you update your table row by row, in the order given by ORDER BY, setting your new column "temp_id" to the value of the variable and incrementing the variable by 1 (you can do something funky here; for example, if you want your IDs to always be even, you can use @a+2);
In the next step you drop (remove) your old ID column;
Then you rename your temp_id column back to ID and define it as a positive integer with auto increment, which is the primary key of your table.
Because ID is now the newly added temp_id column, it's located at the end of your table structure. To move it back to the first position, you run the last query. It repeats the full column definition, because CHANGE COLUMN drops any attribute that is not restated.
If you are using something like phpMyAdmin, this could be achieved by:
going to the table (left-side list of databases and tables), then
selecting 'SQL' from the options in the upper bar. Follow the advice by @Ryun, then go to 'Operations' (from the upper bar),
look for 'TABLE OPTIONS', leave everything except 'AUTO_INCREMENT' unchanged,
set the 'AUTO_INCREMENT' value to 1 and press Go at the bottom of the form.
What will this do, in all?
It will set the id column of each row to a value from 1 to {count}.
Then it will reset the auto-increment counter of the table so that your next inserted row gets the number of rows + 1 (and not the old counter value + 1).
@Wyzard suggested simply ordering the rows by date when you retrieve them from the table (and not re-indexing). Indeed, the primary key should be arbitrary (except with respect to any foreign keys and perhaps the consuming platform, but that is another matter).