Good evening. I have a table with a timestamp column, and a tool that inserts N records into it. To avoid duplicate rows I use INSERT IGNORE, but because of the timestamp column every new row is different, so nothing is ever ignored. I can't run a search first and check the result set in my code, because I add all the queries to a statement batch; that's why I'm using INSERT IGNORE.
So the question is: is it possible to leave the timestamp column out of the comparison between a new row and the rows that were already inserted the first time the tool was run?
Regards!
Create a unique key on the fields you don't want duplicated, and exclude the timestamp field from that key.
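A minimal sketch of that idea, assuming a hypothetical table readings in which sensor and reading are the fields that must not repeat (the table, column, and key names are illustrative and not taken from the question):

```sql
-- Hypothetical table: only (sensor, reading) decides whether a row is a duplicate.
CREATE TABLE readings (
  id         INT UNSIGNED NOT NULL AUTO_INCREMENT,
  sensor     VARCHAR(64)  NOT NULL,
  reading    INT          NOT NULL,
  created_at TIMESTAMP    NOT NULL DEFAULT CURRENT_TIMESTAMP,
  PRIMARY KEY (id),
  -- The timestamp column is deliberately left out of the unique key.
  UNIQUE KEY uq_sensor_reading (sensor, reading)
) ENGINE=InnoDB;

-- The first statement inserts a row.
INSERT IGNORE INTO readings (sensor, reading) VALUES ('a', 1);

-- The second statement collides on uq_sensor_reading and is skipped
-- with a warning, even though its timestamp would have been different.
INSERT IGNORE INTO readings (sensor, reading) VALUES ('a', 1);
```

Because the uniqueness check lives in the table definition, the statements can stay in a batch without any prior SELECT.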
Related
If I have a table that has these rows:
animal (primary)
-------
man
dog
cow
and I want to delete all the rows and insert my new rows (that may contain some of the same data), such as:
animal (primary)
-------
dog
chicken
wolf
I could simply do something like:
delete from animal;
and then insert the new rows.
But when I do that, for a split second, 'dog' won't be accessible through the SELECT statement.
I could simply insert ignore the new data and then delete the rest, one by one, but that doesn't feel like the right solution when I have a lot of rows.
Is there a way to insert the new data and then have MySQL automatically delete the rest afterward?
I have a program that selects data from this table every 5 minutes (and the code I'm writing now will be updating this table once every 30 minutes), so I would like to be as accurate as possible at all times, and I would rather have too many rows for a split second than too few rows for the same time.
Note: I know that this may seem like it is unnecessary but I just feel like if I leave too many of those unlikely possibilities in different places, there will be times where things go wrong.
You may want to use TRUNCATE instead of DELETE here. TRUNCATE is faster than DELETE and resets the table back to its empty state (meaning AUTO_INCREMENT counters are reset to their starting values as well). Note that in MySQL, TRUNCATE causes an implicit commit and cannot be rolled back.
Not sure why you're having problems with selecting a value that was deleted and re-added, maybe I'm missing some context. But if you're wiping the table clean, you might want to use truncate instead.
You could add a timestamp column and change the SELECT statement so that it checks for the latest value.
If this is for school, I would argue that you need a timestamp and that is what your professor is looking for. You shouldn't need to truncate a table to get the latest values, you need to adjust the thinking behind the table and how you are querying data. Hope this helps!
Check out these:
How to make a mysql table with date and time columns?
Why not update values instead?
My other questions would be:
How are you loading this into the table?
What does that code look like?
Can you change the way you Select from the table?
What values are being "updated", and how do they change such that you need to truncate the entire table?
If you don't want to add a new column, there is another method.
1. First, update the table in a way that marks all existing rows for later deletion. For example:
UPDATE `table_name` SET `animal`=CONCAT('MUST_BE_DELETED_', `animal`);
2. Second, insert the new rows.
3. Finally, remove all the marked rows:
DELETE FROM `table_name` WHERE `animal` LIKE 'MUST_BE_DELETED_%';
You could implement this with an updated_on column of type timestamp. You could even rely on default values for it, but let's go with an example without them.
I presume the table would look something like this:
CREATE TABLE `my_table` (
  `animal` varchar(255) NOT NULL,
  `updated_on` timestamp,
  PRIMARY KEY (`animal`)
) ENGINE=InnoDB;
This is just a dummy table example. What's important are the two queries later on.
You would simply perform a query to insert the data, such as:
insert into my_table(animal)
select animal from my_view where animal = 'dogs'
on duplicate key update
updated_on = current_timestamp;
Note that my_view stands for the table, view, or query that supplies the values to insert into your table. Also note that the animal column needs a primary or unique key constraint for ON DUPLICATE KEY UPDATE to work in this example.
Then, you proceed with the following query, to "purge" (delete) the old values:
delete from my_table
where updated_on < (
    select *
    from (
        select max(updated_on) from my_table
    ) as max_date
);
Note that you could create a separate view to obtain this max_date value. It is the timestamp of the rows inserted or updated by the previous query, so you can use it in a WHERE clause to delete the old records that you no longer want or need.
IMPORTANT NOTE:
Since you are running multiple queries that are supposed to act as a single operation, I'd advise you to run them within a single transaction and to roll back properly on failure (e.g. in case of MySQL exceptions). You might wish to use a stored procedure for that.
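One way the two statements might be wrapped in a transaction is sketched below. It assumes my_table uses InnoDB, and it departs from the example in one respect: the batch time is captured once in a user variable (@batch_ts, a name chosen here for illustration) so that inserted and refreshed rows share the same updated_on value. Error handling and ROLLBACK are left to the client or to a stored procedure.

```sql
-- Capture one timestamp for the whole batch (assumption: one-second
-- resolution is enough, since the table is refreshed every 30 minutes).
SET @batch_ts = NOW();

START TRANSACTION;

-- Insert the new rows; rows that already exist only get a fresh timestamp.
-- VALUES() inside ON DUPLICATE KEY UPDATE is deprecated in recent
-- MySQL 8.0 releases but is still accepted.
INSERT INTO my_table (animal, updated_on)
VALUES ('dog', @batch_ts), ('chicken', @batch_ts), ('wolf', @batch_ts)
ON DUPLICATE KEY UPDATE updated_on = VALUES(updated_on);

-- Remove everything that was not touched by this batch.
DELETE FROM my_table
WHERE updated_on < @batch_ts OR updated_on IS NULL;

COMMIT;
```

With the default InnoDB isolation level, other sessions do not see the intermediate state, so 'dog' stays visible throughout and the stale rows disappear only at COMMIT.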
This seems like it should be simple, but I couldn't figure out a way to do it. Let's say I have a table with 5,000 rows, each with an ID (primary key) of 1–5000. I am blindly inserting a new value with an existing ID, and it could be something like 2677. What I want to happen is that if the ID already exists, it will use the auto_increment value, in this case 5001. That or the maximum existing value + 1.
Most importantly, I can't use PHP (or anything else other than SQL) to do this, because the output is a query that needs to be directly importable without errors.
I have looked at two similar questions on SO:
Can you use aggregate values within ON DUPLICATE KEY
– the problem here is that they're selecting from an existing table which I can't do.
on duplicate key update with a condition? – the problem here is that I have no information on the table I'm importing to (except the basic structure), and don't know what the maximum value is.
INSERT INTO table_name (column1,column2) VALUES (1,2) ON DUPLICATE KEY UPDATE id=VALUES(id);
Obviously this requires an id column with AUTO_INCREMENT.
Moreover, if you later need to retrieve the id just as if the row had been newly inserted, use:
ON DUPLICATE KEY UPDATE id=LAST_INSERT_ID(VALUES(id));
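For comparison, the form documented in the MySQL manual passes the column itself to LAST_INSERT_ID(). A minimal sketch, assuming a hypothetical table items with an AUTO_INCREMENT id and a unique key on name (none of these names come from the question):

```sql
-- Hypothetical table used only for illustration.
CREATE TABLE items (
  id   INT UNSIGNED NOT NULL AUTO_INCREMENT,
  name VARCHAR(64)  NOT NULL,
  qty  INT          NOT NULL,
  PRIMARY KEY (id),
  UNIQUE KEY uq_name (name)
) ENGINE=InnoDB;

INSERT INTO items (name, qty) VALUES ('bolt', 10);

-- The name collides, so the existing row is updated instead.
-- LAST_INSERT_ID(id) makes the existing row's id available afterwards.
INSERT INTO items (name, qty) VALUES ('bolt', 25)
ON DUPLICATE KEY UPDATE id = LAST_INSERT_ID(id), qty = VALUES(qty);

-- Returns the id of the existing 'bolt' row rather than a new one.
SELECT LAST_INSERT_ID();
```

This only reports which row was affected; it does not move the colliding row to a new id.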
I am using MySQL as the database for my application.
I want to clarify one thing.
Imagine there is a table called test.
Its columns are col1, col2, col3, col4.
Each of these columns has its own index, so there are 4 indexes.
I am inserting a record with values for col1 and col2 only.
When a column has an index, an insert operation has a cost.
My question is:
When I insert values only into col1 and col2, am I affected by the indexes on col3 and col4?
Are all the indexes updated on every insert, or only the indexes on the columns I insert into?
Let's get a basic fact straight: in an RDBMS there is no such thing as inserting a record for only a selected number of fields in a table. If you insert a record, then all fields within that table will have a value for that record. That value may be a null value, but it is there.
Not to mention another fact: columns may have non-null default values, so an insert that does not specify a value for them will still result in a non-null value being stored.
MySQL indexes even null values, so if you have a separate index for each column, MySQL has to update all of the indexes when a new record is inserted into the table, regardless of how many fields are explicitly assigned a value in the insert.
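The point can be checked with a small sketch. The table follows the question's description; the index names and the DEFAULT 7 on col4 are assumptions added here to show both the NULL case and the default-value case:

```sql
CREATE TABLE test (
  col1 INT,
  col2 INT,
  col3 INT,             -- no default: unspecified values become NULL
  col4 INT DEFAULT 7,   -- assumed default, for illustration only
  INDEX idx_col1 (col1),
  INDEX idx_col2 (col2),
  INDEX idx_col3 (col3),
  INDEX idx_col4 (col4)
) ENGINE=InnoDB;

-- Only col1 and col2 are named, but the stored row has all four fields.
INSERT INTO test (col1, col2) VALUES (1, 2);

-- The row reads back as (1, 2, NULL, 7).
SELECT col1, col2, col3, col4 FROM test;

-- The NULL in col3 can be found through idx_col3, which means the
-- insert had to add an entry to that index as well.
SELECT COUNT(*) FROM test FORCE INDEX (idx_col3) WHERE col3 IS NULL;
```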
I am doing the following SQL tutorial: http://sql.learncodethehardway.org/book/ex11.html
and in this exercise the author says in the second paragraph:
In this situation, I want to replace my record with another guy but
keep the unique id. Problem is I'd have to either do a DELETE/INSERT
in a transaction to make it atomic, or I'd need to do a full UPDATE.
Could anyone explain to me what the problem is with doing an UPDATE, and when we might choose REPLACE instead of UPDATE?
The UPDATE code:
UPDATE person SET first_name = "Frank", last_name = "Smith", age = 100
WHERE id = 0;
Here is the REPLACE code:
REPLACE INTO person (id, first_name, last_name, age)
VALUES (0, 'Frank', 'Smith', 100);
EDIT: I guess another question I have is why would you ever do a DELETE/INSERT instead of just an UPDATE as is discussed in the quoted section?
According to the documentation, the difference is:
REPLACE works exactly like INSERT, except that if an old row in the table has the same value as a new row for a PRIMARY KEY or a UNIQUE index, the old row is deleted before the new row is inserted.
So what it does:
Try to match the row using one of the available indexes;
If the row doesn't exist already: add a new one;
If the row exists already: delete the existing row and add a new one afterwards.
When might using this become useful over separate insert and update statements?
You can safely call this, and you don't have to worry about existing rows (one statement vs. two);
If you want related data to be removed when inserting / updating, you can use REPLACE: because the old row is deleted, rows that depend on it are deleted too (for example through ON DELETE CASCADE foreign keys);
When triggers need to fire, and you expect an insert (bad reason, okay).
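The second point can be sketched as follows, assuming InnoDB and a hypothetical child table pet that references person with ON DELETE CASCADE (the pet table and the simplified person definition are made up for this illustration):

```sql
CREATE TABLE person (
  id         INT NOT NULL PRIMARY KEY,
  first_name VARCHAR(50)
) ENGINE=InnoDB;

-- Hypothetical child table whose rows depend on a person row.
CREATE TABLE pet (
  id       INT NOT NULL PRIMARY KEY,
  owner_id INT NOT NULL,
  FOREIGN KEY (owner_id) REFERENCES person (id) ON DELETE CASCADE
) ENGINE=InnoDB;

INSERT INTO person VALUES (0, 'Zed');
INSERT INTO pet VALUES (1, 0);

-- REPLACE deletes the old person row before inserting the new one,
-- so the cascade removes the dependent pet row as a side effect.
REPLACE INTO person (id, first_name) VALUES (0, 'Frank');

-- An UPDATE of first_name would have left the pet row in place.
SELECT COUNT(*) FROM pet;
```

Whether that side effect is wanted is exactly what separates the REPLACE case from the UPDATE case.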
First, REPLACE isn't supported by all database engines.
Second, REPLACE inserts or updates a record based on the primary key, while with UPDATE you can specify more elaborate conditions:
UPDATE person SET first_name = CONCAT('old ', first_name) WHERE age > 50;
Also, UPDATE won't create records.
UPDATE will have no effect if the row does not exist.
Whereas INSERT or REPLACE will insert the row if it doesn't exist, or replace the values if it does.
UPDATE changes the values of existing records in a table based on a condition, so you can change one or many records in a single query.
INSERT or REPLACE inserts a new record if it is not present in the table, and otherwise replaces it. The replacement only happens if you provide the primary key (or unique key) value in the query. If you leave out the primary key value, a new record will be created in the table.
Case example:
Update: You have a calculation of wages to be done based on a formula using the column values. In this case you would always use an UPDATE query, since a single query can update multiple records.
Insert or Replace: Already mentioned in the link you shared.
How the REPLACE INTO statement works:
AS INSERT:
REPLACE INTO table_name (column1name, column2name, ...)
VALUES (value1, value2, ...);
AS UPDATE:
REPLACE INTO table_name SET column1name = value, column2name = value, ... ;
The REPLACE statement checks whether the intended data record's unique key value already exists in the table before inserting it as a new record or updating it.
The REPLACE INTO statement attempts to insert a new record or modify an existing record. In both cases, it checks whether the unique key of the proposed record already exists in the table. If it does not, the REPLACE statement inserts the record just like an INSERT INTO statement.
If the key value already exists in the table (in other words, a duplicate key), the REPLACE statement deletes the existing record and replaces it with the new one. This happens regardless of whether you use the first or the second REPLACE syntax.
Whenever REPLACE INTO is used to insert or modify data, it first determines whether the new record already exists in the table by checking whether its PRIMARY or UNIQUE KEY matches one of the existing records.
If there is no matching key, REPLACE works like a normal INSERT statement. Otherwise, it deletes the existing record and replaces it with the new one, which can be seen as a sort of update of the existing record. Be careful here, however: if you do not specify a value for a column, REPLACE uses that column's default value if one has been set, and NULL otherwise.
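The caveat about unspecified columns is easy to overlook, so here is a minimal sketch. It assumes a person table shaped like the one in the question, with a DEFAULT 0 on age added for illustration:

```sql
-- Assumed definition; the DEFAULT on age is hypothetical.
CREATE TABLE person (
  id         INT NOT NULL PRIMARY KEY,
  first_name VARCHAR(50),
  last_name  VARCHAR(50),
  age        INT DEFAULT 0
) ENGINE=InnoDB;

INSERT INTO person (id, first_name, last_name, age)
VALUES (0, 'Zed', 'Shaw', 37);

-- Only id and first_name are supplied. The old row is deleted first,
-- so last_name and age are NOT carried over from it.
REPLACE INTO person SET id = 0, first_name = 'Frank';

-- The row now reads (0, 'Frank', NULL, 0):
-- last_name falls back to NULL and age to its default.
SELECT id, first_name, last_name, age FROM person;
```

An UPDATE that sets only first_name would have kept the old last_name and age, which is the practical difference between the two statements.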
After inserting new data into a table, I need to select some of the new data straight after, so I can use it for inserting into another table.
The only way I can think of doing this is using the 'datetime' field I have in my row, but how would I retrieve the latest date/time inserted?
INSERT statement with NOW() value for datetime
society_select = SELECT socID, creator, datetime FROM societies.society WHERE datetime='[..datetime is the newest...]';
Hope that makes sense. Thank you
There are a number of ways to do this.
Why not make use of a trigger for this?
When a trigger fires after an insert, it has access to the id of each inserted row. It can then select from, or insert new values into, the relevant table.
MYSQL has loads of resources on using triggers.
http://dev.mysql.com/doc/refman/5.0/en/triggers.html
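A minimal sketch of the trigger approach, assuming the societies.society table from the question and a hypothetical target table societies.society_audit (the target table, its column types, and the trigger name are invented for this illustration):

```sql
-- Hypothetical table that should receive a row for every new society.
CREATE TABLE societies.society_audit (
  socID      INT NOT NULL,
  creator    VARCHAR(64),
  created_at DATETIME
) ENGINE=InnoDB;

-- NEW refers to the row that has just been inserted into society,
-- so no separate SELECT for the "latest" row is needed.
CREATE TRIGGER societies.society_after_insert
AFTER INSERT ON societies.society
FOR EACH ROW
  INSERT INTO societies.society_audit (socID, creator, created_at)
  VALUES (NEW.socID, NEW.creator, NEW.`datetime`);
```

The trigger body is a single statement, so no DELIMITER change is required when running it from the mysql client.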
Or you can get the number of rows affected then use this to get the required result set in a select statement.
Get the last inserted ID?
If you are inserting one row into the database at a time then you would be able to get the last inserted id from MYSQL. This will be the Primary Key value of the record you last inserted into the database.
You would basically do something like this in mysql:
SET @inserted_id = LAST_INSERT_ID();
Or, from client code, you can use the API's equivalent function, such as mysqli_insert_id($link) in PHP or, in the C API:
mysql_insert_id(&mysql);
http://dev.mysql.com/doc/refman/5.0/en/getting-unique-id.html
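Put together, the LAST_INSERT_ID() approach might look like the sketch below. It assumes that socID is an AUTO_INCREMENT primary key and uses a hypothetical second table societies.membership; both are assumptions, since the question does not show the table definitions.

```sql
-- Insert the new society row.
INSERT INTO societies.society (creator, `datetime`)
VALUES ('alice', NOW());

-- LAST_INSERT_ID() is tracked per connection, so inserts made by
-- other clients at the same moment do not affect this value.
SET @new_soc_id = LAST_INSERT_ID();

-- Read back exactly the row that was just inserted.
SELECT socID, creator, `datetime`
FROM societies.society
WHERE socID = @new_soc_id;

-- Use the id for the follow-up insert (hypothetical table and columns).
INSERT INTO societies.membership (socID, member)
VALUES (@new_soc_id, 'alice');
```

Selecting by id avoids relying on the datetime value, which two rows inserted within the same second would share.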
Sort the results by their datetime in descending order, and select the first of them.
society_select = SELECT socID, creator, datetime FROM societies.society ORDER BY datetime DESC LIMIT 1;
You can do this with an auto-increment field. After inserting the data, you can retrieve the last inserted id from the table and use that id to get the latest record.
A trigger, as suggested, is an option. If you don't want to use that for some reason, you can:
Add an integer primary key with auto_increment as the ID (e.g. INT(11)) and sort on it in descending order
Sort in descending order on a timestamp column (with an index on it, of course)
Use a trigger after inserting the data. This is for sure the cleaner way.
Another option is to use a method like mysql_insert_id, assuming that you use PHP. There are of course equivalent methods in other languages as well.
Sorting is not an option (unless it is wrapped carefully in a transaction): if you have multiple concurrent writes and reads on the table, this can end up pretty ugly.