How to check if a value in the MySQL DB exists? - mysql

I have a mySQL database and I have a Perl script that connects to it and performs some data manipulation.
One of the tables in the DB looks like this:
CREATE TABLE `mydb`.`companies` (
`company_id` INT NOT NULL AUTO_INCREMENT,
`company_name` VARCHAR(100) NULL ,
PRIMARY KEY (`company_id`) );
I want to insert some data in this table. The problem is that some companies in the data can be repeated.
The question is: How do I check whether the "company_name" already exists? If it exists, I need to retrieve "company_id" and use it to insert the data into another table. If it does not, then this info should be entered in this table, but I already have this code.
Here is an additional requirement: the script can be run multiple times simultaneously, so I can't just read the data into a hash and check whether it already exists.
I can throw an additional "SELECT" query, but it will create an additional hit on the DB.
I tried to look for an answer, but every question here and every thread on the web talks about checking the primary key. I don't need this. The DB structure is already set, but I can make changes if need be. This table will be used as an additional table.
Is there another way? In both DB and Perl.

"The script can be run multiple times simultaneously, so I can't just read the data into the hash and check if it already exist."
It sounds like your biggest concern is that one instance of the script may insert a new company name while another script is running. The two scripts may check the DB for the existence of that company name when it doesn't exist, and then they might both insert the data, resulting in a duplicate.
Assuming I'm understanding your problem correctly, you need to look at transactions. You need to be able to check for the data and insert the data before anyone else is allowed to check for that data. That will keep a second instance of the script from checking for data until the 1st instance is done checking AND inserting.
Check out: http://dev.mysql.com/doc/refman/5.1/en/innodb-transaction-model.html
And: http://dev.mysql.com/doc/refman/5.1/en/commit.html
MyISAM doesn't support transactions. InnoDB does. So you need to make sure your table is InnoDB. Start your set of queries with START TRANSACTION.
Alternatively, you could do this, if you have a unique index on company_name (which you should).
$query_string = "INSERT INTO `companies` VALUES (NULL,'$company_name')";
This will result in an error if the company_name already exists. Try a sample run attempting to insert a duplicate company name. In PHP,
$result = mysql_query($query_string);
$result will equal false on error. So,
if(!$result) {
$query2 = "INSERT INTO `other_table` VALUES (NULL,'$company_name')";
$result2 = mysql_query($query2);
}
If you have a unique key on company_name in both tables, then MySQL will not allow you to insert duplicates. Your multiple scripts may spend a lot of time trying to insert duplicates, but they will not succeed.
EDIT: continuing from the above code, and doing your work for you, here is what you would do if the insert was successful.
if(!$result) {
$query2 = "INSERT INTO `other_table` VALUES (NULL,'$company_name')";
$result2 = mysql_query($query2);
} else if($result !== false) { // must use '!==' rather than '!=' because of loose PHP typing
$last_id = mysql_insert_id();
$query2 = "UPDATE `other_table` SET `some_column` = 'some_value' WHERE `id` = '$last_id'";
// OR, maybe you want this query
// $query2a = "INSERT INTO `other_table` (`id`,`foreign_key_id`) VALUES (NULL,'$last_id')";
}

I suggest you write a stored procedure (STP) that takes the company name as input.
In this STP, first check whether the company name exists. If it does, return the id. Otherwise, insert it and return the new id.
This way, you hit the DB only once.
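The check-or-insert logic described above can be sketched in application code as well. The following is a minimal illustration in Python, with the standard-library sqlite3 module standing in for MySQL (an assumption made so the example runs anywhere); the function name get_or_create_company_id is hypothetical. It relies on a UNIQUE index on company_name, and SQLite's INSERT OR IGNORE plays the role that INSERT IGNORE would play in MySQL. Unlike a stored procedure, this version issues two statements per call.

```python
import sqlite3

def get_or_create_company_id(conn, company_name):
    """Return the id for company_name, inserting the row if it is missing."""
    with conn:  # commits on success, rolls back on an exception
        # With a UNIQUE index on company_name this insert does nothing
        # when the name is already present, so no duplicate can appear.
        conn.execute(
            "INSERT OR IGNORE INTO companies (company_name) VALUES (?)",
            (company_name,),
        )
        row = conn.execute(
            "SELECT company_id FROM companies WHERE company_name = ?",
            (company_name,),
        ).fetchone()
    return row[0]

# Hypothetical in-memory table mirroring the schema from the question,
# plus the UNIQUE constraint this approach depends on.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE companies ("
    " company_id INTEGER PRIMARY KEY AUTOINCREMENT,"
    " company_name VARCHAR(100) UNIQUE)"
)
first = get_or_create_company_id(conn, "Acme")
second = get_or_create_company_id(conn, "Acme")
```

Calling the function twice with the same name returns the same id and leaves a single row in the table, which is the property the concurrent scripts need.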

For InnoDB, use a transaction. For MyISAM, lock the table, do the modifications, then unlock.

Related

Which is better practice: 'check whether a primary ID exists then skip' or 'insert ignore'?

If I'm writing a system which checks an API for new messages every n minutes, which is the better practice? (Each message has a unique ID which is used as the primary ID in my system.)
Would you prefer to:
Look up the primary ID of the message in the database and skip inserting the message if it already exists
Do 'Insert Ignore'?
Neither of your solutions. Keep reading.
Let the database do the work. If you don't want duplicates, then create a unique index on the columns. My guess is:
create unique index unq_messages_messageid on messages (message_id);
(This will also work on a string column or on multiple columns, if that is what you really want.)
Once you have the unique index or constraint the following are the two methods I would suggest.
(1) Just do an insert. If there is a duplicate, it will fail. Handle the error in your application code. Good application code handles errors.
(2) Use on duplicate key update (this might one day be replaced by on conflict ignore):
insert into messages ( . . . )
values ( . . .)
on duplicate key update message_id = values(message_id);
The assignment is a no-op -- it does nothing.
Why is this preferred over insert ignore? Simple reason: it only handles the specific error of a duplicate key. Other errors that might occur are still returned to the application.
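Method (1) can be sketched as follows. This is a minimal illustration in Python using the standard-library sqlite3 module as a stand-in for MySQL (an assumption so that the example is runnable without a server); the function name store_message and the body column are hypothetical. The point it shows is that only the constraint-violation error is caught, so any other failure still reaches the caller.

```python
import sqlite3

def store_message(conn, message_id, body):
    """Insert a message; return False if message_id was already stored."""
    try:
        with conn:  # commits on success, rolls back on an exception
            conn.execute(
                "INSERT INTO messages (message_id, body) VALUES (?, ?)",
                (message_id, body),
            )
        return True
    except sqlite3.IntegrityError:
        # Raised for constraint violations such as a duplicate key.
        # Any other database error is not caught and propagates.
        return False

# Hypothetical table; the primary key supplies the uniqueness guarantee.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE messages (message_id INTEGER PRIMARY KEY, body TEXT)")
results = [
    store_message(conn, 1, "hello"),
    store_message(conn, 1, "hello again"),
]
```

With a MySQL driver the same shape applies, though the exception class and the way to tell a duplicate-key error apart from other integrity errors depend on the driver.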

Prevent Auto Increment Skip On Duplicate Key Update

I have a table with
(ID INT AUTO_INCREMENT primary key,
tag VARCHAR unique)
I want to insert multiple tags at once. Like this:
INSERT INTO tags (tag) VALUES ("java"), ("php"), ("python");
If I execute this and "java" is already in the table, I get an error. It doesn't add "php" and "python".
If I do it like this :
INSERT INTO tags (tag) VALUES ("java"), ("php"), ("python")
ON DUPLICATE KEY UPDATE tag = VALUES(tag)
the rows get added without an error, but it skips values in the ID field.
Example: I have Java with ID = 1 and I run the query. Then PHP will be 3 and Python 4. Is there a way to execute this query without skipping the IDs?
I don't want big spaces between them. I also tried INSERT IGNORE.
Thank you!
See "SQL #1" in http://mysql.rjweb.org/doc.php/staging_table#normalization . It is more complex but avoids 'burning' ids. It has the potential drawback of needing the tags in another table. A snippet from that link:
# This should not be in the main transaction, and it should be
# done with autocommit = ON
# In fact, it could lead to strange errors if this were part
# of the main transaction and it ROLLBACKed.
INSERT IGNORE INTO Hosts (host_name)
SELECT DISTINCT s.host_name
FROM Staging AS s
LEFT JOIN Hosts AS n ON n.host_name = s.host_name
WHERE n.host_id IS NULL;
By isolating this as its own transaction, we get it finished in a hurry, thereby minimizing blocking. By saying IGNORE, we don't care if other threads are 'simultaneously' inserting the same host_names. (If you don't have another thread doing such INSERTs, you can toss the IGNORE.)
(Then it goes on to talk about IODKU.)
The main feature of the InnoDB engine is its support for ACID transactions.
The behavior you describe is not really a "problem": the engine "reserves" the id before it knows whether the row is a duplicate or not.
The following is a solution, but it depends on your table. If it is a very large one, you should do some tests first, because AUTO_INCREMENT is what keeps the ids in order.
I'll give you some examples:
INSERT INTO tags (tag) VALUES ("java"), ("php"), ("python")
ON DUPLICATE KEY UPDATE tag = VALUES(tag), id = LAST_INSERT_ID(id);
SELECT LAST_INSERT_ID();
ALTER TABLE tags AUTO_INCREMENT = 1;
Note: I added LAST_INSERT_ID() because every time you insert or update, it gives you the inserted or reserved id.
The ALTER TABLE ... AUTO_INCREMENT statement must follow each INSERT INTO call.

I need to find out who last updated a mysql database

I can get a last update time from TABLES in information_schema. Can I get a USER who updated the database or a table?
As Amadan mentioned, I'm pretty sure there isn't a way to do this unless you record it yourself. However, this is a pretty straightforward thing to do: whenever you perform an UPDATE query, also log the user (as well as any other relevant information you want to record) in a separate table via an additional MySQL query. Something like this (written in PHP as you didn't specify a language, but the SQL carries over to any language) will work:
// The update query
$stmt = $db->prepare("UPDATE table SET `col` = ? WHERE `col` = ?");
$stmt->execute(array($var1, $var2));
// Something in table has just been updated; record user's id and time of update
$stmt = $db->prepare("INSERT INTO log (userid, `time`) VALUES (?, NOW())");
$stmt->execute(array($userid));
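One refinement worth considering is to run the update and the log insert in the same transaction, so a log row never exists without its update or the other way round. The following is a minimal sketch in Python with the standard-library sqlite3 module standing in for MySQL (an assumption so the example is self-contained); the function name update_with_audit and the items table are hypothetical, while the log table follows the column names used above.

```python
import sqlite3

def update_with_audit(conn, user_id, item_id, new_value):
    """Apply an update and record who made it, as one atomic unit."""
    with conn:  # both statements commit together, or neither does
        conn.execute(
            "UPDATE items SET col = ? WHERE id = ?", (new_value, item_id)
        )
        conn.execute(
            "INSERT INTO log (userid, time) VALUES (?, CURRENT_TIMESTAMP)",
            (user_id,),
        )

# Hypothetical tables for the demonstration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, col TEXT)")
conn.execute("CREATE TABLE log (userid INTEGER, time TEXT)")
conn.execute("INSERT INTO items (id, col) VALUES (1, 'old')")

update_with_audit(conn, user_id=42, item_id=1, new_value="new")

# "Who last updated?" becomes a query against the log table.
last_editor = conn.execute(
    "SELECT userid FROM log ORDER BY rowid DESC LIMIT 1"
).fetchone()[0]
```

In MySQL the same guarantee requires a transactional engine such as InnoDB for both tables.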

Is there a smart way to mass UPDATE in MySQL?

I have a table that needs regular updating. These updates happen in batches. Unlike with INSERT, I can't just include multiple rows in a single query. What I do now is prepare the UPDATE statement, then loop through all the possibilities and execute each. Sure, the preparation happens only once, but there are still a lot of executions.
I created several versions of the table of different sizes (thinking that maybe better indexing or splitting the table would help). However, that did not have an effect on update times. 100 updates take about 4 seconds for either a 1,000-row table or a 500,000-row one.
Is there a smarter way of doing this faster?
As asked in the comments, here is actual code (PHP) I have been testing with. Column 'id' is a primary key.
$stmt = $dblink->prepare("UPDATE my_table SET col1 = ? , col2 = ? WHERE id = ?");
$rc = $stmt->bind_param("ssi", $c1, $c2, $id);
foreach ($items as $item) {
$c1 = $item['c1'];
$c2 = $item['c2'];
$id = $item['id'];
$rc = $stmt->execute();
}
$stmt->close();
If you really want to do it all in one big statement, a kludgy way would be to use the "on duplicate key" functionality of the insert statement, even though all the rows should already exist, and the duplicate key update will hit for every single row.
INSERT INTO table (a,b,c) VALUES (1,2,3),(4,5,6)
ON DUPLICATE KEY UPDATE a=VALUES(a), b=VALUES(b), c=VALUES(c);
Try LOAD DATA INFILE. It is much faster than MySQL INSERTs or UPDATEs, as long as you can get the data in a flat format.
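Another option for the looped prepared statement in the question is to wrap the whole batch in a single transaction, so there is one commit for the batch instead of one per row. Whether per-statement commits are the actual bottleneck in the question is an assumption, not something the thread confirms. The sketch below uses Python with the standard-library sqlite3 module as a stand-in for MySQL; the function name bulk_update is hypothetical, and the table and column names follow the question.

```python
import sqlite3

def bulk_update(conn, items):
    """Apply many single-row updates inside one transaction."""
    params = [(item["c1"], item["c2"], item["id"]) for item in items]
    with conn:  # a single commit covers the whole batch
        conn.executemany(
            "UPDATE my_table SET col1 = ?, col2 = ? WHERE id = ?", params
        )

# Hypothetical table with two existing rows to update.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE my_table (id INTEGER PRIMARY KEY, col1 TEXT, col2 TEXT)"
)
conn.executemany("INSERT INTO my_table (id) VALUES (?)", [(1,), (2,)])

bulk_update(conn, [
    {"id": 1, "c1": "a", "c2": "b"},
    {"id": 2, "c1": "c", "c2": "d"},
])
rows = conn.execute(
    "SELECT id, col1, col2 FROM my_table ORDER BY id"
).fetchall()
```

If the batch time does not drop after moving to one transaction, the cost is more likely in network round trips, which is where the single-statement approaches above would help.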

Overwriting data in a MySQL table

With the query below, I am trying to overwrite the 10th field in a MySQL table called "login" with the value NEW_VALUE. It's not working. Is the code below the correct method for overwriting existing data in a MySQL table?
Thanks in advance,
John
INSERT INTO login VALUES (NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, 'NEW_VALUE', NULL, NULL, NULL)
Just as an addition if anyone is still looking for an actual overwrite and not just an update. If you want to OVERWRITE always, (not update, just overwrite) you can use REPLACE instead of INSERT.
REPLACE works exactly like INSERT, except that if an old row in the table has the same value as a new row for a PRIMARY KEY or a UNIQUE index, the old row is deleted before the new row is inserted. See Section 13.2.5, “INSERT Syntax”
http://dev.mysql.com/doc/refman/5.5/en/replace.html
No, your code is not correct. You are adding a new row to your table, not updating existing values. To update existing values, you want to use an update statement:
Update a specific record
mysql_query("Update login SET nameOfYourColumn = '$cleanURL' WHERE primaryKey = idOfRowToUpdate")
To update the entire table
mysql_query("Update login SET nameOfYourColumn = '$cleanURL'")
If I've understood your question then the answer is "no". This isn't a MySQL-specific issue either; it's a generic SQL question. I'd strongly recommend going through an SQL tutorial. The best one I know of is here:
http://philip.greenspun.com/sql/
To answer your question, you should be able to do:
mysql_query("UPDATE login SET foo = '$cleanurl'");
where "foo" is the name of the tenth field.
A few other comments though:
Firstly, don't rely on the position of your fields, always explicitly list the field names. For example, it's better to go
INSERT INTO login (id, name) VALUES (1, 'Fred')
instead of
INSERT INTO login VALUES (1, 'Fred')
Point 2: You have directly embedded the value of $cleanurl into your query. Of course, you have to learn one thing at a time but be aware that this is very dangerous. If $cleanurl contains something like "'); DROP TABLE login;" then you might be in trouble. This is called SQL injection and is the source of constant security problems. Without going into too much detail, you should learn how to use prepared statements.
Point 3: PHP comes with a library called PDO which supports prepared statements. It also provides a common API for interacting with your database so if you find that you need to move from Mysql to another DBMS, PDO will abstract away most of the differences. By using the mysql_query function you lock yourself into using mysql.
You don't have to address all of these issues simultaneously, but don't forget about them either. Once you get familiar with PHP and SQL, come back to the points about PDO and prepared statements.
First off: INSERT adds a new record to a table, UPDATE updates (overwrites) one or more existing records.
Second: UPDATE needs to know the name of the column to update, and which rows to update
UPDATE <tableName>
SET <columnName> = '$cleanurl'
WHERE <some condition to identify which record should be updated>
Thirdly: it's probably worth your while reading a few basic tutorials on MySQL/SQL.
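To tie together the two recurring points in these answers (use UPDATE with a WHERE clause, and pass values as parameters rather than embedding them in the query string), here is a minimal sketch in Python using the standard-library sqlite3 module as a stand-in for MySQL. The function name set_url and the column names id and url are hypothetical; the question does not say what the tenth field is called.

```python
import sqlite3

def set_url(conn, row_id, clean_url):
    """Overwrite one column of one row; return how many rows changed."""
    with conn:  # commits on success, rolls back on an exception
        cur = conn.execute(
            # The ? placeholders keep the value out of the SQL text,
            # so it is stored as data and never parsed as SQL.
            "UPDATE login SET url = ? WHERE id = ?",
            (clean_url, row_id),
        )
    return cur.rowcount

# Hypothetical two-column version of the login table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE login (id INTEGER PRIMARY KEY, url TEXT)")
conn.execute("INSERT INTO login (id, url) VALUES (1, 'old')")

# A value that would be dangerous if concatenated into the query string.
hostile = "'); DROP TABLE login; --"
changed = set_url(conn, 1, hostile)
stored = conn.execute("SELECT url FROM login WHERE id = 1").fetchone()[0]
```

The hostile string ends up stored verbatim in the column and the table is untouched, which is the behavior prepared statements in PDO provide as well.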