Learning MySQL, Python - Skip Duplicates - mysql

I've been trying to learn SQL by using Python to update a db, and am trying to do something simple: iterate through a csv file that includes the Fortune 500 with their revenue info and push it into an SQL db. I've run it a few times and it's working great; the only issue is I'm getting duplicates because I've run the same file a few times.
In the future, I'm assuming it's good to learn how to avoid duplicates. After looking around, this is what I've found for a proposed solution using WHERE NOT EXISTS, but I am getting an error. Any advice is welcome as I'm totally new.
Note - I do know I should be updating more than one row at a time; that's my next lesson.
import pymysql
import csv
with open('companies.csv','rU') as f:
    reader = csv.DictReader(f)
    for i in reader:
        conn = pymysql.connect(host='host', user='user', passwd='pw', db='db_test')
        cur = conn.cursor()
        query1 = "INSERT INTO companies (Name, Revenue, Profit, Stock_Price) VALUES (\'{}\',{},{},{})".format(str(i['Standard']),float(i['Revenues']),float(i['Profits']),float(i['Rank']))
        query2 = 'WHERE NOT EXISTS (SELECT Name FROM companies WHERE Name = \'{}\')'.format(str(i['Standard']))
        query = query1+' '+query2
        cur.execute(query)
        conn.commit()
        cur.close()
OUTPUT:
INSERT INTO companies (Name, Revenue, Profit, Stock_Price) VALUES ('WalMart Stores',469.2,16999.0,1.0) WHERE NOT EXISTS (SELECT Name FROM companies WHERE Name = 'WalMart Stores')
ERROR:
You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'WHERE NOT EXISTS (SELECT Name FROM companies WHERE Name = 'WalMart Stores')' at line 1")

Ok. First of all, congratulations on self-learning!
Now, to the point.
When you use insert ... values, you can't attach a where condition to the statement: a plain insert only adds rows. (When you use insert ... select, you can put a where condition on the select, but it filters the rows being selected, not the table you're inserting into.)
So, there are two ways to do what you want:
1. Create a unique index on the column that you want to test, and then use insert ignore...
2. In your code, check if the value is already there, and if it's not, then insert it.
I'll tell you how to work with the first suggestion, because it'll teach you a couple of things. As for suggestion 2, I'll leave that for you as homework ;-)
First, you need to add a unique index to your table. If you want to avoid duplicates on the Name column, then:
alter table companies
add unique index idx_dedup_name(Name);
Check the syntax for ALTER TABLE.
And now, let's say that Companies already has a row with name 'XCorp'. If you try a normal INSERT... VALUES statement here, you'll get an error, because you're trying to add a duplicate value. If you want to avoid that error, you can use something like this:
insert ignore into companies(name) values ('XCorp');
This will execute as a normal insert, but since you're trying to insert a duplicate value, the row is silently skipped (it will throw a warning instead of an error).
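Here is a minimal sketch of the same idea driven from Python. To keep it runnable without a server, it assumes Python's built-in sqlite3 module as a stand-in for MySQL, so the statement is spelled INSERT OR IGNORE (SQLite) rather than INSERT IGNORE (MySQL), and the placeholder is ? rather than the %s that pymysql expects. The sample rows are made up.

```python
import sqlite3

# In-memory SQLite stands in for the MySQL server (assumption, for runnability).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE companies (Name TEXT, Revenue REAL)")
# Same role as the ALTER TABLE ... ADD UNIQUE INDEX statement above.
conn.execute("CREATE UNIQUE INDEX idx_dedup_name ON companies (Name)")

rows = [("XCorp", 10.0), ("YCorp", 20.0), ("XCorp", 10.0)]  # made-up sample data
for name, revenue in rows:
    # SQLite spelling: INSERT OR IGNORE. MySQL spelling: INSERT IGNORE.
    conn.execute(
        "INSERT OR IGNORE INTO companies (Name, Revenue) VALUES (?, ?)",
        (name, revenue),
    )
conn.commit()

names = [r[0] for r in conn.execute("SELECT Name FROM companies ORDER BY Name")]
print(names)
```

Passing the values as parameters instead of formatting them into the string also means a company name containing a quote won't break the query.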
As for suggestion 2, as I told you, I leave it to you as homework.
Hints:
Count the rows where the name matches a value.
Read the count into a variable in your Python program.
Test the value: if there are zero entries, then perform the insert.
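If you want to compare your attempt against something, the hints above could be sketched roughly like this. It assumes sqlite3 as a stand-in for MySQL so it runs anywhere (with pymysql the placeholders would be %s), and insert_if_missing is a made-up helper name.

```python
import sqlite3


def insert_if_missing(conn, name):
    """Insert name unless a matching row exists. Returns True if a row was added."""
    cur = conn.cursor()
    # Hint 1: count the rows where the name matches.
    cur.execute("SELECT COUNT(*) FROM companies WHERE Name = ?", (name,))
    # Hint 2: read the count into a Python variable.
    (count,) = cur.fetchone()
    # Hint 3: only insert when there are zero matches.
    if count == 0:
        cur.execute("INSERT INTO companies (Name) VALUES (?)", (name,))
        conn.commit()
        return True
    return False


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE companies (Name TEXT)")
first = insert_if_missing(conn, "XCorp")
second = insert_if_missing(conn, "XCorp")
total = conn.execute("SELECT COUNT(*) FROM companies").fetchone()[0]
```

Keep in mind that two programs running this at the same time can both see a count of zero and both insert, which is one reason the unique index is the more robust option.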

Related

A variable is cut in a wrong way but not sure why?

I'm writing a script that locates all branches of a specific repo that haven't received any commits for more than 6 months and deletes them (after notifying committers).
This script will run from Jenkins every week, will store all these branches in some MySQL database and then in the next run (after 1 week), will pull the relevant branch names from the database and will delete them.
I want to make sure that if for some reason the script is run twice on the same day, relevant branches will not get added again to the database, so I check it using a SQL query:
def insert_data(branch_name):
    try:
        connection = mysql.connector.connect(user=db_user,
                                             host=db_host,
                                             database=db_name,
                                             passwd=db_pass)
        cursor = connection.cursor(buffered=True)
        insert_query = """insert into {0}
        (
        branch_name
        )
        VALUES
        (
        \"{1}\"
        ) where not exists (select 1 from {0} where branch_name = \"{1}\" and deletion_date is NULL) ;""".format(
            db_table,
            branch_name
        )
        cursor.execute(insert_query, multi=True)
        connection.commit()
    except Exception as ex:
        print(ex)
    finally:
        cursor.close()
        connection.close()
When I run the script, for some reason, the branch_name variable is cut in the middle and then the query that checks if the branch name already exists in the database fails:
1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'where not exists (select 1 from branches_to_delete where branch_name = `AUT-1868' at line 8
So instead of checking for 'AUT-18681_designer_create_new_name_if_illegal_char_exist' it checks for 'AUT-1868' which doesn't exist in the database.
I've tried the following:
'{1}'
"{1}"
{1}
But to no avail.
What am I doing wrong?
Using a WHERE clause in an INSERT INTO ... VALUES query is illegal:
INSERT INTO `some_table`(`some_column`)
VALUES ('some_value') WHERE [some_condition]
So, the above example is not a valid MySQL query. To prevent duplication of branch_name, you should add a unique index on your table like:
ALTER TABLE `table` ADD UNIQUE INDEX `unique_branch_name` (`branch_name`);
And after this you can use the following query:
INSERT INTO `table` (`branch_name`) VALUES ('branch_name_1')
ON DUPLICATE KEY UPDATE `branch_name` = `branch_name`;
Pay attention: if your table has an auto-increment id, the counter will be incremented on each insert attempt, even when the row turns out to be a duplicate.
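From Python, that query could be issued along these lines. This is only a sketch: upsert_branch is a made-up helper, and RecordingCursor is a hypothetical stand-in that just records what would be sent, so the example runs without a database. With mysql.connector you would pass a real cursor instead; its placeholder style is %s, as used here.

```python
import re


def upsert_branch(cursor, table, branch_name):
    """Run the ON DUPLICATE KEY UPDATE insert with branch_name bound as a parameter."""
    # Table names cannot be bound as parameters, so validate the identifier instead.
    if not re.fullmatch(r"[A-Za-z0-9_]+", table):
        raise ValueError(f"unexpected table name: {table!r}")
    query = (
        f"INSERT INTO `{table}` (`branch_name`) VALUES (%s) "
        "ON DUPLICATE KEY UPDATE `branch_name` = `branch_name`"
    )
    cursor.execute(query, (branch_name,))


class RecordingCursor:
    """Hypothetical stand-in for a DB-API cursor; it only records calls."""

    def __init__(self):
        self.calls = []

    def execute(self, query, params=()):
        self.calls.append((query, params))


cur = RecordingCursor()
upsert_branch(cur, "branches_to_delete",
              "AUT-18681_designer_create_new_name_if_illegal_char_exist")
print(cur.calls[0])
```

Because the branch name travels as a parameter rather than being formatted into the SQL text, quoting the value is no longer your problem.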
Since MySQL 8.0 you can use the JSON_TABLE function to generate a pseudo table from your values, filter out the values that already exist, and use the result for the insert. Look here for example
I don't see anything wrong assuming the source of the branch_name is safe (you are not open to SQL Injection attacks), but as an experiment you might try:
insert_query = f"""insert into {db_table}(branch_name) VALUES(%s) where not exists
(select 1 from {db_table} where branch_name = %s and deletion_date is NULL)"""
cursor.execute(insert_query, (branch_name, branch_name))
I am using a prepared statement (which is also safe against SQL Injection attacks), passing the branch_name as a parameter to the execute method, and I have also removed the multi=True parameter.
Update
I feel like a bit of a dummy for missing what is clearly an illegal WHERE clause. Nevertheless, the rest of the answer suggesting the use of a prepared statement is advice worth following, so I will keep this posted.
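For completeness, the conditional insert can be made legal by moving the WHERE onto a SELECT, i.e. INSERT ... SELECT ... WHERE NOT EXISTS. The sketch below assumes sqlite3 as a stand-in so it runs without a server, and insert_pending_branch is a made-up name. In MySQL the same statement is commonly written with %s placeholders and SELECT %s FROM DUAL; treat that spelling as something to verify against your server version.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE branches_to_delete (branch_name TEXT, deletion_date TEXT)"
)


def insert_pending_branch(conn, branch_name):
    """Add branch_name unless a row for it with no deletion_date already exists."""
    # The WHERE belongs to the SELECT, not to the INSERT, which is what makes it legal.
    conn.execute(
        """
        INSERT INTO branches_to_delete (branch_name)
        SELECT ?
        WHERE NOT EXISTS (
            SELECT 1 FROM branches_to_delete
            WHERE branch_name = ? AND deletion_date IS NULL
        )
        """,
        (branch_name, branch_name),
    )
    conn.commit()


name = "AUT-18681_designer_create_new_name_if_illegal_char_exist"
insert_pending_branch(conn, name)
insert_pending_branch(conn, name)  # second run on the same day adds nothing
pending = conn.execute("SELECT COUNT(*) FROM branches_to_delete").fetchone()[0]
```

Unlike a plain unique index on branch_name, this keeps the "deletion_date is NULL" condition from the question.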

Syntax Error in MySQL clauses

Hello Stack Overflow friends, I need your help with this SQL clause. This is the error from MySQL:
_mysql_exceptions.ProgrammingError: (1064, "You have an error in your SQL syntax; check the manual that corresponds to your MariaDB server version for the right syntax to use near 'WHERE email='Tysaic0344#gmail.com'' at line 1")
and this is my code:
INSERT INTO user (token) VALUES (1) WHERE email='example#email.com'
You cannot insert values into an existing row. You can either update or delete the existing records. In your case, I think you want to update the existing row. You can use UPDATE.
UPDATE user SET token = 1 WHERE email = 'example#email.com';
If you want to add records to the table use INSERT
INSERT INTO user VALUES (1, 'example#email.com');
Here is the link for your reference
https://msdn.microsoft.com/en-us/library/bb243852(v=office.12).aspx
You can't INSERT with a WHERE clause.
If you need to UPDATE the record where you have the email from:
UPDATE user
Set token = 1
WHERE email='example#email.com'
Or INSERT with email
INSERT INTO user (token, email)
VALUES (1, 'example#email.com')
(or without)
INSERT INTO user (token)
VALUES (1)
This kind of error you MUST be able to fix by yourself; the error even tells you where it went wrong (at the end it says "near 'WHERE...").
Check the docs that dns_nx included (especially https://dev.mysql.com/doc/refman/5.7/en/update.html ) for the correct syntax to do an update.
You cannot INSERT a value into an existing row. The WHERE clause is invalid with INSERT. If you want to update an existing row, then you have to UPDATE the field like this:
UPDATE
user
SET
token = 1
WHERE
email='example#email.com'
Please review the docs about INSERT and UPDATE
https://dev.mysql.com/doc/refman/5.7/en/update.html
https://dev.mysql.com/doc/refman/5.7/en/insert.html
INSERT inserts new rows into a table. The WHERE clause is used to filter existing rows from a table. It doesn't make sense in an INSERT query; that's why the INSERT statement does not contain a WHERE clause.
The WHERE clause is used to filter the rows to fetch from the table (the SELECT statement), the rows to modify (the UPDATE statement) or to remove from the table (the DELETE statement).
Your query looks like you want to modify the data already existing in the table. The UPDATE statement you need looks like this:
UPDATE user SET token = 1 WHERE email = 'example#email.com'
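Since the traceback comes from a Python driver, here is a rough sketch of issuing that UPDATE from Python with bound parameters. It assumes sqlite3 as a stand-in so it runs without a server (MySQL drivers use %s rather than ?), and the table contents and email address are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE user (token INTEGER, email TEXT)")
conn.execute(
    "INSERT INTO user (token, email) VALUES (?, ?)", (0, "someone@example.com")
)

# UPDATE modifies the existing row; WHERE picks which row.
cur = conn.execute(
    "UPDATE user SET token = ? WHERE email = ?", (1, "someone@example.com")
)
conn.commit()

updated = cur.rowcount  # number of rows the UPDATE changed
token = conn.execute(
    "SELECT token FROM user WHERE email = ?", ("someone@example.com",)
).fetchone()[0]
```

Checking rowcount afterwards tells you whether any row actually matched the email.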

SQL insert query inside select or where clause

Maybe my question is stupid, but I just want to know if it is possible to have an insert query inside a select or where clause.
The reason I want to know is this: if someone hacks a website or any application database, can the hacker input data into the hacked database without my knowledge?
The following example of SQL injection is one I've seen on other sites:
http://www.example.com/empsummary.php?id=1 AND 1=-1 union select 1,group_concat(name,0x3a,email,0x3a,phone,0x2a),3,4,5,6,7,8,9 from employee
I know exactly what the above query does, but can the hacker input data (use an insert query) into the database or any table?
Yes, it can happen, if the database interface is configured to allow multiple statements in a query.
An INSERT can't run as part of a SELECT statement. But it's possible that the exploit of a vulnerability could finish a SELECT and then execute a separate insert.
Say you have a vulnerable statement like this:
SELECT foo FROM bar WHERE fee = '$var'
Consider the SQL text when $var contains:
1'; INSERT INTO emp (id) VALUES (999); --
The SQL text could be something like this:
SELECT foo FROM bar WHERE fee = '1'; INSERT INTO emp (id) VALUES (999); --'
If multi-statement queries are enabled in the database interface library, it's conceivable that an INSERT statement could be executed.
See: https://www.owasp.org/index.php/SQL_Injection
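To make that concrete, here is a small sketch using Python's built-in sqlite3 module, chosen only because it runs without a server; the bar and emp tables mirror the example above. Its executescript method accepts multiple statements, which plays the role of a multi-statement-enabled interface, while binding the input as a parameter keeps it from ever being parsed as SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    "CREATE TABLE bar (foo TEXT, fee TEXT); CREATE TABLE emp (id INTEGER);"
)

var = "1'; INSERT INTO emp (id) VALUES (999); --"  # attacker-controlled input

# Vulnerable: the input is pasted into the SQL text and sent through an API
# that runs every statement it finds.
conn.executescript("SELECT foo FROM bar WHERE fee = '" + var + "'")
injected = conn.execute("SELECT COUNT(*) FROM emp").fetchone()[0]

# Safer: the input is bound as a parameter, so it is only compared as a value.
conn.execute("DELETE FROM emp")
conn.execute("SELECT foo FROM bar WHERE fee = ?", (var,)).fetchall()
after_bound = conn.execute("SELECT COUNT(*) FROM emp").fetchone()[0]
```

After the first call the emp table contains the injected row; after the parameterized call it stays empty.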

Check if record exists delete it using mysql

I'm using MySQL and I want to check if a record exists and, if it does, delete it.
I tried this but it's not working for me:
SELECT 'Barcelone' AS City, EXISTS(SELECT 1 FROM mytable WHERE City = 'Barcelone') AS 'exists';
THEN
DELETE FROM mytable
WHERE City = 'Barcelone';
Thank you for your help.
The if statement is only allowed in stored procedures, stored functions, and triggers (in MySQL).
If I understand what you want, just do:
DELETE FROM mytable
WHERE City = 'Barcelone';
There is no reason to check for the existence beforehand. Just delete the row. If none exist, no problem. No errors.
I would recommend an index on mytable(city) for performance reasons. If you want to check if the row exists first, that is fine, but it is unnecessary for the delete.
If you mean MySQL is returning an error message (if that's what you mean by "not working for me"), then that's exactly the behavior we would expect.
That SQL syntax is not valid for MySQL.
If you want to delete rows from a table, issue a DELETE statement, e.g.
DELETE FROM mytable WHERE City = 'Barcelone'
If you want to know how many rows were deleted (if the statement doesn't throw an error), immediately follow the DELETE statement (in the same session) with a query:
SELECT ROW_COUNT()
Or the appropriate function in whatever client library you are using.
If the ROW_COUNT() function returns 0, then there were no rows deleted.
There's really no point (in terms of MySQL) in issuing a SELECT to find out if there are rows to be deleted; the DELETE statement itself will figure it out.
If for some reason your use case requires you to check whether there are rows to be deleted, then just run a separate SELECT:
SELECT COUNT(1) FROM mytable WHERE City = 'Barcelone'
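As an example of "the appropriate function in whatever client library you are using": Python DB-API cursors expose a rowcount attribute. A minimal sketch, assuming sqlite3 as a stand-in so it runs without a MySQL server (MySQL drivers use %s placeholders), with made-up table contents:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (City TEXT)")
conn.executemany(
    "INSERT INTO mytable (City) VALUES (?)", [("Barcelone",), ("Paris",)]
)

# No existence check needed: just delete and look at how many rows went away.
cur = conn.execute("DELETE FROM mytable WHERE City = ?", ("Barcelone",))
first_delete = cur.rowcount

# Deleting again matches nothing, and that is not an error.
cur = conn.execute("DELETE FROM mytable WHERE City = ?", ("Barcelone",))
second_delete = cur.rowcount
conn.commit()
```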

Multiple Queries in Triggers

I'm fairly new to using triggers and have a tiny question.
I have a trigger that finds a match between a newly inserted enquiry and a customer table.
INSERT INTO customersmatched (customerID,enquiryID) SELECT id, NEW.id FROM customer AS c WHERE c.customerName=NEW.companyName HAVING COUNT(id)=1;
I then need to update the newly inserted enquiry so it has a status which shows it's matched (but only if it has matched). So I tried adding this line after the insert.
UPDATE enquiry SET status="Live-Enquiry" WHERE id IN ( SELECT enquiryID FROM customersmatched WHERE enquiryID = NEW.id);
Except I get this error:
MySQL said: #1064 - You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'UPDATE enquiry SET status="Live-Enquiry" WHERE id IN ( SELECT enquiryID FROM cus' at line 5
How do I allow multiple queries within a trigger? I've tried doing something like in this link: Multiple insert/update statements inside trigger?
But that doesn't work either. I'm using phpMyAdmin btw. Can anyone help? :D
If you have the ANSI_QUOTES SQL mode enabled, then you can't use double quotes for a string literal and need to use single quotes instead. See: http://dev.mysql.com/doc/refman/5.7/en/sql-mode.html#sqlmode_ansi_quotes Otherwise, I don't see any syntax errors that jump out at me.
Try changing SET status="Live-Enquiry" to SET status='Live-Enquiry'
EDIT:
What is the purpose of the first query? I'm not sure you need the HAVING in that query. If you want a distinct list of matches, just use DISTINCT:
INSERT INTO customersmatched (customerID,enquiryID)
SELECT DISTINCT id, NEW.id
FROM customer AS c
WHERE c.customerName=NEW.companyName;
The second query, if I understand it correctly, can be simplified to this:
UPDATE enquiry
SET status='Live-Enquiry'
WHERE id = NEW.id;