Executing queries together to gain efficiency - MySQL

Presently I am using prepared statements to execute queries and updates on the database. Though they are pre-compiled (and hence fast), I think it would be even more efficient if I could have an arrangement like the following:
Scenario:
Suppose I have to insert 100 rows into a database table. I use prepared statements, so I prepare a statement and send it to the database for execution once per row. So each time the query is of the form:
insert into user values(....);
Now consider the situation where I have a query of the form
insert into user values (...), (...), ....,(...);
This way we can minimize table access and execute the query in one go.
Is there any way to do this using prepared statements, or some other arrangement where we can instruct the database to execute the next 100 updates together? By the way, I am presently working with MySQL.

INSERT statements that use VALUES syntax can insert multiple rows. To
do this, include multiple lists of column values, each enclosed within
parentheses and separated by commas. Example:
INSERT INTO tbl_name (a,b,c) VALUES(1,2,3),(4,5,6),(7,8,9);
Try something like
Statement st = con.createStatement();
st.addBatch("INSERT INTO tbl_name VALUES(1,2,3)");
st.addBatch("INSERT INTO tbl_name VALUES(4,5,6)");
st.addBatch("INSERT INTO tbl_name VALUES(7,8,9)");
st.executeBatch();
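Since the question asks about prepared statements specifically, batching also works on a PreparedStatement, and with MySQL Connector/J the rewriteBatchedStatements=true URL option lets the driver send the batch as a single multi-row INSERT. A minimal sketch, assuming a user table with name and email columns (the connection URL, credentials, and column names are placeholders):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;

public class PreparedBatchSketch {
    public static void main(String[] args) throws Exception {
        // rewriteBatchedStatements=true lets Connector/J send the batch
        // as one multi-row INSERT instead of 100 round trips.
        String url = "jdbc:mysql://localhost:3306/test?rewriteBatchedStatements=true";
        try (Connection con = DriverManager.getConnection(url, "user", "password");
             PreparedStatement ps = con.prepareStatement(
                     "INSERT INTO user (name, email) VALUES (?, ?)")) {
            con.setAutoCommit(false);              // commit once for the whole batch
            for (int i = 0; i < 100; i++) {
                ps.setString(1, "name" + i);
                ps.setString(2, "user" + i + "@example.com");
                ps.addBatch();                     // queue the row; nothing is sent yet
            }
            ps.executeBatch();                     // all 100 rows go to the server together
            con.commit();
        }
    }
}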

Any reason not to do it as a bulk insert operation?
There's probably a better way of doing it, but I simply write a "file" to /dev/shm then reference it in a LOAD DATA INFILE statement, thus never hitting disk on the client machine.
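A rough sketch of what that /dev/shm approach could look like from Java, assuming the Connector/J allowLoadLocalInfile=true option and a server with local_infile enabled (file path, table, and columns are made up for illustration):

import java.nio.file.Files;
import java.nio.file.Path;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;
import java.util.List;

public class LoadDataSketch {
    public static void main(String[] args) throws Exception {
        // Write tab-separated rows to tmpfs so the "file" never touches a physical disk.
        Path file = Path.of("/dev/shm/user_rows.tsv");
        Files.write(file, List.of("1\tAlice", "2\tBob"));

        // allowLoadLocalInfile=true is required for Connector/J to send a client-side file;
        // the server must also have local_infile enabled.
        String url = "jdbc:mysql://localhost:3306/test?allowLoadLocalInfile=true";
        try (Connection con = DriverManager.getConnection(url, "user", "password");
             Statement st = con.createStatement()) {
            st.execute("LOAD DATA LOCAL INFILE '/dev/shm/user_rows.tsv' "
                     + "INTO TABLE user (id, name)");
        }
    }
}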

Related

Adding records in MySQL

I want to add some 1000 records to my table to create a database. Inserting each record manually is not at all practical. Is there a proper way to do this?
In MySQL you can insert multiple rows with a single insert statement.
insert into table values (data-row-1), (data-row-2), (data-row-3)
If you run a mysqldump on your database, you will see that this is what the output does.
The insert is then run as a single "transaction", so it's much, much faster than running 1000 individual inserts.
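If the rows come from application code rather than a dump, one way to build such a multi-row statement is with a PreparedStatement whose VALUES list is generated to match the number of rows. A sketch, assuming a hypothetical person table with a single name column:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.util.List;

public class MultiRowInsertSketch {
    public static void main(String[] args) throws Exception {
        List<String> names = List.of("alice", "bob", "carol");

        // Build "INSERT INTO person (name) VALUES (?),(?),(?)" with one placeholder per row.
        StringBuilder sql = new StringBuilder("INSERT INTO person (name) VALUES ");
        for (int i = 0; i < names.size(); i++) {
            sql.append(i == 0 ? "(?)" : ",(?)");
        }

        try (Connection con = DriverManager.getConnection(
                     "jdbc:mysql://localhost:3306/test", "user", "password");
             PreparedStatement ps = con.prepareStatement(sql.toString())) {
            for (int i = 0; i < names.size(); i++) {
                ps.setString(i + 1, names.get(i));   // JDBC parameter indexes are 1-based
            }
            ps.executeUpdate();                      // one statement, all rows inserted together
        }
    }
}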

INSERT doesn't work when followed by SELECT LAST_INSERT_ID

There are a lot of questions about LAST_INSERT_ID()
In my case, issue is:
When the INSERT is followed by SELECT LAST_INSERT_ID() in the same request, no records are inserted:
INSERT INTO sequences (state) VALUES (0);
select LAST_INSERT_ID();
>>> 80 // nothing is added to DB
INSERT on its own works OK:
INSERT INTO sequences (state) VALUES (0);
>>>
select LAST_INSERT_ID();
>>> 81 // row is inserted
For testing I am using SequelPro; the DB is Amazon's RDS MySQL. The same issue happens when I use Python's MySQLdb module.
Ideally I want to insert a row and get back its ID for future identification and use.
You should run one query at a time. Most SQL interfaces don't allow multiple queries.
MySQL allows a single query to be terminated by ;, but if any words follow the ; (other than a comment), it is a syntax error, which makes the whole request fail. So the INSERT won't run either.
MySQL does have a connection option to allow multi-query, but it's not active by default. See https://dev.mysql.com/doc/refman/8.0/en/c-api-multiple-queries.html
There's really no reason to use multi-query. Just run the SELECT LAST_INSERT_ID() as a separate query following the INSERT. As long as you use the same connection, it'll give you the right answer.
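For example, in JDBC the same idea is two separate statements on one connection (alternatively, getGeneratedKeys() returns the auto-increment value without a second query). A sketch using the table from the question, with connection details as placeholders:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class LastInsertIdSketch {
    public static void main(String[] args) throws Exception {
        try (Connection con = DriverManager.getConnection(
                     "jdbc:mysql://localhost:3306/test", "user", "password");
             Statement st = con.createStatement()) {
            // First query: the INSERT on its own.
            st.executeUpdate("INSERT INTO sequences (state) VALUES (0)");

            // Second query on the same connection: LAST_INSERT_ID() is per-connection,
            // so it returns the id generated by the INSERT above.
            try (ResultSet rs = st.executeQuery("SELECT LAST_INSERT_ID()")) {
                if (rs.next()) {
                    System.out.println("new id = " + rs.getLong(1));
                }
            }
        }
    }
}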

How to handle millions of separate insert queries

I have a situation in which I have to insert over 10 million separate records into one table. Normally a batch insert split into chunks does the work for me. The problem, however, is that this 3+ GB file contains over 10 million separate insert statements. Since every query takes 0.01 to 0.1 seconds, it will take over 2 days to insert everything.
I'm sure there must be a way to optimize this, by either lowering the insert time drastically or importing in a different way.
I'm now just using the CLI:
source /home/blabla/file.sql
Note: it's a third party that is providing me this file. I'm
Small update
I removed all the indexes.
Drop the indexes, then re-index when you are done!
Maybe you can parse the file data and combine several INSERT queries into one query like this:
INSERT INTO tablename (field1, field2...) VALUES (val1, val2, ..), (val3, val4, ..), ...
There are some ways to improve the speed of your INSERT statements:
Try to insert many rows at once if this is an option.
An alternative can be to create a copy of your desired table without indexes, insert the data there, then add the indexes and rename the table.
Maybe use LOAD DATA INFILE, if this is an option.
The MySQL manual has something to say about that, too.
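If the file has to stay as it is (one INSERT per line), here is a client-side sketch of the chunking idea: turn off auto-commit and commit in large batches instead of letting each statement be its own transaction. The file path and chunk size are placeholders, and it assumes each INSERT sits on a single line:

import java.io.BufferedReader;
import java.nio.file.Files;
import java.nio.file.Path;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class BulkImportSketch {
    public static void main(String[] args) throws Exception {
        try (Connection con = DriverManager.getConnection(
                     "jdbc:mysql://localhost:3306/test", "user", "password");
             Statement st = con.createStatement();
             BufferedReader in = Files.newBufferedReader(Path.of("/home/blabla/file.sql"))) {
            con.setAutoCommit(false);               // one transaction per chunk, not per statement
            int pending = 0;
            String line;
            while ((line = in.readLine()) != null) {
                String sql = line.trim();
                if (sql.isEmpty()) continue;
                if (sql.endsWith(";")) {
                    sql = sql.substring(0, sql.length() - 1);   // JDBC statements take no trailing ;
                }
                st.addBatch(sql);
                if (++pending == 10_000) {          // flush in chunks to bound memory use
                    st.executeBatch();
                    con.commit();
                    pending = 0;
                }
            }
            st.executeBatch();                      // whatever is left in the last chunk
            con.commit();
        }
    }
}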

Restructure huge unnormalized mysql database

Hi, I have a huge unnormalized MySQL database with ~100 million URLs (~20% dupes) divided into identical split tables of 13 million rows each.
I want to move the URLs into a normalized database on the same MySQL server.
The old database table is unnormalized, and the URLs have no index.
It looks like this:
entry{id, data, data2, data3, data4, possition, rang, url}
And I'm going to split it up into multiple tables.
url{id,url}
data{id,data}
data1{id,data}
etc
The first thing I did was
INSERT IGNORE INTO newDatabase.url (url)
SELECT DISTINCT unNormalised.url FROM oldDatabase.unNormalised
But the " SELECT DISTINCT unNormalised.url" (13 million rows) took ages, and I figured that that since "INSERT IGNORE INTO" also do a comparison, it would be fast to just do a
INSERT IGNORE INTO newDatabase.url (url)
SELECT unNormalised.url FROM oldDatabase.unNormalised
without the DISTINCT. Is this assumption wrong?
Anyway, it still takes forever and I need some help. Is there a better way of dealing with this huge quantity of unnormalized data?
Would it be best if I did a SELECT DISTINCT unNormalised.url on the entire 100 million row database, exported all the IDs, and then moved only those IDs to the new database with, let's say, a PHP script?
All ideas are welcome; I have no clue how to port all this data without it taking a year!
PS: it is hosted on an Amazon RDS server.
Thank you!
As the MySQL manual states that LOAD DATA INFILE is quicker than INSERT, the fastest way to load your data would be:
LOCK TABLES url WRITE;
ALTER TABLE url DISABLE KEYS;
LOAD DATA INFILE 'urls.txt'
IGNORE
INTO TABLE url
...;
ALTER TABLE url ENABLE KEYS;
UNLOCK TABLES;
But since you already have the data loaded into MySQL and just need to normalize it, you might try:
LOCK TABLES url WRITE;
ALTER TABLE url DISABLE KEYS;
INSERT IGNORE INTO url (url)
SELECT url FROM oldDatabase.unNormalised;
ALTER TABLE url ENABLE KEYS;
UNLOCK TABLES;
My guess is that INSERT IGNORE ... SELECT will be faster than INSERT IGNORE ... SELECT DISTINCT, but that's just a guess. Note that INSERT IGNORE only skips duplicate URLs if the url column has a UNIQUE index (or is the primary key); without one, the plain INSERT IGNORE ... SELECT will insert the duplicates too.

Capturing MySQL Statement and inserting it into another db via trigger

I want to be able to keep a log of every SQL statement that edits a specific table, and I think the best way would be with a trigger. However, I don't know how to get the statement that fires the trigger.
For example if I run:
$sql = "INSERT INTO foo (col_1) VALUES ('bar');"
I want to be able to craft an insert statement like:
$sql = "INSERT INTO log (statements) VALUES (INSERT INTO foo (col_1) VALUES ('bar')'
Is this possible with a trigger? I know I can do this easily with PHP, but I figured that a trigger would have less overhead.
You can't get the statements, only their results / alterations. Consider using a database abstraction or decorator interface to capture those queries.
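As a sketch of that decorator idea (shown in Java here, but the same pattern works in PHP): wrap whatever object executes your SQL and write the statement text into the log table before running it. The class name, the log table, and the "mentions foo" check are invented for illustration:

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

// Hypothetical decorator around a connection: logs any statement that mentions `foo`
// before executing it, so no trigger is needed for this kind of logging.
public class LoggingExecutor {
    private final Connection con;

    public LoggingExecutor(Connection con) {
        this.con = con;
    }

    public int executeUpdate(String sql) throws SQLException {
        if (sql.toLowerCase().contains("foo")) {
            // Store the raw statement text via a placeholder, so no manual escaping is needed.
            try (PreparedStatement log = con.prepareStatement(
                    "INSERT INTO log (statements) VALUES (?)")) {
                log.setString(1, sql);
                log.executeUpdate();
            }
        }
        try (PreparedStatement st = con.prepareStatement(sql)) {
            return st.executeUpdate();
        }
    }
}

Usage would then be new LoggingExecutor(con).executeUpdate("INSERT INTO foo (col_1) VALUES ('bar')"), which records the statement text in log and performs the insert in one call.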