I have a MySQL db with several tables, let's call them Table1, Table2, etc. I have to make several calls to each of these tables.
Which is most efficient:
a) Collecting all queries for each table in one message, then executing them separately, e.g.:
INSERT INTO TABLE1 VALUES (A,B);
INSERT INTO TABLE1 VALUES (A,B);
...execute
INSERT INTO TABLE2 VALUES (A,B);
INSERT INTO TABLE2 VALUES (A,B);
...execute
b) Collecting ALL queries in one long message (not in order of table), then executing this query, e.g.:
INSERT INTO TABLE1 VALUES (A,B);
INSERT INTO TABLE2 VALUES (B,C);
INSERT INTO TABLE1 VALUES (B,A);
INSERT INTO TABLE3 VALUES (D,B);
c) Something else?
Currently I am doing it like option (b), but I am wondering if there is a better way.
(I am using JDBC to access the db, in a Groovy script).
Thanks!
Third option - using prepared statements.
Since you haven't posted your code, this is a bit of a wild guess, but this blog post shows great performance improvements using the Groovy Sql.withBatch method.
The code they show (which uses SQLite) is reproduced here for posterity:
Sql sql = Sql.newInstance("jdbc:sqlite:/home/ron/Desktop/test.db", "org.sqlite.JDBC")
sql.execute("create table dummyTable(number)")
sql.withBatch { stmt ->
    100.times {
        stmt.addBatch("insert into dummyTable(number) values(${it})")
    }
    stmt.executeBatch()
}
which inserts the numbers 0 to 99 into a table dummyTable
This will obviously need tweaking to work with your unknown code
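Since the snippet above batches literal SQL strings rather than prepared statements, here is roughly what the parameterized form of withBatch would look like adapted to your setup; treat it as a sketch only, since the connection details, column names and the rowsForTable1 list are all placeholders:
import groovy.sql.Sql

// Placeholder connection details -- swap in your own MySQL URL, credentials and driver
def sql = Sql.newInstance("jdbc:mysql://localhost:3306/mydb", "user", "pass", "com.mysql.jdbc.Driver")

// One prepared statement per table, executed in batches of 50 parameter sets
sql.withBatch(50, "INSERT INTO TABLE1 (colA, colB) VALUES (?, ?)") { ps ->
    rowsForTable1.each { row ->
        ps.addBatch([row.a, row.b])   // adds one row's values to the current batch
    }
}
withBatch flushes and executes the accumulated batch for you when the closure finishes, so each table's rows travel to the server in chunks instead of one round trip per row.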
Rather than looking at which is more efficient, first consider whether the tables are large and whether you need concurrency.
If they are (millions of records), then you may want to split the work up statement by statement and leave some time between each statement, so you don't lock the table for too long at a time.
If your table isn't that large, or concurrency is not a problem, then by all means do whichever. You should look at the slow query log for the statements and see which approach is faster.
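One way to do that splitting, sketched in Groovy against the same kind of Sql connection as above (the chunk size, pause length, table and column names, and the allRows list are all made-up values):
// Insert in chunks of 500 rows and pause briefly between chunks,
// so no single statement holds locks on the table for very long.
allRows.collate(500).each { chunk ->
    def placeholders = chunk.collect { "(?, ?)" }.join(", ")
    def params = chunk.collectMany { [it.a, it.b] }
    sql.execute("INSERT INTO TABLE1 (colA, colB) VALUES " + placeholders, params)
    Thread.sleep(200)   // breathing room for other sessions between chunks
}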
I have a case where I'm doing two queries: query1 is a bulk INSERT ... ON DUPLICATE KEY UPDATE on table1. For query2, I want to do another bulk INSERT on table2 with some application data along with using the ids inserted/updated from query1. I know I can do this with an intermediate query, selecting the ids I need from table1 and then inserting them into table2 along with application data, but I really want to avoid the extra network back-and-forth of that query along with the db overhead. Is there any way I can either get the ids inserted/updated from query1 when running that, or do some kind of complex, but relatively less expensive INSERT ... SELECT FROM in query2 to avoid this?
As far as I know, getting ids added/modified returned from query1 is impossible without a separate query, and I can't think of a way to batch INSERT ... SELECT FROM where the insertion values for each row are dependent on the selected value, but I'd love to be proven wrong, or shown a way around either of those.
There is no way to get a set of IDs as a result of a bulk INSERT.
One option you have is indeed to run a SELECT query to get the IDs and use them in the second bulk INSERT. But that's a hassle.
Another option is to run the 2nd bulk INSERT into a temporary table, let's call it table3, then use INSERT INTO table2 ... SELECT FROM ... table1 JOIN table3 ...
With a similar use case we eventually found that this is the fastest option, given that you index table3 correctly.
Note that in this case you don't have a SELECT that you need to loop over in your code, which is nice.
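To make that concrete, here is a rough sketch of the statement sequence; every table and column name below is invented, since the question only refers to table1/table2 generically, and appRows stands for the application data:
// 1. Stage the application data in a temporary table (invented names throughout).
sql.execute("""
    CREATE TEMPORARY TABLE table3 (
        app_key VARCHAR(64) NOT NULL,
        app_data VARCHAR(255),
        KEY idx_app_key (app_key)   -- index the join column, as suggested above
    )
""")

// 2. Bulk insert the application rows into the temp table, batched.
sql.withBatch(100, "INSERT INTO table3 (app_key, app_data) VALUES (?, ?)") { ps ->
    appRows.each { ps.addBatch([it.key, it.data]) }
}

// 3. One INSERT ... SELECT joins table1 (to pick up the ids from query1) with table3.
sql.execute("""
    INSERT INTO table2 (table1_id, app_data)
    SELECT t1.id, t3.app_data
    FROM table1 t1
    JOIN table3 t3 ON t3.app_key = t1.app_key
""")
Because the last step is a single INSERT ... SELECT, there is no result set to loop over in application code, which is the point made above.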
I have some words like ["happy","bad","terrible","awesome","happy","happy","horrible",.....,"love"].
There are a lot of these words, maybe 100 to 200 or more.
I want to save them all to the DB at the same time.
I think making a DB call for every single word is wasteful.
What is the best way to save them?
Table structure:
wordId | userId | word
You are right that executing repeated INSERT statements to insert rows one at a time, i.e. processing RBAR (row by agonizing row), can be expensive, and excruciatingly slow, in MySQL.
Assuming that you are inserting the string values ("words") into a column in a table, and each word will be inserted as a new row in the table... (and that's a whole lot of assumptions there...)
For example, a table like this:
CREATE TABLE mytable (mycol VARCHAR(50) NOT NULL PRIMARY KEY) ENGINE=InnoDB
You are right that running a separate INSERT statement for each row is expensive. MySQL provides an extension to the INSERT statement syntax which allows multiple rows to be inserted.
For example, this sequence:
INSERT IGNORE INTO mytable (mycol) VALUES ('happy');
INSERT IGNORE INTO mytable (mycol) VALUES ('bad');
INSERT IGNORE INTO mytable (mycol) VALUES ('terrible');
can be emulated with a single INSERT statement:
INSERT IGNORE INTO mytable (mycol) VALUES ('happy'),('bad'),('terrible');
Each "row" to be inserted is enclosed in parens, just as it is in the regular INSERT statement. The trick is the comma separator between the rows.
The trouble with this comes in when there are constraint violations; either the whole statement succeeds or fails. Unlike the individual inserts, where one of them can fail and the other two succeed.
Also, be careful that the size (in bytes) of the statement does not exceed the max_allowed_packet variable setting.
Alternatively, a LOAD DATA statement is an even faster way to load rows into a table. But for a couple of hundred rows, it's not really going to be much faster. (If you were loading thousands and thousands of rows, the LOAD DATA statement could potentially be much faster.)
It would be helpful to know how you are generating that list of words, but you could do
insert into table (column) values (word), (word2);
Without more info, that is about as much as we can help.
You could add a loop in whatever language is needed to iterate over the list to add them.
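For example, sticking with the Groovy/JDBC style used earlier on this page purely as an illustration (the words table name, the userId variable and the assumption that wordId is auto-generated are all guesses based on the wordId / userId / word structure above):
def words = ["happy", "bad", "terrible", "awesome", "happy", "happy", "horrible", "love"]

// One prepared statement, one batch, one round of work for the whole list.
// userId is assumed to already be in scope.
sql.withBatch(100, "INSERT INTO words (userId, word) VALUES (?, ?)") { ps ->
    words.each { w -> ps.addBatch([userId, w]) }
}
The multi-row VALUES form shown above works just as well for a couple of hundred words; batching through a prepared statement simply saves you from building the long SQL string yourself.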
I have executed an insert query as follows -
INSERT INTO tablename
SELECT ...   -- query1
UNION
SELECT ...   -- query2
Now if I execute the SELECT part of this insert query, it takes around 2-3 minutes. However, the entire insert script takes more than 8 minutes. As far as I know, the insert and the corresponding select should take almost the same time to execute.
So is there any other factor that could impact the execution time of the insert?
It's not correct that an insert and the corresponding select take the same time; they should not!
The select query just "reads" data and transmits it; if you are trying the query in an application (like phpMyAdmin), it very likely limits the result set for pagination, so the select looks faster (as it doesn't actually fetch all the data).
The insert query must read that data, insert it into the table, update the primary key tree, update every other index on that table, fire any triggers on that table/column, etc., so the insert performs a LOT more work than the select.
So it IS normal that the insert is slower than the select; how much slower depends on your tables and DB structure.
You could optimize the insert with some DB-specific options; for example you could read here for MySQL, or if you are on DB2 you could create a temp file and then CPYF it into the real one, and so on...
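For MySQL specifically, the usual bulk-load options are session settings like the ones below, sketched through the same Groovy Sql wrapper used elsewhere on this page just for concreteness; only skip the checks if you are sure the incoming data is already clean:
sql.execute("SET autocommit = 0")           // group the whole load into one transaction
sql.execute("SET unique_checks = 0")        // only safe if the incoming rows are already de-duplicated
sql.execute("SET foreign_key_checks = 0")   // only safe if referential integrity is already guaranteed

// ... run the big INSERT INTO tablename SELECT ... UNION ... here ...

sql.execute("SET foreign_key_checks = 1")
sql.execute("SET unique_checks = 1")
sql.execute("COMMIT")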
I have to write a lot of data into a MySQL database, about 5 times per second.
What is the fastest way: insert every 1/5 of a second, or queue the data and insert everything stored every ~5 seconds? If the second way is better, is it possible to insert several rows into one table using a single request?
Considering the frequency of the insertions, it's better to go with the second approach, i.e. queuing and then adding in one go.
But you should consider these scenarios first:
Is your system real-time? If yes, what is the maximum delay you can afford (since it will take ~5 seconds for the next insertion and for the data to be persisted/available)?
What are the chances of incorrect values/errors in the data? If one value is bad, you'll lose all the rest when the query fails.
Use multiple buffer pools with innodb_buffer_pool_instances; a sensible value can depend on the number of cores on the machine.
Use partitioning of the table.
You can insert data collectively using XML.
As each transaction comes with a fixed cost, I'd say that doing a multi-line insert every few seconds is better. With some of the systems we use at work we cache hundreds of lines before inserting them all in one go.
From the MySQL documentation you can do a multi-line insert like so:
INSERT INTO tbl_name (a,b,c) VALUES(1,2,3),(4,5,6),(7,8,9);
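A rough sketch of that caching idea with Groovy's Sql, just as one possible shape (the my_table name, the two columns and the 5-second flush cadence are all placeholders, and sql is assumed to be an already-open connection):
import java.util.concurrent.ConcurrentLinkedQueue

def queue = new ConcurrentLinkedQueue()        // rows arrive here ~5 times per second

// Producer side: remember the values, no DB round trip yet.
def enqueue = { a, b -> queue.add([a, b]) }

// Consumer side: call this roughly every 5 seconds from a timer of your choice,
// and push everything collected so far as one batched, parameterized insert.
def flush = {
    def rows = []
    for (def row = queue.poll(); row != null; row = queue.poll()) {
        rows << row
    }
    if (rows) {
        sql.withBatch(rows.size(), "INSERT INTO my_table (a, b) VALUES (?, ?)") { ps ->
            rows.each { ps.addBatch(it) }
        }
    }
}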
My experience is that when inserting data into a MySQL database it is faster to work with batches.
So the slower option is executing multiple insert queries:
INSERT INTO my_table VALUES (1, "a");
INSERT INTO my_table VALUES (2, "b");
The faster option would be:
INSERT INTO my_table VALUES (1, "a"), (2, "b");
You can make an insert with all the data with something like this:
INSERT INTO table (field1, field2, ..., fieldN)
VALUES
    (value1_1, value1_2, ..., value1_N),
    (value2_1, value2_2, ..., value2_N),
    ...
    (valueM_1, valueM_2, ..., valueM_N);
I have a single procedure that has two insert statements in it for two different tables. I must insert data into table1 before I can insert into table2. I'm using PHP to do the data collection. What I'd like to know is how to insert multiple rows into table2, which can have many rows associated with table1. How would I do this?
I want to only store the person in table1 just one time but table2 requires multiple rows. If these insert statements were in separate procedures, I wouldn't have a problem but I just don't know how I would insert more than one row into table2 without table1 rejecting a second duplicate record.
BEGIN
    INSERT INTO user (name, address, city) VALUES (Name, Address, City);
    INSERT INTO `order` (order_id, `desc`) VALUES (OrderNo, Description);
END
I'd suggest you do it separately, otherwise you'd need a complicated solution which is prone to error if something changes.
The complicated solution is:
join all orderno and descriptions with a separator. (orderno#description)
join all orders with a different separator. (orderno#description/orderno#description/...)
pass it to the procedure
in the procedure, split the string by order separator, then loop through each of them
for each order, split the string by the first separator, then insert into the appropriate columns
As you can see, this is bad.
I am sorry, but what's stopping you from inserting data into these (seemingly unrelated) tables in separate queries? If you don't like the idea of it failing halfway through, you can wrap it in a transaction. I know mysqli and PDO can do that just fine.
Answering your question directly: INSERT's IGNORE mode turns errors during insertion into warnings, so upon attempting to insert a duplicate row the warning is issued and the row is not inserted, but there is no error.
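The question is in PHP, but the shape of "separate inserts wrapped in a transaction" is the same in any client; here is a sketch using Groovy's Sql purely for illustration (the user_id column on order, the orders list and the scalar variables are assumptions on top of the column names in the question's pseudo-procedure), taking the auto-generated key from the first insert and reusing it for every child row:
sql.withTransaction {
    // Insert the person once; executeInsert returns any auto-generated keys.
    def keys = sql.executeInsert(
        "INSERT INTO user (name, address, city) VALUES (?, ?, ?)",
        [name, address, city]
    )
    def userId = keys[0][0]   // assumes user has an AUTO_INCREMENT primary key

    // Then insert as many order rows as needed, all pointing at that one user.
    sql.withBatch(50, "INSERT INTO `order` (user_id, order_id, `desc`) VALUES (?, ?, ?)") { ps ->
        orders.each { o -> ps.addBatch([userId, o.orderNo, o.description]) }
    }
}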
You could use the IGNORE keyword on the first statement.
http://dev.mysql.com/doc/refman/5.1/en/insert.html:
If you use the IGNORE keyword, errors that occur while executing the INSERT statement are treated as warnings instead. For example, without IGNORE, a row that duplicates an existing UNIQUE index or PRIMARY KEY value in the table causes a duplicate-key error and the statement is aborted. With IGNORE, the row still is not inserted, but no error is issued.
But somehow this seems rather inefficient to me, a "stabbed from behind through the chest in the eye" solution.