PDO Bind Value vs Insert - mysql

Besides the advantage of escaping values when using bound values in PDO, is there any difference in performance between binding values to one prepared statement (prepare once, execute multiple times with different values) and a single multi-row insert statement like
INSERT INTO table_name VALUES (value1, value2, value3),(value1, value2, value3),(value1, value2, value3)

I did some tests myself on 100,000 records. To keep the scenario simple I used REPLACE INTO rather than INSERT INTO, to avoid having to come up with new keys every time.
REPLACE INTO
A raw multi-row replace of 3 columns, e.g. REPLACE INTO table_name VALUES (value1, value2, value3),(value1, value2, value3),(value1, value2, value3)......, on 100,000 rows took approx 14 seconds.
NORMAL BIND
Preparing the statement once, then binding the values and executing it for every row, took around 33 seconds:
foreach ($vars as $var) {
    $stmt->bindValue(':a', $var["value1"]);
    $stmt->bindValue(':b', $var["value2"]);
    $stmt->bindValue(':c', $var["value3"]);
    $stmt->execute();
}
BIND BUT 1 EXECUTE
Building one long statement, preparing it, binding all the parameters and executing once took around 22 seconds:
REPLACE INTO clientSettings(clientId, settingName, settingValue) VALUES
(:a1,:b1,:c1),
(:a2,:b2,:c2),
(:a3,:b3,:c3),
(:a4,:b4,:c4),
.......
Note that these are rough numbers, and that they come from using REPLACE INTO (where existing rows were deleted and re-inserted) on 100,000 records.

It's faster (for MySQL) if you use prepared statements. That way, the actual SQL is parsed once and data is sent multiple times - so the actual SQL layer for transforming your INSERT INTO ... isn't invoked every time you want to execute that particular insert, it's parsed only once and then you just send different parameters (different data) that you want to insert (or execute any other operation).
So not only does it reduce overhead, it increases security as well (if you use PDO::bindValue/bindParam), thanks to proper escaping based on the driver/charset being used.
In short - yes, your insert will be faster and safer. But by what margin - it's hard to tell.
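The two approaches compared above can be sketched outside PHP as well. Below is an illustration in Python with the standard sqlite3 module; the table and data are invented for the example, but the prepare-once/execute-many idea carries over even though the API differs from PDO:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE settings (clientId INTEGER, name TEXT, value TEXT)")

rows = [(1, "theme", "dark"), (2, "theme", "light"), (3, "lang", "en")]

# Option 1: one single-row statement, executed once per row. The statement
# is parsed once; only the parameters travel each time (analogous to
# PDO bindValue + execute in a loop).
conn.executemany("INSERT INTO settings VALUES (?, ?, ?)", rows)

# Option 2: one long multi-row statement, built with placeholders so the
# values are still bound rather than concatenated into the SQL text.
placeholders = ", ".join(["(?, ?, ?)"] * len(rows))
flat = [v for row in rows for v in row]
conn.execute(f"INSERT INTO settings VALUES {placeholders}", flat)

print(conn.execute("SELECT COUNT(*) FROM settings").fetchone()[0])  # 6
```

Both options avoid re-parsing per value and keep the data out of the SQL text; which is faster depends on the engine and driver, as the timings above show.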

Related

How to preserve SQL data type when using mysqli non-prepared statements

I would like to use non-prepared statements using mysqli in PHP and preserve the SQL data type.
The comparison table on this page on php.net suggests that this is possible:
Result set values (SQL data types):
- Prepared statement: preserved when fetching
- Non-prepared statement: converted to string, or preserved when fetching
What is meant by fetching?
At present, in code like the following, column values are all strings. Is there a way to preserve the data type without using prepared statements?
$conn = new mysqli('db_host', 'db_username', 'db_password', 'db_name');
$conn->set_charset('db_charset');
$result = $conn->query($sql, MYSQLI_USE_RESULT);
$row = $result->fetch_assoc();
NB. unbuffered query
I have since found the mysqli option MYSQLI_OPT_INT_AND_FLOAT_NATIVE, which converts numeric columns to int or float (boolean columns become int).
I'm not sure if this is what was referred to in the php.net documentation or if there is another method.
Example:
$conn->options(MYSQLI_OPT_INT_AND_FLOAT_NATIVE, 1);

Searching for multiple values in 1 query

If I have a database table with 2 fields, Roll no and name, and a list of n roll numbers, I have to search for the corresponding names.
Can this be done using just one query in SQL or HQL?
SELECT name FROM [table] WHERE id IN ([list of ids])
where [list of ids] is for example 2,3,5,7.
Use the IN operator and separate your Roll no's by a comma.
SELECT name
FROM yourtable
WHERE [Roll no] IN (1, 2, 3, 4, etc)
You can use the IN statement as shown above.
There are a couple of minor issues with this. It can perform poorly if the number of values in the IN clause gets too large.
The second issue is that in many development environments you end up needing to build the query dynamically, with a variable number of items (or a variable number of placeholders if using parameterised queries). While not difficult, this does make your code look messy, and it means you haven't got a nice neat piece of SQL that you can copy out and use for testing.
But here are some examples (using PHP).
Here the IN list is just built dynamically into the SQL. Assuming the roll numbers can only be integers, intval() is applied to each member of the array to prevent non-integer values being used in the SQL.
<?php
$list_of_roll_no = array(1,2,3,4,5,6,7,8,9);
$sql = "SELECT name FROM some_table WHERE `Roll no` IN (".implode(", ", array_map('intval', $list_of_roll_no)).")";
?>
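The same dynamically built IN clause, but with bound placeholders rather than intval(), can be sketched in Python using the standard sqlite3 module (the students table and its data are invented for the illustration):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students (roll_no INTEGER, name TEXT)")
conn.executemany("INSERT INTO students VALUES (?, ?)",
                 [(1, "Ann"), (2, "Bob"), (3, "Cid"), (7, "Dee")])

roll_nos = [2, 3, 7]
# One '?' per value: the values travel as bound parameters, not SQL text.
placeholders = ", ".join(["?"] * len(roll_nos))
sql = f"SELECT name FROM students WHERE roll_no IN ({placeholders})"
names = [row[0] for row in conn.execute(sql, roll_nos)]
print(names)  # ['Bob', 'Cid', 'Dee']
```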
Using mysqli bound parameters is a bit messier. This is because bind_param expects a variable number of parameters: the second parameter onwards are the values to be bound, and they must be passed by reference. So the foreach here is used to generate an array of references:
<?php
$list_of_roll_no = array(1,2,3,4,5,6,7,8,9);
if ($stmt = $mysqli->prepare("SELECT name FROM some_table WHERE `Roll no` IN (".implode(",", array_fill(0, count($list_of_roll_no), '?')).")"))
{
    $bind_arguments = [];
    $bind_arguments[] = str_repeat("i", count($list_of_roll_no));
    foreach ($list_of_roll_no as $list_of_roll_no_key => $list_of_roll_no_value)
    {
        $bind_arguments[] = & $list_of_roll_no[$list_of_roll_no_key]; # bind to the array element, not to the temporary foreach value
    }
    call_user_func_array(array($stmt, 'bind_param'), $bind_arguments);
    $stmt->execute();
}
?>
Another solution is to push all the values into another table (it can be a temp table), then use an INNER JOIN between your table and the temp table to find the matching values. Depending on what you already have in place, this is quite easy to do (e.g. I have a PHP class to insert multiple records easily: I just keep passing them across, and the class batches them up and inserts them occasionally to avoid repeatedly hitting the database).
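The temp-table approach can be sketched the same way. Again this is Python with the standard sqlite3 module and invented names; the INNER JOIN shape is the point:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students (roll_no INTEGER, name TEXT)")
conn.executemany("INSERT INTO students VALUES (?, ?)",
                 [(1, "Ann"), (2, "Bob"), (3, "Cid")])

# Push the search values into a temp table, then join instead of using IN.
conn.execute("CREATE TEMP TABLE wanted (roll_no INTEGER)")
conn.executemany("INSERT INTO wanted VALUES (?)", [(2,), (3,)])

rows = conn.execute(
    "SELECT s.name FROM students AS s"
    " INNER JOIN wanted AS w ON w.roll_no = s.roll_no"
    " ORDER BY s.roll_no"
).fetchall()
print([r[0] for r in rows])  # ['Bob', 'Cid']
```

This sidesteps both drawbacks of IN: there is no placeholder-list building, and the optimizer can use an index on the join column even for large value sets.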

Performance of batched statements using INSERT ... SET vs INSERT ... VALUES

I recently wrote a simple Java program that processed some data and inserted it in a MyISAM table. About 35000 rows had to be inserted. I wrote the INSERT statement using INSERT ... SET syntax and executed it for all rows with PreparedStatement.executeBatch(). So:
String sql = "INSERT INTO my_table"
    + " SET "
    + " my_column_1 = ?, "
    + " my_column_2 = ?, "
    ...
    + " my_column_n = ? ";
try (PreparedStatement pst = con.prepareStatement(sql)) {
    for (Object o : someCollection) {
        pst.setInt(1, ...);
        pst.setInt(2, ...);
        ...
        pst.setInt(n, ...);
        pst.addBatch();
    }
    pst.executeBatch();
}
I tried inserting all rows in a single batch and in batches of 1000, but in all cases the execution was VERY slow (about 1 minute per 1000 rows). After some tinkering I found that changing the syntax to INSERT ... VALUES improved the speed dramatically, by at least 100x (I didn't measure it accurately).
String sql = "INSERT INTO my_table (my_column_1, my_column_2, ... , my_column_n)"
+ " VALUES (?, ?, ... , ?)";
What's going on here? Can it be that the JDBC driver cannot rewrite the batches when using INSERT ... SET? I didn't find any documentation about this. I am creating my connections with options rewriteBatchedStatements=true&useServerPrepStmts=false.
I first noticed this problem when accessing a database in another host. That is, I have used the INSERT ... SET approach before without any noticeable performance issue in applications that were executing in the same host as the database. So I guess the problem may be that many more statements are sent over the network with INSERT ... SET than with INSERT ... VALUES.
If you examine the INSERT ... SET syntax, you'll see it's only meant for inserting a single row. INSERT ... VALUES is meant for inserting multiple rows at one time.
In other words - even though you set rewriteBatchedStatements=true, the JDBC driver can't optimize the SET variation like it can with the VALUES variation because SET is not built for the batch case you have. Use VALUES to compress N inserts into one.
Bonus tip: if you use ON DUPLICATE KEY UPDATE, the JDBC driver currently can't rewrite those statements either. (edit: This statement is false - my mistake.)
There's an option you can set to verify all of this for yourself (I think it's 'profileSQL').
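The rewrite the driver performs when rewriteBatchedStatements=true can be sketched as plain string manipulation. This is only an illustration of the idea, not the actual Connector/J code, and it shows why INSERT ... SET has no repeatable tuple to collapse:

```python
def rewrite_batch(sql, batch_size):
    """Collapse a batch of single-row INSERT ... VALUES statements into one
    multi-row statement, roughly what the driver does for the VALUES form."""
    head, sep, row_tuple = sql.partition(" VALUES ")
    if not sep:
        # INSERT ... SET has no "(...)" tuple after VALUES to repeat,
        # so there is nothing to rewrite.
        return sql
    return head + " VALUES " + ", ".join([row_tuple] * batch_size)

print(rewrite_batch("INSERT INTO my_table (c1, c2) VALUES (?, ?)", 3))
# INSERT INTO my_table (c1, c2) VALUES (?, ?), (?, ?), (?, ?)
print(rewrite_batch("INSERT INTO my_table SET c1 = ?, c2 = ?", 3))
# INSERT INTO my_table SET c1 = ?, c2 = ?
```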

Perl DBI insert multiple rows using mysql native multiple insert ability

Has anyone seen a DBI-type module for Perl which easily capitalizes on MySQL's multi-insert syntax
insert into TBL (col1, col2, col3) values (1,2,3),(4,5,6),...?
I've not yet found an interface which allows me to do that. The only thing I HAVE found is looping through my array. This method seems a lot less optimal than throwing everything into a single statement and letting MySQL handle it. I've not found any documentation (i.e. via Google) which sheds light on this, short of rolling my own code to do it.
TIA
There are two approaches. You can insert (?, ?, ?) a number of times based on the size of the array. The text manipulation would be something like:
my $sql_values = join( ', ', ('(?, ?, ?)') x scalar(@array) );
Then flatten the array for calling execute(). I would avoid this way because of the thorny string and array manipulation that needs to be done.
The other way is to begin a transaction, then run a single insert statement multiple times.
my $sql = 'INSERT INTO tbl (col1, col2, col3) VALUES (?, ?, ?)';
$dbh->{AutoCommit} = 0;
my $sth = $dbh->prepare_cached( $sql );
$sth->execute( @$_ ) for @array;
$sth->finish;
$dbh->{AutoCommit} = 1;
This is a bit slower than the first method, but it still avoids reparsing the statement. It also avoids the subtle manipulations of the first solution, while still being atomic and allowing disk I/O to be optimized.
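The transaction-plus-single-insert pattern translates directly to other stacks. Here is a sketch in Python with the standard sqlite3 module; the column names are invented, and sqlite3's context-manager transaction stands in for DBI's AutoCommit toggling:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (col1 INTEGER, col2 INTEGER, col3 INTEGER)")

rows = [(1, 2, 3), (4, 5, 6), (7, 8, 9)]

# One transaction around all the inserts: the statement is cached and
# reused by the sqlite3 module, and the commit happens once at the end.
with conn:  # opens a transaction, commits on clean exit
    for row in rows:
        conn.execute("INSERT INTO tbl (col1, col2, col3) VALUES (?, ?, ?)", row)

print(conn.execute("SELECT COUNT(*) FROM tbl").fetchone()[0])  # 3
```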
If DBD::mysql supported DBI's execute_for_fetch (see DBI's execute_array and execute_for_fetch), this would be the typical usage scenario, i.e. you have multiple rows of inserts/updates/deletes available now and want to send them in one go (or in batches). I've no idea if the mysql client libs support sending multiple rows of bound parameters in one go, but most other database client libs do and can take advantage of DBI's execute_array/execute_for_fetch. Unfortunately, few DBDs actually implement execute_array/execute_for_fetch themselves; the rest rely on DBI's default implementation, which executes one row at a time.
Jim,
Frezik has it. That is probably the most optimal:
my $sth = $dbh->prepare( 'INSERT INTO tbl (col1, col2, col3) VALUES (?, ?, ?)' );
foreach (@array) { $sth->execute( @{$_} ); }
$sth->finish;

Why does my INSERT sometimes fail with "no such field"?

I've been using the following snippet in developments for years. Now, all of a sudden, I get a DB Error: no such field warning:
$process = "process";
$create = $connection->query
(
"INSERT INTO summery (process) VALUES($process)"
);
if (DB::isError($create)) die($create->getMessage($create));
but it's fine if I use numerics
$process = "12345";
$create = $connection->query
(
"INSERT INTO summery (process) VALUES($process)"
);
if (DB::isError($create)) die($create->getMessage($create));
or write the value directly into the expression
$create = $connection->query
(
"INSERT INTO summery (process) VALUES('process')"
);
if (DB::isError($create)) die($create->getMessage($create));
I'm really confused ... any suggestions?
It's always better to use prepared queries and parameter placeholders. Like this in Perl DBI:
my $process=1234;
my $ins_process = $dbh->prepare("INSERT INTO summary (process) values(?)");
$ins_process->execute($process);
For best performance, prepare all your often-used queries right after opening the database connection. Many database engines will store them on the server during the session, much like small temporary stored procedures.
It's also very good for security. Writing the value into an insert string yourself means that you must write the correct escape code in each SQL statement. Using a prepare-and-execute style means that only one place (execute) needs to know about escaping, if escaping is even necessary.
Ditto what Zan Lynx said about placeholders. But you may still be wondering why your code failed.
It appears that you forgot a crucial detail from the previous code that worked for you for years: quotes.
This (tested) code works fine:
my $thing = 'abcde';
my $sth = $dbh->prepare("INSERT INTO table1 (id,field1)
VALUES (3,'$thing')");
$sth->execute;
But this next code (lacking the quotation marks in the VALUES field, just as your first example does) produces the error you report, because VALUES (3,$thing) resolves to VALUES (3,abcde), causing your SQL server to look for a field called abcde, and there is no field by that name.
my $thing = 'abcde';
my $sth = $dbh->prepare("INSERT INTO table1 (id,field1)
VALUES (3,$thing)");
$sth->execute;
All of this assumes that your first example is not a direct quote of code that failed as you describe and therefore not what you intended. It resolves to:
"INSERT INTO summery (process) VALUES(process)"
which, as mentioned above, causes your SQL server to read the item in the VALUES set as another field name. As given, this actually runs on MySQL without complaint, and it fills the field called 'process' with NULL, because that's what the field called 'process' contained when MySQL looked there for a value while creating the new record.
I do use this style for quick throw-away hacks involving known, secure data (e.g. a value supplied within the program itself). But for anything involving data that comes from outside the program, or that might possibly contain characters other than [0-9a-zA-Z], using placeholders will save you grief.
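The failure above is easy to reproduce. Here is a sketch with Python's sqlite3 module; SQLite rejects the bare identifier outright ("no such column") rather than silently reading the column as the MySQL behaviour described above, but the cause is the same missing quotes, and the placeholder version fixes it:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE summery (process TEXT)")

process = "process"
# Interpolated without quotes, VALUES(process) names a column, not a string.
try:
    conn.execute(f"INSERT INTO summery (process) VALUES({process})")
    error = None
except sqlite3.OperationalError as exc:
    error = str(exc)
print(error)  # e.g. "no such column: process"

# A placeholder binds the value, so the string arrives intact.
conn.execute("INSERT INTO summery (process) VALUES (?)", (process,))
print(conn.execute("SELECT process FROM summery").fetchone()[0])  # process
```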