Using PDO to insert variables into SELECT clause? - mysql

I am attempting to get the distance from a user to each venue stored in a MySQL database, using the spherical law of cosines. The user inputs their location, and the following query is executed.
$data = array(':lat' => $lat, ':lon' => $lon);
$qry = "SELECT ACOS(SIN(v.Latitude) * SIN(:lat) + COS(v.Latitude) * COS(:lat) * COS(:lon - v.Longitude)) * 3963 AS distance FROM Venue v";
$stmt = $pdo->prepare($qry);
$stmt->execute($data);
$rows = $stmt->fetchAll();
The problem is, I get the following error.
PHP Fatal error: Uncaught exception 'PDOException' with message 'SQLSTATE[HY093]: Invalid parameter number'
When I remove the placeholders (:lat and :lon) from the SELECT clause, it works just fine. Other placeholders further on in the statement (not shown here) also work just fine; it is only the placeholders in the SELECT clause that cause an issue. Is this inability to use PDO placeholders within a SELECT clause a limitation of PDO, or is there a way around it?
I am using PHP 5.4.15, and my PDO options are as follows.
$options = array(
    PDO::MYSQL_ATTR_INIT_COMMAND => 'SET NAMES utf8',       // UTF-8 to prevent issues sending special characters with JSON
    PDO::ATTR_ERRMODE            => PDO::ERRMODE_EXCEPTION, // fire exceptions for errors (turn this off for release)
    PDO::ATTR_DEFAULT_FETCH_MODE => PDO::FETCH_ASSOC,       // only return results indexed by column name
    PDO::ATTR_EMULATE_PREPARES   => false                   // actually prepare statements, not pseudo-prepare ( http://stackoverflow.com/questions/10113562/pdo-mysql-use-pdoattr-emulate-prepares-or-not )
);
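For context, a minimal sketch of how these options might be passed when the connection is created (the DSN, user name and password below are placeholders, not values from the question):
$dsn = 'mysql:host=localhost;dbname=venues;charset=utf8'; // hypothetical DSN
$pdo = new PDO($dsn, 'db_user', 'db_password', $options); // $options from above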

With PDO::ATTR_EMULATE_PREPARES set to false, the statement is prepared by MySQL itself, and the same named placeholder (here :lat) cannot be used more than once in a single statement; that repetition is what raises the HY093 "Invalid parameter number" error. Either turn emulation back on, give each occurrence its own name, or switch to positional placeholders and pass the value once per occurrence:
$data = array($lat, $lat, $lon);
$qry = "SELECT ACOS(SIN(v.Latitude) * SIN(?) + COS(v.Latitude) * COS(?) * COS(? - v.Longitude)) * 3963 AS distance FROM Venue v";
$stmt = $pdo->prepare($qry);
$stmt->execute($data);
$rows = $stmt->fetchAll();
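If named placeholders are preferred, a sketch that simply gives each occurrence its own name (untested against the original schema) also works with native prepares:
$data = array(':lat1' => $lat, ':lat2' => $lat, ':lon' => $lon);
$qry = "SELECT ACOS(SIN(v.Latitude) * SIN(:lat1) + COS(v.Latitude) * COS(:lat2) * COS(:lon - v.Longitude)) * 3963 AS distance FROM Venue v";
$stmt = $pdo->prepare($qry);
$stmt->execute($data);
$rows = $stmt->fetchAll();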

Related

Symfony3 : How to do a massive import from a CSV file as fast as possible?

I have a .csv file with more than 690 000 rows.
I found a solution to import data that works very well but it's a little bit slow... (around 100 records every 3 seconds = 63 hours !!).
How can I improve my code to make it faster ?
I do the import via a console command.
Also, I would like to import only prescribers that aren't already in the database (to save time). To complicate things, no field is really unique (except for id).
Two prescribers can share a last name, a first name, a city, an RPPS number or a professional code, but it's the combination of these 6 fields that makes them unique!
That's why I check every field before creating a new one.
<?php

namespace AppBundle\Command;

use Symfony\Bundle\FrameworkBundle\Command\ContainerAwareCommand;
use Symfony\Component\Console\Input\InputInterface;
use Symfony\Component\Console\Output\OutputInterface;
use Symfony\Component\Console\Helper\ProgressBar;
use AppBundle\Entity\Prescriber;

class PrescribersImportCommand extends ContainerAwareCommand
{
    protected function configure()
    {
        $this
            // the name of the command (the part after "bin/console")
            ->setName('import:prescribers')
            ->setDescription('Import prescribers from .csv file')
        ;
    }

    protected function execute(InputInterface $input, OutputInterface $output)
    {
        // Show when the script is launched
        $now = new \DateTime();
        $output->writeln('<comment>Start : ' . $now->format('d-m-Y G:i:s') . ' ---</comment>');

        // Import CSV into the DB via Doctrine ORM
        $this->import($input, $output);

        // Show when the script is over
        $now = new \DateTime();
        $output->writeln('<comment>End : ' . $now->format('d-m-Y G:i:s') . ' ---</comment>');
    }

    protected function import(InputInterface $input, OutputInterface $output)
    {
        $em = $this->getContainer()->get('doctrine')->getManager();

        // Turn off Doctrine's default query logging to save memory
        $em->getConnection()->getConfiguration()->setSQLLogger(null);

        // Get a PHP array of data from the CSV
        $data = $this->getData();

        // Start the progress bar
        $size = count($data);
        $progress = new ProgressBar($output, $size);
        $progress->start();

        // Process each row of data
        $batchSize = 100; # frequency for persisting the data
        $i = 1;           # current index of records
        foreach ($data as $row) {
            $p = $em->getRepository('AppBundle:Prescriber')->findOneBy(array(
                'rpps'       => $row['rpps'],
                'lastname'   => $row['nom'],
                'firstname'  => $row['prenom'],
                'profCode'   => $row['code_prof'],
                'postalCode' => $row['code_postal'],
                'city'       => $row['ville'],
            ));

            # If the prescriber does not exist, we create one
            if (!is_object($p)) {
                $p = new Prescriber();
                $p->setRpps($row['rpps']);
                $p->setLastname($row['nom']);
                $p->setFirstname($row['prenom']);
                $p->setProfCode($row['code_prof']);
                $p->setPostalCode($row['code_postal']);
                $p->setCity($row['ville']);
                $em->persist($p);
            }

            # Flush every 100 persisted prescribers
            if (($i % $batchSize) === 0) {
                $em->flush();
                $em->clear(); // Detaches all objects from Doctrine!

                // Advance the progress display on the console
                $progress->advance($batchSize);
                $progress->display();
            }
            $i++;
        }

        // Flush and clear the remaining data in the queue
        $em->flush();
        $em->clear();

        // End the progress bar
        $progress->finish();
    }

    protected function getData()
    {
        // Get the CSV from the filesystem
        $fileName = 'web/docs/prescripteurs.csv';

        // Use a service to convert the CSV into a PHP array
        $converter = $this->getContainer()->get('app.csvtoarray_converter');
        $data = $converter->convert($fileName);

        return $data;
    }
}
EDIT
According to @Jake N's answer, here is the final code.
It's much, much faster: 10 minutes to import 653 727 of the 693 230 rows (39 503 were duplicates!).
1) Added two columns to my table: created_at and updated_at.
2) Added a single UNIQUE index covering every column of my table (except id and the date columns) with phpMyAdmin, to prevent duplicate items (a sketch of the equivalent SQL follows this list).
3) Added ON DUPLICATE KEY UPDATE to my query, so only the updated_at column is updated on duplicates.
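Step 2 boils down to one statement; here is a sketch of the equivalent SQL, run through the Doctrine DBAL connection (the index name is made up, the column names are taken from the INSERT below, and creating the index once in phpMyAdmin works just as well):
$em->getConnection()->exec(
    'ALTER TABLE prescripteurs
     ADD UNIQUE INDEX uniq_prescripteur (rpps, nom, prenom, code_prof, code_postal, ville)'
);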
foreach ($data as $row) {
    $sql = "INSERT INTO prescripteurs (rpps, nom, prenom, code_prof, code_postal, ville)
            VALUES (:rpps, :nom, :prenom, :codeprof, :cp, :ville)
            ON DUPLICATE KEY UPDATE updated_at = NOW()";
    $stmt = $em->getConnection()->prepare($sql);
    $r = $stmt->execute(array(
        'rpps'     => $row['rpps'],
        'nom'      => $row['nom'],
        'prenom'   => $row['prenom'],
        'codeprof' => $row['code_prof'],
        'cp'       => $row['code_postal'],
        'ville'    => $row['ville'],
    ));

    if (!$r) {
        $progress->clear();
        $output->writeln('<comment>An error occurred.</comment>');
        $progress->display();
    } elseif (($i % $batchSize) === 0) {
        $progress->advance($batchSize);
        $progress->display();
    }
    $i++;
}

// End the progress bar
$progress->finish();
1. Don't use Doctrine
Try not to use Doctrine if you can; it eats memory and, as you have found, is slow. Use raw SQL for the import instead, with simple INSERT statements:
$sql = <<<SQL
INSERT INTO `category` (`label`, `code`, `is_hidden`) VALUES ('Hello', 'World', '1');
SQL;
$stmt = $this->getDoctrine()->getManager()->getConnection()->prepare($sql);
$stmt->execute();
Or you can prepare the statement with values:
$sql = <<<SQL
INSERT INTO `category` (`label`, `code`, `is_hidden`) VALUES (:label, :code, :hidden);
SQL;
$stmt = $this->getDoctrine()->getManager()->getConnection()->prepare($sql);
$stmt->execute(['label' => 'Hello', 'code' => 'World', 'hidden' => 1]);
Untested code, but it should get you started as this is how I have done it before.
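Building on that idea (equally untested, with the same illustrative table and column names): prepare the statement once outside the loop and re-execute it per row, so the SQL is not re-parsed for every one of the ~690 000 lines:
$conn = $this->getDoctrine()->getManager()->getConnection();
$stmt = $conn->prepare('INSERT INTO `category` (`label`, `code`, `is_hidden`) VALUES (:label, :code, :hidden)');

// $rows stands for the array parsed from the CSV
foreach ($rows as $row) {
    $stmt->execute([
        'label'  => $row['label'],
        'code'   => $row['code'],
        'hidden' => $row['is_hidden'],
    ]);
}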
2. Index
Also, for your existence checks, do you have an index on all of those fields, so that the lookup is as quick as possible?
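For what it's worth, a sketch of declaring such a composite index on the entity itself with Doctrine annotations (the index name is made up, and the table/column names are assumed from the question's CSV fields):
use Doctrine\ORM\Mapping as ORM;

/**
 * @ORM\Entity
 * @ORM\Table(name="prescripteurs", indexes={
 *     @ORM\Index(name="idx_prescriber_lookup",
 *         columns={"rpps", "nom", "prenom", "code_prof", "code_postal", "ville"})
 * })
 */
class Prescriber
{
    // ... existing fields, getters and setters
}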

UTF-8 Bad Encoding when Using ZF2 dbAdapter for mySQL for Update

I am getting the following exception when I attempt to update a record with the tableGateway object:
Zend\Db\Adapter\Exception\InvalidQueryException
Statement could not be executed
(HY000 - 1300 - Invalid utf8 character string: 'C\xC3\xB3digo')
I have the following table structure with data in mySQL:
CREATE TABLE `clientes` (
`Código` int,
`Nome` varchar(50),
`Descricao` varchar(150)
....
);
INSERT INTO `clientes` (`Código`, `Nome`, `Descricao`)
VALUES (1, 'Test Nome', 'Test Descricao');
The database encoding is 'latin1', but the adapter configuration is as shown:
'mycnn' => array(
    'driver'   => 'pdo',
    'dsn'      => 'mysql:dbname={$mydb};host={$myhost}',
    'username' => '{$myuser}',
    'password' => '{$mypassword}',
    'driver_options' => array(
        PDO::MYSQL_ATTR_INIT_COMMAND => 'SET NAMES \'UTF8\''
    ),
)
As you can see in the scenario above, I have set up the driver for UTF-8; the column name "Código" has a special character, and renaming this column is not an option.
The syntax that I am using for updating in the model is:
$set = array("Nome" => "Edited Test");
$where = array("Código" => 1);
$this->tableGateway->update($set, $where);
ZF then assembles the SQL as follows and throws the exception:
UPDATE "clientes" SET "Nome" = 'Edited Test' WHERE "C\xC3\xB3digo" = 1
I have also tried removing the UTF-8 option, since the collation is latin1_swedish_ci, without success.
I would appreciate anyone who gives me a hint how to face this issue.
Thanks in advance.
Make sure your database encoding type is UTF-8.
'driver_options' => array(
PDO::MYSQL_ATTR_INIT_COMMAND => 'SET NAMES \'UTF8\''
),
Make sure the fields use the utf8_general_ci collation.
In your layout .phtml, make sure the head contains:
<meta charset='utf-8'>
Updated
As you said, you are not able to change the encoding to UTF-8, so use one of the following commands in driver_options:
'SET NAMES \'latin1\'' or 'SET CHARACTER SET \'latin1\''
For more details, please check out the docs!
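Applied to the adapter configuration from the question, that would look roughly like this (a sketch; only the init command changes):
'driver_options' => array(
    PDO::MYSQL_ATTR_INIT_COMMAND => 'SET NAMES \'latin1\''
),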
When the problem is with a column name that contains latin1 characters
Just pass the condition as a string, not an array, as the second argument to TableGateway's update() method. But you must still set 'SET NAMES \'UTF8\'' in your driver_options.
$id = 1;
$where = "Código = {$id}";
// Or use this way
$where = "Código = 1";
$this->tableGateway->update($set, $where);

Tried to bind parameter number 0. SQL Server supports a maximum of 2100 parameters

I'm currently using a PDO class that works perfectly on MySQL, but when it comes to MSSQL I get an error when I try to insert data via the bindValue() function.
I'm using this method for data binding:
bindValue(":param",$value)
Step 1 - Create an array for the table fields in the query
$counter = 0;
foreach ($fields as $cols)
{
    $fieldBind[$counter] = ":" . $cols;
    $new_f = $new_f . "" . $cols;
    $counter++;
    if ($counter != count($fields))
    {
        $new_f = $new_f . ",";
    }
}
output : ( [0] => :field1 [1] => :field2 [2] => :field3 )
Step 2 - Create an array for the data of the fields in the query
$counter2 = 0;
foreach ($data as $cols)
{
    $dataBind[$counter2] = $cols;
    $new_d = $new_d . "'" . $cols . "'";
    $counter2++;
    if ($counter2 != count($data))
    {
        $new_d = $new_d . ",";
    }
}
output : ( [0] => value1 [1] => value2 [2] => value3 )
Step 3 - Prepare the query via the query function
parent::query("INSERT INTO $table($new_f) VALUES($new_d)");
Step 4 - Bind the Parameters and Values
for ($i = 0; $i < count($data); $i++) {
    parent::bind($fieldBind[$i], $dataBind[$i]);
}
The query looks like this:
INSERT INTO table(field1,field2,field3) values(':value1',':value2',':value3')
Step 5 - Execute the Query
try {
    parent::execute();
    return parent::rowCount();
}
catch (PDOException $e) {
    echo $e->getMessage();
}
This method works perfectly on MySQL, but when I try to execute this on SQL Server, I get this error:
SQLSTATE[IMSSP]: Tried to bind parameter number 0. SQL Server supports a maximum of 2100 parameters.
Try removing the apostrophes. Quoted placeholders such as ':value1' are sent as string literals, not parameters, so the statement ends up with nothing to bind to and the driver rejects the bind calls. Change the query
from:
INSERT INTO table(field1,field2,field3) values(':value1',':value2',':value3')
to the following:
INSERT INTO table(field1,field2,field3) values(:value1,:value2,:value3)
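For reference, a minimal sketch of the corrected flow with plain PDO (the table and column names are placeholders, not the poster's actual schema):
$sql  = "INSERT INTO table1 (field1, field2, field3) VALUES (:field1, :field2, :field3)";
$stmt = $pdo->prepare($sql);            // placeholders are NOT quoted
$stmt->bindValue(':field1', $value1);
$stmt->bindValue(':field2', $value2);
$stmt->bindValue(':field3', $value3);
$stmt->execute();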

Dynamic Multi Insert with DBI placeholders for many sets of VALUES

I'm building a dynamic SQL statement, that will insert one or more sets of VALUES via a prepared DBI statement, my question is this:
Since I have a dynamic number of VALUES sets, I will need to append as many ( ?, ?, ? ), ( ?, ?, ? ) groups as necessary to the statement INSERT INTO `tblname` ( $columnsString ) VALUES in order to submit a single query using placeholders and bind values. Is this the preferred method (most efficient, etc.; reasoning behind the efficiency would be helpful in your answer if possible), or should I just build the query string with sprintf and $dbh->quote()?
(As a little extra information: I'm actually using AnyEvent::DBI right now, which only exposes placeholders & bind values and not the quote() method so this wouldn't be easy for me to accomplish without creating another straight DBI $dbh and using another db server connection just to use the quote() method, or without altering the AnyEvent::DBI module myself.)
Normally I would just execute the statements as necessary but in this heavy workload case I'm trying to batch inserts together for some DB efficiency.
Also, if anyone could answer whether it is possible (and if so, how) to insert an SQL DEFAULT value using placeholders and bind values, that would be awesome. Typically, if I ever needed to do that, I'd append the DEFAULTs to the string directly and use sprintf and $dbh->quote() only for the non-DEFAULT values.
UPDATE:
Worked out the misunderstanding in a quick chat. User ikegami suggested that, instead of building the query string myself without placeholders, I just intermingle VALUES and placeholders, such as:
$queryString .= '(DEFAULT,?,?),(DEFAULT,DEFAULT,DEFAULT)';
Some of the reasoning behind my originally asking this question on SO was that I was somewhat against this intermingling, thinking it made the code less readable. After being assured that SQL 'DEFAULT' can't be a placeholder bind value, this is the method I began implementing.
Using placeholders where possible does seem to be the more accepted method of building queries, and if you want an SQL DEFAULT you just need to include it in the same query building as the placeholders. This does not apply to NULL values, as those CAN be inserted with placeholders and a bind value of undef.
Update 2:
The reason I asked about performance, about the 'acceptance' of building your own query with quote() versus building it with placeholders, and why I've gone with a solution that uses all columns in the SQL INSERT INTO tblname (cols), is that I have roughly 2-4 million rows a day going into a terrible DB server, and my code is running on an equally terrible server. With my requirement for DEFAULT SQL values and these performance constraints, I've chosen a solution for now.
For future devs who stumble upon this - take a look at @emazep's solution of using SQL::Abstract, or if for some reason you need to build your own, you might consider either using @Schwern's subroutine solution or possibly incorporating some of @ikegami's answer into it, as these are all great answers as to the 'current state of affairs' regarding the usage of DBI and building dynamic queries.
Unless there is a specific reason to reinvent the wheel (there could be some), SQL::Abstract (among others) has already solved the problem of dynamic SQL generation for all of us:
my %data = (
    name    => 'Jimbo Bobson',
    phone   => '123-456-7890',
    address => '42 Sister Lane',
    city    => 'St. Louis',
    state   => 'Louisiana',
);

use SQL::Abstract;
my ($stmt, @bind)
    = SQL::Abstract->new->insert('people', \%data);
print $stmt, "\n";
print join ', ', @bind;
which prints:
INSERT INTO people ( address, city, name, phone, state)
VALUES ( ?, ?, ?, ?, ? )
42 Sister Lane, St. Louis, Jimbo Bobson, 123-456-7890, Louisiana
SQL::Abstract then offers a nice trick to iterate over many rows to insert without regenerating the SQL every time, but for bulk inserts there is also SQL::Abstract::Plugin::InsertMulti:
use SQL::Abstract;
use SQL::Abstract::Plugin::InsertMulti;

my ($stmt, @bind)
    = SQL::Abstract->new->insert_multi( 'people', [
        { name => 'foo', age => 23 },
        { name => 'bar', age => 40 },
    ]);
# INSERT INTO people ( age, name ) VALUES ( ?, ? ), ( ?, ? )
# 23, foo, 40, bar
I have, on occasion, used a construct like:
#!/usr/bin/env perl
use strict; use warnings;
# ...
my @columns = ('a' .. 'z');
my $sql = sprintf(q{INSERT INTO sometable (%s) VALUES (%s)},
    join(',', map $dbh->quote($_), @columns),
    join(',', ('?') x @columns),
);
As for handling DEFAULT, wouldn't leaving that column out ensure that the DB sets it to the default value?
If you would use placeholders for "static" queries, you should use them for "dynamic" queries too. A query is a query.
my $stmt = 'UPDATE Widget SET foo=?';
my @params = $foo;
if ($set_far) {
    $stmt .= ', far=?';
    push @params, $far;
}
{
    my @where;
    if ($check_boo) {
        push @where, 'boo=?';
        push @params, $boo;
    }
    if ($check_bar) {
        push @where, 'bar=?';
        push @params, $bar;
    }
    $stmt .= ' WHERE ' . join ' AND ', map "($_)", @where
        if @where;
}
$dbh->do($stmt, undef, @params);
I used an UPDATE since it allowed me to demonstrate more, but everything applies to INSERT too.
my @fields = ('foo');
my @params = ($foo);
if ($set_far) {
    push @fields, 'bar';
    push @params, $far;
}
$stmt = 'INSERT INTO Widget ('
    . join(',', @fields)
    . ') VALUES ('
    . join(',', ('?') x @fields)
    . ')';
$dbh->do($stmt, undef, @params);
You've expressed concerns about the readability of the code and also being able to pass in a DEFAULT. I'll take @ikegami's answer one step further...
sub insert {
    my($dbh, $table, $fields, $values) = @_;

    my $q_table      = $dbh->quote($table);
    my @q_fields     = map { $dbh->quote($_) } @$fields;
    my @placeholders = map { "?" } @q_fields;

    my $sql = qq{
        INSERT INTO $q_table
               ( @{[ join(', ', @q_fields) ]} )
        VALUES ( @{[ join(', ', @placeholders) ]} )
    };

    return $dbh->do($sql, undef, @$values);
}
Now you have a generic multi value insert routine.
# INSERT INTO foo ('bar', 'baz') VALUES ( 23, 42 )
insert( $dbh, "foo", ['bar', 'baz'], [23, 43] );
To indicate a default value, don't pass in that column.
# INSERT INTO foo ('bar') VALUES ( 23 )
# 'baz' will use its default
insert( $dbh, "foo", ['bar'], [23] );
You can optimize this to make your subroutine do multiple inserts with one subroutine call and one prepared statement saving CPU on the client side (and maybe some on the database side if it supports prepared handles).
sub insert {
    my($dbh, $table, $fields, @rows) = @_;

    my $q_table      = $dbh->quote($table);
    my @q_fields     = map { $dbh->quote($_) } @$fields;
    my @placeholders = map { "?" } @q_fields;

    my $sql = qq{
        INSERT INTO $q_table
               ( @{[ join(', ', @q_fields) ]} )
        VALUES ( @{[ join(', ', @placeholders) ]} )
    };

    my $sth = $dbh->prepare_cached($sql);
    for my $values (@rows) {
        $sth->execute(@$values);
    }
}
# INSERT INTO foo ('bar', 'baz') VALUES ( 23, 42 )
# INSERT INTO foo ('bar', 'baz') VALUES ( 99, 12 )
insert( $dbh, "foo", ['bar', 'baz'], [23, 43], [99, 12] );
Finally, you can write a bulk insert passing in multiple values in a single statement. This is probably the most efficient way to do large groups of inserts. This is where having a fixed set of columns and passing in a DEFAULT marker comes in handy. I've employed the idiom where values passed as scalar references are treated as raw SQL values. Now you have the flexibility to pass in whatever you like.
sub insert {
    my($dbh, $table, $fields, @rows) = @_;

    my $q_table  = $dbh->quote($table);
    my @q_fields = map { $dbh->quote($_) } @$fields;

    my $sql = qq{
        INSERT INTO $q_table
               ( @{[ join(', ', @q_fields) ]} )
        VALUES
    };

    # This would be more elegant building an array and then joining it together
    # on ",\n", but that would double the memory usage and there might be
    # a lot of values.
    for my $values (@rows) {
        $sql .= "( ";
        # Scalar refs are treated as bare SQL.
        $sql .= join ", ", map { ref $_ ? $$_ : $dbh->quote($_) } @$values;
        $sql .= "),\n";
    }
    $sql =~ s{,\n$}{};

    return $dbh->do($sql);
}
# INSERT INTO foo ('bar', 'baz') VALUES ( 23, NOW ), ( DEFAULT, 12 )
insert( $dbh, "foo", ['bar', 'baz'], [23, \"NOW"], [\"DEFAULT", 12] );
The down side is this builds a string in memory, possibly very large. To get around that you have to involve database specific bulk insert from file syntax.
Rather than writing all this SQL generation stuff yourself, go with @emazep's answer and use SQL::Abstract and SQL::Abstract::Plugin::InsertMulti.
Just make sure you profile.

Mysql CASE statement usage with Zend

I have the following query that selects some records from the database:
$select_person = $this->select()
    ->setIntegrityCheck(false)
    ->from(array('a' => 'tableA'),
        array(
            new Zend_Db_Expr('SQL_CALC_FOUND_ROWS a.id'),
            'a.cid',
            'a.email',
            'final' => new Zend_Db_Expr("concat('<div style=\"color:#1569C7; font-weight:bold\">', a.head, ' ', a.tail, '</div>')"),
            'a.red_flag'
        )
    )
    ->joinLeft(array('b' => 'tableb'), ... blah blah)
    ->where('blah blah')
    ->order('a.head ASC')
I want to modify the above query so that it selects a different value for 'final' depending on the value of a.red_flag, which can be either true or false.
I understand I can use MySQL's CASE statement, e.g. something like the following:
'final' => new Zend_Db_Expr("CASE a.red_flag WHEN 'true' THEN '$concatstr1'
ELSE '$concatstr2' END")
The value of $concatstr1 = "concat( '<div style=\"color:red; font-weight:bold\">', a.head , ' ' , a.tail, '</div>')" ;
The value of $concatstr2 = "concat( '<div style=\"color:blue; font-weight:bold\">', a.head , ' ' , a.tail, '</div>')" ;
However, it throws an error saying
Message: SQLSTATE[42000]: Syntax error or access violation: 1064
You have an error in your SQL syntax; check the manual that
corresponds to your MySQL server version for the right syntax to use
near 'div
style="color:red; font-weight:bold">',
a.head , ' ' , ' at line 1
How can I make this query work?
Any help is greatly appreciated.
Thanks
Personally, I don't like getting HTML back from the DB. It gets confusing and harder to debug and change afterwards. Furthermore, you might get errors from the mix of ' and " and all the reserved characters in MySQL (<, >, ;, ...). I would suggest that you try this:
'final' => new Zend_Db_Expr("CASE a.red_flag WHEN 'true' THEN 1
ELSE 0 END")
Then do a check on the value of a.red_flag in PHP:
if ($this->final) {
    $output .= '<div style="color:red; font-weight:bold">';
} else {
    $output .= '<div style="color:blue; font-weight:bold">';
}
$output .= $this->head . ' ' . $this->tail;
$output .= '</div>';
If the query still doesn't work, try
echo $select->__toString(); exit();
and check the query. Run the output you got from __toString() against your database and check whether it works; it's easier to fix it that way. You could also post the query string here and it'll be easier to debug.
Finally, I found the error in my statement.
The culprit was that I was wrapping $concatstr1 and $concatstr2 in quotes inside the $select_person statement.
The correct query should be formed as follows:
$select_person = $this->select()
    ->setIntegrityCheck(false)
    ->from(array('a' => 'tableA'),
        array(
            new Zend_Db_Expr('SQL_CALC_FOUND_ROWS a.id'),
            'a.cid',
            'a.email',
            'final' => new Zend_Db_Expr("CASE a.red_flag WHEN 'true' THEN $concatstr1 ELSE $concatstr2 END"),
            'a.red_flag'
        )
    )
    ->joinLeft(array('b' => 'tableb'), ... blah blah)
    ->where('blah blah')
    ->order('a.head ASC');
This now returns the appropriate value for 'final': $concatstr1 when red_flag is true, and $concatstr2 otherwise.