Improving performance on a large Doctrine query - MySQL

In Symfony3, I'm using Doctrine's QueryBuilder to iterate up to 500k rows from my 35 million row table:
$query = $this->createQueryBuilder('l')
    ->where('l.foo = :foo')
    ->setParameter('foo', $foo)
    ->getQuery();

$results = $query->iterate();

foreach ($results as $result) {
    $em->clear();
    // My logic using $result[0]
}
The memory usage of this often approaches 512 MB before I even begin to iterate. Is there any further way I can optimise this? Am I correct in reading that hydration is turned off when iterating a query?

I had great results with generators. Perhaps processing results in a separate method helps PHP to clean up unused objects. I'm not sure what you're doing to process your records, and cannot guarantee you'll get the same results, but in my case memory consumption remained constant through the whole script execution:
public function getMyResults($foo)
{
    $query = $this->createQueryBuilder('l')
        ->where('l.foo = :foo')
        ->setParameter('foo', $foo)
        ->getQuery();

    foreach ($query->iterate() as $result) {
        yield $result[0];
        // clear() detaches the hydrated entities so memory stays flat.
        $this->getEntityManager()->clear();
    }
}

public function processMyResults($foo)
{
    foreach ($this->getMyResults($foo) as $result) {
        // Process a single, already-yielded entity here.
    }
}
If this doesn't help, consider making a query with DBAL or PDO (both with the fetch() method to avoid fetching all records at once). Doctrine's iterator might leak memory (PDO's resultset shouldn't).
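For illustration, a rough sketch of that row-by-row approach with DBAL (assuming the DBAL 2.x API that shipped with Symfony3-era projects; table and column names are placeholders):

$stmt = $conn->executeQuery(
    'SELECT * FROM my_table WHERE foo = :foo', // placeholder SQL
    ['foo' => $foo]
);

// fetch() pulls one row at a time instead of hydrating entities,
// so memory stays bounded by roughly a single row.
while ($row = $stmt->fetch(\PDO::FETCH_ASSOC)) {
    // Process $row here.
}

Note that MySQL's default buffered mode still holds the full result set client-side; for truly constant memory you may also need a dedicated connection with PDO::MYSQL_ATTR_USE_BUFFERED_QUERY disabled.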
Doctrine will solve 80% of your problems. The remaining 20% is better approached without it.
Am I correct in reading that hydration is turned off when iterating a query?
No, unless you change the hydration mode. You can do it by passing a second argument to the iterate() method.
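For example, you can iterate the same query with array hydration instead of full object hydration; a quick sketch (Query::HYDRATE_SCALAR is another option):

use Doctrine\ORM\Query;

// The second argument of iterate() selects the hydration mode;
// HYDRATE_ARRAY skips building managed entities entirely.
foreach ($query->iterate(null, Query::HYDRATE_ARRAY) as $row) {
    $data = current($row); // plain associative array, nothing to clear()
}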

Example from Doctrine docs
$batchSize = 20;
$i = 1;
$q = $em->createQuery('select u from MyProject\Model\User u');

foreach ($q->toIterable() as $user) {
    $user->increaseCredit();
    $user->calculateNewBonuses();
    ++$i;
    if (($i % $batchSize) === 0) {
        $em->flush(); // Executes all updates.
        $em->clear(); // Detaches all objects from Doctrine!
    }
}

$em->flush();

Related

How can I fetch huge numbers of records using Laravel and MySQL?

I need expert suggestions and solutions. We are developing a job portal website that handles around 1 million records, and we are facing timeout errors when fetching those records. How can I handle them using Laravel and MySQL?
We have tried the following steps:
Increasing the PHP execution time
MySQL indexing
Pagination
You should be chunking results when working with large data sets. This allows you to process smaller loads, reduces memory consumption, and lets you return data to the user while the rest is being fetched/processed. See the Laravel documentation on chunking:
https://laravel.com/docs/5.5/eloquent#chunking-results
To further speed things up you can leverage multiple processes and spawn concurrent workers that each handle a chunk at a time. Symfony's Symfony\Component\Process\Process class makes this easy to do.
https://symfony.com/doc/current/components/process.html
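As a rough sketch of that idea (the artisan command name, its arguments, and $totalRows are made up for illustration; this assumes a Process version that accepts an array command, i.e. Symfony 4+):

use Symfony\Component\Process\Process;

$processes = [];
$chunkSize = 10000;

// $totalRows: total number of rows to process, determined elsewhere.
// Spawn one worker per chunk; each worker handles rows [offset, offset + chunkSize).
for ($offset = 0; $offset < $totalRows; $offset += $chunkSize) {
    $process = new Process(['php', 'artisan', 'jobs:process-chunk', $offset, $chunkSize]);
    $process->start(); // non-blocking
    $processes[] = $process;
}

// Wait for all workers to finish.
foreach ($processes as $process) {
    $process->wait();
}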
From the docs:
If you need to work with thousands of database records, consider using the chunk method. This method retrieves a small chunk of the results at a time and feeds each chunk into a Closure for processing. This method is very useful for writing Artisan commands that process thousands of records.
For example, let's work with the entire users table in chunks of 100 records at a time:
DB::table('users')->orderBy('id')->chunk(100, function ($users) {
    foreach ($users as $user) {
        //
    }
});
I think this might help:
$users = User::groupBy('id')->orderBy('id', 'asc');

$response = new StreamedResponse(function () use ($users) {
    $handle = fopen('php://output', 'w');

    // Add the CSV header row
    fputcsv($handle, ['col1', 'Col 2']);

    $users->chunk(1000, function ($filtered_users) use ($handle) {
        foreach ($filtered_users as $user) {
            // Add a new row with user data
            fputcsv($handle, [
                $user->col1, $user->col2,
            ]);
        }
    });

    // Close the output stream
    fclose($handle);
}, 200, [
    'Content-Type' => 'text/csv',
    'Content-Disposition' => 'attachment; filename="Users'.Carbon::now()->toDateTimeString().'.csv"',
]);

return $response;
Laravel has a lazy feature for exactly this purpose. I tried both chunk and cursor. cursor makes a single query and loads a lot of data into memory, which is not useful if you have millions of records in the database. chunk was also fine, but lazy is much cleaner in the way you write your code.
use App\Models\Flight;

foreach (Flight::lazy() as $flight) {
    //
}
Source: https://laravel.com/docs/9.x/eloquent#chunking-results
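If you filter on a column that you also update while iterating, the same docs page recommends lazyById() so the pagination key doesn't shift under you; for example (Laravel 8.34+):

use App\Models\Flight;

Flight::where('departed', true)
    ->lazyById(200, column: 'id')      // chunk size 200, keyed by "id"
    ->each->update(['departed' => false]);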

Symfony3: Doctrine batch processing with exceptions handling

I need to insert multiple users from an Excel file using a Symfony3 command. I have read the following article about batch processing: http://docs.doctrine-project.org/projects/doctrine-orm/en/latest/reference/batch-processing.html
I have been wondering if there is a way to not stop the flushing process when a query fails (for a not-null column, for instance). I would actually like to not have to check all my data before persisting, and to let Doctrine continue the inserts even though one query in a flush of, let's say, 20 queries fails.
Thank you for your help.
Kind regards,
This template may help you go further...
$batchSize = 20;
$currentSize = 0;
$data = [ .... ];

foreach ($data as $item) {
    $entity = new Entity();
    $entity->setProperty($item['property']);
    try {
        $currentSize++;
        $em->persist($entity);
        if ($currentSize % $batchSize === 0) {
            $em->flush();
            $em->clear();
        }
    } catch (\Doctrine\ORM\ORMException $e) {
        $currentSize--;
    }
}

$em->flush();
$em->clear();
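One caveat to keep in mind: when flush() itself throws (for example on a NOT NULL violation), Doctrine closes the EntityManager, so simply catching the exception and continuing will not work as-is. A minimal sketch of one way around that, flushing each batch separately and resetting the manager through the registry (assuming the command has Doctrine's ManagerRegistry injected as $this->doctrine; Entity and $rows are placeholders):

$batchSize = 20;

foreach (array_chunk($rows, $batchSize) as $batch) {
    $em = $this->doctrine->getManager();

    foreach ($batch as $row) {
        $entity = new Entity();
        $entity->setProperty($row['property']);
        $em->persist($entity);
    }

    try {
        $em->flush();   // Insert the whole batch at once.
        $em->clear();   // Detach the batch so memory stays flat.
    } catch (\Exception $e) {
        // A failed flush() closes the EntityManager; reset it via the
        // registry and continue with the next batch (this batch is lost).
        $this->doctrine->resetManager();
    }
}

If you need per-record granularity, you could retry a failed batch one record at a time inside the catch block.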

Stop mysql query when press stop button

I got stuck with this problem; I found many posts but none of them seemed useful, so I am posting again here and hope someone can help me.
Say I have two buttons, a Start button and a Stop button. Pressing Start calls an AJAX function that runs a very long query. When I press Stop, I need that query to stop immediately and not execute any further.
This is the function used to run the query and fetch the rows (a customised Mysqli.php):
public function fetchMultiRowset($params = array()) {
    $data = array();
    $mysqli = $this->_adapter->getConnection();
    $mysqli->multi_query($this->bindParams($this->_sql, $params));
    $thread_id = mysqli_thread_id($mysqli);
    ignore_user_abort(true);
    ob_start();
    $index = 0;
    do {
        if ($result = $mysqli->store_result()) {
            while ($row = $result->fetch_array(MYSQLI_ASSOC)) {
                $data[$index] = $row;
                $index++;
                echo " ";
                ob_flush();
                flush();
            }
            $result->free();
        }
    } while ($mysqli->more_results() && $mysqli->next_result());
    ob_end_flush();
    return $data;
}
Function in Model:
public function select_entries() {
    $data = null;
    try {
        $db = Zend_Db_Adapter_Mysqlicustom::singleton();
        $sql = "SELECT * FROM report LIMIT 2000000";
        $data = $db->fetchMultiRowset($sql);
        $db->closeConnection();
    } catch (Exception $exc) {
    }
    return $data;
}
Controller:
public function testAction() {
    $op = $this->report_test->select_entries();
}
In AJAX I used xhr.abort() to stop the AJAX call, but the query keeps running on the server even after the AJAX call has been aborted.
How do I stop the query? I am using Zend Framework.
EDIT: I did not look at your program in detail at first; I now see that it is not the query itself that takes so long, but reading all of the data. So just check every 1000 rows whether the AJAX call is still active (see: aborting an AJAX request).
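Applied to the fetchMultiRowset() loop above, that check might look roughly like this; connection_aborted() only notices the disconnect after output has been flushed to the client, and the surrounding ignore_user_abort(true) only stops PHP from killing the script automatically, so the echo/flush stays in:

while ($row = $result->fetch_array(MYSQLI_ASSOC)) {
    $data[$index] = $row;
    $index++;

    // Every 1000 rows, push a byte to the client and check
    // whether the browser has aborted the AJAX request.
    if ($index % 1000 === 0) {
        echo " ";
        ob_flush();
        flush();
        if (connection_aborted()) {
            $result->free();
            return $data; // Stop reading; the client is gone.
        }
    }
}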
Solution in case of a long-running SQL-query:
You would have to allow the application to kill database queries, and you would need to implement a more complex interaction between client and server, which could lead to security holes if done wrong.
The start request should contain a session id and a page id (secure ids, so not 3, 4, 5 but non-guessable, unique hashes of some kind). The backend then associates these ids with the query. This could be done in an extra table in the database, but also via a comment in the SQL query, like "Session fid98a08u4j, Page 940jfmkvlz" => s:<session>p:<page>.
/* s:fid98a08u4jp:940jfmkvlz */ select * from ...
If the user presses "stop", you send the session and page id to the server. The PHP code then fetches the list of running SQL queries, searches for the session and page ids, and extracts the query id.
Then the php sends a
kill query <id>
to the MySQL server.
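A rough sketch of what the "stop" endpoint could do, assuming a PDO connection with the PROCESS privilege and the comment-tagging scheme described above (the function name and tag format are illustrative):

// Find the tagged query in the process list and kill it.
function killTaggedQuery(PDO $pdo, string $session, string $page): void
{
    $tag = "s:{$session}p:{$page}";

    // INFO contains the full SQL text, including the leading comment tag.
    $stmt = $pdo->prepare(
        "SELECT ID FROM information_schema.PROCESSLIST
         WHERE COMMAND = 'Query' AND INFO LIKE ?"
    );
    $stmt->execute(['%' . $tag . '%']);

    foreach ($stmt->fetchAll(PDO::FETCH_COLUMN) as $processId) {
        // KILL QUERY stops the statement but keeps the connection alive.
        $pdo->exec('KILL QUERY ' . (int) $processId);
    }
}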
This might lead to trouble when not using transactions, and it might hurt replication. Even a KILL QUERY can take some time while the statement sits in the 'killing' state.
So make sure that you really cannot split the long-running query into smaller queries that check every few seconds whether the request is still valid, and that you are not killing the query for purely cosmetic reasons.

Import of 50K+ Records in MySQL Gives General error: 1390 Prepared statement contains too many placeholders

Has anyone ever come across this error: General error: 1390 Prepared statement contains too many placeholders
I just did an import via SequelPro of over 50,000 records and now when I go to view these records in my view (Laravel 4) I get General error: 1390 Prepared statement contains too many placeholders.
The below index() method in my AdminNotesController.php file is what is generating the query and rendering the view.
public function index()
{
    $created_at_value = Input::get('created_at_value');
    $note_types_value = Input::get('note_types_value');
    $contact_names_value = Input::get('contact_names_value');
    $user_names_value = Input::get('user_names_value');
    $account_managers_value = Input::get('account_managers_value');

    if (is_null($created_at_value)) $created_at_value = DB::table('notes')->lists('created_at');
    if (is_null($note_types_value)) $note_types_value = DB::table('note_types')->lists('type');
    if (is_null($contact_names_value)) $contact_names_value = DB::table('contacts')->select(DB::raw('CONCAT(first_name," ",last_name) as cname'))->lists('cname');
    if (is_null($user_names_value)) $user_names_value = DB::table('users')->select(DB::raw('CONCAT(first_name," ",last_name) as uname'))->lists('uname');

    // In the view, there is a dropdown box, that allows the user to select the amount of records to show per page. Retrieve that value or set a default.
    $perPage = Input::get('perPage', 10);

    // This code retrieves the order from the session that has been selected by the user by clicking on a table column title. The value is placed in the session via the getOrder() method and is used later in the Eloquent query and joins.
    $order = Session::get('account.order', 'company_name.asc');
    $order = explode('.', $order);

    $notes_query = Note::leftJoin('note_types', 'note_types.id', '=', 'notes.note_type_id')
        ->leftJoin('users', 'users.id', '=', 'notes.user_id')
        ->leftJoin('contacts', 'contacts.id', '=', 'notes.contact_id')
        ->orderBy($order[0], $order[1])
        ->select(array('notes.*', DB::raw('notes.id as nid')));

    if (!empty($created_at_value)) $notes_query = $notes_query->whereIn('notes.created_at', $created_at_value);

    $notes = $notes_query->whereIn('note_types.type', $note_types_value)
        ->whereIn(DB::raw('CONCAT(contacts.first_name," ",contacts.last_name)'), $contact_names_value)
        ->whereIn(DB::raw('CONCAT(users.first_name," ",users.last_name)'), $user_names_value)
        ->paginate($perPage)->appends(array(
            'created_at_value' => Input::get('created_at_value'),
            'note_types_value' => Input::get('note_types_value'),
            'contact_names_value' => Input::get('contact_names_value'),
            'user_names_value' => Input::get('user_names_value'),
        ));

    $notes_trash = Note::onlyTrashed()
        ->leftJoin('note_types', 'note_types.id', '=', 'notes.note_type_id')
        ->leftJoin('users', 'users.id', '=', 'notes.user_id')
        ->leftJoin('contacts', 'contacts.id', '=', 'notes.contact_id')
        ->orderBy($order[0], $order[1])
        ->select(array('notes.*', DB::raw('notes.id as nid')))
        ->get();

    $this->layout->content = View::make('admin.notes.index', array(
        'notes' => $notes,
        'created_at' => DB::table('notes')->lists('created_at', 'created_at'),
        'note_types' => DB::table('note_types')->lists('type', 'type'),
        'contacts' => DB::table('contacts')->select(DB::raw('CONCAT(first_name," ",last_name) as cname'))->lists('cname', 'cname'),
        'accounts' => Account::lists('company_name', 'company_name'),
        'users' => DB::table('users')->select(DB::raw('CONCAT(first_name," ",last_name) as uname'))->lists('uname', 'uname'),
        'notes_trash' => $notes_trash,
        'perPage' => $perPage
    ));
}
Any advice would be appreciated. Thanks.
Solved this issue by using the array_chunk function.
Here is the solution:
foreach (array_chunk($data, 1000) as $t) {
    DB::table('table_name')->insert($t);
}
There is a limit of 65,535 (2^16 - 1) placeholders in MariaDB 5.5, which is supposed to behave identically to MySQL 5.5.
Not sure if it is relevant; I tested this on PHP 5.5.12 using MySQLi / mysqlnd.
This error only happens when both of the following conditions are met:
You are using the MySQL Native Driver (mysqlnd) and not the MySQL client library (libmysqlclient)
You are not emulating prepares.
If you change either one of these factors, this error will not occur. However, keep in mind that both of them are recommended, for performance and security reasons respectively, so I would not recommend this solution for anything other than a one-time or temporary problem. To prevent this error from occurring, the fix is as simple as:
$dbh->setAttribute(PDO::ATTR_EMULATE_PREPARES, true);
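For context, a minimal sketch of setting the same attribute when the connection is created (DSN and credentials are placeholders):

$pdo = new PDO(
    'mysql:host=localhost;dbname=mydb;charset=utf8mb4', // placeholder DSN
    'user',
    'password',
    [
        // Emulated prepares build the full SQL string client-side,
        // so the 65,535-placeholder server limit no longer applies.
        PDO::ATTR_EMULATE_PREPARES => true,
        PDO::ATTR_ERRMODE          => PDO::ERRMODE_EXCEPTION,
    ]
);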
While I think @The Disintegrator is correct about the placeholders being limited, I would not run one query per record.
I have a query that worked fine until I added one more column, and now I have 72k placeholders and get this error. However, that 72k is made up of 9,000 rows with 8 columns. Running this query one record at a time would take days. (I'm trying to import AdWords data into a DB, and it would literally take more than 24 hours to import a day's worth of data if I did it one record at a time. I tried that first.)
What I would recommend is something of a hack. First, dynamically determine the maximum number of placeholders you want to allow - e.g. 60k to be safe. Use this number to determine, based on the number of columns, how many complete records you can import/return at once. Create the full array of data for your query. Use array_chunk and a foreach loop to grab everything you want in the minimum number of queries. Like this:
$maxRecords = 1000;
$sql = 'SELECT * FROM ...';
$qMarks = array_fill(0, $maxRecords, '(?, ...)');
$tmp = $sql . implode(', ', $qMarks);

foreach (array_chunk($data, $maxRecords) as $junk => $dataArray) {
    if (count($dataArray) < $maxRecords) { break; }
    // Do your PDO stuff here using $tmp as your SQL statement with all those placeholders - the ?s
}
// Now insert all the leftovers with basically the same code as above except accounting for
// the fact that you have fewer than $maxRecords now.
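A more concrete, self-contained sketch of that chunking idea, assuming a hypothetical table my_table(col_a, col_b, col_c), an existing $pdo connection, and a $rows array of associative arrays keyed by column name:

$columns   = ['col_a', 'col_b', 'col_c'];  // hypothetical columns
$chunkSize = 1000;                          // rows per INSERT, well under the 65,535-placeholder limit

foreach (array_chunk($rows, $chunkSize) as $chunk) {
    // One "(?, ?, ?)" group per row in this chunk.
    $rowPlaceholder = '(' . implode(', ', array_fill(0, count($columns), '?')) . ')';
    $sql = 'INSERT INTO my_table (' . implode(', ', $columns) . ') VALUES '
         . implode(', ', array_fill(0, count($chunk), $rowPlaceholder));

    // Flatten the chunk into a single flat list of bound values,
    // in the same column order as the placeholders.
    $values = [];
    foreach ($chunk as $row) {
        foreach ($columns as $column) {
            $values[] = $row[$column];
        }
    }

    $pdo->prepare($sql)->execute($values);
}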
Using a Laravel model, this copies all 11,000 records from an SQLite database to a MySQL database in a few seconds, chunking the data array into 500-record pieces:
public function handle(): void
{
    $smodel = new Src_model();
    $smodel->setTable($this->argument('fromtable'));
    $smodel->setConnection('default'); // sqlite database
    $src = $smodel::all()->toArray();

    $dmodel = new Dst_model();
    $dmodel->setTable($this->argument('totable'));
    $dmodel->timestamps = false;
    $stack = $dmodel->getFields();
    $fields = array_shift($stack);

    $condb = DB::connection('mysql');
    $condb->beginTransaction();

    $dmodel::query()->truncate();
    $dmodel->fillable($stack);

    $srcarr = array_chunk($src, 500);
    $isOK = true;
    foreach ($srcarr as $item) {
        if (!$dmodel->query()->insert($item)) $isOK = false;
    }

    if ($isOK) {
        $this->notify("Copied the table from: {$this->argument('fromtable')} to: {$this->argument('totable')}", 'It will be fresher than ever!');
        $condb->commit();
    } else {
        $condb->rollBack();
    }
}
You can do it with the array_chunk function, like this:
foreach (array_chunk($data, 1000) as $key => $smallerArray) {
    $temp = [];
    foreach ($smallerArray as $index => $value) {
        $temp[$index] = $value;
    }
    DB::table('table_name')->insert($temp);
}
My fix for the above issue:
When I got this error, I fixed it by reducing the bulk-insertion chunk size from 1000 to 800, and that worked for me.
There were simply too many fields in my table, and most of them contain long descriptions, about a full page of text each. When I attempted the bulk insertion, the service crashed and threw the above error.
I think the number of placeholders is limited to 65,536 per query (at least in older MySQL versions).
I really can't tell what this piece of code is generating, but if it's a gigantic query, there's your problem.
You should generate one query per record to import and wrap those queries in a transaction.
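A minimal sketch of that per-record-in-a-transaction approach (table and column names are placeholders, and $pdo/$rows are assumed to exist; note that the answers above argue a chunked multi-row INSERT is usually much faster):

$pdo->beginTransaction();

try {
    // One prepared statement, reused for every record.
    $stmt = $pdo->prepare('INSERT INTO my_table (col_a, col_b) VALUES (?, ?)');

    foreach ($rows as $row) {
        $stmt->execute([$row['col_a'], $row['col_b']]);
    }

    $pdo->commit();
} catch (\Exception $e) {
    $pdo->rollBack();
    throw $e;
}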

How to see MySQL statements and error (if any) in CakePHP shell

I am using CakePHP 1.3 and writing custom shells to run mundane tasks in cron jobs. I am seeing failed Model->save() calls from time to time, but I don't know of any way to find out what the exact problem is.
Is there a way to display the actual SQL statements executed and warning/error returned by MySQL in a CakePHP shell?
Thanks.
You can use the following SQL dump task for shells.
http://bakery.cakephp.org/articles/carcus88/2011/04/08/sql_dump_task_for_shells
One way to do this would be to watch the MySQL log file in a separate terminal.
A couple of ways of doing this are listed here:
MySQL Query Logging in CakePHP
I found a way to do it. In your shell, add:
function initialize()
{
    Configure::write('debug', 2);
    $this->_loadDbConfig();
    $this->_loadModels();
}
Then, whenever you want to see the log, call this function:
function dump_sql()
{
    $sql_dump = '';
    if (!class_exists('ConnectionManager') || Configure::read('debug') < 2) {
        return false;
    }
    $noLogs = !isset($logs);
    if ($noLogs) {
        $sources = ConnectionManager::sourceList();
        $logs = array();
        foreach ($sources as $source):
            $db =& ConnectionManager::getDataSource($source);
            if (!$db->isInterfaceSupported('getLog')):
                continue;
            endif;
            $logs[$source] = $db->getLog();
        endforeach;
    }
    if ($noLogs || isset($_forced_from_dbo_)) {
        foreach ($logs as $source => $logInfo) {
            $text = $logInfo['count'] > 1 ? 'queries' : 'query';
            $sql_dump .= "cakeSqlLog_" . preg_replace('/[^A-Za-z0-9_]/', '_', uniqid(time(), true));
            $sql_dump .= '(' . $source . ') ' . $logInfo['count'] . ' ' . $text . ' took ' . $logInfo['time'] . ' ms';
            $sql_dump .= 'Nr Query Error Affected Num. rows Took (ms)';
            foreach ($logInfo['log'] as $k => $i) {
                $sql_dump .= $i['query'];
            }
        }
    } else {
        $sql_dump .= 'Encountered unexpected $logs cannot generate SQL log';
    }
    return $sql_dump; // Return the collected log so the shell can echo it.
}
One other approach would be to keep all your custom queries in the models/behaviors and just call the data fetches/updates from the shells. This gives you the extra benefit of being able to reuse that custom SQL in other parts of the project, for example in unit tests.
In CakePHP 1.2, I was able to get the SQL queries to show up in my console output by adding a Configure::write('debug', 2); call to the bottom of the __bootstrap method in the cake/console/cake.php file.
No need to mess around with calling a dump_sql function like some of these answers suggest; I just automatically get the normal query log, like at the bottom of a web page.