Querying Django model when max_allowed_packet in MySQL is exceeded

So I've got this periodic task of sending an automated report to the user every month. The problem I encountered while generating the report data was that the MySQL DB has tons of report data for each user, so when I try to query on the User model, I get OperationalError: (1153, "Got a packet bigger than 'max_allowed_packet' bytes").
I've gone into the dbshell and checked the value of that variable, and it's already at the maximum allowed value (1 GB).
So I'm basically stuck here. Is there any way to get all the data without hitting that OperationalError?
The code is as follows (I've put in dummy names as I can't reveal company information) -
user_ids = list(Model1.objects.filter(param=param_value).values_list('user_id', flat=True)) # returns 143992 user_ids
users = User.objects.filter(user_id__in=user_ids)
I then try to iterate over users, but I hit the OperationalError.
I've also tried to split up the queryset like so -
slices = []
step = 1000
while True:
    sliced_queryset = users[step-1000:step]
    slices.append(sliced_queryset)
    step += 1000
    if sliced_queryset.count() < 1000:
        break
But I hit the same error for .count().
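A minimal sketch of one possible workaround, assuming the error is triggered by the sheer size of the single query built from roughly 144k IDs: split the ID list into smaller batches and query each batch separately, so no single statement or result exchange gets near the limit. The batched() helper is purely illustrative, not a Django API:
def batched(seq, size=1000):
    # Yield consecutive slices of `seq` containing at most `size` items.
    for start in range(0, len(seq), size):
        yield seq[start:start + size]

user_ids = list(
    Model1.objects.filter(param=param_value).values_list('user_id', flat=True)
)

for id_batch in batched(user_ids):
    # Each query now carries at most 1000 IDs; iterator() additionally avoids
    # caching the whole result set in memory while building the report.
    for user in User.objects.filter(user_id__in=id_batch).iterator():
        ...  # build the report data for this user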

Issue with concurrent requests in CakePHP 2.0

Thanks in advance for attempting to assist me with this issue.
I'm using CakePHP 2 (2.10.22).
I have a system which creates applications. Each application that gets created has a unique application number. The MySQL database column that stores this application number is set to 'Not null' and 'Unique'. I'm using CakePHP to get the last used application number from the database and then build the next application number for the new application that needs to be created.
The process I have written works without any problem when a single request is received at a given point in time. The problem arises when two requests are received to create an application at exactly the same time. The behaviour I have observed is that the request that gets picked up first reads the last application number (e.g. ABC001233) and assigns ABC001234 as the application number for the new application it needs to create. It successfully saves this application into the database. The second request, which is running concurrently, also gets ABC001233 as the last application number and tries to create a new application with ABC001234 as the application number. The MySQL database returns an error saying that the application number is not unique.
I then put the second request to sleep for 2 seconds, by which time the first application has successfully saved to the database. I then re-attempt the application creation process, which first gets the last application number (this should now be ABC001234), but instead each database read keeps returning ABC001233, even though the first request has long been completed.
Both requests have transactions in the controller. What I have noticed is that when I remove these transactions, the process works correctly: for the second request, after the first attempt fails, the second attempt works, as the system correctly gets ABC001234 as the last application number and assigns ABC001235 as the new application number. I want to know what I need to do to ensure the process works correctly even with the transaction directives in the controller.
Please find below some basic information on how the code is structured -
Database
The last application number is ABC001233
Controller file
function create_application(){
    $db_source->begin(); // The process works correctly if I remove this line.
    $result = $Application->create_new();
    if($result === true){
        $db_source->commit();
    }else{
        $db_source->rollback();
    }
}
Application model file
function get_new_application_number(){
    $application_record = $this->find('first', [
        'order' => [
            $this->name.'.application_number DESC'
        ],
        'fields' => [
            $this->name.'.application_number'
        ]
    ]);
    $old_application_number = $application_record[$this->name]['application_number'];
    $new_application_number = $old_application_number + 1;
    return $new_application_number;
}
The above is where I feel the problem originates. For the first request that gets picked up, this find correctly determines that ABC001233 is the last application number, and the function then returns ABC001234 as the next application number. The second request also picks up ABC001233 as the last application number, but it fails when it tries to save ABC001234, because the first request has already saved an application with that number.
As part of the second attempt for the second request (which occurs because of the do/while loop below), this find is executed again, but instead of returning ABC001234 as the last application number (per the successful save of the first request), it keeps returning ABC001233, resulting in another failure to save correctly. If I remove the transaction from the controller, this works correctly and returns ABC001234 on the second attempt. I couldn't find any documentation as to why that is and what can be done about it, which is where I need some assistance. Thank you!
function create_new(){
    $new_application_number = $this->get_new_application_number();
    $save_attempts = 0;
    do{
        $save_exception = false;
        try{
            $result = $this->save([$this->name => ['application_number' => $new_application_number]], [
                'atomic' => false
            ]);
        }catch(Exception $e){
            $save_exception = true;
            sleep(2);
            $new_application_number = $this->get_new_application_number();
        }
    }while($save_exception === true && $save_attempts++ < 5);
    return !$save_exception;
}
You just have to lock the row with the previous number in a transaction using SELECT ... FOR UPDATE. It's much better than the whole-table lock suggested in the comments.
According to the documentation (https://book.cakephp.org/2/en/models/retrieving-your-data.html), you just have to add 'lock' => true to the find() call in get_new_application_number():
function get_new_application_number(){
    $application_record = $this->find('first', [
        'order' => [
            $this->name.'.application_number DESC'
        ],
        'fields' => [
            $this->name.'.application_number'
        ],
        'lock' => true
    ]);
    $old_application_number = $application_record[$this->name]['application_number'];
    $new_application_number = $old_application_number + 1;
    return $new_application_number;
}
How does it work:
The second transaction will block on that query until the first transaction has ended, and will then read the newly committed application number.
P.S. According to the documentation, the lock option was added in CakePHP 2.10.0.

Flask SQL-Alchemy query is returning null for data that exists in my database. What could be the cause

My python program is meant to query my MySQL database for a record. The record exists in the database and contains data but the program returns null values. The table that gets queried is titled Market. In that table there is a column titled market_cap and a column titled volume. When I use MySQLWorkbench to query my database, the result shows that there is data in the columns. However, the program receives null.
Attached are two images (links, because I need to earn 10 reputation points to embed images in a post):
MySQL database column image: shows a view of the column in my database that is having issues. From the image, you can see that the data I need exists in my database.
Code with results from the PyCharm debugger: before running the debugger, I set a breakpoint right after the line where the code queries the database for an object. Image two shows the output I received when the code queried the database.
Screenshot of the Market Model
Screenshot of the solution: I found out that converting the market cap (market_cap) before adding it to the dictionary (price_map) returns the correct value. You can see it on line 138.
What could cause existent data in a record to be returned as null?
import logging
from flask_restful import Resource
from api.resources.util.date_util import (pretty_date_to_epoch,
                                           epoch_to_pretty_date)
from common.decorators import log_exception
from db.models import db, Market

log = logging.getLogger(__name__)

def map_date_to_price():
    buy_date_list = ["2015-01-01", "2015-02-01", "2015-03-01"]
    sell_date_list = ["2014-12-19", "2014-01-10", "2015-01-20",
                      "2016-01-10"]
    date_list = buy_date_list + sell_date_list
    market_list = []
    price_map = {}
    for day in date_list:
        market_list.append(db.session.query(Market).
                           filter(Market.pretty_date == day).first())
    for market in market_list:
        price_map[market.pretty_date] = market.market_cap
    return price_map
The two fields that are (apparently) being retrieved as null are both db.Numeric. http://docs.sqlalchemy.org/en/latest/core/type_basics.html notes that these are, by default, backed by a decimal.Decimal object, which I'll bet can't be converted to JSON, so what comes back from Market.__repr__() will show them as null.
I would try adding asdecimal=False to the two db.Numeric() calls in Market.
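A minimal sketch of what that change might look like, assuming a Market model roughly like the one in the screenshots; the table name and the market_cap / volume column names come from the question, while the precision, scale, and other columns here are assumptions:
# db/models.py (illustrative sketch)
from flask_sqlalchemy import SQLAlchemy

db = SQLAlchemy()

class Market(db.Model):
    __tablename__ = 'Market'

    id = db.Column(db.Integer, primary_key=True)   # assumed primary key
    pretty_date = db.Column(db.String(10))          # assumed column type
    # asdecimal=False makes SQLAlchemy return plain Python floats instead of
    # decimal.Decimal objects, which the answer above suspects are what end
    # up rendered as null.
    market_cap = db.Column(db.Numeric(20, 2, asdecimal=False))
    volume = db.Column(db.Numeric(20, 2, asdecimal=False))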

JMeter loop through CSV Data Set Config - Ajax flow

I am new to JMeter and trying to carry out the following flow:
User logs in with username and password
Page 1 is displayed with 10 invoices - the user selects ten invoices - 10 Ajax calls are executed (invoice1, invoice2, invoice3... a JSON file is generated with the invoices as the request)
Page 2 is displayed to view the invoices
User logs out
I have recorded the flow with the BlazeMeter plugin in Chrome.
The Thread Group in JMeter has the following tasks:
I have 10 users in a file called users.txt and I am using a CSV Data Set Config to load them.
For each user I will load only 10 invoices from invoices.txt, again using a CSV Data Set Config.
Since I have 10 users and each user needs 10 invoices, my invoices.txt has 100 unique invoices.
Please find the CSV Data Set Config for the invoices below:
The problem is that each user needs to be assigned 10 unique invoices, and those 10 invoices cannot be allocated to another user.
Any idea how I can load 10 unique invoices for each user and make sure those invoices are not assigned again to another user?
invoices.txt should contain only unique IDs before the test starts; you can share the IDs using:
CSV Data Set Config inside the loop of users, with attributes:
Sharing mode - All Threads - so an ID won't be repeated
Recycle on EOF? - False - so you don't get an invalid ID (<EOF>)
Stop thread on EOF? - True - stop when the file with unique IDs ends
You can consider using the HTTP Simple Table Server plugin instead of the second CSV Data Set Config.
HTTP Simple Table Server has a KEEP option; given you set it to FALSE, each used "invoice" will be removed, which guarantees uniqueness even when you run your test in Distributed (Remote) mode.
You can install HTTP Simple Table Server (as well as any other JMeter plugin) using the JMeter Plugins Manager.

Running a thread group multiple times for all the values in a csv file

I have recorded a series of 5 HTTP requests in a thread group (say TG). The response value of each request has to be sent as a parameter in the next request, and so on until the last request is made.
To send the parameter in the first request, I have created a CSV file with unique values (say 1,2,3,4,5).
Now I want this TG to run for all the values read from the CSV file (in the above case, TG should start running for value 1, then value 2, and so on till 5).
How do I do this?
Given your CSV file looks like:
1
2
3
4
5
In the Thread Group set Loop Count to "Forever"
Add a CSV Data Set Config element under the Thread Group and configure it as follows:
Filename: if the file is in JMeter's bin folder - file name only. If it is in another location - full path to the CSV file
Variable Names: anything meaningful, i.e. parameter
Recycle on EOF - false
Stop thread on EOF - true
Sharing mode - according to your scenario
See the Using CSV DATA SET CONFIG guide for a more detailed explanation.
Another option is using the __CSVRead() function.
This method of creating an individual request for each record will not scale to many records. There is another, more scalable solution here: Jmeter multiple executions for each record from CSV

Getting time intervals in ruby

I have an external service that allows me to log users into my website.
To avoid getting kicked out of it for overuse, I use a MySQL table of the following form that caches user accesses:
username (STRING) | last access (TIMESTAMP) | has access? (TINYINT(1) - BOOLEAN)
If the user had access at a given time, I trust that they have access and don't query the service again for a day, that is
query_again = !user["last access"].between?(Time.now, 1.day.ago)
This always returns true for some reason. Any help with this logic?
In ranges (which you effectively use here), it is generally expected that the lower value is the start and the higher value is the end. Thus, it should work for you if you just swap the arguments in your between? call:
query_again = !user["last access"].between?(1.day.ago, Time.now)
You can test this yourself easily in IRB:
1.hour.ago.between?(Time.now, 1.day.ago)
# => false
1.hour.ago.between?(1.day.ago, Time.now)
# => true