How can I lock a table in DBIx::Class? (MySQL)

I am writing a small application with:

Mojolicious
DBIx::Class
Hypnotoad (a pre-forking web server)
MySQL

In my application I need to do the following:
1. Do some complex processing (takes a minute or so to complete)
2. Insert the resulting data from the above processing into tables
3. Obtain the last auto-increment values of some tables, and do some more processing
4. Use the values from step 3 as part of an insert into another table (a junction table)
Here is some sample code, starting at step 2:
# step 2
my $device = $device_rs->create(
    {
        devicename    => $deviceName,
        objects       => \@objects,
        object_groups => \@objectgroups,
    }
);
# step 3
my $lastogid = $db->resultset('ObjectGroup')->get_column('objectgroupid')->max;
my $lastobid = $db->resultset('Object')->get_column('objectid')->max;

my $obgcount = scalar(@objectgroups);
my $objcount = scalar(@objects);

my $ogoffset = $lastogid - $obgcount;
my $oboffset = $lastobid - $objcount;
# now increment the object/group ids by the offsets; these will be inserted into the many-to-many table
foreach my $hash (@childobjects) {
    $hash->{'objectgroup_objectgroupid'} += $ogoffset;
    $hash->{'object_objectid'} += $oboffset;
}

# step 4 - populate the junction table
$db->resultset('ObjectGroupHasObjects')->populate(\@childobjects);
Now, due to multiple pre-forked server processes running at once, the values obtained in step 3 may not be correct (for the current 'device').
I'm trying to find a way around this issue. The only thing I can think of at the moment is putting a lock on the database tables before step 2 and unlocking after step 4.
How can I do this in DBIx::Class and is this likely to resolve my issue?
Thank you.

Something like:
$schema->storage->dbh->do("LOCK TABLES table_name WRITE");
...
...
$schema->storage->dbh->do("UNLOCK TABLES");
Source: http://www.perlmonks.org/?node_id=854538
Also see: How to avoid race conditions when using the find_or_create method of DBIx::Class::ResultSet?
and SQLHackers::SELECT#SELECT_..._FOR_UPDATE
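Putting that together with the steps from the question, a minimal sketch (the table names are assumptions; MySQL requires a lock type and requires every table touched while the lock is held to be listed):
my $dbh = $schema->storage->dbh;
$dbh->do('LOCK TABLES device WRITE, object WRITE, object_group WRITE, object_group_has_objects WRITE');

eval {
    # steps 2-4 from the question go here
};
my $error = $@;

# always release the locks, even if the steps above died
$dbh->do('UNLOCK TABLES');
die $error if $error;
Note that LOCK TABLES is per connection, so all four steps must run on the same database handle.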

Related

Django bulk update setting each to different values?

I'd like to update a table with Django - something like this in raw SQL:
update tbl_name set name = 'foo' where name = 'bar'
My first result is something like this - but that's nasty, isn't it?
list = ModelClass.objects.filter(name='bar')
for obj in list:
    obj.name = 'foo'
    obj.save()
Is there a more elegant way?
Update:
Django 2.2 now has a bulk_update() method.
Old answer:
Refer to the following django documentation section
Updating multiple objects at once
In short you should be able to use:
ModelClass.objects.filter(name='bar').update(name="foo")
You can also use F objects to do things like incrementing rows:
from django.db.models import F
Entry.objects.all().update(n_pingbacks=F('n_pingbacks') + 1)
See the documentation.
However, note that:
This won't use the ModelClass.save() method (so if you have some logic inside it, that logic won't be triggered).
No django signals will be emitted.
You can't perform an .update() on a sliced QuerySet; it must be on an original QuerySet, so you'll need to lean on the .filter() and .exclude() methods (see the example below).
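For example, the slice restriction means the first call below fails while the second works (the pk__lt filter is just an illustration):
# fails: Django refuses to update a queryset once a slice has been taken
ModelClass.objects.filter(name='bar')[:10].update(name='foo')

# narrow the queryset with filter()/exclude() instead
ModelClass.objects.filter(name='bar', pk__lt=100).update(name='foo')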
Consider using django-bulk-update found here on GitHub.
Install: pip install django-bulk-update
Implement: (code taken directly from the project's README file)
import random

from bulk_update.helper import bulk_update

random_names = ['Walter', 'The Dude', 'Donny', 'Jesus']
people = Person.objects.all()
for person in people:
    r = random.randrange(4)
    person.name = random_names[r]
bulk_update(people)  # updates all columns using the default db
Update: As Marc points out in the comments, this is not suitable for updating thousands of rows at once. It is suitable for smaller batches, from tens to hundreds of rows. The batch size that is right for you depends on your CPU and query complexity. This tool is more like a wheelbarrow than a dump truck.
Django 2.2 version now has a bulk_update method (release notes).
https://docs.djangoproject.com/en/stable/ref/models/querysets/#bulk-update
Example:
# get a {pk: record} dictionary of existing records
updates = YourModel.objects.filter(...).in_bulk()
....
# do something with the updates dict
....
if hasattr(YourModel.objects, 'bulk_update') and updates:
    # Use the new method
    YourModel.objects.bulk_update(updates.values(), [list the fields to update], batch_size=100)
else:
    # The old & slow way
    with transaction.atomic():
        for obj in updates.values():
            obj.save(update_fields=[list the fields to update])
If you want to set the same value on a collection of rows, you can use the update() method combined with any query term to update all rows in one query:
some_list = ModelClass.objects.filter(some condition).values('id')
ModelClass.objects.filter(pk__in=some_list).update(foo=bar)
If you want to update a collection of rows with different values depending on some condition, the best case is to batch the updates according to the target values. Say you have 1000 rows where you want to set a column to one of X values; then you can prepare the batches beforehand and run only X update-queries (each essentially having the form of the first example above), plus the initial SELECT-query.
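A rough sketch of that batching idea (compute_new_value() here is a hypothetical stand-in for whatever rule maps a row to its target value):
from collections import defaultdict

# group primary keys by the value each row should receive
batches = defaultdict(list)
for pk, name in ModelClass.objects.values_list('pk', 'name'):
    batches[compute_new_value(name)].append(pk)

# one UPDATE per distinct target value instead of one per row
for new_value, pks in batches.items():
    ModelClass.objects.filter(pk__in=pks).update(name=new_value)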
If every row requires a unique value, there is no way to avoid one query per update. Perhaps look into other architectures like CQRS/Event sourcing if you need performance in this latter case.
Here is some useful content I found on the internet regarding the above question:
https://www.sankalpjonna.com/learn-django/running-a-bulk-update-with-django
The inefficient way:
model_qs = ModelClass.objects.filter(name='bar')
for obj in model_qs:
    obj.name = 'foo'
    obj.save()
The efficient way:
ModelClass.objects.filter(name='bar').update(name='foo')  # for a single value 'foo'
Using bulk_update:
update_list = []
model_qs = ModelClass.objects.filter(name='bar')
for model_obj in model_qs:
    model_obj.name = 'foo'  # or whatever the value should be; for simplicity I'm using 'foo' only
    update_list.append(model_obj)
ModelClass.objects.bulk_update(update_list, ['name'])
Using an atomic transaction:
from django.db import transaction

with transaction.atomic():
    model_qs = ModelClass.objects.filter(name='bar')
    for obj in model_qs:
        obj.name = 'foo'
        obj.save(update_fields=['name'])
To update with the same value we can simply use this:
ModelClass.objects.filter(name = 'bar').update(name='foo')
To update with different values:
obj_list = ModelClass.objects.filter(name='bar')
obj_to_be_update = []
for obj in obj_list:
    obj.name = "Dear " + obj.name
    obj_to_be_update.append(obj)
ModelClass.objects.bulk_update(obj_to_be_update, ['name'], batch_size=1000)
It won't trigger the save signal for every object; instead we keep all the objects to be updated in a list and update them all in one go.
update() returns the number of rows updated in the table:
update_counts = ModelClass.objects.filter(name='bar').update(name="foo")
You can refer to this link for more information on bulk update and create:
Bulk update and Create

How to get Ruby MySQL returning a PHP-like DB SELECT result

So I use PDO for a DB connection like this:
$this->dsn[$key] = array('mysql:host=' . $creds['SRVR'] . ';dbname=' . $db, $creds['USER'], $creds['PWD']);
$this->db[$key] = new PDO(...$this->dsn[$key]);
Using PDO I can then execute a MySQL SELECT using something like this:
$sql = "SELECT * FROM table WHERE id = ?";
$st = $db->prepare($sql);
$st->execute([$id]);
$result = $st->fetchAll();
The $result variable will then contain an array of arrays where each row is given an incremental key - the first row having the array key 0. Each row is itself an array of the DB data, like this:
$result (array(2)
[0]=>[0=>1, "id"=>1, 1=>"stuff", "field1"=>"stuff", 2=>"more stuff", "field2"=>"more stuff" ...],
[1]=>[0=>2, "id"=>2, 1=>"yet more stuff", "field1"=>"yet more stuff", 2=>"even more stuff", "field2"=>"even more stuff"]);
In this example the DB table's field names would be id, field1 and field2. And the result allows you to spin through the array of data rows and then access the data using either a index (0, 1, 2) or the field name ("id", "field1", "field2"). Most of the time I prefer to access the data via the field names but access via both means is useful.
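For reference, that dual indexing is PDO's default fetch mode, PDO::FETCH_BOTH; you can also request one style explicitly:
$result = $st->fetchAll(PDO::FETCH_BOTH);  // default: index by position and by field name
$result = $st->fetchAll(PDO::FETCH_ASSOC); // field names only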
So I'm learning the ruby-mysql gem right now and I can retrieve the data from the DB. However, I cannot get the field names. I could probably extract it from the SQL statement given but that requires a fair bit of coding for error trapping and only works so long as I'm not using SELECT * FROM ... as my SELECT statement.
So I'm using a table full of State names and their abbreviations for my testing. When I use "SELECT State, Abbr FROM states" with the following code
st = @db.prepare(sql)
if where.empty?
  st.execute()
else
  st.execute(where)
end

rows = []
while row = st.fetch do
  rows << row
end
st.close
return rows
I get a result like this:
[["Alabama", "AL"], ["Alaska", "AK"], ...]
And I'm wanting a result like this:
[{0=>"Alabama", "State"=>"Alabama", 1=>"AL", "Abbr"=>"AL"}, ...]
I'm guessing I don't have the way inspect would display it quite right but I'm hoping you get the idea by now.
Any way to do this? I've seen some reference to doing this type of thing, but it appears to require the DBI module. I guess that isn't the end of the world, but is that the only way? Or can I do it with ruby-mysql alone?
I've been digging into all the methods I can find without success. Hopefully you guys can help.
Thanks
Gabe
You can do this yourself without too much effort:
expanded_rows = rows.map do |r|
  { 0 => r[0], 'State' => r[0], 1 => r[1], 'Abbr' => r[1] }
end
Or a more general approach that you could wrap up in a method:
names = ['State', 'Abbr']
expanded_rows = rows.map do |r|
  0.upto(names.length - 1).each_with_object({}) do |i, h|
    h[names[i]] = h[i] = r[i]
  end
end
So you could collect up the rows as you are now and then pump that array of arrays through something like what's above and you should get the sort of data structure you're looking for out the other side.
There are other methods on the row you get from st.fetch as well:
http://rubydoc.info/gems/mysql/2.8.1/Mysql/Result
But you'll have to experiment a little to see what exactly they return as the documentation is, um, a little thin.
You should be able to get the column names out of row or st:
http://rubydoc.info/gems/mysql/2.8.1/Mysql/Stmt
but again, you'll have to experiment to figure out the API. Sorry, I don't have anything set up to play around with the MySQL API that you're using so I can't be more specific.
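For example, something along these lines may work (untested; it assumes the statement exposes column metadata via result_metadata, mirroring the C API's mysql_stmt_result_metadata):
# column names from the statement's result metadata
names = st.result_metadata.fetch_fields.map(&:name)

rows = []
while row = st.fetch do
  h = {}
  row.each_with_index do |val, i|
    h[i] = h[names[i]] = val # index by both position and column name
  end
  rows << h
end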
I realize that php programmers are all cowboys who think using a db layer is cheating, but you should really consider activerecord.

LINQ InsertOnSubmit very slow compared to legacy SQL INSERT statement

I have a large insert job to perform, say 300,000 inserts.
If I do it the legacy way, I just write a SQL string with blocks of 100 INSERT statements and perform an ExecuteCommand against the DB every 100 records.
That comes to some 100 inserts per 3 seconds or so.
Now of course there are issues with single quotes and CrLf's within the inserted values. So rather than writing code to double the single quotes and so on, since I'm lazy I had a go with LINQ InsertOnSubmit and one context.SubmitChanges every 100 rows.
And that takes some 20x more time than the legacy way!
Why?
You're not using the right tool for the job. LINQ-to-SQL and most other ORMs (at least Entity Framework and NHibernate) are meant for OLTP scenarios; they are not meant for bulk data operations and will perform slowly when used for them.
You should be using SqlBulkCopy.
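A minimal sketch of its use (SqlBulkCopy lives in System.Data.SqlClient; the connection string and table name are placeholders, and the rows are assumed to already be in a DataTable shaped like the destination table):
using (var bulk = new SqlBulkCopy(connectionString))
{
    bulk.DestinationTableName = "dbo.MyTable";
    bulk.BatchSize = 5000;          // tune for your workload
    bulk.WriteToServer(dataTable);  // one bulk operation instead of row-by-row INSERTs
}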
I had the same issues, with InsertOnSubmit() taking a long time.
However, using the DataTableHelper class (downloadable from the link below), and changing just 1 or 2 lines of your code, you can easily use a Bulk Insert instead.
Bulk-inserts
For example:
const int RECORDS_TO_INSERT = 5000;

List<Product> recordsToBeInserted = new List<Product>();
using (NorthwindDataContext dc = new NorthwindDataContext())
{
    for (int n = 0; n < RECORDS_TO_INSERT; n++)
    {
        Product newProduct = new Product()
        {
            ProductName = "Product " + n.ToString(),
            UnitPrice = 3999,
            UnitsInStock = 2,
            UnitsOnOrder = 0,
            Discontinued = false
        };
        recordsToBeInserted.Add(newProduct);
    }

    // Insert this List<> of records into the [Products] table in our database, using a Bulk Insert
    DataTableHelper.BulkCopyToDatabase(recordsToBeInserted, "Products", dc);
}
Hope this helps.

How to create a unique random integer ID for a primary key for a table?

I was wondering if anybody knew a good way to create a unique random integer ID for a primary key for a table. I'm using MySQL, and the value has to be an integer.
In response to: "Because I want to use that value to Encode to Base62 and then use that for an id in a url. If i auto increment, it might be obvious to the user how the url id is generated."
If security is your aim, then Base62, even with a "randomly" generated number, won't help.
A better option would be to:
Do not re-invent the wheel -- use AUTO_INCREMENT
Then use a cryptographic hash function + a randomly generated string (hidden in the db for that particular url) to generate the final "unique id for that url"
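A one-line sketch of that second step in PHP (the names are illustrative; $secret is the per-url random string kept hidden in the db):
// derive the public URL id from the internal AUTO_INCREMENT id
$publicId = hash_hmac('sha256', (string) $autoIncrementId, $secret);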
If you're open to suggestions and you can implement it, use UUIDs.
MySQL's UUID() function will return a 36 chars value which can be used for ID.
If you still want to use an integer, I think you need to create a function getRandID() that you will use in the INSERT statement. This function needs to combine a random number with a check against existing ids, returning one that has not been used before.
Check out the RAND() function in MySQL.
How you generate the unique_ids is a useful question - but you seem to be making a counterproductive assumption about when you generate them!
My point is that you do not need to generate these unique id's at the time of creating your rows, because they are essentially independent of the data being inserted.
What I do is pre-generate unique id's for future use, that way I can take my own sweet time and absolutely guarantee they are unique, and there's no processing to be done at the time of the insert.
For example I have an orders table with order_id in it. This id is generated on the fly when the user enters the order, incrementally 1,2,3 etc forever. The user does not need to see this internal id.
Then I have another table - unique_ids with (order_id, unique_id). I have a routine that runs every night which pre-loads this table with enough unique_id rows to more than cover the orders that might be inserted in the next 24 hours. (If I ever get 10000 orders in one day I'll have a problem - but that would be a good problem to have!)
This approach guarantees uniqueness and takes any processing load away from the insert transaction and into the batch routine, where it does not affect the user.
How about this approach (PHP and MySQL):
Short version:
1. Generate a random number for user_id (UNIQUE)
2. Insert a row with the generated number as user_id
3. If the inserted row count equals 0, go to step 1
Looks heavy? Continue to read.
Long version:
Table:
users (user_id int UNIQUE)
Code:
<?php
// values stored in configuration
$min = 1;
$max = 1000000;

$numberOfLoops = 0;
do {
    $randomNumber = rand($min, $max);

    // the very insert
    $insertedRows = insert_to_table(
        'INSERT INTO foo_table (user_id) VALUES (:number)',
        array(
            ':number' => $randomNumber
        ));

    $numberOfLoops++;

    // the magic
    if (!isset($reported) && $numberOfLoops / 10 > 0.5) {
        /**
         * We can assume that at least 50% of the numbers
         * are already in use, so increment the values of
         * $min and $max in the configuration.
         */
        report_this_fact();
        $reported = true;
    }
} while ($insertedRows < 1);
All values ($min, $max, 0.5) are just for explanation and they have no statistical meaning.
The functions insert_to_table and report_this_fact are not built into PHP. Like the numbers, they are here purely for the purpose of explanation.
You can use an AUTO_INCREMENT for your table, but give the users the encrypted version:
encrypted_id: SELECT HEX(AES_ENCRYPT(id, 'my-private-key'));
id: SELECT AES_DECRYPT(UNHEX(encrypted_id), 'my-private-key');
My way, for both 32-bit and 64-bit platforms; the result is 64-bit:
function hexstr2decstr($hexstr){
    $bigint = gmp_init($hexstr, 16);
    $bigint_string = gmp_strval($bigint);
    return $bigint_string;
}

function generate_64bitid(){
    return substr(md5(uniqid(rand(), true)), 16, 16);
}

function dbGetUniqueXXXId(){
    for($i = 0; $i < 10; $i++){
        $decstr = hexstr2decstr(generate_64bitid());

        // check duplicate against mysql.tablexxx
        if($dup == false){
            return $decstr;
        }
    }
    return false;
}
AUTO_INCREMENT is going to be your best bet for this.
Here are some examples.
If you need to, you can adjust where the increment value starts (by default it's 1), as in the sketch below.
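For instance, a minimal sketch:
CREATE TABLE users (
    id INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    name VARCHAR(100)
) AUTO_INCREMENT = 1000;  -- optional: start the counter at 1000 instead of 1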
There is an AUTO_INCREMENT feature. I would use that.
See here for more examples.

Do MERGE using Linq to SQL

SQL Server 2008 Ent
ASP.NET MVC 2.0
Linq-to-SQL
I am building a gaming site that tracks when a particular player (toon) downed a particular monster (boss). The table looks something like:
int ToonId
int BossId
datetime LastKillTime
I use a 3rd-party service that gives me back the latest information (toon, boss, time).
Now I want to update my database with that new information.
The brute-force approach is to do a line-by-line upsert. But it looks ugly (code-wise), and it's probably slow too.
I think a better solution would be to insert the new data (using a temp table?) and then run a MERGE statement.
Is that a good idea? I know temp tables are "better to avoid". Should I create a permanent "temp" table just for this operation?
Or should I just read the entire current set (100 rows at most), do the merge, and write it back from within the application?
Any pointers/suggestions are always appreciated.
An ORM is the wrong tool for performing batch operations, and Linq-to-SQL is no exception. In this case I think you have picked the right solution: Store all entries in a temporary table quickly, then do the UPSERT using merge.
The fastest way to store the data to the temporary table is to use SqlBulkCopy to store all data to a table of your choice.
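A sketch of the MERGE step, assuming the rows were bulk-copied into a staging table named #KillsStaging (that name is an assumption):
MERGE Kills AS target
USING #KillsStaging AS source
    ON target.ToonId = source.ToonId AND target.BossId = source.BossId
WHEN MATCHED THEN
    UPDATE SET target.LastKillTime = source.LastKillTime
WHEN NOT MATCHED THEN
    INSERT (ToonId, BossId, LastKillTime)
    VALUES (source.ToonId, source.BossId, source.LastKillTime);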
If you're using Linq-to-SQL, upserts aren't that ugly:
foreach (var line in linesFromService)
{
    var kill = db.Kills.FirstOrDefault(t => t.ToonId == line.ToonId && t.BossId == line.BossId);
    if (kill == null)
    {
        kill = new Kills() { ToonId = line.ToonId, BossId = line.BossId };
        db.Kills.InsertOnSubmit(kill);
    }
    kill.LastKillTime = line.LastKillTime;
}
db.SubmitChanges();
Not a work of art, but nicer than in SQL. Also, with only 100 rows, I wouldn't be too concerned about performance.
Looks like a straightforward insert.
private ToonModel _db = new ToonModel();

Toon t = new Toon();
t.ToonId = 1;
t.BossId = 2;
t.LastKillTime = DateTime.Now;

_db.Toons.InsertOnSubmit(t);
_db.SubmitChanges();
To update without querying the records first, you can do the following. It will still hit the db once to check whether the record exists, but it will not pull the record:
var blob = new Blob { Id = "some id", Value = "some value" }; // Id is primary key (PK)

if (dbContext.Blobs.Contains(blob)) // if blob exists by PK then update
{
    // This will update all columns that are not set in the 'original' object. For
    // this to work, Blob has to have UpdateCheck=Never for all properties except
    // for primary keys. This will update the record without querying it first.
    dbContext.Blobs.Attach(blob, original: new Blob { Id = blob.Id });
}
else // insert
{
    dbContext.Blobs.InsertOnSubmit(blob);
}
dbContext.Blobs.SubmitChanges();
See here for an extension method for this.