Representation of arrays in MySQL - mysql

We are doing a project which is using the django framework with MySQL database. I wanted to make an array in the models by using
CommaSeparatedIntegerField.
eg:
class MyModel(models.Model):
values = CommaSeparatedIntegerField(max_length = 200)
How will this be represented in MySQL?

There are no arrays in MySQL. If you store array as a comma separated string, then you will have problems with selecting, modifying data and optimization.
So, I'd suggest you to store items in table's rows.

You might be better off avoiding that: https://en.wikipedia.org/wiki/1NF

I had to do something of this sort some time back and what i did was used the way suggested here.
http://justcramer.com/2008/08/08/custom-fields-in-django/
from django.db import models
class SeparatedValuesField(models.TextField):
__metaclass__ = models.SubfieldBase
def __init__(self, *args, **kwargs):
self.token = kwargs.pop('token', ',')
super(SeparatedValuesField, self).__init__(*args, **kwargs)
def to_python(self, value):
if not value: return
if isinstance(value, list):
return value
return value.split(self.token)
def get_db_prep_value(self, value):
if not value: return
assert(isinstance(value, list) or isinstance(value, tuple))
return self.token.join([unicode(s) for s in value])
def value_to_string(self, obj):
value = self._get_val_from_obj(obj)
return self.get_db_prep_value(value)

Related

difference between Dataset and TensorDataset in pyTorch

what is the difference between "torch.utils.data.TensorDataset" and "torch.utils.data.Dataset" - the docs are not clear about that and I could not find any answers on google.
The Dataset class is an abstract class that is used to define new types of (customs) datasets. Instead, the TensorDataset is a ready to use class to represent your data as list of tensors.
You can define your custom dataset in the following way:
class CustomDataset(torch.utils.data.Dataset):
def __init__(self, *args, **kwargs):
super().__init__(*args, **kwargs)
# Your code
self.instances = your_data
def __getitem__(self, idx):
return self.instances[idx] # In case you stored your data on a list called instances
def __len__(self):
return len(self.instances)
If you just want to create a dataset that contains tensors for input features and labels, then use the TensorDataset directly:
dataset = TensorDataset(input_features, labels)
Note that input_features and labels must match on the length of the first dimension.

Consuming JSON API data with Django

From a Django app, I am able to consume data from a separate Restful API, but what about filtering? Below returns all books and its data. But what if I want to grab only books by an author, date, etc.? I want to pass an author's name parameter, e.g. .../authors-name or /?author=name and return only those in the json response. Is this possible?
views.py
def get_books(request):
response = requests.get('http://localhost:8090/book/list/').json()
return render(request, 'books.html', {'response':response})
So is there a way to filter like a model object?
I can think of three ways of doing this:
Python's filter could be used with a bit of additional code.
QueryableList, which is the closest to an ORM for lists I've seen.
query-filter, which takes a more functional approach.
1. Build-in filter function
You can write a function that returns functions that tell you whether a list element is a match and the pass the generated function into filter.
def filter_pred_factory(**kwargs):
def predicate(item):
for key, value in kwargs.items():
if key not in item or item[key] != value:
return False
return True
return predicate
def get_books(request):
books_data = requests.get('http://localhost:8090/book/list/').json()
pred = filter_pred_factory(**request.GET)
data_filter = filter(pred, books_data)
# data_filter is cast to a list as a precaution
# because it is a filter object,
# which can only be iterated through once before it's exhausted.
filtered_data = list(data_filter)
return render(request, 'books.html', {'books': filtered_data})
2. QueryableList
QueryableList would achieve the same as the above, with some extra features. As well as /books?isbn=1933988673, you could use queries like /books?longDescription__icontains=linux. You can find other functionality here
from QueryableList import QueryableListDicts
def get_books(request):
books_data = requests.get('http://localhost:8090/book/list/').json()
queryable_books = QueryableListDicts(books_data)
filtered_data = queryable_books.filter(**request.GET)
return render(request, 'books.html', {'books':filtered_data})
3. query-filter
query-filter has similar features but doesn't copy the object-orient approach of an ORM.
from query_filter import q_filter, q_items
def get_books(request):
books_data = requests.get('http://localhost:8090/book/list/').json()
data_filter = q_filter(books_data, q_items(**request.GET))
# filtered_data is cast to a list as a precaution
# because q_filter returns a filter object,
# which can only be iterated through once before it's exhausted.
filtered_data = list(data_filter)
return render(request, 'books.html', {'books': filtered_data})
It's worth mentioning that I wrote query-filter.

Implementing MySQL "generated columns" on Django 1.8/1.9

I discovered the new generated columns functionality of MySQL 5.7, and wanted to replace some properties of my models by those kind of columns. Here is a sample of a model:
class Ligne_commande(models.Model):
Quantite = models.IntegerField()
Prix = models.DecimalField(max_digits=8, decimal_places=3)
Discount = models.DecimalField(max_digits=5, decimal_places=3, blank=True, null=True)
#property
def Prix_net(self):
if self.Discount:
return (1 - self.Discount) * self.Prix
return self.Prix
#property
def Prix_total(self):
return self.Quantite * self.Prix_net
I defined generated field classes as subclasses of Django fields (e.g. GeneratedDecimalField as a subclass of DecimalField). This worked in a read-only context and Django migrations handles it correctly, except a detail : generated columns of MySQL does not support forward references and django migrations does not respect the order the fields are defined in a model, so the migration file must be edited to reorder operations.
After that, trying to create or modify an instance returned the mysql error : 'error totally whack'. I suppose Django tries to write generated field and MySQL doesn't like that. After taking a look to django code I realized that, at the lowest level, django uses the _meta.local_concrete_fields list and send it to MySQL. Removing the generated fields from this list fixed the problem.
I encountered another problem: during the modification of an instance, generated fields don't reflect the change that have been made to the fields from which they are computed. If generated fields are used during instance modification, as in my case, this is problematic. To fix that point, I created a "generated field descriptor".
Here is the final code of all of this.
Creation of generated fields in the model, replacing the properties defined above:
Prix_net = mk_generated_field(models.DecimalField, max_digits=8, decimal_places=3,
sql_expr='if(Discount is null, Prix, (1.0 - Discount) * Prix)',
pyfunc=lambda x: x.Prix if not x.Discount else (1 - x.Discount) * x.Prix)
Prix_total = mk_generated_field(models.DecimalField, max_digits=10, decimal_places=2,
sql_expr='Prix_net * Quantite',
pyfunc=lambda x: x.Prix_net * x.Quantite)
Function that creates generated fields. Classes are dynamically created for simplicity:
from django.db.models import fields
def mk_generated_field(field_klass, *args, sql_expr=None, pyfunc=None, **kwargs):
assert issubclass(field_klass, fields.Field)
assert sql_expr
generated_name = 'Generated' + field_klass.__name__
try:
generated_klass = globals()[generated_name]
except KeyError:
globals()[generated_name] = generated_klass = type(generated_name, (field_klass,), {})
def __init__(self, sql_expr, pyfunc=None, *args, **kwargs):
self.sql_expr = sql_expr
self.pyfunc = pyfunc
self.is_generated = True # mark the field
# null must be True otherwise migration will ask for a default value
kwargs.update(null=True, editable=False)
super(generated_klass, self).__init__(*args, **kwargs)
def db_type(self, connection):
assert connection.settings_dict['ENGINE'] == 'django.db.backends.mysql'
result = super(generated_klass, self).db_type(connection)
# double single '%' if any because it will clash with later Django format
sql_expr = re.sub('(?<!%)%(?!%)', '%%', self.sql_expr)
result += ' GENERATED ALWAYS AS (%s)' % sql_expr
return result
def deconstruct(self):
name, path, args, kwargs = super(generated_klass, self).deconstruct()
kwargs.update(sql_expr=self.sql_expr)
return name, path, args, kwargs
generated_klass.__init__ = __init__
generated_klass.db_type = db_type
generated_klass.deconstruct = deconstruct
return generated_klass(sql_expr, pyfunc, *args, **kwargs)
The function to register generated fields in a model. It must be called at django start-up, for example in the ready method of the AppConfig of the application.
from django.utils.datastructures import ImmutableList
def register_generated_fields(model):
local_concrete_fields = list(model._meta.local_concrete_fields[:])
generated_fields = []
for field in model._meta.fields:
if hasattr(field, 'is_generated'):
local_concrete_fields.remove(field)
generated_fields.append(field)
if field.pyfunc:
setattr(model, field.name, GeneratedFieldDescriptor(field.pyfunc))
if generated_fields:
model._meta.local_concrete_fields = ImmutableList(local_concrete_fields)
And the descriptor. Note that it is used only if a pyfunc is defined for the field.
class GeneratedFieldDescriptor(object):
attr_prefix = '_GFD_'
def __init__(self, pyfunc, name=None):
self.pyfunc = pyfunc
self.nickname = self.attr_prefix + (name or str(id(self)))
def __get__(self, instance, owner):
if instance is None:
return self
if hasattr(instance, self.nickname) and not instance.has_changed:
return getattr(instance, self.nickname)
return self.pyfunc(instance)
def __set__(self, instance, value):
setattr(instance, self.nickname, value)
def __delete__(self, instance):
delattr(instance, self.nickname)
Note the instance.has_changed that must tell if the instance is being modified. If found a solution for this here.
I have done extensive tests of my application and it works fine, but I am far from using all django functionalities. My question is: could this settings clash with some use cases of django ?

sqlalchemy: how to block updates on a specific column

I have a declarative mapping:
class User(base):
username = Column(Unicode(30), unique=True)
How can I tell sqlalchemy that this attribute may not be modified?
The workaround I came up with is kindof hacky:
from werkzeug.utils import cached_property
# regular #property works, too
class User(base):
_username = Column('username', Unicode(30), unique=True)
#cached_property
def username(self):
return self._username
def __init__(self, username, **kw):
super(User,self).__init__(**kw)
self._username=username
Doing this on the database column permission level will not work because not all databases support that.
You can use the validates SQLAlchemy feature.
from sqlalchemy.orm import validates
...
class User(base):
...
#validates('username')
def validates_username(self, key, value):
if self.username: # Field already exists
raise ValueError('Username cannot be modified.')
return value
reference: https://docs.sqlalchemy.org/en/13/orm/mapped_attributes.html#simple-validators
I can suggest the following ways do protect column from modification:
First is using hook when any attribute is being set:
In case above all column in all tables of Base declarative will be hooked, so you need somehow to store information about whether column can be modified or not. For example you could inherit sqlalchemy.Column class to add some attribute to it and then check attribute in the hook.
class Column(sqlalchemy.Column):
def __init__(self, *args, **kwargs):
self.readonly = kwargs.pop("readonly", False)
super(Column, self).__init__(*args, **kwargs)
# noinspection PyUnusedLocal
#event.listens_for(Base, 'attribute_instrument')
def configure_listener(class_, key, inst):
"""This event is called whenever an attribute on a class is instrumented"""
if not hasattr(inst.property, 'columns'):
return
# noinspection PyUnusedLocal
#event.listens_for(inst, "set", retval=True)
def set_column_value(instance, value, oldvalue, initiator):
"""This event is called whenever a "set" occurs on that instrumented attribute"""
logging.info("%s: %s -> %s" % (inst.property.columns[0], oldvalue, value))
column = inst.property.columns[0]
# CHECK HERE ON CAN COLUMN BE MODIFIED IF NO RAISE ERROR
if not column.readonly:
raise RuntimeError("Column %s can't be changed!" % column.name)
return value
To hook concrete attributes you can do the next way (adding attribute to column not required):
# standard decorator style
#event.listens_for(SomeClass.some_attribute, 'set')
def receive_set(target, value, oldvalue, initiator):
"listen for the 'set' event"
# ... (event handling logic) ...
Here is guide about SQLAlchemy events.
Second way that I can suggest is using standard Python property or SQLAlchemy hybrid_property as you have shown in your question, but using this approach result in code growing.
P.S. I suppose that compact way is add attribute to column and hook all set event.
Slight correction to #AlexQueue answer.
#validates('username')
def validates_username(self, key, value):
if self.username and self.username != value: # Field already exists
raise ValueError('Username cannot be modified.')
return value

Random ids in sqlalchemy (pylons)

I'm using pylons and sqlalchemy and I was wondering how I could have some randoms ids as primary_key.
the best way is to use randomly generated UUIDs:
import uuid
id = uuid.uuid4()
uuid datatypes are available natively in some databases such as Postgresql (SQLAlchemy has a native PG uuid datatype for this purpose - in 0.5 its called sqlalchemy.databases.postgres.PGUuid). You should also be able to store a uuid in any 16 byte CHAR field (though I haven't tried this specifically on MySQL or others).
i use this pattern and it works pretty good. source
from sqlalchemy import types
from sqlalchemy.databases.mysql import MSBinary
from sqlalchemy.schema import Column
import uuid
class UUID(types.TypeDecorator):
impl = MSBinary
def __init__(self):
self.impl.length = 16
types.TypeDecorator.__init__(self,length=self.impl.length)
def process_bind_param(self,value,dialect=None):
if value and isinstance(value,uuid.UUID):
return value.bytes
elif value and not isinstance(value,uuid.UUID):
raise ValueError,'value %s is not a valid uuid.UUID' % value
else:
return None
def process_result_value(self,value,dialect=None):
if value:
return uuid.UUID(bytes=value)
else:
return None
def is_mutable(self):
return False
id_column_name = "id"
def id_column():
import uuid
return Column(id_column_name,UUID(),primary_key=True,default=uuid.uuid4)
#usage
my_table = Table('test',metadata,id_column(),Column('parent_id',UUID(),ForeignKey(table_parent.c.id)))
Though zzzeek I believe is the author of sqlalchemy, so if this is wrong he would know, and I would listen to him.
Or with ORM mapping:
import uuid
from sqlalchemy import Column, Integer, String, Boolean
def uuid_gen():
return str(uuid.uuid4())
Base = declarative_base()
class Device(Base):
id = Column(String, primary_key=True, default=uuid_gen)
This stores it as a string providing better database compatibility. However, you lose the database's ability to more optimally store and use the uuid.