Iterating through multiline input, and match to database items - mysql

I need help iterating through the input to a web app I'm writing.
Users will paste in several hundred (or several thousand) URLs from Excel documents, each on a new line. So far I've created the input page and an output page, and written the code to query the database.
from flask import Flask, render_template, request
from flask_sqlalchemy import SQLAlchemy
from urllib.parse import urlparse
from sqlalchemy.ext.declarative import declarative_base
app = Flask(__name__)
app.config["DEBUG"] = True
app.config["SECRET_KEY"] = "secret_key_here"
db = SQLAlchemy(app)
SQLALCHEMY_DATABASE_URI = db.create_engine(connector_string_here)
app.config[SQLALCHEMY_DATABASE_URI] = SQLALCHEMY_DATABASE_URI
app.config["SQLALCHEMY_POOL_RECYCLE"] = 299
app.config["SQLALCHEMY_TRACK_MODIFICATIONS"] = False
db.Model = declarative_base()
class Scrapers(db.Model):
    __tablename__ = "Scrapers"
    id = db.Column(db.Integer, primary_key = True)
    scraper_dom = db.Column(db.String(255))
    scraper_id = db.Column(db.String(128))
db.Model.metadata.create_all(SQLALCHEMY_DATABASE_URI)
Session = db.sessionmaker()
Session.configure(bind=SQLALCHEMY_DATABASE_URI)
session = Session()
scrapers = session.query(Scrapers.scraper_dom, Scrapers.scraper_id).all()
@app.route("/", methods=["GET","POST"])
def index():
    if request.method == "GET":
        return render_template("url_page.html")
    else:
        return render_template("url_page.html")
@app.route("/submit", methods=["GET","POST"])
def submit():
    sites = [request.form["urls"]]
    for site in sites:
        que = urlparse(site).netloc
    return render_template("submit.html", que=que)
#scrapers.filter(Scrapers.scraper_dom.in_(
#next(x.scraper_id for x in scrapers if x.matches(self.fnetloc))
As is apparent, this is incomplete. I've omitted previous attempts at matching the input, as I realized I had issues iterating through the input. At first, I could only get it to print all of the input instead of iterating over it. And now, it prints like this:
Which is just repeating the urlparse(site).netloc for the first line of input, some random number of times. It is parsing correctly and returning the actual value I will need to use later (for each urlparse(site).netloc match scraper_dom and return associated scraper_id). Now, though, I've tried using input() but kept getting errors with [request.form["urls"]] not being an iterable.
Please help, it'd be much appreciated.
Output of sites:
New output with:
que = [urlparse(site).netloc for site in request.form["urls"].split('\n')]
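As a rough sketch of where this could go next: the pasted text can be split into lines, each line reduced to its netloc, and each netloc looked up against the known scraper domains. The SCRAPERS dictionary and the match_scrapers function below are hypothetical stand-ins (a plain dict instead of the Scrapers table), so this only illustrates the iteration and matching, not the database query. Browsers typically submit textarea line breaks as \r\n, which is why splitlines() is used here rather than split('\n').

```python
from urllib.parse import urlparse

# Hypothetical stand-in for rows of the Scrapers table: scraper_dom -> scraper_id.
SCRAPERS = {"example.com": "scraper_a", "example.org": "scraper_b"}


def match_scrapers(raw_text, scrapers):
    """Return a (netloc, scraper_id or None) pair for each non-empty line."""
    results = []
    for line in raw_text.splitlines():  # handles both \n and \r\n
        line = line.strip()
        if not line:
            continue  # skip blank lines from the paste
        # Note: netloc is empty when the line has no scheme (e.g. "example.com/a").
        netloc = urlparse(line).netloc
        results.append((netloc, scrapers.get(netloc)))
    return results


pasted = "http://example.com/a\r\nhttps://example.org/b?x=1\r\n\r\nhttp://unknown.test/c\r\n"
matches = match_scrapers(pasted, SCRAPERS)
for netloc, scraper_id in matches:
    print(netloc, scraper_id)
```

In the Flask view, raw_text would be request.form["urls"], and the dict lookup could presumably be replaced by a single query that filters on Scrapers.scraper_dom.in_(...) with the collected netlocs.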

Related

How do I store a contentfile into ImageField in Django

I am trying to convert an image uploaded by a user into a PDF and then store it in an ImageField in a MySQL database using a form, but I get an error when trying to store the PDF in the database.
My views.py is:
from django.core.files.storage import FileSystemStorage
from PIL import Image
import io
from io import BytesIO
from django.core.files.uploadedfile import InMemoryUploadedFile
from django.core.files.base import ContentFile
def formsubmit(request): #submits the form
    docs = request.FILES.getlist('photos')
    print(docs)
    section = request.POST['section']
    for x in docs:
        fs = FileSystemStorage()
        print(type(x.size))
        img = Image.open(io.BytesIO(x.read()))
        imgc = img.convert('RGB')
        pdfdata = io.BytesIO()
        imgc.save(pdfdata,format='PDF')
        thumb_file = ContentFile(pdfdata.getvalue())
        filename = fs.save('photo.pdf', thumb_file)
        linkobj = Link(link = filename.file, person = Section.objects.get(section_name = section), date = str(datetime.date.today()), time = datetime.datetime.now().strftime('%H:%M:%S'))
        linkobj.save()
        count += 1
        size += x.size
    return redirect('index')
My models.py:
class Link(models.Model):
    id = models.BigAutoField(primary_key=True)
    person = models.ForeignKey(Section, on_delete=models.CASCADE)
    link = models.ImageField(upload_to= 'images', default = None)
    date = models.CharField(max_length=80, default = None)
    time = models.CharField(max_length=80,default = None)
Error I am getting is:
AttributeError: 'str' object has no attribute 'file'
Other methods I have tried:
1) linkobj = Link(link = thumb_file, person = Section.objects.get(section_name = section), date = str(datetime.date.today()), time = datetime.datetime.now().strftime('%H:%M:%S'))
RESULT OF ABOVE METHOD:
1) Passing thumb_file doesn't throw an error, but nothing is stored in the database
Points I have noticed:
1) The file is stored properly in the media folder, i.e. I can see the PDF being saved there
How do I solve this? Thank you
You don't (basically ever) need to initialize a Storage by yourself. This holds especially true since the storage for the field might not be a FileSystemStorage at all, but could e.g. be backed by S3.
Something like
import datetime
import io
from PIL import Image
from django.core.files.base import ContentFile
from django.shortcuts import redirect
# Link and Section are the models from the question's models.py
def convert_image_to_pdf_data(image):
    # Read the upload into Pillow and re-encode it as PDF bytes
    img = Image.open(io.BytesIO(image.read()))
    imgc = img.convert("RGB")
    pdfdata = io.BytesIO()
    imgc.save(pdfdata, format="PDF")
    return pdfdata.getvalue()
def formsubmit(request):  # submits the form
    photos = request.FILES.getlist("photos")  # list of UploadedFiles
    section = request.POST["section"]
    person = Section.objects.get(section_name=section)
    date = str(datetime.date.today())
    time = datetime.datetime.now().strftime("%H:%M:%S")
    count = 0
    size = 0
    for image in photos:
        pdfdata = convert_image_to_pdf_data(image)
        # Giving the ContentFile a name lets the field save it into its own storage
        thumb_file = ContentFile(pdfdata, name="photo.pdf")
        Link.objects.create(
            link=thumb_file,
            person=person,
            date=date,
            time=time,
        )
        count += 1
        size += image.size
    return redirect("index")
should be enough here, i.e. using a ContentFile for the converted PDF content; the field should deal with saving it into the storage.
(As an aside, why are date and time stored separately as strings? Your database surely has a datetime type...)
OK, so I found an answer. To be fair, I won't accept my own answer, as it doesn't answer the question I asked exactly; it is a different method. If anyone does know the answer, please share it so that the community can benefit.
My Solution:
Instead of using ContentFile, I used InMemoryUploadedFile to hold the converted PDF and then saved it to the database (in an ImageField).
To be honest, I am not completely sure why ContentFile was not working, but when going through the documentation I found that:
The ContentFile class inherits from File, but unlike File it operates on string content (bytes also supported), rather than an actual file.
Any detailed explanation is welcome
My new views.py
import datetime
from django.core.files.storage import FileSystemStorage
from PIL import Image
import io
from io import BytesIO
from django.core.files.uploadedfile import InMemoryUploadedFile
from django.core.files.base import ContentFile
from django.shortcuts import redirect
def formsubmit(request): #submits the form
    docs = request.FILES.getlist('photos')
    print(docs)
    section = request.POST['section']
    count = 0
    size = 0
    for x in docs:
        fs = FileSystemStorage()
        print(type(x.size))
        img = Image.open(io.BytesIO(x.read()))
        imgc = img.convert('RGB')
        pdfdata = io.BytesIO()
        imgc.save(pdfdata,format='PDF')
        # Pass the real byte length of the PDF; sys.getsizeof() would measure the Python object instead
        thumb_file = InMemoryUploadedFile(pdfdata, None, 'photo.pdf', 'application/pdf', pdfdata.getbuffer().nbytes, None)
        linkobj = Link(link = thumb_file, person = Section.objects.get(section_name = section), date = str(datetime.date.today()), time = datetime.datetime.now().strftime('%H:%M:%S'))
        linkobj.save()
        count += 1
        size += x.size
    return redirect('index')
If you have a question, leave it in the comments and I'll try to answer it. Good luck!

How to receive data from a built-in function of another class in another module

I am programming a case manager (an administration system). To keep it organized, I program in separate modules. Some modules contain a class in which I build a small search engine with its own functions. The main program is the case form itself. When the search engine finds an entry, it should fill in the case form. I am able to call the search engine (and the search engine works too), but I don't know how to return the results to the main program/case form/module.
To give you a picture, I have added an image of the GUI, so you can see the case form and the search engine (which is a different module and class, inheriting from tk.Toplevel).
The relevant code (case_form/main program):
import ReferenceSearch as rs  # Own module
def search_ref(self):
    # Function to call search engine
    search_engine = rs.ReferenceSearch(self, self.csv_file.get(), self.references_list)
    # Receive data from search_engine and show it in case_form
    # DOES NOT WORK BECAUSE search_engine IS THE ACTUAL ENGINE, NOT THE
    # DATA returned from its built-in function
    self.title_var.set(search_engine)
Relevant code in ReferenceSearch module:
class ReferenceSearch(tk.Toplevel):
    def __init__(self, parent, csv_file, references_list=[]):
        super().__init__()
        self.parent = parent
        self.csv_file = csv_file
        self.references_list = references_list
        self.ref_search_entry = ttk.Entry(self.search_frame)
        self.search_but = tk.Button(self.search_frame,
                                    text=" Search ",
                                    command=lambda: self.search_for_ref(self.ref_search_entry.get()))
    def search_for_ref(self, reference, csv_file="Cases.csv"):
        # Function to read specific entry by reference
        if reference in self.references_list:
            with open(csv_file, "r", newline="") as file:
                reader = csv.DictReader(file, delimiter="|")
                for entry in reader:
                    if reference == entry["Reference"]:
                        data = entry["Title"]  # By example
                        return data
How do I receive the data from the built-in function of the ReferenceSearch class and use it in the main module, the case_form?
Keep in mind that the ReferenceSearch module is calling this function when the search button is pressed (and not the case_form module). However, the data is needed in the case_form module.
Change the ReferenceSearch module contents to:
class ReferenceSearch(tk.Toplevel):
    def __init__(self, parent, csv_file, references_list=[]):
        super().__init__()
        self.data = ""
        self.parent = parent
        self.csv_file = csv_file
        self.references_list = references_list
        self.ref_search_entry = ttk.Entry(self.search_frame)
        self.search_but = tk.Button(self.search_frame,
                                    text=" Search ",
                                    command=lambda: self.search_for_ref(self.ref_search_entry.get()))
    def search_for_ref(self, reference, csv_file="Cases.csv"):
        # Function to read specific entry by reference
        if reference in self.references_list:
            with open(csv_file, "r", newline="") as file:
                reader = csv.DictReader(file, delimiter="|")
                for entry in reader:
                    if reference == entry["Reference"]:
                        data = entry["Title"]  # By example
                        # Push the result straight into the parent's variable
                        self.parent.title_var.set(data)
and case_form contents to:
import ReferenceSearch as rs
def search_ref(self):
    # Function to call search engine
    search_engine = rs.ReferenceSearch(self, self.csv_file.get(), self.references_list)
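The answer above has the search window write directly into the parent's title_var. A looser variant of the same idea is to hand the search object a callback, so it does not need to know anything about the case form. The sketch below is GUI-free on purpose so it can run anywhere: ReferenceLookup, on_result, and the inline CSV text are hypothetical names and data, not part of the original modules.

```python
import csv
import io


class ReferenceLookup:
    """Hypothetical, GUI-free stand-in for the search part of ReferenceSearch."""

    def __init__(self, on_result):
        # on_result is called with the found title; the caller decides what to do with it.
        self.on_result = on_result

    def search_for_ref(self, reference, csv_text):
        reader = csv.DictReader(io.StringIO(csv_text), delimiter="|")
        for entry in reader:
            if entry["Reference"] == reference:
                self.on_result(entry["Title"])  # hand the data back to the caller
                return True
        return False


CSV_TEXT = "Reference|Title\nR1|First case\nR2|Second case\n"

received = []  # plays the role of the case form's title variable
lookup = ReferenceLookup(on_result=received.append)
found = lookup.search_for_ref("R2", CSV_TEXT)
missing = lookup.search_for_ref("R9", CSV_TEXT)
print(received)
```

In the Tk version, the case form could presumably pass self.title_var.set as the callback when it creates the search window, which keeps the search module reusable for other forms.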

DjangoRestFramework: how to import csv file in django-restapi?

Recently I followed this tutorial, but it was based entirely on server-side rendering, so it didn't give me what I am looking for. I want exactly the same thing, but with a REST API.
If anyone could help me with this, it would be much appreciated. Thank you in advance!
I tried it this way, but it gives me an error saying the method is not allowed:
from .resources import PersonResource
class UploadAPIView(APIView):
    def simple_upload(request):
        if request.method == 'POST':
            person_resource = PersonResource()
            dataset = Dataset()
            new_persons = request.FILES['myfile']
            imported_data = dataset.load(new_persons.read())
            result = person_resource.import_data(dataset, dry_run=True)  # Test the data import
            if not result.has_errors():
                person_resource.import_data(dataset, dry_run=False)  # Actually import now

Implementing MySQL "generated columns" on Django 1.8/1.9

I discovered the new generated columns feature of MySQL 5.7 and wanted to replace some properties of my models with this kind of column. Here is a sample model:
class Ligne_commande(models.Model):
    Quantite = models.IntegerField()
    Prix = models.DecimalField(max_digits=8, decimal_places=3)
    Discount = models.DecimalField(max_digits=5, decimal_places=3, blank=True, null=True)
    @property
    def Prix_net(self):
        if self.Discount:
            return (1 - self.Discount) * self.Prix
        return self.Prix
    @property
    def Prix_total(self):
        return self.Quantite * self.Prix_net
I defined generated field classes as subclasses of Django fields (e.g. GeneratedDecimalField as a subclass of DecimalField). This worked in a read-only context, and Django migrations handle it correctly except for one detail: MySQL generated columns do not support forward references, and Django migrations do not respect the order in which fields are defined in a model, so the migration file must be edited to reorder the operations.
After that, trying to create or modify an instance returned the MySQL error: 'error totally whack'. I suppose Django tries to write to the generated field and MySQL doesn't like that. After taking a look at the Django code, I realized that, at the lowest level, Django uses the _meta.local_concrete_fields list and sends it to MySQL. Removing the generated fields from this list fixed the problem.
I encountered another problem: while an instance is being modified, generated fields don't reflect the changes made to the fields from which they are computed. If generated fields are used during instance modification, as in my case, this is problematic. To fix that, I created a "generated field descriptor".
Here is the final code of all of this.
Creation of generated fields in the model, replacing the properties defined above:
Prix_net = mk_generated_field(models.DecimalField, max_digits=8, decimal_places=3,
                              sql_expr='if(Discount is null, Prix, (1.0 - Discount) * Prix)',
                              pyfunc=lambda x: x.Prix if not x.Discount else (1 - x.Discount) * x.Prix)
Prix_total = mk_generated_field(models.DecimalField, max_digits=10, decimal_places=2,
                                sql_expr='Prix_net * Quantite',
                                pyfunc=lambda x: x.Prix_net * x.Quantite)
Function that creates generated fields. Classes are dynamically created for simplicity:
import re
from django.db.models import fields
def mk_generated_field(field_klass, *args, sql_expr=None, pyfunc=None, **kwargs):
    assert issubclass(field_klass, fields.Field)
    assert sql_expr
    generated_name = 'Generated' + field_klass.__name__
    try:
        generated_klass = globals()[generated_name]
    except KeyError:
        # First use for this field type: create the subclass and attach its methods
        globals()[generated_name] = generated_klass = type(generated_name, (field_klass,), {})
        def __init__(self, sql_expr, pyfunc=None, *args, **kwargs):
            self.sql_expr = sql_expr
            self.pyfunc = pyfunc
            self.is_generated = True  # mark the field
            # null must be True otherwise migration will ask for a default value
            kwargs.update(null=True, editable=False)
            super(generated_klass, self).__init__(*args, **kwargs)
        def db_type(self, connection):
            assert connection.settings_dict['ENGINE'] == 'django.db.backends.mysql'
            result = super(generated_klass, self).db_type(connection)
            # double single '%' if any because it will clash with later Django format
            sql_expr = re.sub('(?<!%)%(?!%)', '%%', self.sql_expr)
            result += ' GENERATED ALWAYS AS (%s)' % sql_expr
            return result
        def deconstruct(self):
            name, path, args, kwargs = super(generated_klass, self).deconstruct()
            kwargs.update(sql_expr=self.sql_expr)
            return name, path, args, kwargs
        generated_klass.__init__ = __init__
        generated_klass.db_type = db_type
        generated_klass.deconstruct = deconstruct
    return generated_klass(sql_expr, pyfunc, *args, **kwargs)
The function to register generated fields in a model. It must be called at django start-up, for example in the ready method of the AppConfig of the application.
from django.utils.datastructures import ImmutableList
def register_generated_fields(model):
    local_concrete_fields = list(model._meta.local_concrete_fields[:])
    generated_fields = []
    for field in model._meta.fields:
        if hasattr(field, 'is_generated'):
            local_concrete_fields.remove(field)
            generated_fields.append(field)
            if field.pyfunc:
                setattr(model, field.name, GeneratedFieldDescriptor(field.pyfunc))
    if generated_fields:
        model._meta.local_concrete_fields = ImmutableList(local_concrete_fields)
And the descriptor. Note that it is used only if a pyfunc is defined for the field.
class GeneratedFieldDescriptor(object):
    attr_prefix = '_GFD_'
    def __init__(self, pyfunc, name=None):
        self.pyfunc = pyfunc
        self.nickname = self.attr_prefix + (name or str(id(self)))
    def __get__(self, instance, owner):
        if instance is None:
            return self
        if hasattr(instance, self.nickname) and not instance.has_changed:
            return getattr(instance, self.nickname)
        return self.pyfunc(instance)
    def __set__(self, instance, value):
        setattr(instance, self.nickname, value)
    def __delete__(self, instance):
        delattr(instance, self.nickname)
Note the instance.has_changed attribute, which must tell whether the instance is being modified. I found a solution for this here.
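To make the descriptor's behaviour concrete, here is a small sketch that exercises it outside Django. The Line class, its price and quantity attributes, and the plain boolean has_changed flag are hypothetical stand-ins for a model instance and its change tracking; only GeneratedFieldDescriptor itself is taken from the code above.

```python
class GeneratedFieldDescriptor(object):
    attr_prefix = '_GFD_'

    def __init__(self, pyfunc, name=None):
        self.pyfunc = pyfunc
        self.nickname = self.attr_prefix + (name or str(id(self)))

    def __get__(self, instance, owner):
        if instance is None:
            return self
        if hasattr(instance, self.nickname) and not instance.has_changed:
            return getattr(instance, self.nickname)
        return self.pyfunc(instance)

    def __set__(self, instance, value):
        setattr(instance, self.nickname, value)

    def __delete__(self, instance):
        delattr(instance, self.nickname)


class Line(object):
    """Hypothetical plain-Python stand-in for a model instance."""
    total = GeneratedFieldDescriptor(lambda x: x.price * x.quantity, name='total')

    def __init__(self, price, quantity):
        self.price = price
        self.quantity = quantity
        self.has_changed = False  # stand-in for the real change tracking


line = Line(price=10, quantity=3)
line.total = 30           # simulates the value loaded from the database
line.quantity = 5         # a source field is edited...
stale = line.total        # ...but the stored value is still returned
line.has_changed = True   # once the instance is flagged as modified
fresh = line.total        # the value is recomputed by pyfunc
print(stale, fresh)
```

The stored value is served until the instance reports a change, after which every read goes through pyfunc, which is the behaviour the descriptor was introduced for.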
I have tested my application extensively and it works fine, but I am far from using all of Django's functionality. My question is: could this setup clash with some use cases of Django?

"Class already has a primary mapper defined" error with SQLAlchemy

Back in October 2010, I posted this question to the Sqlalchemy user list.
At the time, I just used the clear_mappers workaround mentioned in the message, and didn't try to figure out what the problem was. That was very naughty of me. Today I ran into this bug again, and decided to construct a minimal example, which appears below. Michael also addressed what is probably the same issue back in 2006. I decided to follow up here, to give Michael a break from my dumb questions.
So, the upshot appears to be that for a given class definition, you can't have more than one mapper defined. In my case I have the Pheno class declared in module scope (I assume that is top level scope here) and each time make_tables runs, it tries to define another mapper.
Mike wrote "Based on the description of the problem above, you need to ensure your Python classes are declared in the same scope as your mappers. The error message you're getting suggests that 'Pheno' is declared at the module level." That would take care of the problem, but how do I manage that, without altering my current structure? What other options do I have, if any? Apparently mapper doesn't have an option like "if the mapper is already defined, exit without doing anything", which would take care of it nicely. I guess I could define a wrapper function, but that would be pretty ugly.
from sqlalchemy import *
from sqlalchemy.orm import *

def make_pheno_table(meta, schema, name='pheno'):
    pheno_table = Table(
        name, meta,
        Column('patientid', String(60), primary_key=True),
        schema=schema,
        )
    return pheno_table

class Pheno(object):
    def __init__(self, patientid):
        self.patientid = patientid

def make_tables(schema):
    from sqlalchemy import MetaData
    meta = MetaData()
    pheno_table = make_pheno_table(meta, schema)
    mapper(Pheno, pheno_table)
    table_dict = {'metadata': meta, 'pheno_table':pheno_table}
    return table_dict

table_dict = make_tables('foo')
table_dict = make_tables('bar')
Error message follows. Tested with SQLAlchemy 0.6.3-3 on Debian squeeze.
$ python test.py
Traceback (most recent call last):
  File "test.py", line 25, in <module>
    table_dict = make_tables('bar')
  File "test.py", line 20, in make_tables
    mapper(Pheno, pheno_table)
  File "/usr/lib/python2.6/dist-packages/sqlalchemy/orm/__init__.py", line 818, in mapper
    return Mapper(class_, local_table, *args, **params)
  File "/usr/lib/python2.6/dist-packages/sqlalchemy/orm/mapper.py", line 209, in __init__
    self._configure_class_instrumentation()
  File "/usr/lib/python2.6/dist-packages/sqlalchemy/orm/mapper.py", line 381, in _configure_class_instrumentation
    self.class_)
sqlalchemy.exc.ArgumentError: Class '<class '__main__.Pheno'>' already has a primary mapper defined. Use non_primary=True to create a non primary Mapper. clear_mappers() will remove *all* current mappers from all classes.
EDIT: Per the documentation in SQLAlchemy: The mapper() API, I could replace mapper(Pheno, pheno_table) above with
from sqlalchemy.orm.exc import UnmappedClassError
try:
    class_mapper(Pheno)
except UnmappedClassError:
    mapper(Pheno, pheno_table)
If a mapper is not defined for Pheno, it throws an UnmappedClassError. This at least doesn't return an error in my test script, but I haven't checked if it actually works. Comments?
EDIT2: Per Denis's suggestion, the following works:
class Tables(object):
    def make_tables(self, schema):
        class Pheno(object):
            def __init__(self, patientid):
                self.patientid = patientid
        from sqlalchemy import MetaData
        from sqlalchemy.orm.exc import UnmappedClassError
        meta = MetaData()
        pheno_table = make_pheno_table(meta, schema)
        mapper(Pheno, pheno_table)
        table_dict = {'metadata': meta, 'pheno_table':pheno_table, 'Pheno':Pheno}
        return table_dict
table_dict = Tables().make_tables('foo')
table_dict = Tables().make_tables('bar')
However, the superficially similar
# does not work
class Tables(object):
    class Pheno(object):
        def __init__(self, patientid):
            self.patientid = patientid
    def make_tables(self, schema):
        from sqlalchemy import MetaData
        from sqlalchemy.orm.exc import UnmappedClassError
        meta = MetaData()
        pheno_table = make_pheno_table(meta, schema)
        mapper(self.Pheno, pheno_table)
        table_dict = {'metadata': meta, 'pheno_table':pheno_table, 'Pheno':self.Pheno}
        return table_dict
table_dict = Tables().make_tables('foo')
table_dict = Tables().make_tables('bar')
does not. I get the same error message as before.
I don't really understand the scoping issues well enough to say why.
Isn't the Pheno class in both cases in some kind of local scope?
You are trying to map the same class Pheno to 2 different tables. SQLAlchemy allows only one primary mapper for each class, so that it knows which table to use for session.query(Pheno). It's not clear from your question what you wish to achieve, so I can't propose a specific solution. There are 2 obvious options:
1) define a separate class to map to the second table,
2) create a non-primary mapper for the second table by passing the non_primary=True parameter, and pass it (the value returned by the mapper() function) to session.query() instead of the class.
Update: to define separate class for each table you can put its definition into the make_tables():
def make_tables(schema):
    from sqlalchemy import MetaData
    meta = MetaData()
    pheno_table = make_pheno_table(meta, schema)
    class Pheno(object):
        def __init__(self, patientid):
            self.patientid = patientid
    mapper(Pheno, pheno_table)
    table_dict = {'metadata': meta,
                  'pheno_class': Pheno,
                  'pheno_table':pheno_table}
    return table_dict
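On the question of why a class defined inside make_tables works while a class nested in the Tables class body does not: the difference can be shown without SQLAlchemy at all. A class statement inside a function body runs on every call and produces a new class object each time, whereas a class statement in a class body runs once, so every instance shares the same class object. PerCall and Shared below are hypothetical names used only for this illustration.

```python
class PerCall(object):
    def make(self):
        # This class statement runs on every call, so each call builds a new class object.
        class Pheno(object):
            pass
        return Pheno


class Shared(object):
    # This class statement runs once, when the body of Shared is executed.
    class Pheno(object):
        pass

    def make(self):
        # self.Pheno resolves to the single class object stored on Shared.
        return self.Pheno


per_call_distinct = PerCall().make() is not PerCall().make()
shared_same = Shared().make() is Shared().make()
print(per_call_distinct, shared_same)
```

In the second layout, both calls to make_tables therefore hand the very same Pheno class to mapper(), which is the situation the "already has a primary mapper defined" error describes.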
Maybe I didn't quite understand what you want, but this recipe creates identical columns in tables with different __tablename__ values:
from sqlalchemy import Column, Integer, String, create_engine
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import sessionmaker
Base = declarative_base()
class TBase(object):
    """Base class is a 'mixin'.
    Guidelines for declarative mixins is at:
    http://www.sqlalchemy.org/docs/orm/extensions/declarative.html#mixin-classes
    """
    id = Column(Integer, primary_key=True)
    data = Column(String(50))
    def __repr__(self):
        return "%s(data=%r)" % (
            self.__class__.__name__, self.data
        )
class T1Foo(TBase, Base):
    __tablename__ = 't1'
class T2Foo(TBase, Base):
    __tablename__ = 't2'
engine = create_engine('sqlite:///foo.db', echo=True)
Base.metadata.create_all(engine)
sess = sessionmaker(engine)()
sess.add_all([T1Foo(data='t1'), T1Foo(data='t2'), T2Foo(data='t3'),
              T1Foo(data='t4')])
print(sess.query(T1Foo).all())
print(sess.query(T2Foo).all())
sess.commit()
More info is in the SQLAlchemy examples.