Hash a Select SQLAlchemy query - sqlalchemy

I have a SQLAlchemy query that I build, such as :
query_one = User.query.filter(User.id == 1) # Note that I don't call .first() or .all() as I want the "select" instance.
I want to store this Select query in such a way that I can retrieve it by having the same query :
stored_queries = {}
stored_queries[hash(query_one)] = query_one
# ... later on:
query_two = User.query.filter(User.id == 1)
if hash(query_two) in stored_queries:
# Execute custom code because it's the same query
Of course, hash in that case does not work, but is there a SQLAlchemy method that works in the same way?
I thought of str(query_one), but that query only consider the request, without the value. I need both.
Thank you in advance.

You can compile the query to get access to the parameters, and use those as part of your key:
def query_key(query):
statement = query_one.statement.compile()
return str(statement), str(statement.params)
query_key(query_one)
('SELECT user.id, ... FROM user WHERE user.id = :id_1', "{'id_1': 1}")
See https://docs.sqlalchemy.org/en/14/core/selectable.html#sqlalchemy.sql.expression.TableClause.compile

Related

Stored procedure using dynamic SQL statement stored in database column

I have a table called Coupon.
This table has a column called query which holds a string.
The query string has some logical conditions in it formatted for a where statement. For example:
coupon1.query
=> " '/hats' = :url "
coupon2.query
=> " '/pants' = :url OR '/shoes' = :url "
I want to write a stored procedure that takes as input 2 parameters: a list of Coupon ids and a variable (in this example, the current URL).
I want the procedure to look up the value of the query column from each Coupon. Then it should run that string in a where statement, plugging in my other parameter (current url), then return any Coupon ids that matches.
Here's how I would expect the procedure to behave given the two coupons above.
Example 1:
* Call procedure with ids for coupon1 and coupon2, with #url = '/hats'
* Expect coupon1 to be returned.
Example 2:
* Call procedure with ids for coupon1 and coupon2, with #url = '/pants'
* Expect coupon2 to be returned.
Example 3:
* Call procedure with ids for coupon1 and coupon2, with #url = '/shirts'
* Expect no ids returned. URL does not match '/hats' for coupon1, and doesn't match '/pants or /shoes' for coupon2.
It's easy to test these out in ActiveRecord. Here is just example 1.
#url = '/hats'
#query = coupon1.query
# "'/hats' = :url"
Coupon.where(#query, url: #url).count
=> 2
# count is non-zero number because the query matches the url parameter.
# Coupon1 passes, its id would be returned from the stored procedure.
'/hats' == '/hats'
#query = coupon2.query
# " '/pants' = :url OR '/shoes' = :url "
Coupon.where(#query, url: #url).count
=> 0
# count is 0 because the query does not match the url parameter.
# Coupon2 does not pass, its id would not be returned from the stored procedure.
'/pants' != '/hats', '/shoes' != '/hats'
You could write this as a loop (I'm in ruby on rails with activerecord) but I need something that performs better - I could potentially have lots of coupons so I can't just check each one directly with a loop. The queries contain complex AND/OR logic so I can't just compare against a list of urls either. But here's some code of a loop that is essentially what I'm trying to translate into a stored procedure.
# assume coupon1 has id 1, coupon2 has id 2
#coupons = [coupon1, coupon2]
#url = '/hats'
#coupons.map do |coupon|
if Coupon.where(coupon.query, url: #url).count > 0
coupon.id
else
nil
end
end
=> [1, nil]
Ok, I've been pondering this one.
Big picture:
A. You have a #url you want to search for to find a match among many potential Coupons
B. A coupon has a URL that might match #url
If that's the true extent of the problem, I think you've really over-complicated things.
coupon1.query
=> ["/hats"]
coupon2.query
=> ["/pants", "/shoes"]
#url = '/hats'
Coupon.where('FIND_IN_SET(:url, query) <> 0')
Or something similar, I'm not a mySQL user myself.
However, this is very possible to achieve and may even have a much better ActiveRecord way to do the query.
UPDATE
Ok, I'm missing something. I can't actually reproduce this in console.
#url = '/hats'
#query = coupon1.query
# "'/hats' = :url"
Coupon.where(#query, url: #url).count
> SELECT * FROM 'coupons' WHERE ( '/hats' = '/hats' )
As you can see from the select statement, this will always return all records. It's the same as writing SELECT * FROM 'coupons' WHERE ( true )
How are you actually performing a valid query?
Sorry to post this in my answer, I wanted good formatting.
If I've got something wrong here, maybe we need to move this to a chat room.
I think you have just enough reputation for me to invite you to a room.
UPDATE2
Since you have to compare #query to each record individually, I think you'll have to loop.
But, I don't think you need to use Coupon.where to accomplish this since you are only comparing one record at a time.
#coupons.map do |coupon|
# don't bother putting nil in the array
next unless coupon.query == #url
coupon.id
end
However, your original question was about performance when scaled, and you know you aren't going to solve that with a loop.
Maybe JSONB instead of String so that you could actually do some SQL.
But, even with JSONB, this is still complicated by wanting your conditions to be evaluated properly.
{
"url": {
"AND": ["/hats", "/shoes"],
"OR": ["/pants"]
},
"logged_in": true,
"is_gold_member": false
}
{
"logged_in": false,
"url": "/hats"
}
{
"url": {
"OR": ["/pants", "/shoes"]
}
}
Ultimately, I think what you're doing with query attributes is going to continue to be your stumbling block. It's very clever, but it's not simple.
If it were my app, I think I would go back to considering my use case and try to find a different strategy to map specific coupons to specific parameters in a more on-the-rails way.

How can I use Django.db.models Q module to query multiple lines of user input text data

How would I go about using the django.db.models Q module to query multiple lines of input from a list of data using a <textarea> html input field? I can query single objects just fine using a normal html <input> field. I've tried using the same code as my input field, except when requesting the input data, I attempt to split the lines like so:
def search_list(request):
template = 'search_result.html'
query = request.GET.get('q').split('\n')
for each in query:
if each:
results = Product.objects.filter(Q(name__icontains=each))
This did not work of course. My code to query one line of data (that works) is like this:
def search(request):
template = 'search_result.html'
query = request.GET.get('q')
if query:
results = Product.objects.filter(Q(name__icontains=query))
I basically just want to search my database for a list of data users input into a list, and return all of those results with one query. Your help would be much appreciated. Thanks.
Based on your comments, you want to implement OR-logic for the given q string.
We can create such Q object by reduce-ing a list of Q objects that each specify a Q(name__icontains=...) constraint. We reduce this with a "logical or" (a pipe in Python |), like:
from django.db.models import Q
from functools import reduce
from operator import or_
def search_list(request):
template = 'search_result.html'
results = Product.objects.all()
error = None
query = request.GET.get('q')
if query:
query = query.split('\n')
else:
error = 'No query specified'
if query:
results = results.filter(
reduce(or_, (Q(name__icontains=itm.strip()) for itm in query))
)
elif not error:
error = 'Empty query'
some_context = {
'results' : results,
'error': error
}
return render(request, 'app/some_template.html', some_context)
Here we thus first check if q exists and is not the empty string. If that is the case, the error is 'No query specified'. In case there is a query, we split that query, next we check if there is at least one element in the query. If not, our error is 'Empty query' (note that this can not happen with an ordinary .split('\n'), but perhaps you postprocess the list, and for example remove the empty elements).
In case there are elements in query, we perform the reduce(..) function, and thus filter the Products.
Finally here we return a render(..)ed response with some_template.html, and a context that here contains the error, and the result.

How to use filter_by and not equals to in sqlalchemy?

I have a function defined as below to query the database table
def query_from_DB(obj, **filter):
DBSession = sessionmaker(bind=engine)
session = DBSession()
res = session.query(obj).filter_by(**filter)
session.close()
return [x for x in res]
I query the table using the request as below
query_from_DB(Router, sp_id="sp-10.1.10.149", connectivity="NO")
the above result returns the response from the DB correctly, but when I make a query using
query_from_DB(Router, sp_id!="sp-10.1.10.149", connectivity="NO")
i got an error
SyntaxError: non-keyword arg after keyword arg
What could be the possible changes I can make to get the result?
I don't believe you can use != on a keyword argument.
You can do connectivity="YES" or use the filter sqlalchemy function so you can pass == or !=, but then you won't be able to use keyword args. You'd have to pass a SQL expression like so...
res = session.query(obj).filter_by(connectivity != "NO")
This question may be helpful...
flask sqlalchemy querying a column with not equals
Did you simply try res = session.query(obj).filter_by(connectivity <> "NO") ?

How to obtain and process mysql records using Airflow?

I need to
1. run a select query on MYSQL DB and fetch the records.
2. Records are processed by python script.
I am unsure about the way I should proceed. Is xcom the way to go here? Also, MYSQLOperator only executes the query, doesn't fetch the records. Is there any inbuilt transfer operator I can use? How can I use a MYSQL hook here?
you may want to use a PythonOperator that uses the hook to get the data,
apply transformation and ship the (now scored) rows back some other place.
Can someone explain how to proceed regarding the same.
Refer - http://markmail.org/message/x6nfeo6zhjfeakfe
def do_work():
mysqlserver = MySqlHook(connection_id)
sql = "SELECT * from table where col > 100 "
row_count = mysqlserver.get_records(sql, schema='testdb')
print row_count[0][0]
callMYSQLHook = PythonOperator(
task_id='fetch_from_testdb',
python_callable=mysqlHook,
dag=dag
)
Is this the correct way to proceed?
Also how do we use xcoms to store the records for the following MySqlOperator?'
t = MySqlOperator(
conn_id='mysql_default',
task_id='basic_mysql',
sql="SELECT count(*) from table1 where id > 10",
dag=dag)
I was really struggling with this for the past 90 minutes, here is a more declarative way to follow for newcomers:
from airflow.hooks.mysql_hook import MySqlHook
def fetch_records():
request = "SELECT * FROM your_table"
mysql_hook = MySqlHook(mysql_conn_id = 'the_connection_name_sourced_from_the_ui', schema = 'specific_db')
connection = mysql_hook.get_conn()
cursor = connection.cursor()
cursor.execute(request)
sources = cursor.fetchall()
print(sources)
...your DAG() as dag: code
task = PythonOperator(
task_id = 'fetch_records',
python_callable = fetch_records
)
This returns to the logs the contents of your DB query.
I hope this is of use to someone else.
Sure, just create a hook or operator and call the get_records() method: https://airflow.apache.org/docs/apache-airflow/stable/_modules/airflow/hooks/dbapi.html

Django bulk update setting each to different values? [duplicate]

I'd like to update a table with Django - something like this in raw SQL:
update tbl_name set name = 'foo' where name = 'bar'
My first result is something like this - but that's nasty, isn't it?
list = ModelClass.objects.filter(name = 'bar')
for obj in list:
obj.name = 'foo'
obj.save()
Is there a more elegant way?
Update:
Django 2.2 version now has a bulk_update.
Old answer:
Refer to the following django documentation section
Updating multiple objects at once
In short you should be able to use:
ModelClass.objects.filter(name='bar').update(name="foo")
You can also use F objects to do things like incrementing rows:
from django.db.models import F
Entry.objects.all().update(n_pingbacks=F('n_pingbacks') + 1)
See the documentation.
However, note that:
This won't use ModelClass.save method (so if you have some logic inside it won't be triggered).
No django signals will be emitted.
You can't perform an .update() on a sliced QuerySet, it must be on an original QuerySet so you'll need to lean on the .filter() and .exclude() methods.
Consider using django-bulk-update found here on GitHub.
Install: pip install django-bulk-update
Implement: (code taken directly from projects ReadMe file)
from bulk_update.helper import bulk_update
random_names = ['Walter', 'The Dude', 'Donny', 'Jesus']
people = Person.objects.all()
for person in people:
r = random.randrange(4)
person.name = random_names[r]
bulk_update(people) # updates all columns using the default db
Update: As Marc points out in the comments this is not suitable for updating thousands of rows at once. Though it is suitable for smaller batches 10's to 100's. The size of the batch that is right for you depends on your CPU and query complexity. This tool is more like a wheel barrow than a dump truck.
Django 2.2 version now has a bulk_update method (release notes).
https://docs.djangoproject.com/en/stable/ref/models/querysets/#bulk-update
Example:
# get a pk: record dictionary of existing records
updates = YourModel.objects.filter(...).in_bulk()
....
# do something with the updates dict
....
if hasattr(YourModel.objects, 'bulk_update') and updates:
# Use the new method
YourModel.objects.bulk_update(updates.values(), [list the fields to update], batch_size=100)
else:
# The old & slow way
with transaction.atomic():
for obj in updates.values():
obj.save(update_fields=[list the fields to update])
If you want to set the same value on a collection of rows, you can use the update() method combined with any query term to update all rows in one query:
some_list = ModelClass.objects.filter(some condition).values('id')
ModelClass.objects.filter(pk__in=some_list).update(foo=bar)
If you want to update a collection of rows with different values depending on some condition, you can in best case batch the updates according to values. Let's say you have 1000 rows where you want to set a column to one of X values, then you could prepare the batches beforehand and then only run X update-queries (each essentially having the form of the first example above) + the initial SELECT-query.
If every row requires a unique value there is no way to avoid one query per update. Perhaps look into other architectures like CQRS/Event sourcing if you need performance in this latter case.
Here is a useful content which i found in internet regarding the above question
https://www.sankalpjonna.com/learn-django/running-a-bulk-update-with-django
The inefficient way
model_qs= ModelClass.objects.filter(name = 'bar')
for obj in model_qs:
obj.name = 'foo'
obj.save()
The efficient way
ModelClass.objects.filter(name = 'bar').update(name="foo") # for single value 'foo' or add loop
Using bulk_update
update_list = []
model_qs= ModelClass.objects.filter(name = 'bar')
for model_obj in model_qs:
model_obj.name = "foo" # Or what ever the value is for simplicty im providing foo only
update_list.append(model_obj)
ModelClass.objects.bulk_update(update_list,['name'])
Using an atomic transaction
from django.db import transaction
with transaction.atomic():
model_qs = ModelClass.objects.filter(name = 'bar')
for obj in model_qs:
ModelClass.objects.filter(name = 'bar').update(name="foo")
Any Up Votes ? Thanks in advance : Thank you for keep an attention ;)
To update with same value we can simply use this
ModelClass.objects.filter(name = 'bar').update(name='foo')
To update with different values
ob_list = ModelClass.objects.filter(name = 'bar')
obj_to_be_update = []
for obj in obj_list:
obj.name = "Dear "+obj.name
obj_to_be_update.append(obj)
ModelClass.objects.bulk_update(obj_to_be_update, ['name'], batch_size=1000)
It won't trigger save signal every time instead we keep all the objects to be updated on the list and trigger update signal at once.
IT returns number of objects are updated in table.
update_counts = ModelClass.objects.filter(name='bar').update(name="foo")
You can refer this link to get more information on bulk update and create.
Bulk update and Create