Terser syntax for ON DUPLICATE KEY UPDATE only if one condition is true? - mysql

I know I can do something like
ON DUPLICATE KEY UPDATE
exampleColumn1 = IF (exampleCondition = 1, VALUES(exampleColumn1), exampleColumn1),
exampleColumn2 = IF (exampleCondition = 1, VALUES(exampleColumn2), exampleColumn2),
exampleColumn3 = IF (exampleCondition = 1, VALUES(exampleColumn3), exampleColumn3),
# Etc etc
But since I'm doing the exact same condition check every time, surely there's a terser way to write this? It would be especially nice in those cases where the condition is a bit longer to write out.
From what I've read, it seems possible if I wrote a function for this, but can I do it in a plain old query?

In a nutshell: no. The ON DUPLICATE KEY UPDATE syntax does not support any form of filtering (a WHERE condition or the like).
If your condition is lengthy, however, one trick is to store its result in a user variable that you can reuse in the following assignments:
on duplicate key update
exampleColumn1 = if(@do_update := (exampleCondition = 1), values(exampleColumn1), exampleColumn1),
exampleColumn2 = if(@do_update, values(exampleColumn2), exampleColumn2),
exampleColumn3 = if(@do_update, values(exampleColumn3), exampleColumn3)
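If the column list is long, the clause itself can be generated by application code so the condition is only typed once. Below is a minimal sketch in Python; build_conditional_update is a hypothetical helper (not part of any library), the user variable is written @do_update, and it assumes the column names and the condition are trusted text written by the developer. Like the hand-written version, it assumes MySQL evaluates the first assignment before the others.

```python
def build_conditional_update(columns, condition, flag="@do_update"):
    """Build an ON DUPLICATE KEY UPDATE clause that spells out `condition` once.

    Hypothetical helper: `columns`, `condition` and `flag` are assumed to be
    trusted SQL text written by the developer; nothing is escaped here.
    """
    if not columns:
        raise ValueError("at least one column is required")
    assignments = []
    for index, column in enumerate(columns):
        # Only the first assignment evaluates the condition and stores it in
        # the user variable; the remaining assignments just read the variable.
        test = f"{flag} := ({condition})" if index == 0 else flag
        assignments.append(f"{column} = IF({test}, VALUES({column}), {column})")
    return "ON DUPLICATE KEY UPDATE\n" + ",\n".join(assignments)


clause = build_conditional_update(
    ["exampleColumn1", "exampleColumn2", "exampleColumn3"],
    "exampleCondition = 1",
)
print(clause)
```

The generated text can be appended to the INSERT statement before it is sent to the server; values should still be passed as query parameters rather than formatted into the string.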

Related

Django: ORM/SQL query speed significantly decreased after adding additional BooleanField or (SQL tinyint) to Django Filter

Using MySQL Latest Django:
I have a vaguely complex Django query that works quite quickly--until I add an additional "AND" with a Boolean Field--
See Below:
queriedForms = queryFormtype.form_set.filter(is_public=True)
newQuery = queriedForms.filter(formrecordattributevalue__record_value__icontains=term['TVAL'], formrecordattributevalue__record_attribute_type__pk=rtypePK)
newQuery = newQuery.filter(flagged_for_deletion=False)
logger.info(newQuery.query)
term['count'] = newQuery.count()
If I remove either the initial "is_public=True" or the final "flagged_for_deletion=False", it works incredibly fast. If I use both as filters, the time taken by count() increases by something like 2000%.
The different QuerySet.query outputs are below:
SELECT `maqluengine_form`.`id`, `maqluengine_form`.`form_name`, `maqluengine_form`.`form_number`, `maqluengine_form`.`form_geojson_string`, `maqluengine_form`.`hierarchy_parent_id`, `maqluengine_form`.`is_public`, `maqluengine_form`.`project_id`, `maqluengine_form`.`date_created`, `maqluengine_form`.`created_by_id`, `maqluengine_form`.`date_last_modified`, `maqluengine_form`.`modified_by_id`, `maqluengine_form`.`sort_index`, `maqluengine_form`.`form_type_id`, `maqluengine_form`.`flagged_for_deletion` FROM `maqluengine_form` INNER JOIN `maqluengine_formrecordattributevalue` ON (`maqluengine_form`.`id` = `maqluengine_formrecordattributevalue`.`form_parent_id`) WHERE (`maqluengine_form`.`form_type_id` = 319 AND `maqluengine_form`.`is_public` = True AND `maqluengine_formrecordattributevalue`.`record_value` LIKE %seal% AND `maqluengine_formrecordattributevalue`.`record_attribute_type_id` = 18510 AND `maqluengine_form`.`flagged_for_deletion` = False)
SELECT `maqluengine_form`.`id`, `maqluengine_form`.`form_name`, `maqluengine_form`.`form_number`, `maqluengine_form`.`form_geojson_string`, `maqluengine_form`.`hierarchy_parent_id`, `maqluengine_form`.`is_public`, `maqluengine_form`.`project_id`, `maqluengine_form`.`date_created`, `maqluengine_form`.`created_by_id`, `maqluengine_form`.`date_last_modified`, `maqluengine_form`.`modified_by_id`, `maqluengine_form`.`sort_index`, `maqluengine_form`.`form_type_id`, `maqluengine_form`.`flagged_for_deletion` FROM `maqluengine_form` INNER JOIN `maqluengine_formrecordattributevalue` ON (`maqluengine_form`.`id` = `maqluengine_formrecordattributevalue`.`form_parent_id`) WHERE (`maqluengine_form`.`form_type_id` = 319 AND `maqluengine_form`.`is_public` = True AND `maqluengine_formrecordattributevalue`.`record_value` LIKE %seal% AND `maqluengine_formrecordattributevalue`.`record_attribute_type_id` = 18510)
The first takes about 20 to 30 seconds to perform the count(), while the second, with only one of the two BooleanFields, takes less than a second.
=======================================
EDIT=======================
Apologies: since the question isn't obvious enough--why is adding an additional AND with a BooleanField increasing the query time by +2000%? Is anyone able to assist in figuring out WHY that's occurring. Thanks.
EDIT=========================
I also discovered that using exclude(is_public=False) rather than filter(is_public=True) has the same effect as the solution below. Does anyone happen to know why exclude() works fine whereas filter() does not?
==============================
Solution I came up with after a night's rest:
--I keep the query as is(I need it for later because it continues getting chain filtered)
--I need the count() from this stage--which is taking substantially longer than it should with the additional BooleanField AND
--I take a temporary values list to perform a len() on instead:
queriedForms = queryFormtype.form_set.all()
newQuery = queriedForms.filter(formrecordattributevalue__record_value__icontains=term['TVAL'], formrecordattributevalue__record_attribute_type__pk=rtypePK)
newQuery = newQuery.filter(flagged_for_deletion=False)
tempQuery = newQuery.values_list('is_public',flat=True)
finalQuery = [entry for entry in tempQuery if entry]  # Keep only True values (values_list yields booleans, so comparing with the string 'False' filters nothing)
term['count'] = len(finalQuery)
The following counts that use chained filters after use the same technique--it's significantly faster--if not as fast as removing one of the Booleans from the filters.

Django ORM query field weight?

I'm doing the following query:
People.objects.filter(
Q(name__icontains='carolina'),
Q(state__icontains='carolina'),
Q(address__icontains='carolina'),
)[:9]
I want the first results of the query to be the people who are named "Carolina" (and who also match other fields, but name first). The problem is that I don't think there is any way to give a field a "weight" or "priority".
Any idea?
Thanks!
You'll need to do 3 queries for this to work:
names_match = People.objects.filter(name__icontains='carolina')[:9]
states_match = People.objects.filter(state__icontains='carolina')[:9]
addresses_match = People.objects.filter(address__icontains='carolina')[:9]
all_objects = list(names_match) + list(states_match) + list(addresses_match)
all_objects = all_objects[:9]
There are two problems with this approach, which are fairly easily worked round:
It does unnecessary queries (what if names_match contained enough items already).
It allows for duplicates (what if someone in North Carolina is called Carolina?)
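Both problems can be worked around by merging the result sets lazily and skipping anything already seen. The following is only a sketch: merge_ranked is a hypothetical helper, and plain lists with made-up names stand in for the three querysets. With real model instances you would probably pass key=lambda p: p.pk, and because Django querysets are evaluated only when iterated, a later queryset is never run if the earlier ones already fill the limit.

```python
from itertools import islice


def merge_ranked(groups, limit, key=lambda item: item):
    """Return up to `limit` unique items, taking earlier groups first."""
    def unique_items():
        seen = set()
        for group in groups:            # groups are ordered by priority
            for item in group:
                identity = key(item)
                if identity not in seen:
                    seen.add(identity)
                    yield item
    # islice stops pulling items (and therefore groups) once the limit is hit.
    return list(islice(unique_items(), limit))


# Plain lists stand in for the three querysets; the data is made up.
names_match = ["Carolina Diaz", "Carolina Smith"]
states_match = ["Carolina Smith", "Bob Jones"]   # Carolina Smith matches twice
addresses_match = ["Ann Lee", "Bob Jones"]

top = merge_ranked([names_match, states_match, addresses_match], limit=3)
print(top)
```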
This should work:
qs = People.objects.filter(name__icontains='carolina') | People.objects.filter( Q(state__icontains = 'carolina'), Q(address__icontains='carolina')).distinct()
qs = list(qs)[:9]
Or if you want a pure duplicate free list:
qs = list(set(qs))[:9] #for a duplicate free list

How to sort var length ids (composite string + numeric)?

I have a MySQL database whose keys are of this type:
A_10
A_10A
A_10B
A_101
QAb801
QAc5
QAc25
QAd2993
I would like them to sort first by the alpha portion, then by the numeric portion, just like above. I would like this to be the default sorting of this column.
1) how can I sort as specified above, i.e. write a MySQL function?
2) how can I set this column to use the sorting routine by default?
some constraints that might be helpful: the numeric portion of my ID's never exceeds 100,000. I use this fact in some javascript code to convert my ID's to strings concatenating the non-numeric portion with the (number + 1,000,000). (At the time I had not noticed the variations/subparts as above such as A_10A, A_10B, so I'll have to revamp that part of my code.)
The best way to achieve what you want is to store each part in its own column, and I would strongly recommend changing the table structure. If that's impossible, you can try the following:
Create 3 UDFs which return the prefix, the numeric part, and the postfix of your string. For better performance they should be native (MySQL, like any other RDBMS, is not really good at complex string parsing). Then you can call these functions in an ORDER BY clause or in the body of a trigger which validates your column. In any case, it will be slower than creating 3 columns.
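To illustrate the prefix / number / postfix split outside the database, here is a rough sketch in Python. split_id is a hypothetical helper, and it assumes every ID has at most one leading non-digit prefix followed by one run of digits, as in the sample keys from the question; the same three parts could be what you store in the three columns.

```python
import re

# Non-digit prefix, then a run of digits, then whatever is left.
_ID_PARTS = re.compile(r"(\D*)(\d*)(.*)", re.DOTALL)


def split_id(identifier):
    """Split an ID into (prefix, number, suffix), e.g. 'A_10B' -> ('A_', 10, 'B')."""
    prefix, digits, suffix = _ID_PARTS.fullmatch(identifier).groups()
    # An ID without digits is treated as if its number were 0.
    return prefix, int(digits) if digits else 0, suffix


ids = ["QAc25", "A_101", "QAd2993", "A_10B", "QAc5", "A_10", "QAb801", "A_10A"]
print(sorted(ids, key=split_id))
```

Sorting on the tuple compares the prefix as text, the number as an integer, and the suffix as text, which gives the order shown in the question.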
No simple answer that I know of. I had something similar a while back but had to use jQuery to sort it. What I did was first get the output into a JavaScript array. Then you may want to zero-pad your numbers. Separate the alpha parts from the numeric parts using a regex, then reassemble the array:
var zarr = [];
for (var i = 0; i < val.length; i++) {
    // Split the ID into alternating runs of digits and non-digits.
    var chunks = val[i].match(/(\d+|[^\d]+)/g);
    var padded = '';
    for (var s = 0; s < chunks.length; s++) {
        // Zero-pad numeric chunks to 5 digits; keep other chunks unchanged.
        padded += isNaN(chunks[s]) ? chunks[s] : zeroPad(chunks[s], 5);
    }
    // Reassemble the chunks into one sortable string per ID.
    zarr.push(padded);
}
function zeroPad(num, count) {
    var numZeropad = num + '';
    while (numZeropad.length < count) {
        numZeropad = '0' + numZeropad;
    }
    return numZeropad;
}
You'll end up with an array like this:
A_00100
QAb00801
QAc00005
QAc00025
QAd02993
Then you can do a natural sort. I know you may want to do it in straight MySQL, but I am not too sure whether it does natural sorting.
Good luck!

Sqlalchemy: Produce OR-clause with multiple filter()-Calls

I'm new to sqlalchemy and could use some help.
I'm trying to write a small application for which I have to dynamically change a select statement. So I do s = select([files]), and then I add filters with s = s.where(files.c.createtime.between(val1, val2)).
This works great, but only with an AND conjunction.
So, when I want all entries with createtime (between 1.1.2009 and 1.2.2009) OR createtime == 5.2.2009, the problem is that I don't know how to achieve this with separate filter calls. Because of the program's logic, it's not possible to use s = s.where(or_(files.c.createtime.between(val1, val2), files.c.createtime == DateTime('2009-02-01')))
Thanks in advance,
Christof
You can build or clauses dynamically from lists:
clauses = []
if cond1:
    clauses.append(files.c.createtime.between(val1, val2))
if cond2:
    clauses.append(files.c.createtime == DateTime('2009-02-01'))
if clauses:
    s = s.where(or_(*clauses))
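The same collect-then-combine idea can be tried without SQLAlchemy. The sketch below swaps in plain SQL strings and the standard library's sqlite3 module, so it is not SQLAlchemy code: select_files is a hypothetical helper, the table contents are made up, and each clause is kept as a (SQL fragment, parameters) pair until the query is finally built.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE files (name TEXT, createtime TEXT)")
conn.executemany(
    "INSERT INTO files VALUES (?, ?)",
    [("a", "2009-01-15"), ("b", "2009-02-05"), ("c", "2009-03-01")],
)


def select_files(conn, clauses):
    """Select file names, OR-ing together (sql_fragment, params) pairs."""
    sql = "SELECT name FROM files"
    params = []
    if clauses:
        # Parenthesise each fragment so the OR cannot change its meaning.
        sql += " WHERE " + " OR ".join(f"({fragment})" for fragment, _ in clauses)
        for _, clause_params in clauses:
            params.extend(clause_params)
    return [row[0] for row in conn.execute(sql + " ORDER BY name", params)]


# Clauses can be appended from different places in the program.
clauses = []
clauses.append(("createtime BETWEEN ? AND ?", ["2009-01-01", "2009-02-01"]))
clauses.append(("createtime = ?", ["2009-02-05"]))
print(select_files(conn, clauses))
```

Keeping the clauses in a list until the last moment is what makes the OR possible; once a WHERE has been applied to the statement, further conditions can only be AND-ed onto it.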
If you're willing to "cheat" by making use of the undocumented _whereclause attribute on Select objects, you can incrementally specify a series of OR terms by building a new query each time based on the previous query's where clause:
s = select([files]).where(literal(False))  # Start with an empty query.
s = select(s.froms).where(or_(s._whereclause,
                              files.c.createtime.between(val1, val2)))
s = select(s.froms).where(or_(s._whereclause,
                              files.c.createtime == datetime(2009, 2, 1)))
Building up a union is another option. This is a bit clunkier, but doesn't rely on undocumented attributes:
s = select([files]).where(literal(False))  # Start with an empty query.
s = s.select().union(
    select([files]).where(files.c.createtime.between(val1, val2)))
s = s.select().union(
    select([files]).where(files.c.createtime == datetime(2009, 2, 1)))

Correlate 2 columns in SQL

SELECT ica.CORP_ID, ica.CORP_IDB, ica.ITEM_ID, ica.ITEM_IDB,
ica.EXP_ACCT_NO, ica.SUB_ACCT_NO, ica.PAT_CHRG_NO, ica.PAT_CHRG_PRICE,
ica.TAX_JUR_ID, ica.TAX_JUR_IDB, ITEM_PROFILE.COMDTY_NAME
FROM ITEM_CORP_ACCT ica
,ITEM_PROFILE
WHERE (ica.CORP_ID = 1000)
AND (ica.CORP_IDB = 4051)
AND (ica.ITEM_ID = 1000)
AND (ica.ITEM_IDB = 4051)
AND ica.EXP_ACCT_NO = ITEM_PROFILE.EXP_ACCT_NO
I'm basically trying to say that since the exp account code is '801500', the name should return "Miscellaneous Medic...".
It seems as if what you are showing is not possible. Have you edited the data in the editor? You are joining using ica.EXP_ACCT_NO = ITEM_PROFILE.EXP_ACCT_NO. Therefore, every entry with EXP_ACCT_NO = 801500 should also have the same COMDTY_NAME.
However, it could be the case that your IDs are not actually numbers and that they are strings with whitespace (801500__ vs 801500 ). But since you are not performing a left-outer join, it would also mean you have an entry in ITEM_PROFILE with the same whitespace.
You also need to properly normalize your table data (unless this is a view) but it still means you have erroneous data.
Try to perform the same query, but using the TRIM function to remove whitespace: https://stackoverflow.com/a/6858168/1688441
Example:
SELECT ica.CORP_ID, ica.CORP_IDB, ica.ITEM_ID, ica.ITEM_IDB,
ica.EXP_ACCT_NO, ica.SUB_ACCT_NO, ica.PAT_CHRG_NO, ica.PAT_CHRG_PRICE,
ica.TAX_JUR_ID, ica.TAX_JUR_IDB, ITEM_PROFILE.COMDTY_NAME
FROM ITEM_CORP_ACCT ica
,ITEM_PROFILE
WHERE (ica.CORP_ID = 1000)
AND (ica.CORP_IDB = 4051)
AND (ica.ITEM_ID = 1000)
AND (ica.ITEM_IDB = 4051)
AND trim(ica.EXP_ACCT_NO) = trim(ITEM_PROFILE.EXP_ACCT_NO);
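To see the effect of stray whitespace on a join, here is a small sketch using Python's sqlite3 module with an in-memory database. The two cut-down tables and the 'Sample commodity' value are made up for illustration, and string comparison rules differ between database engines, so treat it as a demonstration of the idea rather than of your exact database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE item_corp_acct (item_id INTEGER, exp_acct_no TEXT);
    CREATE TABLE item_profile (exp_acct_no TEXT, comdty_name TEXT);
    INSERT INTO item_corp_acct VALUES (1, '801500  ');  -- trailing spaces
    INSERT INTO item_profile VALUES ('801500', 'Sample commodity');
""")

# Plain equality: the trailing spaces make the two keys differ.
plain = conn.execute(
    "SELECT p.comdty_name FROM item_corp_acct a "
    "JOIN item_profile p ON a.exp_acct_no = p.exp_acct_no"
).fetchall()

# Trimming both sides removes the whitespace before comparing.
trimmed = conn.execute(
    "SELECT p.comdty_name FROM item_corp_acct a "
    "JOIN item_profile p ON trim(a.exp_acct_no) = trim(p.exp_acct_no)"
).fetchall()

print(plain)
print(trimmed)
```

If the trimmed join returns different rows than the plain one, the stored account numbers contain whitespace and should be cleaned up at the source.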