Django query on related model - mysql

I have models like the ones below:
class Scheduler(models.Model):
    id = <this is primary key>
    last_run = <referencing to id in RunLogs below>
class RunLogs(models.Model):
    id = <primary key>
    scheduler = <referencing to id in Scheduler above>
    overall_status = <String>
A RunLogs entry is created only when the scheduler reaches the scheduled time of the job.
Now I am querying RunLogs to show running schedules as below.
current = RunLogs.objects \
    .filter(Q(overall_status__in=("RUNNING", "ON-HOLD", "QUEUED")) |
            Q(scheduler__last_run__isnull=True))
The above query gives me all records with a matching status from RunLogs, but it does not give me records from Scheduler where last_run is null.
I understand why the query behaves this way, but is there a way to also get the records from Scheduler where last_run is null?

I followed the same steps as you and found the reason why you were getting all the records after running your query. Here are the exact steps and a solution.
Steps
Created models
from django.db import models
class ResourceLog(models.Model):
    id = models.BigIntegerField(primary_key=True)
    resource_mgmt = models.ForeignKey('ResourceMgmt', on_delete=models.DO_NOTHING,
                                      related_name='cpe_log_resource_mgmt')
    overall_status = models.CharField(max_length=8, blank=True, null=True)
class ResourceMgmt(models.Model):
    id = models.BigIntegerField(primary_key=True)
    last_run = models.ForeignKey(ResourceLog, on_delete=models.DO_NOTHING, blank=True, null=True)
Added the data as following:
resource_log
+----+----------------+------------------+
| id | overall_status | resource_mgmt_id |
+----+----------------+------------------+
| 1 | RUNNING | 1 |
| 2 | QUEUED | 1 |
| 3 | QUEUED | 1 |
+----+----------------+------------------+
resource_mgmt
+----+-------------+
| id | last_run_id |
+----+-------------+
| 1 | NULL |
| 2 | NULL |
| 3 | NULL |
| 4 | 3 |
+----+-------------+
According to the tables above, resource_mgmt(4) refers to resource_log(3). Note, however, that resource_log(3) does not refer back to resource_mgmt(4); it refers to resource_mgmt(1).
Ran the following command in python shell
In [1]: resource_log1 = ResourceLog.objects.get(id=1)
In [2]: resource_log1.resource_mgmt
Out[2]: <ResourceMgmt: ResourceMgmt object (1)>
In [3]: resource_log1 = ResourceLog.objects.get(id=2)
In [4]: resource_log1.resource_mgmt
Out[4]: <ResourceMgmt: ResourceMgmt object (1)>
In [5]: resource_log1 = ResourceLog.objects.get(id=3)
In [6]: resource_log1.resource_mgmt
Out[6]: <ResourceMgmt: ResourceMgmt object (1)>
From this we can see that all the resource_log objects refer to the first resource_mgmt object (i.e., id=1).
Q) Why do all the objects refer to the first object in resource_mgmt?
resource_mgmt is a foreign key field that cannot be null. Its default value here is 1, so when you create a resource_log object without specifying resource_mgmt, the default value of 1 is stored.
Run your query
In [60]: ResourceLog.objects.filter(resource_mgmt__last_run__isnull = True)
Out[60]: <QuerySet [<ResourceLog: ResourceLog object (1)>, <ResourceLog: ResourceLog object (2)>, <ResourceLog: ResourceLog object (3)>]>
This query returns all three ResourceLog objects because all three refer to the first resource_mgmt object, whose last_run is NULL.
Solution
You actually want to check the reverse relationship.
We can achieve this using two queries:
rm_ids = ResourceMgmt.objects.exclude(last_run=None).values_list('last_run', flat=True)
current = ResourceLog.objects.filter(overall_status__in=("RUNNING", "QUEUED")).exclude(id__in=rm_ids)
The output is:
<QuerySet [<ResourceLog: ResourceLog object (1)>, <ResourceLog: ResourceLog object (2)>]>
Hope that helps!
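To make the join behaviour concrete, here is a minimal sketch that loads the same sample rows into an in-memory SQLite database and runs plain SQL. The two statements are only an approximation of what the ORM queries above translate to (an assumption, not captured Django output), and SQLite stands in for MySQL so the example is self-contained.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE resource_mgmt (id INTEGER PRIMARY KEY, last_run_id INTEGER);
CREATE TABLE resource_log (
    id INTEGER PRIMARY KEY,
    overall_status TEXT,
    resource_mgmt_id INTEGER NOT NULL
);
INSERT INTO resource_mgmt VALUES (1, NULL), (2, NULL), (3, NULL), (4, 3);
INSERT INTO resource_log VALUES (1, 'RUNNING', 1), (2, 'QUEUED', 1), (3, 'QUEUED', 1);
""")

# Approximation of filter(resource_mgmt__last_run__isnull=True):
# the lookup follows each log's own foreign key to its resource_mgmt row.
forward_sql = """
SELECT l.id
FROM resource_log AS l
JOIN resource_mgmt AS m ON l.resource_mgmt_id = m.id
WHERE m.last_run_id IS NULL
ORDER BY l.id
"""
forward_ids = [row[0] for row in conn.execute(forward_sql)]

# Approximation of the two-query solution: filter by status, then exclude
# every log that some resource_mgmt row already points to as its last_run.
two_step_sql = """
SELECT id
FROM resource_log
WHERE overall_status IN ('RUNNING', 'QUEUED')
  AND id NOT IN (SELECT last_run_id FROM resource_mgmt
                 WHERE last_run_id IS NOT NULL)
ORDER BY id
"""
two_step_ids = [row[0] for row in conn.execute(two_step_sql)]

print(forward_ids)   # ids matched through the forward relationship
print(two_step_ids)  # ids left after excluding referenced logs
```

With the sample data, the first query matches every log because all of them point at resource_mgmt(1), while the second drops resource_log(3), the only row referenced by a last_run_id.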

Related

SQLAlchemy - filtering rows before today with autoloaded DATETIME column?

I have a MARIADB database radio_progs with a table FUTUREEPISODE. I'm using SQLAlchemy and trying to add a function that selects all entries in the table that are before today.
I'm having problems with the datetime field though. Is this because I'm autoloading the fields? In my real-world example I have a number of columns, so I would prefer to autoload them rather than specify each one individually.
The error is:
eps = self.query.filter_by(IN_LIST=1, EP_ENDTIME < todays_datetime).all()
^
SyntaxError: positional argument follows keyword argument
The table has the following columns
| Column | Type |
| ---------- | ---------- |
| ID | int(11) |
| EP_ENDTIME | datetime |
| IN_LIST | tinyint(1) |
from datetime import datetime
from sqlalchemy import and_, func
from .dbmgr import db
class FutureEpisode(db.Model):
    __bind_key__ = 'radio_progs'
    __tablename__ = 'FUTUREEPISODE'
    __table_args__ = {
        'autoload': True,
        'autoload_with': db.engine
    }
    def get_expired(self):
        todays_datetime = datetime(datetime.today().year, datetime.today().month, datetime.today().day)
        eps = self.query.filter_by(IN_LIST=1, EP_ENDTIME < todays_datetime).all()
        return eps
Using filter rather than filter_by works, i.e. changing the query to:
eps = self.query.filter(FutureEpisode.IN_LIST==1, FutureEpisode.EP_ENDTIME < todays_datetime).all()
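The error is raised by the Python parser before SQLAlchemy or the autoloaded columns are involved at all. filter_by only takes keyword arguments of the form name=value, so the comparison is read as a positional argument placed after a keyword argument. A minimal sketch using only the standard library shows this; the names query and Model in the strings are placeholders that are parsed but never executed.

```python
import ast

# Placeholder call expressions; they are only parsed, never run.
bad_call = "query.filter_by(IN_LIST=1, EP_ENDTIME < todays_datetime)"
good_call = "query.filter(Model.IN_LIST == 1, Model.EP_ENDTIME < todays_datetime)"

def parses(source):
    """Return True if source is a syntactically valid Python expression."""
    try:
        ast.parse(source, mode="eval")
        return True
    except SyntaxError:
        return False

bad_ok = parses(bad_call)    # comparison after a keyword argument
good_ok = parses(good_call)  # every argument is a positional expression

print(bad_ok, good_ok)
```

This is why switching to filter with explicit column expressions resolves the problem regardless of how the table was loaded.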

spark rdd filter by query mysql

I use Spark Streaming to stream data from Kafka, and I want to filter that data based on data stored in MySQL.
For example, I get data from Kafka like this:
{"id":1, "data":"abcdefg"}
and there is data in MySQL like this:
id | state
1 | "success"
I need to query MySQL to get the state for a given id.
I can open a connection to MySQL inside the filter function, and it works. The code looks like this:
def isSuccess(x):
    id = x["id"]
    sql = """
        SELECT *
        FROM Test
        WHERE id = "{0}"
    """.format(id)
    conn = mysql_connection(......)
    result = rdbi.query_one(sql)
    if result == None:
        return False
    else:
        return True
successRDD = rdd.filter(isSuccess)
But this opens a connection for every row of the RDD, which wastes a lot of computing resources.
How can I do this inside the filter?
I suggest using mapPartitions, available in Apache Spark, so that the MySQL connection is initialized once per partition instead of once per row.
This is the MySQL table that I created:
create table test2(id varchar(10), state varchar(10));
With the following values:
+------+---------+
| id | state |
+------+---------+
| 1 | success |
| 2 | stopped |
+------+---------+
Use the following PySpark Code as reference:
import MySQLdb
# Sample data; in your case this would be the streaming data.
data1 = [["1", "afdasds"], ["2", "dfsdfada"], ["3", "dsfdsf"]]
rdd = sc.parallelize(data1)
def func1(rows):
    # Called once per partition, so only one connection is opened per partition.
    con = MySQLdb.connect(host="127.0.0.1", user="root", passwd="yourpassword", db="yourdb")
    c = con.cursor()
    c.execute("select * from test2;")
    # Build an id -> state lookup from the table.
    states = {}
    for row in c.fetchall():
        states[row[0]] = row[1]
    con.close()
    list1 = []
    for x in rows:
        # "none" is assigned when the streamed id has no match in the table.
        list1.append([x[0], x[1], states.get(x[0], "none")])
    return iter(list1)
print(rdd.mapPartitions(func1).filter(lambda x: x[2] != "none").collect())
The output that I got was:
[['1', 'afdasds', 'success'], ['2', 'dfsdfada', 'stopped']]
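The saving comes from how often the connection is opened. The sketch below simulates that outside Spark: plain Python lists stand in for RDD partitions and sqlite3 stands in for MySQL, both assumptions made only so the example runs on its own. The helper names open_connection and lookup_partition are hypothetical.

```python
import sqlite3

def open_connection(counter):
    """Open a stand-in database and count how many connections were created."""
    counter["opened"] += 1
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE test2 (id TEXT, state TEXT)")
    conn.executemany("INSERT INTO test2 VALUES (?, ?)",
                     [("1", "success"), ("2", "stopped")])
    return conn

def lookup_partition(rows, counter):
    """Same shape as a mapPartitions function: iterator in, iterator out."""
    conn = open_connection(counter)  # once per partition, not per record
    try:
        states = dict(conn.execute("SELECT id, state FROM test2"))
    finally:
        conn.close()
    for record_id, payload in rows:
        # Keep only records whose id exists in the table.
        if record_id in states:
            yield [record_id, payload, states[record_id]]

# Two simulated partitions holding three records in total.
partitions = [[("1", "afdasds"), ("2", "dfsdfada")], [("3", "dsfdsf")]]
counter = {"opened": 0}
result = [row for part in partitions
          for row in lookup_partition(iter(part), counter)]

print(result)
print(counter["opened"])
```

Three records are processed but only two connections are opened, one per simulated partition. In real Spark code the same function body would be passed to rdd.mapPartitions.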

django query return primary key related column values

I'm new to using databases and making django queries to get information.
If I have a table with id as the primary key, and ages and height as other columns, what query would bring me back a dictionary of all the ids and the related ages?
For instance if my table looks like below:
special_id | ages | heights
1 | 5 | x1
2 | 10 | x2
3 | 15 | x3
I'd like to have a key-value pair like {special_id: ages} where special_id is also the primary key.
Is this possible?
Try this:
from django.http import JsonResponse
def get_json(request):
    result = MyModel.objects.all().values('id', 'ages')  # or simply .values() to get all fields
    result_list = list(result)  # important: convert the QuerySet to a list object
    return JsonResponse(result_list, safe=False)
You will get a list of dictionaries in the classic form:
{field_name: field_value}
And if you want {id_value: ages_value} instead, you can do:
from django.http import JsonResponse
def get_json(request):
    result = MyModel.objects.all()
    a = {}
    for item in result:
        a[item.id] = item.ages
    return JsonResponse(a)
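The loop above builds a dictionary from (id, ages) pairs, which can also be done with dict(). In the sketch below a hard-coded list of tuples stands in for what MyModel.objects.values_list('id', 'ages') would yield (an assumption, since no database is available here). It also shows a detail worth knowing before returning the result as JSON: integer keys are serialized as strings.

```python
import json

# Stand-in for the (id, ages) tuples a values_list('id', 'ages') query would yield.
rows = [(1, 5), (2, 10), (3, 15)]

# dict() accepts any iterable of (key, value) pairs.
ages_by_id = dict(rows)

# JSON object keys are always strings, so the integer ids are converted.
as_json = json.dumps(ages_by_id)

print(ages_by_id)
print(as_json)
```

If the client needs numeric ids, it has to convert the keys back after parsing the response.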

Summarizing/aggregating a Scala Slick object into another

I'm essentially trying to recreate the following SQL query using Scala Slick:
select labelOne, labelTwo, sum(countA), sum(countB) from things where date > 'blah' group by labelOne, labelTwo;
As you can see, it takes a table of labeled things and aggregates them, summing various counts. A table with the following info:
ID | date | labelOne | labelTwo | countA | countB
-------------------------------------------------
0 | 0 | foo | cheese | 1 | 2
1 | 0 | bar | wine | 0 | 3
2 | 1 | foo | cheese | 3 | 4
3 | 1 | bar | wine | 2 | 1
4 | 2 | foo | beer | 1 | 1
Should yield the following result if queried across all dates:
labelOne | labelTwo | countA | countB
-------------------------------------
foo | cheese | 4 | 6
bar | wine | 2 | 4
foo | beer | 1 | 1
This is what my Scala code looks like:
import scala.slick.driver.MySQLDriver.simple._
import scala.slick.jdbc.StaticQuery
import StaticQuery.interpolation
import org.joda.time.LocalDate
import com.github.tototoshi.slick.JodaSupport._
case class Thing(
  id: Option[Long],
  date: LocalDate,
  labelOne: String,
  labelTwo: String,
  countA: Long,
  countB: Long)
// summarized version of "Thing": note there's no date in this object
// each distinct grouping of Thing.labelOne + Thing.labelTwo should become a "SummarizedThing", with summed counts
case class SummarizedThing(
  labelOne: String,
  labelTwo: String,
  countASum: Long,
  countBSum: Long)
trait ThingsComponent {
  val Things: Things
  class Things extends Table[Thing]("things") {
    def id = column[Long]("id", O.PrimaryKey, O.AutoInc)
    def date = column[LocalDate]("date", O.NotNull)
    def labelOne = column[String]("labelOne", O.NotNull)
    def labelTwo = column[String]("labelTwo", O.NotNull)
    def countA = column[Long]("countA", O.NotNull)
    def countB = column[Long]("countB", O.NotNull)
    def * = id.? ~ date ~ labelOne ~ labelTwo ~ countA ~ countB <> (Thing.apply _, Thing.unapply _)
    val byId = createFinderBy(_.id)
  }
}
object Things extends DAO {
  def insert(thing: Thing)(implicit s: Session) { Things.insert(thing) }
  def findById(id: Long)(implicit s: Session): Option[Thing] = Things.byId(id).firstOption
  // ???
  def summarizeSince(date: LocalDate)(implicit s: Session): Set[SummarizedThing] = {
    Query(Things).where(_.date > date).groupBy(x => (x.labelOne, x.labelTwo)).map {
      case(thing: Thing) => {
        // obviously this line below is wrong, but you can get an idea of what I'm trying to accomplish:
        // create a new SummarizedThing for each unique labelOne + labelTwo combo, summing the count columns
        new SummarizedThing(thing.labelOne, thing.labelTwo, thing.countA.sum, thing.countB.sum)
      }
    } // presumably need to run the query and map to SummarizedThing here, perhaps?
  }
}
The summarizeSince function is where I'm having trouble. I seem to be able to query Things just fine, filtering by date, and grouping by my fields... however, I'm having trouble summing countA and countB. With the summed results, I'd then like to create a SummarizedThing for each unique labelOne + labelTwo combination. Hopefully that makes sense. Any help would be greatly appreciated.
presumably need to run the query and map to SummarizedThing here, perhaps?
Exactly.
Query(Things).filter(_.date > date).groupBy(x => (x.labelOne, x.labelTwo)).map {
  // match on (key,group)
  case ((labelOne, labelTwo), things) => {
    // prepare results as tuple (note .sum returns an Option)
    (labelOne, labelTwo, things.map(_.countA).sum.get, things.map(_.countB).sum.get)
  }
}.run.map(SummarizedThing.tupled) // run and map tuple into case class
This is the same as the other answer, but expressed as a for comprehension. Note that .get throws an exception when the Option is empty, so you probably want getOrElse instead.
val q = for {
  ((l1, l2), ts) <- Things.where(_.date > date).groupBy(t => (t.labelOne, t.labelTwo))
} yield (l1, l2, ts.map(_.countA).sum.getOrElse(0L), ts.map(_.countB).sum.getOrElse(0L))
// see the SQL that generates.
println( q.selectStatement )
// select x2.`labelOne`, x2.`labelTwo`, sum(x2.`countA`), sum(x2.`countB`)
// from `things` x2 where x2.`date` > '2013' group by x2.`labelOne`, x2.`labelTwo`
// map the result(s) of your query to your case class
q.map(SummarizedThing.tupled).list
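As a cross-check of what the grouped query should return, the aggregation can be sketched independently of Slick. The Python below is not Slick code. It applies the same filter, group and sum steps to the sample rows from the question, with the integer date column from the sample table standing in for LocalDate. The function name summarize_since is hypothetical.

```python
from collections import defaultdict

# Sample rows from the question: (id, date, labelOne, labelTwo, countA, countB)
things = [
    (0, 0, "foo", "cheese", 1, 2),
    (1, 0, "bar", "wine", 0, 3),
    (2, 1, "foo", "cheese", 3, 4),
    (3, 1, "bar", "wine", 2, 1),
    (4, 2, "foo", "beer", 1, 1),
]

def summarize_since(rows, min_date):
    """Group rows with date > min_date by (labelOne, labelTwo) and sum both counts."""
    sums = defaultdict(lambda: [0, 0])
    for _id, date, label_one, label_two, count_a, count_b in rows:
        if date > min_date:
            sums[(label_one, label_two)][0] += count_a
            sums[(label_one, label_two)][1] += count_b
    return {key: tuple(value) for key, value in sums.items()}

all_dates = summarize_since(things, -1)  # -1 keeps every sample row
after_day_zero = summarize_since(things, 0)

print(all_dates)
print(after_day_zero)
```

The result for all dates matches the expected table in the question, which is a useful sanity check on the generated SQL.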

mysql recursive self join

create table test(
container varchar(1),
contained varchar(1)
);
insert into test values('X','A');
insert into test values('X','B');
insert into test values('X','C');
insert into test values('Y','D');
insert into test values('Y','E');
insert into test values('Y','F');
insert into test values('A','P');
insert into test values('P','Q');
insert into test values('Q','R');
insert into test values('R','Y');
insert into test values('Y','X');
select * from test;
mysql> select * from test;
+-----------+-----------+
| container | contained |
+-----------+-----------+
| X | A |
| X | B |
| X | C |
| Y | D |
| Y | E |
| Y | F |
| A | P |
| P | Q |
| Q | R |
| R | Y |
| Y | X |
+-----------+-----------+
11 rows in set (0.00 sec)
Can I find out all the distinct values contained under 'X' using a single self join?
EDIT
Like, Here
X contains A, B and C
A contains P
P contains Q
Q contains R
R contains Y
Y contains D, E and F...
So I want to display A,B,C,D,E,F,P,Q,R,Y when I query for X.
EDIT
Got it right by programming.
package com.catgen.helper;
import java.sql.Connection;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;
import com.catgen.factories.Nm2NmFactory;
public class Nm2NmHelper {
    private List<String> fetched;
    private List<String> fresh;
    public List<String> findAllContainedNMByMarketId(Connection conn, String marketId) throws SQLException{
        fetched = new ArrayList<String>();
        fresh = new ArrayList<String>();
        fresh.add(marketId.toLowerCase());
        while(fresh.size()>0){
            fetched.add(fresh.get(0).toLowerCase());
            fresh.remove(0);
            List<String> tempList = Nm2NmFactory.getContainedNmByContainerNm(conn, fetched.get(fetched.size()-1));
            if(tempList!=null){
                for(int i=0;i<tempList.size();i++){
                    String current = tempList.get(i).toLowerCase();
                    if(!fetched.contains(current) && !fresh.contains(current)){
                        fresh.add(current);
                    }
                }
            }
        }
        return fetched;
    }
}
Not the same table and fields though. But I hope you get the concept.
Thanks guys.
You can't get all the contained objects recursively using a single join with that data structure. You would need a recursive query, which MySQL did not support before version 8.0 (8.0 added recursive common table expressions via WITH RECURSIVE).
You could, however, construct a closure table; then you can do it with a simple query. See Bill Karwin's slideshow Models for hierarchical data for more details and other approaches (for example, nested sets). Slide 69 compares the different designs for ease of implementing 'Query subtree'. Your chosen design (adjacency list) is the most awkward of the four designs for this type of query.
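For reference, here is a minimal sketch of what such a recursive query looks like on the adjacency list from the question. It runs against SQLite through Python's sqlite3 module so that it is self-contained. Whether a given MySQL installation accepts the same statement unchanged is an assumption that depends on its version. UNION (rather than UNION ALL) discards rows that were already produced, which is what lets the query terminate despite the Y -> X cycle.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test (container VARCHAR(1), contained VARCHAR(1))")
conn.executemany(
    "INSERT INTO test VALUES (?, ?)",
    [("X", "A"), ("X", "B"), ("X", "C"), ("Y", "D"), ("Y", "E"), ("Y", "F"),
     ("A", "P"), ("P", "Q"), ("Q", "R"), ("R", "Y"), ("Y", "X")],
)

# Start from the direct children of the root, then repeatedly add the
# children of every node found so far.
RECURSIVE_SQL = """
WITH RECURSIVE sub(node) AS (
    SELECT contained FROM test WHERE container = ?
    UNION
    SELECT t.contained FROM test AS t JOIN sub AS s ON t.container = s.node
)
SELECT node FROM sub ORDER BY node
"""

def contained_under(root):
    """Return every distinct value reachable from root, sorted."""
    return [row[0] for row in conn.execute(RECURSIVE_SQL, (root,))]

descendants = contained_under("X")
print(descendants)
```

Because Y contains X in the sample data, X itself appears in its own result; filter it out afterwards if the root should be excluded.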
What about reading the whole table into a PHP array and determining the children via a function that calls itself?
But this is not a good solution if the table has more than 10000 rows...