JSON Encoding custom class not calling overridden default - json

I have a class definition based on json.JSONEncoder, in which I have overridden the default method. However, when I call json.dumps on an instance of that class, the default method is not called. Is there something I have missed?
In my example code I do not expect this to magically produce the serialized object, but I would expect the print("here") to be executed.
import json

class MyClass(json.JSONEncoder):
    id = "myId"
    data = "myData"

    def default(self, o):
        print("here")

print("Create instance")
obj = MyClass()
print("Serialize")
print(json.dumps(obj))
print("and done")
I am quite new to Python, so apologies if this is something horribly obvious.

After some further digging and tracing, I think I have found the cause. Part of the issue, I think, is my own misunderstanding of how this is intended to be used.
When calling json.dumps, if you wish to use a custom encoder, you need to specify the class of that encoder; otherwise dumps defaults to the standard implementation of JSONEncoder.
json.dumps(obj, cls=MyEncoder)
My misconception was that, by basing my class on json.JSONEncoder, dumps would simply recognise the instance as inheriting from JSONEncoder and call the overridden default method. However, this is not the case.
I have now created the encoding logic for my own class/types in its own class, and when I call json.dumps I pass in that class.
So I now have
class MyEncoder(json.JSONEncoder):
    def default(self, obj):
        if isinstance(obj, MyClass):
            return {"id": obj.id, "data": obj.data}
        return json.JSONEncoder.default(self, obj)
And when I wish to serialize I use
json.dumps(object_to_serialize, cls=MyEncoder)
This recognises my class and handles it, or passes the encoding on to the default encoder.
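For completeness, here is a minimal runnable sketch of the working setup, with MyClass reduced to a plain data-holding class since it no longer needs to subclass JSONEncoder:

import json

class MyClass:
    def __init__(self):
        self.id = "myId"
        self.data = "myData"

class MyEncoder(json.JSONEncoder):
    def default(self, obj):
        # Handle our own type; defer everything else to the base class
        if isinstance(obj, MyClass):
            return {"id": obj.id, "data": obj.data}
        return json.JSONEncoder.default(self, obj)

print(json.dumps(MyClass(), cls=MyEncoder))
# prints {"id": "myId", "data": "myData"}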

Related

Unable to serialize a nested python object using json.dumps()

I am new to Python, so sorry about the naive question. I have a simple code snippet where I try to serialize a Python object using json.dumps():
import json

class Document:
    uid = "1"
    content = "content1"
    domain = "domain"
    title = "title"

class ASSMSchema:
    requestSource = "unittest"
    documents = []

def entry():
    myObj = ASSMSchema()
    myObj.requestSource = "unittest"
    document1 = Document()
    document1.uid = "1"
    document1.content = "content1"
    document1.domain = "domain"
    document1.title = "title"
    myObj.documents.append(document1)
    print(json.dumps(myObj.__dict__))

if __name__ == "__main__":
    entry()
I get the following output when I run the above code
{"requestSource": "unittest"}
This is not expected, however, since it should also serialize the list of Document objects. Appreciate your answers. Thanks in advance!
Your class definition of ASSMSchema defines the class members documents and requestSource. These are not attributes of a single instance of this class, but shared between all instances. When you are running myObj.requestSource = "unittest", you are defining a member variable on the instance myObj. This member is actually reflected in the output of json.dumps, whereas the class members (like documents) are not.
For further reading, see https://docs.python.org/3/tutorial/classes.html#class-and-instance-variables
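To see the sharing in action, here is a quick illustration (class and field names are hypothetical):

class Schema:
    documents = []  # class variable: a single list shared by every instance

a = Schema()
b = Schema()
a.documents.append("doc1")
print(b.documents)  # ['doc1'] - b sees the append made through a
print(a.__dict__)   # {} - no instance attributes, so json.dumps(a.__dict__) misses documents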
Depending on the complexity and desired maintainability of your program, there are multiple approaches to achieve your desired behaviour. Firstly, you have to fix the mistake in both class definitions. To define a class with instance variables instead of class variables, do something like this:
class Foo:
    # class variables go here
    def __init__(self, field1, field2):
        # This method is called when you write Foo(field1, field2)
        # these are instance variables
        self.field1 = field1
        self.field2 = field2
If you want to dump this class as JSON, you can simply use the trick with __dict__: print(json.dumps(Foo(1,2).__dict__)) will output something like { "field1": 1, "field2": 2 }.
In your case, there is the documents member though, which is not JSON serializable by default. Therefore, you must handle this separately as well. You could write an encoder for your ASSMSchema (see this thread for more info on that). It could be implemented roughly like this:
from json import JSONEncoder

class ASSMSchemaEncoder(JSONEncoder):
    def default(self, o):
        return {
            "requestSource": o.requestSource,
            # Convert the list of Document objects to a list of dict
            "documents": [d.__dict__ for d in o.documents]
        }
Now, when serializing an instance of ASSMSchema, this implementation is used and the documents member is replaced with a list of dictionaries (which can be serialized by the default encoder). Note that you have to specify this encoder when calling json.dumps; see the linked thread above.
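Putting all of that together, a minimal runnable sketch (using the placeholder field values from the question) could look like this:

import json
from json import JSONEncoder

class Document:
    def __init__(self, uid, content, domain, title):
        # instance variables, set per object
        self.uid = uid
        self.content = content
        self.domain = domain
        self.title = title

class ASSMSchema:
    def __init__(self, requestSource):
        self.requestSource = requestSource
        self.documents = []  # each instance now gets its own list

class ASSMSchemaEncoder(JSONEncoder):
    def default(self, o):
        return {
            "requestSource": o.requestSource,
            "documents": [d.__dict__ for d in o.documents]
        }

myObj = ASSMSchema("unittest")
myObj.documents.append(Document("1", "content1", "domain", "title"))
print(json.dumps(myObj, cls=ASSMSchemaEncoder))
# {"requestSource": "unittest", "documents": [{"uid": "1", ...}]}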

Python objects in dealloc in cython

In the docs it is written, that "Any C data that you explicitly allocated (e.g. via malloc) in your __cinit__() method should be freed in your __dealloc__() method."
This is not my case. I have the following extension class:
from collections import deque

cdef class SomeClass:
    cdef dict data
    cdef void * u_data

    def __init__(self, data_len):
        self.data = {'columns': []}
        if data_len > 0:
            self.data.update({'data': deque(maxlen=data_len)})
        else:
            self.data.update({'data': []})
        self.u_data = <void *>self.data

    @property
    def data(self):
        return self.data

    @data.setter
    def data(self, new_val: dict):
        self.data = new_val
Some C function has access to this class and appends some data to the SomeClass().data dict. What should I write in __dealloc__ when I want to delete an instance of SomeClass?
Maybe something like:
def __dealloc__(self):
    self.data = None
    free(self.u_data)
Or there is no need to dealloc anything at all?
No, you don't need to, and no, you shouldn't. From the documentation:
You need to be careful what you do in a __dealloc__() method. By the time your __dealloc__() method is called, the object may already have been partially destroyed and may not be in a valid state as far as Python is concerned, so you should avoid invoking any Python operations which might touch the object. In particular, don’t call any other methods of the object or do anything which might cause the object to be resurrected. It’s best if you stick to just deallocating C data.
You don’t need to worry about deallocating Python attributes of your object, because that will be done for you by Cython after your __dealloc__() method returns.
You can confirm this by inspecting the C code (you need to look at the full code, not just the annotated HTML). There's an autogenerated function __pyx_tp_dealloc_9someclass_SomeClass (the name may vary slightly depending on what you called your module) that does a range of things, including:
__pyx_pw_9someclass_9SomeClass_3__dealloc__(o);
/* some other code */
Py_CLEAR(p->data);
where the function __pyx_pw_9someclass_9SomeClass_3__dealloc__ is (a wrapper for) your user-defined __dealloc__. Py_CLEAR will ensure that data is appropriately reference-counted then set to NULL.
It's a little hard to follow because it all goes through several layers of wrappers, but you can confirm that it does what the documentation says.
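To illustrate the pattern the documentation describes, here is a minimal sketch (the malloc'd buffer is an assumption for illustration, not part of the original code): __dealloc__ frees only the C allocation, and the Python attribute is left for Cython to clear.

from libc.stdlib cimport malloc, free

cdef class Example:
    cdef dict data      # Python attribute: Cython clears it after __dealloc__ returns
    cdef char * buf     # raw C allocation: our responsibility to free

    def __cinit__(self):
        self.buf = <char *> malloc(256)  # assumed buffer, for illustration
        self.data = {'columns': []}

    def __dealloc__(self):
        # Free only the C data we allocated; do not touch self.data here.
        free(self.buf)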

Extending AutoDerivation in Circe does not work

My question concerns the second solution offered by mixel here: Scala Circe with generics
Note that the trait named Auto in Circe has been renamed to AutoDerivation in the current version of Circe.
I am using the solution mixel provides in his Stack Overflow answer, but have not been able to get it to work. I have tried things like updating my Circe version to the most recent one and making sure the Macro Paradise plugin is imported, but still no luck.
Here is my code. The first snippet is its own file, called CirceGeneric.
import io.circe._
import io.circe.parser._
import io.circe.generic.extras._

object CirceGeneric {
  trait JsonEncoder[T] {
    def apply(in: T): Json
  }

  trait JsonDecoder[T] {
    def apply(s: String): Either[Error, T]
  }

  object CirceEncoderProvider {
    def apply[T: Encoder]: JsonEncoder[T] = new JsonEncoder[T] {
      def apply(in: T) = Encoder[T].apply(in)
    }
  }

  object CirceDecoderProvider {
    def apply[T: Decoder]: JsonDecoder[T] = new JsonDecoder[T] {
      def apply(s: String) = decode[T](s)
    }
  }
}

object Generic extends AutoDerivation {
  import CirceGeneric._
  implicit def encoder[T: Encoder]: JsonEncoder[T] = CirceEncoderProvider[T]
  implicit def decoder[T: Decoder]: JsonDecoder[T] = CirceDecoderProvider[T]
}
The second is a method for unit testing that uses the Akka function responseAs. The method appears in a class called BaseServiceTest.
def responseTo[T]: T = {
  def response(s: String)(implicit d: JsonDecoder[T]) = {
    d.apply(responseAs[String]) match {
      case Right(value) => value
      case Left(error) => throw new IllegalArgumentException(error.fillInStackTrace)
    }
  }
  response(responseAs[String])
}
The idea is to convert the result of responseAs[String] (which returns a string) into a decoded case class.
The code is not behaving as expected. IntelliJ does not detect any missing implicits, but when compilation time comes around, I am getting problems. I should mention that the BaseServiceTest file contains imports for CirceGeneric._ and Generic._, so a missing import statement is not the problem.
[error] [...]/BaseServiceTest.scala:59: could not find implicit value for parameter d: [...]CirceGeneric.JsonDecoder[T]
[error] response(responseAs[String])
Either the implicit conversion from Decoder[T] to JsonDecoder[T] is not happening, or the Decoder[T] instance is not being created. Either way, something is wrong.
You still need a Decoder or JsonDecoder context bound on responseTo.
def responseTo[T : Decoder]: T = ...
This is because all your code, and indeed mixel's code in the linked answer, is about abstracting from a Decoder out to a JsonDecoder trait which can be used for cross-library support. But you still don't have any way of constructing one without an underlying Decoder instance.
Now, there are some ways of automatically generating Decoders for (for instance) case classes contained in io.circe.generic.auto, but at this point in your code
def responseTo[T]: T = {
  def response(s: String)(implicit d: JsonDecoder[T]) = ...
  ...
}
you're asking the compiler to be able to provide an implicit JsonDecoder (i.e., in your setup, Decoder) instance for any arbitrary type. As the accepted answer to the linked question explains, that's not possible.
You need to delay the implicit resolution to the point where you know what type you're dealing with - in particular, that you can provide a Decoder[T] instance for it.
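Concretely, a minimal sketch of the fixed method might look like this (assuming responseAs[String] from the Akka HTTP testkit, as in the question):

def responseTo[T: JsonDecoder]: T =
  implicitly[JsonDecoder[T]].apply(responseAs[String]) match {
    case Right(value) => value
    case Left(error)  => throw new IllegalArgumentException(error)
  }

The context bound defers the implicit lookup to each call site, where the concrete T (and therefore its Decoder) is known.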
EDIT: In response to your comment regarding what the point is if you can't create JsonDecoders for all types...
My understanding of the linked question is that they're trying to abstract away the circe library in order to allow swapping out the JSON library implementation. This is done as follows:
add the JsonDecoder type class
have a package json which contains implicits (using Circe) for constructing them automatically via the package object extending AutoDerivation
have external code only refer to JsonDecoder and import the implicits in the json package
Then all the JSON serialization and implicit resolution works out without ever needing the calling code to reference io.circe, and it's easy to switch over the json/JsonDecoder to another JSON library if desired. But you're still going to have to use the JsonDecoder context bound, and be restricted to working with types where such an implicit can be constructed. Which is not every type.

Passing generic companion object to super constructor

I'm trying to construct a trait and an abstract class for my messages to subtype (in an Akka/Play environment) so I can easily convert them to JSON.
What I have done so far:
abstract class OutputMessage(val companion: OutputMessageCompanion[OutputMessage]) {
  def toJson: JsValue = Json.toJson(this)(companion.fmt)
}

trait OutputMessageCompanion[OT] {
  implicit val fmt: OFormat[OT]
}
The problem is, when I try to implement the mentioned interfaces as follows:
case class NotifyTableChange(tableStatus: BizTable) extends OutputMessage(NotifyTableChange)

object NotifyTableChange extends OutputMessageCompanion[NotifyTableChange] {
  override implicit val fmt: OFormat[NotifyTableChange] = Json.format[NotifyTableChange]
}
I get this error from IntelliJ:
Type mismatch, expected: OutputMessageCompanion[OutputMessage], actual: NotifyTableChange.type
I'm kinda new to Scala generics - so help with some explanations would be much appreciated.
P.S. I'm open to any more generic solutions than the one mentioned.
The goal is, when getting any subtype of OutputMessage - to easily convert it to Json.
The compiler says that your companion is defined over OutputMessage as the generic parameter, rather than some specific subtype. To work around this, you want to use a trick known as an F-bounded generic. Also, I don't like the idea of storing the companion object as a val in each message (after all, you don't want it serialized, do you?). Defining it as a def is IMHO a much better trade-off. The code would go like this (the companion stays the same):
abstract class OutputMessage[M <: OutputMessage[M]]() {
  self: M => // required to match Json.toJson signature

  protected def companion: OutputMessageCompanion[M]

  def toJson: JsValue = Json.toJson(this)(companion.fmt)
}

case class NotifyTableChange(tableStatus: BizTable) extends OutputMessage[NotifyTableChange] {
  override protected def companion: OutputMessageCompanion[NotifyTableChange] = NotifyTableChange
}
You may also see standard Scala collections for an implementation of the same approach.
But if all you need the companion for is to encode with JSON format, you can get rid of it like this:
abstract class OutputMessage[M <: OutputMessage[M]]() {
  self: M => // required to match Json.toJson signature

  implicit protected def fmt: OFormat[M]

  def toJson: JsValue = Json.toJson(this)
}

case class NotifyTableChange(tableStatus: BizTable) extends OutputMessage[NotifyTableChange] {
  override implicit protected def fmt: OFormat[NotifyTableChange] = Json.format[NotifyTableChange]
}
Obviously, if you also want to decode from JSON, you still need a companion object anyway.
Answers to the comments
Referring to the companion through a def means that it is a "method", thus defined once for all the instances of the subtype (and doesn't get serialized)?
Everything you declare with val gets a field stored in the object (instance of the class). By default, serializers try to serialize all the fields. Usually there is some way to say that some fields should be ignored (like some #IgnoreAnnotation). It also means that you'll have one more pointer/reference in each object, which uses memory for no good reason; this might or might not be an issue for you. Declaring it as a def gets you a method, so you can have just one object stored in some "static" place like the companion object, or build it on demand every time.
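As a tiny illustration of the difference (a hypothetical class, not from the question):

class Message {
  val storedCompanion: String = "stored"     // a field: one reference held in every instance
  def computedCompanion: String = "computed" // a method: nothing stored per instance
}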
I'm kinda new to Scala, and I've picked up the habit of putting the format inside the companion object. Would you recommend/refer to some source about how to decide where it is best to put your methods?
Scala is an unusual language and there is no direct mapping that covers all the use cases of the object concept in other languages. As a first rule of thumb, there are two main usages for object:
Something where you would use static in other languages, i.e. a container for static methods, constants and static variables (although variables, especially static ones, are discouraged in Scala)
Implementation of the singleton pattern.
By F-bounded generic - do you mean the lower bound of the M being OutputMessage[M] (btw why is it ok to use M twice in the same expression?)
Unfortunately the wiki provides only a basic description. The whole idea of F-bounded polymorphism is to be able to access the type of the subclass from the type of a base class in some generic manner. Usually the constraint A <: B means that A should be a subtype of B. Here, with M <: OutputMessage[M], it means that M should be a subtype of OutputMessage[M], which can easily be satisfied only by declaring the child class (there are other, non-easy ways to satisfy it) as:
class Child extends OutputMessage[Child]
Such a trick allows you to use M as an argument or result type in methods.
I'm a bit puzzled about the self bit ...
Lastly, the self bit is another trick that is necessary because F-bounded polymorphism was not enough in this particular case. Usually it is used with traits, when traits are used as mix-ins. In such cases you might want to restrict which classes the trait can be mixed into. At the same time, it allows you to use the methods from that base type in your mixin trait.
I'd say that the particular usage in my answer is a bit unconventional but it has the same twofold effect:
When compiling OutputMessage the compiler can assume that the type will also somehow be of the type of M (whatever M is)
When compiling any subtype, the compiler ensures that constraint #1 is satisfied. For example, it will not let you do
case class SomeChild(i: Int) extends OutputMessage[SomeChild]

// this will fail because passing SomeChild breaks the restriction of self: M
case class AnotherChild(i: Int) extends OutputMessage[SomeChild]
Actually, since I had to use self: M anyway, you probably can remove the F-bounded part here, leaving just
abstract class OutputMessage[M]() {
  self: M =>
  ...
}
but I'd stay with it to better convey the meaning.
As SergGr already answered, you would need an F-Bounded kind of polymorphism to solve this as it is right now.
However, for these cases, I believe (note this is only my opinion) it is better to use typeclasses instead.
In your case, you only want to provide a toJson method to any value, as long as there is an instance of OFormat[T] for its type.
You can achieve that with this (IMHO simpler) piece of code.
object syntax {
  object json {
    implicit class JsonOps[T](val t: T) extends AnyVal {
      def toJson(implicit fmt: OFormat[T]): JsValue = Json.toJson(t)(fmt)
    }
  }
}
final case class NotifyTableChange(tableStatus: BizTable)
object NotifyTableChange {
  implicit val fmt: OFormat[NotifyTableChange] = Json.format[NotifyTableChange]
}
import syntax.json._
val m = NotifyTableChange(tableStatus = ???)
val mJson = m.toJson // This works!
The JsonOps class is an Implicit Class which will provide the toJson method to any value for which there is an implicit OFormat instance in scope.
And since the companion object of the NotifyTableChange class defines such an implicit, it is always in scope - more information about where Scala looks for implicits in this link.
Additionally, given it is a Value Class, this extension method does not require any instantiation at runtime.
Here, you can find a more detailed discussion about F-Bounded vs Typeclasses.

Check if object is an sqlalchemy model instance

I want to know how to determine, given an object, whether it is an instance of an SQLAlchemy mapped model.
Normally, I would use isinstance(obj, DeclarativeBase). However, in this scenario, I do not have the DeclarativeBase class used available (since it is in a dependency project).
I would like to know what is the best practice in this case.
class Person(DeclarativeBase):
    __tablename__ = "Persons"

p = Person()
print isinstance(p, DeclarativeBase)
# prints True

# However in my scenario, I do not have the DeclarativeBase available,
# since the DeclarativeBase will be constructed in the depending web app
# while my code will act as a library that will be imported into the web app.
# What are my alternatives?
You can use class_mapper() and catch the exception.
Or you could use _is_mapped_class, but ideally you should not as it is not a public method.
from sqlalchemy.orm.util import class_mapper

def _is_sa_mapped(cls):
    try:
        class_mapper(cls)
        return True
    except:
        return False

print _is_sa_mapped(MyClass)

# note: use this at your own risk, as it might be removed/renamed in the future
from sqlalchemy.orm.util import _is_mapped_class
print bool(_is_mapped_class(MyClass))
For instances there is object_mapper(), so:
from sqlalchemy.orm.base import object_mapper
from sqlalchemy.orm.exc import UnmappedInstanceError

def is_mapped(obj):
    try:
        object_mapper(obj)
    except UnmappedInstanceError:
        return False
    return True
The complete mapper utilities are documented here: http://docs.sqlalchemy.org/en/rel_1_0/orm/mapping_api.html
Just a consideration: since specific errors are raised by SQLAlchemy (UnmappedClassError for classes and UnmappedInstanceError for instances), why not catch them rather than a generic exception? ;)
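Following that suggestion, a minimal sketch catching the specific exceptions (the import paths are my assumption and may vary across SQLAlchemy versions) could look like this:

from sqlalchemy.orm import class_mapper, object_mapper
from sqlalchemy.orm.exc import UnmappedClassError, UnmappedInstanceError

def is_mapped_class(cls):
    # True if cls is a mapped class, False otherwise
    try:
        class_mapper(cls)
        return True
    except UnmappedClassError:
        return False

def is_mapped_instance(obj):
    # True if obj is an instance of a mapped class, False otherwise
    try:
        object_mapper(obj)
        return True
    except UnmappedInstanceError:
        return False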