Let's say I have this model:
@Entity
public class Picture extends Model {
    public Blob image;
    ...
}
When I look at the column type in the MySQL database, it is just the path to the attachments folder (VARCHAR). Is there some way to change this so that the binary data is saved into MySQL (BLOB) using Play?
I would like to achieve this:
CREATE TABLE picture (
    ......
    image blob,
    ...
);
Using JDBC, I would set the image like this:
psmnt.setBinaryStream(3, (InputStream) fis, (int) image.length());
I don't know if this makes sense at all, but if not, please explain why! Why have an attachments folder in a Play project at all?
Well, because storing media files (images/videos/audio/etc.) in the database is very uncommon, I'm guessing the team designed the Blob implementation this way to make it more "effective": serving files from disk instead of fetching binaries from the database means the database is hammered less. To be honest, I have never used the Blob type myself; I know you can implement your own Blob, and I have read a few posts about it.
http://groups.google.com/group/play-framework/browse_thread/thread/7e7e0b00a48eeed9
Do note that, as Guillaume said, Blob is the new version of the "File Attachment" class that was used before 1.1. If you want to store an image and you are using Hibernate:
@Entity
public class Picture extends Model {
    @Lob(type = LobType.BLOB)
    public byte[] image;
    ...
}
In Play, the Blob type stores only the hash reference to the file plus its mime type. The reason is that databases don't cope well with big Blobs (for internal reasons), and it is good practice to store the files aside. It will also spare you headaches related to encodings and backups, trust me (and, more importantly, trust the Play developers!).
As an alternative, you can store your image as:
@Entity
public class YourClass extends Model {
    @Basic(fetch = FetchType.LAZY)
    @Lob
    public byte[] image;
    public String mime_type;
    public String file_name;
}
You will need to store the mime type separately (the Blob type stores it in its database field) to be able to work with the file, and you might also want to store the original file name you were given. To detect the mime type, I would recommend mime-util.
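For example, a minimal detection sketch with mime-util (the detector class name below is the magic-bytes detector shipped with mime-util 2.x; verify it against the version you use):

import java.io.File;
import java.util.Collection;

import eu.medsea.mimeutil.MimeUtil;

public class MimeDetection {

    static {
        // Register a detector once; this one inspects the file content
        // (magic bytes) rather than trusting the file extension.
        MimeUtil.registerMimeDetector("eu.medsea.mimeutil.detector.MagicMimeMimeDetector");
    }

    public static String detect(File file) {
        Collection<?> mimeTypes = MimeUtil.getMimeTypes(file);
        // mime-util falls back to application/octet-stream when nothing
        // matches, so the collection is never empty.
        return mimeTypes.iterator().next().toString();
    }
}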
As a last note, be aware that if you use Play's Blob, the file is not removed from the file system when you delete the entity (via CRUD or the API). You will need to create a Job that checks for unused files from time to time to free up space.
This happens (in Guillaume's words) because it is impossible to have a safe two-phase transaction between the database and the file system. This is relevant: depending on your application, you might find your file system filling up with unused images :)
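A rough sketch of such a cleanup Job using Play 1.x's jobs API (the "attachments" path and the isReferenced check are placeholders you would adapt to your model):

import java.io.File;

import play.Play;
import play.jobs.Every;
import play.jobs.Job;

@Every("24h")
public class OrphanedAttachmentCleanup extends Job {

    @Override
    public void doJob() {
        // Play stores Blob files in the configured attachments directory.
        File dir = Play.getFile("attachments");
        if (dir == null || !dir.isDirectory()) {
            return;
        }
        for (File file : dir.listFiles()) {
            // Placeholder: query your entities for a Blob whose stored
            // reference matches this file name, e.g. via Picture.count(...).
            if (!isReferenced(file.getName())) {
                file.delete();
            }
        }
    }

    private boolean isReferenced(String fileName) {
        return true; // placeholder: err on the side of keeping files
    }
}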
I'm currently building a HATEOAS/HAL based REST application with Spring MVC and JPA (Hibernate). Basically the application gives access to a database and allows data retrieval/creation/manipulation.
So far I've already got a lot of things done including a working controller for one of the resources, let's call it x.
But I don't want to give the API user the opportunity to create just an x resource, because this alone would be useless and could be deleted right away. He/she also has to define a y and a z resource to make things work. Allowing all of those resources to be created independently would not break anything, but it could produce dead data, like a z resource floating around without any connection, completely invisible and useless to the user.
Example: I don't want the user to create a new customer without directly attaching a business contract to the customer. (Two different resources: /customers and /contracts).
I did not really find any answers or best practices on the web, except for some sort of bulk POSTing, but only to one resource, where you would POST a ton of customers at once.
Now the following options come to my mind:
Let the user create the resources as he/she wants. If customers are created and never connected to a contract, I don't care. The logic here would be: allow the user to POST to /customers (and return some sort of id, of course). Then, if he/she wants to POST a new /contract later, I would check whether the given customer id exists, and if it does, create the contract.
Expect the user, when POSTing to /customers, to also include contract data.
Option 1 would be the easiest way (and maybe more true to REST?).
Option 2 is a bit more complicated, since the user does not send single resources any more.
Currently, the controller method for adding a customer starts like that:
@RequestMapping(value = "", method = RequestMethod.POST)
public HttpEntity<Customers> addCustomer(@RequestBody Customers customer) {
    //stuff...
}
This way, the JSON in the RequestBody fits directly into my Customers class and I can continue working with it. With two (or more) expected resources included in the RequestBody, this cannot be done the same way any more. Any ideas on how to handle that in a nice way?
I could create some sort of wrapper class (like CustomersContracts) that consists of customer and contract data and has the sole purpose of carrying this kind of data. But this seems ugly.
I could also take the raw JSON in the RequestBody, parse it, and then manually create a customer and a contract object from it, save the customer, get its id, and attach it to the contract.
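For what it's worth, a minimal sketch of that second option with Jackson's ObjectMapper, slotted into the controller above (the "customer"/"contract" keys, the Contracts class, and the repository calls are illustrative assumptions):

import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;

@RequestMapping(value = "", method = RequestMethod.POST)
public ResponseEntity<Customers> addCustomer(@RequestBody String body) throws Exception {
    ObjectMapper mapper = new ObjectMapper();
    JsonNode root = mapper.readTree(body);

    // Bind each sub-tree to its own entity class.
    Customers customer = mapper.treeToValue(root.path("customer"), Customers.class);
    Contracts contract = mapper.treeToValue(root.path("contract"), Contracts.class);

    // Save the customer first so its generated id can be attached to the
    // contract (the repository calls are placeholders).
    customer = customerRepository.save(customer);
    contract.setCustomer(customer);
    contractRepository.save(contract);

    return new ResponseEntity<>(customer, HttpStatus.CREATED);
}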
Any thoughts?
Coming back here after a couple of months. I finally decided to create some kind of wrapper resource (these are example class names):
public class DataImport extends ResourceSupport implements Serializable {
    /* The classes referenced here are @Entitys */
    private Import1 import1;
    private Import2 import2;
    private List<Import3> import3;
    private List<Import4> import4;
}
So the API user always has to send an Import1 and an Import2 JSON object, plus an Import3 and an Import4 JSON array (which can also be empty).
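A request body then looks roughly like this (the field contents are elided; only the outer structure matters):

{
    "import1": { ... },
    "import2": { ... },
    "import3": [ { ... }, { ... } ],
    "import4": []
}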
In my controller class I do the following:
@RequestMapping(*snip*)
public ResponseEntity<?> add(@RequestBody DataImport dataImport) {
    Import1 import1 = dataImport.getImport1();
    Import2 import2 = dataImport.getImport2();
    List<Import3> import3 = dataImport.getImport3();
    List<Import4> import4 = dataImport.getImport4();
    // continue...
}
I still don't know if it's the best way to do this, but it works quite well.
In my PostgreSQL database I have:
CREATE TABLE category (
    -- ...
    category_name_localization JSON NOT NULL
);
In Java, I have a JDO class like so:
@javax.jdo.annotations.PersistenceCapable(table = "category")
public class Category extends _BlueEntity implements Serializable {
    //...
    private org.json.simple.JSONObject category_name_localization;

    @javax.jdo.annotations.Column(name = "category_name_localization")
    public org.json.simple.JSONObject getCategoryNameLocalization() {
        return category_name_localization;
    }
}
When I use this class, DataNucleus gives the following exception:
org.datanucleus.exceptions.NucleusUserException: Field "com.advantagegroup.blue.ui.entity.Category.category_name_localization" is a map that has been specified without a join table and neither the key nor the value has a mapped-by specified. This is invalid!
at org.datanucleus.store.rdbms.RDBMSStoreManager.newJoinTable(RDBMSStoreManager.java:2720)
at org.datanucleus.store.rdbms.mapping.java.AbstractContainerMapping.initialize(AbstractContainerMapping.java:82)
at org.datanucleus.store.rdbms.mapping.MappingManagerImpl.getMapping(MappingManagerImpl.java:680)
at org.datanucleus.store.rdbms.table.ClassTable.manageMembers(ClassTable.java:518)
at org.datanucleus.store.rdbms.table.ClassTable.manageClass(ClassTable.java:424)
at org.datanucleus.store.rdbms.table.ClassTable.initializeForClass(ClassTable.java:1250)
at org.datanucleus.store.rdbms.table.ClassTable.initialize(ClassTable.java:271)
at org.datanucleus.store.rdbms.RDBMSStoreManager$ClassAdder.initializeClassTables(RDBMSStoreManager.java:3288)
at org.datanucleus.store.rdbms.RDBMSStoreManager$ClassAdder.run(RDBMSStoreManager.java:2897)
at org.datanucleus.store.rdbms.AbstractSchemaTransaction.execute(AbstractSchemaTransaction.java:118)
at org.datanucleus.store.rdbms.RDBMSStoreManager.manageClasses(RDBMSStoreManager.java:1637)
at org.datanucleus.store.rdbms.RDBMSStoreManager.getDatastoreClass(RDBMSStoreManager.java:665)
at org.datanucleus.store.rdbms.RDBMSStoreManager.getPropertiesForGenerator(RDBMSStoreManager.java:2098)
at org.datanucleus.store.AbstractStoreManager.getStrategyValue(AbstractStoreManager.java:1278)
at org.datanucleus.ExecutionContextImpl.newObjectId(ExecutionContextImpl.java:3668)
at org.datanucleus.state.StateManagerImpl.setIdentity(StateManagerImpl.java:2276)
at org.datanucleus.state.StateManagerImpl.initialiseForPersistentNew(StateManagerImpl.java:482)
at org.datanucleus.state.StateManagerImpl.initialiseForPersistentNew(StateManagerImpl.java:122)
at org.datanucleus.state.ObjectProviderFactoryImpl.newForPersistentNew(ObjectProviderFactoryImpl.java:218)
at org.datanucleus.ExecutionContextImpl.persistObjectInternal(ExecutionContextImpl.java:1986)
at org.datanucleus.ExecutionContextImpl.persistObjectWork(ExecutionContextImpl.java:1830)
at org.datanucleus.ExecutionContextImpl.persistObject(ExecutionContextImpl.java:1685)
at org.datanucleus.api.jdo.JDOPersistenceManager.jdoMakePersistent(JDOPersistenceManager.java:712)
at org.datanucleus.api.jdo.JDOPersistenceManager.makePersistent(JDOPersistenceManager.java:738)
at com.advantagegroup.blue.ui.jdo._BlueJdo.insert(_BlueJdo.java:40)
at ...
This error makes sense in a way, because org.json.simple.JSONObject extends Map. However, this field is not part of any relationship; the column is of type JSON, so it is natural to back it with a JSONObject.
How do I tell JDO / DataNucleus to chill and treat org.json.simple.JSONObject the same way it would a String or a Date?
Thanks!
DC
My understanding is that your current attempt is trying to persist a normal Map (while DataNucleus doesn't know what a JSONObject is, it does know what a Map is), and on RDBMS a Map needs a join table.
Since you presumably want the JSONObject persisted into a single column, you need to create a JDO AttributeConverter. I've done similar things with my own types and it works fine (I'm on v5.0.5, IIRC).
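A minimal sketch of such a converter (javax.jdo.AttributeConverter is the JDO 3.2 interface; storing the object as its JSON text via org.json.simple is my assumption of what you want):

import javax.jdo.AttributeConverter;

import org.json.simple.JSONObject;
import org.json.simple.JSONValue;

public class JSONObjectConverter implements AttributeConverter<JSONObject, String> {

    @Override
    public String convertToDatastore(JSONObject attributeValue) {
        // Persist the object as its JSON text.
        return attributeValue == null ? null : attributeValue.toJSONString();
    }

    @Override
    public JSONObject convertToAttribute(String datastoreValue) {
        // Parse the stored text back into a JSONObject.
        return datastoreValue == null ? null : (JSONObject) JSONValue.parse(datastoreValue);
    }
}

You then attach it to the field with @Convert:

@javax.jdo.annotations.Convert(JSONObjectConverter.class)
private org.json.simple.JSONObject category_name_localization;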
I also found a note in their docs for the case where you have your own Map class that DataNucleus doesn't know how to replace with a proxy (to intercept the calls to put, putAll, etc.). If you add the metadata line described there, it will not try to wrap this field with a proxy (which it doesn't know how to do for that type unless you tell it). If you wanted to auto-detect the JSONObject becoming "dirty", you would need to write a proxy wrapper, as described on that page.
This doesn't answer how to map the column for that converter to use a "json" type in PostgreSQL, but I'd guess that if you set the sqlType you may have success in that respect.
In the excellent MvvmCross library I can use RIO binding to prevent unreadable code:
public INC<string> Title = new NC<string>();
Then I can read and write values using Title.Value. Makes the models much more readable.
Normally, this property would be written as:
private string _title;
public string Title
{
    get { return _title; }
    set
    {
        _title = value;
        RaisePropertyChanged("Title");
    }
}
But when I want to use sqlite-net, these fields cannot be streamed to the database because they are not basic types with a getter and setter.
I can think of a few options to get around that:
Make a new simple object that is similar to the model, but only with the direct db-fields, and create simple import/export static methods on the model. This would also avoid struggling with complex model code that never needs to relate to the actual database.
Make sqlite-net understand NC-fields. I read through the code of the mapper, but it looks like this is going to be a lot of work, because it relies on getters and setters. I did not find a way to plug in a custom mapping for a type that could be generic.
Remove RIO and just write all the property code myself instead of relying on RIO.
Maybe someone has some advice?
Thanks Stuart. It was exactly my thought, so I implemented it that way: my (DB) models do not contain RIO. Only my viewmodels do, and they reference a model that is DB-compatible.
So, for posterity the following tips:
- Do not use RIO in your models that need to be database-backed.
- Reference models in your viewmodels. In the binding you can use the . (dot) to reference this model.
This keeps them nicely separated. It also gives you another advantage: if you need to reuse a model (because the same object might be displayed twice on the screen, but under different circumstances), it is much easier in such situations to find the already-instantiated model.
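To illustrate the split, a bare-bones sketch (INC/NC come from the MvvmCross FieldBinding plugin, whose namespace varies by version; the sqlite-net attributes and class names are just examples):

// Plain DB model: simple get/set properties that sqlite-net can map.
public class Customer
{
    [PrimaryKey, AutoIncrement]
    public int Id { get; set; }

    public string Title { get; set; }
}

// ViewModel: RIO lives here only, wrapping the DB-compatible model.
public class CustomerViewModel : MvxViewModel
{
    public readonly INC<Customer> Customer = new NC<Customer>();
}

In the binding you then walk into the model with the dot notation, as mentioned above.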
I normally write all parts of my code in C#, and when writing protocols that are serialized I use FastSerializer, which serializes/deserializes classes fast and efficiently. It is also very easy to use, and fairly straightforward to do "versioning", i.e. to handle different versions of the serialization. The code I normally use looks like this:
public override void DeserializeOwnedData(SerializationReader reader, object context)
{
    base.DeserializeOwnedData(reader, context);
    byte serializeVersion = reader.ReadByte(); // used to keep what version we are using

    this.CustomerNumber = reader.ReadString();
    this.HomeAddress = reader.ReadString();
    this.ZipCode = reader.ReadString();
    this.HomeCity = reader.ReadString();
    if (serializeVersion > 0)
        this.HomeAddressObj = reader.ReadUInt32();
    if (serializeVersion > 1)
        this.County = reader.ReadString();
    if (serializeVersion > 2)
        this.Muni = reader.ReadString();
    if (serializeVersion > 3)
        this._AvailableCustomers = reader.ReadList<uint>();
}
and
public override void SerializeOwnedData(SerializationWriter writer, object context)
{
    base.SerializeOwnedData(writer, context);

    byte serializeVersion = 4;
    writer.Write(serializeVersion);

    writer.Write(CustomerNumber);
    writer.Write(PopulationRegistryNumber);
    writer.Write(HomeAddress);
    writer.Write(ZipCode);
    writer.Write(HomeCity);
    if (CustomerCards == null)
        CustomerCards = new List<uint>();
    writer.Write(CustomerCards);
    writer.Write(HomeAddressObj);
    writer.Write(County);
    // v 2
    writer.Write(Muni);
    // v 4
    if (_AvailableCustomers == null)
        _AvailableCustomers = new List<uint>();
    writer.Write(_AvailableCustomers);
}
So it's easy to add new things, or to change the serialization completely if one chooses to.
However, I now want to use JSON, for reasons not relevant here =) I am currently using DataContractJsonSerializer, and I am looking for a way to get the same flexibility I have with the FastSerializer above.
So the question is: what is the best way to create a JSON protocol/serialization, and to be able to detail the serialization as above, so that I do not break the deserialization just because another machine hasn't yet updated its version?
The key to versioning JSON is to always add new properties, and never remove or rename existing properties. This is similar to how protocol buffers handle versioning.
For example, if you started with the following JSON:
{
    "version": "1.0",
    "foo": true
}
If you want to rename the "foo" property to "bar", don't just rename it. Instead, add a new property:
{
    "version": "1.1",
    "foo": true,
    "bar": true
}
Since you are never removing properties, clients based on older versions will continue to work. The downside of this method is that the JSON can get bloated over time, and you have to continue maintaining old properties.
It is also important to clearly define your "edge" cases to your clients. Suppose you have an array property called "fooList". The "fooList" property could take on the following possible values: does not exist/undefined (the property is not physically present in the JSON object, or it exists and is set to "undefined"), null, an empty list, or a list with one or more values. It is important that clients understand how to behave, especially in the undefined/null/empty cases.
I would also recommend reading up on how semantic versioning works. If you introduce a semantic versioning scheme to your version numbers, then backwards compatible changes can be made on a minor version boundary, while breaking changes can be made on a major version boundary (both clients and servers would have to agree on the same major version). While this isn't a property of the JSON itself, this is useful for communicating the types of changes a client should expect when the version changes.
Google's Java-based Gson library has excellent versioning support for JSON. It could prove very handy if you are thinking of going the Java way.
There is a nice and easy tutorial here.
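In short, Gson's versioning is driven by the @Since annotation plus GsonBuilder.setVersion; a minimal sketch (class and values invented for illustration):

import com.google.gson.Gson;
import com.google.gson.GsonBuilder;
import com.google.gson.annotations.Since;

public class Customer {
    @Since(1.0) public String name = "John";
    @Since(1.0) public String lastName = "Doe";
    @Since(2.0) public int age = 42; // field added in protocol version 2

    public static void main(String[] args) {
        // A version-1.0 serializer skips every field introduced later.
        Gson v1 = new GsonBuilder().setVersion(1.0).create();
        System.out.println(v1.toJson(new Customer())); // {"name":"John","lastName":"Doe"}

        // Without setVersion, all fields are serialized.
        System.out.println(new Gson().toJson(new Customer()));
    }
}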
It doesn't matter what serializing protocol you use, the techniques to version APIs are generally the same.
Generally you need:
a way for the consumer to communicate to the producer the API version it accepts (though this is not always possible)
a way for the producer to embed versioning information to the serialized data
a backward compatible strategy to handle unknown fields
In a web API, the version that the consumer accepts is generally embedded in the Accept header (e.g. Accept: application/vnd.myapp-v1+json application/vnd.myapp-v2+json means the consumer can handle either version 1 or version 2 of your API), or, less commonly, in the URL (e.g. https://api.twitter.com/1/statuses/user_timeline.json). This is generally used for major versions (i.e. backward-incompatible changes). If the server and the client do not have a matching Accept header, the communication fails (or proceeds on a best-effort basis, or falls back to a default baseline protocol, depending on the nature of the application).
The producer then generates serialized data in one of the requested versions and embeds this version info into the serialized data (e.g. as a field named version). The consumer should use the version information embedded in the data to determine how to parse it. The version information in the data should also contain the minor version (i.e. for backward-compatible changes); consumers should generally be able to ignore the minor version and still process the data correctly, although understanding the minor version may allow the client to make additional assumptions about how the data should be processed.
A common strategy for handling unknown fields is the way HTML and CSS are parsed: when the consumer sees an unknown field, it should ignore it, and when the data is missing a field the client is expecting, the client should use a default value. Depending on the nature of the communication, you may also want to specify some fields as mandatory (i.e. a missing field is considered a fatal error). Fields added within minor versions should always be optional; a minor version can add optional fields or change field semantics as long as it is backward compatible, while a major version can delete fields, add mandatory fields, or change field semantics in a backward-incompatible manner.
In an extensible serialization format (like JSON or XML), the data should be self-descriptive; in other words, the field names should always be stored together with the data. You should not rely on specific data being available at specific positions.
Don't use DataContractJsonSerializer; as the name says, the objects processed through this class will have to:
a) Be marked with [DataContract] and [DataMember] attributes.
b) Be strictly compliant with the defined "contract", that is, nothing less and nothing more than what is defined. Any extra or missing [DataMember] will make the deserialization throw an exception.
If you want to be flexible, use the JavaScriptSerializer to go for the cheap option... or use this library:
http://json.codeplex.com/
This will give you enough control over your JSON serialization.
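With Json.NET, for instance, the tolerant behavior walked through below comes out of the box, and you can opt into strictness where needed; a minimal sketch (class and values invented for illustration):

using System;
using Newtonsoft.Json;

public class Customer
{
    public string Name;
    public string LastName;
    public int Age; // added in a later version
}

public static class Demo
{
    public static void Main()
    {
        // Old JSON without "Age" deserializes fine; Age gets its default (0).
        var c = JsonConvert.DeserializeObject<Customer>(
            "{ \"Name\": \"John\", \"LastName\": \"Doe\" }");
        Console.WriteLine(c.Age); // 0

        // Unknown properties are ignored by default; opt in to strictness:
        var strict = new JsonSerializerSettings
        {
            MissingMemberHandling = MissingMemberHandling.Error
        };
        // With "strict", a JSON property with no matching member would throw.
    }
}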
Imagine you have an object in its early days.
public class Customer
{
    public string Name;
    public string LastName;
}
Once serialized it will look like this:
{ "Name": "John", "LastName": "Doe" }
If you change your object definition to add or remove fields, deserialization will still occur smoothly if you use, for example, the JavaScriptSerializer.
public class Customer
{
    public string Name;
    public string LastName;
    public int Age;
}
If you try to deserialize the last JSON into this new class, no error will be thrown; the new fields will simply be set to their defaults. In this example, "Age" will be set to zero.
You can include, by your own convention, a field present in all your objects that contains the version number. In this case you can tell the difference between an empty field and a version inconsistency.
So let's say you have your class Customer v1 serialized:
{ "Version": 1, "LastName": "Doe", "Name": "John" }
If you deserialize it into a Customer v2 instance, you will have:
{ "Version": 1, "LastName": "Doe", "Name": "John", "Age": 0 }
This way you can detect which fields in your object are reliable and which are not. In this case you know that your v2 object instance came from a v1 object instance, so the field "Age" should not be trusted.
I have in mind that you could also use a custom attribute, e.g. "MinVersion", and mark each field with the minimum supported version number, so you get something like this:
public class Customer
{
    [MinVersion(1)]
    public int Version;
    [MinVersion(1)]
    public string Name;
    [MinVersion(1)]
    public string LastName;
    [MinVersion(2)]
    public int Age;
}
Then later you can access this metadata and do whatever you might need with it.
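A sketch of what that could look like (both the attribute and the check are hypothetical, implementing the "MinVersion" idea above):

using System;
using System.Reflection;

[AttributeUsage(AttributeTargets.Field)]
public class MinVersionAttribute : Attribute
{
    public int Version { get; private set; }
    public MinVersionAttribute(int version) { Version = version; }
}

public static class VersionChecker
{
    // True when the field already existed in the serialized version,
    // i.e. its deserialized value can be trusted.
    public static bool IsTrustworthy(Type type, string fieldName, int serializedVersion)
    {
        FieldInfo field = type.GetField(fieldName);
        if (field == null) return false;
        var attr = (MinVersionAttribute)Attribute.GetCustomAttribute(
            field, typeof(MinVersionAttribute));
        return attr == null || attr.Version <= serializedVersion;
    }
}

// Usage: after deserializing a v1 payload into a v2 Customer,
// VersionChecker.IsTrustworthy(typeof(Customer), "Age", 1) returns false,
// so Age should be treated as unset.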
I finally decided to post my problem, after a couple of hours spent searching the Internet for solutions and trying some of them.
[Problem Context]
I am developing an application which will be deployed in two parts:
an XML Importer tool: its role is to load/read an XML file in order to fill some data structures, which are afterwards serialized into a binary file.
the end-user application: it will load the binary file generated by the XML Importer and do some stuff with the recovered data structures.
For now, I only use the XML Importer for both purposes (meaning I first load the XML and save it to a binary file, then I reopen the XML Importer and load the binary file).
[Actual Problem]
This works just fine, and I am able to recover all the data I had after XML loading, as long as I do it with the same build of my XML Importer. That is not viable, as I will need at the very least two different builds, one for the XML Importer and one for the end-user application. Please note that the two versions of the XML Importer I use for my testing are exactly the same in terms of source code, and thus of data structures; the only difference lies in the build number (to force a different build, I just add a space somewhere and build again).
So what I'm trying to do is:
Build a version of my XML Importer
Open the XML Importer, load an XML file and save the resulting datastructures to a binary file
Rebuild the XML Importer
Open the XML Importer newly built, load the previously created binary file and recover my datastructures.
At this time, I get an Exception:
SerializationException: Could not find type 'System.Collections.Generic.List`1[[Grid, 74b7fa2fcc11e47f8bc966e9110610a6, Version=0.0.0.0, Culture=neutral, PublicKeyToken=null]]'.
System.Runtime.Serialization.Formatters.Binary.ObjectReader.ReadType (System.IO.BinaryReader reader, TypeTag code)
System.Runtime.Serialization.Formatters.Binary.ObjectReader.ReadTypeMetadata (System.IO.BinaryReader reader, Boolean isRuntimeObject, Boolean hasTypeInfo)
System.Runtime.Serialization.Formatters.Binary.ObjectReader.ReadObjectInstance (System.IO.BinaryReader reader, Boolean isRuntimeObject, Boolean hasTypeInfo, System.Int64& objectId, System.Object& value, System.Runtime.Serialization.SerializationInfo& info)
System.Runtime.Serialization.Formatters.Binary.ObjectReader.ReadObject (BinaryElement element, System.IO.BinaryReader reader, System.Int64& objectId, System.Object& value, System.Runtime.Serialization.SerializationInfo& info)
For your information (I don't know if it is useful or not), the actual type it is struggling to deserialize is a List<Grid>, Grid being a custom class (which is correctly serializable, as I am able to deserialize it when using the same build of the XML Importer).
[Potential Solution]
I do believe it comes from somewhere around the assembly, as I have read many posts and articles about this. However, I already have a custom Binder taking care of the differences in assembly names, which looks like this:
public sealed class VersionDeserializationBinder : SerializationBinder
{
    public override Type BindToType(string assemblyName, string typeName)
    {
        if (!string.IsNullOrEmpty(assemblyName) && !string.IsNullOrEmpty(typeName))
        {
            Type typeToDeserialize = null;
            assemblyName = Assembly.GetExecutingAssembly().FullName;
            // The following line of code returns the type.
            typeToDeserialize = Type.GetType(String.Format("{0}, {1}", typeName, assemblyName));
            return typeToDeserialize;
        }
        return null;
    }
}
which I assign to the BinaryFormatter before deserializing here:
public static SaveData Load(string filePath)
{
    SaveData data = null; //new SaveData();
    Stream stream = File.Open(filePath, FileMode.Open);
    BinaryFormatter bformatter = new BinaryFormatter();
    bformatter.Binder = new VersionDeserializationBinder();
    data = (SaveData)bformatter.Deserialize(stream);
    stream.Close();
    Debug.Log("Binary version loaded from " + filePath);
    return data;
}
Do any of you guys have an idea on how I could fix it? Would be awesome, pretty please :)
Move the working bits to a separate assembly and use that assembly in both the "server" and the "client". Based on your explanation of the problem, this should get around the "wrong version" problem, if that is the core issue. I would also move any "models" (i.e., bits of state like Grid) to a domain model project, and use that in both places.
I just bumped into your thread while I had the same problem. Your code sample with the SerializationBinder in particular helped me a lot. I just had to modify it slightly to tell the difference between my own assemblies and Microsoft's. Hopefully it still helps you, too:
sealed class VersionDeserializationBinder : SerializationBinder
{
    public override Type BindToType(string assemblyName, string typeName)
    {
        Type typeToDeserialize = null;
        string currentAssemblyInfo = Assembly.GetExecutingAssembly().FullName;

        // my modification
        string currentAssemblyName = currentAssemblyInfo.Split(',')[0];
        if (assemblyName.StartsWith(currentAssemblyName))
            assemblyName = currentAssemblyInfo;

        typeToDeserialize = Type.GetType(string.Format("{0}, {1}", typeName, assemblyName));
        return typeToDeserialize;
    }
}
I believe the problem is that you are telling it to look for List<> in the executing assembly, whereas it actually lives in a framework assembly (mscorlib). You should only re-assign the assembly name in your binder if the original assembly is one of yours.
Also, you might have to handle the generic parameter types specifically in the binder, by parsing the type name and making sure the parameter types are not specific to the foreign assembly when you return the parameterized generic type.
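One way to sketch that (untested against Unity's hash-named assemblies, so treat it as an illustration, not a drop-in fix): let Type.GetType parse the whole generic type name and hand every assembly reference it contains to a resolver, which falls back to the currently executing assembly when the recorded build can no longer be loaded:

using System;
using System.Reflection;
using System.Runtime.Serialization;

public sealed class GenericAwareBinder : SerializationBinder
{
    public override Type BindToType(string assemblyName, string typeName)
    {
        // typeName may be e.g. "System.Collections.Generic.List`1[[Grid, <old build>]]";
        // the assembly resolver below is invoked for each assembly reference in it.
        return Type.GetType(
            typeName,
            ResolveAssembly,
            (asm, name, ignoreCase) => asm.GetType(name, false, ignoreCase),
            false);
    }

    private static Assembly ResolveAssembly(AssemblyName name)
    {
        try
        {
            // Framework assemblies (mscorlib, System, ...) load normally.
            return Assembly.Load(name);
        }
        catch (Exception)
        {
            // The recorded build no longer exists; assume the type now
            // lives in the assembly that is currently executing.
            return Assembly.GetExecutingAssembly();
        }
    }
}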