How to deserialize a JSON file which contains null values using Serde?

I want to deserialize the chemical elements JSON file from Bowserinator on GitHub using Serde. For this I created a structure with all the needed fields and derived the needed traits:
#[derive(Serialize, Deserialize, Debug, Clone)]
pub struct Element {
    name: String,
    appearance: String,
    atomic_mass: f64,
    boil: f64,
    category: String,
    #[serde(default)]
    color: String,
    density: f64,
    discovered_by: String,
    melt: f64,
    #[serde(default)]
    molar_heat: f64,
    named_by: String,
    number: String,
    period: u32,
    phase: String,
    source: String,
    spectral_img: String,
    summary: String,
    symbol: String,
    xpos: u32,
    ypos: u32,
}
This works fine until it reaches a field that contains a null value, e.g. "color": null for Helium.
The error message I get for this field is { code: Message("invalid type: unit value, expected a string"), line: 8, column: 17 }.
I experimented with the #[serde(default)] attribute, but it only applies when the field is missing from the JSON file, not when the field is present with a null value.
I would like to do the deserialization with the standard derive macros and avoid writing a Visitor implementation. Is there a trick I am missing?

A deserialization error occurs because the struct definition is incompatible with the incoming objects: the color field can also be null, as well as a string, yet giving this field the type String forces your program to always expect a string. This is the default behaviour, which makes sense. Be reminded that String (or other containers such as Box) are not "nullable" in Rust. As for a null value not triggering the default value instead, that is just how Serde works: if the object field wasn't there, it would work because you have added the default field attribute. On the other hand, a field "color" with the value null is not equivalent to no field at all.
One way to solve this is to adjust our application's specification to accept null | string, as specified by @user25064's answer:
#[derive(Serialize, Deserialize, Debug, Clone)]
pub struct Element {
    color: Option<String>,
}
Another way is to write our own deserialization routine for the field, which will accept null and turn it to something else of type String. This can be done with the attribute #[serde(deserialize_with=...)].
use serde::{Deserialize, Deserializer, Serialize};

#[derive(Serialize, Deserialize, Debug, Clone)]
pub struct Element {
    #[serde(deserialize_with = "parse_color")]
    color: String,
}

// Accepts either a string or null; null is replaced by "black".
fn parse_color<'de, D>(d: D) -> Result<String, D::Error>
where
    D: Deserializer<'de>,
{
    Deserialize::deserialize(d).map(|x: Option<_>| x.unwrap_or("black".to_string()))
}
See also:
How can I distinguish between a deserialized field that is missing and one that is null?

Any field that can be null should be an Option type so that you can handle the null case. Something like this?
#[derive(Serialize, Deserialize, Debug, Clone)]
pub struct Element {
    ...
    color: Option<String>,
    ...
}

Based on code from here, for cases where the type's default value should be used when null is present.
use serde::{Deserialize, Deserializer};

// Omitting other derives, for brevity
#[derive(Deserialize)]
struct Foo {
    #[serde(deserialize_with = "deserialize_null_default")]
    value: String,
}

// Deserializes an optional value and maps null to `T::default()`.
fn deserialize_null_default<'de, D, T>(deserializer: D) -> Result<T, D::Error>
where
    T: Default + Deserialize<'de>,
    D: Deserializer<'de>,
{
    let opt = Option::deserialize(deserializer)?;
    Ok(opt.unwrap_or_default())
}
This also works for Vec and HashMap.

Related

Deserializing JSON field with possible choices using serde

I'm trying to use serde JSON deserialization to support "choices" using an enum, but it doesn't seem to be working (I have a Python enum background).
Let's say I have this JSON:
{"name": "content", "state": "open"}
and state can be open or closed.
In Python I would just create an enum and use it as the type of state. This is what I tried in Rust:
#[derive(Deserialize)]
enum State {
    Open(String),
    Closed(String),
}
#[derive(Deserialize)]
struct MyStruct {
    name: String,
    state: State,
}
The problem is that I don't know how to deserialize open to State::Open and closed to State::Closed.
I have looked into implementing my own deserializer, but it seems very complicated and too advanced for me.
Is there any straightforward way?
You should remove the String from the variants. Then you'll get another error:
unknown variant `open`, expected `Open` or `Closed`
That is because your enum variants are in PascalCase while your JSON is in camelCase (or snake_case, I don't know). To fix that, add #[serde(rename_all = "camelCase")]:
#[derive(Deserialize)]
#[serde(rename_all = "camelCase")]
enum State {
    Open,
    Closed,
}

Deserializing a struct with custom data types

I am trying to #[derive(Deserialize, Serialize)] some structs that involve other custom structs, so I can transform them in and out of JSON, for example:
#[derive(Debug, Clone, Copy, Deserialize, Serialize)]
pub struct Exercise {
    #[serde(borrow)]
    pub name: &'static str,
    pub muscle_sub_groups: [MuscleSubGroup; 2],
    pub recommended_rep_range: [u32; 2],
    pub equipment: EquipmentType,
}
#[derive(Debug, Clone, Deserialize)]
pub struct SetEntry {
    pub exercise: Exercise,
    pub reps: u32,
    pub weight: Weight, // another struct with two &'static str fields
    pub reps_in_reserve: f32,
}
…but I am running into this:
error[E0495]: cannot infer an appropriate lifetime for lifetime parameter 'de due to conflicting requirements
I've tried multiple different solutions found online, including defining lifetimes, but none of them has worked.
All of my code is here (sorry for spaghetti). The file in question is exercises.rs.
Minimized example which yields the same error:
use serde::Deserialize;

#[derive(Deserialize)]
pub struct Inner {
    #[serde(borrow)]
    pub name: &'static str,
}
#[derive(Deserialize)]
pub struct Outer {
    pub inner: Inner,
}
To see what's wrong, let's look at the expanded code. It's rather large and not very readable, but even the signatures can help:
impl Deserialize<'static> for Inner {
    fn deserialize<D>(deserializer: D) -> Result<Self, D::Error>
    where
        D: Deserializer<'static>,
    {
        todo!()
    }
}
impl<'de> Deserialize<'de> for Outer {
    fn deserialize<D>(deserializer: D) -> Result<Self, D::Error>
    where
        D: Deserializer<'de>,
    {
        todo!()
    }
}
As you can see, the Inner struct (Exercise in the original code) can be deserialized only when the deserializer (i.e. the source data) is 'static, but the Outer struct (SetEntry in the original code) has its deserialization implemented for every deserializer lifetime. Essentially, it implements DeserializeOwned, i.e. it doesn't need to borrow its data from anywhere.
Now you might ask, why this restriction? Why the Deserializer<'static> in the first case? The answer is that you asked for it when you placed serde(borrow) over &'static str.
When deserializing a struct containing references, Serde has to prove that these references will never be dangling. Therefore, the lifetime of the deserialized data (that is, the parameter to the Deserialize trait) must be tied to the lifetime of the original data, that is, to the parameter of Deserializer. And if the struct has any restrictions on the lifetime of its contents, these restrictions are automatically transferred to the Deserializer.
In this case, you've asked for the strictest restriction possible: you've asked that the data being deserialized into Inner be available until the end of the program. However, there's no such restriction on Outer. In fact, it can treat Inner as owned data, since this struct doesn't have any lifetime parameters at all, so Serde asks for the most generic deserializer possible and then chokes when Inner requires it to be 'static.
Now, what to do in this case?
First of all, you definitely don't want to use &'static str in any runtime-generated data. This is the type of string literals, i.e. of strings baked into the executable itself, not of the common-case strings.
The simplest and probably most correct way would be to replace any &'static str with the owned String. This will eliminate the need for serde(borrow) and make your struct deserializable from anything.
If, however, you want to use references (e.g. to eliminate unnecessary copies), you have to treat the whole tree of structs as a temporary borrow into the deserializer. That is, you'll need a lifetime parameter tied to the &str in every struct which directly or indirectly contains that &str:
use serde::Deserialize;

#[derive(Deserialize)]
pub struct Inner<'a> {
    #[serde(borrow)]
    pub name: &'a str,
}
#[derive(Deserialize)]
pub struct Outer<'a> {
    #[serde(borrow)]
    pub inner: Inner<'a>,
}
And then, when creating these structs manually with string literals inside, you can just substitute 'static for 'a.

Transforming "null" in JSON to empty String instead of "None"

I'm using serde to deserialize a JSON file and one of its values is a String.
To read it, I'm using:
#[serde(default)]
pub key: Option<String>,
because in the JSON file the value can be null (which Option handles) or the key can be absent entirely (which #[serde(default)] handles).
The problem I'm having is that in the null case the Option is None, which gives me trouble later. I have to match the Strings and convert to an i8 like this:
let mut transformed: i8 = 0;
if key.as_ref().unwrap() == "H" {
    transformed = 1;
} else {
    transformed = -1; // Case that I'm looking for when null in JSON
}
I searched for ways to handle the None with match, but that also gives me trouble with the String vs &str problem, so I'm looking for a way to assign an empty String "" instead of None when deserializing, so that later I can compare in the same way I'm already doing.
I would also appreciate a less verbose solution to directly parse and assign an i8.
As mentioned by mcarton in the comments, the easiest solution is to stick with using an Option<String> in your struct and use .unwrap_or_default() in the code consuming the data. If this isn't an option for you, you can provide a custom deserializer using the #[serde(deserialize_with=...)] attribute:
use serde_json::from_str;
use serde::{Deserialize, Deserializer, Serialize};

#[derive(Serialize, Deserialize, Debug)]
struct A {
    #[serde(deserialize_with = "null_to_default")]
    key: String,
}

fn null_to_default<'de, D, T>(de: D) -> Result<T, D::Error>
where
    D: Deserializer<'de>,
    T: Default + Deserialize<'de>,
{
    let key = Option::<T>::deserialize(de)?;
    Ok(key.unwrap_or_default())
}

fn main() {
    let a: A = from_str(r#"{"key": null}"#).unwrap();
    dbg!(a);
}
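For the part of the question about converting directly to an i8 while keeping Option<String>, one possible sketch uses only the standard library. The key_to_code function is a hypothetical name; as_deref turns the Option<String> into an Option<&str>, which sidesteps the String vs &str mismatch when matching.

```rust
// Maps the optional key to the i8 code used in the question:
// "H" becomes 1, anything else (including a null or missing key) becomes -1.
fn key_to_code(key: Option<&str>) -> i8 {
    match key {
        Some("H") => 1,
        _ => -1,
    }
}

fn main() {
    let present: Option<String> = Some("H".to_string());
    let absent: Option<String> = None;
    // `as_deref` borrows the String as a &str without cloning it.
    println!("{}", key_to_code(present.as_deref()));
    println!("{}", key_to_code(absent.as_deref()));
}
```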

Deserialize json based on an enum in the json

Is it possible to use a value in JSON to determine how to deserialize the rest of the JSON using serde? For example, consider the following code:
use serde::{Serialize, Deserialize};
use serde_repr::*;

#[derive(Serialize_repr, Deserialize_repr, Debug)]
#[repr(u8)]
enum StructType {
    Foo = 1,
    Bar = 2
}
#[derive(Serialize, Deserialize, Debug)]
struct Foo {
    a: String,
    b: u8
}
#[derive(Serialize, Deserialize, Debug)]
struct Bar {
    x: String,
    y: u32,
    z: u16
}
#[derive(Serialize, Deserialize, Debug)]
struct AllMyStuff {
    type: StructType,
    data: //HELP: Not sure what to put here
}
What I'm trying to achieve is deserialization of the data, even if in multiple steps, where the type field in the AllMyStuff determines which type of struct data is present in data. For example, given the following pseudocode, I'd like to ultimately have a Bar struct with the proper data in it:
data = {"type": "2", "data": { "x": "Hello world", "y": "18", "z": "5" } }
// 1) Use serde_json to deserialize an AllMyStuff struct, not erroring on the "data" blob
// 2) Now that we know data is of type "2" (or Bar), parse the remaining "data" into a Bar struct
If steps (1) and (2) can somehow be done in a single step, that would be awesome, but it is not needed. I'm also not sure what type to declare for data in the AllMyStuff struct to enable this.
You can use serde_json::Value as the type for AllMyStuff::data. It will deserialize any valid JSON value and also implements Deserialize itself, so it can be further deserialized once the type to deserialize to is known (via AllMyStuff::type). While this requires more intermediate steps and (mostly temporary) types, it saves you from manually implementing Deserialize on an enum AllMyStuff { Foo(Foo), Bar(Bar) }.
I may be missing something, but AllMyStuff looks as if you are trying to manually distinguish between Foo and Bar.
However, Rust has a built-in way of doing this:
#[derive(Serialize, Deserialize, Debug)]
enum AllMyStuff {
    Foo(Foo),
    Bar(Bar),
}

Rust and JSON serialization

If the JSON object is missing some fields, the decode function returns an error. For example:
extern crate rustc_serialize;
use rustc_serialize::json;
use rustc_serialize::json::Json;

#[derive(RustcDecodable, RustcEncodable, Debug)]
enum MessageType {
    PING,
    PONG,
    OPT,
}
#[derive(RustcDecodable, RustcEncodable, Debug)]
pub struct JMessage {
    msg_type: String,
    mtype: MessageType,
}

fn main() {
    let result3 = json::decode::<JMessage>(r#"{"msg_type":"TEST"}"#);
    println!("{:?}", result3);
    // this will print `Err(MissingFieldError("mtype"))`
    let result = json::decode::<JMessage>(r#"{"msg_type":"TEST", "mtype":"PING"}"#);
    println!("{:?}", &result);
    // This will print Ok(JMessage { msg_type: "TEST", mtype: PING })
    let result2 = Json::from_str(r#"{"msg_type":"TEST", "mtype":"PING"}"#).unwrap();
    println!("{:?}", &result2);
    // this will print Object({"msg_type": String("TEST"), "mtype": String("PING")})
}
Is there a way to specify that some fields in a struct are optional?
Why does the function from_str not deserialize mtype as an enum?
No, there is no such way. For that, you need to use serde. Serde also has lots of other features, but unfortunately it is not as easy to use as rustc_serialize on stable Rust.
Well, how should it? Json::from_str returns a JSON AST, which consists of maps, arrays, strings, numbers and other JSON types. It simply cannot contain values of your enum. And also there is no way to indicate that you want some other type instead of string, naturally.
Regarding the first question, you can use Option. For example:
pub struct JMessage {
    msg_type: Option<String>,
    mtype: MessageType,
}
The field then defaults to None if it does not exist in the JSON.