Omit values that are Option::None when encoding JSON with rustc_serialize - json

I have a struct that I want to encode to JSON. This struct contains a field with type Option<i32>. Let's say
extern crate rustc_serialize;
use rustc_serialize::json;
#[derive(RustcEncodable)]
struct TestStruct {
test: Option<i32>
}
fn main() {
let object = TestStruct {
test: None
};
let obj = json::encode(&object).unwrap();
println!("{}", obj);
}
This will give me the output
{"test": null}
Is there a convenient way to omit Option fields with value None? In this case I would like to have the output
{}

If someone arrives here with the same question like I did, serde has now an option skip_serializing_none to do exactly that.
https://docs.rs/serde_with/1.8.0/serde_with/attr.skip_serializing_none.html

It doesn't seem to be possible by doing purely from a struct, so i converted the struct into a string, and then converted that into a JSON object. This method requires that all Option types be the same type. I'd recommend if you need to have variable types in the struct to turn them into string's first.
field_vec and field_name_vec have to be filled with all fields at compile time, as I couldn't find a way to get the field values, and field names without knowing them in rust at run time.
extern crate rustc_serialize;
use rustc_serialize::json::Json;
fn main() {
#[derive(RustcEncodable)]
struct TestStruct {
test: Option<i32>
}
impl TestStruct {
fn to_json(&self) -> String {
let mut json_string = String::new();
json_string.push('{');
let field_vec = vec![self.test];
let field_name_vec = vec![stringify!(self.test)];
let mut previous_field = false;
let mut count = 0;
for field in field_vec {
if previous_field {
json_string.push(',');
}
match field {
Some(value) => {
let opt_name = field_name_vec[count].split(". ").collect::<Vec<&str>>()[1];
json_string.push('"');
json_string.push_str(opt_name);
json_string.push('"');
json_string.push(':');
json_string.push_str(&value.to_string());
previous_field = true;
},
None => {},
}
count += 1;
}
json_string.push('}');
json_string
}
}
let object = TestStruct {
test: None
};
let object2 = TestStruct {
test: Some(42)
};
let obj = Json::from_str(&object.to_json()).unwrap();
let obj2 = Json::from_str(&object2.to_json()).unwrap();
println!("{:?}", obj);
println!("{:?}", obj2);
}

To omit Option<T> fields, you can create an implementation of the Encodable trait (instead of using #[derive(RustcEncodable)]).
Here I updated your example to do this.
extern crate rustc_serialize;
use rustc_serialize::json::{ToJson, Json};
use rustc_serialize::{Encodable,json};
use std::collections::BTreeMap;
#[derive(PartialEq, RustcDecodable)]
struct TestStruct {
test: Option<i32>
}
impl Encodable for TestStruct {
fn encode<S: rustc_serialize::Encoder>(&self, s: &mut S) -> Result<(), S::Error> {
self.to_json().encode(s)
}
}
impl ToJson for TestStruct {
fn to_json(&self) -> Json {
let mut d = BTreeMap::new();
match self.test {
Some(value) => { d.insert("test".to_string(), value.to_json()); },
None => {},
}
Json::Object(d)
}
}
fn main() {
let object = TestStruct {
test: None
};
let obj = json::encode(&object).unwrap();
println!("{}", obj);
let decoded: TestStruct = json::decode(&obj).unwrap();
assert!(decoded==object);
}
It would be nice to implement a custom #[derive] macro which does this automatically for Option fields, as this would eliminate the need for such custom implementations of Encodable.

Related

Is there a way to derive a struct like with Deserialize to get automatic transform from serde_json::Value?

Deserialising from a string directly into a struct works perfectly. But in some cases, you may already have a serde_json::Value in your hands, and want to try and convert it into a struct.
The following example illustrate just that: loading a Request struct from JSON (in a network library for example), with a type string and a generic content as a Value, and then you want to call a handler (from a client library) with the value transformed into a given struct.
use serde::Deserialize;
use serde_json::{json, Value};
use std::convert::TryFrom;
use std::error::Error;
#[derive(Deserialize)]
struct Request {
#[serde(alias = "type")]
req_type: String,
content: Value
}
#[derive(Deserialize)]
struct Person {
name: String,
age: u8
}
// It there a way to avoid having to declare this???
impl TryFrom<Value> for Person {
type Error = serde_json::Error;
fn try_from(value: Value) -> Result<Self, Self::Error> {
Person::deserialize(value)
}
}
fn say_hello(p: Person) {
println!("Hello, I'm {}, and I'm {} years old!", p.name, p.age);
}
fn main() -> Result<(), Box<dyn Error>> {
let req: Request = Request::deserialize(json!({
"type": "sayHello",
"content": {
"name": "Pierre",
"age": 32
}
}))?;
match req.req_type.as_str() {
"sayHello" => say_hello(req.content.try_into()?),
_ => println!("unknown request")
}
Ok(())
}
So the question is: is there some derive or other magic implemented which would allow the same behaviour as Deserialize from String, so that the client can only write:
#[derive(Deserialize)]
struct Person {
name: String,
age: u8
}
fn say_hello(p: Person) {
println!("Hello, I'm {}, and I'm {} years old!", p.name, p.age);
}
I tried the #[serde(try_from = "Value")] attribute but it does not look like it's intended for that purpose...
There is serde_json::from_value() specifically for this:
pub fn from_value<T>(value: Value) -> Result<T, Error>
where
T: DeserializeOwned,
Given any serde_json::Value and some T: DeserializedOwned, the function will deserialize the Value to that T, if possible.

Reading CSV with list valued columns in rust

I'm trying to use csv and serde to read a mixed-delimiter csv-type file in rust, but I'm having a hard time seeing how to use these libraries to accomplish it. Each line looks roughly like:
value1|value2|subvalue1,subvalue2,subvalue3|value4
and would de-serialize to a struct that looks like:
struct Line {
value1:u64,
value2:u64,
value3:Vec<u64>,
value4:u64,
}
Any guidance on how to tell the library that there are two different delimiters and that one of the columns has this nested structure?
Ok, I'm still a beginner in Rust so I can't guarantee that this is good at all- I suspect it could be done more efficiently, but I do have a solution that works-
use csv::{ReaderBuilder};
use serde::{Deserialize, Deserializer};
use serde::de::Error;
use std::error::Error as StdError;
#[derive(Debug, Deserialize)]
pub struct ListType {
values: Vec<u8>,
}
fn deserialize_list<'de, D>(deserializer: D) -> Result<ListType , D::Error>
where D: Deserializer<'de> {
let buf: &str = Deserialize::deserialize(deserializer)?;
let mut rdr = ReaderBuilder::new()
.delimiter(b',')
.has_headers(false)
.from_reader(buf.as_bytes());
let mut iter = rdr.deserialize();
if let Some(result) = iter.next() {
let record: ListType = result.map_err(D::Error::custom)?;
return Ok(record)
} else {
return Err("error").map_err(D::Error::custom)
}
}
struct Line {
value1:u64,
value2:u64,
#[serde(deserialize_with = "deserialize_list")]
value3:ListType,
value4:u64,
}
fn read_line(line: &str) -> Result<Line, Box<dyn StdError>> {
let mut rdr = ReaderBuilder::new()
.delimiter(b'|')
.from_reader(line.as_bytes());
let mut iter = rdr.deserialize();
if let Some(result) = iter.next() {
let record: Line = result?;
return Ok(Line)
} else {
return Err(From::from("error"));
}
}
[EDIT]
I found that the above solution was intolerably slow, but I was able to make performance acceptable by simply manually deserializing the nested type into a fixed size array by-
#[derive(Debug, Deserialize)]
pub struct ListType {
values: [Option<u8>; 8],
}
fn deserialize_farray<'de, D>(deserializer: D) -> Result<ListType, D::Error>
where
D: Deserializer<'de>,
{
let buf: &str = Deserialize::deserialize(deserializer)?;
let mut split = buf.split(",");
let mut dest: CondList = CondList {
values: [None; 8],
};
let mut ind: usize = 0;
for tok in split {
if tok == "" {
break;
}
match tok.parse::<u8>() {
Ok(val) => {
dest.values[ind] = Some(val);
}
Err(e) => {
return Err(e).map_err(D::Error::custom);
}
}
ind += 1;
}
return Ok(dest);
}

Serde Deserialize into one of multiple structs?

Is there a nice way to tentatively deserialize a JSON into different structs? Couldn't find anything in the docs and unfortunately the structs have "tag" to differentiate as in How to conditionally deserialize JSON to two different variants of an enum?
So far my approach has been like this:
use aws_lambda_events::event::{
firehose::KinesisFirehoseEvent, kinesis::KinesisEvent,
kinesis_analytics::KinesisAnalyticsOutputDeliveryEvent,
};
use lambda::{lambda, Context};
use serde_json::Value;
type Error = Box<dyn std::error::Error + Send + Sync + 'static>;
enum MultipleKinesisEvent {
KinesisEvent(KinesisEvent),
KinesisFirehoseEvent(KinesisFirehoseEvent),
KinesisAnalyticsOutputDeliveryEvent(KinesisAnalyticsOutputDeliveryEvent),
None,
}
#[lambda]
#[tokio::main]
async fn main(event: Value, _: Context) -> Result<String, Error> {
let multi_kinesis_event = if let Ok(e) = serde_json::from_value::<KinesisEvent>(event.clone()) {
MultipleKinesisEvent::KinesisEvent(e)
} else if let Ok(e) = serde_json::from_value::<KinesisFirehoseEvent>(event.clone()) {
MultipleKinesisEvent::KinesisFirehoseEvent(e)
} else if let Ok(e) = serde_json::from_value::<KinesisAnalyticsOutputDeliveryEvent>(event) {
MultipleKinesisEvent::KinesisAnalyticsOutputDeliveryEvent(e)
} else {
MultipleKinesisEvent::None
};
// code below is just sample
let s = match multi_kinesis_event {
MultipleKinesisEvent::KinesisEvent(_) => "Kinesis Data Stream!",
MultipleKinesisEvent::KinesisFirehoseEvent(_) => "Kinesis Firehose!",
MultipleKinesisEvent::KinesisAnalyticsOutputDeliveryEvent(_) => "Kinesis Analytics!",
MultipleKinesisEvent::None => "Not Kinesis!",
};
Ok(s.to_owned())
}
You should use the #untagged option.
use serde::{Serialize, Deserialize};
#[derive(Serialize, Deserialize, Debug)]
struct KinesisFirehoseEvent {
x: i32,
y: i32
}
#[derive(Serialize, Deserialize, Debug)]
struct KinesisEvent(i32);
#[derive(Serialize, Deserialize, Debug)]
#[serde(untagged)]
enum MultipleKinesisEvent {
KinesisEvent(KinesisEvent),
KinesisFirehoseEvent(KinesisFirehoseEvent),
None,
}
fn main() {
let event = MultipleKinesisEvent::KinesisFirehoseEvent(KinesisFirehoseEvent { x: 1, y: 2 });
// Convert the Event to a JSON string.
let serialized = serde_json::to_string(&event).unwrap();
// Prints serialized = {"x":1,"y":2}
println!("serialized = {}", serialized);
// Convert the JSON string back to a MultipleKinesisEvent.
// Since it is untagged
let deserialized: MultipleKinesisEvent = serde_json::from_str(&serialized).unwrap();
// Prints deserialized = KinesisFirehoseEvent(KinesisFirehoseEvent { x: 1, y: 2 })
println!("deserialized = {:?}", deserialized);
}
See in playground
Docs for: Untagged

How to read and process a pipe delimited file in Rust?

I want to read a pipe delimited file, process the data, and generate a result in CSV format.
Input file data
A|1|Pass
B|2|Fail
A|3|Fail
C|6|Pass
A|8|Pass
B|10|Fail
C|25|Pass
A|12|Fail
C|26|Pass
C|26|Fail
I'm want to apply a group by function on column 1 and column 3 and generate column 2's sum according to a particular group.
I'm stuck at the point of how to maintain the records to apply the group by function:
use std::fs::File;
use std::io::{BufReader};
use std::io::{BufRead};
use std::collections::HashMap;
fn say_hello(id: &str, value: i32, no_change : i32) {
if no_change == 101 {
let mut data = HashMap::new();
}
if value == 0 {
if data.contains_key(id) {
for (key, value) in &data {
if value.is_empty() {
}
}
} else {
data.insert(id,"");
}
} else if value == 2 {
if data.contains_key(id) {
for (key, value) in &data {
if value.is_empty() {
} else {
}
}
} else {
data.insert(id,"");
}
}
}
fn main() {
let f = File::open("sample2.txt").expect("Unable to open file");
let br = BufReader::new(f);
let mut no_change = 101;
for line in br.lines() {
let mut index = 0;
for value in line.unwrap().split('|') {
say_hello(&value,index,no_change);
index = index + 1;
}
}
}
I'm expecting a result like:
name,result,num
A,Fail,15
A,Pass,9
B,Fail,12
C,Fail,26
C,Pass,57
Is there any specific technique to read a pipe-delimited file and process the data like above? Python's pandas accomplished this requirement but I want to do it in Rust.
As was mentioned, use the csv crate to do the heavy lifting of parsing the file. Then it's just a matter of grouping each row by using a BTreeMap which also helpfully performs sorting. The entry API helps efficiently insert into the BTreeMap.
extern crate csv;
extern crate rustc_serialize;
use std::fs::File;
use std::collections::BTreeMap;
#[derive(Debug, RustcDecodable)]
struct Record {
name: String,
value: i32,
passed: String,
}
fn main() {
let file = File::open("input").expect("Couldn't open input");
let mut csv_file = csv::Reader::from_reader(file).delimiter(b'|').has_headers(false);
let mut sums = BTreeMap::new();
for record in csv_file.decode() {
let record: Record = record.expect("Could not parse input file");
let key = (record.name, record.passed);
*sums.entry(key).or_insert(0) += record.value;
}
println!("name,result,num");
for ((name, passed), sum) in sums {
println!("{},{},{}", name, passed, sum);
}
}
You'll note that the output is correct:
name,result,num
A,Fail,15
A,Pass,9
B,Fail,12
C,Fail,26
C,Pass,57
I'd suggest something like this:
use std::str;
use std::collections::HashMap;
use std::io::{BufReader, BufRead, Cursor};
fn main() {
let data = "
A|1|Pass
B|2|Fail
A|3|Fail
C|6|Pass
A|8|Pass
B|10|Fail
C|25|Pass
A|12|Fail
C|26|Pass
C|26|Fail";
let lines = BufReader::new(Cursor::new(data))
.lines()
.flat_map(Result::ok)
.flat_map(parse_line);
for ((fa, fb), s) in group(lines) {
println!("{}|{}|{}", fa, fb, s);
}
}
type ParsedLine = ((String, String), usize);
fn parse_line(line: String) -> Option<ParsedLine> {
let mut fields = line
.split('|')
.map(str::trim);
if let (Some(fa), Some(fb), Some(fc)) = (fields.next(), fields.next(), fields.next()) {
fb.parse()
.ok()
.map(|v| ((fa.to_string(), fc.to_string()), v))
} else {
None
}
}
fn group<I>(input: I) -> Vec<ParsedLine> where I: Iterator<Item = ParsedLine> {
let mut table = HashMap::new();
for (k, v) in input {
let mut sum = table.entry(k).or_insert(0);
*sum += v;
}
let mut output: Vec<_> = table
.into_iter()
.collect();
output.sort_by(|a, b| a.0.cmp(&b.0));
output
}
playground link
Here a HashMap is used for grouping entries and then results are moved to a Vec for sorting.

rustc-serialize custom enum decoding

I have a JSON structure where one of the fields of a struct could be either an object, or that object's ID in the database. Let's say the document looks like this with both possible formats of the struct:
[
{
"name":"pebbles",
"car":1
},
{
"name":"pebbles",
"car":{
"id":1,
"color":"green"
}
}
]
I'm trying to figure out the best way to implement a custom decoder for this. So far, I've tried a few different ways, and I'm currently stuck here:
extern crate rustc_serialize;
use rustc_serialize::{Decodable, Decoder, json};
#[derive(RustcDecodable, Debug)]
struct Car {
id: u64,
color: String
}
#[derive(Debug)]
enum OCar {
Id(u64),
Car(Car)
}
#[derive(Debug)]
struct Person {
name: String,
car: OCar
}
impl Decodable for Person {
fn decode<D: Decoder>(d: &mut D) -> Result<Person, D::Error> {
d.read_struct("root", 2, |d| {
let mut car: OCar;
// What magic must be done here to get the right OCar?
/* I tried something akin to this:
let car = try!(d.read_struct_field("car", 0, |r| {
let r1 = Car::decode(r);
let r2 = u64::decode(r);
// Compare both R1 and R2, but return code for Err() was tricky
}));
*/
/* And this got me furthest */
match d.read_struct_field("car", 0, u64::decode) {
Ok(x) => {
car = OCar::Id(x);
},
Err(_) => {
car = OCar::Car(try!(d.read_struct_field("car", 0, Car::decode)));
}
}
Ok(Person {
name: try!(d.read_struct_field("name", 0, Decodable::decode)),
car: car
})
})
}
}
fn main() {
// Vector of both forms
let input = "[{\"name\":\"pebbles\",\"car\":1},{\"name\":\"pebbles\",\"car\":{\"id\":1,\"color\":\"green\"}}]";
let output: Vec<Person> = json::decode(&input).unwrap();
println!("Debug: {:?}", output);
}
The above panics with an EOL which is a sentinel value rustc-serialize uses on a few of its error enums. Full line is
thread '<main>' panicked at 'called `Result::unwrap()` on an `Err` value: EOF', src/libcore/result.rs:785
What's the right way to do this?
rustc-serialize, or at least its JSON decoder, doesn't support that use case. If you look at the implementation of read_struct_field (or any other method), you can see why: it uses a stack, but when it encounters an error, it doesn't bother to restore the stack to its original state, so when you try to decode the same thing differently, the decoder is operating on an inconsistent stack, eventually leading to an unexpected EOF value.
I would recommend you look into Serde instead. Deserializing in Serde is different: instead of telling the decoder what type you're expecting, and having no clear way to recover if a value is of the wrong type, Serde calls into a visitor that can handle any of the types that Serde supports in the way it wants. This means that Serde will call different methods on the visitor depending on the actual type of the value it parsed. For example, we can handle integers to return an OCar::Id and objects to return an OCar::Car.
Here's a full example:
#![feature(custom_derive, plugin)]
#![plugin(serde_macros)]
extern crate serde;
extern crate serde_json;
use serde::de::{Deserialize, Deserializer, Error, MapVisitor, Visitor};
use serde::de::value::MapVisitorDeserializer;
#[derive(Deserialize, Debug)]
struct Car {
id: u64,
color: String
}
#[derive(Debug)]
enum OCar {
Id(u64),
Car(Car),
}
struct OCarVisitor;
#[derive(Deserialize, Debug)]
struct Person {
name: String,
car: OCar,
}
impl Deserialize for OCar {
fn deserialize<D>(deserializer: &mut D) -> Result<Self, D::Error> where D: Deserializer {
deserializer.deserialize(OCarVisitor)
}
}
impl Visitor for OCarVisitor {
type Value = OCar;
fn visit_u64<E>(&mut self, v: u64) -> Result<Self::Value, E> where E: Error {
Ok(OCar::Id(v))
}
fn visit_map<V>(&mut self, visitor: V) -> Result<Self::Value, V::Error> where V: MapVisitor {
Ok(OCar::Car(try!(Car::deserialize(&mut MapVisitorDeserializer::new(visitor)))))
}
}
fn main() {
// Vector of both forms
let input = "[{\"name\":\"pebbles\",\"car\":1},{\"name\":\"pebbles\",\"car\":{\"id\":1,\"color\":\"green\"}}]";
let output: Vec<Person> = serde_json::from_str(input).unwrap();
println!("Debug: {:?}", output);
}
Output:
Debug: [Person { name: "pebbles", car: Id(1) }, Person { name: "pebbles", car: Car(Car { id: 1, color: "green" }) }]
Cargo.toml:
[dependencies]
serde = "0.7"
serde_json = "0.7"
serde_macros = "0.7"