I'm trying to parse a JSON file with the following loose format using serde_json in Rust:
{
    "Source_n": {
        "Destination_n": {
            "distance": 2,
            "connections": [
                {
                    "color": "Any",
                    "locomotives": 0,
                    "tunnels": 0
                }
            ]
        }
        ...
where Source and Destination can be any number of keys (Link to full file).
I've created the following structs in an attempt to deserialize the JSON:
#[derive(Debug, Deserialize)]
struct L0 {
    routes: HashMap<String, L1>,
}

#[derive(Debug, Deserialize)]
struct L1 {
    destination_city: HashMap<String, L2>,
}

#[derive(Debug, Deserialize)]
struct L2 {
    distance: u8,
    connections: Vec<L3>,
}

#[derive(Debug, Deserialize, Clone)]
struct L3 {
    color: String,
    locomotives: u8,
    tunnels: u8,
}
When I try to read the JSON as an L0 object I get a panic on this line:
let data: L0 = serde_json::from_str(&route_file_as_string).unwrap();
Panic:
Finished dev [unoptimized + debuginfo] target(s) in 0.01s
Running `target/debug/ticket-to-ride`
thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: Error("missing field `routes`", line: 1889, column: 1)', src/route.rs:39:64
stack backtrace:
0: rust_begin_unwind
at /rustc/59eed8a2aac0230a8b53e89d4e99d55912ba6b35/library/std/src/panicking.rs:517:5
1: core::panicking::panic_fmt
at /rustc/59eed8a2aac0230a8b53e89d4e99d55912ba6b35/library/core/src/panicking.rs:101:14
2: core::result::unwrap_failed
at /rustc/59eed8a2aac0230a8b53e89d4e99d55912ba6b35/library/core/src/result.rs:1617:5
3: core::result::Result<T,E>::unwrap
at /rustc/59eed8a2aac0230a8b53e89d4e99d55912ba6b35/library/core/src/result.rs:1299:23
4: ticket_to_ride::route::route_file_to_L0
at ./src/route.rs:39:20
5: ticket_to_ride::route::routes_from_file
at ./src/route.rs:44:33
6: ticket_to_ride::main
at ./src/main.rs:6:5
7: core::ops::function::FnOnce::call_once
at /rustc/59eed8a2aac0230a8b53e89d4e99d55912ba6b35/library/core/src/ops/function.rs:227:5
I've been able to read the JSON as a HashMap<String, Value> object, but whenever I try to start working at the lower levels I get an error. It seems to be looking for a key named routes, but what I actually want is just a nested HashMap, similar to how you can read JSON in Python in a nested fashion.
Any advice on how to proceed? Is what I'm attempting reasonable with this library?
As Sven Marnach says in their comment, add #[serde(flatten)] to create the HashMap from data that uses keys as JSON fields:
#[derive(Debug, Deserialize)]
struct L0 {
    #[serde(flatten)]
    routes: HashMap<String, L1>,
}

#[derive(Debug, Deserialize)]
struct L1 {
    #[serde(flatten)]
    destination_city: HashMap<String, L2>,
}
Functioning code parsing the referenced JSON is below. The demo function executes the parsing.
use serde::Deserialize;
use std::collections::HashMap;
use std::fs;

#[derive(Debug, Deserialize)]
pub struct L1 {
    #[serde(flatten)]
    destination_city: HashMap<String, L2>,
}

#[derive(Debug, Deserialize)]
struct L2 {
    distance: u8,
    connections: Vec<L3>,
}

#[derive(Debug, Deserialize, Clone)]
struct L3 {
    color: String,
    locomotives: u8,
    tunnels: u8,
}

// Read the whole file and deserialize it into a map keyed by source city.
fn route_file_to_hashmap(fpath: &str) -> HashMap<String, L1> {
    let route_file_as_string = fs::read_to_string(fpath).expect("Unable to read file");
    serde_json::from_str(&route_file_as_string).unwrap()
}

pub fn routes_from_file(fpath: &str) -> HashMap<String, L1> {
    route_file_to_hashmap(fpath)
}

pub fn demo() {
    let routes: HashMap<String, L1> = routes_from_file("usa.routes.json");

    println!("---Cities---");
    for (k, _) in &routes {
        println!("{}", k);
    }

    let chicago: &HashMap<String, L2> = &routes.get("Chicago").unwrap().destination_city;
    println!("---Destinations from Chicago---");
    for (k, _) in chicago {
        println!("{}", k);
    }

    let to_omaha: &L2 = chicago.get("Omaha").unwrap();
    println!("---Data on Route to Omaha---");
    println!("Distance: {}", to_omaha.distance);

    // Print all connection colors on one line, then end the line.
    print!("Connections: ");
    for c in &to_omaha.connections {
        print!("{} ", c.color);
    }
    println!();
}
I have an enum:
#[derive(Serialize, Deserialize)]
enum Action {
    Join,
    Leave,
}
and a struct:
#[derive(Serialize, Deserialize)]
struct Message {
    action: Action,
}
and I pass a JSON string:
"{\"action\":0}" // `json_string` var
but when I try deserializing it like this:
let msg: Message = serde_json::from_str(json_string)?;
I get the error expected value at line 1 column 11.
If I replace the number 0 in the JSON with the string "Join", it works, but I want the number to correspond to the Action enum's values (0 is Action::Join, 1 is Action::Leave) since it's coming from a TypeScript request. Is there a simple way to achieve this?
You want serde_repr!
Here's example code from the library's README:
use serde_repr::{Serialize_repr, Deserialize_repr};

#[derive(Serialize_repr, Deserialize_repr, PartialEq, Debug)]
#[repr(u8)]
enum SmallPrime {
    Two = 2,
    Three = 3,
    Five = 5,
    Seven = 7,
}

fn main() -> serde_json::Result<()> {
    let j = serde_json::to_string(&SmallPrime::Seven)?;
    assert_eq!(j, "7");

    let p: SmallPrime = serde_json::from_str("2")?;
    assert_eq!(p, SmallPrime::Two);

    Ok(())
}
For your case:
use serde_repr::{Serialize_repr, Deserialize_repr};

#[derive(Serialize_repr, Deserialize_repr)]
#[repr(u8)]
enum Action {
    Join = 0,
    Leave = 1,
}

use serde::{Serialize, Deserialize};

#[derive(Serialize, Deserialize)]
struct Message {
    action: Action,
}
Without adding any extra dependencies, the least verbose way is possibly to use "my favourite serde trick", the try_from and into container attributes. But in this case I feel that custom implementations of Deserialize and Serialize are more appropriate:
impl<'de> Deserialize<'de> for Action {
    fn deserialize<D>(deserializer: D) -> Result<Self, D::Error>
    where
        D: serde::Deserializer<'de>,
    {
        match i8::deserialize(deserializer)? {
            0 => Ok(Action::Join),
            1 => Ok(Action::Leave),
            _ => Err(serde::de::Error::custom("Expected 0 or 1 for action")),
        }
    }
}

impl Serialize for Action {
    fn serialize<S>(&self, serializer: S) -> Result<S::Ok, S::Error>
    where
        S: serde::Serializer,
    {
        serializer.serialize_i8(match self {
            Action::Join => 0,
            Action::Leave => 1,
        })
    }
}
The custom implementations only redirect to serializing/deserializing i8. Playground
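For comparison, the try_from/into route mentioned above rests on two ordinary conversions between Action and an integer. The sketch below shows only those conversions, using the standard library. The Serde wiring appears as a comment, on the assumption that the container attributes are added next to the derive; the choice of u8 and the error text are illustrative.

```rust
use std::convert::TryFrom;

// With Serde in scope, the enum would additionally carry (assumed wiring):
//   #[derive(Serialize, Deserialize)]
//   #[serde(try_from = "u8", into = "u8")]
// The `into` attribute needs the type to be Clone, hence the derive below.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Action {
    Join,
    Leave,
}

impl TryFrom<u8> for Action {
    type Error = String;

    // Map the wire value to a variant, rejecting anything else.
    fn try_from(value: u8) -> Result<Self, Self::Error> {
        match value {
            0 => Ok(Action::Join),
            1 => Ok(Action::Leave),
            other => Err(format!("expected 0 or 1 for action, got {}", other)),
        }
    }
}

impl From<Action> for u8 {
    // Map a variant back to its wire value.
    fn from(action: Action) -> u8 {
        match action {
            Action::Join => 0,
            Action::Leave => 1,
        }
    }
}

fn main() {
    assert_eq!(Action::try_from(0u8), Ok(Action::Join));
    assert_eq!(u8::from(Action::Leave), 1);
    assert!(Action::try_from(7u8).is_err());
}
```

The trade-off against the hand-written impls is mostly one of taste: the conversions are reusable outside Serde, while the custom impls keep everything in one place.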
I want to serialize/deserialize a CSV file with variable row length and content, like the following:
./test.csv
Message,20200202T102030,Some message content
Measurement,20200202T102031,10,30,40,2
AnotherMeasurement,20200202T102034,0,2
In my opinion, the easiest way to represent this is the following enum:
#[derive(Debug, Serialize, Deserialize)]
pub enum Record {
    Message { timestamp: String, content: String }, // timestamp is a String for simplicity
    Measurement { timestamp: String, a: u32, b: u32, c: u32, d: u32 },
    AnotherMeasurement { timestamp: String, a: u32, b: u32 },
}
Cargo.toml
[dependencies]
csv = "^1.1.6"
serde = { version = "^1", features = ["derive"] }
Running the following
main.rs
use std::error::Error;

fn example() -> Result<(), Box<dyn Error>> {
    let mut rdr = csv::ReaderBuilder::new()
        .has_headers(false)
        .delimiter(b',')
        .flexible(true)
        .double_quote(false)
        .from_path("./test.csv")
        .unwrap();
    for result in rdr.deserialize() {
        let record: Record = result?;
        println!("{:?}", record);
    }
    Ok(())
}

fn write_msg() -> Result<(), Box<dyn Error>> {
    let msg = Record::Message {
        timestamp: String::from("time"),
        content: String::from("content"),
    };
    let mut wtr = csv::WriterBuilder::new()
        .has_headers(false)
        .flexible(true)
        .double_quote(false)
        .from_writer(std::io::stdout());
    wtr.serialize(msg)?;
    wtr.flush()?;
    Ok(())
}

fn main() {
    if let Err(err) = example() {
        println!("error running example: {}", err);
    }
    if let Err(err) = write_msg() {
        println!("error running example: {}", err);
    }
}
prints
error running example: CSV deserialize error: record 0 (line: 1, byte: 0): invalid type: unit variant, expected struct variant
error running example: CSV write error: serializing enum struct variants is not supported
Is there an easy solution to do this with serde and csv? I feel like I missed one or two serde attributes, but I was not able to find the right one in the documentation yet.
EDITS
Netwave suggested adding the #[serde(tag = "type")] attribute. Serializing now works, but deserializing gives the following error:
error running example: CSV deserialize error: record 0 (line: 1, byte: 0): invalid type: string "Message", expected internally tagged enum Record
Research I did that did not lead to a solution yet
Is there a way to "flatten" enums for (de)serialization in Rust?
https://docs.rs/csv/1.1.6/csv/tutorial/index.html
Custom serde serialization for enum type
https://serde.rs/enum-representations.html
Make your enum tagged (internally tagged specifically):
#[derive(Debug, Serialize, Deserialize)]
#[serde(tag = "type")]
pub enum Record {
    Message { timestamp: String, content: String }, // timestamp is String because of simplicity
    Measurement { timestamp: String, a: u32, b: u32, c: u32, d: u32 },
    AnotherMeasurement { timestamp: String, a: u32, b: u32 },
}
Playground
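If deserialization through the derive keeps failing (as the edit in the question reports), one fallback is to parse each row by hand, switching on the first column. The sketch below does this with the standard library only. parse_record and num are hypothetical helpers, not part of the csv crate, and the sketch assumes fields contain no quoted or escaped commas.

```rust
#[derive(Debug, PartialEq)]
pub enum Record {
    Message { timestamp: String, content: String },
    Measurement { timestamp: String, a: u32, b: u32, c: u32, d: u32 },
    AnotherMeasurement { timestamp: String, a: u32, b: u32 },
}

// Parse one numeric field, naming the offending text on failure.
fn num(field: &str) -> Result<u32, String> {
    field
        .trim()
        .parse::<u32>()
        .map_err(|e| format!("bad number {:?}: {}", field, e))
}

// Hypothetical helper: build a Record from one comma-separated line.
// Assumes no quoting or escaped commas inside fields.
fn parse_record(line: &str) -> Result<Record, String> {
    let fields: Vec<&str> = line.split(',').collect();
    match fields.as_slice() {
        ["Message", ts, content] => Ok(Record::Message {
            timestamp: ts.to_string(),
            content: content.to_string(),
        }),
        ["Measurement", ts, a, b, c, d] => Ok(Record::Measurement {
            timestamp: ts.to_string(),
            a: num(a)?,
            b: num(b)?,
            c: num(c)?,
            d: num(d)?,
        }),
        ["AnotherMeasurement", ts, a, b] => Ok(Record::AnotherMeasurement {
            timestamp: ts.to_string(),
            a: num(a)?,
            b: num(b)?,
        }),
        _ => Err(format!("unrecognised row: {}", line)),
    }
}

fn main() {
    let input = "Message,20200202T102030,Some message content\n\
                 Measurement,20200202T102031,10,30,40,2\n\
                 AnotherMeasurement,20200202T102034,0,2";
    for line in input.lines() {
        println!("{:?}", parse_record(line));
    }
}
```

The row shape is checked by the slice pattern itself, so a Measurement row with too few columns falls through to the error arm instead of being silently accepted.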
Before the CSV header (Time,Ampl), there are some 'invalid' lines.
The CSV looks like this:
LECROYWS3024,13568,Waveform
Segments,1,SegmentSize,100002
Segment,TrigTime,TimeSinceSegment1
#1,01-Apr-2021 16:49:34,0
Time,Ampl
-2.510018e-005,0
-2.509968e-005,0
-2.509918e-005,0
-2.509868e-005,0
-2.509818e-005,0
...
When I build and run the executable, I get the following error:
CSV deserialize error: record 1 (line: 1, byte: 29): missing field Time
How can I deal with the invalid data with serde or other crates? Thanks!
use std::error::Error;
use std::io;
use std::process;
use serde::Deserialize;

#[derive(Debug, Deserialize)]
struct Record {
    Time: Option<f32>,
    Ampl: Option<f32>,
}
...
fn example() -> Result<(), Box<dyn Error>> {
    let mut rdr = csv::Reader::from_path("foo.csv")?;
    for result in rdr.deserialize() {
        let record: Record = result?;
        let x0 = match record.Time {
            Some(x) => x,
            None => 0.0,
        };
        ...
    }
    Ok(())
}

fn main() {
    if let Err(err) = example() {
        println!("error running example: {}", err);
        process::exit(1);
    }
}
You can use the csv crate, which has a custom deserializer: csv::invalid_option.
Then you can use an attribute like this in your struct:
#[derive(Debug, Deserialize)]
struct Record {
    Time: Option<f32>,
    #[serde(deserialize_with = "csv::invalid_option")]
    Ampl: Option<f32>,
}
to have invalid data converted to None values
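The missing field Time error suggests that the reader is taking the first preamble line as the header row. One way around that, sketched below with the standard library only, is to drop everything before the real Time,Ampl header and parse only the remainder. strip_preamble and parse_samples are hypothetical helpers, and the sketch assumes the header line is spelled exactly Time,Ampl.

```rust
// Hypothetical helper: return the text starting at the `Time,Ampl` header,
// or None if no such line exists. Assumes the header is spelled exactly so.
fn strip_preamble(text: &str) -> Option<&str> {
    let mut offset = 0;
    for line in text.split_inclusive('\n') {
        if line.trim_end() == "Time,Ampl" {
            return Some(&text[offset..]);
        }
        offset += line.len();
    }
    None
}

// Hypothetical helper: parse the rows after the header into (time, ampl)
// pairs; a field that is missing or fails to parse becomes None.
fn parse_samples(data: &str) -> Vec<(Option<f32>, Option<f32>)> {
    data.lines()
        .skip(1) // skip the `Time,Ampl` header itself
        .filter(|line| !line.trim().is_empty())
        .map(|line| {
            let mut fields = line.split(',');
            let time = fields.next().and_then(|f| f.trim().parse::<f32>().ok());
            let ampl = fields.next().and_then(|f| f.trim().parse::<f32>().ok());
            (time, ampl)
        })
        .collect()
}

fn main() {
    let text = "LECROYWS3024,13568,Waveform\n\
                Segments,1,SegmentSize,100002\n\
                Segment,TrigTime,TimeSinceSegment1\n\
                #1,01-Apr-2021 16:49:34,0\n\
                Time,Ampl\n\
                -2.510018e-005,0\n\
                -2.509968e-005,0\n";
    let data = strip_preamble(text).expect("no Time,Ampl header found");
    for (time, ampl) in parse_samples(data) {
        println!("{:?} {:?}", time, ampl);
    }
}
```

If the csv crate is preferred for the actual parsing, the slice returned by strip_preamble could instead be handed to csv::Reader::from_reader(data.as_bytes()), so that the first row the reader sees is the real header.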
Is there a nice way to tentatively deserialize a JSON into different structs? I couldn't find anything in the docs, and unfortunately the structs don't have a "tag" to differentiate them, as in How to conditionally deserialize JSON to two different variants of an enum?
So far my approach has been like this:
use aws_lambda_events::event::{
    firehose::KinesisFirehoseEvent, kinesis::KinesisEvent,
    kinesis_analytics::KinesisAnalyticsOutputDeliveryEvent,
};
use lambda::{lambda, Context};
use serde_json::Value;

type Error = Box<dyn std::error::Error + Send + Sync + 'static>;

enum MultipleKinesisEvent {
    KinesisEvent(KinesisEvent),
    KinesisFirehoseEvent(KinesisFirehoseEvent),
    KinesisAnalyticsOutputDeliveryEvent(KinesisAnalyticsOutputDeliveryEvent),
    None,
}

#[lambda]
#[tokio::main]
async fn main(event: Value, _: Context) -> Result<String, Error> {
    let multi_kinesis_event = if let Ok(e) = serde_json::from_value::<KinesisEvent>(event.clone()) {
        MultipleKinesisEvent::KinesisEvent(e)
    } else if let Ok(e) = serde_json::from_value::<KinesisFirehoseEvent>(event.clone()) {
        MultipleKinesisEvent::KinesisFirehoseEvent(e)
    } else if let Ok(e) = serde_json::from_value::<KinesisAnalyticsOutputDeliveryEvent>(event) {
        MultipleKinesisEvent::KinesisAnalyticsOutputDeliveryEvent(e)
    } else {
        MultipleKinesisEvent::None
    };

    // code below is just sample
    let s = match multi_kinesis_event {
        MultipleKinesisEvent::KinesisEvent(_) => "Kinesis Data Stream!",
        MultipleKinesisEvent::KinesisFirehoseEvent(_) => "Kinesis Firehose!",
        MultipleKinesisEvent::KinesisAnalyticsOutputDeliveryEvent(_) => "Kinesis Analytics!",
        MultipleKinesisEvent::None => "Not Kinesis!",
    };
    Ok(s.to_owned())
}
You should use the #[serde(untagged)] attribute.
use serde::{Serialize, Deserialize};

#[derive(Serialize, Deserialize, Debug)]
struct KinesisFirehoseEvent {
    x: i32,
    y: i32
}

#[derive(Serialize, Deserialize, Debug)]
struct KinesisEvent(i32);

#[derive(Serialize, Deserialize, Debug)]
#[serde(untagged)]
enum MultipleKinesisEvent {
    KinesisEvent(KinesisEvent),
    KinesisFirehoseEvent(KinesisFirehoseEvent),
    None,
}

fn main() {
    let event = MultipleKinesisEvent::KinesisFirehoseEvent(KinesisFirehoseEvent { x: 1, y: 2 });

    // Convert the Event to a JSON string.
    let serialized = serde_json::to_string(&event).unwrap();

    // Prints serialized = {"x":1,"y":2}
    println!("serialized = {}", serialized);

    // Convert the JSON string back to a MultipleKinesisEvent.
    // Since the enum is untagged, Serde tries each variant in order
    // and keeps the first one that deserializes successfully.
    let deserialized: MultipleKinesisEvent = serde_json::from_str(&serialized).unwrap();

    // Prints deserialized = KinesisFirehoseEvent(KinesisFirehoseEvent { x: 1, y: 2 })
    println!("deserialized = {:?}", deserialized);
}
See in playground
Docs for: Untagged
I have a JSON structure where one of the fields of a struct could be either an object, or that object's ID in the database. Let's say the document looks like this with both possible formats of the struct:
[
    {
        "name":"pebbles",
        "car":1
    },
    {
        "name":"pebbles",
        "car":{
            "id":1,
            "color":"green"
        }
    }
]
I'm trying to figure out the best way to implement a custom decoder for this. So far, I've tried a few different ways, and I'm currently stuck here:
extern crate rustc_serialize;

use rustc_serialize::{Decodable, Decoder, json};

#[derive(RustcDecodable, Debug)]
struct Car {
    id: u64,
    color: String
}

#[derive(Debug)]
enum OCar {
    Id(u64),
    Car(Car)
}

#[derive(Debug)]
struct Person {
    name: String,
    car: OCar
}

impl Decodable for Person {
    fn decode<D: Decoder>(d: &mut D) -> Result<Person, D::Error> {
        d.read_struct("root", 2, |d| {
            let mut car: OCar;

            // What magic must be done here to get the right OCar?

            /* I tried something akin to this:
            let car = try!(d.read_struct_field("car", 0, |r| {
                let r1 = Car::decode(r);
                let r2 = u64::decode(r);

                // Compare both R1 and R2, but return code for Err() was tricky
            }));
            */

            /* And this got me furthest */
            match d.read_struct_field("car", 0, u64::decode) {
                Ok(x) => {
                    car = OCar::Id(x);
                },
                Err(_) => {
                    car = OCar::Car(try!(d.read_struct_field("car", 0, Car::decode)));
                }
            }

            Ok(Person {
                name: try!(d.read_struct_field("name", 0, Decodable::decode)),
                car: car
            })
        })
    }
}

fn main() {
    // Vector of both forms
    let input = "[{\"name\":\"pebbles\",\"car\":1},{\"name\":\"pebbles\",\"car\":{\"id\":1,\"color\":\"green\"}}]";

    let output: Vec<Person> = json::decode(&input).unwrap();

    println!("Debug: {:?}", output);
}
The above panics with an EOF, which is a sentinel value rustc-serialize uses on a few of its error enums. The full line is
thread '<main>' panicked at 'called `Result::unwrap()` on an `Err` value: EOF', src/libcore/result.rs:785
What's the right way to do this?
rustc-serialize, or at least its JSON decoder, doesn't support that use case. If you look at the implementation of read_struct_field (or any other method), you can see why: it uses a stack, but when it encounters an error, it doesn't bother to restore the stack to its original state, so when you try to decode the same thing differently, the decoder is operating on an inconsistent stack, eventually leading to an unexpected EOF value.
I would recommend you look into Serde instead. Deserializing in Serde is different: instead of telling the decoder what type you're expecting, and having no clear way to recover if a value is of the wrong type, Serde calls into a visitor that can handle any of the types that Serde supports in the way it wants. This means that Serde will call different methods on the visitor depending on the actual type of the value it parsed. For example, we can handle integers to return an OCar::Id and objects to return an OCar::Car.
Here's a full example:
#![feature(custom_derive, plugin)]
#![plugin(serde_macros)]

extern crate serde;
extern crate serde_json;

use serde::de::{Deserialize, Deserializer, Error, MapVisitor, Visitor};
use serde::de::value::MapVisitorDeserializer;

#[derive(Deserialize, Debug)]
struct Car {
    id: u64,
    color: String
}

#[derive(Debug)]
enum OCar {
    Id(u64),
    Car(Car),
}

struct OCarVisitor;

#[derive(Deserialize, Debug)]
struct Person {
    name: String,
    car: OCar,
}

impl Deserialize for OCar {
    fn deserialize<D>(deserializer: &mut D) -> Result<Self, D::Error> where D: Deserializer {
        deserializer.deserialize(OCarVisitor)
    }
}

impl Visitor for OCarVisitor {
    type Value = OCar;

    fn visit_u64<E>(&mut self, v: u64) -> Result<Self::Value, E> where E: Error {
        Ok(OCar::Id(v))
    }

    fn visit_map<V>(&mut self, visitor: V) -> Result<Self::Value, V::Error> where V: MapVisitor {
        Ok(OCar::Car(try!(Car::deserialize(&mut MapVisitorDeserializer::new(visitor)))))
    }
}

fn main() {
    // Vector of both forms
    let input = "[{\"name\":\"pebbles\",\"car\":1},{\"name\":\"pebbles\",\"car\":{\"id\":1,\"color\":\"green\"}}]";

    let output: Vec<Person> = serde_json::from_str(input).unwrap();

    println!("Debug: {:?}", output);
}
Output:
Debug: [Person { name: "pebbles", car: Id(1) }, Person { name: "pebbles", car: Car(Car { id: 1, color: "green" }) }]
Cargo.toml:
[dependencies]
serde = "0.7"
serde_json = "0.7"
serde_macros = "0.7"
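The dispatch idea behind this answer can also be seen without any crate. The toy model below is not Serde's API: Value, Visitor, and drive are hypothetical stand-ins invented for illustration. It only shows the control flow, in which the value that was actually parsed decides which visitor method runs, so an integer and an object can produce different OCar variants.

```rust
use std::collections::HashMap;

// Toy stand-in for an already-parsed JSON value (not Serde's type).
enum Value {
    U64(u64),
    Str(String),
    Map(HashMap<String, Value>),
}

#[derive(Debug, PartialEq)]
struct Car {
    id: u64,
    color: String,
}

#[derive(Debug, PartialEq)]
enum OCar {
    Id(u64),
    Car(Car),
}

// Toy visitor: one method per shape the input may take.
trait Visitor {
    type Output;
    fn visit_u64(&self, v: u64) -> Result<Self::Output, String>;
    fn visit_map(&self, m: &HashMap<String, Value>) -> Result<Self::Output, String>;
}

// The toy "deserializer" inspects the actual value and picks the method.
fn drive<V: Visitor>(value: &Value, visitor: &V) -> Result<V::Output, String> {
    match value {
        Value::U64(n) => visitor.visit_u64(*n),
        Value::Map(m) => visitor.visit_map(m),
        Value::Str(_) => Err("unexpected string".to_string()),
    }
}

struct OCarVisitor;

impl Visitor for OCarVisitor {
    type Output = OCar;

    // A bare integer is treated as the car's database ID.
    fn visit_u64(&self, v: u64) -> Result<OCar, String> {
        Ok(OCar::Id(v))
    }

    // An object is treated as a full Car and must carry both fields.
    fn visit_map(&self, m: &HashMap<String, Value>) -> Result<OCar, String> {
        match (m.get("id"), m.get("color")) {
            (Some(Value::U64(id)), Some(Value::Str(color))) => Ok(OCar::Car(Car {
                id: *id,
                color: color.clone(),
            })),
            _ => Err("expected `id` and `color` fields".to_string()),
        }
    }
}

fn main() {
    let by_id = Value::U64(1);

    let mut fields = HashMap::new();
    fields.insert("id".to_string(), Value::U64(1));
    fields.insert("color".to_string(), Value::Str("green".to_string()));
    let by_object = Value::Map(fields);

    println!("{:?}", drive(&by_id, &OCarVisitor));
    println!("{:?}", drive(&by_object, &OCarVisitor));
}
```

Because the input is inspected once and then routed, nothing has to be decoded twice, which is the failure mode described for rustc-serialize above.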