How to implement a generic serde_json::from_str [duplicate]

I am trying to write generic code that reads a JSON file into an object, but I seem to be missing something.
use serde::Deserialize;
use std::{error::Error, fs::File, io::Read};

pub fn from_file<'a, T>(filename: String) -> Result<T, Box<dyn Error>>
where
    T: Deserialize<'a>,
{
    let mut file = File::open(filename)?;
    let mut content: String = String::new();
    file.read_to_string(&mut content)?;
    Ok(serde_json::from_str::<T>(&content)?)
}
I am getting the following error:
error[E0597]: `content` does not live long enough
  --> src\util\file.rs:11:34
   |
4  | pub fn from_file<'a, T>(filename: String) -> Result<T, Box<dyn Error>>
   |                  -- lifetime `'a` defined here
...
11 |     Ok(serde_json::from_str::<T>(&content)?)
   |        --------------------------^^^^^^^^-
   |        |                         |
   |        |                         borrowed value does not live long enough
   |        argument requires that `content` is borrowed for `'a`
12 | }
   | - `content` dropped here while still borrowed
For more information about this error, try `rustc --explain E0597`.
error: could not compile `flavour` due to previous error
From this SO question I understand that I have to bound T, but I am uncertain which bounds to add, and how to infer the required bounds for problems like this. I read the signature of from_str and found that it requires only T: de::Deserialize<'a>.

The lifetime argument to Deserialize<'de> indicates the lifetime of the data you're deserializing from. For efficiency reasons, serde is allowed to borrow data straight from the structure you're reading from, so for example if your JSON file contained a string and the corresponding Rust structure contained a &str, then serde would borrow the string directly from the JSON body. That means that the JSON body, at least for Deserialize, has to live at least as long as the structure deserialized from it, and that length of time is captured by the 'de lifetime variable (called 'a in your example).
If you want to read data without borrowing anything from it, you're looking for DeserializeOwned. From the docs,
Trait serde::de::DeserializeOwned
A data structure that can be deserialized without borrowing any data from the deserializer.
This is primarily useful for trait bounds on functions. For example a from_str function may be able to deserialize a data structure that borrows from the input string, but a from_reader function may only deserialize owned data.
So if your bound is DeserializeOwned (which is really just for<'de> Deserialize<'de>), then you can read data from the string without borrowing anything from it. In fact, since you're reading from a file, you can use from_reader, which does so directly, and you don't even have to worry about the intermediate string at all.

Related

Handling different ways to represent null in serde

I'm writing a client library around a REST server: OctoPrint (a server that manages a 3D printer), to be precise.
Here's one of the types I'm working with:
#[derive(Serialize, Deserialize, Debug)]
pub struct JobInfo {
    /// The file that is the target of the current print job
    pub file: FileInfo,
    /// The estimated print time for the file, in seconds
    #[serde(rename = "estimatedPrintTime")]
    pub estimated_print_time: Option<f64>,
    /// The print time of the last print of the file, in seconds
    #[serde(rename = "lastPrintTime")]
    pub last_print_time: Option<f64>,
    /// Information regarding the estimated filament usage of the print job
    pub filament: Option<Filament>,
}
Pretty straightforward. Using the multiplicity property defined in the API specification, I determined which properties should be considered optional, which is why some of these fields are wrapped in Options.
Unfortunately the documentation lies a little bit about how multiplicity works here. Here's an example of what a response looks like when the printer is offline. For the sake of brevity, I will omit most of the body of this JSON message and keep just enough to get the point across:
{
    "job": {
        "file": null,
        "filepos": null,
        "printTime": null,
        ... etc
    },
    ...
    "state": "Offline"
}
Here's the type that I'm expecting for this response:
#[derive(Serialize, Deserialize, Debug)]
pub struct JobInformationResponse {
    /// Information regarding the target of the current print job
    job: JobInfo,
    /// Information regarding the progress of the current print job
    progress: ProgressInfo,
    /// A textual representation of the current state of the job
    /// or connection. e.g. "Operational", "Printing", "Pausing",
    /// "Paused", "Cancelling", "Error", "Offline", "Offline after error",
    /// "Opening serial connection" ... - please note that this list is not exhaustive!
    state: String,
    /// Any error message for the job or connection. Only set if there has been an error
    error: Option<String>,
}
Now I could just wrap all of these types in Options, but the previous example JSON still wouldn't parse: since job is technically an object, it's not going to deserialize as None despite the fact that each of its keys is null. I was wondering whether there is some attribute in serde that can handle this weird kind of serialization issue. I'd like to avoid wrapping every single property in an Option just to handle the edge case where the printer is offline.
Edit: I guess what I'm trying to say is that I would expect that if all props on a struct in the JSON representation were null, the object itself would deserialize as None.
If you're willing to redesign a little bit, you might be able to do something like this:
#[derive(Deserialize)]
#[serde(tag = "state")]
enum JobInformationResponse {
    Offline {},
    // If a field only appears on one type of response, use a struct variant
    Error { error: String },
    // If multiple response types share fields, use a newtype variant and a substruct
    Printing(JobInformationResponseOnline),
    Paused(JobInformationResponseOnline),
    // ...
}
#[derive(Deserialize)]
struct JobInformationResponseOnline {
    job: JobInfo,
    progress: ProgressInfo,
}
This works in the Offline case because by default, serde ignores properties that don't fit into any field of the struct/enum variant. So it won't check whether all entries of job are null.
If you have fields that appear in every message, you can further wrap JobInformationResponse (you should probably rename it):
#[derive(Deserialize)]
struct JobInformationResponseAll {
    field_appears_in_all_responses: FooBar,
    #[serde(flatten)]
    state: JobInformationResponse, // Field name doesn't matter to serde
}
But I'm not sure whether that works for you, since I certainly haven't seen enough of the spec or any real example messages.
To answer your question directly: No, there is no attribute in serde which would allow an all-null map to be de/serialized as None. You'd need two versions of the struct, one without options (to be used in your rust code) and one with (to be used in a custom deserialization function where you first deserialize to the with-options struct and then convert). Might not be worth the trouble.
And a side note: You might be happy to find #[serde(rename_all = "camelCase")] exists.

Get json value from byte using rust

I need to get the name from a base64 value. I tried the following, but I wasn't able to parse it and get the name property. Any idea how I can do it?
extern crate base64;
use serde_json::Value;
fn main() {
    let v = "eyJzdWIiOiIxMjM0NTY3ODkwIiwibmFtZSI6IkpvaG4gRG9lIiwiaWF0IjoxNTE2MjM5MDIyfQ";
    let bytes = base64::decode(v).unwrap();
    println!("{:?}", bytes);
    let v: Value = serde_json::from_slice(bytes);
}
The value represents JSON like:
{
    "sub": "1234567890",
    "name": "John Doe",
    "iat": 1516239022
}
https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=1df0f644a139f8d526a44af8abf78e8e
At the end I need to print "name": "John Doe"
this is the decoded value
Masklinn explains why you have an error, and you should read his answer.
But IMO the simplest and safest solution is to use serde's derive to decode into the desired structure:
use serde::Deserialize;
/// a struct into which to decode the thing
#[derive(Deserialize)]
struct Thing {
    name: String,
    // add the other fields if you need them
}
fn main() {
    let v = "eyJzdWIiOiIxMjM0NTY3ODkwIiwibmFtZSI6IkpvaG4gRG9lIiwiaWF0IjoxNTE2MjM5MDIyfQ";
    let bytes = base64::decode(v).unwrap(); // you should handle errors
    let thing: Thing = serde_json::from_slice(&bytes).unwrap();
    let name = thing.name;
    dbg!(name);
}
Denys has shown the usual way to use serde, and you should definitely apply their solution if that's an option (that is, if the document is not dynamic).
However:
https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=1df0f644a139f8d526a44af8abf78e8e
Have you considered at least following the indications of the compiler? The Rust compiler is both pretty expressive and very strict; if you give up at every compilation error without even trying to understand what's happening, you'll have a very hard time.
If you try to compile your snippet, the first thing it tells you is
error[E0308]: mismatched types
  --> src/main.rs:12:43
   |
12 |     let v: Value = serde_json::from_slice(bytes);
   |                                           ^^^^^
   |                                           |
   |                                           expected `&[u8]`, found struct `Vec`
   |                                           help: consider borrowing here: `&bytes`
   |
   = note: expected reference `&[u8]`
                 found struct `Vec<u8>`
and while it doesn't always work out perfectly, the compiler's suggestion is spot-on: base64 has to allocate space for the return value so it yields a Vec, but serde_json doesn't really care where the data comes from so it takes a slice. Just referencing (applying the & operator) the vec allows rustc to coerce it to a slice.
The second suggestion is:
error[E0308]: mismatched types
  --> src/main.rs:12:20
   |
12 |     let v: Value = serde_json::from_slice(bytes);
   |            -----   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ expected enum `Value`, found enum `Result`
   |            |
   |            expected due to this
   |
   = note: expected enum `Value`
              found enum `Result<_, serde_json::Error>`
which doesn't provide a solution but a simple unwrap would do for testing. However you really want to read up on Rust error handling as Result is very much the normal way to signal fallibility / errors, and thus like compiler errors you will also encounter it a lot.
Anyway that yields a proper serde_json::value::Value, which you can manipulate the normal way e.g.
v.get("name").and_then(Value::as_str)
will return an Option<&str> of value None if the key is missing or not mapping to a string, and Some(s) if the key is present and mapping to a string: https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=5025d399644694b4b651c4ff1b9125a1

Error while reading JSON file in Rust using BufReader: the trait bound Result: std::io::Read is not satisfied [duplicate]

I am trying to read JSON from a file:
use std::error::Error;
use std::fs::File;
use std::io::BufReader;
use std::path::Path;
impl Params {
    pub fn new(raw_opt2: opt::Opt, path: String) -> Self {
        // Open the file in read-only mode with buffer.
        let file = File::open(path);
        let reader = BufReader::new(file);
        Self {
            opt_raw: raw_opt2,
            module_settings: serde_json::from_reader(reader).unwrap(),
        }
    }
}
But I'm getting an error:
error[E0277]: the trait bound `std::result::Result<std::fs::File, std::io::Error>: std::io::Read` is not satisfied
  --> src\params.rs:20:37
   |
20 |         let reader = BufReader::new(file);
   |                                     ^^^^ the trait `std::io::Read` is not implemented for `std::result::Result<std::fs::File, std::io::Error>`
   |
   = note: required by `std::io::BufReader::<R>::new`
The File::open operation returns a Result - signifying that the open operation could succeed or fail.
This is one standout feature of Rust compared to many other languages: it tries to force you to deal with errors. Compare that with:
C - just returns an int
Python - exceptions (try: ... except: ...)
C++ - exceptions (needs a libstdc++ runtime)
As you can expect, this leads to more programming time at the start, but overall far fewer hassles and higher-quality programs.
After the line let file = File::open(path); you have to deal with the result.
If you don't care, and want to crash the program if the file can't be opened:
let file = File::open(path).unwrap();
To make a better error message in the crash:
let file = File::open(path).expect("Unable to open file");
To do it properly - read the Rust book
Most likely, you'll want to return a Result yourself from your function. Then you could rewrite it something like this (to use a match):
impl Params {
    pub fn new(raw_opt2: opt::Opt, path: String) -> Result<Self, std::io::Error> {
        // Open the file in read-only mode with buffer.
        match File::open(path) {
            Ok(file) => {
                let reader = BufReader::new(file);
                Ok(Self {
                    opt_raw: raw_opt2,
                    module_settings: serde_json::from_reader(reader).unwrap(),
                })
            }
            Err(err) => Err(err),
        }
    }
}
.. or a more functional way:
impl Params {
    pub fn new(raw_opt2: opt::Opt, path: String) -> Result<Self, std::io::Error> {
        // Open the file in read-only mode with buffer.
        File::open(path).map(|file| {
            let reader = BufReader::new(file);
            Self {
                opt_raw: raw_opt2,
                module_settings: serde_json::from_reader(reader).unwrap(),
            }
        })
    }
}
Update:
Now I generally use these two libraries for error management:
thiserror: when I'm writing libraries and creating my own error types
anyhow: when writing applications, scripts, or tests, to easily handle all the library errors
.. and of course I didn't mention the ? operator, which makes working with results and options so much easier.
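For completeness, here is what the ? operator looks like in a stand-alone sketch that uses only the standard library. first_line is a hypothetical helper, not part of the Params type from the question; the same pattern applies to any function that returns a Result.

```rust
use std::fs::File;
use std::io::{self, BufRead, BufReader};

/// Returns the first line of a file, propagating any I/O error to the caller.
fn first_line(path: &str) -> io::Result<String> {
    // `?` yields the Ok value, or returns the Err from this function early.
    let file = File::open(path)?;
    let mut reader = BufReader::new(file);
    let mut line = String::new();
    reader.read_line(&mut line)?;
    // Drop the trailing newline, if any.
    Ok(line.trim_end().to_string())
}
```

Each ? replaces one arm of the match shown earlier: on Err the function returns immediately, and on Ok execution continues with the unwrapped value.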

Borrowing error while trying to read JSON from stdin [duplicate]

I am trying to parse JSON line-by-line from stdin using the pom library.
I've stolen the json implementation provided on the homepage (and have omitted that code below; it's not relevant), and am getting a borrow error from the following code:
fn main() {
    for line in io::stdin().lock().lines() {
        let line2 = line.unwrap().as_bytes();
        let _value = json().parse(line2).unwrap();
    }
}
The error:
error[E0716]: temporary value dropped while borrowed
  --> src/main.rs:73:23
   |
73 |         let tmpline = line.unwrap().as_bytes();
   |                       ^^^^^^^^^^^^^------------ temporary value is freed at the end of this statement
   |                       |
   |                       creates a temporary which is freed while still in use
   |                       argument requires that borrow lasts for `'static`
.parse in the pom library has the type:
pub fn parse(&self, input: &'a [I]) -> Result<O>
.as_bytes() has the type:
pub fn as_bytes(&self) -> &[u8]
Obviously, I'm borrowing incorrectly here, but I'm not entirely sure how to fix this.
The problem here is that you're using a reference to a value whose lifetime is shorter than you need, and lies in this line: line.unwrap().as_bytes().
as_bytes() returns a reference to the underlying slice of u8s. Now, the String that owns that slice, returned by unwrap(), is a temporary that is dropped at the end of the statement.
In Rust, you can re-declare variables with the same name in the current scope and they will shadow the one(s) previously defined. To fix the problem, store the value somewhere, and then get a reference to it. Like so:
fn main() {
    for line in io::stdin().lock().lines() {
        let line = line.unwrap();
        let bytes = line.as_bytes();
        let _value = json().parse(bytes).unwrap();
    }
}
Now the value returned by as_bytes() can point to something that lives as long as the current scope. Previously, instead, you had this:
fn main() {
    for line in io::stdin().lock().lines() {
        let line2 = line.unwrap().as_bytes(); // <-- the value returned by unwrap dies here
        let _value = json().parse(line2).unwrap(); // <-- line2 would be dangling here
    }
}
line.unwrap() returns a String, which you then borrow from with as_bytes(). Since you never bind the String itself, only the borrowed byte slice, the String is dropped at the end of the statement, and the borrowed byte slice is invalidated.
Bind the temporary String to a variable with let s = line.unwrap(), then pass s.as_bytes() to json().parse.
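The same binding pattern can be tried without pom. In this sketch, parse_len is a hypothetical stand-in for json().parse (it only needs some function that borrows a byte slice), and line_lengths is an illustrative wrapper around the loop.

```rust
use std::io::BufRead;

/// Stand-in for a parser that borrows its input (hypothetical helper).
fn parse_len(input: &[u8]) -> usize {
    input.len()
}

/// Returns the byte length of every line read from `reader`.
fn line_lengths<R: BufRead>(reader: R) -> Vec<usize> {
    let mut lengths = Vec::new();
    for line in reader.lines() {
        // Bind the owned String first so it lives until the end of the loop body.
        let s = line.unwrap();
        // The slice now borrows from `s`, not from a temporary.
        lengths.push(parse_len(s.as_bytes()));
    }
    lengths
}
```

Because a byte slice implements BufRead, the function can be exercised with `line_lengths("ab\ncde\n".as_bytes())` as well as with `io::stdin().lock()`.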

How can I lazily read multiple JSON values from a file/stream in Rust?

I'd like to read multiple JSON objects from a file/reader in Rust, one at a time. Unfortunately serde_json::from_reader(...) just reads until end-of-file; there doesn't seem to be any way to use it to read a single object or to lazily iterate over the objects.
Is there any way to do this? Using serde_json would be ideal, but if there's a different library I'd be willing to use that instead.
At the moment I'm putting each object on a separate line and parsing them individually, but I would really prefer not to need to do this.
Example Use
main.rs
use serde_json;
fn main() -> Result<(), Box<dyn std::error::Error>> {
    let stdin = std::io::stdin();
    let stdin = stdin.lock();
    for item in serde_json::iter_from_reader(stdin) {
        println!("Got {:?}", item);
    }
    Ok(())
}
in.txt
{"foo": ["bar", "baz"]} 1 2 [] 4 5 6
example session
Got Object({"foo": Array([String("bar"), String("baz")])})
Got Number(1)
Got Number(2)
Got Array([])
Got Number(4)
Got Number(5)
Got Number(6)
This was a pain when I wanted to do it in Python, but fortunately in Rust this is a directly-supported feature of the de-facto-standard serde_json crate! It isn't exposed as a single convenience function, but we just need to create a serde_json::Deserializer reading from our file/reader, then use its .into_iter() method to get a StreamDeserializer iterator yielding Results containing serde_json::Value JSON values.
use serde_json; // 1.0.39
fn main() -> Result<(), Box<dyn std::error::Error>> {
    let stdin = std::io::stdin();
    let stdin = stdin.lock();
    let deserializer = serde_json::Deserializer::from_reader(stdin);
    let iterator = deserializer.into_iter::<serde_json::Value>();
    for item in iterator {
        println!("Got {:?}", item?);
    }
    Ok(())
}
One thing to be aware of: if a syntax error is encountered, the iterator will start to produce an infinite sequence of error results and never move on. You need to make sure you handle the errors inside of the loop, or the loop will never end. In the snippet above, we do this by using the ? question mark operator to break the loop and return the first serde_json::Result::Err from our function.