Borrowing error while trying to read JSON from stdin [duplicate]

This question already has answers here:
"borrowed value does not live long enough" seems to blame the wrong thing
(2 answers)
Cannot move out of value which is behind a shared reference when unwrapping
(2 answers)
Closed 3 years ago.
I am trying to parse JSON line-by-line from stdin using the pom library.
I've stolen the json implementation provided on the homepage (and have omitted that code below; it's not relevant), and am getting a borrow error from the following code:
fn main() {
    for line in io::stdin().lock().lines() {
        let line2 = line.unwrap().as_bytes();
        let _value = json().parse(line2).unwrap();
    }
}
The error:
error[E0716]: temporary value dropped while borrowed
  --> src/main.rs:73:23
   |
73 |         let tmpline = line.unwrap().as_bytes();
   |                       ^^^^^^^^^^^^^------------ temporary value is freed at the end of this statement
   |                       |
   |                       creates a temporary which is freed while still in use
   |                       argument requires that borrow lasts for `'static`
.parse in the pom library has the type:
pub fn parse(&self, input: &'a [I]) -> Result<O>
.as_bytes() has the type:
pub fn as_bytes(&self) -> &[u8]
Obviously, I'm borrowing incorrectly here, but I'm not entirely sure how to fix this.

The problem is that you're taking a reference to a value that doesn't live as long as you need it to. It lies in this expression: line.unwrap().as_bytes().
as_bytes() returns a reference to the String's underlying slice of u8s. The String itself, returned by unwrap(), is a temporary that is dropped at the end of the statement, so the reference would outlive the data it points to.
In Rust, you can re-declare a variable with the same name in the current scope, and the new binding shadows the previous one. To fix the problem, store the String in a variable first, and then take a reference to it. Like so:
fn main() {
    for line in io::stdin().lock().lines() {
        let line = line.unwrap();
        let bytes = line.as_bytes();
        let _value = json().parse(bytes).unwrap();
    }
}
Now the value returned by as_bytes() can point to something that lives as long as the current scope. Previously, instead, you had this:
fn main() {
    for line in io::stdin().lock().lines() {
        let line2 = line.unwrap().as_bytes(); // <-- the value returned by unwrap dies here
        let _value = json().parse(line2).unwrap(); // <-- line2 would be dangling here
    }
}

line.unwrap() returns a String, which you then borrow from with as_bytes(). Since you never bind the String itself, only the borrowed byte slice, the String is dropped at the end of the statement, and the borrowed byte slice is invalidated.
Bind the temporary String to a variable with let s = line.unwrap(), then pass s.as_bytes() to json().parse.
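A minimal std-only sketch of the fix both answers describe. The helper name line_lengths and the generic BufRead parameter are assumptions made so the snippet runs without the pom parser; taking the byte length stands in for the json().parse(...) call.

```rust
use std::io::BufRead;

/// Hypothetical helper: returns the byte length of every line read from `reader`.
/// Measuring the slice stands in for handing it to a parser.
fn line_lengths<R: BufRead>(reader: R) -> Vec<usize> {
    let mut lengths = Vec::new();
    for line in reader.lines() {
        // Bind the owned String first so it lives until the end of this iteration.
        let line = line.unwrap();
        // The borrow now points into `line`, which outlives `bytes`.
        let bytes: &[u8] = line.as_bytes();
        lengths.push(bytes.len());
    }
    lengths
}
```

With stdin as in the question, the call would presumably look like line_lengths(std::io::stdin().lock()).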

Related

How to implement a generic serde_json::from_str [duplicate]

This question already has answers here:
How to fix lifetime error when function returns a serde Deserialize type?
(2 answers)
Closed 5 months ago.
I am trying to write generic code that reads a JSON file into an object, but I seem to be missing something.
use serde::Deserialize;
use std::{error::Error, fs::File, io::Read};

pub fn from_file<'a, T>(filename: String) -> Result<T, Box<dyn Error>>
where
    T: Deserialize<'a>,
{
    let mut file = File::open(filename)?;
    let mut content: String = String::new();
    file.read_to_string(&mut content)?;
    Ok(serde_json::from_str::<T>(&content)?)
}
I am getting the following error
error[E0597]: `content` does not live long enough
  --> src\util\file.rs:11:34
   |
4  | pub fn from_file<'a, T>(filename: String) -> Result<T, Box<dyn Error>>
   |                  -- lifetime `'a` defined here
...
11 |     Ok(serde_json::from_str::<T>(&content)?)
   |        --------------------------^^^^^^^^-
   |        |                         |
   |        |                         borrowed value does not live long enough
   |        argument requires that `content` is borrowed for `'a`
12 | }
   | - `content` dropped here while still borrowed
For more information about this error, try `rustc --explain E0597`.
error: could not compile `flavour` due to previous error
What I understand, thanks to this SO question, is that I have to bound T, but I am uncertain which bounds to add
and how to infer the required bounds for such problems. I tried to read from_str and found that it only requires T: de::Deserialize<'a>.
The lifetime argument to Deserialize<'de> indicates the lifetime of the data you're deserializing from. For efficiency reasons, serde is allowed to borrow data straight from the structure you're reading from, so for example if your JSON file contained a string and the corresponding Rust structure contained a &str, then serde would borrow the string directly from the JSON body. That means that the JSON body, at least for Deserialize, has to live at least as long as the structure deserialized from it, and that length of time is captured by the 'de lifetime variable (called 'a in your example).
If you want to read data without borrowing anything from it, you're looking for DeserializeOwned. From the docs,
Trait serde::de::DeserializeOwned
A data structure that can be deserialized without borrowing any data from the deserializer.
This is primarily useful for trait bounds on functions. For example a from_str function may be able to deserialize a data structure that borrows from the input string, but a from_reader function may only deserialize owned data.
So if your bound is DeserializeOwned (which is really just for<'de> Deserialize<'de>), then you can read data from the string without borrowing anything from it. In fact, since you're reading from a file, you can use from_reader, which does so directly, and you don't even have to worry about the intermediate string at all.
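To see why the for<'de> form of the bound makes the difference, here is a std-only sketch. The trait Parse<'de>, the type Word and the function parse_owned are hypothetical stand-ins for Deserialize<'de>, a borrowing struct and from_file; none of them are serde items.

```rust
/// Hypothetical stand-in for serde's `Deserialize<'de>`:
/// builds a value out of input that lives for `'de`.
trait Parse<'de>: Sized {
    fn parse_from(input: &'de str) -> Option<Self>;
}

// An owned value borrows nothing, so it implements the trait for every `'de`.
impl<'de> Parse<'de> for u32 {
    fn parse_from(input: &'de str) -> Option<Self> {
        input.trim().parse().ok()
    }
}

/// A value that keeps a slice of the input, like a struct with a `&str` field.
#[derive(Debug, PartialEq)]
struct Word<'a>(&'a str);

// `Word<'de>` only implements `Parse<'de>` for its own lifetime.
impl<'de> Parse<'de> for Word<'de> {
    fn parse_from(input: &'de str) -> Option<Self> {
        input.split_whitespace().next().map(Word)
    }
}

/// Analogue of a `DeserializeOwned` bound: because `T` can be parsed from
/// input of any lifetime, it can be parsed from the local `content` and
/// still be returned after `content` is dropped.
fn parse_owned<T>(content: String) -> Option<T>
where
    T: for<'de> Parse<'de>,
{
    T::parse_from(content.as_str())
}
```

Here parse_owned::<u32> compiles, while parse_owned::<Word> is rejected because no Word<'a> satisfies the bound for every lifetime, which mirrors what the original from_file ran into.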

Get json value from byte using rust

I need to get the name from a base64-encoded value. I tried the following, but I wasn't able to parse it and get the name property. Any idea how I can do it?
extern crate base64;
use serde_json::Value;
fn main() {
    let v = "eyJzdWIiOiIxMjM0NTY3ODkwIiwibmFtZSI6IkpvaG4gRG9lIiwiaWF0IjoxNTE2MjM5MDIyfQ";
    let bytes = base64::decode(v).unwrap();
    println!("{:?}", bytes);
    let v: Value = serde_json::from_slice(bytes);
}
The value represents JSON like
{
  "sub": "1234567890",
  "name": "John Doe",
  "iat": 1516239022
}
https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=1df0f644a139f8d526a44af8abf78e8e
At the end I need to print "name": "John Doe"
this is the decoded value
Masklinn explains why you have an error, and you should read his answer.
But IMO the simplest and safest solution is to use serde's derive to decode into the desired structure:
use serde::Deserialize;

/// a struct into which to decode the thing
#[derive(Deserialize)]
struct Thing {
    name: String,
    // add the other fields if you need them
}

fn main() {
    let v = "eyJzdWIiOiIxMjM0NTY3ODkwIiwibmFtZSI6IkpvaG4gRG9lIiwiaWF0IjoxNTE2MjM5MDIyfQ";
    let bytes = base64::decode(v).unwrap(); // you should handle errors
    let thing: Thing = serde_json::from_slice(&bytes).unwrap();
    let name = thing.name;
    dbg!(name);
}
Denys has shown the usual way to use serde, and you should definitely apply their solution if that's an option (that is, if the document's structure is not dynamic).
However:
https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=1df0f644a139f8d526a44af8abf78e8e
Have you considered at least following the indications of the compiler? The Rust compiler is both pretty expressive and very strict; if you give up at every compilation error without even trying to understand what's happening, you'll have a very hard time.
If you try to compile your snippet, the first thing it tells you is
error[E0308]: mismatched types
  --> src/main.rs:12:43
   |
12 |     let v: Value = serde_json::from_slice(bytes);
   |                                           ^^^^^
   |                                           |
   |                                           expected `&[u8]`, found struct `Vec`
   |                                           help: consider borrowing here: `&bytes`
   |
   = note: expected reference `&[u8]`
                 found struct `Vec<u8>`
and while it doesn't always work out perfectly, the compiler's suggestion is spot-on: base64 has to allocate space for the return value so it yields a Vec, but serde_json doesn't really care where the data comes from so it takes a slice. Just referencing (applying the & operator) the vec allows rustc to coerce it to a slice.
The second suggestion is:
error[E0308]: mismatched types
  --> src/main.rs:12:20
   |
12 |     let v: Value = serde_json::from_slice(bytes);
   |            -----   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ expected enum `Value`, found enum `Result`
   |            |
   |            expected due to this
   |
   = note: expected enum `Value`
              found enum `Result<_, serde_json::Error>`
which doesn't provide a solution but a simple unwrap would do for testing. However you really want to read up on Rust error handling as Result is very much the normal way to signal fallibility / errors, and thus like compiler errors you will also encounter it a lot.
Anyway that yields a proper serde_json::value::Value, which you can manipulate the normal way e.g.
v.get("name").and_then(Value::as_str)
will return an Option<&str> of value None if the key is missing or not mapping to a string, and Some(s) if the key is present and mapping to a string: https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=5025d399644694b4b651c4ff1b9125a1
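The two compiler messages discussed in this answer (borrow the Vec to get a slice, then deal with the Result) can be reproduced with the standard library alone. In this sketch std::str::from_utf8 stands in for serde_json::from_slice, since it also takes a &[u8] and returns a Result; the helper name char_count is hypothetical.

```rust
use std::str::Utf8Error;

/// Hypothetical helper: interprets owned bytes as UTF-8 and counts the characters.
fn char_count(bytes: Vec<u8>) -> Result<usize, Utf8Error> {
    // `from_utf8` expects `&[u8]`; borrowing the Vec lets it coerce to a slice.
    // The call is fallible, so `?` forwards the error instead of unwrapping.
    let text: &str = std::str::from_utf8(&bytes)?;
    Ok(text.chars().count())
}
```

Passing bytes instead of &bytes to from_utf8 produces the same kind of "expected `&[u8]`, found `Vec`" mismatch as in the question.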

How can I lazily read multiple JSON values from a file/stream in Rust?

I'd like to read multiple JSON objects from a file/reader in Rust, one at a time. Unfortunately serde_json::from_reader(...) just reads until end-of-file; there doesn't seem to be any way to use it to read a single object or to lazily iterate over the objects.
Is there any way to do this? Using serde_json would be ideal, but if there's a different library I'd be willing to use that instead.
At the moment I'm putting each object on a separate line and parsing them individually, but I would really prefer not to need to do this.
Example Use
main.rs
use serde_json;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let stdin = std::io::stdin();
    let stdin = stdin.lock();
    for item in serde_json::iter_from_reader(stdin) {
        println!("Got {:?}", item);
    }
    Ok(())
}
in.txt
{"foo": ["bar", "baz"]} 1 2 [] 4 5 6
example session
Got Object({"foo": Array([String("bar"), String("baz")])})
Got Number(1)
Got Number(2)
Got Array([])
Got Number(4)
Got Number(5)
Got Number(6)
This was a pain when I wanted to do it in Python, but fortunately in Rust this is a directly-supported feature of the de-facto-standard serde_json crate! It isn't exposed as a single convenience function, but we just need to create a serde_json::Deserializer reading from our file/reader, then use its .into_iter() method to get a StreamDeserializer iterator yielding Results containing serde_json::Value JSON values.
use serde_json; // 1.0.39

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let stdin = std::io::stdin();
    let stdin = stdin.lock();
    let deserializer = serde_json::Deserializer::from_reader(stdin);
    let iterator = deserializer.into_iter::<serde_json::Value>();
    for item in iterator {
        println!("Got {:?}", item?);
    }
    Ok(())
}
One thing to be aware of: if a syntax error is encountered, the iterator will start to produce an infinite sequence of error results and never move on. You need to make sure you handle the errors inside of the loop, or the loop will never end. In the snippet above, we do this by using the ? question mark operator to break the loop and return the first serde_json::Result::Err from our function.
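The error-handling pattern can be illustrated without serde_json. In this std-only sketch a lazy iterator of Result values stands in for the StreamDeserializer (an assumption made purely for illustration, as is the name sum_numbers), and the ? inside the loop ends iteration at the first Err.

```rust
use std::num::ParseIntError;

/// Hypothetical helper: sums whitespace-separated integers, stopping at the first bad token.
fn sum_numbers(input: &str) -> Result<i64, ParseIntError> {
    // Lazily yields one Result per token; nothing is parsed until the loop asks for it.
    let items = input.split_whitespace().map(|token| token.parse::<i64>());
    let mut total = 0;
    for item in items {
        // `?` returns the first error to the caller, which also ends the loop.
        total += item?;
    }
    Ok(total)
}
```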

Wrong number of type arguments when returning a HashMap from a function [duplicate]

This question already has answers here:
How do I fix "wrong number of type arguments" while trying to implement a method?
(1 answer)
How do I specify that a function takes a HashMap?
(1 answer)
Return local String as a slice (&str)
(7 answers)
Closed 5 years ago.
I've built a basic function which loads application settings from a configuration file. These settings are parsed and inserted into a HashMap so I can reference them later on. I cannot seem to find a correct way to return my HashMap and the data it contains:
use std::collections::HashMap;
use std::fs::File;
use std::io::BufReader;

fn load_config_files(settings: &str) -> HashMap {
    println!("Loading configuration file...");
    let mut settings_data = HashMap::new();
    // function returns a map with the application settings...
    let handle = File::open(settings).unwrap();
    for line in BufReader::new(handle).lines() {
        let mut line_data = &line.unwrap();
        // get the first character; if it's a # then the line is ignored... e.g. comments
        let first_character = &line_data[0..1];
        if first_character != "#" {
            // sort into a vector
            let mut settings_info_split = line_data.split("=");
            let settings_info = settings_info_split.collect::<Vec<&str>>();
            // need to add the values to "settings_data"
            settings_data.insert(settings_info[0], settings_info[1]);
        }
    }
    return settings_data;
}
This doesn't work:
error[E0243]: wrong number of type arguments: expected at least 2, found 0
 --> src/main.rs:5:41
  |
5 | fn load_config_files(settings: &str) -> HashMap {
  |                                         ^^^^^^^ expected at least 2 type arguments
I've tried including the parameters it's after, but I cannot find the right ones... E.g. it did not like &str, it did not like String.
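A minimal sketch of one way this could be made to compile, in the spirit of the linked duplicates. HashMap needs both its key and value types spelled out, and the entries have to be owned Strings, because &str slices taken from a line would point into a String that is dropped at the end of each loop iteration. The name parse_settings and the generic BufRead parameter (used instead of opening a File) are assumptions for illustration.

```rust
use std::collections::HashMap;
use std::io::BufRead;

/// Sketch: parses `key=value` lines into owned strings, skipping `#` comments.
fn parse_settings<R: BufRead>(reader: R) -> HashMap<String, String> {
    let mut settings = HashMap::new();
    for line in reader.lines() {
        let line = line.unwrap(); // a real program should handle I/O errors
        if line.starts_with('#') {
            continue;
        }
        // `split_once` returns None for lines without '=', which are skipped here.
        if let Some((key, value)) = line.split_once('=') {
            // Copy the slices into Strings so the map owns its data.
            settings.insert(key.to_string(), value.to_string());
        }
    }
    settings
}
```

For a file, the reader would be BufReader::new(File::open(path)?), with std::io::BufRead in scope so that lines() resolves.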

Borrowed value does not live long enough while writing an HTML parser

I am very new to Rust and am trying to build an HTML parser.
I first tried to parse the string and put the tags in a HashMap<&str, i32>.
Then I figured out that I have to take care of letter case,
so I added tag.to_lowercase(), which creates a String. That is where I got lost.
Below is my code snippet.
fn html_parser<'a>(html:&'a str, mut tags:HashMap<&'a str, i32>) -> HashMap<&'a str, i32>{
    let re = Regex::new("<[:alpha:]+?[\\d]*[:space:]*>+").unwrap();
    let mut count;
    for caps in re.captures_iter(html) {
        if !caps.at(0).is_none(){
            let tag = &*(caps.at(0).unwrap().trim_matches('<').trim_matches('>').to_lowercase());
            count = 1;
            if tags.contains_key(tag){
                count = *tags.get_mut(tag).unwrap() + 1;
            }
            tags.insert(tag,count);
        }
    }
    tags
}
which throws this error,
src\main.rs:58:27: 58:97 error: borrowed value does not live long enough
src\main.rs:58 let tag:&'a str = &*(caps.at(0).unwrap().trim_matches('<').trim_matches('>').to_lowercase());
^~~~~~~~~~~~~~~~~~~
src\main.rs:49:90: 80:2 note: reference must be valid for the lifetime 'a as defined on the block at 49:89...
src\main.rs:49 fn html_parser<'a>(html:&'a str, mut tags:HashMap<&'a str, i32>)-> HashMap<&'a str, i32>{
src\main.rs:58:99: 68:6 note: ...but borrowed value is only valid for the block suffix following statement 0 at 58:98
src\main.rs:58 let tag:&'a str = &*(caps.at(0).unwrap().trim_matches('<').trim_matches('>').to_lowercase());
src\main.rs:63
...
error: aborting due to previous error
I read about lifetimes in Rust but still can not understand this situation.
If anyone has a good HTML tag regex, please recommend so that I can use it.
To understand your problem it is useful to look at the function signature:
fn html_parser<'a>(html: &'a str, mut tags: HashMap<&'a str, i32>) -> HashMap<&'a str, i32>
From this signature we can see, roughly, that both accepted and returned hash maps may only be keyed by subslices of html. However, in your code you are attempting to insert a string slice completely unrelated (in lifetime sense) to html:
let tag = &*(caps.at(0).unwrap().trim_matches('<').trim_matches('>').to_lowercase());
The first problem here (your particular error is about exactly this problem) is that you're attempting to take a slice out of a temporary String returned by to_lowercase(). This temporary string is only alive during this statement, so when the statement ends, the string is deallocated, and its references would become dangling if this was not prohibited by the compiler. So, the correct way to write this assignment is as follows:
let tag = caps.at(0).unwrap().trim_matches('<').trim_matches('>').to_lowercase();
let tag = &*tag;
(or you can just keep the first tag, the String, and convert it to a slice where it is used)
However, your code is not going to work even after this change. The to_lowercase() method allocates a new String which is unrelated to html in terms of lifetime. Therefore, any slice you take out of it will have a lifetime necessarily shorter than 'a. Hence it is not possible to insert such a slice as a key into the map, because the data it points to may not be valid after this function returns (and in this particular case, it will be invalid).
It is hard to tell what is the best way to fix this problem because it may depend on the overall architecture of your program, but the simplest way would be to create a new HashMap<String, i32> inside the function:
fn html_parser(html: &str, tags: HashMap<&str, i32>) -> HashMap<String, i32> {
    // copy the incoming counts into a map that owns its keys
    let mut result: HashMap<String, i32> = tags.iter().map(|(k, v)| (k.to_string(), *v)).collect();
    let re = Regex::new("<[:alpha:]+?[\\d]*[:space:]*>+").unwrap();
    for caps in re.captures_iter(html) {
        if let Some(cap) = caps.at(0) {
            let tag = cap
                .trim_matches('<')
                .trim_matches('>')
                .to_lowercase();
            // get() yields Option<&i32>; copy the value out before defaulting to 0
            let count = result.get(&tag).cloned().unwrap_or(0) + 1;
            result.insert(tag, count);
        }
    }
    result
}
I've also changed the code for it to be more idiomatic (if let instead of if something.is_none(), unwrap_or() instead of mutable local variables, etc.). This is a more or less direct translation of your original code.
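The owned-key counting can also be written with the HashMap entry API. This std-only sketch assumes the tag names have already been extracted (the regex step is left out), and count_tags is a hypothetical helper name.

```rust
use std::collections::HashMap;

/// Hypothetical helper: counts tag names case-insensitively, with the map owning its keys.
fn count_tags<'a, I>(tags: I) -> HashMap<String, i32>
where
    I: IntoIterator<Item = &'a str>,
{
    let mut counts: HashMap<String, i32> = HashMap::new();
    for tag in tags {
        // to_lowercase() allocates a new String; moving it into the map keeps it alive
        // after this function returns, which a borrowed &str key could not do.
        *counts.entry(tag.to_lowercase()).or_insert(0) += 1;
    }
    counts
}
```

The entry call looks the key up once and either inserts 0 or hands back the existing count, so no separate contains_key or get is needed.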
As for parsing HTML with regexes, I just cannot resist providing a link to this answer. Seriously consider using a proper HTML parser instead of relying on regexes.