How to write a Rust function that takes an iterator?

I'd like to write a function that accepts an iterator and returns the results of some operations on it. Specifically, I'm trying to iterate over the values of a HashMap:
use std::collections::HashMap;
fn find_min<'a>(vals: Iterator<Item=&'a u32>) -> Option<&'a u32> {
vals.min()
}
fn main() {
let mut map = HashMap::new();
map.insert("zero", 0u32);
map.insert("one", 1u32);
println!("Min value {:?}", find_min(map.values()));
}
But alas:
error: the `min` method cannot be invoked on a trait object
--> src/main.rs:4:10
|
4 | vals.min()
| ^^^
error[E0277]: the trait bound `std::iter::Iterator<Item=&'a u32> + 'static: std::marker::Sized` is not satisfied
--> src/main.rs:3:17
|
3 | fn find_min<'a>(vals: Iterator<Item = &'a u32>) -> Option<&'a u32> {
| ^^^^ `std::iter::Iterator<Item=&'a u32> + 'static` does not have a constant size known at compile-time
|
= help: the trait `std::marker::Sized` is not implemented for `std::iter::Iterator<Item=&'a u32> + 'static`
= note: all local variables must have a statically known size
error[E0308]: mismatched types
--> src/main.rs:11:41
|
11 | println!("Min value {:?}", find_min(map.values()));
| ^^^^^^^^^^^^ expected trait std::iter::Iterator, found struct `std::collections::hash_map::Values`
|
= note: expected type `std::iter::Iterator<Item=&u32> + 'static`
found type `std::collections::hash_map::Values<'_, &str, u32>`
I get the same error if I try to pass by reference; if I use a Box, I get lifetime errors.

You want to use generics here:
fn find_min<'a, I>(vals: I) -> Option<&'a u32>
where
I: Iterator<Item = &'a u32>,
{
vals.min()
}
Traits can be used in two ways: as bounds on type parameters and as trait objects. The book The Rust Programming Language has a chapter on traits and a chapter on trait objects that explain these two use cases.
Additionally, you often want to take something that implements IntoIterator as this can make the code calling your function nicer:
fn find_min<'a, I>(vals: I) -> Option<&'a u32>
where
I: IntoIterator<Item = &'a u32>,
{
vals.into_iter().min()
}
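As a rough sketch of why the IntoIterator bound is nicer for callers: the same function then accepts both an iterator and a plain reference to a collection. The `list` and `empty` variables below are made up for illustration.

```rust
use std::collections::HashMap;

// Accepts anything that can be turned into an iterator of `&u32`.
fn find_min<'a, I>(vals: I) -> Option<&'a u32>
where
    I: IntoIterator<Item = &'a u32>,
{
    vals.into_iter().min()
}

fn main() {
    let mut map = HashMap::new();
    map.insert("zero", 0u32);
    map.insert("one", 1u32);

    // Hypothetical extra inputs, not part of the original question.
    let list = vec![7u32, 3, 9];
    let empty: Vec<u32> = Vec::new();

    // Every iterator is itself `IntoIterator`, so the original call still works.
    println!("{:?}", find_min(map.values())); // Some(0)
    // A reference to a collection can be passed directly, without `.iter()`.
    println!("{:?}", find_min(&list)); // Some(3)
    println!("{:?}", find_min(&empty)); // None
}
```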

Since Rust 1.26, impl Trait is available. Here is a more compact version using impl Trait in argument position:
use std::collections::HashMap;
fn find_min<'a>(vals: impl Iterator<Item = &'a u32>) -> Option<&'a u32> {
vals.min()
}
fn main() {
let mut map = HashMap::new();
map.insert("zero", 0u32);
map.insert("one", 1u32);
println!("Min value {:?}", find_min(map.values()));
}

This behaviour is a little unintuitive for those with a Python background rather than, say, a C++ background, so let me clarify a little.
In Rust, values are conceptually stored inside the name that binds them. Thus, for a type Foo that implements Copy, if you write
let mut x = Foo { t: 10 };
let mut y = x;
x.t = 999;
y.t will still be 10.
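A minimal runnable version of that snippet might look as follows. `Foo` is a hypothetical type, and deriving `Copy` is an assumption that the snippet needs: without it, `let y = x` would move `x` and the later assignment to `x.t` would be rejected.

```rust
// Hypothetical type; `Copy` is what keeps `x` usable after `let y = x`.
#[derive(Clone, Copy)]
struct Foo {
    t: i32,
}

// Returns both bindings so the effect of the assignment can be inspected.
fn copy_then_modify() -> (Foo, Foo) {
    let mut x = Foo { t: 10 };
    let y = x; // copies the value; `y` now holds its own `Foo`
    x.t = 999; // changes only the value stored in `x`
    (x, y)
}

fn main() {
    let (x, y) = copy_then_modify();
    println!("x.t = {}, y.t = {}", x.t, y.t); // x.t = 999, y.t = 10
}
```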
So when you write
let x: Iterator<Item=&'a u32>;
(or the same in the function parameter list), Rust needs to allocate enough space for any value of type Iterator<Item=&'a u32>. Even if this were possible, it wouldn't be efficient.
So what Rust does instead is offer you the option to
Put the value on the heap, e.g. with Box, which gives Python-style semantics. Then you can take it generically with &mut Iterator<Item=&'a u32> (spelled &mut dyn Iterator<Item=&'a u32> in current Rust).
Specialize each function invocation for each possible type to satisfy the bound. This is more flexible, since a trait reference is a possible specialization, and gives the compiler more opportunities for specialization, but means you can't have dynamic dispatch (where the type can vary dependent on runtime parameters).
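The two options can be put side by side in a short sketch. The names `find_min_dyn` and `find_min_generic` are invented for this illustration, and the `dyn` keyword is the current spelling of trait object types.

```rust
// Dynamic dispatch: one compiled function; the concrete iterator type
// is hidden behind a reference to a trait object.
fn find_min_dyn<'a>(vals: &mut dyn Iterator<Item = &'a u32>) -> Option<&'a u32> {
    vals.min()
}

// Static dispatch: the compiler generates one copy per concrete type `I`.
fn find_min_generic<'a, I>(vals: I) -> Option<&'a u32>
where
    I: Iterator<Item = &'a u32>,
{
    vals.min()
}

fn main() {
    let data = vec![4u32, 2, 8];

    println!("{:?}", find_min_generic(data.iter())); // Some(2)
    println!("{:?}", find_min_dyn(&mut data.iter())); // Some(2)

    // A trait-object reference is itself an iterator, so it also
    // satisfies the generic bound.
    let mut it = data.iter();
    let as_dyn: &mut dyn Iterator<Item = &u32> = &mut it;
    println!("{:?}", find_min_generic(as_dyn)); // Some(2)
}
```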

Related

Function return type mismatch doesn't produce any errors

I'm new to Rust and trying to understand how reference works. Below is a simple function.
fn f1(x: &i32) -> &i32{
x
}
Since x is of type &i32, returning it directly matches the return type &i32. But I found that if I change the function to this, it also compiles without any problem:
fn f1(x: &i32) -> &i32{
&x
}
Here x is of type &i32, so &x is of type &&i32 and doesn't match the return type &i32. Why does it compile?
This is a result of type coercion.
To make the language more ergonomic, a certain set of coercions are allowed in specific situations. One of the situations is determining the return value of a function.
Among the allowed coercions is this:
&T or &mut T to &U if T implements Deref<Target = U>
In this particular case, there is an implementation of the deref trait in the std library:
impl<T: ?Sized> const Deref for &T {
type Target = T;
#[rustc_diagnostic_item = "noop_method_deref"]
fn deref(&self) -> &T {
*self
}
}
In your case, the target T is i32, and the trait is implemented for &T (i.e. &i32), so deref(&self) -> &T can coerce from &&i32 to &i32.
In your second function, Rust automatically dereferences &x to x (i.e. &&i32 to &i32). That's done via the so-called "Deref coercion".
Consider this example:
fn f1(x: &i32) -> &i32{
let t1: &&i32 = &x; // nothing special here
let t2: &i32 = &x; // this works too
let t3: &i32 = &&&&x; // this works too!
let t4: i32 = x; // this doesn't work though
x
}
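For reference, here is a small variant that compiles as a whole, since the example above deliberately contains a failing line. The helper `takes_str` is a hypothetical addition showing the same coercion with a different Deref implementation (String derefs to str).

```rust
// The `&x` from the question: `&&i32` is coerced to `&i32` at the return site.
fn f1(x: &i32) -> &i32 {
    &x
}

// Hypothetical helper: accepts `&str`, but callers may pass `&String`.
fn takes_str(s: &str) -> usize {
    s.len()
}

fn main() {
    let n = 5;
    // Deref coercion peels off the extra layers of references.
    let r: &i32 = &&&n;
    println!("{}", f1(r)); // 5

    let owned = String::from("hello");
    // `&String` coerces to `&str` because `String: Deref<Target = str>`.
    println!("{}", takes_str(&owned)); // 5
}
```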

How to return concrete type from generic function?

In the example below the Default trait is used just for demonstration purposes.
My questions are:
What is the difference between the declarations of f() and g()?
Why doesn't g() compile since it's identical to f()?
How can I return a concrete type out of an impl Trait generically typed declaration?
struct Something {
}
impl Default for Something {
fn default() -> Self {
Something{}
}
}
// This compiles.
pub fn f() -> impl Default {
Something{}
}
// This doesn't.
pub fn g<T: Default>() -> T {
Something{}
}
What is the difference between the declarations of f() and g()?
f returns some type which implements Default. The caller of f has no say in what type it will return.
g returns some type which implements Default. The caller of g gets to pick the exact type that must be returned.
You can clearly see this difference in how f and g can be called. For example:
fn main() {
let t = f(); // this is the only way to call f()
let t = g::<i32>(); // I can call g() like this
let t = g::<String>(); // or like this
let t = g::<Vec<Box<u8>>>(); // or like this... and so on!
// there's potentially infinitely many ways I can call g()
// and yet there is only 1 way I can call f()
}
Why doesn't g() compile since it's identical to f()?
They're not identical. The implementation for f compiles because it can only be called in 1 way and it will always return the exact same type. The implementation for g fails to compile because it can get called infinitely many ways for all different types but it will always return Something which is broken.
How can I return a concrete type out of an impl Trait generically typed declaration?
If I'm understanding your question correctly, you can't. When you use generics you let the caller decide the types your function must use, so your function's implementation itself must be generic. If you want to construct and return a generic type within a generic function the usual way to go about that is to put a Default trait bound on the generic type and use that within your implementation:
// now works!
fn g<T: Default>() -> T {
T::default()
}
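To see the caller choosing the type, here is a short usage sketch of that generic g. The particular types picked in main are arbitrary examples.

```rust
// The caller picks `T`; the body must work for every `T: Default`.
fn g<T: Default>() -> T {
    T::default()
}

fn main() {
    let n = g::<i32>(); // type chosen with the turbofish
    let s = g::<String>();
    let v: Vec<u8> = g(); // type chosen through the annotation
    println!("{:?} {:?} {:?}", n, s, v); // 0 "" []
}
```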
If you need to conditionally select the concrete type within the function then the only other solution is to return a trait object:
struct Something;
struct SomethingElse;
trait Trait {}
impl Trait for Something {}
impl Trait for SomethingElse {}
fn g(some_condition: bool) -> Box<dyn Trait> {
if some_condition {
Box::new(Something)
} else {
Box::new(SomethingElse)
}
}
how can I return a concrete type out of an "impl trait" generically typed declaration?
By "impl trait" generically typed declaration I presume you mean "impl trait" rewritten to use named generics. However, that's a false premise - impl Trait in return position was introduced precisely because you can't express it using named generics. To see this, consider first impl Trait in argument position, such as this function:
fn foo(iter: impl Iterator<Item = u32>) -> usize {
iter.count()
}
You can rewrite that function to use named generics as follows:
fn foo<I: Iterator<Item = u32>>(iter: I) -> usize {
iter.count()
}
Barring minor technical differences, the two are equivalent. However, if impl Trait is in return position, such as here:
fn foo() -> impl Iterator<Item = u32> {
vec![1, 2, 3].into_iter()
}
...you cannot rewrite it to use generics without losing generality. For example, this won't compile:
fn foo<T: Iterator<Item = u32>>() -> T {
vec![1, 2, 3].into_iter()
}
...because, as explained by pretzelhammer, the signature promises the caller the ability to choose which type to return (out of those that implement Iterator<Item = u32>), but the implementation only ever returns a concrete type, <Vec<u32> as IntoIterator>::IntoIter.
On the other hand, this does compile:
fn foo() -> <Vec<u32> as IntoIterator>::IntoIter {
vec![1, 2, 3].into_iter()
}
...but now the generality is lost because foo() must be implemented as a combination of Vec and into_iter() - even adding a map() in between the two would break it.
This also compiles:
fn foo() -> Box<dyn Iterator<Item = u32>> {
Box::new(vec![1, 2, 3].into_iter())
}
...but at the cost of allocating the iterator on the heap and disabling some optimizations.
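As a small illustration of the generality that return-position impl Trait keeps, the body below adds a map() step (an arbitrary choice for this sketch) without touching the signature or any caller.

```rust
// The concrete type is now `Map<vec::IntoIter<u32>, {closure}>`, which
// cannot even be named, yet the signature is unchanged.
fn foo() -> impl Iterator<Item = u32> {
    vec![1, 2, 3].into_iter().map(|x| x * 10)
}

fn main() {
    let collected: Vec<u32> = foo().collect();
    println!("{:?}", collected); // [10, 20, 30]
}
```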

How to write a function that takes a slice of functions?

I am trying to write a function that takes a slice of functions. Consider the following simple illustration:
fn g<P: Fn(&str) -> usize>(ps: &[P]) { }
fn f1() -> impl Fn(&str) -> usize { |s: &str| s.len() }
fn f2() -> impl Fn(&str) -> usize { |s: &str| s.len() }
fn main() {
g(&[f1(), f2()][..]);
}
It fails to compile:
error[E0308]: mismatched types
--> src/main.rs:6:15
|
6 | g(&[f1(), f2()][..]);
| ^^^^ expected opaque type, found a different opaque type
|
= note: expected type `impl for<'r> std::ops::Fn<(&'r str,)>` (opaque type)
found type `impl for<'r> std::ops::Fn<(&'r str,)>` (opaque type)
Is there any way to do this?
Your problem is that every element of the array must be of the same type, but the return of a function declared as returning impl Trait is an opaque type, that is an unspecified, unnamed type, that you can only use by means of the given trait.
You have two functions that return the same impl Trait but that does not mean that they return the same type. In fact, as your compiler shows, they are different opaque types, so they cannot be part of the same array. If you were to write an array of values of the same type, such as:
g(&[f1(), f1(), f1()]);
then it would work. But with different functions, there will be different types and the array is impossible to build.
Does that mean there is no solution for your problem? Of course not! You just have to invoke dynamic dispatch. That is you have to make your slice of type &[&dyn Fn(&str) -> usize]. For that you need to do two things:
Add a level of indirection: dynamic dispatching is always done via references or pointers (&dyn Trait or Box<dyn Trait> instead of Trait).
Do an explicit cast to the &dyn Trait to avoid ambiguities in the conversion.
There are many ways to do the cast: you can cast the first element of the array, declare temporary variables, or give the slice a type. I prefer the last option, because it is more symmetric. Something like this:
fn main() {
let fns: &[&dyn Fn(&str) -> usize] =
&[&f1(), &f2()];
g(fns);
}
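A self-contained sketch of the dynamic-dispatch approach follows. To make the result observable, this hypothetical variant of g returns what each function produced for a fixed input, and f2 is changed to count words so the two closures differ; neither change is part of the original question.

```rust
// Hypothetical variant of `g`: applies every function to the same input.
fn g<P: Fn(&str) -> usize>(ps: &[P]) -> Vec<usize> {
    ps.iter().map(|p| p("hello world")).collect()
}

fn f1() -> impl Fn(&str) -> usize {
    |s: &str| s.len()
}

// Differs from `f1` only so that the two results can be told apart.
fn f2() -> impl Fn(&str) -> usize {
    |s: &str| s.split_whitespace().count()
}

fn main() {
    let (a, b) = (f1(), f2());
    // The annotation forces both references to coerce to the same
    // trait-object type, so they can share one slice.
    let fns: &[&dyn Fn(&str) -> usize] = &[&a, &b];
    println!("{:?}", g(fns)); // [11, 2]
}
```

Here P is inferred as `&dyn Fn(&str) -> usize`, which satisfies the bound because a shared reference to a callable is itself callable.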

How to use an external object in a struct callback such as when appending data to a CSV?

My understanding was that objects created outside of the scope are available inside the scope (hence things such as shadowing allowed), but it does not seem to work in this scenario:
extern crate csv;
extern crate rand;
use rand::Rng;
use std::path::Path;
use std::time::SystemTime;
#[derive(Debug)]
struct Event {
time: SystemTime,
value: u32,
}
impl Event {
fn new(t: SystemTime, n: u32) -> Event {
Event {
time: SystemTime,
value: n,
}
}
}
struct Process;
impl Process {
fn new() -> Process {
Process {}
}
fn start(&self) {
loop {
let now = SystemTime::now();
let random_number: u32 = rand::thread_rng().gen();
let event = Event::new(now, random_number);
self.callback(event);
}
}
fn callback(&self, event: Event) {
println!("{:?}", event);
wtr.write_record(&event).unwrap();
wtr.flush().unwrap();
}
}
fn main() {
let file_path = Path::new("test.csv");
let mut wtr = csv::Writer::from_path(file_path).unwrap();
let process: Process = Process::new();
process.start();
}
The errors are:
error[E0423]: expected value, found struct `SystemTime`
--> src/main.rs:17:19
|
17 | time: SystemTime,
| ^^^^^^^^^^ constructor is not visible here due to private fields
error[E0425]: cannot find value `wtr` in this scope
--> src/main.rs:41:9
|
41 | wtr.write_record(&event).unwrap();
| ^^^ not found in this scope
error[E0425]: cannot find value `wtr` in this scope
--> src/main.rs:42:9
|
42 | wtr.flush().unwrap();
| ^^^ not found in this scope
How can I append data (Event) to a CSV file from inside the callback function for Process?
I strongly encourage you to go back and re-read The Rust Programming Language, specifically the chapter about functions. This code appears to show fundamental issues around the entire model of how functions work.
For example, the code attempts to make use of the variable wtr in the function callback without it being passed in either directly or indirectly.
If such code worked[1], programmers would likely hate dealing with this language because it would be almost impossible to tell what the value wtr is and where it came from.
The solution is straightforward: pass any value that a piece of code needs to that code. Then it's easy (or easier) to tell where the value came from. There are multiple avenues that can work.
Pass an argument to the callback method:
use std::io::Write;
impl Process {
fn start<R>(&self, wtr: &mut csv::Writer<R>)
where
R: Write,
{
loop {
// ...
self.callback(wtr, event);
}
}
fn callback<R>(&self, wtr: &mut csv::Writer<R>, event: Event)
where
R: Write,
{
// ...
}
}
fn main() {
// ...
process.start(&mut wtr);
}
Pass an argument to the constructor and save it inside the struct:
use std::io::Write;
struct Process<'a, R>
where
R: Write + 'a,
{
wtr: &'a mut csv::Writer<R>,
}
impl<'a, R> Process<'a, R>
where
R: Write,
{
fn new(wtr: &'a mut csv::Writer<R>) -> Self {
Process { wtr }
}
// ...
fn callback(&mut self, event: Event) {
// ...
// `&mut self` is needed here: writing goes through the `&mut` writer stored in `self`
self.wtr.write_record(event).unwrap();
self.wtr.flush().unwrap();
}
}
fn main() {
// ...
let mut process = Process::new(&mut wtr);
}
}
The code has other issues in how it uses the CSV library that I'm ignoring because they are unrelated to your question. I encourage you to start with a simpler piece of code, get it working, then make it more complex. That way you are only dealing with simpler errors at first.
Once you understand this basic usage of functions, you may wish to learn about closures. These allow you to "capture" variables from an outer scope and pass them in (in the same two methods as above) without having to deal with the specific count or type of variables.
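A rough sketch of the closure approach, using only the standard library so that it runs without the csv or rand crates. The simplified `Event`, the `count` parameter that replaces the infinite loop, and the `run` helper are all assumptions made for this illustration.

```rust
use std::io::Write;

// Simplified, hypothetical stand-in for the question's `Event`.
struct Event {
    value: u32,
}

struct Process;

impl Process {
    // `start` neither knows nor cares what the callback has captured.
    // A fixed `count` replaces the question's infinite loop.
    fn start<F: FnMut(Event)>(&self, count: u32, mut callback: F) {
        for value in 0..count {
            callback(Event { value });
        }
    }
}

// Runs the process against any writer and hands the writer back.
fn run<W: Write>(mut wtr: W, count: u32) -> W {
    let process = Process;
    // The closure captures `wtr` from the enclosing scope by mutable reference.
    process.start(count, |event| {
        writeln!(wtr, "{}", event.value).unwrap();
    });
    wtr
}

fn main() {
    // A `Vec<u8>` stands in for the CSV file.
    let out = run(Vec::new(), 3);
    print!("{}", String::from_utf8(out).unwrap());
}
```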
objects created outside of the scope are available inside the scope
This is true for a single function. It does not apply across functions.
hence things such as shadowing allowed
Shadowing has nothing to do with scopes. You are allowed to shadow inside the same scope:
let a = Some(32);
let a = a.unwrap();
1. Such languages exist; they are languages with dynamic scope, and some people prefer them. They are in the minority; programs written in these languages are hard to reason about!

Trying to write a CSV record has the error "the trait `std::convert::AsRef<[u8]>` is not implemented for `u8`"

Here's a minimal repro:
extern crate csv;
use std::fs::File;
use std::io::Write;
fn do_write(writer: &mut csv::Writer<File>, buf: &[u8]) {
// The error is coming from this line
writer.write_record(buf);
}
fn main() {
let mut writer = csv::Writer::from_path(r"c:\temp\file.csv").unwrap();
let str = "Hello, World!".to_string();
do_write(&mut writer, str.as_bytes());
}
Which causes a compilation error:
error[E0277]: the trait bound `u8: std::convert::AsRef<[u8]>` is not satisfied
--> src/main.rs:7:16
|
7 | writer.write_record(buf);
| ^^^^^^^^^^^^ the trait `std::convert::AsRef<[u8]>` is not implemented for `u8`
|
= note: required because of the requirements on the impl of `std::convert::AsRef<[u8]>` for `&u8`
What does this error mean? It seems that I'm already passing a u8 slice?
Review the signature for write_record:
fn write_record<I, T>(&mut self, record: I) -> Result<()>
where
I: IntoIterator<Item = T>,
T: AsRef<[u8]>,
It expects something that can become an iterator of values, where each value is one field of the record. You are providing a &[u8], which can be turned into an iterator, but only of &u8 values. The error is that these &u8s do not implement AsRef<[u8]>.
You can wrap the single passed-in string in another array to create something that can act as an iterator:
writer.write_record(&[buf]);
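To experiment with the bounds without the csv crate, here is a hypothetical stand-in that copies only the generic signature shown above. It joins fields with commas and does no quoting or escaping, so it is not how the real writer behaves.

```rust
// Hypothetical stand-in with the same bounds as `write_record`.
// It only joins the fields with commas; no quoting or escaping is done.
fn write_record<I, T>(record: I) -> Vec<u8>
where
    I: IntoIterator<Item = T>,
    T: AsRef<[u8]>,
{
    let mut out = Vec::new();
    for (i, field) in record.into_iter().enumerate() {
        if i > 0 {
            out.push(b',');
        }
        out.extend_from_slice(field.as_ref());
    }
    out.push(b'\n');
    out
}

fn main() {
    let buf: &[u8] = b"Hello, World!";
    // `write_record(buf)` would not compile: its items are `&u8`.
    let one = write_record(&[buf]); // a record with a single field
    let two = write_record(&["a", "b"]); // string slices work as fields too
    print!(
        "{}{}",
        String::from_utf8_lossy(&one),
        String::from_utf8_lossy(&two)
    );
}
```

Wrapping the slice in an array makes each item a whole byte slice, which does implement AsRef<[u8]>.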