Rust iterator adapters for handling ownership
Cătălin May 01, 2023 [Rust] #Iterators
Intro
If you've passed the beginner stages of learning Rust, you've probably come across the suggestion to replace your dusty old for-loops with iterators. This is sound advice for multiple reasons.
Firstly, this is more succinct and will make the code more semantically representative of its intent. This approach also leaves less room for errors, as the author will not use additional variables for managing the loop. The cherry on top is that this has virtually no performance cost over using a plain old for-loop.
If you're familiar with functional operators from other languages, this shouldn't be too hard, but Rust's ownership rules can make it slightly more difficult than expected.
Let's discuss some gotchas using the following example:
We have a sequence of bytes, and we want to collect segments of it into various fields on a type.
The idiomatic way to handle this would be to iterate, use the take adapter, and then collect into the owned structure we want. The first, fairly trivial problem we'll encounter is that the iterator gives us references (each wrapped in a Some() by next()), while we need owned values for the collection.
    // does not compile
    // (`Vec<u8>` and the segment length 4 are illustrative)
    let segment: Vec<u8> = input.iter()
        .take(4)
        .collect();
    // ........
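To see why the compiler objects, here is a minimal sketch, assuming the input is a byte slice (the function name `first_refs` is made up for illustration). The same chain compiles once the target collection holds references, which shows that the items coming out of `iter()` are borrowed:

```rust
/// Collects the first `n` items of `input` as references.
/// This compiles because `slice::iter` yields `&u8` items,
/// and a `Vec<&u8>` can be built directly from them.
fn first_refs(input: &[u8], n: usize) -> Vec<&u8> {
    input.iter().take(n).collect()
}
```

Calling `first_refs(&data, 2)` on a byte array gives back a vector of borrows into `data`, not owned bytes, which is rarely what we want to store in a struct field.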
The 'copied' and 'cloned' iterator adapters
The naive way to handle this would be to call .map(|elem| *elem) to dereference each element. This will work, but if you check with Clippy, you'll get a warning, as this is not the idiomatic way to handle this scenario.
Clippy recommends using the .copied() adapter, which makes the iterator copy each of its elements.
    // (`Vec<u8>` and the segment length 4 are illustrative)
    let segment: Vec<u8> = input.iter()
        .copied()
        .take(4)
        .collect();
    // ........
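As a quick check that the two approaches are equivalent, here is a small sketch that puts them side by side. The function names and the `n` parameter are hypothetical, chosen only for this comparison:

```rust
/// Dereferences each element by hand; works for `Copy` types,
/// but Clippy flags it as non-idiomatic.
fn segment_with_map(input: &[u8], n: usize) -> Vec<u8> {
    input.iter().map(|elem| *elem).take(n).collect()
}

/// Same result using the `copied` adapter.
fn segment_with_copied(input: &[u8], n: usize) -> Vec<u8> {
    input.iter().copied().take(n).collect()
}
```

Both functions return an owned `Vec<u8>` with the first `n` bytes, leaving `input` untouched.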
Now, the .copied() adapter works only for Copy types and has the same effect as the map() call discussed previously. What would happen for types that are Clone but not Copy?
Right off the bat, the .map() call wouldn't just produce a warning; it would result in a compile error, because the dereference would move the value out of a borrow. The .map() call can be fixed by replacing the dereference with a call to .clone().
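    #[derive(Clone)]
    struct Record { // placeholder name
        id: u32,
        name: String
    }

    // (the segment length 4 is illustrative)
    let segment: Vec<Record> = input.iter()
        .map(|elem| elem.clone())
        .take(4)
        .collect();
    // ........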
The idiomatic way to do it is to use the .cloned() adapter on the iterator, which does the same thing as above.
    // --- struct code ---

    // (`Record` stands for the struct above; the segment length 4 is illustrative)
    let segment: Vec<Record> = input.iter()
        .take(4)
        .cloned()
        .collect();
    // ........
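For a version that can be compiled on its own, here is a minimal sketch. The struct name `Record`, the function name `first_records`, and the extra `Debug` and `PartialEq` derives are assumptions made for the example; only the `id` and `name` fields come from the snippet above:

```rust
/// Hypothetical element type: `Clone` but not `Copy`, because it owns a `String`.
#[derive(Clone, Debug, PartialEq)]
struct Record {
    id: u32,
    name: String,
}

/// Returns owned clones of the first `n` records, leaving `input` intact.
fn first_records(input: &[Record], n: usize) -> Vec<Record> {
    input.iter().take(n).cloned().collect()
}
```

Replacing `.cloned()` with `.copied()` here would be rejected by the compiler, since `Record` does not implement `Copy`.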
The copied adapter will not compile if you use it on a type that is Clone but not Copy. On the other hand, using the cloned adapter on a Copy type compiles and will at most give a linter warning: the clone is essentially just a copy, but it's better to be explicit about it.
Note: .copied() and .cloned() can be placed after either .iter() or .take(), since both return types implement the Iterator trait. With take, the order doesn't change the number of clones: iterators are lazy, and take stops pulling elements once it has yielded the requested number, so only those elements are cloned either way. The order does matter with adapters that discard elements, such as filter: placing .cloned() after the filter clones only the elements that pass it.
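Whether the position of `.cloned()` affects the number of clones can be checked empirically. The sketch below uses a hypothetical `Counted` type whose `Clone` implementation bumps a shared counter, and a hypothetical helper `count_clones` that runs a pipeline and reports how many element clones it performed:

```rust
use std::cell::Cell;
use std::rc::Rc;

/// Hypothetical element type that records every call to `clone`.
struct Counted {
    id: usize,
    clones: Rc<Cell<usize>>,
}

impl Clone for Counted {
    fn clone(&self) -> Self {
        // Count this clone on the shared counter.
        self.clones.set(self.clones.get() + 1);
        Counted { id: self.id, clones: Rc::clone(&self.clones) }
    }
}

/// Builds `len` elements sharing one counter, runs `pipeline` over them,
/// and returns how many element clones the pipeline performed.
fn count_clones(len: usize, pipeline: impl Fn(&[Counted]) -> Vec<Counted>) -> usize {
    let counter = Rc::new(Cell::new(0));
    let input: Vec<Counted> = (0..len)
        .map(|id| Counted { id, clones: Rc::clone(&counter) })
        .collect();
    let _output = pipeline(&input);
    counter.get()
}
```

With ten elements, `count_clones(10, |s| s.iter().cloned().take(3).collect())` and `count_clones(10, |s| s.iter().take(3).cloned().collect())` both report three clones. Swapping `take(3)` for `filter(|c| c.id < 3)` makes the difference visible: cloning before the filter clones all ten elements, cloning after it clones three.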
The 'by_ref' adapter
Getting past this, we might encounter a second issue: if we try to call .take() repeatedly to collect multiple groups from the same iterator, the compiler won't let us. This is because take takes the iterator by value and consumes it. In other words, ownership is moved, and the original iterator variable is no longer valid for the following calls.
If we want to avoid consuming the iterator with the take call, we can call the by_ref adapter on it before each take, like so:
    let mut byte_iter = input.iter().copied();

    // (`Vec<u8>` and the segment length 4 are illustrative)
    let first_segment: Vec<u8> = byte_iter.by_ref()
        .take(4)
        .collect();

    let second_segment: Vec<u8> = byte_iter.by_ref()
        .take(4)
        .collect();
    // ........
In this manner, we can take multiple groups from the iterator without compiler errors. Keep in mind that the by_ref for the last segment isn't strictly necessary unless we want to keep using the iterator afterward.
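Putting the pieces together, here is a sketch of the original scenario: splitting a byte sequence into fields on a type. The `Packet` struct, its field names, and the `parse` function are all hypothetical, made up for this illustration:

```rust
/// Hypothetical target type with three owned byte fields.
#[derive(Debug, PartialEq)]
struct Packet {
    header: Vec<u8>,
    body: Vec<u8>,
    rest: Vec<u8>,
}

/// Splits `input` into a header, a body, and whatever remains.
fn parse(input: &[u8], header_len: usize, body_len: usize) -> Packet {
    let mut bytes = input.iter().copied();

    // `by_ref` lends the iterator to `take`, so `bytes` stays usable afterward.
    let header: Vec<u8> = bytes.by_ref().take(header_len).collect();
    let body: Vec<u8> = bytes.by_ref().take(body_len).collect();

    // Last use: consuming the iterator directly is fine here.
    let rest: Vec<u8> = bytes.collect();

    Packet { header, body, rest }
}
```

For example, `parse(&[1, 2, 3, 4, 5, 6], 2, 3)` yields a header of `[1, 2]`, a body of `[3, 4, 5]`, and a rest of `[6]`. If the input is shorter than requested, `take` simply yields fewer elements, so a real parser would likely want to validate the lengths.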