Type bounds for standard library types in Rust
Cătălin September 17, 2023 [Rust] #Traits #Standard LibraryIntro
I often hear complaints that Rust code is hard to read. The following is an example of this:
What's going on here in the trait bounds for this implementation?
I'm going to show that the conditions here are not as complex as they appear, and I'm also going to explain some of the reasoning behind the standard library design for this. We are also going to cover two more advanced topics around traits: associated types and blanket implementations. As a bonus, this will also explain the relationship between the Iterator
and IntoIterator
traits - which are very important for designing APIs / libraries.
Check out the video
If you don't feel like reading:
Iterator
Let's start with the first trait involved in the definition: Iterator
This trait allows us to process collections by iterating over their elements. It has multiple associated methods, but the only one that you need to implement is next
. All the other methods come with default implementations by using next. This is only a small part of them.
// Provided methods
ⓘ
where Self: Sized
ⓘ
where Self: Sized,
U:
ⓘ
where Self: Sized,
U: IntoIterator
//...
You can overwrite these implementations if you want, but you get them for free after providing an implementation for next
.
The next
method will return the upcoming Item
in the collection, wrapped in an Option
. As you can see in the method signature, we're returning the Item
type of Self
. Self
is used here to denote the type of the struct that will implement Iterator
.
Item
is declared above using the "type Item" declaration. This alias is what we call an associated type.
Now, you may be thinking: "Why do we need this special syntax, isn't this the same as using generics? Can't we just do it like this?:
And yes, we can always choose to use generic types instead of associated types, but the effect is different in terms of API design. Associated types will force a single concrete type for each implementing struct, while generics will allow any number of types for a given implementing struct.
To illustrate this, if the Iterator
trait would use generic syntax, when I implement it for a collection of Strings
I could define an implementation that returns String
as the Item (which is fine), but I could also define one that returns a number, dates, and so on... which is not what we want as a library designer.
//...
Since I know that there is only one type that makes sense as the return Item, I can limit users of my library to only defining one type per implementation by using by associated types. This is why we say that associated types have a "grouping" effect - If we were to use a generic instead, we would get a whole family of possible implementations - as we saw in the previous example.
In other words, for a given set of generic trait params, the associated types of the trait will be constant. Yes! what I said means that we can use generics together with associated types, as we'll see with the upcoming examples.
I said that the associated type can only have a single concrete value for each implementing type. What happens if I add another impl block trying to redefine it?
Here's the struct definition for which we're implementing Iterator
:
//doesn't compile
After adding the second block that tries to redefine Item
, we'll get the following compilation error:
conflicting implementations of trait std::iter::Iterator for type Something
If we now introduce a generic parameter T
, we can define different items for Something<T>
without the error. This is because Something<u8>
is an essentially different type from Something<u16>
.
//doesn't compile
But, after the third block is added using the same generic <u16>
param - I'll get the error again. It doesn't even matter if I change Item or not as the compiler knows that there should be only one implementation definition for Something <u16>
conflicting implementations of trait std::iter::Iterator for type Something<u16>
IntoIterator
Okay, let's look at the next trait we need: IntoIterator
This is a trait that we use to obtain an Iterator
from other types. IntoIterator
won't handle iteration, but it will provide an Iterator
to us. The types which will usually implement IntoIterator
will be collection types. Let's look at the associated types in the definition - we've seen Item
, what about IntoIter
?
IntoIter
is the associated type that marks the kind of Iterator that IntoIterator
should return when the required method is called. The trait bound is logical - since it is the return type of conversion method IntoIter
needs to be something that implements Iterator
.
But now we have something weird in the angle brackets. This means that the Item
of the IntoIter
associated type needs to be the same as the Item
on Self (which is the type that will implement IntoIterator
.
You might ask, why do we need this Item
inside IntoIterator
- can't we just use the associated Item
of the associated IntoIter
type? And you're right, type Item
is completely unnecessary here. You could always just refer to Item
of IntoIter
, but for generic implementations (which are used a lot), this would add an extra generic type and an extra where
clause. Here's an example.
Before the change, you would have to do this:
After the Item
type was added to IntoIterator
as well, we can write it like this:
This gets rid of a generic type and a line in the where clause.
In the implementation, we're calling .into_iter()
on the collection and then calling .next()
on the resulting iterator as long as we get a Some
variant as a result. When we get a None
, iteration stops.
As you may know, the implementation can be simplified to this:
for elem in collection
The for loop is just syntactic sugar over what we were doing before - converting to an iterator and calling .next()
on it until it returns a None
variant.
You may also have observed that where
clauses can be added to normal functions as well - not just for trait definitions or implementations.
Coming back to IntoIterator
's design - the standard library team decided to add the Item
to it, to simplify common where clauses and generic types. So, while the Item
wasn't necessary for IntoIterator
to work, since it was added to the definition, now it's mandatory to define when implementing IntoIterator
.
For maximum flexibility, a lot of utility methods work with IntoIterator
so you don't have to manually convert to an Iterator
before calling them. But what about cases where I already have an Iterator
? - how do I call those methods then?
Fortunately, all Iterators implement IntoIterator
by default How was this achieved? - by using some Rust magic called "blanket implementations". Here we have the implementation block for this:
Since IntoIterator
just has to return an Iterator
, Iterators trivially implement this by returning themselves. One important thing to note here is that this implementation is for any generic type I
that implements Iterator
. This means that any newly created type that implements Iterator
will also implement IntoIterator
because of this. This is the power of blanket implementations - We can have implementations for types before we even write them!
Flatten
Now that we've wrapped this up, let's go back to the implementation definition at the start
We're implementing Iterator for the Flatten
struct. Flatten
is an abstraction over a one-level nested Iterator
structure - an iterator that returns collections that can also be iterated on. We have a 2d structure that we want to treat like a 1d structure. Normally, we would have to iterate over each separate collection in the nested structure manually. The Flatten
abstracts the job of handling each collection on it's own, allowing us to iterate over the structure as if it were a regular iterator
Let's look at the definition of the struct:
Here, you can already see the descriptive power of Rust's type system. Flatten
is generic over I where I has to implement Iterator
and the associated Item
of that Iterator
needs to Implement IntoIterator
. This essentially means that Flatten works for any Iterator
that returns Items that can be converted into Iterators.
We can get Flattens by calling the .flatten()
method on any Iterator
. This means the .flatten()
method is part of Iterator
's standard provided methods. Here's the implementation definition for the it:
The first where clause indicates that the Iterator
needs to implement Sized
- this is a marker trait which indicates that the type needs to have a known size at compile time.
The second clause indicates that the associated Item
of the Iterator
needs to implement IntoIterator
Let's see how the Flatten
is used actually used:
let nested: = vec!;
let simple = vec!;
let flatten = nested.iter.flatten;
for &val in flatten
//doesn't compile
let flatten_fail = simple.iter.flatten;
Here we have a nested vector - a Vec
containing two Vec
s of number, and a simple one containing numbers. We can get an iterator from the nested Vec
and call .flatten()
on it. Now, we can just iterate over the Flatten
that we created.
If I try to call .flatten()
with the simple vector, I'll get a compilation error - because this does not satisfy the type bounds for flatten. The item of the obtained Iterator
needs to implement IntoIterator
, where here, it's just a reference to a number.
If Flatten
allows us to iterate over it, this means that Iterator
is implemented for Flatten
- and that's exactly the definition we saw at the beginning:
Before you try to run this on your own, I have to warn you - this kind of syntax is unstable, so you won't be able to use it when writing your own code - it's not available on the Rust nightly toolchain either. Rust library designers can use unstable features as long as they are enclosed within other features that are marked as stable in the library. So.... don't try this at home! I'll show you how you'd write this on your own with stable syntax after I explain what the current syntax is doing.
This implementation is generic over I
and U
. I
here marks the outer wrapping iterator. The bound specifies that the Item
of this outer Iterator
has to be IntoIterator
(or in other words, has to be a collection that can be converted to an Iterator
).
We mark the associated IntoIter
type as being the generic type U
in the definition, and the Item
is defined as the Item
of U
(which is IntoIter)
. As explained before, this relationship is already implicit - the Item
of IntoIterator
has to be the Item
of the associated IntoIter
type - this is specified (and restricted) in the IntoIterator
definition, but we need to specify it here again because at this time the Rust compiler doesn't treat the bounds from the trait definition as also implicit for the implementation when using this kind of syntax.
There's a similar limitation for the stable syntax as well (though it applies somewhat differently) The language team has been looking at making the compiler recognize these implicit bounds so you don't have to repeat them, but some of the implementation details are not easy to resolve.
This is also the case for U: Iterator
in the second line of the where clause - it's also specified in the trait definition. In other words, this clause would not be needed if the implied bounds feature were available. If we use the stable syntax for the implementation, we can get rid of the U
type, but the Item
definition gets more complicated:
This represents that the Item
of the Iterator
(in this case the type of the inner collections) needs to implement IntoIterator
(this is exactly as in the Flatten
struct definition)
Here we're defining that the Item
of the Flatten
is the Item
associated with the outer Iterator
's Item
. We use the "as
" syntax here to make it clear to the compiler on which trait definition we're accessing the associated type Item
. Right now, the compiler needs this type qualification because the associated name (Item
in this case) might be defined on multiple traits that the generic type I
can implement - as with this case I
implements Iterator
and IntoIterator
, both with associated Items. In this particular situation, the Items happen to be identical, but in case of a conflict, the compiler wouldn't know which to pick.
Phew... that was quite a bit - hope that you're more comfortable reading trait bounds and that you've got a better grasp of how associated types and generic implementations are used.