Void() Mathieu De Coster

Fighting the Borrow Checker

One of the most common questions beginners ask about Rust is “How do I satisfy the borrow checker?”. The borrow checker is probably one of the steepest parts of Rust’s learning curve, and it is understandable that beginners have some trouble applying the concepts in a real-world situation.

Just recently, someone on the Rust subreddit posted a thread called “Tips to not fight the borrow checker?”.

Many Rust community members replied with helpful tips on how to avoid running into trouble with the borrow checker - tips that also shed some light on how you should design your Rust code (hint: not like your OOP Java code).

In this blog post, I will attempt to create some pseudo-real-world examples of common pitfalls.

First, let’s recap the borrow checker rules:

  • You can only have one mutable reference to a variable at a time
  • You can have as many immutable references to a variable as you want
  • You cannot mix mutable and immutable references to the same variable at the same time
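The rules above can be sketched in a few lines. This is a minimal illustration rather than anything from the subreddit thread; the helpers total_len and shout and the function rules_demo are made-up names.

```rust
// Hypothetical helper: only reads, so shared references are enough.
fn total_len(a: &str, b: &str) -> usize {
    a.len() + b.len()
}

// Hypothetical helper: modifies the string, so it needs exclusive access.
fn shout(s: &mut String) {
    s.push('!');
}

fn rules_demo() -> (usize, String) {
    let mut greeting = String::from("hello");

    // Any number of immutable references may exist side by side.
    let r1 = &greeting;
    let r2 = &greeting;
    let len = total_len(r1, r2);

    // Only one mutable reference at a time, and no immutable
    // reference may be used while it is active.
    let m = &mut greeting;
    shout(m);

    // Uncommenting the next line keeps `r1` in use across the mutable
    // borrow above, which the compiler rejects.
    // println!("{}", r1);

    (len, greeting)
}

fn main() {
    let (len, greeting) = rules_demo();
    println!("{} {}", len, greeting);
}
```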

Blocks that are too long

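Because you can only have one mutable borrow at any given time, you might get issues if you want to mutably borrow something twice in the same function. On compilers that predate non-lexical lifetimes (stabilised in Rust 1.31 for the 2018 edition), a borrow lasts until the end of its enclosing block, so the borrow checker complains even if the uses don’t overlap. Newer compilers end a borrow at its last use and only reject code in which the uses really do overlap.

Let’s look at an example that such an older compiler rejects.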

struct Person {
    name: String,
    age: u8,
}

impl Person {
    fn new(name: &str, age: u8) -> Person {
        Person {
            name: name.into(),
            age: age,
        }
    }
    
    fn celebrate_birthday(&mut self) {
        self.age += 1;
        
        println!("{} is now {} years old!", self.name, self.age);
    }
    
    fn name(&self) -> &str {
        &self.name
    }
}

fn main() {
    let mut jill = Person::new("Jill", 19);
    let jill_ref_mut = &mut jill;
    jill_ref_mut.celebrate_birthday();
    println!("{}", jill.name()); // cannot borrow `jill` as immutable
                                 // because it is also borrowed
                                 // as mutable
}

The problem here is that we have mutably borrowed jill and then try to use jill again to print her name while jill_ref_mut is still in scope. The fix is to limit the scope of the borrow.

fn main() {
    let mut jill = Person::new("Jill", 19);
    {
        let jill_ref_mut = &mut jill;
        jill_ref_mut.celebrate_birthday();
    }
    println!("{}", jill.name());
}

In general, it can be a good idea to limit the scope of your mutable references. This avoids problems like the one showcased above.
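As a sketch of how this plays out on a compiler with non-lexical lifetimes (stable since Rust 1.31 for the 2018 edition): an explicit block always ends the borrow, and without a block the borrow ends at the last use of the mutable reference. The Counter type and both functions are made up for illustration.

```rust
// Hypothetical type, used only for this illustration.
struct Counter {
    hits: u32,
}

impl Counter {
    fn bump(&mut self) {
        self.hits += 1;
    }

    fn hits(&self) -> u32 {
        self.hits
    }
}

// An explicit block ends the mutable borrow on every compiler version.
fn scoped() -> u32 {
    let mut c = Counter { hits: 0 };
    {
        let c_mut = &mut c;
        c_mut.bump();
        c_mut.bump();
    } // `c_mut` goes out of scope here
    c.hits()
}

// With non-lexical lifetimes the borrow ends at the last use instead.
fn last_use() -> u32 {
    let mut c = Counter { hits: 0 };
    let c_mut = &mut c;
    c_mut.bump(); // last use of `c_mut`
    let seen = c.hits();
    // Uncommenting the next line stretches the mutable borrow across
    // `c.hits()` above, and the compiler rejects the function.
    // c_mut.bump();
    seen
}

fn main() {
    println!("{} {}", scoped(), last_use());
}
```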

Chaining function calls

You often want to chain function calls to reduce the number of local variables and let-bindings in your code. Suppose you have a library that provides Person and Name structs, and you want to get a reference to a person’s first name.

#[derive(Clone)]
struct Name {
    first: String,
    last: String,
}

impl Name {
    fn new(first: &str, last: &str) -> Name {
        Name {
            first: first.into(),
            last: last.into(),
        }
    }
    
    fn first_name(&self) -> &str {
        &self.first
    }
}

struct Person {
    name: Name,
    age: u8,
}

impl Person {
    fn new(name: Name, age: u8) -> Person {
        Person {
            name: name,
            age: age,
        }
    }
    
    fn name(&self) -> Name {
        self.name.clone()
    }
}

fn main() {
    let name = Name::new("Jill", "Johnson");
    let mut jill = Person::new(name, 20);
    
    let name = jill.name().first_name(); // temporary value dropped
                                         // while borrowed
    println!("{}", name); // this later use is what keeps the borrow alive
}

The problem here is that Person::name returns an owned value instead of a reference. If we then try to obtain a reference using Name::first_name, the borrow checker will complain: as soon as the statement ends, the value returned from jill.name() is dropped, and name would be a dangling reference.

The solution is to introduce a temporary variable.

fn main() {
    let name = Name::new("Jill", "Johnson");
    let mut jill = Person::new(name, 20);
    
    let name = jill.name();
    let name = name.first_name();
}

Normally, we would return a &Name from Person::name, but there are some cases in which returning an owned value is the only reasonable option. If this happens to you, it’s good to know how to fix your code.
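For comparison, here is a minimal sketch of the “normal” case, assuming you control the library and can change Person::name to return &Name. The chained call then compiles without a temporary, because nothing is created and dropped in the middle of the chain. The helper first_name_of is a made-up name.

```rust
struct Name {
    first: String,
    last: String,
}

impl Name {
    fn first_name(&self) -> &str {
        &self.first
    }
}

struct Person {
    name: Name,
    age: u8,
}

impl Person {
    // Returns a reference instead of a clone, so the result borrows
    // from the person rather than from a short-lived temporary.
    fn name(&self) -> &Name {
        &self.name
    }
}

// Hypothetical helper: the chain works because every link is a borrow
// of `person`, which outlives the whole expression.
fn first_name_of(person: &Person) -> &str {
    person.name().first_name()
}

fn main() {
    let jill = Person {
        name: Name {
            first: "Jill".into(),
            last: "Johnson".into(),
        },
        age: 20,
    };

    let first = jill.name().first_name();
    println!("{} {} ({})", first, jill.name.last, jill.age);
    println!("{}", first_name_of(&jill));
}
```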

Circular references

Sometimes, you end up with circular references in your code. This is something I used to do way too often in C. Trying to fight the borrow checker in Rust showed me how dangerous this kind of code can actually be.

Let’s create a representation of a class with enrolled pupils. The class references the pupils, and they also keep references to the classes they are enrolled in.

struct Person<'a> {
    name: String,
    classes: Vec<&'a Class<'a>>,
}

impl<'a> Person<'a> {
    fn new(name: &str) -> Person<'a> {
        Person {
            name: name.into(),
            classes: Vec::new(),
        }
    }
}

struct Class<'a> {
    pupils: Vec<&'a Person<'a>>,
    teacher: &'a Person<'a>,
}

impl<'a> Class<'a> {
    fn new(teacher: &'a Person<'a>) -> Class<'a> {
        Class {
            pupils: Vec::new(),
            teacher: teacher,
        }
    }
    
    fn add_pupil(&'a mut self, pupil: &'a mut Person<'a>) {
        pupil.classes.push(self);
        self.pupils.push(pupil);
    }
}

fn main() {
    let jack = Person::new("Jack");
    let jill = Person::new("Jill");
    let teacher = Person::new("John");
    
    let mut borrow_chk_class = Class::new(&teacher);
    borrow_chk_class.add_pupil(&mut jack);
    borrow_chk_class.add_pupil(&mut jill);
}

If we try to compile this, we get bombarded with errors. The main problem here is that we are trying to store references to classes in persons and vice versa. When the variables get dropped (in reverse order of declaration), teacher will be dropped while still being referenced indirectly by jill and jack through their enrollments!

The simplest (but hardly cleanest) solution here is to sidestep the compile-time borrow checks and use Rc<RefCell<T>> instead.

use std::rc::Rc;
use std::cell::RefCell;

struct Person {
    name: String,
    classes: Vec<Rc<RefCell<Class>>>,
}

impl Person {
    fn new(name: &str) -> Person {
        Person {
            name: name.into(),
            classes: Vec::new(),
        }
    }
}

struct Class {
    pupils: Vec<Rc<RefCell<Person>>>,
    teacher: Rc<RefCell<Person>>,
}

impl Class {
    fn new(teacher: Rc<RefCell<Person>>) -> Class {
        Class {
            pupils: Vec::new(),
            teacher: teacher,
        }
    }
    
    fn pupils_mut(&mut self) -> &mut Vec<Rc<RefCell<Person>>> {
        &mut self.pupils
    }
    
    fn add_pupil(class: Rc<RefCell<Class>>, pupil: Rc<RefCell<Person>>) {
        pupil.borrow_mut().classes.push(class.clone());
        class.borrow_mut().pupils_mut().push(pupil);
    }
}

fn main() {
    let jack = Rc::new(RefCell::new(Person::new("Jack")));
    let jill = Rc::new(RefCell::new(Person::new("Jill")));
    let teacher = Rc::new(RefCell::new(Person::new("John")));
    
    let borrow_chk_class = Rc::new(RefCell::new(Class::new(teacher)));
    Class::add_pupil(borrow_chk_class.clone(), jack);
    Class::add_pupil(borrow_chk_class, jill);
}
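One caveat with this version, noted here as an aside rather than as part of the original design: a class and its pupils now hold strong Rc pointers to each other, and such a cycle is never freed. A minimal sketch of the usual remedy, assuming the back-pointer from person to class can be a std::rc::Weak; the free functions add_pupil, live_classes and demo are made up for illustration.

```rust
use std::cell::RefCell;
use std::rc::{Rc, Weak};

struct Person {
    name: String,
    // Weak back-pointer: it does not keep the class alive.
    classes: Vec<Weak<RefCell<Class>>>,
}

struct Class {
    pupils: Vec<Rc<RefCell<Person>>>,
}

// Hypothetical helper mirroring Class::add_pupil from the post.
fn add_pupil(class: &Rc<RefCell<Class>>, pupil: Rc<RefCell<Person>>) {
    pupil.borrow_mut().classes.push(Rc::downgrade(class));
    class.borrow_mut().pupils.push(pupil);
}

// Counts the classes of `person` that have not been dropped yet.
fn live_classes(person: &Rc<RefCell<Person>>) -> usize {
    let p = person.borrow();
    p.classes.iter().filter(|c| c.upgrade().is_some()).count()
}

fn demo() -> (usize, usize, usize) {
    let jack = Rc::new(RefCell::new(Person {
        name: "Jack".into(),
        classes: Vec::new(),
    }));
    let class = Rc::new(RefCell::new(Class { pupils: Vec::new() }));
    add_pupil(&class, jack.clone());

    // Jack holds only a Weak pointer, so the class has one strong owner.
    let strong = Rc::strong_count(&class);
    let before = live_classes(&jack);

    // Dropping the last strong owner frees the class despite the back-pointer.
    drop(class);
    let after = live_classes(&jack);

    (strong, before, after)
}

fn main() {
    let (strong, before, after) = demo();
    println!("strong = {}, before = {}, after = {}", strong, before, after);

    let jill = Person { name: "Jill".into(), classes: Vec::new() };
    println!("{} is enrolled in {} classes", jill.name, jill.classes.len());
}
```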

Note that now we no longer have the safety guarantees of the borrow checker. Edit: As /u/steveklabnik1 pointed out, a better way to phrase this is:

Note that because Rc and RefCell both rely on run-time mechanisms to ensure safety, we’ve lost some amount of compile-time checking: RefCell will panic if we try to borrow_mut twice, for example.
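The run-time check is easy to observe in isolation. The sketch below uses RefCell::try_borrow_mut, which returns an Err where borrow_mut would panic; the function names are made up for illustration.

```rust
use std::cell::RefCell;
use std::rc::Rc;

// Returns true if the second mutable borrow was refused at run time.
fn double_borrow_is_caught() -> bool {
    let shared = Rc::new(RefCell::new(vec![1, 2, 3]));
    let alias = Rc::clone(&shared);

    let first = shared.borrow_mut(); // first mutable borrow is active from here

    // `alias.borrow_mut()` would panic at this point. `try_borrow_mut`
    // reports the same conflict as an Err instead.
    let conflict = alias.try_borrow_mut().is_err();

    drop(first); // release the first borrow
    conflict
}

// Once the first borrow is released, borrowing again is fine.
fn borrow_after_release() -> usize {
    let shared = Rc::new(RefCell::new(vec![1, 2, 3]));
    {
        let mut v = shared.borrow_mut();
        v.push(4);
    } // the RefMut guard is dropped here

    let len = shared.borrow().len();
    len
}

fn main() {
    println!("conflict detected: {}", double_borrow_is_caught());
    println!("length after release: {}", borrow_after_release());
}
```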

Additionally, here is a comment indicating why the code is still safe with runtime checks:

Perhaps a better option is to refactor your code in a way that you no longer need circular references.

If you’ve ever normalised a relational database, this is actually quite similar. We store the references between persons and classes in a separate struct.

struct Enrollment<'a> {
    person: &'a Person,
    class: &'a Class<'a>,
}

impl<'a> Enrollment<'a> {
    fn new(person: &'a Person, class: &'a Class<'a>) -> Enrollment<'a> {
        Enrollment {
            person: person,
            class: class,
        }
    }
}

struct Person {
    name: String,
}

impl Person {
    fn new(name: &str) -> Person {
        Person {
            name: name.into(),
        }
    }
}

struct Class<'a> {
    teacher: &'a Person,
}

impl<'a> Class<'a> {
    fn new(teacher: &'a Person) -> Class<'a> {
        Class {
            teacher: teacher,
        }
    }
}

struct School<'a> {
    enrollments: Vec<Enrollment<'a>>,
}

impl<'a> School<'a> {
    fn new() -> School<'a> {
        School {
            enrollments: Vec::new(),
        }
    }
    
    fn enroll(&mut self, pupil: &'a Person, class: &'a Class) {
        self.enrollments.push(Enrollment::new(pupil, class));
    }
}

fn main() {
    let jack = Person::new("Jack");
    let jill = Person::new("Jill");
    let teacher = Person::new("John");
    
    let borrow_chk_class = Class::new(&teacher);
    
    let mut school = School::new();
    school.enroll(&jack, &borrow_chk_class);
    school.enroll(&jill, &borrow_chk_class);
}

This is a better design in every way. There is no reason a person should know which class they are in, nor should a class know which people are enrolled in it. Should they need this information, they can obtain it from the list of enrollments.
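Looking that information up could be sketched as follows. The methods pupils_of and classes_of are hypothetical additions to School, and comparing by address with std::ptr::eq is an assumption made here because Person and Class carry no identifier of their own.

```rust
struct Person {
    name: String,
}

struct Class<'a> {
    teacher: &'a Person,
}

struct Enrollment<'a> {
    person: &'a Person,
    class: &'a Class<'a>,
}

struct School<'a> {
    enrollments: Vec<Enrollment<'a>>,
}

impl<'a> School<'a> {
    // Hypothetical query: names of everyone enrolled in `class`.
    fn pupils_of(&self, class: &Class) -> Vec<&'a str> {
        self.enrollments
            .iter()
            .filter(|e| std::ptr::eq(e.class, class))
            .map(|e| e.person.name.as_str())
            .collect()
    }

    // Hypothetical query: number of classes `person` is enrolled in.
    fn classes_of(&self, person: &Person) -> usize {
        self.enrollments
            .iter()
            .filter(|e| std::ptr::eq(e.person, person))
            .count()
    }
}

fn demo() -> (Vec<String>, usize) {
    let jack = Person { name: "Jack".into() };
    let jill = Person { name: "Jill".into() };
    let teacher = Person { name: "John".into() };
    let class = Class { teacher: &teacher };

    let mut school = School { enrollments: Vec::new() };
    school.enrollments.push(Enrollment { person: &jack, class: &class });
    school.enrollments.push(Enrollment { person: &jill, class: &class });

    let names = school
        .pupils_of(&class)
        .into_iter()
        .map(String::from)
        .collect();
    (names, school.classes_of(&jack))
}

fn main() {
    let (names, count) = demo();
    println!("pupils: {:?}, Jack's classes: {}", names, count);

    let teacher = Person { name: "John".into() };
    let class = Class { teacher: &teacher };
    println!("taught by {}", class.teacher.name);
}
```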

Closing notes

In case you don’t really understand why the rules of the borrow checker are the way they are, this explanation by redditor /u/Fylwind might help:

Finally, while the borrow checker is not infallible, and while you might fight it at first, once you learn to work with it, you will also learn to love it.
