Explained: Ownership in Rust

Urvashi
11 min read · Jul 22, 2024

If you’re someone who works mostly with either Java, Python or JavaScript, then you’re already familiar with garbage collected languages.

However, if you’ve been working with either C, C++ or Assembly (kudos to you btw!), then you must be familiar with manual memory management.

Memory management is the process of allocating memory and deallocating memory. In other words, it’s the process of finding unused memory and later returning that memory when it is no longer used.
- The Rust Book

But what is the difference between these two types of languages — garbage collected and non-garbage collected (manual memory management)?

Garbage Collection

In garbage collected languages, there’s a garbage collector program that identifies and reclaims memory that is no longer used.

It is a form of automatic memory management which helps prevent memory leaks and optimise the use of available memory.

The garbage collector automatically frees up memory allocated to objects that are no longer reachable or needed, allowing you, the programmer, to focus on other aspects of coding without worrying about manually managing memory deallocation.

However, garbage collection introduces performance overheads due to the additional work required to automatically manage memory.

Manual Memory Management

Non-garbage collected languages on the other hand are those where memory management is typically handled manually by the programmer.

This means that programmers must explicitly allocate memory, and deallocate it to prevent memory leaks and other issues.

Manual memory management gives developers direct control over when and how memory is allocated and deallocated, and it has no garbage collection overhead. However, it also comes with several potential issues, such as memory leaks (when allocated memory is never deallocated), dangling pointers (when a pointer still references a memory location after that memory has been freed), and double frees (when an attempt is made to free the same memory location more than once).

So, where does Rust fit in?

Rust does not use a garbage collector nor does it rely on manual memory management.

Instead, Rust employs a unique memory management system based on ownership with the help of the borrow checker.

It guarantees memory-safety, preventing common bugs associated with manual memory management and has no runtime overhead associated with garbage collection.

But before we talk about ownership in detail, first let’s understand a few concepts.

Stack vs Heap

If you’re coming from a traditional computer science background, then you might remember learning about the stack and the heap memory.

Stack memory holds values whose size is known at compile time and whose lifetime is tied to a scope, such as a function call.

Heap memory is for dynamic memory allocation, suitable for data whose size is only known at run time or that needs to persist beyond the scope of a single function call.

Let’s talk about memory within the context of Rust.

Rust’s Memory Model

Variables Live in the Stack

When you call a function in Rust, it allocates a stack frame for it. You can think of a stack frame as a mapping from variables to their values within a single scope, such as a function.

For example:
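The listing that the following steps walk through is not reproduced in the text. A minimal sketch consistent with those steps might look like this; the assert_eq! line is an addition (not part of the walkthrough) so that both variables are actually read:

```rust
fn main() {
    let x = 1; // x gets a slot in main's stack frame
    let y = 1; // y gets a second slot in the same frame

    // Added for illustration only: read both variables.
    assert_eq!(x + y, 2);
} // main returns; its frame, including x and y, is deallocated
```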

In the given code, when main is called, Rust allocates a stack frame for it and after the function returns, Rust automatically deallocates the function’s frame.

L1 — Function Entry (fn main() {):

  • The main function is called.
  • The stack frame for main is created, but it's currently empty.

L2 — First Variable Declaration (let x = 1;):

  • Variable x is declared and initialized to 1.
  • The stack frame now contains x with a value of 1.

L3 — Second Variable Declaration (let y = 1;):

  • Variable y is declared and initialized to 1.
  • The stack frame now contains both x and y, each with a value of 1.

L4 — Function Exit (}):

  • The main function is about to exit.
  • The stack frame for main is emptied as the function scope ends, and local variables x and y go out of scope.

Consider another example:
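Again, the listing itself is not shown in the text. A sketch reconstructed from the steps that follow, assuming a second function named test as in the walkthrough; the two println! calls are additions so that the variables are used:

```rust
fn main() {
    let x = 1; // stored in main's frame
    test(); // a new frame for test is pushed on top of main's frame
    // Back in main: test's frame is gone, main's frame still holds x.
    println!("x is still {x}");
}

fn test() {
    let y = 1; // stored in test's frame
    println!("y = {y}");
} // test's frame is popped here, before main's
```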

L1 — Function Entry (fn main() {):

  • The main function is called.
  • The stack frame for main is created, but it's currently empty.

L2 — First Variable Declaration in main (let x = 1;):

  • Variable x is declared and initialized to 1.
  • The stack frame now contains x with a value of 1.

L3 — Function Call (test();):

  • The test function is called.
  • The stack frame for main now shows a pending call to test.

L4 — Entry into test Function (fn test() {):

  • The test function's stack frame is created but currently empty.
  • The stack now includes frames for both main and test.

L5 — Variable Declaration in test (let y = 1;):

  • Variable y is declared and initialized to 1 within the test function.
  • The stack frame for test now includes y with a value of 1.

L6 — Exit from test Function (}):

  • The test function finishes execution and its stack frame is removed.
  • Control returns to the main function with its stack frame intact.

L7 — Resuming main after test returns:

  • The test function has returned, and its stack frame has been removed.
  • The stack frame for main still contains x.

L8 — Exit from main Function (}):

  • The main function finishes execution.
  • The stack frame for main is cleared as the program ends.

Notice in the above diagram that these frames are neatly organised into a stack of currently-called-functions where the most recent frame added is always the next frame freed (LIFO).

Copying Data

When an expression reads a variable (for example, in assignments and function calls), the variable’s value is copied from its slot in the stack frame.

However, copying data can take up a lot of memory.

Imagine if x was an array containing a million elements! Copying x into y would cause the main frame to contain 2 million elements.

This is not a very efficient use of the available memory. And this problem would only multiply if our program handled multiple such large datasets.
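To make the cost concrete, here is a small sketch of what copying an array does. The function name bytes_after_copy and the array size of 1024 bytes are assumptions chosen for illustration (a million-element array works the same way, but is large enough to strain the default stack size on some systems):

```rust
use std::mem::size_of_val;

// Hypothetical helper: returns the total number of bytes occupied by an
// array and its copy.
fn bytes_after_copy() -> usize {
    let x = [0u8; 1024];
    let mut y = x; // the whole array is copied, not just a pointer to it

    y[0] = 7; // changing the copy...
    assert_eq!(x[0], 0); // ...leaves the original untouched

    size_of_val(&x) + size_of_val(&y)
}

fn main() {
    println!("{} bytes in total", bytes_after_copy());
}
```

Both x and y remain usable after the assignment because each holds its own full copy of the data, which is exactly why the memory cost doubles.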

This is where pointers step in.

Pointing to data in the heap

One way to mitigate this problem is by allocating this data in the heap and pointing to it.

A pointer is a value that describes a location in memory.

Rust data structures like Vec, String, and HashMap store their contents on the heap.

For example:
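The listing for this example is not shown in the text. A sketch based on the steps that follow; the println! and assert_eq! lines are additions that inspect the String and are not part of the walkthrough:

```rust
fn main() {
    let name = String::from("John");

    // Added for illustration: the String value on the stack records where
    // the heap buffer is and how much of it is in use.
    println!("length = {}, capacity = {}", name.len(), name.capacity());
    println!("heap buffer starts at {:p}", name.as_ptr());
    assert_eq!(name.len(), 4);
} // name goes out of scope here and its heap buffer is freed
```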

L1 — Function Entry (fn main() {):

  • The main function is called.
  • The stack frame for main is created, but it is currently empty.

L2 — String Allocation (let name = String::from("John");):

  • A String is created from the literal "John".
  • The variable name is declared and stored on the stack.
  • The actual string data "John" is stored on the heap.
  • The stack frame for main now holds name, which contains a pointer to the heap-allocated string data (along with its length and capacity).

L3 — Function Exit (}):

  • The main function is about to exit.
  • The stack frame for main is cleared, and the name variable goes out of scope.

Note that the variable itself still lives on the stack, but the string data it owns is stored on the heap.

Rust also provides the Box construct for putting data on the heap.

Now consider the previous example, where we had an array of a million elements. This time, instead of storing the elements on the stack, we store them on the heap using Box::new():
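The listing for this example is not shown in the text. A sketch based on the steps that follow, with two assumptions made here: the element count is reduced to 1,000 (an unoptimized build may construct the array on the stack before moving it into the box), and the address comparison is added to show that the heap data itself does not move:

```rust
fn main() {
    // The article uses [0; 1_000_000]; 1_000 keeps this sketch light.
    let x = Box::new([0; 1_000]);
    let heap_addr = x.as_ptr(); // address of the array on the heap

    let y = x; // ownership moves to y: only the pointer is copied

    // Added check: y points at the very same heap allocation.
    assert_eq!(y.as_ptr(), heap_addr);
    assert_eq!(y.len(), 1_000);
}
```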

L2 — Heap Allocation (let x = Box::new([0; 1_000_000]);):

  • A Box is created, which allocates an array of 1,000,000 zeros on the heap.
  • The variable x is declared and stored on the stack which points to the heap-allocated array.

L3 — Move of Box (let y = x;):

  • Unlike previously, y does not create another copy of the (million) elements in the heap and point to it.
  • Instead, ownership of the heap data moves from x to y. (Notice how x is now greyed out.)

This is what is meant by a move:

In Rust, all heap data must be owned by exactly one variable. When you write let y = x;, Rust copies the pointer from x into y, but the pointed-to data is not copied.

The new owner is now y, so x becomes invalid and can no longer be used to access the heap data:

fn main() {
    let x = Box::new([0; 1_000_000]);
    let y = x;
    println!("{:?}", x); // ERROR: x was moved into y
}

error[E0382]: borrow of moved value: `x`
 --> src/main.rs:4:22
  |
2 |     let x = Box::new([0; 1_000_000]);
  |         - move occurs because `x` has type `Box<[i32; 1000000]>`, which does not implement the `Copy` trait
3 |     let y = x;
  |             - value moved here
4 |     println!("{:?}", x);
  |                      ^ value borrowed here after move

Heap data can only be accessed through its current owner y, not the previous owner:

fn main() {
    let x = Box::new([0; 1_000_000]);
    let y = x;
    println!("{:?}", y); // CORRECT: y is the new owner
}

Deallocation of Memory

Stack frames are automatically managed by Rust. When a function is called, Rust allocates a stack frame for the called function. When the call ends, Rust deallocates the stack frame.

But what about the heap data?

Well, Rust automatically frees a box’s heap memory when the variable that owns the box goes out of scope, that is, when the stack frame holding that variable is deallocated.

So going back to the previous example:

L4 — Function Exit (}):

  • The main function is about to exit.
  • The stack frame for main is cleared, and the y variable goes out of scope.
  • The heap-allocated array is deallocated when the Box owner goes out of scope.
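One way to observe this rule is to put a value with a Drop implementation inside the box and count how often it is dropped. The following is only a sketch: the Tracked type and the count_drops function are hypothetical names introduced here, not part of the article's example.

```rust
use std::cell::Cell;
use std::rc::Rc;

// Hypothetical helper type: counts how many times a value is dropped.
struct Tracked {
    drops: Rc<Cell<u32>>,
}

impl Drop for Tracked {
    fn drop(&mut self) {
        self.drops.set(self.drops.get() + 1);
    }
}

// Returns (drops seen while an owner was alive, drops seen after its scope ended).
fn count_drops() -> (u32, u32) {
    let drops = Rc::new(Cell::new(0));
    let during;
    {
        let x = Box::new(Tracked { drops: Rc::clone(&drops) });
        let y = x; // ownership moves from x to y; nothing is freed
        during = y.drops.get();
    } // y goes out of scope: the box's heap memory is freed exactly once
    (during, drops.get())
}

fn main() {
    let (during, after) = count_drops();
    println!("drops while owned: {during}, after scope: {after}");
}
```

The move from x to y does not trigger a deallocation, and the single deallocation happens when the final owner leaves scope.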

Consider another example where ownership moves around different functions:
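The listing is not shown in the text. The sketch below is assembled from the code fragments quoted in the steps that follow (the add_suffix signature, the push_str call, and the println! line all appear there):

```rust
fn main() {
    let first = String::from("Ferris");
    let full = add_suffix(first); // first is moved into add_suffix
    println!("{full}");
}

fn add_suffix(mut name: String) -> String {
    name.push_str(" Jr."); // name owns the heap data while this runs
    name // ownership moves back to the caller
}
```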

L1 — Function Entry (fn main() {):

  • The main function is called.
  • The stack frame for main is created.

L2 — String Creation (let first = String::from("Ferris");):

  • A String named first is created with the value "Ferris".
  • The variable first is allocated on the stack, and the actual string data is stored on the heap.
  • first stores a pointer to the heap data.

L3 — Function Call (let full = add_suffix(first);):

  • The add_suffix function is called with first as an argument.
  • Ownership of the first string is moved to the add_suffix function and first becomes invalid.

L4 — Function Entry (fn add_suffix(mut name: String) -> String {):

  • The add_suffix function is called with name initialized to "Ferris".
  • The stack frame for add_suffix is created, and the name variable is allocated on the stack.
  • name is now the new owner of the heap data.

L5 — Modify String (name.push_str(" Jr.");):

  • The push_str method is called on name, appending " Jr." to it. Assuming the original buffer has no spare capacity, this does three things. First, it creates a new, larger allocation. Second, it writes "Ferris Jr." into the new allocation. Third, it frees the original heap memory. The invalidated variable first in main now points to deallocated memory (denoted by ⦻).
  • The name variable now contains the value "Ferris Jr.".

L6 — Return Modified String (name):

  • The modified name string is returned from the add_suffix function.
  • Ownership of the string is moved back to the caller (main function).
  • The stack frame for add_suffix is cleared.

L7 — Continue in main Function:

  • After returning from add_suffix, control is back in the main function.
  • The variable full now owns the string "Ferris Jr.".

L8 — Print the Result (println!("{full}");):

  • The println! macro is called to print the value of full.
  • The value "Ferris Jr." is printed.

L9 — Function Exit (}):

  • The main function exits.
  • The stack frame for main is cleared, and the full variable goes out of scope.
  • The heap-allocated memory for the String is also deallocated since it was owned by full.

Notice how in the above example the heap data is not tied to just one stack frame, instead it attaches itself to different stack frames based on which variable owns it.

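Now you might be wondering: how does this concept of ownership guarantee memory safety in Rust?

In short, ownership rules out whole classes of undefined behaviour, such as reading freed memory or freeing the same memory twice, and that is what makes a program memory-safe.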

Undefined behaviour can lead to various issues such as program crashes, security vulnerabilities, or data corruption.

But what do we mean by undefined behaviour?

Undefined Behaviour

Undefined behaviour in Rust refers to operations whose result the language does not define, so neither the compiler nor the running program guarantees that they behave predictably or correctly.

Consider this slightly different version of the previous code:

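Here, name.push_str(" Doe") typically needs more room than the original buffer provides, so the string data is reallocated: a larger buffer is allocated, the updated string is written there, and the old buffer is freed. This results in first pointing to freed memory.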

Without ownership, you would have been able to access first thus resulting in undefined behaviour.

However, Rust prevents this undefined behaviour during compilation itself:

fn main() {
    let first = String::from("John");
    let mut name = first;
    name.push_str(" Doe");
    println!("{first}");
}

Compiling chapter-3 v0.1.0 (/Users/urvashi/work/rust_projects/chapter-3)
error[E0382]: borrow of moved value: `first`
 --> src/main.rs:5:15
  |
2 |     let first = String::from("John");
  |         ----- move occurs because `first` has type `String`, which does not implement the `Copy` trait
3 |     let mut name = first;
  |                    ----- value moved here
4 |     name.push_str(" Doe");
5 |     println!("{first}");
  |               ^^^^^^^ value borrowed here after move

Similarly, imagine if Rust allowed you to manually deallocate memory using a function like free:

let b = Box::new([0; 100]);
free(b);
assert!(b[0] == 0); // UNDEFINED BEHAVIOUR

Here, again it would result in undefined behaviour as you’re trying to read the pointer b after freeing its memory. That would attempt to access invalid memory, which could cause the program to crash. Or worse, it could not crash and return arbitrary data. Therefore this program is unsafe.

Instead, Rust automatically frees a box’s heap memory using the ownership model to prevent such undefined behaviour.
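As a rough illustration, the standard library's drop function is the closest safe counterpart to the hypothetical free above. It takes ownership of its argument, so the compiler rejects any later use of b instead of letting it read freed memory. In this sketch the offending line is left commented out so that the program compiles:

```rust
fn main() {
    let b = Box::new([0; 100]);
    assert!(b[0] == 0); // fine: b still owns the array

    drop(b); // b is moved into drop, and the heap memory is freed here

    // Uncommenting the next line is rejected at compile time,
    // because b has already been moved:
    // assert!(b[0] == 0);
}
```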

A foundational goal of Rust is to ensure that your programs never have undefined behavior. That is the meaning of “safety.”

A secondary goal of Rust is to prevent undefined behavior at compile-time instead of run-time.

Summary

Rust employs a unique memory management system based on ownership, borrowing, and lifetimes to ensure memory safety without the need for a garbage collector or manual memory management.

  • Stack vs. Heap: Rust differentiates between stack and heap memory. Stack memory is used for local variables with a known, short lifetime, while heap memory is used for data that needs to persist beyond the current scope.
  • Copy and Move Semantics: Rust minimises unnecessary data copying by allowing ownership to be moved rather than copied. This is particularly useful for large data structures, reducing memory usage and improving performance.
  • Ownership: Each piece of data in Rust has a single owner, and the data is automatically deallocated when the owner goes out of scope. This rules out dangling pointers and double frees in safe code, and it makes accidental memory leaks much less likely.
  • Undefined Behaviour Prevention: By ensuring that memory is always accessed safely and correctly, Rust prevents undefined behaviour, which can lead to program crashes, security vulnerabilities, and data corruption.

However, this was just one piece of the puzzle. Rust uses a combination of concepts like ownership, borrowing, and slices to ensure memory safety. You can learn more about Rust here.
