Safety off: Programming in Rust with `unsafe`

Thursday, July 11, 2024, 06:28 PM, from InfoWorld

Reasons abound for Rust’s growing popularity: it’s fast, memory-safe without needing garbage collection, and outfitted with world-class tooling. Rust also allows experienced programmers to selectively toggle off some—although not all—of its safeties for the sake of speed or direct low-level memory manipulation.
“Unsafe” Rust is the general term for any Rust code that is contained in a block delineated by the unsafe keyword. Inside an unsafe block, you can bend (but not break) some (but not all) of Rust’s safety rules.
If you’ve come from C or another low-level systems language, you may be tempted to reach for unsafe whenever you want to use some familiar pattern for manipulating things in a low-level way. In some cases, you may be right: there are a few things you can’t do in Rust except through unsafe code. But in many cases you don’t actually need unsafe. Rust already has you covered if you know where to look.
In this article, we’ll explore what unsafe Rust is actually for, what it can and can’t do, and how to use it sensibly.
What you can do with ‘unsafe’ Rust
The unsafe keyword in Rust lets you delineate a block of code, or function, to enable a specific subset of features in the language. Let’s look at the features you can access inside Rust’s unsafe blocks.
Raw pointers
Rust’s raw pointers (*const T and *mut T) can refer to immutable or mutable values and are closer to C’s idea of a pointer than Rust’s references. Creating a raw pointer is safe; dereferencing one requires unsafe. With them, you can ignore some of the rules for how borrowing works:

Raw pointers can be null values.
Multiple raw mutable pointers can point to the same space in memory.
You can also use both immutable and mutable pointers to refer to the same memory.
You don’t need to guarantee that a raw pointer points to a valid region of memory.

Raw pointers are useful if you need to do things like access hardware directly (e.g., for a device driver), or talk to an application written in another language by way of a raw region of memory.
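As a minimal sketch of the rules listed above, the following creates a mutable and an immutable raw pointer to the same value and only enters unsafe at the point of dereferencing. The function name bump_through_alias is made up for illustration.

```rust
/// Writes through one raw pointer and reads through another that aliases it.
/// (Hypothetical helper, not part of any library.)
fn bump_through_alias(start: i32) -> i32 {
    let mut value = start;

    // Creating raw pointers is safe, even when a mutable and an immutable
    // pointer refer to the same memory.
    let writer: *mut i32 = &mut value;
    let reader: *const i32 = writer;

    // Dereferencing is the part that needs `unsafe`.
    // SAFETY: both pointers come from a live, properly aligned local variable.
    unsafe {
        *writer += 1;
        *reader
    }
}

fn main() {
    // A raw pointer may be null; checking it is safe, dereferencing it is not.
    let nothing: *const i32 = std::ptr::null();
    println!("null pointer? {}", nothing.is_null());
    println!("value after bump: {}", bump_through_alias(41));
}
```

Note that the unsafe block covers only the two dereferences; everything else is ordinary safe Rust.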
External function calls
Another common use of unsafe is to make calls through a foreign function interface, or FFI. There’s no guarantee that what we get back from such a call will follow Rust’s rules, and there’s also a chance we will need to supply things that don’t conform to those rules (such as a raw pointer).
Consider this example (from Rust’s documentation):

extern "C" {
    fn abs(input: i32) -> i32;
}

fn main() {
    unsafe {
        println!("Absolute value of -3 according to C: {}", abs(-3));
    }
}

Any call to a function declared in the extern "C" block must be wrapped in unsafe, so that you take explicit responsibility for what you send to it and get back from it.
Altering mutable static variables
Global, or static, variables in Rust can be declared mutable with static mut, since they occupy a fixed memory address. However, reading or modifying a mutable static variable is only possible inside an unsafe block.
Data races are the biggest reason you need unsafe to touch mutable static variables. You’d get unpredictable results if the same mutable static variable were modified from different threads without synchronization. So, while you can use unsafe to make such changes, any data race issues are your responsibility, not Rust’s. Safe Rust rules out data races at compile time; inside unsafe blocks, that guarantee holds only if you uphold it yourself.
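Here is a small sketch of a mutable static next to a safe alternative. The names RAW_COUNTER, SAFE_COUNTER, bump_raw, and bump_safe are invented for this example, and the unsafe version is only sound under the stated single-thread assumption.

```rust
use std::sync::atomic::{AtomicU32, Ordering};

// A mutable static: every read and write needs `unsafe`.
static mut RAW_COUNTER: u32 = 0;

// An atomic static does the same job with no `unsafe` at all.
static SAFE_COUNTER: AtomicU32 = AtomicU32::new(0);

/// Increments the mutable static and returns the new value.
fn bump_raw() -> u32 {
    // SAFETY: this sketch assumes `bump_raw` is only ever called from one
    // thread. Calling it from several threads would be a data race.
    unsafe {
        RAW_COUNTER += 1;
        RAW_COUNTER
    }
}

/// Increments the atomic counter and returns the new value.
fn bump_safe() -> u32 {
    // `fetch_add` returns the previous value, so add 1 for the new one.
    SAFE_COUNTER.fetch_add(1, Ordering::SeqCst) + 1
}

fn main() {
    println!("raw counter:  {}", bump_raw());
    println!("safe counter: {}", bump_safe());
}
```

When an atomic or a Mutex fits, it is usually the better choice, because the compiler keeps checking thread safety for you.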
Creating unsafe methods and traits
Functions and methods can be declared unsafe with unsafe fn. Any call to such a function must itself be made inside an unsafe block.
For instance, if you had a function that required a raw pointer as an argument, you’d want to ensure the caller did their due diligence for how the call was performed in the first place. The safety requirements don’t stop at the function-call boundary; the caller has to uphold them too.
You can also declare traits unsafe, along with their implementations, using a similar syntax: unsafe trait for the trait, and unsafe impl for the implementation.
Unlike an unsafe method, though, the methods of an unsafe trait do not have to be called inside an unsafe block. The burden of safety is on the one writing the implementation, not the one calling it.
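The difference between the two can be sketched as follows. Both read_raw and the ZeroIsValid trait are hypothetical names made up for this illustration; they are not part of the standard library.

```rust
/// Reads the value behind `ptr`.
///
/// # Safety
/// `ptr` must be non-null, aligned, and point to an initialized `i32`.
unsafe fn read_raw(ptr: *const i32) -> i32 {
    // SAFETY: the caller promised the conditions listed above.
    unsafe { *ptr }
}

/// Hypothetical marker trait: implementors promise that the all-zero bit
/// pattern is a valid value of the type.
unsafe trait ZeroIsValid {}

// SAFETY: zero is a valid `u32`. The implementor takes on the burden here.
unsafe impl ZeroIsValid for u32 {}

/// A safe function: callers need no `unsafe`, because the trait bound
/// carries the promise made by the `unsafe impl`.
fn zeroed_value<T: ZeroIsValid>() -> T {
    // SAFETY: guaranteed by the `ZeroIsValid` contract.
    unsafe { std::mem::zeroed() }
}

fn main() {
    let n = 7;
    // Calling an unsafe fn requires an unsafe block at the call site.
    let copy = unsafe { read_raw(&n) };
    // Using the unsafe trait requires none.
    let zero: u32 = zeroed_value();
    println!("{copy} {zero}");
}
```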
Unions
Unions in Rust are essentially the same as unions in C: a struct that has multiple possible type definitions for its contents. This kind of loosey-goosey behavior is acceptable in C, but Rust’s sterner promises of correctness and safety don’t allow it by default.
However, sometimes you need to create a Rust structure that maps to a C union, such as when you call into a C library to work with a union. To do this, you need to use unsafe to access one particular field definition at a time.
Here’s an example from the Comprehensive Rust guide:

#[repr(C)]
union MyUnion {
    i: u8,
    b: bool,
}

fn main() {
    let u = MyUnion { i: 42 };
    println!("int: {}", unsafe { u.i });
    println!("bool: {}", unsafe { u.b }); // Undefined behavior!
}

For each read from a union field, you have to use unsafe. The borrow checker also treats a borrow of one field as a borrow of all of them, since the fields share the same memory.
Note that writing to a union does not have the same restrictions: in Rust’s eyes you’re not writing to anything that needs tracking. That’s why you don’t need unsafe when defining the contents of the union with the let statement.
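For contrast with the undefined read above, here is a sketch of a union read that is well defined, because every bit pattern is valid for the field being read. The IntOrFloat type and float_bits function are invented for this example.

```rust
#[repr(C)]
union IntOrFloat {
    i: u32,
    f: f32,
}

/// Returns the raw bits of an `f32` by writing one field and reading the other.
fn float_bits(x: f32) -> u32 {
    // Writing a field while constructing the union is safe.
    let u = IntOrFloat { f: x };
    // SAFETY: every 32-bit pattern is a valid `u32`, so reading `i` after
    // writing `f` is defined. Reading a `bool` this way would not be.
    unsafe { u.i }
}

fn main() {
    let mut u = IntOrFloat { i: 0 };
    u.f = 1.0; // assigning to a field also needs no `unsafe`
    println!("bits via union:  {:#x}", unsafe { u.i });
    println!("bits via helper: {:#x}", float_bits(1.0));
}
```

In real code, f32::to_bits does this job without any unsafe; the union version is only here to show where the unsafe boundary falls.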
What you cannot do with Rust ‘unsafe’
Outside of the five big points listed above, unsafe doesn’t give you any other special powers.
The single biggest thing you can’t do is use unsafe to circumvent the borrow checker. Borrows are still enforced on values in unsafe, same as anyplace else. One of Rust’s truly immutable principles is how borrowing and references work, and unsafe does not alter those rules.
For a good example of this, look at the way borrowing is still enforced in unions, as described in the previous section. See Steve Klabnik’s blog post for a more detailed discussion on this topic. It’s better to think of unsafe as a superset of Rust—something that adds a few new features without taking away existing ones.
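A short sketch makes the point concrete. The function first_then_push is hypothetical; the commented-out lines show an ordering the compiler rejects whether or not the push is wrapped in unsafe.

```rust
/// Reads the first element, then appends a copy of it to the same vector.
/// Panics on an empty vector, which is acceptable for this sketch.
fn first_then_push(mut data: Vec<i32>) -> (i32, usize) {
    let first = &data[0]; // shared borrow of `data` starts here
    let seen = *first; // and is last used here

    // With the shared borrow finished, a mutable borrow is accepted.
    data.push(seen);

    // The rejected ordering, for comparison. Wrapping the push in `unsafe`
    // makes no difference; the compiler still reports error E0502:
    //
    //     let first = &data[0];
    //     unsafe { data.push(0); }
    //     println!("{first}");

    (seen, data.len())
}

fn main() {
    println!("{:?}", first_then_push(vec![7, 8, 9]));
}
```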
Best practices with unsafe code
unsafe is like any other language feature: it has its uses and limitations, so use it judiciously and with care. Here are some pointers.
Wrap as little code as possible in ‘unsafe’
The smaller an unsafe block is, the better. In many cases, you don’t need to make unsafe blocks more than a couple of lines. It’s worth thinking about how much of the code actually needs to be unsafe, and how to enforce boundaries and interfaces around that code. It’s best to have safe interfaces to code that’s not safe.
Sometimes the interfaces themselves have to be unsafe. I mentioned earlier that you can declare an entire function as unsafe with unsafe fn. If a function cannot verify its own requirements, for example because it dereferences a raw pointer passed in as an argument, declare the function unsafe so that every call to it must also be made inside an unsafe block.
On the whole, though, it’s best to start with individual unsafe blocks, and promote them to functions only if needed.
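One way to picture a safe interface over unsafe code is the sketch below, modeled on the standard library's split_at_mut. The name split_in_two is made up here; the bounds check before the unsafe block is what makes the function safe to expose.

```rust
use std::slice;

/// Splits a mutable slice into two non-overlapping mutable halves.
/// Panics if `mid` is greater than the slice length.
fn split_in_two(values: &mut [i32], mid: usize) -> (&mut [i32], &mut [i32]) {
    let len = values.len();
    let ptr = values.as_mut_ptr();

    // Validate the input in safe code, before touching raw pointers.
    assert!(mid <= len, "mid out of bounds");

    // SAFETY: `ptr` is valid for `len` elements, `mid <= len`, and the two
    // ranges [0, mid) and [mid, len) do not overlap.
    unsafe {
        (
            slice::from_raw_parts_mut(ptr, mid),
            slice::from_raw_parts_mut(ptr.add(mid), len - mid),
        )
    }
}

fn main() {
    let mut numbers = [1, 2, 3, 4, 5];
    let (left, right) = split_in_two(&mut numbers, 2);
    left[0] = 10;
    right[0] = 30;
    println!("{:?}", numbers);
}
```

Callers never write unsafe themselves, and the unsafe block stays a few lines long.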
Be mindful of undefined behaviors
Undefined behavior is possible in Rust, and unsafe code is where it comes from. Inside unsafe, the compiler no longer guards you against things like data races or reads from uninitialized memory; upholding those guarantees becomes your job. Be extra careful of how you might be introducing undefined behavior in your unsafe blocks.
For examples of what to avoid, the Rust Reference has a long but not exhaustive list of undefined behavior.
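As one illustration of steering clear of a listed hazard (reading uninitialized memory), the sketch below uses std::mem::MaybeUninit and only calls assume_init after every slot has been written. The function make_squares is a made-up example.

```rust
use std::mem::MaybeUninit;

/// Builds the array [0, 1, 4, 9] without initializing it twice.
fn make_squares() -> [u32; 4] {
    // The slots start out uninitialized; reading them now would be
    // undefined behavior.
    let mut slots: [MaybeUninit<u32>; 4] = [MaybeUninit::uninit(); 4];

    // Writing to a `MaybeUninit` is safe.
    for (i, slot) in slots.iter_mut().enumerate() {
        slot.write((i * i) as u32);
    }

    // SAFETY: the loop above wrote every one of the four slots.
    slots.map(|slot| unsafe { slot.assume_init() })
}

fn main() {
    println!("{:?}", make_squares());
}
```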
Document why ‘unsafe’ is needed
It’s been said that we comment code not to say what is being done, but why. That idea absolutely applies to unsafe blocks.
Whenever possible, document exactly why unsafe is needed at any given point. This gives others some idea of the rationale behind the use of unsafe, and—potentially—future insight into how the code could be rewritten to not need unsafe.
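A common convention in Rust code is a SAFETY comment directly above each unsafe block. The sketch below shows one possible shape; the function last_or_zero and the profiling rationale in its comment are hypothetical.

```rust
/// Returns the last element of `values`, or 0 if the slice is empty.
fn last_or_zero(values: &[u32]) -> u32 {
    if values.is_empty() {
        return 0;
    }

    // Why unsafe: (hypothetical rationale) profiling showed the bounds check
    // on this hot path to be measurable. If that stops being true, replace
    // this with `values.last().copied().unwrap_or(0)`.
    //
    // SAFETY: the slice was checked to be non-empty above, so
    // `values.len() - 1` is a valid index.
    unsafe { *values.get_unchecked(values.len() - 1) }
}

fn main() {
    println!("{}", last_or_zero(&[3, 1, 4]));
}
```

The comment records both why the invariant holds and how to remove the unsafe block later, which is the information a future maintainer needs.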
Read the Rustonomicon
Rust’s documentation is second to none. That includes the Rustonomicon, an entire book-length document created to thoroughly document “all the awful details that you need to understand when writing Unsafe Rust programs.” Don’t open it until you are comfortable with Rust generally, as it delves into great detail about how “unsafe” and “regular” Rust interoperate.
If you’re coming in from the C side of things, another useful document is Learn Rust The Dangerous Way. This collection of articles talks about Rust in a way that people used to low-level C programming can wrap their heads around, including best practices for using Rust’s unsafe feature.