In my recent article, I showed how Vale's linear types can provide memory safety without a borrow checker, generations, reference counting, or tracing GC.
My friend Mike asked, "So linear types give zero-overhead memory safety, like borrow checking?"
"Perhaps," I said. "What do you mean by zero-overhead memory safety? Do you mean like zero-cost abstractions?" 0
"No, I mean no memory safety overhead, since it has no RC, GC, or generations." he said.
A really interesting conversation ensued, with a surprising conclusion: zero-overhead memory safety doesn't actually exist, and never has. In other words, any program in any language will have some run-time cost to maintain memory safety. 1
This will come as a surprise to a lot of people, who believe various approaches have no memory safety overhead.
Let's embark on a little expedition, to find a language with zero-overhead memory safety!
In Arrrlang, everything lived in a global array: every Ship in a global array of Ships, every Engine in a global array of Engines, and so on.
In Arrrlang, there were no pointers!
If a Ship wanted to point to its Engine, it would instead index into the global engines array.
struct Ship {
  engineIndex int; // index into the global engines array
}
Everything was indexing into everything else. It actually felt like using a relational database.
With these basic rules, Arrrlang was memory safe.
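To make this concrete, here's a minimal sketch of the Arrrlang style in Rust terms. The Ship and Engine types, their fields, and the in_use flag are my own illustration of the approach, not actual Arrrlang code.

struct Engine { fuel: u32, in_use: bool }
struct Ship { engine_index: usize }

// "Dereferencing" is just indexing into the (conceptually global) array.
fn fuel_of(ship: &Ship, engines: &[Engine]) -> u32 {
    // The index is bounds-checked, so this is memory safe. A stale index
    // might silently read a reused slot, but that's a logic bug, not UB.
    engines[ship.engine_index].fuel
}

fn main() {
    let engines = vec![Engine { fuel: 100, in_use: true }];
    let ship = Ship { engine_index: 0 };
    println!("{}", fuel_of(&ship, &engines));
}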
Memory safety being a "zero-cost abstraction" would mean that if you're not using a certain method of memory safety, you're not paying any costs for having the option.
Though, one can theoretically contrive a trivial program, one that just does a return 0;, which has no cost. We're talking about real-world programs here.
"Toy language" just means a small language, usually made as a fun hobby. I once also made a toy language for generating an in-memory time-traveling database for my game Shattered Forest.
Some people think its name is just Perry, but there are dubious notes predating that which suggest its full name was Zeddicus Zu'l Zorander the Second.
One detail: an index could outlive the element it referred to. Using an old index has some risks; accessing an unexpected live element or old data can cause logic bugs. I wouldn't call Arrrlang "safe", but it was at least memory safe.
Arrrlang had no reference counting, no tracing garbage collection, and no generations. So by Mike's definition, Arrrlang had zero-overhead memory safety.
But of course, it had plenty of costs: every "dereference" was an index into an array, with the bounds checking that implies, and those global arrays held onto memory whether or not anyone still needed it.
It might be tempting to think that these are inevitable costs, fundamental to any form of managing memory.
However, tracing garbage collection doesn't have these costs. It has other costs, but the point still stands.
So it seems this zero-overhead memory-safe language actually still has some run-time overhead in practice.
If we're looking for a language with zero-overhead memory safety, our expedition apparently isn't over.
Rust is in the top three favorites for both me and Mike, so it came up pretty quickly in our conversation.
The first thought is, of course, that safe Rust has bounds checking too, which is an obvious run-time cost (anywhere between 1% and 15%, usually on the lower side of that).
But that's unavoidable; all safe languages have bounds checking! Let's steelman the question and instead ask, "Does safe Rust have more bounds checking costs than other languages?"
Sometimes, it's less. Eliding bounds checks in loops really just moves the check into the i < len loop condition, but combined with loop unrolling, Rust can skip quite a few checks, and that's a big point in its favor.
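To illustrate where the checks live, here's a hedged sketch of my own, not anything from the conversation: an indexed loop pays its check in the loop condition, while an iterator-based loop avoids indexing entirely.

fn sum_indexed(v: &[u64]) -> u64 {
    let mut total = 0;
    for i in 0..v.len() {
        // The optimizer can usually prove i < v.len() here and drop the
        // per-element bounds check, folding it into the loop condition.
        total += v[i];
    }
    total
}

fn sum_iterated(v: &[u64]) -> u64 {
    // No indexing at all, so no bounds checks to elide.
    v.iter().sum()
}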
But there are cases where it has more bounds checking.
When the borrow checker denies two simultaneous &mut references to an object, we often turn one of them into an index into a central Vec or HashMap, which means extra bounds checking or hashing.
To see this in action, try implementing a doubly linked list, observers, delegates, intrusive data structures, subscriptions, back-references, graphs, or any other kind of interconnected data.
Or, when we want to read from an array of structs and write to a different field in that same array, we often "flatten" it into two arrays (with twice the bounds checking), similar to a column-oriented database or an ECS approach.
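Here's a minimal sketch of that flattening; the Ship fields are hypothetical, chosen just for illustration.

// Array of structs: a single Vec<Ship> ties every field to one borrow.
struct Ship { position: f32, velocity: f32 }

// The "flattened" struct-of-arrays version: the two fields can now be
// borrowed independently, at the price of an extra array to index into.
struct Ships {
    positions: Vec<f32>,
    velocities: Vec<f32>,
}

fn step(ships: &mut Ships, dt: f32) {
    for i in 0..ships.positions.len() {
        // Two indexed accesses, two bounds checks (unless the optimizer
        // can prove both vectors have the same length).
        ships.positions[i] += ships.velocities[i] * dt;
    }
}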
Another common pattern is reaching for .clone() to get around the borrow checker. This is cheap for primitives, but can be expensive for types like String and Vec that involve heap allocations. With practice this lessens, but it never goes away completely.
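A common instance, sketched here with hypothetical types of my own: passing a struct's own data into one of its &mut methods.

struct World {
    names: Vec<String>,
    log: Vec<String>,
}

impl World {
    fn log_arrival(&mut self, name: &str) {
        self.log.push(format!("{} arrived", name));
    }
}

fn main() {
    let mut world = World {
        names: vec!["Serenity".to_string()],
        log: vec![],
    };
    // Rejected: world.log_arrival(&world.names[0]) would borrow `world`
    // mutably while `world.names` is still borrowed immutably.
    // The common escape hatch is a heap-allocating clone:
    let name = world.names[0].clone();
    world.log_arrival(&name);
}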
These additional clones, hashes, and bounds checks are all consequences of having a borrow checker.
This isn't a surprise to anyone who knows Rust well. Rust never promised zero-overhead memory safety. Rather, it lets you explicitly choose which memory-safety overhead you incur. This is known as a "zero-cost abstraction", a phrase that I suspect my friend Mike may have misunderstood.
But alas, we still need to answer his original question. It seems that Rust is another "zero-overhead" language that actually has some overhead. Our expedition isn't over!
Some techniques such as dependent types can help a compiler avoid bounds checks, though they really just move the overhead elsewhere, for example by requiring a lot of if-statements to maintain a certain range for an integer that we later use for indexing into an array.
We can also skip bounds checks in Rust with get_unchecked and friends, which can be a great tool when you can prove some other way that the index is in bounds. For this article, I'll be mainly talking about safe Rust.
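For instance, here's a tiny sketch where the loop condition itself is the proof:

fn sum_every_other(v: &[u64]) -> u64 {
    let mut total = 0;
    let mut i = 0;
    while i < v.len() {
        // SAFETY: the loop condition just checked that i < v.len().
        total += unsafe { *v.get_unchecked(i) };
        i += 2;
    }
    total
}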
As Jonathan Corbet found, intrusive data structures are nigh impossible in Rust.
This is one reason ECS and the borrow checker get along so well.
Another good solution for the same array-of-structs problem is to generate an "effect struct" containing data describing some changes that should happen later when we have an exclusive &mut again. However, this effect struct can often require clones or heap allocations.
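Here's a hedged sketch of that pattern; the Ship and Effects types are hypothetical. While we only have shared access, we record changes; later, under &mut, we apply them. Note the owned Vec: that's where the clones and heap allocations sneak in.

struct Ship { hull: i32 }

// Describes changes without performing them.
struct Effects {
    damage: Vec<(usize, i32)>, // (ship index, amount)
}

fn plan(ships: &[Ship], effects: &mut Effects) {
    for (i, ship) in ships.iter().enumerate() {
        if ship.hull > 50 {
            effects.damage.push((i, 10)); // just record the change
        }
    }
}

fn apply(ships: &mut [Ship], effects: Effects) {
    for (i, amount) in effects.damage {
        ships[i].hull -= amount; // now we have &mut, plus a bounds check
    }
}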
In the words of LAC-Tech:
"I've improved but I still have a lot of "wtf" moments. End up doing a lot of deep copying and heap allocations just to shut the compiler up, which makes me question just how "zero cost" the safety is."
In my linear types post, I talk about how Vale's overhead comes in the form of generation checks, but we can reduce them to zero with linear types and regions.
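For a sense of what a generation check costs, here's the idea sketched in Rust terms; Vale implements this natively, and these types are just my illustration. Each slot remembers a generation, and each reference remembers the generation it expects.

struct Slot<T> { generation: u64, value: Option<T> }
struct GenRef { index: usize, generation: u64 }

// Dereferencing costs one bounds check plus one generation compare.
fn deref<T>(slots: &[Slot<T>], r: GenRef) -> &T {
    let slot = &slots[r.index];
    assert!(slot.generation == r.generation, "stale reference");
    slot.value.as_ref().expect("slot is empty")
}

Linear types and regions exist precisely to let the compiler skip that compare.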
After using this "linear style" for a while, I realized that it feels really similar to using a borrow checker. That makes sense, as they're both a kind of substructural type system.
And indeed, it had a lot of the same consequences as borrow checking. Instead of aliasing, we would use more indexes and IDs into central arrays and hash maps. I also ended up with more cloning, just to get around this linear style's restrictions.
So even though this linear style Vale meets Mike's definition of a zero-overhead memory-safe language, it also still has overhead, just like Rust.
It seems like there's a lower bound to memory safety overhead. Even the most zero-overhead approaches have some overhead. Why is that?
Someone once challenged me to make a zero-overhead observer in Vale or Rust. It turns out, it still had overhead.
For example, we tried having a central "subscriptions" collection that tracked which objects were interested in which events from which sources.
There were a few different flavors.
The fastest approach in Vale was to just do a regular observer. Just using some generational references was faster than the "zero-overhead" linear style.
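Roughly, the central-collection flavor looked like this sketch, in Rust terms with illustrative names:

struct Observer { events_seen: u32 }

struct Subscriptions {
    // (source id, observer id) pairs: a many-to-many relationship.
    entries: Vec<(usize, usize)>,
}

fn notify(subs: &Subscriptions, source: usize, observers: &mut [Observer]) {
    for &(src, obs) in &subs.entries {
        if src == source {
            // An index lookup and bounds check instead of a direct pointer.
            observers[obs].events_seen += 1;
        }
    }
}

Even here, the many-to-many relationship forces an indirection that a raw pointer wouldn't need.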
I think this is indicative of some underlying rule. If I had to guess, it's that non-temporary many-to-one and many-to-many relationships have inherent run-time memory-safety overhead.
These relationships are inherent to most programs, so most programs will have some memory-safety overhead.
A "cache miss" is when the CPU requests memory from a 64B chunk that's not already in the CPU cache, and it has to wait idle for hundreds of cycles waiting for the data to come in from RAM.
I once saw a consensus on the Rust Discord server to always avoid Rc'd observers and instead use Yew.
As much as I like Yew, it's worrying when people advise bringing in an entire front-end framework with hundreds of dependencies just to avoid Rc.
In this little journey, we've come across six kinds of overhead: reference counting, tracing garbage collection, generation checks, bounds checking, hashing, and cloning.
These costs are not all equal, though. The first two are much more expensive:
Reference counting and tracing GC both tend to incur cache misses. A cache miss is when the CPU requests memory from a 64B chunk (a "cache line") that's not already in the CPU cache, and it has to sit idle for hundreds of cycles while the data comes all the way from RAM.
The latter four don't cause as many cache misses, so they're generally much faster than reference counting and tracing garbage collection.
Though, they still have their costs.
To get good performance, we have to pick our poison and decide what overhead we're willing to suffer. There's no avoiding overhead completely.
Often, a particular situation will cause one method to be faster or slower than usual.
So even though some overhead is inevitable, we can still think about which method is best for a particular situation.
Easier said than done, but it is possible. Pony's ORCA system and Verona's regions are both separately-collected realms of memory. If you destroy them before the first collection, you get the best of all worlds. I suspect this will be a very nice technique for temporary calculations.
Since checks like these almost always pass, the compiler can instruct the CPU to tentatively proceed executing as if the check will pass.
This is why I like designs like the Cone language. It gives you access to whatever memory management method you might like.
There's no way to get to zero-overhead memory safety.
However, we can get close.
For language designers, it means that the search isn't over. Since there is still overhead, there is still room to improve or rearrange.
Perhaps it means that we should keep chasing the myth of zero-overhead memory safety, because we aren't there yet.
That's what I'm trying to do with Vale, and it's what led to discovering its linear-aliasing model, generational references, and regions.
A lot of other languages are exploring these areas too, like Austral, Val, and Verona.
Or perhaps it means that we can stop chasing the myth.
Back when I coded in Rust a lot more, I made the mistake of caring too much about avoiding heap allocations, Rc, RefCell, virtual calls, etc. It was idiomatic to avoid these particular sources of overhead. I took a sense of pride in going the extra mile and spending the extra time reducing my overhead down to "zero".
In retrospect, I never actually reached zero, because instead I used more cloning, more bounds checking, and more hashing. It got a bit faster, but I was chasing a definition of perfection that was arbitrary and unreachable. I wasted a lot of time past the point of diminishing returns, chasing that myth.
It wasn't a problem with Rust. It was a problem with me. Perhaps instead of taking a community's idioms as gospel, we should always ask when idioms should be followed, and trust our own profiling and experience.
Or perhaps it means that we can relax a bit about reducing our overhead to zero, since we can never get there. Perhaps we should instead spend our time making our code simple and flexible, optimize where necessary, and actually solve people's problems instead of playing arbitrary overhead golf. It's a principle I knew, but didn't follow as much as I should have.
Or perhaps it means, since memory safety often has an unavoidable cost, that it can be okay to use unsafe, C, C++, or Zig in situations where memory safety isn't as important as optimal performance or flexibility. Blasphemy, I know.
What does it all mean? I'm not sure! There are a lot of lessons we could take away from the unreachability of zero-overhead memory safety.
Even though I made a memory-safe language, I stand by this blasphemy. There are cases where memory safety is merely a priority, not the priority.
I talked about this at Handmade Seattle, and this article talks about cases like AAA games, sandboxed apps, and client-side programs that always talk to a trusted first-party server.
I worked on Google Earth (written in C++) and memory safety wasn't nearly as big of a problem there as absolutists would have us believe.
My conversation with Mike had a lot more, but these are topics worthy of their own entire posts! So I'll stop here.
That's all for now!
Next week I'll be writing about how we can use these techniques (and a few others) to design some memory safety for C++, so keep an eye on the RSS feed, twitter, discord server, or subreddit!
If you enjoyed this article, please consider donating via patreon or GitHub!
See you next time,
- Evan Ovadia
It's Zeddicus Zu'l Zorander the Second, and I will die on this hill.