While diving into the depths of single ownership, we discovered a hidden gem from the most unlikely of places. With it, we were able to reassemble C++ into something that really unleashes the full potential of RAII.
Recall these ancient mantras, known to any C++ developer:
In this journey, we discovered language solutions for all of these.
This article is about C++'s RAII and single ownership, and how we can take it even further.
C++'s syntax often makes single ownership look more difficult than it is, so we also use Vale to illustrate how easy and powerful single ownership can be.
Our journey started in 2011, when C++11's unique_ptr brought single ownership and move semantics to C++ programmers worldwide, and changed our lives forever.
In one fell swoop, it basically single-handedly solved memory leaks. Single-ownership is one of those notions that, once it clicked, felt right. It was probably because this is how we already think: in C, we would mentally track ownership, to know whose responsibility it was to free an object. Even in GC'd languages, we would implicitly track who's responsible for calling .dispose().
We slowly discovered that we could use RAII for things other than freeing memory! We could:
We realized: RAII wasn't just a way to track who should free an object, and these weren't just neat tricks. RAII is much more: it's a way to track responsibility.
It's a promise that's enforced by the compiler. Instead of just "The compiler will make sure we free this," RAII is "The compiler will make sure that we XYZ" where XYZ is whatever we want.
RAII stands for Resource Acquisition Is Initialization, which is a fancy way of saying "put that code in a destructor so you can be sure it actually happens."
Vale is still in early alpha, and rapidly approaching v0.1. Check out the Roadmap for progress and plans!
All the features mentioned here are available in Vale, but Resilient Mode, regions, RC elision, and weak references are still on the way.
Single ownership and move semantics existed even before C++, in Ada, Common Lisp (via the with-... macros), Mesa/Cedar at Xerox PARC, and Object Pascal (for object/class types). (pjmlp)
Search your Java code for removeObserver(this) and you'll find that most of them are in methods named "dispose", "destroy", "close", etc.
Now imagine if the language could make sure you couldn't forget to call that method! That's RAII.
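As a minimal sketch of that idea in today's C++, the wrapper below registers an observer in its constructor and unregisters it in its destructor, so the removeObserver call cannot be forgotten. The names Observer, Subject, Subscription and Counter are hypothetical, chosen only for this illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical types for illustration; none of them come from a real library.
struct Observer {
    virtual ~Observer() = default;
    virtual void onEvent(int value) = 0;
};

class Subject {
public:
    void addObserver(Observer* o) { observers_.push_back(o); }
    void removeObserver(Observer* o) {
        observers_.erase(std::remove(observers_.begin(), observers_.end(), o),
                         observers_.end());
    }
    void notify(int value) {
        for (Observer* o : observers_) o->onEvent(value);
    }
    std::size_t observerCount() const { return observers_.size(); }

private:
    std::vector<Observer*> observers_;
};

// RAII wrapper: the constructor registers, the destructor unregisters.
class Subscription {
public:
    Subscription(Subject& subject, Observer& observer)
        : subject_(subject), observer_(observer) {
        subject_.addObserver(&observer_);
    }
    ~Subscription() { subject_.removeObserver(&observer_); }

    // Non-copyable, so exactly one owner carries the responsibility.
    Subscription(const Subscription&) = delete;
    Subscription& operator=(const Subscription&) = delete;

private:
    Subject& subject_;
    Observer& observer_;
};

struct Counter : Observer {
    int total = 0;
    void onEvent(int value) override { total += value; }
};

// Returns the sum of the events the counter received.
int demoSubscription(Subject& subject) {
    Counter counter;
    {
        Subscription sub(subject, counter);
        subject.notify(5);  // delivered while the subscription is alive
    }                       // sub's destructor calls removeObserver here
    subject.notify(7);      // not delivered: nobody is registered anymore
    return counter.total;
}
```

Calling demoSubscription on a fresh Subject leaves no observers registered afterwards, without any explicit cleanup call at the use site.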
"Single ownership" and "RAII" aren't the same thing.
Single ownership is when a single reference controls an object's lifetime.
RAII is when we use ownership to make sure something will happen, and in a timely fashion.
One can have RAII with shared ownership, but it's risky and more limited, as shown below.
In C++, the XYZ is calling the destructor, which is a function that takes no parameters and returns no useful information. We'll show how we can use RAII to make sure we call any of multiple methods which have no such restrictions!
Single ownership in modern C++ uses owning unique_ptr<T>s, and non-owning T* raw pointers.
Where unique_ptr is the sheriff, the raw pointer is the infamous mercenary who rides into town and makes everyone mighty nervous. When things go well, he's useful... but if things get dicey, he might just decide to dereference that pointer and cause all sorts of chaos.
We discovered that the sheriff and the mercenaries can work together, with some solid rules. We discovered patterns that worked pretty well.
For example, we'd often have a BigClass, that owns a bunch of smaller classes ("subcomponents"), where each subcomponent has raw pointers to subcomponents made before it.
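C++ initializes members in declaration order, and compilers can warn (for example, with -Wreorder in GCC and Clang) when the member initializer list is written in a different order, which helps us avoid referring to a not-yet-initialized member.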
The big class constructs these in the right order, and destructs them in the correct reverse order.
With this, there won't be any unfortunate seg-faulting in our small town.
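Here is a small sketch of that pattern. The subcomponent names (Logger, Database, Server) and the shared log are assumptions made for the example; only BigClass is named in the text. Each subcomponent holds raw pointers only to members declared before it, so everything it points to outlives it.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical subcomponents; a shared log records the order of events.
struct Logger {
    std::vector<std::string>* log;
    explicit Logger(std::vector<std::string>* l) : log(l) { log->push_back("+Logger"); }
    ~Logger() { log->push_back("-Logger"); }
};

struct Database {
    Logger* logger;  // non-owning: Logger is declared earlier, so it outlives us
    explicit Database(Logger* lg) : logger(lg) { logger->log->push_back("+Database"); }
    ~Database() { logger->log->push_back("-Database"); }
};

struct Server {
    Logger* logger;  // non-owning
    Database* db;    // non-owning
    Server(Logger* lg, Database* d) : logger(lg), db(d) {
        logger->log->push_back("+Server");
    }
    ~Server() { logger->log->push_back("-Server"); }
};

class BigClass {
public:
    explicit BigClass(std::vector<std::string>* log)
        : logger_(log), db_(&logger_), server_(&logger_, &db_) {}

private:
    // Members are constructed in declaration order and destroyed in reverse,
    // so every raw pointer above is destroyed before its target.
    Logger logger_;
    Database db_;
    Server server_;
};

// Builds and tears down a BigClass, returning the recorded order of events.
std::vector<std::string> demoBigClass() {
    std::vector<std::string> log;
    { BigClass big(&log); }
    return log;
}
```

The returned log lists the three constructions in declaration order, followed by the three destructions in the reverse order.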
The world discovered many patterns like this for handling raw pointers.
Some other safe patterns:
Vale's default reference is an owning reference, like C++'s unique_ptr.
In Vale, constructors are called just like any other function, no new or make_unique required.
One can think of &a like C++'s unique_ptr::get.
Interestingly, in this picture, there are never any dangling pointers. It's even better than never dereferencing any dangling pointers: rather the pointers never become dangling to begin with!
Indeed, every reference to an object is destroyed before the object itself is.
This kind of "non-outliving pointer" is everywhere. Some examples:
This pointer, which shouldn't outlive what it's pointing to, may seem oddly familiar to many of us: SQL has them!
In SQL, a foreign key constraint is a reference that cannot outlive the object (otherwise, it aborts the current transaction).
For that reason, we call this kind of pointer a constraint reference.
We've used constraint references in C++! We simply:
We fell in love with the approach instantly:
With raw pointers, if someone deletes the object your raw pointer is pointing to, you won't see the problem until much later, when you try to dereference it. Constraint refs answer the question "who destroyed that object I'm pointing to?" much sooner: we get a nice debugger pause or stack trace when someone accidentally frees what we're pointing at.
To summarize, we can get speed and memory safety with ease: during development and testing, we make the program halt whenever we free an object that a constraint reference is still pointing at.
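To make the idea concrete, here is one possible single-threaded sketch of such a check. It is an assumption about how this could be built, not the implementation we used: owning_ptr, constraint_ptr, Box and violationHandler are hypothetical names. The owner counts the constraint references pointing at its object and reports a violation if it lets go while that count is nonzero. The default handler halts; the demo swaps in a handler that only counts, in which case the object is kept alive until the last constraint reference disappears.

```cpp
#include <cassert>
#include <cstdio>
#include <cstdlib>
#include <utility>

// Hypothetical, single-threaded sketch; not taken from any real library.
using ViolationHandler = void (*)();

inline void haltOnViolation() {
    std::fputs("freed an object that a constraint ref still points at\n", stderr);
    std::abort();
}

// The handler run on a violation. It halts by default; a test can swap it.
inline ViolationHandler& violationHandler() {
    static ViolationHandler handler = haltOnViolation;
    return handler;
}

template <typename T>
struct Box {
    T value;
    int constraints;  // how many constraint_ptrs point here
    bool ownerGone;   // set if the owner let go while constraints > 0
};

template <typename T>
class constraint_ptr {
public:
    explicit constraint_ptr(Box<T>* box) : box_(box) { ++box_->constraints; }
    constraint_ptr(const constraint_ptr& other) : box_(other.box_) { ++box_->constraints; }
    constraint_ptr& operator=(const constraint_ptr&) = delete;
    ~constraint_ptr() {
        // If the handler let the program continue, the last ref frees the box.
        if (--box_->constraints == 0 && box_->ownerGone) delete box_;
    }
    T& operator*() const { return box_->value; }

private:
    Box<T>* box_;
};

template <typename T>
class owning_ptr {
public:
    explicit owning_ptr(T value) : box_(new Box<T>{std::move(value), 0, false}) {}
    owning_ptr(const owning_ptr&) = delete;
    owning_ptr& operator=(const owning_ptr&) = delete;
    ~owning_ptr() { reset(); }

    T& operator*() const { return box_->value; }
    constraint_ptr<T> borrow() const { return constraint_ptr<T>(box_); }

    // Frees the object, reporting a violation if anything still points at it.
    void reset() {
        if (!box_) return;
        if (box_->constraints != 0) {
            violationHandler()();    // the default handler never returns
            box_->ownerGone = true;  // otherwise defer the free to the last ref
        } else {
            delete box_;
        }
        box_ = nullptr;
    }

private:
    Box<T>* box_;
};

static int g_violations = 0;
inline void countViolation() { ++g_violations; }

// Returns how many violations were reported for one owner and one ref.
int countViolations(bool freeWhileReferenced) {
    g_violations = 0;
    violationHandler() = countViolation;
    {
        owning_ptr<int> owner(42);
        constraint_ptr<int> ref = owner.borrow();
        if (freeWhileReferenced) owner.reset();  // ref still points at the object
    }  // otherwise ref is destroyed first, then owner frees the object cleanly
    violationHandler() = haltOnViolation;
    return g_violations;
}
```

The counter is deliberately a plain int rather than an atomic, which keeps the check cheap but limits this sketch to single-threaded use.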
Rust's borrow references also do something like this.
Constraint references have the safety of borrow references, and we can alias them as much as we want!
And counter-intuitively, constraint references can sometimes be more efficient when you consider the program as a whole, especially when combined with region borrow checking. Keep reading to learn how!
In 2007, Gel was the first language to incorporate constraint references, described in Ownership You Can Count On as the "alias counting" technique.
According to legend, some C++ game engines already do this.
Or, if asserting isn't quite your fancy, there's a mode that pauses and shows a "Continue?" prompt which keeps it alive until the last constraint reference disappears.
To "alias" a pointer means to make another pointer, pointing to the same thing. Memory safety in the presence of aliasing has always been challenging, but constraint references solve it for us.
This can be controlled on a case-by-case basis; if we don't want this, we can use a weak reference instead, explained below.
Assist Mode is used in development and testing, where we halt the program when we accidentally free an object that a constraint reference is pointing at.
Fast Mode is used for release, and compiles the references down to raw pointers.
If someone prefers absolute safety, then they could use Resilient Mode for release, where we compile constraint_ptr to use a weak_ptr internally, and it will halt the program when we try to dereference a freed object instead. This is similar to running a program with Valgrind or ASan.
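A rough sketch of that idea follows, assuming the owner holds its object in a shared_ptr that it never shares. The name resilient_ref is hypothetical, and the sketch throws instead of halting so that the failure can be observed in a test.

```cpp
#include <cassert>
#include <memory>
#include <stdexcept>

// Hypothetical resilient_ref: checks liveness at dereference time.
template <typename T>
class resilient_ref {
public:
    explicit resilient_ref(const std::shared_ptr<T>& owner) : weak_(owner) {}

    // Returns the object, or throws if it has already been freed.
    // (A real Resilient Mode would halt; throwing is an assumption made here.)
    T& get() const {
        std::shared_ptr<T> alive = weak_.lock();
        if (!alive) throw std::runtime_error("dereferenced a freed object");
        return *alive;  // still kept alive by its single owner
    }

private:
    std::weak_ptr<T> weak_;
};

// Returns true if the read before the free works and the read after it is caught.
bool demoResilient() {
    auto owner = std::make_shared<int>(5);
    resilient_ref<int> ref(owner);
    int before = ref.get();  // fine: the owner is still alive
    owner.reset();           // the single owner frees the object
    try {
        ref.get();           // detected here, at the dereference
    } catch (const std::runtime_error&) {
        return before == 5;
    }
    return false;
}
```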
Unfortunately, C++'s shared_ptr and weak_ptr use atomic ref-counting under the hood, which would make these new constraint references very slow.
Fast Mode could be useful for high performance computing like games, and areas where we have other measures for safety, like webassembly or other sandboxes. Vale's Resilient Mode is still incredibly fast and has zero unsafety, which would make it perfect for use in servers and apps.
We coded in this style for years, to see how far constraint refs could go. Whenever we reached for shared_ptr, we stopped, and pondered if there was a way to solve the problem with single ownership.
We suddenly started discovering certain recurring patterns, like nuggets of gold, deep in the mines.
There are amazing recent advances in optimized ref-counting, such as Lobster's algorithm, which optimizes away 95% of ref-counts. Vale also has read-only regions and bump regions, where ref-counting overhead is reduced to zero.
Constraint references also solve the cycle problem for ref-counting, by enforcing that there are no other references to an object when we let go of its owning reference.
One pattern was the clasp pattern, which solved a certain problem with callbacks.
Imagine we have a Network class, shown here.
Let's say we had a class named Thing, whose doRequest method would say network->request("vale.dev", this);
Wait, danger lurks!
If this (the Thing) is destroyed before the response comes back, then Network would call into a dangling pointer and crash!
We almost concluded that we needed some shared ownership acrobatics for memory safety here.
Instead, we made two tiny classes, Request and RequestHandle.
Each had only a pointer to the other. Thing owned one, Network owned the other.
When one was destroyed, it would reach into the other to null out the pointer, thus severing the connection.
This pattern of having two mutual constraint references was so common that we gave it a name: the clasp pattern. It obviated a vast swath of our shared_ptr usage.
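The clasp can be sketched roughly as follows. Only the names Network, Thing, Request, RequestHandle and doRequest come from the text above; the members, the deliverAll method and the way a response is stored are assumptions made to keep the example self-contained.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

class RequestHandle;

// Network's half of the clasp.
class Request {
public:
    explicit Request(std::string u) : url(std::move(u)) {}
    ~Request();  // defined below, once RequestHandle is complete
    Request(const Request&) = delete;
    Request& operator=(const Request&) = delete;

    std::string url;
    RequestHandle* handle = nullptr;
};

// Thing's half of the clasp.
class RequestHandle {
public:
    explicit RequestHandle(Request* req) : request(req) { request->handle = this; }
    ~RequestHandle() { if (request) request->handle = nullptr; }  // sever the link
    RequestHandle(const RequestHandle&) = delete;
    RequestHandle& operator=(const RequestHandle&) = delete;

    Request* request;
    std::string response;
};

inline Request::~Request() { if (handle) handle->request = nullptr; }  // sever the link

class Network {
public:
    Request* request(const std::string& url) {
        pending_.push_back(std::make_unique<Request>(url));
        return pending_.back().get();
    }
    // Simulates every response arriving; returns how many handlers were reached.
    int deliverAll(const std::string& body) {
        int delivered = 0;
        for (auto& req : pending_) {
            if (req->handle) {  // skip requests whose Thing is already gone
                req->handle->response = body;
                ++delivered;
            }
        }
        pending_.clear();  // destroying each Request unclasps its handle
        return delivered;
    }

private:
    std::vector<std::unique_ptr<Request>> pending_;
};

class Thing {
public:
    void doRequest(Network& network) {
        handle_ = std::make_unique<RequestHandle>(network.request("vale.dev"));
    }
    std::string response() const { return handle_ ? handle_->response : std::string(); }

private:
    std::unique_ptr<RequestHandle> handle_;
};
```

If a Thing is destroyed while its request is still pending, its RequestHandle nulls out the pointer held by the Request, so Network simply skips it instead of calling into freed memory.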
We iterated on it, simplified it, and even made a one-to-many version, which was so useful that we promoted it to its own reference type, the weak reference.
We could refactor our codebase to make all our Things shared, so we could give Network a shared_ptr<Thing>... a bit invasive though.
We could give Network a shared_ptr<ThingRespHandler>. That's similar in spirit to what a std::function does: it bundles a callable together with its captured arguments (though std::function holds its own copy of them rather than sharing them).
In the end, we didn't need either.
& is a read-only reference, like C++'s const. We use &! to make a non-const reference.
Sometimes, we want a pointer to outlive what it points to.
For example, a missile launched by a spaceship should keep flying, even if its targeted asteroid disappears.
We can use a weak reference for this.
Note that this is very different from C++'s weak_ptr:
In our quest, single ownership unexpectedly solved a major recurring problem.
We previously had a system where a shared_ptr'd object's destructor would remove it from the display when the last reference to it disappeared. This was a terrible thing: every month, there would be a fresh bug saying "I hit the delete button, but the thing is still in the view!" and it'd take forever to figure out "who is keeping my object alive?"
The ironic part was that we knew who the owner should be. We knew the exact line that should have had the last reference... but apparently, it wasn't. Somewhere, another reference was preventing the destructor call.
This problem evaporated, because constraint references would notify us of the problem much earlier.
C++ weak refs are a bit involved, but feel free to comment and we'll explain how to do it!
This is a common complaint in GC'd languages too. An accidental reference way over in some corner of the codebase is keeping my very large object alive and in memory.
We call these "memory leaks". Yes, GC'd languages can have memory leaks!
These can also lead to dangerous bugs where network responses or button observers call into objects we thought we got rid of.
This is common in all languages: we often have a "main" reference to an object.
We have a VM (and soon, a compilation option!) which tells us which constraint references are still pointing at an object when we try to free it.
We were new to this way of thinking, so we expected that maybe a quarter of our references could become constraint refs. We were shocked when we were able to get rid of every single raw pointer and shared_ptr, and make it into either a constraint ref, or occasionally a weak ref.
We didn't know it at the time, but we had found the key to unlock the next steps for RAII. Below, we explain how Vale and a hypothetical C++++ could harness this new freedom.
We didn't run into any, but there are some hypothetical cases where one might want shared ownership. Luckily, you can implement shared references with single ownership, as an escape hatch.
Unexpectedly, getting rid of shared ownership made destructor parameters possible!
Let's back up a step and talk about shared_ptr. Anyone who has a shared_ptr<X> might be the unlucky one to call X's destructor. This is why destructors don't have parameters: every time you let go of a shared_ptr, you would have to somehow obtain the right arguments to pass in to the destructor. Owning and constraint references are different: you know exactly who should be calling the destructor.
There were other reasons C++ couldn't have destructor parameters, but they all have easy solutions from a language design standpoint:
We could also use a deleter, set up when we create the object, but that's often too early to know what parameters to pass into the destructor.
Exceptions weren't a problem for us, but they prevent this improved RAII just as much as shared ownership does. C++ will need to introduce a no-exceptions mode before it can do improved RAII.
Go-style defer blocks can make this even nicer.
In Vale, if you use the % operator to propagate errors upwards, it will automatically call .drop() on any local in scope.
However, if you have a local x which doesn't have a zero-arg .drop(), you have to hold onto the error, call the correct destructor for x, and then continue to return the error upwards.
Since we could have destructor parameters, we could improve our Transaction class, shown to the right.
Notice how we have to call setRollbackMode before the destructor.
We'd forget that all the time!
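For readers without the listing at hand, here is a sketch of what such a class might look like in today's C++. Only Transaction, setRollbackMode and mode_ are named in the text; the RollbackMode enum, the write method and the in-memory store are assumptions for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical sketch: the enum, write() and the store are assumed.
enum class RollbackMode { Commit, Rollback };

class Transaction {
public:
    explicit Transaction(std::vector<std::string>* store) : store_(store) {}
    Transaction(const Transaction&) = delete;
    Transaction& operator=(const Transaction&) = delete;

    void write(const std::string& entry) { pending_.push_back(entry); }

    // Must be called before the destructor runs, and is easy to forget.
    void setRollbackMode(RollbackMode mode) { mode_ = mode; }

    ~Transaction() {
        // The destructor cannot take arguments, so it consults a member.
        if (mode_ == RollbackMode::Commit) {
            store_->insert(store_->end(), pending_.begin(), pending_.end());
        }
        // Otherwise the pending writes are silently discarded.
    }

private:
    std::vector<std::string>* store_;
    std::vector<std::string> pending_;
    RollbackMode mode_ = RollbackMode::Rollback;
};

// Returns how many entries were committed to the store.
std::size_t demoForgottenMode(bool rememberToSetMode) {
    std::vector<std::string> store;
    {
        Transaction tx(&store);
        tx.write("row 1");
        if (rememberToSetMode) tx.setRollbackMode(RollbackMode::Commit);
    }
    return store.size();
}
```

Nothing in the compiler objects when the setRollbackMode call is left out: the write is just lost.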
However, now that we have destructor parameters, we can get rid of setRollbackMode, get rid of mode_, and use this destructor instead:
We've seen this pattern everywhere: since destructors couldn't take parameters, we had to hack them into members. Now we don't have to!
Since we don't have shared ownership anymore, we no longer need a single zero-arg destructor, and we can add destructor overloads!
Notice how the destructors now have names.
Recall how RAII is where "the compiler will make sure that we XYZ". Here, the compiler will make sure that someone holding a Transaction either calls commit or rollback.
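Today's C++ can only approximate this. One possible sketch, with hypothetical method signatures, is to write commit and rollback as static functions that consume a unique_ptr to the transaction, which lets them take parameters and return values. The compiler will not reject a transaction that is dropped without either call, so this sketch falls back on a debug-time assert in the real destructor.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical approximation of named destructors in standard C++.
class Transaction {
public:
    explicit Transaction(std::vector<std::string>* store) : store_(store) {}
    Transaction(const Transaction&) = delete;
    Transaction& operator=(const Transaction&) = delete;

    ~Transaction() {
        // Only a run-time check: C++ cannot enforce this at compile time.
        assert(finished_ && "Transaction dropped without commit or rollback");
    }

    void write(std::string entry) { pending_.push_back(std::move(entry)); }

    // "Destructor" that returns a value: the number of entries committed.
    static std::size_t commit(std::unique_ptr<Transaction> self) {
        const std::size_t count = self->pending_.size();
        for (std::string& entry : self->pending_) {
            self->store_->push_back(std::move(entry));
        }
        self->finished_ = true;
        return count;
    }  // self goes out of scope here, freeing the transaction

    // "Destructor" that takes parameters: a reason and where to record it.
    static void rollback(std::unique_ptr<Transaction> self,
                         const std::string& reason,
                         std::vector<std::string>* auditLog) {
        auditLog->push_back(reason);
        self->finished_ = true;
    }

private:
    std::vector<std::string>* store_;
    std::vector<std::string> pending_;
    bool finished_ = false;
};
```

A caller writes Transaction::commit(std::move(tx)), after which tx is null and cannot be used by accident.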
Our hypothetical C++++ syntax is starting to show some cracks, so let's see this in Vale.
Here, commit and rollback are just regular methods that take an owning this and happen to free it (with destruct).
(That's all a destructor is, when you think about it.)
This isn't just useful for transactions. Imagine a Future<T, E> class with two destructors:
Now, we can never accidentally drop a future without resolving or rejecting it first!
Your signature doesn't matter, it's what's inside that counts. What makes you a destructor is whether you free this inside your function, and don't let anyone tell you otherwise!
Notice how read takes a constraint reference (&!this), but the two "destructors" take in an owning reference (this).
The parentheses here cause us to move into a method, equivalent to commit(transaction).
A regular . like in transaction.commit() gives a constraint reference to the method, equivalent to commit(&transaction).
A common C++ wish is to be able to return things from destructors.
However, a shared_ptr<T> would just throw away the ~T()'s return value anyway. So why even allow one?
Now that we don't have shared ownership, we can start returning values from destructors.
As you use this kind of improved RAII more, you start to see opportunities for it everywhere.
Imagine if std::thread's destructor could return the result of a thread's calculation!
Imagine a std::function-like class where its destructor called the underlying lambda and destroyed this at the same time, thus guaranteeing it could only be called once. The possibilities are endless!
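As a rough approximation of that call-once idea in standard C++, the hypothetical OnceFunction below exposes its call as a static function that consumes the owning unique_ptr, so invoking it and destroying it happen together. The class name and its interface are assumptions for illustration.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <utility>

// Hypothetical OnceFunction: calling it consumes it.
template <typename R>
class OnceFunction {
public:
    explicit OnceFunction(std::function<R()> fn) : fn_(std::move(fn)) {}

    // Takes ownership of the object, runs the lambda, then frees the object.
    static R call(std::unique_ptr<OnceFunction> self) {
        return self->fn_();
    }

private:
    std::function<R()> fn_;
};

// Returns the lambda's result if the function object is gone after the call.
int demoOnce() {
    int calls = 0;
    auto once = std::make_unique<OnceFunction<int>>([&calls] { return ++calls; });
    int result = OnceFunction<int>::call(std::move(once));
    // once is now null, so there is nothing left to call a second time.
    return (once == nullptr) ? result : -1;
}
```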
Recently, C++17 added the nodiscard attribute, which was useful for functions like Result<ImportantResult, ImportantError> doSomethingImportant();, to prevent the user from ignoring the Result.
C++ wouldn't have needed a special attribute if it had this kind of improved RAII: Simply don't provide a default destructor, and provide other destructors, with return values:
We might want to return an object to a free-list, instead of free()ing it.
Normally, we would need to use an allocator. But instead, we could take in the free-list as a parameter, and move this into it.
This is impossible in C++'s syntax (we don't get to move this), so we'll use Vale syntax:
By now you've noticed that destructors can have overloads, take parameters, return values, and even decline to destroy this! There's hardly anything that separates them from regular functions.
In fact, in Vale, the whole "destructor" side of the language is built from one small rule:
In one fell swoop, by removing our dependence on shared_ptr, we had taken one of the thorniest corners of C++ and completely simplified it away.
Using constraint references, we unleashed the power of single ownership and found the next steps for RAII:
With C++'s existing RAII, destructors can do very little. With improved RAII, an object can offer multiple options for destructors, each with return values and parameters.
Someday, we might be able to add these features to C++, but before that can happen, we need to show the world that single ownership is powerful, and we don't need shared ownership as much as we thought.
This isn't even the end of the single ownership saga! In the coming weeks, we'll explain how this consistent single ownership approach enables other unique capabilities in Vale, such as cross-compilation, the region borrow checker, and lightning fast memory management.
Until then, we want to hear from you! We'd love to hear your thoughts on single ownership, RAII, Vale, and any ideas you have. Come share them in the Reddit posts and the Hacker News post, and join the r/Vale subreddit and the Vale discord!
Maybe we could make this work in C++ if it allowed us to specify an explicit this parameter, which was wrapped in a unique_ptr. Something like Rust's Arbitrary Self Types.
All contributions are welcome! Soon, we're going to:
If any of this interests you, come join us!