LanguagesArchitecture
Note: This article was published before we changed from constraint references to generational references.

While diving the depths of single ownership, we discovered a hidden gem from the most unlikely of places. With it, we were able to reassemble C++ into something that really unleashes the full potential of RAII.

Recall these ancient mantras, known to any C++ developer:

  • "Who destroyed that object I was pointing at?"
  • "Blasted segfaults!"
  • "We need to influence the destructor, but destructors take no parameters!"
  • "Who's still holding a shared_ptr to my object? It should be dead!"
  • "I can't throw from a destructor, or have multiple exceptions in flight?"
  • "Destructors can't return a status, either?!"

In this journey, we discovered language solutions for all of these.

This article is about C++'s RAII and single ownership, and how we can take it even further. 0

C++'s syntax often makes single ownership look more difficult than it is, so we also use Vale to illustrate how easy and powerful single ownership can be. 1

Single Ownership

Our journey started in 2011, when C++11's unique_ptr brought single ownership and move semantics to C++ programmers worldwide, and changed our lives forever. 2

In one fell swoop, it basically single-handedly solved memory leaks. Single-ownership is one of those notions that, once it clicked, felt right. It was probably because this is how we already think: in C, we would mentally track ownership, to know whose responsibility it was to free an object. Even in GC'd languages, we would implicitly track who's responsible for calling .dispose().

We slowly discovered that we could use RAII for things other than freeing memory! We could:

  • Remove self from an observers list. 3
  • Close a file stream, or clean up a temp file.
  • Close a mutex's lock.
  • Stop an ongoing calculation in another thread.
  • Roll-back or commit a transaction.
  • Cancel a retrying network request.
  • Remove some UI elements from the view.
  • Resolve or reject a future.
  • Notify others that we're being freed.
  • Enforce that we actually handled an error, rather than dropping it on the floor.

We realized: RAII wasn't just a way to track who should free an object, and these weren't just neat tricks. RAII is much more, it's a way to track responsibility. 4

It's a promise that's enforced by the compiler. Instead of just "The compiler will make sure we free this," RAII is "The compiler will make sure that we XYZ" where XYZ is whatever we want. 5

Side Notes
(interesting tangential thoughts)
0

RAII stands for Resource Acquisition Is Initialization, which is a fancy way of saying "put that code in a destructor so you can be sure it actually happens."

1

Vale is still in early alpha, and rapidly approaching v0.1. Check out the Roadmap for progress and plans!

All the features mentioned here are available in Vale, but Resilient Mode, regions, RC elision, and weak references are still on the way.

2

Single ownership and move semantics existed even before C++, in Ada, Common Lisp (via the with-... macros), Mesa/Cedar at Xerox PARC, and Object Pascal (for object/class types). (pjmlp)

3

Search your Java code for removeObserver(this) and you'll find that most of them are in methods named "dispose", "destroy", "close", etc.

Now imagine if the language could make sure you couldn't forget to call that method! That's RAII.

4

"Single ownership" and "RAII" aren't the same thing.

Single ownership is when a single reference controls an object's lifetime.

RAII is when we use ownership to make sure something will happen, and in a timely fashion.

One can have RAII with shared ownership, but it's risky and more limited, as shown below.

5

In C++, the XYZ is calling the destructor, which is a function that takes no parameters and returns no useful information. We'll show how we can use RAII to make sure we call any of multiple methods which have no such restrictions!

Safe Handling of Aliases

Single ownership in modern C++ uses owning unique_ptr<T>s, and non-owning T* raw pointers.

Where unique_ptr is the sheriff, the raw pointer is the infamous mercenary who rides into town and makes everyone mighty nervous. When things go well, he's useful... but if things get dicey, he might just decide to dereference that pointer and cause all sorts of chaos.

We discovered that the sheriff and the mercenaries can work together, with some solid rules. We discovered patterns that worked pretty well.

For example, we'd often have a BigClass, that owns a bunch of smaller classes ("subcomponents"), where each subcomponent has raw pointers to subcomponents made before it.

C++'s member initializer list even enforces that we don't refer to a not-yet-initialized member.

The big class constructs these in the right order, and destructs them in the correct reverse order.

With this, there won't be any unfortunate seg-faulting in our small town.

The world discovered many patterns like this for handling raw pointers. 6

C++
class BigClass {
public:
  unique_ptr<A> a;
  unique_ptr<B> b;
  unique_ptr<C> c;
  unique_ptr<D> d;
  BigClass() :
    a(make_unique<A>()),
    b(make_unique<B>(a.get())),
    c(make_unique<C>(a.get())),
    d(make_unique<D>(a.get(), c.get()))
  { }
};
prevale
struct BigClass {
  a A; 7
  b B;
  c C;
  d D;
  func BigClass() {
    this.a = A(); 8
    this.b = B(&this.a); 9
    this.c = C(&this.a);
    this.d = D(&this.a, &this.c);
  }
}

6

Some other safe patterns:

  • Destroying things in the reverse order they were made.
  • Giving a raw pointer to a non-moved local to a function that doesn't let it escape.
  • Giving a raw pointer to a movable local to a function that doesn't let it escape, unless a closure moves it.

7

Vale's default reference is an owning reference, like C++'s unique_ptr.

8

In Vale, constructors are called just like any other function, no new or make_unique required.

9

One can think of &a like C++'s unique_ptr::get.

Interestingly, in this picture, there are never any dangling pointers. It's even better than never dereferencing any dangling pointers: rather the pointers never become dangling to begin with!

Indeed, every reference to an object is destroyed before the object itself is.

Constraint References

This kind of "non-outliving pointer" is everywhere. Some examples:

  • A PhoneCall points to two Accounts. The PhoneCall shouldn't outlive the Accounts; we should delete the PhoneCall before deleting either Account.
  • In a graph, an edge points to its nodes. The edge shouldn't outlive the nodes, we should delete the edge before deleting any of its nodes.
  • A HandShake should only exist while the two Hands exist.

This pointer, which shouldn't outlive what it's pointing to, may seem oddly familiar to many of us: SQL has them! 10

In SQL, a foreign key constraint is a reference that cannot outlive the object (otherwise, it aborts the current transaction).

For that reason, we call this kind of pointer a constraint reference. 11

We've used constraint references in C++! 12 We simply:

  • Wrapped shared_ptr in a class called owning_ptr which uses move semantics like unique_ptr and, when destroyed, checks the ref count to assert 13 that it's the last pointer to the object.
  • Made another wrapper around shared_ptr called constraint_ptr.
  • In release mode, compiled owning_ptr to a unique_ptr, and compiled constraint_ptr to a raw pointer.

We fell in love with the approach instantly:

  • It was memory safe! We never had a use-after-free once we switched to constraint references.
  • It was conservative! In development mode, we couldn't even make pointers dangle, much less dereference them. It caught risky behavior much earlier and made it extremely unlikely to have unsafe behavior in release mode.
  • It was zero-cost! They compiled to a unique_ptr and raw pointers, so there was no extra overhead in release mode.
  • It was easy! We could alias 14 with freedom, and most of our pointers didn't outlive what they were pointing to anyway, so there was no learning curve to struggle against!

With raw pointers, if someone deletes the object your raw pointer is pointing to, you won't see the problem until much later, when you try and dereference it. Constraint refs answer the question "who destroyed that object I'm pointing to?" much sooner; 15 we get a nice debugger pause or stack trace when someone accidentally frees what we're pointing at.

To summarize, we can get speed and memory safety with ease by, when developing and testing, making the program halt when we free an object that any constraint reference is pointing at.

10

Rust's borrow references also do something like this.

Constraint references have the safety of borrow references, and we can alias them as much as we want!

And counter-intuitively, constraint references can sometimes be more efficient when you consider the program as a whole, especially when combined with region borrow checking. Keep reading to learn how!

11

In 2007, Gel was the first language to incorporate constraint references, described in Ownership You Can Count On as the "alias counting" technique.

12

According to legend, some C++ game engines already do this.

13

Or, if asserting isn't quite your fancy, there's a mode that pauses and shows a "Continue?" prompt which keeps it alive until the last constraint reference disappears.

14

To "alias" a pointer means to make another pointer, pointing to the same thing. Memory safety in the presence of aliasing has always been challenging, but constraint references solve it for us.

15

This can be controlled on a case-by-case basis; if we don't want this, we can use a weak reference instead, explained below.

Constraint Behavior Modes

Assist Mode is used in development and testing, where we halt the program when we accidentally free an object that a constraint reference is pointing at.

Fast Mode is used for release, and compiles the references down to raw pointers.

If someone prefers absolute safety, then they could use Resilient Mode for release, where we compile constraint_ptr to use a weak_ptr internally, and it will halt the program when we try to dereference a freed object instead. This is similar to running a program with Valgrind or ASan.

Unfortunately, C++'s shared_ptr and weak_ptr use atomic ref-counting under the hood, which would make these new constraint references very slow.

Luckily, Vale's region isolation allows Assist Mode and Resilient Mode to use non-atomic ref counting, which is much faster. 16 17

Fast Mode could be useful for high performance computing like games, and areas where we have other measures for safety, like webassembly or other sandboxes. Vale's Resilient Mode is still incredibly fast and has zero unsafety, which would make it perfect for use in servers and apps.

Emerging Patterns

We coded in this style for years, to see how far constraint refs could go. Whenever we reached for shared_ptr, we stopped, and pondered if there was a way to solve the problem with single ownership.

We suddenly started discovering certain recurring patterns, like nuggets of gold, deep in the mines.

16

There are amazing recent advances in optimized ref-counting, such as in Lobster's Algorithm which optimizes away 95% of ref-counts. Vale also has read-only regions and bump regions, where ref-counting overhead is reduced to zero.

17

Constraint references also solve the cycle problem for ref-counting, by enforcing that there are no other references to an object when we let go of its owning reference.

Clasp

One pattern was the clasp pattern, which solved a certain problem with callbacks.

Imagine we have a Network class, shown here.

Let's say we had a class named Thing, whose doRequest method would say network->request("vale.dev", this);

Wait, danger lurks!

If this (the Thing) is destroyed before the response comes back, then Network would call into a dangling pointer and crash!

We almost concluded that we needed some shared ownership acrobatics for memory safety here. 18

C++
class INetworkCallback {
public:
  virtual void handleResponse(
    const std::string& resp) = 0;
  virtual ~INetworkCallback() = default;
};
class Network {
public:
  void request(
    const std::string& url,
    INetworkCallback* callback)
  { ... }
};
prevale
interface INetworkCallback {
  func handleResponse(&this, resp Str);
  func drop(this);
}
struct Network {
  func request(
    &this, 19
    url Str,
    callback &!INetworkCallback)
  { ... }
}

Instead, we made two tiny classes, Request and RequestHandle.

Each had only a pointer to the other. Thing owned one, Network owned the other.

When one was destroyed, it would reach into the other to null out the pointer, thus severing the connection.

This pattern of having two mutual constraint references was so common that we gave it a name: the clasp pattern. It obviated a vast swath of our shared_ptr usage.

We iterated on it, simplified it, and even made a one-to-many version, which was so useful that we promoted it to its own reference type, the weak reference.

18

We could refactor our codebase to make all our Things shared, so we could give Network a shared_ptr<Thing>... a bit invasive though.

We could give Network a shared_ptr<ThingRespHandler>. In fact, that's what std::function is: a function pointer and a shared_ptr around some arguments.

In the end, we didn't need either.

19

& is a read-only reference, like C++'s const. We use &! to make a non-const reference.

Weak Reference

Sometimes, we want a pointer to outlive what it points to.

For example, a missile launched by a spaceship should keep flying, even if its targeted asteroid disappears.

We can use a weak reference for this. 20

Note that this is very different from C++'s weak_ptr:

  • When you lock a weak_ptr, you get a shared_ptr which will delay destruction and extend the lifetime of the object if the other shared_ptrs disappear.
  • When you lock a weak reference, you get a constraint reference, which will halt the program if the owning reference disappears.

Simplification

In our quest, single ownership unexpectedly solved a major recurring problem.

We previously had a system where a shared_ptr'd object's destructor would remove it from the display when the last reference to it disappeared. This was a terrible thing; Every month, there would be a fresh bug saying "I hit the delete button, but the thing is still in the view!" and it'd take forever to figure out "who is keeping my object alive?" 21

The ironic part was that we knew who the owner should be. We knew the exact line that should have had the last reference... 22 but apparently, it wasn't. Somewhere, another reference was preventing the destructor call.

This problem evaporated, because constriant references would notify us of the problem much earlier. 23

20

C++ weak refs are a bit involved, but feel free to comment and we'll explain how to do it!

21

This is a common complaint in GC'd languages too. An accidental reference way over in some corner of the codebase is keeping my very large object alive and in memory.

We call these "memory leaks". Yes, GC'd languages can have memory leaks!

These can also lead to dangerous bugs where network responses or button observers call into objects we thought we got rid of.

22

This is common in all languages: we often have a "main" reference to an object.

23

We have a VM (and soon, a compilation option!) which tells us which constraint references are still pointing at an object when we try to free it.

Surprise!

We were new to this way of thinking, so we expected that maybe a quarter of our references could become constraint refs. We were shocked when we were able to get rid of every single raw pointer and shared_ptr, and make it into either a constraint ref, or occasionally a weak ref. 24

We didn't know it at the time, but we had found the key to unlock the next steps for RAII. Below, we explain how Vale and a hypothetical C++++ could harness this new freedom.

24

We didn't run into any, but there are some hypothetical cases where one might want shared ownership. Luckily, you can implement shared references with single ownership, as an escape hatch.

Language Implications

Destructor Parameters!

Unexpectedly, getting rid of shared ownership made destructor parameters possible!

Let's back up a step and talk about shared_ptr. Anyone who has a shared_ptr<X> might be the unlucky one to call Xs destructor. This is why destructors don't have parameters: every time you let go of a shared_ptr, you would have to somehow obtain the right arguments to pass them in to the destructor, somehow. 25 Owning and constraint references are different: you know exactly who should be calling the destructor.

There were other reasons C++ couldn't have destructor parameters, but they all have easy solutions from a language design standpoint:

  • Exceptions: We need to call an object's destructor automatically when an exception is in flight, where do we get the parameters for that?
    • We don't use exceptions anyway! In fact, entire companies' style guides prohibit them. Use Result instead. 26
  • If we implicitly call the destructor at the end of a block, how do we know what parameters to pass in?
    • If the destructor requires parameters, don't implicitly call it, and require us to explicitly call its destructor! 27 28
  • If we have a vector<MyObj>, how would ~vector() know what to pass into ~MyObj()?
    • Destructors can have parameters now, so pass a "consumer" functor std::function<void(std::unique_ptr<MyObj>)>) (or fn(MyObj)Void), and the vector could give each element to it to destroy.
25

We could also use a deleter, set up when we create the object, but thats often too early to know what parameters to pass into the destructor.

26

Exceptions weren't a problem for us, but they prevent this improved RAII just as much as shared ownership does. C++ will need to introduce a no-exceptions mode before it can do improved RAII.

27

Go-style defer blocks can make this even nicer.

28

In Vale, if you use the % operator to propagate errors upwards, it will automatically call .drop() on any local in scope.

However, if you have a local x which doesn't have a zero-arg .drop(), you have to hold onto the error, call the correct destructor for x, and then continue to return the error upwards.

Since we could have destructor parameters, we could improve our Transaction class, shown to the right.

Notice how we have to call setRollbackMode before the destructor.

We'd forget that all the time!

However, now that we have destructor parameters, we can get rid of setRollbackMode, get rid of mode_, and use this destructor instead:

C++++
virtual ~Transaction(RollMode mode) {
  if (!committed_) {
    /* use mode to roll back */
  }
}

...

// invoke destructor
~transaction(TUMBLE);

We've seen this pattern everywhere: since destructors couldn't take parameters, we had to hack them into members. Now we dont have to!

C++
class Transaction {
public:
  ReadResult read(ReadQuery query);

  TransactionResult setRollbackMode(
      RollMode mode) {
    mode_ = mode;
  }

  void commit() {
  ...
  committed_ = true;
}

virtual ~Transaction() {
  if (!committed_) {
  /* use mode_ to roll back */
  }
}

private:
  bool committed_;
  RollMode mode_;
};

...

transaction->setRollbackMode(TUMBLE);
// invoke destructor
transaction = nullptr;

Destructor Overloads

Since we don't have shared ownership anymore, we no longer need a single zero-arg destructor, and we can add destructor overloads!

Notice how the destructors now have names.

Recall how RAII is where "the compiler will make sure that we XYZ". Here, the compiler will make sure that someone holding a Transaction either calls commit or rollback.

C++
class Transaction {
public:
  ReadResult read(ReadQuery query);

  virtual ~commit() { ... }

  virtual ~rollback(RollMode mode)
    { ... }
};

// To commit:
transaction->~commit();
// To rollback:
transaction->~rollback(TUMBLE);

Our hypothetical C++++ syntax is starting to show some cracks, so lets see this in Vale.

Here, commit and rollback are just regular methods that take an owning this and happen to free it (with destruct). 29

(That's all a destructor is, when you think about it.)

This isn't just useful for transactions. Imagine a Future<T, E> class with two destructors:

  • void ~resolve(T successValue);
  • void ~reject(E errorValue);

Now, we can never accidentally drop a future without resolving or rejecting it first!

prevale
struct Transaction {
  func read(&!this, query ReadQuery)
    ReadResult { ... }

  func commit(this) { 30
    ...
    destruct this;
  }

  func rollback(this, mode RollMode) {
    ...
    destruct this;
  }
}

exported func main() {
  // ...
  // To commit:
  (transaction).commit(); 31
  // To rollback:
  (transaction).rollback(TUMBLE);
}
29

Your signature doesn't matter, it's whats inside that counts. What makes you a destructor is whether you free this inside your function, and don't let anyone tell you otherwise!

30

Notice how read takes a constraint reference (&!this), but the two "destructors" take in an owning reference (this).

31

The parentheses here cause us to move into a method, equivalent to commit(transaction).

A regular . like in transaction.commit() gives a constraint reference to the method, equivalent to commit(&transaction).

Destructor Returns

A common C++ wish is to be able to return things from destructors.

However, a shared_ptr<T> would just throw away the ~T()'s return value anyway. So why even allow one?

Now that we don't have shared ownership, we can start returning values from destructors.

C++
class Transaction {
public:
  ReadResult read(ReadQuery query);

  virtual void ~commit() { ... }

  virtual RollbackStatus ~rollback(
      RollMode mode) {
    ...;
    return SUCCESS;
  }
};

As you use this kind of improved RAII more, you start to see opportunities for it everywhere.

Imagine if std::thread's destructor could return the result of a thread's calculation!

Imagine a std::function-like class where its destructor called the underlying lambda and destroyed this at the same time, thus guaranteeing it could only be called once. The possibilities are endless!

Recently, C++17 added the nodiscard attribute, which was useful for functions like Result<ImportantResult, ImportantError> doSomethingImportant();, to prevent the user from ignoring the Result.

C++ wouldn't have needed a special attribute if it had this kind of improved RAII: Simply don't provide a default destructor, and provide other destructors, with return values:

  • ImportantResult ~getResult();
  • ImportantError ~getError();
  • void ~printResult();

Non-destroying Destructors

We might want to return a object to a free-list, instead of free()ing it.

Normally, we would need to use an allocator. But instead, we could take in the free-list as a parameter, and move this into it.

This is impossible in C++'s syntax (we don't get to move this), 32 so we'll use Vale syntax:

prevale
struct Transaction {
  func read(&!this, query ReadQuery) ReadResult { ... }

  func commit(this, list &TransactionList) {
    ...
    list.reclaim(this); // move this into a different function
  }

  func rollback(this, list &TransactionList, mode RollMode) RollbackStatus {
    ...
    list.reclaim(this); // move this into a different function
    return SUCCESS;
  }
}

What even is a destructor?

By now you've noticed that Destructors can have overloads, take parameters, return values, and even decline to destroy this! There's hardly anything that separates them from regular functions.

In fact, in Vale, the whole "destructor" side of the language is built from one small rule:

"If an owning reference goes out of scope, call .drop() on it. If no public .drop() exists, give a compile error."

In one fell swoop, by removing our dependence on shared_ptr, we had taken one of the thorniest corners of C++ and completely simplified it away.

RAII: Past, Present, Future

Using constraint references, we unleashed the power of single ownership and found the next steps for RAII:

  • Multiple destructors: Mark errors handled in different ways, rollback or commit a transaction, end things how you want!
  • Destructor parameters: Resolve or reject futures with certain values, set a priority for a close operation, you name it!
  • Destructor return values: Return error status, return the result of a thread's calculation, whatever you need!
  • Non-destroying destructors: Reuse objects, give out ownership without risking freeing, sky's the limit!

With C++'s existing RAII, destructors can do very little. With improved RAII, an object can offer multiple options for destructors, each with return values and parameters.

Someday, we might be able to add these features to C++, but before that can happen, we need to show the world that single ownership is powerful, and we don't need shared ownership as much as we thought.

We made Vale for exactly that reason. It's still in alpha, so if you want to help bring improved RAII into the world, come by the r/Vale subreddit or the Vale discord! 33

This isn't even the end of the single ownership saga! In the coming weeks, we'll explain how this consistent single ownership approach enables other unique capabilities in Vale, such as cross-compilation, the region borrow checker, and lightning fast memory management.

Until then, we want to hear from you! We'd love to hear your thoughts on single ownership, RAII, Vale, and any ideas you have! Come share your thoughts in the Reddit posts, the Hacker News post, and come join the r/Vale subreddit subreddit and the Vale discord!

32

Maybe we could make this work in C++ if it allowed us to specify an explicit this parameter, which was wrapped in a unique_ptr. Something like Rust's Arbitrary Self Types.

33

All contributions are welcome! Soon, we're going to:

  • Write a standard library! (sets, hash maps, lists, etc)
  • Add weak pointers!
  • Finish designing the region borrow checker!
  • Replace the temporary combinator-based parser with a real one!
  • Make syntax highlighters! (VSCode, Sublime, Vim, Emacs, etc)
  • Enable support gdb/lldb for debugging!
  • Add better error reporting!
  • Add a "show all constraint refs" option in debug mode to our LLVM codegen stage!

If any of this interests you, come join us!