Weak references are weird.
This post is a collection of all the mysterious and arcane mechanisms that languages use to offer weak references.
Our quest was to figure out the best approach to use for Vale, that aligned with its goals of being fast, memory safe, and easy to use.
In the end, we made a completely new approach!
In most languages, 0 normal references are "strong". As long as there is at least one strong reference to an object, the object stays alive. When the last strong reference goes away, the object is deallocated. 1
We can also have weak references to the object, which don't keep the object alive. After the target object dies, they appear to be null. 2
These aren't just useful to break ref-counting cycles. They're also used to know when something still logically exists, to factor into our program's behavior. If you ever seen an alive or dead boolean in a class, you're likely seeing a good use-case for weak references.
For example, we can use it in an observer:
func handleClick(self &Button) {
// Check if self.clickObserverWeakRef points to an object that's still alive.
// If it is, refer to it as clickObserver.
if clickObserver = self.clickObserverWeakRef.lock() {
// Now that we know it's alive, call its onClick.
clickObserver.onClick();
}
}
Or perhaps a magic missile can check if its enemy still exists:
func takeAction(self &Missile) {
// Check if self.enemyWeakRef points to an enemy that's still alive.
// If it is, refer to it as enemy.
if enemy = self.enemyWeakRef.lock() {
// Now that we know it's alive, move one step closer to it.
self.position.goToward(enemy.position);
if self.position == enemy.position {
// We've reached the enemy, explode!
self.explode();
}
} else {
// The enemy is gone. Explode where we are anyway!
self.explode();
}
}
It's a rather useful tool! And it's pretty simple under the hood.
...except in Objective-C.
Objective-C has a mutex-guarded global hash map, tracking all the weak references and what objects they point to.
Specifically:
Unexpectedly, there are benefits to this approach!
But of course, it has drawbacks:
More specifically, in shared-ownership languages like Python, Swift, Obj-C, C#.
In reference counted languages like Swift, the language immediately notices there are no more strong references to the object, and deallocates the object right away. In tracing-garbage-collected languages like Java, the interpreter eventually notices that nobody points at this object, and deallocates it then.
Languages rarely set them to null. More often, a weak reference has some way of knowing whether the target object is still alive, so it can pretend to be null.
I'm not certain, but if the global map is an unordered_map<void*, vector<void*>*> hash map (in C++ terms) with a load factor of 0.5, then making the first weak reference to a new object could cost ~48B, and any subsequent weak references would cost 16B. If anyone wants to dive in, the source code is available in this repo, likely somewhere around objc-weak.h. (Thanks to mayoff for the lead!)
When Swift isn't using the Objective-C approach for compatibility reasons, Swift actually does something pretty elegant. 6
A Swift object has two counters in it:
Remember how we said "when the last strong reference goes away, the object is deallocated"? That's not true for Swift.
Let's say an object has strong references and weak references pointing at it.
When the last strong reference goes away, the object is deinitialized, but not deallocated. I like to think that deinitializing just zeroes out all the fields. 7
The object isn't deallocated quite yet; the object is in kind of a "zombie state" because it has no usable contents, but it's kept alive while there are still weak references pointing to it.
If we ask one of these weak references "are you pointing at a live object?", it will look into the object and check if the strong ref count is positive. If so, it responds true. If it's zero, it responds false. 8
Later, when we let go of the last weak reference, the weak reference counter goes to zero, and the object is deallocated; the zombie is finally destroyed.
Swift's approach has a big benefit:
and some costs:
Swift's approach is very simple, and I like that. 10
Thanks to electromaster666, mayoff, and jaredgrubb for noticing this!
Source: Mike Ash
This isn't quite true, it actually doesn't change the memory, but it helps a lot to think of it this way.
If the answer is false, Swift also sets the reference to be actually null, rather than just pointing at a zombie object, so it can answer more quickly next time, without having to dereference again.
This means that the counters are on the same cache line as the fields, which means if we access the counter, the CPU naturally brings some nearby fields into cache, making subsequent accesses faster.
Especially compared to Objective-C's approach!
If we look at the memory layout, C++ is similar to Swift; right next to the object we can have a strong ref count and a weak ref count. 11
In C++, we choose whether an object can have weak references to it or not. A Spaceship will by default not have any counters, but a shared_ptr<Spaceship> will have them.
We can make a weak reference, a weak_ptr<Spaceship>, from any shared_ptr<Spaceship>.
This has benefits:
And a cost:
And some oddities:
This is the case if we use std::make_shared. If we initialize a shared_ptr directly, then the counters will be somewhere else in the heap.
This is a principle called "zero-cost abstraction"; if we don't use a feature, it doesn't cost us any space or CPU time.
A typical implementation of weak_ptr stores two pointers:
Source: cppreference
Rust's Rc and Weak are basically C++'s shared_ptr and weak_ptr, with a couple differences:
We don't often see Rc and Weak in Rust. Instead, we use other mechanisms that behave like weak references.
We've now seen a few different kinds of weak references. However, if we zoom out a bit, we see a lot of things that act very similarly to weak references.
A weak reference is basically something that you can trade for a regular reference to the object if it still exists.
When you think of it that way, a lot of things are weak references.
For example, a string containing a filename, like myfile.txt. We can trade it for the contents of a file, if the file still exists:
func main() {
if contents = readFileAsString("myfile.txt") {
// File exists!
println(contents);
} else {
println("File doesn't exist!");
}
}
Or perhaps we have an integer ID, and we can use it to look up a Spaceship in a map:
func printName(ships &HashMap<int, Spaceship>, ship_id int) {
if ship = ships.get(ship_id) {
println("Ship exists! {ship.name}")
} else {
println("Ship doesn't exist!");
}
}
Notice how we first check for existence, and then use the resulting data. Just like a weak reference!
My favorite kind of weak-reference-in-disguise is the generational index, often used in C++ and Rust programs.
We often store our objects in arrays or vectors, such as a Vec<Spaceship>. 15 When we destroy a Spaceship, we often like to reuse its spot for another Spaceship.
Sometimes, we remember a Spaceship's index in the Vec. Later, we might want to know if that Spaceship is still there, or if it's been reused. Here's how!
Next to every object in the vector, we have an integer: Vec<(Spaceship, i64)>. That i64 is the Spaceship's current generation number. Every time we reuse a slot, we increment that number.
Whenever we want to remember a Spaceship's index, we'll also remember its generation number at the time. This is the remembered generation number.
For convenience, let's put the index and this "remembered generation number" in a little struct called a GenerationalIndex:
struct GenerationalIndex {
index: i64,
remembered_generation: i64
}
struct Missile {
enemy_ref: GenerationalIndex
}
Now, if we want to know whether the Spaceship still exists, we just compare the current generation number to the remembered generation number, like:
if enemies[missile.enemy_ref.index] == missile.enemy_ref.remembered_generation {
// Enemy still exists!
let enemy = &enemies[missile.enemy_ref.index];
...
}
It's as if the generational index is saying:
and the element at index 7 says:
or instead:
That's a generational index. It's like a weak reference!
However, there is one downside to generational indices: to "dereference" it, we need access to the containing Vec (like the enemies Vec above), usually passed in through a parameter.
This can sometimes be inconvenient: when we add a new parameter, we have to change our callers to supply it, and then our callers' callers, and our callers' callers' callers, which can result in "refactoring shockwaves". Doing this too often can cause churning APIs.
Sometimes, to work around this drawback, we give up and put all of our containers into a "god object" and pass that around as an argument to every function in our codebase.
Perhaps there's a better way to address this drawback. Read on!
Vale adds something like the generational index, which we call a generational reference.
Every object has a "current generation number" next to it in memory. Whenever we destroy an object, we increment that number. 16
To make a weak reference to that object, we get two things:
...and stick them together.
To know whether the object still exists, Vale just compares the object's current generation number to our reference's remembered generation number.
Similar to the generational index, It's as if the reference is saying:
and the person who opens the door says:
or instead:
We implemented generational references, and found them to be at least 2.3x faster than reference counting!
Compared to the other weak reference approaches, this has benefits:
And costs:
We foresee the average Vale program blending three different approaches:
By offering these all in the standard library, we can make it easy to have fast weak references. That's good, because Vale's goal is to make speed and safety easier than ever before.
Or a C++ std::vector<Spaceship> or a Java ArrayList<Spaceship>.
Note that we don't free() it to the operating system quite yet, because we need that generation number to stick around so we can compare it to weak references' remembered generation numbers. Later on, we release the physical memory back to the OS using some virtual memory techniques. See Mesh for more!
Since it's not an index, we don't need access to any container!
Reference-counting approaches such as Python suffer from this. We give a reference to C code, the C code has to manually remember to increment/decrement it. Forgetting to do so will corrupt the Python object. No such incrementing is needed for generational references!
See Mesh, an algorithm for merging virtual addresses into one underlying physical page.
Thanks for reading! In the coming weeks, I'll be writing about how we can augment generational references with an "automatic borrow checker," so subscribe to our RSS feed, twitter, or the r/Vale subreddit, or come hang out in the Vale discord.
- Evan Ovadia