Borrow checking, RC, GC, and the Eleven (!) Other Memory Safety Approaches

The Memory Safety Grimoire, Part 1

April 24, 2024 — Evan Ovadia

A fellow named Zeke came into my server one day.

Zeke: "Wait, so with generational references, we now have four ways to do memory safety?"

Evan: "In fact, there are fourteen by my count. Maybe more!" 0

Zeke: "Fourteen?!"

I've gotten so used to it that it's not surprising to me anymore, so it's always a delight to vicariously feel people's surprise when I tell them this.

Evan: "Indeed," and I proceed to show him the grimoire 1 that I've kept secret all these years.

Zeke: "How did you find all these?!"

At this point, I likely told him some nonsense like "I just kept my eyes open and collected them over the years!" but I think that you, my dear reader, deserve to know the truth! 2

This article is the introduction to my secret collection of memory safety techniques, which I call the memory safety grimoire.

With this wisdom, one can see the vast hidden landscape of memory safety, get a hint about where the programming world might head in the next decades, and even design new memory safety approaches. 3

If you like this topic, check out this Developer Voices episode where Kris Jenkins and I talked about linear types and regions!

Side Notes

(interesting tangential thoughts)

0

And I'd bet that someone on reddit or HN will comment on some I haven't heard before, and I'll have to change the title and add to the list!

1

A grimoire is a cursed spellbook, like the necronomicon.

However, those of weak wills should be careful not to read grimoires... they might end up pursuing the dark arts for years.

2

Or perhaps this entire article is just a clever ruse, the mask behind the mask, and the truth still remains a secret.

3

Such as for C++!

A Curséd Tome, and Ancient Memory Safety

(Or skip to the list!)

Mid-2023, a team of archaeologists discovered an ancient Mayan city in Campeche. I was on the ground as the team's consulting software engineer, as the Mayans were known as some of the earliest and most powerful practitioners of the software arts.

The hours are long and there's always the chance of being kidnapped and ransomed by the marauding bands of white-nosed coatis, 4 but nobody asks me if I can add an RSS feed to a DBMS, so there's that. 5

Our leader Ivan was hoping that this ancient city might contain a fifth memory safety tome, similar to the ancient borrow checking codex that Graydon Hoare found in the Kukulkan pyramid in 2016.

We made it to the central pyramid, and discovered a pedestal with a tattered tome, surprisingly intact after all this time. 6

With my heart pounding, I approached the pedestal, looking closer at the inscriptions on the front. Sure enough, it had the Mayan symbols for "stack frame", "reference", and "region" on the front! We'd found it!

4

The coatis probably wouldn't be so endangered if they robbed banks instead, like monkeys and snakes do. Their 180-degree ankles would be invaluable for heisting.

5

Free cookies to anyone who got the reference!

6

Our team lead Ivan Spracj explained that modern paper degrades much more quickly than the amate paper the Mayans used. Their ancient techniques were better than ours, both in paper and memory safety.

Blending Techniques

Before this, we only had three choices for memory safety, each with its own tradeoffs:

Garbage collection 7 is easy and flexible and has high throughput, but uses more energy and more memory and has nondeterministic pauses.
Reference counting is simple and uses less memory, but is slow and can leak cycles. 8
Borrow checking is faster and allows for aliasing inline data 9, but can cause complexity and can't do patterns like observers, intrusive data structures, many kinds of RAII, etc. 10

But we've long suspected that the Mayans had ways to blend these together at a more fundamental level.

We've certainly tried blending them before, and we've even had some success. For example, to work around the borrow checker in Rust we can put an object in an Rc<RefCell<T>> to make it reference counted, though that just defers the borrow check to run time, when we call borrow or borrow_mut on the contents. Did the Mayans have a way to truly blend the two approaches, as hinted at on the Mayan Sak Tz'i' tablets? 11
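
To make the Rc<RefCell<T>> point concrete, here is a minimal Rust sketch. The function name `shared_list_demo` is made up for illustration; the calls themselves (`Rc::clone`, `borrow`, `borrow_mut`, `try_borrow_mut`) are from the standard library. It shows that the aliasing rule has not gone away, it is just checked when the code runs.

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// Hypothetical helper: returns (was the conflicting borrow rejected?, final contents).
fn shared_list_demo() -> (bool, Vec<i32>) {
    // Two handles to the same reference-counted, interior-mutable vector.
    let a = Rc::new(RefCell::new(vec![1, 2]));
    let b = Rc::clone(&a);

    // Mutating through either handle is fine while no other borrow is active.
    b.borrow_mut().push(3);

    // The rule is still enforced, only at run time:
    // while `reader` is alive, a mutable borrow is refused.
    let reader = a.borrow();
    let conflict_rejected = b.try_borrow_mut().is_err();
    drop(reader);

    let contents = a.borrow().clone();
    (conflict_rejected, contents)
}
```

Calling `borrow_mut` instead of `try_borrow_mut` at the conflicting point would panic, which is the run-time cost of this particular blend.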

Some ancient writings also describe a way to make fast languages that are not only safe but also correct in a way that no languages are today 12, functional languages that use neither GC nor RC under the hood, 13 and ways we can have unbounded reference counted objects without a count integer. 14 How did they do all this?

This is why we were so excited to find this tome. Perhaps it had the answers!

7

By "garbage collection", we're referring to tracing garbage collection.

8

With good use of weak references, one can avoid the leaks.

9

"Inline data" means we can have a struct's memory live on the stack, or inside another struct's memory, or directly in an array next to the other elements' memory. This is the default in C, and impossible in e.g. JavaScript.

10

Luckily in Rust we can work around this limitation with reference counting!

11

Below I talk about how we can use regions to blend borrowing and reference counting to get the benefits of both worlds.

12

Except for Austral! It's safe because of borrow checking, and correct because it adds linear types. More on this below.

13

This is referring to Kindelia's HVM project!

14

See "Linear reference counting" below.

Impossibly Prophetic

As I deciphered the first few pages, I was shocked to find that it was referencing things that hadn't happened yet.

It referred to elucent's Basil language, Fernando's Austral, and Marco's Forty2 language... and yet carbon dating tells us that the book is hundreds of years old!

Somehow, the Mayans were looking forward in time to techniques that were invented by people alive today. So maybe the Mayans were just time-traveling collectors, and this tome contains techniques from the recent past...

...and also the recent future. There's an entire half of this tome that seems to build on strange higher concepts that don't exist in our world yet. 15 Other pages seem to mention laws we haven't yet discovered. 16

15

Almost as if their CPU designs were slightly different than our own, in a key way that unlocked more possibilities. I'm still scratching my head on this one.

16

And there's one technique that I've tried to re-engineer thirty-one (!) times without success: Hybrid-Generational Memory, one of my most elusive goals. I've gotten close by combining regions and generational references, but not in the automatic way that the Mayans seem to describe. Perhaps one of you can solve the rest of this puzzle!

The List

To keep this post short, 17 I'll assume the reader knows the basics of reference counting and (tracing) garbage collection, and how borrow checking works at a high level.

Even so, this is a very dense, compact overview of each technique, mainly meant as a starting point for further reading.

Don't worry, I'll be posting many follow-up posts (RSS, subreddit) that describe each one much more clearly and how it fits into the larger puzzle. 18

Afterward, I'll also talk about the interesting gaps in the puzzle, and the hints that might lead to discovery there.

Without further ado, here's the list!

17

Because I've wasted all this space with the dramatic buildup, my bad!

18

I unfortunately can't give a timeline on this though, my health is a bit unstable lately.

1: Move-only programming was the most surprising one to me. In this, every object can only be known to one variable (or field or array element), and that one "owner" can give it up to transfer it to another variable, field, array element, or function parameter or return. In Java terms, only one reference can point to an object at any given time.

Some of you may recognize these as affine types (and also kind of linear types), but will be surprised to learn that we can write entire programs like this, without making any more references to any objects, as long as we're willing to go through some acrobatics. 19

Various languages build on this with different mechanisms: Rust adds borrow checking, Austral adds borrow checking and linear typing, and Vale has the linear-aliasing model. I suspect the tome is hinting at other possible blends too. 20
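
As a rough illustration of the "acrobatics", here is a small Rust sketch that uses only moves in user code. The `Ship` type and both function names are invented for this example. To work on a value stored in a hash map, we take it out, use it, and put it back, rather than holding a reference to it. (The standard library's own `remove` and `insert` methods do borrow the map internally for the duration of the call, so this is an approximation of the style, not a pure move-only language.)

```rust
use std::collections::HashMap;

// No Clone or Copy: each Ship has exactly one owner at a time.
#[derive(Debug, PartialEq)]
struct Ship {
    name: String,
    fuel: u32,
}

// Takes ownership, then hands ownership back through the return value.
fn refuel(mut ship: Ship, amount: u32) -> Ship {
    ship.fuel += amount;
    ship
}

// Hypothetical demo: the map is moved in and moved out again.
// The ship is temporarily removed so that we can work on it.
fn refuel_in_map(mut fleet: HashMap<u32, Ship>, id: u32) -> HashMap<u32, Ship> {
    if let Some(ship) = fleet.remove(&id) {
        let ship = refuel(ship, 10);
        fleet.insert(id, ship);
    }
    fleet
}
```

The price is visible in the signatures: anything a function wants to touch has to be passed in and returned.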

19

For example, you can't just point to something that is currently in a hash map... you first have to temporarily remove it so you can read it.

20

Specifically, if we can add Pony-style val to it, we might get an interesting result.

2: Reference counting is fairly mainstream. Some will be surprised to learn that it can coexist with tracing GC (like in Python and Nim), and we also learned that there's a whole spectrum between the two.

And as it turns out, reference counting can be blended with immutable region borrowing to greatly reduce its cache misses and make it data-race safe, something no language has done yet. 21 22

For those who are into the more functional-programming side of things, I'm really interested in what Koka's doing with Perceus.
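
For readers who want to see the cycle problem and the weak-reference workaround from the intro in code, here is a minimal Rust sketch using the standard `Rc` and `Weak` types. The `Node` type and the function name are invented for illustration. The parent holds a strong reference to the child, while the child only holds a weak back-reference, so the two never keep each other alive.

```rust
use std::cell::RefCell;
use std::rc::{Rc, Weak};

struct Node {
    // Strong edge: keeps the child alive.
    child: RefCell<Option<Rc<Node>>>,
    // Weak back-edge: does not keep the parent alive.
    parent: RefCell<Weak<Node>>,
}

impl Node {
    fn new() -> Node {
        Node { child: RefCell::new(None), parent: RefCell::new(Weak::new()) }
    }
}

/// Hypothetical demo: returns (parent strong count, parent weak count,
/// whether the child can tell that the parent is gone after it is dropped).
fn weak_back_edge_demo() -> (usize, usize, bool) {
    let child = Rc::new(Node::new());
    let (strong, weak) = {
        let parent = Rc::new(Node::new());
        *parent.child.borrow_mut() = Some(Rc::clone(&child));
        *child.parent.borrow_mut() = Rc::downgrade(&parent);
        (Rc::strong_count(&parent), Rc::weak_count(&parent))
    }; // `parent` is dropped here, because only a weak reference pointed back at it

    // Upgrading the weak reference fails once the parent has been destroyed.
    let parent_gone = child.parent.borrow().upgrade().is_none();
    (strong, weak, parent_gone)
}
```

If the back-edge were a strong `Rc` instead, both counts would stay above zero forever and neither node would be freed.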

21

I couldn't resist prototyping this, so the Vale compiler actually has a flag that switches Vale from generational references to RC so we can see this in action. See this experimental repo for more.

22

Nim could theoretically do this, but alas, I was unable to convince Araq that it was possible. I also thought for a while that Rust could do this, but it's unfortunately foiled by the RefCell escape hatch.

3: Borrow checking lets our code be as fast as C and even almost as safe as Haskell. 23 It works by ensuring that we only use pointers temporarily (in certain scopes) and in restricted ways to ensure others won't change the data that you're reading.

Austral takes it a step further: it's not only safe but also correct, because it adds liveness via linear types, which any code can use to ensure that some future action will happen. This is a pattern I call Higher RAII in Vale, but I think it naturally occurs in any language with linear types. 24
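
Rust only has affine types, so it can't express this directly, but a run-time approximation gives the flavor. In the sketch below, the `Transaction` type and its methods are invented for illustration: the only way to get rid of a `Transaction` without a panic is to pass it to `commit`. A language with real linear types would reject the forgetful program at compile time instead of panicking.

```rust
/// Hypothetical token representing a promise that `commit` will be called.
struct Transaction {
    log: Vec<String>,
    finished: bool,
}

impl Transaction {
    fn begin() -> Transaction {
        Transaction { log: Vec::new(), finished: false }
    }

    fn record(&mut self, entry: &str) {
        self.log.push(entry.to_string());
    }

    /// Consumes the transaction; this is the "future action" we want to guarantee.
    fn commit(mut self) -> Vec<String> {
        self.finished = true;
        std::mem::take(&mut self.log)
    }
}

impl Drop for Transaction {
    fn drop(&mut self) {
        // Run-time stand-in for a compile-time linearity check.
        if !self.finished && !std::thread::panicking() {
            panic!("Transaction dropped without commit()");
        }
    }
}
```

Note that `commit` can take arguments and return values, which an ordinary destructor cannot; that is a large part of what makes the pattern useful.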

If you're curious for more, check out this Developer Voices episode where Kris Jenkins interviewed me on linear types and higher RAII.

23

I say almost because Rust's single-ownership nature sometimes introduces failure conditions that wouldn't exist in Haskell.

24

Haskell can have linear types too! I foresee a future where functional programming and linear types are actually the best choice for safety-critical systems that can handle nondeterministic GC pauses. And the Mayans mention a new way that we can nearly eliminate those pauses...

4: Arena-only programming is where we never use malloc or free, and always use arenas instead, even for function returns. This is a familiar paradigm to users of C, Ada, Zig, and especially Odin which has a way to automatically decouple code from allocator strategy.

As described, this is more of a memory management approach than a memory safety approach. However, Cyclone and Ada/SPARK show us that we can track which pointers are pointing into which arenas, to prevent any use-after-frees. Verona shows us that by combining arenas with regions (described below), we can take things even further. 25
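
Here is a minimal sketch of the arena style in Rust, with invented names (`Arena`, `Handle`, `Expr`). Objects are only ever added to the arena, they refer to each other through handles, and everything is released together when the arena is dropped. This version uses plain indices, so it does not track which arena a handle belongs to; that tracking is the part Cyclone and Ada/SPARK add.

```rust
/// A minimal typed arena backed by a Vec.
struct Arena<T> {
    items: Vec<T>,
}

/// A handle is just an index; this sketch does not tie it to a specific arena.
#[derive(Clone, Copy)]
struct Handle(usize);

impl<T> Arena<T> {
    fn new() -> Self {
        Arena { items: Vec::new() }
    }

    fn alloc(&mut self, value: T) -> Handle {
        self.items.push(value);
        Handle(self.items.len() - 1)
    }

    fn get(&self, h: Handle) -> &T {
        &self.items[h.0]
    }
}

enum Expr {
    Num(i64),
    Add(Handle, Handle),
    Mul(Handle, Handle),
}

fn eval(arena: &Arena<Expr>, h: Handle) -> i64 {
    match arena.get(h) {
        Expr::Num(n) => *n,
        Expr::Add(a, b) => eval(arena, *a) + eval(arena, *b),
        Expr::Mul(a, b) => eval(arena, *a) * eval(arena, *b),
    }
}

/// Hypothetical demo: builds (2 + 3) * 4 in one arena and evaluates it.
fn build_and_eval() -> i64 {
    let mut arena = Arena::new();
    let two = arena.alloc(Expr::Num(2));
    let three = arena.alloc(Expr::Num(3));
    let four = arena.alloc(Expr::Num(4));
    let sum = arena.alloc(Expr::Add(two, three));
    let product = arena.alloc(Expr::Mul(sum, four));
    eval(&arena, product)
    // The arena is dropped on return, releasing every node at once.
}
```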

25

We could also combine arena-only programming with generational references, regions, constraint references, or MMM++.

5: Ada/SPARK has a mechanism where a pointer cannot point to an object that is more deeply scoped than itself. If you imagine the stack growing to the right, 26 pointers are only allowed to point left. If you really stretch your brain, this has some similarities to mutable value semantics or borrow checking. 27

26

"But Evan, stacks grow down!" Listen here, no they don't; nobody knows the orientation of the RAM chip in the computer.

27

From fghvbnvbnfe.

6: Regions are surprisingly flexible and powerful. I first learned about them from Pony's iso keyword: an iso'd object (and its contents) are only reachable from your pointer; nobody else points at that object or anything the object contains. In other words, it establishes an isolated subgraph of objects, and you hold the only reference to the entire thing.

Colin Gordon showed how we can "temporarily open" an isolated subgraph for a given scope, and afterward it would still be properly isolated.

I later wrote an article about how we could temporarily open an isolated subgraph and see it as an immutable region so to speak 28 to completely eliminate the memory safety cost for references pointing into that immutable region.

In that article I also explored how we can use a pure function to temporarily reinterpret all pre-existing regions as immutable, removing a vast amount of overhead. In 2023, we completed the first prototype showing this in action for generational references. It helps reference counting approaches too.

Other languages such as Forty2 and Verona are going all-in on regions, and you'll see why further below.

I'm thinking about separating "regions" into two separate simpler concepts in my writing: regions (a set of objects that can freely point to each other) and region views (a mutable or immutable view of a region). Region views are really what unlock regions' potential, I think. If anyone has opinions, drop me an email!

28

"Explicit locking" in the linked article.

7: Stack arenas is an approach that's spiritually similar to arena-only programming, but it's automatic and does it for every stack frame. Elucent's Basil used to do this! It wasn't efficient, but the Mayans mention something about combining it with other techniques to make it a lot faster. I can kind of see what they mean 29 but I haven't seen anyone try it yet.

29

I think it's compatible with reference counting, and I'm fairly certain it's compatible with linear types.

8: Generational References is a technique that prevents use-after-frees by telling us whether a pointer is pointing at a valid object, by comparing a pointer's accompanying "remembered" generation number to the "current" generation number living in the object it's pointing at. We increment an object's generation number whenever we want to destroy it, preventing any future accesses. This approach is made much faster by regions: we never have to do that comparison if the object is in a temporarily immutable region, which we can establish with a pure function or block.

I hope other languages start using the generational references approach. There are a couple attempts in Rust (here and here), but the language's rules prevent them from doing the faster variant described below.
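
The core check is small enough to sketch in Rust. Everything below (`Heap`, `Slot`, `GenRef`) is invented for illustration, and it keeps the objects in a side table for simplicity, whereas the description above has the generation living in the object itself. The idea is the same: each reference remembers a generation, destroying an object bumps the current generation, and a mismatch means the reference is stale.

```rust
/// One slot: the object (if alive) plus its current generation.
struct Slot<T> {
    generation: u64,
    value: Option<T>,
}

/// A reference that remembers which generation it expects to find.
#[derive(Clone, Copy)]
struct GenRef {
    index: usize,
    remembered_generation: u64,
}

struct Heap<T> {
    slots: Vec<Slot<T>>,
    free_list: Vec<usize>,
}

impl<T> Heap<T> {
    fn new() -> Self {
        Heap { slots: Vec::new(), free_list: Vec::new() }
    }

    fn alloc(&mut self, value: T) -> GenRef {
        let index = match self.free_list.pop() {
            // Reuse a released slot; its generation was already bumped.
            Some(i) => {
                self.slots[i].value = Some(value);
                i
            }
            None => {
                self.slots.push(Slot { generation: 0, value: Some(value) });
                self.slots.len() - 1
            }
        };
        GenRef { index, remembered_generation: self.slots[index].generation }
    }

    /// Destroying bumps the generation, invalidating every outstanding GenRef.
    fn free(&mut self, r: GenRef) -> Option<T> {
        let slot = &mut self.slots[r.index];
        if slot.generation != r.remembered_generation {
            return None; // stale reference: the object is already gone
        }
        slot.generation += 1;
        self.free_list.push(r.index);
        slot.value.take()
    }

    /// The generation check: compare remembered and current before dereferencing.
    fn get(&self, r: GenRef) -> Option<&T> {
        let slot = &self.slots[r.index];
        if slot.generation == r.remembered_generation {
            slot.value.as_ref()
        } else {
            None
        }
    }
}
```

A stale `GenRef` keeps failing its check even after the slot has been reused for a new object, which is exactly the use-after-free case we want to catch.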

9: Random Generational References is a faster variant that lets us have "inline data", in other words it lets us put structs on the stack and inside arrays or other structs. This is similar to memory tagging, but much more reliable because of a wider tag (64 bits instead of 4) and when paired with perfect determinism, 30 more secure. 31

This improvement is exciting to me because it lets the generation live right next to the object, and lets both live anywhere: on the stack, in an array, or inline inside another object. This makes it much faster in theory, because it means a program will incur fewer cache misses.

One can even blend this with a technique that can reduce generation checks to zero where desired and regions for eliminating them everywhere else.

I think this blend has a lot of potential, because it has the strengths of C++ (architectural simplicity 32) and Rust (memory safety) while being simpler and easier than both.

But I'm a bit biased of course, as any human would be about their own idea!

30

Perfect determinism is where the language doesn't introduce any features (e.g. reading uninitialized memory or casting pointers to integers) that could let nondeterminism leak into the program's logic. It's required for perfect replayability.

31

Specifically, it means that we can defend against side-channel attacks at the program's architectural level, by never letting any nondeterminism leak into any untrusted code.

32

This means we can organize our program how we want, without interference from upwardly viral constraints (like async/await or borrow checking) or immutability concerns (like in functional programming). A litmus test for a language's architectural simplicity is whether you can implement a basic observer. Languages that don't have it will tend to have less stable APIs and a lot more refactoring.

10: MMM++ 33 is where objects are allocated from global arrays, and slots in those arrays are released and reused for other objects of the same type, thus avoiding use-after-free's normal memory unsafety problems. See Arrrlang for a simple theoretical example; one usually adds other techniques to make a real paradigm out of it. This is similar to how a lot of embedded, safety-critical, and real-time software works today, 34 though no language comprehensively enforces it yet.
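
A small Rust sketch of the idea, with invented names (`Pool`, `Particle`): a fixed-size pool for a single type, where released slots are recycled only for that same type. A stale index is still a logic bug, since it now refers to a different object, but it can only ever see a valid `Particle`, never freed or differently typed memory.

```rust
/// A fixed-capacity pool for one type. Slots are recycled, never handed to another type.
struct Pool<T> {
    slots: Vec<T>,
    free_list: Vec<usize>,
}

impl<T: Default> Pool<T> {
    fn with_capacity(n: usize) -> Self {
        Pool {
            // Every slot always holds a valid T, even when unused.
            slots: (0..n).map(|_| T::default()).collect(),
            // Reversed so that slot 0 is handed out first.
            free_list: (0..n).rev().collect(),
        }
    }

    /// Returns None when the pool is exhausted; the pool never grows.
    fn acquire(&mut self, value: T) -> Option<usize> {
        let i = self.free_list.pop()?;
        self.slots[i] = value;
        Some(i)
    }

    fn release(&mut self, i: usize) {
        self.free_list.push(i);
    }

    fn get(&self, i: usize) -> &T {
        &self.slots[i]
    }
}

#[derive(Default)]
struct Particle {
    x: i32,
}
```

This sketch does nothing to catch a double release or a stale index; those are the gaps where one would add the other techniques mentioned above.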

33

I'm sure this has a better name, someone let me know!

34

Including many servers, databases, and games. For example, TigerBeetleDB has a similar set of rules.

11: Tracing garbage collection is familiar to all of us, but there's a surprising twist: there's a secret way to make a garbage collector without the stop-the-world pauses! Pony does this: by separating each actor into its own little world, each actor can do its own garbage collection without stopping the other ones. Its ORCA mechanism then enables sharing data between the worlds via an interesting reference counting message passing mechanism.

Verona then takes this a step further by adding regions, which give the user more fine-grained control over when and where garbage collection might happen, and let them use a regular bump allocator for a region instead if they wish.

If Verona or a new language allowed us to set the maximum memory for a GC'd region, that would make the entire approach completely deterministic, solving the biggest problem for garbage collection (in my opinion).

Don't tell anyone I said this, but I believe that 30 years from now, this blend is going to be the most widely used paradigm for servers.

12: Interaction nets are a very fast way to manage purely immutable data without garbage collection or reference counting. The HVM runtime implements this for Haskell. HVM starts with affine types (like move-only programming), but then adds an extremely efficient lazy .clone() primitive, so it can strategically clone objects instead of referencing them. Check out its guide to learn more! 35

35

And if someone has a better explanation, please send it to me! I don't understand interaction nets that well. I think it's actually a blend of automatic borrowing and cloning, but the guide says it's not really borrowing, so I'm not sure.

13: Constraint references is a blend of reference counting and single ownership (in the C++ sense, unrelated to borrow checking). In this approach, every object has a single owner, doesn't necessarily need to be on the heap, and has a counter for all references to it. When we try to destroy the object, we just assert that there are no other references to this object.

This is used surprisingly often. Some game developers have been using this for a long time, and it can be used as the memory safety model for an entire language like in Gel. It supports a lot more patterns than borrow checking (intrusive data structures, graphs, observers, back-references, dependency references, callbacks, delegates, many forms of RAII, etc).

However, this checking is at run-time. Halting in release mode is often undesirable, so this technique shines the most when it's very targeted or when we can fall back to a different strategy in release mode.
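
Here is a rough emulation in Rust, with invented names (`Owner`, `ConstraintRef`). It borrows the counter from `Rc` and uses `Rc::try_unwrap`, which only succeeds when no other strong references exist. Two simplifications to keep in mind: this version puts the object on the heap, which the real technique does not require, and it returns an error instead of halting so that the failure case is easy to observe.

```rust
use std::rc::Rc;

/// The single owner of an object.
struct Owner<T>(Rc<T>);

/// A counted, non-owning reference. All of these must be gone
/// before the owner is allowed to destroy the object.
struct ConstraintRef<T>(Rc<T>);

impl<T> Owner<T> {
    fn new(value: T) -> Self {
        Owner(Rc::new(value))
    }

    fn make_ref(&self) -> ConstraintRef<T> {
        ConstraintRef(Rc::clone(&self.0))
    }

    /// Destroys the object and returns its contents, or reports how many
    /// constraint references are still pointing at it.
    fn destroy(self) -> Result<T, usize> {
        let outstanding = Rc::strong_count(&self.0) - 1;
        Rc::try_unwrap(self.0).map_err(|_| outstanding)
    }
}

impl<T> ConstraintRef<T> {
    fn get(&self) -> &T {
        &self.0
    }
}
```

A faithful version would assert in `destroy` rather than return `Err`, which is the halting behavior discussed above.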

14: Linear reference counting is an elusive concept, where we can completely eliminate the counter integer, and do all of the reference counting at compile time. No language can do this today, but there might be a way to get close with linear types. 36 Basically, we have two types of linear reference:

A "tine" reference remembers (in its type, at compile-time) L, the number of forks to get from the original value to here.
A "fork" reference holds the original value (or a tine reference 37), and remembers L and also N, the number of L+1 tine references created at the same time as this fork reference. Reclaiming the contents requires destroying this and all (N) of the L+1 tine references.

I don't expect anyone to understand that rushed explanation, but I hope to write an article soon on this! 38

I'm not actually sure what kind of architectural restrictions it might impose, how situational it is, or if it even works at all. It's just something I came up with--I mean uh, the grimoire mentions--as a hypothetical avenue to explore.

36

Matthieu's static-rc crate gets pretty close to this, but without linear types in Rust, it has to leak under certain conditions. It's quite possible that his work on static-rc inspired this idea!

37

This is another difference between this idea and the static-rc crate: I believe this will allow us to "borrow the borrow references" in objects, without lifetimes, rather than just on the stack. (Update: This might not be the case, see this thread for a more accurate comparison)

38

I also talked about it a bit on discord here and here and here and on Reddit here. (If someone knows how to make better publicly-accessible discord logs, let me know!)

15: Not-MVS is a very interesting approach. Imagine a Java or Swift where every object has exactly one reference pointing to it at any given time (similar to move-only programming) but that reference can be lent out to a function call. It's like a Rust with no shared references (&), only unique references (&mut) which can't be stored in structs. It's simple, fast, and powerful, though we may have to .clone() more often than even Rust programs.

Edit: I originally thought this is how Mutable Value Semantics worked, and I was totally wrong (thanks to dist1ll for the correction!). I'll leave it here as "Not-MVS" because I think what I described would probably still work as a memory safety approach.

For those curious about Mutable Value Semantics, dist1ll writes:

MVS as implemented by Hylo has multiple parameter passing modes. The immutable mode (which is equivalent to Rust’s &) is the default, but you can declare a mutable mode (i.e. &mut) with the inout keyword. The other two parameter passing modes are for transferring ownership (sink) and callee initialization (set).

16: CHERI is a hardware-software blend that can run languages like C in a memory-safe way. In CHERI, a pointer is represented as a 128-bit 39 type that contains an address range and permissions describing the operations that may be done with the pointer (and some other things), to achieve spatial memory safety. The hardware keeps track of whether something is a pointer or not via a 1-bit tag.
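
As a toy software model only, the bounds-and-permissions part can be sketched in Rust. The `Capability`, `Memory`, and `Fault` types and their fields are invented for this example and do not reflect CHERI's actual compressed encoding, its tag bit, or its instruction set. The sketch shows two things: every access is checked against the bounds and permissions that travel with the pointer, and a derived capability can only be narrower than the one it came from.

```rust
/// Toy capability: an address range plus permissions, carried with the pointer.
#[derive(Clone, Copy)]
struct Capability {
    base: usize,
    len: usize,
    can_load: bool,
    can_store: bool,
}

#[derive(Debug, PartialEq)]
enum Fault {
    OutOfBounds,
    PermissionDenied,
}

struct Memory {
    bytes: Vec<u8>,
}

impl Capability {
    /// Derives a narrower capability; it can never exceed its parent's range or rights.
    fn restrict(&self, offset: usize, len: usize, can_store: bool) -> Option<Capability> {
        let end = offset.checked_add(len)?;
        if end > self.len {
            return None;
        }
        Some(Capability {
            base: self.base + offset,
            len,
            can_load: self.can_load,
            can_store: self.can_store && can_store,
        })
    }
}

impl Memory {
    fn load(&self, cap: Capability, offset: usize) -> Result<u8, Fault> {
        if !cap.can_load {
            return Err(Fault::PermissionDenied);
        }
        if offset >= cap.len {
            return Err(Fault::OutOfBounds);
        }
        Ok(self.bytes[cap.base + offset])
    }

    fn store(&mut self, cap: Capability, offset: usize, value: u8) -> Result<(), Fault> {
        if !cap.can_store {
            return Err(Fault::PermissionDenied);
        }
        if offset >= cap.len {
            return Err(Fault::OutOfBounds);
        }
        self.bytes[cap.base + offset] = value;
        Ok(())
    }
}
```

In this model nothing stops a program from fabricating a `Capability` out of thin air; preventing that forgery is the job of the hardware tag described above.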

Check out David Chisnall's comment for some good explanation and clarifications!

Cornucopia adds temporal memory safety to that with special allocators that don't reuse memory in a page until it's empty and all of the existing capabilities have been revoked 40 via an application-wide memory sweep done concurrently in the background. Cornucopia Reloaded and CHERIoT are also mechanisms that bring use-after-free protections to CHERI.

If a new language used a system like this plus some techniques to prevent use-after-free on the stack, it could have a brand new memory safety model nobody's seen before.

39

Or 64-bit on 32-bit systems.

40

This is an important memory safety concept: Memory unsafety comes not from use-after-free, but use-after-reuse. In fact, even that's too loose; memory unsafety comes from "use after shape change", which I'll explain later in the grimoire.

17: Neverfree doesn't really count, but I'll mention it as a bonus item just for fun. Basically, just don't call free! If you never free memory, you can't use-after-free, which instantly solves the hardest part of memory safety. 41 The idea is from this famous email conversation:

Norman Cohen said:

The only programs I know of with deliberate memory leaks are those whose executions are short enough, and whose target machines have enough virtual memory space, that running out of memory is not a concern. (This class of programs includes many student programming exercises and some simple applets and utilities; it includes few if any embedded or safety-critical programs.)

Kent Mitchell replied:

I was once working with a customer who was producing on-board software for a missile. In my analysis of the code, I pointed out that they had a number of problems with storage leaks. Imagine my surprise when the customers chief software engineer said "Of course it leaks". He went on to point out that they had calculated the amount of memory the application would leak in the total possible flight time for the missile and then doubled that number. They added this much additional memory to the hardware to "support" the leaks. Since the missile will explode when it hits its target or at the end of its flight, the ultimate in garbage collection is performed without programmer intervention.

It kind of makes sense in a way. If you have a program that uses all the memory all the way until the end (like sort), why not skip the expensive frees and let the OS clean it up when the process exits? You can't use-after-free if you never free!
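
In Rust terms, this is roughly what the standard `Box::leak` function gives us: a reference that is valid for the rest of the program because the allocation is never released. The two helper functions below are invented for illustration.

```rust
/// Hypothetical helper: allocate a copy of the string and deliberately never free it.
fn keep_forever(word: &str) -> &'static str {
    Box::leak(word.to_string().into_boxed_str())
}

/// Returns the longest word of the input as a reference that can never dangle,
/// at the cost of leaking one allocation per word.
fn longest_word(text: &str) -> &'static str {
    let mut best: &'static str = "";
    for w in text.split_whitespace() {
        let kept = keep_forever(w);
        if kept.len() > best.len() {
            best = kept;
        }
    }
    best
}
```

The result outlives `text` itself, which would not compile if we returned a slice of the input under a `'static` lifetime.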

Wait a minute, this list goes to 17, yet the intro only mentions 14! I actually did that because a couple might overlap 42 and a couple of them are half-approaches 43, and that last one is just here for fun. Besides, as I learn more approaches and add them to the list, the title will get more and more out of date anyway.

41

One might need to also add some bounds checking and a few other measures, but it's a start!

42

Ada/SPARK might be a blend of MMM++ and arena-only programming, perhaps. I haven't used Ada/SPARK, so let me know!

43

It could be said that regions on their own aren't really a memory safety approach, and it could be said that arena-only programming is just a memory management technique. But hey, when you put those two halves together you get Verona's memory safety approach, so together they probably count as one.

What do we do with this avalanche of knowledge?

Perhaps someone, after reading this article, will go forth and design a new memory safety blend! It's not impossible, I even used this grimoire to make a theoretical blend for C++.

The world needs more memory safety blends and techniques! Especially ones that let us have better architectures, more simplicity, and fewer constraints. And who knows, searching for new techniques and blends might lead to interesting spinoff features, like Vale's perfect replayability and concurrency without data-coloring.

We often fall into a mental trap where we optimistically believe that we've solved everything there is to solve, and pessimistically believe there's nothing left to discover. That mental trap is a mind-killer, because we can't discover new things if we aren't open to their existence.

In fact, one of my favorite cognitive science tricks is to convince myself that there is a better solution and it's just barely within reach if I just give it a little more thought. For some reason, that removes the mental barriers and lets one truly, fully explore. 44

If someone were to ask me why we should keep looking, I'd show them the unique strengths of each paradigm:

RC's weak pointers let us easily know 45 when another object's logical lifetime has ended, which is a surprisingly common need when you're looking out for it.
C and C++ let us use intrusive data structures, like no other language can. 46
Austral and Vale's linear types allow for Higher RAII, which lets compilers prevent a lot of logic problems.
GC is the easiest, and depending on one's definitions, the safest too. 47

...and then I'd show them the whole list of approaches, and how many ways there are to blend them together.

With that in mind, it's pretty clear that memory safety is truly a wide-open world, waiting to be explored!

44

This is a great technique in algorithm interviews, by the way.

45

In other languages, we need some sort of central tracking data structure to pull this off.

46

GC'd languages generally don't allow long-lived references to inline data, and Rust's borrow checker prevents intrusive data structures.

47

It's a tricky topic. When one thinks not just about memory safety but about safety in general, a null-safe functional GC'd language has an edge over other approaches, even over borrow checking which forces long-term-referrable objects into central collections which have their own potential edge cases.

More in the Grimoire

The above list is not complete, of course. There are some half-deciphered hints and building blocks in the grimoire that might be able to assist memory safety models in new ways.

Beware: We don't know which of these techniques actually help memory safety, and which summon ancient demons. Proceed at your own risk!

Here's just a handful:

Type stability shows us that use-after-free isn't the enemy, but rather a simplistic approximation of the enemy. The real memory safety problems arise when we access some memory after we've released and reused it for something of a different type. 48
Final references (like in Java) can help a lot in designing memory safety models. I won't explain too much here, but email or discord me (Verdagon) and I can explain there. I dare not write publicly about what these unlock for memory safety, for what it would do to the world.
Unique References, in other words, guaranteeing that you have the only usable reference to an object, has been the key breakthrough in more approaches than I can count, including borrow checking, mutable value semantics, move-only programming, etc. There are even techniques that can make any object temporarily unique (like how Swift's inout works, or how in generational references we can just temporarily change the generation). 49
Change detectors is a mechanism that will track at run-time whether something's been changed. Java collections use a modCount to prevent modifying while iterating, and one could conceivably use this to assist in memory safety as well.
Check-on-set is a pattern where we check at run-time if we're allowed to modify an object in a certain way. The best example of this is how we can freeze a JavaScript object: whenever we modify an object, the runtime will assert it's not frozen. Anything that can guarantee an object is immutable could be used for a memory safety approach.
Thread isolation is where we guarantee that an object is only visible to one thread at a given time. This property has helped enable borrow checking, generational references, Vale's immutable region borrowing, and faster forms of reference counting. It's likely important to other potential memory safety approaches.
Page Headers are where an allocator can strategically put metadata about an object at the top of its 4096-byte page.
Fat pointers is where some other data always accompanies a pointer. The biggest example of this is Rust's trait references and Vale uses it for its generational references memory safety approach.
Top-byte ignore refers to how some CPUs ignore the top byte of any particular pointer, so you can conveniently put anything you want there. You can even simulate top-byte-ignore on other systems by manually masking that byte off before dereferencing. This can be used to e.g. store how far away the reference count integer is, so that you can point to the interior of a reference counted object. There are more arcane ways to use bits in the middle and end too. 50
And many more, hopefully in upcoming articles!
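
To make one of these building blocks concrete, here is a small Rust sketch of a change detector in the spirit of Java's modCount. The `TrackedList` and `Cursor` types are invented for illustration. The list counts its modifications, a cursor remembers the count it saw when it was created, and any later mismatch is reported instead of letting the cursor read a list that changed underneath it.

```rust
/// A list that counts its structural modifications.
struct TrackedList<T> {
    items: Vec<T>,
    mod_count: u64,
}

/// A cursor that remembers the modification count from when it was created.
struct Cursor {
    position: usize,
    expected_mod_count: u64,
}

impl<T> TrackedList<T> {
    fn new() -> Self {
        TrackedList { items: Vec::new(), mod_count: 0 }
    }

    fn push(&mut self, value: T) {
        self.items.push(value);
        self.mod_count += 1;
    }

    fn cursor(&self) -> Cursor {
        Cursor { position: 0, expected_mod_count: self.mod_count }
    }

    /// Returns Err if the list changed since the cursor was created,
    /// Ok(None) at the end of the list, and Ok(Some(item)) otherwise.
    fn next(&self, cursor: &mut Cursor) -> Result<Option<&T>, &'static str> {
        if cursor.expected_mod_count != self.mod_count {
            return Err("list was modified during iteration");
        }
        let item = self.items.get(cursor.position);
        if item.is_some() {
            cursor.position += 1;
        }
        Ok(item)
    }
}
```

The cursor deliberately holds no reference to the list; with a normal Rust iterator the borrow checker would reject the conflicting `push` at compile time, so this run-time check only matters for designs that avoid that kind of borrow.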

You would be surprised how many little tricks can be used to complete or assist new memory safety models.

If you know of any more memory safety techniques, or want to see in-progress decipherings, then come on over to the #grimoire channel in the Vale discord.

48

"Shape stability" takes that a step further: we can reuse e.g. an integer's memory for a float and still not trigger memory unsafety, so really the problem is when we confuse a pointer for an integer or vice versa.

49

Easter egg note!

William "Billy" Windsor I is a cashmere goat who served as lance corporal in the British Army's Royal Welsh 1st Battalion from 2001 until 2009.

He was demoted to rank fusilier for three months for inappropriate behaviour during the 2006 Queen's Official Birthday celebrations while on active duty with the battalion on Cyprus.

If you read this note, mention "that one lance corporal goat" anywhere on HN or reddit! Nobody will believe you.

(Cheers to cbsmith, kubanczyk, TheGoldenMinion, lovich, padraig_oh, and leksak for the last one!)

50

We can also use the lower bits if we know the alignment of the data we're pointing to. We might even be able to use the bits in the middle by manually specifying the address mmap should give us.

That's all!

I hope you enjoyed this article! It represents my findings after a decade of searching and designing, so I hope it helps a lot of people out there.

In the next post, I'll talk about how we can blend reference counting with some of the above techniques to drastically reduce its overhead and add fearless concurrency, so keep an eye out on our RSS feed, twitter, discord server, or subreddit!

Donations and sponsorships for Vale are currently paused, but if you like these articles, please Donate to Kākāpō Recovery and let me know. I love those birds, let's save them!

Cheers,

- Evan Ovadia

Thank you!

I want to give a huge thanks to Arthur Weagel, Kiril Mihaylov, Radek Miček, Geomitron, Chiuzon, Felix Scholz, Joseph Jaoudi, Luke Puchner-Hardman, Jonathan Zielinski, Albin Kocheril Chacko, Enrico Zschemisch, Svintooo, Tim Stack, Alon Zakai, Alec Newman, Sergey Davidoff, Ian (linuxy), Ivo Balbaert, Pierre Curto, Love Jesus, J. Ryan Stinnett, Cristian Dinu, and Florian Plattner (plus a very generous anonymous donor!) for sponsoring Vale over all these years.

Recent events may have forced me to stop coding Vale for a while and led me to pause donations and sponsorships, but your support all this time is still giving me spirit and strength! Things are looking up, and I hope to be back soon.