LanguagesArchitecture
Note: Regions are still a work-in-progress. Part 1 has been successfully prototyped, but parts 2-5 are only a preview describing how we expect them to work in practice, to show where we're headed and what we're aiming for. They could surpass our wildest expectations, or they could shatter and implode into a glorious fireball, who knows! Follow along as we implement all this, and reach out if anything isn't clear! 0 1

Vale has an ambitious goal: to be fast, memory safe, and most importantly, easy. There are a lot of stellar languages that have two, and we suspect it's possible to really maximize all three.

To do this, we're harnessing a new concept called regions.

In Part 1 we saw how we can use pure functions to easily immutably borrow data to make it faster to access.

Part 2 showed us how we could more precisely create regions via isolates, and immutably borrow them too.

Part 3 showed us how we can get the benefit of isolates with many more kinds of data.

Let's kick it up a notch, and use regions to immutably borrow part of an object while being able to modify the rest of it.

This pattern is incredibly versatile, and helps us eliminate memory safety overhead for iterating over collections, accessing private data, and even entire architectures such as entity-component-system.

A simple example

Later on, we'll show how to use this for arrays, hash maps, and larger data structures.

First, let's see how we can use regions to make zero-cost iteration of a linked list.

Here's a singly-linked list of Ships.

struct ShipListNode { ship Ship; next priv vary ?^ShipListNode; 2 } struct Ship { name str; hp int; }

Here we iterate over it. There's a much cleaner way to do this, but we'll be verbose here for clarity.

Iterating over this list incurs a few generation checks:

  • maybe_cur.NonEmpty()
  • maybe_cur.Expect()
  • cur.ship
  • ship.hp
  • ship.name
  • cur.next

exported func main() { head = Some( ^ShipListNode( Ship("Serenity", 10), Some( ^ShipListNode( Ship("Raza", 22), None)))); maybe_cur = head; while maybe_cur.NonEmpty() { cur = maybe_cur.Expect(); ship = cur.ship; set cur.hp -= 5; println("Damaged {cur.name}!"); maybe_cur = cur.next; } }

Generation checks usually aren't a significant source of overhead, for various reasons. 3 But if we want to squeeze every ounce of performance out of this part of the program, and the profiler tells us that this area of the code is worth optimizing, we can bring out our region skills to get the job done.

The first question to ask is: which parts of my data shouldn't change right now?

The data in the contained Ship is changing, when we do set cur.hp -= 5.

The ShipListNodes themselves don't seem to be changing though. Perhaps we can put them in a region?

But... the ShipListNode contains a Ship inline. Can we have a struct in one region contain a struct in another one?

Yes we can!

A struct in two worlds

Here are those same structs, but now ShipListNode has some region markers:

Note the ship a'Ship. The a' here means that this data, even though it's inline, is still part of another region.

struct ShipListNode<a'> { ship a'Ship; next priv vary ?^ShipListNode<a'>; } struct Ship { name str; hp int; }

Here, we put the list into an isolate with '. We specify self' for the Ships to tell the compiler that they're in main's region.

head is of type '?^ShipListNode<main'>.

And now, we borrow it immutably, using .imm. This makes maybe_cur and cur both immutable, which eliminates the generation checks from:

  • maybe_cur.NonEmpty()
  • maybe_cur.Expect()
  • cur.ship
  • cur.next

There are still a couple generation checks: ship.hp and ship.name.

In this example, the compiler actually eliminates these too with static analysis, because it knows they are owned by a region that's currently immutable.

This is pretty common; a region's immutability often helps optimize things around it.

exported func main() { head = 'Some( ^ShipListNode( main'Ship("Serenity", 10), Some( ^ShipListNode( main'Ship("Raza", 22), None)))); maybe_cur = head.imm; while maybe_cur.NonEmpty() { cur = maybe_cur.Expect(); ship = cur.ship; set cur.hp -= 5; println("Damaged {cur.name}!"); maybe_cur = cur.next; } }
Side Notes
(interesting tangential thoughts)
0

If anything isn't clear, feel free to reach out via discord, twitter, or the subreddit! We love answering questions, and it helps us know how to improve our explanations.

1

We're aiming to complete regions by early 2024, check out the roadmap for more details.

2

?X means "Option", and ^ means "on the heap", so this is an optional ShipListNode on the heap.

3

A couple reasons:

  • They're perfectly predicted; the language always knows which way the CPU should speculatively execute.
  • The generations are usually on the same cache line as the data itself.

Most generic structures are multi-region objects

If we made the above list into a generic struct, it would look like this.

struct ListNode<T> { ship T; next priv vary ?^ListNode<T>; }

It looks like an ordinary generic struct; there's not even any region markers.

That's because in Vale, T actually includes three things:

  • The type, such as Ship.
  • The ownership, whether it be owned, heap-owned ('^'), non-owning ('&'), or weak ('weak&')
  • The region.

When someone says ListNode<&myiso'Ship>, T is: non-owning (&) reference to a Ship from region myiso.

If T is a x'Ship, that means ListNode owns data in another region, just like we saw with ShipListNode.

So really, any generic struct might own data in another region.

Conclusion

Every array, list, hash map, and other generic container in Vale is using multi-region data under the hood.

This is incredibly powerful, because it lets us freeze the container while accessing the contained data, such as we saw in the above ShipListNode, and makes our entire program much faster. 4

Between pure functions, isolates, and multi-region objects, we can eliminate the vast majority of memory safety overhead for our programs.

The best thing about all of these mechanisms is that they are opt-in:

  • A programmer can write a complete Vale program without ever learning about regions or multi-region-objects.
  • A programmer can ignore any region markers and still understand the code; regions don't affect a program's semantics.

This is consistent with Vale's philosophy of avoiding forced complexity.

Next up is Part 5, where we talk about how we can make iteration much faster, and how to use regions to make entire architectures (such as entity-component-system) zero-cost.

That's all for now! We hope you enjoyed this article. Stay tuned for the next article, which shows how one-way isolation works.

If you're impressed with our track record and believe in the direction we're heading, please consider sponsoring us on GitHub!

With your support, we can bring regions to programmers worldwide.

See you next time!

- Evan Ovadia

4

Draft TODO: estimate how many checks are eliminated just from the stdlib doing this

Vale's Vision

Vale aims to bring a new way of programming into the world that offers speed, safety, and ease of use.

The world needs something like this! Currently, most programming language work is in:

  • High-overhead languages involving reference counting and tracing garbage collection.
  • Complex languages (Ada/Spark, Coq, Rust, Haskell, etc.) which impose higher complexity burden and mental overhead on the programmer.

These are useful, but there is a vast field of possibilities in between, waiting to be explored!

Our aim is to explore that space, discover what it has to offer, and make speed and safety easier than ever before.

In this quest, we've discovered and implemented a lot of new techniques:

  • Generational Memory, for a language to ensure an object still exists at the time of dereferencing.
  • Higher RAII, a form of linear typing that enables destructors with parameters and returns.
  • Fearless FFI, which allows us to call into C without risk of accidentally corrupting Vale objects.
  • Perfect Replayability, to record all inputs and replay execution, and completely solve heisenbugs and race bugs.

These techniques have also opened up some new emergent possibilities, which we hope to implement:

  • Region Borrow Checking, which adds mutable aliasing support to a Rust-like borrow checker.
  • Hybrid-Generational Memory, which ensures that nobody destroys an object too early, for better optimizations.
  • Seamless concurrency, the ability to launch multiple threads that can access any pre-existing data without data races, without the need for refactoring the code or the data.
  • Object pools and bump-allocators that are memory-safe and decoupled, so no refactoring needed.

We also gain a lot of inspiration from other languages, and are finding new ways to combine their techniques:

  • We can mix an unsafe block with Fearless FFI to make a much safer systems programming language!
  • We can mix Erlang's isolation benefits with functional reactive programming to make much more resilient programs!
  • We can mix region borrow checking with Pony's iso to support shared mutability.

...plus a lot more interesting ideas to explore!

The Vale programming language is a novel combination of ideas from the research world and original innovations. Our goal is to publish our techniques, even the ones that couldn't fit in Vale, so that the world as a whole can benefit from our work here, not just those who use Vale.

Our medium-term goals:

  • Finish the Region Borrow Checker, to show the world that shared mutability can work with borrow checking!
  • Prototype Hybrid-Generational Memory in Vale, to see how fast and easy we can make single ownership.
  • Publish the Language Simplicity Manifesto, a collection of principles to keep programming languages' learning curves down.
  • Publish the Memory Safety Grimoire, a collection of "memory safety building blocks" that languages can potentially use to make new memory models, just like Vale combined generational references and scope tethering.

We aim to publish articles biweekly on all of these topics, and create and inspire the next generation of fast, safe, and easy programming languages.

If you want to support our work, please consider sponsoring us on GitHub!

With enough sponsorship, we can:

  • Work on this full-time.
  • Turn the Vale Language Project into a 501(c)(3) non-profit organization.
  • Make Vale into a production-ready language, and push it into the mainstream!