Vale has an ambitious goal: to be fast, memory safe, and most importantly, easy. There are a lot of stellar languages that have two, and we suspect it's possible to really maximize all three.
To do this, we're harnessing a new concept called regions.
In Part 1, we saw how we can use pure functions to easily "immutably borrow" a region, to eliminate memory safety costs when accessing it.
That works really well for the sections of our program that can be phrased as pure functions. 2
We can use regions to eliminate memory safety overhead in other places too, using a concept called "isolates".
An isolate is a hierarchy of data that nobody outside can point to, except for one owning reference outside that points at the root.
Some examples:
By default, nothing inside an isolate can point out either, though Part 3 shows how we can enable that using one-way isolation. Most private data can actually be expressed very cleanly with one-way isolation. 3
Part 4 also shows how one-way isolation can make certain patterns (iterating, determinants, etc.) and entire architectures (like entity-component-system) zero-cost. 4
For the rest of this post though, we'll focus on regular isolates, where nothing inside can point out.
Most programs are naturally hierarchies of isolated data, 5 no matter what language they're written in. We can use the ' syntax to help the compiler know where those isolates are, so it can make more optimal code. 6
One doesn't need to '-annotate everything, of course. Consistent with Vale's philosophy of avoiding forced complexity, isolates are opt-in.
One would '-annotate only the parts of their program that profiling suggests would benefit from optimization.
If anything isn't clear, feel free to reach out via discord, twitter, or the subreddit! We love answering questions, and it helps us know how to improve our explanations.
We're aiming to complete regions by early 2024; check out the roadmap for more details.
Which is quite a lot, really. Our sample roguelike game, a very stateful mutation-heavy program, actually spent the vast majority of its time inside pure functions.
pure functions use it under the hood as well, if you squint hard enough.
Together, isolates, pure functions, and one-way isolation combine to form something that looks suspiciously like an entire new programming paradigm... whether that's true remains to be seen!
More specifically, they're naturally hierarchies of one-way isolated data. Private data often points outward, especially in stateful code.
We can even take this to the extreme and ' everything, and we'd end up with data roughly the same shape as in Rust programs. In practice, there's a balance somewhere in between.
We can immutably borrow an isolate's data, allowing the compiler to skip generation checks when reading it. This can make our code faster.
This is the same mechanism used by pure functions, described in Part 1.
This allows us to get the optimization benefits of affine types (like those seen in Rust and Cyclone), but with a vital improvement: it doesn't have aliasing restrictions, and enables techniques and optimizations that require shared mutability. 7
First, here's an example before we use isolation. We'll add isolation to it afterward.
This is a simple Cannon. 8
Here, we see it fire on an enemy ship.
fire isn't defined here; we show it further below.
When it fires on something it first calculates its strength, based on a very complex algorithm.
This algorithm has a little bit of overhead: whenever we read from cannon, such as cannon.strength, it incurs a generation check.
Generation checks are rarely a source of significant slowdowns. However, if we're in the hot path of a performance-critical program, profiling might suggest that we optimize this function.
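To make that cost concrete, here's a rough Rust model of what a generation check amounts to conceptually. This is only a sketch, not Vale's actual runtime: Heap, Slot, GenRef, and their methods are hypothetical names invented for this illustration, and the memory layout is an assumption.

```rust
// Illustrative model of generational references; NOT Vale's real implementation.
// All names here (Heap, Slot, GenRef) are hypothetical.
struct Slot {
    generation: u64, // bumped whenever the object in this slot is freed
    strength: i64,
}

#[derive(Clone, Copy)]
struct GenRef {
    index: usize,
    remembered_generation: u64, // the generation seen when the reference was made
}

struct Heap {
    slots: Vec<Slot>,
}

impl Heap {
    fn new() -> Self {
        Heap { slots: Vec::new() }
    }

    fn alloc(&mut self, strength: i64) -> GenRef {
        self.slots.push(Slot { generation: 1, strength });
        GenRef { index: self.slots.len() - 1, remembered_generation: 1 }
    }

    // "Freeing" bumps the generation, so stale references become detectable.
    fn free(&mut self, r: GenRef) {
        self.slots[r.index].generation += 1;
    }

    // Every read compares generations first; that comparison is the
    // "generation check" that an immutable borrow lets us skip.
    fn read_strength(&self, r: GenRef) -> Option<i64> {
        let slot = &self.slots[r.index];
        if slot.generation == r.remembered_generation {
            Some(slot.strength)
        } else {
            None
        }
    }
}

fn main() {
    let mut heap = Heap::new();
    let cannon = heap.alloc(12);
    println!("{:?}", heap.read_strength(cannon)); // Some(12)
    heap.free(cannon);
    println!("{:?}", heap.read_strength(cannon)); // None
}
```

Each read pays for one comparison and branch. That is cheap, but in a hot loop it can add up, which is why proving the data can't change is worthwhile.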
Let's optimize!
If we can immutably borrow cannon, then we can skip those generation checks!
One way to do this is to make cannon isolated, and then open it up immutably. 9
There are four changes here:
Now, cannon is immutably borrowed, so reading it (like cannon.strength) is faster because it doesn't incur any generation checks.
In this example, reading anything from cannon is zero cost, because we used .imm to open the isolate immutably.
This is the true strength of isolates: they tell the compiler when areas of our data are immutable, so it can read them with zero cost.
Let's say we want a level generator for a simple game.
We're going to coalesce a black-and-white image's pixels until it gives us something interesting, and then use that to inspire our level's terrain.
This is known as a cellular automaton algorithm, and is very similar to blurring an image.
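For readers who haven't seen this kind of algorithm, here's a minimal Rust sketch of one smoothing pass. It is not the code from this article: coalesce_once is a hypothetical name, and the majority-of-neighbors rule is an assumption chosen to keep the example short (the real level generator also involves randomness).

```rust
// One smoothing pass over a black-and-white image: each pixel becomes the
// majority value of its 3x3 neighborhood (itself included). Neighbors that
// fall outside the image are skipped. The rule is an assumption for illustration.
fn coalesce_once(image: &[Vec<bool>]) -> Vec<Vec<bool>> {
    let height = image.len();
    let mut result = Vec::with_capacity(height);
    for row in 0..height {
        let width = image[row].len();
        let mut new_row = Vec::with_capacity(width);
        for col in 0..width {
            let mut set = 0; // neighbors that are "on"
            let mut total = 0; // neighbors that exist
            for dr in -1i64..=1 {
                for dc in -1i64..=1 {
                    let r = row as i64 + dr;
                    let c = col as i64 + dc;
                    if r < 0 || c < 0 {
                        continue;
                    }
                    let (r, c) = (r as usize, c as usize);
                    if r >= height || c >= image[r].len() {
                        continue;
                    }
                    total += 1;
                    if image[r][c] {
                        set += 1;
                    }
                }
            }
            // The pixel is "on" only if strictly more than half its neighborhood is.
            new_row.push(set * 2 > total);
        }
        result.push(new_row);
    }
    result
}

fn main() {
    let image = vec![
        vec![true, true, true],
        vec![true, false, true],
        vec![true, true, true],
    ];
    for row in &coalesce_once(&image) {
        let line: String = row.iter().map(|&p| if p { '#' } else { '.' }).collect();
        println!("{}", line);
    }
}
```

Note that the pass only ever reads from the input image and writes to a fresh one, which is exactly the access pattern that benefits from an immutable borrow.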
Here's our main function which shows the general structure of our program.
The ' in front of [][]bool specifies the array is isolated.
The image.read in coalesceImage(&rand, image.read) will open up the image isolate as immutable, and then pass it in as an argument to coalesceImage.
Further below, we'll see how coalesceImage keeps the two inputs separate from each other.
Techniques like intrusive data structures and graphs, plus useful patterns like observers, back-references, dependency references, callbacks, delegates and many forms of RAII and higher RAII.
Originally designed in the year 2347, this cannon is an ion-based Hawking Propulsor, historically used pretty heavily in the Hegemony fleet. With the invention of the Shearing Field it has largely fallen out of use, but it's still a vital component in some outworld colony defenses like ours.
The other way is to pull the damage = cannon.strength * 2 out into a pure function. That usually works well, but this post is about isolation so we'll show that instead.
image's type is '[][]bool.
Here's the coalesceImage function.
The important part here is the r' in &r'[][]bool. It means this parameter is in a separate region from the rest of the parameters, and we only see the region as read-only. 12
When main calls coalesceImage, the compiler sees two things:
The compiler concludes that nobody's going to change anything in the region, therefore it's temporarily immutable. It generates a coalesceImage function that is optimized accordingly, eliminating memory safety overhead when reading from that region.
foreach will automatically create an isolated array if it's constructed with a bunch of isolated elements.
In general, one can change <r'> to <r' rw> for a read-write region here.
If you're curious, here's the averageNeighbors function, which takes its parameters in a similar way:
Above, we showed how an isolate in a variable can be passed to a function that reads from it.
Isolates are incredibly versatile when held in local variables, as we saw above.
However, when a struct member is an isolate, it can only be accessed when: 13
A cell relaxes these restrictions by moving some checks to run-time. 16
For example, this Ship contains an "Engine cell".
foo uses the .read syntax to gain access to the contained 'Engine.
The above example did an immutable borrow using .imm. We can also:
There are two caveats to using cells:
So what's actually happening with that line, engine = ship.engine.imm;?
This is because an isolate must never be opened twice at the same time; it must only be opened once. If we allowed opening a struct member isolate, then it could indirectly be opened twice simultaneously.
This restriction can be combined with Higher RAII to make some pretty interesting mechanisms. It's also conceptually similar to GhostToken in Rust.
Design TBD: Can an isolate be used like a GhostToken, to pass around permission to open, say, a mutex? It would be zero cost.
For anyone familiar with Rust, a cell is similar to a RefCell, but with some improvements:
When we borrow from a cell, we get a cell guard.
For example, if ship.engine is a ''Engine, then immutably borrowing it like engine = ship.engine.imm will make engine a CellGuard<imm, Engine>.
Under the hood, a CellGuard contains a pointer to the original cell.
When the CellGuard goes out of scope, it will inform the original cell that we're done borrowing it.
This is how the language enforces that we don't immutably borrow and readwrite borrow the cell at the same time, which protects us from memory problems.
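Since a cell is similar to Rust's RefCell, here's a small Rust sketch of the same run-time rule, using the standard library's RefCell as a stand-in. It is only an analogy, not Vale code: the Ship and Engine definitions and the rw_refused_while_reading helper are hypothetical.

```rust
use std::cell::RefCell;

struct Engine {
    fuel: i64,
}

struct Ship {
    engine: RefCell<Engine>, // stand-in for Vale's "Engine cell"
}

// Returns true if a read-write borrow is refused while an immutable guard is alive.
fn rw_refused_while_reading(ship: &Ship) -> bool {
    // `engine` is a guard (Ref<Engine>); the cell now knows it is being read.
    let engine = ship.engine.borrow();
    // Asking for a read-write borrow at the same time fails at run time.
    let refused = ship.engine.try_borrow_mut().is_err();
    println!("fuel while reading: {}", engine.fuel);
    refused
    // The guard is dropped here, which tells the cell the borrow has ended.
}

fn main() {
    let ship = Ship { engine: RefCell::new(Engine { fuel: 42 }) };
    println!("read-write refused: {}", rw_refused_while_reading(&ship));
    // With no guard alive, a read-write borrow succeeds again.
    ship.engine.borrow_mut().fuel -= 1;
    println!("fuel afterwards: {}", ship.engine.borrow().fuel);
}
```

The guard going out of scope is what re-enables the read-write borrow, mirroring how a CellGuard notifies its cell.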
So if engine is a CellGuard<imm, Engine>, how is it possible to say engine.fuel, like in println(engine.fuel)?
A cell guard has an interesting quirk: it wears a mask.
Mentioning its name, like the engine in println(engine.fuel), does not give us the CellGuard<imm, Engine>.
Instead, it gives us the contents, &e'Engine. e is hidden, an implicit region that's conceptually tied to the cell guard.
The compiler then makes sure that no references into this region outlive the cell guard itself, similar to when we open a regular isolate.
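If the "mask" idea feels abstract, a loose Rust analogy might help. In this sketch the guard holds a pointer back to its cell, dereferences to the contents, and notifies the cell when dropped. EngineCell, ImmGuard, and open_readers are hypothetical names, and the lifetime 'e only roughly plays the role of the hidden region e; Vale's real mechanism may differ.

```rust
use std::cell::Cell;
use std::ops::Deref;

struct Engine {
    fuel: i64,
}

// Hypothetical cell that counts how many immutable guards are currently open.
struct EngineCell {
    contents: Engine,
    open_readers: Cell<usize>,
}

// The guard contains a pointer back to the original cell.
struct ImmGuard<'e> {
    cell: &'e EngineCell,
}

impl EngineCell {
    fn new(contents: Engine) -> Self {
        EngineCell { contents, open_readers: Cell::new(0) }
    }

    fn imm(&self) -> ImmGuard<'_> {
        self.open_readers.set(self.open_readers.get() + 1);
        ImmGuard { cell: self }
    }
}

// The "mask": using the guard gives us the contents, not the guard itself.
impl<'e> Deref for ImmGuard<'e> {
    type Target = Engine;
    fn deref(&self) -> &Engine {
        &self.cell.contents
    }
}

// When the guard goes out of scope, it informs the cell that the borrow is over.
impl<'e> Drop for ImmGuard<'e> {
    fn drop(&mut self) {
        self.cell.open_readers.set(self.cell.open_readers.get() - 1);
    }
}

fn main() {
    let cell = EngineCell::new(Engine { fuel: 7 });
    {
        let engine = cell.imm();
        println!("fuel: {}", engine.fuel); // reads through the mask
        println!("open readers: {}", cell.open_readers.get());
    }
    println!("open readers after scope: {}", cell.open_readers.get());
}
```

Any reference obtained through the guard is tied to the guard's own borrow, so the Rust compiler rejects code that tries to keep such a reference after the guard is gone.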
Besides offering better performance via immutable borrowing, isolates and cells can also have architectural benefits.
First, it indirectly helps us stick to unidirectional data flow, the pattern where after we're done modifying the data, it's read-only for the rest of the operation. This pattern might be familiar:
Second, it ensures that a struct's private data won't be unexpectedly changed by anyone outside.
In other languages, we can accidentally make a reference to some private data that escapes to someone outside our class. Then, someone uses that reference to modify the data that we thought was private, in unexpected ways that cause bugs. With regions, the compiler prevents this from ever happening.
The best thing about isolation, however, is that it is opt-in. Consistent with Vale's philosophy of avoiding forced complexity, we never have to use isolation, or isolates, or regions at all. As more of us experiment with regions, we can learn the best places to apply them.
As we saw, isolates can be surprisingly powerful for optimization, and using them well can make a program much faster.
There are some details we didn't cover in the article:
Part 3 also shows how an isolate's contents can point outside the isolate using one-way isolation, which fits well with most structs' private data.
Part 4 shows how we can have one object contain another region's data inline.
Part 5 then shows how to combine that with one-way isolation to make certain patterns (iterating collections, calculating determinants, etc.) and entire architectures (like entity-component-system) zero-cost. 19
That's all for now! We hope you enjoyed this article. Stay tuned for the next article, which shows how one-way isolation works.
If you're impressed with our track record and believe in the direction we're heading, please consider sponsoring us on GitHub!
With your support, we can bring regions to programmers worldwide.
See you next time!
- Evan Ovadia
Actual syntax TBD.
This might be merged with how immutable structs currently work. It would recursively call drop first, presumably.
This is just a draft! TODOs:
Vale aims to bring a new way of programming into the world that offers speed, safety, and ease of use.
The world needs something like this! Currently, most programming language work is in:
These are useful, but there is a vast field of possibilities in between, waiting to be explored!
Our aim is to explore that space, discover what it has to offer, and make speed and safety easier than ever before.
In this quest, we've discovered and implemented a lot of new techniques:
These techniques have also opened up some new emergent possibilities, which we hope to implement:
We also gain a lot of inspiration from other languages, and are finding new ways to combine their techniques:
...plus a lot more interesting ideas to explore!
The Vale programming language is a novel combination of ideas from the research world and original innovations. Our goal is to publish our techniques, even the ones that couldn't fit in Vale, so that the world as a whole can benefit from our work here, not just those who use Vale.
Our medium-term goals:
We aim to publish articles biweekly on all of these topics, and create and inspire the next generation of fast, safe, and easy programming languages.
If you want to support our work, please consider sponsoring us on GitHub!
With enough sponsorship, we can: