Vale has an ambitious goal: to be fast, memory safe, and most importantly, easy. There are a lot of stellar languages that have two, and we suspect it's possible to really maximize all three.
To do this, we're harnessing a new concept called regions.
In Part 1, we saw how we can use pure functions to easily "immutably borrow" a region, to eliminate memory safety costs when accessing it.
That works really well for the sections of our program that can be phrased as pure functions. 2
We can use regions to eliminate memory safety overhead in other places too, using a concept called "isolates".
An isolate is a hierarchy of data that nobody outside can point to, except for one owning reference outside that points at the root.
Some examples:
By default, nothing inside an isolate can point out either, though Part 3 shows how we can enable that using one-way isolation. Most private data can actually be expressed very cleanly with one-way isolation. 3
Part 4 also shows how one-way isolation can make certain patterns (iterating, determinants, etc.) and entire architectures (like entity-component-system) zero-cost. 4
For the rest of this post though, we'll focus on regular isolates, where nothing inside can point out.
Most programs are naturally hierarchies of isolated data, 5 no matter what language they're written in. We can use the ' syntax to help the compiler know where those isolates are, so it can make more optimal code. 6
One doesn't need to '-annotate everything, of course. Consistent with Vale's philosophy of avoiding forced complexity, isolates are opt-in.
One would '-annotate only the parts of their program that profiling suggests would benefit from optimization.
If anything isn't clear, feel free to reach out via discord, twitter, or the subreddit! We love answering questions, and it helps us know how to improve our explanations.
We're aiming to complete regions by early 2024, check out the roadmap for more details.
Which is quite a lot, really. Our sample roguelike game, a very stateful mutation-heavy program, actually spent the vast majority of its time inside pure functions.
pure functions use it under the hood as well, if you squint hard enough.
Together, isolates, pure functions, and one-way isolation combine to form something that looks suspiciously like an entire new programming paradigm... whether that's true remains to be seen!
More specifically, they're naturally hierarchies of one-way isolated data. Private data often points outward, especially in stateful code.
We can even take this to the extreme and ' everything, and we'd end up with data roughly the same shape as in Rust programs. In practice, there's a balance somewhere in-between.
We can immutably borrow an isolate's data, allowing the compiler to skip generation checks when reading it. This can make our code faster.
This is the same mechanism used by pure functions, described in Part 1.
This allows us to get the optimization benefits of affine types (like those seen in Rust and Cyclone), but with a vital improvement: it doesn't have aliasing restrictions, and enables techniques and optimizations that require shared mutability. 7
First, here's an example before we use isolation. We'll add isolation to it afterward.
This is a simple Cannon. 8
Here, we see it fire on an enemy ship.
fire isn't defined here, we show it further below.
struct Cannon {
strength int;
...
}
struct Ship {
hp int;
}
exported func main() {
cannon = Cannon(12, ...);
ship = Ship(100);
fire(&cannon, &ship);
println("Ship's new hp: {ship.hp}");
}
Ship's new hp: 88
When it fires on something it first calculates its strength, based on a very complex algorithm.
This algorithm has a little bit of overhead: whenever we read from cannon, such as the cannon.strength, it incurs a generation check.
func fire(
cannon &Cannon,
ship &Ship)
void {
// Read cannon to calculate damage.
damage = cannon.strength * 2;
// Now hit the ship!
set ship.hp -= damage;
}
Generation checks are rarely a source of significant slowdowns. However, if we're in the hot path of a performance-critical program, profiling might suggest that we optimize this function.
Let's optimize!
If we can immutably borrow cannon, then we can skip those generation checks!
One way to do this is to make cannon isolated, and then open it up .immutably. 9
There are four changes here:
// (using above Cannon and Ship)
exported func main() {
cannon = 'Cannon(12, ...);
ship = Ship(100);
fire(cannon.imm, &ship);
println("Ship's new hp: {ship.hp}");
}
func fire<c'>(
cannon &c'Cannon,
ship &Ship)
void {
// This is faster now, no more
// generation checks!
damage = cannon.strength * 2;
// Now hit the ship!
set ship.hp -= damage;
}
Now, cannon is immutably borrowed, so reading it (like cannon.strength) is faster because it doesn't incur any generation checks.
In this example, reading anything from cannon is zero cost, because we used .imm to open the isolate immutably.
This is the true strength of isolates: they tell the compiler when areas of our data are immutable, so it can read them with zero cost.
Let's say we want a level generator for a simple game.
We're going to coalesce a black-and-white image's pixels until it gives us something interesting, and then use that to inspire our level's terrain.
This is known as a Cellular Automata algorithm, and is very similar to blurring an image.
Here's our main function which shows the general structure of our program.
The ' in front of [][]bool specifies the array is isolated. 10
The image.read in coalesceImage(&rand, image.read) will open up the image isolate as immutable, and then pass it in as an argument to coalesceImage.
Further below, we'll see how coalesceImage keeps the two inputs separate from each other.
exported func main() {
// Our random number generator.
rand = LCGRand(1337);
// An isolated 16x16 array of bools.
image =
'[][]bool(16, x => {
'[]bool(16, y => {
rand.next() mod 2 == 0
})
});
// We want coalesceImage to treat
// the input rand as mutable, and
// the input image as immutable.
image_coalesced =
coalesceImage(&rand, image.read);
// And do it again, it'll look nice.
image_coalesced_again =
coalesceImage(
&rand, image_coalesced.read);
display(image_coalesced_again.read);
}
Techniques like intrusive data structures and graphs, plus useful patterns like observers, back-references, dependency references, callbacks, delegates and many forms of RAII and higher RAII.
Originally designed in the year 2347, this cannon an ion-based Hawking Propulsor, historically used pretty heavily in the Hegemony fleet. With the invention of the Shearing Field it has largely fallen out of use, but it's still a vital component in some outworld colony defenses like ours.
The other way is to pull the damage = cannon.strength * 2 out into a pure function. That usually works well, but this post is about isolation so we'll show that instead.
image's type is '[][]bool.
Here's the coalesceImage function. 11
func coalesceImage<r'>(
rand &LCGRand,
tiles &r'[][]bool)
'[][]bool { // Returns a new array.
// Make a new isolated array.
tiles '[][]bool =
foreach [x, column] in tiles.entries() {
foreach [y, walkable] in column.entries() {
// Produces a boolean
averageNeighbors(rand, tiles, x, y)
}
};
return tiles;
}
The important part here is the r' in &r'[][]bool. It means this parameter is in a separate region from the rest of the parameters, and we only see the region as read-only. 12
When main calls coalesceImage, the compiler sees two things:
The compiler concludes that nobody's going to change anything in the region, therefore it's temporarily immutable. It generates a coalesceImage function that is optimized accordingly, eliminating memory safety overhead when reading from that region.
foreach will automatically create an isolated array if it's constructed with a bunch of isolated elements.
In general, one can change <r'> to <r' rw> for a read-write region here.
If you're curious, here's the averageNeighbors function, which takes its parameters in a similar way:
func averageNeighbors<r'>(rand &LCGRand, tiles &r'[][]bool, x int, y int) bool {
num_neighbors = 0;
num_white_neighbors = 0;
foreach neighbor_x in range(x - 1, x + 1) {
foreach neighbor_y in range(y - 1, y + 1) {
if neighbor_x in 0..tiles.len() and neighbor_x in 0..tiles[0].len() {
set num_neighbors += 1;
set num_white_neighbors += if white { 1 } else { 0 };
}
}
}
result_is_white =
if num_white_neighbors * 2 == num_neighbors {
// Even number of white and black neighbors, so choose randomly.
rand.next() mod 2 == 0
} else {
// true if most neighbors are white
num_white_neighbors > num_neighbors / 2
};
return result_is_white;
}
Above, we showed how an isolate in a variable can be passed to a function that reads from it.
Isolates are incredibly versatile when held in local variables, as we saw above.
However, when a struct member is an isolate, it can only be accessed when: 13
A cell relaxes these restrictions by moving some checks to run-time. 16
For example, this Ship contains an "Engine cell".
foo uses the .read syntax to gain access to the contained 'Engine.
struct Ship {
engine ''Engine;
}
struct Engine {
fuel int;
}
func foo(ship &Ship) {
engine = ship.engine.imm;
// Can read engine freely!
println(engine.fuel);
}
The above example did an immutable borrow using .imm. We can also:
There are two caveats to using cells:
So what's actually happening with that line, engine = ship.engine.imm;?
This is because an isolate must never be opened twice at the same time, it must only be opened once. If we allowed opening a struct member isolate, then it could indirectly be opened twice simultaneously.
This restriction can be combined with Higher RAII to make some pretty interesting mechanisms. It's also conceptually similar to GhostToken in Rust.
Design TBD: Can an isolate be used like a ghosttoken, to pass around permission to open, say, a mutex? It would be zero cost.
For anyone familiar with Rust, a cell is similar to a RefCell, but with some improvements:
When we borrow from a cell, we get a cell guard.
For example, if ship.engine is a ''Engine, then immutably borrowing it like engine = ship.engine.imm will make engine a CellGuard<imm, Engine>.
Under the hood, a CellGuard contains a pointer to the original cell.
When the CellGuard goes out of scope, will inform the original cell that we're done borrowing it.
This is how the language enforces that we don't immutably borrow and readwrite borrow the cell at the same time, which protects us from memory problems.
So if engine is a CellGuard<imm, Engine>, how is it possible to say engine.fuel, like in println(engine.fuel)?
A cell guard has an interesting quirk: it wears a mask.
Mentioning its name, like the engine in println(engine.fuel), does not give us the CellGuard<imm, Engine>.
Instead, it gives us the contents, &e'Engine. e is hidden, an implicit region that's conceptually tied to the cell guard.
The compiler then makes sure that no references into this region outlive the cell guard itself, similar to when we open a regular isolate.
Besides having better performance via immutable borrowing, isolates and cells can also have architectural benefits as well.
First, it indirectly helps us stick to unidirectional data flow, the pattern where after we're done modifying the data, it's read-only for the rest of the operation. This pattern might be familiar:
Second, it ensures that a struct's private data won't be unexpectedly changed by anyone outside.
In other languages, we can accidentally make a reference to some private data that escapes to someone outside our class. Then, someone uses that reference to modify the data that we thought was private, in unexpected ways that cause bugs. With regionsthe compiler prevents this from ever happening.
The best thing about isolation however is that it is opt-in. Consistent with Vale's philosophy of avoiding forced complexity, we never have to use isolation, or isolates, or regions at all. As more of us experiment with regions, we can learn the best places to apply them.
As we saw, isolates can be surprisingly powerful for optimization, and using them well can make a program much faster.
There are some details we didn't cover in the article:
Part 3 also shows how an isolate's contents can point outside the isolate using one-way isolation, which fits well with most structs' private data.
Part 4 shows how we can have one object contain another region's data inline.
Part 5 then shows how to combine that with one-way isolation to make certain patterns (iterating collections, calculating determinants, etc.) and entire architectures (like entity-component-system) zero-cost. 19
That's all for now! We hope you enjoyed this article. Stay tuned for the next article, which shows how one-way isolation works.
See you next time!
- Evan Ovadia
Actual syntax TBD.
This might be merged with how immutable structs currently work. It would recursively call drop first, presumably.
Together, isolates, pure functions, and one-way isolation combine to form something that looks suspiciously like an entire new programming paradigm... whether that's true remains to be seen!
This is just a draft! TODOs: