Most of Hybrid Generational Memory’s (HGM’s) overhead is caused by generation checks 0 that occur when a pointer to an object is dereferenced: to see if an object is still alive, HGM checks the generation number at the top of the object’s allocation to ensure that it matches the target generation number stored with the pointer. 1
This is how HGM guarantees memory safety, but getting the generation number at the top of the object’s allocation is often expensive because it incurs a cache miss on the CPU. To combat this slow down, HGM uses static analysis at compile time to eliminate as many of these generation checks as possible.
When an object is created, its allocated memory is guaranteed to be safe until its owning reference (here, s) is destroyed via a call to drop (i.e. drop(s)). This call is usually implicit when the owning reference goes out of scope.
HGM leverages this to eliminate generation checks for references to objects who’s owning reference is still available.
struct Spaceship {
fuel int;
}
exported func main() int {
s = Spaceship(10);
b = &s;
return b.fuel;
// s goes out of scope here,
// so there's an implicit drop(s),
// freeing the Spaceship.
}
Generation checks are also known as liveness checks.
See Generational References for more on how this works!
Vale has a couple different types of references. Objects are tied to the lifetime of their owning reference, usually the first reference that points to the object. Read more about single ownership and references at References.
Information about the type of reference and a unique identifier for each local is stored in the AST.
The AST node that dereferences a pointer contains a ‘knownLive’ field.
This field contains a boolean specifying whether the Object pointed to by the reference is known to be alive (meaning the generation check can be skipped).
Prior to Catalyst’s modifications, all ‘knownLive’ fields are false.
Catalyst uses 2 separate hashmaps for each scope within a program, one for mapping objects to liveness information, and one for mapping references to objects.
Catalyst’s tables in Vale syntax:
Objects = HashMap<str, bool>();
Variables = HashMap<str, str>();
Using the example from the previous section, Catalyst’s tables after each line in main would read as follows (assuming that each reference’s unique identifier is the variable name):
struct Spaceship {
fuel int;
}
exported func main() int {
s = Spaceship(10);
// Objects = {‘s’->true}
// Variables = {}
// s is an owning reference, so it is added to the Objects table; and
// because it has just been initialized, its value is set to true.
b = &s;
// Objects = {‘s’->true}
// Variables = {‘b’->’s’}
// b is a non-owning reference, so it is added to the Variables table;
// and because it references the object owned by s, its value is set to s
return b.fuel; // Requires liveness check
// implicit drop(s) here, so:
// Objects = {‘s’->false}
// Variables = {}
// When s is dropped, Catalyst sets the value of s to false in Objects
// and removes all references to s from Variables
}
How Catalyst parses the AST node(s) corresponding to the final line of main:
Once Catalyst has made all its modifications, it passes the new AST to Midas.
When the program is run, the generation check caused by ‘main’s return statement will be skipped because ‘knownLive’ will be true.
This is a basic algorithm showing what happens when we make a non-owning reference from an owning reference 3 and then immediately dereference it. This pattern is very simple and rarely seen in real Vale code, but it serves as the foundation for the next steps in Catalyst, where we'll improve the algorithm to track objects through function calls and struct members.
Remember that owning references never require generation checks because as long as the owning reference is available, its object is alive.