Lisp Personal Computer
A literate program for a minimal lisp virtual machine
This document is a literate program for a simple virtual machine that runs a
very minimalistic version of a lisp. This is the result of an investigation into
the intersection between Uxn (by 100rabbits) and the Lisp Machines of yore. The
core idea is that the machine is a single contiguous array of cells. Cells are
what are commonly known as CONS cells: each one is 32 bits wide and split into
two 16-bit halves.
In the lisp tradition we'll call the first 16 bits the CAR of the cell and the
second half the CDR of the cell. The CAR and the CDR are each a tagged
value with two possible types:
| Type | Bit pattern | Meaning |
|---|---|---|
| PTR | \(\underbrace{0}_{\text{tag}}\,\underbrace{a_{14}\,a_{13}\,\cdots\,a_{0}}_{\text{15-bit address}}\) | Pointer to another cell |
| INT | \(\underbrace{1}_{\text{tag}}\,\underbrace{n_{14}\,n_{13}\,\cdots\,n_{0}}_{\text{15-bit value}}\) | Numerical value |
With this scheme it's clear that we have 15 bits for addressing and thus we have
\(2^{15} = 32768\) cells in the machine. I decided to keep the PTR type separate
from the INT type to facilitate two things: knowing when a value can be followed,
and simplifying the GC. We'll talk about the GC later, since the idea is to
implement it as a program inside the VM instead of making it a property of the
VM.
#define CELLCOUNT 32768
That the second type is INT doesn't mean we cannot represent other types, for
example, it's easy to see that we could restrict ourselves to the ASCII range
and using the CONS cell structure we could build strings as lists of
ASCII-range INT.
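To make the string idea concrete, here is a minimal sketch. It uses a stripped-down stand-in for the cell memory described in the next section, and `make_string` and `read_string` are hypothetical helper names of mine, not part of LPC. The free list layout (chained through the CDR, head in the CAR of cell 1) is assumed from the allocation code later in this document.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define CELLCOUNT 32768
#define FREEC 1 /* head of the free list */
#define VAL(x) ((x) & 0x7FFF)
#define MKPTR(a) ((uint16_t)((a) & 0x7FFF))
#define MKINT(n) ((uint16_t)(((n) & 0x7FFF) | 0x8000))

typedef uint16_t cval_t;
typedef struct cell { cval_t a, d; } cell_t;
typedef struct machine { cell_t memory[CELLCOUNT]; } machine_t;

/* Bare image: zero everything, then chain cells 5.. through their CDRs. */
static void initialize_machine(machine_t *m) {
  memset(m->memory, 0, sizeof(m->memory));
  for (cval_t i = 5; i < CELLCOUNT - 1; i++) m->memory[i].d = MKPTR(i + 1);
  m->memory[FREEC].a = MKPTR(5);
}

/* Pop a cell off the free list and fill in both halves. */
static cval_t cons(machine_t *m, cval_t a, cval_t d) {
  cval_t head = VAL(m->memory[FREEC].a);
  assert(head != 0 && "out of cells");
  m->memory[FREEC].a = m->memory[head].d;
  m->memory[head] = (cell_t){ a, d };
  return head;
}

/* Hypothetical helper: one INT cell per character, built back to front. */
static cval_t make_string(machine_t *m, const char *s) {
  cval_t list = 0; /* nil */
  for (size_t i = strlen(s); i > 0; i--)
    list = cons(m, MKINT((unsigned char)s[i - 1]), list);
  return list;
}

/* Hypothetical helper: copy the characters back out into a C string. */
static size_t read_string(const machine_t *m, cval_t list, char *out, size_t cap) {
  size_t n = 0;
  assert(cap > 0);
  while (list != 0 && n + 1 < cap) {
    out[n++] = (char)VAL(m->memory[list].a);
    list = VAL(m->memory[list].d);
  }
  out[n] = '\0';
  return n;
}
```

A three-character string costs three cells here, so this representation trades density for uniformity: a string is walked exactly like any other list.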
The objective of LPC is a system where as much as possible is done inside the VM so it can be changed and evolved by the user.
1. The Cell Array
At the core of the LPC machine is the cell array (or memory). It holds all
memory for the LPC machine with a predefined structure. Each cell is a CONS
cell, and thus has by default two pieces as explained above. This gives us some
properties:
- We don't need to reserve contiguous pieces of memory: every connected
structure is a linked list built from the structure of the memory itself. Code
is linked lists, frames are linked lists, the stack is a linked list. As long
as a cell is free you can allocate. This is interesting in a machine with
limited memory like this one, since there is virtually no fragmentation (except
for unused CDRs, I guess).
- Unified data model. Code truly is data (and data is code if you want as
well!); there is nothing special about code. It's just a list of opcodes (as
INT). Technically you can point the INSTP to any cell and it may execute it
(although probably not, since a random cell will probably not be in the opcode
range).
The unified data model plus the malleable cell memory also has the side effect that there is really no distinction between a macro and a function. Any function can modify any other function at any time.
With this cell memory concept in place, we can now define the full struct of the machine, the MACHINE_T struct.
typedef struct machine {
cell_t memory[CELLCOUNT];
} machine_t;
Where CELL_T is a small struct of the CAR and the CDR of the cell.
typedef uint16_t cval_t;
typedef struct cell { cval_t a, d; } cell_t;
We could have this as a global array, but wrapping it in a struct allows us to spawn several of these machines if needed in the future. A design decision that was made here is that the execution can be paused at any time and serialized by just saving the whole memory, so all state necessary for execution lives inside the cell memory.
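Since all state lives in the cell array, a snapshot is just the array itself. The sketch below packs it into a byte buffer that can then be written to disk in one call. The names `image_pack` and `image_unpack` and the byte order (CAR first, high byte first) are assumptions of mine; the source only says that the whole memory is saved.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CELLCOUNT 32768
#define IMAGE_BYTES (CELLCOUNT * 4u) /* 4 bytes per cell */

typedef uint16_t cval_t;
typedef struct cell { cval_t a, d; } cell_t;
typedef struct machine { cell_t memory[CELLCOUNT]; } machine_t;

/* Hypothetical helper: serialize every cell as CAR then CDR, big-endian. */
static void image_pack(const machine_t *m, uint8_t *out) {
  assert(m != NULL && out != NULL);
  for (size_t i = 0; i < CELLCOUNT; i++) {
    out[4 * i + 0] = (uint8_t)(m->memory[i].a >> 8);
    out[4 * i + 1] = (uint8_t)(m->memory[i].a & 0xFF);
    out[4 * i + 2] = (uint8_t)(m->memory[i].d >> 8);
    out[4 * i + 3] = (uint8_t)(m->memory[i].d & 0xFF);
  }
}

/* Hypothetical helper: the inverse of image_pack. */
static void image_unpack(machine_t *m, const uint8_t *in) {
  assert(m != NULL && in != NULL);
  for (size_t i = 0; i < CELLCOUNT; i++) {
    m->memory[i].a = (cval_t)((in[4 * i + 0] << 8) | in[4 * i + 1]);
    m->memory[i].d = (cval_t)((in[4 * i + 2] << 8) | in[4 * i + 3]);
  }
}
```

Fixing the byte order makes the image independent of the host's endianness, which a raw dump of the struct array would not be. Because INSTP and both stacks are themselves cells, restoring the buffer restores the paused execution as well.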
There are several cells in the system that have a special meaning.
| Cell | Name | Meaning |
|---|---|---|
| 0x0 | GFRAME | Pointer to the global frame (where top-level defs live) |
| 0x1 | FREEC | Head of the free cell list |
| 0x2 | INSTP | Instruction pointer |
| 0x3 | DSTACK | Top of the data stack |
| 0x4 | RSTACK | Top of the return stack |
On machine initialization (for a bare image), we set the memory to zero, and then populate the free list by chaining all cells.
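static inline void initialize_machine(machine_t *m) {
  memset(m->memory, 0, sizeof(m->memory));
  /* Chain every non-reserved cell through its CDR; the last cell keeps a nil CDR. */
  for (cval_t i = 5; i < CELLCOUNT - 1; i++) {
    m->memory[i].d = MKPTR(i + 1);
  }
  /* The CAR of FREEC is the head of the free list: the first non-reserved cell. */
  m->memory[FREEC].a = MKPTR(5);
}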
At this point we have the machine in a basic known state:
| Component | State |
|---|---|
| Globals frame | Empty (points to nil) |
| Free list | Occupies all memory (except 5 reserved cells) |
| INSTP | Points to nil |
| Both stacks | Empty (point to nil) |
At this point nothing can execute, since we are pointing to nil. We'll talk about bootstrapping later.
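One way to check this known state is to walk the free list and count it. This is a sketch under the assumption that the list is chained through the CDR with its head in the CAR of FREEC (which is what the allocation code below expects); `free_count` is a hypothetical helper, not part of the VM.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define CELLCOUNT 32768
#define FREEC 1 /* head of the free list */
#define VAL(x) ((x) & 0x7FFF)
#define MKPTR(a) ((uint16_t)((a) & 0x7FFF))

typedef uint16_t cval_t;
typedef struct cell { cval_t a, d; } cell_t;
typedef struct machine { cell_t memory[CELLCOUNT]; } machine_t;

/* Bare image: zero everything, then chain cells 5.. through their CDRs. */
static void initialize_machine(machine_t *m) {
  memset(m->memory, 0, sizeof(m->memory));
  for (cval_t i = 5; i < CELLCOUNT - 1; i++) m->memory[i].d = MKPTR(i + 1);
  m->memory[FREEC].a = MKPTR(5);
}

/* Hypothetical helper: number of cells currently on the free list. */
static size_t free_count(const machine_t *m) {
  size_t n = 0;
  for (cval_t p = VAL(m->memory[FREEC].a); p != 0; p = VAL(m->memory[p].d)) {
    n++;
    assert(n < CELLCOUNT); /* a longer walk would mean the list has a cycle */
  }
  return n;
}
```

On a bare image this counts every cell except the five reserved ones, that is `CELLCOUNT - 5`.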
2. The Cell Operations
Now we are going to talk about the main cell operations that will be pervasive throughout the code. We'll start by creating some simple macros to deal with the two types of tagged values.
#define TAG(x) ((x) >> 15)
#define VAL(x) ((x) & 0x7FFF)
#define MKPTR(a) ((uint16_t)((a) & 0x7FFF))
#define MKINT(n) ((uint16_t)(((n) & 0x7FFF) | 0x8000))
#define IS_PTR(x) (TAG(x) == 0)
#define IS_INT(x) (TAG(x) == 1)
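A quick way to get a feel for the macros is to push a few values through them. The sketch below repeats the macros so it compiles on its own; `int_add` is a hypothetical helper of mine that shows what the 15-bit payload means for arithmetic.

```c
#include <assert.h>
#include <stdint.h>

#define TAG(x) ((x) >> 15)
#define VAL(x) ((x) & 0x7FFF)
#define MKPTR(a) ((uint16_t)((a) & 0x7FFF))
#define MKINT(n) ((uint16_t)(((n) & 0x7FFF) | 0x8000))
#define IS_PTR(x) (TAG(x) == 0)
#define IS_INT(x) (TAG(x) == 1)

/* Hypothetical helper: add two INT-tagged values and re-tag the result. */
static uint16_t int_add(uint16_t x, uint16_t y) {
  assert(IS_INT(x) && IS_INT(y));
  return MKINT(VAL(x) + VAL(y)); /* the mask makes this wrap modulo 2^15 */
}
```

For example, `MKINT(42)` is `0x802A`, and `int_add(MKINT(32767), MKINT(1))` wraps around to `MKINT(0)`. Note that `MKINT` masks silently, so any value outside the 15-bit range is truncated rather than rejected.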
Then we have the actual core cell functions that operate on the machine. Let's
start with reads: given a cell address, get the CAR or CDR.
cval_t car(machine_t *m, cval_t ptr) { return m->memory[ptr].a; }
cval_t cdr(machine_t *m, cval_t ptr) { return m->memory[ptr].d; }
We have to have a way to construct a new cell from a CAR and a CDR, for that
we'll use the traditional CONS function. We'll look at CELL_ALLOC after this
since it's less semantically important and more of an implementation detail.
static inline cval_t cons(machine_t *m, cval_t a, cval_t d) {
  cval_t new_cell = cell_alloc(m);
  /* A cell is a struct, so fill both halves with a compound literal. */
  m->memory[new_cell] = (cell_t){ .a = a, .d = d };
  return new_cell;
}
We also write some helpers to set the CAR and CDR of a cell address.
void scar(machine_t *m, cval_t ptr, cval_t v) { m->memory[ptr].a = v; }
void scdr(machine_t *m, cval_t ptr, cval_t v) { m->memory[ptr].d = v; }
Finally we take a look at CELL_ALLOC. It pops a cell from the free list and
updates the FREEC cell. We also add a related helper that does the opposite
operation, CELL_FREE.
static inline cval_t cell_alloc(machine_t *m) {
  cval_t head = VAL(car(m, FREEC));
  scar(m, FREEC, cdr(m, head)); /* the next free cell becomes the new head */
  m->memory[head] = (cell_t){ 0, 0 }; /* hand out a zeroed cell */
  return head;
}
void cell_free(machine_t *m, cval_t cell_ptr) {
scdr(m, cell_ptr, car(m, FREEC));
scar(m, FREEC, cell_ptr);
}
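The free list behaves like a stack: the most recently freed cell is the first one handed out again. The sketch below shows that, plus one possible guard for an empty free list. The guard is my addition and an assumption about the desired behaviour: it returns nil (0) so that an exhausted machine does not overwrite cell 0.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define CELLCOUNT 32768
#define FREEC 1 /* head of the free list */
#define VAL(x) ((x) & 0x7FFF)
#define MKPTR(a) ((uint16_t)((a) & 0x7FFF))

typedef uint16_t cval_t;
typedef struct cell { cval_t a, d; } cell_t;
typedef struct machine { cell_t memory[CELLCOUNT]; } machine_t;

/* Bare image: zero everything, then chain cells 5.. through their CDRs. */
static void initialize_machine(machine_t *m) {
  memset(m->memory, 0, sizeof(m->memory));
  for (cval_t i = 5; i < CELLCOUNT - 1; i++) m->memory[i].d = MKPTR(i + 1);
  m->memory[FREEC].a = MKPTR(5);
}

/* Pop the head of the free list; returns nil (0) when no cell is left. */
static cval_t cell_alloc(machine_t *m) {
  cval_t head = VAL(m->memory[FREEC].a);
  if (head == 0) return 0; /* assumed policy: report exhaustion, touch nothing */
  m->memory[FREEC].a = m->memory[head].d;
  m->memory[head] = (cell_t){ 0, 0 };
  return head;
}

/* Push a cell back onto the free list. */
static void cell_free(machine_t *m, cval_t cell_ptr) {
  m->memory[cell_ptr].d = m->memory[FREEC].a;
  m->memory[FREEC].a = cell_ptr;
}
```

On a bare image the first two allocations return cells 5 and 6. Freeing cell 5 and allocating again returns 5 once more.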
We introduce our first opcodes:
| Opcode | Sig | Description |
|---|---|---|
| CONS | CAR CDR -- ADDR | Takes two elements on the stack and creates a freshly allocated CONS from them |
| CAR | ADDR -- VAL | Takes an address from the stack and pushes the value in the CAR of the pointed cell |
| CDR | ADDR -- VAL | Takes an address from the stack and pushes the value in the CDR of the pointed cell |
| SCAR | ADDR VAL -- | Takes an address and a value and sets the CAR pointed by ADDR to VAL |
| SCDR | ADDR VAL -- | Takes an address and a value and sets the CDR pointed by ADDR to VAL |
3. Code Representation in the LPC
After understanding cells we can now explain how code is represented. Code is
another linked list, each CAR of the list is an opcode (or opcode params) and
the CDR points to the next instruction (or param). This means that there is an
implicit JMP-like instruction after each instruction. We can use this to
encode loops and branches as we'll see.
Now I'm going to introduce the assembly language for LPC, named HIROI for
now. HIROI is s-expr based and it looks a lot like a lispy Forth. It is very
close to the Instruction Set, but with some syntactic sugar around nested
blocks, literals and pointers. A simple example:
(#3 #4 + HALT)
If you are familiar with Forth this should be very easy to understand: push the
immediates 3 and 4 to the stack, add them up, and then halt execution. Internally
this would be represented as a series of CONS cells.
(LIT . (3 . (LIT . (4 . (ADD . (HALT . NIL))))))
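To see how such a list executes, here is a minimal fetch-and-follow loop. It is a sketch, not the LPC interpreter: the opcode numbers are hypothetical (this document does not assign any), and only LIT, ADD and HALT are handled. The one idea it shows is that fetching an instruction means reading the CAR of the cell INSTP points to and then moving INSTP to that cell's CDR.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define CELLCOUNT 32768
enum { FREEC = 1, INSTP = 2, DSTACK = 3 }; /* special cells */
enum { OP_HALT, OP_LIT, OP_ADD };          /* hypothetical opcode numbers */
#define VAL(x) ((x) & 0x7FFF)
#define MKPTR(a) ((uint16_t)((a) & 0x7FFF))
#define MKINT(n) ((uint16_t)(((n) & 0x7FFF) | 0x8000))

typedef uint16_t cval_t;
typedef struct cell { cval_t a, d; } cell_t;
typedef struct machine { cell_t memory[CELLCOUNT]; } machine_t;

static void initialize_machine(machine_t *m) {
  memset(m->memory, 0, sizeof(m->memory));
  for (cval_t i = 5; i < CELLCOUNT - 1; i++) m->memory[i].d = MKPTR(i + 1);
  m->memory[FREEC].a = MKPTR(5);
}

static cval_t cons(machine_t *m, cval_t a, cval_t d) {
  cval_t head = VAL(m->memory[FREEC].a);
  assert(head != 0 && "out of cells");
  m->memory[FREEC].a = m->memory[head].d;
  m->memory[head] = (cell_t){ a, d };
  return head;
}

static void push(machine_t *m, cval_t v) {
  m->memory[DSTACK].a = cons(m, v, m->memory[DSTACK].a);
}

static cval_t pop(machine_t *m) {
  cval_t top = m->memory[DSTACK].a;
  assert(top != 0 && "stack underflow");
  cval_t v = m->memory[top].a;
  m->memory[DSTACK].a = m->memory[top].d;
  m->memory[top].d = m->memory[FREEC].a; /* hand the cell back to the free list */
  m->memory[FREEC].a = top;
  return v;
}

/* Read the CAR of the current code cell, then follow its CDR. */
static cval_t fetch(machine_t *m) {
  cval_t ip = m->memory[INSTP].a;
  m->memory[INSTP].a = m->memory[ip].d;
  return m->memory[ip].a;
}

static void run(machine_t *m) {
  while (m->memory[INSTP].a != 0) {
    switch (VAL(fetch(m))) {
    case OP_LIT: push(m, fetch(m)); break; /* the next cell is the operand */
    case OP_ADD: {
      cval_t b = pop(m);
      cval_t a = pop(m);
      push(m, MKINT(VAL(a) + VAL(b)));
      break;
    }
    default: return; /* HALT, or anything this sketch does not know */
    }
  }
}

/* Build the six-cell list for (#3 #4 + HALT) back to front and run it. */
static int run_example(void) {
  static machine_t m;
  const cval_t prog[] = { MKINT(OP_LIT), MKINT(3), MKINT(OP_LIT),
                          MKINT(4), MKINT(OP_ADD), MKINT(OP_HALT) };
  cval_t code = 0; /* nil */
  initialize_machine(&m);
  for (size_t i = sizeof prog / sizeof prog[0]; i > 0; i--)
    code = cons(&m, prog[i - 1], code);
  m.memory[INSTP].a = code;
  run(&m);
  return VAL(pop(&m));
}
```

`run_example()` returns 7. There is no program counter arithmetic anywhere: the only way to reach the next instruction is through a CDR.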
Now let's examine an if expression in MPL
(#3 #4 < BRNCH (#100) (#200) #5 + HALT)
We push 3 and 4 to the stack and compare them, which gives us a 1 (truth value). The if (BRNCH) then checks this value and, depending on whether it is truthy or not, changes the instruction pointer to the list of code at either the then or the else position. This basically creates independent strings of code.
;; At address A
(LIT . (100 . @D))
;; At address B
(LIT . (200 . @D))
;; At address C
(LIT 3 LIT 4 < BRNCH @A @B)
;; At address D
(LIT 5 ADD HALT)
if expression compiled to MPL.
@D is syntax to create a pointer literal. As you can see, we encode the
execution graph in the cons cells themselves, so no JMP instruction is needed:
the basic step of following the pointer in the CDR takes care of that. The loop
structure exemplifies this even more, since a loop is basically an if where the
then-branch folds back onto itself.
(#0 &loop-start dup #10 < BRNCH (#1 + . @loop-start) (HALT))
Several things to unpack here:
&loop-start is MPL's syntax for a label and can be paired with @ to then
point to that cons cell. The . syntax allows you to set the CDR value of the
end of a list instead of the default. In this case we are setting the CDR of
the final cell of the then-branch to the loop-start. So just by following the
next cell in the chain you loop back to the start. The else-branch then
becomes what happens after the loop.
In this case we end up with three lists of code in the memory:
;; At address A, the loop body
(LIT . (1 . (ADD . @(C + 2))))
;; At address B, after the loop
(HALT . NIL)
;; At address C, before the loop and loop comparison
(LIT . (0 . (DUP . (LIT . (10 . (LTH . (BRNCH . (@A . (@B . NIL)))))))))
loop compiled to MPL.
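The three lists above can be built by hand and run with a small interpreter loop. This is a sketch under several assumptions: the opcode numbers are hypothetical, DUP is treated as an opcode for brevity, and BRNCH is assumed to pop the condition and read its two pointer operands from the next two cells. The interesting part is `run_loop`: the body's last CDR points back at the DUP cell, so the loop exists only as a cycle in the cell graph.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define CELLCOUNT 32768
enum { FREEC = 1, INSTP = 2, DSTACK = 3 };                   /* special cells */
enum { OP_HALT, OP_LIT, OP_ADD, OP_LTH, OP_DUP, OP_BRNCH }; /* hypothetical numbers */
#define VAL(x) ((x) & 0x7FFF)
#define MKPTR(a) ((uint16_t)((a) & 0x7FFF))
#define MKINT(n) ((uint16_t)(((n) & 0x7FFF) | 0x8000))

typedef uint16_t cval_t;
typedef struct cell { cval_t a, d; } cell_t;
typedef struct machine { cell_t memory[CELLCOUNT]; } machine_t;

static void initialize_machine(machine_t *m) {
  memset(m->memory, 0, sizeof(m->memory));
  for (cval_t i = 5; i < CELLCOUNT - 1; i++) m->memory[i].d = MKPTR(i + 1);
  m->memory[FREEC].a = MKPTR(5);
}
static cval_t cons(machine_t *m, cval_t a, cval_t d) {
  cval_t head = VAL(m->memory[FREEC].a);
  assert(head != 0 && "out of cells");
  m->memory[FREEC].a = m->memory[head].d;
  m->memory[head] = (cell_t){ a, d };
  return head;
}
static void push(machine_t *m, cval_t v) {
  m->memory[DSTACK].a = cons(m, v, m->memory[DSTACK].a);
}
static cval_t pop(machine_t *m) {
  cval_t top = m->memory[DSTACK].a;
  assert(top != 0 && "stack underflow");
  cval_t v = m->memory[top].a;
  m->memory[DSTACK].a = m->memory[top].d;
  m->memory[top].d = m->memory[FREEC].a; /* recycle the stack cell */
  m->memory[FREEC].a = top;
  return v;
}
/* Read the CAR of the current code cell, then follow its CDR. */
static cval_t fetch(machine_t *m) {
  cval_t ip = m->memory[INSTP].a;
  m->memory[INSTP].a = m->memory[ip].d;
  return m->memory[ip].a;
}
static void run(machine_t *m) {
  while (m->memory[INSTP].a != 0) {
    cval_t a, b;
    switch (VAL(fetch(m))) {
    case OP_LIT: push(m, fetch(m)); break;
    case OP_DUP: a = pop(m); push(m, a); push(m, a); break;
    case OP_ADD: b = pop(m); a = pop(m); push(m, MKINT(VAL(a) + VAL(b))); break;
    case OP_LTH: b = pop(m); a = pop(m); push(m, MKINT(VAL(a) < VAL(b))); break;
    case OP_BRNCH: /* operands: then-pointer, else-pointer */
      a = fetch(m);
      b = fetch(m);
      m->memory[INSTP].a = VAL(pop(m)) ? a : b;
      break;
    default: return; /* HALT */
    }
  }
}
/* Hypothetical helper: prepend n items onto an existing tail. */
static cval_t list_from(machine_t *m, const cval_t *items, size_t n, cval_t tail) {
  while (n > 0) tail = cons(m, items[--n], tail);
  return tail;
}
static int run_loop(void) {
  static machine_t m;
  const cval_t test[] = { MKINT(OP_DUP), MKINT(OP_LIT), MKINT(10),
                          MKINT(OP_LTH), MKINT(OP_BRNCH) };
  const cval_t body[] = { MKINT(OP_LIT), MKINT(1), MKINT(OP_ADD) };
  const cval_t head[] = { MKINT(OP_LIT), MKINT(0) };
  initialize_machine(&m);
  cval_t after = cons(&m, MKINT(OP_HALT), 0);                 /* list B */
  cval_t params = cons(&m, 0, cons(&m, MKPTR(after), 0));     /* @A patched below */
  cval_t loop_start = list_from(&m, test, 5, params);         /* the DUP cell */
  cval_t loop_body = list_from(&m, body, 3, loop_start);      /* list A, CDR folds back */
  m.memory[params].a = MKPTR(loop_body);
  m.memory[INSTP].a = list_from(&m, head, 2, loop_start);     /* list C */
  run(&m);
  return VAL(pop(&m));
}
```

`run_loop()` returns 10: the counter is incremented until the comparison fails, and the else-branch then halts with the counter still on the stack.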
DUP is not an opcode; instead it is a HIROI word that operates on the stack by
means of the cell operations. HIROI operates very much like a Forth does:
there are no frames and no returns, it's all operations on the data stack and
jumping from procedure to procedure. It's very uncomplicated.
4. The Stacks
There are two stacks in LPC, the data stack and the return stack. The data
stack holds all data for calculations, operations, etc., while the return stack
holds the call frames. We'll talk about frames and functions in the next
section. For now let's talk about how stacks are managed.
We introduce two functions, PUSH and POP, that operate on arbitrary stacks
given a stack ptr (usually either DSTACK or RSTACK).
void push(machine_t *m, cval_t stack_ptr, cval_t v) {
  cval_t top_ptr = car(m, stack_ptr);
  cval_t new_top = cons(m, v, top_ptr);
  scar(m, stack_ptr, new_top);
}
cval_t pop(machine_t *m, cval_t stack_ptr) {
  cval_t top_ptr = car(m, stack_ptr);
  cval_t val = car(m, top_ptr);
  scar(m, stack_ptr, cdr(m, top_ptr));
  /* Return the popped cell to the free list. */
  scdr(m, top_ptr, car(m, FREEC));
  scar(m, FREEC, top_ptr);
  return val;
}
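Because a stack is only a list hanging off a special cell, the same two functions serve both stacks. The sketch below exercises them on DSTACK and RSTACK independently; `stack_depth` is a hypothetical helper of mine, and the block inlines the cell operations so that it compiles on its own.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define CELLCOUNT 32768
enum { FREEC = 1, DSTACK = 3, RSTACK = 4 }; /* special cells */
#define VAL(x) ((x) & 0x7FFF)
#define MKPTR(a) ((uint16_t)((a) & 0x7FFF))
#define MKINT(n) ((uint16_t)(((n) & 0x7FFF) | 0x8000))

typedef uint16_t cval_t;
typedef struct cell { cval_t a, d; } cell_t;
typedef struct machine { cell_t memory[CELLCOUNT]; } machine_t;

static void initialize_machine(machine_t *m) {
  memset(m->memory, 0, sizeof(m->memory));
  for (cval_t i = 5; i < CELLCOUNT - 1; i++) m->memory[i].d = MKPTR(i + 1);
  m->memory[FREEC].a = MKPTR(5);
}

static cval_t cons(machine_t *m, cval_t a, cval_t d) {
  cval_t head = VAL(m->memory[FREEC].a);
  assert(head != 0 && "out of cells");
  m->memory[FREEC].a = m->memory[head].d;
  m->memory[head] = (cell_t){ a, d };
  return head;
}

/* New top cell: value in the CAR, old top in the CDR. */
static void push(machine_t *m, cval_t stack_ptr, cval_t v) {
  m->memory[stack_ptr].a = cons(m, v, m->memory[stack_ptr].a);
}

static cval_t pop(machine_t *m, cval_t stack_ptr) {
  cval_t top = m->memory[stack_ptr].a;
  assert(top != 0 && "stack underflow");
  cval_t val = m->memory[top].a;
  m->memory[stack_ptr].a = m->memory[top].d;
  m->memory[top].d = m->memory[FREEC].a; /* return the cell to the free list */
  m->memory[FREEC].a = top;
  return val;
}

/* Hypothetical helper: number of entries on a stack. */
static size_t stack_depth(const machine_t *m, cval_t stack_ptr) {
  size_t n = 0;
  for (cval_t p = m->memory[stack_ptr].a; p != 0; p = m->memory[p].d) n++;
  return n;
}
```

Pushing 1 and 2 onto DSTACK and 9 onto RSTACK, then popping, yields 2 and 1 from the data stack and 9 from the return stack. The two stacks share the free list but never each other's cells.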
These functions are needed for bootstrapping, since the core interpreter uses them (for now; I'm trying to get them out, because one of my objectives is to make the stack exist on top of the machine rather than be an intrinsic part of it).
5. Frames & Functions
I'm going to define how frames work in the default runtime and language inside
of LPC, KODAKAI (小高い). We are no longer in the VM. The VM is over;
everything from now on is happening in "user-space" (for lack of a better term).
Frames, calling conventions, and calling a function or a closure all happen in
"user-space". We don't have a CALL, RET or TAIL opcode. These
will be procedures in HIROI (従).
A frame is (you guessed it) a chain of CONS cells. It holds a pointer to the
parent frame in the CAR, the INSTP to return to on return, and a pointer to the
head of a list of locals. The locals are a list of cons cells: each cell holds
the value of one local in its CAR, and its CDR of course points to the next
element in the list.
When a function is called, the compiler first needs to emit the creation of a frame, which is then pushed to the return stack. The emitted code then moves the arguments from the stack into the first locals, and the function body starts. Any operation on a local is addressed by a depth (how many parents above) and an index (which cons cell in the frame).
On return that frame is popped from the return stack and the INSTP is restored.
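Here is a sketch of how a local could be resolved by depth and index. The exact frame layout is an assumption on my part: I use a three-cell chain whose CARs hold the parent frame, the return INSTP and the head of the locals list, in that order. `make_frame` and `get_local` are hypothetical names, and in LPC this logic would live in HIROI procedures rather than in C.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define CELLCOUNT 32768
#define FREEC 1 /* head of the free list */
#define VAL(x) ((x) & 0x7FFF)
#define MKPTR(a) ((uint16_t)((a) & 0x7FFF))
#define MKINT(n) ((uint16_t)(((n) & 0x7FFF) | 0x8000))

typedef uint16_t cval_t;
typedef struct cell { cval_t a, d; } cell_t;
typedef struct machine { cell_t memory[CELLCOUNT]; } machine_t;

static void initialize_machine(machine_t *m) {
  memset(m->memory, 0, sizeof(m->memory));
  for (cval_t i = 5; i < CELLCOUNT - 1; i++) m->memory[i].d = MKPTR(i + 1);
  m->memory[FREEC].a = MKPTR(5);
}

static cval_t cons(machine_t *m, cval_t a, cval_t d) {
  cval_t head = VAL(m->memory[FREEC].a);
  assert(head != 0 && "out of cells");
  m->memory[FREEC].a = m->memory[head].d;
  m->memory[head] = (cell_t){ a, d };
  return head;
}

/* Assumed layout: (parent . (return-instp . (locals . nil))). */
static cval_t make_frame(machine_t *m, cval_t parent, cval_t ret, cval_t locals) {
  return cons(m, MKPTR(parent), cons(m, MKPTR(ret), cons(m, MKPTR(locals), 0)));
}

/* Walk `depth` parents up, then `index` cells into that frame's locals. */
static cval_t get_local(const machine_t *m, cval_t frame, unsigned depth, unsigned index) {
  while (depth-- > 0) frame = VAL(m->memory[frame].a); /* CAR of first cell: parent */
  cval_t ret_cell = m->memory[frame].d;                /* second cell: return INSTP */
  cval_t locals_cell = m->memory[ret_cell].d;          /* third cell: locals head */
  cval_t local = VAL(m->memory[locals_cell].a);
  while (index-- > 0) local = m->memory[local].d;
  return m->memory[local].a;
}
```

With an outer frame holding the locals (10 20) and an inner frame holding (30), `get_local(inner, 0, 0)` yields the INT 30 and `get_local(inner, 1, 1)` yields the INT 20. Lookup cost grows with both depth and index, which seems acceptable for the small frames a machine of this size will hold.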
5.1. Closures
A closure is a CONS cell with the CAR being the first cell of code, and
the CDR being the frame it was created in.
6. GC
TODO, but some thoughts and ideas:
- Built as a program in the system itself, so it's fully modifiable at runtime.
- How do you make sure you can GC in a system low on memory in these conditions?
- XOR a fingerprint while it runs as a kind of a marker of where we reached? Then a final pass that clears everything not marked. Should never delete