SandBox is a relatively safe and simplified virtual machine designed to minimize the usual hasssle in getting an operating system working. Much of the complexity of process management has been automated in architecturally supported semaphors, so that a full context switch, including memory manager retargeting, can be done in less than 100 machine instructions. Input and output has also been largely idealized and simplified to make writing device drivers as painless as possible. The virtual machine also includes a debugging mode with single-step capabilities.
FIXED SandBox now runs continuously without crashing!
FIXED The keyboard interrupt now really works OK, and the return key now (really) makes it in.
FIXED The virtual disk hardware is working correctly now.
FIXED The MMU hardware is now known to work. (2004 March 24, fixed a couple more MMU bugs)
The Pause button, if you click it to stop execution, turns back into the Run button and immediately clicks itself. Click on the Step button to stop execution.
Not very many parts are still untested.
Figure 1. The normal execution interface.
Clicking on the Debug button toggles between this form and the debugging mode shown in Figure 2, which includes all the interface elements of the normal execution mode, plus several additional debugging elements. These are described below. It also opens a second window for source level debugging. For more detail on how things work, see Understanding SandBox Code.
MemorySelectors Memory Display
Figure 2. The debugging mode interface.
Video Screen -- The large panel on the left is the simulated video screen. It is initially a light gray when SandBox is initially started, and changes to a dark gray as shown when you first roll the mouse over it, thus showing that the system is ready to begin.
Print Button -- Clicking this button activates the printer by opening an output text file to receive the printed output, or deactivates it again by closing the file.
Print File -- This is the name of the currently open print file. It is blank if no file is open.
Text -> Floppy -- Click this button to convert a specially formatted text file such as produced by the compiler, into a virtual floppy disk and mount it.
Mount -- Click this button to mount a virtual floppy disk, or to "eject" the the currently mounted floppy. It's a little flakey and often takes several attempts, but you can click on this button while in Run mode to change out a floppy without stopping execution.
Floppy File -- This is the name of the currently mounted virtual floppy disk file, or blank if none.
Boot -- Click this button to load a program from the currently mounted floppy disk into memory and start it. In Debug mode the program is paused, ready to step or run.
Restart -- Click this button to stop current execution and restart the program last booted. In Debug mode the program is paused, ready to step or run. Note that you can use this button to start up your boot program with a different floppy in the drive.
Debug -- Click this button to toggle the debug mode and enlarge or shrink the window to match.
Run/Stop -- Click this button to run continuously. To pause execution you should click some other button, such as Step.
Step -- Click this button to step one virtual machine instruction. If the source code is available, you can trace one line of source in the Source Code Debugger window. The effects will be shown on the trace panel.
PreMo -- Click this button after you have stopped in a repeatable error or a breakpoint, to restart execution and trace the last 1000 cycles before the stop. This is useful when you are deeply in your program when the error condition occurs, and you don't want to trace a million irrelevant lines just to find out what happened going into the error condition.
ClockInts -- If this checkbox is not clicked, you will not get clock interrupts in debug mode. This is helpful when single-stepping, because they come all the time otherwise.
Registers -- This is the current values of the four CPU registers, plus the MMU address register. If the MMU is active, it also shows the real addresses the four CPU registers map to. The checkbox indicates whether the MMU (and interrupts) are enabled.
Breakpoints -- (Not in this screen shot) You can type in up to 9 memory addresses, which will stop execution when the program touches that address, either as instruction, or else load or store. Delete the currently visible breakpoint by clicking the Delete button.
Trace -- This panel traces the stepped execution. If the Trace All checkbox is checked, it shows all instructions in Run mode; otherwise it shows only the first and last. Normally you would leave this unchecked when using the Source Code Debugger.
Memory Selectors -- These four combo popups make it easy to select memory to view based on a register value, or the I/O register area. You can also type in the hex address to be displayed. Choose +0020 to display the next 32 bytes after the line above it.
Memory Display -- These four lines
of text display the current memory contents as selected by the memory selectors.
All declared local variables in each stack frame are listed below the function name, as name, type, relative offset, absolute (real) address, and current value. Arrays and other structures larger than two bytes show only the first two bytes.
The Trace panel displays the most recent executed code. If the Trace Disasm checkbox is checked, the listing shows both source code and interleaved disassembled machine code, as shown. If it is not checked, only source code is shown. In any case this trace always executes complete lines before pausing. If you have the Printer enabled, it also writes the trace listing to that file.
The Step button executes machine code until the next embedded line number, which typically is one line of source code. It will stop with the line about to be executed last on the listing. In the screen shot here, line 164 is ready to be executed.
The Over button is like Step, but in the case of subroutine calls or context switches it continues without tracing until the subroutine or interrupt returns. If your memory manager moves the real address of the code or stack while this button is awaiting return, it probably will not recognize the return. Click the Pause button to regain control if this happens.
The StepOut button is like Over, but it continues until the current function exits. This can be useful if you step into a function then realize it will take a long time to process and its operation is not interesting at the moment. Note that StepOut has no way of knowing if it is activated in an interrupt handler, so it only applies to the current function in the current process. If the context switches while waiting, it will continue to wait until the context switches back.
The Disassemble button is used to create a complete, source-order listing of your program file, including the disassembled machine code. If you have the Printer enabled, it also writes the disassembly listing to that file. WARNING: There is a bug in VisualBasic that limits the scroll capability of large text fields. It will look like the disassembler has stopped, but it's still running; just wait, and when the listing panel returns to the top line, it will have finished.
The stack frame panel is continuously and fully refreshed during execution; if you need a reference copy of the current state, you can click the Stack --> Print button to get it added to the current open print file.
Note that gathering and formatting the debugging information takes considerably
longer than execution in the non-debug mode. With full trace it probably
runs three or more orders of magnitude slower than with the dibugger closed;
even if you are not currently tracing, it is likely to run two orders of
magnitude slower. For more detail on how things work, see Understanding
Memory is addressed by byte from 00000 to 1FFFF, which is divided into 1K pages by the MMU, the last page at address 1FC00 is reserved for Input/Ouput registers, but at this time all I/O registers are in the last 36 bytes. The penultimate memory page at 1F800 does not exist and always generates a memory fault on access. Except for instruction fetch, all memory access is in 2-byte words; there is no penalty for odd address access, but it should not normally occur in normal operation except for byte data. Word addressing is "Big-Endian" so that the addressed byte represents the most significant byte of the 16-bit word, and the least significant byte is in the next sequentially higher address.
The MMU and the semaphore mechanism require their data to be in the first 64K or 32K of memory, that is to have a 2-byte positive address.
There are four programmable 16-bit CPU registers:
SP -- The stack pointer always points to the top two bytes of the stack, which grows in a positive direction. All operations work on 2-byte data at the top of the stack, pushing or popping as necessary.
FP -- The frame pointer is the basis for local variable management. It is advanced to the current SP position on subroutine entry, and restored to its previous value on exit. Local variables are at positive or negative offsets from the FP.
PC -- The program counter always points to the next instruction to be executed, which unlike the other three registers, is often at an odd byte address. Most instructions are one byte, but the FP-based load and store instructions, and the branch and call instructions are generally two bytes. Two instructions occupy three bytes.
CB -- The Code Base register points to the beginning of a program's executable code, which starts with a table of 2-byte subroutine offsets, relative to CB. A program is thus potentially 64K bytes long and completely self-relative.
During context switch, all registers except the SP are pushed onto the
stack, then the SP is exchanged with another program's SP in lower memory,
then the three registers from that program's stack are popped, leaving
execution to resume in the new context. The MMU (and with it, interrupts)
is selectively disabled by context switched into the kernel, and re-enabled
on resuming user mode. Thus the kernel is expected to run largely with
the MMU turned off. The MMU provides all the system protection; there are
no priviledged instructions.
1FFFE -- (MMU.address) MMU page table addressThere is also a 16-byte Interrupt Vector at address 00000.
1FFFC -- (MMU.fault) MMU fault address
1FFFA -- (SignalWait.address) Semaphore address
1FFF8 -- (Enables.datum) Interrupt enable, Printer online, Floppy mounted, Disk interrupt pending
1FFF6 -- (Print.datum) Printer/Keyboard data
1FFF6 -- (KeyIn.datum) Printer/Keyboard data
1FFF6 -- (Mouse.datum) Mouse btn & modifier keys
1FFF4 -- (Disk.count) Disk byte count
1FFF2 -- (Disk.address) Disk DMA address
1FFF0 -- (Disk.sector) Disk sector number
1FFEE -- (Mouse.H) Mouse horizontal position
1FFEC -- (Mouse.V) Mouse vertical position
1FFEA -- (Video.address) Video DMA address
1FFE8 -- (Video.datum) Video color/operation
1FFE6 -- (Video.right) Video rectangle right
1FFE4 -- (Video.bottom) Video rectangle bottom
1FFE2 -- (Video.left) Video rectangle left
1FFE0 -- (Video.top) Video rectangle top
1FFDE -- (Date.seconds) Current time, in seconds in this hour (0..3599)
1FFDC -- (Date.hours) Date & time, in hours from midnight 2000 Jan 1
1FFD0 -- (MMU.CB, MMU.FP, MMU.PC) MMU double-fault register save
1FF00 -- MMU odometers
In normal usage, the operating system would normally allocate page 0 in lower memory for semaphore and process ID block usage, then the remainder of the program's required memory sequentially as needed from upper memory, then fill out the rest of the 64-byte table with 126, meaning that no memory is allocated for that page. The stack grows upward into that unallocated space, so additional pages can be allocated on demand. For dynamic memory requirements, the pages can be allocated downward from the top of the table, until 64K has been allocated. No single process can simultaneously access more than 64K, although a system might choose to offer paging facilities for swapping additional memory in and out of visibility.
In case of a page fault -- either attempting to write into a read-only page, or else any access to page 126 -- the MMU Page Fault interrupt is invoked. The (virtual) address being attempted is stored into the fault address register, where it is easily accessed by the interrupt driver for swapping or other management chores. If the cause of the fault was the stack growing into an unmapped or read-only page, then of course the processor state cannot be saved on the stack. In that case only, three registers in the I/O area (MMU.CB, MMU.FP, and MMU.PC) get the processor state. You can tell this happened if MMU.PC is non-zero, and you should copy all three of them into the appropriate places on the user stack after enabling a writable page there.
Note that the MMU register is only 16 bits, so the page table must always be in the first 64K of real memory; since you cannot even touch the second 32K without the MMU turned on, it is practically limited to the first 32K. The MMU is disabled during interrupt servicing, but you can still use the current MMU by invoking the mmx instruction just before the memory load or store inside the interrupt handler. The compiler gives you access to this operation in the mmPEEK() and mmPOKE() functions.
Implementing a virtual memory strategy requires
some additional information on page usage, so you can know which pages
to swap out. Every time a mapped page is touched two bytes in the I/O space
are incremented (up to a maximum of 255). At address 1FF00 are
128 bytes representing real memory pages. At 1FF80 are 64 more
bytes representing the pages visible under the current memory map table.
Depending on your aging strategy, you can use either or both tables. Reset
them to zero, then look at them from time to time to see what has been
used since they were reset; look at their values to see how many times
they were used (up to 255). These bytes are not updated for memory access
when the MMU is turned off or disabled (as in an interrupt driver), except
for mmx usage.
Note that the interrupted process SP is exchanged with the respective
interrupt handler's SP; the semaphore itself is left unchanged. It is the
process manager's responsibility (in the case of wait)
to remove the interrupted process SP from the interrupt control pointer
(SemSP in the MiniOS) and link it into
the semaphore; or (in the case of signal) to unlink a process
from the semaphore and add it to the ready list before returning to the
Note that the keyboard and mouse share this input register, so keyboard
data also has a sign bit reflecting the current state of the mouse button.
Disk input and output is controlled by a Direct Memory Access (DMA) controller: the program stores a memory address into the address register, a sector number into the sector register, and then stores a byte count to read or write into the count register. Once the count register has been set, the disk controller begins a seek to the sector address, then starts DMA transfer of the data bytes into or from memory, until the count is used up or the sector has been fully transferred. If the count is not yet zero, it continues to wait for the next sector to become available, then transfers another 256 bytes (or less), and so on. When the count reaches 0, the disk interrupt is invoked. A seek with no transfer can be triggered by issuing a count of 0.
If the count stored is positive, the disk data will be read; if negative, the data is written. Note that a whole sector is always written, regardless of the count value.
The floppy drive is accessed with negative sector numbers (sector 0 is -1), but otherwise works the same. If no floppy disk is mounted, it interrupts immediately with no transfer.
One of the user buttons on the SandBox window is labelled "Boot". If
there is a mounted floppy disk image, this button loads it into memory
at address 18000 and starts executing the program loaded. If there is no
image mounted, it requests a mount, then continues. The floppy disk is
read beginning from sector 4, which is where the "->Floppy" button puts
the compiled memory image from the OSL compiler. If your programs write
their own boot floppy disk, you need to be sure the code is in the right
place. If you attempt to boot a disk with other data in that sector, it
will still try to load and execute it, probably catastrophically.
The virtual screen has a "graphics accelerator" that does much of the video display work needed by a simple operating system. It has five commands, defined by the upper bits of the Video.datum register. All operations are immediate and synchronous; there is no need for an interrupt. (Actually, the operations take some real time, but the CPU is stalled for the duration).
To operate the video accelerator, the program first sets the boundary rectangle coordinates, top, left, bottom, and right. Depending on the operation, it may also set a memory address for the data. If there is color information, it is stored in the datum register with the display command. These are the defined commands:
0: Picture Copy. Use this to copy an image from memory to the screen. Each pixel of the image must be one of the 216 colors in a byte (the other values are not displayed); the low order bits of the command word are not a color (because the color information is in the data), but rather the number of bytes each line occupies (which could be different from the displayed image width). The image data is read from sequential bytes in memory in left-to-right row-major order.Each two bytes of font data represents one vertical column of the generated text, with the least significant bit at the top. Think of the bits in the character rotated 90 degrees to the left for storage, thus:
1: Fill rectangle with color. Use this also for horizontal and vertical lines by setting the rectangle with to one pixel.
2: Generate a character by setting the pixels of the top 16 bits of the rectangle to the specified color where the bits of the memory data is 1, and leaving the previous colors unchanged where the bits are 0.
3: Define a clipping region to draw in. There are two sub-commands, ClipRect to set an initially rectangular region to draw in, and HideRect to punch out rectangular holes in that region.
The color in each case is a single byte encoding one of six values each
for red, green, and blue, thus:
Summing the color values for each of the primaries gives a reasonable
distribution of colors. Black is 0, white is 215, and gray is any multiple
of 37. Using color values greater than 215 may have unpredictable results.
The clipping region is initially set unbounded. For a non-rectangular region, you should first set the outside bounding rectangle using ClipRect, then delete unwanted parts using HideRect. For example, in the region here, you would first set a ClipRect 10,20,50,90, then HideRect 10,65,25,90, and again HideRect 35,20,50,70. You are limited to somewhere between four to six holes (15 rectangular subregions), which should be sufficient for a simple windowing system with not too many overlapping windows.
// iVector is MMU
// iVector is Clock
// iVector is Disk
// iVector is KeyIn
// iVector is Print
// iVector is SysCall
// iVector is Signal
// iVector is Wait
The two push instructions push respectively a full 16-bit constant onto the stack, or else a small number between 0 and 31. The Load and Store frame instructions, ldf and stf, have a signed word number offset, which is doubled and used to access that word offset from the FP. Load copies that word and pushed it onto the top of stack, while store pops the top of stack and stores it into that word. The two indexed instructions ldx and stx pop a 16-bit address off the stack, and then either push the addressed word, or else pop another word to store at that address.
Everything in the SandBox virtual machine is (2-byte)
word oriented except the machine instructions. However, it is possible
to access individual bytes computationally. To load a byte, one simply
loads the 2-byte value ending with that byte, then masks off the unwanted
upper byte using the and instruction. Storing a byte into
memory can also be done by masking, but it is quite complex, so the StoreByte
instruction provides a somewhat simpler way to accomplish the same end.
This has the same operation code as the aix
instruction, but the top word on the stack is 0; the two words under it
are the same as for the stx instruction, except that stb
stores only the low byte of the lower word.
|16||leq||Less or equal|
|17||geq||Greater or equal|
|1F||div||Divide with remainder|
Negative neg replaces the top of stack with its two's complement negative. IsZero isz replaces the top of stack with 1 or 0 depending respectively on whether the previous value is zero or not. It can also be used as a logical NOT operator on boolean values.
The six comparison operators pop two numbers and push 1 (true) or 0 (false) as the two values are equal, less, greater, or not; the right operand is popped first, that is they are pushed onto the stack in left-to-right order. The three bitwise logical operators pop two numbers, perform the operation, and push the 16-bit result; they can also be used on boolean values. To obtain a bitwise complement, use xor with a second operand -1.
The arithmetic operators similarly pop their two operands, do the arithmetic, and push the result. The subtract instruction sub subtracts the top number from the second, again assuming they were pushed in normal left-to-right order. The integer divide instruction div pushes two results, first the quotient, then the remainder. Thus to obtain just the integer quotient, push the dividend, then the divisor, then do the divide, then pop (discard) the remainder. To obtain just the remainder, proceed as above, but swap (exchange) the results before popping. If the divisor (the first value to be popped) is zero, the divide operation does nothing, leaving both original operands on the stack unchanged.
The shift instruction shft is a little more complex:
the top value on the stack is taken as a signed value, the direction and
number of places to shift the second value. Positive shift values shift
left, and negative values shift it right with zero fill. The 16-bit shifted
result is pushed back on the stack; bits shifted off either end are lost.
|04||dupe||Duplicate top of stack|
|05||pop||Pop and discard top|
|06||swap||Exchange top two words|
These operations manipulate the top elements of the stack without altering
their values. Duplicate pushes a second copy of the top element; pop removes
and discards the top. The other two move the top word down one or two words
and bring the others closer to the top. These can be used, for example,
to duplicate the top two words and leave them in pairs: dupe,
This sequence leaves them reversed from the order they were originally
in, but that can be corrected by a couple more swaps.
|03||exit||Return from subroutine|
|Cx,xx||bz||Branch if zero|
There are seven instructions to effect programmed flow control. The three branch instructions add a signed offset (either from the instruction word, or else popped from the stack) to the PC to jump to some other part of the program, possibly nearby. The bz instruction first pops a value (typically the result of a compare) from the stack, then branches only if that value is zero; otherwise it continues with the next instruction in sequence. In an if-statement, the condition is evaluated, then bz branches to the else-part if the condition is false. In a while loop, the condition is evaluated, then bz branches out of the loop if the popped value is false. The indexed branch brx is normally used to calculate some number of bytes to jump over, such as into a switch table.
The call instruction assumes that the CB register points to the front of the program, and that there is a table of 2-byte offsets there; the low 13 bits of the call instruction select one of 8191 entries in the table, which is added to the CB to form the address of the subroutine entry. The previous value of the PC (the address of the instruction after the call) is pushed onto the stack, then the PC is replaced by the calculated subroutine address, and execution continues at that address. Subroutine #0 is not used; call#0 pushes the PC but continues with the next instruction. This can be used to calculate the real (virtual, if MMU is active) address of some other part of the program code, such as a constant table.
The enter and exit instructions set up and tear down the stack frame on subroutine entry and exit, respectively. Each operation takes a single operand which it pops off the stack, presumably placed there by a prior push instruction. In the case of enter, the operand represents the number of local variable words to allocate space for; in the case of exit, it is the number of argument words to remove from under the return address. Prior to executing the call instruction, the calling program should push first a placeholder for the return value, then any parameter values, then finally the address of the program globals (static variables). The call itself pushes the return address, then the entry sequence pushes a number representing the space to allocate for local varuables. At this point the enter instruction pops off that number and replaces it with the current (now previous) frame pointer, sets the FP to point to the globals address (just below the return address), and moves the SP up to accomodate the local variables requested, leaving the stack looking as at the left here. The parameters can be accessed with negative offsets on the FP, and the local variables with positive offsets. To access global or static variables, the program simply loads the globals address from offset 0, adds to it whatever offset is necessary to address the variable, then uses an indexed load or store.
This process is reversed in the case of an exit. The number pushed should be the number of words the caller pushed (not counting the result word, but including the globals pointer), and the exit instruction restores the old FP and PC from the stack, pops off the indicated number of argument words, leaving the stack with just the result on top, in the calling program, ready to store or otherwise use that result.
Giving a negative number of local variables to enter is nominally invalid, so it replaces it with the current FP and continues with the next instruction, without completing the entry protocol. This is useful for calculating an indexed address of a local variable.
The cocall and syscall operations are used for completely switching context between separate processes. The syscall essentially invokes a software interrupt; the interrupt handler can examine the stack contents for parameters, perform the requested function, and return with a cocall. These two operations have a further side effect: like all interrupts, the syscall instruction disables further (hardware) interrupts and the MMU; the cocall instruction always enables interrupts and the MMU (if it has a non-zero table address).
The interrupt vector in the first 16 bytes of memory gives memory addresses
for eight interrupts, one of them (000A) being syscall.
It is assumed that at the addressed memory location there is a stored SP
for the interrupt handler, which the interrupt exchanges the current SP
(of the interrupted program) for this memory value to effect a complete
context switch; note that the other three CPU registers are saved on and
reloaded from their respective stacks. When the handler completes its duties,
it pushes the address of this same memory location, then executes a cocall
to reverse the context switch and be ready for the next interrupt.
|07||aix||Array bounds check|
The mmx instruction overrides the MMU disable for one instruction, so that an interrupt handler can access a memory location using the interrupted program's MMU table. Normally the interrupts and MMU are disabled by an interrupt (including syscall). However it may be necessary to access the interrupted program data space using addresses supplied by that program. Since the program has no way of knowing its own real addresses, the system must do that; mmx makes it substantially easier. Although the user program can also execute the mmx instruction, it has no effect because the MMU is already enabled.
The Array Index Bounds Check aix instruction takes the address of the front of an array, the calculated byte offset, and the number of bytes allocated to the array (pushed in that order) to determine if the calculated byte offset (which is the programmed index value multiplied by the element size) lies within the array bounds; the top of stack is replaced with a boolean true or false (1 or 0) exactly as the index value is within bounds or not. The base address and offset are also summed and left below the boolean result. For dynamic arrays whose size cannot be known at compile time, a negative size can be pushed, and this instruction looks at the value stored in the two words immediately before the array for a byte size number to use. An array size of zero must be checked by the dynamic method, because a zero compile-time size is reserved for the stb instruction, which uses the same opcode.
The nop instruction has no effect on the program state (other than advancing the PC; its sole purpose is to embed line numbers in the code, so to link to source code in the debugger.
Process management in SandBox is complex enough
to deserve its own document.
Rev: 2004 April 6