Understanding SandBox Code


This document is a tutorial to walk you through some of the technical details that you should have a grasp of when designing and debugging an operating system for SandBox. For best results, you should try stepping through your own copy of the compiled code (actual addresses may be different), and looking at what is in memory where. Skimming over a listing like this is not nearly as instructive as watching your own go.

Fast forward to: Clock Interrupt; Semaphores
 

Understanding a Trace

Consider this simple program:
// Tiny Demo Program

int abc;           // 4. first uninitialized var
int def = 5;       // 1. first initialized var
int[5] ghi;        // 3. last static array
boolean tf = true; // 2. second initialized var
char ccc;          // 5. other uninitialized var(s)

void doSomething(int p) { // first method, #2
  if (tf) abc = p;
  } // end of doSomething

void main() { // second method, #3
  int x;      // 2. first uninitialized var
  int y = 3;  // 1. first initialized var
  x = 7;      // first executable, non declaration
  ghi[y] = def + y;
  ccc = '\r';
  doSomething(ORD(ccc));
  } // end of main

Disassembled (in the Source Debugger window), it looks like this, with annotations added:
0000:  000A            -- CB points here; offset to startup code
0002:  0091 _Up        -- offset to subroutine #1, the ref counter for dynamic pointers (at end)
0004:  002E doSomething -- offset to subroutine #2
0006:  004E main       -- offset to subroutine #3, main()
0008:  FFF0

(_Startup_)
000A: 20     push 0    -- first instruction to execute, allocate 0 bytes of variable space
000B: 02     enter
000C: 25     push 5    -- first initialized variable, def, in memory address 00016
000D: 21     push 1    -- second initialized variable, tf, in address 00018
000E: 30     push 16    -- set up one array...
000F: 10     neg       -- -16 is the ref count (=infinity, never dispose it)
0010: 2A     push 10    -- 10 byte size, used by LENGTH()
0011: 20     push 0    -- fill in zeros for uninitialized data (including that array)
0012: 20     push 0
0013: 20     push 0
0014: 20     push 0
0015: 20     push 0
0016: 20     push 0
0017: 20     push 0
0018: 20     push 0    -- other arrays (if they exist) would have their size and ref count set here
                          1: // Tiny Demo Program
                          2:
0019: 000002 nop/line # 2 -- list all unlisted lines so far, in this case, #1 and #2
001C: 4000   ldf 0      -- copy the base address of program globals, = 0010
001E: 6003   call #3 main -- call it
0020: 20     push 0    -- return here when main is done
0021: 03     exit

(_Up)
0091: 20     push 0    -- this subroutine up-counts dynamic variables
0092: 02     enter
0093: 4000   ldf 0
0095: C012   bz 18 (00A9)
0097: 4000   ldf 0
0099: 24     push 4
009A: 1D     sub
009B: 04     dupe
009C: 0C     ldx
009D: 30     push 16
009E: 10     neg
009F: 19     and
00A0: 11     isz
00A1: C006   bz 6 (00A9)
00A3: 04     dupe
00A4: 0C     ldx
00A5: 21     push 1
00A6: 1C     add
00A7: 06     swap
00A8: 0D     stx
00A9: 20     push 0
00AA: 03     exit

(doSomething)
002E: 20     push 0    -- first subroutine in source code (no local variables)
002F: 02     enter     -- space for return value (none), parameter p, globals pointer,
                       --   and return address, are already on stack; this adds saved FP
                          3: int abc;           // 4. first uninitialized var
                          4: int def = 5;       // 1. first initialized var
                          5: int[5] ghi;        // 3. last static array
                          6: boolean tf = true; // 2. second initialized var
                          7: char ccc;          // 5. other uninitialized var(s)
                          8:
                          9: void doSomething(int p) { // first method, #2
                         10:   if (tf) abc = p;
0030: 00000A nop/line # 10 -- list all unlisted lines so far, 3-10
0033: 4000   ldf 0      --tf is a global variable, at 00018 = (globals)+8
0035: 28     push 8
0036: 1C     add       -- now its address is on the stack
0037: 0C     ldx       -- get it
0038: C00A   bz 10 (0044) -- branch forward 10 bytes if false (to code address 0044)
003A: 00000A nop/line # 10 -- line #10 repeated here, for the statement inside the if
003D: 4FFF   ldf -2     -- parameter p is at negative offset -2 from FP
003F: 4000   ldf 0      --abc is another global variable, at 00028 = (globals)+24
0041: 38     push 24
0042: 1C     add
0043: 0D     stx       -- store it
                         11:   } // end of doSomething
0044: 00000B nop/line # 11
0047: 22     push 2    -- unstack parameter p and globals pointer (2 items) when exiting
0048: 03     exit

(main)
004E: 20     push 0
004F: 02     enter
0050: 23     push 3    -- first initialized variable, y, in memory address 00034
0051: 20     push 0    -- uninitialized variable, x, in memory address 00036
                         12:
                         13: void main() { // second method, #3
                         14:   int x;      // 2. first uninitialized var
                         15:   int y = 3;  // 1. first initialized var
                         16:   x = 7;      // first executable, non declaration
0052: 000010 nop/line # 16
0055: 27     push 7    -- first executable line of non-declaration code in main
0056: 5004   stf 8      -- store 7 into local variable at offset +8, which is x
                         17:   ghi[y] = def + y;
0058: 000011 nop/line # 17
005B: 4000   ldf 0      -- get global variable def
005D: 26     push 6
005E: 1C     add
005F: 0C     ldx
0060: 4003   ldf 6      -- get local variable y
0062: 1C     add       -- sum, ready to store;
0063: 4000   ldf 0      -- get address of global variable ghi
0065: 2E     push 14
0066: 1C     add
0067: 4003   ldf 6      -- get local variable y
0069: 22     push 2    -- convert index to byte offset by multiplying *2
006A: 1E     mpy
006B: 2A     push 10    -- check for out of bounds...
006C: 07     aix/stb
006D: 11     isz
006E: C002   bz 2 (0072) -- branch if OK
0070: 21     push 1    --SysCall #1 if not OK
0071: 09     syscall
0072: 0D     stx       -- it's OK, store the value there
                         18:   ccc = '\r';
0073: 000012 nop/line # 18
0076: 2D     push 13    --ASCII value of return character
0077: 4000   ldf 0
0079: 3A     push 26
007A: 1C     add
007B: 0D     stx
                         19:   doSomething(ORD(ccc));
007C: 000013 nop/line # 19
007F: 4000   ldf 0      -- get global variable ccc
0081: 3A     push 26
0082: 1C     add
0083: 0C     ldx       -- it's the first param to doSomething
0084: 4000   ldf 0      -- copy globals pointer, also for doSomething
0086: 6002   call #2 doSomething

0088: 000014 nop/line # 20
008B: 21     push 1    -- got back, and done. There are no parameters to unstack.
008C: 03     exit

Here now is the execution trace for the same program (clicking the Step button in the source debugger window), similarly annotated. You need to become familiar with this mode of debugging, it's about all we have when we are doing system software (well, actually, it's a lot more than we normally have):
F00A: 20     push 0, 0014=0000 -- execution starts with first executable instruction; CB=F000
F00B: 02     enter #0 =0014
F00C: 25     push 5, 0016=0005  -- here is first variable, def, stack address 0016, value 0005
F00D: 21     push 1, 0018=0001
F00E: 30     push 16, 001A=0010
F00F: 10     neg 001A=FFF0
F010: 2A     push 10, 001C=000A
F011: 20     push 0, 001E=0000  -- this part of the initialization can take a while,
F012: 20     push 0, 0020=0000  --  if there are many global variables to clear
F013: 20     push 0, 0022=0000
F014: 20     push 0, 0024=0000
F015: 20     push 0, 0026=0000
F016: 20     push 0, 0028=0000
F017: 20     push 0, 002A=0000
F018: 20     push 0, 002C=0000
                          2:
F019: 000002 nop # 2
F01C: 4000   ldf 002E=0010<-00010 -- globals pointer = 0010
F01E: 6003   call #3 main
F04E: 20     push 0, 0032=0000  -- now we are in main()
F04F: 02     enter #0 =0032
F050: 23     push 3, 0034=0003  -- initializing local variables...
F051: 20     push 0, 0036=0000
                         16:   x = 7;      // first executable, non declaration
F052: 000010 nop # 16
F055: 27     push 7, 0038=0007  -- now the main execution...
F056: 5004   stf = 0007<-00036  -- value 7 is stored into memory address 00036
                         17:   ghi[y] = def + y;
F058: 000011 nop # 17
F05B: 4000   ldf 0038=0010<-0002E
F05D: 26     push 6, 003A=0006
F05E: 1C     add 0038=0016
F05F: 0C     ldx 0038=0005<-00016
F060: 4003   ldf 003A=0003<-00034
F062: 1C     add 0038=0008
F063: 4000   ldf 003A=0010<-0002E
F065: 2E     push 14, 003C=000E
F066: 1C     add 003A=001E
F067: 4003   ldf 003C=0003<-00034
F069: 22     push 2, 003E=0002
F06A: 1E     mpy 003C=0006
F06B: 2A     push 10, 003E=000A
F06C: 07     aix 003C=1        -- result of array bounds check is true (in-bounds)
F06D: 11     iszero 003C=0     -- invert that result
F06E: C002   bz 2 (0)          -- branch on false, top of stack value is false, so it branches
F072: 0D     stx = 0008<-00024  -- store value into calculated array address 00024
                         18:   ccc = '\r';
F073: 000012 nop # 18
F076: 2D     push 13, 0038=000D
F077: 4000   ldf 003A=0010<-0002E
F079: 3A     push 26, 003C=001A
F07A: 1C     add 003A=002A
F07B: 0D     stx = 000D<-0002A
                         19:   doSomething(ORD(ccc));
F07C: 000013 nop # 19
F07F: 4000   ldf 0038=0010<-0002E
F081: 3A     push 26, 003A=001A
F082: 1C     add 0038=002A
F083: 0C     ldx 0038=000D<-0002A -- push value of ccc (=13, fetched from calc'd address 0002A)
F084: 4000   ldf 003A=0010<-0002E -- push my copy of globals pointer
F086: 6002   call #2 doSomething
F02E: 20     push 0, 003E=0000   -- now we are in subroutine doSomething
F02F: 02     enter #0 =003E
                         10:   if (tf) abc = p;
F030: 00000A nop # 10
F033: 4000   ldf 0040=0010<-0003A
F035: 28     push 8, 0042=0008
F036: 1C     add 0040=0018
F037: 0C     ldx 0040=0001<-00018 -- get global variable tf, value true (=1)
F038: C00A   bz 10 (1)           -- branch on false, top of stack value is true, so no branch
                         10:   if (tf) abc = p;
F03A: 00000A nop # 10
F03D: 4FFF   ldf 0040=000D<-00038 -- parameter value p, (int) 13
F03F: 4000   ldf 0042=0010<-0003A
F041: 38     push 24, 0044=0018
F042: 1C     add 0042=0028
F043: 0D     stx = 000D<-00028
                         11:   } // end of doSomething
F044: 00000B nop # 11          -- stop here and "print" the stack frame (see below)
F047: 22     push 2, 0040=0002
F048: 03     exit #2, 0036=0007
F088: 000014 nop # 20          -- back in main() with nothing to do except exit again...
F08B: 21     push 1, 0038=0001
F08C: 03     exit #1, 002C=0000
F020: 20     push 0, 002E=0000  -- back in startup code with nothing to do except exit again...
F021: 03     exit #0, 0010=0010
0000: --- Stop (PC=0) ---       -- so quit.
If we stop tracing for an instant before exiting the subroutine, we can capture the stack frames for later analysis:
--- Test ---

doSomething                    0003A -- the FP for this instance of doSomething
                p I  (00038) = 000D  -- parameter value p, = (int) 13, at FP offset -2

main                           0002E -- the FP for main()
                y I  (00034) = 0003  -- local variables, type, address, and current value
                x I  (00036) = 0007

(Startup)                      00010 -- the globals pointer
              ccc C  (0002A) = 000D  -- global variables...
               tf B  (00018) = 0001
              ghi A  (0001E) = 0000  -- only first element of array is shown.
              def I  (00016) = 0005
              abc I  (00028) = 0000

At any time we can also look at a short non-disassembled trace, which is captured even when the debug mode is turned off. It's only available in the open Print file, by clicking the PreM button:
(Recent trace)
  F00A 0014 0000 FFFF -- you must look at your disassembly listing to know where this is,
  F00B 0014 0010 0000 --   and what it's trying to do: F00B is the instruction address
  F00C 0016 0005 FFFF -- 0016 is the current stack pointer, and 0005 is likely the value there
  F00D 0018 0001 FFFF --   (different instructions may show other useful info instead)
  F00E 001A 0010 FFFF
  F00F 001A FFF0 FFFF
  F010 001C 000A FFFF
  F011 001E 0000 FFFF
  F012 0020 0000 FFFF
  F013 0022 0000 FFFF
  F014 0024 0000 FFFF
  0000 0000 0000 0000 -- a line of all zeros is where SandBox stopped to look for keystrokes
  F015 0026 0000 FFFF
  F016 0028 0000 FFFF
  F017 002A 0000 FFFF
  F018 002C 0000 FFFF
  F019 002C FFFF 0002 -- line #2
  0000 0000 0000 0000
  F01C 002E 0010 0010
  F01E 002E FFFF FFFF
  F04E 0032 0000 FFFF
  F04F 0032 0010 0000
  F050 0034 0003 FFFF
  F051 0036 0000 FFFF
  0000 0000 0000 0000
  F052 0036 FFFF 0010
  0000 0000 0000 0000
  F055 0038 0007 FFFF
  F056 0038 0007 0036 -- storing 7 into address 0036
  F058 0036 FFFF 0011
  0000 0000 0000 0000
  F05B 0038 0010 002E
  F05D 003A 0006 FFFF
  F05E 0038 0016 FFFF
  F05F 0038 0005 0016 -- fetching 5 from address 0016
  F060 003A 0003 0034 -- fetching 3 from address 0034
  F062 0038 0008 FFFF -- add them
  F063 003A 0010 002E
  F065 003C 000E FFFF
  F066 003A 001E FFFF
  F067 003C 0003 0034
  F069 003E 0002 FFFF
  F06A 003C 0006 FFFF
  F06B 003E 000A FFFF
  F06C 003C 000A 0001
  F06D 003C 0000 FFFF
  F06E 003C 0000 F072
  F072 003A 0008 0024
  F073 0036 FFFF 0012
  0000 0000 0000 0000
  F076 0038 000D FFFF
  F077 003A 0010 002E
  F079 003C 001A FFFF
  F07A 003A 002A FFFF
  F07B 003A 000D 002A
  F07C 0036 FFFF 0013
  0000 0000 0000 0000
  F07F 0038 0010 002E
  F081 003A 001A FFFF
  F082 0038 002A FFFF
  F083 0038 000D 002A
  F084 003A 0010 002E
  F086 003A FFFF FFFF
  F02E 003E 0000 FFFF
  F02F 003E 002E 0000
  F030 003E FFFF 000A
  0000 0000 0000 0000
  F033 0040 0010 003A
  F035 0042 0008 FFFF
  F036 0040 0018 FFFF
  F037 0040 0001 0018
  F038 0040 0001 F03A
  F03A 003E FFFF 000A
  0000 0000 0000 0000
  F03D 0040 000D 0038
  F03F 0042 0010 003A
  F041 0044 0018 FFFF
  F042 0042 0028 FFFF
  F043 0042 000D 0028
  F044 003E FFFF 000B
  0000 0000 0000 0000
  F047 0040 0002 FFFF
  F048 0036 F088 0002
  F088 0036 FFFF 0014
  0000 0000 0000 0000
  F08B 0038 0001 FFFF
  F08C 002C F020 0001
  F020 002E 0000 FFFF
  0000 0000 0000 0000
  F021 0010 0000 0000 -- final exit
  0000 0000 0000 0000
(Memory)             -- here follows a dump of all non-zero memory...
00000  0000 0000 0000 0000 0000 0000 0000 0000  0010 0000 0010 0005 0001 FFF0 000A 0000
                                               ^^^ global variables  ^^^...
00020  0000 0000 0008 0000 000D 000D 0000 0000  DEAD BEEF DEAD BEEF DEAD BEEF DEAD BEEF
                                    the stack grew into these (DEADBEEF), then shrank
00040  DEAD BEEF DEAD 0000 0000 0000 0000 0000  0000 0000 0000 0000 0000 0000 0000 0000
                                               ^^^ unused memory ^^^
1EFE0  0000 0000 0000 0000 0000 0000 0000 0000  0000 0000 0000 0000 0000 0000 0000 00AD
       program code follows...
1F000  000A 0091 002E 004E FFF0 2002 2521 3010  2A20 2020 2020 2020 2000 0002 4000 6003
1F020  2003 646F 536F 6D65 7468 696E 670B 2002  0000 0A40 0028 1C0C C00A 0000 0A4F FF40
1F040  0038 1C0D 0000 0B22 036D 6169 6E04 2002  2320 0000 1027 5004 0000 1140 0026 1C0C
1F060  4003 1C40 002E 1C40 0322 1E2A 0711 C002  2109 0D00 0012 2D40 003A 1C0D 0000 1340
1F080  003A 1C0C 4000 6002 0000 1421 035F 5570  0320 0240 00C0 1240 0024 1D04 0C30 1019
1F0A0  11C0 0604 0C21 1C06 0D20 0300 0000 0000  0000 0000 0000 0000 0000 0000 0000 0000
In this post-mortem dump, you can see the current values of all variables. If your program crashes and you cannot easily step up to the point of crash, sometimes you can tell from the debris left behind what might have happened.  In particular, you can look at the last 200 or so instructions before it died, and you can look at what's in memory. For example, here we see that the stack grew up to address 0046 at its deepest. Fortunately, there's nothing beyond that for it to bump into, but if you are allocating many stack frames, you can look here for evidence that one of them overran its allotted space and crushed somebody else's data. Also, the global variables are still intact. In particular, you can see the 0008 still in the ghi array, at offset 6 (absolute address 0024, which would be ghi[3]). Also you might notice a duplicated 000D in 0028 and 002A; that's often a hint that something went wrong, but in this case it's just variable abc = 13, and ccc = '\r'.
 

Clock Interrupt

OK, let's look at something a little more complicated. You've seen this code:
void DoClock() {
  // This interrupt happens 8x each second, for timing purposes;
  //   we use it to maintain theTicks.
  int SemSP = 0;  // maps to ClockFrame[3]
  int prior; // holds previous Date.seconds reading, updated
  int temp;
  ClockFrame[6] = &ClockFrame[9];  // initial SP
  ClockFrame[8] = &ClockFrame[0];  // initial FP
  ClockFrame[0] = mmPEEK((&SemSP)-6);  // copy my globals ptr
  ClockFrame[1] = 0;        // halt if it returns (it shouldn't)
  SemSP = FORK(&ClockFrame[6]);  // returns twice:
  if (SemSP==0) { // returned in caller's process..
    iVector[1] = &ClockFrame[3]; // set interrupt vector
    return;}
  // otherwise returned in new process (first)..
  mmPOKE(SemSP-6,0);  // clear caller's return value
  while (true) {  // this is the interrupt handler loop..
    COCALL(&SemSP);  // wait for interrupt
    temp = Date.seconds;
    if (temp != prior) { // ticks management, got a new second..
      prior = temp;
      theTicks = prior*8;}
    else theTicks = theTicks+1;
    // YOUR CODE GOES HERE
  } } // ~DoClock
There's a certain amount of migic in my choice of where to put things -- meaning, I thought about it hard, then set things up and made sure they worked, and then tweaked them when they didn't. That's why this code was supplied for you. The hardest part to grasp is that we are building a stack frame for the interrupt service routine to use long after we are gone. The whole frame will reside inside the global array ClockFrame, which is located at address 00102. Here's what we want it to look like at the instant of the CoCall (inside the FORK line):
0114  9F2A  ClockFrame[9], this will become PC inside the FORK line; SP points here
0112  0102  ClockFrame[8], this is initially the FP, pointing to ClockFrame[0]
0110  8000  ClockFrame[7], this will become CB inside the FORK line
010E  0114  ClockFrame[6], start the SP here, then logically push 3 registers, so it points to 0114
010C  0000  ClockFrame[5], the first allocated variable (offset 10), it will be temp
010A  0000  ClockFrame[4], the second allocated variable (offset 8), it will be prior
0108  0000  ClockFrame[3], the first allocated variable (offset 6), it will be SemSP
0106  0010   (unused, it would be the saved FP, if there were one)
0104  0000   (unused, it would be the saved PC return address, if there were one)
0102  0010  ClockFrame[0], FP points here, so it needs a copy of the globals pointer
If you put more variables in your frame, then you will need to move the rest of the stuff up to accomodate them. Better to call another method with all your extra variables in it (be sure to make the frame array big enough), or else make the variables global, and leave this routine with exactly three local variables. Then you don't have to recalculate all the array indices every time you add or delete another variable.

Note that ClockFrame[6] starts off with the SP of the new process, then the CoCall exchanges it with the SP of the calling process (02AC), which we can save in our new frame's local variable SemSP and test it for =0. Later we will also use it to access the return value on the caller's stack, to force that return value to zero, so that the next CoCall returns with a different result. What happens when we step through it? Here's the trace, again annotated:

                          875:   DoClock();
A60F: 00036B nop # 875         -- we're in MiniOS(), just now calling DoClock()...
A612: 4000   ldf 029A=0010<-00294
A614: 601A   call #26 DoClock
9EC9: 20     push 0, 029E=0000
9ECA: 02     enter #0 =029E
9ECB: 20     push 0, 02A0=0000  -- three local variables, all initially zero
9ECC: 20     push 0, 02A2=0000
9ECD: 20     push 0, 02A4=0000
                          677:   ClockFrame[6] = &ClockFrame[9];  // initial SP
9ECE: 0002A5 nop # 677         -- now setup as much of that frame as we can...
9ED1: 4000   ldf 02A6=0010<-0029A
9ED3: 010104 push 260, 02A8=0104
9ED6: 1C     add 02A6=0114
9ED7: 4000   ldf 02A8=0010<-0029A
9ED9: 0100FE push 254, 02AA=00FE
9EDC: 1C     add 02A8=010E
9EDD: 0D     stx = 0114<-0010E
                          678:   ClockFrame[8] = &ClockFrame[0];  // initial FP
9EDE: 0002A6 nop # 678
9EE1: 4000   ldf 02A6=0010<-0029A
9EE3: 0100F2 push 242, 02A8=00F2
9EE6: 1C     add 02A6=0102
9EE7: 4000   ldf 02A8=0010<-0029A
9EE9: 010102 push 258, 02AA=0102
9EEC: 1C     add 02A8=0112
9EED: 0D     stx = 0102<-00112
                          679:   ClockFrame[0] = mmPEEK((&SemSP)-6); // copy my globals ptr
9EEE: 0002A7 nop # 679
9EF1: 3F     push -1, 02A6=FFFF
9EF2: 02     enter #-1 =9EF2=029A
9EF3: 26     push 6, 02A8=0006
9EF4: 1C     add 02A6=02A0
9EF5: 26     push 6, 02A8=0006
9EF6: 1D     sub 02A6=029A
9EF7: 0E     mmx (0000)
9EF8: 0C     ldx 02A6=0010<-0029A
9EF9: 4000   ldf 02A8=0010<-0029A
9EFB: 0100F2 push 242, 02AA=00F2
9EFE: 1C     add 02A8=0102
9EFF: 0D     stx = 0010<-00102
                          680:   ClockFrame[1] = 0;   // halt if it returns (it shouldn't)
9F00: 0002A8 nop # 680
9F03: 20     push 0, 02A6=0000
9F04: 4000   ldf 02A8=0010<-0029A
9F06: 0100F4 push 244, 02AA=00F4
9F09: 1C     add 02A8=0104
9F0A: 0D     stx = 0000<-00104
                          681:   SemSP = FORK(&ClockFrame[6]);  // returns twice:
9F0B: 0002A9 nop # 681         -- here's where the rubber hits the road...
9F0E: 4000   ldf 02A6=0010<-0029A
9F10: 0100FE push 254, 02A8=00FE
9F13: 1C     add 02A6=010E
9F14: 04     dupe 02A8=010E
9F15: 04     dupe 02AA=010E
9F16: 22     push 2, 02AC=0002
9F17: 1C     add 02AA=0110
9F18: 6000   call #0 02AC=9F1A  -- this pushes the address of the next instruction,
9F1A: 011F1A push 7962, 02AE=1F1A -- subtract its code offset, giving CB
9F1D: 1D     sub 02AC=8000
9F1E: 06     swap 8000/0110
9F1F: 0D     stx = 8000<-00110
9F20: 04     dupe 02AA=010E
9F21: 26     push 6, 02AC=0006
9F22: 1C     add 02AA=0114
9F23: 24     push 4, 02AC=0004
9F24: 6000   call #0 02AE=9F26  -- do it again, but not to get PC...
9F26: 1C     add 02AC=9F2A
9F27: 06     swap 9F2A/0114
9F28: 0D     stx = 9F2A<-00114  -- here we go!! Check out the memory at this point
9F29: 08     cocall 010E=010E:9F2A,0102 --9F2A is the new PC; 0102 is the new FP
9F2A: 5003   stf = 02AC<-00108  -- stop here and look at your registers and frames...
--- MiniOS ---
DoClock                        00102 -- here's the frame (FP) we just created
             temp I  (0010C) = 0000
            prior I  (0010A) = 0000
            SemSP I  (00108) = 02AC  -- yup, that's the SP we just stored

DoClock                        0029A -- here's the frame we will go back to
             temp I  (002A4) = 0000
            prior I  (002A2) = 0000
            SemSP I  (002A0) = 0000

                          682:   if (SemSP==0) { // returned in caller's process..
9F2C: 0002AA nop # 682
9F2F: 4003   ldf 010E=02AC<-00108 --02AC is actually the SP of the other thread
9F31: 11     iszero 010E=0     -- nope, it's not zero, so take the branch...
9F32: C015   bz 21 (0)
                          686:   mmPOKE(SemSP-6,0);  // clear caller's return value
9F46 0002AE line #686
9F49: 20     push 0, 010E=0000
9F4A: 4003   ldf 0110=02AC<-00108 -- SP -> PC, SP-2 -> FP, SP-4 -> CB, SP-6 -> return value
9F4C: 26     push 6, 0112=0006
9F4D: 1D     sub 0110=02A6
9F4E: 0E     mmx (0000)
9F4F: 0D     stx = 0000<-002A6  -- force it to be 0; we'll see that come up later...
                          688:     COCALL(&SemSP);  // wait for interrupt
9F50: 0002B0 nop # 688
9F53: 3F     push -1, 010E=FFFF
9F54: 02     enter #-1 =9F54=0102
9F55: 26     push 6, 0110=0006
9F56: 1C     add 010E=0108
9F57: 08     cocall 0108=02A6:9F2A,029A --9F2A is the same new PC; 029A is the old FP
9F2A: 5003   stf = 0000<-002A0  -- the SP was 02A6; now we store the 0 into this thread's SemSP
                          682:   if (SemSP==0) { // returned in caller's process..
9F2C: 0002AA nop # 682
9F2F: 4003   ldf 02A6=0000<-002A0
9F31: 11     iszero 02A6=1     -- yup, it's zero, so continue with the next line...
9F32: C015   bz 21 (1)
                          683:     iVector[1] = &ClockFrame[3]; // set interrupt vector
9F34: 0002AB nop # 683
9F37: 4000   ldf 02A6=0010<-0029A
9F39: 0100F8 push 248, 02A8=00F8
9F3C: 1C     add 02A6=0108     -- recall that ClockFrame[3] is SemSP in the other thread
9F3D: 20     push 0, 02A8=0000
9F3E: 22     push 2, 02AA=0002
9F3F: 1C     add 02A8=0002
9F40: 0D     stx = 0108<-00002  --iVector is now set, so exit to MiniOS()
                          684:     return;}
9F41: 0002AC nop # 684
9F44: 805A   br 90
9FA0: 21     push 1, 02A6=0001
9FA1: 03     exit #1, 0298=0288
                          876:   MainPIB[0] = 0; // create (fake) PIB for current process..
A616: 00036C nop # 876         -- take a look at the frames again:
--- MiniOS ---
MiniOS                         00294 -- back in MiniOS

DoClock                        00102 -- the interrupt thread only remains
             temp I  (0010C) = 0000
            prior I  (0010A) = 0000
            SemSP I  (00108) = 0112  -- its SP!

Skipping a few (unimportant for our discussion) lines, we get to where interrupts are enabled:
                          879:   Enables.datum = IntOnBit; // enables interrupts
A63A: 00036F nop # 879
A63D: 21     push 1, 029A=0001
A63E: 28     push 8, 029C=0008
A63F: 10     neg 029C=FFF8
A640: 0D     stx = 0001<-1FFF8  -- that store turns on the interrupts, and Bingo!
 --- Clock int =010C:9F58,0102  -- we pick up that thread right where we left off...
                          689:     temp = Date.seconds;
9F58: 0002B1 nop # 689
9F5B: 01FFDE push -34, 010E=FFDE
9F5E: 0C     ldx 010E=077F<-1FFDE
9F5F: 5005   stf = 077F<-0010C  -- OK, this part is not very interesting...
                          690:     if (temp != prior) { // ticks management..
                          691:       prior = temp;
                          692:       theTicks = prior*8;}
9F80: 0002B4 nop # 692
9F83: 8013   br 19
                          695:   } } // ~DoClock
9F98: 0002B7 nop # 695         -- loop back to the front of the while(true)
9F9B: BFB6   br -74
                         687:   while (true) {  // this is the interrupt handler loop..
                         688:     COCALL(&SemSP);  // wait for interrupt
9F50: 0002B0 nop # 688
9F53: 3F     push -1, 010E=FFFF
9F54: 02     enter #-1 =9F54=0102
9F55: 26     push 6, 0110=0006
9F56: 1C     add 010E=0108
9F57: 08     cocall 0108=0298:A641,0294 -- this exits to user, but another interrupt is pending
 --- Clock int =010C:9F58,0102
                          689:     temp = Date.seconds;
9F58: 0002B1 nop # 689
9F5B: 01FFDE push -34, 010E=FFDE
9F5E: 0C     ldx 010E=0785<-1FFDE
9F5F: 5005   stf = 077F<-0010C  -- skipping rapidly through this boring part again...
                          690:     if (temp != prior) { // ticks management..
                          691:       prior = temp;
                          692:       theTicks = prior*8;}
                          695:   } } // ~DoClock
9F98: 0002B7 nop # 695
9F9B: BFB6   br -74
9F53: 3F     push -1, 010E=FFFF
9F54: 02     enter #-1 =9F54=0102
9F55: 26     push 6, 0110=0006
9F56: 1C     add 010E=0108
9F57: 08     cocall 0108=0298:A641,0294 -- this time I turned off the clock interrupt,
                          881:   ;
A641: 000371 nop # 881         --   so here we are back in the user code, MiniOS()


The important thing you should be seeing here is that the frame we carefully set up in lines 677-680 is now the process block for the clock interrupt. The stack frame of the calling process was dissolved when we returned (after the second return from FORK) to MiniOS, but the newly created thread frame hangs around, like its owner process, dormant until another clock interrupt comes.
 

Semaphores

This is obviously still confusing. If you just read the Stallings text, the task will appear more difficult than it really is. Let's start with some pictures. We would like to start up with Figure B, but OSL gets us only to figure A as an initial configuration. We cannot assign to semaphores; the only way to give the take semaphore an initial count is to signal it four times. Well, we could also POKE(&take,-4) but that's not type-safe. We also need separate threads (processes) to do the fill and take operations. The MiniOS startup code sets up the main as a single process, which we can depend on, but we need a second process for the other thread. Use the given sample code to fork off a second thread, which we will use for the take process. This was renamed as TakenShow and it becomes the take process in Figure B:
One more thing, not shown here, is that we cannot have two current active processes with only one CPU, so TakenShow moved the main thread to the ReadyList (as shown in Figure C). The first thing the new TakenShow thread does with its new existence is wait on the fill semaphore, which is empty, so it blocks, as in Figure C. Actually causing it to block is the responsibility of the OS, which you are writing. It happens in the Wait interrupt handler. I did signal and wait as a unified interrupt handler, but you could implement them separately if you wish. All wait does is call my (slightly enhanced) SwapOut function to move the current process to some designated list (in this case the semaphore being waited on), then make the top ready list process current. That would be the main process, now recognized as the fill process (Figure D). SwapOut's job is fairly simple: stash the only unsaved register (SP) in the current process PIB, then link the current process into the semaphore list, and make the next ReadyList process current by unlinking it from the ReadyList and extracting its SP to be restored when the calling interrupt exits (returned as the function value).

The newly reactivated fill process immediately goes into its own infinite loop, waiting on the take semaphore. However, unlike the fill semaphore, this one has been pre-signalled four counts, so it just takes one count and continues. After it recovers from this (rather painless) wait, it knows there is at least one vacant slot in the circular queue to store data in, so it gets an input character 'x' to store there, shown also in Figure D:

Now that the data is inserted into the queue, the fill process can safely signal the fill semaphore to let the take process know it can start to take data out, as in Figure E. Because there is a process waiting on that semaphore, the OS is again called on to do the job, by a Signal interrupt. The Signal interrupt has less work to do than Wait. The process it is moving from the semaphore being signalled is already paused, so all it needs to do is unlink it from the semaphore and link it into the ReadyList, as in Figure E.

The Fill process is still executing as current process. It has finished its main loop, so it comes around again and waits on the take semaphore, which still has unused counts. It takes another, and inserts another character into the queue., and signals the fill semaphore again, but nobody is waiting on it any more, as shown in Figure F:

The fill process continues to fill up the queue and signal the fill semaphore for two more characters, until the take semaphore reaches 0 counts, signifying that the queue is full. At this point, the fill pointer fx has travelled all the way around the queue and collided with the take pointer tx, as in Figure G. For the fill process to add another character to the queue, it must wait on the take semaphore again, but this time it blocks, as in Figure H. The wait interrupt service handler links the fill process into the take semaphore, then unlinks the (only) process in the ReadyList and makes it current. Thus the take process can now take the four queued characters and output them before it blocks again on an empty queue (zero count in the fill semaphore), which is where we came in.
Of course the positive numbers in the semaphores never really exist, they are merely the implied count, which is the number of waiting processes it is linked to. Note also that the fill semaphore never actually reached -4, because the first signal was taken immediately by the waiting process. Instead, the difference between the fill and take semaphores is generally logically 4, the buffer size (except temporarily, during transition).
 

First Draft: 2004 March 10