CPU (cpu‐peripheral)
The SIRC CPU is a RISC load/store CPU that is heavily influenced by early ARM and MIPS chips.
Address Bus | 24 bit |
Data Bus | 16 bit |
Instruction Size | Fixed 32 bit |
The main inspiration for SIRC is the SNES (which has a 65C816-based CPU, a 16-bit descendant of the 6502) and, to a lesser extent, the Sega Genesis/Mega Drive, which has a 68k.
The CPU should be basic (e.g. no superscalar) but it does not need to be as simple as the 6502, and can take inspiration from other CPUs available around that time (68k, ARM6, MIPS, x86, etc.).
It will be strictly 16 bit. One exception is that memory addresses can be 24 bit, synthesised from two 16 bit registers, as a 16 bit address space is too limiting (even the SNES has 24 bit addressing).
Instructions will be strictly 32 bit to keep the instruction decoder simple.
Prioritise a design that is simple and fast to execute over complex instructions.
The advantages here are:
- Simpler CPU design (easier for me to define in Verilog)
- Allows for performance improvements in future revisions (pipelining, prefetching, etc.?)
The disadvantages:
- Poor ergonomics for developers writing straight assembler code without a compiler
- Larger binary sizes (could be an issue with the limited RAM/ROM available at the target generation)
- More instruction fetches (2x16 bit fetches for each instruction)
It will be a load/store architecture which simplifies the CPU design by restricting the ALU to only operating on registers.
Similar to MIPS, it does not have any microcode. The main CPU loop has six stages for each instruction, which means that all instructions take six clocks/cycles to execute, regardless of the operation. This could be considered wasteful for instructions that don’t need all six stages to execute (e.g. instructions without a memory access) but it makes the design simpler.
The six stages are:
- Instruction Fetch (first word)
- Instruction Fetch (second word)
- Decode Instruction and Fetch (from registers)
- Instruction Execution (and address calculation)
- Memory Access (read/write or nop)
- Write back (to registers)
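The fixed six-stage loop can be sketched as a small model. This is only an illustration: the `Stage` names and the helper functions are hypothetical, and it assumes one stage per clock with no overlap between instructions, as described above.

```python
from enum import Enum


class Stage(Enum):
    """The six stages every instruction passes through, in order."""
    FETCH_FIRST_WORD = 0
    FETCH_SECOND_WORD = 1
    DECODE_AND_REGISTER_FETCH = 2
    EXECUTE = 3
    MEMORY_ACCESS = 4
    WRITE_BACK = 5


CYCLES_PER_INSTRUCTION = len(Stage)


def cycles_for(instruction_count: int) -> int:
    """Every instruction costs the same number of cycles, whatever it does."""
    return instruction_count * CYCLES_PER_INSTRUCTION


def stage_at(cycle: int) -> Stage:
    """Stage that is active on a given clock cycle (cycle 0 starts the first fetch)."""
    return Stage(cycle % CYCLES_PER_INSTRUCTION)
```

For example, `cycles_for(10)` gives the cycle cost of ten instructions, and `stage_at(7)` shows which stage the second instruction is in on the eighth clock.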
The logic was based on https://www.cs.umd.edu/users/meesh/cmsc411/website/handouts/Simple_Operation_of_MIPS.htm (Thanks!) but got a lot more complicated.
There are 16 addressable registers: r1, r2, r3, r4, r5, r6, r7, lh, ll, ah, al, sh, sl, ph, pl, sr
The registers starting with "r" are general purpose registers. The registers starting with "l", "a", "s", and "p" are address registers that can be used in pairs when 24-bit addressing is needed. The high byte of the high address register does not map to any address pins, but it can still be used to store data such as pointer tags. The high side of each address register pair is considered privileged and can only be written to in supervisor mode. The "sr" register is the status register. Its upper half is privileged and can only be written to or read from in supervisor mode.
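A minimal sketch of how a 24-bit address could be formed from an address register pair. It assumes the low byte of the high register supplies address bits 16-23 and the low register supplies bits 0-15; the function names are hypothetical and not taken from the SIRC source.

```python
def full_address(high: int, low: int) -> int:
    """Combine a 16-bit high/low register pair into a 24-bit address.

    Assumption: only the low byte of the high register reaches the address pins.
    """
    return ((high & 0x00FF) << 16) | (low & 0xFFFF)


def pointer_tag(high: int) -> int:
    """The high byte of the high register, which is free for things like tags."""
    return (high >> 8) & 0xFF
```

With `high = 0xAB12` and `low = 0x3456`, the address bus would see `0x123456`, while `0xAB` remains available as a tag.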
This pseudo grammar I just made up to get this done quickly shows how an instruction is made up of a lot of optional components (note to self, define in BNF or something)
The destination is always first, and the operands after that in order. If the third operand is omitted the first source operand is assumed to be the same as the destination.
mnemonic(|optional-condition-code){0,1} (addressing mode, ){0,3} (optional-shift-definition){0,1}
Some examples:
; Do nothing (implied instruction)
NOOP
; Return from subroutine if zero bit is set in SR (last compare was equal)
RETS|==
; Load the value in memory pointed to by address register pair "a" + 1 into r1
LOAD r1, (#1, a)
; Take the value of r2, shift it left by 2, add it to r1 and store the result back to r1
ADDR r1, r2, ASL #2
; "Compare" the value in r6 with #4 and put the result in the status register
CMPI r6, #4
; Get the value in r2, shift it right by 4, add it to the value in r3 and store it back in r1
; (only if the result of the last compare indicated that the first operand was greater than the second operand)
ADDR|>> r1, r2, r3, LSR #4
There are example projects written in SIRC ASM under the examples directory https://github.com/NoxHarmonium/sirc/tree/main/examples too.
Some mnemonics don’t map directly to an instruction, but are assembled into a specific instruction. These are called meta instructions.
For example:
RETS ; Return from subroutine
; Assembles to
LDEA p, (l) ; -- Copy the contents of the link register to the program counter
NOOP ; No operation
; Assembles to
ADDI r1, #0 ; -- Adding zero to a register does nothing (when status register update is disabled)
WAIT ; Wait for exception
; Assembles to
COP #0x1F00 ; -- Transfers control to co-processor 1 (the exception CoP) with opcode 0xF00
RETE ; Return from exception
; Assembles to
COP #0x1A00 ; -- Transfers control to co-processor 1 (the exception CoP) with opcode 0xA00
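The comments on the two COP examples suggest that the top four bits of the 16-bit operand select the co-processor and the remaining twelve bits are the co-processor opcode. A sketch of that split, assuming the layout holds in general (the function name is hypothetical):

```python
def split_cop_operand(operand: int) -> tuple[int, int]:
    """Split a 16-bit COP operand into (co-processor id, co-processor opcode).

    Assumption: bits 15-12 hold the co-processor id, bits 11-0 hold the opcode.
    """
    coprocessor_id = (operand >> 12) & 0xF
    opcode = operand & 0x0FFF
    return coprocessor_id, opcode
```

Under that assumption, `split_cop_operand(0x1F00)` yields co-processor 1 with opcode `0xF00`, matching the WAIT example.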
There are four kinds of instructions that the CPU can decode:
An instruction that does not need any arguments to execute. For example "NOOP".
An implied instruction is made up of:
- 6 bit instruction identifier
- 22 bit reserved
- 4 bit condition flags
An instruction that encodes a fixed 16-bit operand into the instruction. For example "BRAN #-3".
An immediate instruction is made up of:
- 6 bit instruction identifier (max 64 instructions)
- 4 bit register identifier
- 16 bit value
- 2 bit address register a, p or s (if any)
- 4 bit condition flags
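The immediate format adds up to exactly 32 bits (6 + 4 + 16 + 2 + 4). A sketch of packing and unpacking it, assuming the fields are laid out most significant bit first in the order listed above. That ordering is an assumption, as are the function and field names.

```python
def encode_immediate(opcode, register, value, address_register=0, condition=0):
    """Pack the fields of an immediate instruction into one 32-bit word.

    Assumption: fields are packed MSB-first in the documented order.
    """
    for name, field, width in (
        ("opcode", opcode, 6),
        ("register", register, 4),
        ("address_register", address_register, 2),
        ("condition", condition, 4),
    ):
        if not 0 <= field < (1 << width):
            raise ValueError(f"{name} does not fit in {width} bits")
    if not -0x8000 <= value <= 0xFFFF:
        raise ValueError("value does not fit in 16 bits")
    return (
        (opcode << 26)
        | (register << 22)
        | ((value & 0xFFFF) << 6)  # negative values are stored as two's complement
        | (address_register << 4)
        | condition
    )


def decode_immediate(word):
    """Unpack a 32-bit word using the same assumed layout."""
    return {
        "opcode": (word >> 26) & 0x3F,
        "register": (word >> 22) & 0xF,
        "value": (word >> 6) & 0xFFFF,
        "address_register": (word >> 4) & 0x3,
        "condition": word & 0xF,
    }
```

Encoding a value of `-3` stores `0xFFFD` in the 16-bit value field, which is how a negative displacement such as the one in "BRAN #-3" would fit.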
Similar to the immediate instruction, but the encoded fixed value can only be 8-bit to allow room for the shift information. For example "ADDI r1, #2, ASL #1".
An immediate (with shift) instruction is made up of:
- 6 bit instruction identifier (max 64 instructions)
- 4 bit register identifier
- 8 bit value
- 8 bit shift (1 bit operand, 3 bit shift type, 4 bit shift count)
- 2 bit address register a, p or s (if any)
- 4 bit condition flags
Used for operations on multiple registers. Since register IDs are only 4 bits, it can support three register operands and an optional shift. For example "ADDR r1, r2, r3, ASL #1".
A register instruction is made up of:
- 6 bit instruction identifier (max 64 instructions)
- 4 bit register identifier
- 4 bit register identifier
- 4 bit register identifier
- 8 bit shift (1 bit operand, 3 bit shift type, 4 bit shift count)
- 2 bit address register a, p or s (if any)
- 4 bit condition flags
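The register format can be described as a list of field widths, which also makes it easy to check that the fields add up to 32 bits. The sketch below assumes the fields are packed most significant bit first in the order listed, and that the three register identifiers are destination, first source and second source. Both are assumptions, and all names are hypothetical.

```python
# (field name, width in bits), most significant field first (assumed order)
REGISTER_LAYOUT = [
    ("opcode", 6),
    ("destination", 4),
    ("source_a", 4),
    ("source_b", 4),
    ("shift_operand", 1),
    ("shift_type", 3),
    ("shift_count", 4),
    ("address_register", 2),
    ("condition", 4),
]


def pack(fields, layout=REGISTER_LAYOUT):
    """Pack a dict of field values into a 32-bit word; missing fields are zero."""
    if sum(width for _, width in layout) != 32:
        raise ValueError("layout must describe exactly 32 bits")
    word = 0
    for name, width in layout:
        value = fields.get(name, 0)
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name} does not fit in {width} bits")
        word = (word << width) | value
    return word


def unpack(word, layout=REGISTER_LAYOUT):
    """Unpack a 32-bit word into a dict of field values."""
    fields = {}
    shift = 32
    for name, width in layout:
        shift -= width
        fields[name] = (word >> shift) & ((1 << width) - 1)
    return fields
```

The same two functions work for the other three formats by passing a different layout list, since each of them also totals 32 bits.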
There are eight types of addressing modes that can make up an instruction:
- Immediate
  - Immediate is a single 16-bit value (e.g. #123)
- Register Direct
  - References a specific register to store/load from (e.g. r2)
- Indirect Immediate
  - For storing/loading from a memory address. References an address register to use as a pointer to a memory address, and an immediate value to add as an offset to that address. It is specified in brackets to make it clear that it is an indirect reference to the data. (e.g. (#123, a))
- Indirect Register
  - For storing/loading from a memory address. References an address register to use as a pointer to a memory address, and another register to be used as an offset to that address. It is specified in brackets to make it clear that it is an indirect reference to the data. (e.g. (r3, a))
- Indirect Register with Post Increment
  - Same as Indirect Register but will increment the address after the load/store. Useful for operating on stacks. (e.g. (r3, s)+)
- Indirect Register with Pre Decrement
  - Same as Indirect Register but will decrement the address before the load/store. Useful for operating on stacks. (e.g. -(r3, s))
- Implied
  - Not really an addressing mode but the absence of one. Used for instructions that don’t have operands (e.g. NOOP)
- Short Immediate
  - Same as immediate but the value can only be 8-bit. This is to allow room for shift information. (e.g. #123)
All the (non-pseudo) instructions are listed below with their opcode in hexadecimal.
Supported source addressing for each instruction
Instruction | Immediate | Register Direct | Indirect Immediate | Indirect Register | Post Increment | Pre Decrement | Implied | Immediate (Short+Shift) |
---|---|---|---|---|---|---|---|---|
ADDI | 0x00 | | | | | | | |
ADCI | 0x01 | | | | | | | |
SUBI | 0x02 | | | | | | | |
SBCI | 0x03 | | | | | | | |
ANDI | 0x04 | | | | | | | |
ORRI | 0x05 | | | | | | | |
XORI | 0x06 | | | | | | | |
LOAD | 0x07 | | | | | | | |
CMPI | 0x0A | | | | | | | |
TSAI | 0x0C | | | | | | | |
TSXI | 0x0E | | | | | | | |
COPI | 0x0F | | | | | | | |
STOR | 0x10, 0x11, 0x12 | | | | | | | |
LOAD | 0x14 | 0x15 | 0x17 | | | | | |
LDEA | 0x18 | 0x19 | | | | | | |
BRAN | 0x1A | 0x1B | | | | | | |
LJSR | 0x1C | 0x1D | | | | | | |
BRSR | 0x1E | 0x1F | | | | | | |
ADDI | 0x20 | | | | | | | |
ADCI | 0x21 | | | | | | | |
SUBI | 0x22 | | | | | | | |
SBCI | 0x23 | | | | | | | |
ANDI | 0x24 | | | | | | | |
ORRI | 0x25 | | | | | | | |
XORI | 0x26 | | | | | | | |
LOAD | 0x27 | | | | | | | |
CMPI | 0x2A | | | | | | | |
TSAI | 0x2C | | | | | | | |
TSXI | 0x2E | | | | | | | |
COPI | 0x2F | | | | | | | |
ADDR | 0x30 | | | | | | | |
ADCR | 0x31 | | | | | | | |
SUBR | 0x32 | | | | | | | |
SBCR | 0x33 | | | | | | | |
ANDR | 0x34 | | | | | | | |
ORRR | 0x35 | | | | | | | |
XORR | 0x36 | | | | | | | |
LOAD | 0x37 | | | | | | | |
CMPR | 0x3A | | | | | | | |
TSAR | 0x3C | | | | | | | |
TSXR | 0x3E | | | | | | | |
COPR | 0x3F | | | | | | | |
Supported destination addressing for each instruction
Instruction | Immediate | Register Direct | Indirect Immediate | Indirect Register | Post Increment | Pre Decrement | Implied | Immediate (Short+Shift) |
---|---|---|---|---|---|---|---|---|
ADDI | 0x00 | | | | | | | |
ADCI | 0x01 | | | | | | | |
SUBI | 0x02 | | | | | | | |
SBCI | 0x03 | | | | | | | |
ANDI | 0x04 | | | | | | | |
ORRI | 0x05 | | | | | | | |
XORI | 0x06 | | | | | | | |
LOAD | 0x07 | | | | | | | |
CMPI | 0x0A | | | | | | | |
TSAI | 0x0C | | | | | | | |
TSXI | 0x0E | | | | | | | |
COPI | 0x0F | | | | | | | |
STOR | 0x10 | 0x11 | 0x13 | | | | | |
LOAD | 0x14, 0x15, 0x17 | | | | | | | |
LDEA | 0x18*, 0x19* | | | | | | | |
BRAN | 0x1A, 0x1B | | | | | | | |
LJSR | 0x1C, 0x1D | | | | | | | |
BRSR | 0x1E, 0x1F | | | | | | | |
ADDI | 0x20 | | | | | | | |
ADCI | 0x21 | | | | | | | |
SUBI | 0x22 | | | | | | | |
SBCI | 0x23 | | | | | | | |
ANDI | 0x24 | | | | | | | |
ORRI | 0x25 | | | | | | | |
XORI | 0x26 | | | | | | | |
LOAD | 0x27 | | | | | | | |
CMPI | 0x2A | | | | | | | |
TSAI | 0x2C | | | | | | | |
TSXI | 0x2E | | | | | | | |
COPI | 0x2F | | | | | | | |
ADDR | 0x30 | | | | | | | |
ADCR | 0x31 | | | | | | | |
SUBR | 0x32 | | | | | | | |
SBCR | 0x33 | | | | | | | |
ANDR | 0x34 | | | | | | | |
ORRR | 0x35 | | | | | | | |
XORR | 0x36 | | | | | | | |
LOAD | 0x37 | | | | | | | |
CMPR | 0x3A | | | | | | | |
TSAR | 0x3C | | | | | | | |
TSXR | 0x3E | | | | | | | |
COPR | 0x3F | | | | | | | |
Opcodes with a "*" directly refer to an address register pair (a, p, s, or l), rather than a single register like ah.
To try to simplify the hardware and avoid microcode, part of how each instruction functions is encoded into the instruction bit pattern. For example, all ALU instructions that have an immediate operand start with 0x0, short immediate with 0x2 and register with 0x3. As another example, ALU instructions from 0x_0-0x_7 will save the result to a destination register and ALU instructions from 0x_8-0x_F will not. This means that ANDI (AND immediate) (0x04) becomes TSAI (test AND immediate) (0x0C) by adding 0x8.
Some instructions don’t really make sense, such as a test version of ORRI (OR immediate), so they are left undocumented.
Instructions 0x08, 0x09, 0x0B, 0x0D, 0x28, 0x29, 0x2B, 0x2D, 0x38, 0x39, 0x3B, 0x3D have no mnemonic and are considered "undocumented instructions". Executing them will probably do something, but probably not something very useful. What each of those instructions does will depend on the hardware implementation and will probably change between CPU revisions.
These are the memory access and branch instructions which are a bit special since they aren’t just an operation on a register.
The instructions at 0x10-0x1F follow a pattern to (hopefully) simplify the decoder.
7 | 6 | 5 | 4 | 3/2 | 1 | 0 |
---|---|---|---|---|---|---|
? | ? | ? | Always 1 | 00 = Store, 01 = Load, 10 = Load Address, 11 = Load Address With Link | 0 = Both Registers, 1 = Only Low Register | 0 = Immediate, 1 = Register |
E.g. LDEA (LONG JUMP) Immediate would be:
- Load Address: 10
- Both Registers: 0
- Immediate: 0

which would encode as 0x18.
Whereas LJSR Immediate would be:
- Load Address with Link: 11
- Both Registers: 0
- Immediate: 0

which would encode as 0x1C.
The second bit (bit 1) is used to distinguish between operations that write to both registers in an address register pair and ones that only write to the lower register. Why do we need instructions that only write to the lower register? Because when the system mode/privileged bit is not set, updating the upper register in an address register pair is illegal. This prevents code from escaping its bank/segment and provides a crude form of memory protection.
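The bit pattern for opcodes 0x10-0x1F can be sketched as a pair of encode/decode helpers. The bit positions follow the table above; the names `Operation`, `encode_opcode` and `decode_opcode` are hypothetical.

```python
from enum import IntEnum


class Operation(IntEnum):
    """Bits 3/2 of the opcode."""
    STORE = 0b00
    LOAD = 0b01
    LOAD_ADDRESS = 0b10
    LOAD_ADDRESS_WITH_LINK = 0b11


def encode_opcode(operation: Operation, only_low_register: bool, register_operand: bool) -> int:
    """Build an opcode in 0x10-0x1F: bit 4 is always 1, bit 1 and bit 0 are flags."""
    return 0x10 | (operation << 2) | (int(only_low_register) << 1) | int(register_operand)


def decode_opcode(opcode: int) -> tuple[Operation, bool, bool]:
    """Split an opcode in 0x10-0x1F back into (operation, only low register, register operand)."""
    if not 0x10 <= opcode <= 0x1F:
        raise ValueError(f"0x{opcode:02X} is outside 0x10-0x1F")
    return Operation((opcode >> 2) & 0b11), bool(opcode & 0b10), bool(opcode & 0b01)
```

Under this scheme the branch opcodes fall out naturally: BRAN (0x1A) is a load address that only writes the low register, and BRSR (0x1E) is the same with link.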
The CPU will currently always expect a vector table at 0x0000_0000-0x0000_00FF.
The most important vector is 0x0, the reset vector, which tells the CPU where to jump to for the first instruction.
The exception vectors are listed in https://github.com/NoxHarmonium/sirc/blob/main/sirc-vm/peripheral-cpu/src/coprocessors/exception_unit/definitions.rs
Other than that, the memory layout is flexible as it can be specified in the vector table.
- Notation for status register update source