Calcutron-33: A Decimal Based RISC Microprocessor
An imaginary CPU with a decimal number system instead of a binary number system to make teaching and learning how a CPU works easier.
I have talked about the Little Man Computer (LMC) before. It is a heavily simplified CPU intended for teaching children or beginners the principles of assembly programming.
The CPU used in the LMC uses a decimal number system, which simplifies teaching. You can explain the basics of assembly coding without also introducing the binary number system.
However, LMC has a number of shortcomings. One of them is that implementing realistic multiplication and division is very hard.
The LMC also has a very CISC-like design. One can operate directly on memory cells to a large degree.
As an alternative I have come up with an imaginary RISC-like CPU I call the Calcutron-33. I have made an assembler and simulator as a Julia package called Calcutron-33.jl. This is still very much a work in progress. But here I would like to discuss some of the goals and inspiration behind it.
While obviously inspired by LMC, I have also been inspired by reading about a far more realistic CPU instruction set, the RISC-V instruction set architecture (ISA). RISC-V has a number of goals which I don’t share, such as covering all the important parts of a modern high-performance RISC CPU and designing an instruction set which can be used to make real CPUs. That is not my goal.
However RISC-V does have a number of interesting properties I have learned from. Most instructions follow a very standardized format, which I find very useful.
RISC-V also partitions the instruction set into extensions. So you can make RISC-V CPUs of different complexity and capability. It also simplifies teaching as you have a very basic but complete instruction set you can begin teaching.
We try to keep many of the ideas of LMC. So the computer memory addresses are specified with two decimal digits. In principle that gives us 100 memory locations. However we will reserve the last 10 locations for Input and Output. Thus 0–89 are valid memory locations.
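To make the memory layout concrete, here is a minimal Python sketch. The names `MEMORY_SIZE`, `IO_START`, `is_memory_address` and `is_io_address` are made up for illustration; only the numbers (100 locations, the last 10 reserved for I/O) come from the description above.

```python
# Hypothetical helper names; only the numbers 100 and 90 come from the text.
MEMORY_SIZE = 100   # two decimal digits give addresses 0..99
IO_START = 90       # the last 10 locations are reserved for input and output


def is_memory_address(addr: int) -> bool:
    """True for ordinary memory locations 0..89."""
    return 0 <= addr < IO_START


def is_io_address(addr: int) -> bool:
    """True for the reserved I/O locations 90..99."""
    return IO_START <= addr < MEMORY_SIZE


print(is_memory_address(89), is_io_address(90), is_io_address(100))
# prints: True True False
```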
We will use one digit to specify a register, which in principle gives us 10 registers. However, the first register, x0, will be treated as always zero. Hence we get registers x1 to x9 which we can actually use.
Like a RISC CPU, it has a load/store architecture. This means arithmetic operations and shifts can only be done on registers. Only the store and load instructions can access memory. These are used to move values between memory and the registers.
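A register file with an always-zero x0 could be sketched like this in Python. The class name `RegisterFile` and its methods are hypothetical, and silently discarding writes to x0 is an assumption on my part; the text only says that x0 is treated as always zero.

```python
class RegisterFile:
    """Ten registers x0..x9 where x0 always reads as zero."""

    def __init__(self) -> None:
        self._values = [0] * 10  # one slot per register digit 0..9

    def read(self, index: int) -> int:
        # x0 is hardwired to zero, whatever may have been written to it.
        return 0 if index == 0 else self._values[index]

    def write(self, index: int, value: int) -> None:
        # Assumption: writes to x0 are silently discarded.
        if index != 0:
            self._values[index] = value


regs = RegisterFile()
regs.write(0, 42)
regs.write(3, 7)
print(regs.read(0), regs.read(3))  # prints: 0 7
```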
Specification of Machine Code Format
- Every instruction is 4 decimal digits.
- The first digit is the opcode, which says what the instruction does such as add, subtract or load.
- The second digit is a register operand. Usually the destination register for whatever operation is performed.
- The last two digits will vary in meaning depending on the opcode. For arithmetic operations, they will usually be two registers, used as input. The last one may be an immediate value from 0 to 9. For branch, store and load instructions, the last two digits will be a memory address.
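The format above can be pulled apart with plain integer division. This is only a sketch in Python; the names `Instruction`, `decode` and `split_operand` are made up here and are not taken from the Calcutron-33.jl package.

```python
from typing import NamedTuple, Tuple


class Instruction(NamedTuple):
    opcode: int   # first digit: what the instruction does
    reg: int      # second digit: register operand, usually the destination
    operand: int  # last two digits: two registers, register + immediate, or an address


def decode(word: int) -> Instruction:
    """Split a 4-digit decimal instruction word into its fields."""
    if not 0 <= word <= 9999:
        raise ValueError("an instruction is exactly 4 decimal digits")
    opcode, rest = divmod(word, 1000)
    reg, operand = divmod(rest, 100)
    return Instruction(opcode, reg, operand)


def split_operand(operand: int) -> Tuple[int, int]:
    """For arithmetic instructions, split the last two digits into single digits."""
    return divmod(operand, 10)


ins = decode(1234)
print(ins.opcode, ins.reg, split_operand(ins.operand))  # prints: 1 2 (3, 4)
```

For a branch, load or store one would use `operand` directly as the two-digit address instead of splitting it.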
Assembly Instruction Set
Here is a description of the instruction set. It shows how each instruction is encoded as 4 decimal digits. E.g. for the first assembly instruction we have the encoding 1dst. That means the first digit must be a 1 for this to be an add. The letters indicate digits one is free to choose. d means the destination register rd is a single digit, so it can be from 1-9. The two source registers, rs and rt (the digits s and t), are one digit each as well.
Sometimes the instruction will set aside multiple digits for one argument, such as in the case of the load instruction. Here 8daa means the destination register rd is specified with one digit, but the address aa uses two digits.
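The two encodings mentioned so far, 1dst for add and 8daa for load, can be produced like this. It is a small Python sketch; `encode_add` and `encode_ld` are hypothetical helper names. Results are returned as strings so that all four digits stay visible.

```python
def _check(value: int, low: int, high: int, name: str) -> None:
    if not low <= value <= high:
        raise ValueError(f"{name} must be in {low}..{high}, got {value}")


def encode_add(rd: int, rs: int, rt: int) -> str:
    """Encode ADD rd, rs, rt using the pattern 1dst."""
    for name, reg in (("rd", rd), ("rs", rs), ("rt", rt)):
        _check(reg, 0, 9, name)  # every register is a single digit
    return f"1{rd}{rs}{rt}"


def encode_ld(rd: int, aa: int) -> str:
    """Encode LD rd, aa using the pattern 8daa."""
    _check(rd, 0, 9, "rd")
    _check(aa, 0, 99, "aa")  # the address takes two digits
    return f"8{rd}{aa:02d}"


print(encode_add(2, 3, 4))  # prints: 1234
print(encode_ld(1, 5))      # prints: 8105
```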
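- ADD rd, rs, rt
  1dst add the contents of register rs to rt and store in rd.
  rd ← rs + rt
- SUB rd, rs, rt
  Subtract the contents of register rt from rs and store in rd.
  rd ← rs - rt
- SUBI rd, rs, k
  3dsk subtract the immediate value k from rs and store in register rd.
  rd ← rs - k
- LSH rd, rs, k
  Left shift rs by k digits and store in rd.
- RSH rd, rs, k
  Right shift rs by k digits and store in rd.
- BRZ rd, aa
  6daa jump to address aa if rd = 0.
- BGT rd, aa
  7daa jump to address aa if rd > 0 (positive).
- LD rd, aa
  8daa load register rd with the contents of memory at address aa.
- ST rs, aa
  Store the contents of register rs in memory at address aa.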
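Since the machine is decimal, the shift instructions move whole digits rather than bits. Here is a rough Python sketch of the arithmetic for non-negative values. The function names are made up. Register width and overflow are not described here, so they are ignored, and which register receives which part of the result is deliberately not modelled.

```python
from typing import Tuple


def shift_left(value: int, k: int) -> int:
    """Shifting left by k digits multiplies by 10**k."""
    return value * 10 ** k


def shift_right(value: int, k: int) -> Tuple[int, int]:
    """Shifting right by k digits divides by 10**k.

    Returns (digits that remain, digits that were shifted out).
    """
    return divmod(value, 10 ** k)


print(shift_left(12, 1))    # prints: 120
print(shift_right(123, 1))  # prints: (12, 3)
```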
Pseudo Assembly Instructions
This is a list of instructions which are just practical variations of the ones defined above. They are not new instructions per se. E.g. INP is really just the same as the LD instruction but applied to memory location 90. MOV is a move instruction accomplished by adding 0 to the source register and storing the result in the destination register. That has the practical outcome of causing a move.
- INP rd
  8d90 load register rd with a number from input.
- OUT rs
  Write the contents of register rs to output.
- MOV rd, rs
  1d0s moves the content of register rs to rd.
- CLR rd
  1d00 clears the content of register rd.
  rd ← 0
- DEC rd
  3dd1 subtract 1 from register rd.
  rd ← rd - 1
- BRA aa
  60aa jump to location aa.
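Because the pseudo instructions are only shorthand, an assembler can rewrite them into plain 4-digit words. Here is a Python sketch of that idea; the function name `expand` is hypothetical. The digit patterns follow from the text: INP is a load from location 90, MOV and CLR are adds involving x0, DEC is 3dd1 and BRA is 60aa. OUT is left out because its encoding is not given here.

```python
def expand(mnemonic: str, *args: int) -> str:
    """Translate a pseudo instruction into its 4-digit machine word."""
    if mnemonic == "INP":       # load from memory location 90
        (rd,) = args
        return f"8{rd}90"
    if mnemonic == "MOV":       # add x0 and rs, store in rd
        rd, rs = args
        return f"1{rd}0{rs}"
    if mnemonic == "CLR":       # add x0 and x0, store in rd
        (rd,) = args
        return f"1{rd}00"
    if mnemonic == "DEC":       # subtract 1 from rd, store back in rd
        (rd,) = args
        return f"3{rd}{rd}1"
    if mnemonic == "BRA":       # branch if x0 is zero, which is always
        (aa,) = args
        return f"60{aa:02d}"
    raise ValueError(f"not a known pseudo instruction: {mnemonic}")


for line in (("INP", 2), ("MOV", 1, 2), ("CLR", 1), ("DEC", 4), ("BRA", 5)):
    print(line[0], expand(*line))
```

The branch works unconditionally because x0 always reads as zero, so the "branch if zero" test on it always succeeds.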
Here is a simple program which fetches two numbers from the input and multiplies them, writing the result back to output.
    INP x2              // first number to multiply. This will get added
    INP x3              // second number. Treated as counter
    CLR x1              // accumulator for result. Clear it out.
nextdigit:
    RSH x4, x3, 1       // Push right most digit of x3 into x4
    BRZ x4, shift       // Digit is 0, so there is nothing to add
multiply:
    ADD x1, x1, x2      // Add first input to accumulator
    DEC x4              // Decrement counter for number of additions
    BGT x4, multiply    // Repeat while x4 > 0
shift:
    LSH x2, x2, 1       // Left shift. x2 made 10x larger
    BGT x3, nextdigit   // check if all digits have been processed
    OUT x1              // Write the result to output
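The idea behind the program is ordinary long multiplication: take the second number one decimal digit at a time, add the first number that many times, then make the first number ten times larger for the next digit. The Python sketch below shows the same idea for non-negative integers. It is not a trace of the assembly. The variable names are mine, with comments pointing at the register each one roughly corresponds to, and a digit of 0 simply runs the inner loop zero times.

```python
def multiply(a: int, b: int) -> int:
    """Multiply two non-negative integers by decimal shift-and-add."""
    if a < 0 or b < 0:
        raise ValueError("this sketch only handles non-negative numbers")
    accumulator = 0   # plays the role of x1
    addend = a        # plays the role of x2
    remaining = b     # plays the role of x3
    while remaining > 0:
        # Peel off the rightmost digit (the job RSH does in the program).
        remaining, digit = divmod(remaining, 10)
        # Add the addend once per unit of the digit (the inner loop).
        for _ in range(digit):
            accumulator += addend
        # Make the addend 10x larger for the next digit (the job of LSH).
        addend *= 10
    return accumulator


print(multiply(12, 34))  # prints: 408
```

The number of additions grows with the sum of the digits of the second number rather than with its value, which is what makes this approach practical on such a small machine.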
This next story covers more code examples. The idea is to have a number of simple code examples useful to teach principles of assembly coding to beginners.
This is really a form of thinking out loud. I have not implemented this pretend CPU yet.