Instruction Format

Introduction

The MIPS R2000/R3000 ISA has fixed-width 32 bit instructions. Fixed-width instructions are common for RISC processors because they make it easy to fetch instructions without having to decode. These instructions must be stored at word-aligned addresses (i.e., addresses divisible by 4).

The MIPS ISA instructions fall into three categories: R-type, I-type, and J-type. Not all ISAs divide their instructions this neatly. This is one reason to study MIPS as a first assembly language. The format is simple.

R-type

R-type instructions refer to register type instructions. Of the three formats, the R-type is the most complex.

This is the format of the R-type instruction, when it is encoded in machine code.

B31-26 B25-21 B20-16 B15-11 B10-6 B5-0
  opcode   register s register t register d shift amount function

The prototypical R-type instruction is:

add $rd, $rs, $rt
where $rd refers to some register d (d is shown as a variable, however, to use the instruction, you must put a number between 0 and 31, inclusive for d). $rs, $rt are also registers.

The semantics of the instruction are;

R[d] = R[s] + R[t]
where the addition is signed addition.

You will notice that the order of the registers in the instruction is the destination register ($rd), followed by the two source registers ($rs and $rt).

However, the actual binary format (shown in the table above) stores the two source registers first, then the destination register. Thus, how the assembly language programmer uses the instruction, and how the instruction is stored in binary, do not always have to match.

Let's explain each of the fields of the R-type instruction.

I-type instructions

I-type is short for "immediate type". The format of an I-type instuction looks like:

B31-26 B25-21 B20-16 B15-0
  opcode   register s register t               immediate              

The prototypical I-type instruction looks like:

add $rt, $rs, immed
In this case, $rt is the destination register, and $rs is the only source register. It is unusual that $rd is not used, and that $rd does not appear in bit positions B25-21 for both R-type and I-type instructions. Presumably, the designers of the MIPS ISA had their reasons for not making the destination register at a particular location for R-type and I-type.

The semantics of the addi instruction are;

R[t] = R[s] + (IR15)16 IR15-0
where IR refers to the instruction register, the register where the current instruction is stored. (IR15)16 means that bit B15 of the instruction register (which is the sign bit of the immediate value) is repeated 16 times. This is then followed by IR15-0, which is the 16 bits of the immediate value.

Basically, the semantics says to sign-extend the immediate value to 32 bits, add it (using signed addition) to register R[s], and store the result in register $rt.

J-type instructions

J-type is short for "jump type". The format of an J-type instuction looks like:

B31-26 B25-0
  opcode                        target                     

The prototypical I-type instruction looks like:

j target
The semantics of the j instruction (j means jump) are:
PC <- PC31-28 IR25-0 00
where PC is the program counter, which stores the current address of the instruction being executed. You update the PC by using the upper 4 bits of the program counter, followed by the 26 bits of the target (which is the lower 26 bits of the instruction register), followed by two 0's, which creates a 32 bit address. The jump instruction will be explained in more detail in a future set of notes.

Why Five Bits?

If you look at the R-type and I-type instructions, you will see 5 bits reserved for each register. You might wonder why.

MIPS supports 32 integer registers. To specify each register, the register are identified with a number from 0 to 31. It takes log2 32 = 5 bits to specify one of 32 registers.

If MIPS has 64 register, you would need 6 bits to specify the register.

The register number is specified using unsigned binary. Thus, 00000 refers to $r0 and 11111 refers to register $r31.

Why Study Instruction Formats

You might wonder why it's important to study instruction formats. They seem to be arbitrarily constructed. Yet, they aren't. For example, it's quite useful to have the opcode be the same size and the same location. It's useful to know the exact bits used for the immediate value. This makes decoding much quicker, and the hardware to handle instruction decoding that much simpler.

Furthermore, you begin to realize what information the instructions store. For example, it's not all that obvious that immediate values are stored as part of the instruction for I-type instructions.

If you know that, for example, addi does signed addition, then you can also conclude that the immediate value is represented in 2C. Also, to add the immediate value to a 32-bit register value would mean sign-extending the immediate value to 32 bits.

However, not all I-type instructions encode the 16 bit immediate in 2C. For example, addiu (add immediate unsigned) interprets the 16 bits as UB. It zero-extends the immediate and then adds it to the value stored in a 32 bit register.

Three Operand Instructions

Also, notice that the R-type instructions use three operands (i.e., arguments). In earlier, pre-RISC ISAs, memory was expensive, so ISA designers tried to minimize the number of bits used in an instruction. This meant that there were often two, one, or no operands.

How did they manage that? Here's an example of an instruction

cisc_add $r1, $r2  # R[1] = R[1] + R[2]

One way to reduce the total number of operands is to make one operand both a source and a destination register.

Another approach is to use an implicit register.

acc_add $r2  # Acc = Acc + R[2]
For example, there may be a special register called the accumulator. This register is not mentioned explicitly in the instruction. Instead, it is implied by the opcode.

Early personal computers such as the Apple 2, used ISAs with 1 or 2 registers, and those registers were often part of most instructions, thus they didn't have to be specified.

With memory becoming cheaper, and memory access becoming cheaper, it's become easier to devote more bits to an instruction, and to specify three operands instead of two. This makes it more convenient for the assembly language programmer.

Summary

MIPS instructions fall into three categories: R-type, I-type, and J-type. You should know how the bits are laid out (i.e., what the 6 parts of the R-type instruction are, and how many bits in each of the 6 parts). However, it's unnecessary to memorize opcodes.