Datapath: performance
Average instruction time could be lower if the load instruction (the slowest so far)
didn't have to determine the cycle time
But that's not actually the longest instruction time!
What about multiply and divide, or even floating-point operations?
Need separate, slower ALUs
Assume: 8 ns for floating-point add, 16 ns for floating-point multiply
All times in ns:

Instruction   Inst   Reg    ALU   Data   Reg     Total   Distribution   Weighted
type          mem    read   op    mem    write                          time
R-type        2      1      2     0      1       6       27%            1.62
load          2      1      2     2      1       8       31%            2.48
store         2      1      2     2      0       7       21%            1.47
branch        2      1      2     0      0       5        5%            0.25
jump          2      0      0     0      0       2        2%            0.04
FP add        2      1      8     0      1       12       7%            0.84
FP multiply   2      1      16    0      1       20       7%            1.40
Average                                                                 8.10
Average is 8.1 ns, which is only 8.1/20 = 41% of the 20 ns single-cycle time
In other words, a variable-length clock cycle could be about 2.5 times faster (20/8.1 = 2.47)
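
The arithmetic above can be checked with a short script. This is only a sketch: the stage latencies and instruction mix are copied from the table, while the names `INSTRUCTIONS`, `single_cycle_time`, and `average_time` are made up for illustration.

```python
# Stage latencies in ns, in the order
# (inst mem, reg read, ALU op, data mem, reg write),
# followed by the fraction of executed instructions of that type.
# Values are taken from the table; the names are illustrative only.
INSTRUCTIONS = {
    "R-type":      ((2, 1, 2, 0, 1), 0.27),
    "load":        ((2, 1, 2, 2, 1), 0.31),
    "store":       ((2, 1, 2, 2, 0), 0.21),
    "branch":      ((2, 1, 2, 0, 0), 0.05),
    "jump":        ((2, 0, 0, 0, 0), 0.02),
    "FP add":      ((2, 1, 8, 0, 1), 0.07),
    "FP multiply": ((2, 1, 16, 0, 1), 0.07),
}


def single_cycle_time(table):
    """Single-cycle clock period: the slowest instruction sets it."""
    return max(sum(stages) for stages, _ in table.values())


def average_time(table):
    """Average time if each instruction took only as long as it needs."""
    return sum(sum(stages) * fraction for stages, fraction in table.values())


cycle = single_cycle_time(INSTRUCTIONS)   # slowest instruction (FP multiply)
average = average_time(INSTRUCTIONS)      # mix-weighted average
speedup = cycle / average                 # potential gain of a variable cycle

print(f"single-cycle time: {cycle} ns")
print(f"average time:      {average:.2f} ns")
print(f"potential speedup: {speedup:.2f}x")
```

Changing the mix (for example, dropping the floating-point instructions and renormalizing) shows how strongly the result depends on a few slow instructions.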
A variable clock cycle is impractical, but we can break each instruction into
several shorter cycles, then use only the steps that a given instruction needs