Datapath: performance

The average instruction time could be lower if we didn't have to use the load instruction to determine the cycle time.

But that's not actually the longest instruction time! What about multiply and divide, or even floating-point operations?

These operations need separate, slower ALUs.

Assume: 8 ns for floating-point add, 16 ns for floating-point multiply.

(all times in ns)

Instruction    Inst  Reg   ALU  Data  Reg    Total  Distribution  Weighted
type           mem   read  op   mem   write                       time
R-type          2     1     2    0     1       6        27%         1.62
load            2     1     2    2     1       8        31%         2.48
store           2     1     2    2     0       7        21%         1.47
branch          2     1     2    0     0       5         5%         0.25
jump            2     0     0    0     0       2         2%         0.04
FP add          2     1     8    0     1      12         7%         0.84
FP multiply     2     1    16    0     1      20         7%         1.40
                                                      Average:      8.10

The average is 8.1 ns, which is only 8.1/20 ≈ 41% of the 20 ns single-cycle time (set by the slowest instruction, FP multiply). In other words, a variable clock cycle could be about 2.5 times faster!

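As a quick sanity check (not part of the original slide), the sketch below rebuilds the Total column from the per-unit times, takes the mix-weighted average, and compares it against the fixed 20 ns cycle; the variable and dictionary names are just illustrative.

```python
# Sketch: verify the weighted-average cycle time and the speedup claim.
# Per-unit times (ns): inst mem 2, reg read 1, ALU 2, data mem 2, reg write 1;
# FP add and FP multiply use 8 ns and 16 ns functional units instead of the 2 ns ALU.

# instruction type -> (total time in ns, fraction of the instruction mix)
mix = {
    "R-type":      (2 + 1 + 2 + 0 + 1,  0.27),  #  6 ns
    "load":        (2 + 1 + 2 + 2 + 1,  0.31),  #  8 ns
    "store":       (2 + 1 + 2 + 2 + 0,  0.21),  #  7 ns
    "branch":      (2 + 1 + 2 + 0 + 0,  0.05),  #  5 ns
    "jump":        (2 + 0 + 0 + 0 + 0,  0.02),  #  2 ns
    "FP add":      (2 + 1 + 8 + 0 + 1,  0.07),  # 12 ns
    "FP multiply": (2 + 1 + 16 + 0 + 1, 0.07),  # 20 ns
}

average = sum(total * frac for total, frac in mix.values())
single_cycle = max(total for total, _ in mix.values())

print(f"average variable-cycle time: {average:.2f} ns")               # 8.10 ns
print(f"single-cycle time:           {single_cycle} ns")              # 20 ns (FP multiply)
print(f"potential speedup:           {single_cycle / average:.2f}x")  # ~2.47x
```
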
A variable clock cycle is impractical, but we can break the instructions up into shorter cycles, then use only the parts needed for any given instruction.

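To make that idea concrete, here is a toy sketch (my own illustration, not a full multicycle design): each instruction type is listed with the steps it actually uses, taken straight from the nonzero columns of the table; with one short clock cycle per step, an instruction takes only as many cycles as it needs.

```python
# Toy illustration: each instruction only spends cycles on the steps it uses
# (the nonzero columns of the table above).
steps_used = {
    "R-type": ["inst mem", "reg read", "ALU", "reg write"],
    "load":   ["inst mem", "reg read", "ALU", "data mem", "reg write"],
    "store":  ["inst mem", "reg read", "ALU", "data mem"],
    "branch": ["inst mem", "reg read", "ALU"],
    "jump":   ["inst mem"],
}

for itype, steps in steps_used.items():
    # One short clock cycle per step: cycle count varies by instruction type.
    print(f"{itype:7} -> {len(steps)} short cycles ({', '.join(steps)})")
```
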