We are going to look at addition in five ways:
We will consider this block of code:
ic = ia + ib;
uic = uia + uib;
cc = ca + cb;
ucc = uca + ucb;
lc = la + lb;
ulc = ula + ulb;
For each line, we will treat it as c = a + b.
The equivalent AMD64 assembly is shown in the following table:
| C | .s file | gdb |
|---|---|---|
| ic = ia + ib; | | |
| | movl -8(%rbp), %eax | mov -0x8(%rbp),%eax |
| | addl -12(%rbp), %eax | add -0xc(%rbp),%eax |
| | movl %eax, -16(%rbp) | mov %eax,-0x10(%rbp) |
| uic = uia + uib; | | |
| | movl -20(%rbp), %eax | mov -0x14(%rbp),%eax |
| | addl -24(%rbp), %eax | add -0x18(%rbp),%eax |
| | movl %eax, -28(%rbp) | mov %eax,-0x1c(%rbp) |
| cc = ca + cb; | | |
| | movsbl -29(%rbp), %eax | movsbl -0x1d(%rbp),%eax |
| | movsbl -30(%rbp), %ecx | movsbl -0x1e(%rbp),%ecx |
| | addl %ecx, %eax | add %ecx,%eax |
| | movb %al, -31(%rbp) | mov %al,-0x1f(%rbp) |
| ucc = uca + ucb; | | |
| | movzbl -32(%rbp), %eax | movzbl -0x20(%rbp),%eax |
| | movzbl -33(%rbp), %ecx | movzbl -0x21(%rbp),%ecx |
| | addl %ecx, %eax | add %ecx,%eax |
| | movb %al, -34(%rbp) | mov %al,-0x22(%rbp) |
| lc = la + lb; | | |
| | movq -48(%rbp), %rax | mov -0x30(%rbp),%rax |
| | addq -56(%rbp), %rax | add -0x38(%rbp),%rax |
| | movq %rax, -64(%rbp) | mov %rax,-0x40(%rbp) |
| ulc = ula + ulb; | | |
| | movq -72(%rbp), %rax | mov -0x48(%rbp),%rax |
| | addq -80(%rbp), %rax | add -0x50(%rbp),%rax |
| | movq %rax, -88(%rbp) | mov %rax,-0x58(%rbp) |

How we perform the addition and assignment differs depending on the
types of the variables. For integers and long integers, we move the
value of a to a register
(eax for integers and rax for long integers),
add the value of b to the
register, and then move the register to c. Just as we have movl
and movq, we also have addl and
addq. In both cases, the source operand comes first and the
destination second. For addl
and addq, the second operand is both an addend and the
location where the sum is stored.
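To make the difference between addl and addq concrete, here is a minimal C sketch. The function names add32 and add64 are hypothetical, and the fixed-width types from stdint.h stand in for int and long on the assumption that they are 4 and 8 bytes wide, as in the listing above. Unsigned types are used so that wrap-around is well defined in C.

```c
#include <stdint.h>

/* 32-bit addition, the width addl operates on: the sum wraps modulo 2^32. */
uint32_t add32(uint32_t a, uint32_t b)
{
    return a + b;
}

/* 64-bit addition, the width addq operates on: the sum wraps modulo 2^64. */
uint64_t add64(uint64_t a, uint64_t b)
{
    return a + b;
}
```

With these definitions, add32(0xFFFFFFFFu, 1u) wraps to 0, while add64(0xFFFFFFFFu, 1u) yields 0x100000000 because the carry fits in the wider register.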
Things are a little more complicated for one-byte types. Here we move
the value of a to
eax and the value of b to ecx. We also
have to account for the fact that we’re
dealing with 1-byte values, while registers like eax hold
4-byte values. That’s what the instructions movsbl and
movzbl are for. Each loads a single byte into the
least-significant byte of the register and fills the remaining three
bytes. They differ in how they fill them: movsbl sign-extends the byte
(for a signed value), while
movzbl zero-extends it (for an unsigned value). We then add the two
registers together, storing the result in eax, and then use
movb to copy one byte back to the variable’s location on
the stack. Here we use al as the register, instead of
eax, because al is the least-significant byte
of eax. Similarly, cl is the
least-significant byte of ecx.
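The widening and truncation steps can be modeled in C. This is a sketch only: the function names are hypothetical, the locals eax and ecx are ordinary variables named after the registers, and int8_t and uint8_t are assumed to correspond to signed and unsigned char.

```c
#include <stdint.h>

/* Models movsbl: widen a signed byte to 32 bits, copying the sign bit upward. */
int32_t widen_signed(int8_t byte)
{
    return (int32_t)byte;
}

/* Models movzbl: widen an unsigned byte to 32 bits, filling with zeros. */
uint32_t widen_unsigned(uint8_t byte)
{
    return (uint32_t)byte;
}

/* Models ucc = uca + ucb: widen both bytes, add in 32 bits,
 * then keep only the low byte, as movb %al does. */
uint8_t add_unsigned_bytes(uint8_t uca, uint8_t ucb)
{
    uint32_t eax = widen_unsigned(uca);
    uint32_t ecx = widen_unsigned(ucb);
    eax += ecx;
    return (uint8_t)(eax & 0xFFu);
}
```

The same bit pattern 0xFF widens to -1 through widen_signed((int8_t)-1) but to 255 through widen_unsigned(0xFF), and add_unsigned_bytes(200, 100) returns 44 because only the low byte of 300 is stored.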
In gdb, now that we’re dealing with registers, movl,
movb, and movq just become mov,
and similarly addl and addq are just
add.
The equivalent ARM assembly is shown in the following table:
| C | .s file | gdb |
|---|---|---|
| ic = ia + ib; | | |
| | ldr r2, [fp, #-8] | ldr r2, [r11, #-8] |
| | ldr r3, [fp, #-28] | ldr r3, [r11, #-28] |
| | add r3, r2, r3 | add r3, r2, r3 |
| | str r3, [fp, #-48] | str r3, [r11, #-48] |
| uic = uia + uib; | | |
| | ldr r2, [fp, #-12] | ldr r2, [r11, #-12] |
| | ldr r3, [fp, #-32] | ldr r3, [r11, #-32] |
| | add r3, r2, r3 | add r3, r2, r3 |
| | str r3, [fp, #-52] | str r3, [r11, #-52] |
| cc = ca + cb; | | |
| | ldrb r2, [fp, #-13] | ldrb r2, [r11, #-13] |
| | ldrb r3, [fp, #-33] | ldrb r3, [r11, #-33] |
| | add r3, r2, r3 | add r3, r2, r3 |
| | strb r3, [fp, #-53] | strb r3, [r11, #-53] |
| ucc = uca + ucb; | | |
| | ldrb r2, [fp, #-14] | ldrb r2, [r11, #-14] |
| | ldrb r3, [fp, #-34] | ldrb r3, [r11, #-34] |
| | add r3, r2, r3 | add r3, r2, r3 |
| | strb r3, [fp, #-54] | strb r3, [r11, #-54] |
| lc = la + lb; | | |
| | ldr r2, [fp, #-20] | ldr r2, [r11, #-20] |
| | ldr r3, [fp, #-40] | ldr r3, [r11, #-40] |
| | add r3, r2, r3 | add r3, r2, r3 |
| | str r3, [fp, #-60] | str r3, [r11, #-60] |
| ulc = ula + ulb; | | |
| | ldr r2, [fp, #-24] | ldr r2, [r11, #-24] |
| | ldr r3, [fp, #-44] | ldr r3, [r11, #-44] |
| | add r3, r2, r3 | add r3, r2, r3 |
| | str r3, [fp, #-64] | str r3, [r11, #-64] |

In all cases, we first load a and b into registers r2 and
r3 with ldr (or ldrb for a
one-byte variable), then add the registers, storing the sum in
r3, and finally store the value of r3 into
c. The add
instruction takes the result register as its first argument, followed by
the addends. This is in contrast to AMD64, where the result is always
stored in the register that holds one of the addends. Aside from
fp versus r11, there is no difference between
how this is written in an assembly code file and how gdb displays
it.
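The contrast between the two instruction forms can be sketched in C. The function names are hypothetical, the locals are ordinary variables named after the registers in the listings, and the comments give the instruction shapes schematically rather than exact operands.

```c
/* AMD64 style (add src, dst): the destination register is also an addend. */
int add_two_operand(int a, int b)
{
    int eax = a;    /* move a into the register */
    eax += b;       /* add b into the same register */
    return eax;     /* the register is then stored to c */
}

/* ARM style (add rd, rn, rm): the result register is named separately. */
int add_three_operand(int a, int b)
{
    int r2 = a;     /* load a */
    int r3 = b;     /* load b */
    r3 = r2 + r3;   /* add r3, r2, r3 */
    return r3;      /* r3 is then stored to c */
}
```

Both functions compute the same sum. The only difference is whether the instruction names a separate result register, which in the ARM listing happens to be r3 again.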
We will consider this block of code:
ic = ia + 2;
uic = uia + 2;
cc = ca + 2;
ucc = uca + 2;
lc = la + 2;
ulc = ula + 2;
The equivalent AMD64 assembly is shown in the following table:
| C | .s file | gdb |
|---|---|---|
| ic = ia + 2; | | |
| | movl -8(%rbp), %eax | mov -0x8(%rbp),%eax |
| | addl $2, %eax | add $0x2,%eax |
| | movl %eax, -16(%rbp) | mov %eax,-0x10(%rbp) |
| uic = uia + 2; | | |
| | movl -20(%rbp), %eax | mov -0x14(%rbp),%eax |
| | addl $2, %eax | add $0x2,%eax |
| | movl %eax, -28(%rbp) | mov %eax,-0x1c(%rbp) |
| cc = ca + 2; | | |
| | movsbl -29(%rbp), %eax | movsbl -0x1d(%rbp),%eax |
| | addl $2, %eax | add $0x2,%eax |
| | movb %al, -31(%rbp) | mov %al,-0x1f(%rbp) |
| ucc = uca + 2; | | |
| | movzbl -32(%rbp), %eax | movzbl -0x20(%rbp),%eax |
| | addl $2, %eax | add $0x2,%eax |
| | movb %al, -34(%rbp) | mov %al,-0x22(%rbp) |
| lc = la + 2; | | |
| | movq -48(%rbp), %rax | mov -0x30(%rbp),%rax |
| | addq $2, %rax | add $0x2,%rax |
| | movq %rax, -64(%rbp) | mov %rax,-0x40(%rbp) |
| ulc = ula + 2; | | |
| | movq -72(%rbp), %rax | mov -0x48(%rbp),%rax |
| | addq $2, %rax | add $0x2,%rax |
| | movq %rax, -88(%rbp) | mov %rax,-0x58(%rbp) |

This is very similar to what we saw when adding two variables, though
we can see that the literal value is the first argument to
add/addl/addq. The order in which
we add the values (ia + 2 or 2 + ia) makes no
difference in the generated assembly.
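One way to check the operand-order claim yourself is to compile two functions that differ only in where the literal appears and compare the assembly the compiler emits. The function names below are hypothetical.

```c
/* Literal on the right, as in the statements above. */
int add_literal_right(int ia)
{
    return ia + 2;
}

/* Literal on the left; integer addition is commutative,
 * so the value returned is the same. */
int add_literal_left(int ia)
{
    return 2 + ia;
}
```

Compiling this file with gcc -S or clang -S produces a .s file in which the two function bodies can be compared side by side.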
The equivalent ARM assembly is shown in the following table:
| C | .s file | gdb |
|---|---|---|
| ic = ia + 2; | | |
| | ldr r3, [fp, #-8] | ldr r3, [r11, #-8] |
| | add r3, r3, #2 | add r3, r3, #2 |
| | str r3, [fp, #-48] | str r3, [r11, #-48] |
| uic = uia + 2; | | |
| | ldr r3, [fp, #-12] | ldr r3, [r11, #-12] |
| | add r3, r3, #2 | add r3, r3, #2 |
| | str r3, [fp, #-52] | str r3, [r11, #-52] |
| cc = ca + 2; | | |
| | ldrb r3, [fp, #-13] | ldrb r3, [r11, #-13] |
| | add r3, r3, #2 | add r3, r3, #2 |
| | strb r3, [fp, #-53] | strb r3, [r11, #-53] |
| ucc = uca + 2; | | |
| | ldrb r3, [fp, #-14] | ldrb r3, [r11, #-14] |
| | add r3, r3, #2 | add r3, r3, #2 |
| | strb r3, [fp, #-54] | strb r3, [r11, #-54] |
| lc = la + 2; | | |
| | ldr r3, [fp, #-20] | ldr r3, [r11, #-20] |
| | add r3, r3, #2 | add r3, r3, #2 |
| | str r3, [fp, #-60] | str r3, [r11, #-60] |
| ulc = ula + 2; | | |
| | ldr r3, [fp, #-24] | ldr r3, [r11, #-24] |
| | add r3, r3, #2 | add r3, r3, #2 |
| | str r3, [fp, #-64] | str r3, [r11, #-64] |

This is also extremely similar to the previous example of adding two
variables, except that we do not have a second load instruction, since
there is only one variable, and the add instruction accepts
literal addends. The literal value is always the second addend,
regardless of the order in which they are specified in C.
Having established patterns between different integer types, we will begin limiting ourselves to a single statement or block, unless something interesting is illustrated by using multiple types. For additive assignment, we will only consider this statement:
ia += ib;
The version involving a literal (ia += 2) is identical
to adding a variable and a literal, except for where the result is stored, so
we will not include any details here.
The equivalent AMD64 assembly is shown in the following table:
| C | .s file | gdb |
|---|---|---|
| ia += ib; | | |
| | movl -12(%rbp), %eax | mov -0xc(%rbp),%eax |
| | addl -8(%rbp), %eax | add -0x8(%rbp),%eax |
| | movl %eax, -8(%rbp) | mov %eax,-0x8(%rbp) |

This follows the pattern we saw in the assembly for ic = ia + ib.
Because ia
(-8(%rbp)) is both an addend and the destination, the
compiler moves the value of ib into the register
eax first and then adds ia. Using the same first two
instructions as in ic = ia + ib (move ia, then add ib) would
give the same result, though we might
gain some cache speed-up by making the accesses of -8(%rbp)
sequential here.
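A small C model of the two possible instruction orders shows why the choice does not affect the result. The function names are hypothetical and eax is an ordinary local standing in for the register.

```c
/* Order used in the listing for ia += ib: move ib, add ia, store to ia. */
int add_assign_model(int ia, int ib)
{
    int eax = ib;   /* move ib into the register */
    eax += ia;      /* add ia to the register */
    ia = eax;       /* store the register back to ia */
    return ia;
}

/* Order used for ic = ia + ib: move ia, add ib, store to ia. */
int add_assign_alt(int ia, int ib)
{
    int eax = ia;   /* move ia into the register */
    eax += ib;      /* add ib to the register */
    ia = eax;       /* store the register back to ia */
    return ia;
}
```

Both functions return the same value for any pair of inputs whose sum does not overflow, since integer addition is commutative.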
The equivalent ARM assembly is shown in the following table:
| C | .s file | gdb |
|---|---|---|
| ia += ib; | | |
| | ldr r2, [fp, #-8] | ldr r2, [r11, #-8] |
| | ldr r3, [fp, #-28] | ldr r3, [r11, #-28] |
| | add r3, r2, r3 | add r3, r2, r3 |
| | str r3, [fp, #-8] | str r3, [r11, #-8] |

Aside from the store instruction, this is identical to
ic = ia + ib.
For both
ic = ia - ib;
and
ic = ia - 2;
the assembly for both AMD64 and ARM is identical to that for addition, with
add replaced by sub. This includes cases where add is a
substring of the mnemonic (e.g., addl becomes subl).
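Unlike addition, subtraction is not commutative, so operand order matters. In AT&T syntax, sub src, dst computes dst - src, and the ARM form sub rd, rn, rm computes rn - rm. A minimal sketch, with hypothetical function names and locals named after the registers:

```c
/* AMD64 style: move ia into the register, then subtract ib from it. */
int sub_two_operand(int ia, int ib)
{
    int eax = ia;   /* the minuend goes into the register first */
    eax -= ib;      /* sub src, dst: dst = dst - src */
    return eax;
}

/* ARM style: load both operands, then subtract into a named result. */
int sub_three_operand(int ia, int ib)
{
    int r2 = ia;    /* load ia */
    int r3 = ib;    /* load ib */
    r3 = r2 - r3;   /* sub r3, r2, r3: r3 = r2 - r3 */
    return r3;
}
```

Swapping the arguments changes the sign of the result: sub_two_operand(10, 3) is 7, while sub_two_operand(3, 10) is -7.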