The intended audience for this book is students currently taking CMSC414 — Computer and Network Security. This is not intended as a comprehensive guide to assembly, but rather as a simple reference to a handful of instructions, for both AMD64 (also known as x86_64) and AArch64 (also known as arm64). We will take a C-focused approach, showing simple C statements, the assembly they produce (using gcc) on both architectures, and a discussion of what those assembly instructions are doing. We are more interested in the assembly instructions than how they are generated, so the corresponding C code is more to guide our understanding of the assembly than to demonstrate the workings of the compiler.
Additionally, we will present summary tables to serve as a quick reference, including basic operations to/from assembly, variations on operations (such as multiplying a register by a literal vs. another register), and how the assembly in a .s file compares to what is displayed by gdb.
Our operations will focus on integers, and will use the following set of variables:
int ia, ib, ic;
unsigned int uia, uib, uic;
char ca, cb, cc;
unsigned char uca, ucb, ucc;
long la, lb, lc;
unsigned long ula, ulb, ulc;
We’ll be generating the assembly using
gcc -w -ggdb -S -o example.s example.c
and the binary using
gcc -w -ggdb -o example example.s
Please note that these commands might produce different assembly for you, depending on compiler version, operating system, and possibly other things. Again, the mapping of C to assembly is intended to help us understand what the assembly instructions are doing, not how the compiler generates assembly from C.
In gdb, we’ll generate assembly dumps with corresponding source code using
disassemble/s main
since all of our code is in main().
Our variables appear in the following locations on the stack:
Variable | AMD64 Stack Location | AArch64 Stack Location |
---|---|---|
ia | rbp-8 | r11-8 |
ib | rbp-12 | r11-28 |
ic | rbp-16 | r11-48 |
uia | rbp-20 | r11-12 |
uib | rbp-24 | r11-32 |
uic | rbp-28 | r11-52 |
ca | rbp-29 | r11-13 |
cb | rbp-30 | r11-33 |
cc | rbp-31 | r11-53 |
uca | rbp-32 | r11-14 |
ucb | rbp-33 | r11-34 |
ucc | rbp-34 | r11-54 |
la | rbp-48 | r11-20 |
lb | rbp-56 | r11-40 |
lc | rbp-64 | r11-60 |
ula | rbp-72 | r11-24 |
ulb | rbp-80 | r11-44 |
ulc | rbp-88 | r11-64 |
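The spacing in the AMD64 column reflects the size of each type on a typical x86-64 Linux build, where int takes 4 bytes, char takes 1, and long takes 8. As a minimal sketch (slot_gap is a hypothetical helper for illustration, not something from example.c), subtracting two offsets from the table gives the distance between neighboring variables:

```c
/* Hypothetical helper: distance in bytes between two stack slots, each given
   as its signed offset from the frame pointer (for example, -8 for rbp-8).
   The slot closer to the frame pointer is passed first. */
long slot_gap(long higher_offset, long lower_offset)
{
    return higher_offset - lower_offset;
}
```

For example, slot_gap(-8, -12) is 4 (ia to ib), slot_gap(-29, -30) is 1 (ca to cb), and slot_gap(-48, -56) is 8 (la to lb).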
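AMD64 has sixteen general-purpose registers, of which the four traditional ones are A, B, C, and D. Each of these can be used as a 64-bit (quad-length), 32-bit (word), 16-bit (half-word), or 8-bit (byte) value. (Intel's and AMD's manuals use different size names: there, a word is 16 bits and a doubleword is 32 bits.) These different sizes are named as follows: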
- RAX: quad
- EAX: word (lowest 4 bytes of RAX)
- AX: half-word (lowest 2 bytes of RAX)
- AL: lowest byte of RAX
- AH: second-lowest byte of RAX
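To make the relationship between these names concrete, here is a minimal sketch in C that models each sub-register as a mask or shift applied to a 64-bit value. The view_* functions are hypothetical helpers written for this illustration; they are not generated by gcc and do not appear in example.c.

```c
#include <stdint.h>

/* Hypothetical helpers: each returns the bits that the named sub-register
   would expose when RAX holds the given 64-bit value. */

/* EAX: lowest 4 bytes of RAX */
uint32_t view_eax(uint64_t rax) { return (uint32_t)(rax & 0xFFFFFFFFu); }

/* AX: lowest 2 bytes of RAX */
uint16_t view_ax(uint64_t rax) { return (uint16_t)(rax & 0xFFFFu); }

/* AL: lowest byte of RAX */
uint8_t view_al(uint64_t rax) { return (uint8_t)(rax & 0xFFu); }

/* AH: second-lowest byte of RAX (bits 8 through 15) */
uint8_t view_ah(uint64_t rax) { return (uint8_t)((rax >> 8) & 0xFFu); }
```

If RAX held 0x1122334455667788, these helpers would give 0x55667788 for EAX, 0x7788 for AX, 0x88 for AL, and 0x77 for AH.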
RSP and RBP are similar, without the AH equivalent:
- RSP
- ESP
- SP
- SPL

RIP is similar, except without the single-byte value.
AArch64 has 31 general-purpose registers: R0–R30 or X0–X30 (these are equivalent). These are quad-length (64-bit) values, the lower halves (1 word, 32 bits) being accessible as W0–W30. Generally, you would use R0–R7 (arguments) or R9–R15 (temporary values). You should not use R29 or R30 (the frame pointer and link register, respectively), and you should try to avoid using the other registers.
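The X/W relationship can be sketched in C in the same spirit. The helpers below are hypothetical and only model the register behavior: reading Wn yields the low 32 bits of Xn, and writing Wn zero-extends the new value so that the upper 32 bits of Xn become zero.

```c
#include <stdint.h>

/* Hypothetical helper: the value seen through Wn when Xn holds x. */
uint32_t view_w(uint64_t x)
{
    return (uint32_t)x; /* keep only the low 32 bits */
}

/* Hypothetical helper: the value Xn holds after w_new is written to Wn.
   The old contents of Xn are discarded because the write zero-extends. */
uint64_t write_w(uint64_t x_old, uint32_t w_new)
{
    (void)x_old; /* upper half is not preserved */
    return (uint64_t)w_new;
}
```

With X0 holding 0x1122334455667788, view_w gives 0x55667788, and write_w with a new value of 0xAABBCCDD leaves X0 holding 0x00000000AABBCCDD.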