Introduction

This book is intended for students currently taking CMSC414: Computer and Network Security. It is not a comprehensive guide to assembly, but a simple reference to a handful of instructions for both AMD64 (also known as x86_64) and AArch64 (also known as arm64). We take a C-focused approach: we show simple C statements, the assembly that gcc produces for them on both architectures, and a discussion of what those assembly instructions are doing. We are more interested in the assembly instructions than in how they are generated, so the corresponding C code is there to guide our understanding of the assembly rather than to demonstrate the workings of the compiler.

Additionally, we will present summary tables to serve as a quick reference. These cover the mapping between basic C operations and assembly, variations on operations (such as multiplying a register by a literal vs. another register), and how the assembly in a .s file compares to what is displayed by gdb.

Our operations will focus on integers, and will use the following set of variables:

  int ia, ib, ic;
  unsigned int uia, uib, uic;
  char ca, cb, cc;
  unsigned char uca, ucb, ucc;
  long la, lb, lc;
  unsigned long ula, ulb, ulc;

We’ll be generating the assembly using

  gcc -w -ggdb -S -o example.s example.c

and the binary using

  gcc -w -ggdb -o example example.s

Please note that these commands might produce different assembly for you, depending on compiler version, operating system, and other factors. Again, the mapping of C to assembly is intended to help us understand what the assembly instructions are doing, not how the compiler generates assembly from C.

In gdb, we’ll generate assembly dumps with corresponding source code using

  disassemble/s main

since all of our code is in main().

Our variables appear in the following locations on the stack:

Variable	AMD64 Stack Location	AArch64 Stack Location
`ia`	`rbp-8`	`r11-8`
`ib`	`rbp-12`	`r11-28`
`ic`	`rbp-16`	`r11-48`
`uia`	`rbp-20`	`r11-12`
`uib`	`rbp-24`	`r11-32`
`uic`	`rbp-28`	`r11-52`
`ca`	`rbp-29`	`r11-13`
`cb`	`rbp-30`	`r11-33`
`cc`	`rbp-31`	`r11-53`
`uca`	`rbp-32`	`r11-14`
`ucb`	`rbp-33`	`r11-34`
`ucc`	`rbp-34`	`r11-54`
`la`	`rbp-48`	`r11-20`
`lb`	`rbp-56`	`r11-40`
`lc`	`rbp-64`	`r11-60`
`ula`	`rbp-72`	`r11-24`
`ulb`	`rbp-80`	`r11-44`
`ulc`	`rbp-88`	`r11-64`

AMD64 Registers

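AMD64 has 16 general-purpose registers; the four described here are A, B, C, and D. Each of these can be accessed as a 64-bit (quad-length), 32-bit (word), 16-bit (half-word), or 8-bit (byte) value. (We use "word" to mean 32 bits; the Intel and AMD manuals instead call 16 bits a word and 32 bits a doubleword.) Using A as the example, the different sizes are named as follows: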

RAX — quad
EAX — word (lowest 4 bytes of RAX)
AX — half-word (lowest 2 bytes of RAX)
AL — lowest byte of RAX
AH — second-lowest byte of RAX

RSP and RBP follow the same pattern, but have no AH equivalent. For RSP, the names are:

RSP
ESP
SP
SPL

RIP is similar, except that it has no single-byte form: its names are RIP, EIP, and IP.

AArch64 Registers

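AArch64 has 31 general-purpose registers, R0–R30. In assembly they are written X0–X30 when used as quad-length (64-bit) values, and W0–W30 when only the lower half (1 word, 32 bits) is used. Generally, you would use R0–R7 (arguments) or R9–R15 (temporary values). You should not use R29 (the frame pointer) or R30 (the link register, which holds the return address). Try to avoid the remaining registers as well: R19–R28 must be preserved across function calls, and R8 and R16–R18 have special roles.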