CMSC216 Lab07: Assembly Globals and Stack Space
- Due: 11:59pm Sun 29-Mar-2026 on Gradescope
- Approximately 1.00% of total grade
CODE DISTRIBUTION: lab07-code.zip
CHANGELOG: Empty
1 Rationale
Modern compilers create programs that will run independent of the memory location into which they are placed by the operating system. This is a boon to security, but requires some special techniques at the assembly level to access global variables. This lab demonstrates "RIP-relative" addressing for globals (known more widely as Program Counter Relative / PC-Relative addressing) and the syntax used to access global variables.
Function calls are an important abstraction in any computing
environment. At the architecture/assembly level, function calls often
involve some setup such as placing arguments in certain
registers. Functions that require local variables in main memory must
manipulate the stack pointer, %rsp in x86-64, to "create" such space
and then track offsets from the stack pointer at which various local
variables reside. These two phenomena are intertwined: calling a
function always means aligning the %rsp to a 16-byte boundary and
passing the main memory address of a local variable to another
function often involves loading an argument variable with an address
based on %rsp. This lab demonstrates these concepts by completing a
main() function in assembly which has several local variables that
require main memory addresses and passes those addresses to another
function.
Associated Reading / Preparation
Bryant and O'Hallaron Ch 3.7 on assembly procedure call conventions in x86-64 including expansion of the stack.
Grading Policy
Credit for this lab is earned by completing the code/answers in the
Lab codepack and submitting a Zip of the work to Gradescope preferably
via running make submit. Students are responsible for checking that the
results produced locally are reflected on Gradescope after submitting
their completed Zip.
Lab Exercises are Free Collaboration and students are encouraged to cooperate on labs. Students may submit work as groups of up to 5 to Gradescope: one person submits then adds the names of their group members to the submission.
No late submissions are accepted for Lab work but the lowest two lab scores for the semester will be dropped including zeros due to missing submissions. See the full policies in the course syllabus.
2 Codepack
The codepack for the lab contains the following files:
| File | Use | Description |
|---|---|---|
| prime_funcs.c | Study | PROBLEM 1: C version of functions to implement in assembly |
| prime_funcs_asm.s | EDIT | PROBLEM 1: Assembly functions to study and complete |
| prime_main.c | Provided | PROBLEM 1: Main function which calls C/Assembly functions |
| prime_fact.c | Optional | PROBLEM 1: Main function which calls an optional Assembly function |
| order2_c.c | Provided | Problem 2: C version of code for reference |
| order3_c.c | Provided | Problem 2: C version of code for reference |
| order2_asm.s | EDIT | Problem 2: Incomplete Assembly main() function, fill in remaining code |
| order3_asm.s | EDIT | Problem 2: Incomplete Assembly main() function, fill in remaining code |
| QUESTIONS.txt | EDIT | Questions to answer: fill in the multiple choice selections in this file. |
| Makefile | Build | Enables make test and make zip |
| QUESTIONS.txt.bk | Backup | Backup copy of the original file to help revert if needed |
| QUESTIONS.md5 | Testing | Checksum for answers in questions file |
| test_quiz_filter | Testing | Filter to extract answers from Questions file, used in testing |
| test_lab07.org | Testing | Tests for this exercise |
| test_lab07_code.org | Testing | Tests for code portions of the lab run via make test-code |
| testy | Testing | Test running scripts |
| gradescope-submit | Misc | Allows submission to Gradescope from the command line |
3 Using Stack Space for Local Variables
Problem 2 focuses on situations where local variables in a function
like main() require main memory addresses.
- Examine the completed code in order2_c.c first to get a sense of the C code versions of the main() and order2() functions.
- Examine the assembly code in order2_asm.s:
  - COMPLETE the assembly main() according to the provided outline
  - The assembly order2() function is complete and correct and requires no modification.
- Examine the completed code in order3_c.c first to get a sense of the C code versions of the main() and order3() functions. Note that there are several function calls and many more local variables requiring addresses.
- Examine the assembly code in order3_asm.s and complete the TODO sections in main() according to the provided outline. Pay careful attention to the local variable layout table.
Locals in the Stack
Demoers will examine code such as the following fragment from main()
in order2_c.c:
int r=17, t=12;
order2(&r, &t);
It is important to realize that since r,t need memory addresses
for the function call, they cannot exist only in registers. A
compiler will likely place them in the function call stack. This
appears in x86-64 assembly as offsets from the stack pointer %rsp
such as the following fragment in order2_asm.s:
movl $17, 0(%rsp)    # r=17
movl $12, 4(%rsp)    # t=12
Near the top of the assembly code for main() is a table indicating
the locations of all the local variables in the stack.
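The point about locals needing memory addresses can be seen from the C side with a small sketch. The function `order2_sketch()` below is a hypothetical stand-in, assumed to swap its two targets into ascending order; the lab's real `order2()` may behave differently. The helper `demo_locals()` is also invented for illustration.

```c
#include <assert.h>

/* Hypothetical stand-in for order2(): swaps the two pointed-to ints
   if needed so that *a <= *b afterwards. The lab's version may differ. */
void order2_sketch(int *a, int *b) {
    assert(a != 0 && b != 0);   /* both arguments must be real memory addresses */
    if (*a > *b) {
        int tmp = *a;
        *a = *b;
        *b = tmp;
    }
}

/* Mirrors the fragment from main(): r and t have their addresses taken,
   so a compiler must give them slots in main memory, typically the stack. */
int demo_locals(void) {
    int r = 17, t = 12;
    order2_sketch(&r, &t);
    return r * 100 + t;         /* packs both results into one int for checking */
}
```

Because `&r` and `&t` are passed to another function, neither variable can live only in a register for the duration of the call; that is what forces the `0(%rsp)` and `4(%rsp)` slots in the assembly version.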
Creating Stack Space
Prior to writing into the stack the stack pointer %rsp must be
adjusted to grow the stack. Growing the stack is usually done via one
of two methods.
- A subtraction like subq $24, %rsp which will grow the stack by 24 bytes. The specific value is chosen to be large enough for all local variables but leaves that area uninitialized. Subtractions are usually used at the beginning of a function execution to grow the stack.
- A push like pushq %r15 which will grow the stack a little, 8 bytes in this case, and initialize the new space with a value, in this case the value in register %r15. In x86-64 a push of a full register is always 8 bytes; there is no 4-byte push. Several pushX instructions can be used in a row, usually towards the beginning of a function. They are most often used to save registers that will be changed and need to be restored such as %r15 or other callee-save registers.
Note that the stack grows downwards to lower addresses and shrinks
upwards to higher addresses in x86-64. Later when the stack needs to
shrink, the "inverse" instructions are used to adjust %rsp.
- An addition like addq $24, %rsp undoes a subtraction of 24 bytes.
- A pop like popq %r15 which copies the 8-byte value at the top of the stack into the given register and shrinks the stack by 8 bytes.
Keep in mind that any changes to %rsp must be undone before
returning as %rsp must point at the function's return address when
ret is used.
Insert assembly code near the top of main() in the provided
assembly files to grow the stack by an appropriate amount of
space. This is discussed further in the Grow/Shrink section later.
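The grow/shrink operations described above can be modeled in C as a rough sketch. The `toy_stack` type and its functions are invented for illustration, with an array index standing in for %rsp and 8-byte pushes and pops assumed; this is a teaching model, not how the hardware stack is implemented.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define STACK_BYTES 256

/* Toy model of the stack: a byte array plus an offset standing in for %rsp. */
typedef struct {
    uint8_t mem[STACK_BYTES];
    size_t  rsp;                /* offset of the current top of the stack */
} toy_stack;

void toy_init(toy_stack *s) {
    memset(s->mem, 0, STACK_BYTES);
    s->rsp = STACK_BYTES;       /* empty stack starts at the highest address */
}

/* Like subq $n, %rsp: grow downward, new space is left as-is. */
void toy_sub(toy_stack *s, size_t n) {
    assert(n <= s->rsp);
    s->rsp -= n;
}

/* Like addq $n, %rsp: shrink upward. */
void toy_add(toy_stack *s, size_t n) {
    assert(s->rsp + n <= STACK_BYTES);
    s->rsp += n;
}

/* Like pushq: grow by 8 bytes and write a value into the new space. */
void toy_push(toy_stack *s, uint64_t v) {
    toy_sub(s, 8);
    memcpy(&s->mem[s->rsp], &v, 8);
}

/* Like popq: read the 8-byte value at the top, then shrink by 8 bytes. */
uint64_t toy_pop(toy_stack *s) {
    uint64_t v;
    memcpy(&v, &s->mem[s->rsp], 8);
    toy_add(s, 8);
    return v;
}
```

In this model, a push followed by a 24-byte subtraction must be undone in the opposite order, a 24-byte addition and then a pop, for the stack pointer to return to its starting value. That matches the requirement that %rsp be restored before `ret`.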
Address of Locals in Assembly
The preceding assembly fragment is followed by additional instructions
which equate to the address-of &x operator in C to load the stack
locations of several local variables prior to a function call. This
appears as the following assembly code.
movq %rsp, %rdi       # arg1 &r
leaq 4(%rsp), %rsi    # arg2 &t
leaq 8(%rsp), %rdx    # arg3 &v
call order3           # function call: order3(&r, &t, &v);
Note the use of movq to copy the stack pointer to %rdi as %rsp
contains the address of the variable r already while the Load
Effective Address leaq instruction is used to compute the addresses
for variables t,v and store them in registers.
There are similar blocks that follow this initial example and you
should use the table at the top of main() to guide you on where the
various local variables are stored in the stack. Note that this stack
storage is required because the order3() function requires memory
addresses/pointers as arguments so the variables must be stored in
main memory rather than registers.
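One way to picture the offsets used by `movq`/`leaq` is as fields of a struct laid over the bottom of the stack frame. The sketch below is hypothetical: `frame_sketch` and `slot_address()` are invented names, and a 4-byte `int` is assumed, as on typical x86-64 Linux systems. The real layout is whatever the table at the top of main() specifies.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical picture of the first three stack slots: field offsets
   play the role of the displacements in 0(%rsp), 4(%rsp), 8(%rsp). */
typedef struct {
    int r;   /* 0(%rsp) */
    int t;   /* 4(%rsp) */
    int v;   /* 8(%rsp) */
} frame_sketch;

/* Returns base + offset, the same arithmetic that an instruction like
   leaq 4(%rsp), %rsi performs: an address is computed, no memory is read. */
int *slot_address(frame_sketch *base, size_t offset) {
    assert(offset % sizeof(int) == 0 && offset < sizeof(frame_sketch));
    return (int *)((char *)base + offset);
}
```

With a `frame_sketch f`, `slot_address(&f, 4)` yields the same pointer as `&f.t`, while offset 0 yields the base address itself, which is why a plain `movq %rsp, %rdi` suffices for the first variable.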
Calls to printf()
Similarly, there are several blocks that need to be COMPLETE'd to call
printf() to show the results of the ordering. Use the template
provided and adjust the pattern as needed. Note that the printf()
function is special for two reasons.
- It is a "variadic" function which can take an arbitrary number of arguments. This has the special convention that the %eax register is used during function call setup, set to 0 in the sample code to indicate no vector registers are used. This is not essential to understand so copy the pattern provided.
- It is defined in a dynamically linked library and thus uses the Procedure Linkage Table during its call via the syntax call printf@PLT. This may be discussed later in the semester when we study the linking process.
Grow/Shrink the Stack
IMPORTANT:
- The main() function needs space for local variables during its operation so should create enough space for all locals at the beginning of its execution.
- Before returning main() must restore %rsp through add/pop instructions to shrink the stack to its original state where %rsp points to the return address.
Finally, the x86-64 interface dictates that when calling a function
such as in call order2 or call order3, the %rsp should be
divisible by 16, referred to at times as "the stack is aligned for
function calls." This leads to an interesting calculation that the
compiler computes to decide how many bytes to adjust the stack
pointer:
- When a function is called, the stack pointer is divisible by 16; call its value N
- The call instruction pushes 8 bytes for the return address into the stack. The stack pointer is now N-8 which is NOT divisible by 16.
- Even if a function has no locals, if it in turn calls another function, the compiler will usually grow the stack by another 8 bytes to re-align the stack. This is done with instructions like subq $8, %rsp which leaves %rsp with value N-16 which is again divisible by 16.
- If space is required for locals like 36 bytes, then the compiler must grow by this amount such as via subq $36, %rsp leaving %rsp at N-8-36 = N-44. Unfortunately this is not divisible by 16 so often the compiler "pads" the stack growth to get to alignment: rather than growing by 36, grow by 40 bytes giving N-40-8 = N-48 which is divisible by 16.
- Such "padded" expansion of the stack both (1) creates space for locals and (2) prepares for a function call later on in execution.
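The padding arithmetic in the steps above can be written as a small C helper. This is a sketch under the assumptions stated in the list: the stack was 16-byte aligned before the `call`, and the return address takes 8 bytes. The function name `padded_growth()` is invented for illustration, and real compilers may pick larger values for other reasons.

```c
#include <assert.h>

/* Smallest stack growth, in bytes, that holds `locals` bytes and leaves
   %rsp divisible by 16, given that `call` already pushed an 8-byte
   return address onto a stack that was aligned before the call. */
unsigned long padded_growth(unsigned long locals) {
    unsigned long total = locals + 8;                /* locals + return address */
    unsigned long rounded = (total + 15) / 16 * 16;  /* round up to a multiple of 16 */
    assert(rounded >= total);                        /* guards against overflow */
    return rounded - 8;                              /* growth excludes the return address */
}
```

For 36 bytes of locals this gives 40, and for a function with no locals that still makes calls it gives 8, matching the two cases described in the list.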
4 QUESTIONS.txt File Contents
Below are the contents of the QUESTIONS.txt file for the exercise.
Follow the instructions in it to complete the QUIZ and CODE questions
for the exercise.
_________________
LAB07 QUESTIONS
_________________
Exercise Instructions
=====================
Follow the instructions below to experiment with topics related to
this exercise.
- For sections marked QUIZ, fill in an (X) for the appropriate
response in this file. Use the command `make test-quiz' to see if
all of your answers are correct.
- For sections marked CODE, complete the code indicated. Use the
command `make test-code' to check if your code is complete.
- DO NOT CHANGE any parts of this file except the QUIZ sections as it
may interfere with the tests otherwise.
- If your `QUESTIONS.txt' file seems corrupted, restore it by copying
over the `QUESTIONS.txt.bk' backup file.
- When you complete the exercises, check your answers with `make test'
and, if all is well, create a zip file and submit it to Gradescope
with `make submit'. Ensure that the Autograder there reflects your
local results.
- IF YOU WORK IN A GROUP only one member needs to submit and then add
the names of their group on Gradescope.
PROBLEM 1 Overview of prime program
===================================
Examine the following files related to this problem.
-----------------------------------------------------------------------------------------------------
FILE Description
-----------------------------------------------------------------------------------------------------
prime_main.c A main() function in C that calls two functions defined elsewhere
prime_funcs.c C implementations of the primprod() and primsums() functions
prime_funcs_asm.s Assembly implementations of primprod() and primsums(), the 2nd must be completed
-----------------------------------------------------------------------------------------------------
The code performs some simple integer calculations involving
arithmetic on prime numbers. It utilizes a global array called
`primes[]' that contains some prime numbers. This allows
demonstrating how global variables, particularly arrays, are
accessed in assembly. Provided code shows the syntax for this using
RIP-relative addressing which must be used to complete the second
function.
PROBLEM 1 QUIZ
==============
Study the C and Assembly implementations in `prime_funcs.c /
prime_funcs_asm.s' and answer the following questions.
A
~
The variable `primes[]' is declared in the C program outside of any
function making it a global variable. Where does this variable appear
in the Assembly version of the code?
- ( ) It is defined in a separate file `prime_funcs_asm.data' because
Assembly implementation requires that Global Data and Functions be
placed in different compilation units
- ( ) It is lower down in the `prime_funcs_asm.s' file in a `.data'
section which is distinguished from the `.text' section where the
functions are placed
- ( ) It is simply defined above the Assembly functions in
`prime_funcs_asm.s' using the same syntax as in C:
,----
| int data[200] = {2,3,7,...};
`----
- ( ) Trick question: the `primes[]' array is actually defined in
`prime_main.c' and only used in the assembly code.
B
~
The C code for `primprod()' starts with a complex condition to check
for bad parameters. Which of the following best describes how this is
realized in assembly.
- ( ) The instruction `cmpl' is used with multiple operands (8 in this
case) to check all comparisons and a single jump is made if any are
invalid.
- ( ) The instruction `cmpmultl' is used with multiple operands (8 in
this case) to check all comparisons and a single jump is made if any
are invalid.
- ( ) The instruction `cmpl' is used with two operands 4 times in a
row to check each comparison with a jump to an error label made if
the condition is not met.
- ( ) The instruction `cmpmultl' is used with multiple operands (3 in
this case) to check that a register is between two constants. Two
checks are made for the two parameter registers which are followed
by jumps if the condition is not met.
C
~
Which of the following best describes the following instruction?
,----
| leaq primes(%rip), %rcx
`----
- ( ) It loads the address of the global variable `primes' into the
`%rcx' register
- ( ) It moves the value of the global variable `primes' into the
`%rcx' register
- ( ) It creates the global variable `primes' at the location
indicated by the `%rcx' register
- ( ) It creates the global variable `primes' at the location
indicated by the `%rip' register and gives it the initial value
stored in `%rcx'
D
~
Which of the following best describes the following instruction which
appears soon after the one above?
,----
| movl (%rcx,%rdi,4), %eax
`----
- ( ) It computes the arithmetic expression `4*rdi+rcx' and stores
that value in `%eax'.
- ( ) It treats `%rcx' as a pointer to an array, `%rdi' as an index
into the array, and the array elements as size 4; it copies the
value in `%eax' into all elements of array up to index `%rdi'
- ( ) It treats `%rcx' as a pointer to an array, `%rdi' as an index
into the array, and the array elements as size 4; it copies the
value in `%eax' into the array at the given position
- ( ) It treats `%rcx' as a pointer to an array, `%rdi' as an index
into the array, and the array elements as size 4; it copies the
value from the array into the register `%eax'
E
~
After completing the code and building the prime_fact_asm executable,
use the objdump utility as shown below to disassemble it and show the
compiled instructions.
,----
| ## Build the assembly version of the program
| >> make prime_fact_asm
| gcc -Wall -Werror -g -fstack-protector-all -o prime_fact_asm prime_fact.c prime_funcs_asm.s
|
| ## Disassemble the program to show the resulting compiled instructions
| >> objdump -d prime_fact_asm
| prime_fact_asm: file format elf64-x86-64
| Disassembly of section .init:
| ...
`----
In the output, find the `primprod()' function and study the
instructions there. Which of the following best describes how the
disassembled form of the `leaq' instruction appears in this
output?
- ( ) It appears as the same as the original source code, as a line
like
,----
| leaq primes(%rip), %rcx
`----
- ( ) It appears nearly the same as the original source code but the
global variable name has been replaced with a numeric offset like
,----
| leaq 0x2d60(%rip), %rcx
`----
- ( ) It appears similar to the original source code but with some
additional information like the address and binary opcodes like
,----
| 12c9: 48 8d 0d 60 2d 00 00 lea primes(%rip),%rcx
`----
- ( ) It appears similarly but additional information is present and
the global name has been substituted with a numeric offset as in:
,----
| 12c9: 48 8d 0d 60 2d 00 00 lea 0x2d60(%rip),%rcx
`----
PROBLEM 1 CODE
==============
Complete the assembly implementation of `primsums()'. Use the provided
C code as a reference for its behavior and the techniques
demonstrated in the `primprod()' Assembly function to complete the
function. To complete the function, you'll need a good grasp on
- Which registers are used for function parameters and its return
value
- How to create a simple loop in assembly
- How to set up a pointer to a global array
- How to access array elements
You can test your code for the problem via the provided Makefile:
,----
| make test-prob1
`----
PROBLEM 1 OPTIONAL Code Practice
================================
For additional practice, complete the optional function `primfact()'
in `prime_funcs_asm.s'. The C version of this is provided in
`prime_funcs.c' and a `main()' for it is present in `prime_fact.c'.
No test cases are available but when compiling, two executables are
produced
- `prime_fact_c' based on the C provided implementation
- `prime_fact_asm' based on the Assembly implementation
and their output can be compared to determine if the code is correct.
NOTE: This is a more challenging function as it requires the use of
division which has a number of interesting effects such as clobbering
several registers. It makes for EXCELLENT practice in preparation for
projects involving assembly.
PROBLEM 2A: order2
==================
CODE order2_asm.s
~~~~~~~~~~~~~~~~~
Examine the code in `order2_asm.s' and compare it to `order2_c.c'.
The only change that is needed is marked as TODO and involves
extending the stack to make space for local variables that must be in
memory. Examine this code and discuss with lab staff how to grow the
stack to fit the required local variables.
QUIZ Questions
~~~~~~~~~~~~~~
Stack Space
-----------
To create enough stack space for 2 integers in x86-64, the stack must
grow by this amount
- ( ) 2 bytes
- ( ) 4 bytes
- ( ) 8 bytes
- ( ) 16 bytes
Growing the Stack
-----------------
Which of the following instructions can be used to extend / grow the
stack?
Instruction Description
---------------------------------------------------------------------------------------------------------------
A subq $20, %rsp extends the stack by 20 bytes, data is uninitialized
B pushq %rbx extends the stack by 8 bytes, 8-byte value in register rbx written at the top of the stack
C pushl $99 extends the stack by 4 bytes, 4-byte value 99 written at the top of the stack
- ( ) A only
- ( ) B only
- ( ) C only
- ( ) Any combination of A/B/C: they all grow the stack
Moving data to main memory
--------------------------
This code appears in `order2_asm.s'.
,----
| movl $17, 0(%rsp)
| movl $12, 4(%rsp)
`----
Which of the following best describes the sequence?
- ( ) Copies the integer 17 to memory where the stack pointer points
and the integer 12 to memory 4 bytes above the stack pointer
- ( ) Copies the value 17 to the stack pointer and then overwrites that
with 17+4=21 to the stack pointer which grows the stack
- ( ) Copies data from memory where the stack pointer points to memory
address 17 and copies data from 4 bytes above the stack pointer to
memory address 12
- ( ) Compares the stack pointer address to 17 and to 12 to determine
if it should be extended by 4 bytes
PROBLEM 2B: order3
==================
CODE order3_asm.s
~~~~~~~~~~~~~~~~~
Complete the `main()' function in `order3_asm.s'. This will require
completing the `TODO' sections in the code to grow the stack, populate
the stack with local variable values, call several functions with the
addresses of local variables, and then shrink the stack.
To help understand the intent of the assembly code, you can analyze
the equivalent C code in `order3_c.c' which performs the same
"computation" in C including use of the address-of operator.
When written correctly, the program should compile and run as follows.
,----
| > make
| gcc -Wall -Werror -g -o order3_c order3_c.c
| gcc -Wall -Werror -g -o order3_asm order3_asm.s
|
| > ./order3_asm
| r t v: 12 13 17
| q e d: 2 5 9
| i j k: 24 27 29
`----
If mistakes in the stack manipulation are present, this can lead to
problems late in the program. Valgrind can give a little insight but
generally these are difficult problems to diagnose so be careful. For
example, below is a transcript of an incorrectly written version which
does not allocate the correct amount of space in the stack for the
local variables.
,----
| > ./order3_asm # run broken version
| r t v: 12 13 17 # output looks okay
| q e d: 2 5 9
| i j k: 24 27 29
| Segmentation fault (core dumped) # uh-oh...
|
| > valgrind ./order3_asm # see if valgrind gives any help
| ==2508984== Memcheck, a memory error detector
| ==2508984== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
| ==2508984== Using Valgrind-3.16.1 and LibVEX; rerun with -h for copyright info
| ==2508984== Command: ./order3_asm
| ==2508984==
| r t v: 12 13 17 # output OK...
| q e d: 2 5 9
| i j k: 24 27 29
| ==2508984== Jump to the invalid address stated on the next line
| ==2508984== at 0x1D: ???
| ==2508984== by 0x1FFF000557: ???
| ==2508984== by 0x10489EF72: ???
| ==2508984== by 0x109138: ??? (in ./order3_asm)
| ==2508984== by 0x7FFFFFFFF: ???
| ==2508984== Address 0x1d is not stack'd, malloc'd or (recently) free'd
| ==2508984== # ADDRESS 0x1d is really small; probably clobbered
| ==2508984== # return address during execution, look at stack carefully
| ==2508984== Process terminating with default action of signal 11 (SIGSEGV): dumping core
| ==2508984== Bad permissions for mapped region at address 0x1D
| ==2508984== at 0x1D: ???
| ==2508984== by 0x1FFF000557: ???
| ==2508984== by 0x10489EF72: ???
| ==2508984== by 0x109138: ??? (in ./order3_asm)
| ==2508984== by 0x7FFFFFFFF: ???
| ==2508984==
| ==2508984== HEAP SUMMARY:
| ==2508984== in use at exit: 0 bytes in 0 blocks
| ==2508984== total heap usage: 1 allocs, 1 frees, 1,024 bytes allocated
| ==2508984==
| ==2508984== All heap blocks were freed -- no leaks are possible
| ==2508984==
| ==2508984== For lists of detected and suppressed errors, rerun with: -s
| ==2508984== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)
| Segmentation fault (core dumped)
`----
After completing both `order2_asm.s' and `order3_asm.s' they can be
tested via
,----
| make test-prob2
`----
QUIZ Questions
~~~~~~~~~~~~~~
Answer the following questions on stack manipulation and function
calls in assembly.
call / callq effects on Stack Pointer
-------------------------------------
When calling a function via `call / callq', the stack pointer
`%rsp' must be "aligned", i.e. divisible by 16. Assuming this is so,
how does the `callq' instruction change `%rsp'?
- ( ) `callq' does not change `%rsp' at all. `%rsp' is
therefore still divisible by 16 after the instruction completes.
- ( ) `callq' will subtract off 8 from the value of `%rsp' and places
the return address at the top of the stack. `%rsp' is then divisible
by 8 but not 16.
- ( ) `callq' will subtract off 16 from the value of `%rsp' and places
the return address at the top of the stack. `%rsp' is therefore
still divisible by 16.
- ( ) `callq' will subtract off 24 from the value of `%rsp' and places
the return address at the top of the stack. `%rsp' is then divisible
by 8 but not 16.
Alignment
---------
If the total size of local variables that need main memory space in a
function is 36 bytes, one approach is to grow the stack by 36 bytes
exactly. BUT if functions are to be called during that function, then
it would be better to...
- ( ) No special action is required: growing by 36 bytes is a good
idea as it saves memory while growing larger would waste memory.
- ( ) No special action is required: the `callq' instruction
automatically changes `%rsp' to be a value that is divisible by 16.
- ( ) Grow the stack by 48 bytes; this will mean `%rsp' is aligned at
a 16-byte boundary and ready for function calls.
- ( ) Grow the stack by 40 bytes; this + 8 bytes for the return
address in the stack will mean `%rsp' is aligned at a 16-byte
boundary and ready for function calls.
5 Submission
Follow the instructions at the end of Lab01 if you need a refresher on how to upload your completed lab zip to Gradescope.