Writing Buffer Overflow Exploits - a Tutorial for Beginners
Summary
Buffer overflows in buffers that are filled from user input have
become one of the biggest security hazards on the internet and in
modern computing in general. Such an error is easy to make at the
programming level, and although it is invisible to a user who does
not understand or cannot obtain the source code, many of these
errors are easy to exploit. This paper attempts to teach the
novice-to-average C programmer how an overflow condition can be
proven to be exploitable.
Details
1. Memory
Note: The memory organization described here is the one used on
most computers, but the details depend on the processor
architecture. This example is for x86 and roughly applies to
SPARC.
The principle of exploiting a buffer overflow is to overwrite parts
of memory that are not supposed to be overwritten by arbitrary
input, and then to make the process execute the code supplied in
that input. To see how and where an overflow takes place, let us
look at how memory is organized. A page is a part of memory that
uses its own relative addressing: the kernel allocates initial
memory for the process, which the process can then access without
having to know where that memory is physically located in RAM. The
process's memory consists of three sections:
- Code segment: the data in this segment are the assembler
instructions that the processor executes. Code execution is
non-linear; it can skip code, jump, and call functions under
certain conditions. Therefore there is a pointer called EIP, the
instruction pointer. The address that EIP points to always contains
the code that will be executed next.
- Data segment: space for variables and dynamic buffers.
- Stack segment: used to pass data (arguments) to functions and as
space for the local variables of functions. The bottom (start) of
the stack usually resides at the very end of the virtual memory of
a page, and the stack grows down toward lower addresses. The
assembler instruction PUSHL adds an item to the top of the stack,
and POPL removes one item from the top of the stack and puts it in
a register. For accessing stack memory directly, there is the stack
pointer ESP, which points at the top (lowest memory address) of the
stack.
2. Functions
A function is a piece of code in the code segment that is called,
performs a task, and then returns to the previous thread of
execution. Optionally, arguments can be passed to a function. In
assembler, it usually looks like this (a very simple example, just
to get the idea):
memory address          code
0x8054321 <main+x>      pushl $0x0
0x8054322               call 0x80543a0 <function>
0x8054327               leave
0x8054328               ret
...
0x80543a0 <function>    popl %eax
0x80543a1               addl $0x1337,%eax
0x80543a4               ret
What happens here? The main function calls function(0). The
argument is 0; main pushes it onto the stack and calls the
function. In this simplified listing the function fetches the
argument from the stack using popl; on a real x86 the call
instruction leaves the return address on top of the stack, so a
real function reads its arguments at an offset from ESP or EBP
instead. After finishing, the function returns to 0x8054327.
Commonly, the called function first pushes the caller's EBP
register onto the stack and restores it before returning. This is
the frame pointer concept, which lets the function use its own
offsets for addressing. It is mostly uninteresting when dealing
with exploits, because the function will not return to the original
thread of execution anyway. We just have to know what the stack
looks like. At the top, we have the internal buffers and variables
of the function. After these comes the saved EBP register (32 bits,
which is 4 bytes), and then the return address, which is again 4
bytes. Going further down, toward the bottom of the stack (higher
addresses), there are the arguments passed to the function, which
are uninteresting to us.
In this case, our return address is 0x8054327. It is automatically
stored on the stack when the function is called. If there is an
overflow somewhere in the code, this return address can be
overwritten and changed to point to any location in memory.