Shellcode & ROP

1. Lecture 10
2. Shellcode
3. Code Reuse Attack

1. Lecture 10

Class: Malware Analysis and Incident Forensics
Topic: Shellcode & ROP

2. Shellcode

Shellcode is a blob of self-contained executable code. Historically, the word refers to the ability to “obtain a shell”, but in general shellcode can perform arbitrary tasks. Self-contained means that it makes no assumption about where it will be placed in memory: external dependencies such as API addresses must be resolved manually, and the shellcode determines its own run-time location and uses relative offsets from it to transfer control and to access its own data. It also consists of a single section that holds both code and data.

For exploitation, one can supply a poisoned input containing shellcode to a vulnerable program, gaining arbitrary code execution and abusing the process to do something on the attacker's behalf. This technique is abundant in infected documents. Before DEP/NX, vulnerable programs were hijacked to run shellcode directly; today a further step is needed.

2.1. Stack Buffer Overflow

Stack overflow based attacks were explained for the first time in “Smashing the Stack for Fun and Profit”, written by Aleph One for Phrack (1996). The attacker makes the program fill a stack buffer with untrusted input crafted to contain executable code. This is especially relevant for memory-unsafe languages like C/C++ that support direct memory access.

An input supplied to the process is stored in a stack buffer. The input contains data that overwrites the return address so that it points back into the buffer, to where the attack code begins. Data Execution Prevention can defuse this attack.
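
The overwrite described above can be illustrated with a harmless simulation. This is only a sketch: the "stack frame" is a Python bytearray, and the buffer size, the simulated addresses, and all function names are assumptions made for illustration, not something taken from the lecture.

```python
import struct

BUF_SIZE = 16        # size of the local buffer (hypothetical)
FRAME_BASE = 0x1000  # simulated address where the buffer starts (hypothetical)

def make_frame(return_address: int) -> bytearray:
    """Lay out [buffer | saved return address] contiguously, as on a stack."""
    return bytearray(BUF_SIZE) + struct.pack("<Q", return_address)

def unsafe_copy(frame: bytearray, data: bytes) -> None:
    """Copy without checking len(data) against BUF_SIZE, like an unchecked memcpy."""
    frame[:len(data)] = data

def saved_return(frame: bytearray) -> int:
    """Read back the 8-byte little-endian saved return address."""
    return struct.unpack_from("<Q", frame, BUF_SIZE)[0]

frame = make_frame(0x400123)
unsafe_copy(frame, b"A" * 8)          # input fits: return address untouched
print(hex(saved_return(frame)))       # 0x400123

# 24 bytes copied into a 16-byte buffer: the last 8 replace the return address
oversized = b"\x90" * BUF_SIZE + struct.pack("<Q", FRAME_BASE)
unsafe_copy(frame, oversized)
print(hex(saved_return(frame)))       # 0x1000, i.e. back into the buffer
```

The second copy shows the key point: the return address now holds the address of the buffer itself, which is where injected code would sit.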

2.2. Typical Injection Vector

The injection vector ends with the address of the shellcode (where it starts); before it comes the code itself, preceded by a buffer. The purpose of the buffer is to ease exploitation when the jump target is not easy to control, so it contains a sequence of one-byte NOPs (a NOP sled).

So an injection vector is crafted by:

writing the code,
prepending a NOP sled,
and then adding the address at the tail.
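
The three steps above can be sketched as a byte layout. This is a minimal illustration under stated assumptions: the "code" is an inert placeholder (int3 bytes), the addresses are made up, and `build_vector` and `reaches_code` are hypothetical helper names.

```python
import struct

NOP = b"\x90"                    # one-byte x86 NOP
PLACEHOLDER_CODE = b"\xcc" * 4   # inert stand-in for the code (int3 breakpoints)

def build_vector(code: bytes, sled_len: int, target: int) -> bytes:
    """Return sled + code + 8-byte little-endian target address at the tail."""
    return NOP * sled_len + code + struct.pack("<Q", target)

def reaches_code(buffer_base: int, sled_len: int, target: int) -> bool:
    """True if target lands in the sled or exactly on the first code byte."""
    return buffer_base <= target <= buffer_base + sled_len

vector = build_vector(PLACEHOLDER_CODE, sled_len=32, target=0x7000 + 10)
print(len(vector))                             # 32 + 4 + 8 = 44
print(reaches_code(0x7000, 32, 0x7000 + 10))   # True: imprecise guess still slides in
print(reaches_code(0x7000, 0, 0x7000 + 10))    # False: without a sled the guess misses
```

The last two lines show why the sled helps: with it, any target inside the sled eventually reaches the code, so the tail address does not have to be exact.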

mystart is an offset, so relocation is not needed. The important part is to have a call followed by a pop: the call pushes the address of the next instruction and the pop retrieves it, revealing where the real shellcode is.

To stop this kind of attack a combination of techniques can be used: DEP/NX, canaries and ASLR.

2.2.1. NX

With NX, a memory page can’t be both writable and executable, so code injected into a writable region such as the stack cannot run.
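
As a rough model of this rule, the following sketch simulates pages with permission bits. The `Perm` and `Page` names are assumptions for illustration; real enforcement happens in the MMU and the operating system, not in user code.

```python
from enum import Flag, auto

class Perm(Flag):
    R = auto()
    W = auto()
    X = auto()

class Page:
    """A simulated memory page with permission bits."""

    def __init__(self, perm: Perm):
        # Policy from the notes: never writable and executable at once
        if Perm.W in perm and Perm.X in perm:
            raise ValueError("page cannot be both writable and executable")
        self.perm = perm
        self.data = b""

    def write(self, data: bytes) -> None:
        if Perm.W not in self.perm:
            raise PermissionError("page is not writable")
        self.data = data

    def execute(self) -> bytes:
        if Perm.X not in self.perm:
            raise PermissionError("page is not executable")
        return self.data  # stand-in for "running" the bytes

stack = Page(Perm.R | Perm.W)
stack.write(b"\xcc")          # injecting bytes into the stack still works
try:
    stack.execute()           # but transferring control to them is refused
except PermissionError as err:
    print("blocked:", err)
```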

2.2.2. Stack Canaries

Canaries are a way to detect whether pointers on the stack have been tampered with. The compiler inserts a guard value next to them, and the goal is to detect a buffer overflow that reaches the return address. If the canary has been modified, the process kills itself.

There are different kinds of stack canaries. A terminator canary (containing bytes like NULL) prevents an attacker from using strcpy to write past it, because the copy stops at the terminator. But not all programs rely on strcpy, so a second type is the random canary, which prevents attacks that rely on knowledge of the canary value. Even better, the canary value can be used as an XOR key to decrypt a pointer.
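
The three variants can be compared in a small simulation. It is a sketch only: the frame layout [buffer | canary | return address], the sizes, the specific terminator bytes, and every function name are assumptions chosen for illustration.

```python
import secrets
import struct

BUF_SIZE = 8
TERMINATOR = b"\x00\r\n\xff"  # NUL, CR, LF, 0xFF: bytes that stop common copy routines

def new_frame(canary: bytes, ret: int) -> bytearray:
    """Lay out [buffer | canary | saved return address]."""
    return bytearray(BUF_SIZE) + canary + struct.pack("<Q", ret)

def strcpy_like(frame: bytearray, src: bytes) -> None:
    """Unbounded copy that stops at the first NUL in src, then writes a NUL."""
    data = src.split(b"\x00", 1)[0] + b"\x00"
    frame[:len(data)] = data

def canary_intact(frame: bytearray, canary: bytes) -> bool:
    """The check a compiler-inserted epilogue would perform before returning."""
    return bytes(frame[BUF_SIZE:BUF_SIZE + len(canary)]) == canary

def saved_return(frame: bytearray, canary: bytes) -> int:
    return struct.unpack_from("<Q", frame, BUF_SIZE + len(canary))[0]

def xor_pointer(pointer: int, key: int) -> int:
    """Encrypt or decrypt a stored pointer with the canary as XOR key."""
    return pointer ^ key

# Terminator canary: forging it requires a NUL, which ends the copy early
frame = new_frame(TERMINATOR, 0x400123)
forged = b"A" * BUF_SIZE + TERMINATOR + struct.pack("<Q", 0x41414141)
strcpy_like(frame, forged)
print(canary_intact(frame, TERMINATOR), hex(saved_return(frame, TERMINATOR)))

# Random canary: a raw overwrite that does not know the value is detected
canary = secrets.token_bytes(8)
frame = new_frame(canary, 0x400123)
frame[:] = b"B" * len(frame)
print(canary_intact(frame, canary))

# XOR variant: an attacker-written raw pointer decrypts to something else
key = int.from_bytes(canary, "little")
stored = xor_pointer(0x400123, key)
print(xor_pointer(stored, key) == 0x400123)
```

In the terminator case the return address is never reached while the canary stays valid; in the random case the overflow reaches it but the check fails first.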

2.2.3. ASLR

ASLR makes it difficult to supply the right address for the start of the shellcode, and it also randomizes the position of core modules.
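
How hard the guess becomes can be estimated with a one-line formula. The sketch below assumes the address is drawn uniformly from 2^n positions and re-randomized before every attempt, so that guesses are independent; the entropy values used are illustrative, not measurements of any real system.

```python
def hit_probability(entropy_bits: int, attempts: int = 1) -> float:
    """Chance that at least one of `attempts` independent guesses is right."""
    single = 1 / 2 ** entropy_bits
    return 1 - (1 - single) ** attempts

print(hit_probability(0))   # 1.0: no randomization, the address is known
print(hit_probability(8))   # 0.00390625, i.e. 1 in 256
```

Under these assumptions each extra bit of entropy halves the chance of a single guess, which is why more randomized bits make brute forcing less practical.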

2.3. More on shellcode

Malware can inject and execute shellcode in another process, and malware can also allocate memory itself, so it does not have to worry about ASLR. We therefore treat shellcode as position-independent code: the malware makes no assumption about where it will be located. For this reason, instructions and data are referenced via relative offsets, so some jmps and calls (those with absolute targets) must be avoided; for data, a base pointer must be leaked.
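
The difference between absolute and relative references can be shown with a toy memory model. This is a sketch under assumptions: memory is a bytearray, the blob holds inert placeholder bytes plus a short data string, and the base is taken as already known (in real shellcode it would be leaked at run time, for example with the call/pop pair mentioned earlier).

```python
MEMORY_SIZE = 256
CODE = b"\xcc" * 12        # inert placeholder for the code part
DATA = b"hello"            # data carried inside the same blob
BLOB = CODE + DATA
DATA_OFFSET = len(CODE)    # where DATA sits relative to the blob start

def load(blob: bytes, base: int) -> bytearray:
    """Place the blob at an arbitrary base in a zeroed memory."""
    memory = bytearray(MEMORY_SIZE)
    memory[base:base + len(blob)] = blob
    return memory

def read_absolute(memory: bytearray, address: int, size: int) -> bytes:
    return bytes(memory[address:address + size])

def read_relative(memory: bytearray, base: int, offset: int, size: int) -> bytes:
    """Resolve the reference from the run-time base plus a fixed offset."""
    return read_absolute(memory, base + offset, size)

ASSUMED_BASE = 0x10                       # what a hard-coded address would assume
hardcoded = ASSUMED_BASE + DATA_OFFSET
actual_base = 0x40                        # where the blob really ends up
memory = load(BLOB, actual_base)

print(read_absolute(memory, hardcoded, len(DATA)))                  # b'\x00\x00\x00\x00\x00'
print(read_relative(memory, actual_base, DATA_OFFSET, len(DATA)))   # b'hello'
```

The hard-coded address only works if the blob lands exactly where it was assumed to; the relative reference works at any base.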

2.3.1. Relative Offsets

3. Code Reuse Attack

With the introduction of NX, classic buffer overflow attacks are very difficult to carry out. Good hardening will also stop memory allocation from other processes and permit a program to modify only its own memory permissions.

Historically, the first solution was the reuse of library functions, chained together to make multiple function calls (return-to-libc). The problems are that libc is difficult to locate under ASLR, and that the chain offers no control flow.

Return Oriented Programming (ROP) is the evolution of return-to-libc. It is composed of the sequential use of small sequences of instructions, each called a gadget (instructions + ret). ROP effectively bypasses DEP because all the executed instructions are legitimate code already present in memory, and it is hard to analyze with conventional reverse engineering tools. Shadow stacks and control-flow integrity are the main countermeasures.

Data-oriented attacks hijack the data of a program to perform unwanted computation. ROP gadgets can be interleaved with data, so gadget addresses and data operands sit next to each other.
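
The interleaving of gadget addresses and data operands can be seen in a toy interpreter. This is a sketch, not a model of any real binary: the gadget addresses, the two registers, the `HALT` marker, and the gadget functions are all hypothetical. Each gadget does one small operation and then "returns", which here means the loop takes the next address from the stack.

```python
HALT = 0x0  # hypothetical marker that ends the chain

def pop_a(regs, pop):      # models: pop a; ret
    regs["a"] = pop()

def pop_b(regs, pop):      # models: pop b; ret
    regs["b"] = pop()

def add_a_b(regs, pop):    # models: add a, b; ret
    regs["a"] += regs["b"]

GADGETS = {0x1000: pop_a, 0x2000: pop_b, 0x3000: add_a_b}

def run_chain(stack):
    """Execute a chain: the stack drives control flow, one 'ret' at a time."""
    regs = {"a": 0, "b": 0}
    sp = 0

    def pop():
        nonlocal sp
        value = stack[sp]
        sp += 1
        return value

    while True:
        target = pop()             # "ret": next address comes from the stack
        if target == HALT:
            return regs
        GADGETS[target](regs, pop)

# Gadget addresses and data operands sit next to each other on the stack
chain = [0x1000, 40, 0x2000, 2, 0x3000, HALT]
print(run_chain(chain)["a"])       # 42
```

No new code is introduced by the chain itself: it only supplies addresses of existing snippets and the values they consume.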