Lifting shared libraries & PIE binaries with DragonFFI (and LIEF)
DragonFFI 0.6.0 comes with new APIs that can be used to call arbitrary functions within a shared library (or a position-independent executable, aka PIE binary).
This short post shows how to use these new APIs. Impatient readers can jump directly to the Python scripts that do this.
DragonFFI’s Python bindings can be installed using pip install pydffi. Wheels have been precompiled for OSX/Windows/Linux for x64. The 0.6.0 release is the first one where these packages have been fully built using GitHub Actions!
Introduction
Let’s say we have a closed-source shared library (or a PIE binary). Calling an exported function from this library is supported by the (famous) combo dlopen + dlsym (or LoadLibrary + GetProcAddress on Windows), and is also easy to do from Python with FFI libraries like ctypes, cffi and, obviously, DragonFFI. Problems arise when we want to call a non-exported function from that binary, with the added bonus of doing it directly from Python.
The classical way to do this is:
- load this library into the Python process
- find its loading base address
- compute the function address (thanks to the base address)
- call it using an FFI library, e.g. ctypes
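For comparison, the four steps above can be sketched with ctypes alone. This is a minimal, Linux-only sketch: it assumes that the lowest address mapped from the library file in /proc/self/maps is its load base (true for typical ELF shared objects), and that the target function has the signature void(uint8_t*, size_t, uint8_t). The helper names find_base_address and call_at_offset are made up for this illustration.

```python
import ctypes
import os


def find_base_address(maps_text, lib_path):
    """Return the lowest start address mapped from lib_path.

    maps_text is the content of /proc/<pid>/maps (Linux only).
    """
    starts = []
    for line in maps_text.splitlines():
        # Fields: address range, perms, offset, dev, inode, pathname
        fields = line.split(None, 5)
        if len(fields) == 6 and fields[5] == lib_path:
            starts.append(int(fields[0].split("-")[0], 16))
    if not starts:
        raise LookupError("%s is not mapped in this process" % lib_path)
    return min(starts)


def call_at_offset(lib_path, offset, data, key):
    """Call the function located at `offset` inside lib_path on a copy of data.

    Assumed signature of the target: void(uint8_t*, size_t, uint8_t).
    """
    lib_path = os.path.realpath(lib_path)
    # Step 1: load the library (keep the handle referenced while we use it)
    handle = ctypes.CDLL(lib_path)
    # Step 2: find its loading base address
    with open("/proc/self/maps") as maps:
        base = find_base_address(maps.read(), lib_path)
    # Step 3: compute the function address and bind a function type to it
    functy = ctypes.CFUNCTYPE(None, ctypes.POINTER(ctypes.c_uint8),
                              ctypes.c_size_t, ctypes.c_uint8)
    func = functy(base + offset)
    # Step 4: call it on a mutable copy of the input
    buf = (ctypes.c_uint8 * len(data)).from_buffer_copy(data)
    func(buf, len(data), key)
    del handle
    return bytes(buf)
```

With the toy library built later in this post, this would be used as call_at_offset("/path/to/mylib.so", 0x1130, data, 0xAA). It works, but it is tied to Linux and to the /proc layout, which is the kind of plumbing DragonFFI hides.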
Let’s see how DragonFFI makes all of this easy, using a shared library as an example. A later section gives links for the extra steps needed to dynamically load a PIE binary under Linux (using LIEF).
Toy library example
Let’s say we have a very elite library that decrypts and prints buffers:
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

// XOR every byte of buf with key, in place.
__attribute__((noinline)) static void decrypt(uint8_t* buf, const size_t len, uint8_t const key)
{
  for (size_t i = 0; i < len; ++i) {
    buf[i] ^= key;
  }
}

// Exported entry point: decrypt a copy of buf and print it as a C string.
__attribute__((visibility("default"))) void decrypt_print(uint8_t const* buf, size_t len, uint8_t const key)
{
  if (!buf) return;
  uint8_t* decr = malloc(len+1);
  if (!decr) return;
  memcpy(decr, buf, len);
  decr[len] = 0;
  decrypt(decr, len, key);
  puts((char const*)decr);
  free(decr);
}
Note the __attribute__((noinline)) attribute on the decrypt function, so that we are sure this function ends up in the compiled binary as a separate function instead of being inlined into decrypt_print.
Let’s assume we are only given a binary version of this library, compiled like this:
gcc -O2 -fvisibility=hidden mylib.c -fPIC -shared -o mylib.so -Wl,-strip-all
nm tells us that the only exported symbol is decrypt_print:
$ nm -D --defined-only mylib.so
0000000000001150 T decrypt_print
Reverse engineering this library should give us the relative address of the decrypt function. On the system used to write this post, this relative address is 0x1130 in the resulting mylib.so file.
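Since decrypt_print is exported, the offset found by reverse engineering can also be expressed relative to that symbol, which gives a way to cross-check the address computed from the base address. The sketch below parses the nm output shown above and derives that delta. parse_nm_output and hidden_address are hypothetical helper names, and both offsets are the ones from the system used to write this post.

```python
def parse_nm_output(text):
    """Map symbol names to their addresses from `nm` output lines."""
    symbols = {}
    for line in text.splitlines():
        parts = line.split()
        # Defined symbols look like: <hex address> <type letter> <name>
        if len(parts) == 3:
            address, _kind, name = parts
            symbols[name] = int(address, 16)
    return symbols


NM_OUTPUT = "0000000000001150 T decrypt_print\n"
DECRYPT_OFFSET = 0x1130  # found by reverse engineering, system-specific

symbols = parse_nm_output(NM_OUTPUT)
# Distance from the exported symbol to the hidden function
delta = DECRYPT_OFFSET - symbols["decrypt_print"]


def hidden_address(exported_runtime_address):
    """Runtime address of decrypt, given the runtime address of decrypt_print."""
    return exported_runtime_address + delta


print(hex(symbols["decrypt_print"]), delta)
```

Here decrypt sits 0x20 bytes before decrypt_print, so both ways of computing its runtime address (base address plus 0x1130, or address of decrypt_print minus 0x20) should agree.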
Load the library and get its base address
DragonFFI provides the cross-platform pydffi.dlopen function to load a shared library into the Python process:
import pydffi
lib = pydffi.dlopen("/path/to/mylib.so")
Note that, on some systems (e.g. Linux), a full path to the library must be provided to dlopen.
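One way to make sure a full path is always passed is to resolve it before calling dlopen. A minimal sketch, where full_library_path is a hypothetical helper name and not part of DragonFFI:

```python
import os


def full_library_path(path):
    """Expand ~ and turn a possibly relative path into an absolute one."""
    return os.path.realpath(os.path.expanduser(path))


print(full_library_path("mylib.so"))
```

The result can then be handed over as pydffi.dlopen(full_library_path("mylib.so")).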
The object returned by dlopen (lib in our example) has a baseAddress property containing the base address where the library has been loaded:
print(hex(lib.baseAddress))
This is supported on OSX/Windows/Linux. Note that, under OSX, baseAddress can have O(n) complexity, with n being the number of libraries loaded in the Python process.
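A quick sanity check on a base address is to read the first bytes at that address and compare them with the magic number of the platform’s executable format. The sketch below uses ctypes for the read. The helper name image_format_at is made up for this example, and it assumes that the file header is mapped at the base address, which is usually but not necessarily the case.

```python
import ctypes

# Magic numbers found at the start of common executable formats
MAGICS = {
    b"\x7fELF": "ELF",              # Linux
    b"MZ": "PE",                    # Windows
    b"\xcf\xfa\xed\xfe": "Mach-O",  # 64-bit little-endian Mach-O (OSX)
}


def image_format_at(address):
    """Return the format name whose magic matches the bytes at address."""
    head = ctypes.string_at(address, 4)
    for magic, name in MAGICS.items():
        if head.startswith(magic):
            return name
    return None
```

With the lib object from above, image_format_at(lib.baseAddress) would be expected to return "ELF" on Linux. Reading from a wrong address can crash the process, so this is only meant as a debugging aid.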
Let’s see two methods to call the decrypt function from such a shared library.
Call the function, method 1
The most straightforward way to call an arbitrary function from that binary is to dynamically add new symbols to DragonFFI’s runtime dynamic linker (which is basically LLVM’s).
This can be achieved using the pydffi.addSymbol function, after which we can call the decrypt function as we usually would:
import pydffi
# Load the library into our current Python process
lib = pydffi.dlopen("/path/to/mylib.so")
BASE = lib.baseAddress
# TODO: put the offset for your library
DECRYPT_OFFSET = 0x1130
# Add the "decrypt" symbol to the dynamic linker
pydffi.addSymbol("decrypt", BASE+DECRYPT_OFFSET)
# Declare the "decrypt" function interface, as we would do with any other
# classical one
FFI = pydffi.FFI()
CU = FFI.cdef('''
#include <stdint.h>
#include <stdlib.h>
void decrypt(uint8_t* buf, const size_t len, uint8_t const key);
''')
# Call the decrypt function
key = 0xAA
buf = bytearray(c^key for c in b"Hello World!")
CU.funcs.decrypt(buf, len(buf), key)
print(buf)
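Because decrypt XORs every byte with the key, applying it to the XOR-encoded buffer restores the original message, so the script should print bytearray(b'Hello World!'). A pure-Python reference is handy to check the native result against. xor_bytes is a helper written for this illustration and is not part of DragonFFI.

```python
def xor_bytes(data, key):
    """Pure-Python equivalent of the library's decrypt, returning a copy."""
    return bytes(c ^ key for c in data)


key = 0xAA
encrypted = xor_bytes(b"Hello World!", key)
# XOR with the same key is its own inverse
decrypted = xor_bytes(encrypted, key)
print(decrypted)
```

Comparing bytes(buf) from the script above with xor_bytes(encrypted, key) is a cheap way to confirm that the offset points at the right function.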
Call the function, method 2
Another way to call a function at an arbitrary address is to declare function types, and instantiate them with an arbitrary address:
import pydffi
# Load the library into our current Python process
lib = pydffi.dlopen("/path/to/mylib.so")
BASE = lib.baseAddress
DECRYPT_OFFSET = 0x1130
# Define the "decrypt" function type
FFI = pydffi.FFI()
CU = FFI.cdef('''
#include <stddef.h>
#include <stdint.h>
// Note that we declare a function type, not a pointer-to-function type
typedef void(decrypt_functy)(uint8_t*, const size_t, uint8_t);
''')
# Create the decrypt function object
decrypt = (CU.types.decrypt_functy)(BASE+DECRYPT_OFFSET)
# Call the decrypt function
key = 0xAA
buf = bytearray(c^key for c in b"Hello World!")
decrypt(buf, len(buf), key)
print(buf)
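The same idea (a function type plus a raw address) exists in ctypes, which makes for a useful comparison. The sketch below is self-contained: since mylib.so may not be at hand, a Python callback stands in for the native decrypt function, and its raw address plays the role of BASE+DECRYPT_OFFSET. The stand-in and the names used here are assumptions made for this example.

```python
import ctypes

# Function type matching: void(uint8_t*, size_t, uint8_t)
DECRYPT_FUNCTY = ctypes.CFUNCTYPE(None, ctypes.POINTER(ctypes.c_uint8),
                                  ctypes.c_size_t, ctypes.c_uint8)


@DECRYPT_FUNCTY
def stand_in_decrypt(buf, length, key):
    """Python stand-in for the native function: XOR the buffer in place."""
    for i in range(length):
        buf[i] ^= key


# Raw integer address of the callable, as we would get from BASE + offset
address = ctypes.cast(stand_in_decrypt, ctypes.c_void_p).value

# Instantiate the function type at that address, then call it
decrypt = DECRYPT_FUNCTY(address)

key = 0xAA
data = bytes(c ^ key for c in b"Hello World!")
buf = (ctypes.c_uint8 * len(data)).from_buffer_copy(data)
decrypt(buf, len(buf), key)
result = bytes(buf)
print(result)
```

The main difference with the DragonFFI version is that the function type is described here by hand with ctypes objects, whereas DragonFFI takes it from a C declaration.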
What about PIE binaries? (Linux only)
PIE binaries are relocatable (by design), and we can trick dlopen into opening them.
Please note that doing this isn’t fully supported by the Linux loader, and recent versions of glibc actually prevent us from doing so. That being said, the LIEF project documentation has a great tutorial that shows how to patch the binary to circumvent this limitation.
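To our knowledge, the glibc check relies on the DF_1_PIE bit stored in the DT_FLAGS_1 dynamic entry of the binary, and the patch boils down to clearing that bit. The sketch below only illustrates the flag arithmetic on a plain integer: the constants come from elf.h, the helper names are hypothetical, and reading and writing the dynamic entry itself is left to a tool such as LIEF.

```python
# Constants from elf.h
DT_FLAGS_1 = 0x6FFFFFFB  # tag of the dynamic entry holding the DF_1_* flags
DF_1_NOW = 0x00000001    # perform all relocations at load time
DF_1_PIE = 0x08000000    # the object is a position-independent executable


def is_marked_pie(flags_1):
    """Tell whether the DT_FLAGS_1 value marks the binary as a PIE."""
    return bool(flags_1 & DF_1_PIE)


def clear_pie_flag(flags_1):
    """Return the DT_FLAGS_1 value with the PIE bit removed."""
    return flags_1 & ~DF_1_PIE


flags = DF_1_NOW | DF_1_PIE
patched = clear_pie_flag(flags)
print(hex(flags), hex(patched))
```

Only the PIE bit is removed, so any other flag stored in the same entry is left as it was.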
Alternative method using LIEF
Another way to fix our original problem is to use LIEF to statically add exported symbols to our binary. This is covered in this tutorial.
Conclusion
There’s nothing fundamentally new in this post or, in other words, nothing that you weren’t already able to do directly in C(++), or using ctypes in Python. DragonFFI just makes this more straightforward and portable across OSX, Linux and Windows.
Acknowledgment
Thanks to serge-sans-paille, who took the time to review this post while cleaning his lovely Kig Ah Farz bags, and to toffan for various corrections.