Let's suppose I have a 32- or 64-bit ELF executable. This binary has been compiled with the PIC/PIE options.
That means all functions are mapped at a random address in memory.
What should I do if I need to do instrumentation or reverse engineering on this kind of binary?
Is there a way to hook the Linux binary loader so that it always gives the same addresses?
Thanks
That means all functions are mapped at a random address in memory.
No, it doesn't mean that.
With address randomization enabled, the PIE binary will be loaded at a random base address from run to run, but all functions and data will move together.
That is, if &foo == 0x12345600 and &bar == 0x12345700 in one execution, then the delta between them will always be 0x100 in subsequent executions (until the binary is relinked).
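You can see this for yourself with a tiny test program (the function names here are just for illustration); run it a few times with ASLR enabled and the absolute addresses change while the delta does not:

#include <stdio.h>
#include <stdint.h>

static void foo(void) {}
static void bar(void) {}

int main(void)
{
    /* With PIE + ASLR the absolute addresses change from run to run,
       but the distance between two functions stays constant. */
    printf("&foo = %p, &bar = %p, delta = %ld\n",
           (void *)&foo, (void *)&bar,
           (long)((uintptr_t)&bar - (uintptr_t)&foo));
    return 0;
}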
Is there a way to hook the Linux binary loader so that it always gives the same addresses?
There are several ways:
Address randomization can be globally disabled via echo 0 > /proc/sys/kernel/randomize_va_space
Use setarch ... -R a.out
Run the program under GDB, which disables randomization via the personality system call; a minimal sketch of that call is shown below.
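For completeness, the personality trick GDB uses can be reproduced in a small wrapper; a minimal sketch (Linux-specific, error handling kept to a minimum) that clears randomization for itself and then executes the target:

#include <sys/personality.h>
#include <unistd.h>
#include <stdio.h>

int main(int argc, char *argv[])
{
    if (argc < 2) {
        fprintf(stderr, "usage: %s program [args...]\n", argv[0]);
        return 1;
    }
    /* Query the current persona and add ADDR_NO_RANDOMIZE, as GDB does;
       the exec'd program inherits the setting. */
    personality(personality(0xffffffff) | ADDR_NO_RANDOMIZE);
    execvp(argv[1], &argv[1]);
    perror("execvp");
    return 1;
}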
I notice there is a macro uint4korr in the MySQL/MariaDB source code.
include/byte_order_generic.h
I only understand that this macro is related to byte order. I looked for comments about this macro but found nothing. I don't know the meaning of the suffix korr. What does the abbreviation stand for?
I want to know why the code is implemented like this. What are the effects on different platforms?
"korr" is an abbreviation for "Korrekt" of the phonic and meaning equivalent of the English word "Correct".
The purpose of the code is to provide a uniform byte order for the storage and communication components, so that storage files are portable between architectures of different endianness without conversion, and client/server communication doesn't need to know the endianness of the other side.
I believe that the related Swedish verb is korrigera, to correct. uint4korr() is kind of the opposite of ntohl(), because it will swap the bytes on a big-endian architecture and not on a little-endian one.
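The generic fallback in byte_order_generic.h is essentially a byte-by-byte little-endian load; a simplified stand-alone sketch (not the exact MySQL macro) looks like this:

#include <stdint.h>

/* Interpret 4 bytes stored in little-endian order as a uint32,
   regardless of the host's endianness. */
static inline uint32_t uint4korr_sketch(const unsigned char *p)
{
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}

On a little-endian host this boils down to a plain 32-bit load; on a big-endian host it effectively performs the byte swap, which is why it behaves as the opposite of ntohl().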
Somewhat related to this, the InnoDB storage engine stores its data in big-endian byte order, so that a simple memcmp() can be used for comparing keys. (It also inverts the sign bit of signed integers due to this.) The InnoDB function mach_read_from_4() is basically ntohl() combined with a 32-bit load via an unaligned pointer. Recent versions of GCC and clang impress me by translating that into the IA-32 or AMD64 instructions mov and bswap or simply movbe.
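For comparison, a mach_read_from_4-style read can be sketched like this (not InnoDB's actual code); the memcpy expresses the unaligned load, and recent GCC/clang fold the whole function into mov plus bswap, or movbe:

#include <stdint.h>
#include <string.h>

/* Big-endian (network order) 32-bit read from a possibly unaligned pointer. */
static inline uint32_t read_be32(const void *ptr)
{
    unsigned char b[4];
    memcpy(b, ptr, 4);              /* safe unaligned load */
    return ((uint32_t)b[0] << 24)
         | ((uint32_t)b[1] << 16)
         | ((uint32_t)b[2] << 8)
         |  (uint32_t)b[3];
}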
Since all the information I found about QEMU relates to Linux kernels, U-Boot, or ELF binaries, I can't quite figure out how to load a binary blob from an embedded device at a specific address and execute part of it. The code I want to run only does arithmetic, so there are no hardware dependencies involved.
I would start qemu with something like
qemu-arm -singlestep -g8000
attach GDB, set the initial register state, and jump to my starting address to single-step through it.
But how do I initially load binary data at a specific address and, if needed, set up an additional RAM range?
how to load a binary blob from an embedded device at a specific address and execute part of it.
You can load a binary blob into softmmu QEMU with the generic loader (-device loader).
I would start qemu with something like
qemu-arm -singlestep -g8000
This command line is for the linux-user QEMU invocation. It emulates a userspace Linux process of the guest architecture; it is unprivileged and does not provide support for any devices, including the generic loader. Try using qemu-system-arm instead.
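As an illustration only (the machine type, file name, and load address are placeholders you would adapt to your target), the softmmu invocation could look roughly like this; -S makes QEMU wait so you can attach GDB, set the registers, and single-step:

qemu-system-arm -M virt -S -gdb tcp::1234 -device loader,file=blob.bin,addr=0x40000000,force-raw=on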
It's in fact easy with the Unicorn framework, which works on top of QEMU. Based on the example in the docs section of its website, I wrote a Python script which loads the data, sets the registers, adds a hook that prints the important per-step information, and starts execution at the desired address, running until a target address is reached.
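That answer used a Python script; the same flow expressed with Unicorn's C API looks roughly like this (the code bytes, load address, and register values are placeholders):

#include <unicorn/unicorn.h>
#include <inttypes.h>
#include <stdio.h>

/* Hook that prints the program counter before every instruction. */
static void hook_code(uc_engine *uc, uint64_t addr, uint32_t size, void *user)
{
    (void)uc; (void)size; (void)user;
    printf("executing at 0x%" PRIx64 "\n", addr);
}

int main(void)
{
    const uint8_t code[] = { 0x01, 0x00, 0x81, 0xe2 };   /* add r0, r1, #1 */
    const uint64_t base = 0x8000;
    uint32_t r1 = 0x41, r0 = 0;
    uc_engine *uc;
    uc_hook hook;

    uc_open(UC_ARCH_ARM, UC_MODE_ARM, &uc);
    uc_mem_map(uc, base, 0x10000, UC_PROT_ALL);            /* set up a RAM range */
    uc_mem_write(uc, base, code, sizeof(code));            /* load the binary blob */
    uc_reg_write(uc, UC_ARM_REG_R1, &r1);                  /* initial register state */
    uc_hook_add(uc, &hook, UC_HOOK_CODE, hook_code, NULL,
                base, base + sizeof(code));                /* per-step trace */
    uc_emu_start(uc, base, base + sizeof(code), 0, 0);     /* run until target address */

    uc_reg_read(uc, UC_ARM_REG_R0, &r0);
    printf("r0 = 0x%x\n", r0);
    uc_close(uc);
    return 0;
}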
I'm using QEMU-4.1.0 aarch64 to emulate some multi-core systems. Is it possible to run different ELFs on different cores?
I am trying to use the QEMU-provided function arm_load_kernel (https://github.com/qemu/qemu/blob/master/hw/arm/boot.c, line 1275) during my board initialization, but I am not able to load different ELFs.
If you want to load more than one ELF file then you should look at the 'generic loader' documented in docs/generic-loader.txt. This also lets you specify which CPU, if any, should have its PC set to the entry point of the ELF file. Depending on the board, you might be able to load all the ELF files that way and not specify -kernel at all. The command line for it is '-device loader,[options...]'.
Note that if you are using a board model which starts with most of the CPUs in a 'power off' state (i.e. where the expectation is that the primary CPU will power the other CPUs on) then you'll need to have code to do that whether you have one ELF or several (or, if the board permits it, use suitable command line options to have all the CPUs start powered on).
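For example (the board, file names, and CPU numbers below are placeholders), loading a separate ELF per core could look something like:

qemu-system-aarch64 -M virt -cpu cortex-a53 -smp 2 -device loader,file=core0.elf,cpu-num=0 -device loader,file=core1.elf,cpu-num=1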
We write OpenCL C code, build it with clCreateProgramWithSource, and use clGetProgramInfo to get the binary. This binary is then integrated into the product binary, which uses clCreateProgramWithBinary when initializing.
We create a .h file and include it in the source file. The content of the .h file is the binary generated after compiling the OpenCL C kernel.
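For context, the flow described above is roughly the standard clGetProgramInfo / clCreateProgramWithBinary round trip; a sketch (single device assumed, error handling omitted, 'program' already built from source):

#include <CL/cl.h>
#include <stdlib.h>

/* Dump the device binary of an already-built 'program' and recreate
   a program from that binary, as done at product init time. */
static cl_program roundtrip_binary(cl_context context, cl_device_id device,
                                   cl_program program)
{
    size_t bin_size = 0;
    clGetProgramInfo(program, CL_PROGRAM_BINARY_SIZES,
                     sizeof(bin_size), &bin_size, NULL);

    unsigned char *bin = malloc(bin_size);
    unsigned char *bins[] = { bin };
    clGetProgramInfo(program, CL_PROGRAM_BINARIES,
                     sizeof(bins), bins, NULL);
    /* 'bin' (bin_size bytes) is what ends up in the generated .h file. */

    cl_int bin_status, err;
    cl_program prog2 = clCreateProgramWithBinary(context, 1, &device,
                                                 &bin_size,
                                                 (const unsigned char **)bins,
                                                 &bin_status, &err);
    clBuildProgram(prog2, 1, &device, NULL, NULL, NULL);
    free(bin);
    return prog2;
}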
The issue with the above approach is that the compatibility of the binary is expected to break with any minor/major change in OpenCL, and it will most likely break across vendors. We would need to generate the OpenCL kernel binary for each vendor and each OpenCL release.
Since the OpenCL kernel binary is integrated into the project in header form, if the binary turns out to be incompatible we are not in a position to replace it, and in such cases the project initialization fails.
Expected Solution
The OpenCL C source is proprietary to the company and cannot be shared with the customers.
Since the OpenCL kernel binary is integrated with the project library, we need to understand whether it is possible to generate a binary which can re-organize itself during clCreateProgramWithBinary to fit the target platform.
If it is absolutely necessary to generate the binary once for each vendor and OpenCL minor/major revision and store it to disk (which will be done on the end user's machine), how can we protect the source, which is proprietary to the company (is SPIR the only option)?
I have already read Universal binaries for OpenCL, but it suggests that SPIR also takes a long time to compile, and hence it might not be the solution I am looking for, since init time is also important.
In practice the Intel Gen binary format can change on driver changes for the same platform/hardware (e.g. for bug fix workarounds and performance improvements). Hence, the bits returned by clGetProgramInfo are only sure to work in clCreateProgramWithBinary on the same device x driver x etc... Sadly, this means that the binary path is a poor match for the intellectual property security problem.
SPIR sort of splits the difference as it would be hardware independent while still being harder to reverse engineer. If startup performance is somehow important, you can always try the clCreateProgramWithBinary path; just be able to fall back to SPIR should the binary load fail (meaning the driver changed or something).
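A rough sketch of that fallback (build_from_source is a hypothetical helper that would recompile from source or SPIR; the context, device, and cached bytes are assumed to exist):

#include <CL/cl.h>

/* Hypothetical helper, implemented elsewhere: rebuild from source or SPIR. */
cl_program build_from_source(cl_context context, cl_device_id device);

/* Prefer the cached binary; fall back if the driver rejects it
   (e.g. after a driver update or on different hardware). */
static cl_program load_program(cl_context context, cl_device_id device,
                               const unsigned char *cached_binary,
                               size_t cached_size)
{
    cl_int bin_status = CL_SUCCESS, err = CL_SUCCESS;
    cl_program prog = clCreateProgramWithBinary(context, 1, &device,
                                                &cached_size, &cached_binary,
                                                &bin_status, &err);
    if (err == CL_SUCCESS && bin_status == CL_SUCCESS &&
        clBuildProgram(prog, 1, &device, NULL, NULL, NULL) == CL_SUCCESS)
        return prog;                            /* fast path: binary still valid */

    return build_from_source(context, device);  /* slow path: recompile */
}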
What does an executable actually contain? Does it contain instructions to the processor in the form of opcodes and operands? If so, why do we have different executables for different operating systems?
Processors understand programs in terms of opcodes, so your intuition about executables containing opcodes is correct, and you guessed correctly that any executable has to have opcodes and operands for executing the program on a processor.
However, programs mostly execute with the help of operating systems (you could write programs which do not use an OS to execute, but that would be a lot of unnecessary work), which provide abstractions on top of the hardware that programs can use. The OS is responsible for setting up a "context" for any program to run, i.e. providing the program the memory it needs and providing general-purpose libraries which the program can use for common tasks such as writing to files, printing to the console, etc.
However, to set up the context for the program (provide it memory, load its data, set up a stack for it), the OS needs to read a program's executable file and needs to know a few things about the program such as the data which the program expects to use, size of that data, the initial values stored in that data region, the list of opcodes that make up the program (also called the text region of a process), their size etc. All of this data and a lot more (debugging information, readonly data such as hardcoded strings in the program, symbol tables etc) is stored within the executable file. Each OS understands a different format of this executable file, since they expect all this info to be stored in the executable in different ways. Check out the links provided by Groo.
A couple of formats that have been used for storing information in an executable file are ELF and COFF on UNIX systems and PE on Windows.
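As a small illustration of the kind of metadata such a format carries, here is a minimal sketch that dumps a few fields of a 64-bit ELF header on Linux (this is part of the information the loader reads before it can set up the process):

#include <elf.h>
#include <stdio.h>

int main(int argc, char *argv[])
{
    if (argc < 2) return 1;

    FILE *f = fopen(argv[1], "rb");
    if (!f) return 1;

    Elf64_Ehdr eh;
    if (fread(&eh, sizeof(eh), 1, f) != 1) { fclose(f); return 1; }
    fclose(f);

    /* Magic bytes, file type (executable, shared object, ...), target
       machine, entry point, and where the program/section header
       tables live inside the file. */
    printf("magic      : %02x %c%c%c\n",
           eh.e_ident[0], eh.e_ident[1], eh.e_ident[2], eh.e_ident[3]);
    printf("type       : %u\n", (unsigned)eh.e_type);
    printf("machine    : %u\n", (unsigned)eh.e_machine);
    printf("entry point: %#lx\n", (unsigned long)eh.e_entry);
    printf("prog hdrs  : %u at offset %#lx\n",
           (unsigned)eh.e_phnum, (unsigned long)eh.e_phoff);
    printf("sect hdrs  : %u at offset %#lx\n",
           (unsigned)eh.e_shnum, (unsigned long)eh.e_shoff);
    return 0;
}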
P.S. - Not all programs need executable formats. Look up bootloaders on Google. These are special programs which occupy the first sector of a bootable partition on the hard-disk and are used to load the OS itself.
Yes, code in the form of opcodes and operands, and data of course. Anything you want to do that involves the operating system in any way depends on the operating system, not on the CPU. That is why you need different programs for different operating systems. Opening a window in Windows is not done with the same sequence of instructions as in Linux, and so on.
As unwind implied in his answer, an executable file contains calls to routines in the Operating System.
It would be extremely inefficient for an executable file to try to implement functions already provided by the OS (for example, writing to disk, accepting input) so heavy use is made of calls to the OS functions.
Different Operating Systems provide functions which do similar things, but the details of how to call those functions (and where they are) may be different.
So, apart from the major differences of processor type, executables written for one OS won't work with another.
To do any form of IO, an executable needs to interface with the operating system using syscalls. On Windows these are calls to the Win32 API, and on Linux/Unix these are mostly POSIX calls.
Furthermore, the executable file format differs with the OS the same way a PNG file differs from a GIF file: the data is ordered differently and there are different headers and sub-headers.
An executable file contains several blobs of data and instructions on how that data should be loaded into memory. Some of these sections happen to contain machine code that can be executed. Other sections contain program data, resources, relocation information, import information, etc.