Adding a function to a Linux binary - binary

As part of a homework assignment for my security class, I'm supposed to "add a simple function which prints "Hello, World!" to a compiled C Linux binary". The binary provided is just a compiled main function with 10 NOPs in it.
Normally, I would have written the code needed into the NOP section directly, but we were explicitly told to add a new function to the program.
I have no idea how to do that. I tried putting some code at the end of the binary, but this seems to destroy it. Could somebody clear this up for me?
EDIT: This question sounds somewhat similar.
EDIT 2: Searching for "ELF injection" yields many interesting results.

Perhaps you need to learn more about ELF (especially if you want to make a program able to "infect" any Linux binary, not just the simple one you've got).
The Linux x86-64 ABI supplement could also be useful, as could the Linux Assembly HOWTO.
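To see why the file layout matters, it can help to look at the ELF header itself. It records the entry point and the file offsets of the program header and section header tables, so inserting bytes in the middle of the file shifts everything those offsets point at, and new code only gets mapped at run time if it lies inside a loadable segment. Below is a minimal sketch, assuming a 64-bit little-endian ELF file; the names parse_elf64_header and Elf64Header are made up for this illustration.

```python
import struct
from collections import namedtuple

# Layout of the 64-byte ELF64 file header (little-endian).
ELF64_HEADER = struct.Struct("<16sHHIQQQIHHHHHH")

# Field names follow the usual e_* naming from the ELF specification.
Elf64Header = namedtuple(
    "Elf64Header",
    "e_ident e_type e_machine e_version e_entry e_phoff e_shoff e_flags "
    "e_ehsize e_phentsize e_phnum e_shentsize e_shnum e_shstrndx",
)


def parse_elf64_header(data):
    """Parse the first 64 bytes of a 64-bit little-endian ELF image."""
    if len(data) < ELF64_HEADER.size:
        raise ValueError("too short to contain an ELF64 header")
    header = Elf64Header._make(ELF64_HEADER.unpack_from(data))
    if header.e_ident[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    # e_ident[4] == 2 means 64-bit, e_ident[5] == 1 means little-endian.
    if header.e_ident[4] != 2 or header.e_ident[5] != 1:
        raise ValueError("this sketch only handles 64-bit little-endian ELF")
    return header
```

To try it on a real binary, read the file in binary mode (open(path, "rb").read()) and pass the bytes in; e_entry is the address where execution starts, and e_phoff/e_phnum locate the segments that actually get loaded.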

You could end the main function early and start the new function inside the NOP area. Before main returns, have it call the new function, which then sits right after the end of main.

passing c++ variables to python via gdb

I am developing/debugging C++ code which makes extensive use of C++ STL vectors and blitz cpp arrays
(the vectors/arrays are multidimensional, up to 4D/5D).
I am currently using cout/print to log the inputs and outputs of functions, but it is getting very tedious. Can you suggest any options for printing the vectors/arrays while debugging?
I thought of a couple of options:
(a) Write template functions in C++ to print them and use GDB's "call" feature. However, I am unable to use "call" with C++ template functions, although it works for normal functions.
(b) Is it possible to pass C++ variables to the Python interface of GDB and print them? Are there any examples of this?
I googled before posting this question, but did not find any useful thread.
Any help is highly appreciated (even if some links can be provided)
Thanks a lot in advance !
Writing code in C++ to print the array and call it from gdb is certainly an option, but it might be unreliable because the print function you write might not be accessible (the linker might have dropped it because it was not used in your c++ code, for instance). Also, remember that templates are just "recipes" and you actually need to use them in order for the compiler to generate a class/function from it.
Is it possible to pass c++ variables to python interface of GDB and print them ? any examples for the same ?
A simple answer to this is "yes". You can use the parse_and_eval function in the gdb module when you use gdb's python API. Something such as
py print(gdb.parse_and_eval('your_variable'))
would print the value of a variable called your_variable using gdb's python API. But just that would be the same as just p your_variable in gdb's regular prompt without using the python API. The real power comes when you use gdb's python API to write pretty-printers for the types you want to debug.
A pretty-printer is basically just some code that you or someone else wrote to tell gdb how to print some type in a nice way. With a pretty-printer for a type just p your_variable in gdb's prompt prints the variable in the nice way defined by your pretty-printer.
I couldn't find a pretty-printer for blitz with a quick google search and I haven't used blitz before. However, I have used another library for vectors and matrices in scientific computing called armadillo and thus faced similar problems. I have thus written some pretty printers for armadillo here that might help you in case you decide to write pretty printers for blitz.
As an illustration, below you can see how the arma::mat (a matrix of doubles) type from armadillo is printed in gdb without a pretty printer (the m1 variable, which is a 6x3 matrix of doubles)
Notice that we can't even see the matrix elements. They are stored in a contiguous memory region pointed to by the mem attribute of the arma::mat object.
Now the same matrix with the pretty printer available here.
That makes debugging code a lot easier.
Note: You can also write pretty printers in the guile language, but I bet python is a much more common choice.
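As a rough idea of what such a pretty-printer looks like, here is a minimal sketch. It assumes a hypothetical C++ type `struct Grid2D { int rows; int cols; double *data; };` with row-major storage (this type is invented for the illustration and is not from blitz or armadillo); the names Grid2DPrinter and lookup_grid2d are likewise made up. The gdb import is guarded so the file can also be loaded outside gdb.

```python
class Grid2DPrinter:
    """Pretty-printer for the hypothetical Grid2D struct (rows, cols, data)."""

    def __init__(self, val):
        self.val = val

    def to_string(self):
        # gdb calls to_string() to get the text shown by "p variable".
        rows = int(self.val["rows"])
        cols = int(self.val["cols"])
        data = self.val["data"]  # pointer to rows*cols doubles, row-major
        lines = ["Grid2D %dx%d" % (rows, cols)]
        for r in range(rows):
            row = [float(data[r * cols + c]) for c in range(cols)]
            lines.append("  " + "  ".join("%g" % x for x in row))
        return "\n".join(lines)


def lookup_grid2d(val):
    """Return a printer for Grid2D values, or None for any other type."""
    if val.type.strip_typedefs().tag == "Grid2D":
        return Grid2DPrinter(val)
    return None


try:
    import gdb  # only available when running inside gdb
    gdb.pretty_printers.append(lookup_grid2d)
except ImportError:
    pass
```

If this were saved as, say, grid2d_printer.py, running `source grid2d_printer.py` at the gdb prompt should register it, after which `p my_grid` would use to_string() for any Grid2D variable. A printer for a real library type would follow the same shape, with the field names taken from that library's class layout.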

How to find dependent functions in octave

I would like to identify all functions needed to run a specific function in octave. I need this to deploy an application written in Octave.
While Matlab offers some tools to analyse a function on its dependencies, I could not find something similar for Octave.
Trying inmem as recommended in matlab does not produce the expected result:
> inmem
warning: the 'inmem' function is not yet implemented in Octave
Is there any other solution to this problem available?
First, let me point out that from your description, the matlab tool you're after is not inmem, but deprpt.
Secondly, while octave does not have a built-in tool for this, there are a number of ways to do so yourself. I have not tried these personally, so your mileage may vary.
1) Run your function while using the profiler, then inspect the functions used during the running process. As suggested in the octave archives: https://lists.gnu.org/archive/html/help-octave/2015-10/msg00135.html
2) There are some external tools on github that attempt just this, e.g. :
https://git.osuv.de/m/about
https://github.com/KaeroDot/mDepGen
3) If I had to attack this myself, I would approach the problem as follows:
Parse and tokenise the m-file in question. (possibly also use binary checks like isvarname to further filter useless tokens before moving to the next step.)
For each token x, wrap a "help(x)" call in a try / catch block
Inspect the error, this will be one of:
"Invalid input" (i.e. token was not a function)
"Not found" (i.e. not a valid identifier etc)
"Not documented" (function exists but has no help string)
No error, in which case you stumbled upon a valid function call within the file
To further check if these are builtin functions or part of a loaded package, you could further parse the first line of the "help" output, which typically tells you where this function came from.
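The tokenising step of approach 3 can be sketched outside Octave as well. The following is a rough illustration in Python, not a real Octave parser: it strips string literals and comments line by line and collects the remaining identifiers, which would then be the candidates to pass to help(x) inside Octave. The function name candidate_names and the keyword list are assumptions made for this sketch, and the quote handling will misfire on the transpose operator (a').

```python
import re

# Octave keywords that can never be function names (list is not exhaustive).
OCTAVE_KEYWORDS = {
    "if", "else", "elseif", "end", "endif", "for", "endfor", "while",
    "endwhile", "function", "endfunction", "return", "switch", "case",
    "otherwise", "do", "until", "break", "continue", "try", "catch",
    "end_try_catch", "global", "persistent", "unwind_protect",
    "unwind_protect_cleanup", "end_unwind_protect",
}


def candidate_names(source):
    """Return the sorted identifiers in m-file text that might be functions."""
    names = set()
    for line in source.splitlines():
        # Remove string literals first so '%' or '#' inside them is ignored.
        line = re.sub(r'"[^"]*"', " ", line)
        line = re.sub(r"'[^']*'", " ", line)
        # Drop everything after a comment character.
        line = re.split(r"[%#]", line, maxsplit=1)[0]
        for token in re.findall(r"\b[A-Za-z_]\w*", line):
            if token not in OCTAVE_KEYWORDS:
                names.add(token)
    return sorted(names)
```

The result still mixes variables with functions, which is exactly what the help(x) check in the following steps is meant to sort out.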
If the context for this is that you're trying to check if a matlab script will work on octave, one complication will be that typically packages that will be required on octave are not present in matlab code. Then again, if this is your goal, you should probably be using deprpt from matlab directly instead.
Good luck.
PS. I might add that the above is for creating a general tool. In terms of identifying dependencies in your own code, good software engineering practices go a long way towards providing maintainable code and easily resolving dependency problems for your users. E.g.:
- clearly identify required packages (which, unlike matlab, octave does anyway by requiring such packages to be visibly loaded in code)
- similarly, for custom dependencies, consider wrapping and providing these as packages / namespaces, rather than scattered files
- if packaging dependencies isn't possible, create tests / checks in your file that throw errors if necessary files are missing, or at least mention such dependencies in comments in the file itself
According to Octave Compatibility FAQ here,
Q. inmem
A. who -functions
So you can use who -functions. (Note: I have not tried this yet.)

ida pro virtual function actual declaration

I am trying to hack into an exe to find the implementation of certain functions and perform actions in an injected dll.
The exe is a sort of screen saver, and fortunately very simple, so it uses important strings to identify code sections.
My dilemma is that one of the functions, 'getaxis', is a virtual function. I know this thanks to the strings window and other telltale info on the string (Audioplayer.Tracklist::GetAxis) that I traced back to an rdata section
.data:01E204B0 off_1E204B0 dd offset aAudioplayer_to ; DATA XREF: _call_vfuncr
.data:01E204B0 ; _call_vfunc+26r
.data:01E204B0 ; "Audioplayer.Tracklist::Internal_GetTrack"
...
Using IDA Pro I have successfully traced the function call itself, but I am unable to find the actual virtual function implementation.
So here's my problem:
1- I am a newbie at disassembling: is there a way to actually find a virtual function's implementation?
IDA clearly shows the various 'subs' in the "functions window", but there are a couple hundred functions in there, and I'm hoping there's a better way of finding a virtual function's declaration.
2- Is there any association between a sub name and a virtual function? I was unable to find any.
So how can I find the actual virtual function declaration in disassembled code?
Is this possible at all?
Thanks.
Do you know what compiler/language was used to generate this program? I'm only familiar with how most C++ implementations generate vftables. "Knowing your enemy" is key to reverse engineering.
From the looks of it, those '_call_vfunc' functions may be some sort of implementation detail of some other language's compiler (say some random Pascal compiler, or whatever) which may have a need for retaining such metadata. call_vfunc may throw an error when a pure/nullptr entry is used in whatever they used for a vftable, hence the use of/reference to that string.
If call_vfunc is used to perform ALL virtual function calls, you could hook said function and log the vftable addresses it ends up using. Unless this is an overly complex screen saver, there shouldn't be too many vftables that are used. If IDAPython has any debugger APIs, you could possibly do all the logging via a Python script while debugging with IDA.

Building GPL C program with CUDA module

I am attempting to modify a GPL program written in C. My goal is to replace one method with a CUDA implementation, which means I need to compile with nvcc instead of gcc. I need help building the project - not implementing it (You don't need to know anything about CUDA C to help, I don't think).
This is my first time trying to change a C project of moderate complexity that involves a .configure and Makefile. Honestly, this is my first time doing anything in C in a long time, including anything involving gcc or g++, so I'm pretty lost.
I'm not super interested in learning configure and Makefiles - this is more of an experiment. I would like to see if the project implementation goes well before spending time creating a proper build script. (Not unwilling to learn as necessary, just trying to give an idea of the scope).
With that said, what are my options for building this project? I have a myriad of questions...
I tried adding "CC=nvcc" to the configure.in file after AC_PROG_CC. This appeared to work - output from running configure and make showed nvcc as the compiler. However make failed to compile the source file with the CUDA kernel, not recognizing the CUDA specific syntax. I don't know why, was hoping this would just work.
Is it possible to compile a source file with nvcc, and then include it at the linking step in the make process for the main program? If so, how? (This question might not make sense - I'm really rusty at this)
What's the correct way to do this?
Is there a quick and dirty way I could use for testing purposes?
Is there some secret tool everyone uses to setup and understand these configure and Makefiles? This is even worse than the Apache Ant scripts I'm used to (Yeah, I'm out of my realm)
You don't need to compile everything with nvcc. Your guess that you can just compile your CUDA code with nvcc and leave everything else (except linking) unchanged is correct. Here's the approach I would use to start.
Add one new header (e.g. myCudaImplementation.h) and one new source file (with .cu extension, e.g. myCudaImplementation.cu). The source file contains your kernel implementation as well as a (host) C wrapper function that invokes the kernel with the appropriate execution configuration (aka <<<>>>) and arguments. The header file contains the prototype for the C wrapper function. Let's call that wrapper function runCudaImplementation(). Because nvcc compiles .cu files as C++, declare the wrapper as extern "C" on the .cu side (guarded by #ifdef __cplusplus in the header) so that your C code can link against it.
I would also provide another host C function in the source file (with prototype in the header) that queries and configures the GPU devices present and returns true if it is successful, false if not. Let's call this function configureCudaDevice().
Now in your original C code, where you would normally call your CPU implementation you can do this.
// must include your new header (and <stdbool.h> for bool in C99 and later)
#include <stdbool.h>
#include "myCudaImplementation.h"
// at app initialization
// store this variable somewhere you can access it later
bool deviceConfigured = configureCudaDevice();
...
// then later, at run time
if (deviceConfigured)
    runCudaImplementation();
else
    runCpuImplementation(); // run the original code
Now, since you put all your CUDA code in a new .cu file, you only have to compile that file with nvcc. Everything else stays the same, except that you have to link in the object file that nvcc outputs. e.g.
nvcc -c -o myCudaImplementation.o myCudaImplementation.cu <other necessary arguments>
Then add myCudaImplementation.o to your link line, along with the CUDA runtime library (something like:)
g++ -o myApp <your other object files> myCudaImplementation.o -L<CUDA lib directory> -lcudart
Now, if you have a complex app to work with that uses configure and has a complex makefile already, it may be more involved than the above, but this is the general approach. Bottom line is you don't want to compile all of your source files with nvcc, just the .cu ones. Use your host compiler for everything else.
I'm not expert with configure so can't really help there. You may be able to run configure to generate a makefile, and then edit that makefile -- it won't be a general solution, but it will get you started.
Note that in some cases you may also need to separate compilation of your .cu files from linking them. In this case you need to use NVCC's separate compilation and linking functionality, for which this blog post might be helpful.

how to create applications with Clozure Common Lisp (on Microsoft Windows)

I am new to Common Lisp (using Clozure Common Lisp under Microsoft Windows), though I am familiar with C and Python. So maybe the questions here are stupid, but please be patient and give me some help.
1) What is the usual way to run a Common Lisp script?
Right now, I have a bat file under Windows that calls the CCL exe (wx86cl.exe) and evaluates (progn (load "my_script_full_path") (ccl:quit)) every time I want to "run" my script. Is this a standard way to "run" a script in Common Lisp?
Any other suggestions about this?
2) What's the difference between (require 'cxml) and (asdf:operate 'asdf:load-op :cxml)?
They seem to do the same thing for my script; which one should I use?
3) ignore it, not a clear question
4) When I load a library (such as with (require 'cxml)), it takes 3s or even 5s every time I "run" my script, and a lot of log output goes to standard output (shown below); it seems to be checking something internal. Does this mean I have to spend 3-5s loading cxml every time I want to run a simple test? That seems a little inefficient, and the output is noisy. Any suggestions?
My Script
(require 'cxml) (some-code-using-cxml)
And the output
; Loading system definition from D:/_play_/lispbox-0.7/quicklisp/dists/quicklisp/software/cxml-20101107-git/cxml.asd into #<Package "ASDF0">
;;; Checking for wide character support... yes, using code points.
; Registering #<SYSTEM "cxml-xml">
......
some my script output
---EDIT TO ADD MORE----
5) I must say that I had almost forgotten about dumping an image to speed up the loading of Lisp libraries. So, what is the normal process for developing a (maybe very simple) Lisp script?
Based on the answers I have so far, I guess it may be:
a) edit your script
b) test it in a REPL environment (SLIME is a really good choice), looping many times between a <==> b
c) dump the image to distribute it? (I am not sure about this)
6) Furthermore, what is the common way/form to release/distribute the final program?
For a Lisp library, we just release our source code and let others "load/require" it.
For a Lisp program, we dump an image and distribute it once we have confirmed that all functions work.
Am I right?
What form is used in a real product? Do we always dump everything into an image at the end to speed up loading?
1) Yes, the normal way to run a whole programme is to use a launcher script. However, windows has much, much better scripting support these days than just the bat interpreter. Windows Scripting Host and PowerShell ship as standard.
1a) During development, it is usual to simply type things into the REPL (Read-Eval-Print-Loop, i.e. the lisp command line), or to use something like SLIME (for emacs or xemacs) as a development environment. If you don't know what they are, look them up. You may wish to use Cygwin to install xemacs, which will give you access to a range of linux-ish tools.
2) Require is, IIRC, a part of the standard. ASDF is technically not, it is a library that operates to make libraries work more conveniently. ASDF has a bunch of features that you will eventually want if you really get into writing large Lisp programmes.
3) Question unclear, pass.
4) See 1a) - do your tests and modifications in a running instance, thus avoiding the need to load the library more than once (just as you would in Python - you found the python repl, right?). In addition, when your programme is complete, you can probably dump an image which has all of your libraries pre-loaded.
Edit: additional answers:
5) Yes
6) Once you have dumped the image, you will still need to distribute the lisp binary to load the memory image. To make this transparent to the user, you will also have to have a loader script (or binary) to run the lisp binary with the image.
You don't have to start the lisp from scratch and load everything over again each time you want to run a simple test. For more efficient development, interactively evaluate code in the listener (REPL) of a running lisp environment.
For distribution, I use Zachary Beane's Buildapp tool. Very easy to install and use.
Regarding distribution -
I wrote a routine (it's at home and unavailable at the moment) that will write out the current image as a standard executable and quit. It works for both CLISP and SBCL.
I can rummage it up if you like.