I understand how things like numbers and letters are encoded in binary, and thus can be stored as 0's and 1's.
But how are functions stored in memory? I don't see how they could be stored as 0's and 1's, and I don't see how something could be stored in memory as anything besides 0's and 1's.
They are in fact stored into memory as 0's and 1's
Here is a real world example:
int func(int a, int b) {
return (a + b);
}
Here is an example of 32-bit x86 machine instructions that a compiler might generate for the function (in a text representation known as assembly code):
func:
push ebp
mov ebp, esp
mov edx, [ebp+8]
mov eax, [ebp+12]
add eax, edx
pop ebp
ret
Going into how each of these instructions work is beyond the scope of this question, but each one of these symbols (such as add, pop, mov, etc) and their parameters are encoded into 1's and 0's. This table shows many of the Intel instructions and a summary of how they are encoded. See also the x86 tag wiki for links to docs/guides/manuals.
So how does one go about converting code from text assembly into machine-readable bytes (aka machine code)? Take for example, the instruction add eax, edx. This page shows how the add instruction is encoded. eax and edx are something called registers, spots in the processor used to hold information for processing. Variables in computer programming will often map to registers at some point. Because we are adding registers and the registers are 32-bit, we select the opcode 000000001 (see also Intel's official instruction-set reference manual entry for ADD, which lists all the forms available).
The next step is for specifying the operands. This section of the same previous page shows how this is done with the example "add ecx, eax" which is very similar to our own. The first two bits have to be '11' to show we are adding registers. The next 3 bits specifies the first register, in our case we pick edx rather than the eax in their example, which leaves us with '100'. The next 3 bits specifies our eax, so we have a final result of
00000001 11100000
Which is 01 D0 in hexadecimal. A similar process can be applied to converting any instruction into binary. The tool used to do this automatically is called an assembler.
So, running the above assembly code through an assembler produces the following output:
66 55 66 89 E5 66 67 8B 55 O8 66 67 8B 45 0C 66 01 D0 66 5D C3
Note the 01 D0 near the end of the string, this is our "add" instruction. Converting machine-code bytes back into text assembly-language mnemonics is called disassembling:
address | machine code | disassembly
0: 55 push ebp
1: 89 e5 mov ebp, esp
3: 8b 55 08 mov edx, [ebp+0x8]
6: 8b 45 0c mov eax, [ebp+0xc]
9: 01 d0 add eax, edx
b: 5d pop ebp
c: c3 ret
Addresses start at zero because this is only a .o, not a linked binary. So they're just relative to the start of the file's .text section.
You can see this for any function you like on the Godbolt Compiler Explorer (or on your own machine on any binary, freshly-compiled or not, using a disassembler).
You may notice there is no mention of the name "func" in the final output. This is because in machine code, a function is referenced by its location in RAM, not its name. The compiler-output object file may have a func entry in its symbol table referring to this block of machine code, but the symbol table is read by software, not something the CPU hardware can decode and run directly. The bit-patterns of the machine code are seen and decoded directly by transistors in the CPU.
Sometimes it is hard for us to understand how computers encode instructions like this at a low level because as programmers or power users, we have tools to avoid ever dealing with them directly. We rely on compilers, assemblers, and interpreters to do the work for us. Nonetheless, anything a computer ever does must eventually be specified in machine code.
Functions are made of instructions, such as bytecode or machine code. Instructions are numbers, which can be encoded in binary.
A good introduction to this is Charles Petzold's book Code.
I will explain how functions are stored in the easiest way possible. You will be surprised by the amazing simplicity of all this at the end of this contribution. This is the most fundamental explanation and any type of computer will work in somehow the same way.
The only part of a computer that can perform any operations on data ie addition, subtraction, multiplication and division. Every data manipulation(any sort of Math or just any formula) in human existence is made up of these operations.
Now let us look at the basic structure of an instruction in binary. If we are working on a 32 bit machine, an instruction will take the form of:
1 001 32-bit address 32-bit adress
1(if this bit is one then the instruction is diverted to logic unit for calculation and if zero, we are basically moving the data between the two memory addresses following this) 001(these 3 bits determine if we are adding(001), subtracting(010), multiplying(011), or dividing(100) in this instruction cycle) (32-bit memory address of first memory location) (32-bit memory adress of second memory adress)
A function is basically a string of instructions of how to manipulate data in defined memory locations.
Let us take a random function that adds a number, then multiplies. It’s string of instructions will be:
(let me use MA to mean memory adress)
1 001 MAone MAtwo (adds value in MAone to value in MAtwo and stores resultant in MAone )
1 011 MAtwo MAthree (multiplies value in MAtwo with value in MAthree and stores resultant in MAthree )
Returns value in MAthree
So the ony difference in how functions are stored is that they are stored with 1 in the left most bit so that the CPU knows its a function that needs logical operations and diverts it to the ALU
Related
I am reading the Programming From Ground Up book. I see two different examples of how the base pointer %ebp is created from the current stack position %esp.
In one case, it is done before the local variables.
_start:
# INITIALIZE PROGRAM
subl $ST_SIZE_RESERVE, %esp # Allocate space for pointers on the
# stack (file descriptors in this
# case)
movl %esp, %ebp
The _start however is not like other functions, it is the entry point of the program.
In another case it is done after.
power:
pushl %ebp # Save old base pointer
movl %esp, %ebp # Make stack pointer the base pointer
subl $4, %esp # Get room for our local storage
So my question is, do we first reserve space for local variables in the stack and create the base pointer or first create the base pointer and then reserve space for local variables?
Wouldn't both just work even if I mix them up in different functions of a program? One function does it before, the other does it after etc. Does C have a specific convention when it creates the machine code?
My reasoning is that all the code in a function would be relative to the base pointer, so as long as that function follows the convention according to which it created a reference of the stack, it just works?
Few related links for those are interested:
Function Prologue
In your first case you don't care about preservation - this is the entry point. You are trashing %ebp when you exit the program - who cares about the state of the registers? It doesn't matter any more as your application has ended. But in a function, when you return from that function the caller certainly doesn't want %ebp trashed. Now can you modify %esp first then save %ebp then use %ebp? Sure, so long as you unwind the same way on the other end of the function, you may not need to have a frame pointer at all, often that is just a personal choice.
You just need a relative picture of the world. A frame pointer is usually just there to make the compiler author's job easier, actually it is usually there just to waste a register for many instruction sets. Perhaps because some teacher or textbook taught it that way, and nobody asked why.
For coding sanity, the compiler author's sanity etc, it is desirable if you need to use the stack to have a base address from which to offset into your portion of the stack, FOR THE DURATION of the function. Or at least after the setup and before the cleanup. This can be the stack pointer(sp) itself or it can be a frame pointer, sometimes it is obvious from the instruction set. Some have a stack that grows down (in address space toward zero) and the stack pointer can only have positive offsets in sp based address (sane) or some negative only (insane) (unlikely but lets say its there). So you may want a general purpose register. Maybe there are some you cant use the sp in addressing at all and you have to use a general purpose register.
Bottom line, for sanity you want a reference point to offset items in the stack, the more painful way but uses less memory would be to add and remove things as you go:
x is at sp+4
push a
push b
do stuff
x is at sp+12
pop b
x is at sp+8
call something
pop a
x is at sp+4
do stuff
More work but can make a program (compiler) that keeps track and is less error prone than a human by hand, but when debugging the compiler output (a human) it is harder to follow and keep track. So generally we burn the stack space and have one reference point. A frame pointer can be used to separate the incoming parameters and the local variables using base pointer(bp) for example as a static base address within the function and sp as the base address for local variables (athough sp could be used for everything if the instruction set provides that much of an offset). So by pushing bp then modifying sp you are creating this two base address situation, sp can move around perhaps for local stuff (although not usually sane) and bp can be used as a static place to grab parameters if this is a calling convention that dictates all parameters are on the stack (generally when you dont have a lot of general purpose registers) sometimes you see the parameters are copied to local allocation on the stack for later use, but if you have enough registers you may see that instead a register is saved on the stack and used in the function instead of needing to access the stack using a base address and offset.
unsigned int more_fun ( unsigned int x );
unsigned int fun ( unsigned int x )
{
unsigned int y;
y = x;
return(more_fun(x+1)+y);
}
00000000 <fun>:
0: e92d4010 push {r4, lr}
4: e1a04000 mov r4, r0
8: e2800001 add r0, r0, #1
c: ebfffffe bl 0 <more_fun>
10: e0800004 add r0, r0, r4
14: e8bd4010 pop {r4, lr}
18: e12fff1e bx lr
Do not take what you see in a text book, white board (or on answers in StackOverflow) as gospel. Think through the problem, and through alternatives.
Are the alternatives functionally broken?
Are they functionally correct?
Are there disadvantages like readability?
Performance?
Is the performance hit universal or does it depend on just how
slow/fast the memory is?
Do the alternatives generate more code which is a performance hit but
maybe that code is pipelined vs random memory accesses?
If I don't use a frame pointer does the architecture let me regain
that register for general purpose use?
In the first example bp is being trashed, that is bad in general but this is the entry point to the program, there is no need to preserve bp (unless the operating system dictates).
In a function though, based on the calling convention one assumes that bpis used by the caller and must be preserved, so you have to save it on the stack to use it. In this case it appears to want to be used to access parameters passed in by the caller on the stack, then sp is moved to make room for (and possibly access but not necessarily required if bp can be used) local variables.
If you were to modify sp first then push bp, you would basically have two pointers one push width away from each other, does that make much sense? Does it make sense to have two frame pointers anyway and if so does it make sense to have them almost the same address?
By pushing bp first and if the calling convention pushes the first paramemter last then as a compiler author you can make bp+N always or ideally always point at the first parameter for a fixed value N likewise bp+M always points at the second. A bit lazy to me, but if the register is there to be burned then burn it...
In one case, it is done before the local variables.
_start is not a function. It's your entry point. There's no return address, and no caller's value of %ebp to save.
The i386 System V ABI doc suggests (in section 2.3.1 Initial Stack and Register State) that you might want to zero %ebp to mark the deepest stack frame. (i.e. before your first call instruction, so the linked list of saved ebp values has a NULL terminator when that first function pushes the zeroed ebp. See below).
Does C have a specific convention when it creates the machine code?
No, unlike in some other x86 systems, the i386 System V ABI doesn't require much about your stack-frame layout. (Linux uses the System V ABI / calling convention, and the book you're using (PGU) is for Linux.)
In some calling conventions, setting up ebp is not optional, and the function entry sequence has to push ebp just below the return address. This creates a linked list of stack frames which allows an exception handler (or debugger) to backtrace up the stack. (How to generate the backtrace by looking at the stack values?). I think this is required in 32-bit Windows code for SEH (structured exception handling), at least in some cases, but IDK the details.
The i386 SysV ABI defines an alternate mechanism for stack unwinding which makes frame pointers optional, using metadata in another section (.eh_frame and .eh_frame_hdr which contains metadata created by .cfi_... assembler directives, which in theory you could write yourself if you wanted stack-unwinding through your function to work. i.e. if you were calling any C++ code which expected throw to work.)
If you want to use the traditional frame-walking in current gdb, you have to actually do it yourself by defining a GDB function like gdb backtrace by walking frame pointers or Force GDB to use frame-pointer based unwinding. Or apparently if your executable has no .eh_frame section at all, gdb will use the EBP-based stack-walking method.
If you compile with gcc -fno-omit-frame-pointer, your call stack will have this linked-list property, because when C compilers do make proper stack frames, they push ebp first.
IIRC, perf has a mode for using the frame-pointer chain to get backtraces while profiling, and apparently this can be more reliable than the default .eh_frame stuff for correctly accounting which functions are responsible for using the most CPU time. (Or causing the most cache misses, branch mispredicts, or whatever else you're counting with performance counters.)
Wouldn't both just work even if I mix them up in different functions of a program? One function does it before, the other does it after etc.
Yes, it would work fine. In fact setting up ebp at all is optional, but when writing by hand it's easier to have a fixed base (unlike esp which moves around when you push/pop).
For the same reason, it's easier to stick to the convention of mov %esp, %ebp after one push (of the old %ebp), so the first function arg is always at ebp+8. See What is stack frame in assembly? for the usual convention.
But you could maybe save code size by having ebp point in the middle of some space you reserved, so all the memory addressable with an ebp + disp8 addressing mode is usable. (disp8 is a signed 8-bit displacement: -128 to +124 if we're limiting to 4-byte aligned locations). This saves code bytes vs. needing a disp32 to reach farther. So you might do
bigfunc:
push %ebp
lea -112(%esp), %ebp # first arg at ebp+8+112 = 120(%ebp)
sub $236, %esp # locals from -124(%ebp) ... 108(%ebp)
# saved EBP at 112(%ebp), ret addr at 116(%ebp)
# 236 was chosen to leave %esp 16-byte aligned.
Or delay saving any registers until after reserving space for locals, so we aren't using up any of the locations (other than the ret addr) with saved values we never want to address.
bigfunc2: # first arg at 4(%esp)
sub $252, %esp # first arg at 252+4(%esp)
push %ebp # first arg at 252+4+4(%esp)
lea 140(%esp), %ebp # first arg at 260-140 = 120(%ebp)
push %edi # save the other call-preserved regs
push %esi
push %ebx
# %esp is 16-byte aligned after these pushes, in case that matters
(Remember to be careful how you restore registers and clean up. You can't use leave because esp = ebp isn't right. With the "normal" stack frame sequence, you might restore other pushed registers (from near the saved EBP) with mov, then use leave. Or restore esp to point at the last push (with add), and use pop instructions.)
But if you're going to do this, there's no advantage to using ebp instead of ebx or something. In fact, there's a disadvantage to using ebp: the 0(%ebp) addressing mode requires a disp8 of 0, instead of no displacement, but %ebx wouldn't. So use %ebp for a non-pointer scratch register. Or at least one that you don't dereference without a displacement. (This quirk is irrelevant with a real frame pointer: (%ebp) is the saved EBP value. And BTW, the encoding that would mean (%ebp) with no displacement is how the ModRM byte encodes a disp32 with no base register, like (12345) or my_label)
These example are pretty artifical; you usually don't need that much space for locals unless it's an array, and then you'd use indexed addressing modes or pointers, not just a disp8 relative to ebp. But maybe you need space for a few 32-byte AVX vectors. In 32-bit code with only 8 vector registers, that's plausible.
AVX512 compressed disp8 mostly defeats this argument for 64-byte AVX512 vectors, though. (But AVX512 in 32-bit mode can still only use 8 vector registers, zmm0-zmm7, so you could easily need to spill some. You only get x/ymm8-15 and zmm8-31 in 64-bit mode.)
How fortunate it is for all of use learning the art of computer programming to have access to a community such as Stack Overflow! I have made the decision to take up the task of learning how to program computers and I am doing so by the knowledge of an e-book called 'Programming From the Ground Up', which teaches the reader how to create programs in the assembly language within the GNU/Linux environment.
My progress in the book has come to the point of creating a program which computes the factorial of the integer 4 with a function, which I have made and done without any error caused by the assembler of GCC or caused by running the program. However, the function in my program does not return the right answer! The factorial of 4 is 24, but the program returns a value of 0! Rightly speaking, I do not know why this is!
Here is the code for your consideration:
.section .data
.section .text
.globl _start
.globl factorial
_start:
push $4 #this is the function argument
call factorial #the function is called
add $4, %rsp #the stack is restored to its original
#state before the function was called
mov %rax, %rbx #this instruction will move the result
#computed by the function into the rbx
#register and will serve as the return
#value
mov $1, %rax #1 must be placed inside this register for
#the exit system call
int $0x80 #exit interrupt
.type factorial, #function #defines the code below as being a function
factorial: #function label
push %rbp #saves the base-pointer
mov %rsp, %rbp #moves the stack-pointer into the base-
#pointer register so that data in the stack
#can be referenced as indexes of the base-
#pointer
mov $1, %rax #the rax register will contain the product
#of the factorial
mov 8(%rbp), %rcx #moves the function argument into %rcx
start_loop: #the process loop begins
cmp $1, %rcx #this is the exit condition for the loop
je loop_exit #if the value in %rcx reaches 1, exit loop
imul %rcx, %rax #multiply the current integer of the
#factorial by the value stored in %rax
dec %rcx #reduce the factorial integer by 1
jmp start_loop #unconditional jump to the start of loop
loop_exit: #the loop exit begins
mov %rbp, %rsp #restore the stack-pointer
pop %rbp #remove the saved base-pointer from stack
ret #return
TL:DR: the factorial of the return address overflowed %rax, leaving 0, because you ported wrong.
Porting 32-bit code to 64-bit is not as simple as changing all the register names. That might get it to assemble, but as you found even this simple program behaves differently. In x86-64, push %reg and call both push 64-bit values, and modify rsp by 8. You would see this if you single-stepped your code with a debugger. (See the bottom of the x86 tag wiki for info using gdb for asm.)
You're following a book that uses 32-bit examples, so you should probably just build them as 32-bit executables instead of trying to port them to 64-bit before you know how.
Your sys_exit() using the 32-bit int 0x80 ABI still works (What happens if you use the 32-bit int 0x80 Linux ABI in 64-bit code?), but you will run into trouble with system calls if you try to pass 64-bit pointers. Use the 64-bit ABI.
You will also run into problems if you want to call any library functions, because the standard function-calling convention is different, too. See Why parameters stored in registers and not on the stack in x86-64 Assembly?, and the 64-bit ABI link, and other calling-convention docs in the x86 tag wiki.
But you're not doing any of that, so the problem with your program simply comes down to not accounting for the doubled "stack width" in x86-64. Your factorial function reads the return address as its argument.
Here's your code, commented to explain what it actually does
push $4 # rsp-=8. (rsp) = qword 4
# non-standard calling convention with args on the stack.
call factorial # rsp-=8. (rsp) = return address. RIP=factorial
add $4, %rsp # misalign the stack, so it's pointing to the top half of the 4 you pushed earlier.
# if this was in a function that wanted to return, you'd be screwed.
mov %rax, %rbx # copy return value to first arg of system call
mov $1, %rax #eax = __NR_EXIT from asm/unistd_32.h, wasting 2 bytes vs. mov $1, %eax
int $0x80 # 32-bit ABI system call, eax=call number, ebx=first arg. sys_exit(factorial(4))
So the caller is sort of fine (for the non-standard 64-bit calling convention you've invented that passes all args on the stack). You might as well omit the add to %rsp entirely, since you're about to exit without touching the stack any further.
.type factorial, #function #defines the code below as being a function
factorial: #function label
push %rbp #rsp-=8, (rsp) = rbp
mov %rsp, %rbp # make a traditional stack frame
mov $1, %rax #retval = 1. (Wasting 2 bytes vs. the exactly equivalent mov $1, %eax)
mov 8(%rbp), %rcx #load the return address into %rcx
... and calculate the factorial
For static executables (and dynamically linked executables that aren't ASLR enabled with PIE), _start is normally at 0x4000c0. Your program will still run nearly instantaneously on a modern CPU, because 0x4000c0 * 3c latency of imul is still only 12.5 million core clock cycles. On a 4GHz CPU, that's 3 milliseconds of CPU time.
If you'd made a position-independent executable by linking with gcc foo.o on a recent distro, _start would have an address like 0x5555555545a0, and your function would have taken ~70368 seconds to run on a 4GHz CPU with 3-cycle imul latency.
4194496! includes many even numbers, so its binary representation has many trailing zeros. The whole %rax will be zero by the time you're done multiplying by every number from 0x4000c0 down to 1.
The exit status of a Linux process is only the low 8 bits of the integer you pass to sys_exit() (because the wstatus is only a 32-bit int and includes other stuff, like what signal ended the process. See wait4(2)). So even with small args, it doesn't take much.
I have this function:
BOOL WINAPI MyFunction(WORD a, WORD b, WORD *c, WORD *d)
When disassembling, I'm getting something like this:
PUSH EBP
MOV ESP, EBP
SUB ESP, C
...
LEAVE
RETN C
As far as I know, the SUB ESP, C means that the function takes 12 bytes for all it's arguments, right? Each argument is 4-byte, and there're 4 arguments so shouldn't this function be disassembled as SUB ESP, 10?
Also, if I don't know about the C header of the function, how can I know the size of each parameter (not the size of all the parameters)?
No, the SUB instruction only tells you that the function needs 12 bytes for its local variables. Inferring the arguments requires looking at the code that calls this function. You'll see it setting up the stack before the CALL instruction.
In the specific case of a WINAPI function (aka __stdcall), the RET instruction gives you information since that calling convention requires the function to clean-up the stack before it returns. So a RET 0x0C tells you that the arguments required 12 bytes. Otherwise an accidental match with the stack frame size. Which usually means it takes 3 arguments, it depends on the argument types. A WORD size argument gets promoted to a 32-bit value so the signature you theorized is not a match.
If the convention call uses the stack (as it seems) to pass parameters, you can figure out how many parameters and what size they have.
For "how many", you can look at the operand of the RET instruction, if any (stdcall convention). This will give you how many bytes parameters are using. Of course this data alone if of not much use.
You have to read the function code and search for memory references like this [EBP+n] where n is a positive offset from the value of EBP. Positive offsets are addressing parameters, and negative offsets are addressing local variables (created with the SUB ESP,x instruction)
Hopefully, you will able to spot all distinct parameters. If the function has been complied with optimizations, this may be hard to figure out.
For size and type, more inverse engineering is needed. Look at the instructions that use addressed parameters. If you find something like dword ptr [ebp+n] then that parameter is 32-bit long. word ptr [ebp+n] tels you that the parameter is 16-bit long, and byte ptr [ebp+n] means a byte size parameter.
For byte and word sized parameters, the most plausible options are char/unsigned char and short/unsigned short.
For double word sized parameters, type may be int/unsigned int/long/unsigned long, but it may be a pointer as well. To differentiate a pointer from a plain integer, you will have to look further, to see if the dword read from the parameter is being used as a memory address itself to access memory (i.e. it's being dereferenciated).
To tell signedness of a parameter, you have to search for a code fragment in which a particular parameter is compared against some other value, and then a conditional jump is issued. The particular condition used in the jump will tell you if the comparison was performed taking the sign into account or not. For example: a comparison with a JA / JB / JAE / JBE conditional jumps indicate an unsigned comparison and hence, an unsigned parameter. Conditional jumps as JG / JE / JGE / JLE indicate signed parameter involved in the comparison.
That depends on your ABI.
In your case, it seems you're using Windows x86 (32 bit), which allows several C calling conventions. Some pass parameters in registers, others on the stack.
If the parameters are passed on the stack, they will be above the frame pointer, so subtracting from the stack pointer is used to make space for local variables, not to read the function parameters.
I am new to reverse engineering, and I have been looking at a simple program:
char* a = "hello world";
printf(a);
However, when I open this in ollydbg, I am not taken right to the assembly as I would have been in gdb, there are many more instructions first. I was wondering why this was happening.
Thanks!
Depending how you attach to the program with olly, you'll be take to one of two places(if no errors occurred):
The module entry point (aka the system glue and CRT wrapper for main/WinMain/DllMain): this occurs when you start a program with olly.
NtUserBreakPoint: this is when you attach to an existing process.
To navigate to where you want you can use ctrl + e to bring up the modules window, from there, select the module you want. Then use crtl + n to bring up the symbols window for your current module (note: for non-exported symbols to be available, the pdb's need to be available or you need to perform an object scan of your obj's for that build).
if your taken to the ModuleEntryPoint you can also just spelunk down the call chain (generally you want the second call/jmp), this gets you to the crt entrypoint, from there just look for a call with 3/5/4 args, this will be main/WinMain/DllMain:
from here:
Blackene.<ModuleEntryPoint> 004029C3 E8 FC030000 CALL Blackene.__security_init_cookie
004029C8 ^ E9 D7FCFFFF JMP Blackene.__tmainCRTStartup
we goto here:
Blackene.__tmainCRTStartup 004026A4 6A 58 PUSH 58
004026A6 68 48474000 PUSH Blackene.00404748
004026AB E8 1C060000 CALL Blackene.__SEH_prolog4
004026B0 33DB XOR EBX,EBX
then scroll down here:
004027D3 6A 0A PUSH 0A
004027D5 58 POP EAX
004027D6 50 PUSH EAX
004027D7 56 PUSH ESI
004027D8 6A 00 PUSH 0
004027DA 68 00004000 PUSH Blackene.00400000
004027DF E8 2CF2FFFF CALL Blackene.WinMain
I'm assuming ollydbg 1.10 is being used.
I'm currently analyzing a program I wrote in assembly and was thinking about moving some code around in the assembly. I have a procedure which takes one argument, but I'm not sure if it is passed on the stack or a register.
When I open my program in IDA Pro, the first line in the procedure is:
ThreadID= dword ptr -4
If I hover my cursor over the declaration, the following also appears:
ThreadID dd ?
r db 4 dup(?)
which I would assume would point to a stack variable?
When I open the same program in OllyDbg however, at this spot on the stack there is a large value, which would be inconsistent with any parameter that could have been passed, leading me to believe that it is passed in a register.
Can anyone point me in the right direction?
The way arguments are passed to a function depends on the function's calling convention. The default calling convention depends on the language, compiler and architecture.
I can't say anything for sure with the information you provided, however you shouldn't forget that assembly-level debuggers like OllyDbg and disassemblers like IDA often use heuristics to reverse-engineer the program. The best way to study the code generated by the compiler is to instruct it to write assembly listings. Most compilers have an option to do this.
It is a local variable for sure. To check out arguments look for [esp+XXX] values. IDA names those [esp+arg_XXX] automatically.
.text:0100346A sub_100346A proc near ; CODE XREF: sub_100347C+44p
.text:0100346A ; sub_100367A+C6p ...
.text:0100346A
.text:0100346A arg_0 = dword ptr 4
.text:0100346A
.text:0100346A mov eax, [esp+arg_0]
.text:0100346E add dword_1005194, eax
.text:01003474 call sub_1002801
.text:01003474
.text:01003479 retn 4
.text:01003479
.text:01003479 sub_100346A endp
And fastcall convention as was outlined in comment above uses registers to pass arguments. I'd bet on Microsoft or GCC compiler as they are more widely used. So check out ECX and EDX registers first.
Microsoft or GCC [2] __fastcall[3]
convention (aka __msfastcall) passes
the first two arguments (evaluated
left to right) that fit into ECX and
EDX. Remaining arguments are pushed
onto the stack from right to left.
http://en.wikipedia.org/wiki/X86_calling_conventions#fastcall