Ollydbg instructions before program - reverse-engineering

I am new to reverse engineering, and I have been looking at a simple program:
char* a = "hello world";
printf(a);
However, when I open this in OllyDbg, I am not taken straight to the program's assembly as I would be in gdb; there are many more instructions first. I was wondering why this happens.
Thanks!

Depending on how you attach to the program with Olly, you'll be taken to one of two places (if no errors occurred):
The module entry point (aka the system glue and CRT wrapper for main/WinMain/DllMain): this occurs when you start a program with olly.
NtUserBreakPoint: this is when you attach to an existing process.
To navigate to where you want, use Alt+E to bring up the modules window and select the module you want. Then use Ctrl+N to bring up the symbols window for the current module (note: for non-exported symbols to be available, the PDBs need to be available, or you need to perform an object scan of your .obj files for that build).
If you're taken to the ModuleEntryPoint, you can also just spelunk down the call chain (generally you want the second call/jmp). This gets you to the CRT entry point; from there, look for a call with 3/4/3 args, which will be main/WinMain/DllMain respectively:
from here:
Blackene.<ModuleEntryPoint> 004029C3 E8 FC030000 CALL Blackene.__security_init_cookie
004029C8 ^ E9 D7FCFFFF JMP Blackene.__tmainCRTStartup
we go to here:
Blackene.__tmainCRTStartup 004026A4 6A 58 PUSH 58
004026A6 68 48474000 PUSH Blackene.00404748
004026AB E8 1C060000 CALL Blackene.__SEH_prolog4
004026B0 33DB XOR EBX,EBX
then scroll down here:
004027D3 6A 0A PUSH 0A
004027D5 58 POP EAX
004027D6 50 PUSH EAX
004027D7 56 PUSH ESI
004027D8 6A 00 PUSH 0
004027DA 68 00004000 PUSH Blackene.00400000
004027DF E8 2CF2FFFF CALL Blackene.WinMain
I'm assuming ollydbg 1.10 is being used.

EMV Offline Approval/Decline

I'm developing an interface to a VeriFone VX terminal. Although, this is really a general EMV question. Our processor has a zero floor limit, so it will always be sent online. However, in case it ever changes, how do you know (what tags) if the transaction was approved or declined offline? Or, in other words, how do you know to go online or not?
The terminal has to decide whether to complete the transaction offline, go online, or reject it. It then sends a GENERATE AC command to the card, and the card's response tells the terminal which action follows.
The decision depends on three fields:
1) Issuer Action Code (IAC)
2) Terminal Action Code (TAC)
3) Terminal Verification Results (TVR)
IAC, TAC and TVR have the same structure. For details on these fields, see EMV Book 3.
IAC usage example:
Suppose IAC-Online (tag 9F0F) = 08 00 00 00 00.
Here byte 1, bit 4 is set, i.e. offline DDA failed,
so the issuer wants to go online if offline DDA fails.
When the terminal performs DDA and it fails, it sets the corresponding bit in the TVR,
which means the TVR records that offline DDA failed for this card.
The terminal then checks IAC-Online, finds the DDA-failed bit set there and also in the TVR, and decides to go online: it sends a GENERATE AC command to the card with P1 = 80 (ARQC, online authorisation requested).
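The comparison described above can be sketched in a few lines. This is only an illustration, assuming an online-capable terminal and all-zero TACs and IAC-Denial (hypothetical values, not from the question); the function and constant names are made up for the example, and the default action codes are left out.

```python
# P1 values for the first GENERATE AC (cryptogram type requested by the terminal)
P1_AAC, P1_TC, P1_ARQC = 0x00, 0x40, 0x80


def any_common_bit(a: bytes, b: bytes) -> bool:
    """True if at least one bit is set in both 5-byte fields."""
    return any(x & y for x, y in zip(a, b))


def first_gen_ac_p1(tvr, iac_denial, tac_denial, iac_online, tac_online):
    """Hypothetical helper: pick P1 from the TVR and the action codes."""
    # Denial codes are checked first: a match means decline offline (AAC).
    if any_common_bit(tvr, iac_denial) or any_common_bit(tvr, tac_denial):
        return P1_AAC
    # Then the online codes: a match means request online authorisation (ARQC).
    if any_common_bit(tvr, iac_online) or any_common_bit(tvr, tac_online):
        return P1_ARQC
    # Nothing matched: request offline approval (TC).
    return P1_TC


ZERO = bytes(5)                            # assumed all-zero action codes
tvr = bytes.fromhex("0800000000")          # terminal set the DDA-failed bit
iac_online = bytes.fromhex("0800000000")   # tag 9F0F from the example
p1 = first_gen_ac_p1(tvr, ZERO, ZERO, iac_online, ZERO)
print(f"P1 = {p1:02X}")                    # prints "P1 = 80"
```

With a TVR of all zeros the same call returns 40 (TC), i.e. the terminal would ask for offline approval.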
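P1 is coded in its two most significant bits: 00 = AAC (decline), 01 = TC (approve offline), 10 = ARQC (go online).
Example GENERATE AC command: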
C: 80 AE 80 00 other data
R: SW1/SW2=9000 (Normal processing: No error) Lr=32
77 1E 9F 27 01 80 9F 36 02 02 13 9F 26 08 2D F3
83 3C 61 85 5B EA 9F 10 07 06 84 23 00 31 02 08
The final decision is made by the card, and the terminal gets it in the response to the GENERATE AC command. The card returns tag 9F27 (Cryptogram Information Data); here it returns 80, i.e. the card wants the transaction to go online.
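To make that concrete, here is a rough sketch that pulls tag 9F27 out of the response bytes shown above and maps its top two bits to a cryptogram type. The parser is a minimal BER-TLV reader written for this example (the names parse_tlv and CID_TYPES are made up); it is not taken from any terminal SDK.

```python
def parse_tlv(data: bytes) -> dict:
    """Parse a flat run of BER-TLV objects into {tag_hex: value_bytes}."""
    out, i = {}, 0
    while i < len(data):
        start = i
        # A first tag byte with its low five bits all set means more tag bytes follow.
        if data[i] & 0x1F == 0x1F:
            i += 1
            while data[i] & 0x80:
                i += 1
        i += 1
        tag = data[start:i].hex().upper()
        length = data[i]
        i += 1
        if length & 0x80:  # long form: low 7 bits give the number of length bytes
            n = length & 0x7F
            length = int.from_bytes(data[i:i + n], "big")
            i += n
        out[tag] = data[i:i + length]
        i += length
    return out


# Top two bits of the Cryptogram Information Data (tag 9F27)
CID_TYPES = {0x00: "AAC", 0x40: "TC", 0x80: "ARQC"}

response = bytes.fromhex(
    "771E9F2701809F360202139F26082DF3833C61855BEA9F100706842300310208"
)
fields = parse_tlv(parse_tlv(response)["77"])   # unwrap template 77 first
cid = fields["9F27"][0]
decision = CID_TYPES.get(cid & 0xC0, "RFU")
print(decision)                                  # prints "ARQC"
```

AAC means declined offline, TC means approved offline, and ARQC means the card wants to go online.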
This is an important topic, and the specifications cover it in more depth. See the EMV books, in particular the sections on Terminal Action Analysis and Card Action Analysis.
Assuming you're using VeriFone's VIPA API, then the first 'Continue Transaction' command (GenAC1) returns tags wrapped in a TLV template (or 'constructed' TLV tag). The value of this template determines the result:
E3: Locally authorized
E4: Requires online authorization
AFAIK (in vanilla EMV) the tag Cryptogram Information Data ('9F27') returned during 1st GENERATE AC should serve this purpose.
See EMV Book 3, Table 14.
Beware that this tag contains the decision of the card, so you won't see the cryptogram type the kernel requested.

X64 Disassemblers IDA and WINDBG. IDA doesnt show x64 opcodes

So I just started learning WinDbg, moving from OllyDbg to 64-bit. While studying, something weird happened: in WinDbg I see all the RXX registers and 64-bit opcodes, while in IDA I still see the EXX registers when looking at the same EXE (notepad.exe, for instance).
Does anyone have any idea why that is?
Example:
WINDBG:
0:000> u notepad!_security_init_cookie L5
notepad!_security_init_cookie:
00000000`ffaf3380 48895c2418 mov qword ptr [rsp+18h],rbx
00000000`ffaf3385 57 push rdi
00000000`ffaf3386 4883ec20 sub rsp,20h
00000000`ffaf338a 488b05e7cc0000 mov rax,qword ptr [notepad!_security_cookie (00000000ffb00078)]
00000000`ffaf3391 488364243000 and qword ptr [rsp+30h],0
IDA:
___security_init_cookie proc near ; CODE XREF: _WinMainCRTStartupp
.text:01003053 8B FF mov edi, edi
.text:01003055 55 push ebp
.text:01003056 8B EC mov ebp, esp
.text:01003058 83 EC 10 sub esp, 10h
.text:0100305B A1 10 C0 00 01 mov eax, ___security_cookie
There are two versions of IDA included in your installation. Please confirm that you are using the 64-bit version of IDA (e.g., idaq64.exe).
If the PE file being disassembled is 64-bit, and the IDA version being used is the one designed for 64-bit disassembly, then you will indeed see the correct registers. If not, then most likely one of these conditions is not true.
You have disassembled the 32-bit Notepad in IDA.
Did you open notepad.exe from system32? In that case IDA got the 32-bit version (IDA itself is a 32-bit executable and so is subject to WoW64 filesystem redirection, which silently maps system32 to SysWOW64).
The easiest way to "fix" this is to copy the file out of the system32 directory somewhere else and open it from there.
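If you want to double-check which build of a file you actually ended up with, the Machine field in the PE header tells you. A minimal sketch, assuming only the two machine types relevant here; pe_machine and fake_image are names invented for the example, and fake_image just builds a synthetic header so the snippet runs anywhere.

```python
import struct

# Machine values from the PE/COFF file header (only the two relevant here)
MACHINES = {0x014C: "x86 (32-bit)", 0x8664: "x64 (64-bit)"}


def pe_machine(image: bytes) -> str:
    """Return the target architecture recorded in a PE image's file header."""
    if image[:2] != b"MZ":
        raise ValueError("not an MZ executable")
    # e_lfanew at offset 0x3C holds the file offset of the PE signature.
    (pe_offset,) = struct.unpack_from("<I", image, 0x3C)
    if image[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("PE signature not found")
    # The 16-bit Machine field follows the 4-byte signature.
    (machine,) = struct.unpack_from("<H", image, pe_offset + 4)
    return MACHINES.get(machine, f"other (0x{machine:04X})")


def fake_image(machine: int) -> bytes:
    """Build a minimal synthetic header, for demonstration only."""
    header = bytearray(0x80)
    header[:2] = b"MZ"
    struct.pack_into("<I", header, 0x3C, 0x40)
    header[0x40:0x44] = b"PE\0\0"
    struct.pack_into("<H", header, 0x44, machine)
    return bytes(header)


print(pe_machine(fake_image(0x8664)))   # prints "x64 (64-bit)"
```

For a real file you would pass the bytes read from disk instead of fake_image(...). Keep in mind that a 32-bit Python reading from system32 is redirected the same way IDA was.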

How are functions encoded/stored in memory?

I understand how things like numbers and letters are encoded in binary, and thus can be stored as 0's and 1's.
But how are functions stored in memory? I don't see how they could be stored as 0's and 1's, and I don't see how something could be stored in memory as anything besides 0's and 1's.
They are in fact stored in memory as 0's and 1's.
Here is a real world example:
int func(int a, int b) {
return (a + b);
}
Here is an example of 32-bit x86 machine instructions that a compiler might generate for the function (in a text representation known as assembly code):
func:
push ebp
mov ebp, esp
mov edx, [ebp+8]
mov eax, [ebp+12]
add eax, edx
pop ebp
ret
Going into how each of these instructions work is beyond the scope of this question, but each one of these symbols (such as add, pop, mov, etc) and their parameters are encoded into 1's and 0's. This table shows many of the Intel instructions and a summary of how they are encoded. See also the x86 tag wiki for links to docs/guides/manuals.
So how does one go about converting code from text assembly into machine-readable bytes (aka machine code)? Take, for example, the instruction add eax, edx. This page shows how the add instruction is encoded. eax and edx are registers: spots in the processor used to hold information for processing. Variables in computer programming will often map to registers at some point. Because we are adding registers and the registers are 32-bit, we select the opcode 00000001 (see also Intel's official instruction-set reference manual entry for ADD, which lists all the forms available).
The next step is specifying the operands. This section of the same page shows how this is done with the example "add ecx, eax", which is very similar to our own. The first two bits have to be '11' to show we are adding registers. The next 3 bits specify the source register; in our case that is edx rather than the eax in their example, which gives '010'. The last 3 bits specify the destination, our eax ('000'), so we have a final result of
00000001 11010000
Which is 01 D0 in hexadecimal. A similar process can be applied to converting any instruction into binary. The tool used to do this automatically is called an assembler.
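The same bit-packing can be written out as a tiny sketch. It only covers the one form discussed here (opcode 01, register to register, 32-bit operands); the function name is made up for the example, and a real assembler handles far more cases.

```python
# Standard 3-bit register numbers for the 32-bit general-purpose registers
REG32 = {"eax": 0, "ecx": 1, "edx": 2, "ebx": 3,
         "esp": 4, "ebp": 5, "esi": 6, "edi": 7}


def encode_add_reg_reg(dst: str, src: str) -> bytes:
    """Encode `add dst, src` for two 32-bit registers (opcode 01 /r)."""
    opcode = 0x01
    # Second byte: 11 (register-to-register), then source, then destination.
    modrm = (0b11 << 6) | (REG32[src] << 3) | REG32[dst]
    return bytes([opcode, modrm])


code = encode_add_reg_reg("eax", "edx")
print(" ".join(f"{b:02x}" for b in code), format(code[1], "08b"))
# prints "01 d0 11010000"
```

Calling encode_add_reg_reg("ecx", "eax") gives 01 C1, the example from the linked page.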
So, running the above assembly code through an assembler produces the following output:
55 89 E5 8B 55 08 8B 45 0C 01 D0 5D C3
Note the 01 D0 near the end of the string, this is our "add" instruction. Converting machine-code bytes back into text assembly-language mnemonics is called disassembling:
address | machine code | disassembly
0: 55 push ebp
1: 89 e5 mov ebp, esp
3: 8b 55 08 mov edx, [ebp+0x8]
6: 8b 45 0c mov eax, [ebp+0xc]
9: 01 d0 add eax, edx
b: 5d pop ebp
c: c3 ret
Addresses start at zero because this is only a .o, not a linked binary. So they're just relative to the start of the file's .text section.
You can see this for any function you like on the Godbolt Compiler Explorer (or on your own machine on any binary, freshly-compiled or not, using a disassembler).
You may notice there is no mention of the name "func" in the final output. This is because in machine code, a function is referenced by its location in RAM, not its name. The compiler-output object file may have a func entry in its symbol table referring to this block of machine code, but the symbol table is read by software, not something the CPU hardware can decode and run directly. The bit-patterns of the machine code are seen and decoded directly by transistors in the CPU.
Sometimes it is hard for us to understand how computers encode instructions like this at a low level because as programmers or power users, we have tools to avoid ever dealing with them directly. We rely on compilers, assemblers, and interpreters to do the work for us. Nonetheless, anything a computer ever does must eventually be specified in machine code.
Functions are made of instructions, such as bytecode or machine code. Instructions are numbers, which can be encoded in binary.
A good introduction to this is Charles Petzold's book Code.
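You can see the "instructions are numbers" point directly in an interpreted language too. A small sketch, assuming CPython (which exposes a function's compiled bytecode as a bytes object through func.__code__.co_code); the exact bytes differ between Python versions, so no particular output is shown.

```python
def func(a, b):
    return a + b


# CPython keeps the compiled instructions of a function as a plain bytes object.
code = func.__code__.co_code

print(type(code).__name__, len(code))
# Each instruction byte is just a number from 0 to 255, shown here in binary.
print(" ".join(f"{byte:08b}" for byte in code))
```

This is bytecode for the Python virtual machine rather than machine code for the CPU, but the idea is the same: the function body is a sequence of numbers.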
I will explain how functions are stored in the simplest way possible. You may be surprised by how simple it all is by the end. This is a very basic explanation, and every type of computer works in broadly the same way.
The only part of a computer that can perform operations on data is the arithmetic logic unit (ALU), which does things like addition, subtraction, multiplication and division. Every data manipulation (any sort of math or formula) is built up from operations like these.
Now let us look at the basic structure of an instruction in binary. On a simplified, hypothetical 32-bit machine, an instruction could take the form:
1 001 32-bit address 32-bit address
1 (if this bit is one, the instruction is routed to the logic unit for calculation; if zero, we are just moving data between the two memory addresses that follow) 001 (these 3 bits determine whether we are adding (001), subtracting (010), multiplying (011), or dividing (100) in this instruction cycle) (32-bit address of the first memory location) (32-bit address of the second memory location)
A function is basically a string of instructions describing how to manipulate data in defined memory locations.
Let us take a function that adds a number, then multiplies. Its string of instructions will be:
(let me use MA to mean memory address)
1 001 MAone MAtwo (adds the value in MAone to the value in MAtwo and stores the result in MAone)
1 011 MAtwo MAthree (multiplies the value in MAtwo by the value in MAthree and stores the result in MAthree)
Returns the value in MAthree
So in this simplified model, the only difference in how functions are stored is that their instructions have a 1 in the leftmost bit, so that the CPU knows they need logical operations and routes them to the ALU.

Function Parameters in ASM x86 FASM

How do I pass parameters to a function in Assembly?
I did push Last Param, push Second Param, push First Param..
But I cannot access the parameters within Meh Function.. What I'm doing crashes the program..
format PE console ;Format PE OUT GUI 4.0
entry main
include 'macro/import32.inc'
section '.idata' import data readable ;Import Section.
library msvcrt,'msvcrt.dll'
import msvcrt, printf, 'printf',\
exit,'exit', getchar, 'getchar'
section '.data' data readable writeable ;Constants/Static Section.
InitialValue dd 0
section '.code' code readable executable
main:
push 67
push 66
push 65
call MEH
call [getchar]
mov eax, 0
ret 0
MEH:
push ebx
mov ebp, esp
sub esp, 0
mov eax, [ebp + 8] ; Trying to print first parameter..
push eax
call [printf]
add esp, eax
mov esp, ebp
pop ebx
ret
Small additional notes.
The proper header/footer of the procedure uses push/pop ebp:
MEH:
push ebp
mov ebp, esp
mov esp, ebp
pop ebp
ret
The reason is that we need to save/restore ebp register before using it as a pointer to the arguments and local variables.
Second, the CCALL (cdecl) calling convention, where the caller restores the stack pointer after the procedure returns, is common for C/C++ but not for assembly programming. The reason is simple: a compiler can reliably compute how many parameters were pushed on the stack, whereas in a hand-written assembly program this convention makes the code harder to read.
A better approach is to use the STDCALL calling convention:
MEH:
push ebp
mov ebp, esp
mov esp, ebp
pop ebp
retn 12 ; how many bytes to be automatically
; removed from the stack after return.
An even better practice is to use macros to automate the creation of the standard procedure elements and to provide human-readable labels for the arguments and local variables. For example, the macros provided in the FreshLib library have the following syntax:
proc MEH, .arg1, .arg2, .arg3
; define local variables here, if needed.
begin
; place your code here without headers and footers
return ; will clean the stack automatically.
endp
; pushes the arguments in the stack and call MEH
stdcall MEH, 65, 66, 67
The standard macro library provided with FASM packages has a slightly different syntax, which is covered in detail in the FASM programmer's manual.
Let's see...
Say your ESP is 0x0018007C at the outset; then after the three pushes you have
00180078: 67
00180074: 66
00180070: 65
then you call MEH, which immediately pushes ebx so now you have the stack as
00180078: 67
00180074: 66
00180070: 65
0018006C: return address
00180068: ebx value
you now load EBP with ESP = 00180068
sub esp,0 does nothing
mov eax, [ebp+8] ~ 00180068 + 8 = 00180070 = 65
so this reads the last value pushed (65), not the first one pushed (67)
call [printf]
Here comes your problem, though:
add esp, eax
What good was this supposed to do? Assuming printf preserves this argument passed in (which it is incidentally not required to do), why would you add the argument to the stack pointer? That is sure to mess up your return.
What you want to do is restore esp to the value of ebp and pop back the saved ebx value.
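The stack walk above can be replayed with a small simulation. This is just a sketch: the Stack class is made up for the example, and the starting ESP of 0x0018007C is an assumption chosen so that the first push lands at 00180078 as in the listing.

```python
class Stack:
    """Toy model of a 32-bit stack: push decrements ESP by 4, then stores."""

    def __init__(self, esp: int):
        self.esp = esp
        self.mem = {}

    def push(self, value):
        self.esp -= 4
        self.mem[self.esp] = value


stack = Stack(0x0018007C)        # assumed starting ESP
for arg in (67, 66, 65):         # push 67 / push 66 / push 65
    stack.push(arg)
stack.push("return address")     # pushed by `call MEH`
stack.push("saved ebx")          # `push ebx`
ebp = stack.esp                  # `mov ebp, esp`

print(hex(ebp), stack.mem[ebp + 8])   # prints "0x180068 65"
```

[ebp+4] holds the return address and [ebp+12] and [ebp+16] hold 66 and 67, which is where the other two parameters would be read from.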
If the calling convention for printf() is correct (it is for 32-bit MinGW and 32-bit gcc on Linux), then you're completely ignoring what the function expects and there's no surprise in you not getting the desired output.
The function's prototype is:
int printf(const char* format, ...);
format, the first parameter, is a pointer to an ASCIIZ string, which contains the text to print and/or special tokens like %d to be replaced by the appropriate interpretation of the optional parameters following format.
So, if you want printf() to print 'A', then this is what you need to do in C:
printf("A");
or
printf("%c", 'A');
And here's how you'd do the same in assembly:
myformatstring db "A", 0 ; this line goes into section .data
push myformatstring ; push address of the string
call [printf]
add esp, 4 ; remove all parameters from the stack
or
myformatstring db "%c", 0 ; this line goes into section .data
push 'A'
push myformatstring ; push address of the string
call [printf]
add esp, 2*4 ; remove all parameters from the stack

Will arguments to a function be passed on the stack or in a register?

I'm currently analyzing a program I wrote in assembly and was thinking about moving some code around in the assembly. I have a procedure which takes one argument, but I'm not sure if it is passed on the stack or a register.
When I open my program in IDA Pro, the first line in the procedure is:
ThreadID= dword ptr -4
If I hover my cursor over the declaration, the following also appears:
ThreadID dd ?
r db 4 dup(?)
which I would assume would point to a stack variable?
When I open the same program in OllyDbg however, at this spot on the stack there is a large value, which would be inconsistent with any parameter that could have been passed, leading me to believe that it is passed in a register.
Can anyone point me in the right direction?
The way arguments are passed to a function depends on the function's calling convention. The default calling convention depends on the language, compiler and architecture.
I can't say anything for sure with the information you provided, however you shouldn't forget that assembly-level debuggers like OllyDbg and disassemblers like IDA often use heuristics to reverse-engineer the program. The best way to study the code generated by the compiler is to instruct it to write assembly listings. Most compilers have an option to do this.
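It is a local variable for sure: in IDA's frame layout, negative offsets such as -4 are locals, while arguments sit at positive offsets. To check for arguments, look for [esp+XXX] values. IDA names those [esp+arg_XXX] automatically.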
.text:0100346A sub_100346A proc near ; CODE XREF: sub_100347C+44p
.text:0100346A ; sub_100367A+C6p ...
.text:0100346A
.text:0100346A arg_0 = dword ptr 4
.text:0100346A
.text:0100346A mov eax, [esp+arg_0]
.text:0100346E add dword_1005194, eax
.text:01003474 call sub_1002801
.text:01003474
.text:01003479 retn 4
.text:01003479
.text:01003479 sub_100346A endp
The fastcall convention, as outlined in the comment above, uses registers to pass arguments. I'd bet on a Microsoft or GCC compiler, as they are the most widely used, so check the ECX and EDX registers first.
Microsoft or GCC __fastcall convention (aka __msfastcall) passes the first two arguments (evaluated left to right) that fit into ECX and EDX. Remaining arguments are pushed onto the stack from right to left.
http://en.wikipedia.org/wiki/X86_calling_conventions#fastcall
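As a rough illustration of the rule quoted above, this sketch works out where each argument of a __fastcall function would end up. The function name and the (name, size) input format are invented for the example, and "fits" is simplified to mean 4 bytes or smaller.

```python
def fastcall_layout(args):
    """args: list of (name, size_in_bytes) in source order.

    Returns (registers, pushes): which argument each register carries, and
    the order in which the remaining arguments are pushed.
    """
    registers = {}
    free = ["ECX", "EDX"]
    stack_args = []
    for name, size in args:          # scan left to right
        if free and size <= 4:       # first two arguments that fit in a register
            registers[free.pop(0)] = name
        else:
            stack_args.append(name)
    # The remaining arguments are pushed right to left.
    return registers, list(reversed(stack_args))


regs, pushes = fastcall_layout([("a", 4), ("b", 8), ("c", 4), ("d", 4)])
print(regs, pushes)   # prints "{'ECX': 'a', 'EDX': 'c'} ['d', 'b']"
```

So for a one-argument procedure like the one in the question, a fastcall function would receive its argument in ECX and nothing would be pushed.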