x64 disassemblers IDA and WinDbg: IDA doesn't show x64 opcodes - reverse-engineering

So I just started learning WinDbg, upgrading from OllyDbg to 64-bit, and while studying something weird happened: in WinDbg I see all the RXX registers and opcodes, while in IDA I still see the EXX opcodes when debugging the same EXE (notepad.exe, for instance).
Does anyone have any idea why that is?
Example:
WINDBG:
0:000> u notepad!_security_init_cookie L5
notepad!_security_init_cookie:
00000000`ffaf3380 48895c2418 mov qword ptr [rsp+18h],rbx
00000000`ffaf3385 57 push rdi
00000000`ffaf3386 4883ec20 sub rsp,20h
00000000`ffaf338a 488b05e7cc0000 mov rax,qword ptr [notepad!_security_cookie (00000000ffb00078)]
00000000`ffaf3391 488364243000 and qword ptr [rsp+30h],0
IDA:
___security_init_cookie proc near ; CODE XREF: _WinMainCRTStartup↑p
.text:01003053 8B FF mov edi, edi
.text:01003055 55 push ebp
.text:01003056 8B EC mov ebp, esp
.text:01003058 83 EC 10 sub esp, 10h
.text:0100305B A1 10 C0 00 01 mov eax, ___security_cookie
[Screenshot: WinDbg on the left, IDA on the right]

There are two versions of IDA included in your installation. Please confirm that you are using the 64-bit version of IDA (e.g., idaq64.exe).
If the PE file being disassembled is 64-bit, and the IDA version being used is the one designed for 64-bit disassembly, then you will indeed see the correct registers. If not, then most likely one of these conditions is not true.

You have disassembled the 32-bit Notepad in IDA.
Did you open notepad.exe from system32? In that case IDA got the 32-bit version: IDA itself is a 32-bit executable, so it is subject to WoW64 file system redirection, and its accesses to system32 are silently redirected to SysWOW64.
The easiest way to "fix" this is to copy the file out of the system32 directory to somewhere else and open it from there.
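One way to confirm which build of notepad.exe a tool actually received is to read the Machine field of the PE header. Below is a minimal Python sketch, assuming the standard PE layout (e_lfanew at offset 0x3C, then the "PE\0\0" signature followed by the 16-bit Machine value). The names pe_machine and fake_header are made up for this illustration, and the header it checks is synthetic, not a real notepad.exe:

```python
import struct

# Machine values from the PE/COFF format for the two cases discussed here.
MACHINES = {0x014C: "x86 (32-bit)", 0x8664: "x64 (64-bit)"}

def pe_machine(data: bytes) -> str:
    """Return a label for the Machine field of a PE image held in `data`."""
    if data[:2] != b"MZ":
        raise ValueError("not an MZ/PE file")
    # e_lfanew: file offset of the PE signature, stored at 0x3C.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("PE signature not found")
    (machine,) = struct.unpack_from("<H", data, pe_offset + 4)
    return MACHINES.get(machine, hex(machine))

def fake_header(machine: int) -> bytes:
    """Build a tiny synthetic header (hypothetical, for demonstration only)."""
    header = bytearray(0x48)
    header[0:2] = b"MZ"
    struct.pack_into("<I", header, 0x3C, 0x40)   # PE signature at 0x40
    header[0x40:0x44] = b"PE\0\0"
    struct.pack_into("<H", header, 0x44, machine)
    return bytes(header)

print(pe_machine(fake_header(0x8664)))
print(pe_machine(fake_header(0x014C)))
```

On a real system you would pass the first few kilobytes of the file instead of fake_header(...). Keep in mind that a 32-bit Python process reading from system32 would be redirected in the same way as IDA.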

Related

Porting from 32 to 64-bit by just changing all the register names from eXX to rXX makes factorial return 0?

How fortunate it is for all of us learning the art of computer programming to have access to a community such as Stack Overflow! I have made the decision to take up the task of learning how to program computers and I am doing so by the knowledge of an e-book called 'Programming From the Ground Up', which teaches the reader how to create programs in the assembly language within the GNU/Linux environment.
My progress in the book has come to the point of creating a program which computes the factorial of the integer 4 with a function, which I have made and done without any error caused by the assembler of GCC or caused by running the program. However, the function in my program does not return the right answer! The factorial of 4 is 24, but the program returns a value of 0! Rightly speaking, I do not know why this is!
Here is the code for your consideration:
.section .data
.section .text
.globl _start
.globl factorial
_start:
push $4 #this is the function argument
call factorial #the function is called
add $4, %rsp #the stack is restored to its original
#state before the function was called
mov %rax, %rbx #this instruction will move the result
#computed by the function into the rbx
#register and will serve as the return
#value
mov $1, %rax #1 must be placed inside this register for
#the exit system call
int $0x80 #exit interrupt
.type factorial, #function #defines the code below as being a function
factorial: #function label
push %rbp #saves the base-pointer
mov %rsp, %rbp #moves the stack-pointer into the base-
#pointer register so that data in the stack
#can be referenced as indexes of the base-
#pointer
mov $1, %rax #the rax register will contain the product
#of the factorial
mov 8(%rbp), %rcx #moves the function argument into %rcx
start_loop: #the process loop begins
cmp $1, %rcx #this is the exit condition for the loop
je loop_exit #if the value in %rcx reaches 1, exit loop
imul %rcx, %rax #multiply the current integer of the
#factorial by the value stored in %rax
dec %rcx #reduce the factorial integer by 1
jmp start_loop #unconditional jump to the start of loop
loop_exit: #the loop exit begins
mov %rbp, %rsp #restore the stack-pointer
pop %rbp #remove the saved base-pointer from stack
ret #return
TL;DR: the factorial of the return address overflowed %rax, leaving 0, because you ported wrong.
Porting 32-bit code to 64-bit is not as simple as changing all the register names. That might get it to assemble, but as you found, even this simple program behaves differently. In x86-64, push %reg and call both push 64-bit values and modify rsp by 8. You would see this if you single-stepped your code with a debugger. (See the bottom of the x86 tag wiki for info on using gdb for asm.)
You're following a book that uses 32-bit examples, so you should probably just build them as 32-bit executables instead of trying to port them to 64-bit before you know how.
Your sys_exit() using the 32-bit int 0x80 ABI still works (What happens if you use the 32-bit int 0x80 Linux ABI in 64-bit code?), but you will run into trouble with system calls if you try to pass 64-bit pointers. Use the 64-bit ABI.
You will also run into problems if you want to call any library functions, because the standard function-calling convention is different, too. See Why parameters stored in registers and not on the stack in x86-64 Assembly?, and the 64-bit ABI link, and other calling-convention docs in the x86 tag wiki.
But you're not doing any of that, so the problem with your program simply comes down to not accounting for the doubled "stack width" in x86-64. Your factorial function reads the return address as its argument.
Here's your code, commented to explain what it actually does
push $4 # rsp-=8. (rsp) = qword 4
# non-standard calling convention with args on the stack.
call factorial # rsp-=8. (rsp) = return address. RIP=factorial
add $4, %rsp # misalign the stack, so it's pointing to the top half of the 4 you pushed earlier.
# if this was in a function that wanted to return, you'd be screwed.
mov %rax, %rbx # copy return value to first arg of system call
mov $1, %rax #eax = __NR_EXIT from asm/unistd_32.h, wasting 2 bytes vs. mov $1, %eax
int $0x80 # 32-bit ABI system call, eax=call number, ebx=first arg. sys_exit(factorial(4))
So the caller is sort of fine (for the non-standard 64-bit calling convention you've invented that passes all args on the stack). You might as well omit the add to %rsp entirely, since you're about to exit without touching the stack any further.
.type factorial, #function #defines the code below as being a function
factorial: #function label
push %rbp #rsp-=8, (rsp) = rbp
mov %rsp, %rbp # make a traditional stack frame
mov $1, %rax #retval = 1. (Wasting 2 bytes vs. the exactly equivalent mov $1, %eax)
mov 8(%rbp), %rcx #load the return address into %rcx
... and calculate the factorial
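The stack layout described in those comments can be modeled in a few lines of Python. This is only a sketch: the starting rsp and the return address 0x4000C7 are made-up values for illustration, and the helper name trace_call is not from the original code.

```python
def trace_call(return_address: int, rsp: int = 0x7FFFFFFFE000):
    """Model the qword pushes made before factorial reads its argument."""
    memory = {}

    def push(value: int) -> None:
        nonlocal rsp
        rsp -= 8              # every push/call moves rsp by 8 in 64-bit mode
        memory[rsp] = value

    push(4)                   # push $4
    push(return_address)      # call factorial
    push(0)                   # push %rbp (the saved value is irrelevant here)
    rbp = rsp                 # mov %rsp, %rbp
    return memory, rbp

# 0x4000C7 is a hypothetical return address, chosen only for the demo.
memory, rbp = trace_call(return_address=0x4000C7)
print(hex(memory[rbp + 8]))   # what the ported code loads into %rcx
print(memory[rbp + 16])       # where the argument actually lives
```

In this model 8(%rbp) holds the return address and the pushed 4 sits at 16(%rbp), which is the offset a straight port would have needed.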
For static executables (and dynamically linked executables that aren't ASLR enabled with PIE), _start is normally at 0x4000c0. Your program will still run nearly instantaneously on a modern CPU, because 0x4000c0 iterations * 3 cycles of imul latency is still only about 12.6 million core clock cycles. On a 4GHz CPU, that's about 3 milliseconds of CPU time.
If you'd made a position-independent executable by linking with gcc foo.o on a recent distro, _start would have an address like 0x5555555545a0, and your function would have taken ~70368 seconds to run on a 4GHz CPU with 3-cycle imul latency.
4194496! includes many even numbers, so its binary representation has many trailing zeros. The whole %rax will be zero by the time you're done multiplying by every number from 0x4000c0 down to 1.
The exit status of a Linux process is only the low 8 bits of the integer you pass to sys_exit() (because the wstatus is only a 32-bit int and includes other stuff, like what signal ended the process. See wait4(2)). So even with small args, it doesn't take much.
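The trailing-zeros argument can be checked with a small Python model of the loop. This is a sketch rather than a trace of the real program: wrapped_factorial mimics the imul/dec loop with 64-bit wraparound, and first_zero (a helper invented here) finds the smallest n whose factorial is zero in the masked bits.

```python
MASK64 = (1 << 64) - 1

def wrapped_factorial(n: int) -> int:
    """Multiply n, n-1, ..., 2 the way the loop does, keeping only 64 bits."""
    if n < 1:
        raise ValueError("the loop only terminates quickly for n >= 1")
    product = 1
    while n != 1:
        product = (product * n) & MASK64   # imul keeps the low 64 bits
        n -= 1
    return product

def first_zero(mask: int) -> int:
    """Smallest n whose factorial has all bits under `mask` equal to zero."""
    n, product = 1, 1
    while product & mask:
        n += 1
        product *= n
    return n

print(wrapped_factorial(4))   # 24
print(first_zero(MASK64))     # 66
print(first_zero(0xFF))       # 10
```

So any starting value of 66 or more leaves all of %rax zero, and anything from 10 upward already gives an exit status of 0, because only the low 8 bits survive.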

How are functions encoded/stored in memory?

I understand how things like numbers and letters are encoded in binary, and thus can be stored as 0's and 1's.
But how are functions stored in memory? I don't see how they could be stored as 0's and 1's, and I don't see how something could be stored in memory as anything besides 0's and 1's.
They are in fact stored in memory as 0's and 1's.
Here is a real world example:
int func(int a, int b) {
return (a + b);
}
Here is an example of 32-bit x86 machine instructions that a compiler might generate for the function (in a text representation known as assembly code):
func:
push ebp
mov ebp, esp
mov edx, [ebp+8]
mov eax, [ebp+12]
add eax, edx
pop ebp
ret
Going into how each of these instructions works is beyond the scope of this question, but each one of these symbols (such as add, pop, mov, etc) and their parameters are encoded into 1's and 0's. This table shows many of the Intel instructions and a summary of how they are encoded. See also the x86 tag wiki for links to docs/guides/manuals.
So how does one go about converting code from text assembly into machine-readable bytes (aka machine code)? Take for example, the instruction add eax, edx. This page shows how the add instruction is encoded. eax and edx are something called registers, spots in the processor used to hold information for processing. Variables in computer programming will often map to registers at some point. Because we are adding registers and the registers are 32-bit, we select the opcode 00000001 (see also Intel's official instruction-set reference manual entry for ADD, which lists all the forms available).
The next step is specifying the operands. This section of the same previous page shows how this is done with the example "add ecx, eax", which is very similar to our own. The first two bits have to be '11' to show we are adding registers. The next 3 bits specify the first register; in our case we pick edx rather than the eax in their example, which gives us '010'. The next 3 bits specify our eax ('000'), so we have a final result of
00000001 11010000
Which is 01 D0 in hexadecimal. A similar process can be applied to converting any instruction into binary. The tool used to do this automatically is called an assembler.
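The same steps can be written as a tiny encoder. This Python sketch only handles the register-to-register form of opcode 01 (ADD r/m32, r32); the function name encode_add_reg_reg is made up for this example, and a real assembler covers far more cases.

```python
# Register numbers used in the ModRM byte for 32-bit general-purpose registers.
REG32 = {"eax": 0, "ecx": 1, "edx": 2, "ebx": 3,
         "esp": 4, "ebp": 5, "esi": 6, "edi": 7}

def encode_add_reg_reg(dst: str, src: str) -> bytes:
    """Encode `add dst, src` using opcode 01 /r (ADD r/m32, r32)."""
    opcode = 0x01
    # mod = 11 (register-direct), reg field = source, r/m field = destination.
    modrm = (0b11 << 6) | (REG32[src] << 3) | REG32[dst]
    return bytes([opcode, modrm])

encoded = encode_add_reg_reg("eax", "edx")
print(" ".join(f"{b:02X}" for b in encoded))   # 01 D0
print(" ".join(f"{b:08b}" for b in encoded))   # 00000001 11010000
```

Calling encode_add_reg_reg("ecx", "eax") gives 01 C1, the encoding of the "add ecx, eax" example mentioned above.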
So, running the above assembly code through an assembler produces the following output:
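66 55 66 89 E5 66 67 8B 55 08 66 67 8B 45 0C 66 01 D0 66 5D C3
Note the 01 D0 near the end of the string; this is our "add" instruction. (The 66 and 67 bytes are operand-size and address-size prefixes, which show up when 32-bit instructions are assembled in 16-bit mode; the 32-bit listing below does not have them.) Converting machine-code bytes back into text assembly-language mnemonics is called disassembling: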
address | machine code | disassembly
0: 55 push ebp
1: 89 e5 mov ebp, esp
3: 8b 55 08 mov edx, [ebp+0x8]
6: 8b 45 0c mov eax, [ebp+0xc]
9: 01 d0 add eax, edx
b: 5d pop ebp
c: c3 ret
Addresses start at zero because this is only a .o, not a linked binary. So they're just relative to the start of the file's .text section.
You can see this for any function you like on the Godbolt Compiler Explorer (or on your own machine on any binary, freshly-compiled or not, using a disassembler).
You may notice there is no mention of the name "func" in the final output. This is because in machine code, a function is referenced by its location in RAM, not its name. The compiler-output object file may have a func entry in its symbol table referring to this block of machine code, but the symbol table is read by software, not something the CPU hardware can decode and run directly. The bit-patterns of the machine code are seen and decoded directly by transistors in the CPU.
Sometimes it is hard for us to understand how computers encode instructions like this at a low level because as programmers or power users, we have tools to avoid ever dealing with them directly. We rely on compilers, assemblers, and interpreters to do the work for us. Nonetheless, anything a computer ever does must eventually be specified in machine code.
Functions are made of instructions, such as bytecode or machine code. Instructions are numbers, which can be encoded in binary.
A good introduction to this is Charles Petzold's book Code.
I will explain how functions are stored in the easiest way possible. You will be surprised by the amazing simplicity of all this at the end of this contribution. This is the most fundamental explanation, and any type of computer works in roughly the same way.
The only part of a computer that can perform operations on data, i.e. addition, subtraction, multiplication and division, is the arithmetic logic unit (ALU). Every data manipulation (any sort of math, or any formula) is ultimately made up of these operations.
Now let us look at the basic structure of an instruction in binary. If we are working on a 32 bit machine, an instruction will take the form of:
1 001 32-bit address 32-bit address
1 (if this bit is one, the instruction is diverted to the logic unit for calculation; if zero, we are simply moving data between the two memory addresses that follow) 001 (these 3 bits determine whether we are adding (001), subtracting (010), multiplying (011), or dividing (100) in this instruction cycle) (32-bit memory address of the first memory location) (32-bit memory address of the second memory location)
A function is basically a string of instructions describing how to manipulate data in defined memory locations.
Let us take a random function that adds a number, then multiplies. Its string of instructions will be:
(let me use MA to mean memory address)
1 001 MAone MAtwo (adds value in MAone to value in MAtwo and stores resultant in MAtwo, so that the next instruction can use it)
1 011 MAtwo MAthree (multiplies value in MAtwo with value in MAthree and stores resultant in MAthree )
Returns value in MAthree
So the only difference in how functions are stored is that they are stored with 1 in the leftmost bit, so that the CPU knows it's a function that needs logical operations and diverts it to the ALU.
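To make this simplified format concrete, here is a toy Python model of it. It is not how any real CPU encodes instructions. The field widths follow the description above (1 + 3 + 32 + 32 bits), while the choices that a result is stored at the second address and that a move copies from the first address to the second are assumptions made for this sketch.

```python
ADD, SUB, MUL, DIV = 0b001, 0b010, 0b011, 0b100

def encode(alu: int, op: int, addr_a: int, addr_b: int) -> int:
    """Pack the fields into one 68-bit integer: 1 + 3 + 32 + 32 bits."""
    return (alu << 67) | (op << 64) | (addr_a << 32) | addr_b

def decode(word: int):
    """Unpack a word produced by encode()."""
    alu = (word >> 67) & 1
    op = (word >> 64) & 0b111
    return alu, op, (word >> 32) & 0xFFFFFFFF, word & 0xFFFFFFFF

def run(program, memory):
    """Execute each word; results go to the second address (an assumption)."""
    ops = {ADD: lambda a, b: a + b, SUB: lambda a, b: a - b,
           MUL: lambda a, b: a * b, DIV: lambda a, b: a // b}
    for word in program:
        alu, op, addr_a, addr_b = decode(word)
        if alu:
            memory[addr_b] = ops[op](memory[addr_a], memory[addr_b])
        else:
            memory[addr_b] = memory[addr_a]   # plain move between addresses
    return memory

# Hypothetical addresses standing in for MAone, MAtwo and MAthree.
MA_ONE, MA_TWO, MA_THREE = 0x10, 0x14, 0x18
program = [encode(1, ADD, MA_ONE, MA_TWO), encode(1, MUL, MA_TWO, MA_THREE)]
memory = run(program, {MA_ONE: 2, MA_TWO: 3, MA_THREE: 4})
print(f"{program[0]:068b}"[:4])   # flag bit followed by the 3 operation bits
print(memory[MA_THREE])           # (2 + 3) * 4
```

The "function" here is nothing more than the list of two integers in program, which is the point of the explanation: the instructions are numbers like any other data.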

Function Parameters in ASM x86 FASM

How do I pass parameters to a function in Assembly?
I did push Last Param, push Second Param, push First Param..
But I cannot access the parameters within Meh Function.. What I'm doing crashes the program..
format PE console ;Format PE OUT GUI 4.0
entry main
include 'macro/import32.inc'
section '.idata' import data readable ;Import Section.
library msvcrt,'msvcrt.dll'
import msvcrt, printf, 'printf',\
exit,'exit', getchar, 'getchar'
section '.data' data readable writeable ;Constants/Static Section.
InitialValue dd 0
section '.code' code readable executable
main:
push 67
push 66
push 65
call MEH
call [getchar]
mov eax, 0
ret 0
MEH:
push ebx
mov ebp, esp
sub esp, 0
mov eax, [ebp + 8] ; Trying to print first parameter..
push eax
call [printf]
add esp, eax
mov esp, ebp
pop ebx
ret
Small additional notes.
The proper header/footer of the procedure uses push/pop ebp:
MEH:
push ebp
mov ebp, esp
mov esp, ebp
pop ebp
ret
The reason is that we need to save/restore ebp register before using it as a pointer to the arguments and local variables.
Second, the CCALL calling convention, where the caller restores the stack pointer after the procedure returns, is common for C/C++, but not for assembly programming. The reason is simple: a compiler can reliably compute how many parameters were pushed on the stack, whereas in a hand-written assembly program this convention makes the code harder to read.
Better approach is to use STDCALL calling convention:
MEH:
push ebp
mov ebp, esp
mov esp, ebp
pop ebp
retn 12 ; how many bytes to be automatically
; removed from the stack after return.
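The difference between the two conventions is only about who adds the 12 bytes back to ESP. A rough Python model of the stack pointer arithmetic, assuming three dword arguments; the function name run_call and the starting ESP value are illustrative only:

```python
def run_call(ret_imm: int, caller_cleanup: int, esp: int = 0x0018007C) -> int:
    """Return the net change in ESP after three pushes, a call and a return."""
    start = esp
    esp -= 3 * 4            # three pushes of 4 bytes each
    esp -= 4                # call pushes the return address
    esp += 4 + ret_imm      # ret pops the return address, `retn N` pops N more
    esp += caller_cleanup   # CCALL only: add esp, 12 after the call
    return esp - start      # 0 means the stack is balanced

print(run_call(ret_imm=0, caller_cleanup=12))   # CCALL: caller cleans up
print(run_call(ret_imm=12, caller_cleanup=0))   # STDCALL: retn 12
print(run_call(ret_imm=0, caller_cleanup=0))    # nobody cleans up
```

The first two cases end balanced, while the third leaves ESP 12 bytes lower than it started, so a later ret in the caller would pop one of the arguments instead of its own return address.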
An even better practice is to use macros to automate the creation of the standard procedure elements and to provide human-readable labels for the arguments and local variables. For example, the macros provided in the FreshLib library have the following syntax:
proc MEH, .arg1, .arg2, .arg3
; define local variables here, if needed.
begin
; place your code here without headers and footers
return ; will clean the stack automatically.
endp
; pushes the arguments in the stack and call MEH
stdcall MEH, 65, 66, 67
The standard macro library provided with FASM packages has a slightly different syntax, which is covered in detail in the FASM programmer's manual.
Let's see...
Say your ESP is 0x0018007C at the outset, then after the three pushes you have
00180078: 67
00180074: 66
00180070: 65
then you call MEH, which immediately pushes ebx so now you have the stack as
00180078: 67
00180074: 66
00180070: 65
0018006C: return address
00180068: ebx value
you now load EBP with ESP = 00180068
sub esp,0 does nothing
mov eax, [ebp+8] ~ 00180068 + 8 = 00180070 = 65
so you get the last value pushed (65) rather than the first one (67)
call [printf]
Here comes your problem, though:
add esp, eax
What good was this supposed to do? Assuming printf preserves this argument passed in (which it is incidentally not required to do), why would you add the argument to the stack pointer? That is sure to mess up your return.
What you want to do is restore esp to the value of ebp and pop back the saved ebx value.
If the calling convention for printf() is correct (it is for 32-bit MinGW and 32-bit gcc on Linux), then you're completely ignoring what the function expects and there's no surprise in you not getting the desired output.
The function's prototype is:
int printf(const char* format, ...);
format, the first parameter, is a pointer to an ASCIIZ string, which contains the text to print and/or special tokens like %d to be replaced by the appropriate interpretation of the optional parameters following format.
So, if you want printf() to print 'A', then this is what you need to do in C:
printf("A");
or
printf("%c", 'A');
And here's how you'd do the same in assembly:
myformatstring db "A", 0 ; this line goes into section .data
push myformatstring ; push address of the string
call [printf]
add esp, 4 ; remove all parameters from the stack
or
myformatstring db "%c", 0 ; this line goes into section .data
push 'A'
push myformatstring ; push address of the string
call [printf]
add esp, 2*4 ; remove all parameters from the stack

How to keep assembly instructions of asm functions in c file unchanged when creating the executable for mac os x lion 10.7

Here is the asm function in a C file, copied from the inject-bundle project:
asm void mach_thread_trampoline(void)
{
// Call _pthread_set_self with pthread_t arg already on stack
pop eax
call eax
add esp, 4
// Call cthread_set_self with pthread_t arg already on stack
pop eax
call eax
add esp, 4
// Call function with return address and arguments already on stack
pop eax
jmp eax
}
After compiling with gcc (I am working on Mac OS X Lion 10.7.4):
$gcc -m32 -fasm-blocks -o a a.c -g
Running the target under gdb and looking at the contents of mach_thread_trampoline:
(gdb) x/17i mach_thread_trampoline
0x1f80 <mach_thread_trampoline>: pop %eax
0x1f81 <mach_thread_trampoline+1>: call *%eax
0x1f83 <mach_thread_trampoline+3>: mov %esp,%eax
0x1f85 <mach_thread_trampoline+5>: mov %eax,%esp
0x1f87 <mach_thread_trampoline+7>: add $0x4,%esp
0x1f8a <mach_thread_trampoline+10>: mov %esp,%eax
0x1f8c <mach_thread_trampoline+12>: mov %eax,-0x8(%ebp)
0x1f8f <mach_thread_trampoline+15>: pop %eax
0x1f90 <mach_thread_trampoline+16>: call *%eax
0x1f92 <mach_thread_trampoline+18>: mov %esp,%eax
0x1f94 <mach_thread_trampoline+20>: mov %eax,%esp
0x1f96 <mach_thread_trampoline+22>: add $0x4,%esp
0x1f99 <mach_thread_trampoline+25>: mov %esp,%eax
0x1f9b <mach_thread_trampoline+27>: mov %eax,-0x8(%ebp)
0x1f9e <mach_thread_trampoline+30>: pop %eax
0x1f9f <mach_thread_trampoline+31>: jmp *%eax
0x1fa1 <mach_thread_trampoline+33>: ret
The compiler added some instructions to the mach_thread_trampoline function.
Is there any way to keep the asm function unchanged?
It doesn't appear that you can do that with gcc, but you could write the function in a.asm & assemble a.asm with nasm. a.asm would look something like:
[BITS 32]
global _mach_thread_trampoline
_mach_thread_trampoline:
; Call _pthread_set_self with pthread_t arg already on stack
pop eax
call eax
add esp, 4
; Call cthread_set_self with pthread_t arg already on stack
pop eax
call eax
add esp, 4
; Call function with return address and arguments already on stack
pop eax
jmp eax
To use it from C, you'd need a.h containing:
void mach_thread_trampoline(void);
Note that the function is named _mach_thread_trampoline in a.asm and mach_thread_trampoline in a.h. This is because C function names have an underscore prepended to them on OS X. If you moved it to Linux, the assembly would use mach_thread_trampoline as well, because Linux doesn't prepend anything.
If you have asm functions of non-trivial size, you should put it / them in separate .S files where you can use .intel_syntax noprefix if you like, or use NASM like another answer suggests.
But you can use an asm(""); statement at global scope to work around any compile bugs / weirdness if you want to stick a small function written in GAS syntax into a .c or .cpp, see the last section of this answer.
asm as a function type appears to be an Apple extension, it's not supported by (Linux) gcc or clang on the Godbolt compiler explorer.
It looks like it serves the same purpose as __attribute__ ((naked)) (supported on x86 by mainstream (not just Apple) clang 3.3 and later, and recent GCC). But @MichaelPetch reports that gcc on OS X 10.7 (gcc version 4.2.1 (Based on Apple Inc. build 5658) (LLVM build 2335.15.00)) does not support __attribute__ ((naked)).
(Newer OS X make gcc an alias for clang, but older OS X had custom versions of GCC).
It appears that there's a compiler bug with the asm void func(void) { syntax in the compiler version you're using; it doesn't make sense that it can assume the existence of a frame pointer and insert stuff like mov %eax,-0x8(%ebp). I don't see how that asm extension could be useful if the compiler's allowed to do that. Maybe it doesn't happen at -O3, so you might be able to work around it by enabling optimization.
BTW, regular (non-Apple) GCC doesn't support MSVC-style asm blocks, though, only GNU C asm("instructions"); asm statements. But Apple's version of gcc that uses LLVM does support it.
__attribute__ ((naked)) (not supported by OS X 10.7 gcc)
If you have a buggy compiler that introduces extra instructions into asm blocks, IDK if this will make any difference. (Even if it was supported, which it isn't on OS X 10.7 gcc)
This syntax works on modern clang. (Not gcc, because it still doesn't support -fasm-blocks)
This compiles correctly on Godbolt with clang 3.3. I used "binary" output to make sure I was seeing the real machine code in the object file, so I had to use -nostdlib to get it to link. It avoids distorting your asm with -O0 or -O3, but I'm using a very different compiler, so this syntax change is probably not the relevant difference.
__attribute__ ((naked)) void mach_thread_trampoline(void)
{
// asm block *inside* a function
asm {
// Call _pthread_set_self with pthread_t arg already on stack
pop eax
call eax
add esp, 4
// Call cthread_set_self with pthread_t arg already on stack
pop eax
call eax
add esp, 4
// Call function with return address and arguments already on stack
pop eax
jmp eax
}
}
Portable GNU C asm("");
You can put Basic ASM statements at global scope in GNU C.
This syntax probably will work around the compiler bug on OS X 10.7, because we're not using -fasm-blocks at all, and the compiler has no chance to think it's inside a function. It should just assemble this code in the .text section.
void mach_thread_trampoline(void); // prototype
// definition: name mangling / leading underscore must be done manually
asm(".globl _mach_thread_trampoline\n\t"
"_mach_thread_trampoline:\n\t"
"pop %eax\n\t"
"call *%eax\n\t"
...
"jmp *%eax"
);
So a naked function could have advantages in C++ where the name mangling is non-trivial. Probably naked functions are automatically noinline.
If you want really portable, you can use dialect-alternatives like jmp {*%eax|eax} so your code works if compiled with -masm=intel or not.
I don't recommend using .intel_syntax noprefix inside an asm block, and then switching back to .att_syntax at the end for the rest of the compiler's code. It's a hack that breaks if you (or a tool like the Godbolt compiler explorer) compiles with -masm=intel.

Ollydbg instructions before program

I am new to reverse engineering, and I have been looking at a simple program:
char* a = "hello world";
printf(a);
However, when I open this in ollydbg, I am not taken right to the assembly as I would have been in gdb, there are many more instructions first. I was wondering why this was happening.
Thanks!
Depending on how you attach to the program with Olly, you'll be taken to one of two places (if no errors occurred):
The module entry point (aka the system glue and CRT wrapper for main/WinMain/DllMain): this occurs when you start a program with olly.
NtUserBreakPoint: this is when you attach to an existing process.
To navigate to where you want, you can use ctrl + e to bring up the modules window and select the module you want from there. Then use ctrl + n to bring up the symbols window for your current module (note: for non-exported symbols to be available, the PDBs need to be available, or you need to perform an object scan of your .obj files for that build).
If you're taken to the ModuleEntryPoint, you can also just spelunk down the call chain (generally you want the second call/jmp). This gets you to the CRT entry point; from there just look for a call with 3/4/3 args, which will be main/WinMain/DllMain:
from here:
Blackene.<ModuleEntryPoint> 004029C3 E8 FC030000 CALL Blackene.__security_init_cookie
004029C8 ^ E9 D7FCFFFF JMP Blackene.__tmainCRTStartup
we go to here:
Blackene.__tmainCRTStartup 004026A4 6A 58 PUSH 58
004026A6 68 48474000 PUSH Blackene.00404748
004026AB E8 1C060000 CALL Blackene.__SEH_prolog4
004026B0 33DB XOR EBX,EBX
then scroll down here:
004027D3 6A 0A PUSH 0A
004027D5 58 POP EAX
004027D6 50 PUSH EAX
004027D7 56 PUSH ESI
004027D8 6A 00 PUSH 0
004027DA 68 00004000 PUSH Blackene.00400000
004027DF E8 2CF2FFFF CALL Blackene.WinMain
I'm assuming ollydbg 1.10 is being used.
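When following the call chain by hand, it helps to know that the E8 (call) and E9 (jmp) instructions in the listing store a signed 32-bit displacement relative to the address of the next instruction. A small Python sketch that recomputes the targets from the bytes shown above; the helper name branch_target is made up for this example:

```python
import struct

def branch_target(address: int, encoding: str) -> int:
    """Compute the destination of an E8 (call) or E9 (jmp) rel32 instruction."""
    raw = bytes.fromhex(encoding)
    if len(raw) != 5 or raw[0] not in (0xE8, 0xE9):
        raise ValueError("expected a 5-byte E8/E9 instruction")
    (displacement,) = struct.unpack("<i", raw[1:])   # signed, little-endian
    # The displacement is relative to the end of the instruction.
    return (address + len(raw) + displacement) & 0xFFFFFFFF

print(hex(branch_target(0x004029C8, "E9D7FCFFFF")))   # JMP __tmainCRTStartup
print(hex(branch_target(0x004027DF, "E82CF2FFFF")))   # CALL WinMain
```

The first result is 0x4026a4, which matches the address of __tmainCRTStartup in the listing; the second gives the address Olly labels as WinMain.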