Self-modifying program: Why does it raise an exception?

Just for the purposes of experimenting and playing around, I wrote the following short x64 assembly program:
.code
AsmFun proc
mov rax, MyLabel
mov byte ptr [rax], 0C3h ; C3 is x64 machine code for "ret"
MyLabel:
mov rax, 239847 ; This isn't "ret"
AsmFun endp
end
(I then called the code from C.)
It compiles/assembles/links just fine, but when I step through the program, Visual Studio complains that an unhandled exception has been raised: "Access violation writing location [MyLabel]", where of course it doesn't actually say "[MyLabel]", but rather the address that the label happens to be at in memory.
Why is this happening? Is it a Windows thing that was put in place to avoid security exploits?

I live in Linux world, but perhaps you can adapt what I've found out.
Memory pages are generally read-only if they have execute permission. How I got around this was with mmap() and mprotect()... I'm sure there's something similar in Windows. It's a good bet the Mono source code would shed some light.
I used mmap() to allocate a new page with write access (but not read or execute). I populated it, then called mprotect() to change it to read-only and executable.
Don't forget... there are registers you want to avoid trashing. See the ABI documentation for further details.
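For what it's worth, here is a minimal C sketch of the mmap()/mprotect() sequence described above, assuming a POSIX system and GCC or Clang (for __builtin___clear_cache). The function name make_executable_copy is made up for illustration, and the sketch only handles code that fits in a single page. On Windows, VirtualAlloc() and VirtualProtect() play the corresponding roles.

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: copy `len` bytes of machine code into a fresh page,
 * then flip the page from writable to read+execute.
 * Returns NULL on failure. The caller must supply bytes that are valid
 * machine code for the CPU the program actually runs on. */
void *make_executable_copy(const unsigned char *code, size_t len)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    if (len == 0 || len > page)
        return NULL;                      /* sketch: one page only */

    /* Step 1: get a page that is writable but NOT executable. */
    void *mem = mmap(NULL, page, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED)
        return NULL;

    /* Step 2: populate it. */
    memcpy(mem, code, len);

    /* Needed on CPUs with separate instruction caches (e.g. ARM);
     * effectively a no-op on x86. */
    __builtin___clear_cache((char *)mem, (char *)mem + len);

    /* Step 3: drop write permission and add execute permission. */
    if (mprotect(mem, page, PROT_READ | PROT_EXEC) != 0) {
        munmap(mem, page);
        return NULL;
    }
    return mem;
}
```

On x86-64, passing the single byte 0xC3 ("ret") and casting the result to void (*)(void) should give you a callable function that returns immediately. The page is never writable and executable at the same time, which is the point of doing it in two steps.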


In an assembly program, the .text section is loaded at 0x08048000; the .data and .bss sections come after that.
What would happen if I don't put an exit syscall in the .text section? Would it lead to the .data and .bss sections being interpreted as code, causing "unpredictable" behavior? When would the program terminate: probably after every "instruction" is executed?
I can easily write a program without the exit syscall, but I don't know how to test whether .data and .bss get executed, because I guess I would have to know the real machine code that is generated under the hood to understand that.
I think this question is more about "How would OS and CPU handle such a scenario?" than assembly language, but it is still interesting to know for assembly programmers etc.
The processor does not know where your code ends. It faithfully executes one instruction after another until execution is redirected elsewhere (e.g. by a jump, call, interrupt, system call, or similar).
If your code ends without jumping elsewhere, the processor continues executing whatever is in memory after your code. It is fairly unpredictable what exactly happens, but eventually, your code typically crashes because it tries to execute an invalid instruction or tries to access memory that it is not allowed to access.
If neither happens and no jump occurs, eventually the processor tries to execute unmapped memory or memory that is marked as “not executable” as code, causing a segmentation violation. On Linux, this raises a SIGSEGV or SIGBUS. When unhandled, these terminate your process and optionally produce core dumps.
If you're curious, run the program under a debugger and look at the disassembly of the faulting instruction.
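To observe the "not executable" case directly, here is a rough C sketch, assuming a POSIX system with no-execute page protection. It forks a child that jumps into a page mapped readable and writable but not executable, then reports which signal killed the child. The function name is invented for this example, and calling a data pointer as a function is deliberately undefined behavior in C; that is the experiment.

```c
#define _DEFAULT_SOURCE
#include <signal.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: returns the number of the signal that killed the
 * child, 0 if the child exited normally, or -1 if the setup failed. */
int signal_from_executing_data(void)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);

    /* A data page: readable and writable, but not executable. */
    void *mem = mmap(NULL, page, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        munmap(mem, page);
        return -1;
    }

    if (pid == 0) {
        /* Child: make sure the default (fatal) actions are in place. */
        signal(SIGSEGV, SIG_DFL);
        signal(SIGBUS, SIG_DFL);
        signal(SIGILL, SIG_DFL);

        /* Treat the data page as code and jump into it. */
        void (*as_code)(void) = (void (*)(void))mem;
        as_code();
        _exit(0);   /* only reached if the jump somehow returned */
    }

    /* Parent: wait for the child and decode how it ended. */
    int status = 0;
    pid_t waited = waitpid(pid, &status, 0);
    munmap(mem, page);
    if (waited != pid)
        return -1;
    return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}
```

On a typical Linux machine I would expect the return value to be SIGSEGV, but the exact signal depends on the platform, so treat that as an expectation rather than a guarantee.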

Why should you keep ESP in EBP inside a call?

I'm reading in Professional Assembly Language by Richard Blum that when you enter a call you should copy the value of the ESP register to EBP, and he also provided the following template:
function_label:
pushl %ebp
movl %esp, %ebp
< normal function code goes here>
movl %ebp, %esp
popl %ebp
ret
I don't understand why this is necessary. When you push something inside the function, you obviously intend to pop it back, thus restoring ESP to its original value.
So why have this template?
And what's the use of the EBP register anyway?
I'm obviously missing something, but what is it?
"When you push something inside the function, you obviously intend to pop it back"
That's just part of the reason for using the stack. The far more common usage is the one that's missing from your snippet: storing local variables. The next thing you commonly see after setting up EBP is a subtraction from ESP by the amount of space required for local variable storage. That's of course easy to balance as well: just add the same amount back in the function epilogue. It gets more difficult when the code also uses things like C99 variable-length arrays or the non-standard but commonly available _alloca() function. Being able to restore ESP from EBP makes this simple.
More to the point perhaps, it is not necessary to set up the stack frame like this. Almost any x86 compiler supports an optimization option called "frame pointer omission", turned on with -fomit-frame-pointer on GCC and /Oy on MSVC. It makes the EBP register available for general use, which can be very helpful on x86 with its dearth of CPU registers.
That optimization has a very grave disadvantage though. Without the EBP register pointing at the start of a stack frame, it gets very difficult to perform stack walks. That matters when you need to debug your code. A stack trace can be very important to find out how your code ended up crashing. Invaluable when you get a "core dump" of a crash from your customer. So valuable that Microsoft agreed to turn off the optimization on Windows binaries to give their customers a shot at diagnosing crashes.
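If it helps to see why restoring ESP from EBP is robust, here is a toy C model of the prologue and epilogue from the template in the question. It is only a simulation: ToyCpu and the toy_* functions are invented for this sketch, the "registers" are array indices rather than addresses, and there is no bounds checking.

```c
#include <stddef.h>
#include <stdint.h>

#define STACK_WORDS 64

/* Toy machine state. The stack grows downward, as on x86. */
typedef struct {
    uint32_t mem[STACK_WORDS];
    size_t esp;   /* index of the current top of stack */
    size_t ebp;   /* index of the current frame base   */
} ToyCpu;

void toy_init(ToyCpu *c)
{
    c->esp = STACK_WORDS;
    c->ebp = STACK_WORDS;
}

void toy_push(ToyCpu *c, uint32_t v) { c->mem[--c->esp] = v; }
uint32_t toy_pop(ToyCpu *c)          { return c->mem[c->esp++]; }

/* pushl %ebp ; movl %esp, %ebp */
void toy_prologue(ToyCpu *c)
{
    toy_push(c, (uint32_t)c->ebp);
    c->ebp = c->esp;
}

/* subl $(4*words), %esp : reserve locals; `words` may be a run-time value,
 * as with alloca or a variable-length array. */
void toy_alloc(ToyCpu *c, size_t words) { c->esp -= words; }

/* movl %ebp, %esp ; popl %ebp */
void toy_epilogue(ToyCpu *c)
{
    c->esp = c->ebp;                 /* discards everything the body left behind */
    c->ebp = (size_t)toy_pop(c);     /* restores the caller's frame base */
}
```

Whatever the function body does between toy_prologue() and toy_epilogue() (any number of pushes, any run-time-sized toy_alloc()), the epilogue puts ESP and EBP back where they were without the body having to track how much it used.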

Determining if XScale is present in a safe way

I have an ARMv5-powered non-XScale device (the SHARP Brain™ electronic dictionary) with Windows Embedded CE 6.0 installed in NAND flash, and I use TCPMP to play my favorite AAC tunes and MPEG-4 movies.
But when I start TCPMP, it sometimes freezes. So I looked into TCPMP and managed to find that the freeze happens when this code is executed:
CheckARMXScale PROC
mov r0,#0x1000000
mov r1,#0x1000000
mar acc0,r0,r1 ; <--- here
mov r0,#32
mov r1,#32
mra r0,r1,acc0
cmp r0,#0x1000000
moveq r0,#1
movne r0,#0
cmp r1,#0x1000000 ;64bit or just 40bit?
moveq r0,#2
mov pc,lr
This code determines whether XScale is present by trying to execute an XScale instruction and catching the "Undefined Instruction" exception if one is raised.
The problem is that somehow the system fails to pass this exception to TCPMP properly, causing TCPMP to freeze. This seems to be caused not by Windows CE itself but by buggy drivers on this device. No driver updates are expected, since running TCPMP on this device is not officially supported.
I posted this issue to 2channel, and some people claimed that this way of determining whether XScale is present is not good, but no one even tried to find a better way. So I googled and read through the ARMv5 Architecture Reference Manual and so on, but I could find nothing. It appears that almost every program that uses the XScale instruction set determines whether XScale is present in the same way.
The question is: is it possible to determine whether the XScale instruction set is present without relying on any exception or on any CPU mode other than user mode?
Try IOCTL_PROCESSOR_INFORMATION.
Or, if switching into kernel mode is an option, read the CP15 register c0, the Main ID register (also known as the ID Code Register or ARM CPUID). The top byte is the implementer code, which will be 0x69 ('i', Intel) for XScale.
Check also this thread.
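Once you have the Main ID register value (in privileged mode it is typically read with "mrc p15, 0, r0, c0, c0, 0"), the implementer check itself is just a shift and a compare. A small C sketch follows; the function names are made up here, and it only checks the implementer byte as described above, which by itself does not prove that the core is an XScale.

```c
#include <stdint.h>

/* The implementer code is bits [31:24] of the CP15 c0 Main ID register. */
uint8_t midr_implementer(uint32_t midr)
{
    return (uint8_t)(midr >> 24);
}

/* 0x69 is ASCII 'i', the implementer code used by Intel. */
int midr_is_intel(uint32_t midr)
{
    return midr_implementer(midr) == 0x69;
}
```

For example, a register value whose top byte is 0x41 ('A', ARM Ltd.) would make midr_is_intel() return 0.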

OS development: How to avoid an infinite loop after an exception routine

For some months I've been working on a "home-made" operating system.
Currently, it boots and goes into 32-bit protected mode.
I've loaded the interrupt table, but haven't set up paging (yet).
Now while writing my exception routines I've noticed that when an instruction throws an exception, the exception routine is executed, but then the CPU jumps back to the instruction which threw the exception! This does not apply to every exception (for example, a div by zero exception will jump back to the instruction AFTER the division instruction), but let's consider the following general protection exception:
MOV EAX, 0x8
MOV CS, EAX
My routine is simple: it calls a function that displays a red error message.
The result: MOV CS, EAX fails -> My error message is displayed -> CPU jumps back to MOV CS -> infinite loop spamming the error message.
I've talked about this issue with a teacher in operating systems and unix security.
He told me he knows Linux has a way around it, but he doesn't know which one.
The naive solution would be to parse the throwing instruction from within the routine, in order to get the length of that instruction.
That solution is pretty complex, and I feel a bit uncomfortable adding a call to a relatively heavy function in every affected exception routine...
Therefore, I was wondering if there is another way around the problem. Maybe there's a "magic" register that contains a bit that can change this behaviour?
--
Thank you very much in advance for any suggestion/information.
--
EDIT: It seems many people wonder why I want to skip over the problematic instruction and resume normal execution.
I have two reasons for this:
First of all, killing a process would be a possible solution, but not a clean one. That's not how it's done in Linux, for example, where (AFAIK) the kernel sends a signal (I think SIGSEGV) but does not immediately break execution. It makes sense, since the application can block or ignore the signal and resume its own execution. It's a very elegant way to tell the application it did something wrong IMO.
Another reason: what if the kernel itself performs an illegal operation? Could be due to a bug, but could also be due to a kernel extension. As I've stated in a comment: what should I do in that case? Shall I just kill the kernel and display a nice blue screen with a smiley?
That's why I would like to be able to jump over the instruction. "Guessing" the instruction size is obviously not an option, and parsing the instruction seems fairly complex (not that I mind implementing such a routine, but I need to be sure there is no better way).
Different exceptions have different causes. Some exceptions are normal, and the exception only tells the kernel what it needs to do before allowing the software to continue running. Examples of this include a page fault telling the kernel it needs to load data from swap space, an undefined instruction exception telling the kernel it needs to emulate an instruction that the CPU doesn't support, or a debug/breakpoint exception telling the kernel it needs to notify a debugger. For these it's normal for the kernel to fix things up and silently continue.
Some exceptions indicate abnormal conditions (e.g. that the software crashed). The only sane way of handling these types of exceptions is to stop running the software. You may save information (e.g. core dump) or display information (e.g. "blue screen of death") to help with debugging, but in the end the software stops (either the process is terminated, or the kernel goes into a "do nothing until user resets computer" state).
Ignoring abnormal conditions just makes it harder for people to figure out what went wrong. For example, imagine instructions to go to the toilet:
1) enter bathroom
2) remove pants
3) sit
4) start generating output
Now imagine that step 2 fails because you're wearing shorts (a "can't find pants" exception). Do you want to stop at that point (with a nice easy to understand error message or something), or ignore that step and attempt to figure out what went wrong later on, after all the useful diagnostic information has gone?
If I understand correctly, you want to skip the instruction that caused the exception (e.g. mov cs, eax) and continue executing the program at the next instruction.
Why would you want to do this? Normally, shouldn't the rest of the program depend on the effects of that instruction being successfully executed?
Generally speaking, there are three approaches to exception handling:
1) Treat the exception as an unrepairable condition and kill the process. For example, division by zero is usually handled this way.
2) Repair the environment and then execute the instruction again. For example, page faults are sometimes handled this way.
3) Emulate the instruction using software and skip over it in the instruction stream. For example, complicated arithmetic instructions are sometimes handled this way.
What you're seeing is characteristic of the General Protection Exception. The Intel System Programming Guide clearly states (6.15 Exception and Interrupt Reference / Interrupt 13 - General Protection Exception (#GP)):
Saved Instruction Pointer
The saved contents of CS and EIP registers point to the instruction that generated the exception.
Therefore, you need to write an exception handler that will skip over that instruction (which would be kind of weird), or just simply kill the offending process with "General Protection Exception at $SAVED_EIP" or a similar message.
I can imagine a few situations in which one would want to respond to a GPF by parsing the failed instruction, emulating its operation, and then returning to the instruction after. The normal pattern would be to set things up so that the instruction, if retried, would succeed, but one might e.g. have some code that expects to access some hardware at addresses 0x000A0000-0x000AFFFF and wish to run it on a machine that lacks such hardware. In such a situation, one might not want to ever bank in "real" memory in that space, since every single access must be trapped and dealt with separately. I'm not sure whether there's any way to handle that without having to decode whatever instruction was trying to access that memory, although I do know that some virtual-PC programs seem to manage it pretty well.
Otherwise, I would suggest that you should have for each thread a jump vector which should be used when the system encounters a GPF. Normally that vector should point to a thread-exit routine, but code which was about to do something "suspicious" with pointers could set it to an error handler that was suitable for that code (the code should unset the vector when leaving the region where the error handler would have been appropriate).
I can imagine situations where one might want to emulate an instruction without executing it, and cases where one might want to transfer control to an error-handler routine, but I can't imagine any where one would want to simply skip over an instruction that would have caused a GPF.
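As a rough user-space analogue of that "jump vector" idea (not kernel code, and not how any particular OS implements it), POSIX lets a signal handler transfer control to a recovery point with sigsetjmp()/siglongjmp() instead of returning to the faulting instruction. The names guarded_read and make_inaccessible_page below are invented for this sketch, and it is single-threaded only.

```c
#define _DEFAULT_SOURCE
#include <setjmp.h>
#include <signal.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

static sigjmp_buf recover_point;            /* plays the role of the jump vector */
static volatile sig_atomic_t guard_active;  /* set only inside the guarded region */

static void on_fault(int sig)
{
    if (guard_active)
        siglongjmp(recover_point, 1);   /* go to the error path, do not retry */
    /* Fault outside the guarded region: restore the default action.
     * Returning re-runs the faulting instruction, which now kills the process. */
    signal(sig, SIG_DFL);
}

/* Read one byte at addr. Returns 0 and stores the byte in *out on success,
 * or -1 if the read faulted. */
int guarded_read(const volatile unsigned char *addr, unsigned char *out)
{
    struct sigaction sa, old_segv, old_bus;
    int rc;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_fault;
    sigemptyset(&sa.sa_mask);
    sigaction(SIGSEGV, &sa, &old_segv);
    sigaction(SIGBUS, &sa, &old_bus);

    if (sigsetjmp(recover_point, 1) == 0) {
        guard_active = 1;
        *out = *addr;            /* may fault */
        rc = 0;
    } else {
        rc = -1;                 /* arrived here via siglongjmp */
    }

    guard_active = 0;            /* "unset the vector" when leaving the region */
    sigaction(SIGSEGV, &old_segv, NULL);
    sigaction(SIGBUS, &old_bus, NULL);
    return rc;
}

/* Helper for experiments: a page that faults on any access. */
void *make_inaccessible_page(void)
{
    void *p = mmap(NULL, (size_t)sysconf(_SC_PAGESIZE), PROT_NONE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}
```

Note that nothing here skips the faulting instruction and carries on with the next one; control goes to a handler that knows the operation failed, which is the distinction the answer above is drawing.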

How exactly do executables work?

I know that executables contain instructions, but what exactly are these instructions? If I want to call the MessageBox API function for example, what does the instruction look like?
Thanks.
Executables are binary files that are understood by the operating system. The executable will contain sections which have data in them. Windows uses the PE format. The PE format has a section which holds machine instructions. These instructions are just numbers which are ordered in a sequence and are understood by the CPU.
A function call to MessageBox() would be a sequence of instructions which will
1) have the address of the function, which is in a DLL. The compiler and linker only emit a reference to it; the actual address is filled in through the import table when the program is loaded
2) instructions to "push" the parameters onto a stack
3) The actual function call
4) some sort of cleanup (depends on the calling convention).
It's important to remember that EXE files are just specially formatted files. I don't have a disassembly for you, but you can try compiling your code, then open your EXE in Visual Studio to see the disassembly.
That is a bloated question if I ever saw one.
BUT, I will try my best to give an overview.
In a binary executable there are these things called "byte codes" (more commonly called opcodes or machine code). Byte codes are just the hex representation of an instruction. Commonly you can "look up" byte codes and convert them to assembly instructions. For example:
The instruction:
mov ax, 2h
Has the byte code representation:
B8 02 00
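To make the mapping concrete, here is a tiny C sketch that produces those three bytes. It assumes 16-bit code, where "mov ax, imm16" is opcode B8 followed by the immediate in little-endian order (in 32-bit or 64-bit code the same instruction needs an extra 0x66 operand-size prefix). The function name is made up for this example.

```c
#include <stddef.h>
#include <stdint.h>

/* Encode "mov ax, imm16" as it assembles in 16-bit code:
 * opcode B8 (the "B8+r" form with r = 0 for AX), then the 16-bit
 * immediate, low byte first. Writes 3 bytes and returns the length. */
size_t encode_mov_ax_imm16(uint16_t imm, uint8_t out[3])
{
    out[0] = 0xB8;
    out[1] = (uint8_t)(imm & 0xFF);   /* low byte  */
    out[2] = (uint8_t)(imm >> 8);     /* high byte */
    return 3;
}
```

Calling it with 2 gives B8 02 00, matching the bytes above; calling it with 0x1234 gives B8 34 12, which shows the little-endian byte order.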
The byte codes get loaded into RAM and executed by the processor, as that is its "language". No one sane that I know programs in byte code; it would just be way too complicated. Assembly is... fun enough as it is. Whenever you compile a program in a higher-level language, the compiler has to take your code and turn it into assembly instructions; just imagine how mangled your code would look after it compiles it. Don't get me wrong, compilers are great, but disassemble a C++ program with IDA Pro Freeware and you will see what I am talking about.
That is executables in a nutshell, there are certainly books written on this subject.
I am not a Windows API expert, but someone else can show you what the instruction would look like for calling the Windows API "MessageBox". It should only be a few lines of Assembly.
Whatever code is written (be it in C or some other language) is compiled by a compiler to a special sort of language called assembly (well, machine code, but they're very close). Assembly is a very low-level language, which the CPU executes natively. Normally, you don't program in assembly because it is so low-level (for example, you don't want to deal with pulling bits back and forth from memory).
I can't speak for the MessageBox function specifically, but I'd guess that it's a LOT of instructions. Think about it: it has to draw the box, style it however your computer styles it, hook up an event handler so that something happens when the user clicks the button, tell Windows (or whatever operating system) to add it to the taskbar (or dock, etc.), and so many other things.
It depends on the language that you are working in. But for many it is as simple as...
msgbox("Your message goes here")
or
alert("Your message goes here")