I am trying to write assembly code to cause a stack exception, but I am having no luck so far. According to the AMD programmer manuals, a stack exception is caused by one of the following:
• Implied stack references in which the stack address is not in canonical form. Implied stack references include all push and pop instructions, and any instruction using RSP or RBP as a base register.
• Attempting to load a stack-segment selector that references a segment descriptor containing a clear present bit (descriptor.P=0).
• Any stack access that fails the stack-limit check.
I went for the first method: I am trying to load rsp with a non-canonical address using the following code:
asm volatile("mov $0xAAAAAAAA00000000, %%rax;"
"orq %%rax, %%rsp;"
"push %%rax;" : : : );
GDB just says something about not being able to access memory, and everything breaks instead of raising the exception. Does anyone have any ideas? If not, does anyone know how I could cause an exception using the third condition? I don't know what "fails the stack-limit check" means. Thanks!
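For reference, "canonical form" with 48-bit virtual addresses means that bits 63 through 47 of the address are all equal (all zeros or all ones). Here is a minimal C sketch of that check, assuming 48-bit addressing; the helper name is_canonical48 is made up for illustration and does not come from any manual:

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: 48-bit virtual addresses, so bits 63..47 must all match. */
bool is_canonical48(uint64_t addr) {
    uint64_t top = addr >> 47;          /* the upper 17 bits of the address */
    return top == 0 || top == 0x1FFFF;  /* all zeros or all ones */
}
```

For example, is_canonical48(0x00007FFFFFFFE000) holds, while the same value OR-ed with 0xAAAAAAAA00000000 fails the check, so the orq in the question should indeed leave rsp non-canonical.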
asm(
"\n"
"MYLOOP:\n\t"
"pushq %rbp\n\t"
//"popq %rbp\n\t"
"jmp MYLOOP\n\t"
);
Simple stack overflow. Uncomment the popq instruction to get an infinite push/pop loop that consumes 100% of one CPU core instead.
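As a rough idea of how long the loop above runs before it faults: each pushq moves rsp down by 8 bytes, so the number of iterations is about the remaining stack size divided by 8. The sketch below assumes a POSIX system and reads the soft stack limit with getrlimit; the function names are hypothetical.

```c
#include <stdint.h>
#include <sys/resource.h>

/* Each pushq consumes 8 bytes, so this many pushes fit in stack_bytes. */
uint64_t pushes_until_exhausted(uint64_t stack_bytes) {
    return stack_bytes / 8;
}

/* Soft stack limit of the current process in bytes,
   or 0 if it cannot be read or is unlimited. */
uint64_t current_stack_limit(void) {
    struct rlimit rl;
    if (getrlimit(RLIMIT_STACK, &rl) != 0 || rl.rlim_cur == RLIM_INFINITY)
        return 0;
    return (uint64_t)rl.rlim_cur;
}
```

With an 8 MiB limit (an assumed value, not something stated in the question) that is 1048576 pushes, minus whatever the stack already holds.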
In most x86 assembly (NASM specifically) code samples I see around (even in the ones generated by GCC), I see what's called "setup of the stack frame". Like this:
main:
; setting up the stack frame
push ebp
mov ebp, esp
...
; code goes here
...
; removing the stack frame
mov esp, ebp
pop ebp
I have 3 questions about this practice:
If my code doesn't touch the stack then setting/removing the stack frame as above is completely useless, right?
Even if my code uses the stack, as long as I pop everything I push (essentially leaving the stack as it was), then again setting up a stack frame is completely useless, right?
As I see it, the only purpose of this would be to save the value of ESP so that I can play around with it in my code without worrying about messing things up, and once I am done I simply restore its original value. Is this the purpose of the stack frame setup, or am I missing something?
Thanks
Well, in fact, you don't need stack frames.
Stack frames are convenient when you save registers and store local variables on the stack, because they make writing and debugging easier: you set ebp to a fixed point in the stack and address all stack data relative to ebp. It is also easier to restore esp at the end.
Also, debuggers often expect stack frames to be present; without them you can get an inaccurate call stack, for example.
So, the answer to 1 is yes, the answer to 2 and 3 is above.
You are essentially correct.
Stack frames do have certain benefits, though, even when you don't need a fixed stack reference to access parameters and locals. In particular having them there allows for accurate stack walking for generating stack traces for debugging purposes.
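To illustrate why the saved-ebp chain makes stack walking possible, here is a toy simulation in C rather than real assembly. Everything in it (ToyCpu, toy_call, toy_backtrace and so on) is made up for illustration, and it assumes every function uses the push ebp / mov ebp, esp prologue shown in the question:

```c
#include <stddef.h>
#include <stdint.h>

#define STACK_WORDS 64

/* Toy machine: the stack grows downward through mem[]. */
typedef struct {
    uint32_t mem[STACK_WORDS];
    uint32_t esp;  /* index of the top-of-stack word */
    uint32_t ebp;  /* index of the current frame's saved-ebp slot */
} ToyCpu;

void toy_init(ToyCpu *c) {
    c->esp = STACK_WORDS;
    c->ebp = STACK_WORDS;  /* sentinel value: end of the frame chain */
}

void toy_push(ToyCpu *c, uint32_t v) { c->mem[--c->esp] = v; }
uint32_t toy_pop(ToyCpu *c) { return c->mem[c->esp++]; }

/* call + prologue: push return address; push ebp; mov ebp, esp */
void toy_call(ToyCpu *c, uint32_t return_addr) {
    toy_push(c, return_addr);
    toy_push(c, c->ebp);
    c->ebp = c->esp;
}

/* epilogue + ret: mov esp, ebp; pop ebp; ret */
uint32_t toy_ret(ToyCpu *c) {
    c->esp = c->ebp;
    c->ebp = toy_pop(c);
    return toy_pop(c);
}

/* Follow the saved-ebp links, collecting return addresses innermost first. */
size_t toy_backtrace(const ToyCpu *c, uint32_t *out, size_t max) {
    size_t n = 0;
    uint32_t fp = c->ebp;
    while (fp + 1 < STACK_WORDS && n < max) {
        out[n++] = c->mem[fp + 1];  /* return address sits just above saved ebp */
        fp = c->mem[fp];            /* link to the caller's frame */
    }
    return n;
}
```

After toy_call(&c, 0x1000), a local push, and toy_call(&c, 0x2000), toy_backtrace yields 0x2000 and then 0x1000 regardless of how many locals were pushed in between. Without the frame links, a walker would have to know each function's stack usage to find the return addresses.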
This is done by convention. C language functions use stack frames to access parameters that are sent to functions, and to set up dynamic local variables. That is why they do it in the sample code that you are looking at. Of course if you want to do it your way, you can, but you won't be able to invoke your code from C etc.
EDIT: I am also pretty sure that there are compilers that implement different calling conventions that optimize that and maybe do not create a frame at all. So basically, you are right. Stack frames are not necessary.
I am writing some x64 assembly for the GNU assembler. I've been trying to read about the .seh_* directives, but I'm not finding much information about them. The gas docs don't mention them at all.
But as I understand it, if my code might be in the stack during an SEH unwind operation, I am expected to use these. And since my code does stack manipulations and calls other functions, SEH is a possibility, so I should be using these.
Mostly I think I've got it right:
.seh_proc FCT
FCT:
push %rbp
.seh_pushreg %rbp
mov %rsp, %rbp
.seh_setframe %rbp, 0
push %r14
.seh_pushreg %r14
lea -(iOffset + iBytes)(%rsp), %rsp
.seh_stackalloc iOffset + iBytes
andq $-16, %rsp    # <---- But what about this?
.seh_endprologue
etc...
But there's one bit that's not clear. I've got this instruction:
andq $-16, %rsp
How on earth do I tell SEH that I'm performing stack alignment? This might adjust the stack by anywhere from 15 bytes (very unlikely) to 8 bytes (very likely), to 0 bytes (certainly possible). Since the actual amount may not be determined until runtime, I'm stuck.
I suppose I can skip the .seh instruction, but if 8 bytes of stack do get reserved there, I've probably trashed the unwind, haven't I? Doesn't that defeat the entire purpose here?
Alternately I can omit the alignment. But if I call other functions (say memcpy), aren't I supposed to align the stack? According to MS:
The stack will always be maintained 16-byte aligned, except within the prolog
Maybe I can 'reason' my way through this? If the guy that called me did things right (if...), then the stack was aligned when he did the call, so now I'm off by 8 bytes (the return address) plus whatever I do in my prolog. Can I depend on this? Seems fragile.
I've tried looking at other code, but I'm not sure I trust what I'm seeing. I doubt gas reports errors from misusing .seh_*. You would probably only ever see a problem during an actual exception (and maybe not always even then).
If I'm going to do this, I'd like to do it right. It seems like stack alignment would be a common thing, so someone must have a solution here. I'm just not seeing it.
Looking at some code output by gcc, I think I know the answer. I was on the right track with my 'reason' approach.
When a function is called, the stack temporarily becomes unaligned (due to the call), but is almost immediately re-aligned via pushq %rbp. After that, adjustments to the stack (for local variables or stack space for parameters to called functions, etc) are always made using multiples of 16. So by the end of the prolog, the stack is always properly aligned again, and stays that way until the next call.
Which means that while andq $-16, %rsp can be used to align the stack, I shouldn't need to if I write my prolog correctly.
CAVEAT: Leaf functions (ie functions that don't call other functions) do not need to align the stack (https://msdn.microsoft.com/en-us/library/67fa79wz.aspx).
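The 'reasoning' can be written down as arithmetic. This is only a sketch under the assumption that the caller had rsp 16-byte aligned at the call, so on entry rsp is 8 bytes off because of the return address; the function names are hypothetical:

```c
#include <stdint.h>

/* Bytes by which rsp is off a 16-byte boundary at the end of the prolog.
   Assumes rsp was 16-byte aligned just before the call instruction. */
unsigned prolog_misalignment(unsigned pushes, uint64_t alloc_bytes) {
    uint64_t below_aligned = 8 + 8ull * pushes + alloc_bytes;
    return (unsigned)(below_aligned % 16);
}

/* Smallest allocation >= wanted that leaves rsp 16-byte aligned. */
uint64_t padded_stackalloc(unsigned pushes, uint64_t wanted) {
    return wanted + (16 - prolog_misalignment(pushes, wanted)) % 16;
}
```

With a single push of %rbp, allocations that are multiples of 16 keep the stack aligned. With the two pushes from the prolog in the question (%rbp and %r14), a 32-byte allocation would leave rsp 8 bytes off, so it would need to be padded to 40. Since that size is known at assembly time, it can be passed to .seh_stackalloc and no runtime andq is needed.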
I am debugging an exe (x86) in WinDbg because it is crashing on my computer, the devs provide no support and it's closed source.
So far I found out that it crashes because a null pointer is passed to ntdll!RtlEnterCriticalSection.
I'm trying to find the source of that null pointer and I've reached a point (my "current point") where I have absolutely no idea where it was called from. I tried searching the area of the last few addresses on the stack, but there were no calls, jumps or returns at all there.
The only thing I have is the last dll loaded before the crash, which is apparently also long (at least a few thousand instructions) before my current point.
I can't just set a few thousand break points, so I thought single step exceptions could help (I could at least print eip on every instruction, I don't care if that would take days).
But I can't get the CPU to fire the exception! After loading the exe, I enter the following in the debugger:
sxe ld:<dll name>
g
sxe sse
sxe wos
r tf=1
g
The debugger breaks for the loaded dll where I want it to, but after the second g, the program just runs for a few seconds before hitting the crash point, not raising any single step exception at all.
If I do the same without the first two lines (so I'm at the start point of the program), it works. I know that tf is reset to zero every time a single step exception is fired, but why doesn't it fire at all later in the program?
Am I missing something? Or is there any other way I could find the source of that null pointer?
g is not the command for single stepping, it means "go" and only breaks on breakpoints or exceptions.
To do single stepping, use p. Since you don't have the source code, you cannot step at source level, meaning that you have to do it at assembly level. (Assembly-level stepping should be the default; if not, enable it with l-t.) Depending on how far you need to go, this takes time.
The above only answers the question as asked. The open question, as already pointed out in the comments, is what you will do to mitigate that bug. You can't simply create a new critical section, nor do you know which existing critical section should be used in that place.
How do I show the line number where an exception was thrown at runtime? Currently the IDE only displays the exception name, and no stack trace of any kind, making it very difficult to debug. I have searched the IntelliJ docs and haven't been able to find a simple answer (I don't want to have to use breakpoints and debugging commands).
It is basic Java stuff. You must print the stack trace in your application to see line numbers, not just the exception itself, which only produces its toString() output.
I had this strange problem with stack underflow errors happening only in the release build of a Flex Builder project. I looked around the web to find a solution, but while I found some related posts, nothing really helped me out. So here is this question and my solution in the answers, so that it may hopefully help other people.
The Problem: I ported a java program (a game) to flex and it works fine in debug mode on Android, the web and Playbook. However, when I build a release version of the game, it crashes. The error reported is 1024, i.e. stack underflow, according to Adobe's documentation.
At first, I thought the problem was limited only to the Playbook, but no, the exact same problem happens at the exact same place on the web browser and Android. From the debugging information I inserted, I discovered that the exception appears to be thrown during the call to another function.
To solve the problem, I broke the offending function down into many individual functions and so narrowed down which precise part of the code was causing the problem. This led me to a few lines of code that had the following call (in a try-catch):
trace(e.getStackTrace())()
Hummm, this apparently was produced by the regex I used to refactor from Java to ActionScript. Removing the extra () solved the problem.
This is the kind of thing I wish the compiler would catch instead of letting it fail only in release, when the function containing the offending code is pushed on the stack.