Why save the old base pointer at the beginning of a function?

What's the point of saving the old base pointer on the stack at the beginning of a function? I'm new to working with functions in assembly, and so far I have yet to see the point of doing this. It just gets pushed onto the stack and then popped off at the end; it doesn't seem to do anything. For example, the following code works just fine without doing this:
.section .data
.section .text
.globl _start
.type add, #function
add:
mov %rsp, %rbp
mov 8(%rbp), %rax
mov 16(%rbp), %rdi
add %rax, %rdi
mov %rbp, %rsp
ret
_start:
push $45
push $36
call add
add $16, %rsp
mov $60, %rax
syscall
I know that you could have simplified this even further by just using the stack pointer in this example, but I can see how that's bad practice.

Every function using xBP to locate its parameters or local variables needs to set xBP to xSP at the very beginning.
By doing so, it destroys the previous value of xBP from the calling function and so, naturally, it should save and restore it by e.g. using push and pop.
If xBP isn't used at all, it doesn't need to be saved and restored.
Many compilers have an option to use xSP to locate function parameters and local variables. If that option is enabled, xBP may not need to be preserved (unless the calling convention requires its preservation).
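As a minimal sketch of the point above (assuming GNU as, x86-64 Linux, and the same stack-passed arguments as in the question), here is the question's add with the save and restore put back in. Note that the argument offsets grow by 8, because the saved rbp now sits between the frame base and the return address. The use of rbp in _start is a hypothetical addition, there only to give the caller a value worth preserving.

```asm
        .section .text
        .globl _start
        .type add, @function
add:
        push %rbp               # save the caller's frame pointer
        mov  %rsp, %rbp         # establish this function's frame
        mov  16(%rbp), %rax     # first argument (36): above saved rbp and return address
        mov  24(%rbp), %rdi     # second argument (45)
        add  %rax, %rdi         # rdi = 36 + 45
        pop  %rbp               # restore the caller's frame pointer
        ret
_start:
        mov  %rsp, %rbp         # hypothetical: the caller keeps its own frame pointer
        push $45
        push $36
        call add
        mov  %rbp, %rsp         # discard the arguments; only correct if add preserved rbp
        mov  $60, %rax          # exit system call, status taken from rdi
        syscall
```

With the version from the question, rbp would come back pointing into add's frame, and the mov %rbp, %rsp in _start would restore the wrong stack pointer. It goes unnoticed there only because the caller never looks at rbp again.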

Related

Jump and Call on same line in assembly

I want to know if there is a way to make a conditional decision and call a function based on the result.
For example, I want to compare two values and call a function if they are equal. However, because of the way my function handles the stack, I need to call it rather than jump to it. Is there a way to do that? I have copied my code in as written; it does not compile.
.endOfForLoop: cmp dword [ebp - 4], 1 ; compares the boolean to one
je call print_prime ; if it is one then prime needs to be printed
jmp call print_not_prime ; otherwise it is not prime
I'm using NASM (Intel syntax) for 32-bit x86 assembly on Linux.
Just jump around the function call, as you would when implementing an if-then-else:
.endOfForLoop:
cmp dword [ebp-4],1
jne .not_prime
call print_prime
jmp .endif
.not_prime:
call print_not_prime
.endif:
You could also use function pointers and the cmov instruction to make the selection branchless, but I advise against writing code like this: it is harder to understand and usually not faster, because the indirect call still has to be predicted, and indirect branches tend to be harder to predict than a simple conditional branch.
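.endOfForLoop:
cmp dword [ebp-4],1
mov eax,print_prime ; address of the function to call if [ebp-4] == 1
mov edx,print_not_prime ; the alternative (edx is call-clobbered anyway, unlike ebx)
cmovne eax,edx ; not equal: select print_not_prime instead
call eax ; indirect call through the selected pointer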

Assembly: function template meditation

I'm learning assembly now and I don't get one thing about the (presumably) standard function template.
So, based on this really nice book, "the form to remember for functions is as follows:"
function_label:
pushl %ebp
movl %esp, %ebp
< normal function code goes here>
movl %ebp, %esp
popl %ebp
ret
OK, I'm perfectly fine with it, but there is one small thing that I don't understand. After the "normal function code" we restore the initial (pre-call) value of esp, which was previously stored in ebp.
Now, I clearly understand why we want to hand the esp value back to the calling context untouched. What I do not understand is under which conditions the esp value can change at all during the function's execution.
Is it some kind of protection against ourselves (in case we somehow corrupt the stack somewhere in our code) included in this template? Or maybe changing stack values inside a function is a normal practice?
Or is it possible that the initial esp value may end up changed during execution even if we don't do anything with it? (I can't figure out how this can be, in fact.)
I felt rather silly while thinking about this and checked the esp value with gdb in this simple code:
.section .data
message:
.asciz "> Hello from function #%d\n"
.section .text
.globl main
main:
nop
call overhere
pushl $0
call exit
overhere:
pushl %ebp
movl %esp, %ebp
pushl $1
pushl $message
call printf
add $8, %esp
movl %ebp, %esp
popl %ebp
ret
And esp (as I actually expected) was untouched, so moving ebp to esp didn't actually change anything.
Now, I hope that it's clear what I want to find out:
Can the esp value possibly change by itself? (I would bet it can't.)
If it can't, then this template above, obviously, assumes that the programmer might change it somehow inside the function. But I can't figure out why on Earth one might need to do that, so - is changing esp value a mistake?
Thank you in advance and excuse my ignorance.
I am puzzled how you missed the instruction that explicitly changes esp: add $8, %esp. So the answer is clearly yes, it may change during a function, and that is not a mistake. Note that push and call also change it; in fact, the add is there to compensate for the two push instructions (the ret at the end of printf balances the call). Another typical reason for changing esp is the allocation of local variables. This has been omitted from the function template you showed; it typically looks like sub $size_of_locals, %esp right after the movl %esp, %ebp.
That said, you don't need to use ebp to remember the stack pointer, as long as you ensure esp has the same value at the exit of the function as it had upon entry. Recent versions of gcc omit the frame pointer by default when optimization is enabled; otherwise, you can request this with -fomit-frame-pointer.
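To make the local-variable case concrete, here is a rough sketch of the template with one local (32-bit GNU as, AT&T syntax, cdecl assumed). The function name sum_with_local and the stray push are hypothetical, there only to show esp moving and then being put back by movl %ebp, %esp.

```asm
        .text
        .globl sum_with_local
sum_with_local:                 # hypothetical: int sum_with_local(int a, int b)
        pushl %ebp              # save the caller's frame pointer
        movl  %esp, %ebp        # ebp now marks the base of this frame
        subl  $4, %esp          # reserve 4 bytes for one local at -4(%ebp)

        movl  8(%ebp), %eax     # a
        addl  12(%ebp), %eax    # a + b
        movl  %eax, -4(%ebp)    # store the sum in the local
        pushl $0                # scratch push, deliberately left unbalanced
        movl  -4(%ebp), %eax    # return value: reload the local

        movl  %ebp, %esp        # discard the local and the stray push in one step
        popl  %ebp              # restore the caller's frame pointer
        ret
```

At the movl %ebp, %esp, esp is 8 bytes below where it was right after the prologue, so here the instruction really does change something.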

How can I simulate a CALL instruction by using JMP?

Like this but without the CALL instruction. I suppose that I should use JMP and probably other instructions.
PUSH 5
PUSH 4
CALL Function
This is fairly easy to do. Push the return address onto the stack and then jump to the subroutine.
The final code looks like this:
PUSH 5
PUSH 4
PUSH offset label1
jmp Function
label1: ; returns here
lea esp, 8[esp] ; discard the two arguments (same effect as add esp, 8)
Function:
...
ret
While this works, you really don't want to do it. Most modern processors keep an on-chip cache of return addresses (a return-address stack): a CALL pushes the return address onto it, and a RET pops one off. Because it sits on the processor, it has extremely short update and access times, so a RET can use the popped value to predict where the PC should go next rather than waiting for the actual memory read from the location that ESP points to. If you use the "PUSH offset label1" trick, this cache is not updated, so the RET is mispredicted and the processor pipeline is flushed, with a severe negative impact on performance. (I think IBM has a patent on special instructions, essentially "PUSHRETURNADDRESS k" and "POPRETURNADDRESS", that allow this trick to be used on some of their CPUs. Alas, not on x86.)
It depends on the situation. If the last thing your function does before returning is call another function, you can simply jump to that function. This is called tail call elimination, and is an optimization performed by many compilers. Example:
foo:
call B
call A
ret
Tail call elimination replaces the last two lines with a single jump instruction:
foo:
call B
jmp A
This works because the stack contains the return address of foo's caller. So when function A returns, it returns back to the function that called foo.
If you want execution to resume after the jump to A, push that address onto the stack before jumping:
foo:
call B
push offset bar
jmp A
bar:
However, I can think of no reason why anybody would want to do this.
Before x86-64, call was the only instruction that could read EIP. (I guess int as well, but it doesn't put the result anywhere you can read from user-space).
So it's impossible to simulate call in position-independent code. In fact, 32-bit PIC code uses call to find out its own address.
But in x86-64, we have RIP-relative lea:
... put function args in registers
lea rax, [rel ret_addr] ; AT&T lea ret_addr(%rip), %rax
push rax
jmp call_target
ret_addr:
call itself internally decodes as push RIP / jmp target, where RIP during execution of an instruction = address of the end of that instruction = start of the next.
Of course this is normally terrible for performance, unbalancing the return-address predictor stack. http://blog.stuffedcow.net/2018/04/ras-microbenchmarks/. Use a normal call unless you want a ret to mispredict, e.g. for a retpoline or specpoline.
(A tailcall with just jmp is fine, collapsing a call/ret pair into a jmp, but pushing a new return address manually is always a problem.)
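For reference, the 32-bit PIC idiom mentioned above (using call to find out your own address) looks roughly like this in GNU as AT&T syntax. This is a sketch: the names get_pc_ecx, use_pc and msg are made up for illustration, and msg is placed in the same section so that msg - .Lhere is an assemble-time constant.

```asm
        .text
get_pc_ecx:                     # hypothetical helper, in the style of compiler "get pc" thunks
        movl (%esp), %ecx       # the return address is the address right after the call
        ret                     # a real ret, so call and ret stay paired

use_pc:
        call get_pc_ecx
.Lhere:                         # ecx now holds the run-time address of .Lhere
        leal msg - .Lhere(%ecx), %eax   # run-time address of msg, wherever the code was loaded
        ret

msg:    .asciz "hi"
```

Because the helper returns with an ordinary ret, the return-address predictor stays balanced, unlike with the push/jmp sequence.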

Calling Multiple Functions in x86

I am attempting to learn x86 AT&T syntax and am at a spot where I am a little confused in general. I understand that there are frames on the stack, and that when a call is made the first thing that happens in the function is some sort of frame update, then getting parameters. So, if I have some value like 5 in the register eax in my Main area of code and call Function, do I still have access to the value 5 in eax? Or, in order to get it as a parameter, do I have to do something like the code below? I saw somewhere else that you push your arguments onto the stack before calling a function; is this true? I guess something has to be located at 8(%ebp) for me to move it into eax, but what is the value of eax before I move something into it with movl? Is it 5? I know this is a lot of questions; I'm just confused at the moment about calling a function and returning something. Any help would be greatly appreciated. I'm sure this is a piece of cake for some assembly gurus!
Function:
pushl %ebp
movl %esp, %ebp
movl 8(%ebp), %eax
This page should basically wrap that up.
With cdecl, it goes like this:
; I'm not comfortable with AT&T syntax, but it's not relevant here
; void *processData(void *firstParam, void *secondParam)
proc processData
push ebp
mov ebp,esp
mov eax,[dword ptr ss:ebp + 8] ; firstParam
mov edx,[dword ptr ss:ebp + 12] ; secondParam
; do something with the data and put the result into EAX
mov esp,ebp
pop ebp
ret
endp processData
You invoke it like this:
lea eax,[ds:bufferOfSecondParam]
push eax
lea eax,[ds:bufferOfFirstParam]
push eax
call processData
add esp,8
; Here you can do with the return value anything you want
First of all, you need to decide on the calling convention to use. Win32, for example, uses a variant of cdecl called stdcall where the callee is responsible for cleaning up the stack - this is not too convenient to implement and does not allow for varargs.
[SS:EBP + 8] points to the first argument because:
Arguments are pushed onto the stack from right to left, so the first argument ends up at the lowest address ([SS:EBP + 12] points to the second argument)
DWORDs are 4 bytes
[SS:EBP + 0] points to the previous EBP, saved when the stack frame was created
[SS:EBP + 4] points to the return address, which ret reads into EIP
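Since the question was asked in AT&T syntax, here is a rough equivalent of the above for GNU as (32-bit, cdecl assumed). The names add_two and caller and the values 5 and 7 are made up for illustration.

```asm
        .text
# Hypothetical callee: int add_two(int a, int b), cdecl
add_two:
        pushl %ebp
        movl  %esp, %ebp
        movl  8(%ebp), %eax     # first argument (a)
        addl  12(%ebp), %eax    # add the second argument (b); the result is returned in eax
        popl  %ebp              # esp was never changed, so no movl %ebp, %esp is needed
        ret

caller:
        movl  $5, %eax          # this 5 is not an argument just because it is in eax
        pushl $7                # push the last argument first (b = 7)
        pushl %eax              # then the first argument (a = 5)
        call  add_two           # eax holds 12 on return, replacing the 5
        addl  $8, %esp          # the caller removes the two arguments
        ret
```

The call instruction itself doesn't touch eax, so the 5 is still there on entry to add_two, but under cdecl the callee can't rely on that. It only knows about what is on the stack, and eax is overwritten by the return value anyway.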

x86 assembly functions

I have a function that is called by main. Assume that function's name is funct1. funct1 calls another function named read_input.
Now assume that funct1 starts as follows:
push %rbp
push %rbx
sub $0x28, %rsp
mov %rsp, %rsi
callq 4014f0 <read_input>
cmpl $0x0, (%rsp)
jne (some terminating function)
So just a few questions:
In this case, does read_input only have one argument, which is %rbx?
Furthermore, if the stack pointer is being decreased by 0x28, does this mean a string of size 0x28 is getting pushed onto the stack? (I know it's a string.)
And what is the significance of mov %rsp, %rsi before calling a function?
And lastly, when read_input returns, where is the return value put?
Thank you and sorry for the questions but I am just starting to learn x86!
It looks like your code is using the System V AMD64 ABI (the calling convention used on x86-64 Linux). I'll answer your questions in that context.
No, rbx is a callee-saved (nonvolatile) register. Your function is saving it so that it doesn't disturb the caller's value. It's not being restored in the code you've shown, but that's because you haven't shown the whole function. It is presumably saved because rbx is used somewhere later in this routine.
Yes, space for 0x28 bytes of data is being made on the stack. Assuming read_input takes a string buffer as a parameter, your description is reasonable. It's not necessarily accurate, however: some of that space might be used for other local variables, not just the buffer passed to read_input.
This instruction puts a pointer to the newly allocated stack space into rsi, which is the second parameter register in this calling convention. The first parameter register, rdi, isn't touched in the code shown, so read_input is being called with whatever was passed as the first parameter to this function, along with a pointer to your new stack buffer.
In rax if it's an integer or pointer of 64 bits or fewer, or in rax and rdx if it's larger (up to 128 bits). If it's floating point, in xmm0, ymm0, or st(0). You should probably look at a description of your calling convention to get a handle on this stuff - there's a great PDF file at this link. Check out Table 4.
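Putting the four answers together, the whole function probably has a shape along these lines. This is only a guess at the parts you haven't shown (GNU as, AT&T syntax). read_input is assumed to be defined elsewhere, and the use of ebx is hypothetical.

```asm
        .text
        .globl funct1
funct1:
        push %rbp               # callee-saved: preserve the caller's value
        push %rbx               # callee-saved: preserve the caller's value
        sub  $0x28, %rsp        # 40 bytes of locals; rsp is 16-byte aligned at the call below
        mov  %rsp, %rsi         # second argument: pointer to the local buffer
        call read_input         # first argument: whatever funct1 itself received in rdi
        mov  %eax, %ebx         # hypothetical: keep the result in a register that survives calls
        # the rest of the function body would go here
        mov  %ebx, %eax         # return value goes back in eax/rax
        add  $0x28, %rsp        # release the locals
        pop  %rbx               # restore in reverse order of the pushes
        pop  %rbp
        ret
```

The two pushes plus 0x28 bytes also happen to bring rsp back to a 16-byte boundary before the call, as the ABI requires, which may be part of why that size was chosen.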