I am well aware of the fact that in C and C++ everything is passed by value (even if that value is of reference type). I think (but I'm no expert there) the same is true for Java.
So, and that's why I include language-agnostic as a tag, in what language can I pass anything to a function without passing some value?
And if that exists, what does the mechanism look like? I thought hard about that, and I fail to come up with any mechanism that does not involve the passing of a value.
Even if the compiler optimizes in a way that I don't have a pointer/reference as a true variable in memory, it still has to calculate an address as an offset from the stack (frame) pointer - and pass that.
Anybody who could enlighten me?
From C perspective:
There are no references as a language level concept. Objects are referred to by pointing at them with pointers.
The value of a pointer is the address of the pointed object. Pointers are passed by value just like any other arguments. A pointed object is conceptually passed by reference.
At least from C++ perspective:
How is 'pass by reference' implemented [...] ?
Typically, by copying the address of the object.
... without actually passing an address to a function?
If a function invocation is expanded inline, there is no need to copy the address anywhere. Same applies to pointers too, because copies may be elided due to the as-if rule.
in what language can I pass anything to a function without passing some value?
Such a language would have to have a significantly different concept of a function than C does. There would have to be no stack frame push.
Function-like C pre-processor macros, as their name implies, are similar to functions, but their arguments are not passed around at runtime, because pre-processing happens before compilation.
On the other hand, you can have global variables. If you change the global state of the program, and call a function with no arguments, you have conceptually "passed the new global state to the function" without having passed any value.
At a machine-code level, "pass X by reference" is essentially "pass the address of X by value".
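To make the distinction concrete, here is a minimal C++ sketch. The function names are made up for illustration. It contrasts a reference parameter with a pointer parameter, and shows that the pointer itself is only a copy of an address.

```cpp
#include <cassert>

// Reference parameter: `n` is an alias for the caller's object.
void increment_by_reference(int& n) { n += 1; }

// Pointer parameter: the address is copied (passed by value),
// but the copy still designates the caller's object.
void increment_by_pointer(int* n) { *n += 1; }

// Re-aiming the pointer parameter changes only the local copy of the address;
// the caller's first object is left alone and `other` is written instead.
void retarget(int* p, int* other) {
    p = other;
    *p = 100;
}
```

Calling `increment_by_reference(x)` and `increment_by_pointer(&x)` both modify the caller's `x`. After `retarget(&a, &b)`, `a` is unchanged and `b` is 100.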
Pointers are values. Values are values. Values have a unique identity and require storage.
References are not values. References have no identity. If we have:
int x=0;
int& y=x;
int& z=x;
both y and z are references to x, and they have no independent identity.
In comparison:
int x=0;
int* py=&x;
int* pz=&x;
both py and pz are pointers at x, and they have independent identity. You could modify py and not pz, you can get a size of them, you can memset them.
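The difference in identity can be checked directly. The following is a small sketch with hypothetical helper names. It relies only on the fact that taking the address of a reference yields the address of the referent, while each pointer variable is an object of its own.

```cpp
#include <cassert>

// Two references to x cannot be told apart from x by address.
bool references_share_identity() {
    int x = 0;
    int& y = x;
    int& z = x;
    return &y == &x && &z == &x;
}

// Two pointers to x hold equal values but are distinct objects.
bool pointers_have_own_identity() {
    int x = 0;
    int* py = &x;
    int* pz = &x;
    return py == pz && &py != &pz;
}

// One pointer can be re-aimed without affecting the other.
bool pointer_can_be_reaimed() {
    int x = 0, other = 1;
    int* py = &x;
    int* pz = &x;
    py = &other;
    return py != pz && pz == &x;
}
```

All three functions return true on a conforming implementation.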
In some circumstances, at the machine code level, references are implemented the same way as pointers, except certain operations are never performed on them (like reaiming them).
But C++ is not defined in terms of machine code. It is defined in terms of the behaviour of an abstract machine. Compilers compile your code to operations on this abstract machine, which has no fixed calling convention (by the standard), no layout for references, no stack, no heap, etc. It then does arbitrary transformations on this that do not change the as-if behaviour (a common one is single assignment), rearranges things, and then at some point emits assembly/machine code that generates similar behaviour on the actual hardware you are running on.
Now the near universal way to compile C++ is the compilation unit/linker model, where functions are exported as symbols and a fixed ABI calling convention is provided for other compilation units to use them. Then at link stage the compilation units are connected together.
In those ABIs, references are passed as pointers.
How is 'pass by reference' implemented without actually passing an address to a function?
Within the context of the C languages, the short answers are:
In C, it is not.
In C++, a type followed by an ampersand (&) is a reference type.
For instance, int& is a reference to an int. When passing an argument
to a function that takes reference type, the object is truly passed
by reference. (More on this in the scholarly link below.)
But in truth, most of the confusion is semantics. Some of the confusion could be helped by:
1) Stop using the word emulated to describe passing an address.
2) Stop using the word reference to describe address
Or
3) Recognize that within the context of the C/C++ languages, in the
phrase pass-by-reference, the word reference is defined as: value of
address.
Beyond this, there are many examples of illusions and concepts created to convey impossible ideas. The concept of non-emulated pass-by-reference is arguably one of them, no matter how many scholarly papers or practical discussions.
This one (scholarly paper category) is yet another that presents a distinction between emulated and actual pass-by-reference in a discussion using both C & C++, but whose conclusions stick closely to reality. The following is an excerpt:
...Somehow, it is only a matter of how the concept of “passing by reference” is actually realized by a programming language: C implements this by using pointers and passing them by value to functions whereas C++ provides two implementations. From a side, it reuses the same mechanism derived from C (i.e., pointers + pass by value). On the other hand, C++ also provides a native “pass by reference” solution which makes use of the idea of reference types. Thus, even in C++ if you are passing a pointer à la C, you are not truly passing by reference, you are passing a pointer by value (that is, of course, unless you are passing a reference to a pointer! e.g., int*&).
Because of this potential ambiguity in the term “pass by reference”, perhaps it’s best to only use it in the context of C++ when you are using a reference type.
But as you and others have already noted, in the concept of passing anything via an argument, whether value or reference, that something must by definition have a value.
What is meant by pass by value is that the object itself is passed.
In pass by pointer, we pass the value of the pointer to the object.
In pass by reference, we pass a reference (basically a pointer that we know points to an object) in the same way.
So yes, we always pass a value, but the question is what that value is. It is not always the object itself. When we say a variable is passed "by X", the X describes how we refer to the object we want to pass, not the value actually passed.
I am reading the book Programming from the Ground Up. I see two different examples of how the base pointer %ebp is created from the current stack position %esp.
In one case, it is done before the local variables.
_start:
# INITIALIZE PROGRAM
subl $ST_SIZE_RESERVE, %esp # Allocate space for pointers on the
# stack (file descriptors in this
# case)
movl %esp, %ebp
The _start however is not like other functions, it is the entry point of the program.
In another case it is done after.
power:
pushl %ebp # Save old base pointer
movl %esp, %ebp # Make stack pointer the base pointer
subl $4, %esp # Get room for our local storage
So my question is, do we first reserve space for local variables in the stack and create the base pointer or first create the base pointer and then reserve space for local variables?
Wouldn't both just work even if I mix them up in different functions of a program? One function does it before, the other does it after etc. Does C have a specific convention when it creates the machine code?
My reasoning is that all the code in a function would be relative to the base pointer, so as long as that function follows the convention according to which it created a reference of the stack, it just works?
A few related links for those who are interested:
Function Prologue
In your first case you don't care about preservation - this is the entry point. You are trashing %ebp when you exit the program - who cares about the state of the registers? It doesn't matter any more as your application has ended. But in a function, when you return from that function the caller certainly doesn't want %ebp trashed. Now can you modify %esp first then save %ebp then use %ebp? Sure, so long as you unwind the same way on the other end of the function, you may not need to have a frame pointer at all, often that is just a personal choice.
You just need a relative picture of the world. A frame pointer is usually just there to make the compiler author's job easier, actually it is usually there just to waste a register for many instruction sets. Perhaps because some teacher or textbook taught it that way, and nobody asked why.
For coding sanity, the compiler author's sanity, etc., it is desirable, if you need to use the stack, to have a base address from which to offset into your portion of the stack FOR THE DURATION of the function, or at least after the setup and before the cleanup. This can be the stack pointer (sp) itself or it can be a frame pointer; sometimes the choice is obvious from the instruction set. Some instruction sets have a stack that grows down (in address space, toward zero) and allow only positive offsets in sp-based addressing (sane), or perhaps only negative ones (insane; unlikely, but let's say it's there). So you may want a general purpose register. Maybe there are some where you can't use sp in addressing at all and you have to use a general purpose register.
Bottom line, for sanity you want a reference point to offset items in the stack, the more painful way but uses less memory would be to add and remove things as you go:
x is at sp+4
push a
push b
do stuff
x is at sp+12
pop b
x is at sp+8
call something
pop a
x is at sp+4
do stuff
It is more work, but a program (a compiler) can keep track of this and is less error prone than a human doing it by hand; when a human debugs the compiler output, though, it is harder to follow and keep track. So generally we burn the stack space and have one reference point. A frame pointer can be used to separate the incoming parameters and the local variables, using the base pointer (bp), for example, as a static base address within the function and sp as the base address for local variables (although sp could be used for everything if the instruction set provides that much of an offset). So by pushing bp and then modifying sp you are creating this two-base-address situation: sp can move around, perhaps for local stuff (although that is not usually sane), and bp can be used as a static place to grab parameters if this is a calling convention that dictates all parameters are on the stack (generally when you don't have a lot of general purpose registers). Sometimes you see the parameters copied to a local allocation on the stack for later use, but if you have enough registers you may instead see a register saved on the stack and used in the function, so there is no need to access the stack using a base address and offset.
unsigned int more_fun ( unsigned int x );
unsigned int fun ( unsigned int x )
{
unsigned int y;
y = x;
return(more_fun(x+1)+y);
}
00000000 <fun>:
0: e92d4010 push {r4, lr}
4: e1a04000 mov r4, r0
8: e2800001 add r0, r0, #1
c: ebfffffe bl 0 <more_fun>
10: e0800004 add r0, r0, r4
14: e8bd4010 pop {r4, lr}
18: e12fff1e bx lr
Do not take what you see in a text book, white board (or on answers in StackOverflow) as gospel. Think through the problem, and through alternatives.
Are the alternatives functionally broken?
Are they functionally correct?
Are there disadvantages like readability?
Performance?
Is the performance hit universal or does it depend on just how
slow/fast the memory is?
Do the alternatives generate more code which is a performance hit but
maybe that code is pipelined vs random memory accesses?
If I don't use a frame pointer does the architecture let me regain
that register for general purpose use?
In the first example bp is being trashed, that is bad in general but this is the entry point to the program, there is no need to preserve bp (unless the operating system dictates).
In a function, though, based on the calling convention one assumes that bp is used by the caller and must be preserved, so you have to save it on the stack to use it. In this case it appears to be used to access parameters passed in by the caller on the stack; then sp is moved to make room for local variables (and possibly to access them, though that is not required if bp can be used).
If you were to modify sp first then push bp, you would basically have two pointers one push width away from each other, does that make much sense? Does it make sense to have two frame pointers anyway and if so does it make sense to have them almost the same address?
By pushing bp first, and if the calling convention pushes the first parameter last, then as a compiler author you can make bp+N always (or ideally always) point at the first parameter for a fixed value N; likewise bp+M always points at the second. A bit lazy to me, but if the register is there to be burned then burn it...
In one case, it is done before the local variables.
_start is not a function. It's your entry point. There's no return address, and no caller's value of %ebp to save.
The i386 System V ABI doc suggests (in section 2.3.1 Initial Stack and Register State) that you might want to zero %ebp to mark the deepest stack frame. (i.e. before your first call instruction, so the linked list of saved ebp values has a NULL terminator when that first function pushes the zeroed ebp. See below).
Does C have a specific convention when it creates the machine code?
No, unlike in some other x86 systems, the i386 System V ABI doesn't require much about your stack-frame layout. (Linux uses the System V ABI / calling convention, and the book you're using (PGU) is for Linux.)
In some calling conventions, setting up ebp is not optional, and the function entry sequence has to push ebp just below the return address. This creates a linked list of stack frames which allows an exception handler (or debugger) to backtrace up the stack. (How to generate the backtrace by looking at the stack values?). I think this is required in 32-bit Windows code for SEH (structured exception handling), at least in some cases, but IDK the details.
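The linked-list property can be modelled without touching the real stack. The sketch below is a toy model only. `Frame` and `backtrace_depth` are hypothetical names, and the struct merely mimics the layout produced by `push %ebp; mov %esp, %ebp`, where the saved EBP sits just below the return address.

```cpp
#include <cassert>
#include <cstddef>

// Toy model of what %ebp points at in a function with a traditional frame:
// the caller's saved frame pointer, followed by the return address.
struct Frame {
    const Frame* saved_bp;     // nullptr marks the outermost frame
    const void*  return_addr;  // unused in this model; kept to mirror the layout
};

// Walk the chain the way a frame-pointer-based backtracer would,
// counting frames until the null terminator is reached.
std::size_t backtrace_depth(const Frame* bp) {
    std::size_t depth = 0;
    for (; bp != nullptr; bp = bp->saved_bp) {
        ++depth;
    }
    return depth;
}
```

A real unwinder would read these two words from memory at the current frame pointer instead of from a struct built by hand, but the traversal has the same shape.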
The i386 SysV ABI defines an alternate mechanism for stack unwinding which makes frame pointers optional, using metadata in another section (.eh_frame and .eh_frame_hdr which contains metadata created by .cfi_... assembler directives, which in theory you could write yourself if you wanted stack-unwinding through your function to work. i.e. if you were calling any C++ code which expected throw to work.)
If you want to use the traditional frame-walking in current gdb, you have to actually do it yourself by defining a GDB function like gdb backtrace by walking frame pointers or Force GDB to use frame-pointer based unwinding. Or apparently if your executable has no .eh_frame section at all, gdb will use the EBP-based stack-walking method.
If you compile with gcc -fno-omit-frame-pointer, your call stack will have this linked-list property, because when C compilers do make proper stack frames, they push ebp first.
IIRC, perf has a mode for using the frame-pointer chain to get backtraces while profiling, and apparently this can be more reliable than the default .eh_frame stuff for correctly accounting which functions are responsible for using the most CPU time. (Or causing the most cache misses, branch mispredicts, or whatever else you're counting with performance counters.)
Wouldn't both just work even if I mix them up in different functions of a program? One function does it before, the other does it after etc.
Yes, it would work fine. In fact setting up ebp at all is optional, but when writing by hand it's easier to have a fixed base (unlike esp which moves around when you push/pop).
For the same reason, it's easier to stick to the convention of mov %esp, %ebp after one push (of the old %ebp), so the first function arg is always at ebp+8. See What is stack frame in assembly? for the usual convention.
But you could maybe save code size by having ebp point in the middle of some space you reserved, so all the memory addressable with an ebp + disp8 addressing mode is usable. (disp8 is a signed 8-bit displacement: -128 to +124 if we're limiting to 4-byte aligned locations). This saves code bytes vs. needing a disp32 to reach farther. So you might do
bigfunc:
push %ebp
lea -112(%esp), %ebp # first arg at ebp+8+112 = 120(%ebp)
sub $236, %esp # locals from -124(%ebp) ... 108(%ebp)
# saved EBP at 112(%ebp), ret addr at 116(%ebp)
# 236 was chosen to leave %esp 16-byte aligned.
Or delay saving any registers until after reserving space for locals, so we aren't using up any of the locations (other than the ret addr) with saved values we never want to address.
bigfunc2: # first arg at 4(%esp)
sub $252, %esp # first arg at 252+4(%esp)
push %ebp # first arg at 252+4+4(%esp)
lea 140(%esp), %ebp # first arg at 260-140 = 120(%ebp)
push %edi # save the other call-preserved regs
push %esi
push %ebx
# %esp is 16-byte aligned after these pushes, in case that matters
(Remember to be careful how you restore registers and clean up. You can't use leave because esp = ebp isn't right. With the "normal" stack frame sequence, you might restore other pushed registers (from near the saved EBP) with mov, then use leave. Or restore esp to point at the last push (with add), and use pop instructions.)
But if you're going to do this, there's no advantage to using ebp instead of ebx or something. In fact, there's a disadvantage to using ebp: the 0(%ebp) addressing mode requires a disp8 of 0, instead of no displacement, but %ebx wouldn't. So use %ebp for a non-pointer scratch register. Or at least one that you don't dereference without a displacement. (This quirk is irrelevant with a real frame pointer: (%ebp) is the saved EBP value. And BTW, the encoding that would mean (%ebp) with no displacement is how the ModRM byte encodes a disp32 with no base register, like (12345) or my_label)
These examples are pretty artificial; you usually don't need that much space for locals unless it's an array, and then you'd use indexed addressing modes or pointers, not just a disp8 relative to ebp. But maybe you need space for a few 32-byte AVX vectors. In 32-bit code with only 8 vector registers, that's plausible.
AVX512 compressed disp8 mostly defeats this argument for 64-byte AVX512 vectors, though. (But AVX512 in 32-bit mode can still only use 8 vector registers, zmm0-zmm7, so you could easily need to spill some. You only get x/ymm8-15 and zmm8-31 in 64-bit mode.)
I am trying to decipher some Fortran code. It passes a pointer to a function as an actual argument, and the formal argument is instead a target. It defines and allocates a pointer of type globalDATA in the main program, then it calls a function passing that pointer:
module dataGLOBAL
type globalDATA
type (gl_1) , pointer :: gl1
type (gd_2) , pointer :: gd2
type (gdt_ok) , pointer :: gdtok
...
...
end type globalDATA
end module dataGLOBAL
Program main
....
....
use dataGLOBAL
...
type(globalDATA),pointer :: GD
allocate(GD)
returnvalue = INIT(GD)
....
....
end
The function reads:
integer function INIT(GD) result(returnvalue)
....
....
use dataGLOBAL
type(globalDATA) , target :: GD
allocate (GD%gl1)
allocate (GD%gd2)
allocate (GD%gdtok)
....
....
end function INIT
What is the meaning of doing this? And why do both the pointer in the main program and the single components of the target structure have to be allocated?
thanks
A.
A few things may come into play...
When you provide a pointer as an actual argument to a procedure where the corresponding dummy argument does NOT have the POINTER attribute (the case here), the thing that is associated with the dummy argument is the target of the actual argument pointer. So in this case, the thing being passed is the object that GD (in the main program) is pointing to - the thing that was allocated by the allocate statement. (When both the actual and dummy arguments have the POINTER attribute, then the POINTER itself is "passed" - you can change what the POINTER points to and that change is reflected back in the calling scope.)
Because the GD dummy argument inside the function has the target attribute, pointers inside the function can be pointed at the dummy argument. You don't show any declarations for such pointers, but perhaps they are in elided code. If nothing is ever pointed at the GD dummy argument (including inside any procedures that might be called by the INIT function), then the TARGET attribute is superfluous, but harmless apart from inhibiting some optimisations.
Things that have the pointer attribute also (automatically by language rules) have the TARGET attribute - so GD in the main program has the TARGET attribute. The fact that GD in the main program and in the function BOTH have the target attribute may be relevant because...
When the dummy argument has the TARGET attribute and the thing passed as the actual argument has the TARGET attribute, then pointers associated with the dummy argument inside the procedure are also "usually" (there are exceptions/processor dependencies for coindexed things/non-contiguous arrays/vector subscripted sections too complicated for me to remember) associated with the corresponding actual argument. If a pointer is not a local variable (perhaps it is a pointer declared in a module) then this association survives past the end of the procedure. Perhaps that's relevant in the elided code. (Alternatively, if the actual argument does not have the TARGET attribute, then any pointers associated with the dummy argument become undefined when the procedure ends.)
The components of the globalDATA type are themselves pointers. Consequently, GD in the main program is a pointer to something (that something being allocated by the single ALLOCATE statement in the main program) that itself contains pointers to other things (those other things being allocated by the numerous ALLOCATE statements in the function). You have two levels of pointer, hence two levels of ALLOCATE.
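For readers more familiar with C++, the two levels of allocation can be sketched by analogy. This is not a translation of the Fortran semantics. The type and function names below are invented, and only two of the components are modelled.

```cpp
#include <cassert>
#include <memory>

struct Gl1 { int value = 0; };
struct Gd2 { int value = 0; };

// Analogue of globalDATA: the components are themselves pointers,
// so creating the outer object does not create what they point to.
struct GlobalData {
    std::unique_ptr<Gl1> gl1;
    std::unique_ptr<Gd2> gd2;
};

// Analogue of INIT: it receives the pointed-to object (not the pointer),
// much like a dummy argument without the POINTER attribute,
// and performs the second level of allocation.
int init(GlobalData& gd) {
    gd.gl1 = std::make_unique<Gl1>();
    gd.gd2 = std::make_unique<Gd2>();
    return 0;
}
```

The caller performs the first level with something like `auto gd = std::make_unique<GlobalData>();` and then calls `init(*gd)`. Because `init` takes the object itself, it also accepts a `GlobalData` that was not dynamically allocated.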
Before Fortran 2003 (or Fortran 95 with the "allocatable TR") you couldn't have ALLOCATABLE components in derived types, and you couldn't have ALLOCATABLE dummy arguments - when the need for dynamic allocation collided with these former restrictions you had to use pointers instead, even if you were only using the pointers as values. I strongly suspect your code dates from this era (Support for the allocatable TR became widespread about a decade ago). In very "modern" Fortran pointers are (should?) only used when you might want variables that point at other things (where other things includes "no thing").
With a pointer variable that is a user-defined type that itself contains pointers, you have to allocate (i.e., create the storage) both the overall variable and the component pointers. The components aren't automatically allocated when the overall variable is. Someone made a design choice to allocate the overall variable in the main program and the components in a subroutine. Maybe they thought that allocating the overall variable was simple but allocating all of the components was getting complicated and wanted to relegate that to a subroutine.
Because the pointer attribute is not specified for the dummy argument, the entire derived type GD is passed from the main code (not the pointer to it). On the subroutine side, you could explicitly write
integer function INIT(GD) result(returnvalue)
...
use dataGLOBAL
type(globalDATA), intent(inout), target :: GD
to make it clearer. The target attribute of the dummy argument only ensures that you can point to that argument inside the subroutine via pointer assignment.
As long as you are only manipulating the fields of the derived type, but not the derived type as a whole (e.g. by allocating or deallocating it), it should not make a difference whether you call the INIT routine by passing a pointer or the derived type itself.
As noted already in other answers, the purpose of the program seems to be to separate the allocation of the derived type from the allocation of its components. One possible advantage of this strategy is that both pointers and statically allocated derived types can be passed to the INIT routine.
The following code is widely used for GPU global memory allocation:
float *M;
cudaMalloc((void**)&M,size);
I wonder why we have to pass a pointer to a pointer to cudaMalloc, and why it was not designed like:
float *M;
cudaMalloc((void*)M,size);
Thanks for any plain descriptions!
cudaMalloc needs to write the value of the pointer to M (not *M), so M must be passed by reference.
Another way would be to return the pointer in the classic malloc fashion. Unlike malloc, however, cudaMalloc returns an error status, like all CUDA runtime functions.
To explain the need in a little more detail:
Before the call to cudaMalloc, M points... anywhere, undefined. After the call to cudaMalloc you want a valid array to be present at the memory location it points to. One could naïvely say "then just allocate the memory at this location", but that's of course not possible in general: the undefined address will normally not even be inside valid memory. cudaMalloc needs to be able to choose the location. But if the pointer is passed by value, there's no way to tell the caller where.
In C++, one could make the signature
template<typename PointerType>
cudaStatus_t cudaMalloc(PointerType& ptr, size_t);
where passing ptr by reference allows the function to change the location, but since cudaMalloc is part of the CUDA C API this is not an option. The only way to pass something as modifiable in C is to pass a pointer to it. And since the object is itself a pointer, what you need to pass is a pointer to a pointer.
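The same pattern can be tried on the host without CUDA. The sketch below is a stand-in built on `std::malloc`, not the CUDA API. `status_malloc` and `status_malloc_by_value` are hypothetical names chosen to contrast the working out-parameter design with the by-value design the question proposes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// The return value is reserved for a status code, so the new address
// has to travel back through an out-parameter: a pointer to the caller's pointer.
int status_malloc(void** out, std::size_t size) {
    void* p = std::malloc(size);
    if (p == nullptr) {
        return 1;  // non-zero status: allocation failed
    }
    *out = p;      // write the address into the caller's pointer variable
    return 0;
}

// The design from the question: `ptr` is a copy of the caller's pointer,
// so assigning to it is invisible to the caller.
int status_malloc_by_value(void* ptr, std::size_t size) {
    ptr = std::malloc(size);
    std::free(ptr);  // released at once; the caller could never reach it
    return 0;
}
```

With `float* m = nullptr;`, a call such as `status_malloc(reinterpret_cast<void**>(&m), 4 * sizeof(float))` leaves `m` pointing at usable memory, whereas `status_malloc_by_value(m, 16)` leaves `m` unchanged.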
I'm currently improving the part of our COM component that logs all external calls into a file. For pointers we write something like (IInterface*)0x12345678 with the value being equal to the actual address.
Currently no difference is made for null pointers - they are displayed as 0x0 which IMO is suboptimal and inelegant. Changing this behaviour is not a problem at all. But first I'd like to know - is there any real advantage in representing null pointers in hex?
In C or C++, you should be able to use the standard %p formatting code, which will then make your pointers look like everybody else's.
I'm not sure how null pointers are formatted in Win32 by %p; the output is implementation-defined, and on Linux with glibc you get "(nil)".
Using the notation 0x0 (IMO) makes it clearer that it's referring to an address (even if it's not the internal representation of the null pointer). (In actual code, I would prefer using the NULL macro, though, but it sounds like you're talking specifically about debugging spew.)
It gives some context, just like I prefer using '\0' for the NUL-terminator.
It's a stylistic preference, though, so do what appeals to you (and to your colleagues).
Personally, I'd print 0x0 to the log file[*]. Some day when someone comes to parse the file automatically, the more uniform the data is the better. I don't find 0x0 difficult to read, so it seems silly to have a special case in the writer code, and another special case in the reader code, for no benefit that I can think of.
0x0 is preferable to 0 for grepping the log for NULLs, too: saves you having to figure out that you should be grepping for )0 or something funny.
I wouldn't write 0x0 for a null pointer constant in C or C++, though. I write non-null addresses so unbelievably rarely that there's nothing for the nulls to be uniform with. I guess if I was defining a bunch of constants to represent the memory map of some device, and the zero address was significant in that memory map, then I might write it 0x0 in that context.
[*] Or perhaps 0x00000000. I like 32-bit pointers to be printed 8 chars long, because when I read/remember a pointer I start out in pairs from the left. If it turns out to have 7 chars, I get horribly confused at the end ;-). 64-bit pointers it doesn't matter, because I can't remember a number that long anyway...
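A uniform, fixed-width format is easy to produce without special-casing null. The following is a minimal sketch. `format_pointer` is a hypothetical helper, and it assumes the platform provides `std::uintptr_t`, which is optional in the standard but available on mainstream platforms.

```cpp
#include <cassert>
#include <cinttypes>
#include <cstdint>
#include <cstdio>
#include <string>

// Format any pointer, null included, as "0x" followed by two hex digits
// per byte of pointer (8 digits for 32-bit pointers, 16 for 64-bit).
std::string format_pointer(const void* p) {
    char buf[2 + 2 * sizeof(void*) + 1];
    const int width = static_cast<int>(2 * sizeof(void*));
    std::snprintf(buf, sizeof buf, "0x%0*" PRIxPTR, width,
                  reinterpret_cast<std::uintptr_t>(p));
    return buf;
}
```

Unlike `%p`, whose output is implementation-defined, this gives the same shape on every platform, which keeps both grepping and automated parsing simple.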
It's all positive zero in the end.
There is: You can always convert them back to a number (0), with no additional effort. And the only disadvantage is readability.
There is no reason to prefer (SomeType*)0x0 to (SomeType*)0.
As an aside: In C, the null pointer constant is a somewhat strange construct; the compiler recognizes (SomeType*)0 as "the null pointer", even if the internal representation on some machine might differ from the numerical value 0. It is more like NULL in SQL -- not a "real" pointer value. In practice, all machines I know of model the null pointer as the "0" address.
I am pretty sure the hex notation is a result of the layout of memory. Memory is word aligned, where a word is 32 bits if you are on a 32 bit processor. These words are segmented into pages, which are arranged in page tables, etc. etc. Hex notation is the only way to make sense of this arrangements (unless you really like using your calculator).
In my opinion, it is for readability. Think about it: if you were to look at 0, what does that mean? Is it an unsigned integer? Whereas if it were 0x0, you would instinctively know that it has something to do with binary notation, and is more likely platform dependent.
Since the tag is language-agnostic, note that the 'null pointer' is spelled differently across languages: in Delphi/Object Pascal it is 'nil', in C# it is 'null', and in C/C++ it is 'NULL'.
Look, for example, at the C-FAQ, Section 5 on null pointers, specifically 5.4, 5.5, 5.6 and 5.7, to get an insight into this.
In a nutshell, the usage and notation of null pointers is dependent on:
What language is used?
Semantics and syntax of the language specifications.
What type of compiler?
Type of platform, in terms of how memory is accessed, the processor, bits...
Hope this helps,
Best regards,
Tom.