How are functions implemented? - function

I've been playing around with the reflect package, and I notice how limited the functionality of functions are.
package main
import (
"fmt"
"reflect"
"strings"
)
func main() {
v := reflect.ValueOf(strings.ToUpper)
fmt.Printf("Address: %v\n", v) // 0xd54a0
fmt.Printf("Can set? %d\n", v.CanSet()) // False
fmt.Printf("Can address? %d\n", v.CanAddr()) // False
fmt.Printf("Element? %d\n", v.Elem()) // Panics
}
Playground link here.
I've been taught that functions are addresses to memory with a set of instructions (hence v prints out 0xd54a0), but it looks like I can't get an address to this memory address, set it, or dereference it.
So, how are Go functions implemented under the hood? Eventually, I'd ideally want to manipulate the strings.ToUpper function by making the function point to my own code.

Disclaimers:
I've only recently started to delve deeper into the golang compiler, more specifically: the go assembler and mapping thereof. Because I'm by no means an expert, I'm not going to attempt explaining all the details here (as my knowledge is most likely still lacking). I will provide a couple of links at the bottom that might be worth checking out for more details.
What you're trying to do makes very, very little sense to me. If, at runtime, you're trying to modify a function, you're probably doing something wrong earlier on. And that's just in case you want to mess with any function. The fact that you're trying to do something with a function from the strings package makes this all the more worrying. The reflect package allows you to write very generic functions (eg a service with request handlers, but you want to pass arbitrary arguments to those handlers requires you to have a single handler, process the raw request, then call the corresponding handler. You cannot possibly know what that handler looks like, so you use reflection to work out the arguments required...).
Now, how are functions implemented?
The go compiler is a tricky codebase to wrap ones head around, but thankfully the language design, and the implementation thereof has been discussed openly. From what I gather, golang functions are essentially compiled in pretty much the same way as a function in, for example, C. However, calling a function is a bit different. Go functions are first-class objects, that's why you can pass them as arguments, declare a function type, and why the reflect package has to allow you to use reflection on a function argument.
Essentially, functions are not addressed directly. Functions are passed and invoked through a function "pointer". Functions are effectively a type like similar to a map or a slice. They hold a pointer to the actual code, and the call data. In simple terms, think of a function as a type (in pseudo-code):
type SomeFunc struct {
actualFunc *func(...) // pointer to actual function body
data struct {
args []interface{} // arguments
rVal []interface{} // returns
// any other info
}
}
This means that the reflect package can be used to, for example, count the number of arguments and return values the function expects. It also tells you what the return value(s) will be. The overall function "type" will be able to tell you where the function resides, and what arguments it expects and returns, but that's about it. IMO, that's all you really need though.
Because of this implementation, you can create fields or variables with a function type like this:
var callback func(string) string
This would create an underlying value that, based on the pseudo code above, looks something like this:
callback := struct{
actualFunc: nil, // no actual code to point to, function is nil
data: struct{
args: []interface{}{string}, // where string is a value representing the actual string type
rVal: []interface{}{string},
},
}
Then, by assigning any function that matches the args and rVal constraints, you can determine what executable code the callback variable points to:
callback = strings.ToUpper
callback = func(a string) string {
return fmt.Sprintf("a = %s", a)
}
callback = myOwnToUpper
I hope this cleared 1 or 2 things up a bit, but if not, here's a bunch of links that might shed some more light on the matter.
Go functions implementation and design
Introduction to go's ASM
Rob Pike on the go compiler written in go, and the plan 9 derived asm mapping
Writing a JIT in go asm
a "case study" attempting to use golang ASM for optimisation
Go and assembly introduction
Plan 9 assembly docs
Update
Seeing as you're attempting to swap out a function you're using for testing purposes, I would suggest you not use reflection, but instead inject mock functions, which is a more common practice WRT testing to begin with. Not to mention it being so much easier:
type someT struct {
toUpper func(string) string
}
func New(toUpper func(string) string) *someT {
if toUpper == nil {
toUpper = strings.ToUpper
}
return &someT{
toUpper: toUpper,
}
}
func (s *someT) FuncToTest(t string) string {
return s.toUpper(t)
}
This is a basic example of how you could inject a specific function. From within your foo_test.go file, you'd just call New, passing a different function.
In more complex scenario's, using interfaces is the easiest way to go. Simply implement the interface in the test file, and pass the alternative implementation to the New function:
type StringProcessor interface {
ToUpper(string) string
Join([]string, string) string
// all of the functions you need
}
func New(sp StringProcessor) return *someT {
return &someT{
processor: sp,
}
}
From that point on, simply create a mock implementation of that interface, and you can test everything without having to muck about with reflection. This makes your tests easier to maintain and, because reflection is complex, it makes it far less likely for your tests to be faulty.
If your test is faulty, it could cause your actual tests to pass, even though the code you're trying to test isn't working. I'm always suspicious if the test code is more complex than the code you're covering to begin with...

Underneath the covers, a Go function is probably just as you describe it- an address to a set of instructions in memory, and parameters / return values are filled in according to your system's linkage conventions as the function executes.
However, Go's function abstraction is much more limited, on purpose (it's a language design decision). You can't just replace functions, or even override methods from other imported packages, like you might do in a normal object-oriented language. You certainly can't do dynamic replacement of functions under normal circumstances (I suppose you could write into arbitrary memory locations using the unsafe package, but that's willful circumvention of the language rules, and all bets are off at that point).
Are you trying to do some sort of dependency injection for unit testing? If so, the idiomatic way to do this in Go is to define interface that you pass around to your functions/methods, and replace with a test version in your tests. In your case, an interface may wrap the call to strings.ToUpper in the normal implementation, but a test implementation might call something else.
For example:
type Upper interface {
ToUpper(string) string
}
type defaultUpper struct {}
func (d *defaultUpper) ToUpper(s string) string {
return strings.ToUpper(s)
}
...
// normal implementation: pass in &defaultUpper{}
// test implementation: pass in a test version that
// does something else
func SomethingUseful(s string, d Upper) string {
return d.ToUpper(s)
}
Finally, you can also pass function values around. For example:
var fn func(string) string
fn = strings.ToUpper
...
fn("hello")
... but this won't let you replace the system's strings.ToUpper implementation, of course.
Either way, you can only sort of approximate what you want to do in Go via interfaces or function values. It's not like Python, where everything is dynamic and replaceable.

Related

Are function declarations and function expressions implemented differently in Go? If yes, why?

I just got into programming again with Go (with no experience whatsoever in low-level languages), and I noticed that function expressions are not treated the same as function declarations (go1.18.5 linux/amd64).
For instance, this works (obviously):
package main
import "fmt"
func main() {
fmt.Println("Do stuff")
}
But this outputs an error:
package main
import "fmt"
var main = func() {
fmt.Println("Do stuff")
}
./prog.go:3:8: imported and not used: "fmt"
./prog.go:5:5: cannot declare main - must be func
Go build failed.
Even specifying the type as in var main func() = func() {} does nothing for the end result. Go seems to, first of all, evaluate if the main identifier is being used in any declaration, ignoring the type. Meanwhile, Javascript folks seem to choose what is more readable, like there's no underlying difference between them.
Questions:
Are function declarations and function expressions implemented differently under the hood, or is this behavior just hard-coded?
If yes, is the difference between these implementations critical?
Is Go somewhat better in any way for doing it the way it does?
From the spec:
The main package must [...] declare a function main that takes no arguments and returns no value.
The moment you write var main you prevent the above requirement from being met, because what you are creating is a variable storing a reference to a function literal as opposed to being a function declaration.
A function declaration binds an identifier, the function name, to a function.
A function literal represents an anonymous function.
So:
Are function declarations and function expressions implemented differently under the hood?
Yes.
Is Go somewhat better in any way for doing it the way it does?
Generally, no. It comes with the territory of being a typed language, which has all sorts of pros and cons depending on use.
If yes, is the difference between these implementations critical?
By what standard? As an example, a variable referencing a function literal could be nil, which isn't usually something you'd like to call. The same cannot happen with function declarations (assuming package unsafe is not used).
The code
func main() {
fmt.Println("Do stuff")
}
binds a function to the identifier main.
The code
var main = func() {
fmt.Println("Do stuff")
}
binds a variable of type func () to the identifier main and initializes the variable to the result of a function expression.
Are function declarations and function expressions implemented differently under the hood, or is this behavior just hard-coded?
A function expression evaluates to a function value. A function declaration binds a function value to a name. There is no difference in the implementation of these function values (but the result of an expressions can additionally close over function scoped variables).
If yes, is the difference between these implementations critical?
Yes.
The question points out one scenario where the difference is important (program execution starts at the function main in package main)
The compiler cannot inline a function called through a variable.
Functions cannot be declared inside another function, but a function expression can be assigned to a variable in a function.
Is Go somewhat better in any way for doing it the way it does?
Other languages make a distinction between an identifier bound a function and and identifier bound to a variable with a function type. Go is no better or worse in than those languages with regards to the binding of identifiers.
The OP says in a comment:
The question was about the difference between javascript and go, though, but you also answered it with "It comes with the territory of being a typed language".
The difference is unrelated to being a typed language. Identifiers can be bound to constants, types, variables and functions in Go. I may be going out on a limb here, but non-reserved identifiers in Javascript are always bound to variables.

What is the memory allocation when you return a function in Golang?

Here is a simplified code
func MyHandler(a int) http.Handler {
return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
w.WriteCode(a)
})
}
Whenever a http request comes MyHandler will be called, and it will return a function which will be used to handle the request. So whenever a http request comes a new function object will be created. Function is taken as the first class in Go. I'm trying to understand what actually happened when you return a function from memory's perspective. When you return a value, for example an integer, it will occupy 4 bytes in stack. So how about return a function and lots of things inside the function body? Is it an efficient way to do so? What's shortcomings?
If you're not used to closures, they may seem a bit magic. They are, however, easy to implement in compilers:
The compiler finds any variables that must be captured by the closure. It puts them into a working area that will be allocated and remain allocated as long as the closure itself exists.
The compiler then generates the inner function with a secret extra parameter, or some other runtime trickery,1 such that calling the function activates the closure.
Because the returned function accesses its closure variables through the compile-time arrangement, there's nothing special needed. And since Go is a garbage-collected language, there's nothing else needed either: the pointer to the closure keeps the closure data alive until the pointer is gone because the function cannot be called any more, at which point the closure data evaporates (well, at the next GC).
1GCC sometimes uses trampolines to do this for C, where trampolines are executable code generated at runtime. The executable code may set a register or pass an extra parameter or some such. This can be expensive since something treated as data at runtime (generated code) must be turned into executable code at runtime (possibly requiring a system call and potentially requiring that some supervisory code "vet" the resulting runtime code).
Go does not need any of this because the language was defined with closures in mind, so implementors don't, er, "close off" any easy ways to make this all work. Some runtime ABIs are defined with closures in mind as well, e.g., register r1 is reserved as the closure-variables pointer in all pointer-to-function types, or some such.
Actual function size is irrelevant. When you return a function like this, memory will be allocated for the closure, that is, any variables in the scope that the function uses. In this case, a pointer will be returned containing the address of the function and a pointer to the closure, which will contain a reference to the variable a.

In Rust, what is `fn() -> ()`?

I have a grasp of the Fn (capital-F) traits: Fn, FnMut, FnOnce. I understand that they are traits and work like traits.
But what about fn (lowercase-f)? It gets a different coloring in editors, which tells me it's not a trait. It can also be used in some places where the others can't (and vice-versa), though it seems to behave similarly in other cases. I couldn't find anything explaining it directly in the docs.
Rust has three kinds of function-like types:
Function items are what you get when you create a function by using fn foo() {...}. It's also the type of the constructor of a tuple-like struct or enum variant. Function items are zero-sized (they contain no data), and every non-generic function has a unique, unnameable function item type. In error messages, the compiler displays these "Voldemort types" as something like fn() -> () {foo} (with the name of the function in {}).
Closures are values similar to function items, but closures may contain data: copies of or references to whatever variables they capture from their environment. As you already know, you create a closure by using closure syntax (|args| expression). Like function items, closures have unique, unnameable types (rendered by the compiler something like [closure#src/main.rs:4:11: 4:23]).
Function pointers are what you're asking about: the types that look like fn() -> (). Function pointers cannot contain data, but they are not zero-sized; as their name suggests, they are pointers. A function pointer may point either to a function item, or to a closure that captures nothing, but it cannot be null.
Function items and closures are automatically coerced to the relevant function pointer type when possible, so that's why let f: fn(i32) = |_| (); works: because the closure captures nothing, it can be coerced to a function pointer.
All three function-like types implement the relevant Fn, FnMut and FnOnce traits (except that closures might not implement Fn or FnMut depending on what they capture). Function items and function pointers also implement Copy, Clone, Send and Sync (closures only implement these traits when all their contents do).
Performance-wise, function pointers are something of a compromise between generics and trait objects. They have to be dereferenced to be called, so calling a function pointer may be slower than calling a function item or closure directly, but still faster than calling a dyn Fn trait object, which involves a vtable lookup in addition to the indirect call. However, in real code there are many variables that confound naive analysis; if the difference in performance is important to you, you should measure it instead of guessing which is faster.
References
What's the practical difference between fn item and fn pointer?
Why design a language with unique anonymous types?
How do I make a struct for FFI that contains a nullable function pointer?
Why does passing a closure to function which accepts a function pointer not work?
It is a function pointer type.
It refers only to a function, not a closure, since it contains just the address of the function not the captured environment a closure needs.
A Fn trait (capital F) can refer either to a closure or a function.
fn is the type for a function pointer. See also here in the documentation:
https://doc.rust-lang.org/std/primitive.fn.html

Golang Function Types with a binding to a struct?

This might seem like a silly question, but I want to make a struct with a collection of functions, but the functions bind to the struct. I can sorta see that this is a cycle, but humor me with this example:
type FuncType func() error
type FuncSet struct {
TokenVariable int
FuncTyper FuncType
}
and I want to be able to create a function bound to the FuncSet type so it can operate on TokenVariable, thusly:
func (f *FuncSet) FuncType() error {
f.TokenVariable = 100
return nil
}
However, this changes the signature of the type (I can't find any information about type bindings as part of function type specifications) such that assigning this function to the struct element tells me this function/variable is not found.
I can see an easy work-around for this, by prefixing the parameters with a pointer to the struct type, it's just a bit ugly.
I looked around a little further and discovered that what I'm kinda looking for is like a closure in that it can be passed a variable from the immediate outer scope but... well, I'll be glad to be corrected about this absence of type binding in function types, but for now passing the pointer to the type looks like the way to go.
I think I found the solution:
type nullTester func(*Bast, uint32) bool
type Bast struct {
...
isNull nullTester
...
}
func isNull(b *Bast, d uint32) bool {
return d == 0
}
and then I can bind it to the type like this:
func NewBast() (b *Bast) {
...
b.isNull = isNull
...
}
// IsNull - tests if a value in the tree is null
func (b *Bast) IsNull(d uint32) bool {
return b.isNull(b, d)
}
It seems a bit hackish and I'm not sure what's going to happen in a second library that I will write that sets a different type for the uint32 parameter, but go vet is happy so maybe this is the correct way to do it.
It does seem to me that func types should really have a field in the grammar to specify a binding type, but maybe I just found a hack that sorta lets me do polymorphism. In calling programs all they will see is the nice exported function that binds to the type as planned and I get my readability as well as being able to retarget the base library to store a different type of data.
I think this is the proper solution. I just can't find anything that confirms or denies whether in a type Name func specification there is any way of asserting the type. It really should not match up, since the binding is part of the signature, but the syntax for type with functions does not appear to have this type binding.
My actual code is here, and you can see by looking at it what I am aiming to do:
https://github.com/calibrae-project/bast/blob/master/pkg/bast/bast.go
The differences between the type of data the tree stores is entirely superficial, because it is intended to be primarily used for sorting unsigned integers of various lengths, and one important thing it needs to have is to be able to work from a, for example, 64 bit integer but sort only by the first or last half (as I have a bigger project that treats these hash values as coordinates in an adjacency list). In theory it could be used instead of a hash table lookup as well, with a low variance in time to find elements because of the binary tree structure.
It's not a conventional, reference-vector based tree, and the store itself is an array with an unconventional power of two mapping, a 'dense' tree, and the purpose above all, for implementing this way, is that when the tree is walked, as well as rotated, much of the time it is sequential blocks of memory being accessed which should make for a lot less cache misses than a conventional binary tree (and for which reason generally this type of application just uses some kind of sort like a bucket sort).
You could use an anonymous field with an interface that defines the method set that you want to use (that might change).
Go playground here
You'd define your interface
type validator interface {
IsRightOf(a, b interface{}) bool
... // other methods
}
and your type:
type Bast struct {
validator // anonymous interface field
... // some fields
}
Then you can access the methods of validator from the Bast type
b := bast.New()
b.IsRightOf(c, d) // this is valid, you do not need to do b.validator.IsRightOf(...)
because validator is an interface you can change those methods how you like.

why overloading not support in Actionscript?

Action script is developed based on object oriented programming but why does it not support function overloading?
Does Flex support overloading?
If yes, please explain briefly with a real example.
As you say, function overloading is not supported in Action Script (and therefore not even in Flex).
But the functions may have default parameters like here:
public function DoSomething(a:String='', b:SomeObject=null, c:Number=0):void
DoSomething can be called in 4 different ways:
DoSomething()
DoSomething('aString')
DoSomething('aString', anObject)
DoSomething('aString', anObject, 123)
This behavior maybe is because Action Script follows the ECMA Script standard. A function is indeed one property of the object, so, like you CAN'T have two properties with the same name, you CAN'T have two functions with the same name. (This is just a hypothesis)
Here is the Standard ECMA-262 (ECMAScript Language Specification) in section 13 (page 83 of the PDF file) says that when you declare a function like
function Identifier(arg0, arg1) {
// body
}
Create a property of the current variable object with name Identifier and value equals to a Function object created like this:
new Function(arg0, arg1, body)
So, that's why you can't overload a function, because you can't have more than one property of the current variable object with the same name
It's worth noting that function overloading is not an OOP idiom, it's a language convention. OOP languages often have overloading support, but it's not necessary.
As lk notes, you can approximate it with the structure he shows. Alternately, you can do this:
public function overloaded(mandatory1: Type, mandatory2: Type, ...rest): *;
This function will require the first two arguments and then pass the rest in as an array, which you can then handle as needed. This is probably the more flexible approach.
There is another way - function with any parameters returns anything.
public function doSomething(...args):*{
if(args.length==1){
if(args[0] is String){
return args[0] as String;
}
if(args[0] is Number){
return args[0] as Number;
}
}
if(args.length==2){
if(args[0] is Number && args[1] is Number){
return args[0]+args[1];
}
}
}
You can't overload, but you can set default values for arguments which is practically the same thing, but it does force you to plan your methods ahead sometimes.
The reason it doesn't is probably mostly a time/return on investment issue for Adobe in designing and writing the language.
Likely because Actionscript looks up functions by function name at runtime, rather than storing them by name and parameters at compile time.
This feature makes it easy to add and remove functions from dynamic objects, and the ability to get and call functions by name using object['functionName'](), but I imagine that it makes implementing overloading very difficult without making a mess of those features.