I have a grasp of the Fn (capital-F) traits: Fn, FnMut, FnOnce. I understand that they are traits and work like traits.
But what about fn (lowercase-f)? It gets a different coloring in editors, which tells me it's not a trait. It can also be used in some places where the others can't (and vice-versa), though it seems to behave similarly in other cases. I couldn't find anything explaining it directly in the docs.
Rust has three kinds of function-like types:
Function items are what you get when you create a function by using fn foo() {...}. It's also the type of the constructor of a tuple-like struct or enum variant. Function items are zero-sized (they contain no data), and every non-generic function has a unique, unnameable function item type. In error messages, the compiler displays these "Voldemort types" as something like fn() -> () {foo} (with the name of the function in {}).
Closures are values similar to function items, but closures may contain data: copies of or references to whatever variables they capture from their environment. As you already know, you create a closure by using closure syntax (|args| expression). Like function items, closures have unique, unnameable types (rendered by the compiler something like [closure@src/main.rs:4:11: 4:23]).
Function pointers are what you're asking about: the types that look like fn() -> (). Function pointers cannot contain data, but they are not zero-sized; as their name suggests, they are pointers. A function pointer may point either to a function item, or to a closure that captures nothing, but it cannot be null.
Function items and closures are automatically coerced to the relevant function pointer type when possible, so that's why let f: fn(i32) = |_| (); works: because the closure captures nothing, it can be coerced to a function pointer.
All three function-like types implement the relevant Fn, FnMut and FnOnce traits (except that closures might not implement Fn or FnMut depending on what they capture). Function items and function pointers also implement Copy, Clone, Send and Sync (closures only implement these traits when all their contents do).
Performance-wise, function pointers are something of a compromise between generics and trait objects. They have to be dereferenced to be called, so calling a function pointer may be slower than calling a function item or closure directly, but still faster than calling a dyn Fn trait object, which involves a vtable lookup in addition to the indirect call. However, in real code there are many variables that confound naive analysis; if the difference in performance is important to you, you should measure it instead of guessing which is faster.
It is a function pointer type.
It refers only to a function, not a closure, since it contains just the address of the function, not the captured environment a closure needs.
An Fn trait (capital F) can refer either to a closure or to a function.
fn is the type for a function pointer. See also here in the documentation:
https://doc.rust-lang.org/std/primitive.fn.html
I just got into programming again with Go (with no experience whatsoever in low-level languages), and I noticed that function expressions are not treated the same as function declarations (go1.18.5 linux/amd64).
For instance, this works (obviously):
package main
import "fmt"
func main() {
fmt.Println("Do stuff")
}
But this outputs an error:
package main
import "fmt"
var main = func() {
fmt.Println("Do stuff")
}
./prog.go:3:8: imported and not used: "fmt"
./prog.go:5:5: cannot declare main - must be func
Go build failed.
Even specifying the type, as in var main func() = func() {}, does not change the outcome. Go seems to check first whether the main identifier is used in any declaration at all, ignoring its type. JavaScript folks, meanwhile, seem to just pick whichever form reads better, as if there were no underlying difference between the two.
Questions:
Are function declarations and function expressions implemented differently under the hood, or is this behavior just hard-coded?
If yes, is the difference between these implementations critical?
Is Go somewhat better in any way for doing it the way it does?
From the spec:
The main package must [...] declare a function main that takes no arguments and returns no value.
The moment you write var main you prevent the above requirement from being met, because what you are creating is a variable holding a function value (the result of a function literal), not a function declaration.
A function declaration binds an identifier, the function name, to a function.
A function literal represents an anonymous function.
So:
Are function declarations and function expressions implemented differently under the hood?
Yes.
Is Go somewhat better in any way for doing it the way it does?
Generally, no. It comes with the territory of being a typed language, which has all sorts of pros and cons depending on use.
If yes, is the difference between these implementations critical?
By what standard? As an example, a variable referencing a function literal could be nil, which isn't usually something you'd like to call. The same cannot happen with function declarations (assuming package unsafe is not used).
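To make that last point concrete, here is a minimal runnable sketch (the names are invented) contrasting a declared function with a function-typed variable:
package main

import "fmt"

// greet is a declared function: it always exists and can never be nil.
func greet() { fmt.Println("hello") }

func main() {
    greet()

    // A variable of function type starts out nil...
    var f func()
    fmt.Println(f == nil) // true

    // ...and calling it while it is still nil panics at run time:
    //   f() // panic: runtime error: invalid memory address or nil pointer dereference

    f = greet
    f() // hello
}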
The code
func main() {
fmt.Println("Do stuff")
}
binds a function to the identifier main.
The code
var main = func() {
fmt.Println("Do stuff")
}
binds a variable of type func() to the identifier main and initializes the variable to the result of a function expression.
Are function declarations and function expressions implemented differently under the hood, or is this behavior just hard-coded?
A function expression evaluates to a function value. A function declaration binds a function value to a name. There is no difference in the implementation of these function values (but the result of an expression can additionally close over function-scoped variables).
If yes, is the difference between these implementations critical?
Yes.
The question points out one scenario where the difference is important (program execution starts at the function main in package main)
The compiler cannot inline a function called through a variable.
Functions cannot be declared inside another function, but a function expression can be assigned to a variable in a function.
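A short runnable sketch of those last two points (invented names): a function cannot be declared inside another function, but a function expression assigned to a variable can appear there, and it may close over local variables:
package main

import "fmt"

func main() {
    // func double(x int) int { return 2 * x } // not allowed: no nested function declarations

    factor := 3
    scale := func(x int) int { return factor * x } // function expression, closes over factor

    fmt.Println(scale(5)) // 15
    factor = 10
    fmt.Println(scale(5)) // 50: the closure sees the updated variable
}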
Is Go somewhat better in any way for doing it the way it does?
Other languages make a distinction between an identifier bound to a function and an identifier bound to a variable with a function type. Go is no better or worse than those languages with regard to the binding of identifiers.
The OP says in a comment:
The question was about the difference between javascript and go, though, but you also answered it with "It comes with the territory of being a typed language".
The difference is unrelated to being a typed language. Identifiers can be bound to constants, types, variables and functions in Go. I may be going out on a limb here, but non-reserved identifiers in JavaScript are always bound to variables.
I am writing a language where functions are not typed. Which means I need to infer the return type of a function call in order to do type checking. However when somebody writes a recursive function the type checker goes into an infinite recursion trying to infer the type of the function call inside the function body.
The type checker does something like this:
1. Infer the types of the function call actual arguments.
2. Create a mapping of the actual argument types to the formal arguments.
3. Use the mapping to annotate types on the arguments used inside the function body.
4. Infer and return the return type of the function body.
Step 4 then tries to infer the type of the recursive call inside the function body, which invokes the same type-checker function again, causing infinite recursion.
An example of a recursive function that gives me this problem:
function factorial(n) = n<1 ? 1 : n*factorial(n-1); // Function definition.
...
assert 24 == factorial(4); // Function call expression usage example.
How can I solve this problem without going in to an infinite recursion loop? Is there a way to infer the type of the recursive function call without having to go into the body again? Or some clean way to infer the type from context?
I know the easy solution might be to add types annotations to functions, this way the problem is trivial, but before doing that I want to know if there is a way to solve this without resorting to that.
I'd also like for the solution to work for mutual recursion.
Type inference can vary a lot depending on the language's type system and on what properties you want to have in terms of when annotations are needed. But whatever your language looks like, I think there's one seminal case you really should read about, which is ML. ML's type inference holds a nice sweet spot where it all fits together in a relatively simple paradigm. No type annotations are needed, and any expression has a single most general type (this property is called principality of typing).
ML's type system is the Hindley-Milner type system, which has parametric polymorphism. The type of an expression is either a specific type, or “any”. More precisely, the type constructor of an expression is either a specific type constructor or “any”, and type constructors can have arguments which themselves either have a specific type constructor or “any”. For example, the empty list has the type “list of any”. Two expressions that can have “any” type in isolation may be constrained to have the same type, whatever it is, so “any” is expressed with variables. For example, function list_of_two(x, y) = [x, y] (in a notation like your language) constrains x and y to have the same type, because they're inserted in the same list, but that type can be any type, so the type of this function is “take any two parameters of the same type α, and return a value of type list of α”.
The basic type inference algorithm for Hindley-Milner is algorithm W. At its core, it works by giving each subexpression a type that's a variable: α₁, α₂, α₃, … Programming language constructions then impose constraints on those variables. For example, if a list contains two elements of types α₁ and α₂ and the list itself has the type α₃, this imposes the constraints α₁ = α₂ and α₃ = list of α₁. Putting all these constraints together is a unification problem.
The constraints are based on a purely syntactic reading of the program. If there's a recursive call, you don't need to know the type of the function: it just means that there's a constraint that the variable for the return type of the function is the same as the type at its point of use. That's just one more equation to add to the set of constraints.
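To illustrate (this is only a rough sketch in Go with made-up names, not Algorithm W; it has no generalization and no occurs check), the recursive call contributes one more equation against the same variables instead of re-entering the checker:
package main

import "fmt"

// Types are either concrete constructors ("int", "fun", ...) or type variables.
type Type struct {
    Name string  // constructor name; empty means this is a type variable
    Args []*Type // constructor arguments
    Var  int     // variable id, meaningful only when Name == ""
}

var nextVar int

func freshVar() *Type { nextVar++; return &Type{Var: nextVar} }

// subst maps variable ids to the types they have been unified with.
type subst map[int]*Type

// resolve follows the substitution until it reaches a constructor or a free variable.
func (s subst) resolve(t *Type) *Type {
    for t.Name == "" {
        r, ok := s[t.Var]
        if !ok {
            return t
        }
        t = r
    }
    return t
}

// unify records the constraint a = b, or reports a mismatch.
func (s subst) unify(a, b *Type) error {
    a, b = s.resolve(a), s.resolve(b)
    switch {
    case a.Name == "" && b.Name == "" && a.Var == b.Var:
        return nil // same variable, nothing to do
    case a.Name == "": // a is a free variable: bind it
        s[a.Var] = b
        return nil
    case b.Name == "":
        s[b.Var] = a
        return nil
    case a.Name != b.Name || len(a.Args) != len(b.Args):
        return fmt.Errorf("cannot unify %s with %s", a.Name, b.Name)
    default:
        for i := range a.Args {
            if err := s.unify(a.Args[i], b.Args[i]); err != nil {
                return err
            }
        }
        return nil
    }
}

func main() {
    s := subst{}
    intT := &Type{Name: "int"}

    // Before looking at factorial's body, give it the type arg -> ret,
    // where arg and ret are fresh type variables.
    arg, ret := freshVar(), freshVar()

    // Constraints read off the body n < 1 ? 1 : n * factorial(n-1):
    s.unify(arg, intT) // n is compared with and multiplied by integers
    s.unify(ret, intT) // the branch `1` is an integer

    // The recursive call factorial(n-1) does not re-enter the checker; it just
    // constrains the same variables again: its argument unifies with arg,
    // and its result type is simply ret.
    s.unify(arg, intT)

    fmt.Println("factorial :", s.resolve(arg).Name, "->", s.resolve(ret).Name)
}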
I left out an important aspect of ML which is that an expression's type can be generalized: an expression can be used with different types at different places. This is what allows polymorphism. For example,
let empty_list = [] in
(empty_list @ [3]), (empty_list @ ["hello"])
is a valid program where empty_list is used once with the type “list of integers” and once with the type “list of strings”. The type of empty_list is “for any α, list of α”: that's parametric polymorphism. Generalization adds some complexity to the algorithm, but it also removes complexity elsewhere, because that's what allows principality. Without it, let empty_list = [] in … would be ambiguous: empty_list would have to have some type, but there's no way to know what type without analyzing …, and then when you do analyze the … above you'd need to make a choice between integer and string.
Depending on your language's type system, ML and algorithm W may be directly reusable or may just provide some vague inspiration. But the principle of using variables during the inference, and progressively constraining these variables, is very general.
This might seem like a silly question, but I want to make a struct with a collection of functions, where the functions bind to the struct. I can sorta see that this is a cycle, but humor me with this example:
type FuncType func() error
type FuncSet struct {
TokenVariable int
FuncTyper FuncType
}
and I want to be able to create a function bound to the FuncSet type so it can operate on TokenVariable, thusly:
func (f *FuncSet) FuncType() error {
f.TokenVariable = 100
return nil
}
However, this changes the signature of the function (I can't find any information about receiver bindings as part of function type specifications), such that assigning this function to the struct field tells me the function/variable is not found.
I can see an easy work-around for this: prefixing the parameters with a pointer to the struct type. It's just a bit ugly.
I looked around a little further and discovered that what I'm kinda looking for is like a closure in that it can be passed a variable from the immediate outer scope but... well, I'll be glad to be corrected about this absence of type binding in function types, but for now passing the pointer to the type looks like the way to go.
I think I found the solution:
type nullTester func(*Bast, uint32) bool
type Bast struct {
...
isNull nullTester
...
}
func isNull(b *Bast, d uint32) bool {
return d == 0
}
and then I can bind it to the type like this:
func NewBast() (b *Bast) {
...
b.isNull = isNull
...
}
// IsNull - tests if a value in the tree is null
func (b *Bast) IsNull(d uint32) bool {
return b.isNull(b, d)
}
It seems a bit hackish and I'm not sure what's going to happen in a second library that I will write that sets a different type for the uint32 parameter, but go vet is happy so maybe this is the correct way to do it.
It does seem to me that func types should really have a field in the grammar to specify a binding type, but maybe I just found a hack that sorta lets me do polymorphism. In calling programs all they will see is the nice exported function that binds to the type as planned and I get my readability as well as being able to retarget the base library to store a different type of data.
I think this is the proper solution. I just can't find anything that confirms or denies whether a type Name func(...) specification has any way of declaring a receiver. It really should not match up, since a method's receiver is part of its signature, but the syntax for function types does not appear to allow this kind of receiver binding.
My actual code is here, and you can see by looking at it what I am aiming to do:
https://github.com/calibrae-project/bast/blob/master/pkg/bast/bast.go
The differences between the types of data the tree stores are entirely superficial: it is intended primarily for sorting unsigned integers of various lengths, and one important requirement is being able to take, say, a 64-bit integer but sort only by its first or last half (a bigger project of mine treats these hash values as coordinates in an adjacency list). In theory it could be used instead of a hash table lookup as well, with a low variance in time to find elements because of the binary tree structure.
It's not a conventional, reference-based tree: the store is an array with an unconventional power-of-two mapping, a 'dense' tree. The main reason for implementing it this way is that walking or rotating the tree mostly accesses sequential blocks of memory, which should mean far fewer cache misses than a conventional binary tree (which is why this kind of application usually falls back on something like a bucket sort instead).
You could use an anonymous field with an interface that defines the method set you want to use (and that might change later).
You'd define your interface
type validator interface {
IsRightOf(a, b interface{}) bool
... // other methods
}
and your type:
type Bast struct {
validator // anonymous interface field
... // some fields
}
Then you can access the methods of validator from the Bast type
b := bast.New()
b.IsRightOf(c, d) // this is valid, you do not need to do b.validator.IsRightOf(...)
Because validator is an interface, you can swap in whatever implementation of those methods you like.
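A compact, runnable sketch of the pattern (the field and method names here are invented for illustration, not taken from the bast package):
package main

import "fmt"

// validator is the swappable method set.
type validator interface {
    IsNull(d uint32) bool
}

// Bast embeds the interface; its methods are promoted onto Bast.
type Bast struct {
    validator
    data []uint32
}

// zeroIsNull is one concrete implementation of validator.
type zeroIsNull struct{}

func (zeroIsNull) IsNull(d uint32) bool { return d == 0 }

func main() {
    b := &Bast{validator: zeroIsNull{}, data: []uint32{0, 7}}
    fmt.Println(b.IsNull(b.data[0])) // true, via the promoted method
    fmt.Println(b.IsNull(b.data[1])) // false
}
A second library can then plug in its own validator implementation without Bast changing at all.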
I've been playing around with the reflect package, and I notice how limited the functionality around functions is.
package main
import (
"fmt"
"reflect"
"strings"
)
func main() {
v := reflect.ValueOf(strings.ToUpper)
fmt.Printf("Address: %v\n", v) // 0xd54a0
fmt.Printf("Can set? %d\n", v.CanSet()) // False
fmt.Printf("Can address? %d\n", v.CanAddr()) // False
fmt.Printf("Element? %d\n", v.Elem()) // Panics
}
I've been taught that functions are addresses to memory with a set of instructions (hence v prints out 0xd54a0), but it looks like I can't get an address to this memory address, set it, or dereference it.
So, how are Go functions implemented under the hood? Eventually, I'd ideally want to manipulate the strings.ToUpper function by making the function point to my own code.
Disclaimers:
I've only recently started to delve deeper into the golang compiler, more specifically: the go assembler and mapping thereof. Because I'm by no means an expert, I'm not going to attempt explaining all the details here (as my knowledge is most likely still lacking). I will provide a couple of links at the bottom that might be worth checking out for more details.
What you're trying to do makes very, very little sense to me. If, at runtime, you're trying to modify a function, you're probably doing something wrong earlier on. And that's just if you want to mess with any function at all; the fact that you're trying to do something to a function from the strings package makes this all the more worrying. The reflect package allows you to write very generic code (e.g. a service with request handlers where you want to pass arbitrary arguments to those handlers: that requires a single handler that processes the raw request and then calls the corresponding handler. You cannot possibly know in advance what that handler looks like, so you use reflection to work out the arguments it requires...).
Now, how are functions implemented?
The Go compiler is a tricky codebase to wrap one's head around, but thankfully the language design, and the implementation thereof, have been discussed openly. From what I gather, Go functions are compiled in pretty much the same way as a function in, for example, C. However, calling a function is a bit different. Go functions are first-class objects; that's why you can pass them as arguments, declare a function type, and why the reflect package has to allow you to use reflection on a function argument.
Essentially, functions are not addressed directly. Functions are passed and invoked through a function "pointer". Functions are effectively a type, similar to a map or a slice: they hold a pointer to the actual code, and to the call data. In simple terms, think of a function as a type (in pseudo-code):
type SomeFunc struct {
actualFunc *func(...) // pointer to actual function body
data struct {
args []interface{} // arguments
rVal []interface{} // returns
// any other info
}
}
This means that the reflect package can be used to, for example, count the number of arguments and return values the function expects. It also tells you what the return value(s) will be. The overall function "type" will be able to tell you where the function resides, and what arguments it expects and returns, but that's about it. IMO, that's all you really need though.
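For example, a small runnable sketch of what reflect will (and won't) tell you about a function value:
package main

import (
    "fmt"
    "reflect"
    "strings"
)

func main() {
    t := reflect.TypeOf(strings.ToUpper)
    fmt.Println(t.Kind())              // func
    fmt.Println(t.NumIn(), t.NumOut()) // 1 1
    fmt.Println(t.In(0), t.Out(0))     // string string

    // You can even call the function dynamically...
    v := reflect.ValueOf(strings.ToUpper)
    out := v.Call([]reflect.Value{reflect.ValueOf("hello")})
    fmt.Println(out[0].String()) // HELLO

    // ...but there is no way to make v point at different code.
}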
Because of this implementation, you can create fields or variables with a function type like this:
var callback func(string) string
This would create an underlying value that, based on the pseudo code above, looks something like this:
callback := struct{
actualFunc: nil, // no actual code to point to, function is nil
data: struct{
args: []interface{}{string}, // where string is a value representing the actual string type
rVal: []interface{}{string},
},
}
Then, by assigning any function that matches the args and rVal constraints, you can determine what executable code the callback variable points to:
callback = strings.ToUpper
callback = func(a string) string {
return fmt.Sprintf("a = %s", a)
}
callback = myOwnToUpper
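Put together as a runnable program (myOwnToUpper is just an invented example of a function with a matching signature):
package main

import (
    "fmt"
    "strings"
)

// myOwnToUpper stands in for "any function with the right signature".
func myOwnToUpper(a string) string { return strings.ToUpper(a) + "!" }

func main() {
    var callback func(string) string

    callback = strings.ToUpper
    fmt.Println(callback("go")) // GO

    callback = func(a string) string { return fmt.Sprintf("a = %s", a) }
    fmt.Println(callback("go")) // a = go

    callback = myOwnToUpper
    fmt.Println(callback("go")) // GO!
}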
I hope this cleared 1 or 2 things up a bit, but if not, here's a bunch of links that might shed some more light on the matter.
Go functions implementation and design
Introduction to go's ASM
Rob Pike on the go compiler written in go, and the plan 9 derived asm mapping
Writing a JIT in go asm
a "case study" attempting to use golang ASM for optimisation
Go and assembly introduction
Plan 9 assembly docs
Update
Seeing as you're attempting to swap out a function for testing purposes, I would suggest you not use reflection, but instead inject mock functions, which is a more common testing practice to begin with. Not to mention it's so much easier:
type someT struct {
toUpper func(string) string
}
func New(toUpper func(string) string) *someT {
if toUpper == nil {
toUpper = strings.ToUpper
}
return &someT{
toUpper: toUpper,
}
}
func (s *someT) FuncToTest(t string) string {
return s.toUpper(t)
}
This is a basic example of how you could inject a specific function. From within your foo_test.go file, you'd just call New, passing a different function.
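For example (the package name is assumed and fakeUpper is an invented test double), a test using the New constructor above might look like this:
package foo

import (
    "strings"
    "testing"
)

// fakeUpper is a stand-in injected only by the test.
func fakeUpper(s string) string { return "upper(" + s + ")" }

func TestFuncToTest(t *testing.T) {
    s := New(fakeUpper) // inject the fake instead of strings.ToUpper
    if got := s.FuncToTest("hi"); got != "upper(hi)" {
        t.Fatalf("got %q", got)
    }

    // Passing nil falls back to the real strings.ToUpper.
    d := New(nil)
    if got := d.FuncToTest("hi"); got != strings.ToUpper("hi") {
        t.Fatalf("got %q", got)
    }
}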
In more complex scenarios, using interfaces is the easiest way to go. Simply implement the interface in the test file, and pass the alternative implementation to the New function:
type StringProcessor interface {
ToUpper(string) string
Join([]string, string) string
// all of the functions you need
}
type someT struct {
    processor StringProcessor
}

func New(sp StringProcessor) *someT {
    return &someT{
        processor: sp,
    }
}
From that point on, simply create a mock implementation of that interface, and you can test everything without having to muck about with reflection. This makes your tests easier to maintain and, because reflection is complex, it makes it far less likely for your tests to be faulty.
If your test is faulty, it could cause your actual tests to pass, even though the code you're trying to test isn't working. I'm always suspicious if the test code is more complex than the code you're covering to begin with...
Underneath the covers, a Go function is probably just as you describe it: an address of a set of instructions in memory, with parameters and return values filled in according to your system's linkage conventions as the function executes.
However, Go's function abstraction is much more limited, on purpose (it's a language design decision). You can't just replace functions, or even override methods from other imported packages, like you might do in a normal object-oriented language. You certainly can't do dynamic replacement of functions under normal circumstances (I suppose you could write into arbitrary memory locations using the unsafe package, but that's willful circumvention of the language rules, and all bets are off at that point).
Are you trying to do some sort of dependency injection for unit testing? If so, the idiomatic way to do this in Go is to define an interface that you pass around to your functions/methods, and replace it with a test version in your tests. In your case, an interface may wrap the call to strings.ToUpper in the normal implementation, but a test implementation might call something else.
For example:
type Upper interface {
ToUpper(string) string
}
type defaultUpper struct {}
func (d *defaultUpper) ToUpper(s string) string {
return strings.ToUpper(s)
}
...
// normal implementation: pass in &defaultUpper{}
// test implementation: pass in a test version that
// does something else
func SomethingUseful(s string, d Upper) string {
return d.ToUpper(s)
}
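Assembled into a runnable program, with a hypothetical fakeUpper standing in for the test version:
package main

import (
    "fmt"
    "strings"
)

type Upper interface {
    ToUpper(string) string
}

// Production implementation: delegates to the standard library.
type defaultUpper struct{}

func (d *defaultUpper) ToUpper(s string) string { return strings.ToUpper(s) }

// fakeUpper tags its input instead of upper-casing it, so a test can tell
// which implementation was used.
type fakeUpper struct{}

func (f *fakeUpper) ToUpper(s string) string { return "fake:" + s }

func SomethingUseful(s string, d Upper) string {
    return d.ToUpper(s)
}

func main() {
    fmt.Println(SomethingUseful("hello", &defaultUpper{})) // HELLO
    fmt.Println(SomethingUseful("hello", &fakeUpper{}))    // fake:hello
}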
Finally, you can also pass function values around. For example:
var fn func(string) string
fn = strings.ToUpper
...
fn("hello")
... but this won't let you replace the system's strings.ToUpper implementation, of course.
Either way, you can only sort of approximate what you want to do in Go via interfaces or function values. It's not like Python, where everything is dynamic and replaceable.
I have read the question Difference between method and function in Scala and many articles about the differences between methods and functions. I got the feeling that a 'method' is just a "named function" defined as a method in a class, a trait or an object. A 'function' refers to things like the "anonymous function", "function literal" or "function object" in those articles. Evidence can be found in the book Programming in Scala http://www.artima.com/shop/programming_in_scala_2ed, page 141, section 8.1: "The most common way to define a function is as a member of some object. Such a function is called a method."
However, when I checked the Scala Language Reference http://www.scala-lang.org/docu/files/ScalaReference.pdf, there are concepts like named method. On page 91, Section 6.20 Return expressions: "A return expression return e must occur inside the body of some enclosing named method or function." You can also find the term "named function" on the same page and in other places.
So my question is, in Scala, do method, named method, and named function refer to the same concept? Where do you get the definition of named function?
In code List(1, 2).map(_ + 1), the original expression _ + 1 is a named method, then the method is converted into a function. What kind of function, anonymous function, function object, named function?
In my understanding, Scala only has two types of function: a named function that is a method; an anonymous function that is a function literal. Function literal is compiled into a function object of trait FunctionN for it to be used in the pure object-oriented world of Scala.
However, for a regular named function/method such as _ + 1 in the above code, why does Scala transform it into another function object?
At the language level, there are only two concepts:
Methods are fundamental building blocks of Scala. Methods are always named. Methods live in classes or traits. Methods are a construct native to the JVM, and thus are the same in both Scala and Java. Methods in Scala (unlike functions) may have special features: they can be abstracted over type parameters, their arguments can have default values or be implicit, etc.
Function objects are just instances of a function trait (Function1, Function2, ...). The function is evaluated when the apply method on the function object is called. There is special syntax for defining unnamed "anonymous" functions (aka, "function literals"). A function is just a value, and as such can be named (e.g., val f: (Int => Int) = (x => x)). The type A => B is shorthand for Function1[A, B].
In the linked SO question, it was mentioned that some references (like the Scala spec) use the word "function" imprecisely to mean either "method" or "function object". I guess part of the reason is that methods can be automatically converted to function objects, depending on the context. Note, however, that the opposite conversion wouldn't make sense: a method is not a first-class value that lives on the heap with its own independent existence. Rather, a method is inextricably linked to the class in which it is defined.
The answers to the linked question cover this fairly well, but to address your specific queries:
method => The thing you define with the def keyword
named method => The same, all methods have names
named function => a function that has been assigned to a value, or converted from a method. As contrasted with an anonymous function.
The difference between a method and a Function is somewhat like the difference between an int primitive and a boxed Integer in Java.
In general discussion, it's common to hear both described as being "integers". This normally isn't a problem, but you must take care to be precise wherever the distinction is relevant.
Likewise, a method will be automatically converted to a Function (and therefore an object) when your program demands it, much like boxing a primitive. So it's not entirely wrong to refer to a method as being a function.
UPDATE
So how does it work?
When you attempt to pass a method as the argument to e.g. List[A].map, the compiler will generate an inner class (with a synthetic name) that extends Function1[A,B], with an apply method that delegates to the method you originally supplied. An instance of this class is then passed as the actual argument.