What's the difference between these two function calling conventions?

Functions can be called in a couple ways:
say(1, 2, 3) # 123
say: 1, 2, 3 # (1, 2, 3)
The latter seems to pass a Positional, but apart from that I don't know how else they differ. Are there any differences that are important to know? What types of situations would you use one over the other?

@jjmerelo's answer covers the basics. This complementary answer, which aims at being somewhat exhaustive but hopefully not exhausting, covers traps, rare cases, and advice.
foo: valuea, valueb, ...
Surprisingly perhaps, this is not a call of a sub or method called foo.
Instead it's a statement that begins with a label, foo:.
The say: line in your question won't work in an ordinary program:
say: <a b c>; # Useless use of constant value a b c ...
The "Useless use" warning means the <a b c> doesn't get used in a useful way. The say: isn't doing anything with the list of values. It's just a label that doesn't do anything.
Presumably you are using something like the Perl 6 REPL. The REPL automatically says the last value in a line if it isn't otherwise used, thus making the line appear to work without a warning.
.a-method:
If a postfix method call using the form .a-method has no arguments other than the invocant (the argument to the left of the ., or the current topic if there isn't an explicit invocant) then you can just write it in the form:
42.say ;
You can optionally append a colon:
42.say: ;
There's no good reason to, but it's consistent with:
.a-method: arg2, arg3, ...
If you want to supply one or more arguments (other than the invocant) to a postfix .a-method call, then you have to pick one of two ways to introduce them.
One way is to write a colon immediately after the method name, before the argument(s). There must be no space between the method name and colon, and there must be space after the colon before the method argument(s).[1]
For example, this method call uses a colon before the Numeric argument:
say <abc 2 def ghi> .first: Numeric ; # 2
In the above line the method call expression (.first: Numeric) ends at the statement terminator (;). If there's an enclosing sub-expression such as an array subscript then the method call expression ends at the end of that sub-expression:
say .[1 + .first: Numeric] given <abc 2 def ghi> ; # ghi
The argument list of a colon form method call is also brought to a close by a valid statement modifier like given:
say .first: Numeric given <abc 2 def ghi> ; # 2
a-sub arg1, arg2, ...
This is the corresponding form for subroutine calls. The only format differences are that the sub has no invocant or . before the sub name and you must omit the colon after the sub name.
.a-method( arg2, arg3, ... )
a-sub( arg1, arg2, ... )
The other common form used for both method and sub calls is to immediately follow the method or sub name with parens to delimit arguments. The opening paren must immediately follow, without any space between the routine name and (.
Here's parens used with the .first method:
say 1 + .first(Numeric) given <abc 2 def ghi> ; # 3
This has the advantage that it's arguably prettier than the alternative of using outer parens:
say 1 + (.first: Numeric) given <abc 2 def ghi> ; # 3
If you want to put a sub call directly inside a double quoted string, you need to prefix the sub name with an & sigil and use the postfix parens form:
my @array = <abc 2 def ghi> ;
say "first number is &first(Numeric,@array)" ; # first number is 2
To put in a method call, you again have to use the postfix parens form, and you must also provide an explicit invocant (you can't just write "Some text .a-method()"):
my @array = <abc 2 def ghi> ;
say "first number is @array.first(Numeric)" ; # first number is 2
If there are no arguments (other than the invocant for a method call) you still need to use this form with empty parens if you want to interpolate a sub or method call in a string:
my @array = <abc 2 def ghi> ;
say "no method call @array[3].uc" ; # no method call ghi.uc
say "with method call @array[3].uc()" ; # with method call GHI
say "&rand"; # &rand
say "&rand()"; # 0.929123203371282
.a-method ( arrgh, arrgh, ... ) ;
This won't work.
Because the .a-method isn't followed by a colon, the method call is considered complete.
That means the next thing must be either an expression/statement ender like ;, or a postfix operator that will operate on the result of the method call, or an infix operator that will operate on the result and some following argument.
But ( arrgh, arrgh, ... ) is none of these. So you get a "Two terms in a row" compilation error.
.a-method:( arrgh, arrgh, ... ) ;
.a-method: ( arrgh, arrgh, ... ) ;
In general, DO NOT MIX use of a : with use of parens around arguments as part of a method call. There is no good reason to do so because it will either not work, or work only by accident, or work but very likely confuse readers.
Doing so without a space between the colon and opening paren yields a cryptic compilation error:
This type (QAST::WVal) does not support positional operations
Leaving a space appears to work -- but typically only by luck:
say .first: (Numeric) given <abc 2 def ghi> ; # 2
The (Numeric) is a single value in parens which yields Numeric so this line is the same as:
say .first: Numeric given <abc 2 def ghi> ; # 2
But if there are two or more arguments in parens, things will go awry. Use one of these forms:
say .first: Numeric, :k given <abc 2 def ghi> ; # 1
say .first(Numeric, :k) given <abc 2 def ghi> ; # 1
which correctly yield the array index ("key") of the 2 element rather than:
say .first: (Numeric, :k) given <abc 2 def ghi> ; # Nil
which yields Nil because the .first method doesn't do anything useful with a single argument that's a list of the form (Numeric, :k).
Of course, you may occasionally want to pass a single argument that's a list of values in parens. But you can do so without using a colon. For the sake of clarity, it's my advice that you instead write this as:
invocant.a-method(( valuea, valueb, ... ));
a-sub ( arrgh1, arrgh2, ... ) ;
As just explained for method calls, this passes ONE argument to a-sub, namely the single list ( arrgh1, arrgh2, ... ) which will seldom be what the writer means.
Again, my advice is to instead write this as:
a-sub( valuea, valueb, ... ) ;
or:
a-sub valuea, valueb, ... ;
if you mean to pass multiple arguments, or if you wish to pass a list as a single argument, then:
a-sub(( valuea, valueb, ... )) ;
.a-method : arrgha, arrghb, ...
a-sub : arrgha, arrghb, ...
For the method form this will net you a "Confused" compilation error.
The same is true for the sub form if a-sub takes no arguments. If a-sub takes arguments you'll get a "Preceding context expects a term, but found infix : instead" compilation error.
.&a-sub
There's a call form which lets you call a routine declared as a sub -- but use the .method call syntax. The following feeds the "invocant" qux on the left of the dot as the first argument to a sub called a-sub:
qux.&a-sub
Use a : or parentheses as usual to pass additional arguments to a-sub:
sub a-sub ($a, $b) { $a == $b }
say 42.&a-sub(42), 42.&a-sub(43); # TrueFalse
say 42.&a-sub: 42; # True
(In my original version of this section I wrote that one can not pass additional arguments. I had tested this and thought one could not. But I must have just gotten confused by something. @Enheh's comment led me to retest and discover that one can pass additional arguments just as with ordinary method calls. Thank you @Enheh. :))
a-method( invocant: arg2, arg3, ... )
a-method invocant: arg2, arg3, ...
Called "Indirect object notation" in the design docs, these formats are an undocumented and very rarely seen form of method call in which the call mimics the method declaration -- the method name comes first and then the invocant followed by a colon:
say first <abc 2 def ghi>: Numeric ; # 2
Note that say is a sub call because the next token, first, isn't followed by a colon. In contrast first is a method call because the token after it is followed by a colon.
Footnotes
[1] All comments about spaces/spacing in this answer ignore unspacing.

As Raiph tells you above, say: is a label. So you didn't say anything (even though you thought you did) and -- outside use of the REPL -- the compiler will complain that your use of <a b c> was useless:
say: <a b c>; # OUTPUT: «WARNINGS for <tmp>:␤Useless use of constant value a b c in sink context (lines 1, 1, 1, 1, 1, 1)␤»
However, you often can use a : notation instead of parentheses in method calls. Consider the four routine calls below (two subroutine calls then two method calls):
my @numbers = (33, 77, 49, 11, 34);
say map *.is-prime, @numbers ; # simplest subroutine call syntax
say map( *.is-prime, @numbers ); # same meaning, but delimiting args
say @numbers.map( *.is-prime ) ; # similar, but using .map *method*
say @numbers.map: *.is-prime ; # same, but using : instead of parens
These statements all print the same result: (False False False True False).
In general, as you see above with map, you can use () in method calls wherever you would use :, but the opposite is not true; : can be used only in method calls.
Use () if the arguments need to be delimited precisely, as Raiph comments below.
This answer focuses on the basics. See Raiph's answer for more exhaustive coverage of the precise details of routine call syntax. (As an important example, the meaning of these calls normally changes if there are any spaces between the routine name and the colon (:) or opening parenthesis (()).

Related

synthesizing long parameter strings

When consuming a JSON string, the parameters can be deeply nested, making reading/checking tedious:
update(capture_created: params[:data][:object][:created], capture_currency: params[:data][:object][:currency]
...[...] and so on...
In what way can a node params[:data][:object] be represented only once and be thus able to handle the child values as a parameter?
There are a few things you can do.
You could grab the inner hash in a local variable as dbugger mentioned:
p = params[:data][:object]
update(capture_created: p[:created], capture_currency: p[:currency], ...)
Or you could use #tap or #then (depending on what return value you want from the expression):
# This evaluates to params[:data][:object]
params[:data][:object].tap do |p|
  update(capture_created: p[:created], capture_currency: p[:currency], ...)
end
# This evaluates to whatever update returns
params[:data][:object].then do |p|
  update(capture_created: p[:created], capture_currency: p[:currency], ...)
end
If the keys in the nested hash only need to be consistently renamed (i.e. add a "capture_" prefix) then #transform_keys:
update(params[:data][:object].transform_keys { |k| "capture_#{k}" })
is an option. String keys are fine with an ActiveRecord #update call but you could get symbols if you really want them:
update(params[:data][:object].transform_keys { |k| :"capture_#{k}" })
You might want to include a Hash#slice call if you want to ensure that you're only accessing certain keys:
update(params[:data][:object].slice(:created, :currency, ...).transform_keys { |k| :"capture_#{k}" })

Confused about this nested function

I am reading the Python Cookbook 3rd Edition and came across the topic discussed in 2.6 "Searching and Replacing Case-Insensitive Text," where the authors discuss a nested function that is like below:
def matchcase(word):
    def replace(m):
        text = m.group()
        if text.isupper():
            return word.upper()
        elif text.islower():
            return word.lower()
        elif text[0].isupper():
            return word.capitalize()
        else:
            return word
    return replace
If I have some text like below:
text = 'UPPER PYTHON, lower python, Mixed Python'
and I print the value of 'text' before and after, the substitution happens correctly:
import re
x = matchcase('snake')
print("Original Text:", text)
print("After regsub:", re.sub('python', matchcase('snake'), text, flags=re.IGNORECASE))
The last "print" command shows that the substitution correctly happens but I am not sure how this nested function "gets" the:
PYTHON, python, Python
as the word that needs to be substituted with:
SNAKE, snake, Snake
How does the inner function replace get its value 'm'?
When matchcase('snake') is called, word takes the value 'snake'.
I'm not clear on what the value of 'm' is.
Can anyone help me understand this clearly, in this case?
Thanks.
When you pass a function as the second argument to re.sub, according to the documentation:
it is called for every non-overlapping occurrence of pattern. The function takes a single match object argument, and returns the replacement string.
The matchcase() function itself returns the replace() function, so when you do this:
re.sub('python', matchcase('snake'), text, flags=re.IGNORECASE)
what happens is that matchcase('snake') returns replace, and then, for every non-overlapping occurrence of the pattern 'python', re.sub passes a match object to the replace function as the m argument. If this is confusing to you, don't worry; it is just generally confusing.
Here is an interactive session with a much simpler nested function that should make things clearer:
In [1]: def foo(outer_arg):
   ...:     def bar(inner_arg):
   ...:         print(outer_arg + inner_arg)
   ...:     return bar
   ...:
In [2]: f = foo('hello')
In [3]: f('world')
helloworld
So f = foo('hello') is assigning a function that looks like the one below to a variable f:
def bar(inner_arg):
    print('hello' + inner_arg)
f can then be called like this: f('world'), which is like calling bar('world'). I hope that makes things clearer.
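To see for yourself what re.sub hands to replace, here is a small sketch that wraps the callback and records each match it receives. The logged helper and the seen list are hypothetical names added for illustration only; matchcase is the function from the question.

```python
import re

def matchcase(word):
    """Return a callback that re-cases `word` to match each matched text."""
    def replace(m):
        # `m` is the match object that re.sub passes in for each match.
        text = m.group()
        if text.isupper():
            return word.upper()
        elif text.islower():
            return word.lower()
        elif text[0].isupper():
            return word.capitalize()
        return word
    return replace

def logged(callback, seen):
    """Hypothetical helper: record the matched text, then delegate."""
    def wrapper(m):
        seen.append(m.group())
        return callback(m)
    return wrapper

text = 'UPPER PYTHON, lower python, Mixed Python'
seen = []
result = re.sub('python', logged(matchcase('snake'), seen), text,
                flags=re.IGNORECASE)
print(seen)    # the matched strings, in the order re.sub found them
print(result)  # the text after substitution
```

The seen list ends up holding the three matched spellings, which shows that m is never supplied by your own code: re.sub creates one match object per occurrence and calls the function with it.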

Lua: colon notation, 'self' and function definition vs. call

I'm getting terribly confused by the colon notation used when defining/calling Lua functions.
I thought I'd got my head round it until I saw this piece of code:
function string.PatternSafe( str )
    return ( str:gsub( ".", pattern_escape_replacements ) );
end
function string.Trim( s, char )
    if char then char = char:PatternSafe() else char = "%s" end
    return string.match( s, "^" .. char .. "*(.-)" .. char .. "*$" ) or s
end
What's confusing me here is that string.PatternSafe() doesn't reference 'self' anywhere, yet the code seems to work.
I've also seen some scripts that use colon notation when defining the function, for example:
function foo:bar( param1 ) ... end
After several hours of googling I've still not managed to work out what precisely is happening in these two contexts. My current assumptions are as follows:
1. If a function is defined using colon notation, it gets an invisible 'self' parameter inserted as first parameter
2. If a function is called using colon notation, the object preceding ':' is inserted into the arguments (so becomes the first parameter of the function)
3. If a function is called using dot notation, then even if it was defined using colon notation it will not get the object inserted as first argument/parameter
If my assumptions are correct, that raises an additional question: What is the best way to ensure that the function was called properly?
Your assumptions are all correct.
Assumption 1 from the manual:
The colon syntax is used for defining methods, that is, functions
that have an implicit extra parameter self. Thus, the statement
function t.a.b.c:f (params) body end
is syntactic sugar for
t.a.b.c.f = function (self, params) body end
Assumption 2 from the manual:
A call v:name(args) is syntactic sugar for v.name(v,args), except that v is evaluated only once.
Assumption 3 doesn't have a direct manual section since that's just normal function call syntax.
Here's the thing though. self is just the auto-magic name given in the syntax sugar used as part of the colon assignment. It isn't a necessary name. The first argument is the first argument whatever the name happens to be.
So in your example:
function string.PatternSafe( str )
    return ( str:gsub( ".", pattern_escape_replacements ) );
end
the first argument is str, so when the function is called as char:PatternSafe() it de-sugars (via assumption 2) to char.PatternSafe(char), which is just passing char to the function as the first argument (which, as I already said, is str).
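If you know Python better than Lua, a rough analogy may help (this is Python, not Lua, and only an analogy): calling a method through an instance passes that instance as the first argument, whatever the first parameter is named. The Str class and pattern_safe method below are hypothetical stand-ins, and the escaping is deliberately simplified.

```python
class Str:
    """Hypothetical wrapper class used only for illustration."""

    def __init__(self, value):
        self.value = value

    # The first parameter is conventionally called `self`,
    # but any name works; here it is `s`.
    def pattern_safe(s):
        return s.value.replace('.', '%.')

char = Str('a.b')
sugar = char.pattern_safe()        # comparable to char:PatternSafe() in Lua
explicit = Str.pattern_safe(char)  # comparable to char.PatternSafe(char) in Lua
print(sugar == explicit)
```

Both calls run the same function with the same first argument, which is the point of assumption 2: the colon only changes how the first argument gets there.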

How to deal with name/value pairs of function arguments in MATLAB

I have a function that takes optional arguments as name/value pairs.
function example(varargin)
% Lots of set up stuff
vargs = varargin;
nargs = length(vargs);
names = vargs(1:2:nargs);
values = vargs(2:2:nargs);
validnames = {'foo', 'bar', 'baz'};
for name = names
validatestring(name{:}, validnames);
end
% Do something ...
foo = strmatch('foo', names);
disp(values(foo))
end
example('foo', 1:10, 'bar', 'qwerty')
It seems that there is a lot of effort involved in extracting the appropriate values (and it still isn't particularly robust against badly specified inputs). Is there a better way of handling these name/value pairs? Are there any helper functions that come with MATLAB to assist?
I prefer using structures for my options. This gives you an easy way to store the options and an easy way to define them. Also, the whole thing becomes rather compact.
function example(varargin)
%# define defaults at the beginning of the code so that you do not need to
%# scroll way down in case you want to change something or if the help is
%# incomplete
options = struct('firstparameter',1,'secondparameter',magic(3));
%# read the acceptable names
optionNames = fieldnames(options);
%# count arguments
nArgs = length(varargin);
if round(nArgs/2)~=nArgs/2
error('EXAMPLE needs propertyName/propertyValue pairs')
end
for pair = reshape(varargin,2,[]) %# pair is {propName;propValue}
inpName = lower(pair{1}); %# make case insensitive
if any(strcmp(inpName,optionNames))
%# overwrite options. If you want you can test for the right class here
%# Also, if you find out that there is an option you keep getting wrong,
%# you can use "if strcmp(inpName,'problemOption'),testMore,end"-statements
options.(inpName) = pair{2};
else
error('%s is not a recognized parameter name',inpName)
end
end
InputParser helps with this. See Parse Function Inputs for more information.
I could yack for hours about this, but still don't have a good gestalt view of general Matlab signature handling. But here's a couple pieces of advice.
First, take a laissez faire approach to validating input types. Trust the caller. If you really want strong type testing, you want a static language like Java. Try to enforce type safety everywhere in Matlab, and you'll end up with a good part of your LOC and execution time devoted to run time type tests and coercion in userland, which trades in a lot of the power and development speed of Matlab. I learned this the hard way.
For API signatures (functions intended to be called from other functions, instead of from the command line), consider using a single Args argument instead of varargin. Then it can be passed around between multiple functions without having to convert it to and from a comma-separated list for varargin signatures. Structs, like Jonas says, are very convenient. There's also a nice isomorphism between structs and n-by-2 {name,value;...} cells, and you could set up a couple functions to convert between them inside your functions to whichever it wants to use internally.
function example(args)
%EXAMPLE
%
% Where args is a struct or {name,val;...} cell array
Whether you use inputParser or roll your own name/val parser like these other fine examples, package it up in a separate standard function that you'll call from the top of your functions that have name/val signatures. Have it accept the default value list in a data structure that's convenient to write out, and your arg-parsing calls will look sort of like function signature declarations, which helps readability and avoids copy-and-paste boilerplate code.
Here's what the parsing calls could look like.
function out = my_example_function(varargin)
%MY_EXAMPLE_FUNCTION Example function
% No type handling
args = parsemyargs(varargin, {
'Stations' {'ORD','SFO','LGA'}
'Reading' 'Min Temp'
'FromDate' '1/1/2000'
'ToDate' today
'Units' 'deg. C'
});
fprintf('\nArgs:\n');
disp(args);
% With type handling
typed_args = parsemyargs(varargin, {
'Stations' {'ORD','SFO','LGA'} 'cellstr'
'Reading' 'Min Temp' []
'FromDate' '1/1/2000' 'datenum'
'ToDate' today 'datenum'
'Units' 'deg. C' []
});
fprintf('\nWith type handling:\n');
disp(typed_args);
% And now in your function body, you just reference stuff like
% args.Stations
% args.FromDate
And here's a function to implement the name/val parsing that way. You could hollow it out and replace it with inputParser, your own type conventions, etc. I think the n-by-2 cell convention makes for nicely readable source code; consider keeping that. Structs are typically more convenient to deal with in the receiving code, but the n-by-2 cells are more convenient to construct using expressions and literals. (Structs require the ",..." continuation at each line, and guarding cell values from expanding to nonscalar structs.)
function out = parsemyargs(args, defaults)
%PARSEMYARGS Arg parser helper
%
% out = parsemyargs(Args, Defaults)
%
% Parses name/value argument pairs.
%
% Args is what you pass your varargin in to. It may be
%
% ArgTypes is a list of argument names, default values, and optionally
% argument types for the inputs. It is an n-by-1, n-by-2 or n-by-3 cell in one
% of these forms:
% { Name; ... }
% { Name, DefaultValue; ... }
% { Name, DefaultValue, Type; ... }
% You may also pass a struct, which is converted to the first form, or a
% cell row vector containing name/value pairs as
% { Name,DefaultValue, Name,DefaultValue,... }
% Row vectors are only supported because it's unambiguous when the 2-d form
% has at most 3 columns. If there were more columns possible, I think you'd
% have to require the 2-d form because 4-element long vectors would be
% ambiguous as to whether they were one record, or two records with two
% columns omitted.
%
% Returns struct.
%
% This is slow - don't use name/value signatures in functions that will be
% called in tight loops.
args = structify(args);
defaults = parse_defaults(defaults);
% You could normalize case if you want to. I recommend you don't; it's a runtime cost
% and just one more potential source of inconsistency.
%[args,defaults] = normalize_case_somehow(args, defaults);
out = merge_args(args, defaults);
%%
function out = parse_defaults(x)
%PARSE_DEFAULTS Parse the default arg spec structure
%
% Returns n-by-3 cellrec in form {Name,DefaultValue,Type;...}.
if isstruct(x)
if ~isscalar(x)
error('struct defaults must be scalar');
end
x = [fieldnames(x) struct2cell(x)]; % was "s", which is undefined in this function
end
if ~iscell(x)
error('invalid defaults');
end
% Allow {name,val, name,val,...} row vectors
% Does not work for the general case of >3 columns in the 2-d form!
if size(x,1) == 1 && size(x,2) > 3
x = reshape(x, [2 numel(x)/2])'; % reshape is column-major, so transpose to get one {name,val} per row
end
% Fill in omitted columns
if size(x,2) < 2
x(:,2) = {[]}; % Make everything default to value []
end
if size(x,2) < 3
x(:,3) = {[]}; % No default type conversion
end
out = x;
%%
function out = structify(x)
%STRUCTIFY Convert a struct or name/value list or record list to struct
if isempty(x)
out = struct;
elseif iscell(x)
% Cells can be {name,val;...} or {name,val,...}
if (size(x,1) == 1) && size(x,2) > 2
% Reshape {name,val, name,val, ... } list to {name,val; ... }
x = reshape(x, [2 numel(x)/2])'; % transpose so each row is one {name,val} record
end
if size(x,2) ~= 2
error('Invalid args: cells must be n-by-2 {name,val;...} or vector {name,val,...} list');
end
% Convert {name,val, name,val, ...} list to struct
if ~iscellstr(x(:,1))
error('Invalid names in name/val argument list');
end
% Little trick for building structs from name/vals
% This protects cellstr arguments from expanding into nonscalar structs
x(:,2) = num2cell(x(:,2));
x = x';
x = x(:);
out = struct(x{:});
elseif isstruct(x)
if ~isscalar(x)
error('struct args must be scalar');
end
out = x;
end
%%
function out = merge_args(args, defaults)
out = structify(defaults(:,[1 2]));
% Apply user arguments
% You could normalize case if you wanted, but I avoid it because it's a
% runtime cost and one more chance for inconsistency.
names = fieldnames(args);
for i = 1:numel(names)
out.(names{i}) = args.(names{i});
end
% Check and convert types
for i = 1:size(defaults,1)
[name,defaultVal,type] = defaults{i,:};
if ~isempty(type)
out.(name) = needa(type, out.(name), name);
end
end
%%
function out = needa(type, value, name)
%NEEDA Check that a value is of a given type, and convert if needed
%
% out = needa(type, value)
% HACK to support common 'pseudotypes' that aren't real Matlab types
switch type
case 'cellstr'
isThatType = iscellstr(value);
case 'datenum'
isThatType = isnumeric(value);
otherwise
isThatType = isa(value, type);
end
if isThatType
out = value;
else
% Here you can auto-convert if you're feeling brave. Assumes that the
% conversion constructor form of all type names works.
% Unfortunately this ends up with bad results if you try converting
% between string and number (you get Unicode encoding/decoding). Use
% at your discretion.
% If you don't want to try autoconverting, just throw an error instead,
% with:
% error('Argument %s must be a %s; got a %s', name, type, class(value));
try
out = feval(type, value);
catch err
error('Failed converting argument %s from %s to %s: %s',...
name, class(value), type, err.message);
end
end
It is so unfortunate that strings and datenums are not first-class types in Matlab.
MathWorks has revived this beaten horse, but with very useful functionality that answers this need, directly. It's called Function Argument Validation (a phrase one can and should search for in the documentation) and comes with release R2019b+. MathWorks created a video about it, also. Validation works much like the "tricks" people have come up with over the years. Here is an example:
function ret = example( inputDir, proj, options )
%EXAMPLE An example.
% Do it like this.
% See THEOTHEREXAMPLE.
arguments
inputDir (1, :) char
proj (1, 1) projector
options.foo char {mustBeMember(options.foo, {'bar' 'baz'})} = 'bar'
options.Angle (1, 1) {double, integer} = 45
options.Plot (1, 1) logical = false
end
% Code always follows 'arguments' block.
ret = [];
switch options.foo
case 'bar'
ret = sind(options.Angle);
case 'baz'
ret = cosd(options.Angle);
end
if options.Plot
plot(proj.x, proj.y)
end
end
Here's the unpacking:
The arguments block must come before any code (OK after help block) and must follow the positional order defined in the function definition, and I believe every argument requires a mention. Required arguments go first, followed by optional arguments, followed by name-value pairs. MathWorks also recommends to no longer use the varargin keyword, but nargin and nargout are still useful.
Class requirements can be custom classes, such as projector, in this case.
Required arguments may not have a default value (i.e. they are known because they don't have a default value).
Optional arguments must have a default value (i.e. they are known because they have a default value).
Default values must be able to pass the same argument validation. In other words, a default value of zeros(3) won't work as a default value for an argument that's supposed to be a character vector.
Name-value pairs are stored in an argument that is internally converted to a struct, which I'm calling options, here (hinting to us that we can use structs to pass keyword arguments, like kwargs in Python).
Very nicely, name-value arguments will now show up as argument hints when you hit tab in a function call. (If completion hints interest you, I encourage you to also look up MATLAB's functionSignatures.json functionality).
So in the example, inputDir is a required argument because it's given no default value. It also must be a 1xN character vector. As if to contradict that statement, note that MATLAB will try to convert the supplied argument to see if the converted argument passes. If you pass 97:122 as inputDir, for example, it will pass and inputDir == char(97:122) (i.e. inputDir == 'abcdefghijklmnopqrstuvwxyz'). Conversely, zeros(3) won't work on account of its not being a vector. And forget about making strings fail when you specify characters, making doubles fail when you demand uint8, etc. Those will be converted. You'd need to dig deeper to circumvent this "flexibility."
Moving on, 'foo' specifies a name-value pair whose value may be only 'bar' or 'baz'.
MATLAB has a number of mustBe... validation functions (start typing mustBe and hit tab to see what's available), and it's easy enough to create your own. If you create your own, the validation function must give an error if the input doesn't match, unlike, say, uigetdir, which returns 0 if the user cancels the dialog. Personally, I follow MATLAB's convention and call my validation functions mustBe..., so I have functions like mustBeNatural for natural numbers, and mustBeFile to ensure I passed a file that actually exists.
'Angle' specifies a name-value pair whose value must be a scalar double or integer, so, for example, example(pwd, 'foo', 'baz', 'Angle', [30 70]) won't work since you passed a vector for the Angle argument.
You get the idea. There is a lot of flexibility with the arguments block -- too much and too little, I think -- but for simple functions, it's fast and easy. You still might rely on one or more of inputParser, validateattributes, assert, and so on for addressing greater validation complexity, but I always try to stuff things into an arguments block, first. If it's becoming unsightly, maybe I'll do an arguments block and some assertions, etc.
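The kwargs comparison made earlier in this answer can be made concrete with a small Python sketch. It merges caller-supplied name/value pairs over a table of defaults and rejects unknown names and invalid values, which is roughly the behaviour the arguments block provides. The names example and DEFAULTS are hypothetical, and this is an illustration of the idea, not a description of how MATLAB implements it.

```python
import math

# Hypothetical defaults, mirroring the options in the MATLAB example.
DEFAULTS = {'foo': 'bar', 'angle': 45, 'plot': False}

def example(input_dir, **options):
    """Accept name/value options as keyword arguments with defaults."""
    unknown = set(options) - set(DEFAULTS)
    if unknown:
        raise TypeError(f"unrecognized option(s): {sorted(unknown)}")
    # Caller-supplied values override the defaults.
    opts = {**DEFAULTS, **options}
    if opts['foo'] not in ('bar', 'baz'):
        raise ValueError("foo must be 'bar' or 'baz'")
    angle = math.radians(opts['angle'])
    return math.sin(angle) if opts['foo'] == 'bar' else math.cos(angle)

print(example('.', foo='baz', angle=60))
```

Unlike the arguments block, this sketch does no type or size checking; you would add those checks by hand where they matter.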
Personally I use a custom function derived from a private method used by many Statistics Toolbox functions (like kmeans, pca, svmtrain, ttest2, ...)
Being an internal utility function, it changed and was renamed many times over the releases. Depending on your MATLAB version, try looking for one of the following files:
%# old versions
which -all statgetargs
which -all internal.stats.getargs
which -all internal.stats.parseArgs
%# current one, as of R2014a
which -all statslib.internal.parseArgs
As with any undocumented function, there are no guarantees and it could be removed from MATLAB in subsequent releases without any notice... Anyway, I believe someone posted an old version of it as getargs on the File Exchange.
The function processes parameters as name/value pairs, using a set of valid parameter names along with their default values. It returns the parsed parameters as separate output variables. By default, unrecognized name/value pairs raise an error, but we could also silently capture them in an extra output. Here is the function description:
$MATLABROOT\toolbox\stats\stats\+internal\+stats\parseArgs.m
function varargout = parseArgs(pnames, dflts, varargin)
%
% [A,B,...] = parseArgs(PNAMES, DFLTS, 'NAME1',VAL1, 'NAME2',VAL2, ...)
% PNAMES : cell array of N valid parameter names.
% DFLTS : cell array of N default values for these parameters.
% varargin : Remaining arguments as name/value pairs to be parsed.
% [A,B,...]: N outputs assigned in the same order as the names in PNAMES.
%
% [A,B,...,SETFLAG] = parseArgs(...)
% SETFLAG : structure of N fields for each parameter, indicates whether
% the value was parsed from input, or taken from the defaults.
%
% [A,B,...,SETFLAG,EXTRA] = parseArgs(...)
% EXTRA : cell array containing name/value parameters pairs not
% specified in PNAMES.
Example:
function my_plot(x, varargin)
%# valid parameters, and their default values
pnames = {'Color', 'LineWidth', 'LineStyle', 'Title'};
dflts = { 'r', 2, '--', []};
%# parse function arguments
[clr,lw,ls,txt] = internal.stats.parseArgs(pnames, dflts, varargin{:});
%# use the processed values: clr, lw, ls, txt
%# corresponding to the specified parameters
%# ...
end
Now this example function could be called in any of the following ways:
>> my_plot(data) %# use the defaults
>> my_plot(data, 'linestyle','-', 'Color','b') %# any order, case insensitive
>> my_plot(data, 'Col',[0.5 0.5 0.5]) %# partial name match
Here are some invalid calls and the errors thrown:
%# unrecognized parameter
>> my_plot(x, 'width',0)
Error using [...]
Invalid parameter name: width.
%# bad parameter
>> my_plot(x, 1,2)
Error using [...]
Parameter name must be text.
%# wrong number of arguments
>> my_plot(x, 'invalid')
Error using [...]
Wrong number of arguments.
%# ambiguous partial match
>> my_plot(x, 'line','-')
Error using [...]
Ambiguous parameter name: line.
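The matching rules shown by the calls above (case-insensitive names, unique partial matches, and the four error cases) can be spelled out in a short language-neutral sketch. The Python below is only an illustration of those rules, not the MATLAB implementation: parse_args is a hypothetical name, and the rule that an exact match wins over a longer name is an assumption.

```python
def parse_args(pnames, dflts, *args):
    """Hypothetical sketch of name/value parsing with unique-prefix matching."""
    if len(args) % 2 != 0:
        raise ValueError("Wrong number of arguments.")
    values = dict(zip(pnames, dflts))
    for name, value in zip(args[0::2], args[1::2]):
        if not isinstance(name, str):
            raise TypeError("Parameter name must be text.")
        key = name.lower()
        # Case-insensitive prefix match against the valid names.
        hits = [p for p in pnames if p.lower().startswith(key)]
        exact = [p for p in hits if p.lower() == key]
        if exact:
            hits = exact  # assumption: an exact match beats longer names
        if not hits:
            raise ValueError(f"Invalid parameter name: {name}.")
        if len(hits) > 1:
            raise ValueError(f"Ambiguous parameter name: {name}.")
        values[hits[0]] = value
    # Return the values in the same order as pnames.
    return [values[p] for p in pnames]


pnames = ["Color", "LineWidth", "LineStyle", "Title"]
dflts = ["r", 2, "--", None]
print(parse_args(pnames, dflts, "linestyle", "-", "Col", "b"))  # ['b', 2, '-', None]
```

With these names, "line" matches both LineWidth and LineStyle and is rejected as ambiguous, while "width" matches nothing because only prefixes count.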
inputParser:
As others have mentioned, the officially recommended approach to parsing function inputs is to use the inputParser class. It supports various schemes such as required inputs, optional positional arguments, and name/value parameters. It also lets you validate the inputs (for example, checking the class/type and the size/shape of the arguments).
Read Loren's informative post on this issue, and don't forget to read the comments section. You will see that there are quite a few different approaches to this topic. They all work, so selecting a preferred method is really a matter of personal taste and maintainability.
I'm a bigger fan of home-grown boilerplate code like this:
function TestExample(req1, req2, varargin)
rm_inds = []; % indices of the name/value pairs consumed here
for i = 1:2:length(varargin)
if strcmpi(varargin{i}, 'alphabet')
ALPHA = varargin{i+1};
elseif strcmpi(varargin{i}, 'cutoff')
CUTOFF = varargin{i+1};
%we need to remove these so seqlogo doesn't get confused
rm_inds = [rm_inds i, i+1]; %#ok<*AGROW>
elseif strcmpi(varargin{i}, 'colors')
colors = varargin{i+1};
rm_inds = [rm_inds i, i+1];
elseif strcmpi(varargin{i}, 'axes_handle')
handle = varargin{i+1};
rm_inds = [rm_inds i, i+1];
elseif strcmpi(varargin{i}, 'top-n')
TOPN = varargin{i+1};
rm_inds = [rm_inds i, i+1];
elseif strcmpi(varargin{i}, 'inds')
npos = varargin{i+1};
rm_inds = [rm_inds i, i+1];
elseif strcmpi(varargin{i}, 'letterfile')
LETTERFILE = varargin{i+1};
rm_inds = [rm_inds i, i+1];
elseif strcmpi(varargin{i}, 'letterstruct')
lo = varargin{i+1};
rm_inds = [rm_inds i, i+1];
end
end
This way I can simulate the 'option', value pair that's nearly identical to how most Matlab functions take their arguments.
Hope that helps,
Will
Here's the solution I'm trialling, based upon Jonas' idea.
function argStruct = NameValuePairToStruct(defaults, varargin)
%NAMEVALUEPAIRTOSTRUCT Converts name/value pairs to a struct.
%
% ARGSTRUCT = NAMEVALUEPAIRTOSTRUCT(DEFAULTS, VARARGIN) converts
% name/value pairs to a struct, with defaults. The function expects an
% even number of arguments to VARARGIN, alternating NAME then VALUE.
% (Each NAME should be a valid variable name.)
%
% Examples:
%
% No defaults
% NameValuePairToStruct(struct, ...
% 'foo', 123, ...
% 'bar', 'qwerty', ...
% 'baz', magic(3))
%
% With defaults
% NameValuePairToStruct( ...
% struct('bar', 'dvorak', 'quux', eye(3)), ...
% 'foo', 123, ...
% 'bar', 'qwerty', ...
% 'baz', magic(3))
%
% See also: inputParser
nArgs = length(varargin);
if rem(nArgs, 2) ~= 0
error('NameValuePairToStruct:NotNameValuePairs', ...
'Inputs were not name/value pairs');
end
argStruct = defaults;
for i = 1:2:nArgs
name = varargin{i};
if ~isvarname(name)
error('NameValuePairToStruct:InvalidName', ...
'A variable name was not valid');
end
argStruct = setfield(argStruct, name, varargin{i + 1}); %#ok<SFLD>
end
end
Inspired by Jonas' answer, but more compact:
function example(varargin)
defaults = struct('A',1, 'B',magic(3)); %define default values
params = struct(varargin{:});
for f = fieldnames(defaults)',
if ~isfield(params, f{1}),
params.(f{1}) = defaults.(f{1});
end
end
%now just access them as params.A, params.B
There is a nifty function called parsepvpairs that takes care of this nicely, provided you have access to MATLAB's Financial Toolbox. It takes three arguments: the expected field names, the default field values, and the actual arguments received.
For example, here's a function that creates an HTML figure in MATLAB and can take the optional field value pairs named 'url', 'html', and 'title'.
function htmldlg(varargin)
names = {'url','html','title'};
defaults = {[],[],'Padaco Help'};
[url, html,titleStr] = parsepvpairs(names,defaults,varargin{:});
%... code to create figure using the parsed input values
end
I have been using process_options.m for ages. It is stable, easy to use, and has been included in various MATLAB frameworks. I don't know anything about its performance, though; there might be faster implementations.
The feature I like most in process_options is the unused_args return value, which can be used to split the input arguments into groups of arguments for, e.g., subprocesses.
And you can easily define default values.
Most importantly: using process_options.m usually results in readable and maintainable option definitions.
Example code:
function y = func(x, y, varargin)
[u, v] = process_options(varargin, ...
'u', 0, ...
'v', 1);
If you are using MATLAB R2019b or later, the best way to deal with name-value pairs in your function is to use an arguments block (see "Declare function argument validation" in the documentation).
function result = myFunction(NameValueArgs)
arguments
NameValueArgs.Name1
NameValueArgs.Name2
end
% Function code
result = NameValueArgs.Name1 * NameValueArgs.Name2;
end
see: https://www.mathworks.com/help/matlab/ref/arguments.html
function argtest(varargin)
a = 1;
for ii=1:length(varargin)/2
[~] = evalc([varargin{2*ii-1} '=''' num2str(varargin{2*ii}) '''']);
end;
disp(a);
who
This does of course not check for correct assignments, but it's simple and any useless variable will be ignored anyway. It also only works for numerics, strings and arrays, but not for matrices, cells or structures.
I ended up writing this today, and then found these mentions.
Mine uses structs and struct 'overlays' for options. It essentially mirrors the functionality of setstructfields() except that new parameters cannot be added. It also has an option for recursing, whereas setstructfields() does it automatically.
It can take in a cell array of paired values by calling struct(args{:}).
% Overlay default fields with input fields
% Good for option management
% Arguments
% $opts - Default options
% $optsIn - Input options
% Can be struct(), cell of {name, value, ...}, or empty []
% $recurseStructs - Applies optOverlay to any existing structs, given new
% value is a struct too and both are 1x1 structs
% Output
% $opts - Outputs with optsIn values overlayed
function [opts] = optOverlay(opts, optsIn, recurseStructs)
if nargin < 3
recurseStructs = false;
end
isValid = @(o) isstruct(o) && length(o) == 1;
assert(isValid(opts), 'Existing options cannot be cell array');
assert(isempty(optsIn) || iscell(optsIn) || isValid(optsIn), 'Input options must be a 1x1 struct, a name/value cell array, or empty');
if ~isempty(optsIn)
if iscell(optsIn)
optsIn = struct(optsIn{:});
end
assert(isstruct(optsIn));
fields = fieldnames(optsIn);
for i = 1:length(fields)
field = fields{i};
assert(isfield(opts, field), 'Field does not exist: %s', field);
newValue = optsIn.(field);
% Apply recursion
if recurseStructs
curValue = opts.(field);
% Both values must be proper option structs
if isValid(curValue) && isValid(newValue)
newValue = optOverlay(curValue, newValue, true);
end
end
opts.(field) = newValue;
end
end
end
I'd say that using the naming convention 'defaults' and 'new' would probably be better :P
I have made a function based on the answers by Jonas and Richie Cotton. It implements both behaviours (flexible arguments, or restricted, meaning that only variables existing in the defaults are allowed), plus a few other things like syntactic sugar and sanity checks.
function argStruct = getnargs(varargin, defaults, restrict_flag)
%GETNARGS Converts name/value pairs to a struct (this allows to process named optional arguments).
%
% ARGSTRUCT = GETNARGS(VARARGIN, DEFAULTS, restrict_flag) converts
% name/value pairs to a struct, with defaults. The function expects an
% even number of arguments in VARARGIN, alternating NAME then VALUE.
% (Each NAME should be a valid variable name and is case sensitive.)
% Also VARARGIN should be a cell, and defaults should be a struct().
% Optionally: you can set restrict_flag to true if you want that only arguments names specified in defaults be allowed. Also, if restrict_flag = 2, arguments that aren't in the defaults will just be ignored.
% After calling this function, you can access your arguments using: argstruct.your_argument_name
%
% Examples:
%
% No defaults
% getnargs( {'foo', 123, 'bar', 'qwerty'} )
%
% With defaults
% getnargs( {'foo', 123, 'bar', 'qwerty'} , ...
% struct('foo', 987, 'bar', magic(3)) )
%
% See also: inputParser
%
% Authors: Jonas, Richie Cotton and LRQ3000
%
% Extract the arguments if it's inside a sub-struct (happens on Octave), because anyway it's impossible that the number of argument be 1 (you need at least a couple, thus two)
if (numel(varargin) == 1)
varargin = varargin{:};
end
% Sanity check: we need a multiple of couples, if we get an odd number of arguments then that's wrong (probably missing a value somewhere)
nArgs = length(varargin);
if rem(nArgs, 2) ~= 0
error('NameValuePairToStruct:NotNameValuePairs', ...
'Inputs were not name/value pairs');
end
% Sanity check: if defaults is not supplied, it's by default an empty struct
if ~exist('defaults', 'var')
defaults = struct;
end
if ~exist('restrict_flag', 'var')
restrict_flag = false;
end
% Syntactic sugar: if defaults is also a cell instead of a struct, we convert it on-the-fly
if iscell(defaults)
defaults = struct(defaults{:});
end
optionNames = fieldnames(defaults); % extract all default arguments names (useful for restrict_flag)
argStruct = defaults; % copy over the defaults: by default, all arguments will have the default value. Afterwards we simply overwrite the defaults with the user-specified values.
for i = 1:2:nArgs % iterate over couples of argument/value
varname = varargin{i}; % argument name (case sensitive)
% check that the supplied name is a valid variable identifier (it does not check if the variable is allowed/declared in defaults, just that it's a possible variable name!)
if ~isvarname(varname)
error('NameValuePairToStruct:InvalidName', ...
'A variable name was not valid: %s position %i', varname, i);
% if options are restricted, check that the argument's name exists in the supplied defaults, else we throw an error. With this we can allow only a restricted range of arguments by specifying in the defaults.
elseif restrict_flag && ~isempty(defaults) && ~any(strcmp(varname, optionNames))
if restrict_flag ~= 2 % restrict_flag = 2 means that we just ignore this argument, else we show an error
error('%s is not a recognized argument name', varname);
end
% else alright, we replace the default value for this argument with the user supplied one (or we create the variable if it wasn't in the defaults and there's no restrict_flag)
else
argStruct = setfield(argStruct, varname, varargin{i + 1}); %#ok<SFLD>
end
end
end
Also available as a Gist.
And for those interested in having real named arguments (with a syntax similar to Python, e.g. myfunction(a=1, b='qwerty')), use inputParser (MATLAB only; Octave users will have to wait until v4.2 at least, or you can try a wrapper called InputParser2).
Also as a bonus, if you don't want to always have to type argstruct.yourvar but directly use yourvar, you can use the following snippet by Jason S:
function varspull(s)
% Import the fields of a structure as variables into the caller's workspace
% eg: s = struct('foo', 1, 'bar', 'qwerty'); varspull(s); disp(foo); disp(bar);
% Will print: 1 and qwerty
%
%
% Author: Jason S
%
for n = fieldnames(s)'
name = n{1};
value = s.(name);
assignin('caller',name,value);
end
end

Is there a programming language that allows variable declaration at call site?

Update 2: examples removed, because they were misleading. The ones below are more relevant.
My question:
Is there a programming language with such a construct?
Update:
Now when I think about it, Prolog has something similar.
It even allows defining operations on the definition line.
(forget about backtracking and relations - think about syntax)
I asked this question because I believe it's a nice thing to have symmetry in a language.
Symmetry between "in" parameters and "out" parameters.
If returning values like that were easy, we could drop explicit returning in the designed language.
Returning pairs... I think this is a hack. We do not need a data structure to pass multiple parameters to a function.
Update 2:
To give an example of syntax I'm looking for:
f (s, d&) = // & indicates 'out' variable
d = s+s.
main =
f("say twice", &twice) // & indicates 'out' variable declaration
print(twice)
main2 =
print (f("say twice", _))
Or in functional + prolog style
f $s (s+s). // use $ to mark that s will get its value in other part of the code
main =
f "say twice" $twice // on call site the second parameter will get its value from
print twice
main2 =
print (f "Say twice" $_) // anonymous variable
In the proposed language, there are no expressions, because all returns are through parameters. This would be cumbersome in situations where deep hierarchical function calls are natural. Lisp-ish example:
(let x (* (+ 1 2) (+ 3 4))) // equivalent to C x = ((1 + 2) * (3 + 4))
would need names for all temporary variables in the proposed language:
+ 1 2 res1
+ 3 4 res2
* res1 res2 x
So I propose anonymous variables that turn a whole function call into value of this variable:
* (+ 1 2 _) (+ 3 4 _)
This is not very natural because of all the cultural baggage we have, but I want to throw away all preconceptions about syntax we currently have.
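To make the anonymous-variable idea concrete, here is a rough emulation in Python. It is only a sketch of the semantics, not of the proposed syntax: Out, add, mul and anon are hypothetical names introduced for illustration, with the out parameter always passed last.

```python
class Out:
    """Hypothetical single-slot cell standing in for an 'out' variable."""

    def __init__(self):
        self.value = None


def add(a, b, res):
    # Procedure style: nothing is returned, the result goes into `res`.
    res.value = a + b


def mul(a, b, res):
    res.value = a * b


def anon(proc, *ins):
    """Call `proc` with a fresh anonymous out cell and hand back its value,
    mimicking what `_` does in the proposed syntax."""
    cell = Out()
    proc(*ins, cell)
    return cell.value


# Named temporaries, as in the three-line version.
res1, res2, x = Out(), Out(), Out()
add(1, 2, res1)
add(3, 4, res2)
mul(res1.value, res2.value, x)

# Anonymous form, analogous to `* (+ 1 2 _) (+ 3 4 _)`.
y = anon(mul, anon(add, 1, 2), anon(add, 3, 4))
print(x.value, y)  # 21 21
```

The anon helper shows that the `_` form amounts to re-introducing expressions on top of a language where every result is an out parameter.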
<?php
function f($param, &$ret) {
$ret = $param . $param;
}
f("say twice", $twice);
echo $twice;
?>
$twice is seen after the call to f(), and it has the expected value. If you remove the ampersand, there are errors. So it looks like PHP will declare the variable at the point of calling. I'm not convinced that buys you much, though, especially in PHP.
"Is there a programming language with such a construct?"
Your question is in fact a little unclear.
In a sense, any language that supports assignment to [the variable state associated with] a function argument, supports "such a construct".
C supports it because "void f (type *address)" allows modification of anything address points to. Java supports it because "void f (Object x)" allows any (state-modifying) invocation of some method of x. COBOL supports it because "PROCEDURE DIVISION USING X" can involve an X that holds a pointer/memory address, ultimately allowing the procedure to change the state of the thing pointed to by that address.
From that perspective, I'd say almost every language known to mankind supports "such a construct", with the exception perhaps of languages such as Tutorial D, which claim to be "absolutely pointer-free".
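The same point can be illustrated in Python, which behaves like the Java case: the callee can change the state of an object the caller passed in, but rebinding the parameter name has no effect on the caller. This is a minimal sketch; f, g and the "twice" key are made up for the example.

```python
def f(s, out):
    # `out` refers to the caller's dict, so this write is visible to the caller.
    out["twice"] = s + s


def g(s, out):
    # Rebinding the local name does not touch the caller's object.
    out = s + s


result = {}
f("say twice", result)
print(result["twice"])  # say twicesay twice

untouched = {}
g("say twice", untouched)
print(untouched)  # {}
```

Note that the container still has to exist before the call, so this gives an "out" parameter but not a declaration at the call site.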
I'm having a hard time understanding what you want. You want to put the return type on call signature? I'm sure someone could hack that together but is it really useful?
// fakelang example - use a ; to separate ins and outs
function f(int in1, int in2; int out1, int out2, int out3) {...}
// C++0x-ish
auto f(int in1, int in2) -> int o1, int o2, int o3 {...}
int a, b, c;
a, b, c = f(1, 2);
I get the feeling this would be implemented internally this way:
LEA EAX, c // push output parameter pointers first, in reverse order
PUSH EAX
LEA EAX, b
PUSH EAX
LEA EAX, a
PUSH EAX
PUSH 1 // push input parameters
PUSH 2
CALL f // Caller treat the outputs as references
ADD ESP,20 // clean the stack
For your first code snippet, I'm not aware of any such languages, and frankly I'm glad that is the case. Declaring a variable in the middle of an expression like that, and then using it outside said expression, looks very wrong to me. If anything, I'd expect the scope of such a variable to be restricted to the function call, but then of course it would be quite pointless in the first place.
For the second one - multiple return values - pretty much any language with first-class tuple support has something close to that. E.g. Python:
def foo(x, y):
return (x + 1), (y + 1)
x, y = foo(1, 2)
Lua doesn't have first-class tuples (i.e. you can't bind a tuple value to a single variable - you always have to expand it, possibly discarding part of it), but it does have multiple return values, with essentially the same syntax:
function foo(x, y)
return (x + 1), (y + 1)
end
local x, y = foo(1, 2)
F# has first-class tuples, and so everything said earlier about Python applies to it as well. But it can also simulate tuple returns for methods that were declared in C# or VB with out or ref arguments, which is probably the closest to what you describe - though it is still implicit (i.e. you don't specify the out-argument at all, even as _). Example:
// C# definition
int Foo(int x, int y, out int z)
{
z = y + 1;
return x + 1;
}
// explicit F# call
let mutable y = 0
let x = Foo(1, 2, &y)
// tupled F# call
let x, y = Foo(1, 2)
Here is how you would do it in Perl:
sub f { $_[1] = $_[0] . $_[0] } #in perl all variables are passed by reference
f("say twice", my $twice);
# or f("...", our $twice) or f("...", $twice)
# the last case is only possible if you are not running with "use strict;"
print $twice;
[edit] Also, since you seem interested in minimal syntax:
sub f { $_[1] = $_[0] x 2 } # x is the repetition operator
f "say twice" => $twice; # => is a quoting comma, used here just for clarity
print $twice;
is perfectly valid perl. Here's an example of normal quoting comma usage:
("abc", 1, "d e f", 2) # is the same as
(abc => 1, "d e f" => 2) # the => only quotes perl /\w+/ strings
Also, on return values: unless exited early with a "return", all Perl subroutines automatically return the value of the last expression they evaluate, be it a single value or a list. Lastly, take a look at Perl 6's feed operators, which you might find interesting.
[/edit]
I am not sure exactly what you are trying to achieve with the second example, but the concept of implicit variables exists in a few languages; in Perl, it is $_.
An example would be some of Perl's builtins, which look at $_ when they don't have an argument.
$string = "my string\n";
for ($string) { # loads "my string" into $_
chomp; # strips the last newline from $_
s/my/our/; # replaces "my" with "our" in $_
print; # prints $_
}
without using $_, the above code would be:
chomp $string;
$string =~ s/my/our/;
print $string;
$_ is used in many cases in Perl to avoid repeatedly passing temporary variables to functions.
Not programming languages, but various process calculi have syntax for binding names at the receiver call sites in the scope of process expressions dependent on them. While Pict has such syntax, it doesn't actually make sense in the derived functional syntax that you're asking about.
You might have a look at Oz. In Oz you only have procedures and you assign values to variables instead of returning them.
It looks like this:
proc {Max X Y Z}
if X >= Y then Z = X else Z = Y end
end
There are functions (that return values) but this is only syntactic sugar.
Also, Concepts, Techniques, and Models of Computer Programming is a great SICP-like book that teaches programming by using Oz and the Mozart Programming System.
I don't think so. Most languages that do something like that use tuples so that there can be multiple return values. Come to think of it, the C-style reference and output parameters are mostly hacks around not being able to return tuples...
Somewhat confusing, but C++ is quite happy with declaring variables and passing them as out parameters in the same statement:
void foo ( int &x, int &y, int &z ) ;
int a,b,c = (foo(a,b,c),c);
But don't do that outside of obfuscation contests.
You might also want to look at pass by name semantics in Algol, which your fuller description smells vaguely similar to.