How to determine whether a function exists in a POSIX shell?

This is basically the same question as Determine if a function exists in bash, except that this time it's not aiming at Bash, but at a POSIX shell:
How to determine whether a shell function with a given name exists?
It seems that none of the typical built-ins like type are mandated by POSIX, so the matter is more difficult or maybe even impossible.

POSIX (more precisely the X/Open Portability Guide) specifies the type command. It doesn't state what the type command should output to tell what the argument is. However, the standard says it typically identifies the operand, so it would be very unlikely for a type implementation not to include the string "function" in its reply when passed a function name.
This should then work with most, if not all, POSIX compliant shells:
isFunction()
{
type "$1" | sed "s/$1//" | grep -qwi function
}
You might also run command -V instead of type here, with the same comment about the unspecified output format. I never do, given the fact the former is shorter to type and easier to remember. This would, however, be mandatory if you run a shell that decided not to include XSI (likely posh), i.e., a shell that breaks portability with many existing scripts by limiting the utilities it tries to comply with to the strict POSIX set.

You can use command for this in shells that implement the 2013 version of POSIX, or the User Portability Utilities option of the older spec:
isFunction() {
command -V "$1" 2>/dev/null | grep -qwi function
}
However, note that the spec doesn't actually dictate the form of the command's output. It requires that functions be identified as such, so it is highly likely that the output will include the word function if and only if the requested name is a function, but it is not strictly guaranteed. The above solution can be fooled pretty easily (see @jiliagre's comment).
A different part of the spec mandates a type command that does much the same thing (with the same caveats about unspecified output format). Oddly, it's not listed as one of the commands required to be a shell builtin, but as the informational notes say, it pretty much has to be one in order to work as specified.

For the sake of completeness: it is possible to use type or command -V without spawning any extra subprocesses like sed or grep (although you still have to spawn one for $(type ...)):
is_function() {
case "$(type -- "$1" 2>/dev/null)" in
*function*) return 0 ;;
esac
return 1
}
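A variant of the checks above, as a minimal sketch: it strips the queried name from the front of the description before matching, so a name that itself contains the word (for example bash's function keyword) is not reported as a function. The helper name is_shell_function is made up for illustration, the code assumes the shell provides command -V and that names do not begin with a dash, and it still relies on the unspecified output wording. An external command whose path contains "function" would still be misreported.

```shell
#!/bin/sh
# Hypothetical helper: succeed if "$1" names a shell function.
# Assumes `command -V` exists and describes functions with the word "function".
is_shell_function() {
    # Note: POSIX sh has no `local`, so `desc` is a global variable.
    desc=$(command -V "$1" 2>/dev/null) || return 1
    # Drop the leading name so that e.g. "function is a shell keyword"
    # does not match merely because of the name itself.
    desc=${desc#"$1"}
    case $desc in
        *function*) return 0 ;;
    esac
    return 1
}

greet() { :; }
if is_shell_function greet; then echo "greet is a function"; fi
if ! is_shell_function cd; then echo "cd is not a function"; fi
```

Like the version above, this spawns only the one subshell needed for the command substitution.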

Related

Is there any WORDCHARS alternative in Bash?

Conclusion:
There is no WORDCHARS alternative in Bash; where C-w stops can't be configured.
mysql depends on editline, which is customizable with ~/.editrc.
redis-cli depends on linenoise, which deletes the whole word without treating : or - as word boundaries.
In zsh, WORDCHARS controls the behavior of C-w when deleting a word. Is there any alternative in readline?
I've noticed recently that the behavior of C-w in mysql/redis-cli differs from that in Bash, although both of them depend on readline?
Take string foo:bar as an example, only bar is deleted by C-w in Bash. While in mysql/redis-cli, the whole word foo:bar is deleted.
How do I control this behavior?
There are two readline commands that kill the word behind the cursor:
backward-kill-word
unix-word-rubout
backward-kill-word deletes bar,
unix-word-rubout deletes foo:bar
Run the following command to find out what C-w is bound to:
bind -P | grep C-w
It seems Bash doesn't have an equivalent of zsh's WORDCHARS.
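Switching C-w between the two commands listed above could look like the following sketch for ~/.bashrc. It assumes an interactive bash and that the terminal driver's werase character is what keeps C-w tied to unix-word-rubout (readline binds the tty special characters by default); treat both points as assumptions to verify on your system.

```shell
# Sketch for ~/.bashrc: make C-w stop at punctuation such as ":".
# Only meaningful in an interactive bash; `bind` and `stty` need a terminal.
case $- in
    *i*)
        # Release C-w from the tty driver's "werase" slot, otherwise
        # readline may keep rebinding it to unix-word-rubout.
        stty werase undef
        # Bind C-w to the command that treats non-alphanumerics as boundaries.
        bind '"\C-w": backward-kill-word'
        ;;
esac
```

Afterwards, bind -P | grep C-w should list backward-kill-word for that key.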

mysql startup shell problems

Having met so many startup errors, I decided to analyze the mysql startup shell script, but there are some code fragments I cannot understand clearly.
version:
mysql Ver 14.14 Distrib 5.5.43, for osx10.8 (i386) using readline 5.1
368 #
369 # First, try to find BASEDIR and ledir (where mysqld is)
370 #
372 if echo '/usr/local/mysql/share' | grep '^/usr/local/mysql' > /dev/null
373 then
374 relpkgdata=`echo '/usr/local/mysql/share' | sed -e 's,^/usr/local/mysql,,' -e 's,^/,,' -e 's,^,./,'`
375 else
376 # pkgdatadir is not relative to prefix
377 relpkgdata='/usr/local/mysql/share'
378 fi
What's the purpose of line 372? It looks a little weird.
Any help will be appreciated.
At first glance, this is very strange indeed... but here's a solution to this mystery.
372: if echo '/usr/local/mysql/share' | grep '^/usr/local/mysql' > /dev/null
373: then
grep returns true if it matches and false if it doesn't, so this is testing whether the string /usr/local/mysql/share begins with (^) /usr/local/mysql. Output goes to /dev/null because we don't need to see it; we only care whether it matched.
"Well," you interject, "that's obvious enough. The question is why?" Stick with me.
If it matches:
374: relpkgdata=`echo '/usr/local/mysql/share' | sed -e 's,^/usr/local/mysql,,' -e 's,^/,,' -e 's,^,./,'`
Beginning with /usr/local/mysql/share, strip off the beginning /usr/local/mysql, then strip off the beginning / then prepend ./.
So /usr/local/mysql/share becomes ./share.
Otherwise, use the string /usr/local/mysql/share.
375: else
376: # pkgdatadir is not relative to prefix
377: relpkgdata='/usr/local/mysql/share'
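The logic walked through so far can be tried in isolation. This is a minimal sketch with the two paths turned into parameters; the function name relativize is made up for illustration, and the prefix is assumed to contain no characters that are special in a regular expression.

```shell
#!/bin/sh
# Hypothetical helper mirroring lines 372-378: print PATH relative to
# PREFIX (as ./...) when PATH starts with PREFIX, else print it unchanged.
relativize() {
    prefix=$1 path=$2
    if echo "$path" | grep "^$prefix" > /dev/null
    then
        # Strip the prefix, strip the leading slash, prepend "./".
        echo "$path" | sed -e "s,^$prefix,," -e 's,^/,,' -e 's,^,./,'
    else
        # Not under the prefix: keep the absolute path.
        echo "$path"
    fi
}

relativize /usr/local/mysql /usr/local/mysql/share   # prints ./share
relativize /usr/local/mysql /var/lib/mysql-share     # prints /var/lib/mysql-share
```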
"That's all fine, too," I hear you say, "but why go through all these gyrations to (apparently) compare and massage two fixed literal strings?? We already know the answer, so what's up with all the tests and substitution?"
It's a fair question.
My first suspicion was that there was some sort of magic bash hackery going on that I didn't recognize, but no, this code is really all too simple to be something along those lines.
My second suspicion, since this is notably absent from MySQL 5.0.96 (which I am not running but keep on hand for reference), was that this was an abandoned attempt to introduce some new magical behavior into mysqld_safe which was never finished and replaced with actual variables, the testing and massaging of which would have made a lot more sense than doing the same thing to literal strings.
But, no. When the only tool you have is a hammer, everything looks like a nail. What this is, is an example of doing something simple... the hard way. At least that's what it looks like to me. There actually is a somewhat rational explanation. To find the answer, you have to look into the source code (not binary) distribution.
MySQL has a lot of "hard-coded" defaults. This turns out to be an example of these.
In the source file scripts/mysqld_safe.sh, the snippet above looks very different:
if echo '#pkgdatadir#' | grep '^#prefix#' > /dev/null
then
relpkgdata=`echo '#pkgdatadir#' | sed -e 's,^#prefix#,,' -e 's,^/,,' -e 's,^,./,'`
else
# pkgdatadir is not relative to prefix
relpkgdata='#pkgdatadir#'
fi
Ah, source munging. Pattern substitution.
When you're compiling MySQL from source, the file scripts/Makefile contains instructions that use sed to replace things like #prefix# and #pkgdatadir# with the literal values. The same thing, of course, happens when Oracle or the Linux distro maintainers compile their binary distribution from source. These paths get hard-coded into many, many, many places in the code, including this script... resulting in the otherwise incomprehensible comparison of two literal strings that somebody should already have known the answer to.
Instead of testing at build time whether one path is an anchored prefix of the other (and therefore whether the "relpkgdata" value should be expressed relative to the current directory) and generating the script accordingly, the logical test is deferred until runtime, where it compares two literals that were substituted in for their placeholders at build time.
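The build-time substitution can be imitated on a single line of the template. This is only a sketch of the idea, not the actual rule from scripts/Makefile: the sed expressions and variable names are assumptions, and the two values are the ones seen in the installed script.

```shell
#!/bin/sh
# Values that the build would normally supply.
prefix=/usr/local/mysql
pkgdatadir=/usr/local/mysql/share

# One line of the template, with its placeholders still in place.
template="if echo '#pkgdatadir#' | grep '^#prefix#' > /dev/null"

# Replace each placeholder with its literal value, as the build step does.
generated=$(printf '%s\n' "$template" |
    sed -e "s,#prefix#,$prefix,g" -e "s,#pkgdatadir#,$pkgdatadir,g")

printf '%s\n' "$generated"
# prints: if echo '/usr/local/mysql/share' | grep '^/usr/local/mysql' > /dev/null
```

The result is the comparison of two literals that appears as line 372 of the installed script.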
I've gone to this amount of detail, not because it will help you troubleshoot, because I suspect it won't. It was, however, just bizarre enough to warrant some further investigation.
If you are having difficulty getting MySQL Server running... well, you shouldn't be, because it's a well-established system and it should work. If /bin/sh on your system isn't a symlink to /bin/bash, you might want to change mysqld_safe's shebang line from #!/bin/sh to #!/bin/bash, but beyond that, I suspect you are sniffing down the wrong rabbit hole by looking at mysqld_safe to get to the bottom of your issue. As convoluted as mysqld_safe is, it can't be said that it isn't time-tested. As they say, "the problem is somewhere else."
If I may, I'll suggest that you familiarize yourself with some of our other communities where you're likely to find the answer you need, particularly Ask Ubuntu, Super User, Server Fault, and Database Administrators. Familiarize yourself with each site's community, scope, and the level of existing expertise that each community expects on the part of those who ask questions there, and search the sites for the specific problem you're encountering. It's very likely someone has seen it and we've fixed it on one of them, if not here on SO.

Break on namespace function in gdb (llvm)

I'm trying to step through llvm's opt program (for an assignment) and the instructor suggested setting a breakpoint at runOnFunction. I see this in one of the files:
bool InstCombiner::runOnFunction(Function &F) { /* (Code removed for SO) */ }
but gdb does not seem to find the runOnFunction breakpoint. It occurred to me that the problem might be namespaces? I tried this but gdb never breaks, it just creates the fooOpt.s file:
(gdb) b runOnFunction
Function "runOnFunction" not defined.
Make breakpoint pending on future shared library load? (y or [n]) y
Breakpoint 1 (runOnFunction) pending.
(gdb) r -S -instcombine -debug -o ~/Desktop/fooOpt.s ~/Desktop/foo.s
I'm on a Mac so I don't have objdump but otool produces 5.6 million lines, wading through that for a starting point does not seem reasonable as runOnFunction appears more than once there.
gdb has several built-in commands for finding the names of such functions. The first is info functions, which takes an optional regexp argument to filter the available functions: https://sourceware.org/gdb/current/onlinedocs/gdb/Symbols.html
info functions regexp
Print the names and data types of all defined functions whose names contain a match for regular expression regexp. Thus, ‘info fun step’ finds all functions whose names include step; ‘info fun ^step’ finds those whose names start with step. If a function name contains characters that conflict with the regular expression language (e.g. ‘operator*()’), they may be quoted with a backslash.
So, you can try info functions runOnFunction to get the full name. Sometimes it helps to put quotes around the name in the break command.
The other way is to use the rbreak command instead of break (b). rbreak does a regexp search over function names and may set several breakpoints: https://sourceware.org/gdb/current/onlinedocs/gdb/Set-Breaks.html#Set-Breaks
rbreak regex
Set breakpoints on all functions matching the regular expression regex. This command sets an unconditional breakpoint on all matches, printing a list of all breakpoints it set. ...
The syntax of the regular expression is the standard one used with tools like grep. Note that this is different from the syntax used by shells, so for instance foo* matches all functions that include an fo followed by zero or more os. There is an implicit .* leading and trailing the regular expression you supply, so to match only functions that begin with foo, use ^foo.
(or even rbreak file:regex to limit search to single source file)
PS: if you want, you can turn C++ function name demangling on or off with set print demangle on or off (https://sourceware.org/gdb/current/onlinedocs/gdb/Debugging-C-Plus-Plus.html#Debugging-C-Plus-Plus). With demangling turned off it will be easier to copy the function name into the break command.
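The commands from this answer can be collected into a gdb command file so they do not have to be retyped each session. This is a sketch: the file name is arbitrary, the location of the opt binary is an assumption, and the run arguments are the ones from the question.

```shell
#!/bin/sh
# Write a gdb command file; the name and location are arbitrary choices.
cmdfile=${TMPDIR:-/tmp}/opt-breaks.gdb

cat > "$cmdfile" <<'EOF'
# List every function whose name contains runOnFunction.
info functions runOnFunction
# Set a breakpoint on each function matching the regex.
rbreak InstCombiner::runOnFunction
run -S -instcombine -debug -o ~/Desktop/fooOpt.s ~/Desktop/foo.s
EOF

# Start gdb with the file; "opt" is assumed to be on PATH or in the current directory.
echo "gdb -x $cmdfile opt"
```

Using the qualified name in the rbreak regex keeps gdb from breaking in every other pass that also defines runOnFunction.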

Perl - getlogin, getpwuid, and $<

Wanted to understand the example line of code given @ perldoc.perl.org for getlogin
$login = getlogin || getpwuid($<) || "Kilroy";
It seems like it tries to get the user name from getlogin or getpwuid, but if either fails, use Kilroy instead. I might be wrong, so please correct me. Also, I've been using getlogin() in previous scripts - is there any difference between getlogin() and getlogin?
What is this code safeguarding against? Also, what purpose does $< serve? I'm not exactly sure what to search for when looking up what $< is and what it does.
EDIT
Found this in the special variables section - still don't know why it is needed or what it does in the example above
$<
The real uid of this process. (Mnemonic: it's the uid you came from, if you're running setuid.) You can change both the real uid and the effective uid at the same time by using POSIX::setuid(). Since changes to $< require a system call, check $! after a change attempt to detect any possible errors.
EDIT x2
Is this line comparable to the above example? (it is currently what I use to avoid any potential problems with "cron" executing a script - i've never run into this problem, but i am trying to avoid any theoretical problem)
my $username = getlogin(); if(!($username)){$username = 'jsmith';}
You're exactly right. If getlogin returns false, it will try getpwuid($<); if that returns false too, it will set $login to "Kilroy".
$< is the real uid of the process. Even if you're running in a setuid environment it will return the original uid the process was started from.
Edit to match your edit :)
getpwuid returns the user's name by the UID (in scalar context, which would be the case here). You would want $< as an argument in case the program switched UID at some point ($< is the original one it was started with).
The only thing it's guarding against is the fact that on some systems, in some circumstances, getlogin can fail to return anything useful. In particular, getlogin only does anything useful when the process it's in has a "controlling terminal", which non-interactive processes may not. See, e.g., http://www.perlmonks.org/?node_id=663562.
I think the fallback of "Kilroy" is just for fun, though in principle getpwuid can fail to return anything useful too. (You can have a user ID that doesn't have an entry in the password database.)
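For comparison, the same fallback chain can be written from the shell side. This is an analogy rather than a translation of the Perl line: it assumes logname as the counterpart of getlogin (it also needs a login session, so it tends to fail under cron) and id -unr as the counterpart of getpwuid($<), since it prints the name for the real uid.

```shell
#!/bin/sh
# Try the login name first; logname fails when there is no login session.
login=$(logname 2>/dev/null)
# Fall back to the name that belongs to the real uid.
[ -n "$login" ] || login=$(id -unr 2>/dev/null)
# Last resort, as in the Perl example.
[ -n "$login" ] || login=Kilroy

echo "$login"
```

Each step runs only when the previous one produced nothing, which is what the chain of || operators does in the Perl version.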

sed: modify function arguments

I'm trying to write a sed command that will allow me to modify a function's arguments. The number of arguments can be variable.
If this is my function:
int myFunction(int arg1, int arg2, Dog arg3) {
// function implementation
}
I would like to be able to perform addition operations on int arg1, int arg2, ...
Here's what I have that does not work:
sed -e '/^[a-zA-Z0-9_]\+\s\+[a-zA-Z0-9_]\+(/ , /)[\n\s]*{/ {
# arguments should be listed here
}'
Any help is appreciated. Go easy on me, it's my first attempt at sed / shell scripting.
Thanks.
Ultimately, sed is not the correct tool for this job - because of your comment about 'the number of arguments can be variable'. If you are dealing with a fixed number of arguments of a fixed type, you can kludge your way around it, but any more general processing requires a more general processor (than sed).
I suggest trying a different task as your introduction to shell scripting and sed.
If you must do it, then maybe:
sed '/^[A-Za-z_][A-Za-z0-9_]* *[A-Za-z_][A-Za-z0-9_]* *([A-Za-z_][A-Za-z0-9_]* *\([A-Za-z_][A-Za-z0-9_]*\) *, *[A-Za-z_][A-Za-z0-9_]* *\([A-Za-z_][A-Za-z0-9_]*\)[ ,)].*{/{
p
s//    return \1 + \2;/
}' "$file"
That horror of a match contains the sequence [A-Za-z_][A-Za-z0-9_]* 6 times; it matches an identifier each time. The segment '[ ,)].*{' at the end allows for a third or subsequent arguments. The spaces in the pattern should, perhaps, be '[<blank><tab>]' character classes, but they're a pain to enter on StackOverflow. The regex then matches a function definition, and captures (in the '\(<identifier>\)' parts) the names of the two variables (arg1 and arg2 in your example). The actions when this is recognized are:
p - print the line that was recognized.
s//.../ - the empty pattern reuses the address regex, so the matched text (the whole line up to the opening brace) is replaced by a return statement that is the sum of the two remembered argument names; sed prints the result when it finishes with the line. The a (append) command cannot do this, because its text is output literally, without expanding \1 and \2. The braces group the operations together.
Some versions of sed support more powerful regular expressions than others; I'm not sure though that even GNU sed supports PCRE (Perl-Compatible Regular Expressions), and it would take something like PCRE to significantly reduce the regex.
Note that this script leaves the comment line '// function implementation' untouched. It's your call what you do with that.
Finally, remember that if you write more than one function to add two integers together, you are wasting code. Therefore, this is not a plausible transformation. Each function should do something different, somehow. Granted, if the types are different each time, then maybe it has its uses, but even so, it would be easier to write a generator than to parse the skeleton and fill in the bits. And that might be a good scripting exercise.
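As a starting point for that exercise, a generator can be only a few lines of shell. This is a sketch under stated assumptions: the function name gen_adder is made up, every argument is assumed to be an int, and the output is a C function that returns their sum.

```shell
#!/bin/sh
# Hypothetical generator: gen_adder NAME ARG...
# Prints a C function NAME taking int ARG... and returning their sum.
gen_adder() {
    # Need a function name and at least one argument name.
    [ "$#" -ge 2 ] || return 1
    name=$1
    shift
    params=
    sum=
    for arg in "$@"; do
        # Append with a separator only when something is already there.
        params=${params:+$params, }"int $arg"
        sum=${sum:+$sum + }$arg
    done
    printf 'int %s(%s) {\n    return %s;\n}\n' "$name" "$params" "$sum"
}

gen_adder myFunction arg1 arg2
```

Because the argument names are plain shell words here, a variable number of arguments costs nothing, which is the part the sed approach struggles with.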