Equivalent of abbrev-mode but for functions? - function

I'm a big fan of abbrev-mode and I'd like something a bit similar: you start typing and as soon as you enter some punctation (or just a space would be enough) it invokes a function (if I type space after a special abbreviation, of course, just like abbrev-mode does).
I definitely do NOT want to execute some function every single time I hit space...
So instead of expanding the abbreviation using abbrev-mode, it would run a function of my choice.
Of course it needs to be compatible with abbrev-mode, which I use all the time.
How can I get this behavior?

One approach could be to use pre-abbrev-expand-hook. I don't use abbrev mode myself, but it rather sounds as if you could re-use the abbrev mode machinery this way, and simply define some 'abbreviations' which expand to themselves (or to nothing?), and then you catch them in that hook and take whatever action you wish to.
The expand library is apparently related, and that provides expand-expand-hook, which may be another alternative?
edit: Whoops; pre-abbrev-expand-hook is obsolete since 23.1
abbrev-expand-functions is the correct variable to use:
Wrapper hook around `expand-abbrev'.
The functions on this special hook are called with one argument:
a function that performs the abbrev expansion. It should return
the abbrev symbol if expansion took place.
See M-x find-function RET expand-abbrev RET for the code, and you'll also want to read C-h f with-wrapper-hook RET to understand how this hook is used.

EDIT:
Your revised question adds some key details that my answer didn't address. phils has provided one way to approach this issue. Another would be to use yasnippet . You can include arbitrary lisp code in your snippet templates, so you could do something like this:
# -*- mode: snippet -*-
# name: foobars
# key: fbf
# binding: direct-keybinding
# --
`(foo-bar-for-the-win)`
You'd need to ensure your function didn't return anything, or it would be inserted in the buffer. I don't use abbrev-mode, so I don't know if this would introduce conflicts. yas/snippet takes a bit of experimenting to get it running, but it's pretty handy once you get it set up.
Original answer:
You can bind space to any function you like. You could bind all of the punctuation keys to the same function, or to different functions.
(define-key your-mode-map " " 'your-choice-function)
You probably want to do this within a custom mode map, so you can return to normal behaviour when you switch modes. Globally setting space to anything but self-insert would be unhelpful.

Every abbrev is composed of several elements. Among the main elements are the name (e.g. "fbf"), the expansion (any string you like), and the hook (a function that gets called). In your case it sounds like you want the expansion to be the empty string and simply specify your foo-bar-for-the-win as the hook.

Related

Why is my %h is List = 1,2; a valid assignment?

While finalizing my upcoming Raku Advent Calendar post on sigils, I decided to double-check my understanding of the type constraints that sigils create. The docs describe sigil type constraints with the table
below:
Based on this table (and my general understanding of how sigils and containers work), I strongly expected this code
my %percent-sigil is List = 1,2;
my #at-sigil is Map = :k<v>;
to throw an error.
Specifically, I expected that is List would attempt to bind the %-sigiled variable to a List, and that this would throw an X::TypeCheck::Binding error – the same error that my %h := 1,2 throws.
But it didn't error. The first line created a List that seemed perfectly ordinary in every way, other than the sigil on its variable. And the second created a seemingly normal Map. Neither of them secretly had Scalar intermediaries, at least as far as I could tell with VAR and similar introspection.
I took a very quick look at the World.nqp source code, and it seems at least plausible that discarding the % type constraint with is List is intended behavior.
So, is this behavior correct/intended? If so, why? And how does that fit in with the type constraints and other guarantees that sigils typically provide?
(I have to admit, seeing an %-sigiled variable that doesn't support Associative indexing kind of shocked me…)
I think this is a grey area, somewhere between DIHWIDT (Docter, It Hurts When I Do This) and an oversight in implementation.
Thing is, you can create your own class and use that in the is trait. Basically, that overrides the type with which the object will be created from the default Hash (for %) and Array (for # sigils). As long as you provide the interface methods, it (currently) works. For example:
class Foo {
method AT-KEY($) { 42 }
}
my %h is Foo;
say %h<a>; # 42
However, if you want to pass such an object as an argument to a sub with a % sigil in the signature, it will fail because the class did not consume the Associatve role:
sub bar(%) { 666 }
say bar(%h);
===SORRY!=== Error while compiling -e
Calling bar(A) will never work with declared signature (%)
I'm not sure why the test for Associative (for the % sigil) and Positional (for #) is not enforced at compile time with the is trait. I would assume it was an oversight, maybe something to be fixed in 6.e.
Quoting the Parameters and arguments section of the S06 specification/speculation document about the related issue of binding arguments to routine parameters:
Array and hash parameters are simply bound "as is". (Conjectural: future versions ... may do static analysis and forbid assignments to array and hash parameters that can be caught by it. This will, however, only happen with the appropriate use declaration to opt in to that language version.)
Sure enough the Rakudo compiler implemented some rudimentary static analysis (in its AOT compilation optimization pass) that normally (but see footnote 3 in this SO answer) insists on binding # routine parameters to values that do the Positional role and % ones to Associatives.
I think this was the case from the first official Raku supporting release of Rakudo, in 2016, but regardless, I'm pretty sure the "appropriate use declaration" is any language version declaration, including none. If your/our druthers are static typing for the win for # and % sigils, and I think they are, then that's presumably very appropriate!
Another source is the IRC logs. A quick search quickly got me nothing.
Hmm. Let's check the blame for the above verbiage so I can find when it was last updated and maybe spot contemporaneous IRC discussion. Oooh.
That is an extraordinary read.
"oversight" isn't the right word.
I don't have time tonight to search the IRC logs to see what led up to that commit, but I daresay it's interesting. The previous text was talking about a PL design I really liked the sound of in terms of immutability, such that code could become increasingly immutable by simply swapping out one kind of scalar container for another. Very nice! But reality is important, and Jonathan switched the verbiage to the implementation reality. The switch toward static typing certainty is welcome, but has it seriously harmed the performance and immutability options? I don't know. Time for me to go to sleep and head off for seasonal family visits. Happy holidays...

How do i know when a function body ends in assembly

I want to know when a function body end in assemby, for example in c you have this brakets {} that tell you when the function body start and when it ends but how do i know this in assembly?
Is there a parser that can extract me all the functions from assembly and start line and endline of their body?
There's no foolproof way, and there might not even be a well-defined correct answer in hand-written asm.
Usually (e.g. in compiler-generated code) you know a function ends when you see the next global symbol, like objdump does to decide when to print a new "banner". But without all function-start symbols being visible, there's no unambigious way. That's why some object file formats have room for size metadata associated with a symbol. Like .size foo, . - foo in GAS syntax.
It's not as easy as looking for a ret; some functions end with a jmp tail-call to another function. And some call a noreturn function like abort or __stack_chk_fail (not tailcall because they want to push a return address for a backtrace.) Or just fall off into whatever's next because that path had undefined behaviour in the source so the compiler assumed it wasn't reachable and stopped generating instructions for it, e.g. a C++ non-void function where execution can/does fall off the end without a return.
In general, assembly can blur the lines of what a function is.
Asm has features you can use to implement the high-level concept of a function, but you're not restricted to that.
e.g. multiple asm functions could all return by jumping to a common block of code that pops some registers before a ret. Is that shared tail a separate function that's called with a tail-called with a special calling convention?
Compilers don't usually do that, but humans could.
As for function entry points, usually some other code somewhere in the program will contain a call to it. But not necessarily; it might only be reachable via a table of function pointers, and you don't know that a block of .rodata holds function pointers until you find some code loading from it and calling or jumping.
But that doesn't work if the lowest-address instruction of the function isn't its entry point. See Does a function with instructions before the entry-point label cause problems for anything (linking)? for an example
Compilers don't generate code like that, but humans can. (It's a handy trick sometimes for https://codegolf.stackexchange.com/ questions.)
Or in the general case, a function might have multiple entry points. Or you could describe that as multiple functions with overlapping implementations. Sometimes it's as simple as one tailcalling another by falling into it without needing a jmp, i.e. it starts a few instructions before another.
I wan't to know when a function body ends in assembly, [...]
There are mainly four ways that the execution of a stream of (userspace) instructions can "end":
An unconditional jump like jmp or a conditional one like Jcc (je,jnz,jg ...)
A ret instruction (meaning the end of a subroutine) which probably comes closest to the intent of your question (including the ExitProcess "ret" command)
The call of another "function"
An exception. Not a C style exception, but rather an exception like "Invalid instruction" or "Division by 0" which terminates the user space program
[...] for example in c you have this brakets {} that tell you when the function body start and when it ends but how do i know this in assembly?
Simple answer: you don't. On the machine level every address can (theoretically) be an entry point to a "function". So there is no unique entry point to a "function" other than defined - and you can define anything.
On a tangent, this relates to self-modifying code and viruses, but it must not. The exit/end is as described in the first part above.
Is there a parser that can extract me all the functions from assembly and
start line and endline of their body?
Disassemblers create some kind of "functions" with entry and exit points. But they are merely assumed. No way to know if that assumption is correct. This may cause problems.
The usual approach is using a disassembler and the work to recombinate the stream of instructions to different "functions" remains to the person that mandated this task (vulgo: you). Some tools exist that claim to simplify this, but I cannot judge their efficacy.
From the perspective of a high level language, there are decompilers that try to reverse the transformation from (for example) C to assembly/machine code that try to automatize that task and will work more or less or in some cases.

Why does tcl consider an empty string a double?

Quite confused about this one:
$ tclsh
% string is double {}
1
Why would tcl consider an empty string to be a valid double?
Summary: It's a bit of a misdesign and was a place where the initial use-case was the wrong one, but we can't change it in 8.* (I'm not sure about Tcl 9.0; we still want to avoid gratuitous changes there).
The string is command was originally designed to support the Tk entry widget's validation options. These let the widget respond to typing (or focus changs) by checking whether the change made the widget be in a valid state, such as holding a integer. If you wanted that, you'd just do this:
entry $w -validate key -vcmd {string is integer %P} -invcmd {bell}
Then, if you pressed a letter key, say A, with the cursor in the middle of an integer, the edit would be rejected and the system would make a warning noise. Really easy.
There's only one slight problem. If you had selected all the text in the entry and the pressed a digit, the edit would also be rejected (if string is was strict by default). The problem is that there's an intermediate transition state in the edit where the old text is deleted but before the new text is inserted: the validation occurs twice in such a situation, once for the delete and once for the insert. (It has to be that way because of the way things are tied together under the hood.) That's a terrible user experience, so string is was made lax by default so that this use case would work.
It's not a decision I agreed with — it should have been the other way round, with you needing to request laxity in the test if you want it, which would have added very little overhead here while allowing other uses to be saner — but I was just an ordinary user at that point. I prefer to use a multi-stage validation in my forms, such as using keypress level validation as a soft validation that allows bad input while the user is part-way through using the form, and just indicates that it knows that problems exist anyway, via techniques such as adjusting background colours and disabling submit buttons. (But that's off-topic for your question…)
Library command design is tricky. It takes careful consideration of use-cases to get right. Sometimes we fail.
The problem originated in code that was external to Tcl and Tk in about the time of Tcl 8.1.0. Most of the patch that introduced this was very good (it also gave us commands such as string equal and string map) but this was an aspect that could have done with a little more cooking.

How to select a specific .m function when two exist?

First, here the way i'm calling the function :
eval([functionName '(''stringArg'')']); % functionName = 'someStringForTheFunctionName'
Now, I have two functionName functions in my path, one that take the stringArg and another one that takes something else. I'm getting some errors because right now the first one it finds is the function that doesn't take the stringArg. Considering the way i'm calling the functionName function, how is it possible to call the correct function?
Edit:
I tried the function which :
which -all someStringForTheFunctionName
The result :
C:\........\x\someStringForTheFunctionName
C:\........\y\someStringForTheFunctionName % Shadowed
The shadowed function is the one i want to call.
Function names must be unique in MATLAB. If they are not, so there are duplicate names, then MATLAB uses the first one it finds on your search path.
Having said that, there are a few options open to you.
Option 1. Use # directories, putting each version in a separate directory. Essentially you are using the ability of MATLAB to apply a function to specific classes. So, you might set up a pair of directories:
#char
#double
Put your copies of myfun.m in the respective directories. Now when MATLAB sees a double input to myfun, it will direct the call to the double version. When MATLAB gets char input, it goes to the char version.
BE CAREFUL. Do not put these # directories explicitly on your search path. DO put them INSIDE a directory that is on your search path.
A problem with this scheme is if you call the function with a SINGLE precision input, MATLAB will probably have a fit, so you would need separate versions for single, uint8, int8, int32, etc. You cannot just have one version for all numeric types.
Option 2. Have only one version of the function, that tests the first argument to see if it is numeric or char, then branches to perform either task as appropriate. Both pieces of code will most simply be in one file then. The simple scheme will have subfunctions or nested functions to do the work.
Option 3. Name the functions differently. Hey, its not the end of the world.
Option 4: As Shaun points out, one can simply change the current directory. MATLAB always looks first in your current directory, so it will find the function in that directory as needed. One problem is this is time consuming. Any time you touch a directory, things slow down, because there is now disk input needed.
The worst part of changing directories is in how you use MATLAB. It is (IMHO) a poor programming style to force the user to always be in a specific directory based on what code inputs they wish to run. Better is a data driven scheme. If you will be reading in or writing out data, then be in THAT directory. Use the MATLAB search path to categorize all of your functions, as functions tend not to change much. This is a far cleaner way to work than requiring the user to migrate to specific directories based on how they will be calling a given function.
Personally, I'd tend to suggest option 2 as the best. It is clean. It has only ONE main function that you need to work with. If you want to keep the functions district, put them as separate nested or sub functions inside the main function body. Inside of course, they will have distinct names, based on how they are driven.
OK, so a messy answer, but it should do it. My test function was 'echo'
funcstr='echo'; % string representation of function
Fs=which('-all',funcstr);
for v=1:length(Fs)
if (strcmp(Fs{v}(end-1:end),'.m')) % Don''t move built-ins, they will be shadowed anyway
movefile(Fs{v},[Fs{v} '_BK']);
end
end
for v=1:length(Fs)
if (strcmp(Fs{v}(end-1:end),'.m'))
movefile([Fs{v} '_BK'],Fs{v});
end
try
eval([funcstr '(''stringArg'')']);
break;
catch
if (strcmp(Fs{v}(end-1:end),'.m'))
movefile(Fs{v},[Fs{v} '_BK']);
end
end
end
for w=1:v
if (strcmp(Fs{v}(end-1:end),'.m'))
movefile([Fs{v} '_BK'],Fs{v});
end
end
You can also create a function handle for the shadowed function. The problem is that the first function is higher on the matlab path, but you can circumvent that by (temporarily) changing the current directory.
Although it is not nice imo to change that current directory (actually I'd rather never change it while executing code), it will solve the problem quite easily; especially if you use it in the configuration part of your function with a persistent function handle:
function outputpars = myMainExecFunction(inputpars)
% configuration
persistent shadowfun;
if isempty(shadowfun)
funpath1 = 'C:\........\x\fun';
funpath2 = 'C:\........\y\fun'; % Shadowed
curcd = cd;
cd(funpath2);
shadowfun = #fun;
cd(curcd); % and go back to the original cd
end
outputpars{1} = shadowfun(inputpars); % will use the shadowed function
oupputpars{2} = fun(inputparts); % will use the function highest on the matlab path
end
This problem was also discussed here as a possible solution to this problem.
I believe it actually is the only way to overload a builtin function outside the source directory of the overloading function (eg. you want to run your own sum.m in a directory other than where your sum.m is located.)
EDIT: Old answer no longer good
The run command won't work because its a function, not a script.
Instead, your best approach would be honestly just figure out which of the functions need to be run, get the current dir, change it to the one your function is in, run it, and then change back to your start dir.
This approach, while not perfect, seems MUCH easier to code, to read, and less prone to breaking. And it requires no changing of names or creating extra files or function handles.

Is stdout necessary in tcl?

I'm very new to tcl and I hope to become proficient in this language, so I thought I should ask if there was a reason why some example codes have stdout in the code and some just use puts. The book doesn't seem to explain this. Was the IDE updated to automatically assume stdout?
The puts command writes to stdout if you don't provide it with a channel name (and has worked this way since… well, since forever). If you want, you can put it in or leave it out; it doesn't matter, but might make things clearer if done one way or the other. Let it be your choice. The only mandatory argument to puts is the string to write.
In performance terms, it's slightly faster to omit the value, but you should disregard that as the additional cost of a lookup is absolutely measly by comparison with the cost of doing IO at all. Clarity of expression is the only valid reason for picking one over the other.
There is a case where it matters though. If you want to write code that writes to standard output by default, but where you can override it to write to a file (or a socket or a pipe or …) then it's much easier to have the channel name stored in a variable (that is set to stdout by default, of course). Like that, all your code is the same except for one bit: it's not a good idea to close stdout normally. That's easy to avoid with code like this:
# Also don't want to close stdin or stderr...
if {![string match std* $channel]} {
close $channel
}