Swig and %rename filters - swig

Is there a way to run multiple filters in a single %rename call in SWIG?
I know from the manual that I can use a line like this:
%rename("%(strip:[H3D])s") "";
which will turn all methods such as "H3DFoo" in to "Foo". There are other in-built filters for doing case transformation, but there is no documentation on how to do multiple steps.
Using another %rename replaces the filter, and I haven't found a separator to run multiple filters on the string. So, it appears possible to convert type casing or remove a prefix and not both.
In this particular case it might be possible to use the regex filter, but it would be nice to be able to both remove a prefix and convert type casing. The other option is to put a %rename on every single declaration, but this defeats the purpose of %rename being able to apply to a module in general.

I think I'd be inclined to go for the variant of %rename that can call a command if your rules are more complicated than a single variant or a regex.
I would use perl personally and it has a plethora of CPAN modules for things like renaming, e.g.:
%rename("command:perl build/rename.pl <<<")
The manual warns against this because it's slow spawning processes to perform it. Given that typically you don't run SWIG very often I don't see that as a huge draw back.

Related

Having Multiple Commands for Calling a Specific Programming Language: To Provide a Delimiter-less Option or Not?

After re-reading the off/on topic lists, I'm still not certain if this question is best posted to this site, so apologies in advance, if it is not.
Overview:
I am working on a project that mixes several programming languages and we are trying to determine important considerations for the command used to call one in particular.
For definiteness, I will list the specific languages; however, I think the principles ought to be general, so familiarity with these specific languages is not really essential.
Specific Context
Specifically, we are using: Maxima, KaTeX, Markdown and HTML). While building the prototype, we have used the following (I believe, standard) conventions:
KaTeX delimited by $ $ or $$ $$;
HTML delimited by < > </ > pairs;
Markdown works anywhere in the body, except within KaTeX or Maxima environments;
The only non-standard convention we used during this design phase was to call on Maxima using \comp{<Maxima commands>}. This command works within all the other environments (which is desired).
Now that we are ready to start using the platform, it has become apparent that this temporary command for calling Maxima is cumbersome for our users. The vast majority of use cases involve simply calling a single variable or function, e.g.
As such, we have $\eval{function-name()}(\eval{variable-name})$
as opposed to actually using Maxima for computation, e.g.
Here, it is clear that $\eval{a} + \eval{b} = \eval{a+b}$
(where \eval{a+b} would return the actual sum, as calculated by Maxima).
As such, our users would prefer a delimiter-less command option for invoking a single variable or function, e.g. \#<variable-name-in-Maxima> and \#<function-name>(<argument>) (where # is some reserved character not used in the other languages), while also having a delimited alternative for the (much less frequent) cases where they actually want to use Maxima for computation; perhaps something like \#{a+b}.
However, we have a general sense that this is not a best practice, even though we can't foresee any specific issue.
"Research" / Comparisons:
Indeed, there is precedence for delimit-less expressions for single arguments like x^2 (on any calculator) or Knuth's a \over b in TeX (which persists in LaTeX with \frac12 being parsed as \frac{1}{2}.
IIRC Knuth's point was that this delimit-less notation was more semantic (and so, in his view, preferable), and because delimiters can be added, ambiguity can be avoided, whenever the need arises: e.g. x^{22}, {a+b}\over{c+d} and \frac{12}{3}.
The Question, Proper:
Can anyone point to or explain actual shortcomings / risks associated with a dual solution like:
\#<var>, \#<function>(<arg>) and,
\#[<extended expression>],
(where # is a reserved (& escapable) character), for calling one language amongst others, as opposed to only using a delimited command?
Any alternative suggestions for how to achieve the ease-of-use and more semantic code enabled by the above solution, while keeping the code unambiguous would be very much welcome and appreciated.

Why does a function name have to be specified in a use statement?

In perl, sometimes it is necessary to specify the function name in the use statement.
For example:
use Data::DPath ('dpath');
will work but
use Data::DPath;
won't.
Other modules don't need the function names specified, for example:
use WWW::Mechanize;
Why?
Each module chooses what functions it exports by default. Some choose to export no functions by default at all, you have to ask for them. There's a few good reasons to do this, and one bad one.
If you're a class like WWW::Mechanize, then you don't need to export any functions. Everything is a class or object method. my $mech = WWW::Mechanize->new.
If you're a pragma like strict then there are no functions nor methods, it does its work simply by being loaded.
Some modules export waaay too many functions by default. An example is Test::Deep which exports...
all any array array_each arrayelementsonly arraylength arraylengthonly bag blessed bool cmp_bag cmp_deeply cmp_methods cmp_set code eq_deeply hash
hash_each hashkeys hashkeysonly ignore Isa isa listmethods methods noclass
none noneof num obj_isa re reftype regexpmatches regexponly regexpref
regexprefonly scalarrefonly scalref set shallow str subbagof subhashof
subsetof superbagof superhashof supersetof useclass
The problem comes when another module tries to export the same functions, or if you write a function with the same name. Then they clash and you get mysterious warnings.
$ cat ~/tmp/test.plx
use Test::Deep;
use List::Util qw(all);
$ perl -w ~/tmp/test.plx
Subroutine main::all redefined at /Users/schwern/perl5/perlbrew/perls/perl-5.20.2/lib/5.20.2/Exporter.pm line 66.
at /Users/schwern/tmp/test.plx line 2.
Prototype mismatch: sub main::all: none vs (&#) at /Users/schwern/perl5/perlbrew/perls/perl-5.20.2/lib/5.20.2/Exporter.pm line 66.
at /Users/schwern/tmp/test.plx line 2.
For this reason, exporting lots of functions is discouraged. For example, the Exporter documentation advises...
Do not export method names!
Do not export anything else by default without a good reason!
Exports pollute the namespace of the module user. If you must export try to use #EXPORT_OK in preference to #EXPORT and avoid short or common symbol names to reduce the risk of name clashes.
Unfortunately, some modules take this too far. Data::DPath is a good example. It has a really clear main function, dpath(), which it should export by default. Otherwise it's basically useless.
You can always turn off exporting with use Some::Module ();.
The reason is that some modules simply contain functions in them and they may or may not have chosen to export them by default, and that means they may need to be explicitly imported by the script to access directly or use a fully qualified name to access them. For example:
# in some script
use SomeModule;
# ...
SomeModule::some_function(...);
or
use SomeModule ('some_function');
# ...
some_function(...);
This can be the case if the module was not intended to be used in an object-oriented way, i.e. where no classes have been defined and lines such as my $obj = SomeModule->new() wouldn't work.
If the module has defined content in the EXPORT_OK array, it means that the client code will only get access to it if it "asks for it", rather than "automatically" when it's actually present in the EXPORT array.
Some modules automatically export their content by means of the #EXPORT array. This question and the Exporter docs have more detail on this.
Without you actually posting an MCVE, it's difficult to know what you've done in your Funcs.pm module that may be allowing you to import everything without using EXPORT and EXPORT_OK arrays. Perhaps you did not include the package Funcs; line in your module, as #JonathanLeffler suggested in the comments. Perhaps you did something else. Perl is one of those languages where people pride themselves in the TMTOWTDI mantra, often to a detrimental/counter-productive level, IMHO.
The 2nd example you presented is very different and fairly straightforward. When you have something like:
use WWW::Mechanize;
my $mech = new WWW::Mechanize;
$mech->get("http://www.google.com");
you're simply instantiating an object of type WWW::Mechanize and calling an instance method, called get, on it. There's no need to import an object's methods because the methods are part of the object itself. Modules looking to have an OOP approach are not meant to export anything. They're different situations.

Iterating over a string in Vimscript or Parse a JSON file

So I'm creating a vim script that needs to load and parse a JSON file into a local object graph. I searched and I couldn't find any native way to process a JSON file, and I don't want to add any dependencies to the script. So I wrote my own function to parse the JSON string (gotten from the file), but it's really slow. At the moment, I iterate through each character in the file like so:
let len = strlen(jsonString) - 1
let i = 0
while i < len
let c = strpart(jsonString, i, 1)
let i += 1
" A lot of code to process file....
" Note: I've tried short cutting the process by searching for enclosing double-quotes when I come across the initial double quotes (also taking into account escaping '\' character. It doesn't help
endwhile
I've also tried this method:
for c in split(jsonString, '\zs')
" Do a lot of parsing ....
endfor
For reference, a file with ~29,000 characters takes about 4 seconds to process, which is unacceptable.
Is there a better way to iterate over a string in vim script?
Or better yet, have I missed a native function to parse JSON?
Update:
I asked for no dependencies because I:
Didn't want to deal with them
Genuinely wanted some ideas for best way to do this without someone else's work.
Sometimes I just like to do things manually even though the problem has already been solved.
I'm not against plugins or dependencies at all, it's just that I'm curious. Thus the question.
I ended up creating my own function to parse the JSON file. I was creating a script that could parse the package.json file associated with node.js modules. Because of this, I could rely on a fairly consistent format and quit the processing whenever I'd retrieved the information I needed. This usually cut out large chunks of the file since most developers put the largest chunk of the file, their "readme" section, at the end. Because the package.json file is strictly defined, I left the process somewhat fragile. It assumed a root dictionary { } and actively looks for certain entries. You can find the script here: https://github.com/ahayman/vim-nodejs-complete/blob/master/after/ftplugin/javascript.vim#L33.
Of course, this doesn't answer my own question. It's only the solution to my unique problem. I'll wait a few days for new answers and pick the best one before the bounty ends (already set an alarm on my phone).
The simplest solution with the least dependencies is just using the json_decode vim function.
let dict = json_decode(jsonString)
Even though Vim's origin dates back a lot it happens that its internal string() eval() representation is that close to JSON that its likely to work unless you need special characters.
You can lookup the implementation here which even supports true/false/null if you want:
https://github.com/MarcWeber/vim-addon-json-encoding
Better use that library (vim-addon-manager allows to install dependencies easily).
Now it depends on your data whether this is good enough.
Now Benjamin Klein posted your question to vim_use which is why I'm replying.
Best and fast replies happen if you subscribe to the Vim mailinglist.
Goto vim.sf.net and follow the community link.
You cannot expect the Vim community to scrape stackoverflow.
I've added the keyword "json" and "parsing" to that little code that it can be found easier.
If this solution does not work for you you can try the many :h if_* bindings or write an external script which extracts the information you're looking for, or turns JSON into Vim's dictionary representation which can be read by eval() escaping special characters you care about correctly.
If you seek for completely correct solution omitting dependencies is one of the worst thing you can do. The eval() variant mentioned by #MarcWeber is one of the fastest, but it has its disadvantages:
Using solution for securing eval I mentioned in comment makes it no longer the fastest. In fact after you use this it makes eval() slower by more then an order of magnitude (0.02s vs 0.53s in my test).
It does not respect surrogate pairs.
It cannot be used to verify that you have correct JSON: it accepts some strings (e.g. "\<C-o>") that are not JSON strings and it allows trailing commas.
It fails to give normal error messages. It fails badly if you use vam#VerifyIsJSON I mentioned in p.1.
It fails to load floating point values like 1e10 (vim requires numbers to look like 1.0e10, but numbers like 1e10 are allowed: note “and/or” in the first paragraph).
. All of the above (except for the first) statements also apply to vim-addon-json-encoding mentioned by #MarcWeber because it uses eval. There are some other possibilities:
Fastest and the most correct is using python: pyeval('json.loads(vim.eval("varname"))'). Not faster then eval, but fastest among other possibilities. (0.04 in my test: approximately two times slower then eval())
Note that I use pyeval() here. If you want solution for vim version that lacks this functionality it will no longer be one of the fastest.
Use my json.vim plugin. It has an advantages of slightly better error reporting compared to failed vam#VerifyIsJSON, slightly worse compared to eval() and it correctly loads floating-point numbers. It can be used for verification of strings (it does not accept "\<C-a>"), but it loads lists with trailing comma just fine. It does not support surrogate pairs. It is also very slow: in the test I used (it uses 279702 character long strings) it takes 11.59s to load. Json.vim tries to use python if possible though.
For the best error reporting you can take yaml.vim and purge YAML support out of it leaving only JSON (I once have done the same thing for pyyaml, though in python: see markedjson library used in powerline: it is pyyaml minus YAML stuff plus classes with marks). But this variant is even slower then json.vim and should only be used if the main thing you need is error reporting: 207 seconds for loading the same 279702 character long string.
Note that the only variant mentioned that satisfies both requirements “no dependencies” and “no python” is eval(). If you are not fine with its disadvantages you have to throw away one or both of these requirements. Or copy-paste code. Though if you take speed into account only two candidates are left: eval() and python: if you want to parse json fast you really must use C and only these solutions spend most time in functions written in C.
Most other interpreters (ruby/perl/TCL) do not have pyeval() equivalent so they will be slower even if their JSON implementation is written in C. Some other (lua/racket (mzscheme)) have pyeval() equivalent, but e.g. luaeval('{}') is zero meaning that you will have to add additional step explicitly and recursively converting objects into vim dictionaries and lists (e.g. luaeval('vim.dict({})')) which will impact performance. Cannot say anything about mzeval(), but I have never heard about anybody actually using racket (mzscheme) with vim.

Equivalent of abbrev-mode but for functions?

I'm a big fan of abbrev-mode and I'd like something a bit similar: you start typing and as soon as you enter some punctation (or just a space would be enough) it invokes a function (if I type space after a special abbreviation, of course, just like abbrev-mode does).
I definitely do NOT want to execute some function every single time I hit space...
So instead of expanding the abbreviation using abbrev-mode, it would run a function of my choice.
Of course it needs to be compatible with abbrev-mode, which I use all the time.
How can I get this behavior?
One approach could be to use pre-abbrev-expand-hook. I don't use abbrev mode myself, but it rather sounds as if you could re-use the abbrev mode machinery this way, and simply define some 'abbreviations' which expand to themselves (or to nothing?), and then you catch them in that hook and take whatever action you wish to.
The expand library is apparently related, and that provides expand-expand-hook, which may be another alternative?
edit: Whoops; pre-abbrev-expand-hook is obsolete since 23.1
abbrev-expand-functions is the correct variable to use:
Wrapper hook around `expand-abbrev'.
The functions on this special hook are called with one argument:
a function that performs the abbrev expansion. It should return
the abbrev symbol if expansion took place.
See M-x find-function RET expand-abbrev RET for the code, and you'll also want to read C-h f with-wrapper-hook RET to understand how this hook is used.
EDIT:
Your revised question adds some key details that my answer didn't address. phils has provided one way to approach this issue. Another would be to use yasnippet . You can include arbitrary lisp code in your snippet templates, so you could do something like this:
# -*- mode: snippet -*-
# name: foobars
# key: fbf
# binding: direct-keybinding
# --
`(foo-bar-for-the-win)`
You'd need to ensure your function didn't return anything, or it would be inserted in the buffer. I don't use abbrev-mode, so I don't know if this would introduce conflicts. yas/snippet takes a bit of experimenting to get it running, but it's pretty handy once you get it set up.
Original answer:
You can bind space to any function you like. You could bind all of the punctuation keys to the same function, or to different functions.
(define-key your-mode-map " " 'your-choice-function)
You probably want to do this within a custom mode map, so you can return to normal behaviour when you switch modes. Globally setting space to anything but self-insert would be unhelpful.
Every abbrev is composed of several elements. Among the main elements are the name (e.g. "fbf"), the expansion (any string you like), and the hook (a function that gets called). In your case it sounds like you want the expansion to be the empty string and simply specify your foo-bar-for-the-win as the hook.

multilevel parsing algorithm

rephrase...
I'd like to know how to best to parse functions/conditionals. so if you have something like: [if {a} is {12 or 34}][if {b} not {55}] show +c+ [/if][/if] which is a conditional inside a conditional. Looks like I can't do this with regex only.
original question
for now I have a pretty simple way of parsing out some commands through actionscript.
I'm using regexp to find tags, commands and operands using...
+key_word+ // any text surrounded by +
[ifempty +val_1+]+val_2+[/ifempty] //simple conditional
[ifisnot={`true,yes`} +ShowTitle+]+val_3+[/ifisnot] // conditional with operands
my current algorithm matches the opening tag[**] with the first closing tag [/**] even though it doesn't match. Which means that I could not do something like [ifempty +val_2+][ifnotempty +val_2]+val_3+[/ifnotempty]+val_4+[/ifempty] - essentially putting one conditional inside another one.
I'm using an inline way of parsing that splits the string into an array of strings based on this regexp \[[^\/](?:[^\]])*\](?:[^\]])*\[\/(?:[^\]])*\]
can anyone suggest a more robust algorithm with a more robust parsing convention/standard? especially for as3.
Regular expressions define Regular Languages. Regular Languages cannot have regions of constrained, but potentially infinite, recursion.
One way of thinking about it is that all Regular Languages can be represented by a Finite State Machine. You would need a state for every possible number of if's, but the machine must be 'finite', so your in a bind. A classic example is:
a{n}b{n}, n >= 0
(meaning n a's, followed by n b's)
As you parse each a, you would need to go to another state (FSMs have no memory beyond the state their in, that's the only way they could remember n to match it later). To parse any number of n's, you would need an infinite number of states.
This is the same situation you're in, a regular expression could express a finite number of ifs (although it would take quite a bit of copy-pasting), but not an infinite number. Note however that some regular expression implementations cheat a bit, giving them more power than their mathematical equivalents.
In any case, your best bet is to use a more powerful parsing method. A recursive descent parser is particularly fun to implement, and could easily do what you need. You could also look into an LR-parser, or build a simple parser using a stack. Depending on your language, you might be able to find a parsing library such as pyparse for Python or Boost Spirit for C++.