How can I collapse multiple arguments into one SWIG parameter

I'm trying to write a typemap that converts multiple/variable arguments into one input parameter.
For example, say I have a function that takes a vector.
void foo(vector<int> x);
And I want to call it like this (happens to be in Perl)
foo(1,2,3,4);
The typemap should take arguments ($argnum, ...), gather them into one vector and then pass that to foo.
I have this so far:
typedef vector<int> vectori;
%typemap(in) (vectori) {
for (int i=$argnum-1; i<items; i++) {
$1->push_back( <argv i> ); // This is language dependent, of course.
}
}
This would work, except that SWIG checks the number of arguments
if ((items < 1) || (items > 1)) {
SWIG_croak("Usage: foo(vectori);");
}
If I do:
void foo(vectori, ...);
SWIG will expect to call foo with two arguments.
foo(arg1, arg2);
Perhaps there's a way to tell SWIG to suppress arg2 from the call to foo?
I can't use this in my .i:
void foo(...)
because I want to have different typemaps, depending on the types that foo is expecting (an array of int, strings, whatever). Maybe there's a way to give a type to "..."
Is there a way to do this?

SWIG has built-in support for some STL classes. Try this for your SWIG .i file:
%module mymod
%{
#include <vector>
#include <string>
void foo_int(std::vector<int> i);
void foo_str(std::vector<std::string> i);
%}
%include <std_vector.i>
%include <std_string.i>
// Declare each template used so SWIG exports an interface.
%template(vector_int) std::vector<int>;
%template(vector_str) std::vector<std::string>;
void foo_int(std::vector<int> i);
void foo_str(std::vector<std::string> i);
Then call it with array syntax in the language of choice:
#Python
import mymod
mymod.foo_int([1,2,3,4])
mymod.foo_str(['abc','def','ghi'])
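The interface above only declares foo_int and foo_str, so here is a minimal sketch of what the C++ side might look like. The bodies are assumptions (the original never says what the functions do), and sum_of and join are hypothetical helpers added only so the behaviour is easy to check:

```cpp
#include <cstddef>
#include <iostream>
#include <numeric>
#include <string>
#include <vector>

// Hypothetical helper: adds up the elements.
int sum_of(const std::vector<int> &values) {
    return std::accumulate(values.begin(), values.end(), 0);
}

// Hypothetical helper: joins the strings with commas.
std::string join(const std::vector<std::string> &values) {
    std::string out;
    for (std::size_t i = 0; i < values.size(); ++i) {
        if (i != 0) out += ",";
        out += values[i];
    }
    return out;
}

// Same signatures as the declarations in the .i file; the bodies are assumed.
void foo_int(std::vector<int> i) { std::cout << sum_of(i) << '\n'; }
void foo_str(std::vector<std::string> i) { std::cout << join(i) << '\n'; }
```

With the std_vector.i typemaps in place, the Python lists in the calls above arrive here as ordinary std::vector objects.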

SWIG determines the argument count at the time SWIG generates the bindings. SWIG does provide some limited support for variable argument lists but I'm not sure this is the right approach to take. If you're interested, you can read more about it in the SWIG vararg documentation section.
I think a better approach would be to pass these values in as an array reference. Your typemap would then look something like this (not tested):
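%typemap(in) vectori (vector<int> tmp)
{
    // Like the rest of this sketch, this assumes $1 is a pointer to the vector.
    if (!SvROK($input))
        croak("Argument $argnum is not a reference.");
    if (SvTYPE(SvRV($input)) != SVt_PVAV)
        croak("Argument $argnum is not an array.");
    $1 = &tmp;
    AV *arrayValue = (AV*)SvRV($input);
    int arrayLen = av_len(arrayValue); // index of the last element, -1 if empty
    for (int i=0; i<=arrayLen; ++i)
    {
        SV **scalarValue = av_fetch(arrayValue, i, 0);
        if (scalarValue) // av_fetch returns NULL for a missing element
            $1->push_back( SvIV(*scalarValue) ); // integer value, since the vector holds int
    }
}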
Then from Perl you'd use array notation:
@myarray = (1, 2, 3, 4);
foo(\@myarray);

Related

Using SWIG to wrap structures containing const char * without memory leak

I'm attempting to use SWIG to wrap a pre-existing library interface that expects the caller to manage the lifetime of some const char * values.
struct Settings {
const char * log_file;
int log_level;
};
// The Settings struct and all members only need to be valid for the duration of this call.
int Initialize(const struct Settings* settings);
int DoStuff();
int Deinitialize();
I started off using the most basic input to SWIG to wrap the library:
%module lib
%{
#include "lib.h"
%}
%include "lib.h"
This leads to SWIG warning about a potential memory leak:
lib.h(2) : Warning 451: Setting a const char * variable may leak memory.
Which is entirely understandable as looking at lib_wrap.c, SWIG has generated code that will malloc a buffer into the log_file value but never frees it:
SWIGINTERN PyObject *_wrap_Settings_log_file_set(PyObject *SWIGUNUSEDPARM(self), PyObject *args) {
PyObject *resultobj = 0;
struct Settings *arg1 = (struct Settings *) 0 ;
char *arg2 = (char *) 0 ;
void *argp1 = 0 ;
int res1 = 0 ;
int res2 ;
char *buf2 = 0 ;
int alloc2 = 0 ;
PyObject *swig_obj[2] ;
if (!SWIG_Python_UnpackTuple(args, "Settings_log_file_set", 2, 2, swig_obj)) SWIG_fail;
res1 = SWIG_ConvertPtr(swig_obj[0], &argp1,SWIGTYPE_p_Settings, 0 | 0 );
if (!SWIG_IsOK(res1)) {
SWIG_exception_fail(SWIG_ArgError(res1), "in method '" "Settings_log_file_set" "', argument " "1"" of type '" "struct Settings *""'");
}
arg1 = (struct Settings *)(argp1);
res2 = SWIG_AsCharPtrAndSize(swig_obj[1], &buf2, NULL, &alloc2);
if (!SWIG_IsOK(res2)) {
SWIG_exception_fail(SWIG_ArgError(res2), "in method '" "Settings_log_file_set" "', argument " "2"" of type '" "char const *""'");
}
arg2 = (char *)(buf2);
if (arg2) {
size_t size = strlen((const char *)((const char *)(arg2))) + 1;
arg1->log_file = (char const *)(char *)memcpy(malloc((size)*sizeof(char)), arg2, sizeof(char)*(size));
} else {
arg1->log_file = 0;
}
resultobj = SWIG_Py_Void();
if (alloc2 == SWIG_NEWOBJ) free((char*)buf2);
return resultobj;
fail:
if (alloc2 == SWIG_NEWOBJ) free((char*)buf2);
return NULL;
}
If I change the type of log_file to char * then the warning goes away and it appears that multiple attempts to set the value of log_file will no longer leak memory:
SWIGINTERN PyObject *_wrap_Settings_log_file_set(PyObject *SWIGUNUSEDPARM(self), PyObject *args) {
PyObject *resultobj = 0;
struct Settings *arg1 = (struct Settings *) 0 ;
char *arg2 = (char *) 0 ;
void *argp1 = 0 ;
int res1 = 0 ;
int res2 ;
char *buf2 = 0 ;
int alloc2 = 0 ;
PyObject *swig_obj[2] ;
if (!SWIG_Python_UnpackTuple(args, "Settings_log_file_set", 2, 2, swig_obj)) SWIG_fail;
res1 = SWIG_ConvertPtr(swig_obj[0], &argp1,SWIGTYPE_p_Settings, 0 | 0 );
if (!SWIG_IsOK(res1)) {
SWIG_exception_fail(SWIG_ArgError(res1), "in method '" "Settings_log_file_set" "', argument " "1"" of type '" "struct Settings *""'");
}
arg1 = (struct Settings *)(argp1);
res2 = SWIG_AsCharPtrAndSize(swig_obj[1], &buf2, NULL, &alloc2);
if (!SWIG_IsOK(res2)) {
SWIG_exception_fail(SWIG_ArgError(res2), "in method '" "Settings_log_file_set" "', argument " "2"" of type '" "char *""'");
}
arg2 = (char *)(buf2);
if (arg1->log_file) free((char*)arg1->log_file);
if (arg2) {
size_t size = strlen((const char *)(arg2)) + 1;
arg1->log_file = (char *)(char *)memcpy(malloc((size)*sizeof(char)), (const char *)(arg2), sizeof(char)*(size));
} else {
arg1->log_file = 0;
}
resultobj = SWIG_Py_Void();
if (alloc2 == SWIG_NEWOBJ) free((char*)buf2);
return resultobj;
fail:
if (alloc2 == SWIG_NEWOBJ) free((char*)buf2);
return NULL;
}
However it still appears that the memory allocated for log_file will be leaked when the Settings object is garbage collected in Python.
What is the recommended way of managing lifetimes of char * struct values in SWIG in a way which avoids these memory leaks?
Strings are a bit awkward to do right here. There are several ways to side-step the issue you're seeing. Simplest is to use a fixed size array in the struct, but it's 2019. Personally I'd wholeheartedly recommend using idiomatic C++ instead (it's 2019!), which would mean std::string and then the whole issue evaporates.
Failing that you're stuck in a case where to make the interface Pythonic you'll have to do some extra work. We can keep the total amount of work low and the nice thing about SWIG is that we can pick and choose where we target the extra effort we make, there's no "all or nothing". The main problem here is that we want to tie the lifespan of the buffer the log_file path is stored in to the lifespan of the Python Settings object itself. We can achieve that in multiple different ways depending on your preference for writing Python code, C or Python C API calls.
What we can't really solve is the case where you're given a borrowed pointer to a Settings struct by some other code (i.e. it's not owned/managed by Python) and you want to change the log_file string in that borrowed object. The API you've got doesn't really give us a way to do that, but it seems like this isn't a case that really matters in your current module.
So without further ado below are a few options for tying the lifespan of a buffer that holds your string to a Python object that points to the buffer.
Option #1: Make Settings wholly or partially immutable, use a single malloc call to hold both the struct itself and the string it refers to. For this use case that's probably my preferred option.
We can do that fairly simply by giving the Settings type a constructor in Python which handles this and it doesn't force you to use C++:
%module lib
%{
#include "lib.h"
%}
// Don't let anybody change this other than the ctor
%immutable Settings::log_file;
%include "lib.h"
%extend Settings {
Settings(const char *log_file) {
assert(log_file); // TODO: handle this properly
// Single allocation for both things means the single free() is sufficient and correct
struct Settings *result = malloc(strlen(log_file) + 1 + sizeof *result);
char *buf = (void*)&result[1];
strcpy(buf, log_file);
result->log_file = buf;
return result;
}
}
If you wanted to make the path mutable you could write a little extra Python code that wraps this up and acts as a proxy which creates a new immutable object every time you "mutate" it on the Python side. You could also go the other way and make the other members of Settings immutable. (Thinking about it some more, it'd be neat if SWIG could optionally auto-synthesize a kwargs constructor for aggregate/POD types, and it wouldn't be too hard to add that as a patch.)
This is my personal preference here, I like immutable things and overall it's a fairly small tweak to the generated interface to get something sane.
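To make the single-allocation idea concrete outside of SWIG, here is a minimal sketch of the same layout. The names settings_create and settings_destroy are hypothetical, and unlike the constructor above this version also takes log_level and returns nullptr on bad input, which is an assumption on my part:

```cpp
#include <cstddef>
#include <cstdlib>
#include <cstring>

struct Settings {
    const char *log_file;
    int log_level;
};

// Hypothetical helper: one malloc holds the struct and, right after it, the string.
Settings *settings_create(const char *log_file, int log_level) {
    if (log_file == nullptr) return nullptr;
    const std::size_t len = std::strlen(log_file) + 1;  // include the terminator
    void *raw = std::malloc(sizeof(Settings) + len);
    if (raw == nullptr) return nullptr;
    Settings *result = static_cast<Settings *>(raw);
    char *buf = reinterpret_cast<char *>(result + 1);   // first byte past the struct
    std::memcpy(buf, log_file, len);
    result->log_file = buf;
    result->log_level = log_level;
    return result;
}

// A single free() releases both the struct and the string it points to.
void settings_destroy(Settings *s) { std::free(s); }
```

Because the string lives inside the same block as the struct, whatever frees the struct frees the string too, which is why the destructor SWIG generates by default is enough in option #1.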
Option #2a: Make another Python object that manages the lifespan of the string buffer and then "stash" a reference to that inside the Python side of every Settings struct that's owned by Python.
%module lib
%{
#include "lib.h"
%}
%typemap(in) const char *log_file %{
// Only works for Python owned objects:
assert(SWIG_Python_GetSwigThis($self)->own & SWIG_POINTER_OWN); // TODO: exception...
// Python 2.7 specific, 3 gets more complicated, use bytes buffers instead.
$1 = PyString_AsString($input);
assert($1); // TODO: errors etc.
// Force a reference to the original input string to stick around to keep the pointer valid
PyObject_SetAttrString($self, "_retained_string", $input);
%}
%typemap(memberin) const char *log_file %{
// Because we trust the in typemap has retained the pointer for us this is sufficient now:
$1 = $input;
%}
%include "lib.h"
These typemaps work together to keep a reference to the PyObject string stashed inside the Settings PyObject as an attribute. It only works safely here because a) we assume Python owns the object, and we're not using -builtin in SWIG, so we can safely stash things in attributes to keep them around and b) because it's const char *, not char *, we can be pretty sure (unless there's some K&R silliness going on) that nobody will be changing the buffer.
Option #2b: The general idea is the same, but instead of using typemaps, which means writing Python C API calls, use something like this:
%extend Settings {
%pythoncode {
@property
# ....
}
}
This does the same thing. Similar code could also be produced using %pythonprepend instead if preferred. However, this is my least preferred solution here, so I've not fully fleshed it out.
You can tell SWIG to use char* semantics for log_file. Unfortunately, it doesn't seem possible to use Settings::log_file (the required memberin does not show up in the pattern matching), so there could be clashes if that data member name is used in other structs as well with the same type but different semantics. This would look like:
%module lib
%{
#include "lib.h"
%}
%typemap(out) char const *log_file = char *;
%typemap(memberin) char const *log_file = char *;
%extend Settings {
Settings() {
Settings* self = new Settings{};
self->log_file = nullptr;
self->log_level = 0;
return self;
}
~Settings() {
delete[] self->log_file; self->log_file = nullptr;
delete self;
}
}
%include "lib.h"
(Note that SWIG in my case produces delete[], not free().)
EDIT: added a custom destructor to delete the log_file memory on garbage collection. (And for good measure also a constructor to make sure that an uninitialized log_file is nullptr, not some random memory.) What this does, is add an internal function delete_Settings to the wrapper file, which gets called in _wrap_delete_Settings, which is called on object destruction. Yes, syntax is a bit odd, b/c you're effectively describing Python's __del__ (taking a self), only labeled as a C++ destructor.

How to change the default code generated by SWIG for the allocation of memory for a C structure?

I am using a flexible array in the structure. So I want to change the memory allocated for that structure with some of my own code. Basically I want to change the new_structname() and structname_variable_set() functions.
typedef struct vector{
int x;
char y;
int arr[0];
} vector;
Here SWIG generates a new_vector() function that allocates memory by calling calloc(1, sizeof(struct vector)); SWIG does not treat this kind of structure specially. So we need to modify the SWIG-generated new_vector() in order to allocate memory for the flexible array. Is there any way to do this?
There are a few ways you can do this. What you're looking for though is %extend. That lets us define new constructors and implement them as we see fit. (It even works with a C compiler, they're only constructors from the perspective of the target language).
Using your vector as a starting point we can illustrate this:
%module test
%include <stdint.i>
%inline %{
typedef struct vector{ int x; char y; int arr[0]; }vector;
%}
%extend vector {
vector(const size_t len) {
vector *v = calloc(1, sizeof *v + len * sizeof *v->arr); // assumes len counts int elements, not bytes
v->x = len;
return v;
}
}
With this SWIG synthesises a new_vector function in the generated module code as you'd hoped.
I also assumed that you want to record the length inside the struct as one of its members. If that's not the case you can simply delete the assignment I made.
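For readers who want to try the allocation pattern without SWIG, here is a minimal sketch. Flexible and zero-length arrays are not standard C++, so this version keeps the elements directly after a plain header struct instead. vec_header, vec_data, vec_new and vec_delete are hypothetical names, and the sketch assumes the length is an element count:

```cpp
#include <cstddef>
#include <cstdlib>

// Header part of the struct; the int elements live directly after it.
struct vec_header {
    int x;   // element count, recorded as in the answer above
    char y;
};

// Pointer to the first trailing element.
int *vec_data(vec_header *v) { return reinterpret_cast<int *>(v + 1); }

// One zero-initialised allocation sized for the header plus count ints.
vec_header *vec_new(std::size_t count) {
    void *raw = std::calloc(1, sizeof(vec_header) + count * sizeof(int));
    if (raw == nullptr) return nullptr;
    vec_header *v = static_cast<vec_header *>(raw);
    v->x = static_cast<int>(count);
    return v;
}

void vec_delete(vec_header *v) { std::free(v); }
```

The size passed to calloc is the part SWIG's default constructor cannot know about, which is why the %extend constructor is needed.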

How can I set functions with different signatures to the same function pointer?

How can I set a function pointer depending on some condition to functions with different signature?
Example:
short int A()
{
return 0;
}
long int B()
{
return 0;
}
void main()
{
std::function<short int()> f = A;
f();
if(true)
{
//error
f = B;
}
}
How can I use the same function pointer for two functions with different signatures?
Is it possible?
If it is not, is there an efficient way to call the appropriate function depending on behavior, instead of using a variable and splitting the whole code with if statements?
EDIT / EXPANSION ("2nd case")
#include <SDL.h>
#include <functional>
class Obj { /* whatever ... */ };
class A
{
private:
    Uint16 ret16() { return SDL_ReadLE16(_pFile); }
    Uint32 ret32() { return SDL_ReadLE32(_pFile); }
    SDL_RWops *_pFile = nullptr;
public:
    Obj* func()
    {
        Obj *obj = new Obj();
        _pFile = SDL_RWFromFile("filename.bin","r");
        auto ret = std::mem_fn(&A::ret16);
        if(true)
        {
            // error: the two mem_fn results have different types
            ret = std::mem_fn(&A::ret32);
        }
        //ret(this);
        // continue whatever
        // ....
        SDL_RWclose(_pFile);
        return obj;
    }
};
I get a compilation error in a similar case using the Uint16 and Uint32 types of the SDL 2 library with std::mem_fn.
The compiler gives me this error (it refers to my real code, which is implemented like the example above):
error: no match for ‘operator=’ (operand types are ‘std::_Mem_fn<short unsigned int (IO::File::*)()>’ and ‘std::_Mem_fn<unsigned int (IO::File::*)()>’)
To resolve this compilation error, I forced both functions to return an int type.
Is there a better way?
Or did I do something wrong?
The comments already say that clang accepts the code as is, and I can now say that GCC 4.8.4 and GCC 4.9.2 both accept it as well, after fixing void main() to say int main().
This use of std::function is perfectly valid. The C++11 standard says:
20.8.11.2 Class template function [func.wrap.func]
function& operator=(const function&);
function& operator=(function&&);
function& operator=(nullptr_t);
The synopsis also declares template<class F> function& operator=(F&&), whose specified effect is function(std::forward<F>(f)).swap(*this), so assigning B still constructs a temporary function<short int()> object and swaps it in. To determine whether the construction of that temporary is possible:
20.8.11.2.1 function construct/copy/destroy [func.wrap.func.con]
template<class F> function(F f);
template <class F, class A> function(allocator_arg_t, const A& a, F f);
7 Requires: F shall be CopyConstructible. f shall be Callable (20.8.11.2) for argument types ArgTypes and return type R. The copy constructor and destructor of A shall not throw exceptions.
20.8.11.2 Class template function [func.wrap.func]
2 A callable object f of type F is Callable for argument types ArgTypes and return type R if the expression INVOKE(f, declval<ArgTypes>()..., R), considered as an unevaluated operand (Clause 5), is well formed (20.8.2).
20.8.2 Requirements [func.require]
2 Define INVOKE(f, t1, t2, ..., tN, R) as INVOKE(f, t1, t2, ..., tN) implicitly converted to R.
1 Define INVOKE(f, t1, t2, ..., tN) as follows:
... (all related to pointer-to-member types)
f(t1, t2, ..., tN) in all other cases.
In short, this means that std::function<short int()> can be used with any function that can be called with no arguments, and which has a return type that can be implicitly converted to short. long clearly can be implicitly converted to short, so there is no problem whatsoever.
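As a quick check of that conclusion, here is a minimal sketch. A and B return 1 and 2 here rather than 0 (my change, so the two calls can be told apart), and call_selected is a hypothetical helper:

```cpp
#include <functional>

// Return values differ from the question's (both 0) so the result shows which one ran.
short int A() { return 1; }
long int B() { return 2; }

// Hypothetical helper: picks A or B at run time through one std::function object.
short int call_selected(bool use_b) {
    std::function<short int()> f = A;
    if (use_b) {
        f = B;  // accepted: B takes no arguments and long converts implicitly to short
    }
    return f();
}
```

The wrapper performs the long-to-short conversion on every call, so the caller only ever sees a short.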
If your compiler's library doesn't accept it, and you cannot upgrade to a more recent version, one alternative is to try boost::function instead.
Aaron McDaid points out lambdas as another alternative: if your library's std::function is lacking, you can write
std::function<short int()> f = A;
f = []() -> short int { return B(); };
but if you take this route, you can take it a step further and avoid std::function altogether:
short int (*f)() = A;
f = []() -> short int { return B(); };
This works because lambdas that don't capture anything are implicitly convertible to a pointer-to-function type that matches the lambda's arguments and return type. Effectively, it's short for writing
short int B_wrapper() { return B(); }
...
f = B_wrapper;
Note: the conversion from long to short may lose data. If you want to avoid that, you can use std::function<long int()> or long int (*)() instead.
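The same reasoning seems to carry over to the std::mem_fn case in the question: a std::function declared with the wider return type can hold either member function without losing data. A minimal sketch, where Reader is a hypothetical stand-in for the question's class (no SDL involved) and the returned constants are made up:

```cpp
#include <cstdint>
#include <functional>

// Hypothetical stand-in for the class in the question.
struct Reader {
    std::uint16_t ret16() { return 0xBEEF; }
    std::uint32_t ret32() { return 0xDEADBEEF; }
};

// Hypothetical helper: one wrapper type holds either member function.
std::uint32_t read_value(Reader &r, bool wide) {
    // Return type is the wider one, so the 16-bit result is only ever widened.
    std::function<std::uint32_t(Reader &)> ret = std::mem_fn(&Reader::ret16);
    if (wide) {
        ret = std::mem_fn(&Reader::ret32);
    }
    return ret(r);
}
```

This avoids changing the member functions themselves to return a common type.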
No, you can't do that in a statically typed language unless your types all have a common super type, and C++ doesn't have that for primitives. You would need to box them into an object, then have the function return the object.
However, if you did that, you may as well just keep an object pointer around and use that instead of a function pointer, especially since it's going to make it easier to actually do something useful with the result without doing casts all over the place.
For example, in a calculator I wrote in Java, I wanted to work with BigInteger fractions as much as possible to preserve precision, but fallback to doubles for operations that returned irrational numbers. I created a Result interface, with BigFractionResult and DoubleResult implementations. The UI code would call things like Result sum = firstOperand.add(otherOperand) and didn't have to care which implementation of add it was using.
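Translated to C++, the Result idea might look roughly like this. It is only a sketch: ShortResult, LongResult, describe and compute are hypothetical names, not part of the calculator described above:

```cpp
#include <memory>
#include <string>

// Common supertype: callers only ever see this interface.
struct Result {
    virtual ~Result() = default;
    virtual std::string describe() const = 0;
};

struct ShortResult : Result {
    short value;
    explicit ShortResult(short v) : value(v) {}
    std::string describe() const override {
        return std::to_string(value) + " (short)";
    }
};

struct LongResult : Result {
    long value;
    explicit LongResult(long v) : value(v) {}
    std::string describe() const override {
        return std::to_string(value) + " (long)";
    }
};

// Hypothetical producer: the concrete type is chosen at run time.
std::unique_ptr<Result> compute(bool wide) {
    if (wide) return std::unique_ptr<Result>(new LongResult(100000L));
    return std::unique_ptr<Result>(new ShortResult(7));
}
```

The caller works with Result and never needs a cast to find out which width was used.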
The cleanest option that comes to mind is templates:
#include <iostream>
using namespace std;
template <typename T>
T foo() {
return 0;
}
int main() {
long a = foo<long>();
cout << sizeof a << " bytes with value " << a << endl;
int b = foo<int>();
cout << sizeof b << " bytes with value " << b << endl;
short c = foo<short>();
cout << sizeof c << " bytes with value " << c << endl;
return 0;
}
In ideone.com this outputs:
4 bytes with value 0
4 bytes with value 0
2 bytes with value 0
Hopefully this is what you needed.
If for some reason you really need to pass an actual function around, I would recommend looking into std::function and trying to write some template code using that.

SWIG: objects of a custom class as output argument (with Python)

(This is a question I asked yesterday, but I simplified it)
I've created a class, of which I want two objects as output arguments of a function (called Test below). But when I run the swig command swig -c++ -python swigtest.i I'm getting the error "Warning 453: Can't apply (MyClass &OUTPUT). No typemaps are defined." I tried adding typemaps, but that doesn't help. I also tried using pointers, pointers to pointers and references to pointers, that doesn't help either.
I feel like I've overlooked something simple, because this should be quite a common thing to do. Or do I need to write a complex typemap, like I've seen around but don't understand (yet)?
Below is my code:
MyClass.h (simplified to make it understandable, so switching to just int doesn't help):
class MyClass
{
int x;
public:
int get() const
{
return x;
}
};
void Test(MyClass &obj1, MyClass &obj2);
swigtest.i:
%module swigtest
%include typemaps.i
%{
#define SWIG_FILE_WITH_INIT
%}
%{
#include "MyClass.h"
%}
%include "MyClass.h"
%apply (MyClass& OUTPUT) { MyClass &obj1 }
%apply (MyClass& OUTPUT) { MyClass &obj2 }
As noted in my previous comment, the %apply OUTPUT trick only works for a limited set of POD types.
For future Swiggers, this solution worked for me (in C# bindings):
%typemap(cstype) CustomType* "out CustomType"
%typemap(csin,
pre=" $csclassname temp$csinput = new $csclassname();",
post=" $csinput = temp$csinput;"
) CustomType* "$csclassname.getCPtr(temp$csinput)"
This generates a public interface with an "out" param for CustomType passed by pointer. The internal P/Invoke interface (csim) is left as raw pointers. The "csin" typemap creates a temp variable and assigns to the output parameter.
Also worth noting that in C#, if MyCustomType is already a reference type, you may not need this, however it's strange to have an API that modifies the parameter value without declaring it as "out" (this actually works for my type, but I prefer the explicit out param).
Try:
%module swigtest
%{
#define SWIG_FILE_WITH_INIT
#include "MyClass.h"
%}
%include "typemaps.i"
%apply MyClass *OUTPUT { MyClass &obj1, MyClass &obj2 };
%include "MyClass.h"
You could also create a wrapper that returns a std::list:
%include "std_list.i"
%ignore Test;
%rename(Test) TestWrap;
%inline %{
std::list<MyClass> TestWrap() {
MyClass obj1, obj2;
Test(obj1, obj2);
std::list<MyClass> tempList;
tempList.push_back(obj1);
tempList.push_back(obj2);
return tempList;
}
%}
I settled for adding extra Python code in swigtest.i that wraps the Test function, so that I can write obj1, obj2 = Test2(). I still think there must be an easier solution.
// swigtest.i:
%module swigtest
%{
#define SWIG_FILE_WITH_INIT
#include "MyClass.h"
%}
%include "MyClass.h"
%insert("python") %{
# This code ends up inside the generated swigtest.py, so the names are used unqualified.
def Test2():
    obj1 = MyClass()
    obj2 = MyClass()
    Test(obj1, obj2)
    return obj1, obj2
%}

Use SWIG to apply multiple Java data types for same C data type

I have two C functions that I'm exposing through SWIG to my Java layer. Both have an input parameter of type const void * ("val") that needs to be a uint8_t for the addCategory function but a char for the addAttribute function. In the SWIG interface file I'm currently using %apply to map the const void * C type to a short on the Java side. Is there a way to modify the SWIG interface file to support both a char (String) and a uint8_t (short) for the const void * input parameter?
C Functions from header file:
int
addCategory(query_t *query, type_t type, const void *val);
int
addAttribute(query_t *query, type_t type, const void *val);
SWIG Interface File:
%module Example
%include "stdint.i"
void setPhy_idx(uint32_t value);
%include "arrays_java.i"
void setId(unsigned char *value);
%{
#include "Example.h"
%}
%apply char * { unsigned char * };
%apply char * { void * };
%apply uint8_t { const void * }
%apply int32_t { int32_t * }
%include "Example.h"
You can't directly do this - what type would be used in this place in Java? You need to help SWIG decide that in some way.
You have (at least) three possible solutions:
Use a type hierarchy - The base type will be what the function takes, the subclasses will get wrapped also. You could do this on the C++ side, or on the Java side using SWIG's typemap facilities. I think this is needlessly complicated though, so I've not made an example here.
Use overloads (or even different functions, with different names altogether - you could use %rename to make them back into overloads in Java even if they have different names in C)
Use a union. This will get wrapped with set and get functions by SWIG:
%module test
union values {
unsigned char *string;
void *generic;
uint8_t someOtherThing;
uint32_t number;
};
void func(values v);
This results in a Java class called values, which func() takes and can pass one of the members of the union through. Clearly you'd want to %apply appropriate typemaps for the members of the union.
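The second option (separate, differently named functions) could be sketched on the C++ side as thin typed wrappers around the const void * functions. Everything below other than the two original signatures is an assumption: query_t and type_t are hypothetical stand-ins, the stub bodies exist only to make the sketch self-contained, and addCategoryTyped and addAttributeTyped are made-up names:

```cpp
#include <cstdint>
#include <string>

// Hypothetical stand-ins for the types declared in Example.h.
struct query_t {
    std::uint8_t category = 0;
    std::string attribute;
};
enum type_t { TYPE_CATEGORY, TYPE_ATTRIBUTE };

// Stub versions of the C functions; the real ones live in the library.
int addCategory(query_t *query, type_t, const void *val) {
    query->category = *static_cast<const std::uint8_t *>(val);
    return 0;
}
int addAttribute(query_t *query, type_t, const void *val) {
    query->attribute = static_cast<const char *>(val);
    return 0;
}

// Typed wrappers: these are what would be exposed to SWIG instead of const void *.
int addCategoryTyped(query_t *query, type_t type, std::uint8_t val) {
    return addCategory(query, type, &val);
}
int addAttributeTyped(query_t *query, type_t type, const char *val) {
    return addAttribute(query, type, val);
}
```

With signatures like these SWIG can pick an unambiguous Java type for each parameter, and %rename could give both wrappers the same Java name if overloads are preferred.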