Julia Constructors: Member variables not visible / out of scope in constructor

In C++ I can do the following:
class foo {
private:
    int N;
public:
    foo(const int pN) {
        N = pN;
        std::cout << N << std::endl;
    }
};
or, with the concept of outer constructors in Julia in mind,
class foo {
private:
    int N;
public:
    foo(const int pN);
};

foo::foo(const int pN) {
    N = pN;
    std::cout << N << std::endl;
}
Can you do the same in Julia, i.e., set some member variables and then do something with them? Consider the MWE below:
struct foo
    N::Int
    function foo(pN::Int)
        new(pN)
        println("Hello World") # Gets printed
        println(N) # ERROR: LoadError: UndefVarError: N not defined
    end
end
Why is that and how do I deal with this?
Even stranger is the behaviour for outer constructors:
struct foo
    N::Int
end

function foo(pN::Int)
    println("Hello World") # Not shown
    foo(pN)
    println("Hello World") # Not shown
    println(N) # No error
end
although I get a warning that the outer constructor overwrites the default one, so I expected to see at least something: either a print message or an error.

There is no member scope here: Julia does not have object-oriented programming facilities, so N is not a variable inside the constructor. Store the result of new as self and access the value of N as a field of self.
julia> struct foo
           N::Int
           function foo(pN::Int)
               self = new(pN)
               println("Hello World")
               println(self.N)
               self
           end
       end
julia> foo(5)
Hello World
foo(5)
Your second example results in a stack overflow: the outer constructor replaces the default constructor foo(pN::Int), so the call foo(pN) inside the function invokes the function itself, recursing until the stack is exhausted.

Related

Support for std::tuple in swig?

When calling a SWIG-generated function returning std::tuple, I get a SWIG object wrapping that std::tuple.
Is there a way to use typemaps or something else to extract the values? I have tried changing the code to std::vector for a small portion of the code, and that works (using %include <std_vector.i> and templates), but I don't want to make too many changes in the C++ part.
Edit: here is a minimal reproducible example:
foo.h
#pragma once
#include <tuple>

class foo
{
private:
    double secret1;
    double secret2;
public:
    foo();
    ~foo();
    std::tuple<double, double> return_thing(void);
};
foo.cpp
#include "foo.h"
#include <tuple>

foo::foo()
{
    secret1 = 1;
    secret2 = 2;
}

foo::~foo()
{
}

std::tuple<double, double> foo::return_thing(void) {
    return {secret1, secret2};
}
foo.i
%module foo
%{
#include "foo.h"
%}
%include "foo.h"
When compiled on my Linux machine using
-:$ swig -python -c++ -o foo_wrap.cpp foo.i
-:$ g++ -c foo.cpp foo_wrap.cpp '-I/usr/include/python3.8' '-fPIC' '-std=c++17' '-I/home/simon/Desktop/test_stack_overflow_tuple'
-:$ g++ -shared foo.o foo_wrap.o -o _foo.so
I can import it in python as shown:
test_module.ipynb
import foo as f
Foo = f.foo()
return_object = Foo.return_thing()
type(return_object)
print(return_object)
The output is
SwigPyObject
<Swig Object of type 'std::tuple< double,double > *' at 0x7fb5845d8420>
Hopefully this is more helpful; thank you for responding.
To clarify, I want to be able to use the values in Python, something like this:
main.cpp
#include "foo.h"
#include <iostream>

using namespace std;

int main()
{
    foo Foo = foo();
    auto [s1, s2] = Foo.return_thing();
    cout << s1 << " " << s2 << endl;
}
Github repo if anybody is interested
https://github.com/simon-cmyk/test_stack_overflow_tuple
Our goal is to make something like the following SWIG interface work intuitively:
%module test
%include "std_tuple.i"
%std_tuple(TupleDD, double, double);
%inline %{
std::tuple<double, double> func() {
    return std::make_tuple(0.0, 1.0);
}
%}
We want to use this within Python in the following way:
import test

r = test.func()
print(r)
print(dir(r))
r[1] = 1234
for x in r:
    print(x)
i.e. indexing and iteration should just work.
By re-using some of the pre-processor tricks I used to wrap std::function (which were themselves originally from another answer here on SO), we can define a neat macro that "just wraps" std::tuple for us. Although this answer is Python-specific, it should in practice be fairly simple to adapt for most other languages too. I'll post my std_tuple.i file first and then annotate/explain it after:
// [1]
%{
#include <tuple>
#include <utility>
%}

// [2]
#define make_getter(pos, type) const type& get##pos() const { return std::get<pos>(*$self); }
#define make_setter(pos, type) void set##pos(const type& val) { std::get<pos>(*$self) = val; }
#define make_ctorargN(pos, type) , type v##pos
#define make_ctorarg(first, ...) const first& v0 FOR_EACH(make_ctorargN, __VA_ARGS__)

// [3]
#define FE_0(...)
#define FE_1(action,a1) action(0,a1)
#define FE_2(action,a1,a2) action(0,a1) action(1,a2)
#define FE_3(action,a1,a2,a3) action(0,a1) action(1,a2) action(2,a3)
#define FE_4(action,a1,a2,a3,a4) action(0,a1) action(1,a2) action(2,a3) action(3,a4)
#define FE_5(action,a1,a2,a3,a4,a5) action(0,a1) action(1,a2) action(2,a3) action(3,a4) action(4,a5)
#define GET_MACRO(_1,_2,_3,_4,_5,NAME,...) NAME
%define FOR_EACH(action,...)
  GET_MACRO(__VA_ARGS__, FE_5, FE_4, FE_3, FE_2, FE_1, FE_0)(action,__VA_ARGS__)
%enddef

// [4]
%define %std_tuple(Name, ...)
%rename(Name) std::tuple<__VA_ARGS__>;
namespace std {
    struct tuple<__VA_ARGS__> {
        // [5]
        tuple(make_ctorarg(__VA_ARGS__));
        %extend {
            // [6]
            FOR_EACH(make_getter, __VA_ARGS__)
            FOR_EACH(make_setter, __VA_ARGS__)
            size_t __len__() const { return std::tuple_size<std::decay_t<decltype(*$self)>>{}; }
            %pythoncode %{
                # [7]
                def __getitem__(self, n):
                    if n >= len(self): raise IndexError()
                    return getattr(self, 'get%d' % n)()
                def __setitem__(self, n, val):
                    if n >= len(self): raise IndexError()
                    getattr(self, 'set%d' % n)(val)
            %}
        }
    };
}
%enddef
1. This is just the extra includes we need for our macro to work.
2. These apply to each of the type arguments we supply to our %std_tuple macro invocation; we need to be careful with commas here to keep the syntax correct.
3. This is the mechanics of our FOR_EACH macro, which invokes each action per argument in our variadic macro argument list.
4. Finally the definition of %std_tuple can begin. Essentially this is manually doing the work of %template for each specialisation of std::tuple we care to name inside of the std namespace.
5. We use our FOR_EACH macro magic to declare a constructor with an argument of the correct type for each element. The actual implementation here is the default one from the C++ library, which is exactly what we need/want.
6. We use our FOR_EACH macro twice to make member functions get0, get1, ..., getN of the correct type for each tuple element, and the correct number of them for the template argument size. Likewise for setN. Doing it this way allows the usual SWIG typemaps for double, etc. (or whatever types your tuple contains) to be applied automatically and correctly for each call to std::get<N>. These are really just an implementation detail, not intended to be part of the public interface, but exposing them makes no real odds.
7. Finally we need an implementation of __getitem__ and a corresponding __setitem__. These simply look up the right getN/setN function on the class and call that instead. We take care to raise IndexError instead of the default exception if an invalid index is used, as this is what stops iteration correctly when we try to iterate over the tuple.
This is then sufficient that we can run our target code and get the following output:
$ swig3.0 -python -c++ -Wall test.i && g++ -shared -o _test.so test_wrap.cxx -I/usr/include/python3.7 -m32 && python3.7 run.py
<test.TupleDD; proxy of <Swig Object of type 'std::tuple< double,double > *' at 0xf766a260> >
['__class__', '__del__', '__delattr__', '__dict__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattr__', '__getattribute__', '__getitem__', '__gt__', '__hash__', '__init__', '__init_subclass__', '__le__', '__len__', '__lt__', '__module__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__setitem__', '__sizeof__', '__str__', '__subclasshook__', '__swig_destroy__', '__swig_getmethods__', '__swig_setmethods__', '__weakref__', 'get0', 'get1', 'set0', 'set1', 'this']
0.0
1234.0
Generally this should work as you'd hope in most input/output situations in Python.
There are a few improvements we could look to make:
Implement __repr__
Implement slicing so that tuple[n:m] type indexing works
Handle unpacking like Python tuples.
Maybe do some more automatic conversions for compatible types?
Avoid calling __len__ for every get/setitem call, either by caching the value in the class itself, or postponing it until the method lookup fails?

How to write constructor for a smart pointer

I am having some trouble writing the constructor for my UniquePtr class.
Here is the UniquePtr constructor
UniquePtr::UniquePtr(Foo *ptr) {
    _ptr = ptr;
    Foo(myStr);
}
Here is the Foo constructor
Foo::Foo( const string& tag ) : _serial{++_count}, _tag{tag} {
    // If tag is empty, set it to the serial number
    cout << "* c-tor - Foo S#: " << _serial
         << ( _tag.length( ) > 0 ? " Tag: " : "" ) << _tag << "\n";
}
And this is the instruction for what I need to do: "Modify your constructor so that it will actually perform the memory allocation step during construction. Since the Foo class’s constructor takes an optional std::string argument, your new UniquePtr constructor should do the same. This parameter should default to the empty string. Pass the value from the UniquePtr constructor directly into the Foo constructor when you allocate the new object."
What should I do with my constructor now?

How to reuse functors with member data over many kernel executions in CUDA to improve memory usage and decrease copy time?

I am translating a C++11 program, which calculates contact forces between particle pairs, into a CUDA program. All the particle pairs are independent of each other. I use a functor to calculate the contact force. This functor does many computations and contains a lot of member variables, so I am trying to reuse the functors instead of making one new functor per particle pair.
Because the functor contains virtual functions, the functor cloning is done on the device instead of on the host.
I am thinking of a scheme which goes like this:
1) Clone M functors
2) Start computing M particle pairs
3) Particle pair M+1 waits until one particle pair has completed and then reuses its functor
However, other ideas are also very welcome.
I've made a very simplified version of the program. In this play program, the F variable does not have to be a member variable, but in the real program it needs to be. There is also a lot more member data and particle pairs (N) in the real program. N is often a few million.
#include <stdio.h>

#define TPB 4 // realistic value = 128
#define N 10 // realistic value = 5000000
#define M 5 // trade-off between copy time and parallel gain.
            // Realistic value somewhere around 1000, maybe
#define OPTION 1
// option 1: Make one functor per particle pair => works, but creates too many functor clones
// option 2: Only make one functor clone => no more thread-independent member variables
// option 3: Make M clones which get reused => my suggestion, but I don't know how to program it

struct FtorBase
{
    __device__ virtual void execute(long i) = 0;
    __device__ virtual void show() = 0;
};

struct FtorA : public FtorBase
{
    __device__ void execute(long i) final
    {
        F = a * i;
    }
    __device__ void show() final
    {
        printf("F = %f\n", F);
    }
    double a;
    double F;
};

template <class T>
__global__ void cloneFtor(FtorBase** d_ftorBase, T ftor, long n_ftorClones)
{
    const long i = threadIdx.x + blockIdx.x * blockDim.x;
    if (i >= n_ftorClones) {
        return;
    }
    d_ftorBase[i] = new T(ftor);
}

struct ClassA
{
    typedef FtorA ftor_t;
    FtorBase** getFtor()
    {
        FtorBase** d_cmFtorBase;
        cudaMalloc(&d_cmFtorBase, N * sizeof(FtorBase*));
#if OPTION == 1
        // option 1: Create one copy of the functor per particle pair
        printf("using option 1\n");
        cloneFtor<<<(N + TPB - 1) / TPB, TPB>>>(d_cmFtorBase, ftor_, N);
#elif OPTION == 2
        // option 2: Create just one copy of the functor
        printf("using option 2\n");
        cloneFtor<<<1, 1>>>(d_cmFtorBase, ftor_, 1);
#elif OPTION == 3
        // option 3: Create M functor clones
        printf("using option 3\n");
        printf("This option is not implemented. I don't know how to do this.\n");
        cloneFtor<<<(M + TPB - 1) / TPB, TPB>>>(d_cmFtorBase, ftor_, M);
#endif
        cudaDeviceSynchronize();
        return d_cmFtorBase;
    }
    ftor_t ftor_;
};

__global__ void cudaExecuteFtor(FtorBase** ftorBase)
{
    const long i = threadIdx.x + blockIdx.x * blockDim.x;
    if (i >= N) {
        return;
    }
#if OPTION == 1
    // option 1: One functor per particle was created
    ftorBase[i]->execute(i);
    ftorBase[i]->show();
#elif OPTION == 2
    // option 2: Only one single functor was created
    ftorBase[0]->execute(i);
    ftorBase[0]->show();
#elif OPTION == 3
    // option 3: Reuse the functors
    // I don't know how to do this
#endif
}

int main()
{
    ClassA* classA = new ClassA();
    classA->ftor_.a = .1;
    FtorBase** ftorBase = classA->getFtor();
    cudaExecuteFtor<<<(N + TPB - 1) / TPB, TPB>>>(ftorBase);
    cudaDeviceSynchronize();
    return 0;
}
I am checking the output of F to see whether the member variable is independent in each call. As expected, when using a different functor for each particle pair (option 1), all the F values are different and when using only one functor for the whole program (option 2), all the F values are the same.
using option 1
F = 0.800000
F = 0.900000
F = 0.000000
F = 0.100000
F = 0.200000
F = 0.300000
F = 0.400000
F = 0.500000
F = 0.600000
F = 0.700000
using option 2
F = 0.700000
F = 0.700000
F = 0.700000
F = 0.700000
F = 0.700000
F = 0.700000
F = 0.700000
F = 0.700000
F = 0.700000
F = 0.700000
I wonder if there is a way to get all different F values in this play example without taking N copies (option 3).
PS: I am using Ubuntu 18.04, nvcc 9.1 and an NVIDIA GeForce GTX 1060 Mobile graphics card (CUDA compute capability 6.1).
UPDATE:
In the code I originally presented, the problem only occurred in debug mode (compilation with the -G flag) but not in the release version. I'm guessing that the compiler optimised printf("F = %f\n", F); into printf("F = %f\n", a*i); so that the problem of thread-dependent member variables, which this question is about, disappeared.
I updated the code, so the compiler cannot do the substitution in the printf anymore.

"Storing unsafe C derivative of temporary Python reference" when trying to access struct pointer

I want to use a library that gives me a dynamic array. The dynamic array struct has a property void* _heap_ptr which gives the start of the array.
After having built the list, I want to access this pointer in cython (to make a copy of the array). But I cannot seem to get the pointer element from the struct.
Here is my pyx:
cimport src.clist as l

def main():
    cdef l.ptr_list basic_list
    cdef int i = 42

    basic_list = l.create_list_size(sizeof(i), 100)
    l.list_add_ptr(basic_list, &i)

    cdef int* arr
    arr = basic_list._heap_ptr

    for i in range(1):
        print(arr[i])
This is the error message:
Error compiling Cython file:
------------------------------------------------------------
...
l.list_add_ptr(basic_list, &i)
cdef int* arr;
arr = basic_list._heap_ptr
^
------------------------------------------------------------
src/test.pyx:14:20: Cannot convert Python object to 'int *'
Error compiling Cython file:
------------------------------------------------------------
...
l.list_add_ptr(basic_list, &i)
cdef int* arr;
arr = basic_list._heap_ptr
^
------------------------------------------------------------
src/test.pyx:14:20: Storing unsafe C derivative of temporary Python reference
And my pxd:
cdef extern from "src/list.h":
    ctypedef struct _list:
        void* _heap_ptr
    ctypedef struct ptr_list:
        pass
    ptr_list create_list_size(size_t size, int length)
    list_destroy(ptr_list this_list)
    void* list_at_ptr(ptr_list this_list, int index)
    list_add_ptr(ptr_list this_list, void* value)
How can I fix my code? Why is this happening? From my investigations, that error message pops up if you have forgotten to declare something as C (i.e. using malloc instead of libc.stdlib.malloc), but I cannot see that anything similar is happening here.
There are two issues in your code.
First: struct ptr_list has no members and thus no member _heap_ptr. It probably should have been
ctypedef struct ptr_list:
    void* _heap_ptr
Cython's error message is not really helpful here, but as you said it pops up usually when a C-declaration is forgotten.
Second: you need to cast from void * to int * explicitly:
arr = <int*>basic_list._heap_ptr

std::find with type T** vs T*[N]

I prefer to work with std::string, but I would like to figure out what is going wrong here.
I am unable to understand why std::find isn't working properly for type T**, even though pointer arithmetic works on it correctly, e.g.:
std::cout << *(argv+1) << "\t" <<*(argv+2) << std::endl;
But it works fine for the type T*[N].
#include <iostream>
#include <algorithm>

int main( int argc, const char ** argv )
{
    std::cout << *(argv+1) << "\t" << *(argv+2) << std::endl;
    const char ** cmdPtr = std::find(argv+1, argv+argc, "Hello");
    const char * testAr[] = { "Hello", "World" };
    const char ** testPtr = std::find(testAr, testAr+2, "Hello");
    if( cmdPtr == argv+argc )
        std::cout << "String not found" << std::endl;
    if( testPtr != testAr+2 )
        std::cout << "String found: " << *testPtr << std::endl;
    return 0;
}
Arguments passed: Hello World
Output:
Hello World
String not found
String found: Hello
Thanks.
Comparing values of type char const* compares the pointers, i.e. the addresses, not the characters pointed to. The address of the literal "Hello" is guaranteed to differ from any other string's address, unless it is compared against another instance of the same literal "Hello" (in which case the pointers may, but need not, compare equal).
In the first case, you're comparing the pointer values themselves and not what they're pointing to, and the literal "Hello" doesn't have the same address as any of the strings argv points to.
Try using:
const char ** cmdPtr = std::find(argv+1, argv+argc, std::string("Hello")) ;
std::string knows to compare contents and not addresses.
For the array version, the compiler can fold all literals into a single one, so every time "Hello" is seen throughout the code it's really the same pointer. Thus, comparing for equality in
const char * testAr[] = { "Hello", "World" };
const char ** testPtr = std::find(testAr, testAr+2, "Hello");
happens to yield the correct result here, although this folding of identical literals is permitted but not guaranteed by the standard, so even the array version should not rely on pointer comparison.