Why is NVCC unable to match a function definition to an existing declaration? - cuda

I have these files, which produce the following error when compiling with nvcc:
error C2244: 'TemplateClass<N>::print': unable to match function definition to an existing declaration
note: see declaration of 'TemplateClass<N>::print'
note: definition note: 'void TemplateClass<N>::print(const std::string [])'
note: existing declarations note: 'void TemplateClass<N>::print(const std::string [N])'
Template.h
#pragma once
#include <string>
#include <iostream>
template <unsigned int N>
class TemplateClass
{
private:
    std::string name;
public:
    TemplateClass();
    TemplateClass(const std::string& name);
    void print(const std::string familyName[N]);
};
#include "template.inl"
Template.inl
template <unsigned int N>
TemplateClass<N>::TemplateClass()
{
    name = "Unknown";
}

template <unsigned int N>
TemplateClass<N>::TemplateClass(const std::string& name)
{
    this->name = name;
}

template <unsigned int N>
void TemplateClass<N>::print(const std::string familyName[N])
{
    std::cout << "My name is " << name << " ";
    for (auto i = 0; i < N; i++)
        std::cout << familyName[i] << " ";
    std::cout << std::endl;
}
consume_template.cu
#include "template.h"
void consume_template_gpu()
{
    TemplateClass<3> obj("aname");
    std::string namesf[3];
    namesf[0] = "un";
    namesf[1] = "deux";
    namesf[2] = "trois";
    obj.print(namesf);
}
I am using VS2017 15.4.5; later versions failed to create the project with CMake.
The project was created with CMake like this:
cmake_minimum_required(VERSION 3.10)
project(template_inl_file LANGUAGES CXX CUDA)
set (lib_files template.h consume_template.cu)
add_library(template_inl_file_lib ${lib_files})

Just out of curiosity, try using std::string namesf[] = {"un", "deux", "trois"};. It seems like a compiler issue; trying a different form might help the compiler understand it better. Otherwise the code seems to be OK.
Maybe you're missing some linkage with CMake. Also try compiling straight from VS2017 without using CMake by creating a CUDA project.

What's happening is that the array is decaying into a pointer and the size of the array is lost during compilation. So this
template <unsigned int N>
void TemplateClass<N>::print(const std::string familyName[N]);
will actually be turned into this
template <unsigned int N>
void TemplateClass<N>::print(const std::string* familyName);
As we can see, there is no way for the compiler to know that it has to generate different functions depending on the size of the array (i.e. the template parameter N).
To solve this, we can use an old trick to avoid array decay:
template <unsigned int N>
void TemplateClass<N>::print(const std::string (&familyName)[N]);
Now the size N is preserved throughout compilation, and the compiler knows that different functions have to be generated for each N. My guess, as pointed out in the comments on the question, is that NVCC produces code that VS does not produce by itself, and VS then does not know how to handle it.
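For completeness, here is a small self-contained sketch of the fix, reusing the names from the question (trimmed to the relevant parts):

#include <iostream>
#include <string>

template <unsigned int N>
class TemplateClass
{
    std::string name;
public:
    TemplateClass(const std::string& n) : name(n) {}
    // Reference-to-array parameter: N stays part of the type,
    // so there is no decay to const std::string*.
    void print(const std::string (&familyName)[N])
    {
        std::cout << "My name is " << name << " ";
        for (unsigned int i = 0; i < N; i++)
            std::cout << familyName[i] << " ";
        std::cout << std::endl;
    }
};

int main()
{
    TemplateClass<3> obj("aname");
    std::string namesf[3] = {"un", "deux", "trois"};
    obj.print(namesf); // prints: My name is aname un deux trois
}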
More info on the topic at the following links:
http://pointer-overloading.blogspot.ch/2013/09/c-template-argument-deduction-to-deduce.html
http://en.cppreference.com/w/cpp/language/template_argument_deduction
https://theotherbranch.wordpress.com/2011/08/24/template-parameter-deduction-from-array-dimensions/


nvcc warns about a device variable being a host variable - why?

I've been reading about template functions in the CUDA Programming Guide, and I'm wondering whether something like this is supposed to work:
#include <cstdio>

/* host struct */
template <typename T>
struct Test {
    T *val;
    int size;
};

/* device struct */
template <typename T>
__device__ Test<T> *d_test;

/* test function */
template <typename T>
T __device__ testfunc() {
    return *d_test<T>->val;
}

/* test kernel */
__global__ void kernel() {
    printf("funcout = %g \n", testfunc<float>());
}
I get the correct result but a warning:
"warning: a host variable "d_test [with T=T]" cannot be directly read in a device function" ?
Does the struct in the test function have to be instantiated with *d_test<float>->val?
KR,
Iggi
Unfortunately, the CUDA compiler seems to generally have some issues with variable templates. If you look at the assembly, you'll see that everything works just fine. The compiler clearly does instantiate the variable template and allocates a corresponding device object.
.global .align 8 .u64 _Z6d_testIfE;
The generated code uses this object just like it's supposed to
ld.global.u64 %rd3, [_Z6d_testIfE];
I'd consider this warning a compiler bug. Note that I cannot reproduce the issue with CUDA 10 here, so this issue has most likely been fixed by now. Consider updating your compiler…
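If you want to inspect the generated PTX yourself, dumping it with nvcc is enough (the file name test.cu is just an assumption):
nvcc -ptx test.cu -o test.ptx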
@MichaelKenzel is correct.
This is almost certainly an nvcc bug, which I have now filed (you might need an account to access that).
Also note I've been able to reproduce the issue with less code:
template <typename T>
struct foo { int val; };
template <typename T>
__device__ foo<T> *x;
template <typename T>
int __device__ f() { return x<T>->val; }
__global__ void kernel() { int y = f<float>(); }
and have a look at the result on GodBolt as well.

SWIG parser error

I have the following header file.
#include <string>

namespace A {
namespace B {

struct Msg {
    std::string id;
    std::string msg;

    Msg(std::string new_id, std::string new_msg)
        : id(new_id), msg(new_msg)
    {
    }
};

template<bool HAS_ID>
class ID {
public:
    template<typename TOBJ>
    auto get(TOBJ parent) -> decltype(parent.id()) {
        return parent.id();
    }
};

} // namespace B
} // namespace A
When I run SWIG on it, it gives me an error:
Error: Syntax error in input(3). at line 20, pointing to the line
auto get(TOBJ parent) -> decltype(parent.id())
The target language is Java.
How can I fix this problem? I only want to create a wrapper for the Msg struct and for nothing else in the header. As this looks like a SWIG parser error, using the %ignore directive does not seem to work.
Thank you
Although SWIG 3.x added limited decltype support, it looks like the case you have is currently unsupported. (See decltype limitations.)
I think the best you'll get for now is to surround the offending code in preprocessor macros to hide it, e.g.:
#include <string>

namespace A {
namespace B {

struct Msg {
    std::string id;
    std::string msg;

    Msg(std::string new_id, std::string new_msg)
        : id(new_id), msg(new_msg)
    {
    }
};

template<bool HAS_ID>
class ID {
public:
#ifndef SWIG
    template<typename TOBJ>
    auto get(TOBJ parent) -> decltype(parent.id()) {
        return parent.id();
    }
#endif
};

} // namespace B
} // namespace A
If you can't edit the file like that for whatever reason, there are two options:
Don't use %include with the header file that doesn't parse. Instead write something like:
%{
#include "header.h" // Not parsed by SWIG here though
%}

namespace A {
namespace B {

struct Msg {
    std::string id;
    std::string msg;

    Msg(std::string new_id, std::string new_msg)
        : id(new_id), msg(new_msg)
    {
    }
};

} // namespace B
} // namespace A
in your .i file, which simply tells SWIG about the type you want to wrap and glosses over the one that doesn't work.
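Put together, a complete interface file along those lines might look something like this (the module name msg and the header name header.h are assumptions; std_string.i is needed so std::string maps to a Java String):

%module msg
%include <std_string.i>

%{
#include "header.h" // compiled into the wrapper, but never parsed by SWIG
%}

// Redeclare only the type we want wrapped; SWIG never sees the
// problematic decltype member of A::B::ID.
namespace A {
namespace B {

struct Msg {
    std::string id;
    std::string msg;
    Msg(std::string new_id, std::string new_msg);
};

} // namespace B
} // namespace A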
Alternatively, get creative with the preprocessor and find a way to hide it using a bodge; inside your .i file you could write something like:
#define auto // \
void ignore_me();
%ignore ignore_me;
Another similar bodge would be to hide the contents of decltype with:
#define decltype(x) void*
which just tells SWIG to assume all decltype usage is a void pointer. (This needs SWIG 3.x and could be combined with %ignore, which ought to do the ignoring, or with a typemap to really fix it.)
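For example, a sketch of that combination (the header name and the fully qualified method name are assumptions based on the code above):

// in the .i file, before the header is processed
#define decltype(x) void*
// optionally ignore the method whose return type is now meaningless
%ignore A::B::ID::get;
%include "header.h"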

How can the vim script clang_complete complete functions and templates?

In clang_complete.txt (the help file), it shows these entries under clang_complete-compl_kinds:
2.Completion kinds *clang_complete-compl_kinds*
Because libclang provides a lot of information about completion, there are
some additional kinds of completion along with standard ones (see >
:help complete-items for details):
'+' - constructor
'~' - destructor
'e' - enumerator constant
'a' - parameter ('a' from "argument") of a function, method or template
'u' - unknown or buildin type (int, float, ...)
'n' - namespace or its alias
'p' - template ('p' from "pattern")
My questions are:
1. I cannot access complete-items (there is no such file).
2. Can someone tell me how to use the kinds '+', 'a', and so on?
3. Or can you tell me how to show function parameters when ( is typed?
Thanks!
(Forgive my poor English.)
It's been a long time, but I'll answer to help future visitors.
I don't fully understand your questions, but I'll answer the third one. clang_complete only launches automatic suggestions/completion when you type '.', '->' or '::', but you can launch it manually.
I use it this way. In this source:
#include <iostream>
using namespace std;

void ExampleFunc (float foo, int &bar)
{
    cout << foo;
    bar++;
}

int main (int argc, char **argv)
{
    int a(0);
    Exa[cursor here]
    return 0;
}
Writing "Exa" you can press <C-X><C-U> and you will get a preview window with:
Example (float foo, int &bar)
and a completion window (the same that appears when you press <C-N> (CTRL-N) in insert mode) with:
Example f void Example(float foo, int &bar)
If there are several matches, you can move down or up with <C-N> or <C-P> and complete with <CR> (enter).
The completion is not perfect, but it should work for many other cases, for example (as you mentioned) templates:
#include <vector>
using namespace std;

int main (int argc, char **argv)
{
    struct MyType {int asdf; float qwer;};
    vector<MyType> vec;

    ve      // suggestions after <C-X><C-U>:
            // "vec v vector<MyType> vec"       v is for variable
            // "vector p vector<Typename _Tp>"  p is for pattern (template)
            // constructors with its parameters, etc.

    vec.    // auto-fired suggestions: all std::vector methods
    vec[0]. // auto-fired suggestions: "asdf", "qwer" and MyType methods

    return 0;
}
If those examples don't work for you, you haven't installed the plugin properly.
By the way, you can map <C-X><C-U> to another shortcut.
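For example, a possible insert-mode mapping for your .vimrc (the chosen key is just an example; <C-Space> may need to be spelled <Nul> in terminal Vim):

inoremap <C-Space> <C-X><C-U>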

Thrust vector of type uint2: "has no member x" compiler error?

I have just started using the Thrust library. I am trying to make a vector of length 5 on the device. Here I am just setting the members of the first element, vec[0]:
#include<thrust/device_vector.h>
#include<iostream>
.
.
.
thrust::device_vector<uint2> vec(5);
vec[0]=make_uint2(4,5);
std::cout<<vec[0].x<<std::endl;
However for the above code I get the error
error: class "thrust::device_reference<uint2>" has no member "x"
1 error detected in the compilation of "/tmp/tmpxft_000020dc_00000000-4_test.cpp1.ii".
Where am I going wrong? I thought that accessing a member of a native CUDA vector type such as uint2 with .x and .y was the correct way of doing it.
As talonmies notes in his comment, you can't directly access the members of elements owned by a device_vector, or any object wrapped with device_reference. However, I wanted to provide this answer to demonstrate an alternative approach to your problem.
Even though device_reference doesn't allow you to access the members of the wrapped object, it is compatible with operator<<. This code should work as expected:
#include <thrust/device_vector.h>
#include <iostream>

// provide an overload for operator<<(ostream, uint2)
// as one is not provided otherwise
std::ostream &operator<<(std::ostream &os, const uint2 &x)
{
    os << x.x << ", " << x.y;
    return os;
}

int main()
{
    thrust::device_vector<uint2> vec(5);
    vec[0] = make_uint2(4,5);
    std::cout << vec[0] << std::endl;
    return 0;
}
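An alternative sketch (not from the original answer): copy the element into a plain host-side uint2 first, then access its members directly; device_reference<uint2> converts to uint2 on assignment. This snippet would go inside the main() above:

uint2 first = vec[0];  // implicit device-to-host copy of the element
std::cout << first.x << ", " << first.y << std::endl;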

C++: Explicit DLL Loading: First-chance Exception on non "extern C" functions

I am having trouble importing my C++ functions. If I declare them as C functions, I can successfully import them. When loading explicitly, if any of the functions is missing the extern "C" decoration, I get the following exception:
First-chance exception at 0x00000000 in cpp.exe: 0xC0000005: Access violation.
DLL.h:
extern "C" __declspec(dllimport) int addC(int a, int b);
__declspec(dllimport) int addCpp(int a, int b);
DLL.cpp:
#include "DLL.h"
int addC(int a, int b) {
return a + b;
}
int addCpp(int a, int b) {
return a + b;
}
main.cpp:
#include "..DLL/DLL.h"
#include <stdio.h>
#include <windows.h>
int main() {
    int a = 2;
    int b = 1;

    typedef int (*PFNaddC)(int,int);
    typedef int (*PFNaddCpp)(int,int);

    HMODULE hDLL = LoadLibrary(TEXT("../Debug/DLL.dll"));
    if (hDLL != NULL)
    {
        PFNaddC pfnAddC = (PFNaddC)GetProcAddress(hDLL, "addC");
        PFNaddCpp pfnAddCpp = (PFNaddCpp)GetProcAddress(hDLL, "addCpp");

        printf("a=%d, b=%d\n", a, b);
        printf("pfnAddC: %d\n", pfnAddC(a,b));
        printf("pfnAddCpp: %d\n", pfnAddCpp(a,b)); // EXCEPTION ON THIS LINE
    }
    getchar();
    return 0;
}
How can I import C++ functions for dynamic loading? I have found that the following code works with implicit loading by referencing the *.lib, but I would like to learn about dynamic loading.
Thank you to all in advance.
Update:
dumpbin /exports
1    00011109 ?addCpp@@YAHHH@Z = @ILT+260(?addCpp@@YAHHH@Z)
2    00011136 addC = @ILT+305(_addC)
Solution:
Create a conversion struct as found here.
Take a look at the file's exports and copy the mangled C++ name explicitly:
PFNaddCpp pfnAddCpp = (PFNaddCpp)GetProcAddress(hDLL, "?addCpp@@YAHHH@Z");
Inevitably, the access violation on the null pointer is because GetProcAddress() returns null on error.
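A defensive sketch (not in the original question, reusing its PFNaddCpp typedef) that catches the failure before the call:

PFNaddCpp pfnAddCpp = (PFNaddCpp)GetProcAddress(hDLL, "addCpp");
if (pfnAddCpp == NULL) {
    // "addCpp" is exported under its mangled C++ name, so this lookup fails;
    // calling through the resulting null pointer is what raised the access violation.
    printf("GetProcAddress failed: error %lu\n", GetLastError());
    return 1;
}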
The problem is that C++ names are mangled by the compiler to accommodate a variety of C++ features (namespaces, classes, and overloading, among other things). So, your function addCpp() is not really named addCpp() in the resulting library. When you declare the function with extern "C", you give up overloading and the option of putting the function in a namespace, but in return you get a function whose name is not mangled, and which you can call from C code (which doesn't know anything about name mangling.)
One option to get around this is to export the functions using a .def file to rename the exported functions. There's an article, Explicitly Linking to Classes in DLLs, that describes what is necessary to do this.
It's possible to just wrap a whole header file in extern "C" as follows. Then you don't need to worry about forgetting an extern "C" on one of your declarations.
#ifdef __cplusplus
extern "C" {
#endif
__declspec(dllimport) int addC(int a, int b);
__declspec(dllimport) int addCpp(int a, int b);
#ifdef __cplusplus
} /* extern "C" */
#endif
You can still use all of the C++ features that you're used to in the function bodies -- these functions are still C++ functions -- they just have restrictions on the prototypes to make them compatible with C code.
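A common refinement of that pattern (not from the original answer; the macro names DLL_EXPORTS and DLL_API are assumptions) is to switch between dllexport and dllimport so the same header serves both the DLL build and its consumers:

// DLL.h - sketch; the DLL project defines DLL_EXPORTS, consumers do not
#ifdef DLL_EXPORTS
#define DLL_API __declspec(dllexport)
#else
#define DLL_API __declspec(dllimport)
#endif

#ifdef __cplusplus
extern "C" {
#endif

DLL_API int addC(int a, int b);
DLL_API int addCpp(int a, int b);

#ifdef __cplusplus
} /* extern "C" */
#endif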