Usage of printf() in CUDA 4.0: compilation error

I have a GTX 570 (Fermi architecture), which has compute capability 2.0. I have CUDA 4.0 installed and I am using Ubuntu 10.10.
With CUDA 4.0 it is possible to use printf() inside kernels. Here is example code from page 125 of the CUDA 4.0 programming guide:
#if defined(__CUDA_ARCH__) && (__CUDA_ARCH__ < 200)
#define printf(f, ...) ((void)(f, __VA_ARGS__),0)
#endif
__global__ void helloCUDA(float f)
{
printf(“Hello thread %d, f=%f\n”, threadIdx.x, f);
}
void main()
{
helloCUDA<<<1, 5>>>(1.2345f);
cudaDeviceReset();
}
I am getting the following compilation error.
gaurish108 MyPractice: nvcc printf_inkernel.cu -o printf_inkernel
printf_inkernel.cu(10): error: unrecognized token
printf_inkernel.cu(10): error: expected an expression
printf_inkernel.cu(10): error: unrecognized token
printf_inkernel.cu(10): error: unrecognized token
printf_inkernel.cu(10): error: unrecognized token
printf_inkernel.cu(10): error: unrecognized token
printf_inkernel.cu(10): error: unrecognized token
printf_inkernel.cu(10): error: unrecognized token
printf_inkernel.cu(15): warning: return type of function "main" must be "int"
8 errors detected in the compilation of "/tmp/tmpxft_000014cd_00000000-4_printf_inkernel.cpp1.ii".
Why is printf not recognized? I tried adding the flag -arch=sm_20, but I get the same error.

It looks like the format string passed to printf is enclosed in typographic (curly) quotes, “ and ”, instead of plain ASCII double quotes, so the compiler does not recognize them as tokens.
If you copy and paste this program, it ought to compile and run without error:
#include <stdio.h>
__global__ void helloCUDA(float f)
{
printf("Hello thread %d, f=%f\n", threadIdx.x, f);
}
int main()
{
helloCUDA<<<1, 5>>>(1.2345f);
cudaDeviceReset();
return 0;
}
And the output:
$ nvcc -arch=sm_20 test.cu -run
Hello thread 0, f=1.234500
Hello thread 1, f=1.234500
Hello thread 2, f=1.234500
Hello thread 3, f=1.234500
Hello thread 4, f=1.234500
I don't understand the need for the weird macro which begins the program. I'd get rid of it.

error while connecting mariadb with c : undefined reference to `mysql_init#4'

I am trying to connect to a MariaDB database from a C program. Initially the compiler reported "no such file or directory" for #include <mysql.h>.
After I added the include directory that problem was solved, but now I get a different error.
This is the code I am trying to run:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
// #include "C:/Program Files/MariaDB 10.11/include/mysql/my_global.h"
#include "mysql/mysql.h"
int main (int argc, char* argv[])
{
// Initialize Connection
MYSQL *conn;
if (!(conn = mysql_init(0)))
{
fprintf(stderr, "unable to initialize connection struct\n");
exit(1);
}
// Connect to the database
if (!mysql_real_connect(
conn, // Connection
"mariadb.example.net", // Host
"db_user", // User account
"db_user_password", // User password
"test", // Default database
3306, // Port number
NULL, // Path to socket file
0 // Additional options
))
{
// Report the failed-connection error & close the handle
fprintf(stderr, "Error connecting to Server: %s\n", mysql_error(conn));
mysql_close(conn);
exit(1);
}
// Use the Connection
// ...
// Close the Connection
mysql_close(conn);
return 0;
}
I am getting the following error:
PS C:\Dev\Win> gcc Db_con.c -o Db_con
C:\Users\hajos\AppData\Local\Temp\ccGZ2Rhz.o:Db_con.c:(.text+0x1e): undefined reference to `mysql_init#4'
C:\Users\hajos\AppData\Local\Temp\ccGZ2Rhz.o:Db_con.c:(.text+0xa1): undefined reference to `mysql_real_connect#32'
C:\Users\hajos\AppData\Local\Temp\ccGZ2Rhz.o:Db_con.c:(.text+0xaf): undefined reference to `mysql_error#4'
C:\Users\hajos\AppData\Local\Temp\ccGZ2Rhz.o:Db_con.c:(.text+0xd9): undefined reference to `mysql_close#4'
collect2.exe: error: ld returned 1 exit status
Can anyone explain what is the problem and how to solve it?
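The header only declares the functions. The undefined references come from the linker, so you also have to link against the MariaDB Connector/C library: pass its library directory with -L and the library name with -l on the gcc command line.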
From MariaDB Connector/C documentation:
Linking your application against MariaDB Connector/C
Windows
For static linking the library libmariadb.lib is required, for dynamic linking use libmariadb.dll. Using the MSI installer, these libraries can be found in the lib directory of your MariaDB Connector/C installation.
Unless you use the experimental plugin remote_io (which requires the curl library) there are no dependencies to other libraries than the Windows system libraries.

CUDA constant memory usage across multiple source files showing different behaviors on cuda-11.2 and cuda-11.4

Minimum repro:
kernel.cu:
#include <stdio.h>
__constant__ int N_GPU;
void wrapper_fn(int *ptr)
{
cudaMemcpyToSymbol(N_GPU, ptr, sizeof(int), cudaMemcpyDeviceToDevice);
}
__global__ void printKernel() {
printf("N = %d; \n", N_GPU);
}
driver.cu:
#include "cuda_runtime.h"
#include <stdio.h>
void wrapper_fn(int*);
__global__ void printKernel();
int main()
{
int N = 10;
int* d_N_ptr;
cudaMalloc(&d_N_ptr, sizeof(int));
cudaMemcpy(d_N_ptr, &N, sizeof(int), cudaMemcpyDefault);
wrapper_fn(d_N_ptr);
printKernel <<<1, 1 >>>();
cudaPeekAtLastError();
cudaDeviceSynchronize();
return 0;
}
Both on cuda-11.4 and cuda-11.2, running nvcc kernel.cu driver.cu compiles. The expected output (i.e N = 10;) is only seen in 11.2 and not 11.4.
Upon running cuda-gdb on 11.4, I get the following:
...
[New Thread 0x7fffee240700 (LWP 54339)]
warning: Cuda API error detected: cudaMalloc returned (0xde)
warning: Cuda API error detected: cudaMemcpy returned (0xde)
warning: Cuda API error detected: cudaMemcpyToSymbol returned (0xde)
warning: Cuda API error detected: cudaLaunchKernel returned (0xde)
warning: Cuda API error detected: cudaPeekAtLastError returned (0xde)
warning: Cuda API error detected: cudaDeviceSynchronize returned (0xde)
[Thread 0x7fffee240700 (LWP 54339) exited]
...
Are there particular nvcc flags that matter in 11.4, or API changes that I'm missing? Thanks in advance!
So the answer has to do with my driver version. The error code in the cuda-gdb output (0xde = 222) means that the compiled PTX is too new for the installed driver (mine was 460.35). "CUDA Enhanced Compatibility" was being used to run on my older driver, which did not support the necessary PTX JIT.
TL;DR: compiling to the exact architecture-specific SASS solved it for CUDA 11.4.
I did this by adding the -arch compute_70 flag to my nvcc compilation command.

cuDevicePrimaryCtxRetain returns CUDA_ERROR_INVALID_DEVICE after acc_init

I was trying the new PGI community release (17.4) with a toy example (see below) and I'm getting an error inside the CUDA driver api when calling acc_init.
The code to reproduce the error is:
#include <openacc.h>
#include <cuda_runtime_api.h>
#include <stdio.h>
int main()
{
acc_init( acc_device_nvidia );
int ndev = acc_get_num_devices( acc_device_nvidia );
printf("Num OpenACC devices: %d\n", ndev);
cudaGetDeviceCount(&ndev);
printf("Num CUDA devices: %d\n", ndev);
return 0;
}
Compiled with:
/usr/local/pgi/linux86-64/17.4/bin/pgcc -acc -ta=tesla -Mcuda ./test.c -o oacc_test.pgi
cuda memcheck output:
$ cuda-memcheck ./oacc_test.pgi
========= CUDA-MEMCHECK
========= Program hit CUDA_ERROR_INVALID_DEVICE (error 101) due to "invalid device ordinal" on CUDA API call to cuDevicePrimaryCtxRetain.
========= Saved host backtrace up to driver entry point at error
========= Host Frame:/usr/lib/x86_64-linux-gnu/libcuda.so (cuDevicePrimaryCtxRetain + 0x15c) [0x1e8d1c]
========= Host Frame:/usr/local/pgi/linux86-64/17.4/lib/libaccnc.so (__pgi_uacc_cuda_initdev + 0x80b) [0x6f0b]
========= Host Frame:/usr/local/pgi/linux86-64/17.4/lib/libaccg.so (__pgi_uacc_enumerate + 0x148) [0x11388]
========= Host Frame:/usr/local/pgi/linux86-64/17.4/lib/libaccg.so (__pgi_uacc_initialize + 0x5b) [0x117ab]
========= Host Frame:/usr/local/pgi/linux86-64/17.4/lib/libaccapi.so (acc_init + 0x22) [0xe4f2]
========= Host Frame:./oacc_test.pgi [0xbc4]
========= Host Frame:/lib/x86_64-linux-gnu/libc.so.6 (__libc_start_main + 0xf1) [0x202b1]
========= Host Frame:./oacc_test.pgi [0xaca]
=========
Num OpenACC devices: 1
Num CUDA devices: 1
========= ERROR SUMMARY: 1 error
Apparently __pgi_uacc_cuda_initdev is passing a '-1' as the second parameter (CUdevice dev) to cuDevicePrimaryCtxRetain (bug?):
Breakpoint 1, 0x00007ffff4ab0bc0 in cuDevicePrimaryCtxRetain () from /usr/lib/x86_64-linux-gnu/libcuda.so
(cuda-gdb) p /x $rsi
$7 = 0xffffffff
I suppose this isn't normal. Is this a bug of 17.4 or is my installation broken?
It's normal and a benign error. Basically, the PGI runtime is querying whether a CUDA context has already been created. Since there isn't a CUDA runtime call that just queries the existence of a context, we call "cuDevicePrimaryCtxRetain". If it returns an error, we know that we need to create a new context.
Note that in PGI release 17.7 we did change this call a bit so you will no longer see the error when running cuda-memcheck.

exception when using boost::serialization with boost::asio

My server runs the following code:
boost::asio::streambuf streambuf;
std::istream istream(&streambuf);
boost::archive::xml_iarchive xml_iarchive(istream);
boost::asio::read_until(socket_, streambuf, '\n');
When the server is up and running, I connect via telnet from another machine. Immediately after the connection is established, it is closed and the server crashes with the following exception:
terminate called after throwing an instance of 'boost::archive::xml_archive_exception'
what(): unrecognized XML syntax
Where is the failure in the code snippet above? It looks to me as if the telnet session is sending a '\n' before I manually enter an XML string.
You didn't post an SSCCE, so I created one for you:
#include <boost/asio.hpp>
#include <boost/archive/xml_iarchive.hpp>
#include <iostream>
int
main()
{
try {
boost::asio::streambuf streambuf;
std::istream istream(&streambuf);
boost::archive::xml_iarchive xml_iarchive(istream);
} catch ( const std::exception& e ) {
std::cerr << e.what() << std::endl;
}
}
As expected, an exception is thrown as soon as the xml_iarchive is constructed:
samm$ ./a.out
unrecognized XML syntax
This has nothing to do with Boost.Asio: you're trying to deserialize from an empty buffer, which isn't valid XML. To solve this, delay constructing the archive until after you have read from the socket into the buffer:
boost::asio::read_until(socket_, streambuf, '\n');
std::istream istream(&streambuf);
boost::archive::xml_iarchive xml_iarchive(istream);

Weird error compiling code that calls Surface low-level CUDA API

This minimal example:
int main()
{
struct surfaceReference* surfaceReferencePointer;
cudaGetSurfaceReference(&surfaceReferencePointer, "surfaceReference");
}
Fails when it is compiled like this:
nvcc -g -arch=sm_20 -o foo.out foo.cu
Showing the following error message:
foo.cu(4): warning: argument of type "surfaceReference **" is incompatible with parameter of type "const surfaceReference **"
foo.cu(4): warning: argument of type "surfaceReference **" is incompatible with parameter of type "const surfaceReference **"
foo.cu: In function ‘int main()’:
foo.cu:4: error: invalid conversion from ‘surfaceReference**’ to ‘const surfaceReference**’
foo.cu:4: error: initializing argument 1 of ‘cudaError_t cudaGetSurfaceReference(const surfaceReference**, const char*)’
I cannot understand what I am doing wrong. I am compiling on a Linux Ubuntu 64-bit machine, using CUDA 3.2.
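The pointer must be declared as a pointer to a const struct surfaceReference, because cudaGetSurfaceReference takes a const surfaceReference** as its first argument: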
int main()
{
const struct surfaceReference* surfaceReferencePointer;
cudaGetSurfaceReference(&surfaceReferencePointer, "surfaceReference");
}
Kudos to codymanix, who provided the right answer in the comments.