Specifying the OpenMP flag for CUDA in a CMake project

How can I pass the OpenMP flag to NVCC in a CMake project?
My CMakeLists.txt for this project looks like this, but the build fails with an "undefined reference to `omp_get_wtime'" message.
cmake_minimum_required (VERSION 2.8.2)
set (CMAKE_MODULE_PATH
"${CMAKE_SOURCE_DIR}/cmake"
${CMAKE_MODULE_PATH}
)
find_package (CUDA 4.0 REQUIRED)
if(OPENMP_FOUND)
set(CMAKE_C_FLAGS "${CMAKE_C_FLAGS} ${OpenMP_C_FLAGS}")
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} ${OpenMP_CXX_FLAGS}")
set(CMAKE_EXE_LINKER_FLAGS "${CMAKE_EXE_LINKER_FLAGS} ${OpenMP_EXE_LINKER_FLAGS}")
endif()
set (CMAKE_RUNTIME_OUTPUT_DIRECTORY ${PROJECT_SOURCE_DIR}/bin)
set (CMAKE_LIBRARY_OUTPUT_DIRECTORY ${PROJECT_SOURCE_DIR}/lib)
set (CMAKE_ARCHIVE_OUTPUT_DIRECTORY ${PROJECT_SOURCE_DIR}/lib)
if(UNIX)
add_definitions(-DUNIX)
endif(UNIX)
add_subdirectory(xor)
CUDA_BUILD_CLEAN_TARGET()

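I just found out that adding find_package(OpenMP) to the previous script, before the if(OPENMP_FOUND) check, did the trick. Without it, OPENMP_FOUND is never set, so the OpenMP flags were never appended to the compiler and linker flags.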
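To check from the source side whether the OpenMP flag actually reached the host compiler, one option is to test the standard _OPENMP macro, which OpenMP-enabled compilers define. The following is only a sketch, not part of the original project: wall_time and openmp_enabled are hypothetical helper names, and the std::chrono fallback is an assumption about what a sensible non-OpenMP path could look like.

```cpp
#include <cassert>
#include <chrono>
#ifdef _OPENMP
#include <omp.h>  // only included when the compiler was given its OpenMP flag
#endif

// Hypothetical helper: wall-clock time in seconds.
// Uses omp_get_wtime() when OpenMP is enabled, std::chrono otherwise,
// so the code still links when the OpenMP flag is missing.
double wall_time() {
#ifdef _OPENMP
    return omp_get_wtime();
#else
    using clock = std::chrono::steady_clock;
    return std::chrono::duration<double>(clock::now().time_since_epoch()).count();
#endif
}

// Hypothetical helper: reports whether this file was compiled with OpenMP enabled.
bool openmp_enabled() {
#ifdef _OPENMP
    return true;
#else
    return false;
#endif
}
```

If openmp_enabled() returns false in a build that is supposed to use OpenMP, the flags from the CMake script did not reach the compiler.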

Related

nvcc compilation errors with "M_PI" and "or"

When trying to compile this piece of code:
#define _USE_MATH_DEFINES
#include <cmath>
#include <cstdio>
void minimal_example(){
int i=2;
if(i==3 or i==4) printf("I want %d!\n",M_PI);
}
using
nvcc -x cu -Xcompiler=/permissive- -dc cuda_nvcc_cl_test.cu -o cuda_nvcc_cl_test.obj
I get the following errors (in line 7):
error: expected a ")"
error: identifier "M_PI" is undefined
I am using Windows 10 with Visual Studio's cl.exe (Version 19.16.27031.1 for x64) and CUDA toolkit 10.1.
When I replace cmath with math.h and or with || (or, alternatively, add #include <ciso646>), the errors disappear. However, are there compiler options or other ways to keep the code as is?
Also, why did -Xcompiler=/permissive- not help?
There are two issues here:
1. Apparently nvcc includes cmath before it parses your code. As discussed in the accepted answer here, if cmath is included while _USE_MATH_DEFINES is not yet defined, M_PI does not get defined, and later inclusions of cmath cannot fix this because of the include guards. A possible workaround is to add -D_USE_MATH_DEFINES to your compile command line. This puts the define in place from the start of the compilation process, so M_PI gets defined.
2. Contrary to what I understand to be correct, standard behavior, using or in place of || seems to depend on the inclusion of ciso646 (with nvcc on Windows/Visual Studio only; nvcc on Linux doesn't seem to need this). Yes, I understand it is not supposed to work that way, but nevertheless it appears to be necessary. This may be an issue with Visual Studio. You can experiment with the /Za switch if you like. (It didn't seem to help when I tried it.)
With CUDA 10.1, on VS2019, when I compile this:
#include <cstdio>
#include <ciso646>
void minimal_example(){
int i=2;
if(i==3 or i==4) printf("I want %f!\n",M_PI);
}
with this command line:
nvcc -x cu -dc test.cu -o test.obj -D_USE_MATH_DEFINES
I get no errors or warnings. Note that I have also changed your printf format specifier from %d to %f to be consistent with the type of M_PI.
If you really don't want to include ciso646, nvcc supports the -include switch to include a file directly from the command line. Therefore I can compile this:
#include <cstdio>
void minimal_example(){
int i=2;
if(i==3 or i==4) printf("I want %f!\n",M_PI);
}
like this:
nvcc -x cu -dc test.cu -o test.obj -D_USE_MATH_DEFINES -include ciso646
with no errors or warnings.
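If changing the command line is not an option, a different workaround (not the one used in this answer) is to guard against a missing M_PI in the source itself. The sketch below assumes that a local fallback definition is acceptable; circle_area is a hypothetical function used only for illustration, and it uses || instead of or so that it does not depend on ciso646.

```cpp
#define _USE_MATH_DEFINES  // only effective if <cmath> was not already included earlier
#include <cassert>
#include <cmath>

// Fallback for the case described above: an earlier (implicit) include of
// <cmath> ran without _USE_MATH_DEFINES, and the include guard now keeps
// M_PI from being defined by the include in this file.
#ifndef M_PI
#define M_PI 3.14159265358979323846
#endif

// Hypothetical helper that needs pi. Negative or NaN radii yield 0.
double circle_area(double radius) {
    if (radius < 0.0 || std::isnan(radius)) return 0.0;
    return M_PI * radius * radius;
}
```

The #ifndef guard makes the fallback harmless on toolchains where M_PI is already defined.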

How to enable separate compilation of CUDA with ROS?

By default, CUDA requires all device code to be compiled in a single translation unit unless you enable separate compilation and linking of device code.
But how can I enable it in a CMakeLists.txt for ROS (or, more generally, for CMake)?
The following is my current CMakeLists.txt file:
cmake_minimum_required(VERSION 2.8.3)
project(edt)
set(ROS_BUILD_TYPE Debug)
set(CMAKE_BUILD_TYPE Debug)
# Compile as C++11, supported in ROS Kinetic and newer
add_compile_options(-std=c++11)
find_package(catkin REQUIRED COMPONENTS
roscpp
rospy
std_msgs
)
FIND_PACKAGE(CUDA REQUIRED)
SET(CUDA_NVCC_FLAGS "-arch=sm_61;-g;-G" CACHE STRING "nvcc flags" FORCE)
SET (CUDA_VERBOSE_BUILD ON CACHE BOOL "nvcc verbose" FORCE)
catkin_package(
INCLUDE_DIRS include
CATKIN_DEPENDS roscpp rospy std_msgs message_runtime
)
include_directories(
include
${catkin_INCLUDE_DIRS}
${ZED_INCLUDE_DIRS}
)
cuda_add_executable(
${PROJECT_NAME}_node
src/main.cpp
src/test.cu
src/test2.cu
)
add_dependencies(
${PROJECT_NAME}_node
${${PROJECT_NAME}_EXPORTED_TARGETS}
${catkin_EXPORTED_TARGETS}
)
## Specify libraries to link a library or executable target against
target_link_libraries(
${PROJECT_NAME}_node
${catkin_LIBRARIES}
${ZED_LIBRARIES}
)
test2.cu defines a function that is called in test.cu (see cuda_add_executable in the CMakeLists.txt), and the build currently fails with an error saying that the .o file cannot be created.
Does this work for you?
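set_property(TARGET ${TARGET_NAME} PROPERTY CUDA_SEPARABLE_COMPILATION ON)
That's the common way, in my understanding. Note that this target property belongs to CMake's native CUDA language support (CMake 3.8 and later). With the FindCUDA module used in the question, the equivalent is to set the CUDA_SEPARABLE_COMPILATION variable to ON before calling cuda_add_executable.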

Using CMakes CHECK_CXX_COMPILER_FLAG with nvcc/cuda

I'm trying to compile some code using CUDA with Makefiles generated by CMake.
I'd like to use CHECK_CXX_COMPILER_FLAG or something similar to check whether the nvcc version in use supports a given flag. In my case it is "--expt-relaxed-constexpr" (CUDA 7.5) and "--relaxed-constexpr" (CUDA 7?)
Of course I could compare the CUDA version but I find the compile-flag check more fail-safe.
Is there any CMake command similar to CHECK_CXX_COMPILER_FLAG that invokes the nvcc compiler and not the host compiler?
I am not aware of an official way to check for a specific nvcc flag, but you can write a macro yourself rather simply:
CheckNvccCompilerFlag.cmake
MACRO(CHECK_NVCC_COMPILER_FLAG _FLAG _RESULT)
EXECUTE_PROCESS(COMMAND ${CUDA_NVCC_EXECUTABLE} "${_FLAG}" ERROR_VARIABLE NVCC_OUT)
IF("${NVCC_OUT}" MATCHES "Unknown option")
SET(${_RESULT} 0)
ELSE()
SET(${_RESULT} 1)
ENDIF()
ENDMACRO()
A demo use:
CMakeLists.txt
PROJECT(cuda_flag_test)
FIND_PACKAGE(CUDA)
INCLUDE(CheckNvccCompilerFlag.cmake)
CHECK_NVCC_COMPILER_FLAG("--asdf" HAS_NVCC_ASDF)
IF(HAS_NVCC_ASDF)
MESSAGE(STATUS "asdf is supported")
ENDIF()
CHECK_NVCC_COMPILER_FLAG("--relaxed-constexpr" HAS_NVCC_RELAXED_CONSTEXPR)
IF(HAS_NVCC_RELAXED_CONSTEXPR)
MESSAGE(STATUS "relaxed-constexpr is supported")
ENDIF()
output
...
-- Found CUDA: /opt/cuda (found version "7.0")
-- relaxed-constexpr is supported
...
(Personally, I would rely on CUDA_VERSION.)
Short answer: Yes, there is. NVCC defines a macro, __CUDACC_RELAXED_CONSTEXPR__, when --expt-relaxed-constexpr is passed:
#ifdef __CUDACC_RELAXED_CONSTEXPR__
// do something
#endif
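As a slightly fuller sketch of the macro check, the snippet below turns it into a compile-time constant that the rest of the code can branch on. The names kRelaxedConstexpr and relaxed_constexpr_enabled are hypothetical. Note that this detects whether the flag was passed for the current compilation, not whether the installed nvcc would accept it.

```cpp
#include <cassert>

// True only when nvcc compiled this file with --expt-relaxed-constexpr.
// Any other compiler, or nvcc without the flag, takes the #else branch.
#ifdef __CUDACC_RELAXED_CONSTEXPR__
constexpr bool kRelaxedConstexpr = true;
#else
constexpr bool kRelaxedConstexpr = false;
#endif

// Hypothetical helper so that calling code can query the setting.
constexpr bool relaxed_constexpr_enabled() { return kRelaxedConstexpr; }
```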

Getting DRAM_Reads and DRAM_Writes in command line mode of CUDA Profiler

I am trying to use the CUDA Profiler from the command line. I am interested in DRAM_Reads and DRAM_Writes, and I am providing the following counters in my CUDA_PROFILE_CONFIG file:
fb_subp0_read_sectors
fb_subp0_write_sectors
fb0_subp0_read_sectors
fb0_subp0_write_sectors
fb1_subp0_read_sectors
fb1_subp0_write_sectors
But I notice that my cuda_profile files contain warnings like:
NV_Warning: Ignoring the invalid profiler config option: fb0_subp0_read_sectors
NV_Warning: Ignoring the invalid profiler config option: fb0_subp0_write_sectors
NV_Warning: Ignoring the invalid profiler config option: fb1_subp0_read_sectors
NV_Warning: Ignoring the invalid profiler config option: fb1_subp0_write_sectors
The values I get from the fb_subp0_read_sectors and fb_subp0_write_sectors counters are not equal to what I get from the NVIDIA Visual Profiler, perhaps because I am not passing the correct counters in the config file.
The GPU is a Tesla M2050 and CUDA 4.1 is used. How do I get DRAM_Reads and DRAM_Writes from the command line?
EDIT: After a bit of reading, I think a GPU could have either fb0/1... or fb... counters. But even if I have:
fb_subp0_read_sectors
fb_subp0_write_sectors
fb_subp1_read_sectors
fb_subp1_write_sectors
I get the warning:
NV_Warning: Counter 'fb_subp1_read_sectors' is not compatible with other selected counters and it cannot be profiled in this run.
NV_Warning: Counter 'fb_subp1_write_sectors' is not compatible with other selected counters and it cannot be profiled in this run.
Thanks,
Sayan
Not all counters can be profiled in a single run, due to hardware constraints.
According to the warning message, you may try profiling the first two counters in the first run, and then the last two in the second run.

Problem with conflict between mysql and math.h

The problem is that the compiler says that there is a redefinition of a function between a library that belongs to MySQL and math.h from the std library.
I have been over this for two days and I still can't figure it out.
Has this ever happened to anyone?
This is the output from the compiler
C:\mingw\bin\mingw32-make.exe all
'Building file: ../src/interfaz/ventanaconf.cpp'
'Invoking: GCC C++ Compiler'
C:\mingw\bin\mingw32-g++.exe -mms-bitfields -I"c:\dev-cpp\gtkmm\include\gtkmm-2.4"
-I"c:\dev-cpp\gtkmm\lib\gtkmm-2.4\include" -I"c:\dev-cpp\gtkmm\include\glibmm-2.4"
-I"c:\dev-cpp\gtkmm\lib\glibmm-2.4\include" -I"c:\dev-cpp\gtkmm\include\gdkmm-2.4"
-I"c:\dev-cpp\gtkmm\lib\gdkmm-2.4\include" -I"c:\dev-cpp\gtkmm\include\pangomm-1.4"
-I"c:\dev-cpp\gtkmm\include\atkmm-1.6" -I"c:\dev-cpp\gtkmm\include\sigc++-2.0"
-I"c:\dev-cpp\gtkmm\lib\sigc++-2.0\include" -I"c:\dev-cpp\gtkmm\include\cairomm-1.0"
-I"c:\gtk\include\gtk-2.0"
-I"c:\gtk\include\glib-2.0"
-I"c:\gtk\lib\glib-2.0\include"
-I"c:\gtk\lib\gtk-2.0\include"
-I"c:\gtk\include\pango-1.0"
-I"c:\gtk\include\cairo"
-I"c:\gtk\include\freetype2"
-I"c:\gtk\include"
-I"c:\gtk\include\atk-1.0"
-I"c:\Archivos de programa\MySQL\MySQL Server 5.0\include"
-O0 -g3 -w -c -fmessage-length=0 -MMD -MP -MF"src/interfaz/ventanaconf.d"
-MT"src/interfaz/ventanaconf.d"
-o"src/interfaz/ventanaconf.o" "../src/interfaz/ventanaconf.cpp"
In file included from c:/Archivos de programa/MySQL/MySQL Server 5.0/include/my_global.h:73,
from ../src/interfaz/../gestiondb/gestordb.h:6,
from ../src/interfaz/../gestiondb/operacionesdb.h:5,
from ../src/interfaz/ventanamodulos.h:20,
from ../src/interfaz/ventanaconf.h:27,
from ../src/interfaz/ventanaconf.cpp:1:
c:/Archivos de programa/MySQL/MySQL Server 5.0/include/config-win.h: In function `double rint(double)':
c:/Archivos de programa/MySQL/MySQL Server 5.0/include/config-win.h:228: error: redefinition of `double rint(double)'
C:/mingw/bin/../lib/gcc/mingw32/3.4.2/../../../../include/math.h:620: error: `double rint(double)' previously defined here
C:\mingw\bin\mingw32-make.exe: *** [src/interfaz/ventanaconf.o] Error 1
Thanks in advance!!!
This thread in the MySQL support area seems to indicate that they removed the definition of rint() from their config-win.h file as of April this year (even though the patch was proposed in 2006). Are you using a version of the MySQL source newer than that?
The problem was caused by an included library, which Linux simply ignores but which has to be left out on Windows. Leaving it out causes no problem on Linux either...
Some days I feel SOOOOOOOOOOOOOOOOOOOOOOO STUPID...
On line 228 of c:/Archivos de programa/MySQL/MySQL Server 5.0/include/config-win.h you should find a declaration/definition of a function named "rint". On line 620 of C:/mingw/bin/../lib/gcc/mingw32/3.4.2/../../../../include/math.h you should find another definition of a function with the same name (which probably even does the same thing).
To solve the problem, you will have to delete, comment out, or undefine one of these definitions.
Be prepared to get a similar problem when linking, if you also link two libraries with the same function.
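One way to "undefine" a duplicate without deleting it is to wrap it in a preprocessor guard. The sketch below only illustrates that idea: HAVE_RINT is a hypothetical configuration macro (it is not claimed to be what the MySQL headers use), and the guarded function is a simplified stand-in for a third-party copy of rint().

```cpp
#include <cassert>
#include <cmath>  // already provides rint(), like the MinGW math.h in the error log

// Hypothetical configuration macro: set to 1 when the standard library
// already ships rint(), so the local copy below is compiled out.
#define HAVE_RINT 1

#if !HAVE_RINT
// Simplified stand-in for a third-party header's own rint(). Compiling it
// alongside the standard declaration would conflict with it, much like the
// redefinition error in the log above.
inline double rint(double x) { return std::floor(x + 0.5); }
#endif

// Hypothetical caller: resolves to the single remaining definition.
double round_to_nearest(double x) { return std::rint(x); }
```

With the guard in place, only one definition of rint() is visible to the compiler, and callers are unaffected.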