Why do Intel C++ Compiler 17.0 and previous versions crash while trying to compile this very simple code employing a unique_ptr? - stl

I've tried to compile the latest version of a library developed by Intel (https://github.com/embree/) using Microsoft Visual Studio 2017 with Intel C++ Compiler 17.0 under Windows 7.
I've come across a compilation error that prevented me from going further. I managed to isolate a very simple test case on which this combination of compiler and build environment systematically fails. The sample itself is pretty useless:
#include <memory>
int main(int argc, char ** argv)
{
    std::unique_ptr<int> ptr;
}
Here is what Visual Studio 2017 outputs:
1>------ Build started: Project: Test2000, Configuration: Debug x64 ------
1>test2000.cpp
1>test2000.cpp(5): error : access violation
1> std::unique_ptr<int> ptr;
1> ^
1>
1>compilation aborted for test2000.cpp (code 4)
1>Done building project "Test2000.vcxproj" -- FAILED.
========== Build: 0 succeeded, 1 failed, 0 up-to-date, 0 skipped ==========
I found this quite peculiar, tried it on gcc.godbolt.org, and was surprised to be able to reproduce the same behaviour in a gcc environment with Intel C++ Compiler 13, 16 and 17 (you can test it yourself here: https://godbolt.org/g/3SXxuH).
ICC's output under a GCC environment is more detailed and basically boils down to this error:
/opt/compiler-explorer/gcc-7.2.0/bin/../include/c++/7.2.0/bits/move.h(48): error: identifier "__builtin_addressof" is undefined
{ return __builtin_addressof(__r); }
My guess is that there is some kind of incompatibility between the latest version of the STL headers and Intel C++ Compiler 17.0 (and previous versions too). A quick Google search yielded only this forum post, which has no answer:
https://www.cpume.com/question/fgshnonz-intel-icc-compile-c-code-cause-error.html
Has anyone run into this problem before and found a solution (other than switching to Intel C++ Compiler 18.0, of course)?

Related

cudaMalloc hang when building x64 version binary [duplicate]

My simple CUDA hello-world application runs fine when built in 32-bit using Visual Studio 2015 Community on Windows 10. However, if I build it in 64-bit, the kernel is never executed.
GPU: Tesla K40c
Toolkit: CUDA 8
Operating System: Windows 10 64-bit
Visual Studio: Community edition
There is no error message in the output console.
#include "cuda_runtime.h"
#include "device_launch_parameters.h"
#include <stdio.h>

__global__ void welcome() {
    printf("Hello world :)");
}

int main() {
    welcome<<<1, 1>>>();
    cudaDeviceSynchronize();
    return 0;
}
I faced the same issue and opened a bug with NVIDIA (#1855074).
They reproduced it successfully and I'm waiting for an update on it.
One thing is sure: it's on their side.
The only workaround I found was to put my card in WDDM mode via nvidia-smi, which broke my taskbar.
I recommend waiting for a fix.
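For reference, the driver-model switch mentioned above is done with nvidia-smi's -dm option (run from an elevated prompt; a reboot is required, and exact option semantics may vary by driver version):

```shell
# List GPUs and their indices
nvidia-smi -L
# Switch GPU 0 to the WDDM driver model (0 = WDDM, 1 = TCC), then reboot
nvidia-smi -i 0 -dm 0
```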

cuda vs2013 v120xp compile error

I'm using VS2013 (Update 4) + CUDA 6.5 + Win7 32-bit.
My CUDA program compiles fine without the v120_xp option, but I need it to support Windows XP. With v120_xp specified, there are lots of compile errors.
To reproduce the problem:
Create a new project with VS2013's CUDA wizard
Change the Platform Toolset to Visual Studio 2013 - Windows XP (v120_xp)
Compile
The compile error looks like:
1>G:\vs2013\VC\include\yvals.h(666): error : expected a ";"
1>G:\vs2013\VC\include\yvals.h(667): error : expected a ";"
1>G:\vs2013\VC\include\exception(460): error : "explicit" is not allowed
1> kernel.cu
I also compiled the program with CMake, and everything is OK (with v120_xp). Though I write code with CMake, my company uses VS2013, so I need to generate a VS2013 project for my colleagues.
How can I make it compile? Thanks.
Finally, a workaround for this:
Don't change the Platform Toolset; leave it as v120, and add /SUBSYSTEM:WINDOWS,5.01 or /SUBSYSTEM:CONSOLE,5.01 manually in the linker's Command Line settings.

cuda 6 unified memory segmentation fault

In order to use the unified memory feature in CUDA 6, the following requirements must be met:
a GPU with SM architecture 3.0 or higher (Kepler class or newer)
a 64-bit host application and operating system, except on Android
Linux or Windows
My setup is,
System: Ubuntu 13.10 (64-bit)
GPU: GTX770
CUDA: 6.0
Driver Version: 331.49
The sample code is taken from page 210 of the programming guide.
#include <stdio.h>

__device__ __managed__ int ret[1000];

__global__ void AplusB(int a, int b) {
    ret[threadIdx.x] = a + b + threadIdx.x;
}

int main() {
    AplusB<<<1, 1000>>>(10, 100);
    cudaDeviceSynchronize();
    for (int i = 0; i < 1000; i++)
        printf("%d: A+B = %d\n", i, ret[i]);
    return 0;
}
The nvcc compile options I used are:
nvcc -m64 -Xptxas=-Werror -arch=compute_30 -code=sm_30 -o UM UnifiedMem.cu
This code compiles perfectly fine. During execution, it produces a segmentation fault at the printf(). It feels like the unified memory feature didn't take effect: the address of the variable ret is still a GPU address, but printf is called on the CPU, so the CPU is trying to access data that is not accessible to it, which produces the segmentation fault. Can anybody help me? What is wrong here?
Though I am not certain (and I can't check it for myself right now), I think it is because Ubuntu 13.10 ships gcc 4.8.1, which I believe is not supported yet even in the newest CUDA Toolkit 6.0. Try to compile your code with gcc 4.7.3 as the host compiler (that is, the same one that is included by default in the officially supported Ubuntu 13.04). For that you can install the gcc-4.7 package and point nvcc at /usr/bin/gcc-4.7 as its host compiler. For C++ support I believe you need g++-4.7 as well.
If you need a simple step-by-step guide, you might follow http://n00bsys0p.co.uk/blog/2014/01/23/nvidia-cuda-55ubuntu-1310-saucy-salamander. It's written for CUDA Toolkit 5.5, but it should be relevant for recent versions as well.
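For reference, the host compiler can be selected with nvcc's -ccbin flag; a sketch, assuming Ubuntu's gcc-4.7 package and its usual install path:

```shell
# Install the older host compilers (Ubuntu package names)
sudo apt-get install gcc-4.7 g++-4.7
# Tell nvcc to use gcc-4.7 as the host compiler
nvcc -m64 -ccbin /usr/bin/gcc-4.7 -arch=compute_30 -code=sm_30 -o UM UnifiedMem.cu
```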

exception (first chance) ... cudaError_enum at memory

So I am working on a project which is spitting out that error; some research showed that the problem lies with the cuBLAS library.
So now I have the following "minimal" problem:
I opened the simpleCUBLAS example from the NVIDIA CUDA SDK (4.2) to test whether I can reproduce the problem.
The program itself works, but VS2010 gives me a similar output:
First-chance exception at 0x75e3c41f in simpleCUBLAS.exe: Microsoft C++ exception: cudaError_enum at memory location 0x003bf704..
(This message appears 7 times.)
So, to my specs:
I use a GTX 460 for computing, compile with sm_20, and use VS2010 on Windows 7 64-bit.
nvcc --version gives me:
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2011 NVIDIA Corporation
Built on Fri_Jan_13_01:18:37_PST_2012
Cuda compilation tools, release 4.1, V0.2.1221
This is my first time posting here, so I apologize for the horrible formatting.
The observation you are making has to do with an exception that is caught and handled properly within the CUDA libraries. It is, in some cases, a normal part of CUDA GPU operation. As you have observed, your application returns no API errors and runs correctly. If you were not within the VS environment that can report this, you would not observe this at all.
This is considered normal behavior under CUDA. I believe there were some attempts to eliminate it in CUDA 5.5. You might wish to try that, although it's not considered an issue either way.

CUDA Toolkit 4.1/4.2: nvcc Crashes with an Access Violation

I am developing a CUDA application for GTX 580 with Visual Studio 2010 Professional on Windows 7 64bit. My project builds fine with CUDA Toolkit 4.0, but nvcc crashes when I choose CUDA Toolkit 4.1 or 4.2 with the following error:
1> Stack dump:
1> 0. Running pass 'Promote Constant Global' on module 'moduleOutput'.
1>CUDACOMPILE : nvcc error : 'cicc' died with status 0xC0000005 (ACCESS_VIOLATION)
Strangely enough, the program compiles OK with "compute_10,sm_10" specified for "Code Generation", but "compute_20,sm_20" does not work. The code in question can be downloaded here:
http://www.meriken2ch.com/files/CUDA_SHA-1_Tripper_MERIKENs_Branch_0.04_Alpha_1.zip
(README.txt is in Japanese, but comments in source files are in English.)
I am suspecting a newly introduced bug in CUDA Toolkit 4.1/4.2. Has anybody encountered this issue? Is there any workaround for it? Any kind of help will be much appreciated.
This appears to have been a compiler bug in CUDA 4.x that is fixed in CUDA 5.0 (according to a comment from @meriken2ch, the project builds fine with CUDA 5.0 RC).