How do I profile a CUDA project using Compute Visual Profiler without an executable? - cuda

I have Visual Studio 2010 installed with Nsight for VS. Projects compile and run just fine (although IntelliSense still does not work, even though I spent many hours on it and went through all the tutorials).
Now I would like to use Compute Visual Profiler (Nsight), for which I need an executable. But when I look in the Debug or Release directory after running the project, I can't find an exe file there. How do I create it?
I am able to run Compute Visual Profiler on this project by pointing it at the Debug directory, but then I get this error:
Program run #18 completed.
Error :
Application : "C:(...)dot_product.exe.intermediate.manifest".
Profiler data file 'C:/(...)temp_compute_profiler_0_0.csv' for application run 0 not found.
So my question is: how can I profile my project?

Related

Problem: Could not load file or assembly 'CrystalDecisions.Web, Version=13.0.3500.0

Using Visual Studio 2010, I am building reports on my local computer with SAP Crystal Reports. Everything works very well locally.
However, when I publish the project and upload it to a high-security remote host, I get the error shown below.
Am I making a coding mistake, is my Crystal Reports version wrong, or is this a problem on the hosting company's side?
Thank you.
Configuration Error
Parser Error Message: Could not load file or assembly 'CrystalDecisions.Web, Version=13.0.3500.0, Culture=neutral, PublicKeyToken=692fbea5521e1304' or one of its dependencies. The system cannot find the file specified.
Source Error:
An application error occurred on the server. The current custom error settings for this application prevent the details of the application error from being viewed remotely (for security reasons). It could, however, be viewed by browsers running on the local server machine.
Source File: D:\IISDIRS\web.config Line: 29
Assembly Load Trace: The following information can be helpful to determine why the assembly 'CrystalDecisions.Web, Version=13.0.3500.0, Culture=neutral, PublicKeyToken=692fbea5521e1304' could not be loaded.
The machine has a Crystal runtime engine that is incompatible with the way your application was built (the Crystal references in your application).
Reference/Version=13.0.2000.0 works with Crystal runtimes 0 to 24
Reference/Version=13.0.3500.0 works with Crystal runtimes 21 to 25
Reference/Version=13.0.4000.0 works with Crystal runtimes 26 and later (for now)
For example, if your Crystal references in your project show 13.0.2000.0, then you need to install Crystal runtime engine SP20.
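The version ranges listed above can be turned into a small lookup, which may help when checking a server before deployment. This is only a sketch: the ranges are copied as stated in this answer, and the names `RUNTIME_RANGES` and `runtime_is_compatible` are hypothetical, not part of any Crystal Reports API.

```python
# Compatibility ranges copied from the answer above.
# None means no upper bound was stated ("and later").
RUNTIME_RANGES = {
    "13.0.2000.0": (0, 24),
    "13.0.3500.0": (21, 25),
    "13.0.4000.0": (26, None),
}


def runtime_is_compatible(reference_version, service_pack):
    """Return True if the runtime service pack is in the range listed for the reference version."""
    if reference_version not in RUNTIME_RANGES:
        raise KeyError(f"no range listed for reference version {reference_version}")
    low, high = RUNTIME_RANGES[reference_version]
    return service_pack >= low and (high is None or service_pack <= high)
```

For the error in the question (references at 13.0.3500.0), `runtime_is_compatible("13.0.3500.0", 20)` is False under these ranges, so a server with an SP20 runtime would not match.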
If you need to remove a Crystal runtime engine service pack and install another, this link has great instructions for cleaning it up:
https://help.jeff-net.com/knowledgebase/article/uninstalling-a-crystal-runtime-service-pack-manually-removing-the-crystal-runtime-engine
Please be careful removing Crystal runtime engines, too. OTHER software may be dependent on a specific version. For example, SAGE apps only work with SP21 currently.

Building mxnet for windows (both cpu and gpu mode) - Running into errors

Mxnet is supposed to build and run, on CPU as well as on GPU, for multiple OSs including Windows.
I'm trying to build mxnet from source on Windows Server 2016 that has NVIDIA K80 GPU on it.
I followed all the instructions in https://mxnet.incubator.apache.org/get_started/windows_setup.html but I am not able to get past building mxnet in Visual Studio 2013.
The error I'm seeing is
'mshadow::cuda::AddTakeGrad' : ambiguous call to overloaded function indexing_op.h
If I fix this generic call to AddTakeGrad to make it a specific call to mshadow::cuda::, then some other polymorphic function ends up with the same error and so on ...
I tried searching a lot to find if anyone was successful in building mxnet for windows (on both cpu mode and gpu mode) but couldn't find any.
Question: Has anyone been able to successfully build mxnet on Windows? If so, could you help with this error as well as any specific instructions to get it to build for both cpu mode as well as gpu mode?
These days it should be possible to just pip install.

Debugging Windows 10 Mobile App on a Device

I created a new UWP app in VS2015, but when I try to debug the app on a device I get the error below:
Error : DEP6400 : Failed to deploy. Make sure another deployment or debugging session is not in progress for the same emulator or device from a different instance of Visual Studio: Error writing file '%FOLDERID_SharedData%\PhoneTools\11.0\Debugger\bin\RemoteDebugger\vbdebug.dll'. Error 0x80072736: An operation was attempted on something that is not a socket.
Below is my system spec:
Windows 10 PC Build: 10240,
Visual Studio 2015, UWP SDK: 10586,
Package Manifest Target Version : 10586
Package Manifest Min Version : 10240
Windows 10 Mobile Build on Device: 10.0.10586.71,
Please help me understand what this error means and how to resolve it.
Zee,
Please check the following before running the app.
You may already have Visual Studio open with the app running, and then opened a second Visual Studio instance and tried to run the app again.
The problem is that only one Visual Studio instance can deploy to or debug on the device at a time. While the first instance is running the app, launching the project from a second instance will fail.
Try restarting your machine and running the project again.
If you would rather not restart, open Task Manager and close all running Visual Studio instances.
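If looking through Task Manager by hand is inconvenient, the same check can be sketched in a few lines of Python. This assumes the process list comes from `tasklist /FO CSV /NH` and that Visual Studio runs as `devenv.exe`; the function name and the sample rows are hypothetical and only for illustration.

```python
import csv
import io


def count_visual_studio_instances(tasklist_csv):
    """Count rows whose image name is devenv.exe in `tasklist /FO CSV /NH` output."""
    reader = csv.reader(io.StringIO(tasklist_csv))
    return sum(1 for row in reader if row and row[0].lower() == "devenv.exe")


# Hypothetical sample output, for illustration only.
sample = (
    '"devenv.exe","4120","Console","1","312,440 K"\n'
    '"explorer.exe","2988","Console","1","95,120 K"\n'
    '"devenv.exe","7316","Console","1","287,004 K"\n'
)
print(count_visual_studio_instances(sample))  # prints 2
```

On a real machine the input could come from `subprocess.run(["tasklist", "/FO", "CSV", "/NH"], capture_output=True, text=True).stdout`. A count above one suggests that a second instance may be holding the deployment session.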

Developing using CUDA on several computers, when only one has a GPU installed

I am a Java developer. To speed up some of our algorithms, we have decided to try CUDA.
The issue is that we currently have only one server with a GPU installed, and three developers have to work on it (by transferring files over ssh each time, then compiling and running them there). This is obviously a tedious process.
What I would like to know is: on my machine, which does not have a GPU, can I use Nsight to work on CUDA code, compiling and generating files locally? These could then be transferred automatically to the server to get the result.
If we could at least work on the algorithm locally using Nsight (or any other IDE) rather than plain vim, and compile it locally to catch compile-time errors, this would save quite some time.
On Linux you can do remote debugging using Nsight Eclipse Edition as documented here. This requires 5.5 or later. On Windows you need to start the Nsight monitor on the server and then just configure Nsight Visual Studio Edition to use the remote machine.

CUDA Visual Profiler doesn't generate timeline

I'm trying to determine where a slowdown is occurring in my GPU code. I've verified that the code runs correctly on its own (it doesn't throw any errors, outputs are correct, finishes cleanly, etc). When I try to profile the code in Visual Profiler, it seems to run normally, dumping correct intermediate outputs to stdout. The GPU is being used (I've checked with cuda-gdb and dumping printf()s from inside my kernels). Once all the code has completed, Visual Profiler reports that viper has terminated the executable. However, no timeline is generated. Instead, the main window shows 0, 10, 20, 25 microseconds all "collapsed" on top of one another. When I tell the Visual Profiler to run all analysis options, it proceeds through the 24 runs without problems, but still no timeline is generated.
I'm using CUDA 4.2, driver version 295.41 on Ubuntu x86_64 with a GeForce 460.
When the visual profiler fails to generate a timeline it is typically because it cannot locate a component required for profiling. This component is a shared library found in /usr/local/cuda/lib64 called libcuinj.so. Is that path on your LD_LIBRARY_PATH? How are you launching the Visual Profiler? The script in /usr/local/cuda/bin/nvvp should set the path correctly for you.
The 4.2 version of the visual profiler does not do a good job of reporting errors when this shared library is not found. The upcoming 5.0 version of the visual profiler has much better error reporting in this regard.
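The LD_LIBRARY_PATH question above can be checked with a short script. This is a minimal sketch, assuming the library is named exactly `libcuinj.so` as stated in this answer; the helper name `find_library_dirs` is hypothetical, and versioned file names are not matched.

```python
import os


def find_library_dirs(search_path, library_name="libcuinj.so"):
    """Return the entries of an os.pathsep-separated search path that contain library_name."""
    hits = []
    for entry in search_path.split(os.pathsep):
        # Skip empty entries produced by leading, trailing, or doubled separators.
        if entry and os.path.isfile(os.path.join(entry, library_name)):
            hits.append(entry)
    return hits


if __name__ == "__main__":
    dirs = find_library_dirs(os.environ.get("LD_LIBRARY_PATH", ""))
    if dirs:
        print("libcuinj.so found in:", ", ".join(dirs))
    else:
        print("libcuinj.so not found on LD_LIBRARY_PATH")
```

Run it from the same shell used to launch the Visual Profiler. If nothing is found, adding /usr/local/cuda/lib64 to LD_LIBRARY_PATH or launching through the /usr/local/cuda/bin/nvvp script, as suggested above, is the next thing to try.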
I don't know if it's the same under Linux, but in Nsight under Windows there are two basic types of profiling that you can run: "Application trace" and "Profile". You only get timelines under Application trace, which records the timestamps at which CUDA and kernel calls were made. The Profile setting offers options to analyze the kernels: it reads the hardware counters from the GPU and generates performance information for one or more kernels (and no timelines).