GetLastError() returns 2 after CreateFile

A very basic question: I cannot understand why I get error 2 with the simplest example from the Windows System Programming book.
Here is the source:
#include <Windows.h>
#include <stdio.h>
#define BUF_SIZE 65536
int main(int argc, LPTSTR argv[]){
HANDLE fileIN, fileOUT;
DWORD nIn, nOut; // number of bytes READ/WRITTEN
CHAR buffer[BUF_SIZE];
if(argc!=3){
printf("Usage: %s file1 file2\n", argv[0]);
return 1;
}
//printf("%s\n",argv[0]);
//printf("%s\n",argv[1]);
//printf("%s\n",argv[2]);
fileIN = CreateFileW(argv[1],GENERIC_READ,FILE_SHARE_READ,NULL, OPEN_EXISTING,FILE_ATTRIBUTE_NORMAL,NULL);
if(fileIN == INVALID_HANDLE_VALUE){
printf("Invalid handle value, error %x\n", GetLastError());
return 2;
}
fileOUT = CreateFileW(argv[2], GENERIC_WRITE,0, NULL, CREATE_ALWAYS,FILE_ATTRIBUTE_NORMAL, NULL);
if(fileOUT == INVALID_HANDLE_VALUE){
printf("Invalid handle value, error %x\n", GetLastError());
return 3;
}
while(ReadFile(fileIN, buffer, BUF_SIZE, &nIn, NULL) && nIn > 0){
WriteFile(fileOUT, buffer, nIn, &nOut, NULL);
if(nIn != nOut){
printf("During copy, error %x\n", GetLastError());
return 4;
}
}
CloseHandle(fileIN);
CloseHandle(fileOUT);
return 0;
}
Can you help me, please? I'm using MS Visual C++ 2010 SP1.
Thank you all in advance.
N.

Forgive me if I am wrong, but I believe you are creating the files with the Unicode version of the function (CreateFileW) while passing it ANSI paths.
This is easy to test: change the trailing W to A, i.e. call CreateFileA.
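For what it's worth, error 2 is ERROR_FILE_NOT_FOUND, which fits this diagnosis: a W function reads its path argument as 16-bit units, so a narrow path is seen as a different name. Below is a small portable sketch (not Windows code) that mimics that reinterpretation. The helper name bytes_as_utf16_units is made up for illustration.

```cpp
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical helper: view the bytes of a narrow (char) string as 16-bit
// code units, roughly the way a W-suffixed API would read a char* path.
std::vector<std::uint16_t> bytes_as_utf16_units(const std::string& narrow) {
    std::vector<std::uint16_t> units(narrow.size() / 2);
    if (!units.empty()) {
        // Each 16-bit unit swallows two consecutive chars of the original.
        std::memcpy(units.data(), narrow.data(),
                    units.size() * sizeof(std::uint16_t));
    }
    return units;
}
```

Calling it on "in.txt" gives three units, none of which is the character 'i', so the file system would be asked for a file with a different name. Besides switching to CreateFileA, another option with MSVC is to declare wmain(int argc, wchar_t *argv[]) so that the arguments really are wide strings.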

cudaGetLastError. Which kernel execution raised it?

I have implemented a pipeline where many kernels are launched in a specific stream. The kernels are enqueued into the stream and executed when the scheduler decides it’s best.
In my code, after every kernel enqueue, I check for errors by calling cudaGetLastError which, according to the documentation, "returns the last error from a runtime call. This call may also return error codes from previous asynchronous launches". Thus, if the kernel has only been enqueued and not yet executed, I understand that the returned error only tells me whether the kernel was enqueued correctly (parameter checking, grid and block size, shared memory, etc.).
My problem is this: I enqueue many different kernels without waiting for each one to finish executing. Imagine now that I have a bug in one of my kernels (let's call it Kernel1) which causes an illegal memory access, for instance. If I check cudaGetLastError right after enqueuing it, the return value is success because it was enqueued correctly. So my CPU thread moves on and keeps enqueuing kernels to the stream. At some point Kernel1 is executed and raises the illegal memory access. The next time I check cudaGetLastError I will get the CUDA error but, by that time, the CPU thread is at a later point in the code. Consequently, I know there has been an error, but I have no idea which kernel raised it.
One option is to synchronize (block the CPU thread) until every kernel has finished executing and then check the error code, but this is not acceptable for performance reasons.
The question is, is there any way we can query which kernel raised a given error code returned by cudaGetLastError? If not, which is in your opinion the best way to handle this?
There is an environment variable, CUDA_LAUNCH_BLOCKING, which you can use to serialize kernel execution of an otherwise asynchronous sequence of kernel launches. This should allow you to isolate the kernel instance which is causing an error, either via internal error checking in your host code, or via an external tool like cuda-memcheck.
I have tested 3 different options:
1. Set the CUDA_LAUNCH_BLOCKING environment variable to 1. This blocks the CPU thread until each kernel has finished executing, so we can check for errors after every execution and catch the exact point of failure. It has an obvious performance impact, but it may help narrow down the bug in a production environment without requiring any change on the client side.
2. Distribute the production code compiled with the -lineinfo flag and, when an error shows up, run it again under cuda-memcheck. The flag has no performance impact on the production build, and no change is needed on the client side either. However, the binary has to be executed in a slightly different environment, which can be difficult in some cases, such as a service running GPU tasks.
3. Insert a callback after each kernel call. In the userData parameter, include a unique ID for the kernel call and possibly some information on the parameters used. This can be distributed directly in a production environment, always gives the exact point of failure, and requires no change on the client side. However, the performance impact of this approach is huge. Apparently the callback functions are processed by a driver thread, which causes the performance impact. I wrote some code to test it:
#include <cuda_runtime.h>
#include <vector>
#include <chrono>
#include <iostream>
#define BLOC_SIZE 1024
#define NUM_ELEMENTS (BLOC_SIZE * 32)
#define NUM_ITERATIONS 500
__global__ void KernelCopy(const unsigned int *input, unsigned int *result) {
unsigned int pos = blockIdx.x * BLOC_SIZE + threadIdx.x;
result[pos] = input[pos];
}
void CUDART_CB myStreamCallback(cudaStream_t stream, cudaError_t status, void *data) {
if (status) {
std::cout << "Error: " << cudaGetErrorString(status) << "-->";
}
}
#define CUDA_CHECK_LAST_ERROR cudaStreamAddCallback(stream, myStreamCallback, nullptr, 0)
int main() {
cudaError_t c_ret;
c_ret = cudaSetDevice(0);
if (c_ret != cudaSuccess) {
return -1;
}
unsigned int *input;
c_ret = cudaMalloc((void **)&input, NUM_ELEMENTS * sizeof(unsigned int));
if (c_ret != cudaSuccess) {
return -1;
}
std::vector<unsigned int> h_input(NUM_ELEMENTS);
for (unsigned int i = 0; i < NUM_ELEMENTS; i++) {
h_input[i] = i;
}
c_ret = cudaMemcpy(input, h_input.data(), NUM_ELEMENTS * sizeof(unsigned int), cudaMemcpyKind::cudaMemcpyHostToDevice);
if (c_ret != cudaSuccess) {
return -1;
}
unsigned int *result;
c_ret = cudaMalloc((void **)&result, NUM_ELEMENTS * sizeof(unsigned int));
if (c_ret != cudaSuccess) {
return -1;
}
cudaStream_t stream;
c_ret = cudaStreamCreate(&stream);
if (c_ret != cudaSuccess) {
return -1;
}
std::chrono::steady_clock::time_point start;
std::chrono::steady_clock::time_point end;
start = std::chrono::steady_clock::now();
for (unsigned int i = 0; i < NUM_ITERATIONS; i++) {
dim3 grid(NUM_ELEMENTS / BLOC_SIZE);
KernelCopy <<< grid, BLOC_SIZE, 0, stream >>> (input, result);
CUDA_CHECK_LAST_ERROR;
}
cudaStreamSynchronize(stream);
end = std::chrono::steady_clock::now();
std::cout << "With callback took (ms): " << std::chrono::duration<float, std::milli>(end - start).count() << '\n';
start = std::chrono::steady_clock::now();
for (unsigned int i = 0; i < NUM_ITERATIONS; i++) {
dim3 grid(NUM_ELEMENTS / BLOC_SIZE);
KernelCopy <<< grid, BLOC_SIZE, 0, stream >>> (input, result);
c_ret = cudaGetLastError();
if (c_ret) {
std::cout << "Error: " << cudaGetErrorString(c_ret) << "-->";
}
}
cudaStreamSynchronize(stream);
end = std::chrono::steady_clock::now();
std::cout << "Without callback took (ms): " << std::chrono::duration<float, std::milli>(end - start).count() << '\n';
c_ret = cudaStreamDestroy(stream);
if (c_ret != cudaSuccess) {
return -1;
}
c_ret = cudaFree(result);
if (c_ret != cudaSuccess) {
return -1;
}
c_ret = cudaFree(input);
if (c_ret != cudaSuccess) {
return -1;
}
return 0;
}
Output:
With callback took (ms): 47.8729
Without callback took (ms): 1.9317
(CUDA 9.2, Windows 10, Visual Studio 2015, Nvidia Tesla P4)
To me, in a production environment, the only valid approach is number 2.

CUDA program returning incorrect results

I have copied a vector addition example from the book "CUDA By Example" and I am getting incorrect results. Here is my code:
#define N (33*1024)
__global__
void add(int *a, int *b,int *c){
int tid = threadIdx.x+blockIdx.x*blockDim.x;
while (tid < N){
c[tid] = a[tid]+b[tid];
tid+=blockDim.x*gridDim.x;
}
}
int main()
{
int a[N], b[N], c[N];
int *dev_a, *dev_b, *dev_c;
cudaMalloc((void**)&dev_a,N*sizeof(int));
cudaMalloc((void**)&dev_b,N*sizeof(int));
cudaMalloc((void**)&dev_c,N*sizeof(int));
for(int i = 0 ; i<N;i++){
a[i]= -i;
b[i]= i*i;
}
cudaMemcpy(dev_a,a,N*sizeof(int),cudaMemcpyHostToDevice);
cudaMemcpy(dev_b,b,N*sizeof(int),cudaMemcpyHostToDevice);
cudaMemcpy(dev_c,c,N*sizeof(int),cudaMemcpyHostToDevice);
add<<<128,128>>>(dev_a,dev_b,dev_c);
cudaMemcpy(c,dev_c, N*sizeof(int), cudaMemcpyDeviceToHost);
bool success=true;
//print results
for(int i=0; i<N;i++){
if((a[i]+b[i])!=c[i]){
printf("Error: %d + %d != %d\n",a[i],b[i],c[i]);
success=false;
}
}
if(success) printf("we did it!\n");
cudaFree(dev_a);
cudaFree(dev_a);
cudaFree(dev_a);
printf("done");
return EXIT_SUCCESS;
}
and I am getting a bunch of incorrect addition results. Here are just a few:
Error: -33784 + 1141358656 != 255
Error: -33785 + 1141426225 != 0
Error: -33786 + 1141493796 != 0
Error: -33787 + 1141561369 != 0
Error: -33788 + 1141628944 != 4609792
Error: -33789 + 1141696521 != 0
Error: -33790 + 1141764100 != 4207408
and there are many many more. I am a complete CUDA beginner but my guess is I either
A) copied the code incorrectly from the book OR
B) the incorrect results come from the fact that I am using CUDA 10 which came out long after this book was written
EDIT: I restarted my computer and it worked
I happen to be able to repeat your error if I alter my configuration. Something is probably wrong with your configuration, too. When I used fitting CUDA and driver versions it worked after fixing a minor typo:
cudaFree(dev_a); //this line is copied three times in your code
Please wrap your CUDA calls in something like the following to check their return values. One of the CUDA functions must have failed.
#define CUDA_CHECK_RETURN(value) { \
cudaError_t _m_cudaStat = value; \
if (_m_cudaStat != cudaSuccess) { \
fprintf(stderr, "Error %s at line %d in file %s\n", \
cudaGetErrorString(_m_cudaStat), __LINE__, __FILE__); \
exit(1); \
} }
//for example
CUDA_CHECK_RETURN(cudaMemcpy(c,dev_c, N*sizeof(int), cudaMemcpyDeviceToHost));
It should tell you what might go wrong.

How do I open a URL from C++?

How can I open a URL from my C++ program?
In ruby you can do
%x(open https://google.com)
What's the equivalent in C++? I wonder if there's a platform-independent solution. But if there isn't, I'd like the Unix/Mac better :)
Here's my code:
#include <stdio.h>
#include <string.h>
#include <fstream>
int main (int argc, char *argv[])
{
char url[1000] = "https://www.google.com";
std::fstream fs;
fs.open(url);
fs.close();
return 0;
}
Your question may mean two different things:
1.) Open a web page with a browser.
#include <windows.h>
#include <shellapi.h>
...
ShellExecute(0, 0, L"http://www.google.com", 0, 0 , SW_SHOW );
This should work: it opens the target with its associated program, which for a URL is normally the default web browser.
2.) Get the code of a webpage and you will render it yourself or do some other thing. For this I recommend to read this or/and this.
I hope it's at least a little helpful.
EDIT: I did not notice that you are asking about UNIX; this only works on Windows.
Use libcurl, here is a simple example.
EDIT: If this is about starting a web browser from C++, you can invoke a shell command with system on a POSIX system:
system("<mybrowser> http://google.com");
By replacing <mybrowser> with the browser you want to launch.
Here's an example in windows code using winsock.
#include <winsock2.h>
#include <windows.h>
#include <iostream>
#include <string>
#include <locale>
#pragma comment(lib,"ws2_32.lib")
using namespace std;
string website_HTML;
locale local;
void get_Website(char *url );
int main ()
{
//open website
get_Website("www.google.com" );
//format website HTML
for (size_t i=0; i<website_HTML.length(); ++i)
website_HTML[i]= tolower(website_HTML[i],local);
//display HTML
cout <<website_HTML;
cout<<"\n\n";
return 0;
}
//***************************
void get_Website(char *url )
{
WSADATA wsaData;
SOCKET Socket;
SOCKADDR_IN SockAddr;
int lineCount=0;
int rowCount=0;
struct hostent *host;
char *get_http= new char[256];
memset(get_http,' ', sizeof(get_http) );
strcpy(get_http,"GET / HTTP/1.1\r\nHost: ");
strcat(get_http,url);
strcat(get_http,"\r\nConnection: close\r\n\r\n");
if (WSAStartup(MAKEWORD(2,2), &wsaData) != 0)
{
cout << "WSAStartup failed.\n";
system("pause");
//return 1;
}
Socket=socket(AF_INET,SOCK_STREAM,IPPROTO_TCP);
host = gethostbyname(url);
SockAddr.sin_port=htons(80);
SockAddr.sin_family=AF_INET;
SockAddr.sin_addr.s_addr = *((unsigned long*)host->h_addr);
cout << "Connecting to "<< url<<" ...\n";
if(connect(Socket,(SOCKADDR*)(&SockAddr),sizeof(SockAddr)) != 0)
{
cout << "Could not connect";
system("pause");
//return 1;
}
cout << "Connected.\n";
send(Socket,get_http, strlen(get_http),0 );
char buffer[10000];
int nDataLength;
while ((nDataLength = recv(Socket,buffer,10000,0)) > 0)
{
int i = 0;
while (i < nDataLength && (buffer[i] >= 32 || buffer[i] == '\n' || buffer[i] == '\r'))
{
website_HTML+=buffer[i];
i += 1;
}
}
closesocket(Socket);
WSACleanup();
delete[] get_http;
}
I was having the exact same problem in Windows.
I noticed that in OP's gist, he uses string("open ") in line 21, however, by using it one comes across this error:
'open' is not recognized as an internal or external command
After researching, I found that open is the default macOS command for opening things. It is different on Windows and Linux:
Linux: xdg-open <URL>
Windows: start <URL>
For those of you that are using Windows, as I am, you can use the following:
std::string op = std::string("start ").append(url);
system(op.c_str());
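The per-platform commands listed above can be folded into one helper. This is only a sketch: the function name open_command is made up, the platform is picked at compile time with the usual predefined macros, and the URL is assumed to be trusted, since it ends up in a shell command line.

```cpp
#include <string>

// Hypothetical helper: pick the "open with the default handler" command for
// the platform this file is compiled on and append the quoted URL.
std::string open_command(const std::string& url) {
#if defined(_WIN32)
    // The empty "" is the window title that `start` expects before a quoted argument.
    return "start \"\" \"" + url + "\"";
#elif defined(__APPLE__)
    return "open \"" + url + "\"";
#else
    return "xdg-open \"" + url + "\"";
#endif
}
```

The result would then be handed to the shell, e.g. std::system(open_command("https://google.com").c_str()) with <cstdlib> included. The double quotes keep characters such as & in a query string from being split by the shell, but they are not a defence against hostile input.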
I've had MUCH better luck using ShellExecuteA(). I've heard that there are a lot of security risks when you use "system()". This is what I came up with for my own code.
void SearchWeb( string word )
{
string base_URL = "http://www.bing.com/search?q=";
string search_URL = "dummy";
search_URL = base_URL + word;
cout << "Searching for: \"" << word << "\"\n";
ShellExecuteA(NULL, "open", search_URL.c_str(), NULL, NULL, SW_SHOWNORMAL);
}
P.S. It uses the WinAPI, if I'm correct, so it's not a multiplatform solution.
There are already answers for Windows. On Linux, I noticed that open https://www.google.com always launches the browser from the shell, so you can try:
system("open https://your.domain/uri");
that is to say:
system(("open "s + url).c_str()); // c++
https://linux.die.net/man/1/open
C isn't as high-level as the scripting language you mention. But if you want to stay away from socket-based programming, try Curl. Curl is a great C library and has many features. I have used it for years and always recommend it. It also includes some stand alone programs for testing or shell use.
For linux environments, you can use xdg-open. It is installed by default on most distributions. The benefit over the accepted answer is that it opens the user's preferred browser.
$ xdg-open https://google.com
$ xdg-open steam://run/10
Of course you can wrap this in a system() call.
Create a function and copy into it the winsock code already posted by Software_Developer.
For instance:
#ifdef _WIN32
// this is required only for windows
if (WSAStartup(MAKEWORD(2,2), &wsaData) != 0)
{
//...
}
#endif
// winsock code here
#ifdef _WIN32
WSACleanup();
#endif

How to use Tcl APIs in C code

I want to use some of the functionality (APIs) of my Tcl code from another C source file, but I cannot work out how to do that, especially how to link them. To try it out I have taken a very simple Tcl script containing one procedure that adds two numbers and prints the sum. Can anybody tell me how I can call this Tcl code to get the sum? How can I write a C wrapper that will call it? Below is the sample Tcl program that I am using:
#!/usr/bin/env tclsh8.5
proc add_two_nos { } {
set a 10
set b 20
set c [expr { $a + $b } ]
puts " c is $c ......."
}
To evaluate a script from C code, use Tcl_Eval() or one of its close relatives. In order to use that API, you need to link in the Tcl library, initialize the Tcl library and create an interpreter to hold the execution context. Plus you really ought to do some work to retrieve the result and print it out (printing script errors out is particularly important, as that helps a lot with debugging!)
Thus, you get something like this:
#include <tcl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main(int argc, char **argv) {
Tcl_Interp *interp;
int code;
char *result;
Tcl_FindExecutable(argv[0]);
interp = Tcl_CreateInterp();
code = Tcl_Eval(interp, "source myscript.tcl; add_two_nos");
/* Retrieve the result... */
result = Tcl_GetString(Tcl_GetObjResult(interp));
/* Check for error! If an error, message is result. */
if (code == TCL_ERROR) {
fprintf(stderr, "ERROR in script: %s\n", result);
exit(1);
}
/* Print (normal) result if non-empty; we'll skip handling encodings for now */
if (strlen(result)) {
printf("%s\n", result);
}
/* Clean up */
Tcl_DeleteInterp(interp);
exit(0);
}
I think I have solved it. You were correct: the problem was with the way I was including the headers. My C code includes tcl.h, tclDecls.h and tclPlatDecls.h, but these files did not exist in /usr/include, so I was copying them into that directory, which may not have been the proper way to do it. In the end I did not copy the files to /usr/include and instead gave the include path when compiling. I built the executable and it gives the proper result on the terminal. Thanks for your help.
Here is the exact C code I am using:
#include <tcl.h>
#include <tclDecls.h>
#include <tclPlatDecls.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main (int argc, char **argv) {
Tcl_Interp *interp;
int code;
char *result;
printf("inside main function \n");
// Tcl_InitStubs(interp, "8.5", 0);
Tcl_FindExecutable(argv[0]);
interp = Tcl_CreateInterp();
code = Tcl_Eval(interp, "source simple_addition.tcl; add_two_nos");
/* Retrieve the result... */
result = Tcl_GetString(Tcl_GetObjResult(interp));
/* Check for error! If an error, message is result. */
if (code == TCL_ERROR) {
fprintf(stderr, "ERROR in script: %s\n", result);
exit(1);
}
/* Print (normal) result if non-empty; we'll skip handling encodings for now */
if (strlen(result)) {
printf("%s\n", result);
}
/* Clean up */
Tcl_DeleteInterp(interp);
exit(0);
}
To compile this code and generate the executable, I am using the command below:
gcc simple_addition_wrapper_new.c -I/usr/include/tcl8.5/ -ltcl8.5 -o simple_addition_op
I executed the file simple_addition_op and got the result below, which is correct:
inside main function
c is 30 .......
My special thanks to Donal Fellows and Johannes

assignment makes pointer from integer without a cast

Hello all, I have written a C program which connects to a MySQL server and executes an SQL query read from a text file that contains only one query.
#include <mysql.h>
#include <stdio.h>
main() {
MYSQL *conn;
MYSQL_RES *res;
MYSQL_ROW row;
char *server = "127.0.0.1";
char *user = "root";
char *password = "PASSWORD"; /* set me first */
char *database = "har";
conn = mysql_init(NULL);
char ch, file_name[25];
char *ch1;
FILE *fp;
printf("Enter the name of file you wish to see ");
gets(file_name);
fp = fopen(file_name,"r"); // read mode
if( fp == NULL )
{
perror("Error while opening the file.\n");
exit(0);
}
while( ( ch = fgetc(fp) ) != EOF )
printf("%c",ch);
ch1=ch;
/* Connect to database */
if (!mysql_real_connect(conn, server,
NULL , NULL, database, 0, NULL, 0)) {
fprintf(stderr, "%s\n", mysql_error(conn));
exit(0);
}
printf("%c",ch);
/* send SQL query */
if (mysql_query(conn, ch1)) {
fprintf(stderr, "%s\n", mysql_error(conn));
exit(0);
}
res = mysql_use_result(conn);
/* output table name */
printf("MySQL Tables in mysql database:\n");
while ((row = mysql_fetch_row(res)) != NULL)
printf("%s \n", row[0]);
/* close connection */
mysql_free_result(res);
mysql_close(conn);
fclose(fp);
}
I am unable to understand where I have gone wrong.
Thanks in advance.
This is the line causing the problem:
ch1=ch;
ch1 is a pointer to a character, whereas ch is a character.
Do you intend to store the bytes read from fp in a char array pointed to by ch1? What you are doing is reading a character with fgetc on every iteration of the while loop, storing it in ch, and printing it.
Then, when the while loop is over, you are assigning a char to a char pointer. I am not sure what you are trying to do with this, but it definitely causes the problem.
You're going wrong in a lot of ways:
You don't declare the return type or arguments for main.
You're using gets. Never ever use gets, don't even think about it. Use fgets instead.
fgetc returns an int, not a char so your ch should be an int. You won't be able to recognize EOF until you fix this.
You're declaring char ch and char *ch1 but assigning ch to ch1. That's where the error in your title is coming from.
Your code appears to be trying to feed your SQL to MySQL one byte at a time, and that's not going to do anything useful. I think you mean to use fgets to read the SQL file one line at a time so that you can feed each line to MySQL as a single SQL statement.
You should spend some time reading about your compiler's warning switches.
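As a rough illustration of the fgetc point above, here is one way the file could be read into a single buffer before it is sent to the server. It is a sketch only: the helper name read_query_file is hypothetical, and it is written in C++ (std::string) to keep the buffer handling short.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper: read the whole file into `out`.
// Returns false if the file cannot be opened.
bool read_query_file(const char* path, std::string& out) {
    std::FILE* fp = std::fopen(path, "r");
    if (fp == nullptr) {
        return false;
    }
    out.clear();
    int ch;  // int, not char, so EOF stays distinguishable from a data byte
    while ((ch = std::fgetc(fp)) != EOF) {
        out.push_back(static_cast<char>(ch));
    }
    std::fclose(fp);
    return true;
}
```

The collected text can then be passed in one go, e.g. mysql_query(conn, sql.c_str()), instead of assigning a single char to a char pointer.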