Allocating an array of strings in CUDA

Let us assume that we have the following strings that we need to store in a CUDA array.
"hi there"
"this is"
"who is"
How do we declare an array on the GPU to hold these? I tried using C++ strings, but that does not work.

Probably the best way to do this is to use a structure similar to common compressed sparse matrix formats. Store the character data packed into a single piece of linear memory, then use a separate integer array to store the starting indices, and perhaps a third array to store the string lengths. Storing the lengths might be more efficient than storing a string termination byte for every entry in the data and trying to parse for the terminator inside the GPU code.
So you might have something like this:
struct gpuStringArray {
    unsigned int * pos;
    unsigned int * length; // could be a smaller type if strings are short
    char4 * data; // 32 bit data type will improve memory throughput, could be 8 bit
};
Note I used a char4 type for the string data; the vector type will give better memory throughput, but it will mean strings need to be aligned/suitably padded to 4 byte boundaries. That may or may not be a problem depending on what a typical real string looks like in your application. Also, the type of the (optional) length parameter should probably be chosen to reflect the maximum admissible string length. If you have a lot of very short strings, it might be worth using an 8 or 16 bit unsigned type for the lengths to save memory.
A really simplistic piece of code to compare strings stored this way, in the style of strcmp, might look something like this:
// Compare two char4 words component by component.
__device__ __host__
int cmp4(const char4 & c1, const char4 & c2)
{
    int result;
    result = c1.x - c2.x; if (result != 0) return result;
    result = c1.y - c2.y; if (result != 0) return result;
    result = c1.z - c2.z; if (result != 0) return result;
    result = c1.w - c2.w; if (result != 0) return result;
    return 0;
}
// Compare nwords char4 words and return the first nonzero difference.
__device__ __host__
int strncmp4(const char4 * s1, const char4 * s2, const unsigned int nwords)
{
    for (unsigned int i = 0; i < nwords; i++) {
        int result = cmp4(s1[i], s2[i]);
        if (result != 0) return result;
    }
    return 0;
}
// One thread per string pair. There is no bounds check on idx, so the
// launch must cover exactly the number of strings.
__global__
void tkernel(const gpuStringArray a, const gpuStringArray b, int * result)
{
    int idx = threadIdx.x + blockIdx.x * blockDim.x;
    char4 * s1 = a.data + a.pos[idx];
    char4 * s2 = b.data + b.pos[idx];
    // length is used here as a count of char4 words, not of characters
    unsigned int slen = min(a.length[idx], b.length[idx]);
    result[idx] = strncmp4(s1, s2, slen);
}
[disclaimer: never compiled, never tested, no warranty real or implied, use at your own risk]
There are some corner cases and assumptions in this which might catch you out depending on exactly what the real strings in your code look like, but I will leave those as an exercise to the reader to resolve. You should be able to adapt and expand this into whatever it is you are trying to do.
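To get data into this layout in the first place, the strings have to be packed on the host before the three arrays are copied to the device. Here is a minimal host-side sketch in plain C, assuming positions and lengths are counted in 4-byte words (matching how the kernel above indexes the char4 data) and that every string is zero-padded to a word boundary. HostStringArray and pack_strings are made-up names for illustration, and the cudaMalloc/cudaMemcpy step is left out.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical host-side mirror of the packed layout (names are illustrative). */
typedef struct {
    unsigned int *pos;     /* start of each string, in 4-byte words */
    unsigned int *length;  /* length of each string, in 4-byte words */
    char *data;            /* packed character data, zero-padded per string */
    unsigned int nwords;   /* total size of data, in 4-byte words */
} HostStringArray;

/* Pack n C strings into one buffer. Returns 0 on success, -1 if allocation fails. */
int pack_strings(const char *const *strs, unsigned int n, HostStringArray *out)
{
    unsigned int total = 0;
    size_t slots = n ? n : 1;  /* avoid zero-sized allocations */

    out->pos = malloc(slots * sizeof *out->pos);
    out->length = malloc(slots * sizeof *out->length);
    out->data = NULL;
    out->nwords = 0;
    if (!out->pos || !out->length) goto fail;

    /* First pass: compute where each padded string starts. */
    for (unsigned int i = 0; i < n; i++) {
        unsigned int words = (unsigned int)((strlen(strs[i]) + 3) / 4);
        out->pos[i] = total;
        out->length[i] = words;
        total += words;
    }

    /* Second pass: copy the characters; calloc supplies the zero padding. */
    out->data = calloc(total ? total : 1, 4);
    if (!out->data) goto fail;
    for (unsigned int i = 0; i < n; i++)
        memcpy(out->data + 4 * (size_t)out->pos[i], strs[i], strlen(strs[i]));
    out->nwords = total;
    return 0;

fail:
    free(out->pos);
    free(out->length);
    free(out->data);
    return -1;
}
```

For the three strings in the question, each one fits in two words, so pos comes out as {0, 2, 4} and length as {2, 2, 2}. Because the padding bytes are zero, a word-wise comparison of two strings with equal word counts behaves predictably on the trailing bytes.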

You have to use C-style character strings char *str. Searching for "CUDA string" on Google would have given you this CUDA "Hello World" example as first hit: http://computer-graphics.se/hello-world-for-cuda.html
There you can see how to use char* strings in CUDA. Be aware that standard C functions like strcpy or strcmp are not available in CUDA device code!
If you want an array of strings, you just have to use char** (as in C/C++). As for strcmp and similar functions, it highly depends on what you want to do. CUDA is not really well suited for string operations, so it would help if you provided a little more detail about what you want to do.

Related

How to return string from __global__ function to main function in C CUDA [duplicate]

I am trying to add 2 char arrays in cuda, but nothing is working.
I tried to use:
char temp[32];
strcpy(temp, my_array);
strcat(temp, my_array_2);
When I use this in a kernel, I get the error: calling a __host__ function("strcpy") from a __global__ function("Process") is not allowed
After this, I tried using these functions on the host instead of in the kernel. There is no error, but after the concatenation I get strange symbols like ĶĶĶĶĶĶĶĶĶĶĶĶĶĶĶĶĶĶĶĶĶĶĶĶĶĶĶĶĶĶĶĶ.
So, how can I add two (or more) char arrays in CUDA?
Write your own functions:
__device__ char * my_strcpy(char *dest, const char *src){
    int i = 0;
    do {
        dest[i] = src[i];
    } while (src[i++] != 0);
    return dest;
}
__device__ char * my_strcat(char *dest, const char *src){
    int i = 0;
    while (dest[i] != 0) i++;
    my_strcpy(dest+i, src);
    return dest;
}
And while we're at it, here is strcmp
As the error message explains, you are trying to call host functions ("CPU functions") from a global kernel ("GPU function"). Within a kernel you can only call functions that are compiled for the device (__device__ functions), and most of the C standard library, including strcpy and strcat, is not available there.
You have to create your own str* functions according to what you want to do. Do you want to concatenate an array of chars in parallel, or do it serially in each thread?
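If you go the serial, per-thread route, it is worth making the helper bounded so it cannot run past a fixed-size buffer like temp[32]. Below is a rough sketch in plain C; bounded_concat is a hypothetical name, not part of CUDA or libc. For device use it should in principle only need a __device__ qualifier, but that is an assumption I have not tested. The garbage characters in the question also look like what you get from a buffer that was never terminated, though that is a guess from the symptoms.

```c
#include <stddef.h>

/* Append src to dest without writing past dest[cap - 1].
   Always NUL-terminates when cap > 0 and returns the resulting length.
   Hypothetical helper for illustration. */
size_t bounded_concat(char *dest, size_t cap, const char *src)
{
    size_t i = 0, j = 0;
    if (cap == 0) return 0;
    /* Find the end of the existing string, but never look past the buffer. */
    while (i < cap - 1 && dest[i] != '\0') i++;
    /* Copy as much of src as fits, leaving room for the terminator. */
    while (i < cap - 1 && src[j] != '\0') dest[i++] = src[j++];
    dest[i] = '\0';
    return i;
}
```

Start from an empty string (temp[0] = '\0') and call it once per piece. Anything that does not fit is silently truncated, so compare the return value against the expected length if truncation matters.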

mysql_real_escape_string including slashes in output (C, not PHP)

I've seen this question several times relating to PHP (here is an example). The answer was generally 'stop using magic quotes'. I am having this problem in C however. When I insert binary data into a BLOB in my MySQL database, having run it through mysql_real_escape_string(), some 5c ('\') characters appear in the blob. This disrupts the data and makes it unusable. How can I prevent / fix this?
#define CHUNK_SZ (1024*256)
void insertdb(int16_t *data, size_t size, size_t nmemb)
{
    static int16_t *buf;
    static unsigned long index;
    static short initialized;
    unsigned long i;
    struct tm *info;
    time_t rawtime;
    char dbuf[12];
    char tbuf[12];
    char *chunk;
    if(initialized==0){
        buf = (int16_t *) malloc(CHUNK_SZ);
        initialized = 1;
    }
    if(index + (nmemb*size) + 1 >= CHUNK_SZ || do_exit == 1){
        time(&rawtime);
        info = localtime(&rawtime);
        snprintf(dbuf, 16, "%d-%02d-%02d", 1900+info->tm_year, 1+info->tm_mon, info->tm_mday);
        snprintf(tbuf, 16, "%02d:%02d:%02d", info->tm_hour, info->tm_min, info->tm_sec);
        chunk = (char *) malloc(index*2+1);
        mysql_real_escape_string(con, chunk, (char *) buf, index);
        char *st = "INSERT INTO %s (date, time, tag, data) VALUES ('%s', '%s', %d, '%s')";
        int len = strlen(st)+strlen(db_mon_table)+strlen(dbuf)+strlen(tbuf)+sizeof(tag)+index*2+1;
        char *query = (char *) malloc(len);
        int qlen = snprintf(query, len, st, our_table, dbuf, tbuf, tag, chunk);
        if(mysql_real_query(con, query, qlen)){
            fprintf(stderr, "%s\n", mysql_error(con));
            mysql_close(con);
            exit(1);
        }
        free(chunk);
        index = 0;
    } else {
        memcpy((void *) buf+index, (void *) data, nmemb*size);
        index += (nmemb*size);
    }
    return;
}
EDIT: Please look here. They use the same function to escape binary data (from an image), insert it, and afterward get the same image from the database. That my binary data is somehow different from an image's binary data makes no sense to me.
If you're inserting into a BLOB column, then instead of escaping the data via mysql_real_escape_string(), you should probably express it as a HEX string. You will have to figure out how to encode your int16_t data into the needed byte sequence, as at minimum you have a byte-order question to sort out (but if you're in control of both encoding and decoding then you just need to make them match).
Alternatively, if the data are genuinely textual, rather than binary, then the type of the column should probably be TEXT rather than BLOB. In that case, you should continue to use an ordinary SQL string and mysql_real_escape_string().
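For the hex route, the encoding step itself is small. Here is a sketch of a hand-rolled encoder in plain C; hex_encode is a made-up name, and the MySQL C API also provides mysql_hex_string() for the same job if you would rather not write your own. The output buffer has to hold two characters per input byte plus a terminator.

```c
#include <stddef.h>

/* Write 2*len uppercase hex digits plus a terminating NUL into out.
   out must have room for 2*len + 1 bytes. Hypothetical helper. */
void hex_encode(const unsigned char *src, size_t len, char *out)
{
    static const char digits[] = "0123456789ABCDEF";
    for (size_t i = 0; i < len; i++) {
        out[2 * i]     = digits[src[i] >> 4];    /* high nibble */
        out[2 * i + 1] = digits[src[i] & 0x0F];  /* low nibble */
    }
    out[2 * len] = '\0';
}
```

The result can then go into the statement as a hex literal, for example X'005C27FF', with no quoting or escaping concerns because the text contains only hex digits. The bytes are encoded in memory order, so the byte-order question for the int16_t samples still has to be settled on the decoding side.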

C MySQL Types Error

I'm trying to store results taken from a MySQL query into an array of structs. I can't seem to get the types to work though, and I've found the MySQL documentation difficult to sort through.
My struct is:
struct login_session
{
    char* user[10];
    time_t time;
    int length;
};
And the loop where I'm trying to get the data is:
while ( (row = mysql_fetch_row(res)) != NULL ) {
    strcpy(records[cnt].user, &row[0]);
    cnt++;
}
No matter what I try though I constantly get the error:
test.c:45: warning: passing argument 1 of ‘strcpy’ from incompatible pointer type
/usr/include/string.h:128: note: expected ‘char * __restrict__’ but argument is of type ‘char **’
test.c:45: warning: passing argument 2 of ‘strcpy’ from incompatible pointer type
/usr/include/string.h:128: note: expected ‘const char * __restrict__’ but argument is of type ‘MYSQL_ROW’
Any pointers?
There are multiple problems, all related to pointers and arrays. I recommend you do some reading.
First, char * user[10] is defining an array of 10 char * values, not an array of char, which is what I suspect you want. The warning even says as much: strcpy() expects a char *, while the user field on its own is seen as a char **.
Second, you're one & away from what you want in the second argument.
Copied from mysql.h header:
typedef char **MYSQL_ROW; /* return data as array of strings */
A MYSQL_ROW is an array of char arrays. Using [] does a dereference, so you dereference down to a char * which is what strcpy() takes, but then you take the address of it using &.
Your code should look more like this:
struct login_session
{
    char user[10];
    time_t time;
    int length;
};
while ( (row = mysql_fetch_row(res)) != NULL ) {
    strcpy(records[cnt].user, row[0]);
    cnt++;
}
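I don't know what guarantees you have about the data coming from MySQL, but if you can't be absolutely sure that the values are fewer than 10 characters long and null ('\0') terminated, you should use strncpy() to avoid any possibility of overflowing the user array. Note that strncpy() does not write a terminator when the source fills the whole buffer, so set the last byte to '\0' yourself.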
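A small wrapper keeps the bounded copy in one place. This is only a sketch: copy_field is a hypothetical helper name, and the NULL check is there because mysql_fetch_row() represents SQL NULL values as NULL pointers in the row.

```c
#include <string.h>

/* Copy src into a fixed-size field, truncating if needed and always
   NUL-terminating. Hypothetical helper for illustration. */
void copy_field(char *dest, size_t cap, const char *src)
{
    if (cap == 0) return;
    if (src == NULL) src = "";     /* treat SQL NULL as an empty string */
    strncpy(dest, src, cap - 1);   /* copies at most cap - 1 characters */
    dest[cap - 1] = '\0';          /* strncpy does not add this on truncation */
}
```

In the loop this would be called as copy_field(records[cnt].user, sizeof records[cnt].user, row[0]). With a 10-byte field, anything longer than 9 characters is cut off rather than overflowing.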

Extracting integers from a query string

I am creating a program that can make mysql transactions through C and html.
I have this query string
query = -id=103&-id=101&-id=102&-act=Delete
Extracting "Delete" by sscanf isn't that hard, but I need help extracting the integers and putting them in an array of int id[]. The number of -id entries can vary depending on how many checkboxes were checked in the html form.
I've been searching for hours but haven't found any applicable solution; or I just did not understand them. Any ideas?
Thanks
You can use strstr and atoi to extract the numbers in a loop, like this:
char *query = "-id=103&-id=101&-id=102&-act=Delete";
char *ptr = strstr(query, "-id=");
if (ptr) {
    ptr += 4;
    int n = atoi(ptr);
    printf("%d\n", n);
    for (;;) {
        ptr = strstr(ptr, "&-id=");
        if (!ptr) break;
        ptr += 5;
        int n = atoi(ptr);
        printf("%d\n", n);
    }
}
Demo on ideone.
You want to use strtok, or a better solution, to tokenize this string with & and = as delimiters.
Take a look at cplusplus.com for more information and an example.
This is the output you would get from strtok
Output:
Splitting string "- This, a sample string." into tokens:
This
a
sample
string
Once you figure out how to split them, the next hurdle is to convert the numbers from strings to ints. For this you need to look at atoi or its safer, more robust cousin strtol.
Most likely I would write a small lexical scanner to tackle the task. Meaning, I would analyze the string one character at a time, according to a regular expression representing the set of possible inputs.
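Putting the pieces from these answers together, here is one way to fill the int id[] array the question asks for, using strstr to find each key and strtol to convert the value. Treat it as a sketch: extract_ids is a made-up name, and it assumes "-id=" never appears inside some other key or value in the query string.

```c
#include <stdlib.h>
#include <string.h>

/* Store the integer following every "-id=" in ids[], up to max_ids entries.
   Returns how many were stored. Hypothetical helper for illustration. */
size_t extract_ids(const char *query, int *ids, size_t max_ids)
{
    size_t count = 0;
    const char *p = query;
    while (count < max_ids && (p = strstr(p, "-id=")) != NULL) {
        char *end;
        p += 4;                         /* skip past "-id=" */
        long v = strtol(p, &end, 10);
        if (end != p)                   /* at least one digit was consumed */
            ids[count++] = (int)v;
        p = end;                        /* continue after the parsed number */
    }
    return count;
}
```

Unlike atoi, strtol reports where parsing stopped, so an empty value such as "-id=&" is skipped instead of being stored as 0. The max_ids limit covers the case where more checkboxes were ticked than the array can hold.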

type casting to unsigned long long in CUDA?

Basically what I want is a function that works like hiloint2uint64(): it should join two 32-bit integers and reinterpret the result as a uint64.
I cannot find any function in CUDA that can do this. Failing that, is there any PTX code that can do this kind of type casting?
You can define your own function like this:
__host__ __device__ unsigned long long int hiloint2uint64(int h, int l)
{
    int combined[] = { h, l };
    return *reinterpret_cast<unsigned long long int*>(combined);
}
Maybe a bit late by now, but probably the safest way to do this is to do it "manually" with bit shifts and a bitwise OR:
uint32_t ui_h = h;
uint32_t ui_l = l;
return (uint64_t(ui_h) << 32) | uint64_t(ui_l);
Note that the solution presented in the other answer isn't safe, because the array of ints might not be 8-byte aligned (and shifting some bits is faster than a memory read/write, anyway).
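Written out as a complete function in plain C, the shift approach could look like the sketch below. The name hilo_to_u64 is made up; adding __host__ __device__ should be all that is needed to use it in CUDA code, though I have not compiled it that way. The detail that matters is converting each half to an unsigned 32-bit value before widening, since a negative int widened directly to 64 bits is sign-extended and would set all of the upper bits.

```c
#include <stdint.h>

/* Join two 32-bit halves into one 64-bit value; hi becomes the upper 32 bits.
   Hypothetical helper for illustration. */
uint64_t hilo_to_u64(int32_t hi, int32_t lo)
{
    /* Cast to uint32_t first so negative inputs are not sign-extended. */
    return ((uint64_t)(uint32_t)hi << 32) | (uint64_t)(uint32_t)lo;
}
```

For example, hilo_to_u64(1, -1) yields 0x00000001FFFFFFFF. Without the intermediate unsigned cast, the -1 in the low half would turn the whole result into 0xFFFFFFFFFFFFFFFF.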
Use uint2 (but define the temporary variable as 64-bit value: unsigned long long int) instead of arrays to be sure of alignment.
Be careful about the order of l and h.
__host__ __device__ __forceinline__ unsigned long long int hiloint2uint64(unsigned int h, unsigned int l)
{
    unsigned long long int result;
    uint2& src = *reinterpret_cast<uint2*>(&result);
    src.x = l;
    src.y = h;
    return result;
}
The CUDA registers have a size of 32 bits anyway. In the best case the compiler won't need any extra code. In the worst case it has to reorder the registers by moving a 32-bit value.
Godbolt example https://godbolt.org/z/3r9WYK9e7 of how optimized it gets.