#include <stdio.h>
int main(void)
{
unsigned long long count=0;
count=(20000*(200000+1));
printf("%lu\n",count); //18446744073414604320
printf("%llu",count); //18446744073414604320
return 0;
}
Now, for count = (2000*(200000+1)); the output is 400002000, which is correct.
I don't know what's going on. Basically, I want to calculate the total number of subsets for a set of size "n".
I know that:
total subsets = (n*(n+1))/2
constraints: 1 < n <= 200000
But my code fails for extreme values of n.
Can anyone please help me get the right output?
Thanks!
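One likely explanation, sketched under the assumption that int is 32 bits on the platform in question: the literals 20000 and 200000 are both int, so the product is computed in int and overflows before it is ever stored in count. Making one operand unsigned long long forces 64-bit arithmetic. The helper names pair_count and print_fixed_product are made up for this illustration.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: evaluates n*(n+1)/2 entirely in unsigned long long,
   so the intermediate product cannot overflow for n <= 200000. */
unsigned long long pair_count(unsigned long long n)
{
    assert(n <= 200000ULL);   /* constraint taken from the question */
    return n * (n + 1) / 2;
}

/* Prints the product from the question, computed in 64 bits via the ULL suffix. */
void print_fixed_product(void)
{
    unsigned long long count = 20000ULL * (200000 + 1);
    printf("%llu\n", count);  /* %llu matches unsigned long long; %lu does not */
}
```

With these definitions, pair_count(200000) evaluates to 20000100000 and print_fixed_product() prints 4000020000.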
I want to design a kernel that adds pairs of matrix rows concurrently, but I don't know how to accomplish it.
For example, I have a data matrix of size (512, 1024), and I want to add its row pairs (row1+row2, row3+row4, ..., row511+row512) at the same time.
The reason I'm considering this is simply to save time.
Could you give me some advice?
Thanks!
Something like this may be useful:
const int width = 1024;
const int rows = 512;
template <typename T>
__global__ void row_add(const T * __restrict__ din, T * __restrict__ dout){
int idx = width*2*blockIdx.x + threadIdx.x;
if (dout == din)
dout[idx] += dout[idx+width];
else
dout[idx-blockIdx.x*width] = din[idx]+din[idx+width];
}
It depends on the width dimension being 1024 or less. You would launch it like this:
row_add<<<rows/2, width>>>(d_in, d_out);
If you pass it different pointers for d_in and d_out, it will assume you want the output written contiguously to a separate array. If you pass it the same pointer for d_in and d_out, it will assume you want the results of row 0+1 written to row 0, the results of row 2+3 written to row 2, and so on.
The rows dimension has to be an even number, obviously from your problem statement (adding rows pairwise).
coded in browser, not tested, may contain bugs
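For checking the kernel's output on the host, a plain C reference of the same two modes might look like the sketch below. The function name row_add_reference and the float element type are assumptions for illustration, not part of the kernel above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical host-side reference mirroring the two modes described above.
   rows must be even. If out == in, row 2k receives row 2k + row 2k+1 in place;
   otherwise output row k (packed contiguously) receives that sum. */
void row_add_reference(const float *in, float *out, size_t rows, size_t width)
{
    assert(rows % 2 == 0);
    for (size_t pair = 0; pair < rows / 2; ++pair) {
        const float *a = in + 2 * pair * width;   /* first row of the pair  */
        const float *b = a + width;               /* second row of the pair */
        float *dst = (out == in) ? out + 2 * pair * width : out + pair * width;
        for (size_t col = 0; col < width; ++col)
            dst[col] = a[col] + b[col];
    }
}
```

Comparing its result element by element with the array copied back from the device is one simple way to test the kernel.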
UPDATE: I solved my problem (scroll down).
I'm writing a small C program and I want to do the following:
The program is connected to a MySQL database (that works perfectly) and I want to do something with the data from the database. I get about 20-25 rows per query, and I created my own struct to hold the information from each row of the query.
So my struct looks like this:
typedef struct {
int timestamp;
double rate;
char* market;
char* currency;
} Rate;
I want to pass an empty array to a function, and the function should size the array based on the number of rows the query returns. E.g. if a single SQL query returns 20 rows, the array should contain 20 objects of my Rate struct.
I want something like this:
int main(int argc, char **argv)
{
Rate *rates = ?; // don't know how to initialize it
(void) do_something_with_rates(&rates);
// the size here should be ~20
printf("size of rates: %d", sizeof(rates)/sizeof(Rate));
}
What does the function do_something_with_rates(Rate **rates) have to look like?
EDIT: I did it as Alex said, I made my function return the size of the array as size_t and passed my array to the function as Rate **rates.
In the function you can access and change the values like (*rates)[i].timestamp = 123 for example.
In C, memory is either dynamically or statically allocated.
Something like int fifty_numbers[50] is statically allocated. The size is 50 integers no matter what, so the compiler knows how big the array is in bytes. sizeof(fifty_numbers) will give you 200 bytes here, assuming a 4-byte int.
Dynamic allocation: int *bunch_of_numbers = malloc(sizeof(int) * varying_size). As you can see, varying_size is not constant, so the compiler can't figure out how big the array is without executing the program. sizeof(bunch_of_numbers) gives you the size of the pointer: typically 4 bytes on a 32 bit system, or 8 bytes on a 64 bit system. The only one who knows how big the array is would be the programmer. In your case, it's whoever wrote do_something_with_rates(), but you're discarding that information by neither returning it nor taking a size parameter.
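A minimal sketch of that difference, reusing the names from the two paragraphs above. The wrapper function sizeof_demo is hypothetical, and the assertions are written in terms of sizeof(int) and sizeof(int *) so they don't depend on a particular platform.

```c
#include <assert.h>
#include <stdlib.h>

/* Checks that sizeof sees the whole array object, but only the pointer
   for a malloc'd block, regardless of how much was allocated. */
void sizeof_demo(size_t varying_size)
{
    int fifty_numbers[50];
    int *bunch_of_numbers = malloc(sizeof(int) * varying_size);

    /* The array's size is part of its type, so sizeof reports all 50 elements. */
    assert(sizeof(fifty_numbers) == 50 * sizeof(int));

    /* For the pointer, sizeof knows nothing about varying_size. */
    assert(sizeof(bunch_of_numbers) == sizeof(int *));

    free(bunch_of_numbers);  /* free(NULL) is harmless if malloc failed */
}
```

Calling sizeof_demo(10) and sizeof_demo(1000) passes the same assertions, which is the point: the allocation size never reaches sizeof.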
It's not clear exactly how do_something_with_rates() was declared, but something like void do_something_with_rates(Rate **rates) won't work, as the function has no idea how big rates is. I recommend something like void do_something_with_rates(size_t array_size, Rate **rates). At any rate, going by your requirements, it's still some way from working. Possible solutions are below:
You need to either return the new array's size:
size_t do_something_with_rates(size_t old_array_size, Rate **rates) {
    size_t n = old_array_size; // placeholder: use the number of rows the query returned
    Rate *new_rates = malloc(sizeof(Rate) * n); // allocate n Rate objects
    if (new_rates == NULL)
        return 0; // 0 signals failure; *rates still points to the old array
    // carry out your operation on new_rates
    // modifying rates
    free(*rates); // releasing the memory taken up by the old array
    *rates = new_rates; // make it point to the new array
    return n; // returning the new size so that the caller knows
}
int main(void) {
    Rate *rates = malloc(sizeof(Rate) * 20);
    size_t new_size = do_something_with_rates(20, &rates);
    // now new_size holds the size of the new array, which may or may not be 20
    free(rates);
    return 0;
}
Or pass in a size parameter for the function to set:
void do_something_with_rates(size_t old_array_size, size_t *new_array_size, Rate **rates) {
    size_t n = old_array_size; // placeholder: use the number of rows the query returned
    Rate *new_rates = malloc(sizeof(Rate) * n); // allocate n Rate objects
    if (new_rates == NULL) {
        *new_array_size = 0; // 0 signals failure; *rates still points to the old array
        return;
    }
    *new_array_size = n; // setting the new size so that the caller knows
    // carry out your operation on new_rates
    // modifying rates
    free(*rates); // releasing the memory taken up by the old array
    *rates = new_rates; // make it point to the new array
}
int main(void) {
    Rate *rates = malloc(sizeof(Rate) * 20);
    size_t new_size;
    do_something_with_rates(20, &new_size, &rates);
    // now new_size holds the size of the new array, which may or may not be 20
    free(rates);
    return 0;
}
Why do I need to pass the old size as a parameter?
void do_something_with_rates(Rate **rates) {
// You don't know what n is. How would you
// know how many rate objects the caller wants
// you to process for any given call to this?
for (size_t i = 0; i < n; ++i)
// carry out your operation on new_rates
}
Everything changes when you have a size parameter:
void do_something_with_rates(size_t size, Rate **rates) {
for (size_t i = 0; i < size; ++i) // Now you know when to stop
// carry out your operation on new_rates
}
This is a very fundamental flaw with your program.
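I also want the function to change the contents of the array:
size_t do_something_with_rates(size_t old_array_size, Rate **rates) {
    size_t n = old_array_size; // placeholder: use the number of rows the query returned
    Rate *array = malloc(sizeof(Rate) * n); // allocate n Rate objects
    if (array == NULL)
        return 0; // 0 signals failure; *rates still points to the old array
    // carry out some operation on the new array
    for (size_t i = 0; i < n; ++i) {
        array[i].timestamp = (int)time(NULL); // needs <time.h>; '.' because array[i] is a Rate, not a pointer
        // you can see the pattern
    }
    free(*rates); // release the old array
    *rates = array; // hand the new array back to the caller
    return n; // returning the new size so that the caller knows
}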
sizeof yields the size of a type, or of the type of an expression. Apart from variable-length arrays, that value is fixed at compile time, so the size of an expression cannot change during the execution of the program. If you want that feature, use a variable, a terminating value or a different programming language. Your choice. Whatever. C's better than Java.
char foo[42];
foo has either static storage duration (which is only partially related to the static keyword) or automatic storage duration, depending on where it is declared.
Objects with static storage duration exist from the start of the program to its termination. What people call global variables are, technically, variables declared at file scope; they have static storage duration and, unless declared static, external linkage.
Objects with automatic storage duration exist from entry into the block in which they are declared until that block is left. They usually live on the stack, though the standard doesn't require one. They're variables declared at block scope without static or extern, and they have no linkage.
In either case, today's compilers will encode 42 into the machine code. I suppose it'd be possible to modify the machine code, though the several thousand lines you'd put into that task would be much better invested in storing the size externally (see other answer/s), and this isn't really a C question. If you really want to look into this, the only examples I can think of that change their own machine code are viruses... How are you going to avoid that antivirus heuristic?
Another option is to encode size information into a struct and use a flexible array member; then you can carry both the array and the size around as one allocation. Sorry, this is as close as you'll get to what you want. For example:
#include <stdlib.h>
typedef int T; // element type; int is an assumption made so the example compiles
struct T_vector {
    size_t size;
    T value[];
};
// Appends one uninitialised element to *v and returns a pointer to it (NULL on failure).
T *T_make(struct T_vector **v) {
    size_t index = *v ? (*v)->size : 0, size = index + 1;
    if ((index & size) == 0) { // size is a power of two: grow before storing
        // header plus room for 2 * size elements
        void *temp = realloc(*v, sizeof **v + 2 * size * sizeof *(*v)->value);
        if (!temp) {
            return NULL;
        }
        *v = temp;
    }
    (*v)->size = size;
    return (*v)->value + index;
}
#define T_size(v) ((v) == NULL ? 0 : (v)->size)
int main(void) {
    struct T_vector *v = NULL;             // T_size(v) == 0
    { T *x = T_make(&v); if (x) *x = 1; }  // T_size(v) == 1; x is discarded right away
    { T *y = T_make(&v); if (y) *y = 2; }  // T_size(v) == 2; y is discarded right away
    free(v);
}
Disclaimer: I only wrote this as an example; I don't intend to test or maintain it unless the intent of the example suffers drastically. If you want something I've thoroughly tested, use my push_back.
This may seem innocent, yet even with that disclaimer and this upcoming warning I'll likely see a comment along the lines of: Each successive call to T_make may render previously returned pointers invalid... True, and I can't think of much more I could do about that. I would advise calling T_make, modifying the value pointed at by the return value and discarding that pointer, as I've done above (rather explicitly).
Some compilers might even allow you to #define sizeof(x) T_size(x)... I'm joking; don't do this. Redefining a keyword as a macro is undefined behaviour as soon as you include a standard header.
Technically we aren't changing the size of an array here; we're allocating ahead of time and where necessary, reallocating and copying to a larger array. It might seem appealing to abstract allocation away this way in C at times... enjoy :)
I have tried the program below, which uses atomicInc().
__global__ void ker(int *count)
{
int n=1;
int x = atomicInc ((unsigned int *)&count[0],n);
CUPRINTF("In kernel count is %d\n",count[0]);
}
int main()
{
int hitCount[1];
int *hitCount_d;
hitCount[0]=1;
cudaMalloc((void **)&hitCount_d,1*sizeof(int));
cudaMemcpy(&hitCount_d[0],&hitCount[0],1*sizeof(int),cudaMemcpyHostToDevice);
ker<<<1,4>>>(hitCount_d);
cudaMemcpy(&hitCount[0],&hitCount_d[0],1*sizeof(int),cudaMemcpyDeviceToHost);
printf("count is %d\n",hitCount[0]);
return 0;
}
Output is:
In kernel count is 1
In kernel count is 1
In kernel count is 1
In kernel count is 1
count is 1
I don't understand why it is not incrementing. Can anyone help?
Referring to the documentation, atomicInc does this:
for the following:
atomicInc ((unsigned int *)&count[0],n);
compute:
((count[0] >= n) ? 0 : (count[0]+1))
and store the result back in count[0]
(If you're not sure what the ? operator does, look here)
Since you've passed n = 1, and count[0] starts out at 1, atomicInc never actually increments the variable count[0] beyond 1.
If you want to see it increment beyond 1, pass a larger value for n.
The variable n actually acts as a "rollover value" for the incrementing process. When the variable to be incremented actually reaches the value of n, the next atomicInc will reset it to zero.
Although you haven't asked the question, you might ask, "Why do I never see a value of zero, if I am hitting the rollover value?"
To answer this, you must remember that all 4 of your threads are executing in lockstep. All 4 of them execute the atomicInc instruction before any execute the subsequent print statement.
Therefore we have a variable of count[0] which starts out at 1.
The first thread to execute the atomic resets it to zero.
The next thread increments it to 1.
The third thread resets it to zero.
The fourth and final thread increments it to 1.
Then all 4 threads print out the value.
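The sequence above can be replayed on the host with a small sequential model of the update rule. This is only a sketch of the formula quoted earlier, not the CUDA intrinsic itself; the names model_atomic_inc and run_threads are made up, and the "threads" simply run one after another.

```c
#include <assert.h>

/* Sequential model of the rule quoted above:
   stores (old >= limit) ? 0 : old + 1 and returns the old value. */
unsigned int model_atomic_inc(unsigned int *address, unsigned int limit)
{
    unsigned int old = *address;
    *address = (old >= limit) ? 0u : old + 1u;
    return old;
}

/* Applies the rule once per thread, one thread at a time,
   and returns the final value of the counter. */
unsigned int run_threads(unsigned int start, unsigned int limit, int threads)
{
    unsigned int count = start;
    assert(threads >= 0);
    for (int t = 0; t < threads; ++t)
        model_atomic_inc(&count, limit);
    return count;
}
```

run_threads(1, 1, 4) returns 1, matching the steps listed above, while a larger rollover value such as run_threads(1, 100, 4) returns 5.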
As another experiment, try launching 5 threads instead of 4, see if you can predict what the value printed out will be.
ker<<<1,5>>>(hitCount_d);
As @talonmies indicated in the comments, if you swap your atomicInc for an atomicAdd:
int x = atomicAdd ((unsigned int *)&count[0],n);
You'll get results that you were probably expecting.
I'm a bit confused about analyzing space complexity in general. I'm not sure what "extra space taken up by the algorithm" means. What counts as space of 1?
In the example here
int findMin(int[] x) {
int k = 0; int n = x.length;
for (int i = 1; i < n; i++) {
if (x[i] < x[k]) {
k = i;
}
}
return k;
}
The space complexity is O(n), and I'm guessing it's due to an array size of n.
But something like heapsort takes O(1). Wouldn't an in-place heapsort also need an array of size n (where n is the size of the input)? Or are we assuming the input is already in an array? Why is heapsort's space complexity O(1)?
Thanks!
Heapsort requires only a constant amount of auxiliary storage, hence O(1). The space used by the input to be sorted is of course O(n).
Extra space means any memory the algorithm uses other than the input itself. In practice much of it is stack space consumed by recursive calls: if the algorithm is recursive, it uses the stack to hold each pending call until the termination condition is reached.
The size of the stack will be O(height of the recursion tree).
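To make the stack argument concrete, here is a sketch in C of the same minimum search written two ways. Both function names are hypothetical, and the depth and max_depth parameters exist only to measure how deep the recursion goes.

```c
#include <assert.h>
#include <stddef.h>

/* Iterative minimum index: a fixed number of locals, so O(1) extra space. */
size_t find_min_iterative(const int *x, size_t n)
{
    assert(n > 0);
    size_t k = 0;
    for (size_t i = 1; i < n; ++i)
        if (x[i] < x[k])
            k = i;
    return k;
}

/* Recursive version of the same search: one stack frame per element,
   so O(n) extra space. *max_depth records the deepest call reached. */
size_t find_min_recursive(const int *x, size_t n, size_t depth, size_t *max_depth)
{
    assert(n > 0);
    if (depth > *max_depth)
        *max_depth = depth;
    if (n == 1)
        return 0;
    size_t k = find_min_recursive(x, n - 1, depth + 1, max_depth);
    return (x[n - 1] < x[k]) ? n - 1 : k;
}
```

For a 5-element array, the recursive version started with depth 1 reports a maximum depth of 5, while the iterative version never uses more than its two index variables.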
Hope this is helpful!!
This works as expected:
for (var i:uint = 5; i >= 1; i-- )
{
trace(i); // output is from 5~1, as expected
}
This is the strange behavior:
for (var i:uint = 5; i >= 0; i-- )
{
trace(i)
}
// output:
5
4
3
2
1
0
4294967295
4294967294
4294967293
...
Below 0, something like a MAX_INT appears and it goes on decrementing forever. Why is this happening?
EDIT
I tested similar code in C++ with an unsigned int and got the same result. Probably the condition is being evaluated after the decrement.
The behavior you are describing has little to do with any particular programming language: you would see the same in C, C++, ActionScript, etc. What you see is quite normal behavior and has to do with the way a number is represented (see the wiki article and read about unsigned integers).
It happens because you are using a uint (unsigned integer), which can only hold non-negative numbers. The type cannot represent negative values, so if you take a uint like this:
uint i = 0;
and subtract 1 from it:
i = i - 1;
the result cannot be -1, because i is unsigned. Instead i wraps around to the maximum value of the uint data type. For the same reason the condition i >= 0 is always true for a uint, so your loop never terminates.
The edit you posted above,
"...in C++, .. same result..."
should give you a clue as to why this is happening: it has nothing to do with what language you are using, or when the comparison is done. It has to do with the data type you are using.
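If the goal is still to count down to 0 inclusive with an unsigned counter, one common idiom is to test before decrementing, so that i >= 0 is never evaluated. The sketch below is in C, assuming the same wraparound rules as the uint above; the function names and the out buffer are made up for illustration.

```c
#include <assert.h>
#include <limits.h>

/* Unsigned subtraction wraps modulo UINT_MAX + 1, so 0 - 1 gives UINT_MAX. */
unsigned int one_below_zero(void)
{
    unsigned int i = 0;
    i = i - 1;
    return i;
}

/* Counts from start down to 0 inclusive. Writes each visited value into out
   (capacity must exceed start) and returns how many values were written. */
unsigned int count_down(unsigned int start, unsigned int *out, unsigned int capacity)
{
    unsigned int written = 0;
    assert(capacity > start);
    /* i-- > 0 compares the old value, then decrements: the body sees start..0 */
    for (unsigned int i = start + 1; i-- > 0; )
        out[written++] = i;
    return written;
}
```

count_down(5, buf, 6) fills buf with 5, 4, 3, 2, 1, 0 and then stops, because the test fails on 0 before the wrapped value is ever used.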
As an exercise, fire up that C++ compiler again and write a program that displays the maximum value of a uint. The program should not use any defined constants :) and it should take just one line of code!