Optimizing code for reading some VLVs in a file? - function

I'm trying to read some variable-length-values from a file I created.
The file contains the following:
81 7F 81 01 2F F3 FF
There are two VLVs there, 81 7F and 81 01 which are 255 and 129 in decimal.
I also created some file-reader functions that go like this:
void read_byte_from_file_to(std::fstream& file, uint8_t& to) {
file.read((char*)&to, 1);
}
unsigned long readVLV(std::fstream& t_midi_file) {
unsigned long result = 0;
static unsigned long sum = 0, depth = 0, count = 0;
uint8_t c;
read_byte_from_file_to(t_midi_file, c);
++count;
if (c & 0x80) {
readVLV(t_midi_file);
}
sum += (c & 0x7F) << (7 * depth++);
if (count == depth) {
result = sum;
sum = 0;
depth = 0;
count = 0;
}
return result;
};
While running readVLV n times gives correct answers for the first n VLVs when reading from a file, I absolutely hate how I wrote it, with so many static variables and that ugly reset of them. So if someone could point me in the right direction, I'd be very pleased.

A basic _readVLV that carries its state as a parameter instead of in statics could be written as
unsigned long _readVLV(std::fstream& t_midi_file, unsigned long sum) {
    uint8_t c;
    read_byte_from_file_to(t_midi_file, c);
    // VLV bytes arrive most significant group first: shift what has been
    // read so far and append the low 7 bits of this byte.
    sum = (sum << 7) | (c & 0x7F);
    // A set high bit means another byte follows.
    if (c & 0x80) {
        return _readVLV(t_midi_file, sum);
    }
    return sum;
}
and creating a public readVLV function that seeds the state and takes only the file, like so
unsigned long readVLV(std::fstream& t_midi_file) {
    return _readVLV(t_midi_file, 0);
}
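The same decoding can also be done with a plain loop instead of recursion. The following is only a minimal sketch under some assumptions: `read_vlv` and `self_check` are hypothetical names that do not appear in the question, and the function reads from a `std::istream` rather than a `std::fstream` so that it can be tried on in-memory bytes.

```cpp
#include <cassert>
#include <cstdint>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper (not from the question): reads one VLV from a stream.
// Each byte contributes its low 7 bits, most significant group first; a clear
// high bit marks the last byte. If the stream ends early, the bits gathered
// so far are returned.
unsigned long read_vlv(std::istream& in) {
    unsigned long value = 0;
    char raw;
    while (in.get(raw)) {
        const std::uint8_t byte = static_cast<std::uint8_t>(raw);
        value = (value << 7) | (byte & 0x7Fu);
        if ((byte & 0x80u) == 0) {
            break;
        }
    }
    return value;
}

// Checks the two VLVs from the question: 81 7F -> 255 and 81 01 -> 129.
void self_check() {
    std::istringstream in(std::string("\x81\x7F\x81\x01", 4));
    assert(read_vlv(in) == 255);
    assert(read_vlv(in) == 129);
}
```

Because all state lives in local variables, nothing needs to be reset between calls.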

Related

Can't understand the calculation in the return statement of a recursive binary conversion program in C

This program does binary conversion with recursion.
It works fine, but I can't understand the meaning of one statement.
Can anyone help explain the following?
return (num % 2) + 10 * binary_conversion(num / 2);
With an input of 13 I get a little confused, like this: num = 13;
13 % 2 = 1 + 10 * 6 = 66, which looks like a nonsensical calculation.
#include <stdio.h>

int binary_conversion(int);
int main()
{
    int num, bin;
    printf("Enter a decimal number: ");
    scanf("%d", &num);
    bin = binary_conversion(num);
    printf("The binary equivalent of %d is %d\n", num, bin);
}
int binary_conversion(int num)
{
    if (num == 0)
    {
        return 0;
    }
    else
    {
        return (num % 2) + 10 * binary_conversion(num / 2);
    }
}
Your confusion stems from not understanding how recursion operates. It's time to instrument the function with print statements. This will let you follow the control and data flow of the routine.
int binary_conversion(int num)
{
printf("ENTER num = %d\n", num);
if (num == 0)
{
printf("BASE CASE returns 0\n");
return 0;
}
else
{
printf("RECURSION: new bit = %d, recur on %d\n", num % 2, num / 2);
return (num % 2) + 10 * binary_conversion(num / 2);
}
}
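To see where the result for 13 comes from, it can help to write out the chain of calls by hand. The sketch below repeats the recurrence from the question (the `assert` on non-negative input is an added assumption, not part of the original) and lists the expansion in comments.

```cpp
#include <cassert>

// Same recurrence as in the question; this sketch assumes num >= 0.
int binary_conversion(int num) {
    assert(num >= 0);
    if (num == 0) {
        return 0;
    }
    return (num % 2) + 10 * binary_conversion(num / 2);
}

// Expansion for an input of 13 (each call waits for the one below it):
//   binary_conversion(13) = 1 + 10 * binary_conversion(6)
//   binary_conversion(6)  = 0 + 10 * binary_conversion(3)
//   binary_conversion(3)  = 1 + 10 * binary_conversion(1)
//   binary_conversion(1)  = 1 + 10 * binary_conversion(0)
//   binary_conversion(0)  = 0
// Unwinding from the bottom: 1, then 11, then 110, then 1101.
```

The multiplication by 10 is applied to the value returned by the recursive call, not to num / 2 itself, so each bit ends up one decimal place to the left of the bits computed after it.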

adding 1 to a binary number using logical operations

As the title describes, I want to add 1 to a 4-bit binary number using only AND, OR and XOR operations. How can I achieve that?
Regards
Think about what you're doing when you perform addition of decimal numbers in long-hand. It's exactly the same.
Here's how I'd do it, showing a lot of working.
Label the four bits from b0 (least significant bit) to b3 (most significant bit), and introduce five carry bits, c0 to c4. The modified values are b3', b2', b1', b0', so your nibble, the carry bits, and the modified values are:
{    b3  b2  b1  b0  }
{ c4 c3  c2  c1  c0  }
{    b3' b2' b1' b0' }
and they are related through:
c0 = 1 (this is to flip the least significant bit)
b0' = XOR(b0, c0)
c1 = AND(b0, c0)
b1' = XOR(b1, c1)
c2 = AND(b1, c1)
b2' = XOR(b2, c2)
c3 = AND(b2, c2)
b3' = XOR(b3, c3)
c4 = AND(b3, c3)
Note:
There's no need for OR to be used.
The choice of four bits is arbitrary - beyond the first bit, the logic is copy/paste.
When the last carry bit c4 is 1, the number is silently overflowing (going from 15 to 0).
There's no need to keep all five carry bits, but in keeping with the hand-addition paradigm, I've introduced them anyway.
Four bits is a nibble.
Sample C# class:
public class Nibble
{
const int bits = 4;
private bool[] _bools = new bool[bits];
public void Reset()
{
for ( int i = 0; i < _bools.Length; i++ )
_bools[i] = false;
}
public void Increment()
{
bool[] result = new bool[bits];
bool[] carries = new bool[bits + 1];
carries[0] = true;
for ( int i = 0; i < bits; i++ )
{
result[i] = _bools[i] ^ carries[i];
carries[i + 1] = _bools[i] && carries[i];
}
if ( carries[bits] )
Console.WriteLine("Overflow!");
_bools = result;
}
public byte Value
{
get
{
byte result = 0;
for ( int i = 0; i < bits; i++ )
{
if ( _bools[i] )
result += (byte)(1 << i);
}
return result;
}
}
}
Usage:
static class Program
{
static void Main()
{
var nibble = new Nibble();
for ( int i = 0; i < 17; i++ )
{
Console.WriteLine(nibble.Value);
nibble.Increment();
}
}
}
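For comparison, the same ripple of carries can be applied directly to an integer. This is a minimal C++ sketch, not the answer's method: `increment_nibble` is a hypothetical name, and shifts and a bitwise OR are used only to pull bits out of the integer and put them back, while the per-bit logic is the XOR/AND pair described above.

```cpp
#include <cassert>

// Hypothetical helper: adds 1 to a 4-bit value, one bit at a time.
// If overflow is non-null, it is set to true when the final carry is 1
// (that is, when 15 wraps around to 0).
unsigned increment_nibble(unsigned n, bool* overflow) {
    assert(n < 16u);          // sketch assumes a 4-bit input
    unsigned carry = 1u;      // initial carry: the 1 being added
    unsigned result = 0u;
    for (int i = 0; i < 4; ++i) {
        const unsigned bit = (n >> i) & 1u;
        result |= (bit ^ carry) << i;  // new bit = XOR(bit, carry)
        carry = bit & carry;           // next carry = AND(bit, carry)
    }
    if (overflow != nullptr) {
        *overflow = (carry != 0u);
    }
    return result;
}
```

Calling `increment_nibble(7, nullptr)` walks the carry through three set bits and yields 8; calling it with 15 yields 0 and reports overflow.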

Rearranging an array in CUDA

I have the following problem that I want to implement on CUDA:
I want to read an array (say "flag[20]"), and based on a certain condition, write indices of this array to another array (say "pindex[]")
Simple code implementation in C can be:
int N = 20;
int flag[N];
int pindex[N];
for(int i=0;i<N;i++)
flag[i] = -1;
for(int i=0;i<N;i+=2)
flag[i] = 0;
for(int i=0;i<N;i++)
pindex[i] = 0;
//operation: count # of times flag != -1 and write those indices in a different array
int pcount1 = 0;
for(int i=0;i<N;i++)
{
if(flag[i] != -1)
{
pindex[pcount1] = i;
++pcount1;
}
}
How will I implement this in CUDA?
I can use atomicAdd() to calculate the total number of times my condition is satisfied. But how do I write the indices to a different array? For example, I tried the following:
__global__ void kernel_tryatomic(int N,int* pcount,int* flag, int* pindex)
{
int tId=threadIdx.x;
int n=(blockIdx.x*2+blockIdx.y)*BlockSize+tId;
if(n > N-1) return;
if(flag[n] != -1)
{
atomicAdd(pcount,1);
atomicExch(&pindex[*pcount],n);
//pindex[*pcount] = n;
}
}
This code calculates "pcount" correctly, but does not update "pindex" array.
I need help to do this operation on GPUs.
Thanks
Since your condition (flag) is conceptually binary, you can use a binary prefix sum (thoroughly explained here) to determine the position at which each thread with a positive flag should write.
For example, if N is 20, with the help of the __device__ functions below:
__device__ unsigned int lanemask_lt(int lane) {
    // Bits set for every lane below this one
    return (1u << lane) - 1u;
}
__device__ int warp_prefix_sum(int lane, int p) {
    const unsigned int mask = lanemask_lt( lane );
    // __ballot_sync replaces the deprecated __ballot (CUDA 9 and later)
    unsigned int b = __ballot_sync( 0xffffffffu, p );
    return __popc( b & mask );
}
your __global__ function can simply be written like below:
__global__ void kernel_scan(int N,int* pcount,int* flag, int* pindex)
{
int tId=threadIdx.x;
if(tId >= N)
return;
int threadFlag = ( flag[tId] == -1 ) ? 0 : 1;
int position_to_write = warp_prefix_sum( tId & (warpSize-1), threadFlag );
if( threadFlag )
pindex[ position_to_write ] = tId;
}
If N is bigger than the warp size (32), you can use intra-block binary prefix sum that is explained in the provided link.
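The ballot-and-popcount idea can be tried on the CPU before moving it to a kernel. The following is a plain C++ simulation of a single warp, not CUDA code; `compact_one_warp` is a hypothetical name, and `std::bitset<32>::count()` stands in for the population count.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>
#include <vector>

// CPU-side simulation of one warp (at most 32 lanes).
// Returns the indices i for which flag[i] != -1, in ascending order.
std::vector<int> compact_one_warp(const std::vector<int>& flag) {
    assert(flag.size() <= 32);

    // Emulate the ballot: bit i is set when lane i satisfies the predicate.
    unsigned ballot = 0u;
    for (std::size_t lane = 0; lane < flag.size(); ++lane) {
        if (flag[lane] != -1) {
            ballot |= 1u << lane;
        }
    }

    std::vector<int> pindex(std::bitset<32>(ballot).count());
    for (std::size_t lane = 0; lane < flag.size(); ++lane) {
        if (flag[lane] == -1) {
            continue;
        }
        // Count the set bits strictly below this lane: that is its slot.
        const unsigned lower = (1u << lane) - 1u;
        const std::size_t pos = std::bitset<32>(ballot & lower).count();
        pindex[pos] = static_cast<int>(lane);
    }
    return pindex;
}
```

With the flag array from the question (N = 20, even entries set to 0, odd entries left at -1), this returns the ten even indices 0, 2, ..., 18. Each lane computes its slot independently of the others, which is the property that lets the GPU version avoid atomics.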

Implementing the exponential function with basic arithmetic operations

For the purpose of the exercise, I have to implement the exponential function with the most basic arithmetic operations. I came up with this, where x is the base and y the exponent:
function expAetB() {
product=1;
for (i=0; i<y; i++)
{
product=product*x;
}
return product;
};
However, there are more basic operations than product=product*x;. I should be able to replace it with another for loop that multiplies and passes on the result, but I can't find a way to do it without falling into an infinite loop.
In the same way that exponentiation is repeated multiplication, so multiplication is simply repeated addition.
Simply create another function mulAetB which does that for you, and watch out for things like negative inputs.
You could go even one more level and define adding in terms of increment and decrement, but that may be overkill.
See, for example, the following program which uses the overkill method of addition:
#include <stdio.h>
static unsigned int add (unsigned int a, unsigned int b) {
unsigned int result = a;
while (b-- != 0) result++;
return result;
}
static unsigned int mul (unsigned int a, unsigned int b) {
unsigned int result = 0;
while (b-- != 0) result = add (result, a);
return result;
}
static unsigned int pwr (unsigned int a, unsigned int b) {
unsigned int result = 1;
while (b-- != 0) result = mul (result, a);
return result;
}
int main (void) {
int test[] = {0,5, 1,9, 2,4, 3,5, 7,2, -1}, *ip = test;
while (*ip != -1) {
printf ("%d + %d = %3u\n" , *ip, *(ip+1), add (*ip, *(ip+1)));
printf ("%d x %d = %3u\n" , *ip, *(ip+1), mul (*ip, *(ip+1)));
printf ("%d ^ %d = %3u\n\n", *ip, *(ip+1), pwr (*ip, *(ip+1)));
ip += 2;
}
return 0;
}
The output of this program shows that the calculations are correct:
0 + 5 =   5
0 x 5 =   0
0 ^ 5 =   0

1 + 9 =  10
1 x 9 =   9
1 ^ 9 =   1

2 + 4 =   6
2 x 4 =   8
2 ^ 4 =  16

3 + 5 =   8
3 x 5 =  15
3 ^ 5 = 243

7 + 2 =   9
7 x 2 =  14
7 ^ 2 =  49
If you really must have it in a single function, it's a simple matter of refactoring the function call to be inline:
static unsigned int pwr (unsigned int a, unsigned int b) {
unsigned int xres, xa, result = 1;
// Catch common cases, simplifies rest of function (a>1, b>0)
if (b == 0) return 1;
if (a == 0) return 0;
if (a == 1) return 1;
// Do power as repeated multiplication.
result = a;
while (--b != 0) {
// Do multiplication as repeated addition.
xres = result;
xa = a;
while (--xa != 0)
result = result + xres;
}
return result;
}
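The functions above use unsigned arguments, so the earlier advice to watch out for negative inputs does not arise. If signed values are needed, one possible approach is to fold the sign of the loop counter into the other operand. This is a minimal sketch in C++; `mul_signed` is a hypothetical name, and the sketch assumes neither argument is INT_MIN and that the product fits in an int.

```cpp
#include <cassert>
#include <climits>

// Hypothetical signed variant of mul: repeated addition, with the sign of
// the loop counter b moved onto a so that the loop always counts down to 0.
int mul_signed(int a, int b) {
    assert(a != INT_MIN && b != INT_MIN);  // negating INT_MIN would overflow
    if (b < 0) {
        a = -a;
        b = -b;
    }
    int result = 0;
    while (b-- != 0) {
        result += a;
    }
    return result;
}
```

Without the sign fold, a negative b would never reach 0 by counting down from a non-negative start, which is one way such a loop can appear to run forever.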

How to simplify this loop?

Considering an array a[i], i=0,1,...,g, where g could be any given number, and a[0]=1.
for a[1]=a[0]+1 to 1 do
for a[2]=a[1]+1 to 3 do
for a[3]=a[2]+1 to 5 do
...
for a[g]=a[g-1]+1 to 2g-1 do
#print a[1],a[2],...a[g]#
The problem is that every time we change the value of g, we need to modify the code, namely the loops above. This is not good code.
Recursion is one way to solve this (although I would love to see an iterative solution).
!!! Warning, untested code below !!!
template<typename A, unsigned int Size>
void recurse(A (&arr)[Size], int level, int g)
{
    if (level > g)
    {
        // I am at the bottom level, do stuff here
        return;
    }
    // The upper bound is inclusive, matching "to 2*level-1" in the question
    for (arr[level] = arr[level-1]+1; arr[level] <= 2 * level - 1; arr[level]++)
    {
        recurse(arr, level+1, g);
    }
}
Then call with recurse(arr,1,g);
Imagine you are representing numbers with an array of digits. For example, 682 would be [6,8,2].
If you wanted to count from 0 to 999 you could write:
for (int n[0] = 0; n[0] <= 9; ++n[0])
for (int n[1] = 0; n[1] <= 9; ++n[1])
for (int n[2] = 0; n[2] <= 9; ++n[2])
// Do something with three digit number n here
But when you want to count to 9999 you need an extra for loop.
Instead, you use the procedure for adding 1 to a number: increment the final digit, if it overflows move to the preceding digit and so on. Your loop is complete when the first digit overflows. This handles numbers with any number of digits.
You need an analogous procedure to "add 1" to your loop variables.
Increment the final "digit", that is a[g]. If it overflows (i.e. exceeds 2g-1) then move on to the next most-significant "digit" (a[g-1]) and repeat. A slight complication compared to doing this with numbers is that having gone back through the array as values overflow, you then need to go forward to reset the overflowed digits to their new base values (which depend on the values to the left).
The following C# code implements both methods and prints the arrays to the console.
static void Print(int[] a, int n, ref int count)
{
++count;
Console.Write("{0} ", count);
for (int i = 0; i <= n; ++i)
{
Console.Write("{0} ", a[i]);
}
Console.WriteLine();
}
private static void InitialiseRight(int[] a, int startIndex, int g)
{
for (int i = startIndex; i <= g; ++i)
a[i] = a[i - 1] + 1;
}
static void Main(string[] args)
{
const int g = 5;
// Old method
int count = 0;
int[] a = new int[g + 1];
a[0] = 1;
for (a[1] = a[0] + 1; a[1] <= 2; ++a[1])
for (a[2] = a[1] + 1; a[2] <= 3; ++a[2])
for (a[3] = a[2] + 1; a[3] <= 5; ++a[3])
for (a[4] = a[3] + 1; a[4] <= 7; ++a[4])
for (a[5] = a[4] + 1; a[5] <= 9; ++a[5])
Print(a, g, ref count);
Console.WriteLine();
count = 0;
// New method
// Initialise array
a[0] = 1;
InitialiseRight(a, 1, g);
int index = g;
// Loop until all "digits" have overflowed
while (index != 0)
{
// Do processing here
Print(a, g, ref count);
// "Add one" to array
index = g;
bool carry = true;
while ((index > 0) && carry)
{
carry = false;
++a[index];
if (a[index] > 2 * index - 1)
{
--index;
carry = true;
}
}
// Re-initialise digits that overflowed.
if (index != g)
InitialiseRight(a, index + 1, g);
}
}
I'd say you don't want nested loops in the first place. Instead, you just want to call a suitable function, taking the current nesting level, the maximum nesting level (i.e. g), the start of the loop, and whatever it needs as context for the computation as arguments:
template <typename T>
void process(int level, int g, int start, T& context) {
    // Assumes the first call passes level 1 and start = a[0].
    if (level <= g) {
        // Inclusive upper bound, matching "to 2*level-1" in the question
        for (int a(start + 1), end(2 * level - 1); a <= end; ++a) {
            process(level + 1, g, a, context);
        }
    }
    else {
        // all g loop levels are fixed: computation goes here
    }
}
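As a runnable illustration of the recursive idea, here is a minimal C++ sketch that collects every tuple instead of printing it. The name `enumerate` and the `upper` vector of inclusive per-level bounds are assumptions made for this example; passing the bounds in explicitly avoids hard-coding any particular bound formula.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical sketch: a[0] is fixed by the caller, and for each level >= 1
// the value a[level] runs from a[level-1] + 1 up to upper[level] inclusive.
// upper[0] is unused. Every complete assignment of a is appended to out.
void enumerate(std::vector<int>& a, std::size_t level,
               const std::vector<int>& upper,
               std::vector<std::vector<int>>& out) {
    assert(level >= 1 && a.size() == upper.size());
    if (level == a.size()) {
        // Past the innermost loop: all levels are fixed, record the tuple.
        out.push_back(a);
        return;
    }
    for (a[level] = a[level - 1] + 1; a[level] <= upper[level]; ++a[level]) {
        enumerate(a, level + 1, upper, out);
    }
}
```

Starting with `a = {1, 0, 0, 0, 0, 0}` and `upper = {0, 2, 3, 5, 7, 9}` (the bounds used in the nested C# loops above) and calling `enumerate(a, 1, upper, out)` yields 14 tuples, from 1 2 3 4 5 6 to 1 2 3 5 7 9. Changing g only changes the sizes of the two vectors, not the code.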