Without using recursion, how can a stack overflow exception be thrown?

Since no one else has mentioned it:
throw new System.StackOverflowException();
You might do this when testing or doing fault-injection.

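Declare an ENORMOUS array as a local variable. (This works in C and C++, where local arrays live on the stack. In C#, ordinary arrays are heap-allocated, so you would need stackalloc.)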

If you nest enough method calls, a stack overflow can occur at any time. That said, if you get stack overflow errors without using recursion, you may want to rethink how you're doing things. It's just so easy with recursion because runaway recursion piles up a huge number of nested calls.

The following applies to Windows, but most OSs implement this in a similar fashion.
The short answer is: if you touch the last guard page, it will throw.
An exception of type EXCEPTION_STACK_OVERFLOW (C00000FD) is raised when your application touches the bottom page of the stack, which is marked with the PAGE_GUARD protection flag, and there is no room to grow the stack (commit one more page); see "How to trap stack overflow in a Visual C++ application".
The typical case is a stack that has grown because of many function frames (i.e. out-of-control recursion), because of fewer frames with very large frame sizes (functions with a very large locally scoped object), or because of explicit stack allocation with _alloca.
Another way to cause the exception is simply to touch the guard page intentionally, e.g. by dereferencing a pointer that points into that page. This can also happen by accident due to a variable initialization bug.
Stack overflows can occur on valid execution paths if the input causes a very deep nesting level. For instance, see "Stack overflow occurs when you run a query that contains a large number of arguments inside an IN or a NOT IN clause in SQL Server".

Every method call that has not yet returned consumes some stack space. (Methods with more local variables consume more space.) A very deep call stack can result in stack overflow.
Note that on systems with limited memory (mobile devices and such) you don't have much stack space and will run out sooner.

Short answer: whenever an object calls a method on an internal object, the call stack grows by one frame. So if you have thousands of objects nested inside one another, each calling its internal object, you'll eventually get a stack overflow.
Here's a demonstration of how to generate primes using nested iterators:
using System;
using System.Collections.Generic;
using System.Text;
namespace ConsoleApplication1
{
    class Program
    {
        static void Main(string[] args)
        {
            Program p = new Program();
            IEnumerator<int> primes = p.AllPrimes().GetEnumerator();
            // Must exceed the nesting depth at which the overflow occurs;
            // a small value such as 1000 finishes without overflowing.
            int numberOfPrimes = 200000;
            for (int i = 0; i <= numberOfPrimes; i++)
            {
                primes.MoveNext();
                if (i % 1000 == 0)
                {
                    Console.WriteLine(primes.Current);
                }
            }
            Console.ReadKey(true);
        }
        // Passes through every value of seq that is not divisible by num.
        IEnumerable<int> FilterDivisors(IEnumerator<int> seq, int num)
        {
            while (true)
            {
                int current = seq.Current;
                if (current % num != 0)
                {
                    yield return current;
                }
                seq.MoveNext();
            }
        }
        // Endless sequence 2, 3, 4, ...
        IEnumerable<int> AllIntegers()
        {
            int i = 2;
            while (true)
            {
                yield return i++;
            }
        }
        IEnumerable<int> AllPrimes()
        {
            IEnumerator<int> nums = AllIntegers().GetEnumerator();
            while (true)
            {
                nums.MoveNext();
                int prime = nums.Current;
                yield return prime;
                // nested iterator makes a big boom:
                // every prime wraps nums in one more iterator layer
                nums = FilterDivisors(nums, prime).GetEnumerator();
            }
        }
    }
}
There's no recursion, but the program will throw a stack overflow exception after around 150,000 primes.
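For comparison, here is a rough Java sketch of the same nested-object idea. It is not a port of the C# program above: the names NestedCalls, chain and overflows are made up for illustration, and the 256 KB stack size is only a hint that the JVM is free to ignore. Each layer is a separate object that calls the object it wraps, so evaluating the outermost one needs one or more stack frames per layer. Java lets a program catch StackOverflowError, which is what makes the check possible here.

```java
import java.util.function.IntSupplier;

class NestedCalls {
    // Wraps a constant supplier in `depth` layers; each layer calls the one inside it.
    static IntSupplier chain(int depth) {
        IntSupplier current = () -> 0;
        for (int i = 0; i < depth; i++) {
            final IntSupplier inner = current;
            current = () -> inner.getAsInt() + 1;
        }
        return current;
    }

    // Evaluates a chain on a thread with a small requested stack
    // and reports whether a StackOverflowError was observed.
    static boolean overflows(int depth) {
        final IntSupplier outer = chain(depth);
        final boolean[] overflowed = {false};
        Thread worker = new Thread(null, () -> {
            try {
                outer.getAsInt();
            } catch (StackOverflowError e) {
                overflowed[0] = true;
            }
        }, "nested-calls", 256 * 1024);
        worker.start();
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return overflowed[0];
    }

    public static void main(String[] args) {
        // A shallow chain evaluates normally: 100 layers each add 1.
        System.out.println(chain(100).getAsInt());
        // A million layers is expected to exhaust the stack on a typical JVM.
        System.out.println(overflows(1_000_000));
    }
}
```

The chain is built with a plain loop, so the depth comes purely from how many objects are nested, not from any function calling itself by name.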

If you're talking about C++ with a reasonable standard library, I imagine that this would work:
while (true) {
    alloca(1024 * 1024); // arbitrary - 1M per iteration.
}
Details on alloca.

int main()
{
    // something on the stack
    int foo = 0;
    for (
        // pointer to an address on the stack
        int* p = &foo;
        // forever
        ;
        // ever lower on the stack (assuming that the stack grows downwards)
        --p)
    {
        // write to the stack
        *p = 42;
    }
}

You can allocate a few bytes on the stack as well.
static void Main(string[] args)
{
    Span<byte> b = stackalloc byte[1024 * 1024 * 1024]; // Process is terminating due to StackOverflowException.
}

Easiest way to make a StackOverflowException is the following:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
namespace ConsoleApplication2
{
    class Program
    {
        static void Main(string[] args)
        {
            SomeClass instance = new SomeClass();
            string name = instance.Name;
        }
    }
    public class SomeClass
    {
        public string Name
        {
            get
            {
                // the getter reads the property itself, so it calls itself endlessly
                return Name;
            }
        }
    }
}

Can a branch in CUDA be ignored if all the warps go one path? If so, is there a way I could give the compiler/runtime this information?

Suppose we have code like the following (I have not compiled this, it may be wrong)
__global__ void myKernel()
{
    int data = someArray[threadIdx.x];
    if (data == 0) {
        funcA();
    } else {
        funcB();
    }
}
Now suppose there's a 1024-thread block running, and someArray is all zero.
Further suppose that funcB() is costly to run, but funcA() is not.
I assume the compiler has to emit both paths sequentially, like doing funcA first, then funcB after. This is not ideal.
Is there a way to hint to CUDA to not do it? Or does the runtime notice "no threads are active so I will skip over all the instructions as I see them"?
Or better yet, what if the branch was something like this (again, haven't compiled this, but it illustrates what I am trying to convey)
__constant__ int constantNumber;
__global__ void myKernel()
{
    if (constantNumber == 123) {
        funcA();
    } else {
        funcB();
    }
}
and then I set constantNumber to 123 before launching the kernel. Would this still cause both paths to be taken?
This can be achieved using __builtin_assume.
https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#__builtin_assume
Quoting the documentation:
void __builtin_assume(bool exp)
Allows the compiler to assume that the Boolean argument is true. If the argument is not true at run time, then the behavior is undefined. The argument is not evaluated, so any side-effects will be discarded.

Cannot invoke mult(float) on the primitive type float

I'm working on a simple gravity program in Processing. My program takes particles and attracts them to each other based on the formula for gravity. Unfortunately, when I try to multiply the direction by the force with PVector.mult(), I get the error in the title:
Cannot invoke mult(float) on the primitive type float.
Here is my code for the method. G is defined elsewhere.
public float distance(Particle other) {
    return location.sub(other.location).mag();
}
public PVector direction(Particle other) {
    return location.sub(other.location).normalize();
}
public void gravity(Particle other) {
    float grav = (G*((mass * other.mass)/pow(distance(other), 2)));
    if(distance(other) != 0) {
        acceleration.add(distance(other).mult(grav));
    }
}
Why am I not able to pass a float where a float is due?
Let's take this line apart and split it into multiple steps:
acceleration.add(distance(other).mult(grav));
Here's my attempt to split it into multiple lines:
float grav = 42;
float distanceFromOther = distance(other);
float multipliedValue = distanceFromOther.mult(grav);
acceleration.add(multipliedValue);
Hopefully this makes it more obvious what's going on: you're trying to call mult() on a float value, which won't work. You need to call mult on a PVector or another class that contains a mult() function.
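Presumably the intended call was direction(other).mult(grav), since direction() returns a vector. Here is a minimal sketch of that fix in plain Java. Vec2 is a made-up stand-in for PVector so the example runs outside Processing, G = 1 is a placeholder, and two details are my own assumptions rather than the asker's code: the direction points from this particle toward the other one (attraction), and sub is a static helper that returns a new vector. The second point matters because the instance method location.sub(...) in Processing changes location itself; the static form PVector.sub(a, b) avoids that.

```java
// Hypothetical stand-in for Processing's PVector, reduced to what the example needs.
class Vec2 {
    float x, y;

    Vec2(float x, float y) { this.x = x; this.y = y; }

    // Returns a new vector a - b and leaves both operands unchanged.
    static Vec2 sub(Vec2 a, Vec2 b) { return new Vec2(a.x - b.x, a.y - b.y); }

    float mag() { return (float) Math.sqrt(x * x + y * y); }

    // Scales this vector in place and returns it so calls can be chained.
    Vec2 mult(float s) { x *= s; y *= s; return this; }

    Vec2 normalize() { float m = mag(); return m == 0 ? this : mult(1 / m); }

    Vec2 add(Vec2 v) { x += v.x; y += v.y; return this; }
}

class Particle {
    static final float G = 1.0f; // placeholder; the question defines G elsewhere
    Vec2 location;
    Vec2 acceleration = new Vec2(0, 0);
    float mass;

    Particle(float x, float y, float mass) {
        this.location = new Vec2(x, y);
        this.mass = mass;
    }

    float distance(Particle other) {
        return Vec2.sub(other.location, location).mag();
    }

    // Unit vector pointing from this particle toward the other one (assumed direction).
    Vec2 direction(Particle other) {
        return Vec2.sub(other.location, location).normalize();
    }

    void gravity(Particle other) {
        float d = distance(other);
        if (d != 0) {
            float grav = G * (mass * other.mass) / (d * d);
            // mult() is called on the direction vector, not on the float distance.
            // As in the question, the result is added to acceleration directly.
            acceleration.add(direction(other).mult(grav));
        }
    }

    public static void main(String[] args) {
        Particle a = new Particle(0, 0, 2);
        Particle b = new Particle(3, 4, 5);
        a.gravity(b);
        // Distance is 5, so grav = 10 / 25 = 0.4 and the result is roughly (0.24, 0.32).
        System.out.println(a.acceleration.x + ", " + a.acceleration.y);
    }
}
```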

Stack overflow on recursive function

I've got a problem with recursive functions. I wrote this one in Java; it's pretty basic, but it doesn't work because of a stack overflow error. The function is supposed to call itself as many times as the difference between a given number and the number you pass in from the main function, which really shouldn't be a problem for the stack, yet it never works. What's the mistake here?
Thanks in advance for the answers :)
public class Übung_Baeume {
    static int anzAufrufe=0;
    static int zahl=23;
    public static int zaehleAufrufe(int uebergabe)
    {
        anzAufrufe++;
        if (uebergabe==zahl){
            return anzAufrufe;
        }
        return zaehleAufrufe(uebergabe-1) +
               zaehleAufrufe(uebergabe+1);
    }
    public static void main(String[] args) {
        System.out.println(zaehleAufrufe(40));
    }
}
If uebergabe is not equal to 23, the function recurses with both uebergabe - 1 and uebergabe + 1. Each of those calls does the same, so you can trace the expansion by hand:
zaehleAufrufe(40) ; ==>
zaehleAufrufe(39) + zaehleAufrufe(41) ; ==> neither of these are 23
zaehleAufrufe(38) + zaehleAufrufe(40) + zaehleAufrufe(40) + zaehleAufrufe(42)
Notice the last line. Even though some of these calls will eventually hit the base case, the third expansion already contains two copies of zaehleAufrufe(40). Each of them expands exactly like the original call, producing yet more zaehleAufrufe(40) calls, so the expansion never finishes.
For recursion to work, each call must reduce the problem to a simpler one. Yours turns each problem into several problems of the same size, which gives infinite recursion.
To call the function as many times as the difference, recurse only once:
public static int zaehleAufrufe(int uebergabe)
{
    anzAufrufe++;
    if (uebergabe <= zahl) {
        return anzAufrufe;
    }
    return zaehleAufrufe(uebergabe-1);
}
zaehleAufrufe(40) ; ==>
zaehleAufrufe(39) ; ==>
...
zaehleAufrufe(23) ; ==> 18
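The same single-branch idea can be written without the static counter. This is only a sketch: CallCounter and countCalls are names I made up, and the target is passed as a parameter instead of being read from a field. Each call counts itself and then makes at most one recursive call on a smaller value, so the depth is bounded by the difference between the two numbers.

```java
class CallCounter {
    // Returns how many calls it takes to step from value down to target.
    static int countCalls(int value, int target) {
        if (value <= target) {
            return 1; // base case: count this call and stop
        }
        // exactly one recursive call, on a value one step closer to the target
        return 1 + countCalls(value - 1, target);
    }

    public static void main(String[] args) {
        System.out.println(countCalls(40, 23)); // prints 18
    }
}
```

Because the result is built from return values rather than shared state, calling it twice gives the same answer both times.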
A stack overflow almost always means that nothing stops the recursion from going deeper and deeper: there is no condition that ends it once a certain depth is reached, whether or not the goal has been achieved.
In your code you start from 40 and stop only when you reach 23. But one of your branches increases the number:
return zaehleAufrufe(uebergabe-1) + zaehleAufrufe(uebergabe+1);
and will never get down to 23.
Welcome to StackOverflow with a stack overflow :)
P.S. The best thing to do is to reconsider your algorithm. If you are sure you want to use recursion but its branching is unpredictable because it depends on unknown data, you can add a depth limit. It is a dirty hack, but there are cases where it is useful.
It is important to say that even with this limit your code will still fail: it will try to call the function as many as 2^33 times, about 8 billion, which is big enough :)
public class Übung_Baeume {
    static int anzAufrufe=0;
    static int zahl=23;
    static int max_level = 32;
    static boolean fault = false;
    public static int zaehleAufrufe(int uebergabe, int level)
    {
        if(level == max_level)
        {
            // depth limit reached: record the fault and stop descending
            fault = true;
            return 0;
        }
        anzAufrufe++;
        if (uebergabe==zahl){
            return anzAufrufe;
        }
        return zaehleAufrufe(uebergabe-1, level+1) +
               zaehleAufrufe(uebergabe+1, level+1);
    }
    public static void main(String[] args) {
        int ret = zaehleAufrufe(40,0);
        if(!fault)
            System.out.println(ret);
        else
            System.out.println("Fault - recursion level limit reached!");
    }
}

Thrust - accessing neighbors

I would like to use Thrust's stream compaction functionality (copy_if) for distilling indices of elements from a vector if the elements adhere to a number of constraints. One of these constraints depends on the values of neighboring elements (8 in 2D and 26 in 3D). My question is: how can I obtain the neighbors of an element in Thrust?
The function call operator of the functor for the 'copy_if' basically looks like:
__host__ __device__ bool operator()(float x) {
    bool mark = x < 0.0f;
    if (mark) {
        if (left neighbor of x > 1.0f) return false;
        if (right neighbor of x > 1.0f) return false;
        if (top neighbor of x > 1.0f) return false;
        //etc.
    }
    return mark;
}
Currently I use a work-around by first launching a CUDA kernel (in which it is easy to access neighbors) to appropriately mark the elements. After that, I pass the marked elements to Thrust's copy_if to distill the indices of the marked elements.
I came across counting_iterator as a sort of substitute for directly using threadIdx and blockIdx to acquire the index of the processed element. I tried the solution below, but compiling it gives me the error "/usr/include/cuda/thrust/detail/device/cuda/copy_if.inl(151): Error: Unaligned memory accesses not supported". As far as I know, I'm not trying to access memory in an unaligned fashion. Does anybody know what's going on and/or how to fix this?
struct IsEmpty2 {
    float* xi;
    IsEmpty2(float* pXi) { xi = pXi; }
    __host__ __device__ bool operator()(thrust::tuple<float, int> t) {
        bool mark = thrust::get<0>(t) < -0.01f;
        if (mark) {
            int countindex = thrust::get<1>(t);
            if (xi[countindex] > 1.01f) return false;
            //etc.
        }
        return mark;
    }
};
thrust::copy_if(indices.begin(),
                indices.end(),
                thrust::make_zip_iterator(thrust::make_tuple(xi, thrust::counting_iterator<int>())),
                indicesEmptied.begin(),
                IsEmpty2(rawXi));
@phoad: you're right about the shared mem. It struck me after I had already posted my reply, and I then figured that the cache would probably help me, but you beat me with your quick response. The if-statement, however, is executed in less than 5% of all cases, so either using shared mem or relying on the cache will probably have negligible impact on performance.
Tuples only support 10 values, so I would need tuples of tuples for the 26 values in the 3D case. Working with tuples and zip_iterator was already quite cumbersome, so I'll pass on this option (also from a code readability standpoint). I tried your suggestion of directly using threadIdx.x etc. in the device function, but Thrust doesn't like that. I seem to be getting some inexplicable results, and sometimes I end up with a Thrust error. The following program, for example, generates a 'thrust::system::system_error' with an 'unspecified launch failure', although it first correctly prints "Processing 10" to "Processing 41":
#include <cstdio>
#include <thrust/device_vector.h>
#include <thrust/for_each.h>
struct printf_functor {
    __host__ __device__ void operator()(int e) {
        printf("Processing %d\n", threadIdx.x);
    }
};
int main() {
    thrust::device_vector<int> dVec(32);
    for (int i = 0; i < 32; ++i)
        dVec[i] = i + 10;
    thrust::for_each(dVec.begin(), dVec.end(), printf_functor());
    return 0;
}
The same applies to printing blockIdx.x. Printing blockDim.x, however, generates no error. I was hoping for a clean solution, but I guess I am stuck with my current work-around.

Segfault Copy Constructor

My code is as follows:
void Scene::copy(Scene const & source)
{
    maxnum=source.maxnum;
    imagelist = new Image*[maxnum];
    for(int i=0; i<maxnum; i++)
    {
        if(source.imagelist[i] != NULL)
        {
            imagelist[i] = new Image;
            imagelist[i]->xcoord = source.imagelist[i]->xcoord;
            imagelist[i]->ycoord = source.imagelist[i]->ycoord;
            (*imagelist[i])=(*source.imagelist[i]);
        }
        else
        {
            imagelist[i] = NULL;
        }
    }
}
A little background: the Scene class has a private int called maxnum and a dynamically allocated array of Image pointers that is created upon construction. These pointers point to images. The copy constructor attempts to make a deep copy of all of the images in the array. Somehow I'm getting a segfault, but I don't see how I would be accessing the array out of bounds.
Anyone see something wrong?
I'm new to C++, so it's probably something obvious.
Thanks,
I would suggest making maxnum (and maybe imagelist) a private data member and implementing a const getMaxnum() method and a setMaxnum() method. But I doubt that is the cause of the segfault as you described it.
I would try removing that const before your reference and implementing const public methods to extract the data. It probably compiles since it is just a reference. Also, I would try switching to a pointer instead of passing by reference.
Alternatively, you can create a separate Scene class object and pass the Image type data as an array pointer. And I don't think you can declare Image *imagelist[value];.
void Scene::copy(Image *sourceimagelist, int sourcemaxnum) {
    maxnum=sourcemaxnum;
    imagelist=new Image[maxnum];
    //...
    imagelist[i].xcoord = sourceimagelist[i].xcoord;
    imagelist[i].ycoord = sourceimagelist[i].ycoord;
    //...
}
//...
Scene a,b;
//...
b.copy(a.imagelist,a.maxnum);
If the source Scene had maxnum set higher than the actual number of items in its imagelist, then the loop would run past the end of the source.imagelist array. Maybe maxnum is getting initialized to one while the array starts out empty (or maxnum might not be getting initialized at all), or, if you have a Scene::remove_image() function, it might have removed an imagelist entry without decrementing maxnum. I'd suggest using a std::vector rather than a raw array. The vector will keep track of its own size, so your for loop would be:
for(std::size_t i=0; i<source.imagelist.size(); i++)
and it would only access as many items as the source vector held. Another possible explanation for the crash is that one of the pointers in source.imagelist belongs to an Image that was deleted, but the pointer was never set to NULL and is now a dangling pointer.
delete source.imagelist[4];
...
... // If source.imagelist[4] wasn't set to NULL or removed from the array,
... // then we'll have trouble later.
...
for(int i=0; i<maxnum; i++)
{
    if (source.imagelist[i] != NULL) // This evaluates to true even when i == 4
    {
        // When i == 4, we're reading the xcoord member from an Image
        // object that no longer exists.
        imagelist[i]->xcoord = source.imagelist[i]->xcoord;
That last line will access memory that it shouldn't. Maybe the object still happens to exist in memory because it hasn't gotten overwritten yet, or maybe it has been overwritten and you'll retrieve an invalid xcoord value. If you're lucky, though, then your program will simply crash. If you're dealing directly with new and delete, make sure that you set a pointer to NULL after you delete it so that you don't have a dangling pointer. That doesn't prevent this problem if you're holding a copy of the pointer somewhere, though, in which case the second copy isn't going to get set to NULL when you delete-and-NULL the first copy. If you later try to access the second copy of the pointer, you'll have no way of knowing that it's no longer pointing to a valid object.
It's much safer to use a smart pointer class and let that deal with memory management for you. There's a smart pointer in the standard C++ library called std::auto_ptr, but it has strange semantics and can't be used in C++ containers, such as std::vector (it was deprecated in C++11 and removed in C++17). If you have the Boost libraries installed, though, then I'd suggest replacing your raw pointers with a boost::shared_ptr. With a C++11 or later compiler, std::shared_ptr and std::unique_ptr are available in the standard library itself.