Performance problems when using renderTexture->saveToFile() and nodeGrid->runAction() in cocos2d-x

While learning cocos2d-x I have run into the following questions.
The function renderTexture->saveToFile(filename, Image::Format::PNG) is very slow. The app literally freezes for 2-4 seconds,
and all animations hang when I call this method.
That is not normal. How can I work around it?
Running actions on a NodeGrid with runAction lets me apply different effects to sprites.
In my application I use Waves3D and ShuffleTiles. They also cause terrible slowdowns (especially ShuffleTiles), and on weaker devices
the app hangs until a hard reset or is terminated.
Am I doing something wrong? Why do these effects exist if they cause such severe problems?
Here is an example of code that is terribly slow when it is called for 5-9 sprites at the same time.
ActionInterval* shuffle = ShuffleTiles::create(2, Size(15, 15), 100);
nodeGrid = NodeGrid::create();
nodeGrid->runAction(Sequence::create(shuffle, nullptr));

Related

How to slow down all actions in a cocos2d-x game

I'm implementing a game in cocos2d-x.
I have now implemented a "Replay of My Game" feature (the game is shown again from the start).
I want to replay the game at 1x, 2x, 3x, or 4x speed. When the speed changes to 2x, all actions (move, rotate, etc.) should run according to the new speed.
How can I do that by changing the general speed of CCAction?
I know the approaches that use variables or the scheduler,
but I want a general solution.
You can use the following code to slow down or speed up all schedulers and actions:
float val = 2.0f; // speed up
val = 0.5f; // slow down
Director::getInstance()->getScheduler()->setTimeScale(val);
The default value is 1.0.
Write a class like CCEaseIn yourself.
Override update(float time).
m_pInner->update(powf(time, m_fRate)); // this is what update() looks like in CCEaseIn
The code can be changed like this:
m_pInner->update(func(time));
func(float time) is the function that remaps the time, for example time/2 for 0.5x or time*2 for 2x. You can store extra parameters to make the function more adaptable.
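The wrapper idea can be sketched without any cocos2d-x types. This is only an illustration: `MoveX` and `TimeWarp` are hypothetical stand-ins, not engine classes, and it assumes that update() receives normalized progress in [0, 1] the way an interval action does. Under that assumption, remapping the argument changes how far the action has progressed at a given moment rather than its total duration.

```cpp
#include <functional>
#include <utility>

// Hypothetical stand-in for an interval action: update() receives
// normalized progress t in [0, 1] and moves x from 'from' to 'to'.
struct MoveX {
    float from;
    float to;
    float x;
    void update(float t) { x = from + (to - from) * t; }
};

// Hypothetical wrapper: remaps progress with func before forwarding it
// to the inner action, in the same spirit as CCEaseIn's powf(time, rate).
template <typename Inner>
class TimeWarp {
public:
    TimeWarp(Inner inner, std::function<float(float)> func)
        : inner_(std::move(inner)), func_(std::move(func)) {}

    void update(float t) { inner_.update(func_(t)); }

    const Inner& inner() const { return inner_; }

private:
    Inner inner_;
    std::function<float(float)> func_;
};
```

With `func` set to `t * 0.5f`, the inner action has covered half of its path when the wrapper reports full progress.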

Scala JPanel rendering synchronisation

I'm doing a simulation program in Scala and I'm trying to render the simulation in a JPanel by overriding the paintComponent:
override def paintComponent(g: Graphics2D) = {
  g.setRenderingHint(RenderingHints.KEY_ANTIALIASING, RenderingHints.VALUE_ANTIALIAS_ON)
  super.paintComponent(g)
  val tx1 = g.getTransform() // remember the base transform so it can be restored per vehicle
  g.setColor(new Color(0, 0, 0))
  simulator.getVehicles foreach { vehc =>
    g.translate(vehc.getPos.x, vehc.getPos.y)
    g.draw(new Ellipse2D.Double(-Vehicle.rad, -Vehicle.rad, Vehicle.diam, Vehicle.diam))
    g.drawLine(0, 0, (Vehicle.rad * vehc.getDir.x).toInt, (Vehicle.rad * vehc.getDir.y).toInt)
    g.setTransform(tx1)
  }
}
I have the simulation itself running on a different thread:
def run(): Unit = {
  // logic loop
  time = System.currentTimeMillis()
  dt = 1000 / 60
  while (loop) {
    getVehicles.foreach {
      _.move
    }
    collider.solvecollisions()
    // sleep for the rest of the timestep; elapsed time is now - time
    Thread.sleep(math.max(0L, dt - (System.currentTimeMillis() - time)))
    time = System.currentTimeMillis()
  }
}
getVehicles returns a Buffer[Vehicle] of all simulated vehicles.
My problem is that there is jittering in the rendering. What I mean is that sometimes some of the vehicles are rendered a timestep later than others.
I presume this happens because the simulation loop updates the positions at the same time as the render loop fetches them, so the two overlap. That is, when rendering begins in timestep n, half of the vehicles are rendered, then timestep n+1 happens and the rest of the vehicles are rendered one timestep further on. At first I thought this was a problem to be solved with double buffering, but since paintComponent already does that, I don't think that's the case.
Any ideas how to fix this? I tried simply rendering getVehicles.clone, but that didn't help because the references to the vehicles are still the same.
Thanks!
It appears that your vehicle models are mutable (_.move). If simulation and painting run in different threads, it is no surprise that you don't get a consistent world view in Swing.
I can see the following solutions, depending on your requirements:
- Run the simulation updates on the event dispatch thread. Advantage: very little change to your code. Disadvantage: may make the GUI sluggish if the simulation is heavy.
- Create one global "world" lock on which both threads synchronize. Advantage: requires very little change to the code. Disadvantage: unless GUI updates happen at a low rate, simulation and rendering block each other. Might be useful if the GUI update rate is a fraction of the simulation rate.
- Adopt an immutable model, so that your simulation creates one consistent updated world in each step. Advantage: rendering and simulation will automatically be consistent. Probably the fastest solution. Disadvantage: you need to rewrite your simulation. Probably the best solution.
- Change your mutable var state to STM reference cells. This might work nicely if the GUI rate is low compared to the simulation rate, because then this "optimistic" approach might get by with relatively few rollbacks. I'm not sure how this works out with Scala-STM and a renderer that only reads. Perhaps you need a full multi-versioned STM to avoid rollbacks.
To outline the immutable variant:
trait Vehicle {
  def move: Vehicle // return updated model
}
trait Collisions {
  def solve(in: Seq[Vehicle]): Seq[Vehicle] // return corrected models
}
trait World {
  def vehicles: Seq[Vehicle]
}
trait Simulator {
  protected def coll: Collisions
  // create updated world
  def run(prev: World): World = new World {
    val vehicles = coll.solve(prev.vehicles.map(_.move))
  }
}
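The outline leaves open how the renderer gets hold of each new world. A minimal sketch of one way to hand over snapshots follows, written in C++ for illustration rather than Scala. `WorldHolder`, `publish`, `snapshot`, and `step` are hypothetical names, and the motion inside `step` is a placeholder, not the asker's physics.

```cpp
#include <memory>
#include <mutex>
#include <utility>
#include <vector>

struct Vehicle {
    double x;
    double y;
};
using World = std::vector<Vehicle>;

// Holds the most recent completed world. The lock only guards the
// pointer swap, so neither thread waits on the other's real work.
class WorldHolder {
public:
    // Simulation thread: publish a fully computed step.
    void publish(World next) {
        auto snap = std::make_shared<const World>(std::move(next));
        std::lock_guard<std::mutex> lock(mutex_);
        current_ = std::move(snap);
    }

    // Render thread: take the latest complete step. The returned
    // snapshot never changes, even if a newer one is published.
    std::shared_ptr<const World> snapshot() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return current_;
    }

private:
    mutable std::mutex mutex_;
    std::shared_ptr<const World> current_ = std::make_shared<const World>();
};

// One simulation step: builds a new world and leaves prev untouched.
// Moving every vehicle along x is placeholder logic.
World step(const World& prev, double dt) {
    World next;
    next.reserve(prev.size());
    for (const Vehicle& v : prev) {
        next.push_back(Vehicle{v.x + dt, v.y});
    }
    return next;
}
```

The renderer calls `snapshot()` once per paint and draws only from that object, so every vehicle it draws belongs to the same timestep.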

Does D std lib include something like boost.fusion and boost.mpl?

I'm still evaluating whether I should start using D for prototyping numerical code in physics.
One thing that stops me is that I like Boost, specifically Fusion and MPL.
D is amazing for template meta-programming, and I would think it can do what MPL and Fusion do, but I would like to make sure.
Even if I start using D, it would take me a while to get to the MPL level, so I'd like someone to share their experience.
(By MPL I mean an STL for templates, and by Fusion I mean an STL for tuples.)
A note on performance would be nice too, since it's critical in physics simulations.
In D, for the most part, meta-programming is just programming. There's not really any need for a library like boost.mpl.
For example, consider the lengths you would have to go to in C++ to sort an array of numbers at compile time. In D, you just do the obvious thing: use std.algorithm.sort.
import std.algorithm;
int[] sorted(int[] xs)
{
    int[] ys = xs.dup;
    sort(ys);
    return ys;
}
pragma(msg, sorted([2, 1, 3]));
This prints out [1, 2, 3] at compile time. Note: sort is not built into the language and has absolutely no special code for working at compile time.
Here's another example that builds a lookup table for Fibonacci sequence at compile time.
import std.algorithm, std.exception, std.range;
int[] fibs(int n)
{
    auto fib = recurrence!("a[n-1] + a[n-2]")(1, 1);
    int[] ret = new int[n];
    copy(fib.take(n), ret);
    return ret;
}
immutable int[] fibLUT = fibs(10).assumeUnique();
Here, fibLUT is constructed entirely at compile time, again without any special compile time code needed.
If you want to work with types, there are a few type meta functions in std.typetuple. For example:
static assert(is(Filter!(isUnsigned, int, byte, ubyte, dstring, dchar, uint, ulong) ==
TypeTuple!(ubyte, uint, ulong)));
That library, I believe, contains most of the functionality you can get from Fusion. Remember though, you don't need template meta-programming in D nearly as much as you do in C++, because most of the language is available at compile time anyway.
I can't really comment on performance because I don't have vast experience with both. However, my instinct would be that D's compile time execution is faster because you generally don't need to instantiate numerous templates. Of course, C++ compilers are more mature, so I could be wrong here. The only way you'll really find out is by trying it for your particular use case.

Can I use AS3 Stage3D AGAL to achieve CUDA like processing?

I have a program that detects a ball in a 320x240 stream at runtime, but if I stream a higher resolution, it gets too slow. I'm assuming that if I could use the GPU to process each pixel (together with its neighboring frames and neighboring pixels) it would be faster. Does anyone know if I can get data BACK from the GPU with AGAL?
In short, I have the loop below, which goes through each pixel of the frame, and I want to do most of the calculation on the GPU to achieve better performance.
for(var i:int=cv.length-1; i>1;i--){
    if( (110*255) < (cv[i] & 0x0000FF00) && (cv[i] & 0x0000FF00) < (150*255)){ // pixel i is green
        if( (cv[i+2] & 0x0000FF00) > (150*255) ) { // pixel i+2 is bright
            if(floodhere(cv, i+2)){ // region is large
                prevDiff[i]=0xffffffff; // white
                close.push(i);
            }
            else prevDiff[i]=0xffff0000; // region is small -> red
        } else {
            prevDiff[i]=0xff000055; // blue
        }
    } else {
        prevDiff[i]=0xff000000; // black
    }
}
You can use AGAL to make fast calculations on the GPU, just be aware of the limits.
It goes roughly like this:
You need to upload your data as textures (n*m matrices); one data point is a 3x8-bit value. Uploading any large amount of data to the GPU is slow, so you should not do it every frame. Getting the texture back to ActionScript is slow too.
You can also upload data to the GPU's global variable memory (but only a limited amount).
The GPU will run your AGAL program in parallel on every element of this matrix, and the output will be an n*m matrix too.
Every program instance has access to three things: its coordinates, the global variables, and the uploaded matrices. The output of your program is written to the same position in an output matrix. If you write multiple programs, they can access this output matrix quickly, but getting it back to normal memory (for ActionScript manipulation) is slow.
AGAL programs are very limited compared to Actionscript:
- max. 256 instructions.
- no loops, functions, classes. You only have mathematical operators and conditionals ("if-else").
- cannot write to the global memory
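To make the "one program instance per element" model concrete, here is a sketch of the part of the question's loop that depends only on a pixel and one neighbor, written as a pure function. It is plain C++ for illustration, not AGAL, and `Verdict`, `green`, and `classify` are hypothetical names. The flood fill (floodhere) is deliberately left out, because it needs loops and so does not fit the per-element model described above. The question's comparisons of the masked green value against 110*255 and 150*255 work out to green values of 110 to 149 for the pixel and at least 150 for the neighbor.

```cpp
#include <cstdint>

// Outcome of the per-pixel stage of the question's loop.
enum class Verdict {
    Black,     // pixel is not in the green range
    Blue,      // pixel is green, but pixel i+2 is not bright
    Candidate  // green with a bright neighbor: flood fill decides white or red
};

// Extract the 8-bit green channel from a 0xAARRGGBB pixel.
inline int green(std::uint32_t px) {
    return static_cast<int>((px >> 8) & 0xFFu);
}

// Depends only on pixel i and pixel i+2, so every pixel can be
// evaluated independently of all others.
Verdict classify(std::uint32_t px, std::uint32_t pxPlus2) {
    const int g = green(px);
    if (g < 110 || g > 149) {
        return Verdict::Black;
    }
    if (green(pxPlus2) < 150) {
        return Verdict::Blue;
    }
    return Verdict::Candidate;
}
```

Only the pixels that come back as `Candidate` would still need the CPU-side flood fill.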
You may be able to use PixelBender. It also works in separate thread(s) and makes use of multicore CPUs so is much quicker than actionscript.
See http://www.flashmagazine.com/tutorials/detail/using_pixel_bender_to_calculate_information/ for an example
There is no way to get data back. You can only get color back. Moreover, to get a pixel color in ActionScript you have to copy the data from the texture to a BitmapData, which is VERY slow.

Is there really a performance hit when catching exceptions?

I asked a question about exceptions and I am getting VERY annoyed at people saying throwing is slow. I asked in the past how exceptions work behind the scenes, and I know that in the normal code path there are no extra instructions (as the accepted answer says), but I am not entirely convinced throwing is more expensive than checking return values. Consider the following:
{
    int ret = func();
    if (ret == 1)
        return;
    if (ret == 2)
        return;
    doSomething();
}
vs
{
    try {
        func();
        doSomething();
    }
    catch (SpecificException1 e)
    {
    }
    catch (SpecificException2 e)
    {
    }
}
As far as I know there isn't a difference, except that the ifs are moved out of the normal code path into an exception path, plus an extra jump or two to get to the exception code path. An extra jump or two doesn't sound like much when it removes a few ifs from your main (and more often run) code path. So are exceptions actually slow? Or is this a myth or an old issue with old compilers?
(I'm talking about exceptions in general. Specifically, exceptions in compiled languages like C++ and D; though C# was also in my mind.)
Okay - I just ran a little test to make sure that exceptions are actually slower. Summary: On my machine a call w/ return is 30 cycles per iteration. A throw w/ catch is 20370 cycles per iteration.
So to answer the question - yes - throwing exceptions is slow.
Here's the test code:
#include <stdio.h>
#include <intrin.h>
int Test1()
{
    throw 1;
    // return 1;
}
int main(int argc, char*argv[])
{
    int result = 0;
    __int64 time = 0xFFFFFFFF;
    for(int i=0; i<10000; i++)
    {
        __int64 start = __rdtsc();
        try
        {
            result += Test1();
        }
        catch(int x)
        {
            result += x;
        }
        __int64 end = __rdtsc();
        if(time > end - start)
            time = end - start;
    }
    printf("%d\n", result);
    printf("time: %I64d\n", time);
}
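A portable variant of this measurement, as a sketch only: it swaps the MSVC-specific `__rdtsc` for `std::chrono::steady_clock` and times the whole loop instead of taking the minimum per iteration. The names `viaReturn`, `viaThrow`, `timeLoop`, `timeReturn`, and `timeThrow` are made up for this example. The numbers depend heavily on compiler, flags, and platform, and an optimizer may fold the return-code loop away entirely, so treat any result as a rough indication.

```cpp
#include <chrono>

// Error reported through the return value.
int viaReturn(bool fail) { return fail ? 1 : 0; }

// The same error reported by throwing.
int viaThrow(bool fail) {
    if (fail) throw 1;
    return 0;
}

struct Timing {
    long long errors;  // number of failures observed
    long long nanos;   // wall time for the whole loop, in nanoseconds
};

// Runs body() 'iterations' times and sums the error counts it returns.
template <typename Body>
Timing timeLoop(int iterations, Body body) {
    using Clock = std::chrono::steady_clock;
    long long errors = 0;
    const Clock::time_point start = Clock::now();
    for (int i = 0; i < iterations; ++i) {
        errors += body();
    }
    const Clock::time_point end = Clock::now();
    const long long ns =
        std::chrono::duration_cast<std::chrono::nanoseconds>(end - start).count();
    return Timing{errors, ns};
}

// Every call fails and is detected by checking the return value.
Timing timeReturn(int iterations) {
    return timeLoop(iterations, []() -> int {
        return viaReturn(true) != 0 ? 1 : 0;
    });
}

// Every call fails and is detected by catching the exception.
Timing timeThrow(int iterations) {
    return timeLoop(iterations, []() -> int {
        try {
            viaThrow(true);
            return 0;
        } catch (int) {
            return 1;
        }
    });
}
```

Comparing `timeReturn(n).nanos` with `timeThrow(n).nanos` for the same `n` gives the per-failure cost ratio on a given setup.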
Alternative try/catch written by the OP:
try
{
    if(Test1()!=0)
        result++;
}
catch(int x)
{
    result++;
}
I don't know exactly how slow it is, but throwing an exception that already exists (say it was created by the CLR) is not much slower, because you've already incurred the hit of constructing the exception. I believe it's the construction of an exception that creates the majority of the additional performance hit. Think about it: it has to create a stack trace (including reading debug symbols to add line numbers and so on) and potentially bundle up inner exceptions, etc.
Actually throwing an exception only adds the additional code to traverse up the stack to find the appropriate catch clause (if one exists) or transfer control to the CLR's unhandled exception handler. This portion could be expensive for a very deep stack, but if the catch block is just at the bottom of the same method you are throwing it in, for example, then it will be relatively cheap.
If you are using exceptions to actually control the flow it can be a pretty big hit.
I was digging in some old code to see why it ran so slow. In a big loop instead of checking for null and performing a different action it caught the null exception and performed the alternative action.
So don't use exceptions for things they were not designed to do, because they are slower.
Use exceptions and generally anything without worrying about performance. Then, when you are finished, measure the performance with profiling tools. If it's not acceptable, you can find the bottlenecks (which probably won't be the exception handling) and optimize.
In C# raising exceptions does have an ever so slight performance hit, but this shouldn't scare you away from using them. If you have a reason, you should throw an exception. Most people who have problems with using them cite as the reason that they can disrupt the flow of a program.
Really if your reasons for not using them is a performance hit, your time can be better spent optimizing other parts of your code. I have never run into a situation where throwing an exception caused the program to behave so slowly that it had to be re-factored out (well the act of throwing the exception, not how the code treated it).
Thinking about it a little more, with all that being said, I do try to use methods which avoid throwing exceptions. If possible I'll use TryParse instead of Parse, or use KeyExists etc. If you are doing the same operation hundreds of times over and throwing many exceptions, small amounts of inefficiency can add up.
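The TryParse idea carries over to C++ as well. A minimal sketch, with hypothetical names `parseOrThrow` and `tryParse`: the first relies on `std::stoi`, which throws on bad input, and the second reports failure through its return value using `std::strtol`. The two are not exact equivalents, since `std::stoi` accepts trailing characters such as "12abc" while this `tryParse` rejects them.

```cpp
#include <cerrno>
#include <climits>
#include <cstdlib>
#include <stdexcept>
#include <string>

// Throwing style: std::stoi throws std::invalid_argument when no digits
// can be read and std::out_of_range when the value does not fit an int.
int parseOrThrow(const std::string& s) {
    return std::stoi(s);
}

// TryParse style: returns false on failure and leaves 'out' untouched.
bool tryParse(const std::string& s, int& out) {
    errno = 0;
    char* end = nullptr;
    const long value = std::strtol(s.c_str(), &end, 10);
    if (end == s.c_str() || *end != '\0') {
        return false;  // no digits at all, or trailing characters
    }
    if (errno == ERANGE || value < INT_MIN || value > INT_MAX) {
        return false;  // does not fit in an int
    }
    out = static_cast<int>(value);
    return true;
}
```

In a loop over many possibly malformed inputs, `tryParse` never enters the unwinding path, which is the situation the paragraph above is describing.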
Yes. Exceptions make your program slower in C++. I created an 8086 CPU emulator a while back. In the code I used exceptions for CPU interrupts and faults. I made a little test case of a big complex loop that ran for about 2 minutes executing emulated opcodes. When I ran this test through a profiler, my main loop was making a significant number of calls to an "exception checker" function of gcc (actually there were two different functions related to this; my test code only threw one exception, at the end). I believe these exception functions were called every time through my main loop (this is where I had the try{}catch{} part). The exception functions cost me about 20% of my runtime speed (the code spent 20% of its time in there). The exception functions were also the 3rd and 4th most called functions in the profiler.
So yes, using exceptions at all can be expensive, even without constant exception throwing.
tl;dr IMHO, Avoiding exceptions for performance reasons hits both categories of premature and micro- optimizations. Don't do it.
Ah, the religious war of exceptions.
The various types of answers to this are usually:
- the usual mantra (a good one, IMHO): "use exceptions for exceptional situations" (IOW, not part of "normal" code paths). If your normal user paths involve intentionally using exceptions as a control-flow mechanism, that's a smell.
- tons of detail, without really answering the original question. If you really want detail:
  http://blogs.msdn.com/cbrumme/archive/2003/10/01/51524.aspx
  http://blogs.msdn.com/ricom/archive/2006/09/14/754661.aspx
  etc.
- someone pointing at microbenchmarks showing that something like i/j with j == 0 is 10x slower catching div-by-zero than checking j == 0
- a pragmatic answer on how to approach performance for apps in general, usually along the lines of:
  - make perf goals for your scenarios (ideally working with customers)
  - build it so it's maintainable, readable, and robust
  - run it and check perf of goal scenarios
  - if a set of scenarios isn't making goal, USE A PROFILER to tell you where your time is being spent and go from there.
IOW, any perf change, especially a micro-optimization like this, made without profiling data driving that decision is typically a huge waste of time.
Keep in mind that your perf wins will typically come from algorithmic changes (adding an index to a table to avoid table scans, moving something with large n from O(n^3) to O(n ln n), etc.).
More fun links:
http://en.wikipedia.org/wiki/Program_optimization
http://www.flounder.com/optimization.htm
If you want to know how exceptions work in Windows SEH, then I believe this article by Matt Pietrek is considered the definitive reference. It isn't light reading. If you want to extend this to how exceptions work in .NET, then you need to read this article by Chris Brumme, which is most definitely the definitive reference. It isn't light reading either.
The summary of Chris Brumme's article gives a detailed explanation of why exceptions are significantly slower than using return codes. It's too long to reproduce here, and you've got a lot of reading to do before you can fully understand why.
Part of the answer is that the compiler isn't trying very hard to optimize the exceptional code path.
A catch block is a very strong hint to the compiler to aggressively optimize the non-exceptional code path at the expense of the exceptional code path. To reliably hint to a compiler which branch of an if statement is the exceptional one, you need profile-guided optimization.
The exception object must be stored somewhere, and because throwing an exception implies stack unwinding, it can't be on the stack. The compiler knows that exceptions are rare - so the optimizer isn't going to do anything that might slow down normal execution - like keeping registers or 'fast' memory of any kind available just in case it needs to put an exception in one. You may find you get a page fault. In contrast, return codes typically end up in a register (e.g. EAX).
It's like concatenating strings vs. using a StringBuilder: it's only slow if you do it a billion times.