What do Composite Layers do in Chrome Performance? - html

While I was studying Composite during browser rendering, I confirmed that a function called Composite layers in Chrome Performance runs on the main thread. What is its role?
As far as I know, Compositor thread composite layers and making them into a single bitmap but It takes quite a long time in Composite Layers too. To move element, I used transform

Related

How to increase FPS of A-Frame(HTML)

I am making a VR animation using A-Frame (HTML). My animation has many 3D models. But problem is that when I run the animation it gives low fps (15-20) and high draw calls (230-240). Due to this both animation and camera control are lagging. So, how to increase fps and reduce draw calls?
The number of draw calls sounds high, but not so high as to cause a frame rate drop as low as 15-20 FPS (though it depends a bit what spec system you are running on).
As well as looking at ways to reduce draw calls, you might also want to reduce the complexity of the models you are using, or the resolution of the textures, and look into other possible causes of performance problems.
Some options:
reducing texture resolutions - just open in a picture editor like Paint or GIMP and reduce the resolution. Keep textures to power of two resolutions where possible, e.g. 512 x 512 or 1024 x 1024.
reducing model complexity. Look at decimation. Best done outside the browser as a pre-processing step, with a 3D modelling tool such a blender. Also, worth checking how many meshes are in each model, and whether those can be combined in a single mesh.
reducing calls. You need to either merge geometries or (if you are using the same model multiple times) use instancing. Some suggestions for instancing here: Is there an instancing component available for A-Frame to optimize my scene with many repeated objects?. Merging geometries will involve writing some Javascript code yourself, but might be a better option if you don't have repeated geometries.
If you haven't already done this, also worth reviewing this list of performance tips here: https://aframe.io/docs/1.2.0/introduction/best-practices.html#performance
It could be something else on that list, e.g. raycasting, garbage collection issues etc. that's causing the problem.
Using the browser debuger to profile your code may give some further clues as to what's going on with performance.

Computer vision detection on small object bad results why?

Currently, for detection (localisation + recognition tasks) we use mainly deep learning algorithm in computer vision. Two types of detector exist :
one stage : SSD, YOLO, retinanet, ...
two stage : RCNN, Fast RCNN and faster RCNN for example
Using these detectors on very small objects (10 pixels for example) is a very challenging tasks and it seems the one stage algorithm are worse than the two stage algorithm. But I do not really understand why it works better on Faster RCNN for example. In fact, the one and two stage detector use both of them the anchor concept, and most of them use the same backbone like VGG16 or resnet50/resnet101. That means the receptive fields is the same. For example, I tried to detect very small object on retinanet and on faster RCNN. On retinanet, small object are not detected contrary to faster rcnn. I do not understand why. What is the explication theoretically ? (same backbone : resnet50)
I think in general networks like retinaNet are trying to bridge the gap you mention.Usually in one stage networks we will have anchor boxes of varying scales in the feature maps produced by the Backbone net, These feature maps are produced by heavily down sampling the input image, A lot of information about small object might be lost while performing this operation.While this is the case with one stage detectors, In two stage detectors because of flexibility of the RPN network, The RPN network may still propose regions which are small and this may help it to perform slightly better than its one stage counterparts.
I don't think you should be very surprised that both of these might use the same backbone, After the conv features are extracted both networks use different methods to perform detection.
Hope this helps, Let me know if i wasn't clear enough,or you have questions.

Detect intersection without causing bodies to collide

I want to detect the intersection of two objects (sprites) in my scene. I don't want the object geometric intersection to cause a collision between the bodies in the scene.
I've created PhysicalBody for both of my object shapes, but I can't find a way to detect the intersection without having both bodies hit each other on impact.
I'm using cocos2d-x 3+ with the default chipmunk engine (which I'd like to stick with for now)
The question is, how do I detect the intersection of elements without having them physically push each other when they intersect.
The answer is very simple (Though it took me 2 days to figure it out)
When contact is detected and onContactBegin() is called, when the relevant shape is being hit returning false will stop the physical interaction.

why game is running slow in libgdx?

I am making racing game in Libgdx.My game apk size is 9.92 mb and I am using four texture packer of total size is 9.92 Mb. My game is running on desktop but its run on android device very slow. What is reason behind it?
There are few loopholes which we neglect while programming.
Desktop processors are way more powerful so the game may run smoothly on Desktop but may slow on mobile Device.
Here are some key notes which you should follow for optimum game flow:
No I/O operations in render method.
Avoid creating Objects in Render Method.
Objects must be reused (for instance if your game have 1000 platforms but on current screen you can display only 3, than instead of making 1000 objects make 5 or 6 and reuse them). You can use Pool class provided by LibGdx for object pooling.
Try to load only those assets which are necessary to show on current screen.
Try to check your logcat if the Garbage collector is called. If so than try to use finalize method of object class to find which class object are collected as garbage and try to improve on it.
Good luck.
I've got some additional tips for improving performance:
Try to minimize texture bindings (or generally bindings when you're making a 3D game for example) in you render loop. Use texture atlases and try to use one texture after binding as often as possible, before binding another texture unit.
Don't display things that are not in the frustum/viewport. Calculate first if the drawn object can even be seen by the active camera or not. If it's not seen, just don't load it onto your GPU when rendering!
Don't use spritebatch.begin() or spritebatch.end() too often in the render loop, because every time you begin/end it, it's flushed and loaded onto the GPU for rendering its stuff.
Do NOT load assets while rendering, except you're doing it once in another thread.
The latest versions of libgdx also provide a GLProfiler where you can measure how many draw calls, texture bindings, vertices, etc. you have per frame. I'd strongly recommend this since there always can be situations where you would not expect an overhead of memory/computational usage.
Use libgdx Poolable (interface) objects and Pool for pooling objects and minimizing the time for object creation, since the creation of objects might cause tiny but noticable stutterings in your game-render loop
By the way, without any additional information, no one's going to give you a good or precise answer. If you think it's not worth it to write enough text or information for your question, why should it be worth it to answer it?
To really understand why your game is running slow you need to profile your application.
There are free tools avaiable for this.
On Desktop you can use VisualVM.
On Android you can use Android Monitor.
With profiling you will find excatly which methods are taking up the most time.
A likely cause of slowdowns is texture binding. Do you switch between different pages of packed textures often? Try to draw everything from one page before switching to another page.
The answer is likely a little more that just "Computer fast; phone slow". Rather, it's important to note that your computer Java VM is likely Oracles very nicely optimized JVM while your phone's Java VM is likely Dalvik, which, to say nothing else of its performance, does not have the same optimizations for object creation and management.
As others have said, libGDX provides a Pool class for just this reason. Take a look here: https://github.com/libgdx/libgdx/wiki/Memory-management
One very important thing in LibGDX is that you should make sure that sometimes loading assets from the memory cannot go in the render() method. Make sure that you are loading the assets in the right times and they are not coming in the render method.
Another very important thing is that try to calculate your math and make it independent of the render in the sense that your next frame should not wait for calculations to happen...!
These are the major 2 things i encountered when I was making the Snake game Tutorial.
Thanks,
Abhijeet.
One thing I have found, is that drawing is laggy. This means that if you are drawing offscreen items, then it uses a lot of useless resources. If you just check if they are onscreen before drawing, then your performance improves by a lot surprisingly.
Points to ponder (From personal experience)
DO NOT keep calling a function,in render method, that updates something like time,score on HUD (Make these updates only when required eg when score increases ONLY then update score etc)
Make calls IF specific (Make updations on certain condition, not all the time)
eg. Calling/updating in render method at 60FPS - means you update time 60 times a sec when it just needs to be updated once per sec )
These points will effect hugely on performance (thumbs up)
You need to check the your Image size of the game.If your image size are more than decrease the size of images by using the following link "http://tinypng.org/".
It will be help you.

Tri-Color Incremental Updating GC: Does it need to scan each stack twice?

Let me give you a short introduction to a tri-color GC (in case somebody reads it who has never heard of it); if you don't care, skip it and jump to The Problem.
How a Tri-Color GC Works
In a tri-color GC an object has one out of three possible colors; white, gray and black. A tri-color GC can be described as follows:
All objects are initially white.
All objects reachable because a global variable or a stack variable refers to it ("the root objects") are colored gray.
We take any gray object, find all references it has to white objects and color those white objects gray. Then we color the object itself black.
We continue at step 3 as long as we have gray objects.
If we have no gray objects any longer, all remaining objects are either white or black.
All black objects haven been proven to be reachable and must stay alive. All white objects are unreachable and can be deleted.
So far this is not too complicated… at least if the GC is StW (Stop the World), meaning it will pause all threads while collecting garbage. If it is concurrent, a tri-color GC has an invariant that must hold true at all times:
A black object must not refer to a white object!
This holds true automatically for a StW GC, since every object that is colored black has been examined previously and all white objects it was pointing to were colored gray, thus a black object may only refer to other black objects or gray objects.
If threads are not paused, threads can execute code that would break this invariant. There are several ways how to prevent this:
Capture all read access to pointers and look if this read access is made to a white object. If it is, color that object gray immediately. If a ref to this object is now assigned to a black object, it won't matter, the object is gray and not white any longer (this implementation uses a read-barrier)
Capture all write access to pointers and look if the assigned object is white and the object it is assigned to is black. If so, color the white object gray. This is the more obviously way of doing things, but also needs a bit more processing time (this implementation uses a write-barrier)
Since read-accesses are much more common than write-accesses, even though the second possibility involves more processing time when the barrier is hit, it is called less often and such the favored one. A GC working like that is called an "incremental updating GC".
There is an alternative to both techniques, called SatB (Snapshot at the Beginning). This variation works slightly different, considering the fact that it is not really necessary to uphold the invariant at all times, since it does not matter if a black object refers to a white one as long as the GC knows that this white object used to be and still is accessible during the current GC cycle (either because there are still gray objects referring to this white object as well, or because the a ref to this white object is put onto an explicit stack that is also considered by the GC when it runs out of gray objects). SatB collectors are used more often in practice, because they have some advantages, but IMHO they are harder to implement.
I'm referring here to a incremental updating GC, that uses variant 2: Whenever the code tries to make a black object point to a white object, it immediately colors the object gray. That way this object won't be missed in the collection cycle.
The Problem
So much about tri-color GCs. But there is one thing I don't understand about tri-color GCs. Let's assume we have an object A, that is referred to by the stack and itself refers to an object B.
stack -> A.ref -> B
Now the GC starts a cycle, halts the thread, scans the stack and sees A as directly accessible, coloring A gray. Once it is done with scanning the whole stack, it unpauses the thread again and starts processing at step (3). Before it starts doing anything, it is preempted (can happen) and the thread runs again and executes the following code:
localRef = A.ref; // localRef points to B
A.ref = NULL; // Now only the stack points to B
sleep(10000); // Sleep for the whole GC cycle
Since the invariant has not been violated, B was white, but has not been assigned to a black object, the color of B has not changed, it is still white. A does not refer to B any longer, so while processing the "gray" A, B won't change its color and A will become black. At the end of the cycle, B is still white and looks like garbage. However, localRef is referring to B, thus it is not garbage.
The Question
Am I right, that a tri-color GC must scan the stack of each thread twice? Once at the very beginning, to identify root objects (getting color gray) and again before deleting the white objects, as those might be referenced by the stack, even though no other object refers to them any longer. No description of the algorithm I've seen so far mentioned anything about scanning the stack twice. They all only said, that when used concurrent, it is important that the invariant is enforced at all time, otherwise reachable objects are missed. But as far as I can see, that is not enough. The stack must be considered like a single big object and once scanned, the "stack is black" and every ref update of the stack must cause the object to be colored gray.
If that is really the case, using incremental updating may be more tricky than I initially thought and has some performance drawbacks, since stack changes are the most frequent ones of all.
A bit of terminology:
Let me give some names so that explanations are clearer.
A variable is any slot for data, which may contain a pointer and may change over time. This includes global variables, local variables, CPU registers and fields in allocated objects.
In a tricolor incremental or concurrent GC, there are three types of variables:
the true roots, which are always accessible (CPU registers, global variables);
the fast variables, which are scanned in a stop-the-world fashion;
the slow variables, which are handled with the colors. Slow variables are fields in colored objects.
The "true roots" and "fast variables" will be hereafter collectively called roots.
The application threads are called the mutators because they change the contents of variables.
With an incremental or concurrent GC, GC pauses occur regularly. The world is stopped (mutators are paused), and the roots are scanned. This scan reveals a number of references to colored objects. Object colors are adjusted accordingly (such white objects are made grey).
When the GC is incremental, some object scanning activity takes place: some grey objects are scanned (and painted black), greying referenced white objects. This activity (the "marking") is maintained for some time, but not necessarily as long as there are grey objects. At some point, the marking stops and the world is awakened. The GC is called "incremental" because the GC cycle is performed in small increments, interleaved with mutator activity.
In a concurrent GC, scanning of grey objects occurs concurrently with mutator activity. The world is then awakened as soon as the roots have been scanned. With a concurrent GC, access barriers are a tad complex to implement because they must handle concurrent access from the GC thread; but at a conceptual level, this is not very different from an incremental GC. A concurrent GC can be viewed as an optimization over incremental GC, which takes advantage of the presence of multiple CPU cores (a concurrent GC has little advantage over an incremental GC when there is only one core).
Roots need not be protected by an access barrier, since they are scanned with the world stopped. The GC mark phase ends when the following conditions are simultaneously met:
the roots have just been scanned;
all objects are either black or white, but not grey.
so this situation can occur only during a pause. At that point, the sweep phase begins, during which white objects are released. The sweep can be done incrementally or concurrently; objects created during the sweep are immediately painted black. When the sweep is finished, a new GC mark phase can take place: objects (which are all black at that point) are all repainted white (this is done atomically by simply changing the way color bits are interpreted).
Variable Classification:
With that being said, I can now answer to your question. With the description above, the question becomes: what are the roots ? This is actually up to the implementation; there are several possibilities, and trade-offs.
True roots must always be scanned; true roots are the CPU register contents and the global variables. Note that the stacks are not true roots; only the current stack frame pointer is.
Since fast variables are accessed without barriers, it is customary to make stack frames fast variables (i.e. roots as well). This is because while write accesses are rare system-wide, they are quite common in the local variables. It has been measured (on some Lisp programs) that about 99% of writes (of a pointer value) have a local variable as target.
Fast variables are often extended even further, in the case of a generational GC: the "young generation" consists in a special allocation area for new objects, limited in length, and scanned as fast variables. The bright side of fast variables is fast access (hence the name); the downside is that all these fast variables may be scanned only during a pause (the world is stopped). There is a trade-off on the size of the fast variables, which often translates to a limit on the young generation size. A larger young generation promotes average performance (by reducing the number of access barriers) at the cost of longer pauses.
At the other extreme, you may have no fast variable at all, and no root but the true roots. The stack frames are then handled as objects, each with their own color. Pauses are then minimal (a mere snapshot of a dozen register) but barriers must be used even for access to local variables. This is expensive, but has some advantages:
Hard guarantees on pause times can be made. This is difficult if stack frames are roots, because each new thread has its own stack, so the roots total size may grow up to arbitrary amounts as new threads are launched. If only CPU registers and global variables (no more than a few dozens in typical cases, and their number is known at compilation time) are roots, then pauses can be kept very short.
This allows for dynamic allocation of stack frames in the heap. This is needed if you play with co-routines and continuations, as with Scheme's call/cc primitive. In such a case, frames are no longer handled as a pure "stack". Proper handling of continuations in a GC-aware language mostly requires that function frames be allocated dynamically.
It is possible to make stack frames non-root while keeping a young generation as root. Guarantees on pause times can still be made (depending on the young generation size, which is fixed) and some trickery can be applied to make sure that stack frames are in the young generation when their function is active. This can ensure barrier-free access to local variables. None of this is really free, but it can be made efficient enough for most purposes.
Another Conceptual View:
Another way to view root-handling is the following: roots are the variables for which the tricolor rule (no black-to-white pointer) is not maintained at all times; these variables are allowed to be mutated without constraint. But they must be brought back in line regularly, by stopping the world and scanning them.
In practice, the mutators are racing with the GC. Mutators create new objects, and point to them; each pause creates new grey objects. In a concurrent or incremental GC, if you let the mutators play with roots for too long, then each pause may create a big batch of new grey objects. In the worst case, the GC cannot scan objects fast enough to keep up with the rate of grey object creation. This is an issue because white objects can be released only during the sweep phase, which is reached only if at some point the GC may complete its marking. A usual implementation strategy for an incremental GC is to scan grey objects, during each pause, for a total size which is proportional to the total size of roots. Thus, pause time remains bounded by the roots total size, and if the proportionality factor is well balanced then it can be guaranteed that the GC will ultimately terminate is marking phase and enter the sweeping phase.
In a concurrent GC, things are a bit more complex, because the mutators roam freely in the wild. A possible implementation would make a little bit of incremental marking while the world is still stopped.
Bibliography:
Garbage Collection: Algorithms for Automatic Dynamic Memory Management: a must-read book on garbage collection.
Thomas obviously has the best answer. However, I'm just going to add a little side-note here.
The root nodes can be conceptually thought of as black nodes, because any object referred to by a root node must by grey or black.
Therefore, in order to maintain the invariant, assignment to a root variably should automatically grey the object.