I'm building an Air application, which, due to the workflow of the animators, needs to load a lot of external swf files.
I load them all via FileStream / Loader Objects at the boot process and then store them in an object for further use.
The instant they get loaded, I use a gotoAndStop(1) command to make them stop looping (the original files have no scripts whatsoever).
After the load process I can see the system memory go slowly but consistently up.
When I manually invoke Garbage Collection with the System.gc() command,
the memory gets cleared again.
If I let the application run for hours it seems the Garbage Collector does not run.
Any ideas what the problem might be? Or should I just forget about it and just occasionally run System.gc() manually?
Thanks a lot!
The Garbage Collector will only run when it needs to, so it is entierly possible that it would go a very long time before running (especially if you have a lot of RAM available).
The important thing is the memory is cleared when it is run. This tells me that you don't have a leak, as it can be cleared up.
Also, how are you measuring the system memory? If you are doing it through Task Manager those numbers aren't really to be relied upon.
I recommend Process Explorer . Monitor the 'Private Bytes' column instead.
Related
I'm working with an STM32F469 chip with a Micron MT25Q Quad_SPI Flash. To program the Flash, there needs to be an external loader program developed. That's all working, but the problem is that verification of the QSPI Flash is extremely slow.
Looking in the log file, it shows that the Flash is being programmed in 150K byte blocks. However, the verification is being done in 1K byte blocks. In addition, the chip is re-initialized before each block check. I've tried this with both through STM32cubeIDE and in STM32cubeProgrammer directly.
The external programmer program include the correct chip configuration information and specifies a 64K page size. I don't see how to get the programmer to use a larger block size. It looks like it understands what part of the SRAM is used and is using the balance of the 256K in the on-board SRAM for programming the QSPI Flash. It could use the same size for reading the data back or use the Verify() function in the external loader. It's calling Read() and then checking the data itself.
Any thought or hints?
Let me add some observations on creating a new external loader. The first observation is "Don't." If you can pick a supported external chip and pin it out to use an existing loader, then do that. STM provides just 4 example programs but they must have 50 external loaders. If the hardware design copies the schematic for a demo board that has an external loader, you should be fine and avoid doing the development work.
The external loader is not a complete executable. It provides a set of functions to do basic operations like Init(), Erase(), Read() and Write(). The trick is that there is no main() and no start-up code is run when the program starts.
The external loader is an ELF file, renamed to "*.stldr". The programming tool looks into the debug information to find the location of the functions. It then sets the registers to provide the parameters, the PC to run the function, and then let's it run. There's some super-clever work going on to make this work. The programmer looks at the returned value (R0) to see if things pass or not. It can also figure out if the function has crashed the core or otherwise timed-out.
What makes writing the external super fun is that the debugger is running the program so there's no debugger available to see what the code is doing. I settled on outputting errors, and encoded information, on the return() from the called functions to give hints as to what was happening.
The external loader isn't a "full" program. Without the startup code, lots of on-chip stuff isn't set up and some just isn't going to work. At least I couldn't figure it out. I'm not sure if it wasn't configured right or the debugger was blocking its use. Looking at the example external loaders, they are written in a very simple way and do not call the HAL or use interrupts. You'll need to provide core set-up functions to configure the clock chains. That Hal_Delay() method will never return as the timers and/or interrupts aren't working. I could never make them work and suspect the NVIC was somehow being disabled. I ended up replacing the HAL_delay() function with a for loop that spun based on core clock rate and a the instruction cycles per loop.
The app note suggests developing a stand-alone program to debug the basic capabilities. That's a good idea but a challenge. Prior to starting the external loader, I had the QSPI doing the needed operations but from a C++ application calling the HAL. Creating an external loader from that was a long exercise in stripping out and replacing functionality. A hint is that the examples are written at a register level. I'm not that good to deal directly with the QuadSPI peripheral and the chip's instruction set at the same time.
The normal start-up of a program is eliminated. Everything that's done before the main() is called (E.g., in startup_stm32f469nihx.s) is up to you. This includes setting the clock chains to boost the core clock and get the peripheral buses working. The program runs in the on-chip SRAM so any initialized variables are loaded correctly. There's no moving data needed but the stack and uninitialized data areas could/should still be zeroed.
I hope this helps someone!
Today I've faced the same issue.
I was able to improve the Verify speed with two simple steps, but Verify still much slower than programming and this is strange...
If anyone find a way to change the 1KB block read of STM32CubeProgrammer I would like to know =).
Follow the changes I made to improve a bit the performance.
Add a kind of lock in the Init Function to avoid multiple initializations. This was the most significant change because I'am checking the Flash ID in my initialization proccess. Other aproachs could be safer but this simple code snippet worked to me.
int Init(void)
{
static uint32_t lock;
if(lock != 0x43213CA5)
{
lock = 0x43213CA5;
/* Init procedure goes here */
}
return(1);
}
Cache a page instead of reading the external memory for each call. This will help more if your external memory page read has too much overhead, otherwise this idea won't give relevant results.
Using Version 48.0.2564.109 m.
We have a javascript web app (built with ExtJS). In Chrome, when we leave our app sitting there for a while, the GC starts going nuts. In Task Manager, you can see the CPU constantly spinning around 25%.
I took timeline snapshots and CPU profiles, and you can see the GC, about 10 times a seconds, try to collect memory, but collects 0B.
Our app is a large enterprise application and does use quite a bit of memory and updates the screen periodically.
But, there is absolutely no javascript code running during this time. So I can't see that it is something our app is actively doing
Does anyone know what could be triggering this?
It is killing performance of our app.
Also, it only happens when our tab is active. If you switch to a different tab, the CPU dies down and the GC stops.
Is there other data I need to collect to help determine this?
What is your app current JS heap size? You can check it by collecting timeline and enabling memory check box.
It looks like your app is close to the V8 memory limit, so V8 is trying to free some memory. If it is expected for the app to use that much memory, you can increase the limit on your host with something like: --js-flags="--max-old-space-size=2048"
Otherwise it might be just a memory leak in your code. Use heap profiler to hunt it down.
While unit profiling my classes I noticed that the String class endlessly accumulates (eating up over 90% of the memory in my sizable app). Luckily this is only while running in Profiler mode of Flash Builder 4.6. In debug or deployment (as AIR) memory usage levels off as expected using embedded on-screen memory profilier (Mr Doobs Stats).
To verify I made a test app that was simply a URLLoader continuously loading a text file. When running in Profilier mode using URLLoaderDataFormat.String the String data is never GC'd and grows continuously whereas using URLLoaderDataFormat.BINARY the data is nearly immediately GC'd and stays level.
I hesitate to call this a bug, because possibly it's necessary part of the way the Profilier works… but perhaps this is abnormal for the Profiler? This is the essence of my StackOverflow inquiry.
At any rate, this burned up a couple work-days for me, so if you're Googling wondering why the String class is growing like crazy and never getting GC'd consider measuring your apps memory usage outside the Profilier to verify. In my case I was mislead into thinking I had run into some problem with Master Strings — though it's good understand Master Strings and their impact on memory (see:) don't get mislead like I did.
I'm writing a Windows CE application, and I want to play a sound (a short wav file) when something happens. Since this sound will be played often, my first instinct was to load the wav file into a memory stream and reuse that stream instead of reading the file every time.
But then it occured to me that these Windows Mobile devices only have one kind of memory, which is used both for data storage (= the file system) as well as for program memory; there's even a nice slider in the control panel which you can use to delegate memory to either storage or program execution. So, theoretically, reading a file from the file system (or some value from a SQL Server CE database) should take (almost) the same amount of time as reading this value from some in-memory object, right?
Is this assumption correct (i.e., in-memory caching on application level doesn't make sense here) or did I miss something? For simplicity, let's assume that only the internal memory of the device is used (no memory card).
The assumption may or may not be valid. Where in storage does it reside? If it's persistent storage (like a storage card folder or anything else that remains when you hard reset) then it's backed by Flash, which is way, way slower than RAM and there will be a difference in load perf, though how much it might impact your app I can't say - only testing will tell you that.
When I want to play a short WAV file on Windows Mobile (e.g. notification sound). I usually add it as a resource to my executable. AFAIK resources are loaded into RAM since they are part of the executable image. You can then conveniently call PlaySound() with the SND_RESOURCE (and probably OR that with SND_ASYNC too so the call isn't going to block while the file is being played) flag.
I have a VxWorks application running on ARM uC.
First let me summarize the application;
Application consists of a 3rd party stack and a gateway application.
We have implemented an operating system abstraction layer to support OS in-dependency.
The underlying stack has its own memory management&control facility which holds memory blocks in a doubly linked list.
For instance ; we don't directly perform malloc/new , free/delege .Instead we call OSA layer's routines and it gets the memory from OS and puts it in a list then returns this memory to application.(routines : XXAlloc , XXFree,XXReAlloc)
And when freeing the memory we again use XXFree.
In fact this block is a struct which has
-magic numbers indication the beginning and end of memory block
-size that user requested allocated
-size in reality due to alignment issue previous and next pointers
-pointer to piece of memory given back to application. link register that shows where in the application xxAlloc is called.
With this block structure stack can check if a block is corrupted or not.
Also we have pthread library which is ported from Linux that we use to
-create/terminate threads(currently there are 22 threads)
-synchronization objects(events,mutexes..)
There is main task called by taskSpawn and later this task created other threads.
this was a description of application and its VxWorks interface.
The problem is :
one of tasks suddenly gets destroyed by VxWorks giving no information about what's wrong.
I also have a jtag debugger and it hits the VxWorks taskDestoy() routine but call stack doesn't give any information neither PC or r14.
I'm suspicious of specific routine in code where huge xxAlloc is done but problem occurs
very sporadic giving no clue that I can map it to source code.
I think OS detects and exception and performs its handling silently.
any help would be great
regards
It resolved.
I did an isolated test. Allocated 20MB with malloc and memset with 0x55 and stopped thread of my application.
And I wrote another thread which checks my 20MB if any data else than 0x55 is written.
And quess what!! some other thread which belongs other components in CPU (someone else developed them) write my allocated space.
Thanks 4 your help
If your task exits, taskDestroy() is called. If you are suspicious of huge xxAlloc, verify that the allocation code is not calling exit() when memory is exhausted. I've been bitten by this behavior in a third party OSAL before.
Sounds like you are debugging after integration; this can be a hell of a job.
I suggest breaking the problem into smaller pieces.
Process
1) you can get more insight by instrumenting the code and/or using VxWorks intrumentation (depending on which version). This allows you to get more visibility in what happens. Be sure to log everything to a file, so you move back in time from the point where the task ends. Instrumentation is a worthwile investment as it will be handy in more occasions. Interesting hooks in VxWorks: Taskhooklib
2) memory allocation/deallocation is very fundamental functionality. It would be my first candidate for thorough (unit) testing in a well-defined multi-thread environment. If you have done this and no errors are found, I'd first start to look why the tas has ended.
other possible causes
A task will also end when the work is done.. so it may be a return caused by a not-so-endless loop. Especially if it is always the same task, this would be my guess.
And some versions of VxWorks have MMU support which must be considered.