tkzinc bbox command taking hours to complete - tcl

I have tcl script for creating tkzinc canvas which have multiple compartments and other components associated with each compartment.
The code runs perfectly fine with 32-bit centos5.6 and takes less than a minute but on 64-bit centos7 it takes around 25-28 hours load the canvas.
Need help on the same.

Related

Using "tic" without "toc" in Octave

Quick background, Octave GUI version 6.2.0 takes at least 20-30 seconds to initialize on one computer, but 1-2 seconds on another computer running the same version and same OS. I wanted to figure out where the holdup was, so I used the tic and toc timer functions in various places within the octaverc startup file. At some point, I noticed that merely having tic at the very end of the startup file completely fixed the slow initialize time.
So while my main issue is resolved (though I don't know why that worked), I wanted to see if using tic without ever using toc in an Octave session can cause any issues, e.g., does this leave some timer process running in the background? Thanks in advance.
tic stores the current clock time in an internal variable (see tic_toc_timestamp here, the source code for tic starts right after that line), toc compares the current clock time to the stored variable (see its source code here).
So, no, there’s no timer running in the background after you call tic.

Chrome memory measurement now almost flat for longer test runs

In order to check our web application for memory leaks, I run a machine which does the following:
it runs automated End-to-End tests over (almost) the entire application in Chrome
after each block of tests, it goes to a state of the web application where almost nothing happens
it triggers gc(); for garbage collection
it saves totalJSHeapSize, and usedJSHeapSize to a file
it plots out the results for each test run to a graph
That way, we can see how much the memory increases and which are the problematic parts of our application: At some point the memory increases, at some point it decreases.
Till yesterday, it looked like this:
Bright red (upper line): totalJSHeapSize, light red (lower line): usedJSHeapSize
Yesterday, I updated Chrome to version 69. And now the chart looks quite different:
The start and end amount of memory used (usedJSHeapSize) is almost the same. But as you can clearly see, the way it changes over the course of the test (ca. 1,5h) is quite different.
My questions are now:
Is this a change in reality or in measurement? I.e. did Chrome change its memory handling? Or just the way it puts out memory values via totalJSHeapSize, and usedJSHeapSize?
Concerning memory leaks, is it good news or bad news for me? Like: Before I had dozens of spots where memory increases, now I have just three. Is this true? Or are the memory leaks in the now flat areas still there and hidden?
I'm also thankful for any background information on how Chrome changed its memory measurement.
Some additional info:
The VM runs under KUbuntu 18.04
It's a single web page application done with AngularJS 1.6
The outcome of the memory measurement is quite stable - both before and after the update of Chrome
EDIT:
It seems this was a bug of Chrome version 69. At least, with an update to Chrome 70, this strange behavior is gone and everything looks almost as before.
I don't think you should be worry about it. This can happen due to the memory manager used inside the chrome. You didn't mentioned the version of your first memory graph, possibility that the memory manager used between these two version is different. Chrome was using the TCMalloc which take the large chunk of memory from the OS and manage it, once the memory shortage happenned with TCMalloc then it ask again a big chunk of memory from OS and start managing it. So the later graph what you are seeing have less up and downs (but bigger then previous one) due to that. Hope it answered your query.
As you mentioned that
The outcome of the memory measurement is quite stable - both before and after the update of Chrome
You don't need to really worry about it, the way previously chrome was allocating memory and how it does with new version is different(possible different memory manager) that's it.

My Periodic Task just does not work

While attached to debugger it runs just fine. The Periodic Task is invoked and runs over and over, but when I deploy it to my device It seems to run 1-2 times and then stops.
What It does is setting the live tile background image from isolated storage. The images are created in the application and then saved to isolated storage. As mentioned it works well while attached to the debugger.
The only constraint I could think that could break it would be the memory cap. The application creates and saves 40 images of ~25kB each, and that isn't 1 MB! The application is maybe <4 MB, so that is 5 MB... a lot less than the 11 MB minimal requirement.
So it can't be the memory cap kicking in. Two consecutive unhandled crashes should also break the task, but I've thrown all the code in the task's OnInvoke() in a try/catch.
Now I'm out of ideas what stopping my periodic task when running without being connected to visual studio running in debugger. Any clues?
Firstly are you using Windows 8.1 phone by any chance? Since there is an issue with Periodic tasks do not run on windows phone 8.1 devices as you can see on this forum
Background agent can’t use more than 6MB of memory. You can get the current memory usage using the following snippet :
var memory = DeviceStatus.ApplicationMemoryUsageLimit
- DeviceStatus.ApplicationCurrentMemoryUsage;
automatically executed by the OS each 30 minutes
the operation can’t exceed 25 seconds per run
if the phone switch to battery saver mode the background agent may not be executed
on some devices only 6 background agents may be planned simultaneously
agents can’t use more that 6MB of memory
agents have to be re-planned each 2 weeks
an agent that crashes two times is automatically disabled by the system
Periodic tasks are unscheduled after two consecutive crashes. You need to make sure that this doesn't happen (check internet connectivity if required, set a timeout on web requests, etc.).
You should place your code in a try/catch block and log exceptions in the Isolated Storage to see what happened afterwards.
Here is the list of constraints that apply on scheduled agents (MSDN): Constraints for all Scheduled Task Types
Here is also a series of blog posts that could help you: Windows Phone: Background Agents Pitfalls
Have you actually measured and logged the memory that's being used? What you're saying isn't very correct:
When the background agent starts it has already taken 5-6MB to load what it needs from the .NET framework.
If you mean that the compressed files are 25KB each, you should know that the images in the memory are not compressed (at least not that much).
There are two things you can try:
Use this property and check the peak memory usage: DeviceStatus.ApplicationPeakMemoryUsage. Write it to some file (maybe every 5 images or so) and check if it's okay. Paste the results, please.
Note: When testing the memory usage, it's best to build the app in "Release" and run it without debugging on a device. That's most accurate. There are some minor variations, so you should run the agent several times to be sure it's working within the limits. You can force start it from the app using ScheduledActionService.LaunchForTest.
Also, I'd suggest you subscribe to the Application.Current.UnhandledException event and mark all exceptions as handled (and log them, so that you can fix them). That's for extra safety.
P.S. When the background agent stops executing, is it "blocked" in the list of background tasks on the device?

Why does Flash Builder 4.6 Profiler seem to leak Strings, whereas Debug mode GC's as expected

While unit profiling my classes I noticed that the String class endlessly accumulates (eating up over 90% of the memory in my sizable app). Luckily this is only while running in Profiler mode of Flash Builder 4.6. In debug or deployment (as AIR) memory usage levels off as expected using embedded on-screen memory profilier (Mr Doobs Stats).
To verify I made a test app that was simply a URLLoader continuously loading a text file. When running in Profilier mode using URLLoaderDataFormat.String the String data is never GC'd and grows continuously whereas using URLLoaderDataFormat.BINARY the data is nearly immediately GC'd and stays level.
I hesitate to call this a bug, because possibly it's necessary part of the way the Profilier works… but perhaps this is abnormal for the Profiler? This is the essence of my StackOverflow inquiry.
At any rate, this burned up a couple work-days for me, so if you're Googling wondering why the String class is growing like crazy and never getting GC'd consider measuring your apps memory usage outside the Profilier to verify. In my case I was mislead into thinking I had run into some problem with Master Strings — though it's good understand Master Strings and their impact on memory (see:) don't get mislead like I did.

CUDA apps time out & fail after several seconds - how to work around this?

I've noticed that CUDA applications tend to have a rough maximum run-time of 5-15 seconds before they will fail and exit out. I realize it's ideal to not have CUDA application run that long but assuming that it is the correct choice to use CUDA and due to the amount of sequential work per thread it must run that long, is there any way to extend this amount of time or to get around it?
I'm not a CUDA expert, --- I've been developing with the AMD Stream SDK, which AFAIK is roughly comparable.
You can disable the Windows watchdog timer, but that is highly not recommended, for reasons that should be obvious.
To disable it, you need to regedit HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Watchdog\Display\DisableBugCheck, create a REG_DWORD and set it to 1.
You may also need to do something in the NVidia control panel. Look for some reference to "VPU Recovery" in the CUDA docs.
Ideally, you should be able to break your kernel operations up into multiple passes over your data to break it up into operations that run in the time limit.
Alternatively, you can divide the problem domain up so that it's computing fewer output pixels per command. I.e., instead of computing 1,000,000 output pixels in one fell swoop, issue 10 commands to the gpu to compute 100,000 each.
The basic unit that has to fit within the time slice is not your entire application, but the execution of a single command buffer. In the AMD Stream SDK, a long sequence of operations can be broken up into multiple time slices by explicitly flushing the command queue with a CtxFlush() call. Perhaps CUDA has something similar?
You should not have to read all of your data back and forth across the PCIX bus on every time slice; you can leave your textures, etc. in gpu local memory; you just have some command buffers complete occasionally, to prove to the OS that you're not stuck in an infinite loop.
Finally, GPUs are fast, so if your application is not able to do useful work in that 5 or 10 seconds, I'd take that as a sign that something is wrong.
[EDIT Mar 2010 to update:] (outdated again, see the updates below for the most recent information) The registry key above is out-of-date. I think that was the key for Windows XP 64-bit. There are new registry keys for Vista and Windows 7. You can find them here: http://www.microsoft.com/whdc/device/display/wddm_timeout.mspx
or here: http://msdn.microsoft.com/en-us/library/ee817001.aspx
[EDIT Apr 2015 to update:] This is getting really out of date. The easiest way to disable TDR for Cuda programming, assuming you have the NVIDIA Nsight tools installed, is to open the Nsight Monitor, click on "Nsight Monitor options", and under "General" set "WDDM TDR enabled" to false. This will change the registry setting for you. Close and reboot. Any change to the TDR registry setting won't take effect until you reboot.
[EDIT August 2018 to update:]
Although the NVIDIA tools allow disabling the TDR now, the same question is relevant for AMD/OpenCL developers. For those: The current link that documents the TDR settings is at https://learn.microsoft.com/en-us/windows-hardware/drivers/display/tdr-registry-keys
On Windows, the graphics driver has a watchdog timer that kills any shader programs that run for more than 5 seconds. Note that the Xorg/XFree86 drivers don't do this, so one possible workaround is to run the CUDA apps on Linux.
AFAIK it is not possible to disable the watchdog timer on Windows. The only way to get around this on Windows is to use a second card that has no displayed screens on it. It doesn't have to be a Tesla but it must have no active screens.
Resolve Timeout Detection and Recovery - WINDOWS 7 (32/64 bit)
Create a registry key in Windows to change the TDR settings to a
higher amount, so that Windows will allow for a longer delay before
TDR process starts.
Open Regedit from Run or DOS.
In Windows 7 navigate to the correct registry key area, to create the
new key:
HKEY_LOCAL_MACHINE>SYSTEM>CurrentControlSet>Control>GraphicsDrivers.
There will probably one key in there called DxgKrnlVersion there as a
DWord.
Right click and select to create a new key REG_DWORD, and name it
TdrDelay. The value assigned to it is the number of seconds before
TDR kicks in - it > is currently 2 automatically in Windows (even
though the reg. key value doesn't exist >until you create it). Assign
it with a new value (I tried 4 seconds), which doubles the time before
TDR. Then restart PC. You need to restart the PC before the value will
work.
Source from Win7 TDR (Driver Timeout Detection & Recovery)
I have also verified this and works fine.
The most basic solution is to pick a point in the calculation some percentage of the way through that I am sure the GPU I am working with is able to complete in time, save all the state information and stop, then to start again.
Update:
For Linux: Exiting X will allow you to run CUDA applications as long as you want. No Tesla required (A 9600 was used in testing this)
One thing to note, however, is that if X is never entered, the drivers probably won't be loaded, and it won't work.
It also seems that for Linux, simply not having any X displays up at the time will also work, so X does not need to be exited as long as you screen to a non-X full-screen terminal.
This isn't possible. The time-out is there to prevent bugs in calculations from taking up the GPU for long periods of time.
If you use a dedicated card for CUDA work, the time limit is lifted. I'm not sure if this requires a Tesla card, or if a GeForce with no monitor connected can be used.
The solution I use is:
1. Pass all information to device.
2. Run iterative versions of algorithms, where each iteration invokes the kernel on the memory already stored within the device.
3. Finally transfer memory to host only after all iterations have ended.
This enables control over iterations from CPU (including option to abort), without the costly device<-->host memory transfers between iterations.
The watchdog timer only applies on GPUs with a display attached.
On Windows the timer is part of the WDDM, it is possible to modify the settings (timeout, behaviour on reaching timeout etc.) with some registry keys, see this Microsoft article for more information.
It is possible to disable this behavior in Linux. Although the "watchdog" has an obvious purpose, it may cause some very unexpected results when doing extensive computations using shaders / CUDA.
The option can be toggled in your X-configuration (likely /etc/X11/xorg.conf)
Adding: Option "Interactive" "0" to the device section of your GPU does the job.
see CUDA Visual Profiler 'Interactive' X config option?
For details on the config
and
see ftp://download.nvidia.com/XFree86/Linux-x86/270.41.06/README/xconfigoptions.html#Interactive
For a description of the parameter.