I've been building a Chrome extension that uses, in part, the Chrome debugger protocol.
Certain events in the Network domain like requestWillBeSent include a "timestamp" as well as a "wallTime."
The wallTime is a regular seconds-since-1970 value. The timestamp is also in seconds, but it's not clear where its zero point is, and many events have no wallTime, so I'm trying to figure out how to derive wallTime from timestamp.
Based on this, I believed it to be based on the navigationStart value, but that did not yield the correct date using either the navigationStart of the extension's background page or the navigationStart of the page where the event originated.
Is it possible at all to use timestamp to get at the wallTime, or am I out of luck?
According to source code in InspectorNetworkAgent.cpp:
wallTime is currentTime() (normal system time)
timestamp is monotonicallyIncreasingTime()
On Windows it's based on the number of milliseconds that have elapsed since the system was started, and you can't get that info from an extension.
On POSIX systems (e.g. Linux), clock_gettime with CLOCK_MONOTONIC is used, which represents monotonic time since some unspecified starting point.
According to source code in time.h:
TimeTicks and ThreadTicks represent an abstract time that is most of the time
incrementing, for use in measuring time durations. Internally, they are
represented in microseconds. They can not be converted to a human-readable
time, but are guaranteed not to decrease (unlike the Time class). Note that
TimeTicks may "stand still" (e.g., if the computer is suspended), and
ThreadTicks will "stand still" whenever the thread has been de-scheduled by
the operating system.
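In practice, since some Network events (such as requestWillBeSent) carry both fields, one common workaround is to calibrate an offset from an event that has both values and apply it to events that only have a timestamp. This is not from the agent source above, just clock arithmetic, and it assumes the monotonic clock isn't reset within the browser session. A minimal sketch using the chrome.debugger extension API:

// Sketch only: remember wallTime - timestamp from any event that carries both,
// then reuse that offset to approximate wall-clock time for timestamp-only events.
let wallTimeOffset = null;  // seconds

chrome.debugger.onEvent.addListener(function (source, method, params) {
    if (params && typeof params.wallTime === 'number' && typeof params.timestamp === 'number') {
        // e.g. Network.requestWillBeSent
        wallTimeOffset = params.wallTime - params.timestamp;
    }
    if (params && typeof params.timestamp === 'number' && wallTimeOffset !== null) {
        // Approximate wall-clock time in seconds since 1970
        const approxWallTime = wallTimeOffset + params.timestamp;
        console.log(method, new Date(approxWallTime * 1000).toISOString());
    }
});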
Related
Using only the driver API, for example, I profiled a single process (using cuCtxCreate); the cuCtxCreate overhead is nearly comparable to a 300 MB data copy to/from the GPU.
In the CUDA documentation here, it says (for cuDevicePrimaryCtxRetain): "Retains the primary context on the device, creating it **if necessary**". Is this the expected behavior for repeated invocations of the same process from the command line (such as running a process 1000 times to explicitly process 1000 different input images)? Does the device need CU_COMPUTEMODE_EXCLUSIVE_PROCESS to work as intended (re-use the same context when called multiple times)?
For now, that profile stays the same even if I call the process multiple times. Even without using the profiler, timings show around 1 second of completion time.
Edit: According to the documentation, there is one primary context per device per process. Does this mean there won't be a problem when using multiple threads in a single application?
What is the re-use time limit for the primary context? Is 1 second between processes okay, or does it have to be milliseconds to keep the primary context alive?
I'm already caching PTX code in a file, so the only remaining overhead looks like cuMemAlloc(), malloc() and cuMemHostRegister(); re-using the latest context from the last call to the same process would improve timings considerably.
Edit-2: The documentation for cuDevicePrimaryCtxRetain says "The caller must call cuDevicePrimaryCtxRelease() when done using the context." Is the caller here any process? Can I just use retain in the first called process and use release in the last called process in a list of hundreds of sequentially called processes? Does the system need a reset if the last process couldn't be launched and cuDevicePrimaryCtxRelease was never called?
Edit-3:
Is primary context intended for this?
process-1: retain (creates)
process-2: retain (re-uses)
...
process-99: retain (re-uses)
process-100: 1 x retain and 100 x release (to decrease counter and unload at last)
Everything is compiled for sm_30 and the device is a GRID K520.
The GPU was at boost frequency during cuCtxCreate().
The project was compiled 64-bit (release mode) on Windows Server 2016, with a CUDA driver installation in Windows 7 compatibility mode (this was the only way that worked for the K520 + Windows Server 2016).
tl;dr: No, it is not.
Is cuDevicePrimaryCtxRetain() used for having persistent CUDA context objects between multiple processes?
No. It is intended to allow the driver API to bind to a context which a library that uses the runtime API has already lazily created. Nothing more than that. Once upon a time it was necessary to create contexts with the driver API and then have the runtime bind to them. Now, with these APIs, you don't have to do that. You can, for example, see how this is done in TensorFlow here.
Does this mean there won't be a problem when using multiple threaded single application?
The driver API has been fully thread safe since about CUDA 2.0.
Is the caller here any process? Can I just use retain in the first called process and use release in the last called process in a list of hundreds of sequentially called processes?
No. Contexts are always unique to a given process. They can't be shared between processes in this way.
Is primary context intended for this?
process-1: retain (creates)
process-2: retain (re-uses)
...
process-99: retain (re-uses)
process-100: 1 x retain and 100 x release (to decrease counter and unload at last)
No.
cudaEventRecord takes an event ID and a stream ID as parameters. The Runtime API reference does not say whether the stream is required to be associated with the current device - and I can't test whether that's the case since I only have one GPU at most on any system I have access to right now.
Assuming that it must be a stream on the current device:
what happens if it gets a stream on another device?
Assuming that it can be a stream on any device:
What happens when it gets the (current device's) default stream's ID? After all, all devices' default streams have the same (null) ID.
Is there any difference in behavior based on whether the stream's device is current or not?
Combining the information from @talonmies' answer and the Stream and Event Behavior section of the CUDA C Programming Guide, which @RobertCrovella linked to in his comment:
Must the stream be associated with the current device?
No, it can be any device. However, event recording does require that the stream and the event be associated with the same device.
Is there any difference in behavior based on whether the stream's device is current or not?
Typically, no, but...
What happens when it gets the (current device's) default stream's ID?
... the default stream is an exception to that rule. Since each device's own default stream has the same (null) ID, passing the null ID to cudaEventRecord means it will check which device is currently set to determine which stream to record the event on (and this needs to be the same device the event is associated with).
When parsing NAL units from an H.264 source, is it possible to determine the end of an Access Unit without having to find the start of the next one? I am aware of the following section in the H.264 spec:
7.4.1.2.4 Detection of the first VCL NAL unit of a primary coded picture
And I have currently implemented this. The problem here, though, is that if there is a large time gap at the end of an Access Unit, I won't 'get' the Access Unit until the start of the next one. Is there another way to determine the end (i.e. the last NAL) of an Access Unit?
I am also aware of the marker bit in the RTP standard, but it is not reliable enough for us to use, and in some cases it is just plain wrong.
No, I don't think so.
The unreliable marker bit is the only way to signal the end of an access unit (in the case of RTP).
They should have handled it more reliably in the H.264 RTP payload format (RFC 6184).
You can check timestamps and sequence numbers to infer the start of a new AU, but that is also unreliable (packet loss, reordering, and you still need to wait for the first packet of the next AU); a rough sketch of that heuristic follows.
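For illustration only, here is a minimal sketch of that heuristic in JavaScript; the packet shape ({ seq, timestamp, marker }) is an assumption, and as noted above none of these signals is reliable on its own:

// All RTP packets of one access unit share the same RTP timestamp, and the
// marker bit is supposed to be set on the last packet of an AU. Treat the
// result as a best-effort guess, not a guarantee.
function makeAuBoundaryDetector() {
    let lastTimestamp = null;
    let lastSeq = null;
    // Returns true if this packet appears to start a new access unit.
    return function isNewAccessUnit(packet) {
        const seqGap = lastSeq !== null && ((packet.seq - lastSeq) & 0xffff) !== 1;
        const newTimestamp = lastTimestamp !== null && packet.timestamp !== lastTimestamp;
        lastSeq = packet.seq;
        lastTimestamp = packet.timestamp;
        // A changed timestamp means the previous AU is complete; a sequence gap
        // means packets were lost or reordered and the boundary may be wrong.
        return newTimestamp || seqGap;
    };
}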
I've been looking for a solution for this for a while now and I still haven't found it. Our app needs to poll a YouTube video object using player.getCurrentTime() to drive some screen animations. Using the Flash API this was great because we could poll the player at 40 ms intervals (25 FPS) and get very accurate current player time values. We have now started to use the iFrame API, which unfortunately does not allow anything near that level of accuracy. I did some research and it seems that because it's an iFrame, a postMessage is used to expose the player's state to the player.getCurrentTime() call. Unfortunately this postMessage event is fired very infrequently - sometimes as low as 4 times a second. Worse, the actual rate the message fires seems to depend on the browser's rendering engine.
Does anybody know if it is possible to force the render engine to fire those messages more frequently so that greater time resolution can be achieved polling the player? I tried requestAnimationFrame and it doesn't solve the problem. Has anybody successfully managed to get the iFrame player to report more accurate times and more frequently?
I've come up with a workaround for my original problem. I wrote a simple tween function that polls the iFrame player at the frequency I desire and interpolates the time instants in between. The player itself only updates the current time every 250 ms or so, depending on the render engine and platform. If you poll it more frequently than that, it will return the same current time value on several consecutive polls. However, if you apply some logic, you can detect when the player returns a new current time and update your own timer accordingly. I run the function below on a timer with a 25 ms interval. On each iteration, I add 25 ms to the current time, except in the case where I detect a change in the current time reported by the player. In that case I update my own timer with the new "actual" current time. There may be a small jump or non-linearity in the time when you do this, but if you poll the player at a high enough rate, it should be imperceptible.
// Poll the player every 25 ms and interpolate between its coarse updates.
window.setInterval(tween_time, 25);

function tween_time() {
    // Time reported by the iFrame player, in ms (only refreshes every ~250 ms).
    time_update = ytplayer.getCurrentTime() * 1000;
    playing = ytplayer.getPlayerState();
    if (playing == 1) { // 1 == playing
        if (last_time_update == time_update) {
            // The player hasn't reported a new time yet: advance our own clock.
            current_time_msec += 25;
        } else {
            // The player reported a fresh time: resynchronize to it.
            current_time_msec = time_update;
        }
    }
    do_my_animations();
    last_time_update = time_update;
}
In HTML5 it is likely that the getCurrentTime() function and the postMessage event in the YouTube API are linked to the currentTime property and the timeupdate event of the HTML5 media element specification.
The rate at which the timeupdate event fires varies between browsers and as of today cannot be tuned for the level of precision you are looking for (Flash is still a bit ahead on this one). According to the specification:
If the time was reached through the usual monotonic increase of the current playback position during normal playback, and if the user agent has not fired a timeupdate event at the element in the past 15 to 250ms and is not still running event handlers for such an event, then the user agent must queue a task to fire a simple event named timeupdate at the element. (In the other cases, such as explicit seeks, relevant events get fired as part of the overall process of changing the current playback position.)
The event thus is not to be fired faster than about 66Hz or slower than 4Hz (assuming the event handlers don't take longer than 250ms to run). User agents are encouraged to vary the frequency of the event based on the system load and the average cost of processing the event each time, so that the UI updates are not any more frequent than the user agent can comfortably handle while decoding the video.
For the currentTime property, the precision is expressed in seconds. Any precision below one second is a browser-specific implementation detail and should not be taken for granted (in practice you will get sub-second precision in modern browsers like Chrome, but with fluctuating accuracy).
On top of that, the YouTube API could be throttling all of this to the larger common ground of 250 ms precision to make all browsers happy (hence the 4 events per second you noticed in your tests). For your scenario, you are better off scaling your animations to this 250 ms precision and allowing some margin of error for a better user experience. In the future, browsers and HTML5 media will get better, and hopefully we will get true millisecond precision.
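For reference, a minimal sketch (written against a plain HTML5 video element rather than the YouTube iFrame player, and assuming an element with id "myVideo" exists on the page) that measures how often timeupdate actually fires in a given browser:

// Log the interval between successive timeupdate events and the reported
// currentTime; per the spec, the gap should fall roughly between 15 and 250 ms.
const video = document.getElementById('myVideo');
let lastFired = null;

video.addEventListener('timeupdate', () => {
    const now = performance.now();
    if (lastFired !== null) {
        console.log('timeupdate after ' + Math.round(now - lastFired) + ' ms, ' +
                    'currentTime = ' + video.currentTime.toFixed(3) + ' s');
    }
    lastFired = now;
});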
I have a webpage on my LAN used to input barcodes in real time into a database through a field (framework: Django + PostgreSQL + nginx). It works fine, but lately we have a customer that uses 72-character barcodes (Code Matrix), which slows down input because before the next scan the user must wait for the last one to finish redrawing in the field (it takes about 1-2 seconds, redrawing one character after another).
Is there a way to reduce the latency of drawing the scanned text in the HTML field?
The best thing would be to show the whole scanned barcode at once, not one character after another. The scanner is set to add an "Enter" after the scanned text.
In the end, as Brad stated, the problem is related more to the scanner's settings (USB in HID mode), although PC speed is also an issue. After several tests on a dual-core Linux machine, I estimate the delay is due 85% to the scanner and 15% to the PC/browser combo.
To solve the problem I first searched for and downloaded the complete manual of our 2D barcode scanner (306 pages), then I focused on the USB Keystroke Delay as a cause, but the default setting was already 'No Delay'.
The setting that affected reading speed was the USB Polling Interval, an option that applies only to the USB HID Keyboard Emulation Device.
The polling interval determines the rate at which data can be sent between the scanner and the host computer. A lower number indicates a faster data rate: the default was 8 ms, which I lowered to 3 ms without problems. Lower values weren't any faster, probably because the threshold where the PC becomes the bottleneck had been reached.
CAUTION: Ensure your host machine can handle the selected data rate; selecting a data rate that is too fast for your host machine may result in lost data. In my case, when I lowered the polling interval to 1 ms there was no data loss on the working PC, but when testing inside a virtual machine there was data loss as soon as I reached 6 ms.
Another interesting thing is that browsers tend to respond significantly slower after a long period of use with many tabs open (a couple of hours in my case), probably due to caching.
Tests were done with the Firefox and Chromium browsers on an old dual-core PC running Lubuntu (Linux).
This probably has nothing to do with your page, but with the speed of the scanner interface. Most of those scanners intentionally rate-limit their input so as not to fill the computer's buffer, avoiding characters getting dropped. Think about this... when you copy/paste text, it doesn't take a long time to redraw characters. Everything appears instantly.
Most of those scanners are configurable. Check to see if there is an option on your scanner to increase its character rate.
On Honeywell and many other brands of scanner, the USB Keystroke Interval is labeled as an INTERCHARACTER DELAY.
Also, if there is a baud rate setting, that would be something to increase.