Looking for help understanding the difference between the codecs:
avc1.42c020 and avc1.428020
I have a program that can request video in either of these formats but I'm not sure which one I should choose. Is one higher quality than the other? Will one impact CPU usage / network bitrate more than the other? Or are these mostly the same?
Hoping someone can explain what the numbers represent or point me in the right direction to look it up. Thanks!
This is described in Section 8.1 of RFC 6184 (the RTP payload format for H.264). The three bytes that intrigue you are called the profile-level-id, and they indicate the profile and sub-profile of the AVC codec that the peer supports. In your particular case,
42c020 indicates support for the Constrained Baseline profile, and 428020 indicates support for the Baseline profile.
The Baseline profile has slightly better support for dealing with packet loss, but some devices might not support it (only the Constrained Baseline profile is compulsory to implement in WebRTC according to RFC 7742). In practice, however, WebRTC doesn't need the features omitted in the Constrained Baseline profile (it has other mechanisms for dealing with packet loss), so it should be fine to choose the Constrained Baseline profile in all cases.
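For reference, here is a minimal C sketch (not tied to any particular library) of how those six hex digits break down as described above: the first byte is profile_idc, the second is the constraint-flag byte (profile-iop), and the third is level_idc. It only names the two profiles discussed in this answer and treats everything else as "other":

    /* Minimal sketch: decode an H.264 profile-level-id string such as
     * "42c020" into profile_idc, the constraint flags and level_idc. */
    #include <stdio.h>
    #include <stdlib.h>

    int main(void)
    {
        const char *ids[] = { "42c020", "428020" };

        for (int i = 0; i < 2; i++) {
            unsigned pl = (unsigned)strtoul(ids[i], NULL, 16);
            unsigned profile_idc = (pl >> 16) & 0xFF;   /* 0x42 = 66 = Baseline family */
            unsigned profile_iop = (pl >> 8)  & 0xFF;   /* constraint_set0..5 flags    */
            unsigned level_idc   =  pl        & 0xFF;   /* 0x20 = 32 -> Level 3.2      */

            int cs1 = (profile_iop >> 6) & 1;           /* constraint_set1_flag        */

            const char *name = "other/unknown";
            if (profile_idc == 66)
                name = cs1 ? "Constrained Baseline" : "Baseline";

            printf("%s: profile_idc=%u (%s), level %u.%u\n",
                   ids[i], profile_idc, name, level_idc / 10, level_idc % 10);
        }
        return 0;
    }

Running this prints Constrained Baseline for 42c020 and Baseline for 428020, both at Level 3.2.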
I'm a SW developer trying to understand configuration of the RISC-V Platform-Level Interrupt Controller (PLIC) that's in a rocket-chip derived SoC in an FPGA. Please correct me if my terminology is off.
I'm trying to programmatically configure the PLIC after a warm boot, in particular clearing interrupt pending bits. I've read the RISC-V PLIC Specification which talks about up to 15872 contexts. While I can certainly iterate over all contexts with 1024 interrupts each, I would like to be more economical.
Where do I find the actual number of contexts? Is it constant for all rocket-chip designs? Is it a tunable value? What is the right question to ask the FPGA colleagues? They use Chisel, which I understand to be some sort of hardware design language or tool.
To clarify terminology: What is a hart context?
We use the term hart to unambiguously and concisely describe a hardware thread as opposed to software-managed thread contexts.
The RISC-V PLIC specification allows for up to 15872 contexts (that maximum falls out of the PLIC memory map, which reserves one 4 KiB threshold/claim block per context: (0x4000000 - 0x200000) / 0x1000 = 15872), but in practice you'll see far fewer: the actual number is set by each specific RISC-V implementation. It is configurable in rocket-chip, so it can vary. The default configuration may offer more insight, but your specific configuration could be anything.
Your questions:
Where do the contexts come from? Where do I find the total number of contexts?
You can work out what the number should be from implementation details, but as far as I'm aware there is no register that reports how many contexts there are. This is implementation specific. Your best bet is looking at your rocket-chip configuration.
From the Linux kernel docs:
A hart context is a privilege mode in a hardware execution thread. For example, in a 4-core system with 2-way SMT, you have 8 harts and probably at least two privilege modes per hart: machine mode and supervisor mode.
That means you would have 16 contexts for that case (4 cores x 2 threads x 2 privilege modes).
From this issue:
PLIC contexts are 1:1 with harts' interruptible privilege modes. (e.g. if you have 3 harts, each of which supports taking interrupts into M-mode and S-mode, you have 6 contexts.)
In this case, M mode and S mode are privilege modes.
Is there a Scala/Chisel/VHDL line of code to grep for the number of contexts?
No. The best you'll probably be able to do is find the relevant values in your rocket-chip configs and work out what the number should be, or ask someone on your team with plenty of RISC-V experience. There isn't a register that stores the total number of contexts.
Is it constant for all rocket-chip designs?
No. The design could specify any number of harts or privilege modes. This is implementation specific and rocket-chip doesn't enforce any particular values.
Is it a tunable value?
Yes. The spec sets a maximum, but in practice it can be any number up to that maximum.
What is the right question to ask the FPGA colleagues?
Ask what the maximum number of contexts is that they expect. If they don't know, ask how many harts there are in your implementation and how many interruptible privilege modes each hart supports, then multiply the two.
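If it helps, here is a rough bare-metal C sketch of what clearing things out per context can look like once you know those two numbers. PLIC_BASE, NUM_HARTS, MODES_PER_HART and NUM_SOURCES are placeholders you would take from your device tree or rocket-chip configuration; the register offsets are the ones from the RISC-V PLIC spec (0x80 bytes of enable bits per context at offset 0x2000, and a 4 KiB threshold/claim block per context at offset 0x200000). Pending bits are cleared through the claim/complete handshake, not by writing the pending array.

    /* Rough sketch only. PLIC_BASE, NUM_HARTS, MODES_PER_HART and NUM_SOURCES
     * are placeholders: take the real values from your device tree or
     * rocket-chip configuration. */
    #include <stdint.h>

    #define PLIC_BASE       0x0C000000UL   /* placeholder: common rocket-chip address */
    #define NUM_HARTS       4U             /* placeholder */
    #define MODES_PER_HART  2U             /* placeholder: M-mode and S-mode          */
    #define NUM_SOURCES     64U            /* placeholder: sources actually wired up  */

    #define PLIC_ENABLE(ctx)    ((volatile uint32_t *)(PLIC_BASE + 0x2000 + 0x80 * (ctx)))
    #define PLIC_THRESHOLD(ctx) ((volatile uint32_t *)(PLIC_BASE + 0x200000 + 0x1000 * (ctx)))
    #define PLIC_CLAIM(ctx)     ((volatile uint32_t *)(PLIC_BASE + 0x200004 + 0x1000 * (ctx)))

    void plic_quiesce(void)
    {
        unsigned num_contexts = NUM_HARTS * MODES_PER_HART;   /* harts x interruptible modes */
        unsigned num_words    = (NUM_SOURCES + 31U) / 32U;

        for (unsigned ctx = 0; ctx < num_contexts; ctx++) {
            /* Temporarily enable every source for this context: claims only
             * return enabled sources, and completions for disabled sources are
             * silently ignored. Assumes the sources still carry nonzero
             * priorities from before the warm boot (priority 0 is never
             * claimable). */
            for (unsigned w = 0; w < num_words; w++)
                PLIC_ENABLE(ctx)[w] = 0xFFFFFFFFu;

            /* Drain stale pending interrupts via claim/complete. A
             * level-triggered source that is still asserted will re-pend, so a
             * real implementation wants an escape hatch here. */
            uint32_t id;
            while ((id = *PLIC_CLAIM(ctx)) != 0)
                *PLIC_CLAIM(ctx) = id;

            /* Mask everything again and reset the threshold. */
            for (unsigned w = 0; w < num_words; w++)
                PLIC_ENABLE(ctx)[w] = 0;
            *PLIC_THRESHOLD(ctx) = 0;
        }
    }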
More resources
RISC-V Wikipedia page
Official RISC-V specification
RISC-V PLIC specification
Rocket-chip docs
Chisel docs
I have a simple question.
What is the current maximum bitrate supported by the Google Chrome browser for a web camera?
For example, if I have a virtual source with a high-bitrate output (constant bitrate of 50 Mbit/s),
would I be able to get all 50 Mbit/s in my Chrome browser when using this device?
Thank you.
The camera's bitrate is irrelevant in this case, since WebRTC is going to re-encode the video with a compressing video codec anyway.
What matters for WebRTC are 4 separate parameters:
The resolution supplied and the one the other end of the session is capable of receiving
The frame rate supplied and the one the other end of the session is capable of receiving
The network conditions - there is a limit enforced by the network, and it is dynamic in nature, so WebRTC will try to estimate it at all times and adapt to it
The maximum bitrate imposed by the participants
By its nature, WebRTC will not limit the amount of bandwidth it takes and will try to use as much as it possibly can. That said, the actual bitrate used, even without any limits, will still depend on (1), (2) and the type of codec being used. It won't reach 50 Mbit/s...
For the most part, 2.5 Mbit/s will be enough for almost any type of content in WebRTC. 1080p will take up to 4 Mbit/s and 4K probably around 15 Mbit/s.
Is passing a pointer to cudaHostRegister that's not page-aligned allowed/portable? I'm asking because the simpleStreams example does manual page alignment, but I can't find this requirement in the documentation. Maybe it's a portability problem (similar to mlock() supporting non-aligned on linux, but POSIX does not in general)?
I modified the bandwidth test to use non-aligned but registered memory, and it performs the same as memory returned by cudaHostAlloc. Since I use these pinned buffers for overlapping copies and computation, I'm also interested in whether non-alignment prevents that (so far I could not detect a performance loss).
All my tests were on x86-64 linux.
Maybe it's a portability problem (similar to mlock() supporting non-aligned on linux, but POSIX does not in general)?
Both Linux's mlock and Windows' VirtualLock will lock all pages containing a byte or more of the address range you want to lock, so manual alignment is not needed. But as you noted, POSIX allows an implementation to require the argument of mlock to be page-aligned. This is notably the case with OS X's mlock, which will round a page-unaligned address up to the next page boundary, therefore not locking the entirety of the address range.
The documentation of cudaHostRegister makes no mention of any alignment constraint on its arguments. As such, a consumer of this API would be within their rights to expect that any alignment concern on the underlying platform is the responsibility of cudaHostRegister, not the user. But without seeing the source of cudaHostRegister, it's impossible to tell whether this is actually the case. Since the sample deliberately takes care of alignment manually, it is possible that cudaHostRegister doesn't have such transparent alignment-fixing functionality.
Therefore, yes, it is likely the sample was written to ensure its portability across OSes supported by CUDA (Windows, Linux, Mac OS X).
I just found the following lines in the old 4.0 NVIDIA Library documentation... Maybe it will be helpful for future questions:
The CUDA context must have been created with the cudaMapHost flag in order for the cudaHostRegisterMapped flag to have any effect.
The cudaHostRegisterMapped flag may be specified on CUDA contexts for devices that do not support mapped pinned memory. The failure is deferred to cudaHostGetDevicePointer() because the memory may be mapped into other CUDA contexts via the cudaHostRegisterPortable flag.
and finally
The pointer ptr and size size must be aligned to the host page size (4 KB).
so it is about the host page size.
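For reference, here is a minimal sketch (POSIX host assumed, error handling trimmed) of the two approaches discussed above: aligning the region yourself, the way the simpleStreams sample does, versus handing cudaHostRegister a deliberately unaligned pointer, which happened to work in the tests mentioned in the question on Linux/x86-64 but which the quoted documentation does not guarantee:

    /* Sketch only: over-allocate, round the start up to the next page
     * boundary, and register a whole number of pages from there, then try
     * an unaligned pointer for comparison. Assumes a POSIX host (sysconf);
     * on Windows you would query the page size via GetSystemInfo instead. */
    #include <stdint.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>
    #include <cuda_runtime.h>

    int main(void)
    {
        size_t page  = (size_t)sysconf(_SC_PAGESIZE);
        size_t bytes = 1 << 20;                      /* payload we actually want pinned */

        /* Over-allocate by one page so we can align the start ourselves. */
        char *raw = (char *)malloc(bytes + page);
        char *aligned = (char *)(((uintptr_t)raw + page - 1) & ~(uintptr_t)(page - 1));

        cudaError_t err = cudaHostRegister(aligned, bytes, cudaHostRegisterDefault);
        printf("aligned register:   %s\n", cudaGetErrorString(err));
        if (err == cudaSuccess)
            cudaHostUnregister(aligned);

        /* For comparison: a deliberately unaligned pointer. Treat success
         * here as non-portable, given the documentation quoted above. */
        err = cudaHostRegister(raw + 13, bytes, cudaHostRegisterDefault);
        printf("unaligned register: %s\n", cudaGetErrorString(err));
        if (err == cudaSuccess)
            cudaHostUnregister(raw + 13);

        free(raw);
        return 0;
    }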
When parsing NAL units from an H.264 source, is it possible to determine the end of an Access Unit without having to find the start of the next one? I am aware of the following section in the H.264 spec:
7.4.1.2.4 Detection of the first VCL NAL unit of a primary coded picture
And I have currently implemented this. The problem here, though, is that if there is a large time gap at the end of an Access Unit, I won't 'get' the Access Unit until the start of the next one. Is there another way to determine the end (i.e. the last NAL) of an Access Unit?
I am also aware of the marker bit in the RTP standard, but it is not reliable enough for us to use, and in some cases it is just plain wrong.
No, I don't think so.
The (unreliable) marker bit is the only way to signal the end of an access unit (in the case of RTP).
They should have handled it more reliably in the H.264 payload format (RFC 6184).
You can check timestamps and sequence numbers to infer the start of a new AU, but that is also unreliable (packet loss, reordering, and you still need to wait for the first packet of the next AU).
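For what it's worth, the shortcut many implementations use for the 7.4.1.2.4 check the asker mentions is to treat a VCL NAL whose slice header starts with first_mb_in_slice == 0 as the start of a new primary coded picture. Here is a hedged C sketch of that heuristic; it is not the full 7.4.1.2.4 comparison, and it still only tells you about the start of the next AU rather than the end of the current one:

    #include <stddef.h>
    #include <stdint.h>

    /* Simplified heuristic, not the full 7.4.1.2.4 comparison. VCL NAL types
     * 1, 2 and 5 begin their payload with first_mb_in_slice coded as ue(v);
     * a leading '1' bit encodes the value 0, i.e. the first slice of a new
     * primary coded picture. Assumes `nal` points at the NAL unit header
     * byte, after the start code. */
    static int starts_new_picture(const uint8_t *nal, size_t len)
    {
        if (len < 2)
            return 0;

        unsigned nal_type = nal[0] & 0x1F;
        if (nal_type == 1 || nal_type == 2 || nal_type == 5)   /* slice / partition A / IDR */
            return (nal[1] & 0x80) != 0;                       /* first_mb_in_slice == 0    */

        return 0;
    }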
How many peer connections can I create on a single client? Is there any limit?
I assume you've arrived at 256 experimentally since there is currently no documentation/specs to suggest it. I don't know exactly how things have changed since 2013, but currently, my own experiments cap out at 500 simultaneous connections per page. As far as I can tell, Firefox has no such limit.
The real limit, according to the Chromium source code, is 500 (source). As far as I can tell, there was no limit before this was implemented (source), even going as far back as the WebKit days.
I think one reason that it can be tricky to keep track of is that Chrome (and FF for that matter) have always been bad at the garbage collection of dead connections. If you check chrome://webrtc-internals (FF equivalent: about:webrtc), there will often be a build-up of zombie connections that count towards the 500 limit. These persist until you manually destroy them, or close/refresh the page. One way to work around this is through your own heartbeat implementation or using the signalling server to notify of peers disconnecting such that other peers can destroy their connection (although this requires a persistent connection to a signalling server).
The maximum peer connection limit is 256 (on Chrome).
Not sure about other major browsers; they are limited, depending on your bandwidth, to maintain a certain stability.
Not sure if there is any hard limit (other than runtime memory), but there is certainly a soft one.
If you are considering a full-mesh topology (an app in which every client is connected to every other client), then you have to consider the main deficiency of this topology: the bandwidth required to sustain the overall session grows with each new participant.
Therefore, users with low bandwidth will not be able to handle a video conference session with a large number of participants.
Hope it helps.
This is an interesting topic. I was just watching this YouTube video about multi-peer WebRTC. The presenter said it just depends on the number of peers, but the most he demonstrated was fewer than 6 peers. This also depends on your bandwidth. The best thing you can do is build a WebRTC app, try connecting with your friends, and judge for yourself, as this also depends on the country you are in. For example, I live in Botswana and the network is not fast, so I wouldn't expect to have 6 peers when I am still struggling to get clear communication with just one person here.
According to this source:
In practice, even under optimal network conditions, a mesh video call doesn’t work well beyond five participants.