Just as we can read the H264 line in the SDP and confirm whether it is H.264 BP or HP, is there a way to differentiate SVC HP and BP by looking at the SDP information?
SVC line:
a=rtpmap:122 X-H264UC/90000
a=fmtp:122 packetization-mode=1;mst-mode=NI-TC
H264 line:
a=fmtp:100 profile-level-id=640029; packetization-mode=1; max-mbps=245760; max-fs=8196
H264UC (Unified Communications) is a Microsoft specific H.264 SVC implementation.
The packetization-mode in the SDP corresponds to a UCConfig mode. For example, 1 means H.264 UCConfig Mode 1, which conforms to the UC Constrained High toolset.
I am not familiar with the Microsoft implementation but as far as I can tell these are subsets of ITU-T H.264 profiles / scalable profiles.
This document might help you.
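For the plain H264 line in the question, the profile can be read from profile-level-id: the first byte is profile_idc, the second holds the constraint flags, and the third is level_idc. Here is a minimal sketch of that decoding. The function name and the lookup table are my own, the table lists only a few profile_idc values (including the two scalable ones, 83 and 86), and it ignores special cases such as level 1b. Note that the X-H264UC line above carries no profile-level-id at all, so this only tells you something when the parameter is present.

```javascript
// A few profile_idc values from the H.264 specification (not exhaustive).
const PROFILE_NAMES = {
  66: "Baseline",
  77: "Main",
  83: "Scalable Baseline",
  86: "Scalable High",
  88: "Extended",
  100: "High",
};

// Decode profile-level-id from a single SDP a=fmtp line.
// Returns null when the parameter is missing.
function decodeProfileLevelId(fmtpLine) {
  const match = /profile-level-id=([0-9a-fA-F]{6})/.exec(fmtpLine);
  if (!match) return null;
  const hex = match[1];
  const profileIdc = parseInt(hex.slice(0, 2), 16);
  const profileIop = parseInt(hex.slice(2, 4), 16); // constraint_set flags
  const levelIdc = parseInt(hex.slice(4, 6), 16);
  return {
    profileIdc,
    profileIop,
    levelIdc,
    profile: PROFILE_NAMES[profileIdc] || "unknown",
    level: (levelIdc / 10).toFixed(1), // e.g. 41 means level 4.1
  };
}

const h264 = decodeProfileLevelId(
  "a=fmtp:100 profile-level-id=640029;packetization-mode=1"
);
console.log(h264.profile, h264.level); // High 4.1

// The SVC line from the question has nothing to decode.
console.log(decodeProfileLevelId("a=fmtp:122 packetization-mode=1;mst-mode=NI-TC")); // null
```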
I have a web app which needs to support screen-sharing in (unmodified) Chrome and Firefox browsers through WebRTC. I would like to use VP9 due to its quality/bandwidth advantages.
However, both Chrome and Firefox seem to enforce SVC encoding when choosing VP9, and I haven't been able to disable it. This is a showstopper because:
The video is consumed by a single endpoint (so SVC encoding only causes overhead).
That single endpoint has hardware VP9 decoding support, but the hardware doesn't seem to support SVC. Discarding layers from SVC to make the hw-decoder happy would degrade quality and software decoding is not really an option due to platform limitations.
Is there a way to disable SVC encoding in SDP negotiations or through javascript?
We develop an IP camera product which streams H.264/MPEG4/MJPEG video via RTSP/UDP. It has a web interface, and we currently use the VLC Firefox plugin to allow viewing of the live RTSP stream in the browser, but Firefox is dropping support for NPAPI plugins, so that's currently a dead end.
The camera itself is a relatively low-powered ARM SoC (think Raspberry Pi level) so we don't have vast spare resource to do things like transcode streams on-the-fly on the board.
The main purpose is to check the video stream is working correctly from the web interface, so streaming a new stream (or transcoding it) in some other format/transport/streaming engine is less desirable than being able to somehow play the original RTSP stream directly. In regular use the video is streamed via RTSP into a VMS server so that's not up for alteration.
In an ideal world the solution would be open-source, cross-browser and happen inside an HTML5 tag, but if it works in one or more of the most popular browsers we'll take it.
I've been reading all sorts of stuff here and around the web about the brave new world of the HTML5 video tag, WebRTC, HLS, etc. and have yet to see anything that looks like a sensible and complete solution that doesn't involve some extra conversion/transcoding/re-streaming, often by some half-supported framework or an extra server in the middle which is not a viable solution.
I haven't yet found a proper description of what may or may not be required to "convert" our stream to whatever-html5-video-likes, whether it's just a slightly different wrapper around the same basic video stream or if there's a lot of overhead and everything is different. Likewise it's not clear if the conversion could be achieved either on-board or perhaps even in-browser using JS.
The reason for the title is that if we've got to change the way it all works we may as well aim to do whatever is considered "best practice" and reasonably future-proof as far as possible rather than some expedient fudge that might not work beyond the next round of browser updates / the next W3C press release...
I find it slightly disappointing (but perhaps not surprising) that in 2017 there seems to be no sensible way of achieving this.
Perhaps "least worst practice" would be more suitable terminology...
There are many methods you can use that don't require transcoding.
WebRTC
If you're using RTSP, you're much of the way there in sending your streams via WebRTC.
WebRTC uses SDP for declaring streams, and RTP for the transport of these streams. There are some other layers you need for setting up the WebRTC call, but none of these require particularly expensive computation. Most (all?) WebRTC clients will support H.264 decoding, many with hardware acceleration in-browser.
The easiest way to get started with WebRTC is to implement a browser-to-browser client first. Then, you can go a layer deeper with your own implementation.
WebRTC is the route I recommend to you. NAT traversal (in most cases) and P2P connectivity are built-in, so your customers won't have to remember IP addresses. Simply provide signalling services and your customers can connect directly to their cameras at home from wherever. Provide TURN servers, and they'll be able to connect even if both ends are firewalled. If you don't wish to provide such services, they're lightweight and can run directly on the camera in a mode like you have today.
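To illustrate how much of the SDP carries over, here is a rough sketch that looks up which RTP payload types an SDP maps to H.264. The function name and the sample SDP are made up for illustration. A real camera's SDP will have more lines, and a WebRTC offer adds ICE and DTLS attributes on top.

```javascript
// Return the payload types that an SDP maps to the given codec name.
function payloadTypesForCodec(sdp, codecName) {
  const wanted = codecName.toUpperCase();
  const types = [];
  for (const line of sdp.split(/\r?\n/)) {
    // a=rtpmap:<payload type> <encoding name>/<clock rate>
    const m = /^a=rtpmap:(\d+) ([^/]+)\/(\d+)/.exec(line.trim());
    if (m && m[2].toUpperCase() === wanted) types.push(Number(m[1]));
  }
  return types;
}

// Hypothetical SDP fragment, similar to what an RTSP DESCRIBE returns.
const cameraSdp = [
  "v=0",
  "m=video 0 RTP/AVP 96",
  "a=rtpmap:96 H264/90000",
  "a=fmtp:96 packetization-mode=1;profile-level-id=42e01f",
].join("\r\n");

console.log(payloadTypesForCodec(cameraSdp, "H264"));
```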
Fragmented MP4 over HTTP Progressive with <video> tag
This method is much simpler than WebRTC, but totally different than what you're doing now. You can take your H.264 stream, and wrap it directly in an MP4 without transcoding. Then, it can be played in a <video> tag on a page. You'll have to implement the appropriate libs in your code, but here's an FFmpeg example that outputs to STDOUT, which you'd pipe to clients:
ffmpeg \
-i YOUR_CAMERA_HERE \
-vcodec copy \
-acodec copy \
-f mp4 \
-movflags frag_keyframe+empty_moov \
-
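As a rough sketch of the "pipe to clients" part, here is one way to do it with Node.js. This assumes Node and an ffmpeg binary on the PATH, which may not match what you run on the camera. The function names, URL and port are placeholders of mine, and the arguments are the same ones as in the command above.

```javascript
const http = require("http");
const { spawn } = require("child_process");

// Same arguments as the FFmpeg command above; `input` is the camera URL.
function buildFfmpegArgs(input) {
  return [
    "-i", input,
    "-vcodec", "copy",
    "-acodec", "copy",
    "-f", "mp4",
    "-movflags", "frag_keyframe+empty_moov",
    "-", // write the fragmented MP4 to stdout
  ];
}

// Create an HTTP server that starts one ffmpeg process per client
// and pipes its stdout into the response.
function createStreamServer(input) {
  return http.createServer((req, res) => {
    const ffmpeg = spawn("ffmpeg", buildFfmpegArgs(input), {
      stdio: ["ignore", "pipe", "inherit"],
    });
    res.writeHead(200, { "Content-Type": "video/mp4" });
    ffmpeg.stdout.pipe(res);
    // Stop remuxing when the viewer disconnects.
    res.on("close", () => ffmpeg.kill("SIGKILL"));
    // For example, ffmpeg is not installed.
    ffmpeg.on("error", () => res.end());
  });
}

// Usage (hypothetical URL and port):
// createStreamServer("rtsp://camera.local/stream").listen(8080);
```

A page can then point a <video> tag at that URL. Spawning one process per viewer is the simplest thing that works. On a low-powered board you would probably want to share one remuxer between clients instead.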
Others...
In your case, there's no added benefit to DASH. DASH is intended for utilizing file-based CDNs for streaming. You control the server, so there's no point in writing out files or handling HTTP requests in a file-like manner. While you can certainly use DASH with H.264 streams without transcoding, I think it's a waste of your time.
HLS is much the same. Your stream is compatible with HLS, but HLS is dropping out of favor rapidly due to its lack of flexibility on codec. DASH and HLS are essentially the same mechanism... write a bunch of media segments to a CDN and create a playlist or manifest indicating where they are.
Well, I had to do the same thing a while back on a Raspberry Pi 3. We transcoded it on the fly using ffmpeg on the Pi and used https://github.com/phoboslab/jsmpeg to stream MJPEG, then played it in the browser/Ionic app.
// The <canvas> element that JSMpeg draws the decoded video onto
var canvas = document.getElementById('video-canvas');
// this.button.url is the stream URL held by our own component
this.player = new JSMpeg.Player(this.button.url, { canvas: canvas });
We were managing up to 4 concurrent streams with minimal delay (<2-5 secs) on our Pis.
But once we moved to React Native, we used the RN VLC wrapper on the phones.
We were using the VLC plugin in Chrome to play a multicast stream (RTP, IPv6), but with the deprecation of NPAPI plugins we need an alternative. I searched for something about HTML5 video but found nothing.
NPAPI deprecation: developer guide
Any idea?
Thanks
RTP directly to the browser is not a solution I'd use today. The implementation effort to transform a number of RTP packets to Media Segments accepted by the Media Source Extension (MSE) is rather high and perhaps it's not even doable on all browsers (chrome.sockets seems to be a way to do it at least on Chrome browsers). Plugin development for more than a single browser is a nasty business as well. Don't go there!
I am not sure if it fits your requirements but here is what I'd do:
I would set up a process that converts RTP packets to MPEG-DASH segments on a server. Coincidentally, I implemented a solution like that. You can find it on GitHub as RTP2DASH. The example receives multiple qualities of the same stream from ffmpeg, but you don't need that: a single video stream from any RTP source should be enough, as you can run MPEG-DASH with just a single video stream. Doing DASH seems like a big overhead in the beginning, but the advantage is that there are already players that work in all browsers, such as the DASH-IF Reference Player (I wouldn't use that one) or Google's Shaka Player (which is included in my example).
I'd like to stream a user's webcam (from the browser) to a server and I need the server to be able to manipulate the stream (run some C algorithms on that video stream) and send the user back information.
I have looked heavily at WebRTC and MediaCapture and read the examples here: https://bitbucket.org/webrtc/codelab/overview.
However, this is made for peer-to-peer video chat. From what I have understood, the MediaStream from getUserMedia is transmitted via an RTCPeerConnection (with addStream). What I'd like to know is: can I use this, but process the video stream on the server?
Thanks in advance for your help
Here is the solution I have designed.
I post here for people seeking the same kind of information :-)
Front End side
I use the WebRTC API: get the webcam stream with getUserMedia, then open an RTCPeerConnection (and an RTCDataChannel for the information sent back down to the user).
The stream is DTLS encrypted (mandatory), and multimedia streams use RTP and RTCP. The video is VP8 encoded and the audio is Opus encoded.
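For anyone wanting a starting point, the front-end part looks roughly like the sketch below. It is not my exact code. sendOfferToServer is a hypothetical signalling function (for example an HTTP POST to the gateway) that resolves to the server's SDP answer, the "results" label is arbitrary, and the env parameter only exists so the function can be exercised outside a browser. It uses addTrack rather than the older addStream, and ICE candidate exchange is left out.

```javascript
// Capture the webcam, attach it to a peer connection and open a data
// channel for the results the server sends back.
async function startStreaming(sendOfferToServer, onServerMessage, env = globalThis) {
  const stream = await env.navigator.mediaDevices.getUserMedia({
    video: true,
    audio: true,
  });

  const pc = new env.RTCPeerConnection();
  for (const track of stream.getTracks()) {
    pc.addTrack(track, stream);
  }

  // Channel used by the server to send information back to the user.
  const channel = pc.createDataChannel("results");
  channel.onmessage = (event) => onServerMessage(event.data);

  // Standard offer/answer exchange; signalling is up to the application.
  const offer = await pc.createOffer();
  await pc.setLocalDescription(offer);
  const answer = await sendOfferToServer(pc.localDescription);
  await pc.setRemoteDescription(answer);

  return { pc, channel };
}
```

In a page you would call it as startStreaming(postOffer, handleResult), with both callbacks supplied by your own code.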
Back End side
On the backend, this is the complex part.
The best alternative I could find (so far) is the Janus Gateway. It takes care of a lot of stuff, like the DTLS handshake, RTP/RTCP demuxing, etc. Basically, it fires an event each time an RTP packet is transmitted. (RTP packets are typically the size of the MTU, so there is not a 1:1 mapping between video frames and RTP packets.)
I then built a GStreamer (version 1.0) pipeline to depacketize the RTP packets, decode the VP8, and handle video scaling and colorspace/format conversion to output a BGR matrix (compatible with OpenCV). There is an AppSrc component at the beginning of the pipeline and an AppSink at the end.
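To make the "no 1:1 mapping" point concrete, here is a small sketch (not part of my actual pipeline, where GStreamer does this work) of reading the fixed RTP header from RFC 3550 and grouping packets into frames. It assumes packets arrive in order and without loss, and it relies on the marker bit being set on the last packet of a video frame, which is how the VP8 payload format uses it. The payloads it collects still begin with the codec's RTP payload descriptor, which a real depacketizer strips. All names are mine.

```javascript
// Parse the fixed RTP header (RFC 3550). `packet` is a Uint8Array.
function parseRtpPacket(packet) {
  if (packet.length < 12) throw new Error("RTP packet too short");
  const view = new DataView(packet.buffer, packet.byteOffset, packet.byteLength);
  let offset = 12 + 4 * (packet[0] & 0x0f); // skip the CSRC list
  if (packet[0] & 0x10) {
    // Header extension: 16-bit profile id, then 16-bit length in 32-bit words.
    offset += 4 + 4 * view.getUint16(offset + 2);
  }
  // When the padding bit is set, the last byte holds the padding length.
  const padding = packet[0] & 0x20 ? packet[packet.length - 1] : 0;
  return {
    version: packet[0] >> 6,
    marker: (packet[1] & 0x80) !== 0,
    payloadType: packet[1] & 0x7f,
    sequenceNumber: view.getUint16(2),
    timestamp: view.getUint32(4),
    ssrc: view.getUint32(8),
    payload: packet.subarray(offset, packet.length - padding),
  };
}

// Collect payloads until the marker bit says the frame is complete,
// then hand the whole frame to `onFrame`.
function createFrameAssembler(onFrame) {
  let parts = [];
  return function push(packet) {
    const rtp = parseRtpPacket(packet);
    parts.push(rtp.payload);
    if (rtp.marker) {
      onFrame({ timestamp: rtp.timestamp, parts });
      parts = [];
    }
  };
}
```

You would call the function returned by createFrameAssembler once for every RTP packet event.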
What's left to do
I have to take extra measures to ensure good scalability (threads, memory leaks, etc) and find a clean and efficient way of using the C++ library I have inside this program.
Hope this helps !
I want to automate a transcoding workflow to h.264 in the adaptive streaming containers for HLS and Microsoft Smooth Streaming and wonder what my options are.
Ideally, there's Expression Encoder Pro with the Expression SDK that I could use to do just that. However, Expression Encoder pro is no longer for sale and the non-pro version can't do h.264.
There are other h.264 encoders, in particular with x264 there's an encoder proper that's gpl-licensed. x264 really just gives a pure stream output without the container though, let alone the adaptive streaming containers I need.
I found one reasonably priced encoder called Sorenson Squeeze that appears to have all I need (and in fact can use x264 for that part of the job), but I wonder if I have other options that make more sense in terms of spending money on licenses.
I already have licenses for Adobe's Media Encoder through Creative Cloud subscriptions, but Media Encoder can't work from the command line and I don't see any support for adaptive streaming with my desired containers.
Does anybody have more ideas?
FFmpeg and/or libav can transcode to h264 and support Smooth Streaming and HLS, and run on the command line. There's a bit of a learning curve (you in practice need to have an understanding of the container formats used, GOP and fragmentation/segmentation) but they do have the features you need.
If your media is on your local machine, and you have small amounts, buying one of the tools you mentioned might be your best bet.
However, if you have lots of media and you store it on the cloud, look at cloud offerings such as Amazon Elastic Transcoder or encoding.com.
That way you get out-of-the-box support for formats like HLS, and you don't need to worry about licensing. It is all included in their pricing, which is "per use". No subscription or upfront costs.
For MPEG-DASH adaptive bitrate content, for example, you can use either tools such as x264 + MP4Box, or cloud services like bitcodin.