Looking to understand RTSP and H.264 Encapsulation

I am trying to learn enough about H.264, RTP, RTSP and encapsulation file formats to develop a video recording application.
Specifically, what should I read to understand the problem?
I want to be able to answer the following questions:
Can I save H.264 packets or NALs (per RFC 6184) to a file?
Can I save the individual payloads as files?
Can I join the RTP payloads simply by concatenating them?
What transformation is needed to save several seconds of H.264 video in an MP4 container?
What must be done to later join these MP4 files, arbitrarily split them, or serve them as a new RTSP presentation?
I want to be able to answer these questions at a fairly low level so I can implement software that performs some of these processes (capturing RTP streams, rebroadcasting joined MP4s).
Background
The goal is to record video from a network camera onto disk. The camera has an RTSP server that provides an H.264 encoded stream which it sends via RTP to a player. I have successfully played the stream using VLC, but would like to customize the process.

The "raw" video stream is a sequence of NAL units, per the H.264 specification. Neither over RTSP nor in an MP4 file do you get this stream "as is".
Over an RTSP connection you typically receive NAL units fragmented across RTP packets, and you need to depacketize them (no, you cannot simply concatenate the payloads; a depacketizer sketch follows the linked questions):
RTP H.264 Packet Depacketizer
How to process raw UDP packets so that they can be decoded by a decoder filter in a directshow source filter
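To make "depacketize" concrete, here is a minimal sketch of an RFC 6184 depacketizer in TypeScript. It only handles single NAL unit packets and FU-A fragments; the class and method names are mine, and sequence-number reordering, STAP-A and error handling are deliberately left out.

```typescript
// Minimal RFC 6184 depacketizer sketch (single NAL units and FU-A only).
// Assumes RTP payloads arrive already reordered by sequence number.
class H264Depacketizer {
  private fragment: number[] | null = null;

  // Feed one RTP payload (the bytes after the 12-byte RTP header);
  // returns a complete NAL unit when one is finished, otherwise null.
  push(payload: Uint8Array): Uint8Array | null {
    const nalType = payload[0] & 0x1f;

    if (nalType >= 1 && nalType <= 23) {
      // Single NAL unit packet: the payload *is* the NAL unit.
      return payload;
    }

    if (nalType === 28) {
      // FU-A: byte 0 = FU indicator, byte 1 = FU header (S/E bits + real type).
      const fuHeader = payload[1];
      const start = (fuHeader & 0x80) !== 0;
      const end = (fuHeader & 0x40) !== 0;

      if (start) {
        // Reconstruct the original NAL header from the indicator (F/NRI) + FU type.
        const nalHeader = (payload[0] & 0xe0) | (fuHeader & 0x1f);
        this.fragment = [nalHeader];
      }
      if (this.fragment) {
        for (let i = 2; i < payload.length; i++) this.fragment.push(payload[i]);
        if (end) {
          const nal = Uint8Array.from(this.fragment);
          this.fragment = null;
          return nal;
        }
      }
      return null;
    }

    // STAP-A (24) and the other packetization modes would need extra handling.
    return null;
  }
}
```

Each returned buffer is one complete NAL unit; to write a playable Annex B file you would prepend a 00 00 00 01 start code to each one.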
An MP4 file is a container format with its own structure (boxes), so you cannot simply stream NALs into such a file; you have to do what is called multiplexing (a sketch of one part of that transformation follows the linked question):
How do I create an mp4 file from a collection of H.264 frames and audio frames?
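As one concrete piece of that multiplexing: inside an MP4 track, H.264 samples are stored as length-prefixed NAL units (the AVCC layout, with SPS/PPS carried in the avcC box) rather than with Annex B start codes. The sketch below shows just that start-code-to-length-prefix step; the function name is mine, and timing, the moov/avcC boxes and SPS/PPS handling still need a real muxer such as MP4Box or ffmpeg.

```typescript
// Sketch: convert an Annex B elementary stream (00 00 00 01 start codes)
// into the length-prefixed NAL layout used inside an MP4 'mdat'.
function annexBToLengthPrefixed(annexB: Uint8Array): Uint8Array {
  const nals: Uint8Array[] = [];
  let i = 0;
  while (i < annexB.length) {
    // Find the next start code (3- or 4-byte form).
    if (annexB[i] === 0 && annexB[i + 1] === 0 &&
        (annexB[i + 2] === 1 || (annexB[i + 2] === 0 && annexB[i + 3] === 1))) {
      const start = i + (annexB[i + 2] === 1 ? 3 : 4);
      // Scan forward to the next start code (or the end of the buffer).
      let end = start;
      while (end < annexB.length &&
             !(annexB[end] === 0 && annexB[end + 1] === 0 &&
               (annexB[end + 2] === 1 ||
                (annexB[end + 2] === 0 && annexB[end + 3] === 1)))) {
        end++;
      }
      nals.push(annexB.subarray(start, end));
      i = end;
    } else {
      i++;
    }
  }

  // Re-emit each NAL with a 4-byte big-endian length prefix.
  const total = nals.reduce((n, nal) => n + 4 + nal.length, 0);
  const out = new Uint8Array(total);
  const view = new DataView(out.buffer);
  let offset = 0;
  for (const nal of nals) {
    view.setUint32(offset, nal.length);
    out.set(nal, offset + 4);
    offset += 4 + nal.length;
  }
  return out;
}
```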

Just install rtmpdump along with rtmpsrv and rtmpsuck; they will do all the work.
In one terminal run rtmpsrv, and in another run rtmpdump -r "RTMP URL".
This will save the stream to mystream.flv.

Related

HTML5 Audio Streaming from Live Source

I am looking into implementing live audio streaming from a vxWorks system into an HTML5 audio player. I have been researching for a while but am stuck on one key step.
Work Flow:
A host has a mic and speaks into the mic
Audio is received into the vxWorks OS. Here it can be encoded, packaged - anything is possible here
????
User opens web page to listen to the live audio in an HTML5 player.
I don't know what goes in step 3. Suppose I have now encoded the audio into MP3. What technologies do I need to send this to the browser? I believe I can send it over the HTTP protocol, but I don't understand how it is packaged, i.e. how the audio is packaged into HTTP. And what does the HTML5 player want as a source for this data: a URL, or WebSocket data?
Thanks.
The solution for #3 depends on whether you need a low-latency live audio stream at the HTML5 player, or whether a latency of 10-30 seconds is acceptable.
1. Latency can be high
You need a streaming server / HLS packager that will stream via HLS, and a webpage hosting Flowplayer or JWPlayer, which will play that HLS stream through the HTML5 video tag (see the playback sketch below). An AAC-encoded (not MP3!) audio stream needs to be pushed to that streaming server / HLS packager (RTMP publishing is the most popular method).
You could go with free nginx; it is portable to vxWorks and can stream out via HLS. You could also try free VLC or ffmpeg for the same.
If you make it work, the stream will be playable on any browser on any device - iOS, Android, Windows OS etc...
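Flowplayer and JWPlayer wrap this up for you, but if you want to wire it yourself, the usual pattern is: Safari/iOS plays the .m3u8 URL natively in a video tag, and other browsers play it through hls.js on top of Media Source Extensions. A rough sketch, where the stream URL and element id are placeholders:

```typescript
// Sketch: playing an HLS stream in the browser. Safari/iOS can play the
// .m3u8 URL natively; most other browsers need MSE via hls.js.
import Hls from "hls.js";

const video = document.getElementById("player") as HTMLVideoElement;
const src = "https://example.com/live/stream.m3u8";

if (video.canPlayType("application/vnd.apple.mpegurl")) {
  video.src = src;              // native HLS (Safari, iOS)
} else if (Hls.isSupported()) {
  const hls = new Hls();
  hls.loadSource(src);          // fetch the playlist and segments
  hls.attachMedia(video);       // feed them to the <video> tag via MSE
}
video.play();                   // may need a user gesture under autoplay policies
```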
2. You need low latency
This is much harder. You would need another machine running Linux or Windows OS.
On that machine install Unreal Media Server or Evostream server; these servers stream to HTML5 players over the WebSocket protocol using ISO BMFF packaging (the general client-side pattern is sketched below).
As before, an AAC-encoded (not MP3!) audio stream needs to be pushed to that streaming server (RTMP publishing is the most popular method).
These streams will play on any browser / any OS EXCEPT iOS!
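Those servers ship their own players, but the general client-side pattern behind "WebSocket + ISO BMFF" is a WebSocket feeding fragmented-MP4 chunks into a Media Source Extensions SourceBuffer. A rough sketch of that pattern, where the URL, codec string and framing are assumptions:

```typescript
// Rough sketch of the generic "WebSocket + ISO BMFF + MSE" client pattern.
// Real servers put their own protocol on top of the WebSocket.
const audio = document.querySelector("audio") as HTMLAudioElement;
const mediaSource = new MediaSource();
audio.src = URL.createObjectURL(mediaSource);

mediaSource.addEventListener("sourceopen", () => {
  const sb = mediaSource.addSourceBuffer('audio/mp4; codecs="mp4a.40.2"'); // AAC in fMP4
  const queue: ArrayBuffer[] = [];

  const ws = new WebSocket("wss://example.com/live/audio");
  ws.binaryType = "arraybuffer";
  ws.onmessage = (ev) => {
    queue.push(ev.data as ArrayBuffer);
    if (!sb.updating) sb.appendBuffer(queue.shift()!);
  };
  // appendBuffer is asynchronous; drain the queue one chunk at a time.
  sb.addEventListener("updateend", () => {
    if (queue.length && !sb.updating) sb.appendBuffer(queue.shift()!);
  });
});
audio.play(); // may need a user gesture under autoplay policies
```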

Streaming adaptive audio on the web (low latency)

I am attempting to implement a streaming audio solution for the web. My requirements are these:
Relatively low latency (no more than 2 seconds).
Streaming in a compressed format (Ogg Vorbis/MP3) to save on bandwidth.
The stream is generated on the fly and is unique for each client.
To clarify the last point, my case does not fit the usual pattern of having a stream being generated somewhere and then broadcast to the clients using something like Shoutcast. The stream is dynamic and will adapt based on client input which I handle separately using regular http requests to the same server.
Initially I looked at streaming Vorbis/MP3 as http chunks for use with the html5 audio tag, but after some more research I found a lot of people who say that the audio tag has pretty high latency which disqualifies it for this project.
I also looked into Emscripten which would allow me to play audio using SDL2, but the prospect of decoding Vorbis and MP3 in the browser is not too appealing.
I am looking to implement the server in C++ (probably using the asynchronous facilities of boost.asio), and to have as small a codebase as possible for playback in the browser (the more the browser does implicitly the better). Can anyone recommend a solution?
P.S. I have no problem implementing streaming protocol support from scratch in C++ if there are no ready to use libraries that fit the bill.
You should look into Media Source Extensions (MSE); a minimal client sketch follows the links below.
Introduction: http://en.wikipedia.org/wiki/Media_Source_Extensions
Specification: https://w3c.github.io/media-source/
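In short, Media Source Extensions let your page fetch compressed chunks itself (so the stream can be unique per client) and append them to an audio element while controlling buffering. One caveat: browsers generally want fragmented MP4 or WebM in a SourceBuffer rather than raw Ogg/MP3, so your C++ server would emit, for example, WebM/Opus. A rough sketch, with the endpoint and codec as assumptions:

```typescript
// Sketch: pull a per-client stream over chunked HTTP and append it to an
// <audio> element with Media Source Extensions.
const mime = 'audio/webm; codecs="opus"';
if (!MediaSource.isTypeSupported(mime)) throw new Error("unsupported codec");

const audio = new Audio();
const mediaSource = new MediaSource();
audio.src = URL.createObjectURL(mediaSource);

mediaSource.addEventListener("sourceopen", async () => {
  const sb = mediaSource.addSourceBuffer(mime);
  const resp = await fetch("/stream?client=42");       // chunked HTTP response
  const reader = resp.body!.getReader();
  for (;;) {
    const { value, done } = await reader.read();
    if (done) break;
    sb.appendBuffer(value);                             // feed each chunk to the decoder
    await new Promise(r => sb.addEventListener("updateend", r, { once: true }));
  }
  mediaSource.endOfStream();
});
audio.play(); // may need a user gesture under autoplay policies
```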

How to stream webcam to server and manipulate the stream

I'd like to stream a user's webcam (from the browser) to a server and I need the server to be able to manipulate the stream (run some C algorithms on that video stream) and send the user back information.
I have looked closely at WebRTC and MediaCapture, and read the examples here: https://bitbucket.org/webrtc/codelab/overview .
However, this is made for peer-to-peer video chat. From what I have understood, the MediaStream from getUserMedia is transmitted via an RTCPeerConnection (with addStream); what I'd like to know is: can I use this, but process the video stream on the server?
Thanks in advance for your help
Here is the solution I have designed.
I post here for people seeking the same kind of information :-)
Front End side
I use the WebRTC API: get the webcam stream with getUserMedia, and open an RTCPeerConnection (plus an RTCDataChannel for sending information back down to the client).
The stream is DTLS encrypted (mandatory); the media streams use RTP and RTCP. The video is VP8 encoded and the audio is Opus encoded.
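For reference, here is a sketch of that front-end flow in TypeScript. The STUN server is a placeholder, signaling with the gateway is omitted, and I use the newer addTrack in place of addStream:

```typescript
// Sketch of the front-end flow described above: capture the webcam, add it to
// an RTCPeerConnection, and open a data channel for results coming back.
// Signaling (exchanging the offer/answer and ICE candidates with the gateway,
// e.g. over a WebSocket) is omitted.
async function startCapture(): Promise<RTCPeerConnection> {
  const pc = new RTCPeerConnection({
    iceServers: [{ urls: "stun:stun.l.google.com:19302" }],
  });

  // Webcam stream; each track ends up as an encrypted RTP stream on the wire.
  const stream = await navigator.mediaDevices.getUserMedia({ video: true, audio: true });
  stream.getTracks().forEach(track => pc.addTrack(track, stream));

  // Channel for the server to push analysis results back to the browser.
  const results = pc.createDataChannel("results");
  results.onmessage = ev => console.log("server says:", ev.data);

  const offer = await pc.createOffer();
  await pc.setLocalDescription(offer);
  // ...send pc.localDescription to the gateway and apply its answer with
  // pc.setRemoteDescription(...).
  return pc;
}
```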
Back End side
The back end is the complex part.
The best alternative I could find (so far) is the Janus Gateway. It takes care of a lot of stuff, like the DTLS handshake, RTP/RTCP demuxing, etc. Basically, it fires an event each time an RTP packet is transmitted. (RTP packets are typically the size of the MTU, so there is not a 1:1 mapping between video frames and RTP packets.)
I then built a GStreamer (version 1.0) pipeline to depacketize the RTP packets, decode the VP8, and handle video scaling and colorspace/format conversion to produce a BGR matrix (compatible with OpenCV). There is an AppSrc element at the beginning of the pipeline and an AppSink at the end.
What's left to do
I have to take extra measures to ensure good scalability (threads, memory leaks, etc) and find a clean and efficient way of using the C++ library I have inside this program.
Hope this helps !

HTML5 audio streaming by faking files: progressive download, PCM WAV

As far as I know there is no audio streaming available in HTML5, even with the audio tag.
That is, you always have to provide a file instead of passing an audio stream of some sort.
So, we know that the most commonly used formats are Ogg and MP3 (not free). WAV can also be used, but due to its size it is not common.
My question is: can I fake a file as if it were a stream? Say, create the WAV file (with the RIFF header), specify the PCM format details (frequency, channels, and so on), send that as the first few bytes, and then send the PCM stream over the wire (the actual audio chunks).
The first issue I see is that the RIFF header in a WAV file requires the chunk sizes, which amount to the length of the file. Well, we don't have a length, since it is an audio stream.
Any ideas?
Yes, absolutely.
The client doesn't need to know or care that the media it is playing is created live, or being loaded from disk.
You may have challenges depending on the codec used. WAV presents the problems you have mentioned with the header, but it should be possible. With MP3, you can generally just start sending data at any point, and the decoder will sync to the frames on its own.
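For the WAV case, the usual trick is to send a normal 44-byte PCM header whose RIFF and data sizes are simply maxed out; many decoders keep reading past the declared size, though behaviour varies by player, so test your targets. A sketch (the function name and parameters are mine):

```typescript
// Sketch: build a 44-byte PCM WAV header for an endless stream by setting
// the RIFF and data chunk sizes to 0xFFFFFFFF.
function wavStreamHeader(sampleRate: number, channels: number, bitsPerSample: number): Uint8Array {
  const header = new Uint8Array(44);
  const view = new DataView(header.buffer);
  const writeTag = (offset: number, tag: string) => {
    for (let i = 0; i < 4; i++) header[offset + i] = tag.charCodeAt(i);
  };
  const byteRate = sampleRate * channels * (bitsPerSample / 8);

  writeTag(0, "RIFF");
  view.setUint32(4, 0xffffffff, true);        // "file" size: unknown, max it out
  writeTag(8, "WAVE");
  writeTag(12, "fmt ");
  view.setUint32(16, 16, true);               // fmt chunk size
  view.setUint16(20, 1, true);                // audio format 1 = PCM
  view.setUint16(22, channels, true);
  view.setUint32(24, sampleRate, true);
  view.setUint32(28, byteRate, true);
  view.setUint16(32, channels * (bitsPerSample / 8), true); // block align
  view.setUint16(34, bitsPerSample, true);
  writeTag(36, "data");
  view.setUint32(40, 0xffffffff, true);       // data size: also unknown
  return header;
}

// e.g. 8 kHz, mono, 8-bit PCM: send this once, then the raw sample chunks.
const header = wavStreamHeader(8000, 1, 8);
```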

Live audio streaming to a browser; the method must be very simple

I'm recording a mono audio stream using a PIC at 8-bit, 8 kHz, and streaming it raw to another microprocessor that houses a web server. I'm currently buffering the data and turning it into a WAV file that gets played in the browser. What I'd like to be able to do is continuously stream the audio as it's being recorded, without putting a lot of encoding overhead on the second processor. I've been searching, but most results cover streaming from a stored file, and since the file size isn't known ahead of time I'm not sure how to do this without the overhead of MP3 encoding.
You may find that simply creating a WAV file (or another raw format) that keeps growing will, in most players/browser plugins, cause the file to act as a live stream. This is, I believe, basically how Ogg streaming and the like works. Because the player begins playing before it is done downloading anyway, it keeps playing and downloading until the end of the file, but the file has no end, so it just keeps going.
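A minimal sketch of that idea in Node.js/TypeScript: send a WAV header once with its size fields maxed out, then keep writing PCM chunks on the open (chunked) HTTP response, so the player sees a file that never stops growing. Here a generated test tone stands in for your PIC's 8-bit 8 kHz samples, and the port is arbitrary:

```typescript
// Sketch (Node.js): serve an "endless" WAV over plain HTTP.
import { createServer } from "node:http";

const RATE = 8000;                       // 8 kHz, mono, 8-bit unsigned PCM

// 44-byte PCM WAV header with its size fields maxed out.
function header(): Buffer {
  const h = Buffer.alloc(44);
  h.write("RIFF", 0); h.writeUInt32LE(0xffffffff, 4); h.write("WAVE", 8);
  h.write("fmt ", 12); h.writeUInt32LE(16, 16); h.writeUInt16LE(1, 20);
  h.writeUInt16LE(1, 22); h.writeUInt32LE(RATE, 24); h.writeUInt32LE(RATE, 28);
  h.writeUInt16LE(1, 32); h.writeUInt16LE(8, 34);
  h.write("data", 36); h.writeUInt32LE(0xffffffff, 40);
  return h;
}

createServer((req, res) => {
  res.writeHead(200, { "Content-Type": "audio/wav" }); // no Content-Length: chunked by default
  res.write(header());
  let t = 0;
  const timer = setInterval(() => {
    // 100 ms of a 440 Hz test tone as unsigned 8-bit samples.
    const chunk = Buffer.alloc(RATE / 10);
    for (let i = 0; i < chunk.length; i++, t++) {
      chunk[i] = 128 + Math.round(100 * Math.sin((2 * Math.PI * 440 * t) / RATE));
    }
    res.write(chunk);
  }, 100);
  req.on("close", () => clearInterval(timer));
}).listen(8080);
```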
VLC media player can stream FLV and many other formats.