How to stream HTML5 audio to the server over WebSocket

There are solutions in HTML5 for receiving an audio stream with the <audio> or <video> tag. Can I do the reverse? What if I want to stream audio to the server using getUserMedia() and a WebSocket?
It doesn't seem straightforward, since I cannot get at the byte stream directly. Is it actually possible? If so, how do I send the audio stream to the server with ws.send()?
Thanks.

WebSocket is not a protocol designed for media streaming. However, you can achieve your goal with WebRTC. Doing WebRTC peer-to-peer is much easier than client-to-server, but it can be done. You can in fact stream both audio and video from HTML5 to a server with WebRTC.
Look at the WebRTC specifications and then implement ICE and TURN on your server to get the negotiation running. You will then be able to receive the streams from several browsers on your server.
Not easy... but it can be done :)
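That said, for the ws.send() route the question asks about, it is possible to pull raw PCM out of getUserMedia with the Web Audio API and push it over a WebSocket yourself. A minimal sketch, where the WebSocket URL and buffer size are placeholder assumptions and the server must know it is receiving raw Float32 PCM:

    // Capture microphone audio and push raw PCM frames over a WebSocket.
    const ws = new WebSocket('wss://example.com/audio');   // assumed endpoint
    ws.binaryType = 'arraybuffer';

    navigator.mediaDevices.getUserMedia({ audio: true }).then((stream) => {
      // Note: browsers may require a user gesture before an AudioContext will start.
      const ctx = new AudioContext();
      const source = ctx.createMediaStreamSource(stream);
      // ScriptProcessorNode is deprecated but widely supported; AudioWorklet is the modern replacement.
      const processor = ctx.createScriptProcessor(4096, 1, 1);

      processor.onaudioprocess = (e) => {
        if (ws.readyState !== WebSocket.OPEN) return;
        const samples = e.inputBuffer.getChannelData(0);
        // Copy before sending: the underlying buffer is reused between callbacks.
        ws.send(new Float32Array(samples).buffer);
      };

      source.connect(processor);
      processor.connect(ctx.destination);   // needed in some browsers for onaudioprocess to fire
    });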

Related

Can I access the raw RTP stream from a web browser?

I have an H.264 real-time video stream served by gst-rtp-server. In addition, there is the possibility of using an auxiliary FEC stream from the server to improve performance in noisy environments (such as WiFi). FEC works at the RTP layer, so on the client side these two RTP streams must be combined into the final one.
Using GStreamer on the client side inside a dedicated native app works perfectly. But instead of such a native app, I am also considering a modern HTML5 web browser to receive and render the video stream.
So, the formal question: is it possible to get the raw RTP video stream from a modern browser somehow? I need to support iOS and Android as well as the main desktop systems.
Currently I am considering GStreamer-based preprocessing on the client side: a standalone, GUI-less GStreamer-based service (a tiny native app) would be launched from the webpage and would perform the RTP and FEC processing, depayloading the RTP and repackaging the result into something HTML5 supports. That new stream would then be served from localhost to the HTML5 'video' tag on the webpage.
Alternatively, such a GStreamer-based service could be implemented as an NPAPI plugin, but nowadays NPAPI is deprecated and may not be supported at all.
Any other ideas?
Thanks.
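For the webpage side of that localhost workaround, the idea would simply be to point the HTML5 video element at whatever endpoint the local GStreamer service exposes; a rough sketch, where the port and path are hypothetical:

    const video = document.querySelector('video');
    // Hypothetical endpoint exposed by the local GStreamer relay service.
    video.src = 'http://localhost:8090/stream';
    video.play().catch(() => {
      // Autoplay policies may require a user gesture before playback starts.
    });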

HTML5 Audio Streaming from Live Source

I am looking into implementing live audio streaming from a vxWorks system into an HTML5 audio player. I have been researching for a while but am stuck on one key step.
Workflow:
1. A host has a mic and speaks into the mic.
2. Audio is received by the vxWorks OS. Here it can be encoded and packaged; anything is possible at this step.
3. ????
4. The user opens a web page to listen to the live audio in an HTML5 player.
I don't know what goes in step 3. Suppose I have now encoded the audio as MP3. What technologies do I need to send it to the browser? I believe I can send it over HTTP, but I don't understand how it is packaged, i.e., how is the audio packaged for the HTTP protocol? And what does the HTML5 player want as the source for this data: a URL, or WebSocket data?
Thanks.
The solution for step 3 depends on whether you need a low-latency live audio stream at the HTML5 player, or whether a latency of 10-30 seconds is acceptable.
1. Latency can be high
You need a streaming server / HLS packager that will stream via HLS, and a webpage hosting Flowplayer or JWPlayer, which will play that HLS stream through the HTML5 video tag. An AAC-encoded (not MP3!) audio stream needs to be pushed to that streaming server / HLS packager (RTMP publishing is the most popular method).
You could go with free nginx; it is portable to vxWorks and can stream out via HLS. You could also try free VLC or ffmpeg for the same.
If you make it work, the stream will be playable in any browser on any device: iOS, Android, Windows, etc.
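On the playback side, the webpage part is small. A rough sketch, assuming the packager publishes a playlist at a URL like the one below; Safari plays HLS natively, while other browsers need a player such as the ones named above (hls.js is another freely available option):

    const audio = document.querySelector('audio');
    const playlist = 'https://example.com/live/stream.m3u8';   // assumed packager output

    if (audio.canPlayType('application/vnd.apple.mpegurl')) {
      // Safari (including iOS) plays HLS natively.
      audio.src = playlist;
      audio.play();
    } else if (window.Hls && Hls.isSupported()) {
      // Other browsers: hls.js feeds the stream through Media Source Extensions.
      const hls = new Hls();
      hls.loadSource(playlist);
      hls.attachMedia(audio);
      audio.play();
    }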
2. You need low latency
This is much harder. You would need another machine running Linux or Windows OS.
On that machine, install Unreal Media Server or EvoStream server; these servers stream to HTML5 players over the WebSocket protocol using ISO BMFF packaging.
As before, an AAC-encoded (not MP3!) audio stream needs to be pushed to that streaming server (RTMP publishing is the most popular method).
These streams will play on any browser / any OS EXCEPT iOS!
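To give an idea of what those players do under the hood (they ship their own client libraries, so this is only an illustrative sketch with an assumed endpoint and codec): each fMP4 segment received over the WebSocket is appended to a Media Source Extensions SourceBuffer.

    const mime = 'audio/mp4; codecs="mp4a.40.2"';   // AAC-LC in fragmented MP4
    const audio = document.querySelector('audio');
    const mediaSource = new MediaSource();
    audio.src = URL.createObjectURL(mediaSource);

    mediaSource.addEventListener('sourceopen', () => {
      const sb = mediaSource.addSourceBuffer(mime);
      const queue = [];
      sb.addEventListener('updateend', () => {
        if (queue.length && !sb.updating) sb.appendBuffer(queue.shift());
      });

      const ws = new WebSocket('wss://example.com/live/audio');   // assumed endpoint
      ws.binaryType = 'arraybuffer';
      ws.onmessage = (e) => {
        // Each message is expected to be an fMP4 initialization or media segment.
        if (sb.updating || queue.length) queue.push(e.data);
        else sb.appendBuffer(e.data);
      };
    });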

Chrome native messaging: can I stream a MediaStream to a native program?

I am writing a web application which needs to show a native window in the host window system. That window must display a video which is being streamed to the web application.
I have written a native program for OS X which displays a video in the way I need, and in the web application I have a MediaStream being sent via WebRTC. I need to connect these together.
I would like to use Chrome's native messaging, which lets me stream JSON objects to a native program. If I can access the raw data stream from the MediaStream, I should be able to transform this into JSON objects, stream those to the native application, where I can reconstruct the raw video stream.
Is something like this possible?
If possible, I strongly recommend implementing a WebRTC media server in your native application and communicating directly between the browser's WebRTC APIs and your server. Anything else has much more overhead.
For example, to go from a MediaStream to native messaging, you need a way to serialize the audio and video feeds of the MediaStream into a sequence of bytes, and then send them over the native messaging channel (which will be JSON-encoded by the browser and then JSON-decoded by your native app).
For audio, you could use audioContext.createMediaStreamSource to bridge from a MediaStream (from WebRTC) to an audio node (in the Web Audio API), and then use offlineAudioCtx.startRendering to convert the audio node's output to raw bytes.
For video, you could paint the video onto a canvas and then continuously use toDataURL or toBlob to get the underlying data and send it over the wire. (See "Taking still photos with WebRTC" on MDN for a tutorial on taking a single picture; this can be generalized to multiple frames.)
This sounds very inefficient, and it probably is, so you'd better implement a WebRTC media server in your native app to get some reasonable performance.
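For the video half, the loop described above could look roughly like this (extension context assumed, since native messaging requires the nativeMessaging permission; the host name is hypothetical):

    // Grab frames from the playing <video> element and forward them as JSON-friendly data URLs.
    const port = chrome.runtime.connectNative('com.example.video_host');   // hypothetical host name

    const video = document.querySelector('video');   // element playing the MediaStream
    const canvas = document.createElement('canvas');
    const ctx = canvas.getContext('2d');

    function sendFrame() {
      canvas.width = video.videoWidth;
      canvas.height = video.videoHeight;
      ctx.drawImage(video, 0, 0);
      // JPEG as a base64 data URL keeps the message JSON-serializable.
      port.postMessage({ type: 'frame', data: canvas.toDataURL('image/jpeg', 0.7) });
      requestAnimationFrame(sendFrame);
    }
    video.addEventListener('playing', () => requestAnimationFrame(sendFrame));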

Streaming adaptive audio on the web (low latency)

I am attempting to implement a streaming audio solution for the web. My requirements are these:
Relatively low latency (no more than 2 seconds).
Streaming in a compressed format (Ogg Vorbis/MP3) to save on bandwidth.
The stream is generated on the fly and is unique for each client.
To clarify the last point, my case does not fit the usual pattern of having a stream being generated somewhere and then broadcast to the clients using something like Shoutcast. The stream is dynamic and will adapt based on client input which I handle separately using regular http requests to the same server.
Initially I looked at streaming Vorbis/MP3 as HTTP chunks for use with the HTML5 audio tag, but after some more research I found many reports that the audio tag has fairly high latency, which disqualifies it for this project.
I also looked into Emscripten which would allow me to play audio using SDL2, but the prospect of decoding Vorbis and MP3 in the browser is not too appealing.
I am looking to implement the server in C++ (probably using the asynchronous facilities of boost.asio), and to have as small a codebase as possible for playback in the browser (the more the browser does implicitly the better). Can anyone recommend a solution?
P.S. I have no problem implementing streaming protocol support from scratch in C++ if there are no ready-to-use libraries that fit the bill.
You should look into Media Source Extensions.
Introduction: http://en.wikipedia.org/wiki/Media_Source_Extensions
Specification: https://w3c.github.io/media-source/
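As a rough sketch of what MSE playback of a per-client stream looks like (the URL and codec string are assumptions; Ogg is generally not accepted by MSE, so check MediaSource.isTypeSupported for the container/codec your server actually produces):

    const mime = 'audio/webm; codecs="opus"';        // assumed container/codec
    if (!MediaSource.isTypeSupported(mime)) throw new Error('unsupported codec');

    const audio = document.querySelector('audio');
    const mediaSource = new MediaSource();
    audio.src = URL.createObjectURL(mediaSource);

    mediaSource.addEventListener('sourceopen', async () => {
      const sourceBuffer = mediaSource.addSourceBuffer(mime);
      const response = await fetch('/live/my-session');   // hypothetical per-client endpoint
      const reader = response.body.getReader();
      while (true) {
        const { value, done } = await reader.read();
        if (done) break;
        // Wait until the previous append has finished before appending the next chunk.
        if (sourceBuffer.updating) {
          await new Promise((r) => sourceBuffer.addEventListener('updateend', r, { once: true }));
        }
        sourceBuffer.appendBuffer(value);
      }
    });
    audio.play();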

How to stream webcam to server and manipulate the stream

I'd like to stream a user's webcam (from the browser) to a server and I need the server to be able to manipulate the stream (run some C algorithms on that video stream) and send the user back information.
I have looked closely at WebRTC and MediaCapture and read the examples here: https://bitbucket.org/webrtc/codelab/overview .
However, this is designed for peer-to-peer video chat. From what I understand, the MediaStream from getUserMedia is transmitted via an RTCPeerConnection (with addStream); what I'd like to know is: can I use this, but process the video stream on the server?
Thanks in advance for your help
Here is the solution I have designed.
I'm posting it here for people seeking the same kind of information :-)
Front End side
I use the WebRTC API: get the webcam stream with getUserMedia, open an RTCPeerConnection (and an RTCDataChannel for sending information back down to the client).
The stream is DTLS-encrypted (mandatory), and the media streams use RTP and RTCP. The video is VP8-encoded and the audio is Opus-encoded.
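A condensed sketch of that front-end setup, with signaling (how the offer/answer and ICE candidates reach the gateway) left out because it is application-specific; addTrack is the modern replacement for the older addStream:

    // The 'results' channel name and STUN server are illustrative assumptions.
    const pc = new RTCPeerConnection({ iceServers: [{ urls: 'stun:stun.l.google.com:19302' }] });
    const channel = pc.createDataChannel('results');   // server -> browser feedback

    channel.onmessage = (e) => console.log('analysis result:', e.data);

    navigator.mediaDevices.getUserMedia({ video: true, audio: true }).then(async (stream) => {
      stream.getTracks().forEach((track) => pc.addTrack(track, stream));
      const offer = await pc.createOffer();
      await pc.setLocalDescription(offer);
      // Send pc.localDescription (and ICE candidates) to the gateway over your signaling channel.
    });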
Back End side
The backend is the complex part.
The best alternative I have found so far is the Janus Gateway. It takes care of a lot of things, like the DTLS handshake, RTP/RTCP demuxing, etc. Basically, it fires an event each time an RTP packet is received. (RTP packets are typically the size of the MTU, so there is no 1:1 mapping between video frames and RTP packets.)
I then built a GStreamer (version 1.0) pipeline to depayload the RTP packets, decode the VP8, and perform video scaling and colorspace/format conversion to produce a BGR matrix (compatible with OpenCV). There is an appsrc element at the beginning of the pipeline and an appsink at the end.
What's left to do
I have to take extra measures to ensure good scalability (threads, memory leaks, etc.) and find a clean and efficient way of using the C++ library I have inside this program.
Hope this helps!