I am writing a web application which needs to show a native window in the host window system. That window must display a video which is being streamed to the web application.
I have written a native program for OS X which displays a video in the way I need, and in the web application I have a MediaStream being sent via WebRTC. I need to connect these together.
I would like to use Chrome's native messaging, which lets me stream JSON objects to a native program. If I can access the raw data stream from the MediaStream, I should be able to transform this into JSON objects, stream those to the native application, where I can reconstruct the raw video stream.
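As a sketch of how the extension side of that bridge might look (assuming a Chrome extension with the "nativeMessaging" permission; the host name "com.example.video_host" and the message format are placeholders, not an existing contract):

    // Open a long-lived port to the registered native messaging host.
    // Chrome JSON-encodes each message before handing it to the native app.
    const port = chrome.runtime.connectNative('com.example.video_host');

    // Hypothetical helper: wrap a chunk of raw video bytes in a JSON message.
    function sendChunk(frameBytes) {
      port.postMessage({ type: 'frame', data: Array.from(frameBytes) });
    }

    port.onMessage.addListener((msg) => {
      console.log('Reply from the native app:', msg);
    });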
Is something like this possible?
If possible, I strongly recommend implementing a WebRTC media server in your native application and communicating directly between the browser's WebRTC APIs and your server. Anything else has much more overhead.
For example, to go from a MediaStream to native messaging, you need a way to serialize the audio and video feeds in the MediaStream to a sequence of bytes, and then send that over the native messaging channel (which will be JSON-encoded by the browser and then JSON-decoded by your native app).
For audio, you could use audioContext.createMediaStreamSource to bridge from a MediaStream (from WebRTC) to an Audio node (in the Web Audio API), and then use offlineAudioCtx.startRendering to convert from an audio node to raw bytes.
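As a rough sketch of that audio path (using a ScriptProcessorNode to pull raw samples, since a live MediaStream can't feed an OfflineAudioContext directly; sendToNativeHost is a hypothetical helper):

    // Assume mediaStream is the MediaStream received over WebRTC.
    const audioCtx = new AudioContext();
    const source = audioCtx.createMediaStreamSource(mediaStream);

    // A ScriptProcessorNode hands us raw Float32 samples in fixed-size chunks.
    const processor = audioCtx.createScriptProcessor(4096, 1, 1);
    processor.onaudioprocess = (event) => {
      const samples = event.inputBuffer.getChannelData(0); // Float32Array of PCM
      sendToNativeHost(Array.from(samples)); // hypothetical serializer/sender
    };

    source.connect(processor);
    processor.connect(audioCtx.destination); // keep the processing graph alive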
For video, you could paint the video on a canvas and then continuously use toDataURL or toBlob to get the underlying data to send over the wire. (See "Taking still photos with WebRTC" on MDN for a tutorial on taking a single picture; this can be generalized to multiple frames.)
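A minimal sketch of that canvas loop (the host name is a placeholder, and a real implementation would throttle the frame rate):

    // Assume videoEl is a <video> element playing the WebRTC MediaStream.
    const canvas = document.createElement('canvas');
    const ctx = canvas.getContext('2d');

    function grabFrame(videoEl) {
      canvas.width = videoEl.videoWidth;
      canvas.height = videoEl.videoHeight;
      ctx.drawImage(videoEl, 0, 0); // paint the current video frame
      const dataUrl = canvas.toDataURL('image/jpeg'); // base64-encoded frame
      chrome.runtime.sendNativeMessage('com.example.video_host', { frame: dataUrl });
      requestAnimationFrame(() => grabFrame(videoEl)); // grab the next frame
    }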
This sounds very inefficient, and it probably is, so you'd better implement a WebRTC media server in your native app to get some reasonable performance.
Related
I have an H.264 real-time video stream served by gst-rtp-server. There is also the possibility of using an additional FEC stream from the server to improve performance in noisy environments (like WiFi). FEC works at the RTP layer, so on the client side these two RTP streams must be combined into a final one.
Using GStreamer on the client side inside a dedicated native app works perfectly. But instead of such a native app, I'm also considering a modern HTML5 web browser to receive and render the video stream.
So, the formal question: is it possible to get at the raw RTP video stream from a modern browser somehow? I need to support iOS and Android, as well as the main desktop systems.
Currently, I'm considering GStreamer-based preprocessing on the client side: a tiny standalone GStreamer-based service (a native GUI-less app) would be launched from the webpage and would perform the RTP and FEC processing, depayloading the RTP and repayloading it into something that HTML5 supports. That new stream would then be served from localhost to an HTML5 'video' tag on the webpage.
Alternatively, such a GStreamer-based service could be implemented as an NPAPI plugin, but nowadays NPAPI is deprecated and might not be supported at all.
Any other ideas?
Thanks.
I am attempting to implement a streaming audio solution for the web. My requirements are these:
Relatively low latency (no more than 2 seconds).
Streaming in a compressed format (Ogg Vorbis/MP3) to save on bandwidth.
The stream is generated on the fly and is unique for each client.
To clarify the last point, my case does not fit the usual pattern of having a stream generated somewhere and then broadcast to the clients using something like Shoutcast. The stream is dynamic and will adapt based on client input, which I handle separately using regular HTTP requests to the same server.
Initially I looked at streaming Vorbis/MP3 as HTTP chunks for use with the HTML5 audio tag, but after some more research I found many people saying that the audio tag has pretty high latency, which disqualifies it for this project.
I also looked into Emscripten which would allow me to play audio using SDL2, but the prospect of decoding Vorbis and MP3 in the browser is not too appealing.
I am looking to implement the server in C++ (probably using the asynchronous facilities of boost.asio), and to have as small a codebase as possible for playback in the browser (the more the browser does implicitly the better). Can anyone recommend a solution?
P.S. I have no problem implementing streaming protocol support from scratch in C++ if there are no ready-to-use libraries that fit the bill.
You should look into Media Source Extensions (MSE).
Introduction: http://en.wikipedia.org/wiki/Media_Source_Extensions
Specification: https://w3c.github.io/media-source/
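A minimal sketch of how MSE could fit your setup (the WebSocket URL, container, and codec string are assumptions; check support with MediaSource.isTypeSupported first):

    const audioEl = document.querySelector('audio');
    const mediaSource = new MediaSource();
    audioEl.src = URL.createObjectURL(mediaSource);

    mediaSource.addEventListener('sourceopen', () => {
      const sourceBuffer = mediaSource.addSourceBuffer('audio/webm; codecs="vorbis"');
      const queue = [];

      // Receive compressed chunks from the (hypothetical) streaming server.
      const ws = new WebSocket('wss://example.com/stream');
      ws.binaryType = 'arraybuffer';
      ws.onmessage = (event) => {
        queue.push(event.data);
        if (!sourceBuffer.updating) sourceBuffer.appendBuffer(queue.shift());
      };

      // Appends must not overlap, so drain the queue one chunk at a time.
      sourceBuffer.addEventListener('updateend', () => {
        if (queue.length) sourceBuffer.appendBuffer(queue.shift());
      });
    });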
I'd like to stream a user's webcam (from the browser) to a server and I need the server to be able to manipulate the stream (run some C algorithms on that video stream) and send the user back information.
I have looked closely at WebRTC and MediaCapture and read the examples here: https://bitbucket.org/webrtc/codelab/overview .
However, this is designed for peer-to-peer video chat. From what I understand, the MediaStream from getUserMedia is transmitted via an RTCPeerConnection (with addStream); what I'd like to know is: can I use this, but process the video stream on the server?
Thanks in advance for your help
Here is the solution I have designed.
I post here for people seeking the same kind of information :-)
Front End side
I use the WebRTC API: get the webcam stream with getUserMedia, and open an RTCPeerConnection (and an RTCDataChannel for sending information back down to the client).
The stream is DTLS encrypted (mandatory); the media streams use RTP and RTCP. The video is VP8-encoded and the audio Opus-encoded.
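A condensed sketch of this front-end setup (sendToServer stands in for whatever signaling transport you use, and addTrack is the newer replacement for addStream):

    const pc = new RTCPeerConnection({
      iceServers: [{ urls: 'stun:stun.l.google.com:19302' }],
    });
    const channel = pc.createDataChannel('results'); // info coming back from the backend

    navigator.mediaDevices.getUserMedia({ video: true, audio: true })
      .then((stream) => {
        stream.getTracks().forEach((track) => pc.addTrack(track, stream));
        return pc.createOffer();
      })
      .then((offer) => pc.setLocalDescription(offer))
      .then(() => sendToServer(pc.localDescription)); // hypothetical signaling glue

    channel.onmessage = (event) => {
      console.log('Result from the backend:', event.data);
    };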
Back End side
The backend is the complex part.
The best alternative I could find (yet) is the Janus Gateway. It takes care of a lot of stuff, like the DTLS handshake, RTP/RTCP demuxing, etc. Basically, it fires an event each time an RTP packet arrives. (RTP packets are typically the size of the MTU, so there is not a 1:1 mapping between video frames and RTP packets.)
I then built a GStreamer (version 1.0) pipeline to depacketize the RTP packets, decode the VP8, handle video scaling, and do the colorspace/format conversion to produce a BGR matrix (compatible with OpenCV). There is an appsrc component at the beginning of the pipeline and an appsink at the end.
What's left to do
I have to take extra measures to ensure good scalability (threads, memory leaks, etc) and find a clean and efficient way of using the C++ library I have inside this program.
Hope this helps !
How do I stream audio and video data and pass it over the network? I have gone through a good article here, but it did not go into depth. I want to build a chat application in HTML5.
My main questions are:
How to stream the audio and video data.
How to send it to a particular IP address.
How to receive that data and pass it to the video and audio controls.
If you want to serve a stream, you need a server that does so, either by downloading and installing one or by coding your own.
Streams only work in one direction; there is no responding or "retrieving back". Streaming is almost the same as downloading, with slight differences depending on the service and use case.
Most streams are downstreams, but there are also upstreams. Have you heard about BufferStreams in PHP, Java, or whatever? It's basically the same: data -> direction -> cursor.
Streams work over many protocols, even via different network layers, for example:
network/subnet broadcast, peer-to-peer, HTTP, DLNA, even FTP streams, ...
The basic nature of a stream is nothing more than data being sent to an audience.
You need to decide:
which protocol do you want to use for streaming
which server software
which media / source / live or with selectable start/end
which clients
The most popular HTTP streaming server is Shoutcast by Nullsoft (Winamp).
There is also DLNA, which AFAIK is not HTTP-based.
To provide more information, you need to be more specific regarding your basic requirements and decisions.
I'm trying to figure out if HTML5 is suited for the client part of an online conference system.
The client must be able to:
1. Display live video provided by the server, using the video tag.
2. Similarly, play live audio using the audio tag.
3. The system also supports text messaging. Here we can use WebSockets.
4. There is also a desktop sharing feature. For this kind of data stream I was also thinking of WebSockets. But this is binary data, so it can be encoded in base64 before sending. In the HTML5 client, it then has to be decoded, processed (it is a proprietary protocol), and drawn to the screen using a canvas object (?!).
Can the web app process this amount of data at the same time?
Is HTML5 prepared for this?
Can web apps process this amount of data? Yes.
Is HTML5 prepared for this? Not yet, but soon.
These are all areas that HTML5 is working to address. However, some of the working groups are farther along than others and the features have differing levels of implementation in browsers. Ericsson is doing a lot in this area. They have a patched version of webkit that enables enough of these technologies to do usable video/audio conferencing.
In terms of desktop sharing, noVNC (a VNC client in a browser) demonstrates that this is possible. noVNC (disclaimer: I wrote noVNC) does full RFB/VNC decode and render in the browser using JavaScript and Canvas. It uses WebSockets to send and receive the data, base64 encoding/decoding it over the wire since WebSockets doesn't support binary data yet. It uses a WebSockets-to-TCP proxy, websockify, to communicate with the VNC servers. It performs quite well.
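To give a flavor of that receive path, here is a simplified sketch (not noVNC's actual code: the framing is reduced to raw full RGBA frames, and the proxy URL is an example):

    const canvas = document.getElementById('screen');
    const ctx = canvas.getContext('2d');
    const ws = new WebSocket('ws://localhost:6080/'); // via the websockify proxy

    ws.onmessage = (event) => {
      const binary = atob(event.data); // base64 text -> binary string
      const pixels = new Uint8ClampedArray(binary.length);
      for (let i = 0; i < binary.length; i++) {
        pixels[i] = binary.charCodeAt(i);
      }
      // Assume each message is one full RGBA frame matching the canvas size.
      ctx.putImageData(new ImageData(pixels, canvas.width, canvas.height), 0, 0);
    };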
Here are links to some of the relevant standards work:
HTML5 index
Full web-apps standard
Canvas
video and audio tags
Media capture
Media capture API
Device tag/element
WebSockets API
Current WebSockets protocol in Chrome/Safari
All WebSockets protocol drafts
ArrayBuffer/Typed Arrays
stream API
File API
The best place to see the status of various HTML5-related technologies is http://caniuse.com
You may want to check out the work being done by Ericsson Labs:
https://labs.ericsson.com/developer-community/blog/beyond-html5-implementing-device-and-stream-management-webkit
Also look at the index page for the new device API:
https://labs.ericsson.com/developer-community?type=blog