HTML5 & Web audio api: Streaming microphone data from browser to server. Ideal transports and data compression

I am looking to take the audio input from the browser and stream it to multiple listeners. The intended use is for music, so the quality must be MP3 standard or thereabouts.
I have attempted two ways, both yielding unsuccessful results:
WebRTC
Streaming audio directly between browsers works fine, but from what I have seen the audio quality does not appear to be customisable. (It uses the Opus audio codec, but does not seem to expose any controls.)
Does anyone have any insight into how to increase the audio quality in WebRTC streams?
Websockets
The issue is the transport from the browser to the server. The PCM audio data I am acquiring via the method below has proven too large to stream continuously to the server via websockets. The stream works perfectly on high-speed connections, but on slower wifi it is unusable.
var context = new webkitAudioContext()
navigator.webkitGetUserMedia({audio: true}, gotStream)
function gotStream (stream)
{
    var source = context.createMediaStreamSource(stream)
    var proc = context.createScriptProcessor(2048, 2, 2)
    source.connect(proc)
    proc.connect(context.destination)
    proc.onaudioprocess = function(event)
    {
        var audio_data = event.inputBuffer.getChannelData(0) || new Float32Array(2048)
        console.log(audio_data)
        // send audio_data to server
    }
}
So the main question is, is there any way to compress the PCM data in order to make it easier to stream to the server? Or perhaps there is an easier way to go about this?

There are lots of ways to compress PCM data, sure, but realistically, your best bet is to get WebRTC to work properly. WebRTC is designed to do this - adaptively stream media - although you don't define what you mean by "multiple" listeners (there's a huge difference between 3 listeners and 300,000 simultaneous listeners).

There are several possible ways of resampling and/or compressing your data, none of them native though. I resampled the data to 8 kHz mono (your mileage may vary) with the xaudio.js lib from the speex.js environment. You could also compress the stream using Speex, though that is usually used for speech only. In your case, I would probably send the stream to a server, compress it there and stream it to your audience. I really don't believe a simple browser is good enough to serve data to a huge audience.
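The resampling idea above can be sketched in plain JavaScript without any library. This is only a rough illustration: the helper names `downsample` and `floatTo16BitPCM` are hypothetical, and the averaging step is a crude stand-in for a proper low-pass filter, so it is not what xaudio.js does internally.

```javascript
// Hypothetical helpers; not part of xaudio.js, speex.js or any other library.

// Reduce the sample rate by an integer factor by averaging each group of
// `factor` samples (a crude low-pass + decimate step).
function downsample(samples, factor) {
  const out = new Float32Array(Math.floor(samples.length / factor));
  for (let i = 0; i < out.length; i++) {
    let sum = 0;
    for (let j = 0; j < factor; j++) {
      sum += samples[i * factor + j];
    }
    out[i] = sum / factor;
  }
  return out;
}

// Convert float samples in [-1, 1] to signed 16-bit integers,
// clamping anything outside that range.
function floatTo16BitPCM(samples) {
  const out = new Int16Array(samples.length);
  for (let i = 0; i < samples.length; i++) {
    const s = Math.max(-1, Math.min(1, samples[i]));
    out[i] = s < 0 ? Math.round(s * 0x8000) : Math.round(s * 0x7fff);
  }
  return out;
}
```

Halving the sample rate and switching from 32-bit floats to 16-bit integers shrinks each 2048-sample buffer from 8192 bytes to 2048 bytes. The result is still uncompressed PCM, so for music-quality streaming a real codec is likely still needed.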

WebRTC seems to default to one mono channel at around 42 kb/s; it seems to be primarily designed for voice.
You can disable the audio processing features using constraints to get a more consistent input from the browser using:
navigator.mediaDevices.getUserMedia({
audio: {
autoGainControl: false,
channelCount: 2,
echoCancellation: false,
latency: 0,
noiseSuppression: false,
sampleRate: 48000,
sampleSize: 16,
volume: 1.0
}
});
Then you also should set stereo and maxaveragebitrate params on the SDP:
let answer = await peer.conn.createAnswer(offerOptions);
answer.sdp = answer.sdp.replace('useinbandfec=1', 'useinbandfec=1; stereo=1; maxaveragebitrate=510000');
await peer.conn.setLocalDescription(answer);
This should output a string which looks like this:
a=fmtp:111 minptime=10;useinbandfec=1; stereo=1; maxaveragebitrate=510000
This could raise the bitrate to as much as 510 kb/s for stereo, which is 255 kb/s per channel. The actual bitrate depends on the speed of your network and the strength of your signal, though.
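The string replacement above only works if the SDP happens to contain the exact text 'useinbandfec=1'. A slightly more defensive sketch looks up the Opus payload type first and appends to its fmtp line. The helper name `addOpusParams` is hypothetical, and the sketch assumes the browser already emits an fmtp line for Opus and that the parameters are not already present.

```javascript
// Hypothetical helper: append extra Opus parameters to the fmtp line of an SDP string.
function addOpusParams(sdp, params) {
  // Find the payload type assigned to Opus, e.g. "a=rtpmap:111 opus/48000/2".
  const match = sdp.match(/a=rtpmap:(\d+) opus\/48000\/2/i);
  if (!match) return sdp; // no Opus in this SDP; leave it untouched
  const payloadType = match[1];
  const extra = Object.keys(params)
    .map((key) => key + "=" + params[key])
    .join(";");
  // Append to the existing fmtp line for that payload type.
  // If there is no such line, the SDP is returned unchanged.
  const fmtp = new RegExp("(a=fmtp:" + payloadType + " [^\\r\\n]*)");
  return sdp.replace(fmtp, "$1;" + extra);
}

// Usage with the answer object from the snippet above:
//   answer.sdp = addOpusParams(answer.sdp, { stereo: 1, maxaveragebitrate: 510000 });
```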

Related

Play live audio stream - html5

I have a desktop application which streams raw PCM data to my browser over a websocket connection. The stream looks like this ...\x00\x00\x02\x00\x01\x00\x00\x00\x01\x00\xff\xff\xff\xff\....
The question is simple: can I play such a stream in HTML with the Web Audio API / WebRTC / ...?
Any suggestions are very welcome!
code edit
This code plays noise, randomly generated:
function myPCMSource() {
return Math.random() * 2 - 1; // random sample in [-1, 1), the range the Web Audio API expects
}
var audioContext;
try {
window.AudioContext = window.AudioContext || window.webkitAudioContext;
audioContext = new AudioContext();
} catch(e) {
alert('Web Audio API is not supported in this browser');
}
var bufferSize = 4096;
var myPCMProcessingNode = audioContext.createScriptProcessor(bufferSize, 1, 1);
myPCMProcessingNode.onaudioprocess = function(e) {
var output = e.outputBuffer.getChannelData(0);
for (var i = 0; i < bufferSize; i++) {
output[i] = myPCMSource();
}
}
So changing the myPCMSource() to the websocket stream input, should make it work somehow. But it doesn't. I don't get any errors, but the API is not playing any sound nor noise.

Use a ScriptProcessorNode, but be aware that if there is too much load on the main thread (the thread that runs your javascript, draws the screen, etc.), it will glitch.
Also, your PCM stream is probably in int16, and the Web Audio API works in terms of float32. Convert it like so:
output_float[i] = (input_int16[i] / 32767);
that is, go from the int16 range [-32768; 32767] to (approximately) a [-1.0; 1.0] range.
EDIT
I was using output_float[i] = (input_int16[i] / 32767 - 1); this article shows that you should use output_float[i] = (input_int16[i] / 32767). Now it's working fine!
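For a binary websocket message, the conversion might look like the sketch below. The helper name `int16ToFloat32` is hypothetical, and little-endian byte order is an assumption about the sender (it matches the sample bytes in the question, where \x02\x00 would be the value 2). The sketch divides by 32768 rather than 32767 so that the most negative sample maps to exactly -1.0; either divisor works in practice.

```javascript
// Hypothetical helper: decode a chunk of signed 16-bit PCM (for example the
// ArrayBuffer `data` of a binary WebSocket message) into Float32 samples.
function int16ToFloat32(arrayBuffer) {
  const view = new DataView(arrayBuffer);
  const out = new Float32Array(Math.floor(arrayBuffer.byteLength / 2));
  for (let i = 0; i < out.length; i++) {
    // `true` selects little-endian; this is an assumption about the sender.
    out[i] = view.getInt16(i * 2, true) / 32768;
  }
  return out;
}
```

The websocket needs `binaryType = "arraybuffer"` for `event.data` to arrive as an ArrayBuffer.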

Just for the record, the ScriptProcessorNode is deprecated. See the MDN article for details. The feature was replaced by AudioWorklets and the AudioWorkletNode interface.
In short, a ScriptProcessorNode runs outside of the browser's internal audio thread, which creates at least one frame (128 samples) of latency. Worse, if its thread is busy the ScriptProcessorNode often fails to respond quickly enough, so it will randomly drop audio every so often.
Worklets are basically task-specific workers that run in one of the browser's internal threads (paint, layout, audio etc). Audio worklets run in the audio thread, and implement the guts of custom audio nodes, which are then exposed through the WebAudio API as normal.
Note: You are also able to run WebAssembly inside worklets to handle the processing.
The solution provided above is still useful, as the basic idea holds, but it would ideally use an audio worklet.
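As a rough sketch of how that could look, the worklet needs somewhere to hold samples that arrive from the network until the audio thread asks for them. The `SampleQueue` class, the `PCMPlayer` processor and the "pcm-player" name below are all hypothetical; only `AudioWorkletProcessor`, `registerProcessor`, `port.onmessage` and `process()` are part of the actual API.

```javascript
// Hypothetical sample queue: the network side pushes decoded Float32 samples,
// the audio callback pulls fixed-size blocks. Missing samples become silence.
class SampleQueue {
  constructor(capacity) {
    this.buffer = new Float32Array(capacity);
    this.readIndex = 0;
    this.writeIndex = 0;
    this.length = 0;
  }

  // Append samples; once the queue is full, the rest of the input is dropped.
  push(samples) {
    for (let i = 0; i < samples.length && this.length < this.buffer.length; i++) {
      this.buffer[this.writeIndex] = samples[i];
      this.writeIndex = (this.writeIndex + 1) % this.buffer.length;
      this.length++;
    }
  }

  // Fill `output` completely, padding with zeros on underrun.
  pull(output) {
    for (let i = 0; i < output.length; i++) {
      if (this.length > 0) {
        output[i] = this.buffer[this.readIndex];
        this.readIndex = (this.readIndex + 1) % this.buffer.length;
        this.length--;
      } else {
        output[i] = 0;
      }
    }
  }
}

// Inside the worklet module this would be used roughly as follows
// (browser-only, so shown as a comment):
//
//   class PCMPlayer extends AudioWorkletProcessor {
//     constructor() {
//       super();
//       this.queue = new SampleQueue(48000);
//       // The main thread posts Float32Array chunks via node.port.postMessage().
//       this.port.onmessage = (e) => this.queue.push(e.data);
//     }
//     process(inputs, outputs) {
//       this.queue.pull(outputs[0][0]); // first output, first channel
//       return true;                    // keep the processor alive
//     }
//   }
//   registerProcessor("pcm-player", PCMPlayer);
```

Padding with silence on underrun keeps `process()` from stalling when the network falls behind, at the cost of audible gaps.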

Chrome extension to listen and capture streaming audio

Is it possible for a Chrome extension to listen for streaming audio from any of the browser's tabs? I would like to capture the streaming audio data and then analyse it.
Thanks

You could try 3 ways, none of which is 100% guaranteed to meet your needs.
Before going into more detailed descriptions, I must note that Chrome extensions do not provide convenient tools for working at the per-connection level, which is the low level required for stream capturing. This is by design. This is why the first way is:
1. Look at other browsers, for example Firefox, which provides low-level APIs for connections. They are already known to be used by similar extensions. You may have a look at MediaStealer. If you do not have a specific requirement to build your system on Chrome, you should possibly move to Firefox.
2. Develop a Chrome extension that intercepts HTTP requests by means of the webRequest API, analyses their headers and extracts media URLs (for example, those with an audio/mpeg MIME type in the HTTP headers). For a quick code example you may look at the following SO question - How to change response header in Chrome. Having the URL, you may force the media to download as a file. It will land in the default downloads folder and may have an unfriendly name. (I made such an extension, but I do not have requirements for further processing.) If you need to process such files further, it can be a challenge to monitor them in the folder and run additional analysis in a separate program.
3. Have a look at NPAPI plugins in general, and their streaming APIs in particular. I can imagine creating a plugin that is registered for, again, the audio/mpeg MIME type and receives the data via the NPP_NewStream, NPP_WriteReady and NPP_Write methods. The plugin can be wrapped into a Chrome extension. Though I have made NPAPI plugins, I never used this API, and I'm not sure it will work as expected. Nevertheless, I'm mentioning this possibility here for completeness. This method requires some coding other than web coding, meaning C/C++. NB. NPAPI plugins are deprecated and have not been supported in Chrome since September 2015.
Taking into account that you have some external (to the extension) "fingerprinting service" in mind, which sounds like intelligent data processing, you may be interested in building the whole system outside of a browser. For example, you could possibly involve an HTTP proxy that saves media from passing traffic.

If you're writing a Chrome extension, you can use the Chrome tabCapture API to record audio.
chrome.tabCapture.capture({audio: true}, function(stream) {
var recorder = new MediaRecorder(stream);
[...]
});
The rest is left as an exercise to the reader; MDN has more documentation on how to use MediaRecorder.
When this question was asked in 2013, neither chrome.tabCapture nor MediaRecorder existed.

Mac OSX solution using soundflower: http://rogueamoeba.com/freebies/soundflower/
After installing soundflower it should appear as a separate audio device in the sound preferences (apple > system preferences > sound). Divert the computer's audio to the 2ch option (stereo, 16ch is surround), then inside a DAW, such as 'audacity', set the audio input as soundflower. Now the sound should be channeled to your DAW ready for recording.
Note: having diverted the audio from the internal speakers to soundflower you will only be able to hear the audio if the 'soundflowerbed' app is actually open. You know it's open if there's an eight-legged blob in the top right task bar. Clicking this icon gives you the soundflower options.

My privoxy has the following log:
2013-08-28 18:25:27.953 00002f44 Request: api.audioaddict.com/v1/di/listener_sessions.jsonp?_method=POST&callback=_AudioAddict_WP_ListenerSession_create&listener_session%5Bid%5D=null&listener_session%5Bis_premium%5D=false&listener_session%5Bmember_id%5D=null&listener_session%5Bdevice_id%5D=6&listener_session%5Bchannel_id%5D=178&listener_session%5Bstream_set_key%5D=webplayer&_=1377699927926
2013-08-28 18:25:27.969 0000268c Request: api.audioaddict.com/v1/ping.jsonp?callback=_AudioAddict_WP_Ping__ping&_=1377699927928
2013-08-28 18:25:27.985 00002d48 Request: api.audioaddict.com/v1/di/track_history/channel/178.jsonp?callback=_AudioAddict_TrackHistory_Channel&_=1377699927942
2013-08-28 18:25:54.080 00003360 Request: pub7.di.fm/di_progressivepsy_aac?type=.flv
So I got the stream URL and recorded it:
D:\Profiles\user\temp>wget pub7.di.fm/di_progressivepsy_aac?type=.flv
--18:26:32-- http://pub7.di.fm/di_progressivepsy_aac?type=.flv
=> `di_progressivepsy_aac#type=.flv'
Resolving pub7.di.fm... done.
Connecting to pub7.di.fm[67.221.255.50]:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [video/x-flv]
[ <=> ] 1,234,151 8.96K/s
I got a file that can be played in any media player.

Using Node to stream Video to HTML5

I've been playing around with node and websockets and built a small test app that streams audio using websockets. The server breaks apart the mp3 using createReadStream, throttles the stream using node-throttle and sends the binary data using the "ws" module.
On the client side I pick up the chunks on the websocket and use decodeAudioData (http://www.html5rocks.com/en/tutorials/webaudio/intro/) to decode and play the chunk. It all works relatively ok.
What I was curious to do next was to stream video in the same manner to the HTML5 video tag. But I can't really find any reference material on the web to achieve this in the same manner as my audio test above.
Is there a video equivalent for "decodeAudioData"?
Can I feed chunks of data into a video tag?
I've got a similar sample running that I picked up from...
https://gist.github.com/paolorossi/1993068
But this isn't really what I am looking for. First of all it doesn't really seem to be streaming to me. The client buffers it all before playing it.
Also, similar to my audio test I want the stream to be throttled on the server side so that when a new client connects they join the video at whatever point it is currently at. i.e. 30 minutes in or whatever.
Thanks

OK,
I found a solution to this after much searching.
The MediaSource API is what I was looking for...
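var mediaSource = new MediaSource();
video.src = URL.createObjectURL(mediaSource); // video: your <video> element
mediaSource.addEventListener('sourceopen', function() {
    var sourceBuffer = mediaSource.addSourceBuffer('video/webm; codecs="vorbis,vp8"');
    sourceBuffer.appendBuffer(new Uint8Array(data)); // was append() in early drafts of the API
});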
This link provided the solution...
http://html5-demos.appspot.com/static/media-source.html
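When chunks arrive continuously over a websocket, one detail matters: in current browsers the append method is called `appendBuffer`, and it throws if it is called while a previous append is still being processed. A minimal sketch of a queue that works around this is shown below. The helper name `createAppendQueue` is hypothetical; `updating`, `appendBuffer` and the "updateend" event are part of the SourceBuffer interface.

```javascript
// Hypothetical helper: feed chunks to a SourceBuffer one at a time.
function createAppendQueue(sourceBuffer) {
  const pending = [];

  function flush() {
    // appendBuffer() throws if a previous append is still in progress.
    if (sourceBuffer.updating || pending.length === 0) return;
    sourceBuffer.appendBuffer(pending.shift());
  }

  // The browser fires "updateend" when an append has finished.
  sourceBuffer.addEventListener("updateend", flush);

  // Call the returned function for every chunk received.
  return function enqueue(chunk) {
    pending.push(chunk);
    flush();
  };
}
```

On the client this would typically be wired up as `socket.onmessage = (e) => enqueue(new Uint8Array(e.data))`, where `socket` is the websocket delivering the video chunks.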

Using local file for Web Audio API in Javascript

I'm trying to get sound working on my iPhone game using the Web Audio API. The problem is that this app is entirely client side. I want to store my mp3s in a local folder (and without being user input driven) so I can't use XMLHttpRequest to read the data. I was looking into using FileSystem but Safari doesn't support it.
Is there any alternative?
Edit: Thanks for the responses below. Unfortunately the Audio API is horribly slow for games. I had this working and the latency just makes the user experience unacceptable. To clarify, what I need is something like -
var request = new XMLHttpRequest();
request.open('GET', 'file:///./../sounds/beep-1.mp3', true);
request.responseType = 'arraybuffer';
request.onload = function() {
context.decodeAudioData(request.response, function(buffer) {
dogBarkingBuffer = buffer;
}, onError);
}
request.send();
But this gives me the errors -
XMLHttpRequest cannot load file:///sounds/beep-1.mp3. Cross origin requests are only supported for HTTP.
Uncaught Error: NETWORK_ERR: XMLHttpRequest Exception 101
I understand the security risks with reading local files but surely within your own domain should be ok?

I had the same problem and I found this very simple solution.
audio_file.onchange = function(){
var files = this.files;
var file = URL.createObjectURL(files[0]);
audio_player.src = file;
audio_player.play();
};
<input id="audio_file" type="file" accept="audio/*" />
<audio id="audio_player" />
You can test here:
http://jsfiddle.net/Tv8Cm/

Ok, it's taken me two days of prototyping different solutions and I've finally figured out how I can do this without storing my resources on a server. There are a few blogs that detail this but I couldn't find the full solution in one place so I'm adding it here. This may be considered a bit hacky by seasoned programmers but it's the only way I can see this working, so if anyone has a more elegant solution I'd love to hear it.
The solution was to store my sound files as a Base64 encoded string. The sound files are relatively small (less than 30kb) so I'm hoping performance won't be too much of an issue. Note that I put 'xxx' in front of some of the hyperlinks as my n00b status means I can't post more than two links.
Step 1: create Base 64 sound font
First I need to convert my mp3 to a Base64 encoded string and store it as JSON. I found a website that does this conversion for me here - xxxhttp://www.mobilefish.com/services/base64/base64.php
You may need to remove return characters using a text editor but for anyone that needs an example I found some piano tones here - xxxhttps://raw.github.com/mudcube/MIDI.js/master/soundfont/acoustic_grand_piano-mp3.js
Note that in order to work with my example you'll need to remove the header part data:audio/mpeg;base64,
Step 2: decode sound font to ArrayBuffer
You could implement this yourself but I found an API that does this perfectly (why re-invent the wheel, right?) - https://github.com/danguer/blog-examples/blob/master/js/base64-binary.js
Resource taken from - here
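As an alternative to the linked library, the built-in `atob` function can do the decoding. The sketch below is an assumption-laden illustration rather than the code this answer used: the helper name `base64ToArrayBuffer` is hypothetical, and it also strips the data:audio/mpeg;base64, header so that step does not have to be done by hand.

```javascript
// Hypothetical helper: turn a Base64 string (optionally a full data: URI)
// into an ArrayBuffer suitable for decodeAudioData.
function base64ToArrayBuffer(base64) {
  // Drop a leading "data:audio/mpeg;base64," style header if present.
  const payload = base64.replace(/^data:[^,]*,/, "");
  const binary = atob(payload); // one character per decoded byte
  const bytes = new Uint8Array(binary.length);
  for (let i = 0; i < binary.length; i++) {
    bytes[i] = binary.charCodeAt(i);
  }
  return bytes.buffer;
}
```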
Step 3: Adding the rest of the code
Fairly straightforward
var cNote = acoustic_grand_piano.C2;
var byteArray = Base64Binary.decodeArrayBuffer(cNote);
var context = new (window.AudioContext || window.webkitAudioContext)();
context.decodeAudioData(byteArray, function(buffer) {
    var source = context.createBufferSource(); // creates a sound source
    source.buffer = buffer;
    source.connect(context.destination); // connect the source to the context's destination (the speakers)
    source.start(0); // called noteOn(0) in the old webkit-prefixed API
}, function(err) { console.log("err(decodeAudioData): " + err); });
And that's it! I have this working through my desktop version of Chrome and also running on mobile Safari (iOS 6 only of course, as Web Audio is not supported in older versions). It takes a couple of seconds to load on mobile Safari (vs. less than 1 second on desktop Chrome) but this might be due to the fact that it spends time downloading the sound fonts. It might also be the fact that iOS prevents any sound playing until a user interaction event has occurred. I need to do more work looking at how it performs.
Hope this saves someone else the grief I went through.

Because iOS apps are sandboxed, the web view (basically Safari wrapped in PhoneGap) allows you to store your mp3 file locally. I.e., there is no "cross domain" security issue.
This is as of iOS 6, as previous iOS versions didn't support the Web Audio API.

Use the HTML5 audio tag to play the audio file in the browser.
Ajax requests work over the HTTP protocol, so when you try to fetch an audio file using file://, the browser treats the request as a cross-domain request. Set the following header in the server's response (PHP shown) -
header('Access-Control-Allow-Origin: *');

Expression Encoder 4 live stream consumed by HTML 5 <video>

I'm trying to serve up a live stream (ie. completely buffered in memory, cannot access the past) and am having trouble with Expression Encoder 4.
Ideally, I'd like to just stream a bare H.264 byte stream to the client consumed by:
<video id="mainVideoWindow">
<source src='http://localhost/path/to/my/stream.mp4' type='video/mp4' />
</video>
I figured I could stream it to the client just like any other byte stream over HTTP. However, I'm having trouble figuring out the code required to do this (it's my first day with Expression Encoder, and I'm not sure how to get the raw byte stream), so I don't know whether it would work in the first place.
An alternative was to use IIS Live Streaming server:
var source = job.AddDeviceSource(device, null);
job.ActivateSource(source);
job.ApplyPreset(LivePresets.VC1IISSmoothStreaming720pWidescreen);
var format = new PushBroadcastPublishFormat();
format.PublishingPoint = new Uri("http://localhost/test.isml");
job.PublishFormats.Add(format);
job.StartEncoding();
// Let's listen for a keypress or error message to know when to stop encoding
while (Console.ReadKey(true).Key != ConsoleKey.X) ;
// Stop our encoding
Console.WriteLine("Encoding stopped.");
job.StopEncoding();
However, I'm having trouble getting the client side markup to want to display the video on Chrome and I haven't seen anything to indicate that it'd work on Chrome (though http://learn.iis.net/page.aspx/854/apple-http-live-streaming-with-iis-media-services indicates how it would work with an iOS device).
Anyone have any insights?

You are trying to consume (with your second example) a Smooth Streaming feed (HTTP adaptive streaming by Microsoft) through HTML5, which is not supported.
This could work on iOS devices if you enable Apple HTTP Live Streaming to transmux the fragments into an MPEG-2 Transport Stream. This will also generate an Apple HTTP Live Streaming manifest, which can then be referenced through the video tag.

...I saw that you have the IIS link. The Apple HTTP Live Streaming needs to be enabled on the IIS Server (IIS Media Services). This will work for iOS devices. Quicktime will get into play...
