Is HTML5 <audio> a poor choice for LIVE streaming?

As discussed in a previous question, I have built a prototype (using MVC Web API, NAudio and NAudio.Lame) that streams live, low-quality audio after converting it to MP3. The source stream is PCM: 8 kHz, 16-bit, mono, and I'm playing it through HTML5's audio tag.
On both Chrome and IE11 there is a 15-34 second delay (high latency) before audio is heard from the browser, which, I'm told, is unacceptable for our end users. Ideally the latency would be no more than 5 seconds. The delay occurs even when using the preload="none" attribute within my audio tag.
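For reference, the markup is just a plain audio element pointed at the streaming endpoint (the URL here is illustrative):

<audio controls preload="none">
    <source src="/api/live.mp3" type="audio/mpeg" />
</audio>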
Looking more closely at the issue, it appears that both browsers will not start playing audio until they have received ~32 KB of audio data. With that in mind, I can affect the delay by changing Lame's MP3 bitrate setting: a higher bitrate fills that 32 KB with fewer seconds of audio, so playback starts sooner. However, if I reduce the delay this way (by sending more data to the browser for the same length of audio), I introduce audio drop-outs later.
Examples:
If I use Lame's V0 encoding, the delay is nearly 34 seconds, which requires almost 0.5 MB of source audio.
If I use Lame's ABR_32 encoding, I can reduce the delay to 10-15 seconds, but I experience pauses and drop-outs throughout the listening session. (The rough arithmetic below shows why.)
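Back-of-the-envelope, the ~32 KB prebuffer maps directly onto start-up latency for a given encoded bitrate, and the estimates line up with both observations:

// Rough estimate: seconds of encoded audio needed to fill the
// browser's ~32 KB prebuffer at a given bitrate.
function prebufferLatencySeconds(bitrateKbps) {
    var prebufferBytes = 32 * 1024; // observed threshold
    return (prebufferBytes * 8) / (bitrateKbps * 1000);
}
// prebufferLatencySeconds(32) -> ~8.2 s  (ABR_32, before network jitter)
// prebufferLatencySeconds(8)  -> ~32.8 s (roughly what V0 averages on this 8K mono source)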
Questions:
Any ideas how I can minimize the start-up delay (latency)?
Should I continue investigating various Lame 'presets' in hopes of picking the "right" one?
Could it be that MP3 is not the best format for live streaming?
Would switching to Ogg/Vorbis (or Ogg/OPUS) help?
Do we need to abandon HTML5's audio tag and use Flash or a Java applet?
Thanks.

You cannot reduce the delay, since you have no control over the browser's code or its buffer sizing. The HTML5 specification does not enforce any constraint here, so I don't see any reason why this would improve.
You can, however, implement a solution with the Web Audio API (it's quite simple), where you handle the streaming yourself.
If you can split your MP3 into fixed-size chunks (so that each chunk's size is known beforehand, or at least at receive time), then you can have live streaming in 20 lines of code. The chunk size determines your latency.
The key is to use AudioContext::decodeAudioData.
// Fix up prefixing
window.AudioContext = window.AudioContext || window.webkitAudioContext;
var context = new AudioContext();

var offset = 0;
var byteOffset = 0;
var minDecodeSize = 16384; // This is your chunk size

function onError(e) {
    console.error('decodeAudioData failed', e);
}

var request = new XMLHttpRequest();
request.onprogress = function(evt) {
    if (!request.response) return;
    var ab;
    if (request.response instanceof ArrayBuffer) {
        // In Firefox ('moz-chunked-arraybuffer'), each chunk arrives as its own ArrayBuffer
        ab = request.response;
    } else {
        // In Chrome, XHR stream mode gives text, so copy the new bytes into an ArrayBuffer
        var size = request.response.length - byteOffset;
        if (size < minDecodeSize) return;
        ab = new ArrayBuffer(size);
        var buf = new Uint8Array(ab);
        for (var i = 0; i < size; i++)
            buf[i] = request.response.charCodeAt(i + byteOffset) & 0xff;
        byteOffset = request.response.length;
    }
    context.decodeAudioData(ab, function(buffer) {
        playSound(buffer);
    }, onError);
};
request.open('GET', url, true); // url is your streaming endpoint
request.responseType = expectedType; // 'stream' in Chrome, 'moz-chunked-arraybuffer' in Firefox, 'ms-stream' in IE
request.overrideMimeType('text/plain; charset=x-user-defined');
request.send(null);

function playSound(buffer) {
    var source = context.createBufferSource(); // creates a sound source
    source.buffer = buffer;                    // tell the source which sound to play
    source.connect(context.destination);       // connect the source to the speakers
    source.start(offset);                      // schedule this chunk right after the previous one
    // note: on older systems, you may have to use the deprecated noteOn(time)
    offset += buffer.duration;
}
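Note that this relies on each chunk being independently decodable by decodeAudioData, so you'll want the server to cut the MP3 stream on frame boundaries. The accumulated offset then schedules the decoded buffers back to back, and the chunk size is the knob that trades latency against decode overhead.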

Related

Audio distortion occurs when using AudioWorkletProcessor with a MediaStream source and connecting a Bluetooth device while it is already running

In our project, we use AudioContext to wire up input from a microphone to an AudioWorkletProcessor and out to a MediaStream. Ultimately, this is sent to other peers in a WebRTC call.
If someone loads the page, the audio always sounds fine. But if they connect with a hard-wired microphone (like a laptop mic or webcam) and then connect a Bluetooth device (such as AirPods or headphones), the audio becomes distorted and robotic-sounding.
If we tear out all the other code and simplify it, we still have the issue.
bypassProcessor.js
// Basic processor that wires input to output without transforming the data
// https://github.com/GoogleChromeLabs/web-audio-samples/blob/main/audio-worklet/basic/hello-audio-worklet/bypass-processor.js
class BypassProcessor extends AudioWorkletProcessor {
    process(inputs, outputs) {
        const input = inputs[0];
        const output = outputs[0];
        for (let channel = 0; channel < output.length; ++channel) {
            output[channel].set(input[channel]);
        }
        return true;
    }
}
registerProcessor('bypass-processor', BypassProcessor);
main.js
const microphoneStream = await navigator.mediaDevices.getUserMedia({
    audio: true, // have also tried { channelCount: 1 } and { channelCount: { exact: 1 } }
    video: false
})
const audioCtx = new AudioContext()
const inputNode = audioCtx.createMediaStreamSource(microphoneStream)

await audioCtx.audioWorklet.addModule('worklet/bypassProcessor.js')
const processorNode = new AudioWorkletNode(audioCtx, 'bypass-processor')

inputNode.connect(processorNode).connect(audioCtx.destination)
Interestingly, I have found that if you comment out the two audio worklet lines and instead create a simple gain node, it works fine.
// await audioCtx.audioWorklet.addModule('worklet/bypassProcessor.js')
// const processorNode = new AudioWorkletNode(audioCtx, 'bypass-processor')
const gainNode = audioCtx.createGain()
inputNode.connect(gainNode).connect(audioCtx.destination) // wired in place of the worklet node
Even if you simply create the AudioWorkletNode and never connect it to the others, the issue still reproduces.
I've created a small React app here that reproduces the problem: https://github.com/JacobMuchow/audio_distortion_repro/tree/master
I've tried some options, such as detecting when this happens via the 'ondevicechange' event, closing the old AudioContext & nodes and recreating everything, but this only works some of the time. If I wait for a while and then recreate it again, it works, so I'm worried about some type of garbage collection issue with the processor when attempting this, but that might be beside the point.
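Roughly, that workaround looked like this (a sketch; setupAudio() stands in for the getUserMedia/worklet wiring from main.js above):

// Tear down and rebuild the audio graph whenever the device set changes
navigator.mediaDevices.ondevicechange = async () => {
    await audioCtx.close()        // release the old context and its nodes
    audioCtx = await setupAudio() // re-run the wiring above; returns the new context
    console.log('recreated at', audioCtx.sampleRate, 'Hz')
}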
I suspect this has something to do with sample rates... when the AudioContext is correctly recreated, it switches from 48 kHz to 16 kHz and then it sounds fine. But sometimes it is recreated still at 48 kHz and continues to sound robotic.
Threads on the internet concerning this are incredibly sparse and I'm hoping someone has specific experience with this issue or this API and can point out what I need to do differently.
For Chrome, the problem is very likely https://crbug.com/1090441, which was recently fixed. I think Firefox doesn't have this problem, but I didn't check.

Chrome produces no audio after reaching 50 audio output streams

During my testing, I have found that reaching 50 audio output streams (as displayed in the chrome://media-internals/ Audio tab) on a single tab causes the audio output to disappear. Does Chrome have a set maximum limit of audio output streams allowed per tab? If so, is there some workaround? The Chrome version I am using is 87.0.4280.141.
Whenever we mute/unmute the audio (second function below) or adjust the mic volume (first function below), we create a new audio context. Could too many AudioContext instances have caused the issue?
private setLocalStreamVolume(stream: MediaStream | undefined) {
    const context = new AudioContext()
    const destination = context.createMediaStreamDestination()
    const gainNode = context.createGain()
    if (stream) {
        for (const track of stream.getTracks()) {
            const sourceStream = context.createMediaStreamSource(new MediaStream([track]));
            sourceStream.connect(gainNode)
            gainNode.connect(destination)
            gainNode.gain.value = this._micVolume
        }
    }
    return destination.stream
}

export function mixStreams(streams: Iterable<(MediaStream | undefined)>) {
    const context = new AudioContext()
    const mixedOutput = context.createMediaStreamDestination()
    for (const stream of streams)
        if (stream)
            for (const track of stream.getTracks()) {
                const sourceStream = context.createMediaStreamSource(new MediaStream([track]));
                sourceStream.connect(mixedOutput);
            }
    return mixedOutput.stream.getTracks()[0]
}
Could too many AudioContext interactions have caused the issue?
Too many AudioContext instances certainly will. In fact, on some systems you can only use a single AudioContext.
I'm not sure what your specific use case is, but you probably only need one AudioContext. All your MediaStreamSourceNodes can live in the same context.
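As a sketch (reusing the names from the snippet above, with the member field turned into a parameter so it's self-contained), the refactor is just hoisting one shared context:

// One shared context for the whole page; all source/gain/destination
// nodes hang off of it instead of a fresh AudioContext per call.
const sharedContext = new AudioContext()

function setLocalStreamVolume(stream: MediaStream | undefined, micVolume: number) {
    const destination = sharedContext.createMediaStreamDestination()
    const gainNode = sharedContext.createGain()
    gainNode.gain.value = micVolume
    gainNode.connect(destination)
    if (stream) {
        for (const track of stream.getTracks()) {
            sharedContext.createMediaStreamSource(new MediaStream([track])).connect(gainNode)
        }
    }
    return destination.stream
}

mixStreams can reuse the same sharedContext in exactly the same way.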

Play stream from gstreamer in browser

I want to play a stream from gstreamer in a web browser.
I played around with RTP, WebRTC and SDP files but, while VLC was able to connect to the stream from a simple SDP file, browsers were not. I later understood that WebRTC requires a secure connection, which only complicates things and is not needed for my purposes. I stumbled upon HTML5's Media Source Extensions (MSE), which seem like they could help, but I'm not able to find a comprehensive tutorial or appropriate spec on how to get gstreamer to stream the correct data, or on how to play it using MSE. I'm also not sure about the latency when using MSE.
So, is there a way to play a stream from gstreamer in a browser?
Thanks.
Using the node-webrtc project, I was able to combine output from gstreamer with a WebRTC call. For gstreamer, there is a project which enables its use with node: gstreamer-superficial. So basically, you need to run the gstreamer process from the node process, which can then control gstreamer's output. On every gstreamer frame, a callback is invoked which takes the frame and can send it to the WebRTC calls.
Then a WebRTC call needs to be implemented; some signaling protocol is required for the call. One side of the call will be the server and the other will be the client's browser, instead of two browsers. Then a video track is created, into which the frames from gstreamer-superficial are pushed.
const { RTCVideoSource } = require("wrtc").nonstandard;
const gstreamer = require("gstreamer-superficial");

const source = new RTCVideoSource();
// This is WebRTC video track which should be used with addTransceiver see below
const track = source.createTrack();

const frame = {
    width: 1920,
    height: 1080,
    data: null
};

const pipeline = new gstreamer.Pipeline("v4l2src ! videorate ! video/x-raw,format=YUY2,width=1920,height=1080,framerate=25/1 ! videoconvert ! video/x-raw,format=I420 ! appsink name=sink");
const appsink = pipeline.findChild("sink");

const pull = function() {
    appsink.pull(function(buf, caps) {
        if (buf) {
            frame.data = new Uint8Array(buf);
            try {
                source.onFrame(frame);
            } catch (e) {}
            pull();
        } else if (!caps) {
            console.log("PULL DROPPED");
            setTimeout(pull, 500);
        }
    });
};

pipeline.play();
pull();

// Example:
const useTrack = SomeRTCPeerConnection => SomeRTCPeerConnection.addTransceiver(track, { direction: "sendonly" });

Web Audio API: How to load another audio file?

I want to write a basic script with the HTML5 Web Audio API that can play some audio files. But I don't know how to unload a playing audio file and load another one. In my script, two audio files play at the same time, which is not what I wanted.
Here is my code:
var context,
    soundSource,
    soundBuffer;

// Step 1 - Initialise the Audio Context
context = new webkitAudioContext();

// Step 2: Load our Sound using XHR
function playSound(url) {
    // Note: this loads asynchronously
    var request = new XMLHttpRequest();
    request.open("GET", url, true);
    request.responseType = "arraybuffer";
    // Our asynchronous callback
    request.onload = function() {
        var audioData = request.response;
        audioGraph(audioData);
    };
    request.send();
}

// This is the code we are interested in
function audioGraph(audioData) {
    // create a sound source
    soundSource = context.createBufferSource();
    // The Audio Context handles creating source buffers from raw binary
    soundBuffer = context.createBuffer(audioData, true /* make mono */);
    // Add the buffered data to our object
    soundSource.buffer = soundBuffer;
    // Plug the cable from one thing to the other
    soundSource.connect(context.destination);
    // Finally
    soundSource.noteOn(context.currentTime);
}

// Stop all of the sounds
function stopSounds() {
    // How can I do this?
}

// Events for audio buttons
document.querySelector('.pre').addEventListener('click',
    function() {
        stopSounds();
        playSound('http://thelab.thingsinjars.com/web-audio-tutorial/hello.mp3');
    }
);
document.querySelector('.next').addEventListener('click',
    function() {
        stopSounds();
        playSound('http://thelab.thingsinjars.com/web-audio-tutorial/nokia.mp3');
    }
);
You should be pre-loading your sounds into buffers once, at launch, and simply creating a fresh AudioBufferSourceNode whenever you want to play one back.
To play multiple sounds in sequence, you need to schedule them using noteOn(time), one after the other, based on the buffers' respective lengths.
To stop a sound, use noteOff.
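For example, a minimal sketch of that pattern, using the same prefixed API as your code (the decode step mirrors your audioGraph):

var buffers = {};          // url -> decoded buffer, filled once at launch
var currentSource = null;  // the node currently playing, if any

function preload(url) {
    var request = new XMLHttpRequest();
    request.open("GET", url, true);
    request.responseType = "arraybuffer";
    request.onload = function() {
        buffers[url] = context.createBuffer(request.response, true /* make mono */);
    };
    request.send();
}

function play(url) {
    stopSounds();
    currentSource = context.createBufferSource(); // source nodes are one-shot, so make a new one
    currentSource.buffer = buffers[url];
    currentSource.connect(context.destination);
    currentSource.noteOn(context.currentTime);
}

function stopSounds() {
    if (currentSource) {
        currentSource.noteOff(0); // stop immediately
        currentSource = null;
    }
}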
Sounds like you are missing some fundamental web audio concepts. This (and more) is described in detail and shown with samples in this HTML5Rocks tutorial and the FAQ.

Akamai HDCore + live stream = occasional blips of black

I noticed I'd only get the "blips of black" (maybe 300 ms of all black) whenever the stream quality changes (due to the DSS throttle).
I thought maybe there was not enough buffer, but the stream change takes about 7 s (according to the HDCore debug messages) and the bufferTime, according to the associated NetStream, is set to 10 seconds by default.
Perhaps there's a better way to set up the buffer in HDCore? This worked fine with OSMF, but OSMF doesn't support HTTP DSS.
Using: Flash Player 10.2 and Akamai HDCore 2.1.20
Embed Code:
<script type="text/javascript">
    /*var str = '?';
    for (var b in flashVars) str += b + '=' + flashVars[b] + '&';
    alert(str);*/
    var params = {
        allowFullScreen: "true",
        wmode: "window",
        bgcolor: "#000000"
    };
    swfobject.embedSWF(WEBCAST_SWF_URL, "flashContent", "512", "288", "10.2.0", "/flash/expressinstall.swf?", null, params);
</script>
I noticed that running locally and hitting the SWF directly both worked fine.
So I changed the wrapper in the HTML and that fixed the "blip": I switched from swfobject to the native non-swfobject wrapper (AC_OETags.js) and everything worked.
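Roughly, the replacement embed looked like this (a sketch; AC_FL_RunContent takes flat name/value pairs, and the values here just mirror the swfobject parameters above):

<script type="text/javascript" src="AC_OETags.js"></script>
<script type="text/javascript">
    // Same player, embedded via the Adobe AC_OETags helper instead of swfobject.
    // Note: depending on the AC_OETags version, ".swf" may be appended to "src" automatically.
    AC_FL_RunContent(
        "src", WEBCAST_SWF_URL,
        "width", "512",
        "height", "288",
        "id", "flashContent",
        "allowFullScreen", "true",
        "wmode", "window",
        "bgcolor", "#000000"
    );
</script>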
Happy streaming.