During my testing, I have found that reaching 50 audio output streams (as displayed in the chrome://media-internals/ Audio tab) on a single tab causes the audio output to disappear. Does Chrome have a set maximum number of audio output streams allowed per tab? If so, is there a workaround? The Chrome version I am using is 87.0.4280.141.
Whenever we mute/unmute the audio (second function below) or adjust the mic volume (first function below), we create a new AudioContext. Could too many AudioContext instances be causing the issue?
private setLocalStreamVolume(stream: MediaStream | undefined) {
  const context = new AudioContext()
  const destination = context.createMediaStreamDestination()
  const gainNode = context.createGain()
  if (stream) {
    for (const track of stream.getTracks()) {
      const sourceStream = context.createMediaStreamSource(new MediaStream([track]))
      sourceStream.connect(gainNode)
      gainNode.connect(destination)
      gainNode.gain.value = this._micVolume
    }
  }
  return destination.stream
}
export function mixStreams(streams: Iterable<MediaStream | undefined>) {
  const context = new AudioContext()
  const mixedOutput = context.createMediaStreamDestination()
  for (const stream of streams) {
    if (stream) {
      for (const track of stream.getTracks()) {
        const sourceStream = context.createMediaStreamSource(new MediaStream([track]))
        sourceStream.connect(mixedOutput)
      }
    }
  }
  return mixedOutput.stream.getTracks()[0]
}
Could creating this many AudioContext instances be causing the issue?
Too many AudioContext instances certainly will. In fact, on some systems you can only use a single AudioContext.
I'm not sure what your specific use case is, but you probably only need one AudioContext. All of your MediaStreamAudioSourceNodes can live in the same context.
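For example, both functions above could reuse one module-level AudioContext instead of constructing a new one per call. A minimal sketch, assuming a lazily created shared context (the helper name and the explicit micVolume parameter are illustrative, not from your code):
let sharedContext: AudioContext | undefined

function getSharedAudioContext(): AudioContext {
  // create the context once and hand the same instance to every caller
  if (!sharedContext) {
    sharedContext = new AudioContext()
  }
  return sharedContext
}

function setLocalStreamVolume(stream: MediaStream | undefined, micVolume: number) {
  const context = getSharedAudioContext()
  const destination = context.createMediaStreamDestination()
  const gainNode = context.createGain()
  gainNode.gain.value = micVolume
  gainNode.connect(destination)
  if (stream) {
    for (const track of stream.getTracks()) {
      // every source node lives in the one shared context
      context.createMediaStreamSource(new MediaStream([track])).connect(gainNode)
    }
  }
  return destination.stream
}
mixStreams could call the same helper, so the tab never holds more than one AudioContext no matter how often the mute or volume code runs.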
In our project, we use AudioContext to wire up input from a microphone to an AudioWorkletProcessor and out to a MediaStream. Ultimately, this is sent to other peers in a WebRTC call.
If someone loads the page, the audio always sounds fine. But if they connect with a hard-wired microphone like a laptop mic or webcam and then connect a Bluetooth device (such as AirPods or headphones), the audio becomes distorted and robotic-sounding.
If we tear out all the other code and simplify it, we still have the issue.
bypassProcessor.js
// Basic processor that wires input to output without transforming the data
// https://github.com/GoogleChromeLabs/web-audio-samples/blob/main/audio-worklet/basic/hello-audio-worklet/bypass-processor.js
class BypassProcessor extends AudioWorkletProcessor {
  process(inputs, outputs) {
    const input = inputs[0];
    const output = outputs[0];
    for (let channel = 0; channel < output.length; ++channel) {
      output[channel].set(input[channel]);
    }
    return true;
  }
}

registerProcessor('bypass-processor', BypassProcessor);
main.js
const microphoneStream = await navigator.mediaDevices.getUserMedia({
  audio: true, // have also tried { channelCount: 1 } and { channelCount: { exact: 1 } }
  video: false
})
const audioCtx = new AudioContext()
const inputNode = audioCtx.createMediaStreamSource(microphoneStream)
await audioCtx.audioWorklet.addModule('worklet/bypassProcessor.js')
const processorNode = new AudioWorkletNode(audioCtx, 'bypass-processor')
inputNode.connect(processorNode).connect(audioCtx.destination)
Interestingly, I have found that if you comment out the two audio worklet lines and instead create a simple gain node, it works fine.
// await audioCtx.audioWorklet.addModule('worklet/bypassProcessor.js')
// const processorNode = new AudioWorkletNode(audioCtx, 'bypass-processor')
const gainNode = audioCtx.createGain()
Also, simply creating the AudioWorkletNode without even connecting it to the other nodes reproduces the issue.
I've created a small React app here that reproduces the problem: https://github.com/JacobMuchow/audio_distortion_repro/tree/master
I've tried some options, such as detecting when this happens with the 'ondevicechange' event, then closing the old AudioContext and nodes and recreating everything, but this only works some of the time. If I wait a while and then recreate it, it works, so I'm worried about some kind of garbage-collection issue with the processor, but that might be beside the point.
I suspect this has something to do with sample rates: when the AudioContext is correctly recreated, it switches from 48 kHz to 16 kHz and then it sounds fine. But sometimes it is recreated still at 48 kHz and it continues to sound robotic.
Threads on the internet concerning this are incredibly sparse and I'm hoping someone has specific experience with this issue or this API and can point out what I need to do differently.
For Chrome, the problem is very likely https://crbug.com/1090441, which was recently fixed. I think Firefox doesn't have this problem, but I didn't check.
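If you have to support Chrome builds from before that fix, one workaround is to tear down and rebuild the graph when the default device changes. A rough sketch, assuming a setupAudioGraph() helper (not from the original code) that builds the worklet chain and returns its AudioContext:
let audioCtx = await setupAudioGraph()

navigator.mediaDevices.addEventListener('devicechange', async () => {
  // close the old context so the new default device can pick its own sample rate
  await audioCtx.close()
  audioCtx = await setupAudioGraph()
})
This mirrors the recreate-on-devicechange approach described in the question; whether it is needed at all depends on the Chrome version.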
I am using javascript to parse an SWF file and display the contents in an HTML5 canvas.
I am having an issue with playing back the audio data from the SWF's audio stream tags. The audio is split up per frame and I am able to get the audio data into an array of base64 data, in the same order as the frames. Creating/destroying audio elements on each frame does not seem like the best way to go about it, but it is the only way I can think of. Is there a better way?
Note: There are rewind/fastforward/pause buttons in the swf file as well, so the audio will need to align with the frames when they are sent back, so I don't believe I can just create one long audio file from the smaller bits of data.
You will want to load these audio files as AudioBuffers and play them through the Web Audio API.
What you currently have are data-URLs, each of which represents a full audio file (with metadata).
Loading all of these in Audio elements may indeed not be a good idea: for a start, some browsers may not let you do so, and HTMLMediaElements are not meant for precise timing.
So you will need to first fetch all these data-URLs to get back their actual binary content in ArrayBuffers, then you'll be able to extract the raw PCM audio data from these audio files.
// would be the same with data-URLs
const urls = [
  "kbgd2jm7ezk3u3x/hihat.mp3",
  "h2j6vm17r07jf03/snare.mp3",
  "1cdwpm3gca9mlo0/kick.mp3",
  "h8pvqqol3ovyle8/tom.mp3"
].map( (path) => 'https://dl.dropboxusercontent.com/s/' + path );

const audio_ctx = new AudioContext();

Promise.all( urls.map( toAudioBuffer ) )
  .then( activateBtn )
  .catch( console.error );

async function toAudioBuffer( url ) {
  const resp = await fetch( url );
  const arr_buffer = await resp.arrayBuffer();
  return audio_ctx.decodeAudioData( arr_buffer );
}

function activateBtn( audio_buffers ) {
  const btn = document.getElementById( 'btn' );
  btn.onclick = playInSequence;
  btn.disabled = false;

  // simply play one after the other
  // you could add your own logic of course
  async function playInSequence() {
    await audio_ctx.resume(); // to make noise we need to be allowed by a user gesture
    let current = 0;
    while( current < audio_buffers.length ) {
      // create a bufferSourceNode; no worries, it weighs nothing
      const source = audio_ctx.createBufferSource();
      source.buffer = audio_buffers[ current ];
      // so it makes noise
      source.connect( audio_ctx.destination );
      // [optional] promisify
      const will_stop = new Promise( (res) => source.onended = res );
      source.start(0); // start playing now
      await will_stop;
      current++;
    }
  }
}

<button id="btn" disabled>play all in sequence</button>
I ended up making an array in javascript indexed by sound id, which corresponds to the frame id; each entry holds an audio element created as I parse the tags. The elements are not added to the DOM and are created up front, so they persist for the life of the frame handler (they are stored in a sounds array inside the object) and there is no create/destroy cost.
This way, when I play the frames (the visuals), I can call play on the audio element corresponding to the active frame. As the frames control which audio is played, the rewind/fast-forward functionality is retained.
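A minimal sketch of that approach, assuming the stream audio is MP3 and using illustrative names (soundsByFrame, registerFrameSound, playFrame) that are not from the original code:
// Built once while parsing the tags: one persistent audio element per sound,
// indexed by the frame id it belongs to. Nothing is added to the DOM.
const soundsByFrame = {};

function registerFrameSound(frameId, base64Data) {
  const audio = new Audio('data:audio/mpeg;base64,' + base64Data);
  audio.preload = 'auto';
  soundsByFrame[frameId] = audio; // kept for the life of the frame handler, never re-created
}

// Called by the frame handler; rewind/fast-forward just change which frame is active.
function playFrame(frameId) {
  const audio = soundsByFrame[frameId];
  if (audio) {
    audio.currentTime = 0;
    audio.play();
  }
}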
I am currently looking for the best way to store incoming WebRTC video streams. I am joining the video call using WebRTC (via Chrome) and I would like to record every incoming video stream from each participant in the browser.
The solutions I am researching are:
Intercepting the network packets coming to the browser, e.g. using Wireshark, and then decoding them, following this article: https://webrtchacks.com/video_replay/
Modifying a browser to store the recording as a file, e.g. by modifying Chromium itself
Screen recorders or solutions like xvfb & ffmpeg are not an option due to resource constraints. Is there any other way that would let me capture packets or encoded video as a file? The solution must work on Linux.
If the media stream is what you want, one method is to override the browser's RTCPeerConnection. Here is an example:
In an extension manifest add the following content script:
content_scripts": [
{
"matches": ["http://*/*", "https://*/*"],
"js": ["payload/inject.js"],
"all_frames": true,
"match_about_blank": true,
"run_at": "document_start"
}
]
inject.js
var inject = '(' + function() {
  // override the browser's default RTCPeerConnection.
  var origPeerConnection = window.RTCPeerConnection || window.webkitRTCPeerConnection || window.mozRTCPeerConnection;
  // make sure it is supported
  if (origPeerConnection) {
    // our own RTCPeerConnection
    var newPeerConnection = function(config, constraints) {
      console.log('PeerConnection created with config', config);
      // proxy the original peer connection
      var pc = new origPeerConnection(config, constraints);
      // store the old addStream
      var oldAddStream = pc.addStream;
      // addStream is called when a local stream is added.
      // arguments[0] is a local media stream
      pc.addStream = function() {
        console.log("our addStream called!");
        // our MediaStream object
        console.dir(arguments[0]);
        return oldAddStream.apply(this, arguments);
      };
      // ontrack is called when a remote track is added.
      // the media stream(s) are located in event.streams
      pc.ontrack = function(event) {
        console.log("ontrack got a track");
        console.dir(event);
      };
      window.ourPC = pc;
      return pc;
    };
    ['RTCPeerConnection', 'webkitRTCPeerConnection', 'mozRTCPeerConnection'].forEach(function(obj) {
      // Override objects if they exist in the window object
      if (window.hasOwnProperty(obj)) {
        window[obj] = newPeerConnection;
        // Copy the static methods
        Object.keys(origPeerConnection).forEach(function(x) {
          window[obj][x] = origPeerConnection[x];
        });
        window[obj].prototype = origPeerConnection.prototype;
      }
    });
  }
} + ')();';

var script = document.createElement('script');
script.textContent = inject;
(document.head || document.documentElement).appendChild(script);
script.parentNode.removeChild(script);
I tested this with a voice call in Google Hangouts and saw that two MediaStreams were added via pc.addStream and one track was added via pc.ontrack. addStream would seem to be for local media streams, and the event object in ontrack is an RTCTrackEvent, which has a streams object; I assume those are what you are looking for.
To access these streams from your extension's content script, you will need to create audio elements and set their "srcObject" property to the media stream, e.g.
pc.ontrack = function(event) {
  // check if our element exists
  var elm = document.getElementById("remoteStream");
  if (elm == null) {
    // create an audio element
    elm = document.createElement("audio");
    elm.id = "remoteStream";
  }
  // set the srcObject to our stream. not sure if you need to clone it
  elm.srcObject = event.streams[0].clone();
  // write the element to the body
  document.body.appendChild(elm);
  // fire a custom event so our content script knows the stream is available.
  // you could pass the id in the "detail" object, for example:
  // new CustomEvent("remoteStreamAdded", {"detail": {"id": "audio_element_id"}})
  // then access it via e.detail.id in your event listener.
  var e = new CustomEvent("remoteStreamAdded");
  window.dispatchEvent(e);
}
Then in your content script you can listen for that event and access the MediaStream like so:
window.addEventListener("remoteStreamAdded", function(e) {
  var elm = document.getElementById("remoteStream");
  var stream = elm.captureStream();
});
With the captured stream available to your content script you can do pretty much anything you want with it. For example, MediaRecorder works really well for recording the stream(s) (see the sketch below), or you could use something like peer.js or maybe binary.js to stream it to another source.
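A minimal MediaRecorder sketch under those assumptions (the chunk handling and download step are illustrative, not part of the original answer):
// Record a captured stream to a WebM blob and offer it as a download.
function recordStream(stream) {
  var recorder = new MediaRecorder(stream);
  var chunks = [];
  recorder.ondataavailable = function(e) {
    if (e.data.size > 0) chunks.push(e.data);
  };
  recorder.onstop = function() {
    var blob = new Blob(chunks, { type: recorder.mimeType });
    var a = document.createElement('a');
    a.href = URL.createObjectURL(blob);
    a.download = 'remote-stream.webm';
    a.click();
  };
  recorder.start(1000); // collect data in 1-second chunks
  return recorder;      // call .stop() on it when the call ends
}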
I haven't tested this, but it should also be possible to override the local streams. For example, in inject.js you could establish some blank MediaStream, override navigator.mediaDevices.getUserMedia, and instead of returning the local MediaStream return your own (see the sketch below).
This method should work in Firefox, and maybe others as well, assuming you use an extension/app to load the inject.js script at the start of the document. Having it loaded before any of the target's libs is key to making this work.
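A rough, untested sketch of that getUserMedia override (the replacement stream and names are illustrative):
// Inside inject.js: hand callers a stream of our choosing instead of the real device.
var origGetUserMedia = navigator.mediaDevices.getUserMedia.bind(navigator.mediaDevices);
var ourStream = new MediaStream(); // e.g. a blank or pre-built stream

navigator.mediaDevices.getUserMedia = function(constraints) {
  console.log('getUserMedia intercepted with constraints', constraints);
  return Promise.resolve(ourStream);
  // or fall back to the real devices: return origGetUserMedia(constraints);
};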
Capturing packets will only give you the network packets, which you would then need to turn into frames and put into a container. A server such as Janus can record videos.
Running headless Chrome and using the JavaScript MediaRecorder API is another option, but it is much heavier on resources.
Using the following code I get all zeroes in the audio stream from my microphone (using Chrome):
navigator.mediaDevices.getUserMedia({ audio: true }).then(
  function(stream) {
    var audioContext = new AudioContext();
    var source = audioContext.createMediaStreamSource(stream);
    var node = audioContext.createScriptProcessor(8192, 1, 1);
    source.connect(node);
    node.connect(audioContext.destination);
    node.onaudioprocess = function(e) {
      console.log("Audio:", e.inputBuffer.getChannelData(0));
    };
  }).catch(function(error) { console.error(error); })
I created a jsfiddle here: https://jsfiddle.net/g3dck4dr/
What's wrong here?
Umm, something in your hardware config is wrong? The fiddle works fine for me (that is, it shows non-zero values). Do other web audio input tests work, like https://webaudiodemos.appspot.com/input/index.html?
Check that you've selected the right input and that you don't have a hardware mute switch turned on.
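As a quick check, you can also list the available audio inputs and request one explicitly by deviceId instead of relying on the default. A small sketch (the device-picking logic is illustrative):
// List audio inputs and request a specific one instead of the default.
navigator.mediaDevices.enumerateDevices().then(function(devices) {
  var inputs = devices.filter(function(d) { return d.kind === 'audioinput'; });
  console.log('Available audio inputs:', inputs.map(function(d) { return d.label; }));
  return navigator.mediaDevices.getUserMedia({
    audio: { deviceId: { exact: inputs[0].deviceId } }
  });
}).then(function(stream) {
  console.log('Got stream from:', stream.getAudioTracks()[0].label);
});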
As discussed in a previous question, I have built a prototype (using MVC Web API, NAudio and NAudio.Lame) that streams live, low-quality audio after converting it to MP3. The source stream is PCM: 8 kHz, 16-bit, mono, and I'm making use of HTML5's audio tag.
On both Chrome and IE11 there is a 15-34 second delay (high latency) before audio is heard from the browser, which, I'm told, is unacceptable for our end users. Ideally the latency would be no more than 5 seconds. The delay occurs even when using the preload="none" attribute within my audio tag.
Looking more closely at the issue, it appears as though both browsers will not start playing audio until they have received ~32K of audio data. With that in mind, I can affect the delay by changing Lame's MP3 'bitrate' setting. However, if I reduce the delay (by sending more data to the browser for the same length of audio), I will introduce audio drop-outs later.
Examples:
If I use Lame's V0 encoding the delay is nearly 34 seconds which requires almost 0.5 MB of source audio.
If I use Lame's ABR_32 encoding, I can reduce the delay to 10-15 seconds but I will experience pauses and drop-outs throughout the listening session.
Questions:
Any ideas how I can minimize the start-up delay (latency)?
Should I continue investigating various Lame 'presets' in hopes of picking the "right" one?
Could it be that MP3 is not the best format for live streaming?
Would switching to Ogg/Vorbis (or Ogg/OPUS) help?
Do we need to abandon HTML5's audio tag and use Flash or a Java applet?
Thanks.
You cannot reduce the delay, since you have no control over the browser's code or buffering size. The HTML5 specification does not enforce any constraint, so I don't see any reason why it would improve.
You can, however, implement a solution with the Web Audio API (it's quite simple), where you handle the streaming yourself.
If you can split your MP3 into fixed-size chunks (so that each MP3 chunk's size is known beforehand, or at least at receive time), then you can have live streaming in 20 lines of code. The chunk size will determine your latency.
The key is to use AudioContext::decodeAudioData.
// Fix up prefixing
window.AudioContext = window.AudioContext || window.webkitAudioContext;
var context = new AudioContext();

var offset = 0;
var byteOffset = 0;
var minDecodeSize = 16384; // This is your chunk size

var request = new XMLHttpRequest();
request.onprogress = function(evt)
{
  if (request.response)
  {
    var size = request.response.length - byteOffset;
    if (size < minDecodeSize) return;
    // In Chrome, XHR stream mode gives text, not ArrayBuffer.
    // In Firefox, you can get an ArrayBuffer as is
    var ab, buf;
    if (request.response instanceof ArrayBuffer)
    {
      ab = request.response;
      buf = new Uint8Array(ab);
    }
    else
    {
      ab = new ArrayBuffer(size);
      buf = new Uint8Array(ab);
      for (var i = 0; i < size; i++)
        buf[i] = request.response.charCodeAt(i + byteOffset) & 0xff;
    }
    byteOffset = request.response.length;
    context.decodeAudioData(ab, function(buffer) {
      playSound(buffer);
    }, onError);
  }
};
request.open('GET', url, true);
request.responseType = expectedType; // 'stream' in Chrome, 'moz-chunked-arraybuffer' in Firefox, 'ms-stream' in IE
request.overrideMimeType('text/plain; charset=x-user-defined');
request.send(null);
function playSound(buffer) {
  var source = context.createBufferSource(); // creates a sound source
  source.buffer = buffer;                    // tell the source which sound to play
  source.connect(context.destination);       // connect the source to the context's destination (the speakers)
  source.start(offset);                      // schedule the source to play at the running offset
  // note: on older systems, you may have to use the deprecated noteOn(time)
  offset += buffer.duration;
}