Audio distortion occurs when using AudioWorkletProcessor with a MediaStream source and connecting a Bluetooth device while it is already running - html5-audio

In our project, we use an AudioContext to wire microphone input through an AudioWorkletProcessor and out to a MediaStream, which is ultimately sent to other peers in a WebRTC call.
If someone loads the page, the audio always sounds fine at first. But if they start with a hard-wired microphone (like a laptop mic or webcam) and then connect a Bluetooth device (such as AirPods or headphones), the audio becomes distorted and robotic-sounding.
If we tear out all the other code and simplify it, we still have the issue.
bypassProcessor.js
// Basic processor that wires input to output without transforming the data
// https://github.com/GoogleChromeLabs/web-audio-samples/blob/main/audio-worklet/basic/hello-audio-worklet/bypass-processor.js
class BypassProcessor extends AudioWorkletProcessor {
  process(inputs, outputs) {
    const input = inputs[0];
    const output = outputs[0];
    for (let channel = 0; channel < output.length; ++channel) {
      output[channel].set(input[channel]);
    }
    return true;
  }
}
registerProcessor('bypass-processor', BypassProcessor);
main.js
const microphoneStream = await navigator.mediaDevices.getUserMedia({
  audio: true, // have also tried { channelCount: 1 } and { channelCount: { exact: 1 } }
  video: false
})
const audioCtx = new AudioContext()
const inputNode = audioCtx.createMediaStreamSource(microphoneStream)

await audioCtx.audioWorklet.addModule('worklet/bypassProcessor.js')
const processorNode = new AudioWorkletNode(audioCtx, 'bypass-processor')

inputNode.connect(processorNode).connect(audioCtx.destination)
Interestingly, I have found that if you comment out the two AudioWorklet lines and instead create a simple gain node, it works fine.
// await audioCtx.audioWorklet.addModule('worklet/bypassProcessor.js')
// const processorNode = new AudioWorkletNode(audioCtx, 'bypass-processor')
const gainNode = audioCtx.createGain()
Also, simply creating the AudioWorkletNode without even connecting it to the other nodes reproduces the issue.
I've created a small React app here that reproduces the problem: https://github.com/JacobMuchow/audio_distortion_repro/tree/master
I've tried some options, such as detecting when this happens via the 'devicechange' event, closing the old AudioContext and its nodes, and recreating everything, but this only works some of the time. If I wait a while and then recreate it again, it works, so I'm worried about some kind of garbage-collection issue with the processor when attempting this, but that might be beside the point.
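For reference, a minimal sketch of that teardown/recreate approach, assuming the setup from main.js is wrapped in a hypothetical buildGraph() helper:
let audioCtx = null

async function buildGraph() {
  const microphoneStream = await navigator.mediaDevices.getUserMedia({ audio: true, video: false })
  audioCtx = new AudioContext()
  const inputNode = audioCtx.createMediaStreamSource(microphoneStream)
  await audioCtx.audioWorklet.addModule('worklet/bypassProcessor.js')
  const processorNode = new AudioWorkletNode(audioCtx, 'bypass-processor')
  inputNode.connect(processorNode).connect(audioCtx.destination)
}

navigator.mediaDevices.addEventListener('devicechange', async () => {
  // close the old context so the browser can renegotiate the hardware sample rate
  if (audioCtx) await audioCtx.close()
  // then rebuild the whole graph against the new default input device
  await buildGraph()
})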
I suspect this has something to do with sample rates: when the AudioContext is correctly recreated, it switches from 48 kHz to 16 kHz and then it sounds fine. But sometimes it is recreated still at 48 kHz, and it continues to sound robotic.
Threads on the internet concerning this are incredibly sparse and I'm hoping someone has specific experience with this issue or this API and can point out what I need to do differently.

For Chrome, the problem is very likely https://crbug.com/1090441, which was recently fixed. I think Firefox doesn't have this problem, but I didn't check.
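Until that fix reaches a given Chrome build, one possible mitigation (my assumption, not something confirmed in the bug report) is to pin the recreated context to the new device's rate via the standard AudioContextOptions, reusing the microphoneStream from the question's main.js:
// read the negotiated rate from the current input track (where the browser exposes it)
const track = microphoneStream.getAudioTracks()[0]
const { sampleRate } = track.getSettings()
// pin the context to that rate, falling back to the default when unavailable
const audioCtx = sampleRate ? new AudioContext({ sampleRate }) : new AudioContext()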

Related

Chrome produces no audio after reaching 50 audio output streams

During my testing, I have found out that reaching 50 audio output streams (as displayed in chrome://media-internals/ Audio tab) on a single tab causes the audio output to disappear. Does Chrome have a set maximum limit of audio output streams allowed per displayed tab? If so, is there some workaround for that? The Chrome version that I am using is Version 87.0.4280.141.
Whenever we mute/unmute the audio (second function below) or adjust the mic volume (first function below), we create a new audio context. Could too many AudioContext instances have caused the issue?
private setLocalStreamVolume(stream: MediaStream | undefined) {
  const context = new AudioContext()
  const destination = context.createMediaStreamDestination()
  const gainNode = context.createGain()
  if (stream) {
    for (const track of stream.getTracks()) {
      const sourceStream = context.createMediaStreamSource(new MediaStream([track]));
      sourceStream.connect(gainNode)
      gainNode.connect(destination)
      gainNode.gain.value = this._micVolume
    }
  }
  return destination.stream
}
export function mixStreams(streams: Iterable<(MediaStream | undefined)>) {
  const context = new AudioContext()
  const mixedOutput = context.createMediaStreamDestination()
  for (const stream of streams) {
    if (stream) {
      for (const track of stream.getTracks()) {
        const sourceStream = context.createMediaStreamSource(new MediaStream([track]));
        sourceStream.connect(mixedOutput);
      }
    }
  }
  return mixedOutput.stream.getTracks()[0]
}
Could too many AudioContext interactions have caused the issue?
Too many AudioContext instances certainly will. In fact, on some systems you can only use a single AudioContext.
I'm not sure what your specific use case is, but you probably only need one AudioContext. All your MediaStreamAudioSourceNodes can live in the same context.
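To illustrate, here is a sketch of how both helpers from the question could share a single module-level context (a hypothetical refactor; micVolume is passed in instead of read from this._micVolume):
// Hypothetical refactor: one shared AudioContext for the whole module.
// Note it may start in the 'suspended' state until a user gesture calls resume().
const sharedContext = new AudioContext()

export function setLocalStreamVolume(stream: MediaStream | undefined, micVolume: number) {
  const destination = sharedContext.createMediaStreamDestination()
  const gainNode = sharedContext.createGain()
  gainNode.gain.value = micVolume
  gainNode.connect(destination)
  if (stream) {
    for (const track of stream.getTracks()) {
      sharedContext.createMediaStreamSource(new MediaStream([track])).connect(gainNode)
    }
  }
  return destination.stream
}

export function mixStreams(streams: Iterable<MediaStream | undefined>) {
  const mixedOutput = sharedContext.createMediaStreamDestination()
  for (const stream of streams) {
    if (!stream) continue
    for (const track of stream.getTracks()) {
      sharedContext.createMediaStreamSource(new MediaStream([track])).connect(mixedOutput)
    }
  }
  return mixedOutput.stream.getTracks()[0]
}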

Play stream from gstreamer in browser

I want to play stream from gstreamer in a web browser.
I played around a bit with RTP, WebRTC and SDP files, but while VLC was able to connect to the stream from a simple SDP file, browsers were not. I later understood that WebRTC requires a secure connection, which only complicates things and is not needed for my purposes. I then stumbled upon the Media Source Extensions (MSE) API of HTML5, which seems like it could help, but I'm not able to find a comprehensive tutorial or appropriate specs on how to get gstreamer to stream the correct data, nor on how to play it using MSE. I'm also not sure about the latency of MSE.
So is there a way to play stream from gstreamer in a browser?
Thanks.
Using the node-webrtc project, I was able to combine output from gstreamer with a WebRTC call. For gstreamer, there is a project that enables its use with Node: gstreamer-superficial. So basically, you need to run the gstreamer process from the Node process, which can then control the gstreamer output. For every gstreamer frame there is a callback, which takes the frame and can send it to WebRTC calls.
Then a WebRTC call needs to be implemented; some signaling protocol is required for the calls. One side of the call will be the server and the other will be the client's browser, instead of two browsers. A video track is then created, and frames from gstreamer-superficial are pushed into it.
const { RTCVideoSource } = require("wrtc").nonstandard;
const gstreamer = require("gstreamer-superficial");

const source = new RTCVideoSource();
// This is the WebRTC video track which should be used with addTransceiver, see below
const track = source.createTrack();

const frame = {
  width: 1920,
  height: 1080,
  data: null
};

const pipeline = new gstreamer.Pipeline("v4l2src ! videorate ! video/x-raw,format=YUY2,width=1920,height=1080,framerate=25/1 ! videoconvert ! video/x-raw,format=I420 ! appsink name=sink");
const appsink = pipeline.findChild("sink");

const pull = function() {
  appsink.pull(function(buf, caps) {
    if (buf) {
      frame.data = new Uint8Array(buf);
      try {
        source.onFrame(frame);
      } catch (e) {}
      pull();
    } else if (!caps) {
      console.log("PULL DROPPED");
      setTimeout(pull, 500);
    }
  });
};

pipeline.play();
pull();

// Example:
const useTrack = SomeRTCPeerConnection => SomeRTCPeerConnection.addTransceiver(track, { direction: "sendonly" });

Ways to capture incoming WebRTC video streams (client side)

I am currently looking for the best way to store incoming WebRTC video streams. I am joining the video call using WebRTC (via Chrome) and I would like to record every incoming video stream from each participant to the browser.
The solutions I am researching are:
Intercepting network packets coming to the browser, e.g. using Wireshark, and then decoding them, following this article: https://webrtchacks.com/video_replay/
Modifying a browser to store the recording as a file, e.g. by modifying Chromium itself
Any screen recorders or solutions like xvfb & ffmpeg are not an option due to resource constraints. Is there any other way that could let me capture packets or encoded video as a file? The solution must work on Linux.
If the media stream is what you want, one method is to override the browser's RTCPeerConnection. Here is an example:
In an extension manifest add the following content script:
content_scripts": [
{
"matches": ["http://*/*", "https://*/*"],
"js": ["payload/inject.js"],
"all_frames": true,
"match_about_blank": true,
"run_at": "document_start"
}
]
inject.js
var inject = '(' + function() {
  // override the browser's default RTCPeerConnection
  var origPeerConnection = window.RTCPeerConnection || window.webkitRTCPeerConnection || window.mozRTCPeerConnection;
  // make sure it is supported
  if (origPeerConnection) {
    // our own RTCPeerConnection
    var newPeerConnection = function(config, constraints) {
      console.log('PeerConnection created with config', config);
      // proxy the original peer connection
      var pc = new origPeerConnection(config, constraints);
      // store the old addStream
      var oldAddStream = pc.addStream;
      // addStream is called when a local stream is added.
      // arguments[0] is a local media stream
      pc.addStream = function() {
        console.log("our add stream called!");
        // our MediaStream object
        console.dir(arguments[0]);
        return oldAddStream.apply(this, arguments);
      };
      // ontrack is called when a remote track is added.
      // the media stream(s) are located in event.streams
      pc.ontrack = function(event) {
        console.log("ontrack got a track");
        console.dir(event);
      };
      window.ourPC = pc;
      return pc;
    };
    ['RTCPeerConnection', 'webkitRTCPeerConnection', 'mozRTCPeerConnection'].forEach(function(obj) {
      // override the constructors if they exist on the window object
      if (window.hasOwnProperty(obj)) {
        window[obj] = newPeerConnection;
        // copy the static methods
        Object.keys(origPeerConnection).forEach(function(x) {
          window[obj][x] = origPeerConnection[x];
        });
        window[obj].prototype = origPeerConnection.prototype;
      }
    });
  }
} + ')();';

var script = document.createElement('script');
script.textContent = inject;
(document.head || document.documentElement).appendChild(script);
script.parentNode.removeChild(script);
I tested this with a voice call in Google Hangouts and saw that two MediaStreams were added via pc.addStream and one track was added via pc.ontrack. addStream appears to receive the local media streams, while the event object in ontrack is an RTCTrackEvent that has a streams property; I assume those are what you are looking for.
To access these streams from your extension's content script, you will need to create audio elements and set their srcObject property to the media stream, e.g.:
pc.ontrack = function(event) {
  // check if our element exists
  var elm = document.getElementById("remoteStream");
  if (elm == null) {
    // create an audio element
    elm = document.createElement("audio");
    elm.id = "remoteStream";
  }
  // set the srcObject to our stream. not sure if you need to clone it
  elm.srcObject = event.streams[0].clone();
  // write the element to the body
  document.body.appendChild(elm);
  // fire a custom event so our content script knows the stream is available.
  // you could pass the id in the "detail" object, for example:
  // new CustomEvent("remoteStreamAdded", {"detail": {"id": "audio_element_id"}})
  // then access it via e.detail.id in your event listener.
  var e = new CustomEvent("remoteStreamAdded");
  window.dispatchEvent(e);
}
Then in your content script you can listen for that event/access the mediastream like so:
window.addEventListener("remoteStreamAdded", function(e) {
  var elm = document.getElementById("remoteStream");
  var stream = elm.captureStream();
})
With the captured stream available to your content script you can do pretty much anything you want with it. For example, MediaRecorder works really well for recording the stream(s), as sketched below, or you could use something like peer.js or maybe binary.js to stream it to another source.
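For illustration, a minimal MediaRecorder sketch reusing the stream variable from the listener above (the mime type and chunk handling are my assumptions; adjust for your container of choice):
// record the captured stream into webm chunks (mimeType support varies by browser)
var recordedChunks = [];
var recorder = new MediaRecorder(stream, { mimeType: "audio/webm" });
recorder.ondataavailable = function(event) {
  if (event.data.size > 0) recordedChunks.push(event.data);
};
recorder.onstop = function() {
  // combine the chunks into a single Blob that can be downloaded or uploaded
  var blob = new Blob(recordedChunks, { type: "audio/webm" });
  console.log("recorded", blob.size, "bytes");
};
recorder.start(1000); // emit a chunk roughly every second
// later: recorder.stop();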
I haven't tested this, but it should also be possible to override the local streams. For example, in inject.js you could establish some blank MediaStream, override navigator.mediaDevices.getUserMedia, and instead of returning the local MediaStream, return your own.
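A hedged sketch of that getUserMedia override (untested, as noted; the replacement here is just a silent stream built from a muted oscillator):
// keep a reference to the real getUserMedia in case you need to fall back to it
var origGetUserMedia = navigator.mediaDevices.getUserMedia.bind(navigator.mediaDevices);
navigator.mediaDevices.getUserMedia = function(constraints) {
  // build a silent replacement stream (constraints are ignored in this sketch)
  var ctx = new AudioContext();
  var dest = ctx.createMediaStreamDestination();
  var osc = ctx.createOscillator();
  var gain = ctx.createGain();
  gain.gain.value = 0; // silence
  osc.connect(gain).connect(dest);
  osc.start();
  // return our own stream instead of the real microphone
  return Promise.resolve(dest.stream);
};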
The PeerConnection override method should work in Firefox and maybe other browsers as well, assuming you use an extension/app to load the inject.js script at the start of the document. Loading it before any of the target page's libraries is key to making this work.
Capturing packets will only give you the network packets, which you would then need to turn into frames and put into a container. A server such as Janus can record videos.
Running headless Chrome and using the JavaScript MediaRecorder API is another option, but it is much heavier on resources.

web audio API crashing chrome

I'm trying to build something using the processor node here. Almost anything I do in terms of debugging crashes Chrome, specifically the tab. Whenever I bring up dev tools, and 100% of the time when I put a breakpoint in the onaudioprocess handler, the tab dies and I have to either find the Chrome helper process for that tab or force-quit Chrome altogether to get started again. It has basically crippled my development for the time being. Is this a known issue? Do I need to take certain precautions to prevent Chrome from crashing? Are the real-time aspects of the Web Audio API simply not debuggable?
Without seeing your code, it's a bit hard to diagnose the problem.
Does running this code snippet crash your browser tab?
let audioCtx = new (window.AudioContext || window.webkitAudioContext)();

function onPlay() {
  let scriptProcessor = audioCtx.createScriptProcessor(4096, 2, 2);
  scriptProcessor.onaudioprocess = onAudioProcess;
  scriptProcessor.connect(audioCtx.destination);

  let oscillator = audioCtx.createOscillator();
  oscillator.type = "sawtooth";
  oscillator.frequency.value = 220;
  oscillator.connect(scriptProcessor);
  oscillator.start();
}

function onAudioProcess(event) {
  let { inputBuffer, outputBuffer } = event;
  for (let channel = 0; channel < outputBuffer.numberOfChannels; channel++) {
    let inputData = inputBuffer.getChannelData(channel);
    let outputData = outputBuffer.getChannelData(channel);
    for (let sample = 0; sample < inputBuffer.length; sample++) {
      outputData[sample] = inputData[sample];
      // Add white noise to oscillator.
      outputData[sample] += ((Math.random() * 2) - 1) * 0.2;
      // Un-comment the following line to crash the browser tab.
      // console.log(sample);
    }
  }
}

<button type="button" onclick="onPlay()">Play</button>
If it crashes, there's something else in your local dev environment causing you problems, because it runs perfectly for me.
If not, then maybe you are doing a console.log() (or some other heavy operation) in your onaudioprocess event handler? Remember, this handler processes thousands of audio samples every time it is called, so you need to be careful about what you do in it. For example, try un-commenting the console.log() line in the code snippet above: your browser tab will crash. A safer pattern is to throttle any logging, as sketched below.
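One way to keep some visibility without flooding the console is to sample the logging, say once per hundred callbacks (a sketch; the counter name is made up):
// log at most once per 100 audio callbacks instead of once per sample
let callbackCount = 0;
function onAudioProcessDebug(event) {
  if (++callbackCount % 100 === 0) {
    // at 4096 frames per callback and 44.1 kHz this fires roughly every 9 seconds
    console.log("callback", callbackCount, "first sample:", event.inputBuffer.getChannelData(0)[0]);
  }
}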

AMS doesn't receive unpublish command SOMETIMES over rtmpt

This one has had me going for a week at least. I am trying to record a video file to AMS. It works great almost all of the time, except that in about 1 in 10 or 15 recording sessions I never receive 'NetStream.Unpublish.Success' on my NetStream from AMS when I close the stream. When this happens, I am connecting to AMS over RTMPT; it seems to work fine over RTMP. Also, it seems like this only happens in Safari on Mac, but since it's so intermittent I don't really trust that observation. Here is my basic flow:
// just a way to use promises with netStatusEvents
private function netListener(code:String, netObject:*):Promise {
    var deferred:Deferred = new Deferred();
    var netStatusHandler:Function = function (event:NetStatusEvent):void {
        if (event.info.level == 'error') {
            deferred.reject(event);
        } else if (event.info.code == code) {
            deferred.resolve(netObject);
            // we want this to be a one-time listener since the connection can swap between record/playback
            netObject.removeEventListener(NetStatusEvent.NET_STATUS, netStatusHandler);
        }
    };
    netObject.addEventListener(NetStatusEvent.NET_STATUS, netStatusHandler);
    return deferred.promise;
}
// set up for recording
private function initRecord():void {
    Settings.recordFile = Settings.uniquePrefix + (new Date()).getTime();
    // detach any existing NetStream from the video
    _view.video.attachNetStream(null);
    // dispose of existing NetStream
    if (_videoStream) {
        _videoStream.dispose();
        _videoStream = null;
    }
    // disconnect before connecting anew
    (_nc.connected ? netListener('NetConnection.Connect.Closed', _nc) : Promise.when(_nc))
        .then(function (nc:NetConnection):void {
            netListener('NetConnection.Connect.Success', _nc)
                .then(function (nc:NetConnection):void {
                    _view.video.attachCamera(_webcam);
                    // get new NetStream
                    _videoStream = getNetStream(_nc);
                    ExternalInterface.call("CTplayer." + Settings.instanceName + ".onRecordReady", true);
                }, function (error:NetStatusEvent):void {
                    ExternalInterface.call("CTplayer." + Settings.instanceName + ".onError", error.info);
                });
            _nc.connect(Settings.recordServer);
        }); // end ncClose
    if (_nc.connected) _nc.close();
}
// stop recording
private function stop():void {
    netListener('NetStream.Unpublish.Success', _videoStream)
        .then(function (ns:NetStream):void {
            ExternalInterface.call("CTplayer." + Settings.instanceName + ".onRecordStop", Settings.recordFile);
        });
    _videoStream.attachCamera(null);
    _videoStream.attachAudio(null);
    _videoStream.close();
}
// start recording
private function record():void {
    netListener('NetStream.Publish.Start', _videoStream)
        .then(function (ns:NetStream):void {
            ExternalInterface.call("CTplayer." + Settings.instanceName + ".onRecording");
        });
    _videoStream.attachCamera(_webcam);
    _videoStream.attachAudio(_microphone);
    _videoStream.publish(Settings.recordFile, "record"); // fires NetStream.Publish.Start
}
Update
I am now using a new NetConnection per connection attempt and also not forcing port 80 (see my 'answer' below). This has not solved my connection woes, only made the failures more infrequent. Now, every week or so, I still see some random failure of AMS or Flash. Most recently, someone made a recording and then Flash Player was unable to load the video for playback. The AMS logs show a connection attempt and then nothing; there should at least be a play event logged for when I load the metadata. This is quite frustrating and impossible to debug.
I would try 2 distinct NetConnection objects, one for record and one for replay. This will remove the complexity around adding/removing listeners and the connect/reconnect/disconnect logic, and would IMO be cleaner.
NetConnections are cheap, and I've always used one per task at hand. The other advantage is that you can connect both at startup so the replay connection is ready instantly.
I've not seen a Promise used here before, but I'm not qualified to comment on whether that may cause a problem or not.
I think my issue was connecting over port 80. I originally thought I had to use port 80 with RTMPT, so I set my Settings.recordServer variable to rtmpt://myamsserver.net:80/app. I'm now using a shotgun approach where I try a bunch of port/protocol combos at once and pick the first one to connect. It is almost always picking port 443 over RTMPT, which seems much faster and more stable all around than port 80, and I haven't had this issue since. It could also be due to not reusing the same NetConnection object, like Stefan suggested; it's hard to say.