Empty microphone data from getUserMedia

Using the following code I get all zeroes in the audio stream from my microphone (using Chrome):
navigator.mediaDevices.getUserMedia({ audio: true }).then(
  function (stream) {
    var audioContext = new AudioContext();
    var source = audioContext.createMediaStreamSource(stream);
    var node = audioContext.createScriptProcessor(8192, 1, 1);
    source.connect(node);
    node.connect(audioContext.destination);
    node.onaudioprocess = function (e) {
      console.log("Audio:", e.inputBuffer.getChannelData(0));
    };
  }
).catch(function (error) { console.error(error); });
I created a jsfiddle here: https://jsfiddle.net/g3dck4dr/
What's wrong here?

Umm, something in your hardware config is wrong? The fiddle works fine for me (that is, it shows non-zero values). Do other web audio input tests work, like https://webaudiodemos.appspot.com/input/index.html?
Test to make sure you've selected the right input, and you don't have a hardware mute switch on.
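As a quick sanity check (my addition, not part of the original answer), you can log which input device the stream actually captured and whether its track is muted; these are standard MediaStreamTrack APIs:
// Hedged sketch: verify which microphone was picked and its track state.
navigator.mediaDevices.getUserMedia({ audio: true }).then(function (stream) {
  var track = stream.getAudioTracks()[0];
  console.log("Using input:", track.label); // which microphone the browser picked
  console.log("muted:", track.muted, "enabled:", track.enabled);
  console.log("settings:", track.getSettings()); // deviceId, sampleRate, etc.
  // List every available input so a specific one can be requested via
  // getUserMedia({ audio: { deviceId: { exact: id } } }).
  return navigator.mediaDevices.enumerateDevices();
}).then(function (devices) {
  devices.filter(function (d) { return d.kind === "audioinput"; })
    .forEach(function (d) { console.log(d.label, d.deviceId); });
});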

Related

Audio distortion occurs when using AudioWorkletProcessor with a MediaStream source and connecting a Bluetooth device while it is already running

In our project, we use AudioContext to wire up input from a microphone to an AudioWorkletProcessor and out to a MediaStream. Ultimately, this is sent to other peers in a WebRTC call.
If someone loads the page, the audio always sounds fine. But if they join with a hard-wired microphone like a laptop mic or webcam and then connect a Bluetooth device (such as AirPods or headphones), the audio becomes distorted and robotic-sounding.
If we tear out all the other code and simplify it, we still have the issue.
bypassProcessor.js
// Basic processor that wires input to output without transforming the data
// https://github.com/GoogleChromeLabs/web-audio-samples/blob/main/audio-worklet/basic/hello-audio-worklet/bypass-processor.js
class BypassProcessor extends AudioWorkletProcessor {
  process(inputs, outputs) {
    const input = inputs[0];
    const output = outputs[0];
    for (let channel = 0; channel < output.length; ++channel) {
      output[channel].set(input[channel]);
    }
    return true;
  }
}

registerProcessor('bypass-processor', BypassProcessor);
main.js
const microphoneStream = await navigator.mediaDevices.getUserMedia({
  audio: true, // have also tried { channelCount: 1 } and { channelCount: { exact: 1 } }
  video: false
})

const audioCtx = new AudioContext()
const inputNode = audioCtx.createMediaStreamSource(microphoneStream)

await audioCtx.audioWorklet.addModule('worklet/bypassProcessor.js')
const processorNode = new AudioWorkletNode(audioCtx, 'bypass-processor')

inputNode.connect(processorNode).connect(audioCtx.destination)
Interestingly, I have found that if you comment out the two audio worklet lines and instead create a simple gain node, then it works fine.
// await audioCtx.audioWorklet.addModule('worklet/bypassProcessor.js')
// const processorNode = new AudioWorkletNode(audioCtx, 'bypass-processor')
const gainNode = audioCtx.createGain()
Also, if you simply create the AudioWorkletNode but don't even connect it to the other nodes, this also reproduces the issue.
I've created a small React app here that reproduces the problem: https://github.com/JacobMuchow/audio_distortion_repro/tree/master
I've tried some options, such as detecting when this happens using the 'ondevicechange' event, then closing the old AudioContext & nodes and recreating everything, but this only works some of the time. If I wait for some time and then recreate it again, it works, so I'm worried about some type of garbage collection issue with the processor when attempting this, but that might be beside the point.
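For reference, the recreate-on-devicechange attempt looks roughly like this (a simplified sketch of what I described; setupAudioGraph is a hypothetical helper wrapping the main.js wiring above):
// Sketch of the recreate-on-devicechange workaround described above.
let audioCtx = null

async function setupAudioGraph() {
  audioCtx = new AudioContext()
  const stream = await navigator.mediaDevices.getUserMedia({ audio: true, video: false })
  const inputNode = audioCtx.createMediaStreamSource(stream)
  await audioCtx.audioWorklet.addModule('worklet/bypassProcessor.js')
  const processorNode = new AudioWorkletNode(audioCtx, 'bypass-processor')
  inputNode.connect(processorNode).connect(audioCtx.destination)
}

navigator.mediaDevices.ondevicechange = async () => {
  // Tear everything down and rebuild; as noted, this only helps some of the time.
  if (audioCtx) await audioCtx.close()
  await setupAudioGraph()
}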
I suspect this has something to do with sample rates... when the AudioContext is correctly recreated, it switches from 48 kHz to 16 kHz and then it sounds fine. But sometimes it is recreated still at 48 kHz, and it continues to sound robotic.
Threads on the internet concerning this are incredibly sparse and I'm hoping someone has specific experience with this issue or this API and can point out what I need to do differently.
For Chrome, the problem is very likely https://crbug.com/1090441 that was recently fixed. I think Firefox doesn't have this problem but I didn't check.
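Until that fix reaches your users, a possible workaround (my suggestion, based on the sample-rate observation above; the bug report doesn't prescribe it) is to pin the AudioContext to the input track's sample rate:
// Hedged workaround sketch: create the context at the track's sample rate,
// e.g. 16 kHz once a Bluetooth headset's microphone becomes the input.
const stream = await navigator.mediaDevices.getUserMedia({ audio: true })
const trackRate = stream.getAudioTracks()[0].getSettings().sampleRate
const audioCtx = trackRate ? new AudioContext({ sampleRate: trackRate }) : new AudioContext()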

Chrome produces no audio after reaching 50 audio output streams

During my testing, I have found that reaching 50 audio output streams (as displayed in the chrome://media-internals/ Audio tab) on a single tab causes the audio output to disappear. Does Chrome have a set maximum number of audio output streams allowed per tab? If so, is there some workaround for that? The Chrome version that I am using is 87.0.4280.141.
Whenever we're muting/unmuting the audio (second function below) or adjusting the mic volume (first function below), we create a new AudioContext. Could too many AudioContext instances have caused the issue?
private setLocalStreamVolume(stream: MediaStream | undefined) {
  const context = new AudioContext()
  const destination = context.createMediaStreamDestination()
  const gainNode = context.createGain()
  if (stream) {
    for (const track of stream.getTracks()) {
      const sourceStream = context.createMediaStreamSource(new MediaStream([track]));
      sourceStream.connect(gainNode)
      gainNode.connect(destination)
      gainNode.gain.value = this._micVolume
    }
  }
  return destination.stream
}

export function mixStreams(streams: Iterable<(MediaStream | undefined)>) {
  const context = new AudioContext()
  const mixedOutput = context.createMediaStreamDestination()
  for (const stream of streams)
    if (stream)
      for (const track of stream.getTracks()) {
        const sourceStream = context.createMediaStreamSource(new MediaStream([track]));
        sourceStream.connect(mixedOutput);
      }
  return mixedOutput.stream.getTracks()[0]
}
Could too many AudioContext interactions have caused the issue?
Too many AudioContext instances certainly will. In fact, on some systems you can only use a single AudioContext.
I'm not sure what your specific use case is, but you probably only need one AudioContext. All your MediaStreamSourceNodes can live in the same context.
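For example, a sketch of that refactor (assuming a module-level context works in your setup; names are illustrative):
// Sketch: one shared AudioContext reused by every call, instead of one per call.
const sharedContext = new AudioContext()

function setLocalStreamVolume(stream, micVolume) {
  const destination = sharedContext.createMediaStreamDestination()
  const gainNode = sharedContext.createGain()
  gainNode.gain.value = micVolume
  gainNode.connect(destination)
  if (stream) {
    for (const track of stream.getTracks()) {
      sharedContext.createMediaStreamSource(new MediaStream([track])).connect(gainNode)
    }
  }
  return destination.stream
}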

Ways to capture incoming WebRTC video streams (client side)

I am currently looking for the best way to store incoming WebRTC video streams. I am joining the video call using WebRTC (via Chrome) and I would like to record every incoming video stream from each participant in the browser.
The solutions I am researching are:
Intercept network packets coming to the browser, e.g. using Wireshark, and then decode them, following this article: https://webrtchacks.com/video_replay/
Modify a browser to store the recording as a file, e.g. by modifying Chromium itself
Any screen recorders or solutions like xvfb & ffmpeg are not options due to resource constraints. Is there any other way that could let me capture packets or encoded video as a file? The solution must work on Linux.
If the media stream is what you want, one method is to override the browser's PeerConnection. Here is an example:
In an extension manifest, add the following content script:
content_scripts": [
{
"matches": ["http://*/*", "https://*/*"],
"js": ["payload/inject.js"],
"all_frames": true,
"match_about_blank": true,
"run_at": "document_start"
}
]
inject.js
var inject = '(' + function() {
  //override the browser's default RTCPeerConnection.
  var origPeerConnection = window.RTCPeerConnection || window.webkitRTCPeerConnection || window.mozRTCPeerConnection;
  //make sure it is supported
  if (origPeerConnection) {
    //our own RTCPeerConnection
    var newPeerConnection = function(config, constraints) {
      console.log('PeerConnection created with config', config);
      //proxy the original peer connection
      var pc = new origPeerConnection(config, constraints);
      //store the old addStream
      var oldAddStream = pc.addStream;
      //addStream is called when a local stream is added.
      //arguments[0] is a local media stream
      pc.addStream = function() {
        console.log("our add stream called!")
        //our mediaStream object
        console.dir(arguments[0])
        return oldAddStream.apply(this, arguments);
      }
      //ontrack is called when a remote track is added.
      //the media stream(s) are located in event.streams
      pc.ontrack = function(event) {
        console.log("ontrack got a track")
        console.dir(event);
      }
      window.ourPC = pc;
      return pc;
    };
    ['RTCPeerConnection', 'webkitRTCPeerConnection', 'mozRTCPeerConnection'].forEach(function(obj) {
      // Override objects if they exist in the window object
      if (window.hasOwnProperty(obj)) {
        window[obj] = newPeerConnection;
        // Copy the static methods
        Object.keys(origPeerConnection).forEach(function(x) {
          window[obj][x] = origPeerConnection[x];
        })
        window[obj].prototype = origPeerConnection.prototype;
      }
    });
  }
} + ')();';
var script = document.createElement('script');
script.textContent = inject;
(document.head||document.documentElement).appendChild(script);
script.parentNode.removeChild(script);
I tested this with a voice call in Google Hangouts and saw that two MediaStreams were added via pc.addStream and one track was added via pc.ontrack. addStream appears to carry the local media streams, and the event object in ontrack is an RTCTrackEvent, which has a streams property. Those streams, I assume, are what you are looking for.
To access these streams from your extension's content script, you will need to create audio elements and set their "srcObject" property to the media stream, e.g.:
pc.ontrack = function(event) {
  //check if our element exists
  var elm = document.getElementById("remoteStream");
  if (elm == null) {
    //create an audio element
    elm = document.createElement("audio");
    elm.id = "remoteStream";
  }
  //set the srcObject to our stream. not sure if you need to clone it
  elm.srcObject = event.streams[0].clone();
  //write the element to the body
  document.body.appendChild(elm);
  //fire a custom event so our content script knows the stream is available.
  //you could pass the id in the "detail" object, for example:
  //new CustomEvent("remoteStreamAdded", {"detail":{"id":"audio_element_id"}})
  //then access it via e.detail.id in your event listener.
  var e = new CustomEvent("remoteStreamAdded");
  window.dispatchEvent(e);
}
Then in your content script you can listen for that event and access the MediaStream like so:
window.addEventListener("remoteStreamAdded", function(e) {
elm = document.getElementById("remoteStream");
var stream = elm.captureStream();
})
With the capture stream available to your content script you can do pretty much anything you want with it. For example, MediaRecorder works really well for recording the stream(s) or you could use something like peer.js or maybe binary.js to stream to another source.
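For instance, a minimal MediaRecorder sketch (my illustration; the element id matches the example above, and the webm mime type is an assumption Chrome supports):
// Hedged sketch: record the captured stream and offer it as a download.
var elm = document.getElementById("remoteStream");
var stream = elm.captureStream();
var chunks = [];
var recorder = new MediaRecorder(stream, { mimeType: "audio/webm" });
recorder.ondataavailable = function(e) { chunks.push(e.data); };
recorder.onstop = function() {
  var blob = new Blob(chunks, { type: "audio/webm" });
  var a = document.createElement("a");
  a.href = URL.createObjectURL(blob);
  a.download = "recording.webm";
  a.click(); // prompt a download of the finished recording
};
recorder.start(1000); // emit a data chunk every second
setTimeout(function() { recorder.stop(); }, 10000); // stop after 10 s for the demo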
I haven't tested this, but it should also be possible to override the local streams. For example, in inject.js you could establish some blank MediaStream, override navigator.mediaDevices.getUserMedia, and instead of returning the local MediaStream return your own.
This method should work in Firefox and maybe others as well, assuming you use an extension/app to load the inject.js script at the start of the document. Loading it before any of the target's libs is key to making this work.
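An untested sketch of that getUserMedia override (it logs and passes through the real stream; returning a substitute stream would go where the comment indicates):
// Hedged sketch: wrap getUserMedia to intercept (or replace) local streams.
var origGetUserMedia = navigator.mediaDevices.getUserMedia.bind(navigator.mediaDevices);
navigator.mediaDevices.getUserMedia = function(constraints) {
  return origGetUserMedia(constraints).then(function(stream) {
    console.log("local stream intercepted", stream);
    window.ourLocalStream = stream; // stash it for the content script
    return stream; // or return your own MediaStream here instead
  });
};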
Capturing packets will only give you the network packets, which you would then need to turn into frames and put into a container. A server such as Janus can record videos.
Running headless Chrome and using the JavaScript MediaRecorder API is another option, but it is much heavier on resources.

Detect if another browser tab is using speechRecognition

Is it possible to tell if another Chrome tab is using webkitSpeechRecognition?
If you try to use webkitSpeechRecognition while another tab is using it, it will throw an error "aborted" without any message. I want to be able to know if webkitSpeechRecognition is open in another tab, and if so, throw a better error that could notify the user.
Unless your customer is on the same website (you could check by logging the IP/browser fingerprint in a database and requesting it via JSON), you cannot do that.
Cross-domain protection is in effect, and that lets you know zilch about what happens in other tabs or frames.
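If both tabs are on your own site, though, one way to coordinate (my suggestion, a separate technique the answer doesn't cover) is a BroadcastChannel, so the tab that owns recognition can tell the others to back off:
// Hedged sketch: same-origin tabs coordinating who may use speech recognition.
var channel = new BroadcastChannel("speech-recognition-lock");
var otherTabActive = false;
channel.onmessage = function(e) {
  if (e.data === "recognition-started") otherTabActive = true;
  if (e.data === "recognition-stopped") otherTabActive = false;
};
function tryStartRecognition(recognition) {
  if (otherTabActive) {
    alert("Speech recognition is already in use in another tab.");
    return;
  }
  channel.postMessage("recognition-started");
  recognition.start();
  recognition.onend = function() { channel.postMessage("recognition-stopped"); };
}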
I am using webkitSpeechRecognition in Chrome (it does not work in Firefox) and I faced the same issue with multiple Chrome tabs. Until the browser implements a better error message, a temporary solution that works for me: detect when a tab gains or loses focus in Chrome using JavaScript, with code like this:
isChromium = window.chrome;
if (isChromium) {
  if (window.addEventListener) {
    // bind focus event
    window.addEventListener("focus", function (event) {
      console.log("Browser tab focus..");
      recognition.stop(); // to avoid error
      recognition.start();
    }, false);
    window.addEventListener("blur", function (event) {
      console.log("Browser tab blur..");
      recognition.stop();
    }, false);
  }
}
There's a small workaround for it. Store a timestamp when SpeechRecognition is activated; when recognition exits, compare the current time against that timestamp. Normally it exits only after a few seconds of inactivity, but if two tabs are using the API simultaneously it exits immediately, so a very short gap means another tab has it.
For Chrome, you can use the code below and modify it based on your needs. Firefox doesn't support this at the moment.
var transcriptionStartTime;
var timeSinceLastStart;

function configureRecognition() {
  var webkitSpeechRecognition = window.webkitSpeechRecognition || window.SpeechRecognition;
  if ('webkitSpeechRecognition' in window) {
    recognition = new webkitSpeechRecognition();
    recognition.continuous = true;
    recognition.interimResults = true;
    recognition.lang = "en-US";
    recognition.onstart = function() {
      // record when recognition actually started (this assignment is
      // implied but missing from the original snippet)
      transcriptionStartTime = new Date().getTime();
    }
    recognition.onend = function() {
      timeSinceLastStart = new Date().getTime() - transcriptionStartTime;
      if (timeSinceLastStart < 100) {
        alert('Speech recognition failed to start. Please close the tab that is currently using it.');
      }
    }
  }
}
See browser compatibility here: https://developer.mozilla.org/en-US/docs/Web/API/SpeechRecognition

Audio API: failing to resume music and visualize it. Is there a bug in HTML5 audio?

I have a button. Every time it is clicked, music plays; when it's clicked a second time, the music should resume. I also want to visualize the music.
So I begin with HTML5 audio (complete code at http://jsfiddle.net/karenpeng/PAW7r/):
$("#1").click(function(){
audio1.src = '1.mp3';
audio1.controls = true;
audio1.autoplay = true;
audio1.loop = true;
source = context.createMediaElementSource(audio1);
source.connect(analyser);
analyser.connect(context.destination);
});
But when it's clicked more than once, it logs this error to the console:
Uncaught Error: INVALID_STATE_ERR: DOM Exception 11
Then I switched to the Web Audio API and changed the source to:
source = context.createBufferSource();
The error is gone.
And then I need to visualize it.
But ironically, it only works with HTML5 audio!
(Complete code at http://jsfiddle.net/karenpeng/FvgQF/; it does not work on jsfiddle because I don't know how to write the processing.js script properly, but it does run on my PC.)
var audio = new Audio();
audio.src = '2.mp3';
audio.controls = true;
audio.autoplay = true;
audio.loop=true;
var context = new webkitAudioContext();
var analyser = context.createAnalyser();
var source = context.createMediaElementSource(audio);
source.connect(analyser);
analyser.connect(context.destination);
var freqData = new Uint8Array(analyser.frequencyBinCount);
analyser.getByteFrequencyData(freqData);
//visualization using freqData
When I change the source to:
source = context.createBufferSource();
it does not show anything.
So is there a way to visualize it without errors, while still letting it resume again and again?
Actually, I believe the problem is that you're trying to create a SECOND web audio node for the same media element. (Your code, when clicked, re-sets the SRC, controls, etc., but it's not creating a new Audio().) You should either hang on to the MediaElementAudioSourceNode you created, or create new Audio elements.
E.g.:
var context = new webkitAudioContext();
var analyser = context.createAnalyser();
var source = null;
var audio0 = new Audio();

$("#0").click(function() {
  audio0.src = 'http://www.bornemark.se/bb/mp3_demos/PoA_Sorlin_-_Stay_Up.mp3';
  audio0.controls = true;
  audio0.autoplay = true;
  audio0.loop = true;
  if (source == null) {
    source = context.createMediaElementSource(audio0);
    source.connect(analyser);
    analyser.connect(context.destination);
  }
});
Hope that helps!
-Chris Wilson
From what I can tell, this is likely because the MediaElementSourceNode may only be able to take in an Audio that isn't already playing. The Audio object is declared outside of the click handler, so you're trying to analyze audio that's in the middle of playing the second time you click.
Note that the API doesn't seem to specify this, so I'm not 100% sure, but this makes intuitive sense.
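For completeness, the other option from the first answer, creating a fresh Audio element (and thus a fresh source node) on each click, would look roughly like this (my sketch, assuming the same context and analyser as above):
// Hedged sketch: a new Audio element per click, so each element gets
// exactly one MediaElementAudioSourceNode.
$("#0").click(function() {
  var audio = new Audio();
  audio.src = 'http://www.bornemark.se/bb/mp3_demos/PoA_Sorlin_-_Stay_Up.mp3';
  audio.controls = true;
  audio.autoplay = true;
  audio.loop = true;
  var source = context.createMediaElementSource(audio);
  source.connect(analyser);
  analyser.connect(context.destination);
});
Note that this restarts the track from the beginning on each click rather than resuming it, so reusing the source node, as in the first answer, is the better fit for the resume requirement.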