I'm using MediaCodec with targetSdkVersion 28 (Android 9) to convert the PCM stream from AudioRecord into AMR for a 3GPP/GSM VoIP application. The onInputBufferAvailable() callback calls AudioRecord.read(bb, bb.limit()) to queue the PCM samples to the encoder through the available ByteBuffer, and the onOutputBufferAvailable() callback accepts the AMR frame and passes it down to the RTP layer for packetization and transmission. This works well on a variety of Android 7, 8 and 9 devices that we've been testing with.
However, on a Samsung XCover Pro running Android 10, the onOutputBufferAvailable() callback isn't triggered until 26 AMR frames are available, instead of after a single frame as on the other devices. Given that each frame represents 20ms of audio, this causes an audio delay of over half a second. So my question is: what control do I have over a MediaCodec audio encoder to get it to trigger the onOutputBufferAvailable() callback when a particular number of frames (ideally between 1 and 5) are available?
The encoder is created like this...
String mimeType = MediaFormat.MIMETYPE_AUDIO_AMR_NB;
MediaFormat format = new MediaFormat();
format.setString(MediaFormat.KEY_MIME, mimeType);
format.setInteger(MediaFormat.KEY_SAMPLE_RATE, sampleRate);
format.setInteger(MediaFormat.KEY_CHANNEL_COUNT, 1);
format.setInteger(MediaFormat.KEY_BIT_RATE, bitRate);
MediaCodec encoder = MediaCodec.createEncoderByType(mimeType);
encoder.configure(format, null, null, MediaCodec.CONFIGURE_FLAG_ENCODE);
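For completeness, the asynchronous callbacks are wired up roughly as sketched below. This is a simplified illustration of the flow described above rather than the exact production code: audioRecord is our AudioRecord instance, sendToRtp() stands in for the RTP packetization/transmission code, and error handling is omitted.
encoder.setCallback(new MediaCodec.Callback() {
    // NOTE: in the real code setCallback() is invoked before encoder.configure(),
    // as required for asynchronous mode; it is shown after it here only to keep
    // the snippet in one place.
    @Override
    public void onInputBufferAvailable(MediaCodec codec, int index) {
        // Fill the encoder's input buffer directly from AudioRecord (blocking read).
        ByteBuffer bb = codec.getInputBuffer(index);
        int bytesRead = audioRecord.read(bb, bb.limit());
        codec.queueInputBuffer(index, 0, Math.max(bytesRead, 0), System.nanoTime() / 1000, 0);
    }

    @Override
    public void onOutputBufferAvailable(MediaCodec codec, int index, MediaCodec.BufferInfo info) {
        // Each output buffer holds an encoded AMR frame (20ms of audio); hand it to RTP.
        ByteBuffer out = codec.getOutputBuffer(index);
        sendToRtp(out, info.size);   // placeholder for our packetization/transmission code
        codec.releaseOutputBuffer(index, false);
    }

    @Override
    public void onOutputFormatChanged(MediaCodec codec, MediaFormat newFormat) { }

    @Override
    public void onError(MediaCodec codec, MediaCodec.CodecException e) { }
});
encoder.start();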
I've experimented with the MediaFormat.KEY_MAX_INPUT_SIZE parameter, but that doesn't appear to have any effect, and setting MediaFormat.KEY_LATENCY didn't help either (and anyway the docs say it only applies to video codecs).
Any suggestions?
Related
I have an AS3 music player built into an app that I'm putting together. It works perfectly with almost every file I've used, but there is one file that it stops early on. The file is roughly 56 seconds long, the player stops at about 44 seconds. I'm using trace to show the length, and for every other song the length is correct. In this case, trace shows roughly 44 seconds instead of 56. Here's the code I use to load the file:
length = 0;
request = new URLRequest(fileAddress);
track = new Sound();
track.load(request);
track.addEventListener(Event.COMPLETE, TrackLoaded);
And here's the TrackLoaded function:
private function TrackLoaded(e:Event):void{
    length = track.length;
    if (playWhenLoaded == true){
        trackChannel = track.play(0);
        trackChannel.addEventListener(Event.SOUND_COMPLETE, TrackFinishedPlaying);
        playWhenLoaded = false;
    }
}
Works perfectly with every other file. What am I missing?
Are you willing to host this 56-second MP3 somewhere for download and analysis? Or you could check the header info yourself in a hex editor. I suspect one of two things:
1) The header has an incorrect duration embedded, and Flash takes that as the final duration and stops there. After all, why read any remaining bytes? They could just be metadata rather than audio samples, and what encoder would lie about the true duration? So the stated duration is accepted as final, even though your ears know it's incorrect.
2) An MP3 sample rate / bitrate issue: what sample rate does this one problem MP3 use? Check its sample rate against that of a working MP3. Also, are these various found sounds, or did you make each one yourself? I ask to confirm whether you used the same settings for every file and yet only this one fails.
In any case, I think you could fix this particular MP3 by re-encoding it. Maybe save it as WAV or AIFF first, then convert that uncompressed audio back to an MP3 with a 44100 Hz sample rate, stereo sound and a constant bitrate (avoid variable bitrate like the plague if you don't want issues).
Checking and fixing either of the above should get you a correctly parsed MP3. Hope it helps.
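If you'd rather inspect the file programmatically than in a hex editor, here's a rough sketch of the idea (written in Java purely for illustration, and simplified: it skips an ID3v2 tag if present, scans for the first frame sync, and can be fooled by false sync bytes). The tables are the standard MPEG-1 Layer III ones.
import java.io.IOException;
import java.io.RandomAccessFile;

public class Mp3HeaderCheck {
    // Bitrate (kbps) and sample-rate tables for MPEG-1 Layer III.
    private static final int[] MPEG1_L3_BITRATES = {
        0, 32, 40, 48, 56, 64, 80, 96, 112, 128, 160, 192, 224, 256, 320, 0 };
    private static final int[] MPEG1_SAMPLE_RATES = { 44100, 48000, 32000, 0 };

    public static void main(String[] args) throws IOException {
        try (RandomAccessFile f = new RandomAccessFile(args[0], "r")) {
            byte[] head = new byte[10];
            f.readFully(head);
            long offset = 0;
            // Skip an ID3v2 tag if present: its size is a 4-byte synchsafe integer.
            if (head[0] == 'I' && head[1] == 'D' && head[2] == '3') {
                offset = 10 + ((head[6] & 0x7F) << 21) + ((head[7] & 0x7F) << 14)
                        + ((head[8] & 0x7F) << 7) + (head[9] & 0x7F);
            }
            f.seek(offset);
            int b0 = f.read(), b1 = f.read(), b2 = f.read();
            // Scan forward for the 11-bit frame sync (0xFF followed by three set bits).
            while (b2 != -1 && !(b0 == 0xFF && (b1 & 0xE0) == 0xE0)) {
                b0 = b1; b1 = b2; b2 = f.read();
            }
            boolean mpeg1 = (b1 & 0x18) == 0x18;   // version ID bits
            boolean layer3 = (b1 & 0x06) == 0x02;  // layer description bits
            System.out.println("MPEG-1: " + mpeg1 + ", Layer III: " + layer3);
            if (mpeg1 && layer3) {
                System.out.println("Bitrate: " + MPEG1_L3_BITRATES[(b2 >> 4) & 0x0F] + " kbps");
                System.out.println("Sample rate: " + MPEG1_SAMPLE_RATES[(b2 >> 2) & 0x03] + " Hz");
            }
        }
    }
}
On a VBR file the bitrate field changes from frame to frame, which is exactly the situation where a player that trusts a single value can get the duration wrong.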
I am looking to take the audio input from the browser and stream it to multiple listeners. The intended use is for music, so the quality must be MP3 standard or thereabouts.
I have attempted two ways, both yielding unsuccessful results:
WebRTC
Streaming audio directly between browsers works fine, but the audio quality seems to be non-customisable from what I have seen. (I can see that it is using the Opus audio codec, but it doesn't seem to expose any controls.)
Does anyone have any insight into how to increase the audio quality in WebRTC streams?
Websockets
The issue is the transportation from the browser to the server. The PCM audio data I can acquire via the method below has proven too large to repeatedly stream to the server via websockets. The stream works perfectly in high-speed internet environments, but on slower wifi it is unusable.
var context = new webkitAudioContext();
navigator.webkitGetUserMedia({audio: true}, gotStream);

function gotStream (stream)
{
    // Route the microphone stream through a ScriptProcessor so we can read raw PCM.
    var source = context.createMediaStreamSource(stream);
    var proc = context.createScriptProcessor(2048, 2, 2);
    source.connect(proc);
    proc.connect(context.destination);

    proc.onaudioprocess = function (event)
    {
        // 2048 Float32 samples from the first channel per callback.
        var audio_data = event.inputBuffer.getChannelData(0) || new Float32Array(2048);
        console.log(audio_data);
        // send audio_data to server
    };
}
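To put rough numbers on it (assuming a 44.1 kHz AudioContext): each onaudioprocess callback delivers 2048 Float32 samples per channel, i.e. 8 KB per channel per callback, which works out to about 44100 × 4 ≈ 176 KB/s (roughly 1.4 Mbit/s) of raw PCM per channel, before any websocket framing overhead.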
So the main question is, is there any way to compress the PCM data in order to make it easier to stream to the server? Or perhaps there is an easier way to go about this?
There are lots of ways to compress PCM data, sure, but realistically, your best bet is to get WebRTC to work properly. WebRTC is designed to do this - adaptively stream media - although you don't define what you mean by "multiple" listeners (there's a huge difference between 3 listeners and 300,000 simultaneous listeners).
There are several possible ways of resampling and/or compressing your data, though none of them are native. I resampled the data to 8 kHz mono (your mileage may vary) with the xaudio.js lib from the speex.js environment. You could also compress the stream using Speex, though that is usually only suitable for speech. In your case, I would probably send the stream to a server, compress it there and stream it to your audience. I really don't believe a plain browser is good enough to serve data to a huge audience.
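To give a rough sense of the savings (my numbers, assuming 44.1 kHz stereo Float32 input): the raw capture is about 44100 × 2 × 4 ≈ 353 KB/s, whereas 8 kHz mono 16-bit PCM is 8000 × 2 = 16 KB/s, before any Speex compression on top reduces it further still.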
WebRTC seems to default to one mono channel at around 42 kb/s; it appears to be primarily designed for voice.
You can disable the audio-processing features using constraints, to get a more consistent input from the browser:
navigator.mediaDevices.getUserMedia({
audio: {
autoGainControl: false,
channelCount: 2,
echoCancellation: false,
latency: 0,
noiseSuppression: false,
sampleRate: 48000,
sampleSize: 16,
volume: 1.0
}
});
Then you should also set the stereo and maxaveragebitrate params on the SDP:
let answer = await peer.conn.createAnswer(offerOptions);
answer.sdp = answer.sdp.replace('useinbandfec=1', 'useinbandfec=1; stereo=1; maxaveragebitrate=510000');
await peer.conn.setLocalDescription(answer);
This should output a string which looks like this:
a=fmtp:111 minptime=10;useinbandfec=1; stereo=1; maxaveragebitrate=510000
This can increase the bitrate up to 510 kb/s for stereo, i.e. 255 kb/s per channel. The actual bitrate still depends on the speed of your network and the strength of your signal, though.
I'm working on a VoIP application for Windows Phone 8, and I want to cancel the echo produced when using speakerphone. Speex offers an AEC module, which I have tried to integrate into my application, but to no avail. My application works fine, but the echo persists. My code is based off the MS Chatterbox VoIP application, using WASAPI for capture and render. This is the form of the relevant sections (I tried to indicate what already existed and worked, and what was new):
Init:
// I've tried tail lengths between 100-500ms (800-4000 samples @ 8 kHz)
echoState = speex_echo_state_init(80, 800);
speex_echo_ctl(echoState, SPEEX_ECHO_SET_SAMPLING_RATE, 8000);
Render (runs every 10ms):
Read 10ms (80 samples) of data from the network (8 kHz, 16-bit, mono)
NEW - speex_echo_playback(echoState, networkData)
Upsample data to 48 kHz
Render data
Capture (runs every 10ms):
Capture 10ms of data (48 kHz, 16-bit, mono)
Downsample to 8 kHz
NEW - speex_echo_capture(echoState, downsampledData, echoCancelledData)
Send echoCancelledData to network
After reading the Speex documentation and looking at some posts on this site (not a lot of Speex-for-WP8 questions, but a few for Android), I'm under the impression that this is, or is close to, the proper implementation of their API. So why isn't it working?
Thanks in advance
I have a back-end sound.php which can return a .m4a sound file from the web server, and I can make a web request with an id to sound.php to return a specific .m4a file, i.e. sound.php?id=1234.
I am now trying to use org.osmf.media.MediaPlayer with AudioElement and URLResource:
var mediaPlayer:MediaPlayer = new MediaPlayer();
var ae:AudioElement = new AudioElement(new URLResource("http://xxx.com/sound.php?id=12"));
mediaPlayer.media = ae;
mediaPlayer.play();
and it throws the error "The specified capability is not currently supported". I have tested the link via a browser, and it returns a .m4a file successfully.
I don't understand whether it is complaining about the request method or the returned file. Would anybody have any idea? Thanks
Try setting MediaPlayer.autoPlay to true, or wait until the media is loaded, which is signaled through the mediaPlayerStateChange event with state READY.
[UPDATE]
As stated in the NetStream - Adobe ActionScript® 3 (AS3) API Reference page and also in the Supported codecs | Flash Player page:
Flash Player 9 Update 3 plays files derived from the standard MPEG-4 container format that contain H.264 video and/or HE-AAC audio, such as F4V, MP4, M4A, MOV, MP4V, 3GP, and 3G2. One thing to note is that protected MP4 files, such as those downloaded from iTunes or digitally encrypted by FairPlay, are not supported.
It seems you will have to try the NetStream approach.
I'm trying to serve up a live stream (i.e. completely buffered in memory, cannot access the past) and am having trouble with Expression Encoder 4.
Ideally, I'd like to just stream a bare H.264 byte stream to the client consumed by:
<video id="mainVideoWindow">
<source src='http://localhost/path/to/my/stream.mp4' type='video/mp4' />
</video>
I figured I could stream it to the client just like any other byte stream over HTTP. However, I'm having trouble figuring out the code required to do that (it's my first day with Expression Encoder, and I'm not sure how to get at the raw byte stream), nor do I know whether it would work in the first place.
An alternative was to use the IIS live streaming server:
var source = job.AddDeviceSource(device, null);
job.ActivateSource(source);
job.ApplyPreset(LivePresets.VC1IISSmoothStreaming720pWidescreen);
var format = new PushBroadcastPublishFormat();
format.PublishingPoint = new Uri("http://localhost/test.isml");
job.PublishFormats.Add(format);
job.StartEncoding();
// Let's listen for a keypress or error message to know when to stop encoding
while (Console.ReadKey(true).Key != ConsoleKey.X) ;
// Stop our encoding
Console.WriteLine("Encoding stopped.");
job.StopEncoding();
However, I'm having trouble getting the client-side markup to display the video in Chrome, and I haven't seen anything to indicate that it would work in Chrome (though http://learn.iis.net/page.aspx/854/apple-http-live-streaming-with-iis-media-services indicates how it would work with an iOS device).
Anyone have any insights?
You are trying to consume (with your second example) a Smooth Streaming feed (HTTP adaptive streaming by Microsoft) through HTML5, which is not supported.
This could work on iOS devices if you enable Apple HTTP Live Streaming to transmux the fragments into an MPEG-2 Transport Stream. This will also generate an Apple HTTP Live Streaming manifest, which can then be consumed through the video tag.
...I saw that you have the IIS link. Apple HTTP Live Streaming needs to be enabled on the IIS server (IIS Media Services). This will work for iOS devices; QuickTime will also come into play...