Minimizing latency in streaming audio with HTML5

I'm trying to listen to a live audio stream on a webpage with a latency of less than 3 seconds. So far, with Ogg Vorbis streams generated using ices and Icecast, I've been unable to get latencies below 7 seconds. Every player I've tried (the HTML5 audio tag in Firefox, Opera, and Safari, as well as VLC) introduces a similar delay. It's unclear at this point how much latency is introduced by ices/Icecast versus the client-side player. I've tweaked the ices and Icecast settings, to no avail.
Has anyone achieved better latencies than this in a similar ices/Icecast setup? I wouldn't expect an Ogg Vorbis decoder (be it HTML5 in a browser, VLC, or whatever) to delay an audio stream by multiple seconds. Am I wrong? I can't find any info on controlling buffer sizes or decoding behavior in browsers.
With a different architecture (HTML5, Firefox, and a WSGI server serving WAV-format audio), I was able to achieve latencies of around 1-2 seconds. By default, Firefox began playing the WAV file 5+ seconds behind, but I could advance playback by setting audio.currentTime ahead and end up only 1-2 seconds back (somewhat fragile). However, I'd much prefer to use Icecast, and streaming WAVs obviously doesn't scale.
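Roughly what I was doing to pull playback forward, sketched (the target latency value is arbitrary):

    // Keep playback pinned near the newest buffered data instead of
    // wherever the browser chose to start.
    var audio = document.querySelector('audio');
    var TARGET_LATENCY = 1.5; // seconds behind the buffer's live edge

    audio.addEventListener('progress', function () {
        if (audio.buffered.length === 0) return;
        var bufferedEnd = audio.buffered.end(audio.buffered.length - 1);
        if (bufferedEnd - audio.currentTime > TARGET_LATENCY) {
            audio.currentTime = bufferedEnd - TARGET_LATENCY;
        }
    });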
Thanks in advance for any ideas.

The Icecast and SHOUTcast servers themselves have internal buffers. I know the SHOUTcast one can be configured (look at the advanced directives in the docs).
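For Icecast, I believe the relevant knobs live under <limits> in icecast.xml - a hedged sketch, with illustrative values rather than recommendations:

    <!-- A smaller burst-on-connect means clients start closer to the live
         edge, at the cost of slower initial buffering. -->
    <limits>
        <queue-size>65536</queue-size>
        <burst-on-connect>1</burst-on-connect>
        <burst-size>8192</burst-size> <!-- default is around 64 kB -->
    </limits>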

There are some archived discussion threads about Ogg/Vorbis-related delay:
Vorbis codec delay
Delay
The answer seems to be that you have to tweak the Ogg container format, and then the remaining delay from Vorbis itself should not be too high.
However, I have also often read that the newer Opus codec is better suited for low delay/latency. See e.g. here or here.

Related

Adaptive streaming for AUDIO in HTML5

I have a big collection of CC music that I want to stream.
I want adaptive streaming so that users on slow connections don't have to stop and buffer every 5 seconds.
I've read about MPEG-DASH, HLS, etc., and it seems that MPEG-DASH supports only MP4 and TS containers, so I'm not sure I can do what I want.
I have LC-AAC-320k and HE-AAC-64k files. Quality matters, and I can hear the difference between HE-AAC and LC-AAC-320 even on Realtek audio.
Is it possible to do adaptive streaming for these formats with support for Chrome, Firefox, and Safari? If not, is there any way to detect low bandwidth (frequent buffering) and switch to HE-AAC?
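For the fallback case, the kind of thing I have in mind, sketched (the file name and stall threshold are made up):

    var audio = document.querySelector('audio');
    var stalls = 0;

    // If playback stalls repeatedly, switch to the low-bitrate HE-AAC file.
    audio.addEventListener('waiting', function () {
        stalls++;
        if (stalls === 3) { // arbitrary threshold
            var t = audio.currentTime; // remember the stall position
            audio.src = 'track-he-aac-64k.m4a'; // hypothetical 64k version
            audio.addEventListener('loadedmetadata', function onMeta() {
                audio.removeEventListener('loadedmetadata', onMeta);
                audio.currentTime = t; // resume near where we stalled
                audio.play();
            });
            audio.load();
        }
    });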

How does Chrome decide how much video to buffer for HTML5 MP4?

I have a variable-bitrate MP4 video, so the bitrate doesn't necessarily stay consistent throughout the file. Because my video is a capture of a computer screen, some parts of the video are very low bitrate because nothing is happening, and other parts are a much higher bitrate because there's a lot of activity on the screen.
How does Chrome decide how much video to buffer for progressive download HTTP(S) videos? I'm running into a problem where Chrome tends to buffer too little, so playback stutters.
If there's no way of convincing Chrome to download a certain duration of video (and I don't want to just preload the entire thing), can I author the MP4 in some special way to solve the problem? I'm using FFmpeg and MP4Box. Or maybe it's up to the HTTP server?
If you want more control over video playback, you should definitely check out Media Source Extensions (MSE). It defines a whole model for video playback, with SourceBuffers that you fill with media data, and so on.
Be warned: it is still not a simple API to use, and the information on how to use it is fragmented.
In your case, if you go the MSE route, you can either keep using H.264 (which is probably the codec your MP4 is wrapping) or switch to WebM.
If you go the MP4/H.264 route, you'll need to generate a fragmented MP4 (fMP4) and write some JavaScript to control how you work with the MP4 fragments (segments in MSE parlance).
MP4Box supports a -dash option, which will segment an MP4 in a way that is suitable for consumption via MSE. FFmpeg can also generate fragmented MP4.
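To give a feel for the MSE flow, here is a minimal sketch (the segment names and codec string are assumptions; check your file's actual codec with MP4Box -info, and something like MP4Box -dash 4000 input.mp4 will generate an init segment plus media segments):

    var video = document.querySelector('video');
    var mediaSource = new MediaSource();
    video.src = URL.createObjectURL(mediaSource);

    // Hypothetical segment list: the init segment first, then media segments.
    var segments = ['init.mp4', 'seg1.m4s', 'seg2.m4s'];

    mediaSource.addEventListener('sourceopen', function () {
        var sb = mediaSource.addSourceBuffer('video/mp4; codecs="avc1.42E01E"');
        var i = 0;

        function appendNext() {
            if (i === segments.length) {
                mediaSource.endOfStream();
                return;
            }
            var xhr = new XMLHttpRequest();
            xhr.open('GET', segments[i++]);
            xhr.responseType = 'arraybuffer';
            xhr.onload = function () {
                sb.appendBuffer(xhr.response);
            };
            xhr.send();
        }

        // appendBuffer is asynchronous: append the next segment only after
        // the previous one has been consumed.
        sb.addEventListener('updateend', appendNext);
        appendNext();
    });

Feeding appendBuffer yourself is also what gives you the buffering control the progressive-download player won't: you decide how far ahead to fetch.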

Determining current rendition for HTML5 HLS streams

I've got an HTML5 <video> element whose source is a .m3u8 (HLS stream).
I have an M3U8 with three different renditions: 640x360, 960x540, and 1280x720.
On desktops I have a Flash player for the video, so the HTML5 fallback is only intended for mobile (iOS and Android). I am doing all of my testing on an iPad; once it's working, I will try it out on Android and hope everything works the same.
My goal is to, at any point in time, figure out what rendition the video element is playing. The rendition is subject to change as the user's bandwidth changes.
I tried using the .videoHeight property, but it always returns 480 regardless of the rendition being downloaded - which is particularly odd because 480 isn't even an option.
Does anyone know how I can figure out the rendition being downloaded?
Cleaning up some old questions that never received answers:
Unfortunately, this one is just not possible. The HTML5 video spec and the HTML5 video implementations in most browsers are intended to abstract away all of the underlying magic involved in playing videos. You give it a source, and it plays. Everything else is completely hidden and you have no access: no access to metadata channels, no access to audio channels, no access to bitrate and resolution information, and so on.
The best I managed was a solution that guesses which resolution is playing: every 10 seconds a 1 MB file was loaded over AJAX, and I measured the download speed to estimate the current bandwidth. I know that QuickTime will only play a rendition if you have double the required bandwidth, so if the 960x540 rendition requires 1400 kbit/s, it won't play unless you have 2800 kbit/s available.
It's not very good (and wastes 6 MB of bandwidth per minute) but it's better than nothing.
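For what it's worth, the probe looked roughly like this (a sketch; 'probe.bin' is a stand-in for the 1 MB file, and the cache-busting query string keeps the browser from serving it from cache):

    function estimateBandwidth(callback) {
        var SIZE_BYTES = 1024 * 1024; // must match the probe file's size
        var start = Date.now();
        var xhr = new XMLHttpRequest();
        xhr.open('GET', 'probe.bin?ts=' + start);
        xhr.responseType = 'arraybuffer';
        xhr.onload = function () {
            var seconds = (Date.now() - start) / 1000;
            callback((SIZE_BYTES * 8) / seconds / 1000); // kbit/s
        };
        xhr.send();
    }

    setInterval(function () {
        estimateBandwidth(function (kbps) {
            // QuickTime needs ~2x a rendition's bitrate, so guess the
            // playing rendition as the best one requiring under kbps / 2.
            console.log('~' + Math.round(kbps) + ' kbit/s available');
        });
    }, 10000);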

How did Scirra get HTML5 audio so perfect in Construct 2?

Check out this space shooter demo.
The HTML5 audio is perfect on Chrome 18 and Firefox 10. There is no lag in playing sounds and each sample plays perfectly. The last time I tried to play sounds using HTML5 audio and JavaScript I couldn't get a sound to play more than once.
What sorcery is Scirra doing to make this so perfect?
I'm the developer of Construct 2, so I hope I'm sufficiently qualified to answer your question :)
HTML5 audio is indeed a mess, so I've gone to considerable lengths to try and make it bulletproof in Construct 2. Here's an outline of what I've done:
Use the Web Audio API
HTML5 audio appears designed for streaming music, so an HTML5 Audio object is a fairly heavyweight object. Playing 10 sounds a second with it, like Space Blaster does, can easily seize up the browser. The Web Audio API, on the other hand, is a high-performance audio engine with routing, effects, and lightweight sound playback; it's perfect for games. Audio buffers and audio playback are separated, so you can have one data buffer and efficiently play it many times simultaneously, whereas some browsers are so buggy that if you play an HTML5 sound a few times, they re-download it each time! Since the Web Audio API was actually designed for games and the like, you can happily play back tonnes of sound for ages and it will still hum along nicely. It can also use HTML5 audio as a sound source, although I only use HTML5 audio for things the user has designated as music tracks (since that's where you'd want streaming; typically everything else in the Web Audio API is fully downloaded before playing).
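For illustration, the one-buffer-many-playbacks pattern looks something like this (a sketch using current method names - older WebKit builds spelled them webkitAudioContext and noteOn(0) - and 'laser.wav' is a placeholder asset):

    var context = new (window.AudioContext || window.webkitAudioContext)();
    var laserBuffer = null;

    // Download and decode the sample once; the decoded buffer is reused
    // for every subsequent playback.
    var xhr = new XMLHttpRequest();
    xhr.open('GET', 'laser.wav');
    xhr.responseType = 'arraybuffer';
    xhr.onload = function () {
        context.decodeAudioData(xhr.response, function (buffer) {
            laserBuffer = buffer;
        });
    };
    xhr.send();

    function fireLaser() {
        if (!laserBuffer) return; // still loading
        // Buffer source nodes are one-shot and cheap: one per playback.
        var source = context.createBufferSource();
        source.buffer = laserBuffer;
        source.connect(context.destination);
        source.start(0);
    }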
The Web Audio API is supported in Chrome and has made it into iOS 6+ (although it's muted until you first play some audio from within a touch event); Firefox is working on support, and it should be coming soon to Chrome for Android. So on these platforms audio will be significantly more reliable.
More info is on HTML5Rocks and in the proposed spec - you'll have to use the spec as the documentation for now; there's not much else out there.
Other browsers: implement an audio recycling system
The Web Audio API isn't yet supported everywhere, notably in IE, which means you still need to crowbar HTML5 audio into something that might work for games, for backwards compatibility. The way to do this is to recycle audio objects.
The player's laser in Space Blaster fires 10 times a second - and that's not including any other sound effects! As I mentioned earlier, Audio is kind of a heavyweight object, so if you're doing new Audio() 10+ times a second, lo and behold, the browser eventually dies and audio starts glitching up. However, you can drastically reduce the number of Audio objects created by recycling them.
Basically, for each sound effect, keep a cache of every Audio object you've created with that sound as its source. Then, when playing a new sound, search the cache for an instance that has finished playing (its ended property will be true). If you find one, rewind it to the beginning (currentTime = 0) and call play() again. Otherwise, create a new Audio() object and add it to the cache.
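A bare-bones sketch of that recycling scheme (illustrative, not the actual Construct 2 code):

    var audioCache = {}; // source URL -> array of Audio objects

    function playSound(src) {
        var pool = audioCache[src] || (audioCache[src] = []);
        // Reuse any instance that has finished playing.
        for (var i = 0; i < pool.length; i++) {
            if (pool[i].ended) {
                pool[i].currentTime = 0;
                pool[i].play();
                return;
            }
        }
        // Nothing free: create a new instance and remember it.
        var audio = new Audio(src);
        pool.push(audio);
        audio.play();
    }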
Since the player's laser sound effect is short, instead of creating 600 Audio objects a minute, there will just be 3 or 4 that it keeps cycling round. Some browsers unfortunately will still download it 4 times (Safari did this last I checked!) or have high latency the first time each sound buffer is played, but eventually the browser catches up since the same buffers are always being reused. So basically sound might be a bit weird for a few moments, then it clears up. We also use the HTML5 app cache so next time you play everything loads from disk, so subsequent plays should perform well immediately.
That's basically it. It's still a little dodgy on the first play with HTML5 audio, but every time after that should be fairly solid providing the browser has a sane audio implementation. There are a number of ways to try to clone Audio objects, but I've found that rewinding existing Audios works best.
There's no SoundManager or any Flash/plugin-based fallbacks at all since we make a point of being pure HTML5.
We also support audio APIs provided by PhoneGap and appMobi for mobile, since HTML5 audio on mobile isn't even worth trying. That makes a total of four audio APIs our audio engine wraps, and yes, it does look like a Frankenstein mess, but it works.
That's it. I suppose our competitors will read this, but who cares when there's SO rep to be had???!!!1111

Best HTML5 Video Format for Safari on Windows (or getting VP8 to play in Safari on Windows)

Here's the deal, through a huge series of events, I am stuck using Safari on Windows for video playback in HTML5.
I can't use any other browser - Chrome is out of the question; I must use Safari, and it has to be on Windows for hardware compatibility.
The best format I've found is an H.264 QuickTime file, but I'm still getting some dropped frames and a bit of tearing.
The video is being played in 1920x1080 resolution and I have tried down-sampling to 720p, which causes noticeable quality loss and no noticeable gain in performance.
I'm looking for one of the following two as a solution:
- A plugin for Safari (that's Windows-compatible) to use something other than QuickTime for HTML5 video. I've looked, and the WebM (VP8) plugin is only for OS X.
- Any video format or configuration that will decode faster in QuickTime on Windows. I've even tried ProRes, to no avail; it's even slower than H.264.
Update...
Ogg Theora can be played in QuickTime with XiphQT, but I've run into many issues when trying to play back various Ogg video formats.
With H.264, if you are using x264 (e.g. via HandBrake) to transcode/encode video, the following can be set in advanced mode:
cabac=0:ref=1:me=umh:bframes=0:weightp=0:8x8dct=0:trellis=0:subq=6:tune=fastdecode
These parameters:
ref=1 sets the reference frame limit to 1; using more reference frames requires more processing.
bframes=0 disables B-frames; I'm not sure on this, but I believe it forces P-frames, which are faster to decode.
cabac=0 disables CABAC entropy coding; CABAC makes the output smaller but takes more processing to decode.
tune=fastdecode sets the tune preset to optimize the output specifically for decoding speed.
I am less sure of the other options and have yet to find solid evidence of their impact on decoding, or whether they have any at all. For example, the "me" setting selects the motion estimation method used while encoding; it affects video quality, and given how frames reference one another, it could (in some videos) have an impact on decoding. That is something I do not know, but I mention it to give a better sense of where I am coming from.
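If you're applying these with FFmpeg rather than HandBrake, I believe roughly the same thing looks like this (an untested sketch; FFmpeg takes the tune through its own -tune flag, and the rest can go via -x264opts):

    ffmpeg -i input.mov -c:v libx264 -tune fastdecode \
        -x264opts cabac=0:ref=1:me=umh:bframes=0:weightp=0:8x8dct=0:trellis=0:subq=6 \
        -c:a aac output.mp4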
More about these settings can be found here:
http://mewiki.project357.com/wiki/X264_Settings