Streaming MP3 with pre-processing

I was wondering if there's a way to get streaming audio data (MP3, FLAC, arbitrary chunks of data in any encoding) from a server, process it, and start playing after the first chunk?
Background: All of our data is stored in compressed chunks, with a LAMP environment serving the data. De-compression and re-assembly are done client-side with XHR downloads, IndexedDB, FileHandle and/or the Chrome filesystem API. All currently available audio/video functionality requires the audio to be downloaded completely (otherwise decodeAudioData fails) or requires a URL to the source, without giving me a chance to process incoming data client-side.
I am looking for a solution that squeezes my processing into the browser's built-in streaming/caching/decoding functionality (e.g. the audio/video tag). I don't want to pre-process anything server-side, I don't want Flash or Java applets, and I'd like to avoid aligning data client-side (e.g. processing the MP3 myself).
Question: Would it be possible to dynamically "grow" the storage that a blob URL points to? In other words: create a FileHandle/FileEntry, generate a blob URL, feed it into an audio tag, and keep growing the file with more data?
Any other ideas?
Michaela
Added: Well, after another day of fruitless attempts, I must confirm that there are two problems in dealing with streamed/chunked mp3|ogg data:
1) decodeAudioData is just too picky about what's fed into it. Even if I pre-align the Ogg audio (splitting at "OggS" boundaries), I am unable to get the second chunk decoded.
2) Even if I were able to get the chunks decoded, how would I go about playing them without setting timers, start positions or other head-banging detours? Maybe the Web Audio API developers should take a look at aurora/mp3?
Added: Sorry to be bitching, but my newest experiments with recording audio from the microphone are not very promising either: 400K of WAV for a few seconds of recording? I have taken a few minutes to write about my experiences with the Web Audio API and added a few suggestions, from a coder's perspective: http://blog.michaelamerz.com/wordpress/a-coders-perspective-on-the-webaudioapi/

Check out https://github.com/brion/ogv.js. Brion's project chunk-loads an .ogv video and outputs the raw data back to the screen through the Web Audio API and Canvas, playing at the original FPS/timing of the file itself.
There is a StreamFile object in the codebase that handles the chunked load, buffering and readout of the data, as well as an example of how it is being assembled for playback through WebAudio.
I actually emailed Brion directly for a little help and he got back to me within an hour. It wasn't built for exactly your use case, but the elements are there, and I highly recommend Brion, who is very knowledgeable on file formats, encoding and playback.

You cannot use the <audio> tag for this. However, here is what you can use:
Web Audio API - allows you to dynamically construct an audio stream in JavaScript
WebRTC - might need the streamed data to be pre-processed on the server side; not sure
Buffers are recyclable, so you can discard already played audio.
How you load your data (XHR, WebSockets, chunked file downloads) really doesn't matter, as long as you can get the raw data into a JavaScript buffer.
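To illustrate the Web Audio API approach, here is a minimal sketch of gapless chunked playback, assuming each chunk can be decoded on its own: every decoded buffer is scheduled on the AudioContext clock right where the previous one ends, so no manual timers are needed. fetchNextChunk() is a hypothetical placeholder for however you pull data down (XHR, WebSockets, etc.).

// Minimal sketch: schedule decoded chunks back-to-back on the audio clock.
// fetchNextChunk() is a hypothetical placeholder that resolves with an
// ArrayBuffer containing one self-contained chunk of compressed audio.
var context = new AudioContext();
var nextStartTime = 0;

function playChunk(arrayBuffer) {
  context.decodeAudioData(arrayBuffer, function (audioBuffer) {
    var source = context.createBufferSource();
    source.buffer = audioBuffer;
    source.connect(context.destination);
    // Scheduling on context.currentTime makes the chunks butt up seamlessly.
    if (nextStartTime < context.currentTime) {
      nextStartTime = context.currentTime;
    }
    source.start(nextStartTime);
    nextStartTime += audioBuffer.duration;
  });
}

function pump() {
  fetchNextChunk().then(function (arrayBuffer) {
    playChunk(arrayBuffer);
    pump(); // keep pulling while data remains
  });
}
pump();

The catch, as the question notes, is that decodeAudioData insists on self-contained chunks, so the data still has to be split at valid frame/page boundaries.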
Please note that there is no universal audio format that every browser can decode, so your mileage with MP3 may vary. AAC (MPEG-4 audio) is more widely supported and has the best web and mobile coverage. You can also decode AAC in pure JavaScript in Firefox: http://jster.net/library/aac-js - and you can decode MP3 in pure JavaScript as well: http://audiocogs.org/codecs/mp3/
Note that localStorage supports only 5 MB of data per origin without additional dialog boxes, so this severely limits storing audio on the client side.

How would you convert an image to JSON in Lua?

As you all have probably guessed, I've been trying to make an image parser in a heavily modified and sandboxed version of Lua known as "RBX.Lua" on the kids' gaming platform "ROBLOX".
It is limited and heavily sandboxed to protect the site and engine from harm.
Anyway, is there any way in normal Lua to convert an online image (.png, .jpg, etc.) to JSON?
This will probably be closed for being too subjective, and I acknowledge that - I just want to see if there is any way to convert an image into JSON so that it returns a JSON table of all the pixel data.
The problem is that you'll have a hard time reconstructing it inside Roblox if you intend to display it. There's no way to give raw image data to GUIs; you'd have to do some trickery and create a frame for every pixel of the image, which isn't very practical.
Otherwise, try converting the image data to base64 and then back again. As it would still be highly compressed, you'd have to do the JPG or PNG decoding in Lua. Painful.

individual JS file XMLHttpRequest vs combined gzip download

Some stats before I state the situation:
total JS code = 122 MB
minified = 36 MB
minified and gzipped = 4 MB
I would like to get the entire 4 MB down in one shot (with a loading progress indicator on the page) and uncompress it, but not parse it yet. We don't want the code expanding in the browser's memory when a lot of it might not be required at this point. The parsing should happen when a script tag with the corresponding JS file name is encountered.
Intention: a faster one-shot download of the JS files, while keeping the behaviour unchanged from the browser's perspective.
Do any such solutions exist? Am I even thinking sane?
If yes: I know how to get the gzip; what I would like to know is how to keep the files in the browser cache so that when a script tag is encountered, the browser doesn't fire an XMLHttpRequest for it again.
The trick is to leverage HTTP caching directives. For a start, take a look at this. You should only need to fetch your JS code once, because you can safely set the cache directive to instruct the browser to hold on to the JS file indefinitely (subject to space). "Indefinitely" in this context typically means the year 2035.
When you're ready to update all your browser-side caches with a new version of the JS file, simply use a cache-busting query string. Any serial number or date/time will do, or a simple version number, e.g.:
<script src="/js/myfile.js?v2.1"></script>
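On the server side, the directive itself is just a response header. A minimal sketch, assuming a Node.js server (any server or framework can set the same Cache-Control header; the file name is hypothetical):

// Minimal sketch: serve a JS file with a long-lived cache header.
var http = require('http');
var fs = require('fs');

http.createServer(function (req, res) {
  res.setHeader('Content-Type', 'application/javascript');
  // One year is the conventional "indefinitely"; the cache-busting
  // query string described above handles invalidation.
  res.setHeader('Cache-Control', 'public, max-age=31536000');
  fs.createReadStream('js/myfile.js').pipe(res);
}).listen(8080);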
Some minification frameworks handle the cache-busting for you. A good technique, for example, is to MD5 the contents and use that as the cache-buster query string. That way, whenever your source JS changes, the browser will request the new version (because the query string is embedded in your HTML script tag) and then cache it for as long as possible again.
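As a sketch of that technique, a build step might compute the hash like this (Node.js, using the standard crypto module; the file name is hypothetical):

// Minimal sketch: MD5 the bundle contents to produce a cache-buster.
var crypto = require('crypto');
var fs = require('fs');

var contents = fs.readFileSync('js/myfile.js');
var hash = crypto.createHash('md5').update(contents).digest('hex');
// Embed the result in the script tag, e.g. /js/myfile.js?v=<first 8 chars>
console.log('/js/myfile.js?v=' + hash.slice(0, 8));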
XMLHttpRequest will honour the caching primitives you set.
As for the other part of your question: I believe you're asking whether you can download one combined script file and then refer to parts of it with individual script tags on the page. No - I don't believe you can do that. If you want to refer to individual files, you need an HTTP URL and caching directives for each piece of gzipped content you want to use separately. However, you might find this is just as performant as one big file, or maybe even more so, depending on how much parallelisation you can achieve.
A neat trick here is to pre-load a lot of what you need. Google has been doing this on its home page for years: it pre-loads stacks of resources (images certainly, but possibly also JS), so while you're thinking about what search query to enter, it is already filling the cache with things you'll want on the subsequent page.
So you could use XMLHttpRequest to fetch your JS files (without parsing them) well before you need them. Then, by the time your <script/> tag refers to them, they'll already be downloaded and the browser just needs to parse them.
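A minimal sketch of that prefetch-now, parse-later pattern (the URL is hypothetical, and it relies on the cacheable headers described above):

// Prefetch the script bytes without evaluating them; parsing only
// happens later, when a script tag for the same URL is injected.
function prefetch(url) {
  var xhr = new XMLHttpRequest();
  xhr.open('GET', url, true);
  xhr.send(); // the response lands in the HTTP cache
}

function execute(url) {
  // Same URL, so this is served from the cache instead of the network,
  // and the browser parses/executes the script now.
  var script = document.createElement('script');
  script.src = url;
  document.head.appendChild(script);
}

prefetch('/js/feature.js'); // early, e.g. during page load
// ...later, when the feature is actually needed:
execute('/js/feature.js');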
In addition to cirrus's point about using HTTP caching, you could break that still-pretty-large 4 MB file down and only load the pieces when their functionality is required.
It's more HTTP requests, but 4 MB is a big hit in one go.
I suggest something like require.js to load the appropriate files when they are needed:
http://requirejs.org/docs/start.html
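A minimal sketch of on-demand loading with RequireJS (the element ID and module name are hypothetical):

// Fetch, parse and execute the heavy module only on first use.
document.getElementById('showChart').addEventListener('click', function () {
  require(['charting'], function (charting) {
    charting.draw(); // module code was not parsed until this click
  });
});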

When encoding to MP3 in ShineRecorder, encoding stops if volume is too high

Okay, basically we have jRecorder implemented on our website, which gives us the ability to capture audio in WAV format.
Now, after the capture, we use the ShineMP3Encoder to convert that WAV to MP3 (to save on file size). This all works fine.
Numerous people have encountered an issue whereby, if the recorded audio levels are too high, MP3 encoding will completely stop and the file will come out corrupt/short. The WAV, by contrast, doesn't seem to care how loud the recorded audio is and will happily play it back as is.
I appreciate my question is incredibly niche, but after banging my head against the wall for days, this is my only other option.
For what it's worth, this is the ActionScript that was used to record (it's a bog-standard ShineMP3 implementation):
// recorder.output comes from jRecorder
mp3Encoder = new ShineMP3Encoder(recorder.output);
mp3Encoder.addEventListener(Event.COMPLETE, mp3EncodeComplete);
mp3Encoder.start();
One possibility is that the encoding is running slower than the timer loop on those tracks, causing an error.
Try making the encoder's timer fire less often and see if that fixes the error.
In the start() method of ShineMP3Encoder.as replace
timer = new Timer(1000/30);
with
timer = new Timer(150);
That's line 37 in the current code base.

HTML5: accessing large structured local data

Summary:
Are there good HTML5/JavaScript options for selectively reading chunks of data (say, to be eventually converted to JSON) from a large local file?
Problem I am trying to solve:
An existing program runs locally and outputs a ton of data. I want to provide a browser-based interactive viewer that will allow folks to browse through these results. I have control over how the data is written out. I can write it all out in one big file, but since it's quite large, I can't just read the whole thing into memory. Hence, I am looking for some kind of indexed or db-like access to it from my webapp.
Thoughts on solutions:
1. Brute force: the HTML5 File API has a nice slice() method for random access. So I could write some kind of index at the beginning of the file, use it to look up the positions of the other stored objects, and read them whenever they're needed (a sketch of this idea follows after this list). I figured I'd ask whether there are already JavaScript libraries that do something like this (or better) before trying to implement this ugly thing myself.
2. HTML5 local database. Essentially, I am looking for an analogue of the HTML5 openDatabase() call that would open a (read-only) connection to a database backed by a user-specified local file. From what I understand, there's no way to specify a file with a pre-loaded database. Furthermore, even if there were such a hack, it's not clear whether the local file format would be the same across browsers. I've seen the PhoneGap solution that populates the browser's local database from SQL statements. I could do that too, but the data I am talking about is quite large (5-10 GB): it would take a while to load, and such duplication seems rather pointless.
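For what the brute-force option might look like: a minimal sketch, assuming a hypothetical layout where a fixed-size, space-padded JSON index sits at the front of the file and maps record names to byte offsets and lengths:

// Minimal sketch of index-at-the-front random access via File.slice().
var HEADER_BYTES = 1024 * 1024; // assumed: 1 MB index, space-padded

function readSlice(file, start, length, callback) {
  var reader = new FileReader();
  reader.onload = function () { callback(reader.result); };
  reader.readAsText(file.slice(start, start + length));
}

function openRecord(file, name, callback) {
  // 1) read the index from the head of the file
  readSlice(file, 0, HEADER_BYTES, function (text) {
    // e.g. { "recordA": { "offset": 1048576, "length": 2048 }, ... }
    var entry = JSON.parse(text)[name];
    // 2) random-access the record itself, no full read required
    readSlice(file, entry.offset, entry.length, callback);
  });
}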
HTML5 does not sound like the appropriate answer for your needs. HTML5's focus is on the client side, and based on your description you're asking a lot of the browser, most likely more than it can handle.
I would instead recommend looking at a server-based solution to deliver the desired results to the client view; something like Splunk would be a good product to consider.

Reverse engineering a custom data file

At my place of work we have a legacy document management system that, for various reasons, is now unsupported by the developers. I have been asked to look into extracting the documents contained in this system so they can eventually be imported into a new third-party system.
From tracing and process monitoring, I have determined that the document images (mainly TIFF files) are stored in a number of 1.5 GB files. Each image seems to be read from a specific offset, written to a tmp file that is served to the client via a web app, and then deleted.
I guess I am looking for suggestions as to how I can inspect these large files containing the TIFF images, and eventually extract the images and write them out to individual files.
Are the TIFFs compressed in some way? If not, your job may be pretty easy: stitch the TIFFs back together from the 1.5 GB files.
Can you see the output for a particular 1.5 GB file (or series of them)? If so, you should be able to piece together what the bytes should look like for a given TIFF if it were stored uncompressed.
If the bytes don't appear to be there, then try some standard compressions (zip, tar, etc.) to see if you get a match.
I'd open a file, seek to the required offset, and then stream into a TIFF object (ideally one that supports streaming from memory or file). Then you've got it. Poke around at some of the other bytes too, as there's likely metadata about the documents that may be useful to the next system.
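As a concrete starting point, here is a minimal Node.js sketch that scans one of the containers for TIFF signatures and splits on those offsets. It assumes the images are stored uncompressed and back-to-back (each one runs until the next signature), which you would need to verify; the container file name is hypothetical:

// Minimal sketch: find TIFF magic numbers ("II*\0" little-endian or
// "MM\0*" big-endian) in a container and write each span to its own file.
// For a real 1.5 GB container, read in chunks rather than all at once;
// also expect some false positives, so sanity-check the extracted files.
var fs = require('fs');

var data = fs.readFileSync('container-001.dat');
var offsets = [];

for (var i = 0; i + 4 <= data.length; i++) {
  var isLE = data[i] === 0x49 && data[i + 1] === 0x49 &&
             data[i + 2] === 0x2A && data[i + 3] === 0x00;
  var isBE = data[i] === 0x4D && data[i + 1] === 0x4D &&
             data[i + 2] === 0x00 && data[i + 3] === 0x2A;
  if (isLE || isBE) { offsets.push(i); }
}

offsets.forEach(function (start, n) {
  var end = (n + 1 < offsets.length) ? offsets[n + 1] : data.length;
  fs.writeFileSync('extracted-' + n + '.tif', data.slice(start, end));
});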