Get microphone input and read data - html

I have a square reader which has audio output that I'm trying to read via the web browser. I'm using getUserMedia to get the audio stream from the microphone, and the web audio API to get the data. My problem is that all the tutorials I've found mention using the data stream to show visuals for the audio stream. In my implementation, I need to be able to differentiate card-swipe sound with the background noise.
I have a couple basic questions:
What is FFT size? I know it's Fast Fourier Transform, but I don't know how or if it pertains to what I'm doing.
I understand buffer size but how do I know how big my buffer should be?
I've seen multiple things regarding left and right audio. I would assume the square reader would only have one channel for simplicity's sake, would this be correct?
And finally, the most important question, how do I combine it all together to make it so I read the frame that contains the square swipe data? I have this resource but I'm not getting data as 1s and 0s from the frequency or time data values of the stream analyser.
This link has more information about what needs to be done, but I can't do it until I get the data as bytes.

You will probably need to do this as a ScriptProcessor, not using the Analyser. The Analyser will not guarantee you won't drop data in between processing blocks. I'm not familiar with precisely how the Square reader transmits data; I presume it's frequency-shift-keying (FSK) - more at http://www.creativedistraction.com/demos/sensor-data-to-iphone-through-the-headphone-jack-using-arduino/, and interesting teardown at http://andybromberg.com/credit-cards/. An example of using ScriptProcessor at https://github.com/cwilso/volume-meter/.

Related

How to send a very large array to the client browser?

I have a very large array (20 million numbers, output of a sql query) in my MVC application and I need to send it to the client browser (it will be visualized on a map using webGL and the user is supposed to play with the data locally). What is the best approach to send the data? (Please just do not suggest this is a bad idea! I am looking for an answer to this specific question, not alternative suggestions)
This is my current code (called using ajax), but when array size goes above 3 millions I receive outofmemory exception. It seems the serialization (stringbuilder?) fails.
List<double> results = DomainModel.GetPoints();
JsonResult result = Json(results, JsonRequestBehavior.AllowGet);
result.MaxJsonLength = Int32.MaxValue;
return result;
I do not have much experience with web programming/javascript/MVC. I have been researching for the past 24 hours but did not get anywhere, so I need a hint/sample code to continue my research.
NO, NO, NO, you do not send that much information to the browser:
it results in a huge memory usage that will most likely crash the web-browser (and in fact in your case it does)
it takes a large amount of time to retrieve it, not everyone has a good internet connection, and even good connections can fluctuate over time
If you're building a map tool, then I'd recommend splitting the map into tiles and sending only the data corresponding to the portion of the map the user is currently working on. Also for larger zooms you can filter out data, as surely you can't place it all on the map.
Edit. A somewhat another alternative would be to ask your users to use machines with at least 16GB of RAM, or whatever RAM size is needed to deal with your huge data).

"Play" sounds into a .wav

I'm trying to make a program that can convert ORG files into WAV files directly. The ORG format is similar to MIDI, in the sense that it is a list of "instructions" about when and how to play specific instruments, and a program plays these instruments for it to create the song.
However, as I said, I want to generate a WAV directly, instead of just playing the ORG. So, in a sense, I want to "play" the sounds into a WAV. I do know the WAV format and have created some files from raw PCM samples, but this isn't as simple.
The sounds generated by the ORG come from a bunch of files containing WAV samples I have. They're mono, 8-bit samples should be played at 22050Hz. They're all under a second long, and the largest aren't more than 11KB. I would assume that to play them all after each other, I would simply put the samples into the WAV one after the other. It isn't that simple though, as the ORG can have up to 16 different instruments playing at once, and each note of each instrument also has a pan (i.e. a balance, allowing stereo sound). What's more, each ORG has its own tempo (i.e. milliseconds between each point a sound can be played), and some sounds may be longer than this tempo, which means that two sounds on the same instrument can overlap. For instance, a note plays on an instrument, 90 milliseconds later the same note plays on the same instrument, but the first not hasn't finished, hence the first note plays into the second.
I just thought to explain all of that to be sure the situation is clear. In any case, I'd basically like to know how I would go about converting or "playing" an ORG (or if you like, a MIDI (since they're essentially the same)) into a WAV. As I mentioned each note does have a pan/balance, so the WAV would also need to be stereo.
If it matters at all, I'll be doing this in ActionScript 3.0 in FlashDevelop. I don't need any code (as that would be asking someone to do the work for me), but I just want to know how I would go about doing this correctly. An algorithm or two may be handy as well.
First let me say AS3 is not the best language to do these kind of things. Super collider would be a better and easier choice.
But if you want to do it in AS3 here's a general approach. I haven't tested any of it, this is pure theory.
First, put all your sounds into an array, and then find a way of matching the notes from your midi file to a position in the array.
I don't know the format of midi in depth, but I know the smallest value is a tick, and the length of a tick depends on the BPM. Here's the formula to calculate a midi tick: Midi Ticks to Actual PlayBack Seconds !!! ( Midi Music)
Let's say your tick is 2ms in length. So now you have a base value. You can fill a Vector (like an Array but faster) with what happens at every tick. If nothing happens at a particular tick, then insert a null value.
Now the big problem is reading that Vector. It's a problem because the Timer class does not work at small values like 2ms. But what you can do is check the ellapsed time in ms since the app started using getTimer(). You can have some loop that will check the ellapsed time, and whenever you have 2ms more, you read the next index in the Vector. If there are notes on that index, you play the sounds. If not you wait for the next tick.
The problem with this, is that if a loop goes on for more than 15 seconds (I'm not sure of that value) Flash will think the program is not responding and will kill it. So you have to take care of that too, ending the loop and opening a new one before Flash kills your program.
Ok, so now you have sounds playing. You can record the sounds that flash is making (wavs, mp3, mic) with a library called Standing Wave 3.
https://github.com/maxl0rd/standingwave3
This is very theoretical... and I'm quite sure depending on the number of sounds you want to play you can freeze your program... but I hope it will help to get you going.

How can I analyze live data from webcam?

I am going to be working on self-chosen project for my college networking class and I just had a couple questions to help get me started in the right direction.
My project will involve creating a new "physical" link over which data, in the form of text, will be transmitted from one computer to another. This link will involve one computer with a webcam that reads a series of flashing colors (black/white) as binary and converts it to text. Each series of flashes will simulate a packet of data. I will be using OSX an the integrated webcam in a Macbook, the flashing computer will either be windows or osx.
So my questions are: which programming languages or API's would be best for reading live webcam data and analyzing the color of a certain area as well as programming and timing the flashes? Also, would I need to worry about matching the flash rate of the "writing" computer and the frame capture rate of the "reading" computer?
Thank you for any help you might be able to provide.
Regarding the frame capture rate, Shannon sampling theorem says that "perfect reconstruction of a signal is possible when the sampling frequency is greater than twice the maximum frequency of the signal being sampled". In other words if your flashing light switches 10 times per second, you need a camera of more than 20fps to properly capture that. So basically check your camera specs, divide by 2, lower the resulting a little and you have your maximum flashing rate.
Whatever can get the frames will work. If the light conditions in which the camera works are gonna be stable, and the position of the light on images is gonna be static then it is gonna be very very easy with checking the average pixel values of a certain area.
If you need additional image processing you should probably also find out about OpenCV (it has bindings to every programming language).
To answer your question about language choice, I would recommend java. The Java Media Framework is great and easy to use. I have used it for capturing video from webcams in the past. Be warned, however, that everyone you ask will recommend a different language - everyone has their preferences!
What are you using as the flashing device? What kind of distance are you trying to achieve? Something worth thinking about is how are you going to get the receiver to recognise where within the captured image to look for the flashes. Some kind of fiducial marker might be necessary. Longer ranges will make this problem harder to resolve.
If you're thinking about shorter ranges, have you considered using a two-dimensional transmitter? (given that you're using a two-dimensional receiver, it makes sense) and maybe have a transmitter that shows a sequence of QR codes (or similar encodings) on a monitor?
You will have to consider some kind of error-correction encoding, such as a hamming code. While encoding would increase the data footprint, it might give you overall better bandwidth given that you can crank up the speed much higher without having to worry about the odd corrupt bit.
Some 'evaluation' type material might include you discussing the obvious security risks in using such a channel - anyone with line of sight to the transmitter can eavesdrop! You could suggest in your writeup using some kind of encryption, a block cipher in CBC would do, but would require a key-exchange prior to transmission, so you could think about public key encryption.

Manually feeding x264 with my own motion data?

I am trying to encode a stream using x264 (by feeding individual images), but what's unusual is that I already have some motion information for my frames. I know exactly which areas have been modified in each frame, and I know where motion has occurred in the frame.
Is there a way to feed x264 my own motion information? I'd like to give it motion vectors for given areas in the frame, and somehow tell it that certain areas in the frame are guaranteed to not have had any motion in them.
I think this might significantly improve the performance of the encoding (because I'm allowing the codec to completely skip the motion estimation phase), and should also somewhat increase quality in cases where the encoder's motion estimation algos might have missed the motion that actually occurred.
Do I need to modify the encoder in order to do this, or is this supported in the existing API?
Short answer: No you can't feed in your motion estimation data to x264.
Long Answer: IIRC, x264 does it's work by being fed in the raw frame, with no extra data. To accommodate the motion estimation data you have, you'd have to modify the x264 source code to accomplish this.
You may be able to find what you need within common\mvpred.c or encoder\me.c. I'm not sure how many of the x264 developers actually visit Stack overflow (I know one of their lead developers has an account here) but you can try talking to them through their usual channels on their IRC channel or on the doom9 forums.
doom9: http://forum.doom9.org/forumdisplay.php?f=77
doom10:http://doom10.org/index.php?board=5.0 IRC:
irc://irc.freenode.net/x264 and irc://irc.freenode.net/x264dev
Mailing list: http://mailman.videolan.org/listinfo/x264-devel
I wish I could give you more information, but unfortunately I'm not particularly well versed in the code base. The developers are always willing and able to help anyone wishing to work on x264 though.

Syncing two AS3 NetStreams

I'm writing an app that requires an audio stream to be recording while a backing track is played. I have this working, but there is an inconsistent gap in between playback and record starting.
I don't know if I can do anything to make the sync perfect every time, so I've been trying to track what time each stream starts so I can calculate the delay and trim it server-side. This also has proved to be a challenge as no events seem to be sent when a connection starts (as far as I know). I've tried using various properties like the streams' buffer sizes, etc.
I'm thinking now that as my recorded audio is only mono, I may be able to put some kind of 'control signal' on the second stereo track which I could use to determine exactly when a sound starts recording (or stick the whole backing track in that channel so I can sync them that way). This leaves me with the new problem of properly injecting this sound into the NetStream.
If anyone has any idea whether or not any of these ideas will work, how to execute them, or some alternatives, that would be extremely helpful! Been working on this issue for awhile
The only thing that comes to mind is to try and use metadata, flash media streams support metadata and the onMetaData callback. I assume you're using flash media server for the audio coming in and to record the audio going out. If you use the send method while your streaming the audio back to the server, you can put the listening audio track's playhead timestamp in it, so when you get the 2 streams back to the server you can mux them together properly. You can also try encoding the audio that is streamed to the client with metadata and try and use onMetaData to sync them up. I'm not sure how to do this, but a second approach is to try and combine the 2 streams together as the audio goes back so that you don't need to mux them later, or attach it to a blank video stream with 2 audio tracks...
If you're to inject something into the NetStream... As complex as SOUND... I guess here it would be better to go with Socket instead. You'll be directly reading bytes. It's possible there's a compression on the NetStream, so the data sent is not raw sound data - some class for decompressing the codec there would be needed. When you finally get the raw sound data, add the input in there, using Socket.readUnsignedByte() or Socket.readFloat(), and write back the modified data using Socket.writeByte(), or Socket.writeFloat().
This is the alternative with injecting the back into the audio.
For syncing, it is actually quite simple. Even though the data might not be sent instantly, one thing still stays the same - time. So, when user's audio is finished, just mix it without anything else to the back track - the time should stay the same.
IF the user has slow internet DOWNLOAD, so that his backtrack has unwanted breaks - check in the SWF if the data is buffered enough to add the next sound buffer (usually 4096 bytes if I remember correctly). If yes, continue streaming user's audio.
If not, do NOT stream, and start as soon as the data catches back on.
In my experience NetStream is one of the most inaccurate and dirty features of Flash (NetStream:play2 ?!!), which btw is quite ironic seeing how Flash's primary use is probably video playback.
Trying to sync it with anything else in a reliable way is very hard... events and statuses are not very straight forward, and there are multiple issues that can spoil your syncing.
Luckily however, netStream.time will tell you quite accurately the current playhead position, so you can eventually use that to determine starting time, delays, dropped frames, etc... Notice that determining the actual starting time is a bit tricky though. When you start loading a netStream, the time value is zero, but when it shows the first frame and is waiting for the buffer to fill (not playing yet) the time value is something like 0.027 (depends on the video), so you need to very carefully monitor this value to accurately determine events.
An alternative to using NetStream is embedding the video in a SWF file, which should make synchronization much easier (specially if you use frequent keyframes on encoding). But you will lose quality/filesize ratio (If I remember correctly you can only use FLV, not h264).
no events seem to be sent when a connection starts
sure there does.. NetStatusEvent.NET_STATUS fires for a multitude of reasons for NetConnections and Netstreams, you just have to add a listener and process the contents of NET_STATUS.info
the as3 reference docs here and you're looking for NET_STATUS.info