Manually feeding x264 with my own motion data? - h.264

I am trying to encode a stream using x264 (by feeding individual images), but what's unusual is that I already have some motion information for my frames. I know exactly which areas have been modified in each frame, and I know where motion has occurred in the frame.
Is there a way to feed x264 my own motion information? I'd like to give it motion vectors for given areas in the frame, and somehow tell it that certain areas in the frame are guaranteed to not have had any motion in them.
I think this might significantly improve the performance of the encoding (because I'm allowing the codec to completely skip the motion estimation phase), and should also somewhat increase quality in cases where the encoder's motion estimation algos might have missed the motion that actually occurred.
Do I need to modify the encoder in order to do this, or is this supported in the existing API?

Short answer: No, you can't feed your own motion estimation data into x264.
Long answer: IIRC, x264 does its work by being fed the raw frame, with no extra data. To make it use the motion estimation data you have, you'd have to modify the x264 source code.
You may be able to find what you need within common\mvpred.c or encoder\me.c. I'm not sure how many of the x264 developers actually visit Stack Overflow (I know one of their lead developers has an account here), but you can try talking to them through their usual channels: their IRC channel or the doom9 forums.
doom9: http://forum.doom9.org/forumdisplay.php?f=77
doom10: http://doom10.org/index.php?board=5.0
IRC: irc://irc.freenode.net/x264 and irc://irc.freenode.net/x264dev
Mailing list: http://mailman.videolan.org/listinfo/x264-devel
I wish I could give you more information, but unfortunately I'm not particularly well versed in the code base. The developers are always willing and able to help anyone wishing to work on x264 though.

Related

Get microphone input and read data

I have a Square reader which has audio output that I'm trying to read via the web browser. I'm using getUserMedia to get the audio stream from the microphone, and the Web Audio API to get the data. My problem is that all the tutorials I've found mention using the data stream to show visuals for the audio stream. In my implementation, I need to be able to differentiate the card-swipe sound from the background noise.
I have a couple basic questions:
What is FFT size? I know it's Fast Fourier Transform, but I don't know how or if it pertains to what I'm doing.
I understand buffer size but how do I know how big my buffer should be?
I've seen multiple things regarding left and right audio. I would assume the square reader would only have one channel for simplicity's sake, would this be correct?
And finally, the most important question, how do I combine it all together to make it so I read the frame that contains the square swipe data? I have this resource but I'm not getting data as 1s and 0s from the frequency or time data values of the stream analyser.
This link has more information about what needs to be done, but I can't do it until I get the data as bytes.
You will probably need to do this as a ScriptProcessor, not using the Analyser. The Analyser will not guarantee you won't drop data in between processing blocks. I'm not familiar with precisely how the Square reader transmits data; I presume it's frequency-shift-keying (FSK) - more at http://www.creativedistraction.com/demos/sensor-data-to-iphone-through-the-headphone-jack-using-arduino/, and interesting teardown at http://andybromberg.com/credit-cards/. An example of using ScriptProcessor at https://github.com/cwilso/volume-meter/.
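To show why gap-free blocks matter, here is a rough sketch in Python rather than actual Web Audio code. It walks a list of samples in back-to-back blocks, the way a ScriptProcessor hands you consecutive buffers, and flags the blocks that are louder than the background. The function names, block size and threshold are made up for illustration, and this only locates where a swipe might be; it does not decode whatever encoding the Square reader really uses.

```python
import math


def block_rms(block):
    """Root-mean-square level of one block of samples."""
    return math.sqrt(sum(s * s for s in block) / len(block))


def find_active_blocks(samples, block_size, threshold):
    """Return the indices of blocks whose RMS level exceeds the threshold.

    Blocks are taken back to back, so no sample is skipped between them.
    A trailing partial block is ignored.
    """
    active = []
    for start in range(0, len(samples) - block_size + 1, block_size):
        block = samples[start:start + block_size]
        if block_rms(block) > threshold:
            active.append(start // block_size)
    return active


# Hypothetical input: silence, a short burst, then silence again.
samples = [0.0] * 8 + [0.5, -0.5] * 4 + [0.0] * 8
print(find_active_blocks(samples, block_size=4, threshold=0.1))
```

Once you know which blocks contain the swipe, you can pass just those samples to whatever demodulation step turns out to be appropriate.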

Automatically Generated rhythm game Flash Action Script 3

Is it possible to create an automatically generated Rhythm game for Flash Action Script 3 ?
But not just randomly generated, generated from the notes of a song. Or is that something I have to do manually?
How would I go about doing either of these?
I am currently following this tutorial: http://www.flashgametuts.com/tutorials/as3/how-to-make-a-rhythm-game-in-as3-part-4/ so perhaps it can be made to fit around this? (Go to the final part and View Source to see the full thing)
Thanks!
Depending on what you mean by rhythm game, check out the computeSpectrum() function of the SoundMixer class: http://help.adobe.com/en_US/FlashPlatform/reference/actionscript/3/flash/media/SoundMixer.html#computeSpectrum()
There's an example of it working in the link, but basically what it does is take a snapshot of the current sound wave and puts normalised (-1 to 1) values in a ByteArray. What you do with those values is up to you - e.g. you might use them as a height field to generate terrain for example.
Repeat this every frame, and you get the gist
Welcome to SO!
First off, there is nothing already built in, to my knowledge. There may be something lurking around Google that someone else wrote, but you'd need to dig around for that (though I assume you already did.)
Generated from the notes of a song. Hmm, this will take some serious ingenuity and coding on your part. I'll point you in the right direction, but it is up to you to write the code. No one here will do it for you, but we'll happily help with specific problems in your code.
The crazy (yet potentially more fun) approach MAY BE to derive the data in a similar manner that an audio visualizer does...but I can't guarantee that will work. This would work best with MIDI-generated, single instrument songs. Here is a tutorial on visualizers.
A second approach may be to actually convert MIDI files directly. Again, I can't guarantee it will work, but it would theoretically be possible, seeing how MIDI files store data to begin with. Here's an answer on playing MIDI files, to get you started. Consider looking through their class.
However, the "easiest" approach would be to come up with some sort of system by which you store the note values for a song. You can manually enter the values in an array, or in a data file (such as XML) that you can load.
I put "easiest" in quotes because you'd have to account for a LOT of information - not just note values, but note duration, rhythm, and rests.
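To make that last approach a bit more concrete, here is a minimal sketch of one way to store a hand-entered chart. It is written in Python just to show the data layout; the same idea carries over to an AS3 array or an XML file. The field names, the lane numbering and the tempo are all assumptions for illustration, and rests are simply the gaps between notes.

```python
from dataclasses import dataclass


@dataclass
class Note:
    start_beat: float    # when the note begins, counted in beats from the song start
    length_beats: float  # how long the note is held
    lane: int            # which key/column the note appears in (hypothetical layout)


def beats_to_seconds(beats, bpm):
    """Convert a position in beats to seconds at a fixed tempo."""
    return beats * 60.0 / bpm


def notes_due(chart, bpm, from_sec, to_sec):
    """Return the notes whose start time falls in [from_sec, to_sec)."""
    return [
        note for note in chart
        if from_sec <= beats_to_seconds(note.start_beat, bpm) < to_sec
    ]


# A hand-entered chart: the gap between beat 1.5 and beat 3 acts as a rest.
chart = [Note(0, 1, 0), Note(1, 0.5, 2), Note(3, 1, 1)]
print(notes_due(chart, bpm=120, from_sec=0.4, to_sec=1.0))
```

Each frame, the game would ask for the notes due in the next slice of time and spawn them.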
Anyway, those are just a few ideas to get you started. Good luck!

How can I analyze live data from webcam?

I am going to be working on self-chosen project for my college networking class and I just had a couple questions to help get me started in the right direction.
My project will involve creating a new "physical" link over which data, in the form of text, will be transmitted from one computer to another. This link will involve one computer with a webcam that reads a series of flashing colors (black/white) as binary and converts it to text. Each series of flashes will simulate a packet of data. I will be using OS X and the integrated webcam in a MacBook; the flashing computer will run either Windows or OS X.
So my questions are: which programming languages or APIs would be best for reading live webcam data and analyzing the color of a certain area, as well as programming and timing the flashes? Also, would I need to worry about matching the flash rate of the "writing" computer and the frame capture rate of the "reading" computer?
Thank you for any help you might be able to provide.
Regarding the frame capture rate, the Shannon sampling theorem says that "perfect reconstruction of a signal is possible when the sampling frequency is greater than twice the maximum frequency of the signal being sampled". In other words, if your flashing light switches 10 times per second, you need a camera of more than 20fps to properly capture that. So basically check your camera specs, divide the frame rate by 2, lower the result a little, and you have your maximum flashing rate.
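The arithmetic from that rule of thumb, as a tiny Python sketch. The function name and the 0.8 safety margin are my own assumptions; pick whatever margin your camera's real timing jitter calls for.

```python
def max_flash_rate(camera_fps, safety_margin=0.8):
    """Rough upper bound on light switches per second the camera can follow.

    Halves the frame rate (the sampling limit described above), then
    lowers it a little with an assumed safety margin.
    """
    return (camera_fps / 2.0) * safety_margin


# A hypothetical 30fps webcam.
print(max_flash_rate(30))
```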
Whatever can get the frames will work. If the light conditions in which the camera works are gonna be stable, and the position of the light on images is gonna be static then it is gonna be very very easy with checking the average pixel values of a certain area.
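A minimal sketch of that "average pixel value of a certain area" check, in plain Python. It assumes each frame has already been captured and converted to a grayscale grid (a list of rows of 0-255 values); the region coordinates and the threshold of 128 are placeholders you would tune for your lighting.

```python
def region_mean(frame, top, left, height, width):
    """Mean brightness of a rectangular area of a grayscale frame."""
    total = 0
    for row in frame[top:top + height]:
        total += sum(row[left:left + width])
    return total / (height * width)


def frame_to_bit(frame, region, threshold=128):
    """Read one bit from a frame: 1 if the watched area is bright, else 0."""
    top, left, height, width = region
    return 1 if region_mean(frame, top, left, height, width) >= threshold else 0


# Two hypothetical 4x4 frames: one with the light on, one with it off.
bright = [[255] * 4 for _ in range(4)]
dark = [[10] * 4 for _ in range(4)]
watched = (1, 1, 2, 2)  # top, left, height, width
print(frame_to_bit(bright, watched), frame_to_bit(dark, watched))
```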
If you need additional image processing you should probably also find out about OpenCV (it has bindings for many programming languages).
To answer your question about language choice, I would recommend Java. The Java Media Framework is great and easy to use. I have used it for capturing video from webcams in the past. Be warned, however, that everyone you ask will recommend a different language - everyone has their preferences!
What are you using as the flashing device? What kind of distance are you trying to achieve? Something worth thinking about is how are you going to get the receiver to recognise where within the captured image to look for the flashes. Some kind of fiducial marker might be necessary. Longer ranges will make this problem harder to resolve.
If you're thinking about shorter ranges, have you considered using a two-dimensional transmitter? (given that you're using a two-dimensional receiver, it makes sense) and maybe have a transmitter that shows a sequence of QR codes (or similar encodings) on a monitor?
You will have to consider some kind of error-correction encoding, such as a Hamming code. While encoding would increase the data footprint, it might give you overall better bandwidth given that you can crank up the speed much higher without having to worry about the odd corrupt bit.
Some 'evaluation' type material might include you discussing the obvious security risks in using such a channel - anyone with line of sight to the transmitter can eavesdrop! You could suggest in your writeup using some kind of encryption; a block cipher in CBC mode would do, but would require a key exchange prior to transmission, so you could think about public key encryption.
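As an illustration, here is a sketch of the common Hamming(7,4) variant in Python: 4 data bits become 7 transmitted bits, and any single flipped bit in those 7 can be corrected. Choosing (7,4) is my assumption; other Hamming codes or a different scheme entirely may suit your link better.

```python
def hamming74_encode(data):
    """Encode 4 data bits into a 7-bit codeword with 3 parity bits."""
    d1, d2, d3, d4 = data
    p1 = d1 ^ d2 ^ d4  # checks codeword positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # checks codeword positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4  # checks codeword positions 4, 5, 6, 7
    return [p1, p2, d1, p3, d2, d3, d4]


def hamming74_decode(codeword):
    """Correct up to one flipped bit and return the 4 data bits."""
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    # The syndrome is the 1-based position of the bad bit, or 0 if none.
    syndrome = s1 + 2 * s2 + 4 * s3
    if syndrome:
        c[syndrome - 1] ^= 1
    return [c[2], c[4], c[5], c[6]]


sent = hamming74_encode([1, 0, 1, 1])
received = list(sent)
received[4] ^= 1  # simulate one misread flash
print(hamming74_decode(received))
```

Note that two corrupted bits in the same 7-bit block will be "corrected" to the wrong data, so this only helps with the odd isolated error.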

Object & Shape Recognition from webcam

I need to create an application to get input from a webcam or camera connected to a computer and detect certain 3d objects.
I could do this from a .3ds file or something else? I'm not quite sure.
I am pretty sure it is possible with Flash AS3? I have been looking into OpenCV but I can't find any examples of this kind of thing.
Any help would be great. If you have any further questions, please ask.
Thanks
Frank
EDIT: Oh, and I need this to be a web-based solution, so I was thinking of Python, AS3, something along those lines.
To detect a "3D object" through an inherently 2D medium (a bitmap captured by a camera) is a very complex thing, and requires the detection of lit and shaded areas and how they move in respect to an often known light source. What you likely want to do instead (unless you have access to hardware with a depth buffer, e.g. the Kinect) is to analyze the 2D picture for 2D shapes, i.e. the silhouette of the object that you're looking for.
Have a look at ASFEAT and IN2AR, which are made by the same Russian wunderkind as ASSURF, but are actively developed and do not use patented algorithms.
OpenCV (the port of which to Flash/AS3 is called Marilena) might do the trick, but it's not as optimized for Flash, and requires fairly complex descriptor files. I believe the only ones that are readily available are for face detection.
Your best bet is probably ASSURF; it won't do detection of 3D models, but it will do 2D shapes.

Solar system computer model

I'm interested in building a 3D model of our solar system for web use (probably with AS3 and Papervision) and have been looking into how I would go about encoding the planetary positions. My idea was to download the already calculated positions from NASA, as calculating the positions myself seems a bit overcomplicated. I'm not sure, though, whether I should use a heliocentric or an earth-centric encoding.
I wanted to know if there is anyone with any experience in this. Which approach would be better? The NASA JPL website seems to have the positions of all the major bodies in our solar system as earth-centric. I can see this becoming a problem later on, though, when adding Voyager and Mars Lander missions to the model?
Any feedback, comments and links are very welcome.
EDIT: I have a rough model running that uses heliocentric coordinates, but I haven't been able to find the coordinates for all planets in this format.
UPDATE:
I don't have a lot of detail to provide for now because I really don't know what I'm doing (from the space point of view). I wanted to get a handle on 3D programming, and am interested in space. The idea was that I would make a rough solar system simulator with at first all the planets and their orbiters (maybe excluding satellites at first). Perhaps include a news aggregator and some links to news/resources and so on. The general idea would be to allow people to click around and get super excited about going to the moon and Mars (for a starter).
In the long run I hopefully would be able to add in satellites and the moon missions (scroll back in time to the 70's and see the moon missions).
So to answer Arrieta's question the idea was not to calculate eclipses but to build an easy to approach, interactive space exploratorium, and learn some 3D and space related stuff on the way.
Glad you want to build your own simulator, but depending on what you want to do it may be far from an easy task. The simplest approach is as follows:
Download the JPL-DE405 ephemerides and the subroutines for retrieving the planetary positions (wrt Solar System Barycenter).
Request for timespan, compute the positions, and display them to the screen in a visually appealing manner
Done
Now, why would you want to do this? If you want to view the planets' orbits, that's it. You are done. If you want to compute geometric events (like eclipses, or line-of-sight, or illumination) then you are in a whole different ball game. That's astronautics, and it is not simple.
Please be more specific. The distinction you make of "geocentric" or "heliocentric" coordinates really has no major difficulty involved. If you have all the states in heliocentric frame, you can compute the geocentric frame by simple vector subtraction. That's not the problem! The problems are a thousand more, but you need to be specific so we can provide more guidance.
JPL has provided high quality ephemerides for decades now, and we have a full team of brilliant people working on it. It is one of the most difficult things to get right!
Again, provide more details or check out other sources of information.
Please google "Solar System Simulator" (done here, at JPL) and see if it fulfills your needs.
Cheers.
It may be worth you checking out the ASCOM Platform (we also have a stack exchange site called ASCOM Answers).
The ASCOM Platform has several useful libraries for doing this sort of thing.
USNO NOVAS (Naval Observatory Vector Astrometry)
Kepler orbit engine
The USNO/NOVAS stuff was originally written in C and we've wrapped it up in .NET for ease of use from C# and VB.
As an added bonus (actually it's the raison d’être for ASCOM), the Platform makes it easy for you to control things like telescopes; it's used by Microsoft's World Wide Telescope for exactly that purpose. It might be a fun extension to your model to be able to point a telescope at things.
I'd probably start (well, I did a while back) with heliocentric coordinates and get a few of the planets up and running. But sooner or later you'll want to write a heliocentric-to-geocentric coordinate conversion routine, and its inverse. For some bodies, such as artificial satellites the geocentric coordinates will be easier to deal with.
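The conversion and its inverse are just vector subtraction, as mentioned above. Here is a minimal Python sketch, assuming both frames use the same axis orientation, the same units and the same instant in time; the function names and the sample positions (in AU) are made up for illustration.

```python
def subtract(a, b):
    """Component-wise difference of two 3D vectors."""
    return tuple(x - y for x, y in zip(a, b))


def helio_to_geo(body_helio, earth_helio):
    """Position of a body relative to Earth, from heliocentric positions."""
    return subtract(body_helio, earth_helio)


def geo_to_helio(body_geo, sun_geo):
    """Position of a body relative to the Sun, from geocentric positions."""
    return subtract(body_geo, sun_geo)


# Hypothetical heliocentric positions in AU.
earth_helio = (1.0, 0.0, 0.0)
mars_helio = (0.0, 1.5, 0.0)

mars_geo = helio_to_geo(mars_helio, earth_helio)
sun_geo = helio_to_geo((0.0, 0.0, 0.0), earth_helio)
print(mars_geo, geo_to_helio(mars_geo, sun_geo))
```

The Sun's geocentric position is just the negated heliocentric position of Earth, so one table of positions in either frame is enough to derive the other.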
You can use the astro-phys api to get a JSON formatted state vector for all the planets. It calculates them using JPL's de406 so it's pretty accurate and uses the solar system barycenter.
Although, if you know where the sun is relative to the earth and you're in a geocentric model, you can subtract the position of the sun from all of the bodies (including earth) to be heliocentric.