ffmpeg: generate a txt/csv file from audio levels

Hello community,
I want to use ffmpeg to generate a file (txt, csv) of audio level values.
Any ideas?
I use this command to display the audio levels:
ffplay -f lavfi "amovie=input.aac, asplit [a][out1]; [a] showvolume=f=1:b=4:w=800:h=70 [out0]"
Thanks a lot

The command below generates CSV output where the first column is the audio frame time in seconds, the second column the overall RMS volume in dB for that frame, the third column the RMS volume of the first channel, and the fourth column the RMS volume of the second channel.
ffprobe -f lavfi -i amovie=input.aac,astats=metadata=1:reset=1 -show_entries frame=pkt_pts_time:frame_tags=lavfi.astats.Overall.RMS_level,lavfi.astats.1.RMS_level,lavfi.astats.2.RMS_level -of csv=p=0
Output:
Duration: N/A, start: 0.023220, bitrate: N/A
Stream #0:0: Audio: pcm_f64le, 44100 Hz, stereo, dbl, 5644 kb/s
0.023220,-inf,-inf,-inf
0.046440,-inf,-inf,-inf
0.069660,-inf,-inf,-inf
0.092880,-27.330401,-22.685612,-24.414572
0.116100,-21.141091,-18.986082,-19.931269
0.139320,-20.955719,-18.549085,-19.587788
0.162540,-20.938002,-18.198237,-19.355561
0.185760,-19.852306,-20.032553,-19.941494
0.208980,-20.495281,-21.684953,-21.049508
The reset option determines how often the stats are recalculated. I've set the value to 1, i.e. the stats are calculated for each audio frame in isolation.
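If you want to consume these values programmatically, here is a minimal Python sketch (my own illustration, not part of the original answer; it assumes the same input.aac and that ffprobe is on the PATH). The Duration/Stream banner goes to stderr, so stdout contains only the CSV rows, and "-inf" parses cleanly because Python's float() accepts it:
import csv
import subprocess

# Run the ffprobe command from the answer above and capture its CSV output.
cmd = [
    "ffprobe", "-f", "lavfi",
    "-i", "amovie=input.aac,astats=metadata=1:reset=1",
    "-show_entries",
    "frame=pkt_pts_time:frame_tags=lavfi.astats.Overall.RMS_level,"
    "lavfi.astats.1.RMS_level,lavfi.astats.2.RMS_level",
    "-of", "csv=p=0",
]
out = subprocess.run(cmd, capture_output=True, text=True, check=True).stdout

levels = []
for row in csv.reader(out.splitlines()):
    if len(row) != 4:
        continue  # skip blank or malformed lines
    time, overall, ch1, ch2 = (float(v) for v in row)
    levels.append((time, overall, ch1, ch2))

print(levels[:5])  # e.g. [(0.02322, -inf, -inf, -inf), ...]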

Related

usage of start code for H264 video

I have a general question about the use of the start code (0x00 0x00 0x00 0x01) in H.264 video. I am not clear on its purpose, since there is no reference to it in the RTP RFCs related to H.264, yet I see plenty of references on the net, particularly on Stack Overflow.
I am confused because one client doesn't use this start code while another client does. So I am looking for a specific answer: where should this start code be used, and where shouldn't it?
There are two H.264 stream formats and they are sometimes called
Annex B (as found in raw H.264 stream)
AVCC (as found in containers like MP4)
An H.264 stream is made of NAL units (the stream's unit of packaging).
(1) Annex B: has a start code before each NAL unit's bytes, either the 4-byte form [x00][x00][x00][x01] or the shorter 3-byte form [x00][x00][x01].
[start code]--[NAL]--[start code]--[NAL] etc
(2) AVCC: is size-prefixed, meaning each NAL unit begins with the byte size of that NAL unit.
[SIZE (4 bytes)]--[NAL]--[SIZE (4 bytes)]--[NAL] etc
Some notes:
The AVCC (MP4) stream format doesn't carry any NALs of type SPS, PPS or AU delimiter in-band, since that information is placed in the MP4 metadata (the avcC box of the sample entry).
You'll find the Annex B format in MPEG-2 TS, RTP, and the default output of some encoders.
You'll find the AVCC format in MP4, FLV, MKV, AVI, and similar A/V container formats.
Both formats can be converted into each other.
Annex B to MP4: remove the start codes, insert the length of each NAL, and filter out SPS, PPS and AU delimiter NALs (a sketch follows below).
MP4 to Annex B: remove the lengths, insert start codes, insert the SPS before each I-frame, insert the PPS before each frame, and insert an AU delimiter for each GOP.
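Here is a minimal Python sketch of the Annex B to length-prefixed conversion (my own illustration, not from the original answer; it handles both 3- and 4-byte start codes but leaves the SPS/PPS filtering to the muxer):
import struct

def split_nals(data: bytes):
    """Yield NAL unit payloads from an Annex B stream."""
    i = data.find(b"\x00\x00\x01")
    while i != -1:
        start = i + 3
        nxt = data.find(b"\x00\x00\x01", start)
        if nxt == -1:
            yield data[start:]
            return
        end = nxt
        # If the next start code is the 4-byte form, its leading 00
        # belongs to the start code, not to this NAL.
        if data[end - 1] == 0:
            end -= 1
        yield data[start:end]
        i = nxt

def annexb_to_avcc(data: bytes) -> bytes:
    out = bytearray()
    for nal in split_nals(data):
        out += struct.pack(">I", len(nal))  # 4-byte big-endian NAL length
        out += nal
    return bytes(out)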

Video encoded with ffmpeg not playing in Chrome

I've been trying just about every permutation of options on ffmpeg to get a transcoded video to play in Chrome (39.0.2171.71, 64-bit, on OS X), and so far nothing has worked.
The settings I am currently using look like:
/usr/local/Cellar/ffmpeg/2.4.3/bin/ffmpeg -i source.m4v -vcodec libx264 -pix_fmt yuv420p -profile:v baseline -level 3.0 -preset slower -crf 23 -vf scale=640:360 target.mp4
but I've tried various options from various other answers with no success.
The video-js demo video works fine, so it must be possible somehow. Here's a dump of the encoded video:
*** General Parameters ***
- Name: test-1 (2).mp4
- Container: MP4 - QuickTime
- Size: 3.45 MB
- Duration: 32s 299ms
- Bitrate: 856 Kbps
*** Video Track Parameters ***
- Format: H.264/MPEG-4 AVC
- Bitrate: Max.: --- / Average: 721 Kbps / Min.: ---
- Frame rate (fps): Max.: --- / Average: 30.000 / Min.: ---
- Encoding profile: Baseline#L3.0
- Image size: 640*360
- Pixel Aspect Ratio: Undefined
- Display Aspect Ratio: 16:9
- Interlacing: Progressive
*** First Audio Track Parameters ***
- Format: AAC - MPEG-4 audio
- Bitrate: 128 Kbps
- Resolution: Undefined
- Rate: 44.1 KHz
- Channel(s): 2 (stereo)
- Position: Front: L R
It turns out the metadata was in the wrong place: the moov atom sat at the end of the file. Adding -movflags +faststart to the ffmpeg parameter list moves it to the front, and the video starts playing in Chrome.
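For reference, the original command with the fix applied:
/usr/local/Cellar/ffmpeg/2.4.3/bin/ffmpeg -i source.m4v -vcodec libx264 -pix_fmt yuv420p -profile:v baseline -level 3.0 -preset slower -crf 23 -vf scale=640:360 -movflags +faststart target.mp4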

Create MDAT from I-frame/P-frame fragments

I am creating an MPEG-4 file from an H.264 stream. The H.264 stream comes in NAL format (e.g. 0,0,0,1,67,...,0,0,1,68,...).
Each video frame is transmitted as multiple fragments. For example, frame 1 contains approximately 80 I-frame fragments and frame 2 around 10 P-frame fragments.
I understand that the MDAT atom of the MPEG-4 file is supposed to contain the H.264 stream in NAL format.
I would like to know how these fragments can be combined into a single I-frame before I put it into the MDAT atom of the MPEG-4 file.
I do not want to use any libraries.
Thanks for your help.
You are going to convert an H.264 Annex B NAL stream into MP4 file packets. In order to do that you need to:
Split your original file into NAL units ( 00 00 00 01 yy xx xx ... );
Locate frame boundaries: each H.264 frame typically contains a number of slices and optionally some of these: SPS, PPS, SEI. You'll need to parse the 'yy' octet above to determine what kind of NAL unit you are looking at. Then, to find the boundary of a frame, parse the first part of each slice (the slice header) and compare the frame_num of consecutive slices (see the sketch after this answer).
Once you know the frame boundaries you can form MP4 packets. Each packet will contain exactly one frame, with its NAL units in this format:
l1 l1 l1 l1 yy xx xx ...
l2 l2 l2 l2 yy xx xx ...
So basically you replace each delimiter '00 00 00 01' with an integer holding the length of the NAL unit that follows.
Then, to obtain a correct MP4 header, you'll need to use an MP4 muxer and populate a correct 'avcC' atom inside the sample entry of your video track.
This is a rather tedious process but if you want to get into specifics you can study the source code of JCodec ( http://jcodec.org ): org.jcodec.samples.transcode.TranscodeMain , org.jcodec.containers.mp4.MP4Muxer
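A rough Python sketch of the frame-boundary step (my own simplification, not from the original answer: instead of comparing frame_num across slices, it starts a new frame whenever a slice's first_mb_in_slice, the first exp-Golomb value of the slice header, is 0; emulation-prevention bytes are ignored, which is usually safe for the first few header bytes):
def read_ue(data: bytes) -> int:
    """Read the first unsigned exp-Golomb value from a byte string."""
    bits = "".join(f"{b:08b}" for b in data[:8])
    zeros = len(bits) - len(bits.lstrip("0"))  # count leading zero bits
    return int(bits[zeros : 2 * zeros + 1], 2) - 1

def group_frames(nals):
    """Group NAL unit payloads (start codes already stripped, e.g. by the
    split_nals sketch earlier) into frames. A slice (type 1 or 5) with
    first_mb_in_slice == 0 begins a new frame; note that SPS/PPS/SEI NALs
    preceding such a slice stay attached to the previous frame here."""
    frames, current = [], []
    for nal in nals:
        nal_type = nal[0] & 0x1F
        if nal_type in (1, 5) and read_ue(nal[1:]) == 0 and current:
            frames.append(current)
            current = []
        current.append(nal)
    if current:
        frames.append(current)
    return frames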

Creating "holes" in a binary H.264 bitstream

I am trying to simulate data loss in a video by selectively removing H.264 bitstream data. The data is simply a raw H.264 file, which is essentially a binary file. My plan is to delete 2 bytes for every 100 bytes so as to achieve a 2% loss. Eventually, I will be testing the effectiveness of some motion vector error concealment algorithms.
It would be nice to be able to do this in a Unix environment. So far I have investigated the xxd command and I am able to save a specific portion of a hex dump from a binary file. For example, to skip the first 50 bytes of a binary bitstream and save the subsequent 100 bytes, I would do the following:
xxd -s 50 -l 100 inputBinaryFile | xxd -r > outputBinaryFile
I'm hoping to incorporate something similar into a bash script that will automatically delete the last 2 bytes per 100 bytes. Furthermore, I would like the script to skip everything before the second occurrence of the sequence 00 00 01 06 05 (first P-frame SEI start code).
I don't know how much easier this would be in a C-based language, but my programming skills are quite limited and I would rather stick to Linux shell scripting for now if possible.
Thanks.
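A minimal Python sketch of the plan as stated (my illustration, assuming the marker 00 00 01 06 05 occurs at least twice and reusing the filenames from the xxd example):
data = open("inputBinaryFile", "rb").read()

# Leave everything up to the second occurrence of the marker intact.
marker = b"\x00\x00\x01\x06\x05"
second = data.find(marker, data.find(marker) + 1)
head, tail = data[:second], data[second:]

# Drop the last 2 bytes of every 100-byte chunk: a ~2% loss.
damaged = b"".join(tail[i : i + 98] for i in range(0, len(tail), 100))

open("outputBinaryFile", "wb").write(head + damaged)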

How to encode a stream of RGBA values to video?

More specifically:
I have a sequence of 32-bit unsigned RGBA integers for pixels: e.g. 640 integers per row starting at the left pixel, 480 rows per frame starting at the top row, repeated for n frames. Is there an easy way to feed this to ffmpeg (or some other encoder) without first encoding it to a common image format?
I'm assuming ffmpeg is the best tool for me to use in this case, but I'm open to suggestions (the output video format doesn't matter too much).
I know the documentation would enlighten me if I just knew the right keywords... In case I'm asking the wrong question, here's what I'm trying to do at the highest level:
I have some ActionScript code that draws and animates on the display tree, and I've wrapped it in an AIR application that draws BitmapData frame by frame. AIR has proved woefully inefficient at encoding this output directly: the best I've managed is a few frames per second, and I need to render at least 15 fps, preferably more like the 100 fps I get out of ffmpeg when I feed it PNG images (AIR can take over a second to encode one 640x480 PNG... appalling). Instead of encoding inside AIR, I can send the raw byte data out to an encoder or to disk as fast as it's rendered.
If you're wondering why I'm using ActionScript to render an animation, or why it has to be encoded quickly, don't. Suffice it to say the frames are computed at execution time (not stored as an animation in a .swf file, for example), I have a very large amount of video to create and limited time to do so, and using something other than ActionScript to produce the frames is not an option.
The solution I've come up with is to use x264 instead of ffmpeg.
For testing purposes, I saved the frames as files 00.bin, 01.bin, .. nn.bin, each containing 640x480x4 ARGB pixel values. The command I used to verify that the approach is feasible is the following horrible hack:
cat *.bin | \
perl -e 'while (sysread(STDIN,$d,4)){print pack("N",unpack("V",$d));}' | \
x264 --demuxer raw --input-csp bgra --fps 15 --input-res 640x480 --qp 0 \
--muxer flv -o out.flv -
The ugly perl snippet in there is a hack to swap the four-byte endian order, since x264 can only take BGRA and my test files contained ARGB.
In a nutshell,
ActionScript renders ARGB values into a ByteArray
swap the endianness to BGRA
pipe it to x264: raw demuxer, bgra colorspace, specify fps/w/h/quality (see the sketch below)
??
profit.
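For completeness, the same pipeline as a minimal Python sketch (my illustration; the x264 flags match the command above, and the struct trick performs the same byte reversal as the perl one-liner):
import glob
import struct
import subprocess

# Launch x264 as in the command above, reading raw BGRA frames from stdin.
x264 = subprocess.Popen(
    ["x264", "--demuxer", "raw", "--input-csp", "bgra",
     "--fps", "15", "--input-res", "640x480", "--qp", "0",
     "--muxer", "flv", "-o", "out.flv", "-"],
    stdin=subprocess.PIPE,
)

for path in sorted(glob.glob("*.bin")):
    argb = open(path, "rb").read()
    # Reverse each 4-byte pixel (ARGB -> BGRA): read the words big-endian,
    # write them back little-endian.
    n = len(argb) // 4
    bgra = struct.pack(f"<{n}I", *struct.unpack(f">{n}I", argb))
    x264.stdin.write(bgra)

x264.stdin.close()
x264.wait()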