H.264 encoding with B-frames: frame count

Suppose I encode three YUV frames with a simple H.264 configuration and get the frame pattern IPIP.
But when I allow B-frames in encoding, I get IBPBIPBP...
That is clearly more frames than in the simple case, so do we play these frames at a higher rate to get back the original three frames?
In other words, how is this related to actual time?

Encoders generate B-frames (if they have the capability) not to play tricks with the frame rate, but to encode those frames with fewer bits in the channel, or to get higher quality at the same channel bitrate than a plain IPPPPIPPPP pattern would. The number of coded pictures still matches the number of source frames; with B-frames they are simply transmitted in decode order rather than display order, and each picture carries a timestamp, so the player presents them at the original rate.
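A concrete illustration (a hypothetical three-frame clip at 30 fps): the pictures travel in decode order but carry presentation timestamps, so the player reorders them on display.

Display order (PTS):  I0 (0 ms)   B1 (~33 ms)   P2 (~67 ms)
Decode order (sent):  I0          P2            B1

The B-frame arrives after the P-frame it predicts from, but its timestamp still places it in the middle, so nothing about the playback rate changes.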

Related

Progressive JPEG vs. baseline JPEG

I have a web gallery where I display images, uploaded by users, that vary in file size and resolution. Currently all the images are baseline, so I would like to know whether converting them to progressive would have any significant impact. What are the advantages and trade-offs of using progressive images?
The JPEG standard defines a variety of compression modes. Only three of these are in widespread use:
Baseline Sequential
Extended Sequential
Progressive
The only difference between the first two is the number of tables allowed. Otherwise, they are encoded and decoded in exactly the same way.
JPEG divides images into frames, which are then divided into scans. The modes above permit only one frame, and the frame is the image. The scans are passes through the image data. A scan may contain the data for one color component, or it may be interleaved and contain data for multiple color components.
A grayscale sequential JPEG stream will have one scan.
A color sequential JPEG stream may have one or three scans.
JPEG takes 8x8 blocks of pixel data and applies the discrete cosine transform to them. The 64 pixel values become 64 DCT coefficients. The first DCT coefficient is called the "DC" coefficient and the other 63 are called "AC" coefficients.
This is confusing terminology drawn from the analogy with DC and AC current. The DC coefficient is analogous to the average pixel value of the block.
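To make the analogy concrete: plugging u = v = 0 into the JPEG forward DCT turns every cosine term into 1, so the DC coefficient reduces to a scaled block average:

F(0,0) = (1/4) · C(0) · C(0) · ΣΣ f(x,y) = (1/8) · ΣΣ f(x,y) = 8 · mean(f)

since C(0) = 1/√2 and the sum runs over all 64 pixels of the block.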
In sequential JPEG, the 64 coefficients in a block are encoded together (with the DC and AC coefficients coded differently). In progressive JPEG, each scan encodes a band of coefficients (spectral selection) and/or a bitfield of configurable size within those coefficients (successive approximation). In theory, you could have a separate scan for each bit of each coefficient.
Progressive JPEG is much more complicated to implement and use. If you are creating an encoder for sequential JPEG, you just need to give the caller the option of interleaved or non-interleaved scans. For progressive JPEG, your encoder needs a mechanism for the caller to specify how many scans there are and which coefficients and bits are encoded in each scan.
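For a concrete reference point: libjpeg, for example, exposes this through a scan script and ships a canned one. A minimal sketch in C, assuming an already-created compress object (destination setup and error handling omitted):

#include <jpeglib.h>

/* Switch an already-configured compressor to progressive output. */
void make_progressive(struct jpeg_compress_struct *cinfo)
{
    jpeg_set_defaults(cinfo);        /* start from sequential defaults */
    jpeg_simple_progression(cinfo);  /* install libjpeg's canned progressive scan script */
}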
Progressive encoding can be slower than sequential because you have to make multiple passes over the data.
The speed issue in progressive decoding depends upon how it is done. If you decode the entire image at once, progressive is possibly marginally slower than sequential. If your decoder shows the image fading in as it processes the stream it will be much slower than sequential. Each time you update the display, you have to do the inverse DCT, upsampling, and color transformation.
On the other hand, it is possible to get much better compression using progressive JPEG with well-tuned scans.
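To illustrate what tuned scans can look like, libjpeg also accepts a caller-supplied script. A hypothetical spectral-selection-only script for a grayscale image (each entry lists comps_in_scan, component_index, Ss, Se, Ah, Al):

#include <jpeglib.h>

/* Hypothetical three-scan script for a single-component (grayscale) image. */
static const jpeg_scan_info gray_script[] = {
    { 1, { 0 }, 0,  0, 0, 0 },  /* scan 1: DC coefficient (must come first) */
    { 1, { 0 }, 1,  5, 0, 0 },  /* scan 2: lowest-frequency AC coefficients */
    { 1, { 0 }, 6, 63, 0, 0 },  /* scan 3: remaining AC coefficients        */
};

void use_gray_script(struct jpeg_compress_struct *cinfo)
{
    cinfo->scan_info = gray_script;
    cinfo->num_scans = sizeof(gray_script) / sizeof(gray_script[0]);
}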
There is no difference in quality between progressive and sequential JPEG.
This book describes the processes:
https://www.amazon.com/Compressed-Image-File-Formats-JPEG/dp/0201604434/ref=asap_bc?ie=UTF8
The only difference is that progressive images are encoded in a way that allows browsers to display a rough preview while the image is still downloading, a preview that gets progressively better in quality until the download completes. A baseline image loads from top to bottom; a progressive image loads from low resolution to high resolution.
For browsers which do not support progressive images, you won't see anything until the entire image has been loaded. (Nowadays all halfway modern browsers support progressive JPEGs.)
You can see animations of the difference in action, e.g. here: https://www.youtube.com/watch?v=TOc15-2apY0

Enhance text quality in x264 encoding

I'm making use of x264 for remote desktop streaming. The goal is to achieve both low bitrate and high video quality within the computation budget. The current parameter set I use almost achieves this goal, but it fails on images containing a lot of text (e.g. a web-browsing scene). The text in the image is blurred, which hurts the user experience.
I think it's the quantization in x264 that causes this. The quantization after the DCT eliminates high-frequency signals, which mainly correspond to text in the image.
So, my question is: how can I improve text quality in x264 encoding?
My idea: when the bitrate stays at a low level for a period of time,
1. set CRF to 0 (lossless);
2. encode the current frame as an IDR frame and then send it;
3. restore the previous CRF.
Also, a flag should be used to prevent resending when the bitrate stays low for a long time. I haven't tried this method, since I don't know how to mark a frame as an IDR frame manually and then encode it.
The answer to your question might be here: x264: the best low-latency video streaming platform in the world. This might also be related: Psy RDO
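On the sub-question of marking a frame as an IDR frame manually: the x264 C API lets you request a picture type per frame. A minimal, untested sketch (field names are from x264.h; whether a per-frame QP of 0 behaves as truly lossless depends on the profile in use):

#include <x264.h>

/* Hypothetical helper: force the next picture to be an IDR frame at QP 0.
   Assumes h is an already-opened encoder and pic a filled input picture. */
int encode_forced_idr(x264_t *h, x264_picture_t *pic)
{
    x264_nal_t *nals;
    int n_nal;
    x264_picture_t pic_out;

    pic->i_type = X264_TYPE_IDR;  /* request an IDR instead of letting x264 decide */
    pic->i_qpplus1 = 1;           /* forced QP + 1, so 1 means QP 0 for this frame */

    return x264_encoder_encode(h, &nals, &n_nal, pic, &pic_out);  /* bytes written, <0 on error */
}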

Writing an H.264 encoder/decoder from scratch?

I am thinking about writing an H.264 encoder/decoder from scratch so I can integrate the technology into a composite product. The first implementation would be written in Java.
I am used to implementing scientific papers and the like, so I should at least have the basic mathematical understanding.
What would be the best starting point, and what should I focus on? I know that H.264 is basically a mix of existing techniques.
What are the most important things to implement?
Any idea how many hours of work the first useful version will take?
The main objective is to be very fast while maintaining good compression.
How many hours? Maybe 20,000. The decoder specification alone is a document of over 750 pages. And the decoder is the easy part.
After consideration :), I will stick with my own solution based on PNG and JPEG, without motion vectors. I wrote a small solution that compresses parts of the image with those formats and uses filters to degrade quality by applying blur, reducing the number of colors, or even lowering the resolution. It works well enough for now.
If I need better quality, I will start looking at VP9 in more detail.
The only drawback is the lack of hardware encoding support, which might force me/us to look into H.264 again.
Currently I can deliver 60+ frames per second in everyday situations and scale down to 15 frames per second for video content, with poor quality, but it is good enough to grab a security-cam screen and see if something is wrong.

Is there a way to play a sound by specifying the frequency in Hz instead of musical note in SION?

I have been searching for ways to produce sound in AS3 and found SION: https://sites.google.com/site/sioncenter/
It seems great, but I have one issue: I need to play tones at specific frequencies. The only option I can find is to specify pitches as musical notes (A, B, C, etc.), but I need to specify the frequency in Hz (30 Hz, 100 Hz, etc.).
Is there a way to do this in SION?
If not, is there an alternative to SION? I need a sine wave generator.
You don't need SION for that. You can do it with the Sound class.
Here's a tutorial that shows exactly how to do it:
http://www.bit-101.com/blog/?p=2669
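In case that link goes stale: the heart of such a generator is a phase accumulator stepped by the desired frequency. A minimal sketch in C rather than AS3, since the math is identical in any language (the 44100 Hz rate is assumed because that is what Flash's dynamic sound API uses):

#include <math.h>

#define TWO_PI      (2.0 * 3.14159265358979323846)
#define SAMPLE_RATE 44100.0  /* the rate Flash's dynamic sound API expects */

/* Fill buf with n samples of a sine wave at freq Hz.
   *phase is kept across calls so consecutive buffers join without clicks. */
void sine_fill(float *buf, int n, double freq, double *phase)
{
    double step = TWO_PI * freq / SAMPLE_RATE;
    for (int i = 0; i < n; i++) {
        buf[i] = (float)sin(*phase);
        *phase += step;
        if (*phase >= TWO_PI)
            *phase -= TWO_PI;  /* keep the phase small to avoid precision drift */
    }
}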

Can I start or seek to sub-second intervals with NetStream?

I am working on a Flash Video player and am implementing the ability to start a video at x time within the span of an FLV (served from FMS). I am able to start it x seconds into a stream without any issue using
netStream.play(source, startTime);
but as far as I can tell, it only supports seconds. I am looking to be able to give a start time (or even a seek time if that is supported) in milliseconds, or really anything more precise than whole seconds.
Anyone know of any way to achieve this even by monkey patching the fl classes?
Thanks,
Doug
Well, seek() will let you jump around in the stream, but the spec says it only seeks to the closest I-frame in the FLV. That will be a problem with any method you use, because the I-frames are the only frames that actually contain the whole picture; the frames in between only encode changes relative to their neighbors.