detect key-frame in TS with H264 codec

Is there an easy, not horrifyingly complex, way to detect a key frame in an H264 video stream wrapped in a Transport Stream?
Also, if extra previous packets are needed to decode the key frame, is there a way to find those as well?

There is no super simple way of finding the I-frame. You have to read the transport stream packets of the AVC stream, assemble the packetized elementary stream (PES) packets, strip the PES header, and then identify NAL units of type 5 (IDR).
So you will need a transport stream demuxer, find the beginning of the PES packets, and do minimal H.264 parsing.
For demuxing you could look at this source code: http://tsdemuxer.googlecode.com/svn/trunk/v1.0/tsdemux.cpp
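As a rough sketch of the last step: once you have reassembled the PES payload into an Annex B byte stream, finding the IDR comes down to scanning for start codes and checking the NAL type. Something like the following (illustrative only; it ignores emulation-prevention bytes and handles both 3- and 4-byte start codes, since the 4-byte form just adds a leading zero):

#include <cstddef>

// Sketch: does this Annex B buffer (reassembled PES payload) contain an IDR?
bool containsIdrFrame(const unsigned char *buf, std::size_t len) {
    for (std::size_t i = 0; i + 4 <= len; ++i) {
        // Match the 3-byte start code 00 00 01.
        if (buf[i] == 0 && buf[i + 1] == 0 && buf[i + 2] == 1) {
            unsigned char nalType = buf[i + 3] & 0x1F; // low 5 bits of the NAL header
            if (nalType == 5)  // 5 = coded slice of an IDR picture
                return true;
        }
    }
    return false;
}

As for the extra packets needed before the key frame: look for NAL types 7 (SPS) and 8 (PPS), which normally precede the IDR and are required to decode it.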

Related

How to specify the framerate with Live555 RTSPServer?

I'm trying to put a raw H264 stream into RTSP with Live555 but I ran into a framerate problem.
I've learned that raw H264 streams do not have timestamps.
I'm using Live555's H264VideoFileServerMediaSubsession.
The problem is that my raw H264 stream is produced at 20 fps, but when a client (ffplay) plays the RTSP stream (with the H264 encapsulated in it), it defaults to 25 fps.
Shall I set (force) the framerate in Live555's subsession, or should the raw H264 stream already contain it?
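One possible approach, in case it helps: if you move from H264VideoFileServerMediaSubsession to a custom FramedSource, you can stamp the timing yourself through the fPresentationTime and fDurationInMicroseconds members that live555 exposes to subclasses. A sketch for a fixed 20 fps (the class and everything except the live555 members is made up):

// Sketch: fragment of a hypothetical FramedSource subclass, fixed at 20 fps.
void MyH264Source::doGetNextFrame() {
    // ... copy one NAL unit into fTo and set fFrameSize here ...

    if (fPresentationTime.tv_sec == 0 && fPresentationTime.tv_usec == 0) {
        gettimeofday(&fPresentationTime, NULL); // first frame: wall-clock time
    } else {
        // Advance by one frame period: 1000000 us / 20 fps = 50000 us.
        fPresentationTime.tv_usec += 50000;
        fPresentationTime.tv_sec += fPresentationTime.tv_usec / 1000000;
        fPresentationTime.tv_usec %= 1000000;
    }
    fDurationInMicroseconds = 50000; // tells live555 the frame duration

    FramedSource::afterGetting(this);
}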

Why h264 over rtp doesn't contain NALU Start Codes

I read https://stackoverflow.com/a/24890903/12279500. But when I look at h264 over RTP I can recognize SPS, PPS, IDR and so on, yet I don't see an h264 start code before each NALU.
Why is that?
And how many h264 formats are there, not including Annex B and AVCC?
RTP has its own payload format, described in RFC 6184.
As for how many formats there are, assume infinite because nothing is stopping anybody from creating more.
The start codes are used to split the NALUs in a byte stream because the NAL unit header carries no length information. In RTP, however, each NALU sits in the payload field of a packet, so start codes are not needed; you only need to split the RTP packets.
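So if you need to hand the depacketized NALUs to a decoder that expects an Annex B byte stream, you put the start codes back yourself. A minimal sketch (it assumes single-NAL-unit packets; FU-A fragments and STAP-A aggregates need extra handling):

#include <cstddef>
#include <vector>

// Sketch: wrap one RTP payload (a single NAL unit) in an Annex B start code.
std::vector<unsigned char> toAnnexB(const unsigned char *payload, std::size_t len) {
    static const unsigned char startCode[4] = { 0, 0, 0, 1 };
    std::vector<unsigned char> out(startCode, startCode + 4);
    out.insert(out.end(), payload, payload + len);
    return out;
}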

Container format of this RTSP stream

I would like to know the container format of the following stream:
rtsp://8.15.251.47:1935/rtplive/FairfaxVideo3595
According to ffprobe, the container format is RTSP (format_long_name = RTSP input).
I also looked through the debug messages in VLC, but I did not find any information on the stream's container format. What I DID find was that the codec was H264 and that VLC was using live555 to decode the stream. The list of media files live555 can support, according to their website (http://www.live555.com/mediaServer/), makes me think that the above stream is an H264 elementary stream in no container format at all. Am I correct?
Also, if the stream indeed does not have a container format, is it ok to say the container format is RTP (not RTSP as ffprobe says) because that's the protocol used to send the media data?
Thanks!
RTSP is more of a handshake done with the server, while RTP is the actual stream coming in once the handshake is done and you start streaming. RTSP URLs usually start with rtsp://... and the sequence of requests goes roughly like
RTSP DESCRIBE, RTSP SETUP, RTSP PLAY, TEARDOWN
The response from the server to DESCRIBE will contain the information you need to know about the encoding of the file (H264, JPEG, etc.) while PLAY will cause the server to start sending the RTP stream. I suggest looking up RTSP SDP (session description protocol) for how to extract this information.
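For illustration, the DESCRIBE response carries an SDP body, and the interesting lines look roughly like this (values made up; the a=rtpmap line is where the codec shows up):

m=video 0 RTP/AVP 96
a=rtpmap:96 H264/90000
a=fmtp:96 packetization-mode=1;sprop-parameter-sets=...
a=control:trackID=0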
In the case of streams you are most likely correct, since the protocol used for streaming is usually RTP, and it tends to go hand in hand with RTSP (however, I'm unsure whether the term container even applies in the context of streaming).

Use Media Foundation H.264 encoder with live555

I want to create an H.264 RTSP stream using the live555 streaming library. For encoding the video frames, I want to use the H.264 encoder MFT. Encoding works using the basic processing model (I do not build a graph, but call the MFT manually). Streaming using a custom FramedSource source also seems to work in the sense that the programme is not crashing and the stream is stable in VLC player. However, the image is crippled - no colour, weird line patterns etc.
I assume that I pass the wrong data from the encoder into the streaming library, but I have not been able to find out what the library is actually expecting. I have read that the Microsoft H.264 encoder outputs more than one NAL in a sample. I further found that live555 requires a single NAL to be returned in doGetNextFrame. Therefore, I try to identify the individual NALs (What does this H264 NAL Header Mean? states that the start code can be 3 or 4 bytes - I do not know where to find out what MF uses, but the memory view of the debugger suggests 4 bytes):
// Scan for the four-byte Annex B start code 00 00 00 01.
// Stop four bytes before the end so we never read past the buffer,
// and use memcmp to avoid the unaligned reinterpret_cast read.
static const unsigned char startCode[4] = { 0, 0, 0, 1 };
for (DWORD i = 0; i + 4 <= sampleLen; ++i) {
    if (::memcmp(sampleData + i, startCode, 4) == 0) {
        nals.push_back(sampleData + i);
    }
}
This piece of code usually identifies more than one item in one output sample from the MFT. However, if I copy the ranges found by this loop into the fTo output buffer, VLC does not show anything and stops after a few seconds. I also read somewhere that live555 does not want the magic number 0x00000001, so I tried to skip it. The effect on the client side is the same.
Is there any documentation on what live555 expects me to copy into the output buffer?
Does the H.264 encoder in Media Foundation at all produce output samples which I can use for streaming?
Do I need to split the output samples? How much do I need to skip once I have found a magic number (How to write a Live555 FramedSource to allow me to stream H.264 live suggests that I might need to skip more than the magic number, because the accepted answer only passes the payload part of the NAL)?
Is there any way to test whether the samples returned by the H.264 MFT in basic processing mode form a valid H.264 stream?
Here's how I did it: MFWebCamRtp.
I was able to stream my webcam feed and view it in VLC. There was no need to dig into NALs or such. Each IMFSample from the Media Foundation H264 encoder contains a single NAL that can be passed straight to live555.
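For reference, a rough sketch of what the delivery side of such a FramedSource can look like, assuming one NAL per IMFSample with any 00 00 00 01 prefix already stripped (live555's H264VideoStreamDiscreteFramer wants NALs without start codes). Everything except the live555 members is made up:

// Sketch: fragment of a FramedSource subclass delivering one NAL per call.
void MyMftSource::doGetNextFrame() {
    if (nalLen > fMaxSize) {              // live555's truncation convention
        fFrameSize = fMaxSize;
        fNumTruncatedBytes = nalLen - fMaxSize;
    } else {
        fFrameSize = nalLen;
        fNumTruncatedBytes = 0;
    }
    memcpy(fTo, nalData, fFrameSize);
    gettimeofday(&fPresentationTime, NULL); // or derive from the MF sample time
    FramedSource::afterGetting(this);
}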

How to decode RTP/MP4A-LATM audio payload

I am working on an implementation of RTSP in J2ME to connect to Wowza. I have the RTSP part working, as well as the extraction of RTP packets. I am able to decode and display the h264 video stream.
I am having problems understanding how to create an appropriate audio stream to pass to a J2ME Player object.
As part of the RTSP setup exchange, I get the following information from the SDP:
m=audio 0 RTP/AVP 96
a=rtpmap:96 MP4A-LATM/24000/1
a=fmtp:96 profile-level-id=15;object=2;cpresent=0;config=400026103FC0
a=control:trackID=1
From this I know that I can expect RTP packets containing MP4A-LATM format audio, and (most importantly) that the mux config data is not present inline in the stream (cpresent=0). The mux config data is 400026103FC0.
I just don't know how to interpret the config string, and how I might configure a J2ME Player.
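Not a J2ME answer, but the config string itself should decode per the StreamMuxConfig syntax in ISO/IEC 14496-3 (this is my own parse, worth double-checking against the spec). Reading 400026103FC0 bit by bit:

audioMuxVersion           (1 bit)  = 0
allStreamsSameTimeFraming (1 bit)  = 1
numSubFrames              (6 bits) = 0
numProgram                (4 bits) = 0
numLayer                  (3 bits) = 0
AudioSpecificConfig:
  audioObjectType         (5 bits) = 2    (AAC LC)
  samplingFrequencyIndex  (4 bits) = 6    (24000 Hz)
  channelConfiguration    (4 bits) = 1    (mono)
  GASpecificConfig        (3 bits) = 0, 0, 0
frameLengthType           (3 bits) = 0
latmBufferFullness        (8 bits) = 0xFF
otherDataPresent          (1 bit)  = 0
crcCheckPresent           (1 bit)  = 0

That agrees with the rtpmap line (MP4A-LATM/24000/1) and with object=2 in the fmtp line, so the payload is AAC LC at 24 kHz, mono; whatever Player configuration you end up with would need to describe exactly that.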