HTTP/2 dynamic table size update clarification - binary

In the HTTP/2 protocol we see the following statement about the header compression table size (SETTINGS_HEADER_TABLE_SIZE):
SETTINGS_HEADER_TABLE_SIZE (0x1): Allows the sender to inform the
remote endpoint of the maximum size of the header compression
table used to decode header blocks, in octets. The encoder can
select any size equal to or less than this value by using
signaling specific to the header compression format inside a
header block (see [COMPRESSION]). The initial value is 4,096
octets.
According to the RFC, the initial size for both encoder and decoder is 4096 bytes.
In the SETTINGS frame in Wireshark, I can see the new table size passed to the ENDPOINT (google.com in this case):
0000 00 00 12 04 00 00 00 00 00 **00 01 00 01 00 00** 00
0010 04 00 02 00 00 00 05 00 00 40 00
00 01 00 01 00 00 is the setting entry for SETTINGS_HEADER_TABLE_SIZE = 65536 (identifier 0x0001 followed by the 32-bit value 0x00010000).
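For reference, each entry in the SETTINGS payload is just a 16-bit identifier followed by a 32-bit value (RFC 7540, Section 6.5.1), so the whole payload can be decoded with a quick Python sketch like this (payload and setting names hard-coded from this capture):

import struct

# SETTINGS payload from the capture above, with the 9-byte frame header stripped
payload = bytes.fromhex("000100010000" "000400020000" "000500004000")

# RFC 7540, Section 6.5.2: the setting identifiers seen in this frame
names = {0x1: "SETTINGS_HEADER_TABLE_SIZE",
         0x4: "SETTINGS_INITIAL_WINDOW_SIZE",
         0x5: "SETTINGS_MAX_FRAME_SIZE"}

for off in range(0, len(payload), 6):
    ident, value = struct.unpack_from(">HI", payload, off)
    print(names.get(ident, hex(ident)), "=", value)
# SETTINGS_HEADER_TABLE_SIZE = 65536
# SETTINGS_INITIAL_WINDOW_SIZE = 131072
# SETTINGS_MAX_FRAME_SIZE = 16384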
What I can't understand is: does it tell the ENDPOINT that the dynamic table used inside the browser to decode headers coming from this ENDPOINT is 65536 bytes long, or does it tell the ENDPOINT that the ENDPOINT's own dynamic table size should be 65536?
And the reverse: I assume the ENDPOINT must send SETTINGS_HEADER_TABLE_SIZE to tell the browser about the dynamic table it uses for decoding headers, but I don't see that option sent back by the ENDPOINT. Can someone explain this?
Also, there is a signal for a dynamic table size update, mentioned in the RFC, which is sent inside the HEADERS frame:
A dynamic table size update starts with the '001' 3-bit pattern,
followed by the new maximum size, represented as an integer with a
5-bit prefix (see Section 5.1).
The new maximum size MUST be lower than or equal to the limit
determined by the protocol using HPACK. A value that exceeds this
limit MUST be treated as a decoding error. In HTTP/2, this limit is
the last value of the SETTINGS_HEADER_TABLE_SIZE parameter (see
Section 6.5.2 of [HTTP2]) received from the decoder and acknowledged
by the encoder (see Section 6.5.3 of [HTTP2]).
There is this phrase, "received from the decoder and acknowledged by the encoder", so is this signal sent to limit the encoder's dynamic table size? I'm completely lost, and it is not obvious from the Wireshark captures how this is handled correctly.
UPDATE
Ok, I looked more at the Wireshark logs from Firefox on walmart.com (since there are a lot of headers involved). Sometimes Firefox sends the dynamic table size update signal in the HEADERS frame, with a size smaller than the initial SETTINGS_HEADER_TABLE_SIZE that Firefox sent at the beginning of the connection. I wrote the Firefox dynamic table out on paper and shrank it the way I expected the dynamic table size update to work. It turns out that shrinking it to the smaller size produces incorrect headers. So apparently the dynamic table size update affects only the remote endpoint (well, I guess it does). I also looked at nghttp2 and a C# implementation, and there they actually shrink the encoder table size while sending the dynamic table size update signal. I get the feeling that everyone has a completely different implementation of this protocol; it's a complete nightmare to understand.

As you figured out, there are multiple things which indicate the table size:
The maximum table size setting (as indicated in an HTTP/2 SETTINGS frame)
The actually used table size, which is encoded in a HEADERS frame in HPACK format
If we only look at the headers which are flowing from the client (browser) to a server, we will see the following things going on:
As long as no information has been received from the remote side, the default values are used, which means the client assumes the server supports a maximum table size of 4 kB (SETTINGS_HEADER_TABLE_SIZE) and it also uses this size as the initial table size.
The server can optionally inform the client through an HTTP/2 SETTINGS frame that it only supports smaller header tables. This information is contained in the SETTINGS_HEADER_TABLE_SIZE field of a SETTINGS frame which is sent from the server to the client.
The client can adjust the actually used [dynamic] header table size through the Dynamic Table Size Update in a HEADERS frame. This will always indicate the table size that is actually used on the encoder side - and which therefore also must be set on the decoder side to be able to retrieve the same data. The sending side is free to set the actually used table size to anything between 0 and the maximum size that is supported by the remote side (in SETTINGS_HEADER_TABLE_SIZE).
A typical strategy for implementations is to always shrink the used table size when it's currently larger than what the remote supports, and to increase the table size when the remote supports bigger tables and the implementation itself can still go bigger. There might be some race conditions where one end has already set and used a larger table size than what the remote side actually supports, e.g. because the SETTINGS frame which indicates the lower limit was not received before the client encoded its first headers. In that case the remote side might detect the use of a too-big table size and reset the connection. To avoid these situations, both sides of the connection should in practice at least support the default table size of 4 kB, and ideally only increase the limit dynamically and never shrink it.
Now, I mentioned that one pair of maximum table size setting and actual table size is used for transmitting HEADERS from one end of the connection (client) to the other (server). But in total there is also a second pair of both, for the headers which are sent from the server to the client. For this case the client/browser also indicates in a SETTINGS frame how big the maximum header table is that it supports, and the server sends the size of the actual header table that it uses.
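To make the HPACK side of this concrete, here is a minimal sketch (Python, purely illustrative, not taken from any particular implementation) of how an encoder emits the Dynamic Table Size Update instruction from RFC 7541, Section 6.3: the '001' pattern in the top three bits, with the new size encoded as an integer with a 5-bit prefix (Section 5.1).

def encode_table_size_update(new_size):
    # Dynamic Table Size Update (RFC 7541, 6.3): the first byte starts with '001',
    # the remaining 5 bits are the prefix of the integer (RFC 7541, 5.1).
    prefix_max = 0x1F  # 2^5 - 1
    if new_size < prefix_max:
        return bytes([0x20 | new_size])
    out = bytearray([0x20 | prefix_max])
    remainder = new_size - prefix_max
    while remainder >= 128:
        out.append((remainder % 128) | 0x80)
        remainder //= 128
    out.append(remainder)
    return bytes(out)

print(encode_table_size_update(0).hex())     # 20     - encoder shrinks its table (and evicts everything)
print(encode_table_size_update(4096).hex())  # 3fe11f - encoder announces it now uses a 4096-byte table

The instruction describes the table of the side that sends it (the encoder); the receiver merely resizes its copy of that table so the two stay in sync, which matches the behaviour described above.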

Related

Questions about size of Json data sent by Socket.io/Node.js?

Will tabs/spaces like in the example below increase the size of the data sent?
return {
    id:self.id,
    username:self.username,
    score:self.score,
    level:self.level
};
vs
return {id:self.id,username:self.username,
    score:self.score,level:self.level};
Is there any size difference between 0/1 and true/false for Json?
Is there a size difference between "11" (string) and 11 (double)?
The Json will be sent 10 times every second with socket.emit of Socket.io.
There should be no difference in the size based on the server-side object formatting. Any size difference will depend on the value types and on how your serializer actually converts the object properties.
Is there any size difference between 0/1 and true/false for Json?
There might be a dependence on the serializer, but number vs. Boolean values end up inside the serialized string as "mybool:true,mynumber:1".
So, if you were to optimize for size, "a:true,b:1" - note that the NAME is smaller, so the serialized content will be too.
Is there a size difference between "11" (string) and 11 (double)? Similar to the second example: mysuperlongnameisgreat:"11",mysuperlongnumbername:11 vs {"a":"11","b":11} - the number is smaller because it excludes those two quotes.
All that being said, considering the total processing (IF speed is an issue): this has to be deserialized into a JavaScript object on the client side, so IF you are using the number as a number it will need to be parsed to the proper type at some point, and that may have more of an impact than the serialized content size.
Note that with short names, you WILL have a negative impact on maintenance as it is much less intuitive to maintain short non-descriptive names than longer ones.
Using your example, it would be "smaller" to do (assuming a strongly typed server side)
return {
    d:self.id,
    n:self.username,
    s:self.score,
    v:self.level
};
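To see the actual byte counts, here is a quick check (Python's json module stands in for JSON.stringify here, and the field values are made up). It shows that source-code formatting of the object literal never reaches the wire, while key length, true vs 1, and "11" vs 11 do change the payload:

import json

# Formatting of the object literal in source code does not survive serialization:
a = {"id": 7, "username": "alice", "score": 11, "level": 3}
b = {
    "id": 7,
    "username": "alice",
    "score": 11,
    "level": 3,
}
print(json.dumps(a) == json.dumps(b))   # True - the whitespace is gone after serialization

# What does change the payload size: key length, true vs 1, quoted vs bare numbers.
print(len(json.dumps({"mybool": True, "mynumber": 1})))          # long keys, 'true'
print(len(json.dumps({"a": 1, "b": 1})))                         # short keys, numeric flags
print(len(json.dumps({"n": "11"})), len(json.dumps({"n": 11})))  # the quotes add two bytes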

Reading / Computing Hex received over RS232

I am using Docklight Scripting to put together a VBScript that communicates with a device via RS232. All the commands are sent in Hex.
When I want to read from the device, I send a 32-bit address, a 16-bit read length, and an 8-bit checksum.
When I want to write to the device, I send a 16-bit data length, the data, followed by an 8-bit checksum.
In Hex, the data that is sent to the device is the following:
AA0001110200060013F81800104D
AA 00 01 11 02 0006 0013F818 0010 4D
(spaced for ease of reading)
AA000111020006 is the protocol header, where:
AA is the Protocol Byte
00 is the Source ID
01 is the Dest ID
11 is the Message Type
02 is the Command Byte
0006 is the Length Byte(s)
The remainder of the string is broken down as follows:
0013F818 is the 32-bit address
0010 is the 16 bit read length
4D is the 8-bit checksum
If the string is not correct, or the checksum is invalid, the device replies back with an error string. However, I am not getting an error. The device replies back with the following hex string:
AA0100120200100001000000000100000000000001000029
AA 01 00 12 02 0010 00010000000001000000000000010000 29
(spaced for ease of reading)
Again, the first part of the string (AA01001202) is part of the protocol header, where:
AA is the Protocol Byte
01 is the Source ID
00 is the Dest ID
12 is the Message Type
02 is the Command Byte
The difference between what is sent to the device and what the device replies back with is that the length bytes are not a "static" part of the protocol header and will change based on the request. The remainder of the string is broken down as follows:
0010 is the Length Byte(s)
00010000000001000000000000010000 is the data
29 is the 8-bit Check Sum
The goal is to read a timer that is stored in the NVM. The timer is stored in the upper halves of 60 4-byte NVM words.
The instructions specify that I need to read the first two bytes of each word, and then sum the results.
Verbatim, the instructions say:
Read the NVM elapsed timer. The timer is stored in the upper halves of 60 4-byte words.
Read the first two bytes of each word of the timer. Read the 16 bit values of these locations:
13F800H, 13F804H, 13808H, and continue to 13F8ECH.
Sum the results. Multiply the sum by 409.6 seconds, then divide by 3600 to get the results in hours.
My knowledge of bits, and bytes, and all other things is a bit cloudy. The first thing I need to confirm is that I am understanding the read protocol correctly.
I am assuming that when I specify 0010 as the 16 bit read length, that translates to the 16-bit values that the instructions want me to read.
The second thing I need to understand a little better is that when it tells me to read the first two bytes of each word, what exactly constitutes the first two bytes of each word?
I think what confuses me a little more is that the instructions say the timer is stored in the upper half of the 4 byte word (which to me seems like the first half).
I've sat with another colleague of mine for a day trying to figure out how to make this all work, and we haven't had any consistent results with our trials.
I have looked on the internet to find something that would explain this better in the context being used.
Another worry is that the technical documentation I am using for this project isn't 100% accurate in its instructions, and it has conflicting or missing information throughout the publication (which is probably close to 1000 pages long).
What I would really appreciate is someone with a much better understanding of hex/binary reviewing the instructions I've posted and providing some feedback on my interpretation of them.
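Not a full answer to the checksum/read-length question, but the arithmetic in the quoted instructions can at least be sketched out (Python, purely illustrative). It assumes the values come back as big-endian 16-bit words and that "the first two bytes of each 4-byte word" means the high-order (upper) half; the 16-byte payload from the reply above is used only as sample input, since whether it already contains those 16-bit values is exactly the open question:

import struct

def timer_hours(upper_halves):
    # Documented formula: sum the 16-bit values, multiply by 409.6 seconds,
    # divide by 3600 to convert to hours.
    return sum(upper_halves) * 409.6 / 3600.0

def upper_half(word4):
    # Given a raw 4-byte NVM word, "the first two bytes" read as a big-endian
    # 16-bit value - i.e. the upper half of the word.
    return struct.unpack(">H", word4[:2])[0]

# Sample input: the 16-byte reply payload shown above, read as eight
# big-endian 16-bit values.
payload = bytes.fromhex("00010000000001000000000000010000")
values = [struct.unpack_from(">H", payload, i)[0] for i in range(0, len(payload), 2)]
print(values)               # [1, 0, 0, 256, 0, 0, 1, 0]
print(timer_hours(values))  # hours, if these really were the timer words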

Zero-padded h264 in mdat

I'd like to do some stuff with h.264 data recorded from an Android phone.
My colleague told me there should be 4 bytes right after mdat which specify the NALU size, then one byte with NALU metadata, then the raw data, and then (after NALU-size bytes) another 4 bytes with the next NALU size, and so on.
But I have a lot of zeros right after mdat:
0000000000000000000000000000000000000000000000000000000000000000
0000000000000000000000000000000000000000000000000000000000000000
0e00000000000000000000000000000000000000000000000000000000000000
8100000000000000000000000000000000000000000000000000000000000000
65b84f0f87fa890001022e7fcc9feef3e7fabb8e0007a34000f2bbefffd07c3c
bfffff08fbfefff04355f8c47bdfd05fd57b1c67c4003e89fe1fe839705a699d
c6532fb7ecacbfffe82d3fefc718d15ffffbc141499731e666f1e4c5cce8732f
bf7eb0a8bd49cd02637007d07d938fd767cae34249773bf4418e893969b8eb2c
Before the mdat atom there are just ftyp mp42, isom mp42 and free atoms. All other atoms (moov, ...) are at the end of the file (that's what Android does when it writes to a socket and not to a file). But if necessary, I've got PPS and SPS from another file recorded with the same camera and encoder settings just a second before this one, just to have that PPS and SPS data.
So how exactly can I get the NALUs from that?
You can't. The moov atom contains information required to parse the mdat; without it the mdat has little value. For instance, the first NALU does not need to start at the beginning of the mdat - it can start anywhere within the mdat. The byte it starts at is recorded in (I believe) the stco box. If the file has audio, you will find audio and video mixed within the mdat with no way to determine what is what without the chunk offsets. In addition, if the video has B-frames, there is no way to determine render order without the cts, again only available in the moov. And technically, the NALU size does not need to be 4 bytes, and you can't know that without the moov. I recommend not using mp4. Use a streamable container such as ts or flv. Now, if you can make some assumptions about the code that is producing the file - like the chunk offset always being the same, and there being no B-frames - you can hard-code these values. But that is not guaranteed to work after a software update.
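If you do make the assumptions mentioned at the end (you know where the first video sample starts inside the mdat, the NALU length field is 4 bytes, and the stream is video only), walking the length-prefixed NALUs looks roughly like the sketch below (Python, illustrative; start_offset and length_size are exactly the values you would normally take from the stco and avcC boxes in the moov):

def iter_nalus(mdat_payload, start_offset=0, length_size=4):
    # Walk length-prefixed NAL units, assuming big-endian length fields of
    # 'length_size' bytes, and stop at the first implausible length.
    pos = start_offset
    while pos + length_size <= len(mdat_payload):
        size = int.from_bytes(mdat_payload[pos:pos + length_size], "big")
        pos += length_size
        if size == 0 or pos + size > len(mdat_payload):
            break  # zero padding, or the assumptions above are wrong
        nalu = mdat_payload[pos:pos + size]
        yield nalu[0] & 0x1F, nalu  # (nal_unit_type, NALU including its header byte)
        pos += size

To actually decode the slices you would still need the SPS/PPS (normally carried in the avcC box inside the moov), which is why grabbing them from the other recording with identical settings is a reasonable workaround.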

x264 rate control modes

Recently I have been reading the x264 source code, mostly the rate control (RC) part, and I am confused about the parameters --bitrate and --vbv-maxrate. When bitrate is set, CBR mode is used at the frame level. If you want to enable MB-level RC, the parameters bitrate, vbv-maxrate and vbv-bufsize should all be set. But I don't know the relationship between bitrate and vbv-maxrate. What determines the real encoding result when bitrate and vbv-maxrate are both set?
And what is the recommended value for bitrate? Equal to vbv-maxrate?
Also, what is the recommended value for vbv-bufsize? Half of vbv-maxrate?
Please give me some advice.
bitrate addresses the "target file size" when you are encoding. It is understandably confusing because it applies a "budget" of a certain size and then tries to apportion this budget across the frames - that is why the later parts of a movie get a smaller amount of data, which results in lower video quality. For example, if you have 10 seconds of completely black images followed by 10 seconds of natural video, the final encoded file will be very different than if the order were the opposite.
vbv-bufsize is the buffer that has to be filled before a "transmission" would occur, say in a streaming scenario. Now, let's tie this to I-frames and P-frames: vbv-bufsize will limit the size of any of your encoded video frames - most likely the I-frame.
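As a concrete illustration of how the three options are usually combined (my own example, not from the answer above; the numbers are placeholders, with vbv-bufsize chosen as roughly two seconds' worth of vbv-maxrate):

import subprocess

# Illustrative values only: target an average of 2000 kbit/s for the whole
# encode, never refill the decoder buffer faster than 2500 kbit/s, and use a
# 5000 kbit buffer, which bounds how large a burst (typically an I-frame) can get.
subprocess.run([
    "x264",
    "--bitrate", "2000",
    "--vbv-maxrate", "2500",
    "--vbv-bufsize", "5000",
    "-o", "out.264",
    "input.y4m",
], check=True)

Setting vbv-maxrate equal to bitrate (with a matching bufsize) pushes the encode towards constant-bitrate behaviour, while a larger vbv-maxrate keeps the average at bitrate but lets complex scenes temporarily use more.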

HTML multipart form - maximum length of "boundary" string?

In a multi-part (i.e. Content-Type=multipart/form-data) form, is there an upper limit on the length of the boundary string that an HTTP server should accept?
As far as I can tell, the relevant RFCs say 70 chars:
RFC 2616 (HTTP/1.1), section "3.7 Media Types", says that the allowed types in the Content-Type header are defined by RFC 1590 (Media Type Registration Procedure).
RFC 1590 updates RFC 1521 (MIME).
RFC 1521 says that a boundary "must be no longer than 70 characters, not counting the two leading hyphens".
The same text also appears in RFC 2046, which supposedly obsoletes RFC 1521.
So can I be certain all the major HTTP/1.1 browsers out there today adhere to this limit? Are there any browsers (or other HTTP clients/libraries) known to break this limit?
Is there some other spec or common rule-of-thumb I'm missing that says the string will be shorter than 70 chars? In Chrome(ium) I get something like this: ----WebKitFormBoundaryLu4dNSGEhJZUgoe5, which is obviously shorter than 70 chars.
I'm asking this question because my server is running in an extremely memory-constrained environment, so "malloc a buffer large enough to hold the entire header string" is not an ideal answer.
As you note, RFC 2046 updated the MIME spec, but kept the restriction of the maximum boundary string to 70 characters, not counting the two leading hyphens.
I think it's a fair assumption that the spec is followed by all major browsers (and all MIME-using clients, like mail programs) since otherwise passing around multipart data would be very risky indeed.
To be sure, I've experimentally verified it for you using the latest versions of:
curl: ----------------------------5a56a6c893f2 (40)
Chrome 30 (WebKit): ----WebKitFormBoundarym0vCJKBpUYdCIWQG (38)
Safari 6 (WebKit, and same as Chrome): ----WebKitFormBoundaryFHUXvJBZwO2JKkNa (38)
Firefox 24: ---------------------------7096603861379320641089344535 (55)
IE 10: ---------------------------7dd1961640278 (40) - same technique as curl!
Apache HttpClient: -----------------------------1294919323195 (42)
Thus not only does every major browser/client conform, but all would allow you to save 15 allocated bytes per boundary per buffer from the theoretical max. If you could trivially switch on user agent, you could squeeze even more performance out. ;-)
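Since the question came from a memory-constrained server: a fixed buffer sized from the RFC limit (70 characters plus the two leading hyphens) is enough to hold any conforming delimiter. A small sketch of the idea (Python, illustrative; it rejects over-long boundaries rather than truncating them):

# RFC 2046, 5.1.1: the boundary is 1-70 characters, not counting the two
# leading hyphens that prefix it in the message body.
MAX_BOUNDARY = 70

def extract_boundary(content_type):
    # Pull the boundary parameter out of a Content-Type header value and
    # enforce the length limit, so "--" + boundary always fits in 72 bytes.
    for param in content_type.split(";")[1:]:
        name, _, value = param.strip().partition("=")
        if name.lower() == "boundary":
            boundary = value.strip('"')
            if not 1 <= len(boundary) <= MAX_BOUNDARY:
                raise ValueError("boundary exceeds RFC 2046 length limit")
            return b"--" + boundary.encode("ascii")
    raise ValueError("no boundary parameter in Content-Type")

print(extract_boundary(
    "multipart/form-data; boundary=----WebKitFormBoundaryLu4dNSGEhJZUgoe5"))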