Figure out if mp3 is ID3v2 - binary

I am trying to figure out the length of a remote mp3 file, I have gotten the first 10 bytes 11111111111100111100100011000100000000000000000000000000000000110100100000000000
From Another Stack Overflow Post:
An ID3 version 2.4 tag frame header is 10 bytes in length and consists of the following fields:
ID -- 4 bytes (e.g. TIT2)
Size -- 4 bytes (is sync-safe in versions >= 2.4)
Flags -- 2 bytes
From ID3 docs:
I cant't figure out if the first ten bytes tell me this is an ID3v2 or not... I really wanna know that... if this is not, can you tell me why so, also, could you link a remote mp3 that is? The reason why I am implementing this logic myself if because there is no rust library that does it and I will create the first one, let me know if you wanna contribute btw.

Related

Nand2Tetris-Obtaining Register from RAM chips

I've recently completed Chapter 3 of the associated textbook for this course: The Elements of Computing, Second Edition.
While I was able to implement all of the chips described in this chapter, I am still trying to wrap my head around how exactly the RAM chips work. I think I understand them in theory (e.g. a Ram4K chip stores a set of 8 RAM512 chips, which itself is a set of 8 RAM64 chips).
What I am unsure about is actually using the chips. For example, suppose I try to output a single register from RAM16K using this code, given an address:
CHIP RAM16K {
IN in[16], load, address[14];
OUT out[16];
PARTS:
Mux4Way16(a=firstRam, b=secondRam, c=thirdRam, d=fourthRam, sel=address[12..13], out=out);
And(a=load, b=load, out=shouldLoad);
DMux4Way(in=shouldLoad, sel=address[12..13], a=setRamOne, b=setRamTwo, c=setRamThree, d=setRamFour);
RAM4K(in=in, load=setRamOne, address=address[0..11], out=firstRam);
RAM4K(in=in, load=setRamTwo, address=address[0..11], out=secondRam);
RAM4K(in=in, load=setRamThree, address=address[0..11], out=thirdRam);
RAM4K(in=in, load=setRamFour, address=address[0..11], out=fourthRam);
}
How does the above code get the underlying register? If I understand the description of the chip correctly, it is supposed to return a single register. I can see that it outputs a RAM4K based on a series of address bits -- does it also get the base register itself recursively through the chips at the bottom? Why doesn't this code have an error if it's outputting a RAM4K when we expect a register?
It's been a while since I did the course so please excuse any minor errors below.
Each RAM chip (whatever the size) consists of an array of smaller chips. If you are implementing a 16K chip with 4K subchips, then there will be 4 of them.
So you would use 2 bits of the incoming address to select what sub-chip you need to work with, and the remaining 12 bits are sent on to all the sub-chip. It doesn't matter how you divide up the bits, as long as you have a set of 2 and a set of 12.
Specifically, the 2 select bits are used to route the load signal to just one sub-chip (ie: using a DMux4Way), so loads only affect that one sub-chip, and they are also used to pick which of the sub-chips outputs are used (ie: a Mux4Way16).
When I was doing it, I found that the simplest way to do things was always use the least-significant bits as the select bits. So for example, my RAM64 chip used address[0..2] as the select bits, and passed address[3..5] to the RAM8 sub-chips.
The thing that may be confusing you is that in these kinds of circuits, all of the sub-chips are activated. It's just that you use the select bits to decide which sub-chip's output to pass on to the outputs, and also as a filter to decide which sub-chip might perform a load.
As the saying goes, "It's turtles (or ram chips) all the way down."

Convert EM4x02 ID to Hitag2 Value

I've been working on an RFID project to produce our own RFID cards to work on our existing timeclocks and readers.
I've got most of the work done, and have been able to successfully write a Hitag2 card using the value of page 4 & 5 from another card (so basically copying the card) then changing the config bit which makes it act like an EM4x02 which allows our readers to read it.
What I'm struggling with is trying to relate the hex code on page4/5 to the output you get when scanning as an EM4x..
The values of the hitag page 4/5 are FF800000/003EDF10. This translates to 0000001EBC when read as an EM4x.
Does anybody have an idea on how this translation is done? I've tried using the methods in RFIDIOT but that doesn't seem to work for this.
I've managed to find how this is done after finding a hitag2 datasheet from 1999 (the only one I could find that explains the bits when hitag is in public mode A)
Firstly, convert the number you want on the EM4 card to hex.
Convert that hex into binary.
Split the binary into 4 bit chunks, then work out the even parity for each section and add it to the end of each chunk. (So you'll end up with 5 bits per chunk)
Then, work out the even parity of each column in the data (i.e first character of all chunks, then second etc. But ignoring the parity bit you added) and add these 4 bytes to the binary string.
Then add the correct amount of zeros at the start to ensure the data section has 50 bits.
Once you have the data section sorted, add 9 bits of 1 to the beginning (header) and a final 0 to the very end of the binary.
Your whole binary string should be 64 bits long.
Convert this to hex and split it in half. You can then write these onto pages 4/5 of a Hitag2 card.
You then need to change the configuration bit to 0x02 for the tag to work in public mode a.
Just thought I would send you the diagram of how this works.Em4X tag data

What File Format Has This Magic Header?

I've got a bunch of files that from metadata I can tell are supposed to be PDFs. Some of them are indeed complete PDFs. Some of them appear to be the first part of a PDF file, though they lack the %%EOF and other footer values.
Others appear to be the last part of PDF files (they don't have any of a PDF's headers but they do have the %%EOF stuff). Curiously they start with the following 16-byte magic header:
0x50, 0x4B, 0x57, 0x41, 0x52, 0x45, 0x00, 0x00, 0x00, 0x00, 0x00, 0x57, 0x49, 0x4E, 0x33, 0x32 (PKWARE WIN32).
I'm doing a lot of inference which could possibly be misleading, but it doesn't seem to be a compression scheme (the %%EOF stuff is plaintext) and in the few files I've been allowed to look at deeply there's a correlation between starting with this magic and looking like the final segment of a PDF binary.
Does anyone have any hints as to what file format might be at play here?
Update: I've now observed this PKWARE WIN32 happening on non-PDF files as well. Speculation also suggests that these files are split up in a similar manner.
Update 2: It turns out this PKWARE WIN32 header actually occurs in repeating intervals, the location of which can be predicted by some bytes immediately prior to the header.
I've also received some circumstantial hearsay which suggests that these files are compressed and not split into multiple parts, though in 2 out of the 3 cases where I was told the output file sizes my binaries were only negligibly smaller.
The mystery continues.
Okay, so this ended up being a very strange format. Overall it's a compression scheme, but it's applied inconsistently and lightly wrapped in a way that confounded the issue.
The first 8 bytes of any of these files will start with its own magic, and the next 8 bytes can be read as a long to tell us the final size of the output file.
Then there's a 16 byte "section" (four ints) whose first number is just an incremental counter, whose second int represents the number of bytes until the next "section" break, whose third int is a bit of a mystery to me, and whose fourth int is either 0 or 1. If that int is 0, just read the next (however many) bytes as-is. They're payload.
If it's 1 then you'll get one of these PKWARE headers next. I honestly know how to interpret them the least-well other than they start with the magic in the original question and they're 42 bytes long in total.
If you had a PKWARE header, subtract 42 from the number of bytes to read then treat the remaining bytes as compressed using PKWARE's "implode" algorithm. Meaning you can use zlib's "explode" implementation to decompress them.
Iterate through the file taking all these headers into account and cobbling together compressed and uncompressed parts 'til you run out of bytes and you'll end up with your output file.
I have no idea why only parts of files are compressed nor why they've been broken into blocks like this but it seems to work for the limited sample data I have. Perhaps later on I'll find files that actually have been split up along those boundaries or employ some kind of fancy deduplication but at least now I can explain why it looked like I saw partial PDFs -- the files were only partially compressed.

Tiff versus BigTiff

Please let me know if there is another Stack Exchange community this question would be better suited for.
I am trying to understand the basic differences between Tiff and BigTiff. I have looked on various sites and the only difference that is mentioned is that BigTiff uses 64-bit offsets while Tiff uses 32-bit offsets. That being said, you would need to know which of the two types you are reading. How is this done? According to https://www.leadtools.com/help/leadtools/v19/main/api/tifffmt.html, this is done by reading a file flag. However, the flag they are referring to appears to be unique to their own reader as I cannot find a corresponding data field in the specifications as shown by http://www.fileformat.info/format/tiff/egff.htm. What am I missing? Does BigTiff use a different file header than Tiff?
Everything you need to know is described in the BigTIFF link posted by #cgohlke. This is just to provide an answer to your question:
Yes, it uses a different file header.
Normal TIFF uses the following header:
2 byte byte order mark, "II" for "Intel"/little endian, or "MM" for "Motorola"/big endian.
The (version) number 42* as a 16 bit value, in the endianness given.
Unsigned 32 bit offset to IFD0
BigTIFF uses a slightly different header:
2 byte byte order mark as above
The (version) number 43 as a 16 bit value, in the endianness given.
Byte size of offset as a 16 bit value, always 8 for BigTIFF
2 byte padding, always 0 for BigTIFF
Unsigned 64 bit offset to IFD0
*) The value 42 was chosen for its "deep philosophical significance". Or according to the official specification, "[a]n arbitrary but carefully chosen number"...

Zero-padded h264 in mdat

I'd like to do some stuff with h.264 data recorded from Android phone.
My colleague told me there should be 4 bytes right after mdat wich specifies NALU size, then one byte with NALU metadata and then the raw data, and then (after NALU size), another 4 bytes with another NALU size and so on.
But I have a lot of zeros right after mdat:
0000000000000000000000000000000000000000000000000000000000000000
0000000000000000000000000000000000000000000000000000000000000000
0e00000000000000000000000000000000000000000000000000000000000000
8100000000000000000000000000000000000000000000000000000000000000
65b84f0f87fa890001022e7fcc9feef3e7fabb8e0007a34000f2bbefffd07c3c
bfffff08fbfefff04355f8c47bdfd05fd57b1c67c4003e89fe1fe839705a699d
c6532fb7ecacbfffe82d3fefc718d15ffffbc141499731e666f1e4c5cce8732f
bf7eb0a8bd49cd02637007d07d938fd767cae34249773bf4418e893969b8eb2c
Before mdat atom are just ftyp mp42, isom mp42 and free atoms. All other atoms (moov, ...) are at the end of the file (that's what Android does, when it writes to socket and not to the file). But If necessary, I've got PPS and SPS from other file with same camera and encoder settings recorded just a seond before this, just to get those PPS and SPS data.
So how exactly can i get NALUs from that?
You can't. The moov atom contains information required to parse the mdat. Without it the mdat has little value. For instance, the first NALU does not need to start at the begining of the mdat, It can start anywhere within the mdat. The byte it starts at is recorded in (I believe) the stco box. If the file has audio, you will find audio and video mixed within mdat with no way to determine what is what without the chunk offsets. In addition, if the video has B frames, there is no way to determine render order without the cts, again only available in the moov. And Technically, the nalu size does not need to be 4 bytes and you cant know that without the moov. I recommend not used mp4. Use a streamable container such as ts or flv. Now if you can make some assumption about the code that is producing the file; Like the chunk offset is always the same, and there is no b frames, you can hard code these values. But is not guaranteed to work after a software update.