Max string size of Ethereum Contract ABI - ethereum

I am writing code to communicate with a deployed Ethereum contract through JSON-RPC. When I call eth_call, I need to prepare the data for it. This link has the details, but I had a question when I read the part about dynamically-sized unicode strings. It says:
Finally, we encode the data part of the second dynamic argument, "Hello, world!":
0x000000000000000000000000000000000000000000000000000000000000000d
(number of elements (bytes in this case): 13)
0x48656c6c6f2c20776f726c642100000000000000000000000000000000000000
("Hello, world!" padded to 32 bytes on the right)
If I want to put in a string over 32 bytes, what should I do? Will it be like the following?
Sample text: "Lorem ipsum dolor sit amet, consectetur posuere."
Text size: 48 bytes
0x0000000000000000000000000000000000000000000000000000000000000030
0x4c6f72656d20697073756d20646f6c6f722073697420616d65742c20636f6e73
0x6563746574757220706f73756572652e00000000000000000000000000000000
Does anybody know this?

Pad the string to a multiple of 32 bytes in length.
In your case you have 48 bytes, so pad it to 64 bytes, exactly as in your example.
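For reference, here is a minimal sketch (plain Python, not using web3.py) of how the data area of a dynamic string argument is built: a 32-byte length word followed by the UTF-8 bytes, right-padded with zeros to a multiple of 32 bytes. (The head of the calldata additionally carries a 32-byte offset pointing at this area.)

def encode_abi_string_data(s: str) -> bytes:
    raw = s.encode("utf-8")
    length_word = len(raw).to_bytes(32, "big")   # number of bytes, e.g. 0x30 = 48
    padded_len = -(-len(raw) // 32) * 32         # round up to a multiple of 32
    return length_word + raw.ljust(padded_len, b"\x00")

encoded = encode_abi_string_data("Lorem ipsum dolor sit amet, consectetur posuere.")
print(encoded.hex())
# ...0030 followed by the two 32-byte chunks shown in the question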

Related

How to encode raw bytes?

We have a 2D Data Matrix barcode which outputs as 12002052 (CR+LF after the value). When scanning into Chrome, the barcode triggers the downloads menu, which I have read in other posts is due to the CR+LF. To troubleshoot, we generated a new 2D Data Matrix barcode with an online generator for 12002052, which scans successfully in Chrome (doesn't trigger the downloads menu), but when scanned into Notepad++ (showing all characters) it shows the exact same output as the original/bad barcode.
I took an image of both the good and bad barcodes and uploaded them to a Data Matrix decoding website (zxing), and what is interesting is that the last value in the "raw bytes" is different for each barcode:
bad 2D
Raw text 12002052
Raw bytes 8e 82 96 b6 81
Barcode format DATA_MATRIX
Parsed Result Type TEXT
Parsed Result 12002052
good 2D
Raw text 12002052
Raw bytes 8e 82 96 b6 0b
Barcode format DATA_MATRIX
Parsed Result Type TEXT
Parsed Result 12002052
My question is: what exactly are the "raw bytes", and how could I possibly encode them, to hopefully reverse engineer this and find what is differentiating the two barcodes?
I believe that 'Raw bytes' refers to a byte array. Byte arrays are exactly what they sound like: an array of bytes, each 8 bits. So the raw bytes '8e 82 96 b6 0b' are hexadecimal representations of each byte.
That said, from the string you have provided, I do not get a corresponding byte array that matches the raw text input (there are plenty of string-to-byte converters online). Perhaps some character encoding other than ASCII or UTF-8 is used in this case.
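One plausible explanation, offered here as an assumption rather than a certainty: zxing's "raw bytes" for a Data Matrix symbol are the ECC200 data codewords, not character codes. In the ASCII encodation scheme of ISO/IEC 16022, codewords 1-128 encode the ASCII characters 0-127 (codeword = character + 1), codeword 129 is the pad codeword, and codewords 130-229 encode the digit pairs "00"-"99" (codeword = pair value + 130). A rough Python sketch of that decoding, applied to the two byte strings above:

def decode_ascii_codewords(codewords):
    out = []
    for cw in codewords:
        if 1 <= cw <= 128:
            out.append(chr(cw - 1))          # plain ASCII character
        elif cw == 129:
            out.append("<PAD>")              # pad codeword
        elif 130 <= cw <= 229:
            out.append(f"{cw - 130:02d}")    # digit pair 00-99
        else:
            out.append(f"<other {cw:#x}>")   # latches to other modes, etc.
    return "".join(out)

print(repr(decode_ascii_codewords([0x8E, 0x82, 0x96, 0xB6, 0x81])))  # bad barcode
print(repr(decode_ascii_codewords([0x8E, 0x82, 0x96, 0xB6, 0x0B])))  # good barcode

Under that reading, both symbols start with the digit pairs 12 00 20 52, and the difference is only in the final codeword (a pad codeword in one, an ASCII control character in the other), which would be worth confirming against the actual scanner output.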

When should the ERC-1155 Metadata URI need to be zero-padded to 64 hex characters?

EIP-1155 states that "The string format of the substituted hexadecimal ID MUST be leading zero padded to 64 hex characters length if necessary."
In what situation is a zero-padded hex ID necessary? It is odd that they chose to use the keyword MUST here, as it seems like the choice of whether to use 64-hex-character padding is completely arbitrary.
I understand that there cannot exist more than 2^256 ids (64 hex digits), but wouldn't the choice of metadata URI for an ERC-1155 token be implementation-dependent?
For example, if I wanted to create an ERC-1155 token composed only of 64 NFTs, I'd much prefer defining metadata URLs as follows:
https://{DOMAIN}/1.json
https://{DOMAIN}/2.json
...
https://{DOMAIN}/40.json (64 in hex)
I suspect that ERC-1155 was built with uint256 in mind as the standard numeric type, and that requiring the ID to be padded to 64 hex characters means that all 256 bits of information are specified explicitly. Maybe this alleviates potential issues with dirty leading bits?
Padding doesn't appear to be strictly necessary for things to function; I have seen smart contracts that use unpadded metadata URLs, such as Mining.game
(https://mumbai.polygonscan.com/address/0x1a3d0451f48ebef398dd4c134ae60846274b7ce0#code),
(https://api.mining.game/1.json).
This is on the Polygon testnet, not a mainnet, so keep in mind that code quality may not be stellar. But regardless, it appears to work.
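As a point of reference, here is a minimal sketch of the client-side {id} substitution the MUST clause is describing, assuming a Python client; the example.com template is hypothetical:

def substitute_id(uri_template: str, token_id: int) -> str:
    # Lowercase hex, left-padded with zeros to 64 characters, no 0x prefix.
    return uri_template.replace("{id}", format(token_id, "064x"))

print(substitute_id("https://example.com/{id}.json", 64))
# https://example.com/0000000000000000000000000000000000000000000000000000000000000040.json

Presumably the point of the MUST is that every conforming client resolves the same URL for a given ID, regardless of how large the ID happens to be.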

Reading / Computing Hex received over RS232

I am using Docklight Scripting to put together a VBScript that communicates with a device via RS232. All the commands are sent in Hex.
When I want to read from the device, I send a 32-bit address, a 16-bit read length, and an 8-bit checksum.
When I want to write to the device, I send a 16-bit data length, the data, followed by an 8-bit checksum.
In Hex, the data that is sent to the device is the following:
AA0001110200060013F81800104D
AA 00 01 11 02 0006 0013F818 0010 4D
(spaced for ease of reading)
AA000111020006 is the protocol header, where:
AA is the Protocol Byte
00 is the Source ID
01 is the Dest ID
11 is the Message Type
02 is the Command Byte
0006 is the Length Byte(s)
The remainder of the string is broken down as follows:
0013F818 is the 32-bit address
0010 is the 16 bit read length
4D is the 8-bit checksum
If the string is not correct, or the checksum is invalid, the device replies with an error string. However, I am not getting an error. The device replies with the following hex string:
AA0100120200100001000000000100000000000001000029
AA 01 00 12 02 0010 00010000000001000000000000010000 29
(spaced for ease of reading)
Again, the first part of the string (AA01001202) is a part of the protocol header, where:
AA is the Protocol Byte
01 is the Source ID
00 is the Dest ID
12 is the Message Type
02 is the Command Byte
The difference between what is sent to the device and what the device replies with is that the length bytes are not a "static" part of the protocol header, and will change based on the request. The remainder of the string is broken down as follows:
0010 is the Length Byte(s)
00010000000001000000000000010000 is the data
29 is the 8-bit Check Sum
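To make the layout concrete, here is a small Python sketch that splits the reply into the fields described above, assuming the 2-byte length field is big-endian and counts the data bytes (which matches the example: 0x0010 = 16 data bytes):

reply = bytes.fromhex("AA0100120200100001000000000100000000000001000029")

protocol, source_id, dest_id, msg_type, command = reply[0:5]
length = int.from_bytes(reply[5:7], "big")   # 0x0010 = 16
data = reply[7:7 + length]
checksum = reply[7 + length]

print(hex(protocol), hex(source_id), hex(dest_id), hex(msg_type), hex(command))
print(length, data.hex(), hex(checksum))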
The goal is to read a timer that is stored in the NVM. The timer is stored in the upper halves of 60 4-byte NVM words.
The instructions specify that I need to read the first two bytes of each word, and then sum the results.
Verbatim, the instructions say:
Read the NVM elapsed timer. The timer is stored in the upper halves of 60 4-byte words.
Read the first two bytes of each word of the timer. Read the 16 bit values of these locations:
13F800H, 13F804H, 13F808H, and continue to 13F8ECH.
Sum the results. Multiply the sum by 409.6 seconds, then divide by 3600 to get the results in hours.
My knowledge of bits, and bytes, and all other things is a bit cloudy. The first thing I need to confirm is that I am understanding the read protocol correctly.
I am assuming that when I specify 0010 as the 16 bit read length, that translates to the 16-bit values that the instructions want me to read.
The second thing I need to understand a little better is that when it tells me to read the first two bytes of each word, what exactly constitutes the first two bytes of each word?
I think what confuses me a little more is that the instructions say the timer is stored in the upper half of the 4 byte word (which to me seems like the first half).
I've sat with another colleague of mine for a day trying to figure out how to make this all work, and we haven't had any consistent results with our trials.
I have looked on the internet to find something that would explain this better in the context being used.
Another worry is that the technical documentation I am using for this project isn't 100% accurate; it has conflicting information or skips information throughout the publication (which is probably close to 1000 pages long).
What I would really appreciate is for someone with a much better understanding of hex and binary to review the instructions I've posted, provide some feedback on my interpretation of them, and provide any other information that might help.
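For what it's worth, here is a sketch of the arithmetic the quoted instructions describe, assuming each read returns the 16-bit value stored in the first two bytes of a 4-byte word and that those values are big-endian; the sample readings are made up:

addresses = [0x13F800 + 4 * i for i in range(60)]   # 0x13F800, 0x13F804, ... 0x13F8EC

def timer_hours(word_values):
    total = sum(word_values)    # sum of the 60 16-bit readings
    seconds = total * 409.6     # per the instructions
    return seconds / 3600       # convert seconds to hours

print(timer_hours([0x0001] * 60))   # made-up data: 60 readings of 1 -> about 6.83 hours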

varchar(25000) or text. What's the proper data type for this variable data?

I have a table that contains a column which can store strings of different sizes; these can be simple words like "hello" or long texts of up to 25 thousand characters.
I know the byte sizes of the data types and I have read some answers from this same site, but I have not found concrete references that allow me to decide on this particular case.
Is a maximum of 25000 characters too much data for varchar?
Maybe yes, and then I should use text. But what if most of the strings do not exceed 20 characters and there are only a few exceptions where the text is 25000 characters long?
What data type should I use: varchar(25000) or text?
If you're not building something that has to account for every little byte of space on your DB server, I guess it doesn't really matter. If you have a large amount of records below 255 bytes, you'll save about one byte for each record by going with varchar.
In cases like this I personally prefer text, mainly to avoid running into trouble with a defined length that turns out to be too small.
From the MySQL Documentation:
VARCHAR(M), VARBINARY(M): L + 1 bytes if column values require 0-255 bytes, L + 2 bytes if values may require more than 255 bytes
BLOB, TEXT: L + 2 bytes, where L < 2^16
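A quick illustration of those formulas, where L is the byte length of the value actually stored (a rough sketch of the documented row-storage cost, not exact on-disk sizes):

def varchar_storage(L, max_column_bytes):
    # 1-byte length prefix if the column can never exceed 255 bytes, else 2 bytes
    return L + (1 if max_column_bytes <= 255 else 2)

def text_storage(L):
    return L + 2

print(varchar_storage(20, 25000), text_storage(20))        # 22 22 -> a short "hello"-sized value
print(varchar_storage(25000, 25000), text_storage(25000))  # 25002 25002 -> the rare long value

By these formulas a VARCHAR(25000) and a TEXT column cost the same per row, so the decision comes down to the other trade-offs mentioned above.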

Why is it useful to know how to convert between numeric bases?

We are learning about converting Binary to Decimal (and vice-versa) as well as other base-conversion methods, but I don't understand the necessity of this knowledge.
Are there any real-world uses for converting numbers between different bases?
When dealing with Unicode escape sequences: '\u2014' in JavaScript is &mdash; (an em dash) in HTML.
When debugging: many debuggers show all numbers in hex.
When writing bitmasks: it's more convenient to specify powers of two in hex (or by writing 1 << 4). A few small illustrations of these follow below.
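For example (using Python here just to keep the snippets short; the escape syntax is the same idea as in JavaScript):

print("\u2014")          # the Unicode escape 2014 (hex) is an em dash
print(int("2014", 16))   # 8212, the same code point written in decimal
print(0x10 == 1 << 4)    # True: bit 4 of a bitmask, written in hex or as a shift
print(hex(0b10100001))   # '0xa1', the kind of value a debugger shows in hex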
In this article I describe a concrete use case. In short, suppose you have a series of bytes you want to transfer using some transport mechanism, but you cannot simply pass the payload as bytes, because you are not able to send binary content. Let's say you can only use 64 characters to encode the payload. A solution to this problem is to convert the bytes (8-bit values) into 6-bit values. Here the number conversion comes into play: consider the series of bytes as one big number whose base is 256, then convert it into a number with base 64 and you are done. Each digit of the new base-64 number denotes a character of your encoded payload.
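A toy version of that idea (this is only meant to show the change-of-base view; real Base64 works on 3-byte groups and adds padding, though for a 3-byte input like the one below the two happen to agree):

ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"

def to_base64_digits(payload: bytes) -> str:
    n = int.from_bytes(payload, "big")   # treat the payload as one base-256 number
    digits = []
    while n:
        n, d = divmod(n, 64)             # peel off base-64 digits, least significant first
        digits.append(ALPHABET[d])
    return "".join(reversed(digits)) or ALPHABET[0]

print(to_base64_digits(b"Hi!"))          # SGkh, same as standard Base64 for this input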
If you have a device, such as a hard drive, that can only have a set number of states, you can only count in a number system with that many states.
Because a computer's bits have only two states, on and off, you can only represent 0 and 1, and therefore a base-2 system is used.
If you had a device with 3 states, you could represent 0, 1 and 2, and therefore count in a base-3 system.