I have a function call which only accepts bytes data (dydx _getCallActions)
_getCallAction(bytes memory data)
During contract execution the data is passed to a user-defined function named "callFunction".
When decoding into a single struct it works; however, I want to decode the data into two separate structs.
function callFunction(bytes calldata _data){
// This works, when passed in encoded data matching Struct1Type
Struct1Type memory data1 = abi.decode(_data, (Struct1Type));
}
function callFunction(bytes calldata _data){
// Doesn't work
Struct1Type memory data1, Struct2Type memory data2 = abi.decode(_data, (Struct1Type,Struct2Type));
}
I could decode the data into a single struct and then selectively cast it into the two desired structs, but this seems gas-inefficient.
You can split the array by the total byte length of the first struct, rounded up to a multiple of 32 (the slot length), and then decode each chunk separately.
In the example below, Struct1Type holds just 1 byte of data, but its memory and storage slots take up a whole 32-byte word. That's why we're splitting at index 32.
Code:
pragma solidity ^0.8;
contract MyContract {
struct Struct1Type {
uint8 number;
}
struct Struct2Type {
uint16 number;
}
function callFunction(bytes calldata _data) external pure returns (Struct1Type memory, Struct2Type memory) {
// `_data[:32]` returns the chunk from the beginning up to (but not including) index 32
Struct1Type memory data1 = abi.decode(_data[:32], (Struct1Type));
// `_data[32:]` returns the chunk from index 32 to the end
Struct2Type memory data2 = abi.decode(_data[32:], (Struct2Type));
return (data1, data2);
}
}
Input:
# two values: `1` and `2`
0x00000000000000000000000000000000000000000000000000000000000000010000000000000000000000000000000000000000000000000000000000000002
Output:
0: tuple(uint8): 1
1: tuple(uint16): 2
I'm looking into the source code for Okio in order to understand efficient byte transferring better, and as a toy example made a little ForwardingSource which inverts individual bytes as they come along. For example, it transforms (unsigned) 0b1011 to (unsigned) 0b0100.
class ByteInvertingSource(source: Source) : ForwardingSource(source) {
// temporarily stores incoming bytes
private val sourceBuffer: Buffer = Buffer()
override fun read(sink: Buffer, byteCount: Long): Long {
// read incoming bytes
val count = delegate.read(sourceBuffer, byteCount)
// write inverted bytes to sink
sink.write(
sourceBuffer.readByteArray().apply {
println("Converting: ${joinToString(",") { it.toString(2) }}")
forEachIndexed { index, byte -> this[index] = byte.inv() }
println("Converted : ${joinToString(",") { it.toString(2) }}")
}
)
return count
}
}
Is this optimal code?
Specifically:
Do I really need the sourceBuffer field, or could I use another trick to transform the bytes directly?
Is it more efficient to read the individual bytes from sourceBuffer and write the individual bytes into sink? (I can't find a write(Byte) method, so maybe that is a clue that it's not.)
It looks pretty close to this testing sample from OkHttp.
https://github.com/square/okhttp/blob/f8fd4d08decf697013008b05ad7d2be10a648358/okhttp-testing-support/src/main/kotlin/okhttp3/UppercaseResponseInterceptor.kt
override fun read(
sink: Buffer,
byteCount: Long
): Long {
val buffer = Buffer()
val read = delegate.read(buffer, byteCount)
if (read != -1L) {
sink.write(buffer.readByteString().toAsciiUppercase())
}
return read
}
It is definitely not more efficient to read individual bytes. I don't think you can improve your invert loop, as it's an operation on a single byte. But generally you don't want to be doing loops in your code, so definitely do the bulk reads.
I am loading a lot of CSV files into a struct using Golang.
The struct is
type csvData struct {
Index []time.Time
Columns map[string][]float64
}
I have a parser that uses:
csv.NewReader(file).ReadAll()
Then I iterate over the rows, and convert the values into their types: time.Time or float64.
The problem is that on disk these files consume 5GB space.
Once I load them into memory they consume 12GB!
I used ioutil.ReadFile(path) and found that this was, as expected, almost exactly the on-disk size.
Here is the code for my parser, with error handling omitted for readability (I'd appreciate help troubleshooting it):
var inMemoryRepo = make([]csvData, 0)
func LoadCSVIntoMemory(path string) {
parsedData := csvData{make([]time.Time, 0), make(map[string][]float64)}
file, _ := os.Open(path)
reader := csv.NewReader(file)
columnNames, _ := reader.Read()
columnData, _ := reader.ReadAll()
for _, row := range columnData {
parsedData.Index = append(parsedData.Index, parseTime(row[0])) //parseTime is a simple wrapper for time.Parse
for i := range row[1:] { //parse non-index numeric columns
parsedData.Columns[columnNames[i]] = append(parsedData.Columns[columnNames[i]], parseFloat(row[i+1])) //parseFloat is wrapper for strconv.ParseFloat
}
}
inMemoryRepo = append(inMemoryRepo, parsedData)
}
I tried troubleshooting by setting columnData and reader to nil at the end of the function, but there was no change.
There is nothing surprising in this. On your disk there are just the characters (bytes) of your CSV text. When you load them into memory, you create data structures from your text.
For example, a float64 value requires 64 bits in memory, that is: 8 bytes. If you have an input text "1", that is 1 single byte. Yet, if you create a float64 value equal to 1, it will still consume 8 bytes.
Further, strings are stored with a string header (reflect.StringHeader), which is 2 integer values (16 bytes on 64-bit architectures); this header points to the actual string data. See String memory usage in Golang for details.
Slices are similar data structures: reflect.SliceHeader consists of 3 integer values, which is 24 bytes on 64-bit architectures, even if there are no elements in the slice.
Structs on top of this may have padding (fields must be aligned to certain values), which again adds overhead. For details, see Spec: Size and alignment guarantees.
Go maps are hashmaps, which again carry quite some overhead; for details see why slice values can sometimes go stale but never map values?, and for memory usage see How much memory do golang maps reserve?
Reading an entire file into memory is rarely a good idea.
What if your CSV is 100 GiB?
If your transformation does not involve several records at once, maybe you could apply the following algorithm:
open csv_reader (source file)
open csv_writer (destination file)
for row in csv_reader
transform row
write row into csv_writer
close csv_reader and csv_writer
So I have a project with lots of incoming data, about 15 sources in total, and of course there are inconsistencies in how each labels the data available in their REST APIs. I need to change some of their field names to be consistent with the others, but I am at a loss on how to do this when the data sources are JSON object arrays. A working example of what I am trying to do is found here (playground) and below;
however, I seem to lack the knowledge to make this work when the data is not a single JSON object, but an array of objects that I am unmarshaling.
Another approach is using maps, like in this example, but the result is the same: it works great as is for single objects, but I cannot seem to get it to work with JSON object arrays. Iterating through the arrays is not a possibility, as I am collecting about 8,000 records every few minutes.
package main
import (
"encoding/json"
"os"
)
type omit bool
type Value interface{}
type CacheItem struct {
Key string `json:"key"`
MaxAge int `json:"cacheAge"`
Value Value `json:"cacheValue"`
}
func NewCacheItem() (*CacheItem, error) {
i := &CacheItem{}
return i, json.Unmarshal([]byte(`{
"key": "foo",
"cacheAge": 1234,
"cacheValue": {
"nested": true
}
}`), i)
}
func main() {
item, _ := NewCacheItem()
json.NewEncoder(os.Stdout).Encode(struct {
*CacheItem
// Omit bad keys
OmitMaxAge omit `json:"cacheAge,omitempty"`
OmitValue omit `json:"cacheValue,omitempty"`
// Add nice keys
MaxAge int `json:"max_age"`
Value *Value `json:"value"`
}{
CacheItem: item,
// Set the int by value:
MaxAge: item.MaxAge,
// Set the nested struct by reference, avoid making a copy:
Value: &item.Value,
})
}
It appears your desired output is JSON. You can accomplish the conversion by unmarshaling into a slice of structs, iterating through each of those to convert it to the second struct type (your anonymous struct above), appending the results to a slice, and then marshaling that slice back to JSON:
package main
import (
"fmt"
"encoding/json"
)
type omit bool
type Value interface{}
type CacheItem struct {
Key string `json:"key"`
MaxAge int `json:"cacheAge"`
Value Value `json:"cacheValue"`
}
type OutGoing struct {
// Omit bad keys
OmitMaxAge omit `json:"cacheAge,omitempty"`
OmitValue omit `json:"cacheValue,omitempty"`
// Add nice keys
Key string `json:"key"`
MaxAge int `json:"max_age"`
Value *Value `json:"value"`
}
func main() {
objects := make([]CacheItem, 0)
sample := []byte(`[
{
"key": "foo",
"cacheAge": 1234,
"cacheValue": {
"nested": true
}},
{
"key": "baz",
"cacheAge": 123,
"cacheValue": {
"nested": true
}}]`)
json.Unmarshal(sample, &objects)
out := make([]OutGoing, 0, len(objects))
for i := range objects {
// Index into the slice so each Value pointer refers to its own element;
// taking the address of a field of the loop variable would reuse one address across iterations.
out = append(out, OutGoing{Key: objects[i].Key, MaxAge: objects[i].MaxAge, Value: &objects[i].Value})
}
s, _ := json.Marshal(out)
fmt.Println(string(s))
}
This outputs
[{"key":"foo","max_age":1234,"value":{"nested":true}},{"key":"baz","max_age":123,"value":{"nested":true}}]
You could probably skip this iteration and conversion code if you wrote custom MarshalJSON and UnmarshalJSON methods for your CacheItem type, instead of relying on struct field tags. Then you could pass the same slice to both Unmarshal and Marshal.
To me there's no obvious performance mistake with these approaches -- contrast with building a string in a loop using the + operator -- and when that's the case it's often best to just get the software to work and then test for performance rather than ruling out a solution based on fears of performance issues without actually testing.
If there's a performance problem with the above approaches, and you really want to avoid marshal and unmarshal completely, you could look into byte replacement in the JSON data (e.g. regexp). I'm not recommending this approach, but if your changes are very simple and the inputs are very consistent it could work, and it would give another approach you could performance test, and then you could compare performance test results.
I'm having some trouble parsing a JSON file from an API into Go. This is the JSON I want to parse:
{"method":"stats.provider.ex",
"result":{
"addr":"17a212wdrvEXWuipCV5gcfxdALfMdhMoqh",
"current":[{
"algo":3, // algorithm number (3 = X11)
"name":"X11", // algorithm name
"suffix":"MH", // speed suffix (kH, MH, GH, TH,...)
"profitability":"0.00045845", // current profitability in BTC/suffix/Day
"data":[{ // speed object can contain following fields:
// a (accepted), rt (rejected target), rs (rejected stale),
// rd (rejected duplicate) and ro (rejected other)
// if fields are not present, speed is 0
"a":"23.09", // accepted speed (in MH/s for X11)
"rs":"0.54", // rejected speed - stale
},
"0.0001234" // balance (unpaid)
]},
... // other algorithms here
],
"past":[{
"algo":3,
"data":[
[4863234, // timestamp; multiply with 300 to get UNIX timestamp
{"a":"28.6"}, // speed object
"0" // balance (unpaid)
],[4863235,{"a":"27.4"},"0.00000345"],
... // next entries with inc. timestamps
]},
... // other algorithms here
],
"payments":[{
"amount":"0.00431400",
"fee":"0.00023000",
"TXID":"txidhere",
"time":1453538732, // UNIX timestamp
"type":0 // payment type (0 for standard NiceHash payment)
},
... // other payments here
]
}
}
You can find more info about the API in this link: https://www.nicehash.com/doc-api
The problem I'm experiencing is in the data attribute:
"data":[{ // speed object can contain following fields:
// a (accepted), rt (rejected target), rs (rejected stale),
// rd (rejected duplicate) and ro (rejected other)
// if fields are not present, speed is 0
"a":"23.09", // accepted speed (in MH/s for X11)
"rs":"0.54", // rejected speed - stale
},
"0.0001234" // balance (unpaid)
]},
It's because of the balance (unpaid) line: since it doesn't have a name, I don't know how to define the struct in Go.
It seems that this "data" object can be described by the following struct types (assuming its shape doesn't vary from your examples):
type Data struct {
Timestamp *int64
Speed *Speed
Balance *float64
}
type Speed struct {
Accepted *float64 `json:"a,string,omitempty"`
RejectedTarget *float64 `json:"rt,string,omitempty"`
RejectedStale *float64 `json:"rs,string,omitempty"`
RejectedDuplicate *float64 `json:"rd,string,omitempty"`
RejectedOther *float64 `json:"ro,string,omitempty"`
}
The "Speed" struct has JSON tags since that object is well-suited for the default JSON un/marshaler.
The "Data" struct, however, should implement a custom json.UnmarshalJSON so that it can handle the odd choice of a JSON array with varying types to serialize its fields. Note that my sample implementation below uses the json.RawMessage type to simplify things a bit by allowing the JSON unmarshaler to ensure proper JSON array syntax and store the bytes of each element separately so we can unmarshal them according to their respective types and shapes:
// Parse valid JSON arrays as "Data" by assuming one of the following shapes:
// 1: [int64, Speed, string(float64)]
// 2: [Speed, string(float64)]
func (d *Data) UnmarshalJSON(bs []byte) error {
// Ensure that the bytes contains a valid JSON array.
msgs := []json.RawMessage{}
err := json.Unmarshal(bs, &msgs)
if err != nil {
return err
}
// Parse the initial message as "Timestamp" int64, if necessary.
idx := 0
if len(msgs) == 3 {
ts, err := strconv.ParseInt(string(msgs[idx]), 10, 64)
if err != nil {
return err
}
d.Timestamp = &ts
idx++
}
// Parse the mandatory "Speed" struct per usual.
d.Speed = &Speed{}
err = json.Unmarshal(msgs[idx], &d.Speed)
idx++
if err != nil {
return err
}
// Parse the mandatory "Balance" item after trimming quotes.
balance, err := strconv.ParseFloat(string(msgs[idx][1:len(msgs[idx])-1]), 64)
if err != nil {
return err
}
d.Balance = &balance
return nil
}
As such, you can parse valid, properly shaped JSON arrays as "Data" objects like so:
jsonstr := `[
[4863234, {"a":"28.6"}, "0" ],
[{"a":"23.09","rs":"0.54"},"0.0001234"]
]`
datas := []Data{}
err := json.Unmarshal([]byte(jsonstr), &datas)
if err != nil {
panic(err)
}
// datas[0] = Data{Timestamp:4863234,Speed{Accepted:28.6},Balance:0}
// datas[1] = Data{Speed{Accepted:23.09,RejectedStale:0.54},Balance:0.0001234}
Of course, you would also need to implement json.MarshalJSON if you want to serialize "Data" objects into JSON.
The data field in your JSON object has an array […] as its value, and
in your example that array has two elements: an object and a string apparently containing a floating-point number.
As you can see, this is an array of heterogeneous types, so in Go you have two options.
The first is to create a custom type for the elements of that array, and have that type implement the encoding/json.Unmarshaler interface. Then, in that method, you can get creative about interpreting what kind of data you're about to unmarshal, and act accordingly. Basically, you'd peek into the input data using Decoder.Token and then unmarshal the whole input byte slice into a value of an appropriate type.
The second is to have the value for that data field be unmarshaled into a slice of type []interface{} and then inspect the individual elements with a type switch or a series of "comma ok" type assertions. In this case, an object will be unmarshaled into a map of type map[string]interface{}, and that string will be unmarshaled into a value of type string.
Basically these two approaches can be classified as "detect type as you go"
vs "unmarshal everything into data structures of the most generic types
and deal with the real typing afterwards".
Here's also a third approach.
First, it may well turn out that the types of objects in the array
which is the value of that data field are implicit from their positions
in the array. You may act accordingly by unmarshaling the value of data
into an object of your custom type implementing json.Unmarshaler, which
knows the real type of each data element it processes.
Second, from that
{
// speed object can contain following fields:
// a (accepted), rt (rejected target), rs (rejected stale),
// rd (rejected duplicate) and ro (rejected other)
// if fields are not present, speed is 0
"a":"23.09", // accepted speed (in MH/s for X11)
"rs":"0.54", // rejected speed - stale
}
I'd say that this "object" really can have different combinations of fields,
so to me, this looks like a candidate to be unmarshaled
into map[string]string or map[string]float64,
and not into some struct-typed object.
I have an object. I encode the object to json using json.Encoder.
How can I measure the size of the resulting JSON string in either bytes or bits?
io.Writer and json.Encoder do not expose or maintain the number of written bytes.
One way would be to first marshal the value using json.Marshal() into a []byte, whose length we can get with the builtin len() function. The bit count you seek is the length multiplied by 8 (1 byte is 8 bits). After that you have to write the byte slice to your output manually. For small types this is not a problem, but it may be undesirable for large structs / values, and it adds the extra work of marshaling the value, getting its length and then writing the slice yourself.
A much better and more elegant way is to extend the functionality of any writers to manage the written bytes, using embedding:
type CounterWr struct {
io.Writer
Count int
}
func (cw *CounterWr) Write(p []byte) (n int, err error) {
n, err = cw.Writer.Write(p)
cw.Count += n
return
}
This CounterWr type automatically manages the number of written bytes in its Count field which you can check / examine at any time.
Now create a value of our CounterWr type, passing the io.Writer that you currently use, then pass this CounterWr value to json.NewEncoder(); you can access the number of written bytes from CounterWr.Count directly.
Example usage:
type Something struct {
S string
I int
}
buf := &bytes.Buffer{}
// Any writer, not just a buffer!
var out io.Writer = buf
cw := &CounterWr{Writer: out}
s := Something{"hello", 4}
if err := json.NewEncoder(cw).Encode(s); err != nil {
panic(err)
}
fmt.Printf("Count: %d bytes, %d bits\n", cw.Count, cw.Count*8)
fmt.Printf("Verif: %d bytes, %d bits\n", buf.Len(), buf.Len()*8)
For verification purposes we're also printing the length of the bytes.Buffer we used as our output (CounterWr.Count and Buffer.Len() should match).
Output:
Count: 20 bytes, 160 bits
Verif: 20 bytes, 160 bits
Try it on the Go Playground.
Notes:
If you encode other values too, cw.Count will of course be the total number of bytes written (and not just that of the last value). If you want the size of the last encoded value only, store cw.Count before calling Encoder.Encode() and calculate the difference from the count you get after encoding. Or simply set cw.Count to 0 before encoding (yes, you can also change that field):
cw.Count = 0