Custom JSON Unmarshalling for string-encoded number - json

I have a struct which contains various currency values, in cents (1/100 USD):
type CurrencyValues struct {
v1 int `json:"v1,string"`
v2 int `json:"v2,string"`
}
I'd like to create a custom json Unmarshaller for currency values with thousand separators. These values are encoded as strings, with one or more thousand separators (,), and possibly a decimal point (.).
For this JSON {"v1": "10", "v2": "1,503.21"}, I'd like to unmarshal into CurrencyValues{v1: 1000, v2: 150321}.
Following a similar answer here: Golang: How to unmarshall both 0 and false as bool from JSON, I went ahead and created a custom type for my currency fields, which includes a custom unmarshalling function:
type ConvertibleCentValue int
func (cents *ConvertibleCentValue) UnmarshalJSON(data []byte) error {
asString := string(data)
// Remove thousands separators
asString = strings.Replace(asString, ",", "", -1)
// Parse to float, then convert dollars to cents
if floatVal, err := strconv.ParseFloat(asString, 32); err == nil {
*cents = ConvertibleCentValue(int(floatVal * 100.0))
return nil
} else {
return err
}
}
However, when writing unit tests:
func Test_ConvertibleCentValue_Unmarshal(t *testing.T) {
var c ConvertibleCentValue
assert.Nil(t, json.Unmarshal([]byte("1,500"), &c))
assert.Equal(t, 150000, int(c))
}
I encounter this error:
Error: Expected nil, but got: &json.SyntaxError{msg:"invalid character ',' after top-level value", Offset:2}
What am I missing here?

You're trying to unmarshal the string 1,500, which is not valid JSON. I think what you mean is to unmarshal the JSON string "1,500":
assert.Nil(t, json.Unmarshal([]byte(`"1,500"`), &c))
Note the backticks. Here is a simplified example:
b := []byte(`1,500`)
var s string
err := json.Unmarshal(b, &s)
fmt.Println(s, err) // Prints error.
b = []byte(`"1,500"`)
err = json.Unmarshal(b, &s)
fmt.Println(s, err) // Works fine.
Playground: http://play.golang.org/p/uwayOSgmTv.
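Once the test input is a quoted JSON string, two further details in the question's UnmarshalJSON are worth checking: the data slice passed to it still includes the surrounding quotes, and truncating floatVal * 100.0 with int() can land one cent low. The following is a minimal sketch of one way to handle both, assuming values are always JSON strings; parseCents is a hypothetical helper added only for the demonstration.

```go
package main

import (
	"encoding/json"
	"fmt"
	"math"
	"strconv"
	"strings"
)

// ConvertibleCentValue holds an amount in cents.
type ConvertibleCentValue int

func (cents *ConvertibleCentValue) UnmarshalJSON(data []byte) error {
	// Decode the JSON string first; this strips the surrounding quotes.
	var s string
	if err := json.Unmarshal(data, &s); err != nil {
		return err
	}
	// Remove thousands separators.
	s = strings.ReplaceAll(s, ",", "")
	dollars, err := strconv.ParseFloat(s, 64)
	if err != nil {
		return err
	}
	// Round instead of truncating to absorb floating-point error.
	*cents = ConvertibleCentValue(math.Round(dollars * 100))
	return nil
}

// parseCents is a hypothetical helper that decodes one JSON value into cents.
func parseCents(jsonValue string) (int, error) {
	var c ConvertibleCentValue
	err := json.Unmarshal([]byte(jsonValue), &c)
	return int(c), err
}

func main() {
	for _, in := range []string{`"10"`, `"1,500"`, `"1,503.21"`} {
		cents, err := parseCents(in)
		fmt.Println(in, cents, err)
	}
}
```

For exact currency handling, parsing the integer and fractional parts separately would avoid floats altogether; the rounding above is just the smallest change to the original approach.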

Related

Unmarshalling `time.Time` from JSON fails when escaping '+' as `\u002b` in files but works in plain strings: cannot parse "\\u002b00:00\"" as "Z07:00"

I'm unmarshalling into a struct that has a time.Time field named Foo:
type AStructWithTime struct {
Foo time.Time `json:"foo"`
}
My expectation is that after unmarshalling I get something like this:
var expectedStruct = AStructWithTime{
Foo: time.Date(2022, 9, 26, 21, 0, 0, 0, time.UTC),
}
Working Example 1: Plain JSON Objects into Structs
This works fine when working with plain json strings:
func Test_Unmarshalling_DateTime_From_String(t *testing.T) {
jsonStrings := []string{
"{\"foo\": \"2022-09-26T21:00:00Z\"}", // trailing Z = UTC offset
"{\"foo\": \"2022-09-26T21:00:00+00:00\"}", // explicit zero offset
"{\"foo\": \"2022-09-26T21:00:00\u002b00:00\"}", // \u002b is an escaped '+'
}
for _, jsonString := range jsonStrings {
var deserializedStruct AStructWithTime
err := json.Unmarshal([]byte(jsonString), &deserializedStruct)
if err != nil {
t.Fatalf("Could not unmarshal '%s': %v", jsonString, err) // doesn't happen
}
if deserializedStruct.Foo.Unix() != expectedStruct.Foo.Unix() {
t.Fatal("Unmarshalling is erroneous") // doesn't happen
}
// works; no errors
}
}
Working Example 2: JSON Array into Slice
It also works, if I unmarshal the same objects from a json array into a slice:
func Test_Unmarshalling_DateTime_From_Array(t *testing.T) {
// these are just the same objects as above, just all in one array instead of as single objects/dicts
jsonArrayString := "[{\"foo\": \"2022-09-26T21:00:00Z\"},{\"foo\": \"2022-09-26T21:00:00+00:00\"},{\"foo\": \"2022-09-26T21:00:00\u002b00:00\"}]"
var slice []AStructWithTime // and now I need to unmarshal into a slice
unmarshalErr := json.Unmarshal([]byte(jsonArrayString), &slice)
if unmarshalErr != nil {
t.Fatalf("Could not unmarshal array: %v", unmarshalErr)
}
for index, instance := range slice {
if instance.Foo.Unix() != expectedStruct.Foo.Unix() {
t.Fatalf("Unmarshalling failed for index %v: Expected %v but got %v", index, expectedStruct.Foo, instance.Foo)
}
}
// works; no errors
}
Not Working Example
Now I do the same unmarshalling with a JSON read from a file "test.json". Its content is the array from the working example above:
[
{
"foo": "2022-09-26T21:00:00Z"
},
{
"foo": "2022-09-26T21:00:00+00:00"
},
{
"foo": "2022-09-26T21:00:00\u002b00:00"
}
]
The code is:
func Test_Unmarshalling_DateTime_From_File(t *testing.T) {
fileName := "test.json"
fileContent, readErr := os.ReadFile(filepath.FromSlash(fileName))
if readErr != nil {
t.Fatalf("Could not read file %s: %v", fileName, readErr)
}
if fileContent == nil {
t.Fatalf("File %s must not be empty", fileName)
}
var slice []AStructWithTime
unmarshalErr := json.Unmarshal(fileContent, &slice)
if unmarshalErr != nil {
// ERROR HAPPENS HERE
// Could not unmarshal file content test.json: parsing time "\"2022-09-26T21:00:00\\u002b00:00\"" as "\"2006-01-02T15:04:05Z07:00\"": cannot parse "\\u002b00:00\"" as "Z07:00"
t.Fatalf("Could not unmarshal file content %s: %v", fileName, unmarshalErr)
}
for index, instance := range slice {
if instance.Foo.Unix() != expectedStruct.Foo.Unix() {
t.Fatalf("Unmarshalling failed for index %v in file %s. Expected %v but got %v", index, fileName, expectedStruct.Foo, instance.Foo)
}
}
}
It fails because of the escaped '+'.
parsing time ""2022-09-26T21:00:00\u002b00:00"" as ""2006-01-02T15:04:05Z07:00"": cannot parse "\u002b00:00"" as "Z07:00"
Question: Why does unmarshalling the time.Time field fail when it's being read from a file but works when the same json is read from an identical string?
I believe that this is a bug in encoding/json.
Both the JSON grammar at https://www.json.org and the IETF definition of JSON at RFC 8259, Section 7: Strings provide that a JSON string may contain Unicode escape sequences:
7. Strings
The representation of strings is similar to conventions used in the C
family of programming languages. A string begins and ends with quotation
marks. All Unicode characters may be placed within the quotation marks,
except for the characters that MUST be escaped: quotation mark, reverse
solidus, and the control characters (U+0000 through U+001F).
Any character may be escaped. If the character is in the Basic
Multilingual Plane (U+0000 through U+FFFF), then it may be represented as a
six-character sequence: a reverse solidus, followed by the lowercase letter
u, followed by four hexadecimal digits that encode the character's code
point. The hexadecimal letters A through F can be uppercase or lowercase.
So, for example, a string containing only a single reverse solidus
character may be represented as "\u005C".
. . .
To escape an extended character that is not in the Basic Multilingual
Plane, the character is represented as a 12-character sequence, encoding
the UTF-16 surrogate pair. So, for example, a string containing only the
G-clef character (U+1D11E) may be represented as "\uD834\uDD1E".
string = quotation-mark *char quotation-mark
char = unescaped /
escape (
%x22 / ; " quotation mark U+0022
%x5C / ; \ reverse solidus U+005C
%x2F / ; / solidus U+002F
%x62 / ; b backspace U+0008
%x66 / ; f form feed U+000C
%x6E / ; n line feed U+000A
%x72 / ; r carriage return U+000D
%x74 / ; t tab U+0009
%x75 4HEXDIG ) ; uXXXX U+XXXX
escape = %x5C ; \
quotation-mark = %x22 ; "
unescaped = %x20-21 / %x23-5B / %x5D-10FFFF
The JSON document from the original post
{
"foo": "2022-09-26T21:00:00\u002b00:00"
}
parses and deserializes perfectly fine in Node.js using JSON.parse().
Here's an example demonstrating the bug:
package main
import (
"encoding/json"
"fmt"
"time"
)
var document []byte = []byte(`
{
"value": "2022-09-26T21:00:00\u002b00:00"
}
`)
func main() {
deserializeJsonAsTime()
deserializeJsonAsString()
}
func deserializeJsonAsTime() {
fmt.Println("")
fmt.Println("Deserializing JSON as time.Time ...")
type Widget struct {
Value time.Time `json:"value"`
}
expected := Widget{
Value: time.Date(2022, 9, 26, 21, 0, 0, 0, time.UTC),
}
actual := Widget{}
err := json.Unmarshal(document, &actual)
switch {
case err != nil:
fmt.Println("Error deserializing JSON as time.Time")
fmt.Println(err)
case actual.Value != expected.Value:
fmt.Printf("Unmarshalling failed: expected %v but got %v\n", expected.Value, actual.Value)
default:
fmt.Println("Success")
}
}
func deserializeJsonAsString() {
fmt.Println("")
fmt.Println("Deserializing JSON as string ...")
type Widget struct {
Value string `json:"value"`
}
expected := Widget{
Value: "2022-09-26T21:00:00+00:00",
}
actual := Widget{}
err := json.Unmarshal(document, &actual)
switch {
case err != nil:
fmt.Println("Error deserializing JSON as string")
fmt.Println(err)
case actual.Value != expected.Value:
fmt.Printf("Unmarshalling failed: expected %v but got %v\n", expected.Value, actual.Value)
default:
fmt.Println("Success")
}
}
When run (see https://goplay.tools/snippet/fHQQVJ8GfPp), we get:
Deserializing JSON as time.Time ...
Error deserializing JSON as time.Time
parsing time "\"2022-09-26T21:00:00\\u002b00:00\"" as "\"2006-01-02T15:04:05Z07:00\"": cannot parse "\\u002b00:00\"" as "Z07:00"
Deserializing JSON as string ...
Success
Since deserializing a JSON string containing Unicode escape sequences as a string yields the expected result (the escape sequence is turned into the expected rune/byte sequence), the problem seemingly lies in the code that handles deserialization to time.Time: it does not appear to first decode the JSON string and then parse the resulting value as a time.Time.
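This also suggests why the string-based tests in the question pass while the file-based one fails. In an interpreted (double-quoted) Go string literal, the Go compiler itself decodes \u002b into '+', so the JSON decoder never sees an escape sequence. A file, like a raw (backtick) string literal, keeps the backslash as written. A small sketch that makes the difference visible; containsEscape is a hypothetical helper used only for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// interpreted: the Go compiler decodes \u002b, so the string holds a literal '+'.
const interpreted = "{\"foo\": \"2022-09-26T21:00:00\u002b00:00\"}"

// raw: the backslash sequence is kept verbatim, as it would be in a file.
const raw = `{"foo": "2022-09-26T21:00:00\u002b00:00"}`

// containsEscape reports whether s still contains the six-character
// escape sequence \u002b rather than a decoded '+'.
func containsEscape(s string) bool {
	return strings.Contains(s, `\u002b`)
}

func main() {
	fmt.Println(interpreted, containsEscape(interpreted))
	fmt.Println(raw, containsEscape(raw))
}
```

To reproduce the file behaviour in a test without a file, the JSON can be written as a raw string literal, as the example program above does with its document variable.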
As Brits points out, this is a known issue: time: UnmarshalJSON does not respect escaped unicode characters. The two errors that occur when calling json.Unmarshal on the document {"value": "2022-09-26T21:00:00\u002b00:00"} can be worked around as follows.
JSON fails when escaping '+' as '\u002b'
Solution: Converting escaped unicode to utf8 through strconv.Unquote
cannot parse "\\u002b00:00\"" as "Z07:00"
Solution: parse time with this format "2006-01-02T15:04:05-07:00"
stdNumColonTZ // "-07:00" from src/time/format.go
If you want to parse TimeZone from it, time.ParseInLocation could be used.
To make it compatible with json.Unmarshal, we could define a new type utf8Time:
type utf8Time struct {
time.Time
}
func (t *utf8Time) UnmarshalJSON(data []byte) error {
str, err := strconv.Unquote(string(data))
if err != nil {
return err
}
tmpT, err := time.Parse("2006-01-02T15:04:05-07:00", str)
if err != nil {
return err
}
*t = utf8Time{tmpT}
return nil
}
func (t utf8Time) String() string {
return t.Format("2006-01-02 15:04:05.999999999 -0700 MST")
}
Then call json.Unmarshal:
type MyDoc struct {
Value utf8Time `json:"value"`
}
var document = []byte(`{"value": "2022-09-26T21:00:00\u002b00:00"}`)
func main() {
var mydoc MyDoc
err := json.Unmarshal(document, &mydoc)
if err != nil {
fmt.Println(err)
}
fmt.Println(mydoc.Value)
}
Output
2022-09-26 21:00:00 +0000 +0000

Custom unmarshaling a struct into a map of slices

I thought I understood unmarshalling by now, but I guess not. I'm having a little bit of trouble unmarshalling a map in Go. Here is the code that I have so far:
type OHLC_RESS struct {
Pair map[string][]Candles
Last int64 `json:"last"`
}
type Candles struct {
Time uint64
Open string
High string
Low string
Close string
VWAP string
Volume string
Count int
}
func (c *Candles) UnmarshalJSON(d []byte) error {
tmp := []interface{}{&c.Time, &c.Open, &c.High, &c.Low, &c.Close, &c.VWAP, &c.Volume, &c.Count}
length := len(tmp)
err := json.Unmarshal(d, &tmp)
if err != nil {
return err
}
g := len(tmp)
if g != length {
return fmt.Errorf("Lengths don't match: %d != %d", g, length)
}
return nil
}
func main() {
response := []byte(`{"XXBTZUSD":[[1616662740,"52591.9","52599.9","52591.8","52599.9","52599.1","0.11091626",5],[1616662740,"52591.9","52599.9","52591.8","52599.9","52599.1","0.11091626",5]],"last":15}`)
var resp OHLC_RESS
err := json.Unmarshal(response, &resp)
if err != nil {
fmt.Println("error: ", err)
}
fmt.Println("resp: ", resp)
}
After running the code, the last field unmarshals fine, but for whatever reason the map is left without any value. Any help?
The expedient solution, for the specific example JSON, would be to NOT use a map at all but instead change the structure of OHLC_RESS so that it matches the structure of the JSON, i.e.
type OHLC_RESS struct {
Pair []Candles `json:"XXBTZUSD"`
Last int64 `json:"last"`
}
https://go.dev/play/p/Z9PhJt3wX33
However it's safe to assume, I think, that the reason you've opted to use a map is because the JSON object's key(s) that hold the "pairs" can vary and so hardcoding them into the field's tag is out of the question.
To understand why your code doesn't produce the desired result, you have to realize two things. First, the order of a struct's fields has no bearing on how the keys of a JSON object will be decoded. Second, the name Pair holds no special meaning for the unmarshaler. Therefore, by default, the unmarshaler has no way of knowing that your wish is to decode the "XXBTZUSD": [ ... ] element into the Pair map.
So, to get your desired result, you can have the OHLC_RESS implement the json.Unmarshaler interface and do the following:
func (r *OHLC_RESS) UnmarshalJSON(d []byte) error {
// first, decode just the object's keys and leave
// the values as raw, non-decoded JSON
var obj map[string]json.RawMessage
if err := json.Unmarshal(d, &obj); err != nil {
return err
}
// next, look up the "last" element's raw, non-decoded value
// and, if it is present, then decode it into the Last field
if last, ok := obj["last"]; ok {
if err := json.Unmarshal(last, &r.Last); err != nil {
return err
}
// remove the element so it's not in
// the way when decoding the rest below
delete(obj, "last")
}
// finally, decode the rest of the element values
// in the object and store them in the Pair field
r.Pair = make(map[string][]Candles, len(obj))
for key, val := range obj {
cc := []Candles{}
if err := json.Unmarshal(val, &cc); err != nil {
return err
}
r.Pair[key] = cc
}
return nil
}
https://go.dev/play/p/Lj8a8Gx9fWH

Golang: JSON: How do I unmarshal array of strings into []int64

Go's encoding/json package lets you use the ,string struct tag option to marshal/unmarshal string values (like "309230") into an int64 field. Example:
Int64String int64 `json:",string"`
However, this doesn't work for slices, i.e. []int64:
Int64Slice []int64 `json:",string"` // Doesn't work.
Is there any way to marshal/unmarshal JSON string arrays into a []int64 field?
Quote from https://golang.org/pkg/encoding/json:
The "string" option signals that a field is stored as JSON inside a JSON-encoded string. It applies only to fields of string, floating point, integer, or boolean types. This extra level of encoding is sometimes used when communicating with JavaScript programs:
For anyone interested, I found a solution using a custom type having MarshalJSON() and UnmarshalJSON() methods defined.
type Int64StringSlice []int64
func (slice Int64StringSlice) MarshalJSON() ([]byte, error) {
values := make([]string, len(slice))
for i, value := range []int64(slice) {
values[i] = fmt.Sprintf(`"%v"`, value)
}
return []byte(fmt.Sprintf("[%v]", strings.Join(values, ","))), nil
}
func (slice *Int64StringSlice) UnmarshalJSON(b []byte) error {
// Try array of strings first.
var values []string
err := json.Unmarshal(b, &values)
if err != nil {
// Fall back to array of integers:
var values []int64
if err := json.Unmarshal(b, &values); err != nil {
return err
}
*slice = values
return nil
}
*slice = make([]int64, len(values))
for i, value := range values {
value, err := strconv.ParseInt(value, 10, 64)
if err != nil {
return err
}
(*slice)[i] = value
}
return nil
}
The above solution marshals []int64 into a JSON string array. Unmarshaling works from both JSON string and integer arrays, i.e.:
{"bars": ["1729382256910270462", "309286902808622", "23"]}
{"bars": [1729382256910270462, 309286902808622, 23]}
See example at https://play.golang.org/p/BOqUBGR3DXm
As you quoted from json.Marshal(), the ,string option only applies to specific types, namely:
The "string" option signals that a field is stored as JSON inside a JSON-encoded string. It applies only to fields of string, floating point, integer, or boolean types.
You want it to work with a slice, but that is not supported by the json package.
If you still want this functionality, you have to write your custom marshaling / unmarshaling logic.
What you presented works, but it is unnecessarily complex. This is because you created your custom logic on slices, but you only want this functionality on individual elements of the slices (arrays). You don't want to change how an array / slice (as a sequence of elements) is rendered or parsed.
So a much simpler solution is to only create a custom "number" type producing this behavior, and elements of slices of this custom type will behave the same.
Our custom number type and the marshaling / unmarshaling logic:
type Int64Str int64
func (i Int64Str) MarshalJSON() ([]byte, error) {
return json.Marshal(strconv.FormatInt(int64(i), 10))
}
func (i *Int64Str) UnmarshalJSON(b []byte) error {
// Try string first
var s string
if err := json.Unmarshal(b, &s); err == nil {
value, err := strconv.ParseInt(s, 10, 64)
if err != nil {
return err
}
*i = Int64Str(value)
return nil
}
// Fallback to number
return json.Unmarshal(b, (*int64)(i))
}
And that's all!
The type using it:
type Foo struct {
Bars []Int64Str `json:"bars"`
}
Testing it the same way as you did yields the same result. Try it on the Go Playground.
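For readers without access to the playground link, here is a self-contained sketch that repeats the Int64Str type from above and round-trips a document through Foo. The roundTrip helper is hypothetical and exists only to exercise both directions; the input deliberately mixes string and number elements.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strconv"
)

type Int64Str int64

// MarshalJSON always writes the number as a JSON string.
func (i Int64Str) MarshalJSON() ([]byte, error) {
	return json.Marshal(strconv.FormatInt(int64(i), 10))
}

// UnmarshalJSON accepts either a JSON string or a JSON number.
func (i *Int64Str) UnmarshalJSON(b []byte) error {
	var s string
	if err := json.Unmarshal(b, &s); err == nil {
		value, err := strconv.ParseInt(s, 10, 64)
		if err != nil {
			return err
		}
		*i = Int64Str(value)
		return nil
	}
	// Fall back to a plain JSON number.
	return json.Unmarshal(b, (*int64)(i))
}

type Foo struct {
	Bars []Int64Str `json:"bars"`
}

// roundTrip (hypothetical helper) decodes the input and encodes it again.
func roundTrip(in string) (string, error) {
	var foo Foo
	if err := json.Unmarshal([]byte(in), &foo); err != nil {
		return "", err
	}
	out, err := json.Marshal(foo)
	return string(out), err
}

func main() {
	out, err := roundTrip(`{"bars": ["1729382256910270462", 309286902808622, "23"]}`)
	fmt.Println(out, err)
}
```

Because the custom behavior lives on the element type, the slice itself needs no special handling.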

Decode JSON value which can be either string or number

When I make an HTTP call to a REST API, I may get the JSON value count back as a number or a string. I'd like to unmarshal it into an integer in either case. How can I deal with this in Go?
Use the "string" field tag option to specify that strings should be converted to numbers. The documentation for the option is:
The "string" option signals that a field is stored as JSON inside a JSON-encoded string. It applies only to fields of string, floating point, integer, or boolean types. This extra level of encoding is sometimes used when communicating with JavaScript programs:
Here's an example use:
type S struct {
Count int `json:"count,string"`
}
If the JSON value can be number or string, then unmarshal to interface{} and convert to int after unmarshaling:
Count interface{} `json:"count"`
Use this function to convert the interface{} value to an int:
func getInt(v interface{}) (int, error) {
switch v := v.(type) {
case float64:
return int(v), nil
case string:
c, err := strconv.Atoi(v)
if err != nil {
return 0, err
}
return c, nil
default:
return 0, fmt.Errorf("conversion to int from %T not supported", v)
}
}
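Put together, the interface{} approach might look like the following sketch. The field is tagged without the string option, since that option only applies to string, floating point, integer, and boolean fields. The struct S mirrors the earlier example, and decodeCount is a hypothetical helper added for the demonstration.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strconv"
)

type S struct {
	// interface{} receives a float64 for JSON numbers and a string for JSON strings.
	Count interface{} `json:"count"`
}

func getInt(v interface{}) (int, error) {
	switch v := v.(type) {
	case float64:
		return int(v), nil
	case string:
		return strconv.Atoi(v)
	default:
		return 0, fmt.Errorf("conversion to int from %T not supported", v)
	}
}

// decodeCount (hypothetical helper) unmarshals the document and
// converts the count field to an int.
func decodeCount(raw string) (int, error) {
	var s S
	if err := json.Unmarshal([]byte(raw), &s); err != nil {
		return 0, err
	}
	return getInt(s.Count)
}

func main() {
	for _, raw := range []string{`{"count": 42}`, `{"count": "42"}`, `{"count": true}`} {
		n, err := decodeCount(raw)
		fmt.Println(raw, n, err)
	}
}
```

Note that numbers pass through float64 here, so very large integers can lose precision.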
// Format of your expected request
type request struct {
ACTIVE string `json:"active"`
CATEGORY string `json:"category"`
}
// struct to read JSON input
var myReq request
// Decode the received JSON request to struct
decoder := json.NewDecoder(r.Body)
err := decoder.Decode(&myReq)
if err != nil {
log.Println(err)
// Handler for invalid JSON received or if you want to decode the request using another struct with int.
return
}
defer r.Body.Close()
// Convert string to int
numActive, err := strconv.Atoi(myReq.ACTIVE)
if err != nil {
log.Println(err)
// Handler for invalid int received
return
}
// Convert string to int
numCategory, err := strconv.Atoi(myReq.CATEGORY)
if err != nil {
log.Println(err)
// Handler for invalid int received
return
}
I had the same problem with a list of values where the values were string or struct. The solution I'm using is to create a helper struct with fields of expected types and parse value into the correct field.
type Flag struct {
ID string `json:"id"`
Type string `json:"type"`
}
type FlagOrString struct {
Flag *Flag
String *string
}
func (f *FlagOrString) UnmarshalJSON(b []byte) error {
// A JSON string always starts with a double quote;
// anything else is decoded as a Flag object.
if len(b) > 0 && b[0] == '"' {
return json.Unmarshal(b, &f.String)
}
return json.Unmarshal(b, &f.Flag)
}
var MainStruct struct {
Vals []FlagOrString
}
A custom unmarshaller simplifies the code. Personally, I prefer this over interface{} because it explicitly states what the developer expects.

golang format data to JSON format in one pass

My original data format:
id=1, name=peter, age=12
I converted it to JSON string:
{"id" : "1", "name" : "peter", "age" : "12"}
I use the following golang statement to do the conversion:
Regex, err = regexp.Compile(`([^,\s]*)=([^,\s]*)`)
JSON := fmt.Sprintf("{%s}", Regex.ReplaceAllString(inp, `"$1" : "$2"`))
inp is the variable that holds the original data format.
However, now I get a new format:
id=1 name=peter age=12
and I also want to convert it to a JSON string using a method similar to the one above, i.e., a one-pass regex replacement:
{"id" : "1", "name" : "peter", "age" : "12"}
How can I achieve that?
UPDATE: One additional requirement. If the input format is
id=1, name=peter, age="12"
I need to either strip the double quotes or escape them as \" so that I can process the result in the next step. Double quotes can appear at the beginning and end of any value field.
There are two parts to the question: The easy part is serialising to JSON, Go has standard library methods for doing that. I would use that library rather than trying to encode the JSON myself.
The slightly trickier part of your question is parsing the input into a struct or map that can be easily serialised out, and making it flexible enough to accept different input formats.
I would do it with a general interface for converting text to a struct or map, and then implementing the interface for parsing each new input type.
Sample code: (You can run it here)
package main
import (
"encoding/json"
"errors"
"fmt"
"strings"
)
// parseFn describes the function for converting input into a map.
// This could be a struct or something else if the format is well known.
// In real code this would return map[string]interface{}, but for this
// demo I'm just using string
type parseFn func(string) (map[string]string, error)
// parseFormat1 is for fields separated by commas
func parseFormat1(in string) (map[string]string, error) {
data := map[string]string{}
fields := strings.Split(in, ",")
for _, field := range fields {
pair := strings.Split(field, "=")
if len(pair) != 2 {
return nil, errors.New("invalid input")
}
data[strings.Trim(pair[0], ` "`)] = strings.Trim(pair[1], ` "`)
}
return data, nil
}
// parseFormat2 is for lines with no commas
func parseFormat2(in string) (map[string]string, error) {
data := map[string]string{}
fields := strings.Split(in, " ")
for _, field := range fields {
pair := strings.Split(field, "=")
if len(pair) != 2 {
return nil, errors.New("invalid input")
}
data[strings.Trim(pair[0], ` "`)] = strings.Trim(pair[1], ` "`)
}
return data, nil
}
// nullFormat is what we fall back on when we just don't know
func nullFormat(in string) (map[string]string, error) { return nil, errors.New("invalid format") }
// classify just tries to guess the parser to use for the input
func classify(in string) parseFn {
switch {
case strings.Count(in, ", ") > 1:
return parseFormat1
case strings.Count(in, " ") > 1:
return parseFormat2
default:
return nullFormat
}
}
func main() {
testCases := []string{
`id=1, name=peter, age=12`,
`id=1, name=peter, age="12"`,
`id=1 name=peter age=12`,
`id=1;name=peter;age="12"`,
}
for ix, tc := range testCases {
pfn := classify(tc)
d, err := pfn(tc)
if err != nil {
fmt.Printf("\nerror parsing on line %d: %v\n", ix, err)
continue
}
b, err := json.Marshal(d)
if err != nil {
fmt.Printf("\nerror marshaling on line %d: %v\n", ix, err)
continue
}
fmt.Printf("\nSuccess on line %d:\n INPUT: %s\nOUTPUT: %s\n", ix, tc, string(b))
}
}
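If a single regular-expression pass is still preferred, as the question asks, one possible sketch is to extract the key/value pairs with one pattern and let encoding/json handle the quoting. It assumes that keys and values contain no spaces, commas, or embedded quotes; pairRe and toJSON are hypothetical names, not part of the code above.

```go
package main

import (
	"encoding/json"
	"fmt"
	"regexp"
)

// pairRe matches key=value pairs separated by commas and/or whitespace.
// A value may optionally be wrapped in double quotes, which are dropped.
var pairRe = regexp.MustCompile(`([^,\s=]+)="?([^",\s]*)"?`)

// toJSON (hypothetical helper) collects all pairs in one pass over the
// input and serialises them with encoding/json.
func toJSON(in string) (string, error) {
	fields := map[string]string{}
	for _, m := range pairRe.FindAllStringSubmatch(in, -1) {
		fields[m[1]] = m[2]
	}
	b, err := json.Marshal(fields)
	return string(b), err
}

func main() {
	for _, in := range []string{
		`id=1, name=peter, age=12`,
		`id=1 name=peter age=12`,
		`id=1, name=peter, age="12"`,
	} {
		out, err := toJSON(in)
		fmt.Println(out, err)
	}
}
```

Since json.Marshal sorts map keys, the output order is alphabetical rather than the input order; a slice of pairs or a struct would be needed if the original order matters.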