can't read quoted field with gocsv - csv

I have a csv response that comes from an endpoint that I don't control and I'm failing to parse its response because it has quotes. It looks something like this:
name,id,quantity,"status (active, expired)"
John,14,4,active
Bob,12,7,expired
to parse this response I have created the following struct:
type UserInfo struct {
Name string `csv:"name"`
ID string `csv:"id"`
Quantity string `csv:"quantity"`
Status string `csv:"status (active, expired)"`
}
I have tried using
Status string `csv:""status (active, expired)""`
Status string `csv:'"status (active, expired)"'`
but none seem to be helpful, I just can't access the field Status when I use gocsv.Unmarshal.
var actualResult []UserInfo
err = gocsv.Unmarshal(in, &actualResult)
for _, elem := range actualResult {
fmt.Println(elem.Status)
}
And I get nothing as as response.
https://go.dev/play/p/lje1zNO9w6E here's an example

You don't need third party package like gocsv (unless you have specific usecase) when it can be done easily with Go's builtin encoding/csv.
You just have to ignore first line/record which is csv header in your endpoint's response.
csvReader := csv.NewReader(strings.NewReader(csvString))
records, err := csvReader.ReadAll()
if err != nil {
panic(err)
}
var users []UserInfo
// Iterate over all records excluding first one i.e., header
for _, record := range records[1:] {
users = append(users, UserInfo{Name: record[0], ID: record[1], Quantity: record[2], Status: record[3]})
}
fmt.Printf("%v", users)
// Output: [{ John 14 4 active } { Bob 12 7 expired}]
Here is working example on Go Playground based on your use case and sample string.

I simply don't think gocarina/gocsv can parse a header with a quoted comma. I don't see it spelled out anywhere in the documentation that it cannot, but I did some digging and there are clear examples of commas being used in the "CSV annotations", and it looks like the author only conceived of commas in the annotations being used for the purposes of the package/API, and not as part of the column name.
If we look at sample_structs_test.go from the package, we can see commas being used in some of the following ways:
in metadata directives, like "omitempty":
type Sample struct {
Foo string `csv:"foo"`
Bar int `csv:"BAR"`
Baz string `csv:"Baz"`
...
Omit *string `csv:"Omit,omitempty"`
}
for declaring that a field in the struct can be populated from multiple, different headers:
type MultiTagSample struct {
Foo string `csv:"Baz,foo"`
Bar int `csv:"BAR"`
}
You can see this in action, here.
FWIW, the official encoding/json package has the same limitation, and they note it (emphasis added):
The encoding of each struct field can be customized by the format string stored under the "json" key in the struct field's tag. The format string gives the name of the field, possibly followed by a comma-separated list of options. The name may be empty in order to specify options without overriding the default field name.
and
The key name will be used if it's a non-empty string consisting of only Unicode letters, digits, and ASCII punctuation except quotation marks, backslash, and comma.
So, you may not be able to get what you expect/want: sorry, this may just be a limitation of having the ability to annotate your structs. If you want, you could file a bug with gocarina/gocsv.
In the meantime, you can just modify the header as it's coming in. This is example is pretty hacky, but it works: it just replaces "status (active, expired)" with "status (active expired)" and uses the comma-less version to annotate the struct.
endpointReader := strings.NewReader(sCSV)
// Fix header
var bTmp bytes.Buffer
fixer := bufio.NewReader(endpointReader)
header, _ := fixer.ReadString('\n')
header = strings.Replace(header, "\"status (active, expired)\"", "status (active expired)", -1)
bTmp.Write([]byte(header))
// Read rest of CSV
bTmp.ReadFrom(fixer)
// Turn back into a reader
reader := bytes.NewReader(bTmp.Bytes())
var actualResult []UserInfo
...
I can run that and now get:
active
expired

Related

JSON Decoder cannot decode an object into an object

type MiddleMan struct {
User CompletedByUser `json:"user"`
}
type CompletedByUser struct {
DisplayName string `json:"displayName"`
Id string `json:"id"`
}
Using the following types, I run the code
shorterJsonString := `{
"user":{
"displayName":null,
"id":"a03dfee5-a754-4eb9"
}
}`
if !json.Valid([]byte(shorterJsonString)) {
log.Println("Not valid")
}
var middleMan models.MiddleMan
newReader := strings.NewReader(shorterJsonString)
json.NewDecoder(newReader).Decode(&middleMan)
log.Println(middleMan)
Unfortunately, the decoder is seemingly broken for nested objects. Rather than spitting out actual objects, the print prints out
{{ a03dfee5-a754-4eb9 }}
It seems to flatten the whole thing into the id field. What is going on here?
What did you expect to happen / get printed?
The log package (which uses the fmt package) prints structs enclosed in braces, listing field values separated by spaces.
Your MiddleMan has a single field, so it'll look like this:
{field}
Where field is another struct of type CompletedByUser, which has 2 fields, so it'll look like this:
{{field1 field2}}
Where field is of string type, being the empty string, so you'll see the value of field2 prefixed with a space:
{{ a03dfee5-a754-4eb9}}
If you print it adding field names:
log.Printf("%+v", middleMan)
You'll see an output like:
{User:{DisplayName: Id:a03dfee5-a754-4eb9}}
Using another (Go-syntax) format:
log.Printf("%#v", middleMan)
Output:
main.MiddleMan{User:main.CompletedByUser{DisplayName:"", Id:"a03dfee5-a754-4eb9"}}
Try this on the Go Playground.

How to deal with Struct having different json key than json response

I have a struct VideoInfo that has a key in it called embedCode. The API I am querying returns the embed code as embed_code. During unmarshalling the response how do I ensure embed_code goes into embedCode?
Also is there an easy way to take a large json string and automatically turn it into a struct, or can one only use a map?
With respect to remapping the field names use the corresponding annotation in the structure declaration:
type VideoInfo struct {
EmbedCode string `json:"embed_code"`
}
The marshaller/un-marshaller will only process public field, so you need to capitalise the field name.
With respect to converting the whole structure, yes it is easy. Declare an instance to un-marshal into and pass a reference to the json.Unmarshal method (from a test):
data, _ := json.Marshal(request)
var resp response.VideoInfo
if err := json.Unmarshal(data, &resp); err != nil {
t.Errorf("unexpected error, %v", err)
}
At first, struct's field must be start from capital letter to be public. So you need something like that:
type VideoInfo struct {
EmbedCode string `json:"embed_code"`
}
And look at documentation for more info.

How can I override json tags in a Go struct?

I'd like to marshal part of this struct:
type ValueSet struct {
Id string `json:"id" bson:"_id"`
Url string `bson:"url,omitempty" json:"url,omitempty"`
Identifier *Identifier `bson:"identifier,omitempty" json:"identifier,omitempty"`
Version string `bson:"version,omitempty" json:"version,omitempty"`
Name string `bson:"name,omitempty" json:"name,omitempty"`
Status string `bson:"status,omitempty" json:"status,omitempty"`
Experimental *bool `bson:"experimental,omitempty" json:"experimental,omitempty"`
Publisher string `bson:"publisher,omitempty" json:"publisher,omitempty"`
Contact []ValueSetContactComponent `bson:"contact,omitempty" json:"contact,omitempty"`
Date *FHIRDateTime `bson:"date,omitempty" json:"date,omitempty"`
LockedDate *FHIRDateTime `bson:"lockedDate,omitempty" json:"lockedDate,omitempty"`
Description string `bson:"description,omitempty" json:"description,omitempty"`
UseContext []CodeableConcept `bson:"useContext,omitempty" json:"useContext,omitempty"`
Immutable *bool `bson:"immutable,omitempty" json:"immutable,omitempty"`
Requirements string `bson:"requirements,omitempty" json:"requirements,omitempty"`
Copyright string `bson:"copyright,omitempty" json:"copyright,omitempty"`
Extensible *bool `bson:"extensible,omitempty" json:"extensible,omitempty"`
CodeSystem *ValueSetCodeSystemComponent `bson:"codeSystem,omitempty" json:"codeSystem,omitempty"`
Compose *ValueSetComposeComponent `bson:"compose,omitempty" json:"compose,omitempty"`
Expansion *ValueSetExpansionComponent `bson:"expansion,omitempty" json:"expansion,omitempty"`
}
which is part of a Go implementation of HL7 FHIR, including only the metadata fields, and omitting the three content three fields (codeSystem, compose and expansion). I can't (and shouldn't) change the JSON tags in the original source code, since other code strongly depends on it working the way it's written. How can I tell json.Marshal to override the existing JSON tags on these struct elements?
You can't change it, but you don't have to.
Easiest solution is to create your own struct, define your own json tags (how you want them to appear in the output), copy the fields, and marshal a value of your own struct.
E.g. let's say you want to marshal the Id and Urlfields, then:
type MyValueSet struct {
Id string `json:"MyId"`
Url string `json:"MyUrl"`
}
var vs ValueSet = ... // Comes from somewhere
mvs := MyValueSet {
Id: vs.Id,
Url: vs.Url,
}
data, err := json.Marshal(&mvs)
// Check err

Write struct to csv file

What is an idiomatic golang way to dump the struct into a csv file provided? I am inside a func where my struct is passed as interface{}:
func decode_and_csv(my_response *http.Response, my_struct interface{})
Why interface{}? - reading data from JSON and there could be a few different structs returned, so trying to write a generic enough function.
an example of my types:
type Location []struct {
Name string `json: "Name"`
Region string `json: "Region"`
Type string `json: "Type"`
}
It would be a lot easier if you used a concrete type. You'll probably want to use the encoding/csv package, here is a relevant example; https://golang.org/pkg/encoding/csv/#example_Writer
As you can see, the Write method is expecting a []string so in order to generate this, you'll have to either 1) provide a helper method or 2) reflect my_struct. Personally, I prefer the first method but it depends on your needs. If you want to go route two you can get all the fields on the struct an use them as the column headers, then iterate the fields getting the value for each, use append in that loop to add them to a []string and then pass it to Write out side of the loop.
For the first option, I would define a ToSlice or something on each type and then I would make an interface call it CsvAble that requires the ToSlice method. Change the type in your method my_struct CsvAble instead of using the empty interface and then you can just call ToSlice on my_struct and pass the return value into Write. You could have that return the column headers as well (meaning you would get back a [][]string and need to iterate the outer dimension passing each []string into Write) or you could require another method to satisfy the interface like GetHeaders that returns a []string which is the column headers. If that were the case your code would look something like;
w := csv.NewWriter(os.Stdout)
headers := my_struct.GetHeaders()
values := my_struct.ToSlice()
if err := w.Write(headers); err != nil {
//write failed do something
}
if err := w.Write(values); err != nil {
//write failed do something
}
If that doesn't make sense let me know and I can follow up with a code sample for either of the two approaches.

Go json.Unmarshal key with \u0000 \x00

Here is the Go playground link.
Basically there are some special characters ('\u0000') in my JSON string key:
var j = []byte(`{"Page":1,"Fruits":["5","6"],"\u0000*\u0000_errorMessages":{"x":"123"},"*_successMessages":{"ok":"hi"}}`)
I want to Unmarshal it into a struct:
type Response1 struct {
Page int
Fruits []string
Msg interface{} `json:"*_errorMessages"`
Msg1 interface{} `json:"\\u0000*\\u0000_errorMessages"`
Msg2 interface{} `json:"\u0000*\u0000_errorMessages"`
Msg3 interface{} `json:"\0*\0_errorMessages"`
Msg4 interface{} `json:"\\0*\\0_errorMessages"`
Msg5 interface{} `json:"\x00*\x00_errorMessages"`
Msg6 interface{} `json:"\\x00*\\x00_errorMessages"`
SMsg interface{} `json:"*_successMessages"`
}
I tried a lot but it's not working.
This link might help golang.org/src/encoding/json/encode_test.go.
Short answer: With the current json implementation it is not possible using only struct tags.
Note: It's an implementation restriction, not a specification restriction. (It's the restriction of the json package implementation, not the restriction of the struct tags specification.)
Some background: you specified your tags with a raw string literal:
The value of a raw string literal is the string composed of the uninterpreted (implicitly UTF-8-encoded) characters between the quotes...
So no unescaping or unquoting happens in the content of the raw string literal by the compiler.
The convention for struct tag values quoted from reflect.StructTag:
By convention, tag strings are a concatenation of optionally space-separated key:"value" pairs. Each key is a non-empty string consisting of non-control characters other than space (U+0020 ' '), quote (U+0022 '"'), and colon (U+003A ':'). Each value is quoted using U+0022 '"' characters and Go string literal syntax.
What this means is that by convention tag values are a list of (key:"value") pairs separated by spaces. There are quite a few restrictions for keys, but values may be anything, and values (should) use "Go string literal syntax", this means that these values will be unquoted at runtime from code (by a call to strconv.Unquote(), called from StructTag.Get(), in source file reflect/type.go, currently line #809).
So no need for double quoting. See your simplified example:
type Response1 struct {
Page int
Fruits []string
Msg interface{} `json:"\u0000_abc"`
}
Now the following code:
t := reflect.TypeOf(Response1{})
fmt.Printf("%#v\n", t.Field(2).Tag)
fmt.Printf("%#v\n", t.Field(2).Tag.Get("json"))
Prints:
"json:\"\\u0000_abc\""
"\x00_abc"
As you can see, the value part for the json key is "\x00_abc" so it properly contains the zero character.
But how will the json package use this?
The json package uses the value returned by StructTag.Get() (from the reflect package), exactly what we did. You can see it in the json/encode.go source file, typeFields() function, currently line #1032. So far so good.
Then it calls the unexported json.parseTag() function, in json/tags.go source file, currently line #17. This cuts the part after the comma (which becomes the "tag options").
And finally json.isValidTag() function is called with the previous value, in source file json/encode.go, currently line #731. This function checks the runes of the passed string, and (besides a set of pre-defined allowed characters "!#$%&()*+-./:<=>?#[]^_{|}~ ") rejects everything that is not a unicode letter or digit (as defined by unicode.IsLetter() and unicode.IsDigit()):
if !unicode.IsLetter(c) && !unicode.IsDigit(c) {
return false
}
'\u0000' is not part of the pre-defined allowed characters, and as you can guess now, it is neither a letter nor a digit:
// Following code prints "INVALID":
c := '\u0000'
if !unicode.IsLetter(c) && !unicode.IsDigit(c) {
fmt.Println("INVALID")
}
And since isValidTag() returns false, the name (which is the value for the json key, without the "tag options" part) will be discarded (name = "") and not used. So no match will be found for the struct field containing a unicode zero.
For an alternative solution use a map, or a custom json.Unmarshaler or use json.RawMessage.
But I would highly discourage using such ugly json keys. I understand likely you are just trying to parse such json response and it may be out of your reach, but you should fight against using these keys as they will just cause more problems later on (e.g. if stored in db, by inspecting records it will be very hard to spot that there are '\u0000' characters in them as they may be displayed as nothing).
You cannot do in such way due to: http://golang.org/ref/spec#Struct_types
But You can unmarshal to map[string]interface{} then check field names of that object through regexp.
I don't think this is possible with struct tags. The best thing you can do is unmarshal it into map[string]interface{} and then get the values manually:
var b = []byte(`{"\u0000abc":42}`)
var m map[string]interface{}
err := json.Unmarshal(b, &m)
if err != nil {
panic(err)
}
fmt.Println(m, m["\x00abc"])
Playground: http://play.golang.org/p/RtS7Nst0d7.