I'm trying to parse XML data into a JSON file, but when I write the marshaled data, the file ends up containing only the last XML element, as if the earlier data had been overwritten. How do I write all of the data to the JSON file?
Snippet of code that parses the XML and marshals the data to JSON:
decoder := xml.NewDecoder(file)
resultData := map[string]map[string]string{}
for {
t, _ := decoder.Token()
if t == nil {
break
}
switch et := t.(type) {
case xml.StartElement:
if et.Name.Local == "profile" {
var object XMLProfile
decoder.DecodeElement(&object, &et)
resultData = map[string]map[string]string{
object.ProfileName: {},
}
for _, val := range object.Fields {
resultData[object.ProfileName][val.Name] = val.Value
}
}
}
}
if out, err := json.MarshalIndent(resultData, "", "\t"); err != nil {
panic(err)
} else {
_ = ioutil.WriteFile("test.json", out, 0644)
}
Expected JSON:
{
"Profile 1": {
"role": "user"
},
"Profile 2": {
"role": "user"
},
"Profile 3": {
"role": "admin"
}
}
Actual JSON:
{
"Profile 3": {
"role": "admin"
}
}
It looks like you are recreating resultData every time you reach a node named "profile", which discards the entries collected so far. As a result, only the last profile reaches the code that writes the JSON. Add a key to the existing map instead of replacing the map.
Try this:
decoder := xml.NewDecoder(file)
resultData := map[string]map[string]string{}
for {
t, _ := decoder.Token()
if t == nil {
break
}
switch et := t.(type) {
case xml.StartElement:
if et.Name.Local == "profile" {
var object XMLProfile
decoder.DecodeElement(&object, &et)
resultData[object.ProfileName] = map[string]string{}
for _, val := range object.Fields {
resultData[object.ProfileName][val.Name] = val.Value
}
}
}
}
if out, err := json.MarshalIndent(resultData, "", "\t"); err != nil {
panic(err)
} else {
_ = ioutil.WriteFile("test.json", out, 0644)
}
I would also check that no ProfileName appears more than once in the XML, because a duplicate would overwrite the earlier entry.
I have code like this:
scanner := bufio.NewScanner(reader)
scanner.Split(splitJSON)
for scanner.Scan() {
bb := scanner.Bytes()
}
I would like to get only valid JSON objects from the Scanner, one at a time. In some cases the Scanner's buffer may hold bytes that look like this:
{
"some_object": "name",
"some_fileds": {}
}
{
"some_object":
}
I need only the first part of this
{
"some_object": "name",
"some_fileds": {}
}
For the second one, I should wait until the JSON object is complete.
I have a function like this, but it's horrible and doesn't work.
func splitJSON(
bb []byte, atEOF bool,
) (advance int, token []byte, err error) {
print(string(bb))
if len(bb) < 10 {
return 0, nil, nil
}
var nested, from, to int
var end bool
for i, b := range bb {
if string(b) == "{" {
if end {
to = i
break
}
if nested == 0 {
from = i
}
nested++
}
if string(b) == "}" {
nested--
if nested == 0 {
to = i
end = true
}
}
}
if atEOF {
return len(bb), bb, nil
}
return len(bb[from:to]), bb[from:to], nil
}
UPD
It was solved with this split function:
func splitJSON(data []byte, atEOF bool) (advance int, token []byte, err error) {
if atEOF && len(data) == 0 {
return 0, nil, nil
}
reader := bytes.NewReader(data)
dec := json.NewDecoder(reader)
var raw json.RawMessage
if err := dec.Decode(&raw); err != nil {
return 0, nil, nil
}
return len(raw) + 1, raw, nil
}
Use json.Decoder for this. Each Decoder.Decode() call decodes the next JSON-encoded value from the input, which in your case is a JSON object.
If you don't want to decode the JSON objects and just need the raw JSON data (a byte slice), unmarshal into a json.RawMessage.
For example:
func main() {
reader := strings.NewReader(src)
dec := json.NewDecoder(reader)
for {
var raw json.RawMessage
if err := dec.Decode(&raw); err != nil {
if err == io.EOF {
break
}
fmt.Println("Error:", err)
return
}
fmt.Println("Next:", string(raw))
}
}
const src = `{
"some_object": "name",
"some_fileds": {}
}
{
"some_object": "foo"
}`
This will output:
Next: {
"some_object": "name",
"some_fileds": {}
}
Next: {
"some_object": "foo"
}
I have a Dataset Struct, which looks like this -
type Dataset struct {
Publications []GeneralDetails `bson:"publications,omitempty" json:"publications,omitempty"`
URI string `bson:"uri,omitempty" json:"uri,omitempty"`
}
And a GeneralDetails struct, which looks like this -
type GeneralDetails struct {
Description string `bson:"description,omitempty" json:"description,omitempty"`
HRef string `bson:"href,omitempty" json:"href,omitempty"`
Title string `bson:"title,omitempty" json:"title,omitempty"`
}
I have a ValidateDataset function which trims whitespace and checks that the URLs parse -
func ValidateDataset(ctx context.Context, dataset *Dataset) error {
var generalDetails = &GeneralDetails{}
var invalidFields []string
if dataset.URI != "" {
dataset.URI = strings.TrimSpace(dataset.URI)
_, err := url.Parse(dataset.URI)
if err != nil {
invalidFields = append(invalidFields, "URI")
log.Event(ctx, "error parsing URI", log.ERROR, log.Error(err))
}
}
if dataset.Publications != nil {
generalDetails.HRef = strings.TrimSpace(generalDetails.HRef)
_, err := url.Parse(generalDetails.HRef)
if err != nil {
invalidFields = append(invalidFields, "href")
log.Event(ctx, "error parsing URI", log.ERROR, log.Error(err))
}
}
if invalidFields != nil {
return fmt.Errorf("invalid fields: %v", invalidFields)
}
return nil
}
However, the whitespace is only being trimmed on dataset.URI, but not on generalDetails.HRef and I really don't understand why. Is anyone able to help please?
Edit: here's what's in my test package, which is how I know the whitespace isn't being trimmed -
var testPublications = GeneralDetails{
Description: "some publication description",
HRef: "http://localhost:22000//datasets/publications",
Title: "some publication title",
}
func createDataset() Dataset {
return Dataset{
ID: "123",
URI: "http://localhost:22000/datasets/123",
Publications: []GeneralDetails{
{Description: "some publication description"},
{HRef: "http://localhost:22000//datasets/publications"},
{Title: "some publication title"},
},
}
}
func createGeneralDetails() GeneralDetails {
return testPublications
}
And the actual test itself, which is failing -
Convey("Successful validation (true) returned", t, func() {
Convey("when generalDetails.Href contains whitespace it should not return an error ", func() {
dataset := createDataset()
dataset.ID = "123"
generalDetails := createGeneralDetails()
generalDetails.HRef = " http://localhost:22000//datasets/publications "
validationErr := ValidateDataset(testContext, &dataset)
So(validationErr, ShouldBeNil)
So(generalDetails.HRef, ShouldEqual, "http://localhost:22000//datasets/publications")
})
})
In the function ValidateDataset, you create a new, empty value:
var generalDetails = &GeneralDetails{}
But you never assign anything from the dataset to this variable, and you never iterate over the Publications field of the *Dataset that was passed in:
if dataset.Publications != nil {
generalDetails.HRef = strings.TrimSpace(generalDetails.HRef)
_, err := url.Parse(generalDetails.HRef)
if err != nil {
invalidFields = append(invalidFields, "href")
log.Event(ctx, "error parsing URI", log.ERROR, log.Error(err))
}
}
You're just editing the empty, local generalDetails value and forgetting completely about dataset.Publications.
You'd need to do something like
for i := range dataset.Publications {
generalDetails = &dataset.Publications[i]
// validate
}
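Spelled out, the loop could look like the sketch below. It is a simplified version of the function from the question: the context parameter and the log.Event calls are left out so that it compiles on its own, and the struct tags are dropped for brevity. Note also that the test in the question asserts on a separate generalDetails variable that is never part of dataset, so the whitespace has to be put into dataset.Publications[i].HRef for the test to observe the trimming.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

type GeneralDetails struct {
	Description string
	HRef        string
	Title       string
}

type Dataset struct {
	Publications []GeneralDetails
	URI          string
}

// ValidateDataset trims whitespace in place and reports fields that fail to parse.
// Simplified sketch: no context argument and no logging.
func ValidateDataset(dataset *Dataset) error {
	var invalidFields []string

	if dataset.URI != "" {
		dataset.URI = strings.TrimSpace(dataset.URI)
		if _, err := url.Parse(dataset.URI); err != nil {
			invalidFields = append(invalidFields, "URI")
		}
	}

	for i := range dataset.Publications {
		// Take a pointer to the slice element so the trim is visible to the caller.
		pub := &dataset.Publications[i]
		pub.HRef = strings.TrimSpace(pub.HRef)
		if _, err := url.Parse(pub.HRef); err != nil {
			invalidFields = append(invalidFields, "href")
		}
	}

	if invalidFields != nil {
		return fmt.Errorf("invalid fields: %v", invalidFields)
	}
	return nil
}

func main() {
	dataset := Dataset{
		URI: " http://localhost:22000/datasets/123 ",
		Publications: []GeneralDetails{
			{HRef: " http://localhost:22000//datasets/publications "},
		},
	}
	if err := ValidateDataset(&dataset); err != nil {
		fmt.Println("validation failed:", err)
		return
	}
	fmt.Printf("%q\n", dataset.Publications[0].HRef)
}
```

Ranging by index matters here: `for _, p := range dataset.Publications` would hand you a copy of each element, and the trimmed value would be lost again.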
Here's an example of what I need:
Default JSON:
{
"name": "John",
"greetings": {
"first": "hi",
"second": "hello"
}
}
merged with the changes:
{
"name": "Jane",
"greetings": {
"first": "hey"
}
}
should become:
{
"name": "Jane",
"greetings": {
"first": "hey",
"second": "hello"
}
}
Here's what I've tried:
package main
import (
"encoding/json"
"fmt"
"reflect"
)
func MergeJSON(defaultJSON, changedJSON string) string {
var defaultJSONDecoded map[string]interface{}
defaultJSONUnmarshalErr := json.Unmarshal([]byte(defaultJSON), &defaultJSONDecoded)
if defaultJSONUnmarshalErr != nil {
panic("Error unmarshalling first JSON")
}
var changedJSONDecoded map[string]interface{}
changedJSONUnmarshalErr := json.Unmarshal([]byte(changedJSON), &changedJSONDecoded)
if changedJSONUnmarshalErr != nil {
panic("Error unmarshalling second JSON")
}
for key, _ := range defaultJSONDecoded {
checkKeyBeforeMerging(key, defaultJSONDecoded[key], changedJSONDecoded[key], changedJSONDecoded)
}
mergedJSON, mergedJSONErr := json.Marshal(changedJSONDecoded)
if mergedJSONErr != nil {
panic("Error marshalling merging JSON")
}
return string(mergedJSON)
}
func checkKeyBeforeMerging(key string, defaultMap interface{}, changedMap interface{}, finalMap map[string]interface{}) {
if !reflect.DeepEqual(defaultMap, changedMap) {
switch defaultMap.(type) {
case map[string]interface{}:
//Check that the changed map value doesn't contain this map at all and is nil
if changedMap == nil {
finalMap[key] = defaultMap
} else if _, ok := changedMap.(map[string]interface{}); ok { //Check that the changed map value is also a map[string]interface
defaultMapRef := defaultMap.(map[string]interface{})
changedMapRef := changedMap.(map[string]interface{})
for newKey, _ := range defaultMapRef {
checkKeyBeforeMerging(newKey, defaultMapRef[newKey], changedMapRef[newKey], finalMap)
}
}
default:
//Check if the value was set, otherwise set it
if changedMap == nil {
finalMap[key] = defaultMap
}
}
}
}
func main() {
defaultJSON := `{"name":"John","greetings":{"first":"hi","second":"hello"}}`
changedJSON := `{"name":"Jane","greetings":{"first":"hey"}}`
mergedJSON := MergeJSON(defaultJSON, changedJSON)
fmt.Println(mergedJSON)
}
The code above returns the following:
{
"greetings": {
"first": "hey"
},
"name": "Jane",
"second": "hello"
}
So basically any changes should be applied to the default and return the full JSON. I also need this to work recursively.
How can I fix this? I can see where I went wrong, I'm just not sure how to make it work recursively.
Thanks
Your issue in the posted code is with your recursive call:
checkKeyBeforeMerging(newKey, defaultMapRef[newKey], changedMapRef[newKey], finalMap)
The reference to finalMap should actually be the nested part of the merged map. Meaning replace finalMap with something like finalMap[key].(map[string]interface{}).
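An alternative that avoids the shared finalMap altogether is to have the recursive function return the merged map for its own level. The following is a minimal sketch under that approach; mergeMaps is a hypothetical helper name, and MergeJSON here returns an error instead of panicking, which differs from the signature in the question.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// mergeMaps returns a new map holding defaults overlaid with changes.
// Nested objects present on both sides are merged recursively; any other
// value in changes (including null) replaces the default.
func mergeMaps(defaults, changes map[string]interface{}) map[string]interface{} {
	merged := make(map[string]interface{}, len(defaults))
	for k, v := range defaults {
		merged[k] = v
	}
	for k, cv := range changes {
		dm, dOK := merged[k].(map[string]interface{})
		cm, cOK := cv.(map[string]interface{})
		if dOK && cOK {
			merged[k] = mergeMaps(dm, cm)
		} else {
			merged[k] = cv
		}
	}
	return merged
}

// MergeJSON applies changedJSON on top of defaultJSON and returns the result.
func MergeJSON(defaultJSON, changedJSON string) (string, error) {
	var defaults, changes map[string]interface{}
	if err := json.Unmarshal([]byte(defaultJSON), &defaults); err != nil {
		return "", fmt.Errorf("unmarshalling default JSON: %w", err)
	}
	if err := json.Unmarshal([]byte(changedJSON), &changes); err != nil {
		return "", fmt.Errorf("unmarshalling changed JSON: %w", err)
	}
	out, err := json.Marshal(mergeMaps(defaults, changes))
	if err != nil {
		return "", fmt.Errorf("marshalling merged JSON: %w", err)
	}
	return string(out), nil
}

func main() {
	defaultJSON := `{"name":"John","greetings":{"first":"hi","second":"hello"}}`
	changedJSON := `{"name":"Jane","greetings":{"first":"hey"}}`
	merged, err := MergeJSON(defaultJSON, changedJSON)
	if err != nil {
		panic(err)
	}
	fmt.Println(merged)
	// prints {"greetings":{"first":"hey","second":"hello"},"name":"Jane"}
}
```

Arrays are treated as plain values in this sketch, so an array in the changes replaces the default array rather than being merged element by element.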
I recently came across the same need; here is my solution:
// override common json by input json
func Override(input, common interface{}) {
switch inputData := input.(type) {
case []interface{}:
switch commonData := common.(type) {
case []interface{}:
for idx, v := range inputData {
Override(v, commonData[idx])
}
}
case map[string]interface{}:
switch commonData := common.(type) {
case map[string]interface{}:
for k, v := range commonData {
switch reflect.TypeOf(v).Kind() {
case reflect.Slice, reflect.Map:
Override(inputData[k], v)
default:
// do simply replacement for primitive type
_, ok := inputData[k]
if !ok {
inputData[k] = v
}
}
}
}
}
return
}
Unmarshal both JSON documents before calling it:
// commonJSON and inputJSON stand for the strings holding the two JSON documents
var common interface{}
if err := json.Unmarshal([]byte(commonJSON), &common); err != nil {
logger.Error(err)
return
}
var input interface{}
if err := json.Unmarshal([]byte(inputJSON), &input); err != nil {
logger.Error(err)
return
}
Override(input, common)
I have this code:
type Response struct {
ID string `json:"id"`
Tags Tags `json:"tags,omitempty"`
}
type Tags struct {
Geo []string `json:"geo,omitempty"`
Keyword []string `json:"keyword,omitempty"`
Storm []string `json:"storm,omitempty"`
}
func (t *Tags) UnmarshalJSON(b []byte) (err error) {
str := string(b)
if str == "" {
t = &Tags{}
return nil
}
err = json.Unmarshal(b, t)
if err != nil {
return err
}
return nil
}
Now, my JSON response looks like this:
[{
"id": "/cms/v4/assets/en_US",
"doc": [{
"id": "af02b41d-c2c5-48ec-9dbc-ceed693bdbac",
"tags": {
"geo": [
"DMA:US.740:US"
]
}
},
{
"id": "6a90d9ed-7978-4c18-8e36-c01cf4260492",
"tags": ""
},
{
"id": "32cfd045-98ac-408c-b464-c74e02466339",
"tags": {
"storm": [
"HARVEY - AL092017"
],
"keyword": [
"hurrcane",
"wunderground"
]
}
}
]
}]
Ideally, I'd fix the JSON response at its source, but I cannot. Unmarshaling keeps failing with an error (goroutine stack exceeds 1000000000-byte limit). I'd also prefer to do this using easyjson or ffjson, but I doubt it is possible. Suggestions?
Your UnmarshalJSON function calls itself recursively, which will cause the stack to explode in size.
func (t *Tags) UnmarshalJSON(b []byte) (err error) {
str := string(b)
if str == "" {
t = &Tags{}
return nil
}
err = json.Unmarshal(b, t) // <--- here it calls itself again
if err != nil {
return err
}
return nil
}
If you have a reason to call json.Unmarshal from within an UnmarshalJSON method, it must be on a different type, one that does not have that method. A common way to do this is to declare a local type with the same underlying structure:
type tagsAlias Tags
var ta tagsAlias
err = json.Unmarshal(b, &ta)
if err != nil {
return err
}
*t = Tags(ta)
Also note that t = &Tags{} does nothing in your function; it assigns a new value to the local pointer t, but that value is lost as soon as the function returns. If you really want to reset the value t points to, you need *t = Tags{}; but you don't need that at all, unless you're trying to unset a previously set instance of Tags.
I've this piece of code.
package main
import (
"github.com/gin-gonic/gin"
_ "github.com/go-sql-driver/mysql"
)
func divisionsHandler(c *gin.Context) {
divisions := getDivisionRows()
json := make(map[int]string)
for divisions.Next() {
var d Division
err := divisions.Scan(&d.id, &d.name)
json[d.id] = d.name
if err != nil {
panic(err.Error())
}
}
c.JSON(200, json)
}
The result is
{
1: "games",
2: "technology",
3: "tekk",
4: "home entertainment",
5: "toys & stationery"
}
I am trying to convert that JSON into something like
{
[{
"id": 1,
"name": "games"
},
...
]
}
but how?
So you want a json array instead of a json object?
Instead of loading a map[int]string, why not simply make a []Division?
list := []Division{}
for divisions.Next() {
var d Division
if err := divisions.Scan(&d.id, &d.name); err != nil {
panic(err.Error())
}
list = append(list, d)
}
You'll need to export the fields (rename them to ID and Name) so that the json package can serialize them, and then pass list to c.JSON instead of the map. You should end up with something more like:
[
{"ID":1,"Name":"games"},
...
]
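If you want the lowercase keys from the question ("id", "name"), struct tags can map the exported fields to those names. Here is a small sketch using only encoding/json; the Division definition is an assumption, since the question does not show it, and encodeDivisions is a hypothetical helper standing in for what c.JSON would do with the slice.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Division is an assumed definition; the fields must be exported to be
// serialized, and the tags control the key names in the output.
type Division struct {
	ID   int    `json:"id"`
	Name string `json:"name"`
}

// encodeDivisions marshals the slice into a JSON array.
func encodeDivisions(list []Division) (string, error) {
	out, err := json.Marshal(list)
	if err != nil {
		return "", err
	}
	return string(out), nil
}

func main() {
	list := []Division{
		{ID: 1, Name: "games"},
		{ID: 2, Name: "technology"},
	}
	out, err := encodeDivisions(list)
	if err != nil {
		panic(err)
	}
	fmt.Println(out)
	// prints [{"id":1,"name":"games"},{"id":2,"name":"technology"}]
}
```

With the fields renamed, the Scan call in the handler would become divisions.Scan(&d.ID, &d.Name).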