How to validate JSON and show positions of any errors? - json

I want to parse and validate (custom) JSON configuration files within Go. I would like to be able to parse the file into a struct and validate that:
no unexpected keys are present in the JSON file (in particular to detect typos)
certain keys are present and have non-empty values
In case the validation fails (or in case of a syntax error), I want to print an error message to the user that explains as detailed as possible where in the file the error happened (e.g. by stating the line number if possible).
The JSON parser built into Go seems to just silently ignore unexpected keys. I also tried using jsonpb (Protobuf) to deserialize the JSON, which returns an error in case of an unexpected key, but does not report the position.
To check for non-empty values, I could use an existing validation library, but I haven't seen any that reports detailed error messages. Alternatively, I could write custom code that validates the data returned by the built-in JSON parser, but it would be nice if there was a generic way.
Is there a simple way to get the desired behaviour?

Have you looked at JSON Schema?
JSON Schema describes your JSON data format.
I believe it is still in draft stage, but a lot of languages have validation libraries. Here's a Go implementation:
https://github.com/xeipuuv/gojsonschema
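Following that package's README, here is a sketch of how the constraints from the question (no unknown keys, required non-empty values) could be expressed as a schema; the schema itself is my assumption, not from the question:

package main

import (
    "fmt"
    "log"

    "github.com/xeipuuv/gojsonschema"
)

func main() {
    // "required" enforces presence, "minLength" rejects empty values, and
    // "additionalProperties": false rejects unexpected keys (e.g. typos).
    schema := gojsonschema.NewStringLoader(`{
        "type": "object",
        "required": ["expected_key"],
        "additionalProperties": false,
        "properties": {
            "expected_key": {"type": "string", "minLength": 1}
        }
    }`)
    document := gojsonschema.NewStringLoader(`{"expected_key": "", "typo_key": 1}`)

    result, err := gojsonschema.Validate(schema, document)
    if err != nil {
        log.Fatal(err) // e.g. a JSON syntax error
    }
    if !result.Valid() {
        for _, e := range result.Errors() {
            // Each error names the offending field path (not a line number).
            fmt.Println("-", e)
        }
    }
}

Note the error messages point at JSON paths rather than line numbers in the file.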

You can also use the encoding/json Decoder and force errors when unexpected keys are found. It won't tell you the line number, but it's a start, and it doesn't require any external package.
package main

import (
    "bytes"
    "encoding/json"
    "fmt"
)

type MyType struct {
    ExpectedKey string `json:"expected_key"`
}

func main() {
    jsonBytes := []byte(`{"expected_key":"a", "unexpected_key":"b"}`)
    var typePlaceholder MyType

    // Create a JSON decoder reading from the raw bytes.
    dec := json.NewDecoder(bytes.NewReader(jsonBytes))

    // Force errors when unexpected keys are present.
    dec.DisallowUnknownFields()

    if err := dec.Decode(&typePlaceholder); err != nil {
        fmt.Println(err.Error())
    }
}
You can see that working in the playground here.
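If you do want positions for at least some errors: the standard library's *json.SyntaxError and *json.UnmarshalTypeError carry a byte Offset into the input, which you can translate into a line and column yourself. The unknown-field error produced by DisallowUnknownFields carries no offset, though. A sketch (the lineCol helper is mine):

package main

import (
    "bytes"
    "encoding/json"
    "errors"
    "fmt"
)

// lineCol turns a byte offset reported by encoding/json into a 1-based
// line and column by counting newlines in the raw input.
func lineCol(data []byte, offset int64) (line, col int) {
    line = 1 + bytes.Count(data[:offset], []byte("\n"))
    col = int(offset) - bytes.LastIndexByte(data[:offset], '\n')
    return line, col
}

func main() {
    data := []byte("{\n  \"expected_key\": \"a\",\n  \"broken\": tru\n}")

    var v struct {
        ExpectedKey string `json:"expected_key"`
    }
    err := json.Unmarshal(data, &v)

    var synErr *json.SyntaxError
    var typeErr *json.UnmarshalTypeError
    switch {
    case errors.As(err, &synErr):
        line, col := lineCol(data, synErr.Offset)
        fmt.Printf("syntax error at line %d, column %d: %v\n", line, col, err)
    case errors.As(err, &typeErr):
        line, col := lineCol(data, typeErr.Offset)
        fmt.Printf("type error at line %d, column %d: %v\n", line, col, err)
    case err != nil:
        fmt.Println(err)
    }
}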

Related

Encoding and decoding structs of

I'm trying to encode and decode structs. I've searched around quite a bit, and a lot of the questions regarding this topic are from people who want to encode primitives or simple structs. What I want is to encode a struct that could look like this:
type Person struct {
    Name string
    Id   int
    file *os.File
    keys *ecdsa.PrivateKey
}
The name and the ID are no problem, and I can encode them using either gob or JSON marshalling. However, when I want to encode a file using gob, for which I'd use gob.Register(os.File{}), I get an error that file has no exported fields, due to the fields in the os.File struct being lower case. I would use a function like this:
func encode(p Person) []byte {
    buf := bytes.Buffer{}
    enc := gob.NewEncoder(&buf)
    gob.Register(big.Int{})
    ...
    err := enc.Encode(&p)
    if err != nil {
        log.Fatal(err)
    }
    fmt.Println("uncompressed size (bytes): ", len(buf.Bytes()))
    return buf.Bytes()
}
I'm not sure if it's correct to register within the encode function; however, it seems odd that I have to register all structs that are referenced by the one specific struct I want to encode. For a file, for example, I would have to register a ton of interfaces; that doesn't seem to be the correct way to do it. Is there a simple way to encode and decode structs that have a bit more complexity?
If I use JSON marshalling to do this, it will always return nil if I use a pointer to another struct. Is there a way to get all the information I want?
Thanks!
Imagine your struct points to a file at /foo/bar/baz.txt and you serialize your struct. Then you send it to another computer (perhaps with a different operating system) and re-create the struct. What do you expect to happen?
What if you serialize, delete the file (or update its content), and re-create the struct on the same computer?
One solution is to store the content of the file.
Another solution is to store the path to the file; when you deserialize the struct, you can try to reopen the file. You can add a safety layer by storing a hash of the content, the size, and other metadata to check whether the file is the same.
The answers to those questions will guide you to the best implementation.
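A minimal sketch of that second approach (the FileRef type and its fields are illustrative, not from the question):

package main

import (
    "bytes"
    "crypto/sha256"
    "encoding/gob"
    "encoding/hex"
    "fmt"
    "io"
    "log"
    "os"
)

// FileRef is a serializable stand-in for *os.File: it records where the
// file lives plus enough metadata to check it is unchanged when reopened.
type FileRef struct {
    Path string
    Size int64
    Hash string // hex-encoded SHA-256 of the content
}

func NewFileRef(path string) (FileRef, error) {
    f, err := os.Open(path)
    if err != nil {
        return FileRef{}, err
    }
    defer f.Close()

    h := sha256.New()
    size, err := io.Copy(h, f)
    if err != nil {
        return FileRef{}, err
    }
    return FileRef{Path: path, Size: size, Hash: hex.EncodeToString(h.Sum(nil))}, nil
}

func main() {
    ref, err := NewFileRef("/etc/hostname") // any readable file works here
    if err != nil {
        log.Fatal(err)
    }

    // Every field is exported, so gob encodes the struct without complaint.
    var buf bytes.Buffer
    if err := gob.NewEncoder(&buf).Encode(ref); err != nil {
        log.Fatal(err)
    }
    fmt.Println("encoded bytes:", buf.Len())
}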

Serialization/Encoding in json.NewEncoder and json.NewDecoder

I am trying to learn backend development by building a very basic REST API using the gorilla/mux library in Go (following this tutorial).
Here's the code that I have built so far:
package main

import (
    "encoding/json"
    "net/http"

    "github.com/gorilla/mux"
)

// Post represents a single post by a user
type Post struct {
    Title  string `json:"title"`
    Body   string `json:"body"`
    Author User   `json:"author"`
}

// User is a struct that represents a user
type User struct {
    FullName string `json:"fullName"`
    Username string `json:"username"`
    Email    string `json:"email"`
}

var posts []Post = []Post{}

func main() {
    router := mux.NewRouter()
    router.HandleFunc("/posts", addItem).Methods("POST")
    http.ListenAndServe(":5000", router)
}

func addItem(w http.ResponseWriter, req *http.Request) {
    var newPost Post
    json.NewDecoder(req.Body).Decode(&newPost)
    posts = append(posts, newPost)
    w.Header().Set("Content-Type", "application/json")
    json.NewEncoder(w).Encode(posts)
}
However, I'm really confused about what exactly is happening in the json.NewDecoder and json.NewEncoder parts.
As far as I understand, data transfer over the internet in a REST API ultimately happens in bytes/binary format (encoded in UTF-8, I guess?). So json.NewEncoder is converting a Go data structure to a JSON string and json.NewDecoder is doing the opposite (correct me if I'm wrong).
So who is responsible here for converting this JSON string to UTF-8 encoding for data transfer? Is that also part of what json.NewDecoder and json.NewEncoder do?
Also, if these two functions are only serializing/deserializing to/from JSON, why the names encoder and decoder (isn't encoding always related to binary data conversion)? Honestly, I'm pretty confused by the terms encoding, serialization, and marshalling, and the differences between them.
Can someone explain how exactly data transfer is happening here at each conversion level (JSON, binary, in-memory data structure)?
First, we have to understand that the Encoding process doesn't actually mean translating types and returning a JSON representation of a type. The process that gives you the JSON representation is called the Marshaling process, and it can be done by calling the json.Marshal function.
On the other hand, the Encoding process means that we want to get the JSON encoding of some type and write (encode) it to a stream that implements the io.Writer interface. As we can see, func NewEncoder(w io.Writer) *Encoder receives an io.Writer interface as a parameter and returns a *json.Encoder object. When the method encoder.Encode() is called, it does the Marshaling process and then writes the result to the io.Writer that we passed when creating the new Encoder object. You can see the implementation of json.Encoder.Encode() here.
So, if you ask who does the encoding to the HTTP stream, the answer is the http.ResponseWriter. ResponseWriter implements the io.Writer interface, and when the Encode() method is called, the encoder Marshals the object to its JSON representation and then calls func Write([]byte) (int, error), which is a contract method of the io.Writer interface, and that does the writing to the HTTP stream.
In summary, I could say that Marshal and Unmarshal mean that we want to get the JSON representation of some type and vice versa, while Encode means that we want to do the Marshaling process and then write (encode) the result to some stream object, and Decode means that we want to get (decode) a JSON object from some stream and then do the Unmarshaling process.
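As a minimal sketch of that distinction (writing to os.Stdout here instead of an http.ResponseWriter):

package main

import (
    "encoding/json"
    "fmt"
    "os"
)

type Post struct {
    Title string `json:"title"`
}

func main() {
    p := Post{Title: "hello"}

    // Marshal: produce the JSON representation as a []byte in memory.
    b, err := json.Marshal(p)
    if err != nil {
        panic(err)
    }
    fmt.Println(string(b)) // {"title":"hello"}

    // Encode: marshal and immediately write the result to an io.Writer.
    // In the handler above, that io.Writer is the http.ResponseWriter.
    if err := json.NewEncoder(os.Stdout).Encode(p); err != nil {
        panic(err)
    }
}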
The json.Encoder produced by the call to json.NewEncoder directly produces its output in UTF-8. No conversion is necessary. (In fact, Go does not have a representation for textual data that is distinct from UTF-8 encoded sequences of bytes — even a string is just an immutable array of bytes under the hood.)
Go uses the term encode for serialisation and decode for deserialisation, whether the serialised form is binary or textual. Do not think too much about the terminology; consider encode and serialise to be synonyms.

Is there any way to extract JSON Schema from a given JSON in Go? [duplicate]

Is there a way to convert a JSON document to its schema in Go? I need to compare 2 JSON templates or schemas and cannot find any package or function to do this. Can someone please help me with this?
You can have a look at the gjson library. It has functions to parse JSON and get values out of the unmarshalled result. You can use gjson's functionality to compare JSON results.
I think you will need to unmarshal them recursively (if they contain nested JSON) into something like map[string]interface{}, and then loop through and compare the keys. There are some libraries mentioned in this answer https://stackoverflow.com/a/42153666 which could be used to unmarshal them safely.
For example, you can use Exists from the gabs library while iterating through the keys in the unmarshalled map to see if the same keys exist in the other map.
// From the gabs library
// Exists checks whether a field exists within the hierarchy.
func (g *Container) Exists(hierarchy ...string) bool {
    return g.Search(hierarchy...) != nil
}
Edit: a version without libraries is here: https://play.golang.org/p/jmfFsLT0G1n, based on the test case of this code golf exercise: https://codegolf.stackexchange.com/questions/195476/extract-all-keys-from-an-object-json
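A sketch of what the library-free version looks like (the helper names are mine): unmarshal into an interface{}, recursively collect the key paths, and compare the resulting sets for two documents:

package main

import (
    "encoding/json"
    "fmt"
    "sort"
)

// collectKeys records every key path found in an unmarshalled JSON value.
func collectKeys(prefix string, v interface{}, out map[string]bool) {
    switch t := v.(type) {
    case map[string]interface{}:
        for k, child := range t {
            path := k
            if prefix != "" {
                path = prefix + "." + k
            }
            out[path] = true
            collectKeys(path, child, out)
        }
    case []interface{}:
        for _, child := range t {
            collectKeys(prefix, child, out)
        }
    }
}

func main() {
    var doc interface{}
    if err := json.Unmarshal([]byte(`{"a":1,"b":{"c":[{"d":2}]}}`), &doc); err != nil {
        panic(err)
    }
    set := map[string]bool{}
    collectKeys("", doc, set)

    var paths []string
    for p := range set {
        paths = append(paths, p)
    }
    sort.Strings(paths)
    fmt.Println(paths) // [a b b.c b.c.d]
}

Two documents agree on this loose notion of schema when the two sets of paths are equal.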
The json package in Go's standard library provides us with all the functionality we need. For any JSON string, the standard way to parse it is:
import "encoding/json"

// ...
myJsonString := `{"some":"json"}`

// `&myStoredVariable` is the address of the variable we want to store our
// parsed data in
json.Unmarshal([]byte(myJsonString), &myStoredVariable)
// ...

Python-like JSON handling using Golang

Using Python I can do the following:
r = requests.get(url_base + url)
jsonObj = json.loads(r.content.decode('raw_unicode_escape'))
print(jsonObj["PartDetails"]["ManufacturerPartNumber"])
Is there any way to do the same thing using Golang?
Currently I need the following:
json.Unmarshal(body, &part_number_json)
fmt.Println("\r\nPartDetails: ", part_number_json.(map[string]interface{})["PartDetails"].(map[string]interface{})["ManufacturerPartNumber"])
That is to say, I need a cast for each field of the JSON, which is tiring and makes the code unreadable.
I tried doing this using reflection, but it is not comfortable either.
EDIT:
Currently I use the following function:
func jso(json interface{}, fields ...string) interface{} {
    res := json
    for _, v := range fields {
        res = res.(map[string]interface{})[v]
    }
    return res
}
and call it like this:
fmt.Println("PartDetails: ", jso( part_number_json, "PartDetails", "ManufacturerPartNumber") )
There are third-party packages like gjson that can help you do that.
That said, note that Go is Go, and Python is Python. Go is statically typed, for better and worse. It takes more code to write simple JSON manipulation, but that code should be easier to maintain later, since it's more strictly typed and the compiler helps you check against errors. Types also serve as documentation - simply nesting dicts and arrays is completely arbitrary.
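For instance, with gjson the lookup from the question might look like this (the sample payload is made up for illustration):

package main

import (
    "fmt"

    "github.com/tidwall/gjson"
)

func main() {
    body := `{"PartDetails":{"ManufacturerPartNumber":"ABC-123"}}`

    // gjson resolves a dotted path in one call, with no intermediate casts.
    pn := gjson.Get(body, "PartDetails.ManufacturerPartNumber")
    fmt.Println("PartDetails:", pn.String()) // ABC-123
}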
I have found the following resource very helpful in creating a struct from JSON. Unmarshalling will only match the fields you have defined in the struct, so take what you need and leave the rest if you like.
https://mholt.github.io/json-to-go/

Using Go to convert large XML files to JSON to store in MongoDB

For a project of mine I have to deal with XML files over 2GB. I would like to store the data in MongoDB. I have decided to give it a try using the Go language, but I am having a bit of trouble figuring out the best way to do this in Go.
I've seen a lot of examples with a fixed XML structure, but the data structure I get is dynamic, so using some kind of predefined struct isn't going to work for me.
Now I stumbled upon this package: https://github.com/basgys/goxml2json which looks very promising, but there are a few things I don't get:
The example given in the readme uses an XML string, but I don't see anything in the code that accepts a file.
Given that I have 2GB XML files, I cannot simply load the whole XML file into memory. That would flood my server.
I think it is good to say that I only have to convert the XML data once to its JSON form so I can store it in MongoDB.
Do any of you have ideas on how to parse XML files efficiently to JSON using Go?
Go provides a built-in XML stream parser in encoding/xml's Decoder.
A typical usage pattern is to read tokens until you find something of interest, then unmarshal that token into an XML-tagged struct and handle the data accordingly. This way you're only loading into memory what is required for a single XML token or to unmarshal an interesting bit of data.
For example (Go Playground):
package main

import (
    "encoding/json"
    "encoding/xml"
    "fmt"
    "io"
    "strings"
)

type Person struct {
    Id   string `xml:"id,attr"`
    Name string `xml:"name"`
    Age  int    `xml:"age"`
}

func check(err error) {
    if err != nil {
        panic(err)
    }
}

func main() {
    xmlStream := strings.NewReader(`<people><person id="123"><name>Alice</name><age>30</age></person></people>`)
    d := xml.NewDecoder(xmlStream)
    for {
        // Decode the next token from the stream...
        token, err := d.Token()
        if err == io.EOF {
            break
        }
        check(err)
        // Switch behavior based on the token type.
        switch el := token.(type) {
        case xml.StartElement:
            // Handle "person" start elements by unmarshaling from XML...
            if el.Name.Local == "person" {
                var p Person
                err := d.DecodeElement(&p, &el)
                check(err)
                // ...then marshal to JSON...
                jsonbytes, err := json.Marshal(p)
                check(err)
                // ...then take other action (e.g. insert into database).
                fmt.Printf("OK: %s\n", string(jsonbytes))
                // OK: {"Id":"123","Name":"Alice","Age":30}
            }
        }
    }
}