Kafka - Json (best practices) - json

I need to push the output of a REST API call into Kafka. The REST API returns JSON that carries supporting information alongside the data; the data itself lands in a json.RawMessage field:
type Response struct {
    RequestID     string `json:"requestId"`
    Success       bool   `json:"success"`
    NextPageToken string `json:"nextPageToken,omitempty"`
    MoreResult    bool   `json:"moreResult,omitempty"`
    Errors        []struct {
        Code    string `json:"code"`
        Message string `json:"message"`
    } `json:"errors,omitempty"`
    Result   json.RawMessage `json:"result,omitempty"`
    Warnings []struct {
        Code    string `json:"code"`
        Message string `json:"message"`
    } `json:"warning,omitempty"`
}
The json.RawMessage holds data for 200 records.
Questions:
1. As a producer, should I put the whole raw message into the Kafka topic as one message? Or should I unmarshal (parse) the raw JSON message and publish each record as its own message (in this case there would be 200 messages)?
2. If I unmarshal (parse) the data, it will not be in JSON format anymore.
I'm not providing any code here; my code can be in Go or Python.
The end consumer for the topic is Spark or a custom program that reads the data from the topic and pushes it to another system.
Please let me know what the best design/approach is.
Thanks

There's no other answer than a great big "It Depends" :)
It Depends on what you're doing with the data ("push to another system" is just a step on the way to doing something with the data), and it depends on the semantic and business meaning of the data.
If each of your 200 messages means something on its own, independent from other messages, then unbundling and putting as individual messages on Kafka makes sense.
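If unbundling is the right choice, the records do not have to lose their JSON form. The sketch below is only an illustration and assumes that result holds a JSON array; the helper name splitRecords and the sample payload are made up for this example. It parses only the outer array and keeps every element as raw JSON bytes, so each one could be handed unchanged to whichever Kafka producer client you use (the producer call itself is left out).

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Response mirrors the envelope from the question, trimmed to the fields used here.
type Response struct {
	RequestID string          `json:"requestId"`
	Success   bool            `json:"success"`
	Result    json.RawMessage `json:"result,omitempty"`
}

// splitRecords (hypothetical helper) decodes the envelope and then splits
// Result into its array elements. Each element stays raw, valid JSON.
// It returns an error if Result is missing or is not a JSON array.
func splitRecords(body []byte) ([]json.RawMessage, error) {
	var resp Response
	if err := json.Unmarshal(body, &resp); err != nil {
		return nil, err
	}
	var records []json.RawMessage
	if err := json.Unmarshal(resp.Result, &records); err != nil {
		return nil, err
	}
	return records, nil
}

func main() {
	// Made-up sample payload with two records instead of 200.
	body := []byte(`{"requestId":"r1","success":true,"result":[{"id":1,"name":"a"},{"id":2,"name":"b"}]}`)

	records, err := splitRecords(body)
	if err != nil {
		panic(err)
	}
	for _, rec := range records {
		// rec is a candidate message value for one Kafka message.
		fmt.Println(string(rec))
	}
}
```

Publishing the whole envelope as one message needs no parsing at all, so the choice remains the semantic one described above.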

Related

How to marshal & unmarshal an x509.Certificate to/from JSON?

I have a struct which looks like this, where PrivateKey & PublicKey are own types mapping to []byte:
type Secret struct {
    Cert     *x509.Certificate
    ValidFor string
    Private  PrivateKey
    Public   PublicKey
}
This struct is embedded as a field in another structure, which gets marshalled into a JSON. Marshalling the outer structure seems to work fine and the JSON looks okay, however unmarshalling seems to cause the following error:
json: cannot unmarshal number 54368953042[...number shortened...] into Go struct field Certificate.Secrets.Cert.PublicKey of type float64
Evidently the unmarshaller is tripping over the big.Int that contains the public key. I found a solution online that tells me to first unmarshal to a map[string]interface{}; however, given that my structs are a bit more nested, this seems like an exhausting solution.
Now I'm wondering: is there an easier way to marshal and unmarshal an x509.Certificate with big.Ints in it? Or is the best way indeed to manually store, replace and restore the problematic fields?
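One possible approach, which is not taken from the thread, is to stop serializing the certificate field by field and store its raw DER bytes instead. The sketch below assumes the certificate was parsed or created by crypto/x509 so that its Raw field is populated. The wrapper type JSONCert and the helpers roundTrip and newSelfSigned are hypothetical names introduced for illustration, and Secret is reduced to two fields.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/x509"
	"crypto/x509/pkix"
	"encoding/json"
	"fmt"
	"math/big"
	"time"
)

// JSONCert is a hypothetical wrapper that stores the certificate in JSON
// as its raw DER bytes instead of field by field.
type JSONCert struct {
	*x509.Certificate
}

// MarshalJSON emits the DER encoding; encoding/json renders []byte as base64.
// A hand-built certificate with an empty Raw field will not round-trip.
func (c JSONCert) MarshalJSON() ([]byte, error) {
	if c.Certificate == nil {
		return []byte("null"), nil
	}
	return json.Marshal(c.Certificate.Raw)
}

// UnmarshalJSON decodes the base64 DER bytes and re-parses the certificate.
func (c *JSONCert) UnmarshalJSON(data []byte) error {
	var der []byte
	if err := json.Unmarshal(data, &der); err != nil {
		return err
	}
	if der == nil { // JSON null
		c.Certificate = nil
		return nil
	}
	cert, err := x509.ParseCertificate(der)
	if err != nil {
		return err
	}
	c.Certificate = cert
	return nil
}

// Secret follows the struct in the question, reduced to two fields.
type Secret struct {
	Cert     JSONCert
	ValidFor string
}

// roundTrip marshals a Secret to JSON and unmarshals it again.
func roundTrip(in Secret) (Secret, error) {
	var out Secret
	data, err := json.Marshal(in)
	if err != nil {
		return out, err
	}
	err = json.Unmarshal(data, &out)
	return out, err
}

// newSelfSigned creates a throwaway self-signed certificate for the demo.
func newSelfSigned(commonName string) (*x509.Certificate, error) {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return nil, err
	}
	tmpl := &x509.Certificate{
		SerialNumber: big.NewInt(1),
		Subject:      pkix.Name{CommonName: commonName},
		NotBefore:    time.Now(),
		NotAfter:     time.Now().Add(time.Hour),
	}
	der, err := x509.CreateCertificate(rand.Reader, tmpl, tmpl, &key.PublicKey, key)
	if err != nil {
		return nil, err
	}
	return x509.ParseCertificate(der)
}

func main() {
	cert, err := newSelfSigned("example")
	if err != nil {
		panic(err)
	}
	out, err := roundTrip(Secret{Cert: JSONCert{cert}, ValidFor: "24h"})
	if err != nil {
		panic(err)
	}
	fmt.Println(out.Cert.Subject.CommonName, out.ValidFor)
}
```

Because the embedded *x509.Certificate is promoted, fields such as Subject stay directly accessible on the wrapper. The trade-off is that the JSON is no longer human-readable for the certificate part.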

Serialization/Encoding in json.NewEncoder and json.NewDecoder

I am trying to learn backend development by building a very basic REST API using the gorilla mux library in Go (following this tutorial).
Here's the code that I have built so far:
package main

import (
    "encoding/json"
    "log"
    "net/http"

    "github.com/gorilla/mux"
)

// Post represents a single post by a user
type Post struct {
    Title  string `json:"title"`
    Body   string `json:"body"`
    Author User   `json:"author"`
}

// User is a struct that represents a user
type User struct {
    FullName string `json:"fullName"`
    Username string `json:"username"`
    Email    string `json:"email"`
}

var posts = []Post{}

func main() {
    router := mux.NewRouter()
    router.HandleFunc("/posts", addItem).Methods("POST")
    // ListenAndServe only returns on failure, so log the error and exit.
    log.Fatal(http.ListenAndServe(":5000", router))
}

func addItem(w http.ResponseWriter, req *http.Request) {
    var newPost Post
    // Decode reads JSON from the request body and unmarshals it into newPost.
    if err := json.NewDecoder(req.Body).Decode(&newPost); err != nil {
        http.Error(w, err.Error(), http.StatusBadRequest)
        return
    }
    posts = append(posts, newPost)
    w.Header().Set("Content-Type", "application/json")
    // Encode marshals posts to JSON and writes the bytes to the response.
    if err := json.NewEncoder(w).Encode(posts); err != nil {
        log.Println("encoding response:", err)
    }
}
However, I'm really confused about what exactly is happening in json.NewDecoder and json.NewEncoder part.
As far as I understand, data transfer over the internet in a REST API ultimately happens as bytes (encoded in UTF-8, I guess?). So json.NewEncoder converts a Go data structure to a JSON string and json.NewDecoder does the opposite (correct me if I'm wrong).
So who is responsible here for converting this JSON string to UTF-8 encoding for data transfer? Is that also part of what json.NewDecoder and json.NewEncoder do?
Also, if these two functions only serialize/deserialize to/from JSON, why the names encoder and decoder (isn't encoding always related to binary data conversion)? Honestly, I'm pretty confused by the terms encoding, serialization and marshaling, and the differences between them.
Can someone explain exactly how the data transfer happens here at each conversion level (JSON, binary, in-memory data structure)?
First, we have to understand that encoding, in this package's sense, does not simply mean translating a value and returning its JSON representation. The process that gives you the JSON representation is called marshaling, and it is done by calling the json.Marshal function.
Encoding, on the other hand, means producing the JSON encoding of a value and writing it to a stream that implements the io.Writer interface. As we can see, func NewEncoder(w io.Writer) *Encoder takes an io.Writer as a parameter and returns a *json.Encoder. When encoder.Encode() is called, it marshals the value and then writes the result to the io.Writer that was passed in when the Encoder was created. You can see the implementation of json.Encoder.Encode() here.
So, if you ask who writes the encoded data to the HTTP stream, the answer is the http.ResponseWriter. ResponseWriter implements the io.Writer interface; when Encode() is called, the encoder marshals the object to its JSON representation and then calls Write([]byte) (int, error), the method required by the io.Writer interface, which writes the bytes to the HTTP stream.
In summary, Marshal and Unmarshal convert a value to its JSON representation and back. Encode marshals a value and then writes the result to a stream, and Decode reads a JSON value from a stream and then unmarshals it.
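The relationship between Marshal and Encode can be checked with a small sketch. Here a bytes.Buffer stands in for the http.ResponseWriter, since both implement io.Writer; the helper names marshalBytes and encodeBytes are made up for this illustration, and Post is cut down to two fields.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

// Post is a reduced version of the struct from the question.
type Post struct {
	Title string `json:"title"`
	Body  string `json:"body"`
}

// marshalBytes returns the JSON representation as a byte slice.
func marshalBytes(p Post) ([]byte, error) {
	return json.Marshal(p)
}

// encodeBytes writes the JSON representation to an io.Writer.
// A bytes.Buffer plays the role of the http.ResponseWriter here.
func encodeBytes(p Post) ([]byte, error) {
	var buf bytes.Buffer
	if err := json.NewEncoder(&buf).Encode(p); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

func main() {
	p := Post{Title: "Hello", Body: "World"}

	m, err := marshalBytes(p)
	if err != nil {
		panic(err)
	}
	e, err := encodeBytes(p)
	if err != nil {
		panic(err)
	}

	fmt.Printf("Marshal: %q\n", m)
	fmt.Printf("Encode:  %q\n", e)
	// Encode writes the same bytes as Marshal, followed by a newline.
	fmt.Println(string(e) == string(m)+"\n")
}
```

The only difference in the written bytes is the trailing newline that Encode appends after each value.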
The json.Encoder produced by the call to json.NewEncoder directly produces its output in UTF-8. No conversion is necessary. (In fact, Go does not have a representation for textual data that is distinct from UTF-8 encoded sequences of bytes — even a string is just an immutable array of bytes under the hood.)
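To make the UTF-8 point concrete, here is a minimal sketch that marshals a string containing a non-ASCII character and prints the resulting bytes. The helper encodeTitle and the sample value are invented for the example.

```go
package main

import (
	"encoding/json"
	"fmt"
	"unicode/utf8"
)

// encodeTitle (hypothetical helper) marshals a one-field JSON object.
func encodeTitle(title string) ([]byte, error) {
	return json.Marshal(map[string]string{"title": title})
}

func main() {
	b, err := encodeTitle("café")
	if err != nil {
		panic(err)
	}
	// The output is already a UTF-8 byte sequence; nothing else converts it.
	fmt.Println(utf8.Valid(b))
	// Print the bytes in hex: the "é" appears as the two bytes c3 a9.
	fmt.Printf("% x\n", b)
	fmt.Println(string(b))
}
```

These are the same bytes that would be passed to Write on the response, so no separate conversion step exists between "JSON string" and "bytes on the wire".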
Go uses the term encode for serialisation and decode for deserialisation, whether the serialised form is binary or textual. Do not think too much about the terminology; consider encode and serialise as synonyms.

How to make a swift Codable struct that will decode and encode a MongoDB _id ObjectId() in JSON

I'm a very new developer (this is my first dev job) building a Swift/iOS application for creating/processing orders, and the server is sending me an ObjectID() object in JSON with every product I look up. After some research, it seems like this is a MongoDB object.
The gentleman coding my API routes wants me to grab that object for every product the server sends me, so I can send it back to the server with any orders that include that product. He says that will make it much easier for him to access the product when processing new orders.
So far, I've had no trouble decoding the JSON the server is sending me, because it's been in formats like String, Int, Float, etc., or just another JSON object that needs a new Codable struct full of more of the same.
When it comes to creating a Codable struct for an ObjectID, I don't know what keys/value types (properties/value types, whatever terminology you want to use) to tell it to expect. Or if this is even the correct way to go about it.
This is what I have right now:
import Foundation
struct ProductData: Codable {
    let _id: ObjectId
    let productId: String
    let description: String
    let ...
}
The ObjectId type appearing above is a custom Codable struct that I haven't built yet, because I'm not sure how. I imagine it should look something like this:
import Foundation
struct ObjectId: Codable {
    let someVariableName: SomeCodableType
    let ...
}
I don't know what the variable name or the type would be. I understand that it has a timestamp and some other information inside of it, and I've read about it being represented as a string, but I get the feeling if I try something like let _id:String in my product Codable struct, it won't decode/encode the way I'm imagining.
I'm wondering how to build a "type" that will properly catch/decode the _id object that is being thrown at me. If there's a way to simply hold that data without decoding it, and just send it back when I need to later, that would also suit my purposes.
EDIT:
After some experimentation, I found this raw JSON for the _id object:
"_id":{"$id":"58071f9d3f791f4f4f8b45ff"}
Dollar signs are not allowed in Swift variable/property names, so I'm unsure how to proceed in a way that satisfies both the incoming AND outgoing - I could make a custom key by manually initializing the JSON so it will properly work with Swift, but I'm unsure if there's a way to reverse that when encoding the same object back into JSON to send back to the server.
Catching the JSON as a Dictionary of [String:String] seemed to do the trick.
import Foundation
struct ProductData: Codable {
    let _id: [String: String]
    let productId: String
    let description: String
    let ...
}
However, the server is struggling to convert this back into an OrderId() object - I'm guessing that the translation from "$id":OrderId("someid") to "$id":"someid" is not what should happen.

Excessive use of map[string]interface{} in go development?

The majority of my development experience has been with dynamically typed languages like PHP and JavaScript. I've been practicing with Golang for about a month now by re-creating some of my old PHP/JavaScript REST APIs in Golang. I feel like I'm not doing things the Golang way most of the time. Or, more generally, I'm not used to working with strongly typed languages. I feel like I'm making excessive use of map[string]interface{}, and slices of them, to box up data as it comes in from HTTP requests or when it gets shipped out as JSON HTTP output. So what I'd like to know is whether what I'm about to describe goes against the philosophy of Golang development, or whether I'm breaking the principles of developing with strongly typed languages.
Right now, about 90% of the program flow for REST Apis I've rewritten with Golang can be described by these 5 steps.
STEP 1 - Receive Data
I receive http form data from http.Request.ParseForm() as formvals := map[string][]string. Sometimes I will store serialized JSON objects that need to be unmarshaled like jsonUserInfo := json.Unmarshal(formvals["user_information"][0]) /* gives some complex json object */.
STEP 2 - Validate Data
I do validation on formvals to make sure all the data values are what I expect before using them in SQL queries. I treat everything as a string, then use regexes to determine whether the string format and business logic are valid (e.g. IsEmail, IsNumeric, IsFloat, IsCASLCompliant, IsEligibleForVoting, IsLibraryCardExpired, etc.). I've written my own regexes and custom functions for these types of validations.
STEP 3 - Bind Data to SQL Queries
I use golang's database/sql.DB to take my formvals and bind them to my Query and Exec functions like this: Query("SELECT * FROM tblUser WHERE user_id = ? AND user_birthday > ?", formvals["user_id"][0], jsonUserInfo["birthday"]). I never care about the data types I'm supplying as arguments to be bound, so they're all probably strings. I trust that the validation in the step immediately above has determined they are acceptable for SQL use.
STEP 4 - Bind SQL results to []map[string]interface{}{}
I Scan() the results of my queries into a sqlResult := []map[string]interface{}{} because I don't care if the value types are null, strings, float, ints or whatever. So the schema of an sqlResult might look like:
sqlResult =>
[0] {
    "user_id":"1"
    "user_name":"Bob Smith"
    "age":"45"
    "weight":"34.22"
},
[1] {
    "user_id":"2"
    "user_name":"Jane Do"
    "age":nil
    "weight":"22.22"
}
I wrote my own eager load function so that I can bind more information like so EagerLoad("tblAddress", "JOIN ON tblAddress.user_id",&sqlResult) which then populates sqlResult with more information of the type []map[string]interface{}{} such that it looks like this:
sqlResult =>
[0] {
    "user_id":"1"
    "user_name":"Bob Smith"
    "age":"45"
    "weight":"34.22"
    "addresses"=>
        [0] {
            "type":"home"
            "address1":"56 Front Street West"
            "postal":"L3L3L3"
            "lat":"34.3422242"
            "lng":"34.5523422"
        }
        [1] {
            "type":"work"
            "address1":"5 Kennedy Avenue"
            "postal":"L3L3L3"
            "lat":"34.3422242"
            "lng":"34.5523422"
        }
},
[1] {
    "user_id":"2"
    "user_name":"Jane Do"
    "age":nil
    "weight":"22.22"
    "addresses"=>
        [0] {
            "type":"home"
            "address1":"56 Front Street West"
            "postal":"L3L3L3"
            "lat":"34.3422242"
            "lng":"34.5523422"
        }
}
STEP 5 - JSON Marshal and send HTTP Response
Then I do an http.ResponseWriter.Write(json.Marshal(sqlResult)) and output the data for my REST API.
Recently, I've been revisiting articles with code samples that use structs in places where I would have used map[string]interface{}. For example, I wanted to refactor Step 2 with a more standard approach that other Golang developers would use. So I found this https://godoc.org/gopkg.in/go-playground/validator.v9, except all its examples use structs. I also noticed that most blogs that talk about database/sql scan their SQL results into typed variables or structs with typed properties, as opposed to my Step 4, which just puts everything into map[string]interface{}.
Hence, I started writing this question. I feel that map[string]interface{} is so useful because the majority of the time I don't really care what the data is, and it gives me the freedom in Step 4 to construct any data schema on the fly before I dump it as a JSON HTTP response. I do all this with as little code verbosity as possible. But this means my code is not as ready to leverage Go's validation tools, and it doesn't seem to comply with the Golang community's way of doing things.
So my question is: what do other Golang developers do with regard to Step 2 and Step 4? Especially in Step 4, do Golang developers really encourage specifying the schema of the data through structs and strongly typed properties? Do they also specify structs with strongly typed properties along with every eager loading call they make? Doesn't that seem like so much more code verbosity?
It really depends on the requirements. As you said, you don't need to process the JSON that comes from the request or from the SQL results; in that case you can simply unmarshal into interface{} and marshal whatever comes back from the SQL results.
For Step 2
Go has a library that validates the structs you unmarshal JSON into, driven by tags on the struct fields.
https://github.com/go-playground/validator
type Test struct {
    Field int `validate:"max=10,min=1"`
}
// max will be checked, then min
You can also look at the godoc for the validation library. It is a very good implementation of validation for JSON values using struct tags.
For Step 4
Most of the time we use structs when we know the format and data of our JSON, because they give us more control over the data types and other behavior. For example, if you want to leave a field out of your JSON because you don't need it there, use a struct with the - json tag.
You said that you don't care whether the result coming from SQL is empty or not. If you do care, it again comes down to using structs: you can scan the result into a struct with the sql.Null* types (sql.NullString, sql.NullInt64, and so on). You can also add the omitempty option to the json tag if you want to omit a field when marshaling the data for a response. Keep in mind that omitempty never omits struct-typed values such as sql.NullString, so it only helps on fields whose empty value is one of those listed in the documentation quoted below.
Struct values encode as JSON objects. Each exported struct field
becomes a member of the object, using the field name as the object
key, unless the field is omitted for one of the reasons given below.
The encoding of each struct field can be customized by the format
string stored under the "json" key in the struct field's tag. The
format string gives the name of the field, possibly followed by a
comma-separated list of options. The name may be empty in order to
specify options without overriding the default field name.
The "omitempty" option specifies that the field should be omitted from
the encoding if the field has an empty value, defined as false, 0, a
nil pointer, a nil interface value, and any empty array, slice, map,
or string.
As a special case, if the field tag is "-", the field is always
omitted. Note that a field with name "-" can still be generated using
the tag "-,".
Example of json tags
// Field appears in JSON as key "myName".
Field int `json:"myName"`
// Field appears in JSON as key "myName" and
// the field is omitted from the object if its value is empty,
// as defined above.
Field int `json:"myName,omitempty"`
// Field appears in JSON as key "Field" (the default), but
// the field is skipped if empty.
// Note the leading comma.
Field int `json:",omitempty"`
// Field is ignored by this package.
Field int `json:"-"`
// Field appears in JSON as key "-".
Field int `json:"-,"`
As you can see from the encoding/json documentation for Marshal quoted above, structs give you a great deal of control over the JSON. That is why Go developers usually use structs.
As for map[string]interface{}, use it when you don't know the structure of the JSON coming from the server or the types of its fields. Most Go developers stick to structs wherever they can.
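As a rough illustration of the struct-based version of Step 4, the sketch below models the sqlResult shape from the question with typed structs. The type names, the choice of *int for the nullable age column, and the helper toJSON are assumptions made for this example; the database scan itself is left out.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Address and User model the sqlResult shape from the question.
// The Go types chosen for each column are assumptions.
type Address struct {
	Type     string  `json:"type"`
	Address1 string  `json:"address1"`
	Postal   string  `json:"postal"`
	Lat      float64 `json:"lat"`
	Lng      float64 `json:"lng"`
}

type User struct {
	UserID   int     `json:"user_id"`
	UserName string  `json:"user_name"`
	Age      *int    `json:"age,omitempty"` // nil stands for SQL NULL and is omitted
	Weight   float64 `json:"weight"`
	// An empty or nil slice is left out thanks to omitempty.
	Addresses []Address `json:"addresses,omitempty"`
}

// toJSON (hypothetical helper) marshals the typed result for the HTTP response.
func toJSON(users []User) (string, error) {
	b, err := json.Marshal(users)
	return string(b), err
}

func main() {
	age := 45
	users := []User{
		{
			UserID: 1, UserName: "Bob Smith", Age: &age, Weight: 34.22,
			Addresses: []Address{
				{Type: "home", Address1: "56 Front Street West", Postal: "L3L3L3", Lat: 34.3422242, Lng: 34.5523422},
			},
		},
		{UserID: 2, UserName: "Jane Do", Weight: 22.22},
	}

	out, err := toJSON(users)
	if err != nil {
		panic(err)
	}
	fmt.Println(out)
}
```

The structs cost a few more lines than a map, but the compiler then checks field names and types, and the tags control exactly which keys reach the response.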

golang appengine outgoing json

So I run Go on App Engine with the go endpoints package.
I use structs to marshal and unmarshal the JSON of my incoming requests and outgoing responses:
type BusinessWorker struct {
    Wid   string `json:"wid" datastore:"Worker_id" endpoints:"req,desc=Worker id. string value"`
    Phone string `json:"phone" datastore:"Phone" endpoints:"req,desc=Worker phone number. string value"`
}
As you can see, after I validate the data this object is saved to or loaded from the datastore.
My question is:
There are many cases where I don't want to respond with all of the data that is saved in the datastore. Is there some sort of attribute I can give to a field that I want to accept in incoming requests but not include in my responses?
It seems so elementary, and yet I can't find it.
Maybe you would like to try one or a combination of the following approaches:
Tag of "-", so that the field is ignored, e.g. json:"-". Note that this makes encoding/json ignore the field when decoding incoming requests as well.
omitempty can be included in your json tag and causes a field to be left out of the resulting JSON when it holds its empty value. So you could set the fields you want to hide to their zero value (for example "" for a string) prior to serializing to JSON, e.g. json:"myName,omitempty".
copier: there are projects such as jinzhu's copier that let you copy your entity to a simplified structure, or you could roll your own (a combination of JSON marshalling and unmarshalling can produce similar results).
For more details about the JSON package, see the Golang JSON Marshal docs.
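The "roll your own" variant of the last approach could look roughly like the sketch below, which uses only the standard library. WorkerResponse, toResponse and responseJSON are hypothetical names, and the datastore and endpoints tags are dropped to keep the example short.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// BusinessWorker is the full entity, as in the question (json tags only).
type BusinessWorker struct {
	Wid   string `json:"wid"`
	Phone string `json:"phone"`
}

// WorkerResponse is a hypothetical, reduced view used only for responses.
type WorkerResponse struct {
	Wid string `json:"wid"`
}

// toResponse copies just the fields that may leave the server.
func toResponse(w BusinessWorker) WorkerResponse {
	return WorkerResponse{Wid: w.Wid}
}

// responseJSON marshals the reduced view of a worker.
func responseJSON(w BusinessWorker) (string, error) {
	b, err := json.Marshal(toResponse(w))
	return string(b), err
}

func main() {
	// Incoming requests still decode into the full struct, phone included.
	var in BusinessWorker
	if err := json.Unmarshal([]byte(`{"wid":"w1","phone":"555-0100"}`), &in); err != nil {
		panic(err)
	}

	out, err := responseJSON(in)
	if err != nil {
		panic(err)
	}
	fmt.Println(out) // the phone number is not part of the response
}
```

Unlike json:"-", this keeps the field readable on incoming requests, at the cost of one extra type per response shape.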