Dealing with NULLs when migrating from MySQL to Mongo in Go

Dealing with NULLs when migrating from MySQL to Mongo in Go - mysql

Setup
I started a project using MySQL and as such, my project has some helper types that assist with dealing with nulls, both when unmarshalling incoming data on the API, inputting data into the DB, and then the inverse of that, pulling data out of the Database and responding with said data to the API.
For the purposes of this question, we'll deal with a struct i have that represents a Character.
type Character struct {
MongoID primitive.ObjectID `bson:"_id" json:"-"`
ID uint64 `bson:"id" json:"id"`
Name string `bson:"name" json:"name"`
CorporationID uint `bson:"corporation_id" json:"corporation_id"`
AllianceID null.Uint `bson:"alliance_id" json:"alliance_id,omitempty"`
FactionID null.Uint `bson:"faction_id" json:"faction_id,omitempty"`
SecurityStatus float64 `bson:"security_status" json:"security_status"`
NotModifiedCount uint `bson:"not_modified_count" json:"not_modified_count"`
UpdatePriority uint `bson:"update_priority" json:"update_priority"`
Etag null.String `bson:"etag" json:"etag"`
CachedUntil time.Time `bson:"cached_until" json:"cached_until"`
CreatedAt time.Time `bson:"created_at" json:"created_at"`
UpdatedAt time.Time `bson:"updated_at" json:"updated_at"`
}
I want to specifically concentrate on the AllianceID property of type null.Uint which is represented with the following struct:
// Uint is an nullable uint.
type Uint struct {
Uint uint
Valid bool
}
In an API setup using JSON and MySQL (i.e. My setup, but this is not exclusive), this structure allows me to easily deal with values that are "nullable" without having to deal with Pointers. I've always heard that it is best to avoid Pointers with the exception of pointers to structures (Slices, Slices of Structs, Map of Structs, etc). If you have a primitive type (int, bool, float, etc), try to avoid using a pointer to that primitive type.
This type has functions like MarshalJSON, UnmarshalJSON, Scan, and Value with logic inside those functions that leverage the Vaild property to determine what type of value to return. This works really really well with this setup.
Question
After some research, I've come to realize that Mongo would suit me better than a relational database, but due to the fluidity of a Mongo Document (Schemaless), I'm having a hard time understanding how to handle scenarios where a field maybe missing, or a property that i have in MySQL that would normally be null and I can easily unmarshal ontop this struct and use the helper functions logically, is handled. Also, when I setup a connection to Mongo and pull a couple of rows from MySQL and created Documents in Mongo from these rows, the BSON layer is marshalling the entire type for Alliance ID and sticking it in the DB.
Example:
"alliance_id" : {
"uint" : NumberLong(99007760),
"valid" : true
},
Where as in MySQL, the Value function implementing Valuer interface would be called and return 99007760 and that is the value in the DB.
Another scenario would be if valid was false. In MySQL this would mean a null value and when the Value function is called, it would return nil and the mysql driver would populate the field with NULL
So my question is how do I do this? Do I need to start from scratch and rebuild my models and redo some of the logic in my application that leverages the Valid property and use *Pointers or can I do what I am attempting to do using these helper types.
I do want to say that I have tried implementing the Marshaller, and Unmarshaller interfaces on the bson package and the alliane_id in the document is still set to the json encoded version of this type as I outlined above. I wanted to point this out to rule out any suggestions of implemeting those interfaces. If what I am attempting to achieve is counter intuitive to Mongo, please link some guides that can help me achieve what im attempting to do.
Thank you to all who can assist with this.

The easiest way to deal with optional fields like this is to use a pointer:
type Character struct {
ID *uint64 `bson:"id,omitempty" json:"id"`
Name string `bson:"name" json:"name"`
...
}
Above, the ID field will be written if it s non-nil. When unmarshaling, it will be set to a non-nil value if database record has a value for it. If you omit the omitempty flag, marshaling this struct will write null to the database.
For strings, you may use omitempty to omit the field completely if it is empty. If you want to store empty strings, omit omitempty.

Related

A better solution to validate JSON unmarshal to nested structs

There appears to be few options to validate the source JSON used when unmarshalling to a struct. By validate I mean 3 main things:
a required field exists in the JSON
the field is the correct type (e.g. don't force a string into an integer)
the field contains a valid value (value range / enum)
For nested structs, I simply mean where an attribute in one struct has the type of another struct:
type Example struct {
Attr1 int `json:"attr1"`
Attr2 ExampleToo `json:"attr2"`
}
type ExampleToo struct {
Attr3 int `json:"attr3"`
}
And this JSON would be valid:
{"attr1": 5, "attr2": {"attr3": 0}}
To keep this simple, I'll focus simply on integers. The concept of "zero values" is the first issue. I could create an UnmarshalJSON method, which is detected by JSON packages, including the standard encoding/json package. The problem with this approach is that is that is does not support nested structs. If ExampleToo has an UnmarshalJSON method, the ExampleToo.UnmarshalJSON() method is never called if unmarshalling to an Example object. It would be possible to write a method Example.UnmarshalJSON() that recursively handled validation, but that seems extremely complex, especially if ExampleToo is reused in many places.
So there appears to be some packages like the go-playground/validator where validation can be specified both as functions and tags. However, this works on the struct created, and not the JSON itself. So if a field is tagged as validation:"required" on an integer, and the integer value is 0, this will return an error because 0 is both a valid value and the "zero value" for integers.
An example of the latter here: https://go.dev/play/p/zqSUksPzUiq
I could also use pointers for everything, checking for nil as missing values. The main problem with that is that it requires dereferencing on each use and is a pretty uncommon practice for things like integers and strings.
One thing that I have also considered is a "sister struct" that uses pointers to do validation for required fields. The process would basically be to write a validation method for each struct, then validate that sister struct. If it works, then deserialize the main struct (without pointers). I haven't started on this, just a concept I've thought about, but I'm hoping there are better validation options.
So... is there a better way to do JSON/YAML input validation on nested structs? I'm happy to mix methods where say UnmarshalJSON is used for doing some work like verifying fields exist, but I'd like to pass that back to the library to let it continue to call UnmarshalJSON on subsequent nested structs. I'd also rather defer to the JSON library for casting values into the struct, etc.

How to unmarshall time string into time.Time in golang?

I am reading data from multiple tables using JOIN, CONCAT, GROUP_CONCAT, JSON_OBJECT. The data is read into the below mentioned model using gorm.
type OrgUserDisPublisherData struct {
Disciplines datatypes.JSON `json:"disciplines" example:"[]"`
User datatypes.JSON `json:"user"`
}
This process is successfully completed. But then when I try to unmarshal the OrgUserDisPublisherData.Disciplines into another struct which has time.Time datatypes. I am getting the following error parsing time "\"2022-11-03 07:08:09.000000\"" as "\"2006-01-02T15:04:05Z07:00\"": cannot parse " 07:08:09.000000\"" as "T"
Final model used for unmarshalling
type Discipline struct {
Name string `json:"name"`
Code string `json:"code"`
IsPrimary uint `json:"isPrimary"`
IsAligned uint `json:"isAligned"`
IsTrainingFaculty uint `json:"isTrainingFaculty"`
AlignmentDate time.Time `json:"alignmentDate"`
UnalignmentDate time.Time `json:"UnalignmentDate"`
ExpiryDate time.Time `json:"expiryDate"`
ExternalId string `json:"externalId"`
Status string `json:"status"`
CreatedAt time.Time `json:"createdAt"`
UpdatedAt time.Time `json:"updatedAt"`
}
At the same time, while inserting data into the tables the same model was used and it does not throw any error related to time.
How can I handle time while unmarshalling, irrespective of the data that is present against the time property?

The problem here is that the default JSON marshalling behaviour for date/time types in GoLang structs is to use ISO8601 formatted date/time strings.
This is identified by the format string in the error message with the T separator between date and time and time-zone suffix. The values in your Discipline JSON string don't conform to this format, lacking both the T separator and any time-zone. Hence the error.
If you can influence the formatting of the JSON string produced by gorm, (not something I'm familiar with so cannot say whether you can or how to do so), then the simplest solution would be to ensure that your JSON string time fields are formatted as ISO8601/RFC3339 strings.
If you have no control over that, then you have two options:
Implement some pre-processing of the JSON via an intermediate map[string]any and reformat the appropriate fields. If the gorm formatting is at least consistent then this could be as easy as splitting the string on the space, remove the final 3dps from the time, append an appropriate timezone (or just Z if the times are UTC) then re-assemble with a T separator.
Use a custom time type with a json.Marshaller implementation that works correctly with the gorm formatted values (you still need to know what time zone applies to the persisted values and apply that correctly when marshalling).
Both of these are vulnerable to a change in the gorm formatting of date/time variables and mis-use (failing to pre-process in the case of Option #1 and mistakenly using time.Time rather than the custom type in the case of Option #2).
For this reason, modifying the formatted output from gorm would be the preferred approach for me, if it is possible.

The quickest solution was to format the data/value while reading it from DB, as suggested by #Deltics.
While querying the data from the DB, using the DATE_FORMAT() I am formatting the data into the format required by go/json
DATE_FORMAT(actual_data, '%Y-%m-%dT%TZ')

Excessive use of map[string]interface{} in go development?

The majority of my development experience has been from dynamically typed languages like PHP and Javascript. I've been practicing with Golang for about a month now by re-creating some of my old PHP/Javascript REST APIs in Golang. I feel like I'm not doing things the Golang way most of the time. Or more generally, I'm not use to working with strongly typed languages. I feel like I'm making excessive use of map[string]interface{} and slices of them to box up data as it comes in from http requests or when it gets shipped out as json http output. So what I'd like to know is if what I'm about to describe goes against the philosophy of golang development? Or if I'm breaking the principles of developing with strongly typed languages?
Right now, about 90% of the program flow for REST Apis I've rewritten with Golang can be described by these 5 steps.
STEP 1 - Receive Data
I receive http form data from http.Request.ParseForm() as formvals := map[string][]string. Sometimes I will store serialized JSON objects that need to be unmarshaled like jsonUserInfo := json.Unmarshal(formvals["user_information"][0]) /* gives some complex json object */.
STEP 2 - Validate Data
I do validation on formvals to make sure all the data values are what I expect before using it in SQL queries. I treat everyting as a string, then use Regex to determine if the string format and business logic is valid (eg. IsEmail, IsNumeric, IsFloat, IsCASLCompliant, IsEligibleForVoting,IsLibraryCardExpired etc...). I've written my own Regex and custom functions for these types of validations
STEP 3 - Bind Data to SQL Queries
I use golang's database/sql.DB to take my formvals and bind them to my Query and Exec functions like this Query("SELECT * FROM tblUser WHERE user_id = ?, user_birthday > ? ",formvals["user_id"][0], jsonUserInfo["birthday"]). I never care about the data types I'm supplying as arguments to be bound, so they're all probably strings. I trust the validation in the step immediately above has determined they are acceptable for SQL use.
STEP 4 - Bind SQL results to []map[string]interface{}{}
I Scan() the results of my queries into a sqlResult := []map[string]interface{}{} because I don't care if the value types are null, strings, float, ints or whatever. So the schema of an sqlResult might look like:
sqlResult =>
[0] {
"user_id":"1"
"user_name":"Bob Smith"
"age":"45"
"weight":"34.22"
},
[1] {
"user_id":"2"
"user_name":"Jane Do"
"age":nil
"weight":"22.22"
}
I wrote my own eager load function so that I can bind more information like so EagerLoad("tblAddress", "JOIN ON tblAddress.user_id",&sqlResult) which then populates sqlResult with more information of the type []map[string]interface{}{} such that it looks like this:
sqlResult =>
[0] {
"user_id":"1"
"user_name":"Bob Smith"
"age":"45"
"weight":"34.22"
"addresses"=>
[0] {
"type":"home"
"address1":"56 Front Street West"
"postal":"L3L3L3"
"lat":"34.3422242"
"lng":"34.5523422"
}
[1] {
"type":"work"
"address1":"5 Kennedy Avenue"
"postal":"L3L3L3"
"lat":"34.3422242"
"lng":"34.5523422"
}
},
[1] {
"user_id":"2"
"user_name":"Jane Do"
"age":nil
"weight":"22.22"
"addresses"=>
[0] {
"type":"home"
"address1":"56 Front Street West"
"postal":"L3L3L3"
"lat":"34.3422242"
"lng":"34.5523422"
}
}
STEP 5 - JSON Marshal and send HTTP Response
then I do a http.ResponseWriter.Write(json.Marshal(sqlResult)) and output data for my REST API
Recently, I've been revisiting articles with code samples that use structs in places I would have used map[string]interface{}. For example, I wanted to refactor Step 2 with a more standard approach that other golang developers would use. So I found this https://godoc.org/gopkg.in/go-playground/validator.v9, except all it's examples are with structs . I also noticed that most blogs that talk about database/sql scan their SQL results into typed variables or structs with typed properties, as opposed to my Step 4 which just puts everything into map[string]interface{}
Hence, i started writing this question. I feel the map[string]interface{} is so useful because majority of the time,I don't really care what the data is and it gives me to the freedom in Step 4 to construct any data schema on the fly before I dump it as JSON http response. I do all this with as little code verbosity as possible. But this means my code is not as ready to leverage Go's validation tools, and it doesn't seem to comply with the golang community's way of doing things.
So my question is, what do other golang developers do with regards to Step 2 and Step 4? Especially in Step 4...do Golang developers really encourage specifying the schema of the data through structs and strongly typed properties? Do they also specify structs with strongly typed properties along with every eager loading call they make? Doesn't that seem like so much more code verbosity?

It really depends on the requirements just like you have said you don't require to process the json it comes from the request or from the sql results. Then you can easily unmarshal into interface{}. And marshal the json coming from sql results.
For Step 2
Golang has library which works on validation of structs used to unmarshal json with tags for the fields inside.
https://github.com/go-playground/validator
type Test struct {
Field `validate:"max=10,min=1"`
}
// max will be checked then min
you can also go to godoc for validation library. It is very good implementation of validation for json values using struct tags.
For STEP 4
Most of the times, We use structs if we know the format and data of our JSON. Because it provides us more control over the data types and other functionality. For example if you wants to empty a JSON feild if you don't require it in your JSON. You should use struct with _ json tag.
Now you have said that you don't care if the result coming from sql is empty or not. But if you do it again comes to using struct. You can scan the result into struct with sql.NullTypes. With that also you can provide json tag for omitempty if you wants to omit the json object when marshaling the data when sending a response.
Struct values encode as JSON objects. Each exported struct field
becomes a member of the object, using the field name as the object
key, unless the field is omitted for one of the reasons given below.
The encoding of each struct field can be customized by the format
string stored under the "json" key in the struct field's tag. The
format string gives the name of the field, possibly followed by a
comma-separated list of options. The name may be empty in order to
specify options without overriding the default field name.
The "omitempty" option specifies that the field should be omitted from
the encoding if the field has an empty value, defined as false, 0, a
nil pointer, a nil interface value, and any empty array, slice, map,
or string.
As a special case, if the field tag is "-", the field is always
omitted. Note that a field with name "-" can still be generated using
the tag "-,".
Example of json tags
// Field appears in JSON as key "myName".
Field int `json:"myName"`
// Field appears in JSON as key "myName" and
// the field is omitted from the object if its value is empty,
// as defined above.
Field int `json:"myName,omitempty"`
// Field appears in JSON as key "Field" (the default), but
// the field is skipped if empty.
// Note the leading comma.
Field int `json:",omitempty"`
// Field is ignored by this package.
Field int `json:"-"`
// Field appears in JSON as key "-".
Field int `json:"-,"`
As you can analyze from above information given in Golang spec for json marshal. Struct provide so much control over json. That's why Golang developer most probably use structs.
Now on using map[string]interface{} you should use it when you don't the structure of your json coming from the server or the types of fields. Most Golang developers stick to structs wherever they can.

TypeScript types serialisation/deserialization in localstorage

I have a Typescript app. I use the localstorage for development purpose to store my objects and I have the problem at the deserialization.
I have an object meeting of type MeetingModel:
export interface MeetingModel {
date: moment.Moment; // from the library momentjs
}
I store this object in the localStorage using JSON.stringify(meeting).
I suppose that stringify call moment.toJson(), that returns an iso string, hence the value stored is: {"date":"2016-12-26T15:03:54.586Z"}.
When I retrieve this object, I do:
const stored = window.localStorage.getItem("meeting");
const meeting: MeetingModel = JSON.parse(stored);
The problem is: meeting.date contains a string instead of a moment !
So, first I'm wondering why TypeScript let this happen ? Why can I assign a string value instead of a Moment and the compiler agree ?
Second, how can I restore my objects from plain JSON objects (aka strings) into Typescript types ?
I can create a factory of course, but when my object database will grow up it will be a pain in the *** to do all this work.
Maybe there is a solution for better storing in the local storage in the first place?
Thank you

1) TypeScript is optionally typed. That means there are ways around the strictness of the type system. The any type allows you to do dynamic typing. This can come in very handy if you know what you are doing, but of course you can also shoot yourself in the foot.
This code will compile:
var x: string = <any> 1;
What is happening here is that the number 1 is casted to any, which to TypeScript means it will just assume you as a developer know what it is and how you to use it. Since the any type is then assigned to a string TypeScript is absolutely fine with it, even though you are likely to get errors during run-time, just like when you make a mistake when coding JavaScript.
Of course this is by design. TypeScript types only exist during compile time. What kind of string you put in JSON.parse is unknowable to TypeScript, because the input string only exists during run-time and can be anything. Hence the any type. TypeScript does offer so-called type guards. Type guards are bits of code that are understood during compile-time as well as run-time, but that is beyond the scope of your question (Google it if you're interested).
2) Serializing and deserializing data is usually not as simple as calling JSON.stringify and JSON.parse. Most type information is lost to JSON and typically the way you want to store objects (in memory) during run-time is very different from the way you want to store them for transfer or storage (in memory, on disk, or any other medium). For instance, during run-time you might need lookup tables, user/session state, private fields, library specific properties, while in storage you might want version numbers, timestamps, metadata, different types of normalization, etc. You can JSON.stringify anything you want in JavaScript land, but that does necessarily mean it is a good idea. You might want to design how you actually store data. For example, an iso string looks pretty, but takes a lot of bytes. If you have just a few that does not matter, but when you are transferring millions a second you might want to consider another format.
My advise to you would be to define interfaces for the objects you want to save and like moment create a .toJson method on your model object, which will return the DTO (Data Transfer Object) that you can simply serialize with JSON.stringify. Then on the way back you cast the any output of JSON.parse to your DTO and then convert it back to your model with a factory function or constructor of your creation. That might seem like a lot of boilerplate, but in my experience it is totally worth it, because now you are in control of what gets stored and that gives you a lot of flexility to change your model without getting deserialization problems.
Good luck!

You could use the reviver feature of JSON.parse to convert the string back to a moment:
JSON.parse(input, (key, value) => {
if (key == "date") {
return parseStringAsMoment(value);
} else {
return value;
});
Check browser support for reviver, though, as it's not the same as basic JSON.parse

Typescript String Based Enums

So I've read all the posts on String Based Enums in Typescript, but I couldn't find a solution that meets my requirements. Those would be:
Enums that provide code completion
Enums that can be iterated over
Not having to specify an element twice
String based
The possibilities I've seen so far for enums in typescript are:
enum MyEnum {bla, blub}: This fails at being string based, so I can't simply read from JSONs which are string based...
type MyEnum = 'bla' | 'blub': Not iterable and no code completion
Do it yourself class MyEnum { static get bla():string{return "bla"} ; static get blub():string{return "blub"}}: Specifies elements twice
So here come the questions:
There's no way to satisfy those requirements simultaneously? If no, will it be possible in the future?
Why didn't they make enums string based?
Did someone experience similar problems and how did you solve them?

I think that implementing Enum in a C-like style with numbers is fine, because an Enum (similar to a Symbol) is usually used to declare a value that is uniquely identifiable on development time. How the machine represents that value on run time doesn't really concern the developer.
But what we developer sometimes want (because we're all lazy and still want to have all the benefits!), is to use the Enum as an API or with an API that does not share that Enum with us, even though the API is essentially an Enum because the valid value of a property only is foo and bar.
I guess this is the reason why some languages have string based Enums :)
How TypeScript handles Enums
If you look at the transpiled JavaScript you can see that TypeScript just uses a plain JavaScript Object to implement an Enum. For example:
enum Color {
Red,
Green,
Blue
}
will be transpiled to:
{
0: "Red",
1: "Green",
2: "Blue",
Blue: 2,
Green: 1,
Red: 0
}
This means you can access the string value like Color[Color.Red]. You will still have code completion and you do not have to specify the values twice. But you can not just do Object.keys(Color) to iterate over the Enum, because the values exist "twice" on the object.

Why didn't they make enums string based
To be clear Enums are both number and string based in that direct access is number and reverse map is string (more on this).
Meeting your requirement
You key reason for ruling out raw enums is
so I can't simply read from JSONs which are string based...
You will experience the same thing e.g. when reading Dates cause JSON has no date data type. You would new Date("someServerDateTime") to convert these.
You would use the same strategy to go from server side enum (string) to TS enum (number). Easy done thanks to the reverse lookup MyEnum["someServerString"]
Hydration
This process of converting server side data to client side active data is sometimes called Hydration. My favorite lib for this at the moment is https://github.com/pleerock/class-transformer
I personally handle this stuff myself at the server access level i.e. hand write an API that makes the XHR + does the serialization.
At my last job we automated this with code generation that did even more than that (supported common validation patterns between server and client code).

We Keep Coding

html mysql json google-apps-script actionscript-3 ms-access google-chrome google-maps reporting-services sql-server-2008