Problems with bad characters in UTF-8 JSON strings with Delphi XE

We have a Delphi XE application that exchanges JSON data with a cloud application that is written in Node.
Generally everything works fine, but every once in a while we get bad (unknown) characters in some of our strings, and we are having trouble tracking the problem down. These characters are rendered as diamonds and have a character code of 65533 (U+FFFD, the Unicode replacement character).
We perform a REST POST call to get our data from the cloud, and we receive it as a JSON object that includes some metadata and an array of records. We use DBXJSON's TJSONObject for the JSON parsing, in the form of
jsv := TJSONObject.ParseJSONValue(s);
where s is the data we got from our POST call.
From this we use a TJSONArray to get the records, traverse it, and use TJSONValue.ToString to retrieve each string value.
The data is stored in DBISAM using VarChar fields.
Any ideas on how to detect and prevent these "bad" characters, or where else this could be happening?
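U+FFFD appears whenever a UTF-8 decoder meets bytes that are not valid UTF-8, for example text that was actually encoded as ANSI/Latin-1 somewhere along the pipeline. A minimal Go sketch of the mechanism (illustrative only; not the Delphi code):

package main

import "fmt"

func main() {
	// 0xE9 is 'é' in Latin-1, but on its own it is not valid UTF-8.
	bad := []byte{'C', 'h', 0xE9, 'z'}
	// Ranging over a string decodes UTF-8; invalid bytes become U+FFFD (65533).
	for _, r := range string(bad) {
		fmt.Printf("%c U+%04X\n", r, r)
	}
}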

Related

What is the "correct" way to format json from SQL Server?

I have a stored proc that returns json. The proc works fine and when I call it in SSMS it returns a json object that I can click and see as valid json. When I call it through Visual Studio it does the same, and clicking into it and pressing return formats it nicely so it's readable (and not all on a single line). If I take the string and pass it back into the sql function IsJson, it also returns true, so I'm 100% confident that the json coming out of the proc is valid.
However, my frontend developer is unable to parse it, and it seems to boil down to something trying to convert it somewhere.
The json (as output from the proc) is:
{"APIResult":[{"ID":200,"Status_Message":"Success","Developer_Message":"Successful login for D56D10AC-3A74-42FC-8BB5-F783A6C0C556 33E4F907-F1E5-4F1B-8CA5-5521291E683A (AppID: (null)).","User_Message":"Successful login","User":[{"ID":"D56D10AC-3A74-42FC-8BB5-F783A6C0C556"}]}]}
Using Postman to hit the live API (or localhost), I get back the data I expect (as above); however, it wraps the entire thing in double quotes and escapes all the double quotes around each element, so I get:
"{\"APIResult\":[{\"ID\":200,\"Status_Message\":\"Success\",\"Developer_Message\":\"Successful login for D56D10AC-3A74-42FC-8BB5-F783A6C0C556 33E4F907-F1E5-4F1B-8CA5-5521291E683A (AppID: (null)).\",\"User_Message\":\"Successful login\",\"User\":[{\"ID\":\"D56D10AC-3A74-42FC-8BB5-F783A6C0C556\"}]}]}"
When I try to parse this in SQL (using ISJSON) it returns false. However, jsonlint.com interprets the same string as valid. The parser my developer is using (http://json.parser.online.fr) throws errors on it too.
At a push, I can tell my developer to strip the first and last characters and replace every \" with " but this seems a bit faffy.
Are there different interpretations of JSON? I've read that this is often caused by the calling app expecting a string that it needs to JSON-ify: because it's getting a JSON object, it ends up JSON-ifying the JSON. But even if I force SQL Server to output the string as a string and not as JSON, it still seems the same. SQL is definitely pushing it out correctly, but whatever my developer is calling the API through returns the same format as Postman and doesn't like it. How can I ensure that what the calling code gets is what SQL is giving, and not some weird interpretation?
@Tim Try to deserialize the output of the stored procedure. Use JsonConvert.DeserializeObject in C#.
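The mechanics are easiest to see as two explicit decode passes, since the payload is a JSON string whose content is itself JSON. A minimal sketch of that idea in Go (the payload here is trimmed from the example above; the C# JsonConvert call in the answer does the equivalent):

package main

import (
	"encoding/json"
	"fmt"
)

func main() {
	// The API returns a JSON string literal whose content is itself JSON.
	wrapped := `"{\"APIResult\":[{\"ID\":200,\"Status_Message\":\"Success\"}]}"`

	// First decode: unwrap the outer string.
	var inner string
	if err := json.Unmarshal([]byte(wrapped), &inner); err != nil {
		panic(err)
	}

	// Second decode: parse the actual document.
	var doc map[string]interface{}
	if err := json.Unmarshal([]byte(inner), &doc); err != nil {
		panic(err)
	}
	fmt.Println(doc["APIResult"])
}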

Excessive use of map[string]interface{} in go development?

The majority of my development experience has been with dynamically typed languages like PHP and JavaScript. I've been practicing with Golang for about a month now by re-creating some of my old PHP/JavaScript REST APIs in Golang. I feel like I'm not doing things the Golang way most of the time, or, more generally, that I'm not used to working with strongly typed languages. I feel like I'm making excessive use of map[string]interface{} and slices of them to box up data as it comes in from HTTP requests or when it gets shipped out as JSON HTTP output. So what I'd like to know is: does what I'm about to describe go against the philosophy of Golang development? Am I breaking the principles of developing with strongly typed languages?
Right now, about 90% of the program flow for the REST APIs I've rewritten in Golang can be described by these 5 steps.
STEP 1 - Receive Data
I receive HTTP form data via http.Request.ParseForm() as formvals := map[string][]string. Sometimes I will store serialized JSON objects that need to be unmarshaled, like jsonUserInfo := json.Unmarshal(formvals["user_information"][0]) /* gives some complex json object */.
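As written, that Unmarshal call is shorthand: json.Unmarshal takes a destination pointer and returns only an error. A corrected sketch (the sample payload is hypothetical):

package main

import (
	"encoding/json"
	"fmt"
)

func main() {
	// Stand-in for formvals["user_information"][0] from the question.
	raw := `{"name":"Bob Smith","birthday":"1990-01-01"}`

	var jsonUserInfo map[string]interface{}
	if err := json.Unmarshal([]byte(raw), &jsonUserInfo); err != nil {
		panic(err) // handle malformed input instead of ignoring it
	}
	fmt.Println(jsonUserInfo["birthday"])
}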
STEP 2 - Validate Data
I do validation on formvals to make sure all the data values are what I expect before using them in SQL queries. I treat everything as a string, then use regexes to determine whether the string format and business logic are valid (e.g. IsEmail, IsNumeric, IsFloat, IsCASLCompliant, IsEligibleForVoting, IsLibraryCardExpired, etc.). I've written my own regexes and custom functions for these types of validations.
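One of those custom validators might look like the following. This is a hypothetical sketch, since the question doesn't show the actual implementations, and the regex is deliberately simplistic:

package main

import (
	"fmt"
	"regexp"
)

// A deliberately simple pattern; production email validation is usually looser.
var emailRe = regexp.MustCompile(`^[^@\s]+@[^@\s]+\.[^@\s]+$`)

// IsEmail reports whether s looks like an email address.
func IsEmail(s string) bool {
	return emailRe.MatchString(s)
}

func main() {
	fmt.Println(IsEmail("bob@example.com")) // true
	fmt.Println(IsEmail("not-an-email"))    // false
}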
STEP 3 - Bind Data to SQL Queries
I use Golang's database/sql.DB to take my formvals and bind them to my Query and Exec calls, like this: Query("SELECT * FROM tblUser WHERE user_id = ? AND user_birthday > ?", formvals["user_id"][0], jsonUserInfo["birthday"]). I never care about the data types I'm supplying as arguments to be bound, so they're all probably strings. I trust that the validation in the step immediately above has determined they are acceptable for SQL use.
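In context, that call would sit in a handler roughly like this (a sketch; db, formvals, and jsonUserInfo are assumed to exist as in the earlier steps):

rows, err := db.Query(
	"SELECT * FROM tblUser WHERE user_id = ? AND user_birthday > ?",
	formvals["user_id"][0],
	jsonUserInfo["birthday"],
)
if err != nil {
	// handle the query error
	return
}
defer rows.Close()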
STEP 4 - Bind SQL results to []map[string]interface{}{}
I Scan() the results of my queries into sqlResult := []map[string]interface{}{} because I don't care if the value types are null, strings, floats, ints, or whatever. So the schema of an sqlResult might look like:
sqlResult =>
[0] {
"user_id":"1"
"user_name":"Bob Smith"
"age":"45"
"weight":"34.22"
},
[1] {
"user_id":"2"
"user_name":"Jane Do"
"age":nil
"weight":"22.22"
}
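For reference, the usual shape of the generic scan the question describes (reading arbitrary rows into []map[string]interface{}) is something like this sketch; it assumes database/sql and is not the asker's actual code:

func rowsToMaps(rows *sql.Rows) ([]map[string]interface{}, error) {
	cols, err := rows.Columns()
	if err != nil {
		return nil, err
	}
	var out []map[string]interface{}
	for rows.Next() {
		vals := make([]interface{}, len(cols))
		ptrs := make([]interface{}, len(cols))
		for i := range vals {
			ptrs[i] = &vals[i] // Scan needs a pointer for each column
		}
		if err := rows.Scan(ptrs...); err != nil {
			return nil, err
		}
		m := make(map[string]interface{}, len(cols))
		for i, c := range cols {
			// NULL columns come through as nil; depending on the driver,
			// string columns may arrive as []byte.
			m[c] = vals[i]
		}
		out = append(out, m)
	}
	return out, rows.Err()
}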
I wrote my own eager-load function so that I can bind more information, like so: EagerLoad("tblAddress", "JOIN ON tblAddress.user_id", &sqlResult), which then populates sqlResult with more information of type []map[string]interface{}{}, so that it looks like this:
sqlResult =>
[0] {
"user_id":"1"
"user_name":"Bob Smith"
"age":"45"
"weight":"34.22"
"addresses"=>
[0] {
"type":"home"
"address1":"56 Front Street West"
"postal":"L3L3L3"
"lat":"34.3422242"
"lng":"34.5523422"
}
[1] {
"type":"work"
"address1":"5 Kennedy Avenue"
"postal":"L3L3L3"
"lat":"34.3422242"
"lng":"34.5523422"
}
},
[1] {
"user_id":"2"
"user_name":"Jane Do"
"age":nil
"weight":"22.22"
"addresses"=>
[0] {
"type":"home"
"address1":"56 Front Street West"
"postal":"L3L3L3"
"lat":"34.3422242"
"lng":"34.5523422"
}
}
STEP 5 - JSON Marshal and send HTTP Response
Then I do an http.ResponseWriter.Write(json.Marshal(sqlResult)) and output the data for my REST API.
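As in Step 1, that line is shorthand; json.Marshal returns the encoded bytes plus an error, so a working handler fragment looks roughly like this (w is assumed to be the http.ResponseWriter):

payload, err := json.Marshal(sqlResult)
if err != nil {
	http.Error(w, "encoding error", http.StatusInternalServerError)
	return
}
w.Header().Set("Content-Type", "application/json")
w.Write(payload)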
Recently, I've been revisiting articles with code samples that use structs in places where I would have used map[string]interface{}. For example, I wanted to refactor Step 2 with a more standard approach that other Golang developers would use, so I found https://godoc.org/gopkg.in/go-playground/validator.v9, except all its examples are with structs. I also noticed that most blogs that talk about database/sql scan their SQL results into typed variables or structs with typed properties, as opposed to my Step 4, which just puts everything into map[string]interface{}.
Hence, I started writing this question. I feel map[string]interface{} is so useful because, the majority of the time, I don't really care what the data is, and it gives me the freedom in Step 4 to construct any data schema on the fly before I dump it as a JSON HTTP response. I do all this with as little code verbosity as possible. But this means my code is not as ready to leverage Go's validation tools, and it doesn't seem to comply with the Golang community's way of doing things.
So my question is: what do other Golang developers do with regard to Step 2 and Step 4? Especially in Step 4...do Golang developers really encourage specifying the schema of the data through structs and strongly typed properties? Do they also specify structs with strongly typed properties along with every eager-loading call they make? Doesn't that seem like so much more code verbosity?
It really depends on the requirements. As you have said, you don't need to process the JSON as it comes in from the request or from the SQL results, so you can easily unmarshal into interface{} and marshal the JSON coming from the SQL results.
For Step 2
Golang has a library that validates structs (the ones you unmarshal your JSON into) using tags on the fields:
https://github.com/go-playground/validator
type Test struct {
	Field string `validate:"max=10,min=1"`
}
// max will be checked, then min
You can also check the godoc for the validator library. It is a very good implementation of validation for JSON values using struct tags.
For STEP 4
Most of the time, we use structs if we know the format and data of our JSON, because they provide more control over the data types and other functionality. For example, if you want to drop a field entirely because you don't need it in your JSON, you should use a struct with the "-" json tag.
Now, you have said that you don't care whether the result coming from SQL is empty or not. But if you do care, it again comes down to using a struct: you can scan the result into a struct with the sql.Null* types, and you can also give the field an omitempty json tag if you want the field omitted when marshaling the response (a sketch follows after the json tag examples below).
Struct values encode as JSON objects. Each exported struct field
becomes a member of the object, using the field name as the object
key, unless the field is omitted for one of the reasons given below.
The encoding of each struct field can be customized by the format
string stored under the "json" key in the struct field's tag. The
format string gives the name of the field, possibly followed by a
comma-separated list of options. The name may be empty in order to
specify options without overriding the default field name.
The "omitempty" option specifies that the field should be omitted from
the encoding if the field has an empty value, defined as false, 0, a
nil pointer, a nil interface value, and any empty array, slice, map,
or string.
As a special case, if the field tag is "-", the field is always
omitted. Note that a field with name "-" can still be generated using
the tag "-,".
Example of json tags
// Field appears in JSON as key "myName".
Field int `json:"myName"`
// Field appears in JSON as key "myName" and
// the field is omitted from the object if its value is empty,
// as defined above.
Field int `json:"myName,omitempty"`
// Field appears in JSON as key "Field" (the default), but
// the field is skipped if empty.
// Note the leading comma.
Field int `json:",omitempty"`
// Field is ignored by this package.
Field int `json:"-"`
// Field appears in JSON as key "-".
Field int `json:"-,"`
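Putting the two together, here is a hypothetical sketch of scanning a nullable column through sql.NullInt64 and omitting it from the response with a pointer field plus omitempty (the table, columns, and names are illustrative; db and id are assumed to exist):

type User struct {
	ID     int64   `json:"user_id"`
	Name   string  `json:"user_name"`
	Age    *int64  `json:"age,omitempty"` // nil (SQL NULL) is omitted when marshaling
	Weight float64 `json:"weight"`
}

var u User
var age sql.NullInt64
err := db.QueryRow(
	"SELECT user_id, user_name, age, weight FROM tblUser WHERE user_id = ?", id,
).Scan(&u.ID, &u.Name, &age, &u.Weight)
if err != nil {
	// handle the lookup error
}
if age.Valid {
	u.Age = &age.Int64
}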
As you can see from the information above, taken from the Golang documentation for json.Marshal, structs provide a great deal of control over JSON. That's why Golang developers mostly use structs.
As for map[string]interface{}: use it when you don't know the structure of the JSON coming from the server, or the types of its fields. Most Golang developers stick to structs wherever they can.

http rest: Disadvantages of using json in query string?

I want to use the query string as JSON, for example: /api/uri?{"key":"value"}, instead of /api/uri?key=value. The advantages, from my point of view, are:
JSON keeps the types of parameters, such as booleans, ints, floats and strings. A standard query string treats all parameters as strings.
JSON has fewer symbols for deeply nested structures. For example, ?{"a":{"b":{"c":[1,2,3]}}} vs ?a[b][c][]=1&a[b][c][]=2&a[b][c][]=3
It is easier to build on the client side.
What disadvantages could be in that case of json usage?
It looks good if it's
/api/uri?{"key":"value"}
as stated in your example, but since it's part of the URL it gets percent-encoded to:
/api/uri?%7B%22key%22%3A%22value%22%7D
or something similar, which makes /api/uri?key=value simpler than the encoded form, in particular for debugging outbound calls (i.e. when you want to check the actual request via Wireshark, etc.). Also notice that it takes up more characters when encoded to a valid URL (see browser URL length limits).
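You can reproduce the encoding directly (a quick Go sketch):

package main

import (
	"fmt"
	"net/url"
)

func main() {
	fmt.Println(url.QueryEscape(`{"key":"value"}`))
	// Output: %7B%22key%22%3A%22value%22%7D
}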
For the case of 'fewer symbols for nested structures', it might be more appropriate to create a new resource for your sub-resource, where you handle the filtering through your query parameters;
api/a?{"b":1}
to
api/b?bfield1=1
api/a?aBfield1=1 // or something similar
Lastly, for 'easier to build on the client side', I think it depends on what you use to create your client; query params are usually represented as maps, so it is still simple.
Also if you need a collection for a single param then:
/uri/resource?param1=value1,value2,value3
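Splitting such a collection back out on the server is then a one-liner; a Go sketch, assuming r is the *http.Request and the parameter name from the example above:

values := strings.Split(r.URL.Query().Get("param1"), ",")
// values == []string{"value1", "value2", "value3"}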

Does PL/JSON software support UTF-8 characters

I'm using PL/JSON to parse data from MongoDB into Oracle DB. The packages work fine with Latin characters. However, whenever a json_value contains Chinese characters, the resulting value in Oracle is totally corrupted (question marks, symbols, etc.).
As an example, I use the following line to parse:
Remarks := json_ext.get_string(json(l_list.get(i)),'Remarks');
I'm aware that the json type and the get_string function use varchar and not nvarchar. More importantly, the Oracle DB instance supports Chinese (when I insert directly into a table, no corruption occurs; only when I parse a Chinese JSON file does it get corrupted).
So my question is: does PL/JSON support Chinese characters? Do I need to update the package on my own to accommodate them? What exactly needs to be fixed?
In addition, I ran the following:
select value from nls_database_parameters where parameter='NLS_CHARACTERSET';
and it returns AL32UTF8.
Does that not mean my DB is configured for UTF-8? Or are there other things that need to be checked for this purpose?
Update
Inspired by @jsumners' comment, I tried to track down the source of the corruption, and I found that it occurs from the very beginning. That is, the snippet below runs before PL/JSON is used, to receive the HTTP response:
BEGIN
LOOP
UTL_HTTP.read_text(l_http_response, buf);
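-- NOTE (added, an assumption to verify): read_text converts the body using
-- the response character set; if the server does not declare UTF-8, calling
-- UTL_HTTP.set_body_charset(l_http_response, 'UTF-8') before reading
-- may be needed to avoid corruption.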
DBMS_OUTPUT.PUT_LINE(buf);
l_response_text := l_response_text || buf;
END LOOP;
EXCEPTION
WHEN UTL_HTTP.end_of_body THEN
NULL;
END;
The line DBMS_OUTPUT.PUT_LINE(buf); outputs corrupted JSON fields. I'm not sure whether this is related to the way I send and receive HTTP requests. Of note: buf is an nvarchar2.

JSON feed in UTF-8 without byte order marker

I have a WCF application written in C# that delivers my data in JSON or XML, depending on what the user asks for in the query string. Here is a quick snippet of the code that delivers the data:
Encoding utf8 = new System.Text.UTF8Encoding(false);
return WebOperationContext.Current.CreateTextResponse(data, "application/json", utf8);
When I deliver the data using the above method, the special characters are all messed up, so Chávez looks like ChÃ¡vez. On the other hand, if I create the utf8 variable above with the BOM, or use the enum (Encoding.UTF8), the special characters work fine. But then some of my consumers complain that their code throws exceptions when consuming my API. This, of course, is happening because of the BOM in the feed. Is there a way for me to correctly display the special characters without the BOM in the feed?
It looks like the output is correct, but whatever you are using to display it expects ANSI-encoded text. ChÃ¡vez is what you get when you encode Chávez in UTF-8 and interpret the result as if it were Latin-1.
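The mechanism is easy to reproduce; a small Go sketch (illustrative only, unrelated to the WCF code) that takes the UTF-8 bytes of Chávez and reads each byte as a Latin-1 code point:

package main

import "fmt"

func main() {
	utf8Bytes := []byte("Chávez") // 'á' encodes to the two bytes 0xC3 0xA1

	// Latin-1 maps every byte value to the Unicode code point with the same
	// number, so 0xC3 0xA1 decodes to 'Ã' and '¡' instead of 'á'.
	runes := make([]rune, len(utf8Bytes))
	for i, b := range utf8Bytes {
		runes[i] = rune(b)
	}
	fmt.Println(string(runes)) // prints: ChÃ¡vez
}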