Use a period in a field name in a Matlab struct - json

I'm using webwrite to post to an API. One of the field names in the JSON object I'm trying to set up for posting is odata.metadata. I'm making a struct that looks like this for the JSON object:
json = struct('odata.metadata', metadata, 'odata.type', type, 'Name', name);
But I get an error
Error using struct
Invalid field name "odata.metadata"
Here's the json object I'm trying to use in Matlab. All strings for simplicity:
{
"odata.metadata": "https://website.com#Element",
"odata.type": "Blah.Blah.This.That",
"Name": "My Object"
}
Is there a way to submit this json object or is it a lost cause?

Field names are not allowed to have dots in them, because a dot would be confused with accessing a nested structure within the structure itself.
For example, doing json.odata.metadata would be interpreted as json being a struct with a member whose field name is odata where odata has another member whose field name is metadata. This would not be interpreted as a member with the combined field name as odata.metadata. You're going to have to rename the field to something else or change the convention of your field name slightly.
Usually, the convention is to replace dots with underscores. An automated way to take care of this if you're not willing to manually rename the field names yourself is to use a function called matlab.lang.makeValidName that takes in a string and converts it into a valid field name. This function was introduced in R2014a. For older versions, it's called genvarname.
For example:
>> matlab.lang.makeValidName('odata.metadata')
ans =
odata_metadata
As such, either replace all dots with _ to ensure no ambiguities or use matlab.lang.makeValidName or genvarname to take care of this for you.

I would suggest using a containers.Map instead of a struct to store your data, and then creating your JSON string by iterating over the Map keys and appending them along with the data to your JSON.
Here's a quick demonstration of what I mean:
%// Prepare the Map and the Data:
metadata = 'https://website.com#Element';
type = 'Blah.Blah.This.That';
name = 'My Object';
example_map = containers.Map({'odata.metadata','odata.type','Name'},...
{metadata,type,name});
%// Convert to JSON:
JSONstr = '{'; %// Initialization
map_keys = keys(example_map);
map_vals = values(example_map);
for ind1 = 1:example_map.Count
JSONstr = [JSONstr '"' map_keys{ind1} '":"' map_vals{ind1} '",'];
end
JSONstr =[JSONstr(1:end-1) '}']; %// Finalization (get rid of the last ',' and close)
Which results in a valid JSON string.
Obviously if your values aren't strings you'll need to convert them using num2str etc.
Another alternative you might want to consider is the JSONlab FEX submission. I saw that its savejson.m is able to accept cell arrays - which can hold any string you like.
Other alternatives may include any of the numerous Java or python JSON libraries which you can call from MATLAB.

I probably shouldn't add this as an answer - but you can have '.' in a struct fieldname...
Before I go further: I do not advocate this, and it will almost certainly cause bugs and a lot of trouble down the road. @rayryeng's method is a better approach.
If your struct is created by a mex function which creates a field that contains a ".", then you will get what you're after.
To create your own test, see the MathWorks example and modify it accordingly.
(I won't put the full code here, to discourage the practice.)
If you update the char example and compile to test_mex you get:
>> obj = test_mex
obj =
Doublestuff: [1x100 double]
odata.metadata: 'This is my char'
Note: You can only access your custom field in Matlab using dynamic fieldnames:
obj.('odata.metadata')
You need to use a mex capability to update it...


Processing JSON from a .txt file and converting to a DataFrame in Julia

Cross posting from Julia Discourse in case anyone here has any leads.
I’m just looking for some insight into why the below code is returning a dataframe containing just the first line of my json file. If you’d like to try working with the file I’m working with, you can download the aminer_papers_0.zip from the Microsoft Open Academic Graph site, I’m using the first file in that group of files.
using JSON3, DataFrames, CSV
file_name = "path/aminer_papers_0.txt"
json_string = read(file_name, String)
js = JSON3.read(json_string)
df = DataFrame([js])
The resulting DataFrame has just one line, but the column titles are correct, as is the first line. To me the mystery is why the rest isn’t getting processed. I think I can rule out that read() is only reading the first JSON object, because I can index into the resulting object and see many JSON objects:
My first guess was maybe the newline \n was causing escape issues, and tried to use chomp to get rid of them, but couldn’t get it to work.
Anyway - any help would be greatly appreciated!
I think the problem is that the file is in JSON Lines format, and the JSON3 library only returns the first valid JSON value that it finds at the start of a string unless told otherwise.
tl;dr
Call JSON3.read with the keyword argument jsonlines=true.
Why?
By default, JSON3 interprets a string passed to its read function as a single "JSON text", defined by RFC 8259 section 2:
A JSON text is a serialized value....
(My emphasis on the use of the indefinite singular article "a.") A "JSON value" is defined in section 3:
A JSON value MUST be an object, array, number, or string, or one of the following three literal names: false, null, true.
A string with multiple JSON values in it is technically multiple "JSON texts." It is up to the parser to determine what part of the string argument you give it is a JSON text, and the authors of JSON3 chose as the default behavior to parse from the start of the string to the end of the first valid JSON value.
In order to get JSON3 to read the string as multiple JSON values, you have to give it the keyword option jsonlines=true, which is documented as:
jsonlines: A Bool indicating that the json_str contains newline delimited JSON strings, which will be read into a JSON3.Array of the JSON values. See jsonlines for reference. [default false]
Example
Take for example this simple string:
two_values = "3.14\n2.72"
Each one of these lines is a valid JSON serialization of a number. However, when passed to JSON3.read, only the first is parsed:
using JSON3
@assert JSON3.read(two_values) == 3.14
Using jsonlines=true, both values are parsed and returned as a JSON3.Array struct:
@assert JSON3.read(two_values, jsonlines=true) == [3.14, 2.72]
Other Packages
The JSON.jl library, which people might use by default given the name, does not implement parsing of JSON Lines strings at all, leaving it up to the caller to properly split the string as needed:
using JSON
JSON.parse(two_values)
# ERROR: Expected end of input
# Line: 1
# Around: ...3.14 2.72...
# ^
A simple way to implement reading multiple values is to use eachline:
@assert [JSON.parse(line) for line in eachline(IOBuffer(two_values))] == [3.14, 2.72]

F# How to Write/Read to a CSV File

I am working on an assignment using F# where I have to add in a specific student and his information to a large Students.txt file
The Students.txt file contains each student's last name, first name, middle initial, phone number, email, and GPA.
A snippet of the Students.txt file
If I am trying to add in this information and then read from the file:
type Phone =
type Email =
type StudentInfo =
{ firstName : string;
middleInitial : char option;
lastName : string;
phone : Phone;
email : Email option;
gpa : float }
let addPhone input =
let addEmail input =
let readStudentsFromCSV filename =
let students = readStudentsFromCSV "Students.txt"
I need insight on how to write these functions.
Note: This is only a snippet of my code.
There are a number of options. The main question is whether you want to write your own CSV parsing, or whether you want to use an existing library.
If this is an assignment, it might require you to write your own parser (doing that would probably be a bad idea in the real world, because real world CSV files can be very messy, but it might be fine if your input is very regular). If you were to use a library, the F# Data library is what most people in the F# community would use.
F# Data comes with a type provider called CsvProvider which infers the type of rows in a CSV file for you, so you do not have to write explicit type definitions. (Or you can still do that, but then load data into your structures using a simple transformation.)
F# Data also has CsvFile type, which just does the parsing, but then returns data as a sequence of rows (which are themselves arrays of string values). This might be nice if you just need something to take care of splitting the lines, but want to do the rest of the work.
If you wanted to write CSV parsing on your own, you can use File.ReadAllLines to read individual rows and then row.Split(',') to turn each row into an array of strings using , as the separator. This could work on your file, but it will break if there is any escaping in your file (for example, Foo, "a, b", Bar is just three columns, but a plain split on commas yields four).

Excessive use of map[string]interface{} in go development?

The majority of my development experience has been with dynamically typed languages like PHP and Javascript. I've been practicing with Golang for about a month now by re-creating some of my old PHP/Javascript REST APIs in Golang. I feel like I'm not doing things the Golang way most of the time. Or more generally, I'm not used to working with strongly typed languages. I feel like I'm making excessive use of map[string]interface{} and slices of them to box up data as it comes in from http requests or when it gets shipped out as json http output. So what I'd like to know is whether what I'm about to describe goes against the philosophy of golang development, or whether I'm breaking the principles of developing with strongly typed languages.
Right now, about 90% of the program flow for REST Apis I've rewritten with Golang can be described by these 5 steps.
STEP 1 - Receive Data
I receive http form data from http.Request.ParseForm() as formvals := map[string][]string. Sometimes I will store serialized JSON objects that need to be unmarshaled like jsonUserInfo := json.Unmarshal(formvals["user_information"][0]) /* gives some complex json object */.
STEP 2 - Validate Data
I do validation on formvals to make sure all the data values are what I expect before using them in SQL queries. I treat everything as a string, then use Regex to determine if the string format and business logic is valid (eg. IsEmail, IsNumeric, IsFloat, IsCASLCompliant, IsEligibleForVoting, IsLibraryCardExpired etc...). I've written my own Regex and custom functions for these types of validations.
STEP 3 - Bind Data to SQL Queries
I use golang's database/sql.DB to take my formvals and bind them to my Query and Exec functions like this Query("SELECT * FROM tblUser WHERE user_id = ? AND user_birthday > ? ",formvals["user_id"][0], jsonUserInfo["birthday"]). I never care about the data types I'm supplying as arguments to be bound, so they're all probably strings. I trust the validation in the step immediately above has determined they are acceptable for SQL use.
STEP 4 - Bind SQL results to []map[string]interface{}{}
I Scan() the results of my queries into a sqlResult := []map[string]interface{}{} because I don't care if the value types are null, strings, float, ints or whatever. So the schema of an sqlResult might look like:
sqlResult =>
[0] {
"user_id":"1"
"user_name":"Bob Smith"
"age":"45"
"weight":"34.22"
},
[1] {
"user_id":"2"
"user_name":"Jane Do"
"age":nil
"weight":"22.22"
}
I wrote my own eager load function so that I can bind more information like so EagerLoad("tblAddress", "JOIN ON tblAddress.user_id",&sqlResult) which then populates sqlResult with more information of the type []map[string]interface{}{} such that it looks like this:
sqlResult =>
[0] {
"user_id":"1"
"user_name":"Bob Smith"
"age":"45"
"weight":"34.22"
"addresses"=>
[0] {
"type":"home"
"address1":"56 Front Street West"
"postal":"L3L3L3"
"lat":"34.3422242"
"lng":"34.5523422"
}
[1] {
"type":"work"
"address1":"5 Kennedy Avenue"
"postal":"L3L3L3"
"lat":"34.3422242"
"lng":"34.5523422"
}
},
[1] {
"user_id":"2"
"user_name":"Jane Do"
"age":nil
"weight":"22.22"
"addresses"=>
[0] {
"type":"home"
"address1":"56 Front Street West"
"postal":"L3L3L3"
"lat":"34.3422242"
"lng":"34.5523422"
}
}
STEP 5 - JSON Marshal and send HTTP Response
then I do a http.ResponseWriter.Write(json.Marshal(sqlResult)) and output data for my REST API
Recently, I've been revisiting articles with code samples that use structs in places I would have used map[string]interface{}. For example, I wanted to refactor Step 2 with a more standard approach that other golang developers would use. So I found this https://godoc.org/gopkg.in/go-playground/validator.v9, except all its examples are with structs. I also noticed that most blogs that talk about database/sql scan their SQL results into typed variables or structs with typed properties, as opposed to my Step 4 which just puts everything into map[string]interface{}.
Hence, I started writing this question. I feel that map[string]interface{} is so useful because, the majority of the time, I don't really care what the data is, and it gives me the freedom in Step 4 to construct any data schema on the fly before I dump it as a JSON http response. I do all this with as little code verbosity as possible. But this means my code is not as ready to leverage Go's validation tools, and it doesn't seem to comply with the golang community's way of doing things.
So my question is, what do other golang developers do with regards to Step 2 and Step 4? Especially in Step 4...do Golang developers really encourage specifying the schema of the data through structs and strongly typed properties? Do they also specify structs with strongly typed properties along with every eager loading call they make? Doesn't that seem like so much more code verbosity?
It really depends on your requirements. As you say, you don't need to process the JSON that comes in from the request or the data that comes out of the SQL results. In that case you can simply unmarshal the incoming JSON into an interface{}, and marshal the SQL results straight back to JSON.
For Step 2
Go has a library that validates the structs you unmarshal JSON into, using tags on the struct fields:
https://github.com/go-playground/validator
type Test struct {
Field int `validate:"max=10,min=1"`
}
// max will be checked then min
You can also read the godoc for the validation library. It is a very good implementation of validation for JSON values using struct tags.
For STEP 4
Most of the time, we use structs if we know the format and data of our JSON, because they give us more control over the data types and other functionality. For example, if you want to leave a field out of your JSON because you don't need it, you should use a struct with the - json tag.
You said that you don't care whether a value coming from SQL is null or not. If you do care, it again comes down to using a struct: you can scan the result into a struct whose fields use the sql.Null* types (sql.NullString, sql.NullInt64, and so on). You can also add the omitempty json tag if you want to omit a field when it is empty while marshaling the response. From the encoding/json documentation:
Struct values encode as JSON objects. Each exported struct field
becomes a member of the object, using the field name as the object
key, unless the field is omitted for one of the reasons given below.
The encoding of each struct field can be customized by the format
string stored under the "json" key in the struct field's tag. The
format string gives the name of the field, possibly followed by a
comma-separated list of options. The name may be empty in order to
specify options without overriding the default field name.
The "omitempty" option specifies that the field should be omitted from
the encoding if the field has an empty value, defined as false, 0, a
nil pointer, a nil interface value, and any empty array, slice, map,
or string.
As a special case, if the field tag is "-", the field is always
omitted. Note that a field with name "-" can still be generated using
the tag "-,".
Example of json tags
// Field appears in JSON as key "myName".
Field int `json:"myName"`
// Field appears in JSON as key "myName" and
// the field is omitted from the object if its value is empty,
// as defined above.
Field int `json:"myName,omitempty"`
// Field appears in JSON as key "Field" (the default), but
// the field is skipped if empty.
// Note the leading comma.
Field int `json:",omitempty"`
// Field is ignored by this package.
Field int `json:"-"`
// Field appears in JSON as key "-".
Field int `json:"-,"`
As you can see from the encoding/json documentation quoted above, structs give you a great deal of control over the JSON. That's why Go developers mostly use structs.
As for map[string]interface{}, use it when you don't know the structure of the JSON coming from the server, or the types of its fields. Most Go developers stick to structs wherever they can.
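To make the struct approach for Step 4 concrete, here is a minimal sketch of the question's sqlResult as typed structs. The type names User and Address, the choice of field types, and the Password field are all assumptions made for illustration. A pointer is used for the nullable age so that it encodes as null; the values are filled in by hand rather than scanned from a database.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Address is a hypothetical type for one row of tblAddress.
type Address struct {
	Type     string  `json:"type"`
	Address1 string  `json:"address1"`
	Postal   string  `json:"postal"`
	Lat      float64 `json:"lat"`
	Lng      float64 `json:"lng"`
}

// User is a hypothetical type for one row of tblUser plus its
// eagerly loaded addresses.
type User struct {
	UserID    int       `json:"user_id"`
	UserName  string    `json:"user_name"`
	Age       *int      `json:"age"`                 // nil pointer encodes as null
	Weight    float64   `json:"weight"`
	Addresses []Address `json:"addresses,omitempty"` // left out when empty
	Password  string    `json:"-"`                   // illustrative field, never encoded
}

// encodeUsers marshals the typed result set to a JSON string.
func encodeUsers(users []User) (string, error) {
	b, err := json.Marshal(users)
	return string(b), err
}

func main() {
	age := 45
	users := []User{
		{
			UserID: 1, UserName: "Bob Smith", Age: &age, Weight: 34.22,
			Addresses: []Address{
				{Type: "home", Address1: "56 Front Street West", Postal: "L3L3L3", Lat: 34.3422242, Lng: 34.5523422},
			},
		},
		{UserID: 2, UserName: "Jane Do", Weight: 22.22, Password: "secret"},
	}
	out, err := encodeUsers(users)
	if err != nil {
		panic(err)
	}
	fmt.Println(out)
}
```

The extra verbosity is limited to the two type declarations. In exchange, numbers come out as JSON numbers rather than strings, and a misspelled field name becomes a compile error instead of a missing key at runtime.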

Go json.Unmarshal field case

I'm new to Go. I was trying to fetch JSON data and unmarshal it into a struct. My sample data looks like this:
var reducedFieldData = []byte(`[
{"model":"Traverse","vin":"1gnkrhkd6ej111234"}
,{"model":"TL","vin":"19uua66265a041234"}
]`)
If I define the struct for receiving the data like this:
type Vehicle struct {
Model string
Vin string
}
The call to Unmarshal works as expected. However, if I use lower case for the fields ("model" and "vin") which actually matches cases for the field names in the data it will return empty strings for the values.
Is this expected behavior? Can the convention be turned off?
Fields need to be exported (declared with an uppercase first letter) or the reflection library cannot edit them. Since the JSON (un)marshaller uses reflection, it cannot read or write unexported fields.
So yes, it is expected, and no, you cannot change it. Sorry.
You can add tags to a field to change the name the marshaller uses:
Model string `json:"model"`
See the documentation for more info on the field tags "encoding/json" supports.
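Putting the pieces together, a minimal sketch with the question's sample data might look like this. The hidden type and the two decode helpers are made up for illustration; they only contrast exported, tagged fields with unexported ones.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Vehicle has exported fields; the tags name the lowercase JSON keys.
type Vehicle struct {
	Model string `json:"model"`
	Vin   string `json:"vin"`
}

// hidden (illustrative) has unexported fields, which encoding/json
// skips without reporting an error.
type hidden struct {
	model string
	vin   string
}

var reducedFieldData = []byte(`[
 {"model":"Traverse","vin":"1gnkrhkd6ej111234"}
,{"model":"TL","vin":"19uua66265a041234"}
]`)

// decodeVehicles unmarshals the data into the tagged, exported struct.
func decodeVehicles(data []byte) ([]Vehicle, error) {
	var vehicles []Vehicle
	err := json.Unmarshal(data, &vehicles)
	return vehicles, err
}

// decodeHidden unmarshals the same data into the unexported-field struct.
func decodeHidden(data []byte) ([]hidden, error) {
	var hs []hidden
	err := json.Unmarshal(data, &hs)
	return hs, err
}

func main() {
	vehicles, err := decodeVehicles(reducedFieldData)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%+v\n", vehicles) // fields are populated

	hs, err := decodeHidden(reducedFieldData)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%+v\n", hs) // two elements, all fields empty strings
}
```

For decoding alone the tags are optional, since Unmarshal matches keys to exported field names case-insensitively. They matter when marshaling, where untagged fields would be written as "Model" and "Vin".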

Parsing large JSON file with Scala and JSON4S

I'm working with Scala in IntelliJ IDEA 15 and trying to parse a large twitter record json file and count the total number of hashtags. I am very new to Scala and the idea of functional programming. Each line in the json file is a json object (representing a tweet). Each line in the file starts like so:
{"in_reply_to_status_id":null,"text":"To my followers sorry..
{"in_reply_to_status_id":null,"text":"#victory","in_reply_to_screen_name"..
{"in_reply_to_status_id":null,"text":"I'm so full I can't move"..
I am most interested in a property called "entities" which contains a property called "hashtags" with a list of hashtags. Here is an example:
"entities":{"hashtags":[{"text":"thewayiseeit","indices":[0,13]}],"user_mentions":[],"urls":[]},
I've browsed the various scala frameworks for parsing json and have decided to use json4s. I have the following code in my Scala script.
import org.json4s.native.JsonMethods._
var json: String = ""
for (line <- io.Source.fromFile("twitter38.json").getLines) json += line
val data = parse(json)
My logic here is that I am trying to read each line from twitter38.json into a string and then parse the entire string with parse(). The parse function is throwing an error claiming:
"Type mismatch, expected: Nothing, found:String."
I have seen examples that use parse() on strings that hold json objects such as
val jsontest =
"""{
|"name" : "bob",
|"age" : "50",
|"gender" : "male"
|}
""".stripMargin
val data = parse(jsontest)
but I have received the same error. I am coming from an object oriented programming background, is there something fundamentally wrong with the way I am approaching this problem?
You have most likely incorrectly imported dependencies to your Intellij project or modules into your file. Make sure you have the following lines imported:
import org.json4s.native.JsonMethods._
Even if you correctly import this module, parse(json: String) will not work for you, because your JSON is not well formed. Your JSON string will look like this:
"""{"in_reply_...":"someValue1"}{"in_reply_...":"someValues2"}"""
but should look as follows to be a valid json that can be parsed:
"""{{"in_reply_...":"someValue1"},{"in_reply_...":"someValues2"}}"""
i.e. you need starting and ending brackets for the json, and a comma between each line of tweets. Please read the json4s documentation for more information.
Although it is almost 6 years old, I think this question deserves another try.
The JSON format is subject to a few misunderstandings, especially about how documents are stored and how they are read back.
JSON documents are stored either as a single object containing all the other fields, or as an array of multiple objects, possibly in the same format. This second part is important because arrays in almost every programming language are written with square brackets and values separated by commas (note that here I used a person object as my single value):
[
{"name":"John","surname":"Doe"},
{"name":"Jane","surname":"Doe"}
]
Also note that everything except brackets, numbers, booleans and null is enclosed in quotes when written to a file.
However, there is another use that is not official but is preferred for transferring datasets easily, where every object (or document, in NoSQL/Mongo terms) is stored on its own line, like this:
{"name":"John","surname":"Doe"}
{"name":"Jane","surname":"Doe"}
So for the question, the OP has a document written in this second form, but is using an algorithm written to read the first form. The following code makes a few simple changes to handle this; the user must read the file knowing which form it is in:
var json: String = "["
for (line <- io.Source.fromFile("twitter38.json").getLines) json += line + ","
json=json.splitAt(json.length()-1)._1
json+= "]"
val data = parse(json)
PS: although @sbrannon has the correct idea, the example he/she gave mistakenly uses curly braces instead of square brackets to surround the data.
EDIT: I have added json=json.splitAt(json.length()-1)._1 because the code above ends with a trailing comma which will cause parse error per the JSON format definition.