Using JSON in Haskell to Serialize a Record - json

I've been working on a very small program to grab details about Half Life 2 servers (using the protocol-srcds library). The workflow is pretty straightforward; it takes a list of servers from a file, queries each of them, and writes the output out to another file (which is read by a PHP script for display, as I'm tied to vBulletin). Would be nice if it was done in SQL or something, but seeing as I'm still just learning, that's a step too far for now!
Anyway, my question relates to serialization, namely, serializing to JSON. For now, I've written a scrappy helper function jsonify, such that:
jsonify (Just (SRCDS.GameServerInfo serverVersion
serverName
serverMap
serverMod
serverModDesc
serverAppId
serverPlayers
serverMaxPlayers
serverBots
serverType
serverOS
serverPassword
serverSecure
serverGameVersioning)) =
toJSObject [ ("serverName", serverName)
, ("serverMap", serverMap)
, ("serverPlayers", show serverPlayers)
, ("serverMaxPlayers", show serverMaxPlayers) ]
(I'm using the Text.JSON package). This is obviously not ideal. At this stage, however, I don't understand using instances to define serializers for records, and my attempts to do so met a wall of frustration in the type system.
Could someone please walk me through the "correct" way of doing this? How would I go about defining an instance that serializes the record? What functions should I use in the instance (showJSON?).
Thanks in advance for any help.

You might want to consider using Data.Aeson instead which might be regarded as the successor to Text.JSON.
With aeson you define separate instances for serialize/deserializing (with Text.JSON you have to define both directions even if you need only one, otherwise the compile will annoy you -- unless you silence the warning somehow), and it provides a few operators making defining instances a bit more compact, e.g. the example from #hammar's answer can be written a little bit less noisy as shown below with the aeson API:
instance ToJSON SRCDS.GameServerInfo where
toJSON (SRCDS.GameServerInfo {..}) = object
[ "serverName" .= serverName
, "serverMap" .= serverMap
, "serverPlayers" .= serverPlayers
, "serverMaxPlayers" .= serverMaxPlayers
]

One simple thing you can do is use record wildcards to cut down on the pattern code.
As for your type system problems, it's hard to give help without seeing error messages and what you've tried so far, however I suspect one thing that might be confusing is that the result of toJSObject will have to be wrapped in a JSObject data constructor as the return type of showJSON is supposed to be a JSValue. Similarly, the values of your object should also be of type JSValue. The easiest way to do this is to use their JSON instance and call showJSON to convert the values.
instance JSON SRCDS.GameServerInfo where
showJSON (SRCDS.GameServerInfo {..}) =
JSObject $ toJSObject [ ("serverName", showJSON serverName)
, ("serverMap", showJSON serverMap)
, ("serverPlayers", showJSON serverPlayers)
, ("serverMaxPlayers", showJSON serverMaxPlayers) ]

Related

Deserialize JSON without knowing full structure

I'm redoing the backend of a very basic framework that connects to a completely customizable frontend. It was originally in PHP but for the refactor have been plodding away in F#. Although it seems like PHP might be the more suited language. But people keep telling me you can do everything in F# and I like the syntax and need to learn and this seemingly simple project has me stumped when it comes to JSON. This is a further fleshed out version of my question yesterday, but it got alot more complex than I thought.
Here goes.
The frontend is basically a collection of HTML files, which are simply loaded in PHP and preg_replace() is used to replace things like [var: varName] or [var: array|key] or the troublesome one: [lang: hello]. That needs to be replaced by a variable defined in a translation dictionary, which is stored as JSON which is also editable by a non-programmer.
I can't change the frontend or the JSON files, and both are designed to be edited by non-programmers so it is very likely that there will be errors, calls to language variables that don't exist etc.
So we might have 2 json files, english.json and french.json
english.json contains:
{
"hello":"Hello",
"bye":"Goodbye"
}
french.json:
{
"hello": "Bonjour",
"duck": "Canard"
//Plus users can add whatever else they want here and expect to be able to use it in a template
}
There is a template that contains
<b>[lang: hello]</b>
<span>Favourite Animal: [lang:duck]</span>
In this case, if the language is set to "english" and english.json is being loaded, that should read:
<b>Hello</b>
<span>Favourite Animal: </span>
Or in French:
<b>Bonjour</b>
<span>Favourite Animal: Canard</span>
We can assume that the json format key: value is always string:string but ideally I'd like to handle string: 'T as well but that might be beyond the scope of this question.
So I need to convert a JSON file (called by dynamic name, which gave F# Data a bit of an issue I couldn't solve last night as it only allowed a static filename as a sample, and since these two files have potential to be different from sample and provided, the type provider doesn't work) to a dictionary or some other collection.
Now inside the template parsing function I need to replace [lang: hello] with something like
let key = "duck"
(*Magic function to convert JSON to usable collection*)
let languageString = convertedJSONCollection.[key] (*And obviously check if containsKey first*)
Which means I need to call the key dynamically, and I couldn't figure out how to do that with the type that FSharp.Data provided.
I have played around with some Thoth as well to some promising results that ended up going nowhere. I avoided JSON.NET because I thought it was paid, but just realised I am mistaken there so might be an avenue to explore
For comparison, the PHP function looks something like this:
function loadLanguage($lang='english){
$json = file_get_contents("$lang.json");
return json_decode($json, true);
}
$key = 'duck';
$langVars = loadLanguage();
$duck = $langVars[$key] || "";
Is there a clean way to do this in F#/.NET? JSON seems really painful to work with in comparison to PHP/Javascript and I'm starting to lose my mind. Am I going to have to write my own parser (which means probably going back to PHP)?
Cheers to all you F# geniuses who know the answer :p
open Thoth.Json.Net
let deserialiseDictionary (s: string) =
s
|> Decode.unsafeFromString (Decode.keyValuePairs Decode.string)
|> Map.ofList
let printDictionary json =
json
|> deserialiseDictionary
|> fun m -> printfn "%s" m.["hello"] // Hello
For the question about 'T the question becomes, what can 'T be? For json it very limited, it can be a number of things, string, json-object, number, bool or json array. What should happen if it is bool or a number?

Scala function to Json

Can I map Scala functions to JSON; or perhaps via a different way than JSON?
I know I can map data types, which is fine. But I'd like to create a function, map it to JSON send it via a REST method to another server, then add that function to a list of functions in another application and apply it.
For instance:
def apply(f: Int => String, v: Int) = f(v)
I want to make a list of functions that can be applied within an application, over different physical locations. Now I want to add and remove functions to the list. By means of REST calls.
Let's assume I understand security problems...
ps.. If you downvote, you might as well have the decency to explain why
If I understand correctly, you want to be able to send Scala code to be executed on different physical machines. I can think of a few different ways of achieving that
Using tools for distributed computing e.g. Spark. You can set up Spark clusters on different machines and then select to which cluster you want to submit Spark jobs. There are a lot of other tools for distributed computing that might also be worth looking into.
Pass scala code as a string and compile it either within your server side code (here's an example) or by invoking scalac as an external process.
Send the functions as byte code and execute the byte code on the remote machine.
If it fits with what you want to do, I'd recommend option #1.
Make sure that you can really trust the code that you want to execute, to not expose yourself to malicious code.
The answer is you can't do this, and even if you could you shouldn't!
You should never, never, never write a REST API that allows the client to execute arbitrary code in your application.
What you can do is create a number of named operations that can be executed. The client can then pass the name of the operation which the server can look up in a Map[String, <function>] and execute the result.
As mentioned in my comment, here is an example of how to turn a case class into JSON. Things to note: don't question the implicit val format line (it's magic); each case class requires a companion object in order to work; if you have Optional fields in your case class and define them as None when turning it into JSON, those fields will be ignored (if you define them as Some(whatever), they will look like any other field). If you don't know much about Scala Play, ignore the extra stuff for now - this is just inside the default Controller you're given when you make a new Project in IntelliJ.
package controllers
import javax.inject._
import play.api.libs.json.{Json, OFormat}
import play.api.mvc._
import scala.concurrent.Future
#Singleton
class HomeController #Inject()(cc: ControllerComponents) extends AbstractController(cc) {
case class Attributes(heightInCM: Int, weightInKG: Int, eyeColour: String)
object Attributes {
implicit val format: OFormat[Attributes] = Json.format[Attributes]
}
case class Person(name: String, age: Int, attributes: Attributes)
object Person {
implicit val format: OFormat[Person] = Json.format[Person]
}
def index: Action[AnyContent] = Action.async {
val newPerson = Person("James", 24, Attributes(192, 83, "green"))
Future.successful(Ok(Json.toJson(newPerson)))
}
}
When you run this app with sbt run and hit localhost:9000 through a browser, the output you see on-screen is below (formatted for better reading). This is also an example of how you might send JSON as a response to a GET request. It isn't the cleanest example but it works.
{
"name":"James",
"age":24,
"attributes":
{
"heightInCM":187,
"weightInKG":83,
"eyeColour":"green"
}
}
Once more though, I would never recommend passing actual functions between services. If you insist though, maybe store them as a String in a Case Class and turn it into JSON like this. Even if you are okay with passing functions around, it might even be a good exercise to practice security by validating the functions you receive to make sure they're not malicious.
I also have no idea how you'll convert them back into a function either - maybe write the String you receive to a *.scala file and try to run them with a Bash script? Idk. I'll let you figure that one out.

How does Ruby JSON.parse differ to OJ.load in terms of allocating memory/Object IDs

This is my first question and I have tried my best to find an answer - I have looked everywhere for an answer but haven't managed to find anything concrete to answer this in both the oj docs and ruby json docs and here.
Oj is a gem that serves to improve serialization/deserialization speeds and can be found at: https://github.com/ohler55/oj
I noticed this difference when I tried to dump and parse a hash with a NaN contained in it, twice, and compared the two, i.e.
# Create json Dump
dump = JSON.dump ({x: Float::NAN})
# Create first JSON load
json_load = JSON.parse(dump, allow_nan: true)
# Create second JSON load
json_load_2 = JSON.parse(dump, allow_nan: true)
# Create first OJ load
oj_load = Oj.load(dump, :mode => :compat)
# Create second OJload
oj_load_2 = Oj.load(dump, :mode => :compat)
json_load == json_load_2 # Returns true
oj_load == oj_load_2 # Returns false
I always thought NaN could not be compared to NaN so this confused me for a while until I realised that json_load and json_load_2 have the same object ID and oj_load and oj_load_2 do not.
Can anyone point me in the direction of where this memory allocation/object ID allocation occurs or how I can control that behaviour with OJ?
Thanks and sorry if this answer is floating somewhere on the internet where I could not find it.
Additional info:
I am running Ruby 1.9.3.
Here's the output from my tests re object IDs:
puts Float::NAN.object_id; puts JSON.parse(%q({"x":NaN}), allow_nan: true)["x"].object_id; puts JSON.parse(%q({"x":NaN}), allow_nan: true)["x"].object_id
70129392082680
70129387898880
70129387898880
puts Float::NAN.object_id; puts Oj.load(%q({"x":NaN}), allow_nan: true)["x"].object_id; puts Oj.load(%q({"x":NaN}), allow_nan: true)["x"].object_id
70255410134280
70255410063100
70255410062620
Perhaps I am doing something wrong?
I believe that is a deep implementation detail. Oj does this:
if (ni->nan) {
rnum = rb_float_new(0.0/0.0);
}
I can't find a Ruby equivalent for that, Float.new doesn't appear to exist, but it does create a new Float object every time (from an actual C's NaN it constructs on-site), hence different object_ids.
Whereas Ruby's JSON module uses (also in C) its own JSON::NaN Float object everywhere:
CNaN = rb_const_get(mJSON, rb_intern("NaN"));
That explains why you get different NaNs' object_ids with Oj and same with Ruby's JSON.
No matter what object_ids the resulting hashes have, the problem is with NaNs. If they have the same object_ids, the enclosing hashes are considered equal. If not, they are not.
According to the docs, Hash#== uses Object#== for values that only outputs true if and only if the argument is the same object (same object_id). This contradicts NaN's property of not being equal to itself.
Spectacular. Inheritance gone haywire.
One could, probably, modify Oj's C code (and even make a pull request with it) to use a constant like Ruby's JSON module does. It's a subtle change, but it's in the spirit of being compat, I guess.

Dealing with JSON in Scala?

In Scala 2.11, having the below code:
import play.api.libs.json._
...
val data = // read json from file (3)
val JSON: JsValue = Json.parse(data mkString "\n") (4)
val items = JSON \ "items"
for (i <- 0 until 100) yield items(i)
if I unite the last two lines for (i <- 0 until 100) yield (JSON \ "items")(i), will the term JSON \ "items" be evaluated for each i or only once?
is it worth to parallelise the list construction with this
for-expression (I don't care about the order in which items will
appear in the list), where items is an array of JSON objects?
what is the best way to handle exceptions from parsing the JSON in the lines (3 - 4) and validate it?
If you use the expression JSON \ "items" 100 times instead of 1, there'll be 100 times the work to find those nodes - there's no majick memoization or anything like that going on. Your cost is O(n) relative to the number of times you execute it - not O(1). But in any case, for this application the difference is inconsequential - assuming there's no outer loop you're not showing us.
This is way too small beans for parallelization to make any sense - in fact, the overhead might slow things down. If your real case was yield expensiveComputationBasedOn(items(i)), then maybe.
For lines 3-4, yes, use a Try here if you need to handle it here, else farther up (in the method that called the method that called this). In general, catch exceptions at the highest level where you can still provide sufficient information about what went wrong in a log message, where you can do any failure recovery, and where you'll be able to debug. This saves work and makes sure that you catch everything - even what you don't think of. If that's in your "main", fine. An Option will not catch exceptions. Caution: if this is for a class, your teacher may be looking for local error handling, regardless.

Is it okay to rely on automatic pass-by-reference to mutate objects?

I'm working in Python here (which is actually pass-by-name, I think), but the idea is language-agnostic as long as method parameters behave similarly:
If I have a function like this:
def changefoo(source, destination):
destination["foo"] = source
return destination
and call it like so,
some_dict = {"foo": "bar"}
some_var = "a"
new_dict = changefoo(some_var, some_dict)
new_dict will be a modified version of some_dict, but some_dict will also be modified.
Assuming the mutable structure like the dict in my example will almost always be similarly small, and performance is not an issue (in application, I'm taking abstract objects and changing into SOAP requests for different services, where the SOAP request will take an order of magnitude longer than reformatting the data for each service), is this okay?
The destination in these functions (there are several, it's not just a utility function like in my example) will always be mutable, but I like to be explicit: the return value of a function represents the outcome of a deterministic computation on the parameters you passed in. I don't like using out parameters but there's not really a way around this in Python when passing mutable structures to a function. A couple options I've mulled over:
Copying the parameters that will be mutated, to preserve the original
I'd have to copy the parameters in every function where I mutate them, which seems cumbersome and like I'm just duplicating a lot. Plus I don't think I'll ever actually need the original, it just seems messy to return a reference to the mutated object I already had.
Just use it as an in/out parameter
I don't like this, it's not immediately obvious what the function is doing, and I think it's ugly.
Create a decorator which will automatically copy the parameters
Seems like overkill
So is what I'm doing okay? I feel like I'm hiding something, and a future programmer might think the original object is preserved based on the way I'm calling the functions (grabbing its result rather than relying on the fact that it mutates the original). But I also feel like any of the alternatives will be messy. Is there a more preferred way? Note that it's not really an option to add a mutator-style method to the class representing the abstract data due to the way the software works (I would have to add a method to translate that data structure into the corresponding SOAP structure for every service we send that data off too--currently the translation logic is in a separate package for each service)
If you have a lot of functions like this, I think your best bet is to write a little class that wraps the dict and modifies it in-place:
class DictMunger(object):
def __init__(self, original_dict):
self.original_dict = original_dict
def changefoo(source)
self.original_dict['foo'] = source
some_dict = {"foo": "bar"}
some_var = "a"
munger = DictMunger(some_dict)
munger.changefoo(some_var)
# ...
new_dict = munger.original_dict
Objects modifying themselves is generally expected and reads well.