Translating JSON values using io.circe - json

I have a function in Scala that translates a value and produces a string:
strOut = translate(strIn)
Suppose the following JSON object:
{
  "id": "c730433b-082c-4984-3d56-855c243265f0",
  "standard": "stda",
  "timestamp": "tsx000",
  "stdparms": {
    "stdparam1": "a",
    "stdparam2": "b"
  }
}
and the following mapping provided by the translation function:
"stda" -> "stdb"
"tsx000" -> "tsy000"
"a" -> "f"
"b" -> "g"
What is the best way to translate the whole JSON object using the translate function? My goal is to obtain the following result:
{
  "id": "c730433b-082c-4984-3d56-855c243265f0",
  "standard": "stdb",
  "timestamp": "tsy000",
  "stdparms": {
    "stdparam1": "f",
    "stdparam2": "g"
  }
}
I must use the io.circe library due to project-related reasons.

If you know beforehand which fields you want to translate, or which translations apply to a given field, you can use Cursors to traverse the JSON tree. If the fields themselves are fixed (you always know which fields to expect), Optics may require less code.
When you get to the right leaf, you apply the translation.
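For the fields in your example, a minimal cursor-based sketch could look like this (the translate stub below just hard-codes your mapping for illustration; in your project it would be the existing function):

import io.circe.Json

// Stand-in for the existing translate function from the question.
def translate(s: String): String =
  Map("stda" -> "stdb", "tsx000" -> "tsy000", "a" -> "f", "b" -> "g").getOrElse(s, s)

// Walk to each known field with a cursor, rewrite its string value,
// then rebuild the whole document with .top.
def translateKnownFields(doc: Json): Option[Json] =
  doc.hcursor
    .downField("standard").withFocus(_.mapString(translate)).up
    .downField("timestamp").withFocus(_.mapString(translate)).up
    .downField("stdparms")
    .downField("stdparam1").withFocus(_.mapString(translate)).up
    .downField("stdparam2").withFocus(_.mapString(translate))
    .top

With the separate circe-optics module the same idea is shorter, assuming the field names really are fixed:

import io.circe.optics.JsonPath.root

val translateKnown: Json => Json =
  root.standard.string.modify(translate) andThen
    root.timestamp.string.modify(translate) andThen
    root.stdparms.stdparam1.string.modify(translate) andThen
    root.stdparms.stdparam2.string.modify(translate)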
However, when you don't know in advance which translations apply where, it might be easier to do a find-and-replace using plain string methods.
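If you would rather stay inside the JSON model than manipulate the raw text, one alternative (just a sketch, not the only option) is to fold over the parsed document and run translate on every string leaf. This relies on translate returning unknown values unchanged, which your example suggests it does (the "id" value stays the same):

import io.circe.Json
import io.circe.parser.parse

// Stand-in for the real translate function.
def translate(s: String): String =
  Map("stda" -> "stdb", "tsx000" -> "tsy000", "a" -> "f", "b" -> "g").getOrElse(s, s)

// Rebuild the document, translating every string value it contains.
def translateAll(json: Json): Json =
  json.fold(
    Json.Null,                                               // null
    Json.fromBoolean,                                        // booleans unchanged
    Json.fromJsonNumber,                                     // numbers unchanged
    s => Json.fromString(translate(s)),                      // strings: apply the translation
    arr => Json.fromValues(arr.map(translateAll)),           // arrays: recurse into elements
    obj => Json.fromJsonObject(obj.mapValues(translateAll))  // objects: recurse into field values
  )

// rawJson is just a placeholder name for the incoming document as a String.
def process(rawJson: String): Either[io.circe.ParsingFailure, String] =
  parse(rawJson).map(doc => translateAll(doc).spaces2)

Note that this rewrites every string value, including "id", so it only works if translate leaves values it does not recognise untouched.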
Note that the JSON you provided as an example is not valid JSON by the way.

Related

Validating against dynamic data - JSON Schema - Ajv

I'm trying to create a JSON Schema for something very dynamic. Say I have two pieces of data, and I want one (the source) to determine the validity of the other (the target). Both can change over time, but both will always be an array of objects with known properties. For example:
source.json
[
{ "id": 23, "active": true },
{ "id": 9, "active": false },
{ "id": 6, "active": true }
]
target.json
[
{ "identifier": 6 }
]
The schema I'm trying to create is this: For each active object in the source array, there should be an equivalent object in the target array. A little more formally, given an object in the source array where "active" equals true and "id" equals x, there should be an object in the target array where "identifier" equals x.
In the example above, the target would be invalid because it's missing an object like { "identifier": 23 }.
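(Restating the rule as plain code, just to make the intended check unambiguous; Scala is used here purely for illustration, since the question is about expressing this as a static JSON Schema for Ajv:)

// Hypothetical record types mirroring the two JSON files above.
case class SourceItem(id: Int, active: Boolean)
case class TargetItem(identifier: Int)

// The rule: every active source id must appear as an identifier in the target.
def targetIsValid(source: Seq[SourceItem], target: Seq[TargetItem]): Boolean =
  source.filter(_.active).forall(s => target.exists(_.identifier == s.id))

val source = Seq(SourceItem(23, active = true), SourceItem(9, active = false), SourceItem(6, active = true))
val target = Seq(TargetItem(6))
// targetIsValid(source, target) == false, because { "identifier": 23 } is missing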
However, I want to statically define this schema (or something capable of generating it) in a JSON file ahead of time, and this feels pretty tough since the source array can change. I'm using Ajv, and I'm aware that it supports the $data reference, but I'm not sure that's enough to help me here. The other option I could see is creating some kind of schema-generator definition? In concept, it too would be a JSON object I define ahead of time, but at runtime it would be used to safely generate arbitrary schemas based on runtime data such as the source array. However, if a mechanism like this doesn't already exist, trying to implement it myself sounds like a great way to give myself a code-injection vulnerability.
Thanks for your time!

AWS Glue Crawler - DynamoDB Export - Get attribute names in schema instead of struct

I've defined a default crawler on the data directory of an export from DynamoDB. I'm trying to get it to give me a structured table instead of a table with a single column of type struct. What do I have to do to get the actual column names in there? I've tried adding custom classifiers and different path expressions but nothing seems to work, and I feel like I'm missing something really obvious.
I'm using the crawler builder inside of glue, which doesn't seem to offer much customization.
Here's the schema from the table generated by the default crawler:
And here's one of the items that I've exported from dynamo:
{
  "Item": {
    "the_url": {
      "S": "/2021/07/06/****redacted****.html"
    },
    "as_of_when": {
      "S": "2021-09-01"
    },
    "user_hashes": {
      "SS": [
        "****redacted*****"
      ]
    },
    "user_id_hashes": {
      "SS": [
        "u3MeXDcpQm0ACYuUv6TMrg=="
      ]
    },
    "accumulated_count": {
      "N": "1"
    },
    "today_count": {
      "N": "1"
    }
  }
}
The way Athena interprets JSON data means that your data has only a single column, Item. Athena doesn't have any mechanism to map arbitrary parts of a JSON object to columns; it can only map top-level attributes to columns.
If you want other parts of the objects as columns you will either have to create a new table with transformed data, or create a view with the attributes as columns, e.g.
CREATE OR REPLACE VIEW attributes_as_top_level_columns AS
SELECT
  item.the_url.S AS the_url,
  CAST(item.as_of_when.S AS DATE) AS as_of_when,
  item.user_hashes.SS AS user_hashes,
  item.user_id_hashes.SS AS user_id_hashes,
  item.accumulated_count.N AS accumulated_count,
  item.today_count.N AS today_count
FROM items
In the example above I've also flattened the data type keys (S, SS, N) and converted the date string to a date.

Kotlinx.Serializer - Create a quick JSON to send

I've been playing with kotlinx.serialization. I've been trying to find a quick way to use it to create a plain, simple JSON (mostly to send it away), with minimum code clutter.
For a simple string such as:
{"Album": "Foxtrot", "Year": 1972}
What I've been doing is something like:
val str:String = Json.stringify(mapOf(
"Album" to JsonPrimitive("Foxtrot"),
"Year" to JsonPrimitive(1972)))
Which is far from nice. My elements are mostly primitive, so I wish I had something like:
val str:String = Json.stringify(mapOf(
"Album" to "Sergeant Pepper",
"Year" to 1967))
Furthermore, I'd be glad to have a solution with a nested JSON. Something like:
Json.stringify(JsonObject("Movies", JsonArray(
JsonObject("Name" to "Johnny English 3", "Rate" to 8),
JsonObject("Name" to "Grease", "Rate" to 1))))
That would produce:
{
  "Movies": [
    {
      "Name": "Johnny English 3",
      "Rate": 8
    },
    {
      "Name": "Grease",
      "Rate": 1
    }
  ]
}
(not necessarily prettified, even better not)
Is there anything like that?
Note: It's important to use a serialiser, and not a direct string such as
"""{"Name":$name, "Val": $year}"""
because it's unsafe to concat strings. Any illegal char might disintegrate the JSON! I don't want to deal with escaping illegal chars :-(
Thanks
Does this set of extension methods give you what you want?
@ImplicitReflectionSerializer
fun Map<*, *>.toJson() = Json.stringify(toJsonObject())

@ImplicitReflectionSerializer
fun Map<*, *>.toJsonObject(): JsonObject = JsonObject(map {
    it.key.toString() to it.value.toJsonElement()
}.toMap())

@ImplicitReflectionSerializer
fun Any?.toJsonElement(): JsonElement = when (this) {
    null -> JsonNull
    is Number -> JsonPrimitive(this)
    is String -> JsonPrimitive(this)
    is Boolean -> JsonPrimitive(this)
    is Map<*, *> -> this.toJsonObject()
    is Iterable<*> -> JsonArray(this.map { it.toJsonElement() })
    is Array<*> -> JsonArray(this.map { it.toJsonElement() })
    else -> JsonPrimitive(this.toString()) // Or throw some "unsupported" exception?
}
This allows you to pass in a Map with various types of keys/values in it, and get back a JSON representation of it. In the map, each value can be a primitive (string, number or boolean), null, another map (representing a child node in the JSON), or an array or collection of any of the above.
You can call it as follows:
val json = mapOf(
    "Album" to "Sergeant Pepper",
    "Year" to 1967,
    "TestNullValue" to null,
    "Musicians" to mapOf(
        "John" to arrayOf("Guitar", "Vocals"),
        "Paul" to arrayOf("Bass", "Guitar", "Vocals"),
        "George" to arrayOf("Guitar", "Sitar", "Vocals"),
        "Ringo" to arrayOf("Drums")
    )
).toJson()
This returns the following JSON, not prettified, as you wanted:
{"Album":"Sergeant Pepper","Year":1967,"TestNullValue":null,"Musicians":{"John":["Guitar","Vocals"],"Paul":["Bass","Guitar","Vocals"],"George":["Guitar","Sitar","Vocals"],"Ringo":["Drums"]}}
You probably also want to add handling for some other types, e.g. dates.
But can I just check that you want to manually build up JSON in code this way rather than creating data classes for all your JSON structures and serializing them that way? I think that is generally the more standard way of handling this kind of stuff. Though maybe your use case does not allow that.
It's also worth noting that the code has to use the ImplicitReflectionSerializer annotation, as it's using reflection to figure out which serializer to use for each bit. This is still experimental functionality which might change in future.

How can I find the row & column in a JSON String identified by a JsonPath?

Suppose I have a JSON string like this example:
[
  {
    "name": "John"
  },
  {
    "name": "Jane"
  }
]
Using JSONPath I want to select the second name like this:
[1].name
However, I am not interested in the name's value but rather in the row & column of the entire second name key-value pair as they appear in the String. In this example, the expected result when written as a JSON would be:
{
"row": 5,
"column": 4
}
Alternatively, the index within the String would be fine as well. Note that I don't require the result to be a JSON. Note also that this should work with any formatting of the JSON.
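For what it's worth, going from a plain character index to the row/column format above is simple line counting; here is a small sketch of just that conversion (Scala, purely for illustration; finding the index of the JSONPath match in the first place is the actual problem):

// Convert a 0-based character index in the raw text into 0-based row and column numbers.
def rowAndColumn(text: String, index: Int): (Int, Int) = {
  val before = text.substring(0, index)
  val row    = before.count(_ == '\n')
  val column = index - (before.lastIndexOf('\n') + 1)
  (row, column)
}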
I have thought about constructing a regular expression but this might be fragile if the key's name "name" also appears as text in values.
I am using Jayway JSONPath in my Java backend, but a solution with a different JSONPath library, or one in JavaScript using a Node.js-compatible library, would also be fine.

Reusing type definitions with JSONProvider?

I'm using the JSONProvider from FSharp-Data to automatically create types for a webservice that I'm consuming using sample responses from the service.
However, I'm a bit confused when it comes to types that are reused in the service: for example, one API method returns a single item of type X while another returns a list of X, and so on. Do I really have to generate multiple definitions for this, and won't that mean I end up with duplicate types for the same thing?
So, I guess what I'm really asking, is there a way to create composite types from types generated from JSON samples?
If you call JsonProvider separately with separate samples, then you will get duplicate types for the same things in the sample. Sadly, there is not much that the F# Data library can do about this.
One option that you have would be to pass multiple samples to the JsonProvider at the same time (using the SampleIsList parameter). In that case, it tries to find one type for all the samples you provide - but it will also share types with the same structure among all the samples.
I assume you do not want to get one type for all your samples - in that case, you can wrap the individual samples with additional JSON object like this (here, the real samples are the records nested under "one" and "two"):
type J = JsonProvider<"""
[ { "one": { "person": {"name": "Tomas"} } },
{ "two": { "num": 42, "other": {"name": "Tomas"} } } ]""", SampleIsList=true>
Now, you can run the Parse method and wrap the samples in a new JSON object using "one" or "two", depending on which sample you are processing:
let j1 = """{ "person": {"name": "Tomas"} }"""
let o1 = J.Parse("""{"one":""" + j1 + "}").One.Value
let j2 = """{ "num": 42, "other": {"name": "Tomas"} }"""
let o2 = J.Parse("""{"two":""" + j2 + "}").Two.Value
The "one" and "two" records are completely arbitrary (I just added them to have two separate names). We wrap the JSON before parsing it and then we access it using the One or Two property. However, it means that o1.Person and o2.Other are now of the same type:
o1.Person = o2.Other
This returns false because we do not implement equality on JSON values in F# Data, but it type checks - so the types are the same.
This is fairly complicated, so I would probably look for other ways of doing what you need - but it is one way to get shared types among multiple JSON samples.