I'm creating JsonObject from Map<String, String> with Gson:
val params = HashMap<String, String>()
params["confirmation"] = "send"
JsonParser().parse(Gson().toJson(params)) as JsonObject
It works fine when all entries are Strings (hence the Map<String, String>). However, I am unable to use this method to create mixed-value JSON, as in the following example:
{
"integer": 1,
"string": "text",
"boolean": false
}
Is there a way of accomplishing this without creating models and POJOs? I found some workarounds, but I'm hoping to see an elegant solution, maybe a Map with generic (or even wildcard) types.
I found an answer by trial and error. It was all about widening the value type parameter to Any (Object in Java). Here is how it works:
val params = HashMap<String, Any>()
params["integer"] = 123
params["string"] = "text"
params["boolean"] = false
JsonParser().parse(Gson().toJson(params)) as JsonObject
This code results in a mixed typed JsonObject, as follows:
{
"integer": 123,
"string": "text",
"boolean": false
}
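As a side note, the round trip through a JSON string is not strictly required. The following is a minimal sketch, assuming Gson is on the classpath: Gson.toJsonTree converts the map straight into a JSON tree, and JsonObject.addProperty builds the same object without a map at all. The helper names mapToJsonObject and buildDirectly are made up for illustration.

```kotlin
import com.google.gson.Gson
import com.google.gson.JsonObject

// Hypothetical helper: converts a mixed-value map to a JsonObject
// without serializing to a string and parsing it back.
fun mapToJsonObject(params: Map<String, Any>): JsonObject =
    Gson().toJsonTree(params).asJsonObject

// Hypothetical helper: builds the same object property by property.
fun buildDirectly(): JsonObject = JsonObject().apply {
    addProperty("integer", 123)
    addProperty("string", "text")
    addProperty("boolean", false)
}

fun main() {
    val params = HashMap<String, Any>()
    params["integer"] = 123
    params["string"] = "text"
    params["boolean"] = false

    println(mapToJsonObject(params))
    println(buildDirectly())
}
```

Whether this is preferable to the string round trip is mostly a matter of taste; it just avoids one serialization pass.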
Related
TL;DR
For a json string containing ...,field=,..., Gson keeps throwing JsonSyntaxException. What can I do?
The Case
I have to communicate with a third-party API, which tends to provide data like this:
{
"fieldA": "stringData",
"fieldB": "",
"fieldC": ""
}
However, in my app project, it turns out to read like this:
val jsonString = "{fieldA=stringData,fieldB=,fieldC=}"
The Problem
I tried using the standard method to deserialize it:
val jsonString = "{fieldA=stringData,fieldB=,fieldC=}"
val parseJson = Gson().fromJson(jsonString, JsonObject::class.java)
assertEquals(3, parseJson.size())
But it results in an exception:
com.google.gson.JsonSyntaxException: com.google.gson.stream.MalformedJsonException: Unexpected value at line 1 column 28 path $.fieldB
The Solutions That Don't Work
I have tried many solutions, and none of them work, including:
Setting up a custom data class with nullable fields:
data class DataExample(
val fieldA: String?,
val fieldB: String?,
val fieldC: String?,
)
val parseToObject = Gson().fromJson(jsonString, DataExample::class.java)
Using JsonElement instead:
data class DataExample(
val fieldA: JsonElement,
val fieldB: JsonElement,
val fieldC: JsonElement,
)
val parseToObject = Gson().fromJson(jsonString, DataExample::class.java)
Applying a Deserializer:
class EmptyToNullDeserializer<T>: JsonDeserializer<T> {
override fun deserialize(
json: JsonElement, typeOfT: Type, context: JsonDeserializationContext
): T? {
if (json.isJsonPrimitive) {
json.asJsonPrimitive.also {
if (it.isString && it.asString.isEmpty()) return null
}
}
return context.deserialize(json, typeOfT)
}
}
data class DataExample(
    @JsonAdapter(EmptyToNullDeserializer::class)
    val fieldA: String?,
    @JsonAdapter(EmptyToNullDeserializer::class)
    val fieldB: String?,
    @JsonAdapter(EmptyToNullDeserializer::class)
    val fieldC: String?,
)
val parseToObject = Gson().fromJson(jsonString, DataExample::class.java)
or using it in GsonBuilder:
val gson = GsonBuilder()
.registerTypeAdapter(DataExample::class.java, EmptyToNullDeserializer<String>())
.create()
val parseToObject = gson.fromJson(jsonString, DataExample::class.java)
What else can I do?
It is not valid JSON, so you need to parse it yourself. This string was probably produced by the Map::toString() method.
Here is the code to parse it into a Map<String, String>:
val jsonString = "{fieldA=stringData,fieldB=,fieldC=}"
val userFieldsMap = jsonString.removeSurrounding("{", "}").split(",") // split by ","
.mapNotNull { fieldString ->
val keyVal = fieldString.split("=")
// check if array contains exactly 2 items
if (keyVal.size == 2) {
keyVal[0].trim() to keyVal[1].trim() // return@mapNotNull
} else {
null // return@mapNotNull
}
}
.toMap()
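One limitation of the snippet above is that an entry whose value itself contains "=" is silently dropped, because it splits into more than two parts. A slightly more tolerant sketch, using only the Kotlin standard library, splits each entry on the first "=" only. The function name parseMapString is hypothetical, and values containing commas will still be split incorrectly, since Map.toString() output is ambiguous in that case.

```kotlin
// Hypothetical helper: parses text shaped like Map.toString() output,
// e.g. "{fieldA=stringData, fieldB=, fieldC=}".
fun parseMapString(text: String): Map<String, String> =
    text.trim()
        .removeSurrounding("{", "}")
        .split(",")
        .filter { it.isNotBlank() } // handles "{}" and stray separators
        .mapNotNull { entry ->
            // limit = 2 splits on the first '=' only, so values may contain '='.
            val parts = entry.split("=", limit = 2)
            if (parts.size == 2) parts[0].trim() to parts[1].trim() else null
        }
        .toMap()

fun main() {
    println(parseMapString("{fieldA=stringData,fieldB=,fieldC=}"))
}
```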
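It turns out that, as @frc129 and many others said, it is not valid JSON.
The truth, however, is that Gson accepts more than strict JSON allows: Gson.fromJson parses in lenient mode, which tolerates unquoted names and values and = in place of :. Only the empty values trip it up. Data like the following parses without complaint: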
val jsonString = "{fieldA=stringData,fieldB=s2,fieldC=s3}"
val parseJson = Gson().fromJson(jsonString, JsonObject::class.java)
// This does NOT throw an exception, even though jsonString is not actually valid JSON.
assertEquals(3, parseJson.size())
assertEquals("stringData", parseJson["fieldA"].asString)
assertEquals("s2", parseJson["fieldB"].asString)
assertEquals("s3", parseJson["fieldC"].asString)
Further investigation indicates that the string mentioned here and in the question is more like the output of Map.toString().
I had misunderstood how Gson deals with such strings. Its tolerance should be treated as an extra convenience, not as a supported format. In short, the string is not supposed to be parsed as JSON, and the data format should be fixed. I'll work with the server side and fix the underlying transformation.
I'm just leaving a note here. If someone in the future wants a quick fix for such a string, take a look at @frc129's answer; however, the ideal solution is to fix the data provider so that it provides correctly formatted JSON:
val jsonString = "{\"fieldA\":\"stringData\",\"fieldB\":\"\",\"fieldC\":\"\"}"
val parseJson = Gson().fromJson(jsonString, JsonObject::class.java)
assertEquals(3, parseJson.size())
assertEquals("stringData", parseJson["fieldA"].asString)
assertEquals("", parseJson["fieldB"].asString)
assertEquals("", parseJson["fieldC"].asString)
The problem I am trying to solve is perfectly described by the following text, taken from this link:
For a concrete example of when this could be useful, consider an API that supports partial updates of objects. Using this API, a JSON object would be used to communicate a patch for some long-lived object. Any included property specifies that the corresponding value of the object should be updated, while the values for any omitted properties should remain unchanged. If any of the object’s properties are nullable, then a value of null being sent for a property is fundamentally different than a property that is missing, so these cases must be distinguished.
That post presents a solution using the kotlinx.serialization library; however, I must use the Gson library for now.
So I am trying to implement my own solution, as I didn't find anything that suits my use case (please let me know if there is something).
data class MyObject(
val fieldOne: OptionalProperty<String> = OptionalProperty.NotPresent,
val fieldTwo: OptionalProperty<String?> = OptionalProperty.NotPresent,
val fieldThree: OptionalProperty<Int> = OptionalProperty.NotPresent
)
fun main() {
val gson = GsonBuilder()
.registerTypeHierarchyAdapter(OptionalProperty::class.java, OptionalPropertyDeserializer())
.create()
val json1 = """{
"fieldOne": "some string",
"fieldTwo": "another string",
"fieldThree": 18
}
"""
println("json1 result object: ${gson.fromJson(json1, MyObject::class.java)}")
val json2 = """{
"fieldOne": "some string",
"fieldThree": 18
}
"""
println("json2 result object: ${gson.fromJson(json2, MyObject::class.java)}")
val json3 = """{
"fieldOne": "some string",
"fieldTwo": null,
"fieldThree": 18
}
"""
println("json3 result object: ${gson.fromJson(json3, MyObject::class.java)}")
}
sealed class OptionalProperty<out T> {
object NotPresent : OptionalProperty<Nothing>()
data class Present<T>(val value: T) : OptionalProperty<T>()
}
class OptionalPropertyDeserializer : JsonDeserializer<OptionalProperty<*>> {
private val gson: Gson = Gson()
override fun deserialize(
json: JsonElement?,
typeOfT: Type?,
context: JsonDeserializationContext?
): OptionalProperty<*> {
println("Inside OptionalPropertyDeserializer.deserialize json:$json")
return when {
// Is it a JsonObject? Bingo!
json?.isJsonObject == true ||
json?.isJsonPrimitive == true-> {
// Let's try to extract the type in order
// to deserialize this object
val parameterizedType = typeOfT as ParameterizedType
// Returns an Present with the value deserialized
return OptionalProperty.Present(
context?.deserialize<Any>(
json,
parameterizedType.actualTypeArguments[0]
)!!
)
}
// Wow, is it an array of objects?
json?.isJsonArray == true -> {
// First, let's try to get the array type
val parameterizedType = typeOfT as ParameterizedType
// check if the array contains a generic type too,
// for example, List<Result<T, E>>
if (parameterizedType.actualTypeArguments[0] is WildcardType) {
// In case of yes, let's try to get the type from the
// wildcard type (*)
val internalListType = (parameterizedType.actualTypeArguments[0] as WildcardType).upperBounds[0] as ParameterizedType
// Deserialize the array with the base type Any
// It will give us an array full of linkedTreeMaps (the json)
val arr = context?.deserialize<Any>(json, parameterizedType.actualTypeArguments[0]) as ArrayList<*>
// Iterate the array and
// this time, try to deserialize each member with the discovered
// wildcard type and create new array with these values
val result = arr.map { linkedTreeMap ->
val jsonElement = gson.toJsonTree(linkedTreeMap as LinkedTreeMap<*, *>).asJsonObject
return@map context.deserialize<Any>(jsonElement, internalListType.actualTypeArguments[0])
}
// Return the result inside the Ok state
return OptionalProperty.Present(result)
} else {
// Fortunately it is a simple list, like Array<String>
// Just get the type as with a JsonObject and return an Ok
return OptionalProperty.Present(
context?.deserialize<Any>(
json,
parameterizedType.actualTypeArguments[0]
)!!
)
}
}
// It is not a JsonObject or JsonArray
// Let's returns the default state NotPresent.
else -> OptionalProperty.NotPresent
}
}
}
I got most of the code for the custom deserializer from here.
This is the output when I run the main function:
Inside OptionalPropertyDeserializer.deserialize json:"some string"
Inside OptionalPropertyDeserializer.deserialize json:"another string"
Inside OptionalPropertyDeserializer.deserialize json:18
json1 result object: MyObject(fieldOne=Present(value=some string), fieldTwo=Present(value=another string), fieldThree=Present(value=18))
Inside OptionalPropertyDeserializer.deserialize json:"some string"
Inside OptionalPropertyDeserializer.deserialize json:18
json2 result object: MyObject(fieldOne=Present(value=some string), fieldTwo=my.package.OptionalProperty$NotPresent@573fd745, fieldThree=Present(value=18))
Inside OptionalPropertyDeserializer.deserialize json:"some string"
Inside OptionalPropertyDeserializer.deserialize json:18
json3 result object: MyObject(fieldOne=Present(value=some string), fieldTwo=null, fieldThree=Present(value=18))
I am testing the different options for fieldTwo, and it is almost fully working. The exception is the third JSON, where I would expect fieldTwo=Present(value=null) instead of fieldTwo=null.
In this situation, the custom deserializer is not even called for fieldTwo.
Can anyone spot what I am missing here? Any tip would be much appreciated!
I ended up giving up on Gson and moving to Moshi.
I implemented this behavior based on the solution presented in this comment.
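For readers who have to stay on Gson: the behavior in the question is consistent with Gson wrapping a JsonDeserializer in a null-safe adapter, so a JSON null becomes a Kotlin null before the deserializer is ever consulted. A TypeAdapterFactory is not wrapped that way, so its adapter sees the null token. The following is a minimal sketch under that assumption, not a tested drop-in: OptionalPropertyAdapterFactory is a hypothetical name, it covers deserialization only, and it reuses the OptionalProperty and MyObject definitions from the question.

```kotlin
import com.google.gson.Gson
import com.google.gson.GsonBuilder
import com.google.gson.TypeAdapter
import com.google.gson.TypeAdapterFactory
import com.google.gson.reflect.TypeToken
import com.google.gson.stream.JsonReader
import com.google.gson.stream.JsonToken
import com.google.gson.stream.JsonWriter
import java.lang.reflect.ParameterizedType

sealed class OptionalProperty<out T> {
    object NotPresent : OptionalProperty<Nothing>()
    data class Present<T>(val value: T) : OptionalProperty<T>()
}

data class MyObject(
    val fieldOne: OptionalProperty<String> = OptionalProperty.NotPresent,
    val fieldTwo: OptionalProperty<String?> = OptionalProperty.NotPresent,
    val fieldThree: OptionalProperty<Int> = OptionalProperty.NotPresent
)

// Hypothetical factory: the adapter it returns is invoked for JSON null too,
// so null can be mapped to Present(null) instead of a plain Kotlin null.
class OptionalPropertyAdapterFactory : TypeAdapterFactory {
    @Suppress("UNCHECKED_CAST")
    override fun <T> create(gson: Gson, type: TypeToken<T>): TypeAdapter<T>? {
        if (type.rawType != OptionalProperty::class.java) return null
        val parameterized = type.type as? ParameterizedType ?: return null
        // Adapter for the wrapped type, e.g. String or Int.
        val valueAdapter = gson.getAdapter(TypeToken.get(parameterized.actualTypeArguments[0]))

        val adapter = object : TypeAdapter<OptionalProperty<*>>() {
            override fun read(reader: JsonReader): OptionalProperty<*> {
                if (reader.peek() == JsonToken.NULL) {
                    reader.nextNull()
                    return OptionalProperty.Present(null)
                }
                return OptionalProperty.Present(valueAdapter.read(reader))
            }

            override fun write(out: JsonWriter, value: OptionalProperty<*>?) {
                throw UnsupportedOperationException("This sketch covers deserialization only")
            }
        }
        return adapter as TypeAdapter<T>
    }
}

fun main() {
    val gson = GsonBuilder()
        .registerTypeAdapterFactory(OptionalPropertyAdapterFactory())
        .create()
    val json3 = """{"fieldOne": "some string", "fieldTwo": null, "fieldThree": 18}"""
    println(gson.fromJson(json3, MyObject::class.java))
}
```

Omitted properties keep the NotPresent default only because MyObject has defaults for every parameter, which gives Gson a no-argument constructor to call.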
I have a simple JSON document, but one of its fields contains a dynamic object. For instance, the JSON can look like
{
"fixedField1": "value1",
"dynamicField1": {
"f1": "abc",
"f2": 123
}
}
or
{
"fixedField1": "value2",
"dynamicField1": {
"g1": "abc",
"g2": { "h1": "valueh1"}
}
}
I am trying to serialize this object, but I am not sure how to map the dynamic field:
@Serializable
data class Response(
    @SerialName("fixedField1")
    val fixedField: String,
    @SerialName("dynamicField1")
    val dynamicField: Map<String, Any> // ???? what should be the type?
)
The above code fails with the following error:
Backend Internal error: Exception during code generation Cause:
Back-end (JVM) Internal error: Serializer for element of type Any has
not been found.
I ran into a similar problem when I had to serialize an arbitrary Map<String, Any?>.
The only way I have managed to do this so far is to use the JsonObject/JsonElement API combined with @ImplicitReflectionSerializer.
The major downside is the use of reflection, which only works properly on the JVM and is not a good solution for Kotlin Multiplatform.
@ImplicitReflectionSerializer
fun Map<*, *>.toJsonObject(): JsonObject = JsonObject(map {
it.key.toString() to it.value.toJsonElement()
}.toMap())
@ImplicitReflectionSerializer
fun Any?.toJsonElement(): JsonElement = when (this) {
null -> JsonNull
is Number -> JsonPrimitive(this)
is String -> JsonPrimitive(this)
is Boolean -> JsonPrimitive(this)
is Map<*, *> -> this.toJsonObject()
is Iterable<*> -> JsonArray(this.map { it.toJsonElement() })
is Array<*> -> JsonArray(this.map { it.toJsonElement() })
else -> {
//supporting classes that declare serializers
val jsonParser = Json(JsonConfiguration.Stable)
val serializer = jsonParser.context.getContextualOrDefault(this)
jsonParser.toJson(serializer, this)
}
}
Then, to serialize you would use:
val response = mapOf(
"fixedField1" to "value1",
"dynamicField1" to mapOf (
"f1" to "abc",
"f2" to 123
)
)
val serialized = Json.stringify(JsonObjectSerializer, response.toJsonObject())
Note
This reflection-based serialization is only necessary if you are constrained to use Map<String, Any?>.
If you are free to use your own DSL to build the responses, then you can use the json DSL directly, which is very similar to mapOf:
val response1 = json {
    "fixedField1" to "value1"
    "dynamicField1" to json {
        "f1" to "abc"
        "f2" to 123
    }
}
val serialized1 = Json.stringify(JsonObjectSerializer, response1)
val response2 = json {
"fixedField1" to "value2",
"dynamicField1" to json {
"g1" to "abc",
"g2" to json { "h1" to "valueh1"}
}
}
val serialized2 = Json.stringify(JsonObjectSerializer, response2)
If, however, you are constrained to define a data type and need both serialization and deserialization, you probably can't use the json DSL, so you'll have to define a @Serializer using the above methods.
An example of such a serializer, under Apache 2 license, is here: ArbitraryMapSerializer.kt
Then you can use it on classes that have arbitrary Maps. In your example it would be:
@Serializable
data class Response(
    @SerialName("fixedField1")
    val fixedField: String,
    @SerialName("dynamicField1")
    @Serializable(with = ArbitraryMapSerializer::class)
    val dynamicField: Map<String, Any>
)
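The API used in this answer (@ImplicitReflectionSerializer, Json.stringify, JsonConfiguration.Stable) predates kotlinx.serialization 1.0. As a rough sketch for newer versions, assuming kotlinx-serialization-json 1.x with the serialization compiler plugin applied, the dynamic field can simply be declared as a JsonObject, which needs no serializer for Any. The property navigation in main is illustrative only.

```kotlin
import kotlinx.serialization.SerialName
import kotlinx.serialization.Serializable
import kotlinx.serialization.decodeFromString
import kotlinx.serialization.encodeToString
import kotlinx.serialization.json.Json
import kotlinx.serialization.json.JsonObject
import kotlinx.serialization.json.jsonObject
import kotlinx.serialization.json.jsonPrimitive

@Serializable
data class Response(
    @SerialName("fixedField1") val fixedField: String,
    // JsonObject keeps whatever structure arrives in this field.
    @SerialName("dynamicField1") val dynamicField: JsonObject
)

fun main() {
    val text = """{"fixedField1":"value2","dynamicField1":{"g1":"abc","g2":{"h1":"valueh1"}}}"""
    val response = Json.decodeFromString<Response>(text)

    // Navigate the dynamic part through the JsonElement API.
    val h1 = response.dynamicField["g2"]?.jsonObject?.get("h1")?.jsonPrimitive?.content
    println(h1)

    // Encoding goes back through the same generated serializer.
    println(Json.encodeToString(response))
}
```

The trade-off is that callers work with JsonElement values rather than plain Kotlin types, so this fits best when the dynamic part is passed through or inspected only lightly.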
I have a use case where an API receives a generic collection of key/value pairs as JSON. There are no predefined attributes in the input JSON. I need to map it to a generic object and process the data.
JSON input:
[{ "PostalCode": "345", "Region": "MA", "Enabled": "True" },
{ "PostalCode": "989", "Country": "US", "Enabled": "True" }
]
I am using Gson to deserialize this into a Java object. When I map it to a generic object like this:
Object obj = new GsonBuilder().create()
.fromJson(jsonInput, Object.class);
I get an ArrayList of maps (com.google.gson.internal.LinkedTreeMap).
From here, how do I get the individual keys and values, such as key = PostalCode and value = 345?
Thanks in advance for your help!
Since you already have a list of com.google.gson.internal.LinkedTreeMap, you just need to iterate over it to get each key/value pair:
Object obj = new GsonBuilder().create().fromJson(jsonInput, Object.class);
List<LinkedTreeMap<Object, Object>> jsonMapList = (List<LinkedTreeMap<Object, Object>>) obj;
for (LinkedTreeMap<Object, Object> jsonMap : jsonMapList) {
Set<Entry<Object, Object>> entrySet = jsonMap.entrySet();
for (Entry<Object, Object> entry : entrySet) {
System.out.println(entry.getKey() + " " + entry.getValue());
}
}
You can do it this way (I have tested it). Add these imports:
import com.google.gson.GsonBuilder;
import com.google.gson.reflect.TypeToken;
import java.lang.reflect.Type;
import java.util.List;
import java.util.Map;
Then deserialize with an explicit type:
Type type = new TypeToken<List<Map<Object, Object>>>(){}.getType();
List<Map<Object, Object>> treeMap = new GsonBuilder().create().fromJson(jsonInput, type);
I don't know whether this question is a duplicate, but none of the answers I came across seem to work for me (maybe I'm doing something wrong).
I have a class defined thus:
case class myRec(
time: String,
client_title: String,
made_on_behalf: Double,
country: String,
email_address: String,
phone: String)
and a sample JSON file that contains records (objects) in the form
[{...}{...}{...}...]
i.e.,
[{"time": "2015-05-01 02:25:47",
"client_title": "Mr.",
"made_on_behalf": 0,
"country": "Brussel",
"email_address": "15e29034@gmail.com"},
{"time": "2015-05-01 04:15:03",
"client_title": "Mr.",
"made_on_behalf": 0,
"country": "Bundesliga",
"email_address": "aae665d95c5d630@aol.com"},
{"time": "2015-05-01 06:29:18",
"client_title": "Mr.",
"made_on_behalf": 0,
"country": "Japan",
"email_address": "fef412c714ff@yahoo.com"}...]
My build.sbt has libraryDependencies += "com.owlike" % "genson-scala_2.11" % "1.3" for scalaVersion := "2.11.7".
I have a Scala function defined thus:
//PS: Other imports already made
import com.owlike.genson.defaultGenson._
//PS: Spark context already defined
def prepData(infile:String):RDD[myRec] = {
val input = sc.textFile(infile)
//Read Json Data into my Record Case class
input.mapPartitions( records =>
records.map( record => fromJson[myRec](record))
)}
And I'm calling the function:
prepData("file://path/to/abc.json")
Is there any way of doing this, or is there another JSON library I can use to convert the file to an RDD?
I also tried this, using ScalaObjectMapper, and neither approach seems to work.
PS: I don't want to go through Spark SQL to process the JSON file.
Thanks!
Jyd, not using Spark SQL for JSON is an interesting choice, but it's very much doable. There is an example of how to do this in the Learning Spark book's examples (disclaimer: I am one of the co-authors, so a little biased). The examples are on GitHub at https://github.com/databricks/learning-spark, but here is the relevant code snippet:
case class Person(name: String, lovesPandas: Boolean) // Note: must be a top level class
object BasicParseJsonWithJackson {
def main(args: Array[String]) {
if (args.length < 3) {
println("Usage: [sparkmaster] [inputfile] [outputfile]")
exit(1)
}
val master = args(0)
val inputFile = args(1)
val outputFile = args(2)
val sc = new SparkContext(master, "BasicParseJsonWithJackson", System.getenv("SPARK_HOME"))
val input = sc.textFile(inputFile)
// Parse it into a specific case class. We use mapPartitions because:
// (a) ObjectMapper is not serializable so we either create a singleton object encapsulating ObjectMapper
// on the driver and have to send data back to the driver to go through the singleton object.
// Alternatively we can let each node create its own ObjectMapper but that's expensive in a map
// (b) To solve for creating an ObjectMapper on each node without being too expensive we create one per
// partition with mapPartitions. Solves serialization and object creation performance hit.
val result = input.mapPartitions(records => {
// mapper object created on each executor node
val mapper = new ObjectMapper with ScalaObjectMapper
mapper.configure(DeserializationFeature.FAIL_ON_UNKNOWN_PROPERTIES, false)
mapper.registerModule(DefaultScalaModule)
// We use flatMap to handle errors
// by returning an empty list (None) if we encounter an issue and a
// list with one element if everything is ok (Some(_)).
records.flatMap(record => {
try {
Some(mapper.readValue(record, classOf[Person]))
} catch {
case e: Exception => None
}
})
}, true)
result.filter(_.lovesPandas).mapPartitions(records => {
val mapper = new ObjectMapper with ScalaObjectMapper
mapper.registerModule(DefaultScalaModule)
records.map(mapper.writeValueAsString(_))
})
.saveAsTextFile(outputFile)
}
}
Note this uses Jackson (specifically "com.fasterxml.jackson.core" % "jackson-databind" % "2.3.3" & "com.fasterxml.jackson.module" % "jackson-module-scala_2.10" % "2.3.3" dependencies).
I just noticed that your question had some sample input, and as @zero323 pointed out, line-by-line parsing isn't going to work. Instead you would do:
val input = sc.wholeTextFiles(inputFile).map(_._2)
// Parse it into a specific case class. We use mapPartitions because:
// (a) ObjectMapper is not serializable so we either create a singleton object encapsulating ObjectMapper
// on the driver and have to send data back to the driver to go through the singleton object.
// Alternatively we can let each node create its own ObjectMapper but that's expensive in a map
// (b) To solve for creating an ObjectMapper on each node without being too expensive we create one per
// partition with mapPartitions. Solves serialization and object creation performance hit.
val result = input.mapPartitions(records => {
// mapper object created on each executor node
val mapper = new ObjectMapper with ScalaObjectMapper
mapper.configure(DeserializationFeature.FAIL_ON_UNKNOWN_PROPERTIES, false)
mapper.registerModule(DefaultScalaModule)
// We use flatMap to handle errors
// by returning an empty list (Nil) if we encounter an issue and
// the parsed list of records if everything is ok.
records.flatMap(record => {
try {
mapper.readValue[List[Person]](record)
} catch {
case e: Exception => Nil
}
})
})
Just for fun, you can try to split individual documents using a specific delimiter. While it won't work on complex nested documents, it should handle the example input without using wholeTextFiles:
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat
import org.apache.hadoop.io.{LongWritable, Text}
import org.apache.hadoop.conf.Configuration
import net.liftweb.json.{parse, JObject, JField, JString, JInt}
case class MyRec(
time: String,
client_title: String,
made_on_behalf: Double,
country: String,
email_address: String)
@transient val conf = new Configuration
conf.set("textinputformat.record.delimiter", "},\n{")
def clean(s: String) = {
val p = "(?s)\\[?\\{?(.*?)\\}?\\]?".r
s match {
case p(x) => Some(s"{$x}")
case _ => None
}
}
def toRec(os: Option[String]) = {
os match {
case Some(s) =>
for {
JObject(o) <- parse(s);
JField("time", JString(time)) <- o;
JField("client_title", JString(client_title)) <- o;
JField("made_on_behalf", JInt(made_on_behalf)) <- o
JField("country", JString(country)) <- o;
JField("email_address", JString(email)) <- o
} yield MyRec(time, client_title, made_on_behalf.toDouble, country, email)
case _ => Nil
}
}
val records = sc.newAPIHadoopFile("some.json",
classOf[TextInputFormat], classOf[LongWritable], classOf[Text], conf)
.map{case (_, txt) => clean(txt.toString)}
.flatMap(toRec)