How to remove empty JSON objects in Scala

Sometimes we have JSON that looks like this:
{a:{}, b:{c:{}, d:123}}
and we want to remove the empty structures so that it becomes {b:{d:123}}.

Here is how you can do it in Scala using Jackson's tree model:
import com.fasterxml.jackson.databind.{JsonNode, ObjectMapper}
import com.fasterxml.jackson.databind.node.{ArrayNode, ObjectNode}
import scala.collection.JavaConverters._ // scala.jdk.CollectionConverters._ on Scala 2.13+

// Prunes empty objects in place. Returns false when `node` is an object that is
// empty after pruning, which tells the caller to remove it.
def removeEmptyFields(node: JsonNode): Boolean = node match {
  case array: ArrayNode =>
    // Walk backwards so that removing an element does not shift the indices still to visit.
    for (i <- array.size() - 1 to 0 by -1)
      if (!removeEmptyFields(array.get(i))) array.remove(i)
    true
  case obj: ObjectNode =>
    // Copy the names first so that fields can be removed while looping.
    val names = obj.fieldNames().asScala.toList
    names.foreach(name => if (!removeEmptyFields(obj.get(name))) obj.remove(name))
    obj.size() > 0
  case _ => true
}

val json = """ {"a":{}, "b": {"c": {}, "d": 123}} """
val mapper = new ObjectMapper()
val node = mapper.readTree(json)
removeEmptyFields(node)
val cleanJson = mapper.writeValueAsString(node) // {"b":{"d":123}}
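The pruning rule itself does not depend on Jackson. As a rough sketch, here is the same idea on a toy JSON model; the types `J`, `JObj`, `JArr`, `JNum` and the function `prune` are made up for illustration and are not part of any library. An object that ends up with no fields is reported as None so that its parent drops it, while arrays and leaf values are always kept.

```scala
// Toy JSON model (hypothetical, for illustration only; not Jackson's or circe's types).
sealed trait J
final case class JObj(fields: List[(String, J)]) extends J
final case class JArr(items: List[J]) extends J
final case class JNum(value: Int) extends J

// Returns None when the node is an object that is empty after pruning,
// so the parent knows to drop it; every other node is kept.
def prune(j: J): Option[J] = j match {
  case JObj(fields) =>
    val kept = fields.flatMap { case (k, v) => prune(v).map(cleaned => k -> cleaned) }
    if (kept.isEmpty) None else Some(JObj(kept))
  case JArr(items) =>
    Some(JArr(items.flatMap(item => prune(item))))
  case leaf =>
    Some(leaf)
}

// {"a":{}, "b":{"c":{}, "d":123}}
val input = JObj(List("a" -> JObj(Nil), "b" -> JObj(List("c" -> JObj(Nil), "d" -> JNum(123)))))
val cleaned = prune(input)
```

Because this version builds a new tree instead of mutating the old one, it avoids the index bookkeeping that in-place removal from an array needs.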

Pure, stack-safe implementation with circe:
import cats.Eval
import cats.implicits._
import io.circe.JsonObject
import io.circe.literal._
import io.circe.syntax._
object Main extends App {
  def removeEmpty(jo: JsonObject): JsonObject = {
    // `Eval` is trampolined, so this is stack-safe
    def loop(o: JsonObject): Eval[JsonObject] =
      o.toList.foldLeftM(o) { case (acc, (k, v)) =>
        v.asObject match {
          // clean the nested object first, then drop it if nothing is left
          case Some(oo) =>
            Eval.defer(loop(oo)).map { cleaned =>
              if (cleaned.isEmpty) acc.remove(k) else acc.add(k, cleaned.asJson)
            }
          case _ => acc.pure[Eval]
        }
      }
    loop(jo).value
  }
  // This is a JSON literal; if the input is a dynamic string,
  // parse it with `io.circe.parser.parse` first.
  val json = json"""{"a":{}, "b": {"c": {}, "d": 123}}"""
  val res = json.asObject.map(removeEmpty(_).asJson)
  println(res)
}


JSON string to DataFrame: schema contains ambiguous column names (Spark, Scala)

How can I convert this JSON to a DataFrame in Scala?
val json_string = """{
"module": {
"col1": "a",
"col2": {
"5": 1,
"3": 4,
"4": {
"numeric reasoning": 2,
"verbal": 4
},
"7": {
"landline": 2,
"landLine": 4
}
}
}
}"""
The code I use:
val jsonRDD = spark.parallelize(json_string::Nil)
val jsonDF = sqlContext.read.json(jsonRDD)
val df = flattenRecursive(jsonDF)
df.show()
def flattenRecursive(df: DataFrame): DataFrame = {
val fields = df.schema.fields
val fieldNames = fields.map(x => x.name)
val length = fields.length
for(i <- 0 to fields.length-1){
val field = fields(i)
val fieldtype = field.dataType
val fieldName = field.name
fieldtype match {
case arrayType: ArrayType =>
println("flatten array")
val newfieldNames = fieldNames.filter(_!=fieldName) ++ Array("explode_outer(".concat(fieldName).concat(") as ").concat(fieldName))
val explodedDf = df.selectExpr(newfieldNames:_*)
return flattenRecursive(explodedDf)
case structType: StructType =>
println("flatten struct")
val newfieldNames = fieldNames.filter(_!= fieldName) ++ structType.fieldNames.map(childname => fieldName.concat(".").concat(childname) .concat(" as ").concat(fieldName).concat("_").concat(childname))
val explodedf = df.selectExpr(newfieldNames:_*)
return flattenRecursive(explodedf)
case _ =>
println("other type")
}
}
df
}
The error I get is:
Ambiguous reference to fields StructField(landLine,LongType,true), StructField(landline,LongType,true);
Required output: ideally one of the two landline columns would be renamed to landline_1 before the explode.
Note: please provide generic code, because I don't know at which level the ambiguity will occur, and I don't know the schema when the code runs.
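One possible starting point for the renaming, sketched on plain field names only: give every name that collides case-insensitively with an earlier one a numeric suffix. `dedupeIgnoringCase` is a hypothetical helper, not a Spark API, and wiring it into the schema (for example by applying it to each struct's field names before flattening) is left out here. It also assumes the suffixed name does not clash with a field that already exists.

```scala
import scala.collection.mutable

// Appends _1, _2, ... to every name that matches an earlier name when case is ignored.
// The first occurrence keeps its original spelling.
def dedupeIgnoringCase(names: Seq[String]): Seq[String] = {
  val seen = mutable.Map.empty[String, Int]
  names.map { name =>
    val key = name.toLowerCase
    val count = seen.getOrElse(key, 0)
    seen(key) = count + 1
    if (count == 0) name else s"${name}_$count"
  }
}

val renamed = dedupeIgnoringCase(Seq("landline", "landLine", "verbal"))
```

Since it only looks at a flat list of names, the same helper could be called at whatever nesting level the clash turns up.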

How to skip deserialization for a field in json4s

Here is my json:
{
"stringField" : "whatever",
"nestedObject": { "someProperty": "someValue"}
}
I want to map it to
case class MyClass(stringField: String, nestedObject:String)
nestedObject should not be deserialized; I want json4s to leave it as a string.
The resulting instance should be:
val instance = MyClass(stringField="whatever", nestedObject= """ { "someProperty": "someValue"} """)
I don't understand how to do this in json4s.
You can define a custom serializer:
import org.json4s._
import org.json4s.JsonDSL._
import org.json4s.jackson.JsonMethods._ // or org.json4s.native.JsonMethods._
case object MyClassSerializer extends CustomSerializer[MyClass](f => ( {
  case jsonObj =>
    implicit val format: Formats = org.json4s.DefaultFormats
    val stringField = (jsonObj \ "stringField").extract[String]
    // re-render the nested object as a compact JSON string instead of extracting it
    val nestedObject = compact(render(jsonObj \ "nestedObject"))
    MyClass(stringField, nestedObject)
}, {
  case myClass: MyClass =>
    ("stringField" -> myClass.stringField) ~
      ("nestedObject" -> myClass.nestedObject)
}
))
Then add it to the default formats:
implicit val format = org.json4s.DefaultFormats + MyClassSerializer
println(parse(jsonString).extract[MyClass])
will output:
MyClass(whatever,{"someProperty":"someValue"})
Code run at Scastie

How to edit an existing JSON object with spray-json

I am using Akka HTTP with spray-json support, and I need to edit a value in the received JSON.
import akka.http.scaladsl.server.Directives
import akka.http.scaladsl.marshallers.sprayjson.SprayJsonSupport
import spray.json._
final case class Item(name: String, id: Long)
final case class Order(items: List[Item],orderTag:String)
trait JsonSupport extends SprayJsonSupport with DefaultJsonProtocol {
implicit val itemFormat = jsonFormat2(Item)
implicit val orderFormat = jsonFormat2(Order)
}
In my use case I receive the JSON with the orderTag value set to null. All I need to do is edit the orderTag value and then use the result as the entity value. Is it possible to edit the JSON object, and how would I do that?
class MyJsonService extends Directives with JsonSupport {
// format: OFF
val route =
get {
pathSingleSlash {
complete(Item("thing", 42)) // will render as JSON
}
} ~
post {
entity(as[Order]) { order => // will unmarshal JSON to Order
val itemsCount = order.items.size
val itemNames = order.items.map(_.name).mkString(", ")
complete(s"Ordered $itemsCount items: $itemNames")
}
}
}
You can just edit the JSON AST, like this:
val json = """{"orderTag":null}"""
val jsObj = json.parseJson.asJsObject
val updatedJs = if (jsObj.fields.get("orderTag").contains(JsNull)) {
  JsObject(jsObj.fields + ("orderTag" -> JsString("new tag")))
} else {
  jsObj
}
updatedJs.compactPrint
// res26: String = {"orderTag":"new tag"}

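If changing the model is an option, an alternative to editing the AST is to declare orderTag as Option[String]. As far as I know, spray-json's DefaultJsonProtocol reads a JSON null (or a missing field) as None for Option fields, so jsonFormat2(Order) should keep working, but treat that as an assumption to verify against your version. The default can then be filled in on the case class itself. A minimal sketch using only case classes (`withDefaultTag` is a hypothetical helper name):

```scala
final case class Item(name: String, id: Long)
// orderTag is optional here, so a null or missing tag can be represented as None.
final case class Order(items: List[Item], orderTag: Option[String])

// Fills in a default tag only when none was supplied.
def withDefaultTag(order: Order, default: String): Order =
  order.copy(orderTag = order.orderTag.orElse(Some(default)))

val received = Order(List(Item("thing", 42L)), orderTag = None)
val fixed = withDefaultTag(received, "new tag")
```

Inside the route this would be called on the unmarshalled `order` before it is used further.
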
Transform all keys from `underscore` to `camel case` of json objects in circe

Origin
{
"first_name" : "foo",
"last_name" : "bar",
"parent" : {
"first_name" : "baz",
"last_name" : "bazz"
}
}
Expected
{
"firstName" : "foo",
"lastName" : "bar",
"parent" : {
"firstName" : "baz",
"lastName" : "bazz"
}
}
How can I transform all the keys of a JSON object, including nested ones?
Here's how I'd write this. It's not as concise as I'd like, but it's not terrible:
import cats.free.Trampoline
import cats.std.list._
import cats.syntax.traverse._
import io.circe.{ Json, JsonObject }
/**
* Helper method that transforms a single layer.
*/
def transformObjectKeys(obj: JsonObject, f: String => String): JsonObject =
JsonObject.fromIterable(
obj.toList.map {
case (k, v) => f(k) -> v
}
)
def transformKeys(json: Json, f: String => String): Trampoline[Json] =
json.arrayOrObject(
Trampoline.done(json),
_.traverse(j => Trampoline.suspend(transformKeys(j, f))).map(Json.fromValues),
transformObjectKeys(_, f).traverse(obj => Trampoline.suspend(transformKeys(obj, f))).map(Json.fromJsonObject)
)
And then:
import io.circe.literal._
val doc = json"""
{
"first_name" : "foo",
"last_name" : "bar",
"parent" : {
"first_name" : "baz",
"last_name" : "bazz"
}
}
"""
def sc2cc(in: String) = "_([a-z\\d])".r.replaceAllIn(in, _.group(1).toUpperCase)
And finally:
scala> import cats.std.function._
import cats.std.function._
scala> transformKeys(doc, sc2cc).run
res0: io.circe.Json =
{
"firstName" : "foo",
"lastName" : "bar",
"parent" : {
"firstName" : "baz",
"lastName" : "bazz"
}
}
We probably should have some way of recursively applying a Json => F[Json] transformation like this more conveniently.
Depending on your full use-case, with the latest Circe you might prefer just leveraging the existing decoder/encoder for converting between camel/snake according to these references:
https://dzone.com/articles/5-useful-circe-feature-you-may-have-overlooked
https://github.com/circe/circe/issues/663
For instance, in my particular use-case this makes sense because I'm doing other operations that benefit from the type-safety of first deserializing into case classes. So if you're willing to decode the JSON into a case class, and then encode it back into JSON, all you would need is for your (de)serializing code to extend a trait that configures this, like:
import io.circe.derivation._
import io.circe.{Decoder, Encoder, ObjectEncoder, derivation}
import io.circe.generic.auto._
import io.circe.parser.decode
import io.circe.syntax._
trait JsonSnakeParsing {
implicit val myCustomDecoder: Decoder[MyCaseClass] = deriveDecoder[MyCaseClass](io.circe.derivation.renaming.snakeCase)
// only needed if you want to serialize back to snake case json:
// implicit val myCustomEncoder: ObjectEncoder[MyCaseClass] = deriveEncoder[MyCaseClass](io.circe.derivation.renaming.snakeCase)
}
For example, I then extend that when I actually parse or output the JSON:
trait Parsing extends JsonSnakeParsing {
val result: MyCaseClass = decode[MyCaseClass](scala.io.Source.fromResource("my.json").mkString) match {
case Left(jsonError) => throw new Exception(jsonError)
case Right(source) => source
}
val theJson = result.asJson
}
For this example, your case class might look like:
case class MyCaseClass(firstName: String, lastName: String, parent: Option[MyCaseClass])
Here's my full list of circe dependencies for this example:
val circeVersion = "0.10.0-M1"
"io.circe" %% "circe-generic" % circeVersion,
"io.circe" %% "circe-parser" % circeVersion,
"io.circe" %% "circe-generic-extras" % circeVersion,
"io.circe" %% "circe-derivation" % "0.9.0-M5",
import scala.util.control.TailCalls._
import io.circe.Json
def transformKeys(json: Json, f: String => String): TailRec[Json] = {
  if (json.isObject) {
    val obj = json.asObject.get
    // rename each key and recurse into its value, accumulating inside TailRec
    val fields = obj.toList.foldLeft(done(List.empty[(String, Json)])) { (r, kv) =>
      val (k, v) = kv
      for {
        fs <- r
        fv <- tailcall(transformKeys(v, f))
      } yield fs :+ (f(k) -> fv)
    }
    fields.map(fs => Json.obj(fs: _*))
  } else if (json.isArray) {
    val arr = json.asArray.get
    val vsRec = arr.foldLeft(done(List.empty[Json])) { (vs, v) =>
      for {
        s <- vs
        e <- tailcall(transformKeys(v, f))
      } yield s :+ e
    }
    vsRec.map(vs => Json.arr(vs: _*))
  } else {
    done(json)
  }
}
This is how I currently do the transformation, but it is rather complicated; I hope there is a simpler way.
I took @Travis's answer and modernized it a bit. His code gave me several errors and warnings, so here is an updated version for Scala 2.12 with Cats 1.0.0-MF:
import io.circe.{ Json, JsonObject }
import io.circe.literal._
import cats.free.Trampoline, cats.instances.list._, cats.instances.function._, cats.syntax.traverse._, cats.instances.option._
def transformKeys(json: Json, f: String => String): Trampoline[Json] = {
def transformObjectKeys(obj: JsonObject, f: String => String): JsonObject =
JsonObject.fromIterable(
obj.toList.map {
case (k, v) => f(k) -> v
}
)
json.arrayOrObject(
Trampoline.done(json),
_.toList.traverse(j => Trampoline.defer(transformKeys(j, f))).map(Json.fromValues(_)),
transformObjectKeys(_, f).traverse(obj => Trampoline.defer(transformKeys(obj, f))).map(Json.fromJsonObject)
)
}
def sc2cc(in: String) = "_([a-z\\d])".r.replaceAllIn(in, _.group(1).toUpperCase)
def camelizeKeys(json: io.circe.Json) = transformKeys(json, sc2cc).run
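To see what the key function does on its own, here is a small standalone check of `sc2cc` that needs only the standard library. The sample keys other than first_name, last_name and parent are made up for illustration. Note that the regex only matches an underscore followed by a lowercase letter or a digit, so a key with a leading underscore ends up capitalized, and an underscore before an uppercase letter is left as it is.

```scala
// Same regex as above: an underscore followed by a lowercase letter or a digit
// is replaced by that character in upper case.
def sc2cc(in: String): String =
  "_([a-z\\d])".r.replaceAllIn(in, _.group(1).toUpperCase)

val examples = List("first_name", "last_name", "parent", "address_line_2", "_id")
val converted = examples.map(sc2cc)
// An underscore followed by an uppercase letter does not match the regex.
val untouched = sc2cc("first_Name")
```

If leading underscores or mixed-case keys matter in your data, the regex would need adjusting before it is passed to transformKeys.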

How to parse JSON with an arbitrary schema, update or create one field, and write it back as JSON (Scala)

How does one update or create a field in a JSON object with an arbitrary schema and write it back as JSON in Scala?
I tried spray-json with something like this:
import spray.json._
import DefaultJsonProtocol._
val jsonAst = """{"anyfield":"1234", "sought_optional_field":5.0}""".parse
val newValue = jsonAst.asJsObject.fields.getOrElse("sought_optional_field", 1)
val newMap = jsonAst.asJsObject.fields + ("sought_optional_field" -> newValue)
JSONObject(newMap).toJson
but it gives a weird result: "{"anyfield"[ : "1234", "sought_optional_field" : ]1}
You were almost there:
import spray.json._
import DefaultJsonProtocol._
def changeField(json: String) = {
val jsonAst = JsonParser(json)
val map = jsonAst.asJsObject.fields
val sought = map.getOrElse("sought_optional_field", 1.toJson)
map.updated("sought_optional_field", sought).toJson
}
val jsonA = """{"anyfield":"1234", "sought_optional_field":5.0}"""
val jsonB = """{"anyfield":"1234"}"""
changeField(jsonA)
// spray.json.JsValue = {"anyfield":"1234","sought_optional_field":5.0}
changeField(jsonB)
// spray.json.JsValue = {"anyfield":"1234","sought_optional_field":1}
Using Argonaut:
import argonaut._, Argonaut._
def changeField2(json: String) =
json.parseOption.map( parsed =>
parsed.withObject(o =>
o + ("sought_optional_field", o("sought_optional_field").getOrElse(jNumber(1)))
)
)
changeField2(jsonA).map(_.nospaces)
// Option[String] = Some({"anyfield":"1234","sought_optional_field":5})
changeField2(jsonB).map(_.nospaces)
// Option[String] = Some({"anyfield":"1234","sought_optional_field":1})