How to remove non-ascii characters from logs?

I want to remove non-ASCII characters from logs (JSON strings) and parse them, but I see text like this before my JSON string starts. How can I remove this kind of string and parse my JSON string?
SEQ^F!org.apache.hadoop.io.LongWritable^Yorg.apache.hadoop.io.Text^#^#^#^#^#^#ìþNmbÃ<92>w^G6ùó¯Ãl^#^#^X^E^#^#^#^H^#^#^^¯/âë<8e>^Wú{

If it is safe to do so, you can drop everything before the first { so that only the JSON string remains. I assume your log format is "garbage{json}".
For example:
scala> val log = """SEQ^F!org.apache.hadoop.io.LongWritable^Yorg.apache.hadoop.io.Text^#^#^#^#^#^#ìþNmbÃ<92>w^G6ùó¯Ãl^#^#^X^E^#^#^#^H^#^#^^¯/âë<8e>^Wú{"key1": "value1", "key2": ["1", "2"]}"""
log: String = SEQ^F!org.apache.hadoop.io.LongWritable^Yorg.apache.hadoop.io.Text^#^#^#^#^#^#ìþNmbÃ<92>w^G6ùó¯Ãl^#^#^X^E^#^#^#^H^#^#^^¯/âë<8e>^Wú{"key1": "value1", "key2": ["1", "2"]}
scala> val extractJson = log.dropWhile(char => char != '{')
extractJson: String = {"key1": "value1", "key2": ["1", "2"]}
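dropWhile assumes that the first { always starts the payload and that nothing follows the JSON. A slightly more defensive sketch, using only the standard library, could look like the following. The names extractJson and asciiOnly are hypothetical, and the "first { to last }" rule is an assumption about the log format, not something the logs guarantee:

```scala
// Hypothetical helper: returns the text from the first '{' to the last '}',
// or None when the line does not contain such a pair in that order.
def extractJson(line: String): Option[String] = {
  val start = line.indexOf('{')
  val end   = line.lastIndexOf('}')
  if (start >= 0 && end > start) Some(line.substring(start, end + 1)) else None
}

// Hypothetical helper: keeps only printable ASCII characters (0x20 to 0x7E).
def asciiOnly(s: String): String = s.filter(c => c >= ' ' && c <= '~')

val cleaned = extractJson("SEQ\u0006garbage{\"key1\": \"value1\"}")
```

Returning an Option makes lines without any JSON explicit instead of silently producing an empty string. The SEQ prefix followed by Writable class names looks like a Hadoop SequenceFile header, so if the original files are available, reading them with a SequenceFile reader may be more robust than trimming text.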
Then use any JSON API. I'm using circe in the following example:
scala> import io.circe.parser._
import io.circe.parser._
scala> parse(extractJson)
res3: Either[io.circe.ParsingFailure,io.circe.Json] =
Right({
  "key1" : "value1",
  "key2" : [
    "1",
    "2"
  ]
})
If you want to extract a particular element from the JSON:
scala> res3.map(j => (j \\ "key1").headOption)
res4: scala.util.Either[io.circe.ParsingFailure,Option[io.circe.Json]] = Right(Some("value1"))
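The \\ operator searches for the key at any depth and returns raw Json values. When the key is known to sit at the top level, a cursor can decode it straight into a Scala type. This is a minimal sketch assuming circe-parser is on the classpath; the value names raw, key1 and key2 are only for illustration:

```scala
import io.circe.parser.parse

val raw = """{"key1": "value1", "key2": ["1", "2"]}"""

// Decode individual top-level fields into Scala types.
// A parsing or decoding problem ends up in the Left side.
val key1 = parse(raw).flatMap(_.hcursor.get[String]("key1"))
val key2 = parse(raw).flatMap(_.hcursor.get[List[String]]("key2"))
```

Both results are Either values, so a missing or mistyped field is reported as a Left rather than thrown.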

Related

How to convert a Gatling jsonFeeder with nested structures to json request body?

I have a json feeder reading from a file a json array like this:
[{ "a": {"b": 1} }, { "a": {"b": 2} }]
When I use it for a request body, it is sent as ArraySeq(HashMap(.. instead of the actual JSON. How can I convert the parsed JSON back to JSON while keeping the feeder approach?
val jsonFileFeeder = jsonFile("requests.json").circular
val scn = scenario("cost estimation")
  .feed(jsonFileFeeder)
  .exec(
    http("request_1")
      .post("/")
      .body(StringBody(
        """{
          "a": "${a}"
        }"""
      )).asJson
  )
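Use the Gatling EL function jsonStringify(), which serializes the feeder value back to JSON (see the Gatling Expression Language documentation). Do not wrap the expression in quotes, because jsonStringify() already emits a complete JSON value:
val jsonFileFeeder = jsonFile("requests.json").circular
val scn = scenario("cost estimation")
  .feed(jsonFileFeeder)
  .exec(
    http("request_1")
      .post("/")
      .body(StringBody(
        """{
          "a": ${a.jsonStringify()}
        }"""
      )).asJson
  )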

Scala, Circe, Json - how to remove parent node from json?

I have a JSON structure like this:
"data" : {
  "fields": {
    "field1": "value1",
    "field2": "value2"
  }
}
Now I would like to remove the fields node and keep its contents directly under data:
"data" : {
  "field1": "value1",
  "field2": "value2"
}
I tried to do it like this:
val result = data.hcursor.downField("fields").as[JsonObject].toOption.head.toString
but I got a strange result instead of just the JSON in string format.
I also tried:
val result = data.hcursor.downField("fields").top.head.toString
but it was the same as:
val result = data.toString
and it includes fields.
How should I change my code to remove the fields node and keep its contents under the data property?
Here is a full working solution that traverses the JSON, extracts the fields, removes them and then merges them under data:
import io.circe.Json
import io.circe.parser._
val s =
"""
|{
|"data": {
| "fields": {
| "field1": "value1",
| "field2": "value2"
| }
|}
|}
|""".stripMargin
val modifiedJson =
  for {
    json <- parse(s)
    fields <- json.hcursor
      .downField("data")
      .downField("fields")
      .as[Json]
    modifiedRoot <- json.hcursor
      .downField("data")
      .downField("fields")
      .delete
      .root
      .as[Json]
    res <-
      modifiedRoot.hcursor
        .downField("data")
        .withFocus(_.deepMerge(fields))
        .root
        .as[Json]
  } yield res
Yields:
Right({
"data" : {
"field1" : "value1",
"field2" : "value2"
}
})
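The same result can probably be reached in a single cursor pass by rewriting the data node in place. The sketch below is one possible variant, not the answer's method; hoistFields is a hypothetical name, and returning the input unchanged when data or fields is missing is an assumption about the desired behaviour:

```scala
import io.circe.Json
import io.circe.parser.parse

// Hypothetical helper: moves the children of data.fields directly under data.
def hoistFields(json: Json): Json =
  json.hcursor
    .downField("data")
    .withFocus { data =>
      data.hcursor.downField("fields").focus match {
        // Drop "fields", then merge its former children into "data".
        case Some(fields) => data.mapObject(_.remove("fields")).deepMerge(fields)
        // No "fields" node: leave "data" untouched.
        case None => data
      }
    }
    .top
    .getOrElse(json) // no "data" node: return the input unchanged

val hoisted =
  parse("""{"data": {"fields": {"field1": "value1", "field2": "value2"}}}""")
    .map(hoistFields)
```

Because the function is total on Json, it can be mapped over the result of parse without an extra for comprehension.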

How do I ignore decoding failures in a JSON array?

Suppose I want to decode some values from a JSON array into a case class with circe. The following works just fine:
scala> import io.circe.generic.auto._, io.circe.jawn.decode
import io.circe.generic.auto._
import io.circe.jawn.decode
scala> case class Foo(name: String)
defined class Foo
scala> val goodDoc = """[{ "name": "abc" }, { "name": "xyz" }]"""
goodDoc: String = [{ "name": "abc" }, { "name": "xyz" }]
scala> decode[List[Foo]](goodDoc)
res0: Either[io.circe.Error,List[Foo]] = Right(List(Foo(abc), Foo(xyz)))
It's sometimes the case that the JSON array I'm decoding contains other, non-Foo-shaped stuff, though, which results in a decoding error:
scala> val badDoc =
| """[{ "name": "abc" }, { "id": 1 }, true, "garbage", { "name": "xyz" }]"""
badDoc: String = [{ "name": "abc" }, { "id": 1 }, true, "garbage", { "name": "xyz" }]
scala> decode[List[Foo]](badDoc)
res1: Either[io.circe.Error,List[Foo]] = Left(DecodingFailure(Attempt to decode value on failed cursor, List(DownField(name), MoveRight, DownArray)))
How can I write a decoder that ignores anything in the array that can't be decoded into my case class?
The most straightforward way to solve this problem is to use a decoder that first tries to decode each value as a Foo, and then falls back to the identity decoder if the Foo decoder fails. The new either method in circe 0.9 makes the generic version of this practically a one-liner:
import io.circe.{ Decoder, Json }
def decodeListTolerantly[A: Decoder]: Decoder[List[A]] =
  Decoder.decodeList(Decoder[A].either(Decoder[Json])).map(
    _.flatMap(_.left.toOption)
  )
It works like this:
scala> val myTolerantFooDecoder = decodeListTolerantly[Foo]
myTolerantFooDecoder: io.circe.Decoder[List[Foo]] = io.circe.Decoder$$anon$21@2b48626b
scala> decode(badDoc)(myTolerantFooDecoder)
res2: Either[io.circe.Error,List[Foo]] = Right(List(Foo(abc), Foo(xyz)))
To break down the steps:
Decoder.decodeList says "define a list decoder that tries to use the given decoder to decode each JSON array value".
Decoder[A].either(Decoder[Json]) says "first try to decode the value as an A, and if that fails decode it as a Json value (which will always succeed), and return the result (if any) as an Either[A, Json]".
.map(_.flatMap(_.left.toOption)) says "take the resulting list of Either[A, Json] values and remove all the Rights".
…which does what we want in a fairly concise, compositional way. At some point we might want to bundle this up into a utility method in circe itself, but for now writing out this explicit version isn't too bad.
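For comparison, a variant that does not rely on either is to decode the array as raw Json values first and then keep only the elements that also decode as A. This is a sketch of that alternative, not the approach described above; decodeListLeniently is a hypothetical name, and the Foo decoder is written by hand only so that the example does not need circe-generic:

```scala
import io.circe.{Decoder, Json}
import io.circe.parser.decode

final case class Foo(name: String)

// Hand-written decoder, used here instead of generic derivation.
implicit val decodeFoo: Decoder[Foo] =
  Decoder.instance(c => c.get[String]("name").map(Foo(_)))

// Hypothetical alternative: decode every element as Json (always succeeds
// for a JSON array), then drop the elements that fail to decode as A.
def decodeListLeniently[A: Decoder]: Decoder[List[A]] =
  Decoder[List[Json]].map(_.flatMap(_.as[A].toOption))

val badDoc = """[{ "name": "abc" }, { "id": 1 }, true, "garbage", { "name": "xyz" }]"""
val foos = decode(badDoc)(decodeListLeniently[Foo])
```

Like the either-based version, this silently discards the failures; if the rejected elements matter, the intermediate List[Json] is the place to inspect them.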

How do I update a nested key of a JsObject in scala

In my code I make a get request to a server to get some json, and then I want to update one of the values before I send it back. I know that if the key was on the top level I could just update the key by writing
val newConfig = originalConfig ++ Json.obj("key" -> newValue)
however I cannot figure out a nice way to update it if the key I want to change is a couple of layers in.
For example, my JSON looks like this, and I want to update only key5:
{
  "key1": "value",
  "key2": {
    "key3": "value",
    "key4": {
      "key5": "value",
      "key6": "value"
    }
  }
}
Is there a way to do this without updating it layer by layer?
For example:
val key4 = originalKey4 ++ Json.obj("key5" -> newValue)
val key2 = originalKey2 ++ Json.obj("key4" -> key4)
val newJson = originalJson ++ Json.obj("key2" -> key2)
The actual key that I want to update is 7 layers in, so this is rather tedious.
Take a look at JSON transformers:
import play.api.libs.json._
val str = """{
| "key1": "value",
| "key2": {
| "key3": "value",
| "key4": {
| "key5": "value",
| "key6": "value"
| }
| }
|}""".stripMargin
val json = Json.parse(str)
// String keys are used instead of Symbol literals, which are deprecated since Scala 2.13.
val transformer = (__ \ "key2" \ "key4" \ "key5").json.update(
  __.read[JsString].map(_ => Json.toJson("updated value"))
)
val result = json.transform(transformer).asOpt.get
Json.prettyPrint(result)
res0: String = {
"key1" : "value",
"key2" : {
"key3" : "value",
"key4" : {
"key5" : "updated value",
"key6" : "value"
}
}
}
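Calling .asOpt.get throws if the transformation fails, which should happen for instance when the path is missing or the value at the path is not a string. A minimal sketch of handling both outcomes explicitly, assuming play-json is on the classpath and using a shortened document for brevity (the name updated and the error message are illustrative):

```scala
import play.api.libs.json._

val json = Json.parse("""{"key2": {"key4": {"key5": "value", "key6": "value"}}}""")

val transformer = (__ \ "key2" \ "key4" \ "key5").json.update(
  __.read[JsString].map(_ => JsString("updated value"))
)

// Match on the JsResult instead of unwrapping it with .get.
val updated: Either[String, JsObject] = json.transform(transformer) match {
  case JsSuccess(obj, _) => Right(obj)
  case JsError(errors)   => Left(s"transform failed: $errors")
}
```

The same transformer works at any depth, so a key seven layers in only needs a longer path.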

extracting keys from json string using json4s

Can someone tell me how to extract keys from JSON using json4s?
My use case:
JSON stored as a string in a Scala variable:
{
"key1" : "val1",
"key2" : ["12", "32"],
"key3" : {"keyN" : "valN"}
}
I'd like to transform this into the following Map[String, String]:
(key1 -> "val1", key2 -> "[\"12\",\"32\"]", key3 -> "{\"keyN\":\"valN\"}")
Is there a simple way to achieve this with json4s?
Thanks in advance
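// Assumes the json4s-jackson backend; with json4s-native, import from org.json4s.native instead.
import org.json4s._
import org.json4s.jackson.JsonMethods.parse
import org.json4s.jackson.Serialization.write

implicit val formats: Formats = DefaultFormats

val result: Map[String, String] = parse(""" {
    | "key1" : "val1",
    | "key2" : ["12", "32"],
    | "key3" : {"keyN" : "valN"}
    | }""".stripMargin).mapField { case (key, value) =>
  val v: String = value match {
    case JString(s) => s            // plain strings are kept as-is
    case other      => write(other) // anything else is serialized back to JSON text
  }
  (key, JString(v))
}.extract[Map[String, String]]
println(result)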
You can use mapField to turn every field value into a JString:
if the value is already a string, extract it as a String;
if the value has any other type, use json4s's write to serialize it to a JSON string;
finally, extract the resulting JValue as a Map[String, String].
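If only the top-level keys matter, pattern matching on JObject is another option. The following is a sketch under the same json4s-jackson assumption; topLevelAsStrings is a hypothetical name, and returning an empty map for non-object input is an illustrative choice:

```scala
import org.json4s._
import org.json4s.jackson.JsonMethods.{compact, parse, render}

// Hypothetical helper: looks at top-level fields only. String values are
// kept as-is; every other value is rendered back to compact JSON text.
def topLevelAsStrings(json: String): Map[String, String] =
  parse(json) match {
    case JObject(fields) =>
      fields.map {
        case (key, JString(s)) => key -> s
        case (key, other)      => key -> compact(render(other))
      }.toMap
    case _ => Map.empty // not a JSON object
  }

val asStrings = topLevelAsStrings(
  """{ "key1" : "val1", "key2" : ["12", "32"], "key3" : {"keyN" : "valN"} }"""
)
```

Unlike mapField, this does not visit nested fields, so values inside key3 are rendered untouched.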
If you only need the key names, extract the JValue as a Map[String, Any] and take its key set:
implicit val formats = DefaultFormats
val a = parse(""" { "numbers" : [1, 2, 3, 4] } """)
println(a.extract[Map[String, Any]].keySet)