Groovy JsonBuilder strange behavior when toString() - json

I need to create JSON to use as the body of an http.request. I can build the JSON dynamically, but I noticed strange behavior when calling builder.toString() twice: the resulting JSON was totally different. I suspect this is related to some kind of buffer. I've been reading the documentation but I can't find a good answer. Here is some code to test.
import groovy.json.JsonBuilder

def builder = new JsonBuilder()
def map = [
    catA: ["catA-element1", "catA-element2"],
    catB: [],
    catC: ["catC-element1"]
]
def a = map.inject([:]) { res, k, v ->
    def b = v.inject([:]) { resp, i ->
        resp[i] = k
        resp
    }
    res += b
}
println a
def root = builder.query {
    bool {
        must a.collect { k, v ->
            builder.match {
                "$v" k
            }
        }
    }
    should([
        builder.simple_query_string {
            query "text"
        }
    ])
}
println builder.toString()
println builder.toString()
This prints the following lines. Pay attention to the last two:
[catA-element1:catA, catA-element2:catA, catC-element1:catC]
{"query":{"bool":{"must":[{"match":{"catA":"catA-element1"}},{"match":{"catA":"catA-element2"}},{"match":{"catC":"catC-element1"}}]},"should":[{"simple_query_string":{"query":"text"}}]}}
{"match":{"catC":"catC-element1"}}
In my code I can easily store the first toString() result in a variable and use it when needed. But why does the result change when toString() is invoked more than once?

I think this happens because you are using builder inside the bool closure. If we print builder.content before printing the result (builder.toString() calls JsonOutput.toJson(builder.content)), we get:
[query:[bool:ConsoleScript54$_run_closure3$_closure6@294b5a70, should:[[simple_query_string:[query:text]]]]]
Adding println builder.content to the bool closure we can see that the builder.content is modified when the closure is evaluated:
def root = builder.query {
    bool {
        must a.collect { k, v ->
            builder.match {
                "$v" k
                println builder.content
            }
        }
    }
    should([
        builder.simple_query_string {
            query "text"
        }
    ])
}
println JsonOutput.toJson(builder.content)
println builder.content
The above yields:
[query:[bool:ConsoleScript55$_run_closure3$_closure6@39b6156d, should:[[simple_query_string:[query:text]]]]]
[match:[catA:catA-element1]]
[match:[catA:catA-element2]]
{"query":{"bool":{"must":[{"match":{"catA":"catA-element1"}},{"match":{"catA":"catA-element2"}},{"match":{"catC":"catC-element1"}}]},"should":[{"simple_query_string":{"query":"text"}}]}}
[match:[catC:catC-element1]]
You can easily avoid that with a different builder for the closure inside:
def builder2 = new JsonBuilder()
def root = builder.query {
bool {
must a.collect{ k, v ->
builder2.match {
"$v" k
}
}
}
should([
builder.simple_query_string {
query "text"
}
])
}
Or even better:
def root = builder.query {
bool {
must a.collect({ k, v -> ["$v": k] }).collect({[match: it]})
}
should([
simple_query_string {
query "text"
}
])
}


How to use output from one generator in another generator in Kotest?

Using an example from Clojure's test.check let generator, generate a non-empty list of strings, give that list to another generator to pick a string from, then create a map that contains the string list and the selected string. In Clojure, it looks as follows:
(gen/let [list-of-strings (gen/not-empty (gen/list gen/string))
a-string (gen/element list-of-strings)] ;; use the generated list-of-strings above
{:all-strings list-of-strings
:selected a-string})
Taking io.kotest.property.arbitrary.bind for inspiration, I tried implementing it as follows, but it doesn't work (the Kotlin compiler reported "Type inference failed"):
fun <A, B, T: Any> let(genA: Gen<A>, genB: (A) -> Gen<B>, bindFn: (A, B) -> T): Arb<T> {
return arb { rs ->
val iterA = genA.generate(rs).iterator()
generateSequence {
val a = iterA.next()
val iterB = genB(a.value).generate(rs).iterator()
val b = iterB.next()
bindFn(a.value, b.value)
}
}
}
It turns out that dropping the bindFn parameter solves the problem, but the solution looks a little ugly because it has to return a Pair:
fun <A, B> let(genA: Gen<A>, genBFn: (A) -> Gen<B>): Arb<Pair<A, B>> {
return arb { rs ->
val iterA = genA.generate(rs).iterator()
generateSequence {
val a = iterA.next().value
// could combine the following to one line, but split for clarity
val genB = genBFn(a)
val iterB = genB.generate(rs).iterator()
Pair(a, iterB.next().value)
}
}
}
Then with the above, using it looks as follows:
class StringTest : StringSpec({
"element is in list" {
val letGen = let(
Arb.list(Arb.string(), range=1..100), // genA
{ xs -> Arb.element(xs) } // genBFn
)
forAll(letGen) { (xs, x) ->
x in xs
}
}
})
Inspired by the solution above, I wrote a shorter one:
fun <A, B> Gen<A>.then(genFn: (A) -> Gen<B>): Arb<Pair<A, B>> =
arbitrary { rs ->
val first = this.generate(rs).first().value
val second = genFn(first).generate(rs).first().value
Pair(first, second)
}
class StringTest : StringSpec({
"element is in list" {
val dependArb =
Arb.list(Arb.string(), range=1..100).then { Arb.element(it) } // genBFn
forAll(dependArb) { (xs, x) ->
x in xs
}
}
})

Recursively removing whitespace from JSON field names in Groovy

I have a Groovy process that is receiving troublesome JSON that has attribute/field names containing whitespaces:
{
"leg bone" : false,
"connected to the" : {
"arm bones " : [
{
" fizz" : "buzz",
"well hello" : "there"
}
]
}
}
Above, fields such as "leg bone" and "well hello" are causing issues during processing (even though they are technically legal JSON fields). So I want to scan each field (recursively, or in nested fashion) in the incoming JSON and string replace any whitespace with an underscore ("_"). Hence, the above JSON would be converted into:
{
"leg_bone" : false,
"connected_to__the" : {
"arm_bones_" : [
{
"_fizz" : "buzz",
"well_hello" : "there"
}
]
}
}
Typically I use a JsonSlurper for parsing JSON strings into maps, but I can't seem to figure out how to get the recursion correct. Here's my best attempt so far:
// In reality 'incomingJson' isn't hardcoded as a string literal, but this helps make my actual use case
// an SSCCE.
class JsonMapExperiments {
static void main(String[] args) {
String incomingJson = """
{
"leg bone" : false,
"connected to the" : {
"arm bones " : [
{
" fizz" : "buzz",
"well hello" : "there"
}
]
}
}
"""
String fixedJson = fixWhitespaces(new JsonSlurper().parseText(incomingJson))
println fixedJson
}
static String fixWhitespaces(def jsonMap) {
def fixedMap = [:]
String regex = ""
jsonMap.each { key, value ->
String fixedKey = key.replaceAll('\\s+', '_')
String fixedValue
if(value in Map) {
fixedValue = fixWhitespaces(value)
} else {
fixedValue = value
}
fixedMap[fixedKey] = fixedValue
}
new JsonBuilder(fixedMap).toString()
}
}
When this runs, the final output is:
{"connected_to_the":"{\"arm_bones_\":\"[{ fizz=buzz, well hello=there}]\"}","leg_bone":"false"}
Which is kinda/sorta close, but not exactly what I need. Any ideas?
Given your input and this script:
import groovy.json.JsonBuilder
import groovy.json.JsonSlurper

// Recursively rebuild the parsed tree, replacing whitespace in map keys only.
def fixWhitespacesInTree(def tree) {
    switch (tree) {
        case Map:
            return tree.collectEntries { k, v ->
                [(k.replaceAll('\\s+', '_')): fixWhitespacesInTree(v)]
            }
        case Collection:
            return tree.collect { e -> fixWhitespacesInTree(e) }
        default:
            return tree
    }
}

def fixWhitespacesInJson(def jsonString) {
    def tree = new JsonSlurper().parseText(jsonString)
    def fixedTree = fixWhitespacesInTree(tree)
    new JsonBuilder(fixedTree).toString()
}

// 'json' holds the input JSON string from the question
println fixWhitespacesInJson(json)
I got the following results:
{"connected_to_the":{"arm_bones_":[{"_fizz":"buzz","well_hello":"there"}]},"leg_bone":false}
I would, however, suggest that you change the regular expression \\s+ to just \\s. In the former case, if you have two JSON properties at the same level, one called " fizz" (one leading space) and the other called "  fizz" (two leading spaces), then the translated keys will both be "_fizz" and one will overwrite the other in the final result. In the latter case, the translated keys will be "_fizz" and "__fizz" respectively, and the original content will be preserved.

Comma separated list with Enumerator

I've just started working with Scala in my new project (Scala 2.10.3, Play2 2.2.1, Reactivemongo 0.10.0) and ran into a pretty standard use case: streaming all the users in MongoDB to an external client. After going through the Enumerator and Enumeratee APIs I did not find a solid solution, so I solved it in the following way:
val users = collection.find(Json.obj()).cursor[User].enumerate(Integer.MAX_VALUE, false)
var first:Boolean = true
val indexedUsers = (users.map(u => {
if(first) {
first = false;
Json.stringify(Json.toJson(u))
} else {
"," + Json.stringify(Json.toJson(u))
}
}))
From my point of view this is a bit of a hack, mainly because I need to add the JSON array start, the JSON array end and the comma separators between elements. I was not able to provide it as a pure JSON stream, so I converted it to a String stream.
What is a standard solution for that, using reactivemongo in play?
I wrote a helper function which does what you want to achieve:
def intersperse[E](e: E, enum: Enumerator[E]): Enumerator[E] = new Enumerator[E] {
  val element = Input.El(e)
  override def apply[A](i1: Iteratee[E, A]): Future[Iteratee[E, A]] = {
    var iter = i1
    val loop: Iteratee[E, Unit] = {
      lazy val contStep = Cont(step)
      def step(in: Input[E]): Iteratee[E, Unit] = in match {
        case Input.Empty ⇒ contStep
        case Input.EOF ⇒ Done((), Input.Empty)
        case e @ Input.El(_) ⇒
          iter = Iteratee.flatten(iter.feed(element).flatMap(_.feed(e)))
          contStep
      }
      lazy val contFirst = Cont(firstStep)
      def firstStep(in: Input[E]): Iteratee[E, Unit] = in match {
        case Input.EOF ⇒ Done((), Input.Empty)
        case Input.Empty ⇒
          iter = Iteratee.flatten(iter.feed(in))
          contFirst
        case Input.El(x) ⇒
          iter = Iteratee.flatten(iter.feed(in))
          contStep
      }
      contFirst
    }
    enum(loop).map { _ ⇒ iter }
  }
}
Usage:
val prefix = Enumerator("[")
val suffix = Enumerator("]")
val asStrings = Enumeratee.map[User] { u => Json.stringify(Json.toJson(u)) }
val result = prefix >>> intersperse(",", users &> asStrings) >>> suffix
Ok.chunked(result)
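The comma-separation idea itself does not depend on Play. As a minimal sketch on a plain Scala Iterator (an illustration only, not the Enumerator API; the names IntersperseSketch and jsonArray are made up here), the first element is emitted as-is and every later element is preceded by the separator:

```scala
object IntersperseSketch {
  // Emit the first element unchanged; prefix every later element with `sep`.
  def intersperse[E](sep: E, it: Iterator[E]): Iterator[E] =
    it.zipWithIndex.flatMap {
      case (e, 0) => Iterator(e)
      case (e, _) => Iterator(sep, e)
    }

  // Wrap the separated elements in array brackets, mirroring prefix >>> ... >>> suffix.
  def jsonArray(elems: Iterator[String]): String =
    (Iterator("[") ++ intersperse(",", elems) ++ Iterator("]")).mkString
}
```

For example, IntersperseSketch.jsonArray(Iterator("1", "2", "3")) gives "[1,2,3]", and an empty input gives "[]".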

scala function with repeated parameters

I see that it is possible to use the following syntax for a method that takes a repeated parameter:
def capitalizeAll( args: String*) = {
args.map { args => args.capitalize }
}
However, I was wondering how a function can be used instead of "args => args.capitalize",
for example (this does not work):
def func(s: String): String = { s.capitalize }
def capitalizeAll2( args: String*) = {
args.map { func( args ) }
}
How can I make this work?
Cheers
There is no magic:
def func(s: String): String = { s.capitalize }
def capitalizeAll2( args: String*) = {
args.map { arg => func( arg ) }
}
Here I gave the name arg to the string currently being processed (one of the args strings). Your first example works only because of shadowing: the whole sequence is called args, and the lambda parameter has the same name, so it simply shadows the outer one.
Almost no magic...
def capitalizeAll3( args: String*) = {
args.map(func)
}
The last example uses syntactic sugar: a method that takes a single parameter can be passed to map directly.
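To make the conversion from method to function visible, here is a small self-contained sketch. The object name RepeatedParamsSketch and the value names are made up for illustration. It also shows how an existing collection can be passed to a repeated parameter with `: _*`:

```scala
object RepeatedParamsSketch {
  def func(s: String): String = s.capitalize

  // `func` is a method; passing it to map converts it to a String => String function.
  def capitalizeAll(args: String*): Seq[String] = args.map(func)

  // The same conversion, written out as an explicit function value.
  val asFunction: String => String = func

  // An existing collection is expanded into the repeated parameter with `: _*`.
  val words = List("alpha", "beta")
  val fromList: Seq[String] = capitalizeAll(words: _*)
}
```

Inside capitalizeAll, args is an ordinary Seq[String], which is why map is available on it.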

restart iterator on exceptions in Scala

I have an iterator (actually a Source.getLines) that's reading an infinite stream of data from a URL. Occasionally the iterator throws a java.io.IOException when there is a connection problem. In such situations, I need to re-connect and re-start the iterator. I want this to be seamless so that the iterator just looks like a normal iterator to the consumer, but underneath is restarting itself as necessary.
For example, I'd like to see the following behavior:
scala> val iter = restartingIterator(() => new Iterator[Int]{
var i = -1
def hasNext = {
if (this.i < 3) {
true
} else {
throw new IOException
}
}
def next = {
this.i += 1
i
}
})
res0: ...
scala> iter.take(6).toList
res1: List[Int] = List(0, 1, 2, 3, 0, 1)
I have a partial solution to this problem, but it will fail on some corner cases (e.g. an IOException on the first item after a restart) and it's pretty ugly:
def restartingIterator[T](getIter: () => Iterator[T]) = new Iterator[T] {
var iter = getIter()
def hasNext = {
try {
iter.hasNext
} catch {
case e: IOException => {
this.iter = getIter()
iter.hasNext
}
}
}
def next = {
try {
iter.next
} catch {
case e: IOException => {
this.iter = getIter()
iter.next
}
}
}
}
I keep feeling like there's a better solution to this, maybe some combination of Iterator.continually and util.control.Exception or something like that, but I couldn't figure one out. Any ideas?
This is fairly close to your version and using scala.util.control.Exception:
def restartingIterator[T](getIter: () => Iterator[T]) = new Iterator[T] {
  import util.control.Exception.allCatch
  private[this] var i = getIter()
  private[this] def replace() = i = getIter()
  def hasNext: Boolean = allCatch.opt(i.hasNext).getOrElse{replace(); hasNext}
  def next(): T = allCatch.opt(i.next).getOrElse{replace(); next}
}
This is not tail recursive, because the recursive call sits inside the by-name argument to getOrElse rather than in tail position, but that can be fixed by using a slightly more verbose version:
def restartingIterator2[T](getIter: () => Iterator[T]) = new Iterator[T] {
  import util.control.Exception.allCatch
  private[this] var i = getIter()
  private[this] def replace() = i = getIter()
  @annotation.tailrec def hasNext: Boolean = {
    val v = allCatch.opt(i.hasNext)
    if (v.isDefined) v.get else {replace(); hasNext}
  }
  @annotation.tailrec def next(): T = {
    val v = allCatch.opt(i.next)
    if (v.isDefined) v.get else {replace(); next}
  }
}
Edit: There is a solution with util.control.Exception and Iterator.continually:
def restartingIterator[T](getIter: () => Iterator[T]) = {
import util.control.Exception.allCatch
var iter = getIter()
def f: T = allCatch.opt(iter.next).getOrElse{iter = getIter(); f}
Iterator.continually { f }
}
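Note that allCatch catches essentially every exception, not only connection problems. As a hedged variation (a sketch, not the answer's code; the names RestartSketch, restartingOnIO and retrying are made up here), the same idea can be restricted to IOException with catching, using a loop instead of recursion:

```scala
import java.io.IOException
import scala.util.control.Exception.catching

object RestartSketch {
  def restartingOnIO[T](getIter: () => Iterator[T]): Iterator[T] = new Iterator[T] {
    private var current = getIter()
    private val ioOnly = catching(classOf[IOException])

    // Run `op` on the current iterator; on IOException, reconnect and retry.
    // Assumption: an IOException thrown by getIter() itself is not handled here.
    private def retrying[R](op: Iterator[T] => R): R = {
      var result = ioOnly.opt(op(current))
      while (result.isEmpty) {
        current = getIter()
        result = ioOnly.opt(op(current))
      }
      result.get
    }

    def hasNext: Boolean = retrying(_.hasNext)
    def next(): T = retrying(_.next())
  }
}
```

With the example iterator from the question, take(6).toList on the wrapped iterator yields List(0, 1, 2, 3, 0, 1). The loop retries without limit, so a source that fails on every attempt would spin forever; a retry cap or back-off would be a sensible addition in real use.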
There is a better solution, the Iteratee:
http://apocalisp.wordpress.com/2010/10/17/scalaz-tutorial-enumeration-based-io-with-iteratees/
Here is for example an enumerator that restarts on encountering an exception.
def enumReader[A](r: => BufferedReader, it: IterV[String, A]): IO[IterV[String, A]] = {
  val tmpReader = r
  def loop: IterV[String, A] => IO[IterV[String, A]] = {
    case i@Done(_, _) => IO { i }
    case Cont(k) => for {
      s <- IO { try { val x = tmpReader.readLine; IO(x) }
                catch { case e => enumReader(r, it) }}.join
      a <- if (s == null) k(EOF) else loop(k(El(s)))
    } yield a
  }
  loop(it)
}
The inner loop advances the Iteratee, but the outer function still holds on to the original. Since Iteratee is a persistent data structure, to restart you just have to call the function again.
I'm passing the Reader by name here so that r is essentially a function that gives you a fresh (restarted) reader. In practise you will want to bracket this more effectively (close the existing reader on exception).
Here's an answer that doesn't work, but feels like it should:
def restartingIterator[T](getIter: () => Iterator[T]): Iterator[T] = {
new Traversable[T] {
def foreach[U](f: T => U): Unit = {
try {
for (item <- getIter()) {
f(item)
}
} catch {
case e: IOException => this.foreach(f)
}
}
}.toIterator
}
I think this very clearly describes the control flow, which is great.
This code will throw a StackOverflowError in Scala 2.8.0 because of a bug in Traversable.toStream, but even after the fix for that bug, this code still won't work for my use case because toIterator calls toStream, which means that it will store all items in memory.
I'd love to be able to define an Iterator by just writing a foreach method, but there doesn't seem to be any easy way to do that.