Scala: merging two JSON files using AmazonS3Client getObject Futures - json

I'm trying to merge two JSON files from an S3 bucket. The first file pulls fine, but the second one does not.
val eventLogJsonFuture = Future(new AmazonS3Client(credentials))
  .map(_.getObject(logBucket, logDirectory + "/" + id + "/event_log.json"))
  .map(_.getObjectContent)
  .map(Source.fromInputStream(_))
  .map(_.mkString)
  .map(Json.parse) map { archiveEvents =>
    Json.toJson(Json.obj("success" -> true, "data" -> archiveEvents))
  } recover {
    case NonFatal(error) =>
      Json.obj("success" -> false, "errorCode" -> "archive_does_not_exist", "message" -> error.getMessage)
  }

val infoJsonFuture = Future(new AmazonS3Client(credentials))
  .map(_.getObject(logBucket, logDirectory + "/" + id + "/info.json"))
  .map(_.getObjectContent)
  .map(Source.fromInputStream(_))
  .map(_.mkString)
  .map(Json.parse) map { archiveInfo =>
    Json.toJson(Json.obj("success" -> true, "data" -> archiveInfo))
  } recover {
    case NonFatal(error) =>
      Json.obj("success" -> false, "errorCode" -> "archive_does_not_exist", "message" -> error.getMessage)
  }

val combinedJson = for {
  eventLogJson <- eventLogJsonFuture
  infoJson <- infoJsonFuture
} yield {
  Json.obj("info" -> infoJson, "events" -> eventLogJson)
}
This is what the result JSON looks like ...
Is there another (better?) way of writing this?

Do you really have to wait for 3 parts of JSON from different sources?
I can recommend a solution with case class DTOs.
Simple example:
val firstJson = Future {
  //case class json1(...)
}
val secondJson = Future {
  //case class json2(...)
  ...
}
val finalJson = for {
  f <- firstJson
  s <- secondJson
} yield (f, s)

finalJson onComplete {
  case Success(jsons) =>
    //merge json here
    jsons._1 + jsons._2 ...
  case Failure(error) =>
    //handle the failure here
}
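Coming back to the original question ("is there another, better way?"): one option is to factor the duplicated S3 pipeline into a helper and keep the for-comprehension only for combining. This is a minimal sketch, assuming the same credentials, logBucket, logDirectory, id and imports as in the question; fetchJson is a name introduced here for illustration, not part of the original code:

def fetchJson(key: String): Future[JsValue] =
  Future(new AmazonS3Client(credentials))
    .map(_.getObject(logBucket, key).getObjectContent)
    .map(Source.fromInputStream(_).mkString)
    .map(Json.parse)
    .map(json => Json.obj("success" -> true, "data" -> json))
    .recover {
      case NonFatal(error) =>
        Json.obj("success" -> false, "errorCode" -> "archive_does_not_exist", "message" -> error.getMessage)
    }

val eventLogJsonFuture = fetchJson(logDirectory + "/" + id + "/event_log.json")
val infoJsonFuture     = fetchJson(logDirectory + "/" + id + "/info.json")

val combinedJson: Future[JsObject] = for {
  eventLogJson <- eventLogJsonFuture
  infoJson     <- infoJsonFuture
} yield Json.obj("info" -> infoJson, "events" -> eventLogJson)

Note that both futures are still started before the for-comprehension, so the two downloads run concurrently, as in the original code.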

Related

Saving JsObject into DynamoDB

We want to save our data models into DynamoDB. We use scanamo with alpakka for nonblocking I/O.
For various reasons, we don't want the keys and data to be auto-generated into the Dynamo format. We already have Play-Json formatters for all our case classes and want the data to be saved to Dynamo from JsObjects.
For saving the data as a JsObject, each repository has the following:
import com.gu.scanamo.Table
val table = Table[JsObject](name)
I always end up receiving this error:
could not find implicit value for evidence parameter of type com.gu.scanamo.DynamoFormat[play.api.libs.json.JsObject]
I can't find a way to make it accept JsObject, or to create a formatter that fits.
I would much appreciate any help.
Side note: I've looked at PlayDynamo-Repo, but they actually create the whole request from scratch, and we'd like to use scanamo's API.
I ended up using the following code, which works just as expected. I cannot share the sub-functions, but it should give the general idea.
import scala.collection.JavaConverters._
import com.amazonaws.services.dynamodbv2.model.AttributeValue
import com.gu.scanamo.DynamoFormat
import com.gu.scanamo.error.DynamoReadError
import play.api.libs.json._

implicit val dynamoFormat: DynamoFormat[JsValue] = new DynamoFormat[JsValue] {

  // fromStringAttributeValue, stringToAttributeValueString and traverse are the
  // sub-functions mentioned above that cannot be shared here
  override def read(av: AttributeValue): Either[DynamoReadError, JsValue] = {
    Option(av.getS).map {
      fromStringAttributeValue
    } orElse Option(av.getN).map { n =>
      Right(JsNumber(BigDecimal.apply(n)))
    } orElse Option(av.getBOOL).map { b =>
      Right(JsBoolean(b))
    } orElse Option(av.isNULL).map { _ =>
      Right(JsNull)
    } orElse Option(av.getSS).map { ss =>
      Right(JsArray(ss.asScala.map(JsString.apply)))
    } orElse Option(av.getNS).map { ns =>
      Right(JsArray(ns.asScala.map(n => JsNumber(BigDecimal(n)))))
    } orElse Option(av.getL).map { l =>
      traverse(l.asScala.toList)(read).right.map(JsArray.apply)
    } orElse Option(av.getM).map { m =>
      traverse(m.asScala) {
        case (k, v) => read(v).right.map(j => k -> j)
      }.right.map(values => JsObject(values.toMap))
    } getOrElse {
      Left(YOUR_ERROR_HERE)
    }
  }

  override def write(t: JsValue): AttributeValue = {
    val res = new AttributeValue()
    t match {
      case JsNumber(n)  => res.setN(n.toString())
      case JsBoolean(b) => res.setBOOL(b)
      case JsString(s)  => res.setS(stringToAttributeValueString(s))
      case a: JsArray   => res.setL(a.value.map(write).asJava)
      case o: JsObject  => res.setM(o.value.mapValues(write).asJava)
      case JsNull       => res.setNULL(true)
    }
    res
  }
}
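For completeness, here is a rough usage sketch with the format above in scope, assuming the repositories switch to Table[JsValue]. The table name, item and client below are illustrative only, not taken from the original post; ScanamoAlpakka would be the non-blocking counterpart of the blocking Scanamo.exec shown here:

import com.amazonaws.services.dynamodbv2.AmazonDynamoDB
import com.gu.scanamo.{Scanamo, Table}
import play.api.libs.json.{JsValue, Json}

val dynamoClient: AmazonDynamoDB = ???     // assumed to be provided elsewhere
val table = Table[JsValue]("my-table")     // hypothetical table name; picks up the implicit DynamoFormat[JsValue]
val item: JsValue = Json.obj("id" -> "42", "payload" -> Json.obj("a" -> 1))

Scanamo.exec(dynamoClient)(table.put(item))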

Handle exception for Scala Async and Future variable, error accessing variable name outside try block

@scala.throws[scala.Exception]
def processQuery(searchQuery : scala.Predef.String) : scala.concurrent.Future[io.circe.Json] = { /* compiled code */ }
How do I declare the searchResult variable at line 3 so that it can be initialized inside the try block and then processed after and outside the try block if it succeeds? Or is there another way to handle the exception? The file containing the processQuery function is read-only for me, so I cannot edit it.
def index = Action.async { implicit request =>
  val query = request.body.asText.get
  var searchResult : scala.concurrent.Future[io.circe.Json] = Future[io.circe.Json] //line 3
  var jsonVal = ""
  try {
    searchResult = search.processQuery(query)
  } catch {
    case e: Throwable => jsonVal = e.getMessage
  }
  searchResult onSuccess ({
    case result => jsonVal = result.toString()
  })
  searchResult.map { result =>
    Ok(Json.parse(jsonVal))
  }
}
If it is declared the way shown above, it produces a compilation error.
Would using the recover method help you? I also suggest avoiding var and using a more functional approach if possible. In my world (and with the Play JSON library), I would hope to get to something like:
def index = Action.async { implicit request =>
  processQuery(request.body.asText.get).map { json =>
    Ok(Json.obj("success" -> true, "result" -> json))
  }.recover {
    case e: Throwable => Ok(Json.obj("success" -> false, "message" -> e.getMessage))
  }
}
It may still be necessary to wrap the code in an outer try/catch, since processQuery can throw before it returns a Future:
try {
  processQuery....
  ...
} catch {
  ...
}
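A hedged sketch of combining both ideas, for the case where processQuery throws synchronously before it even returns a Future. It assumes Scala 2.12+ (for Future#flatten), an implicit ExecutionContext in the controller, and re-parses the circe result with Play JSON purely for the response:

import scala.concurrent.Future
import scala.util.Try
import scala.util.control.NonFatal
import play.api.libs.json.Json

def index = Action.async { implicit request =>
  val query = request.body.asText.getOrElse("")
  Future.fromTry(Try(search.processQuery(query)))   // catches a synchronous throw
    .flatten                                         // Future[Future[io.circe.Json]] -> Future[io.circe.Json]
    .map(json => Ok(Json.parse(json.noSpaces)))      // json is io.circe.Json; noSpaces renders it as a String
    .recover {
      case NonFatal(e) => Ok(Json.obj("success" -> false, "message" -> e.getMessage))
    }
}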
Here is a way to validate the incoming JSON and fold on the result of the validation:
def returnToNormalPowerPlant(id: Int) = Action.async(parse.tolerantJson) { request =>
  request.body.validate[ReturnToNormalCommand].fold(
    errors => {
      Future.successful {
        BadRequest(
          Json.obj("status" -> "error", "message" -> JsError.toJson(errors))
        )
      }
    },
    returnToNormalCommand => {
      actorFor(id) flatMap {
        case None =>
          Future.successful {
            NotFound(s"HTTP 404 :: PowerPlant with ID $id not found")
          }
        case Some(actorRef) =>
          sendCommand(actorRef, id, returnToNormalCommand)
      }
    }
  )
}

Scala-Slick: only part of sequence of actions executed

I have a piece of code that adds or updates a Product and also associates one or more Tags with it. Tags are actually added to a TagGroup, and that TagGroup is associated with the Product.
The issue I am facing is that only part of addOrUpdateProductWithTags() executes: the Product is updated or created, but the Tags are not added. If I comment out the last query (see the comment in the code), everything works. I have turned on logging to confirm this.
lazy val pRetId = prods returning prods.map(_.id)

def addTags(keywords: Seq[String]) = {
  for {
    k <- keywords
  } yield {
    tags.filter(_.keyword === k).take(1).result.headOption.flatMap {
      case Some(tag) => {
        Logger.debug("Using existing tag: " + k)
        DBIO.successful(tag.id)
      }
      case None => {
        Logger.debug("Adding new tag: " + k)
        tags.returning(tags.map(_.id)) += Tag(k, Some("DUMMY"))
      }
    }
  }
}

def addOrUpdateProductWithTags(prod: Product, tagSet: Seq[String]): Future[Option[Long]] = {
  // handle add or update product
  val prodObject = prod.id match {
    case 0L => pRetId += prod
    case _ => prods.withFilter(_.id === prod.id).update(prod)
  }
  val action = for {
    pid <- prodObject
    tids <- DBIO.sequence(addTags(tagSet))
  } yield (tids, pid)
  val finalAction = action.flatMap {
    case (tids, pid) => {
      val prodId = if (prod.id > 0L) prod.id else pid.asInstanceOf[Number].longValue
      val delAction = tagGroups.filter(_.prodId === prodId).delete
      val tgAction = for {
        tid <- tids
      } yield {
        tagGroups += TagGroup("Ignored-XX", prodId, tid)
      }
      delAction.flatMap { x => DBIO.sequence(tgAction) }
      // IF LINE BELOW IS COMMENTED THEN TagGroup is created else even delete above doesn't happen
      prods.filter(_.id === prodId).map(_.id).result.headOption
    }
  }
  db.run(finalAction.transactionally)
}
This is the snippet in the controller from which this method is called. My suspicion is that the caller doesn't wait long enough, but I'm not sure...
val prod = Prod(...)
val tagSet = generateTags(prod.tags)
val add = prodsService.addOrUpdateProductWithTags(prod, tagSet)
add.map { value =>
  Redirect(controllers.www.routes.Dashboard.dashboard)
}.recover {
  case th =>
    InternalServerError("bad things happen in life: " + th)
}
Any clue what's wrong with the query?
Stack: Scala 2.11.7, play version 2.5.4, play-slick 2.0.0 (slick 3.1)
I finally figured out a solution. In place of the following two lines:
delAction.flatMap { x => DBIO.sequence(tgAction) }
prods.filter(_.id === prodId).map(_.id).result.headOption
I combined the actions with the andThen (>>) operator as follows:
delAction >> DBIO.sequence(tgAction) >> prods.filter(_.id === prodId).map(_.id).result.headOption
Now the entire sequence gets executed. I still don't know what's wrong with the original solution, but this works. (Most likely the problem is that the result of delAction.flatMap { x => DBIO.sequence(tgAction) } was never used: the last expression of the flatMap block, the prods query, was the only action actually returned and composed into finalAction, so the delete and the TagGroup inserts were silently dropped.)
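For readability, the same fix can also be written as a for-comprehension instead of chained >> operators; a sketch assuming the same delAction, tgAction, prods and prodId definitions as above:

val combined = for {
  _  <- delAction                 // delete the existing TagGroup rows
  _  <- DBIO.sequence(tgAction)   // insert the new TagGroup rows
  id <- prods.filter(_.id === prodId).map(_.id).result.headOption
} yield id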

Play Scala - confused about the result type of Action.async

I'm a little bit confused about the expected result of Action.async. Here is the use case: from the frontend, I receive a JSON payload to validate (a Foo); I send this data by calling another web service, then extract the received JSON (a Bar case class), which I also want to validate. The problem is that when I return a result, I get the following error:
type mismatch;
found : Object
required: scala.concurrent.Future[play.api.mvc.Result]
Here is my code:
case class Foo(id : String)
case class Bar(id : String)

def create() = {
  Action.async(parse.json) { request =>
    val sessionTokenOpt : Option[String] = request.headers.get("sessionToken")
    val sessionToken : String = "Bearer " + (sessionTokenOpt match {
      case None => throw new NoSessionTokenFound
      case Some(session) => session
    })
    val user = ""
    val structureId : Option[String] = request.headers.get("structureId")
    if (sessionToken.isEmpty) {
      Future.successful(BadRequest("no token"))
    } else {
      val url = config.getString("createURL").getOrElse("")
      request.body.validate[Foo].map { f =>
        Logger.debug("sessionToken = " + sessionToken)
        Logger.debug(f.toString)
        val data = Json.toJson(f)
        val holder = WS.url(url)
        val complexHolder =
          holder.withHeaders(("Content-type","application/json"),("Authorization",(sessionToken)))
        Logger.debug("url = " + url)
        Logger.debug(complexHolder.headers.toString)
        Logger.debug((Json.prettyPrint(data)))
        val futureResponse = complexHolder.put(data)
        futureResponse.map { response =>
          if (response.status == 200) {
            response.json.validate[Bar].map { b =>
              Future.successful(Ok(Json.toJson(b)))
            }.recoverTotal { e: JsError =>
              Future.successful(BadRequest("The JSON in the body is not valid."))
            }
          } else {
            Logger.debug("status from apex " + response.status)
            Future.successful(BadRequest("alo"))
          }
        }
        Await.result(futureResponse, 5.seconds)
      }.recoverTotal { e: JsError =>
        Future.successful(BadRequest("The JSON in the body is not valid."))
      }
    }
  }
}
What is wrong in my function?
Firstly, this is doing nothing:
futureResponse.map { response =>
  if (response.status == 200) {
    response.json.validate[Bar].map { b =>
      Future.successful(Ok(Json.toJson(b)))
    }.recoverTotal { e: JsError =>
      Future.successful(BadRequest("The JSON in the body is not valid."))
    }
  } else {
    Logger.debug("status from apex " + response.status)
    Future.successful(BadRequest("alo"))
  }
}
Because you're not capturing or assigning the result of it to anything. It's equivalent to doing this:
val foo = "foo"
foo + " bar"
println(foo)
The foo + " bar" statement there is pointless, it achieves nothing.
Now to debug type inference problems, what you need to do is assign results to things, and annotate with the types you're expecting. So, assign the result of the map to something first:
val newFuture = futureResponse.map {
...
}
Now, what is the type of newFuture? The answer is actually Future[Future[Result]], because you're using map and then returning a future from inside it. If you want to return a future inside your map function, you have to use flatMap instead; this flattens the Future[Future[Result]] to a Future[Result]. But in your case you don't actually need that: you can use map and just get rid of all those Future.successful calls, because nothing in that map function needs to return a future.
And then get rid of that await as others have said - using await means blocking, which negates the point of using futures in the first place.
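As a quick, standalone illustration of the map/flatMap distinction described above (using plain Ints instead of Results; this snippet is not part of the original code):

import scala.concurrent.Future
import scala.concurrent.ExecutionContext.Implicits.global

val f: Future[Int] = Future.successful(1)

val nested: Future[Future[Int]] = f.map(n => Future.successful(n + 1))     // map keeps the inner Future
val flat:   Future[Int]         = f.flatMap(n => Future.successful(n + 1)) // flatMap flattens it
val simple: Future[Int]         = f.map(n => n + 1)                        // no inner Future needed at all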
Anyway, this should compile:
def create() = {
  Action.async(parse.json) { request =>
    val sessionTokenOpt : Option[String] = request.headers.get("sessionToken")
    val sessionToken : String = "Bearer " + (sessionTokenOpt match {
      case None => throw new NoSessionTokenFound
      case Some(session) => session
    })
    val user = ""
    val structureId : Option[String] = request.headers.get("structureId")
    if (sessionToken.isEmpty) {
      Future.successful(BadRequest("no token"))
    } else {
      val url = config.getString("createURL").getOrElse("")
      request.body.validate[Foo].map { f =>
        Logger.debug("sessionToken = " + sessionToken)
        Logger.debug(f.toString)
        val data = Json.toJson(f)
        val holder = WS.url(url)
        val complexHolder =
          holder.withHeaders(("Content-type","application/json"),("Authorization",(sessionToken)))
        Logger.debug("url = " + url)
        Logger.debug(complexHolder.headers.toString)
        Logger.debug((Json.prettyPrint(data)))
        val futureResponse = complexHolder.put(data)
        futureResponse.map { response =>
          if (response.status == 200) {
            response.json.validate[Bar].map { b =>
              Ok(Json.toJson(b))
            }.recoverTotal { e: JsError =>
              BadRequest("The JSON in the body is not valid.")
            }
          } else {
            Logger.debug("status from apex " + response.status)
            BadRequest("alo")
          }
        }
      }.recoverTotal { e: JsError =>
        Future.successful(BadRequest("The JSON in the body is not valid."))
      }
    }
  }
}
Do not Await.result(futureResponse, 5 seconds). Just return the futureResponse as is. Action.async can deal with it (in fact, it wants to deal with it; it requires you to return a Future).
Note that in your various other codepaths (else, recoverTotal) you are already doing that.
If you use Action.async you don't need to await the result, so try to return the future as is, without Await.result.

How to yield a JSON object from a for loop in scala?

for (character <- content) {
  if (character == '\n') {
    val current_line = line.mkString
    line.clear()
    current_line match {
      case docStartRegex(_*) => {
        startDoc = true
        endText = false
        endDoc = false
      }
      case docnoRegex(group) => {
        docID = group.trim
      }
      case docTextStartRegex(_*) => {
        startText = true
      }
      case docTextEndRegex(_*) => {
        endText = true
        startText = false
      }
      case docEndRegex(_*) => {
        endDoc = true
        startDoc = false
        es_json = Json.obj(
          "_index" -> "ES_SPARK_AP",
          "_type" -> "document",
          "_id" -> docID,
          "_source" -> Json.obj(
            "text" -> textChunk.mkString(" ")
          )
        )
        // yield es_json
        textChunk.clear()
      }
      case _ => {
        if (startDoc && !endDoc && startText) {
          textChunk += current_line.trim
        }
      }
    }
  } else {
    line += character
  }
}
The above for-loop parses through a text file and creates a JSON object for each chunk parsed in the loop. This JSON will then be sent to Elasticsearch for further processing. In Python, we can yield the JSON and use a generator easily, like:
def func():
for i in range(num):
... some computations ...
yield {
JSON ## JSON is yielded
}
for json in func(): ## we parse through the generator here.
process(json)
I cannot understand how I can use yield in a similar fashion in Scala.
If you want lazy returns, Scala does this using Iterator types. Specifically, if you want to handle values line by line, I'd split the content into lines first with .lines:
val content: String = ???
val results: Iterator[Json] =
  for {
    line <- content.lines   // content.lines is already an Iterator[String]
  } yield {
    line match {
      case docEndRegex(_*) => ...
    }
  }
You can also use a function directly
def toJson(line: String): Json =
  line match {
    case "hi" => Json.obj("line" -> "hi")
    case "bye" => Json.obj("what" -> "a jerk")
  }
val results: Iterator[Json] =
  for {
    line <- content.lines
  } yield toJson(line)
This is equivalent to doing
content.lines.map(line => toJson(line))
Or, somewhat equivalently, in Python:
lines = (line.strip() for line in content.split("\n"))
jsons = (toJson(line) for line in lines)
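Consuming the lazy Iterator then mirrors the Python "for json in func()" loop; nothing is computed until the iterator is traversed. A small sketch, where process is a hypothetical stand-in for whatever sends each document to Elasticsearch:

// results: Iterator[Json] from the snippets above
results.foreach { json =>
  process(json)   // e.g. an Elasticsearch indexing call; not part of the original code
}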