Scala Convert a string into a map - json

What is the fastest way to convert this
{"a":"ab","b":"cd","c":"cd","d":"de","e":"ef","f":"fg"}
into mutable map in scala ? I read this input string from ~500MB file. That is the reason I'm concerned about speed.

If your JSON is as simple as in your example, i.e. a sequence of key/value pairs, where each value is a string. You can do in plain Scala :
myString.substring(1, myString.length - 1)
.split(",")
.map(_.split(":"))
.map { case Array(k, v) => (k.substring(1, k.length-1), v.substring(1, v.length-1))}
.toMap

That looks like a JSON file, as Andrey says. You should consider this answer. It gives some example Scala code. Also, this answer gives some different JSON libraries and their relative merits.

The fastest way to read tree data structures in XML or JSON is by applying streaming API: Jackson Streaming API To Read And Write JSON.
Streaming would split your input into tokens like 'beginning of an object' or 'beginning of an array' and you would need to build a parser for these token, which in some cases is not a trivial task.

Keeping it simple. If reading a json string from file and converting to scala map
import spray.json._
import DefaultJsonProtocol._
val jsonStr = Source.fromFile(jsonFilePath).mkString
val jsonDoc=jsonStr.parseJson
val map_doc=jsonDoc.convertTo[Map[String, JsValue]]
// Get a Map key value
val key_value=map_doc.get("key").get.convertTo[String]
// If nested json, re-map it.
val key_map=map_doc.get("nested_key").get.convertTo[Map[String, JsValue]]
println("Nested Value " + key_map.get("key").get)

Related

GCP Proto Datastore encode JsonProperty in base64

I store a blob of Json in the datastore using JsonProperty.
I don't know the structure of the json data.
I am using endpoints proto datastore in order to retrieve my data.
The probleme is the json property is encoded in base64 and I want a plain json object.
For the example, the json data will be:
{
first: 1,
second: 2
}
My code looks something like:
import endpoints
from google.appengine.ext import ndb
from protorpc import remote
from endpoints_proto_datastore.ndb import EndpointsModel
class Model(EndpointsModel):
data = ndb.JsonProperty()
#endpoints.api(name='myapi', version='v1', description='My Sample API')
class DataEndpoint(remote.Service):
#Model.method(path='mymodel2', http_method='POST',
name='mymodel.insert')
def MyModelInsert(self, my_model):
my_model.data = {"first": 1, "second": 2}
my_model.put()
return my_model
#Model.method(path='mymodel/{entityKey}',
http_method='GET',
name='mymodel.get')
def getMyModel(self, model):
print(model.data)
return model
API = endpoints.api_server([DataEndpoint])
When I call the api for getting a model, I get:
POST /_ah/api/myapi/v1/mymodel2
{
"data": "eyJzZWNvbmQiOiAyLCAiZmlyc3QiOiAxfQ=="
}
where eyJzZWNvbmQiOiAyLCAiZmlyc3QiOiAxfQ== is the base64 encoded of {"second": 2, "first": 1}
And the print statement give me: {u'second': 2, u'first': 1}
So, in the method, I can explore the json blob data as a python dict.
But, in the api call, the data is encoded in base64.
I expeted the api call to give me:
{
'data': {
'second': 2,
'first': 1
}
}
How can I get this result?
After the discussion in the comments of your question, let me share with you a sample code that you can use in order to store a JSON object in Datastore (it will be stored as a string), and later retrieve it in such a way that:
It will show as plain JSON after the API call.
You will be able to parse it again to a Python dict using eval.
I hope I understood correctly your issue, and this helps you with it.
import endpoints
from google.appengine.ext import ndb
from protorpc import remote
from endpoints_proto_datastore.ndb import EndpointsModel
class Sample(EndpointsModel):
column1 = ndb.StringProperty()
column2 = ndb.IntegerProperty()
column3 = ndb.StringProperty()
#endpoints.api(name='myapi', version='v1', description='My Sample API')
class MyApi(remote.Service):
# URL: .../_ah/api/myapi/v1/mymodel - POSTS A NEW ENTITY
#Sample.method(path='mymodel', http_method='GET', name='Sample.insert')
def MyModelInsert(self, my_model):
dict={'first':1, 'second':2}
dict_str=str(dict)
my_model.column1="Year"
my_model.column2=2018
my_model.column3=dict_str
my_model.put()
return my_model
# URL: .../_ah/api/myapi/v1/mymodel/{ID} - RETRIEVES AN ENTITY BY ITS ID
#Sample.method(request_fields=('id',), path='mymodel/{id}', http_method='GET', name='Sample.get')
def MyModelGet(self, my_model):
if not my_model.from_datastore:
raise endpoints.NotFoundException('MyModel not found.')
dict=eval(my_model.column3)
print("This is the Python dict recovered from a string: {}".format(dict))
return my_model
application = endpoints.api_server([MyApi], restricted=False)
I have tested this code using the development server, but it should work the same in production using App Engine with Endpoints and Datastore.
After querying the first endpoint, it will create a new Entity which you will be able to find in Datastore, and which contains a property column3 with your JSON data in string format:
Then, if you use the ID of that entity to retrieve it, in your browser it will show the string without any strange encoding, just plain JSON:
And in the console, you will be able to see that this string can be converted to a Python dict (or also a JSON, using the json module if you prefer):
I hope I have not missed any point of what you want to achieve, but I think all the most important points are covered with this code: a property being a JSON object, store it in Datastore, retrieve it in a readable format, and being able to use it again as JSON/dict.
Update:
I think you should have a look at the list of available Property Types yourself, in order to find which one fits your requirements better. However, as an additional note, I have done a quick test working with a StructuredProperty (a property inside another property), by adding these modifications to the code:
#Define the nested model (your JSON object)
class Structured(EndpointsModel):
first = ndb.IntegerProperty()
second = ndb.IntegerProperty()
#Here I added a new property for simplicity; remember, StackOverflow does not write code for you :)
class Sample(EndpointsModel):
column1 = ndb.StringProperty()
column2 = ndb.IntegerProperty()
column3 = ndb.StringProperty()
column4 = ndb.StructuredProperty(Structured)
#Modify this endpoint definition to add a new property
#Sample.method(request_fields=('id',), path='mymodel/{id}', http_method='GET', name='Sample.get')
def MyModelGet(self, my_model):
if not my_model.from_datastore:
raise endpoints.NotFoundException('MyModel not found.')
#Add the new nested property here
dict=eval(my_model.column3)
my_model.column4=dict
print(json.dumps(my_model.column3))
print("This is the Python dict recovered from a string: {}".format(dict))
return my_model
With these changes, the response of the call to the endpoint looks like:
Now column4 is a JSON object itself (although it is not printed in-line, I do not think that should be a problem.
I hope this helps too. If this is not the exact behavior you want, maybe should play around with the Property Types available, but I do not think there is one type to which you can print a Python dict (or JSON object) without previously converting it to a String.

Not able to convert Scala object to JSON String

I am using Play Framework and I am trying to convert a Scala object to a JSON string.
Here is my code where I get my object:
val profile: Future[List[Profile]] = profiledao.getprofile(profileId);
The object is now in the profile value.
Now I want to convert that profile object which is a Future[List[Profile]] to JSON data and then convert that data into a JSON string then write into a file.
Here is the code that I wrote so far:
val jsondata = Json.toJson(profile)
Jackson.toJsonString(jsondata)
This is how I am trying to convert into JSON data but it is giving me the following output:
{"empty":false,"traversableAgain":true}
I am using the Jackson library to do the conversion.
Can someone help me with this ?
Why bother with Jackson? If you're using Play, you have play-json available to you, which uses Jackson under the hood FWIW:
First, you need an implicit Reads to let play-json know how to serialize Profile. If Profile is a case class, you can do this:
import play.api.libs.json._
implicit val profileFormat = Json.format[Profile]
If not, define your own Reads like this.
Then since getprofile (which should follow convention and be getProfile) returns Future[List[Profile]], you can do this to get a JsValue:
val profilesJson = profiledao.getprofile(profileId).map(toJson)
(profiledao should also be profileDao.)
In the end, you can wrap this in a Result like Ok and return that from your controller.

Parsing the JSON representation of database rows in Scala.js

I am trying out scala.js, and would like to use simple extracted row data in json format from my Postgres database.
Here is an contrived example of the type of json I would like to parse into some strongly typed scala collection, features to note are that there are n rows, various column types including an array just to cover likely real life scenarios, don't worry about the SQL which creates an inline table to extract the JSON from, I've included it for completeness, its the parsing of the JSON in scala.js that is causing me problems
with inline_t as (
select * from (values('One',1,True,ARRAY[1],1.0),
('Six',6,False,ARRAY[1,2,3,6],2.4494),
('Eight',8,False,ARRAY[1,2,4,8],2.8284)) as v (as_str,as_int,is_odd,factors,sroot))
select json_agg(t) from inline_t t;
[{"as_str":"One","as_int":1,"is_odd":true,"factors":[1],"sroot":1.0},
{"as_str":"Six","as_int":6,"is_odd":false,"factors":[1,2,3,6],"sroot":2.4494},
{"as_str":"Eight","as_int":8,"is_odd":false,"factors":[1,2,4,8],"sroot":2.8284}]
I think this should be fairly easy using something like upickle or prickle as hinted at here: How to parse a json string to a case class in scaja.js and vice versa but I haven't been able to find a code example, and I'm not up to speed enough with Scala or Scala.js to work it out myself. I'd be very grateful if someone could post some working code to show how to achieve the above
EDIT
This is the sort of thing I've tried, but I'm not getting very far
val jsparsed = scala.scalajs.js.JSON.parse(jsonStr3)
val jsp1 = jsparsed.selectDynamic("1")
val items = jsp1.map{ (item: js.Dynamic) =>
(item.as_str, item.as_int, item.is_odd, item.factors, item.sroot)
.asInstanceOf[js.Array[(String, Int, Boolean, Array[Int], Double)]].toSeq
}
println(items._1)
So you are in a situation where you actually want to manipulate JSON values. Since you're not serializing/deserializing Scala values from end-to-end, serialization libraries like uPickle or Prickle will not be very helpful to you.
You could have a look at a cross-platform JSON manipulation library, such as circe. That would give you the advantage that you wouldn't have to "deal with JavaScript data structures" at all. Instead, the library would parse your JSON and expose it as a Scala data structure. This is probably the best option if you want your code to also cross-compile.
If you're only writing Scala.js code, and you want a more lightweight version (no dependency), I recommend declaring types for your JSON "schema", then use those types to perform the conversion in a safer way:
import scala.scalajs.js
import scala.scalajs.js.annotation._
// type of {"as_str":"Six","as_int":6,"is_odd":false,"factors":[1,2,3,6],"sroot":2.4494}
#ScalaJSDefined
trait Row extends js.Object {
val as_str: String
val as_int: Int
val is_odd: Boolean
val factors: js.Array[Int]
val sroot: Double
}
type Rows = js.Array[Row]
val rows = js.JSON.parse(jsonStr3).asInstanceOf[Rows]
val items = (for (row <- rows) yield {
import row._
(as_str, as_int, is_odd, factors.toArray, sroot)
}).toSeq

How can I convert a json string to a scala map?

I have a nested json whose structure is not defined. It can be different each time I run since I am reading from a remote file. I need to convert this json into a map of type Map[String, Any]. I tried to look into json4s and jackson parsers but they don't seem to solve this issue I have.
Does anyone know how I can achieve this?
Example string:
{"body":{
"method":"string",
"events":"string",
"clients":"string",
"parameter":"string",
"channel":"string",
"metadata":{
"meta1":"string",
"meta2":"string",
"meta3":"string"
}
},
"timestamp":"string"}
The level of nesting can be arbitrary and not predefined.
To help with the use case:
I have a Map[String,Any] which I need to store in a file as backup. So I convert it to a json string and store it in a file. Now everytime I get new data, I need to get the json from the file, convert it to a map again and perform some computation. I cannot store the map in memory since I would lose that if my job fails.
I need a solution that would convert the json string back to the original map I had before i converted it.
I tried the following method with json4s 3.2.11 and it works:
import org.json4s._
import org.json4s.jackson.JsonMethods._
//...
def jsonStrToMap(jsonStr: String): Map[String, Any] = {
implicit val formats = org.json4s.DefaultFormats
parse(jsonStr).extract[Map[String, Any]]
}
Maybe you didn't define the implicit val of type Formats? Note also that you don't need to have an implicit val within every and each method as long as it's findable in the scope.
You can use the following code to parse a JSON string into a Map[String, Any]
import org.json4s._
import org.json4s.jackson.JsonMethods._
val jsonMap = parse(jsonString).values.asInstanceOf[Map[String, Any]]
However, this is not typesafe and hence should be used with caution when extracting values from the map.

Convert scala list to Json object

I want to convert a scala list of strings, List[String], to an Json object.
For each string in my list I want to add it to my Json object.
So that it would look like something like this:
{
"names":[
{
"Bob",
"Andrea",
"Mike",
"Lisa"
}
]
}
How do I create an json object looking like this, from my list of strings?
To directly answer your question, a very simplistic and hacky way to do it:
val start = """"{"names":[{"""
val end = """}]}"""
val json = mylist.mkString(start, ",", end)
However, what you almost certainly want to do is pick one of the many JSON libraries out there: play-json gets some good comments, as does lift-json. At the worst, you could just grab a simple JSON library for Java and use that.
Since I'm familiar with lift-json, I'll show you how to do it with that library.
import net.liftweb.json.JsonDSL._
import net.liftweb.json.JsonAST._
import net.liftweb.json.Printer._
import net.liftweb.json.JObject
val json: JObject = "names" -> List("Bob", "Andrea", "Mike", "Lisa")
println(json)
println(pretty(render(json)))
The names -> List(...) expression is implicitly converted by the JsonDSL, since I specified that I wanted it to result in a JObject, so now json is the in-memory model of the json data you wanted.
pretty comes from the Printer object, and render comes from the JsonAST object. Combined, they create a String representation of your data, which looks like
{
"names":["Bob","Andrea","Mike","Lisa"]
}
Be sure to check out the lift documentation, where you'll likely find answers to any further questions about lift's json support.