How to parse JSON in Scala

Here is my code (pure Scala):
package stock

import scala.io.Source._
import scala.util.parsing.json._

object StockDetails {
  def main(args: Array[String]): Unit = {
    val filePath = new java.io.File(".").getCanonicalPath
    val source = scala.io.Source.fromFile(filePath + "/stock/file.txt")
    // file.txt contains:
    // line 1 --> GOOG - 50, MS - 10
    // line 2 --> SGI - 100, GOOG - 50, MS - 10
    // line 3 --> GOOG - 100, AMZN - 90, MS - 80
    for (line <- source.getLines()) {
      val txt = line.split(",")
      val s1 = txt.map { ss =>
        val s2 = ss.split(" - ")(0).trim()
        val holder = fromURL("http://finance.google.com/finance/info?client=ig&q=" + s2).mkString
        println("holder===========", holder)
        val s3 = holder.split("//")(1)
        println("s3================", s3)
        val s4 = JSON.parseFull(s3).get
        println("s4==========================", s4)
      }
    }
  }
}
Output:
(holder==========================,
// [
{
"id": "660479"
,"t" : "MS"
,"e" : "NYSE"
,"l" : "43.06"
,"l_fix" : "43.06"
,"l_cur" : "43.06"
,"s": "0"
,"ltt":"4:02PM EST"
,"lt" : "Dec 23, 4:02PM EST"
,"lt_dts" : "2016-12-23T16:02:12Z"
,"c" : "+0.27"
,"c_fix" : "0.27"
,"cp" : "0.63"
,"cp_fix" : "0.63"
,"ccol" : "chg"
,"pcls_fix" : "42.79"
}
])
(s3================,
[{
"id": "660479"
,"t" : "MS"
,"e" : "NYSE"
,"l" : "43.06"
,"l_fix" : "43.06"
,"l_cur" : "43.06"
,"s": "0"
,"ltt":"4:02PM EST"
,"lt" : "Dec 23, 4:02PM EST"
,"lt_dts" : "2016-12-23T16:02:12Z"
,"c" : "+0.27"
,"c_fix" : "0.27"
,"cp" : "0.63"
,"cp_fix" : "0.63"
,"ccol" : "chg"
,"pcls_fix" : "42.79"
}
])
(s4==========================,
List(Map(
e -> NYSE,
s -> 0,
cp_fix -> 0.63,
l_cur -> 43.06,
ccol -> chg,
t -> MS,
pcls_fix -> 42.79,
id -> 660479,
l -> 43.06,
l_fix -> 43.06,
c_fix -> 0.27,
c -> +0.27,
cp -> 0.63,
lt -> Dec 23, 4:02PM EST,
lt_dts -> 2016-12-23T16:02:12Z,
ltt -> 4:02PM EST)))
Here, I want the "l" value, but I'm not able to get it. When I try map/foreach on it, I get:
$ scalac Stock.scala
Stock.scala:30: error: value map is not a member of Any
val s5 = s4.map{ ex => ex }
^
one error found
I also tried this link, but couldn't get it to work. What should I do here?

Parsing with JSON.parseFull returns an Option[Any].
scala> JSON.parseFull("""{"key": "value"}""")
res2: Option[Any] = Some(Map(key -> value))
So, use pattern matching to extract the types you want, or type-cast (not recommended).
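A minimal sketch of the pattern-matching approach, using a hard-coded Option[Any] that mirrors the shape of the s4 value above (only a few of the quote fields are kept; the values are copied from the output in the question):

```scala
// JSON.parseFull returns Option[Any]; here we hard-code a value with the
// same nested shape (a List of Maps) instead of calling the parser.
val parsed: Option[Any] = Some(List(Map("t" -> "MS", "e" -> "NYSE", "l" -> "43.06")))

// Pattern match down the structure to pull out the "l" value safely.
val l: Option[String] = parsed match {
  case Some(quotes: List[_]) =>
    quotes.collectFirst { case m: Map[_, _] =>
      m.asInstanceOf[Map[String, String]].get("l")
    }.flatten
  case _ => None
}

println(l) // Some(43.06)
```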

Well... the thing is that you have s4 typed as Any.
// since we know that s4 is supposed to be a List,
// we can type-cast it to List[Any]
val s4List = s4.asInstanceOf[List[Any]]
// we also know that s4 was actually a list of Maps,
// and those maps look to be String -> String,
// so we can type-cast everything inside the list to Map[String, String]
val s4MapList = s4List.map(m => m.asInstanceOf[Map[String, String]])
// and we also know that s4 is a List of Maps with length 1
val map = s4MapList(0)
// let's say you wanted id
val idOption = map.get("id")
// but we know that map has a key "id",
// so we can just get it directly
val id = map("id")
// the "l" value you were after works the same way
val l = map("l")
Note :: This is not the recommended way of doing things in Scala, but it can help you understand what you are supposed to do at a basic level. Keep learning... and you will come to understand better ways to deal with things.

Related

JSON Decoder in Elm 0.18

In Elm 0.18, I would like to build a JSON decoder for the following examples:
case 1:
{"metadata": {"signatures":[{"metadata": {"code": "1234"}},
{"metadata": {"code": "5678"}}]}}
-> { code = Just "1234" }
case 2:
{"metadata": {"signatures":[]}}
-> { code = Nothing }
case 3:
{"metadata": {"signatures":[{"metadata": null}]}}
-> { code = Nothing }
This is what I got working, but it fails for case 3.
type alias Code =
    { code : Maybe String }

let
    js =
        """{"metadata": {"signatures":[{"metadata": {"code": "1234"}},
{"metadata": {"code": "5678"}}]}}"""

    dec1 =
        Decode.at [ "metadata", "code" ] Decode.string

    dec0 =
        Decode.list dec1
            |> Decode.andThen
                (\v ->
                    if List.isEmpty v then
                        Decode.succeed Nothing
                    else
                        Decode.succeed <| List.head v
                )

    dec =
        decode Code
            |> optionalAt [ "metadata", "signatures" ] dec0 Nothing

    expected =
        Ok { code = Just "1234" }
in
Decode.decodeString dec js
    |> Expect.equal expected
A workaround would be to import all the data to the model and then obtain the info from the model, but I prefer to avoid adding unnecessary data into my model. How can I improve this?
A simpler approach could use Json.Decode.index to force decoding the element at index zero as a string if it exists, which will fail otherwise, so you can wrap it in Json.Decode.maybe to return Nothing on failure:
dec0 =
    Decode.maybe (Decode.index 0 dec1)

Decoding polymorphic JSON objects into elm with andThen

My JSON looks similar to this:
{ "items" :
[ { "type" : 0, "order": 10, "content": { "a" : 10, "b" : "description", ... } }
, { "type" : 1, "order": 11, "content": { "a" : 11, "b" : "same key, but different use", ... } }
, { "type" : 2, "order": 12, "content": { "c": "totally different fields", ... } }
...
]
}
and I want to use the type value to decide which union type to create while decoding. So, I defined type aliases and decoders for all of the above in Elm:
import Json.Decode exposing (..)
import Json.Decode.Pipeline exposing (..)

type alias Type0Content = { a : Int, b : String }
type alias Type1Content = { a : Int, b2 : String }
type alias Type2Content = { c : String }

type Content = Type0 Type0Content | Type1 Type1Content | Type2 Type2Content

type alias Item = { order : Int, type : Int, content : Content }

decode0 = succeed Type0Content
    |> requiredAt ["content", "a"] int
    |> requiredAt ["content", "b"] string

decode1 = succeed Type1Content
    |> requiredAt ["content", "a"] int
    |> requiredAt ["content", "b"] string

decode2 = succeed Type2Content
    |> requiredAt ["content", "c"] string

decodeContentByType hint =
    case hint of
        0 -> Type0 decode0
        1 -> Type1 decode1
        2 -> Type2 decode2
        _ -> fail "unknown type"

decodeItem = succeed Item
    |> required "order" int
    |> required "type" int `andThen` decodeContentByType
Can't get the last two functions to interact as needed.
I've read through page 33 of json-survival-kit by Brian Hicks, but that didn't put me on the right track either.
Any advice and pointers appreciated!
It looks like the book was written targeting Elm 0.17 or below. In Elm 0.18, the backtick syntax was removed. You will also need to use a different field name for type since it is a reserved word, so I'll rename it type_.
Some annotations might help narrow down bugs. Let's annotate decodeContentByType, because right now, the branches aren't returning the same type. The three successful values should be mapping the decoder onto the expected Content constructor:
decodeContentByType : Int -> Decoder Content
decodeContentByType hint =
    case hint of
        0 -> map Type0 decode0
        1 -> map Type1 decode1
        2 -> map Type2 decode2
        _ -> fail "unknown type"
Now, to address the decodeItem function. We need three fields to satisfy the Item constructor. The second field is the type, which can be obtained via required "type" int, but the third field relies on the "type" value to deduce the correct constructor. We can use andThen (with pipeline syntax as of Elm 0.18) after fetching the Decoder Int value using Elm's field decoder:
decodeItem : Decoder Item
decodeItem = succeed Item
    |> required "order" int
    |> required "type" int
    |> custom (field "type" int |> andThen decodeContentByType)

Extract specific info from json-list

In my system, the JSON call http://192.168.1.6:8080/json.htm?type=devices&rid=89 returns the output below.
{
  "ActTime" : 1501360852,
  "ServerTime" : "2017-07-29 22:40:52",
  "Sunrise" : "05:50",
  "Sunset" : "21:28",
  "result" : [
    {
      "AddjMulti" : 1.0,
      "AddjMulti2" : 1.0,
      "AddjValue" : 0.0,
      "AddjValue2" : 0.0,
      "BatteryLevel" : 255,
      "CustomImage" : 0,
      "Data" : "73 Lux",
      "Description" : "",
      "Favorite" : 1,
      "HardwareID" : 4,
      "HardwareName" : "Dummies",
      "HardwareType" : "Dummy (Does nothing, use for virtual switches only)",
      "HardwareTypeVal" : 15,
      "HaveTimeout" : true,
      "ID" : "82089",
      "LastUpdate" : "2017-07-29 21:16:22",
      "Name" : "ESP8266C_Licht1",
      "Notifications" : "false",
      "PlanID" : "0",
      "PlanIDs" : [ 0 ],
      "Protected" : false,
      "ShowNotifications" : true,
      "SignalLevel" : "-",
      "SubType" : "Lux",
      "Timers" : "false",
      "Type" : "Lux",
      "TypeImg" : "lux",
      "Unit" : 1,
      "Used" : 1,
      "XOffset" : "0",
      "YOffset" : "0",
      "idx" : "89"
    }
  ],
  "status" : "OK",
  "title" : "Devices"
}
The scripttime Lua script below is meant to 'automate' such a call, extracting specific information for further use in applications.
The first 11 lines run without problems, but the further extraction of the information is a problem.
I have tried various script lines to find a solution for A), B), C), D) and E), but they either generate error reports or give no results: see further below for the 'best' trial script lines and their results.
To avoid misunderstanding: the dashed comment lines marked A) to E) in the script just below this question only describe the desired actions/functions and are in no way meant as script lines!
Question:
I'm asking for help in the form of better, applicable script lines for A) to E) in the trial script at the end of this message, or hints on where to find applicable example script lines.
-- Lua-script to determine staleness, time-out and value for data from Json-call
print('Start of Timeout-script')
commandArray = {}
TimeOutLimit = 10 -- Allowed timeout in seconds
json = (loadfile "/home/pi/domoticz/scripts/lua/JSON.lua")() -- For Linux
-- json = (loadfile "D:\\Domoticz\\scripts\\lua\\json.lua")() -- For Windows
-- Line 07
local content=assert(io.popen('curl "http://192.168.1.6:8080/json.htm?type=devices&rid=89"')) -- notice double quotes
local list = content:read('*all')
content:close()
local jsonList = json:decode(list)
-- Line 12 Next scriptlines describe desired actions
-- A) Extract ServerTime as numeric value (not as string)
-- B) Extract LastUpdate as numeric value (not as string)
-- Staleness = ServerTime - LastUpdate
-- C) Extract HaveTimeout as boolean (not as string)
-- If HaveTimeout and (Staleness > TimeOutlimit) then
-- Print('TimeOutLimit exceeded by ' .. (Staleness - TimeOutLimit) .. 'seconds')
-- End
-- D) Extract textstring from Type or Data
-- E) Extract numeric value from Data
print('End of Timeout-script')
return commandArray
For line 11 and onward, the following trial script lines gave the 'best' results (= no errors):
-- Line 11
-- local Servertime = json:decode(ServerTime)
-- print('Servertime : '..Servertime)
-- Line 14
-- CheckTimeOut =jsonValue.result[1].HaveTimeout -- value from "HaveTimeout", inside "result" bloc number 1 (even if it's the only one)
CurrentServerTime =jsonValue.Servertime -- value from "ServerTime"
CurrentLastUpdate = jsonValue.result[1].LastUpdate
CurrentData = jsonValue.result[1].Data
-- Line 19
print('TimeOut : '..CheckTimeOut)
print('Servertime : '..CurrentServerTime)
print('LastUpdate : '..CurrentLastUpdate)
print('Data-content : '..CurrentData)
print('End of Timeout-script')
return commandArray
Results:
Without dashes (comment markers) before lines 12 and 13, respectively line 15, the following errors are reported:
660: nil passed to JSON:decode()
lua:15: attempt to index global 'jsonValue' (a nil value)
With dashes before lines 12, 13 and 15 of the trial script lines shown above, the log reports no errors (as demonstrated by the two prints):
2017-07-31 16:30:02.520 LUA: Start of Timeout-script
2017-07-31 16:30:02.563 LUA: End of Timeout-script
But why are there no print results in the log from lines 20 through 23?
Not having those print results makes it difficult to determine the next steps in data extraction needed to achieve the objectives described under A) to E).
Error reports are generally more useful information than "no errors, but no results". ;-)
Without downloading content over the network, the working script looks like this:
local json = require("json")
local jj = [[
{
"ActTime" : 1501360852,
"ServerTime" : "2017-07-29 22:40:52",
"Sunrise" : "05:50",
"Sunset" : "21:28",
"result" : [
{
"AddjMulti" : 1.0,
"AddjMulti2" : 1.0,
"AddjValue" : 0.0,
"AddjValue2" : 0.0,
"BatteryLevel" : 255,
"CustomImage" : 0,
"Data" : "73 Lux",
"Description" : "",
"Favorite" : 1,
"HardwareID" : 4,
"HardwareName" : "Dummies",
"HardwareType" : "Dummy (Does nothing, use for virtual switches only)",
"HardwareTypeVal" : 15,
"HaveTimeout" : true,
"ID" : "82089",
"LastUpdate" : "2017-07-29 21:16:22",
"Name" : "ESP8266C_Licht1",
"Notifications" : "false",
"PlanID" : "0",
"PlanIDs" : [ 0 ],
"Protected" : false,
"ShowNotifications" : true,
"SignalLevel" : "-",
"SubType" : "Lux",
"Timers" : "false",
"Type" : "Lux",
"TypeImg" : "lux",
"Unit" : 1,
"Used" : 1,
"XOffset" : "0",
"YOffset" : "0",
"idx" : "89"
}
],
"status" : "OK",
"title" : "Devices"
}
]]
print('Start of Timeout-script')
local jsonValue = json.decode(jj)
CheckTimeOut = jsonValue.result[1].HaveTimeout
CurrentServerTime = jsonValue.Servertime -- nil: the JSON key is actually "ServerTime"
CurrentLastUpdate = jsonValue.result[1].LastUpdate
CurrentData = jsonValue.result[1].Data
print('TimeOut : '.. (CheckTimeOut and "true" or "false") )
print('Servertime : '.. (CurrentServerTime or "nil") )
print('LastUpdate : '.. (CurrentLastUpdate or "nil") )
print('Data-content : '.. (CurrentData or "nil") )
print('End of Timeout-script')
result:
Start of Timeout-script
TimeOut : true
Servertime : nil
LastUpdate : 2017-07-29 21:16:22
Data-content : 73 Lux
End of Timeout-script
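Objectives A), B) and E) are not yet covered by the script above. A sketch using string.match and os.time could look like the following; the timestamp and Data strings are hard-coded copies of the values from the JSON, where the real script would read them from jsonValue as shown above:

```lua
-- Convert "YYYY-MM-DD HH:MM:SS" strings to numeric timestamps (A and B).
local function toTime(s)
  local y, mo, d, h, mi, sec = s:match("(%d+)-(%d+)-(%d+) (%d+):(%d+):(%d+)")
  return os.time{ year = tonumber(y), month = tonumber(mo), day = tonumber(d),
                  hour = tonumber(h), min = tonumber(mi), sec = tonumber(sec) }
end

local serverTime = toTime("2017-07-29 22:40:52")  -- would be jsonValue.ServerTime
local lastUpdate = toTime("2017-07-29 21:16:22")  -- jsonValue.result[1].LastUpdate
local staleness = serverTime - lastUpdate         -- difference in seconds
print('Staleness : ' .. staleness .. ' seconds')  -- 5070 seconds

-- E) Extract the numeric part of a Data string like "73 Lux".
local lux = tonumber(("73 Lux"):match("^(%d+)"))
print('Lux value : ' .. lux)
```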

BSON structure created by Apache Spark and MongoDB Hadoop-Connector

I'm trying to save some JSON from Spark (Scala) to MongoDB using the MongoDB Hadoop-Connector. The problem I'm having is that this API always seems to save your data as "{_id: ..., value: {your JSON document}}".
In the code example below, my document gets saved like this:
{
  "_id" : ObjectId("55e80cfea9fbee30aa703261"),
  "value" : {
    "_id" : "55e6c65da9fbee285f2f9175",
    "year" : 2014,
    "month" : 5,
    "day" : 6,
    "hour" : 18,
    "user_id" : 246
  }
}
Is there any way to persuade the MongoDB Hadoop Connector to write the JSON/BSON in the structure you've specified, instead of nesting it under these _id/value fields?
Here's my Scala Spark code (imports added; the original snippet omitted them):
import com.mongodb.hadoop.MongoOutputFormat
import org.apache.hadoop.conf.Configuration
import org.apache.spark.{SparkConf, SparkContext}
import org.bson.{BSONObject, Document}

val jsonstr = List("""{
  "_id" : "55e6c65da9fbee285f2f9175",
  "year" : 2014,
  "month" : 5,
  "day" : 6,
  "hour" : 18,
  "user_id" : 246}""")

val conf = new SparkConf().setAppName("Mongo Dummy").setMaster("local[*]")
val sc = new SparkContext(conf)

// DB params
val host = "127.0.0.1"
val port = "27017"
val database = "dummy"
val collection = "fubar"

// input is the collection we want to read (not doing so here)
val mongo_input = s"mongodb://$host/$database.$collection"
// output is the collection we want to write
val mongo_output = s"mongodb://$host/$database.$collection"

// Set up extra config for the Hadoop connector
val hadoopConfig = new Configuration()
//hadoopConfig.set("mongo.input.uri", mongo_input)
hadoopConfig.set("mongo.output.uri", mongo_output)

// convert JSON to RDD
val rdd = sc.parallelize(jsonstr)

// write JSON data to DB
val saveRDD = rdd.map { json =>
  (null, Document.parse(json))
}
saveRDD.saveAsNewAPIHadoopFile("file:///bogus",
  classOf[Object],
  classOf[BSONObject],
  classOf[MongoOutputFormat[Object, BSONObject]],
  hadoopConfig)

// Finished
sc.stop()
And here's my SBT:
name := "my-mongo-test"
version := "1.0"
scalaVersion := "2.10.4"
// Spark needs to appear in SBT BEFORE Mongodb connector!
libraryDependencies += "org.apache.spark" %% "spark-core" % "1.4.0"
// MongoDB-Hadoop connector
libraryDependencies += "org.mongodb.mongo-hadoop" % "mongo-hadoop-core" % "1.4.0"
To be honest, I'm kind of mystified at how hard it seems to be to save JSON --> BSON --> MongoDB from Spark. So any suggestions on how to save my JSON data more flexibly would be welcomed.
Well, I just found the solution. It turns out that MongoRecordWriter, which is used by MongoOutputFormat, inserts any value that does not inherit from BSONWritable, MongoOutput, or BSONObject under a value field.
The simplest solution, therefore, is to create an RDD that contains BSONObject values, rather than Document.
I tried this solution in Java, but I'm sure it will work in Scala as well. Here is some sample code:
JavaPairRDD<Object, BSONObject> bsons = values.mapToPair(lineValues -> {
    BSONObject doc = new BasicBSONObject();
    doc.put("field1", lineValues.get(0));
    doc.put("field2", lineValues.get(1));
    return new Tuple2<Object, BSONObject>(UUID.randomUUID().toString(), doc);
});

Configuration outputConfig = new Configuration();
outputConfig.set("mongo.output.uri",
        "mongodb://localhost:27017/my_db.lines");

bsons.saveAsNewAPIHadoopFile("file:///this-is-completely-unused"
        , Object.class
        , BSONObject.class
        , MongoOutputFormat.class
        , outputConfig);

How to handle errors when JSON data has an incorrect node

I'm expecting the following JSON format:
'{
  "teamId" : 9,
  "teamMembers" : [ {
    "userId" : 1000
  }, {
    "userId" : 2000
  } ]
}'
If I test my code with the following format:
'{
  "teaXmId" : 9,
  "teamMembers" : [ {
    "usXerId" : 1000
  }, {
    "userXId" : 2000
  } ]
}'
I'm parsing the JSON values as follows:
val userId = (request.body \\ "userId")
val teamId = (request.body \ "teamId")
val list = userId.toList
list.foreach(x => Logger.info("x val: " + x))
It doesn't throw any error to handle; code execution just goes on. Later, if I try to use teamId or userId, of course it doesn't work.
So how can I check whether parsing was done correctly, or stop execution right away and notify the user to provide correct JSON?
If a value is not found when using \, then the result will be of type JsUndefined(msg). You can throw an error immediately by making sure you have the type you expect:
val teamId = (request.body \ "teamId").as[Int]
or:
val JsNumber(teamId) = request.body \ "teamId" //teamId will be BigDecimal
When using \\, if nothing is found, then an empty List is returned, which makes sense. If you want to throw an error when a certain key is not found on any object of an array, you might get the object that contains the list and proceed from there:
val teamMembers = (request.body \ "teamMembers").as[Seq[JsValue]]
or:
val JsObject(teamMembers) = request.body \ "teamMembers"
And then:
val userIds = teamMembers.map(v => (v \ "userId").as[Int])