I can't make sense of any of the documentation. Can someone please provide an example of how I can parse the following shortened exiftool output using the Haskell module Text.JSON? The data was generated using the command exiftool -G -j <files.jpg>.
[{
"SourceFile": "DSC00690.JPG",
"ExifTool:ExifToolVersion": 7.82,
"File:FileName": "DSC00690.JPG",
"Composite:LightValue": 11.6
},
{
"SourceFile": "DSC00693.JPG",
"ExifTool:ExifToolVersion": 7.82,
"File:FileName": "DSC00693.JPG",
"EXIF:Compression": "JPEG (old-style)",
"EXIF:ThumbnailLength": 4817,
"Composite:LightValue": 13.0
},
{
"SourceFile": "DSC00694.JPG",
"ExifTool:ExifToolVersion": 7.82,
"File:FileName": "DSC00694.JPG",
"Composite:LightValue": 3.7
}]
Well, the easiest way is to get back a JSValue from the json package, like so (assuming your data is in test.json):
Prelude Text.JSON> s <- readFile "test.json"
Prelude Text.JSON> decode s :: Result JSValue
Ok (JSArray [JSObject (JSONObject {fromJSObject = [("SourceFile",JSString (JSONString {fromJSString = "DSC00690.JPG"})),("ExifTool:ExifToolVersion",JSRational False (391 % 50)),("File:FileName",JSString (JSONString {fromJSString = "DSC00690.JPG"})),("Composite:LightValue",JSRational False (58 % 5))]}),JSObject (JSONObject {fromJSObject = [("SourceFile",JSString (JSONString {fromJSString = "DSC00693.JPG"})),("ExifTool:ExifToolVersion",JSRational False (391 % 50)),("File:FileName",JSString (JSONString {fromJSString = "DSC00693.JPG"})),("EXIF:Compression",JSString (JSONString {fromJSString = "JPEG (old-style)"})),("EXIF:ThumbnailLength",JSRational False (4817 % 1)),("Composite:LightValue",JSRational False (13 % 1))]}),JSObject (JSONObject {fromJSObject = [("SourceFile",JSString (JSONString {fromJSString = "DSC00694.JPG"})),("ExifTool:ExifToolVersion",JSRational False (391 % 50)),("File:FileName",JSString (JSONString {fromJSString = "DSC00694.JPG"})),("Composite:LightValue",JSRational False (37 % 10))]})])
This just gives you a generic JSON Haskell data type.
The next step is to define a custom Haskell data type for your data and write an instance of JSON for it that converts between JSValues, as above, and your type.
Thanks to all. From your suggestions I was able to put together the following which translates the JSON back into name-value pairs.
data Exif =
Exif [(String, String)]
deriving (Eq, Ord, Show)
instance JSON Exif where
showJSON (Exif xs) = showJSONs xs
readJSON (JSObject obj) = Ok $ Exif [(n, s v) | (n, JSString v) <- o]
where
o = fromJSObject obj
s = fromJSString
Unfortunately, it seems the library is unable to translate the JSON straight back into a simple Haskell data structure. In Python, it is a one-liner: json.loads(s).
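That said, a custom type may not be needed just to get name-value pairs. As far as I can tell, the json package already ships JSON instances for lists and for JSObject, so decode can target [JSObject JSValue] directly. A minimal sketch, where the helper name decodePairs and the inline sample (standing in for test.json) are my own assumptions:

```haskell
import Text.JSON

-- | Decode a JSON array of objects into one association list per object.
-- The values stay as generic JSValue; only the outer structure is unwrapped.
decodePairs :: String -> Result [[(String, JSValue)]]
decodePairs s = map fromJSObject <$> (decode s :: Result [JSObject JSValue])

main :: IO ()
main = do
  -- Hypothetical inline sample standing in for the exiftool output.
  let sample = "[{\"SourceFile\": \"DSC00690.JPG\", \"Composite:LightValue\": 11.6}]"
  case decodePairs sample of
    Ok rows   -> mapM_ (print . map fst) rows   -- print the keys of each object
    Error msg -> putStrLn ("decode failed: " ++ msg)
```

This keeps numbers and strings side by side as JSValue, which is roughly what json.loads gives you in Python, minus the dynamic typing.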
I'm trying to get the maximum value of a MetricId field from a JSON String. However I'm getting a java.lang.UnsupportedOperationException: empty.max for the below String:
[{"MetricName":"name1","DateParsed":"2019-11-20 05:39:00","MetricId":"7855","isValid":"true"},
{"MetricName":"name2","DateParsed":"2019-05-22 17:45:00","MetricId":"1295","isValid":"false"}]
Here is how I've implemented a method for finding the Max value:
val metricIdRegex = """"MetricId"\s*:\s*(\d+)""".r
def maxMetricId(jsonString: String): String = {
metricIdRegex.findAllIn(jsonString).map({
case metricIdRegex(id) => id.toInt
}).max.toString
}
val maxId: String = maxMetricId(metricsString)
I'm expecting to get "7855" as a Max metric Id
What could be wrong with the method? I suspect that it could be a problem with the regex.
You could also use json4s which is quite popular and used by many other scala libraries:
import org.json4s._
import org.json4s.jackson.JsonMethods._
val data = """[{"MetricName":"name1","DateParsed":"2019-11-20 05:39:00","MetricId":"7855","isValid":"true"},
{"MetricName":"name2","DateParsed":"2019-05-22 17:45:00","MetricId":"1295","isValid":"false"}]"""
// parse data into JValue
val parsed = parse(data)
// go through the parsed variable and extract MetricId into a string list, then cast every item to int
val maxMetricId = (parsed \ "MetricId" \\ classOf[JString]).map{_.toInt}.max
Let me show how this can be done efficiently with a JSON parser, without holding the whole JSON input or the parsed data in memory.
Add dependencies to your build.sbt:
libraryDependencies ++= Seq(
"com.github.plokhotnyuk.jsoniter-scala" %% "jsoniter-scala-core" % "2.0.2" % Compile,
"com.github.plokhotnyuk.jsoniter-scala" %% "jsoniter-scala-macros" % "2.0.2" % Provided // required only in compile-time
)
Add the imports, define a data structure for the repeating part of the JSON array that should be parsed out, derive a codec for it, then open an input stream and scan it with a handler function that reduces all parsed metrics to the maximum value:
import com.github.plokhotnyuk.jsoniter_scala.macros._
import com.github.plokhotnyuk.jsoniter_scala.core._
import java.io.ByteArrayInputStream
import java.io.InputStream
case class Metric(@stringified MetricId: Int)
implicit val codec: JsonValueCodec[Metric] = JsonCodecMaker.make(CodecMakerConfig)
val in: InputStream = new ByteArrayInputStream( // <- replace it by FileInputStream
"""[{"MetricName":"name1","DateParsed":"2019-11-20 05:39:00","MetricId":"7855","isValid":"true"},
{"MetricName":"name2","DateParsed":"2019-05-22 17:45:00","MetricId":"1295","isValid":"false"}]""".getBytes("UTF-8"))
try {
var max = -1
scanJsonArrayFromStream[Metric](in) { m: Metric =>
max = Math.max(max, m.MetricId)
true
}
println(max)
} finally in.close()
And this code should print 7855.
In my current "learning haskell" project I try to fetch weather data from a third party api. I want to extract the name and main.temp value from the following response body:
{
...
"main": {
"temp": 280.32,
...
},
...
"name": "London",
...
}
I wrote a getWeather service to perform IO and transform the response to construct GetCityWeather data:
....
data WeatherService = GetCityWeather String Double
deriving (Show)
....
getWeather :: IO (ServiceResult WeatherService)
getWeather = do
...
response <- httpLbs request manager
...
-- work thru the response
return $ case ((maybeCityName response, maybeTemp response)) of
(Just name, Just temp) -> success name temp
bork -> err ("borked data >:( " ++ show bork))
where
showStatus r = show $ statusCode $ responseStatus r
maybeCityName r = (responseBody r)^?key "name"._String
maybeTemp r = (responseBody r)^?key "main".key "temp"._Double
success n t = Right (GetCityWeather (T.unpack n) t)
err e = Left (SimpleServiceError e)
I'm stuck optimizing the JSON parsing part in maybeCityName and maybeTemp. My thoughts are:
Currently the JSON is parsed twice (I apply ^? two times to the raw response body, responseBody r).
I would like to get the data in "one shot". ^.. is able to get a list of values, but I extract different types (String, Double), so ^.. does not fit here.
I'm looking for more elegant / more natural ways to safely parse JSON, read the desired values and apply them to the data constructor GetCityWeather. Thanks in advance for any help and feedback.
Update: using Folds I am able to solve the problem with two case matches
getWeather :: IO (ServiceResult WeatherService)
getWeather = do
...
let value = decode $ responseBody response
return $ case value of
Just v -> case (v ^? weatherService) of
Just wr -> Right wr
Nothing -> err "incompatible data"
Nothing -> err "bad json"
where
err t = Left (SimpleServiceError t)
weatherService :: Fold Value WeatherService
weatherService = runFold $ GetCityWeather
<$> Fold (key "name" . _String . unpacked)
<*> Fold (key "main" . key "temp" . _Double)
As @jpath points out, the real problem you have here is one about lens and JSON handling. The crux of the issue seems to be that you want to do the lens operation all at once. For that, check out the handy ReifiedFold: the "parallel" functionality you want is packed into the Applicative instance.
import Control.Lens
import Data.Aeson
import Data.Aeson.Lens
import Data.Text.Lens ( unpacked )
-- | Extract a `WeatherService` from a `Value` if possible
weatherService :: Fold Value WeatherService
weatherService = runFold $ GetCityWeather
<$> Fold (key "name" . _String . unpacked)
<*> Fold (key "main" . key "temp" . _Double)
Then, you can try to get your WeatherService all at once:
...
-- work thru the response
let body = responseBody r
return $ case body ^? weatherService of
Just wr -> Right wr
Nothing -> Left (SimpleServiceError ("borked data >:( " ++ show body))
However, for the sake of error messages, it might be a better idea to take advantage of aeson's ToJSON/FromJSON if you plan on scaling this more.
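For comparison, here is roughly what the FromJSON route could look like. This is only a sketch under assumptions: the instance is written by hand against the "name" and "main.temp" fields from the question, Eq is added to the deriving clause for convenience, and the sample body in main is made up.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Data.Aeson
import qualified Data.ByteString.Lazy.Char8 as BL

data WeatherService = GetCityWeather String Double
  deriving (Show, Eq)

-- Hand-written instance: reads "name" at the top level and "temp"
-- from the nested "main" object. Other fields are ignored.
instance FromJSON WeatherService where
  parseJSON = withObject "WeatherService" $ \o -> do
    name    <- o .: "name"
    mainObj <- o .: "main"
    temp    <- mainObj .: "temp"
    pure (GetCityWeather name temp)

main :: IO ()
main = do
  -- Hypothetical response body, trimmed to the two fields of interest.
  let body = BL.pack "{\"main\": {\"temp\": 280.32}, \"name\": \"London\"}"
  case eitherDecode body :: Either String WeatherService of
    Right w  -> print w
    Left msg -> putStrLn ("bad weather JSON: " ++ msg)
```

The JSON is parsed once, and eitherDecode reports which key was missing or mistyped, which is the error-message advantage mentioned above.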
The result of Json4s decoding frequently scrambles the order of the elements in a JObject when decoding into a HashMap, so I tried to decode into a ListMap instead. However, there seems to be no way of doing this. When I run the following simple program:
val v: ListMap[String, Int] = ListMap("a" -> 1, "b" -> 2)
val json = JsonMethods.compact(Extraction.decompose(v))
val v2 = Extraction.extract[ListMap[String, Int]](JsonMethods.parse(json))
assert(v == v2)
The following error message was thrown:
scala.collection.immutable.Map$Map2 cannot be cast to scala.collection.immutable.ListMap
java.lang.ClassCastException: scala.collection.immutable.Map$Map2 cannot be cast to scala.collection.immutable.ListMap
Is there an easy way to fix this? Or should I switch to more recent Json libraries (Argonaut/Circe) instead?
No, you can't do this. At least not this way. According to the JSON spec
An object is an unordered set of name/value pairs.
All the standard libraries treat it that way, which means the order is already scrambled by the time you (or the library) do the initial parsing into an intermediate data structure. Moreover, you can't even guarantee that the JSON will be {"a":1, "b":2} rather than {"b":2, "a":1}.
The only way to preserve the order is to store it inside the JSON in a form that enforces order, and the only such form is an ordered list of values, i.e. an array. So you can do something like this:
val v: ListMap[String, Int] = ListMap("c" -> 1, "AaAa" -> 2, "BBBB" -> 3, "AaBB" -> 4, "BBAa" -> 5)
val jsonBad = JsonMethods.compact(Extraction.decompose(v))
val bad = Extraction.extract[Map[String, Int]](JsonMethods.parse(jsonBad))
val jsonGood = JsonMethods.compact(Extraction.decompose(v.toList))
val good = ListMap(Extraction.extract[List[(String, Int)]](JsonMethods.parse(jsonGood)): _*)
println(s"'$jsonBad' => $bad")
println(s"'$jsonGood' => $good")
Which prints
'{"c":1,"AaAa":2,"BBBB":3,"AaBB":4,"BBAa":5}' => Map(AaAa -> 2, BBBB -> 3, AaBB -> 4, BBAa -> 5, c -> 1)
'[{"c":1},{"AaAa":2},{"BBBB":3},{"AaBB":4},{"BBAa":5}]' => ListMap(c -> 1, AaAa -> 2, BBBB -> 3, AaBB -> 4, BBAa -> 5)
Here is a library that supports all Scala collections, so you can easily parse and serialize to/from ListMap. It also serializes case class fields in a stable order, the order of declaration:
libraryDependencies ++= Seq(
"com.github.plokhotnyuk.jsoniter-scala" %% "jsoniter-scala-core" % "0.29.2" % Compile,
"com.github.plokhotnyuk.jsoniter-scala" %% "jsoniter-scala-macros" % "0.29.2" % Provided // required only in compile-time
)
import scala.collection.immutable.ListMap
import com.github.plokhotnyuk.jsoniter_scala.macros._
import com.github.plokhotnyuk.jsoniter_scala.core._
val codec = JsonCodecMaker.make[ListMap[String, Int]](CodecMakerConfig())
val v: ListMap[String, Int] = ListMap("a" -> 1, "b" -> 2, "c" -> 3, "d" -> 4, "e" -> 5)
val json = writeToArray(codec, v)
val v2 = readFromArray(codec, json)
require(v == v2)
I'm trying to write a wrapper around ffprobe that extracts a value from JSON of the form {"format": {"format_name": value}}. The JSON is output by a process that I create. Here's what I've got so far.
import System.Process
import System.Environment
import System.IO
import Text.JSON
main = do
args <- getArgs
(_, Just out, _, p) <- createProcess
(proc "ffprobe" [args!!0, "-of", "json", "-show_format"])
{ std_out = CreatePipe }
s <- hGetContents out
--putStrLn $ show (decode s :: Result JSValue)
--waitForProcess p
--putStrLn $ valFromObj "format_name" format
-- where format = valFromObj "format" rootObj
-- (Ok rootObj) = decode s :: Result (JSObject (JSValue))
let (Ok rootObj) = decode s :: Result (JSObject (JSValue))
let (Ok format) = valFromObj "format" rootObj :: Result (JSObject (JSValue))
putStrLn format_name
where (Ok format_name) = valFromObj "format_name" format
It fails to compile with:
[1 of 1] Compiling Main ( ffprobe.hs, ffprobe.o )
ffprobe.hs:20:59: error:
Variable not in scope: format :: JSObject JSValue
I'm confused about several things, including why I can't get the last line to compile:
Why can't I assert for Ok in the Result after the ::. Like :: Result Ok JSObject JSValue?
Why can't I extract the values in a where clause?
Why is it Result (JSObject (JSValue)) and not Result JSObject JSValue?
Why is format out of scope?
I have a feeling I'm mixing the IO and Result monads together in the same do block or something. Is Result even a monad? Can I extract the value I want in a separate do without crapping all over the IO do?
I think your compile error is because of the position of the where. Try
main = do
...
let (Ok format) = valFromObj "format" rootObj :: Result (JSObject (JSValue))
let (Ok format_name) = valFromObj "format_name" format
putStrLn format_name
The scope of the where is outside the do so it isn't aware of format.
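On the question of whether Result is a monad: in the json package it is, so the whole extraction can live in its own function with its own do block, separate from the IO code. A minimal sketch, assuming ffprobe output of the shape described in the question (the function name formatName and the sample string are made up for illustration):

```haskell
import Text.JSON

-- | Pull "format" -> "format_name" out of ffprobe-style JSON text.
-- Everything here runs in the Result monad: the first step that
-- fails short-circuits the rest and yields an Error.
formatName :: String -> Result String
formatName s = do
  rootObj <- decode s :: Result (JSObject JSValue)
  format  <- valFromObj "format" rootObj :: Result (JSObject JSValue)
  valFromObj "format_name" format

main :: IO ()
main = do
  -- Hypothetical sample; in the real program this would be the process output.
  let sample = "{\"format\": {\"format_name\": \"mov,mp4\"}}"
  case formatName sample of
    Ok name   -> putStrLn name
    Error msg -> putStrLn ("could not read format_name: " ++ msg)
```

Compared with the irrefutable let (Ok x) = ... patterns, a missing key ends up in the Error branch instead of crashing at runtime with a pattern-match failure.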
You cannot do this:
main = do
let bar = "only visible inside main? "
return baz
where
baz = bar ++ " yes, this will break!"
This gives:
test.hs:7:11:
Not in scope: ‘bar’
Perhaps you meant ‘baz’ (line 7)
Let bindings, unlike function arguments, are not available in where bindings. Above, bar is not in scope for baz to use. Compare this to your code.
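To make the contrast concrete: a where clause hangs off the whole equation, so it sees the function's arguments but nothing bound by a let inside the do block, while a let inside do is visible only to the later statements of that block. A small sketch of the version that does compile, with made-up names (greeting, prefix):

```haskell
module Main where

-- A where clause can use the function argument `name`,
-- because both belong to the same equation.
greeting :: String -> String
greeting name = prefix ++ name
  where
    prefix = "hello, "

main :: IO ()
main = do
  let bar = "only visible inside this do block"
  -- A later let in the same do block can see `bar`;
  -- a where clause attached to `main` could not.
  let baz = bar ++ ", and that is enough here"
  putStrLn (greeting "world")
  putStrLn baz
```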
I am trying to parse a json string with special characters in its attributes names (dots).
This is what I'm trying:
//Json parser objects
case class SolrDoc(`rdf.about`:String, `dc.title`:List[String],
`dc.creator`:List[String], `dc.dateCopyrighted`:List[Int],
`dc.publisher`:List[String], `dc.type` :String)
case class SolrResponse(numFound:String, start:String, docs: List[SolrDoc])
val req = url("http://localhost:8983/solr/select") <<? Map("q" -> q)
var search_result = http(req ># { json => (json \ "response") })
var response = search_result.extract[SolrResponse]
Even though my json string contains values for all the fields this is the error I'm getting:
Message: net.liftweb.json.MappingException: No usable value for docs
No usable value for rdf$u002Eabout
Did not find value which can be converted into java.lang.String
I suspect that it has something to do with the dot on the names but so far I did not manage to make it work.
Thanks!
This is an extract from my LiftProject.scala file:
"net.databinder" % "dispatch-http_2.8.1" % "0.8.6",
"net.databinder" % "dispatch-http-json_2.8.1" % "0.8.6",
"net.databinder" % "dispatch-lift-json_2.8.1" % "0.8.6"
Dots in names should not be a problem. This is with lift-json-2.4-M4
scala> val json = """ {"first.name":"joe"} """
scala> parse(json).extract[Person]
res0: Person = Person(joe)
Where
case class Person(`first.name`: String)