Understanding the Data.Aeson FromJSON typeclass - json

I recently started using Data.Aeson for one of my projects. And I am recently new to Haskell as well. So I am trying to figure out how the implementation of parseJSON function in FromJSON typeclass works.
So I have a code from my codebase.
data MyProfile = MyProfile { name :: String, age :: Int } deriving Show
instance FromJSON MyProfile where
parseJSON (Object m) = MyProfile <$>
m .: "name" <*>
m .: "age"
parseJSON x = fail ("not an object: " ++ show x)
And the YAML file I am trying to read is pretty simple as well.
profile:
name: "Foo"
age: 16
I am trying to understand the working of that applicative functor. I browsed through the Data.Aeson module and found that (.:) returns a Parser (FromJSON a).
So the facts that I have understood is,
Object m holds the profile: section of the yaml
MyProfile corresponds to profile
name in ParseJSON is trying to get the value for the key in the JSON object m
Similar with age as well
And each <*> returns a Parser (FromJSON) which is then applied over to the next <*>
What I am not understanding is,
How does MyProfile gets mapped to the profile section? What if I have a huge yaml file and multiple data defined in my program?
In the code MyProfile <$> m .: "name", shouldn't the first argument of <$> be a function? I perceive that <$> is similar to fmap and hence the first argument must be a function (which is applied to the second argument). But MyProfile is a data! Confusing!
How is the yaml values, in this case, Foo and 16, added to the MyProfile data?
Please correct me if any of my understandings is wrong.

Related

Aeson does not find a key that I believe is present

I'm trying to parse a JSON blob that looks like this:
"{\"order_book\":{\"asks\":[[\"0.06777\",\"0.00006744\"],[\"0.06778\",\"0.01475361\"], ... ]],\"bids\":[[\"0.06744491\",\"1.35\"],[\"0.06726258\",\"0.148585363\"], ...]],\"market_id\":\"ETH-BTC\"}}"
Those lists of pairs of numbers are actually much longer; I've replaced their tails with ellipses.
Here's my code:
{-# LANGUAGE OverloadedStrings #-}
module Demo where
import Data.Aeson
import Data.ByteString.Lazy hiding (putStrLn)
import Data.Either (fromLeft)
import Network.HTTP.Request
data OrderBook = OrderBook
{ orderBook_asks :: [[(Float,Float)]]
, orderBook_bids :: [[(Float,Float)]]
, orderBook_marketId :: String
}
instance FromJSON OrderBook where
parseJSON = withObject "order_book" $ \v -> OrderBook
<$> v .: "asks"
<*> v .: "bids"
<*> v .: "market_id"
demo :: IO ()
demo = do
r <- get "https://www.buda.com/api/v2/markets/eth-btc/order_book"
let d = eitherDecode $ fromStrict $ responseBody r :: Either String OrderBook
putStrLn $ "Here's the parse error:"
putStrLn $ fromLeft undefined d
putStrLn $ "\n\nAnd here's the data:"
putStrLn $ show $ responseBody r
Here's what running demo gets me:
Here's the parse error:
Error in $: key "asks" not found
And here's the data:
"{\"order_book\":{\"asks\":[[\"0.06777\",\"0.00006744\"],[\"0.06778\",\"0.01475361\"], ... ]],\"bids\":[[\"0.06744491\",\"1.35\"],[\"0.06726258\",\"0.148585363\"], ...]],\"market_id\":\"ETH-BTC\"}}"
The "asks" key looks clearly present to me -- it's the first one nested under the "order_book" key.
The key is present, but it's wrapped inside another nested object, so you have to unwrap the outer object before you can parse the keys.
The smallest-diff way to do this is probably just inline:
instance FromJSON OrderBook where
parseJSON = withObject "order_book" $ \outer -> do
v <- outer .: "order_book"
OrderBook
<$> v .: "asks"
<*> v .: "bids"
<*> v .: "market_id"
Though you might want to consider introducing another wrapping type instead. This would really depend on the semantics of the data format you have.
I guess you were probably assuming that this is what withObject "order_book" would do, but that's not what it does. The first parameter of withObject is just a human-readable name of the object being parsed, used to create error messages. Customarily that parameter should name the type that is being parsed - i.e. withObject "OrderBook". See the docs.
Separately, I think your asks and bids fields are mistyped.
First, your JSON input looks like they are supposed to be arrays of tuples, but your Haskell type says doubly nested arrays of tuples. So this will fail to parse.
Second, your JSON input has strings as elements of those tuples, but your Haskell type says Float. This will also fail to parse.
The correct type, according to your JSON input, should be:
{ orderBook_asks :: [(String,String)]
, orderBook_bids :: [(String,String)]
Alternatively, if you really want the floats, you'll have to parse them from strings:
instance FromJSON OrderBook where
parseJSON = withObject "order_book" $ \outer -> do
v <- outer .: "order_book"
OrderBook
<$> (map parseTuple <$> v .: "asks")
<*> (map parseTuple <$> v .: "bids")
<*> v .: "market_id"
where
parseTuple (a, b) = (read a, read b)
(note that this ☝️ code is not to be copy&pasted: I'm using read for parsing strings into floats, which will crash at runtime if the strings are malformatted; in a real program you should use a better way of parsing)
withObject "order_book" does not look into the value at key "order_book". In fact, the "order_book" argument is ignored apart from appearing in the error message; actually you should have withObject "OrderBook" there.
All withObject does is confirm that what you have is an object. Then it proceeds using that object to look for the keys "asks", "bids" and "market_id" – but the only key that's there at this level is order_book.
The solution is to only use this parser with the {"asks":[["0.06777"...]...]...} object. The "order_book" key tells no information anyway, unless there are other keys present there as well. You can represent that outer object with another Haskell type and its own FromJSON instance.

Parsing an Options Object to a List of Options

I have been working on a small library against an JSON-based API. This library makes use of an "options" object, a series of key-value pairs that specify advanced behaviour:
{
"id": 1234
...
"options": {
"notify_no_data": True,
"no_data_timeframe": 20,
"notify_audit": False,
"silenced": {"*" :1428937807}
}
}
I've represented the options concept in Haskell using a list of options:
data Option = NotifyNoData NominalDiffTime -- notify after xyz
| NotifyAudit -- notify on changes
| Silenced (Maybe UTCTime) -- silence all notifications ("*") until xyz (or indefinitely)
newtype Options = Options [Option]
Implementing toJSON was simple enough; I instantiated toJSON for the Option type, and then used those as helpers for the Options type.
instance ToJSON Option where
toJSON (Silenced mtime) =
Object $ Data.HashMap.fromList [("silenced", mapping)]
where stamp = maybe Null (jsonTime . floor . utcTimeToPOSIXSeconds) mtime
mapping = Object $ Data.HashMap.singleton "*" stamp
toJSON (NotifyNoData difftime) =
Object $ Data.HashMap.fromList [("notify_no_data", Bool True)
,("no_data_timeframe", stamp)]
where stamp = jsonTime $ floor (difftime / 60)
toJSON NotifyAudit =
Object $ Data.HashMap.fromList [("notify_audit", Bool True)]
instance ToJSON Options where
toJSON (Options options) = Object $ Data.HashMap.unions $ reverse $ (opts:) $ map ((\(Object o) -> o) . toJSON) options
where opts = Data.HashMap.fromList [("silenced", Object Data.HashMap.empty)
,("notify_no_data", Bool False)
,("notify_audit", Bool False)]
The problem I am running into is in the fromJSON implementation for Options. All the use cases I've seen before supply a fairly simple json-object-representation to data-representation mapping. What I need to do is convert an object to an object of options to a list of data (Option) representations. For example, the sample JSON under "options" that I gave at the start would have to become:
Options [NotifyNoData 20, Silenced (Just (posixSecondsToUTCTime 1428937807))]
FromJSON requires an implementation of parseJSON :: FromJSON a => Value -> Parser a. I am having trouble understanding how to build a Parser using this optional object structure given by the API. Is there a standard approach to parsing a JSON object to a list like this? It may be that I am failing to fully understand the Parser typeclass.
You can maybe use the parser monad to extract all the information, something like:
parseJSON (Object v) = do
maybeNotify <- v .:? "notify_no_data"
maybeTimeFrame <- v .:? "no_data_timeframe"
let nots = case (maybeNotify,maybeTimeFrame) of
(Just True,Just stamp) -> [NotifyNoData $ fromJsontime stamp]
_ -> []
return $ Options nots
(fromJsontime is just a helper function to convert from the JSON value to your NominalDiffTime, I suppose you have something for that).
Do the same for the other types of options, concatenating the result.
Using the json-stream parser (which I am the author of), it would look like this (from my head, untested):
option = NotifyNoData <$> "no_data_timeframe" .: value
<* filterI id ("notify_no_data" .: bool)
<|> const NotifyAudit <$> filterI id ("notify_audit" .: bool)
<|> Silenced <$> "silenced" .: "*" .:? value
optionList = Options <$> toList option
The code expects you have FromJSON instance for UTCTime (or you could just use 'integer' and some kind of fromposixtime etc.).

Haskell aeson package basic usage

Working my way through Haskell and I'm trying to learn how to serialized to/from JSON.
I'm using aeson-0.8.0.2 & I'm stuck at basic decoding. Here's what I have:
file playground/aeson.hs:
{-# LANGUAGE OverloadedStrings #-}
import Data.Text
import Data.Aeson
data Person = Person
{ name :: Text
, age :: Int
} deriving Show
instance FromJSON Person where
parseJSON (Object v) = Person <$>
v .: "name" <*>
v .: "age"
parseJSON _ = mzero
main = do
let a = decode "{\"name\":\"Joe\",\"age\":12}" :: Maybe Person
print "aa"
ghc --make playground/aeson.hs yields:
[1 of 1] Compiling Main ( playground/aeson.hs,
playground/aeson.o )
playground/aeson.hs:13:35: Not in scope: `'
playground/aeson.hs:14:40: Not in scope: `<*>'
playground/aeson.hs:17:28: Not in scope: `mzero'
Any idea what I'm doing wrong?
Why is OverloadedString needed here?
Also, I have no idea what <$>, <*>, or mzero are supposed to mean; I'd appreciate tips on where I can read about any of these.
You need to import Control.Applicative and Control.Monad to get <$>, <*> and mzero. <$> just an infix operator for fmap, and <*> is the Applicative operator, you can think of it as a more generalized form of fmap for now. mzero is defined for the MonadPlus class, which is a class representing that a Monad has the operation
mplus :: m a -> m a -> m a
And a "monadic zero" element called mzero. The simplest example is for lists:
> mzero :: [Int]
[]
> [1, 2, 3] `mplus` [4, 5, 6]
[1, 2, 3, 4, 5, 6]
Here mzero is being used to represent a failure to parse. For looking up symbols in the future, I recommend using hoogle or FP Complete's version. Once you find the symbol, read the documentation, the source, and look around for examples of its use on the internet. You'll learn a lot by looking for it yourself, although it'll take you a little while to get used to this kind of research.
The OverloadedStrings extension is needed here because the Aeson library works with the Text type from Data.Text instead of the built-in String type. This extension lets you use string literals as Text instead of String, just as the numeric literal 0 can be an Int, Integer, Float, Double, Complex and other types. OverloadedStrings makes string literals have type Text.String.IsString s => s instead of just String, so it makes it easy to use alternate string-y types.
For <$> you need to import Control.Applicative and for mzero you need to import Control.Monad.
You can determine this by using the web-based version of hoogle (http://www.haskell.org/hoogle) or the command line version:
$ hoogle '<$>'
$ hoogle mzero

Haskell :: Aeson :: parse ADT based on field value

I'm using an external API which returns JSON responses. One of the responses is an array of objects and these objects are identified by the field value inside them. I'm having some trouble understanding how the parsing of such JSON response could be done with Aeson.
Here is a simplified version of my problem:
newtype Content = Content { content :: [Media] } deriving (Generic)
instance FromJSON Content
data Media =
Video { objectClass :: Text
, title :: Text } |
AudioBook { objectClass :: Text
, title :: Text }
In API documentation it is said that the object can be identified by the field objectClass which has value "video" for our Video object and "audiobook" for our AudioBook and so on. Example JSON:
[{objectClass: "video", title: "Some title"}
,{objectClass: "audiobook", title: "Other title"}]
The question is how can this type of JSON be approached using Aeson?
instance FromJSON Media where
parseJSON (Object x) = ???
You basically need a function Text -> Text -> Media:
toMedia :: Text -> Text -> Media
toMedia "video" = Video "video"
toMedia "audiobook" = AudioBook "audiobook"
The FromJSON instance is now really simple (using <$> and <*> from Control.Applicative):
instance FromJSON Media where
parseJSON (Object x) = toMedia <$> x .: "objectClass" <*> x .: "title"
However, at this point you're redundant: the objectClass field in Video or Audio doesn't give you more information than the actual type, so you might remove it:
data Media = Video { title :: Text }
| AudioBook { title :: Text }
toMedia :: Text -> Text -> Media
toMedia "video" = Video
toMedia "audiobook" = AudioBook
Also note that toMedia is partial. You probably want to catch invalid "objectClass" values:
instance FromJSON Media where
parseJSON (Object x) =
do oc <- x .: "objectClass"
case oc of
String "video" -> Video <$> x .: "title"
String "audiobook" -> AudioBook <$> x .: "title"
_ -> empty
{- an alternative using a proper toMedia
toMedia :: Alternative f => Text -> f (Text -> Media)
toMedia "video" = pure Video
toMedia "audiobook" = pure AudioBook
toMedia _ = empty
instance FromJSON Media where
parseJSON (Object x) = (x .: "objectClass" >>= toMedia) <*> x .: "title"
-}
And last, but not least, remember that valid JSON uses strings for the name.
The default translation for a data type like:
data Media = Video { title :: Text }
| AudioBook { title :: Text }
deriving Generic
is actually very close to what you want. (For the simplicity of my examples, I define ToJSON instances and encode the examples to see what kind of JSON we get.)
aeson, default
So, with the default instance we have (view the complete source file which produces this output):
[{"tag":"Video","title":"Some title"},{"tag":"AudioBook","title":"Other title"}]
Let's see whether we can get even closer with custom options...
aeson, custom tagFieldName
With custom options:
mediaJSONOptions :: Options
mediaJSONOptions =
defaultOptions{ sumEncoding =
TaggedObject{ tagFieldName = "objectClass"
-- , contentsFieldName = undefined
}
}
instance ToJSON Media
where toJSON = genericToJSON mediaJSONOptions
we get:
[{"objectClass":"Video","title":"Some title"},{"objectClass":"AudioBook","title":"Other title"}]
(Think yourself what you want to do with an undefined field in the real code.)
aeson, custom constructorTagModifier
Adding
, constructorTagModifier = fmap Char.toLower
to mediaJSONOptions gives:
[{"objectClass":"video","title":"Some title"},{"objectClass":"audiobook","title":"Other title"}]
Great! Exactly what you specified!
decoding
Simply add an instance with the same options to be able to decode from this format:
instance FromJSON Media
where parseJSON = genericParseJSON mediaJSONOptions
Example:
*Main> encode example
"[{\"objectClass\":\"video\",\"title\":\"Some title\"},{\"objectClass\":\"audiobook\",\"title\":\"Other title\"}]"
*Main> decode $ fromString "[{\"objectClass\":\"video\",\"title\":\"Some title\"},{\"objectClass\":\"audiobook\",\"title\":\"Other title\"}]" :: Maybe [Media]
Just [Video {title = "Some title"},AudioBook {title = "Other title"}]
*Main>
Complete source file.
generic-aeson, default
To get a more complete picture, let's also look at what generic-aeson package would give (at hackage). It has also nice default translations, different in some respects from those from aeson.
Doing
import Generics.Generic.Aeson -- from generic-aeson package
and defining:
instance ToJSON Media
where toJSON = gtoJson
gives the result:
[{"video":{"title":"Some title"}},{"audioBook":{"title":"Other title"}}]
So, it's different from all what we've seen when using aeson.
generic-aeson's options (Settings) are not interesting for us (they allow only to strip a prefix).
(The complete source file.)
aeson, ObjectWithSingleField
Apart from lower-casing the first letter of the constructor names, generic-aeson's translation seems similar to an option available in aeson:
Let's try this:
mediaJSONOptions =
defaultOptions{ sumEncoding = ObjectWithSingleField
, constructorTagModifier = fmap Char.toLower
}
and yes, the result is:
[{"video":{"title":"Some title"}},{"audiobook":{"title":"Other title"}}]
the rest of options: (aeson, TwoElemArray)
One available option for sumEncoding has been left out from consideration above, because it gives an array which is not quite similar to the JSON representation asked about. It's TwoElemArray. Example:
[["video",{"title":"Some title"}],["audiobook",{"title":"Other title"}]]
is given by:
mediaJSONOptions =
defaultOptions{ sumEncoding = TwoElemArray
, constructorTagModifier = fmap Char.toLower
}

Parsing JSON string into record in Haskell

I'm struggling to understand this (I'm still a bit new to Haskell) but I'm finding the documentation for the Text.JSON package to be a little confusing. Basically I have this data record type: -
data Tweet = Tweet
{
from_user :: String,
to_user_id :: String,
profile_image_url :: String,
created_at :: String,
id_str :: String,
source :: String,
to_user_id_str :: String,
from_user_id_str :: String,
from_user_id :: String,
text :: String,
metadata :: String
}
and I have some tweets in JSON format that conform to the structure of this type. The thing that I'm struggling with is how to map the above to what gets returned from the following code
decode tweet :: Result JSValue
into the above datatype. I understand that I'm supposed to create an instance of instance JSON Tweet but I don't know where to go from there.
Any pointers would be greatly appreciated, thanks!
I'd recommend that you use the new aeson package instead of the json package, as the former performs much better. Here's how you'd convert a JSON object to a Haskell record, using aeson:
{-# LANGUAGE OverloadedStrings #-}
module Example where
import Control.Applicative
import Control.Monad
import Data.Aeson
data Tweet = Tweet {
from_user :: String,
to_user_id :: String,
profile_image_url :: String,
created_at :: String,
id_str :: String,
source :: String,
to_user_id_str :: String,
from_user_id_str :: String,
from_user_id :: String,
text :: String,
metadata :: String
}
instance FromJSON Tweet where
parseJSON (Object v) =
Tweet <$> v .: "from_user"
<*> v .: "to_user_id"
<*> v .: "profile_image_url"
<*> v .: "created_at"
<*> v .: "id_str"
<*> v .: "source"
<*> v .: "to_user_id_str"
<*> v .: "from_user_id_str"
<*> v .: "from_user_id"
<*> v .: "text"
<*> v .: "metadata"
-- A non-Object value is of the wrong type, so use mzero to fail.
parseJSON _ = mzero
Then use Data.Aeson.json to get a attoparsec parser that converts a ByteString into a Value. The call fromJSON on the Value to attempt to parse it into your record. Note that there are two different parsers involved in these two steps, a Data.Attoparsec.Parser parser for converting the ByteString into a generic JSON Value and then a Data.Aeson.Types.Parser parser for converting the JSON value into a record. Note that both steps can fail:
The first parser can fail if the ByteString isn't a valid JSON value.
The second parser can fail if the (valid) JSON value doesn't contain one of the fields you mentioned in your fromJSON implementation.
The aeson package prefers the new Unicode type Text (defined in the text package) to the more old school String type. The Text type has a much more memory efficient representation than String and generally performs better. I'd recommend that you change the Tweet type to use Text instead of String.
If you ever need to convert between String and Text, use the pack and unpack functions defined in Data.Text. Note that such conversions require O(n) time, so avoid them as much as possible (i.e. always use Text).
You need to write a showJSON and readJSON method, for your type, that builds your Haskell values out of the JSON format. The JSON package will take care of parsing the raw string into a JSValue for you.
Your tweet will be a JSObject containing a map of strings, most likely.
Use show to look at the JSObject, to see how the fields are laid out.
You can lookup each field using get_field on the JSObject.
You can use fromJSString to get a regular Haskell strings from a JSString.
Broadly, you'll need something like,
{-# LANGUAGE RecordWildCards #-}
import Text.JSON
import Text.JSON.Types
instance JSON Tweet where
readJSON (JSObject o) = return $ Tweet { .. }
where from_user = grab o "from_user"
to_user_id = grab o "to_user_id"
profile_image_url = grab o "proile_image_url"
created_at = grab o "created_at"
id_str = grab o "id_str"
source = grab o "source"
to_user_id_str = grab o "to_user_id_str"
from_user_id_str = grab o "from_user_id_str"
from_user_id = grab o "from_user_id"
text = grab o "text"
metadata = grab o "metadata"
grab o s = case get_field o s of
Nothing -> error "Invalid field " ++ show s
Just (JSString s') -> fromJSString s'
Note, I'm using the rather cool wild cards language extension.
Without an example of the JSON encoding, there's not much more I can advise.
Related
You can find example instances for the JSON encoding via instances
in the source, for
simple types. Or in other packages that depend on json.
An instance for AUR messages is here, as a (low level) example.
Import Data.JSon.Generic and Data.Data, then add deriving (Data) to your record type, and then try using decodeJSON on the tweet.
I support the answer by #tibbe.
However, I would like to add How you check put some default value in case, the argument misses in the JSON provided.
In tibbe's answer you can do the following:
Tweet <$> v .: "from_user"
<*> v .:? "to_user_id" .!= "some user here"
<*> v .: "profile_image_url" .!= "url to image"
<*> v .: "created_at"
<*> v .: "id_str" != 232131
<*> v .: "source"
this will the dafault parameters to be taken while parsing the JSON.