Reading nested JSON data encoded as a nested string with Aeson - json

I have this weird JSON to parse containing nested JSON ... a string. So instead of
{\"title\": \"Lord of the rings\", \"author\": {\"666\": \"Tolkien\"}\"}"
I have
{\"title\": \"Lord of the rings\", \"author\": \"{\\\"666\\\": \\\"Tolkien\\\"}\"}"
Here's my (failed) attempt to parse the nested string using decode, inside an instance of FromJSON :
{-# LANGUAGE DeriveGeneric #-}
{-# LANGUAGE OverloadedStrings #-}
module Main where
import Data.Maybe
import GHC.Generics
import Data.Aeson
import qualified Data.Map as M
type Authors = M.Map Int String
data Book = Book
{
title :: String,
author :: Authors
}
deriving (Show, Generic)
decodeAuthors x = fromJust (decode x :: Maybe Authors)
instance FromJSON Book where
parseJSON = withObject "Book" $ \v -> do
t <- v .: "title"
a <- decodeAuthors <?> v .: "author"
return $ Book t a
jsonTest = "{\"title\": \"Lord of the rings\", \"author\": \"{\\\"666\\\": \\\"Tolkien\\\"}\"}"
test = decode jsonTest :: Maybe Book
Is there a way to decode the whole JSON in a single pass ? Thanks !

A couple problems here.
First, your use of <?> is nonsensical. I'm going to assume it's a typo, and what you actually meant was <$>.
Second, the type of decodeAuthors is ByteString -> Authors, which means its parameter is of type ByteString, which means that the expression v .: "author" must be of type Parser ByteString, which means that there must be an instance FromJSON ByteString, but such instance doesn't exists (for reasons that escape me at the moment).
What you actually want is for v .: "author" to return a Parser String (or perhaps Parser Text), and then have decodeAuthors accept a String and convert it to ByteString (using pack) before passing to decode:
import Data.ByteString.Lazy.Char8 (pack)
decodeAuthors :: String -> Authors
decodeAuthors x = fromJust (decode (pack x) :: Maybe Authors)
(also note: it's a good idea to give you declarations type signatures that you think they should have. This lets the compiler point out errors earlier)
Edit:
As #DanielWagner correctly points out, pack may garble Unicode text. If you want to handle it correctly, use Data.ByteString.Lazy.UTF8.fromString from utf8-string to do the conversion:
import Data.ByteString.Lazy.UTF8 (fromString)
decodeAuthors :: String -> Authors
decodeAuthors x = fromJust (decode (fromString x) :: Maybe Authors)
But in that case you should also be careful about the type of jsonTest: the way your code is written, its type would be ByteString, but any non-ASCII characters that may be inside would be cut off because of the way IsString works. To preserve them, you need to use the same fromString on it:
jsonTest = fromString "{\"title\": \"Lord of the rings\", \"author\": \"{\\\"666\\\": \\\"Tolkien\\\"}\"}"

Related

Haskell Aeson json encoding bytestrings

I need to serialize a record in Haskell, and am trying to do it with Aeson. The problem is that some of the fields are ByteStrings, and I can't work out from the examples how to encode them. My idea is to first convert them to text via base64. Here is what I have so far (I put 'undefined' where I didn't know what to do):
{-# LANGUAGE DeriveGeneric #-}
{-# LANGUAGE OverloadedStrings #-}
module Main where
import qualified Data.Aeson as J
import qualified Data.ByteString as B
import qualified Data.ByteString.Base64 as B64
import qualified Data.Text as T
import qualified Data.Text.Encoding as E
import qualified GHC.Generics as G
data Data = Data
{ number :: Int
, bytestring :: B.ByteString
} deriving (G.Generic, Show)
instance J.ToJSON Data where
toEncoding = J.genericToEncoding J.defaultOptions
instance J.FromJSON Data
instance J.FromJSON B.ByteString where
parseJSON = undefined
instance J.ToJSON B.ByteString where
toJSON = undefined
byteStringToText :: B.ByteString -> T.Text
byteStringToText = E.decodeUtf8 . B64.encode
textToByteString :: T.Text -> B.ByteString
textToByteString txt =
case B64.decode . E.encodeUtf8 $ txt of
Left err -> error err
Right bs -> bs
encodeDecode :: Data -> Maybe Data
encodeDecode = J.decode . J.encode
main :: IO ()
main = print $ encodeDecode $ Data 1 "A bytestring"
It would be good if it was not necessary to manually define new instances of ToJSON and FromJSON for every record, because I have quite a few different records with bytestrings in them.
parseJson needs to return a value of type Parser B.ByteString, so you just need to call pure on the return value of B64.decode.
import Control.Monad
-- Generalized to any MonadPlus instance, not just Either String
textToByteString :: MonadPlus m => T.Text -> m B.ByteString
textToByteString = case B64.decode (E.encodeUtf8 x) of
Left _ -> mzero
Right bs -> pure bs
instance J.FromJSON B.ByteString where
parseJSON (J.String x) = textToByteString x
parseJSON _ = mzero
Here, I've chosen to return mzero both if you try to decode anything other than a JSON string and if there is a problem with the base-64 decoding.
Likewise, toJSON needs just needs to encode the Text value you create from the base64-encoded ByteString.
instance J.ToJSON B.ByteString where
toJSON = J.toJSON . byteStringToText
You might want to consider using a newtype wrapper instead of defining the ToJSON and FromJSON instances on B.ByteString directly.

Trouble with JSON (Data.Aeson)

I'm new to Haskell and in order to learn the language I am working on a project that involves dealing with JSON. I am currently getting the feeling Haskell is the wrong language for the job, but that isn't the point here.
I've been struggling to understand how this works for a few days. I have searched and everything I have found does not seem to work. Here's the issue:
I have some JSON in the following format:
>>>less "path/to/json"
{
"stringA1_stringA2": {"stringA1":floatA1,
"stringA2":foatA2},
"stringB1_stringB2": {"stringB1":floatB1,
"stringB2":floatB2}
...
}
Here floatX1 and floatX2 are actually strings of the form "0.535613567", "1.221362183" etc. What I want to do is parse this into the following data
data Mydat = Mydat { name :: String, num :: Float} deriving (Show)
where name would correspond to "stringX1_stringX2" and num to floatX1 for X = A,B,...
So far I have reached a 'solution' which feels fairly hackish and convoluted and doesn't work properly.
{-# LANGUAGE OverloadedStrings #-}
{-# LANGUAGE DeriveGeneric #-}
import Data.Functor
import Data.Monoid
import Data.Aeson
import Data.List
import Data.Text
import Data.Map (Map)
import qualified Data.HashMap.Strict as DHM
--import qualified Data.HashMap as DHM
import qualified Data.ByteString.Lazy as LBS
import System.Environment
import GHC.Generics
import Text.Read
data Mydat = Mydat {name :: String, num :: Float} deriving (Show)
test s = do
d <- LBS.readFile s
let v = decode d :: Maybe (DHM.HashMap String Object)
case v of
-- Just v -> print v
Just v -> return $ Prelude.map dataFromList $ DHM.toList $ DHM.map (DHM.lookup "StringA1") v
good = ['1','2','3','4','5','6','7','8','9','0','.']
f x = elem x good
dataFromList :: (String, Maybe Value) -> Mydat
dataFromList (a,b) = Mydat a (read (Prelude.filter f (show b)) :: Float)
Now I can compile this and run
test "path/to/json"
in ghci and it prints a list of Mydat's in the case where "stringX1"="stringA1" for all X. In reality there are two values for "stringX1" so aside from the hackyness this is not satisfactory. There must be a better way to do this. I get that I need to write my own parser probably but I am confused about how this works so any suggestions would be great. Thanks in advance.
The structure of your JSON is pretty nasty, but here's a basic working solution:
#!/usr/bin/env stack
-- stack --resolver lts-11.5 script --package containers --package aeson
{-# LANGUAGE OverloadedStrings #-}
import qualified Data.Map as Map
import qualified Data.Aeson as Aeson
data Mydat = Mydat { name :: String
, num :: Float
} deriving (Show)
instance Eq Mydat where
(Mydat _ x1) == (Mydat _ x2) = x1 == x2
instance Ord Mydat where
(Mydat _ x1) `compare` (Mydat _ x2) = x1 `compare` x2
type MydatRaw = Map.Map String (Map.Map String String)
processRaw :: MydatRaw -> [Mydat]
processRaw = Map.foldrWithKey go []
where go key value accum =
accum ++ (Mydat key . read <$> Map.elems value)
main :: IO ()
main =
do let json = "{\"stringA1_stringA2\":{\"stringA1\":\"0.1\",\"stringA2\":\"0.2\"}}"
print $ fmap processRaw (Aeson.eitherDecode json)
Note that read is partial and generally not a good idea. But I'll leave it to you to flesh out a safer version :)
As I commented, the best thing would probably be to make your JSON file well-formed in the sense that the float fields should really be floats, not strings.
If that's not an option, I would recommend you phrase out the type that the JSON file seems to represent as simple as possible (but without dynamic Objects), and then convert that to the type you actually want.
import Data.Map (Map)
import qualified Data.Map as Map
type GarbledJSON = Map String (Map String String)
-- ^ you could also stick with hash maps for everything, but
-- usually `Map` is actually more sensible in Haskell.
data MyDat = MyDat {name :: String, num :: Float} deriving (Show)
test :: FilePath -> IO [MyDat]
test s = do
d <- LBS.readFile s
case decode d :: Maybe GarbledJSON of
Just v -> return [ MyDat iName ( read . filter (`elem`good)
$ iVals Map.! valKey )
| (iName, iVals) <- Map.toList v
, let valKey = takeWhile (/='_') iName ]
Note that this will crash completely if any of the items don't contain the first part of the name as a string of float format, and likely give bogus items when you filter out characters that aren't good. If you just want to ignore any malformed items (which is also not a very clean approach...), you can do it this way:
test :: FilePath -> IO [MyDat]
test s = do
d <- LBS.readFile s
return $ case decode d :: Maybe GarbledJSON of
Just v -> [ MyDat iName iVal
| (iName, iVals) <- Map.toList v
, let valKey = takeWhile (/='_') iName
, Just iValStr <- [iVals Map.!? valKey]
, [(iVal,"")] <- [reads iValStr] ]
Nothing -> []

Trying to read JSON from file using Aeson

Here's my code:
import Data.Aeson
import Control.Applicative
import Control.Monad
import Data.Text
import GHC.Generics
import qualified Data.ByteString.Lazy as B
data JSON' =
JSON' {
foo :: !Text,
int :: Int
} deriving (Show, Generic)
instance FromJSON JSON'
instance ToJSON JSON'
jsonFile :: FilePath
jsonFile = "test.json"
getJSON :: IO B.ByteString
getJSON = B.readFile jsonFile
main :: IO ()
main = do
-- Get JSON data and decode it
d <- (eitherDecode <$> getJSON) :: IO (Either String [JSON'])
-- If d is Left, the JSON was malformed.
-- In that case, we report the error.
-- Otherwise, we perform the operation of
-- our choice. In this case, just print it.
case d of
Left err -> putStrLn err
Right ps -> print ps
test.json looks liket his:
-- test.json
{
"foo": "bar",
"int": 1
}
When I run this code I get this error:
Can't make a derived instance of ‘Generic JSON'’:
You need DeriveGeneric to derive an instance for this class
In the data declaration for ‘JSON'’
So far the documentation for Aeson, like all documentation on Hackage, is not helpful at all. I have no idea what I'm doing wrong. So far it seems like I'm reading a file into a bytestring, transforming it into a "tree" like data structure, and then printing one leaf per node. My code is straight from this link
What am I doing wrong?
UPDATE
After adding the language extension declaration to the top of the file
{-# LANGUAGE DeriveGeneric #-}
I'm getting this error:
Error in $: expected [a], encountered Object
Not sure what this means.
you need to put {-# LANGUAGE DeriveGeneric #-} on the top of your file.
On a side note the documentation of aeson says
instance ToJSON Person where
-- ...
toEncoding = genericToEncoding defaultOptions
instance FromJSON Person
-- No need to provide a parseJSON implementation.
I think you probably also forgot the toEncoding line in your ToJSON.
Add this to the top of the file
{-# LANGUAGE DeriveGeneric #-}
The ability to use derive (Generic) is a language extension, and you have to tell GHC that you want this to be turned on.

Haskell aeson package basic usage

Working my way through Haskell and I'm trying to learn how to serialized to/from JSON.
I'm using aeson-0.8.0.2 & I'm stuck at basic decoding. Here's what I have:
file playground/aeson.hs:
{-# LANGUAGE OverloadedStrings #-}
import Data.Text
import Data.Aeson
data Person = Person
{ name :: Text
, age :: Int
} deriving Show
instance FromJSON Person where
parseJSON (Object v) = Person <$>
v .: "name" <*>
v .: "age"
parseJSON _ = mzero
main = do
let a = decode "{\"name\":\"Joe\",\"age\":12}" :: Maybe Person
print "aa"
ghc --make playground/aeson.hs yields:
[1 of 1] Compiling Main ( playground/aeson.hs,
playground/aeson.o )
playground/aeson.hs:13:35: Not in scope: `'
playground/aeson.hs:14:40: Not in scope: `<*>'
playground/aeson.hs:17:28: Not in scope: `mzero'
Any idea what I'm doing wrong?
Why is OverloadedString needed here?
Also, I have no idea what <$>, <*>, or mzero are supposed to mean; I'd appreciate tips on where I can read about any of these.
You need to import Control.Applicative and Control.Monad to get <$>, <*> and mzero. <$> just an infix operator for fmap, and <*> is the Applicative operator, you can think of it as a more generalized form of fmap for now. mzero is defined for the MonadPlus class, which is a class representing that a Monad has the operation
mplus :: m a -> m a -> m a
And a "monadic zero" element called mzero. The simplest example is for lists:
> mzero :: [Int]
[]
> [1, 2, 3] `mplus` [4, 5, 6]
[1, 2, 3, 4, 5, 6]
Here mzero is being used to represent a failure to parse. For looking up symbols in the future, I recommend using hoogle or FP Complete's version. Once you find the symbol, read the documentation, the source, and look around for examples of its use on the internet. You'll learn a lot by looking for it yourself, although it'll take you a little while to get used to this kind of research.
The OverloadedStrings extension is needed here because the Aeson library works with the Text type from Data.Text instead of the built-in String type. This extension lets you use string literals as Text instead of String, just as the numeric literal 0 can be an Int, Integer, Float, Double, Complex and other types. OverloadedStrings makes string literals have type Text.String.IsString s => s instead of just String, so it makes it easy to use alternate string-y types.
For <$> you need to import Control.Applicative and for mzero you need to import Control.Monad.
You can determine this by using the web-based version of hoogle (http://www.haskell.org/hoogle) or the command line version:
$ hoogle '<$>'
$ hoogle mzero

Parsing JSON string into record in Haskell

I'm struggling to understand this (I'm still a bit new to Haskell) but I'm finding the documentation for the Text.JSON package to be a little confusing. Basically I have this data record type: -
data Tweet = Tweet
{
from_user :: String,
to_user_id :: String,
profile_image_url :: String,
created_at :: String,
id_str :: String,
source :: String,
to_user_id_str :: String,
from_user_id_str :: String,
from_user_id :: String,
text :: String,
metadata :: String
}
and I have some tweets in JSON format that conform to the structure of this type. The thing that I'm struggling with is how to map the above to what gets returned from the following code
decode tweet :: Result JSValue
into the above datatype. I understand that I'm supposed to create an instance of instance JSON Tweet but I don't know where to go from there.
Any pointers would be greatly appreciated, thanks!
I'd recommend that you use the new aeson package instead of the json package, as the former performs much better. Here's how you'd convert a JSON object to a Haskell record, using aeson:
{-# LANGUAGE OverloadedStrings #-}
module Example where
import Control.Applicative
import Control.Monad
import Data.Aeson
data Tweet = Tweet {
from_user :: String,
to_user_id :: String,
profile_image_url :: String,
created_at :: String,
id_str :: String,
source :: String,
to_user_id_str :: String,
from_user_id_str :: String,
from_user_id :: String,
text :: String,
metadata :: String
}
instance FromJSON Tweet where
parseJSON (Object v) =
Tweet <$> v .: "from_user"
<*> v .: "to_user_id"
<*> v .: "profile_image_url"
<*> v .: "created_at"
<*> v .: "id_str"
<*> v .: "source"
<*> v .: "to_user_id_str"
<*> v .: "from_user_id_str"
<*> v .: "from_user_id"
<*> v .: "text"
<*> v .: "metadata"
-- A non-Object value is of the wrong type, so use mzero to fail.
parseJSON _ = mzero
Then use Data.Aeson.json to get a attoparsec parser that converts a ByteString into a Value. The call fromJSON on the Value to attempt to parse it into your record. Note that there are two different parsers involved in these two steps, a Data.Attoparsec.Parser parser for converting the ByteString into a generic JSON Value and then a Data.Aeson.Types.Parser parser for converting the JSON value into a record. Note that both steps can fail:
The first parser can fail if the ByteString isn't a valid JSON value.
The second parser can fail if the (valid) JSON value doesn't contain one of the fields you mentioned in your fromJSON implementation.
The aeson package prefers the new Unicode type Text (defined in the text package) to the more old school String type. The Text type has a much more memory efficient representation than String and generally performs better. I'd recommend that you change the Tweet type to use Text instead of String.
If you ever need to convert between String and Text, use the pack and unpack functions defined in Data.Text. Note that such conversions require O(n) time, so avoid them as much as possible (i.e. always use Text).
You need to write a showJSON and readJSON method, for your type, that builds your Haskell values out of the JSON format. The JSON package will take care of parsing the raw string into a JSValue for you.
Your tweet will be a JSObject containing a map of strings, most likely.
Use show to look at the JSObject, to see how the fields are laid out.
You can lookup each field using get_field on the JSObject.
You can use fromJSString to get a regular Haskell strings from a JSString.
Broadly, you'll need something like,
{-# LANGUAGE RecordWildCards #-}
import Text.JSON
import Text.JSON.Types
instance JSON Tweet where
readJSON (JSObject o) = return $ Tweet { .. }
where from_user = grab o "from_user"
to_user_id = grab o "to_user_id"
profile_image_url = grab o "proile_image_url"
created_at = grab o "created_at"
id_str = grab o "id_str"
source = grab o "source"
to_user_id_str = grab o "to_user_id_str"
from_user_id_str = grab o "from_user_id_str"
from_user_id = grab o "from_user_id"
text = grab o "text"
metadata = grab o "metadata"
grab o s = case get_field o s of
Nothing -> error "Invalid field " ++ show s
Just (JSString s') -> fromJSString s'
Note, I'm using the rather cool wild cards language extension.
Without an example of the JSON encoding, there's not much more I can advise.
Related
You can find example instances for the JSON encoding via instances
in the source, for
simple types. Or in other packages that depend on json.
An instance for AUR messages is here, as a (low level) example.
Import Data.JSon.Generic and Data.Data, then add deriving (Data) to your record type, and then try using decodeJSON on the tweet.
I support the answer by #tibbe.
However, I would like to add How you check put some default value in case, the argument misses in the JSON provided.
In tibbe's answer you can do the following:
Tweet <$> v .: "from_user"
<*> v .:? "to_user_id" .!= "some user here"
<*> v .: "profile_image_url" .!= "url to image"
<*> v .: "created_at"
<*> v .: "id_str" != 232131
<*> v .: "source"
this will the dafault parameters to be taken while parsing the JSON.