Reading CSV as custom data types - csv

I have a CSV file at some filePath with two columns and no headers:
john,304
sarah,300
...
I have been able to read the csv as such:
import Data.Csv as Csv
import Data.ByteString.Lazy as BL
import Data.Vector as V
...
results <- fmap V.toList . Csv.decode @(String,Integer) Csv.NoHeader <$> BL.readFile filePath
-- Right [("john",304),("sarah",300),...]
If I have custom data type for the csv columns as such:
data PersonnelData = PersonnelData
{ name :: !String
, amount :: !Integer
} deriving (Show, Generic, Csv.FromRecord)
How can I modify the above to decode / read the file for this data type?

Where you have this:
Csv.decode @(String,Integer)
You are using a visible type application to explicitly tell Csv.decode that its first type parameter should be the type (String, Integer). Let's have a look at the signature for decode to see what that means:
decode :: FromRecord a
       => HasHeader
       -> ByteString
       -> Either String (Vector a)
There's only one type parameter, and it's pretty clear from where it appears (in the output) that it is the type that decode is decoding into. So Csv.decode @(String,Integer) is a function that explicitly decodes CSV records to (String, Integer).
So the only change you need to make is to tell it explicitly that you want to decode to PersonnelData instead of (String, Integer): just use Csv.decode @PersonnelData. (You need a FromRecord instance, but you have already provided one by deriving it.)
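Putting the pieces together, a minimal sketch could look like the following. It assumes the cassava, bytestring and vector packages plus the DeriveGeneric and DeriveAnyClass extensions; decodePersonnel, readPersonnel and sampleCsv are names made up for illustration, not part of any library.

```haskell
{-# LANGUAGE DeriveAnyClass #-}
{-# LANGUAGE DeriveGeneric #-}
{-# LANGUAGE TypeApplications #-}

import qualified Data.ByteString.Lazy as BL
import qualified Data.ByteString.Lazy.Char8 as BLC
import qualified Data.Csv as Csv
import qualified Data.Vector as V
import GHC.Generics (Generic)

data PersonnelData = PersonnelData
  { name :: !String
  , amount :: !Integer
  } deriving (Show, Eq, Generic, Csv.FromRecord)

-- Pure decoding step (hypothetical helper): the type application picks
-- the record type, and fields are matched by position (name, then amount).
decodePersonnel :: BL.ByteString -> Either String [PersonnelData]
decodePersonnel = fmap V.toList . Csv.decode @PersonnelData Csv.NoHeader

-- Same thing, reading the bytes from a file first.
readPersonnel :: FilePath -> IO (Either String [PersonnelData])
readPersonnel filePath = decodePersonnel <$> BL.readFile filePath

-- Illustrative input mirroring the first two rows of the question.
sampleCsv :: BL.ByteString
sampleCsv = BLC.pack "john,304\nsarah,300\n"
```

Keeping the pure decodePersonnel separate from the file read makes it easy to try the decoder in GHCi with decodePersonnel sampleCsv.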

Related

Can aeson handle JSON with imprecise types?

I have to deal with JSON from a service that sometimes gives me "123" instead of 123 as the value of a field. Of course this is ugly, but I cannot change the service. Is there an easy way to derive an instance of FromJSON that can handle this? The standard instances derived by means of deriveJSON (https://hackage.haskell.org/package/aeson-1.5.4.1/docs/Data-Aeson-TH.html) cannot do that.
One low-hanging (although perhaps not so elegant) option is to define the property as an Aeson Value. Here's an example:
{-# LANGUAGE DeriveGeneric #-}
module Q65410397 where
import GHC.Generics
import Data.Aeson
data JExample = JExample { jproperty :: Value } deriving (Eq, Show, Generic)
instance ToJSON JExample where
instance FromJSON JExample where
Aeson can decode a JSON value with a number:
*Q65410397> decode "{\"jproperty\":123}" :: Maybe JExample
Just (JExample {jproperty = Number 123.0})
It also works if the value is a string:
*Q65410397> decode "{\"jproperty\":\"123\"}" :: Maybe JExample
Just (JExample {jproperty = String "123"})
Granted, by defining the property as Value this means that at the Haskell side, it could also hold arrays and other objects, so you should at least have a path in your code that handles that. If you're absolutely sure that the third-party service will never give you, say, an array in that place, then the above isn't the most elegant solution.
On the other hand, if it gives you both 123 and "123", there's already some evidence that maybe you shouldn't trust the contract to be well-typed...
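If you go the Value route, you still need a step that turns the Value into a number at some point. A rough sketch of such a step, assuming the scientific and text packages (both dependencies of aeson); valueToInt is a hypothetical helper name:

```haskell
import Data.Aeson (Value (..))
import Data.Scientific (toBoundedInteger)
import qualified Data.Text as T
import Data.Text.Read (decimal)

-- Accept either a JSON number or a JSON string holding an unsigned
-- decimal integer; everything else (arrays, objects, null, ...) is rejected.
valueToInt :: Value -> Maybe Int
valueToInt (Number n) = toBoundedInteger n -- Nothing for 1.5 or out-of-range values
valueToInt (String s) =
  case decimal s of
    Right (i, rest) | T.null rest -> Just i -- no trailing garbage allowed
    _ -> Nothing
valueToInt _ = Nothing
```

This is the path that handles the "other" shapes mentioned above: anything unexpected simply becomes Nothing. Note that decimal does not accept a sign; Data.Text.Read.signed decimal would.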
Assuming you want to avoid writing FromJSON instances by hand as much as possible, perhaps you could define a newtype over Int with a hand-crafted FromJSON instance—just for handling that oddly parsed field:
{-# LANGUAGE TypeApplications #-}
import Control.Applicative
import Data.Aeson
import Data.Text (Text)
import Data.Text.Read (decimal)
newtype SpecialInt = SpecialInt { getSpecialInt :: Int } deriving (Show, Eq, Ord)
instance FromJSON SpecialInt where
    parseJSON v =
        let fromInt = parseJSON @Int v
            fromStr = do
                str <- parseJSON @Text v
                case decimal str of
                    Right (i, _) -> pure i
                    Left errmsg -> fail errmsg
         in SpecialInt <$> (fromInt <|> fromStr)
You could then derive FromJSON for records which have a SpecialInt as a field.
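For instance, a self-contained sketch of that approach might look like this. The Reading record and its field names are invented for illustration; the SpecialInt instance is the one above, and the record's own instance comes from aeson's Generic defaults.

```haskell
{-# LANGUAGE DeriveGeneric #-}
{-# LANGUAGE OverloadedStrings #-}
{-# LANGUAGE TypeApplications #-}

import Control.Applicative ((<|>))
import Data.Aeson (FromJSON (..), decode)
import Data.Text (Text)
import Data.Text.Read (decimal)
import GHC.Generics (Generic)

newtype SpecialInt = SpecialInt { getSpecialInt :: Int }
  deriving (Show, Eq, Ord)

-- Try the field as a number first, then as a string of digits.
instance FromJSON SpecialInt where
  parseJSON v =
    let fromInt = parseJSON @Int v
        fromStr = do
          str <- parseJSON @Text v
          case decimal str of
            Right (i, _) -> pure i
            Left errmsg -> fail errmsg
     in SpecialInt <$> (fromInt <|> fromStr)

-- Hypothetical record; only the oddly encoded field uses SpecialInt.
data Reading = Reading
  { sensor :: String
  , value :: SpecialInt
  } deriving (Show, Eq, Generic)

-- Empty instance body: the Generic-based default does the work.
instance FromJSON Reading

-- Both encodings of the field go through the same instance.
examples :: [Maybe Reading]
examples =
  [ decode "{\"sensor\":\"a\",\"value\":123}"
  , decode "{\"sensor\":\"a\",\"value\":\"123\"}"
  ]
```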
Making the field a SpecialInt instead of an Int only for the sake of the FromJSON instance feels a bit intrusive though. "Needs to be parsed in an odd way" is a property of the external format, not of the domain.
In order to avoid this awkwardness and keep our domain types clean, we need a way to tell GHC: "hey, when deriving the FromJSON instance for my domain type, please treat this field as if it were a SpecialInt, but return an Int at the end". That is, we want to deal with SpecialInt only when deserializing. This can be done using the "generic-data-surgery" library.
Consider this type
{-# LANGUAGE DeriveGeneric #-}
import GHC.Generics
data User = User { name :: String, age :: Int } deriving (Show,Generic)
and imagine we want to parse "age" as if it were a SpecialInt. We can do it like this:
{-# LANGUAGE DataKinds #-}
import Generic.Data.Surgery (toOR', modifyRField, fromOR, Data)
instance FromJSON User where
    parseJSON v = do
        r <- genericParseJSON defaultOptions v
        -- r is a synthetic Data which we must tweak in the OR and convert to User
        let surgery = fromOR . modifyRField @"age" @1 getSpecialInt . toOR'
        pure (surgery r)
Putting it to work:
{-# LANGUAGE OverloadedStrings #-}
main :: IO ()
main = do
    print $ eitherDecode' @User $ "{ \"name\" : \"John\", \"age\" : \"123\" }"
    print $ eitherDecode' @User $ "{ \"name\" : \"John\", \"age\" : 123 }"
One limitation is that "generic-data-surgery" works by tweaking Generic representations, so this technique won't work with deserializers generated using Template Haskell.

How to use Haskell "json" package to parse to type [Map String String]?

I've got some sample JSON data like this:
[{
"File:FileSize": "104 MB",
"File:FileModifyDate": "2015:04:11 10:39:00-07:00",
"File:FileAccessDate": "2016:01:17 22:37:23-08:00",
"File:FileInodeChangeDate": "2015:04:26 07:50:50-07:00"
}]
and I'm trying to parse the data using the json package (not aeson):
import qualified Data.Map.Lazy as M
import Text.JSON
content <- readFile "file.txt"
decode content :: Result [M.Map String String]
This gives me an error:
Error "readJSON{Map}: unable to parse array value"
I can get as far as this:
fmap
(map (M.fromList . fromJSObject))
(decode content :: Result [JSObject String])
but it seems like an awfully manual way to do it. Surely the JSON data could be parsed directly into a type [Map String String]. Pointers?
Without the MAP_AS_DICT switch, the JSON (Map a b) instance is:
instance (Ord a, JSON a, JSON b) => JSON (M.Map a b) where
    showJSON = encJSArray M.toList
    readJSON = decJSArray "Map" M.fromList
So only a JSON array can be parsed into a Data.Map; for anything else it calls mkError and the decode fails with an Error.
Due to Haskell's restriction on instances, you won't be able to write an instance for JSON (Map a b) yourself, so your current workaround may be the best solution.
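If the workaround is needed in more than one place, it can at least be wrapped up once. A small sketch, assuming the json and containers packages; decodeMaps is a made-up name:

```haskell
import qualified Data.Map.Lazy as M
import Text.JSON (JSObject, Result (..), decode, fromJSObject)

-- Decode a JSON array of objects and convert each object to a Map.
-- This only packages the manual conversion from the question.
decodeMaps :: String -> Result [M.Map String String]
decodeMaps input = map (M.fromList . fromJSObject) <$> objects
  where
    objects :: Result [JSObject String]
    objects = decode input
```

With that in place, the call site reduces to decodeMaps content.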

Finding a value in ByteString (which is actually JSON)

A web service returns a response as ByteString
req <- parseUrl "https://api.example.com"
res <- withManager $ httpLbs $ configReq req
case (HashMap.lookup "result" $ responseBody res) of .... -- error - responseBody returns ByteString
where
configReq r = --......
To be more specific, responseBody returns data in ByteString, although it's actually valid JSON. I need to find a value in it. Obviously, it would be easier to find it if it was JSON and not ByteString.
If that's the case, how do I convert it to JSON?
UPDATE:
decode $ responseBody resp :: IO (Either String Aeson.Value)
error:
Couldn't match expected type `IO (Either String Value)'
with actual type `Maybe a0'
You'll find several resources for converting bytestring to JSON. The simplest use cases are on the hackage page itself, and the rest you can infer using type signatures of the entities involved.
https://hackage.haskell.org/package/aeson-0.7.0.6/docs/Data-Aeson.html
But here's a super quick intro to JSON with Aeson:
In most languages, you have things like this:
someString = '{ "name" : ["value1", 2] }'
aDict = json.loads(someString)
This is obviously great, because JSON has a nearly one to one mapping with a fundamental data-structure of the language. Containers in most dynamic languages can contain values of any type, and so moving from JSON to data structure is a single step.
However, that is not the case with Haskell. You can't put values of arbitrary types into a container type such as a list or a dictionary.
So Aeson does a neat thing. It defines an intermediate Haskell type for you, that maps directly to JSON.
The fundamental unit in Aeson is a Value. A Value can contain many things, such as a number, a string, an array, or an object.
https://hackage.haskell.org/package/aeson-0.7.0.6/docs/Data-Aeson.html#t:Value
An aeson array is a Vector (like a list but better) of Values, and an aeson object is a HashMap of Text to Values.
The next interesting step is that you can define functions that will convert an Aeson value to your Haskell type. This completes the loop. ByteString to Value to a custom type.
So all you do is implement parseJSON and toJSON functions that convert aeson Values to your type and vice-versa. The bit that converts a bytestring into a valid aeson value is implemented by aeson. So the heavy lifting is all done.
One important note: aeson's decode functions take a lazy ByteString, so you might need some helpers to convert to and from it.
stringToLazy :: String -> ByteString
stringToLazy x = Data.ByteString.Lazy.fromChunks [Data.ByteString.Char8.pack x]
lazyToString :: ByteString -> String
lazyToString x = Data.ByteString.Char8.unpack $ Data.ByteString.Char8.concat $ Data.ByteString.Lazy.toChunks x
That should be enough to get started with Aeson.
--
Common decoding functions with Aeson:
decode :: ByteString -> Maybe YourType
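eitherDecode :: ByteString -> Either String YourType
In your case, you're looking for eitherDecode, annotated as Either String Aeson.Value rather than IO (Either String Aeson.Value), since decoding is a pure function.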
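As a rough sketch of the lookup from the question: decode the body to a Value, then pull out the "result" key. It goes through withObject and (.:) rather than HashMap.lookup so that it does not depend on how a given aeson version represents objects; lookupResult is a hypothetical name, and body stands for whatever responseBody returns.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Data.Aeson (Value (..), eitherDecode, withObject, (.:))
import Data.Aeson.Types (parseEither)
import qualified Data.ByteString.Lazy as BL

-- Left carries an error message if the body is not valid JSON,
-- is not a JSON object, or has no "result" key.
lookupResult :: BL.ByteString -> Either String Value
lookupResult body = do
  json <- eitherDecode body
  parseEither (withObject "response" (\obj -> obj .: "result")) json
```

In the question's code this would be used as case lookupResult (responseBody res) of ..., with no IO involved in the decoding itself.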

FromJSON custom for custom type

The newest version of Data.Aeson changed the way that ToJSON and FromJSON work for simple types like:
data Permission = Read | Write
It used to be that the generic call:
instance ToJSON Permission where
...Would create JSON that looked like {"Read":[]} or {"Write":[]}.
But now it creates:
{tag:"Read",contents:"[]"}
Which makes sense but breaks code I have written. I wrote a toJSON part by hand to give the correct looking stuff but writing the fromJSON is confusing me.
Any ideas?
Thanks
You can control how a datatype whose constructors are all nullary is encoded with the allNullaryToStringTag field of Options (exported from Data.Aeson.Types). Set it to True and each constructor is encoded simply as a string.
{-# LANGUAGE TemplateHaskell #-}
import Data.Aeson.TH (deriveToJSON)
import Data.Aeson.Types (Options (..), defaultOptions)
data Permission = Read | Write
$(deriveToJSON (defaultOptions {allNullaryToStringTag = True}) ''Permission)
Take a look at the Options definition; it contains other handy fields.
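The same option can also be used without Template Haskell, through aeson's Generic-based functions, which gives you a matching FromJSON for free. A minimal sketch, assuming DeriveGeneric; permissionOptions and roundTrip are names invented here:

```haskell
{-# LANGUAGE DeriveGeneric #-}

import Data.Aeson
  ( FromJSON (..)
  , Options (..)
  , Result (..)
  , ToJSON (..)
  , defaultOptions
  , fromJSON
  , genericParseJSON
  , genericToJSON
  )
import GHC.Generics (Generic)

data Permission = Read | Write
  deriving (Show, Eq, Generic)

-- One shared Options value keeps the encoder and the decoder in agreement.
permissionOptions :: Options
permissionOptions = defaultOptions { allNullaryToStringTag = True }

instance ToJSON Permission where
  toJSON = genericToJSON permissionOptions

instance FromJSON Permission where
  parseJSON = genericParseJSON permissionOptions

-- Encode to a Value and decode it again; useful as a quick sanity check.
roundTrip :: Permission -> Result Permission
roundTrip = fromJSON . toJSON
```

Because both instances are built from permissionOptions, changing the encoding later only requires touching one definition.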
Since the value contained in the Object constructor for Data.Aeson.Value is just a strict HashMap (in aeson versions before 2.0; later versions use a KeyMap instead), we can extract the keys from it and make a decision based on that. I tried this and it worked pretty well.
{-# LANGUAGE OverloadedStrings #-}
module StackOverflow where
import Data.Aeson
import Control.Monad
import Data.HashMap.Strict (keys)
data Permission = Read | Write
instance FromJSON Permission where
    parseJSON (Object v) =
        let ks = keys v
        in case ks of
            ["Read"] -> return Read
            ["Write"] -> return Write
            _ -> mzero
    parseJSON _ = mzero
You can test it with decode "{\"Read\": []}" :: Maybe Permission. The mzero in parseJSON ensures that if something else is passed in, it'll just return Nothing. Since you seem to want to only check if there is a single key matching one of your two permissions, this is pretty straightforward and will properly return Nothing on all other inputs.

How do I use the json library?

I'm trying to figure out Haskell's json library. However, I'm running into a bit of an issue in ghci:
Prelude> import Text.JSON
Prelude Text.JSON> decode "[1,2,3]"
<interactive>:1:0:
Ambiguous type variable `a' in the constraint:
`JSON a' arising from a use of `decode' at <interactive>:1:0-15
Probable fix: add a type signature that fixes these type variable(s)
I think this has something to do with the a in the type signature:
decode :: JSON a => String -> Result a
Can someone show me:
How to decode a string?
What's going on with the type system here?
You need to specify which type you want to get back, like this:
decode "[1,2,3]" :: Result [Integer]
-- Ok [1,2,3]
If that line were part of a larger program in which you went on to use the result of decode, the type could simply be inferred; but since ghci doesn't know which type you need, it can't infer it.
It's the same reason why read "[1,2,3]" doesn't work without a type annotation or more context.
The decode function is defined as follows:
decode :: JSON a => String -> Result a
In a real program, the type inference engine can usually figure out what type to expect from decode. For example:
userAge :: String -> Int
userAge input = case decode input of
    Ok a -> a
    _ -> error $ "Couldn't parse " ++ input
In this case, the type of userAge causes the typechecker to infer that decode's return value, in this particular case, is Result Int.
However, when you use decode in GHCi, you must specify the type of the value, e.g.:
decode "6" :: Result Int
=> Ok 6
A quick glance at the docs seems to suggest that the purpose of this function is to allow you to read JSON into any Haskell data structure of a supported type, so
decode "[1, 2, 3]" :: Result [Int]
ought to work.
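To see the inference at work outside GHCi, here is a small sketch assuming only the json package; parseInts is a name made up for the example. There is no annotation on decode itself: the signature of the enclosing function fixes the type.

```haskell
import Text.JSON (Result (..), decode)

-- The result type of parseInts forces decode to produce a Result [Int],
-- so no inline annotation is needed.
parseInts :: String -> Either String [Int]
parseInts input =
  case decode input of
    Ok xs -> Right xs
    Error msg -> Left msg
```

Handling both Ok and Error explicitly also avoids the call to error used in the earlier example.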