nltk ChartParser doesnot show any result - nltk

I am trying to recreate the example shown first in (http://www.nltk.org/book/ch08.html), but with different sentence and grammar structure.
However, when I run the following program, I cannot see any result produced by parser.parse(). No error, no anything.
What am I missing here?
from nltk import pos_tag, word_tokenize, CFG
import nltk
grammar = nltk.CFG.fromstring("""
DT -> 'Any' | 'any'
DT -> 'the'
JJ -> 'habitable'
NNS -> 'areas' | 'Scientists'
NN -> 'border' | 'planet' | 'region'
IN -> 'in' | 'that' | 'on'
VPB -> 'are' | 'think'
S -> NP VP
NP -> NP PP
NP -> DT NG
NP -> NNS
NG -> JJ NG
NG -> NN
NG -> NNS
NG -> NN NN
VP -> VBP PP
PP -> IN NP
""")
sent = "Scientists think that any habitable areas on the planet are in the border region".split()
parser = nltk.ChartParser(grammar)
for tree in parser.parse(sent):
print(tree)

Related

Haskell cassava (Data.Csv): Carry along additional columns

I have two .csv files
A.csv:
A,B,C,D,E
1,2,3,4,5
5,4,3,2,1
B.csv
A,E,B,C,F
6,7,8,9,1
4,3,4,5,6
I would like to read them in Haskell with strict parsing rules for the variables A, B and C. I would then like to apply complex merge and filter operations to the rows of A.csv and B.csv and create a file C.csv from the result. The code block at the end of this post essentially covers this functionality.
Question:
I would now like to do all of this while keeping the variables D, E and F around. In my real dataset I have an unknown and arbitrary number of such additional columns. I can not easily represent them in the respective data type (ABC below). All of them should stay and be properly represented in the output dataset.
With the code below, C.csv looks like this:
A,B,C
1,2,3
5,4,3
6,8,9
4,4,5
I would instead like to have a result like this:
A,B,C,D,E,F
1,2,3,4,5,_
5,4,3,2,1,_
6,8,9,_,7,1
4,4,5,_,3,6
Is there a way to do this with cassava? Do I have to write a custom parser from scratch to get this functionality? How would I go about this?
This example code lacks the desired feature. It is a self-contained stack script.
#!/usr/bin/env stack
-- stack --resolver lts-18.7 script --package cassava,bytestring,vector
{-# LANGUAGE OverloadedStrings #-}
{-# LANGUAGE RecordWildCards #-}
import qualified Data.ByteString.Lazy as B
import qualified Data.Csv as C
import qualified Data.Vector as V
data ABC = ABC {a :: Int, b :: Int, c :: Int} deriving Show
instance C.FromNamedRecord ABC where
parseNamedRecord m =
ABC <$> m C..: "A" <*> m C..: "B" <*> m C..: "C"
instance C.ToNamedRecord ABC where
toNamedRecord ABC {..} =
C.namedRecord ["A" C..= a, "B" C..= b, "C" C..= c]
decodeABC :: B.ByteString -> [ABC]
decodeABC x =
case C.decodeByName x of
Left err -> error err
Right (_,xs) -> V.toList xs
header :: C.Header
header = V.fromList ["A", "B", "C"]
main :: IO ()
main = do
fileA <- B.readFile "A.csv"
fileB <- B.readFile "B.csv"
let decodedA = decodeABC fileA
let decodedB = decodeABC fileB
putStrLn $ show decodedA
putStrLn $ show decodedB
B.writeFile "C.csv" $ C.encodeByName header (decodedA ++ decodedB)
This code includes the desired feature (thanks to the input of #Daniel Wagner):
#!/usr/bin/env stack
-- stack --resolver lts-18.7 script --package cassava,bytestring,vector,unordered-containers
{-# LANGUAGE OverloadedStrings #-}
{-# LANGUAGE RecordWildCards #-}
import qualified Data.ByteString.Lazy as B
import qualified Data.Csv as C
import qualified Data.HashMap.Strict as HM
import qualified Data.Vector as V
data ABC = ABC {a :: Int, b :: Int, c :: Int, addCols :: C.NamedRecord} deriving Show
abcDefinedCols = ["A", "B", "C"]
abcRefHashMap = HM.fromList $ map (\x -> (x, ())) abcDefinedCols
instance C.FromNamedRecord ABC where
parseNamedRecord m =
pure ABC
<*> m C..: "A"
<*> m C..: "B"
<*> m C..: "C"
<*> pure (m `HM.difference` abcRefHashMap)
instance C.ToNamedRecord ABC where
toNamedRecord m =
(addCols m) `HM.union` C.namedRecord ["A" C..= a m, "B" C..= b m, "C" C..= c m]
decodeABC :: B.ByteString -> [ABC]
decodeABC x =
case C.decodeByName x of
Left err -> error err
Right (_,xs) -> V.toList xs
makeCompleteHeader :: [ABC] -> C.Header
makeCompleteHeader ms = V.fromList $ abcDefinedCols ++ HM.keys (HM.unions (map addCols ms))
combineABCs :: [ABC] -> [ABC] -> [ABC]
combineABCs xs1 xs2 =
let simpleSum = xs1 ++ xs2
addColKeys = HM.keys (HM.unions (map addCols simpleSum))
toAddHashMap = HM.fromList (map (\k -> (k, "n/a")) addColKeys)
in map (\x -> x { addCols = fillAddCols (addCols x) toAddHashMap }) simpleSum
where
fillAddCols :: C.NamedRecord -> C.NamedRecord -> C.NamedRecord
fillAddCols cur toAdd = HM.union cur (toAdd `HM.difference` cur)
main :: IO ()
main = do
fileA <- B.readFile "A.csv"
fileB <- B.readFile "B.csv"
let decodedA = decodeABC fileA
let decodedB = decodeABC fileB
putStrLn $ show decodedA
putStrLn $ show decodedB
let ab = combineABCs decodedA decodedB
B.writeFile "C.csv" $ C.encodeByName (makeCompleteHeader ab) ab
data ABCPlus = ABCPlus { a :: Int, b :: Int, c :: Int, d :: NamedRecord } deriving Show
instance FromNamedRecord ABCPlus where
parseNamedRecord m = pure ABC
<*> m .: "A"
<*> m .: "B"
<*> m .: "C"
<*> pure m -- or perhaps: pure (m `HM.difference` HM.fromList [("A", ()), ("B", ()), ("C", ())])
instance ToNamedRecord ABCPlus where
toNamedRecord m = d m -- or perhaps: d m `HM.union` namedRecord ["A" .= a m, "B" .= b m, "C" .= c m]
headers :: [ABCPlus] -> Header
headers ms = header $ ["A", "B", "C"] ++ HM.keys (relevant combined) where
relevant m = m `HM.difference` HM.fromList [("A", ()), ("B", ()), ("C", ())] -- or perhaps: m
combined = HM.unions [relevantKeys (d m) | m <- ms]

Proper JsonObjCodec with Fleece for Complex, Nested Discriminated Unions

I have the following DU which is composed of other DUs or/and Records.
type BiometricRules =
| Age of Comparator * AgeMeasure
| Glycemia of Comparator * BiometricMeasure
| Biometric of BiometricType * Comparator * BiometricMeasure
| Sex of SexMeasure
| MedicalCondition of MedicalCondition
| Score of ScoreType * Comparator * ScoreMeasure
While trying to deserialize and serialize with Fleece, I have written the following JsonObjCodec.
with
static member JsonObjCodec =
Age <!> jreq "Age" (function Age (comp, ageMeasure) -> Some (comp |> string, ageMeasure |> string) | _ -> None)
<|> (Glycemia <!> jreq "Glycemia" (function Glycemia (comp, bioMeasure) -> Some (comp |> string, bioMeasure) | _ -> None))
<|> (Biometric <!> jreq "BiometricRule" (function Biometric (bt, comp, bm) -> Some (bt |> string, comp |> string, bm) | _ -> None))
<|> (Sex <!> jreq "Sex" (function Sex s -> Some (s |> string) | _ -> None))
<|> (BiometricRules.MedicalCondition <!> jreq "MedicalCondition" (function BiometricRules.MedicalCondition x -> Some (x) | _ -> None))
<|> (Score <!> jreq "Score" (function Score (st, comp, scoreMeasure) -> Some (st |> string, comp |> string, scoreMeasure) | _ -> None))
For unknown reason it does not compile with error No overloads match for method 'Map'. All the nested DUs or Records have either a JsonObjCodec or static FromString and ToString methods defined.
Any solution with respect to how I could solve this via Fleece would be appreciated. The library is already deeply used in the project, so changing it would involve too much refactoring.
Below I copy pasted the definition of the other DU and Records, as reference:
type Comparator =
| GreaterThan
| LowerThan
| LowerThanOrEqual
| GreaterThanOrEqual
| EqualTo
with
override this.ToString() =
match this with
| GreaterThan -> ">"
| LowerThan -> "<"
| LowerThanOrEqual -> "<="
| GreaterThanOrEqual -> ">="
| EqualTo -> "="
static member FromString s =
match s with
| ">" -> GreaterThan
| "<" -> LowerThan
| ">=" -> GreaterThanOrEqual
| "<=" -> LowerThanOrEqual
| "=" -> EqualTo
| _ -> failwith "Not a valid comparator."
type AgeMeasure =
| Years of decimal
| Months of decimal
| Weeks of decimal
with
override this.ToString() =
match this with
| Years y -> string y + " years"
| Months m -> string m + " months"
| Weeks w -> string w + " weeks"
static member FromString (s: string) =
match s with
| _ when s.EndsWith("years") -> Years (Decimal.Parse(s.Replace("years", "")))
| _ when s.EndsWith("months") -> Months (Decimal.Parse(s.Replace("months", "")))
| _ when s.EndsWith("weeks") -> Weeks (Decimal.Parse(s.Replace("weeks", "")))
type BiometricMeasure = {
Value: decimal
UoM: string option
} with
static member JsonObjCodec =
fun va uom -> {
Value = va
UoM = if uom = "NA" then None else Some uom
}
<!> jreq "Value" (Some << fun bm -> bm.Value)
<*> jreq "UoM" (Some << fun bm -> if bm.UoM |> Option.isNone then "NA" else bm.UoM |> Option.get)
type BiometricType =
| SBP
| DBP
| Glycemia
| Specified of string
with
override this.ToString() =
match this with
| SBP -> "SBP"
| DBP -> "DBP"
| Glycemia -> "Glycemia"
| Specified s -> s
static member FromString s =
match s with
| "SBP" -> SBP
| "DBP" -> DBP
| "Glycemia" -> Glycemia
| _ -> Specified s
type SexMeasure =
| Female
| Male
| Other of string
with
override this.ToString() =
match this with
| Female -> "Female"
| Male -> "Male"
| Other s -> s
static member FromString (s: string) =
match s.ToLower() with
| "Female" -> Female
| "Male" -> Male
| other -> Other other
type MedicalCondition =
| ICD of ICD
| Other of string
with
static member JsonObjCodec =
ICD <!> jreq "MedicalCondition" (function ICD v -> Some v | _ -> None)
<|> (Other <!> jreq "MedicalCondition" (function Other v -> Some v | _ -> None))
type ScoreType =
| BMI
| Other of string
with
override this.ToString() =
match this with
| BMI -> "BMI"
| Other s -> s
static member FromString s =
match s with
| "BMI" -> BMI
| _ -> Other s
type ScoreMeasure = decimal
Libraries Used:
<PackageReference Update="FSharp.Core" Version="4.7" />
<PackageReference Include="FSharpPlus" Version="1.1.1" />
<PackageReference Include="Newtonsoft.Json" Version="12.0.3" />
<PackageReference Include="Fleece.NewtonsoftJson" Version="0.8.0" />
<PackageReference Include="FSharp.Data" Version="3.3.3" />
The Problem
Fleece provides Json codecs, not string codecs, so defining ToString and FromString is not the way to go, unless you need them for other stuff.
The solution
Define ToJson and OfJson for your internal DUs. Then remove all the |> string fragments in JsonObjCodec body.
Here's a quick and dirty example (I advise error handling to be improved) for Comparator :
static member ToJson x = JString (string x)
static member OfJson x =
match x with
| JString x -> Ok (Comparator.FromString x)
| _ -> Error (Uncategorized "JString expected")
Alternative solution
Leave all your internal DUs like this, but add the missing "parse" section in your JsonObjCodec:
...
with
static member JsonObjCodec =
(fun (a, b) -> Age (Comparator.FromString a, AgeMeasure.FromString b)) <!> jreq "Age" (function Age (comp, ageMeasure) -> Some (comp |> string, ageMeasure |> string) | _ -> None)
<|> ...
this becomes a bit verbose but will do the job.
Tips
Instead of using the <|> operator to add codecs, you colud use the jchoice combinator, it will read better.
If you really need your String / FromString methods I would suggest renaming FromString to Parse or renaming it to TryParse and returning an option type. This way you can take advantage of FSharpPlus tryParse function.
Also, if you are using the string/parse pattern everywhere it might worth to create a codec combinator that works from transforming to/from strings. This is not an easy task but it might worth the mental effort.
For debugging stuff like this purpose, try not to open the namespace FSharpPlus as it contains generic definitions of operators like <|>, <!> and <*>, this way you'll get better compile error messages.

Lifting a function with another function as an argument in Haskell

So I have a function in Haskell that I've simplified for the purpose of asking this question:
import Data.Foldable
import Data.Set
myFn :: Int -> Set Int
myFn a
| a <= 0 = singleton 1
| otherwise = foldMap helper (myFn (a - 1))
helper :: Int -> Set Int
helper a = insert (a + 2) (singleton a)
main :: IO ()
main = print . Data.Set.toList $ myFn 5
I want to have myFn's dependency on helper to be put into a Reader, since inversion of control allows me to switch implementations in my tests:
import Control.Monad.Reader
import Data.Foldable
import Data.Set
data MyEnv = MyEnv { helper' :: Int -> Set Int }
type MyReader = Reader MyEnv
myFn :: Int -> MyReader (Set Int)
myFn a
| a <= 0 = return $ singleton 1
| otherwise = do
myFn' <- myFn (a - 1)
helper'' <- asks helper'
return (foldMap helper'' myFn')
helper :: Int -> Set Int
helper a = insert (a + 2) (singleton a)
main :: IO ()
main =
let
myEnv = MyEnv helper
in
print . Data.Set.toList $ runReader (myFn 5) myEnv
This works fine, except I don't like these three lines in particular:
myFn' <- myFn (a - 1)
helper'' <- asks helper'
return (foldMap helper'' myFn')
I feel like there should be a way to lift foldMap in the same way as mapM is a lifted version of map through its composition with sequence. Ideally, I would like those three lines to collapse down to one:
foldMapM helper'' (partitions (n - 1))
Assuming that: helper'' :: Int -> MyReader (Set Int)
This would of course require a foldMapM function with a signature similar to:
foldMapM
:: (Monad m, Foldable t, Monoid n)
=> (a -> m n)
-> m (t a)
-> m n
I have tried so many things, but I just cannot seem to implement this function, though! Can anyone help?
Basically, you would like to create Monad m => m a -> m b -> m c from a -> b -> c. That's exactly what liftM2 (from Control.Monad) does:
liftM2 :: Monad m => (a1 -> a2 -> r) -> m a1 -> m a2 -> m r
Promote a function to a monad, scanning the monadic arguments from
left to right. For example,
liftM2 (+) [0,1] [0,2] = [0,2,1,3]
liftM2 (+) (Just 1) Nothing = Nothing
Therefore, it's as simple as using liftM2 foldMap:
myFn :: Int -> MyReader (Set Int)
myFn a
| a <= 0 = return $ singleton 1
| otherwise = liftM2 foldMap (asks helper') (myFn (a - 1))
Alternatively you can use <$> and <*> from Control.Applicative if you don't like additional parentheses:
myFn :: Int -> MyReader (Set Int)
myFn a
| a <= 0 = return $ singleton 1
| otherwise = foldMap <$> asks helper' <*> myFn (a - 1)
For more information, have a look at the Typeclassopedia.

haskell i/o putstrln error when calling function

I have created a film database and functions related to the database.
I am now creating a demo function where it gives results of particular functions carried out when the number is pressed in ghci. For example, when demo 2 is typed, it shows all the films in the database.
I have managed to create most the demo functions however, i am having problem with 3 of them and keep being displayed with errors. I have commented out the ones which do not work and need help understanding what the problem is.
I have included all the functions i have created with the demo function below.
import Data.List
import Text.Printf
import Data.Ord
import Data.Char
type Rating = (String, Int)
type Title = String
type Director = String
type Year = Int
type Film = (Title, Director, Year,[Rating])
testDatabase :: [Film]
testDatabase = [("Blade Runner","Ridley Scott",1982,[("Amy",6), ("Bill",9), ("Ian",7), ("Kevin",9), ("Emma",4), ("Sam",5), ("Megan",4)]),
("The Fly","David Cronenberg",1986,[("Megan",4), ("Fred",7), ("Chris",5), ("Ian",0), ("Amy",5)]),
("Psycho","Alfred Hitchcock",1960,[("Bill",4), ("Jo",4), ("Garry",8), ("Kevin",7), ("Olga",8), ("Liz",10), ("Ian",9)]),
("Body Of Lies","Ridley Scott",2008,[("Sam",3), ("Neal",7), ("Kevin",2), ("Chris",5), ("Olga",6)]),
("Avatar","James Cameron",2009,[("Olga",2), ("Wally",8), ("Megan",9), ("Tim",5), ("Zoe",8), ("Emma",3)]),
("Titanic","James Cameron",1997,[("Zoe",7), ("Amy",2), ("Emma",5), ("Heidi",3), ("Jo",8), ("Megan",5), ("Olga",7), ("Tim",10)]),
("The Departed","Martin Scorsese",2006,[("Heidi",2), ("Jo",8), ("Megan",5), ("Tim",2), ("Fred",5)]),
("Aliens","Ridley Scott",1986,[("Fred",8), ("Dave",6), ("Amy",10), ("Bill",7), ("Wally",2), ("Zoe",5)]),
("Prometheus","Ridley Scott",2012,[("Garry",3), ("Chris",4), ("Emma",5), ("Bill",1), ("Dave",3)]),
("E.T. The Extra-Terrestrial","Steven Spielberg",1982,[("Ian",7), ("Amy",2), ("Emma",7), ("Sam",8), ("Wally",5), ("Zoe",6)]),
("The Birds","Alfred Hitchcock",1963,[("Garry",7), ("Kevin",9), ("Olga",4), ("Tim",7), ("Wally",3)]),
("Goodfellas","Martin Scorsese",1990,[("Emma",7), ("Sam",9), ("Wally",5), ("Dave",3)]),
("The Shawshank Redemption","Frank Darabont",1994,[("Jo",8), ("Sam",10), ("Zoe",3), ("Dave",7), ("Emma",3), ("Garry",10), ("Kevin",7)]),
("Gladiator","Ridley Scott",2000,[("Garry",7), ("Ian",4), ("Neal",6), ("Wally",3), ("Emma",4)]),
("The Green Mile","Frank Darabont",1999,[("Sam",3), ("Zoe",4), ("Dave",8), ("Wally",5), ("Jo",5)]),
("True Lies","James Cameron",1994,[("Dave",3), ("Kevin",4), ("Jo",0)]),
("Minority Report","Steven Spielberg",2002,[("Dave",5), ("Garry",6), ("Megan",2), ("Sam",7), ("Wally",8)]),
("The Wolf of Wall Street","Martin Scorsese",2013,[("Dave",6), ("Garry",6), ("Megan",0), ("Sam",4)]),
("War Horse","Steven Spielberg",2011,[("Dave",6), ("Garry",6), ("Megan",3), ("Sam",7), ("Wally",8), ("Zoe",8)]),
("Lincoln","Steven Spielberg",2012,[("Ian",3), ("Sam",7), ("Wally",3), ("Zoe",4), ("Liz",7), ("Megan",4)]),
("Vertigo","Alfred Hitchcock",1958,[("Bill",7), ("Emma",5), ("Zoe",9), ("Olga",6), ("Tim",10)]),
("The Terminal","Steven Spielberg",2004,[("Olga",3), ("Heidi",8), ("Bill",2), ("Sam",6), ("Garry",8)]),
("Jaws","Steven Spielberg",1975,[("Fred",3), ("Garry",0), ("Jo",3), ("Neal",9), ("Emma",7)]),
("Hugo","Martin Scorsese",2011,[("Sam",4), ("Wally",3), ("Zoe",4), ("Liz",7)])]
------------------------------------------------------------
-----------------FUNCTIONAL CODE----------------------------
------------------------------------------------------------
--when adding need to be addFilm string string int and the list name called testDatabase
addFilm :: String -> String -> Int -> [Film] -> [Film]
addFilm title director year database = (title, director, year, []) : database
--Some functions needed later on:
averageFilmRating :: [(String,Int)] -> Float
averageFilmRating ratings
= (fromIntegral(sum $ map snd ratings)) / (fromIntegral(length ratings))
--Formats the films for decimal, gives average rating of films instead of all users ratings.
formatFilmOutput :: Film -> String
formatFilmOutput (title, director, year, rating)
= printf "%s by %s. Year: %d, Average Rating: %.1f" (title) (director) (year) (averageFilmRating rating)
--Shows all films in the database
displayAllFilm :: [String]
displayAllFilm = map formatFilmOutput testDatabase
--Shows films by director name
displayByDirector :: String -> [Film]
displayByDirector name
= filter(\(_,director,_,_) -> director == name) testDatabase
--Gives the average of directors films
directorAverage :: String -> Float
directorAverage dir
= averageFilmRating [rating | (title, director, year, ratings) <- displayByDirector dir, rating <- ratings]
--These two functions give the films rated of average 6 or over
filmsRated :: Int -> [Film]
filmsRated rating
= filter(\(_,_,_,a) -> averageFilmRating a >= fromIntegral rating) testDatabase
filmsaveragesix = filmsRated 6
--Shows what films the user has rated.
userRatedFilms :: String -> [Film]
userRatedFilms username
= filter ((username `elem`) . (\(_,_,_,xs) -> map fst xs)) testDatabase
-- Allows user to rate or re-rate film.
databaseNoFilm:: [Film] -> Title -> [Film]
databaseNoFilm database t = [(title, director, year, ratings) | (title, director, year, ratings) <- database, title /= t]
rateFilm :: [Film] -> Title -> Rating -> [Film]
rateFilm database findtitle (u, r) = databaseNoFilm database findtitle ++ [(title,director,year,(u, r):[(username,rtg) | (username,rtg) <- ratings, username /= u]) | (title, director, year, ratings) <- database, title == findtitle]
--Displays films by year in descending order of rating
filmsByYear :: Int -> [Film]
filmsByYear year = sortFilms $ filter(\(_,_,yr,_) -> yr >= year) testDatabase
sortFilms :: [Film] -> [Film]
sortFilms = sortBy $ flip $ comparing averageFilmRating'
where
averageFilmRating' (_,_,_,rs) = averageFilmRating rs
------------------------------------------------------------
-----------------DEMO FUNCTION------------------------------
------------------------------------------------------------
demo :: Int -> IO ()
demo choice = do
case choice of
-- 1 -> do
-- putStrLn addFilm "Gravity" "Alfonso Cuaron" 2013 testDatabase
2 -> do
putStrLn (unlines displayAllFilm)
3 -> do
putStrLn (unlines (map formatFilmOutput(displayByDirector "James Cameron")))
4 -> do
putStrLn (unlines (map formatFilmOutput(filmsaveragesix)))
-- Get the director average of James Cameron
-- 5 -> do
-- putStrLn (directorAverage "James Cameron")
6 -> do
putStrLn (unlines (map formatFilmOutput(userRatedFilms "Zoe")))
-- all films after Zoe rates "Jaws" 8
-- 7 -> do
-- putStrLn rateFilm testDatabase "Jaws" ("Zoe", 8)
-- 77 all films after Zoe rates "Vertigo" 3
8 -> do
putStrLn (unlines (map formatFilmOutput(filmsByYear 2009)))
The problem here relates to the demo functions which have been commented out. When uncommented and run the error for demo 1 relates to:
Couldn't match type `String -> String -> Int -> [Film] -> [Film]'
with `[Char]'
Expected type: String
Actual type: String -> String -> Int -> [Film] -> [Film]
In the first argument of `putStrLn', namely `addFilm'
In a stmt of a 'do' block:
putStrLn addFilm "Gravity" "Alfonso Cuaron" 2013 testDatabase
In the expression:
do { putStrLn
addFilm "Gravity" "Alfonso Cuaron" 2013 testDatabase }
It is a similar error for the rest of the commented demo functions
The problem you were having has to do with the line
putStrLn addFilm "Gravity" "Alfonso Cuaron" 2013 testDatabase
To the compiler, this looks like you're trying to apply 5 arguments to putStrLn, which is a function that only takes 1 argument, hence the compiler error. If you were to do
putStrLn (addFilm "Gravity" "Alfonso Cuaron" 2013 testDatabase)
Or equivalently (and prettier)
putStrLn $ addFilm "Gravity" "Alfonso Cuaron" 2013 testDatabase
Then you're applying the 4 arguments to addFilm, then applying that result to putStrLn. This still won't compile because addFilm returns a list of Films, not a String which is what putStrLn expects. You can instead use print, which is defined as
print :: Show a => a -> IO ()
print x = putStrLn $ show x
Or you could use your formatFilmOutput function:
putStrLn $ unlines $ map formatFilmOutput $ addFilm "Gravity" ...
and unlines converts the list of strings into a single string joined by new lines.
In case you're wondering what the $ operator is, it's literally defined as
($) :: (a -> b) -> a -> b
($) = id
infixr 0 $
The important part of the definition is the infixr 0 $ line. This means that it's right associative with a fixity of 0, which is the lowest precedence. Function application has a precedence of 9, meaning it always takes priority over operators, so
add1 x * 2
Is always the same as
(add1 x) * 2
The $ operator just acts as an alternative to parentheses. You can use it as
f $ g $ h $ i $ j $ k $ l $ m x
-- f through m are functions of one argument
Which means apply x to m, then apply that to l, then to k, then to j, and so on, or you could write it as
f (g (h (i (j (k (l (m x)))))))
Which isn't fun to balance parentheses for. This only works for applying the last argument to a function, not any of the middle ones. So
add = (+)
add $ 1 + 2 $ 3 + 4
won't work, that parses as
add (1 + 2 (3 + 4))
add (3 (7))
add (3 7)
Which just doesn't make sense.

Haskell - correct way to map BSON to JSON - where to put code

So, I'm new to Haskell, and its community. I want to make a mongodb-backed JSON API. Mongo and JSON are a good fit (at least in node), because it stores its documents in BSON, which is "binary json", so it theory it's easy to convert it to JSON.
After many mistakes, I managed to write the following code.
{-# LANGUAGE OverloadedStrings, ExtendedDefaultRules #-}
-- https://github.com/mailrank/aeson/blob/master/examples/Demo.hs
-- cabal install aeson
-- cabal install mongoDb
import Data.Aeson
import qualified Data.Aeson.Types as T
import Data.Attoparsec (parse, Result(..))
import Data.Attoparsec.Number (Number(..))
import qualified Data.Text as Text
import Control.Applicative ((<$>))
import Control.Monad (mzero)
import qualified Data.ByteString.Char8 as BS
-- Aeson's "encode" to JSON generates lazy bytestrings
import qualified Data.ByteString.Lazy.Char8 as BSL
import qualified Data.CompactString as CS
import Database.MongoDB
import Data.Bson
import qualified Data.Bson as Bson
import qualified Data.Vector
import GHC.Int
-- Is there a better way to convert between string representations?
csToTxt :: UString -> Text.Text
csToTxt cs = Text.pack $ CS.unpack cs
bsToTxt :: BS.ByteString -> Text.Text
bsToTxt bs = Text.pack $ BS.unpack bs
fieldToPair :: Field -> T.Pair
fieldToPair f = key .= val
where key = csToTxt $ label f
val = toJSON (value f)
instance ToJSON Field where
toJSON f = object [fieldToPair f]
-- Is this what I'm supposed to do? Just go through and map everything?
instance ToJSON Data.Bson.Value where
toJSON (Float f) = T.Number $ D f
toJSON (Bson.String s) = T.String $ csToTxt s
toJSON (Bson.Array xs) = T.Array $ Data.Vector.fromList (map toJSON xs)
toJSON (Doc fs) = object $ map fieldToPair fs
toJSON (Uuid (UUID bs)) = T.String $ bsToTxt bs
toJSON (Bson.Bool b) = T.Bool b
toJSON (Int32 i) = T.Number (I (fromIntegral i))
toJSON (Int64 i) = T.Number (I (fromIntegral i))
toJSON (ObjId (Oid w32 w64)) = T.String "look up GHC.Word.Word32 and GHC.Word.Word64"
toJSON (UTC time) = T.String "look up Data.Time.Clock.UTC.UTCTime"
toJSON (Md5 m) = T.Null
toJSON (UserDef u) = T.Null
toJSON (Bin b) = T.Null
toJSON (Fun f) = T.Null
toJSON Bson.Null = T.Null
toJSON (RegEx r) = T.Null
toJSON (JavaScr j) = T.Null
toJSON (Sym s) = T.Null
toJSON (Stamp s) = T.Null
toJSON (MinMax mm) = T.Null
-- Data.Bson.Value and T.Value for reference
-- data Data.Bson.Value
-- = Float Double
-- | Data.Bson.String UString
-- | Doc Document
-- | Data.Bson.Array [Data.Bson.Value]
-- | Bin Binary
-- | Fun Function
-- | Uuid UUID
-- | Md5 MD5
-- | UserDef UserDefined
-- | ObjId ObjectId
-- | Data.Bson.Bool Bool
-- | UTC time-1.2.0.3:Data.Time.Clock.UTC.UTCTime
-- | Data.Bson.Null
-- | RegEx Regex
-- | JavaScr Javascript
-- | Sym Symbol
-- | Int32 GHC.Int.Int32
-- | Int64 GHC.Int.Int64
-- | Stamp MongoStamp
-- | MinMax MinMaxKey
-- data T.Value
-- = Object Object
-- | T.Array Array
-- | T.String Text.Text
-- | Number Data.Attoparsec.Number.Number
-- | T.Bool !Bool
-- | T.Null
main ::IO ()
main = do
putStrLn $ "testing again: " ++ BSL.unpack (encode ["Hello", "I", "am", "angry"])
let field = "key" =: "value"
print field
print $ label field
putStrLn $ CS.unpack $ label field
putStrLn $ show "asdf"
-- Getting close
putStrLn $ "testing again: " ++ BSL.unpack (encode ["hello" =: "world", "num" =: 10.05, "num2" =: 5, "sub" =: ["doc","charlie"], "bool" =: False])
putStrLn $ "testing again: " ++ BSL.unpack (encode ["hello" =: "world", "sub" =: ["one" =: 1, "two" =: 2]])
Is there a better way to map between two types that are as similar as these are?
Is there a better way to map between the two string implementations?
Once I finish this, where should it live? Does it belong in either the JSON or BSON/MongoDB projects, or should it be published as its own module?
For the benefit of people finding this now, this has been done in: https://hackage.haskell.org/package/AesonBson. Looks like the same approach.
From Cale on #haskell:
Do either of those packages depend on the other already? If not, then probably you'd want to go with the third package option.