nltk ChartParser doesnot show any result - nltk
I am trying to recreate the example shown first in (http://www.nltk.org/book/ch08.html), but with different sentence and grammar structure.
However, when I run the following program, I cannot see any result produced by parser.parse(). No error, no anything.
What am I missing here?
from nltk import pos_tag, word_tokenize, CFG
import nltk
grammar = nltk.CFG.fromstring("""
DT -> 'Any' | 'any'
DT -> 'the'
JJ -> 'habitable'
NNS -> 'areas' | 'Scientists'
NN -> 'border' | 'planet' | 'region'
IN -> 'in' | 'that' | 'on'
VPB -> 'are' | 'think'
S -> NP VP
NP -> NP PP
NP -> DT NG
NP -> NNS
NG -> JJ NG
NG -> NN
NG -> NNS
NG -> NN NN
VP -> VBP PP
PP -> IN NP
""")
sent = "Scientists think that any habitable areas on the planet are in the border region".split()
parser = nltk.ChartParser(grammar)
for tree in parser.parse(sent):
print(tree)
Related
Haskell cassava (Data.Csv): Carry along additional columns
I have two .csv files A.csv: A,B,C,D,E 1,2,3,4,5 5,4,3,2,1 B.csv A,E,B,C,F 6,7,8,9,1 4,3,4,5,6 I would like to read them in Haskell with strict parsing rules for the variables A, B and C. I would then like to apply complex merge and filter operations to the rows of A.csv and B.csv and create a file C.csv from the result. The code block at the end of this post essentially covers this functionality. Question: I would now like to do all of this while keeping the variables D, E and F around. In my real dataset I have an unknown and arbitrary number of such additional columns. I can not easily represent them in the respective data type (ABC below). All of them should stay and be properly represented in the output dataset. With the code below, C.csv looks like this: A,B,C 1,2,3 5,4,3 6,8,9 4,4,5 I would instead like to have a result like this: A,B,C,D,E,F 1,2,3,4,5,_ 5,4,3,2,1,_ 6,8,9,_,7,1 4,4,5,_,3,6 Is there a way to do this with cassava? Do I have to write a custom parser from scratch to get this functionality? How would I go about this? This example code lacks the desired feature. It is a self-contained stack script. #!/usr/bin/env stack -- stack --resolver lts-18.7 script --package cassava,bytestring,vector {-# LANGUAGE OverloadedStrings #-} {-# LANGUAGE RecordWildCards #-} import qualified Data.ByteString.Lazy as B import qualified Data.Csv as C import qualified Data.Vector as V data ABC = ABC {a :: Int, b :: Int, c :: Int} deriving Show instance C.FromNamedRecord ABC where parseNamedRecord m = ABC <$> m C..: "A" <*> m C..: "B" <*> m C..: "C" instance C.ToNamedRecord ABC where toNamedRecord ABC {..} = C.namedRecord ["A" C..= a, "B" C..= b, "C" C..= c] decodeABC :: B.ByteString -> [ABC] decodeABC x = case C.decodeByName x of Left err -> error err Right (_,xs) -> V.toList xs header :: C.Header header = V.fromList ["A", "B", "C"] main :: IO () main = do fileA <- B.readFile "A.csv" fileB <- B.readFile "B.csv" let decodedA = decodeABC fileA let decodedB = decodeABC fileB putStrLn $ show decodedA putStrLn $ show decodedB B.writeFile "C.csv" $ C.encodeByName header (decodedA ++ decodedB) This code includes the desired feature (thanks to the input of #Daniel Wagner): #!/usr/bin/env stack -- stack --resolver lts-18.7 script --package cassava,bytestring,vector,unordered-containers {-# LANGUAGE OverloadedStrings #-} {-# LANGUAGE RecordWildCards #-} import qualified Data.ByteString.Lazy as B import qualified Data.Csv as C import qualified Data.HashMap.Strict as HM import qualified Data.Vector as V data ABC = ABC {a :: Int, b :: Int, c :: Int, addCols :: C.NamedRecord} deriving Show abcDefinedCols = ["A", "B", "C"] abcRefHashMap = HM.fromList $ map (\x -> (x, ())) abcDefinedCols instance C.FromNamedRecord ABC where parseNamedRecord m = pure ABC <*> m C..: "A" <*> m C..: "B" <*> m C..: "C" <*> pure (m `HM.difference` abcRefHashMap) instance C.ToNamedRecord ABC where toNamedRecord m = (addCols m) `HM.union` C.namedRecord ["A" C..= a m, "B" C..= b m, "C" C..= c m] decodeABC :: B.ByteString -> [ABC] decodeABC x = case C.decodeByName x of Left err -> error err Right (_,xs) -> V.toList xs makeCompleteHeader :: [ABC] -> C.Header makeCompleteHeader ms = V.fromList $ abcDefinedCols ++ HM.keys (HM.unions (map addCols ms)) combineABCs :: [ABC] -> [ABC] -> [ABC] combineABCs xs1 xs2 = let simpleSum = xs1 ++ xs2 addColKeys = HM.keys (HM.unions (map addCols simpleSum)) toAddHashMap = HM.fromList (map (\k -> (k, "n/a")) addColKeys) in map (\x -> x { addCols = fillAddCols (addCols x) toAddHashMap }) simpleSum where fillAddCols :: C.NamedRecord -> C.NamedRecord -> C.NamedRecord fillAddCols cur toAdd = HM.union cur (toAdd `HM.difference` cur) main :: IO () main = do fileA <- B.readFile "A.csv" fileB <- B.readFile "B.csv" let decodedA = decodeABC fileA let decodedB = decodeABC fileB putStrLn $ show decodedA putStrLn $ show decodedB let ab = combineABCs decodedA decodedB B.writeFile "C.csv" $ C.encodeByName (makeCompleteHeader ab) ab
data ABCPlus = ABCPlus { a :: Int, b :: Int, c :: Int, d :: NamedRecord } deriving Show instance FromNamedRecord ABCPlus where parseNamedRecord m = pure ABC <*> m .: "A" <*> m .: "B" <*> m .: "C" <*> pure m -- or perhaps: pure (m `HM.difference` HM.fromList [("A", ()), ("B", ()), ("C", ())]) instance ToNamedRecord ABCPlus where toNamedRecord m = d m -- or perhaps: d m `HM.union` namedRecord ["A" .= a m, "B" .= b m, "C" .= c m] headers :: [ABCPlus] -> Header headers ms = header $ ["A", "B", "C"] ++ HM.keys (relevant combined) where relevant m = m `HM.difference` HM.fromList [("A", ()), ("B", ()), ("C", ())] -- or perhaps: m combined = HM.unions [relevantKeys (d m) | m <- ms]
Proper JsonObjCodec with Fleece for Complex, Nested Discriminated Unions
I have the following DU which is composed of other DUs or/and Records. type BiometricRules = | Age of Comparator * AgeMeasure | Glycemia of Comparator * BiometricMeasure | Biometric of BiometricType * Comparator * BiometricMeasure | Sex of SexMeasure | MedicalCondition of MedicalCondition | Score of ScoreType * Comparator * ScoreMeasure While trying to deserialize and serialize with Fleece, I have written the following JsonObjCodec. with static member JsonObjCodec = Age <!> jreq "Age" (function Age (comp, ageMeasure) -> Some (comp |> string, ageMeasure |> string) | _ -> None) <|> (Glycemia <!> jreq "Glycemia" (function Glycemia (comp, bioMeasure) -> Some (comp |> string, bioMeasure) | _ -> None)) <|> (Biometric <!> jreq "BiometricRule" (function Biometric (bt, comp, bm) -> Some (bt |> string, comp |> string, bm) | _ -> None)) <|> (Sex <!> jreq "Sex" (function Sex s -> Some (s |> string) | _ -> None)) <|> (BiometricRules.MedicalCondition <!> jreq "MedicalCondition" (function BiometricRules.MedicalCondition x -> Some (x) | _ -> None)) <|> (Score <!> jreq "Score" (function Score (st, comp, scoreMeasure) -> Some (st |> string, comp |> string, scoreMeasure) | _ -> None)) For unknown reason it does not compile with error No overloads match for method 'Map'. All the nested DUs or Records have either a JsonObjCodec or static FromString and ToString methods defined. Any solution with respect to how I could solve this via Fleece would be appreciated. The library is already deeply used in the project, so changing it would involve too much refactoring. Below I copy pasted the definition of the other DU and Records, as reference: type Comparator = | GreaterThan | LowerThan | LowerThanOrEqual | GreaterThanOrEqual | EqualTo with override this.ToString() = match this with | GreaterThan -> ">" | LowerThan -> "<" | LowerThanOrEqual -> "<=" | GreaterThanOrEqual -> ">=" | EqualTo -> "=" static member FromString s = match s with | ">" -> GreaterThan | "<" -> LowerThan | ">=" -> GreaterThanOrEqual | "<=" -> LowerThanOrEqual | "=" -> EqualTo | _ -> failwith "Not a valid comparator." type AgeMeasure = | Years of decimal | Months of decimal | Weeks of decimal with override this.ToString() = match this with | Years y -> string y + " years" | Months m -> string m + " months" | Weeks w -> string w + " weeks" static member FromString (s: string) = match s with | _ when s.EndsWith("years") -> Years (Decimal.Parse(s.Replace("years", ""))) | _ when s.EndsWith("months") -> Months (Decimal.Parse(s.Replace("months", ""))) | _ when s.EndsWith("weeks") -> Weeks (Decimal.Parse(s.Replace("weeks", ""))) type BiometricMeasure = { Value: decimal UoM: string option } with static member JsonObjCodec = fun va uom -> { Value = va UoM = if uom = "NA" then None else Some uom } <!> jreq "Value" (Some << fun bm -> bm.Value) <*> jreq "UoM" (Some << fun bm -> if bm.UoM |> Option.isNone then "NA" else bm.UoM |> Option.get) type BiometricType = | SBP | DBP | Glycemia | Specified of string with override this.ToString() = match this with | SBP -> "SBP" | DBP -> "DBP" | Glycemia -> "Glycemia" | Specified s -> s static member FromString s = match s with | "SBP" -> SBP | "DBP" -> DBP | "Glycemia" -> Glycemia | _ -> Specified s type SexMeasure = | Female | Male | Other of string with override this.ToString() = match this with | Female -> "Female" | Male -> "Male" | Other s -> s static member FromString (s: string) = match s.ToLower() with | "Female" -> Female | "Male" -> Male | other -> Other other type MedicalCondition = | ICD of ICD | Other of string with static member JsonObjCodec = ICD <!> jreq "MedicalCondition" (function ICD v -> Some v | _ -> None) <|> (Other <!> jreq "MedicalCondition" (function Other v -> Some v | _ -> None)) type ScoreType = | BMI | Other of string with override this.ToString() = match this with | BMI -> "BMI" | Other s -> s static member FromString s = match s with | "BMI" -> BMI | _ -> Other s type ScoreMeasure = decimal Libraries Used: <PackageReference Update="FSharp.Core" Version="4.7" /> <PackageReference Include="FSharpPlus" Version="1.1.1" /> <PackageReference Include="Newtonsoft.Json" Version="12.0.3" /> <PackageReference Include="Fleece.NewtonsoftJson" Version="0.8.0" /> <PackageReference Include="FSharp.Data" Version="3.3.3" />
The Problem Fleece provides Json codecs, not string codecs, so defining ToString and FromString is not the way to go, unless you need them for other stuff. The solution Define ToJson and OfJson for your internal DUs. Then remove all the |> string fragments in JsonObjCodec body. Here's a quick and dirty example (I advise error handling to be improved) for Comparator : static member ToJson x = JString (string x) static member OfJson x = match x with | JString x -> Ok (Comparator.FromString x) | _ -> Error (Uncategorized "JString expected") Alternative solution Leave all your internal DUs like this, but add the missing "parse" section in your JsonObjCodec: ... with static member JsonObjCodec = (fun (a, b) -> Age (Comparator.FromString a, AgeMeasure.FromString b)) <!> jreq "Age" (function Age (comp, ageMeasure) -> Some (comp |> string, ageMeasure |> string) | _ -> None) <|> ... this becomes a bit verbose but will do the job. Tips Instead of using the <|> operator to add codecs, you colud use the jchoice combinator, it will read better. If you really need your String / FromString methods I would suggest renaming FromString to Parse or renaming it to TryParse and returning an option type. This way you can take advantage of FSharpPlus tryParse function. Also, if you are using the string/parse pattern everywhere it might worth to create a codec combinator that works from transforming to/from strings. This is not an easy task but it might worth the mental effort. For debugging stuff like this purpose, try not to open the namespace FSharpPlus as it contains generic definitions of operators like <|>, <!> and <*>, this way you'll get better compile error messages.
Lifting a function with another function as an argument in Haskell
So I have a function in Haskell that I've simplified for the purpose of asking this question: import Data.Foldable import Data.Set myFn :: Int -> Set Int myFn a | a <= 0 = singleton 1 | otherwise = foldMap helper (myFn (a - 1)) helper :: Int -> Set Int helper a = insert (a + 2) (singleton a) main :: IO () main = print . Data.Set.toList $ myFn 5 I want to have myFn's dependency on helper to be put into a Reader, since inversion of control allows me to switch implementations in my tests: import Control.Monad.Reader import Data.Foldable import Data.Set data MyEnv = MyEnv { helper' :: Int -> Set Int } type MyReader = Reader MyEnv myFn :: Int -> MyReader (Set Int) myFn a | a <= 0 = return $ singleton 1 | otherwise = do myFn' <- myFn (a - 1) helper'' <- asks helper' return (foldMap helper'' myFn') helper :: Int -> Set Int helper a = insert (a + 2) (singleton a) main :: IO () main = let myEnv = MyEnv helper in print . Data.Set.toList $ runReader (myFn 5) myEnv This works fine, except I don't like these three lines in particular: myFn' <- myFn (a - 1) helper'' <- asks helper' return (foldMap helper'' myFn') I feel like there should be a way to lift foldMap in the same way as mapM is a lifted version of map through its composition with sequence. Ideally, I would like those three lines to collapse down to one: foldMapM helper'' (partitions (n - 1)) Assuming that: helper'' :: Int -> MyReader (Set Int) This would of course require a foldMapM function with a signature similar to: foldMapM :: (Monad m, Foldable t, Monoid n) => (a -> m n) -> m (t a) -> m n I have tried so many things, but I just cannot seem to implement this function, though! Can anyone help?
Basically, you would like to create Monad m => m a -> m b -> m c from a -> b -> c. That's exactly what liftM2 (from Control.Monad) does: liftM2 :: Monad m => (a1 -> a2 -> r) -> m a1 -> m a2 -> m r Promote a function to a monad, scanning the monadic arguments from left to right. For example, liftM2 (+) [0,1] [0,2] = [0,2,1,3] liftM2 (+) (Just 1) Nothing = Nothing Therefore, it's as simple as using liftM2 foldMap: myFn :: Int -> MyReader (Set Int) myFn a | a <= 0 = return $ singleton 1 | otherwise = liftM2 foldMap (asks helper') (myFn (a - 1)) Alternatively you can use <$> and <*> from Control.Applicative if you don't like additional parentheses: myFn :: Int -> MyReader (Set Int) myFn a | a <= 0 = return $ singleton 1 | otherwise = foldMap <$> asks helper' <*> myFn (a - 1) For more information, have a look at the Typeclassopedia.
haskell i/o putstrln error when calling function
I have created a film database and functions related to the database. I am now creating a demo function where it gives results of particular functions carried out when the number is pressed in ghci. For example, when demo 2 is typed, it shows all the films in the database. I have managed to create most the demo functions however, i am having problem with 3 of them and keep being displayed with errors. I have commented out the ones which do not work and need help understanding what the problem is. I have included all the functions i have created with the demo function below. import Data.List import Text.Printf import Data.Ord import Data.Char type Rating = (String, Int) type Title = String type Director = String type Year = Int type Film = (Title, Director, Year,[Rating]) testDatabase :: [Film] testDatabase = [("Blade Runner","Ridley Scott",1982,[("Amy",6), ("Bill",9), ("Ian",7), ("Kevin",9), ("Emma",4), ("Sam",5), ("Megan",4)]), ("The Fly","David Cronenberg",1986,[("Megan",4), ("Fred",7), ("Chris",5), ("Ian",0), ("Amy",5)]), ("Psycho","Alfred Hitchcock",1960,[("Bill",4), ("Jo",4), ("Garry",8), ("Kevin",7), ("Olga",8), ("Liz",10), ("Ian",9)]), ("Body Of Lies","Ridley Scott",2008,[("Sam",3), ("Neal",7), ("Kevin",2), ("Chris",5), ("Olga",6)]), ("Avatar","James Cameron",2009,[("Olga",2), ("Wally",8), ("Megan",9), ("Tim",5), ("Zoe",8), ("Emma",3)]), ("Titanic","James Cameron",1997,[("Zoe",7), ("Amy",2), ("Emma",5), ("Heidi",3), ("Jo",8), ("Megan",5), ("Olga",7), ("Tim",10)]), ("The Departed","Martin Scorsese",2006,[("Heidi",2), ("Jo",8), ("Megan",5), ("Tim",2), ("Fred",5)]), ("Aliens","Ridley Scott",1986,[("Fred",8), ("Dave",6), ("Amy",10), ("Bill",7), ("Wally",2), ("Zoe",5)]), ("Prometheus","Ridley Scott",2012,[("Garry",3), ("Chris",4), ("Emma",5), ("Bill",1), ("Dave",3)]), ("E.T. The Extra-Terrestrial","Steven Spielberg",1982,[("Ian",7), ("Amy",2), ("Emma",7), ("Sam",8), ("Wally",5), ("Zoe",6)]), ("The Birds","Alfred Hitchcock",1963,[("Garry",7), ("Kevin",9), ("Olga",4), ("Tim",7), ("Wally",3)]), ("Goodfellas","Martin Scorsese",1990,[("Emma",7), ("Sam",9), ("Wally",5), ("Dave",3)]), ("The Shawshank Redemption","Frank Darabont",1994,[("Jo",8), ("Sam",10), ("Zoe",3), ("Dave",7), ("Emma",3), ("Garry",10), ("Kevin",7)]), ("Gladiator","Ridley Scott",2000,[("Garry",7), ("Ian",4), ("Neal",6), ("Wally",3), ("Emma",4)]), ("The Green Mile","Frank Darabont",1999,[("Sam",3), ("Zoe",4), ("Dave",8), ("Wally",5), ("Jo",5)]), ("True Lies","James Cameron",1994,[("Dave",3), ("Kevin",4), ("Jo",0)]), ("Minority Report","Steven Spielberg",2002,[("Dave",5), ("Garry",6), ("Megan",2), ("Sam",7), ("Wally",8)]), ("The Wolf of Wall Street","Martin Scorsese",2013,[("Dave",6), ("Garry",6), ("Megan",0), ("Sam",4)]), ("War Horse","Steven Spielberg",2011,[("Dave",6), ("Garry",6), ("Megan",3), ("Sam",7), ("Wally",8), ("Zoe",8)]), ("Lincoln","Steven Spielberg",2012,[("Ian",3), ("Sam",7), ("Wally",3), ("Zoe",4), ("Liz",7), ("Megan",4)]), ("Vertigo","Alfred Hitchcock",1958,[("Bill",7), ("Emma",5), ("Zoe",9), ("Olga",6), ("Tim",10)]), ("The Terminal","Steven Spielberg",2004,[("Olga",3), ("Heidi",8), ("Bill",2), ("Sam",6), ("Garry",8)]), ("Jaws","Steven Spielberg",1975,[("Fred",3), ("Garry",0), ("Jo",3), ("Neal",9), ("Emma",7)]), ("Hugo","Martin Scorsese",2011,[("Sam",4), ("Wally",3), ("Zoe",4), ("Liz",7)])] ------------------------------------------------------------ -----------------FUNCTIONAL CODE---------------------------- ------------------------------------------------------------ --when adding need to be addFilm string string int and the list name called testDatabase addFilm :: String -> String -> Int -> [Film] -> [Film] addFilm title director year database = (title, director, year, []) : database --Some functions needed later on: averageFilmRating :: [(String,Int)] -> Float averageFilmRating ratings = (fromIntegral(sum $ map snd ratings)) / (fromIntegral(length ratings)) --Formats the films for decimal, gives average rating of films instead of all users ratings. formatFilmOutput :: Film -> String formatFilmOutput (title, director, year, rating) = printf "%s by %s. Year: %d, Average Rating: %.1f" (title) (director) (year) (averageFilmRating rating) --Shows all films in the database displayAllFilm :: [String] displayAllFilm = map formatFilmOutput testDatabase --Shows films by director name displayByDirector :: String -> [Film] displayByDirector name = filter(\(_,director,_,_) -> director == name) testDatabase --Gives the average of directors films directorAverage :: String -> Float directorAverage dir = averageFilmRating [rating | (title, director, year, ratings) <- displayByDirector dir, rating <- ratings] --These two functions give the films rated of average 6 or over filmsRated :: Int -> [Film] filmsRated rating = filter(\(_,_,_,a) -> averageFilmRating a >= fromIntegral rating) testDatabase filmsaveragesix = filmsRated 6 --Shows what films the user has rated. userRatedFilms :: String -> [Film] userRatedFilms username = filter ((username `elem`) . (\(_,_,_,xs) -> map fst xs)) testDatabase -- Allows user to rate or re-rate film. databaseNoFilm:: [Film] -> Title -> [Film] databaseNoFilm database t = [(title, director, year, ratings) | (title, director, year, ratings) <- database, title /= t] rateFilm :: [Film] -> Title -> Rating -> [Film] rateFilm database findtitle (u, r) = databaseNoFilm database findtitle ++ [(title,director,year,(u, r):[(username,rtg) | (username,rtg) <- ratings, username /= u]) | (title, director, year, ratings) <- database, title == findtitle] --Displays films by year in descending order of rating filmsByYear :: Int -> [Film] filmsByYear year = sortFilms $ filter(\(_,_,yr,_) -> yr >= year) testDatabase sortFilms :: [Film] -> [Film] sortFilms = sortBy $ flip $ comparing averageFilmRating' where averageFilmRating' (_,_,_,rs) = averageFilmRating rs ------------------------------------------------------------ -----------------DEMO FUNCTION------------------------------ ------------------------------------------------------------ demo :: Int -> IO () demo choice = do case choice of -- 1 -> do -- putStrLn addFilm "Gravity" "Alfonso Cuaron" 2013 testDatabase 2 -> do putStrLn (unlines displayAllFilm) 3 -> do putStrLn (unlines (map formatFilmOutput(displayByDirector "James Cameron"))) 4 -> do putStrLn (unlines (map formatFilmOutput(filmsaveragesix))) -- Get the director average of James Cameron -- 5 -> do -- putStrLn (directorAverage "James Cameron") 6 -> do putStrLn (unlines (map formatFilmOutput(userRatedFilms "Zoe"))) -- all films after Zoe rates "Jaws" 8 -- 7 -> do -- putStrLn rateFilm testDatabase "Jaws" ("Zoe", 8) -- 77 all films after Zoe rates "Vertigo" 3 8 -> do putStrLn (unlines (map formatFilmOutput(filmsByYear 2009))) The problem here relates to the demo functions which have been commented out. When uncommented and run the error for demo 1 relates to: Couldn't match type `String -> String -> Int -> [Film] -> [Film]' with `[Char]' Expected type: String Actual type: String -> String -> Int -> [Film] -> [Film] In the first argument of `putStrLn', namely `addFilm' In a stmt of a 'do' block: putStrLn addFilm "Gravity" "Alfonso Cuaron" 2013 testDatabase In the expression: do { putStrLn addFilm "Gravity" "Alfonso Cuaron" 2013 testDatabase } It is a similar error for the rest of the commented demo functions
The problem you were having has to do with the line putStrLn addFilm "Gravity" "Alfonso Cuaron" 2013 testDatabase To the compiler, this looks like you're trying to apply 5 arguments to putStrLn, which is a function that only takes 1 argument, hence the compiler error. If you were to do putStrLn (addFilm "Gravity" "Alfonso Cuaron" 2013 testDatabase) Or equivalently (and prettier) putStrLn $ addFilm "Gravity" "Alfonso Cuaron" 2013 testDatabase Then you're applying the 4 arguments to addFilm, then applying that result to putStrLn. This still won't compile because addFilm returns a list of Films, not a String which is what putStrLn expects. You can instead use print, which is defined as print :: Show a => a -> IO () print x = putStrLn $ show x Or you could use your formatFilmOutput function: putStrLn $ unlines $ map formatFilmOutput $ addFilm "Gravity" ... and unlines converts the list of strings into a single string joined by new lines. In case you're wondering what the $ operator is, it's literally defined as ($) :: (a -> b) -> a -> b ($) = id infixr 0 $ The important part of the definition is the infixr 0 $ line. This means that it's right associative with a fixity of 0, which is the lowest precedence. Function application has a precedence of 9, meaning it always takes priority over operators, so add1 x * 2 Is always the same as (add1 x) * 2 The $ operator just acts as an alternative to parentheses. You can use it as f $ g $ h $ i $ j $ k $ l $ m x -- f through m are functions of one argument Which means apply x to m, then apply that to l, then to k, then to j, and so on, or you could write it as f (g (h (i (j (k (l (m x))))))) Which isn't fun to balance parentheses for. This only works for applying the last argument to a function, not any of the middle ones. So add = (+) add $ 1 + 2 $ 3 + 4 won't work, that parses as add (1 + 2 (3 + 4)) add (3 (7)) add (3 7) Which just doesn't make sense.
Haskell - correct way to map BSON to JSON - where to put code
So, I'm new to Haskell, and its community. I want to make a mongodb-backed JSON API. Mongo and JSON are a good fit (at least in node), because it stores its documents in BSON, which is "binary json", so it theory it's easy to convert it to JSON. After many mistakes, I managed to write the following code. {-# LANGUAGE OverloadedStrings, ExtendedDefaultRules #-} -- https://github.com/mailrank/aeson/blob/master/examples/Demo.hs -- cabal install aeson -- cabal install mongoDb import Data.Aeson import qualified Data.Aeson.Types as T import Data.Attoparsec (parse, Result(..)) import Data.Attoparsec.Number (Number(..)) import qualified Data.Text as Text import Control.Applicative ((<$>)) import Control.Monad (mzero) import qualified Data.ByteString.Char8 as BS -- Aeson's "encode" to JSON generates lazy bytestrings import qualified Data.ByteString.Lazy.Char8 as BSL import qualified Data.CompactString as CS import Database.MongoDB import Data.Bson import qualified Data.Bson as Bson import qualified Data.Vector import GHC.Int -- Is there a better way to convert between string representations? csToTxt :: UString -> Text.Text csToTxt cs = Text.pack $ CS.unpack cs bsToTxt :: BS.ByteString -> Text.Text bsToTxt bs = Text.pack $ BS.unpack bs fieldToPair :: Field -> T.Pair fieldToPair f = key .= val where key = csToTxt $ label f val = toJSON (value f) instance ToJSON Field where toJSON f = object [fieldToPair f] -- Is this what I'm supposed to do? Just go through and map everything? instance ToJSON Data.Bson.Value where toJSON (Float f) = T.Number $ D f toJSON (Bson.String s) = T.String $ csToTxt s toJSON (Bson.Array xs) = T.Array $ Data.Vector.fromList (map toJSON xs) toJSON (Doc fs) = object $ map fieldToPair fs toJSON (Uuid (UUID bs)) = T.String $ bsToTxt bs toJSON (Bson.Bool b) = T.Bool b toJSON (Int32 i) = T.Number (I (fromIntegral i)) toJSON (Int64 i) = T.Number (I (fromIntegral i)) toJSON (ObjId (Oid w32 w64)) = T.String "look up GHC.Word.Word32 and GHC.Word.Word64" toJSON (UTC time) = T.String "look up Data.Time.Clock.UTC.UTCTime" toJSON (Md5 m) = T.Null toJSON (UserDef u) = T.Null toJSON (Bin b) = T.Null toJSON (Fun f) = T.Null toJSON Bson.Null = T.Null toJSON (RegEx r) = T.Null toJSON (JavaScr j) = T.Null toJSON (Sym s) = T.Null toJSON (Stamp s) = T.Null toJSON (MinMax mm) = T.Null -- Data.Bson.Value and T.Value for reference -- data Data.Bson.Value -- = Float Double -- | Data.Bson.String UString -- | Doc Document -- | Data.Bson.Array [Data.Bson.Value] -- | Bin Binary -- | Fun Function -- | Uuid UUID -- | Md5 MD5 -- | UserDef UserDefined -- | ObjId ObjectId -- | Data.Bson.Bool Bool -- | UTC time-1.2.0.3:Data.Time.Clock.UTC.UTCTime -- | Data.Bson.Null -- | RegEx Regex -- | JavaScr Javascript -- | Sym Symbol -- | Int32 GHC.Int.Int32 -- | Int64 GHC.Int.Int64 -- | Stamp MongoStamp -- | MinMax MinMaxKey -- data T.Value -- = Object Object -- | T.Array Array -- | T.String Text.Text -- | Number Data.Attoparsec.Number.Number -- | T.Bool !Bool -- | T.Null main ::IO () main = do putStrLn $ "testing again: " ++ BSL.unpack (encode ["Hello", "I", "am", "angry"]) let field = "key" =: "value" print field print $ label field putStrLn $ CS.unpack $ label field putStrLn $ show "asdf" -- Getting close putStrLn $ "testing again: " ++ BSL.unpack (encode ["hello" =: "world", "num" =: 10.05, "num2" =: 5, "sub" =: ["doc","charlie"], "bool" =: False]) putStrLn $ "testing again: " ++ BSL.unpack (encode ["hello" =: "world", "sub" =: ["one" =: 1, "two" =: 2]]) Is there a better way to map between two types that are as similar as these are? Is there a better way to map between the two string implementations? Once I finish this, where should it live? Does it belong in either the JSON or BSON/MongoDB projects, or should it be published as its own module?
For the benefit of people finding this now, this has been done in: https://hackage.haskell.org/package/AesonBson. Looks like the same approach.
From Cale on #haskell: Do either of those packages depend on the other already? If not, then probably you'd want to go with the third package option.