Haskell Errors Binding a CSV File to a Handle - csv

So I was writing a small toybox of utility functions for CSV files and throughout testing the functions I'd been binding the file by hand with
table' <- parseCSVFromFile filepath
but (from Text.CSV)
parseCSVFromFile :: FilePath -> IO (Either parsec-3.1.9:Text.Parsec.Error.ParseError CSV)
and so I had to write a quick line to strip off this either error csv crap
stripBare csv = head $ rights $ flip (:) [] csv
and redefine it as table = stripBare table'. After this, list functions work on the csv files contents and life goes on
(Digression: Surprisingly there isn't a direct Either a b -> b function in Data.Either. So I used Data.Either.rights :: [Either a b] -> [b])
I wanted to do the work of undressing the csv type and binding it to a handle in one shot. Something like
table = stripBare $ table' <- parseCSVFromFile filepath
but this gives a parse error on (<-) saying I may be missing a do... then
table = stripBare $ do table' <- parseCSVFromFile filepath
yells at me saying that the last statement in a do block must be an expression.
What am I doing wrong?
As a separate curiosity, I saw here that
do notation in Haskell desugars in a pretty simple way.
do
x <- foo
e1
e2
...
turns into
foo >>= \x ->
do
e1
e2
I find this attractive and tried the following line which gave me a type error
*Toy> :type (parseCSVFromFile "~/.csv") >>= \x -> x
<interactive>:1:52: error:
* Couldn't match type `Either
parsec-3.1.9:Text.Parsec.Error.ParseError'
with `IO'
Expected type: IO CSV
Actual type: Either parsec-3.1.9:Text.Parsec.Error.ParseError CSV
* In the expression: x
In the second argument of `(>>=)', namely `\ x -> x'
In the expression:
(parseCSVFromFile "~/.csv") >>= \ x -> x

Code such as
head $ rights $ flip (:) [] csv
is dangerous. You are exploiting the partiality of head to hide the fact that csv might be a Left something. We can rewrite it as
= head $ rights $ (:) csv []
= head $ rights [csv]
= case csv of
Left _ -> error "Left found!"
Right x -> x
Usually it's better to handle the Left case directly in the do IO block. Something like: (pseudo code follows)
foo :: String -> IO ()
foo filepath = do
table' <- parseCSVFromFile filepath
case table' of
Left err -> do
putStrLn "Error in parsing CSV"
print err
moreErrorHandlingHere
Right table -> do
putStrLn "CSV loaded!"
use table

Related

Get Column in Haskell CSV and infer the column type

I'm exploring a csv file in an interactive ghci session (in a jupyter notebook):
import Text.CSV
import Data.List
import Data.Maybe
dat <- parseCSVFromFile "/home/user/data.csv"
headers = head dat
records = tail dat
-- define a way to get a particular row by index
indexRow :: [[Field]] -> Int -> [Field]
indexRow csv index = csv !! index
indexRow records 1
-- this works!
-- Now, define a way to get a particular column by index
indexField :: [[Field]] -> Int -> [Field]
indexField records index = map (\x -> x !! index) records
While this works if I know in advance the type of column 3:
map (\x -> read x :: Double) $ indexField records 3
How can I ask read to infer what the type might be when for example my columns could contain strings or num? I'd like it to try for me, but:
map read $ indexField records 3
fails with
Prelude.read: no parse
I don't care whether they are string or num, I just need that they are all the same and I am failing to find a way to specify that generally with the read function at least.
Weirdly, if I define a mean function like so:
mean :: Fractional a => [a] -> Maybe a
mean [] = Nothing
mean [x] = Just x
mean xs = Just (sum(xs) / (fromIntegral (length xs)))
This works:
mean $ map read $ indexField records 2
Just 13.501359655240003
But without the mean, this still fails:
map read $ indexField records 2
Prelude.read: no parse
Unfortunately, read is at the end of its wits when it comes to situations like this. Let's revisit read:
read :: Read a => String -> a
As you can see, a doesn't depend on the input, but solely on the output, and therefore of the context of our function. If you use read a + read b, then the additional Num context will limit the types to Integer or Double due to default rules. Let's see it in action:
> :set +t
> read "1234"
*** Exception: Prelude.read: no parse
> read "1234" + read "1234"
2468
it :: (Num a, Read a) => a
Ok, a is still not helpful. Is there any type that we can read without additional context? Sure, unit:
> read "()"
()
it :: Read a => a
That's still not helpful at all, so let's enable the monomorphism restriction:
> :set -XMonomorphismRestriction
> read "1234" + read "1234"
2468
it :: Integer
Aha. In the end, we had an Integer. Due to +, we had to decide on a type. Now, with the MonomorphismRestriction enabled, what happens on read "1234" without additional context?
> read "1234"
<interactive>:20:1
No instance for (Read a0) arising from a use of 'read'
The type variable 'a0' is ambiguous
Now GHCi doesn't pick any (default) type and forces you to chose one. Which makes the underlying error much more clear.
So how do we fix this? As CSV can contain arbitrary fields at run-time and all types are determined statically, we have to cheat by introducing something like
data CSVField = CSVString String | CSVNumber Double | CSVUnknown
and then write
parse :: Field -> CSVField
After all, our type needs to cover all possible fields.
However, in your case, we can just restrict read's type:
myRead :: String -> Double
myRead = read
But that's not wise, as we can still end up with errors if the column doesn't contain Doubles to begin with. So instead, let's use readMaybe and mapM:
columnAsNumbers :: [Field] -> Maybe [Double]
columnAsNumbers = mapM readMaybe
That way, the type is fixed, and we're forced to check whether we have Just something or Nothing:
mean <$> columnAsNumbers (indexFields records 2)
If you find yourself often using columnAsNumbers create an operator, though:
(!!$) :: [[Field]] -> Maybe [Double]
records !!$ index = columnAsNumbers $ indexFields records index

Creating a list when parsing a file using Cassava in Haskell

I'm able to parse my csv file using the following code from Data.Csv:
valuesToList :: Foo -> (Int, Int)
valuesToList (Foo a b) = (a,b)
loadMyData :: IO ()
loadMyData = do
csvData <- BL.readFile "mydata.csv"
case decodeByName csvData of
Left err -> putStrLn err
Right (_, v) -> print $ V.toList $ V.map valuesToList v
When I run this I get the correct output on the screen. The problem I have is I'm not clear how to create a pure function so I can consume the contents of the list, as in:
let l = loadMyData
Where l is of type [Int,Int] I'm guessing it's because I'm in the IO Monad and I'm doing something hopelessly silly...
I'm doing something hopelessly silly...
Yes, but worry not!
loadMyData = BL.readFile "mydata.csv"
processMyData :: String -> String
processMyData csvData =
case decodeByName csvData of
Left err -> err
Right (_, v) -> show $ V.toList $ V.map valuesToList v
main = do
csv <- loadMyData
let output = processMyData csv
print output
This way you separate the pure, "let" part from the impure loading part. This isn't Code Review (if you asked there, I could probably elaborate more), but I'd type the processing as String -> Either String [Int, Int] or something and keep the failure information in the type system.
processMyData csvData =
case decodeByName csvData of
Left err -> Left err
Right (_, v) -> Right $ V.toList $ V.map valuesToList v
And that in turn could simply be (pardon me if I make a mistake somewhere, I'm doing it live):
processMyData = fmap (V.toList . V.map valuesToList) . decodeByName
That should work because of how the Functor instance for Either is constructed.
Oh, and also use Control.Applicative for bonus style points:
main = do
output <- processMyData <$> loadMyData
print output
(in case you don't understand that example, (<$>) is infix fmap, but that's not stricly necessary; hence bonus)

How do I print the name and value of a custom data type in Haskell

Lets say I define a data type as follows:
data OP = Plus | Minus | Num Int deriving (Show, Eq)
Then I take a list of strings, and get a list of their respective OP values like this:
getOp :: [String] -> [OP]
getOp [] = []
getOp (x:rest)
| x == "+" = Plus:(getOp rest)
| isInfixOf "Num" x == True = Num (read (drop 4 x) :: Int):(getOp rest)
| otherwise = "-" = Minus:(getOp rest)
I then want to show the [OP] list, separated by new lines. I've done it with list of Strings easily, but not sure what to do with a list of data types.
I have the following structure as a starting point:
showOp :: [OP] -> String
showOp [] = []
showOp (o:os) = (putStr o):'\n':(showOp os)
I know the last line is wrong. I'm trying to return a [Char] in the first section, then a Char, then a recursive call. I tried some other variations for the last line (see below) with no luck.
showOp o = show o (works but not what I need. It shows the whole list, not each element on a new line
showOp o = putStrLn (show o) (epic fail)
showOp o
| o == "+" = "Plus\n":(showOp os)
| more of the same. Trying to return a [Char] instead of a Char, plus other issues.
Also, i'm not sure how the output will need to be different for the Num Int type, since I'll need to show the type name and the value.
An example i/o for this would be something like:
in:
getOp ["7","+","4","-","10"]
out:
Num 7
Plus
Num 4
Minus
Num 10
You need to look at the types of the functions and objects you are using. Hoogle is a great resource for getting function signatures.
For starters, the signature of putStr is
putStr :: String -> IO ()
but your code has putStr o, where o is not a string, and the result should not be an IO (). Do you really want showOp to print the Op, or just make a multi-line string for it?
If the former, you need the signature of showOp to reflect that:
showOp :: [Op] -> IO ()
Then you can use some do-notation to finish the function.
I'll write a solution for your given type signature. Since showOp should return a String and putStr returns an IO (), we won't be using putStr anywhere. Note that String is simply a type synonym for [Char], which is why we can treat Strings as a list.
showOp :: [Op] -> String
showOp [] = [] -- the empty list is a String
showOp (o:os) = showo ++ ('\n' : showos)
where showo = (show o) -- this is a String, i.e. [Char]
showos = showOp os -- this is also a String
Both showo and showos are Strings: both show and showOp return Strings.
We can add a single character to a list of characters using the cons operation :. We can append two lists of strings using append operator ++.
Now you might want another function
printOp :: [Op] -> IO ()
printOp xs = putStr $ showOp xs
How about:
showOp = putStrLn . unlines . map show
Note that your data constructor OP is already an instance of Show. Hence, you can actually map show into your array which contains members of type OP. After that, things become very somple.
A quick couple of notes ...
You might have wanted:
getOp :: [String] -> [OP]
getOp [] = []
getOp (x:rest)
| x == "+" = Plus:(getOp rest)
| x == "-" = Minus:(getOp rest)
| isInfixOf "Num" x == True = Num (read (drop 4 x) :: Int):(getOp rest)
| otherwise = (getOp rest)
Instead of what you have. Your program has a syntax error ...
Next, the input that you wanted to provide was probably
["Num 7","+","Num 4","-","Num 10"]
?. I guess that was a typo.

Yesod: how to send Redis results as JSON

Apologies in advance for my Haskell inexperience. I am writing a little wrapper for a Redis instance for a learning project. So far Yesod has been an absolute wonder. With very little Haskell experience I got browserId Auth working, and I'm inserting records into Redis successfully and quickly.
I'm stuck figuring out how to get the Redis responses converted into JSON and sent back. Here is a working, un-scaffold, app that shows getting a static RepJson or a RepPlain with Redis info (App is called LRedis):
{-# LANGUAGE OverloadedStrings, TemplateHaskell, TypeFamilies,
MultiParamTypeClasses, QuasiQuotes #-}
import Yesod
import Data.Text
import Data.Text.Encoding
import Data.ByteString.UTF8
import Database.Redis
import qualified Data.ByteString.Lazy as L
data LRedis = LRedis
instance Yesod LRedis where
mkYesod "LRedis" [parseRoutes|
/ HomeR GET
/learnJson LearnJsonR GET
/redisWorks RedisWorksR GET
|]
getHomeR :: Handler RepHtml
getHomeR = do
defaultLayout[whamlet|
<p>Hi this is a headless API thing.
|]
getLearnJsonR :: Handler RepJson
getLearnJsonR = do
jsonToRepJson $ object [("json", ("ftw"::Text))]
getRedisWorksR :: Handler RepPlain
getRedisWorksR = do
conn <- liftIO $ connect defaultConnectInfo
liftIO $ runRedis conn $ do
result <- Database.Redis.get (fromString "hello")
case result of
Left e -> return $ RepPlain "Error"
Right mAnswer -> do
case mAnswer of
Nothing -> return $ RepPlain "Not found."
Just x -> return $ RepPlain (toContent x)
main :: IO()
main = do
warpDebug 3000 $ LRedis
Again, that is all working. It will return the string stored in "hello" in redis if you curl /redisWorks, or it will return JSON if you curl /learnJson, but I want to give the redis answer as JSON, not a plain string. I thought I could just naively combine the two, like:
getRedisJsonR :: Handler RepJson
getRedisJsonR = do
conn <- liftIO $ connect defaultConnectInfo
liftIO $ runRedis conn $ do
result <- Database.Redis.get (fromString "hello")
case result of
Left e -> jsonToRepJson $ object [("response", ("error"::Text))]
Right mAnswer -> do
case mAnswer of
Nothing -> jsonToRepJson $ object [("response", ("Nothing"::Text))]
Just x -> jsonToRepJson $ object [("response", ((decodeUtf8 x)::Text))]
But after adding the route /redisJson RedisJsonR GET it fails with this compilation error:
Couldn't match expected type `Redis a0'
with actual type `GHandler sub0 master0 RepJson'
In the expression:
jsonToRepJson $ object [("response", ("error" :: Text))]
In a case alternative:
Left e -> jsonToRepJson $ object [("response", ("error" :: Text))]
In a stmt of a 'do' block:
case result of {
Left e -> jsonToRepJson $ object [("response", ("error" :: Text))]
Right mAnswer
-> do { case mAnswer of {
Nothing -> ...
Just x -> ... } } }
It seems like its telling me I need to do something different with the result in case of an error, but I don't know what that would be, or why it's necessary given the RepPlain version is working.
Is there an example of getting the results from Redis into JSON within Yesod?
Is there just something simple I'm doing wrong with IO or something?
Convenient link to Hedis docs: http://hackage.haskell.org/package/hedis Thank you for helping me with this. Sorry again if it turns out to be super simple.
getRedisJsonR :: Handler RepJson
getRedisJsonR = do
conn <- liftIO $ connect defaultConnectInfo
res <- liftIO $ runRedis conn $ do
result <- Database.Redis.get (fromString "hello")
case result of
Left e -> return $ jsonToRepJson $ object [("response", ("error"::Text))]
Right mAnswer -> do
case mAnswer of
Nothing -> return $ jsonToRepJson $ object [("response", ("Nothing"::Text))]
Just x -> return $ jsonToRepJson $ object [("response", ((decodeUtf8 x)::Text))]
res
I actually don't trust myself to explain why this works and your original code doesn't -- I've been programming in Haskell on a daily basis for just under three months, so I have a developing gut sense for what will work but I'm really not there yet on the theory side, especially when it comes to stacked monads, which I think is what we're dealing with here (either Redis is on top of Handler or vice versa, and liftIO is facilitating the stacking).
Hopefully someone else can weigh in -- seems like a great concrete example to illustrate some monad concepts.

haskell word searching program development

hello I am making some word searching program
for example
when "text.txt" file contains "foo foos foor fo.. foo fool"
and search "foo"
then only number 2 printed
and search again and again
but I am haskell beginner
my code is here
:module +Text.Regex.Posix
putStrLn "type text file"
filepath <- getLine
data <- readFile filepath
--1. this makes <interactive>:1:1: parse error on input `data' how to fix it?
parsedData =~ "[^- \".,\n]+" :: [[String]]
--2. I want to make function and call it again and again
searchingFunc = do putStrLn "search for ..."
search <- getLine
result <- map (\each -> if each == search then count = count + 1) data
putStrLn result
searchingFunc
}
sorry for very very poor code
my development environment is Windows XP SP3 WinGhci 1.0.2
I started the haskell several hours ago sorry
thank you very much for reading!
edit: here's original scheme code
thanks!
#lang scheme/gui
(define count 0)
(define (search str)
(set! count 0)
(map (λ (each) (when (equal? str each) (set! count (+ count 1)))) data)
(send msg set-label (format "~a Found" count)))
(define path (get-file))
(define port (open-input-file path))
(define data '())
(define (loop [line (read-line port)])
(when (not (eof-object? line))
(set! data (append data
(regexp-match* #rx"[^- \".,\n]+" line)))
(loop)))
(loop)
(define (cb-txt t e) (search (send t get-value)))
(define f (new frame% (label "text search") (min-width 300)))
(define txt (new text-field% (label "type here to search") (parent f) (callback (λ (t e) (cb-txt t e)))))
(define msg (new message% (label "0Found ") (parent f)))
(send f show #t)
I should start by iterating what everyone would (and should) say: Start with a book like Real World Haskell! That said, I'll post a quick walkthrough of code that compiles, and hopefully does something close to what you originally intended. Comments are inline, and hopefully should illustrate some of the shortcomings of your approach.
import Text.Regex.Posix
-- Let's start by wrapping your first attempt into a 'Monadic Action'
-- IO is a monad, and hence we can sequence 'actions' (read as: functions)
-- together using do-notation.
attemptOne :: IO [[String]]
-- ^ type declaration of the function 'attemptOne'
-- read as: function returning value having type 'IO [[String]]'
attemptOne = do
putStrLn "type text file"
filePath <- getLine
fileData <- readFile filePath
putStrLn fileData
let parsed = fileData =~ "[^- \".,\n]+" :: [[String]]
-- ^ this form of let syntax allows us to declare that
-- 'wherever there is a use of the left-hand-side, we can
-- substitute it for the right-hand-side and get equivalent
-- results.
putStrLn ("The data after running the regex: " ++ concatMap concat parsed)
return parsed
-- ^ return is a monadic action that 'lifts' a value
-- into the encapsulating monad (in this case, the 'IO' Monad).
-- Here we show that given a search term (a String), and a body of text to
-- search in, we can return the frequency of occurrence of the term within the
-- text.
searchingFunc :: String -> [String] -> Int
searchingFunc term
= length . filter predicate
where
predicate = (==)term
-- ^ we use function composition (.) to create a new function from two
-- existing ones:
-- filter (drop any elements of a list that don't satisfy
-- our predicate)
-- length: return the size of the list
-- Here we build a wrapper-function that allows us to run our 'pure'
-- searchingFunc on an input of the form returned by 'attemptOne'.
runSearchingFunc :: String -> [[String]] -> [Int]
runSearchingFunc term parsedData
= map (searchingFunc term) parsedData
-- Here's an example of piecing everything together with IO actions
main :: IO ()
main = do
results <- attemptOne
-- ^ run our attemptOne function (representing IO actions)
-- and save the result
let searchResults = runSearchingFunc "foo" results
-- ^ us a 'let' binding to state that searchResults is
-- equivalent to running 'runSearchingFunc'
print searchResults
-- ^ run the IO action that prints searchResults
print (runSearchingFunc "foo" results)
-- ^ run the IO action that prints the 'definition'
-- of 'searchResults'; i.e. the above two IO actions
-- are equivalent.
return ()
-- as before, lift a value into the encapsulating Monad;
-- this time, we're lifting a value corresponding to 'null/void'.
To load this code, save it into a .hs file (I saved it into 'temp.hs'), and run the following from ghci. Note: the file 'f' contains a few input words:
*Main Text.Regex.Posix> :l temp.hs
[1 of 1] Compiling Main ( temp.hs, interpreted )
Ok, modules loaded: Main.
*Main Text.Regex.Posix> main
type text file
f
foo foos foor fo foo foo
The data after running the regex: foofoosfoorfofoofoo
[1,0,0,0,1,1]
[1,0,0,0,1,1]
There is a lot going on here, from do notation to Monadic actions, 'let' bindings to the distinction between pure and impure functions/values. I can't stress the value of learning the fundamentals from a good book!
Here is what I made of it. It doesn't does any error checking and is as basic as possible.
import Text.Regex.Posix ((=~))
import Control.Monad (when)
import Text.Printf (printf)
-- Calculates the number of matching words
matchWord :: String -> String -> Int
matchWord file word = length . filter (== word) . concat $ file =~ "[^- \".,\n]+"
getInputFile :: IO String
getInputFile = do putStrLn "Enter the file to search through:"
path <- getLine
readFile path -- Attention! No error checking here
repl :: String -> IO ()
repl file = do putStrLn "Enter word to search for (empty for exit):"
word <- getLine
when (word /= "") $
do print $ matchWord file word
repl file
main :: IO ()
main = do file <- getInputFile
repl file
Please start step by step. IO in Haskell is hard, so you shouldn't start with file manipulation. I would suggest to write a function that works properly on a given String. That way you can learn about syntax, pattern matching, list manipulation (maps, folds) and recursion without beeing distracted by the do notation (which kinda looks imperative, but isn't, and really needs a deeper understanding).
You should check out Learn you a Haskell or Real World Haskell to get a sound foundation. What you do now is just stumbling in the dark - which may work if you learn languages that are similar to the ones you know, but definitely not for Haskell.