What is the easiest way to generate an HTML table in Haskell? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 8 years ago.
I would like to output a table in HTML format.
Basically, I would like something like:
[[a]] -> <table>
What is the easiest way to do so?

The easiest way to generate Html is probably blaze:
import Text.Blaze.Html5 (table, td, tr, toHtml, ToMarkup, Html)
import Control.Monad (forM_, mapM_)
myTable :: (ToMarkup a) => [[a]] -> Html
myTable xs = table $ forM_ xs (tr . mapM_ (td . toHtml))
Note that you need renderHtml from one of the Text.Blaze.Html.Renderer.* modules to get a ByteString, String, or Text.

Edit: While I was writing this answer, @Zeta already posted a better solution using blaze-html. So I recommend using his solution (see the section "Words of warning" for a list of this solution's disadvantages...) ;-)
Here is an implementation:
-- file test.hs:
insideTag :: String -> String -> String
insideTag tag content = "<" ++ tag ++ ">" ++ content ++ "</" ++ tag ++ ">"
toTable :: Show a => [[a]] -> String
toTable = insideTag "table" . concatMap (insideTag "tr") . map (concatMap (insideTag "td" . show))
main :: IO ()
main = do
  putStrLn $ toTable [[1,2,3],[4,5,6],[7,8,9]]
  return ()
The command runhaskell test.hs will now print
<table><tr><td>1</td><td>2</td><td>3</td></tr><tr><td>4</td><td>5</td><td>6</td></tr><tr><td>7</td><td>8</td><td>9</td></tr></table>
Explanation of the code
insideTag encapsulates content inside an HTML tag:
ghci> let insideTag tag content = "<" ++ tag ++ ">" ++ content ++ "</" ++ tag ++ ">"
ghci> insideTag "h1" "hello world"
"<h1>hello world</h1>"
map (concatMap (insideTag "td" . show)) list encapsulates the inner elements in <td> tags and concatenates them:
ghci> map (concatMap (insideTag "td" . show)) [[1,2], [3,4]]
["<td>1</td><td>2</td>","<td>3</td><td>4</td>"]
The same can be done for the outer list:
ghci> concatMap (insideTag "tr") ["<td>1</td><td>2</td>","<td>3</td><td>4</td>"]
"<tr><td>1</td><td>2</td></tr><tr><td>3</td><td>4</td></tr>"
The last string only has to be encapsulated in a <table> tag:
ghci> insideTag "table" "<tr><td>1</td><td>2</td></tr><tr><td>3</td><td>4</td></tr>"
"<table><tr><td>1</td><td>2</td></tr><tr><td>3</td><td>4</td></tr></table>"
Words of warning
The above code uses the normal [Char] type for strings, which is not memory efficient. So I recommend that you use Data.Text if you deal with big tables (toTable remains the same, you just have to change show to pack . show; insideTag has to be reimplemented for Data.Text).
There is also no HTML escaping for the table content! So the above code is vulnerable to XSS attacks. Do not use the above code if the produced HTML is to be included in a website (especially if the website user has an influence on the table content)!
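The escaping gap mentioned above could be closed roughly as follows. This is only a minimal sketch using nothing but base: escapeHtml and toTableEscaped are hypothetical names introduced here, and the cells are assumed to be Strings already (so no show is applied). For anything user-facing, a library such as blaze-html remains the safer choice.

```haskell
-- Escape the characters that are special in HTML text and attribute values.
-- (Hypothetical helper, not part of any library.)
escapeHtml :: String -> String
escapeHtml = concatMap escapeChar
  where
    escapeChar '&'  = "&amp;"
    escapeChar '<'  = "&lt;"
    escapeChar '>'  = "&gt;"
    escapeChar '"'  = "&quot;"
    escapeChar '\'' = "&#39;"
    escapeChar c    = [c]

-- Same helper as in the answer: wrap content in an opening and closing tag.
insideTag :: String -> String -> String
insideTag tag content = "<" ++ tag ++ ">" ++ content ++ "</" ++ tag ++ ">"

-- Variant of toTable that escapes every cell before wrapping it in <td>.
toTableEscaped :: [[String]] -> String
toTableEscaped =
  insideTag "table" . concatMap (insideTag "tr" . concatMap (insideTag "td" . escapeHtml))
```

With this, toTableEscaped [["<b>"]] yields <table><tr><td>&lt;b&gt;</td></tr></table> instead of injecting a live tag.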

Related

LISP: how to properly encode a slash ("/") with cl-json?

I have code that uses the cl-json library to add a line, {"main" : "build/electron.js"} to a package.json file:
(let ((package-json-pathname (merge-pathnames *app-pathname* "package.json")))
  (let ((new-json
          (with-open-file (package-json package-json-pathname
                                        :direction :input
                                        :if-does-not-exist :error)
            (let ((decoded-package (json:decode-json package-json)))
              (let ((main-entry (assoc :main decoded-package)))
                (if (null main-entry)
                    (push '(:main . "build/electron.js") decoded-package)
                    (setf (cdr main-entry) "build/electron.js"))
                decoded-package)))))
    (with-open-file (package-json package-json-pathname
                                  :direction :output
                                  :if-exists :supersede)
      (json:encode-json new-json package-json))))
The code works, but the result has an escaped slash:
"main":"build\/electron.js"
I'm sure this is a simple thing, but no matter which inputs I try -- "//", "/", "#//" -- I still get the escaped slash.
How do I just get a normal slash in my output?
Also, I'm not sure if there's a trivial way for me to get pretty-printed output, or if I need to write a function that does this; right now the output prints the entire package.json file to a single line.
Special characters
The JSON spec indicates that "Any character may be escaped.", but some of them MUST be escaped: "quotation mark, reverse solidus, and the control characters". That section is followed by a grammar that shows "solidus" (/) in the list of escaped characters. I don't think it is really important in practice (typically it need not be escaped), but that may explain why the library escapes this character.
How to avoid escaping
cl-json relies on an internal list of escaped characters named +json-lisp-escaped-chars+, namely:
(defparameter +json-lisp-escaped-chars+
  '((#\" . #\")
    (#\\ . #\\)
    (#\/ . #\/)
    (#\b . #\Backspace)
    (#\f . #\Page)
    (#\n . #\Newline)
    (#\r . #\Return)
    (#\t . #\Tab)
    (#\u . (4 . 16)))
  "Mapping between JSON String escape sequences and Lisp chars.")
The symbol is not exported, but you can still refer to it externally with ::. You can dynamically rebind the parameter around the code that needs to use a different list of escaped characters; for example, you can do as follows:
(let ((cl-json::+json-lisp-escaped-chars+
        (remove #\/ cl-json::+json-lisp-escaped-chars+ :key #'car)))
  (cl-json:encode-json-plist '("x" "1/5")))
This prints:
{"x":"1/5"}

CSV Parsing Issue with Attoparsec

Here is my code that does CSV parsing, using the text and attoparsec
libraries:
{-# LANGUAGE OverloadedStrings #-}
import Control.Applicative (many, (<|>))
import qualified Data.Attoparsec.Text as A
import qualified Data.Text as T
-- | Parse a field of a record.
field :: A.Parser T.Text -- ^ parser
field = fmap T.concat quoted <|> normal A.<?> "field"
  where
    normal = A.takeWhile (A.notInClass "\n\r,\"") A.<?> "normal field"
    quoted = A.char '"' *> many between <* A.char '"' A.<?> "quoted field"
    between = A.takeWhile1 (/= '"') <|> (A.string "\"\"" *> pure "\"")
-- | Parse a block of text into a CSV table.
comma :: T.Text -- ^ CSV text
      -> Either String [[T.Text]] -- ^ error | table
comma text
  | T.null text = Right []
  | otherwise = A.parseOnly table text
  where
    table = A.sepBy1 record A.endOfLine A.<?> "table"
    record = A.sepBy1 field (A.char ',') A.<?> "record"
This works well for a variety of inputs but does not work when there
is a trailing \n at the end of the input.
Current behaviour:
> comma "hello\nworld"
Right [["hello"],["world"]]
> comma "hello\nworld\n"
Right [["hello"],["world"],[""]]
Wanted behaviour:
> comma "hello\nworld"
Right [["hello"],["world"]]
> comma "hello\nworld\n"
Right [["hello"],["world"]]
I have been trying to fix this issue but I ran out of ideas. I am almost
certain that it will have to be something with A.endOfInput as that is the
significant anchor and the only "bonus" information we have. Any ideas on how
to work that into the code?
One possible idea is to look at the end of the string before running the
Attoparsec parser and remove the last character (or two in the case of \r\n),
but that seems to be a hacky solution that I would like to avoid in my code.
Full code of the library can be found here: https://github.com/lovasko/comma
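For reference, the pre-processing workaround described above (the one the question would rather avoid) might look like the sketch below. It works on plain String using only base, and stripFinalNewline is a hypothetical name. With Data.Text, something similar could presumably be built from T.stripSuffix. It is not the parser-level fix the question asks for.

```haskell
import Data.List (stripPrefix)

-- Drop a single trailing line terminator ("\n" or "\r\n"), if present.
-- Only the final terminator is removed; earlier blank lines are kept.
stripFinalNewline :: String -> String
stripFinalNewline s =
  case stripPrefix "\n" (reverse s) of
    Just rest -> reverse (dropCR rest)
    Nothing   -> s
  where
    -- After reversing, a "\r\n" ending shows up as '\r' right behind the '\n'.
    dropCR ('\r' : rest) = rest
    dropCR rest          = rest
```

The result would then be handed to the parser, so that "hello\nworld\n" and "hello\nworld" reach it as the same input.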

Records from <tr>s in an Html table using Arrows and HXT in Haskell

I'm looking to extract records from a very well-formed HTML table using HXT. I've reviewed a couple of examples on SO and the HXT documentation, such as:
Extracting Values from a Subtree
http://adit.io/posts/2012-04-14-working_with_HTML_in_haskell.html
https://www.schoolofhaskell.com/school/advanced-haskell/xml-parsing-with-validation
Running Haskell HXT outside of IO?
extract multiples html tables with hxt
Parsing html in haskell
http://neilbartlett.name/blog/2007/08/01/haskell-explaining-arrows-through-xml-transformationa/
https://wiki.haskell.org/HXT/Practical/Simple2
https://wiki.haskell.org/HXT/Practical/Simple1
Group html table rows with HXT in Haskell
Parsing multiple child nodes in Haskell with HXT
My problem is:
I want to identify a table uniquely by a known id, and then for each
tr within that table, create a record object and return this as a list
of records.
Here's my HTML
<!DOCTYPE html>
<html>
<head>
<title>FakeHTML</title>
</head>
<body>
<table id="fakeout-dont-get-me">
<thead><tr><td>Null</td></tr></thead>
<tbody><tr><td>Junk!</td></tr></tbody>
</table>
<table id="Greatest-Table">
<thead>
<tr><td>Name</td><td>Favorite Rock</td></tr>
</thead>
<tbody>
<tr id="rock1">
<td>Fred</td>
<td>Igneous</td>
</tr>
<tr id="rock2">
<td>Bill</td>
<td>Sedimentary</td>
</tr>
</tbody>
</table>
</body>
</html>
Here's the code I'm trying, along with 2 different approaches to parsing this. First, imports ...
{-# LANGUAGE Arrows, OverloadedStrings, DeriveDataTypeable, FlexibleContexts #-}
import Text.XML.HXT.Core
import Text.HandsomeSoup
import Text.XML.HXT.XPath.XPathEval
import Data.Tree.NTree.TypeDefs
import Text.XML.HXT.XPath.Arrows
What I want is a list of Rockrecs, eg from...
recs = [("rock1", "Name", "Fred", "Favorite Rock", "Igneous"),
        ("rock2", "Name", "Bill", "Favorite Rock", "Sedimentary")]
data Rockrec = Rockrec { rockID :: String,
                         rockName :: String,
                         rockFav :: String } deriving Show
rocks = [(\(a,_,b,_,c) -> Rockrec a b c ) r | r <- recs]
-- [Rockrec {rockID = "rock1", rockName = "Fred", rockFav = "Igneous"},
-- Rockrec {rockID = "rock2", rockName = "Bill", rockFav = "Sedimentary"}]
Here's my first way, which uses a bind on runLA after I return a bunch of [XmlTree]. That is, I do a first parse just to get the right table, then I process the tree rows after that first grab.
Attempt 1
getTab = do
  dt <- Prelude.readFile "fake.html"
  let html = parseHtml dt
  tab <- runX $ html //> hasAttrValue "id" (== "Greatest-Table")
  return tab
-- hmm, now this gets tricky...
-- table <- getTab
node tag = multi (hasName tag)
-- a la https://stackoverflow.com/questions/3901492/running-haskell-hxt-outside-of-io?rq=1
getIt :: ArrowXml cat => cat (Data.Tree.NTree.TypeDefs.NTree XNode) (String, String)
getIt = (node "tr" >>>
         (getAttrValue "id" &&& (node "td" //> getText)))
This kinda works. I need to massage a bit, but can get it to run...
-- table >>= runLA getIt
-- [("","Name"),("","Favorite Rock"),("rock1","Fred"),("rock1","Igneous"),("rock2","Bill"),("rock2","Sedimentary")]
This is a second approach, inspired by https://wiki.haskell.org/HXT/Practical/Simple1. Here, I think I'm relying on something in {-# LANGUAGE Arrows #-} (which coincidentally breaks my list comprehension for rec above) to use proc notation and do this in a more readable do block. That said, I can't even get a minimal version of this to compile:
Attempt 2
getR :: ArrowXml cat => cat XmlTree Rockrec
getR = (hasAttrValue "id" (== "Greatest-Table")) >>>
  proc x -> do
    rockId <- getText -< x
    rockName <- getText -< x
    rockFav <- getText -< x
    returnA -< Rockrec rockId rockName rockFav
EDIT
Trouble with the types, in response to the comment below from Alec
λ> getR [table]
<interactive>:56:1-12: error:
• Couldn't match type ‘NTree XNode’ with ‘[[XmlTree]]’
Expected type: [[XmlTree]] -> Rockrec
Actual type: XmlTree -> Rockrec
• The function ‘getR’ is applied to one argument,
its type is ‘cat0 XmlTree Rockrec’,
it is specialized to ‘XmlTree -> Rockrec’
In the expression: getR [table]
In an equation for ‘it’: it = getR [table]
λ> getR table
<interactive>:57:1-10: error:
• Couldn't match type ‘NTree XNode’ with ‘[XmlTree]’
Expected type: [XmlTree] -> Rockrec
Actual type: XmlTree -> Rockrec
• The function ‘getR’ is applied to one argument,
its type is ‘cat0 XmlTree Rockrec’,
it is specialized to ‘XmlTree -> Rockrec’
In the expression: getR table
In an equation for ‘it’: it = getR table
END EDIT
Even if I'm not selecting elements, I can't get the above to run. I'm also a little puzzled at how I should do something like put the first td in rockName and the second td in rockFav, how to include an iterator on these (supposing I have a lot of td fields, instead of just 2.)
Any further general tips on how to do this more painlessly appreciated.
From HXT/Practical/Google1 I think I am able to piece together a solution.
{-# LANGUAGE Arrows #-}
{-# LANGUAGE ScopedTypeVariables #-}
module Hanzo where
import Text.HandsomeSoup
import Text.XML.HXT.Core
atTag tag =
  deep (isElem >>> hasName tag)
text =
  deep isText >>> getText
data Rock = Rock String String String deriving Show
rocks =
  atTag "tbody" //> atTag "tr"
  >>> proc x -> do
    rowID <- getAttrValue "id" -< x
    name <- atTag "td" >. (!! 0) >>> text -< x
    kind <- atTag "td" >. (!! 1) >>> text -< x
    returnA -< Rock rowID name kind
main = do
  dt <- readFile "html.html"
  result <- runX $ parseHtml dt
    //> hasAttrValue "id" (== "Greatest-Table")
    >>> rocks
  print result
The key takeaways are these:
Your arrows work on streams of elements, but not individual elements. This is the ArrowList constraint. Thus, calling getText three times will produce surprising behavior because getText represents all the different possible text values you could get in the course of streaming <table> elements through your proc x -> do {...}.
What we can do instead is focus on the stream we want: a stream of <tr>s inside the <tbody>. For each table row, we grab the ID attribute value and the text of the first two <td>s.
This does not seem the most elegant solution, but one way we can index into a stream is to filter it down with the (>.) :: ArrowList cat => cat a b -> ([b] -> c) -> cat a c combinator.
One last trick, one that I noticed in the practical wiki examples: we can use deep and isElem/isText to focus on just the nodes we want. XML trees are noisy!
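To make the "stream" point concrete without pulling in HXT, here is a toy model in plain Haskell. It is only a sketch: ListArrow, andThen, collapse, rows and cells are hypothetical names invented for this illustration, with collapse playing the role of (>.) and andThen the role of (>>>) for functions that return every result as a list.

```haskell
-- A list arrow modelled as a plain function that returns every result.
type ListArrow a b = a -> [b]

-- Sequential composition: feed each result of the first arrow into the second.
andThen :: ListArrow a b -> ListArrow b c -> ListArrow a c
andThen f g = concatMap g . f

-- Analogue of (>.): gather the whole result stream and reduce it to one value.
collapse :: ListArrow a b -> ([b] -> c) -> ListArrow a c
collapse f h = \x -> [h (f x)]

-- Toy "table": one row per line, one cell per word.
rows :: ListArrow String String
rows = lines

cells :: ListArrow String String
cells = words

-- Composing naively flattens every cell of every row into one stream.
allCells :: ListArrow String String
allCells = rows `andThen` cells

-- Collapsing per row keeps the row structure: pick the second cell of each row.
secondCells :: ListArrow String String
secondCells = rows `andThen` collapse cells (!! 1)
```

On the input "Fred Igneous\nBill Sedimentary", allCells gives ["Fred","Igneous","Bill","Sedimentary"], while secondCells gives ["Igneous","Sedimentary"]. Indexing has to happen per row, after the stream for that row has been collected.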

Haskell print the first line into Browser Tab [duplicate]

This question already has an answer here:
Return the first line of a String in Haskell
(1 answer)
Closed 8 years ago.
Just a simple question, my code is complete. It takes an input file, breaks it into lines, reads the file line by line, does the conversions, which is in this case, turns certain things into HTML format (ex: #This is a line into a line with H1 HTML tags, formatting it into a header). The only thing I have left is to take the First line of code, and print that code into the browser tab. Also, the body, or tail must be printed into the window, not the tab. So the first line of my .txt file is The Title! which I want to show in the tab of the web browser. Here is something I have for that:
formatToHTML :: String -> String
formatToHTML [] = []
formatToHTML x
| head x == --any char = "<title>" ++ head ++ "</title>"
| tail x == --rest of file = "<body>" ++ tail ++ "</tail>"
| otherwise = null
or
formatToHTML :: [String] -> String
formatToHTML = unlines. map (show) "<title>" ++ head ++ </title>" $ lines
I don't want to use guards here, and I don't think I even need to, but I can't think of a shorter way to do my task.
I would call this from my main method before I output my file to html.
Also, I know it's an amateur Haskell question, but how would I represent any char? Say, I want to say: if the head of x exists, print the head with the title tags and print the tail with body tags. Help? Thank you.
My guess of what you want is:
formatHtml :: [String] -> String
formatHtml [] = ""
formatHtml (x:xs) = unlines theLines
  where theLines = [ "<title>" ++ ...convert x to html... ++ "</title>",
                     "<body>" ] ++ map toHtml xs ++ [ "</body>" ]
toHtml :: String -> String
toHtml str = ...converts str to HTML...
Example:
formatHtml [ "the title", "body line 1", "body line 2" ]
results in:
<title>the title</title>
<body>
body line 1
body line 2
</body>
You still have to define the toHtml function and decide how to convert the first line to the inner html of the tag.
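A compilable version of the outline above, as a minimal sketch: toHtml is just a placeholder here (an assumption, it leaves each line unchanged), and the title line is passed through the same placeholder. Swap in the real per-line conversion.

```haskell
-- Placeholder conversion (an assumption): leaves each line unchanged.
toHtml :: String -> String
toHtml = id

-- First line becomes the <title>; the remaining lines go inside <body>.
formatHtml :: [String] -> String
formatHtml []       = ""
formatHtml (x : xs) = unlines theLines
  where
    theLines =
      [ "<title>" ++ toHtml x ++ "</title>", "<body>" ]
        ++ map toHtml xs
        ++ [ "</body>" ]
```

Called from main, this would typically be applied to lines contents, where contents is the text read from the input file.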

Read An Input.md file and output a .html file Haskell

I had a question concerning some basic transformations in Haskell.
Basically, I have a written Input file, named Input.md. This contains some markdown text that is read in my project file, and I want to write a few functions to do transformations on the text. After completing these functions under a function called convertToHTML, I have output the file as an .html file in the correct format.
module Main
(
convertToHTML,
main
) where
import System.Environment (getArgs)
import System.IO
import Data.Char (toLower, toUpper)
process :: String -> String
process s = head $ lines s
convertToHTML :: String -> String
convertToHTML str = do
  x <- str
  if (x == '#')
    then "<h1>"
    else return x
--convertToHTML x = map toUpper x
main = do
  args <- getArgs -- command line args
  let (infile,outfile) = (\(x:y:ys)->(x,y)) args
  putStrLn $ "Input file: " ++ infile
  putStrLn $ "Output file: " ++ outfile
  contents <- readFile infile
  writeFile outfile $ convertToHTML contents
So,
How would I read through my input file, and transform any line that starts with a # to an html tag
How would I read through my input file once more and transform any WORD that is surrounded by _word_ (1 underscore) to another html tag
Replace any Character with an html string.
I tried using functions such as map, filter, and zipWith, but could not figure out how to iterate through the text and transform each piece. Please, if anybody has any suggestions: I've been working on this for 2 days straight and have a bunch of failed code to show for it.
I tried using such functions such as Map, Filter, ZipWith, but could not figure out how to iterate through the text and transform each text.
That is because they work on a suitable collection of elements, and they don't really "iterate"; you simply have to feed them the appropriate data. Let's tackle the # problem as an example.
Our file is one giant String, and what we'd like is to have it nicely split in lines, so [String]. What could do it for us? I have no idea, so let's just search Hoogle for String -> [String].
Ah, there we go, lines function! Its counterpart, unlines, is also going to be useful. Now we can write our line wrapper:
convertHeader :: String -> String
convertHeader [] = [] -- that prevents us from calling head on an empty line
convertHeader x = if head x == '#' then "<h1>" ++ x ++ "</h1>"
                  else x
and so:
convertHeaders :: String -> String
convertHeaders = unlines . map convertHeader . lines
-- ^String ^[String] ^[String] ^String
As you can see, the function first splits the file into lines, maps convertHeader over each line, and then puts the file back together.
See it live on Ideone
Try now doing the same with words to replace your formatting patterns. As a bonus exercise, change convertHeader to count the number of # in front of the line and output <h1>, <h2>, <h3> and so on accordingly.
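One possible take on both exercises, as a rough sketch rather than a reference solution. The names convertEmphasis, emphasize, and convertDocument are hypothetical. Unlike the convertHeader above, this variant strips the leading # characters, and it assumes heading levels are capped at <h6> and that _word_ should become <em>word</em>. Note that words/unwords collapses runs of whitespace within a line.

```haskell
-- Turn "# Title" into <h1>Title</h1>, "## Title" into <h2>Title</h2>, etc.
-- Lines without a leading '#' are returned unchanged.
convertHeader :: String -> String
convertHeader line =
  case span (== '#') line of
    ("", _) -> line
    (hashes, rest) ->
      let level = min 6 (length hashes)   -- assumption: cap at <h6>
          tag   = "h" ++ show level
      in "<" ++ tag ++ ">" ++ dropWhile (== ' ') rest ++ "</" ++ tag ++ ">"

-- Wrap a single word of the form _word_ in <em> tags.
emphasize :: String -> String
emphasize ('_' : rest@(_ : _ : _))
  | last rest == '_' = "<em>" ++ init rest ++ "</em>"
emphasize w = w

-- Apply emphasize to every whitespace-separated word of a line.
convertEmphasis :: String -> String
convertEmphasis = unwords . map emphasize . words

-- Emphasis runs first so that words inside a heading are still seen as words.
convertDocument :: String -> String
convertDocument = unlines . map (convertHeader . convertEmphasis) . lines
```

For example, the line "# My _big_ title" becomes <h1>My <em>big</em> title</h1>.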