Tail recursive reading of file OCaml - exception

I write a function that print all contents of file like that
let rec print_file channel =
try
begin
print_endline (input_line channel);
print_file channel
end
with End_of_file -> ()
And because of print_file is last operation I was thinking that it will be optimized to regular loop. but when I had runned my program on really big file, I had got stack overflow.
So I has tried to wrap input_line function to input_line_opt that don't raise exception and little change code of print_file.
let input_line_opt channel =
try Some (input_line channel)
with End_of_file -> None
let rec print_file channel =
let line = input_line_opt channel in
match line with
Some line -> (print_endline line; print_file channel)
| None -> ()
And now it works like regular tail recursive function and don't overflow my stack.
What difference between this two function?

In the first example, try ... with is an operation that occurs after the recursive call to print_file. So the function is not tail recursive.
You can imagine that try sets up some data on the stack, and with removes the data from the stack. Since you remove the data after the recursive call, the stack gets deeper and deeper.
This was a consistent problem in earlier revisions of OCaml. It was tricky to write tail-recursive code for processing a file. In recent revisions you can use an exception clause of match to get a tail position for the recursive call:
let rec print_file channel =
match input_line channel with
| line -> print_endline line; print_file channel
| exception End_of_file -> ()

Related

Julia waits for function to finish before printing message in a loop

I have a function in Julia that requires to do things in a loop. A parameter is passed to the loop, and the bigger this parameter, the slower the function gets. I would like to have a message to know in which iteration it is, but it seems that Julia waits for the whole function to be finished before printing anything. This is Julia 1.4. That behaviour was not on Julia 1.3.
A example would be like this
function f(x)
rr=0.000:0.0001:x
aux=0
for r in rr
print(r, " ")
aux+=veryslowfunction(r)
end
return aux
end
As it is, f, when called, does not print anything until it has finished.
You need to add after the print command:
flush(stdout)
Explanation
The standard output of a process is usually buffered. The particular buffer size and behavior will depend on your system setting and perhaps the terminal type.
By flushing the buffer you make sure that the contents is actually sent to the terminal.
Alternatively, you can also use a library like ProgressLogging.jl (needs TerminalLoggers.jl to see actual output), or ProgressMeter.jl, which will automatically update a nicely formatted status bar during each step of the loop.
For example, with ProgressMeter, a call to
function f(x)
rr=0.000:0.0001:x
aux=0
#showprogress for r in rr
aux += veryslowfunction(r)
end
return aux
end
will show something like (in the end):
Progress: 100%|██████████████████████████████████████████████████████████████| Time: 0:00:10
Again I can't reproduce the behaviour in my terminal (it always prints), but I wanted to add that for these types of situations the #show macro is quite neat:
julia> function f(x)
rr=0.000:0.0001:x
aux=0
for r in rr
#show r
aux+=veryslowfunction(r)
end
return aux
end
f (generic function with 1 method)
julia> f(1)
r = 0.0
r = 0.0001
r = 0.0002
...
It uses println under the hood:
julia> using MacroTools
julia> a = 5
5
julia> prettify(#expand(#show a))
quote
Base.println("a = ", Base.repr($(Expr(:(=), :ibex, :a))))
ibex
end

Haskell - Pass arbitrary function and argument list as arguments to another function

My main goal is to redirect stderr to a file.
I got hold of the following code snippet...
catchOutput :: IO a -> IO (res, String)
catchOutput f = do
tmpd <- getTemporaryDirectory
(tmpf, tmph) <- openTempFile tmpd "haskell_stderr"
stderr_dup <- hDuplicate stderr
hDuplicateTo tmph stderr
hClose tmph
res <- f
hDuplicateTo stderr_dup stderr
str <- readFile tmpf
removeFile tmpf
return (res, str)
I hoped to make this more general and pass any function and argument list to catchOutput and get the function result as well as message written to stderr (if any).
I thought that an argument list of type [Data.Dynamic] might work but I failed to retrieve the function result with
res <- Data.List.foldl (f . fromDyn) Nothing $ args
Is this even possible? Help will be greatly appreciated.
There is not reason to use Data.Dynamic. You already know type the return type of f, it's a so you can use just that, i.e.
catchOutput :: IO a -> IO (a, String)
Note though, that there some significant issues with your approach:
By redirecting stderr to a file, this will also affect all other concurrent threads. So you could possibly get unrelated data sent to the temporary file.
If an exception is thrown while stderr is redirected, the original stderr will not be restored. Any operation between the two hDuplicateTo lines (hClose and f in this case) could possibly throw an exception, or the thread may receive an asynchronous exception. For this reason, you have to use something like bracket to make your code exception safe.

Can't define an exception only in a mli file

Ok, this is mostly about curiosity but I find it too strange.
Let's suppose I have this code
sig.mli
type t = A | B
main.ml
let f =
let open Sig in
function A | B -> ()
If I compile, everything will work.
Now, let's try to modify sig.mli
sig.mli
type t = A | B
exception Argh
and main.ml
main.ml
let f =
let open Sig in
function
| A -> ()
| B -> raise Argh
And let's try to compile it :
> ocamlc -o main sig.mli main.ml
File "main.ml", line 1:
Error: Error while linking main.cmo:
Reference to undefined global `Sig'
Well, is it just because I added the exception ? Maybe it means that exceptions are like functions or modules, you need a proper implementation.
But then, what if I write
main.ml
let f =
let open Sig in
function A | B -> ()
And try to compile ?
> ocamlc -o main sig.mli main.ml
>
It worked ! If I don't use the exception, it compiles !
There is no reason to this behaviour, right ? (I tested it on different compilers, 3.12.0, 4.00.0, 4.02.3 and 4.03.0 and all of them gave the same error)
Unlike variants, exception is not a pure type and requires its implementation in .ml file. Compile the following code with ocamlc -dlambda -c x.ml:
let x = Exit
-- the output --
(setglobal X!
(seq (opaque (global Pervasives!))
(let (x/1199 = (field 2 (global Pervasives!)))
(pseudo _none_(1)<ghost>:-1--1 (makeblock 0 x/1199)))))
You can see (let (x/1999 = (field 2 (global Pervasives!))).. which means assigning the value stored in the 2nd position of module Pervasives. This is the value of Exit. Exceptions have their values and therefore need .ml.
Variants do not require implementation. It is since their values can be constructed purely from their type information: constructors' tag integers. We cannot assign tag integers to exceptions (and their generalized version, open type constructors) since they are openly defined. Instead they define values for their identification in .ml.
To get an implementation of the exception, you need sig.ml. A .mli file is an interface file, a .ml file is an implementation file.
For this simple example you could just rename sig.mli to sig.ml:
$ cat sig.ml
type t = A | B
exception Argh
$ cat main.ml
let f =
let open Sig in
function
| A -> ()
| B -> raise Argh
$ ocamlc -o main sig.ml main.ml
I don't see a problem with this behavior, though it would be nice not to have to duplicate types and exceptions between .ml and .mli files. The current setup has the advantage of being simple and explicit. (I'm not a fan of compilers being too clever and doing things behind my back.)

How do I correctly use Control.Exception.catch in Haskell?

Can someone please explain the difference between the behavior in ghci of the following to lines:
catch (return $ head []) $ \(e :: SomeException) -> return "good message"
returns
"*** Exception: Prelude.head: empty list
but
catch (print $ head []) $ \(e :: SomeException) -> print "good message"
returns
"good message"
Why isn't the first case catching the exception? Why are they different? And why does the first case put a double quote before the exception message?
Thanks.
Let's examine what happens in the first case:
catch (return $ head []) $ \(e :: SomeException) -> return "good message"
You create thunk head [] which is returned as an IO action. This thunk doesn't throw any exception, because it isn't evaluated, so the whole call catch (return $ head []) $ ... (which is of type IO String) produces the String thunk without an exception. The exception occurs only when ghci tries to print the result afterwards. If you tried
catch (return $ head []) $ \(e :: SomeException) -> return "good message"
>> return ()
instead, no exception would have been printed.
This is also the reason why you get _"* Exception: Prelude.head: empty list_. GHCi starts to print the string, which starts with ". Then it tries to evaluate the string, which results in an exception, and this is printed out.
Try replacing return with evaluate (which forces its argument to WHNF) as
catch (evaluate $ head []) $ \(e :: SomeException) -> return "good message"
then you'll force the thunk to evaluate inside catch which will throw the exception and let the handler intercept it.
In the other case
catch (print $ head []) $ \(e :: SomeException) -> print "good message"
the exception occurs inside the catch part when print tries to examine head [] and so it is caught by the handler.
Update: As you suggest, a good thing is to force the value, preferably to its full normal form. This way, you ensure that there are no "surprises" waiting for you in lazy thunks. This is a good thing anyway, for example you can get hard-to-find problems if your thread returns an unevaluated thunk and it is actually evaluated in another, unsuspecting thread.
Module Control.Exception already has evaluate, which forces a thunk into its WHNF. We can easily augment it to force it to its full NF:
import Control.DeepSeq
import Control.Seq
import Control.Exception
import Control.Monad
toNF :: (NFData a) => a -> IO a
toNF = evaluate . withStrategy rdeepseq
Using this, we can create a strict variant of catch that forces a given action to its NF:
strictCatch :: (NFData a, Exception e) => IO a -> (e -> IO a) -> IO a
strictCatch = catch . (toNF =<<)
This way, we are sure that the returned value is fully evaluated, so we won't get any exceptions when examining it. You can verify that if you use strictCatch instead of catch in your first example, it works as expected.
return $ head []
wraps head [] in an IO action (because catch has an IO type, otherwise it would be any monad) and returns it. There is nothing caught because there is no error. head [] itself is not evaluated at that point, thanks to lazyness, but only returned.
So, return only adds a layer of wrapping, and the result of your whole catch expression is head [], quite valid, unevaluated. Only when GHCi or your program actually try to use that value at some later point, it will be evaluated and the empty list error is thrown - at a different point, however.
print $ head []
on the other hand immediately evaluates head [], yielding an error which is subsequently caught.
You can also see the difference in GHCi:
Prelude> :t head []
head [] :: a
Prelude> :t return $ head []
return $ head [] :: Monad m => m a
Prelude> :t print $ head []
print $ head [] :: IO ()
Prelude> return $ head [] -- no error here!
Prelude> print $ head []
*** Exception: Prelude.head: empty list
To avoid that, you can perhaps just force the value:
Prelude> let x = head [] in x `seq` return x
*** Exception: Prelude.head: empty list
GHCi works in the IO monad and return for IO doesn't force its argument. So return $ head [] doesn't throw any exception, and an exception that isn't thrown can't be caught. Printing the result afterwards throws an exception, but this exception isn't in the scope of catch anymore.
The type IO a has two parts:
The structure, the part that goes out and performs side-effects. This is represented by IO.
The result, the pure value held inside. This is represented by a.
The catch function only catches exceptions when they break through the IO structure.
In the first example, you call print on the value. Since printing a value requires performing I/O with it, any exceptions raised within the value end up in the structure itself. So catch intercepts the exception and all is well.
On the other hand, return does not inspect its argument. In fact, you are guaranteed by the monad laws that calling return does not affect the structure at all. So your value simply passes right through.
So if your code isn't affected by the exception, then where is the error message coming from? The answer is, surprisingly, outside your code. GHCi implicitly tries to print every expression that is passed to it. But by then, we are already outside the scope of the catch, hence the error message you see.

Printing stack traces

I have a very short test file:
let print_backtrace () = try raise Not_found with
Not_found -> Printexc.print_backtrace stdout;;
let f () = print_backtrace (); Printf.printf "this is to make f non-tail-recursive\n";;
f ();
I compile and run:
% ocamlc -g test.ml
% OCAMLRUNPARAM=b ./a.out
Raised at file "test.ml", line 1, characters 35-44
this is to make f non-tail-recursive
Why isn't f listed in the stack trace? How can I write a function that will print a stack trace of the location it's called from?
The documentation for Printexc.print_backtrace says:
The backtrace lists the program locations where the most-recently raised exception was raised and where it was propagated through function calls.
It actually seems to be doing the right thing. The exception hasn't been propagated back through f.
If I move the call to Printexc.print_backtrace outside the call to f, I see a full backtrace.
$ cat test2.ml
let print_backtrace () = raise Not_found
let f () = let res = print_backtrace () in res ;;
try f () with Not_found -> Printexc.print_backtrace stdout
$ /usr/local/ocaml312/bin/ocamlc -g test2.ml
$ OCAMLRUNPARAM=b a.out
Raised at file "test2.ml", line 1, characters 31-40
Called from file "test2.ml", line 3, characters 21-39
Called from file "test2.ml", line 5, characters 4-8
Here is the code to do what I suggested. I recommend using ocamldebug if at all possible, this code is much too tricky. But it works on my system for this simple example.
let print_backtrace () =
match Unix.fork () with
| 0 -> raise Not_found
| pid -> let _ = Unix.waitpid [] pid in ()
let f () =
begin
print_backtrace ();
Printf.printf "after the backtrace\n";
end
;;
f ()
Here is a test run.
$ /usr/local/ocaml312/bin/ocamlc unix.cma -g test3.ml
$ OCAMLRUNPARAM=b a.out
Fatal error: exception Not_found
Raised at file "test3.ml", line 3, characters 17-26
Called from file "test3.ml", line 8, characters 4-22
Called from file "test3.ml", line 14, characters 0-4
after the backtrace
I realized that because of the uncaught exception, you don't really have any control over the way the child process exits. That's one reason this code is much too tricky. Please don't blame me if it doesn't work for you, but I hope it does prove useful.
I tested the code on Mac OS X 10.6.8 using OCaml 3.12.0.
Best regards,