Strange output in Clojure Web Crawler - html

I have the following problem. I'm doing WebCrawler for a school assignment, and I'm doing it in Clojure. Here is the code.
(defn crawl [url current-depth max-depth]
(def hrefs (get-links url))
(if (< current-depth max-depth)
(map crawl hrefs (iterate eval (inc current-depth)) (iterate eval max-depth))
hrefs))
(defn get-links [page]
($ (get! page) td "a[href]" (attr "abs:href")))
The get! and $ functions is not written by me, I've taken them from here: https://github.com/mfornos/clojure-soup/blob/master/src/jsoup/soup.clj
My problem is that when I call (crawl "http://bard.bg" 0 0) from repl I get the following output:
("http://www.bard.bg/genres/?id=1" "http://www.bard.bg/genres/?id=2" "http://www.bard.bg/genres/?id=4" "http://www.bard.bg/genres/?id=5" "http:/
("http://www.bard.bg/genres/?id=1" "http://www.bard.bg/genres/?id=2" "http://www.bard.bg/genres/?id=4" "http://www.bard.bg/genres/?id=5" "http:/
("http://www.bard.bg/genres/?id=1" "http://www.bard.bg/genres/?id=2" "http://www.bard.bg/genres/?id=4" "http://www.bard.bg/genres/?id=5" "http://www.bard.bg/genres/?id=6" "http://www.bard.bg/genres/?id=10" "http://www.bard.bg/genres/?id=17" "http://www.bard.bg/genres/?id=24"
...
So where do the first 2 lazyseqs are coming from? Why are they unfinished?
Seems like the problem is in the Clojure-Soup and more specifically here:
(defmacro $ [doc & forms]
(let [exprs# (map #(if (string? %) `(select ~%)
(if (symbol? %) `(select ~(str %))
(if (keyword? %) `(select ~(str "#"(name %)))
%))) forms)]
`(->> ~doc ~#exprs#)))`

I cannot reproduce the problem you described. In my case (crawl "http://bard.bg" 0 0) returns a list of 174 strings.
However, I'd like to take this opportunity to point you to an incorrect usage of def in the crawl function. Instead of def you should use let. Additionally, instead of (iterate eval ...) use repeat.
(defn crawl [url current-depth max-depth]
(let [hrefs (get-links url)]
(if (< current-depth max-depth)
(map crawl hrefs (repeat (inc current-depth)) (repeat max-depth))
hrefs)))
For discussion see let vs def in clojure.

Related

ClojureScript. Reset atom in Reagent when re-render occurs

I'm displaying a set of questions for a quiz test and I'm assigning a number to each question just to number them when they are shown in the browser:
(defn questions-list
[]
(let [counter (atom 0)]
(fn []
(into [:section]
(for [question #(re-frame/subscribe [:questions])]
[display-question (assoc question :counter (swap! counter inc))])))))
The problem is that when someone edits a question in the browser (and the dispatch is called and the "app-db" map is updated) the component is re-rendered but the atom "counter" logically starts from the last number not from zero. So I need to reset the atom but I don't know where. I tried with a let inside the anonymous function but that didn't work.
In this case I'd just remove the state entirely. I haven't tested this code, but your thinking imperatively here. The functional version of what your trying to do is something along the lines of:
Poor but stateless:
(let [numbers (range 0 (count questions))
indexed (map #(assoc (nth questions %) :index %) questions)]
[:section
(for [question indexed]
[display-question question])])
but this is ugly, and nth is inefficient. So lets try one better. Turns out map can take more than one collection as it's argument.
(let [numbers (range 0 (count questions))
indexed (map (fn [idx question] (assoc question :index idx)) questions)]
[:section
(for [question indexed]
[display-question question])])
But even better, turns out there is a built in function for exactly this. What I'd actually write:
[:section
(doall
(map-indexed
(fn [idx question]
[display-question (assoc question :index idx)])
questions))]
Note: None of this code has actually been run, so you might have to tweak it a bit before it works. I'd recommend looking up all of the functions in ClojureDocs to make sure you understand what they do.
If you need counter to be just an index for a question, you could instead use something like this:
(defn questions-list
[]
(let [questions #(re-frame/subscribe [:questions])
n (count questions)]
(fn []
[:section
[:ul
(map-indexed (fn [idx question] ^{:key idx} [:li question]) questions)]])))
Note: here I used [:li question] because I assumed that question is some kind of text.
Also, you could avoid computing the count for questions in this component and do it with a layer 3 subscription:
(ns your-app.subs
(:require
[re-frame.core :as rf]))
;; other subscriptions...
(rf/reg-sub
:questions-count
(fn [_ _]
[(rf/subscribe [:questions])])
(fn [[questions] _]
(count questions)))
Then in the let binding of your component you would need to replace n (count questions) with n #(re-frame/subscribe [:questions-count]).

How to create a function local mutable variable in ClojureScript?

Consider the following hypothetical nonsensical ClojureScript function:
(defn tmp []
(def p 0)
(set! p (inc p))
(set! p (inc p))
(set! p (inc p)))
Repeatedly executing this function in a REPL results in
3
6
9
etc.
Is it possible to create a mutable variable which is local to the function, such that the output would have been
3
3
3
etc.
in the case of repeated exection of (tmp)?
let lets you assign variables limited to it's scope:
(defn tmp[]
(let [p 0]
...))
Now, clojurescript makes use of immutable data. That means everything is basically a constant, and once you set a value of p, there is no changing it. There are two ways you can get around this:
Use more local variables
(defn tmp[]
(let [a 0
b (inc a)
c (inc b)
d (inc c)]
...))
Use an atom
Atoms are somewhat different from other data structures in clojurescript and allow control of their state. Basically, you can see them as a reference to your value.
When creating an atom, you pass the initial value as its argument. You can access an atoms value by adding # in front of the variable, which is actually a macro for (deref my-var).
You can change the value of an atom using swap! and reset! functions. Find out more about them in the cljs cheatsheet.
(defn tmp[]
(let [p (atom 0)]
(reset! p (inc #p))
(reset! p (inc #p))
(reset! p (inc #p))))
Hope this helps.

How to build a vector via a call to reduce

I'm trying to figure why this particular function isn't working as expected. I suspect from the error message that it has something to do with the way I'm creating the empty vector for the accumulator.
I have a simple function that returns a sequence of 2-element vectors:
(defn zip-with-index
"Returns a sequence in which each element is of the
form [i c] where i is the index of the element and c
is the element at that index."
[coll]
(map-indexed (fn [i c] [i c]) coll))
That works fine. The problem comes when I try to use it in another function
(defn indexes-satisfying
"Returns a vector containing all indexes of coll that satisfy
the predicate p."
[p coll]
(defn accum-if-satisfies [acc zipped]
(let [idx (first zipped)
elem (second zipped)]
(if (p elem)
(conj acc idx)
(acc))))
(reduce accum-if-satisfies (vector) (zip-with-index coll)))
It compiles, but when I attempt to use it I get an error:
user=> (indexes-satisfying (partial > 3) [1 3 5 7])
ArityException Wrong number of args (0) passed to: PersistentVector
clojure.lang.AFn.throwArity (AFn.java:437)
I can't figure out what's going wrong here. Also if there is a more 'Clojure-like' way of doing what I'm trying to do, I'm interested in hearing about that also.
The problem is probably on the else clause of accum-if-satisfies, should be just acc not (acc).
You could use filter and then map instead of reduce. Like that:
(map #(first %)
(filter #(p (second %))
(zip-with-index coll)))
You could also call map-indexed with vector instead of (fn [i c] [i c]).
The whole code would look like that:
(defn indexes-satisfying
[p coll]
(map #(first %)
(filter #(p (second %))
(map-indexed vector coll))))
As for a more Clojure-like way, you could use
(defn indexes-satisfying [pred coll]
(filterv #(pred (nth coll %))
(range (count coll))))
Use filter instead of filterv to return a lazy seq rather than a vector.
Also, you should not use defn to define inner functions; it will instead define a global function in the namespace where the inner function is defined and have subtle side effects besides that. Use letfn instead:
(defn outer [& args]
(letfn [(inner [& inner-args] ...)]
(inner ...)))
One more way to do it would be:
(defn indexes-satisfying [p coll]
(keep-indexed #(if (p %2) % nil) coll))

Retrieve Clojure function metadata dynamically

Environment: Clojure 1.4
I'm trying to pull function metadata dynamically from a vector of functions.
(defn #^{:tau-or-pi: :pi} funca "doc for func a" {:ans 42} [x] (* x x))
(defn #^{:tau-or-pi: :tau} funcb "doc for func b" {:ans 43} [x] (* x x x))
(def funcs [funca funcb])
Now, retrieving the metadata in the REPL is (somewhat) straight-forward:
user=>(:tau-or-pi (meta #'funca))
:pi
user=>(:ans (meta #'funca))
42
user=>(:tau-or-pi (meta #'funcb))
:tau
user=>(:ans (meta #'funcb))
43
However, when I try to do a map to get the :ans, :tau-or-pi, or basic :name from the metadata, I get the exception:
user=>(map #(meta #'%) funcs)
CompilerException java.lang.RuntimeException: Unable to resolve var: p1__1637# in this context, compiling:(NO_SOURCE_PATH:1)
After doing some more searching, I got the following idea from a posting in 2009 (https://groups.google.com/forum/?fromgroups=#!topic/clojure/VyDM0YAzF4o):
user=>(map #(meta (resolve %)) funcs)
ClassCastException user$funca cannot be cast to clojure.lang.Symbol clojure.core/ns-resolve (core.clj:3883)
I know that the defn macro (in Clojure 1.4) is putting the metadata on the Var in the def portion of the defn macro so that's why the simple (meta #'funca) is working, but is there a way to get the function metadata dynamically (like in the map example above)?
Maybe I'm missing something syntactically but if anyone could point me in the right direction or the right approach, that'd would be great.
Thanks.
the expression #(meta #'%) is a macro that expands to a call to defn (actually def) which has a parameter named p1__1637# which was produced with gensym and the call to meta on that is attempting to use this local parameter as a var, since no var exists with that name you get this error.
If you start with a vector of vars instead of a vector of functions then you can just map meta onto them. You can use a var (very nearly) anywhere you would use a function with a very very minor runtime cost of looking up the contents of the var each time it is called.
user> (def vector-of-functions [+ - *])
#'user/vector-of-functions
user> (def vector-of-symbols [#'+ #'- #'*])
#'user/vector-of-symbols
user> (map #(% 1 2) vector-of-functions)
(3 -1 2)
user> (map #(% 1 2) vector-of-symbols)
(3 -1 2)
user> (map #(:name (meta %)) vector-of-symbols)
(+ - *)
user>
so adding a couple #'s to your original code and removing an extra trailing : should do the trick:
user> (defn #^{:tau-or-pi :pi} funca "doc for func a" {:ans 42} [x] (* x x))
#'user/funca
user> (defn #^{:tau-or-pi :tau} funcb "doc for func b" {:ans 43} [x] (* x x x))
#'user/funcb
user> (def funcs [#'funca #'funcb])
#'user/funcs
user> (map #(meta %) funcs)
({:arglists ([x]), :ns #<Namespace user>, :name funca, :ans 42, :tau-or-pi :pi, :doc "doc for func a", :line 1, :file "NO_SOURCE_PATH"} {:arglists ([x]), :ns #<Namespace user>, :name funcb, :ans 43, :tau-or-pi :tau, :doc "doc for func b", :line 1, :file "NO_SOURCE_PATH"})
user> (map #(:tau-or-pi (meta %)) funcs)
(:pi :tau)
user>
Recently, I found it useful to attach metadata to the functions themselves rather than the vars as defn does.
You can do this with good ol' def:
(def funca ^{:tau-or-pi :pi} (fn [x] (* x x)))
(def funcb ^{:tau-or-pi :tau} (fn [x] (* x x x)))
Here, the metadata has been attached to the functions and then those metadata-laden functions are bound to the vars.
The nice thing about this is that you no longer need to worry about vars when considering the metadata. Since the functions contain metadata instead, you can pull it from them directly.
(def funcs [funca funcb])
(map (comp :tau-or-pi meta) funcs) ; [:pi :tau]
Obviously the syntax of def isn't quite as refined as defn for functions, so depending on your usage, you might be interested in re-implementing defn to attach metadata to the functions.
I'd like to elaborate on Beyamor's answer. For some code I'm writing, I am using this:
(def ^{:doc "put the-func docstring here" :arglists '([x])}
the-func
^{:some-key :some-value}
(fn [x] (* x x)))
Yes, it is a bit unwieldy to have two metadata maps. Here is why I do it:
The first metadata attaches to the the-func var. So you can use (doc the-func) which returns:
my-ns.core/the-func
([x])
put the-func docstring here
The second metadata attaches to the function itself. This lets you use (meta the-func) to return:
{:some-key :some-value}
In summary, this approach comes in handy when you want both docstrings in the REPL as well as dynamic access to the function's metadata.

Generate javascript method call code with ClojureScript macro?

I am using ClojureScript to detect which browser-specific version of 'requestAnimationFrame' method is defined. I use the following code:
(defn animationFrameMethod []
(let [window (dom/getWindow)
options (list #(.-requestAnimationFrame window)
#(.-webkitRequestAnimationFrame window)
#(.-mozRequestAnimationFrame window)
#(.-oRequestAnimationFrame window)
#(.-msRequestAnimationFrame window))]
((fn [[current & remaining]]
(cond
(nil? current) #((.-setTimeout window) % (/ 1000 30))
(fn? (current)) (current)
:else (recur remaining)))
options)))
This works fine, and it's not terrible, but I would really like to be able to put the method names in a list, i.e.
'(requestAnimationFrame webkitRequestAnimationFrame ...)
And then call a macro for each symbol in the list to generate the anonymous function code.
I would like something to work like so:
user> (def name webkitRequestAnimationFrame)
user> (macroexpand '(macros/anim-method name window))
#(.-webkitRequestAnimationFrame window)
But I played around with macros for a while, and was unable to achieve this effect. Part of the problem is that method names and the dot notation work strangely, and I'm not even sure if this is possible.
Any tips to get this working? Thanks!
Remember that javascript objects are also associative hashes, so something like this should work without resorting to macros (untested)....
(def method-names ["requestAnimationFrame"
"webkitRequestAnimationFrame"
"mozRequestAnimationFrame"
"oRequestAnimationFrame"
"msRequestAnimationFrame"])
(defn animationFrameMethod []
(let [window (dom/getWindow)
options (map (fn [n] #(aget window n)) method-names)]
((fn [[current & remaining]]
(cond
(nil? current) #((.-setTimeout window) % (/ 1000 30))
(fn? (current)) (current)
:else (recur remaining)))
options)))