Nokogiri returns "no method error" - html

I keep getting the same error in my program. I've written a method that takes some messy HTML and turns it into neater strings. This works fine on its own; however, when I run the whole program I get the following error:
kamer.rb:9:in `normalise_instrumentation': undefined method `split' for #<Nokogiri::XML::NodeSet:0x007f92cb93bfb0> (NoMethodError)
I'd be really grateful for any info or advice on why this happens and how to stop it.
The code is here:
require 'nokogiri'
require 'open-uri'
def normalise_instrumentation(instrumentation)
  messy_array = instrumentation.split('.')
  normal_array = []
  messy_array.each do |section|
    if section =~ /\A\d+\z/
      normal_array << section
    end
  end
  return normal_array
end
doc = Nokogiri::HTML(open('http://www.cs.vu.nl/~rutger/vuko/nl/lijst_van_ooit/complete-solo.html'))
table = doc.css('table[summary=works] tr')
work_value = []
work_hash = {}
table.each do |row|
  piece = [row.css('td[1]'), row.css('td[2]'), row.css('td[3]')].map { |r|
    r.text.strip!
  }
  work_value = work_value.push(piece)
  work_key = normalise_instrumentation(row.css('td[3]'))
  work_hash[work_key] = work_value
end
puts work_hash

The problem is here:
row.css('td[3]')
Here's why:
row.css('td[3]').class
# => Nokogiri::XML::NodeSet < Object
You're creating your piece array, which then becomes an array of NodeSets. That is probably not what you want, because calling text on a NodeSet often returns a weird String of concatenated text from multiple nodes. You're not seeing that happen here because you're searching inside a row (<tr>), but if you were to look one level up, in the <table>, you'd have a cocked gun pointed at your foot.
Passing a NodeSet to your normalise_instrumentation method is a problem because NodeSet doesn't have a split method, which is the error you're seeing.
But it gets worse before it gets better. css, like search and xpath, returns a NodeSet, which is akin to an Array. Passing an array-like critter to the method will still result in confusion, because you really want just the Node found, not a set of Nodes. So I'd probably use:
row.at('td[3]')
which will return only the node.
At this point you probably want the text of that node, something like
row.at('td[3]').text
would make more sense because then the method would receive a String, which does have a split method.
However, it appears there are additional problems, because some of the cells you want don't exist, so you'll get nil values also.
This isn't one of my better answers, because I'm still trying to grok what you're doing. Providing us with a minimal example of the HTML you need to parse, and the output you want to capture, will help us fine-tune your code to get what you want.

I had a similar error (undefined method) for a different reason. In my case it was due to an extra dot, added by mistake, like this:
status = data.css.("status font-large").text
where it was fixed by removing the extra dot after the css as shown below
status = data.css("status font-large").text
I hope this helps someone else


How to add to / amend / consolidate JRuby Profiler data?

Say I have inside my JRuby program the following loop:
loop do
x=foo()
break if x
bar()
end
and I want to collect profiling information just for the invocations of bar. How can I do this? This is what I have so far:
pd = []
loop do
x=foo()
break if x
pd << JRuby::Profiler.profile { bar() }
end
This leaves me with an array pd of profile data objects, one for each invocation of bar. Is there a way to create a "summary" data object, by combining all the pd elements? Or even better, have a single object, where profile would just add to the existing profiling information?
I searched for documentation of the JRuby::Profiler API, but couldn't find anything except a few simple examples, none of them covering my case.
UPDATE : Here is another attempt I tried, which does not work either.
Since the profile method initially clears the profile data inside the Profiler, I tried to separate the profiling steps from the data initializing steps, like this:
JRuby::Profiler.clear
loop do
x=foo()
break if x
JRuby::Profiler.send(:current_thread_context).start_profiling
bar()
JRuby::Profiler.send(:current_thread_context).stop_profiling
end
profile_data = JRuby::Profiler.send(:profile_data)
This seems to work at first, but after investigation, I found that profile_data then contains the profiling information from the last (most recent) execution of bar, not of all executions collected together.
I figured out a solution, though I have the feeling that I'm using a ton of undocumented features to get it working. I also must add that I am using JRuby 1.7.27, so later JRuby versions might or might not need a different approach.
The problem with profiling is that start_profiling (corresponding to the Java method startProfiling in the class Java::OrgJrubyRuntime::ThreadContext) not only turns on the profiling flag, but also allocates a fresh ProfileData object. What we want to do is reuse the old object. stop_profiling, on the other hand, only toggles the profiling switch and is harmless.
Unfortunately, ThreadContext does not provide a method to manipulate the isProfiling toggle, so as a first step, we have to add one:
class Java::OrgJrubyRuntime::ThreadContext
field_writer :isProfiling
end
With this, we can set/reset the internal isProfiling switch. Now my loop becomes:
context = JRuby::Profiler.send(:current_thread_context)
JRuby::Profiler.clear
profile_data_is_allocated = nil
loop do
x=foo()
break if x
# The first time, we allocate the profile data
profile_data_is_allocated ||= context.start_profiling
context.isProfiling = true
bar()
context.isProfiling = false
end
profile_data = JRuby::Profiler.send(:profile_data)
In this solution, I tried to stay as close as possible to the capabilities of the JRuby::Profiler class, but the only public method still used is clear. Basically, I have reimplemented profiling in terms of the ThreadContext class, so if someone comes up with a better way to solve it, I would highly appreciate it.

How do I debug lua functions called from conky?

I'm trying to add some lua functionality to my existing conky setup so that repetitive "code" in my conky text can be cleaned up. For example, I have information for each mounted FS, each core, etc. where each row displayed in my panel differs ONLY by one parameter.
My first, skeletal attempt at using lua functions for this seems to run but displays nothing in my panel. I've only found very simple examples to base this on, so I may have made a simple error, but I don't even know how to diagnose it. My code here is modeled after what I HAVE been able to find regarding writing functions, such as the question "How to implement a basic Lua function in Conky?", but that's about all the depth I've found on the topic except for drawing and cairo examples.
Here's the code added to my conky config, as well as the contents of my functions.lua file
conky.config = {
...
lua_load = '/home/conky-manager/MyConky/functions.lua',
};
conky.text = [[
...
${voffset 5}${lua conky_test 'test'}
...
]]
file - functions.lua
function conky_test(parm1)
return 'result text'
end
What I would expect to see is "result text" displayed in my panel at the location where that function call appears, but nothing shows.
Is there a log created by conky as it runs, or a way to provide some debug output? Even if I'd made a simple error here, I'd still like to have the ability to diagnose things as my code gets more complex.
Success!
After cobbling info from several articles together, I figured out my basic flaws -
1. Missing a 'conky_main' function,
2. Missing a 'lua_draw_hook_post' to invoke it, and
3. Realizing that if I invoke conky from a terminal, print statements in lua would appear there.
So, for anyone who sees this question and has the same issues, here's the corrected code.
conky.config = {
...
lua_load = '/home/conky-manager/MyConky/functions.lua',
lua_draw_hook_post = "main",
};
conky.text = [[
...
${lua conky_test 'test'}
...
]]
and the proper basics in my functions.lua file
function conky_test(parm1)
return 'result text'
end
function conky_main()
if conky_window == nil then
return
end
end
A few notes:
I still haven't determined if using 'lua_draw_hook_pre' instead of 'lua_draw_hook_post' makes any difference, but it doesn't seem to in this example.
Also, some examples showed actually calling this 'test' function instead of writing a 'main', but the 'main' seemed to have value in checking to see if conky_window existed.
Some examples seemed to state that naming functions with the prefix 'conky_' was required, but then showed examples of calling those functions without the prefix, so I assume the prefix is inferred during the call.
A major note: you should run conky from the directory containing the lua scripts.

Maquette cannot read property "class" of undefined

Chrome debug console snapshot
I am unsure what is causing the error shown in the snapshot above.
I've done a little digging, and it seems that previousProperties is passed in as previous.properties by updateDom(). previous, in turn, is passed in by update, where it is labeled as just vnode. This VNode is a valid VNode, but just lacks the properties.
I'm pretty sure I've made everything distinguishable (by setting unique key properties) that would need to be distinguishable, so I don't think that's the problem, although I could be mistaken.
So I had this question, wrote it, did more looking and found my answer before even posting it. I'm still posting this question, and answering it myself in hopes that it might help save someone else some heartache in the future.
In this case, the error is caused by a projector rendering and receiving an invalid value in return from the renderMaquette function. In my component-based framework, I've been using ternary operators to work like if-else statements inside the return blocks of renderMaquette functions, i.e.:
function renderMaquette(){
return h('div',
showTitle ?
h('h1', 'My Title')
: []
)
}
Leaving an empty array is a perfectly acceptable parameter inside a hyperscript function, as it will return nothing. However, returning an empty array is not, i.e.:
function renderMaquette(){
return showTitle ?
h('h1', 'My Title')
: []
}
This generates an error.

Returning values in Elixir?

I've recently decided to learn Elixir. Coming from a C++/Java/JavaScript background I've been having a lot of trouble grasping the basics. This might sound stupid but how would return statements work in Elixir? I've looked around and it seems as though it's just the last line of a function i.e.
def hello do
  "Hello World!"
end
Would this function return "Hello World!"? Is there another way to do a return? Also, how would you return early? In JavaScript we could write something like this to find if an array has a certain value in it:
function foo(a){
for(var i = 0;i<a.length;i++){
if(a[i] == "22"){
return true;
}
}
return false;
}
How would that work in Elixir?
Elixir has no 'break out' keyword that would be equivalent to the 'return' keyword in other languages.
Typically what you would do is re-structure your code so the last statement executed is the return value.
Below is an idiomatic way in Elixir to perform the equivalent of your code test, assuming you meant 'a' as something that behaves like an array in your initial example:
def is_there_a_22(a) do
  Enum.any?(a, fn(item) -> item == "22" end)
end
What's actually going on here is we're restructuring our thinking a little bit. Instead of the procedural approach, where we'd search through the array and return early if we find what we were looking for, we're going to ask what you are really after in the code snippet:
"Does this array have a 22 anywhere?"
We are then going to use the Elixir Enum module to run through the array for us, and provide the any? function with a test which will tell us if anything matches the criteria we care about.
While this is a case that is easily solved with enumeration, I think it's possible the heart of your question applies more to things such as the 'bouncer pattern' used in procedural methods. For example, if I meet certain criteria in a method, return right away. A good example of this would be a method that returns false if the thing you're going to do work on is null:
function is_my_property_true(obj) {
if (obj == null) {
return false;
}
// .....lots of code....
return some_result_from_obj;
}
The only way to really accomplish this in Elixir is to put the guard up front and factor out a method:
def test_property_value_from_obj(obj) do
  # ...a bunch of code to get the property we
  # want and see if it is true
end
def is_my_property_true(obj) do
  case obj do
    nil -> false
    _ -> test_property_value_from_obj(obj)
  end
end
tl;dr - No there isn't an equivalent - you need to structure your code accordingly. On the flip side, this tends to keep your methods small - and your intent clear.
You are correct that the last expression is always returned. Even more: there are no statements in the language, just expressions. Every construct has a value - if, case, etc. There is also no early return. This may seem limiting, but you quickly get used to it.
A normal way to solve your example with Elixir would be to use a higher order function:
def foo(list) do
Enum.any?(list, fn x -> x == "22" end)
end

Python equivalent of PHP's @

Is there a Python equivalent of PHP's @?
@function_which_is_doomed_to_fail();
I've always used this block:
try:
    foo()
except:
    pass
But I know there has to be a better way.
Does anyone know how I can Pythonicify that code?
I think adding some context to that code would be appropriate:
for line in blkid:
    line = line.strip()
    partition = Partition()
    try:
        partition.identifier = re.search(r'^(/dev/[a-zA-Z0-9]+)', line).group(0)
    except:
        pass
    try:
        partition.label = re.search(r'LABEL="((?:[^"\\]|\\.)*)"', line).group(1)
    except:
        pass
    try:
        partition.uuid = re.search(r'UUID="((?:[^"\\]|\\.)*)"', line).group(1)
    except:
        pass
    try:
        partition.type = re.search(r'TYPE="((?:[^"\\]|\\.)*)"', line).group(1)
    except:
        pass
    partitions.add(partition)
What you are looking for is anti-pythonic, because:
The Zen of Python, by Tim Peters
Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than right now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!
In your case, I would use something like this:
match = re.search(r'^(/dev/[a-zA-Z0-9]+)', line)
if match:
    partition.identifier = match.group(0)
And you have 3 lines instead of 4.
There is no better way. Silently ignoring errors is bad practice in any language, so it's naturally not Pythonic.
Building upon Gabi Purcanu's answer and your desire to condense to one-liners, you could encapsulate his solution into a function and reduce your example:
def cond_match(regexp, line, grp):
    match = re.search(regexp, line)
    if match:
        return match.group(grp)
    else:
        return None
for line in blkid:
    line = line.strip()
    partition = Partition()
    partition.identifier = cond_match(r'^(/dev/[a-zA-Z0-9]+)', line, 0)
    partition.label = cond_match(r'LABEL="((?:[^"\\]|\\.)*)"', line, 1)
    partition.uuid = cond_match(r'UUID="((?:[^"\\]|\\.)*)"', line, 1)
    partition.type = cond_match(r'TYPE="((?:[^"\\]|\\.)*)"', line, 1)
    partitions.add(partition)
Please don't ask for Python to be like PHP. You should always explicitly trap the most specific error you can. Catching and ignoring all errors like that is not good practice, because it can hide other problems and make bugs harder to find. But in the case of REs, you should really check for the None value that re.search returns. For example, your code:
label = re.search(r'LABEL="((?:[^"\\]|\\.)*)"', line).group(1)
raises an AttributeError if there is no match, because re.search returns None if there is no match. But what if there was a match but you had a typo in your code:
label = re.search(r'LABEL="((?:[^"\\]|\\.)*)"', line).roup(1)
This also raises an AttributeError, even if there was a match. But using the catch-all exception and ignoring it would mask that error from you. You will never match a label in that case, and you would not know it until you found out some other way, such as by eventually noticing that your code never matches a label (but hopefully you have unit tests for that case...).
For REs, the usual pattern is this:
matchobj = re.search(r'LABEL="((?:[^"\\]|\\.)*)"', line)
if matchobj:
    label = matchobj.group(1)
No need to try and catch an exception here since there would not be one. Except... when there was an exception caused by a similar typo.
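The difference between the two approaches can be sketched as follows. This is only an illustration: the sample line and the helper names label_silenced and label_checked are made up for the example, and the .roup typo is deliberate.

```python
import re

LABEL_RE = re.compile(r'LABEL="((?:[^"\\]|\\.)*)"')

# Hypothetical sample line, loosely shaped like blkid output.
SAMPLE = '/dev/sda1: LABEL="boot" UUID="1234-ABCD" TYPE="vfat"'


def label_silenced(line):
    """Bare except: the deliberate typo (.roup) is swallowed silently."""
    try:
        return LABEL_RE.search(line).roup(1)  # typo on purpose
    except:  # shown only to illustrate the problem
        return None


def label_checked(line):
    """Explicit None check: a missing label is handled, a typo would still raise."""
    match = LABEL_RE.search(line)
    return match.group(1) if match else None


print(label_silenced(SAMPLE))                    # None, although a label is present
print(label_checked(SAMPLE))                     # boot
print(label_checked('/dev/sdb1: TYPE="ext4"'))   # None, no label on this line
```

With the explicit check, the same typo would surface immediately as an AttributeError instead of looking like "no label found".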
Use data-driven design instead of repeating yourself. Naming the relevant group also makes it easier to avoid group indexing bugs:
import re

_components = dict(
    identifier = re.compile(r'^(?P<value>/dev/[a-zA-Z0-9]+)'),
    label = re.compile(r'LABEL="(?P<value>(?:[^"\\]|\\.)*)"'),
    uuid = re.compile(r'UUID="(?P<value>(?:[^"\\]|\\.)*)"'),
    type = re.compile(r'TYPE="(?P<value>(?:[^"\\]|\\.)*)"'),
)
for line in blkid:
    line = line.strip()
    partition = Partition()
    # .items() yields (name, pattern) pairs; iterating the dict alone yields only keys
    for name, pattern in _components.items():
        match = pattern.search(line)
        value = match.group('value') if match else None
        setattr(partition, name, value)
    partitions.add(partition)
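To try the data-driven idea in isolation, here is a self-contained sketch. It assumes a made-up sample line and uses types.SimpleNamespace as a stand-in for the asker's Partition class, which is not shown in the question; parse_line is a hypothetical helper name.

```python
import re
from types import SimpleNamespace

COMPONENTS = {
    "identifier": re.compile(r'^(?P<value>/dev/[a-zA-Z0-9]+)'),
    "label": re.compile(r'LABEL="(?P<value>(?:[^"\\]|\\.)*)"'),
    "uuid": re.compile(r'UUID="(?P<value>(?:[^"\\]|\\.)*)"'),
    "type": re.compile(r'TYPE="(?P<value>(?:[^"\\]|\\.)*)"'),
}


def parse_line(line):
    """Return an object with one attribute per component; None where nothing matched."""
    line = line.strip()
    partition = SimpleNamespace()  # stand-in for the real Partition class
    for name, pattern in COMPONENTS.items():
        match = pattern.search(line)
        setattr(partition, name, match.group("value") if match else None)
    return partition


# Hypothetical input line with no UUID field.
part = parse_line('/dev/sda1: LABEL="boot" TYPE="vfat"\n')
print(part.identifier, part.label, part.uuid, part.type)  # /dev/sda1 boot None vfat
```

Adding another field then only requires a new entry in the dictionary, not another block of matching code.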
There is warnings control in Python - http://docs.python.org/library/warnings.html
After edit:
You probably want to check if it is not None before trying to get the groups.
Also use len() on the groups to see how many groups you have got. "Pass"ing the error is definitely not the way to go.
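A small sketch of both checks, using a made-up pattern and input string. Note that groups() reports every capture group in the pattern, so its length reflects the pattern rather than how much of the input happened to match.

```python
import re

match = re.search(r'(\w+)=(\w+)', "TYPE=ext4")
if match is not None:
    # Number of capture groups defined in the pattern
    print(len(match.groups()))
    print(match.group(1), match.group(2))
else:
    print("no match")
```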