iMacros - Extracting element with multiple classes

I am using iMacros to extract text from a span. It's something like this:
<span class="class1 class2 class3" >Data to Extract</span>
Now I am confused about how to select the element, since it has multiple classes.
Any help will be appreciated.

TAG POS=1 TYPE=SPAN ATTR=CLASS:"class1 class2 class3" EXTRACT=TXT

Related

Is there a way, using iMacros for Firefox, to extract an image from within a DIV that has a certain class?

I'm not sure of the syntax within iMacros (10.0.2.1450, which has been paid for, and is being run on Firefox Quantum 66.0.3) to extract an image from within a DIV with a particular class.
I have tried to look up info on the iMacros wiki and have Googled extensively, but am not sure how to extract the image, when I think the type needs to be DIV with a particular class as the Attribute.
TAG POS=1 TYPE=IMG ATTR=ALT:*&&SRC:*&&CLASS:particularclass EXTRACT=HREF
This is the code on the site I'm trying to extract the image from
<div class="particularclass">
<img alt="sample" src="https://sample.com/sample.jpg">
</div>
I'm trying to save this image. I don't want to find it on the page based on any details of the image; I want to find the div with a particular class and save the image within that div.
When I tried the following, it's not finding an image with this class:
TAG POS=1 TYPE=IMG ATTR=ALT:*&&SRC:*&&CLASS:particularclass EXTRACT=HREF
If anyone can help with the syntax, I'll be eternally grateful!
Thanks,
Paul
TAG POS=1 TYPE=DIV ATTR=CLASS:replaceWithATTRclass EXTRACT=replaceWithATTRname
Try this code: replace replaceWithATTRname with the name of the attribute you want to extract, and replaceWithATTRclass with your class. You can also change the TYPE and the position accordingly.
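To make the "find the div by class, then take the image inside it" idea concrete outside of iMacros, here is a minimal Python sketch using the standard library's ElementTree and its limited XPath support. It is only an illustration of the lookup logic, not iMacros syntax. The wrapping <body>, the extra "other" div and its URL are assumptions added for the demo, and the img tags are self-closed so the snippet parses as XML.

```python
import xml.etree.ElementTree as ET

# Markup modelled on the question; the <body> wrapper and the "other" div
# are hypothetical additions so the lookup has something to skip.
page = ET.fromstring("""
<body>
  <div class="other"><img alt="decoy" src="https://sample.com/other.jpg"/></div>
  <div class="particularclass">
    <img alt="sample" src="https://sample.com/sample.jpg"/>
  </div>
</body>
""")

# Locate the div by its class first, then step down to the img child.
img = page.find(".//div[@class='particularclass']/img")
src = img.get("src")
print(src)
```

The point is that the class belongs to the div, so it goes in the div step of the path and not on the img itself.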

How to click a link with changing attributes using iMacros?

I am creating a macro in which I want to click the anchor element.
The problem with this anchor element is that all the attributes change randomly after every click.
Below is the anchor link
<a id="bLMa" class="**btn valign-wrapper** pulse **animated** lime accent-2 black-text">**StŠ°rt**</a>
The items marked between stars always stay the same. All other items change continuously.
The location of the anchor changes randomly after every click.
I tried below steps
TAG XPATH="//*[@class="animated"]"
TAG POS=1 TYPE=A ATTR=Class:animated
TAG POS=1 TYPE=A ATTR=TXT:Start
TAG POS=1 TYPE=A ATTR=TXT:*Start*
I always get the same error: Element Not Found.
Please suggest.
You can try combining attributes to identify the anchor you want to find, as described at http://wiki.imacros.net/TAG_parameters_explained#Multiple_ATTR_parameters
For your specific example, something like
TAG POS=1 TYPE=A ATTR=class:*animated*&&TXT:*Start*
could work: it looks for the class animated AND the text Start together in an anchor.
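The AND behaviour of combined patterns can be sketched in Python with shell-style wildcards. This only approximates how the * wildcard and && are described above and is not how iMacros is implemented. The helper name matches_all, the "txt" key for the visible text, the text "Start" as written in the answer, and the "*Stop*" pattern are all assumptions for illustration.

```python
from fnmatch import fnmatchcase

def matches_all(element, patterns):
    """Return True only if every pattern matches its attribute (logical AND)."""
    return all(
        fnmatchcase(element.get(name, ""), pattern)
        for name, pattern in patterns.items()
    )

# Class value copied from the anchor in the question; "txt" stands for its visible text.
anchor = {
    "class": "btn valign-wrapper pulse animated lime accent-2 black-text",
    "txt": "Start",
}

# Both conditions hold, so the anchor is selected.
both = matches_all(anchor, {"class": "*animated*", "txt": "*Start*"})
# One condition fails (hypothetical text pattern), so the anchor is rejected.
wrong_text = matches_all(anchor, {"class": "*animated*", "txt": "*Stop*"})
print(both, wrong_text)
```

Because the wildcards sit on both sides of each value, the randomly changing class names around "animated" do not affect the match.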

How to extract specific text with iMacros XPath

I have this code in a website:
<div id="1234">
<li>
text I want to extract
<span>
text I don't want to extract
</span>
</li>
</div>
I'm using this iMacros code, but it extracts both texts:
TAG XPATH="id('1234')/li[1]" EXTRACT=TXT
I tried using text() at the end, but I get an error.
For your specific case, Shugar's code will work with some tweaking. Split on \n and take element [1]:
TAG XPATH="id('1234')/li[1]" EXTRACT=TXT
SET !EXTRACT EVAL("'{{!EXTRACT}}'.split('\\n')[1];")
PROMPT {{!EXTRACT}}
If you want a more general approach, you can extract li[1], split it by the content of the span, and take element [0]:
TAG XPATH="id('1234')/li[1]" EXTRACT=TXT
SET !VAR1 {{!EXTRACT}}
SET !EXTRACT NULL
TAG XPATH="id('1234')/li[1]/span" EXTRACT=TXT
SET !EXTRACT EVAL("'{{!VAR1}}'.split('{{!EXTRACT}}')[0];")
PROMPT {{!EXTRACT}}
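The two string operations above can be checked in isolation. Here is a small Python sketch of the same split logic; the value of li_text is an assumption about what the extracted text might look like (the real whitespace can differ), and the final strip() is an addition that the macros above do not perform.

```python
# Hypothetical text, roughly what extracting the whole <li> might return.
li_text = "\ntext I want to extract\n\ntext I don't want to extract\n\n"
span_text = "text I don't want to extract"

# Specific approach: the wanted text is the second piece when splitting on newlines.
by_newline = li_text.split("\n")[1]

# General approach: keep everything before the span's text, then trim whitespace.
before_span = li_text.split(span_text)[0].strip()

print(repr(by_newline))
print(repr(before_span))
```

The first approach depends on where the line breaks fall, while the second only depends on the span's text appearing after the wanted text.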
I suggest just adding one more line in your code:
TAG XPATH="id('1234')/li[1]" EXTRACT=TXT
SET !EXTRACT EVAL("'{{!EXTRACT}}'.split('<span>')[0];")

Replace html tags having specific attributes with desired tags

I have around 100 HTML files containing markup written mostly with div and span tags.
Example:
<span class="Bold">Sometext <span class="Italic">Some more text</span>Even more text</span>
I want to replace these span elements with proper <b> and <i> tags in all 100 HTML files. I could have used a regex in Notepad++, but the nesting of tags makes it difficult for me to handle the closing tags.
Kindly suggest how to go about it.
How about using a regex? This would also cover the closing tags.
Replace <span class="Bold">...</span> with <b>...</b>:
Find: <span class="Bold">(.*)<\/span>
Replace: <b>\1<\/b>
Replace <span class="Italic">...</span> with <i>...</i>:
Find: <span class="Italic">(.*)<\/span>
Replace: <i>\1<\/i>
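One caveat with these patterns: a greedy (.*) captures up to the last </span> on the line, so with nested or sibling spans it can pair an opening tag with the wrong closing tag (for the example in the question, running the Italic pattern first does exactly that). A possible alternative is to track open spans on a stack. The following is a minimal Python sketch using the standard library's html.parser; the class SpanRewriter, the CLASS_TO_TAG mapping and the rewrite helper are hypothetical names, and it assumes reasonably well-formed markup.

```python
from html.parser import HTMLParser

# Hypothetical mapping from span class to replacement tag.
CLASS_TO_TAG = {"Bold": "b", "Italic": "i"}

class SpanRewriter(HTMLParser):
    def __init__(self):
        super().__init__(convert_charrefs=False)
        self.out = []
        self.stack = []  # replacement tag (or None) for each open <span>

    def handle_starttag(self, tag, attrs):
        if tag == "span":
            new_tag = CLASS_TO_TAG.get(dict(attrs).get("class"))
            self.stack.append(new_tag)
            if new_tag:
                self.out.append(f"<{new_tag}>")
                return
        self.out.append(self.get_starttag_text())  # keep other tags verbatim

    def handle_endtag(self, tag):
        if tag == "span" and self.stack:
            new_tag = self.stack.pop()  # closes the most recently opened span
            if new_tag:
                self.out.append(f"</{new_tag}>")
                return
        self.out.append(f"</{tag}>")

    def handle_startendtag(self, tag, attrs):
        self.out.append(self.get_starttag_text())

    def handle_data(self, data):
        self.out.append(data)

    def handle_entityref(self, name):
        self.out.append(f"&{name};")

    def handle_charref(self, name):
        self.out.append(f"&#{name};")

    def handle_comment(self, data):
        self.out.append(f"<!--{data}-->")

    def handle_decl(self, decl):
        self.out.append(f"<!{decl}>")

def rewrite(html):
    parser = SpanRewriter()
    parser.feed(html)
    parser.close()
    return "".join(parser.out)

sample = '<span class="Bold">Sometext <span class="Italic">Some more text</span>Even more text</span>'
result = rewrite(sample)
print(result)
```

Spans with any other class are passed through unchanged, so the script could be run over all the files in a loop without touching unrelated markup.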
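You could write a script in JavaScript and run it once for all your pages. The only thing required would be something like:
$('span.Bold').replaceWith(function () { return $('<b>').append($(this).contents()); });
$('span.Italic').replaceWith(function () { return $('<i>').append($(this).contents()); });
For any further clarification, this is the jQuery reference: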
http://api.jquery.com/replacewith/
Or you could create a separate js file with this code and attach it to all your files.

Scrapy writing XPath expression for unknown depth

I have an html file which is like:
<div id='author'>
<div>
<div>
...
<a> John Doe </a>
I do not know how many divs there will be under the author div. The depth may differ from page to page.
So what would be the XPath expression for this kind of xml?
By the way, I tried:
//div[@id = "author"]/*/a/text()
but this only seems to work for grandchildren of the author div.
Use double slash to find an a element anywhere inside the div element with id="author":
//div[@id = "author"]//a/text()
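The difference between the fixed-depth step and the double slash can be seen in a small standalone Python sketch. It uses the standard library's ElementTree, which supports only a subset of XPath (no text() step, so the text is read from the element instead), and it is not Scrapy's selector API. The sample document, including its nesting depth and the <html> wrapper, is made up for the demo.

```python
import xml.etree.ElementTree as ET

# Hypothetical page: the <a> sits three divs below the author div.
doc = ET.fromstring("""
<html>
  <div id="author">
    <div>
      <div>
        <div>
          <a> John Doe </a>
        </div>
      </div>
    </div>
  </div>
</html>
""")

# Fixed depth: only matches an <a> that is a grandchild of the author div.
fixed_depth = doc.findall(".//div[@id='author']/*/a")

# Any depth: '//' selects matching descendants at every level.
any_depth = doc.findall(".//div[@id='author']//a")
names = [a.text.strip() for a in any_depth]

print(len(fixed_depth), names)
```

Here the fixed-depth path finds nothing because the link is nested deeper than a grandchild, while the descendant path finds it regardless of depth.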