I know regular expressions are not the right tool for this parsing job, but that is the approach I need to use here.
Given the HTML below, I want to extract every select element from the HTML table. For this I have used
<table id='options_table'>\s*?(.+)?\s*?</table>
But this gives me a null result.
Then, to parse every select out of what the regex above returns, I will use
<SELECT.*?>(.*?)<\/SELECT>
But both of the above give a null result.
What should the regexes be for the table and for the selects (taken from the parsed table HTML)?
HTML Part
<table id='options_table'>
<tr><td colspan=3><font size="3" class="colors_productname">
<i><b>Color</b></i>
</font>
<br /><table cellpadding="0" cellspacing="0" border="0"><tr><td><img class="vCSS_img_line_group_features" src="/v/vspfiles/templates/192/images/Line_Group_Features.gif" /></td></tr></table>
</font></td></tr>
<tr>
<td align="right" vAlign="top">
<img src="/v/vspfiles/templates/192/images/clear1x1.gif" width="1" height="4" border="0"><br />
</td><td></td><td>
<SELECT name="SELECT___S15FTAN01___29" onChange="change_option('SELECT___S15FTAN01___29',this.options[this.selectedIndex].value)">
<OPTION value="176" >Ivory/Grey</OPTION>
</SELECT>
</td></tr>
<tr>
<td align="right" vAlign="top">
<img src="/v/vspfiles/templates/192/images/clear1x1.gif" width="1" height="4" border="0"><br />
</td><td></td><td>
<SELECT name="SELECT___S15FTAN01___31" onChange="change_option('SELECT___S15FTAN01___31',this.options[this.selectedIndex].value)">
<OPTION value="167" >0/3 months</OPTION>
<OPTION value="169" >3/6 months</OPTION>
<OPTION value="175" >6/9 months</OPTION>
</SELECT>
</td></tr>
</table>
I don't know GoLang, but I can show you in Perl, and I think you will be able to translate it to GoLang.
First, a regex that stores the content of the table tag (https://regex101.com/r/tL7dA0/1):
$table = $1 if ($html =~ m/<table.*?>(.*)<\/table>/igs);
A regex that prints everything between the select tags (https://regex101.com/r/xJ0xU1/1):
while ($table =~ m/<select.*?>(.*?)<\/select>/isg){
print $1."\n";
}
If, as in your case, the HTML table contains an inner table, the greedy (.*) captures the entire content of the outer table.
i modifier: insensitive. Case insensitive match (ignores case of [a-zA-Z])
s modifier: single line. Dot matches newline characters
g modifier: global. All matches (don't return on first match)
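For readers who are not using Perl, here is a minimal sketch of the same two-step idea in Java. It is only an illustration, not the GoLang version the question asks for: the class name SelectExtractor, the method selectBodies, and the shortened sample HTML are made up for this example. Pattern.CASE_INSENSITIVE and Pattern.DOTALL stand in for the i and s modifiers, and calling find() in a loop stands in for g.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, named for this example only.
class SelectExtractor {
    // Counterpart of Perl's /is: ignore case, and let '.' match newlines too.
    private static final int FLAGS = Pattern.CASE_INSENSITIVE | Pattern.DOTALL;

    // Greedy (.*) runs to the LAST </table>, so an inner table stays inside the capture.
    private static final Pattern TABLE = Pattern.compile("<table.*?>(.*)</table>", FLAGS);

    // Lazy (.*?) stops at the first </select>, giving one match per select element.
    private static final Pattern SELECT = Pattern.compile("<select.*?>(.*?)</select>", FLAGS);

    /** Returns the trimmed inner HTML of every select found inside the outer table. */
    static List<String> selectBodies(String html) {
        List<String> bodies = new ArrayList<>();
        Matcher table = TABLE.matcher(html);
        if (!table.find()) {
            return bodies; // no table in the input
        }
        Matcher select = SELECT.matcher(table.group(1));
        while (select.find()) { // looping over find() plays the role of Perl's /g
            bodies.add(select.group(1).trim());
        }
        return bodies;
    }

    public static void main(String[] args) {
        // Shortened stand-in for the HTML in the question.
        String html = "<table id='options_table'>\n"
                + "<tr><td><table><tr><td>inner</td></tr></table></td></tr>\n"
                + "<tr><td><SELECT name=\"a\">\n<OPTION value=\"176\" >Ivory/Grey</OPTION>\n</SELECT></td></tr>\n"
                + "<tr><td><SELECT name=\"b\">\n<OPTION value=\"167\" >0/3 months</OPTION>\n</SELECT></td></tr>\n"
                + "</table>";
        for (String body : selectBodies(html)) {
            System.out.println(body);
        }
    }
}
```

Without the DOTALL flag the dot does not match newlines, so a table that spans several lines never matches.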
I have this piece of code, but the option value is not the concatenation ${item.id}-${driver.id}; instead I get "-2".
<tr th:each="item: ${devices}" >
<td class="col_id" th:text="${item.id}" ></td><!-- ID -->
<td class="col_name" th:text="${item.description}"></td><!-- NAME -->
<td class="col_name" th:if="${#authorization.expression('hasRole(''ROLE_ADMIN'')')}" th:text="${item.application.name}"></td><!-- NAME -->
<td class="col_name" >
<select id="selectAuthorizedDriverId" >
<!-- option value="0">Please select the driver</option-->
<option th:each="driver : ${drivers}"
th:value="${item.id}-${driver.id}"
th:text="${driver.firstName}"
th:selected="${driver.id==item.driverDevices[0].driver.id}">
</option>
</select>
Thymeleaf can perform arithmetic, so in your case
th:value="${item.id}-${driver.id}"
is evaluated as a subtraction and yields a single number from the two integers.
Try
th:value="|${item.id}-${driver.id}|"
instead; the |...| literal substitution should make sure the given values are concatenated.
Here is the html code:
<table>
<tr class="WhiteRow">
<td align="center">
<input id="SelectedDelivery1" type="checkbox" onclick="HandleClick(this.name,this.checked,"")" value="Y" name="SelectedDelivery1">
</td>
<td valign="top">
<span></span>
<span class="bold">Instrument Search</span>
<br>
abc (TRANSFER)
</td>
<td align="center">5 minutes</td>
<td class="noborder" align="right">
<td class="noborder" align="right">
<td class="noborder" align="right">
<td class="noborder" align="right">
</tr>
<tr>
<td align="center">
<input id="SelectedDelivery2" type="checkbox" onclick="HandleClick(this.name,this.checked,"")" value="Y" name="SelectedDelivery1">
</td>
<td valign="top">
<span></span>
<span class="bold">Instrument Search</span>
<br>
abc (CAVEAT)
</td>
...
</tr>
</table>
I would like to target the <tr> containing <span class="bold">Instrument Search</span> and abc (TRANSFER). That tr may not be the first element in the table.
So far I tried
//td/span[text()="Instrument Search"]/ancestor::tr
which satisfies only one of the conditions, and several tr elements match that selector.
Could you please advise me how to target both conditions?
Use the following XPath expression:
//tr[contains(., 'abc (TRANSFER)') and contains(td/span[@class = 'bold'], 'Instrument Search')]
If possible, you should always use expressions that are unidirectional, because a "backwards" axis like ancestor:: could be a costly move. That's the advantage over the solution you have found already.
If the span[@class = 'bold'] cannot contain anything other than "Instrument Search", you should modify the expression above to:
//tr[contains(., 'abc (TRANSFER)') and td/span[@class = 'bold'] = 'Instrument Search']
The location of "abc (TRANSFER)" is still not very precise, if it is required in a certain place (e.g. always inside a td element) you'd have to further restrict the above.
EDIT: Responding to your comment:
abc (TRANSFER) is inside td tag, it's just a text field
Then use
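//tr[td[contains(., 'abc (TRANSFER)')] and td/span[@class = 'bold'] = 'Instrument Search']
The contains() test sits in a predicate on td because, in XPath 1.0, contains(td, ...) would only examine the first td of each row, and here the text is in the second one.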
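To try this kind of expression outside the browser, here is a small sketch using the XPath engine that ships with Java (javax.xml.xpath). The class name RowFinder and the sample markup are assumptions made for this example: the markup is a trimmed, well-formed stand-in for the table in the question, with the TRANSFER row deliberately placed second. The expression uses the td[contains(., ...)] form so that every td in a row is tested, not just the first one.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class, named for this example only.
class RowFinder {
    static final String ROW_XPATH =
            "//tr[td[contains(., 'abc (TRANSFER)')] and td/span[@class = 'bold'] = 'Instrument Search']";

    // Well-formed stand-in for the question's table (not the real page).
    static final String SAMPLE =
            "<table>"
            + "<tr class='OtherRow'><td><input id='SelectedDelivery2'/></td>"
            + "<td><span/><span class='bold'>Instrument Search</span><br/>abc (CAVEAT)</td></tr>"
            + "<tr class='WhiteRow'><td><input id='SelectedDelivery1'/></td>"
            + "<td><span/><span class='bold'>Instrument Search</span><br/>abc (TRANSFER)</td></tr>"
            + "</table>";

    /** Evaluates the expression against the XML text and returns the matching nodes. */
    static NodeList findRows(String xml, String expression) {
        XPath xpath = XPathFactory.newInstance().newXPath();
        try {
            return (NodeList) xpath.evaluate(
                    expression, new InputSource(new StringReader(xml)), XPathConstants.NODESET);
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Could not evaluate: " + expression, e);
        }
    }

    public static void main(String[] args) {
        NodeList rows = findRows(SAMPLE, ROW_XPATH);
        for (int i = 0; i < rows.getLength(); i++) {
            // Print the class attribute of each matching row; only WhiteRow matches here.
            System.out.println(rows.item(i).getAttributes().getNamedItem("class").getNodeValue());
        }
    }
}
```

Real-world HTML is often not well-formed XML, so a page would normally go through an HTML parser or the browser's own XPath support first.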
I found an answer myself after working through the syntax.
Please let me know if there is a better way.
//td/span[text()="Instrument Search"]/ancestor::td/text()[contains(., "TRANSFER")]/ancestor::tr
I'm fixing bugs in a legacy project that was developed using Struts2 and JSTL. I have an issue with the multiple select below:
<tr class="itemTr">
<td class="formLabel"><span class="spamFormLabel">Tags </span>
</td>
<td class="formField"><html:select property="tags"
styleId="tags"
styleClass="baseField" size="1" multiple="true"
style="height:170">
<html:options property="tagsList"
labelProperty="tagsLabelList" styleClass="baseOptions" />
</html:select>
</td>
</tr>
When I request the values in the action class,
request.getParameter("tags");
returns only the first value I selected. My goal is to get all of them, of course.
String parreco[] = request.getParameterValues("tags");
Using this method I can read all the selected values.
I am testing a page using Selenium WebDriver. I have rows of data that represent 'requests', and in the last column of each row there is a drop-down list that lets the user 'approve' or 'reject' the request.
I need to be able to select the approve option on the drop down list of a row whose 'Name' column is equal to a variable (in this instance say the variable is 'John').
In this test the user will be approving 'John's' request by selecting approve. How do I use xpath to ensure I am selecting the correct drop down element for the right person (right row)? Will I need to include a select element within an xpath somehow?
An example of the select element method to select a drop down element:
new SelectElement(this.Driver.FindElement(By.Name("orm")).FindElement(By.Name("Tutors"))).SelectByText(tutorName);
<form name="RequestsForm" action="SubmitRequest.aspx" method="POST">
<h2 class="blacktext" align="center">Course approvals</h2>
<table class="cooltable" width="90%" border="0" cellspacing="1" cellpadding="1">
<tbody>
<tr>
<td class="heading">
<b>Name</b>
</td>
<td class="heading">
<b>Request Date</b>
</td>
<td class="heading">
<b>Approved</b>
</td>
</tr>
<tr>
<td>
John
<input id="T1" type="text" value="888" name="T1">
</td>
<td>1/3/2015</td>
<td>
<select id="D1" class="selecttext" size="1" name="D1">
<option>?</option>
<option value="Approved">Approved</option>
<option>Rejected</option>
</select>
</td>
</tr>
</tbody>
</table>
Using XPath, this gets the position where the Name column is in your table:
count(//table[@class='cooltable']/tbody/tr[1]/td[b = 'Name']/preceding-sibling::td)+1
You can use that position to get the corresponding table cell in the other rows. This selects the corresponding td in the second row (where the ... stands for the path inside count() above):
//table[@class='cooltable']/tbody/tr[2]/td[count( ... )+1]
Appending /text() will extract the text (with spaces). Using normalize-space() will trim the text so you can compare it with John:
normalize-space(//table[@class='cooltable']/tbody/tr[2]/td[count( ... )+1]/text()) = 'John'
To select only the tr which contains John in the Name column, you leave only the td in the predicate. Now it returns a node-set of all tr which match the predicate text = John:
//table[@class='cooltable']/tbody/tr[normalize-space(td[count( ... )+1]/text()) = 'John']
Finally, if you append //select/option[@value='Approved'] to that expression, you will select the option whose value attribute is Approved in the context of that tr. Here is the full XPath expression:
//table[@class='cooltable']/tbody/tr[normalize-space(td[count(//table[@class='cooltable']/tbody/tr[1]/td[b = 'Name']/preceding-sibling::td)+1]/text()) = 'John']//select/option[@value='Approved']
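The full expression can be checked outside Selenium with Java's built-in XPath engine. The sketch below is only an illustration under stated assumptions: the class ApprovalLocator, its methods, the simplified well-formed markup, and the extra 'Mary' row are all invented for this example. The name is spliced straight into the expression, so it must not contain a single quote.

```java
import java.io.StringReader;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper class, named for this example only.
class ApprovalLocator {
    // Simplified stand-in for the question's page; the 'Mary' row is made up.
    static final String SAMPLE =
            "<form name='RequestsForm'><table class='cooltable'><tbody>"
            + "<tr><td class='heading'><b>Name</b></td><td class='heading'><b>Request Date</b></td>"
            + "<td class='heading'><b>Approved</b></td></tr>"
            + "<tr><td> John <input id='T1' name='T1'/></td><td>1/3/2015</td>"
            + "<td><select id='D1' name='D1'><option>?</option>"
            + "<option value='Approved'>Approved</option><option>Rejected</option></select></td></tr>"
            + "<tr><td> Mary <input id='T2' name='T2'/></td><td>2/3/2015</td>"
            + "<td><select id='D2' name='D2'><option>?</option>"
            + "<option value='Approved'>Approved</option><option>Rejected</option></select></td></tr>"
            + "</tbody></table></form>";

    /** Builds the full expression for one person; name must not contain a single quote. */
    static String approvedOptionXPath(String name) {
        // Position of the Name column: number of cells before it in the header row, plus one.
        String nameColumn =
                "count(//table[@class='cooltable']/tbody/tr[1]/td[b = 'Name']/preceding-sibling::td)+1";
        return "//table[@class='cooltable']/tbody/tr[normalize-space(td[" + nameColumn
                + "]/text()) = '" + name + "']//select/option[@value='Approved']";
    }

    /** Returns the id of the select that holds the matched option, or null if no row matches. */
    static String selectIdFor(String xml, String name) {
        try {
            Node option = (Node) XPathFactory.newInstance().newXPath().evaluate(
                    approvedOptionXPath(name), new InputSource(new StringReader(xml)), XPathConstants.NODE);
            return option == null
                    ? null
                    : option.getParentNode().getAttributes().getNamedItem("id").getNodeValue();
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Could not evaluate XPath for " + name, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(selectIdFor(SAMPLE, "John")); // prints D1
    }
}
```

In a Selenium test the same expression string could be handed to an XPath locator, with the name coming from the test variable.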
I have a table without any class or id (there are more tables on the page) with this structure:
<table cellpadding="2" cellspacing="2" width="100%">
...
<tr>
<td class="cell_c">...</td>
<td class="cell_c">...</td>
<td class="cell_c">...</td>
<td class="cell">SOME_ID</td>
<td class="cell_c">...</td>
</tr>
...
</table>
I want to get only one row, which contains <td class="cell">SOME_ID</td> and SOME_ID is an argument.
UPD.
Currently I am doing it this way:
doc = Jsoup.connect("http://www.bank.gov.ua/control/uk/curmetal/detail/currency?period=daily").get();
Elements rows = doc.select("table tr");
Pattern p = Pattern.compile("^.*(USD|EUR|RUB).*$");
for (Element trow : rows) {
Matcher m = p.matcher(trow.text());
if(m.find()){
System.out.println(m.group());
}
}
But why do I need Jsoup if most of the work is done by a regexp? Just to download the HTML?
If you have a generic HTML structure that always is the same, and you want a specific element which has no unique ID or identifier attribute that you can use, you can use the css selector syntax in Jsoup to specify where in the DOM-tree the element you are after is located.
Consider this HTML source:
<html>
<head></head>
<body>
<table cellpadding="2" cellspacing="2" width="100%">
<tbody>
<tr>
<td class="cell">I don't want this one...</td>
<td class="cell">Neither do I want this one...</td>
<td class="cell">Still not the right one..</td>
<td class="cell">BINGO!</td>
<td class="cell">Nothing further...</td>
</tr> ...
</tbody>
</table>
</body>
</html>
We want to select and parse the text from the fourth <td> element.
We specify that we want the <td> element whose index among its siblings is 3 (indexes start at 0), by using td:eq(3). In the same way, we can select all <td> elements before index 3 by using td:lt(3). As you've probably figured out, eq stands for "equal" and lt for "less than".
Without using first() you will get an Elements object, but we only want the first one so we specify that. We could use get(0) instead too.
So, the following code
Element e = doc.select("td:eq(3)").first();
System.out.println("Did I find it? " + e.text());
will output
Did I find it? BINGO!
Some good reading in the Jsoup cookbook!