Unexpected end tag (font) HTML - html

I am running R code that depends on several HTML files. Because R returns an error message, I opened the source file that triggers it.
In the HTML file, there is the following error message:
"End tag (font) violates step 1, paragraph 1 of the Adoption agency algorithm. Unexpected end tag (font). Ignored."
As I am completely new to HTML, I would appreciate it if someone could tell me what causes this message. Here is the line of code:
<TH ALIGN="left" COLSPAN="2">Methods in org.apache.poi.hssf.usermodel with parameters of type CellType</FONT></TH>
Thanks in advance.

You have a closing </FONT> tag, but no opening <FONT> tag:
<TH ALIGN="left" COLSPAN="2">
Methods in
org.apache.poi.hssf.usermodel
with parameters of type
CellType
</FONT>
</TH>
So either add the opening tag:
<TH ALIGN="left" COLSPAN="2">
<FONT>
Methods in
org.apache.poi.hssf.usermodel
with parameters of type
CellType
</FONT>
</TH>
or remove the closing tag:
<TH ALIGN="left" COLSPAN="2">
Methods in
org.apache.poi.hssf.usermodel
with parameters of type
CellType
</TH>
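To find this kind of problem across many files, a small script can help. The following is a minimal sketch in Python, using only the standard library's html.parser; the class name StrayEndTagFinder and the function find_stray_end_tags are made up for this illustration, and the check is much simpler than what a real HTML validator does. It reports every end tag that has no matching open element.

```python
from html.parser import HTMLParser

# Elements that never take an end tag, so they are not pushed on the stack.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class StrayEndTagFinder(HTMLParser):
    """Collect end tags that have no matching open element (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.open_tags = []  # stack of currently open tag names (lowercased)
        self.stray = []      # (tag, line) for each unmatched end tag

    def handle_starttag(self, tag, attrs):
        if tag not in VOID_TAGS:
            self.open_tags.append(tag)

    def handle_startendtag(self, tag, attrs):
        pass  # self-closing syntax such as <br/> opens nothing

    def handle_endtag(self, tag):
        if tag in self.open_tags:
            # Pop back to (and including) the most recent matching open tag.
            while self.open_tags.pop() != tag:
                pass
        else:
            line, _column = self.getpos()
            self.stray.append((tag, line))


def find_stray_end_tags(html_text):
    """Return a list of (tag, line) pairs for end tags without an opener."""
    finder = StrayEndTagFinder()
    finder.feed(html_text)
    finder.close()
    return finder.stray


broken = ('<TH ALIGN="left" COLSPAN="2">Methods in org.apache.poi.hssf.usermodel '
          'with parameters of type CellType</FONT></TH>')
fixed = ('<TH ALIGN="left" COLSPAN="2">Methods in org.apache.poi.hssf.usermodel '
         'with parameters of type CellType</TH>')

print(find_stray_end_tags(broken))
print(find_stray_end_tags(fixed))
```

html.parser lowercases tag names, so the stray </FONT> is reported as "font".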

Related

THEAD Not repeating after certain size

I am using this to create a quotation template for our online project management software (which prints it to a PDF):
<thead>
<tr>
<th colspan="2" align="right" style="width:146px" valign="top">
<img class="body_table img_logo" name="user:logo_url" src="/images/logo.jpg" />
</th>
</tr>
<tr class="address_details" style="font-weight:500;">
<th align="left"><span name="job:company">{{job.company}}</span></th>
<th align="right"><span name="user:company">{{user.company}}</span></th>
</tr>
<tr class="address_details" style="font-weight:400">
<th align="left"><span name="job:address">{{job.address}}</span></th>
<th align="right">
<span name="user:address">{{user.address}}</span>
<br/>
<span name="user:depot_email">{{user.depot_email}}</span>
<br/>
<span name="depot:telephone">{{depot.telephone}}</span>
</th>
</tr>
</thead>
This works as intended, giving repeated branding on each printed page, but as soon as the last <span name="depot:telephone">{{depot.telephone}}</span> is added, it stops repeating on subsequent pages. It doesn't seem to be that one line, though: if I comment out some other random bit, it starts working again. So I assume it's a length issue.
I am too new to HTML to know what I broke. Any ideas? Is there a maximum length that the THEAD can be?
The stuff in the "{{xxx}}" is what draws data from the online software.
The PDF generator options are "Webkit", which doesn't seem to do the repetitions right, or "Chromium", which does.

How to find XPath for nodes with text including line breaks or HTML formatters

I am trying to locate a specific node in an HTML response. I am finding it difficult because the node's content contains line breaks. I am testing on the xpathtester site, and my test XML is provided below.
<html>
<table >
<tr >
<th colspan="3">
<table >
<tr valign="bottom">
<th scope="col" align="left">Test
<br/> Item1</th>
<th scope="col">:</th>
<th scope="col" align="left">ABC123</th>
<th rowspan="7">
<img width="100" height="140" src="xyzcontenturl.jpg"/>
</th>
</tr>
<tr valign="bottom">
<th scope="col" align="left">Test
<br/> Item2</th>
<th scope="col" >:</th>
<th scope="col" align="left" colspan="2" >DEF789</th>
</tr>
</table>
</th>
</tr>
</table>
<p>
<strong/>
</p>
</html>
The idea is to pick up the third column header text. I can use the condition //th[contains(text(),"Test")]/following-sibling::th[2]/text() to locate it (the value returned is ABC123).
The challenge comes when I try to locate the value based on one specific node, i.e. "Test Item1".
Since the line break sits between the text "Test" and "Item1", I cannot use the functions contains or starts-with.
How do I write the XPath so that I can pick up the TH element with the value "Test <br/> Item1"?
Note: The XML provided is a sample illustrating the problem, so selecting the first table header (th element), the second table header, etc. by position won't help.
Compare against normalize-space(), which replaces newlines (not the HTML <br/>, to be clear) with a single space:
//th[normalize-space()='Test Item1']/following-sibling::th[2]/text()
The function receives the concatenation of all text nodes within th as its argument, normalizes the whitespace in it, and returns the result. Quoted from the linked specification:
The normalize-space function returns the argument string with whitespace normalized by stripping leading and trailing whitespace and replacing sequences of whitespace characters by a single space.
If you're using XPath in code, then get the element and use the "InnerText" property.
If from XSL use the text() function.
What are you calling your XPath from?
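If the XPath is called from code, the same idea can be sketched in Python. This is a minimal sketch under a few assumptions: it uses the standard library's xml.etree.ElementTree, whose limited XPath support does not include normalize-space(), so the whitespace normalization is emulated in plain Python; the helper names normalized_text and value_for_label are made up for this illustration, and SAMPLE is a trimmed copy of the inner table from the question.

```python
import xml.etree.ElementTree as ET

SAMPLE = """<table>
<tr valign="bottom">
<th scope="col" align="left">Test
<br/> Item1</th>
<th scope="col">:</th>
<th scope="col" align="left">ABC123</th>
</tr>
<tr valign="bottom">
<th scope="col" align="left">Test
<br/> Item2</th>
<th scope="col">:</th>
<th scope="col" align="left" colspan="2">DEF789</th>
</tr>
</table>"""


def normalized_text(element):
    """Join all descendant text and collapse whitespace runs to one space,
    roughly what normalize-space() does in XPath."""
    return " ".join("".join(element.itertext()).split())


def value_for_label(root, label, offset=2):
    """Return the text of the <th> that is `offset` <th> siblings after the
    <th> whose normalized text equals `label`, or None if there is none."""
    for row in root.iter("tr"):
        cells = row.findall("th")
        for index, cell in enumerate(cells):
            if normalized_text(cell) == label and index + offset < len(cells):
                return normalized_text(cells[index + offset])
    return None


root = ET.fromstring(SAMPLE)
print(value_for_label(root, "Test Item1"))
print(value_for_label(root, "Test Item2"))
```

With a library that supports full XPath 1.0, the expression with normalize-space() shown above could be used directly instead of the Python loop.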

R-Advanced Web Scraping-bypassing aspNetHidden using xmlTreeParse()

This question takes a bit of time to introduce, bear with me. It will be fun to solve if you can get there. This scrape would be replicated over thousands of pages on this website using a loop.
I'm trying to scrape the website http://www.digikey.com/product-detail/en/207314-1/A25077-ND/ to capture the data in the table with Digi-Key Part Number, Quantity Available, etc., including the right-hand side with Price Break, Unit Price, Extended Price.
Using the R function readHTMLTable() doesn't work and only returns NULL values. The reason for this (I believe) is that the website has hidden its content using the tag "aspNetHidden" in the HTML code.
For this reason I also found difficulty using htmlTreeParse() and xmlTreeParse() with the whole section parented by not appearing in the results.
Using the R function scrape() from the scrapeR package
require(scrapeR)
URL<-scrape("http://www.digikey.com/product-detail/en/207314-1/A25077-ND/")
does return the full html code including the lines of interest:
<th align="right">Digi-Key Part Number</th>
<td id="reportpartnumber">
<meta itemprop="productID" content="sku:A25077-ND">A25077-ND</td>
<th>Price Break</th>
<th>Unit Price</th>
<th>Extended Price
</th>
</tr>
<tr>
<td align="center">1</td>
<td align="right">2.75000</td>
<td align="right">2.75</td>
However, I haven't been able to select the nodes out of this block of code with the error being returned:
no applicable method for 'xpathApply' applied to an object of class "list"
I've received that error using different functions such as:
xpathSApply(URL,'//*[@id="pricing"]/tbody/tr[2]')
getNodeSet(URL,"//html[@class='rd-product-details-page']")
I'm not the most familiar with xpath but have been identifying the xpath using inspect element on the webpage and copy xpath.
Any help you can give on this would be much appreciated!
You've not read the help for scrape have you? It returns a list, you need to get parts of that list (if parse=TRUE) and so on.
Also I think that web page is doing some heavy heavy browser detection. If I try and wget the page from the command line I get an error page, the scrape function gets something usable (but seems different to you) and Chrome gets the full junk with all the encoded stuff. Yuck. Here's what works for me:
> URL<-scrape("http://www.digikey.com/product-detail/en/207314-1/A25077-ND/")
> tables = xpathSApply(URL[[1]],'//table')
> tables[[2]]
<table class="product-details" border="1" cellspacing="1" cellpadding="2">
<tr class="product-details-top"/>
<tr class="product-details-bottom">
<td class="pricing-description" colspan="3" align="right">All prices are in US dollars.</td>
</tr>
<tr>
<th align="right">Digi-Key Part Number</th>
<td id="reportpartnumber"><meta itemprop="productID" content="sku:A25077-ND"/>A25077-ND</td>
<td class="catalog-pricing" rowspan="6" align="center" valign="top">
<table id="pricing" frame="void" rules="all" border="1" cellspacing="0" cellpadding="1">
<tr>
<th>Price Break</th>
<th>Unit Price</th>
<th>Extended Price
</th>
</tr>
<tr>
<td align="center">1</td>
<td align="right">2.75000</td>
<td align="right">2.75</td>
Adjust to your use-case, here I'm getting all the tables and showing the second one, which has the info you want, some of it in the pricing table which you can get directly with:
pricing = xpathSApply(URL[[1]],'//table[@id="pricing"]')[[1]]
> pricing
<table id="pricing" frame="void" rules="all" border="1" cellspacing="0" cellpadding="1">
<tr>
<th>Price Break</th>
<th>Unit Price</th>
<th>Extended Price
</th>
</tr>
<tr>
<td align="center">1</td>
<td align="right">2.75000</td>
<td align="right">2.75</td>
</tr>
and so on.

pass sql result to mako template and show in table

I just decided to learn Pyramid and chose Mako as the template language. My view gets a list that is a raw SQL result, as follows:
[{'sample': 'R1_Y200.fq', 'study': 'GaoQiang1'},{'sample': 'R1_Y300.fq', 'study': 'GaoQiang2'},...]
view.py:
@view_config(route_name='example', renderer='example.html')
def templet_test(request):
    ...
    return {'result':search_result}
html:
<html>
...
<table border="1" width="500" align="center" cellpadding="0">
<tr>
<th >study</th>
<th >sample</th>
</tr>
% for item in result:
<tr>
<td valign=top align=center> ${item['study']} </td>
<td valign=top align=center> ${item['sample']} </td>
</tr>
% endfor
</table>
...
</html>
But it didn't work out. I got this message:
"Internal Server Error
The server encountered an unexpected internal server error
(generated by waitress)"
How to fix it?
The real exception is shown in the console; it will contain the cause of your problem.
You may also want to enable the debug toolbar in Pyramid for a friendlier way of handling errors during development.
Also, looking at your template, and assuming you use SQLAlchemy for database interaction, you probably want to access result rows like item.study, not like item['study'].
That's as much as I can guess without a proper traceback.
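One way to sidestep the item.study versus item['study'] question is to convert every row to a plain dict in the view before handing it to the template. The following is only a sketch under assumptions: row_to_dict is a hypothetical helper, the column names are taken from the question, and the namedtuple Row is a stand-in for whatever attribute-style row object the database layer really returns.

```python
from collections import namedtuple


def row_to_dict(row, columns=("study", "sample")):
    """Return a plain dict for a row that supports either row['x'] or row.x
    (hypothetical helper; column names come from the question)."""
    result = {}
    for name in columns:
        try:
            result[name] = row[name]           # mapping-style access
        except (TypeError, KeyError, IndexError):
            result[name] = getattr(row, name)  # attribute-style access
    return result


# Hypothetical stand-in for an attribute-style database row.
Row = namedtuple("Row", ["sample", "study"])

rows = [
    {"sample": "R1_Y200.fq", "study": "GaoQiang1"},
    Row(sample="R1_Y300.fq", study="GaoQiang2"),
]

# A list of plain dicts, so ${item['study']} in the template works either way.
search_result = [row_to_dict(row) for row in rows]
print(search_result)
```

The traceback in the console remains the authoritative source for the actual cause; this only removes one possible reason for the error.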

Different font size and color using CSS for data retrieved via JSTL

I am working on a web application where I get some string data using Java Servlets and then subsequently display that data on a JSP view using JSTL. Below is the code snippet:
<table border="1" width=100%>
<th style="width: 10%">S. NUMBER</th>
<th style="width:35%"> First Name </th>
<th style="width: 35%"> Second Name </th>
<th style="width: 10%">Student ID</th>
<th style="width: 10%">Student Class</th>
<c:forEach items="${sortedResults}" var="result">
<tr>
<td>${result.counter}</td>
<td>${result.sequenceIntEx}</td>
<td>${result.lName}</td>
<td>${result.sID}</td>
<td>${result.sClass}</td>
</tr>
</c:forEach>
For example, for the lName field, I want to display the 3rd and 4th characters in a bigger font size and a different color. Any thoughts?
Perhaps assign systematic element names, e.g. id="td1", id="td2", etc., and then use JSP to generate a CSS file that matches the automatic element names, filling in any parameters needed depending on your criteria.
So, I got this working using the substring functions in JSTL. First, the following needs to be imported:
<%@ taglib uri="http://java.sun.com/jsp/jstl/functions" prefix="fn" %>
Then, I used the substrings to do the needed formatting.
<td>${fn:substring(result.sequenceIntEx, 0, 3)}<font size="5" color="red">${fn:substring(result.sequenceIntEx, 3, 5)}</font>${fn:substring(result.sequenceIntEx, 5, 13)}</td>
Thanks
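For reference, the index arithmetic behind the three fn:substring calls can be sketched in Python, where slicing uses the same zero-based start and exclusive end. The function highlight_span is a hypothetical helper written for this illustration, and "ABCDEFGH" is made-up sample data.

```python
from html import escape


def highlight_span(text, begin=3, end=5):
    """Wrap text[begin:end] in the <font> tag used above and escape the text
    (hypothetical helper mirroring the three fn:substring calls)."""
    head, middle, tail = text[:begin], text[begin:end], text[end:]
    return (f'{escape(head)}<font size="5" color="red">'
            f'{escape(middle)}</font>{escape(tail)}')


cell = highlight_span("ABCDEFGH")
print(cell)
```

Note that begin=3, end=5 selects the characters at zero-based positions 3 and 4, that is, the 4th and 5th characters; to highlight the 3rd and 4th characters, the bounds would be begin=2, end=4.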