Customize table pagination for Html to pdf conversion - html

I'm currently using openhtmltopdf to convert HTML to PDF. The HTML has table1 (no header) with 3 rows, then some text, and then table2 (with a header), which can have many rows. Assuming table2 is paginated across multiple pages, the PDF should look like this:
table1 (on every page that has at least 1 row from table2)
some text (only on the page that has the 1st row of table2)
table2, paginating with its header repeating on every page
To repeat the header of table2 on every page, I am using -fs-table-paginate: paginate, which works. But how do I repeat table1 and the text as per the requirement? I'd really appreciate any help.

One option is to put table 1 in the header of the page (it will then appear on every page, not only on pages containing table 2).
Another option is to include table 1 in the thead of table 2, with some CSS to make it appear as a separate table.
For example:
<head>
<style>
td{border:1px solid red}
#table1, #table1 td{border: 0}
table{-fs-table-paginate:paginate;border-collapse: collapse}
</style>
</head>
<body>
<table id="table2">
<thead>
<!-- Table 1 -->
<tr>
<td id="table1">
<table>
<tr><td>Table1</td></tr>
<tr><td>Value</td></tr>
</table>
</td>
</tr>
<!-- Header of table 2 -->
<tr><td>Col name</td></tr>
</thead>
<tbody>
(...)
</tbody>
</table>
</body>

Related

add a tag to many td elements

I have a table with hundreds of rows.
The table was generated by converting a CSV file to an HTML table using https://www.convertcsv.com/csv-to-html.htm
I want a specific column of the table to contain a link, but I don't know how to add an <a> tag to hundreds of <td> elements at a time.
<tr>
<td>title 1</td>
<td align="right">5.18</td>
<td align="right">17.27</td>
<td align="right">70</td>
<td>www.google.com/</td>
<td align="right">32958865536</td>
</tr>
The 5th td is always a link, but the td elements don't contain an <a> tag, so I need a way to add the a tag to every 5th td of the table.
I use VS Code.
This is the selector you need:
http://api.jquery.com/nth-child-selector/
You can select every 5th td of each table row with the following script, assuming tableContent is the id of the table (id selectors are case-sensitive, so the selector must match the id exactly):
$('#tableContent td:nth-child(5n)').addClass('someClass');
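The selector only picks out the cells; it does not create the links by itself. As a rough sketch of one way to do the wrapping without jQuery, the plain JavaScript function below rewrites the HTML as a string. The name linkifyFifthCell is hypothetical, and the code assumes the rows contain no nested tables and that the 5th cell holds a bare address such as www.google.com/, so "https://" is prepended when no scheme is present.

```javascript
// Hypothetical helper, not part of the original answer: wraps the text of the
// Nth <td> (5th by default) of every <tr> in an <a> tag by rewriting the HTML string.
// Assumption: rows do not contain nested tables.
function linkifyFifthCell(html, columnIndex = 5) {
  // Process one row at a time (non-greedy match up to the closing </tr>).
  return html.replace(/<tr\b[^>]*>[\s\S]*?<\/tr>/gi, (row) => {
    let seen = 0; // number of <td> cells visited in this row
    return row.replace(
      /(<td\b[^>]*>)([\s\S]*?)(<\/td>)/gi,
      (cell, open, content, close) => {
        seen += 1;
        const url = content.trim();
        // Leave other columns, empty cells, and already-linked cells untouched.
        if (seen !== columnIndex || url === "" || /<a\b/i.test(url)) {
          return cell;
        }
        // Assumption: a cell without a scheme is a bare host name.
        const hasScheme = /^[a-z][a-z0-9+.-]*:\/\//i.test(url);
        const href = hasScheme ? url : "https://" + url;
        return `${open}<a href="${href}">${url}</a>${close}`;
      }
    );
  });
}
```

Run once over the exported file's contents (for example from a small Node script), this leaves the other columns as they are, and running it a second time does not wrap the same cell twice.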

Hide table header if page has no rows with pagination (Flying Saucer)

Currently I have a dynamic table that is printed across multiple pages and has a repeated header using Flying Saucer pagination.
The problem I am facing is that if the table starts too close to the bottom of a page, only the table header is printed there, with no rows under it.
I am wondering how I can hide the header if there are no rows on the same page. The entire page is dynamic, so I cannot hardcode the table to start on the next page, as I do not know how far down the page the table will begin.
table {
width: 100%;
-fs-table-paginate: paginate;
border-spacing: 0;
}
<!-- lots of dynamic data, spans across multiple pages -->
<table>
<thead>
<th>
myheader
</th>
</thead>
<tbody>
<tr>
<td>
~lots of data, rows repeated multiple times dynamically
</td>
</tr>
</tbody>
</table>

Incorporating a table inside a table in HTML

I am trying to create an HTML table where there are four columns and any number of rows. Inside this table, the first two columns are just normal cells. The latter two columns can have multiple rows WITHIN a row in the top-level table. My issue is how I can properly align the column separators, even if the length of the content in each cell is variable.
My attempt tries to make use of:
<td colspan=2>
Example of what I am trying to do: https://jsfiddle.net/hurnzhmq/
The things I am missing in the JSFiddle are:
There is no divider between the two rows separating Content3A/Content4A from Content3B/Content4B - I tried using the "bottom-border:none" for the last child, but that did not seem to work.
The column separators between Content3A/Content3B and Content4A/Content4B are not lined up with the header's column separator, and do not touch the ends of the table (there are gaps).
Any advice on how I might go about fixing this would be greatly appreciated!
I think you should use rowspan instead of colspan.
You can use the code below:
<html>
<table border=1 >
<tr>
<td>Header1</td>
<td>Header2</td>
<td>Header3</td>
<td>Header4</td>
</tr>
<!-- Content -->
<tr>
<td rowspan="2">Content1</td>
<td rowspan="2" >Content2</td>
<td > Content3A</td>
<td > Content4A</td>
</tr>
<tr>
<td > Content3B</td>
<td > Content4B</td>
</tr>
</table>
</html>

Only parsing outer element

I am writing a scraper with Nokogiri, and I want to scrape a large HTML file.
Currently, I am scraping a large table; here is a small fragment:
<table id="rptBidTypes__ctl0_dgResults">
<tr>
<td align="left">S24327</td>
<td>
Airfield Lighting
<div>
<div>
<table cellpadding="5px" border="2" cellspacing="1px" width="100%" bgcolor=
"black">
<tr>
<td bgcolor="white">Abstract:<br />
This project is for the purchase and delivery, of various airfield
lighting, for a period of 36 months, with two optional 1 year renewals,
in accordance with the specifications, terms and conditions specified in
the solicitation.</td>
</tr>
</table>
</div>
</div>
</td>
</tr>
</table>
And here is the Ruby code I am using to scrape:
document = doc.search("table#rptBidTypes__ctl0_dgResults tr")
document[1..-1].each do |v|
cells = v.search 'td'
if cells.inner_html.length > 0
data = {
number: cells[0].text,
}
end
ScraperWiki::save_sqlite(['number'], data)
end
Unfortunately this isn't working for me. I only want to extract S24327, but I am getting the content of every table cell. How do I only extract the content of the first td?
Keep in mind that under this table, there are many table rows following the same format.
In CSS, table tr means tr anywhere underneath the table, including nested tables. But table > tr means the tr must be a direct child of the table.
Also, it appears you only want the cell values, so you don't need to iterate. This will give you all such cells (the first in each row):
doc.search("table#rptBidTypes__ctl0_dgResults > tr > td[1]").map(&:text)
The content of the first td would be:
doc.at("table#rptBidTypes__ctl0_dgResults td").text
The problem is that your search is matching two different things: the <tr> tag nested directly within the table with id rptBidTypes__ctl0_dgResults, and the <tr> tag within the table nested inside that parent table. When you loop through document[1..-1] you're actually selecting the second <tr> tag rather than the first one.
To select just the direct child <tr> tag, use:
document = doc.search("table#rptBidTypes__ctl0_dgResults > tr")
Then you can get the text for the <td> tag with:
document.css('td')[0].text #=> "S24327"

Removing table line

I want to place two or more tables one after another so that they look like a single table.
<html>
<head>
<style type="text/css">
.cls
{
border:1px solid #000000;
}
.cls td {
border:1px solid #000000;
}
</style>
</head>
<body>
<table class="cls">
<tr>
<td>aaa</td><td>bbb</td><td>ccc</td>
</tr>
<tr>
<td>ddd</td><td>eee</td><td>fff</td>
</tr>
</table>
<table class="cls">
<tr>
<td>aaa</td><td>bbb</td><td>ccc</td>
</tr>
<tr>
<td>ddd</td><td>eee</td><td>fff</td>
</tr>
</table>
</body>
</html>
My problem is that the line where the tables meet is normally rendered as a double line. How can I show it as a single line?
.cls-last
{
border-top: 0px;
}
On your 2nd table:
<table class="cls cls-last">
<tr>
<td>aaa</td><td>bbb</td><td>ccc</td>
</tr>
<tr>
<td>ddd</td><td>eee</td><td>fff</td>
</tr>
</table>
You could change the top (or bottom) border of the table via CSS.
However, alignment could be a challenge here. In the example you gave, it's not a problem: each cell contains 3 (relatively) similar characters, so the columns would be nearly identical. However, if one column in one table has 10 characters, for instance, HTML is going to stretch that column and you're going to be left with two obviously different entities.
So, to make this work 100% of the time, you're going to need to set widths and (possibly) overflow properties as well.
I'm having a tough time understanding why you'd have to do it this way. I'm sure you've got a reason, but two similar entities with similar widths and columns should be able to be commingled. If the tables were to only sometimes appear, or you wanted to remove rows, you could do so via Javascript and/or CSS or at the server level when rendering.