PDFlib - using ArtBox to store width and height of placed elements

Using PDFlib, I'm adding elements to a page. My objective is to retrieve the combined height and width of those elements after the page is closed with $p->end_page_ext("");.
I know the combined height of the elements added to that page.
Question: is it possible to use the PDF ArtBox to store these dimensions and retrieve them later?
I'm not interested in the page's height or width, only in the space that the added elements occupy.

With PDFlib you can also add an ArtBox to a page: use the option "ArtBox {llx lly urx ury}" within the begin_page_ext() or end_page_ext() option list.
Afterwards you can retrieve these values from the PDF, for example with the pCOS interface (it is part of PDFlib+PDI, PLOP, and TET). You will find a code sample for the MediaBox in the pCOS Cookbook: https://www.pdflib.com/pcos-cookbook/pages/page_size/
To retrieve the ArtBox you just need to use these pCOS paths:
// $pageno is the 0-based index of the page; the first page is 0
// pCOS type 5 means the ArtBox entry exists and is an array
if ($p->pcos_get_number($doc, "type:pages[" . $pageno . "]/ArtBox") == 5)
{
    $llx = sprintf("%.2f", $p->pcos_get_number($doc, "pages[" . $pageno . "]/ArtBox[0]"));
    $lly = sprintf("%.2f", $p->pcos_get_number($doc, "pages[" . $pageno . "]/ArtBox[1]"));
    $urx = sprintf("%.2f", $p->pcos_get_number($doc, "pages[" . $pageno . "]/ArtBox[2]"));
    $ury = sprintf("%.2f", $p->pcos_get_number($doc, "pages[" . $pageno . "]/ArtBox[3]"));
}
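Once the four coordinates have been read back, the width and height of the occupied area follow from simple subtraction. A minimal sketch in Python; the helper name box_dimensions and the sample values are assumptions for illustration and are not part of PDFlib or pCOS:

```python
def box_dimensions(llx, lly, urx, ury):
    """Return (width, height) of a PDF box from its lower-left and upper-right corners."""
    return urx - llx, ury - lly


# Hypothetical ArtBox values, as they might come back from the pCOS queries above
width, height = box_dimensions(50.0, 100.0, 350.0, 500.0)
print(width, height)
```

The same arithmetic works in the other direction: if you know the origin and the combined width and height of the placed elements, urx is llx plus the width and ury is lly plus the height.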


How to keep metadata fields on one line

I am trying to customize the metadata for my posts. Specifically, I want to change the separator from the default forward slash (/) to a vertical bar (|). I also want to add the word "Updated" before the displayed date, and I want to keep the option to display reading time. (FYI: I know basically nothing about coding; I'm just trying to override the metadata output from Astra.)
I used this code to change the separator (found in a post about how to customize post meta in the Astra theme):
add_filter('astra_single_post_meta', 'custom_post_meta');
function custom_post_meta($old_meta)
{
    $post_meta = astra_get_option('blog-single-meta');
    if (!$post_meta) return $old_meta;
    $new_output = astra_get_post_meta($post_meta, "|");
    if (!$new_output) return $old_meta;
    return "<div class='entry-meta'>$new_output</div>";
}
See the output in image 1. It looks great!
[image 1: output of new code for separator]
This is close, but I still needed to add "Updated" before the date. So, I used this code (from the same post mentioned above):
function astra_post_date()
{
    $format = apply_filters('astra_post_date_format', '');
    $published = esc_html(get_the_date($format));
    $modified = esc_html(get_the_modified_date($format));
    $output = '<p class="posted-on">';
    $output .= 'Updated: <span class="published" ';
    $output .= 'itemprop="datePublished">' . $published;
    $output .= '</span>';
    $output .= '</p>';
    return apply_filters('astra_post_date', $output);
}
See the output in image 2: the content is perfect, but the formatting is wrong. I can't figure out how to edit the code to get all 3 fields on the same line.
[image 2: output of new code for adding "Update"]
What do I need to change in the 2nd block of code to keep everything on one line, like the 1st block of code?
Thanks for any help!

PDFlib - is there a way to rotate a page of a PDF?

Before rotation:
This is what I generate, on an A4 page.
[image: before]
After rotation:
After generating it, I want to rotate the page like this.
[image: after]
Please help me.
Totally possible!
Have a look at the PDFlib Cookbook (a great resource for learning!), more specifically the rotate_pages sample.
Basically, you import a page into a document and place it with a new orientation. That will "rotate" the page as you like.
PHP sample code from the Cookbook:
/* Loop over all pages of the input document */
for ($pageno = 1; $pageno <= $endpage; $pageno++)
{
    $page = $p->open_pdi_page($indoc, $pageno, "");
    if ($page == 0)
        throw new Exception("Error: " . $p->get_errmsg());
    /* Page size may be adjusted by fit_pdi_page() */
    $p->begin_page_ext(0, 0, "width=a4.width height=a4.height");
    /* Place the imported page on the output page. Adjust the page size
     * automatically to the size of the imported page. Orientate the
     * page to the west; similarly you can orientate it to the east or
     * south, if required.
     */
    $p->fit_pdi_page($page, 0, 0, "adjustpage orientate=west");
    $p->close_pdi_page($page);
    $p->end_page_ext("");
}

How to multiply the font size in HTML? ActionScript 3 implementation

Please compare the input and output examples below. The difference is in the size=[value] attributes. How can I replace them?
input:
"<font size='30'> Head </font><br></br> <font color='#b5fe01' size='50'>Progress:</font>"
I want to multiply all font sizes by 2 and replace them in the original input.
output:
"<font size='60'> Head </font><br></br> <font color='#b5fe01' size='100'>Progress:</font>"
Thanks
AS3 regexp as requested:
var multiply:Function = function(matched:String, start:String, size:String, index:int, str:String):String
{
    return start + (2 * int(size)).toString() + "'";
};
var match:RegExp = /(<font[^>]*size=')(\d+)'/gi;
var src:String = "<font size='30'> Head </font><br/> <font color='#b5fe01' size='50'>Progress:</font>";
var replaced:String = src.replace(match, multiply);
Explanation:
multiply - Takes the "start" and "size" parameters. "start" is the part of the font tag matched before the size value; it is required because we need to know we are inside a font tag, yet we only want to replace the size value. "size" is the actual size value.
RegExp - The first group captures "<font" followed by any number of non-'>' characters, followed by "size='". The second group is the value of size. The match ends with the "'" after the size value, which is not captured. g stands for "global" and matches every occurrence in the string; i makes matching case-insensitive.
This is not a foolproof solution, but I think it shows the basic idea and is easy to extend for more general use.
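For comparison, the same capture-and-callback idea can be sketched in Python with re.sub. This is only an illustration of the technique under the same assumptions as the pattern above (single-quoted, purely numeric size values); the names FONT_SIZE and multiply_font_sizes are made up for this example:

```python
import re

# Group 1: "<font", any non-'>' characters, then "size='"; group 2: the digits.
# The closing quote is matched but not captured, so the callback re-adds it.
FONT_SIZE = re.compile(r"(<font[^>]*size=')(\d+)'", re.IGNORECASE)


def multiply_font_sizes(html_text, factor=2):
    """Multiply every numeric size value inside <font ...> tags by factor."""
    def replace(match):
        start, size = match.group(1), match.group(2)
        return f"{start}{int(size) * factor}'"

    return FONT_SIZE.sub(replace, html_text)


src = "<font size='30'> Head </font><br></br> <font color='#b5fe01' size='50'>Progress:</font>"
print(multiply_font_sizes(src))
```

As with the AS3 version, double-quoted attributes or a tag containing more than one size attribute would need a more careful pattern.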
We should really do this with an HTML/XML processor...
Using just pure Perl:
#!/usr/bin/perl -i
while (<>) {
    s/(<font\s)(.*?)(>)/$1 . repsize($2) . $3/ge;
    print;
}
sub repsize {
    my $atribs = shift;
    return $atribs =~ s/(size=.)(\d+)/$1 . $2*2/er;
}
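The parser-based route mentioned above can be sketched with Python's standard html.parser module. This is a rough illustration, not a complete HTML rewriter: the class name FontSizeScaler and the function scale_font_sizes are assumptions for this example, attributes are always re-emitted with single quotes, and comments and doctype declarations are dropped.

```python
from html import escape
from html.parser import HTMLParser


class FontSizeScaler(HTMLParser):
    """Re-emit the parsed HTML, multiplying numeric size attributes on <font> tags."""

    def __init__(self, factor):
        super().__init__(convert_charrefs=True)
        self.factor = factor
        self.parts = []

    def _render_tag(self, tag, attrs, closing=""):
        rendered = []
        for name, value in attrs:
            if tag == "font" and name == "size" and value and value.isdigit():
                value = str(int(value) * self.factor)
            if value is None:
                rendered.append(name)  # attribute without a value
            else:
                rendered.append(f"{name}='{escape(value, quote=True)}'")
        attr_text = "".join(" " + item for item in rendered)
        return f"<{tag}{attr_text}{closing}>"

    def handle_starttag(self, tag, attrs):
        self.parts.append(self._render_tag(tag, attrs))

    def handle_startendtag(self, tag, attrs):
        self.parts.append(self._render_tag(tag, attrs, closing="/"))

    def handle_endtag(self, tag):
        self.parts.append(f"</{tag}>")

    def handle_data(self, data):
        # Character references were decoded by the parser, so escape text again
        self.parts.append(escape(data, quote=False))


def scale_font_sizes(html_text, factor=2):
    parser = FontSizeScaler(factor)
    parser.feed(html_text)
    parser.close()
    return "".join(parser.parts)


src = "<font size='30'> Head </font><br></br> <font color='#b5fe01' size='50'>Progress:</font>"
print(scale_font_sizes(src))
```

Because the parser identifies tags and attributes itself, this version does not depend on attribute order or quoting style in the input, which is the main weakness of the regex approaches.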

IE table first column whitespace

I'm making a page (for IE) that shows some values in a table format. But somehow there's some kind of whitespace before the first column, making it uneven with the following columns.
I just want the columns to be aligned.
http://i49.tinypic.com/14b3kuh.jpg
out.println("<div id='tablePos'>");
out.println("<ul class='ulStyle'>");
out.println("<li>");
out.println("<table border='1'>");
out.println("<div id='divTableForms'>");
out.println("<tr id='process'><td>PROCESS</td></tr>");
while (rs.next()) {
    String process = rs.getString(2);
    String processtype = rs.getString(1);
    out.println("<tr id='process'><td id='process2'>" + processtype + "-" + process + "</td></tr>");
}
out.println("<tr id='process'><td id='process2'></td></tr>");
out.println("</div>");
out.println("</table>");
out.println("</li>");
First of all, I think it's because of your divTableForms div: a div is not allowed directly inside a table. Second, you have used out.println("<tr id='process'><td id='process2'></td></tr>");
but I don't get why you have included it. And finally, never use the same id for multiple tags, particularly inside a loop.
An empty table cell (TD element) has always been a problem. See http://bignosebird.com/docs/h34.shtml or "CSS to make an empty cell's border appear?".
out.println("<tr id='process'><td id='process2'>&nbsp;</td></tr>");
Also, an id attribute value must be unique within an HTML document. Your code duplicates the 'process' identifier, so use the class attribute instead of id.
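The two suggestions above (classes instead of repeated ids, and a non-breaking space in the otherwise empty cell) can be combined into a small sketch. It is written in Python purely for illustration; the function name build_process_table and the sample records are assumptions, and the stray div inside the table is left out.

```python
from html import escape


def build_process_table(records):
    """Build the table markup with class attributes and a non-empty filler cell.

    records is an iterable of (process_type, process) pairs.
    """
    rows = ["<tr class='process'><td>PROCESS</td></tr>"]
    for process_type, process in records:
        label = escape(f"{process_type}-{process}")
        rows.append(f"<tr class='process'><td class='process2'>{label}</td></tr>")
    # &nbsp; keeps the last cell from being treated as empty
    rows.append("<tr class='process'><td class='process2'>&nbsp;</td></tr>")
    return "<table border='1'>\n" + "\n".join(rows) + "\n</table>"


print(build_process_table([("A", "Billing"), ("B", "Shipping")]))
```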

Basic information extraction from html?

I have a project where users submit many links to external sites and I need to parse the HTML of these submitted links and extract basic information from the page in the same way that Digg and Facebook do when a link is submitted.
I want to retrieve:
main title or heading (could be in title, h1, h2, p etc...)
intro or description text (could be in div, p etc...)
main image
My main problem is that there seem to be too many options to explore here, and I'm getting a little confused, to say the least. Many of the solutions I have looked at so far seem to be either inadequate or huge overkill.
You would pick a server side language to do this.
For example, with PHP, you could use get_meta_tags() for the meta tags...
$meta = get_meta_tags('http://google.com');
And you could use DOMDocument to get the title element (some may argue that if you need the title element, you may as well use DOMDocument to get the meta tags too).
$dom = new DOMDocument;
// loadHTMLFile() fetches the URL; loadHTML() expects a string of HTML
$dom->loadHTMLFile('http://google.com');
$title = $dom
    ->getElementsByTagName('head')
    ->item(0)
    ->getElementsByTagName('title')
    ->item(0)
    ->nodeValue;
As for getting the main image, that would require some way of deciding what counts as the main image. You could get all img elements and look for the largest one on the page.
$dom = new DOMDocument;
// loadHTMLFile() fetches the URL; loadHTML() expects a string of HTML
$dom->loadHTMLFile('http://google.com');
$imgs = $dom
    ->getElementsByTagName('body')
    ->item(0)
    ->getElementsByTagName('img');
$imageSizes = array();
foreach ($imgs as $img) {
    if ( ! $img->hasAttribute('src')) {
        continue;
    }
    $src = $img->getAttribute('src');
    // May need to prepend relative path
    // Assuming Apache, http and port 80
    $relativePath = rtrim($_SERVER['SERVER_NAME'] . $_SERVER['REQUEST_URI'], '/') . '/';
    if (substr($src, 0, strlen($relativePath)) !== $relativePath) {
        $src = $relativePath . $src;
    }
    // getimagesize() returns false on failure, otherwise an array starting with width and height
    $imageInfo = getimagesize($src);
    if ( ! $imageInfo) {
        continue;
    }
    list($width, $height) = $imageInfo;
    $imageSizes[$width * $height] = $img;
}
// Sort by area so that end() returns the largest image
ksort($imageSizes);
$mainImage = end($imageSizes);
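The same extraction steps (title, description meta tag, candidate images) can also be sketched in a single pass with Python's standard html.parser module. This is a minimal illustration that works on an HTML string which has already been downloaded; the names PageSummaryParser and summarize_html are assumptions for this example, and choosing the "main" image from the collected URLs (for instance by fetching each one and comparing dimensions, as above) is left out.

```python
from html.parser import HTMLParser


class PageSummaryParser(HTMLParser):
    """Collect the <title> text, a description meta tag, and image URLs."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.description = None
        self.image_urls = []
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta":
            name = (attributes.get("name") or attributes.get("property") or "").lower()
            if name in ("description", "og:description") and self.description is None:
                self.description = attributes.get("content")
        elif tag == "img" and attributes.get("src"):
            self.image_urls.append(attributes["src"])

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def summarize_html(html_text):
    parser = PageSummaryParser()
    parser.feed(html_text)
    parser.close()
    return {
        "title": parser.title.strip(),
        "description": parser.description,
        "images": parser.image_urls,
    }


sample = (
    "<html><head><title> Example Page </title>"
    "<meta name='description' content='A short intro.'></head>"
    "<body><h1>Heading</h1><img src='/a.png'><img alt='no source'>"
    "<img src='b.jpg'></body></html>"
)
print(summarize_html(sample))
```

Image URLs come back exactly as written in the page, so relative ones such as /a.png still need to be resolved against the page's own URL before they can be fetched.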