Set Page Size in Units with NReco (WkHtmlToPdf) - html

Is there a way to set print page size in millimeters, like you can with borders, in NReco - the html to pdf wrapper for wkhtmltopdf? Found a way to specify one of four page sizes in the docs, which isn't precise enough.
From reading the wkhtmltopdf docs themselves, it seems like it's also limited to predefined page sizes, rather than setting them manually in units of length.
Asking in case I am, hopefully, wrong. I need to set the page to something like 15 x 10 cm, for example.
Edit: I am having trouble forcing the library to use the html page settings (as an alternative to not setting anything in the html, and setting height/width/margins in NReco, as shown in my partial answer below). This:
@page {
size: 4in 3in;
margin: 0mm 0mm 0mm 0mm;
}
actually works on print, but when I try to force NReco to use it with:
pdfConverter.CustomWkHtmlArgs = "--print-media-type";
it does nothing. Example is from front page of NReco's site too, which makes it funnier.

Looking at this snippet from the wkhtmltopdf documentation:
--page-height <unitreal> Page height
-s, --page-size <Size> Set paper size to: A4, Letter, etc.
(default A4)
--page-width <unitreal> Page width
I would say that using --page-height and --page-width would do the trick. You might expect these to take the page height and width in points, but in fact the values are in mm. If you need to convert, there are 25.4 mm in an inch and 72 points in an inch.
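To make the conversion concrete, here is a minimal JavaScript sketch of the arithmetic. The function names (inchesToMm, pointsToMm, pageSizeArgs) are made up for illustration and are not part of NReco or wkhtmltopdf. The generated string assumes your wkhtmltopdf build accepts an explicit mm suffix on --page-width and --page-height, so treat it as a starting point rather than a verified command line.

```javascript
// Unit constants taken from the conversion facts above.
const MM_PER_INCH = 25.4;
const POINTS_PER_INCH = 72;

// Convert inches to millimetres.
function inchesToMm(inches) {
  return inches * MM_PER_INCH;
}

// Convert typographic points to millimetres.
function pointsToMm(points) {
  return (points / POINTS_PER_INCH) * MM_PER_INCH;
}

// Build the two page-size arguments (hypothetical helper; assumes an
// explicit "mm" unit suffix is accepted by wkhtmltopdf).
function pageSizeArgs(widthMm, heightMm) {
  return `--page-width ${widthMm}mm --page-height ${heightMm}mm`;
}
```

For the 15 x 10 cm page from the question, pageSizeArgs(150, 100) gives --page-width 150mm --page-height 100mm.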

Note: I will be editing this answer if I get somewhere, but for now, I guess something is better than nothing for whoever might read this.
Posting an answer that others might find helpful but that isn't the full solution to my problem:
HtmlToPdfConverter pdfConverter = new HtmlToPdfConverter();
//The page width and page height values are in mm
pdfConverter.PageWidth = 102;
pdfConverter.PageHeight = 77;
This should NOT be the accepted answer - for some reason it does NOT fully match simple html sizing. For example, if I create an html document with size ratios of 4x3 and then set these props appropriately, the resulting image on the page still does not take up the entire page (ends up smaller).
If I run the following html and set the page size to 102 mm x 77 mm, I get the screenshot below, which is way off despite having the ratios right:
<html>
<head>
<style>
.reportBody {
padding: 0px;
border: 0px;
margin: 0px;
}
.reportTable{
padding: 0px;
border: 0px;
margin: 0px;
width: 102mm;
height: 77mm;
}
</style>
</head>
<body class = "reportBody">
<table class = "reportTable">
<tr>
<td style = "background-color:red">
Row 1 Column 1
</td>
<td style = "background-color:blue">
Row 1 Column 2
</td>
</tr>
<tr>
<td style = "background-color:green">
Row 2 Column 1
</td>
<td style = "background-color:yellow">
Row 2 Column 2
</td>
</tr>
</table>
</body>
</html>

The quick answer to your question is yes: the height and width properties are in mm, so what you are doing in the C# code is correct. Your sizing issue comes from the converter employing a 'smart' resizing technique, which is on by default. The ratio being correct but the dimensions reduced is exactly the issue I had, and it was resolved by including the --disable-smart-shrinking option.
For a fuller picture:
I've just finished step one of getting prescription printing direct from a new Razor app done, the PDFs being generated via the NReco wrapper. I have to print to a fixed prescription sheet (215mm x 176mm) with a small margin around it. This is the code that I've got in the Controller, which returns the pdf as the fileResult variable.
public async Task<ActionResult> OutputScript(int? id)
{
if (id == null)
{
return new HttpStatusCodeResult(HttpStatusCode.BadRequest);
}
var model = await GetViewModelForScript(id.Value);
if (model == null)
{
return HttpNotFound();
}
// create a string writer to receive the HTML code
StringWriter stringWriter = new StringWriter();
// get the view to render
ViewEngineResult viewResult = ViewEngines.Engines.FindView(ControllerContext, "Script", null);
// create a context to render a view based on a model
ViewContext viewContext = new ViewContext(
ControllerContext,
viewResult.View,
new ViewDataDictionary(model),
new TempDataDictionary(),
stringWriter
);
// render the view to a HTML code
viewResult.View.Render(viewContext, stringWriter);
// return the HTML code
string htmlToConvert = stringWriter.ToString();
// instantiate the HTML to PDF converter
HtmlToPdfConverter htmlToPdfConverter = new HtmlToPdfConverter();
htmlToPdfConverter.CustomWkHtmlArgs = " --print-media-type --title \"SMS Script " + model.ScriptID + "\" --dpi 300 --disable-smart-shrinking";
htmlToPdfConverter.PageHeight = 215;
htmlToPdfConverter.PageWidth = 176;
var margins = new PageMargins();
margins.Bottom = 4;
margins.Top = 4;
margins.Left = 5;
margins.Right = 5;
htmlToPdfConverter.Margins = margins;
htmlToPdfConverter.Orientation = PageOrientation.Landscape;
// render the HTML code as PDF in memory
byte[] pdfBuffer = htmlToPdfConverter.GeneratePdf(htmlToConvert);
// send the PDF file to browser
FileResult fileResult = new FileContentResult(pdfBuffer, "application/pdf");
fileResult.FileDownloadName = "Script.pdf";
return fileResult;
}
The way it is now, all the @media print directives are obeyed, hiding the labels that show on the screen "preview" when the PDF is generated for printing.

Related

Retina Devices in web developing: Do I still need to have 2x images?

A lot of the information about Retina devices comes from ~2013 but not much recently.
It seems like, for example in retina.js, it includes anything with a device pixel ratio of > 1.5 to be "retina", but don't all smartphones have well over 1.5 these days? My desktop computer does as well.
My question, then: why not just always serve the highest-resolution images you have access to, instead of creating half-sized versions for "non-retina" devices? As far as I know, those devices don't really exist much anymore and won't suffer much from being served a higher-resolution image.
Thanks!!!
Using 2x images is a huge pain.
Who can know what is "best," but I'm currently working with images in a combo like this:
I use a parent element so that the image will fill it, and that parent (or its ancestors) determines any limits. You can use the picture element. Usually the src values are supplied by a CMS or something similar, e.g. {{image.small.url}}.
The official answer to your question would be that people don't serve the higher-res file to everything because the file is bigger and they want the site to load as fast as possible. But if you double the image's size (twice as big as it will ever be presented), compress it to ~40 or so, and then use the parent element to size it, the file is actually smaller. There are very specific studies on this. I don't know how that works out for painting and browser rendering, though.
MARKUP
<figure class='poster'>
<img src='' alt=''
data-small='http://placehold.it/600'
data-medium='http://placehold.it/1000'
data-large='http://placehold.it/2000'
/>
</figure>
STYLES (stylus)
figure // just imagine the brackets if you want
margin: 0
img
display: block
width: 100%
height: auto
.poster
max-width: 400px
SCRIPT
$(function() { // DOM-ready shorthand; .on('ready') was removed in jQuery 3
// $global
var $window = $(window);
var windowWidth;
var windowHeight;
function getWindowDimentions() {
windowWidth = $window.width();
windowHeight = $window.height();
}
function setResponsibleImageSrc(imageAncestorElement, container) {
var context;
if ( !container ) {
context = windowWidth;
} else {
context = $(container).outerWidth();
}
// pick a size bucket from the width of the container (or the window)
var large = context > 900;
var medium = context > 550;
$(imageAncestorElement).each( function() {
var $this = $(this).find('img');
var src = {};
src.small = $this.data('small');
src.medium = $this.data('medium');
src.large = $this.data('large');
if ( large ) {
$this.attr('src', src.large);
} else if ( medium ) {
$this.attr('src', src.medium);
} else {
$this.attr('src', src.small);
}
});
};
$window.on('resize', function() { // this should jog a bit
getWindowDimentions();
setResponsibleImageSrc('.poster', 'body');
}).trigger('resize');
});
It all depends on what you are doing - and there is no silver bullet yet. The context for each image is so unique. My goal is to get in the ballpark for each size - keep the images compressed to 40 in Photoshop on export... double the size they should be, and then the parent squishes them for retina. The size is actually smaller in most cases.
CodePen example: http://codepen.io/sheriffderek/pen/bqpPra
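Since the question is about device pixel ratio, here is a small sketch of how the choice could take it into account instead of using fixed breakpoints. This is not what the script above does; chooseSource and the { url, width } shape are hypothetical names for illustration. It picks the smallest image whose pixel width covers the rendered width multiplied by the device pixel ratio, and falls back to the largest one available.

```javascript
// sources: array of { url, width }, where width is the image's real pixel width.
// cssWidth: rendered width of the image in CSS pixels.
// devicePixelRatio: e.g. window.devicePixelRatio in a browser.
function chooseSource(sources, cssWidth, devicePixelRatio) {
  const needed = cssWidth * devicePixelRatio;
  // Sort a copy so the caller's array is left untouched.
  const sorted = [...sources].sort((a, b) => a.width - b.width);
  // Smallest image that is still large enough, else the largest we have.
  const match = sorted.find((s) => s.width >= needed);
  return (match || sorted[sorted.length - 1]).url;
}

// Example data reusing the placeholder URLs from the markup above.
const posterSources = [
  { url: 'http://placehold.it/600', width: 600 },
  { url: 'http://placehold.it/1000', width: 1000 },
  { url: 'http://placehold.it/2000', width: 2000 },
];
```

With the 400px-wide .poster container, a 1x screen would get the 600px file and a 2x screen would get the 1000px file.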

Typoscript: render HTML table with database values

I'm completely new to TypoScript, so I'm having quite a hard time with the syntax, but I think I am getting there.
My task is to render an HTML table and fill it with values from a database table (doesn't matter which one). In my case I took the tt_content table and tried to fill my HTML table with the "header" field and the "bodytext" field.
So I made a completely empty template and wrote the following code in the "setup" field of the template. I added some headers and texts to the sites I have for testing my code, but I get a completely empty page; not even the "table" HTML tags are there.
After 4 days of research I still don't know what my problem is here so I am quite desperate.
Here is what I have so far:
page = PAGE
page.typeNum = 0
lib.object = COA_INT
lib.object {
10 = TEXT
10.value = <table>
20 = CONTENT
20.wrap = <tr>|</tr>
20 {
table = tt_content
select {
orderBy = sorting
}
renderObj = COA
renderObj {
10 = COA
10 {
10 = TEXT
10 {
field = header
wrap = <td>|</td>
}
20 = TEXT
20 {
field = bodytext
wrap = <td>|</td>
}
}
}
}
20 = TEXT
20.value = </table>
}
If someone could help me out here it would be much appreciated.
Thanks in advance.
Check if you have any 'template parser' running.
Go to Template -> choose 'Info/Modify' and click on 'edit the whole ...'
There, choose the Includes tab and include 'css_styled_content'. (Yes, there is another way of parsing your content, with 'fluid_styled_content'; you can choose that instead if you are on TYPO3 7.6.* or higher.)
These 'parsers' give you all the TypoScript needed to parse and render your content. Without them, nothing will be rendered when you want to render content from the backend.
second: your typoscript is wrong
You have made a content object array (lib.object is a Content Object Array) and filled it with content. But the closing TEXT object reuses key 20, so it overwrites the CONTENT object you defined at that key.
change
20 = TEXT
20.value = </table>
to
30 = TEXT
30.value = </table>
third: you have created a Page object but you did not add your COA into that page object.
Try this:
page = PAGE
page.10 < lib.object
What this does is include your lib.object in the Page object at 'level' 10.
you can also do
page.20 = TEXT
page.20.value = hello world
This will be rendered after your lib.object.
As you may have noticed, it is a bit like writing a big array (because TypoScript is a big array ;)
Beware that you must place your lib.object ABOVE the page object declaration, or else the page will not be able to include it.
There is also a Slack channel for TYPO3 you can join if you have other questions. People over there are more than willing to help you.
https://forger.typo3.org/slack

Reduce the size of text in angularjs when line breaks?

I have a responsive app for desktop and mobile.
In the app I have a div which randomly shows texts of all kinds of lengths.
I want to do the following:
If the line breaks because the text is too wide for that div, I want the font-size to reduce itself (I am using ems in my app).
Do I need to build a directive for it? Is it something that has already been built and is widely used?
Writing a robust solution for this problem is going to be non-trivial. As far as I know, there's no way to tell whether a line of text breaks. However, we do know the criteria for line breaking is the width of the text being wider than the element, accounting for padding.
The Canvas API has a method called measureText which can be used to measure a string, using a given context with a font and size set. If you spoof the settings of the element with a canvas, then you can measure the text with the canvas and adjust the size until it fits without overflowing.
I've written up a rough implementation of the way I would tackle this.
function TextScaler(element) {
    var canvas = document.createElement('canvas'),
        context = canvas.getContext('2d');
    var scaler = {};
    // copy the element's computed font settings onto the canvas context
    // (element.style would only see inline styles)
    scaler.copyProps = function() {
        var style = window.getComputedStyle(element);
        context.font = [
            style.fontStyle, style.fontVariant, style.fontWeight,
            style.fontSize, style.fontFamily
        ].join(' ');
    };
    // width in pixels of the text in the context's current font;
    // measureText returns a TextMetrics object, so read its width
    scaler.measure = function(text) {
        text = text || element.innerText;
        return context.measureText(text).width;
    };
    // true when the text is wider than the element's content box
    scaler.overflows = function() {
        var style = window.getComputedStyle(element),
            paddingLeft = parseFloat(style.paddingLeft),
            paddingRight = parseFloat(style.paddingRight),
            width = element.clientWidth - paddingLeft - paddingRight;
        return scaler.measure() > width;
    };
    scaler.decrease = function() {
        // decrease font size by however much
    };
    scaler.auto = function(retries) {
        // default only when omitted; "retries || 10" would reset 0 back to 10
        if (retries === undefined) {
            retries = 10;
        }
        if (retries <= 0) {
            scaler.apply();
            console.log('used all retries');
            return;
        }
        if (scaler.overflows()) {
            scaler.decrease();
            scaler.auto(retries - 1);
        } else {
            console.log('text fits');
            scaler.apply();
        }
    };
    scaler.apply = function() {
        // copy the properties from the context
        // back to the element
    };
    // start from the element's current font
    scaler.copyProps();
    return scaler;
}
After you've sorted out some of the blank details there, you'd be able to use the function something like this:
var element = document.getElementById('');
var scaler = TextScaler(element);
scaler.auto();
If it doesn't manage to decrease it within 10 retries, it will stop there. You could also do this manually.
while(scaler.overflows()) {
scaler.decrease();
}
scaler.apply();
You'd probably want some fairly fine tuned logic for handling the decrease function. It might be easiest to convert the ems to pixels, then work purely with integers.
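As a rough illustration of that idea, the sketch below converts ems to pixels and then steps down through integer pixel sizes until the text fits. The names emToPx and largestFittingSize are hypothetical, and the measuring function is passed in as a parameter so the loop itself has no DOM dependency. In a browser it could be backed by the canvas context, for example by setting context.font to the candidate size and returning context.measureText(text).width.

```javascript
// Convert an em value to pixels, given the parent's font size in pixels.
function emToPx(em, parentFontSizePx) {
  return em * parentFontSizePx;
}

// Walk down from startPx in whole pixels and return the first size at which
// measureWidth(text, size) fits within maxWidth. Never goes below minPx.
function largestFittingSize(measureWidth, text, maxWidth, startPx, minPx) {
  for (let size = Math.floor(startPx); size > minPx; size--) {
    if (measureWidth(text, size) <= maxWidth) {
      return size;
    }
  }
  return minPx;
}

// Stand-in measurer for demonstration only: pretends every character is
// half as wide as the font size. Real code would measure with the canvas.
function fakeMeasure(text, sizePx) {
  return text.length * sizePx * 0.5;
}
```

A linear walk is fine for the handful of sizes involved here; a binary search over the same range would be the obvious refinement if the range were large.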
This API could quite trivially be wrapped up as a directive, if you want to use this with Angular. I'd probably tackle this with two attribute directives.
<div text-scale retries="10">Hello world</div>
Of course, if it's not important that all the text is there onscreen, then you can just use the text-overflow: ellipsis CSS property.

Google Apps Script: weird page layout in a script formatted document

I'm working on a script that applies custom headings to a plain text document imported into Google Docs. The script works pretty much as it should. However, the resulting document has a weird layout, as if random page breaks were inserted here and there. But there are no page breaks, and I can't understand the reason for this layout. Checking the paragraph attributes gives me no hints about what is wrong.
Here is the text BEFORE the script is applied:
https://docs.google.com/document/d/1MzFvlkG13i3rrUcz5jmmSppG4sBH6zTXr7RViwdqaIo/edit?usp=sharing
You can make a copy of the document and execute the script (from the Scripts menu, choose Apply Headings). The script applies the appropriate heading to the scene heading, name of the character, dialogue, etc.
As you can see, at the bottom of page 2 and 3 of the resulting document there is a big gap and I can't figure out why. The paragraph attributes seem ok to me...
Here is a copy of the script:
// Apply headings to sceneheadings, actions, characters, dialogues, parentheticals
// to an imported plain text film script;
function ApplyHeadings() {
var pars = DocumentApp.getActiveDocument().getBody().getParagraphs();
for(var i=0; i<pars.length; i++) {
var par = pars[i];
var partext = par.getText();
var indt = par.getIndentStart();
Logger.log(indt);
if (indt > 100 && indt < 120) {
var INT = par.findText("INT.");
var EXT = par.findText("EXT.");
if (INT != null || EXT != null) {
par.setHeading(DocumentApp.ParagraphHeading.HEADING1);
par.setAttributes(ResetAttributes());
}
else {
par.setHeading(DocumentApp.ParagraphHeading.NORMAL);
par.setAttributes(ResetAttributes());
}
}
else if (indt > 245 && indt < 260) {
par.setHeading(DocumentApp.ParagraphHeading.HEADING2);
par.setAttributes(ResetAttributes());
}
else if (indt > 170 && indt < 190) {
par.setHeading(DocumentApp.ParagraphHeading.HEADING3);
par.setAttributes(ResetAttributes());
}
else if (indt > 200 && indt < 240) {
par.setHeading(DocumentApp.ParagraphHeading.HEADING4);
par.setAttributes(ResetAttributes());
}
}
}
// Reset all the attributes to "null" apart from HEADING;
function ResetAttributes() {
var style = {};
style[DocumentApp.Attribute.STRIKETHROUGH] = null;
style[DocumentApp.Attribute.HORIZONTAL_ALIGNMENT] = null;
style[DocumentApp.Attribute.INDENT_START] = null;
style[DocumentApp.Attribute.INDENT_END] = null;
style[DocumentApp.Attribute.INDENT_FIRST_LINE] = null;
style[DocumentApp.Attribute.LINE_SPACING] = null;
style[DocumentApp.Attribute.ITALIC] = null;
style[DocumentApp.Attribute.FONT_SIZE] = null;
style[DocumentApp.Attribute.FONT_FAMILY] = null;
style[DocumentApp.Attribute.BOLD] = null;
style[DocumentApp.Attribute.SPACING_BEFORE] = null;
style[DocumentApp.Attribute.SPACING_AFTER] = null;
return style;
}
A couple of screenshots to make the problem more clear.
This is page 2 of the document BEFORE the script is applied.
This is page two AFTER the script is applied. Headings are applied correctly but... Why the white space at the bottom?
Note: if you manually re-apply HEADING2 to the first paragraph of page 3 (AUDIO TV), the paragraph will jump back to fill the space at the bottom of page 2. This action, however, doesn't change any attribute in the paragraph. So why does the magic happen?
Thanks a lot for your patience.
That was an interesting problem ;-)
I copied your doc, ran the script and had a surprise: nothing happened!
It took me a few minutes to realize that the copy I just made had no style defined for headings; everything was for some reason in Courier New 12pt, including the headings.
I examined the log and saw the indent values, and played with that a lot to finally see that the headings were there but not changing the style.
So I went into the doc menu and set 'Use my default style' and... everything looks fine, see screen capture below.
So now to your question: it appears that there must be something wrong in your style definition. By "wrong" I mean something that changes more than just the font style and size, but honestly I can't see any way to guess what, since I'm unable to reproduce it. Please try resetting your heading styles and redefining your defaults, and tell us what happens then.
PS : here are my default heading styles : (and the url of my copy in view only :https://docs.google.com/document/d/1yP0RRCrRSsQc9zCk-sdfu5olNGDkoIrabXanII4qUG0/edit?usp=sharing )

Methods for deleting blank (or nearly blank) pages from TIFF files

I have something like 40 million TIFF documents, all 1-bit single page duplex. In about 40% of cases, the back image of these TIFFs is 'blank' and I'd like to remove them before I do a load to a CMS to reduce space requirements.
Is there a simple method to look at the data content of each page and delete it if it falls under a preset threshold, say 2% 'black'?
I'm technology agnostic on this one, but a C# solution would probably be the easiest to support. Problem is, I've no image manipulation experience so don't really know where to start.
Edit to add: The images are old scans and so are 'dirty', so this is not expected to be an exact science. The threshold would need to be set to avoid the chance of false positives.
You probably should:
open each image
iterate through its pages (using Bitmap.GetFrameCount / Bitmap.SelectActiveFrame methods)
access bits of each page (using Bitmap.LockBits method)
analyze contents of each page (simple loop)
if contents is worthwhile then copy data to another image (Bitmap.LockBits and a loop)
This task isn't particularly complex but will require some code to be written. This site contains some samples that you can search for using the method names as keywords.
P.S. I assume that all of images can be successfully loaded into a System.Drawing.Bitmap.
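The "analyze contents of each page" step can be sketched independently of any imaging library. The JavaScript below is only an illustration of the counting loop, not System.Drawing code: it assumes the page's bits are already available as a byte array packed 1 bit per pixel with the leftmost pixel in the most significant bit, each row padded to stride bytes. Which bit value means black depends on the image's palette or photometric interpretation, so it is a parameter. The names blackRatio and isMostlyBlank are hypothetical.

```javascript
// Fraction of pixels that are black in a packed 1-bit-per-pixel image.
// data: Uint8Array of rows, each row `stride` bytes long (padding ignored).
// blackBit: the bit value (0 or 1) that represents black in this image.
function blackRatio(data, width, height, stride, blackBit = 1) {
  let black = 0;
  for (let y = 0; y < height; y++) {
    for (let x = 0; x < width; x++) {
      const byte = data[y * stride + (x >> 3)];
      const bit = (byte >> (7 - (x & 7))) & 1; // most significant bit first
      if (bit === blackBit) {
        black++;
      }
    }
  }
  return black / (width * height);
}

// A page counts as blank when its black fraction is under the threshold,
// e.g. 0.02 for the 2% figure mentioned in the question.
function isMostlyBlank(data, width, height, stride, threshold, blackBit = 1) {
  return blackRatio(data, width, height, stride, blackBit) < threshold;
}
```

Because the scans are dirty, the threshold would need tuning against a sample of known blank and non-blank pages before running it over the full set.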
You can do something like that with DotImage (disclaimer, I work for Atalasoft and have written most of the underlying classes that you'd be using). The code to do it will look something like this:
public void RemoveBlankPages(Stream stm)
{
List<int> blanks = new List<int>();
if (GetBlankPages(stm, blanks)) {
// all pages blank - delete file? Skip? Your choice.
}
else {
// memory stream is convenient - maybe a temp file instead?
using (MemoryStream ostm = new MemoryStream()) {
// pulls out all the blanks and writes to the temp stream
stm.Seek(0, SeekOrigin.Begin);
RemoveBlanks(blanks, stm, ostm);
CopyStream(ostm, stm); // copies first stm to second, truncating at end
}
}
}
private bool GetBlankPages(Stream stm, List<int> blanks)
{
TiffDecoder decoder = new TiffDecoder();
ImageInfo info = decoder.GetImageInfo(stm);
for (int i=0; i < info.FrameCount; i++) {
try {
stm.Seek(0, SeekOrigin.Begin);
using (AtalaImage image = decoder.Read(stm, i, null)) {
if (IsBlankPage(image)) blanks.Add(i);
}
}
catch {
// bad file - skip? could also try to remove the bad page:
blanks.Add(i);
}
}
return blanks.Count == info.FrameCount;
}
private bool IsBlankPage(AtalaImage image)
{
// you might want to configure the command to do noise removal and black border
// removal (or not) first.
BlankPageDetectionCommand command = new BlankPageDetectionCommand();
BlankPageDetectionResults results = command.Apply(image) as BlankPageDetectionResults;
return results.IsImageBlank;
}
private void RemoveBlanks(List<int> blanks, Stream source, Stream dest)
{
// blanks needs to be sorted low to high, which it will be if generated from
// above
TiffDocument doc = new TiffDocument(source);
int totalRemoved = 0;
foreach (int page in blanks) {
doc.Pages.RemoveAt(page - totalRemoved);
totalRemoved++;
}
doc.Save(dest);
}
You should note that blank page detection is not as simple as "are all the pixels white(-ish)?" since scanning introduces all kinds of interesting artifacts. To get the BlankPageDetectionCommand, you would need the Document Imaging package.
Are you interested in shrinking the files, or do you just want to avoid people wasting their time viewing blank pages? You can do a quick and dirty edit of the files to rid yourself of known blank pages by just patching the offset that points to the second IFD to be 0x00000000. Here's what I mean: TIFF files have a simple layout if you're just navigating through the pages:
TIFF Header (4 bytes)
First IFD offset (4 bytes - typically points to 0x00000008)
IFD:
Number of tags (2-bytes)
{individual TIFF tags} (12-bytes each)
Next IFD offset (4 bytes)
Just patch the "next IFD offset" to a value of 0x00000000 to "unlink" pages beyond the current one.
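The patch described above can be sketched as follows. This is a minimal illustration that works on the file's bytes in memory and assumes a classic TIFF (magic number 42, 4-byte offsets), not BigTIFF. The function name unlinkPagesAfter is made up for this example. Note that it only unlinks the later pages; their data stays in the file, so the file does not get smaller.

```javascript
// Zero the "next IFD offset" of page number `pagesToKeep`, so readers stop
// there. Returns the byte position that was patched, or -1 if the file has
// fewer pages than pagesToKeep. Modifies `bytes` (a Uint8Array) in place.
function unlinkPagesAfter(bytes, pagesToKeep) {
  if (pagesToKeep < 1) {
    throw new Error('must keep at least one page');
  }
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  const order = String.fromCharCode(bytes[0], bytes[1]);
  if (order !== 'II' && order !== 'MM') {
    throw new Error('missing TIFF byte-order mark');
  }
  const little = order === 'II'; // "II" = little-endian, "MM" = big-endian
  if (view.getUint16(2, little) !== 42) {
    throw new Error('not a classic TIFF file');
  }
  let ifd = view.getUint32(4, little); // first IFD offset
  for (let page = 1; ; page++) {
    const tagCount = view.getUint16(ifd, little);
    const nextField = ifd + 2 + tagCount * 12; // 12 bytes per tag entry
    if (page === pagesToKeep) {
      view.setUint32(nextField, 0, little);
      return nextField;
    }
    ifd = view.getUint32(nextField, little);
    if (ifd === 0) {
      return -1; // ran out of pages before reaching pagesToKeep
    }
  }
}
```

For the duplex case in the question, calling it with pagesToKeep set to 1 on a file whose back page is known to be blank would leave only the front page visible to readers.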