How to avoid java.net.UnknownHostException while parsing HTML content to generate a PDF file using iText

I want to convert some HTML content into a PDF file. The problem is that the HTML contains <img> tags with absolute image URLs, so the
HTMLWorker.parse()
method throws the following exception when there is no network connectivity:
ExceptionConverter: java.net.UnknownHostException: xyz.com
Is there a way to avoid this exception in that case and generate the PDF without the images?
I'm using iText-5.0.5 library.

You should implement your own ImageProvider and return null when the image cannot be retrieved, for example:
public static class MyImageProvider implements ImageProvider {
    public Image getImage(String src, Map<String, String> h, ChainedProperties cprops, DocListener doc) {
        try {
            return Image.getInstance(IMAGE_URL); // create IMAGE_URL from the src parameter
        } catch (Exception e) {
            // Covers IOException (including UnknownHostException) and iText's checked BadElementException
            return null;
        }
    }
}
Then pass this provider to HTMLWorker:
HashMap<String,Object> map = new HashMap<String, Object>();
map.put(HTMLWorker.IMG_PROVIDER, new MyImageProvider());
HTMLWorker.parseToList(new FileReader(HTML), null, map);
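The same "return null on failure" idea can be tried without iText. The following is only a sketch using the JDK; the names SafeFetch and fetchOrNull are hypothetical and not part of iText. It treats every IOException (UnknownHostException is one of them) as "no image". Inside a real getImage you would hand the result to iText instead of returning raw bytes.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;

// Hypothetical helper, not part of iText.
final class SafeFetch {
    // Returns the bytes behind the given URL, or null if they cannot be read.
    static byte[] fetchOrNull(String src) {
        try (InputStream in = new URL(src).openStream()) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buffer = new byte[8192];
            int n;
            while ((n = in.read(buffer)) != -1) {
                out.write(buffer, 0, n);
            }
            return out.toByteArray();
        } catch (IOException e) {
            // UnknownHostException, MalformedURLException and FileNotFoundException
            // are all IOExceptions, so an unreachable image simply becomes null.
            return null;
        }
    }
}
```

A caller then only needs a null check, for example skipping the image when SafeFetch.fetchOrNull(src) returns null.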

Related

Tess4j Image reading

I am using the Tess4J API to read digits from an image.
The code is below:
public static void main(String[] args) {
// TODO Auto-generated method stub
final File imageFile = new File("C:\\Users\\goku\\Desktop\\myimage.png");
System.out.println("Image found");
final ITesseract instance = new Tesseract();
instance.setTessVariable("tessedit_char_whitelist", "0123456789");
instance.setDatapath("C:\\Users\\goku\\Downloads\\Tess4J");
instance.setLanguage("eng");
String result;
try {
result = instance.doOCR(imageFile);
System.out.println(result);
} catch (TesseractException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
}
Image attached.
The program reads the digits incorrectly, and I can't find the cause.
output:
1 1 3 251
Rescaling the image to 300 DPI would get the correct result.
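As a rough illustration of what rescaling to 300 DPI means in pixels, here is a sketch using only java.awt. The class name DpiRescale is hypothetical, and the source DPI is an assumption the caller must supply (for example 72 or 96 for a screenshot). It only resamples the pixels and does not write DPI metadata into the saved file.

```java
import java.awt.Graphics2D;
import java.awt.RenderingHints;
import java.awt.image.BufferedImage;

// Hypothetical helper, not part of Tess4J.
final class DpiRescale {
    // Resamples src so that an image captured at sourceDpi gets the pixel
    // density it would have had at targetDpi (scale factor = targetDpi / sourceDpi).
    static BufferedImage rescale(BufferedImage src, double sourceDpi, double targetDpi) {
        double factor = targetDpi / sourceDpi;
        int w = Math.max(1, (int) Math.round(src.getWidth() * factor));
        int h = Math.max(1, (int) Math.round(src.getHeight() * factor));
        BufferedImage out = new BufferedImage(w, h, BufferedImage.TYPE_INT_RGB);
        Graphics2D g = out.createGraphics();
        try {
            g.setRenderingHint(RenderingHints.KEY_INTERPOLATION,
                    RenderingHints.VALUE_INTERPOLATION_BICUBIC);
            g.drawImage(src, 0, 0, w, h, null);
        } finally {
            g.dispose();
        }
        return out;
    }
}
```

The rescaled BufferedImage can then be passed to the doOCR overload that accepts a BufferedImage instead of a File.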
This is how to preprocess the image with im4java (ImageMagick) so that Tess4J (Tesseract) can read it:
private static File processImage(File img) throws IOException, InterruptedException, IM4JavaException {
File newImg = File.createTempFile("asdf", ".png");
ImageMagickCmd cmd = new ImageMagickCmd("convert");
IMOperation op = new IMOperation();
op.addImage(img.getAbsolutePath());
op.strip().resample(300).colorspace("gray").autoLevel().threshold(35000).type("bilevel").depth(8).trim();
op.addImage(newImg.getAbsolutePath());
cmd.run(op);
return newImg;
}
It might be the trained data. I have used the trained data from the tesseract-ocr-w64-setup-v4.1.0.20190314.exe Windows binary, found at https://digi.bib.uni-mannheim.de/tesseract/, with the datapath set as below
instance.setDatapath("C:\\Program Files\\Tesseract-OCR\\tessdata");
I do get a warning about the resolution, but the result is correct:
471871882819

Vaadin upload image and store it to database

I am using a Vaadin upload component, and so far I have managed to upload an image to a directory and display it in a panel component after it is successfully uploaded. What I want to do next is to insert it into the database as well. I have a table called Show, which has a name, a date and an image. In the Show class I have tried to declare my image either as a byte array or as a Blob.
@Column(name="image")
private byte[] image;
@Lob
@Column(name="image")
private Blob image;
In the upload-succeeded method I want to convert the file to a byte array; so far I have tried this:
File file = new File("C:\\Users\\Cristina_PC\\Desktop\\" + event.getFilename());
byte[] bFile = new byte[(int) file.length()];
try {
FileInputStream fileInputStream = new FileInputStream(file);
fileInputStream.read(bFile);
uIP.uploadImage(bFile);
fileInputStream.close();
} catch (Exception e) {
e.printStackTrace();
}
I also tried this:
byte[] data = Files.readAllBytes(new File("C:\\Users\\Cristina_PC\\Desktop\\" + event.getFilename()).toPath());
uIP.uploadImage(data);
uIP is my uploadImagePresenter, where I tried either to convert the byte array to a Blob or simply to pass it to the repository as a byte array:
public void uploadImage(byte[] data) throws SerialException, SQLException{
//Blob blob = new javax.sql.rowset.serial.SerialBlob(data);
showRepo.updateAfterImage(show, data); // or (show, blob)
}
In my repository, in my updateAfterImage method I have:
public void updateAfterImage(Show show, byte[] data) //or Blob data
{
em.getTransaction().begin(); //em - EntityManager
show.setImage(data);
em.getTransaction().commit();
}
Either with Blob or a byte array, I can't manage to update the existing show by setting its image and update it in the database (the cell remains NULL). Also I get no error to help me figure out what is going wrong. Any help/advice would be useful. Thanks!
I have found the solution. What made it work was:
em.getTransaction().begin();
em.find(Show.class, show.getId());
show.setImage(data);
em.merge(show);
em.getTransaction().commit();
in my updateAfterImage method in the show repository.

How to retrieve blob image from mysql database in jsp

while (rsimg.next())
{
Blob photo = rsimg.getBlob("thumbnails");
}
After that, what do I have to do to show the image in the browser?
Try this code in a servlet rather than in a JSP, because it is easier to use and to debug there:
import java.sql.*;
import java.io.*;
public class RetrieveImage {
public static void main(String[] args) {
try{
Class.forName("YOUR DRIVER NAME");
Connection con=DriverManager.getConnection(
"URL","USERNAME","PASSWORD");
PreparedStatement ps=con.prepareStatement("select * from TBL_NAME");
ResultSet rs=ps.executeQuery();
if(rs.next()){//now on 1st row
Blob b=rs.getBlob(2); //2 means 2nd column data
byte barr[]=b.getBytes(1,(int)b.length()); //1 is the position of the first byte (Blob positions are 1-based)
FileOutputStream fout=new FileOutputStream("d:\\IMG_NAME.jpg");
fout.write(barr);
fout.close();
}//end of if
System.out.println("ok");
con.close();
}catch (Exception e) {e.printStackTrace(); }
}
}
Now you can load the image from the path given above.
Hope this helps !!
To show an image on a web page, you have to use an 'img' tag and set its 'src' attribute to the relative path of your image.
The problem is that the 'img' tag cannot take binary data as 'src', i.e. your client cannot access files in the database directly. What you can do is create a servlet that loads the file from the database and then streams it via HttpServletResponse.
Your servlet will look something like this:
public class DisplayImage extends HttpServlet {
private static final int DEFAULT_BUFFER_SIZE = 10240; // 10KB.
protected void processRequest(HttpServletRequest request, HttpServletResponse response)
throws ServletException, IOException
{
// Code to access database and get blob image.
// String id = request.getParameter("id");
// select from table where id='id'
Blob photo = rsimg.getBlob("thumbnails");
response.reset();
response.setBufferSize(DEFAULT_BUFFER_SIZE);
response.setContentType("image/jpeg");
response.setHeader("Content-Length", String.valueOf(photo.length()));
// Prepare streams.
BufferedInputStream input = null;
BufferedOutputStream output = null;
try {
// Open streams.
input = new BufferedInputStream(photo.getBinaryStream(), DEFAULT_BUFFER_SIZE);
output = new BufferedOutputStream(response.getOutputStream(), DEFAULT_BUFFER_SIZE);
// Write file contents to response.
byte[] buffer = new byte[DEFAULT_BUFFER_SIZE];
int length;
while ((length = input.read(buffer)) > 0) {
output.write(buffer, 0, length);
}
} finally {
// Either stream may still be null if opening it failed.
if (output != null) output.close();
if (input != null) input.close();
}
}
}
Now the question is how your servlet knows which image to stream. Just pass the key as a request parameter to the servlet; the key is then used to load your image.
Assuming you pass the key as 'id', you display the image as
<img src="DisplayImage?id=imageId"></img>
You can retrieve the id with request.getParameter("id") in your DisplayImage servlet and use it to load the image from the database.
Refer to FileServlet by BalusC, which has a nice example and explanation of how files can be served from a database.
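The buffered copy loop at the heart of such a servlet can be tried in isolation. This is a minimal sketch with a hypothetical class name (StreamCopy); in the servlet, the input would be the stream obtained from photo.getBinaryStream() and the output would be response.getOutputStream().

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical helper illustrating the copy loop only.
final class StreamCopy {
    // Copies everything from in to out in chunks of bufferSize bytes
    // and returns the number of bytes copied. Closing is left to the caller.
    static long copy(InputStream in, OutputStream out, int bufferSize) throws IOException {
        byte[] buffer = new byte[bufferSize];
        long total = 0;
        int n;
        // read() returns -1 at end of stream
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n);
            total += n;
        }
        out.flush();
        return total;
    }
}
```

Leaving the close calls to the caller lets both streams be managed with try-with-resources at the call site.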

how to add image in email velocity transformer templates from classpath

I am using a Velocity transformer email template with my Mule SMTP endpoint. Is there any way to add images from my classpath to the email templates?
For example, if I have an image abc.png on my classpath, can I use it in my Velocity email template like <img src=......
You can add outbound attachments to the Mule Message, using classpath resources as their source. These Mule Message attachments will be turned into MIME parts by the SMTP outbound transformer.
From the discussion in "Embedding images into html email with java mail", it seems you need to declare the images like this:
<img src="cid:uniqueImageID"/>
You have to use a unique ID after cid: that is consistent with the Content-ID part header. Mule allows you to specify custom part headers by adding an outbound message property java.util.Map named attachmentName+"Headers" (attachmentName is the name of the outbound attachment).
One potential difficulty is that the code in the ObjectToMimeMessage transformer, which turns the javax.activation.DataHandler (coming from the Mule Message outbound attachment) into a javax.mail.BodyPart, only calls setFileName but not setDisposition, which I think is needed for the image to show properly. That said, I'm not an expert here; you probably know more about properly generating MIME emails with attached images.
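To make the "consistent ID" requirement concrete: the cid: URL in the HTML carries the bare ID, while the Content-ID part header carries the same ID wrapped in angle brackets (RFC 2392). The sketch below only builds the two strings; the class name CidExample is hypothetical and no Mule API is called. The resulting map is what you would set as the attachmentName+"Headers" outbound property described above.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper; it only builds strings and does not touch Mule.
final class CidExample {
    // HTML side: the cid: URL uses the bare ID.
    static String imgTag(String contentId) {
        return "<img src=\"cid:" + contentId + "\"/>";
    }

    // MIME side: the Content-ID header wraps the same ID in angle brackets.
    static Map<String, String> partHeaders(String contentId) {
        Map<String, String> headers = new HashMap<>();
        headers.put("Content-ID", "<" + contentId + ">");
        return headers;
    }
}
```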
1) Embed the image Base64 encoded in your HTML
e.g.
Use the following site to convert an image to Base64:
http://www.dailycoding.com/Utils/Converter/ImageToBase64.aspx
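The conversion can also be done in Java instead of on a website. This is a sketch, assuming Java 8 or later for java.util.Base64; the class name DataUri and its methods are hypothetical. It turns image bytes into a data: URI that can be placed in the src attribute.

```java
import java.util.Base64;

// Hypothetical helper for embedding an image directly in HTML.
final class DataUri {
    // Builds "data:<mimeType>;base64,<encoded bytes>".
    static String toDataUri(byte[] imageBytes, String mimeType) {
        return "data:" + mimeType + ";base64,"
                + Base64.getEncoder().encodeToString(imageBytes);
    }

    // Wraps the data URI in an img tag.
    static String imgTag(byte[] imageBytes, String mimeType) {
        return "<img src=\"" + toDataUri(imageBytes, mimeType) + "\"/>";
    }
}
```

The bytes for a classpath image such as abc.png could be read through getResourceAsStream, and the resulting tag put into the Velocity context as a plain String.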
I followed your code to add the image path in the Velocity transformer in the following way; the String logo gets its value from Spring beans.
public final class VelocityMessageTransformer extends AbstractMessageTransformer
{
private VelocityEngine velocityEngine;
private String templateName;
private Template template;
//This part is for getting the value from property file by declaring setter and getter for fileName and subscriberName
private String logo;
public String getLogo() {
return logo;
}
public void setLogo(String logo) {
this.logo = logo;
}
//This part is for getting template for email from classpath configured in mule flow
public VelocityMessageTransformer()
{
registerSourceType(Object.class);
setReturnDataType(new SimpleDataType<String>(String.class));
}
public void setVelocityEngine(final VelocityEngine velocityEngine)
{
this.velocityEngine = velocityEngine;
}
public void setTemplateName(final String templateName)
{
this.templateName = templateName;
}
@Override
public void initialise() throws InitialisationException
{
try
{
template = velocityEngine.getTemplate(templateName);
}
catch (final Exception e)
{
throw new InitialisationException(e, this);
}
}
@Override
public Object transformMessage(final MuleMessage message, final String outputEncoding)throws TransformerException
{
try
{
final StringWriter result = new StringWriter();
FileDataSource myFile = new FileDataSource (new File (logo)); // It contains path of image file
message.setOutboundProperty("logo", myFile);
// -------------------------------------------------------
final Map<String, Object> context = new HashMap<String, Object>();
context.put("message", message);
context.put("payload", message.getPayload());
context.put("logo", message.getOutboundProperty("logo"));
template.merge(new VelocityContext(context), result); //Merging all the attributes
System.out.println("MAIL WITH TEMPLATE SEND SUCCESSFULLY !!!");
System.out.println( result.toString() );
return result.toString();
}
catch (final Exception e)
{
throw new TransformerException(
MessageFactory.createStaticMessage("Can not transform message with template: " + template)
, e);
}
}
}

FlyingSaucer: convert an HTML document to PDF ignoring external CSS?

I'm using the following to convert HTML to PDF:
InputStream convert(InputStream fileInputStream) {
PipedInputStream inputStream = new PipedInputStream()
PipedOutputStream outputStream = new PipedOutputStream(inputStream)
new Thread({
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
factory.setNamespaceAware(false);
DocumentBuilder builder = factory.newDocumentBuilder();
Document document = builder.parse(fileInputStream)
ITextRenderer renderer = new ITextRenderer()
renderer.setDocument(document, "")
renderer.layout()
renderer.createPDF(outputStream)
}).start()
return inputStream
}
From the documentation, apparently I should be able to set a "User Agent" resolver somewhere, but I'm not sure where, exactly. Anyone know how to ignore external CSS in a document?
Not the same question, but my answer to that one works here too: Resolving protected resources with Flying Saucer (ITextRenderer)
Override this method:
public CSSResource getCSSResource(String uri) {
return new CSSResource(resolveAndOpenStream(uri));
}
with
public CSSResource getCSSResource(String uri) {
return new CSSResource(new ByteArrayInputStream([] as byte[]));
}