Reading input character by character from a JEditorPane in Java Swing

I am trying to create an HTML editor using a JEditorPane. I want to read the input from the JEditorPane character by character and store it in a string. For example, if the user types <h, I want to read those two characters and suggest matching tags, in this case <html>, <header>, <head>, etc. (i.e. all tags starting with 'h'). I can't work out which function to use to read each character from the JEditorPane as soon as the user types it.

You can use a DocumentListener. Read the section from the Swing tutorial on How to Write a DocumentListener for more information and examples.
If you are creating an editor that just displays the text, not the actual formatting, then you should use a JTextArea or a JTextPane. A JEditorPane is really only for displaying existing HTML files.
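As a rough illustration of the DocumentListener approach, here is a minimal sketch in Java. The class name TagSuggester, its helper methods, and the short tag list are hypothetical, made up for this example. It assumes the component holds plain text, as with a JTextArea or a JTextPane. On every insert or delete it looks back to the last '<' and collects the tags that start with what has been typed since.

```java
import javax.swing.event.DocumentEvent;
import javax.swing.event.DocumentListener;
import javax.swing.text.BadLocationException;
import javax.swing.text.Document;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: tracks the tag the user is currently typing.
class TagSuggester implements DocumentListener {
    // Illustrative tag list only; a real editor would use a fuller set.
    private static final String[] TAGS = {"html", "head", "header", "h1", "body", "br"};

    // Most recent suggestions, refreshed on every insert or delete.
    private List<String> latest = new ArrayList<>();

    public List<String> getLatest() {
        return latest;
    }

    // Returns the text after the last '<', or null if no tag name is being typed.
    static String openTagPrefix(String text) {
        int lt = text.lastIndexOf('<');
        if (lt < 0) {
            return null;
        }
        String prefix = text.substring(lt + 1);
        // A '>' or whitespace after '<' means the tag name is already finished.
        for (int i = 0; i < prefix.length(); i++) {
            char c = prefix.charAt(i);
            if (c == '>' || Character.isWhitespace(c)) {
                return null;
            }
        }
        return prefix;
    }

    // Returns every known tag that starts with the given prefix.
    static List<String> suggest(String prefix) {
        List<String> out = new ArrayList<>();
        if (prefix == null || prefix.isEmpty()) {
            return out;
        }
        String lower = prefix.toLowerCase();
        for (String tag : TAGS) {
            if (tag.startsWith(lower)) {
                out.add("<" + tag + ">");
            }
        }
        return out;
    }

    private void update(DocumentEvent e) {
        Document doc = e.getDocument();
        try {
            // After an insert, the new text ends at offset + length;
            // after a removal, the edit position is just the offset.
            int end = e.getType() == DocumentEvent.EventType.INSERT
                    ? e.getOffset() + e.getLength()
                    : e.getOffset();
            latest = suggest(openTagPrefix(doc.getText(0, end)));
        } catch (BadLocationException ex) {
            latest = new ArrayList<>();
        }
    }

    @Override public void insertUpdate(DocumentEvent e) { update(e); }
    @Override public void removeUpdate(DocumentEvent e) { update(e); }
    @Override public void changedUpdate(DocumentEvent e) { /* attribute changes only */ }
}
```

The listener would be attached with something like textComponent.getDocument().addDocumentListener(new TagSuggester()), and getLatest() could then feed a popup. Unlike a KeyListener, this also sees pasted text and programmatic changes.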

A KeyListener worked for me. With a KeyListener we can capture the user's keystrokes as they are typed.

Related

write_html() method in fpdf not using font/encoding specified

I'm creating a PDF with a large collection of quotes that I've imported into Python with docx2python, using html=True so that they have some tags. I've done some processing on them so they only really have the bold, italics, underline, or break tags. I've sorted them and am trying to write them onto a PDF using the fpdf library, specifically the pdf.write_html(quote) method. The trouble comes with several special characters I have, so I am hoping to encode the PDF as UTF-8. To write with .write_html(), I had to create a new class as shown in their readthedocs under the .write_html() method at the very bottom of the left-hand side:
from fpdf import FPDF, HTMLMixin

class htmlFPDF(FPDF, HTMLMixin):
    pass

pdf = htmlFPDF()
pdf.add_page()
# set the overall PDF to utf-8 to preserve special characters
pdf.set_doc_option('core_fonts_encoding', 'utf-8')
pdf.write_html(quote)  # quote is one of the HTML-tagged quote strings
The quotes going into the PDF all appear with their special characters and HTML tags (<u> or <i>) intact in the debugger, but after the .write_html() step they show up in the PDF file as mojibake, even before being saved, as seen through the debugger. An example is "dayâ€ÂTMs demands", when it should be "day's demands" (the apostrophe is a curly one in the quote, but this text box doesn't support it).
I've tried updating the font I use by
pdf.add_font('NotoSans', '', 'NotoSans-Regular.ttf', uni=True)
pdf.set_font('NotoSans', '', size=12)
added after the .add_page() method, but this doesn't change the current font (or fix the mojibake) on the PDF unless I use the more common .write(text_height, quote) method, which renders the underline/italicize tags into the PDF as text. The .write() method does preserve the special characters. I'm not really trying to change the font, just to make sure that what's written onto the PDF preserves the special characters rather than turning them into mojibake.
I've also attempted some .encode/.decode action before going into .write_html(), as well as some methods from the ftfy library. I also tried adding '' to the start of each quote, to no effect.
If anyone has ideas for a way to iterate through each line on the PDF that'd be terrific, since then I could use ftfy to fix the mojibake. But ideally, it would be some other html tag at the start of each quote or a way to change the font/encoding of the .write_html() method, maybe in the class declaration?
Or if I'm at a dead-end and should just split each quote on '<', use if statements to detect underlines, italicize, etc., and use the .write() method after all.
Extracting docx to HTML works really badly with docx2python; I tried this a few months ago. I recommend PyDocX. docx2python is good for extracting the content of a docx file, not for converting it into HTML.

Saving text as HTML from form

I have a form with a text field that users input text into. They can use multiple lines, put in bold text, underlined text, etc., but when the text is saved to SQL Server, none of the formatting is saved, just the text. What is the best way to save the text with its HTML so that the formatting is preserved when another user pulls it up from SQL Server and views it?
Ex.
hello
Paul
This would be saved as
helloPaul
You can't see it, but there are bold and carriage return HTML tags wrapped around the text.
When receiving data from the user, on the server side code, use HTML encode to safely store the data:
var inputData = Server.HtmlEncode("<strong>some data input from user</strong>"); //insert your user input data variable here
Then when displaying the data in your cshtml page, decode the data to display it as the user entered it:
HttpUtility.HtmlDecode(saveUserDataFromDatabaseVariable);
All this is assuming you have a rich text editor being plugged into the input field. CKEditor and TinyMCE are good ones.
You can use a text editor. Take a look at CKEditor. It's free and easy to use :)
Can you post some code and more details?
I have had good success with CKEditor. It is customizable, and its content can easily be saved via postback to a standard asp:TextBox.
It is possible that the editor you are using is not actually updating the input/textarea that you are using; it may be cloning the text and drawing the formatting in an overlay. You can use developer tools, or JavaScript, to verify this by checking the value property of the input or textarea element. If it is being saved via AJAX or JavaScript, the code may be using the textContent or innerText properties instead of innerHTML.
I used the richtexteditor DLL that's free online. It gave me a WYSIWYG box that the user can edit text in.

Common Lisp: printing the tab character with the format function

I wish to print the tab character with the format function. I can achieve this with ~C and then placing #\tab as an argument to format, but this seems a bit verbose as for a newline one can simply place a ~% in the string.
What is the most commonly used practise for printing tabs with the format function?
Thanks for all the help!
There is no notation for the tab character in FORMAT.
There are several choices, but none is really good.
Use #\tab (or a variable set to the character) as the argument, as you mention. This is okay for me.
Embed a literal tab character in the string. This may break with some editor settings, where the editor replaces tabs with spaces. It's also not directly visible.
Use a function in a format string that writes a tab character.
Use a reader macro to introduce extended string syntax. Probably not bad. One may even exist already; there was a post on comp.lang.lisp with an example.

How can I remove or escape new line-carriage returns within an XML string in XSL?

I've got an ASP multiline textbox that saves user defined text to a database. When I retrieve the data, I serialize it into xml and then run it through an XSL transform to output my HTML.
Within my transform, I am passing the textbox defined data into a javascript function via an onclick event of an element.
The problem I'm running into...when a user enters a carriage return into the textbox and saves it to the database, a javascript error is generated on page load.
I'm using .NET's XslCompiledTransform to do the transform. There is a property on XmlDocument called PreserveWhiteSpace, default is false, that can be set to strip out white space in the XML. This solves the problem of not allowing a user to enter breaking text, however, the client wants to preserve the formatting of the text that they enter if at all possible.
From what I know, .NET's XslCompiledTransform transforms carriage return/new line pairs into &#xD;&#xA;. I believe these are the characters that are breaking the javascript.
My first thought was to strip out the carriage returns within the xsl prior to passing the string into the javascript function, but I've not been able to figure out what characters to "search" the string for.
I guess another question is what characters get stored in SQL for carriage returns from within an ASP.NET textbox control?
Looking directly at the data in the database, SQL seems to display the character as "non-displayable" characters or (2 empty boxes).
Has anyone had any experience with this type of thing?
I was able to do this in the code behind to get my desired results:
using (StringWriter sWriter = new StringWriter())
{
    xTrans.Transform(xDoc, xslArgs, sWriter);
    // Replace the serialized CR/LF character references with an escaped \r\n for JavaScript
    return sWriter.ToString().Replace("&#xD;&#xA;", "\\r\\n");
}
One other thing that I've stumbled across...
Initially, I wanted to find a solution to this problem that did not require a "compiled" code change, i.e. a way to do this within the XSL as a short-term fix.
I tried this first and was not successful...
<xsl:variable name="comment" select="normalize-space(.\Comment)" />
This essentially did nothing and I was still receiving the javascript error.
Eventually, I tried this...
<div onclick="Show('{normalize-space($comment)}')"></div>
The second actually worked in stripping out the white space, thus, the javascript error was avoided. This wasn't a complete solution for my requirements because it only solved the issue of the javascript error, however, it would effectively prevent the user from "breaking" the page.
For that reason, it could suffice as a short term solution.

Source text contains simple HTML. How can I simply format the text in MS Word?

I've inherited a project that stores basic HTML formatting (e.g. <b> and <i> tags) in a database and writes it out to a Word document. This is my first Word automation assignment, so be gentle!
Currently, there is a complicated function that runs after the document is complete and searches for and replaces these tags. However, because it runs after the document is complete, any layout decision made at run time (e.g. where to insert a page break) can lead to disastrous results. For example, if I have a large chunk of bolded text, the bold text takes up more space and pushes the line break down to the next page, resulting in a mostly blank page.
I believe the fix for this is to format the text as it comes from the database so the positioning logic will be correct. I don't want to call the complicated procedure multiple times as it is time consuming and our end users need this document as quickly as possible.
Is there an easy way to write HTML formatted text to a Word document without needing to find and replace every supported tag? I would think that there would be something within Word that could handle this automatically. Thanks in advance if you can point me in the right direction.
Try this:
First, save the HTML you are about to insert as an ordinary ".htm" file.
Then use the Range object and its InsertFile method to insert the ".htm" file at any given position:
Dim r As Range
Set r = ActiveDocument.Range
r.InsertFile FileName:=TempFilePath, Link:=False, ConfirmConversions:=False
Word should be smart enough to handle the HTML and do all of the format conversion on its own. Use CSS to control the finer parts of the formatting.
Delete the ".htm" file when done.
Maybe you can invoke an embedded IE (IWebBrowser2) to lay out the text, then copy it to the clipboard as rich text, and finally paste it into Word as RichText (formatted).