How to copy paragraph_styles from one document to another in python-docx? - python-docx

I am working from a template file that is missing some of the base paragraph styles (WD_STYLE_TYPE.PARAGRAPH).
If I list all the styles of type WD_STYLE_TYPE.PARAGRAPH included in the template, I get:
Normal
header
footer
Matrix Text
Body Text
toc1
Body
Balloon Text
Caption
annotation text
annotation subject
Body Text Indent
EY Document title
footnote text
EY Body text (with para space)
List Paragraph
To get around this, I created a blank Document object and built a paragraph_styles list from its default WD_STYLE_TYPE.PARAGRAPH styles. This gives me the following styles:
Normal
Heading 1
Heading 2
Heading 3
Heading 4
Heading 5
Heading 6
Heading 7
Heading 8
Heading 9
No Spacing
Title
Subtitle
List Paragraph
Body Text
Body Text 2
Body Text 3
List
List 2
List 3
List Bullet
List Bullet 2
List Bullet 3
List Number
List Number 2
List Number 3
List Continue
List Continue 2
List Continue 3
macro
Quote
Caption
Intense Quote
TOC Heading
The style I want is 'List Bullet', but I can't seem to add the styles from the temporary Document to the main Document, which is based on the template.

Make sure you read and understand this page in the documentation:
http://python-docx.readthedocs.io/en/latest/user/styles-understanding.html
And this one that immediately follows it:
http://python-docx.readthedocs.io/en/latest/user/styles-using.html
Basically you add a paragraph with the desired style to the template, then delete that paragraph, and then save the template.
Setting a paragraph to a "latent" style (such as ListBullet) causes that style to be added to the document. When you delete the paragraph, the style remains. When you save the document, that style is available for later documents that use it as a template.

Related

PyMuPDF detects two paragraphs as one text block when their coordinates are close

I face a problem when I use fitz to detect PDF layout: two paragraphs are detected as one text block if the line margin between them is small.
For example, I want to detect the text and the isolated formula as two text blocks, but fitz currently detects them as one text block. How could I handle this?
Should I detect word coordinates and sort them in normal reading order, or use some similar method?
PyMuPDF also has ways to adjust the granularity of text extraction: there are more levels between and beyond block extraction and word extraction.
You can extract by line, by text span (both are a higher level than word) and by character (the level below word). All of them deliver wrapping rectangles of the respective text, plus a plethora of text font properties (font size, font weight, font style, font color) and the writing direction.
Here is an example that extracts lines of text:
import fitz  # PyMuPDF

# `page` is assumed to be a fitz.Page that was loaded elsewhere
details = page.get_text("dict", flags=fitz.TEXTFLAGS_TEXT)  # skips images!
for block in details["blocks"]:  # delivers the block level
    for line in block["lines"]:  # the lines in this block
        bbox = fitz.Rect(line["bbox"])  # wraps this line
        # join the text of all spans to get the full text of the line
        line_text = "".join([span["text"] for span in line["spans"]])
Please do have a look at this picture in the documentation - it shows an overview of the dictionary layout: https://pymupdf.readthedocs.io/en/latest/_images/img-textpage.png.
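Once you have line-level bounding boxes, one possible way to separate a paragraph from an isolated formula is to start a new group whenever the vertical gap between consecutive lines exceeds a threshold. The following is only a minimal sketch of that idea in plain Python. The function name split_by_vertical_gap, the max_gap parameter and the sample coordinates are assumptions made for illustration and are not part of PyMuPDF. It expects (bbox, text) pairs such as those collected in the loop above, with y growing downward.

```python
from typing import List, Tuple

# (x0, y0, x1, y1) with y growing downward, as in PDF page coordinates used by PyMuPDF
BBox = Tuple[float, float, float, float]


def split_by_vertical_gap(
    lines: List[Tuple[BBox, str]], max_gap: float = 4.0
) -> List[List[str]]:
    """Group lines into paragraphs; a gap larger than max_gap starts a new group."""
    groups: List[List[str]] = []
    prev_bottom = None
    # Sort top-to-bottom, then left-to-right, to approximate reading order.
    for bbox, text in sorted(lines, key=lambda item: (item[0][1], item[0][0])):
        top, bottom = bbox[1], bbox[3]
        if prev_bottom is None or top - prev_bottom > max_gap:
            groups.append([])  # gap too large: begin a new paragraph
        groups[-1].append(text)
        prev_bottom = bottom if prev_bottom is None else max(prev_bottom, bottom)
    return groups


# Hypothetical sample: two lines of body text followed by an isolated formula.
sample = [
    ((72, 100, 300, 112), "First line of the paragraph,"),
    ((72, 114, 280, 126), "second line of the paragraph."),
    ((120, 140, 250, 156), "E = mc^2"),
]
paragraphs = split_by_vertical_gap(sample, max_gap=6.0)
```

A suitable max_gap depends on the font size and line spacing of the document, so it would probably need tuning per file.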

Wrapping html to other half of the page on page break and handling headings after the page break

I've got a table with very few columns but many rows on an html document. I'm trying to save paper during printing and I'm wondering if there is a way to wrap the table to the other half of the page.
It needs to keep the headings of the table on the other half as well like it does during a page break.
Another issue that comes up is that there are nested headers that need to repeat as well. For example:
Heading 1 Company A
    Heading 1.1 Branch A
        Some Data
    Heading 1.2 Branch B
        Some Data
Heading 2 Company B
    Heading 2.1 Branch C
        Some Data
    Heading 2.2 Branch D
        Some Data
That means that if there is still data left for either of the subheadings, they also need to be repeated after the page break.
I'm also trying to use plain HTML, so I can't use any libraries.
So far I have tried a normal HTML table and a flexbox-based table.

Is it possible to save the paragraph entered in the text area tag including the line breaks?

In the text area I'm typing the input as two paragraphs, but it's returned as a single paragraph.
This is the code. I want the input returned as two paragraphs, exactly as I typed it,
but it's returned as a single paragraph.

Duplicated Gutenberg Block content using Advanced Custom Fields to create Gutenberg blocks

I'm using Advanced Custom Fields Gutenberg Blocks. I have a very simple block which just displays a heading, some text and a button. I want to be able to place this block on the page multiple times, so I have allowed duplicate blocks on the page with SupportsMultiple: true.
I am using get_field('heading') to retrieve the value. This works perfectly for 1 block on 1 page. But when I try two blocks, the content is duplicated: the Block 2 title is overwritten by the Block 1 title. Block 2 is getting the data from Block 1 instead of its own data.

Best practices: displaying text that was input via multi-line text box

I have a multi-line text box. When users simply type away, the text box wraps the text, and it's saved as a single line. It's also possible that users may enter line breaks, for example when entering a "bulleted" list like:
Here are some suggestions:
- fix this
- remove that
- and another thing
Now, the problem occurs when I try to display the value of this field. In order to preserve the formatting, I currently wrap the presentation in <pre> - this works to preserve user-supplied breaks, but when there's a lot of text saved as a single line, it displays the whole text block as single line, resulting in horizontal scrolling being needed to see everything.
Is there a graceful way to handle both of these cases?
The easiest way of dealing with this is turning all line breaks \n into <br> line breaks. In PHP for example, this is done using the nl2br() function.
If you want something a bit more fancy - like the list you quote getting converted into an actual HTML <ul> for example - you could consider a simple "language" like Markdown that SO uses. It comes with natural, simple rules like
# Heading 1
## Heading 2
### Heading 3
* Unordered List item
* Unordered List item
1. Numbered List item
2. Numbered List item
etc....
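As a rough illustration of the "bit more fancy" option, here is a minimal sketch in Python that turns dash- or asterisk-prefixed lines into an HTML <ul> and every other non-empty line into a <p>. It is not a Markdown implementation, and the function name lines_to_html is an assumption made for this example. For real use, an existing Markdown library would be the safer choice.

```python
from html import escape


def lines_to_html(text: str) -> str:
    """Convert '- item' / '* item' lines to <ul> items and other lines to <p>."""
    out = []
    in_list = False
    for line in text.splitlines():
        stripped = line.strip()
        if stripped.startswith(("- ", "* ")):
            if not in_list:
                out.append("<ul>")  # open the list on its first item
                in_list = True
            out.append(f"<li>{escape(stripped[2:])}</li>")
        else:
            if in_list:
                out.append("</ul>")  # a non-item line closes the list
                in_list = False
            if stripped:
                out.append(f"<p>{escape(stripped)}</p>")
    if in_list:
        out.append("</ul>")
    return "\n".join(out)


sample = "Here are some suggestions:\n- fix this\n- remove that\n- and another thing"
html_out = lines_to_html(sample)
```

The user input is escaped before it is wrapped in tags, so any markup the visitor typed is displayed as text rather than interpreted.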
You can use the PHP function nl2br(). It transforms line breaks into <br /> elements.
Convert newline characters to <br /> tags explicitly, and let the browser word-wrap the text normally. That preserves the breaks the visitor entered, without harming other paragraphs.
You could replace line breaks with HTML line breaks.
Replace "\r\n" or "\n" (depending on the browser and platform, check first for longer one) with <br/>.
I would normally replace all CR/LF with LF, and then replace all LF with <br />. You can then render this text inside any HTML container you want and let it flow naturally.
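For readers not using PHP, the normalize-then-replace approach described in these answers can be sketched in Python roughly as follows. This is a minimal sketch, not a drop-in equivalent of nl2br(), and the function name newlines_to_br is an assumption made for illustration. It also HTML-escapes the input first, which the answers above do not mention but which is usually wanted when displaying user-supplied text.

```python
from html import escape


def newlines_to_br(text: str) -> str:
    """Normalize CR/LF and CR to LF, escape the text, then insert <br /> tags."""
    normalized = text.replace("\r\n", "\n").replace("\r", "\n")
    # Escape first so that the inserted <br /> tags are not escaped themselves.
    return escape(normalized).replace("\n", "<br />\n")


rendered = newlines_to_br("fix this\r\nremove that\nuse <b> sparingly")
```

The result can be placed in any ordinary HTML container, where the browser wraps long lines normally while the explicit breaks are preserved.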