When I parse web sites in R (system: R + Debian), the HTML output printed in the console is uncomfortable to read.
The gap between the lines is very wide. How can I narrow it to normal line spacing?
You should see the same output with the following code.
options(encoding="gbk")
library(XML)
baseURL <- "http://www.jb51.net/article/27174.htm"
txt <- readLines(baseURL)
txt
Interesting, it seems that when printing a vector, the longest element decides how all elements will be spaced.
Your longest string is txt[374]: on my screen, it takes 19 lines; that means every element of txt will be printed using 19 lines, with possibly a lot of white space.
You don't have that problem when printing a list, so a solution is to do:
print(as.list(txt))
Try using gsub() to replace the spaces with nothing.
Related
I'm having a strange issue in jQuery DataTables where a column resizes and wraps its contents. When the column contains a URL like "http://foo.bar.co.uk?test=true&derp=reallyLongParam12345678901234578", it breaks it onto 2 lines at the ?. However, when it contains a link like "http%3A%2F%2Fdirectproducts.go2cloud.org%2Faff_c%3Foffer_id%3D149&aff_id=1041&aff_sub=white_bluepreland", it doesn't know how to split it.
Is there any way to specify more characters at which it can break onto the next line? This is causing my table to be wider than the screen.
Note
The badly formatted URL has been fixed and is just an example.
I have a label that takes its value from a database, and the value runs to more than 20 lines.
When the label is displayed in a PDF and is very long, it renders on the second page, leaving my first page half blank.
I want the label to break across the two pages, so that it starts on the first page and continues on the second. How can I do that?
Unfortunately, determining where to put page breaks on the fly is a weak point of SSRS.
Perhaps you could break up the long text into multiple rows in the data source (splitting on spaces between words). This would result in funny looking breaks in the output as you won't know for sure where the break will appear in a line on the printed report.
If the text has reasonably sized paragraphs, you could parse it out that way instead using line breaks.
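To illustrate the word-based splitting suggested above, here is a minimal Python sketch that could run before the data reaches the report. The function name `split_into_rows` and the `max_chars` limit are hypothetical, and the right chunk size depends on your font and textbox width, so treat this as a starting point only.

```python
import textwrap

def split_into_rows(text, max_chars=80):
    """Split text at spaces into chunks of at most max_chars characters.

    Note: textwrap.wrap collapses line breaks into spaces and will cut
    a single word that is longer than max_chars.
    """
    return textwrap.wrap(text, width=max_chars)

# Each chunk would become one row in the data source.
rows = split_into_rows("The quick brown fox jumps over the lazy dog", max_chars=16)
for row in rows:
    print(row)
```

Because the chunks are measured in characters rather than rendered width, the breaks will not line up exactly with the printed lines, which is the "funny looking breaks" caveat mentioned above.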
I am using Sphinx to build LaTeX and HTML documents with a lot of figures and enumerated lists. When I use figures in the middle of text outside of enumerated lists, the spacing is fine in both LaTeX and HTML, with and without captions. There is about a line of space above and below, which is acceptable. However, when I try to use a figure within an enumerated list, as in the example below, the spacing is bad in HTML.
#. Here is an item in the list, above the figure

   .. figure:: _images/myimage.png
      :align: center
      :width: 80 %

#. Here is another item below the figure.
The result of the above code is the bottom of the figure is right up against the next item in the list. There is no spacing between them, and this looks bad. This can be fixed in HTML by using the | character at the end of the figure to add a little space, but in the LaTeX output, this causes a DUlineblock environment that adds way too much space in the pdf.
Is there a way to simply add a single blank line after the figure in both HTML and Latex?
You can enter empty lines with:
text
|
text
I found that the replacement:
.. |br| raw:: html

   <br />
works well for adding a blank line after a figure in enumerated lists. Since it's a raw substitution, it only affects HTML, and the figure spacing in LaTeX is fine without modification.
I have a report in SQL Server Reporting Services 2005. It makes use of a page header and footer and has no subreports. The body portion contains a few smaller elements and then a simple single column table. The table has a single header row and a single detail row. The header is just a label, basically. The detail row is a single textbox with a simple Fields!FieldName.Value as its output.
The problem is that FieldName, in this case, is a highly variable length string. It can be a sentence up to 8000 characters (usually no more than 2 pages worth). The text can contain line/paragraph breaks (returns) but no other special formatting. Everything is fine so long as the content fits on one page. Once the text exceeds a single page (8.5x11), the text is very nastily cut off abruptly. Since this is a pagination problem, it is only visible when exporting to PDF or when viewing the report in Print Layout.
It seems as though there is a maximum size the row can grow to on the first page and then it chops it off and starts it up on the second. But this cutoff is not carefully managed in relation to the text. It can occur right in the middle of a line, causing it to show the top halves of the letters on the first page and the bottom halves at the top of the second page.
Obviously, this is unacceptable, as it looks very unprofessional and can impair the readability of the line that was so messily split. I also can never be sure it'll split badly, as sometimes it more or less ends the page evenly, though usually I can still see the hanging tails of certain letters on the next page (g and p for instance).
The secondary problem is that I'd really like the table row header to repeat on each page. Setting the obvious property, "RepeatOnNewPage" has no effect. I suspect this is because it's still trying to show the single really vertically tall row. It seems like it's okay repeating headers and splitting pages nicely between detail rows. But because this is basically just a big block of text, and thus just one really tall row, it doesn't split it nicely.
What can I do or use to solve this problem? I can live without the repeating header so long as it just doesn't cut off text in the middle of a line.
Unfortunately, page break fine tuning is one of the biggest weak points of SSRS.
I can only suggest that you break up the long text into multiple rows before SSRS ever gets it. You'd want to parse the text to look for word breaks. The result will be odd looking breaks in the output since you won't know where the break will come on a line in the printed report. However, it'd be much more readable than cutting text in half.
If the text is comprised of reasonably sized paragraphs, you could parse it out that way instead.
You might even go so far as to measure the text using SQLCLR and the System.Drawing.Graphics.MeasureString method to fine-tune the output, but I wouldn't recommend that route for the faint of heart.
In SSRS 2008 R2 and Visual Studio 2008:
Left-click (not right-click) a textbox and go to the properties window (lower right side of VS) -> KeepTogether = false.
The text will break cleanly between lines and continue on the next page.
Just thought to add here as searching for this doesn't return many results.
I have done what JC has suggested in the past where I've broken down the text into paragraphs and each paragraph would in effect be its own row. Works pretty well given the limitations of SSRS.
One thing to be careful about is that you would need to make sure that your paragraphs sort properly. In most cases it would display them in the correct order, but adding in a column with sortID to give some sorting hints to the table would probably be a good idea.
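For what it's worth, here is a rough Python sketch of the paragraph approach with a sort hint, assuming the text is pre-processed before it reaches the dataset. The function name `paragraphs_with_sort_id` is made up for illustration, and the integer it emits plays the role of the sortID column mentioned above.

```python
import re

def paragraphs_with_sort_id(text):
    """Return (sort_id, paragraph) pairs, one per non-empty paragraph."""
    # Normalize Windows and bare-CR line endings to a plain line feed.
    normalized = text.replace("\r\n", "\n").replace("\r", "\n")
    # Split on runs of line breaks, trim each piece, and drop empty ones.
    pieces = [p.strip() for p in re.split(r"\n+", normalized)]
    non_empty = [p for p in pieces if p]
    # Number the paragraphs from 1 so the table can sort on the id.
    return list(enumerate(non_empty, start=1))

rows = paragraphs_with_sort_id("First.\r\n\r\nSecond.\nThird.\n")
for sort_id, paragraph in rows:
    print(sort_id, paragraph)
```

Each pair would become one detail row, with the table sorted on the id so the paragraphs keep their original order.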
In the end, the cut-off-text problem was due to non-standard padding on the textbox in question.
For whatever reason, having padding any greater than the defaults (2pt all around) seemed to cause its pagination to go sour. I imagine it is due to the algorithm not taking padding into consideration when deciding where to break the paragraph. With default padding, the line always ends cleanly and nicely on each page.
As a workaround (since I liked the extra white space the padding gave to the layout), I used a rectangle to achieve the border and made the textbox inside it smaller than the rectangle by about an eighth of an inch. This gave the box some inner padding while still apparently allowing the pagination to correctly determine when to break up lines.
Still, a lot of unnecessary headache.
I have a multi-line text box. When users simply type away, the text box wraps the text, and it's saved as a single line. It's also possible that users may enter line breaks, for example when entering a "bulleted" list like:
Here are some suggestions:
- fix this
- remove that
- and another thing
Now, the problem occurs when I try to display the value of this field. In order to preserve the formatting, I currently wrap the presentation in <pre>. This works to preserve user-supplied breaks, but when there's a lot of text saved as a single line, it displays the whole text block as a single line, so horizontal scrolling is needed to see everything.
Is there a graceful way to handle both of these cases?
The easiest way of dealing with this is turning all line breaks \n into <br> line breaks. In PHP for example, this is done using the nl2br() function.
If you want something a bit more fancy - like the list you quote getting converted into an actual HTML <ul> for example - you could consider a simple "language" like Markdown that SO uses. It comes with natural, simple rules like
# Heading 1
## Heading 2
### Heading 3
* Unordered List item
* Unordered List item
1. Numbered List item
2. Numbered List item
etc....
You can use the PHP function nl2br(). It transforms line breaks into <br /> elements.
Convert newline characters to <br /> tags explicitly, and let the browser word-wrap the text normally. That preserves the breaks the visitor entered, without harming other paragraphs.
You could replace line breaks with HTML line breaks.
Replace "\r\n" or "\n" (depending on the browser and platform; check for the longer one first) with <br/>.
I would normally replace all CR/LF with LF, and then replace all LF with <br />. You can then render this text inside any HTML container you want and let it flow naturally.
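The normalize-then-replace approach described above might look something like this in Python. The function name `text_to_html` is hypothetical. Escaping the text first and handling bare CR are additions of mine that the answers above don't mention; the escaping is there on the assumption that the field holds untrusted user input.

```python
import html

def text_to_html(text):
    """Convert user-entered line breaks to <br /> tags."""
    # Assumption: the input is untrusted, so escape <, >, & and quotes first.
    escaped = html.escape(text)
    # Replace CR/LF with LF (and, as an extra, bare CR with LF).
    normalized = escaped.replace("\r\n", "\n").replace("\r", "\n")
    # Replace every LF with an HTML line break; the trailing newline only
    # keeps the generated markup readable.
    return normalized.replace("\n", "<br />\n")

result = text_to_html("Here are some suggestions:\r\n- fix this\n- remove that")
print(result)
```

The result can be rendered inside a normal container such as a div or p, so long lines wrap naturally while the user's own breaks are kept.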