Break word at specific character - html

I realise that similar questions have been asked, but none quite like this.
I have a situation where I am using BEM to display some classes in code tags. Below is an example:
Obviously the default behaviour is to break words at a hyphen, as we can see is happening in the example. Is there a way that I can control what characters the line-break occurs at? I would like to be able to have class name integrity maintained so that the line break occurs before each period . if necessary.

I have another solution using jquery,
$('.mypara').each(function () {
var str = $(this).html();
var htmlfoo = str.split('.').join('<br>'); // <br> is a void element; "</br>" is not valid HTML
$(this).html(htmlfoo);
});
<script src="https://code.jquery.com/jquery-3.3.1.min.js"></script>
<code class="mypara">
This is-the HTML if its - characters exceed 50. characters it should go-to next line
</code>
<code class="mypara">
This is the HTM-if its. characters exceed 50 - characters it should. go-to next-line
</code>

Unfortunately I don't think there is a way to do everything you want with pure CSS.
UPDATE: removed spaces before periods in JS solution.
If you are able to use JavaScript you could process the code tag's contents to disable wrapping for words with hyphens and you could wrap each block starting with a period in an inline-block span.
The following code breaks the contents of each code tag into a list of blocks that start with either space or period. Each block is wrapped with a span that prevents wrapping, and blocks that begin with a period are additionally marked as display: inline-block;. This should give the behaviour you are looking for, and additionally preserve all content when copy-pasting text.
CSS:
.no-wrap-hyphen {
white-space: nowrap;
}
.wrap-period {
display: inline-block;
}
JavaScript (run this function once, after the page has loaded; the browser re-wraps the spans by itself on resize, and running the function a second time would split the span markup it has already inserted):
function wrapPeriodsNotHyphens() { // run once on window load
  var codes = document.getElementsByTagName( "code" );
  for ( var i = 0; i < codes.length; i++ ) {
    // split before each space or period, keeping the delimiter with its block
    var currentCode = codes[ i ].innerHTML.split(/(?=[ .])/);
    for ( var c = 0; c < currentCode.length; c++ ) {
      // surround each item with nowrap span
      currentCode[ c ] = '<span class="no-wrap-hyphen">' + currentCode[ c ] + '</span>';
      if ( currentCode[ c ].indexOf( '.' ) > -1 ) {
        // let a block that contains a period move to a new line as a unit
        currentCode[ c ] = '<span class="wrap-period">' + currentCode[ c ] + '</span>';
      }
    }
    codes[ i ].innerHTML = currentCode.join('');
  }
}
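The splitting rule can also be tried on a plain string, without a DOM. This is only a sketch: the helper name wrapPeriodBlocks is made up for illustration, and it assumes the input is already HTML-escaped text. The class names are the ones from the CSS above.

```javascript
// Hypothetical helper (not part of the answer above): builds the wrapped
// markup for one string of already-escaped text.
function wrapPeriodBlocks(text) {
  return text
    .split(/(?=[ .])/) // split before each space or period, keeping the delimiter
    .map(function (block) {
      var wrapped = '<span class="no-wrap-hyphen">' + block + '</span>';
      // a block that contains a period may move to a new line as a unit
      return block.indexOf('.') > -1
        ? '<span class="wrap-period">' + wrapped + '</span>'
        : wrapped;
    })
    .join('');
}

console.log(wrapPeriodBlocks('.block__element--modifier.is-active'));
```

Because the split uses a lookahead, nothing is consumed: joining the blocks back together gives the original text, which is why copy-pasting still works.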

Related

How to set a certain number of spaces or indents before a Paragraph in Google Docs using Google Apps Script

I have a 20 line script, and I want to make sure that each paragraph is indented exactly once.
function myFunction() {
/*
This function turns the document's format into standard MLA.
*/
var body = DocumentApp.getActiveDocument().getBody();
body.setFontSize(12); // Set the font size of the document's contents to 12
body.setForegroundColor('#000000');
body.setFontFamily("Times New Roman");
// Loops through paragraphs in body and sets each to double spaced
var paragraphs = body.getParagraphs();
for (var i = 3; i < paragraphs.length; i++) { // Starts at 3 to exclude first 4 developer-made paragraphs
var paragraph = paragraphs[i];
paragraph.setLineSpacing(2);
// Left-align the paragraph.
paragraph.setAlignment(DocumentApp.HorizontalAlignment.LEFT);
// One indent
paragraph.editAsText().insertText(0, "\t"); // Adds one tab every time
}
var bodyText = body.editAsText();
bodyText.insertText(0, 'February 3, 1976\nMrs. Smith\nYour Name Here\nSocial Studies\n');
bodyText.setBold(false);
}
The code I have tried doesn't work. But my expected results are that for every paragraph in the for loop in myFunction(), there are exactly 4 spaces before the first word in each paragraph.
Here is a sample: https://docs.google.com/document/d/1sMztzhOehzheRdqumC6PLnvk4qJgUCSE0irjTZ0FjTQ/edit?usp=sharing
If the user uses Autoformat, but already has the paragraphs indented...
Update
I have investigated use of the Paragraph.setIndentFirstLine() method. When I set it to four, it sets it to 1 space. Now I realize this is because points and spaces are not the same thing. What number do I need to multiply by to get four spaces in points?
Let us consider a few basic indenting operations: manual and by script.
The following image shows how to indent current paragraph (cursor stays inside this one).
Please note, the units are centimetres. Also note, that the paragraph does not include leading spaces or tabs, we have no need of them.
Suppose we would like to get the indent values in the script and apply them to the next paragraph. Look at the code below:
function myFunction() {
var ps = DocumentApp.getActiveDocument().getBody().getParagraphs();
// We work with ps[5] and ps[6], i.e. the 6th and 7th paragraphs (indices are zero-based)
var iFirst = ps[5].getIndentFirstLine();
var iStart = ps[5].getIndentStart();
var iEnd = ps[5].getIndentEnd();
Logger.log([iFirst, iStart, iEnd]);
ps[6].setIndentFirstLine(iFirst);
ps[6].setIndentStart(iStart);
ps[6].setIndentEnd(iEnd);
}
If you run and look at the log, you will see something like this: [92.69291338582678, 64.34645669291339, 14.173228346456694]. No surprise, we have typographic points instead of centimetres. (1cm=28.3465pt) So we can measure and modify any paragraph indent values precisely.
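To go between the units shown in the editor and the values the script sees, a small conversion helper is enough. This is a sketch in plain JavaScript; the function names are my own, and the only facts used are 1 in = 72 pt and 1 in = 2.54 cm. Note that a "space" has no fixed width in points (it depends on the font), so an indent is easier to specify in centimetres or inches.

```javascript
var POINTS_PER_INCH = 72;
var CM_PER_INCH = 2.54;

// Hypothetical helpers: convert between ruler units and typographic points.
function cmToPoints(cm) { return cm * POINTS_PER_INCH / CM_PER_INCH; }
function pointsToCm(pt) { return pt * CM_PER_INCH / POINTS_PER_INCH; }
function inchesToPoints(inches) { return inches * POINTS_PER_INCH; }

console.log(cmToPoints(1).toFixed(4)); // 28.3465
console.log(inchesToPoints(0.5));      // 36
```

For example, assuming a half-inch first-line indent is what you want, the value to pass to setIndentFirstLine() would be inchesToPoints(0.5), i.e. 36.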
Addition
For some reason you might want to control the number of spaces at the beginning of the paragraph. This is also possible by scripting, but it has no effect on the paragraph's "left" or "right" indents.
The sample code below does a similar task: it counts the leading spaces of ps[5] and puts the same number of spaces at the beginning of the next paragraph.
function mySpaces() {
var ps = DocumentApp.getActiveDocument().getBody().getParagraphs();
// We work with ps[5] and ps[6], i.e. the 6th and 7th paragraphs (indices are zero-based)
var spacesCount = getLeadingSpacesCount(ps[5]);
Logger.log(spacesCount);
var diff = getLeadingSpacesCount(ps[6]) - spacesCount;
if (diff > 0) {
ps[6].editAsText().deleteText(0, diff - 1);
} else if (diff < 0) {
var s = Array(1 - diff).join(' ');
ps[6].editAsText().insertText(0, s);
}
}
function getLeadingSpacesCount(p) {
var found = p.findText("^ +");
return found ? found.getEndOffsetInclusive() + 1 : 0;
}
We have used methods deleteText() and insertText() of the class Text for proper corrections and findText() to locate the spaces if any. Note, the last method argument is a string, representing a regular expression. It matches "all leading spaces", if they exist. See more details about regular expression syntax.

Retain newline when getting contenteditable div text

I want to save the text inside a contenteditable div with its formatting preserved. How would I get the preformatted text, rather than text in which \n and \r are omitted?
$('#save').click(function(e) {
var id = "board_code";
var ce = $("<pre />").html($("#" + id).html());
if ($.browser.webkit)
ce.find("div").replaceWith(function() { return "\n" + this.innerHTML; });
if ($.browser.msie)
ce.find("p").replaceWith(function() { return this.innerHTML + "<br>"; });
if ($.browser.mozilla || $.browser.opera || $.browser.msie)
ce.find("br").replaceWith("\n");
alert( ce.text() );
});
http://jsfiddle.net/AD5q7/10/ (this doesn't work)
Try these strings in the contenteditable div.
UPDATE: try each string by typing it. There may be no problem when the string is pasted.
1
abc def
gh i
jkl
2
#include<iostream.h>
#include<conio.h>
int main(){
int grade, passingMark=75;
cout<<"Hi there, please enter your mark: ";
cin>>grade;
if( ((grade >= passingMark)||(grade==35)) && (grade<101)){
cout<<"\nPass!";
}
return 0;//15lines
}
The saved file must also be formatted like this, without \n and \r removed. I'm expecting the alert to include \n.
When a contenteditable div loses focus, its entire text gets converted to HTML. For example,
<div contenteditable="true">Your text is here
and has new line </div>
upon losing focus is converted from the virtual textarea to HTML, i.e.
<div>Your text is here</div><br><div>and has new line </div>
and when you call .text(), you lose the line breaks, because the newline characters no longer exist in that div.
Solution
1. You can use a textarea with its border properties set to 0, which makes it look like a contenteditable div, or
2. You can grab the entire HTML of the contenteditable div and replace the HTML with the corresponding text representation using JavaScript (for that, refer to "javascript regex replace html chars").
Try this
alert( ce.text().replace(/\n/g, "\\n" ).replace(/\r/g, "\\r"));
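Replacing the HTML with its text representation (the second option in the Solution list) could look roughly like the sketch below. It is only an illustration: the helper name editableHtmlToText is made up, the approach is regex-based rather than a real HTML parser, and it assumes the browser produces div, p and br elements for line breaks, which varies between browsers.

```javascript
// Hypothetical helper: turns the innerHTML of a contenteditable element
// into plain text with "\n" line breaks.
function editableHtmlToText(html) {
  return html
    .replace(/<br\s*\/?>\s*<\/(div|p)>/gi, '</$1>') // a <br> that only keeps an empty block open
    .replace(/<(div|p)[^>]*>/gi, '\n')              // each block starts a new line
    .replace(/<br\s*\/?>/gi, '\n')                  // explicit line breaks
    .replace(/<[^>]+>/g, '')                        // drop all remaining tags
    .replace(/&nbsp;/g, ' ')
    .replace(/&lt;/g, '<')
    .replace(/&gt;/g, '>')
    .replace(/&amp;/g, '&')                         // decode &amp; last
    .replace(/^\n/, '');                            // no extra newline when the content starts with a block
}

console.log(JSON.stringify(editableHtmlToText('abc def<div>gh i</div><div>jkl</div>')));
```

With jQuery this would be called as editableHtmlToText($("#board_code").html()), and the result can be saved or alerted as-is.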

highlight words in html using regex & javascript - almost there

I am writing a jquery plugin that will do a browser-style find-on-page search. I need to improve the search, but don't want to get into parsing the html quite yet.
At the moment my approach is to take an entire DOM element and all nested elements and simply run a regex find/replace for a given term. In the replace I will simply wrap a span around the matched term and use that span as my anchor to do highlighting, scrolling, etc. It is vital that no characters inside any html tags are matched.
This is as close as I have gotten:
(?<=^|>)([^><].*?)(?=<|$)
It does a very good job of capturing all characters that are not in an html tag, but I'm having trouble figuring out how to insert my search term.
Input: Any html element (this could be quite large, eg <body>)
Search Term: 1 or more characters
Replace Txt: <span class='highlight'>$1</span>
UPDATE
The following regex does what I want when I'm testing with http://gskinner.com/RegExr/...
Regex: (?<=^|>)(.*?)(SEARCH_STRING)(?=.*?<|$)
Replacement: $1<span class='highlight'>$2</span>
However I am having some trouble using it in my javascript. With the following code chrome is giving me the error "Invalid regular expression: /(?<=^|>)(.*?)(Mary)(?=.*?<|$)/: Invalid group".
var origText = $('#'+opt.targetElements).data('origText');
var regx = new RegExp("(?<=^|>)(.*?)(" + $this.val() + ")(?=.*?<|$)", 'gi');
$('#'+opt.targetElements).each(function() {
var text = origText.replace(regx, '$1<span class="' + opt.resultClass + '">$2</span>');
$(this).html(text);
});
It's breaking on the group (?<=^|>) - is this something clumsy or a difference in the Regex engines?
UPDATE
The regex breaks on that group because JavaScript did not support regex lookbehinds at the time (they were added to the language in ES2018, so older engines still reject them). For reference & possible solutions: http://blog.stevenlevithan.com/archives/mimic-lookbehind-javascript.
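One way to stay with string processing and avoid lookbehind altogether is to split the HTML into tags and text first, and only run the search on the text parts. The sketch below assumes hypothetical helper names (escapeRegExp, highlightOutsideTags) and reasonably well-formed markup. It will not find a match that is interrupted by a tag, and it does not treat script or style contents specially.

```javascript
// Escape characters that have a special meaning in a regular expression,
// so that user input is matched literally.
function escapeRegExp(s) {
  return s.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Hypothetical helper: wraps every match of `term` in a span, but only in
// the text between tags.
function highlightOutsideTags(html, term, className) {
  var re = new RegExp('(' + escapeRegExp(term) + ')', 'gi');
  return html
    .split(/(<[^>]*>)/) // the capturing group keeps the tags, at odd indices
    .map(function (part, i) {
      return i % 2 === 1
        ? part // a tag: leave untouched
        : part.replace(re, '<span class="' + className + '">$1</span>');
    })
    .join('');
}

console.log(highlightOutsideTags('<p class="mary">Mary had</p>', 'mary', 'highlight'));
```

Escaping the search term matters here: without it, a user typing something like "a.b" or "(" would either match too much or throw an "Invalid regular expression" error.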
Just use jQuery's built-in text() method. It will return all the characters in a selected DOM element.
For the DOM approach (docs for the Node interface): Run over all child nodes of an element. If the child is an element node, run recursively. If it's a text node, search in the text (node.data) and if you want to highlight/change something, shorten the text of the node until the found position, and insert a highlight span with the matched text and another text node for the rest of the text.
Example code (adjusted, origin is here):
(function iterate_node(node) {
if (node.nodeType === 3) { // Node.TEXT_NODE
var text = node.data,
pos = text.search(/any regular expression/g), //indexOf also applicable
length = 5; // or whatever you found
if (pos > -1) {
node.data = text.substr(0, pos); // split into a part before...
var rest = document.createTextNode(text.substr(pos+length)); // a part after
var highlight = document.createElement("span"); // and a part between
highlight.className = "highlight";
highlight.appendChild(document.createTextNode(text.substr(pos, length)));
node.parentNode.insertBefore(rest, node.nextSibling); // insert after
node.parentNode.insertBefore(highlight, node.nextSibling);
iterate_node(rest); // maybe there are more matches
}
} else if (node.nodeType === 1) { // Node.ELEMENT_NODE
for (var i = 0; i < node.childNodes.length; i++) {
iterate_node(node.childNodes[i]); // run recursive on DOM
}
}
})(content); // any dom node
There's also highlight.js, which might be exactly what you want.

How to preserve whitespace indentation of text enclosed in HTML <pre> tags excluding the current indentation level of the <pre> tag in the document?

I'm trying to display my code on a website but I'm having problems preserving the whitespace indentation correctly.
For instance given the following snippet:
<html>
  <body>
    Here is my code:
    <pre>
      def some_funtion
        return 'Hello, World!'
      end
    </pre>
  </body>
</html>
This is displayed in the browser as:
Here is my code:
      def some_funtion
        return 'Hello, World!'
      end
When I would like it displayed as:
Here is my code:
def some_funtion
  return 'Hello, World!'
end
The difference is that the current indentation level of the HTML pre tag is being added to the indentation of the code. I'm using nanoc as a static website generator and I'm using google prettify to also add syntax highlighting.
Can anyone offer any suggestions?
PRE is intended to preserve whitespace exactly as it appears (unless altered by white-space in CSS, which doesn't have enough flexibility to support formatting code).
Before
Formatting is preserved, but so is all the indentation outside of the PRE tag. It would be nice to have whitespace preservation that used the location of the tag as a starting point.
After
Contents are still formatted as declared, but the extraneous leading whitespace caused by the position of the PRE tag within the document is removed.
I have come up with the following plugin to solve the issue of wanting to remove superfluous whitespace caused by the indentation of the document outline. This code uses the first line inside the PRE tag to determine how much it has been indented purely due to the indentation of the document.
This code works in IE7, IE8, IE9, Firefox, and Chrome. I have tested it briefly with the Prettify library to combine the preserved formatting with pretty printing. Make sure that the first line inside the PRE actually represents the baseline level of indenting that you want to ignore (or, you can modify the plugin to be more intelligent).
This is rough code. If you find a mistake or it does not work the way you want, please fix/comment; don't just downvote. I wrote this code to fix a problem that I was having and I am actively using it so I would like it to be as solid as possible!
/*!
*** prettyPre ***/
(function( $ ) {
$.fn.prettyPre = function( method ) {
var defaults = {
ignoreExpression: /\s/ // what should be ignored?
};
var methods = {
init: function( options ) {
this.each( function() {
var context = $.extend( {}, defaults, options );
var $obj = $( this );
var usingInnerText = true;
var text = $obj.get( 0 ).innerText;
// some browsers support innerText...some don't...some ONLY work with innerText.
if ( typeof text == "undefined" ) {
text = $obj.html();
usingInnerText = false;
}
// use the first line as a baseline for how many unwanted leading whitespace characters are present
var superfluousSpaceCount = 0;
var currentChar = text.substring( 0, 1 );
while ( context.ignoreExpression.test( currentChar ) ) {
currentChar = text.substring( ++superfluousSpaceCount, superfluousSpaceCount + 1 );
}
// split
var parts = text.split( "\n" );
var reformattedText = "";
// reconstruct
var length = parts.length;
for ( var i = 0; i < length; i++ ) {
// cleanup, and don't append a trailing newline if we are on the last line
reformattedText += parts[i].substring( superfluousSpaceCount ) + ( i == length - 1 ? "" : "\n" );
}
// modify original
if ( usingInnerText ) {
$obj.get( 0 ).innerText = reformattedText;
}
else {
// This does not appear to execute code in any browser but the onus is on the developer to not
// put raw input from a user anywhere on a page, even if it doesn't execute!
$obj.html( reformattedText );
}
} );
}
}
if ( methods[method] ) {
return methods[method].apply( this, Array.prototype.slice.call( arguments, 1 ) );
}
else if ( typeof method === "object" || !method ) {
return methods.init.apply( this, arguments );
}
else {
$.error( "Method " + method + " does not exist on jQuery.prettyPre." );
}
}
} )( jQuery );
This plugin can then be applied using a standard jQuery selector:
<script>
$( function() { $("PRE").prettyPre(); } );
</script>
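The core step of the plugin (use the first line as the baseline, then strip that much from every line) can also be written as a pure function, which is easier to test outside the browser. This is a sketch rather than a drop-in replacement: the name dedentByFirstLine is made up, it assumes the text starts with the first line of code, and unlike the plugin it never cuts into non-whitespace characters on lines that are indented less than the first one.

```javascript
// Hypothetical pure-function version of the plugin's core step.
function dedentByFirstLine(text) {
  // count the leading spaces/tabs of the first line; this is the baseline
  var baseline = text.match(/^[ \t]*/)[0].length;
  return text
    .split('\n')
    .map(function (line) {
      // strip at most `baseline` leading whitespace characters from each line
      var lead = line.match(/^[ \t]*/)[0].length;
      return line.substring(Math.min(baseline, lead));
    })
    .join('\n');
}

console.log(dedentByFirstLine("    def some_function\n      return 'Hello, World!'\n    end"));
```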
Indenting With Comments
Since browsers ignore comments, you can use them to indent your pre tag contents.
Solution
<html>
<body>
<main>
Here is my code with hack:
<pre>
<!-- -->def some_function
<!-- --> return 'Hello, World!'
<!-- -->end
</pre>
Here is my code without hack:
<pre>
def some_function
return 'Hello, World!'
end
</pre>
</main>
</body>
</html>
NOTE: a main wrapper was added to provide enough space for the comments.
Advantages
No JavaScript required
Can be added statically
Minification won't affect the indentation and reduces file size
Disadvantages
Requires a minimum amount of space for the comments
Not very elegant unless build tools are used
Removing Indentation With Node
A better solution is to remove the leading white-space using either your build process or back-end rendering process. If you are using node.js, then you can use a stream I wrote called predentation. You can use any language you want to build a similar tool.
Before
<html>
  <body>
    Here is my code:
    <pre>
      def some_function
        return 'Hello, World!'
      end
    </pre>
  </body>
</html>
After
<html>
  <body>
    Here is my code:
    <pre>
def some_function
  return 'Hello, World!'
end
    </pre>
  </body>
</html>
Advantages
Seamless way to write pre tags
Smaller output file size
Disadvantages
Requires a build step in your workflow
Does not handle non pre elements with white-space: pre added by CSS
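For a rough idea of what such a build step does, here is a sketch that dedents the contents of every pre element in an HTML string. It is not how predentation is implemented (I am not reproducing its code here); the function names are made up, the pre matching is a simple regex that does not handle nested pre elements, and indentation is counted in characters, so mixed tabs and spaces are not normalised.

```javascript
// Removes the indentation shared by all non-empty lines.
function dedent(text) {
  var lines = text.split('\n');
  var indents = lines
    .filter(function (l) { return l.trim().length > 0; })
    .map(function (l) { return l.match(/^[ \t]*/)[0].length; });
  var common = indents.length ? Math.min.apply(null, indents) : 0;
  return lines
    .map(function (l) {
      // whitespace-only lines may be shorter than the common indent
      var lead = l.match(/^[ \t]*/)[0].length;
      return l.substring(Math.min(common, lead));
    })
    .join('\n');
}

// Hypothetical build-step helper: applies dedent() inside each <pre>...</pre>.
function dedentPreBlocks(html) {
  return html.replace(/(<pre\b[^>]*>)([\s\S]*?)(<\/pre>)/gi,
    function (match, open, body, close) {
      return open + dedent(body) + close;
    });
}

console.log(dedentPreBlocks('<body>\n  <pre>\n    def f\n      return 1\n    end\n  </pre>\n</body>'));
```

Using the smallest indentation of all non-empty lines, instead of the first line only, keeps the result correct when the first line happens to be indented more than a later one.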
Removing Indentation With JavaScript
See this answer to remove indentation with JavaScript
Advantages
Possible to target elements with white-space: pre
Disadvantages
JavaScript can be disabled
White-space adds to the file size
Managed to do this with JavaScript. It works in Internet Explorer 9 and Chrome 15, I haven't tested older versions. It should work in Firefox 11 when support for outerHTML is added (see here), meanwhile there are some custom implementations available on the web. An exercise for the reader is to get rid of trailing indentation (until I make time to finish it and update this answer).
I'll also mark this as community wiki for easy editing.
Please note that you'll have to reformat the example to use tabs as indentation, or change the regex to work with spaces.
<!DOCTYPE html>
<html>
<head>
<title>Hello, World!</title>
</head>
<body>
<pre>
<html>
<head>
<title>Hello World Example</title>
</head>
<body>
Hello, World!
</body>
</html>
</pre>
<pre>
class HelloWorld
{
public static int Main(String[] args)
{
Console.WriteLine(&quot;Hello, World!&quot;);
return 0;
}
}
</pre>
<script language="javascript">
var pre_elements = document.getElementsByTagName('pre');
for (var i = 0; i < pre_elements.length; i++)
{
var content = pre_elements[i].innerHTML;
var tabs_to_remove = '';
while (content.indexOf('\t') === 0) // compare with the number 0, not the string '0'
{
tabs_to_remove += '\t';
content = content.substring(1);
}
var re = new RegExp('\n' + tabs_to_remove, 'g');
content = content.replace(re, '\n');
pre_elements[i].outerHTML = '<pre>' + content + '</pre>';
}
</script>
</body>
</html>
This can be done in four lines of JavaScript:
var pre= document.querySelector('pre');
//insert a span in front of the first letter. (the span will automatically close.)
pre.innerHTML= pre.textContent.replace(/(\w)/, '<span>$1');
//get the new span's left offset:
var left= pre.querySelector('span').getClientRects()[0].left;
//move the code to the left, taking into account the body's margin:
pre.style.marginLeft= (-left + pre.getClientRects()[0].left)+'px';
<body>
Here is my code:
<pre>
def some_funtion
return 'Hello, World!'
end
</pre>
</body>
If you're okay with changing the innerHTML of the element:
Given:
<pre>
<code id="the-code">
def some_funtion
return 'Hello, World!'
end
</code>
</pre>
Which renders as:
def some_funtion
return 'Hello, World!'
end
The following vanilla JS:
// get block however you want.
var block = document.getElementById("the-code");
// remove empty lines (including the leading and trailing ones).
var code = block.innerHTML
.split('\n')
.filter(l => l.trim().length > 0)
.join('\n');
// find the first non-empty line and use its
// leading whitespace as the amount that needs to be removed
var firstNonEmptyLine = block.textContent
.split('\n')
.filter(l => l.trim().length > 0)[0];
// using regex get the first capture group
var leadingWhiteSpace = firstNonEmptyLine.match(/^([ ]*)/);
// if the capture group exists, then use that to
// replace all subsequent lines.
if(leadingWhiteSpace && leadingWhiteSpace[0]) {
var whiteSpace = leadingWhiteSpace[0];
code = code.split('\n')
.map(l => l.replace(new RegExp('^' + whiteSpace + ''), ''))
.join('\n');
}
// update the inner HTML with the edited code
block.innerHTML = code;
Will result in:
<pre>
<code id="the-code">def some_funtion
return 'Hello, World!'
end</code>
</pre>
And will render as:
def some_funtion
return 'Hello, World!'
end
I also found that if you're using haml you can use the preserve method. For example:
preserve yield
This will preserve the whitespace in the produced yield which is usually markdown containing the code blocks.
I decided to come up with something more concrete than changing the way pre or code work, so I wrote a regex. It matches the first newline character \n together with any whitespace before it (the leading \s* cleans up extra whitespace at the end of a line of code, which I noticed yours had) and the indentation after it ([\t\s]*, zero or more tab or other whitespace characters), and stores the match in the pattern variable. Because that regex has no global flag (a g after the regex), it only finds the first such newline. For a newline followed by two tab characters, the value of pattern will be \n\t\t. The each callback then replaces every occurrence of pattern in that pre code element with a plain \n (newline).
$("pre code").each(function(){
var html = $(this).html();
var pattern = html.match(/\s*\n[\t\s]*/);
$(this).html(html.replace(new RegExp(pattern, "g"),'\n'));
});
<script src="https://ajax.googleapis.com/ajax/libs/jquery/1.11.1/jquery.min.js"></script>
<body>
Here is some code:
<pre><code>
Here is some fun code!
More code
One tab
One more tab
Two tabs and an extra newline character precede me
</code></pre>
</body>
<script>
$("pre[name='pre']").each(function () {
var html = $(this).html()
var blankLen = (html.split('\n')[0].match(/^\s+/)[0]).length
$(this).html($.trim(html.replace(new RegExp("^ {" + blankLen + "}", "gm"), ""))) // build the regex with RegExp instead of eval
})
</script>
<div>
<pre name="pre">
1
2
3
</pre>
</div>
This is cumbersome, but it works if code folding is important to you:
<pre>def some_funtion</pre>
<pre> return 'Hello, World!'</pre>
<pre>end</pre>
In your css,
pre { margin:0 }
In vim, writing your code normally and then executing:
:s/\t\t\([^\n]\+\)/<pre>\1<\/pre>/
for each line would work.
If you are using this on a code block like:
<pre>
<code>
...
</code>
</pre>
You can just use css like this to offset that large amount of white space in the front.
pre code {
position: relative;
left: -95px; /* or whatever you want */
}
The pre tag preserves all the whitespace you typed inside it, including the indentation of the markup. Without pre, the browser collapses that whitespace and displays the text normally. Try this: I have used the paragraph tag and kept pre only for the line that needs indenting.
Output:-
Here is my code:
def some_function
return 'Hello, World!'
end
<html>
<body>
Here is my code:
<p>
def some_function<br>
<pre> return 'Hello, World!'<br></pre>
end
</p>
</body>
</html>

Controlling tab space in a <pre> using CSS?

Is it possible to specify how many pixels, etc. the tab space occupies in a <pre> using CSS?
For example, say I have a piece of code appearing in a <pre> on a web page:
function Image()
{
this.Write = function()
{
document.write(this.ToString());
return this;
};
...
}
Image.prototype = new Properties();
...
Is it possible to specify a different amount of space for the tab indent using CSS?
If not, are there any workarounds?
While the above discussion provides some historical background, times have changed, and more relevant information and possible solutions can be found here: Specifying Tab-Width?
attn admin: possible duplicate of ref'ed question.
From CSS 2.1, § 16.6.1 The 'white-space' processing model:
All tabs (U+0009) are rendered as a horizontal shift that lines up the start edge of the next glyph with the next tab stop. Tab stops occur at points that are multiples of 8 times the width of a space (U+0020) rendered in the block's font from the block's starting content edge.
CSS3 Text says basically the same thing.
From HTML 4.01 § 9.3.4 Preformatted text: The PRE element
The horizontal tab character (decimal 9 in [ISO10646] and [ISO88591] ) is usually interpreted by visual user agents as the smallest non-zero number of spaces necessary to line characters up along tab stops that are every 8 characters. We strongly discourage using horizontal tabs in preformatted text since it is common practice, when editing, to set the tab-spacing to other values, leading to misaligned documents.
If you're concerned with leading tabs, it's a simple matter to replace them with spaces.
/* repeat implemented using Russian Peasant multiplication */
String.prototype.repeat = function (n) {
if (n<1) return '';
var accum = '', c=this;
for (; n; n >>=1) {
if (1&n) accum += c;
c += c;
}
return accum;
}
String.prototype.untabify = function(tabWidth) {
tabWidth = tabWidth || 4;
return this.replace(/^\t+/gm, function(tabs) { return ' '.repeat(tabWidth * tabs.length)} );
}
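untabify only touches leading tabs. If tabs also appear inside a line, the tab-stop rule quoted from the specifications above (advance to the next multiple of the tab width) can be applied character by character. A minimal sketch, assuming a hypothetical function name expandTabs, the native String.prototype.repeat from ES2015, and one column per character (monospace font, no wide characters):

```javascript
// Hypothetical helper: replaces every tab with spaces up to the next tab stop.
function expandTabs(text, tabWidth) {
  tabWidth = tabWidth || 8; // 8 matches the tab stops described in the quoted specs
  return text.split('\n').map(function (line) {
    var out = '';
    for (var i = 0; i < line.length; i++) {
      if (line[i] === '\t') {
        // pad to the next multiple of tabWidth (always at least one space)
        out += ' '.repeat(tabWidth - (out.length % tabWidth));
      } else {
        out += line[i];
      }
    }
    return out;
  }).join('\n');
}

console.log(JSON.stringify(expandTabs('a\tb', 4))); // "a   b"
```

Unlike a fixed replacement of each tab by N spaces, this keeps columns aligned the way the browser would have rendered them for that tab width.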