multiple selection every N lines Sublime Text - sublimetext2

I have a very large file (~100k lines) and I want to insert a comment every 100 lines. I could write a script to do it, but I wonder whether something like this is possible in Sublime Text (in Emacs it is pretty straightforward).

How about this regex to find 100 lines at a time
((.*\n){1,100})
and then replace with
\1
// this is a comment
Is that close enough for whatever it is you are trying to achieve?
Edit: @Maluchi suggested this version of the replacement text as better:
\1\n
// this is a comment\n
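If a script turns out to be easier after all, the same idea can be sketched in a few lines of JavaScript. This is only a minimal sketch: the function name insertEveryN and its parameters are made up for illustration, and it assumes the file fits in memory as one string.

```javascript
// Hypothetical helper: insert `comment` as its own line after every `n` lines of `text`.
function insertEveryN(text, n, comment) {
  const lines = text.split("\n");
  const out = [];
  lines.forEach((line, i) => {
    out.push(line);
    // After lines n, 2n, 3n, ... (1-based), add the comment line,
    // unless that line is the very last one in the input.
    if ((i + 1) % n === 0 && i + 1 < lines.length) {
      out.push(comment);
    }
  });
  return out.join("\n");
}

// Small demonstration with n = 2 instead of 100.
console.log(insertEveryN("a\nb\nc\nd\ne", 2, "// this is a comment"));
```

For the original question you would call it with n = 100 on the file contents.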

Related

Simple macros for HTML

My HTML file contains the code &nbsp;&nbsp;&nbsp; in many places.
It is too short for it to make sense to replace it with something like
<span class="three-spaces"></span>
I would like to replace it with something like
##TS##
or
%%TS%%
and the file should start with something like:
SET TS = "&nbsp;&nbsp;&nbsp;"
Is there any way to write the HTML this way? I am not looking to compile a source file into HTML. I am looking for a solution that allows writing macros directly in HTML files.
Later edit: here is another example.
I also need to transform
lnk(http://www.example.com)
into
<a target="_blank" href="http://www.example.com">http://www.example.com</a>
Instead of telling him WHY he should not do something, how about telling him HOW he could do it? Maybe his example is not an appropriate need for it, but there are other situations where being able to create a macro would be nice.
For example... I have an HTML page that I'm working on that deals with unit conversions, and quite often I have to type things like "cm/in" as "<sup>cm</sup>/<sub>in</sub>" or, for volumes, "cu-cm/cu-in" as "<sup>cm<sup>3</sup></sup>/<sub>in<sup>3</sup></sub>". It would be really nice from a typing and readability standpoint if I could create macros that were just typed as "%%cm-per-in%%", "%%cc-per-cu-in%%", or something like that.
So, the line in the 'sed' file might look like this:
s/%%cc-per-cu-in%%/<sup>cm<sup>3<\/sup><\/sup>\/<sub>in<sup>3<\/sup><\/sub>/g
Since "/" is the delimiter of the substitute command, you need to escape it with a backslash ("\") wherever it appears in the replacement portion of the command.
The way that I have handled things like this in the past was to either write my own preprocessor to make the changes or if the "sed" utility was available, I would use it. So for this sort of thing, I would basically have a "pre-HTML" file that I edited and after running it through "sed" or the preprocessor, it would generate an HTML file that I could copy to the web server.
Now, you could create a javascript function that would do the text substitution for you, but in my opinion, it is not as nice looking as an actual preprocessor macro substitution. For example, to do what I was doing in the sed script, I would need to create a function that would take as a parameter the short form "nickname" for the longer HTML that would be generated. For example:
function S( x )
{
    // Write the HTML expansion for the macro nickname x.
    if (x == "cc-per-cu-in") {
        document.write("<sup>cm<sup>3</sup></sup>/<sub>in<sup>3</sup></sub>");
    } else if (x == "cm-per-in") {
        document.write("<sup>cm</sup>/<sub>in</sub>");
    } else {
        // Unknown nickname: make the failure visible on the page.
        document.write("<B>***MACRO-ERROR***</B>");
    }
}
And then use it like this:
This is a test of cc-per-cu-in <SCRIPT>S("cc-per-cu-in");</SCRIPT> and
cm-per-in <SCRIPT>S("cm-per-in");</SCRIPT> as an alternative to sed.
This is a test of an error <SCRIPT>S("cc-per-in");</SCRIPT> for a
missing macro substitution.
This generates the following:
This is a test of cc-per-cu-in cm3/in3
and cm-per-in cm/in as an alternative to sed. This is a test of an error MACRO-ERROR for a missing macro substitution.
Yeah, it works, but it is not as readable as if you used a 'sed' substitution.
So, decide for yourself... Which is more readable...
This...
This is a test of cc-per-cu-in <SCRIPT>S("cc-per-cu-in");</SCRIPT> and
cm-per-in <SCRIPT>S("cm-per-in");</SCRIPT> as an alternative to sed.
Or this...
This is a test of cc-per-cu-in %%cc-per-cu-in%% and
cm-per-in %%cm-per-in%% as an alternative to sed.
Personally, I think the second example is more readable and worth the extra trouble to have pre-HTML files that get run through sed to generate the actual HTML files... But, as the saying goes, "Your mileage may vary"...
EDITED: One more thing that I forgot about in the initial post that I find useful when using a pre-processor for the HTML files -- Timestamping the file... Often I'll have a small timestamp placed on a page that says the last time it was modified. Instead of manually editing the timestamp each time, I can have a macro (such as "%%DATE%%", "%%TIME%%", "%%DATETIME%%") that gets converted to my preferred date/time format and put in the file.
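For anyone who would rather not use sed, the same preprocessing step can be sketched as a small JavaScript function run as a build step (for example under Node). This is only a sketch under assumptions: the MACROS table, the expandMacros name, and the YYYY-MM-DD (UTC) date format are all made up for illustration and are not part of any existing tool.

```javascript
// Hypothetical macro table; names and expansions mirror the examples above.
const MACROS = {
  "cc-per-cu-in": "<sup>cm<sup>3</sup></sup>/<sub>in<sup>3</sup></sub>",
  "cm-per-in": "<sup>cm</sup>/<sub>in</sub>",
};

// Replace every %%name%% in `source` with its expansion.
// `now` is a Date used for the %%DATE%% macro (formatted as YYYY-MM-DD in UTC).
// Unknown names are replaced with a visible error marker.
function expandMacros(source, now) {
  const table = { ...MACROS, DATE: now.toISOString().slice(0, 10) };
  return source.replace(/%%([A-Za-z0-9-]+)%%/g, (match, name) => {
    const known = Object.prototype.hasOwnProperty.call(table, name);
    return known ? table[name] : "<B>***MACRO-ERROR***</B>";
  });
}

console.log(expandMacros("Ratio: %%cm-per-in%%, last modified %%DATE%%", new Date()));
```

The pre-HTML file stays as readable as in the sed version, and the table is the only place that needs editing when a macro changes.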
Since my background is in 'C' and UNIX, if I can't find a way to do something in HTML, I'll often just use one of the command line tools under UNIX or write a small 'C' program to do it. My HTML editing is always in 'vi' (or 'vim' on the PC) and I find that I am often creating tables for alignment of various portions of the HTML page. I got tired of typing all the TABLE, TR, and TD tags, so I created a simple 'C' program called 'table' that I can execute via the '!}' command in 'vi', similar to how you execute the 'fmt' command in 'vi'. It takes as parameters the number of rows & columns to create, whether the column cells are to be split across two lines, how many spaces to indent the tags, and the column widths and generates an appropriately indented TABLE tag structure. Just a simple utility, but saves on the typing.
Instead of typing this:
<TABLE>
  <TR>
    <TD width=200>
    </TD>
    <TD width=300>
    </TD>
  </TR>
  <TR>
    <TD>
    </TD>
    <TD>
    </TD>
  </TR>
  <TR>
    <TD>
    </TD>
    <TD>
    </TD>
  </TR>
</TABLE>
I can type this:
!}table -r 3 -c 2 -split -w 200 300
Now, with respect to the portion of the original question about being able to create a macro to do HTML links, that is also possible using 'sed' as a pre-processor for the HTML files. Let's say that you wanted to change:
%%lnk(www.stackoverflow.com)
to:
<a href="www.stackoverflow.com">www.stackoverflow.com</a>
you could create this line in the sed script file:
s/%%lnk(\(.*\))/<a href="\1">\1<\/a>/g
'sed' uses regular expressions and they are not what you might call 'pretty', but they are powerful if you know what you are doing.
One slight problem with this example is that it requires the macro to be on a single line (i.e. you cannot split the macro across lines) and if you call the macro multiple times in a single line, you get a result that you might not be expecting. Instead of doing the macro substitution multiple times, it assumes the argument to the macro starts with the first '(' of the first macro invocation and ends with the last ')' of the last macro invocation. I'm not a sed regular expression expert, so I haven't figured out how to fix this yet. For the multiple line portion though, a possible fix would be to replace all the LF characters in the file with some other special character that would not normally be used, run sed on that result, and then convert the special characters back to LF characters. Of course, the problem there is that the entire file would be a single line and if you are invoking the macro, it is going to have the results that I described above. I suspect awk would not have that problem, but I have never had a need to learn awk.
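The multiple-invocation problem comes from `.*` being greedy: it runs to the last ')' on the line. Matching "anything except a closing parenthesis" instead keeps each invocation separate. A minimal JavaScript sketch of the difference (the example.com-style addresses are placeholders, and this assumes the macro argument itself never contains ')'):

```javascript
const line = "%%lnk(a.example) and %%lnk(b.example)";

// Greedy: `.*` extends to the last ")" on the line, so both invocations
// are swallowed into a single match.
const greedy = line.replace(/%%lnk\((.*)\)/g, '<a href="$1">$1</a>');

// Bounded: `[^)]*` stops at the first ")", so each invocation is
// replaced on its own.
const bounded = line.replace(/%%lnk\(([^)]*)\)/g, '<a href="$1">$1</a>');

console.log(greedy);
console.log(bounded);
```

The same character class works in the sed script: s/%%lnk(\([^)]*\))/<a href="\1">\1<\/a>/g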
Upon further reflection, I think there might be an easier solution to both the multi-line and multiple invocation of a macro on a single line -- the 'm4' macro preprocessor that comes with the 'C' compiler (e.g. gcc). I haven't tested it much to see what the downside might be, but it seems to work well enough for the tests that I have performed. You would define a macro as such in your pre-HTML file:
define(`LNK', `<a href="$1">$1</a>')
And yeah, it does use the backwards single quote character to start the text string and the normal single quote character to end the text string.
The only problem that I've found so far is that for the macro names, it only allows the characters 'A'-'Z', 'a'-'z', '0'-'9', and '_' (underscore). Since I prefer to type '-' instead of '_', that is a definite disadvantage to me.
Technically inline JavaScript with a <script> tag could do what you are asking. You could even look into the many templating solutions available via JavaScript libraries.
That would not actually provide any benefit, though. JavaScript changes what is ultimately displayed, not the file itself. Since your use case does not change the display it wouldn't actually be useful.
It would be more efficient to consider why &nbsp; is appearing in the first place and fix that.
This …
My html file contains in many places the code &nbsp;&nbsp;&nbsp;
… is actually what is wrong in your file!
&nbsp; is not meant to be used for layout purposes; you should fix that and use CSS to lay the page out correctly.
&nbsp; is meant to prevent a line break between two words that are separated by a space. Take a number and its unit, for example: "5 liters" can end up with "5" at the end of one line and "liters" at the start of the next.
To keep them together you would write 5&nbsp;liters. That is what &nbsp; is for and nothing else, especially not for layout purposes.
To still answer your question:
HTML is a markup language not a programming language. That means it is descriptive/static and not functional/dynamic. If you try to generate HTML dynamically you would need to use something like PHP or JavaScript.
Just an observation from a novice. If everyone did as purists suggest (i.e.-the right way), then the web would still be using the same coding conventions it was using 30 years ago. People do things, innovate, and create new ways, then new standards, and deprecate others all the time. Just because someone says "spaces are only for separating words...and nothing else" is silly. For many, many years, when people typed letters, they used one space between words, and two spaces between end punctuation and the next sentence. That changed...yeah, things change. There is absolutely nothing wrong with using spaces and non-breaking spaces in ways which assist layout. It is neither useful nor elegant for someone to use a long span with style over and over and over, rather than simple spaces. You can think it is, and your club of do it right folks might even agree. But...although "right", they are also being rather silly about it. Question: Will a page with 3 non-breaking spaces validate? Interesting.

Line-ending LF characters are automatically changed to CRLFs in an HTML textarea

I noticed that all LFs are automatically changed to CRLFs when I put them into an HTML textarea.
■ Questions:
where and what causes this behavior?
is this because of the Windows operating system, i.e. would it not happen on a different operating system such as macOS? (I have only seen this on a Windows machine and have not tested on a Mac yet.)
or does this depend on the browser? (I have seen this behavior in Chrome, IE, and Firefox; I have not tested Safari yet.)
or does this only happen with my editor? (I am using Sakura Editor.)
If possible, how to preserve the LF so that it does not get changed into CRLF?
■ Steps to reproduce this:
1. Find a textarea you can type into, for example the one on the following W3Schools page:
https://www.w3schools.com/tags/tryit.asp?filename=tryhtml_textarea
2. Prepare a text of at least 2 lines containing some LFs, using an editor that can show the line-ending characters (so that you can make sure you have some LFs).
※ I am using Sakura Editor as an example.
3. Copy and paste the text prepared in step 2 into the textarea.
4. Once the text is in the textarea, copy the entire content of the textarea.
5. Paste the content of the textarea back into your editor.
6. The line-ending characters have all become CRLFs.
■ P.S.
Please see the screenshots for details:
the left side is the original text with 3 LFs
the right side is the content copied back from the textarea, where all LFs have become CRLFs
「↓」indicates LF
「⏎」indicates CRLF
Thanks
I think I found the answer myself, or at least some helpful information. I will leave a record here in case other people are looking for an answer to similar questions.
where and what causes this behavior?
For historical reasons, the element’s value is normalized in three different ways for three different purposes. The raw value is the value as it was originally set. It is not normalized. The API value is the value used in the value IDL attribute. It is normalized so that line breaks use U+000A LINE FEED (LF) characters. Finally, there is the value, as used in form submission and other processing models in this specification. It is normalized so that line breaks use U+000D CARRIAGE RETURN U+000A LINE FEED (CRLF) character pairs, and in addition, if necessary given the element’s wrap attribute, additional line breaks are inserted to wrap the text at the given width.
for more information please read:
https://www.w3.org/TR/html5/forms.html#the-textarea-element
If possible, how to preserve the LF so that it does not get changed into CRLF?
I guess there are a lot of ways. Using JavaScript to replace every \r\n with \n before submitting the form would likely be a client-side solution. If it does not have to be handled on the client side, which is exactly my case, the replacement can be done on the server side to force all line-ending characters to LF.
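As a rough sketch of that replacement, assuming the code handling the submitted value happens to be JavaScript (for example Node on the server), something like the following would do. The function name toLF is made up for illustration.

```javascript
// Hypothetical helper: normalize CRLF pairs (and any lone CRs) to LF.
function toLF(text) {
  return text.replace(/\r\n?/g, "\n");
}

// JSON.stringify makes the line-ending characters visible in the output.
console.log(JSON.stringify(toLF("line 1\r\nline 2\r\nline 3")));
```

Applying it to the submitted field value before storing it keeps the stored text LF-only regardless of what the browser sent.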

How to use doc.replaceText() for multiline strings? Replacing line breaks via replaceText?

I'm using replaceText() for a project and I want to be able to match and replace strings that span multiple lines in Google Docs. In the 'replace' part of replaceText(), I'm able to use \n to insert line breaks in the new text, but for some reason I cannot use \n to match existing line breaks.
Say I have a string like this in Google Docs:
Line one
Line two
And I want to replace it with text "Just one line". I've tried doing the following and it hasn't worked.
doc.replaceText("Line one\n*Line two", "Just one line");
doc.replaceText("Line one\n\nLine two", "Just one line");
doc.replaceText("Line one\n+Line two", "Just one line");
Which leads me to believe that replaceText can't look for line breaks via \n. I was surprised, as it doesn't have any problems with using . or .* to look for random characters.
What would be a good solution for this? Another application for this that I need is deleting excess line breaks (so if there are three line breaks anywhere in a Doc, to delete them/turn them into one line break).
Keep in mind I'm very new to Javascript and only ever use it for Google Apps Script stuff on Google Docs. Please try to explain any solution in somewhat baby steps. Thanks in advance!
If your document contains only text (you will lose images etc.):
var doc = DocumentApp.getActiveDocument();
var text = doc.getBody().getText();
// Remove every line break from the plain text.
var newText = text.replace(/\n/g,'');
doc.getBody().clear().appendParagraph(newText);
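Once the text is a plain JavaScript string, the two replacements from the question can be done with the ordinary String.replace method rather than replaceText(). A minimal sketch in plain JavaScript, assuming (as the snippet above does) that line breaks in the extracted text appear as \n; the function names are made up for illustration:

```javascript
// Replace a phrase that spans one or more line breaks with a single line.
function joinLines(text) {
  return text.replace(/Line one\n+Line two/g, "Just one line");
}

// Collapse any run of two or more line breaks into a single line break.
function collapseBreaks(text) {
  return text.replace(/\n{2,}/g, "\n");
}

console.log(joinLines("Line one\nLine two"));
console.log(JSON.stringify(collapseBreaks("a\n\n\nb\nc")));
```

Either function could be applied to the text variable in place of the replace(/\n/g,'') call, with the same caveat that rewriting the body this way discards images and formatting.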

How can I insert a return at every HTML break "> <"

So I've never encountered this personally - I have a huge blob of HTML with inline CSS that I need to break up to be cohesive and editable (it's literally all back to back right now with no returns in any of the lines).
I use Sublime and I'm trying to insert a return at every break of code so I can easily go through and edit it all.
If you do not have a lot of code you can do it manually.
Otherwise just select all lines (Ctrl A) and then from the menu select Edit → Line → Reindent. This works with any files with the file extension .php or .html.
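If the whole file really is one long line, a regex replace that breaks between adjacent tags may be needed first. The following is a sketch of that idea in JavaScript, not something taken from the answer above; the function name is made up, and it assumes that whitespace between tags is not significant in your markup.

```javascript
// Hypothetical helper: start a new line wherever ">" is followed by "<",
// optionally with whitespace in between.
function breakBetweenTags(html) {
  return html.replace(/>\s*</g, ">\n<");
}

console.log(breakBetweenTags("<div><p>Hi</p> <span>x</span></div>"));
```

The same pattern can be tried in Sublime's Find → Replace with regular expressions enabled: find >\s*< and replace with >\n<, then reindent.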

How to edit Firefox bookmarks with regex in Notepad++

I need to combine multiple bookmarks files and reduce the size, but I don't know how to use regular expression.
I want to:
1. Delete every line that starts with <DD>
2. Delete the following HTML attributes and the (unknown) text between the quotes: ICON_URI="...", ICON="...", and LAST_CHARSET="..."
3. Replace the text between > and </A>
4. Delete duplicate lines
5. Sort lines alphabetically
Tested in Notepad++; it may not work in other tools, and in some unusual cases it may not work in Notepad++ either.
1. ^<DD>.*?$ - replace with empty string
2. ICON_URI="[^"]*" - replace with empty string (repeat with ICON and LAST_CHARSET in place of ICON_URI)
3. (<a[^>]+>).+?</a> - replace with \1 </a>
4. This is hard to do with regex. You can use grouping and repetition, but I'm not advanced in that.
5. Use Excel or a similar tool and sort there, which is much easier.
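Steps 1, 2, 4 and 5 are also easy to script if the regex route gets awkward. A minimal JavaScript sketch, with a made-up function name; step 3 is left out because the replacement text is not specified in the question, and the sort is plain code-unit order, so uppercase sorts before lowercase:

```javascript
// Hypothetical helper: clean the text of a bookmarks file and return its lines.
function cleanBookmarks(text) {
  const lines = text
    .split(/\r?\n/)
    // Step 1: drop lines that start with <DD>.
    .filter((line) => !/^<DD>/.test(line))
    // Step 2: remove the ICON_URI, ICON and LAST_CHARSET attributes
    // together with their quoted values.
    .map((line) => line.replace(/\s*\b(?:ICON_URI|ICON|LAST_CHARSET)="[^"]*"/g, ""));
  // Step 4: a Set keeps only the first copy of each line.
  // Step 5: sort the remaining lines.
  return [...new Set(lines)].sort();
}

const sampleBookmarks = [
  '<DT><A HREF="http://b.example" ICON_URI="http://b.example/f.ico" ICON="data:abc" LAST_CHARSET="UTF-8">B</A>',
  "<DD>description",
  '<DT><A HREF="http://a.example">A</A>',
  '<DT><A HREF="http://a.example">A</A>',
].join("\n");

console.log(cleanBookmarks(sampleBookmarks).join("\n"));
```

Note that sorting every line also moves the file's header and list tags around, so it is probably best applied only to the bookmark lines themselves.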