How to add html attributes and values for all lines quickly with vim and plugins? - html

My os:debian8.
uname -a
Linux debian 3.16.0-4-amd64 #1 SMP Debian 3.16.39-1+deb8u2 (2017-03-07) x86_64 GNU/Linux
Here is my base file.
home
help
variables
compatibility
modelines
searching
selection
markers
indenting
reformatting
folding
tags
makefiles
mapping
registers
spelling
plugins
etc
I want to create a html file as bellow.
home
help
variables
compatibility
modelines
searching
selection
markers
indenting
reformatting
folding
tags
makefiles
mapping
registers
spelling
plugins
etc
Every line was added href and id attributes,whose values are line content pasted .html and line content itself correspondingly.
How to add html attributes and values for all lines quickly with vim and plugins?
sed,awk,sublime text 3 are all welcomed to solve the problem.

$ sed 's:.*:&:' file
home
help
variables
compatibility
modelines
searching
selection
markers
indenting
reformatting
folding
tags
makefiles
mapping
registers
spelling
plugins
etc

if you want to do this in vi itself, no plug-in neccessary
Open the file, type : and insert this line as the command
%s:.*:&
it will make all the substitutions in the file.

sed is the best solution (simple and pretty fast here) if your are sure of the content, if not it need a bit of complexity that is better treated by awk:
awk '
{
# change special char for HTML constraint
Org = URL = HTML = $0
# sample of modification
gsub( / /, "%20", URL)
gsub( /</, "%3C", HTML)
printf( "%s\n", URL, Org, HTML)
}
' YourFile

To complete this easily in Sublime Text, without any plugins added:
Open the base file in Sublime Text
Type Ctrl+Shift+P and in the fuzzy search input type syn html to set the file syntax to HTML.
In the View menu, make sure Word Wrap is toggled off.
Ctrl+A to select all.
Ctrl+Shift+L to break selection into multi-line edit.
Ctrl+C to copy selection into clipboard as multiple lines.
Alt+Shift+W to wrap each line with a tag-- then tap a to convert the default <p> tag into an <a> tag (hit esc to quit out of any context menus that might pop up)
Type a space then href=" -- you should see this being added to every line as they all have cursors. Also you should note that Sublime has automatically closed your quotes for you, so you have href="" with the cursor between the quotes.
ctrl+v -- this is where the magic happens-- your clipboard contains every lines worth of contents, so it will paste each appropriate value into the quotes where the cursor is lying. Then you simply type .html to add the extension.
Use the right arrow to move the cursors outside of the quotes for the href attribute and follow the two previous steps to similarly add an id attribute with the intended ids pasted in.
Voila! You're done.
Multi-line editing is very powerful as you learn how to combine it with other keyboard shortcuts. It has been a huge improvement in my workflow. If you have any questions please feel free to comment and I'll adjust as needed.

With bash one-liner:
while read v; do printf '%s\n' "$v" "$v" "$v"; done < file
(OR)
while read v; do echo "$v"; done < file

Try this -
awk '{print a$1b$1c$1d}' a='' d='' file
home
help
variables
compatibility
modelines
searching
selection
markers
indenting
reformatting
folding
tags
makefiles
mapping
registers
spelling
plugins
etc
Here I have created 4 variable a,b,c & d which you can edit as per your choice.
OR
while read -r i;do echo ""$i";done < f
home
help
variables
compatibility

To execute it directly in vim:
!sed 's:.*:&:' %

In awk, no regex, no nothing, just print strings around $1s, escaping "s:
$ awk '{print "" $1 ""}' file
home
help
If you happen to have empty lines in there just add /./ before the {:
/./{print ...

list=$(cat basefile.txt)
for val in $list
do
echo ""$val"" >> newfile.html
done
Using bash, you can always make a script or type this into the command line.

This vim replacement pattern handles your base file:
s#^\s*\(.\{-}\)\s*$#\1#
^\s* matches any leading spaces, then
.\{-} captures everything after that, non-greedily — allowing
\s$ to match any trailing spaces.
This avoids giving you stuff like home .
You can also process several base files with vim at once:
vim -c 'bufdo %s#^\s*\(.\{-}\)\s*$#\1# | saveas! %:p:r.html' some.txt more.txt`
bufdo %s#^\s*\(.\{-}\)\s*$#\1# runs the replacement on each buffer loaded into vim,
saveas! %:p:r.html saves each buffer with an html extension, overwriting if necessary,
vim will open and show you the saved more.html, which you can correct as needed, and
you can use :n and :prev to visit some.html.
Something like sed’s probably best for big jobs, but this lets you tweak the conversions in vim right after it’s made them, use :u to undo, etc. Enjoy!

Related

How can I change specific recurring text on a very large HTML file?

I have a very big HTML file (talking about 20MB) and I need to remove from the file a large amount of nodes of the form:
<tr><td>SPECIFIC-STRING</td><td>RANDOM-STRING</td><td>RANDOM-STRING</td></tr><tr><td style="padding-top:0" colspan="3">RANDOM-STRING</td></tr>
The file I need to work on is basically made of thousands of these strings, and I only need to remove those that have a specific first string, for instance, all those with the first string being "banana":
<tr><td>banana</td><td>RANDOM-STRING</td><td>RANDOM-STRING</td></tr><tr><td style="padding-top:0" colspan="3">RANDOM-STRING</td></tr>
I tried achieving this opening the file in Geany and using the replace feature with this regex:
<tr><td>banana<\/td><td>(.*)<\/td><td>(.*)<\/td><\/tr><tr><td(.*)<\/td><\/tr>
but the console output was that it removed X amount of occurrences, when I know there are way more occurrences than that in the file.
Firefox, Chrome and Brackets fail even to view the html code of the file due to it's size. I can't think of another way to do this due to my large unexperience with HTML.
You could be using a stream editor which as the name suggest streams the file content, thus never loads the whole file into the main memory.
A popular editor is sed. It does support RegEx.
Your command would have the following structure.
sed -i -E 's/SEARCH_REGEX/REPLACEMENT/g' INPUTFILE
-E for support of extended RegEx
-i for in-place editing mode
s denotes that you want to replace values
g is for global. By default sed would only replace the first occurrence so to replace all occrrences you must provide g
SEARCH_REGEX is the RegEx you need to find the substrings you want to replace
REPLACEMENT is the value you want to replace all matches with
INPUTFILE is the file sed is gonna read line-by line and do the replacement for you.
While regex may not be the best tool to do this kinda job, try this adjustment to your pattern:
<tr><td>banana<\/td><td>(.*?)<\/td><td>(.*?)<\/td><\/tr><tr><td(.*?)<\/td><\/tr>
That's making your .* matches lazy. I am wondering if those patterns are consuming too much.

Sublime Text - find all instances of an html class name project-wide

I want to find all instances of a class named "validation" in all of my html files project wide. It's a very large project and a search for the word "validation" gives me hundreds of irrelevant results (js functions, css, js/css minified, other classes, functions and html page content containing the word validation, etc). It can sometimes be the second, third, or fourth class declared so searching for "class='validation" doesn't work.
Is there a way to specify that I only want results where validation is a class declared on an html block?
Yes. In the sublime menu go to Find --> Find in Files...
Then match what is in the following image.
The first thing you will want to do is consider other possibilities with how you can solve this problem. Currently, it sounds like you are only using sublime text. Have you considered trying to use a command-line tool like grep?
Here is an example of how it could be used.
I have a project called enfold-child with a bunch of frontend assets for a wordpress project. Let's say, I want to find all of my scss files with the class "home" listed in them somewhere, but I do NOT want to pull in built css files, or anything in my node_modules folder. The way i would do that is as follows:
Folder structure:
..
|build
|scss_files
|node_modules
|css_files
|style.css
grep -rnw build --exclude=*{.css} --exclude-dir=node_modules -e home
grep = handy search utility.
-r = recursive search.
-n = provide line numbers for each match
-w = Select only those lines containing matches that form whole words.
-e = match against a regular expression.
home = the expression I want to search for.
In general, the command line has most anything one could want/need to do most of the nifty operations offered by most text-editors -- such as Sublime. Becoming familiar with the command line will save you a bunch of time and headaches in the future.
In SublimeText, right-click on the folder you want to start the search from and click on Find in Folder. Make sure regex search is enabled (the .* button in the search panel) and use this regex as the search string:
class="([^"]+ )?validation[ "]
That regex will handle cases where "validation" is the only classname as well as cases where its one of several classnames (in which case it can be anywhere in the list).
If you didn't stick to double quotes, this version will work with single or double quotes:
class=['"]([^'"]+ )?validation[ '"]
If you want to use these regexes from the command line with grep, you'll need to include a -E argument for "extended regular expressions".

How to remove all content before and after two different HTML tags from the Linux command line

I have a file, links.html, and want to remove everything before:
<div id="data_div"
and everything after AND including
<div style="\"overflow:auto\"">
and save to a new file; links1.html.
Is it possible to complete this in a single operation, either with BASH string manipulation or sed?
If so, how?
Yes it is entirely possible. Below are two examples, although you will have to substitute "pattern" with what you referenced above as the pattern.
To remove everything after a "pattern" to include the pattern itself, you can use the following command
sed -i '/pattern/,$d' filename
To remove everything before a pattern to include the pattern itself, you can use the following command
sed -i '1,/pattern/d' filename
If you would like to not edit the file in place, you could remove the "-i" portion and redirect the output to a separate file.

How can I disable Vim's HTML error highlighting?

I use Vim to edit HTML with embedded macros, where the macros are bracketed with double angle brackets, e.g., "<>". Vim's standard HTML highlighting sees the second "<" and ">" as errors, and highlights them as such. How can I prevent this? I'd be happy to either teach $VIMHOME/syntax/html.vim that double-angle-brackets are OK, or to simply disable the error highlighting, but I'm not sure how to do either one. ("highlight clear htmlTagError" has no effect. In fact, "highlight clear" has no effect in an HTML buffer.)
If you want to introduce full syntax highlighting in your macros, it'll be easiest to start with a syntax file like htmldjango ($VIMRUNTIME/syntax/htmldjango.vim, which then uses html.vim and django.vim from the same directory); in it, there is special meaning in {{ ... }}, among other things. You want it just the same, but with << and >> being your delimiters.
To just highlight << ... >> specially, you'd need a syntax line like this:
syntax region mylangMacro start="<<" end=">>" containedin=ALLBUT,mylangMacro
And then you could highlight it with:
highlight default link mylangMacro Macro
This could either go in ~/.vim/after/syntax/html.vim or could be done in the style of htmldjango as a new syntax highlighter (this would be my preferred approach; you can then make HTML files use this new syntax file with an autocmd).
(You can also remove the error highlighting with syntax clear htmlTagError which would go in the same sort of position. But hopefully you'll think getting separate highlighting is better than just removing the error.)
Here are instructions to edit existing syntax highlighting in Vim:
http://vimdoc.sourceforge.net/htmldoc/syntax.html#mysyntaxfile-add
vim runtime paths for Unix/Linux:
$HOME/.vim,
$VIM/vimfiles,
$VIMRUNTIME,
$VIM/vimfiles/after,
$HOME/.vim/after
Create a directory in your vim runtime path called "after/syntax".
Commands for Unix/Linux:
mkdir ~/.vim/after
mkdir ~/.vim/after/syntax
Write a Vim script that contains the commands you want to use. For example, to change the colors for the C syntax: highlight cComment
ctermfg=Green guifg=Green
Write that file in the "after/syntax" directory. Use the name of the syntax, with ".vim" added. For our C syntax: :w
~/.vim/after/syntax/c.vim
That's it. The next time you edit a C file the Comment color will be
different. You don't even have to restart Vim.
If you have multiple files, you can use the filetype as the directory
name. All the "*.vim" files in this directory will be used, for
example: ~/.vim/after/syntax/c/one.vim ~/.vim/after/syntax/c/two.vim
Alternatively, you could take a much easier route and use syntax highlighting within the Nano command line editor, which you can define your own syntax very easily with regular expressions:
http://how-to.wikia.com/wiki/How_to_use_syntax_highlighting_with_the_GNU_nano_text_editor

Seeking a vim function to insert the text behind a hyperlink

I'm writing a book. I have a large notes file, with hyperlinks of interest. I'd like to be able to move the cursor over a hyperlink, execute a command, and have the text of the webpage inserted for reading, cutting, and modifying. Ideally, it would only be the readable text rather than the full HTML, but I'll take what I can get (or what I can wget, if that's the right shell function).
Surely such a thing has already been put together, but searches for "vim insert text hyperlink" and such are not very helpful. Have you seen this function?
With your mouse cursor on a hyperlink, execute
:rC-rC-aEnter
This will insert the raw contents on the following line.
To preprocess:
:r !links -dumpC-rC-aEnter
Of course, you can map this to a key:
:nnoremap <F6> :r !links -dump <C-r><C-a><CR>
Various notes:
If you have quoted urls, you might not be happy about C-rC-a because the quotes will get included. Consider doing
:se iskeyword+=/,:,.
And use C-rC-w instead. Alternatively,
:se isfname-="
allows you to use C-rC-f instead.
If you don't want to tinker with any of your settings, consider creating a function that saves and restores the value of &iskeyword or &isfname. To bring out the big gun, write a regex for URLs and use that:
func! ReadFromUrl()
let url = substitute(getline('.'), '\c\v^.*((https?|ftp|file)://[a-z0-9.:/%+()]+).*$', '\1', '')
exec 'r! links -dump "' . url . '"'
endf
command! ReadFromUrl call ReadFromUrl()
nnoremap! <F6> ReadFromUrl<CR>
Move the cursor to the hype-link, and type this command:
:exec 'r!wget -q -O- ' getline('.') ' | html2text'
It'll append the content of the hype-link after the hype-link.
To make this command work, you need install html2text.
Good luck!