How to remove tabs from blank lines using sed?

I'd like to use sed to remove tabs from otherwise blank lines. For example a line containing only \t\n should change to \n. What's the syntax for this?

Many sed implementations do not understand escape sequences like \t, so you may have to type a literal tab into the command. In the line below, the blank after the ^ is a literal tab:
sed 's/^ *$//g' <filename>
In bash you can't just press Tab at the prompt, because it triggers completion. Press Ctrl-V and then Tab (^V, then Tab) to insert a literal tab character.
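Alternatively, if your shell is bash (or another shell with ANSI-C quoting), you can let the shell expand \t into a literal tab before sed ever sees it; this is a shell feature, not a sed one, so it sidesteps the Ctrl-V dance:
sed $'s/^\t*$//' <filename>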

The other posted solution will work, but note that, as Raze2dust points out, sed requires you to type a literal tab there. An alternative is:
sed '/[^ ]/!s/ //g' file-name.txt
which removes tabs only from lines that contain nothing but tabs (the blanks in the command are literal tabs). The negated class matches lines that contain anything but a tab; the following '!' inverts that address, so the command applies only to lines made up entirely of tabs. The substitution then runs on just those lines, removing every tab.
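For completeness: GNU sed does recognize \t in a regex, so on GNU systems the first answer's pattern can be typed without a literal tab (other sed implementations may not support this):
sed 's/^\t*$//' file-name.txt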

To replace arbitrary whitespace lines with an empty line, use
sed -r 's/^\s+$//'
The -r flag enables extended regular expressions (a GNU sed option; BSD sed spells it -E), and the ^\s+$ pattern matches lines that contain whitespace and nothing else (\s itself is a GNU extension).
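A quick sanity check (assuming GNU sed, and GNU cat whose -A option marks line ends with $):
printf 'keep\n\t \t\nkeep too\n' | sed -r 's/^\s+$//' | cat -A
keep$
$
keep too$
The whitespace-only middle line comes out empty, while the other lines are untouched.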

What worked for me was:
sed -r '/^\s+$/d' my_file.txt > output.txt

I've noticed that \t is not recognized by every sed. That being said, use the actual key: in the command below, TAB represents pressing the Tab key.
$ sed 's/TAB//g' oldfile > newfile
Friendly tip: to confirm that the file you are cleaning actually contains tabs, dump it with od and look for \t in the output:
$ od -c filename
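For example, with a made-up two-line input, tabs show up as \t in the od -c dump:
$ printf 'a\tb\n\t\n' | od -c
0000000   a  \t   b  \n  \t  \n
0000006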

grep -o ".*" file > a; mv a file;

Related

How to delete all characters up to and including a specified character in a text file?

Quick sed question
In a text file how do I remove all characters up to and including the first '[' found in the entire file and nothing else?
I tried
sed "s/^[^\[]*\[//" example.json
but it's stripping out all text on every line.
Alternately,
I have a set of files that are sets of JSON documents. I am trying to import them into elasticsearch, but the first document in the file is an informational document with a non-standard layout that messes up the importing of the rest of the documents. I'm trying to get rid of the first document so the subsequent documents can load properly.
Here is the document:
https://earthquake.usgs.gov/fdsnws/event/1/query?format=geojson&starttime=2014-01-01&endtime=2014-01-02
For anything other than a simple s/old/new on individual strings just use awk. This will work using any awk in any shell on every UNIX box:
awk 'sub(/[^[]*\[/,""){f=1} f' file
I used two sed commands to accomplish the task of removing all characters up to and including the first '[' found:
sed -n '/\[/,$p' example.json | sed -r '1 s/[^\[]*\[(.*)/\1/'
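A toy check of that pipeline on invented input (GNU sed assumed for -r):
printf 'junk\nmore junk [keep this\nand this\n' | sed -n '/\[/,$p' | sed -r '1 s/[^\[]*\[(.*)/\1/'
keep this
and this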

How to reformat file with sed/vim?

I have a .csv file that looks like this.
atomnum,atominfo,metric
238,A-30-CYS-SG,53.7723
889,A-115-CYS-SG,46.2914
724,A-94-CYS-SG,44.6405
48,A-6-CYS-SG,37.2108
630,A-80-CYS-SG,29.574
513,A-64-CYS-SG,23.1925
981,A-127-CYS-SG,19.8903
325,A-41-GLN-OE1,17.6205
601,A-76-CYS-SG,17.5079
I want to change it like this:
atomnum,atominfo,metric
238,C30-SG,53.7723
889,C115-SG,46.2914
724,C94-SG,44.6405
48,C6-SG,37.2108
630,C80-SG,29.574
513,C64-SG,23.1925
981,C127-SG,19.8903
325,Q41-OE1,17.6205
601,C76-SG,17.5079
The part between the commas is an atom identifier: where A-30-CYS-SG is the gamma sulfur of the residue 30, which is a cysteine, in chain A. Residues can be represented with three letters or just one (Table here https://www.iupac.org/publications/pac-2007/1972/pdf/3104x0639.pdf). Basically, I just want to a) change the three letter code to the one letter code, b) remove the chain id (A in this case) and c) put the residue number next to the one letter code.
I've tried matching the patterns between the commas within vim. Something like s%:\(-\d\+\-\)\(\u\+\):\2\1:g gives me c) i.e. (ACYS-30--SG). I do not know how to do a) with vim. I know how to do it with sed and an input file with all the substitute commands in it. But then maybe it's better to do all the work with sed... I am asking if it is possible to do a) in vim.
Thanks
This might work for you (GNU sed):
sed -r '1b;s/$/\n:ALAA:ARGR:ASNN:ASPD:CYSC:GLUE:GLNQ:GLYG:HISH:ILEI:LEUL:LYSK:METM:PHEF:PROP:SERS:THRT:TRPW:TYRY:VALV/;s/,A-([0-9]+)-(...)(.*)\n.*:\2(.).*/,\4\1\3/' file
Append a lookup table to each line and use pattern matching to replace the 3 letter code (keeping the residue number) with its 1 letter code. The lookup key is a colon, followed by the 3 letter code, followed by the 1 letter code.
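To see the lookup-table trick in isolation, here is a minimal sketch on bare identifiers, with only two residues in the table (GNU sed assumed):
printf '%s\n' 'A-30-CYS-SG' 'A-41-GLN-OE1' |
sed -r 's/$/\n:CYSC:GLNQ/;s/^A-([0-9]+)-(...)(.*)\n.*:\2(.).*/\4\1\3/'
which prints
C30-SG
Q41-OE1
The appended table never reaches the output because the trailing .* swallows it.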
Using sed, paste, cut, & bash, given input atoms.csv:
paste -d, <(cut -d, -f1 atoms.csv) \
<(cut -d, -f2 atoms.csv | sed 's/.-//
s/\(.*\)-\([A-Z]\{3\}\)-/\2\1-/
s/^ALA/A/
s/^ARG/R/
s/^ASN/N/
s/^ASP/D/
s/^CYS/C/
s/^GLU/E/
s/^GLN/Q/
s/^GLY/G/
s/^HIS/H/
s/^ILE/I/
s/^LEU/L/
s/^LYS/K/
s/^MET/M/
s/^PHE/F/
s/^PRO/P/
s/^SER/S/
s/^THR/T/
s/^TRP/W/
s/^TYR/Y/
s/^VAL/V/') \
<(cut -d, -f3 atoms.csv)
Output:
atomnum,atominfo,metric
238,C30-SG,53.7723
889,C115-SG,46.2914
724,C94-SG,44.6405
48,C6-SG,37.2108
630,C80-SG,29.574
513,C64-SG,23.1925
981,C127-SG,19.8903
325,Q41-OE1,17.6205
601,C76-SG,17.5079
If you know how to do it in sed why not leverage that knowledge and simply call out from Vim?
:%!sed -e '<your sed script>'
Once you've done that and it works, you can pop it into a Vim function:
function! Transform()
  %!sed -e '<your sed script>'
endfunction
and then just use
:call Transform()
which you can map to a key.
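For example, a mapping along these lines would do (the key choice here is arbitrary):
nnoremap <F5> :call Transform()<CR>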
Simples!

Find and replace text in JSON with sed [duplicate]

I am trying to change the values in a text file using sed in a Bash script with the line,
sed 's/draw($prev_number;n_)/draw($number;n_)/g' file.txt > tmp
This will be in a for loop. Why is it not working?
Variables inside ' don't get substituted in Bash. To get string substitution (or interpolation, if you're familiar with Perl) you would need to change it to use double quotes " instead of the single quotes:
# Enclose the entire expression in double quotes
$ sed "s/draw($prev_number;n_)/draw($number;n_)/g" file.txt > tmp
# Or, concatenate strings with only variables inside double quotes
# This would restrict expansion to the relevant portion
# and prevent accidental expansion for !, backticks, etc.
$ sed 's/draw('"$prev_number"';n_)/draw('"$number"';n_)/g' file.txt > tmp
# Note: a variable containing certain characters (a newline, for example) will break the sed expression
# See link in the further reading section for details
$ a='foo
bar'
$ echo 'baz' | sed 's/baz/'"$a"'/g'
sed: -e expression #1, char 9: unterminated `s' command
Further Reading:
Difference between single and double quotes in Bash
Is it possible to escape regex metacharacters reliably with sed
Using different delimiters for sed substitute command
Unless you need the result in a different file, you can use the -i flag to change the file in place.
Variables within single quotes are not expanded, but within double quotes they are. Use double quotes in this case.
sed "s/draw($prev_number;n_)/draw($number;n_)/g" file.txt > tmp
You could also make it work with eval, but don’t do that!!
This may help:
sed "s/draw($prev_number;n_)/draw($number;n_)/g"
You can use variables as shown below. Here, I wanted to insert the hostname (a system value) into the file: I look for the string look.me and replace that whole line with look.me=<system_name>.
sed -i "s/.*look.me.*/look.me=`hostname`/"
You can also store your system value in another variable and can use that variable for substitution.
host_var=`hostname`
sed -i "s/.*look.me.*/look.me=$host_var/"
Input file:
look.me=demonic
Output of file (assuming my system name is prod-cfm-frontend-1-usa-central-1):
look.me=prod-cfm-frontend-1-usa-central-1
I needed to pull the GitHub tag from my release within GitHub Actions, so that on release it would automatically package up and push the code to Artifactory.
Here is how I did it. :)
- name: Invoke build
  run: |
    # Gets the tag number from the release
    TAGNUMBER=$(echo $GITHUB_REF | cut -d / -f 3)
    # Sets up the string to be used by sed
    FINDANDREPLACE='s/${GITHUBACTIONSTAG}/'$(echo $TAGNUMBER)/
    # Updates the setup.cfg file with the version number
    sed -i $FINDANDREPLACE setup.cfg
    # Installs prerequisites and pushes
    pip install -r requirements-dev.txt
    invoke build
In retrospect I wish I had done this in Python with tests. However, it was fun to do some bash.
Another variant, using printf:
SED_EXPR="$(printf -- 's/draw(%s;n_)/draw(%s;n_)/g' $prev_number $number)"
sed "${SED_EXPR}" file.txt
or in one line:
sed "$(printf -- 's/draw(%s;n_)/draw(%s;n_)/g' $prev_number $number)" file.txt
Using printf to build the replacement expression should be safe against all kinds of weird things, which is why I like this variant.
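A quick check with made-up values shows the expansion landing where intended:
$ prev_number=3 number=7
$ echo 'draw(3;n_)' | sed "$(printf -- 's/draw(%s;n_)/draw(%s;n_)/g' $prev_number $number)"
draw(7;n_)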

How to remove first character from first line using sed

I have three .csv files that are output from saving a query in MS SQLServer. I need to load these files into an Informix database, which requires tacking on a trailing delimiter. That's easy to do using sed:
s/$/,/g
However, each of these files also contains a stray character (displayed by vim, but not by ed or sed) at the first character position of the first line.
I need to get rid of this character. It deletes as a single character using vim's x command. How can I match this character in sed so I can delete it without removing the whole line?
I've tried 1s/^.//g, but that is not working.
Try this instead:
sed -e '1s/^.//' input_file > output_file
Or if you'd like to edit the files in-place:
sed -i -e '1s/^.//' input_file
(Edited) Apparently s/^.// alone doesn't quite do it; updated.
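A quick check on a throwaway file (contents invented for illustration):
$ printf 'Xabc\ndef\n' > input_file
$ sed -e '1s/^.//' input_file
abc
def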
Remove the first character on the first line inplace:
sed -i '1s/^.//' file
try:
sed -i '1s/^.\(.*\)/\1/' file
This should remove the first character from the first line. (Try it without the -i argument first to make sure.)
Edit: I originally posted the following, which would delete the first character from every line. Upon re-reading the question I realized that isn't quite what was wanted.
sed -i 's/^.\(.*\)/\1/' file

Grep for a pattern

I have an HTML file with the following code
<html>
<body>
Test #1 '<%aaa(x,y)%>'
Test #2 '<%bbb(p)%>'
Test #3 '<%pqr(z)%>'
</body>
</html>
Please help me with the regex for a command (grep or awk) which displays the output as follows:
'<%aaa(x,y)%>'
'<%bbb(p)%>'
'<%pqr(z)%>'
I think that sed is a better choice than awk, but it is not completely clear cut.
sed -n '/ *Test #[0-9]* */s///p' <<!
<html>
<body>
Test #1 '<%aaa(x,y)%>'
Test #2 '<%bbb(p)%>'
Test #3 '<%pqr(z)%>'
</body>
</html>
!
You can't use grep; it returns lines that match a pattern, but doesn't normally edit those lines.
You could use awk:
awk '/Test #[0-9]+/ { print $3 }'
The pattern matches the test lines and the action prints the third field. It works because the quoted expression after the test number contains no spaces, so it forms a single field. If there could be spaces inside it, the sed script is the easier route; it already handles them, whereas the awk script would have to be modified to handle them properly.
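Run against the sample file (call it page.html here), the awk one-liner prints exactly the quoted expressions:
$ awk '/Test #[0-9]+/ { print $3 }' page.html
'<%aaa(x,y)%>'
'<%bbb(p)%>'
'<%pqr(z)%>'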
Judging from the comments, the desired output is the material between '<%' and '%>'. So, we use sed, as before:
sed -n '/.*\(<%.*%>\).*/s//\1/p'
On lines which match 'anything-<%-anything-%>-anything', replace the whole line with the part between '<%' and '%>' (including the markers) and print the result. Note that if there are multiple patterns on the line which match, only the last will be printed. (The question and comments do not cover what to do in that case, so this is acceptable. The alternatives are tough and best handled in Perl or perhaps Python.)
If the single quotes on the lines must be preserved, then you can use either of these - I'd use the first with the double quotes surrounding the regex, but they both work and are equivalent. OTOH, if there were expressions involving $ signs or back-ticks in the regex, the single-quotes are better; there are no metacharacters within a single-quoted string at the shell level.
sed -n "/.*\('<%.*%>'\).*/s//\1/p"
sed -n '/.*\('\''<%.*%>'\''\).*/s//\1/p'
The sequence '\'' is how you embed a single quote into a single-quoted string in a shell script. The first quote terminates the current string; the backslash-quote generates a single quote, and the last quote starts a new single-quoted string.
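The same '\'' trick in isolation, outside of sed:
$ echo 'it'\''s a quote'
it's a quote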
The -o option for grep is what you want:
grep -o "'.*'" filename
grep -P "^Test" 1.htm | awk '{print $3}'