Why does appending text to the end of each line replace the first characters instead? - carriage-return

I searched everywhere but I haven't seen anyone with the same issue, let alone a solution to it. I am trying to add text at the end of each line like so:
"Name1";"2913"
"Name2";"2914"
into:
"Name1";"2913";""
"Name2";"2914";""
I have tried it with sed, awk (with gsub) and perl, but each time, instead of adding the ;"" to the end of each line, it just replaces the first 3 characters of each line with it:
"Name1";"2913"
becomes
;""me1";"2913"
It is not limited to just ;""; it happens with anything I try to add at the end of the line.
Code I tried:
cat list | sed 's/$/;""/'
cat list | awk '{gsub(/$/,";\"\"")}1'
each with the same outcome of:
;""me1";"2913"
;""me2";"2914"
Why is this happening?

Looks like OP may have control-M (carriage return) characters in OP's Input_file; in that case, could you please try the following.
awk -v s1="\"" 'BEGIN{FS=OFS=";"} {gsub(/\r/,"");$(NF+1)=s1 s1} 1' Input_file
2nd solution: With sed:
sed 's/\r//g;s/$/;""/' Input_file
Suggestions for OP's code:
We need not use cat with awk or sed; they are capable of reading Input_file by themselves.
You could have control-M characters in your file; you could remove them by doing tr -d '\r' < Input_file > temp && mv temp Input_file, OR directly run the commands mentioned above to get rid of the carriage returns and get your output too.
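The visible symptom makes sense once the carriage returns are accounted for: sed and awk really do append ;"" after the invisible \r at the end of each line, but when the terminal prints that \r it returns the cursor to the start of the line, so the appended ;"" overprints the first three characters. To confirm the carriage returns are there, one option (a minimal sketch; list is the file name from the question) is:
od -c list | head -n 5
Lines ending in \r \n rather than plain \n indicate Windows-style line endings, and the tr or awk/sed commands above will clean them up.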

Related

prefix every header column with string using awk

I have a bunch of big CSVs and I want to prefix every header column with a fixed string. There are more than 500 columns in every file.
suppose my header is:
number;date;customer;key;amount
I tried this awk line:
awk -F';' 'NR==1{gsub(/[^a-z_]/,"input_file.")} { print }'
but I get (note the first column is missing the prefix and the separator is removed):
numberinput_file.dateinput_file.customerinput_file.keyinput_file.amount
expected output:
input_file.number;input_file.date;input_file.customer;input_file.key;input_file.amount
In any awk that'd be:
$ awk 'NR==1{gsub(/^|;/,"&input_file.")} 1' file
input_file.number;input_file.date;input_file.customer;input_file.key;input_file.amount
but sed exists to do simple substitutions like that, e.g. using a sed that has -E to enable EREs (e.g. GNU and BSD sed):
$ sed -E '1s/^|;/&input_file./g' file
input_file.number;input_file.date;input_file.customer;input_file.key;input_file.amount
If you're using GNU tools then you could use either of the above to change all of your CSV files at once with either of these:
awk -i inplace 'NR==1{gsub(/^|;/,"&input_file.")} 1' *.csv
sed -i -E '1s/^|;/&input_file./g' *.csv
Your gsub would replace every character that is not a lowercase letter or underscore, anywhere in the line, with the prefix - including your column separators.
The print can be abbreviated to the common idiom 1 at the very end of the script; it simply means "this condition is true; perform the default action for every line (i.e. print it)". This is just a stylistic change.
awk -F';' 'NR==1{
sub(/^/, "input_file."); gsub(/;/, ";input_file."); }
1' filename
If you want to perform this on multiple files, probably put a shell loop around it. If you only want to concatenate everything to standard output, you can give all the files to Awk in one go (in which case you probably don't want to print the header line for any file after the first; maybe change the 1 to NR==1 || FNR != 1).
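A minimal sketch of such a shell loop, assuming the files end in .csv and that rewriting each file through a temporary copy is acceptable (the temp-file dance is an assumption, not part of the answer above):
for f in *.csv; do
  awk -F';' 'NR==1{ sub(/^/, "input_file."); gsub(/;/, ";input_file.") } 1' "$f" > "$f.tmp" && mv "$f.tmp" "$f"
done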
I would use GNU AWK in the following way. Let file.txt content be
number;date;customer;key;amount
1;2;3;4;5
6;7;8;9;10
then
awk 'BEGIN{FS=";";OFS=";input_file."}NR==1{$1="input_file." $1}{print}' file.txt
output
input_file.number;input_file.date;input_file.customer;input_file.key;input_file.amount
1;2;3;4;5
6;7;8;9;10
Explanation: I set OFS to ; followed by the prefix. Then in the first line I add the prefix to the first column, which triggers record rebuilding, so every field separator is replaced by OFS. No modification is done to any other line, thus they are printed as-is.
(tested in GNU Awk 5.0.1)
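The key detail is that assigning to any field makes awk rebuild $0 with OFS between all fields. A minimal illustration of that rebuild (a hypothetical one-liner, not taken from the question's data):
echo 'a;b;c' | awk 'BEGIN{FS=";"; OFS="-"} {$1=$1} 1'
prints a-b-c, even though only $1 was (re)assigned.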
Also with awk, using a for loop and printf:
awk 'BEGIN{FS=OFS=";"} NR==1{for (i=1; i<=NF; i++) printf "%s%s", "input_file." $i, (i<NF ? OFS : ORS)}' file
input_file.number;input_file.date;input_file.customer;input_file.key;input_file.amount

Change column if some regex expression is true with awk or sed

I have a file (let's call it data.csv) similar to this
"123","456","ud,h-match","moredata"
with many rows in the same format and embedded commas. What I need to do is look at the third column and see if it matches an expression. In this case I want to know if the third column has "match" anywhere (which it does). If there is a match, then I want to replace the whole column with something else, like "replaced". So, to relate it to the example data.csv file, I would want it to look like this.
"123","456","replaced","moredata"
Ideally, I want the file data.csv itself to be changed (time is of the essence since I have a big file) but it's also fine if you write it to another file.
Edit:
I have tried using awk:
awk -F'","' -OFS="," '{if(tolower($3) ~ "stringI'mSearchingFor"){$3="replacement"; print}else print}' file
but it doesn't change anything. If I remove the OFS portion then it works, but the output gets separated by spaces and the columns don't get enclosed in double quotes.
Depending on the answer to my question about what you mean by column, this may be what you want (uses GNU awk for FPAT):
$ awk -v FPAT='[^,]+|"[^"]+"' -v OFS=',' '$3~/match/{$3="\"replaced\""} 1' file
"123","456","replaced","moredata"
Use awk -i inplace ... if you want to do "in place" editing.
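For example, to overwrite data.csv directly (a sketch assuming GNU awk 4.1 or later, which is what provides -i inplace):
awk -i inplace -v FPAT='[^,]+|"[^"]+"' -v OFS=',' '$3~/match/{$3="\"replaced\""} 1' data.csv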
With any awk (but slightly more fragile than the above since it leaves the leading/trailing " on the first and last fields, and has no -i inplace):
$ awk 'BEGIN{FS=OFS="\",\""} $3~/match/{$3="replaced"} 1' file
"123","456","replaced","moredata"

Find a string between 2 other strings in document

I have found a ton of solutions to do what I want, with only one exception.
I need to search a .html document and pull a string.
The line containing the string will look like this (1 line, no newlines)
<script type="text/javascript">g_initHeader(0);LiveSearch.attach(ge('oh2345v5ks'));var _ = g_items;_[60]={icon:'INV_Chest_Leather_09',name_enus:'Layered Tunic'};_[6076]={icon:'INV_Pants_11',name_enus:'Tapered Pants'};_[3070]={icon:'INV_Misc_Cape_01',name_enus:'Ensign Cloak'};</script>
The text I need to get is
INV_Chest_Leather_09
When I use awk, grep, and sed, I extract the data between icon:' and ',name_
The problem is, all three of these scripts scan the entire line and use the last occurrence of ',name_, so I end up with
INV_Chest_Leather_09',name_enus:'Layered
Tunic'};_[6076]={icon:'INV_Pants_11',name_enus:'Tapered
Pants'};_[3070]={icon:'INV_Misc_Cape_01
Here's the last one I tried
grep -Po -m 1 "(?<=]={icon:').*(?=',name_)"
I've tried awk and sed too, and I don't really have a preference of which one to use.
So basically, I need to search the entire html file, find the first occurrence of icon:', and extract the text right after it up to the first occurrence of ',name_ that follows it.
With GNU awk for the 3rd arg to match():
$ awk 'match($0,/icon:\047([^\047]+)/,a){print a[1]}' file
INV_Chest_Leather_09
Simple perl approach:
perl -ne 'print "$1\n" if /\bicon:\047([^\047]+)/' file
The output:
INV_Chest_Leather_09
The .* in your regular expression is a greedy matcher, so the pattern will match till the end of the string and then backtrack to match the ,name_ portion. You could try replacing the .* with something like [^,]* (i.e. match anything except comma):
grep -Po -m 1 "(?<=]={icon:')[^,]*(?=',name_)"
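If you only want the very first occurrence on the line, another option is a non-greedy .*? combined with head; this is a sketch assuming GNU grep with PCRE support (-P), not one of the commands above:
grep -Po "(?<=\{icon:').*?(?=',name_)" file | head -n 1
The lazy .*? stops at the first following ',name_, and head -n 1 keeps only the first of the matches that grep -o prints for that line.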

How to convert linefeed into literal "\n"

I'm having some trouble converting my file to a properly formatted json string.
Have been fiddling with sed for ages now, but it seems to mock me.
Am working on RHEL 6, if that matters.
I'm trying to convert this file (content):
Hi there...
foo=bar
tomàto=tomáto
url=http://www.stackoverflow.com
Into this json string:
{"text":"Hi there...\n\nfoo=bar\ntomàto=tomáto\nurl=http://www.stackoverflow.com"}
How would I replace the actual line feeds with the literal '\n' characters? This is where I'm utterly stuck!
I've been trying to convert the line feeds into ";" first and then into a literal "\n". I tried loops for each row in the file. Can't make it work...
Some help is much appreciated!
Thanks!
sed is for simple substitutions on individual lines, that is all. Since sed works line by line, your sed script doesn't see the line endings, and so you can't get it to change them without jumping through hoops with arcane language constructs and convoluted logic that haven't been useful since the mid-1970s when awk was invented.
This will change all newlines in your input file to the string \n:
$ awk -v ORS='\\n' '1' file
Hi there...\n\nfoo=bar\ntomàto=tomáto\nurl=http://www.stackoverflow.com\n
and this will do the rest:
$ awk -v ORS='\\n' 'BEGIN{printf "{\"text\":\""} 1; END{printf "\"}\n"}' file
{"text":"Hi there...\n\nfoo=bar\ntomàto=tomáto\nurl=http://www.stackoverflow.com\n"}
or this if you have a newline at the end of your input file but don't want it to become a \n string in the output:
$ awk -v ORS='\\n' '{rec = (NR>1 ? rec ORS : "") $0} END{printf "{\"text\":\"%s\"}\n", rec}' file
{"text":"Hi there...\n\nfoo=bar\ntomàto=tomáto\nurl=http://www.stackoverflow.com"}
With GNU sed:
sed ':a;N;s/\n/\\n/;ta' file | sed 's/.*/{"text":"&"}/'
Output:
{"text":"Hi there...\n\nfoo=bar\ntomàto=tomáto\nurl=http://www.stackoverflow.com"}
Use awk for this:
awk -v RS=^$ '{gsub(/\n/,"\\n");sub(/^/,"{\"text\":\"");sub(/\\n$/,"\"}")}1' file
Output
{"text":"Hi there...\n\nfoo=bar\ntomàto=tomáto\nurl=http://www.stackoverflow.com"}
awk to the rescue!
$ awk -vRS='\0' '{gsub("\n","\\n");
print "{\"text\":\"" $0 "\"}"}' file
{"text":"Hi there...\n\nfoo=bar\ntomàto=tomáto\nurl=http://www.stackoverflow.com\n"}
This might work for you (GNU sed):
sed '1h;1!H;$!d;x;s/.*/{"text":"&"}/;s/\n/\\n/g' file
Slurp the file into memory and use pattern matching to manipulate the file to the desired format.
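A commented sketch of the same approach, for readers unfamiliar with the hold space (GNU sed; the layout and comments are added here, the commands are the ones above):
sed '
  # line 1: copy it into the hold space
  1h
  # every later line: append it to the hold space
  1!H
  # until the last line, print nothing
  $!d
  # on the last line, swap the accumulated hold space into the pattern space
  x
  # wrap the whole slurped text, then turn the embedded newlines into literal \n
  s/.*/{"text":"&"}/
  s/\n/\\n/g
' file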
The simplest (and most elegant?) solution :) :
#!/bin/bash
in=$(perl -pe 's/\n/\\n/' "$1")
cat<<EOF
{"text":"$in"}
EOF
Usage:
./script.sh file.txt
Output :
{"text":"Hi there...\n\nfoo=bar\ntomàto=tomáto\nurl=http://www.stackoverflow.com\n"}

awk start edits from line 2?

I have a testfile.csv in which I want to replace all values in the third column with a new value, without touching the header:
testfile.csv
col1,col2,col3
a,a,a
b,b,b
I tried the code below, where I specified NR>1:
cat test_file.csv| awk -F"," 'NR>1{OFS=",";{$3="10/1/2015"} print}' >xx
My output gives me the below, but it also edited the header of col3, which is not what I want:
xx
col1,col2,10/1/2015
a,a,10/1/2015
b,b,10/1/2015
I want this:
col1,col2,col3
a,a,10/1/2015
b,b,10/1/2015
Your script should skip the first line, so you should not even see col1,col2,col3 in the output. Please make sure that the first line of testfile.csv really starts with col1,col2,col3.
If you also want to see the first line, you need to:
awk -F"," '{ if (NR>1) { OFS=","; $3="10/1/2015"; print } else print }' testfile.csv
I didn't get the same output as you
a,a,10/1/2015
b,b,10/1/2015
which is what I would have expected.
So the NR>1 worked for me.
Try
echo "col1,col2,col3
a,a,a
b,b,b" \
| awk -F"," -vOFS="," 'NR==1{print};NR>1{$3="10/1/2015"; print}'
output
col1,col2,col3
a,a,10/1/2015
b,b,10/1/2015
Also, if your data has been created in an MS Windows environment and you are now processing it in Linux, be sure to remove the \r chars, e.g. with dos2unix myDataFile.txt (or strip them inside awk, as sketched below).
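A minimal sketch of stripping the carriage returns inside the same awk command (this assumes the \r chars, if present, sit at the end of each line; it is not part of the answer above):
awk -F"," -v OFS="," '{sub(/\r$/,"")} NR>1{$3="10/1/2015"} 1' testfile.csv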
IHTH