regsub 1 space between words to "_" - tcl

Can anyone help me use regsub to replace a single space between words with "_" (Tcl)?
The original line:
CTS__94331/I (DCCKBD8BWP240H11P57PDULVT) 0.025 0.002 & 0.352 r
The required line:
CTS__94331/I_(DCCKBD8BWP240H11P57PDULVT) 0.025 0.002_& 0.352_r
I tried the command below, but it replaces every space with "_", and I want only a single space between words to be replaced with "_".
regsub -all {\s{1}} $a _

You need to be sneakier in your regular expression and use a different replacement.
regsub -all {(\S)\s(?=\S)} $a {\1_}
The regular expression matches (and captures) a non-whitespace followed by a whitespace and then requires (with a lookahead constraint) that the next character is a non-whitespace without matching it. This is replaced with the first character you matched (replacing it with itself) and the underscore.
Normally for this sort of thing you'd use \y\s\y → _, but that doesn't work in your use case because it doesn't handle 0.002 & correctly (& is not a word character).
If Tcl's RE engine supported lookbehind constraints (it doesn't) this would be much simpler to solve as you wouldn't need the trickery of replacing the character before the space with itself.
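As a quick check of the command above, here is a minimal sketch. The input is an assumption: the wider gaps are written as two spaces each, guessed so that the result has the shape of the requested line; the real data may differ.

```tcl
# Assumed input: the two-space gaps are a guess; only single spaces should change.
set a {CTS__94331/I (DCCKBD8BWP240H11P57PDULVT)  0.025  0.002 &  0.352 r}

# (\S)    capture the non-whitespace character before the space
# \s      the one whitespace character to replace
# (?=\S)  lookahead: the next character must be non-whitespace (not consumed)
set result [regsub -all {(\S)\s(?=\S)} $a {\1_}]
puts $result
# prints: CTS__94331/I_(DCCKBD8BWP240H11P57PDULVT)  0.025  0.002_&  0.352_r
```

Because the lookahead does not consume the following character, that character is still available as the captured \S of the next match, so single-character words such as & and r are handled too.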

regsub 1 space between words to "_"
For this case, you may consider string map:
% string map {" " "_"} {CTS__94331/I (DCCKBD8BWP240H11P57PDULVT)}
CTS__94331/I_(DCCKBD8BWP240H11P57PDULVT)
However, your example data is inconsistent: I do not see an example where more than one whitespace character would be replaced.
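Note that string map substitutes every occurrence, so a run of spaces becomes a run of underscores. A minimal sketch with a made-up input:

```tcl
# Hypothetical input: one single space, then a run of two spaces.
set line {a b  c}
set mapped [string map {" " "_"} $line]
puts $mapped   ;# a_b__c
```

If runs of spaces must stay untouched, string map alone is probably not enough.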

How can I exclude the last regex match check on this regex?

How can I match the comma between each key:value pair but exclude a trailing comma after the last pair? Please let me know if you have a cleaner regex, as mine seems a little messy. I am new to writing regex.
I need to match this format, where the content is a string:string map:
{\"key1\":\"val1\",....N}
or
{"key1":"val1",....N}
Example
{\"key1\":\"val1\",\"key2\":\"val2\",\"k3\":\"v3\",\"k4\":\"v4\"}
What I have for my regex:
^[{]((["]|[\\]["])[a-zA-Z0-9]+(["]|[\\]["])[:](["]|[\\]["])[a-zA-Z0-9]+(["]|[\\]["])[,])+[}]$
What my regex currently matches (I do not want the last comma):
{"key1":"val1","key2":"val2","k3":"v3","k4":"v4",}
It's usually done by requiring the first key/value pair, then making all the others
optional, each with a prepended separator ,
^{\\?"[a-zA-Z0-9]+\\?":\\?"[a-zA-Z0-9]+\\?"(?:,\\?"[a-zA-Z0-9]+\\?":\\?"[a-zA-Z0-9]+\\?")*}$
https://regex101.com/r/cqJk8q/1
^
{
\\? " [a-zA-Z0-9]+ \\? ": \\? " [a-zA-Z0-9]+ \\? "
(?:
, \\? " [a-zA-Z0-9]+ \\? ": \\? " [a-zA-Z0-9]+ \\? "
)*
}
$
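The "required first pair, then optional comma-prefixed pairs" idea can be sketched in Tcl, the language used elsewhere on this page. This is only an illustration: the proc name isStringMap is made up, and the demo linked above may run on a different regex engine.

```tcl
# Hypothetical helper: returns 1 if text looks like {"k":"v","k2":"v2",...}
# (quotes optionally preceded by a backslash), 0 otherwise.
proc isStringMap {text} {
    # One key/value pair; \\? is an optional literal backslash before a quote.
    set pair {\\?"[a-zA-Z0-9]+\\?":\\?"[a-zA-Z0-9]+\\?"}
    # First pair is required; each later pair must be preceded by a comma.
    set re {^\{}
    append re $pair {(?:,} $pair {)*\}$}
    return [regexp -- $re $text]
}

puts [isStringMap {{"key1":"val1","key2":"val2"}}]    ;# 1
puts [isStringMap {{"key1":"val1","key2":"val2",}}]   ;# 0 (trailing comma)
puts [isStringMap {{\"key1\":\"val1\"}}]              ;# 1
```

Because the comma belongs to the start of each repeated pair rather than the end, a trailing comma has nothing to attach to and the match fails.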
Instead of ending with
[,])+[}]$
end with
(,(?!$)|}$))+$
See live demo.
Also, some simplification you can do:
[:] is identical to just :, etc for all like this
(["]|[\\]["]) is identical to \\?"
[a-zA-Z0-9] is almost equivalent to \w (\w also allows the underscore - if that's a problem, just leave [a-zA-Z0-9])
So, your whole regex could be refactored to:
^\{(\\?"\w+\\?":\\?"\w+\\?"(,(?!$)|}$))+$
Your regex, however, allows mismatched escaping, e.g.
`{\"key1": "value1"}`
`{"key1": "value1\"}`
To fix that, capture the optional backslash and use a back reference to it on the other end so they must be balanced:
^\{(?:(\\?)"\w+\1":(\\?)"\w+\2"(,(?!$)|}$))+$
See live demo (with all input varieties).
To also restrict input to either all quotes escaped or no quotes escaped, add a negative lookahead anchored to the start, (?!.*\\".*[^\\]"|.*[^\\]".*\\"), which rejects one type followed by the other, or vice versa:
^(?!.*\\".*[^\\]"|.*[^\\]".*\\")\{(?:\\?"\w+\\?":\\?"\w+\\?"(,(?!$)|}$))+$
See live demo.
The previous back reference based check for balanced escaping for a key/value pair has been removed because the look ahead now enforces balanced quoting.

TCL regsub multiple special characters in one shot

Is there a way to add escape '\' into a string with multiple special characters?
Example input : a/b[1]/c/d{3}
Desired outcome : a\/b\[1\]\/c\/d\{3\}
I've done it in multiple regsubs one special character at a time. But is there a way to do it in one shot?
I would simply escape all non-word characters:
set input {a/b[1]/c/d{3}}
set output [regsub -all {\W} $input {\\&}]
puts $output
a\/b\[1\]\/c\/d\{3\}
ref: https://tcl.tk/man/tcl8.6/TclCmd/regsub.htm and https://tcl.tk/man/tcl8.6/TclCmd/re_syntax.htm
The general approach to use is to build a RE character set ([…]) and use that. You have to be a bit careful with those in some cases (some characters are special in them, especially ^, ], - and \), but it's not too difficult.
regsub -all {[][/{}]} $input {\\&}
However, if you can use character classes (such as \W or [^\w]) then it's a lot simpler and easier to read. Most common cases of needing to apply backslashes work with those.
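The two approaches differ in which characters they touch. A minimal sketch, using a made-up input that also contains a space and a dot:

```tcl
# Hypothetical input: like the question's string, plus a space and a dot.
set input {a/b[1]/c d.e{3}}

# \W escapes every non-word character, including the space and the dot.
set escapedAll [regsub -all {\W} $input {\\&}]
puts $escapedAll    ;# a\/b\[1\]\/c\ d\.e\{3\}

# The explicit set escapes only ] [ / { and }.
set escapedSet [regsub -all {[][/{}]} $input {\\&}]
puts $escapedSet    ;# a\/b\[1\]\/c d.e\{3\}
```

In the replacement string, \\ stands for one literal backslash and & for the matched text.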

escaping "\" in the end of a string in tcl

I want to create a string that ends with "\". For example:
set str {",$"23"^##$\'"\}
This won't work because tcl thinks that I'm escaping the "}" here.
So I tried to escape the "\"
set str {",$"23"^##$\'"\\}
but now the value of str is ",$"23"^##$\'"\\.
I want the value of str to be with one "\" in the end: ",$"23"^##$\'"\
How can I do that while creating the string inside {}?
The easiest way I could think is to use format:
puts [format {",$"23"^##$\'"%s} \\]
",$"23"^##$\'"\
I think you could also use %c with the ASCII code of the backslash (92).
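A minimal sketch of the %c variant, where 92 is the ASCII code of the backslash:

```tcl
# %c inserts the character whose code is given as the argument (92 is a backslash).
set str [format {",$"23"^##$\'"%c} 92]
puts $str
puts [string length $str]   ;# 15
```

The braced format string itself contains no trailing backslash, so the closing brace is not escaped.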
You can't; this is one of the small number of cases that can't be quoted that way. Here's the proof from an interactive session:
% gets stdin s
",$"23"^##$\'"\
15
% list $s
\",\$\"23\"^##\$\\'\"\\
Other such cases are things like strings with unbalanced braces, and so on. They really don't come up very often. The backslash form above generated by list is the alternative (and you can put it in double quotes if you wish).
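Two ways of writing the same value without a single braced literal, as a minimal sketch (the variable names s1 and s2 are made up):

```tcl
# Option 1: double quotes, with every quote, dollar sign and backslash escaped.
set s1 "\",\$\"23\"^##\$\\'\"\\"

# Option 2: keep most of the braced literal and append the final backslash.
set s2 {",$"23"^##$\'"}
append s2 \\

puts $s1
puts [expr {$s1 eq $s2}]      ;# 1
puts [string length $s1]      ;# 15
```

The second form keeps most of the readability of the braced literal, at the cost of an extra command.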

How to append two string in TCL with a space between them?

I'm trying to append two strings in Tcl. I'm reading from a CSV file and setting values to variables, which I will then assign in my application. I tried the following:
set vMyvalue [lindex $lsLine 17]
append vMyvalue " [lindex $lsLine 18]"
It gives me the expected result. For example, if I have the values 250 and km in the 17th and 18th positions in the CSV, I get
250 km
But the problem is that when the 17th and 18th values are empty, it still adds the space, and my application won't allow me to assign a space for that value. How can I resolve this? I just started working in Tcl and am not aware of many functions.
I think the most intuitive way to handle cases like this, if you don't know a function that does it (for example, when you are joining two strings with some character but want something different to happen if either of them is empty), is to use if. In this case:
if {$vMyvalue eq " "} {set vMyvalue ""}
If you want to make your code somewhat shorter, you can make use of the functions lrange (list range), join and string:
set vMyvalue [string trim [join [lrange $lsLine 17 18] " "]]
lrange returns a list of elements from the list $lsLine between indices 17 to 18 inclusive, then join literally joins those elements with a space, and last, string trim cleans up any leading and trailing spaces (removing the space completely if it is the only character in the string).
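A minimal sketch of how that one-liner behaves for full, empty, and half-empty fields. The proc name joinFields and the filler rows are made up for illustration:

```tcl
# Hypothetical helper wrapping the one-liner above.
proc joinFields {lsLine} {
    # Join elements 17 and 18 with a space, then drop leading/trailing whitespace.
    return [string trim [join [lrange $lsLine 17 18] " "]]
}

# Hypothetical rows: 17 filler fields (indices 0..16), then the two fields of interest.
set filler [lrepeat 17 x]
set full   [concat $filler [list 250 km]]
set empty  [concat $filler [list "" ""]]
set half   [concat $filler [list 250 ""]]

puts "<[joinFields $full]>"    ;# <250 km>
puts "<[joinFields $empty]>"   ;# <>
puts "<[joinFields $half]>"    ;# <250>
```

Note that list indices are zero-based, so index 17 is the 18th field of the row.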
There are several ways to do this. The minimum modification from the code you already have is probably to trim the result. Trim removes leading and trailing whitespace but if it's only whitespace it would trim it to an empty string. So:
set vMyvalue [string trim $vMyvalue]

How to trim two words from right of a string

I want to remove two words from right of a string.
For example:
set str "sachin is the pride of india"
I need to remove india and of from right and there should be no space after that.
I have tried using string trimright.
The string trimright command is exactly the wrong tool for this; it treats its trim argument as a set of characters to remove, not a literal. The simplest way of doing this is with lreplace, provided the string doesn't contain list metacharacters and you don't care about the number of spaces.
set shortened [lreplace $str end-1 end]
If you need to do it reliably, regular expressions are the tool of choice.
set shortened [regsub {\s*\S+\s+\S+\s*$} $str ""]
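The difference between the two commands shows up when the spacing is irregular. A minimal sketch; the second input, with a doubled space, is made up:

```tcl
set str "sachin is the pride of india"
puts [lreplace $str end-1 end]                 ;# sachin is the pride
puts [regsub {\s*\S+\s+\S+\s*$} $str ""]       ;# sachin is the pride

# Hypothetical input with two spaces after the first word.
set spaced "sachin  is the pride of india"
set viaList  [lreplace $spaced end-1 end]
set viaRegex [regsub {\s*\S+\s+\S+\s*$} $spaced ""]
puts "<$viaList>"     ;# <sachin is the pride>   (spacing normalized by the list conversion)
puts "<$viaRegex>"    ;# <sachin  is the pride>  (original spacing kept)
```

The regular expression also removes the whitespace in front of the two removed words, so nothing is left trailing.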
Use regsub for this. Please.