changing csh regexp match code to tcl - tcl

I need to change the following piece of code in shell to tcl. Please help.
if (` expr $_f : proj_lp_ ` == 8) then
I need the tcl equivalent of the condition inside the if condition.
Thanks!

See the expr manual page, where it states:
STRING : REGEXP
anchored pattern match of REGEXP in STRING
So your _f variable holds a string and you are matching it against the literal proj_lp_. The result is the length of the match, so comparing it with 8 (the length of proj_lp_) tests whether $_f starts with that prefix. In Tcl that could be if {[regexp {^proj_lp_} $_f]} { ... } since you only care whether it matches. You could also just use if {[string match "proj_lp_*" $_f]} { ... }. The expr(1) page says this is an anchored regexp, hence the added caret. Both examples match only at the start of the input string (i.e. they are anchored).
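To make the expr(1) semantics concrete, here is a minimal sketch in Python (not csh or Tcl) that mimics STRING : REGEXP as an anchored match whose result is the number of characters matched. The helper name expr_match is hypothetical, and Python's re syntax differs from the basic regular expressions that expr uses, so treat it as an illustration only.

```python
import re

def expr_match(string: str, pattern: str) -> int:
    """Mimic `expr STRING : REGEXP`: anchored match, returns the matched length (0 if none)."""
    m = re.match(pattern, string)  # re.match only matches at the start of the string
    return len(m.group(0)) if m else 0

print(expr_match("proj_lp_alpha", "proj_lp_"))     # 8
print(expr_match("my_proj_lp_alpha", "proj_lp_"))  # 0
```

A result of 8 therefore means the whole prefix proj_lp_ matched at the start, which is the condition the Tcl regexp and string match versions test.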

Related

How do I use the ternary operator to add an optional part to an anonymous string in tcl?

I am new to tcl (sorry if the answer is obvious, but reading tutorials and documentation did not help). I have a statement in tcl that says:
startupitem.start "foo
\tbar"
What I would like to do is have the "foo" part become optional, depending on the outcome of
[variant_isset "alice"] using the ternary operator and without using variables.
I've tried several things along the lines
startupitem.start "[variant_isset """alice"""?"""foo\n\t""":""""""] bar"
(of course with all kinds of escapes and combinations, and with double quotes inside the double quotes), but I haven't succeeded.
The outcome if the variant_isset expression returns true is that it is equivalent to
startupitem.start "bar"
You might prefer to use the if command (which is very much the command version of the ternary operator), whose result is the result of the body script it evaluates. If there is no else clause and the condition is false, the result is the empty string:
startupitem.start "[if {[variant_isset alice]} {string cat "foo\n\t"}]bar"
Or you can build a list and then join it:
set items {}
if {[variant_isset alice]} {
lappend items "foo"
}
lappend items bar
startupitem.start [join $items "\n\t"]
This second approach tends to work particularly well when things get complicated.
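The build-a-list-then-join idea is not specific to Tcl. As a rough sketch in Python (the function name start_text and its boolean parameter are hypothetical stand-ins for startupitem.start and variant_isset), the optional part is appended only when the condition holds, and the separator is applied once at the end:

```python
def start_text(variant_is_set: bool) -> str:
    """Assemble the text from optional and mandatory parts, joined by newline + tab."""
    items = []
    if variant_is_set:
        items.append("foo")   # optional part
    items.append("bar")       # always present
    return "\n\t".join(items)

print(repr(start_text(True)))   # 'foo\n\tbar'
print(repr(start_text(False)))  # 'bar'
```

Because the separator only appears between items, no stray newline or tab is left behind when the optional part is missing.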
You want to check out Tcl's expr command, which introduces Tcl's expression sub-language, including what you call the "ternary" operator, ?:
startupitem.start "[expr {[variant_isset "alice"] ? "foo\n\t" : ""}]bar"
If your Tcl is recent enough (string cat was added in Tcl 8.6.2), you may prefer to assemble the string with string cat rather than by substitution inside a quoted string:
string cat [expr {[variant_isset "alice"] ? "foo\n\t" : ""}] "bar"

String.IndexOf() returns unexpected value - cannot extract substring between two search strings

I am writing a script that adjusts some proper names in a web story so that my reading tool pronounces them correctly.
I get the content of a webpage via
$webpage = (Invoke-WebRequest -URI 'https://wanderinginn.com/2018/03/20/4-20-e/').Content
This $webpage should be of type String.
Now
$webpage.IndexOf('<div class="entry-content">')
returns the correct value, yet
$webpage.IndexOf("Previous Chapter")
returns an unexpected value, and I would like an explanation of why, or of how I can find the error myself.
In theory, the script should cut out the "body" of the page, run it through a list of proper nouns I want to replace, and write the result to an .htm file.
It all works, except for the value of IndexOf("Prev...").
Edit:
After Invoke-WebRequest I can
Set-Clipboard $webpage
and paste this into Notepad++, where I can find both 'div class="entry-content"' and 'Previous Chapter'.
If I do something like
Set-Clipboard $webpage.substring(
$webpage.IndexOf('<div class="entry-content">'),
$webpage.IndexOf('Previous Chapter')
)
I would expect Powershell to correctly determine both first instances of those strings and cut between. Therefore my clipboard should now have my desired content, yet the string goes further than the first occurrence.
tl;dr
You had a misconception about how the String.Substring() method works: the second argument must be the length of the substring to extract, not the end index (character position) - see below.
As an alternative, you can use a more concise (albeit more complex) regex operation with -replace to extract the substring of interest in a single operation - see below.
Overall, it's better to use an HTML parser to extract the desired information, because string processing is brittle (HTML allows variations in whitespace, quoting style, ...).
As Lee_Dailey points out, you had a misconception about how the String.Substring() method works: its arguments are:
a starting index (0-based character position),
from which a substring of a given length should be returned.
Instead, you tried to pass another index as the length argument.
To fix this, you must subtract the lower index from the higher one, so as to obtain the length of the substring you want to extract:
A simplified example:
# Sample input from which to extract the substring
# '>>this up to here'
# or, better,
# 'this up to here'.
$webpage = 'Return from >>this up to here<<'
# WRONG (your attempt):
# *index* of 2nd substring is mistakenly used as the *length* of the
# substring to extract, which in this case even *breaks*, because a length
# that exceeds the bounds of the string is specified.
$webpage.Substring(
$webpage.IndexOf('>>'),
$webpage.IndexOf('<<')
)
# OK, extracts '>>this up to here'
# The difference between the two indices is the correct length
# of the substring to extract.
$webpage.Substring(
($firstIndex = $webpage.IndexOf('>>')),
$webpage.IndexOf('<<') - $firstIndex
)
# BETTER, extracts 'this up to here'
$startDelimiter = '>>'
$endDelimiter = '<<'
$webpage.Substring(
($firstIndex = $webpage.IndexOf($startDelimiter) + $startDelimiter.Length),
$webpage.IndexOf($endDelimiter) - $firstIndex
)
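The index-versus-length distinction can also be checked outside PowerShell. The following is a minimal Python sketch, not the .NET implementation: the helper substring is a hypothetical stand-in that mimics the (start, length) contract described above, and the last line shows the end-index style (Python slicing) that the original attempt implicitly assumed.

```python
def substring(s: str, start: int, length: int) -> str:
    """Hypothetical helper mimicking .NET String.Substring(startIndex, length)."""
    if start < 0 or length < 0 or start + length > len(s):
        raise ValueError("start and length must refer to a location within the string")
    return s[start:start + length]

page = "Return from >>this up to here<<"
start_delim, end_delim = ">>", "<<"

first = page.find(start_delim) + len(start_delim)  # index just past '>>'
last = page.find(end_delim)                        # index of '<<'

print(substring(page, first, last - first))  # this up to here
print(page[first:last])                      # same result via an end-index slice
```

Passing the second index directly as the length, as in substring(page, page.find(">>"), page.find("<<")), raises the error in this sketch, which mirrors the failure of the WRONG example.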
General caveats re .Substring():
In the following cases this .NET method throws an exception, which PowerShell surfaces as a statement-terminating error; that is, by default the statement itself is terminated, but execution continues:
If you specify an index that is outside the bounds of the string (a 0-based character position less than 0 or greater than the length of the string; an index equal to the length is allowed and yields an empty string):
'abc'.Substring(4) # ERROR "startIndex cannot be larger than length of string"
If you specify a length whose endpoint would fall outside the bounds of the string (if the index plus the length yields an index that is greater than the length of the string).
'abc'.Substring(1, 3) # ERROR "Index and length must refer to a location within the string"
That said, you could use a single regex (regular expression) to extract the substring of interest, via the -replace operator:
$webpage = 'Return from >>this up to here<<'
# Outputs 'this up to here'
$webpage -replace '^.*?>>(.*?)<<.*', '$1'
The key is to have the regex match the entire string and extract the substring of interest via a capture group ((...)) whose value ($1) can then be used as the replacement string, effectively returning just that.
For more information about -replace, see this answer.
Note: In your specific case an additional tweak is needed, because you're dealing with a multiline string:
$webpage -replace '(?s).*?<div class="entry-content">(.*?)Previous Chapter.*', '$1'
Inline option ((?...)) s ensures that metacharacter . also matches newline characters (so that .* matches across lines), which it doesn't by default.
Note that you may have to apply escaping to the search strings to embed in the regex, if they happen to contain regex metacharacters (characters with special meaning in the context of a regex):
With embedded literal strings, \-escape characters as needed; e.g., escape .txt as \.txt
If a string to embed comes from a variable, apply [regex]::Escape() to its value first; e.g.:
$var = '.txt'
# [regex]::Escape() yields '\.txt', which ensures
# that '.txt' doesn't also match '_txt'
'a_txt a.txt' -replace ('a' + [regex]::Escape($var)), 'a.csv'
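The same two ideas, letting . match newlines and escaping literal search strings, carry over to other regex engines. Here is a hedged sketch in Python rather than PowerShell; the sample page text is invented for illustration, and re.escape plays the role that [regex]::Escape() plays above.

```python
import re

# Invented multiline sample standing in for the downloaded page.
page = 'header\n<div class="entry-content">line one\nline two\nPrevious Chapter\nfooter'

start = re.escape('<div class="entry-content">')  # escape any regex metacharacters
end = re.escape('Previous Chapter')

# (?s) makes '.' match newlines too, so the lazy group can span several lines.
m = re.search(f'(?s){start}(.*?){end}', page)
body = m.group(1)
print(repr(body))  # 'line one\nline two\n'

# Escaping matters: an unescaped '.' also matches '_'.
print(re.sub('a' + re.escape('.txt'), 'a.csv', 'a_txt a.txt'))  # a_txt a.csv
print(re.sub('a' + '.txt', 'a.csv', 'a_txt a.txt'))             # a.csv a.csv
```

Using a search with a capture group, instead of replacing the whole string, also avoids returning the unmodified input when nothing matches; in that case m would be None and should be checked before use.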

Mysql RegExp select all string with matching sequence if the string is 5 character long [duplicate]

I need a regex that will only find matches where the entire string matches my query.
For instance if I do a search for movies with the name "Red October" I only want to match on that exact title (case insensitive) but not match titles like "The Hunt For Red October". Not quite sure I know how to do this. Anyone know?
Thanks!
Try the following regular expression:
^Red October$
By default, regular expressions are case sensitive. The ^ marks the start of the matching text and $ the end.
Generally, and with default settings, ^ and $ anchors are a good way of ensuring that a regex matches an entire string.
A few caveats, though:
If you have alternation in your regex, be sure to enclose your regex in a non-capturing group before surrounding it with ^ and $:
^foo|bar$
is of course different from
^(?:foo|bar)$
Also, ^ and $ can take on a different meaning (start/end of line instead of start/end of string) if certain options are set. In text editors that support regular expressions, this is usually the default behaviour. In some languages, especially Ruby, this behaviour cannot even be switched off.
Therefore there is another set of anchors that are guaranteed to only match at the start/end of the entire string:
\A matches at the start of the string.
\Z matches at the end of the string or before a final line break.
\z matches at the very end of the string.
But not all languages support these anchors, most notably JavaScript.
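These caveats are easy to verify in a REPL. The sketch below uses Python, not the engine from the question, and the helper name is_exact_title is hypothetical. Note one flavor difference that is an assumption to keep in mind when porting: Python's \Z matches only at the very end of the string, like \z in the list above.

```python
import re

def is_exact_title(candidate: str, title: str = "Red October") -> bool:
    """True only if the whole candidate equals the title, ignoring case."""
    return re.fullmatch(re.escape(title), candidate, flags=re.IGNORECASE) is not None

print(is_exact_title("red october"))               # True
print(is_exact_title("The Hunt for Red October"))  # False

# Alternation binds more loosely than the anchors, so group it first.
print(bool(re.search(r"^foo|bar$", "foobaz")))      # True: '^foo' alone matches
print(bool(re.search(r"^(?:foo|bar)$", "foobaz")))  # False

# '$' tolerates one trailing newline; Python's '\Z' does not.
print(bool(re.search(r"^Red October$", "Red October\n")))   # True
print(bool(re.search(r"^Red October\Z", "Red October\n")))  # False
```

Where the engine offers a whole-string match function such as fullmatch, using it sidesteps the anchor subtleties entirely.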
I know that this may be a little late to answer this, but maybe it will come handy for someone else.
Simplest way:
var someString = "...";
var someRegex = "...";
var match = Regex.Match(someString , someRegex );
if(match.Success && match.Value.Length == someString.Length){
//pass
} else {
//fail
}
Use the ^ and $ anchors to denote where the regex pattern sits relative to the start and end of the string:
Regex.Match("Red October", "^Red October$"); // pass
Regex.Match("The Hunt for Red October", "^Red October$"); // fail
You need to enclose your regex in ^ (start of string) and $ (end of string):
^Red October$
If the string may contain regex metasymbols (. { } ( ) $ etc), I propose to use
^\QYourString\E$
\Q starts quoting all the characters until \E.
Otherwise the regex can be inappropriate or even invalid.
If the language takes the regex as a string parameter (as I see in the example), a double backslash should be used:
^\\QYourString\\E$
Hope this tip helps somebody.
Sorry, but that's a little unclear.
From what I read, you want to do a simple string comparison. You don't need a regex for that.
string myTest = "Red October";
bool isMatch = (myTest.ToLower() == "Red October".ToLower());
Console.WriteLine(isMatch);
isMatch = (myTest.ToLower() == "The Hunt for Red October".ToLower());
You can do it like this. For example, to require that a lowercase letter occurs exactly once between two other characters, anchor the pattern at both ends and check it with myRegex.IsMatch(). The pattern below does this for the letter e:
^[^e][e]{1}[^e]$

not able to understand output of regexp in tcl

Please explain the output of this Tcl command; I don't understand the result.
on tclsh
set line "Clock Domain: clk"
regexp {Clock Domain:\s*(.+)} $line tmp1 Pnr_clk
$tmp1 = "Clock Domain: clk"
$Pnr_clk = clk
How are these values assigned?
The Tcl regexp command is documented to assign the submatches to the variables whose names you provide. The first such variable you give is tmp1, which gets the whole string that the overall RE matched (which might be a substring of the overall input string; Tcl's RE engine does not anchor matches by default). The second such variable is Pnr_clk, which gets what the first parenthesized sub-RE matches, which in this case is clk because the \s* before the parenthesis greedily consumed the whitespace after Clock Domain:.
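The whole-match versus sub-match assignment has a direct analogue in most regex APIs. As a small sketch in Python (not Tcl; the variable names simply echo those in the question), group 0 corresponds to tmp1 and group 1 to Pnr_clk:

```python
import re

line = "Clock Domain: clk"
m = re.search(r"Clock Domain:\s*(.+)", line)

tmp1 = m.group(0)     # everything the overall pattern matched
pnr_clk = m.group(1)  # what the first parenthesized group matched

print(repr(tmp1))     # 'Clock Domain: clk'
print(repr(pnr_clk))  # 'clk'

# The match is not anchored, so leading text is simply left out of group 0.
m2 = re.search(r"Clock Domain:\s*(.+)", "xx Clock Domain:   clk")
print(repr(m2.group(0)))  # 'Clock Domain:   clk'
print(repr(m2.group(1)))  # 'clk'
```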

regex with tcl and $ in variable value

I need help with this regex for tcl. I want to detect the $ character but it isn't flagging. Any ideas?
set cell {ABC_ONE_123_$12345$wc_PIE_IN_SKY}
string match $ $cell
string match and glob patterns
string match does a match against a glob pattern, not a regular expression. Plus, it will try to match the whole string. The glob pattern $ doesn't match since the string has much more than just a dollar sign. However *$* does, since it says "zero or more characters, a dollar sign, and zero or more characters". Because $ is treated specially by the tcl shell, you must quote it properly.
For example:
% string match {*$*} $cell
1
% string match *\$* $cell
1
regular expressions
If you really want to do a regular expression search rather than a glob pattern match, use the regexp command. In this case, you must a) protect the $ from normal tcl interpretation just like with string match, and b) because it is special to regular expressions, you must protect the dollar sign from regex interpretation.
Here's an example:
% regexp {\$} $cell
1
% regexp \\\$ $cell
1
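The glob-versus-regex distinction can be reproduced outside Tcl as well. The sketch below is in Python and only illustrates the matching rules: fnmatch.fnmatchcase stands in for string match (whole-string glob matching) and re.search for regexp. Python needs no shell-style quoting of $, so only the regex-level escape remains.

```python
import fnmatch
import re

cell = "ABC_ONE_123_$12345$wc_PIE_IN_SKY"

# Glob patterns must match the whole string.
print(fnmatch.fnmatchcase(cell, "$"))    # False: the string is more than a lone '$'
print(fnmatch.fnmatchcase(cell, "*$*"))  # True: anything, a '$', anything

# In a regex, '$' is an end anchor, so it has to be escaped to match literally.
print(bool(re.search(r"\$", cell)))      # True
```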