Regular expression to match CSS rules

Example original CSS:
.sk-hybrid{}
.sk-hybrid header > .row,
.sk-hybrid > #content.row,
.sk-hybrid footer > .row{}
.sk-border-radius-sm{}
.sk-gradient-gray-sm{}
I want to preg_replace all instances of .sk-whatever{ with .sk-whatever-i{, and all instances of the same term followed by a space, like ".sk-whatever " with ".sk-whatever-i ".
Basically, I'm trying to write some PHP code that parses my CSS file and adds "-i" to every instance of my .sk-someword classes. I can then append the !important declaration to each ruleset, but that part is easy enough.
I need the regex only to add the "-i". Please note that .sk-whatever might have special characters in between, and is followed by either a space or a brace:
.sk-some-class-term(space)
or
.sk-some-class-term{
I'm such a slob when it comes to regex. I'm pretty sure others can write this easily. I can't. Help please? :(
The result of my example CSS should be:
.sk-hybrid-i{}
.sk-hybrid-i header > .row,
.sk-hybrid-i > #content.row,
.sk-hybrid-i footer > .row{}
.sk-border-radius-sm-i{}
.sk-gradient-gray-sm-i{}

This regex should work for your purposes:
^(\.sk-\w+(?:[^\{\s,>]+))
RegEx Demo
Explanation
^ Matches the beginning of the string (or of each line, if multiline mode is enabled)
( Begin capture group
\.sk-\w+ Matches .sk- followed by one or more word characters (letters, digits, or underscore)
(?: Begin non-capturing group
[^\{\s,>]+ Matches one or more characters that are not {, whitespace, a comma, or >
) End non-capturing group
) End capture group
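
A minimal Python sketch of this pattern, for illustration only (the question asks for PHP; add_suffix and the sample strings are names made up here). Because of the leading ^, the pattern only fires at the start of a line, and only when re.MULTILINE is set. It also needs at least two characters after .sk-, since \w+ and the bracket group each consume at least one.

```python
import re

# The answer's pattern; MULTILINE makes ^ match at the start of every line.
PATTERN = re.compile(r'^(\.sk-\w+(?:[^\{\s,>]+))', re.MULTILINE)


def add_suffix(css: str) -> str:
    """Append -i to every .sk-* class that starts a line."""
    return PATTERN.sub(r'\1-i', css)


css = """.sk-hybrid{}
.sk-hybrid header > .row,
.sk-hybrid > #content.row,
.sk-hybrid footer > .row{}
.sk-border-radius-sm{}
.sk-gradient-gray-sm{}"""

print(add_suffix(css))
```

A class that does not start its line (for example "div .sk-hybrid{}") is left untouched by this version.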

Match
(\.sk(-\w+)+)
and replace with $1-i: http://regex101.com/r/nC8kU1/1
However, it would be much better to use a dedicated tool that, unlike regexes, "understands" the underlying language. For PHP, there's https://github.com/sabberworm/PHP-CSS-Parser; pay attention to the "Prepend id to selectors" example, which is almost what you're looking for.
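
For comparison, the unanchored match-and-replace from this answer can be sketched in Python (used here only for illustration; suffix_sk_classes is a hypothetical helper, and the inner group is made non-capturing). Without ^ it also rewrites .sk- classes in the middle of a selector, but it is blind to context, which is the reason a real CSS parser is the safer choice.

```python
import re

# .sk followed by one or more "-word" segments, anywhere in the text.
SK_CLASS = re.compile(r'(\.sk(?:-\w+)+)')


def suffix_sk_classes(css: str) -> str:
    """Append -i to every .sk-* token, wherever it appears."""
    return SK_CLASS.sub(r'\1-i', css)


print(suffix_sk_classes("div > .sk-hybrid .sk-border-radius-sm{}"))

# The regex cannot tell a selector from a property value:
print(suffix_sk_classes("a{background:url(x.sk-logo.png)}"))
```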

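PHP to do what you are asking, where $cssString holds your CSS. The pattern needs explicit delimiters (here /); if the outer parentheses are left to act as the delimiters, the pattern has no capture group and $1 comes out empty.
$cssString = preg_replace('/(\.sk-\w+(?:[^\{\s,>]+))/', '$1-i', $cssString);
Though if you are running on Linux, you can just use sed. sed does not support (?:...) or $0, and \s is not a character class inside a bracket expression, so the pattern is written as an extended regex and & stands for the whole match:
sed -E 's/\.sk-[[:alnum:]_-]+/&-i/g' example.css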

Related

How to edit this html lexer rule?

I want to edit this HTML lexer rule, and I need help with the regular expression.
TAG_NAME refers to any HTML attribute, for example required, class, id, etc.
I want to change it so that it does not accept this exact text: 'az-'.
I think this needs a regular expression modification. I looked it up, but I couldn't integrate what I found online with the way these rules are written.
As a first try, I removed the '-' from Tag_NameChar, but then attributes like 'data-target' were no longer recognized.
This snippet is for the rule:
and this one shows how the attributes are recognized.

ANTLR does not support lookahead syntax like some regex engines do, so there's no easy way to exclude certain matches from within the regex. It's always possible to rewrite a regular expression to exclude a given string (regular expressions are closed under negation and intersection), but it usually ends up quite painful. In your case, you'd end up with something following the logic of "a tag name can either have less than 3 characters, more than 3 characters, or it could have three characters where the first isn't an 'a', the second isn't a 'z' or the last isn't a '-'".
The less painful, but also less cross-language solution is to use a predicate that returns false if the text of the tag name equals az-. So something like {!getText().equals("az-")}? depending on the language.
If you're okay with introducing an additional lexer rule, you may also introduce a rule INVALID_TAG_NAME (or whatever you want to call it) that matches exactly az- and that's defined before TAG_NAME. That way any tag that's named exactly az- will produce an INVALID_TAG_NAME token instead of a TAG_NAME token.
Depending on your requirements, you could also leave the grammar unchanged altogether and simply produce an error when you see a tag named az- when you traverse the tree in a listener or visitor.
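
The lookahead-free rewrite described in the first paragraph of this answer can be sketched as follows. This is Python rather than ANTLR, and the set of tag-name characters is an assumption (the original lexer rule is not shown); the point is only the shape of the alternation: shorter than three, longer than three, or three characters that differ from az- in at least one position.

```python
import re

# Assumed tag-name alphabet; the real Tag_NameChar rule may differ.
C = r'[a-z0-9_-]'
NOT_A = r'[b-z0-9_-]'      # any tag-name character except 'a'
NOT_Z = r'[a-y0-9_-]'      # any tag-name character except 'z'
NOT_DASH = r'[a-z0-9_]'    # any tag-name character except '-'

NOT_AZ_DASH = re.compile(
    rf'{C}{{1,2}}'           # fewer than three characters
    rf'|{C}{{4,}}'           # more than three characters
    rf'|{NOT_A}{C}{C}'       # three characters, first is not 'a'
    rf'|{C}{NOT_Z}{C}'       # three characters, second is not 'z'
    rf'|{C}{C}{NOT_DASH}'    # three characters, third is not '-'
)


def is_valid_tag_name(name: str) -> bool:
    """True for every name over the assumed alphabet except exactly 'az-'."""
    return NOT_AZ_DASH.fullmatch(name) is not None


for name in ["az-", "az", "az-x", "bz-", "data-target"]:
    print(name, is_valid_tag_name(name))
```

Even for a three-character string the alternation is already unwieldy, which is why the predicate or the extra lexer rule is usually the more practical route.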

Regular expression remove some links

I need a regular expression to strip the HTML tags from some links.
Example:
link
fasafiso
should be converted to
link
fasafiso

Depending on your programming language, you could come up with something like:
~<a href="sample\.com" [^>]*>(.*?)</a>~
# delimiter ~
# look for <a, everything that is not > and >
# capture everything lazily in a group
# look for a closing tag
# delimiter ~
In your example, group 1 would hold fasafiso, which can then be inserted into the replacement via $1.
See a demo for this approach on regex101.com.
Hint:
This is just a quick-and-dirty solution (e.g. for text editors). If this is getting more complicated, consider using a parser instead.

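I'll assume you want to replace all links whose target is sample.com with their content:
match <a[^>]*href="sample\.com"[^>]*>([^<]*)</a>
replace by \1
For example with sed (using ~ as the delimiter, because the pattern itself contains a /, and -E so that the parentheses form a group):
sed -E 's~<a[^>]*href="sample\.com"[^>]*>([^<]*)</a>~\1~g'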
Please also keep in mind that if your requirements are complex enough you should instead be using an HTML parser.
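
As a rough illustration of the parser route, here is a sketch using Python's standard html.parser module. The class and function names are made up for this example, and it is deliberately small: it re-emits tags, text, and character references, but drops comments and doctype declarations, so it is not a general-purpose HTML rewriter.

```python
from html.parser import HTMLParser


class LinkUnwrapper(HTMLParser):
    """Re-emit HTML, replacing <a href=TARGET>...</a> with its content."""

    def __init__(self, href):
        super().__init__(convert_charrefs=False)
        self.href = href
        self.out = []
        self.open_links = []  # one flag per open <a>: True if being unwrapped

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            unwrap = dict(attrs).get("href") == self.href
            self.open_links.append(unwrap)
            if unwrap:
                return  # drop the opening tag, keep what follows
        self.out.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        self.out.append(self.get_starttag_text())  # e.g. <br/>

    def handle_endtag(self, tag):
        if tag == "a" and self.open_links and self.open_links.pop():
            return  # drop the closing tag of an unwrapped link
        self.out.append(f"</{tag}>")

    def handle_data(self, data):
        self.out.append(data)

    def handle_entityref(self, name):
        self.out.append(f"&{name};")

    def handle_charref(self, name):
        self.out.append(f"&#{name};")


def unwrap_links(html, href="sample.com"):
    parser = LinkUnwrapper(href)
    parser.feed(html)
    parser.close()
    return "".join(parser.out)


print(unwrap_links('<p><a href="sample.com" class="x">fasafiso</a> '
                   'and <a href="other.com">keep</a></p>'))
```

Unlike the regex, this keeps working when the link contains nested tags or when the attributes appear in a different order.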

Regular Expression for HTML attributes

I need to write a regular expression to match attributes like the following:
class="something_A211"
style="width:380px;margin-top: 20px;"
I have no idea how to write it. Can someone help me?
I need this because I have to replace these attributes with nothing in an HTML file (using Notepad++), so that I end up with a clean <tr>, <td>, or any other tag.
Thank you

You can use a regex like this to capture the content:
((?:class|style)=".*?")
Working demo
However, if you just want to match and delete that you can get rid of capturing groups:
(?:class|style)=".*?"
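
The question targets Notepad++, but the same pattern can be tried out in Python as a quick check. In this sketch the leading \s+ is an addition of mine (an assumption, not part of the answer) so that the space before each attribute is removed as well; strip_attrs is a hypothetical helper name.

```python
import re

# The answer's pattern, prefixed with \s+ to also eat the preceding space.
ATTR = re.compile(r'\s+(?:class|style)=".*?"')


def strip_attrs(html: str) -> str:
    """Remove class="..." and style="..." attributes from tags."""
    return ATTR.sub('', html)


row = '<td class="something_A211" style="width:380px;margin-top: 20px;">x</td>'
print(strip_attrs(row))
```

In Notepad++ the equivalent would be a regular-expression search for the pattern with an empty replacement.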

For all constructions like something="data", you can use this.
[^\s]*?\=\".*?\"
https://regex101.com/r/oQ5dR0/1
The link shows you what everything does.
To explain it briefly: a non-space character can come before the "=" any number of times, then come the quotes and the content inside them.
The question mark in .*? (any character, any number of times) is needed so that only the minimum number of characters is matched (instead of running on to some later quote further along).

Regex selects first to last instead of just first

I'm trying to use String#sub! in Ruby, and it substitutes way too much.
This is the regex I'm using. You can see it's matching too much: http://rubular.com/r/IUav4KEFWH
<rb>.+<\/rb>
It selects from the first to the last, and I want it to select just the first pair.
Is there another version of sub I'm not aware of, or a better way to substitute?
It would be easy to turn off multi-line mode and put them on separate lines, but I don't want to sacrifice multi-line matching.

Your regex is too greedy:
<rb>.+<\/rb>
Make it non-greedy using:
<rb>.+?<\/rb>
Rubular Demo

It matches from the first <rb> tag up until the very last </rb> tag because + is a greedy operator meaning it will match as much as it can and still allow the remainder of the regular expression to match.
You want to use +? for a non-greedy match meaning "one or more — preferably as few as possible".
<rb>.+?</rb>
Note: A parser to extract from HTML is recommended rather than using regular expression.
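
The difference is easy to see side by side. The sketch below uses Python instead of Ruby (re.DOTALL plays the role of Ruby's /m flag here, letting the dot cross newlines); the sample text is made up.

```python
import re

text = "<rb>one</rb>\nfiller\n<rb>two</rb>"

# Greedy: .+ runs to the end, then backs up to the LAST </rb>.
greedy = re.search(r'<rb>.+</rb>', text, re.DOTALL).group(0)

# Lazy: .+? grows one character at a time and stops at the FIRST </rb>.
lazy = re.search(r'<rb>.+?</rb>', text, re.DOTALL).group(0)

# Replacing only the first pair, comparable to a single sub! call.
replaced = re.sub(r'<rb>.+?</rb>', 'X', text, count=1, flags=re.DOTALL)

print(repr(greedy))
print(repr(lazy))
print(repr(replaced))
```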

You can try this variant:
<rb>(?>(?!<\/rb>).)*+<\/rb>
Demo
Or if you want:
<rb>[^<]+<\/rb>
Demo
See the difference between .*? And [^<]+ in this DEMO
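
A small Python sketch of how the two variants differ once the element contains another tag. The atomic group and possessive quantifier are left out here, since they require Python 3.11 or later and, as far as I can tell, only affect backtracking rather than what gets matched; the sample string is made up.

```python
import re

# Negated class: content may not contain any '<' at all.
NEGATED = re.compile(r'<rb>[^<]+</rb>')

# Tempered dot: any character, as long as </rb> does not start there.
TEMPERED = re.compile(r'<rb>(?:(?!</rb>).)*</rb>', re.DOTALL)

nested = "<rb>a <i>b</i> c</rb><rb>d</rb>"

print(NEGATED.search(nested).group(0))   # skips the first pair
print(TEMPERED.search(nested).group(0))  # matches the first pair
```

So [^<]+ is the stricter choice: it is fast and simple, but it silently skips any <rb> element that has markup inside it.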

Regex to extract text from inside an HTML tag

I know this has been asked at least a thousand times but I can't find a proper regex that will match a name in this string here:
<td><div id="topbarUserName">Donald</div></td>
I want to get the name 'Donald' and the regex that's the closest is >[a-zA-Z0-9]+ but the result is >Donald.
I'm coding in PureBasic (its syntax is similar to that of Basic), and it uses the PCRE library for regular expressions.
Can anyone help?

Josh's pattern will work if you only make use of the numbered group, not the whole match. If you have to use the whole match, use something like (?<=>)(\w+?)(?=<)
Either way, regex is widely known to not be good for parsing HTML.
Explanation:
(?<=) is used to check if something appears before the current item.
\w+? will match any "word"-character, one or more times, but stop whenever the rest of the pattern matches something, for this situation the ? could have been left out.
(?=) is used to check if something appears after the current item.
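
Both approaches can be compared in a few lines. This sketch uses Python's re module rather than PureBasic's PCRE binding, purely for illustration; the lookaround syntax is the same in both engines.

```python
import re

html = '<td><div id="topbarUserName">Donald</div></td>'

# Lookarounds assert the > and < without consuming them,
# so the whole match is just the name.
whole_match = re.search(r'(?<=>)\w+(?=<)', html).group(0)

# With a plain capture group, the whole match still includes > and <,
# and the name has to be read from group 1.
m = re.search(r'>(\w+)<', html)

print(whole_match)
print(m.group(0), m.group(1))
```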

Try this
It should capture anything that is a letter / number
>([\w]+)<
Also I'm not exactly sure what your project limitations are, but it would be much easier to do something like this
$('#topbarUserName').text();
in jQuery instead of using a regex.

>([a-zA-Z]+) should do the Trick. Remember to get the grouping right.

Why not do it with plain old basic string functions?
pos.i = FindString(HTMLstring.s, "topbarUserName")
If pos > 0 ; check before adding the offset, otherwise the test is always true
  a.i = pos + 16 ; 14 for topbarUserName, 2 for the "> that follows
  b.i = FindString(HTMLstring, "<", a)
  If b > 0
    Donald.s = Mid(HTMLstring, a, b - a)
  EndIf
EndIf
Debug Donald
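
The same idea in Python, as a sketch for readers who don't use PureBasic (text_after_marker is a hypothetical helper name): find the marker, skip past it, and cut at the next "<".

```python
def text_after_marker(html, marker='topbarUserName">'):
    """Return the text between the marker and the next '<', or None."""
    start = html.find(marker)
    if start == -1:
        return None          # marker not present
    start += len(marker)     # first character after the marker
    end = html.find("<", start)
    if end == -1:
        return None          # no closing tag follows
    return html[start:end]


html = '<td><div id="topbarUserName">Donald</div></td>'
print(text_after_marker(html))
```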
