Making sure a url parameter is not present, using regex - mysql

I need 2 regular expressions that I will use in MySQL
OK if one of the url parameters equals something (e.g page_id=5)
I came up with this: ^https?:.*[?&]page_id=5([#&].*)?$
OK if a certain parameter is not present in the url (e.g do not match [?&]page_id=)
This is the one I need help with.
This functionality is part of a bigger problem that does need to be implemented with regular expressions and they have to be compatible with MySQLs RLIKE

your regexp looks fine - just use NOT RLIKE

AFAIK, MySQL's regex library does not support look-aheads, which is necessary for this kind of thing. As already stated, NOT RLIKE seems to be the only option.

Related

grep : Count the number of elements in json response [duplicate]

This question already has answers here:
Regular expression to stop at first match
(9 answers)
Closed 2 years ago.
I have this gigantic ugly string:
J0000000: Transaction A0001401 started on 8/22/2008 9:49:29 AM
J0000010: Project name: E:\foo.pf
J0000011: Job name: MBiek Direct Mail Test
J0000020: Document 1 - Completed successfully
I'm trying to extract pieces from it using regex. In this case, I want to grab everything after Project Name up to the part where it says J0000011: (the 11 is going to be a different number every time).
Here's the regex I've been playing with:
Project name:\s+(.*)\s+J[0-9]{7}:
The problem is that it doesn't stop until it hits the J0000020: at the end.
How do I make the regex stop at the first occurrence of J[0-9]{7}?
Make .* non-greedy by adding '?' after it:
Project name:\s+(.*?)\s+J[0-9]{7}:
Using non-greedy quantifiers here is probably the best solution, also because it is more efficient than the greedy alternative: Greedy matches generally go as far as they can (here, until the end of the text!) and then trace back character after character to try and match the part coming afterwards.
However, consider using a negative character class instead:
Project name:\s+(\S*)\s+J[0-9]{7}:
\S means “everything except a whitespace and this is exactly what you want.
Well, ".*" is a greedy selector. You make it non-greedy by using ".*?" When using the latter construct, the regex engine will, at every step it matches text into the "." attempt to match whatever make come after the ".*?". This means that if for instance nothing comes after the ".*?", then it matches nothing.
Here's what I used. s contains your original string. This code is .NET specific, but most flavors of regex will have something similar.
string m = Regex.Match(s, #"Project name: (?<name>.*?) J\d+").Groups["name"].Value;
I would also recommend you experiment with regular expressions using "Expresso" - it's a utility a great (and free) utility for regex editing and testing.
One of its upsides is that its UI exposes a lot of regex functionality that people unexprienced with regex might not be familiar with, in a way that it would be easy for them to learn these new concepts.
For example, when building your regex using the UI, and choosing "*", you have the ability to check the checkbox "As few as possible" and see the resulting regex, as well as test its behavior, even if you were unfamiliar with non-greedy expressions before.
Available for download at their site:
http://www.ultrapico.com/Expresso.htm
Express download:
http://www.ultrapico.com/ExpressoDownload.htm
(Project name:\s+[A-Z]:(?:\\w+)+.[a-zA-Z]+\s+J[0-9]{7})(?=:)
This will work for you.
Adding (?:\\w+)+.[a-zA-Z]+ will be more restrictive instead of .*

How to extract in Splunk at indexed time json field with same child-key from different father-key using regex? [duplicate]

This question already has answers here:
Regular expression to stop at first match
(9 answers)
Closed 2 years ago.
I have this gigantic ugly string:
J0000000: Transaction A0001401 started on 8/22/2008 9:49:29 AM
J0000010: Project name: E:\foo.pf
J0000011: Job name: MBiek Direct Mail Test
J0000020: Document 1 - Completed successfully
I'm trying to extract pieces from it using regex. In this case, I want to grab everything after Project Name up to the part where it says J0000011: (the 11 is going to be a different number every time).
Here's the regex I've been playing with:
Project name:\s+(.*)\s+J[0-9]{7}:
The problem is that it doesn't stop until it hits the J0000020: at the end.
How do I make the regex stop at the first occurrence of J[0-9]{7}?
Make .* non-greedy by adding '?' after it:
Project name:\s+(.*?)\s+J[0-9]{7}:
Using non-greedy quantifiers here is probably the best solution, also because it is more efficient than the greedy alternative: Greedy matches generally go as far as they can (here, until the end of the text!) and then trace back character after character to try and match the part coming afterwards.
However, consider using a negative character class instead:
Project name:\s+(\S*)\s+J[0-9]{7}:
\S means “everything except a whitespace and this is exactly what you want.
Well, ".*" is a greedy selector. You make it non-greedy by using ".*?" When using the latter construct, the regex engine will, at every step it matches text into the "." attempt to match whatever make come after the ".*?". This means that if for instance nothing comes after the ".*?", then it matches nothing.
Here's what I used. s contains your original string. This code is .NET specific, but most flavors of regex will have something similar.
string m = Regex.Match(s, #"Project name: (?<name>.*?) J\d+").Groups["name"].Value;
I would also recommend you experiment with regular expressions using "Expresso" - it's a utility a great (and free) utility for regex editing and testing.
One of its upsides is that its UI exposes a lot of regex functionality that people unexprienced with regex might not be familiar with, in a way that it would be easy for them to learn these new concepts.
For example, when building your regex using the UI, and choosing "*", you have the ability to check the checkbox "As few as possible" and see the resulting regex, as well as test its behavior, even if you were unfamiliar with non-greedy expressions before.
Available for download at their site:
http://www.ultrapico.com/Expresso.htm
Express download:
http://www.ultrapico.com/ExpressoDownload.htm
(Project name:\s+[A-Z]:(?:\\w+)+.[a-zA-Z]+\s+J[0-9]{7})(?=:)
This will work for you.
Adding (?:\\w+)+.[a-zA-Z]+ will be more restrictive instead of .*

mySQL replace string + additional string with a static value

Unlike PHP, I don't believe mySQL has any preg_replace() feature, only matching via REGEXP. Here are the strings I have in the code:
http://ourcompany.com/theapplestore/...
http://ourcompany.com/anotherstore/...
http://ourcompany.com/yetanotherstore/...
As you can see, there is a constant in there, http://ourcompany.com/, but there is also a variable string namely theapplestore, anotherstore, etc. etc.
I want to replace the constant string, plus the variable string(s), and then the trailing slash (/) after the variable string(s), with a single shortcode value, namely {{store url=''}}
EDIT
If it helps, the store codes are always the same length, they are going to be
sch131785
sch185399
sch634019
etc.
i.e., they are all 9 characters long
How would I do this? Thanks.
I thought this might be useful: there is currently NO WAY to do this in mysql. Find using REGEXP, yes; replace, no. That said, there is another post with an extension library mentioned, sagi:
Is there a MySQL equivalent of PHP's preg_replace?
MariaDB-10.0.5 has REGEXP_REPLACE(), REGEXP_INSTR() and REGEXP_SUBSTR()
You can use following regex,
(ourcompany.com\/\w+\/)
Demo
Uses the concept of Group Capture

Erlang binary pattern matching fails

Why does this issue a badmatch error? I can't figure out why this would fail:
<<IpAddr, ":*:*">> = <<"2a01:e34:ee8b:c080:a542:ffaf:*:*">>.
You need to specify the size of IpAddr so that it can be pattern-matched:
1> <<IpAddr:28/binary, ":*:*">> = <<"2a01:e34:ee8b:c080:a542:ffaf:*:*">>.
<<"2a01:e34:ee8b:c080:a542:ffaf:*:*">>
2> IpAddr.
<<"2a01:e34:ee8b:c080:a542:ffaf">>
Pattern matching of a binary proceeds left-to-right so it will match IpAddr first before it tries the following segment. There is no back-tracking until there is a match. A default typed variable like IpAddr matches one byte. See Bit Syntax Expressions and Bit Syntax for a proper description and more examples.
As alternative to using pattern matching here you might consider using the binary module. There are two functions which could be useful to you: binary:match/2/3 and binary:split/2/3. These search which may better fit your problem.
As a last alternative you could try using regular expressions and the re module.

Recognize Missing Space

How can I recognize when a user has missed a space when entering a search term? For example, if the user enters "usbcable", I want to search for "usb cable". I'm doing a REGEX search in MySQL to match full words.
I have a table with every term used in a search, so I know that "usb" and "cable" are valid terms. Is there a way to construct a WHERE clause that will give me all the rows where the term matches part of the string?
Something like this:
SELECT st.term
FROM SearchTerms st
WHERE 'usbcable' LIKE '%' + st.term + '%'
Or any other ideas?
Text Segmentation is a part of Natural Language Processing, and is what you're looking for in this specific example. It's used in search engines and spell checkers, so you might have some luck with example source code looking at open source spell checkers and search engines.
Spell checking might be the correct paradigm to consider anyway, as you first need to know whether it's a legitimate word or not before trying to pry it apart.
-Adam
Posted in the comments, but I thought it important to bring up as an answer:
Does that query not work? – Simon Buchan
Followed by:
Well, I should've tested it before I
posted. That query does not work, but
with a CONCAT it does, like so: WHERE
'usbcable' LIKE Concat('%', st.term,
'%'). I think this is the simplest,
and most relevant (specific to my
site), way to go. – arnie0674
Certainly far easier than text segmentation for this application...
-Adam