How to get # word from database field with mysql? - mysql

I have a database field called "vCaption" that contains sentences.
Somewhere in those sentences there are words that start with a # symbol. I need to extract those particular words from the sentence. If a record contains no word starting with #, the query should return null.
For example,
"my #childhood image from #1992 with my #Dad"
I have the above record in my table.
What I need is only these three words:
childhood, 1992, Dad.
I tried REGEXP and other MySQL functions, but they don't give me exactly what I need.
Please help me here.
SELECT vCaption FROM tbl_post WHERE vCaption REGEXP '(?<= #|^#)\S*'
I have written the above query. It returns the error
"#1139 - Got error 'repetition-operator operand invalid' from regexp"

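The error occurs because MySQL's REGEXP (before version 8.0) uses POSIX-style regular expressions, which support neither lookbehind such as (?<=...) nor \S. To select the rows that contain words starting with a #, you can use this: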
SELECT mycolumn FROM mytable WHERE mycolumn REGEXP "#[[:alnum:]]+";
But this only selects the matching rows; it does not extract the words themselves.
You could:
Try to transform mycolumn in the SELECT using MySQL string functions to remove the unwanted words. Frankly, I'm not sure that's possible.
Post-process the selected rows in your application language to extract the words you want. For instance, in PHP, preg_match_all("~#[[:alnum:]]+~", $yourstring, $m) would store all the #words in $m[0].
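The post-processing option can be sketched in Python as well. This is a minimal sketch, not part of the original answer: the helper name extract_hashtags is hypothetical, and [A-Za-z0-9] stands in for [[:alnum:]] because Python's re module has no POSIX character classes. Like the pattern in the answer, it does not require the # to be at the start of a word.

```python
import re

# Hypothetical helper: capture the word that follows each '#'.
HASHTAG = re.compile(r"#([A-Za-z0-9]+)")

def extract_hashtags(caption):
    """Return the words that follow a '#', or None when there are none."""
    if caption is None:
        return None
    words = HASHTAG.findall(caption)
    return words or None

print(extract_hashtags("my #childhood image from #1992 with my #Dad"))
# ['childhood', '1992', 'Dad']
print(extract_hashtags("no tagged words here"))
# None
```

Returning None for captions without any # word mirrors the "should return null" requirement in the question.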

Related

SQL replace function with MATCH() AGAINST()

I would like to use the replace function inside a match function, to remove \n characters before it searches matching rows. Otherwise, for example, if the text is FULLTEXT\nsearch, and the search is search, it will not match.
Here is my query (simplified) :
SELECT * FROM messages WHERE MATCH(REPLACE(body,'\\n',' ')) AGAINST ('mysearch' IN BOOLEAN MODE)
But it throws an error...
[EDIT]
After @Shadow's answer, I tried this:
SELECT * FROM (SELECT REPLACE(body,'\\n',' ') AS rb FROM messages) AS rbody WHERE MATCH(rb) AGAINST ('mysearch');
I think the idea is correct, but I get an error ERROR 1210 (HY000): Incorrect arguments to MATCH. I think this is because I didn't index the column rb (FULLTEXT INDEX (rb)), so the MATCH () AGAINST () operation won't work.
So I am updating my question: how can one index a column of a subquery?
The answer is that you cannot dynamically remove the \n character sequence within a MATCH() call. As the MySQL manual says of MATCH():
MATCH() takes a comma-separated list that names the columns to be searched.
You either have to store \n differently (not as a character sequence), or you need a separate field in which these characters are already filtered out, and use that additional field for fulltext searches.
Actually, while waiting for a better solution, I will just add a column raw_body to my table, where I will store the exact body (I won't escape it with real_escape_string; I will just manually replace " and ' with \" and \'), and I will prepare the query and bind the params. However, I don't know if it is secure enough against SQL injection.
[UPDATE]
Actually, I found out that I didn't even need to manually escape quotes, since the prepared statement is enough to prevent SQL injection. So I think I will just keep this solution for the moment.
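The two ideas above (a second column with the \n sequences already filtered out, and bound parameters instead of manual escaping) can be sketched as follows. This is only an illustration under stated assumptions: Python's built-in sqlite3 stands in for MySQL so that the example runs anywhere, and the column name search_body and the helper store_message are hypothetical.

```python
import sqlite3

# sqlite3 stands in for MySQL here; table layout and names are assumptions.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE messages (id INTEGER PRIMARY KEY, raw_body TEXT, search_body TEXT)"
)

def store_message(conn, body):
    """Store the body untouched plus a copy with literal \\n sequences replaced by spaces."""
    search_body = body.replace("\\n", " ")
    # The ? placeholders bind the values, so quotes need no manual escaping.
    conn.execute(
        "INSERT INTO messages (raw_body, search_body) VALUES (?, ?)",
        (body, search_body),
    )

store_message(conn, "FULLTEXT\\nsearch with 'quotes' and \"double quotes\"")
row = conn.execute("SELECT raw_body, search_body FROM messages").fetchone()
print(row[1])
# FULLTEXT search with 'quotes' and "double quotes"
```

In MySQL, the FULLTEXT index would then be created on the filtered column, and MATCH() ... AGAINST() would name that column directly.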

SQL conditional: using Regex formatter for "Like"

I have a record in a database like this: 1K-05, in a column called "DocXmtlNum"
The SQL statement to try to get it is like this:
"SELECT DISTINCT DocXmtlNum FROM table1 WHERE DocXmtlNum Like '#?[A-Z]*' ORDER BY DocXmtlNum Desc"
However, it does not return any records. I am assuming that the "#?[A-Z]*" part says that it wants records that start with a number, followed by a letter, followed by any other characters. What's wrong with this? How would I write the regular expression to get a record that is a number followed by a letter, followed by any character?
Note: The SQL statement was auto translated from VB6 to vb.net4, so there were errors introduced.
Is this what you want?
WHERE DocXmtlNum REGEXP '^[0-9]?[A-Z]-.+$'
This checks for:
An optional digit
A letter
A hyphen
At least one more character
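The four checks listed above can be tried out quickly outside the database. The sketch below uses Python's re module with the same pattern, purely as a way to test it; the helper name matches_doc_number and the sample values other than 1K-05 are made up for illustration. One difference to keep in mind: MySQL's REGEXP is normally case-insensitive for non-binary strings, while Python's matching here is case-sensitive.

```python
import re

# Same pattern as in the answer; Python is used only to exercise it.
PATTERN = re.compile(r"^[0-9]?[A-Z]-.+$")

def matches_doc_number(value):
    """True when value is: optional digit, letter, hyphen, one or more characters."""
    return PATTERN.search(value) is not None

for sample in ["1K-05", "K-05", "1K-", "12K-05"]:
    print(sample, matches_doc_number(sample))
# 1K-05 True
# K-05 True
# 1K- False
# 12K-05 False
```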

Get all records between two alpha variables in alpha order mysql

I have a database of words for dictionary lookup purposes. What I need to be able to do with MySQL is allow a user to input two variables (alpha), and my script will return every word that starts with either of those variables and everything in between.
Let's say the two variables are:
$letters1 = abor
$letters2 = accr
I want to get every word that starts with abor through accr. I need to return every word that would fit between those two starting points. So an example SQL statement that I know does not work but might help you understand what I am asking:
SELECT word from table1 WHERE word LIKE '%abor%' THROUGH '%accr%' ORDER BY word ASC
I know that THROUGH is not an operator but that's the general idea of what I need to accomplish.
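If you merely want words that start with letters between the two variables, you can use MySQL's BETWEEN ... AND ... operator:
SELECT word FROM table1 WHERE word BETWEEN 'abor' AND 'accr' ORDER BY word
Note that BETWEEN compares whole strings, so a longer word that starts with the upper bound (for example 'accrue') sorts after 'accr' and is excluded. Adding OR word LIKE 'accr%' to the WHERE clause includes those words as well.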
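How a plain string range treats words that merely start with the upper bound can be seen in a small Python sketch. The word list and both function names are made up for illustration, and lowercase input is assumed; Python's string comparison is used as a stand-in for the database collation.

```python
def words_in_string_range(words, start, end):
    """Plain range test, analogous to word BETWEEN start AND end."""
    return sorted(w for w in words if start <= w <= end)

def words_between_prefixes(words, start, end):
    """Also keep words that begin with the end prefix, such as 'accrue' for 'accr'."""
    return sorted(w for w in words if w >= start and w[:len(end)] <= end)

words = ["abode", "aboriginal", "abort", "academy", "accrue", "accumulate", "zebra"]

print(words_in_string_range(words, "abor", "accr"))
# ['aboriginal', 'abort', 'academy']
print(words_between_prefixes(words, "abor", "accr"))
# ['aboriginal', 'abort', 'academy', 'accrue']
```

The second function compares only the first len(end) characters against the upper bound, which is one way to express "everything up to and including words starting with accr".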

Using REGEX to alter field data in a mysql query

I have two databases, both containing phone numbers. I need to find all instances of duplicate phone numbers, but the formats of database 1 vary wildly from the format of database 2.
I'd like to strip out all non-digit characters and just compare the two 10-digit strings to determine if it's a duplicate, something like:
SELECT b.phone as barPhone, sp.phone as SPPhone FROM bars b JOIN single_platform_bars sp ON sp.phone.REGEX = b.phone.REGEX
Is such a thing even possible in a mysql query? If so, how do I go about accomplishing this?
EDIT: Looks like it is, in fact, a thing you can do! Hooray! The following query returned exactly what I needed:
SELECT b.phone, b.id, sp.phone, sp.id
FROM bars b JOIN single_platform_bars sp ON REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(b.phone,' ',''),'-',''),'(',''),')',''),'.','') = REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(sp.phone,' ',''),'-',''),'(',''),')',''),'.','')
MySQL versions before 8.0 don't support returning the "match" of a regular expression (8.0 added REGEXP_SUBSTR and REGEXP_REPLACE). The REGEXP operator returns 1 or 0, depending on whether the string matched the pattern or not.
You can use the REPLACE function to replace a specific character, and you can nest those. But it would be unwieldy for all "non-digit" characters. If you want to remove spaces, dashes, open and close parens e.g.
REPLACE(REPLACE(REPLACE(REPLACE(sp.phone,' ',''),'-',''),'(',''),')','')
One approach is to create a user-defined function that returns just the digits from a string. But if you don't want to create a user-defined function...
This can be done in native MySQL. This approach is a bit unwieldy, but it is workable for strings of "reasonable" length.
SELECT CONCAT(IF(SUBSTR(sp.phone,1,1) REGEXP '^[0-9]$',SUBSTR(sp.phone,1,1),'')
,IF(SUBSTR(sp.phone,2,1) REGEXP '^[0-9]$',SUBSTR(sp.phone,2,1),'')
,IF(SUBSTR(sp.phone,3,1) REGEXP '^[0-9]$',SUBSTR(sp.phone,3,1),'')
,IF(SUBSTR(sp.phone,4,1) REGEXP '^[0-9]$',SUBSTR(sp.phone,4,1),'')
,IF(SUBSTR(sp.phone,5,1) REGEXP '^[0-9]$',SUBSTR(sp.phone,5,1),'')
) AS phone_digits
FROM single_platform_bars sp
To unpack that a bit... we extract a single character from the first position in the string, check if it's a digit, if it is a digit, we return the character, otherwise we return an empty string. We repeat this for the second, third, etc. characters in the string. We concatenate all of the returned characters and empty strings back into a single string.
Obviously, the expression above is checking only the first five characters of the string, you would need to extend this, basically adding a line for each position you want to check...
And unwieldy expressions like this can be included in a predicate (in a WHERE clause). (I've just shown it in the SELECT list for convenience.)
MySQL doesn't support such string operations natively. You will either need to use a UDF, or else create a stored function that iterates over a string parameter, appending every digit it encounters to its return value.
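If the comparison can be done in application code instead of in the query, keeping only the digits is short to write. The following Python sketch is an assumption-laden illustration: the function names, the (id, phone) tuple shape, and the sample numbers are all hypothetical and not taken from the original tables.

```python
import re

def digits_only(phone):
    """Drop every character that is not 0-9."""
    return re.sub(r"[^0-9]", "", phone)

def find_duplicate_phones(bars, single_platform_bars):
    """Pair up ids whose phone numbers are equal once reduced to digits.

    Both arguments are lists of (id, phone) tuples; this shape is assumed
    for the example only.
    """
    by_digits = {}
    for bar_id, phone in bars:
        by_digits.setdefault(digits_only(phone), []).append(bar_id)
    matches = []
    for sp_id, phone in single_platform_bars:
        for bar_id in by_digits.get(digits_only(phone), []):
            matches.append((bar_id, sp_id))
    return matches

bars = [(1, "(555) 123-4567"), (2, "555.987.6543")]
single_platform_bars = [(10, "555-123-4567"), (11, "555 000 1111")]
print(find_duplicate_phones(bars, single_platform_bars))
# [(1, 10)]
```

Unlike the nested REPLACE calls, this removes every non-digit character, not just the five that were listed explicitly.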

MySQL REGEXP not matching string

I have a table of messages. I am trying to find messages in the table that have an ID code which complies with a specific format. The regexp that I have below was written for matching these values in PHP, but I want to move it to a MySQL query.
It is looking for a specific format of an identifier code that looks like this:
[692370613-3CUWU]
The code has a consistent format:
starts and ends with hard brackets [ ]
two components inside,
first is an account number, min 9 digits, but could be higher
second component is an alphanumeric code, 5 characters; it can include the digits 1-9 and capital letters excluding "O"
the complete code can occur anywhere in the message
I have a query that reads:
SELECT * FROM messages
WHERE
msgBody REGEXP '\\[(\d){9,}-([A-NP-Z1-9]){5}\\]'
OR
msgSubject REGEXP '\\[(\d){9,}-([A-NP-Z1-9]){5}\\]'
I created a test row in the table which has only the sample value above in the msgBody field for testing - but it does not return any results.
I am guessing that I am missing something in the conversion of PHP style regex vs. MySQL.
Help is greatly appreciated.
Thank you!
Instead of \d, try [[:digit:]] or [0-9]; MySQL's REGEXP (before version 8.0) does not recognize \d as a digit class:
SELECT * FROM messages
WHERE
msgBody REGEXP '\\[([0-9]){9,}-([A-NP-Z1-9]){5}\\]'
OR
msgSubject REGEXP '\\[([0-9]){9,}-([A-NP-Z1-9]){5}\\]'
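The corrected pattern can be sanity-checked outside MySQL. The sketch below uses Python's re module with an equivalent pattern (without the capture groups, which are not needed for a yes/no match); the helper name has_id_code and the negative sample strings are made up for illustration.

```python
import re

# Equivalent to the corrected MySQL pattern: [, 9+ digits, -, 5 characters
# from A-N, P-Z, 1-9 (no letter O, no digit 0), ].
ID_CODE = re.compile(r"\[[0-9]{9,}-[A-NP-Z1-9]{5}\]")

def has_id_code(text):
    """True when an identifier code occurs anywhere in the text."""
    return ID_CODE.search(text) is not None

print(has_id_code("Re: your order [692370613-3CUWU] has shipped"))
# True
print(has_id_code("[69237061-3CUWU]"))
# False
print(has_id_code("[692370613-3COWU]"))
# False
```

The second sample fails because the account number has only 8 digits, and the third because the code contains the excluded letter "O".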