I have a table column with general text values, e.g.:
"This is Gerald's Sample Text: With some special chars"
I need to convert this text to:
"this-is-geralds-sample-text-with-some-special-chars"
with MySQL InnoDB and save the value in a separate unique column in the same table. Is there a simpler way of achieving this with a query without using procedures?
The short answer is "No", at least before MySQL 8.0. You're looking for something that behaves exactly like a regular expression replace, and MySQL versions before 8.0 do not support that natively (8.0 added REGEXP_REPLACE).
The longer answer is "No, but there are workarounds." You have a couple of options, and I don't terribly like either. The first is to create a function like in this question. The second is to come up with a list of bad characters and then use a set of REPLACE calls. It's ugly, but it will work.
On a side note: you might consider creating this value in your application and then just storing it along with the original. That would be cleaner in some ways than using a custom MySQL function.
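A minimal sketch of the application-side approach, in Python. The function name `slugify` is hypothetical, and the rules (drop apostrophes, turn every other run of non-alphanumeric characters into one hyphen) are an assumption inferred from the example in the question; it only handles ASCII letters and digits.

```python
import re

def slugify(text: str) -> str:
    """Turn free text into a lowercase, hyphen-separated slug (ASCII only)."""
    text = text.lower()
    # Drop apostrophes so "gerald's" becomes "geralds" rather than "gerald-s".
    text = text.replace("'", "")
    # Collapse every run of other non-alphanumeric characters into one hyphen.
    text = re.sub(r"[^a-z0-9]+", "-", text)
    # Trim hyphens left over at either end.
    return text.strip("-")

print(slugify("This is Gerald's Sample Text: With some special chars"))
# prints: this-is-geralds-sample-text-with-some-special-chars
```

Two different texts can produce the same slug, so the unique column still has to reject or disambiguate duplicates.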
Acronyms are a pain in my database, especially when doing a search. I haven't decided if I should accept periods during search queries. These are the problems I face when searching:
'IRQ' will not find 'I.R.Q.'
'I.R.Q' will not find 'IRQ'
'IRQ.' or 'IR.Q' will not find 'IRQ' or 'I.R.Q.'
etc...
The same problem goes for ellipses (...) or three series of periods.
I just need to know what direction I should take with this issue:
Is it better to remove all periods when inserting the string to the database?
If so what regex can I use to identify periods (instead of ellipses or three series of periods) to identify what needs to be removed?
If it is possible to keep the periods in acronyms, how can it be scripted in a query to find 'I.R.Q' if I input 'IRQ' in the search field, through MySQL using regex or maybe a MySQL function I don't know about?
My responses for each question:
Is it better to remove all periods when inserting the string to the database?
Yes and no. You want the database to have the original text. If you want, create a separate field that is "cleaned up" to search against. Here, you can remove periods, make everything lowercase, etc.
If so what regex can I use to identify periods (instead of ellipses or three series of periods) to identify what needs to be removed?
/\.+/
That finds one or more periods in a given spot. But you'll want to integrate it with your search formula.
Note: regex on a database isn't known to have high performance. Be cautious with this.
Other note: you may want to use FULLTEXT search in MySQL. This also isn't known for high performance on data sets with more than about 1,000 entries. If you have big data and need full-text search, use Sphinx (available as a MySQL plug-in and RAM-based indexing system).
If it is possible to keep the periods in acronyms, how can it be scripted in a query to find 'I.R.Q' if I input 'IRQ' in the search field, through MySQL using regex or maybe a MySQL function I don't know about?
Yes, by having the two fields I described in my answer to the first question.
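The two-field idea can be sketched as follows. This is only an illustration: it uses Python's built-in SQLite as a stand-in for MySQL, and the table `terms`, the column `search_key`, and the helper names are all made up for the example.

```python
import sqlite3

def search_key(text: str) -> str:
    """Normalise text for searching: drop periods and lowercase."""
    return text.replace(".", "").lower()

conn = sqlite3.connect(":memory:")
# Keep the original text untouched and store the cleaned-up key next to it.
conn.execute("CREATE TABLE terms (original TEXT, search_key TEXT)")
for original in ["I.R.Q.", "IRQ", "N.A.S.A."]:
    conn.execute(
        "INSERT INTO terms VALUES (?, ?)", (original, search_key(original))
    )

def find(user_input: str) -> list:
    """Normalise the user's input the same way, then compare keys."""
    rows = conn.execute(
        "SELECT original FROM terms WHERE search_key = ? ORDER BY original",
        (search_key(user_input),),
    )
    return [row[0] for row in rows]

print(find("IR.Q"))
```

Because the stored value and the search input go through the same normalisation, 'IRQ', 'I.R.Q' and 'IR.Q' all find the same rows, and the key column can be indexed normally.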
You need to consider the sanctity of your input. If it is not yours to alter then don't alter it. Instead you should have a separate system to allow for text searching, and that can alter the text as it sees fit to be able to handle these types of issues.
Have a read up on Lucene, and specifically Lucene's standard analyzer, to see the types of changes that are commonly carried out to allow successful searching of complex text.
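I think you can use MySQL's REGEXP operator to match the acronym with optional periods:
SELECT col1, col2...coln FROM yourTable WHERE colWithAcronym REGEXP 'I[.]?R[.]?Q[.]?'
MySQL patterns take no delimiters such as #, and [.] avoids having to double the backslash inside the SQL string literal.
If you use PHP you can build the pattern with this simple loop:
$result = "";
// str_split() is needed because foreach cannot iterate over a string directly
foreach (str_split($yourAcronym) as $char) {
    $result .= $char . "[.]?";
}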
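For comparison, here is a sketch of the same pattern building in Python. The function name `acronym_pattern` is hypothetical. It emits `[.]?` after each letter and no delimiters, on the assumption that the result is passed to MySQL's REGEXP as a bound parameter; Python's own `re` module is used here only to check the pattern.

```python
import re

def acronym_pattern(acronym: str) -> str:
    """Build a pattern that allows an optional period after each letter."""
    # Ignore any periods the user typed; keep only letters and digits.
    letters = [ch for ch in acronym if ch.isalnum()]
    return "".join(re.escape(ch) + "[.]?" for ch in letters)

pattern = acronym_pattern("I.R.Q")
print(pattern)  # prints: I[.]?R[.]?Q[.]?

# Quick check of which spellings the pattern accepts.
for candidate in ["IRQ", "I.R.Q.", "IR.Q"]:
    print(candidate, bool(re.fullmatch(pattern, candidate)))
```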
The functionality you are searching for is a full-text search. MySQL supports this for MyISAM tables; InnoDB only gained full-text indexes in MySQL 5.6. (http://dev.mysql.com/doc/refman/5.0/en/fulltext-search.html)
Alternatively you could go for an external framework that provides that functionality. Lucene is a popular open-source one. (lucene.apache.org)
There are two methods:
1. Save the data with the symbols removed from the text and match against that.
2. Build a regex, for example:
select * from table where acronym regexp '^[A-Z]+[.]?[A-Z]+[.]?[A-Z]+[.]?$';
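Please note, however, that with a case-sensitive or binary collation this requires the acronym to be stored in uppercase. In that case, if you don't want the case to matter, just change [A-Z] to [A-Za-z]. With MySQL's default case-insensitive collations, REGEXP already ignores case.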
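To see what the pattern above accepts, here is a small check in Python. Python's `re` is always case-sensitive for this pattern, so it mirrors the case-sensitive situation; the helper name `looks_like_acronym` is made up for the example.

```python
import re

# Same pattern as in the query; fullmatch() supplies the ^ and $ anchors.
ACRONYM = re.compile(r"[A-Z]+[.]?[A-Z]+[.]?[A-Z]+[.]?")

def looks_like_acronym(value: str) -> bool:
    return ACRONYM.fullmatch(value) is not None

for value in ["IRQ", "I.R.Q.", "IR.Q", "I..R.Q", "AB", "irq"]:
    print(value, looks_like_acronym(value))
```

One thing this makes visible: the pattern has three `[A-Z]+` groups, so it needs at least three letters and rejects two-letter acronyms such as 'AB'.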
Because of the way our databases are collated they ignore case for usernames and passwords, which we're currently unable to fix at the database level. It appears from this thread that WHERE BINARY 'something' = 'Something' should fix the problem, but I haven't been able to figure out how to get Django to insert BINARY. Any tips?
I don't think there's an easy way to force Django to add something to the query it generates.
You may want to simply write a raw SQL query within Django to get objects filtered with a case-sensitive comparison and then use them in normal queries.
The other approach is to use Django's case-sensitive filters to achieve the same result. E.g. contains/startswith both use LIKE BINARY on MySQL and may be used when comparing two strings of the same length (like a password hash). Finally, you can use regex to do the case-sensitive comparison. But these are rather ugly methods in this situation. They have unnecessary overhead and you should avoid them whenever possible.
Is it possible to make some characters stored in mysql invisible for search queries?
Of course, I can do this in application, but is there maybe some setting option in mysql for this?
I am still not sure I am following what you want. It sounds like a query like
SELECT * FROM `table` WHERE REPLACE(string_field, "#", "") = "user query"
might be what you are looking for.
See REPLACE. For more complicated matching, there's also regular expressions, although that would probably be rather messy for what you are describing.
EDIT: Just saw your comment. It sounds like you want to blacklist certain characters from the user's query as they are special to your system. No, there's no way to do that. Somewhere you are going to want a string replace operation to remove those characters; either in your application or in a stored procedure/function if you want to put it in the database.
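A minimal sketch of doing that string replace in the application, in Python. The function name `strip_reserved` and the default reserved set `#` are assumptions taken from the example query above.

```python
def strip_reserved(query: str, reserved: str = "#") -> str:
    """Remove every character listed in `reserved` from the user's query."""
    # str.maketrans with a third argument maps those characters to None.
    return query.translate(str.maketrans("", "", reserved))

print(strip_reserved("us#er qu#ery"))  # prints: user query
```

The cleaned string can then be passed to the query as an ordinary bound parameter.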
I'm about to do a search/replace in a MySQL database. I want to replace one word in a field containing HTML text with another word. The problem is that the word that I'm looking for is in some cases part of a path. E.g.:
<... src="../SEARCHTERM/img.jpg" />
Obviously I do not want to replace this instance. So my question is: What is the best way to do this? How do I replace only if the word is not part of a path?
1. Select all records that contain the word to replace.
2. Process the records using the programming language of your choice.
3. Update the records in the database with your processed records.
The problematic part of this is item 2. Depending on your processing needs, either a regular expression replacement will do (like preg_replace in PHP) or you need a fully fledged HTML parser.
MySQL can match strings using regular expressions, but before version 8.0 (which added REGEXP_REPLACE) no built-in way exists to do replacement based on regular expressions. On older versions you can, however, use User Defined Functions to do that, e.g. MySQL Regular Expression UDFs. Then again, the question is whether regular expressions are sufficient for your replacement needs. In a lot of cases involving HTML, they are not.
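The processing step could be sketched with Python's standard `html.parser`, replacing the word only in text nodes and copying tags (and therefore attribute values such as paths) through unchanged. The class and function names are hypothetical, and this is a rough illustration: it also rewrites text inside script and style elements, and it re-emits end tags in lowercase.

```python
import re
from html.parser import HTMLParser

class TextOnlyReplacer(HTMLParser):
    """Rebuild the HTML, replacing a word in text nodes but not inside tags."""

    def __init__(self, old: str, new: str):
        super().__init__(convert_charrefs=False)
        self.pattern = re.compile(r"\b" + re.escape(old) + r"\b")
        self.new = new
        self.out = []

    def handle_starttag(self, tag, attrs):
        # Copy the tag exactly as written, attributes included.
        self.out.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        self.out.append(f"</{tag}>")

    def handle_data(self, data):
        # Only text between tags is subject to replacement.
        self.out.append(self.pattern.sub(self.new, data))

    def handle_entityref(self, name):
        self.out.append(f"&{name};")

    def handle_charref(self, name):
        self.out.append(f"&#{name};")

    def handle_comment(self, data):
        self.out.append(f"<!--{data}-->")

    def handle_decl(self, decl):
        self.out.append(f"<!{decl}>")

def replace_in_text(html: str, old: str, new: str) -> str:
    parser = TextOnlyReplacer(old, new)
    parser.feed(html)
    parser.close()
    return "".join(parser.out)

print(replace_in_text(
    '<p>SEARCHTERM here <img src="../SEARCHTERM/img.jpg" /></p>',
    "SEARCHTERM", "NEW",
))
```

The processed string would then be written back with an ordinary UPDATE, as in the last step above.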
Not sure if it's possible, but it's worth a shot. I am trying to insert some text into a MySQL 'TEXT' field. I want to change some of the words within the text depending on other fields from other tables in the MySQL database. Something similar to a PHP email template where the 'Dear ${first_name}' can be changed depending on who the email is going to...
Can something like this be done within a field in a MySQL table?
I am aware this can be done using a PHP file, but I was wondering if this can be done using MySQL.
Yeah, I guess you could do it, using a stored procedure.
It would have to have a REPEAT loop that uses LOCATE to find the string index of the next '${' token, takes the name from there up to the next LOCATEd '}', and replaces it by CONCAT and SUBSTRING with the value. If that value comes from a simple name-to-value lookup table that's not too bad, but if you want ${first_name} to actually use the column called first_name you would have to create some dynamic SQL in a string and run it using PREPARE...EXECUTE, which is ugly and dangerous.
It would be complex, fragile and DBMS-dependent. SQL is not really designed to be convenient for string fiddling. Any general-purpose programming language with reasonable string-manipulation facilities should be able to do it in a much more straightforward way. If you have PHP available, use it.
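To illustrate how short this is in a general-purpose language, here is a sketch in Python (the same idea carries over to PHP). It uses the standard library's `string.Template`, which happens to understand the `${name}` syntax; the function name `render` and the sample values are made up, and the values are assumed to have been fetched from the database already.

```python
from string import Template

def render(template_text: str, values: dict) -> str:
    """Fill ${name} placeholders from a name-to-value mapping."""
    # safe_substitute leaves unknown placeholders in place instead of raising.
    return Template(template_text).safe_substitute(values)

stored_text = "Dear ${first_name}, your order ${order_id} has shipped."
print(render(stored_text, {"first_name": "Ann"}))
# prints: Dear Ann, your order ${order_id} has shipped.
```

The TEXT column keeps the template with its placeholders, and the substitution happens each time the text is read.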