Extract specific words from text field in MySQL

I have a table that contains a text field; each row holds around 3 to 4 sentences.
I am building an auto-complete HTML widget. When the user types the beginning of a word, I would like the database to return the words from that text field that start with those letters.
Example of a text field: I like fishsticks, fishhat are great too
If I type "fish" in the auto-complete, it should propose "fishsticks" and "fishhat".
Everything works but the query.
I can easily find the rows that contain a specific word, but I can't extract only the word rather than the full text.
SELECT data_txt FROM mytable WHERE MATCH(data_txt) AGAINST('fish' IN BOOLEAN MODE) LIMIT 10
I know it is dirty, but I cannot rearrange the database.
Thank you for your help!
EDIT:
Here's what I got, thanks to Brent Worden, it is not clean but it works:
SELECT DISTINCT
SUBSTRING(data_txt,
LOCATE('great', data_txt),
LOCATE(' ' , data_txt, LOCATE('great', data_txt)) - LOCATE('great', data_txt)
)
FROM mytable WHERE data_txt LIKE '% great%'
LIMIT 10
Any idea how to avoid repeating the same LOCATE expression over and over?

Use LOCATE to find the occurrence of the word.
Use LOCATE and the previous LOCATE return value to find the occurrence of the first space after the word.
Use SUBSTR and the previous two LOCATE return values to extract the whole word.
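If the extraction can happen on the application side instead, the same idea (find where the word starts, then read up to the next word boundary) can be sketched in Python. This is only an illustration, not the LOCATE/SUBSTR query itself: it swaps in a regular expression, and the function name `words_with_prefix` is made up for the example.

```python
import re

def words_with_prefix(text, prefix):
    """Return the distinct words in `text` that start with `prefix` (case-insensitive)."""
    # \b anchors the match at the start of a word; \w* consumes the rest of it.
    pattern = re.compile(r"\b" + re.escape(prefix) + r"\w*", re.IGNORECASE)
    found = []
    seen = set()
    for word in pattern.findall(text):
        key = word.lower()
        if key not in seen:  # keep the first occurrence only, like SELECT DISTINCT
            seen.add(key)
            found.append(word)
    return found

print(words_with_prefix("I like fishsticks, fishhat are great too", "fish"))
```

Unlike a single LOCATE call, this returns every matching word in the field, not just the first one.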

$tagsql ="SELECT * from mytable";
$tagquery = mysql_query($tagsql);
$tags = array(); //Creates an empty array
while ($tagrow = mysql_fetch_array($tagquery)) {
$tags[] = $tagrow['tags']; //Fills the empty array
}
If the rows contain commas, you can use:
$comma_separated = implode(",", $tags);
Replace the comma with a space if the values are separated by spaces in your table.
$exp = explode(",", $comma_separated);
If you require your data to be unique you may include the following:
$uniquetags = array_unique($exp, SORT_REGULAR);
You can use print_r to inspect the resulting array.
array_merge is used here because $rt will not be displayed if you are using a jQuery autocomplete; otherwise $rt may work and array_merge can be ignored. You can also use array_merge to combine results from multiple tables by repeating the previous process.
$autocompletetags = array_merge((array)$uniquetags);
This sorts the values in alphabetical order:
sort($autocompletetags, SORT_REGULAR);
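For comparison, the whole pipeline above (collect the column values, split them, drop duplicates, sort) might look like this in Python. It is a rough sketch: `rows` stands in for the values fetched from the database, and the function name is invented for the example.

```python
def autocomplete_tags(rows, sep=","):
    """Split each row on `sep`, drop blanks and duplicates, and return a sorted list."""
    tags = set()
    for row in rows:
        for tag in row.split(sep):
            tag = tag.strip()
            if tag:  # skip empty pieces such as those from "a,,b"
                tags.add(tag)
    return sorted(tags)

print(autocomplete_tags(["php,mysql", "mysql, html", ""]))
```

Pass `sep=" "` if the values are separated by spaces rather than commas.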

Related

MySQL remove everything after n words in column

I have a table with over 5000 entries, and I want to replace the texts with excerpts. The column 'text' usually has between 1000 and 2000 words. I want every cell to be cut after 80 words. It would also be nice to add something like 'Read more...' after the 80 words. Is this possible with a MySQL query?
Doing this at the MySQL level is a bad idea. It is a view-layer concern, so it should be implemented in the code that reads the data from the database and presents it.
In PHP, it can be done with
function cutStringAfterWords($phrase,$max_words){
$phrase_array = explode(' ',$phrase);
if(count($phrase_array) > $max_words && $max_words > 0)
$phrase = implode(' ',array_slice($phrase_array, 0, $max_words)).'...';
return $phrase;
}
echo cutStringAfterWords($largeText,80).' Read more...';
Most other languages have equivalents.
Edit: added an example that cuts after 80 words. You can replace the three dots inside the function with 'Read more...' so it is always appended, or remove the dots from the function and manually add the 'Read more...' string after every truncated text.
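As an illustration of such an equivalent, here is a minimal Python sketch of the same truncation. The function name and the `suffix` parameter are assumptions made for the example; note that `str.split()` splits on any run of whitespace, whereas the PHP version splits on single spaces only.

```python
def cut_after_words(text, max_words, suffix="..."):
    """Return `text` cut after `max_words` words, with `suffix` appended if it was cut."""
    words = text.split()
    if max_words <= 0 or len(words) <= max_words:
        return text  # nothing to cut
    return " ".join(words[:max_words]) + suffix

print(cut_after_words("one two three four", 2))
print(cut_after_words("one two three four", 2, suffix=" Read more..."))
```

Appending the suffix only when the text was actually cut avoids showing 'Read more...' on entries that are already short.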
You can try with regex.
select REGEXP_SUBSTR(your_column_name,'(.+?\\s+|.+?){1,80}')
from your_table_name
There's a nice MySQL string function, SUBSTRING_INDEX, which you can use like this to get the result you want:
UPDATE your_table SET your_text_field = SUBSTRING_INDEX(your_text_field, ' ', 80) WHERE text_id = some_id;
To add 'Read More..' you can use MySQL CONCAT.
Also, I would suggest backing up your table before you start experimenting.
Execute the query below to update all records, keeping the first 80 words:
update your_table_name set column_name=SUBSTRING_INDEX(column_name, ' ', 80)
Updated to use SUBSTRING_INDEX since you need words.

Removing single quotes from comparison in select statement

I have a table where a field can have single quotes, but I need to be able to search by that field without single quotes. For example, if the search query is "Johns favorite", I need to be able to find a row where that field contains "John's favorite". I was looking into regex for it, but that seems to return a 0 or 1 when used in a select statement, if I'm understanding it correctly.
Take a look at:
http://www.artfulsoftware.com/infotree/queries.php#552
This gives you the distance between two strings. For example, you can check whether the Levenshtein distance is less than 3, which means fewer than 3 edit operations are needed to make the strings equal.
Try using REPLACE:
SELECT
IF(
REPLACE("John's favorite","'","") = "Johns favorite" ,
"found",
"not found"
)
It's not optimal but it should do the job.
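To see the REPLACE comparison applied to a column rather than to literals, here is a small runnable sketch in Python. It uses an in-memory SQLite database as a stand-in for MySQL (both provide a REPLACE string function); the table `items` and column `title` are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (title TEXT)")
conn.executemany(
    "INSERT INTO items (title) VALUES (?)",
    [("John's favorite",), ("Mary's choice",)],
)

def find_ignoring_quotes(conn, search):
    """Return titles that equal `search` once single quotes are stripped from the column."""
    # '''' is the SQL literal for one single-quote character.
    sql = "SELECT title FROM items WHERE REPLACE(title, '''', '') = ?"
    return [row[0] for row in conn.execute(sql, (search,))]

print(find_ignoring_quotes(conn, "Johns favorite"))
```

Because REPLACE is applied to the column, an index on that column cannot help here, which is the sense in which this approach is not optimal.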

Doing partial match on name fields in SQL Server

I need to enable partial matching on name search. Currently it works with Like '%@name%' but it's not good enough.
We need to allow typing in both first name and last name, and both need to match partially, so I'm assuming full text is the way to go.
The problem is that I can't get it to do a partial match on a name. For example, searching for my name (Fedor Hajdu) works if I type either part in full, but not partially (it should match a search for 'fe ha', for example).
How can I achieve this? Can a full-text index be configured to do something like syllable matching?
Hmm, three options that may help you:
the INFLECTIONAL form
CONTAINS with NEAR
CONTAINS with Weighted Values
If that doesn't help, take the string 'fe ha', replace every blank space with a '%', wrap the result in '%', and use LIKE:
Like '%fe%ha%'
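Building that pattern from the user's input could be sketched as follows in Python. The function name is hypothetical, and the sketch assumes the input contains no LIKE wildcards of its own (such as % or _); the result should still be passed to the query as a bound parameter.

```python
def like_pattern(search):
    """Turn 'fe ha' into '%fe%ha%': a '%' around and between the typed fragments."""
    parts = search.split()
    if not parts:
        return "%"  # an empty search matches everything
    return "%" + "%".join(parts) + "%"

print(like_pattern("fe ha"))
```

Note that this pattern requires the fragments to appear in the typed order, so 'ha fe' would not match 'Fedor Hajdu'.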
Using CONTAINS() or CONTAINSTABLE() all you need to do is add * at the end of your matching string:
CONTAINS (Description, '"top*"' );
If you have your string as a parameter, you can concatenate like this:
SET @SearchTerm = '"' + @NameParameter + '*"'
CONTAINS (Description, @SearchTerm );
https://technet.microsoft.com/en-us/library/ms142492(v=sql.105).aspx
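For a search such as 'fe ha', each fragment needs its own prefix term. A sketch of building such a CONTAINS search condition in Python is below; the function name is made up, and combining the terms with AND is an assumption about the desired behaviour (every fragment must match).

```python
def contains_prefix_condition(search):
    """Turn 'fe ha' into '"fe*" AND "ha*"' for use as a CONTAINS search condition."""
    # Drop double quotes from the input so they cannot break out of the quoted term.
    terms = [t.replace('"', "") for t in search.split()]
    terms = [t for t in terms if t]
    return " AND ".join(f'"{t}*"' for t in terms)

print(contains_prefix_condition("fe ha"))
```

The resulting string would be passed as the search-condition parameter, for example as the value of @SearchTerm.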

MySQL list IN list

I currently store users' inputs in comma-separated lists like so:
Userid | Options
1 | 1,2,5
A user ticks a set of options in a form, which arrive as an array that is then joined with commas to make
1,2,5
Then MySQL tries to find other users who have some or all of the same options ticked in a different field (although in the same table).
Currently I do this:
WHERE `choices` IN ('.$row['myoptions'].')
So this could be something like:
WHERE 1,2,5,8 IN (1,4,6)
This would return true because there's at least one matching value, right? Or have I got this confused?
Maybe you are going about this the wrong way.
The function FIND_IN_SET might be helpful; it works on a comma-separated string column as well as on a SET column.
Example:
SELECT * FROM yourtable WHERE FIND_IN_SET('2', Options);
But it only lets you test one value at a time; the example above checks whether 2 is present in each row. A single FIND_IN_SET call cannot compare multiple values.
However, in your case a LIKE clause may be of use too.
Maybe something like:
SELECT * FROM yourtable WHERE Options LIKE '%2,3%';
This searches for the value 2,3 anywhere in the column. But it comes with another complication: it matches only if 2 and 3 are right next to each other. If the column has 2,1,3 or 2,4,5,3 or even 3,2, those records will not be listed. It also matches unrelated values such as 12,30.
Now, coming to your question:
WHERE `choices` IN (1,4,6)
will translate to
WHERE `choices` = '1' OR `choices` = '4' OR `choices` = '6'
so it will return false.
Why?
Because your column contains not just 1, 4, or 6 but 1,2,5 as one string, so all the comparisons above return false.
I do not think this will return true.
WHERE CHOICES IN ()
When you do this, the complete choices value is compared to each individual item inside IN.
You might want to have a look at MySQL's FIND_IN_SET function:
WHERE find_in_set(optionNumber1, choices) > 0
OR find_in_set(optionNumber2, choices) > 0
OR find_in_set(optionNumber3, choices) > 0
You will have to build this query in a loop in the programming language you are using.
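A minimal sketch of that loop in Python is shown below. The function name is hypothetical, `choices` is the column name from the question, and the `%s` placeholders assume a MySQL driver that uses that parameter style (PyMySQL and mysqlclient do).

```python
def find_in_set_clause(options, column="choices"):
    """Build 'FIND_IN_SET(%s, col) > 0 OR ...' plus the matching parameter list."""
    if not options:
        return "FALSE", []  # no options ticked: match nothing
    conditions = [f"FIND_IN_SET(%s, {column}) > 0" for _ in options]
    # FIND_IN_SET compares strings, so pass every option as a string parameter.
    return " OR ".join(conditions), [str(o) for o in options]

clause, params = find_in_set_clause([1, 4, 6])
print("SELECT * FROM mytable WHERE " + clause)
print(params)
```

Only the column name is interpolated into the SQL text; the option values travel as parameters, so user input never ends up inside the query string.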
I think you are not confused. You are absolutely right: this will return something (one or more tuples), and that is of course a true value. So carry on.
I don't know which table the choices column is in, but have you tried it this way?
SELECT * FROM t1 WHERE `choices` > ANY (SELECT options FROM t2);
I just had the same problem and solved it using RLIKE:
$options_in_row = array_filter(explode(',',$row['myoptions'])); // convert the csv to array of numbers. use array_filter because empty values will generate a regex that always find something.
$options_rx = implode('|', array_map(function ($x)
{
return "\b$x\b"; // adding \b to avoid partial number hits, such as '2' inside '123,234'
}, $options_in_row));
// $options_rx is something like '\b123\b|\b234\b'
$sql = '.... WHERE `choices` RLIKE "'.$options_rx.'"';
Take into account that this code assumes a CSV of numbers. If your case is different, you'll have to add escaping.

MySQL sort by name

Is it possible to sort a column alphabetically while ignoring certain words, e.g. 'The'?
e.g.
A normal query would return
string 1
string 3
string 4
the string 2
I would like to return
string 1
the string 2
string 3
string 4
Is this possible?
EDIT
Please note I am looking to ignore multiple words like The, A, etc. Can this be done?
You can try
SELECT id, text FROM table ORDER BY TRIM(REPLACE(LOWER(text), 'the ', ''))
but note that it will be very slow for large datasets as it has to recompute the new string for every row.
IMO you're better off with a separate column with an index on it.
For multiple stopwords just keep nesting REPLACE calls. :)
As an example, this removes "The " before sorting (note that REPLACE removes every occurrence, not only a leading one):
SELECT *
FROM YourTable
ORDER BY REPLACE(Val,'The ', '')
Yes, it is possible to use expressions in the ORDER BY clause:
SELECT * FROM yourTable ORDER BY REPLACE(yourField, "the ", "")
I have a music listing that is well over 75,000 records and I encountered a similar situation. I wrote a PHP script that checked for all strings that began with 'A ', 'An ' or 'The ' and cut that part off the string. I also converted all uppercase letters to lowercase and stored that string in a new column. After setting an index on that column, I was done.
Obviously you display the initial column but sort by the newly-created indexed column. I get results in a second or so now.
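The script that fills such a sort column could look roughly like this in Python. The names are invented for the example; the article list ('A', 'An', 'The') comes from the description above, and only a leading article is removed.

```python
import re

# Matches one leading article followed by whitespace, in any letter case.
LEADING_ARTICLE = re.compile(r"^(?:the|an|a)\s+", re.IGNORECASE)

def sort_key(title):
    """Return the lowercase title without a leading 'A ', 'An ' or 'The '."""
    return LEADING_ARTICLE.sub("", title.strip()).lower()

titles = ["string 1", "string 3", "string 4", "the string 2"]
print(sorted(titles, key=sort_key))
```

Storing `sort_key(title)` in its own indexed column lets the database sort on it directly instead of recomputing the expression for every row.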