sphinx search return match and surrounding characters - mysql

In Sphinx (queried over a MySQL connection), how can I match a term and also get, say, 5 characters before and after the match?
For example: a row contains this is a word in a sentence.
When I query: SELECT term FROM table WHERE MATCH('"word*"')
I would want to see is a word in a s returned. Is this possible?
Edit: Using @barryhunter's helpful answer below, I am now trying to figure out how to fit this into my query:
SELECT field1,field2,SPHINX_SNIPPETS(field3,indexName, "word") as snippet FROM tableName

That's what CALL SNIPPETS is designed for, although it counts in words, not characters.
http://sphinxsearch.com/docs/current/sphinxql-call-snippets.html
CALL SNIPPETS('this is a word in a sentence','index1','word', 2 AS around, 5 AS limit_words);
would get back
... is a word in a ...
Add '' AS chunk_separator if you don't want the ellipsis.
Edit: To build the snippet during the search query itself (not as a separate CALL query), you can use the SNIPPET() function in the initial SELECT:
http://sphinxsearch.com/docs/current.html#sphinxql-select
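Since CALL SNIPPETS counts in words, a strict character window would have to be cut on the client side. Below is a minimal Python sketch of that idea, not something Sphinx provides; `char_window` is a hypothetical helper name, and it assumes the full field text has already been fetched and that a plain substring match is good enough.

```python
from typing import Optional


def char_window(text: str, term: str, around: int = 5) -> Optional[str]:
    """Return `term` plus up to `around` characters on each side.

    Hypothetical client-side helper; returns None when `term` is absent.
    """
    start = text.find(term)
    if start == -1:
        return None
    # Clamp the window to the bounds of the text.
    left = max(0, start - around)
    right = min(len(text), start + len(term) + around)
    return text[left:right]


print(char_window("this is a word in a sentence", "word", 5))
# prints: is a word in a
```

This only handles the first occurrence and does no stemming or wildcard matching, so it is a complement to the Sphinx snippet functions rather than a replacement.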

Related

Display output of MySQL query in one line/ horizontally

I am validating the results of a MySQL database that I created and, for that, I need some screenshots.
For example, the following query:
select distinct run_ID
from ngsRunStats_FK.failedRuns
where reason_fail regexp 'cannot populate readsInfo'
will return (output from the terminal)
But as we can see, the screenshot is far too long.
Is there a way to display only the values horizontally (e.g. like a Python list) instead of as a (vertical) column?
Try using GROUP_CONCAT:
SELECT GROUP_CONCAT(run_ID ORDER BY run_ID) AS run_ID_values
FROM ngsRunStats_FK.failedRuns
WHERE reason_fail REGEXP 'cannot populate readsInfo';
Side note:
If you really want to match the phrase cannot populate readsInfo as whole words anywhere inside a larger text within the reason_fail column, then consider using word boundaries with REGEXP (the [[:<:]] and [[:>:]] markers work in MySQL before 8.0; MySQL 8.0 uses \\b instead):
WHERE reason_fail REGEXP '[[:<:]]cannot populate readsInfo[[:>:]]';

Isolate an email address from a string using MySQL

I am trying to isolate an email address from a block of free field text (column name is TEXT).
There are many different variations of preceding and succeeding characters in the free text field, i.e.:
email me! john@smith.com
e:john@smith.com m:555-555-5555
john@smith.com--personal email
I've tried variations of INSTR() and SUBSTRING_INDEX() to first isolate the "@" (probably the one reliable constant in finding an email...), then extract the characters to the left (up until a space or non-qualifying character like "-" or ":") and do the same thing with the text following the @.
However - everything I've tried so far hasn't filtered out the noise to the level I need.
Obviously 100% accuracy isn't possible but would someone mind taking a crack at how I can structure my select statement?
There is no easy solution to do this within MySQL. However you can do this easily after you have retrieved it using regular expressions.
Here is an example of how to use it in your case: Regex example
If you want it to select all e-mail addresses from one string: Regex Example
In MySQL you can use a regex to select the rows that contain an e-mail address, but that still doesn't extract the matched group from the string. That has to be done outside MySQL.
SELECT * FROM table
WHERE column RLIKE '\w*@\w*.\w*'
RLIKE only tests for a match. You can also use REGEXP in the SELECT list, but it only returns 1 or 0 depending on whether a match was found.
If you do want to extract it in MySQL, maybe this other Stack Overflow post helps you out. But it seems like a lot of work compared with doing it outside MySQL.
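The "extract it after retrieval" approach can be sketched in Python as follows. This is an illustration under assumptions, not a complete e-mail validator: the pattern `EMAIL_RE` and the helper `extract_emails` are hypothetical names, the addresses are assumed to use the usual "@" separator, and the sample rows mirror the ones in the question.

```python
import re
from typing import List

# Deliberately loose pattern: local part, "@", domain, dot, 2+ letter TLD.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def extract_emails(text: str) -> List[str]:
    """Return every substring of `text` that looks like an e-mail address."""
    return EMAIL_RE.findall(text)


# Free-text values as they might come back from the TEXT column.
rows = [
    "email me! john@smith.com",
    "e:john@smith.com m:555-555-5555",
    "john@smith.com--personal email",
]

for row in rows:
    print(extract_emails(row))
# each line prints: ['john@smith.com']
```

Preceding characters such as "e:" are skipped because ":" is not in the local-part character class, and trailing noise such as "--personal" is dropped when the regex engine backtracks to the last dot followed by letters.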
In MySQL 8.0 and later you can use REGEXP_SUBSTR to isolate just the email from a block of free text.
SELECT *, REGEXP_SUBSTR(`TEXT`, '([a-zA-Z0-9._%+-]+)@([a-zA-Z0-9.-]+)\\.([a-zA-Z]{2,4})') AS Emails FROM `mytable`;
If you want to get just the records with emails and remove duplicates ...
SELECT DISTINCT REGEXP_SUBSTR(`TEXT`, '([a-zA-Z0-9._%+-]+)@([a-zA-Z0-9.-]+)\\.([a-zA-Z]{2,4})') AS Emails FROM `mytable` WHERE `TEXT` REGEXP '([a-zA-Z0-9._%+-]+)@([a-zA-Z0-9.-]+)\\.([a-zA-Z]{2,4})';

Mysql search, match words if similar

I have a db with a table called sections. It has a field called head with a full-text index, and 3 entries, each a string. 2 entries have the word motorcycle and one has motorcycles. I can't seem to find a way to return all 3 if the term "motorcycles" is searched.
I have tried
SELECT * FROM sections
WHERE MATCH (head) AGAINST ('Motorcycles')
but it only returns the plural entry. I have also tried.
SELECT * FROM sections
WHERE head like '%motorcycles%'
but that also only returns the plural entry. Is there a way to return all three rows based on "motorcycles"?
Have you tried boolean mode?
WHERE MATCH (head) AGAINST ('+Motorcycle*' IN BOOLEAN MODE)
More information is here.
Your where clause has an extra "s":
SELECT * FROM sections WHERE head like '%motorcycle%'
Assuming your question is more general than the specific motorcycle example you've given... I'm not aware of a way to relax the constraints directly in the SQL (without a stored procedure to pre-process the input). I'd suggest pre-processing your input with a regex to remove or replace the characters that make the word plural, then using LIKE, in the way that you have shown, on the singular version of the word.
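The pre-processing idea can be sketched in Python like this. It is a deliberately naive assumption about English plurals (strip one trailing "s" unless the word ends in "ss"), and `like_pattern_for` is a hypothetical helper name; irregular plurals and forms like "-ies" would need a proper stemmer.

```python
import re


def like_pattern_for(term: str) -> str:
    """Build a LIKE pattern from the (naively) singular form of `term`."""
    # Drop a single trailing "s", but leave words such as "glass" alone.
    stem = re.sub(r"(?<!s)s$", "", term.strip().lower())
    return f"%{stem}%"


print(like_pattern_for("Motorcycles"))
# prints: %motorcycle%
```

The resulting string would then be passed as a bound parameter, e.g. to `SELECT * FROM sections WHERE head LIKE %s` with whichever MySQL driver is in use, rather than concatenated into the SQL text.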
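If I have understood your question correctly, I think you want something like this (note that this IF ... BEGIN ... END form is T-SQL style; in MySQL the IF statement is only available inside a stored program):
if (SELECT count(1) FROM sections WHERE head like '%motorcycles%')>1
begin
select * FROM sections
WHERE head like '%motorcycle%'
end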

Get all records between two alpha variables in alpha order mysql

I have a database of words for dictionary lookup purposes. What I need to be able to do with MySQL is allow a user to input two variables (alpha), and my script will return every word that starts with either of those variables and everything in between.
Let's say the two variables are:
$letters1 = abor
$letters2 = accr
I want to get every word that starts with abor through accr. I need to return every word that would fit between those two starting points. So an example SQL statement that I know does not work but might help you understand what I am asking:
SELECT word from table1 WHERE word LIKE '%abor%' THROUGH '%accr%' ORDER BY word ASC
I know that THROUGH is not an operator but that's the general idea of what I need to accomplish.
If you merely want words that start with letters between the two variables, you can use MySQL's BETWEEN ... AND ... operator:
SELECT word FROM table1 WHERE word BETWEEN 'abor' AND 'accr' ORDER BY word
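One caveat with BETWEEN 'abor' AND 'accr': a word such as "accrue" sorts after "accr", so it falls outside the range even though it starts with the upper prefix. The sketch below shows one possible workaround, using a half-open range whose upper bound is the prefix with its last character incremented. It uses Python's built-in sqlite3 purely as a stand-in for MySQL (collation and case handling may differ), and `prefix_range` and the sample words are hypothetical.

```python
import sqlite3
from typing import List


def prefix_range(conn: sqlite3.Connection, low: str, high: str) -> List[str]:
    """Words from prefix `low` through every word starting with `high`."""
    # "accr" -> "accs": the first string that sorts after all "accr..." words.
    upper = high[:-1] + chr(ord(high[-1]) + 1)
    rows = conn.execute(
        "SELECT word FROM table1 WHERE word >= ? AND word < ? ORDER BY word",
        (low, upper),
    ).fetchall()
    return [row[0] for row in rows]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (word TEXT)")
words = ["able", "aborigine", "abort", "abroad", "accept", "accr", "accrue", "acct"]
conn.executemany("INSERT INTO table1 VALUES (?)", [(w,) for w in words])

print(prefix_range(conn, "abor", "accr"))
# prints: ['aborigine', 'abort', 'abroad', 'accept', 'accr', 'accrue']
```

With plain BETWEEN on the same data, "accrue" would be missing from the result.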

Determine how many characters in text are within another text entry in MySQL

I have a table where there is a column of type 'text'. I know I can compare two entries to see if they are the same using a simple select statement. Is there a way to compare two entries and return how similar they are? More specifically, can it say how many characters are different between the two?
For example, suppose one entry is:
This is a line.
And another that is:
This is a line. And another.
I believe I can write a select statement that says the first is contained in the second. But is there a way to alert me that the first is in the second AND there are 13 extra characters in the 2nd?
Try using the Levenshtein distance: http://www.artfulsoftware.com/infotree/queries.php#552
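For reference, this is roughly what a Levenshtein distance computes, sketched in Python rather than as the MySQL stored function behind the link. It is the standard two-row dynamic-programming formulation; the function name `levenshtein` is my own choice.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character inserts, deletes, or substitutions."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution (or match)
            ))
        previous = current
    return previous[-1]


print(levenshtein("This is a line.", "This is a line. And another."))
# prints: 13
```

When one string is a prefix of the other, as in the question's example, the distance is simply the number of extra characters.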
You can use LENGTH along with LIKE to do this. E.g.:
INSERT INTO test VALUES("HELLO WORLD");
select LENGTH(name)-length("HELLO") from test where name like "%HELLO%";
So you'd need to programmatically replace HELLO with whatever the string was you wanted to search for.
Is that what you were looking for?
You could simply measure the length of both strings with char_length() and take the difference.
MySQL: char_length()
(Note that length() and char_length() return different values!)
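The difference is that MySQL's LENGTH() counts bytes while CHAR_LENGTH() counts characters, so the two diverge as soon as the text contains multi-byte characters. The Python sketch below mimics that distinction, assuming the column is stored as UTF-8; `byte_length` and `char_length` are illustrative helper names, not MySQL calls.

```python
def byte_length(s: str) -> int:
    """Analogue of MySQL LENGTH() for a UTF-8 column: size in bytes."""
    return len(s.encode("utf-8"))


def char_length(s: str) -> int:
    """Analogue of MySQL CHAR_LENGTH(): number of characters."""
    return len(s)


print(byte_length("hello"), char_length("hello"))  # prints: 5 5
print(byte_length("héllo"), char_length("héllo"))  # prints: 6 5
```

For counting "extra characters" between two entries, the character-based measure is usually the one that matches intuition.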