MySQL full-text search gives me inaccurate results

Let's say that I have a table that looks like this (MyISAM):
+------------+-------------------+------------------+
| student_id | student_firstname | student_lastname |
+------------+-------------------+------------------+
|         30 | Patrik            | Andersson        |
|         79 | Patrik            | Svensson         |
+------------+-------------------+------------------+
And I perform this query:
SELECT s.student_firstname, s.student_lastname FROM students s
WHERE MATCH (student_firstname, student_lastname)
AGAINST
('+Patrik Svensson*' IN BOOLEAN mode)
This returns both of the rows above. Why don't I get just one row in my result? Is it because the last three letters of student_lastname are the same? Is there any way to make FULLTEXT more precise?

Have you tried reading the MySQL documentation?
http://dev.mysql.com/doc/refman/5.5/en/fulltext-boolean.html
And I quote:
By default (when neither + nor - is specified) the word is optional,
but the rows that contain it are rated higher.
And:
'+apple macintosh'
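Find rows that contain the word “apple”, but rank rows higher if they
also contain “macintosh”.
Your query marks only Patrik with +, so Svensson* is optional: it affects the ranking, not which rows match, and both rows are returned.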

I tested it; this query returns the correct result (note the + in front of Svensson* as well):
SELECT s.student_firstname, s.student_lastname FROM students s
WHERE MATCH (student_firstname, student_lastname)
AGAINST
('+Patrik +Svensson*' IN BOOLEAN mode)
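To see why the second + matters, here is a toy Python model of the two operators involved. It is only a sketch of the documented semantics (+ means required, a trailing * means prefix match, unmarked terms are optional), not how MySQL implements boolean mode; it ignores ranking, stopwords and minimum word length, and the function names are made up for illustration.

```python
import re

def boolean_match(query, text):
    """Toy model: '+' marks a required term, a trailing '*' a prefix match,
    and unmarked terms are optional."""
    words = re.findall(r"\w+", text.lower())
    required_ok = True
    any_hit = False
    for term in query.lower().split():
        required = term.startswith("+")
        term = term.lstrip("+")
        if term.endswith("*"):
            prefix = term[:-1]
            hit = any(w.startswith(prefix) for w in words)
        else:
            hit = term in words
        if required and not hit:
            required_ok = False   # a required term is missing
        any_hit = any_hit or hit
    return required_ok and any_hit

# The two rows from the question, first and last name joined.
rows = [(30, "Patrik Andersson"), (79, "Patrik Svensson")]

def matching_ids(query):
    return [sid for sid, name in rows if boolean_match(query, name)]
```

With '+Patrik Svensson*' both ids come back, because only Patrik is required; with '+Patrik +Svensson*' only id 79 remains.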

Related

Select numberplate from a table with a missing letter or number when the original numberplate can not be found

I have the following problem:
A camera system scans numberplates to check whether a car may enter a car park.
Only numberplates in the table can enter.
Sometimes the camera misses a letter or digit, or misreads one character.
For example, I receive 'BE.TUVV129' instead of 'BE.1TUVV129', so the car cannot enter.
I created this https://dbfiddle.uk/?rdbms=mysql_5.7&fiddle=395c12fe55d1d4870381b788fac08234 to count how many letters/digits of the scanned numberplate are found in each stored numberplate, and I select the one with the highest count.
The problem is that I sometimes end up with the wrong numberplate, because the same characters can appear in another numberplate in a different order. Numberplates can also have from 2 to 10 characters.
Is there a way to search for numberplates that match the original characters in the same order, with a maximum of 1 error in the string?
What I need to do is:
First look up the numberplate exactly as the camera read it; second, only if it is not found, search for a numberplate that matches the scanned one apart from one 'mistake'. This must be done in one SELECT statement.
+-------------+
| Numberplate |
+-------------+
| BE.1GGTT531 |
| BE.364BDRM  |
| BE.1TUVV129 |
| BE.2BCMV569 |
| BE.1FIP479  |
| BE.XDB6501  |
| BE.1XJJ6E34 |
| BE.80T8XZH  |
| BE.2AQHL897 |
| BE.2BFL4G22 |
| BE.2BFL4H11 |
| BE.1UTEB280 |
| BE.1RJHH864 |
+-------------+
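The "same order, at most one error" rule can be sketched outside the database. The Python below is an application-side illustration, not the single SELECT the question asks for: it tries the exact plate first and only then accepts a stored plate that differs by one missing, extra or misread character. The names within_one_edit and find_plate are hypothetical, and returning nothing when several plates are equally close is an assumption about what a gate should do.

```python
PLATES = [
    "BE.1GGTT531", "BE.364BDRM", "BE.1TUVV129", "BE.2BCMV569", "BE.1FIP479",
    "BE.XDB6501", "BE.1XJJ6E34", "BE.80T8XZH", "BE.2AQHL897", "BE.2BFL4G22",
    "BE.2BFL4H11", "BE.1UTEB280", "BE.1RJHH864",
]

def within_one_edit(a, b):
    """True if a and b differ by at most one insertion, deletion or substitution."""
    if abs(len(a) - len(b)) > 1:
        return False
    if len(a) > len(b):
        a, b = b, a                      # make a the shorter (or equal) string
    i = 0
    while i < len(a) and a[i] == b[i]:
        i += 1                           # skip the common prefix
    if len(a) == len(b):
        return a[i + 1:] == b[i + 1:]    # identical, or one substitution at i
    return a[i:] == b[i + 1:]            # b has one extra character at i

def find_plate(scanned, plates):
    if scanned in plates:                # step 1: exact match wins
        return scanned
    close = [p for p in plates if within_one_edit(scanned, p)]
    # step 2: accept a fuzzy match only when it is unambiguous (assumption)
    return close[0] if len(close) == 1 else None
```

For the example in the question, find_plate('BE.TUVV129', PLATES) resolves to 'BE.1TUVV129', while a scan that is two or more edits away from every stored plate is rejected.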

MySQL search query ordered by match relevance

I know basic MySQL querying, but I have no idea how to achieve an accurate and relevant search query.
My table looks like this:
id | kanji
-------------
1 | 一子
2 | 一人子
3 | 一私人
4 | 一時
5 | 一時逃れ
I already have this query:
SELECT * FROM `definition` WHERE `kanji` LIKE '%一%'
The problem is that I want to order the results by the characters the user has already learnt; 一 is the required character for the results of this query.
Say a user knows these characters: 人,子,時
Then I want the results to be ordered like this:
id | kanji
-------------
2 | 一人子
1 | 一子
4 | 一時
3 | 一私人
5 | 一時逃れ
The result which matches the most learnt characters should be first. If possible, I'd like to show results that contain only learnt characters first, then a mix of learnt and unknown characters.
How do I do that?

As you asked, this orders by the number of unmatched characters (ascending) and then by the number of matched characters (descending):
SELECT *,
(kanji LIKE '%人%')
+ (kanji LIKE '%子%')
+ (kanji LIKE '%時%') AS score
FROM `definition`
WHERE kanji LIKE '%一%'
ORDER BY CHAR_LENGTH(kanji) - score, score DESC
Alternatively, the relational way to do it is to normalize: store one row per character in a table like this:
kanji_characters
kanji_id | index | character
----------------------------
1 | 0 | 一
1 | 1 | 子
2 | 0 | 一
2 | 1 | 人
2 | 2 | 子
...
Then
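-- `index` and `character` are reserved words in MySQL, hence the backticks.
SELECT kanji_id, length, score
FROM (
SELECT kanji_id,
COUNT(*) AS length,
SUM(CASE WHEN `character` IN ('人','子','時') THEN 1 ELSE 0 END) AS score
FROM kanji_characters
WHERE `index` <> 0
AND kanji_id IN (SELECT kanji_id FROM kanji_characters WHERE `index` = 0 AND `character` = '一')
GROUP BY kanji_id
) t
ORDER BY length - score, score DESC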
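You didn't specify what should happen with duplicate characters, and the two solutions above handle that case differently: the LIKE version counts each learnt character at most once per word, while the normalized version counts every occurrence.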
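The ordering rule of the first query can be checked against the sample data with a few lines of Python. This is only a sketch of the sort key (unmatched characters ascending, then matched characters descending); breaking ties by id is an assumption, since the question does not say how ties should be ordered.

```python
learnt = {"人", "子", "時"}
rows = [(1, "一子"), (2, "一人子"), (3, "一私人"), (4, "一時"), (5, "一時逃れ")]

def sort_key(row):
    row_id, kanji = row
    # Number of learnt characters that occur in the word (each counted once).
    score = sum(1 for ch in learnt if ch in kanji)
    # Fewest unmatched characters first, then most matched, then id (assumed tie-break).
    return (len(kanji) - score, -score, row_id)

candidates = [r for r in rows if "一" in r[1]]   # 一 is the required character
ordered = [row_id for row_id, _ in sorted(candidates, key=sort_key)]
```

On the sample table this gives the ids in the order 2, 1, 4, 3, 5, which is the order requested in the question.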

Just a thought, but a full-text index may help; you can get a relevance score back like this:
SELECT MATCH(kanji) AGAINST ('your search' IN NATURAL LANGUAGE MODE) AS score
FROM `definition` WHERE MATCH(`kanji`) AGAINST ('your search' IN NATURAL LANGUAGE MODE)
ORDER BY score DESC, CHAR_LENGTH(kanji)
The trick is to index these terms (or words?) the right way. I think the general trick is to wrap each word in double quotes and put a space between them, so that the tokenizer populates the index the way you want. Of course, you would need to add the quotes on the way in and remove them on the way out.
Hope this doesn't bog you down.

Selecting rows based upon a search string or any of its synonyms

I need some help please...
I have 2 tables, one contains a description field which is entered freehand by the user, the second table is made up of 2 columns, the first is a group name and the second is a list of synonyms. So, for example, I might have three rows in the synonyms table in a group called A that contains the synonyms 'Leaflet', 'Brochure', 'Hand Bill'.
What I need to do is return all rows from the first table where the ItemDescription column contains any of the synonyms of the query variable which might be 'Leaflet'.
So this should give me all of the rows that contain anywhere in the long description field the words 'Leaflet', 'Brochure' or 'Hand Bill'.
I have only been able to do this where the ItemDescription field contains nothing but the words being looked for. In reality this is a long, wordy column that may contain 50 or 60 words, any one of which may be the search word or one of its synonyms.
All help gratefully received as always.
Thanks.

You should probably try to use LIKE or RLIKE to match the description column. In this case, you want to match a number of alternatives, so I'll just show an example.
Let us assume that we have this table containing synonyms. Note that we have added the word itself as a synonym:
+---------+-----------+
| word    | synonym   |
+---------+-----------+
| leaflet | leaflet   |
| leaflet | brochure  |
| leaflet | hand bill |
| skin    | skin      |
| skin    | leather   |
| skin    | hide      |
+---------+-----------+
You don't give an example table, so I invented one called items:
+---------+-------------------+-----------------------------------+
| item_id | brief             | description                       |
+---------+-------------------+-----------------------------------+
|       1 | Diamond           | This brochure is glossy and shiny |
|       2 | Halloween Special | A leaflet for the Halloween       |
|       3 | Pumpkin           | This is just a Halloween pumpkin  |
+---------+-------------------+-----------------------------------+
Now, we assume that you want to look for all rows containing one of the synonyms of 'leaflet' in the description. The following query does the job:
SELECT * FROM items
WHERE description RLIKE (
SELECT
CONCAT('.*(', GROUP_CONCAT(synonym SEPARATOR '|'), ').*')
FROM synonyms
WHERE word = 'leaflet'
GROUP BY word
);
The inner SELECT creates a regular expression matching any of the synonyms, and the outer SELECT applies this regular expression to the description column of our items table.
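The same idea, building one alternation pattern from the synonym list and applying it to each description, looks like this in Python. It is a sketch over the two example tables above, held in plain dictionaries; the names synonym_pattern and search are made up. Unlike the SQL version it escapes the synonyms and adds word boundaries, so that 'hide' would not match inside 'hideous'; that is a deliberate difference, not something the query above does.

```python
import re

synonyms = {
    "leaflet": ["leaflet", "brochure", "hand bill"],
    "skin": ["skin", "leather", "hide"],
}

items = {
    1: "This brochure is glossy and shiny",
    2: "A leaflet for the Halloween",
    3: "This is just a Halloween pumpkin",
}

def synonym_pattern(word):
    # Escape each synonym so characters such as '.' or '(' are matched literally.
    alternatives = "|".join(re.escape(s) for s in synonyms[word])
    return re.compile(r"\b(?:" + alternatives + r")\b", re.IGNORECASE)

def search(word):
    pattern = synonym_pattern(word)
    return [item_id for item_id, text in items.items() if pattern.search(text)]
```

Searching for 'leaflet' finds items 1 and 2 (through 'brochure' and 'leaflet'), while 'skin' finds nothing in this sample.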

Thanks for the feedback. I have found an answer to my SQL needs:
SELECT *
FROM MainTable a
WHERE EXISTS
(
  SELECT 1
  FROM (
    SELECT CONCAT('%', Synonym, '%') AS cond
    FROM synonyms
    WHERE Synonym LIKE '%SearchString%'
    -- IN rather than = so that several matching groups do not raise an error
    OR ListRef IN (SELECT ListRef
                   FROM synonyms
                   WHERE Synonym LIKE '%SearchString%')
  ) c
  WHERE a.ItemDescription LIKE cond
)
OR a.ItemDescription LIKE '%SearchString%'
Without the final OR I was only returning rows where something existed in the synonyms table for my search string, with the OR it also returns all straight matches not found through synonyms.

MySQL Natural Sort (like OSX Finder)

I've searched for this for a long time, but the solutions I've found aren't working as I need.
Let me explain: I have a table containing a couple of thousand products, each with an alphanumeric SKU that is also used for sorting.
This SKU consists of:
Category Code (variable number of alphabetic characters),
Product Number (integer),
Product Model Variation (optional, variable number of alphabetic characters)
For example: MANT 12 CL (without spaces)
Now, I need to get them ordered like this (and if these were filenames, OSX Finder would order them perfectly):
MANT1
MANT2
MANT2C
MANT2D
MANT2W
MANT3
MANT4C
MANT9
MANT12
MANT12C
MANT12CL
MANT12P
MANT13
MANT21
MANT24
MANT24D
MANT29
Of course ORDER BY sku is plainly wrong:
MANT1
MANT12
MANT12C
MANT12CL
MANT12P
MANT13
MANT2
MANT21
MANT24
MANT24D
MANT29
MANT2C
MANT2D
MANT2W
MANT3
MANT4C
MANT9
And ORDER BY LENGTH(sku), sku has problems sorting the model variations:
MANT1
MANT2
MANT3
MANT9
MANT12
MANT13
MANT21
MANT24
MANT29
MANT2C
MANT2D
MANT2W
MANT4C
MANT12C
MANT12P
MANT24D
MANT12CL
So, is there a way to sort this stuff like Finder would?
(Also, once sorted, is there a way to get the next and previous product? I don't mind using several queries: at this point elegance is the last of my problems...)
Thanks everybody in advance.
One last thing: during my searches I've found this answer to a similar question
but I have no idea how to use it in PHP, so I don't know if it works and is actually an answer to my question.

Are you using PHP to fetch the data?
If so, you could sort in memory with a natural sort function (such as natsort()) once the data has been loaded.
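A minimal sketch of the in-memory approach, written in Python here although the same idea carries over to PHP. It assumes every SKU has the shape described in the question (letters, then digits, then optional letters); sku_key and neighbours are hypothetical helper names. Sorting on the tuple (category, number as integer, variation) gives the Finder-like order, and the previous/next product falls out of the sorted list.

```python
import re

SKU_RE = re.compile(r"([A-Za-z]+)(\d+)([A-Za-z]*)")

def sku_key(sku):
    m = SKU_RE.fullmatch(sku)
    if m is None:
        return (sku, 0, "")          # unexpected shape: fall back to plain text order
    category, number, variation = m.groups()
    return (category, int(number), variation)   # compare the number numerically

def neighbours(sku, skus):
    """Return (previous, next) SKU in natural order; None at either end."""
    ordered = sorted(skus, key=sku_key)
    i = ordered.index(sku)
    prev_sku = ordered[i - 1] if i > 0 else None
    next_sku = ordered[i + 1] if i + 1 < len(ordered) else None
    return prev_sku, next_sku

skus = ["MANT12", "MANT2C", "MANT1", "MANT12CL", "MANT2", "MANT9", "MANT12C",
        "MANT2D", "MANT3", "MANT24D", "MANT2W", "MANT4C", "MANT12P", "MANT13",
        "MANT21", "MANT24", "MANT29"]
natural = sorted(skus, key=sku_key)
```

For a couple of thousand rows this sort is cheap; if the order is needed inside SQL, one option is to store the three parts in separate columns and ORDER BY them.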

The order is not 'plainly wrong'; it simply depends on which collation you use. In your case, you might try a binary collation, for example 'latin1_bin'.
The following example shows ORDER BY with COLLATE on UTF8 data:
mysql> SELECT c1 FROM t1 ORDER BY c1;
+------+
| c1 |
+------+
| a1 |
| a12 |
| a13c |
| a2 |
| a21 |
+------+
mysql> SELECT c1 FROM t1 ORDER BY c1 COLLATE 'utf8_bin';
+------+
| c1 |
+------+
| a1 |
| a12 |
| a2 |
| a21 |
| a13c |
+------+

Full-text MySQL search - return snippets

I have a MySQL table that contains chapters of books.
Table: book_chapter
--------------------------
| id | book_id | content |
--------------------------
I am currently able to search the content using full-text search like this:
SELECT * FROM book_chapter WHERE book_chapter.book_id="2" AND
MATCH (book_chapter.content) AGAINST ("unicorn hair" IN BOOLEAN MODE)
However, I would like to know if it's possible to search the content and have the results returned in 30 character snippets, just so the user can feel the gist. So for example, if I search for "unicorn hair", I would have a result like this:
-------------------------------------------------
| id | book_id | content |
-------------------------------------------------
| 15 | 2 | it isn't unicorn hair. You kno |
-------------------------------------------------
| 15 | 2 | chup and unicorn hair in soup |
-------------------------------------------------
| 27 | 2 | , should unicorn hair be used |
-------------------------------------------------
| 31 | 2 | eware of unicorn hair for bal |
Notice that there are two results from the same record. Is that possible as well?

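An improvement to the query by Mike Bryant:
If the match is within the first 10 characters of the field, LOCATE(...) - 10 is zero or negative, so SUBSTRING either returns an empty string (position 0) or starts counting from the end of the content (negative position).
I added an IF to clamp the start position to 1: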
SELECT
id,
book_id,
SUBSTRING(
content,
IF(LOCATE("unicorn hair", content) > 10, LOCATE("unicorn hair", content) - 10, 1),
10 + CHAR_LENGTH("unicorn hair") + 10
)
FROM
book_chapter
WHERE book_chapter.book_id="2"
AND MATCH (book_chapter.content) AGAINST ("unicorn hair" IN BOOLEAN MODE)

Try something like this to create a snippet from the first match of the search phrase plus the 10 characters before it and the 10 characters after it. This is not 30 characters in length, but it may be a better solution depending on the length of the search phrase (what if your search phrase is longer than 30 characters?). It doesn't address your wish to show multiple results for the same record in the result set; for something like that, I think you would be best served by creating a stored procedure to do the work for you.
SELECT id, book_id, SUBSTRING(content, LOCATE("unicorn hair", content) - 10, 10 + LENGTH("unicorn hair") + 10) FROM book_chapter WHERE book_chapter.book_id="2" AND
MATCH (book_chapter.content) AGAINST ("unicorn hair" IN BOOLEAN MODE)
Obviously, you would replace "unicorn hair" with your own search phrase in every place it appears.
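For the part the SQL does not cover, several snippets from the same record, one option is to let MySQL find the matching rows and cut the snippets in application code. The Python below is a sketch of that second step only; the function name snippets and the 10-character context are illustrative choices, and it assumes lowercasing does not change the length of the text.

```python
def snippets(content, phrase, context=10):
    """Return one snippet per occurrence of phrase, with `context`
    characters kept on each side (fewer at the edges of the text)."""
    results = []
    haystack, needle = content.lower(), phrase.lower()   # case-insensitive search
    start = haystack.find(needle)
    while start != -1:
        begin = max(0, start - context)                  # clamp at the start of the text
        end = start + len(phrase) + context              # slicing clamps at the end
        results.append(content[begin:end])
        start = haystack.find(needle, start + len(needle))
    return results

chapter = "it isn't unicorn hair. You know, ketchup and unicorn hair in soup"
found = snippets(chapter, "unicorn hair")
```

Each row returned by the full-text query would be passed through snippets(), producing as many result lines per record as there are occurrences.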
