MySQL full text search - no partial recognition

I'm trying to build a keyword search tool based on MySQL. I only get results for full words, but I would like to get results for partial matches too.
My db structure looks like this:
My db content looks like this:
This query works:
SELECT * FROM chromext_keywords WHERE MATCH (keyword) AGAINST ('Redmi')
But this one doesn't work (no result):
SELECT * FROM chromext_keywords WHERE MATCH (keyword) AGAINST ('red')
I tried adding % but it did not solve the problem. I also tried both the natural language and boolean modes, but neither helped.
Update with create table query:
CREATE TABLE chromext_keywords (
id int(10) NOT NULL,
keyword text NOT NULL,
blacklist text NOT NULL,
category text NOT NULL
) ENGINE=MyISAM DEFAULT CHARSET=latin1;
and insert:
INSERT INTO chromext_keywords (id, keyword, blacklist, category) VALUES
(1, 'Redmi Note 10', '9,8,pro', '2'),
(2, 'Realme GT', '6,7,8,narzo', '2');
and I added full text:
ALTER TABLE chromext_keywords
ADD UNIQUE KEY id (id);
ALTER TABLE chromext_keywords ADD FULLTEXT KEY keyword (keyword);
I have tried both InnoDB and MyISAM.
Am I missing something?
Thanks

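You should check the minimum word length setting.
In MySQL, the minimum length of a word that full-text search will index is limited by the parameter
ft_min_word_len
for MyISAM tables (innodb_ft_min_token_size for InnoDB), and by default MyISAM only indexes words longer than 3 characters.
Take a look at the related docs: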
https://dev.mysql.com/doc/refman/8.0/en/fulltext-fine-tuning.html
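As a rough sketch of how these settings could be checked, the statements below read the two variables and rebuild a MyISAM full-text index. The table name chromext_keywords is taken from the question; the rest assumes a stock server where ft_min_word_len defaults to 4 and innodb_ft_min_token_size defaults to 3.

```sql
-- MyISAM full-text indexes: minimum indexed word length
SHOW VARIABLES LIKE 'ft_min_word_len';

-- InnoDB full-text indexes: minimum indexed token size
SHOW VARIABLES LIKE 'innodb_ft_min_token_size';

-- Both variables are read-only at runtime: change them in the server
-- configuration file, restart the server, then rebuild the index.
-- For a MyISAM table:
REPAIR TABLE chromext_keywords QUICK;
-- For an InnoDB table, drop and re-create the FULLTEXT index instead.
```

Note that lowering the limit only changes which whole words get indexed. Matching a prefix such as red against Redmi still needs the * operator in boolean mode.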

I have finally found the answer.
The following query works:
SELECT * FROM chromext_keywords WHERE MATCH (keyword) AGAINST ('re*' IN BOOLEAN MODE)
With multiple keywords:
SELECT * FROM chromext_keywords WHERE MATCH (keyword) AGAINST ('+red* +not*' IN BOOLEAN MODE)
I still need to figure out how to cover spelling mistakes. If anyone has an idea, let me know.

Related

MySQL WHERE Condition on integer field returning incorrect values

I'm having a problem with MySQL returning the incorrect result when applying a WHERE condition to an integer field with a string value.
CREATE TABLE `people` (
`id` int(11) unsigned NOT NULL AUTO_INCREMENT,
`name` varchar(255) DEFAULT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=4 DEFAULT CHARSET=latin1;
INSERT INTO `people` (`id`, `name`)
VALUES
(1, 'Bob'),
(2, 'Sally'),
(3, 'Jim');
Now when I run the query:
SELECT *
FROM people
WHERE id = '1-abcd';
My result set is:
id name
1 Bob
MySQL appears to be truncating the string value '1-abcd' to '1' behind the scenes as soon as it hits a non-integral character (in the conversion from a string to INT).
You're probably wondering why this matters. I'm trying to fix a site for a PCI compliance scan. The scan thinks the URI '/some/page?id=102-1' is allowing some form of SQL injection, but in reality it's showing the same content as '/some/page?id=102'.
This is not an issue in one place. It is an issue all over the place, and it's a fairly large system. Is there some way to rectify this on the MySQL end of things, so it no longer mistakenly judges the two values to be equivalent? I looked at the documentation for SQL modes, but didn't see anything regarding this circumstance.
UPDATE: I filed a dispute with the company that produced the scan, which they accepted, so I'm out of the woods. But it is disappointing that there's apparently no way to configure how MySQL casts a string to INT in this case. (You can, but only for INSERTs and UPDATEs.)
What happens is that MySQL type-casts the string literal to an integer. It reads the string from the left, and as soon as it reaches a character that cannot be part of a number, it discards everything from that point on. So '1-0' matches the same rows as 1. To avoid this you can check the value before casting it. I am not 100% sure about the syntax, but it is something like this:
select * from people
where id =
(
case when ISNUMERIC('1-0') = 1
then cast('1-0' as int)
else null
end )
If the value is numeric, this returns the correct matching row; otherwise it returns no rows.
Edit:
The above query seems to be MSSQL syntax and will not work with MySQL. For MySQL you can use REGEXP. I have never used it myself, but you can find more details here:
http://mysqlhints.blogspot.in/2012/01/how-to-find-out-if-entire-string-is.html
http://www.ash.burton.fm/blogs/2010/12/quick-tip-mysql-equivalent-of-isnumeric
http://www.justskins.com/forums/how-to-use-isnumeric-137604.html
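A minimal sketch of the REGEXP idea in MySQL, using the people table from the question. The literal '1-abcd' stands in for the user-supplied value, and the pattern assumes that only unsigned whole numbers are valid ids.

```sql
-- Only compare against id when the input consists of digits alone
SELECT *
FROM people
WHERE '1-abcd' REGEXP '^[0-9]+$'
  AND id = '1-abcd';

-- The same query with a clean value still finds the row for Bob
SELECT *
FROM people
WHERE '1' REGEXP '^[0-9]+$'
  AND id = '1';
```

The first query returns no rows because the REGEXP test fails before the truncated comparison can match. In application code the value would normally be bound as a parameter rather than pasted into the statement twice.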

Fulltext search in mysql doesn't retrieve all rows

I have a problem with a query in MySQL.
This is what I did:
CREATE TABLE `dar`.`MyTable` (
`MyCol` VARCHAR(100) NOT NULL,
FULLTEXT INDEX `Index_1`(`MyCol`)
)
ENGINE = MyISAM;
INSERT INTO MyTable (MyCol)
VALUES ('6002.C3'),
('6002'),
('6002R1'),
('6003.C4'),
('AA6002.X'),
('BB 6002.X');
This is not necessary, but I did it anyway:
REPAIR TABLE MyTable QUICK;
Now I execute the following query:
SELECT MyCol FROM MyTable
WHERE MATCH(MyCol) AGAINST ('6002*');
And it doesn't return any rows!
I changed the parameter ft_min_word_len to 2, but nothing changed.
When I delete the row with 'BB 6002.X', the query returns 2 rows:
6002
6002.C3
That is creepy.
Any idea what is happening here?
I need the query to return:
6002.C3
6002
6002R1
Plus, if possible, these as well:
AA6002.X
BB 6002.X
Thanks in advance!!
You are past the 50% threshold in your dataset. Try
SELECT MyCol FROM MyTable
WHERE MATCH(MyCol) AGAINST ('6003');
And see what the result is.
The 50% threshold has a significant implication when you first try full-text searching to see how it works: If you create a table and insert only one or two rows of text into it, every word in the text occurs in at least 50% of the rows. As a result, no search returns any results. Be sure to insert at least three rows, and preferably many more. Users who need to bypass the 50% limitation can use the boolean search mode; see Section 12.9.2, “Boolean Full-Text Searches”.
http://dev.mysql.com/doc/refman/5.0/en/fulltext-natural-language.html
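As a sketch of the boolean-mode route that the quoted documentation points to, the original query could be rewritten as below. It assumes the MyTable definition from the question. Boolean mode does not apply the 50% threshold, and the trailing * only matches words that start with 6002, so a value such as AA6002.X would need a plain LIKE scan instead.

```sql
-- Boolean mode: no 50% threshold; * is a prefix (truncation) operator
SELECT MyCol FROM MyTable
WHERE MATCH(MyCol) AGAINST ('6002*' IN BOOLEAN MODE);

-- Substring match anywhere in the value, including 'AA6002.X';
-- this scans the table rather than using the full-text index
SELECT MyCol FROM MyTable
WHERE MyCol LIKE '%6002%';
```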

Mysql match against numeric keyword

I use a MySQL full-text index.
I found that it cannot match a short numeric keyword such as '1' in '1,2,3' or '1 2 3'.
I use this query "SELECT * FROM users u where match(u.leader_uids) against('1' IN BOOLEAN MODE);"
How can I solve this issue?
Thanks a lot!
Here is an example that I hope will work for you:
MATCH (field) AGAINST ('+856049' IN BOOLEAN MODE)
It will work only with words of 4 or more characters. So you must prepend some prefix to each leader_uid before saving it. Example:
CREATE TABLE mytable(
id INT NOT NULL KEY AUTO_INCREMENT,
myfield TEXT,
FULLTEXT KEY ix_mytable (myfield)
);
INSERT INTO mytable (myfield) VALUES
('id_1 id_2 id_3'),
('id_8'),
('id_4 id_1');
SELECT * FROM mytable
WHERE MATCH(myfield) AGAINST ('+id_1' IN BOOLEAN MODE);
-- will select rows 1 and 3
You can also change the minimum number of characters required for a word in the MySQL config:
https://dev.mysql.com/doc/refman/8.0/en/innodb-parameters.html#sysvar_innodb_ft_min_token_size
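If the column really holds a comma-separated list such as '1,2,3', one alternative that avoids full-text indexing altogether is FIND_IN_SET. This is a different technique from the answers above, shown only as a sketch. It assumes the users table and leader_uids column from the question, does not handle the space-separated form '1 2 3', and cannot use an index.

```sql
-- FIND_IN_SET returns the 1-based position of '1' in the list, or 0 if absent,
-- so a list such as '10,21' does not match
SELECT *
FROM users u
WHERE FIND_IN_SET('1', u.leader_uids) > 0;
```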

Query gives #1305 - FUNCTION database-name.LEN does not exist; WHY?

EDIT3 MySQL Fiddle here. I have made the example in MySQL so you can see the actual problem. While I am expecting to have Jamie Foxx, Christoph Waltz in the 2 names result, it gives much more, even though it is written exactly the same way as in the SQL example, where it returned the names correctly. :/
EDIT2 SQL Fiddle here. This is a much simpler version, but the logic is there. I need to have this working in MySQL, as the fiddle is in SQL. When I just replace the SQL functions with LENGTH and LOCATE and test it in phpMyAdmin, it returns the entire content of the actors column, not just the first two names. I am even more confused now, as LOCATE is supposed to be equivalent to CHARINDEX.
EDIT1 Oh, I just found out that neither LEN nor CHARINDEX exists in MySQL. I think I can replace LEN with LENGTH, but I don't know what to do about CHARINDEX. I tried using LOCATE, but the result is incorrect: it gives the full content of the actors field. Any insight on this?
Another follow-up on my previous questions. This should be the end of it, though. I had the part of the query that uses the LEN function working on SQL Fiddle, but as soon as I implemented it in the final query in my actual database, I got the function does not exist error. Everything I consider related is listed below:
MySQL query
SELECT
title,
director,
thumb,
LEFT(actors, LEN(actors) - CHARINDEX(', ', actors))AS '2 names'
FROM test
WHERE MATCH (title, director, actors) AGAINST ('arc*' IN BOOLEAN MODE)
The error
#1305 - FUNCTION database-name.LEN does not exist
Setup
OS: MAC OSX
SERVER: MAMP
DB ACCESS: PhpMyAdmin
SHOW CREATE TABLE test
CREATE TABLE `test` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`title` varchar(255) NOT NULL,
`director` varchar(255) NOT NULL,
`actors` varchar(10000) NOT NULL DEFAULT 'Jamie Foxx, Christoph Waltz, Leonardo DiCaprio, Kerry Washington, Samuel L. Jackson',
`summary` text NOT NULL,
`cover` varchar(255) NOT NULL DEFAULT 'http://localhost:8888/assets/OBS/img/uploads/covers-thumb/django_thumb.jpg',
`thumb` varchar(255) NOT NULL DEFAULT 'http://localhost:8888/assets/OBS/img/uploads/covers-bug/django1_cover.jpg',
`trailer` varchar(255) NOT NULL DEFAULT 'fnaojlfdUbs',
PRIMARY KEY (`id`),
FULLTEXT KEY `myindex` (`title`,`director`,`actors`)
) ENGINE=MyISAM AUTO_INCREMENT=101 DEFAULT CHARSET=utf8
I really don't understand why I am getting this error, as the statement worked fine on SQL Fiddle. If you need any additional information, just ask for it. Thank you all for reading, and thanks in advance for your replies.
BTW: Any chance it is caused by actors varchar(10000)?
MySQL doesn't have a built-in CHARINDEX() function. Instead you can use LOCATE, which is the equivalent of CHARINDEX, and instead of LEN you can use LENGTH:
SELECT
title,
director,
thumb,
LEFT(actors, LENGTH(actors) - LOCATE(', ', actors))AS '2 names'
FROM test
WHERE MATCH (title, director, actors) AGAINST ('arc*' IN BOOLEAN MODE)
See fiddle demo
If you just want to show the first two actors' names, you can use SUBSTRING_INDEX:
SELECT
title,
director,
thumb,
SUBSTRING_INDEX(actors, ',', 2) AS '2 names'
FROM test
WHERE MATCH (title, director, actors) AGAINST ('test*' IN BOOLEAN MODE)
See second fiddle demo
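To see why the LEFT/LOCATE expression returns more than two names while SUBSTRING_INDEX does not, the two can be compared on a literal string. This is only an illustrative sketch; the user variable @actors is a made-up stand-in for the actors column.

```sql
SET @actors = 'Jamie Foxx, Christoph Waltz, Leonardo DiCaprio';

-- LOCATE finds the FIRST ', ' only (position 11 here), so
-- LENGTH - LOCATE merely drops that many characters from the end
SELECT LEFT(@actors, LENGTH(@actors) - LOCATE(', ', @actors)) AS left_locate;
-- 'Jamie Foxx, Christoph Waltz, Leonar'

-- SUBSTRING_INDEX keeps everything before the 2nd occurrence of ','
SELECT SUBSTRING_INDEX(@actors, ',', 2) AS two_names;
-- 'Jamie Foxx, Christoph Waltz'
```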

Strange behavior when querying a varchar field

I came across this strange behavior when I was hunting for a bug in a system. Consider the following.
We have a MySQL table which has a varchar(100) column. See the following SQL script.
create table user(
`id` bigint(20) NOT NULL AUTO_INCREMENT,
`user_id` varchar(100) NOT NULL,
`username` varchar(255) DEFAULT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `user_id` (`user_id`)
) ENGINE=InnoDB AUTO_INCREMENT=129 DEFAULT CHARSET=latin1;
insert into user(user_id, username) values('20120723145614834', 'user1');
insert into user(user_id, username) values('20120723151128642', 'user1');
When I execute the following query I get 0 results.
select * from user where user_id=20120723145614834;
But when I execute the following I get the result (note the single quotes).
select * from user where user_id='20120723145614834';
This is expected since the user_id field is a varchar. The strange thing is that both of the following queries yield a result.
select * from user where user_id=20120723151128642;
select * from user where user_id='20120723151128642';
Can anybody explain the reason for this strange behavior? My MySQL version is 5.1.63-0ubuntu0.11.10.1
Check the MySQL documentation, section 12.2, Type Conversion in Expression Evaluation:
Comparisons that use floating-point numbers (or values that are
converted to floating-point numbers) are approximate because such
numbers are inexact. This might lead to results that appear
inconsistent:
mysql> SELECT '18015376320243458' = 18015376320243458;
-> 1
mysql> SELECT '18015376320243459' = 18015376320243459;
-> 0
So it is best to always use the right data type in SQL: when comparing against a varchar column, pass the value as a string.
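A minimal sketch of what that could look like for the user table from the question: keep both sides of the comparison as strings, so that neither value is converted to a floating-point number first.

```sql
-- String literal against a varchar column: exact string comparison
SELECT * FROM user WHERE user_id = '20120723145614834';

-- If the value arrives as a number, cast it to a string explicitly
-- instead of letting MySQL compare both sides as floating point
SELECT * FROM user WHERE user_id = CAST(20120723145614834 AS CHAR);
```

Comparing as strings also lets MySQL use the unique index on user_id, which a numeric comparison against a varchar column generally cannot do.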