Quick way to find a word using a SQL query

My current code is to try to find 2 words "Red Table" in Title:
SELECT `id`,`title`,`colors`, `child_value`, `vendor`,`price`, `image1`,`shipping`
FROM `databasename`.`the_table` WHERE
`display` = '1' and (`title` REGEXP '([[:blank:][:punct:]]|^)RED([[:blank:][:punct:]]|$)')
and (`title` REGEXP '([[:blank:][:punct:]]|^)TABLE([[:blank:][:punct:]]|$)')
The problem is that this is very slow, even though I added an index on the title column.
I just want to search for multiple words at once (preferably in both title AND description), but I can't use a plain LIKE because each word has to be delimited by a space, a dash, or the start or end of the string.
I tried chat or something like that, but phpMyAdmin said the function doesn't exist.
Any suggestions?

A regular index cannot be used for REGEXP, nor for LIKE with a leading wildcard. Use Full Text Search for this. You can create a FULLTEXT index on several columns and search them all in a single expression.
CREATE TABLE the_table(
id INT UNSIGNED AUTO_INCREMENT NOT NULL PRIMARY KEY,
title VARCHAR(200),
description TEXT,
...
FULLTEXT(title,description)
) ENGINE=InnoDB;
And write query like this:
SELECT * FROM the_table WHERE MATCH(title , description )
AGAINST('+RED +TABLE' IN BOOLEAN MODE) -- + means that a word must be present in each row that is returned
Read more about usage and options: MySQL Full text search
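The '+RED +TABLE' string above would usually be assembled from user input in application code. A minimal sketch of one way to do that, assuming Python on the application side; the function name boolean_mode_query is hypothetical and not part of MySQL or any library:

```python
import re

def boolean_mode_query(user_input: str) -> str:
    """Build a BOOLEAN MODE search string that requires every word.

    Hypothetical helper: only runs of letters, digits and underscores are
    kept, so operator characters typed by the user (+ - > < " * etc.) are
    dropped rather than interpreted by the full-text parser.
    """
    words = re.findall(r"\w+", user_input)
    return " ".join("+" + word for word in words)

print(boolean_mode_query("RED TABLE"))  # +RED +TABLE
```

The result is meant to be passed as a bound parameter to AGAINST(... IN BOOLEAN MODE), not concatenated into the SQL text.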

Plan A: MySQL pre-8.0: [[:<:]]RED[[:>:]] is a simplification. Those codes mean "word boundary", which encompasses space, punctuation, and start/end.
Plan B: MySQL 8.0: \\bRED\\b is the new phrasing of the Plan A.
Plan C: FULLTEXT(title) with AND MATCH(title) AGAINST('+RED +TABLE' IN BOOLEAN MODE)
Plan D: If you need specifically "RED TABLE" but not "BLUE TABLE AND A RED CHAIR", then use this technique:
AND MATCH(title) AGAINST('+RED +TABLE' IN BOOLEAN MODE)
AND title LIKE '%RED TABLE%';
Note on Plan D: The fulltext search is very efficient, hence it is done first. The other clauses are then applied to the few resulting rows. That is, the cost of LIKE (or a similar REGEXP) is mitigated by having to check far fewer rows.
Note: We have not discussed "red table" versus "red tables"
By choosing a suitable collation for the column title, you can either limit matches to the same case as the argument or have "Red" = "RED" = "red" = ...
Plan E: (To further fill out the discussion): FULLTEXT(title) with AND MATCH(title) AGAINST('+"red table"' IN BOOLEAN MODE) should match only "red" and "table" in that order and next to each other.
In general...
Use ENGINE=InnoDB, not MyISAM.
It is not really practical to set the minimum word length to 1 or 2. (The default is 3 for InnoDB's innodb_ft_min_token_size and 4 for MyISAM's ft_min_word_len; the two engines use different setting names.)
If your hosting provider does not allow any my.cnf changes, change providers.
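Plans A and B rely on word boundaries. The behaviour can be illustrated outside the database; this is only a sketch using Python's re module, whose \b is similar to, but not the same engine as, MySQL's regular expression support, and the sample titles are made up:

```python
import re

# \b marks a word boundary: start/end of string, whitespace, or punctuation.
pattern = re.compile(r"\bRED\b", re.IGNORECASE)

# Made-up sample titles for illustration.
titles = ["Red Table", "BIG-RED TABLE", "REDUCED TABLE", "Tired table"]

matches = [title for title in titles if pattern.search(title)]
print(matches)  # ['Red Table', 'BIG-RED TABLE']
```

Note that an underscore counts as a word character here, so "RED_TABLE" would not match, whereas the [:punct:] version in the question treats the underscore as a separator.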

Related

What regex can I use to match a percentage of a string?

For example, let's say I have the strings "STARBUCKS #999", "STARBUCKS NYC", "STAR-BUCKS SEA 109"
I want to use regex to query a MySQL database and match the given strings with the column name eg. "starbucks" (or any other name). The problem is, the strings vary and cannot be predicted so I need to match something like 70% of the word so that I can be reasonably confident that I have a match. Is that something that can be done with simple regex or is it a more complicated problem?

I think you can use the fulltext search index to achieve your goal.
CREATE TEMPORARY TABLE input_table (
input TEXT
) ENGINE=MYISAM; -- InnoDB does not support fulltext search index on temporary tables
ALTER TABLE input_table ADD FULLTEXT SEARCH(input);
INSERT INTO input_table (input)
SELECT ('STARBUCKS NYC')
UNION
SELECT ('STAR-BUCKS SEA 109')
UNION
SELECT ('STARBUCKS #999')
UNION
SELECT ('SOMETHING ELSE');
SELECT input, match(input) against('>starbucks <star <bucks' IN BOOLEAN MODE) as `match` from input_table;
DROP TEMPORARY TABLE input_table;
More about fulltext search you can check here.

MySQL wildcard Like query with multiple words

I have a mysql query as follows.
$query="SELECT name,activity FROM appid
where result > 5 AND name LIKE :term ORDER BY name ASC LIMIT 0,40";
$result = $pdo->prepare($query);
$result->bindvalue(':term','%'.$_GET["q"].'%',PDO::PARAM_STR);
$result->execute();
What I want to do is this.
I have an entry like this that I want to find:
'News & Weather'
However, when I type
'news weather'
it of course does not find it. How can I type that and still retrieve that entry?

Regular expressions can do the trick:
select *
from appid
where name rlike 'news|weather' -- Matches 'news' or 'weather'
Another example:
select *
from appid
where name rlike 'news.*weather' -- Matches 'news' and 'weather'
-- with any characters in-between or none at all
-- (ordered)
Just one more:
select *
from appid
where name rlike '^news.{1,}weather$' -- Matches any text that starts with 'news'
-- and has at least one character before
-- ending with 'weather'
Regular expressions can be used to create very complicated filters on text fields. Read the link above for more information.
If you can add a full-text index to your table, Full-text search might be the better way to go with this. Specifically, a boolean Full-Text search:
select *
from appid
where match(name) against ('+news +weather' in boolean mode)

I believe the only possible way is through code:
Option A: Replace the spaces in your query parameter with '%' in code, but that of course means the words must appear in the order they were typed
Option B: Split your parameter on spaces and dynamically construct your query with as many LIKEs as needed, adding additional ":termN" parameters for each one.
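Option B could be sketched roughly as follows. This is an illustration in Python rather than the PHP/PDO of the question; build_like_filter is a hypothetical helper, and the column name is assumed to come from trusted code, never from user input:

```python
def build_like_filter(search: str, column: str = "name"):
    """Return (sql_fragment, params) with one LIKE per whitespace-separated word.

    Hypothetical helper: `column` must be a trusted identifier.
    """
    clauses = []
    params = {}
    for i, word in enumerate(search.split()):
        key = f"term{i}"
        clauses.append(f"{column} LIKE :{key}")
        # Escape LIKE wildcards so user-typed % and _ are matched literally
        # (backslash is MySQL's default LIKE escape character).
        escaped = word.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")
        params[key] = f"%{escaped}%"
    return " AND ".join(clauses), params

sql, params = build_like_filter("news weather")
print(sql)     # name LIKE :term0 AND name LIKE :term1
print(params)  # {'term0': '%news%', 'term1': '%weather%'}
```

The fragment would be appended to the WHERE clause and the params bound one by one, in the same way as the single :term in the question. An empty search yields an empty fragment, which the caller has to handle.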

Big MySQL table, REPLACE -> very slow query

I have a table with 17.6 million rows in a MyISAM database.
I want to search for an article number in it, but the result must not depend on special characters such as dots, commas and others.
I'm using a query like this:
SELECT * FROM `table`
WHERE
replace(replace(replace( replace( `haystack` , ' ', '' ),
'/', '' ), '-', '' ), '.', '' )
LIKE 'needle'
This method is very slow. The table has an index on haystack, but EXPLAIN shows that the query cannot use it, which means the query must scan all 17.6 million rows, taking 3.8 seconds.
The query runs multiple times per page (10-15x), so the page loads extremely slowly.
What should I do? Is it a bad idea to use REPLACE inside the query?

As you do the replace on the actual data in the table, MySQL can't use the index, as it doesn't have any indexed data of the result of the replace which it needs to compare to the needle.
That said, if your replace settings are static, it might be a good idea to denormalize the data and to add a new column like haystack_search which contains the data with all the replaces applied. This column could be filled during an INSERT or UPDATE. An index on this column can then effectively be used.
Note that you probably want to use % in your LIKE query, as otherwise it is effectively the same as a normal equality comparison. However, if you use a search term like %needle% (that is, with a variable start), MySQL again can't use the index and falls back to a table scan, as it can only use the index when the search term has a fixed start, i.e. something like needle%.
So in the end, you might have to tune your database engine so that it can hold the table in memory. Another alternative with MyISAM tables (or, with MySQL 5.6 and up, also with InnoDB tables) is to use a fulltext index on your data, which again allows rather efficient searching.
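The haystack_search idea can be sketched in application code: apply the same normalisation both when writing the row and when preparing the needle, so that the comparison becomes a plain indexed equality. A minimal sketch in Python; the function name and the sample article number are made up, and the stripped characters are the four from the question:

```python
# Characters removed by the nested REPLACE calls in the question.
STRIP_CHARS = str.maketrans("", "", " /-.")

def normalize_article_number(value: str) -> str:
    """Remove spaces, slashes, dashes and dots (hypothetical helper)."""
    return value.translate(STRIP_CHARS)

# Store this value in haystack_search on INSERT/UPDATE, and normalise the
# search term the same way before comparing.
print(normalize_article_number("AB-12/34.5 6"))  # AB123456
```

With both sides normalised, the query reduces to something like WHERE haystack_search = :needle, which can use an index on that column.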

It's "bad" to apply functions to the column as it will force a scan of the column.
Perhaps this is a better method:
SELECT list
, of
, relevant
, columns
, only
FROM your_table
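WHERE haystack REGEXP 'two[ /.-]needles'
In this scenario we are searching for "two needles", where the character between the words can be any of the characters within the square brackets, i.e. "two needles", "two/needles", "two-needles" or "two.needles". MySQL's LIKE does not support bracketed character classes, so REGEXP is used here, with the hyphen placed last in the class so that it is not read as a range. Note that REGEXP cannot use an index either, so this still scans the table.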

You could try using LENGTH on the column, though I'm not sure whether it gives a better effect. Also, when using LIKE you should use the % wildcard:
SELECT * FROM `table`
WHERE
haystack LIKE 'needle%' AND
LENGTH(haystack) - LENGTH(REPLACE(haystack,'/','')) = 0 AND
LENGTH(haystack) - LENGTH(REPLACE(haystack,'-','')) = 0 AND
LENGTH(haystack) - LENGTH(REPLACE(haystack,'.','')) = 0;
If the haystack is exactly needle then do this
SELECT * FROM `table`
WHERE
haystack='needle';

ORDER BY in MySql based on LIKE condition

I am having difficulty sorting results based on a field in MySQL. For example, if I search for the word "Terms", I should get the results that start with 'Terms' first, then 'Terms and' next, then 'Terms and conditions', and so on.
Can anyone please help me work out how to fetch the search results this way, in an efficient manner, using a MySQL query?

SELECT * FROM your_table WHERE your_column LIKE "Terms%" ORDER BY your_column;

Based on the storage engine and mysql version you probably can use the full text search capabilities of MySQL. For example:
SELECT *, MATCH (your_column) AGAINST ('Terms' IN BOOLEAN MODE) AS relevance
FROM your_table
WHERE MATCH (your_column) AGAINST ('Terms' IN BOOLEAN MODE)
ORDER BY relevance DESC
You can find more info here: http://dev.mysql.com/doc/refman/5.5/en/fulltext-boolean.html
Or if you don't want FTS another possible solution where ordering is strictly based on the length (difference) of the strings.
SELECT * FROM your_table WHERE your_column LIKE "Terms%" ORDER BY ABS(LENGTH(your_column) - LENGTH('Terms'));

You are looking for fulltext search. Below a very simple example
SELECT id, name, MATCH (name) AGAINST ('>string string*' IN BOOLEAN MODE) AS score
FROM tablename WHERE MATCH (name) AGAINST ('>string string*' IN BOOLEAN MODE)
ORDER BY score DESC
The advantage of this is that you can control the weight of words. This is very basic; you can 'up' some matches or words (or 'down' them).
In my example an exact match ('string') would get a higher score than the string with something attached ('string*'). The following line is even one step broader:
'string' > 'string*' > '*string*'
This documentation about fulltext search explains a lot. It's a long read, but worth it and complete.

Don't use a fulltext index if you are searching for a string prefix!
With LIKE "Terms%" the optimizer will be able to use an index on your_column, if one exists:
SELECT * FROM your_table
WHERE your_column LIKE "Terms%"
ORDER BY CHAR_LENGTH(your_column),your_column
Note the ORDER BY clause: it sorts by string length first, and uses alphabetical order only to sort strings of equal length.
And please use CHAR_LENGTH rather than LENGTH: the former counts characters, whereas the latter counts bytes. With a variable-length encoding such as utf8, this makes a difference.
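The difference between the two length functions, and the length-then-alphabetical ordering, can be illustrated outside the database. A small sketch in Python with made-up sample values, where len(s) plays the role of CHAR_LENGTH and len(s.encode("utf-8")) the role of LENGTH on a utf8 column:

```python
# Made-up sample values; "Terms\u00e9" is "Terms" followed by an accented e.
rows = ["Terms and conditions", "Terms and", "Terms\u00e9", "Terms"]

accented = rows[2]
char_length = len(accented)                  # characters, like CHAR_LENGTH
byte_length = len(accented.encode("utf-8"))  # bytes, like LENGTH on utf8
print(char_length, byte_length)  # 6 7

# Sort by character count first, then by the string itself for ties.
ordered = sorted(rows, key=lambda s: (len(s), s))
print(ordered)
```

Python compares strings by code point, whereas MySQL uses the column's collation, so the tie-breaking order may differ in detail.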

SQL SERVER FULL-TEXT INDEX, CONTAINS return empty

I have an issue with a full-text index. Can anybody help me with this?
1) Set up the full-text index
CREATE FULLTEXT INDEX ON dbo.Companies(my table name)
(
CompanyName(colum of my table)
Language 0X0
)
KEY INDEX IX_Companies_CompanyAlias ON QuestionsDB
WITH CHANGE_TRACKING AUTO
GO
2) Using CONTAINS to find the matched rows
SELECT CompanyId, CompanyName
FROM dbo.Companies
WHERE CONTAINS(CompanyName,'Micro')
3) Everything runs without error, but the query returns an empty result set, and I am sure there is a company with CompanyName "Microsoft" in the table.
I would much appreciate any help with this.

Your CONTAINS(CompanyName,'Micro') is looking for the word Micro, if you want a prefix match to pick up "Microsoft" use the syntax: CONTAINS(CompanyName,'"Micro*"').
