MySQL: searching rows with TEXT field beginning with a query - mysql

I have a table in which one of the columns is of type TEXT. I need to search the table for rows whose text is similar to a given string.
As the string can be pretty long (let's say 10000 bytes), I decided that it is enough to compare only the first 20 bytes of the string. To make this search faster I created a key:
KEY `description` (`description`(20))
So what I want to do now is run one of the following queries:
SELECT * FROM `table` WHERE STRCMP(SUBSTRING(`description`,0,20),'string_to_compare') = 0
or
SELECT * FROM `table` WHERE `description` LIKE 'string_to_compare%'
Note that I put only one percent sign, at the end of string_to_compare, to tell the DB that I want to compare only the first bytes.
I hope MySQL is smart enough to use the key and avoid any extra work.
Questions:
Is there any difference in which query is better? I personally prefer the second, as it looks clearer and hopefully will be better understood by the DB engine (MyISAM).
Is it correct that MySQL (MyISAM) will execute these queries efficiently?
How do I put '%' in a PDO prepared statement? SELECT * FROM table WHERE description LIKE ":text%"?

Yes, there's a difference. When the WHERE condition calls a function on the column value, indexes can't be used. I don't think it will realize that your SUBSTRING() call happens to match the indexed part of the text and use that. On the other hand, LIKE is specifically coded to recognize the cases where it can use an index. Also, if you want to compare two strings for equality, you should use =, not STRCMP(). Note too that SUBSTRING() positions start at 1 in MySQL, not 0, e.g.
WHERE SUBSTRING(`description`,1,20) = 'string_to_compare'
I believe MySQL will execute the LIKE version efficiently.
The placeholder must not be inside quotes, or it will not be treated as a placeholder. Use CONCAT() to combine it: WHERE description LIKE CONCAT(:text, '%'). Or you can append the % to the PHP variable that you bind to the placeholder, and use WHERE description LIKE :text.
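One detail worth sketching for the "append the % to the bound variable" approach: if the user's text itself contains % or _, those characters act as wildcards unless they are escaped first. The snippet below is only an illustration under stated assumptions: it uses Python's built-in sqlite3 module as a runnable stand-in for PDO and MySQL, and the like_prefix helper and the items table are hypothetical names that do not appear in the question. Backslash is MySQL's default LIKE escape character; SQLite needs the explicit ESCAPE clause shown here.

```python
import sqlite3


def like_prefix(text: str, escape: str = "\\") -> str:
    """Escape LIKE wildcards in `text`, then append % for a prefix match.

    Hypothetical helper: the escape character itself is doubled first,
    then % and _ are escaped so they match literally.
    """
    escaped = (
        text.replace(escape, escape + escape)
        .replace("%", escape + "%")
        .replace("_", escape + "_")
    )
    return escaped + "%"


# In-memory SQLite database standing in for the MySQL table (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (description TEXT)")
conn.executemany(
    "INSERT INTO items VALUES (?)",
    [("100% cotton shirt",), ("100 percent wool",), ("1000 islands",)],
)

# The placeholder is NOT quoted; the whole pattern is bound as one value.
rows = conn.execute(
    "SELECT description FROM items "
    "WHERE description LIKE ? ESCAPE '\\' ORDER BY description",
    (like_prefix("100%"),),
).fetchall()
```

Without the escaping step, the bound pattern 100%% would also match "100 percent wool" and "1000 islands", because the user's % would be read as a wildcard.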

Related

Extracting a value from an Array using mysql

I have a column that has brand names in an array format as below:
I want to extract information associated with Brand4 for example 'price'.
I tried using the query below, but that's a PostgreSQL (psql) query. How can I extract this information using MySQL on GCP?
SELECT Brand_name, price
FROM table_name
Where 'Brand4'=Any(Brand_name)
First, the explanation for your error message is that in MySQL, ANY() accepts a subquery, not just a single column or expression. See https://dev.mysql.com/doc/refman/8.0/en/any-in-some-subqueries.html
MySQL does not have an array type. Your Brand_name column is not an array, it's a string. It happens to contain commas and square brackets, but these are just characters in a string.
So your solutions are to use various string-search functions or expressions, as other folks have suggested.
The downside to all the string-search functions is that they cannot be optimized with a conventional index. So every search will be expensive, because it requires a table-scan.
Another solution I did not see yet is to use a fulltext index.
alter table brands add fulltext index (brand_name);
select * from brands
where match(brand_name) against ('Brand4' in boolean mode);
This may require some special handling if the brand names contain spaces or punctuation, but if they are plain words, it should work.
Read https://dev.mysql.com/doc/refman/8.0/en/fulltext-search.html to understand more about fulltext indexes.
The best solution would be to eliminate this fake "array" column by normalizing the schema to store one brand per row in another table. Then you can match strings exactly and optimize with a conventional index. But I understand you said that the table structure is not up to you.
This should work in MySQL (using a string function, as mentioned here):
SELECT *
FROM brands
WHERE FIND_IN_SET('Brand4',brand_name);
see: DBFIDDLE
The provided SQL query will work in MySQL if you put a subquery within the parentheses, or use FIND_IN_SET instead of ANY.
But, as stated in the MySQL documentation:
This function does not work properly if the first argument contains a comma (,) character.
So, as an alternative, you could use LIKE (simple pattern matching).
Your SQL code would then be:
SELECT `brand_name`, `price`
FROM `test`
WHERE `brand_name` LIKE "%Brand4%"
See SQLFiddle for live example.
Also, you could use LOCATE.
Or any other alternative solution.
But I must say that storing list data the way you do is not the best practice.
There are plenty of ways this can be done better, for example using an M:M (many-to-many) relationship.
If you made this design yourself, you really have to reconsider and redesign it. Databases have their own data structures, and SQL is not an imperative language but a declarative one.
If you didn't design it, you should consider creating a separate table out of that one column. Perhaps this is what you are trying to do.
If you just need to locate a specific string in the values of a field, use LIKE:
SELECT Brand_name, price
FROM table_name
WHERE Brand_name LIKE '%Brand4%'
But be aware that this will not always yield accurate results.
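To make the accuracy caveat concrete, here is a small sketch of the kind of false positive a substring LIKE can produce. It runs on Python's built-in sqlite3 module rather than MySQL, and the sample rows (including the bracketed, comma-separated format of Brand_name) are assumptions made up for illustration, since the original data is not shown.

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL table (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table_name (Brand_name TEXT, price REAL)")
conn.executemany(
    "INSERT INTO table_name VALUES (?, ?)",
    [
        ("[Brand1,Brand4]", 10.0),   # really contains Brand4
        ("[Brand2,Brand40]", 20.0),  # contains 'Brand4' only inside 'Brand40'
        ("[Brand3]", 30.0),          # no match at all
    ],
)

# Substring match: every row with the characters 'Brand4' anywhere in it.
rows = conn.execute(
    "SELECT Brand_name, price FROM table_name "
    "WHERE Brand_name LIKE '%Brand4%' ORDER BY price"
).fetchall()

matched_brands = [name for name, _ in rows]
```

Here the row holding Brand40 is returned alongside the genuine Brand4 row, which is exactly the inaccuracy that storing one brand per row in a separate table would avoid.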

full text search vs index to check exactly same text

I have text so long that it exceeds the byte limit of an index.
The text doesn't contain any spaces. It's a combination of things like the database name and table name.
I will only ever query for an exact match:
select * from table where column='text'
or, if I use full-text search, it might look like this:
select * from table where Match (column) Against('text')
I don't know which of the two options below is better.
I will use text_example = '{dbName}/{tableName}' to explain, but in reality it's a combination of at least 5 types of strings.
Option 1: use full-text search.
Option 2: use an index.
To use an index, I have to split text_example into its 5 parts and then create an index on each part.
create table info (
dbName text,
tableName text,
index dbIdx (dbName),
index tableIdx (tableName)
);
Inserts will be frequent.
In this case, which option is better in terms of storage?
How about performance?
Or is there a good way to speed up the select query with such long text?
FYI, I use MySQL 8.0.
Punctuation, such as /, separates "words" for the purpose of FULLTEXT.
"dbname/tablename" is not very long text -- that is limited to perhaps 129 characters. (So, I don't understand where "over bytes limit of index" is coming from.)
col = 'text' checks the whole column; MATCH..AGAINST checks "words" in the column. That is, they do different things.
Please clarify what the real text is like and what the real query is like. Then we can advise on FULLTEXT versus BTREE.

mysql database field type for search query

I tried searching with different terms and found some answers, but they did not match my requirements, like This Link.
I am using an SQL statement like the one below to fetch matching results from a MySQL table.
SELECT statements... WHERE keyword_title_field REGEXP 'abc|axy|91store';
My question is:
What data type (e.g. VARCHAR, TEXT, etc.) should I choose for keyword_title_field in the MySQL table to fetch results quickly without putting much load on the table/server?
My current data type is TEXT, because the length of the user-supplied input is unknown. Is this best suited, or should I change it?
It's not mandatory, but any reference reading along with the answer would be great for my understanding.
Here are some things to consider:
When you use a field in conditions, it is important to put an INDEX on the field, so that MySQL can find matching rows via the index instead of checking every record one by one. Be aware, though, that an ordinary index only helps with comparisons such as '=', IN or LIKE 'term%'; REGEXP and LIKE '%term%' still have to examine every row. So make sure to look into that -> https://www.tutorialspoint.com/mysql/mysql-indexes.htm
The fewer characters allowed in your field, the smaller the INDEX is. You however have variable lengths to consider, so a TEXT is fine (indexing a TEXT column requires a prefix length). If you know the maximum length and it's less than 256 characters, use a VARCHAR. Just make sure to index the field.
Note that REGEXP is relatively slow. LIKE '%term%' would be preferred, but that of course depends on your needs. If the field only needs to equal 'abc' OR 'axy' OR '91store' exactly, you could consider this query: SELECT statements... WHERE keyword_title_field IN ('abc', 'axy', '91store');

MySQL: Transform "LIKE" search to fulltext?

I have a pretty simple LIKE search for MySQL that I'd like to transform into a fulltext search. The problem is that I need to implement it so that the value starts with X, like the example below:
SELECT column FROM table WHERE column LIKE "startswith%"
As you can see, that query returns all results that begin with "startswith". I need to do this with fulltext.
Is this possible to do?
No, that isn't how fulltext works (it's actually just a list with loose words underneath, no information about location relative to the string) but there's no reason why you can't have that LIKE ... as an extra WHERE clause. FULLTEXT can still help to get a smaller subset of results if you haven't got another key on column. If you do have a key on column, using FULLTEXT for this is useless.
You can set a key on just the start of a column with ADD INDEX (column(123)); (which would only index the first 123 characters). This also works for TEXT/BLOB columns (in the latter case the length you give is in bytes).
I am not sure about MySQL, but in SQL Server you can convert the column into a varchar and perform a LIKE on the result, such as below:
SELECT column FROM table WHERE CONVERT(varchar(255), column) LIKE 'startswith%'
Note that this CONVERT(type, expression) form is SQL Server syntax rather than ANSI standard; in MySQL the equivalent would be CAST(column AS CHAR(255)).

How to search for rows containing a substring?

If I store an HTML TEXTAREA in my ODBC database each time the user submits a form, what's the SELECT statement to retrieve 1) all rows which contain a given sub-string 2) all rows which don't (and is the search case sensitive?)
Edit: if LIKE "%SUBSTRING%" is going to be slow, would it be better to get everything & sort it out in PHP?
Well, you can always try WHERE textcolumn LIKE "%SUBSTRING%" - but this is guaranteed to be pretty slow, as your query can't do an index match because you are looking for characters on the left side.
It depends on the field type - a textarea usually won't be saved as VARCHAR, but rather as (a kind of) TEXT field, so you can use the MATCH AGAINST operator.
To get the columns that don't match, simply put a NOT in front of the like: WHERE textcolumn NOT LIKE "%SUBSTRING%".
Whether the search is case-sensitive or not depends on how you store the data, especially what COLLATION you use. By default, the search will be case-insensitive.
Updated answer to reflect question update:
Doing a WHERE field LIKE "%value%" is slower than WHERE field LIKE "value%" if the column field has an index, but it is still considerably faster than getting all values and having your application filter them. The two scenarios:
1/ If you do SELECT field FROM table WHERE field LIKE "%value%", MySQL will scan the entire table, and only send the fields containing "value".
2/ If you do SELECT field FROM table and then have your application (in your case PHP) filter only the rows with "value" in it, MySQL will also scan the entire table, but send all the fields to PHP, which then has to do additional work. This is much slower than case #1.
Solution: Please do use the WHERE clause, and use EXPLAIN to see the performance.
Here is some info on MySQL's full-text search. In the MySQL version documented there, it is restricted to MyISAM tables (InnoDB supports it from MySQL 5.6 onward), so it may not be suitable if you want to use a different table type.
http://dev.mysql.com/doc/refman/5.0/en/fulltext-search.html
Even if WHERE textcolumn LIKE "%SUBSTRING%" is going to be slow, I think it is probably better to let the Database handle it rather than have PHP handle it. If it is possible to restrict searches by some other criteria (date range, user, etc) then you may find the substring search is OK (ish).
If you are searching for whole words, you could pull out all the individual words into a separate table and use that to restrict the substring search. (So when searching for "my search string", you look up the longest word, "search", and only do the substring search on records containing the word "search".)
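The word-table idea can be sketched roughly as follows. This is a minimal illustration rather than a tuned implementation: it uses Python's built-in sqlite3 module instead of MySQL, the notes and note_words tables and both helper functions are hypothetical names, and words are split on whitespace only, so punctuation handling is ignored.

```python
import sqlite3

# In-memory SQLite database standing in for the real one (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE notes (id INTEGER PRIMARY KEY, body TEXT);
    CREATE TABLE note_words (word TEXT, note_id INTEGER);
    CREATE INDEX idx_note_words ON note_words (word);
    """
)


def add_note(body):
    """Store the text and one row per distinct lower-cased word."""
    note_id = conn.execute(
        "INSERT INTO notes (body) VALUES (?)", (body,)
    ).lastrowid
    words = {w.lower() for w in body.split()}
    conn.executemany(
        "INSERT INTO note_words VALUES (?, ?)",
        [(w, note_id) for w in words],
    )
    return note_id


def search(phrase):
    """Use the indexed word table to narrow the candidates, then substring-match."""
    longest = max(phrase.lower().split(), key=len)
    return conn.execute(
        "SELECT n.id, n.body FROM notes n "
        "JOIN note_words w ON w.note_id = n.id "
        "WHERE w.word = ? AND instr(lower(n.body), lower(?)) > 0 "
        "ORDER BY n.id",
        (longest, phrase),
    ).fetchall()


add_note("this is my search string ok")
add_note("search my string")
add_note("nothing here")
results = search("my search string")
```

Only notes containing the word "search" reach the substring check, so the expensive comparison runs on a small candidate set instead of the whole table. The cost is extra storage and extra writes on every insert.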
I simply use SELECT ColumnName1, ColumnName2, ... WHERE LOCATE(substr, ColumnNameX) <> 0
to get rows where ColumnNameX contains the substring.
Replace <> with = to get rows NOT containing the substring.