MySQL - FIND_IN_SET for comma separated field values

I have a few fields in one table that store comma-separated values. I can write a query using LIKE or FIND_IN_SET to find data in comma-separated values, as described here:
MySQL query finding values in a comma separated string
However, I would like to know how much of a performance impact FIND_IN_SET('red',colors) and LIKE with commas have. Do they use the column's index to produce the result?
If not, how can we optimize the query and fetch data quickly from comma-separated fields using an index?

A basic rule in query performance: Have a suitable index tuned for the query. What is your query? Let's also see SHOW CREATE TABLE.
A basic rule in index usage: Don't hide an indexed column inside a function. Doing so makes the optimizer ignore the index, which means scanning the entire table, which is slow. I am referring to colors hidden in FIND_IN_SET().
A basic rule in building a schema is that "arrays" need to be represented as rows, usually in a separate table. Not in a commalist. Not splayed across columns.
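The three rules above can be sketched in a few lines. This is only an illustration under stated assumptions: the tables items and item_colors, the index name, and the sample rows are hypothetical, and Python's built-in sqlite3 module stands in for MySQL so that the snippet runs without a server. SQLite's planner is not MySQL's, and the exact plan text varies by version, but the pattern is the same: an equality test on an indexed one-value-per-row column can use the index, while a substring search on the comma list has to scan.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Comma-list design, as described in the question (hypothetical data).
    CREATE TABLE items (id INTEGER PRIMARY KEY, colors TEXT);
    INSERT INTO items VALUES (1, 'red,green'), (2, 'blue'), (3, 'darkred');

    -- Normalized design: one row per (item, color) pair, with an index.
    CREATE TABLE item_colors (item_id INTEGER NOT NULL, color TEXT NOT NULL);
    CREATE INDEX idx_item_colors_color ON item_colors (color);
    INSERT INTO item_colors VALUES
        (1, 'red'), (1, 'green'), (2, 'blue'), (3, 'darkred');
""")

def plan(sql):
    """Return the plan text (last column of each EXPLAIN QUERY PLAN row)."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql)
    return " | ".join(row[-1] for row in rows)

# Substring search on the comma list: no index can help, so it scans.
csv_plan = plan("SELECT id FROM items WHERE colors LIKE '%red%'")

# Exact match on the normalized column: the index is usable.
normalized_plan = plan("SELECT item_id FROM item_colors WHERE color = 'red'")

red_items = [r[0] for r in conn.execute(
    "SELECT item_id FROM item_colors WHERE color = 'red' ORDER BY item_id")]

print(csv_plan)
print(normalized_plan)
print(red_items)
```

In MySQL the equivalent check would be running EXPLAIN on both queries. Note that the normalized lookup also avoids matching 'darkred' when searching for 'red'.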

Related

Extracting a value from an Array using mysql

I have a column that has brand names in an array format as below:
I want to extract information associated with Brand4 for example 'price'.
I tried using the query below, but that's a psql query. How can I extract this information using MySQL in GCP?
SELECT Brand_name, price
FROM table_name
Where 'Brand4'=Any(Brand_name)
First, the explanation for your error message is that in MySQL, ANY() accepts a subquery, not just a single column or expression. See https://dev.mysql.com/doc/refman/8.0/en/any-in-some-subqueries.html
MySQL does not have an array type. Your Brand_name column is not an array, it's a string. It happens to contain commas and square brackets, but these are just characters in a string.
So your solutions are to use various string-search functions or expressions, as other folks have suggested.
The downside to all the string-search functions is that they cannot be optimized with a conventional index. So every search will be expensive, because it requires a table-scan.
Another solution I did not see yet is to use a fulltext index.
alter table brands add fulltext index (brand_name);
select * from brands
where match(brand_name) against ('Brand4' in boolean mode);
This may require some special handling if the brand names contain spaces or punctuation, but if they are plain words, it should work.
Read https://dev.mysql.com/doc/refman/8.0/en/fulltext-search.html to understand more about fulltext indexes.
The best solution would be to eliminate this fake "array" column by normalizing the schema to store one brand per row in another table. Then you can match strings exactly and optimize with a conventional index. But I understand you said that the table structure is not up to you.
This should work in MySQL (using a string function as mentioned here):
SELECT *
FROM brands
WHERE FIND_IN_SET('Brand4',brand_name);
see: DBFIDDLE
The SQL query you provided will work in MySQL if you put a subquery inside the parentheses, or if you use FIND_IN_SET instead of ANY.
But, as stated in the MySQL documentation:
This function does not work properly if the first argument contains a comma (,) character.
So, as an alternative, you could use LIKE (simple pattern matching).
Your SQL code then would be:
SELECT `brand_name`, `price`
FROM `test`
WHERE `brand_name` LIKE "%Brand4%"
See SQLFiddle for live example.
Also, you could use LOCATE.
Or any other alternative solution.
But I must say that storing list data the way you do is not best practice.
There are plenty of ways this can be done better.
For example, using an M:M (many-to-many) relationship.
If you made this design, you really have to reconsider and redesign it. Databases have their own data structures, and SQL is not an imperative language but a declarative one.
If you didn't design it, you should consider creating a separate table out of that one column. Perhaps this is what you are trying to do.
If it is just a matter of locating a specific string in the values of a field, use LIKE:
SELECT Brand_name, price
FROM table_name
WHERE Brand_name LIKE '%Brand4%'
But realize that this will not always yield accurate results.
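One way the LIKE approach can go wrong is that a substring pattern also matches longer names that merely contain the search term. The sketch below assumes hypothetical sample rows in plain comma-separated form and uses Python's built-in sqlite3 module in place of MySQL. It compares the loose pattern with a stricter one that wraps the column in delimiters. SQLite concatenates with ||; in MySQL the same expression would normally be written CONCAT(',', Brand_name, ',').

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical rows; the real column contents may differ.
    CREATE TABLE table_name (id INTEGER PRIMARY KEY, Brand_name TEXT, price REAL);
    INSERT INTO table_name VALUES
        (1, 'Brand1,Brand4', 10.0),
        (2, 'Brand40,Brand7', 20.0),
        (3, 'Brand2', 30.0);
""")

# Plain substring match: 'Brand40' also contains 'Brand4'.
loose = [r[0] for r in conn.execute(
    "SELECT id FROM table_name "
    "WHERE Brand_name LIKE '%Brand4%' ORDER BY id")]

# Wrap the list in commas so that only whole items can match.
strict = [r[0] for r in conn.execute(
    "SELECT id FROM table_name "
    "WHERE ',' || Brand_name || ',' LIKE '%,Brand4,%' ORDER BY id")]

print(loose)   # [1, 2]
print(strict)  # [1]
```

Both forms still require a full scan; the stricter pattern only fixes the false positive, not the performance.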

Why mysql fulltext search not run as expected?

I have a table with 700,000 rows. One column, called 'data', is of type TEXT. I added a fulltext index on this column to improve my query speed.
Here are two queries; the second does not behave as expected.
You can see that the first query returns one result with the keywords I specified.
It took 2 seconds.
I thought the second query should run faster since I gave more filter conditions, but it took about one minute.
Giving more conditions should narrow down the data set to search, so why is it slower?
The MySQL version is 8.0.16 and the engine is InnoDB. Sorry about the mosaic.
Giving more conditions should narrow down the data set to search, so why is it slower?
FULLTEXT will search for each required string, getting a list of row numbers (or something equivalent).
For multiple required strings, it will get multiple lists and "AND" them together.
Furthermore, when there are two strings ("26228" and "31500733") in quotes together, their adjacency needs to be verified. This may be the slow part.
Consider this instead:
MATCH AGAINST(+uf8... +26228 +31500733 IN BOOLEAN MODE)
That won't test the adjacency, but that might not matter to the end results. (Note also that I skipped the too-short "i".)

MySQL - Select substring from a column, without catching similar substrings from same column

In a MySQL table I have a VARCHAR column called ShareID.
If the ShareID value for Row #1 contains a string in the form of 1
and the ShareID value for Row #2 contains a string in the form of 10, 1
and the ShareID value for Row #3 contains a string in the form of 111, 12.
I would like to grab all the rows where the ShareID is 1. i.e. ONLY the first and second rows here.
I have tried using the LIKE command, like so:
SELECT * FROM tablename WHERE ShareWithID LIKE '1%';
but this will catch ALL the rows whose value starts with the number 1, including Row #3, which is not what I want.
I would like to run a command that would ONLY return rows #1 and #2 above because they have a ShareID of 1 contained within it.
I've tried a variety of commands, (including REGEXP, and IN) and managed a 'frig' solution where I'd place a comma after EVERY number in the ShareID column, including the last one (i.e. 10, 1,), and then execute this command:
SELECT * FROM tablename WHERE ShareWithID LIKE '%1,%';
But I would rather use a proper solution over a frigged solution.
Any guidance would be most welcome.
You should not be storing lists of numbers in a comma-delimited string. This is a really bad idea:
Numbers should be stored as numbers, not strings.
Your numbers appear to be ids. Ids should have explicit foreign keys defined.
SQL -- in general -- has lousy string handling functions.
SQL cannot optimize the queries with string operations.
SQL has a great way of storing lists. It is called a table.
Sometimes, though, we are stuck with other people's really, really, really, really bad decisions on designing databases. MySQL has a convenient function for this situation:
where find_in_set(1, ShareWithID) > 0
If you have spaces in the string, you need to remove them:
where find_in_set(1, replace(ShareWithID, ' ', '')) > 0
...the built-in feature is there to be used
FIND_IN_SET() is actually not intended to be used for strings containing comma-separated lists. It's intended to be used with MySQL's SET data type. Hence the name FIND_IN_SET(), not FIND_IN_COMMA_SEPARATED_LIST().
It saves having to waste time building a 250,000 row 'table' (was it??) to look after a few columns of IDs, when one column in the original 'table' could do the job just as well.
250k rows is not a problem for MySQL. I manage databases with billions of rows in a given table. If you do basic query optimization with indexes, most queries on a table of 250k rows are just fine.
Whereas using a comma-separated list, you spoil any chance of optimizing queries. An index does not help searching for substrings that may not be the leftmost prefix of the string, and searching for a number in a comma-separated list is basically searching for a substring.
You're making your queries impossible to optimize by using a comma-separated list. Every query using FIND_IN_SET() will be a table-scan, which will get slower in a linear relationship to the number of rows in your table.
There are other disadvantages to using a comma-separated list besides indexing, which I wrote about in my answer to this old post: Is storing a delimited list in a database column really that bad?
I would rather use a proper solution over a frigged solution.
Then store one id per row. In a relational database, that's the proper solution.
The solution to this problem is to use Gordon Linoff's suggestion of the FIND_IN_SET command in conjunction with the correct configuration of the table column in question, like this:
SELECT * FROM tablename WHERE FIND_IN_SET('1', ShareWithID);
However, because FIND_IN_SET finds the position of a string within a comma-separated list of strings, you MUST ensure that the items in the column are separated by commas and that there are NO spaces after the commas.
So this column content, used with the above command, will return 0 rows: 111, 1
While this column content will return 1 row: 111,1
As will this one: 33,1
And this one: 44,1,415
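The effect of the stray space can be reproduced outside the database. The function below is a rough Python emulation of FIND_IN_SET for non-NULL string arguments, written only for illustration: it ignores collation and NULL handling, and it is not MySQL's implementation. It shows that items are compared whole, so ' 1' (with a leading space) is not the same item as '1'.

```python
def find_in_set(needle, strlist):
    """Rough emulation of MySQL FIND_IN_SET for non-NULL strings.

    Returns the 1-based position of needle among the comma-separated
    items of strlist, or 0 if it is absent. Items are compared exactly,
    so a leading space is part of the item.
    """
    if strlist == "":
        return 0
    items = strlist.split(",")
    return items.index(needle) + 1 if needle in items else 0

rows = ["111, 1", "111,1", "33,1", "44,1,415"]
positions = {row: find_in_set("1", row) for row in rows}

# Stripping spaces first mirrors REPLACE(ShareWithID, ' ', '').
cleaned = find_in_set("1", "111, 1".replace(" ", ""))

print(positions)  # {'111, 1': 0, '111,1': 2, '33,1': 2, '44,1,415': 2}
print(cleaned)    # 2
```

A result of 0 is what makes the WHERE clause reject the '111, 1' row, which is why removing the spaces (in the data or in the query) restores the match.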

Comma separated value & wildcards in mysql

I have a value in my database with comma separated data eg.
11,223,343,123
I want to get the data, if it match a certain number (in this example it's number 223).
WHERE wp_postmeta.meta_value IN
('223', '223,%', '%,223,%', '%,223')
I thought I could use wildcard for it, but with no luck. Any ideas of how to do this? Maybe it's better to do this using PHP?
Storing stuff in a comma separated list usually is a bad idea, but if you must, use the FIND_IN_SET(str,strlist) function.
WHERE FIND_IN_SET('223',wp_postmeta.meta_value)
If you can change your database and normalise it, you would get faster results. Create an extra table that links meta_values to your primary_id in your table.
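A one-off migration along those lines might look like the sketch below. It is an assumption-laden illustration, not WordPress code: wp_postmeta is reduced to two columns, the link table post_meta_items and its index are hypothetical names, the sample rows are made up, and Python's built-in sqlite3 module stands in for MySQL. The lists are split in application code and inserted one value per row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Simplified stand-in for the table holding the comma list.
    CREATE TABLE wp_postmeta (post_id INTEGER, meta_value TEXT);
    INSERT INTO wp_postmeta VALUES
        (1, '11,223,343,123'), (2, '2230,12'), (3, '223');

    -- Hypothetical link table: one row per (post, value) pair.
    CREATE TABLE post_meta_items (
        post_id INTEGER NOT NULL,
        item    INTEGER NOT NULL
    );
    CREATE INDEX idx_post_meta_items_item ON post_meta_items (item);
""")

# Read every list, split it on commas, and keep the non-empty pieces.
source_rows = conn.execute(
    "SELECT post_id, meta_value FROM wp_postmeta").fetchall()
pairs = [
    (post_id, int(piece))
    for post_id, meta_value in source_rows
    for piece in meta_value.split(",")
    if piece.strip()
]
conn.executemany("INSERT INTO post_meta_items VALUES (?, ?)", pairs)

# An exact, indexable comparison replaces the wildcard search.
posts_with_223 = [r[0] for r in conn.execute(
    "SELECT post_id FROM post_meta_items WHERE item = 223 ORDER BY post_id")]

print(posts_with_223)  # [1, 3]
```

Post 2 holds 2230, which a careless substring search for 223 would have matched; the exact comparison does not.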
The wp_postmeta table is designed to hold loads of values, and for that simple reason (and because of database normalization) you should never store comma-separated lists as values in databases.
If you absolutely must use it this way, there are some MySQL functions for it, one being FIND_IN_SET.

MySQL Best way to select using list of ID's

Through a webservice my application receives a list of identifiers.
With this list, I have to look up a field, one per identifier.
If the field does not exist, the value should be null (it has to be shown).
I am wondering what would be the best approach. First I thought it was best to create a temporary table holding the id's and then joining it to the table holding the data, but if I'm correct this takes at least 1 query per identifier to insert it in the temporary table.
In that case, it seems that I could just as well iterate through the list of identifiers in my application and query the database 1 by 1. Is this correct?
Which approach can you advise?
greetings,
coen
Use the SELECT WHERE IN() Syntax to get a result set with the data you need, then iterate over it in your code. This way you only query the DB once and only get the information you need.
Showing nulls is the tricky part: you need to join the table to itself, so there are two index lookups per record. Just doing a 1-to-1 query for each identifier will only require one index lookup.
In practice, it won't be twice as slow, since the identifier will be in the key cache by the time the second lookup is executed.
Another option is to render your output using the input identifiers, and use an "IN" like previously suggested. The null records won't be in the query output, but that would be ok since you know what was requested.
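That last option can be sketched as follows. The table name data, the column name field, the helper lookup, and the sample rows are all hypothetical, and Python's built-in sqlite3 module is used in place of a MySQL driver; with a MySQL driver the placeholder style would likely differ. A single IN query fetches the rows that exist, and the output is rendered from the input list so that missing identifiers show up as None.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical table; ids 1 and 3 exist, id 2 does not.
    CREATE TABLE data (id INTEGER PRIMARY KEY, field TEXT);
    INSERT INTO data VALUES (1, 'alpha'), (3, 'gamma');
""")

def lookup(conn, ids):
    """Fetch `field` for every id in one query; missing ids map to None."""
    if not ids:
        return []
    # One bound parameter per id keeps the values out of the SQL text.
    placeholders = ", ".join("?" for _ in ids)
    found = dict(conn.execute(
        f"SELECT id, field FROM data WHERE id IN ({placeholders})", ids))
    # Render from the input list so that absent ids still appear.
    return [(i, found.get(i)) for i in ids]

result = lookup(conn, [3, 2, 1])
print(result)  # [(3, 'gamma'), (2, None), (1, 'alpha')]
```

This keeps the round trips to one per request and preserves the order in which the identifiers were received.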