Optimise LIKE `%value%` on a join in an exponentially growing MySQL database

I have tried many things, but none of them worked; I hope someone can help me with this query.
Let me show my query first, then the issue:
select log.*,client.client_name
from ( select * from sessions
where ( `report_error_status` like CONCAT('%' ,'consec', '%')
or `ipaddress` like CONCAT('%' ,'consec', '%') or `last_updated` like CONCAT('%' ,'consec', '%') )
ORDER BY `id` DESC LIMIT 10 OFFSET 0 )
log
inner join
(select * from clients
where ( `client_name` like CONCAT('%' ,'consec', '%') ) )
client on log.client_id = client.id
To keep query speed from degrading exponentially, I apply a LIMIT to the sessions table. The query above works perfectly well without the WHERE clause. My problem is this: when a user searches for anything in the front-end datatable, the WHERE clause is attached dynamically in the backend (giving the query above). Now suppose the sessions table does not contain the search value consec, but the clients table does; the final query still returns no rows. Is there any way to apply a conditional WHERE, like the query below?
ifnull((select id from sessions where
(`report_error_status` like CONCAT('%' ,'consec', '%')
or `ipaddress` like CONCAT('%' ,'consec', '%')
or `last_updated` like CONCAT('%' ,'consec', '%'))
),
(select * from sessions ORDER BY `id` DESC LIMIT 10 OFFSET 0) ))
That would resolve my whole problem. Is there any way to achieve it in MySQL?
If the sessions table contains 100,000 rows, the join checks the clients table against all 100k records one by one. Suppose that takes 1 second to execute; if the sessions table grows to 200k rows, the time taken by the inner join increases exponentially again. To avoid this I am using a subquery on sessions with a LIMIT.
Note: report_error_status, ipaddress, client_name, etc. are indexed.

An ordinary (B-tree) index cannot speed up a MySQL SELECT whose LIKE pattern opens with a wildcard. Your pattern is %consec% (a LIKE pattern, not a regex), and you could add an index, but to quote the official MySQL documentation...
The index also can be used for LIKE comparisons if the argument to LIKE is a constant string that does not start with a wildcard character. For example, the following SELECT statements use indexes:
SELECT * FROM tbl_name WHERE key_col LIKE 'Patrick%';
SELECT * FROM tbl_name WHERE key_col LIKE 'Pat%_ck%';
Source: Dev.MySQL.com: Comparison of B-Tree and Hash Indexes; B-Tree Index Characteristics
Your query falls outside of this use case, so indices will not help. Here's another answer suggesting the same.
I am going to suggest Database Normalization...
You're selecting fields that are LIKE %consec%. Why? What is this value? Is it an internal code that means something special for your software and your software alone? After all, look at the names of the fields: report_error_status, ipaddress, last_updated. Except for maybe the error status one, there's no reason "consec" would appear in these, unless it had some internal significance.
For instance, table.field has value of "userconsec", sometimes you want to search for "user", other times "consec".
In that case, you'd want a new table; "tableType", with tableType.tableid pointing to the other table and tableType.Type being the Type value ("user", "consec", etc.), an index on both tableid and Type, and then you can drop from your query WHERE LIKE ... and add instead JOIN ON tableType.tableid = table.id AND tableType.Type = "consec";.
It will be faster because...
It is not looking through all the text of several text fields.
It is looking through an ordered list of integers to identify the record you need.

Related

How to use FIND_IN_SET using list of data

I have used FIND_IN_SET multiple times before but this case is a bit different.
Earlier I was searching a single value in the table like
SELECT * FROM tbl_name where find_in_set('1212121212', sku)
But now I have the list of SKUs which I want to search in the table. E.g
'3698520147','088586004490','868332000057','081308003405','088394000028','089541300893','0732511000148','009191711092','752830528161'
I have two columns in the table: SKU (for example 081308003405) and SKU Variation.
In the SKU column I store a single value, but in the variation column I store values in comma-separated format, such as 081308003405,088394000028,089541300893
SELECT * FROM tbl_name
WHERE 1
AND upc IN ('3698520147','088586004490','868332000057','081308003405','088394000028',
'089541300893','0732511000148','009191711092','752830528161')
I am using IN to search the UPC value; now I want to search the variation column as well. My question is how to search the variation column using the SKU list.
For now, I check each UPC variation in a loop, which takes too much time. Below is the query:
SELECT id FROM products
WHERE 1 AND upcVariation AND FIND_IN_SET('88076164444',upc_variation) > 0
First of all, consider storing the data in a normalized way. Here is a good read: Is storing a delimited list in a database column really that bad?
Now, assuming the following schema and data:
create table products (
id int auto_increment,
upc varchar(50),
upc_variation text,
primary key (id),
index (upc)
);
insert into products (upc, upc_variation) values
('01234', '01234,12345,23456'),
('56789', '45678,34567'),
('056789', '045678,034567');
We want to find products that have the variation '12345' or '34567'. The expected result is the 1st and the 2nd rows.
Normalized schema - many-to-many relation
Instead of storing the values in a comma separated list, create a new table, which maps product IDs with variations:
create table products_upc_variations (
product_id int,
upc_variation varchar(50),
primary key (product_id, upc_variation),
index (upc_variation, product_id)
);
insert into products_upc_variations (product_id, upc_variation) values
(1, '01234'),
(1, '12345'),
(1, '23456'),
(2, '45678'),
(2, '34567'),
(3, '045678'),
(3, '034567');
The select query would be:
select distinct p.*
from products p
join products_upc_variations v on v.product_id = p.id
where v.upc_variation in ('12345', '34567');
As you see - With a normalized schema the problem can be solved with a quite basic query. And we can effectively use indices.
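If existing comma-separated data has to be moved into the mapping table, the split can be done in application code. The following is only a sketch in Python, assuming the sample rows above; the function name explode_variations is made up for illustration, and the resulting pairs would still have to be inserted into products_upc_variations.

```python
# Sketch: expand comma-separated variation lists into
# (product_id, upc_variation) pairs for the mapping table.
# explode_variations is a hypothetical helper, not part of any library.

def explode_variations(products):
    """Return sorted, de-duplicated (product_id, variation) pairs."""
    pairs = set()
    for product_id, variation_list in products:
        for variation in variation_list.split(","):
            variation = variation.strip()
            if variation:  # skip empty items such as a trailing comma
                pairs.add((product_id, variation))
    return sorted(pairs)


# Same sample data as the products table above (id, upc_variation).
products = [
    (1, "01234,12345,23456"),
    (2, "45678,34567"),
    (3, "045678,034567"),
]

mapping_rows = explode_variations(products)
for row in mapping_rows:
    print(row)
```

The values are kept as strings on purpose, so that leading zeros such as '045678' are not lost.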
"Exploiting" a FULLTEXT INDEX
With a FULLTEXT INDEX on (upc_variation) you can use:
select p.*
from products p
where match (upc_variation) against ('12345 34567');
This looks quite "pretty" and is probably efficient. But though it works for this example, I wouldn't feel comfortable with this solution, because I can't say exactly, when it doesn't work.
Using JSON_OVERLAPS()
Since MySQL 8.0.17 you can use JSON_OVERLAPS(). You should either store the values as a JSON array, or convert the list to JSON "on the fly":
select p.*
from products p
where json_overlaps(
'["12345","34567"]',
concat('["', replace(upc_variation, ',', '","'), '"]')
);
No index can be used for this. But neither can one be used for FIND_IN_SET().
Using JSON_TABLE()
Since MySQL 8.0.4 you can use JSON_TABLE() to generate a normalized representation of the data "on the fly". Here again you would either store the data in a JSON array, or convert the list to JSON in the query:
select distinct p.*
from products p
join json_table(
concat('["', replace(p.upc_variation, ',', '","'), '"]'),
'$[*]' columns (upcv text path '$')
) v
where v.upcv in ('12345', '34567');
No index can be used here. And this is probably the slowest solution of all presented in this answer.
RLIKE / REGEXP
You can also use a regular expression:
select p.*
from products p
where p.upc_variation rlike '(^|,)(12345|34567)(,|$)'
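To see why the anchors (^|,) and (,|$) matter, the same pattern can be tried outside the database. This is a small sketch using Python's re module, which handles this particular pattern the same way; the dictionary simply repeats the three sample rows.

```python
import re

# Same pattern as in the RLIKE query: the searched value must be
# preceded by start-of-string or a comma, and followed by a comma
# or end-of-string.
pattern = re.compile(r"(^|,)(12345|34567)(,|$)")

# Sample rows from the products table: id -> upc_variation.
rows = {
    1: "01234,12345,23456",
    2: "45678,34567",
    3: "045678,034567",
}

matching_ids = [pid for pid, variations in rows.items()
                if pattern.search(variations)]
print(matching_ids)
```

Row 3 is not matched, because '034567' only contains '34567' as a substring and is not a complete list item.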
See demo of all queries on dbfiddle.uk
You can try with below example:
SELECT * FROM TABLENAME
WHERE 1 AND ( FIND_IN_SET('3698520147', SKU)
OR UPC IN ('3698520147') )
I have a solution for you, you can consider this solution:
1: Create a temporary table example here: Sql Fiddle
select
tablename.id,
SUBSTRING_INDEX(SUBSTRING_INDEX(tablename.sku_split, ',', numbers.n), ',', -1) sku_variation
from
numbers inner join tablename
on CHAR_LENGTH(tablename.sku_split)
-CHAR_LENGTH(REPLACE(tablename.sku_split, ',', ''))>=numbers.n-1
order by id, n
2: Use the temporary table to filter against your list of SKUs.
Performance considerations. The main thing that matters for performance is whether some index can be used. The complexity of the expression has only a minuscule impact on overall performance.
Step 1 is to learn what can be optimized, and in what way:
Equal: WHERE x = 1 -- can use index
IN/1: WHERE x IN (1) -- Turned into the Equal case by Optimizer
IN/many: WHERE x IN (22,33,44) -- Usually worse than Equal and better than "range"
Easy OR: WHERE (x = 22 OR x = 33) -- Turned into IN if possible
General OR: WHERE (sku = 22 OR upc = 33) -- not sargable (cf UNION)
Easy LIKE: WHERE x LIKE 'abc' -- turned into Equal
Range LIKE: WHERE x LIKE 'abc%' -- equivalent to "range" test
Wild LIKE: WHERE x LIKE '%abc%' -- not sargable
REGEXP: WHERE x RLIKE 'aaa|bbb|ccc' -- not sargable
FIND_IN_SET: WHERE FIND_IN_SET(x, '22,33,44') -- not sargable, even for single item
JSON: -- not sargable
FULLTEXT: WHERE MATCH(x) AGAINST('aaa bbb ccc') -- fast, but not equivalent
NOT: WHERE NOT ((any of the above)) -- usually poor performance
"Sargable" -- able to use index. Phrased differently "Hiding the column in a function call" prevents using an index.
FULLTEXT: There are many restrictions: "word-oriented", min word size, stopwords, etc. But it is very fast when it applies. Note: When used with outer tests, MATCH comes first (if possible), then further filtering will be done without the benefit of indexes, but on a smaller set of rows.
Even when an expression "can" use an index, it "may not". Whether a WHERE clause makes good use of an index is a much longer discussion than can be put here.
Step 2 Learn how to build composite indexes when you have multiple tests (WHERE ... AND ...):
When constructing a composite (multi-column) index, include columns in this order:
'Equal' -- any number of such columns.
'IN/many' column(s)
One range test (BETWEEN, <, etc)
(A couple of side notes.) The Optimizer is smart enough to clean up WHERE 1 AND .... But there are not many things that the Optimizer will handle. In particular, this is not sargable: AND DATE(x) = '2020-02-20', but this does optimize as a "range":
AND x >= '2020-02-20'
AND x < '2020-02-20' + INTERVAL 1 DAY
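When the bounds are computed in application code instead of with INTERVAL, the same half-open range can be built as below. This is a sketch only; day_range is a hypothetical helper name, and the two strings would be passed as parameters for the x >= ... AND x < ... test.

```python
from datetime import date, timedelta


def day_range(day):
    """Return (start, end) strings for a half-open one-day range."""
    start = day
    end = day + timedelta(days=1)  # exclusive upper bound
    return start.isoformat(), end.isoformat()


start, end = day_range(date(2020, 2, 20))
print(start, end)
```

Using an exclusive upper bound avoids having to guess the last representable time of the day.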
Reading
Building indexes: http://mysql.rjweb.org/doc.php/index_cookbook_mysql
Sargable: https://en.wikipedia.org/wiki/Sargable
Tips on Many-to-many: http://mysql.rjweb.org/doc.php/index_cookbook_mysql#many_to_many_mapping_table
This depends on how you use it. In MySQL I found that find_in_set is way faster than using JSON when tested on the following commands, so much faster it wasn't even a competition (to be clear, the speed test did not include the set command line):
Fastest
set @ids = (select group_concat(`ID`) from `table`);
select count(*) from `table` where find_in_set(`ID`, @ids);
10 x slower
set @ids = (select json_arrayagg(`ID`) from `table`);
select count(*) from `table` where `ID` member of( @ids );
34 x slower
set @ids = (select json_arrayagg(`ID`) from `table`);
select count(*) from `table` where JSON_CONTAINS(@ids, convert(`ID`, char));
34 x slower
set @ids = (select json_arrayagg(`ID`) from `table`);
select count(*) from `table` where json_overlaps(@ids, json_array(`ID`));
SELECT * FROM tbl_name t1,(select
group_concat('3698520147',',','088586004490',',','868332000057',',',
'081308003405',',','088394000028',',','089541300893',',','0732511000148',',','009191711092',
',','752830528161') as skuid)t
WHERE FIND_IN_SET(t1.sku,t.skuid)>0

Mysql subquery or something better

I am somewhat new to mysql and I am having an issue on how I should best write the following query. Say I have a table that has a datetime column as well as a few others I want to search on. Since this is just one table, I don't think a join statement would be appropriate here (but I may be wrong since I have not done much in the way of join statements) and I think a subquery is what I need here. So my initial query is to search the table based on a search string the user entered and then I want to limit that on a datetime (start date and end date) also specified by the user in an HTML form.
Table Schema
id, datetime, host, level, message
I want to select any rows that contain $searchstring first so something like ...
SELECT * FROM $table WHERE (level LIKE '%$searchstring%') OR (message LIKE '%$searchstring%') LIMIT $offset,$limit
If I want to limit the above results also by the datetime column, the query would look something like this ...
SELECT * FROM $table WHERE (datetime >='$startdate') AND (datetime < '$enddate')
How can I best merge these queries into one so I can first get any rows that match the search query and then further limit the rows by the start and end datetime?
TIA
You can achieve that by using a single where condition.
In your case:
SELECT * FROM $table WHERE ((level LIKE '%$searchstring%') OR (message LIKE '%$searchstring%')) AND (datetime >='$startdate') AND (datetime < '$enddate') LIMIT $offset,$limit
You don't have to use a JOIN but only add a condition
SELECT *
FROM $table
WHERE (level LIKE '%$searchstring%' OR message LIKE '%$searchstring%')
AND
datetime >='$startdate'
AND datetime < '$enddate'
LIMIT $offset,$limit
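The grouping of the OR conditions inside parentheses is what makes the date range apply to both of them. Below is a runnable sketch of the combined query; it uses Python's built-in sqlite3 module purely as a stand-in for MySQL, and the table name logs, the sample rows and the function search_logs are all invented for illustration. Placeholders are used instead of interpolating the user's input into the SQL string, which is my choice and not something the answers above do.

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE logs (id INTEGER PRIMARY KEY, datetime TEXT, "
    "host TEXT, level TEXT, message TEXT)"
)
conn.executemany(
    "INSERT INTO logs (datetime, host, level, message) VALUES (?, ?, ?, ?)",
    [
        ("2020-01-05 10:00:00", "web1", "error", "disk full"),
        ("2020-02-10 11:00:00", "web1", "info", "error while syncing"),
        ("2020-02-15 12:00:00", "web2", "info", "backup done"),
        ("2020-03-01 09:00:00", "web2", "error", "timeout"),
    ],
)


def search_logs(conn, search, start, end, limit=10, offset=0):
    """Search level/message for a substring within [start, end)."""
    like = f"%{search}%"
    sql = (
        "SELECT id, datetime, level, message FROM logs "
        "WHERE (level LIKE ? OR message LIKE ?) "   # OR group in parentheses
        "AND datetime >= ? AND datetime < ? "       # half-open date range
        "ORDER BY id LIMIT ? OFFSET ?"
    )
    params = (like, like, start, end, limit, offset)
    return conn.execute(sql, params).fetchall()


rows = search_logs(conn, "error", "2020-02-01", "2020-03-01")
for row in rows:
    print(row)
```

Without the parentheses, AND would bind more tightly than OR, and rows matching the level test alone would ignore the date range.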

How to optimize MySQL query to find substring among three fields?

I have the following MySQL table fields:
description1, description2, description3: Varchar(500)
value: int
and wish to find the records where at least one of the descriptions includes the string searched by the user.
Right now I am using the following query. It works, but it takes about 1.5 seconds to return the results.
SELECT `table`.`value`,
`table`.`description1`,
`table`.`description2`,
`table`.`description3`
FROM `table`
WHERE ( `table`.`description1` LIKE '%string%'
OR `table`.`description2` LIKE '%string%'
OR `table`.`description3` LIKE '%string%' )
ORDER BY `table`.`value` DESC LIMIT 0 , 9
Is there any way to get the results faster?
(Note that the value field is already indexed).
Add a FULLTEXT index and use MATCH ... AGAINST instead of LIKE.

I need some help getting MySql to output some results using a subquery

I'm storing a list of numbers inside a table as a varchar(255) and want to use this list in another query's IN() clause.
Here's what I mean:
Table Data:
CREATE TABLE IF NOT EXISTS `session_data` (
`visible_portf_ids` varchar(255) DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
INSERT INTO `session_data` (`visible_portf_ids`) VALUES
('45,44,658,659,661,45,44,658,659,661')
I want to run a query like this to return a list of portfolios ("QUERY #1"):
SELECT portfolio_hierarchy_id, account_id, name, leaf_node_portf_id
FROM portfolio_hierarchy
WHERE account_id = 1
AND leaf_node_portf_id IN
(
(SELECT visible_portf_ids
FROM session_data
WHERE username = 'ronedog')
)
ORDER BY name ASC
The result of the query above returns only 1 row, when there are a total of 3 that should have been returned.
If I run the subquery alone like this:
(SELECT visible_portf_ids
FROM session_data
WHERE username = 'ronedog')
it will return a list like this:
45,44,658,659,661,45,44,658,659,661
But, when I run Query #1 above, only one row of data, which is associated with the "visible_portf_ids" of "45" is returned.
If I replace the subquery with hard coded values like this:
SELECT portfolio_hierarchy_id, account_id, name, leaf_node_portf_id
FROM portfolio_hierarchy
WHERE account_id = 1
AND leaf_node_portf_id IN (45,44,658,659,661,45,44,658,659,661)
ORDER BY name ASC
then I get all 3 rows I'm expecting.
I'm guessing that MySQL is returning the list as a string because it's stored as a varchar() and so it stops processing after the first "visible_portf_ids" is found, which is "45", but I'm not really sure.
Anyone got any ideas how I can fix this?
Thanks in advance.
You should think about restructuring your tables storing each value in a new row, instead of concatenating them.
Until then, you can use the FIND_IN_SET() function:
AND FIND_IN_SET(leaf_node_portf_id,
(SELECT visible_portf_ids
FROM session_data
WHERE username = 'ronedog'
LIMIT 1)
) > 0
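Unfortunately MySQL does not have a function to split a delimited string. Your IN argument is a single string holding the result of your subquery. When the integer column is compared with that string, MySQL converts the string to a number using only its leading digits, so '45,44,658,...' becomes 45 and only that row matches. The reason it works when you hard-code the list is that MySQL parses the values as separate items.
I suggest that you redesign your database to store the visible portfolio IDs as separate rows in a separate table. Then you can retrieve them and use them in subqueries like you tried.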
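Until the table is redesigned, another option is to fetch the stored string first and split it in application code, then build the IN list from the pieces. This is a sketch under assumptions: build_in_clause is a hypothetical helper, and the %s placeholders assume a Python MySQL driver that uses that parameter style.

```python
def build_in_clause(id_list):
    """Split '45,44,658' style text into unique ints plus placeholders."""
    ids = []
    for item in id_list.split(","):
        item = item.strip()
        if item and int(item) not in ids:  # drop blanks and duplicates
            ids.append(int(item))
    placeholders = ", ".join(["%s"] * len(ids))
    return placeholders, ids


# The value stored in session_data.visible_portf_ids in the question.
placeholders, ids = build_in_clause("45,44,658,659,661,45,44,658,659,661")

# Note: an empty list would produce "IN ()", which is not valid SQL,
# so callers should handle that case before running the query.
sql = (
    "SELECT portfolio_hierarchy_id, account_id, name, leaf_node_portf_id "
    "FROM portfolio_hierarchy "
    "WHERE account_id = %s AND leaf_node_portf_id IN (" + placeholders + ") "
    "ORDER BY name ASC"
)
params = [1] + ids
print(sql)
print(params)
```

The query text and the parameter list would then be passed together to the driver's execute call.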

Count occurrences of a word in a row in MySQL

I'm making a search function for my website, which finds relevant results from a database. I'm looking for a way to count occurrences of a word, but I need to ensure that there are word boundaries on both sides of the word ( so I don't end up with "triple" when I want "rip").
Does anyone have any ideas?
People have misunderstood my question:
How can I count the number of such occurrences within a single row?
This is not the sort of thing that relational databases are very good at, unless you can use fulltext indexing, and you have already stated that you cannot, since you're using InnoDB. (InnoDB has supported FULLTEXT indexes since MySQL 5.6, so this restriction applies to older versions only.) I'd suggest selecting your relevant rows and doing the word count in your application code.
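As a rough illustration of counting in application code, here is a minimal Python sketch; count_word is a hypothetical helper and the sample sentence is invented. It relies on the \b word-boundary anchor of Python's re module.

```python
import re


def count_word(text, word):
    """Count whole-word, case-insensitive occurrences of word in text."""
    pattern = re.compile(r"\b" + re.escape(word) + r"\b", re.IGNORECASE)
    return len(pattern.findall(text))


text = "Rip it, then rip again: a triple rip."
print(count_word(text, "rip"))
```

Here "triple" is not counted, because the letters around "rip" inside it are word characters and so there is no boundary.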
You can try this perverted way:
SELECT
(LENGTH(field) - LENGTH(REPLACE(field, 'word', ''))) / LENGTH('word') AS `count`
FROM `table_name` -- placeholder name; the FROM clause was missing
ORDER BY `count` DESC
This query can be very slow
It looks pretty ugly
REPLACE() is case-sensitive
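The arithmetic behind this trick can be checked outside the database. The sketch below mirrors it in Python; count_substring is a hypothetical name, and note that Python's len counts characters whereas MySQL's LENGTH counts bytes.

```python
def count_substring(field, word):
    """Mirror (LENGTH(field) - LENGTH(REPLACE(field, word, ''))) / LENGTH(word)."""
    removed = len(field) - len(field.replace(word, ""))
    return removed // len(word)


# Counts substrings, so the "rip" inside "triple" is included.
print(count_substring("rip triple rip", "rip"))

# Case-sensitive, like REPLACE(): "Rip" is not counted ...
print(count_substring("Rip rip", "rip"))

# ... unless the text is lowercased first.
print(count_substring("Rip rip".lower(), "rip"))
```

So the technique counts substring matches and does not by itself respect word boundaries.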
You can overcome the issue of MySQL's case-sensitive REPLACE() function by using LOWER().
It's sloppy, but on my end this query runs pretty fast.
To speed things along I retrieve the result set in a select which I have declared as a derived table in my 'outer' query. Since MySQL already has the results at this point, the replace method works pretty quickly.
I created a query similar to the one below to search for multiple terms in multiple tables and multiple columns. I obtain a 'relevance' number equivalent to the sum of the count of all occurrences of all found search terms in all columns searched:
SELECT DISTINCT (
((length(x.ent_title) - length(replace(LOWER(x.ent_title),LOWER('there'),''))) / length('there'))
+ ((length(x.ent_content) - length(replace(LOWER(x.ent_content),LOWER('there'),''))) / length('there'))
+ ((length(x.ent_title) - length(replace(LOWER(x.ent_title),LOWER('another'),''))) / length('another'))
+ ((length(x.ent_content) - length(replace(LOWER(x.ent_content),LOWER('another'),''))) / length('another'))
) as relevance,
x.ent_type,
x.ent_id,
x.this_id as anchor,
page.page_name
FROM (
(SELECT
'Foo' as ent_type,
sp.sp_id as ent_id,
sp.page_id as this_id,
sp.title as ent_title,
sp.content as ent_content,
sp.page_id as page_id
FROM sp
WHERE (sp.title LIKE '%there%' OR sp.content LIKE '%there%' OR sp.title LIKE '%another%' OR sp.content LIKE '%another%' ) AND (sp.title NOT LIKE '%goes%' AND sp.content NOT LIKE '%goes%')
) UNION (
[search a different table here.....]
)
) as x
JOIN page ON page.page_id = x.page_id
WHERE page.rstatus = 'ACTIVE'
ORDER BY relevance DESC, ent_title;
Hope this helps someone
-- Seacrest out
Create a stored function like this and use it in your query:
DELIMITER $$
CREATE FUNCTION `getCount`(myStr VARCHAR(1000), myword VARCHAR(100))
RETURNS INT
BEGIN
DECLARE cnt INT DEFAULT 0;
DECLARE result INT DEFAULT 1;
WHILE (result > 0) DO
SET result = INSTR(myStr, myword);
IF(result > 0) THEN
SET cnt = cnt + 1;
SET myStr = SUBSTRING(myStr, result + LENGTH(myword));
END IF;
END WHILE;
RETURN cnt;
END$$
DELIMITER ;
Hope it helps
Something like this should work:
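select count(*) from `table` where fieldname REGEXP '[[:<:]]word[[:>:]]';
The gory details are in the MySQL manual, section 11.4.2. Note that MySQL 8.0.4 and later use the ICU regex library, where the [[:<:]] and [[:>:]] markers are not supported and \\b is used for word boundaries instead.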
Something like LIKE or REGEXP will not scale (unless it's a leftmost prefix match).
Consider instead using a fulltext index for what you want to do.
select count(*) from yourtable where match(title, body) against ('some_word');
I have used the technique as described in the link below. The method uses length and replace functions of MySQL.
Keyword Relevance
If you want a search I would advise something like Sphinx or Lucene, I find Sphinx (as an independent full text indexer) to be a lot easier to set up and run. It runs fast, and generates the indexes very fast. Even if you were using MyISAM I would suggest using it, it has a lot more power than a full text index from MyISAM.
It can also integrate (somewhat) with MySQL.
It depends on what DBMS you are using, some allow writing UDFs that could do this.