There are a lot of topics on sorting (like: Order Results By Occurrence) but these are all for one value.
I have a search field that people use with keywords; simply said the queries generated look like:
1 word:
SELECT *
FROM `meta`
WHERE (`keywords` LIKE '%bike%')
2 words:
SELECT *
FROM `meta`
WHERE (
`keywords` LIKE '%bike%'
OR `keywords` LIKE '%yellow%'
)
etc...
What I would like to do is sort the results by the number of keywords matched. How would I do this for an unknown number of keyword LIKE conditions?
Here is the general way to sort by the number of keyword matches in MySQL (using like):
SELECT *
FROM `meta`
ORDER BY ((`keywords` LIKE '%bike%') +
(`keywords` LIKE '%yellow%') +
. . .
) desc;
If you want to handle a flexible number of keywords, then you should use an appropriate relational data structure. Storing keywords in a single field (probably comma-separated) is not the best approach. You should have a separate table with one row per keyword.
EDIT:
To add in the number of keywords found, the expression can be put in the select statement:
SELECT m.*,
((`keywords` LIKE '%bike%') +
(`keywords` LIKE '%yellow%') +
. . .
) as NumKeywordsFound
FROM `meta` m
ORDER BY NumKeywordsFound desc;
You can also add a having clause to specify that at least one is found:
SELECT m.*,
((`keywords` LIKE '%bike%') +
(`keywords` LIKE '%yellow%') +
. . .
) as NumKeywordsFound
FROM `meta` m
HAVING NumKeywordsFound > 0
ORDER BY NumKeywordsFound desc;
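For an unknown number of keywords, the sum of LIKE terms is usually generated by the application. The following is a minimal sketch in Python, assuming the `meta` table from the question. The helper name build_ranked_query is hypothetical, and SQLite is used only as a stand-in so the example runs without a server. MySQL drivers such as PyMySQL use %s rather than ? as the placeholder, which is why it is a parameter here.

```python
import sqlite3


def build_ranked_query(keywords, placeholder="?"):
    """Return (sql, params) ranking rows of meta by number of matching keywords.

    Hypothetical helper: one "(keywords LIKE ?)" term is emitted per keyword.
    Each term evaluates to 1 or 0, so their sum is the match count.
    """
    if not keywords:
        raise ValueError("at least one keyword is required")
    terms = " + ".join(f"(keywords LIKE {placeholder})" for _ in keywords)
    sql = (
        "SELECT * FROM ("
        f" SELECT m.*, ({terms}) AS NumKeywordsFound FROM meta m"
        ") ranked"
        " WHERE NumKeywordsFound > 0"
        " ORDER BY NumKeywordsFound DESC"
    )
    # Values are passed as parameters, never interpolated into the SQL text.
    # Caveat: % and _ inside a keyword are still treated as LIKE wildcards.
    params = [f"%{kw}%" for kw in keywords]
    return sql, params


# Demo on an in-memory SQLite database (stand-in for MySQL, made-up rows).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE meta (id INTEGER PRIMARY KEY, keywords TEXT)")
conn.executemany(
    "INSERT INTO meta (id, keywords) VALUES (?, ?)",
    [(1, "red bike"), (2, "yellow bike"), (3, "green car")],
)
sql, params = build_ranked_query(["bike", "yellow"])
ranked = conn.execute(sql, params).fetchall()
```

The derived table is used so that the filter on NumKeywordsFound can sit in an ordinary WHERE clause, which behaves the same in both databases.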
If you want to find the number of times a keyword occurs in each row:
select m.*,
length(replace(keywords, 'bike', 'bike1')) - length(keywords) as NumBikes,
length(replace(keywords, 'yellow', 'yellow1')) - length(keywords) as NumYellows
FROM `meta` m
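The REPLACE trick works because each replacement lengthens the string by exactly one character, so the length difference equals the number of occurrences. Here is a quick check, run against SQLite as a stand-in (its replace() and length() behave the same way for this purpose). The sample row is made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE meta (keywords TEXT)")
conn.execute("INSERT INTO meta VALUES ('bike, yellow bike, yellow, yellow')")

# Each occurrence of the keyword grows the string by one character,
# so (new length - old length) is the occurrence count.
num_bikes, num_yellows = conn.execute(
    """
    SELECT length(replace(keywords, 'bike', 'bike1')) - length(keywords),
           length(replace(keywords, 'yellow', 'yellow1')) - length(keywords)
    FROM meta
    """
).fetchone()
```

Note that substrings are counted as well, so a value such as 'bikes' would add one to the bike count.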
I am quite the novice at MySQL and would appreciate any pointers. The goal is to automate a categorical field using GROUP_CONCAT, and then summarize certain patterns in the GROUP_CONCAT field in a new column. Furthermore, is it possible to add the new column to the original table in one query? Below is what I've tried; it fails with an unknown column "Codes" error, in case that helps:
SELECT
`ID`,
`Code`,
GROUP_CONCAT(DISTINCT `Code` ORDER BY `Code` ASC SEPARATOR ", ") AS `Codes`,
IF(`Codes` LIKE '123%', 'Description1',
IF(`Codes` = '123, R321', 'Description2',
"Logic Needed"))
FROM Table1
GROUP BY `ID`
Instead of nested IF statements, I would like to use a CASE statement. The reason is that I already have around 1000 lines of logic written as "If [column] = "?" Then "?" else if" etc., and I feel that CASE would make for an easier transition. Maybe something like:
SELECT
`ID`,
`Code`,
GROUP_CONCAT(DISTINCT `Code` ORDER BY `Code` ASC SEPARATOR ", ") AS `Codes`,
CASE
WHEN `Codes` LIKE '123%' THEN 'Description1'
WHEN `Codes` = '123, R321' THEN 'Description2'
ELSE "Logic Needed"
END
FROM Table1
GROUP BY `ID`
Table Example:
ID,Code
1,R321
1,123
2,1234
3,1231
4,123
4,R321
Completed Table:
ID,Codes,New_Column
1,"123, R321",Description2
2,1234,Description1
3,1231,Description1
4,"123, R321",Description2
How then can I add back the summarized data to the original table?
Final Table:
ID,Code,New_Column
1,R321,Description2
1,123,Description2
2,1234,Description1
3,1231,Description1
4,123,Description2
4,R321,Description2
Thanks.
You can't refer to a column alias in the same SELECT list where it is defined. You need to do the GROUP_CONCAT() in a subquery, then the main query can refer to Codes to summarize it.
It also doesn't make sense to select Code, since there isn't a single Code value in the group.
SELECT ID, Codes,
CASE
WHEN `Codes` = '123, R321' THEN 'Description2'
WHEN `Codes` LIKE '123%' THEN 'Description1'
ELSE "Logic Needed"
END AS New_Column
FROM (
SELECT
`ID`,
GROUP_CONCAT(DISTINCT `Code` ORDER BY `Code` ASC SEPARATOR ", ") AS `Codes`
FROM Table1
GROUP BY ID
) AS x
As mentioned in a comment, the WHEN clauses are tested in order, so you need to put the more specific cases first. You might want to use FIND_IN_SET() rather than LIKE, since '123%' will match '1234', not just '123, something'.
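To check the expected output, including the step of attaching the summary back to every original row, the same logic can be sketched in plain Python. This is only an illustration of the grouping, the ordered CASE tests, and the join on ID. In SQL the last step corresponds to joining Table1 to the derived table on ID. The function names are hypothetical.

```python
from collections import defaultdict

# Sample rows (ID, Code) from the question.
rows = [(1, "R321"), (1, "123"), (2, "1234"), (3, "1231"), (4, "123"), (4, "R321")]


def summarize(rows):
    """Mimic GROUP_CONCAT(DISTINCT Code ORDER BY Code SEPARATOR ', ') per ID."""
    grouped = defaultdict(set)
    for id_, code in rows:
        grouped[id_].add(code)
    return {id_: ", ".join(sorted(codes)) for id_, codes in grouped.items()}


def describe(codes):
    """Mimic the CASE expression: the more specific test comes first."""
    if codes == "123, R321":
        return "Description2"
    if codes.startswith("123"):  # same as LIKE '123%'
        return "Description1"
    return "Logic Needed"


codes_by_id = summarize(rows)
# "Join" the per-ID summary back to every original row.
final_table = [(id_, code, describe(codes_by_id[id_])) for id_, code in rows]
```

With the sample data, final_table reproduces the "Final Table" from the question row for row.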
Working on an export from a Sparx EA database in MySQL.
The database contains objects that have notes
select o.Note from t_object o
The result could be
Note
Contains reference to term1 and term2
Another note that mentions term1 only
A note that doesn't mention any terms
There is also a glossary that I can query like this
select g.TERM
from t_glossary g
union
select o.Name
from t_diagram d
join t_diagramobjects dgo
on dgo.Diagram_ID = d.Diagram_ID
join t_object o
on o.Object_ID = dgo.Object_ID
where 1=1
and d.styleEx like '%MDGDgm=Glossary Item Lists::GlossaryItemList;%'
The result of this query
TERM
term1
term2
The requirement is that I underline each word in the notes of the first query that is an exact match to one of the terms in the second query. Underlining can be done by enclosing the word in <u> </u> tags
So the final query result should be
Note
Contains reference to <u>term1</u> and <u>term2</u>
Another note that mentions <u>term1</u> only
A note that doesn't mention any terms
Is there any way to do this in a select query? (so without variables, temp tables, loops, and all that stuff)
I think regular expressions might be a better approach. For your example, you want:
select regexp_replace(note, '(term1|term2)', '<u>$1</u>')
from t_object;
You can easily construct this in MySQL as:
select regexp_replace(note, pattern, '<u>$1</u>')
from t_object cross join
(select concat('(', group_concat(term separator '|'), ')') as pattern
from t_glossary
) g;
Regular expressions have a key advantage that they give you more flexibility on the word boundaries. The above replaces any occurrence of the terms, regardless of surrounding characters. But you can adjust that using the power of regular expressions.
I might also suggest that such a replacement could be done, using regular expressions, at the application layer.
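As an illustration of the application-layer variant, here is a minimal sketch in Python. The function name underline_terms is hypothetical, and the terms are assumed to have been fetched from the glossary query already. It uses \b word boundaries so that only whole words are wrapped.

```python
import re


def underline_terms(note, terms):
    """Wrap every whole-word occurrence of a glossary term in <u> tags."""
    if not terms:
        return note
    # Escape the terms so they are matched literally, and try longer
    # terms first so a long term is preferred over one of its prefixes.
    ordered = sorted(terms, key=len, reverse=True)
    alternation = "|".join(re.escape(term) for term in ordered)
    pattern = re.compile(rf"\b({alternation})\b")
    return pattern.sub(r"<u>\1</u>", note)


notes = [
    "Contains reference to term1 and term2",
    "Another note that mentions term1 only",
    "A note that doesn't mention any terms",
]
underlined = [underline_terms(note, ["term1", "term2"]) for note in notes]
```

The \b boundary assumes that terms start and end with a letter, digit or underscore. Terms that begin or end with punctuation would need a different boundary test.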
Here I have replaced every TERM from the t_glossary table that occurs in the note column of the t_object table with <u>Term</u>.
Schema:
create table t_object(note varchar(500));
insert into t_object
select 'Contains reference to term1 and term2' as Note
union all
select 'Another note that mentions term1 only'
union all
select 'A note that doesn''t mention any terms';
create table t_glossary (TERM varchar(500));
insert into t_glossary
select 'term1 '
union all
select 'term2';
Query:
WITH recursive CTE (note, note2, level) AS
(
SELECT note, note , 0 level
FROM t_object
UNION ALL
SELECT CTE.note,
REPLACE(CTE.note2, g.TERM, concat(' <u>', g.term , '</u> ')), CTE.level + 1
FROM CTE
INNER JOIN t_glossary g ON CTE.note2 LIKE concat('%' , g.TERM , '%') and CTE.note2 not like concat('%<u>', g.term , '</u>%')
)
SELECT DISTINCT note2, note, level
FROM CTE
WHERE level =
(SELECT MAX(level) FROM CTE c WHERE CTE.note = c.note)
Output:
note2 | note | level
A note that doesn't mention any terms | A note that doesn't mention any terms | 0
Another note that mentions <u>term1 </u> only | Another note that mentions term1 only | 1
Contains reference to <u>term1 </u> and <u>term2</u> | Contains reference to term1 and term2 | 2
I have selected my data with;
SELECT * FROM item_temp WHERE name LIKE '%starter%' AND Inventory LIKE '600';
I want to duplicate the selected rows (not overwrite them), multiplying the "entry" of every item in the result by 10.
As an example, the "entry" of one item is: 51327.
I want to make a copy of the item with an entry of 513270.
I have tried a few different methods but they've all resulted in errors and I feel like I'm at a brick wall.
Thanks in advance.
Something like this:
select (it.entry * 10 + n) as entry, . . . -- the rest of the columns go here
from (select 0 as n union all select 1 union all . . . select 9) n cross join
item_temp it
where it.name LIKE '%starter%' AND it.Inventory LIKE '600' ;
Use the INSERT INTO syntax
INSERT INTO table_name
<your query with same column order as table_name>;
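Another option is to create the destination table from scratch with a SELECT ... INTO statement (this is SQL Server syntax; in MySQL the equivalent is CREATE TABLE new_table AS SELECT ...):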
SELECT *
into new_table
FROM item_temp
WHERE name LIKE '%starter%'
AND Inventory LIKE '600';
Use INSERT INTO with a SELECT that does the multiplication you need. You will have to list all columns of the table you are inserting into.
INSERT INTO item_temp (
entry
-- , other columns
)
SELECT
T.entry * 10 AS entry
-- , other columns
FROM
item_temp T
WHERE
name LIKE '%starter%' AND
Inventory LIKE '600';
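The INSERT ... SELECT pattern can be tried out end to end with a small script. This is a sketch run against an in-memory SQLite database as a stand-in for MySQL. The three-column layout of item_temp and the sample rows are assumptions for illustration, since the real table has more columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed minimal layout; the real item_temp has more columns.
conn.execute(
    "CREATE TABLE item_temp (entry INTEGER PRIMARY KEY, name TEXT, Inventory INTEGER)"
)
conn.executemany(
    "INSERT INTO item_temp (entry, name, Inventory) VALUES (?, ?, ?)",
    [(51327, "starter sword", 600), (51328, "starter shield", 600), (70000, "epic sword", 600)],
)

# Copy the matching rows, giving each copy an entry ten times the original.
# Equality is used in place of LIKE '600'; without wildcards both select
# the same rows.
conn.execute(
    """
    INSERT INTO item_temp (entry, name, Inventory)
    SELECT entry * 10, name, Inventory
    FROM item_temp
    WHERE name LIKE '%starter%' AND Inventory = 600
    """
)
entries = [row[0] for row in conn.execute("SELECT entry FROM item_temp ORDER BY entry")]
```

The original rows are left untouched, and the insert fails with a key violation if an entry such as 513270 already exists, which is usually the desired safety net.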
I have these tables:
business table :
bussId | nameEn | nameHe | nameAr | status | favor | cityId | categoryId
category table :
categoryId | keywords
favorite table :
userId | bussId
rating table :
userId | bussId | rating
I am running this query, which filters businesses by cityId and a search term (matched against business.nameEn, business.nameAr, business.nameHe and categories.keywords), then orders by favor, status and nameEn.
SELECT DISTINCT bussID ,businessName, bussStatus,favor, ratingCount , ratingSum
FROM
(
SELECT DISTINCT business.bussID , business.nameEn as businessName , bussStatus,favor,
(SELECT COUNT(rating.bussId) FROM `rating` WHERE rating.bussId = business.bussID) as ratingCount ,
(SELECT SUM(rating.rating) FROM `rating` WHERE rating.bussId = business.bussID) as ratingSum
FROM business LEFT JOIN favourites ON (favourites.bussID = business.bussID AND favourites.userID = '30000')
INNER JOIN `categories` ON (`categories`.`categoryId` = `business`.`subCategoryId` )
WHERE (bussiness.cityID = 11)
AND (
( REPLACE( REPLACE(REPLACE(LOWER(`bussiness`.`nameEn`),'أ','ا'),'أ','ا') ,'ة','ه') LIKE '%test%' )
OR( REPLACE( REPLACE(REPLACE(LOWER(`bussiness`.`nameHe`),'أ','ا'),'أ','ا') ,'ة','ه') LIKE '%test%' )
OR( REPLACE( REPLACE(REPLACE(LOWER(`bussiness`.`nameAr`),'أ','ا'),'أ','ا') ,'ة','ه') LIKE '%test%' )
OR( REPLACE( REPLACE(REPLACE(LOWER(`categories2`.`keyWords`),'أ','ا'),'أ','ا') ,'ة','ه') LIKE '%test%' )
)
AND
(bussiness.bussStatus IN(1,3,5,7)
)
GROUP BY bussiness.bussID )results
ORDER BY
businessName LIKE '%test%' DESC,
FIELD(bussStatus,'1','5','3'),
FIELD(favor,'1','2','3'),
businessName LIMIT 0,10
I am using REPLACE so that the search treats أ and ا, and ة and ه, as the same letter (I apply the same replacements to the search word before adding it to the query).
My question:
How should I declare the indexes properly?
Should I declare one multi-column index:
ALTER TABLE `bussiness`
ADD INDEX `index9` (`nameHe` ASC, `nameEn` ASC, `nameAr` ASC, `favor` ASC, `bussStatus` ASC);
or a single-column index for each column?
Or should I create another column, allNamesLanguages, containing nameAr, nameEn and nameHe, and search only that column?
You have two problems with this part of the query that make standard indexes unusable:
( REPLACE( REPLACE(REPLACE(LOWER(`bussiness`.`nameEn`),'أ','ا'),'أ','ا') ,'ة','ه') LIKE '%test%' )
OR( REPLACE( REPLACE(REPLACE(LOWER(`bussiness`.`nameHe`),'أ','ا'),'أ','ا') ,'ة','ه') LIKE '%test%' )
OR( REPLACE( REPLACE(REPLACE(LOWER(`bussiness`.`nameAr`),'أ','ا'),'أ','ا') ,'ة','ه') LIKE '%test%' )
OR( REPLACE( REPLACE(REPLACE(LOWER(`categories2`.`keyWords`),'أ','ا'),'أ','ا') ,'ة','ه') LIKE '%test%' )
The first is the use of functions on the columns. The second is the use of like with a pattern that starts with a wildcard ('%').
For the functionality that you seem to want, you are going to need to use full text indexes and triggers and additional columns.
Here is my recommendation:
Add (at least) four additional columns that will be used for searching names. Something like business.nameEn_search and so on.
Add insert triggers, and perhaps update and delete triggers, that will do the replacement of the special characters when you insert new values. That is, the massive replace( . . . ) logic goes in the trigger.
Add a full text index for the four columns.
Use match . . . against for your queries.
More information about full text functionality is in the documentation.
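The normalization that the triggers would apply can also be sketched outside the database, which is useful because the search term has to be normalized in exactly the same way before it is matched. The function name normalize_for_search is hypothetical. It mirrors the LOWER/REPLACE chain from the question and nothing more.

```python
def normalize_for_search(text):
    """Lower-case the text and fold the letter variants used in the question.

    Mirrors REPLACE(REPLACE(LOWER(x), 'أ', 'ا'), 'ة', 'ه'): the value stored
    in a *_search column and the user's search term should both pass through
    this same function.
    """
    return text.lower().replace("أ", "ا").replace("ة", "ه")


search_value = normalize_for_search("Test أحمد")
```

Keeping one normalization routine for both sides avoids the stored values and the search term drifting apart when another letter variant is added later.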
Functions basically render indexes useless. For that reason, expressions used in WHERE clauses, such as UPPER(name), can be indexed by so-called "function-based indexes". They are a feature of Oracle; in MySQL, comparable functional key parts are only available from version 8.0.13 onward.
How to use a function-based index on a column that contains NULLs in Oracle 10+?
http://www.mysqlab.net/knowledge/kb/detail/topic/oracle/id/5041
Function-based indexes have their preconditions, though. The function used must be deterministic. So if you would like to index a calculation like "age", it won't work because "age" defined as "now minus then" basically grows each time you select.
My advice is to create more columns and to store the information to be mined there, as prepared as possible.
If you use LIKE "%blabla%", any index will be useless because the pattern starts with a wildcard. So try to organize the additional columns so that you can avoid LIKE "%... or avoid LIKE altogether. In my experience, adding a few more columns to an index is not a performance problem. So just try what happens if you add 4 columns and one combined index for them.
As I understand, you win the game as soon as you can write:
... WHERE nameEn_idx = 'test' AND/OR nameAr_idx = 'test' ...
I have a field ('roles') with these values
roles
row_1: 1,2,3,5
row_2: 2,13
I do this:
SELECT * FROM bloques WHERE 2 IN (roles)
and only find row_2, because it starts with 2
The LIKE option doesn't work because if I search for 1
SELECT * FROM bloques WHERE roles LIKE '%1%'
it gives me both rows.
The FIND_IN_SET function can help you:
SELECT FIND_IN_SET('b','a,b,c,d');
For your code:
SELECT * FROM bloques WHERE FIND_IN_SET(2,roles);
Also, I suggest that you move your schema to 1NF.
Could you use REGEXP? http://dev.mysql.com/doc/refman/5.1/en/regexp.html
Otherwise, you could add commas to the front and end of your rows.
select * from mytable where CONCAT(',', roles, ',') like '%,2,%'
SELECT
*
FROM
bloques
WHERE
roles LIKE '%,2,%'
OR roles LIKE '2,%'
OR roles LIKE '%,2'
OR roles = '2'
The first case matches a 2 in the middle of a set, the second a 2 at the start of the set, the third a 2 at the end of the set, and the fourth a set that contains only 2. This is probably terribly inefficient, but it should work.
You can use this regex here:
SELECT * FROM bloques
WHERE roles REGEXP '[[:<:]]2[[:>:]]';
... or, in more generic case:
SELECT * FROM bloques
WHERE roles REGEXP CONCAT('[[:<:]]', :your_value, '[[:>:]]');
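You need to use these weird-looking things, as they match word boundaries, preventing '2' from matching inside '23' or '32'. (In MySQL 8.0 and later, which use the ICU regex library, these markers are not supported and the word-boundary marker is \\b instead.)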
The problem is that this query is only the beginning of the troubles caused by using a denormalized field. I'd suggest using either the SET type (if the number of options is limited and low) or, much better, a junction table to store the m-n relationships.
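A junction table could look roughly like the following sketch, run here against an in-memory SQLite database as a stand-in for MySQL. The table name bloque_roles, its columns, and the id column on bloques are assumptions for illustration. The two sample rows are the ones from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE bloques (id INTEGER PRIMARY KEY, roles TEXT)")
conn.executemany(
    "INSERT INTO bloques (id, roles) VALUES (?, ?)",
    [(1, "1,2,3,5"), (2, "2,13")],
)

# Hypothetical junction table: one row per (bloque, role) pair.
conn.execute(
    """
    CREATE TABLE bloque_roles (
        bloque_id INTEGER NOT NULL REFERENCES bloques(id),
        role_id   INTEGER NOT NULL,
        PRIMARY KEY (bloque_id, role_id)
    )
    """
)

# One-off migration: split each comma-separated list into separate rows.
for bloque_id, roles in conn.execute("SELECT id, roles FROM bloques").fetchall():
    conn.executemany(
        "INSERT INTO bloque_roles (bloque_id, role_id) VALUES (?, ?)",
        [(bloque_id, int(role)) for role in roles.split(",")],
    )


def bloques_with_role(role_id):
    """Return the ids of all bloques that have the given role."""
    cursor = conn.execute(
        "SELECT bloque_id FROM bloque_roles WHERE role_id = ? ORDER BY bloque_id",
        (role_id,),
    )
    return [row[0] for row in cursor]
```

With this layout the lookup becomes a plain equality test on role_id, so an index on that column can be used and '2' can no longer be confused with '23' or '13'.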
Use FIND_IN_SET() function
Try this:
SELECT * FROM bloques WHERE FIND_IN_SET(1, roles);
OR
SELECT * FROM bloques
WHERE CONCAT(',', roles, ',') LIKE '%,1,%';