Is MySQL FULLTEXT the best solution for partial words?

I have a MySQL MyISAM table containing entries that describe airports. The table has 3 varchar columns: code, name and tags.
code is the airport's code (like JFK and ORD), name is the airport's name (John F Kennedy and O'Hare), and tags is a semicolon-separated list of tags associated with the airport (like N.Y.C;New York; and Chicago;).
I need to be able to look up an airport (for an autocomplete) by code, name or tags, so I set a FULLTEXT index on (code, name, tags).
I have encountered two problems with FULLTEXT so far that prevent me from working with it:
1. There is no way to do partial matching - only postfix matching (is this true?)
2. When the search term contains a period ('.'), matching works differently. I assume the period is parsed in a special way. For example, a FULLTEXT search on N.Y.C will not return JFK, although the same search on New York will.
Is there any way to overcome these barriers? Otherwise, should I be looking at LIKE matching instead, or an entirely different storage engine? Thanks!

The best solution I came up with is to use both FULLTEXT and LIKE matching, and to combine the results with UNION.
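A minimal sketch of how the FULLTEXT-plus-LIKE combination might be driven from application code. The table name `airports` and the helper names are assumptions made for illustration; the column names come from the question. It relies on the `*` truncation operator and the `+` (required) operator of MySQL's BOOLEAN MODE, and uses `%s` placeholders as the common Python MySQL drivers do. Only the query text and its parameters are built here; nothing is executed.

```python
import re

# Hypothetical table name `airports`; code, name and tags are from the question.
# UNION (without ALL) drops rows returned by both branches.
AUTOCOMPLETE_SQL = (
    "SELECT code, name FROM airports "
    "WHERE MATCH(code, name, tags) AGAINST (%s IN BOOLEAN MODE) "
    "UNION "
    "SELECT code, name FROM airports "
    "WHERE code LIKE %s OR name LIKE %s OR tags LIKE %s"
)

def boolean_prefix_term(user_input):
    """Turn 'new yor' into '+new* +yor*': every word required, prefix-matched."""
    words = re.findall(r"\w+", user_input)
    return " ".join("+" + w + "*" for w in words)

def like_term(user_input):
    """Escape LIKE wildcards, then wrap the input for a substring match."""
    escaped = (
        user_input.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")
    )
    return "%" + escaped + "%"

def autocomplete_params(user_input):
    """Parameters for AUTOCOMPLETE_SQL, in placeholder order."""
    like = like_term(user_input)
    return (boolean_prefix_term(user_input), like, like, like)
```

The LIKE branch is what keeps a term such as N.Y.C searchable as typed, since the full-text branch only ever sees the individual words.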

Related

Search for specific keyword in MYSQL

I'm fairly new to MySQL.
I want to write a query that searches for a specific keyword in a column where keywords are separated by commas. The following code only returns the rows that contain that keyword alone, not rows where it is combined with other keywords.
In table q16, I'm looking for a way to select rows that have my keyword in the "Area_of_concern" column, whether or not it's combined with other keywords:
SELECT *
FROM `q16`
WHERE area_of_concern like '%more education is needed%'
Here's an input example:
q16_id area of concern
1 more education is needed
2 more enforcement, change in strategy
3 change in strategy
4 more education is needed, change in strategy
5 transportation issue, more enforcement, more education is needed
I'm looking to get the rows with the keyword "more education is needed", so I should see rows 1, 4 and 5 in the output.
I think you should create a table with one column for the keyword and one column for where that keyword is used: a foreign key to the q16 table in your case.
It will work much faster that way.
As for your question it is a duplicate of this one here, I believe.
How to search for rows containing a substring?
A quick try: try using double quotes instead of single ones, as in some systems, single quotes don't allow for escapes (special characters) inside them.
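The separate keywords table suggested in the answer above could look roughly like the following. This is a sketch under stated assumptions: SQLite (via Python's built-in sqlite3 module) stands in for MySQL so the example runs on its own, and the table name `q16_keyword` and the helper `ids_with_keyword` are made up for illustration. The sample rows are the ones from the question.

```python
import sqlite3

rows = [
    (1, "more education is needed"),
    (2, "more enforcement, change in strategy"),
    (3, "change in strategy"),
    (4, "more education is needed, change in strategy"),
    (5, "transportation issue, more enforcement, more education is needed"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE q16 (q16_id INTEGER PRIMARY KEY, area_of_concern TEXT)")
# One row per (q16 row, keyword) pair; q16_id points back to q16.
conn.execute(
    "CREATE TABLE q16_keyword ("
    " q16_id INTEGER REFERENCES q16(q16_id),"
    " keyword TEXT,"
    " PRIMARY KEY (q16_id, keyword))"
)
conn.execute("CREATE INDEX idx_keyword ON q16_keyword (keyword)")
conn.executemany("INSERT INTO q16 VALUES (?, ?)", rows)

# Split the comma-separated column once, at load time.
for q16_id, concerns in rows:
    for keyword in concerns.split(","):
        conn.execute(
            "INSERT INTO q16_keyword VALUES (?, ?)", (q16_id, keyword.strip())
        )

def ids_with_keyword(keyword):
    """Return the q16_id values tagged with exactly this keyword."""
    cur = conn.execute(
        "SELECT q16_id FROM q16_keyword WHERE keyword = ? ORDER BY q16_id",
        (keyword,),
    )
    return [row[0] for row in cur]
```

With this layout the lookup is an equality test on an indexed column rather than a substring scan.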

Merging two tables based on a column (partial match or comma separated column)?

I have two tables whose shared columns do not match exactly (differences in capitalization, or the presence of characters such as commas and spaces). How can I merge these two tables based on their shared column (in R, KNIME, Excel Power Query or SQL)?
In your example Result table it's not clear where the row
gene1 | go3 | 14
comes from, because there's no entry for go3 in Table2. I'm assuming that's a mistake and you meant Table2 to include the row
go3 | 14
If that's correct, here's how to do this in KNIME:
The two Table Creator nodes just create the two tables with column names as shown in your example - replace these with your actual data sources. Cell Splitter splits column Goes using a comma as the delimiter. The Unpivoting node is configured like this:
and the Joiner like this:
All other settings were left as default. Add nodes to reorder and filter the columns in the Joiner output if you need to. Note that you'll see different Goes_Arr[n] columns depending on how many different values of Goes there are - the Enforce exclusion and Enforce inclusion settings make sure that Unpivoting handles this correctly.
This workflow should cope with whitespace between the commas, but I think you also mention differences in capital letters - if you need to handle these, pass each table through a Case Converter node to make them consistent.
Pivoting and unpivoting are hard to understand (IMHO - especially given the cryptic descriptions of their KNIME nodes) but very powerful. I recommend taking time to play around with these nodes to figure out how they work.
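For readers who prefer code to a KNIME workflow, the same split, normalize and join steps can be sketched in plain Python. Everything here is an assumption made for illustration except the `go3 | 14` row and the comma-separated `Goes` column, which come from the thread: the other sample rows, the tuple layout of the tables and the function names are hypothetical.

```python
def normalize(key):
    """Make keys comparable: drop surrounding whitespace, ignore case."""
    return key.strip().lower()

def merge_on_split_column(table1, table2):
    """Inner-join table1 (gene, comma-separated Goes) with table2 (go, value)."""
    lookup = {normalize(go): value for go, value in table2}
    result = []
    for gene, goes in table1:
        for go in goes.split(","):      # the Cell Splitter / Unpivoting step
            key = normalize(go)         # the Case Converter step
            if key in lookup:           # the Joiner step
                result.append((gene, key, lookup[key]))
    return result

# Hypothetical sample data with mixed case and stray spaces.
table1 = [("gene1", "GO1, go3"), ("gene2", "go2")]
table2 = [("go1", 10), ("Go2", 12), ("go3", 14)]
merged = merge_on_split_column(table1, table2)
```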

Partial MySQL index from End of Field

A partial index helps to have smaller indexes, and makes INSERTs faster.
For instance
CREATE TABLE wine (
name VARCHAR(100),
...
INDEX (name(8)));
While names are something like
Chateau Mouton-Rothschild
Chateau Mouton-Cadet
Chateau Petrus
Chateau Lafite
Chateau Lafleur
...
In this (example) list, Chateau appears in every name. MySQL builds the index from the first 8 characters, so the index holds only one distinct value, and a search for Chateau Petrus is done sequentially over all the Chateau rows. (In this particular case, splitting the first word (Chateau) and the rest of the name into two fields would make sense, but that is not the point.)
Is there a way to ask MySQL to create a partial index based on the end of a field?
Actually I found a way in the meantime - with a bit of programming:
Only for the name field:
all name entries are stored reversed in the DB
all name searches are made with the reversed search term
the name in each row is reversed again before being sent to the client (user agent)
For instance in PHP
...query('INSERT ... name="' . strrev($name) . '"...
...query('SELECT * FROM wine WHERE name="' . strrev($name) . '"');
For instance, a search for %MOUTON% will actually search for %NOTUOM%.
The reversal adds a little overhead, but it is negligible compared to the possible database gain.
The question was specifically asking for a pure MySQL solution, but if there is none, this is a workable workaround in any language. I'll accept this answer in a few days if there is nothing better.
Aside from the solution you mentioned above, the only solution for this in MySQL is to explicitly store the value you want to index. (e.g. name without Chateau)
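The reversed-storage workaround described above can be sketched as follows. Assumptions: SQLite (Python's built-in sqlite3 module) stands in for MySQL so the example runs on its own, and the column name `name_rev` and both helper functions are hypothetical. SQLite has no prefix-length indexes, so a full index is used here; in MySQL the index would presumably be declared on the first few characters of the reversed column, in the same way as `INDEX (name(8))` in the question.

```python
import sqlite3

names = [
    "Chateau Mouton-Rothschild",
    "Chateau Mouton-Cadet",
    "Chateau Petrus",
    "Chateau Lafite",
    "Chateau Lafleur",
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE wine (name_rev TEXT)")
conn.execute("CREATE INDEX idx_name_rev ON wine (name_rev)")
# Names are stored reversed, so the distinctive ending comes first.
conn.executemany("INSERT INTO wine VALUES (?)", [(n[::-1],) for n in names])

def find_exact(name):
    """Exact lookup: reverse the search term, reverse the result back."""
    cur = conn.execute(
        "SELECT name_rev FROM wine WHERE name_rev = ?", (name[::-1],)
    )
    return [row[0][::-1] for row in cur]

def find_by_suffix(suffix):
    """'Ends with' on the name becomes 'starts with' on the reversed column.

    The suffix is used as-is, so % and _ in it would act as LIKE wildcards.
    """
    cur = conn.execute(
        "SELECT name_rev FROM wine WHERE name_rev LIKE ?", (suffix[::-1] + "%",)
    )
    return sorted(row[0][::-1] for row in cur)
```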

Select the records containing one or more words fully in UPPERCASE

I have a question about a MySQL database. I have a table order_det whose column remarks_desc contains entries as follows:
Table structure:
Table: order_det
Columns: rec_id, remarks_desc
Sample records in order_det table
rec_id remarks_desc
_________________________________________________________
1 a specific PROGRAMMING problem
2 A software Algorithm
3 software tools commonly USED by programmers
4 Practical, answerable problems that are unique to the programming profession
5 then you’re in the right place to ask your question
6 to see if your QUESTION has been asked BEFORE
My requirement: I want to select only the records that contain one or more words written entirely in uppercase letters. From the 6 records above, I want to select only records 1, 3 and 6:
rec_id remarks_desc
__________________________________________________
1 a specific PROGRAMMING problem (it contains one all uppercase word PROGRAMMING)
3 software tools commonly USED by programmers (it contains one all uppercase word USED)
6 to see if your QUESTION has been asked BEFORE (it contains two all uppercase words QUESTION and BEFORE)
I tried to achieve this using LIKE and REGEXP, but I am getting incorrect results.
Please help me to get the correct result.
Try:
SELECT rec_id, remarks_desc FROM order_det WHERE remarks_desc REGEXP '(^|[[:blank:]])[[:upper:]][[:upper:]]+([[:blank:]]|$)'
I have assumed that you want to exclude single-letter capitalised words. If you want to exclude capitalised words at the start of the string, you'll need to tweak the regex.
Make sure that your table collation is case sensitive (_cs, not _ci).
I used information from http://dev.mysql.com/doc/refman/5.1/en/regexp.html#operator_regexp
However, if you're having to use regular expressions to extract data from a database, it's worth considering whether your database design could be improved.
This is particularly important if you need good performance from the database.
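To check the pattern outside the database, here is a rough Python equivalent of the REGEXP above, run against the sample records from the question. It is a sketch, not the MySQL behaviour itself: `[[:blank:]]` is written as `[ \t]`, `[[:upper:]]` as the ASCII range `[A-Z]`, and Python's `re` is case sensitive by default, so no collation setting is involved. The names `UPPER_WORD`, `has_uppercase_word` and `selected` are made up for the example.

```python
import re

# Two or more uppercase letters, delimited by blanks or the string boundaries.
UPPER_WORD = re.compile(r"(?:^|[ \t])[A-Z]{2,}(?:[ \t]|$)")

def has_uppercase_word(text):
    """True if text contains at least one all-uppercase word of 2+ letters."""
    return UPPER_WORD.search(text) is not None

records = {
    1: "a specific PROGRAMMING problem",
    2: "A software Algorithm",
    3: "software tools commonly USED by programmers",
    4: "Practical, answerable problems that are unique to the programming profession",
    5: "then you’re in the right place to ask your question",
    6: "to see if your QUESTION has been asked BEFORE",
}

selected = [rec_id for rec_id, text in records.items() if has_uppercase_word(text)]
```

Like the SQL version, this skips single-letter words such as the leading "A" in record 2, and it would miss an uppercase word followed directly by punctuation.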
Here is a pretty straightforward stored function that returns the number of uppercase words in a row.
Cons:
it's a stored function, not pure SQL;
it uses collate;
it uses regexp, but feel free to get rid of it by using another inner loop;
it counts all words, but you can add a break once you reach 2.
Please find the function on the following link (gist.github.com). It doesn't display correctly here.

How to search a word in MySQL table? MATCH AGAINST or LIKE?

I have a table of companies that may have many branches in different countries. These countries are stored in the countries field.
Now I have to build a search system that lets users find companies that have a branch in a specific country.
My question is: which one should I use, MATCH AGAINST or LIKE? The query must search all records to find exact matches.
Note: a record may contain several country names separated by commas.
The MATCH AGAINST clause is used for full-text search.
For this you need to create a full-text index on the search column countries.
A full-text index search is much faster than a LIKE '%country%' search.
I would change the implementation: having a field that contains multiple values is a bad idea. For example, it's difficult to maintain - how would you remove a country from a company?
I believe that a better approach would be to have a separate table companies_countries which will have two columns: company_id and country_id, and could have multiple lines per company.
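A small sketch of the companies_countries table proposed above, focused on the maintenance question the answer raises (removing a country from a company). SQLite via Python's sqlite3 module stands in for MySQL here, and the sample ids and helper function names are hypothetical; only the table and column names come from the answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE companies_countries ("
    " company_id INTEGER,"
    " country_id INTEGER,"
    " PRIMARY KEY (company_id, country_id))"
)
# Hypothetical data: company 1 has branches in countries 10 and 20, company 2 in 20.
conn.executemany(
    "INSERT INTO companies_countries VALUES (?, ?)", [(1, 10), (1, 20), (2, 20)]
)

def companies_in(country_id):
    """Companies with at least one branch in the given country."""
    cur = conn.execute(
        "SELECT company_id FROM companies_countries"
        " WHERE country_id = ? ORDER BY company_id",
        (country_id,),
    )
    return [row[0] for row in cur]

def remove_country(company_id, country_id):
    """Removing a country is a single-row DELETE, with no string editing."""
    conn.execute(
        "DELETE FROM companies_countries WHERE company_id = ? AND country_id = ?",
        (company_id, country_id),
    )
```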
You should use LIKE, because as @Omesh mentioned, the MATCH AGAINST clause is used for full-text search, and full-text search needs the entire column for the search.