Case insensitive search without additional rules? - mysql

I'm trying to find a collation in MySQL (my version is 5.0) that treats strings differing only in case as equal, but applies no other equivalence rules such as:
á = a
and so on.
I tried to find the proper collation here: http://www.collation-charts.org/mysql60/by-charset.html but it seems that the collation I'm looking for doesn't exist.
I can't use SELECT ... WHERE lower(column1) = lower(column2) in my SQL query, because then the indexes on column1 and column2 are not used and the query is terribly slow.
Thanks for any suggestion!

I was given this advice: simply have a table like id, word, word_in_lowercase. The data is redundant, but otherwise it fulfils all my needs.
word_in_lowercase can be updated automatically via a trigger or some additional programming.
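A minimal sketch of the redundant-column idea, under stated assumptions: it uses Python's built-in sqlite3 module as a stand-in for MySQL, maintains word_in_lowercase in application code (the "additional programming" route rather than a trigger), and the helper names case_key, add_word and find_word are hypothetical. Because only case is folded, 'á' and 'a' stay distinct, and the lookup is a plain equality test that an index can serve.

```python
import sqlite3
import unicodedata


def case_key(text: str) -> str:
    """Fold case only: 'Á' becomes 'á', but 'á' is never merged with 'a'."""
    return unicodedata.normalize("NFC", text).lower()


# In-memory SQLite database used purely for illustration (assumption: the
# real table lives in MySQL and has the same three columns).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE words (id INTEGER PRIMARY KEY, word TEXT NOT NULL, "
    "word_in_lowercase TEXT NOT NULL)"
)
# The index is on the redundant column, so equality lookups can use it.
conn.execute("CREATE INDEX idx_words_lower ON words (word_in_lowercase)")


def add_word(word: str) -> None:
    """Store the original word together with its lowercased copy."""
    conn.execute(
        "INSERT INTO words (word, word_in_lowercase) VALUES (?, ?)",
        (word, case_key(word)),
    )


def find_word(needle: str) -> list:
    """Case-insensitive, accent-sensitive lookup via the redundant column."""
    rows = conn.execute(
        "SELECT word FROM words WHERE word_in_lowercase = ? ORDER BY id",
        (case_key(needle),),
    )
    return [row[0] for row in rows]


for sample in ("Árvíz", "arviz", "Hello"):
    add_word(sample)
```

With this data, find_word("ÁRVÍZ") finds only the accented word and find_word("ARVIZ") finds only the unaccented one.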

Which collation is set on the tables in question? I currently use utf8_hungarian_ci on a lot of tables because it is case-insensitive.

http://dev.mysql.com/doc/refman/5.0/en/case-sensitivity.html indicates that nonbinary strings are case insensitive by default. Have you tested to see that it is not working properly without using lower()?

Why don't you use the full text search functions of MySQL for your search query?
For tasks like yours I am using the MATCH AGAINST function.
Read the documentation at mysql.com for the details.
One example:
SELECT * FROM customer WHERE status = 1 AND MATCH (person, city, company, zipcode, tags) AGAINST ('".$searchstring."' IN BOOLEAN MODE)
This search is performed case-insensitively.

Related

Extracting a value from an Array using mysql

I have a column that has brand names in an array format as below:
I want to extract information associated with Brand4 for example 'price'.
I tried using the query below, but that's a psql query. How can I extract this information using MySQL in GCP?
SELECT Brand_name, price
FROM table_name
Where 'Brand4'=Any(Brand_name)
First, the explanation for your error message is that in MySQL, ANY() accepts a subquery, not just a single column or expression. See https://dev.mysql.com/doc/refman/8.0/en/any-in-some-subqueries.html
MySQL does not have an array type. Your Brand_name column is not an array, it's a string. It happens to contain commas and square brackets, but these are just characters in a string.
So your solutions are to use various string-search functions or expressions, as other folks have suggested.
The downside to all the string-search functions is that they cannot be optimized with a conventional index. So every search will be expensive, because it requires a table-scan.
Another solution I did not see yet is to use a fulltext index.
alter table brands add fulltext index (brand_name);
select * from brands
where match(brand_name) against ('Brand4' in boolean mode);
This may require some special handling if the brand names contain spaces or punctuation, but if they are plain words, it should work.
Read https://dev.mysql.com/doc/refman/8.0/en/fulltext-search.html to understand more about fulltext indexes.
The best solution would be to eliminate this fake "array" column by normalizing the schema to store one brand per row in another table. Then you can match strings exactly and optimize with a conventional index. But I understand you said that the table structure is not up to you.
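A rough sketch of what the normalized layout could look like, assuming Python's built-in sqlite3 module as a stand-in for MySQL. The table names products and product_brands, the prices, and the helper prices_for are all hypothetical, since the original table structure is not shown.

```python
import sqlite3

# In-memory SQLite database for illustration only (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE products (id INTEGER PRIMARY KEY, price REAL NOT NULL);
    -- One row per (product, brand) pair replaces the string "array".
    CREATE TABLE product_brands (
        product_id INTEGER NOT NULL REFERENCES products (id),
        brand_name TEXT NOT NULL,
        PRIMARY KEY (product_id, brand_name)
    );
    -- A conventional index now supports exact brand lookups.
    CREATE INDEX idx_brand_name ON product_brands (brand_name);
    """
)
conn.executemany(
    "INSERT INTO products (id, price) VALUES (?, ?)",
    [(1, 9.5), (2, 20.0)],
)
conn.executemany(
    "INSERT INTO product_brands (product_id, brand_name) VALUES (?, ?)",
    [(1, "Brand1"), (1, "Brand4"), (2, "Brand40")],
)


def prices_for(brand: str) -> list:
    """Return (product id, price) rows whose brand list contains exactly `brand`."""
    rows = conn.execute(
        "SELECT p.id, p.price FROM products AS p "
        "JOIN product_brands AS b ON b.product_id = p.id "
        "WHERE b.brand_name = ? ORDER BY p.id",
        (brand,),
    )
    return rows.fetchall()
```

Here prices_for("Brand4") matches product 1 only; the product tagged Brand40 is not returned, because the comparison is on whole values rather than substrings.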
This should work in MySQL (using a string function as mentioned here):
SELECT *
FROM brands
WHERE FIND_IN_SET('Brand4',brand_name);
see: DBFIDDLE
The provided SQL query will work in MySQL if you put a subquery inside the parentheses, or use FIND_IN_SET instead of ANY.
But, as stated in the MySQL documentation:
This function does not work properly if the first argument contains a comma (,) character.
So, as an alternative, you could use LIKE (simple pattern matching).
Your SQL code then would be:
SELECT `brand_name`, `price`
FROM `test`
WHERE `brand_name` LIKE "%Brand4%"
See SQLFiddle for live example.
Also, you could use LOCATE.
Or any other alternative solution.
But I must say that storing list data the way you do is not the best practice.
There are plenty of ways this can be done better.
For example, using M:M (many-to-many) relationship.
If you created this design, you really should reconsider and redesign it. Databases have their own data structures, and SQL is a declarative language, not an imperative one.
If you didn't design it, consider creating a separate table from that one column. Perhaps this is what you are trying to do.
If you only need to locate a specific string in the values of a field, use LIKE:
SELECT Brand_name, price
FROM table_name
WHERE Brand_name LIKE '%Brand4%'
But be aware that this will not always yield accurate results.
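To illustrate why a substring match can be inaccurate, here is a small pure-Python sketch. The sample strings are hypothetical (the real column format is not shown), and like_match only mimics the substring behaviour of LIKE '%...%'; it ignores collation and case rules.

```python
def like_match(stored: str, brand: str) -> bool:
    """Mimic LIKE '%brand%': true whenever brand occurs anywhere in the string."""
    return brand in stored


def element_match(stored: str, brand: str) -> bool:
    """Split the bracketed, comma-separated string and compare whole elements."""
    items = [item.strip() for item in stored.strip("[]").split(",")]
    return brand in items


# Hypothetical column values: the second row contains Brand40, not Brand4.
rows = ["[Brand1, Brand4]", "[Brand40, Brand7]"]

like_hits = [row for row in rows if like_match(row, "Brand4")]
exact_hits = [row for row in rows if element_match(row, "Brand4")]
```

The substring test reports both rows, because "Brand4" is a prefix of "Brand40", while the element-wise comparison reports only the first row.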

Case-sensitive database search with Active Record

Does Active Record provide a way to generate SQL that forces a text search to be case-sensitive?
Ruby on Rails generators instructed to create a string-type column produce a simple VARCHAR(255) field in a MySQL database. It turns out that queries on such columns are case-insensitive by default.
Thus, an Active Record search such as:
Secret.where(token: 'abcde')
will match records with tokens abcde, ABcdE, etc.
Without changing the underlying database column (e.g. specifying a utf8_bin collation) searches can be made case sensitive by explicitly tweaking the where clause:
Secret.where('binary token = ?', 'abcde')
However, this is database-specific, and I am wondering if Active Record has an idiom to accomplish the same for any database. Just as an example, something resembling the where.not construct:
Secret.where.binary(token: 'abcde')
Wouldn't this be a common enough need?
In short: there is NO ActiveRecord idiom for case-sensitive search.
For case-insensitive search you can try to use this.
It still works, but the source code has changed a bit, so use it at your own risk.
In general, case sensitivity is one aspect of collation.
Different DBMSs use very different default collations for string (text) data types, including different default case sensitivity.
Detailed description for MySQL.
There is an SQL operator, COLLATE, which is very common across DBMSs (though collation names are DBMS-specific).
But the ActiveRecord sources show it only in schema creation code.
Neither the ActiveRecord nor the Arel gem uses COLLATE in where searches (sad).
Note: please don't omit the database tag (mysql etc.) in a question.
There are many questions and answers on SO without such tags (or with only the sql tag) that are irrelevant for most DBMSs other than the author's.

Having questions as field name in database design

I was wondering if it is OK to use a question as a field or column name in a table, such as IsTheWetherNiceOutsite? Are there any pros or cons, or is there another way to achieve this?
If I were you I'd avoid column names with special characters like ? in the table itself. Writing queries can get troublesome.
But you can use column aliases if you want. For example,
SELECT NOW() 'What time is it?',
t.weatherNice 'Is the weather nice?'
FROM table t
It's not necessary. The is prefix already indicates that it's a boolean flag.
Prefer not to use weird characters. It might work (if you use backticks for the column name), but it might not be portable and will certainly confuse others.

In Scala, when making a slick sortBy, how can I have it do a case-sensitive sort

I found the Scala slick package's "sortBy" method is not case sensitive. Ex:
after implementing the following command: q.sortBy(columnMap("name").desc), I got:
TestingIsFun,
testing foo1,
Testing foo,
Is this expected behavior? How can I make it case sensitive? Thx.
I think that, as it currently stands, Slick simply relies on the RDBMS's default handling of case when sorting. You did not mention which RDBMS you use, but in MySQL, for example, sorting is case-insensitive by default. However, you can define the column to be sorted in a way that overrides this, as described in "Altering Mysql Table column to be case sensitive". This works without touching the query or Slick parameters, because the fix is at the schema-definition level. It should also be possible to define the column as a binary string in the first place, from Slick if needed:
O.DBType("binary") in the slick column definition should work for that.
When it comes to the database, a column is sorted according to the collation for that column. By default, MySQL uses a case-insensitive collation (unless you specify a binary charset). You can override the default collation at any of the 4 levels (server, database, table or column), or even only in a specific ORDER BY clause. Which way is the most efficient depends on your particular use case. A case-sensitive collation can affect performance, so most of the time it makes sense to apply it only at the table or column level.
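The difference between the two orderings can be sketched in plain Python, using the three titles from the question. This is only an approximation under stated assumptions: str.lower stands in for a case-insensitive collation and raw UTF-8 bytes stand in for a binary collation; real MySQL collations have additional rules.

```python
titles = ["Testing foo", "testing foo1", "TestingIsFun"]

# Case-insensitive descending sort: compare lowercased copies
# (a rough stand-in for a *_ci collation).
ci_desc = sorted(titles, key=str.lower, reverse=True)

# Binary descending sort: compare raw bytes, so 't' (0x74) sorts after 'T' (0x54)
# (a rough stand-in for a binary collation).
binary_desc = sorted(titles, key=lambda s: s.encode("utf-8"), reverse=True)
```

The case-insensitive order reproduces the result in the question (TestingIsFun, testing foo1, Testing foo), while the byte-wise order puts the lowercase title first (testing foo1, TestingIsFun, Testing foo).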

MySQL Match() not finding most optimal match

I have a database of occupation titles I'm trying to run some queries on. I'm using Match() to try and find the best match occupational title for a user-entered string using this SQL:
SELECT *, MATCH (occupation_title) AGAINST ('EGG PROCESSOR')
AS score FROM occupational_titles WHERE MATCH (occupation_title)
AGAINST ('EGG PROCESSOR') ORDER BY score DESC;
When I run this query against my database, the first three results are "Processor", "Egg Processor", and "COPRA Processor". The first two have the exact same match score of 6.04861688613892. Why on earth would MySQL not rank an exact match hit as the number one result? Is there anything I can do to refine the search algorithm?
You probably want to use one of the modifier modes in your searches. Check the fulltext documentation.
In particular, by default it uses "natural language" searching, while you probably want to consider "boolean mode" and prefixing each keyword with a plus sign to make it mandatory in results, or using double quotes to search for the exact phrase. Check the boolean mode documentation for more information on the syntax.
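As a hedged sketch of the two boolean-mode variants described above, the helpers below build the search string in Python. The function names are hypothetical, the %s placeholder assumes a driver that uses that parameter style, and the helpers do not escape other boolean-mode operator characters in user input.

```python
def require_all_words(text: str) -> str:
    """Prefix every word with '+' so boolean mode requires each one."""
    return " ".join("+" + word for word in text.split())


def exact_phrase(text: str) -> str:
    """Wrap the text in double quotes so boolean mode searches for the phrase."""
    cleaned = " ".join(text.replace('"', " ").split())
    return '"' + cleaned + '"'


# Query text with a driver placeholder; the search string is passed
# separately as a parameter rather than concatenated into the SQL.
BOOLEAN_SEARCH_SQL = (
    "SELECT *, MATCH (occupation_title) AGAINST (%s IN BOOLEAN MODE) AS score "
    "FROM occupational_titles "
    "WHERE MATCH (occupation_title) AGAINST (%s IN BOOLEAN MODE) "
    "ORDER BY score DESC"
)

all_words = require_all_words("EGG PROCESSOR")
phrase = exact_phrase("EGG PROCESSOR")
```

Here all_words is the string +EGG +PROCESSOR and phrase is the same text wrapped in double quotes; either could be supplied for both placeholders.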
You can also consider performing multiple searches using a variety of modes and doing your own weighting.
I guess you should change the collation of your column to a case-insensitive one,
e.g. from latin1_bin to latin1_swedish_ci.
It looks like a case-sensitive match is being done in your case.
Have a look here:
http://dev.mysql.com/doc/refman/5.5/en/fulltext-natural-language.html