SQL 'LIKE BINARY' any slower than plain 'LIKE'? - mysql

I'm using a django application which does some 'startswith' ORM operations comparing longtext columns with a unicode string. This results in a LIKE BINARY comparison operation with a u'mystring' unicode string. Is a LIKE BINARY likely to be any slower than a plain LIKE?
I know the general answer is benchmarking, but I would like to get a general idea for databases in general rather than just my application as I'd never seen a LIKE BINARY query before.
I happen to be using MySQL but I'm interested in the answer for SQL databases in general.

If performance seems to become a problem, it might be a good idea to create a copy of the first eg. 255 characters of the longtext, add an index on that and use the startswith with that.
BTW, this page says: "if you need to do case-sensitive matching, declare your column as BINARY; don't use LIKE BINARY in your queries to cast a non-binary column. If you do, MySQL won't use any indexes on that column." It's an old tip but I think this is still valid.

For the next person who runs across this - in our relatively small database the query:
SELECT * FROM table_name WHERE field LIKE 'some-field-search-value';
... Result row
Returns 1 row in set (0.00 sec)
Compared to:
SELECT * FROM table_name WHERE field LIKE BINARY 'some-field-search-value';
... Result row
Returns 1 row in set (0.32 sec)
Long story short, at least for our database (MySQL 5.5 / InnoDB) there is a very significant difference in performance between the two lookups.
Apparently though this is a bug in MySQL 5.5: http://bugs.mysql.com/bug.php?id=63563 and in my testing against the same database in MySQL 5.1 the LIKE BINARY query still uses the index (while in 5.5 it does a full table scan.)

A trick: If you don't want to change the type of your column to binary, try to write your ‍WHERE statement like this:
WHERE field = 'yourstring' AND field LIKE BINARY 'yourstring'
instead of:
WHERE field LIKE BINARY 'yourstring'
Indeed, it will check the first condition very quickly, and try the second one only if the first one is true.
It worked well on my project for this test of equality, and I think you can adapt this to the "starts with" test.

Related

Non-binary LIKE in MySQL through Django ORM

This is a follow-up from this question. Although I can write a non-binary LIKE query such as - SELECT COUNT(*) FROM TABLE WHERE MID LIKE 'TEXT%' in raw SQL, I would like to know if it's possible through the Django ORM.
Both startswith and contains seem to be using a binary pattern search.
Try istartswith and icontains, which in MySQL resolve to LIKE rather than LIKE BINARY.
Note that with MySQL, the case-sensitivity of the comparison depends on the collation set in the database (meaning that i lookups may still be case-sensitive!).

Is better POW() or POWER()?

I must make an exponentiation of a number and I don't know which function to use between POW() and POWER(). Which of the two functions is better?
Looking at the MySQL documentation I saw that they are synonymous, but I wanted to understand if there was a reason for two functions that do the same thing.
POWER is the synonym of POW. So nothing is better, it is the same:
POWER(X,Y)
This is a synonym for POW().
Using two different names for the same function gives you the possibility to port an SQL query from one dialect to an other dialect without (big) changes.
An example:
You want to use the following TSQL query on MySQL too:
SELECT POWER(2,2) -- 4
Now you can write these query specific for the dialects:
SELECT POWER(2,2) -- 4 - TSQL (POW is not available on TSQL)
SELECT POW(2,2) -- 4 - MySQL
But you can also use the POWER function on MySQL since this is a synonym for POW:
SELECT POWER(2,2) -- 4 - TSQL and MySQL
pow and power are synonyms in MySQL.
I'd use power since it's part of the ANSI SQL standard and using it would make your code easier to port if you ever decide to use a different database.

Reverse string with leading wildcard scan in Postgres

SQL Server: Index columns used in like?
I've tried using the query method in the link above with Postgres (0.3ms improvement), it seems to only work with MySQL (10x faster).
MYSQL
User Load (0.4ms) SELECT * FROM users WHERE reverse_name LIKE REVERSE('%Anderson PhD')
User Load (5.8ms) SELECT * FROM users WHERE name LIKE ('%Anderson Phd')
POSTGRES
User Load (2.1ms) SELECT * FROM users WHERE reverse_name LIKE REVERSE('%Scot Monahan')
User Load (2.5ms) SELECT * FROM users WHERE name LIKE '%Scot Monahan'
Did some googling but couldn't quite understand as I'm quite new to DBs. Could anyone explain why this is happening?
To support a prefix match in Postgres for character type columns you either need an index with a suitable operator class: text_pattern_ops for text etc. (Unless you work with the "C" locale ...)
Or you use a trigram index to support any pattern - then you don't need the "reverse" trick any more.
See:
PostgreSQL LIKE query performance variations
You'll see massive performance improvement for tables of none-trivial size, as soon as an index can be used for the query. Have a look at the query plan with EXPLAIN.

MySQL Uncompress issue

I am facing issue with MySQL, UNCOMPRESS function.
I have table named as user and user_details stores COMPRESS values. In this case before search for values from the user_details i have to UNCOMPRESS it.
But issue is after I do UNCOMPRESS, search become case-sensetive.
Like.. e.g
If I tries below sql, it will only search for values which contain CAPITAL TESTING word and ignore small case testing word
SELECT * FORM user WHERE UNCOMPRESS(user_details) LIKE '%TESTING%'.
I want case-insensitive search.
But issue is after I do UNCOMPRESS, search become case-sensetive.
This is because COMPRESS() "Compresses a string and returns the result as a binary string." (emphasis mine)
When you perform a LIKE operation on a binary string, a binary comparison will be performed (which is case-sensitive).
You may be able to circumvent this by putting a CAST() around the COMPRESS() statement.
But you probably shouldn't be doing this in the first place. It's an extremely inefficient way to search through huge amounts of data. MySQL will have to uncompress every row for this operation, and has no chance of using any of its internal optimization methods like indexes.
Don't use COMPRESS() in the first place.
Try this:
SELECT * FROM `user` WHERE LOWER(UNCOMPRESS(user_details)) LIKE '%testing%'
But as Pekka well pointed or its very inefficient . If your using MyIsam Engine another alternative is myisampack which compresses the hole table and its still query-able.

How do you write a case insensitive query for both MySQL and Postgres?

I'm running a MySQL database locally for development, but deploying to Heroku which uses Postgres. Heroku handles almost everything, but my case-insensitive Like statements become case sensitive. I could use iLike statements, but my local MySQL database can't handle that.
What is the best way to write a case insensitive query that is compatible with both MySQL and Postgres? Or do I need to write separate Like and iLike statements depending on the DB my app is talking to?
The moral of this story is: Don't use a different software stack for development and production. Never.
You'll just end up with bugs which you can't reproduce in dev; your testing will be worthless. Just don't do it.
Using a different database engine is out of the question - there will be FAR more cases where it behaves differently than just LIKE (also, have you checked the collations in use by the databases? Are they identical in EVERY CASE? If not, you can forget ORDER BY on varchar columns working the same)
select * from foo where upper(bar) = upper(?);
If you set the parameter to upper case in the caller, you can avoid the second function call.
Use Arel:
Author.where(Author.arel_table[:name].matches("%foo%"))
matches will use the ILIKE operator for Postgres, and LIKE for everything else.
In postgres, you can do this:
SELECT whatever FROM mytable WHERE something ILIKE 'match this';
I'm not sure if there is an equivalent for MySQL but you can always do this which is a bit ugly but should work in both MySQL and postgres:
SELECT whatever FROM mytable WHERE UPPER(something) = UPPER('match this');
There are several answers, none of which are very satisfactory.
LOWER(bar) = LOWER(?) will work on MySQL and Postgres, but is likely to perform terribly on MySQL: MySQL won't use its indexes because of the LOWER function. On Postgres you can add a functional index (on LOWER(bar)) but MySQL doesn't support this.
MySQL will (unless you have set a case-sensitive collation) do case-insensitive matching automatically, and use its indexes. (bar = ?).
From your code outside the database, maintain bar and bar_lower fields, where bar_lower contains the result of lower(bar). (This may be possible using database triggers, also). (See a discussion of this solution on Drupal). This is clumsy but does at least run the same way on pretty much every database.
REGEXP is case insensitive (unless used with BINARY), and can be used, like so...
SELECT id FROM person WHERE name REGEXP 'john';
...to match 'John', 'JOHN', 'john', etc.
If you're using PostgreSQL 8.4 you can use the citext module to create case insensitive text fields.
use COLLATE.
http://dev.mysql.com/doc/refman/5.0/en/case-sensitivity.html
You might also consider checking out the searchlogic plugin, which does the LIKE/ILIKE switch for you.
You can also use ~* in postgres if you want to match a substring within a block. ~ matches case-sensitive substring, ~* case insensitive substring. Its a slow operation, but might I find it useful for searches.
Select * from table where column ~* 'UnEvEn TeXt';
Select * from table where column ~ 'Uneven text';
Both would hit on "Some Uneven text here"
Only the former would hit on "Some UNEVEN TEXT here"
Converting to upper is best as it covers compatible syntax for the 3 most-used Rails database backends. PostgreSQL, MySQL and SQLite all support this syntax. It has the (minor) drawback that you have to uppercase your search string in your application or in your conditions string, making it a bit uglier, but I think the compatibility you gain makes it worthwile.
Both MySQL and SQLite3 have a case-insensitive LIKE operator. Only PostgreSQL has a case-sensitive LIKE operator and a PostgreSQL-specific (per the manual) ILIKE operator for case-insensitive searches. You might specify ILIKE insead of LIKE in your conditions on the Rails application, but be aware that the application will cease to work under MySQL or SQLite.
A third option might be to check which database engine you're using and modify the search string accordingly. This might be better done by hacking into / monkeypatching ActiveRecord's connection adapters and have the PostgreSQL adapter modify the query string to substitute "LIKE" for "ILIKE" prior to query execution. This solution is however the most convoluted and in light of easier ways like uppercasing both terms, I think this is not worh the effort (although you'd get plenty of brownie points for doing it this way).