Non-binary LIKE in MySQL through Django ORM - mysql

This is a follow-up from this question. Although I can write a non-binary LIKE query such as - SELECT COUNT(*) FROM TABLE WHERE MID LIKE 'TEXT%' in raw SQL, I would like to know if it's possible through the Django ORM.
Both startswith and contains seem to be using a binary pattern search.

Try istartswith and icontains, which in MySQL resolve to LIKE rather than LIKE BINARY.
Note that with MySQL, the case-sensitivity of the comparison depends on the collation set in the database (meaning that i lookups may still be case-sensitive!).
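As a rough illustration of the difference, the sketch below hand-copies the part of the operator table that Django's MySQL backend uses for these four lookups. The dictionary and the where_fragment helper are written for this example only and are not Django APIs; the real backend also escapes % and _ in the value, which is skipped here. In the ORM itself the call would look like MyModel.objects.filter(mid__istartswith="TEXT").count(), where MyModel is a hypothetical model.

```python
# Hand-written copy of the lookup-to-operator mapping for illustration.
# Check the mysql backend of your Django version for the authoritative table.
MYSQL_LOOKUP_OPERATORS = {
    "contains": "LIKE BINARY %s",
    "icontains": "LIKE %s",
    "startswith": "LIKE BINARY %s",
    "istartswith": "LIKE %s",
}


def where_fragment(column, lookup, value):
    """Return (sql, params) for a hypothetical single-column filter."""
    operator = MYSQL_LOOKUP_OPERATORS[lookup]
    # Prefix lookups append the wildcard; substring lookups wrap the value.
    if lookup.endswith("startswith"):
        pattern = value + "%"
    else:
        pattern = "%" + value + "%"
    return "`{}` {}".format(column, operator), (pattern,)


sql, params = where_fragment("mid", "istartswith", "TEXT")
print(sql, params)
```

Whether the resulting LIKE is actually case-insensitive still depends on the column's collation, as noted above.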

Related

Alternative ways of searching a string in mysql column

I have a web app written in Java. In one part of the app I need to use MySQL LIKE to search for a string in a table containing 100,000+ rows. From my research, MySQL LIKE only uses an index when the wildcard is at the end of the pattern, for example hello%, but I need %hello%, and LIKE cannot use an index for that kind of pattern. I also read that other databases such as PostgreSQL can use indexes for this kind of string search.
My questions are: just because of LIKE, do I need to replace MySQL, with all its other features, with PostgreSQL? Do we have any alternative way on MySQL to search for a string that uses indexes? Or should I install both and use each for its own purpose (if there is no other way)?
All replies are much appreciated.
Regarding "Do we have any alternative way on MySql To search for a string":
Have you looked into MySQL Full-Text Search, which uses a FULLTEXT index? It is available provided you are using either the InnoDB or MyISAM engine.
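A minimal sketch of what that could look like, assuming a hypothetical table articles with a text column body. The Python helpers below only build the SQL strings (they do not connect to a database) and use the %s placeholder style common to MySQL drivers for Python. Keep in mind that full-text search matches words rather than arbitrary substrings, so it is not an exact replacement for LIKE '%hello%'.

```python
# Sketch only: these helpers return SQL text; nothing is executed.
# The table and column names passed in below are hypothetical.

def fulltext_index_ddl(table, column):
    """DDL that adds a FULLTEXT index on a single column."""
    return "ALTER TABLE `{t}` ADD FULLTEXT INDEX `ft_{c}` (`{c}`)".format(
        t=table, c=column
    )


def fulltext_query(table, column):
    """Parameterised word search; the search term is the single parameter."""
    return (
        "SELECT * FROM `{t}` "
        "WHERE MATCH(`{c}`) AGAINST (%s IN NATURAL LANGUAGE MODE)"
    ).format(t=table, c=column)


print(fulltext_index_ddl("articles", "body"))
print(fulltext_query("articles", "body"))
```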

Is it good practice to use dots within table names in MySQL

Correct me if I'm wrong, but my understanding is that, in MSSQL, sub-structures of a database like Views, Schemas and Tables can be referenced using object notation such as:
Database.Schema.Table.Column
Each of these objects I believe has their own properties.
I need to replicate the structure of an MSSQL DB in MySQL and I am unsure what is the best practice.
I am thinking about creating tables in MySQL with the following naming convention:
Database
|---SubStructureX.Table
| |---Column_A
| |---Column_B
|---SubStructureY.Table
| |---Column_C
| |---Column_D
|
|
Therefore a MySQL query could look like this:
SELECT Column_A, Column_B FROM SubStructureX.Table
In short, "SubstructureX.Table" is just a table name that contains a dot. I would be doing this for ease of use during replication of the MSSQL structure. I don't care if the things before and after the dot are not objects in MySQL.
Is this good MySQL practice?
In MySQL? No, I would think that it's not good practice to use periods in table names at all. I would think that it's very bad practice. The dot is the reference operator in SQL. That means if you want to refer to a column using fully qualified notation, you do so like this:
SELECT Table.Column_A ...
Or, with backtick quoting:
SELECT `Table`.`Column_A` ...
Now, imagine if your table is named StructureX.Table. Just like with a space, you've got to quote that to escape it because you don't want MySQL to think the dot is an operator. That means your SQL has to look like this:
SELECT `StructureX.Table`.Column_A ...
Or, with backtick quoting:
SELECT `StructureX.Table`.`Column_A` ...
Doesn't that look like a syntax error to you? Like maybe it's supposed to be like this:
SELECT `StructureX`.`Table`.`Column_A` ...
This would be a nightmare to maintain and as a systems analyst I would hate any application or developer that inflicted this nomenclature on me. It makes me want to claw my eyes out.
Microsoft SQL Server is different because it supports multiple schemas within a single database, while MySQL treats schema as a synonym for database. In MS SQL Server, schemas are collections of objects, and you can use them to organize your tables, or apply security to tables as a group. The default schema is dbo, which is why you see that one listed so often. In MS SQL Server syntax, this:
SELECT [StructureX].[Table].[Column_A] ...
Means within the current database, the schema named StructureX, table named Table, and column name Column_A. MS SQL Server actually supports a four part name, with the fourth part being the database:
SELECT [MyDatabase].[StructureX].[Table].[Column_A] ...
Here, MyDatabase is the database name.
That same style works in MySQL, except you have to remember that schema and database are synonymous. So there, this:
SELECT `StructureX`.`Table`.`Column_A` ...
Would mean database StructureX, table Table, and column Column_A.
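The ambiguity described above can be made concrete with a small Python sketch. The two helper functions are hypothetical and written for this example; they only apply MySQL's backtick quoting rule (a backtick inside a name is doubled) and join separately quoted parts with the dot operator.

```python
def quote_identifier(name):
    """Quote one MySQL identifier, doubling any embedded backticks."""
    return "`" + name.replace("`", "``") + "`"


def qualified(*parts):
    """Join separately quoted identifiers with the dot reference operator."""
    return ".".join(quote_identifier(part) for part in parts)


# One table whose name happens to contain a dot, followed by a column:
print(qualified("StructureX.Table", "Column_A"))

# Database StructureX, table Table, column Column_A:
print(qualified("StructureX", "Table", "Column_A"))
```

The two results differ only in where the backticks fall, which is exactly why a dotted table name is easy to misread.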
I would say yes, but instead of using the table name directly, give the table an alias, like this:
select a.column1 from yourTable as a
Using a table alias is good practice.

switching from MySQL to PostgreSQL for Ruby on Rails for the sake of Heroku

I'm trying to push a brand new Ruby on Rails app to Heroku. Currently, it sits on MySQL. It looks like Heroku doesn't really support MySQL and so we are considering using PostgreSQL, which they DO support.
How difficult should I expect this to be? What do I need to do to make this happen?
Again, please note that my databases (both development and production) are completely empty right now.
Common issues:
1. GROUP BY behavior. PostgreSQL has a rather strict GROUP BY. If you use a GROUP BY clause, then every column in your SELECT must either appear in your GROUP BY or be used in an aggregate function.
2. Data truncation. MySQL will quietly truncate a long string to fit inside a char(n) column unless your server is in strict mode; PostgreSQL will complain and make you truncate your string yourself.
3. Quoting is different: MySQL uses backticks for quoting identifiers whereas PostgreSQL uses double quotes.
4. LIKE is case insensitive in MySQL (with its default collations) but not in PostgreSQL. This leads many MySQL users to use LIKE as a case insensitive string equality operator.
(1) will be an issue if you use AR's group method in any of your queries or GROUP BY in any raw SQL. Do some searching for column "X" must appear in the GROUP BY clause or be used in an aggregate function and you'll see some examples and common solutions.
(2) will be an issue if you use string columns anywhere in your application and your models aren't properly validating the length of all incoming string values. Note that creating a string column in Rails without specifying a limit actually creates a varchar(255) column so there actually is an implicit :limit => 255 even though you didn't specify one. An alternative is to use t.text for your strings instead of t.string; this will let you work with arbitrarily large strings without penalty (for PostgreSQL at least). As Erwin notes below (and every other chance he gets), varchar(n) is a bit of an anachronism in the PostgreSQL world.
(3) shouldn't be a problem unless you have raw SQL in your code.
(4) will be an issue if you're using LIKE anywhere in your application. You can fix this one by changing a like b to lower(a) like lower(b) (or upper(a) like upper(b) if you like to shout) or a ilike b but be aware that PostgreSQL's ILIKE is non-standard.
There are other differences that can cause trouble but those seem like the most common issues.
You'll have to review a few things to feel safe:
group calls.
Raw SQL (including any snippets in where calls).
String length validations in your models.
All uses of LIKE.
If you have no data to migrate, it should be as simple as telling your Gemfile to use the pg gem instead, running bundle install, and updating your database.yml file to point to your PostgreSQL databases. Then just run your migrations (rake db:migrate) and everything should work great.
Don't feel you have to migrate to Postgres - there are several MySQL Addon providers available on Heroku - http://addons.heroku.com/cleardb is the one I've had the most success with.
It should be simplicity itself: port the DDL from MySQL to PostgreSQL.
Does Heroku have any schema creation scripts? I'd depend on those if they were available.
MySQL and PostgreSQL are different (e.g. AUTO_INCREMENT columns in MySQL, sequences in PostgreSQL). But the port shouldn't be too hard. How many tables? Tens are doable.

SQL 'LIKE BINARY' any slower than plain 'LIKE'?

I'm using a django application which does some 'startswith' ORM operations comparing longtext columns with a unicode string. This results in a LIKE BINARY comparison operation with a u'mystring' unicode string. Is a LIKE BINARY likely to be any slower than a plain LIKE?
I know the general answer is benchmarking, but I would like to get a general idea for databases in general rather than just my application as I'd never seen a LIKE BINARY query before.
I happen to be using MySQL but I'm interested in the answer for SQL databases in general.
If performance becomes a problem, it might be a good idea to keep a copy of the first, say, 255 characters of the longtext in a separate column, add an index on that column, and run the startswith lookup against it.
BTW, this page says: "if you need to do case-sensitive matching, declare your column as BINARY; don't use LIKE BINARY in your queries to cast a non-binary column. If you do, MySQL won't use any indexes on that column." It's an old tip but I think this is still valid.
For the next person who runs across this - in our relatively small database the query:
SELECT * FROM table_name WHERE field LIKE 'some-field-search-value';
... Result row
Returns 1 row in set (0.00 sec)
Compared to:
SELECT * FROM table_name WHERE field LIKE BINARY 'some-field-search-value';
... Result row
Returns 1 row in set (0.32 sec)
Long story short, at least for our database (MySQL 5.5 / InnoDB) there is a very significant difference in performance between the two lookups.
Apparently though this is a bug in MySQL 5.5: http://bugs.mysql.com/bug.php?id=63563 and in my testing against the same database in MySQL 5.1 the LIKE BINARY query still uses the index (while in 5.5 it does a full table scan.)
A trick: If you don't want to change the type of your column to binary, try to write your WHERE statement like this:
WHERE field = 'yourstring' AND field LIKE BINARY 'yourstring'
instead of:
WHERE field LIKE BINARY 'yourstring'
Indeed, it will check the first condition very quickly, and try the second one only if the first one is true.
It worked well on my project for this test of equality, and I think you can adapt this to the "starts with" test.

How do you write a case insensitive query for both MySQL and Postgres?

I'm running a MySQL database locally for development, but deploying to Heroku, which uses Postgres. Heroku handles almost everything, but my case-insensitive LIKE statements become case sensitive. I could use ILIKE statements, but my local MySQL database can't handle that.
What is the best way to write a case insensitive query that is compatible with both MySQL and Postgres? Or do I need to write separate LIKE and ILIKE statements depending on the DB my app is talking to?
The moral of this story is: Don't use a different software stack for development and production. Never.
You'll just end up with bugs which you can't reproduce in dev; your testing will be worthless. Just don't do it.
Using a different database engine is out of the question - there will be FAR more cases where it behaves differently than just LIKE (also, have you checked the collations in use by the databases? Are they identical in EVERY CASE? If not, you can forget ORDER BY on varchar columns working the same)
select * from foo where upper(bar) = upper(?);
If you set the parameter to upper case in the caller, you can avoid the second function call.
Use Arel:
Author.where(Author.arel_table[:name].matches("%foo%"))
matches will use the ILIKE operator for Postgres, and LIKE for everything else.
In postgres, you can do this:
SELECT whatever FROM mytable WHERE something ILIKE 'match this';
I'm not sure if there is an equivalent for MySQL but you can always do this which is a bit ugly but should work in both MySQL and postgres:
SELECT whatever FROM mytable WHERE UPPER(something) = UPPER('match this');
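To show the UPPER() comparison running, here is a minimal Python sketch. It uses the standard library's sqlite3 module with an in-memory database purely so the example is self-contained; the table contents are made up, and with a MySQL or Postgres driver the placeholder would typically be %s rather than ?.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (something TEXT)")
conn.executemany(
    "INSERT INTO mytable (something) VALUES (?)",
    [("Match This",), ("match this",), ("no match",)],
)


def find_case_insensitive(term):
    """Return rows equal to term, ignoring case, by upper-casing both sides."""
    rows = conn.execute(
        "SELECT something FROM mytable "
        "WHERE UPPER(something) = UPPER(?) ORDER BY something",
        (term,),
    )
    return [row[0] for row in rows]


print(find_case_insensitive("MATCH THIS"))  # ['Match This', 'match this']
```

Note that SQLite's built-in UPPER() only folds ASCII letters, so this demo says nothing about non-ASCII text.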
There are several answers, none of which are very satisfactory.
LOWER(bar) = LOWER(?) will work on MySQL and Postgres, but is likely to perform terribly on MySQL: MySQL won't use its indexes because of the LOWER function. On Postgres you can add a functional index (on LOWER(bar)), but MySQL did not support this before version 8.0.13.
MySQL will (unless you have set a case-sensitive collation) do case-insensitive matching automatically, and use its indexes. (bar = ?).
From your code outside the database, maintain bar and bar_lower fields, where bar_lower contains the result of lower(bar). (This may be possible using database triggers, also). (See a discussion of this solution on Drupal). This is clumsy but does at least run the same way on pretty much every database.
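The third option (a shadow column maintained by the application) could be sketched as follows. The example uses Python's built-in sqlite3 module only so that it runs as-is; the table foo and the columns bar and bar_lower follow the names used above, while the helper functions and the index name are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE foo (bar TEXT, bar_lower TEXT)")
# An ordinary index on the shadow column works on any database.
conn.execute("CREATE INDEX idx_foo_bar_lower ON foo (bar_lower)")


def insert_bar(value):
    """Write the original value and its lower-cased copy together."""
    conn.execute(
        "INSERT INTO foo (bar, bar_lower) VALUES (?, ?)",
        (value, value.lower()),
    )


def find_bar(term):
    """Case-insensitive lookup that compares against the indexed column."""
    rows = conn.execute(
        "SELECT bar FROM foo WHERE bar_lower = ? ORDER BY bar",
        (term.lower(),),
    )
    return [row[0] for row in rows]


for name in ("John", "JOHN", "Jane"):
    insert_bar(name)

print(find_bar("john"))  # ['JOHN', 'John']
```

Every code path that writes bar has to go through something like insert_bar, which is the clumsy part mentioned above.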
REGEXP is case insensitive (unless used with BINARY), and can be used, like so...
SELECT id FROM person WHERE name REGEXP 'john';
...to match 'John', 'JOHN', 'john', etc.
If you're using PostgreSQL 8.4 you can use the citext module to create case insensitive text fields.
Use COLLATE; see http://dev.mysql.com/doc/refman/5.0/en/case-sensitivity.html
You might also consider checking out the searchlogic plugin, which does the LIKE/ILIKE switch for you.
You can also use ~* in Postgres if you want to match a substring within a block of text. ~ is a case-sensitive match and ~* a case-insensitive one (both treat the pattern as a regular expression). It's a slow operation, but I find it useful for searches.
Select * from table where column ~* 'UnEvEn TeXt';
Select * from table where column ~ 'Uneven text';
Both would hit on "Some Uneven text here"
Only the former would hit on "Some UNEVEN TEXT here"
Converting to upper is best as it covers compatible syntax for the 3 most-used Rails database backends. PostgreSQL, MySQL and SQLite all support this syntax. It has the (minor) drawback that you have to uppercase your search string in your application or in your conditions string, making it a bit uglier, but I think the compatibility you gain makes it worthwhile.
Both MySQL and SQLite3 have a case-insensitive LIKE operator. Only PostgreSQL has a case-sensitive LIKE operator and a PostgreSQL-specific (per the manual) ILIKE operator for case-insensitive searches. You might specify ILIKE instead of LIKE in your conditions in the Rails application, but be aware that the application will cease to work under MySQL or SQLite.
A third option might be to check which database engine you're using and modify the search string accordingly. This might be better done by hacking into / monkeypatching ActiveRecord's connection adapters and having the PostgreSQL adapter modify the query string to replace "LIKE" with "ILIKE" prior to query execution. This solution is however the most convoluted, and in light of easier ways like uppercasing both terms, I think it is not worth the effort (although you'd get plenty of brownie points for doing it this way).
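For comparison, the "check the engine and pick the operator" idea can be sketched without any monkeypatching. The Python functions below are hypothetical helpers, not part of ActiveRecord or any other library, and the backend names passed in are assumed labels. The sketch also assumes that plain LIKE is case-insensitive on MySQL and SQLite, which depends on the collation in use.

```python
def like_operator(vendor):
    """Return the case-insensitive pattern operator for a backend name."""
    # Assumption: only PostgreSQL needs ILIKE; other backends get LIKE.
    return "ILIKE" if vendor == "postgresql" else "LIKE"


def search_sql(table, column, vendor, placeholder="%s"):
    """Build a parameterised pattern search for the given backend."""
    return "SELECT * FROM {} WHERE {} {} {}".format(
        table, column, like_operator(vendor), placeholder
    )


for vendor in ("postgresql", "mysql", "sqlite"):
    print(vendor, "->", search_sql("authors", "name", vendor))
```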