Database SQL Compatibility - mysql

I am deploying a Ruby on Rails application that I developed with Sqlite3 to a server with either MySQL or PostgreSQL. I quickly discovered that GROUP BY and the strftime function, which I use heavily to produce by-month rollup reports, either behave differently or are not available across these databases.
I can refactor my code to do the grouping, summing and averaging - but the database does such a nice job of it and reduces the processing required by the server! Advanced applications go beyond simple select and join. ActiveRecord gives us :group, but the DATABASES are not CONSISTENT.
So my question is a matter of architecture - does anyone expect to create truly "database portable" applications in Ruby on Rails? Should I modify my codebase to work with MySQL only and forget about the other databases? Should I modify my codebase to do the advanced grouping, summing, and averaging?
cheers - Don

Several comments:
Develop and test with the same RDBMS brand and version that you're going to deploy to.
Writing portable SQL code is hard because vendors have all these non-standard extra functions and features. For example, strftime() is not part of the ANSI SQL standard. The only way to resolve this is to read the manual for each database you use, and learn what functions they have in common. Sometimes they have a function of a different name that you can use in a similar way. There's no short-cut around this -- you have to study the manuals.
All the databases support GROUP BY, but SQLite and MySQL are more permissive about certain usage than standard ANSI SQL (and than the other brands of database, which do follow the standard). Specifically, the standard requires that your GROUP BY clause name every column in your select-list that isn't part of a grouping function.
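To make the "function of a different name" point concrete, here is a minimal sketch that keeps the by-month bucketing expression in one lookup table per database. The dialect keys, the helper names, and the sales table are made up for illustration; only the SQLite variant is actually executed here, and the MySQL and PostgreSQL spellings (DATE_FORMAT, to_char) should be checked against the manuals for the versions you deploy.

```python
import sqlite3

# Month-bucketing expression per database. The dialect keys and helper
# names are illustrative assumptions, not part of any framework.
MONTH_BUCKET = {
    "sqlite": "strftime('%Y-%m', {col})",
    "mysql": "DATE_FORMAT({col}, '%Y-%m')",
    "postgresql": "to_char({col}, 'YYYY-MM')",
}


def month_bucket_expr(dialect, column):
    """Return the SQL expression that formats `column` as YYYY-MM."""
    return MONTH_BUCKET[dialect].format(col=column)


def monthly_totals(conn, dialect="sqlite"):
    """Sum `amount` per month using the dialect's own date function."""
    bucket = month_bucket_expr(dialect, "sold_on")
    sql = (
        f"SELECT {bucket} AS month, SUM(amount) AS total "
        f"FROM sales GROUP BY {bucket} ORDER BY month"
    )
    return conn.execute(sql).fetchall()


# Hypothetical sample data, stored as ISO date strings for SQLite.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (sold_on TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("2010-01-05", 10), ("2010-01-20", 5), ("2010-02-01", 7)],
)
rows = monthly_totals(conn)
print(rows)
```

The rest of the query stays the same across databases; only the one expression is swapped, which keeps the brand-specific part small and easy to find.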
The following two examples are right:
SELECT A, B, COUNT(C) FROM MyTable GROUP BY A, B;
SELECT A, COUNT(C) FROM MyTable GROUP BY A;
But the next one is wrong, because B has multiple values per group, and it's ambiguous which value it should return in a given row:
SELECT A, B, COUNT(C) FROM MyTable GROUP BY A;
No framework writes truly portable SQL. Rails' ActiveRecord solves this only in very trivial cases. In fact, ActiveRecord helps with neither of the problems you describe: brand-specific functions and non-standard GROUP BY clauses.
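The GROUP BY examples above can be tried against SQLite, which is one of the permissive databases. This is only a sketch with made-up rows in MyTable; the last query shows one possible standard-conforming rewrite (wrapping B in an aggregate), which is an assumption about what the report wants rather than the only fix.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MyTable (A TEXT, B TEXT, C INTEGER)")
conn.executemany(
    "INSERT INTO MyTable VALUES (?, ?, ?)",
    [("x", "p", 1), ("x", "q", 2), ("y", "p", 3)],
)

# Standard: every non-aggregated column appears in GROUP BY.
by_a_b = conn.execute(
    "SELECT A, B, COUNT(C) FROM MyTable GROUP BY A, B ORDER BY A, B"
).fetchall()
by_a = conn.execute(
    "SELECT A, COUNT(C) FROM MyTable GROUP BY A ORDER BY A"
).fetchall()

# Non-standard: SQLite runs this anyway and returns some B for each group.
# Which B you get for group 'x' is not something to rely on.
permissive = conn.execute(
    "SELECT A, B, COUNT(C) FROM MyTable GROUP BY A ORDER BY A"
).fetchall()

# One standard-conforming rewrite: say explicitly which B is wanted.
explicit = conn.execute(
    "SELECT A, MIN(B), COUNT(C) FROM MyTable GROUP BY A ORDER BY A"
).fetchall()

print(by_a_b, by_a, permissive, explicit, sep="\n")
```

A stricter database rejects the third query outright, which is why it tends to surface only after deployment.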

The problem is that MySQL, especially with GROUP BY, does it wrong. If you leave columns out of the GROUP BY, MySQL simply returns "something", accepting that the results may be indeterminate.
You can (and should) enable the ONLY_FULL_GROUP_BY SQL mode to make MySQL throw an error if the result of your GROUP BY would not be clearly defined.
Actually, there are a lot more settings that should be changed in MySQL to make it behave more sanely.
You might be interested in reading this:
http://www.slideshare.net/ronaldbradford/mysql-idiosyncrasies-that-bite-201007

Related

MySQL - No lock while selecting rows in table

I'm starting to study MySQL syntax and now I'm asking how to lock / unlock tables.
After a bit of research, it seems that mysql does not provide a single "nolock" key word.
But if I try to execute the following query:
select *from logs NOLOCK order by timestamp desc;
no errors occur. So, is there a standard way in order to achieve this?
NOLOCK is not an option supported by MySQL.
It's a feature specific to Microsoft SQL Server: https://learn.microsoft.com/en-us/sql/t-sql/queries/hints-transact-sql-table
You must understand that even though SQL is a common standard, each company who offers a SQL-compliant database product has implemented their own extensions to standard SQL. Therefore a product like Microsoft SQL Server has some syntax features that are not supported — and not needed — by other RDBMS products.
MySQL is not Microsoft SQL Server. They are two different implementations of RDBMS.
As Raymond commented above, you unintentionally used NOLOCK in a place where it would be interpreted by MySQL as a table alias.
... FROM logs [AS] NOLOCK ...
The SQL standard makes the AS keyword optional when defining table aliases and column aliases. This can cause some weird surprises, even though it's technically legal syntax to omit the AS keyword.
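The alias behaviour is easy to reproduce. The sketch below uses SQLite as a stand-in (an assumption made so the example runs without a MySQL server; SQLite also treats AS as optional and has no NOLOCK keyword). The logs table and its rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE logs (timestamp TEXT, message TEXT)")
conn.executemany(
    "INSERT INTO logs VALUES (?, ?)",
    [("2020-01-01", "first"), ("2020-01-02", "second")],
)

# NOLOCK is not a keyword here, so the parser reads it as a table alias.
# Proof: the alias can be used to qualify column names.
rows = conn.execute(
    "SELECT NOLOCK.message FROM logs NOLOCK ORDER BY NOLOCK.timestamp DESC"
).fetchall()

# The same query with an explicit AS and a less misleading alias.
rows_as = conn.execute(
    "SELECT l.message FROM logs AS l ORDER BY l.timestamp DESC"
).fetchall()

print(rows, rows_as)
```

Both queries return the same rows, and neither changes anything about locking.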

Using SQL keywords with Yii

I work with Yii 1.1.13.
Is it good to call a table group? (In MySQL GROUP is a keyword, as used in "GROUP BY")
It's probably a bad idea. While some RDBMSs let you use keywords as field or table names (and access them with quoting such as [], e.g. select [group] from tbl), that doesn't mean you should.
We recently had an issue with a field named group in one of our main tables. It was on a DB2 engine and we never had a problem, but then we moved our data warehouse to Greenplum, a PostgreSQL fork, which didn't accept a keyword as a field name. The DBAs were forced to rename the field during the migration, and several services and reports failed until the code was changed. Production support was impacted and everybody was mad about it.
My recommendation is to avoid it, and remember: anything that can go wrong will go wrong.
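A small sketch of what goes wrong, using SQLite as a stand-in engine (an assumption; quoting rules and the exact error differ per database). The table name user_groups is a hypothetical alternative, not something from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Unquoted, the keyword is rejected by the parser.
error_text = ""
try:
    conn.execute("CREATE TABLE group (id INTEGER)")
    unquoted_ok = True
except sqlite3.OperationalError as exc:
    unquoted_ok = False
    error_text = str(exc)

# Quoted with standard double quotes it works, but every query that
# touches the table now has to remember the quotes.
conn.execute('CREATE TABLE "group" (id INTEGER)')
conn.execute('INSERT INTO "group" (id) VALUES (1)')
rows = conn.execute('SELECT id FROM "group"').fetchall()

# A non-keyword name needs no quoting anywhere.
conn.execute("CREATE TABLE user_groups (id INTEGER)")

print(unquoted_ok, error_text, rows)
```

The quoted form is exactly the part that tends to break when the schema moves to another engine or a tool generates SQL without quoting.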

What is the standard SQL query equivalent to Oracle's 'start with...connect by', but not DBMS specific

I would like to know if there is a generic SQL equivalent to Oracle's hierarchical syntax start with...connect by. I need something which can be used on any database. Here is the sort of query I mean (using Oracle's example EMP table):
SELECT empno , ename , job , mgr ,hiredate ,level
FROM emp
START WITH mgr IS NULL
CONNECT BY PRIOR empno = mgr;
Recursive Common Table Expressions work for many database implementations, but not for MySQL before version 8.0.
There was no way of doing this in MySQL until version 8.0 added recursive CTEs. There are some nasty hacks listed in an article by Mike Hillyer which could be used in other databases as well. But using something as inelegant as the Nested Set model in Oracle just so the same code will run on an old MySQL seems perverse.
The generic way would be CTE, as they are specified in SQL-99, and most flavours of RDBMS support it (even Oracle added recursiveness to its CTEs in 11gR2). The lack of support for CTE in MySQL was raised as a bug in 2006, and it was eventually implemented in MySQL 8.0.
However, it really depends on your business reasons for wanting a generic solution and which database versions you really need to cover. It is a truism of writing database applications which can run on any RDBMS that they run well on none of them.
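For reference, the recursive CTE approach mentioned above can be sketched as follows, run here against SQLite. The rows are a made-up subset in the shape of the EMP table, and lvl is a computed column standing in for Oracle's LEVEL pseudo-column. PostgreSQL and SQLite spell it WITH RECURSIVE, while Oracle and SQL Server omit the RECURSIVE keyword, so even this "generic" form needs a small per-database tweak.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE emp (empno INTEGER PRIMARY KEY, ename TEXT, mgr INTEGER)"
)
conn.executemany(
    "INSERT INTO emp VALUES (?, ?, ?)",
    [
        (7839, "KING", None),
        (7566, "JONES", 7839),
        (7698, "BLAKE", 7839),
        (7788, "SCOTT", 7566),
    ],
)

HIERARCHY_SQL = """
WITH RECURSIVE org (empno, ename, mgr, lvl) AS (
    -- anchor member: plays the role of START WITH mgr IS NULL
    SELECT empno, ename, mgr, 1 FROM emp WHERE mgr IS NULL
    UNION ALL
    -- recursive member: plays the role of CONNECT BY PRIOR empno = mgr
    SELECT e.empno, e.ename, e.mgr, org.lvl + 1
    FROM emp e JOIN org ON e.mgr = org.empno
)
SELECT empno, ename, mgr, lvl FROM org ORDER BY lvl, empno
"""

rows = conn.execute(HIERARCHY_SQL).fetchall()
for row in rows:
    print(row)
```

Note that the ORDER BY here sorts by level, which is not the depth-first order CONNECT BY produces by default.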

SQL Statement Syntax Differences

I was wondering what types of things usually vary between SQL implementations when looking at the query statements. One thing that I thought of was the use of IS NULL in the WHERE clause. See below for an example. I'm writing a query statement parser that handles the statement and queries in a custom language, and I need to account for most of the general differences between the more widely used SQL products.
Oracle Syntax:
SELECT * FROM TABLE WHERE COLUMN_A IS NULL
SELECT * FROM TABLE WHERE COLUMN_A IS NOT NULL
MySQL Syntax?
SQL Server Syntax?
I'm not sure you're going to find a definitive list of all differences. A few things I can think of off the top of my head:
MySQL uses LIMIT while SQL Server uses TOP.
SQL Server is much stricter on GROUP BY operations than MySQL, requiring that all non-aggregated columns from the SELECT appear in the GROUP BY clause.
SQL Server supports a proprietary UPDATE FROM and DELETE FROM syntax that goes beyond the ANSI standard.
Functions that exist in one system but not another. MySQL has FIND_IN_SET and GROUP_CONCAT that don't exist in SQL Server. Likewise, SQL Server has ROW_NUMBER(), which MySQL lacked before version 8.0.
The IS NULL / IS NOT NULL syntax is ANSI standard SQL, and supported in all three of those RDBMS as you have listed it for Oracle.
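As an illustration of the LIMIT versus TOP difference in the list above, here is a minimal sketch of a helper that emits the right spelling per database. The function name, the dialect keys, and the scores table are hypothetical; only the LIMIT form is executed (against SQLite), and the identifiers are interpolated directly, so this is only suitable for trusted, hard-coded names.

```python
import sqlite3


def first_rows_sql(dialect, columns, table, order_by, n):
    """Build a 'first n rows' query in the given dialect's syntax."""
    n = int(n)  # make sure only a number is ever interpolated
    if dialect == "sqlserver":
        return f"SELECT TOP ({n}) {columns} FROM {table} ORDER BY {order_by}"
    if dialect in ("mysql", "sqlite"):
        return f"SELECT {columns} FROM {table} ORDER BY {order_by} LIMIT {n}"
    raise ValueError(f"unsupported dialect: {dialect}")


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE scores (name TEXT, points INTEGER)")
conn.executemany(
    "INSERT INTO scores VALUES (?, ?)",
    [("ann", 30), ("bob", 10), ("cy", 20)],
)

top_two = conn.execute(
    first_rows_sql("sqlite", "name", "scores", "points DESC", 2)
).fetchall()

print(first_rows_sql("sqlserver", "name", "scores", "points DESC", 2))
print(top_two)
```

A parser targeting several products has to handle the same split in reverse: the row limit appears right after SELECT in one dialect and at the very end of the statement in the other.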
IS NULL and IS NOT NULL is the same pretty much everywhere. The main differences for basic queries would relate to function calls, and those are vastly different so you'll have to be more specific there.
There are plenty of things that vary between different RDBMS implementations. Here's a simple example which doesn't use any specific function:
In Oracle, you can update table A from data in table B as follows:
UPDATE A
SET (COL1,COL2) = (SELECT B.COL3, B.COL4 FROM B WHERE B.COL5 = A.COL6)
WHERE A.COL7 = 3
AND EXISTS (SELECT 1 FROM B WHERE B.COL5 = A.COL6);
But in SQL Server the same task can be done as follows:
UPDATE A
SET COL1 = B.COL3, COL2 = B.COL4
FROM B
WHERE B.COL5 = A.COL6
AND A.COL7 = 3;
Additionally, the Oracle syntax is invalid in SQL Server and vice versa, so you can't settle for a common denominator. Writing a parser for this particular syntax is a challenge, so a general parser seems to be a highly non-trivial task.
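For this particular task, a plainer spelling with one scalar subquery per column is sometimes used as a fallback. The sketch below runs it against SQLite only, with made-up rows in A and B; treat its acceptance by any other product as an assumption to verify in that product's manual. It does not make the parsing problem any easier, since a parser still has to recognise the vendor-specific forms when it meets them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE A (COL1 TEXT, COL2 TEXT, COL6 INTEGER, COL7 INTEGER);
    CREATE TABLE B (COL3 TEXT, COL4 TEXT, COL5 INTEGER);
    INSERT INTO A VALUES ('old', 'old', 1, 3);
    INSERT INTO A VALUES ('old', 'old', 2, 3);
    INSERT INTO A VALUES ('old', 'old', 1, 9);
    INSERT INTO B VALUES ('new3', 'new4', 1);
    """
)

# One correlated scalar subquery per target column, plus the same
# EXISTS guard as the Oracle version so unmatched rows are left alone.
conn.execute(
    """
    UPDATE A
    SET COL1 = (SELECT B.COL3 FROM B WHERE B.COL5 = A.COL6),
        COL2 = (SELECT B.COL4 FROM B WHERE B.COL5 = A.COL6)
    WHERE A.COL7 = 3
      AND EXISTS (SELECT 1 FROM B WHERE B.COL5 = A.COL6)
    """
)

rows = conn.execute(
    "SELECT COL1, COL2, COL6, COL7 FROM A ORDER BY COL6, COL7"
).fetchall()
print(rows)
```

Only the row with COL7 = 3 and a matching B row is changed; the price is that the lookup against B is written out once per column.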
You can run both of your IS NULL queries on all of these RDBMSs. This is standard ANSI SQL.
About a decade ago I bookmarked a link, long since broken, to a document entitled, "Levels of Vendor Compliance with ANSI SQL". I've kept it so I can think, "Ah, how quaint." The Standard is now ISO (I = international) and not just ANSI (A = USA). Nobody tries to document this kind of thing for more than one SQL product anymore.
All vendors pay close attention to the SQL Standard and will declare level compliance on a feature-by-feature basis. Even when no such declaration is forthcoming you know they have read the Standard spec, even if it means a conscious decision to extend or to do things completely differently. If you are interested in portability then get used to writing Standard SQL that is implemented by, or similar to syntax in, the SQL products you wish to target.
Take MySQL and SQL Server as examples. I would guess that some MySQL features (e.g. ORDER BY ... LIMIT) are closer to the Standard than SQL Server's (TOP) because MySQL came to the party later and had a Standard spec to follow and no legacy version to stay compatible with. I would guess that other MySQL features (INSERT ... ON DUPLICATE KEY UPDATE) are further from the Standard (SQL Server extends MERGE from the Standard) because they wanted something easier to implement and simpler for users. I would also guess that some MySQL features are close to those in SQL Server in order to poach users!

switching from MySQL to PostgreSQL for Ruby on Rails for the sake of Heroku

I'm trying to push a brand new Ruby on Rails app to Heroku. Currently, it sits on MySQL. It looks like Heroku doesn't really support MySQL and so we are considering using PostgreSQL, which they DO support.
How difficult should I expect this to be? What do I need to do to make this happen?
Again, please note that my DB as of right now (both development & production) are completely empty.
Common issues:
1. GROUP BY behavior. PostgreSQL has a rather strict GROUP BY. If you use a GROUP BY clause, then every column in your SELECT must either appear in your GROUP BY or be used in an aggregate function.
2. Data truncation. MySQL will quietly truncate a long string to fit inside a char(n) column unless your server is in strict mode; PostgreSQL will complain and make you truncate your string yourself.
3. Quoting is different. MySQL uses backticks for quoting identifiers whereas PostgreSQL uses double quotes.
4. LIKE is case insensitive in MySQL (with its default collations) but not in PostgreSQL. This leads many MySQL users to use LIKE as a case insensitive string equality operator.
(1) will be an issue if you use AR's group method in any of your queries or GROUP BY in any raw SQL. Do some searching for column "X" must appear in the GROUP BY clause or be used in an aggregate function and you'll see some examples and common solutions.
(2) will be an issue if you use string columns anywhere in your application and your models aren't properly validating the length of all incoming string values. Note that creating a string column in Rails without specifying a limit actually creates a varchar(255) column so there actually is an implicit :limit => 255 even though you didn't specify one. An alternative is to use t.text for your strings instead of t.string; this will let you work with arbitrarily large strings without penalty (for PostgreSQL at least). As Erwin notes below (and every other chance he gets), varchar(n) is a bit of an anachronism in the PostgreSQL world.
(3) shouldn't be a problem unless you have raw SQL in your code.
(4) will be an issue if you're using LIKE anywhere in your application. You can fix this one by changing a like b to lower(a) like lower(b) (or upper(a) like upper(b) if you like to shout) or a ilike b but be aware that PostgreSQL's ILIKE is non-standard.
There are other differences that can cause trouble but those seem like the most common issues.
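If you do have raw SQL, the quoting difference in (3) is mechanical enough to isolate in one place. Here is a minimal sketch; the function name and dialect keys are hypothetical, and in a Rails app you would normally let the database adapter do the quoting rather than hand-roll it.

```python
def quote_identifier(dialect, name):
    """Quote a table or column name the way the given database expects."""
    if dialect == "mysql":
        # MySQL: backticks, with an embedded backtick doubled.
        return "`" + name.replace("`", "``") + "`"
    if dialect == "postgresql":
        # PostgreSQL: double quotes, with an embedded double quote doubled.
        return '"' + name.replace('"', '""') + '"'
    raise ValueError(f"unsupported dialect: {dialect}")


for dialect in ("mysql", "postgresql"):
    column = quote_identifier(dialect, "order")
    print(f"SELECT {column} FROM line_items")
```

Keep in mind that a double-quoted identifier in PostgreSQL is case sensitive, so the quoted name has to match the case used when the table was created.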
You'll have to review a few things to feel safe:
group calls.
Raw SQL (including any snippets in where calls).
String length validations in your models.
All uses of LIKE.
If you have no data to migrate, it should be as simple as telling your Gemfile to use the pg gem instead, running bundle install, and updating your database.yml file to point to your PostgreSQL databases. Then just run your migrations (rake db:migrate) and everything should work great.
Don't feel you have to migrate to Postgres - there are several MySQL Addon providers available on Heroku - http://addons.heroku.com/cleardb is the one I've had the most success with.
It should be simplicity itself: port the DDL from MySQL to PostgreSQL.
Does Heroku have any schema creation scripts? I'd depend on those if they were available.
MySQL and PostgreSQL are different (e.g. AUTO_INCREMENT columns for MySQL, sequences for PostgreSQL). But the port shouldn't be too hard. How many tables? Tens are doable.