I've developed an application at work using MySQL 5 that uses views to access the major pieces of data. It turns out that our production server runs MySQL 4, which does not support views.
Does anybody have a quick and dirty way to deal with this that doesn't involve rewriting all my code?
This certainly points out the importance of using the same technology in your development and production environments!
Workarounds involving triggers or stored procedures won't work, because these are also not supported on MySQL 4.x.
Your options at this point:
Rewrite application code to duplicate data in denormalized tables, designed to match your views.
Upgrade your production database to MySQL 5.0. If you're talking about a hosting provider, then contact that provider and ask if they have an option for MySQL 5.0, otherwise you need to relocate to a provider who does.
I'd recommend the latter path; it'll be far less work than writing code to manage duplicate data.
Note that MySQL 4.1 was released as production software over four years ago. Active support for this release ended in 2006. Extended support for MySQL 4.1 ends 2009-12-31. See http://www.mysql.com/about/legal/lifecycle/
The quick and very dirty way that comes to mind is to subclass DBI and rewrite the SQL there. It depends on what you're using views for, of course, and on whether you mean MySQL 4.0 (no subqueries) or MySQL 4.1 (has subqueries).
If you're on 4.1, you can turn:
CREATE VIEW foo AS
SELECT a, b, c FROM real_table WHERE fooable = 1;
SELECT * FROM foo;
into
SELECT v1.* FROM (
SELECT a, b, c FROM real_table WHERE fooable = 1
) v1;
At least, the latter syntax works in 5.0.x, I think it should in 4.1.x as well.
If you're on 4.0... well, it won't be as easy.
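The DBI-subclass idea above could look something like this sketch (in Python rather than Perl, for illustration; `VIEW_DEFS` and `rewrite_views` are made-up names, and a regex is no substitute for real SQL parsing):

```python
import re

# Hypothetical sketch only: map each view name to its defining query.
VIEW_DEFS = {
    "foo": "SELECT a, b, c FROM real_table WHERE fooable = 1",
}

def rewrite_views(sql):
    """Replace 'FROM <view>' with 'FROM (<definition>) <alias>'."""
    for name, definition in VIEW_DEFS.items():
        pattern = r"\bFROM\s+%s\b" % re.escape(name)
        replacement = "FROM (%s) %s" % (definition, name)
        sql = re.sub(pattern, replacement, sql, flags=re.IGNORECASE)
    return sql

print(rewrite_views("SELECT * FROM foo"))
# SELECT * FROM (SELECT a, b, c FROM real_table WHERE fooable = 1) foo
```

In the Perl version you would hook this substitution into the subclass's prepare/execute path.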
Ouch. Aside from a DeLorean and a flux capacitor, or upgrading the server, I don't know of any easy way around this issue. A lot of change seems necessary.
Unfortunately, without upgrading to MySQL 5, probably not.
I need to start off by pointing out that I am by no means a database expert. I do know how to program applications in several languages that require database backends, and I'm relatively familiar with MySQL, Microsoft SQL Server and now MemSQL - but again, I'm not a database expert, so your input is very much appreciated.
I have been working on developing an application that has to cross reference several different tables. One very simple example of an issue I recently had, is I have to:
On a daily basis, pull down 600K to 1M records into a temporary table.
Compare what has changed between this new data pull and the old one. Record that information on a separate table.
Repopulate the table with the new records.
Running #2 is a query similar to:
SELECT * FROM (NEW TABLE) LEFT JOIN (OLD TABLE) ON (JOINED FIELD) WHERE (OLD TABLE.FIELD) IS NULL
In this case, I'm comparing the two tables on a given field and then pulling the information of what has changed.
In MySQL (v5.6.26, x64), my query times out. I'm running 4 vCPUs and 8 GB of RAM but note that the rest of my configuration is default configuration (did not tweak any parameters).
In MEMSQL (v5.5.8, x64), my query runs in about 3 seconds on the first try. I'm running the exact same virtual server configuration with 4 vCPUs and 8 GB of RAM, also note that the rest of my configuration is default configuration (did not tweak any parameters).
Also, in MEMSQL, I am running a single node configuration. Same thing for MySQL.
I love the fact that MemSQL has allowed me to continue developing my project, and I'm now running even bigger cross-table calculation queries and views that perform fantastically on MemSQL... but in an ideal world I'd use MySQL. I've already found that I need a different set of tools to manage my instance (MySQL Workbench works relatively well with a MemSQL server, but to actually build views and tables I need the open-source SQL Workbench with the MySQL Java adapter; likewise the Visual Studio MySQL connector works but can be painful at times - for some reason I can add queries but can't add table adapters)... sorry, I'll submit a separate question for that :)
Considering both virtual machines are exactly the same configuration, and SSD-backed, can anyone give me any recommendations on how to tweak my MySQL instance to run big queries like the one above? I understand I can also create an in-memory database, but I've read there might be some persistence issues with doing that - not sure.
Thank you!
The most likely reason this happens is that you don't have an index on the joined field in one or both tables. According to this article:
https://www.percona.com/blog/2012/04/04/join-optimizations-in-mysql-5-6-and-mariadb-5-5/
vanilla MySQL only supports nested-loop joins, which require an index to perform well (otherwise they take quadratic time).
Both MemSQL and MariaDB support so-called hash joins, which do not require indexes on the tables but consume more memory. Since your dataset is negligibly small for modern RAM sizes, that extra memory overhead is not noticeable in your case.
So all you need to do to address the issue is add an index on the joined field in both tables.
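Concretely, the anti-join pattern and the index fix can be sketched like this, using SQLite (Python stdlib) as a stand-in for MySQL; table and column names here are made up:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.executescript("""
CREATE TABLE old_records (id INTEGER, payload TEXT);
CREATE TABLE new_records (id INTEGER, payload TEXT);
INSERT INTO old_records VALUES (1, 'a'), (2, 'b');
INSERT INTO new_records VALUES (2, 'b'), (3, 'c');
-- The fix: index the join field on both sides so the nested-loop join
-- can do an index lookup instead of scanning the whole table per row.
CREATE INDEX idx_old_id ON old_records (id);
CREATE INDEX idx_new_id ON new_records (id);
""")
# Anti-join: rows in the new pull that have no match in the old one.
changed = cur.execute("""
    SELECT n.id, n.payload FROM new_records n
    LEFT JOIN old_records o ON n.id = o.id
    WHERE o.id IS NULL
""").fetchall()
print(changed)  # [(3, 'c')]
```

With roughly a million rows per side, the indexed version turns a quadratic scan into about N indexed lookups, which is why the same query can drop from "times out" to seconds.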
Also, please describe the issues you are facing with the open-source tools when connecting to MemSQL in a separate question, or at chat.memsql.com, so that we can fix them in the next version (I work for MemSQL, and compatibility with MySQL tools is one of our priorities).
I am writing a systems application that has to integrate into an existing older architecture. In order to do this, I have to access a bitmask field in a table; something like so:
SELECT * FROM directory WHERE (status & 64) | (status & 256);
Our existing system runs on MySQL -- and we have a statement similar to the above that works just fine.
However, in my new application which I have to integrate with the existing system, I am using embedded-HSQL in my unit-tests. And for the life of me I cannot figure out how to do bitwise operations in HSQL. Furthermore, even if I am able to figure it out, I am starting to worry that there is not a single statement compatible between both SQL engines.
Any tips on how to go about this? At the moment I'm thinking I'll have to just select everything where status != 0 (limiting the result set, of course) and then use Java to pick out the specific rows matching the statuses I'm targeting. Yikes.
These operations are done using functions in HSQLDB.
http://hsqldb.org/doc/2.0/guide/builtinfunctions-chapt.html#bfc_numeric_functions
See BITOR, BITXOR, BITAND, BITNOT, BITANDNOT functions.
Bitwise operators are not very common in SQL dialects. MySQL is an exception rather than the norm.
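For a sanity check of what the function form has to be equivalent to: SQLite (Python stdlib) accepts the same operator syntax as MySQL, so the original query can be exercised directly; the table contents here are made up, and the HSQLDB translation in the comment is shown but not executed:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE directory (id INTEGER, status INTEGER)")
cur.executemany("INSERT INTO directory VALUES (?, ?)",
                [(1, 64), (2, 256), (3, 1), (4, 64 | 256)])
# MySQL/SQLite operator form:
rows = cur.execute("""
    SELECT id FROM directory
    WHERE (status & 64) | (status & 256)
    ORDER BY id
""").fetchall()
print(rows)  # [(1,), (2,), (4,)]
# The HSQLDB function form should be equivalent (not runnable here):
#   SELECT id FROM directory
#   WHERE BITOR(BITAND(status, 64), BITAND(status, 256)) != 0
```

Note the explicit `!= 0` in the HSQLDB version: unlike MySQL, HSQLDB will not treat a bare integer expression as a boolean in a WHERE clause.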
I never found a better solution to this than just making the effort to replace my embedded database, moving from HSQL to MySQL.
I also took the time to write better unit tests; i.e. exercise my database logic against an actual database, while all other layers use mock database logic.
So that's my takeaway: run your DDL tests against the same database you use in production, and run all your other logic against DDL mocks.
For posterity, here's what I used to build my embedded MySQL:
http://zhentao-li.blogspot.com/2013/06/using-embedded-mysql-database-for-unit.html?m=1
Simple question - what would be better for a medium/large database that requires ACID compliance, in 2012?
I have read it all (well, most of it) about MySQL vs PostgreSQL, but most of those posts relate to versions 4/5.1 and 7/8 respectively, and are quite dated (2008, 2009). It's almost 2012 now, so I guess we could take a fresh look at the issue.
Basically I would like to know if there is anything in PostgreSQL that outweighs the ease of use, availability and larger developer/knowledge base of MySQL.
Is MySQL's query optimizer still stupid? Is it still super slow on very complicated queries?
Hit me! :)
PS. And don't send me to Google or a wiki. I am looking for a few specific points, not an overview, plus I trust Stack Overflow more than some random page with a 'smart guy' shining his light.
Addendum
Size of the project: Say an ordering system with roughly 10-100 orders/day per account, couple of thousand accounts, eventually, each can have several hundred to several thousand users.
Better at: being future-proof and flexible when it comes to growing and changing requirements. Performance is also important, to keep costs low in the hardware department. Also, availability of a skilled workforce would be a factor.
OLTP or OLAP: OLTP
PostgreSQL is a lot more advanced when it comes to SQL features.
Things that MySQL still doesn't have (and PostgreSQL has):
deferrable constraints
check constraints (MySQL 8.0.16 added them, MariaDB 10.2 has them)
full outer join
MySQL silently uses an inner join with some syntax variations:
https://rextester.com/ADME43793
lateral joins
regular expressions don't work with UTF-8 (Fixed with MySQL 8.0)
regular expressions don't support replace or substring (Introduced with MySQL 8.0)
table functions ( select * from my_function() )
common table expressions (Introduced with MySQL 8.0)
recursive queries (Introduced with MySQL 8.0)
writeable CTEs
window functions (Introduced with MySQL 8.0)
function based index (supported since MySQL 8.0.15)
partial index
INCLUDE additional columns in an index (e.g. for unique indexes)
multi column statistics
full text search on transactional tables (MySQL 5.6 supports this)
GIS features on transactional tables
EXCEPT or INTERSECT operator (MariaDB has them)
you cannot use a temporary table twice in the same select statement
you cannot use the table being changed (update/delete/insert) in a sub-select
you cannot create a view that uses a derived table (Possible since MySQL 8.0)
create view x as select * from (select * from y);
statement level read consistency. Needed for e.g.: update foo set x = y, y = x or update foo set a = b, a = a + 100
transactional DDL
DDL triggers
exclusion constraints
key/value store
Indexing complete JSON documents
SQL/JSON Path expressions (since Postgres 12)
range types
domains
arrays (including indexes on arrays)
roles (groups) to manage user privileges (MariaDB has them, Introduced with MySQL 8.0)
parallel queries (since Postgres 9.6)
parallel index creation (since Postgres 11)
user defined data types (including check constraints)
materialized views
custom aggregates
custom window functions
proper boolean data type
(treating any expression that can be converted to a non-zero number as "true" is not a proper boolean type)
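One item from the list above, statement-level read consistency, can be seen in the swap update. A sketch using SQLite (Python stdlib) for illustration, since it follows the standard behaviour here, while MySQL applies SET assignments left to right:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE foo (x INTEGER, y INTEGER)")
cur.execute("INSERT INTO foo VALUES (1, 2)")
# Standard behaviour: every right-hand side sees the row's OLD values,
# so this genuinely swaps x and y. MySQL evaluates SET assignments left
# to right, which would leave x = y = 2 here.
cur.execute("UPDATE foo SET x = y, y = x")
row = cur.execute("SELECT x, y FROM foo").fetchone()
print(row)  # (2, 1)
```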
When it comes to Spatial/GIS features Postgres with PostGIS is also much more capable. Here is a nice comparison.
Not sure what you call "ease of use" but there are several modern SQL features that I would not want to miss (CTEs, windowing functions) that would define "ease of use" for me.
Now, PostgreSQL is not perfect, and probably the most obnoxious task is tuning the dreaded VACUUM process for a write-heavy database.
Is MySQL's query optimizer still stupid? Is it still super slow on very complicated queries?
All query optimizers are stupid at times. PostgreSQL's is less stupid in most cases. Some of PostgreSQL's more recent SQL features (windowing functions, recursive WITH queries etc) are very powerful but if you have a dumb ORM they might not be usable.
Size of the project: Say an ordering system with roughly 10-100 orders/day per account, couple of thousand accounts, eventually, each can have several hundred to several thousand users.
Doesn't sound that large - well within reach of a big box.
Better at: being future proof and flexible when it comes to growing and changing requirements.
PostgreSQL has a strong developer team, with an extended community of contributors. Release policy is strict, with bugfixes-only in the point releases. Always track the latest release of 9.1.x for the bugfixes.
MySQL has had a somewhat more relaxed attitude to version numbers in the past. That may change with Oracle being in charge. I'm not familiar with the policies of the various forks.
Performance is also important, to keep costs low in the hardware department.
I'd be surprised if hardware turned out to be a major component in a project this size.
Also, availability of a skilled workforce would be a factor.
That's your key decider. If you've got a team of experienced Perl + PostgreSQL hackers sat around idle, use that. If your people know Lisp and MySQL then use that.
OLTP or OLAP: OLTP
PostgreSQL has always been strong on OLTP.
My personal viewpoint is that the PostgreSQL mailing list are full of polite, helpful, knowledgeable people. You have direct contact with users with Terabyte databases and hackers who have built major parts of the code. The quality of the support is truly excellent.
As an addition to a_horse_with_no_name's answer, I want to name some features which I like so much in PostgreSQL:
the array data type;
hstore extension - very useful for storing key->value data, possible to create index on columns of that type;
various language extensions - I find Python very useful when it comes to unstructured data handling;
distinct on syntax - I think this one should be ANSI SQL feature, it looks very natural to me (as opposed to MySQL group by syntax);
composite types;
record types;
inheritance;
Version 9.3 features:
lateral joins - one thing I missed from SQL Server (where it's called outer/cross apply);
native JSON support;
DDL triggers;
recursive, materialized, updatable views;
PostgreSQL is a more mature database: it has a longer history, it is more ANSI SQL compliant, and its query optimizer is significantly better. MySQL has several storage engines (MyISAM, InnoDB, in-memory), which are incompatible in the sense that a SQL query that runs on one engine may produce a syntax error when executed on another. Stored procedures are also better in PostgreSQL.
I'm thinking about moving from MySQL to Postgres for Rails development and I just want to hear what other developers that made the move have to say about it.
I'm looking for personal experiences, not a MySQL vs Postgres shootout - just the pros and cons that you yourself have arrived at. Stuff that folks might not necessarily think of.
Feel free to explain why you moved in the first place as well.
I made the switch and frankly couldn't be happier. While Postgres lacks a few things MySQL has (INSERT IGNORE, REPLACE, upsert, and LOAD DATA INFILE, mainly, for me), the features it does have MORE than make up for them. Its stored procedures are so much more powerful, and it's far easier to write complex functions and aggregates in Postgres.
Performance-wise, if you're comparing to InnoDB (which is only fair because of MVCC), then it feels at least as fast, possibly faster - we weren't able to do some real measurements here due to some constraints, but there certainly hasn't been a performance issue. The complex queries with several joins are certainly faster, MUCH faster.
I find you're more likely to get the correct answer to your issue from the Postgres community. Everybody and their grandmother has 50 different ways to do something in MySQL. With Postgres, hit up the mailing list and you're likely to get lots of very very good help.
Any of the syntax and the like differences are a bit trivial.
Overall, Postgres feels a lot more "grown-up" to me. I used MySQL for years and I now go out of my way to avoid it.
Oh dear, this could end in tears.
Speaking from personal experience only, we moved from MySQL solely because our production system (Heroku) runs PostgreSQL. We had custom-built-for-MySQL queries which were breaking on PostgreSQL. So I guess the moral of the story here is to run the same DBMS everywhere, otherwise you may run into problems.
We also sometimes need to insert records über-quick-like. For this, we use PostgreSQL's built-in COPY function, used similarly to this in our app:
query = "COPY users(email) FROM STDIN WITH CSV"
# Be wary of the types of the objects here, they matter.
# For instance, if you set the id to a string it will error.
values = users.map { |user| user["email"].to_s }.join("\n")
raw_connection.exec(query)
raw_connection.put_copy_data(values)
raw_connection.put_copy_end
This inserts ~500,000 records into the database in just under two minutes, and takes around the same time if we add more fields.
Another couple of nice things PostgreSQL has over MySQL:
Full text searching
Geographical querying (PostGIS)
Regular-expression matching: email ~ 'hotmail|gmail' is the rough equivalent of LIKE, and email !~ 'hotmail|gmail' of NOT LIKE. The | indicates an "or" (alternation).
In summary: PostgreSQL is like bricks & mortar, where MySQL is Lego. Go with whatever "feels" right to you. This is only my personal opinion.
We switched to PostgreSQL for several reasons in early 2007 (or was it the year before?). The main reasons were:
SQL support - PostgreSQL is much better for complex SQL-queries, for example with lots of joins and aggregates
MySQL's stored procedures didn't feel very mature
MySQL license changes - dual licensed, open source and commercial, a split that made me wonder about the future. With PG's BSD license you can do whatever you want.
Faulty behaviour - when MySQL was counting rows, it sometimes just returned an approximate value, not the actual row count.
Constraints behaved a bit odd, inserting truncated/adapted values. See http://use.perl.org/~Smylers/journal/34246
The administrative interface PgAdminIII felt more stable and mature than the MySQL counterpart
PostgreSQL is very solid and crash safe in case of an outage
// John
Haven't made the switch myself, but I got bitten a few times by MySQL's lack of transactional schema changes, which Postgres apparently supports.
This would solve those nasty problems you get when you move from your dev environment with SQLite to your MySQL server and realise your migrations screwed up and were left half-done! (No, I didn't do this on a production server, but it did make a mess of our shared testing server!)
When using MySQL 5.1 Enterprise after years of using other database products like Sybase, Informix and DB2, I run into things that MySQL just doesn't do. For example, it can only generate an EXPLAIN query plan for SELECT queries.
What other things I should watch out for?
You may take a look at long list here: MySQL Gotchas
Full outer joins. But you can still emulate one with a left outer join UNION a right outer join.
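The emulation looks roughly like this, sketched against SQLite (Python stdlib) with made-up tables; in MySQL the second branch would be written as a RIGHT OUTER JOIN, but SQLite historically lacks RIGHT JOIN, so the table order is flipped instead:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.executescript("""
CREATE TABLE a (id INTEGER, va TEXT);
CREATE TABLE b (id INTEGER, vb TEXT);
INSERT INTO a VALUES (1, 'a1'), (2, 'a2');
INSERT INTO b VALUES (2, 'b2'), (3, 'b3');
""")
# FULL OUTER JOIN emulated as the union of two one-sided outer joins;
# UNION (without ALL) removes the duplicated matched rows.
rows = cur.execute("""
    SELECT a.id, a.va, b.vb FROM a LEFT JOIN b ON a.id = b.id
    UNION
    SELECT b.id, a.va, b.vb FROM b LEFT JOIN a ON a.id = b.id
""").fetchall()
print(sorted(rows))  # [(1, 'a1', None), (2, 'a2', 'b2'), (3, None, 'b3')]
```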
One thing I ran into late in a project is that MySQL date types can't store milliseconds. Datetimes and timestamps only resolve to seconds! I can't remember the exact circumstances that this came up but I ended up needing to store an int that could be converted into a date (complete with milliseconds) in my code.
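The workaround described above, storing an integer of epoch milliseconds and converting in code, might look like this (Python for illustration; the helper names are made up, not from any library):

```python
from datetime import datetime, timedelta, timezone

# Sketch: keep a BIGINT column of epoch milliseconds instead of a
# DATETIME, and convert at the application boundary.

def millis_to_datetime(ms):
    """Epoch-milliseconds integer (as stored) back to a datetime."""
    return (datetime.fromtimestamp(ms // 1000, tz=timezone.utc)
            + timedelta(milliseconds=ms % 1000))

def datetime_to_millis(dt):
    """Datetime down to epoch milliseconds for storage."""
    return round(dt.timestamp() * 1000)

dt = millis_to_datetime(86_400_123)
print(dt.isoformat())  # 1970-01-02T00:00:00.123000+00:00
```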
MySQL's JDBC driver caches results by default, to the point that it will cause your program to run out of memory (throwing OutOfMemoryError). You can turn it off, but you have to do it by passing some unusual parameters to the statement when you create it:
Statement sx = c.createStatement(
        java.sql.ResultSet.TYPE_FORWARD_ONLY,
        java.sql.ResultSet.CONCUR_READ_ONLY);
sx.setFetchSize(Integer.MIN_VALUE); // signals Connector/J to stream rows
If you do that, the JDBC driver won't cache results in memory, and you can run huge queries. BUT what if you're using an ORM? You don't create the statements yourself, so you can't turn off caching - which basically means you're completely screwed if your ORM is trying to do something involving a lot of records.
If they had some sense, they would make that configurable by the JDBC URL. Oh well.
It doesn't allow for roles or groups to manage user privileges.
It doesn't cost a fortune. Try building a clustered website on multiple machines with multiple cores on an Oracle database. Ouch.
It still doesn't do CHECK constraints!