JPA Portability Between MySQL and MongoDB

If I am using JPA with MongoDB and later I would like to change the database to MySQL, how easy is it to switch from MongoDB to MySQL?
The reason I ask is that I understand MongoDB is a non-relational database and MySQL is a relational one. So, at the time of changing the database, do I need to make lots of changes in the entity classes?

The first thing that must be said is that JPA was designed around RDBMS datastores only, so some aspects (e.g. the query language, joins) are not suited to "other types" of datastore. Consequently, it all comes down to how a particular implementation handles things.
I know that with DataNucleus JPA the impact is very small in terms of the configuration needed. Typically, if an RDBMS-only configuration setting is encountered when using MongoDB, it is simply ignored, so the switch is largely transparent.

Related

AWS RDS MySQL or Postgres - performance-wise and cost-wise?

I want to use AWS for hosting a Django application and use AWS RDS for the database. The application is a kind of blog-like system.
I cannot decide which RDS engine I should choose: MySQL or Postgres, both price-wise and performance-wise, according to AWS's pricing policy.
This can be very broad and may be opinionated, so I will try to keep it short, based on what I have read:
MySQL would be very good for any CMS site, as it works very well with that workload, and MyISAM tables are quite nice here.
From what I have read, PostgreSQL does better than MySQL at:
Multi-application databases
Advanced data modelling
What "advanced data modelling" means is that PostgreSQL is far more mature at complex data modelling than MySQL. It has a very mature, extensible type system, a wide range of procedural languages, and a great deal of flexibility in how these languages can be plugged into existing queries.
If that weren't enough, you can essentially build your data model in PostgreSQL based not only on what information you are storing but also on what information is commonly derived from it. This makes things like non-first-normal-form designs actually sane to use where they are needed. Add collections and multiple inheritance in table structures and you have a very sophisticated data modelling platform; this blog would help you understand it better.
Besides the content management system market, MySQL's other major market is in applications where data is not expected to be exposed to more than one writing application at a time. This leads to a significant difference in handling data validation, etc.
In PostgreSQL validation is always equally strict. If the app expects special error treatment it had better call functions or casts to handle this explicitly.
MySQL, however, places the application in charge of defining the data validation rules. So while PostgreSQL allows the relational and object-relational interface to serve as a public API, in MySQL it is essentially intended to be a private API for the application. This is a huge difference, not readily understood by many people trying to make the choice, and it leads to major differences in application design.
MySQL is a data storage and reporting solution for your application. PostgreSQL is a data centralization, modelling, and reporting solution for your organization. The two are remarkably different.
Now, coming to the second question on pricing: as you can see from the MySQL pricing page and the PostgreSQL pricing page, MySQL is a bit cheaper than PostgreSQL. Reading the above, you can make an informed decision about what would be best for you.
Hope this helps!
I'm gonna offer you a 3rd option: Aurora - try it. It's cheaper than those 2 and is MySQL compatible.
This article may be of help to you when deciding.
For a simple blog-like thingie I'd go with MySQL (or the Aurora MySQL-compatible version).
For data-critical and highly relational solutions I might also consider Postgres (Aurora).

Arel in SQLite and other databases

I am new to databases and Ruby on Rails applications.
I have a question about generating queries from an ORM.
When my database is SQLite and I am using code that creates queries for this database, am I still able to use the same code if I change my database?
In addition, I am using Arel because it provides more ready-made methods for complex queries, and before generating a query I call the method .to_sql.
If I want to use the same code with another database, am I still able to execute the query, perhaps using something other than to_sql?
In general, Ruby on Rails code is portable between databases without doing anything more than adjusting your config/database.yml file (for connection details) and updating your Gemfile (to use the correct database adapter gem).
Database portability is most likely when you do not rely on specific, hardcoded SQL to invoke queries. Instead, use Rails' associations, relations, and query tools wherever possible. Specific SQL often creeps into .where() clauses, so be thoughtful there and minimize/simplify those as much as practical (for instance, multiple simple scopes that can be chained may give you better results than one large, complex scope). Also use Arel's matches when you depend on "LIKE"-type clauses instead of hardcoding LIKE details in .where() calls, because different databases (such as PostgreSQL) handle case sensitivity differently.
Your best defense against surprises upon switching databases is a robust set of automated unit tests (e.g., RSpec, minitest) and integration tests (e.g., Capybara). These are especially important where you are unable to avoid database-specific SQL (say, for optimization or odd/complex queries).
Since SQLite is simpler than most other robust engines like MySQL or Postgres, you're likely to be safe in most operations you depend on. You're most vulnerable when using some advanced or specific feature of a database, but you're also generally more aware when you're doing that, so you can write protective tests to warn you upon switching database engines.

Performing a join across multiple heterogeneous databases, e.g. PostgreSQL and MySQL

There's a project I'm working on, kind of a distributed database thing.
I started by creating the conceptual schema, and I've partitioned the tables such that I may need to perform joins between tables in MySQL and PostgreSQL.
I know I can write some sort of middleware that will break down the SQL queries and issue sub-queries targeting the individual DBs, then merge the results, but I'd like to do this using SQL if possible.
My search so far has yielded this (the Federated storage engine for MySQL), but it seems to work only between MySQL databases.
If it's possible, I'd appreciate some pointers on what to look at, preferably in Python.
Thanks.
It might take some time to set up, but PrestoDB is a valid open-source solution to consider.
See https://prestodb.io/
You connect to Presto with JDBC and send it the SQL; it works out the different connections, dispatches sub-queries to the different sources, then does the final work on the Presto node before returning the result.
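Since the question asks for Python, here is a minimal sketch using the presto-python-client package instead of JDBC. It assumes a Presto coordinator on localhost:8080 with MySQL and PostgreSQL connectors already configured; the catalog, schema, and table names are all illustrative:

```python
# pip install presto-python-client  (imports as 'prestodb')
import prestodb

# Connect to the Presto coordinator; host/port/user are placeholders.
conn = prestodb.dbapi.connect(
    host="localhost",
    port=8080,
    user="demo",
    catalog="mysql",   # default catalog; the query below qualifies tables fully
    schema="shop",
)
cur = conn.cursor()

# One SQL statement joining a MySQL table with a PostgreSQL table;
# Presto splits it into per-source sub-queries and merges the results.
cur.execute("""
    SELECT o.id, c.name, o.total
    FROM mysql.shop.orders AS o
    JOIN postgresql.crm.customers AS c
      ON o.customer_id = c.id
""")
for row in cur.fetchall():
    print(row)
```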
From the Postgres side, you can try using a foreign data wrapper such as mysql_fdw (example). Queries with joins can then be run through various Postgres clients, such as psql, pgAdmin, psycopg2 (for Python), etc.
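As a rough illustration of that approach from Python with psycopg2, here is a sketch assuming mysql_fdw is installed on the Postgres server; the connection details, credentials, and table definitions are placeholders:

```python
# pip install psycopg2-binary
import psycopg2

# Connection string is a placeholder.
conn = psycopg2.connect("dbname=appdb user=postgres")
conn.autocommit = True
cur = conn.cursor()

# One-time setup: expose a MySQL table inside Postgres via mysql_fdw.
cur.execute("CREATE EXTENSION IF NOT EXISTS mysql_fdw")
cur.execute("""CREATE SERVER mysql_srv
               FOREIGN DATA WRAPPER mysql_fdw
               OPTIONS (host '127.0.0.1', port '3306')""")
cur.execute("""CREATE USER MAPPING FOR postgres SERVER mysql_srv
               OPTIONS (username 'mysql_user', password 'secret')""")
cur.execute("""CREATE FOREIGN TABLE mysql_orders (
                   id int, customer_id int, total numeric
               ) SERVER mysql_srv
               OPTIONS (dbname 'shop', table_name 'orders')""")

# An ordinary-looking SQL join now spans both databases.
cur.execute("""SELECT c.name, o.total
               FROM customers c
               JOIN mysql_orders o ON o.customer_id = c.id""")
print(cur.fetchall())
```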
This is not possible with SQL.
Your options are to write your own "middleware", as you hinted at. To do that in Python, you would use the standard DB-API drivers for both databases, issue the individual queries, and then merge their results, as in the sketch below. An ORM like SQLAlchemy will go a long way to help with that.
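A minimal sketch of that middleware approach; the driver choices, connection details, and table/column names are all assumptions for illustration:

```python
# pip install mysqlclient psycopg2-binary
import MySQLdb
import psycopg2

# Connection details are placeholders.
my = MySQLdb.connect(host="127.0.0.1", user="app",
                     passwd="secret", db="shop").cursor()
pg = psycopg2.connect("dbname=crm user=app").cursor()

# Issue one sub-query per database.
my.execute("SELECT id, customer_id, total FROM orders")
orders = my.fetchall()

pg.execute("SELECT id, name FROM customers")
customers = {cid: name for cid, name in pg.fetchall()}

# Merge the results in Python: a simple hash join on customer_id.
joined = [(order_id, customers[cust_id], total)
          for order_id, cust_id, total in orders
          if cust_id in customers]
print(joined)
```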
The other option is to use an integration layer. There are many options out there, though none that I know of are written in Python; Mule ESB, Apache ServiceMix, WSO2, and JBoss MetaMatrix are some of the more popular ones.
You can colocate the data on a single RDBMS node (either PostgreSQL or MySQL, for example).
Two main approaches:
Read-only - you might use read replicas of both source systems, then a process that copies the data to a new writable, converged node; OR
Primary - you might choose one of the two as the primary database and move the data from the other into it using a conversion process (e.g. ETL or off-the-shelf table-level replication).
Then you can just run the query on the one RDBMS, with JOINs as usual.
BONUS: you can also read logs from an RDBMS and ship them through Kafka. You can make this as complex as required.

Try MongoDB or stick to MySQL

I am coding a web portal which stores a lot of user data and, later on, maybe documents. At the moment I use MySQL with many relations. I have read a lot about NoSQL and find it an interesting topic.
Is MongoDB or CouchDB ready to fully replace MySQL? Would anything change in the usage of Doctrine in my application?
Is MongoDB or CouchDB ready to fully replace MySQL?
Sure, lots of people are storing their entire data set in MongoDB instead of MySQL and they are doing fine.
But I do not think that is the correct question. The key questions are really the following:
Does implementing MongoDB improve your system? Fewer queries, more flexibility, better performance?
Are you capable of implementing MongoDB at the appropriate scale?
MongoDB is a tool like many others, and it does not solve all problems. In my experience, most systems are best implemented with some mix of databases. That might mean something like MongoDB for some data and SQL for other data.

Using both MongoDB and MySQL in one project

I have spent a week learning MongoDB in order to use it in my project. In the project, I will store a huge amount of geolocation data, and I think MongoDB is the most appropriate store for it. In addition, speed is very important for me, and MongoDB responds faster than MySQL.
However, I will use some joins for parts of the project, and I'm not sure whether to store the users' information in MongoDB or not. I have heard some issues can occur in MongoDB during the writing process. Should I use only MongoDB with collections (instead of joins), or both of them?
In most situations I would recommend choosing one database for a project, if the project is not huge. On really big projects (or in enterprises generally), I think organizations will, in the long term, use a combination of:
RDBMS for highly transactional OLTP
NoSQL
a data warehousing/BI solution
But for things of more reasonable scope, just pick the one that handles the core of the use case, and use it for everything.
IMO storing user data in MongoDB is fine -- you can do atomic operations on single BSON documents, so operations like "allocate me this username atomically" are doable, as sketched below. With redo logs (--journal, v1.8+), replication, and slave-delayed replication, it is possible to have a pretty high degree of data safety -- on paper as high as other database products. The main argument against its safety would be that the product is new, and older software is always safer.
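A minimal sketch of that atomic username claim with PyMongo; the database, collection, and field names are illustrative:

```python
# pip install pymongo
from pymongo import MongoClient, ASCENDING
from pymongo.errors import DuplicateKeyError

users = MongoClient().myapp.users

# A unique index turns the insert below into an atomic "claim this username".
users.create_index([("username", ASCENDING)], unique=True)

def allocate_username(name):
    try:
        users.insert_one({"username": name})
        return True          # we won the race for the name
    except DuplicateKeyError:
        return False         # somebody else already holds it

print(allocate_username("alice"))  # True on the first call
print(allocate_username("alice"))  # False afterwards
```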
If you need to do very complex ACID transactions -- such as accounting -- use an RDBMS.
Also, if you need to do a lot of reporting, MySQL may be better at the moment, especially if the data set fits on one server. The SQL GROUP BY statement is quite powerful.
You won't be JOINing between MongoDB and MySQL.
I'm not sure I agree with all of your statements. Relative speed is something that's best benchmarked with your use case.
What you really need to understand is what the relative strengths and weaknesses of the two databases are:
MySQL supports the relational model, sets, and ACID; MongoDB does not.
MongoDB is better suited for document-based problems that can afford to forego ACID and transactions.
Those should be the basis for your choice.
MongoDB has some nice features to support geolocation work. It is not, however, necessarily faster out of the box than MySQL. There have been numerous benchmarks indicating that MySQL in many instances outperforms MongoDB (e.g. http://mysqlha.blogspot.com/2010/09/mysql-versus-mongodb-yet-another-silly.html).
Having said that, I've yet to have a problem with MongoDB losing information during writes. I would suggest that if you want to use MongoDB, you use it for the users as well, which avoids cross-database 'associations'; then migrate the users to MySQL only if it becomes necessary.
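For reference, here is a minimal sketch of the geolocation features mentioned above, using PyMongo; the collection, field names, and coordinates are illustrative:

```python
# pip install pymongo
from pymongo import MongoClient, GEOSPHERE

places = MongoClient().geodb.places

# A 2dsphere index enables $near / $geoWithin queries on GeoJSON points.
places.create_index([("loc", GEOSPHERE)])
places.insert_one({
    "name": "Brandenburg Gate",
    "loc": {"type": "Point", "coordinates": [13.3777, 52.5163]},  # [lng, lat]
})

# Find documents within 5 km of a point, nearest first.
nearby = places.find({
    "loc": {"$near": {
        "$geometry": {"type": "Point", "coordinates": [13.40, 52.52]},
        "$maxDistance": 5000,  # metres
    }}
})
for doc in nearby:
    print(doc["name"])
```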