Considering scaling on Rails, would you write hand-coded SQL? Use the Sequel gem? - mysql

Suppose I wanted to scale a Rails application by distributing its database across different machines based on its authorization rules (location and user roles), so that any resource attributed to a location would sit in a database dedicated to that location.
Should I get down to writing basic SQL, use something like the Sequel gem, or keep the niceness and magic of ActiveRecord?

It is true that raw SQL executes faster than ActiveRecord's nice magical queries. However, when you talk about scaling, the question becomes how manageable those queries will be once the application grows really large.
A great many complicated database operations can be managed well with caching, proper indexing, and proper eager loading, as sketched below. In some cases MySQL views also help performance, and Rails treats MySQL views like ordinary tables. After that, if you can corner the remaining really slow queries, it may be worth converting just those to raw SQL to save some time. Rails also offers caching of database query results, and MySQL has its own query cache. Before executing raw SQL directly, I would make sure these options (and others, like avoiding unnecessary joins, since a join is a resource-intensive operation) cannot give me what I am looking for. Hope this helps.
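For illustration, a minimal sketch of eager loading plus a raw-SQL escape hatch for queries that stay slow even after indexing and caching (Post, author, and comments are hypothetical):

    # N+1 pattern: one query for the posts, then one more per post.
    Post.all.each { |post| puts post.author.name }

    # Eager-loaded: two queries total, however many posts there are.
    Post.includes(:author).each { |post| puts post.author.name }

    # A genuinely slow query can be dropped to raw SQL while still
    # returning ActiveRecord objects:
    popular = Post.find_by_sql([
      "SELECT p.* FROM posts p
       JOIN comments c ON c.post_id = p.id
       GROUP BY p.id HAVING COUNT(c.id) > ?", 100
    ])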

It sounds like you are partitioning your database, which Sequel has built-in support for (http://sequel.rubyforge.org/rdoc/files/doc/sharding_rdoc.html). I recommend using Sequel, but as the lead developer, I'm biased.
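For the per-location layout described in the question, a minimal sketch of that sharding support (shard names and hosts are made up):

    require "sequel"

    # One logical connection; :servers maps shard names to per-location
    # connection options that override the defaults.
    DB = Sequel.connect("mysql2://app:secret@default-db/app",
      servers: {
        us_east: { host: "us-east-db.example.com" },
        eu_west: { host: "eu-west-db.example.com" }
      })

    # Route reads and writes to the shard that owns the location.
    DB[:resources].server(:eu_west).where(location: "eu_west").all
    DB[:resources].server(:us_east).insert(name: "printer", location: "us_east")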

Related

Arel in SQLite and other databases

I am new to databases and Ruby on Rails applications.
I have a question about generating queries from an ORM.
If my database is SQLite and I have written code that generates queries for it, can I still use the same code after changing the database?
In addition, when I use Arel (because it provides more ready-made methods for complex queries), I call the method .to_sql before generating a query.
If I want to use the same code against another database, can I still execute the query? Perhaps using something other than to_sql?
In general, Ruby on Rails code is portable between databases without doing anything more than adjusting your config/database.yml file (for connection details) and updating your Gemfile (to use the correct database adapter gem).
Database portability is most likely when you do not rely on specific, hardcoded SQL to invoke queries. Instead, use Rails' associations, relations, and query tools wherever possible. Specific SQL often creeps in through .where() clauses, so be thoughtful there and minimize/simplify those as much as practical (for instance, multiple simple scopes that can be chained may give you better results than one larger, more complex scope). Also use Arel's matches when you depend on LIKE-type clauses, instead of hardcoding LIKE details in .where() calls, because different databases (such as PostgreSQL) handle case sensitivity differently.
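As a hedged example of that last point (User is a hypothetical model):

    # Hardcoded SQL fragment: the case sensitivity of LIKE varies by engine.
    User.where("name LIKE ?", "#{prefix}%")

    # Arel's matches predicate compiles to LIKE on MySQL/SQLite and to the
    # case-insensitive ILIKE on PostgreSQL.
    users = User.arel_table
    User.where(users[:name].matches("#{prefix}%"))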
Your best defense against surprises when switching databases is a robust set of automated unit tests (e.g., RSpec, minitest) and integration tests (e.g., Capybara). These are especially important where you cannot avoid specific SQL (say, for optimization or odd/complex queries).
Since SQLite is simpler than more robust engines like MySQL or Postgres, the operations you depend on are likely to be safe anyway. You're most vulnerable when using some advanced or engine-specific feature of the database, but you're also generally more aware when you're doing that, so you can write protective tests to warn you when switching database engines.
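To make the first paragraph concrete, here is a minimal sketch of the two files involved when switching engines, assuming SQLite in development and PostgreSQL in production (host and application names are invented):

    # Gemfile -- pick the adapter gem per environment
    gem "sqlite3", group: [:development, :test]
    gem "pg",      group: :production

    # config/database.yml
    development:
      adapter: sqlite3
      database: db/development.sqlite3

    production:
      adapter: postgresql
      host: db.example.com          # hypothetical host
      database: myapp_production
      username: myapp
      password: <%= ENV["MYAPP_DB_PASSWORD"] %>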

Sphinx search or MySQL for a huge database search engine

I'm working on a huge eCommerce solution on top of Symfony2, Doctrine2 and MySQL (maybe a cluster, since we will have a lot of people connected and working on our platform), and I'm trying to decide whether Sphinx or MySQL would be the better search solution, since some data would need to be duplicated in MySQL tables and in Sphinx. Our main goal is performance, so excellent response times are what we're looking for. I'm not an expert in either, so I need some advice from people here based on their experience, maybe some docs or whatever. What path did you take on this?
PS: Take into account that the DB will grow really fast, and the platform will serve the entire world.
Sphinx is usually preferred over MySQL for high-volume searches because it is easy to scale. You will see a small delay on results, to allow it to sync data with MySQL, but even so it's better.
You should also take a good look at the actual queries that will run, and store in Sphinx only the searchable fields along with their IDs. Once you get the IDs from Sphinx, use a MySQL slave to fetch the rest of the (non-searchable) data for listing.
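In Rails terms (the shape is identical under Symfony/Doctrine), a sketch of that two-step lookup; sphinx_search is a hypothetical helper that returns only matching IDs:

    # Step 1: ask Sphinx for IDs only (hypothetical client helper).
    ids = sphinx_search("running shoes", index: "products_core")

    # Step 2: fetch display data by primary key from a MySQL replica,
    # preserving Sphinx's relevance ordering.
    by_id   = Product.where(id: ids).index_by(&:id)
    results = ids.map { |id| by_id[id] }.compact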
Depending on what queries you are using for search, a better solution than Sphinx may be Amazon CloudSearch. We had a hard time implementing it, but it was well worth it, in both time and $$$, and it replaced our Sphinx solution.

Slow remote queries

I am working on a Rails application. I was using SQLite during development and the speed was very fast.
I have started using a remote MySQL database hosted by Amazon and am getting very slow query times. Besides trying to optimize the remote database, is there anything on the Rails side I can do?
Local database access will show a significant speed difference versus remote access. Since you've not provided any specifics I can't zero in on the issue, but I can make a suggestion:
Try caching your queries and views as much as possible. This will reduce the number of queries you need to run. It works especially well for static data like menus.
Optimization is key. Eliminate as many unnecessary queries as you can, and make the remaining queries request only the fields you need, using the select method.
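A hedged sketch of both suggestions, assuming a Country model whose rows rarely change:

    # Request only the columns the view needs instead of SELECT *.
    countries = Country.select(:id, :name).order(:name)

    # Cache the result so the remote round trip happens once, not per request.
    countries = Rails.cache.fetch("countries/all", expires_in: 12.hours) do
      Country.select(:id, :name).order(:name).to_a
    end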
Profile the various components involved. The database server itself is one of them; network latency is another. While there is probably little you can do about the latter, you can probably tweak the former a lot, starting with profiling the queries and moving on to tuning the server itself.
Knowing where to look will help you pick the best approach. As for caching, always keep it in mind, but it can prove quite problematic depending on the nature of your application.

PostgreSQL and PQC

I'm using MySQL with Memcached, but I'm planning to start using PostgreSQL instead of MySQL.
I know Memcached can work with PostgreSQL, but I found this online: PostgreSQL Query Cache (PQC). I've seen a presentation on it, which says it uses Memcached. But I don't understand: Memcached I have to program into my PHP code myself, and PQC not?
What's it all about? Is PQC the same as Memcached, and could it replace Memcached? For example: I have a table with all countries. It never changes, so I want to cache it instead of retrieving it from the database every time. Will PQC do this automatically?
PQC is an implementation of caching that uses Memcached. It sits in front of your database server and caches query results for you. If you are running a lot of identical queries, this will make your database load a whole lot less and your return times a whole lot faster. It is not a substitute for good design of your application, but it can certainly help, and the cost of implementing it is extremely low since it takes advantage of an existing layer of abstraction.
Memcached is a lower level tool. A well designed application will leave you a nice place to put code between the business logic and the database layer to cache results, and this is where you put your memcached calls. In other words, if your code is designed to allow this abstraction, fantastic. Otherwise, you're looking at a lot more work to implement.
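As a sketch of that manual pattern, translated to Ruby with the Dalli memcached client (the cache key and the Country model are invented); PQC does the equivalent transparently, in front of PostgreSQL, keyed on the query text:

    require "dalli"

    CACHE = Dalli::Client.new("localhost:11211")

    # Check the cache first; on a miss, run the query and store the
    # result for a day. This is the code PQC saves you from writing.
    def all_countries
      CACHE.fetch("countries:all", 86_400) do
        Country.order(:name).to_a
      end
    end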

Which is the best database for a Rails application?

I am developing a Rails application that will access a lot of RSS feeds or crawl sites for data (mostly news). It will be something like Google News but with a different approach, so I'll store a lot of news (or news summaries), classify them in different categories and use ranking and recommendation techniques.
Should I go with MySQL?
Is it worthwhile using IBM DB2 pureXML to store the documents?
Also, Ruby search implementations (Ferret, Ultrasphinx and others) would not be needed if I choose DB2. Is that correct?
What are the advantages of PostgreSQL here?
Does it make sense to use CouchDB in this scenario?
I'd like to choose the best option without over-complicating the solution, so I discarded the idea of using two different storage systems (one for the news documents and another for the rest of the data). I'm also considering only free options, so I didn't look at Oracle or MS SQL Server.
pureXML is heavier than plain SQL, so you pay more for each round trip between web server and DB. If you plan to have lots of users, I'd avoid it; you're better off letting your web server cache the responses, thus avoiding generating the XML (RSS) every time, if that is what you were thinking of.
I'd go with MySQL because it's really good at serving and it's totally free. Well, PostgreSQL is too, but I haven't used it, so I can't say.
CouchDB could make sense, but not if you plan on doing OLAP (online analytical processing) on your data; a normal RDBMS will be better at that.
Admitting firstly that I generally don't like mysql, I will say that there has been writing on this topic regarding postgres:
http://oldmoe.blogspot.com/2008/08/101-reasons-why-postgresql-is-better.html
This is always my choice when I need a pure relational database. I don't know whether a document database would be more appropriate for your application without knowing more about it. It does sound like it's something you should at least investigate.
MySQL is probably one of the best options out there: light, easy to install and maintain, multi-platform, and free, with some good free client tools on top of that.
Something to think about: given the nature of your system, you will probably have some tables that grow very large very quickly, so you should think about performance. Note that MySQL supports table partitioning, but only from version 5.1 onwards.
It sounds to me like the application you're going to build could easily become a large-scale web app. I would suggest PostgreSQL, as it is known for its reliability.
You can check out the following link, where Bob Ippolito from MochiMedia explains why they ditched MySQL for PostgreSQL. Although the posts are more than three years old, the issues MySQL 5.1 has had recently suggest they are still relevant.
http://bob.pythonmac.org/archives/category/sql/mysql/
MySQL is good in production. I haven't used PostgreSQL for rails, but it's a good solution as well.
In the dev and test environments I'd start out with SQLite (the default), and perhaps migrate to your target DB in the test environment as you move closer to completion.