Symfony MySQL subquery best practice: Native, QueryBuilder or DQL

Here is a simple MySQL query I want to use in a Symfony2 project:
SELECT * FROM
(
    SELECT n.sdate, n.edate FROM `news` n
    UNION
    SELECT ss.sdate, ss.edate FROM `stagesession` ss
) AS sub
ORDER BY sub.sdate
In fact, this query will be a little more complicated, with more aliases, filters and joins with other tables.
Do I have to convert it into a DQL query with the createQueryBuilder, or is the best way simply to use createNativeQuery from Doctrine?

My personal Best Practice with Doctrine is:
Query (QB vs. DQL vs. SQL):
use QB when building the query is conditional rather than just a matter of passing parameters, e.g. if ($onlyActive) $qb->andWhere('x.type = 5'); (I don't like string concatenation)
use QB for compatibility with pagination toolkits
use DQL for simple selects
use SQL if the query is not possible in DQL (e.g. DB-native expressions for MySQL/Oracle/MSSQL, some weird statistics, or hacky queries with UNION or huge subqueries); a sketch of the UNION case follows at the end of this answer
Lastly, you can also use SQL if you only use a small subset of a very huge DB (like when writing some plugin software); if the schema were quite small, you could instead generate some entities from it and revalidate them (for example when you deploy) as a system test. But if the database is too complicated, QB or DQL would be overkill for accessing it anyway, because you would have to define entities just to work with it.
Result (ORM vs. flat):
use ORM in business code wherever possible for maximally readable code (consider lazy loading)
use ORM in complicated nested views (but not for huge tables) to keep the template code clean (consider eager loading)
use flat arrays for read-only tables/lists
use flat arrays for optimization reasons when dealing with lots of data (and caching is not possible)
And always keep in mind that you should write simple code first, and only if it's too slow optimize it with eager/lazy loading, query/result caching and HTTP caching. Finally, if you are dealing with something like database synchronization or a data importer, you may have to use flat arrays or fall back to native implementations, but don't underrate the ORM ;).
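To make the UNION case from the list above concrete: a rough sketch of the kind of statement I would keep in native SQL and run through createNativeQuery with a ResultSetMapping rather than force into DQL (which has no UNION). The extra filter, the joined event table and its columns are invented placeholders, not something from the original question:
SELECT sub.sdate, sub.edate, e.title
FROM
(
    SELECT n.sdate, n.edate, n.event_id FROM `news` n
    UNION
    SELECT ss.sdate, ss.edate, ss.event_id FROM `stagesession` ss
) AS sub
JOIN `event` e ON e.id = sub.event_id    -- hypothetical extra join
WHERE sub.sdate >= :from_date            -- hypothetical extra filter, bound as a parameter
ORDER BY sub.sdate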

Related

Is the performance of raw SQL much better than using spring-data-jpa?

I want to request a large number of records (100000 to 1000000) per select request with a join of three tables. Is the performance much better with native SQL instead of using spring-data-jpa to map the results to @Entity objects?
Thx!
JPA and every ORM turn your query results into domain objects.
That of course takes resources. Spring Data JPA adds potential conversions to that and it preprocesses your query in order to support fancy ways of setting parameters.
If you are selecting large amounts of data the preprocessing of the statement probably doesn't matter that much.
But the conversion to domain objects will.
You used the word "migrating", which sounds like you are going to select data and then immediately write it somewhere else. If that is the case, use plain SQL and work directly on the ResultSet; tell the driver to make it read-only and forward-only. See Understanding Forward Only ResultSet.
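The question mentions a join of three tables; as a rough sketch of the shape of such an export (the tables and columns below are invented, since no schema was given), the point is to keep the whole read in one set-based statement and then walk the ResultSet row by row as described above:
SELECT o.id, o.created_at, c.name, p.sku, p.price
FROM orders o
JOIN customers c ON c.id = o.customer_id
JOIN products  p ON p.id = o.product_id
WHERE o.created_at >= '2020-01-01'   -- whatever restricts the 100k-1M rows being exported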

Is the usage of Scala collection equivalents in Slick operations harmful?

I have tables table_a and table_b in my database and they are mapped in Slick with TableQuery objects. I need to copy a restricted set of data from table_a to table_b.
Let the table query objects be tableQueryA and tableQueryB. The logic for filtering and copying the data is complex, so
I am thinking of fetching the Scala collection equivalents of the table query objects in a for/yield and treating them as normal collections. Everything happens in one transaction. The code looks something like this.
for {
  collA <- tableQueryA.filter(.....something....).result
  collB <- tableQueryB.filter(.....somethingElse.....).result
  // ...... do something with collA and collB
} yield ...something
Is there any harm in doing it this way, i.e. handling them as Scala collections and processing them?
I am using Slick 3.2.
By doing two separate tableQueryX.filter(...).result calls, you'll be executing two separate queries against the database. You could replace them with one query that joins the two tables.
It's hard to say which approach is better in terms of performance, as it depends on the number of filter/where clauses and on what kind of indexes the database can use to satisfy them. If you need top-notch performance, try both approaches and pick the one that is fastest.
If both of your queries yield a large amount of data, you also need to consider your application's memory usage, because all of the data is loaded before the Scala collection API can be used.
I don't see any harm as long as the amount of data is small, but it is better to filter the data at the DB level to avoid potential out-of-memory errors.
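If the filtering rules can be expressed in the query itself, the copy can stay entirely inside the database, which is the "filter at DB level" point above taken to its conclusion. A hedged SQL sketch, with invented columns and conditions standing in for the question's .....something..... placeholders:
-- Copy the restricted set from table_a into table_b without pulling rows into the application.
INSERT INTO table_b (id, name, created_at)
SELECT a.id, a.name, a.created_at
FROM table_a a
WHERE a.status = 'ACTIVE'              -- stands in for .....something.....
  AND a.created_at >= '2020-01-01';    -- stands in for .....somethingElse.....
Whether this is issued as plain SQL or rebuilt as an insert-from-select in Slick, it avoids materializing collA and collB in memory; loading both collections only makes sense when the "do something" step genuinely cannot be expressed in the query.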

Complex filtering in rails app. Not sure complex sql is the answer?

I have an application that allows users to filter applicants based on a very large set of criteria. The criteria are each represented by boolean columns spanning multiple tables in the database. Instead of using ActiveRecord models I thought it was best to use pure SQL and put the bulk of the work in the database. In order to do this I have to construct a rather complex SQL query based on the criteria that the users selected and then run it through AR on the db. Is there a better way to do this? I want to maximize performance while also having maintainable, non-brittle code. Any help would be greatly appreciated.
As @hazzit said, it is difficult to answer without more details, but here's my two cents on this. Raw SQL is usually needed to perform complex operations like aggregates, calculations, etc. However, when it comes to search / filtering features, I often find using raw SQL overkill and not quite maintainable.
The key question here is: can you break down your problem into multiple independent filters?
If the answer is yes, then you should leverage the power of ActiveRecord and Arel. I often find myself implementing something like this in my model:
scope :a_scope, ->{ where something: true }
scope :another_scope, ->( option ){ where an_option: option }
scope :using_arel, ->{ joins(:assoc).where Assoc.arel_table[:some_field].not_eq "foo" }
# cue a bunch of scopes
def self.search( options = {} )
  relation = all
  relation = relation.a_scope if options[:an_option]
  relation = relation.another_scope( options[:another_option] ) unless options[:flag]
  # add logic as you need it
  relation
end
The beauty of this solution is that you declare a clean interface into which you can directly pour all the params from your checkboxes and fields, and that returns a relation. Breaking the query into multiple, reusable scopes helps keep the thing readable and maintainable; using a search class method ties it all together and allows thorough documentation... And all in all, using Arel helps secure the app against injections.
As a side note, this does not prevent you from using raw SQL, as long as the query can be isolated inside a scope.
If this method is not suitable to your needs, there's another option: use a full-fledged search / filtering solution like Sunspot. This uses another store, separate from your db, that indexes defined parts of your data for easy and performant search.
It is hard to answer this question fully without knowing more details, but I'll try anyway.
While databases are bad at quite a few things, they are very good at filtering data, especially when it comes to high volumes.
If you do the filtering in Ruby on Rails (or just about any other programming language), the system will have to retrieve all of the unfiltered data from the database, which will cause tons of disk I/O and network (or interprocess) traffic. It then has to go through all those unfiltered results in memory, which may be quite a burden on RAM and CPU.
If you do the filtering in the database, there is a pretty good chance that most of the records will never actually be retrieved from disk, handed over to RoR, or filtered in your application at all. The main reason indexes even exist is to avoid expensive operations and speed things up. (Yes, they also help maintain data integrity.)
To make this work, however, you may need to help the database do its job efficiently. You will have to create indexes matching your filtering criteria, and you may have to look into performance issues with certain types of queries (how to avoid temporary tables and such). However, it is definitely worth it.
Having said that, there actually are a few types of queries that a given database is not good at doing. Those are few and far between, but they do exist. In those cases, an implementation in RoR might be the better way to go. Even without knowing more about your scenario, I'd say it's a pretty safe bet that your queries are not among those.
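As a sketch of what "indexes matching your filtering criteria" could look like here (table and column names are invented, since the question only says the criteria are boolean columns spread across several tables):
-- Composite indexes covering the flags most often combined in a filter,
-- plus the join key on the related table.
CREATE INDEX idx_applicants_filter
    ON applicants (has_degree, is_licensed, willing_to_relocate);
CREATE INDEX idx_qualifications_applicant
    ON qualifications (applicant_id, is_verified);

-- The filtering itself then stays in the database:
SELECT a.*
FROM applicants a
JOIN qualifications q ON q.applicant_id = a.id
WHERE a.has_degree = TRUE
  AND a.willing_to_relocate = TRUE
  AND q.is_verified = TRUE;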

Challenges with Linq to SQL concept in .NET

Let's say I use the Linq to SQL concept to interact with the database from C#. What challenges might I face, in terms of architecture, performance, type safety, object orientation, etc.?
Basically Linq to SQL generates a class for each table in your database, complete with relation properties and all, so you will have no problems with type safety. The use of C# partials allows you to add functionality to these objects without messing around with Linq to SQL's autogenerated code. It works pretty well.
As tables map directly to classes and objects, you will either have to accept that your domain layer mirrors the database design directly, or you will have to build some form of abstraction layer on top of Linq to SQL. The direct mirroring of tables can be especially troublesome with many-to-many relations, which are not directly supported: instead of Order.Products you get Order.OrderDetails.Select(od => od.Product).
Unlike most other ORMs, Linq to SQL does not just dispense objects from the database and let you store or update objects by passing them back in. Instead it tracks the state of objects loaded from the database and allows you to change that saved state. It is difficult to explain and strange to understand; I recommend you read some of Rick Strahl's blog posts on the subject.
Performance-wise Linq to SQL does pretty well. In benchmarking tests it shows speeds of about 90-95% of what a native SQL reader would provide, and in my experience real-world usage is also pretty fast. Like all ORMs, Linq to SQL is affected by the N+1 selects problem, but it provides good ways to specify lazy/eager loading depending on context.
Also, by choosing Linq to SQL you choose MSSQL. There do exist third-party solutions that allow you to connect to other databases, but last time I checked, none of them appeared very complete.
All in all, Linq to SQL is a good and fairly easy-to-learn ORM which performs okay. If you need features beyond what Linq to SQL offers, take a look at the newer Entity Framework: it has more features, but is also more complex.
We've had a few challenges, mainly from opening the query construction capability to programmers that don't understand how databases work. Here are a few smells:
//bad scaling
//Query in a loop - causes n roundtrips
// when c roundtrips could have been performed.
List<OrderDetail> od = new List<OrderDetail>();
foreach (Customer cust in customers)
{
    foreach (Order o in cust.Orders)
    {
        od.AddRange(dc.OrderDetails.Where(x => x.OrderId == o.OrderId));
    }
}
//no separation of
// operations intended for execution in the database
// from operations intended to be executed locally
var query =
    from c in dc.Customers
    where c.City.StartsWith(textBox1.Text)
    where DateTime.Parse(textBox2.Text) <= c.SignUpDate
    from o in c.Orders
    where o.OrderCode == (OrderCodes)Enum.Parse(typeof(OrderCodes), "Complete")
    select o;
//not understanding when results are pulled into memory
// causing a full table load
List<Item> result = dc.Items.ToList().Skip(100).Take(20).ToList();
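For contrast, the first smell collapses into a single set-based statement on the database side. A rough SQL sketch, assuming Northwind-style table and column names that match the code above:
-- One round trip instead of one query per order:
SELECT od.*
FROM OrderDetails od
JOIN Orders o    ON o.OrderId    = od.OrderId
JOIN Customers c ON c.CustomerId = o.CustomerId
WHERE c.CustomerId IN (1, 2, 3);   -- the customers the loop iterated over
The equivalent single LINQ query would let the provider issue one statement instead of one per iteration.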
Another problem is that one more level of separation from the table structures means indexes are even easier to ignore (that's a problem with any ORM though).

select * math operations

Is it possible to select columns and do complex operations on them, for example select factorial(column1) from table1 or select integral_of(something) from table2?
Perhaps there are libraries that support such operations?
Yes, you can call all the predefined functions of your DB on the selected columns, and you can use CREATE FUNCTION to define your own.
But DBs are meant to wade through huge amounts of data, not to do complex calculations on them. If you try this, you'll find that many operations are awfully slow (especially the user-defined ones).
Which is why most people fetch the data from the database and then do the complex math on the application side. This also makes it simpler to test and optimize the code or replace it with a new version.
Yes, it is. If the function you want is not built into your RDBMS, you can write your own User Defined Functions.
You'll find an example here: http://www.15seconds.com/Issue/000817.htm.
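To make the CREATE FUNCTION route concrete, here is a minimal sketch in MySQL syntax (other RDBMSs use slightly different dialects) for the factorial example from the question:
DELIMITER //
CREATE FUNCTION factorial(n INT) RETURNS BIGINT DETERMINISTIC
BEGIN
    -- Simple iterative factorial; overflows BIGINT for n > 20.
    DECLARE result BIGINT DEFAULT 1;
    WHILE n > 1 DO
        SET result = result * n;
        SET n = n - 1;
    END WHILE;
    RETURN result;
END //
DELIMITER ;

-- Used exactly like a built-in function:
SELECT factorial(column1) FROM table1;
As noted above, a loop like this runs row by row inside the server, so on large result sets it will be noticeably slower than a built-in function.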