I have multiple tables that I need to query with joins, subqueries, pagination, grouping, and ordering. Given Hibernate's limitations, native SQL is sometimes required, and in that case the Hibernate cache does not help. Also, the Hibernate second-level cache is not populated automatically: data is stored only when the DB is accessed, so the first time the second-level cache is empty.
My problem is that I used native SQL to get data with multiple joins, grouping, and ordering, and ended up with a performance issue.
My thoughts: I like using a SQL VIEW to pull the data with all those joins, ordering, and grouping. But a SQL VIEW is like a normal SELECT statement and executes every time it is accessed. Is there any live result set, stored like a table, from which I can just fetch data with select * from ONE_LIVE_RESULT_SET where condition?
Is there any concept like LIVE_RESULT_SET in the SQL world? Any comments?
Use a materialized view
Extract from Wikipedia: http://en.wikipedia.org/wiki/Materialized_view
A materialized view is a database object that contains the results of
a query. For example, it may be a local copy of data located remotely,
or may be a subset of the rows and/or columns of a table or join
result, or may be a summary based on aggregations of a table's data.
Materialized views, which store data based on remote tables, are also
known as snapshots. A snapshot can be redefined as a materialized
view.
Example syntax to create a materialized view in Oracle:
CREATE MATERIALIZED VIEW MV_MY_VIEW REFRESH FAST START WITH SYSDATE
NEXT SYSDATE + 1
AS SELECT * FROM ;
Regards
But a MATERIALIZED VIEW is not live data (in sync with the table); to make it live, it has to be REFRESHed. The question then is when to REFRESH, and during such a refresh one has to wait again. Frequently changing data is another case that suffers. Is there any way to refresh only specific rows?
Any Hibernate experts? Does Hibernate persist the data for queries with multiple or complex joins?
I have seen Hibernate use the second-level cache for session.get(id), but I am not sure about HQL or native SQL with multiple/complex joins. Is it possible to get results from the Hibernate second-level cache for multiple/complex joins?
I understand from this question that SQL language does support calculated columns in views.
I have a requirement where I have a table with multiple columns, and I need to calculate a sorting column in order to simplify my queries. I am thinking of creating a view for my origin table with those sorting columns calculated. But I am afraid that could be a performance nightmare as my table grows bigger.
Does anyone have an idea of how that would affect performance?
Is it possible to create an index on a calculated column in a view?
UPDATE 1:
I am planning on using PostgreSQL, but I am open to other open-source alternatives like MySQL.
UPDATE 2:
As N.B. suggested:
I'm not a Postgres user, but the docs here are showing how to create that view and how to index it. If you're using Postgres and are familiar with it - stick with it. All databases work nearly the same, but if you're more proficient with one - no reason to change it. As for how it affects the performance - be it a view or a query that you construct dynamically - it's the same thing. View is just a huge help when querying, and if you can index it it means some memory will be spent on index. You have to measure.
I now think that materialized views are the way to go for my functional requirements. I can set up a trigger to refresh the materialized view on each and every update to my table, once I confirm this point:
How does REFRESH MATERIALIZED VIEW work? Does it drop the data and recreate the view from scratch, or does it do some kind of differential refresh?
Disclaimer: I have used both MySQL and PostgreSQL Database on a remote server for about 8 months only, and I have a preference for PostgreSQL for your use case.
TL;DR
According to the documentation, the REFRESH MATERIALIZED VIEW command completely replaces the contents of the materialized view by re-running its query (WITH DATA, which is the default).
You can create indexes on a materialized view. The index can be on the calculated fields that are stored in its columns.
You cannot index a (non-materialized) view in PostgreSQL.
You can create different types of materialized views depending on your needs (see the link below).
Long Explanation
A) Materialized Views types and performance
I have a requirement where I have a table with multiple columns, and I need to calculate a sorting column in order to simplify my queries. I am thinking of creating a view for my origin table with those sorting columns calculated. But I am afraid that could be a performance nightmare as my table grows bigger.
If the calculations are very expensive, consider consuming more memory to store the results in materialized views or tables.
A materialized view is like a table that stores the result of a query. On a PostgreSQL materialized view, indexes can be created to speed up queries, and it can be vacuumed to update the meta-data.
The materialized view that PostgreSQL provides is a naive one because you must refresh the data manually with the REFRESH MATERIALIZED VIEW command. According to the documentation, this completely replaces the contents of the materialized view by re-running its query (WITH DATA, which is the default).
After that, you need to consider the performance needed for insert, update, and delete operations:
1. If you have no real-time requirements (i.e. a full table re-population is acceptable), then this option is fine.
2. Otherwise, you might want to see this blog post on different setups of materialized views, some of which allow a lazy refresh of the data (triggers refresh the data row by row):
https://hashrocket.com/blog/posts/materialized-view-strategies-using-postgresql
The second point also applies to MySQL (and is actually the traditional, hand-built way of creating materialized views). To my knowledge, MySQL does not support materialized views out of the box (it requires plugins). The convenience provided in (1) is one of the reasons why I chose PostgreSQL.
Is it possible to create index on a calculated column in a view ?
It is possible to index the columns of a materialized view, just as you do for a table.
B) Window functions in PostgreSQL
The second reason for choosing PostgreSQL over MySQL is that the former provides extended SQL functions (I would like to call them OLAP functions) that help with complex queries such as ranking rows.
I shall leave it to you to explore this option (just do a Google search for "PostgreSQL Window Functions").
As far as I know, MySQL versions before 8.0 have no built-in support for this (you would have to rely on plugins or your own code).
I'd like to test how selecting and loading data in a Java application using Hibernate and MySQL can be optimized.
I divided the results I found into 2 groups:
MySQL
indexes - for sure
stored procedures - is there a difference if select is done in stored procedure?
views - is there a difference if select definition is kept in view?
query cache - does it work only if we do the same select second time?
Hibernate
hibernate cache - is this similar to query cache? how it can be configured?
lazy loading - can it help?
Are there any other ways? I use simple queries with several joins and aggregation functions.
I need to demonstrate the time difference between "before" and "after" optimization.
For more information I tried to read this, but the language is too complicated for me.
Batch fetching is important when you read e.g. a collection. In that case Hibernate can get many rows of the collection in one SQL request, so that it would be faster.
Hibernate caching is a very good solution (read about EHCache, for instance). It can store retrieved data in memory, and if nothing has changed on its side, it can return that data without even asking the SQL engine.
Lazy loading is a must for One to Many associations (you can kill your solution without this). But fortunately it is set by default in Hibernate for such associations.
You can also read about optimistic lock, which is faster than pessimistic lock in many cases.
Last but not least, you should use a proper transaction strategy, so that you don't create new transactions when it is not necessary.
I have some questions on views:
Where are views created/stored in MySQL? Or are they only virtual and deleted after some time period?
When is the data of a view refreshed? (Does it refresh automatically when we insert data into the actual table, or do we have to update the view each time?)
Is using views good, or should we fire the queries each time?
Views are pure metadata. MySQL doesn't copy any data when a view is created, and the view is not deleted after a time.
When you run a select on a view, MySQL (or any other database) runs the query defined at creation time.
There's no performance difference (or almost none) between running a query on a table or on a view.
Some databases, such as Oracle, support something called materialised views. These views do copy the data, so they have to be refreshed to keep the data from becoming stale.
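To make the difference concrete, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in database (SQLite has no materialized views either, so the "materialized" side is just a copied table). The orders table, the view name, and the copy's name are all made up for illustration.

```python
import sqlite3

# In-memory SQLite database used as a stand-in; `orders` and its columns are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, amount INTEGER);
    INSERT INTO orders (customer, amount) VALUES ('ann', 10), ('bob', 5);

    -- A plain view stores only the query text, not its rows.
    CREATE VIEW order_totals_v AS
        SELECT customer, SUM(amount) AS total FROM orders GROUP BY customer;

    -- A copied table stores the rows as they were at copy time.
    CREATE TABLE order_totals_copy AS SELECT * FROM order_totals_v;
""")

# Change the base table after both objects exist.
conn.execute("INSERT INTO orders (customer, amount) VALUES ('ann', 7)")

query = "SELECT total FROM {} WHERE customer = 'ann'"
view_total = conn.execute(query.format("order_totals_v")).fetchone()[0]
copy_total = conn.execute(query.format("order_totals_copy")).fetchone()[0]

print(view_total)  # the view re-runs its query, so it sees the new row
print(copy_total)  # the copy is stale until it is refreshed
```

The view always pays the cost of the query but is never stale; the copy is cheap to read but only as fresh as its last refresh.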
Leaving this as it turned up in Google results.
To see view definitions in MySQL you can use this query:
SELECT * FROM information_schema.VIEWS;
Regards,
James
I have read and followed instructions here:
What is an efficient method of paging through very large result sets in SQL Server 2005? and what becomes clear is I'm ordering by a non-indexed field - this is because it's a generated field from calculations - it does not exist in the database.
I'm using the row_number() technique and it works pretty well. My problem is that my stored procedure does some pretty big joins on a fair bit of data and I'm ordering by the results of these joins. I realise that each time I page it has to call the entire query again (to ensure correct ordering).
What I would like (without pulling the entire result set into the client code and paging there) is that once SQL Server has the whole result set, it could then page through that.
Is there any built-in way to achieve that? - I thought that views might do this but I can't find info on this.
EDIT: Indexed Views will not work for me as I need to pass in parameters. Anyone got any more ideas - I think either I have to use memcached or have a service that builds indexes in the background. I just wish there was a way for SQL Server to get that table and hold onto it whilst it is paged...
I am not very familiar with paging, and without knowing the logic behind your procedure, I can only guess you'd benefit from IndexedViews or #TemporaryTables with Indexes.
You mentioned you were ordering by a non-indexed field that is generated. That information, combined with the fact that your procedure calls the entire query every time, leads me to believe you could make that query an IndexedView. You'd get better performance from accessing it multiple times, and it would also enable you to add an index on the field you're ordering by.
You could also use a #TemporaryTable if it somehow stays alive during your paging requests. Insert the dataset you are working with into a #TemporaryTable; you can then create an index with T-SQL on the generated column.
Indexed Views for SQL Server 2005: http://technet.microsoft.com/en-us/library/cc917715.aspx
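The temporary-table idea can be sketched as follows. This is only an illustration using Python's built-in sqlite3 module rather than SQL Server, and the items table, the price * qty expression, and the function names are assumptions standing in for the real joins and generated field. As in the suggestion above, it only helps if the same connection (and therefore the temp table) stays alive between page requests.

```python
import sqlite3

def build_page_cache(conn):
    """Run the expensive query once and keep its rows in an indexed temp table."""
    conn.execute("DROP TABLE IF EXISTS paged_results")
    conn.execute("""
        CREATE TEMP TABLE paged_results AS
        SELECT id, name, price * qty AS score   -- generated, non-indexed field
        FROM items
    """)
    # Index the generated column so that each page is a cheap ordered range read.
    conn.execute("CREATE INDEX idx_paged_score ON paged_results (score DESC, id)")

def fetch_page(conn, page, page_size):
    """Return one page of ids from the cached result, ordered by the generated field."""
    rows = conn.execute(
        "SELECT id FROM paged_results ORDER BY score DESC, id LIMIT ? OFFSET ?",
        (page_size, page * page_size),
    )
    return [r[0] for r in rows]

# Hypothetical data standing in for the result of the big joins.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT, price INTEGER, qty INTEGER)"
)
conn.executemany(
    "INSERT INTO items VALUES (?, ?, ?, ?)",
    [(1, "a", 2, 3), (2, "b", 5, 1), (3, "c", 1, 10), (4, "d", 4, 2), (5, "e", 3, 1)],
)

build_page_cache(conn)
pages = [fetch_page(conn, p, 2) for p in range(3)]
print(pages)
```

The expensive query runs once in build_page_cache; every later page request touches only the small indexed table.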
How to implement Materialized Views?
If not, how can I implement Materialized View with MySQL?
Update:
Would the following work? This doesn't occur in a transaction, is that a problem?
DROP TABLE IF EXISTS `myDatabase`.`myMaterializedView`;
CREATE TABLE `myDatabase`.`myMaterializedView` SELECT * from `myDatabase`.`myRegularView`;
I maintain LeapDB (http://www.leapdb.com) which adds incrementally refreshable materialized views to MySQL (aka fast refresh), even for views that use joins and aggregation. I've been working on this project for 13 years. It includes a change data capture utility to read the database logs. No triggers are used.
It includes two refresh methods. The first is similar to your method, except a new version is built, and then RENAME TABLE is used to swap the new for the old. At no point is the view unavailable for querying, but 2x the space is used for a short time.
The second method is true "fast refresh", it even has support for aggregation and joins.
LeapDB is significantly more advanced than the FromDual example referenced by astander.
Your example approximates a "full refresh" materialized view. You may need a "fast refresh" view, often used in a data warehouse setting, if the source tables include millions or billions of rows.
You would approximate a fast refresh by instead using insert / update (upsert) joining the existing "view table" against the primary keys of the source views (assuming they can be key preserved) or keeping a date_time of the last update, and using that in the criteria of the refresh SQL to reduce the refresh time.
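A rough sketch of the "last update timestamp" variant, using Python's built-in sqlite3 module as a stand-in for MySQL. The orders table, its updated_at column, and the mview table are hypothetical, and the sketch deliberately ignores deleted source rows, which would need separate handling.

```python
import sqlite3

def fast_refresh(conn, since):
    """Recompute only the groups whose source rows changed after `since`."""
    conn.execute(
        """
        INSERT OR REPLACE INTO mview (customer, total)
        SELECT customer, SUM(amount)
        FROM orders
        WHERE customer IN (SELECT customer FROM orders WHERE updated_at > ?)
        GROUP BY customer
        """,
        (since,),
    )
    conn.commit()

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT,
                         amount INTEGER, updated_at INTEGER);
    -- The primary key lets the upsert overwrite an existing group's row.
    CREATE TABLE mview (customer TEXT PRIMARY KEY, total INTEGER);
    INSERT INTO orders (customer, amount, updated_at)
        VALUES ('ann', 10, 1), ('bob', 5, 1);
    -- Initial full population of the "view table".
    INSERT INTO mview SELECT customer, SUM(amount) FROM orders GROUP BY customer;
""")

# A new source row arrives at (hypothetical) time 2; only ann's group is recomputed.
conn.execute("INSERT INTO orders (customer, amount, updated_at) VALUES ('ann', 7, 2)")
fast_refresh(conn, since=1)
totals = dict(conn.execute("SELECT customer, total FROM mview"))
print(totals)
```

INSERT OR REPLACE is SQLite syntax; in MySQL the same upsert would typically be written with REPLACE INTO or INSERT ... ON DUPLICATE KEY UPDATE.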
Also, consider using table renaming, rather than drop/create, so the new view can be built and put in place with nearly no gap of unavailability. Build a new table 'mview_new' first, then rename the 'mview' to 'mview_old' (or drop it), and rename 'mview_new' to 'mview'. In your above sample, your view will be unavailable while your SQL populate is running.
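The build-then-rename refresh might look like this. It is a sketch only, run against SQLite through Python's built-in sqlite3 module; the orders table and the aggregate query are invented for the example, while the mview, mview_new, and mview_old names follow the description above.

```python
import sqlite3

# Hypothetical source query behind the "materialized view".
SOURCE_QUERY = "SELECT customer, SUM(amount) AS total FROM orders GROUP BY customer"

def full_refresh(conn):
    """Build a fresh copy next to the live table, then swap it in by renaming."""
    conn.execute("DROP TABLE IF EXISTS mview_new")
    conn.execute("DROP TABLE IF EXISTS mview_old")
    # The slow part: readers keep using the old `mview` while this runs.
    conn.execute(f"CREATE TABLE mview_new AS {SOURCE_QUERY}")
    # The fast part: the two renames happen inside one transaction.
    conn.execute("BEGIN")
    conn.execute("ALTER TABLE mview RENAME TO mview_old")
    conn.execute("ALTER TABLE mview_new RENAME TO mview")
    conn.execute("COMMIT")
    conn.execute("DROP TABLE mview_old")

# isolation_level=None puts the connection in autocommit mode,
# so the explicit BEGIN/COMMIT above control the transaction.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE orders (customer TEXT, amount INTEGER)")
conn.execute("INSERT INTO orders VALUES ('ann', 10), ('bob', 5)")
conn.execute(f"CREATE TABLE mview AS {SOURCE_QUERY}")  # initial population

conn.execute("INSERT INTO orders VALUES ('ann', 7)")
full_refresh(conn)
totals = dict(conn.execute("SELECT customer, total FROM mview"))
print(totals)
```

In MySQL the swap can be done in a single statement, RENAME TABLE mview TO mview_old, mview_new TO mview, which performs both renames atomically.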
This thread is rather old, so I will try to re-fresh it a bit:
I've been experimenting and even deployed in production several methods for having materialized views in MySQL. Basically all methods assume that you create a normal view and transfer the data to a normal table - the actual materialized view. Then, it's only a question of how you refresh the materialized view.
Here's what I've had success with so far:
1. Using triggers - you can set triggers on the source tables on which you build the view. This minimizes the resource usage as the refresh is only done when needed. Also, data in the materialized view is realtime-ish.
2. Using cron jobs with stored procedures or SQL scripts - refresh is done on a regular basis. You have more control as to when resources are used. Obviously your data is only as fresh as the refresh rate allows.
3. Using MySQL scheduled events - similar to 2, but runs inside the database.
4. Flexviews - using FlexDC mentioned by Justin. The closest thing to real materialized views.
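As an illustration of the trigger approach, here is a minimal sketch with Python's built-in sqlite3 module standing in for MySQL. The orders and order_totals tables and the trigger names are hypothetical, and MySQL's trigger syntax differs in detail. Only inserts and deletes are handled; an UPDATE trigger would be needed as well.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, amount INTEGER);

    -- The summary table plays the role of the materialized view.
    CREATE TABLE order_totals (customer TEXT PRIMARY KEY, total INTEGER NOT NULL);

    CREATE TRIGGER orders_after_insert AFTER INSERT ON orders
    BEGIN
        -- Make sure the group row exists, then add the new amount to it.
        INSERT OR IGNORE INTO order_totals (customer, total) VALUES (NEW.customer, 0);
        UPDATE order_totals SET total = total + NEW.amount
            WHERE customer = NEW.customer;
    END;

    CREATE TRIGGER orders_after_delete AFTER DELETE ON orders
    BEGIN
        UPDATE order_totals SET total = total - OLD.amount
            WHERE customer = OLD.customer;
    END;
""")

conn.execute(
    "INSERT INTO orders (customer, amount) VALUES ('ann', 10), ('bob', 5), ('ann', 7)"
)
conn.execute("DELETE FROM orders WHERE customer = 'bob'")
totals = dict(conn.execute("SELECT customer, total FROM order_totals"))
print(totals)
```

Each write to the source table pays a small extra cost, and in exchange the summary never needs a bulk refresh.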
I've been collecting and analyzing these methods, with their pros and cons, in my article Creating MySQL materialized views.
I'm looking forward to feedback or proposals for other methods of creating materialized views in MySQL.
According to the MySQL docs and the comments at the bottom of the page, it seems that people are creating views and then creating tables from those views. I'm not sure this is equivalent to creating a materialized view, but it seems to be the only avenue available at this time.