Before I go much further down this thought process, I wanted to check whether the idea was feasible at all. Essentially, I have two datasets, each of which will consist of ~500K records. For the sake of discussion, we can assume they will be presented in CSV files for ingesting.
Basically, what I'll need to do, is take records from the first dataset, do a lookup against the second dataset, and then produce an output that essentially merges the two together and produces an output CSV file with the results. The expected number of records after the merge will be in the range of 1.5-2M records.
So, my questions are,
Will Power Automate allow me to work with CSV datasets of those sizes?
Will the "Apply to each" operator function across that large of a dataset?
Will Power Automate allow me to produce the export CSV file with that size?
Will the process actually complete, or will it eventually just hit some sort of internal timeout error?
I know that I can use more traditional services like SQL Server Integration Services for this, but I'm wondering whether Power Automate has matured enough to handle this level of ETL operation.
I have created SSRS Report using pivot and using matrix because in report we are using dynamic column display and sorting and It has 2 millions records. Report executed 2 min but rendering time take 10 minute. Please suggest approach for fast rendering.
Remove any function-related formatting (IIF(Fields!SomeField.Value>0, "white","red")) because SSRS will need to inspect every line. In fact, remove any SSRS functions you can and make SQL do the heavy lifting. I agree with #Alan that the report is still too big to be useful: ask the users what question they're trying to answer with that large of a data set and see if you can come up with a better way of answering it.
A better option for this type of analysis is Excel PowerPivot: let excel import the raw data and do the pivoting.
I am having problems understanding cubes and microcubes in BusinessObjects environment.
Although I have tried to find answers online, I did not find an answers that give overall explanations.
Beside description of the functionality, I would like to know where is cube and where is micro cube located: on server or in browser?
How many cubes/microcubes are there? One microcube per report or one micro cube per session, or sthg else?
Furthermore, can anyone explain the difference in aggregations on database level as opossed to aggregation on report level (when defining a measure, there are two possibilities - to define aggregation on report and/or aggregation level). Although there are some answers online, they are too general. Therefore, I would appreciate a simple explanation with an example.
And finaly, is it possible to color tables in data foundation layer? (since I have a lot of tables in a universe, it would be very helpful if I could color fact and dimensional tables).
It helps to understand the two-pass nature of querying traditional (non-OLAP) data in BO. When you refresh a report in BO, the report engine constructs a SQL statement based on the objects (results and conditions) that are specified in the query.
The SQL statement is sent to the database, and the report engine then retrieves the data that is returned from the database. This data becomes the microcube -- it is nothing more than a tabular representation of the data that was received from the database. If you could look at it visually, it would look identical to what you would get by running the SQL statement in a traditional SQL tool (such as TOAD or SQL*Plus).
The microcube then becomes the source for the report presentation. Your could, for example, create a table in a report by dragging in dimensions and measures from the object list. As you are dragging in these objects, the table will recalculate and redisplay as appropriate. This is all done by retrieving and calculating the data from the microcube, not from the source data. That is, as you are interacting with a report tab, you are not initiating a refresh from the database -- all of the calculation is done from the microcube.
For example, let's say you have created a new query with two dimensions (State, Region) and one measure (Sales). The SQL might look like this:
select state_nm,region_nm,sum(sales_amt)
from all_sales
group by state_nm,region_nm
Note that there is a sum() applied to sales_amt. This is causing the database to perform the initial aggregation on this field.
The microcube that is created after the refresh may look like:
AL North 30000
AL South 40000
AR North 5000
AR South 10000
Now you create a table in your report, and select just State and Sales. The report engine uses the data in the microcube to display the result:
AL 70000
AR 5000
In this case, the BO report engine has performed a sum() aggregation on Sales Amt. This is report-side aggregation.
Then you remove State. Again, the report engine aggregates the table from the microcube, not the database:
75000
The microcube is stored with the report file. If you are working with a WebI report via InfoView, then the microcube is loaded into the server's memory. When you save the report, a physical file is created on the server (with a .wid extension); the microcube is stored in that file. If you open the report again, the microcube is again loaded into the server's memory and is available for interaction.
If you are using Rich Client, then the same behavior applies, it's just using your workstation and local storage.
The term "cube" is generally used to describe data sources in an OLAP environment (SSAS or Essbase, for example). This is external to BO -- BO would be querying data from an OLAP cube.
Regarding the aggregation:
Database aggregation is performed by your RDBMS on the source data, before it is transferred to the client application (e.g. Web Intelligence). You can apply database aggregation by using a statement such as SUM() or COUNT() in the select statement of your measure (in the business layer of your universe). It only makes sense for measures objects, not dimensions.
Changes the data retrieved from the database
Can have a positive impact on performance due to a small dataset being returned
Leverages the database aggregate performance
.
Projection aggregation or report aggregation is the aggregation performed by the client application (e.g. Web Intelligence) after retrieving the data from the database, thus in-memory.
Happens on the fly as the dimensions change to which the measure is projected (hence projection aggregation)
The result set retrieved from the database remains the same
Regarding the table colours: have a look at the tutorial Apply color to tables that share the same information. This should show you how to configure the colours for the tables in the data foundation.
What is the most efficient way to automate both creation and deployment of simple SSRS reports from one underlying query?
An example query might look like
SELECT Name, ID, Date FROM Errorlog
Query could contain quite a few columns and anywhere from 1 to 1 million rows.
The business purpose behind this question is that I have a sizable number of report queries that need to go out as SSRS reports. I also need the capacity to turn any query I write instantly (or within a matter of seconds) into a simple SSRS report. Unfortunately, doing it through BIDS manually (using toolbox items and creating datasets is cumbersome, slow and unnecessarily repetitive. The only thing I am concerned with is making sure interactive page height/width is zero (to allow scrolling) and that columns are autosized.
How would you accomplish this in a way that is smooth and repeatable?
Let me start by saying that I don't think SSRS will not be very good at this. Specifically on two points this may be troublesome.
First, the number of rows may become a problem. One million results is typically a bit much for reporting services 2008 (though it does depend on the context a bit), it's much better at displaying either aggregated data, or a limited number (up to a few thousand - though again: depending on context) of data rows.
Second, a dynamic number of columns being returned by the SQL side will be a problem. There's only two ways around this that I know of:
Have a denormalized data set with a fixed number of columns, and one or more columns that contain the grouping. Then use a matrix to generate columns dynamically in SSRS. This does have a considerable performance impact.
Generate the RDL dynamically. There's information on the schema to do this, and if you create a good starting point it's very possible. After generating the RDL you'll have to execute it - how to do that depends on your specific setup.
Bottom line is that I wouldn't recommend using SSRS for the task you describe. Consider other technologies that may be better up to this task, e.g. SSIS packages, or perhaps another custom made or third party tool?
If I were you, I'd utilize 'Access Data Projects' which have a wizard for creating report.. that is then easy to upsize to Reporting Services. Right-click IMPORT into a solution full of RDL, and it prompts for MS Access file.
You can easily make a couple of columns into a report using an Access wizard, and then upsize to SSRS.. I've done it hundreds upon hundreds of times like this.
I Have created an SSRS Report for retrieving 55000 records using a Stored Procedure. When
executing from the Stored Proc it is taking just 3 seconds but when executing from SSRS report it is taking more than one minute. How can I solve this problem?
The additional time could be due to Reporting Services rendering the report in addition to querying the data. For example if you have 55,000 rows returned for the report and the report server then has to group, sort and/or filter those rows to render the report then that could take additional time.
I would have a look at the way the data is being grouped and filtered in the report, then review your stored procedure to see if you could offload some of that processing to the SQL code, maybe using some parameters. Try and aim to reduce the the amount of rows returned to the report to be the minimum needed to render the report and preferably try to avoid doing the grouping and filtering in the report itself.
I had such problem because of parameters sniffing in my SP. In SQL Management Studio when I run my SP, I recreated it with new execution plan (and call was very fast), but my reports used old bad plan (for another sequence of parameters) and load time was much longer than in SQL MS.
in the ReportServerDB you will find a table called ExecutionLog. you got to look up the catalog id of your report and check the latest execution instance. this can tell you the break-up of the times taken - for data retrieval, for processing, for rendering etc.
Use the SSRS Performance Dashboard reports to debug your issues.
Archaic question but because things like that are kinda recurring, my "quick and dirty" solution to improve SSRS, which works perfectly on large enterprise environments (I am rendering reports that can have up to 100.000+ lines daily) is to properly set the InteractiveSize of the page (for example, setting it to A4 size -21 cm ). When InteractiveSize is set to 0, then all results are going to be rendered as single page and this literally kills the performance of SSRS. In cases like that, queries that can take a few seconds on your DB can take forever to render (or cause an out of memory exception unless you have tons of redundant H/W on your SSRS server).
So, in cases of queries/ SP's that execute reasonably fast on direct call and retrieve large number of rows, set InteractiveSize and you won't need to bother with other, more sophisticated solutions.
I had a similar problem: a query that returns 4,000 rows and runs in 1 second on it's own was taking so long in SSRS that it timed out.
It turned out that the issue was caused by the way SSRS was handling a multi-valued parameter. Interestingly, if the user selected multiple values, the report rendered quickly (~1 second), but if only a single value was selected, the report took several minutes to render.
Here is the original query that was taking more than 100x longer to render than it should:
SELECT ...
FROM ...
WHERE filename IN (#file);
-- #file is an SSRS multi-value parameter passed directly to the query
I suspect the issue was that SSRS was bringing in the entire source table (over 1 million rows) and then performing a client-side filter.
To fix this, I ended up passing the parameter into the query through an expression, so that I could control the filter myself. That is, in the "DataSet Properties" window, on the "Parameters" screen, I replaced the parameter value with this expression:
=JOIN(Parameters!file.Value,",")
... (I also gave the result a new name: filelist) and then I updated the query to look like this:
SELECT ...
FROM ...
WHERE ',' + #filelist + ',' LIKE '%,' + FILENAME + ',%';
-- #filelist is passed to the query as the following expression:
-- =JOIN(Parameters!file.Value,",")
I would guess that moving the query to a stored procedure would also be an effective way to alleviate the problem (because SSRS basically does the same JOIN before passing a multi-value parameter to a stored procedure). But in my case it was a little simpler to handle it all within the report.
Finally, I should note that the LIKE operator is maybe not the most efficient way to filter on a list of items, but it's certainly much simpler to code than a split-string approach, and the report now runs in about a second, so splitting the string didn't seem worth the extra effort.
Obviously getting the report running correctly (i.e. taking the same order of magnitude of time to select the data as SSMS) would be preferable but as a work around, would your report support execution snapshots (i.e. no parameters, or parameter defaults stored in the report)?
This will allow a scheduled snapshot of the data to be retrieved and stored beforehand, meaning SSRS only needs to process and render the report when the user opens it. Should reduce the wait down to a few seconds (depending on what processing the report requires. YMMV, test to see if you get a performance improvement).
Go to the report's properties tab in Report manager, select Execution, change to Render this report from a report execution snapshot, specify your schedule.
The primary solution to speeding SSRS reports is to cache the reports. If one does this (either my preloading the cache at 7:30 am for instance) or caches the reports on-hit, one will find massive gains in load speed.
Please note that I do this daily and professionally and am not simply waxing poetic on SSRS
Caching in SSRS
http://msdn.microsoft.com/en-us/library/ms155927.aspx
Pre-loading the Cache
http://msdn.microsoft.com/en-us/library/ms155876.aspx
If you do not like initial reports taking long and your data is static i.e. a daily general ledger or the like, meaning the data is relatively static over the day, you may increase the cache life-span.
Finally, you may also opt for business managers to instead receive these reports via email subscriptions, which will send them a point in time Excel report which they may find easier and more systematic.
You can also use parameters in SSRS to allow for easy parsing by the user and faster queries. In the query builder type IN(#SSN) under the Filter column that you wish to parameterize, you will then find it created in the parameter folder just above data sources in the upper left of your BIDS GUI.
[If you do not see the data source section in SSRS, hit CTRL+ALT+D.
See a nearly identical question here: Performance Issuses with SSRS
Few things can be done to improve the performance of the report which are as below:
1. Enable caching on the report manager and set a time period to refresh the cache.
2. Apply indexing on all the backend database tables that are used as a source in the report, although your stored procedure is already taking very less time in rendering the data but still applying indexing can further improve the performance at the backend level.
3. Use shared datasets instead of using embedded datasets in the report and apply caching on all these datasets as well.
4. If possible, set the parameters to load default values.
5. Try to reduce the data that is selected by the stored procedure, e.g. if the report contains historical data which is of no use, a filter can be added to exclude that data.
I experienced the same issue. Query ran in SQL just fine but was slow as anything in SSRS. Are you using an SSRS parameter in your dataset? I've found that if you use the report parameter directly in certain queries, there is a huge performance hit.
Instead, if you have a report parameter called #reportParam, in the dataset, simply do the following:
declare #reportParamLocal int
set #reportParamLocal = #reportParam
select * from Table A where A.field = #reportParam
It's a little strange. I don't quite know why it works but it does for me.
One quick thing you may want to look at is whether elements on your report could be slowing down the execution.
For example i have found massive execution time differences when converting between dateTimes. Do any elements on report use the CDate function? If so you may want to consider doing your formatting at the sql level.
Conversions in general can cause a massive slow down so take the time to look into your dataset and see what may be the problem.
This is a bit of a mix of the answers above, but do your best to get the data back from your stored procedure in the simplest and most finished format. I do all my sorting, grouping and filtering up on the server. The server is built for this and I just let reporting services do the pretty display work.
I had the same issue ... it was indeed the rendering time but more specifically, it was because of the SORT being in SSRS. Try moving your sort to the stored procedure and removing any SORT from the SSRS report. On 55K rows, this will improve it dramatically.
Further to #RomanBadiornyi's answer, try adding
OPTION (RECOMPILE)
to the end of your main query in the stored procedure.
This will ensure the query is recompiled for different parameters each time, in case different parameters need a different execution plan.