I want to build this query with SSIS components:
SELECT TOP 1 Date_Execution
FROM Action
WHERE Action_Ref=1
ORDER BY Date_Execution
Can I do that without using a Script Component?
Update
This query must be included in a flow task. I need it to calculate a field of a table I'm creating.
Use the Aggregate transform: because the rows are ordered ascending, TOP 1 is simply the minimum Date_Execution. If you insist on starting with the full table contents, use a Conditional Split transform to keep the rows that meet your WHERE clause, then use the Aggregate transform to get the minimum Date_Execution from the remaining rows.
Or just be reasonable about it and use an Execute SQL Task. The database server will be much more efficient than SSIS at getting this result.
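To illustrate why the Aggregate approach gives the same answer as the original query, here is a minimal sketch. It uses an in-memory SQLite database as a stand-in for the real server (so TOP 1 becomes LIMIT 1), and the sample rows are invented for the example.

```python
import sqlite3

# In-memory SQLite stands in for the real server. Table and column names
# come from the question; the sample rows are made up for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Action (Action_Ref INTEGER, Date_Execution TEXT)")
conn.executemany(
    "INSERT INTO Action VALUES (?, ?)",
    [(1, "2014-03-02"), (1, "2014-01-15"), (2, "2013-12-31"), (1, "2014-02-20")],
)

# The original query; SQLite spells TOP 1 as LIMIT 1.
top_one = conn.execute(
    "SELECT Date_Execution FROM Action "
    "WHERE Action_Ref = 1 ORDER BY Date_Execution LIMIT 1"
).fetchone()[0]

# What Conditional Split (the WHERE) followed by Aggregate (MIN) computes.
minimum = conn.execute(
    "SELECT MIN(Date_Execution) FROM Action WHERE Action_Ref = 1"
).fetchone()[0]

print(top_one, minimum)
```

One caveat: the two forms only agree when the filtered rows contain no NULL Date_Execution, since MIN ignores NULLs while an ascending sort places them first.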
Can I use another column in an InList clause?
For example, I have created a variable with the formula below.
IF [query1.column1] inList ([query2.column2]) then SUM([query1.amountColumn])
Else 0
Or is it possible to put a variable after InList in the formula?
If that is not possible, is there an alternative?
I see two possible approaches. I will use the eFashion universe for both solutions.
Solution #1
Here are my 2 queries to begin...
Run your queries. Click on the columns you want to compare, [query1].[column1] and [query2].[column2] in your case; [Query 1].[Month] and [Query 2].[Month] for me. Right-click and merge them. They must be dimensions and of the same data type.
Now create a variable based on [Query 2].[Month Name] which you can filter on to eliminate the results from Query 1 that do not match up to anything in Query 2.
[UV Month Name]=[Query 2].[Month Name]
The key here is that you need to change the Qualification to "Detail" and set the Associated Dimension to the dimension we just merged, by clicking the three dots to its right. Choose [Month Name] not from either query, but from the merged dimension.
Now build out your table with whatever object you want from Query 1 and add in the variable we just created.
Now add a filter on that variable to show only the rows where it is not null.
And you are done.
Pros
Works when limiting query (query2) has a relatively large number of values (compare to Cons for Solution #2).
Cons
More complicated to set up
May run into universe or performance issues related to query being filtered (query1).
Solution #2
Building upon Solution #1, I duplicated Query 1 and renamed it Query 3. Now you can choose "Results from another query" to get the [query1].[column1] InList ([query2].[column2]) logic you want.
If you take this approach then you don't need to do the merge, variable, and filter. The results of the query are filtered before being returned to the report.
Pros
Simple
Cons
The number of values coming from your second query must be relatively small. It varies by database or maybe even your universe. I have found if it is over 1,000 values I get an error when I run the query that it is "too complex".
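As a rough illustration of the logic both solutions are after, the sketch below expresses "sum the amount where column1 appears in query2's column2" in plain SQL, run against an in-memory SQLite database. This is not Web Intelligence syntax, and the tables and sample rows are invented stand-ins for the two report queries.

```python
import sqlite3

# Two tables stand in for the report's two queries; the data is invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE query1 (column1 TEXT, amountColumn REAL);
    CREATE TABLE query2 (column2 TEXT);
    INSERT INTO query1 VALUES ('Jan', 10), ('Feb', 20), ('Mar', 30), ('Jan', 5);
    INSERT INTO query2 VALUES ('Jan'), ('Mar');
""")

# Sum amounts only for rows whose column1 value appears in query2.column2.
# COALESCE supplies the "Else 0" branch when nothing matches.
total = conn.execute(
    "SELECT COALESCE(SUM(amountColumn), 0) FROM query1 "
    "WHERE column1 IN (SELECT column2 FROM query2)"
).fetchone()[0]

print(total)
```

Here 'Feb' is excluded because it has no match in the second table, which is the same effect the not-null filter in Solution #1 produces.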
I am attempting to combine two groupings (sums), EPL and POL, and relabel them as something, say "Other GL". The current output is this. I've tried adding a formula in the criteria, but it is not working. I have also tried adding another column in the design view with a formula alone.
The best way to "combine" data rows for grouping (i.e. sums) is to create a preliminary query which reassigns the individual source rows to a common value. Then use that query as the source for the other queries. Such a preliminary query could be either a nested query (a.k.a. subquery) or a saved query. I personally prefer saved queries since they can be edited and viewed using the standard Access Query Designer, whereas subqueries can only be edited as SQL text.
Without other database schema or SQL statement to work with, all I can show is a SQL snippet showing the altered selection:
SELECT iif(Claims2.Grouping = 'EPL' Or Claims2.Grouping = 'POL', 'Other GL', Claims2.Grouping) As AltGrouping, ...
FROM Claims2
For what it's worth, the same iif() expression could also be inserted directly into your query as a "calculated field": within the query designer, just copy and paste it into the Field cell in place of Grouping. But a saved query that adjusts labels ahead of the final queries can be reused and makes later queries simpler.
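The relabel-then-group idea can be sketched end to end as follows. This runs on SQLite rather than Access, so a CASE expression takes the place of iif(); the Amount column and the sample rows are assumptions made up for the example, since the real schema was not posted.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "Amount" is a hypothetical column; only Claims2.Grouping is from the question.
conn.executescript("""
    CREATE TABLE Claims2 ("Grouping" TEXT, Amount REAL);
    INSERT INTO Claims2 VALUES
        ('EPL', 100), ('POL', 50), ('Auto', 70), ('EPL', 25), ('Auto', 30);
""")

# Inner query: the "preliminary query" that relabels EPL and POL.
# Outer query: the final grouping, which now sees a single 'Other GL' value.
totals = conn.execute("""
    SELECT AltGrouping, SUM(Amount) AS Total
    FROM (
        SELECT CASE WHEN "Grouping" IN ('EPL', 'POL') THEN 'Other GL'
                    ELSE "Grouping" END AS AltGrouping,
               Amount
        FROM Claims2
    )
    GROUP BY AltGrouping
    ORDER BY AltGrouping
""").fetchall()

print(totals)
```

In Access the inner SELECT would be the saved query, and the outer SELECT would simply name that saved query in its FROM clause.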
I want to compare the row counts of two tables from two different connections.
I've tried to get the number of rows for each different table by executing
Select count(*) as count1 from Table1
and
Select count(*) as count2 from Table2
in two different Execute SQL Script steps, as in the following screenshot, but I have no idea how to proceed.
Specifically, I want to get the two counts, compare them, and then branch to success or failure depending on whether they are equal.
How can I achieve this?
It's pretty easy. There is a step called Evaluate rows number in a table. This both gets a row count from a table and tests it against a value. The value can come from a variable in the Job (note Job, not Transform).
So all you need to do is create a variable with a Set variables task, get the row count from one of your tables and then execute the Evaluate rows task. The following job will do just that.
The transform to get the row count for the other table is very simple. Just execute a SELECT COUNT(*) FROM {tblname} in a Table input step and flow the output to a Set variables step in a transform. Be sure to mark the variable as valid in the parent job.
You can also execute SQL against a connection with the JavaScript step, which would avoid creating the transform, but I prefer to avoid scripting when possible.
You can also use the Table Compare step.
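For readers who want to see the comparison logic itself, outside of any job or transform, here is a minimal sketch. It is not Kettle code: two in-memory SQLite databases stand in for the two connections, and the row_count helper and sample rows are invented for the example.

```python
import sqlite3

def row_count(conn, table):
    # The table name is interpolated into the SQL text, so only pass
    # trusted identifiers here (hypothetical helper for this sketch).
    return conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]

# Two separate connections, one per table, as in the question.
conn_a = sqlite3.connect(":memory:")
conn_b = sqlite3.connect(":memory:")
conn_a.execute("CREATE TABLE Table1 (id INTEGER)")
conn_b.execute("CREATE TABLE Table2 (id INTEGER)")
conn_a.executemany("INSERT INTO Table1 VALUES (?)", [(1,), (2,), (3,)])
conn_b.executemany("INSERT INTO Table2 VALUES (?)", [(1,), (2,), (3,)])

count1 = row_count(conn_a, "Table1")
count2 = row_count(conn_b, "Table2")

# Branch on equality, mirroring the success/failure hops of a job.
outcome = "success" if count1 == count2 else "failure"
print(count1, count2, outcome)
```

The job described above does the same thing in two stages: one count is stored in a variable, and the evaluation step compares the other table's count against it.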
I have an SSIS data flow in an SSIS 2012 project.
For every row, I need to calculate, as efficiently as possible, a sum from another table based on some criteria.
It would be something like a lookup, but returning an aggregate of the lookup result.
Is there an SSIS way to do it with components, or do I need to turn to a Script Task or a stored procedure?
Example:
The data flow has a field named LOT.
I need to get the sum(quantity) from table b where dataflow.LOT = tableb.lot
and write it back to a field in the flow.
You just need to use the Lookup Component. Instead of selecting tableb, write a query, thus:
SELECT
B.Lot -- for matching
, SUM(B.quantity) AS TotalQuantity -- for data flow injection
FROM
tableb AS B
GROUP BY
B.Lot;
Now when the package begins, it will first run this query against that data source and generate the quantities across all lots.
This may or may not be a good thing, depending on data volumes and whether the values in tableB are changing. In the larger volume case, if it's a problem, then I'd look at whether I can do something about the above query. Maybe I only need the current year's data. Maybe my list of Lots could be pushed to the remote server beforehand so that it only computes the aggregates I need.
If TableB is very active, then you might need to change your caching from the default of Full to Partial or None. If Lot 10 shows up twice in the data flow, None would perform two lookups against the source, while Partial would cache the values it has seen. Probably; it depends on memory pressure, etc.
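To make the full-cache behaviour concrete, here is a small sketch of the idea: run the aggregate query once, keep the result in memory keyed by Lot, and attach TotalQuantity to each row as it flows past. SQLite stands in for the source database, and the sample data and flow rows are invented; this models the concept, not the Lookup Component's internals.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tableb (Lot INTEGER, quantity INTEGER);
    INSERT INTO tableb VALUES (10, 5), (10, 7), (20, 1), (30, 4), (30, 6);
""")

# "Full cache": the aggregate query runs once, before any row is processed.
cache = dict(conn.execute(
    "SELECT B.Lot, SUM(B.quantity) AS TotalQuantity "
    "FROM tableb AS B GROUP BY B.Lot"
))

# Rows arriving in the data flow (made-up sample); Lot 10 appears twice
# and Lot 99 has no match in tableb.
flow_rows = [{"LOT": 10}, {"LOT": 30}, {"LOT": 10}, {"LOT": 99}]

# Each row is matched against the in-memory cache, not the database.
enriched = [dict(row, TotalQuantity=cache.get(row["LOT"])) for row in flow_rows]
print(enriched)
```

In this sketch the unmatched Lot 99 simply gets None; in a real package, what happens to unmatched rows depends on how the Lookup's no-match handling is configured.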
I have joined 5 tables and applied transformations to them, so I now have a single table at the end. I want to run a SQL query on this single table to filter records, but I don't know how to run a simple SQL query against it. I have attached a snapshot which shows the resulting table. How do I use this resulting data set as the source? I want to populate my destination after filtering this data.
I am using SSIS 2008.
Click here to see the Table on which I want to perform a simple sql query
SELECT * FROM `first_table`
where `some_column` =
(
SELECT `some_column`
FROM second_table
WHERE
`some_column2`='something'
LIMIT 1
)
Try this code; it may help. You can even use it to connect all four of those tables with each other.
From the image you posted, it looks like you have a set of data in the data flow that you're trying to query against. You need to do one of two things at this point. Either insert the data into a table in the database and use another data flow to query it, or use a Conditional Split (or a Multicast and Conditional Splits) to filter the rows down further from there.
Without more detail about what you're actually trying to accomplish, these are the only recommendations I can offer.
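As a sketch of what filtering with a Conditional Split amounts to: each row is tested against a condition and routed to the matching output, with everything else going to the default output. The column names, values, and condition below are purely illustrative, since the actual filter was not stated.

```python
# Rows as they might look mid-flow; columns "a", "b", and "id" are hypothetical.
rows = [
    {"a": "zz", "b": "XX", "id": 1},
    {"a": "zz", "b": "YY", "id": 2},
    {"a": "qq", "b": "XX", "id": 3},
]

def matches(row):
    # The split condition, i.e. the WHERE clause applied row by row.
    return row["a"] == "zz" and row["b"] == "XX"

# Rows that satisfy the condition go on to the destination...
matched = [r for r in rows if matches(r)]
# ...and the rest fall through to the default output.
default_output = [r for r in rows if not matches(r)]

print(matched)
print(default_output)
```

Because the test looks at one row at a time, this works for simple predicates but not for anything that needs to see other rows, which is where staging the data in a table becomes the better option.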
You could send the rows into a record set destination, but you aren't able to query it like a regular table and you'd need some C#/VB skills to access it to do more than a FOR EACH loop.
Assuming the SQL query that you want to run against the resulting table is simple, you can use a Script Component. By simple, I mean a query of this nature:
SELECT * FROM T WHERE a = 'zz' and b = 'XX' etc.
However, if your query has self joins, then you would be better off dumping the result of joining those 5 tables into a physical table and going from there.
It appears that your query is going to be quite straightforward; in that case, using a Script Component would be helpful.
A separate question: it's advisable to do the sorting at the database level. You are using 5 Sort transformations in your solution. Can you explain why?