BIRT DataSet Remove duplicates - duplicates

I know how to remove duplicates from a table using suppress duplicates or through visibility. Is there a way I can remove duplicates from the dataset itself?

If you are using a SQL query to obtain your dataset, then you can change your SELECT to SELECT DISTINCT.
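To see what the change does, here is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in for whatever database feeds the BIRT dataset. The `orders` table and its columns are hypothetical and exist only for illustration.

```python
import sqlite3

# In-memory SQLite database standing in for the real data source (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("Ann", "Oslo"), ("Ann", "Oslo"), ("Bob", "Rome")],
)

# A plain SELECT returns every row, including the repeated one.
all_rows = conn.execute("SELECT customer, city FROM orders").fetchall()

# SELECT DISTINCT collapses rows that are identical in every selected column.
distinct_rows = conn.execute(
    "SELECT DISTINCT customer, city FROM orders ORDER BY customer"
).fetchall()

print(len(all_rows), distinct_rows)
```

Note that DISTINCT compares the whole selected row, so two rows that differ in any selected column are both kept.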

Related

Merge data from two sources into one destination without duplicates

I have data from two different source locations that need to be combined into one. I am assuming I would want to do this with a merge or a merge join, but I am unsure of what exactly I need to do.
Table 1 has the same fields as Table 2 but the data is different which is why I would like to combine them into one destination table. I am trying to do this with SSIS, but I have never had to merge data before.
The other issue that I have is that some of the data is duplicated between the two. How would I keep only one of the duplicated records?
Instead of making an entirely new table which will need to be updated again every time Table 1 or 2 changes, you could use a combination of views and UNIONs. In other words create a view that is the result of a UNION query between your two tables. To get rid of duplicates you could group by whatever column uniquely identifies each record.
Here is a UNION query using Group By to remove duplicates:
SELECT
    MAX(ID) AS ID,
    NAME,
    MAX(going)
FROM
    (
        SELECT
            ID :: VARCHAR,
            NAME,
            going
        FROM
            facebook_events
        UNION
        SELECT
            ID :: VARCHAR,
            NAME,
            going
        FROM
            events
    ) AS merged_events
GROUP BY
    NAME
(Postgres not SSIS, but same concept)
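The view idea can be tried out with a small sketch. It assumes SQLite (through Python's sqlite3 module) instead of Postgres or SSIS, stores the ids as text rather than casting them, and uses made-up rows; the view name `merged_events` is also an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE facebook_events (id TEXT, name TEXT, going INTEGER);
    CREATE TABLE events          (id TEXT, name TEXT, going INTEGER);
    INSERT INTO facebook_events VALUES ('1', 'Concert', 10), ('2', 'Picnic', 4);
    INSERT INTO events          VALUES ('7', 'Concert', 12), ('8', 'Lecture', 30);

    -- The view is evaluated on every read, so it follows changes in both tables.
    CREATE VIEW merged_events AS
    SELECT MAX(id) AS id, name, MAX(going) AS going
    FROM (
        SELECT id, name, going FROM facebook_events
        UNION
        SELECT id, name, going FROM events
    )
    GROUP BY name;
""")

# 'Concert' exists in both tables but appears once in the merged result.
merged = conn.execute(
    "SELECT id, name, going FROM merged_events ORDER BY name"
).fetchall()
print(merged)
```

Because the grouping is on the name only, the id and going values of a merged row may come from different source rows; group on the real unique key if that matters.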
Instead of Merge followed by Sort, use Union All followed by Sort. The Merge transformation needs two sorted inputs, so performance will suffer.
1) Give Source1 and Source2 as inputs to the Union All transformation.
2) Give the output of the Union All transformation to a Sort transformation and check the option to remove rows with duplicate sort values.
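The logic of those two steps can be sketched outside SSIS. This is only an analogy in plain Python, not SSIS code; the sample rows and the choice of `id` as the sort key are assumptions.

```python
from itertools import chain, groupby
from operator import itemgetter

# Hypothetical rows from the two sources; id 1 appears in both.
source1 = [{"id": 3, "name": "Carol"}, {"id": 1, "name": "Ann"}]
source2 = [{"id": 2, "name": "Bob"}, {"id": 1, "name": "Ann"}]

# Step 1: "Union All" simply concatenates the inputs; neither has to be sorted.
unioned = chain(source1, source2)

# Step 2: sort on the key, then keep the first row of each run of equal keys,
# which mirrors removing rows with duplicate sort values.
key = itemgetter("id")
deduped = [next(rows) for _, rows in groupby(sorted(unioned, key=key), key=key)]

print(deduped)
```

Only one sort is needed here, whereas the Merge approach sorts each input before merging and then sorts again to drop duplicates.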
This sounds like a pretty classic merge. Create your source and destination connections. Put in a Data Flow task. Put both sources into the Data Flow. Make sure the sources are both sorted and connect them to a Merge. You can either add in a Sort transformation between the connection and the Merge or sort them using a query when you pull them in. It's easier to do it with a query if that's possible in your situation. Put a Sort transformation after the Merge and check the "Remove rows with duplicate sort values" box. That will take care of any duplicates you have. Connect the Sort transformation to the data destination.
You can do this without SSIS, too.

Mysql sum as last row

Is it possible to have the SUM of all numeric fields in the last row of a result set?
As of now, I'm using a very simple query such as:
SELECT
*,
SUM((UNIX_TIMESTAMP(end) - UNIX_TIMESTAMP(start))/3600)
FROM
times
In SQL you can't have a column that appears in only one row; likewise, you can't have a row that doesn't contain all the columns from the other rows. So having a row that contains something unique is not possible. You can, however, add the calculated column to all rows in the dataset or do the calculation in the calling code after the data is returned.
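Doing the calculation in the calling code could look like the following sketch. It assumes SQLite via Python's sqlite3 module, with `strftime('%s', ...)` standing in for MySQL's UNIX_TIMESTAMP; the sample rows and the `id` column are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "start" and "end" are quoted because END is a keyword in SQLite.
conn.execute('CREATE TABLE times (id INTEGER, "start" TEXT, "end" TEXT)')
conn.executemany(
    "INSERT INTO times VALUES (?, ?, ?)",
    [
        (1, "2024-01-01 09:00:00", "2024-01-01 11:30:00"),
        (2, "2024-01-01 13:00:00", "2024-01-01 14:00:00"),
    ],
)

# Per-row duration in hours, computed by the database.
rows = conn.execute(
    """
    SELECT id, (strftime('%s', "end") - strftime('%s', "start")) / 3600.0 AS hours
    FROM times
    ORDER BY id
    """
).fetchall()

# The grand total is computed by the caller and appended as a final row.
total = sum(hours for _, hours in rows)
rows_with_total = rows + [("TOTAL", total)]

print(rows_with_total)
```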
I think what you are looking for is GROUP BY ... WITH ROLLUP; you will find details on that in the MySQL manual.

How to collapse rows into a comma-delimited list in an SQL Query in MySql

It is relatively simple in T-SQL to concatenate related values into a comma delimited string in an SQL Query (see here: What is the best way to collapse the rows of a SELECT into a string? and here: What is the best way to collapse the rows of a SELECT into a string?). The latter link describes exactly what I need to do, but I need to do it in MySql, and the query that works in T-SQL doesn't work in MySql. Any MySql experts out there know how to do this?
Thanks!
It is called group_concat:
select group_concat(your_id) from your_table
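For a quick experiment, SQLite also has a group_concat aggregate with a comma as the default separator, so the behaviour can be sketched with Python's sqlite3 module. This is a stand-in for MySQL, not MySQL itself; the `group_id` column and the sample values are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE your_table (group_id INTEGER, your_id INTEGER)")
conn.executemany(
    "INSERT INTO your_table VALUES (?, ?)",
    [(1, 10), (1, 11), (2, 20)],
)

# One output row per group, with the ids collapsed into a comma-delimited string.
rows = conn.execute(
    """
    SELECT group_id, group_concat(your_id)
    FROM your_table
    GROUP BY group_id
    ORDER BY group_id
    """
).fetchall()

# The order inside each list is not guaranteed here, so sort after splitting.
collapsed = {gid: sorted(ids.split(",")) for gid, ids in rows}
print(collapsed)
```

Without a GROUP BY, as in the one-line query above, the whole table collapses into a single string.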

How to: ..WHERE STRTOSET(#p1), STRTOSET(#p2)

I am trying to filter a query by two (multi select) parameters.
It works fine when doing this for the first one, but complains when I add the second one.
Is my syntax wrong, or is there a better way to achieve what I want?
MDX WHERE has very little in common with SQL WHERE. MDX WHERE does not affect the number of rows that are returned, just which cube slice the cells are retrieved from.
I would use the FILTER function, since an MDX WHERE clause must be a tuple (cell address), no more, no less, i.e.,
(Dim1.Member, Dim2.Member, etc.)
Hope this helps.
Tried subqueries?
SELECT
[Measures].[YourMeasure]
ON COLUMNS,
[Dimensions].[YourDimension]
ON ROWS
FROM
(SELECT STRTOSET(#p1) ON COLUMNS FROM
(SELECT STRTOSET(#p2) ON COLUMNS FROM
[YourCube] ) )

SSRS Grouping on a field, even when there are duplicates

I have a report in Reporting Services and there is a group that is based around a field value. I want the group to repeat itself on the report as many times as there are rows with that field. The problem is that using Field!field.Value seems to only pull distinct values. Since my dataset has rows that have duplicate values, they are not all showing.
When I declare my parent group, is there a way to tell it to group on every row in the parent group, not just the distinct rows?
Alternatively, is there a list of other options I can use other than just .Value on my field?
What about using the RowNumber function (not adding it into the dataset) as an expression to group on?
I don't have a report in front of me right now, but I think that might work.
Is there a second field that you can use as a "dummy" (aka tie breaker, key, etc.) to include in the grouping to make it unique?
This is how I'd do it.
Edit: after comment.
Can you add a calculated field to the dataset, such as Rownumber to act as one?
Edit 2: I mean in SSRS itself: "Calculated field"
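Why a row number works as a tie breaker can be shown with a small sketch. This is plain Python rather than SSRS, and the sample values are hypothetical; it only illustrates that grouping on the value alone merges duplicates while grouping on value plus row number does not.

```python
from itertools import groupby

rows = ["A", "A", "B"]  # hypothetical values of the grouping field

# Grouping on the value alone merges the duplicates into one group.
by_value = [key for key, _ in groupby(sorted(rows))]

# Adding a row number as a tie breaker makes every grouping key unique,
# so each row becomes its own group instance.
numbered = list(enumerate(rows, start=1))
by_value_and_row = [(value, n) for (n, value), _ in groupby(numbered)]

print(by_value, by_value_and_row)
```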