How to remove duplicate values on google data studio - duplicates

I have a dimension (a column from google sheets) called products with the following values:
product = [apple , apple_old_2019, pineapple , pineapple_old_2020, pineapple_old_2017 ...]
I need to regex them to remove the pattern old_**** and then aggregate the values by name.
In Google Sheets I would replace the values and then use the Unique formula, but in Google Data Studio there isn't such function.
I created a custom field called Product_pre with this formula:
REGEXP_EXTRACT(Product , '^(.+?)(_old_[0-9]{2}-[0-9]{4})' )
Then I created another custom field with the following formula:
CASE
WHEN Product_pre_process is null THEN Product
ELSE Product_pre_process
END
The problem is that the result has duplicated values:
product_processed = [apple , apple, pineapple , pineapple, pineapple ...]
How could I fix that?

1) Extract the First Word
The REGEXP_EXTRACT function below does the trick (extracting all the characters from the beginning of each string till the first instance of _):
REGEXP_EXTRACT(Product , "^([^_]*)")
2) Consolidation
If the chart type is a Table, then removing the rest of the dimensions and leaving just the newly created dimension will result in the metric values automatically aggregating based on the two values in the dimension (apple and pineapple).
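The two steps above can be sketched outside Data Studio. This is a minimal Python illustration, not Data Studio code: the sample rows and metric values are made up, and `first_word` is a hypothetical helper that applies the same pattern as the REGEXP_EXTRACT formula.

```python
import re
from collections import defaultdict

# Hypothetical sample rows: (product, metric value). The values are made up.
rows = [
    ("apple", 10),
    ("apple_old_2019", 5),
    ("pineapple", 7),
    ("pineapple_old_2020", 3),
    ("pineapple_old_2017", 2),
]

def first_word(product):
    # Same pattern as the REGEXP_EXTRACT above: everything before the first "_".
    return re.match(r"^([^_]*)", product).group(1)

# Consolidation: sum the metric per extracted name.
totals = defaultdict(int)
for product, value in rows:
    totals[first_word(product)] += value

print(dict(totals))  # {'apple': 15, 'pineapple': 12}
```

Note that this pattern cuts at the first underscore, so a name that itself contains an underscore (for example green_apple) would be shortened to its first part.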


Filter by values from csv file in Tableau

I've got a dimension in my Tableau workbook called discount codes. This dimension holds 30,000 strings. I've also got separate csv files that hold hundreds of discount codes.
In Tableau I want to filter out the values from a single csv file.
I have tried to create a filter and just paste the discount codes in a list:
When I select every single value manually it works. But when I paste the whole list Tableau can't match the discount codes.
Is there any way to filter the values without selecting every single value?
You could do this with Excel (for speed) and a calculated field: use Excel to write the calculated field formula. You already have the list of discount codes in Excel, so use it to build a calculated field containing one giant CASE statement.
Assuming your list of exclusions starts in cell A1, cell B1 would be
="WHEN '"&A1&"' THEN 1 "
The formula in cell B2:
=B1 & "WHEN '"&A2&"' THEN 1 "
Drag that formula to the end, and you should then have the contents of a large CASE statement. Copy the final cell as values, then paste the text into a Tableau calculated field.
Start the calculated field with:
CASE [Discount Codes]
*pasted value*
END
All being well, you can use that calculated field as a data source filter and exclude 1.
Note I haven't tested this so watch out for bracket errors, etc.
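The same text can also be generated with a short script instead of Excel. This is a minimal Python sketch under stated assumptions: `build_case` is a hypothetical helper, the codes shown are made up, and it assumes no code contains a single quote.

```python
def build_case(codes, field="Discount Codes"):
    # One "WHEN '<code>' THEN 1" branch per code, mirroring the Excel formula.
    # Assumption: the codes contain no single quotes.
    branches = " ".join(f"WHEN '{code}' THEN 1" for code in codes)
    return f"CASE [{field}]\n{branches}\nEND"

# Hypothetical discount codes, for illustration only.
print(build_case(["SAVE10", "SPRING5"]))
```

The printed text can then be pasted into the calculated field in the same way as the value copied from Excel.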

Cumulative data series displays error in a table in Power BI

I would like to display plan and fact cumulative data series in a dashboard with a bar and line combined chart and a table next to each other using Power BI Version: 2.59.5135.781 64-bit (2018. June) edition.
My DAX formula looks like this:
CUMULATIVE_FACT = CALCULATE(
SUM('FACT_TABLE'[FACT_VALUE]);
FILTER(
ALL('DATES');
'DATES'[YEAR]=MAX('DATES'[YEAR]) &&
'DATES'[DATE]<=MAX('DATES'[DATE])
)
)
Which works fine and gives a result as such (bars displayed as TÉNY refer to cumulative fact)
The cumulative plan (line referred to as TERV) series is identical to this but with plan figures. Also you can change the year so the aggregation only runs for the current year.
However, I would like to display either null (blank) or zero values for the fact series after a certain date which is given as a parameter. This parameter value is stored in a table with a single column and single row in a date type value.
So I modified my formula as such
CUMULATIVE_FACT = IF(VALUES('DATES'[DATE])<= MAX(PARAMETER_TABLE[PARAMETER_DATE]);
CALCULATE(
SUM('FACT_TABLE'[FACT_VALUE]);
FILTER(
ALL('DATES');
'DATES'[YEAR]=MAX('DATES'[YEAR]) &&
'DATES'[DATE]<=MAX('DATES'[DATE])
)
); 0)
The formula works fine for the chart but my table visual gives an error.
So the chart looks okay, perfectly the way I would like to display it, but the table gives back a 'A table of multiple values was supplied where a single value was expected' error message
The column referred to in the message is basically the CUMULATIVE_FACT measure, I just changed it for ease of understanding. I tried with BLANK() instead of 0, but it looks the same.
No idea why it is not working with the table visual. Any ideas?
The problem is coming from this piece:
VALUES('DATES'[DATE])
This returns all values in the current filter context, not just a single one. That's why you're getting
A table of multiple values was supplied where a single value was expected
when you try to compare it to MAX(PARAMETER_TABLE[PARAMETER_DATE]).
It works in the chart since VALUES('DATES'[DATE]) is always a single value that corresponds to the month on the axis, whereas the table has a total line that encompasses multiple months.
I think if you just turned off the total line, it would be OK. Otherwise, change VALUES('DATES'[DATE]) to an expression that returns a single date in the way you want. For example, MAX('DATES'[DATE]) might work.
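The difference between a single-month row and the total row can be illustrated in plain Python. This is only a sketch of the logic, not DAX: the fact values and the parameter date are made up, and `cumulative_fact` is a hypothetical function that takes the dates in the current filter context as a list.

```python
from datetime import date

# Hypothetical monthly fact values; the numbers are made up.
facts = {
    date(2018, 1, 1): 10,
    date(2018, 2, 1): 20,
    date(2018, 3, 1): 30,
}
parameter_date = date(2018, 2, 1)  # stand-in for PARAMETER_TABLE[PARAMETER_DATE]

def cumulative_fact(context_dates):
    # Collapse the filter context to one date first, as MAX('DATES'[DATE]) would.
    current = max(context_dates)
    if current > parameter_date:
        return 0
    # Running total within the same year up to the current date.
    return sum(v for d, v in facts.items()
               if d.year == current.year and d <= current)

print(cumulative_fact([date(2018, 2, 1)]))  # single month row: 30
print(cumulative_fact(list(facts)))         # total row, several dates in context: 0
```

Because the context is reduced to one date before the comparison, the total row no longer fails. In this sketch it evaluates to 0, since its latest date falls after the parameter date.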

SSRS 2012 csv export - dynamic column header name

I need to export my SSRS report to csv format. The issue I am facing is that a few column header names change with the country code (a parameter), and I want the exported csv to show the same names. I have gone through other related questions and topics, specifically this one. It suggested a workaround of setting the data value to null. I have tried this by adding columns and hiding them based on country code, along with setting the data value to null for that dataset field, but it did not work. I still get the hidden column in my export, with no values in it.
Can someone confirm that this workaround works or is there any other way apart from creating different reports for each country?
UPDATE: (added report screenshot and description for clarification)
Based on the approach I have taken,
I need to show only the column 'Town' for Country A and only 'Suburb' for Country B. This is easily done by hiding columns based on the Country parameter (and works fine for the EXCEL export), but when exported to CSV, both columns are exported.
UPDATE 2
Found this link which is similar to what I have been trying to figure out and looks like there is no solution which can be achieved using only one report.
According to this post
MSDN Social: Hide CSV columns conditionally in SSRS report
The XML and CSV renderers use the DataElementOutput property to
control visibility. We can select which item we want to hide in the
report, and set the “DataElementOutput” property with value “NoOutput”
to work around the issue.
Alternatively, we can add below expression to control the item’s
visibility. Please refer to the expression below:
=IIF(Globals!RenderFormat.Name="CSV",True,False)
This, incidentally, is the second answer in the thread that you posted. That answer links to the following article which explains how to do exactly what you need
Hide/Show Items Dependant On Export Format
UPDATE
It appears that you cannot programmatically set DataElementOutput for CSVs. However, according to this post SSRS - Programatically controlling the DataElementOutput property
in RS 2005 / 2008, you should be able to get the desired effect by adding a filter on the tablix directly (in addition to the visibility condition):
Filter expression: =(Parameters!DataPeriod.Value = "DAY")
Filter operator:    =
Filter value:     =true
Thereby, for the cases where the tablix is not visible, you are also filtering out all the data.
You may be able to show/hide country specific columns this way instead.
My approach would be to do this in SQL. I'm not sure how you determine which Countries require the town to be returned and which require the suburb to be returned but here's a couple of approaches that hopefully cover your scenario.
In either case, the idea is to return the town/suburb data in the same column, plus an additional column containing a caption that we can use as the column header.
a. Only one of either the town or suburb column in your table is populated. In this case it's pretty simple.
SELECT
Country
, ISNULL(TownOrCity, StreetSuburb) AS TownSuburb
, CASE WHEN TownOrCity IS NULL THEN 'Street-Suburb' ELSE 'Town-City' END AS Caption
FROM myTable
b. You know upfront which Countries require what and you can pass a parameter in to get the correct column. In this example we'll use a parameter called @TS and pass in either a T or an S
SELECT
Country
, CASE @TS WHEN 'T' THEN TownOrCity ELSE StreetSuburb END as TownSuburb
, CASE @TS WHEN 'T' THEN 'Town-City' ELSE 'Street-Suburb' END AS Caption
FROM myTable
Whichever approach we take you will end up with a simple table
Country   TownSuburb  Caption
Testland  TownA       Town
Testland  TownB       Town
Testland  TownC       Town
In your report, you don't need to do anything except make the town/suburb column caption an expression, something like =FIRST(Fields!Caption.Value)
That's it, the report is now nice and simple and should export without any issues.
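The data shape described above (one shared value column plus a caption column that supplies the header) can be illustrated in plain Python. This sketch only shows the idea; it says nothing about how the SSRS CSV renderer chooses header names. The rows are made up and `to_csv` is a hypothetical helper.

```python
import csv
import io

# Hypothetical query result: one shared value column plus a caption column.
rows = [
    {"Country": "Testland", "TownSuburb": "TownA", "Caption": "Town-City"},
    {"Country": "Testland", "TownSuburb": "TownB", "Caption": "Town-City"},
]

def to_csv(rows):
    out = io.StringIO()
    writer = csv.writer(out)
    # Take the header from the first row's caption, like =FIRST(Fields!Caption.Value).
    writer.writerow(["Country", rows[0]["Caption"]])
    for row in rows:
        writer.writerow([row["Country"], row["TownSuburb"]])
    return out.getvalue()

print(to_csv(rows))
```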
UPDATE to method:
--
-- Dump data into a temp table
--
SELECT
Country
, CASE @TS WHEN 'T' THEN TownOrCity ELSE StreetSuburb END as TownSuburb
INTO #t
FROM myTable
--
--rename the column
--
DECLARE @NewColumnname sysname = CASE @TS WHEN 'T' THEN N'Town-City' ELSE N'Street-Suburb' END
EXECUTE tempdb..sp_rename N'tempdb..#t.[TownSuburb]', @NewColumnname, 'COLUMN'
--
-- finally get the result
--
SELECT * FROM #t

SSRS Reporting multi value parameters

I have an SSRS report that gives me the prices of multiple products. My parameter is not a drill-down; I have to type in the values (since I have a large range of product numbers).
Now my question is: how can I get the last entered product (parameter) to always appear at the bottom of the report? This would help me know where to look for the latest product in the report. For example, I have product numbers like:
abc-234,
abc-570,
ght-908,
What I want is for the most recently entered product number, ght-908 here, to appear at the bottom of the SSRS report. Right now the report covers all the products, but they are all over the place and I have to squint to find the most recently entered product number (parameter). I have also tried to stop the parameters from being refreshed every time I add a product number.
Assuming your parameter name is MyParameter, in the report designer (BIDS) just drop a textbox onto the report below the data (e.g. the Table) and put the following expression into its value:
=Parameters!MyParameter.Value.Split(",")(Parameters!MyParameter.Value.Split(",").Length - 1)
It splits the parameter list and grabs the last value.
This expression works for me:
=Trim(Right(Parameters!Product_Number.Value
, InStr(StrReverse(Parameters!Product_Number.Value), ",") - 1))
Trim might not be strictly necessary but is useful as it will work if the values are split with spaces as well as commas, or not.
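The logic of both expressions (take whatever follows the last comma and trim it) can be sketched in Python. This is an illustration rather than SSRS code: `last_entered` is a hypothetical helper, and skipping empty pieces left by a trailing comma is an added assumption that neither expression above makes.

```python
def last_entered(parameter_value):
    # Split on commas, trim spaces, and ignore empty pieces such as the one
    # left by a trailing comma.
    parts = [p.strip() for p in parameter_value.split(",") if p.strip()]
    return parts[-1] if parts else None

print(last_entered("abc-234, abc-570, ght-908,"))  # ght-908
```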
It sounds like you want to order the results of the stored procedure by the order of the product codes as they are typed into the report parameter (which is a comma separated list).
You can return the index (order) of each product code in the parameter by using the Array.IndexOf and Split functions, e.g.
If you have a report parameter called "ProductNumber" and you also have a field called "ProductNumber" returned in your dataset, the following code will return the zero-based index of the Product Number as entered into the parameter list:
=Array.IndexOf(
Split(Parameters!ProductNumber.Value.ToString(), ",")
, Fields!ProductNumber.Value
)
So if abc-234 is the first product number in the parameter list then this code will return 0. If abc-570 is the second product number in the parameter list then this code will return 1, etc.
Assuming the products are listed in a tablix, then I would set the tablix sort expression to the above, which should sort the products into the order specified in the report parameter.
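The sort can be sketched in Python to show the intended result. This is an illustration, not SSRS code: `parameter_index` is a hypothetical helper, the dataset order is made up, and trimming spaces around each code is an added assumption (the expression above splits on commas without trimming).

```python
def parameter_index(parameter_value, product_number):
    # Zero-based position of the product in the typed list, -1 when absent
    # (the same convention as Array.IndexOf).
    codes = [c.strip() for c in parameter_value.split(",")]
    return codes.index(product_number) if product_number in codes else -1

param = "abc-234, abc-570, ght-908"
products = ["ght-908", "abc-234", "abc-570"]  # hypothetical dataset order

# Sort the rows by where each product appears in the parameter list.
products.sort(key=lambda p: parameter_index(param, p))
print(products)  # ['abc-234', 'abc-570', 'ght-908']
```

With this ordering, the most recently typed product ends up last.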

Google Spreadsheet Populating Cells in Other Sheet Based on Value

Am new to Google Docs, but have to create a cumulative report of comments that are flagged as positive or negative. I have 6 worksheets that ideally would populate to a single report, but I could create 6 individual reports for now.
In the source sheet, ColA is a numeric code identifying the category. Col B is the category description; Col C are the notes from one person; Col D is the code to identify it as positive or negative; Cols E and F are the notes from a 2nd person; G/H from a 3rd, etc.
The report sheet needs to transpose the vertical comments by category with the positive comments for all persons for the first category in Col G, the negative comments for the 1st category in Col H, etc for all 6 categories.
I was able to manually create this report using the following formula to extract the Positive comments from column C:
QUERY(EntrySheet1!C5:D15;"select * where D='P'")
But, it's pretty tedious to copy the formula laterally and vertically to accommodate all 6 categories and all 6 note takers.
So, my questions are whether or not there is an easier way to extract the information the way I need to report it. Also, is there a way to use something like Excel's Indirect function where I could use the concatenate function to build the formulas and the Indirect to evaluate that function. My thought here is that I could have an entry cell where I would identify which cumulative report I wanted to view by simply updating the cell. An alternative would be to load the data into an array and use a script to populate a static cumulative report. Real-time updating with formulas would be ideal, but creating a static report that is created from a script is acceptable. My biggest concern is the manual effort to update the formulas since they are sheet specific.
Use Google Spreadsheet INDIRECT function.
See the Google spreadsheets function list:
INDIRECT(reference)
Returns the reference specified by a text string. This function can also be used to
return the area of a corresponding string. Reference is a reference to a cell or an
area (in text form) for which to return the contents.
You might be able to feed the results of indirect into your query.
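The script alternative mentioned in the question (load the data into an array and build a static report) could look roughly like this. It is a Python sketch rather than a spreadsheet script: the sample rows are made up, the column layout follows the description in the question, and `collect` is a hypothetical helper.

```python
from collections import defaultdict

# Hypothetical rows from one entry sheet:
# (category code, description, notes 1, flag 1, notes 2, flag 2, ...)
sheet = [
    (1, "Service", "Friendly staff", "P", "Slow checkout", "N"),
    (2, "Price", "Too expensive", "N", "Good discounts", "P"),
]

def collect(rows):
    report = defaultdict(list)
    for code, _desc, *pairs in rows:
        # Walk the (notes, flag) column pairs, one pair per note taker.
        for notes, flag in zip(pairs[0::2], pairs[1::2]):
            if notes:
                report[(code, flag)].append(notes)
    return dict(report)

print(collect(sheet))
```

Each key is a (category, flag) pair, so the positive and negative comments of a category can be written to adjacent report columns.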