Power Query won't recognise the correct column name when trying to identify duplicate values after transformations are applied - duplicates

Strange issue which has been giving me a bit of trouble for the past few hours. I'm trying to identify which values in custom.Column1 are duplicates by creating a new column that returns "duplicate" if the value already appears elsewhere in the column and "unique" if it does not.
Looking online, I found that this can be achieved using:
if List.Count(List.FindText(Source[custom.Column1],[custom.Column1]))>1 then "duplicate" else "unique")
However, this only appears to work if used early on. After I have applied transformations, I get an error saying that the column name is not found.
Original Data:
M Code:
let
Source = Excel.CurrentWorkbook(){[Name="Table15"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Custom.Column1", type text}, {"Custom", type text}, {"Custom.Column3", type text}, {"Custom.Column4", type text}, {"Custom.Column5", type text}}),
#"Merged Columns" = Table.CombineColumns(#"Changed Type",{"Custom.Column1", "Custom.Column4"},Combiner.CombineTextByDelimiter("#", QuoteStyle.None),"Merged"),
#"Reordered Columns1" = Table.ReorderColumns(#"Merged Columns",{"Merged", "Custom", "Custom.Column3", "Custom.Column5"}),
#"Merged Columns1" = Table.CombineColumns(#"Reordered Columns1",{"Custom", "Custom.Column5", "Custom.Column3"},Combiner.CombineTextByDelimiter("#", QuoteStyle.None),"Merged.1"),
#"Merged Columns2" = Table.CombineColumns(#"Merged Columns1",{"Merged", "Merged.1"},Combiner.CombineTextByDelimiter("?", QuoteStyle.None),"Merged.2"),
#"Split Column by Delimiter" = Table.ExpandListColumn(Table.TransformColumns(#"Merged Columns2", {{"Merged.2", Splitter.SplitTextByDelimiter("?", QuoteStyle.None), let itemType = (type nullable text) meta [Serialized.Text = true] in type {itemType}}}), "Merged.2"),
#"Changed Type2" = Table.TransformColumnTypes(#"Split Column by Delimiter",{{"Merged.2", type text}}),
#"Split Column by Delimiter1" = Table.SplitColumn(#"Changed Type2", "Merged.2", Splitter.SplitTextByDelimiter("#", QuoteStyle.None), {"Merged.2.1", "Merged.2.2"}),
#"Changed Type3" = Table.TransformColumnTypes(#"Split Column by Delimiter1",{{"Merged.2.1", type text}, {"Merged.2.2", type number}}),
#"Split Column by Delimiter2" = Table.SplitColumn(#"Changed Type3", "Merged.2.1", Splitter.SplitTextByDelimiter("#", QuoteStyle.None), {"Merged.2.1.1", "Merged.2.1.2", "Merged.2.1.3"}),
#"Replaced Value" = Table.ReplaceValue(#"Split Column by Delimiter2","-","",Replacer.ReplaceText,{"Merged.2.1.2"}),
#"Changed Type4" = Table.TransformColumnTypes(#"Replaced Value",{{"Merged.2.1.2", type number}}),
#"Filtered Rows2" = Table.SelectRows(#"Changed Type4", each ([Merged.2.1.1] <> "")),
//PROBLEMATIC LINE
#"Added Custom3" = Table.AddColumn(#"Filtered Rows2", "Custom.1", each if List.Count(List.FindText(Source[Merged.2.1.1],[Merged.2.1.1]))>1 then "duplicate" else "unique")
in
#"Added Custom3"
Hopefully you can see that all of the transformations work up until the final line, where I try to identify duplicates. I just can't figure out why. A possible glitch in Power Query?
DATA FOR SOURCE:
Custom.Column1
Hydrogenated Glucose
Xanthan Gum
Methyl Paraben
Peppermint Flavour No.2
Sodium Hydroxide
Purified Water
Custom
Sorbitol
Maltitol
Xanthan Gum
Methyl Paraben
Methanol
Purified Water
Amorphous Silica Gel
Menthol
2-Isopropyl-5-Methylcyclohexanone
Sodium Hydroxide
Purified Water
Custom.Column3
68425-17-2
585-88-6
11138-66-2
99-76-3
67-56-1
7732-18-5
7631-86-9
89-78-1
89-80-5
1310-73-2
7732-18-5
Custom.Column4
25.59685
0.64875
0.37071
0.05561
0.0519
73.12789
Custom.Column5
50
8
-
100
0.3
0.5
0.5
30.00
5.00
-
-

The line of code you show that works is:
if List.Count(List.FindText(Source[custom.Column1],[custom.Column1]))>1 then "duplicate" else "unique")
So you are referring to the step called "Source", which is the first one. That step contains a column called "custom.Column1".
In the example that does not work you refer to the step "Source" again, but that step doesn't have the column [Merged.2.1.1], which you added later. That is why you get the error.
To refer to the step you have to write:
#"Filtered Rows2"[Merged.2.1.1]
instead of
Source[Merged.2.1.1]
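Applied to your query, the last step would then look like this (your own step, with only the step reference swapped):
#"Added Custom3" = Table.AddColumn(#"Filtered Rows2", "Custom.1", each if List.Count(List.FindText(#"Filtered Rows2"[Merged.2.1.1],[Merged.2.1.1]))>1 then "duplicate" else "unique")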

In the last step, change
Source[Merged.2.1.1]
to
#"Filtered Rows2"[Merged.2.1.1]
Merged.2.1.1 does not exist in the Source table. It exists in the final prior step, named #"Filtered Rows2".

Related

Importing JSON array of arrays into Excel

I'm trying to import a JSON file with the following format into Excel:
[
[1,2,3,4],
[5,6,7,8]
]
I want to get a spreadsheet with 2 rows and 4 columns, where each row contains the contents of the inner array as separate column values, e.g.
Column A  Column B  Column C  Column D
1         2         3         4
5         6         7         8
Although this would seem to be an easy problem to solve, I can't seem to find the right PowerQuery syntax, or locate an existing answer that covers this scenario. I can easily import as a single column with 8 values, but can't seem to split the inner array into separate columns.
Assuming the JSON looks like
[
[1,2,3,4],
[5,6,7,8]
]
then this code in Power Query
let
Source = Json.Document(File.Contents("C:\temp\j.json")),
#"Converted to Table" = Table.FromList(Source, Splitter.SplitByNothing(), null, null, ExtraValues.Error),
#"Added Custom" = Table.AddColumn(#"Converted to Table", "Custom", each Text.Combine(List.Transform([Column1], each Text.From(_)),",")),
ColumnTitles = List.Transform({1 .. List.Max(Table.AddColumn(#"Added Custom", "count", each List.Count(Text.PositionOfAny([Custom],{","},Occurrence.All))+1)[count])}, each "Column." & Text.From(_)),
#"Split Column by Delimiter" = Table.SplitColumn(#"Added Custom", "Custom", Splitter.SplitTextByDelimiter(",", QuoteStyle.Csv), ColumnTitles),
#"Removed Columns" = Table.RemoveColumns(#"Split Column by Delimiter",{"Column1"})
in
#"Removed Columns"
generates the desired table of 2 rows and 4 columns.
It converts the JSON to a list of lists, converts that to a table of lists, combines each list into comma-separated text, then expands that text into columns after dynamically creating the new column names by counting the maximum number of commas.
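For reference, since the JSON is already a list of lists, a shorter alternative is to pass that list straight to Table.FromRows. This is only a minimal sketch, assuming the same file path and hard-coding the column names:
let
// parse the JSON file into a list of lists
Source = Json.Document(File.Contents("C:\temp\j.json")),
// build the table directly, one row per inner list
Result = Table.FromRows(Source, {"Column A", "Column B", "Column C", "Column D"})
in
Result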

Is there a way to get records from 2 APIs with different API keys in 1 Power BI connector using Power Query?

I have to get data from 2 APIs (one key is entered as input and the other is hardcoded).
My scenario: I enter my 1st key (as the authentication key) to run the 1st API, and inside the query I have a column that calls the 2nd API (whose API key I hardcoded), but the system still alerts me that the credentials are wrong. If I run them separately, each one runs normally. I've done a similar case before, but that one only used 1 API key. I don't know how to resolve this.
ExpandData = (dataList as list) as table =>
let
FromList = Table.FromList(dataList, Splitter.SplitByNothing(), null, null, ExtraValues.Error),
#"Expanded Column1" = Table.ExpandRecordColumn(FromList, "Column1", {"search-results"}, {"search-results"}),
#"Expanded search-results" = Table.ExpandRecordColumn(#"Expanded Column1", "search-results", {"entry"}, {"entry"}),
#"Expanded entry" = Table.ExpandListColumn(#"Expanded search-results", "entry"),
#"Expanded entry1" = Table.ExpandRecordColumn(#"Expanded entry", "entry", {"dc:identifier", "prism:doi"}, {"identifier", "doi"}),
#"Reordered Columns" = Table.ReorderColumns(#"Expanded entry1",{"doi", "identifier"}),
#"Add Column" = Table.AddColumn(#"Reordered Columns", "Detail", each Text.Range(Text.From([identifier]),10)),
#"Add Column2" = Table.AddColumn(#"Add Column", "Detail2", each ExpandDataPublish(Text.From([Detail])))
in
#"Add Column2";
ExpandDataPublish = (data as text) as table =>
let
Source = Json.Document(Web.Contents("https://******" & data,
[Headers=[#"Accept" = "application/json;odata.metadata=minimal",
#"X-ELS-APIKey"="7f59af901d2d86f78a1fd60c1bf9426a"]])),
#"Converted to Table" = Table.FromRecords({Source})
in
#"Converted to Table";
It runs normally if I don't call the 2nd API, but fails if I do.

Error when trying to split a column using Power Query in Azure Data Factory - UserQuery : Expression.Error: An error occurred invoking 'Table.AddColumn'

I get the following error when trying to split a column by space delimiter using Power Query in Data Factory:
UserQuery : Expression.Error: An error occurred invoking 'Table.AddColumn': We can't get the expression for the specified value.
What is causing this and how would I go about resolving it?
Many thanks
The Power Query itself is:
let
Source = dedupedetipscsv,
#"Split Column by Delimiter" = Table.SplitColumn(Source, "Candidate", Splitter.SplitTextByEachDelimiter({" "}, QuoteStyle.Csv, true), {"Candidate.1", "Candidate.2"}),
#"Split Column by Delimiter1" = Table.SplitColumn(Table.TransformColumnTypes(#"Split Column by Delimiter", {{"ApprovedDate", type text}}, "en-GB"), "ApprovedDate", Splitter.SplitTextByEachDelimiter({" "}, QuoteStyle.Csv, true), {"ApprovedDate.1", "ApprovedDate.2"})
in
#"Split Column by Delimiter1"
Note: Power Query will split the column into as many columns as needed. The name of the new columns will contain the same name as the original column. A suffix that includes a dot and a number that represents the split sections of the original column will be appended to the name of the new columns.
The Table.AddColumn step might refer to a variable which is a list. You need to refer to #"Renamed Columns", which is the last step that results in a table.
See the documentation on splitting columns by delimiter into columns.
An alternative for splitting by length and by position is the M script workaround shown below:
Table.AddColumn(Source, "First characters", each Text.Start([Email], 7), type text)
Table.AddColumn(#"Inserted first characters", "Text range", each Text.Middle([Email], 4, 9), type text)

PowerBI parse delimited data

I have a table containing other tables in its values. These other tables can be formatted either as CSV or JSON.
Can you please tell me how I can import this data into individual tables in Power BI? I've tried with the Power Query GUI, but so far unsuccessfully; perhaps there is a need to use code in the Advanced Editor.
I can't just parse this data outside PowerBI because of company guidelines prohibiting the use of scripts, so everything must be done within PowerBI - though the PowerQuery code is allowed.
csv:
"id,normdist,poissondist,binomial\r\n1,0.00013383,0.033689735,0.009765625\r\n2,0.004431848,0.084224337,0.043945313\r\n3,0.053990967,0.140373896,0.1171875\r\n4,0.241970725,0.17546737,0.205078125\r\n5,0.39894228,0.17546737,0.24609375\r\n6,0.241970725,0.146222808,0.205078125\r\n7,0.053990967,0.104444863,0.1171875\r\n8,0.004431848,0.065278039,0.043945313\r\n9,0.00013383,0.036265577,0.009765625\r\n10,1.49E-06,0.018132789,0.000976563\r\n"
json (by row)
[{"id":1,"normdist":0.0001,"poissondist":0.0337,"binomial":0.0098},{"id":2,"normdist":0.0044,"poissondist":0.0842,"binomial":0.0439},{"id":3,"normdist":0.054,"poissondist":0.1404,"binomial":0.1172},{"id":4,"normdist":0.242,"poissondist":0.1755,"binomial":0.2051},{"id":5,"normdist":0.3989,"poissondist":0.1755,"binomial":0.2461},{"id":6,"normdist":0.242,"poissondist":0.1462,"binomial":0.2051},{"id":7,"normdist":0.054,"poissondist":0.1044,"binomial":0.1172},{"id":8,"normdist":0.0044,"poissondist":0.0653,"binomial":0.0439},{"id":9,"normdist":0.0001,"poissondist":0.0363,"binomial":0.0098},{"id":10,"normdist":1.49e-06,"poissondist":0.0181,"binomial":0.001}]
Let's say the data is in the CSV version but stored as a plain string in a database, so it appears as a single text cell in the query editor.
In order to expand this into a table, we need to split it into rows and columns. On the Home tab, use the Split Column tool with the By Delimiter option from the dropdown.
That is, we use "\r\n" to split the cell into multiple rows.
The column then contains one CSV line per row.
Remove any blank rows and use the Split Column tool again. This time you can leave the defaults, since it will automatically guess that you want to split by comma and expand into columns.
If you promote the headers and clean up the column types, the final result should be a proper table with the id, normdist, poissondist, and binomial columns.
Full M Query for this example that you can paste into the Advanced Editor:
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("7ZFJbsMwDEXvkrUacBKHA/QUcRYpshHQ2EXc+6OS4mzkC3Th3Qf5+ThdLqdyT/PyfNzL+pt+lrKuy9z1V5mXR7l9T9NzmmZMcAYAZHZuklk9jHMPh2lWyi8n9ZAIo4s37UIkzNa0cEhm5Je1kzJHQGhLowAbe2jTaOi2MaUGSDAMjFpLtCxqHUmQwRzf3VuWw6P29MEoCsFvoo5EUaol4HukjVPW5URceZzSx801kzlw7DeP8ZxKmrPZ/pwICc8Snx/QRgZ05Ard6rtzQ56u6Xjm8czjmf/wmdc/", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Column1 = _t]),
#"Split by \r\n into Rows" = Table.ExpandListColumn(Table.TransformColumns(Source, {{"Column1", Splitter.SplitTextByDelimiter("\r\n", QuoteStyle.Csv), let itemType = (type nullable text) meta [Serialized.Text = true] in type {itemType}}}), "Column1"),
#"Filtered Blank Rows" = Table.SelectRows(#"Split by \r\n into Rows", each [Column1] <> null and [Column1] <> ""),
#"Split into Columns" = Table.SplitColumn(#"Filtered Blank Rows", "Column1", Splitter.SplitTextByDelimiter(",", QuoteStyle.Csv), {"Column1.1", "Column1.2", "Column1.3", "Column1.4"}),
#"Promoted Headers" = Table.PromoteHeaders(#"Split into Columns", [PromoteAllScalars=true]),
#"Filtered Repeated Headers" = Table.SelectRows(#"Promoted Headers", each ([id] <> "id")),
#"Changed Type" = Table.TransformColumnTypes(#"Filtered Repeated Headers",{{"id", Int64.Type}, {"normdist", type number}, {"poissondist", type number}, {"binomial", type number}})
in
#"Changed Type"

How do I extract JSON into an Excel spreadsheet next to several parameters

I am trying to get financial data from Financial Modeling Prep's API into an excel spreadsheet. I am beginning to think that Power Query just does not do what I am looking for. I want to have one column with a static list of stock symbols (DAL, GOOG, AAL etc) and populate each row with financial data from various api calls such as the Net Income field from https://financialmodelingprep.com/api/v3/financials/income-statement/DAL and the current stock price from https://financialmodelingprep.com/api/v3/stock/real-time-price/DAL
What exactly have you tried? It's very simple to extract data from the first link you gave with the M code below (all UI based, nothing advanced about it at all). Converting that into a function that goes to the relevant URL for each symbol and does the same transformation is also trivial.
let
Source = Json.Document(Web.Contents("https://financialmodelingprep.com/api/v3/financials/income-statement/DAL ")),
financials = Source[financials],
#"Converted to Table" = Table.FromList(financials, Splitter.SplitByNothing(), null, null, ExtraValues.Error),
#"Expanded Column1" = Table.ExpandRecordColumn(#"Converted to Table", "Column1", {"date", "Net Income"}, {"date", "Net Income"}),
#"Changed Type" = Table.TransformColumnTypes(#"Expanded Column1",{{"Net Income", type number}, {"date", type date}})
in
#"Changed Type"
Here is my solution in Python:
Create parameters:
company = "NVDA"
years = 5
Add Key:
api_key = 'YOUR_KEY'
Request (requires the requests library):
import requests

r = requests.get(f'https://financialmodelingprep.com/api/v3/income-statement/{company}?limit={years}&apikey={api_key}')
data = r.json()
data
Extract data
date = []
symbol = []
revenue = []
costOfRevenue = []
grossProfit = []
for finance in data:
    date.append(finance["date"])
    symbol.append(finance["symbol"])
    revenue.append(finance["revenue"])
    costOfRevenue.append(finance["costOfRevenue"])
    grossProfit.append(finance["grossProfit"])
income_nvda_dict = {
    "Date": date,
    "Ticket": symbol,
    "Revenue": revenue,
    "CostOfRevenue": costOfRevenue,
    "grossProfit": grossProfit,
}
From dict to pandas DataFrame:
import pandas as pd

income_nvda_df = pd.DataFrame(income_nvda_dict, columns=['Date', 'Ticket', 'Revenue', 'CostOfRevenue', 'grossProfit'])