PowerBI parse delimited data - json

I have a table containing other tables in its values. These other tables can be formatted either as CSV or JSON.
Can you please tell me how I can import this data into individual tables in PowerBI? I've tried the PowerQuery GUI, so far without success; perhaps I need to write code in the Advanced Editor.
I can't parse this data outside PowerBI because company guidelines prohibit the use of scripts, so everything must be done within PowerBI, though PowerQuery code is allowed.
csv:
"id,normdist,poissondist,binomial\r\n1,0.00013383,0.033689735,0.009765625\r\n2,0.004431848,0.084224337,0.043945313\r\n3,0.053990967,0.140373896,0.1171875\r\n4,0.241970725,0.17546737,0.205078125\r\n5,0.39894228,0.17546737,0.24609375\r\n6,0.241970725,0.146222808,0.205078125\r\n7,0.053990967,0.104444863,0.1171875\r\n8,0.004431848,0.065278039,0.043945313\r\n9,0.00013383,0.036265577,0.009765625\r\n10,1.49E-06,0.018132789,0.000976563\r\n"
json (by row)
[{"id":1,"normdist":0.0001,"poissondist":0.0337,"binomial":0.0098},{"id":2,"normdist":0.0044,"poissondist":0.0842,"binomial":0.0439},{"id":3,"normdist":0.054,"poissondist":0.1404,"binomial":0.1172},{"id":4,"normdist":0.242,"poissondist":0.1755,"binomial":0.2051},{"id":5,"normdist":0.3989,"poissondist":0.1755,"binomial":0.2461},{"id":6,"normdist":0.242,"poissondist":0.1462,"binomial":0.2051},{"id":7,"normdist":0.054,"poissondist":0.1044,"binomial":0.1172},{"id":8,"normdist":0.0044,"poissondist":0.0653,"binomial":0.0439},{"id":9,"normdist":0.0001,"poissondist":0.0363,"binomial":0.0098},{"id":10,"normdist":1.49e-06,"poissondist":0.0181,"binomial":0.001}]

Let's say the data is the CSV version, stored as a single string in a database, so that it shows up as one text cell in the query editor.
To expand this into a table, we need to split it into rows and columns. On the Home tab, open the Split Column tool and pick By Delimiter from the dropdown. Enter \r\n as a custom delimiter and, under the advanced options, choose to split into rows.
That is, we use "\r\n" to split the cell into multiple rows.
Our column now holds one line of the CSV per row.
Remove any blank rows and use the Split Column tool again. This time you can leave the defaults, since it will automatically guess that you want to split by comma and expand into columns.
If you promote the headers and clean up the column types, the final result should be a table with the columns id, normdist, poissondist and binomial.
Full M Query for this example that you can paste into the Advanced Editor:
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("7ZFJbsMwDEXvkrUacBKHA/QUcRYpshHQ2EXc+6OS4mzkC3Th3Qf5+ThdLqdyT/PyfNzL+pt+lrKuy9z1V5mXR7l9T9NzmmZMcAYAZHZuklk9jHMPh2lWyi8n9ZAIo4s37UIkzNa0cEhm5Je1kzJHQGhLowAbe2jTaOi2MaUGSDAMjFpLtCxqHUmQwRzf3VuWw6P29MEoCsFvoo5EUaol4HukjVPW5URceZzSx801kzlw7DeP8ZxKmrPZ/pwICc8Snx/QRgZ05Ard6rtzQ56u6Xjm8czjmf/wmdc/", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Column1 = _t]),
#"Split by \r\n into Rows" = Table.ExpandListColumn(Table.TransformColumns(Source, {{"Column1", Splitter.SplitTextByDelimiter("\r\n", QuoteStyle.Csv), let itemType = (type nullable text) meta [Serialized.Text = true] in type {itemType}}}), "Column1"),
#"Filtered Blank Rows" = Table.SelectRows(#"Split by \r\n into Rows", each [Column1] <> null and [Column1] <> ""),
#"Split into Columns" = Table.SplitColumn(#"Filtered Blank Rows", "Column1", Splitter.SplitTextByDelimiter(",", QuoteStyle.Csv), {"Column1.1", "Column1.2", "Column1.3", "Column1.4"}),
#"Promoted Headers" = Table.PromoteHeaders(#"Split into Columns", [PromoteAllScalars=true]),
#"Filtered Repeated Headers" = Table.SelectRows(#"Promoted Headers", each ([id] <> "id")),
#"Changed Type" = Table.TransformColumnTypes(#"Filtered Repeated Headers",{{"id", Int64.Type}, {"normdist", type number}, {"poissondist", type number}, {"binomial", type number}})
in
#"Changed Type"
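For the JSON version of the string, no splitting should be needed. Here is a minimal sketch, assuming the cell holds JSON text shaped like the sample in the question. The one-cell Source table and the shortened sample string are made up for illustration:

```powerquery
let
    // Hypothetical stand-in for the database column: one text cell holding JSON
    Source = #table(
        {"Column1"},
        {{"[{""id"":1,""normdist"":0.0001},{""id"":2,""normdist"":0.0044}]"}}
    ),
    // Json.Document turns the text into a list of records
    Parsed = Json.Document(Source{0}[Column1]),
    // Each record becomes a row; the field names become the column names
    Result = Table.FromRecords(Parsed)
in
    Result
```

If the source has several such cells, the same two calls could go into a custom column (each Table.FromRecords(Json.Document([Column1]))), which would give one nested table per row.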

Related

Importing JSON array of arrays into Excel

I'm trying to import a JSON file with the following format into Excel:
[
[1,2,3,4],
[5,6,7,8]
]
I want to get a spreadsheet with 2 rows and 4 columns, where each row contains the contents of the inner array as separate column values, e.g.
Column A  Column B  Column C  Column D
1         2         3         4
5         6         7         8
Although this would seem to be an easy problem to solve, I can't seem to find the right PowerQuery syntax, or locate an existing answer that covers this scenario. I can easily import as a single column with 8 values, but can't seem to split the inner array into separate columns.
Assuming the JSON looks like
[
[1,2,3,4],
[5,6,7,8]
]
then this code in Powerquery
let Source = Json.Document(File.Contents("C:\temp\j.json")),
#"Converted to Table" = Table.FromList(Source, Splitter.SplitByNothing(), null, null, ExtraValues.Error),
#"Added Custom" = Table.AddColumn(#"Converted to Table", "Custom", each Text.Combine(List.Transform([Column1], each Text.From(_)),",")),
ColumnTitles = List.Transform({1 .. List.Max(Table.AddColumn(#"Added Custom", "count", each List.Count(Text.PositionOfAny([Custom],{","},Occurrence.All))+1)[count])}, each "Column." & Text.From(_)),
#"Split Column by Delimiter" = Table.SplitColumn(#"Added Custom", "Custom", Splitter.SplitTextByDelimiter(",", QuoteStyle.Csv), ColumnTitles),
#"Removed Columns" = Table.RemoveColumns(#"Split Column by Delimiter",{"Column1"})
in #"Removed Columns"
generates the requested table with 2 rows and 4 columns.
It converts the JSON to a list of lists, converts that to a table of lists, combines each list into comma-separated text, and then splits that text into columns. The names for the new columns are created dynamically by counting the maximum number of commas in any row.
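If every inner array has the same length, a shorter route may be enough. This is a sketch under that assumption, with the JSON inlined as text rather than read from a file, and with column names chosen to match the question:

```powerquery
let
    // Inline sample; in practice this would be Json.Document(File.Contents(...))
    Source = Json.Document("[[1,2,3,4],[5,6,7,8]]"),
    // Each inner list becomes one row of the table
    Result = Table.FromRows(Source, {"Column A", "Column B", "Column C", "Column D"})
in
    Result
```

The longer query above stays useful when the inner arrays can differ in length, since it derives the column count from the data.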

Power Query won't recognise correct column name when trying to identify duplicate values after transformations are applied

A strange issue has been giving me a bit of trouble for the past few hours. I'm trying to identify which values are duplicates in custom.Column1 by creating a new column that returns either "duplicate" or "unique", depending on whether the value is already present in the column.
Looking online, I found that this can be achieved using:
if List.Count(List.FindText(Source[custom.Column1],[custom.Column1]))>1 then "duplicate" else "unique")
However, this only appears to work if used early on. After I have applied transformations, I get an error saying that the column name is not found.
Original Data:
M Code:
let
Source = Excel.CurrentWorkbook(){[Name="Table15"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Custom.Column1", type text}, {"Custom", type text}, {"Custom.Column3", type text}, {"Custom.Column4", type text}, {"Custom.Column5", type text}}),
#"Merged Columns" = Table.CombineColumns(#"Changed Type",{"Custom.Column1", "Custom.Column4"},Combiner.CombineTextByDelimiter("#", QuoteStyle.None),"Merged"),
#"Reordered Columns1" = Table.ReorderColumns(#"Merged Columns",{"Merged", "Custom", "Custom.Column3", "Custom.Column5"}),
#"Merged Columns1" = Table.CombineColumns(#"Reordered Columns1",{"Custom", "Custom.Column5", "Custom.Column3"},Combiner.CombineTextByDelimiter("#", QuoteStyle.None),"Merged.1"),
#"Merged Columns2" = Table.CombineColumns(#"Merged Columns1",{"Merged", "Merged.1"},Combiner.CombineTextByDelimiter("?", QuoteStyle.None),"Merged.2"),
#"Split Column by Delimiter" = Table.ExpandListColumn(Table.TransformColumns(#"Merged Columns2", {{"Merged.2", Splitter.SplitTextByDelimiter("?", QuoteStyle.None), let itemType = (type nullable text) meta [Serialized.Text = true] in type {itemType}}}), "Merged.2"),
#"Changed Type2" = Table.TransformColumnTypes(#"Split Column by Delimiter",{{"Merged.2", type text}}),
#"Split Column by Delimiter1" = Table.SplitColumn(#"Changed Type2", "Merged.2", Splitter.SplitTextByDelimiter("#", QuoteStyle.None), {"Merged.2.1", "Merged.2.2"}),
#"Changed Type3" = Table.TransformColumnTypes(#"Split Column by Delimiter1",{{"Merged.2.1", type text}, {"Merged.2.2", type number}}),
#"Split Column by Delimiter2" = Table.SplitColumn(#"Changed Type3", "Merged.2.1", Splitter.SplitTextByDelimiter("#", QuoteStyle.None), {"Merged.2.1.1", "Merged.2.1.2", "Merged.2.1.3"}),
#"Replaced Value" = Table.ReplaceValue(#"Split Column by Delimiter2","-","",Replacer.ReplaceText,{"Merged.2.1.2"}),
#"Changed Type4" = Table.TransformColumnTypes(#"Replaced Value",{{"Merged.2.1.2", type number}}),
#"Filtered Rows2" = Table.SelectRows(#"Changed Type4", each ([Merged.2.1.1] <> "")),
//PROBLEMATIC LINE
#"Added Custom3" = Table.AddColumn(#"Filtered Rows2", "Custom.1", each if List.Count(List.FindText(Source[Merged.2.1.1],[Merged.2.1.1]))>1 then "duplicate" else "unique")
in
#"Added Custom3"
Hopefully you can see that all of the transformations work up until the final line, which tries to identify duplicates. I just can't figure out why. Is it a possible glitch in PowerQuery?
DATA FOR SOURCE:
Custom.Column1
Hydrogenated Glucose
Xanthan Gum
Methyl Paraben
Peppermint Flavour No.2
Sodium Hydroxide
Purified Water
Custom
Sorbitol
Maltitol
Xanthan Gum
Methyl Paraben
Methanol
Purified Water
Amorphous Silica Gel
Menthol
2-Isopropyl-5-Methylcyclohexanone
Sodium Hydroxide
Purified Water
Custom.Column3
68425-17-2
585-88-6
11138-66-2
99-76-3
67-56-1
7732-18-5
7631-86-9
89-78-1
89-80-5
1310-73-2
7732-18-5
Custom.Column4
25.59685
0.64875
0.37071
0.05561
0.0519
73.12789
Custom.Column5
50
8
-
100
0.3
0.5
0.5
30.00
5.00
-
-
The line of code you show that works is:
if List.Count(List.FindText(Source[custom.Column1],[custom.Column1]))>1 then "duplicate" else "unique")
So you are referring to the step called "Source", which is the first one. That step contains a column called "Custom.Column1".
In the example that does not work, you again refer to the step "Source", but that step doesn't have the column [Merged.2.1.1], which you added later. Thus you get the error.
To refer to the step that does have the column, you have to write:
#"Filtered Rows2"[Merged.2.1.1]
instead of
Source[Merged.2.1.1]
In the last step, change
Source[Merged.2.1.1]
to
#"Filtered Rows2"[Merged.2.1.1]
Merged.2.1.1 does not exist in the Source table. It exists in the immediately preceding step, named #"Filtered Rows2".
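The fix can be tried on a small self-contained query. This is only a sketch: the sample table, the step name Renamed and the column name Ingredient are invented for illustration, and it compares values exactly with List.Select rather than using List.FindText, which matches substrings:

```powerquery
let
    // Hypothetical sample data standing in for the question's table
    Source = #table({"Name"}, {{"Purified Water"}, {"Sorbitol"}, {"Purified Water"}}),
    // After this step the column "Ingredient" exists only in Renamed, not in Source
    Renamed = Table.RenameColumns(Source, {{"Name", "Ingredient"}}),
    // Look the column up in the step that actually contains it
    Flagged = Table.AddColumn(
        Renamed,
        "Status",
        (row) =>
            if List.Count(List.Select(Renamed[Ingredient], (v) => v = row[Ingredient])) > 1
            then "duplicate"
            else "unique",
        type text
    )
in
    Flagged
```

Writing Source[Ingredient] in the last step would raise the same kind of "column not found" error as in the question, because Source only has the column Name.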

Error when trying to split a column using powerquery in Azure Data Factory -UserQuery : Expression.Error: An error occurred invoking 'Table.AddColumn'

I get the following error when trying to split a column by a space delimiter with Power Query in Data Factory:
UserQuery : Expression.Error: An error occurred invoking 'Table.AddColumn': We can't get the expression for the specified value.
What is causing this and how would I go about resolving it?
Many thanks
This is the error
The PowerQuery itself is :
let
Source = dedupedetipscsv,
#"Split Column by Delimiter" = Table.SplitColumn(Source, "Candidate", Splitter.SplitTextByEachDelimiter({" "}, QuoteStyle.Csv, true), {"Candidate.1", "Candidate.2"}),
#"Split Column by Delimiter1" = Table.SplitColumn(Table.TransformColumnTypes(#"Split Column by Delimiter", {{"ApprovedDate", type text}}, "en-GB"), "ApprovedDate", Splitter.SplitTextByEachDelimiter({" "}, QuoteStyle.Csv, true), {"ApprovedDate.1", "ApprovedDate.2"})
in
#"Split Column by Delimiter1"
Note: Power Query will split the column into as many columns as needed. The name of the new columns will contain the same name as the original column. A suffix that includes a dot and a number that represents the split sections of the original column will be appended to the name of the new columns.
The Table.AddColumn step might refer to a variable that is a list. You need to refer to #"Renamed Columns", which is the last step that results in a table.
Reference: Split columns by delimiter into columns.
Also note that the M script workarounds list an alternative for splitting by length and by position:
Table.AddColumn(Source, "First characters", each Text.Start([Email], 7), type text)
Table.AddColumn(#"Inserted first characters", "Text range", each Text.Middle([Email], 4, 9), type text)
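The same Table.AddColumn pattern could be extended to split on the last space, which is what the query in the question does with Table.SplitColumn. This is a sketch only: the sample rows and the helper column Pos are invented, and whether Text.PositionOf and Occurrence.Last are available in Data Factory's Power Query subset is an assumption that has not been checked here:

```powerquery
let
    // Hypothetical sample standing in for the dedupedetipscsv source
    Source = #table({"Candidate"}, {{"Jane Ann Doe"}, {"Cher"}}),
    // Position of the last space, or -1 when there is none
    WithPos = Table.AddColumn(Source, "Pos",
        each Text.PositionOf([Candidate], " ", Occurrence.Last), Int64.Type),
    // Everything before the last space (the whole value when there is no space)
    First = Table.AddColumn(WithPos, "Candidate.1",
        each if [Pos] < 0 then [Candidate] else Text.Start([Candidate], [Pos]), type text),
    // Everything after the last space (null when there is no space)
    Second = Table.AddColumn(First, "Candidate.2",
        each if [Pos] < 0 then null else Text.Middle([Candidate], [Pos] + 1), type nullable text),
    Result = Table.RemoveColumns(Second, {"Pos"})
in
    Result
```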

How to transpose multiple .csv files and combine in Excel power query?

I want to analyse some spectral data.
I have about 6500 .csv files.
Each .csv file contains data in the same format.
How can I transpose all of the .csv files,
so that I can then combine them in Power Query?
Thank you!
This will read in all .csv files in a directory, then transpose and combine them for you.
Use Home > Advanced Editor to paste it into Power Query, and edit the second line with the appropriate directory path.
It is based on Alexis Olson's recent answer to "Reading the first n rows of a csv without parsing the whole file in Power Query".
let
Source = Folder.Files("C:\directory\subdirectory\"),
#"Filtered Rows" = Table.SelectRows(Source, each ([Extension] = ".csv")),
#"Removed Other Columns" = Table.SelectColumns(#"Filtered Rows",{"Content", "Name"}),
#"Invert" = Table.TransformColumns(#"Removed Other Columns", {{"Content", each Table.Transpose(Csv.Document(_))}}),
MaxColumns = List.Max(List.Transform(#"Invert"[Content], each Table.ColumnCount(_))),
#"Expanded Content" = Table.ExpandTableColumn(#"Invert", "Content", List.Transform({1..MaxColumns}, each "Column" & Number.ToText(_))),
#"Promoted Headers" = Table.PromoteHeaders(#"Expanded Content", [PromoteAllScalars=true]),
#"Filtered Rows1" = Table.SelectRows(#"Promoted Headers", each ([Date] <> "Date"))
in #"Filtered Rows1"
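To see what the transpose step does to a single file, here is a small sketch on inline text. The file layout (one label and one value per line) and the labels themselves are assumptions, since the original screenshots of the data are not available:

```powerquery
let
    // Hypothetical file content: one "label,value" pair per line
    CsvText = "Date,2020-01-01#(lf)Wavelength,500#(lf)Intensity,0.42",
    // Parse the text as CSV; this yields a 3-row, 2-column table
    Parsed = Csv.Document(Text.ToBinary(CsvText)),
    // Swap rows and columns, giving 2 rows and 3 columns
    Transposed = Table.Transpose(Parsed),
    // The first row (the labels) becomes the header row
    Result = Table.PromoteHeaders(Transposed, [PromoteAllScalars = true])
in
    Result
```

In the combined query above, every file contributes its own label row, which is why the last step filters out the rows where [Date] still equals "Date".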

Parsing a .json column in Power BI

I want to parse a .json column through Power BI. I have imported the data directly from the server and have a .json column in the data along with other columns. Is there a way to parse this json column?
Example:
Key IDNumber Module JsonResult
012 200 Dine {"CategoryType":"dining","City":"mumbai"',"Location":"all"}
97 303 Fly {"JourneyType":"Return","Origin":"Mumbai (BOM)","Destination":"Chennai (MAA)","DepartureDate":"20-Oct-2016","ReturnDate":"21-Oct-2016","FlyAdult":"1","FlyChildren":"0","FlyInfant":"0","PromoCode":""}
276 6303 Stay {"Destination":"Clarion Chennai","CheckInDate":"14-Oct-2016","CheckOutDate":"15-Oct-2016","Rooms":"1","NoOfPax":"2","NoOfAdult":"2","NoOfChildren":"0"}
I wish to retain the other columns and also get the simplified parsed columns.
There is an easier way to do it in the Query Editor, on the column you want to read as JSON:
Right-click on the column
Select Transform > JSON
The column then becomes a Record column, which you can expand into one column per JSON property using the button in the top-right corner of the column header.
Use Json.Document function like this
let
...
your_table=imported_the_data_directly_from_the_server,
json=Table.AddColumn(your_table, "NewColName", each Json.Document([JsonResult]))
in
json
Then expand the record into columns using Table.ExpandRecordColumn, or by clicking the expand button in the header of the new column.
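Putting both steps together might look like the sketch below. The two sample rows are shortened, hypothetical versions of the rows in the question, and the step names Parsed, FieldNames and Expanded are made up. Because each module has different JSON properties, the field list is collected from the data instead of being typed in:

```powerquery
let
    // Hypothetical two-row sample modelled on the question's table
    Source = #table(
        {"Key", "Module", "JsonResult"},
        {
            {"012", "Dine", "{""CategoryType"":""dining"",""City"":""mumbai""}"},
            {"276", "Stay", "{""Rooms"":""1"",""NoOfPax"":""2""}"}
        }
    ),
    // Parse the JSON text of every row into a record
    Parsed = Table.AddColumn(Source, "Parsed", each Json.Document([JsonResult])),
    // Collect every field name that occurs in any row
    FieldNames = List.Union(List.Transform(Parsed[Parsed], Record.FieldNames)),
    // One new column per field; fields a row lacks come out as null
    Expanded = Table.ExpandRecordColumn(Parsed, "Parsed", FieldNames)
in
    Expanded
```

The original columns (Key, Module, JsonResult) are retained alongside the parsed ones.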
Use Json.Document() function to convert string to Json data.
let
Source = Json.Document(Json.Document(Web.Contents("http://localhost:18091/pools/default/buckets/Aggregation/docs/AvgSumAssuredByProduct"))[json]),
#"Converted to Table" = Record.ToTable(Source),
#"Filtered Rows" = Table.SelectRows(#"Converted to Table", each not Text.Contains([Name], "type_")),
#"Renamed Columns" = Table.RenameColumns(#"Filtered Rows",{{"Name", "AvgSumAssuredByProduct"}}),
#"Changed Type" = Table.TransformColumnTypes(#"Renamed Columns",{{"Value", type number}})
in
#"Changed Type"
import json
from urllib.request import urlopen

l = []  # entry ids
j = []  # field values
# Fetch the most recent result for field 1 of the channel
d_base = urlopen('https://api.thingspeak.com/channels/193888/fields/1.json?results=1')
data = json.load(d_base)
for k in data['feeds']:
    name = k['entry_id']
    value = k['field1']
    l.append(name)
    j.append(value)
print(l[0])
print(j[0])
This Python code may be useful for you. It printed:
270
1035