Parse Large amount of Json in Excel

Parse Large amount of Json in Excel - json

I have an application which reads in files containing json data parses them then performs some formatting and calculations.
The Core of the code itself works fine, however the issue I have is that it is slow.
I got Json Parsing code from another site (not sure if i'm allowed to link to it), and what I do is I split the file from the data into an array, Parse each row and put it into another array then perform all my tasks later.
The slow part is the Parsing of the data, overall it's pretty quick and works through about 10-15 rows a second, however my files have upwards of 5k rows of data in them, meaning each file takes roughly 5 minutes, and when as there are hundreds of files to get through, it takes almost an entire day to full parse all the data.
I've attached snippets of my code (not the full parsing functions as that's really long) just to give an idea of how my code is currently working
My query is, is there a way I can parse the entire file (each record is seperated by a line feed) instead of having to parse it row by row?
sourceArray = Split(sourceText, vbLf)
For Ctr = LBound(sourceArray) To UBound(sourceArray)
Set dic = ParseJSON(sourceArray(Ctr)
For Each v In dic
(do some formatting stuff, I tested by completely commenting out this part to make sure the ParseJSON is where the slowdown is)
next v
Next Ctr
Json Parser code
'-------------------------------------------------------------------
' VBA JSON Parser
'-------------------------------------------------------------------
Function ParseJSON(json$, Optional key$ = "obj") As Object
p = 1
token = Tokenize(json)
Set dic = CreateObject("Scripting.Dictionary")
If token(p) = "{" Then ParseObj key Else ParseArr key
Set ParseJSON = dic
End Function
Function ParseObj(key$)
Do: p = p + 1
Select Case token(p)
Case "]"
Case "[": ParseArr key
Case "{"
If token(p + 1) = "}" Then
p = p + 1
dic.Add key, "null"
Else
ParseObj key
End If
Case "}": key = ReducePath(key): Exit Do
Case ":": key = key & "." & token(p - 1)
Case ",": key = ReducePath(key)
Case Else: If token(p + 1) <> ":" Then dic.Add key, token(p)
End Select
Loop
End Function
Function ParseArr(key$)
Dim e&
Do: p = p + 1
Select Case token(p)
Case "}"
Case "{": ParseObj key & ArrayID(e)
Case "[": ParseArr key
Case "],": e = e
Case "]": Exit Do
Case ":": key = key & ArrayID(e)
Case ",": e = e + 1
Case Else: dic.Add key & ArrayID(e), token(p)
End Select
Loop
End Function
'-------------------------------------------------------------------
' Support Functions
'-------------------------------------------------------------------
Function Tokenize(s$)
Const Pattern = """(([^""\\]|\\.)*)""|[+\-]?(?:0|[1-9]\d*)(?:\.\d*)?(?:[eE][+\-]?\d+)?|\w+|[^\s""']+?"
Tokenize = RExtract(s, Pattern, True)
End Function
Function RExtract(s$, Pattern, Optional bGroup1Bias As Boolean, Optional bGlobal As Boolean = True)
Dim c&, m, n, v
With CreateObject("vbscript.regexp")
.Global = bGlobal
.MultiLine = False
.IgnoreCase = True
.Pattern = Pattern
If .test(s) Then
Set m = .Execute(s)
ReDim v(1 To m.Count)
For Each n In m
c = c + 1
v(c) = n.Value
If bGroup1Bias Then If Len(n.subMatches(0)) Or n.Value = """""" Then v(c) = n.subMatches(0)
Next
End If
End With
RExtract = v
End Function
Function ArrayID$(e)
ArrayID = "(" & e & ")"
End Function
Function ReducePath$(key$)
If InStr(key, ".") Then ReducePath = Left(key, InStrRev(key, ".") - 1) Else ReducePath = key
End Function
Function ListPaths(dic)
Dim s$, v
For Each v In dic
s = s & v & " --> " & dic(v) & vbLf
Next
End Function
Function GetFilteredValues(dic, Match)
Dim c&, i&, v, w
v = dic.Keys
ReDim w(1 To dic.Count)
For i = 0 To UBound(v)
If v(i) Like Match Then
c = c + 1
w(c) = dic(v(i))
End If
Next
ReDim Preserve w(1 To c)
GetFilteredValues = w
End Function
Function GetFilteredTable(dic, cols)
Dim c&, i&, j&, v, w, z
v = dic.Keys
z = GetFilteredValues(dic, cols(0))
ReDim w(1 To UBound(z), 1 To UBound(cols) + 1)
For j = 1 To UBound(cols) + 1
z = GetFilteredValues(dic, cols(j - 1))
For i = 1 To UBound(z)
w(i, j) = z(i)
Next
Next
GetFilteredTable = w
End Function
Function OpenTextFile$(f)
With CreateObject("ADODB.Stream")
.Charset = "utf-8"
.Open
.LoadFromFile f
OpenTextFile = .ReadText
End With
End Function
Sample Data
{
"op": "mcm",
"clk": "4985835795",
"pt": 1641446989915,
"mc": [
{
"id": "1.193068822",
"rc": [
{
"batl": [
[
0,
85,
10
]
],
"ltp": 0.0,
"tv": 0.0,
"id": 536162
},
{
"batl": [
[
0,
85,
5
]
],
"ltp": 0.0,
"tv": 0.0,
"id": 36024159
},
{
"batb": [
[
0,
1.02,
322
]
],
"ltp": 0.0,
"tv": 0.0,
"id": 22734575
},
{
"batb": [
[
0,
1.02,
322
]
],
"ltp": 0.0,
"tv": 0.0,
"id": 16286497
},
{
"batb": [
[
0,
1.02,
322
]
],
"ltp": 0.0,
"tv": 0.0,
"id": 18673417
},
{
"batl": [
[
0,
85,
10
]
],
"ltp": 0.0,
"tv": 0.0,
"id": 38835846
},
{
"batl": [
[
0,
85,
10
]
],
"ltp": 0.0,
"tv": 0.0,
"id": 1527689
},
{
"batb": [
[
0,
1.02,
322
]
],
"ltp": 0.0,
"tv": 0.0,
"id": 36394575
},
{
"batb": [
[
0,
1.02,
322
]
],
"ltp": 0.0,
"tv": 0.0,
"id": 1276636
}
]
}
]
}

I'm using this library to parse a json file with 4560 objects and 532kb under 3 seconds.
https://github.com/VBA-tools/VBA-JSON
Sub main()
Dim FSO As New FileSystemObject
Dim JsonTS As TextStream
Dim JsonText As String
Dim Parsed As Dictionary
Set JsonTS = FSO.OpenTextFile("sample4.json", ForReading)
JsonText = JsonTS.ReadAll
JsonTS.Close
Set Parsed = JsonConverter.ParseJson(JsonText)
For i = 1 To Parsed("people").Count
With Sheets(1)
Range("A" & i).Value = Parsed("people")(i)("firstName")
Range("B" & i).Value = Parsed("people")(i)("lastName")
Range("C" & i).Value = Parsed("people")(i)("gender")
Range("D" & i).Value = Parsed("people")(i)("age")
Range("E" & i).Value = Parsed("people")(i)("number")
End With
Next
End Sub
{
"people": [
{
"firstName": "Joe",
"lastName": "Jackson",
"gender": "male",
"age": 28,
"number": "7349282382"
}
]
}

Related

JSON to Excel PowerQuery import - how to get a row per nested field

I'm looking to use the Excel Power Query to import some json that looks like the following (but much bigger, more fields etc.):
example-records.json
{
"records": {
"record_id_1": {
"file_no": "5792C",
"loads": {
"load_id_1": {
"docket_no": "3116115"
},
"load_id_2": {
"docket_no": "3116118"
},
"load_id_3": {
"docket_no": "3208776"
}
}
},
"record_id_2": {
"file_no": "5645C",
"loads": {
"load_id_4": {
"docket_no": "2000527155"
},
"load_id_5": {
"docket_no": "2000527156"
},
"load_id_6": {
"docket_no": "2000527146"
}
}
}
}
}
I want to get a table like the following at the load_id / docket level. A row per load_id
What I've tried
Clicking buttons in power query UI I get the following.
The problem is I can't include a file_no column and this doesn't work when there are lots of load ids.
let
Source = Json.Document(File.Contents("H:\Software\Site Apps\example-records.json")),
records = Source[records],
#"Converted to Table" = Record.ToTable(records),
#"Expanded Value" = Table.ExpandRecordColumn(#"Converted to Table", "Value", {"file_no", "loads"}, {"Value.file_no", "Value.loads"}),
#"Removed Columns" = Table.RemoveColumns(#"Expanded Value",{"Value.file_no"}),
#"Expanded Value.loads" = Table.ExpandRecordColumn(#"Removed Columns", "Value.loads", {"load_id_1", "load_id_2", "load_id_3", "load_id_4", "load_id_5", "load_id_6"}, {"Value.loads.load_id_1", "Value.loads.load_id_2", "Value.loads.load_id_3", "Value.loads.load_id_4", "Value.loads.load_id_5", "Value.loads.load_id_6"}),
#"Unpivoted Columns" = Table.UnpivotOtherColumns(#"Expanded Value.loads", {"Name"}, "Attribute", "Value"),
#"Expanded Value1" = Table.ExpandRecordColumn(#"Unpivoted Columns", "Value", {"docket_no"}, {"Value.docket_no"})
in
#"Expanded Value1"

You can use
let Source = JSON(Json.Document(File.Contents("c:\temp\example.json"))),
#"Removed Other Columns" = Table.SelectColumns(Source,{"Name.1", "Name.3", "Value"}),
#"Added Custom" = Table.AddColumn(#"Removed Other Columns", "Custom", each if [Name.3]=null then [Value] else null),
#"Filled Down" = Table.FillDown(#"Added Custom",{"Custom"}),
#"Filtered Rows" = Table.SelectRows(#"Filled Down", each ([Name.3] <> null))
in #"Filtered Rows"
based on this function I named JSON which comes from Imke https://www.thebiccountant.com/2018/06/17/automatically-expand-all-fields-from-a-json-document-in-power-bi-and-power-query/ which is reproduced below
let
func = (JSON) =>
let
Source = JSON,
ParseJSON = try Json.Document(Source) otherwise Source,
TransformForTable =
if Value.Is(ParseJSON, type record) then
Record.ToTable(ParseJSON)
else
#table(
{"Name", "Value"},
List.Zip({List.Repeat({0}, List.Count(ParseJSON)), ParseJSON})
),
AddSort = Table.Buffer(Table.AddColumn(TransformForTable, "Sort", each 0)),
LG = List.Skip(
List.Generate(
() => [Next = AddSort, Counter = 1, AddIndex = #table({"Sort"}, {{""}})],
each [AddIndex]{0}[Sort] <> "End",
each [
AddIndex = Table.AddIndexColumn([Next], "Index", 0, 1),
MergeSort = Table.CombineColumns(
Table.TransformColumnTypes(
AddIndex,
{{"Sort", type text}, {"Index", type text}},
"en-GB"
),
{"Sort", "Index"},
Combiner.CombineTextByDelimiter(".", QuoteStyle.None),
"Sort"
),
PJson = Table.TransformColumns(
MergeSort,
{{"Value", each try Json.Document(_) otherwise _}}
),
AddType = Table.AddColumn(
PJson,
"Type",
each
if Value.Is([Value], type record) then
"Record"
else if Value.Is([Value], type list) then
"List"
else if Value.Is([Value], type table) then
"Table"
else
"other"
),
AddStatus = Table.AddColumn(
AddType,
"Status",
each if [Type] = "other" then "Finished" else "Unfinished"
),
Finished = Table.SelectRows(AddStatus, each ([Status] = "Finished")),
Unfinished = Table.SelectRows(AddStatus, each ([Status] = "Unfinished")),
AddNext = Table.AddColumn(
Unfinished,
"Next",
each if [Type] = "Record" then {[Value]} else [Value]
),
RemoveCols = Table.RemoveColumns(AddNext, {"Value", "Type", "Status"}),
ExpandNext = Table.ExpandListColumn(RemoveCols, "Next"),
AddIndex2 = Table.AddIndexColumn(ExpandNext, "Index", 0, 1),
MergeSort2 = Table.CombineColumns(
Table.TransformColumnTypes(
AddIndex2,
{{"Sort", type text}, {"Index", type text}},
"en-GB"
),
{"Sort", "Index"},
Combiner.CombineTextByDelimiter(".", QuoteStyle.None),
"Sort"
),
TransformRecord = Table.TransformColumns(
MergeSort2,
{
{
"Next",
each try
Record.ToTable(_)
otherwise
try
if Value.Is(Text.From(_), type text) then
#table({"Value"}, {{_}})
else
_
otherwise
_
}
}
),
FilterOutNulls = Table.SelectRows(TransformRecord, each [Next] <> null),
Next =
if Table.IsEmpty(FilterOutNulls) then
#table({"Sort"}, {{"End"}})
else if Value.Is(FilterOutNulls[Next]{0}, type table) = true then
Table.ExpandTableColumn(
FilterOutNulls,
"Next",
{"Name", "Value"},
{"Name." & Text.From([Counter]), "Value"}
)
else
Table.RenameColumns(FilterOutNulls, {{"Next", "Value"}}),
Counter = [Counter] + 1
],
each Table.AddColumn([Finished], "Level", (x) => _[Counter] - 2)
)
),
Check = LG{2},
Combine = Table.Combine(LG),
Clean = Table.RemoveColumns(Combine, {"Status", "Type"}),
Trim = Table.TransformColumns(Clean, {{"Sort", each Text.Trim(_, "."), type text}}),
// Dynamic Padding for the sort-column so that it sorts by number in text strings
SelectSort = Table.SelectColumns(Trim, {"Sort"}),
SplitSort = Table.AddColumn(
SelectSort,
"Custom",
each List.Transform(try Text.Split([Sort], ".") otherwise {}, Number.From)
),
ToTable = Table.AddColumn(
SplitSort,
"Splitted",
each Table.AddIndexColumn(Table.FromColumns({[Custom]}), "Pos", 1, 1)
),
ExpandTable = Table.ExpandTableColumn(ToTable, "Splitted", {"Column1", "Pos"}),
GroupPos = Table.Group(
ExpandTable,
{"Pos"},
{{"All", each _, type table}, {"Max", each List.Max([Column1]), type text}}
),
Digits = Table.AddColumn(GroupPos, "Digits", each Text.Length(Text.From([Max]))),
FilteredDigits = List.Buffer(Table.SelectRows(Digits, each ([Digits] <> null))[Digits]),
SortNew = Table.AddColumn(
Trim,
"SortBy",
each Text.Combine(
List.Transform(
List.Zip({Text.Split([Sort], "."), List.Positions(Text.Split([Sort], "."))}),
each Text.PadStart(_{0}, FilteredDigits{_{1}}, "0")
),
"."
)
),
FilterNotNull = Table.SelectRows(SortNew, each ([Value] <> null)),
Reorder = Table.ReorderColumns(
FilterNotNull,
{"Value", "Level", "Sort", "SortBy"}
& List.Difference(
Table.ColumnNames(FilterNotNull),
{"Value", "Level", "Sort", "SortBy"}
)
),
Dots = Table.AddColumn(
#"Reorder",
"Dots",
each List.Select(Table.ColumnNames(#"Reorder"), (l) => Text.StartsWith(l, "Name"))
),
// This sort is just to view in the query editor. When loaded to the data model it will not be kept. Use "Sort by column" in the data model instead.
Sort = Table.Sort(Dots, {{"SortBy", Order.Ascending}})
in
Sort,
documentation = [
Documentation.Name = " Table.JsonExpandAll ",
Documentation.Description
= " Dynamically expands the <Json> Record and returns values in one column and additional columns to navigate. ",
Documentation.LongDescription
= " Dynamically expands the <Json> Record and returns values in one column and additional columns to navigate. Input can be JSON in binary format or the already parsed JSON. ",
Documentation.Category = " Table ",
Documentation.Version = " 1.2: Added column [Dots] (22/02/2019)",
Documentation.Author = " Imke Feldmann: www.TheBIccountant.com . ",
Documentation.Examples = {[Description = " ", Code = " ", Result = " "]}
]
in
Value.ReplaceType(func, Value.ReplaceMetadata(Value.Type(func), documentation))

Managed to use an added custom column, the action that enables the expansion to one load id per row.
#"Added Custom" = Table.AddColumn(#"Expanded Value", "Custom", each Record.ToTable([Value.loads]))
let
Source = Json.Document(File.Contents("H:\Software\Site Apps\example-records.json")),
records = Source[records],
#"Converted to Table" = Record.ToTable(records),
#"Expanded Value" = Table.ExpandRecordColumn(#"Converted to Table", "Value", {"file_no", "loads"}, {"Value.file_no", "Value.loads"}),
#"Added Custom" = Table.AddColumn(#"Expanded Value", "Custom", each Record.ToTable([Value.loads])),
#"Removed Columns" = Table.RemoveColumns(#"Added Custom",{"Value.loads"}),
#"Expanded Custom" = Table.ExpandTableColumn(#"Removed Columns", "Custom", {"Name", "Value"}, {"Custom.Name", "Custom.Value"}),
#"Expanded Custom.Value" = Table.ExpandRecordColumn(#"Expanded Custom", "Custom.Value", {"docket_no"}, {"Custom.Value.docket_no"}),
#"Renamed Columns" = Table.RenameColumns(#"Expanded Custom.Value",{{"Name", "record_id"}, {"Value.file_no", "file_no"}, {"Custom.Name", "load_id"}, {"Custom.Value.docket_no", "docket_no"}})
in
#"Renamed Columns"

Implement function call (grouped running totals) in query

I have a function called fxGroupedRunningTotal (fxGRT) and a query (Totals). I want to call fxGRT within Totals, so that I get a column that shows the grouped running totals. I have only managed to test the fxGRT by importing the Totals query.
Query that uses Totals and calls fxGRT (built for testing the function):
let
Source = Totals,
BufferedValues = List.Buffer(Source[PD]),
BufferedMaterials = List.Buffer(Source[Material]),
RT = Table.FromColumns(
{
Source[Material], Source[Date], Source[PD],
fxGroupedRunningTotal(BufferedValues, BufferedMaterials)
},
{
"Material",
"Date",
"PD",
"Running Total"
})
in
RT
fxGroupedRunningTotals:
(values as list, grouping as list) as list =>
let
GRTList = List.Generate
(
()=> [ GRT = values{0}, i = 0 ],
each [i] < List.Count(values),
each try if grouping{[i]} = grouping{[i] + 1}
then [GRT = [GRT] + values {[i] + 1}, i = [i] + 1]
else [GRT = values{[i] + 1}, i = [i] + 1]
otherwise [i = [i] + 1]
,
each [GRT]
)
in
GRTList
Totals:
let
Källa = Table.NestedJoin(Cohv,{"Prod MDate"},Resb,{"Del Mdate"},"Resb",JoinKind.FullOuter),
#"Expanderad Resb" = Table.ExpandTableColumn(Källa, "Resb", {"Del Material", "Del Date", "Del Qty", "Del Mdate"}, {"Del Material", "Del Date", "Del Qty", "Del Mdate"}),
#"PD Date" = Table.AddColumn(#"Expanderad Resb", "PD Date", each if [Prod Date] = null then [Del Date] else [Prod Date]),
#"PD Material" = Table.AddColumn(#"PD Date", "PD Material", each if [Material Number] = null then [Del Material] else [Material Number]),
#"PD Mdate" = Table.AddColumn(#"PD Material", "PD Mdate", each [PD Material] & "." & Date.ToText([PD Date])),
#"Borttagna kolumner" = Table.RemoveColumns(#"PD Mdate",{"Prod Date", "Prod MDate", "Del Date", "Del Mdate"}),
#"Ändrad typ1" = Table.TransformColumnTypes(#"Borttagna kolumner",{{"PD Date", type date}}),
#"Ihopslagna frågor" = Table.NestedJoin(#"Ändrad typ1",{"PD Material"},Matmas,{"Material"},"Matmas",JoinKind.FullOuter),
#"Expanderad Matmas" = Table.ExpandTableColumn(#"Ihopslagna frågor", "Matmas", {"Material", "Material Description", "MRP Controller", "Safety stock", "Minimum Lot Size", "In Stock"}, {"Material", "Material Description", "MRP Controller", "Safety stock", "Minimum Lot Size", "In Stock"}),
#"Omdöpta kolumner" = Table.RenameColumns(#"Expanderad Matmas",{{"Material", "M Material"}}),
#"Lägg till egen" = Table.AddColumn(#"Omdöpta kolumner", "Material", each if [PD Material] = null then [M Material]else [PD Material]),
#"Borttagna kolumner1" = Table.RemoveColumns(#"Lägg till egen",{"Material Number", "Del Material", "PD Material", "M Material"}),
#"Lägg till egen1" = Table.AddColumn(#"Borttagna kolumner1", "Date", each if [PD Date] = null then DateTime.LocalNow() else [PD Date]),
#"Borttagna kolumner2" = Table.RemoveColumns(#"Lägg till egen1",{"PD Date"}),
#"Lägg till egen2" = Table.AddColumn(#"Borttagna kolumner2", "Date in stock", each if [Date] = DateTime.Date(DateTime.LocalNow()) then [In Stock] else null),
#"Borttagna kolumner3" = Table.RemoveColumns(#"Lägg till egen2",{"Date in stock"}),
#"Ändrad typ" = Table.TransformColumnTypes(#"Borttagna kolumner3",{{"Date", type date}}),
#"Ersatt värde" = Table.ReplaceValue(#"Ändrad typ",null,0,Replacer.ReplaceValue,{"Prod Qty"}),
#"Ersatt värde1" = Table.ReplaceValue(#"Ersatt värde",null,0,Replacer.ReplaceValue,{"Del Qty"}),
#"Ihopslagna frågor1" = Table.NestedJoin(#"Ersatt värde1",{"Date"},Dates,{"Date"},"Dates",JoinKind.RightOuter),
#"Expanderad Dates" = Table.ExpandTableColumn(#"Ihopslagna frågor1", "Dates", {"Current Date"}, {"Current Date"}),
#"Omdöpta kolumner1" = Table.RenameColumns(#"Expanderad Dates",{{"Date", "D Date"}}),
Date = Table.AddColumn(#"Omdöpta kolumner1", "Date", each if [D Date] is null then DateTime.Date(DateTime.LocalNow()) else [D Date]),
PD = Table.AddColumn(Date, "PD", each if [Current Date] = "Yes" then [In Stock]+[Prod Qty]-[Del Qty] else [Prod Qty]-[Del Qty])
in
PD
So how do I implement this function into a new column in my Totals? My attempts keep failing. Do I have to make a reference to Totals in order to make this work? It feels so wrong, since that would double the work load (?) with the data. I would like it to be as quick as possible.

You can reference the previous step as a table. Thus your query can be written
let
[...all your previous steps...]
PD = Table.AddColumn(Date, "PD", each if [Current Date] = "Yes" then [In Stock]+[Prod Qty]-[Del Qty] else [Prod Qty]-[Del Qty]),
RT =
Table.FromColumns(
List.Combine({Table.ToColumns(PD), {fxGroupedRunningTotals(PD[Material], PD[PD])}}),
List.Combine({Table.ColumnNames(PD), {"Running Total"}})
)
in
RT
This converts the table to a list of columns, adds on the new running total column (calling the function you've defined on specific columns of the table from the previous step PD), and then glues those columns back together with a similar method to preserve the column names and add a new one.

How to convert json to FoxPro cursor

File contains invoice list in json format like:
{
"status": "OK", "statusCode": 200, "messages": null,
"data": [{
"payment_type": "banktransfer", "fine": "0.200000", "quote_id": null, "order_id": null, "prepayment_id": null, "credited_invoices": [],
"interested_party_address_id": 279, "project_id": 875, "currency": "EUR", "owner_id": 3, "date": "2019-03-15", "deadline": "2019-03-20",
},
{
"payment_type": "banktransfer", "fine": "0.30000", "quote_id": null, "order_id": null, "prepayment_id": null, "credited_invoices": [],
"interested_party_address_id": 79, "project_id": 85, "currency": "EUR", "owner_id": 3, "date": "2019-04-15", "deadline": "2019-43-20",
}
.... more same type elements
]
}
How to convert it to FoxPRo cursor ?
Cursor should contain payment_type, fine, quote_id etc columns.
I tried
https://archive.codeplex.com/?p=qdfoxjson
and
http://www.sweetpotatosoftware.com/blog/index.php/2008/12/19/visual-foxpro-json-class-update/
but it looks like they require json to be in different format than my json.
Andrus

First thing is your JSON is invalid. There shouldn't be a hanging comma after the "deadline" date values.
The second thing is that there is no mechanism to tell VFP how to map elements to fields. So you have to do it yourself.
To do this I would use the njJson library, specifically nfJsonRead.
Assuming your JSON is in file 'test.json' and you have nfJsonRead.prg then for example:
Close All
Clear All
Clear
Set Procedure To nfJsonRead additive
* -- Add the other fields as appropriate.
Create Cursor mydata (payment_type c(20), fine n(12, 2))
loJson = nfjsonread(FileToStr("test.json"))
For each loData in loJson.data
Insert into mydata values (loData.payment_type, Val(loData.fine))
EndFor

Just a quick something I came up with for you. If I had more stuff in similar format, I would have done some cursor/table structure of "looking for begin delimiter", "looking for end delimiter", "final data type", etc... then looping through for each part. Here, I actually am showing how each part is being extracted and how to get one data row at a time.
** Example to put into a string called "json", no other context to json
TEXT TO json noshow
{
"status": "OK", "statusCode": 200, "messages": null,
"data": [{1
"payment_type": "banktransfer", "fine": "0.200000", "quote_id": null, "order_id": null, "prepayment_id": null, "credited_invoices": [],
"interested_party_address_id": 279, "project_id": 875, "currency": "EUR", "owner_id": 3, "date": "2019-03-15", "deadline": "2019-03-20",
},
{2
"payment_type": "banktransfer", "fine": "0.30000", "quote_id": null, "order_id": null, "prepayment_id": null, "credited_invoices": [],
"interested_party_address_id": 79, "project_id": 85, "currency": "EUR", "owner_id": 3, "date": "2019-04-15", "deadline": "2019-43-20",
}
.... more same type elements
]
}
ENDTEXT
** Strip the "Status" component ang get just the Data
json = SUBSTR( json, ATC( ["data":], json ))
CREATE CURSOR C_ImportRecs;
( PayType c(20),;
Fine n(10,6),;
QuoteID i,;
OrderID i,;
PrepayID i,;
Invoices c(20),;
IntPartyAddrID i,;
ProjectID i,;
TransCurrency c(3),;
OwnerID i,;
DatePart c(15),;
TransDate d,;
DeadPart c(15),;
Deadline d )
** scatter all fields to an object which will have fields by same name
SCATTER NAME newRec
** start counter to 1
nRow = 1
DO while .t.
** Look within the json string for the "nRow" instance of data starting / ending with { / }
oneData = STREXTRACT( json, "{", "}", nRow )
** if no more entries, get out
IF EMPTY( ALLTRIM( oneData ))
exit
ENDIF
** if memory variables are same name as cursor/table columns, simplifies insert
newRec.PayType = STREXTRACT( oneData, ["payment_type": "], ["] )
newRec.Fine = VAL( STREXTRACT( oneData, ["fine": "], ["] ))
** not including closing quote as ending delimiter since null allowed, looking for the "," after the IDs
newRec.QuoteID = INT( VAL( STREXTRACT( oneData, ["quote_id": ], [,] ) ))
newRec.OrderID = INT( VAL( STREXTRACT( oneData, ["order_id": ], [,] ) ))
newRec.PrepayID = INT( VAL( STREXTRACT( oneData, ["prepayment_id": ], [,] ) ))
newRec.Invoices = STREXTRACT( oneData, ["credited_invoices": ], [,] )
newRec.IntPartyAddrID = INT( VAL( STREXTRACT( oneData, ["interested_party_address_id": ], [,] ) ))
newRec.ProjectID = INT( VAL( STREXTRACT( oneData, ["project_id": ], [,] ) ))
** Currency is reserved word.
newRec.TransCurrency = STREXTRACT( oneData, ["currency": "], ["] )
newRec.OwnerID = INT( VAL( STREXTRACT( oneData, ["owner_id": ], [,] ) ))
** Date is reserved word
newRec.DatePart = ""
newRec.DatePart = STREXTRACT( oneData, ["date": "], ["] )
** Nice for macro substitution. Take incoming date format of ex: "2019-03-15"
** and change to "2019,03,15" This is because the DATE() function expects
** 3 parameters for YEAR, MONTH, DAY
tryDate = STRTRAN( newRec.DatePart, "-", "," )
** Now, by using macro via "&" forces the line below to immediately run as if
** run as if it was DATE( 2019,03,15) and create an actual date value
TRY
newRec.TransDate = DATE( &tryDate )
CATCH
newRec.TransDate = CTOD( "" )
ENDTRY
** same for deadline
newRec.DeadPart = STREXTRACT( oneData, ["deadline": "], ["] )
tryDate = STRTRAN( newRec.DeadPart, "-", "," )
TRY
** your second record has a month 43 which is an invalid month
newRec.Deadline = DATE( &tryDate )
CATCH
newRec.Deadline = CTOD( "" )
ENDTRY
** Since the cursor "C_ImportRecs" is already the active focus,
** we can just insert from memory variables
INSERT INTO C_ImportRecs FROM NAME newRec
** try if any more records
nRow = nRow + 1
ENDDO
BROWSE NORMAL NOCAPTIONS NOWAIT

Returning Null JSON Throwing Type Mismatch Error in VBA

Problem
Description
I'm currently running into a problem where the JSON that I am getting back has fields that are null.
In the code below, I've figured out that most of the fields have an assignee, and down another level assignees have a displayName. I've also found out that some things do not have an assignee. When that happens (and this would probably happen with other fields too, I'm just using this as an example) it removes that additional heirarchy level, and the actual path (also shown below) would be changed.
Question
Is there an easy way to iterate over this response, and set nulls to blanks maybe?
Set Json = JsonConverter.ParseJson(MyRequest.ResponseText)
That doesn't really help me with automation though. Notice [below] where I list components twice, because I don't know how to loop through that data and pull back the field as many times as it needs to be populated. Aka I know there are two components, but it only brings back one component, so I had to copy that code to get it to work correctly (I'm sorry for copying).
Code Snip
My code works great until it hits a null, then it throws an error.
''''''''
' Loop '
''''''''
For i = 0 To 40
' ActiveSheet.Cells(i + 1, 1) = Json("issues")(i + 1)("fields")("issuetype")("name")
' ActiveSheet.Cells(i + 1, 2) = Json("issues")(i)("key")
' ActiveSheet.Cells(i + 1, 3) = Json("issues")(i + 1)("fields")("summary")
' ActiveSheet.Cells(i + 1, 4) = Json("issues")(i + 1)("fields")("status")("name")
ActiveSheet.Cells(i + 1, 5) = Json("issues")(i + 1)("fields")("assignee")
ActiveSheet.Cells(i + 1, 5) = Json("issues")(i + 1)("fields")("assignee")("displayName")
' ActiveSheet.Cells(i + 1, 6) = Json("issues")(i + 1)("fields")("customfield_13301")
' ActiveSheet.Cells(i + 1, 7) = Json("issues")(i + 1)("fields")("components")(1)("name")
' ActiveSheet.Cells(i + 1, 8) = Json("issues")(i + 1)("fields")("components")(2)("name")
' ActiveSheet.Cells(i + 1, 9) = Json("issues")(i + 1)("fields")("customfield_13300")
' ActiveSheet.Cells(i + 1, 10) = Json("issues")(i + 1)("fields")("customfield_10002")
Next i
JSON
Ovbiously I had to delete some content for privacy reasons, but that shows the assignee as null. JSON with a, "displayName" just turns that null into an Array and has more fields under it.
{
"expand": "schema,names",
"startAt": 0,
"maxResults": 50,
"total": 52,
"issues": [
{},
{},
{},
{},
{},
{},
{},
{},
{},
{},
{},
{},
{},
{},
{},
{},
{},
{},
{
"expand": "operations,versionedRepresentations,editmeta,changelog,renderedFields",
"id": "92110",
"self": "",
"key": "",
"fields": {
"customfield_13100": null,
"fixVersions": [],
"customfield_13500": null,
"customfield_11200": null,
"resolution": null,
"customfield_13502": null,
"customfield_13501": null,
"lastViewed": null,
"customfield_12000": null,
"customfield_12002": null,
"customfield_12001": null,
"priority": {},
"customfield_10100": null,
"customfield_10101": null,
"customfield_12003": null,
"customfield_12402": null,
"labels": [],
"customfield_11303": null,
"customfield_11305": null,
"customfield_11306": null,
"aggregatetimeoriginalestimate": null,
"timeestimate": null,
"versions": [],
"issuelinks": [],
"assignee": null,
"status": {},
"components": [],
"customfield_13200": null,
"customfield_13600": null,
"customfield_12900": null,
"aggregatetimeestimate": null,
"creator": {},
"customfield_14000": null,
"subtasks": [],
"customfield_14400": null,
"reporter": {},
"customfield_12101": null,
"customfield_12100": null,
"aggregateprogress": {},
"customfield_14401": null,
"customfield_14402": null,
"customfield_12500": null,
"customfield_13702": null,
"customfield_13704": null,
"customfield_13703": null,
"customfield_11802": null,
"progress": {},
"votes": {},
"issuetype": {},
"timespent": null,
"project": {},
"customfield_13300": null,
"aggregatetimespent": null,
"customfield_13302": null,
"customfield_13301": null,
"customfield_13700": null,
"customfield_11400": null,
"resolutiondate": null,
"workratio": -1,
"watches": {},
"created": "2017-07-21T08:04:42.000-0500",
"customfield_14102": null,
"customfield_10020": null,
"customfield_12200": null,
"customfield_14100": null,
"customfield_14101": null,
"customfield_12600": null,
"customfield_14500": null,
"customfield_10300": null,
"customfield_10016": null,
"customfield_13405": null,
"customfield_10017": null,
"customfield_13800": null,
"customfield_10018": null,
"customfield_10019": null,
"customfield_13409": null,
"updated": "2017-08-10T15:29:37.000-0500",
"timeoriginalestimate": null,
"description": null,
"customfield_10011": null,
"customfield_10012": null,
"customfield_13401": null,
"customfield_13400": null,
"customfield_10013": null,
"customfield_10014": null,
"customfield_11500": "{}",
"customfield_10015": null,
"customfield_13514": null,
"summary": "",
"customfield_14200": null,
"customfield_10000": null,
"customfield_13511": null,
"customfield_12301": null,
"customfield_10001": null,
"customfield_12300": null,
"customfield_10002": "1|i021pe:5z",
"customfield_13510": null,
"customfield_13513": null,
"customfield_10003": [],
"customfield_12302": null,
"customfield_10004": null,
"customfield_13504": null,
"customfield_13503": null,
"customfield_11600": null,
"customfield_13506": null,
"environment": null,
"customfield_13901": null,
"customfield_13505": null,
"customfield_13508": null,
"duedate": null,
"customfield_13509": null
}
},
{},
{},
{},
{},
{},
{},
{},
{},
{},
{},
{},
{},
{},
{},
{},
{},
{},
{},
{},
{},
{},
{},
{},
{},
{},
{},
{},
{},
{},
{},
{}
]
}
Additional Data
I looked at the Raw file just to see if anything looked different (than it did in my JSON Formater Plugin for Chrome) and this is what it looked like:
"assignee":null,

Working with JSON files is much easier (IMHO) if you understand how the JsonConverter processes the JSON into a compound object. Let's look at a simple JSON format (taken from this useful site):
{
"array": [
1,
2,
3
],
"boolean": true,
"null": null,
"number": 123,
"object": {
"a": "b",
"c": "d",
"e": "f"
},
"string": "Hello World"
}
The JsonConverter maps each of these data items into their VBA counterparts.
"array" maps to Collection (anytime you see the square brackets [])
"boolean" maps to Boolean
"null" maps to Null
"number" maps to Double
"object" maps to Dictionary (anytime you see the curly braces {})
"string" maps to String
So now we can do useful things with your JSON example, such as determine how many entires are in your "issues" array by
Dim issues As Collection
Set issues = schema("issues")
Debug.Print issues.Count
Each of the entries in your "issues" array is actually a compound object itself, so it's a Dictionary. We could, therefore, do something like this:
Dim issue As Variant
For Each issue In issues
If issue.Exists("id") Then
Debug.Print "id = " & issue("id")
End If
Next issue
Of course, the "fields" section of this single issue is itself another Dictionary. So stacking up the dictionary references we can do this too:
Debug.Print "field summary is " & issue("fields")("summary")
All of this is background, hopefully to make it easier on accessing members of a JSON structure. Your real question is on handling NULLs. If the actual value of a field is set to null (see the above sample), then you check it like so
If IsNull(issue("fields")("customfield_13500")) Then ...
A couple of other side notes before we put it all together:
Always use Option Explicit
Avoid Select and Activate
Always define and set references to all Workbooks and Sheets
In the example below, you'll see that I assumed you had to check each field for Null. That is best accomplished by isolating that check in a subroutine rather than over-mess your code with a long string of If statements. The advantage of the code example below is that you don't have to hard-code the number of issues because your logic can detect how many there are.
Option Explicit
Sub main()
Dim schema As Object
Set schema = GetJSON("C:\dev\junk.json")
Dim thisWB As Workbook
Dim destSH As Worksheet
Set thisWB = ThisWorkbook
Set destSH = thisWB.Sheets("Sheet1")
Dim anchor As Range
Set anchor = destSH.Range("A1")
Dim issues As Collection
Set issues = schema("issues")
Dim i As Long
Dim issue As Variant
For Each issue In issues
If issue.Exists("id") Then
SetCell anchor.Cells(1, 1), issue("fields")("issuetype")("name")
SetCell anchor.Cells(1, 2), issue("key")
SetCell anchor.Cells(1, 3), issue("fields")("summary")
'--- if you're not sure if the "name" field is there,
' then remember it's a Dictionary so check with Exists
If issue("fields")("status").Exists("name") Then
SetCell anchor.Cells(1, 4), issue("fields")("status")("name")
Else
SetCell anchor.Cells(1, 4), vbNullString
End If
SetCell anchor.Cells(1, 5), issue("fields")("assignee")
SetCell anchor.Cells(1, 6), issue("fields")("customfield_13301")
'--- possibly get the Count and iterate over the exact number of components
For i = 0 To issue("fields")("components").Count - 1
SetCell anchor.Cells(1, 7), issue("fields")("components")(i)("name")
Next i
SetCell anchor.Cells(1, 9), issue("fields")("customfield_13300")
SetCell anchor.Cells(1, 10), issue("fields")("customfield_10002")
Set anchor = anchor.Offset(1, 0)
End If
Next issue
End Sub
Function GetJSON(ByVal filename As String) As Object
'--- first ingest the JSON file and get it parsed
Dim fso As FileSystemObject
Dim jsonTS As TextStream
Dim jsonText As String
Set fso = New FileSystemObject
Set jsonTS = fso.OpenTextFile(filename, ForReading)
jsonText = jsonTS.ReadAll
Set GetJSON = JsonConverter.ParseJson(jsonText)
End Function
Private Sub SetCell(ByRef thisCell As Range, ByVal thisValue As Variant)
If IsNull(thisValue) Then
thisCell = vbNullString
Else
thisCell = thisValue
End If
End Sub

Fix
This is what I did to get this to work:
If IsNull(Json("issues")(i + 1)("fields")("components")) Then
ActiveSheet.Cells(i + 1, 5).Value = ""
Else
ActiveSheet.Cells(i + 1, 7) = Json("issues")(i + 1)("fields")("components")(1)("name")
End If

You may get JSON data into arrays as shown in the example code below. Import JSON.bas module into the VBA project for JSON processing.
Sub Test()
' Put sourse JSON string to "\source.json" file, and save as ANSI or Unicode
Dim sJSONString As String
Dim vJSON As Variant
Dim sState As String
Dim aData()
Dim aHeader()
sJSONString = ReadTextFile(ThisWorkbook.Path & "\source.json", -2)
JSON.Parse sJSONString, vJSON, sState
vJSON = vJSON("issues")
JSON.ToArray vJSON, aData, aHeader
With Sheets(1)
.Cells.Delete
.Cells.WrapText = False
OutputArray .Cells(1, 1), aHeader
Output2DArray .Cells(2, 1), aData
.Columns.AutoFit
End With
End Sub
Sub OutputArray(oDstRng As Range, aCells As Variant)
With oDstRng
.Parent.Select
With .Resize(1, UBound(aCells) - LBound(aCells) + 1)
.NumberFormat = "#"
.Value = aCells
End With
End With
End Sub
Sub Output2DArray(oDstRng As Range, aCells As Variant)
With oDstRng
.Parent.Select
With .Resize( _
UBound(aCells, 1) - LBound(aCells, 1) + 1, _
UBound(aCells, 2) - LBound(aCells, 2) + 1)
.NumberFormat = "#"
.Value = aCells
End With
End With
End Sub
Function ReadTextFile(sPath As String, lFormat As Long) As String
' lFormat -2 - System default, -1 - Unicode, 0 - ASCII
With CreateObject("Scripting.FileSystemObject").OpenTextFile(sPath, 1, False, lFormat)
ReadTextFile = ""
If Not .AtEndOfStream Then ReadTextFile = .ReadAll
.Close
End With
End Function

how do you convert a json entries in a file into a data frame?

I am trying to read files that has json content and convert that to tabular data based on some fields.
The file includes content like this:
{"senderDateTimeStamp":"2016/04/08 10:03:18","senderHost":null,"senderCode":"web_app","senderUsecase":"appinternalstats_prod","destinationTopic":"web_app_appinternalstats_realtimedata_topic","correlatedRecord":false,"needCorrelationCacheCleanup":false,"needCorrelation":false,"correlationAttributes":null,"correlationRecordCount":0,"correlateTimeWindowInMills":0,"lastCorrelationRecord":false,"realtimeESStorage":true,"receiverDateTimeStamp":1460124283554,"payloadData":{"timestamp":"2016-04-08T10:03:18.244","status":"get","source":"MSG1","ITEM":"TEST1","basis":"","pricingdate":"","content":"","msgname":"","idlreqno":"","host":"web01","Webservermember":"Web"},"payloadDataText":"","key":"web_app:appinternalstats_prod","destinationTopicName":"web_app_appinternalstats_realtimedata_topic","esindex":"web_app","estype":"appinternalstats_prod","useCase":"appinternalstats_prod","Code":"web_app"}
I need to be able to convert timestamp, source, host, status fields withing payloadData section for each line into a data frame in R.
I've tried this:
library(rjson)
d<-fromJSON(file="file.txt")
dput(d)
structure(list(senderDateTimeStamp = "2016/04/08 10:03:18", senderHost = NULL,
senderAppcode = "web", senderUsecase = "appinternalstats_prod",
destinationTopic = "web_appinternalstats_realtimedata_topic",
correlatedRecord = FALSE, needCorrelationCacheCleanup = FALSE,
needCorrelation = FALSE, correlationAttributes = NULL, correlationRecordCount = 0,
correlateTimeWindowInMills = 0, lastCorrelationRecord = FALSE,
realtimeESStorage = TRUE, receiverDateTimeStamp = 1460124283554,
payloadData = structure(list(timestamp = "2016-04-08T10:03:18.244",
status = "get", source = "MSG1",
region = "", evetid = "", osareqid = "", basis = "",
pricingdate = "", content = "", msgname = "", recipient = "",
objid = "", idlreqno = "", host = "web01", webservermember = "webSingleton"),
.Names = c("timestamp",
"status", "source", "region", "evetid",
"osareqid", "basis", "pricingdate", "content", "msgname",
"recipient", "objid", "idlreqno", "host", "webservermember"
)), payloadDataText = "", key = "web:appinternalstats_prod",
destinationTopicName = "web_appinternalstats_realtimedata_topic",
hdfsPath = "web/appinternalstats_prod", esindex = "web",
estype = "appinternalstats_prod", useCase = "appinternalstats_prod",
appCode = "web"), .Names = c("senderDateTimeStamp", "senderHost",
"senderAppcode", "senderUsecase", "destinationTopic", "correlatedRecord",
"needCorrelationCacheCleanup", "needCorrelation", "correlationAttributes",
"correlationRecordCount", "correlateTimeWindowInMills", "lastCorrelationRecord",
"realtimeESStorage", "receiverDateTimeStamp", "payloadData",
"payloadDataText", "key", "destinationTopicName", "hdfsPath",
"esindex", "estype", "useCase", "appCode"))
Any ideas how I could convert payloadData section of the json entry into a data frame?

This might be something you want:
library(rjson)
d<-fromJSON(file="file.txt")
myDf <- do.call("rbind", lapply(d, function(x) {
data.frame(TimeStamp = x$payloadData$timestamp,
Source = x$payloadData$source,
Host = $payloadData$host,
Status = x$payloadData$status)}))

Consider the package tidyjson:
library(tidyjson)
library(magrittr)
json <- '{"senderDateTimeStamp":"2016/04/08 10:03:18","senderHost":null,"senderCode":"web_app","senderUsecase":"appinternalstats_prod","destinationTopic":"web_app_appinternalstats_realtimedata_topic","correlatedRecord":false,"needCorrelationCacheCleanup":false,"needCorrelation":false,"correlationAttributes":null,"correlationRecordCount":0,"correlateTimeWindowInMills":0,"lastCorrelationRecord":false,"realtimeESStorage":true,"receiverDateTimeStamp":1460124283554,"payloadData":{"timestamp":"2016-04-08T10:03:18.244","status":"get","source":"MSG1","ITEM":"TEST1","basis":"","pricingdate":"","content":"","msgname":"","idlreqno":"","host":"web01","Webservermember":"Web"},"payloadDataText":"","key":"web_app:appinternalstats_prod","destinationTopicName":"web_app_appinternalstats_realtimedata_topic","esindex":"web_app","estype":"appinternalstats_prod","useCase":"appinternalstats_prod","Code":"web_app"}'
json %>%
gather_keys()
# head() of above
# document.id key
# 1 1 senderDateTimeStamp
# 2 1 senderHost
# 3 1 senderCode
# 4 1 senderUsecase
# 5 1 destinationTopic
# 6 1 correlatedRecord
json %>%
enter_object("payloadData") %>%
gather_keys() %>%
append_values_string()
# head() of above
# document.id key string
# 1 1 timestamp 2016-04-08T10:03:18.244
# 2 1 status get
# 3 1 source MSG1
# 4 1 ITEM TEST1
# 5 1 basis
# 6 1 pricingdate

We Keep Coding

html mysql json google-apps-script actionscript-3 ms-access google-chrome google-maps reporting-services sql-server-2008

Parse Large amount of Json in Excel - json

Related

JSON to Excel PowerQuery import - how to get a row per nested field

Implement function call (grouped running totals) in query

How to convert json to FoxPro cursor

Returning Null JSON Throwing Type Mismatch Error in VBA

how do you convert a json entries in a file into a data frame?

Categories

Resources