This is my JSON data:
data = [{"document_id":"FT_3860001798686","party_type":"1","name":"LEE, GABRIEL"},{"document_id":"FT_3860001798686","party_type":"1","name":"MORRISON, VERNA"},{"document_id":"FT_3860001798686","party_type":"2","name":"PIONEER S&L ASSOCIATION"}]
Expected O/P:
data = {"document_id":"FT_3860001798686", "party 1":"1", "name":["LEE, GABRIEL", "MORRISON, VERNA"],"party 2":"2", "name":["PIONEER S&L ASSOCIATION"]}
Suppose if any party_type dnt have any value need to display that value as N/A, like below
data = {"document_id":"FT_3860001798686", "party 1":"1", "name":["LEE, GABRIEL", "MORRISON, VERNA"], "party 2":"2", "name": "N/A"}
What you wrote as your Expected O/P is not possible, because you can not have two elements with the same key ("name" in your sample) in a dictionary.
Related
%dw 2.0
output application/csv quoteValues=true, separator=";", header=true
---
[
{
Aaa: payload.problem.Aaa[0],
Bbb: payload.problem.Bbb[0]
}
]
And this write it to CSV like this:
Aaa;Bbb
1234; Test1
But this payload looks like this:
"problem": {
"Aaa": [
"1234",
"1567",
"105"
],
"Bbb": [
"Test1",
"Test2",
"Test3"
]
}
And I would like to write all 3 value or more depends from my request to my CSV. But now I can do it only if I specify which element from the array (example payload.issues.Aaa[0]). Because under my key Aaa I don't know exactly how many value I will get to response.
This solution assumes that the data for the first column has the same number of rows as all the others.
First I capture the names of columns from the key names in payload.problem, using the function namesOf(), and store the array of names in the variable column.
Then I map over the first 'column' to get the index of each row. Using that index I iterate over each column name obtaining the value for that column at that index, and transforming it into a key-value (column name, value of that column for the row). Finally I use reduce() to concatenate all the key-values from the same row into a single object, which represents the row as DataWeave expects to transform to CSV. If you change the output type to JSON you will see the structure more clearly.
%dw 2.0
output application/csv quoteValues=true, separator=";", header=true
var columns=namesOf(payload.problem)
---
payload.problem[columns[0]] map ((item, index) -> (
columns map ($): payload.problem[$][index])
reduce ($$++$)
)
Output:
Aaa;Bbb
"1234";"Test1"
"1567";"Test2"
"105";"Test3"
If the columns are not of the same size you can use this alternative that will find first a column with the maximum length and use it as the first iteration:
%dw 2.0
output application/csv quoteValues=true, separator=";", header=true
import firstWith from dw::core::Arrays
var columns=namesOf(payload.problem)
var maxColumnSize=max(columns map sizeOf(payload.problem[$]))
var maxColumnName=columns firstWith (sizeOf(payload.problem[$]) == maxColumnSize)
---
payload.problem[maxColumnName]
map ((item, index) -> (
columns map ($): payload.problem[$][index])
reduce ($$++$)
)
Below is the code to convert csv file to json format in python.
I have two fields 'recommendation' and 'rating'. Based on the recommendation value I need to set the value for rating field like if recommendation is 1 then rating =1 and vice versa. With the answer I got I'm getting output for only one record entry instead of getting all the records. I think it's overriding. Do I need to create separate list for that and append each record entry to the list to get the output for all records.
here's the updated code:
def main(input_file):
csv_rows = []
with open(input_file, 'r') as csvfile:
reader = csv.DictReader(csvfile, delimiter='|')
title = reader.fieldnames
for row in reader:
entry = OrderedDict()
for field in title:
entry[field] = row[field]
[c.update({'RATING': c['RECOMMENDATIONS']}) for c in reader]
csv_rows.append(entry)
with open(json_file, 'w') as f:
json.dump(csv_rows, f, sort_keys=True, indent=4, ensure_ascii=False)
f.write('\n')
I want to create the nested format like the below:
"rating": {
"user_rating": {
"rating": 1
},
"recommended": {
"rating": 1
}
After you've read the file in, using the csv.DictReader, you'll have a list of dicts. Since you want to set the values now, it's a simple dict manipulation. There are several ways, of which one is:
[c.update({'rating': c['recommendation']}) for c in read_csvDictReader]
Hope that helps.
I have generated a JSON file from data source which is of the format.
{}{}{}
I wish to convert this format to comma separated JSON Array as. [{},{},{}].
End goal is to push the JSON data [{},{},{}] to MongoDB.
My pythoin solution (although naive) looks something like this:
def CreateJSONArrayFile(filename):
print('Opening file with JSON data')
with open(filename) as data_file:
raw_data = data_file.read()
tweaked_data = raw_data.replace('}{', '}^|{')
split_data = tweaked_data.split('^|')
outfile = open('split_data.json', 'w')
outfile.write('[')
for item in split_data:
outfile.write("%s," % item)
outfile.write(']')
print('split_data.json Created with JSON Array')
The above code is giving me wrong results.
Can you please help me optimize the solution? Please let me know if you need more details from my end.
I'm with davedwards on this one, but if not an option -- I think this gets you what you are after.
myJson = """{"This": "is", "a": "test"} {"Of": "The", "Emergency":"Broadcast"}"""
myJson = myJson.replace("} {", "}###{")
new_list = myJson.split('###')
print(new_list)
yields:
['{"This": "is", "a": "test"}', '{"Of": "The", "Emergency":"Broadcast"}']
Not saying it is the most elegant way : )
So I have a lot of json files structured like this:
{
"Id": "2551faee-20e5-41e4-a7e6-57bd20b02a22",
"Timestamp": "2016-12-06T08:09:57.5541438+01:00",
"EventEntry": {
"EventId": 1,
"Payload": [
"1a3e0c9e-ef69-4c6a-ac8c-9b2de2fbc701",
"DHS.PlanCare.Business.BusinessLogic.VisionModels.VisionModelServiceWithoutUnitOfWork.FetchVisionModelsForClientOnReferenceDateAsync(System.Int64 clientId, System.DateTime referenceDate, System.Threading.CancellationToken cancellationToken)",
25,
"DHS.PlanCare.Business.BusinessLogic.VisionModels.VisionModelServiceWithoutUnitOfWork+<FetchVisionModelsForClientOnReferenceDateAsync>d__11.MoveNext\r\nDHS.PlanCare.Core.Extensions.IQueryableExtensions+<ExecuteAndThrowTaskCancelledWhenRequestedAsync>d__16`1.MoveNext\r\n",
false,
"2197, 6-12-2016 0:00:00, System.Threading.CancellationToken"
],
"EventName": "Duration",
"KeyWordsDescription": "Duration",
"PayloadSchema": [
"instanceSessionId",
"member",
"durationInMilliseconds",
"minimalStacktrace",
"hasFailed",
"parameters"
]
},
"Session": {
"SessionId": "0016e54b-6c4a-48bd-9813-39bb040f7736",
"EnvironmentId": "C15E535B8D0BD9EF63E39045F1859C98FEDD47F2",
"OrganisationId": "AC6752D4-883D-42EE-9FEA-F9AE26978E54"
}
}
How can I create an u-sql query that outputs the
Id,
Timestamp,
EventEntry.EventId and
EventEntry.Payload[2] (value 25 in the example below)
I can't figure out how to extend my query
#extract =
EXTRACT
Timestamp DateTime
FROM #"wasb://xxx/2016/12/06/0016e54b-6c4a-48bd-9813-39bb040f7736/yyy/{*}/{*}.json"
USING new Microsoft.Analytics.Samples.Formats.Json.JsonExtractor();
#res =
SELECT Timestamp
FROM #extract;
OUTPUT #res TO "/output/result.csv" USING Outputters.Csv();
I have seen some examples like:
U- SQL Unable to extract data from JSON file => this only queries one level of the document, I need data from multiple levels.
U-SQL - Extract data from json-array => this only queries one level of the document, I need data from multiple levels.
JSONTuple supports multiple JSONPaths in one go.
#extract =
EXTRACT
Id String,
Timestamp DateTime,
EventEntry String
FROM #"..."
USING new Microsoft.Analytics.Samples.Formats.Json.JsonExtractor();
#res =
SELECT Id, Timestamp, EventEntry,
Microsoft.Analytics.Samples.Formats.Json.JsonFunctions.JsonTuple(EventEntry,
"EventId", "Payload[2]") AS Event
FROM #extract;
#res =
SELECT Id,
Timestamp,
Event["EventId"] AS EventId,
Event["Payload[2]"] AS Something
FROM #res;
You may want to look at this GIT example. https://github.com/Azure/usql/blob/master/Examples/JsonSample/JsonSample/NestedJsonParsing.usql
This take 2 disparate data elements and combines them, like you have the Payload, and Payload schema. If you create key value pairs using the "Donut" or "Cake and Batter" examples you may be able to match the scema up to the payload and use the cross apply explode function.
I am trying to use JSX to convert a list of tuples to a JSON object.
The list items are based on a record definition:
-record(player, {index, name, description}).
and looks like this:
[
{player,1,"John Doe","Hey there"},
{player,2,"Max Payne","I am here"}
]
The query function looks like this:
select_all() ->
SelectAllFunction =
fun() ->
qlc:eval(qlc:q(
[Player ||
Player <- mnesia:table(player)
]
))
end,
mnesia:transaction(SelectAllFunction).
What's the proper way to make it convertable to a JSON knowing that I have a schema of the record used and knowing the structure of tuples?
You'll have to convert the record into a term that jsx can encode to JSON correctly. Assuming you want an array of objects in the JSON for the list of player records, you'll have to either convert each player to a map or list of tuples. You'll also have to convert the strings to binaries or else jsx will encode it to a list of integers. Here's some sample code:
-record(player, {index, name, description}).
player_to_json_encodable(#player{index = Index, name = Name, description = Description}) ->
[{index, Index}, {name, list_to_binary(Name)}, {description, list_to_binary(Description)}].
go() ->
Players = [
{player, 1, "John Doe", "Hey there"},
% the following is just some sugar for a tuple like above
#player{index = 2, name = "Max Payne", description = "I am here"}
],
JSON = jsx:encode(lists:map(fun player_to_json_encodable/1, Players)),
io:format("~s~n", [JSON]).
Test:
1> r:go().
[{"index":1,"name":"John Doe","description":"Hey there"},{"index":2,"name":"Max Payne","description":"I am here"}]