R JSON File to Dataframe using tidyjson

I exported a JSON file from MongoDB in the format below. I'm trying to create a dataframe from it, but I can't seem to get tidyjson to read it; it throws this error:
Error: lexical error: invalid char in json text.
{ "_id" : ObjectId("586e684427a06a4a658fa
(right here) ------^
I used read_json("file.json").
The file is below:
{
"_id" : ObjectId("586e684427a06a4a658fa28e"),
"expires_in" : ISODate("2016-11-19T22:16:57.418+0000"),
"job_type" : "Satellite Sales & Service",
"inbound_id" : ObjectId("586e68440c83945fb2658754"),
"created_at" : ISODate("2017-01-05T15:37:40.850+0000"),
"action_states" : [
{
"_id" : ObjectId("586e684627a06a4a658fa293"),
"transition_duration" : NumberInt(0),
"name" : Symbol("created"),
"actor" : "home_owner",
"created_at" : ISODate("2017-01-05T15:37:42.297+0000")
},
{
"_id" : ObjectId("586e68ad0c83945fb2658825"),
"transition_duration" : NumberInt(1),
"name" : Symbol("accepted"),
"reason" : null,
"actor" : "contractor",
"created_at" : ISODate("2017-01-05T15:39:25.924+0000")
}
]
}
{
"_id" : ObjectId("586e675d27a06a4a658fa264"),
"expires_in" : ISODate("2016-11-19T22:16:57.418+0000"),
"job_type" : "Satellite Sales & Service",
"inbound_id" : ObjectId("586e675d0c83945fa2f6e190"),
"created_at" : ISODate("2017-01-05T15:33:49.934+0000"),
"action_states" : [
{
"_id" : ObjectId("586e675f27a06a4a658fa267"),
"transition_duration" : NumberInt(0),
"name" : Symbol("created"),
"actor" : "home_owner",
"created_at" : ISODate("2017-01-05T15:33:51.097+0000")
},
{
"_id" : ObjectId("586e694c0c83945faae36559"),
"transition_duration" : NumberInt(8),
"name" : Symbol("accepted"),
"reason" : null,
"actor" : "contractor",
"created_at" : ISODate("2017-01-05T15:42:04.116+0000")
}
]
}

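This error occurs because MongoDB produced the file in a shell-style "extended" JSON format, which contains constructs such as ObjectId(...) and ISODate(...) that are not valid JSON. Try exporting your data in "strict" JSON mode instead, for example with mongoexport (https://docs.mongodb.com/manual/reference/mongodb-extended-json/).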

Related

How to read Json file and convert it to dataframe

I am trying to read a JSON file and convert it to a dataframe, but I am running into difficulties as I don't have much knowledge of this.
from pandas.io.json import json_normalize
import pandas as pd
import json

Path = "NavigatorInstances.json"
# Read the whole file as text, then parse it as JSON
with open(Path, 'r') as myfile:
    data = myfile.read()
data = json.loads(data)
# Flatten the nested document into dataframe columns
df = pd.DataFrame.from_dict(json_normalize(data))
I am getting this error: json.decoder.JSONDecodeError: Expecting value: line 2 column 13 (char 15)
My JSON sample data looks like this:
{
"_id" : ObjectId("5ecfe5a0f9fcb510c8ec51e5"),
"RNI_Corps" : {
"MetaData" : {
"TaxonomyName" : "RN.Corps",
"InstanceName" : "Ratings Navigator Instance Corps",
"ThisMongoObjectId" : "5ecfe5a0f9fcb510c8ec51e5",
"ThisObjectShortId" : "37187",
"ReplacedMongoObjectId" : "",
"ThreadIDs" : "5ecfe5a0f9fcb510c8ec51e5",
"CurrentWF_Activity" : "Drafted",
"CurrentWF_ActivityDate" : "2020-05-28 16:23:53",
"VersionID" : {
"Tool" : "RN",
"Group" : "Corps",
"Sector" : "GenCos",
"Version" : "2.8.1.1"
},
"Cart" : {
"Id" : null,
"Status" : null,
"StatusDate" : null,
"Locked" : null
},
"Effective" : {
"Date" : null,
"Reason" : null,
"Source" : {
"SystemName" : null,
"SystemId" : null,
"EventType" : null
}
},
"Criteria" : {
"Id" : "10123001",
"Name" : "Exposure Draft: Sector Navigators",
"Date" : "2020-05-20 00:00:00"
},
"InstanceFileInfo" : {
"MongoObjectId" : "5ecfe5a0f9fcb510c8ec51df",
"MD5CheckSum" : "2d28eabe1a046f76e17a58cca6c386f1",
"Name" : "RN_2_8_1_1_96781051_2020_05_28_1.xlsm",
"SavedDate" : "2020-05-28 16:24:00"
},
"EntityInfo" : {
"AgentID" : NumberInt(1507132),
"AgentName" : "AES Mexico Generation Holdings, S. de R.L. de C.V.",
"NicknameID" : null,
"Nickname" : null,
"IssuerID" : NumberInt(96781051),
"IssuerName" : "AES Mexico Generation Holdings, S. de R.L. de C.V.",
"Region" : "Emerging Markets - Americas",
"CountryName" : "Mexico",
"Sovereign_Agent_ID" : null,
"Sector" : "GenCos"
},
Please help me understand how I can convert the JSON data into a readable format such as a pandas dataframe.
Your sample is not valid standard JSON (or is it only an excerpt?).
The token ObjectId(...) is neither a string nor a number, so the parser rejects it.
You can validate your JSON file first with a tool such as https://www.json.cn. The error comes from JSON parsing, not from the DataFrame.
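If re-exporting the data is not an option, one workaround is to rewrite the shell-style constructors into plain JSON values before calling json.loads. The following is a minimal sketch, not a full Extended JSON parser: the helper name shell_json_to_strict and its regular expressions are illustrative assumptions, they only cover ObjectId, ISODate, Symbol, NumberInt and NumberLong, and they would also rewrite matching text that happens to appear inside a string value.

```python
import json
import re

# Constructors whose single argument is already a quoted string:
# ObjectId("abc") -> "abc"
_QUOTED = re.compile(r'\b(?:ObjectId|ISODate|Symbol)\(\s*("(?:[^"\\]|\\.)*")\s*\)')
# Constructors whose argument is an integer: NumberInt(5) -> 5
_NUMERIC = re.compile(r'\b(?:NumberInt|NumberLong)\(\s*"?(-?\d+)"?\s*\)')


def shell_json_to_strict(text):
    """Replace mongo shell constructors with plain JSON values (hypothetical helper)."""
    text = _QUOTED.sub(r"\1", text)
    return _NUMERIC.sub(r"\1", text)


# Trimmed-down sample in the same style as the question's data.
sample = '''{
    "_id" : ObjectId("5ecfe5a0f9fcb510c8ec51e5"),
    "EntityInfo" : { "AgentID" : NumberInt(1507132), "Nickname" : null }
}'''

doc = json.loads(shell_json_to_strict(sample))
print(doc["_id"], doc["EntityInfo"]["AgentID"])
```

The resulting dict can then be passed to json_normalize as in the question. Dates stay as plain strings here; converting them is left to the dataframe step.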

How to sort and extract data from JFrog JSON response using groovy for Jenkins pipelining

I am using the open-source (OSS) version of JFrog Artifactory for my CI/CD activities, which run via a Jenkins pipeline. I am a novice in Groovy/Java.
The REST APIs of OSS JFrog Artifactory do not support extracting the latest build from a repository. With a Jenkins pipeline in play, I was wondering if I could extract the data from the JSON response provided by Artifactory using Jenkins' native Groovy support (just to avoid an external service that would be run via Python/Java/shell).
I am looking to put the extracted JSON response into a Map, sort the Map in descending order and extract the first Key-Value pair which contains the latest build info.
I end up getting "-1" as the response when I try to extract the data.
import groovy.json.JsonSlurper
def response = httpRequest authentication: 'ArtifactoryAPIKey', consoleLogResponseBody: false, contentType: 'TEXT_PLAIN', httpMode: 'POST', requestBody: '''
items.find({
"$and": [
{"repo": {"$match": "libs-snapshot-local"}},
{"name": {"$match": "simple-integration*.jar"}}
]
})''', url: 'http://<my-ip-and-port-info>/artifactory/api/search/aql'
def jsonParser = new JsonSlurper()
Map jsonOutput = jsonParser.parseText(response.content)
List resultsInfo = jsonOutput['results']
print(resultsInfo[0].created)
def sortedResult = resultsInfo.sort( {a, b -> b["created"] <=> a["created"] } )
sortedResult.each {
println it
}
The sample JSON to be parsed:
{
"results" : [ {
"repo" : "libs-snapshot-local",
"path" : "simple-integration/2.5.150",
"name" : "simple-integration-2.5.150.jar",
"type" : "file",
"size" : 1175,
"created" : "2019-06-23T19:51:30.367+05:30",
"created_by" : "admin",
"modified" : "2019-06-23T19:51:30.364+05:30",
"modified_by" : "admin",
"updated" : "2019-06-23T19:51:30.368+05:30"
},{
"repo" : "libs-snapshot-local",
"path" : "simple-integration/2.5.140",
"name" : "simple-integration-2.5.140.jar",
"type" : "file",
"size" : 1175,
"created" : "2019-06-21T19:52:40.670+05:30",
"created_by" : "admin",
"modified" : "2019-06-21T19:52:40.659+05:30",
"modified_by" : "admin",
"updated" : "2019-06-21T19:52:40.671+05:30"
},{
"repo" : "libs-snapshot-local",
"path" : "simple-integration/2.5.150",
"name" : "simple-integration-2.5.160.jar",
"type" : "file",
"size" : 1175,
"created" : "2019-06-28T19:58:04.973+05:30",
"created_by" : "admin",
"modified" : "2019-06-28T19:58:04.970+05:30",
"modified_by" : "admin",
"updated" : "2019-06-28T19:58:04.973+05:30"
} ],
"range" : {
"start_pos" : 0,
"end_pos" : 3,
"total" : 3
}
}
// The output I am looking for: latest build info with the fields "created" and "name"
def jsonOutput = new groovy.json.JsonSlurper().parseText('''
{
"results" : [ {
"repo" : "libs-snapshot-local",
"path" : "simple-integration/2.5.150",
"name" : "simple-integration-2.5.150.jar",
"type" : "file",
"size" : 1175,
"created" : "2019-06-23T19:51:30.367+05:30",
"created_by" : "admin",
"modified" : "2019-06-23T19:51:30.364+05:30",
"modified_by" : "admin",
"updated" : "2019-06-23T19:51:30.368+05:30"
},{
"repo" : "libs-snapshot-local",
"path" : "simple-integration/2.5.140",
"name" : "simple-integration-2.5.140.jar",
"type" : "file",
"size" : 1175,
"created" : "2019-06-21T19:52:40.670+05:30",
"created_by" : "admin",
"modified" : "2019-06-21T19:52:40.659+05:30",
"modified_by" : "admin",
"updated" : "2019-06-21T19:52:40.671+05:30"
},{
"repo" : "libs-snapshot-local",
"path" : "simple-integration/2.5.150",
"name" : "simple-integration-2.5.160.jar",
"type" : "file",
"size" : 1175,
"created" : "2019-06-28T19:58:04.973+05:30",
"created_by" : "admin",
"modified" : "2019-06-28T19:58:04.970+05:30",
"modified_by" : "admin",
"updated" : "2019-06-28T19:58:04.973+05:30"
} ],
"range" : {
"start_pos" : 0,
"end_pos" : 3,
"total" : 3
}
}
''')
def last = jsonOutput.results.sort{a, b -> b.created <=> a.created }[0]
println last.created
println last.name
The problem here is not with the Groovy code but with the Jenkins pipeline.
Both the code in the question and the solution provided by @daggett work like a charm in any Groovy IDE, but fail when run via a Jenkins pipeline.
The issue URL: https://issues.jenkins-ci.org/browse/JENKINS-44924
I hope they fix it soon.
Thanks for your help, guys.
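For comparison outside Jenkins, the same "latest build" selection can be sketched in Python. This version parses the "created" timestamps into timezone-aware datetimes instead of comparing strings, which only matters if the entries carry different UTC offsets (in the sample they are all +05:30, so string comparison gives the same result). The function name latest_build is hypothetical, and the data is a trimmed copy of the sample response.

```python
import json
from datetime import datetime

response_text = '''
{ "results" : [
    { "name" : "simple-integration-2.5.150.jar", "created" : "2019-06-23T19:51:30.367+05:30" },
    { "name" : "simple-integration-2.5.140.jar", "created" : "2019-06-21T19:52:40.670+05:30" },
    { "name" : "simple-integration-2.5.160.jar", "created" : "2019-06-28T19:58:04.973+05:30" }
] }
'''


def latest_build(results):
    """Return the entry with the most recent 'created' timestamp (hypothetical helper)."""
    # fromisoformat understands the "+05:30" offset, so the comparison is timezone-aware.
    return max(results, key=lambda item: datetime.fromisoformat(item["created"]))


latest = latest_build(json.loads(response_text)["results"])
print(latest["created"])
print(latest["name"])
```

Using max avoids sorting the whole list when only the newest entry is needed.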

Extract JSON value using Jmeter

I have this JSON:
{
"totalMemory" : 12206567424,
"totalProcessors" : 4,
"version" : "0.4.1",
"agent" : {
"reconnectRetrySec" : 5,
"agentName" : "1001",
"checkRecovery" : false,
"backPressure" : 10000,
"throttler" : 100
},
"logPath" : "/eq/equalum/eqagent-0.4.1.0-SNAPSHOT/logs",
"startTime" : 1494837249902,
"status" : {
"current" : "active",
"currentMessage" : null,
"previous" : "pending",
"previousMessage" : "Recovery:Starting pipelines"
},
"autoStart" : false,
"recovery" : {
"agentName" : "1001",
"partitionInfo" : { },
"topicToInitialCapturePosition" : { }
},
"sources" : [ {
"dataSource" : "oracle",
"name" : "oracle_source",
"captureType" : "directOverApi",
"streams" : [ ],
"idlePollingFreqMs" : 100,
"status" : {
"current" : "active",
"currentMessage" : null,
"previous" : "pending",
"previousMessage" : "Trying to init storage"
},
"host" : "192.168.191.5",
"metricsType" : { },
"bulkSize" : 10000,
"user" : "STACK",
"password" : "********",
"port" : 1521,
"service" : "equalum",
"heartbeatPeriodInMillis" : 1000,
"lagObjective" : 1,
"dataSource" : "oracle"
} ],
"upTime" : "157 min, 0 sec",
"build" : "0-SNAPSHOT",
"target" : {
"targetType" : "equalum",
"agentID" : 1001,
"engineServers" : "192.168.56.100:9000",
"kafkaOptions" : null,
"eventsServers" : "192.168.56.100:9999",
"jaasConfigurationPath" : null,
"securityProtocol" : "PLAINTEXT",
"stateMonitorTopic" : "_state_change",
"targetType" : "equalum",
"status" : {
"current" : "active",
"currentMessage" : null,
"previous" : "pending",
"previousMessage" : "Recovery:Starting pipelines"
},
"serializationFormat" : "avroBinary"
}
}
I am trying to use JMeter to extract the value of agentID. How can I do that, and which would be better: an extractor or the JSON Extractor?
What I am trying to do is extract the agentID value in order to use it in another HTTP request sampler, but first I have to extract it from this request's response.
Thanks!
I believe the JSON Extractor is the best way to get this agentID value. The relevant JsonPath query is as simple as $..agentID
See the following reference material:
JsonPath - Getting Started - for initial information regarding JsonPath language, functions, operators, etc.
JMeter's JSON Path Extractor Plugin - Advanced Usage Scenarios - for more complex scenarios.
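To illustrate what the $..agentID query does, here is a rough Python sketch of JSONPath's recursive-descent operator: it walks every nested object and array and collects each value stored under the given key. This only illustrates the semantics and is not how JMeter implements it; the function name find_all is hypothetical, and the document is a trimmed version of the response above.

```python
import json


def find_all(node, key):
    """Collect every value stored under `key` at any depth (hypothetical helper)."""
    matches = []
    if isinstance(node, dict):
        for name, value in node.items():
            if name == key:
                matches.append(value)
            # Keep descending: the key may also occur deeper inside this value.
            matches.extend(find_all(value, key))
    elif isinstance(node, list):
        for item in node:
            matches.extend(find_all(item, key))
    return matches


response = json.loads('''
{
  "agent" : { "agentName" : "1001" },
  "sources" : [ { "name" : "oracle_source", "port" : 1521 } ],
  "target" : { "targetType" : "equalum", "agentID" : 1001 }
}
''')

print(find_all(response, "agentID"))
```

Because agentID occurs only once in the response (under "target"), the query yields a single match.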

angular js json response grabbing of different matches (pattern to match all type of response)

The response has the following format:
{
"human_man" :
[
{"id" : "12345", "value" : "4567"},
{ "id" : "0000", "value" : "qwer"}
],
"human_woman" :
[
{"id" : "5454", "value" : "6565"},
{ "id" : "7878", "value" : "884"}
],
............................................
}
I want to capture the response for all matching keys. That is:
if I use response.human_man, I get [ {"id" : "12345", "value" : "4567"}, { "id" : "0000", "value" : "qwer"}]
if I use response.human_woman, I get [{"id" : "5454", "value" : "6565"}, { "id" : "7878", "value" : "884"}]
So I want to know how to capture everything matching
response.human_* (it should match both response.human_man and response.human_woman).
I hope the question is clear. :)
A fast reply would be appreciated.
Try this: loop over the object's keys and test each one against a regular expression. :)
var object = {
"human_man" :
[
{"id" : "12345", "value" : "4567"},
{ "id" : "0000", "value" : "qwer"}
],
"human_woman" :
[
{"id" : "5454", "value" : "6565"},
{ "id" : "7878", "value" : "884"}
]
};
for(var key in object) {
if(/^human_/.test(key))
console.log(object[key]);
}

How to add Timestamp to Spring-Data-Mongo in Roo?

I am creating a Spring Roo project based on the log4mongo-java appender, and I want to access data entries that look like this:
{
"_id" : ObjectId("4f16cd30b138685057c8ebcb"),
"timestamp" : ISODate("2012-01-18T13:46:24.704Z"),
"level" : "INFO", "thread" : "catalina-exec-8180-3",
"message" : "method execution[execution(TerminationComponent.terminateCall(..))]",
"loggerName" :
{ "fullyQualifiedClassName" : "component_logger",
"package" : ["component_logger"],
"className" : "component_logger"
},
"properties" : {
"cookieId" : "EDE44DC03EB65D91657885A34C80595E"
},
"fileName" : "LoggingAspect.java",
"method" : "logForComponent",
"lineNumber" : "81", "class" : {
"fullyQualifiedClassName" : "com.comcast.ivr.core.aspects.LoggingAspect",
"package" : ["com", "comcast", "ivr", "core", "aspects", "LoggingAspect"],
"className" : "LoggingAspect"
},
"host" : {
"process" : "2220#pacdcivrqaapp01",
"name" : "pacdcivrqaapp01",
"ip" : "24.40.31.85"
},
"applicationName" : "D2",
"eventType" : "Development"
}
The timestamp looks like:
"timestamp" : ISODate("2012-01-17T22:30:19.839Z")
How can I add a field in my Logging domain object to map this field?
ISODate is just a JavaScript Date (according to the mongo docs, and as can be demonstrated in the shell), so try mapping the field to java.util.Date.