Good morning,
When I try to load a JSON file into MongoDB, I get the following error:
raise ValueError("No JSON object could be decoded")
I believe the problem comes from my second field, but I do not know how to rename the empty "" key, or delete it before loading.
My JSON file is:
{
"_id" : "585a9ecec62747d1e19497a5",
"" : NumberInt(0),
"VendorID" : NumberInt(2),
"lpep_pickup_datetime" : "2015-11-01 00:57:34",
"Lpep_dropoff_datetime" : "2015-11-01 23:57:45",
"Store_and_fwd_flag" : "N",
"RateCodeID" : NumberInt(5),
"Pickup_longitude" : -73.9550857544,
"Pickup_latitude" : 40.6637229919,
"Dropoff_longitude" : -73.958984375,
"Dropoff_latitude" : 40.6634483337,
"Passenger_count" : NumberInt(1),
"Trip_distance" : 0.09,
"Fare_amount" : 15.0,
"Extra" : 0.0,
"MTA_tax" : 0.0,
"Tip_amount" : 0.0,
"Tolls_amount" : 0.0,
"Ehail_fee" : "",
"improvement_surcharge" : 0.0,
"Total_amount" : 15.0,
"Payment_type" : NumberInt(2),
"Trip_type" : NumberInt(2),
"x" : -8232642.48775,
"y" : 4962866.701,
"valid_longitude" : NumberInt(1),
"valid_latitude" : NumberInt(1),
"valid_coordinates" : NumberInt(2)
}
The problem in your JSON file is not the empty-string key (that is allowed), but the occurrences of NumberInt(...): that is not valid JSON. You need to provide the number without wrapping it in a function call.
So this will be valid:
{
"_id": "585a9ecec62747d1e19497a5",
"": 0,
"VendorID": 2,
"lpep_pickup_datetime": "2015-11-01 00:57:34",
"Lpep_dropoff_datetime": "2015-11-01 23:57:45",
"Store_and_fwd_flag": "N",
"RateCodeID": 5,
"Pickup_longitude": -73.9550857544,
"Pickup_latitude": 40.6637229919,
"Dropoff_longitude": -73.958984375,
"Dropoff_latitude": 40.6634483337,
"Passenger_count": 1,
"Trip_distance": 0.09,
"Fare_amount": 15.0,
"Extra": 0.0,
"MTA_tax": 0.0,
"Tip_amount": 0.0,
"Tolls_amount": 0.0,
"Ehail_fee": "",
"improvement_surcharge": 0.0,
"Total_amount": 15.0,
"Payment_type": 2,
"Trip_type": 2,
"x": -8232642.48775,
"y": 4962866.701,
"valid_longitude": 1,
"valid_latitude": 1,
"valid_coordinates": 2
}
If you have no control over the non-JSON file, then after reading the file contents, replace the occurrences of NumberInt like this (in Python):
import re
# `contents` is assumed to hold the text read from the file;
# avoid naming the variable `json`, which would shadow the json module
contents = re.sub(r"NumberInt\((-?\d+)\)", r"\1", contents)
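Putting it together, a minimal end-to-end sketch (the sample document here is trimmed to three fields; inserting the result into MongoDB afterwards would be a one-liner with pymongo, e.g. collection.insert_one(doc)):

```python
import json
import re

# A trimmed sample using the non-JSON NumberInt(...) wrapper
raw = '{ "VendorID" : NumberInt(2), "Trip_distance" : 0.09, "Ehail_fee" : "" }'

# Unwrap NumberInt(...) so the text becomes valid JSON, then parse it
cleaned = re.sub(r"NumberInt\((-?\d+)\)", r"\1", raw)
doc = json.loads(cleaned)
print(doc["VendorID"])  # the wrapped integer is now a plain int
```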
Hello, can someone help me extract the value of the user parameter, which is "testuser1"?
I tried the JSON Path expression $..data. I was able to extract the entire response, but I was unable to extract the user parameter. Thanks in advance.
{
"data": "{ "took" : 13, "timed_out" : false, "_shards" : { "total" : 5, "successful" : 5, "skipped" : 0, "failed" : 0 }, "hits" : { "total" : 1, "max_score" : 1.0, "hits" : [ { "_index" : "bushidodb_history_network_eval_ea9656ef-0a9b-474b-8026-2f83e2eb9df1_2021-april-10", "_type" : "network", "_id" : "6e2e58be-0ccf-3fb4-8239-1d4f2af322e21618059082000", "_score" : 1.0, "_source" : { "misMatches" : [ "protocol", "state", "command" ], "instance" : "e3032804-4b6d-3735-ac22-c827950395b4|0.0.0.0|10.179.155.155|53|UDP", "protocol" : "UDP", "localAddress" : "0.0.0.0", "localPort" : "12345", "foreignAddress" : "10.179.155.155", "foreignPort" : "53", "command" : "ping yahoo.com ", "user" : "testuser1", "pid" : "10060", "state" : "OUTGOINGFQ", "rate" : 216.0, "originalLocalAddress" : "192.168.100.229", "exe" : "/bin/ping", "md5" : "f9ad63ce8592af407a7be43b7d5de075", "dir" : "", "agentId" : "abcd-dcd123", "year" : "2021", "month" : "APRIL", "day" : "10", "hour" : "12", "time" : "1618059082000", "isMerged" : false, "timestamp" : "Apr 10, 2021 12:51:22 PM", "metricKey" : "6e2e58be-0ccf-3fb4-8239-1d4f2af322e2", "isCompliant" : false }, "sort" : [ 1618059082000 ] } ] }, "aggregations" : { "count_over_time" : { "buckets" : [ { "key_as_string" : "2021-04-10T08:00:00.000-0400", "key" : 1618056000000, "doc_count" : 1 } ] } }}",
"success": true,
"message": {
"code": "S",
"message": "Get Eval results Count Success"
}
}
Actual Response:
What you posted doesn't look like valid JSON to me.
If in reality you're getting what's in your image, to wit:
{
"data": "{ \"took\" : 13, \"timed_out\" : false, \"_shards\" : { \"total\" : 5, \"successful\" : 5, \"skipped\" : 0, \"failed\" : 0 }, \"hits\" : { \"total\" : 1, \"max_score\" : 1.0, \"hits\" : [ { \"_index\" : \"bushidodb_history_network_eval_ea9656ef-0a9b-474b-8026-2f83e2eb9df1_2021-april-10\", \"_type\" : \"network\", \"_id\" : \"6e2e58be-0ccf-3fb4-8239-1d4f2af322e21618059082000\", \"_score\" : 1.0, \"_source\" : { \"misMatches\" : [ \"protocol\", \"state\", \"command\" ], \"instance\" : \"e3032804-4b6d-3735-ac22-c827950395b4|0.0.0.0|10.179.155.155|53|UDP\", \"protocol\" : \"UDP\", \"localAddress\" : \"0.0.0.0\", \"localPort\" : \"12345\", \"foreignAddress\" : \"10.179.155.155\", \"foreignPort\" : \"53\", \"command\" : \"pingyahoo.com\", \"user\" : \"testuser1\", \"pid\" : \"10060\", \"state\" : \"OUTGOINGFQ\", \"rate\" : 216.0, \"originalLocalAddress\" : \"192.168.100.229\", \"exe\" : \"/bin/ping\", \"md5\" : \"f9ad63ce8592af407a7be43b7d5de075\", \"dir\" : \"\", \"agentId\" : \"abcd-dcd123\", \"year\" : \"2021\", \"month\" : \"APRIL\", \"day\" : \"10\", \"hour\" : \"12\", \"time\" : \"1618059082000\", \"isMerged\" : false, \"timestamp\" : \"Apr10, 202112: 51: 22PM\", \"metricKey\" : \"6e2e58be-0ccf-3fb4-8239-1d4f2af322e2\", \"isCompliant\" : false }, \"sort\" : [ 1618059082000 ] } ] }, \"aggregations\" : { \"count_over_time\" : { \"buckets\" : [ { \"key_as_string\" : \"2021-04-10T08: 00: 00.000-0400\", \"key\" : 1618056000000, \"doc_count\" : 1 } ] } }}",
"success": true,
"message": {
"code": "S",
"message": "Get Eval results Count Success"
}
}
the easiest way is just using 2 JSON Extractors:
Extract the data attribute value into a JMeter Variable from the response.
Extract the user attribute value into a JMeter Variable from the ${data} JMeter Variable.
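Two extractors are needed because data is itself a JSON document encoded as a string, so it has to be decoded twice. A plain-Python sketch of the same double decoding (the response literal below is trimmed to the relevant fields):

```python
import json

response = '''{
  "data": "{ \\"hits\\" : { \\"hits\\" : [ { \\"_source\\" : { \\"user\\" : \\"testuser1\\" } } ] } }",
  "success": true
}'''

outer = json.loads(response)       # first pass: the wrapper object
inner = json.loads(outer["data"])  # second pass: the embedded document
user = inner["hits"]["hits"][0]["_source"]["user"]
print(user)
```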
If the response looks exactly like what you posted, you won't be able to use JSON Extractors and will have to treat it as normal text, so your choice is limited to the Regular Expression Extractor; an example regular expression:
"user"\s*:\s*"(\w+)"
Add a Regular Expression Extractor to the corresponding request and extract the value with the expression below.
Expression: "user" : "(.*?)"
Ref: https://jmeter.apache.org/usermanual/regular_expressions.html
Regular Expression Extractor Sample
I found a solution for JSON-to-CSV conversion. Below are the sample JSON and the solution.
{
"took" : 111,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 2,
"relation" : "eq"
},
"max_score" : 1.0,
"hits" : [
{
"_index" : "alerts",
"_type" : "_doc",
"_id" : "1",
"_score" : 1.0,
"_source" : {
"alertID" : "639387c3-0fbe-4c2b-9387-c30fbe7c2bc6",
"alertCategory" : "Server Alert",
"description" : "Successfully started.",
"logId" : null
}
},
{
"_index" : "alerts",
"_type" : "_doc",
"_id" : "2",
"_score" : 1.0,
"_source" : {
"alertID" : "2",
"alertCategory" : "Server Alert",
"description" : "Successfully stoped.",
"logId" : null
}
}
]
}
}
The solution:
jq -r '["alertID","alertCategory","description","logId"], (.hits.hits[]._source | [.alertID, .alertCategory, .description, .logId // "null"]) | @csv' < /root/events.json
The problem with this solution is that I have to hard-code the column names. What if my JSON gets a few additions under the _source tag later? I need a solution that can handle dynamic data under _source. I am open to any other tool or shell command.
Simply use keys_unsorted (or keys if you want them sorted). See e.g. Convert JSON array into CSV using jq or How to convert arbitrary simple JSON to CSV using jq? for two SO examples. There are many others too.
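If jq is unavailable, the same dynamic-column idea is easy in Python: take the key set from the first _source object at run time and emit CSV from it (this sketch assumes every hit has the same fields, as in the sample):

```python
import csv
import io
import json

data = json.loads('''{
  "hits": { "hits": [
    { "_source": { "alertID": "1", "alertCategory": "Server Alert", "description": "Successfully started.", "logId": null } },
    { "_source": { "alertID": "2", "alertCategory": "Server Alert", "description": "Successfully stopped.", "logId": null } }
  ] }
}''')

sources = [hit["_source"] for hit in data["hits"]["hits"]]
fields = list(sources[0])  # column names discovered at run time

buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=fields)
writer.writeheader()
for src in sources:
    # render JSON null as the string "null", matching the jq variant
    writer.writerow({k: ("null" if v is None else v) for k, v in src.items()})
print(buf.getvalue())
```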
I'm new to JSON and want to extract data (blocks) from a specific JSON file. I've tried to find information on how to do this with jq, but so far I cannot seem to get what I want.
My JSON:
{
"now" : 1589987097.9,
"aircraft" : [
{
"mlat" : [],
"rssi" : -26.2,
"track" : 319,
"speed" : 354,
"messages" : 16,
"seen" : 0.7,
"altitude" : 38000,
"vert_rate" : 0,
"hex" : "44b5b4",
"tisb" : []
},
{
"squawk" : "6220",
"altitude" : 675,
"seen" : 1.1,
"messages" : 7220,
"tisb" : [],
"hex" : "484a95",
"mlat" : [],
"rssi" : -22
},
{
"hex" : "484846",
"tisb" : [],
"messages" : 20,
"speed" : 89,
"seen" : 0.4,
"squawk" : "7000",
"altitude" : 500,
"rssi" : -23.7,
"track" : 185,
"mlat" : []
},
{
"category" : "B1",
"mlat" : [],
"rssi" : -24.3,
"flight" : "ZSGBX ",
"altitude" : 3050,
"squawk" : "7000",
"seen" : 16.8,
"messages" : 37,
"tisb" : [],
"hex" : "00901a"
}
],
"messages" : 35857757
}
I would like to reformat this JSON to include only 'blocks' that contain specific hex values.
So, for example, I would like my output to contain 44b5b4 and 00901a:
{
"now" : 1589987097.9,
"aircraft" : [
{
"mlat" : [],
"rssi" : -26.2,
"track" : 319,
"speed" : 354,
"messages" : 16,
"seen" : 0.7,
"altitude" : 38000,
"vert_rate" : 0,
"hex" : "44b5b4",
"tisb" : []
},
{
"category" : "B1",
"mlat" : [],
"rssi" : -24.3,
"flight" : "ZSGBX ",
"altitude" : 3050,
"squawk" : "7000",
"seen" : 16.8,
"messages" : 37,
"tisb" : [],
"hex" : "00901a"
}
],
"messages" : 35857757
}
Can someone tell me how to remove all items not having those 2 hex identifiers but still keep the same json structure?
Thanks a lot!
Do a select() on the aircraft array, matching only the required hex values. The map() function takes the entire array as input; the result of the select operation, i.e. filtering objects on their .hex value, is written back to the original array with |=, and the rest of the fields are kept intact.
jq '.aircraft |= map(select(.hex == "44b5b4" or .hex == "00901a"))' json
Select blocks whose hex matches one of the specific values and update aircraft to leave only those.
.aircraft |= map(select(.hex | IN("44b5b4", "00901a")))
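The same filter, sketched in plain Python for comparison (the document literal below is trimmed; in practice it would be read from a file):

```python
import json

doc = json.loads('''{
  "now": 1589987097.9,
  "aircraft": [
    {"hex": "44b5b4", "altitude": 38000},
    {"hex": "484a95", "altitude": 675},
    {"hex": "00901a", "altitude": 3050}
  ],
  "messages": 35857757
}''')

# keep only the blocks whose hex is in the wanted set; all other
# top-level fields are left untouched
wanted = {"44b5b4", "00901a"}
doc["aircraft"] = [a for a in doc["aircraft"] if a["hex"] in wanted]
print(json.dumps(doc, indent=2))
```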
I have an Elasticsearch search response that is a deeply nested JSON file, and I am stuck on how to get a particular value from it. I am new to Scala and programming in general, and I have searched online and could not find any answer that explained it well.
This is the JSON file, and the value I want to get out is "getSum": "value":
Search_response: org.elasticsearch.action.search.SearchResponse = {
"took" : 32,
"timed_out" : false,
"_shards" : {
"total" : 3,
"successful" : 3,
"failed" : 0
},
"hits" : {
"total" : 12,
"max_score" : 1.0,
"hits" : [ {
"_index" : "myIndex",
"_type" : "myType",
"_id" : "4151202002020",
"_score" : 1.0,
"_source":{"pint":[{"printer":[{"sourceName":"3636636","sourceType":"Bin","Star":0.0,"Fun":"gatayay"},{"sourceName":"3636636","sourceType":"Bin","Star":0.0,"Fun":"gatayay"}],"Lam":[{"sourceName":"3636636","sourceType":"Bin","Star":0.0,"Fun":"gatayay"},{"sourceName":"3636636","sourceType":"Bin","Star":0.0,"Fun":"gatayay"},{"sourceName":"3636636","sourceType":"Bin","Star":0.0,"Fun":"gatayay"}],"Kam":[{"sourceName":"3636636","sourceType":"Bin","Star":0.0,"Fun":"gatayay"},{"sourceName":"3636636","sourceType":"Bin","Star":0.0,"Fun":"gatayay"},{"sourceName":"3636636","sourceType":"Bin","Star":0.0,"Fun":"gatayay"}],"Jas":[{"sourceName":"3636636","sourceType":"Bin","Star":0.0,"Fun":"gatayay"}],"tiv":[{ourc""s:"wrer","sourceType":"rsd","Vag":"agaatttt363336"}],"timeLineSource:[{"LA":"DGAT","GATA":"JAS","timeline":9.111694,"GA":"SFWF2525252552552525"}
}, {
"_index" : "myIndex",
"_type" : "myType",
"_id" : "4151202002020",
"_score" : 1.0,
"_source":{"pint":[{"printer":[{"sourceName":"3636636","sourceType":"Bin","Star":0.0,"Fun":"gatayay"},{"sourceName":"3636636","sourceType":"Bin","Star":0.0,"Fun":"gatayay"}],"Lam":[{"sourceName":"3636636","sourceType":"Bin","Star":0.0,"Fun":"gatayay"},{"sourceName":"3636636","sourceType":"Bin","Star":0.0,"Fun":"gatayay"},{"sourceName":"3636636","sourceType":"Bin","Star":0.0,"Fun":"gatayay"}],"Kam":[{"sourceName":"3636636","sourceType":"Bin","Star":0.0,"Fun":"gatayay"},{"sourceName":"3636636","sourceType":"Bin","Star":0.0,"Fun":"gatayay"},{"sourceName":"3636636","sourceType":"Bin","Star":0.0,"Fun":"gatayay"}],"Jas":[{"sourceName":"3636636","sourceType":"Bin","Star":0.0,"Fun":"gatayay"}],"tiv":[{ourc""s:"wrer","sourceType":"rsd","Vag":"agaatttt363336"}],"timeLineSource:[{"LA":"DGAT","GATA":"JAS","timeline":9.111694,"GA":"SFWF2525252552552525"}
}, {
"_index" : "myIndex",
"_type" : "myType",
"_id" : "4151202002020",
"_score" : 1.0,
"_source":{"pint":[{"printer":[{"sourceName":"3636636","sourceType":"Bin","Star":0.0,"Fun":"gatayay"},{"sourceName":"3636636","sourceType":"Bin","Star":0.0,"Fun":"gatayay"}],"Lam":[{"sourceName":"3636636","sourceType":"Bin","Star":0.0,"Fun":"gatayay"},{"sourceName":"3636636","sourceType":"Bin","Star":0.0,"Fun":"gatayay"},{"sourceName":"3636636","sourceType":"Bin","Star":0.0,"Fun":"gatayay"}],"Kam":[{"sourceName":"3636636","sourceType":"Bin","Star":0.0,"Fun":"gatayay"},{"sourceName":"3636636","sourceType":"Bin","Star":0.0,"Fun":"gatayay"},{"sourceName":"3636636","sourceType":"Bin","Star":0.0,"Fun":"gatayay"}],"Jas":[{"sourceName":"3636636","sourceType":"Bin","Star":0.0,"Fun":"gatayay"}],"tiv":[{ourc""s:"wrer","sourceType":"rsd","Vag":"agaatttt363336"}],"timeLineSource:[{"LA":"DGAT","GATA":"JAS","timeline":9.111694,"GA":"SFWF2525252552552525"}
}, {
},
"aggregations" : {
"DAEY" : {
"doc_count" : 59,
"histogram" : {
"buckets" : [ {
"key_as_string" : "1978-02-22T00:00:00.000Z",
"key" : 1503360000000,
"doc_count" : 59,
"nestedValue" : {
"doc_count" : 177,
"getSum" : {
"value" : 768.0690221786499
}
},
}
}
}
}
This is what I tried
val getResult: String = searchResult.toString.stripMargin
val getValue = JsonParser.parse(getResult).asInstanceOf[JObject].values("aggregations").toString
You can solve this by using Typesafe Config. Please find the required Maven and sbt dependencies below.
Maven dependency:
<dependency>
<groupId>com.typesafe</groupId>
<artifactId>config</artifactId>
<version>1.3.1</version>
</dependency>
sbt dependency:
libraryDependencies += "com.typesafe" % "config" % "1.3.1"
Afterwards, you can get the value of the sum with the code below (note that in your response the buckets list sits under histogram, and value is a number):
import com.typesafe.config.ConfigFactory
val config = ConfigFactory.parseString(getResult)
val sum = config.getConfigList("aggregations.DAEY.histogram.buckets").get(0).getDouble("nestedValue.getSum.value")
Check out the API doc for the library from this link
I finally used
val getResult: String = searchResult.toString.stripMargin
val getValue = JsonParser.parse(getResult).asInstanceOf[JObject].values("aggregations").toString
val valueToDouble = getValue.split(" ").last.dropRight(13).toDouble
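Splitting the rendered string and dropping a fixed number of characters is brittle; it breaks as soon as the formatting changes. Walking the parsed structure is safer. The idea, sketched in Python against the aggregations fragment above (in Scala the same walk can be done over the parsed JObject):

```python
import json

# The aggregations fragment from the search response, as valid JSON
aggregations = json.loads('''{
  "DAEY": {
    "doc_count": 59,
    "histogram": {
      "buckets": [
        { "key": 1503360000000, "doc_count": 59,
          "nestedValue": { "doc_count": 177, "getSum": { "value": 768.0690221786499 } } }
      ]
    }
  }
}''')

# Follow the path step by step instead of scraping the rendered text
value = aggregations["DAEY"]["histogram"]["buckets"][0]["nestedValue"]["getSum"]["value"]
print(value)
```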
I have a collection called school, and a document in it looks like this:
{
"_id" : 0,
"_class" : "com.aixueniao.server.model.School",
"companyUserId" : 0,
"schoolUserId" : 0,
"schoolName" : "校区名称",
"showSchoolName" : 0,
"gradeIds" : "[]",
"firstLevelSubjectIds" : "[]",
"secondLevelSubjectIds" : "[]",
"classType" : "",
"introduction" : "校区介绍",
"mainImageId" : 0,
"imageIds" : "[]",
"longitude" : 0.0,
"latitude" : 0.0,
"locationId" : 0,
"address" : "校区地址",
"runningTime" : 0.0,
"teacherCount" : 0,
"telephone" : "",
"fixedPhone" : "",
"createTime" : "2017-01-13 01:16:54",
"expirationTime" : "2017-01-13 01:16:54",
"schoolStatus" : "ARREARS",
"authorizationStatus" : "NO",
"rejectReason" : "无"
}
There are about 26 fields in the document. Does the number of fields affect query performance? I will query on 4 fields and will use a $near query. Thanks in advance.
The simple answer is yes, which is why Mongo provides features like projections: you achieve better performance by not having the database return all the fields in the document.
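With pymongo the projection is the second argument to find, e.g. collection.find(query, {"schoolName": 1, "longitude": 1, "latitude": 1, "address": 1}). As a rough, database-free illustration of what a projection does (field names taken from the document above, most of them omitted for brevity):

```python
doc = {
    "_id": 0,
    "schoolName": "校区名称",
    "address": "校区地址",
    "longitude": 0.0,
    "latitude": 0.0,
    "teacherCount": 0,
    # the remaining ~20 fields are omitted here
}

projection = {"schoolName": 1, "longitude": 1, "latitude": 1, "address": 1}

# MongoDB returns only the projected fields (plus _id unless excluded),
# so less data is serialized, sent over the wire, and decoded.
projected = {k: v for k, v in doc.items() if k in projection or k == "_id"}
print(projected)
```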