[
{
"RollNo":1,
"name":"John",
"age":20,
"Hobby":"Music",
"Date":"9/05/2018"
"sal":5000
},
{
"RollNo":2,
"name":"Ravi",
"age":25,
"Hobby":"TV",
"Date":"9/05/2018"
"sal":5000
},
{
"RollNo":3,
"name":"Devi",
"age":30,
"Hobby":"cooking",
"Date":"9/04/2018"
"sal":5000
}
]
Above is the JSON file I need to insert into MongoDB. Similar JSON data is already in my MongoDB collection named 'Tests'. I have to skip the records that already exist
in MongoDB, based on the following condition:
[RollNo in MongoDB == RollNo in the JSON to insert && Hobby in MongoDB == Hobby in the JSON to insert && Date in MongoDB == Date in the JSON to insert]
If this condition matches, I need to ignore the insertion; otherwise I need to insert the data into the DB.
I am using Node.js. Can anyone please help me do this?
If you are using Mongoose, then use an upsert:
db.people.update(
{ RollNo: 1 },
{
"RollNo":1,
"name":"John",
"age":20,
"Hobby":"Music",
"Date":"9/05/2018"
"sal":5000
},
{ upsert: true }
)
But to avoid inserting the same document more than once, only use upsert: true if the query field is uniquely indexed.
The easiest and safest way to do this is by using a compound index.
You can create a compound index like this:
db.people.createIndex( { "RollNo": 1, "Hobby": 1, "Date" : 1 }, { unique: true } )
Duplicate inserts will then produce a duplicate-key error (code 11000) which you need to handle in your code, as in the sketch below.
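For example, here is a minimal Node.js sketch using the official mongodb driver (the connection string and the database name 'test' are assumptions; the collection name 'Tests' comes from the question). Passing ordered: false to insertMany() keeps inserting the remaining documents even after a duplicate-key error, so existing records are skipped and new ones still go in:

const { MongoClient } = require('mongodb');

const records = [
  { RollNo: 1, name: 'John', age: 20, Hobby: 'Music', Date: '9/05/2018', sal: 5000 },
  { RollNo: 2, name: 'Ravi', age: 25, Hobby: 'TV', Date: '9/05/2018', sal: 5000 },
  { RollNo: 3, name: 'Devi', age: 30, Hobby: 'cooking', Date: '9/04/2018', sal: 5000 }
];

async function insertIgnoringDuplicates() {
  // Assumed local server and database name.
  const client = await MongoClient.connect('mongodb://localhost:27017');
  const col = client.db('test').collection('Tests');

  // Unique compound index: RollNo + Hobby + Date identify a record.
  await col.createIndex({ RollNo: 1, Hobby: 1, Date: 1 }, { unique: true });

  try {
    // ordered: false -> duplicates error out individually,
    // the non-duplicate documents are still inserted.
    await col.insertMany(records, { ordered: false });
  } catch (err) {
    if (err.code !== 11000) throw err; // 11000 = duplicate key, safe to ignore
  }

  await client.close();
}

insertIgnoringDuplicates().catch(console.error);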
I have a specific requirement to convert data from some related tables into nested JSON like the example below, using Spark SQL. I have achieved it with Scala, but I am not able to get it working in Spark SQL.
{
"REPORTING_CARRIER":"9E",
"DISTANCE":"3132",
"ORIGIN_STATE_NM":"Pennsylvania",
"QUARTER":"2",
"YEAR":"2017",
"ITIN_GEO_TYPE":"2",
"BULK_FARE":"0",
"ORIGIN":"ABE",
"ORIGIN_AIRPORT_ID":"10135",
"ITIN_FARE":"787",
"ORIGIN_CITY_MARKET_ID":"30135",
"ROUNDTRIP":"1",
"Market":[
{
"MKT_DISTANCE":"1566",
"MKT_BULK_FARE":"0",
"MKT_NO_OF_CPNS":"2",
"MKT_DEST_STATE_NM":"Texas",
"MKT_OP_CARR_GRP":"9E:DL",
"MKT_TK_CARR_GRP":"DL:DL",
"MKT_MILES_FLOWN":"1566",
"MKT_AIRPORT_GROUP":"ABE:ATL:SAT",
"MKT_FARE_AMT":"393.5",
"MKT_ORIG_STATE_NM":"Pennsylvania",
"MKT_DEST_ARPT_CITY_NM":"33214",
"MKT_RPTG_CARR_NM":"9E",
"MKT_DEST":"SAT",
"MKT_DEST_CNTRY":"US",
"MKT_ORIG_CNTRY":"US",
"Coupon":[
{
"CPN_STATE_NM":"Georgia",
"CPN_DEST":"ATL",
"CPN_TKT_CARR_NM":"DL",
"TRIP_BREAK":"",
"CPN_MKT_ORIG_ARPT_NM":"10135",
"CLASS_OF_SVC":"X",
"CPN_TKT_NBR":"2017245",
"CPN_DEST_CITY_MKT_NM":"30397",
"CPN_DISTANCE":"692",
"SEQ_NUM":"1",
"ITIN_GEO_TYPE":"2",
"CPN_RPTG_CARR_NM":"9E",
"COUPON_GEO_TYPE":"2",
"CPN_ORIG_STATE_NM":"Pennsylvania",
"CPN_OPERG_CARR_NM":"9E",
"CPN_ORIG":"ABE",
"CPN_PASSENGERS":"1",
"COUPON_TYPE":"A",
"CPN_DEST_ARPT_NM":"10397",
"CPN_MKT_ORIG_CITY_NM":"30135",
"CPN_DEST_CNTRY":"US",
"CPN_MKT_ID":"201724501",
"CPN_ORIG_CNTRY":"US"
},
{
"CPN_STATE_NM":"Texas",
"CPN_DEST":"SAT",
"CPN_TKT_CARR_NM":"DL",
"TRIP_BREAK":"X",
"CPN_MKT_ORIG_ARPT_NM":"10397",
"CLASS_OF_SVC":"X",
"CPN_TKT_NBR":"2017245",
"CPN_DEST_CITY_MKT_NM":"33214",
"CPN_DISTANCE":"874",
"SEQ_NUM":"2",
"ITIN_GEO_TYPE":"2",
"CPN_RPTG_CARR_NM":"9E",
"COUPON_GEO_TYPE":"2",
"CPN_ORIG_STATE_NM":"Georgia",
"CPN_OPERG_CARR_NM":"DL",
"CPN_ORIG":"ATL",
"CPN_PASSENGERS":"1",
"COUPON_TYPE":"A",
"CPN_DEST_ARPT_NM":"14683",
"CPN_MKT_ORIG_CITY_NM":"30397",
"CPN_DEST_CNTRY":"US",
"CPN_MKT_ID":"201724501",
"CPN_ORIG_CNTRY":"US"
}
],
"MKT_ITIN_ID":"2017245",
"MKT_OPERG_CARR_NM":"99",
"MKT_DEST_ARPT_NM":"14683",
"MKT_ORIG_ARPT_NM":"ABE",
"MKT_ITIN_GEO_TYPE":"2",
"MKT_PASSENGERS":"1",
"MKT_ID":"201724501",
"MKT_TKT_CARR_NM":"DL"
},
{
"MKT_DISTANCE":"1566",
"MKT_BULK_FARE":"0",
"MKT_NO_OF_CPNS":"2",
"MKT_DEST_STATE_NM":"Pennsylvania",
"MKT_OP_CARR_GRP":"DL:DL",
"MKT_TK_CARR_GRP":"DL:DL",
"MKT_MILES_FLOWN":"1566",
"MKT_AIRPORT_GROUP":"SAT:ATL:ABE",
"MKT_FARE_AMT":"393.5",
"MKT_ORIG_STATE_NM":"Texas",
"MKT_DEST_ARPT_CITY_NM":"30135",
"MKT_RPTG_CARR_NM":"9E",
"MKT_DEST":"ABE",
"MKT_DEST_CNTRY":"US",
"MKT_ORIG_CNTRY":"US",
"Coupon":[
{
"CPN_STATE_NM":"Georgia",
"CPN_DEST":"ATL",
"CPN_TKT_CARR_NM":"DL",
"TRIP_BREAK":"",
"CPN_MKT_ORIG_ARPT_NM":"14683",
"CLASS_OF_SVC":"X",
"CPN_TKT_NBR":"2017245",
"CPN_DEST_CITY_MKT_NM":"30397",
"CPN_DISTANCE":"874",
"SEQ_NUM":"3",
"ITIN_GEO_TYPE":"2",
"CPN_RPTG_CARR_NM":"9E",
"COUPON_GEO_TYPE":"2",
"CPN_ORIG_STATE_NM":"Texas",
"CPN_OPERG_CARR_NM":"DL",
"CPN_ORIG":"SAT",
"CPN_PASSENGERS":"1",
"COUPON_TYPE":"A",
"CPN_DEST_ARPT_NM":"10397",
"CPN_MKT_ORIG_CITY_NM":"33214",
"CPN_DEST_CNTRY":"US",
"CPN_MKT_ID":"201724503",
"CPN_ORIG_CNTRY":"US"
},
{
"CPN_STATE_NM":"Pennsylvania",
"CPN_DEST":"ABE",
"CPN_TKT_CARR_NM":"DL",
"TRIP_BREAK":"X",
"CPN_MKT_ORIG_ARPT_NM":"10397",
"CLASS_OF_SVC":"X",
"CPN_TKT_NBR":"2017245",
"CPN_DEST_CITY_MKT_NM":"30135",
"CPN_DISTANCE":"692",
"SEQ_NUM":"4",
"ITIN_GEO_TYPE":"2",
"CPN_RPTG_CARR_NM":"9E",
"COUPON_GEO_TYPE":"2",
"CPN_ORIG_STATE_NM":"Georgia",
"CPN_OPERG_CARR_NM":"DL",
"CPN_ORIG":"ATL",
"CPN_PASSENGERS":"1",
"COUPON_TYPE":"A",
"CPN_DEST_ARPT_NM":"10135",
"CPN_MKT_ORIG_CITY_NM":"30397",
"CPN_DEST_CNTRY":"US",
"CPN_MKT_ID":"201724503",
"CPN_ORIG_CNTRY":"US"
}
],
"MKT_ITIN_ID":"2017245",
"MKT_OPERG_CARR_NM":"DL",
"MKT_DEST_ARPT_NM":"10135",
"MKT_ORIG_ARPT_NM":"SAT",
"MKT_ITIN_GEO_TYPE":"2",
"MKT_PASSENGERS":"1",
"MKT_ID":"201724503",
"MKT_TKT_CARR_NM":"DL"
}
],
"NO_OF_CPNS":"4",
"ORIGIN_COUNTRY":"US",
"ITIN_ID":"2017245",
"PASSENGERS":"1",
"MILES_FLOWN":"3132"
}
You can use the from_json() helper function within a select() on the Dataset API to decode a JSON string's attributes and values into DataFrame columns, as dictated by a schema.
For example, given the following JSON, { "reporting_carrier": "A", "market": { "value": 10 } }, stored in the DataFrame rawJsonDf:

import org.apache.spark.sql.functions.from_json
import org.apache.spark.sql.types.{LongType, StructType}
import spark.implicits._ // assumes an in-scope SparkSession named spark

case class MarketData(reporting_carrier: String, market_json: String)

// Schema of the nested "market" JSON string.
val jsonSchema = new StructType().add("value", LongType)

// Rename the raw columns so they line up with the case class.
val marketDs = rawJsonDf
  .toDF("reporting_carrier", "market_json")
  .as[MarketData]

// Parse the JSON string into a struct column and filter on it.
marketDs
  .select(from_json($"market_json", jsonSchema) as "market")
  .filter($"market.value" > 5)

See this great tutorial by Databricks for more info.
Trying to figure out the best way to query a MySQL table containing a JSON column.
I am successfully able to get product OR port.
SELECT ip, JSON_EXTRACT(json_data, '$.data[*].product' ) FROM `network`
This will return:
["ftp","ssh"]
What I'm looking to get is something like this, or some other way to represent the association and handle null values:
[["ftp",21],["ssh",22],[NULL,23]]
Sample JSON
{
"key1":"Value",
"key2":"Value",
"key3":"Value",
"data": [
{
"product":"ftp",
"port":"21"
},
{
"product":"ssh",
"port":"22"
},
{
"port":"23"
}
]
}
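If you are on MySQL 8.0 or later, one way to get the product/port pairs (a sketch, reusing the network table and json_data column from the question) is JSON_TABLE(), which turns each element of the $.data array into a row and yields NULL for a key that is absent, such as the missing product for port 23:

SELECT n.ip, jt.product, jt.port
FROM `network` AS n,
     JSON_TABLE(
       n.json_data, '$.data[*]'
       COLUMNS (
         product VARCHAR(64) PATH '$.product', -- NULL when the key is absent
         port    INT         PATH '$.port'
       )
     ) AS jt;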
I have the following JSON data in a MySQL JSON field:
{
"Session0":[
{
"r_type":"main",
"r_flag":"other"
},
{
"r_type":"sub",
"r_flag":"kl"
}
],
"Session1":[
{
"r_type":"up",
"r_flag":"p2"
},
{
"r_type":"id",
"r_flag":"mb"
}
],
"Session2":[
{
"r_type":"main",
"r_flag":"p2"
},
{
"r_type":"id",
"r_flag":"mb"
}
]
}
Now I wish to search ALL sessions where r_type = "main". The session number can vary, so I cannot use an OR query. I need something like:
WHERE JSON_EXTRACT(field, "$.Session**[*].r_type") = "main"
But this does not seem to work. I need to be able to use a wildcard in the property name and then search an array for a property inside it. How do I do that?
The following works, but it limits our ability to have an unlimited number of sessions:
SELECT field->"$.Session1[*].r_type" from table
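One approach that does not hard-code the session number (a sketch, assuming the field and table names above) is JSON_SEARCH() with MySQL's $** path wildcard, which matches r_type under any key at any depth, so every SessionN array is covered. It returns the path where the string "main" occurs, or NULL when it does not:

SELECT *
FROM `table`
WHERE JSON_SEARCH(field, 'one', 'main', NULL, '$**.r_type') IS NOT NULL;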
I have JSON stored in a table with 3 million rows.
A single row contains JSON in the format below:
[
{
"Transaction":[
{
"ProductInfo":[
{
"LINE_NO":"1",
"STOCKNO":"890725471381116060"
},
{
"LINE_NO":"2",
"STOCKNO":"890725315884216020"
}
]
}
],
"Payment":[
{
"ENTSRLNO":"1",
"DOCDT":"08/25/2016"
}
],
"Invoice":[
{
"SALES_TYPE":"Salesinvoice",
"POS_CODE":"A20",
"CUSTOMER_ID":"0919732189692",
"TRXN_TYPE":"2100",
"DOCNOPREFIX":"CM16",
"DOCNO":"1478",
"BILL_DATE":"08/25/2016 03:59:07"
}
]
}
]
I want to dump the above JSON into three different tables:
ProductInfo
Payment
Invoice
How do I perform this task in an optimised way?
Well, the most efficient way will be to write a stored procedure and use OPENJSON in SQL Server.
Check the link below:
https://msdn.microsoft.com/en-IN/library/dn921879.aspx
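As a minimal T-SQL sketch (assuming SQL Server 2016+, a hypothetical source table dbo.RawJson with the JSON in a json_doc column, and a target table dbo.ProductInfo; Payment and Invoice follow the same pattern with their own paths and columns):

-- Unnest the ProductInfo array of every stored document into its own table.
INSERT INTO dbo.ProductInfo (LINE_NO, STOCKNO)
SELECT p.LINE_NO, p.STOCKNO
FROM dbo.RawJson AS r
CROSS APPLY OPENJSON(r.json_doc, '$[0].Transaction[0].ProductInfo')
WITH (
    LINE_NO NVARCHAR(10) '$.LINE_NO',
    STOCKNO NVARCHAR(30) '$.STOCKNO'
) AS p;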