Delete / add nested objects in Elasticsearch - json

I cannot find examples in the Elastic manual on nested objects that show how to modify fields and nested objects of documents using RESTful commands in Kibana Sense. I am looking for something similar to Solr's atomic updates, which allow updating specific fields of documents.
What do the RESTful commands in Kibana Sense that accomplish this look like? The only related information I can find in the manual is on Partial Updates to Documents, but I do not know how it applies to this use case.
For example, straight from the Elastic docs:
PUT my_index
{
"mappings": {
"my_type": {
"properties": {
"user": {
"type": "nested"
}
}
}
}
}
PUT my_index/my_type/1
{
"group" : "fans",
"user" : [
{
"first" : "John",
"last" : "Smith"
},
{
"first" : "Alice",
"last" : "White"
}
]
}
How can I delete an entry in the nested object, so that the document "1" looks like:
{
"group" : "fans",
"user" : [
{
"first" : "John",
"last" : "Smith"
}
]
}
How can I add an entry in the nested object, so that the document "1" looks like:
{
"group" : "fans",
"user" : [
{
"first" : "John",
"last" : "Smith"
},
{
"first" : "Alice",
"last" : "White"
},
{
"first" : "Peter",
"last" : "Parker"
}
]
}

You will have to use scripted updates unless you want to fetch all nested objects, add or remove items, and re-index them all, which is what the previous answer proposed. However, if you have a lot of nested documents, you should be doing partial updates, additions and deletes. That is much quicker in terms of both data transfer and indexing.
Here is a good article on how to do scripted updates in general:
https://iridakos.com/programming/2019/05/02/add-update-delete-elasticsearch-nested-objects

Unless I misunderstand your ask, you just post the updated document version to the same document id each time you want.
To delete a nested document (or any field):
PUT my_index/my_type/1
{
"group" : "fans",
"user" : [
{
"first" : "Alice",
"last" : "White"
}
]
}
To add a user, add it to the list:
PUT my_index/my_type/1
{
"group" : "fans",
"user" : [
{
"first" : "Alice",
"last" : "White"
},
{
"first" : "Peter",
"last" : "Parker"
}
]
}
Note: Documents in Elasticsearch are immutable. Making a change to a single field causes the entire document to be re-indexed. Nested documents are always re-indexed with the parent document, so if you change a field in the parent, the nested documents are re-indexed as well. This can be a performance issue if the nested documents are large and the parents change frequently.

For this specific use case, you must use a scripted update. In JavaScript the call will look something like:
const documentUpdateInstructions = {
index: "index-name",
id: "document-id",
body: {
script: {
lang: "painless",
source: `ctx._source.myNestedObject.removeIf(object -> object.username == params.username);`,
params: {
username: "my_username"
},
},
},
};
await client.update(documentUpdateInstructions);
This takes a document in the form of
document._source = {
...
"myNestedObject": [
{
"username": "my_username",
...
},
{
"username": "not_my_username",
...
}
]
}
and deletes the object inside myNestedObject whose username matches the username provided (in this case my_username). The resulting document will be:
document._source = {
...
"myNestedObject": [
{
"username": "not_my_username",
...
}
]
}
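Applied to the `user` field from the question, the same approach could cover both the delete and the add case. The following is a minimal Python sketch that only builds the request bodies. The helper names are hypothetical, and the endpoint is an assumption that depends on your Elasticsearch version (`POST my_index/my_type/1/_update` on older releases, `POST my_index/_update/1` on newer ones).

```python
import json


def remove_user_body(first, last):
    """Body for a scripted update that drops matching entries from `user`."""
    return {
        "script": {
            "lang": "painless",
            # removeIf deletes every list element for which the lambda is true
            "source": (
                "ctx._source.user.removeIf("
                "u -> u.first == params.first && u.last == params.last)"
            ),
            "params": {"first": first, "last": last},
        }
    }


def add_user_body(first, last):
    """Body for a scripted update that appends one entry to `user`."""
    return {
        "script": {
            "lang": "painless",
            "source": "ctx._source.user.add(params.user)",
            "params": {"user": {"first": first, "last": last}},
        }
    }


if __name__ == "__main__":
    # Paste the printed JSON below the update request line in Kibana.
    print(json.dumps(remove_user_body("Alice", "White"), indent=2))
    print(json.dumps(add_user_body("Peter", "Parker"), indent=2))
```

Passing the values through `params` instead of concatenating them into the script source lets Elasticsearch reuse the compiled script across calls.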

Related

Difference between Set and List for DynamoDB

I'm uploading sensor data to my DynamoDB table. I created a List for the sensor locations; however, I heard that it might be better to use a Set, and I could not find a difference in how I would upload the data or how it would be presented. Currently, if I use a List ("L":), I have [ { "S" : "Culpeper VA" }, { "S" : "Colorado Springs Co" } ] in my table. Would it be different if I used a Set instead, and what attribute would I use in place of "L"?
{
"Sensor" : {
"S": "Sensor1"
},
"SensorDescription": {
"S" : "Sensor to meassure water temperature"
},
"ImageFile" : {
"S" : "/Sensors/images/acoustic-elementarray.jpg"
},
"SampleRate":{
"N" : "2048"
},
"Locations" : {
"L": [
{
"S" : "Culpeper VA"
},
{
"S": "Colorado Springs Co"
}
]
}
}
That is the JSON that I use with the PutItem API call.
I have now figured out that the best option in my case is to use a String Set instead. The updated JSON is:
"Locations" : {
"SS": [ "Colorado Springs Co" , "Culpeper VA"
]
}
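To make the difference concrete: a List ("L") keeps order and duplicates and wraps each element in its own typed value, while a String Set ("SS") holds unique plain strings, does not preserve order, and cannot be empty. A small Python sketch of the two encodings follows; the helper names are hypothetical, and the sorting is only there to make the output deterministic.

```python
def to_list_attribute(values):
    """Encode strings as a DynamoDB List: order and duplicates are kept."""
    return {"L": [{"S": value} for value in values]}


def to_string_set_attribute(values):
    """Encode strings as a DynamoDB String Set: duplicates are dropped."""
    unique = sorted(set(values))  # sorted only for a stable result
    if not unique:
        raise ValueError("a DynamoDB set cannot be empty")
    return {"SS": unique}


if __name__ == "__main__":
    locations = ["Culpeper VA", "Colorado Springs Co", "Culpeper VA"]
    print(to_list_attribute(locations))
    print(to_string_set_attribute(locations))
```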

How to get to key in MongoDB? [duplicate]

Suppose you have the following documents in a collection:
{
"_id":ObjectId("562e7c594c12942f08fe4192"),
"shapes":[
{
"shape":"square",
"color":"blue"
},
{
"shape":"circle",
"color":"red"
}
]
},
{
"_id":ObjectId("562e7c594c12942f08fe4193"),
"shapes":[
{
"shape":"square",
"color":"black"
},
{
"shape":"circle",
"color":"green"
}
]
}
The query:
db.test.find({"shapes.color": "red"}, {"shapes.color": 1})
or
db.test.find({shapes: {"$elemMatch": {color: "red"}}}, {"shapes.color": 1})
returns the matched document (Document 1), but always with ALL array items in shapes:
{ "shapes":
[
{"shape": "square", "color": "blue"},
{"shape": "circle", "color": "red"}
]
}
However, I'd like to get the document (Document 1) only with the array that contains color=red:
{ "shapes":
[
{"shape": "circle", "color": "red"}
]
}
How can I do this?
MongoDB 2.2's new $elemMatch projection operator provides another way to alter the returned document to contain only the first matched shapes element:
db.test.find(
{"shapes.color": "red"},
{_id: 0, shapes: {$elemMatch: {color: "red"}}});
Returns:
{"shapes" : [{"shape": "circle", "color": "red"}]}
In 2.2 you can also do this using the $ projection operator, where the $ in a projection object field name represents the index of the field's first matching array element from the query. The following returns the same results as above:
db.test.find({"shapes.color": "red"}, {_id: 0, 'shapes.$': 1});
MongoDB 3.2 Update
Starting with the 3.2 release, you can use the new $filter aggregation operator to filter an array during projection, which has the benefit of including all matches, instead of just the first one.
db.test.aggregate([
// Get just the docs that contain a shapes element where color is 'red'
{$match: {'shapes.color': 'red'}},
{$project: {
shapes: {$filter: {
input: '$shapes',
as: 'shape',
cond: {$eq: ['$$shape.color', 'red']}
}},
_id: 0
}}
])
Results:
[
{
"shapes" : [
{
"shape" : "circle",
"color" : "red"
}
]
}
]
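The practical difference between the `$elemMatch` projection (first match only) and `$filter` (all matches) can be illustrated outside the database. The following Python sketch only emulates the two behaviours on plain dictionaries; the function names are hypothetical and nothing here talks to MongoDB.

```python
def project_first_match(doc, field, predicate):
    """Emulates an $elemMatch projection: keep only the first matching element."""
    for element in doc.get(field, []):
        if predicate(element):
            return {field: [element]}
    return {}  # no element matched, so the field is left out


def project_all_matches(doc, field, predicate):
    """Emulates $filter inside $project: keep every matching element."""
    return {field: [e for e in doc.get(field, []) if predicate(e)]}


def is_red(shape):
    return shape["color"] == "red"


if __name__ == "__main__":
    doc = {
        "shapes": [
            {"shape": "square", "color": "red"},
            {"shape": "circle", "color": "red"},
            {"shape": "star", "color": "blue"},
        ]
    }
    print(project_first_match(doc, "shapes", is_red))
    print(project_all_matches(doc, "shapes", is_red))
```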
The new Aggregation Framework in MongoDB 2.2+ provides an alternative to Map/Reduce. The $unwind operator can be used to separate your shapes array into a stream of documents that can be matched:
db.test.aggregate([
// Start with a $match pipeline which can take advantage of an index and limit documents processed
{ $match : {
"shapes.color": "red"
}},
{ $unwind : "$shapes" },
{ $match : {
"shapes.color": "red"
}}
])
Results in:
{
"result" : [
{
"_id" : ObjectId("504425059b7c9fa7ec92beec"),
"shapes" : {
"shape" : "circle",
"color" : "red"
}
}
],
"ok" : 1
}
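What the three stages do can be emulated on plain Python data: the first `$match` narrows the documents, `$unwind` emits one copy of the document per array element, and the second `$match` keeps only the copies whose element is red. This is a sketch of the semantics only, with hypothetical helper names and simplified `_id` values.

```python
def match(docs, predicate):
    """Emulates $match: pass through the documents that satisfy the predicate."""
    return (doc for doc in docs if predicate(doc))


def unwind(docs, field):
    """Emulates $unwind: one output document per array element."""
    for doc in docs:
        for element in doc.get(field, []):
            yield {**doc, field: element}


def run_pipeline(docs):
    has_red = match(docs, lambda d: any(s["color"] == "red" for s in d["shapes"]))
    unwound = unwind(has_red, "shapes")
    return list(match(unwound, lambda d: d["shapes"]["color"] == "red"))


if __name__ == "__main__":
    docs = [
        {"_id": 1, "shapes": [{"shape": "square", "color": "blue"},
                              {"shape": "circle", "color": "red"}]},
        {"_id": 2, "shapes": [{"shape": "square", "color": "black"},
                              {"shape": "circle", "color": "green"}]},
    ]
    print(run_pipeline(docs))
```

Note that after unwinding, `shapes` is a single object rather than an array, which matches the result shown above.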
Caution: This answer provides a solution that was relevant at that time, before the new features of MongoDB 2.2 and up were introduced. See the other answers if you are using a more recent version of MongoDB.
The field selector parameter is limited to complete properties. It cannot be used to select part of an array, only the entire array. I tried using the $ positional operator, but that didn't work.
The easiest way is to just filter the shapes in the client.
If you really need the correct output directly from MongoDB, you can use a map-reduce to filter the shapes.
function map() {
filteredShapes = [];
this.shapes.forEach(function (s) {
if (s.color === "red") {
filteredShapes.push(s);
}
});
emit(this._id, { shapes: filteredShapes });
}
function reduce(key, values) {
return values[0];
}
res = db.test.mapReduce(map, reduce, { query: { "shapes.color": "red" } })
db[res.result].find()
Another interesting way is to use $redact, which is one of the new aggregation features of MongoDB 2.6. If you are using 2.6, you don't need an $unwind, which might cause performance problems if you have large arrays.
db.test.aggregate([
{ $match: {
shapes: { $elemMatch: {color: "red"} }
}},
{ $redact : {
$cond: {
if: { $or : [{ $eq: ["$color","red"] }, { $not : "$color" }]},
then: "$$DESCEND",
else: "$$PRUNE"
}
}}]);
$redact "restricts the contents of the documents based on information stored in the documents themselves", so it works only on the contents of the document itself. It scans the document from top to bottom and checks each level against the if condition in $cond; on a match it keeps the content ($$DESCEND), otherwise it removes it ($$PRUNE).
In the example above, the first $match returns the whole shapes array, and $redact strips it down to the expected result.
Note that {$not:"$color"} is necessary because $redact scans the top-level document as well. The top level has no color field, so without this clause the condition would be false there and the whole document would be pruned, which we don't want.
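The level-by-level behaviour described above can be sketched in Python. This is only an emulation of the `$$DESCEND` / `$$PRUNE` idea on plain dictionaries, with hypothetical function names; it treats "no color field" as the equivalent of `{$not: "$color"}`, which is a simplification.

```python
def redact(doc, keep):
    """Return a pruned copy of doc, or None if doc itself is pruned."""
    if not keep(doc):
        return None  # $$PRUNE: drop this level entirely
    result = {}
    for key, value in doc.items():  # $$DESCEND: keep fields, recurse into sub-documents
        if isinstance(value, dict):
            child = redact(value, keep)
            if child is not None:
                result[key] = child
        elif isinstance(value, list):
            kept = []
            for element in value:
                if isinstance(element, dict):
                    child = redact(element, keep)
                    if child is not None:
                        kept.append(child)
                else:
                    kept.append(element)
            result[key] = kept
        else:
            result[key] = value
    return result


def red_or_no_color(level):
    """Keep a level if it has no color field or its color is red."""
    return "color" not in level or level["color"] == "red"


if __name__ == "__main__":
    doc = {"_id": 1, "shapes": [{"shape": "square", "color": "blue"},
                                {"shape": "circle", "color": "red"}]}
    print(redact(doc, red_or_no_color))
    # Without the "no color field" clause the top level itself is pruned:
    print(redact(doc, lambda level: level.get("color") == "red"))
```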
A simpler option is to project only the matching array element, so that just the relevant object in the array is returned:
db.test.find({"shapes.color" : "blue"}, {"shapes.$" : 1})
$slice is helpful when you know the index of the element, but sometimes you want
whichever array element matched your criteria. You can return the matching element
with the $ operator.
db.getCollection('aj').find({"shapes.color":"red"},{"shapes.$":1})
OUTPUTS
{
"shapes" : [
{
"shape" : "circle",
"color" : "red"
}
]
}
The syntax for find in MongoDB is
db.<collection name>.find(query, projection);
The second query that you have written is
db.test.find(
{shapes: {"$elemMatch": {color: "red"}}},
{"shapes.color":1})
Here you have used the $elemMatch operator in the query part, whereas using it in the projection part gives the desired result. You can write your query as
db.test.find(
{"shapes.color":"red"},
{_id:0, shapes: {$elemMatch : {color: "red"}}})
This will give you the desired result.
Thanks to JohnnyHK.
Here I just want to add some more complex usage.
// Document
{
"_id" : 1,
"shapes" : [
{"shape" : "square", "color" : "red"},
{"shape" : "circle", "color" : "green"}
]
}
{
"_id" : 2,
"shapes" : [
{"shape" : "square", "color" : "red"},
{"shape" : "circle", "color" : "green"}
]
}
// The Query
db.contents.find({
"_id" : 1,
"shapes.color":"red"
},{
"_id": 0,
"shapes" :{
"$elemMatch":{
"color" : "red"
}
}
})
//And the Result
{"shapes":[
{
"shape" : "square",
"color" : "red"
}
]}
You just need to run this query:
db.test.find(
{"shapes.color": "red"},
{shapes: {$elemMatch: {color: "red"}}});
The output of this query is
{
"_id" : ObjectId("562e7c594c12942f08fe4192"),
"shapes" : [
{"shape" : "circle", "color" : "red"}
]
}
As you expected, it gives exactly the array element that matches color: 'red'.
Combining this with $project is more appropriate; otherwise the matching elements will be lumped together with the other elements in the document.
db.test.aggregate(
{ "$unwind" : "$shapes" },
{ "$match" : { "shapes.color": "red" } },
{
"$project": {
"_id":1,
"item":1
}
}
)
Likewise, you can match multiple values:
db.getCollection('localData').aggregate([
// Get just the docs that contain a shapes element where color is 'red'
{$match: {'shapes.color': {$in : ['red','yellow'] } }},
{$project: {
shapes: {$filter: {
input: '$shapes',
as: 'shape',
cond: {$in: ['$$shape.color', ['red', 'yellow']]}
}}
}}
])
db.test.find( {"shapes.color": "red"}, {_id: 0})
Use the aggregate function with $project to get a specific field of the document:
db.getCollection('geolocations').aggregate([ { $project : { geolocation : 1} } ])
result:
{
"_id" : ObjectId("5e3ee15968879c0d5942464b"),
"geolocation" : [
{
"_id" : ObjectId("5e3ee3ee68879c0d5942465e"),
"latitude" : 12.9718313,
"longitude" : 77.593551,
"country" : "India",
"city" : "Chennai",
"zipcode" : "560001",
"streetName" : "Sidney Road",
"countryCode" : "in",
"ip" : "116.75.115.248",
"date" : ISODate("2020-02-08T16:38:06.584Z")
}
]
}
Although the question was asked 9.6 years ago, this has been of immense help to numerous people, me being one of them. Thank you, everyone, for all your queries, hints and answers. Picking up from one of the answers here, I found that the following method can also be used to project other fields in the parent document. This may be helpful to someone.
For the following document, the need was to find out if an employee (emp #7839) has his leave history set for the year 2020. Leave history is implemented as an embedded document within the parent Employee document.
db.employees.find( {"leave_history.calendar_year": 2020},
{leave_history: {$elemMatch: {calendar_year: 2020}},empno:true,ename:true}).pretty()
{
"_id" : ObjectId("5e907ad23997181dde06e8fc"),
"empno" : 7839,
"ename" : "KING",
"mgrno" : 0,
"hiredate" : "1990-05-09",
"sal" : 100000,
"deptno" : {
"_id" : ObjectId("5e9065f53997181dde06e8f8")
},
"username" : "none",
"password" : "none",
"is_admin" : "N",
"is_approver" : "Y",
"is_manager" : "Y",
"user_role" : "AP",
"admin_approval_received" : "Y",
"active" : "Y",
"created_date" : "2020-04-10",
"updated_date" : "2020-04-10",
"application_usage_log" : [
{
"logged_in_as" : "AP",
"log_in_date" : "2020-04-10"
},
{
"logged_in_as" : "EM",
"log_in_date" : ISODate("2020-04-16T07:28:11.959Z")
}
],
"leave_history" : [
{
"calendar_year" : 2020,
"pl_used" : 0,
"cl_used" : 0,
"sl_used" : 0
},
{
"calendar_year" : 2021,
"pl_used" : 0,
"cl_used" : 0,
"sl_used" : 0
}
]
}
If you want to filter, set and find at the same time:
let post = await Post.findOneAndUpdate(
{
_id: req.params.id,
tasks: {
$elemMatch: {
id: req.params.jobId,
date,
},
},
},
{
$set: {
'jobs.$[i].performer': performer,
'jobs.$[i].status': status,
'jobs.$[i].type': type,
},
},
{
arrayFilters: [
{
'i.id': req.params.jobId,
},
],
new: true,
}
);
This answer does not fully answer the question, but it is related, and I'm writing it down because someone closed another question by marking this one as a duplicate (which it is not).
In my case I only wanted to filter the array elements but still return the full elements of the array. All previous answers (including the solution given in the question) gave me headaches when applying them to my particular case because:
I needed my solution to be able to return multiple results of the subarray elements.
Using $unwind + $match + $group resulted in losing root documents without matching array elements, which I didn't want in my case because in fact I was only looking to filter out unwanted elements.
Using $project > $filter resulted in losing the rest of the fields of the root documents, or forced me to specify all of them in the projection as well, which was not desirable.
So in the end I fixed all of these problems with an $addFields > $filter like this:
db.test.aggregate([
{ $match: { 'shapes.color': 'red' } },
{ $addFields: { 'shapes': { $filter: {
input: '$shapes',
as: 'shape',
cond: { $eq: ['$$shape.color', 'red'] }
} } } },
])
Explanation:
First match documents with a red coloured shape.
For those documents, add a field called shapes, which in this case will replace the original field called the same way.
To calculate the new value of shapes, $filter the elements of the original $shapes array, temporarily naming each of the array elements as shape so that later we can check if the $$shape.color is red.
Now the new shapes array only contains the desired elements.
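The point of `$addFields` here is that the root document's other fields survive while only the array is replaced. A short Python sketch of that behaviour follows; it is an emulation on a plain dictionary, and the function name and the extra `name` field are hypothetical.

```python
def add_filtered_field(doc, field, predicate):
    """Emulates $addFields + $filter: overwrite one array, keep everything else."""
    filtered = [element for element in doc.get(field, []) if predicate(element)]
    return {**doc, field: filtered}


if __name__ == "__main__":
    doc = {
        "_id": 1,
        "name": "set A",  # hypothetical extra root field
        "shapes": [
            {"shape": "square", "color": "red"},
            {"shape": "circle", "color": "blue"},
            {"shape": "star", "color": "red"},
        ],
    }
    print(add_filtered_field(doc, "shapes", lambda s: s["color"] == "red"))
```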
For more details, refer to the official MongoDB reference.
Suppose you have a document like this (you can have multiple documents too):
{
"_id": {
"$oid": "63b5cfbfbcc3196a2a23c44b"
},
"results": [
{
"yearOfRelease": "2022",
"imagePath": "https://upload.wikimedia.org/wikipedia/en/d/d4/The_Kashmir_Files_poster.jpg",
"title": "The Kashmir Files",
"overview": "Krishna endeavours to uncover the reason behind his parents' brutal killings in Kashmir. He is shocked to uncover a web of lies and conspiracies in connection with the massive genocide.",
"originalLanguage": "hi",
"imdbRating": "8.3",
"isbookMark": null,
"originCountry": "india",
"productionHouse": [
"Zee Studios"
],
"_id": {
"$oid": "63b5cfbfbcc3196a2a23c44c"
}
},
{
"yearOfRelease": "2022",
"imagePath": "https://upload.wikimedia.org/wikipedia/en/a/a9/Black_Adam_%28film%29_poster.jpg",
"title": "Black Adam",
"overview": "In ancient Kahndaq, Teth Adam was bestowed the almighty powers of the gods. After using these powers for vengeance, he was imprisoned, becoming Black Adam. Nearly 5,000 years have passed, and Black Adam has gone from man to myth to legend. Now free, his unique form of justice, born out of rage, is challenged by modern-day heroes who form the Justice Society: Hawkman, Dr. Fate, Atom Smasher and Cyclone",
"originalLanguage": "en",
"imdbRating": "8.3",
"isbookMark": null,
"originCountry": "United States of America",
"productionHouse": [
"DC Comics"
],
"_id": {
"$oid": "63b5cfbfbcc3196a2a23c44d"
}
},
{
"yearOfRelease": "2022",
"imagePath": "https://upload.wikimedia.org/wikipedia/en/0/09/The_Sea_Beast_film_poster.png",
"title": "The Sea Beast",
"overview": "A young girl stows away on the ship of a legendary sea monster hunter, turning his life upside down as they venture into uncharted waters.",
"originalLanguage": "en",
"imdbRating": "7.1",
"isbookMark": null,
"originCountry": "United States Canada",
"productionHouse": [
"Netflix Animation"
],
"_id": {
"$oid": "63b5cfbfbcc3196a2a23c44e"
}
},
{
"yearOfRelease": "2021",
"imagePath": "https://upload.wikimedia.org/wikipedia/en/7/7d/Hum_Do_Hamare_Do_poster.jpg",
"title": "Hum Do Hamare Do",
"overview": "Dhruv, who grew up an orphan, is in love with a woman who wishes to marry someone with a family. In order to fulfil his lover's wish, he hires two older individuals to pose as his parents.",
"originalLanguage": "hi",
"imdbRating": "6.0",
"isbookMark": null,
"originCountry": "india",
"productionHouse": [
"Maddock Films"
],
"_id": {
"$oid": "63b5cfbfbcc3196a2a23c44f"
}
},
{
"yearOfRelease": "2021",
"imagePath": "https://upload.wikimedia.org/wikipedia/en/7/74/Shang-Chi_and_the_Legend_of_the_Ten_Rings_poster.jpeg",
"title": "Shang-Chi and the Legend of the Ten Rings",
"overview": "Shang-Chi, a martial artist, lives a quiet life after he leaves his father and the shadowy Ten Rings organisation behind. Years later, he is forced to confront his past when the Ten Rings attack him.",
"originalLanguage": "en",
"imdbRating": "7.4",
"isbookMark": null,
"originCountry": "United States of America",
"productionHouse": [
"Marvel Entertainment"
],
"_id": {
"$oid": "63b5cfbfbcc3196a2a23c450"
}
}
],
"__v": 0
}
The MongoDB query using the aggregate command:
mongomodels.movieMainPageSchema.aggregate(
[
{
$project: {
_id:0, // to suppress the id
results: {
$filter: {
input: "$results",
as: "result",
cond: { $eq: [ "$$result.yearOfRelease", "2022" ] }
}
}
}
}
]
)
For newer versions of MongoDB, it's slightly different.
For db.collection.find, you can use the second parameter of find with the key projection:
db.collection.find({}, {projection: {name: 1, email: 1}});
You can also use the .project() method.
However, it is not a native MongoDB method; it is provided by most MongoDB drivers, such as Mongoose and the MongoDB Node.js driver.
db.collection.find({}).project({name: 1, email: 1});
If you want to use findOne, it works the same way as with find:
db.collection.findOne({}, {projection: {name: 1, email: 1}});
But findOne doesn't have a .project() method.

Need help in generating REGEX for multi line JSON format

From the below JSON data, I want to cut out the attributes object and keep only Name of the Account. Sample JSON
{
"Accounts":[
{
"attributes":{
"type":"Account",
"url":"/services/data/v41.0/sobjects/Account/001S0000008mgjpIAA"
},
"Name":"Name+Test#Reseller"
},
{
"attributes":{
"type":"Account",
"url":"/services/data/v41.0/sobjects/Account/001S000000m5gyuIAA"
},
"Name":"Test Reseller Myself"
}
]
}
After matching with the regex and replacing the match with "", the JSON should look like this:
{
"Accounts" : [{
"Name" : "Name+Test#Reseller"
}, {
"Name" : "Test Reseller Myself"
}]
}
Use map and return only the Name property value:
obj.Accounts = obj.Accounts.map( s => ({ Name: s.Name }) );
I found an answer myself. I constructed two regexes:
1. "attributes" : {\w*\W*\d*\D*\d*.\d*\D*\w*"\w*
2. {.\s*\S*
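Regexes over multi-line JSON tend to be fragile, so parsing the text is usually the safer route. As a minimal sketch, assuming the input has the `Accounts` / `Name` shape shown in the question, the same transformation in Python could look like this (the function name is hypothetical):

```python
import json

SAMPLE = """
{
  "Accounts": [
    {"attributes": {"type": "Account",
                    "url": "/services/data/v41.0/sobjects/Account/001S0000008mgjpIAA"},
     "Name": "Name+Test#Reseller"},
    {"attributes": {"type": "Account",
                    "url": "/services/data/v41.0/sobjects/Account/001S000000m5gyuIAA"},
     "Name": "Test Reseller Myself"}
  ]
}
"""


def strip_attributes(raw_json):
    """Parse the JSON text and keep only the Name of each account."""
    data = json.loads(raw_json)
    data["Accounts"] = [{"Name": account["Name"]} for account in data["Accounts"]]
    return json.dumps(data, indent=2)


if __name__ == "__main__":
    print(strip_attributes(SAMPLE))
```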

Search within array object

I have the following JSON object:
{
"Title": "Terminator",
"Purchases": [
{"Country": "US", "Site": "iTunes"},
{"Country": "FR", "Site": "Google"}
]
}
Given the above object, here is what the search results should yield:
"Titles on iTunes in US" ==> YES, show "Terminator"
"Titles on Google in FR" ==> YES, show "Terminator"
"Titles on iTunes in FR" ==> NO
However, if I just AND the query, to get Titles with Purchase.Country="FR" and Titles with Purchase.Site="iTunes", it would erroneously show the above result, since both conditions are met. However, I want to restrict that facet to within the purchase item. The equivalent in python code would be:
for purchase in item['Purchases']:
    if purchase['Country'] == "FR" and purchase['Site'] == "iTunes":
        return True
Currently it works like this:
has_fr = has_itunes = False
for purchase in item['Purchases']:
    if purchase['Country'] == "FR":
        has_fr = True
    if purchase['Site'] == "iTunes":
        has_itunes = True
if has_itunes and has_fr: return True
How would this be done in ElasticSearch?
First, you need to index the "Purchases" field as a nested field, by defining the mapping of your object type like this:
{
"properties" : {
"Purchases" : {
"type" : "nested",
"properties": {
"Country" : {"type": "string" },
"Site" : {"type": "string" }
}
}
}
}
Only then will ElasticSearch keep the association between the individual countries and the individual sites, as described here.
Next, you should use a nested query, such as this one:
{ "query":
{ "nested" : {
"path" : "Purchases",
"score_mode" : "avg",
"query" : {
"bool" : {
"must" : [
{
"match" : {"Purchases.Country" : "US"}
},
{
"match" : {"Purchases.Site" : "iTunes"}
}
]
}
}
}
}
}
This will return your object if the query combines "US" and "iTunes", but not if it combines "US" and "Google". The details are described here.
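The difference between the nested and the default (flattened) behaviour can be reproduced on the sample object in plain Python. This sketch only models the matching semantics with hypothetical function names; it does not call Elasticsearch.

```python
ITEM = {
    "Title": "Terminator",
    "Purchases": [
        {"Country": "US", "Site": "iTunes"},
        {"Country": "FR", "Site": "Google"},
    ],
}


def matches_nested(item, country, site):
    """Nested semantics: both conditions must hold within the same purchase."""
    return any(
        p["Country"] == country and p["Site"] == site for p in item["Purchases"]
    )


def matches_flattened(item, country, site):
    """Flattened semantics: each condition may be met by a different purchase."""
    has_country = any(p["Country"] == country for p in item["Purchases"])
    has_site = any(p["Site"] == site for p in item["Purchases"])
    return has_country and has_site


if __name__ == "__main__":
    for country, site in [("US", "iTunes"), ("FR", "Google"), ("FR", "iTunes")]:
        print(country, site,
              matches_nested(ITEM, country, site),
              matches_flattened(ITEM, country, site))
```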

How to map UML composition cardinality to JSON schema?

How to specify that a property of type object can appear only 1 time (I think this is the default), N times, or any number of times? Or even not at all.
The question is, how to translate the standard UML composition cardinality information (min..max) to JSON Schema in case of properties of type 'object'?
"A" : {
"type" : "object",
"properties" : {
"B" : {
"type" : "object"
}
}
}
Based on this schema, A may contain exactly one B; however, I need to be able to specify:
- if it may contain none
- it may contain more (n)
- it may contain any
Thanks:
Endre
If you want to show the meta-definition info in JSON, a natural solution would be to add "MultiplicityElement" and "AggregationKind" attributes (like in the UML metamodel):
{
"A": {
"type": "object",
"properties": [
{
"B": {
"type": "object",
"AggregationKind": "composite",
"MultiplicityElement": {
"lower": 0,
"upper": "n"
}
}
}
]
}
}
You might want to use "class" instead of "object" in this case, since you actually define your class structure. Alternative values for AggregationKind are "shared" (for aggregation) or "none".
Note that I put "properties" in [] brackets to indicate that further properties can be added.
UPDATE (after the 1st comment)
First of all, the JSON is perfectly valid. Take a look at this site: http://jsonlint.com/ I don't have time to investigate the reason for the fault in the one you proposed; I suspect it has to do with the schema.
More importantly, be careful here: I think you are mixing meta-model with model information. I suspected this when writing my original answer, and now you have practically confirmed it.
The question is whether you intend to show the description of a class model (meta-model level) or of an object model (model level).
If this is a class model description: change type to "class" and describe each class only once.
If this is an object model: add a tag "class" to indicate the base class, use "values" instead of "properties", use "property" instead of "type" to indicate the corresponding properties, and remove AggregationKind and MultiplicityElement.
Or clarify your intention :)
Schema:
{
"type" : "object",
"properties" : {
"A" : {
"type" : "object",
"properties" : {
"B" : {
"type" : "array",
"minItems" : 1,
"maxItems" : 2
}
},
"required" : [ "B" ]
}
}
}
Valid instance:
{
"A": {
"B" : [ 1 ]
}
}
Another valid instance:
{
"A": {
"B" : [ 1, 2 ]
}
}
An invalid instance:
{
"A": {
}
}
Another invalid instance:
{
"A": {
"B" : []
}
}
Yet another invalid instance:
{
"A": {
"B" : [ 1, 2, 3]
}
}
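The schema above corresponds to a UML multiplicity of 1..2. One way to generalise it is to model the child as an array and derive `minItems`, `maxItems` and `required` from the (lower, upper) pair. The following Python sketch does that; the function names are hypothetical, and `upper=None` is an assumed convention for the UML "*" (unbounded) upper bound.

```python
def cardinality_to_schema(parent, child, lower, upper=None):
    """Build a JSON Schema where `parent` holds lower..upper `child` items."""
    if lower < 0 or (upper is not None and upper < lower):
        raise ValueError("invalid cardinality")
    child_schema = {"type": "array"}
    if lower > 0:
        child_schema["minItems"] = lower
    if upper is not None:
        child_schema["maxItems"] = upper
    parent_schema = {"type": "object", "properties": {child: child_schema}}
    if lower > 0:
        # A lower bound above zero means the property must be present.
        parent_schema["required"] = [child]
    return {"type": "object", "properties": {parent: parent_schema}}


def count_allowed(count, lower, upper=None):
    """Check an item count against the same bounds, without a schema validator."""
    return count >= lower and (upper is None or count <= upper)


if __name__ == "__main__":
    print(cardinality_to_schema("A", "B", 1, 2))  # 1..2, as in the schema above
    print(cardinality_to_schema("A", "B", 0))     # 0..*, B is optional
```

With a lower bound of 0 the property is simply not listed in `required`, which covers the "may contain none" case from the question.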