Need help generating a regex for multi-line JSON

From the JSON data below, I want to remove each attributes object and keep only the Name of each account. Sample JSON:
{
"Accounts":[
{
"attributes":{
"type":"Account",
"url":"/services/data/v41.0/sobjects/Account/001S0000008mgjpIAA"
},
"Name":"Name+Test#Reseller"
},
{
"attributes":{
"type":"Account",
"url":"/services/data/v41.0/sobjects/Account/001S000000m5gyuIAA"
},
"Name":"Test Reseller Myself"
}
]
}
After matching with the regex and replacing the matches with "", the JSON should look like this:
{
"Accounts" : [{
"Name" : "Name+Test#Reseller"
}, {
"Name" : "Test Reseller Myself"
}]
}

Parse the JSON, then use map and return only the Name property (the parentheses make the arrow function return an object literal):
obj.Accounts = obj.Accounts.map(s => ({ Name: s.Name }));

I found an answer myself. I constructed two regexes:
1. "attributes" : {\w*\W*\d*\D*\d*.\d*\D*\w*"\w*
2. {.\s*\S*
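Since the input is valid JSON, parsing it is usually less brittle than a multi-line regex. The following is a minimal Python sketch of that approach, assuming the sample document from the question is available as a string; the names raw and data are illustrative only.

```python
import json

# Sample document from the question, as a raw JSON string.
raw = """
{
  "Accounts": [
    {
      "attributes": {
        "type": "Account",
        "url": "/services/data/v41.0/sobjects/Account/001S0000008mgjpIAA"
      },
      "Name": "Name+Test#Reseller"
    },
    {
      "attributes": {
        "type": "Account",
        "url": "/services/data/v41.0/sobjects/Account/001S000000m5gyuIAA"
      },
      "Name": "Test Reseller Myself"
    }
  ]
}
"""

data = json.loads(raw)
# Keep only the Name key of every account, which drops "attributes".
data["Accounts"] = [{"Name": account["Name"]} for account in data["Accounts"]]

print(json.dumps(data, indent=2))
```

This keeps working if the attributes object gains fields or the whitespace changes, which is where the regexes are most likely to break.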


Difference between Set and List for DynamoDB

I'm uploading sensor data to my DynamoDB table. I created a List for the sensor locations, but I heard that a Set might be better, and I could not find any difference in how I would upload the data or how it would be presented. Currently, if I use a List ("L"), I have [ { "S" : "Culpeper VA" }, { "S" : "Colorado Springs Co" } ] in my table. Would it be different if I used a Set instead, and which type descriptor would I use in place of "L"?
{
"Sensor" : {
"S": "Sensor1"
},
"SensorDescription": {
"S" : "Sensor to meassure water temperature"
},
"ImageFile" : {
"S" : "/Sensors/images/acoustic-elementarray.jpg"
},
"SampleRate":{
"N" : "2048"
},
"Locations" : {
"L": [
{
"S" : "Culpeper VA"
},
{
"S": "Colorado Springs Co"
}
]
}
}
That is the JSON I use with the PutItem API call.
I have now figured out that the best option in my case is a String Set. The updated JSON is:
"Locations" : {
"SS": [ "Colorado Springs Co" , "Culpeper VA"
]
}
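The practical difference is that a String Set ("SS") holds plain strings that must be unique, cannot be empty, and have no guaranteed order, while a List ("L") keeps order, allows duplicates, and wraps each element in its own type descriptor. A small Python sketch of converting the List form into the Set form follows; the helper name list_to_string_set is hypothetical, and it only builds the attribute value without calling DynamoDB.

```python
def list_to_string_set(list_attr):
    """Convert {"L": [{"S": ...}, ...]} into {"SS": [...]}.

    Duplicates are dropped because set elements must be unique,
    and an empty result is rejected because sets cannot be empty.
    """
    values = sorted({element["S"] for element in list_attr["L"]})
    if not values:
        raise ValueError("a String Set needs at least one element")
    return {"SS": values}


locations_list = {
    "L": [
        {"S": "Culpeper VA"},
        {"S": "Colorado Springs Co"},
        {"S": "Culpeper VA"},  # duplicate, allowed in a List only
    ]
}

locations_set = list_to_string_set(locations_list)
print(locations_set)
```

Sorting is only there to make the output deterministic; DynamoDB itself does not promise any ordering for set elements.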

Like expression in JSON Path

I have some JSON like this:
{
"Room" :{
"Book" :
{
"name" : "abc"
},
"Book1":
{
"name" : "xyz"
},
"Book3":
{
"name" : "abc123"
},
"Tv" :
{
"name" : "zyc"
},
"audio":
{
"name" :"sound ++"
}
}
}
From this JSON I want to extract all the book elements ("Book", "Book1", "Book3") using JSONPath.
As far as I know, JSONPath has no "like" syntax, but the same can be done with a regex.
I tried this:
$.Room[?(/^.*book.*$/i.test(@.Room))]
But this expression returns nothing from the JSON.
Can anyone help me out with this?
Maybe this link will be helpful for you. Check the table.
$..book[?(@.author =~ /.*Tolkien/i)] This expression returns all books whose author name ends with Tolkien (case-insensitive). Modify it for your case.
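Filter expressions like the one above test values, whereas the question needs to match key names, and support for that differs between JSONPath implementations. As a fallback, the same selection can be done directly on the parsed document. This is a minimal Python sketch, assuming the JSON from the question has already been loaded into a dict named doc (an illustrative name).

```python
import re

# The document from the question, already parsed.
doc = {
    "Room": {
        "Book": {"name": "abc"},
        "Book1": {"name": "xyz"},
        "Book3": {"name": "abc123"},
        "Tv": {"name": "zyc"},
        "audio": {"name": "sound ++"},
    }
}

# Case-insensitive "like" match on the key name.
book_key = re.compile(r"book", re.IGNORECASE)

books = {
    key: value
    for key, value in doc["Room"].items()
    if book_key.search(key)
}

print(books)
```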

Search inside JSON with Elastic

I have an index/type in ES which has the following type of records:
body "{\"Status\":\"0\",\"Time\":\"2017-10-3 16:39:58.591\"}"
type "xxxx"
source "11.2.21.0"
The body field is a JSON string. I want to search, for example, for the records whose JSON body has Status:0.
The query should look something like this (it doesn't work):
GET <host>:<port>/index/type/_search
{
"query": {
"match" : {
"body" : "Status:0"
}
}
}
Any ideas?
You have to change the analyzer settings of your index.
For the JSON pattern you presented, you will need a char_filter and a tokenizer that strip the JSON syntax characters and then tokenize according to your needs.
Your analyzer should contain a tokenizer and a char_filter like these:
{
"tokenizer" : {
"type": "pattern",
"pattern": ","
},
"char_filter" : [ {
"type" : "mapping",
"mappings" : [ "{ => ", "} => ", "\" => " ]
} ],
"text" : [ "{\"Status\":\"0\",\"Time\":\"2017-10-3 16:39:58.591\"}" ]
}
Explanation: the char_filter removes the characters { } ". The tokenizer splits on commas.
These can be tested using the Analyze API. If you execute the above JSON against this API, you will get these tokens:
{
"tokens" : [ {
"token" : "Status:0",
"start_offset" : 2,
"end_offset" : 13,
"type" : "word",
"position" : 0
}, {
"token" : "Time:2017-10-3 16:39:58.591",
"start_offset" : 15,
"end_offset" : 46,
"type" : "word",
"position" : 1
} ]
}
The first token ("Status:0") returned by the Analyze API is the one you were using in your search.
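To see why the search term then matches, the two analysis steps can be imitated in plain Python. This is only a local illustration of the char_filter and the pattern tokenizer described above, not Elasticsearch code; the function name analyze is made up for the example.

```python
def analyze(text):
    """Imitate the analyzer: strip { } " and then split on commas."""
    # char_filter step: delete the characters { } "
    stripped = text.translate(str.maketrans("", "", '{}"'))
    # tokenizer step: split the remaining text on commas
    return stripped.split(",")


body = '{"Status":"0","Time":"2017-10-3 16:39:58.591"}'
tokens = analyze(body)
print(tokens)
```

Note that a comma inside a value would also split the token, so this scheme only suits bodies as simple as the one shown.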

Delete / add nested objects in Elastic search

I cannot find examples in the Elastic manual's section on nested objects showing how to modify fields and nested objects of documents using RESTful commands in Kibana Sense. I am looking for something similar to Solr's atomic updates, which allow updating specific fields of documents.
What do the RESTful commands in Kibana Sense that accomplish this look like? The only related information I can find in the manual is on Partial Updates to Documents, but I do not know how to apply it to this use case.
For example, straight from the Elastic docs:
PUT my_index
{
"mappings": {
"my_type": {
"properties": {
"user": {
"type": "nested"
}
}
}
}
}
PUT my_index/my_type/1
{
"group" : "fans",
"user" : [
{
"first" : "John",
"last" : "Smith"
},
{
"first" : "Alice",
"last" : "White"
}
]
}
How can I delete an entry in the nested object, so that the document "1" looks like:
{
"group" : "fans",
"user" : [
{
"first" : "John",
"last" : "Smith"
}
]
}
How can I add an entry in the nested object, so that the document "1" looks like:
{
"group" : "fans",
"user" : [
{
"first" : "John",
"last" : "Smith"
},
{
"first" : "Alice",
"last" : "White"
},
{
"first" : "Peter",
"last" : "Parker"
}
]
}
You will have to use scripted updates unless you want to fetch all the nested objects, add or remove items, and re-index them all, which is what the other answer proposes. However, if you have a lot of nested documents, you should use partial updates, additions, and deletes. They are much quicker in terms of data transfer and indexing.
Here is a good article on how to do scripted updates in general:
https://iridakos.com/programming/2019/05/02/add-update-delete-elasticsearch-nested-objects
Unless I misunderstand your question, you just post the updated document to the same document ID each time.
To delete a nested document (or any field):
PUT my_index/my_type/1
{
"group" : "fans",
"user" : [
{
"first" : "Alice",
"last" : "White"
}
]
}
To add a user, add it to the list:
PUT my_index/my_type/1
{
"group" : "fans",
"user" : [
{
"first" : "Alice",
"last" : "White"
},
{
"first" : "Peter",
"last" : "Parker"
}
]
}
Note: Documents in Elasticsearch are immutable. Making a change to a single field causes the entire document to be re-indexed. Nested documents are always re-indexed with the parent document, so if you change a field in the parent, the nested documents are re-indexed as well. This can be a performance issue if the nested documents are large and the parents change frequently.
For this specific use case, you must use a scripted update. In JavaScript the call will look something like:
const documentUpdateInstructions = {
index: "index-name",
id: "document-id",
body: {
script: {
lang: "painless",
source: `ctx._source.myNestedObject.removeIf(object -> object.username == params.username);`,
params: {
username: "my_username"
},
},
},
};
await client.update(documentUpdateInstructions);
This takes a document of the form
document._source = {
...
"myNestedObject": [
{
"username": "my_username",
...
},
{
"username": "not_my_username",
...
}
]
}
and deletes the object inside myNestedObject whose username matches the username provided (in this case my_username). The resulting document will be:
document._source = {
...
"myNestedObject": [
{
"username": "not_my_username",
...
}
]
}
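Applied to the my_index example from the question, the scripted-update bodies for adding and removing a user could be built as below. This is a hedged Python sketch: the helper names are hypothetical, it only constructs the request bodies that would be sent to the document's _update endpoint, and it has not been run against a cluster.

```python
import json


def add_entry_body(field, entry):
    """Body for a scripted update that appends `entry` to the array `field`."""
    return {
        "script": {
            "lang": "painless",
            "source": f"ctx._source.{field}.add(params.entry)",
            "params": {"entry": entry},
        }
    }


def remove_entry_body(field, key, value):
    """Body for a scripted update that removes entries where `key` equals `value`."""
    return {
        "script": {
            "lang": "painless",
            "source": f"ctx._source.{field}.removeIf(o -> o.{key} == params.value)",
            "params": {"value": value},
        }
    }


# Add Peter Parker to, and remove Alice from, the "user" array of document 1.
add_peter = add_entry_body("user", {"first": "Peter", "last": "Parker"})
remove_alice = remove_entry_body("user", "first", "Alice")

print(json.dumps(add_peter, indent=2))
print(json.dumps(remove_alice, indent=2))
```

Passing the values through params rather than concatenating them into the script source keeps the script text constant, so only the field and key names are interpolated.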

How to iterate through JSON data in Python?

I have JSON output like the following:
{
"took":3,"timed_out":false,"_shards":{"total":5,"successful":5,"failed":0},
"hits":{"total":2,"max_score":1.0,
"hits":
[
{
"_index":"management",
"_type":"",
"_id":"/home/myfld1/myid1",
"_score":1.0,
"_source" :
{
"newslides": "User Mgmt1 ",
"metaData":
{
"fileName": "file1"
}
}
},
{
"_index":"management",
"_type":"",
"_id":"/home/myfld3/myid3",
"_score":1.0,
"_source" :
{
"newslides": "User mgmt2 ",
"metaData":
{
"fileName": "file2"
}
}
}
]
}
}
I am trying to get the "fileName" field from the above JSON. I tried this:
for filenames in response["hits"]["hits"][0]["_source"]["newslides"]["metaData"]:
    filearray.append(filenames["fileName"])
But I am getting the following error:
for filenames in response["hits"]["hits"][0]["_source"]["newslides"]["metaData"]:
TypeError: string indices must be integers
Please help me get the file names.
It is simply a dictionary with some lists and other dictionaries in it. The error occurs because "newslides" is a string, so indexing it with "metaData" fails; "metaData" is a sibling of "newslides" inside "_source". Go through the elements one by one:
filearray = []
for x in response["hits"]["hits"]:
    filearray.append(x["_source"]["metaData"]["fileName"])
or in one line:
filearray = [x["_source"]["metaData"]["fileName"] for x in response["hits"]["hits"]]
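If some hits might lack "metaData" or "fileName", direct indexing raises a KeyError. The sketch below is one defensive variant using dict.get; the function name collect_file_names and the third hit without metadata are made up for illustration, and the response is trimmed to the keys that matter.

```python
def collect_file_names(response):
    """Return every _source.metaData.fileName found in the hits."""
    names = []
    for hit in response.get("hits", {}).get("hits", []):
        # Each .get falls back to an empty dict, so missing keys are skipped.
        name = hit.get("_source", {}).get("metaData", {}).get("fileName")
        if name is not None:
            names.append(name)
    return names


response = {
    "hits": {
        "total": 3,
        "hits": [
            {"_source": {"newslides": "User Mgmt1 ", "metaData": {"fileName": "file1"}}},
            {"_source": {"newslides": "User mgmt2 ", "metaData": {"fileName": "file2"}}},
            {"_source": {"newslides": "no metadata here"}},  # hypothetical hit
        ],
    }
}

filearray = collect_file_names(response)
print(filearray)
```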