I am new to ElasticSearch and I need to know if it is possible to assign a custom id to my input. I am using the LinkSO dataset. The JSON file should look like this:
{"index": {"_id": "11"}}
{"title": "title1", "body": "body1", "answer": "answer1"}
{"index": {"_id": "22"}}
{"title": "title2", "body": "body2", "answer": "answer2"}
My PUT command is
PUT /test
{
"settings": {
"analysis": {
"analyzer": {
"my_analyzer": {
"tokenizer": "whitespace",
"filter": [
"lowercase",
"porter_stem"
]
}
}
}
},
"mappings": {
"properties": {
"title": {
"type": "text",
"analyzer": "my_analyzer",
"similarity": "BM25"
},
"body": {
"type": "text",
"analyzer": "my_analyzer",
"similarity": "BM25"
},
"answer": {
"type": "text",
"analyzer": "my_analyzer",
"similarity": "BM25"
}
}
}
}
I am using this command on cmd
curl -s -H "Content-Type: application/json" -XPOST localhost:9200/test/_doc/_bulk --data-binary "#test.json"
When inserting only one json data like
{"title": "title1", "body": "body1", "answer": "answer1"}
It works fine. The data is inserted and I get the output. But whenever I try to add 2 data like
{"title": "title1", "body": "body1", "answer": "answer1"}
{"title": "title2", "body": "body2", "answer": "answer2"}
The json file shows end of file expected error. The same happens when I use add index on the json file.
Can someone guide me how to work it and how can I add the "_id" as well because ElasticSearch is giving id on its own as _bulk
In your file you need to have the data as below. To set the id you need this line, {"index": {"_id": "11"}}, before each document. Another important point is to add a line break at the end of the file.
{"index": {"_id": "11"}}
{"title": "title1", "body": "body1", "answer": "answer1"}
{"index": {"_id": "22"}}
{"title": "title2", "body": "body2", "answer": "answer2"}
\n
Related
So i have reviewed and tested the code in Combining the bookmark JSON files from Chome and Edge Chromium into one file using PowerShell? which was very handy to understand some things about JSON and powershell.
I have Chrome's and Edge's bookmarks exported into 2 discreet files by using the code in the link.
Now the problem i have is that if i combine the 2 it is changing the format for the new bookmark file is not properly readable by edge.
Ideally what I need to do is re-write the file adding entries to specific nodes (i am not sure about the JSON nomeclature so please excuse my ignorance)
So an Edge/Chrome bookmark file looks like:
{
"checksum": "555e69b0181a1a30d58cc3fff59f7dad",
"roots": {
"bookmark_bar": {
"children": [ {
"date_added": "13301408662223632",
"guid": "7b8047d6-eb01-4629-a960-a81307fbb0b0",
"id": "17",
"name": "Home - BBC News",
"type": "url",
"url": "https://www.bbc.co.uk/news"
}, {
"date_added": "13301408702331979",
"guid": "eb1d3e92-eb5a-4011-a6ea-e6f15ec9a58a",
"id": "20",
"name": "Your stream on SoundCloud",
"type": "url",
"url": "https://soundcloud.com/stream"
} ],
"date_added": "13301408646379132",
"date_modified": "13301408702331979",
"guid": "0bc5d13f-2cba-5d74-951f-3f233fe6c908",
"id": "1",
"name": "Bookmarks bar",
"type": "folder"
},
"other": {
"children": [ ],
"date_added": "13301408646379134",
"date_modified": "0",
"guid": "82b081ec-3dd3-529c-8475-ab6c344590dd",
"id": "2",
"name": "Other bookmarks",
"type": "folder"
},
"synced": {
"children": [ ],
"date_added": "13301408646379137",
"date_modified": "0",
"guid": "4cf2e351-0e85-532b-bb37-df045d8f8d0f",
"id": "3",
"name": "Mobile bookmarks",
"type": "folder"
}
},
"version": 1
}
i would like to be able to add to nodes under "bookmark_bar" and "other":
So my code from reading these 2 files and stroring them as JSON is:
$data1 = Get-Content $BackupStore/Bookmarks-chrome.json | ConvertFrom-Json
$data2 = Get-Content $BackupStore/Bookmarks-edge.json | ConvertFrom-Json
$store1=$data1.roots
$store2=$data2.roots
How should i proceed now combining them in their respective nodes?
I ran curl command and then parsed the value ("id").
request:
curl "http://192.168.22.22/test/index/limit:1/page:1/sort:id/pag1.json" | jq -r '.[0].id'
curl response:
[
{
"id": "381",
"org_id": "9",
"date": "2018-10-10",
"info": "THIS IS TEST",
"uuid": "5bbd1b41bc",
"published": 1,
"an": "2",
"attribute_count": "4",
"orgc_id": "8",
"timestamp": "1",
"dEST": "0",
"sharing": "0",
"proposal": false,
"locked": false,
"level_id": "1",
"publish_timestamp": "0",
"disable_correlation": false,
"extends_uuid": "",
"Org": {
"id": "5",
"name": "test",
"uuid": "5b9bc"
},
"Orgc": {
"id": "1",
"name": "test",
"uuid": "5b9f93bdeac1b41bc"
},
"ETag": []
}
]
jq response:
381
Now I'm trying to get the "id" number 381, and then to create a new JSON file on the disk when I place the "id" number in the right place.
The new JSON file for example:
{
"request": {
"Event": {
"id": "381",
"task": "new"
}
}
}
Given your input, this works:
jq -r '{"request": {"Event": {"id": .[0].id, "task": "new"}}}' > file
I have a json file and I want to add some value from top in another place in json.
I am trying to use jq command line.
{
"channel": "mychannel",
"videos": [
{
"id": "10",
"url": "youtube.com"
},
{
"id": "20",
"url": "youtube.com"
}
]
}
The output would be:
{
"channel": "mychannel",
"videos": [
{
"channel": "mychannel",
"id": "10",
"url": "youtube.com"
},
{
"channel": "mychannel",
"id": "20",
"url": "youtube.com"
}
]
}
in my json the "channel" is static, same value always. I need a way to concatenate always in each video array.
Someone can help me?
jq .videos + channel
Use a variable to remember .channel in the later stages of the pipeline.
$ jq '.channel as $ch | .videos[].channel = $ch' tmp.json
{
"channel": "mychannel",
"videos": [
{
"id": "10",
"url": "youtube.com",
"channel": "mychannel"
},
{
"id": "20",
"url": "youtube.com",
"channel": "mychannel"
}
]
}
I am starting with elasticsearch now and i don't know anything about it.
I have folowing .JSON:
[
{
"label": "Admin Law",
"tags": [
"#admin"
],
"owner": "generalTopicTagText"
},
{
"label": "Judicial review",
"tags": [
"#JR"
],
"owner": "generalTopicTagText"
},
{
"label": "Admiralty/Shipping",
"tags": [
"#shipping"
],
"owner": "generalTopicTagText"
}
]
My mapping is this:
{
"topic_tax": {
"properties": {
"label": {
"type": "string",
"index": "not_analyzed"
},
"tags": {
"type": "string",
"index_name": "tag"
},
"owner": {
"type": "string",
"index": "not_analyzed"
}
}
}
}
I need to put the first .Json into Elasticsearch, but it does not work.
All I know is that i am defining only 1 of this:
{
"label": "Judicial review",
"tags": [
"#JR"
],
"owner": "generalTopicTagText"
}
So when i try to put all of them with my elasticsearch.init, it will not work.
But I really don't know how to declare the mapping.Json to put the all .Json, it is like i need something like a for there.
You have to insert them json after json. But what you should do is use the bulk api of elasticsearch to insert multiple documents in one request. Check this api doc to see how it works
You can do something like this
curl -XPUT 'localhost:9000/es/post/1?version=2' -d '{
"text" : "your test message!"
}'
here is the documentation for index json with elasticsearch
I want to index & search nested json in solr. Here is my json code
{
"id": "44444",
"headline": "testing US",
"generaltags": [
{
"type": "person",
"name": "Jayalalitha",
"relevance": "0.334",
"count": 1
},
{
"type": "person",
"name": "Kumar",
"relevance": "0.234",
"count": 1
}
],
"socialtags": {
"type": "SocialTag",
"name": "US",
"importance": 2
},
"topic": {
"type": "Topic",
"name": "US",
"score": "0.936"
}
}
When I try to Index, I'm getting the error "Error parsing JSON field value. Unexpected OBJECT_START"
When we tried to use Multivalued Field & index, we couldn't able to search using the multivalued field? Its returning "Undefined Field"
Also Please advice if I need to do any changes in schema.xml file?
You are nesting child documents within your document. You need to use the proper syntax for nested child documents in JSON:
[
{
"id": "1",
"title": "Solr adds block join support",
"content_type": "parentDocument",
"_childDocuments_": [
{
"id": "2",
"comments": "SolrCloud supports it too!"
}
]
},
{
"id": "3",
"title": "Lucene and Solr 4.5 is out",
"content_type": "parentDocument",
"_childDocuments_": [
{
"id": "4",
"comments": "Lots of new features"
}
]
}
]
Have a look at this article which describes JSON child documents and block joins.
Using the format mentioned by #qux you will face "Expected: OBJECT_START but got ARRAY_START at [16]",
"code": 400
as when JSON starting with [....] will parsed as a JSON array
{
"id": "44444",
"headline": "testing US",
"generaltags": [
{
"type": "person",
"name": "Jayalalitha",
"relevance": "0.334",
"count": 1
},
{
"type": "person",
"name": "Kumar",
"relevance": "0.234",
"count": 1
}
],
"socialtags": {
"type": "SocialTag",
"name": "US",
"importance": 2
},
"topic": {
"type": "Topic",
"name": "US",
"score": "0.936"
}
}
The above format is correct.
Regarding searching. Kindly use the index to search for the elements of the JSON array.
The workaround for this can be keeping the whole JSON object inside other JSON object and the indexing it
I was suggesting to keep the whole data inside another JSON object. You can try the following way
{
"data": [
{
"id": "44444",
"headline": "testing US",
"generaltags": [
{
"type": "person",
"name": "Jayalalitha",
"relevance": "0.334",
"count": 1
},
{
"type": "person",
"name": "Kumar",
"relevance": "0.234",
"count": 1
}
],
"socialtags": {
"type": "SocialTag",
"name": "US",
"importance": 2
},
"topic": {
"type": "Topic",
"name": "US",
"score": "0.936"
}
}
]
}
see the syntax in http://yonik.com/solr-nested-objects/
$ curl http://localhost:8983/solr/demo/update?commitWithin=3000 -d '
[
{id : book1, type_s:book, title_t : "The Way of Kings", author_s : "Brandon Sanderson",
cat_s:fantasy, pubyear_i:2010, publisher_s:Tor,
_childDocuments_ : [
{ id: book1_c1, type_s:review, review_dt:"2015-01-03T14:30:00Z",
stars_i:5, author_s:yonik,
comment_t:"A great start to what looks like an epic series!"
}
,
{ id: book1_c2, type_s:review, review_dt:"2014-03-15T12:00:00Z",
stars_i:3, author_s:dan,
comment_t:"This book was too long."
}
]
}
]'
supported from solr 5.3