Elasticsearch - Sense - Indexing JSON files?

I'm trying to load some JSON files into my local ES instance via Sense, but I can't figure out the code. I know ES has the Bulk API and the Index API, but I can't seem to bring them together. How can I upload/index JSON files to my local ES instance using Sense? Thank you!

Yes, ES has a Bulk API for uploading JSON files to the ES cluster. I don't think that API is practical from Sense, which is just JavaScript running in the browser; high-level clients are available in Java or C# that expose more control over the ES cluster, and I don't think the Chrome browser will support executing the command below.
To upload a JSON file to Elasticsearch using the Bulk API:
1) This command uploads JSON documents from a JSON file:
curl -s -XPOST localhost:9200/_bulk --data-binary "@path_to_file"
2) The JSON file should be formatted as follows:
{ "index" : { "_index" : "test", "_type" : "type1", "_id" : "1" } }
{ "field1" : "value1" }
{ "index" : { "_index" : "test", "_type" : "type1", "_id" : "1" } }
{ "field1" : "value3" }
{ "index" : { "_index" : "test", "_type" : "type1", "_id" : "1" } }
{ "doc" : {"field2" : "value2"} }
Each source line holds the data for one document, and the action line immediately before it holds that document's metadata: the document id, the type within the index, and the index name. (The doc wrapper in the last pair is how a partial document is passed to an update action.)
See the Bulk API documentation: https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-bulk.html
You can also refer to my previous answer.
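If you want to stay inside Sense, you can paste a small bulk body directly into the console instead of referencing a file. A minimal sketch, using the same placeholder index and type names as above:
POST /_bulk
{ "index" : { "_index" : "test", "_type" : "type1", "_id" : "1" } }
{ "field1" : "value1" }
{ "index" : { "_index" : "test", "_type" : "type1", "_id" : "2" } }
{ "field1" : "value3" }
This only works for documents you type or paste in; Sense cannot read files from disk, so for whole files the curl command above is the practical route.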

Related

Importing CSV File in Elasticsearch

I am new to Elasticsearch. I tried to import a CSV file by following the guide, and the import succeeded: it created an index with the documents.
But I found that in every doc the _id contains a random unique id as its value. I want the value of _id to come from a field in the CSV file (the file I'm importing contains a unique id field for every row), via a query or any other way, and I don't know how to do that.
It is not explained in the docs either. A sample document from the Elasticsearch index is shown below:
{
  "_index" : "sample_index",
  "_type" : "_doc",
  "_id" : "nGHXgngBpB_Kjkqcxfj",
  "_score" : 1.0,
  "_source" : {
    "categoryid" : "34128b58-9148-11eb-a8b3-0242ac130003",
    "categoryname" : "Blogs",
    "isdeleted" : "False"
  }
}
While adding an ingest pipeline with the following processor:
{
  "set": {
    "field": "_id",
    "value": "{{categoryid}}"
  }
}
it throws an error.
You can achieve this by modifying the ingest pipeline used to ingest your CSV file.
In the Ingest pipeline area (Advanced section), simply add the following processor at the end of the pipeline and the document ID will be set accordingly:
...
{
  "set": {
    "field": "_id",
    "value": "{{categoryid}}"
  }
}
I added the following processor in the Ingest pipeline section and it works:
{
  "processors": [
    {
      "set": {
        "field": "_id",
        "value": "{{categoryid}}"
      }
    }
  ]
}
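For completeness, here is a sketch of doing the same thing by hand against the cluster, outside the CSV import UI; the pipeline name csv_pipeline is a placeholder, and sample_index is taken from the sample document above:
curl -XPUT 'localhost:9200/_ingest/pipeline/csv_pipeline' -H 'Content-Type: application/json' -d'
{
  "processors": [
    { "set": { "field": "_id", "value": "{{categoryid}}" } }
  ]
}'
curl -XPOST 'localhost:9200/sample_index/_doc?pipeline=csv_pipeline' -H 'Content-Type: application/json' -d'
{ "categoryid" : "34128b58-9148-11eb-a8b3-0242ac130003", "categoryname" : "Blogs", "isdeleted" : "False" }'
Every document indexed through the pipeline then gets its _id from its categoryid field.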

Can we interchange a JSON schema with a YAML schema? Or vice versa?

I have a device application which gets its data in JSON format. This JSON is generated by another, web-based application using a YAML schema.
Now, as the web tool validates this JSON data file against the YAML schema, my device application also has to validate it against a schema. Since resources on my device are limited and we already have JSON schema validation in place, we are restricted to using a schema in JSON format only.
So, my question is: could we replace the YAML schema with a JSON schema for the web tool? The web application has Swagger.
On another note, is there any existing script or open-source tool to convert a YAML schema to a JSON schema?
I'm not sure about the OpenAPI definition. It's a simple schema file that will be used to validate JSON data. The JSON schema (draft v4) has the format below. Our device application is in C++. I'm not sure what is used in the web tool, but it has some Swagger framework that generates the JSON data file for us.
{
  "$schema": "https://json-schema.org/draft/2019-09/schema",
  "definitions": {
    ...
    "foobar_Result" : {
      "type" : "object",
      "properties" : {
        "request" : {
          "type" : "integer"
        },
        "success" : {
          "type" : "boolean"
        },
        "payload" : {
          "type" : "array", "items" : {"$ref" : "#/definitions/foobar_Parameter"}
        }
      },
      "required" : ["request"],
      "additionalProperties" : false
    }
  },
  "$ref" : "#/definitions/foobar_Result"
}
If you are looking to convert between API specification formats, then this tool might help: https://www.apimatic.io/transformer/
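If what you need is just the mechanical YAML-to-JSON conversion, note that YAML and JSON share the same data model, so a schema written in YAML can be dumped as JSON directly. A minimal sketch, assuming Python with PyYAML is installed and using hypothetical file names:
python -c 'import sys, json, yaml; json.dump(yaml.safe_load(sys.stdin), sys.stdout, indent=2)' < schema.yaml > schema.json
The reverse direction is trivial, since any JSON document is already valid YAML.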

Moving mapping from old ElasticSearch to latest ES (5)

I've inherited a pretty old (v2.something) Elasticsearch instance running in the cloud somewhere, and I need to get the data out, starting with the mappings, into a local instance of the latest ES (v5). Unfortunately, it fails with the following error:
% curl -X PUT 'http://127.0.0.1:9200/easysearch?pretty=true' --data @easysearch_mapping.json
{
  "error" : {
    "root_cause" : [
      {
        "type" : "illegal_argument_exception",
        "reason" : "unknown setting [index.easysearch.mappings.espdf.properties.abstract.type] please check that any required plugins are installed, or check the breaking changes documentation for removed settings"
      }
    ],
    "type" : "illegal_argument_exception",
    "reason" : "unknown setting [index.easysearch.mappings.espdf.properties.abstract.type] please check that any required plugins are installed, or check the breaking changes documentation for removed settings"
  },
  "status" : 400
}
The mapping I got from the old instance does contain some fields of this kind:
"espdf" : {
"properties" : {
"abstract" : {
"type" : "string"
},
"document" : {
"type" : "attachment",
"fields" : {
"content" : {
"type" : "string"
},
"author" : {
"type" : "string"
},
"title" : {
"type" : "string"
},
This "espdf" thing probably comes from Meteor's "EasySearch" component, but I have more structures like this in the mapping and new ES rejects each of them (I tried editing the mapping and deleting the "espdf" key and value).
How can I get the new ES to accept the mapping? Is this some legacy issue from 2.x ES and I should somehow convert this to new 5.x ES format?
The reason it fails is that the old ES had a plugin installed called mapper-attachments, which added the attachment mapping type to ES.
In ES 5, this plugin has been replaced by the ingest-attachment plugin, which you can install like this:
bin/elasticsearch-plugin install ingest-attachment
After running this command in your ES_HOME folder, restart your ES cluster and it should go better.
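Note that installing ingest-attachment gives you an ingest processor rather than a mapping type, so the "type": "attachment" entries still have to come out of the mapping. A sketch of the processor side, assuming the plugin is installed (the pipeline name is a placeholder):
curl -XPUT 'localhost:9200/_ingest/pipeline/attachment' -H 'Content-Type: application/json' -d'
{
  "description" : "extract text from base64-encoded attachments",
  "processors" : [
    { "attachment" : { "field" : "document" } }
  ]
}'
Documents indexed through this pipeline get the extracted content, author, and title under an attachment field. While editing the mapping, it is also worth converting the deprecated string types to text or keyword for ES 5.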

Error loading JSON file on Elasticsearch AWS

I've just set up an Elasticsearch domain using the Elasticsearch Service from AWS.
Now I want to feed it some JSON file using:
curl -XPOST 'my-aws-domain-here/_bulk/' --data-binary @base_enquete.json
according to the documentation here.
My JSON file looks like the following:
[{"INDID": "10040","DATENQ": "29/7/2013","Name": "LANDIS MADAGASCAR SA"},
{"INDID": "10050","DATENQ": "14/8/2013","Name": "MADAFOOD SA","M101P": ""}]
which gives me this error:
{"error":"ActionRequestValidationException[Validation Failed: 1: no requests added;]","status":400}
I tried without [ and ]: same error!
Note that I have already set the access policy to be open to the world, for dev-stage purposes.
Any help of any kind will be appreciated :)
This is because the data is in the wrong format.
Please go through the documentation here.
It should be in this format:
action_and_meta_data\n
optional_source\n
action_and_meta_data\n
optional_source\n
....
action_and_meta_data\n
optional_source\n
This means that the content of the file you are sending should be in the following format:
{ "index" : { "_index" : "test", "_type" : "type1", "_id" : "1" } }
{"INDID": "10040","DATENQ": "29/7/2013","Name": "LANDIS MADAGASCAR SA"}
{ "index" : { "_index" : "test", "_type" : "type1", "_id" : "2" } }
{"INDID": "10050","DATENQ": "14/8/2013","Name": "MADAFOOD SA","M101P": ""}

Trying to make nested JSON and point to the same object using multiple keys

Let's say I have this JSON data:
"value1" : { "name" : "Foo" }
"value2" : { "name" : "14" }
"value3" : { "gender" : "Male" }
Now I am trying to do this:
"value1", "value2", "value3" : { "name" : "Foo" }
or maybe this, if at all possible:
["value1", "value2", "value3"] : { "name" : "Foo" }
In a nutshell, I have data that I would like to access using multiple keys pointing to the same data in JSON format, so that I don't have to repeat the same data for different keys.
Here is an example of the data:
"Model 1" : { "E-Series" : ["Green", "Purple"] }
Let's say "Model 2" has the same info as "Model 1". How can I point "Model 2" to the "Model 1" data object in JSON without repeating the same code over and over again?
This is not possible in the JSON format. You can simulate it in code: for example, for Model 2 set "ref": "Model 1" and then programmatically read the data from the Model 1 object.
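For illustration, a sketch of that convention; the "ref" key is an application-level agreement here, not something JSON itself understands:
{
  "Model 1" : { "E-Series" : ["Green", "Purple"] },
  "Model 2" : { "ref" : "Model 1" }
}
Your reading code checks for "ref" and, when present, looks up the referenced key instead of the object itself.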
JSON hasn't got this feature by design.
This is not correct JSON syntax, and JSON has no provisions for links. Your options for encoding object references are:
LD+JSON (http://json-ld.org/)
HAL+JSON (http://stateless.co/hal_specification.html)
JSON-R (http://java.dzone.com/articles/json-r-json-extension-deals)
dojox.json.ref (https://dojotoolkit.org/reference-guide/1.10/dojox/json/ref.html)
etc, etc, ...
your custom data model for references
The benefit of using something more or less standard is better (future) integration, but that may not be relevant for your task.