Bulk loading JSON objects as documents into elasticsearch - json

Is there a way to bulk load the data below into elasticsearch without modifying the original content? I want each object to be POSTed as a single document. At the moment I'm using Python to parse the individual objects and POST them one at a time.
{
{"name": "A"},
{"name": "B"},
{"name": "C"},
{"name": "D"},
}
Doing this type of processing in production from REST servers into elasticsearch is taking a lot of time.
Is there a single POST/curl command that can upload the file above in one go, with elasticsearch parsing it and making each object its own document?
We're using elasticsearch 1.3.2

Yes, you can use the bulk API via curl through the _bulk endpoint, but it won't do custom parsing. Whatever process creates the file could format it to the ES bulk specification, if that is an option. See here:
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/docs-bulk.html
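For reference, a rough sketch of what the bulk request body for the sample data would look like; the index and type names (test, docs) are placeholders, each document line is preceded by an action line, and every line (including the last) must end in a newline:
{"index": {"_index": "test", "_type": "docs"}}
{"name": "A"}
{"index": {"_index": "test", "_type": "docs"}}
{"name": "B"}
{"index": {"_index": "test", "_type": "docs"}}
{"name": "C"}
{"index": {"_index": "test", "_type": "docs"}}
{"name": "D"}
curl -XPOST 'http://localhost:9200/_bulk' --data-binary @bulk.json
Note that --data-binary preserves the newlines the bulk API requires, whereas -d would strip them.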
There is also bulk support in python via helper. See here:
http://elasticsearch-py.readthedocs.org/en/master/helpers.html
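A minimal sketch with the Python bulk helper, assuming the objects are available as a list of dicts; the index and doc type names are placeholders (on 1.x a _type is still expected):

from elasticsearch import Elasticsearch
from elasticsearch.helpers import bulk

# Defaults to localhost:9200; point this at your cluster.
es = Elasticsearch()

docs = [{"name": "A"}, {"name": "B"}, {"name": "C"}, {"name": "D"}]

# One bulk action per source object; the documents themselves are unchanged.
actions = [{"_index": "test", "_type": "docs", "_source": doc} for doc in docs]

# Sends the actions in batched _bulk requests instead of one POST per document.
bulk(es, actions)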

Related

Azure Data Factory - extracting information from Data Lake Gen 2 JSON files

I have an ADF pipeline loading raw log data as JSON files into a Data Lake Gen 2 container.
We now want to extract information from those JSON files and I am trying to find the best way to get information from said files.
I found that Azure Data Lake Analytics and U-SQL scripts are pretty powerful and also cheap, but they require a steep learning curve.
Is there a recommended way to parse JSON files and extract information from them? Would Data Lake tables be adequate storage for this extracted information, and could they then act as a source for a downstream reporting process?
And finally, will Azure Data Factory ever be able to parse JSON with nested arrays?
We can parse JSON files and extract information via a data flow, and we can parse JSON with nested arrays via the Flatten transformation in a mapping data flow.
JSON example:
{
    "count": 1,
    "value": [{
        "obj": 123,
        "lists": [{
            "employees": [{
                "name": "",
                "id": "001",
                "tt_1": 0,
                "tt_2": 4,
                "tt3_": 1
            },
            {
                "name": "",
                "id": "002",
                "tt_1": 10,
                "tt_2": 8,
                "tt3_": 1
            }]
        }]
    }]
}
Flatten transformation settings and output preview:
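As a rough illustration, unrolling by value.lists.employees in the Flatten transformation should produce one output row per employee, along the lines of:
{"count": 1, "obj": 123, "name": "", "id": "001", "tt_1": 0, "tt_2": 4, "tt3_": 1}
{"count": 1, "obj": 123, "name": "", "id": "002", "tt_1": 10, "tt_2": 8, "tt3_": 1}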
Mapping data flows follow an extract, load, and transform (ELT) approach and work with staging datasets that are all in Azure; only certain dataset types can be used in a source transformation.
So I think using a data flow in ADF is the easiest way to extract the information and then act as a source for the downstream reporting process.

Importing Well-Structured JSON Data into ElasticSearch via Cloud Watch

Is there known science for getting JSON data logged via Cloud Watch imported into an Elasticsearch instance as well-structured JSON?
That is -- I'm logging JSON data during the execution of an Amazon Lambda function.
This data is available via Amazon's Cloud Watch service.
I've been able to import this data into an elastic search instance using functionbeat, but the data comes in as an unstructured message.
"_source" : {
"#timestamp" : "xxx",
"owner" : "xxx",
"message_type" : "DATA_MESSAGE",
"cloud" : {
"provider" : "aws"
},
"message" : ""xxx xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx INFO {
foo: true,
duration_us: 19418,
bar: 'BAZ',
duration_ms: 19
}
""",
What I'm trying to do is get a document indexed into Elasticsearch that has a foo field, a duration_us field, a bar field, and so on, instead of one that just has a plain-text message field.
It seems like there are a few different ways to do this, but I'm wondering if there's a well-trodden path for this sort of thing using Elastic's default tooling, or if I'm doomed to one more one-off hack.
Functionbeat is a good starting point and will allow you to keep it as "serverless" as possible.
To process the JSON, you can use the decode_json_fields processor.
The problem is that your message isn't really JSON though. Possible solutions I could think of:
A dissect processor in Functionbeat that extracts the JSON part of the message and passes it on to decode_json_fields, also in Functionbeat (sketched below). I'm wondering if trim_chars couldn't be abused for that, trimming any possible characters except for curly braces.
If that is not enough, you could do all the processing in Elasticsearch's ingest pipeline, where you would probably stitch this together with a Grok processor and then the JSON processor.
Only log a JSON message if you can, to make your life simpler; potentially move the log level into the JSON structure.
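A minimal Functionbeat processors sketch of the first option, assuming the Lambda is changed to log a well-formed JSON object after the INFO marker; the tokenizer pattern and field names here are illustrative, not taken from the question:

processors:
  - dissect:
      # Split "<timestamp> <request id> INFO <payload>"; the last key captures the rest of the message.
      tokenizer: "%{ts} %{request_id} INFO %{json_payload}"
      field: "message"
      target_prefix: "dissected"
  - decode_json_fields:
      # Only works once the payload is valid JSON (double-quoted keys and strings).
      fields: ["dissected.json_payload"]
      target: ""
      max_depth: 2
      add_error_key: true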

nifi invokehttp post complex json

I'm trying to use the InvokeHTTP processor in Apache NiFi to perform a POST request with a complex JSON body.
From this tutorial: http://www.tomaszezula.com/2016/10/30/nifi-and-http-post-configuration
I know how to use the UpdateAttribute processor to add name/value pairs and then apply an additional transformation via AttributesToJSON.
But how do I deal with complex JSON?
For example, I have to call the Google Analytics Reporting API, so I need to perform this request:
POST https://analyticsreporting.googleapis.com/v4/reports:batchGet
{
    "reportRequests":
    [
        {
            "viewId": "XXXX",
            "dateRanges": [{"startDate": "2014-11-01", "endDate": "2014-11-30"}],
            "metrics": [{"expression": "ga:users"}]
        }
    ]
}
Any ideas?
You can use the GenerateFlowFile and ReplaceText processors to provide a template as the flowfile content and then populate the actual values. Once that JSON object is formed as the flowfile content, it should be easy to send it via POST using InvokeHTTP.
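A rough sketch of that approach, assuming the view ID and dates have already been set as flowfile attributes (the attribute names ga.viewId, ga.startDate, ga.endDate are made up): configure ReplaceText with Replacement Strategy set to Always Replace and a Replacement Value holding the JSON template with Expression Language placeholders:
{
    "reportRequests": [
        {
            "viewId": "${ga.viewId}",
            "dateRanges": [{"startDate": "${ga.startDate}", "endDate": "${ga.endDate}"}],
            "metrics": [{"expression": "ga:users"}]
        }
    ]
}
InvokeHTTP then POSTs the flowfile content as the request body; set its Content-Type property to application/json.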

JSON logfile into Solr via Apache nifi

I'm trying to read a JSON log file and insert it into a Solr collection using Apache NiFi. The logfile is in the following format (one JSON object per line):
{"#timestamp": "2017-02-18T02:16:50.496+04:00","message": "hello"}
{"#timestamp": "2017-02-18T02:16:50.496+04:00","message": "hello"}
{ "#timestamp": "2017-02-18T02:16:50.496+04:00","message": "hello"}
I was able to load the file and split it by lines using different processors. How can I proceed further?
You can use the PutSolrContentStream processor to write content to Solr from Apache NiFi. If each flowfile contains a single JSON record (and you should ensure you are splitting the JSON correctly even if it covers multiple lines, so examine SplitJSON vs. SplitText), each will be written to Solr as a different document. You can also use MergeContent to write in batches and be more efficient.
Bryan Bende wrote a good article on the Apache site on how to use this processor.
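A rough sketch of the PutSolrContentStream configuration for this case, assuming a standalone (non-Cloud) Solr instance; the URL and core name are placeholders:
Solr Type: Standard
Solr Location: http://localhost:8983/solr/logs
Content Stream Path: /update/json/docs
Content-Type: application/json
With those settings, each flowfile's JSON object is posted to Solr's JSON document endpoint and indexed as a single document.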

Parse json file in cocos2dx

I am working on a cocos2d-x project and will use my code for both iOS and Android. I am using a JSON file to store unlock data, but I don't know how to parse the data in cocos2d-x. Can I get a brief tutorial (demo code) so that I can proceed with my project?
I have created the JSON file and saved data in it. Here is a demo of what is inside the JSON file:
{
    "id": "food",
    "name": "cherry",
    "price": "20 coins",
    "unlock": {
        "level": 15,
        "advance_unlock_price": "200 coins"
    }
}
You can use rapidjson, which is bundled with Cocos2d-x.
Using Rapidjson in Cocos2D-X: Creating a JSON Document in Code and Serializing it
Use rapidjson if your game is large in scale; otherwise you can use jsoncpp, which is lightweight but not as good as rapidjson. rapidjson is faster than most parsers.
rapidjson:
http://rapidjson.org/
jsoncpp:
https://github.com/open-source-parsers/jsoncpp
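A minimal sketch of reading and parsing the file above with the rapidjson bundled in cocos2d-x 3.x; the file name unlock.json is an assumption, and the member names follow the sample data:

#include "cocos2d.h"
#include "json/document.h"  // rapidjson headers shipped with cocos2d-x

void loadUnlockData()
{
    // Read the bundled JSON file into a string (file name is an assumption).
    std::string jsonStr = cocos2d::FileUtils::getInstance()->getStringFromFile("unlock.json");

    rapidjson::Document doc;
    doc.Parse<0>(jsonStr.c_str());
    if (doc.HasParseError() || !doc.IsObject())
    {
        CCLOG("unlock.json could not be parsed");
        return;
    }

    // Top-level fields from the sample data.
    std::string name  = doc["name"].GetString();   // "cherry"
    std::string price = doc["price"].GetString();  // "20 coins"

    // Nested "unlock" object.
    const rapidjson::Value& unlock = doc["unlock"];
    int level = unlock["level"].GetInt();           // 15
    std::string advancePrice = unlock["advance_unlock_price"].GetString();

    CCLOG("%s (%s) unlocks at level %d, advance unlock for %s",
          name.c_str(), price.c_str(), level, advancePrice.c_str());
}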