Logstash - import nested JSON into Elasticsearch

I have a large number (~40,000) of nested JSON objects that I want to insert into an Elasticsearch index.
The JSON objects are structured like this:
{
  "customerid": "10932",
  "date": "16.08.2006",
  "bez": "xyz",
  "birthdate": "21.05.1990",
  "clientid": "2",
  "address": [
    {
      "addressid": "1",
      "title": "Mr",
      "street": "main str",
      "valid_to": "21.05.1990",
      "valid_from": "21.05.1990"
    },
    {
      "addressid": "2",
      "title": "Mr",
      "street": "melrose place",
      "valid_to": "21.05.1990",
      "valid_from": "21.05.1990"
    }
  ]
}
So a JSON field (address in this example) can have an array of JSON objects.
What would a Logstash config look like to import JSON files/objects like this into Elasticsearch? The Elasticsearch mapping for this index should simply mirror the structure of the JSON, and the Elasticsearch document ID should be set to customerid.
input {
  stdin {
    id => "JSON_TEST"
  }
}
filter {
  json {
    source => "customerid"
    ....
    ....
  }
}
output {
  stdout {}
  elasticsearch {
    hosts => "https://localhost:9200/"
    index => "customers"
    document_id => "%{customerid}"
  }
}

If you have control over what's being generated, the easiest thing to do is to format your input as single-line JSON and then use the json_lines codec.
Just change your stdin to:
stdin { codec => "json_lines" }
and then it'll just work:
cat input_file.json | logstash -f json_input.conf
where input_file.json has lines like:
{"customerid":1,"nested": {"json":"here"}}
{"customerid":2,"nested": {"json":"there"}}
and then you won't need the json filter.
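For reference, a minimal end-to-end config along these lines might look like the following (a sketch only, reusing the index name and customerid document ID from the question):
input {
  stdin { codec => "json_lines" }
}
output {
  stdout {}
  elasticsearch {
    hosts => "https://localhost:9200/"
    index => "customers"
    # use the customerid field of each JSON line as the Elasticsearch document id
    document_id => "%{customerid}"
  }
}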

Related

How to feed a value into a field in a json array in Gatling?

I am using Gatling to test an API that accepts a JSON body like the one below:
{
  "data": {
    "fields": [
      {
        "rank": 1
      },
      {
        "name": "Jack"
      }
    ]
  }
}
I have created a file feeder.json that contains an array of JSON objects like the one above.
Below is feeder.json:
[
  {
    "data": {
      "fields": [
        {
          "rank": 1
        },
        {
          "name": "Jack"
        }
      ]
    }
  }
]
I have created another file template.txt that contains the template of the above JSON.
Below is template.txt:
{
  "data": {
    "fields": [
      {
        "rank": ${data.fields[0].rank} //this is not working
      },
      {
        "name": "Jack"
      }
    ]
  }
}
val jsonFeeder = jsonFile("feeder.json").circular

scenario("Test scenario")
  .feed(jsonFeeder)
  .exec(http("API call test")
    .post("/data")
    .body(ElFileBody("template.txt"))
    .asJson
    .check(status is 200))
I am feeding feeder.json and also sending the JSON body from template.txt. The 'rank' property value should be set from the feeder into the JSON body, but I am getting the error 'Map named 'data' does not contain key 'fields[0]''. I am stuck with this.
Access-by-index syntax uses parentheses, not square brackets:
#{data.fields(0).rank}
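Applied to the template from the question, template.txt would then look something like this (only the rank expression changes):
{
  "data": {
    "fields": [
      {
        "rank": #{data.fields(0).rank}
      },
      {
        "name": "Jack"
      }
    ]
  }
}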

Parsing JSON from CouchDB to ElasticSearch via Logstash

I am not able to parse the JSON from CouchDB into the Elasticsearch index in the desired way.
My CouchDB data looks like this:
{
  "_id": "56161609157031561692637",
  "_rev": "4-4119e8df293a6354be4c9fd7e8b12e68",
  "deleteFlag": "N",
  "entryUser": "John",
  "parameter": "{\"id\":\"14188\",\"rcs_p\":null,\"rcs_e\":null,\"dep_p\":null,\"dep_e\":null,\"dep_place\":null,\"rcf_p\":null,\"rcf_e\":null,\"rcf_place\":null,\"dlv_p\":\"3810\",\"dlv_e\":\"1569\",\"seg_no\":null,\"trans_type\":\"incoming\",\"trans_service\":\"delivery\"}",
  "physicalId": "0",
  "recordDate": "2020-12-28T17:50:16+05:45",
  "tag": "CARGO",
  "uId": "56161609157031561692637",
  "~version": "CgMBKgA="
}
What I am trying to do is be able to search using the nested fields inside the parameter field of the above JSON.
When I put the data into the ES index, it is stored like this:
{
  "_index": "del3",
  "_type": "_doc",
  "_id": "XRCV9XYBx5PRwauO--qO",
  "_version": 1,
  "_score": 0,
  "_source": {
    "@version": "1",
    "doc_as_upsert": true,
    "doc": {
      "physicalId": "0",
      "recordDate": "2020-12-27T12:56:45+05:45",
      "tag": "CARGO",
      "~version": "CgMBGgA=",
      "uId": "48541609052212485430933",
      "_rev": "3-937bf92e6010afec13664b1d9d06844b",
      "deleteFlag": "N",
      "entryUser": "John",
      "parameter": "{\"id\":\"4038\",\"rcs_p\":null,\"rcs_e\":null,\"dep_p\":null,\"dep_e\":null,\"dep_place\":null,\"rcf_p\":null,\"rcf_e\":null,\"rcf_place\":null,\"dlv_p\":\"5070\",\"dlv_e\":\"2015\",\"seg_no\":null,\"trans_type\":\"incoming\",\"trans_service\":\"delivery\"}"
    },
    "@timestamp": "2021-01-12T07:53:33.978Z"
  },
  "fields": {
    "@timestamp": [
      "2021-01-12T07:53:33.978Z"
    ],
    "doc.recordDate": [
      "2020-12-27T07:11:45.000Z"
    ]
  }
}
I want to be able to access the fields inside the parameter (id, rcs_p, rcs_e, ..) in Elasticsearch.
Here is my logstash.conf file:
input {
  couchdb_changes {
    host => "<host_name>"
    port => 5984
    db => "mychannel_asset$management"
    keep_id => false
    keep_revision => true
    #initial_sequence => 0
    always_reconnect => true
    sequence_path => "/usr/share/logstash/config/seqfile"
  }
}
filter {
  json {
    source => "[parameter]"
    remove_field => ["[parameter]"]
  }
}
output {
  if ([doc][tag] == "CARGO") {
    elasticsearch {
      hosts => ["http://elasticsearch:9200"]
      index => "del3"
      user => elastic
      password => changeme
    }
  }
}
How do I achieve my desired result? I also tried creating a custom template that defines a nested type for parameter, but no luck yet. Any help would be appreciated.
I think you did almost everything right. I'm not too sure about the actual structure, but one of these might work:
filter {
  json {
    source => "parameter"
    target => "parameter"
  }
}

filter {
  json {
    source => "[doc][parameter]"
    target => "[doc][parameter]"
  }
}
I don't know how the CouchDB input plugin works, but it seems to put everything under a doc object.
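Based on the document structure shown in the question, a sketch of the second variant in context might look like this (parameter_parsed is just a name I made up for the parsed object so the raw string can be dropped afterwards; untested):
filter {
  json {
    # the CouchDB change feed nests the original document under [doc],
    # so the stringified JSON lives at [doc][parameter]
    source => "[doc][parameter]"
    # parse it into a separate object field ...
    target => "[doc][parameter_parsed]"
    # ... and drop the raw string once it has been parsed
    remove_field => ["[doc][parameter]"]
  }
}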

API POST request in Julia

I am trying to convert some Python code to Julia. Here is the Python code:
import requests

url = "http://api.scb.se/OV0104/v1/doris/sv/ssd/START/BE/BE0101/BE0101G/BefUtvKon1749"
json = {
    "query": [
        {
            "code": "Kon",
            "selection": {
                "filter": "item",
                "values": [
                    "1",
                    "2"
                ]
            }
        },
        {
            "code": "ContentsCode",
            "selection": {
                "filter": "item",
                "values": [
                    "000000LV"
                ]
            }
        }
    ],
    "response": {
        "format": "px"
    }
}
r = requests.post(url=url, json=json)
Below is the Julia code, which does not work and gives this error message:
syntax: { } vector syntax is discontinued around path:8
top-level scope at population_data.jl:8
using DataFrames, DataFramesMeta, HTTP, JSON3

url = "http://api.scb.se/OV0104/v1/doris/sv/ssd/START/BE/BE0101/BE0101G/BefUtvKon1749"
json = {
    "query": [
        {
            "code": "Kon",
            "selection": {
                "filter": "item",
                "values": [
                    "1",
                    "2",
                    "1+2"
                ]
            }
        },
        {
            "code": "ContentsCode",
            "selection": {
                "filter": "item",
                "values": [
                    "000000LV"
                ]
            }
        }
    ],
    "response": {
        "format": "px"
    }
}
r = HTTP.post(url, json)
My attempts to solve this are the following:
Convert the json variable to a string using """ around it.
Converting the JSON string to Julia data types, using JSON3.read()
Passing the converted JSON string to the POST request. This gives the following error:
IOError(Base.IOError("read: connection reset by peer (ECONNRESET)", -54) during request(http://api.scb.se/OV0104/v1/doris/sv/ssd/START/BE/BE0101/BE0101G/BefUtvKon1749)
None of it works, and I am not even sure that it is about the JSON format. It could be that I am passing the wrong parameters to the POST request. What should I do?
One way of solving this is to build the parameters as native Julia data structures and then use JSON to convert them into the body of your POST request.
Dictionaries in Julia are built using a syntax like Dict(key => value). Arrays are built using the standard syntax [a, b, c]. The native Julia data structure equivalent to your parameters would look like this:
params = Dict(
    "query" => [
        Dict("code" => "Kon",
             "selection" => Dict(
                 "filter" => "item",
                 "values" => [
                     "1",
                     "2",
                     "1+2"
                 ])
        ),
        Dict("code" => "ContentsCode",
             "selection" => Dict(
                 "filter" => "item",
                 "values" => [
                     "000000LV"
                 ])
        )
    ],
    "response" => Dict(
        "format" => "px"
    ))
Then, you can use JSON.json() to build the JSON representation of it as a string and pass it to the HTTP request:
using HTTP
using JSON

url = "http://api.scb.se/OV0104/v1/doris/sv/ssd/START/BE/BE0101/BE0101G/BefUtvKon1749"

# send the request
r = HTTP.request("POST", url,
                 ["Content-Type" => "application/json"],
                 JSON.json(params))

# retrieve the response body as a string
b = String(r.body)
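Assuming the request succeeds, a quick sanity check could look like this (note that the response format requested here is px, i.e. plain text rather than JSON, so just inspect the status code and the start of the body):
# print the HTTP status code and the first part of the response body
println(r.status)
println(first(b, 200))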

How to add root node in JSON for every object?

I convert an Excel file to JSON in order to import it into my Firebase DB.
After conversion, I have the JSON data in the format below:
[
  {
    "ProductNumber": "7381581",
    "SKU": "test3",
  },
  {
    "ProductNumber": "7381582",
    "SKU": "test",
  },
  {..}
]
But I need it like this
{
"7381581" :{
"ProductNumber": "7381581",
"SKU": "test3",
},
"7381582":{
"ProductNumber": "7381582",
"SKU": "test",
},{..}
}
How can I change the spreadsheet records to get the JSON in the above format? Or, how should I add the key values to the JSON dynamically?
You can use reduce as suggested to iterate over the original array and transform it into an object.
data.reduce((prev, current) => {
  prev[current.ProductNumber] = current;
  return prev;
}, {});
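For example, with the sample records from the question (a runnable sketch; the console.log is only there to show the resulting keyed shape):
const data = [
  { "ProductNumber": "7381581", "SKU": "test3" },
  { "ProductNumber": "7381582", "SKU": "test" }
];

// build an object keyed by ProductNumber
const byProductNumber = data.reduce((prev, current) => {
  prev[current.ProductNumber] = current;
  return prev;
}, {});

console.log(byProductNumber);
// { "7381581": { ProductNumber: "7381581", SKU: "test3" },
//   "7381582": { ProductNumber: "7381582", SKU: "test" } }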

json array parsing issue with logstash

We want to implement service request tracing using the Logstash http input plugin, with the payload in JSON array format.
We are getting the following error when trying to parse the JSON array:
error:
:message=>"gsub mutation is only applicable for Strings, skipping", :field=>"message", :value=>nil, :level=>:debug, :file=>"logstash/filters/mutate.rb", :line=>"322", :method=>"gsub"}
:message=>"Exception in filterworker", "exception"=>#<LogStash::ConfigurationError: Only String and Array types are splittable. field:message is of type = NilClass>
My JSON array is:
{
  "data": [
    {
      "appName": "DemoApp",
      "appVersion": "1.1",
      "deviceId": "1234567",
      "deviceName": "moto e",
      "deviceOSVersion": "5.1",
      "packageName": "com.DemoApp",
      "message": "testing null pointer exception",
      "errorLog": "null pointer exception"
    },
    {
      "appName": "DemoApp",
      "appVersion": "1.1",
      "deviceId": "1234567",
      "deviceName": "moto e",
      "deviceOSVersion": "5.1",
      "packageName": "com.DemoApp",
      "message": "testing illegal state exception",
      "errorLog": "illegal state exception"
    }
  ]
}
My Logstash config is:
input {
  http {
    codec => "plain"
  }
}
filter {
  json {
    source => "message"
  }
  mutate { gsub => [ "message", "},", "shr" ] }
  split {
    terminator => "shr"
    field => "data"
  }
}
output {
  stdout { codec => "json" }
  gelf {
    host => localhost
    facility => "%{type}"
    level => ["%{SeverityLevel}", "INFO"]
    codec => "json"
  }
  file {
    path => "/chroot/result.log"
  }
}
Any help would be appreciated.
Logstash has a default field named message (the http input with the plain codec puts the raw request body there), so the message field inside your JSON overlaps with it. Consider renaming the message field in your JSON to something else.
The other option may be to use the target setting and then reference the fields under that target, like:
json { source => "message" target => "data"}
mutate { gsub => [ "[data][message]", "\}\,\r\n\r\n\{", "\}shr\{" ] }
I hope this helps.
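As an alternative sketch (assuming the whole request body arrives as a single JSON document; untested): since the payload is already valid JSON, you could let the json codec parse it on the input and then split directly on the data array, with no gsub at all:
input {
  http {
    # parse the request body as JSON instead of plain text
    codec => "json"
  }
}
filter {
  # emit one event per element of the data array
  split {
    field => "data"
  }
}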