How to import a big JSON file into a Docker Swarm cluster with the ELK stack?

Basically I want to import JSON data into (Logstash -> Elasticsearch ->) Kibana, but I'm completely new to this and stuck on the different methods, which I don't fully understand and which give me errors or no output.
What I've got is a file test.json containing Wikipedia data in this format:
{
  "results": [
    {
      "curr": "Ohio_\"Heartbeat_Bill\"",
      "n": 43,
      "prev": "other-external",
      "type": "external"
    },
    {
      "curr": "Ohio_\"Heartbeat_Bill\"",
      "n": 1569,
      "prev": "other-search",
      "type": "external"
    },
    {
      "curr": "Ohio_\"Heartbeat_Bill\"",
      "n": 11,
      "prev": "other-internal",
      "type": "external"
    },
    ...
And so on. The file is 1.3 MB, because I've already deleted some of the largest entries.
I tried the curl command:
cat test.json | jq -c '.[] | {"index": {}}, .' | curl -XPOST localhost:9200/_bulk --data-binary @-
and
curl -s -XPOST localhost:9200/_bulk --data-binary @test.json
and
write "{ "index" : { } }" at the beginning of the document
I also tried:
curl -XPUT http://localhost:9200/wiki -d '
{
  "mappings" : {
    "_default_" : {
      "properties" : {
        "curr" : {"type": "string"},
        "n" : {"type": "integer"},
        "prev" : {"type": "string"},
        "type" : {"type": "string"}
      }
    }
  }
}
'
But I always get this error:
{"error":"Content-Type header [application/x-www-form-urlencoded] is not supported","status":406}
Or when I use:
curl localhost:9200/wiki -H "Content-type:application/json" -X POST -d @test.json
I get:
{"error":"Incorrect HTTP method for uri [/wiki] and method [POST], allowed: [GET, HEAD, DELETE, PUT]","status":405}
And when I replace "wiki" with "_bulk", like all the examples seem to have in common, then I get:
{"error":{"root_cause":[{"type":"security_exception","reason":"missing authentication token for REST request [/_bulk]","header":{"WWW-Authenticate":"Basic realm=\"security\" charset=\"UTF-8\""}}],"type":"security_exception","reason":"missing authentication token for REST request [/_bulk]","header":{"WWW-Authenticate":"Basic realm=\"security\" charset=\"UTF-8\""}},"status":401
I have also copy-pasted and adjusted, as far as I understood it, the conf file in Kibana's Logstash pipeline, like this:
input {
  file {
    codec => multiline {
      pattern => '^\{'
      negate => true
      what => previous
    }
    path => ["/home/user/docker-elastic/examples/pretty.json"]
    start_position => "beginning"
    sincedb_path => "/dev/null"
    exclude => "*.gz"
  }
}
filter {
  mutate {
    replace => [ "message", "%{message}}" ]
    gsub => [ 'message', '\n', '' ]
  }
  if [message] =~ /^{.*}$/ {
    json { source => message }
  }
}
output {
  elasticsearch {
    protocol => "http"
    codec => json
    host => "localhost"
    index => "wiki_json"
    embedded => true
  }
  stdout { codec => rubydebug }
}
But when I click "create and deploy" nothing happens.
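One likely reason nothing happens: the protocol, host, and embedded options of the elasticsearch output were removed in later Logstash versions. A minimal sketch of the output block for a recent Logstash (the elastic:changeme credentials are an assumption, matching the security error above):

output {
  # "hosts" replaced "host"; "protocol" and "embedded" no longer exist
  elasticsearch {
    hosts => ["http://localhost:9200"]
    index => "wiki_json"
    user => "elastic"
    password => "changeme"
  }
  stdout { codec => rubydebug }
}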
So I have tried some examples, but like I said, I don't fully understand them and therefore have trouble getting my data into Kibana. I mention Logstash and Elasticsearch because I would love to pass the data through them, too.
Can somebody please explain to me how I can pass this data directly, without manually altering the file? Many answers said that the data cannot be passed in the structure I have, but must be "one line, one document" only. But I cannot alter a file with nearly 40,000 entries by hand, and I would prefer not to have to write a Python script for it.
Maybe there is a tool or something? Or maybe I'm just too stupid to understand the syntax and am doing something wrong?
Any help is appreciated!
Thank you in advance!

As @Ian Kemp answered in the comment section, the problem was that I used POST and not PUT. After that I got an error saying that authentication failed, so I googled it and found the final answer:
curl elastic:changeme@localhost:9200/wiki -H "Content-Type: application/json" -X PUT -d @test.json
with the index line in the file.
This is how I finally got the data into Elasticsearch :)
THANK YOU very much Ian Kemp!
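For anyone who, like the asker, needs to add the index lines to nearly 40,000 entries without hand-editing or a Python script, a minimal jq sketch (assuming Elasticsearch 7+, the default elastic:changeme credentials, and the "results" wrapper shown above; the index name wiki is an assumption):

# Emit one bulk action line plus one compact document line per entry (NDJSON)
jq -c '.results[] | {"index": {"_index": "wiki"}}, .' test.json > bulk.ndjson

# Send it to the _bulk endpoint with the NDJSON content type and basic auth
curl -u elastic:changeme -H 'Content-Type: application/x-ndjson' \
  -X POST 'localhost:9200/_bulk?pretty' --data-binary @bulk.ndjson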

Related

Fiware Scorpio appending new attribute to existing entity

I tried to append a new attribute to an existing entity, the same as in the official example at:
https://scorpio.readthedocs.io/en/latest/API_walkthrough.html#updating-an-entity-appending-to-an-entity
I try to append the attribute humidity:
curl localhost:9090/ngsi-ld/v1/entities/house2%3Asmartrooms%3Aroom1/attrs -s -S -X PATCH -H 'Content-Type: application/json' -H 'Link: https://pastebin.com/raw/Mgxv2ykn' -d @- <<EOF
{
  "humidity": {
    "value": 34,
    "unitCode": "PER",
    "type": "Property",
    "providedBy": {
      "type": "Relationship",
      "object": "smartbuilding:house2:sensor2222"
    }
  }
}
EOF
But I receive the error:
{
  "https://uri.etsi.org/ngsi-ld/default-context/notUpdated" : [ {
    "https://uri.etsi.org/ngsi-ld/attributeName" : {
      "@id" : "https://uri.etsi.org/ngsi-ld/default-context/humidity"
    },
    "https://uri.etsi.org/ngsi-ld/reason" : "attribute not found in original entity"
  } ]
}
Can anybody tell me if they have encountered the same problem?
Or how to fix this?
Thank you in advance!
Hi, if you want to append a new attribute to an existing entity, you should use a POST request instead of PATCH.
Sorry for the inconvenience; we will update the documentation as well.
Thanks,
Amit Raghav.
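For reference, a minimal sketch of the same request with POST instead of PATCH (the entity id, context Link header, and payload are assumed unchanged from the question):

curl localhost:9090/ngsi-ld/v1/entities/house2%3Asmartrooms%3Aroom1/attrs -s -S -X POST -H 'Content-Type: application/json' -H 'Link: https://pastebin.com/raw/Mgxv2ykn' -d @- <<EOF
{
  "humidity": {
    "value": 34,
    "unitCode": "PER",
    "type": "Property",
    "providedBy": {
      "type": "Relationship",
      "object": "smartbuilding:house2:sensor2222"
    }
  }
}
EOF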

Sending nested JSON object using HTTPie

I am trying to use HTTPie to send a nested JSON object, but I cannot find out how. It is pretty clear how to send a flat JSON object, but not a nested one such as:
{ "user": { "name": "john", "age": 10 } }
Update for HTTPie 3.0 released in January 2022:
There’s now built-in support for nested JSON using the HTTPie language:
$ http pie.dev/post \
tool[name]=HTTPie \
tool[about][homepage]=httpie.io \
tool[about][mission]='Make APIs simple and intuitive' \
tool[platforms][]=terminal \
tool[platforms][]=desktop \
tool[platforms][]=web \
tool[platforms][]=mobile
{
"tool": {
"name": "HTTPie",
"about": {
"mission": "Make APIs simple and intuitive",
"homepage": "httpie.io"
},
"platforms": [
"terminal",
"desktop",
"web",
"mobile"
]
}
}
You can learn more about nested JSON in the docs: https://httpie.io/docs/cli/nested-json
Old answer for HTTPie older than 3.0:
You can pass the whole JSON via stdin:
$ echo '{ "user": { "name": "john", "age": 10 } }' | http httpbin.org/post
Or specify the raw JSON as value with :=:
$ http httpbin.org/post user:='{"name": "john", "age": 10 }'
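For what it's worth, raw := fields can be mixed with plain = string fields in one command, so only the nested part needs to be raw JSON (httpbin.org/post is just a test endpoint):

$ http httpbin.org/post name=john user:='{"name": "john", "age": 10}'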
I like this way:
$ http PUT localhost:8080/user <<<'{ "user": { "name": "john", "age": 10 }}'
It is preferable because it has the same prefix as the related commands, so it is convenient to find them with Ctrl+R in bash:
$ http localhost:8080/user/all
$ http GET localhost:8080/user/all # the same as the previous
$ http DELETE localhost:8080/user/234
If you use fish, which doesn't have here strings, I can propose the following workaround:
~> function tmp; set f (mktemp); echo $argv > "$f"; echo $f; end
~> http POST localhost:8080/user < (tmp '{ "user": { "name": "john", "age": 10 }}')
Another approach mentioned in the httpie docs is using a JSON file; this has worked well for me for payloads that are more verbose and deeply nested.
http POST httpbin.org/post < post.json
On Windows 10 (cmd.exe) the syntax is a little bit different due to quoting rules. Properties/strings need to be surrounded by double quotes.
http -v post https://postman-echo.com/post address:="{""city"":""london""}"
POST /post HTTP/1.1
Content-Type: application/json
Host: postman-echo.com
User-Agent: HTTPie/2.3.0
{
"address": {
"city": "london"
}
}
You can also send the whole object using echo, and without double quoting.
echo {"address": {"city":"london"} } | http -v post https://postman-echo.com/post

elasticsearch bulk insert JSON file

I have the following JSON file
I have used awk to get rid of empty spaces, trailing whitespace, and newlines:
awk -v ORS= -v OFS= '{$1=$1}1' data.json
I have added a create request at the top of my data.json followed by \n and the rest of my data.
{"create": {"_index":"socteam", "_type":"products"}}
When I issue the bulk submit request, I get the following error:
curl -XPUT http://localhost:9200/_bulk
{
  "took": 1,
  "errors": true,
  "items": [
    {
      "create": {
        "_index": "socteam",
        "_type": "products",
        "_id": "AVQuGPff-1Y7OIPIJaLX",
        "status": 400,
        "error": {
          "type": "mapper_parsing_exception",
          "reason": "failed to parse",
          "caused_by": {
            "type": "not_x_content_exception",
            "reason": "Compressor detection can only be called on some xcontent bytes or compressed xcontent bytes"
          }
        }
      }
    }
  ]
}
Any idea what this error means? I haven't created any mapping; I'm using vanilla Elasticsearch.
According to this doc, you have to specify the index and type in the URL:
curl -XPUT 'localhost:9200/socteam/products/_bulk?pretty' --data-binary "@data.json"
It works for PUT and POST methods.
And your data.json file should have a structure like:
{"index":{"_id":"1"}}
{"name": "John Doe" }
{"index":{"_id":"2"}}
{"name": "Jane Doe" }
Maybe there is another method to import data, but this is the only one I know... Hope it helps...
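If the original file is a plain JSON array, a hedged jq sketch of producing that interleaved structure instead of the awk pass (the array layout of data.json is an assumption):

# Interleave an empty action line and a compact document line per element
jq -c '.[] | {"index": {}}, .' data.json > bulk.json

# Index and type come from the URL, so the action objects can stay empty
curl -XPOST 'localhost:9200/socteam/products/_bulk?pretty' \
  -H 'Content-Type: application/x-ndjson' --data-binary "@bulk.json"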

How to convert CURL into URI in Elasticsearch

I have a curl command for elasticsearch aggregation as below.
curl -XGET "http://localhost:9200/employee/_search?search_type=count&pretty" -d '{
  "aggregations": {
    "profile": {
      "terms": {
        "field": "_type"
      },
      "aggs": {
        "hits": {
          "top_hits": {
            "size": 1
          }
        }
      }
    }
  }
}'
I want to run the above curl query from my HTML page in the browser. How can I convert it into a normal URL, like a URI search in Elasticsearch?
You can use the source query string parameter in order to pass the body directly in the URL
curl -XGET 'http://localhost:9200/employee/_search?search_type=count&pretty&source={"aggregations":{"profile":{"terms":{"field":"_type"},"aggs":{"hits":{"top_hits":{"size":1}}}}}}'
(note the source parameter appended to the query string)
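Because the browser needs the JSON URL-encoded, it may be easiest to let curl do the encoding first; a hedged sketch (-G turns the --data-urlencode pairs into query string parameters, and on Elasticsearch 6+ the extra source_content_type parameter is also required; both details are assumptions about your setup):

curl -G 'localhost:9200/employee/_search?search_type=count&pretty' \
  --data-urlencode 'source={"aggregations":{"profile":{"terms":{"field":"_type"},"aggs":{"hits":{"top_hits":{"size":1}}}}}}' \
  --data-urlencode 'source_content_type=application/json'

The resulting URL, with the body percent-encoded, is what you would paste into the browser.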

cURL Response JSON RestAPI

I need to delete some wrong data that was inserted in a lot of processes, and I want to figure out whether this is possible with cURL and the REST API, using a script in sh, batch, or something like that:
curl -u admin:admin -i -H "Accept: application/json" -X GET "http://json_bpm.com/wle/v1/service/6112449?action=getData&fields=context"
First I just need the result map.
Output:
{"status":"200","data":{"result":"{\"context\":{\"name\":\"xxx\" (...)
"resultMap":{"context":{"name\":"xxx\" (...) }}}
Because I need to remove the userDelete entry (see below) for thousands of processes and set the data again using curl. If you also know how to remove such entries from JSON, you're the man. :)
{
  "context": {
    "name": "Change Process",
    "startUser": {
      "user": "0001"
    },
    "endUser": {
      "user": "0001"
    },
    "userDelete": {
      "user": "0002"
    },
    "origin": "GUI",
    "userAction": "Change Process"
  }
}
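For the "remove the userDelete entry" part, a minimal jq sketch (assuming the context payload above is saved as context.json; the file name is hypothetical):

# del() drops the userDelete object and passes the rest of the document through
jq 'del(.context.userDelete)' context.json > cleaned.json

The cleaned document could then be sent back with curl in the same kind of loop over process IDs.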