Querying with Elasticsearch and jCard JSON

I have a data set containing vCards in the jCard JSON format (RFC 7095, https://www.rfc-editor.org/rfc/rfc7095). The problem is
that I want to search on the 'fn' field only.
The vCard data has the following format:
["vcard",
[
["version", {}, "text", "4.0"],
["fn", {}, "text", "John Doe"],
["gender", {}, "text", "M"],
["categories", {}, "text", "computers", "cameras"],
...
]
]
I'm creating a vCard document like this:
> curl -X POST localhost:9200/vcards/id1 -d '{
"id":"id1",
"vcardArray" : ["vcard",
[
["version", {}, "text", "4.0"],
["fn", {}, "text", "John Doe"],
["gender", {}, "text", "M"]
]
],
"status":["registered"]
}'
Normally you would create a specific mapping so that, when the document is analysed, the fn value is indexed as its own field.
Searching on the fn field would then look something like this:
curl -v -X POST http://localhost:9200/vcards/_search -d '{ "query" :
{ "bool" : { "must": { "match" : {
"vcardArray.vcard.fn" :
{ "query" : "Rik Ribbers" , "type" : "phrase" }
} } } }
}'
A potential mapping would look like this, but it does not work:
> curl -X PUT http://localhost:9200/vcards -d '
{
"mappings": {
"vcardArray" : {
"type" : "nested",
"properties" : {
"vcard" : {
"type" : "index",
"index" : "not_analyzed"
}
}
}
}
}'
Any pointer to the correct mapping or query would be helpful.
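One possible workaround, offered as a sketch rather than a definitive answer: in jCard the property name is the first element of each property array, not an object key, so the document contains no field path such as vcardArray.vcard.fn to map or query. A common approach is to flatten the jCard into a plain object before indexing and search that object instead. The function name jcard_properties and the extra field name "vcard" below are hypothetical, chosen only for illustration.

```python
import json


def jcard_properties(jcard):
    """Collect jCard property values by name, e.g. {"fn": ["John Doe"], ...}."""
    tag, properties = jcard[0], jcard[1]
    if tag != "vcard":
        raise ValueError("not a jCard array")
    flat = {}
    # Each property is [name, parameters, value type, value1, value2, ...].
    for name, _params, _value_type, *values in properties:
        flat.setdefault(name, []).extend(values)
    return flat


doc = {
    "id": "id1",
    "vcardArray": ["vcard", [
        ["version", {}, "text", "4.0"],
        ["fn", {}, "text", "John Doe"],
        ["gender", {}, "text", "M"],
        ["categories", {}, "text", "computers", "cameras"],
    ]],
    "status": ["registered"],
}

# "vcard" is a hypothetical field added for searching; it is not part of RFC 7095.
doc["vcard"] = jcard_properties(doc["vcardArray"])
print(json.dumps(doc["vcard"]))
```

With a document shaped like this, a match query could target the field path vcard.fn. How the original vcardArray is stored alongside it is left open here.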

Related

Index a JSON file into elasticsearch command/mapping errors

I'm new to ELK and I want to import a JSON file into Elasticsearch. This is my file:
{
"news":{
"1":{
"_score":1.0,
"_index":"newsvit",
"_source":{
"content":" \u0641\u0647\u06cc\u0645\u0647 \u062d\u0633\u0646\u200c\u0645\u06cc\u0631\u06cc: \u0627\u06af\u0631\u0686\u0647 \u062f\u0631 \u0647\u06cc\u0627\u0647\u0648\u06cc \u0627\u0646\u062a\u062e\u0627\u0628\u0627\u062a \u0631\u06cc\u0627\u0633\u062a \u062c\u0645\u0647\u0648\u0631\u06cc\u060c \u0645\u0648\u0636\u0648\u0639\u06cc \u0645\u0627\u0646\u0646\u062f \u0645\u0639\u0631\u0641\u06cc \u06a9\u0627\u0646\u062f\u06cc\u062f\u0627\u0647\u0627\u06cc \u0634\u0648\u0631\u0627\u06cc \u0634\u0647\u0631 \u062f\u0631 \u062d\u0627\u0634\u06cc\u0647 \u0642\u0631\u0627\u0631 \u06af\u0631\u0641\u062a\u0647\u060c \u0627\u0645\u0627 \u0627\u0645\u0633\u0627\u0644 \u0628\u0647 \u0639\u0646\u0648\u0627\u0646 \u067e\u0646\u062c\u0645\u06cc\u0646 \u062f\u0648\u0631\u0647 \u0627\u0646\u062a\u062e\u0627\u0628 \u0627\u0639\u0636\u0627\u06cc \u0634\u0648\u0631\u0627\u06cc \u0634\u0647\u0631\u060c \u0627\u06cc\u0646 \u0631\u0648\u06cc\u062f\u0627\u062f \u0628\u0647 \u0646\u0633\u0628\u062a \u062f\u0648\u0631\u0647\u200c\u0647\u0627\u06cc \u0642\u0628\u0644\u060c \u0628\u06cc\u0634\u062a\u0631 \u0645\u0648\u0631\u062f \u062a\u0648\u062c\u0647 \u0648\u0627\u0642\u0639 \u0634\u062f\u0647. 
\u0627\u06cc\u0646 \u0627\u0642\u0628\u0627\u0644\u060c \u0686\u0647 \u0627\u0632 \u0633\u0648\u06cc \u0686\u0647\u0631\u0647\u200c\u0647\u0627\u06cc \u0645\u0637\u0631\u062d \u0628\u0631\u0627\u06cc \u062b\u0628\u062a \u0646\u0627\u0645 \u0648 \u0686\u0647 \u0627\u0632 \u0633\u0648\u06cc \u0645\u0631\u062f\u0645 \u0628\u0631\u0627\u06cc \u0645\u0634\u0627\u0631\u06a9\u062a \u062f\u0631 \u0627\u06cc\u0646 \u0631\u0648\u06cc\u062f\u0627\u062f\u060c \u0639\u0644\u062a\u200c\u0647\u0627\u06cc \u06af\u0648\u0646\u0627\u06af\u0648\u0646\u06cc \u0645\u06cc\u200c\u062a\u0648\u0627\u0646\u062f \u062f\u0627\u0634\u062a\u0647 \u0628\u0627\u0634\u062f \u06a9\u0647 \u062a\u0648\u062c\u0647 \u0628\u0647 \u0622\u0646\u060c \u0645\u06cc\u200c\u062a\u0648\u0627\u0646\u062f \u0631\u0627\u0647\u06af\u0634\u0627\u06cc \u0627\u0639\u0636\u0627\u06cc \u0631\u06",
"lead":"\u062c\u0627\u0645\u0639\u0647 > \u0634\u0647\u0631\u06cc - \u0645\u06cc\u0632\u06af\u0631\u062f\u06cc \u062f\u0631\u0628\u0627\u0631\u0647 \u0639\u0645\u0644\u06a9\u0631\u062f \u062f\u0648\u0631\u0647\u200c\u0647\u0627\u06cc \u06af\u0630\u0634\u062a\u0647 \u0634\u0648\u0631\u0627\u06cc \u0634\u0647\u0631\u060c \u0622\u0646\u0686\u0647 \u0627\u0639\u0636\u0627\u06cc \u062c\u062f\u06cc\u062f \u0628\u0627\u06cc\u062f \u0645\u062f \u0646\u0638\u0631 \u062f\u0627\u0634\u062a\u0647 \u0628\u0627\u0634\u0646\u062f \u0648 \u0647\u0645\u0686\u0646\u06cc\u0646 \u0645\u0627\u0647\u06cc\u062a \u0633\u06cc\u0627\u0633\u06cc \u0628\u0648\u062f\u0646 \u06cc\u0627 \u0646\u0628\u0648\u062f\u0646 \u0634\u0648\u0631\u0627\u06cc \u0634\u0647\u0631.",
"agency":"13",
"date_created":1494518193,
"url":"http://www.khabaronline.ir/(X(1)S(bud4wg3ebzbxv51mj45iwjtp))/detail/663749/society/urban",
"image":"uploads/2017/05/11/1589793661.jpg",
"category":"15"
},
"_type":"news",
"_id":"2981643"
},
"2": {
...
Based on what I have learnt, I first tried to create a mapping for it in Kibana's Dev Tools. I want to be able to query and search this data by the fields in _source, such as category, id and so on. This is my mapping:
PUT /main-news-test-data
{
"mappings": {
"properties": {
"_score": {"type":"integer"},
"_index": {"type":"keyword"},
"_type":{"type":"keyword"},
"_id":{"type":"keyword"}
},
"_source":{
"properties": {
"content":{"type":"text"},
"title":{"type":"text"},
"lead":{"type":"text"},
"agency":{"type":"keyword"},
"date_created":{"type":"date"},
"url":{"type":"keyword"},
"image":{"type":"keyword"},
"category":{"type":"keyword"}
}
}
}
}
HEAD main-news-test-data
GET /main-news-test-data/_search?q=*
But when I run this in Dev Tools, I receive this error:
{
"error" : {
"root_cause" : [
{
"type" : "mapper_parsing_exception",
"reason" : "Mapping definition for [_source] has unsupported parameters: [properties : {image={type=keyword}, agency={type=keyword}, date_created={type=date}, title={type=text}, category={type=keyword}, content={type=text}, lead={type=text}, url={type=keyword}}]"
}
],
"type" : "mapper_parsing_exception",
"reason" : "Failed to parse mapping [_doc]: Mapping definition for [_source] has unsupported parameters: [properties : {image={type=keyword}, agency={type=keyword}, date_created={type=date}, title={type=text}, category={type=keyword}, content={type=text}, lead={type=text}, url={type=keyword}}]",
"caused_by" : {
"type" : "mapper_parsing_exception",
"reason" : "Mapping definition for [_source] has unsupported parameters: [properties : {image={type=keyword}, agency={type=keyword}, date_created={type=date}, title={type=text}, category={type=keyword}, content={type=text}, lead={type=text}, url={type=keyword}}]"
}
},
"status" : 400
}
Afterwards I also tried to index my file into Elasticsearch using this PowerShell command:
Invoke-RestMethod "http://localhost:9200/main-news-test-data/doc/_bulk?pretty" -Method Post -ContentType 'application/x-ndjson' -InFile "test.json"
But again I get an error, this time from PowerShell:
Invoke-RestMethod : {
"error" : {
"root_cause" : [
{
"type" : "json_e_o_f_exception",
"reason" : "Unexpected end-of-input: expected close marker for Object (start marker at [Source:
(org.elasticsearch.common.bytes.AbstractBytesReference$MarkSupportingStreamInputWrapper); line: 1, column: 1])\n at
[Source: (org.elasticsearch.common.bytes.AbstractBytesReference$MarkSupportingStreamInputWrapper); line: 2, column: 1]"
}
],
"type" : "json_e_o_f_exception",
"reason" : "Unexpected end-of-input: expected close marker for Object (start marker at [Source:
(org.elasticsearch.common.bytes.AbstractBytesReference$MarkSupportingStreamInputWrapper); line: 1, column: 1])\n at
[Source: (org.elasticsearch.common.bytes.AbstractBytesReference$MarkSupportingStreamInputWrapper); line: 2, column: 1]"
},
"status" : 400
}
So what should I do? How do I import a JSON file into Elasticsearch so that it is queryable by fields?
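From what I read, your mapping is the problem: _score, _index, _type, _id and _source are metadata fields, not fields of your document, so they do not belong under properties, and _source does not accept a properties block (which is what the error message says).
Map only the fields of the document itself: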
PUT /main-news-test-data
{
"mappings": {
"properties": {
"content": {
"type": "text"
},
"title": {
"type": "text"
},
"lead": {
"type": "text"
},
"agency": {
"type": "keyword"
},
"date_created": {
"type": "date"
},
"url": {
"type": "keyword"
},
"image": {
"type": "keyword"
},
"category": {
"type": "keyword"
}
}
}
}
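Your JSON file is also in the wrong format. The _bulk API does not take a single JSON document; it takes newline-delimited JSON (NDJSON), where each action line is followed by its source line on a line of its own, and the body must end with a newline.
A file for the _bulk API looks like this: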
{ "index" : { "_index" : "main-news-test-data", "_id" : "1" } }
{ "field1" : "value1" }
{ "index" : { "_index" : "main-news-test-data", "_id" : "2" } }
{ "field1" : "value2" }
Please also note that "_score": 1.0 has no reason to be in your request, and that _type is deprecated (on 7.0+, _type can only be _doc and should be left out).
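As a rough illustration of the conversion described above, the following sketch turns an export shaped like the file in the question into _bulk lines. It assumes the top-level "news" object maps arbitrary keys to hits that each carry "_id" and "_source"; the function name to_bulk_ndjson and the trimmed sample data are hypothetical.

```python
import json


def to_bulk_ndjson(export, index_name):
    """Build a _bulk request body: one action line plus one source line per hit."""
    lines = []
    for hit in export["news"].values():
        # Action line: only the target index and the document id are kept;
        # _score and _type from the export are dropped.
        lines.append(json.dumps({"index": {"_index": index_name, "_id": hit["_id"]}}))
        # Source line: the document itself, on a single line.
        lines.append(json.dumps(hit["_source"], ensure_ascii=False))
    # The bulk body must end with a newline.
    return "\n".join(lines) + "\n"


sample = {
    "news": {
        "1": {
            "_score": 1.0,
            "_index": "newsvit",
            "_source": {"agency": "13", "date_created": 1494518193, "category": "15"},
            "_type": "news",
            "_id": "2981643",
        }
    }
}

ndjson = to_bulk_ndjson(sample, "main-news-test-data")
print(ndjson, end="")
```

The resulting text can be saved to a file and posted to the _bulk endpoint. Note that date_created in the sample looks like epoch seconds, so the date mapping may need an explicit format such as epoch_second; that is an observation about the sample data, not something stated in the question.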

Using jq to extract multiple fields and create a new object

I have this particular JSON document (an array containing one object),
[
{
"userid" : "fe2e48b7-858b-4a0d-964a-efb8483a00c4",
"lastupdateddate" : "84798000-13cd-11ea-8080-808080808080",
"transactionid" : "10383117.2216238756",
"accountid" : "10383117.10921962",
"misctransactiondata" : null,
"rawtransactiondata" : "{\"id\":\"1234567\",\"account_id\":\"456451962\"}",
"source" : "gateway",
"transactiondatajson" : "{\"version\":\"v1\",\"transactionId\":\"4234234.2216238756\",\"accountId\":\"345345345.10921962\"}",
"version" : "v1"
}
]
which I'd like to transform into,
{
"transactions": [
{
"version": "v1",
"transactionId": "4234234.2216238756",
"accountId": "345345345.10921962",
"rawData": {
"id": "1234567",
"account_id": "456451962"
}
}
]
}
Currently I have,
jq '{transactions: [.[0] | (.transactiondatajson|fromjson) ]}'
which creates the transactions array of objects. However, I'm not entirely sure how to create the nested rawData object from .rawtransactiondata.
How can I best add that object with jq?
One of many possibilities:
.[]
| {transactions:
[(.transactiondatajson|fromjson)
+ {rawData: (.rawtransactiondata|fromjson)} ] }
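For readers who want to check the logic outside jq, here is a minimal Python sketch of the same idea: decode both embedded JSON strings and attach the second under rawData. The function name to_transactions is hypothetical. Unlike the jq filter above, which emits one object per input record, this sketch collects every record into a single transactions array.

```python
import json


def to_transactions(records):
    """Decode the embedded JSON strings and nest the raw data under rawData."""
    transactions = []
    for record in records:
        # Equivalent of (.transactiondatajson | fromjson)
        transaction = json.loads(record["transactiondatajson"])
        # Equivalent of + {rawData: (.rawtransactiondata | fromjson)}
        transaction["rawData"] = json.loads(record["rawtransactiondata"])
        transactions.append(transaction)
    return {"transactions": transactions}


records = [
    {
        "rawtransactiondata": "{\"id\":\"1234567\",\"account_id\":\"456451962\"}",
        "transactiondatajson": "{\"version\":\"v1\",\"transactionId\":\"4234234.2216238756\",\"accountId\":\"345345345.10921962\"}",
    }
]

result = to_transactions(records)
print(json.dumps(result, indent=2))
```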

Parse and Map 2 Arrays with jq

I am working with a JSON file similar to the one below:
{ "Response" : {
"TimeUnit" : [ 1576126800000 ],
"metaData" : {
"errors" : [ ],
"notices" : [ "query served by:1"]
},
"stats" : {
"data" : [ {
"identifier" : {
"names" : [ "apiproxy", "response_status_code", "target_response_code", "target_ip" ],
"values" : [ "IO", "502", "502", "7.1.143.6" ]
},
"metric" : [ {
"env" : "dev",
"name" : "sum(message_count)",
"values" : [ 0.0]
} ]
} ]
} } }
My objective is to display a mapping of the identifier names to their values, like:
apiproxy=IO
response_status_code=502
target_response_code=502
target_ip=7.1.143.6
I have been able to parse both names and values with
.[].stats.data[] | (.identifier.names[]) and .[].stats.data[] | (.identifier.values[])
but I need help with the jq way to map the values.
The whole thing can be done in jq using the -r command-line option:
.[].stats.data[]
| [.identifier.names, .identifier.values]
| transpose[]
| "\(.[0])=\(.[1])"
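The pairing step is the heart of this answer: transpose lines up the i-th name with the i-th value. A small Python sketch of the same pairing, using zip, is shown below; the function name identifier_pairs is hypothetical. One difference worth knowing: jq's transpose pads the shorter array with null, whereas zip stops at the shorter one.

```python
def identifier_pairs(document):
    """Return "name=value" strings for every identifier block in the response."""
    lines = []
    for entry in document["Response"]["stats"]["data"]:
        identifier = entry["identifier"]
        # Pair the i-th name with the i-th value, like transpose in jq.
        for name, value in zip(identifier["names"], identifier["values"]):
            lines.append(f"{name}={value}")
    return lines


document = {
    "Response": {
        "stats": {
            "data": [
                {
                    "identifier": {
                        "names": ["apiproxy", "response_status_code", "target_response_code", "target_ip"],
                        "values": ["IO", "502", "502", "7.1.143.6"],
                    }
                }
            ]
        }
    }
}

pairs = identifier_pairs(document)
print("\n".join(pairs))
```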

Replacing specific fields in JSON from text file

I have a JSON structure and would like to replace the values of 2 fields with values that are in a separate text file.
Here is the json file with 2 records:
{
"events" : {
"-KKQQIUR7FAVxBOPOFhr" : {
"dateAdded" : 1487592568926,
"owner" : "62e6aaa0-a50c-4448-a381-f02efde2316d",
"type" : "boycott"
},
"-KKjjM-pAXvTuEjDjoj_" : {
"dateAdded" : 1487933370561,
"owner" : "62e6aaa0-a50c-4448-a381-f02efde2316d",
"type" : "boycott"
}
},
"geo" : {
"-KKQQIUR7FAVxBOPOFhr" : {
".priority" : "qw3yttz1k9",
"g" : "qw3yttz1k9",
"l" : [ 40.762632, -73.973837 ]
},
"-KKjjM-pAXvTuEjDjoj_" : {
".priority" : "qw3yttx6bv",
"g" : "qw3yttx6bv",
"l" : [ 41.889019, -87.626291 ]
}
},
"log" : "null",
"users" : {
"62e6aaa0-a50c-4448-a381-f02efde2316d" : {
"events" : {
"-KKQQIUR7FAVxBOPOFhr" : {
"type" : "boycott"
},
"-KKjjM-pAXvTuEjDjoj_" : {
"type" : "boycott"
}
}
}
}
}
And here is the text file that I want to substitute in:
49.287130, -123.124026
36.129770, -115.172811
There are lots more records but I kept this to 2 for brevity.
Any help would be appreciated. Thank you.
The problem description seems to assume that the ordering of the key-value pairs within a JSON object is fixed. Different JSON-oriented tools (and indeed different versions of jq) have different takes on this. In any case, the following assumes a version of jq that respects the ordering (e.g. jq 1.5); it also assumes that inputs is available, though that is inessential.
The key to the following solution is the helper function, map_nth_value/2, which modifies the value of the nth key in a JSON object:
def map_nth_value(n; filter):
to_entries
| (.[n] |= {"key": .key, "value": (.value | filter)} )
| from_entries ;
[inputs | select(length > 0) | split(",") | map(tonumber)] as $lists
| reduce range(0; $lists|length) as $i
( $object;
.geo |= map_nth_value($i; .l = $lists[$i] ) )
With the above jq program in a file (say program.jq), and with the text file in a file (say input.txt) and the JSON object in a file (say object.json), the following invocation:
jq -R -n --argfile object object.json -f program.jq input.txt
produces:
{
"events": {
"-KKQQIUR7FAVxBOPOFhr": {
"dateAdded": 1487592568926,
"owner": "62e6aaa0-a50c-4448-a381-f02efde2316d",
"type": "boycott"
},
"-KKjjM-pAXvTuEjDjoj_": {
"dateAdded": 1487933370561,
"owner": "62e6aaa0-a50c-4448-a381-f02efde2316d",
"type": "boycott"
}
},
"geo": {
"-KKQQIUR7FAVxBOPOFhr": {
".priority": "qw3yttz1k9",
"g": "qw3yttz1k9",
"l": [
49.28713,
-123.124026
]
},
"-KKjjM-pAXvTuEjDjoj_": {
".priority": "qw3yttx6bv",
"g": "qw3yttx6bv",
"l": [
36.12977,
-115.172811
]
}
},
"log": "null",
"users": {
"62e6aaa0-a50c-4448-a381-f02efde2316d": {
"events": {
"-KKQQIUR7FAVxBOPOFhr": {
"type": "boycott"
},
"-KKjjM-pAXvTuEjDjoj_": {
"type": "boycott"
}
}
}
}
}
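The same order-based replacement can be sketched in Python, which may help when checking the result. This is only an illustration under the same assumption as the jq answer, namely that the n-th line of the text file belongs to the n-th key under geo; Python dictionaries keep insertion order, and json.load preserves the order found in the file. The function name replace_coordinates is hypothetical.

```python
def replace_coordinates(obj, text):
    """Overwrite each geo entry's "l" with the coordinate pair on the matching line."""
    pairs = [
        [float(part) for part in line.split(",")]
        for line in text.splitlines()
        if line.strip()  # skip blank lines, like select(length > 0)
    ]
    # The n-th non-empty line replaces "l" of the n-th key under "geo".
    for entry, pair in zip(obj["geo"].values(), pairs):
        entry["l"] = pair
    return obj


data = {
    "geo": {
        "-KKQQIUR7FAVxBOPOFhr": {"g": "qw3yttz1k9", "l": [40.762632, -73.973837]},
        "-KKjjM-pAXvTuEjDjoj_": {"g": "qw3yttx6bv", "l": [41.889019, -87.626291]},
    }
}
coordinates = "49.287130, -123.124026\n36.129770, -115.172811\n"

updated = replace_coordinates(data, coordinates)
print(updated["geo"])
```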

Elasticsearch - completion suggester payload transforming, returns invalid JSON

I am trying to use the elasticsearch completion suggester. I have app_user objects, which come into my elasticsearch instance via a couchdb river.
This is the mapping I use:
{
"app_user" : {
"_all" : {"enabled" : true},
"_source" : {
"includes" : [
"_id",
"_rev",
"type",
"profile.callname",
"profile.fullname",
"email"
]
},
"properties" : {
"suggest" : { "type" : "completion",
"index_analyzer" : "simple",
"search_analyzer" : "simple",
"payloads" : true
}
},
"transform" : [
{"script": "ctx._source.suggest = ['input':[ctx._source.email, ctx._source.profile.fullname, ctx._source.profile.callname]]"},
{"script": "ctx._source.suggest.payload = ['_id': ctx._source['_id'], 'type': ctx._source['type'], '_rev': ctx._source['_rev']]"}
,
{"script": "ctx._source.suggest.payload << ['label': ctx._source.profile.fullname, 'text': ctx._source.email]"}
]
}
}
So I am trying to include the object ID and a display text in the payload.
When I view the generated document via http://localhost:9200/myindex/app_user/<someid>?pretty&_source_transform, everything seems OK:
{
"_index": "myindex",
"_type": "app_user",
"_id": "<someid>",
"found": true,
"_source": {
"_rev": "2-dcd7b9d456e205d3e9d859fdc2c6a688",
"_id": "<someid>",
"email": "joni#example.org",
"suggest": {
"input": [
"joni#example.org",
...
],
"output": "joni surname - joni#example.org",
"payload": {
"_id": "<someid>",
"type": "app_user",
"_rev": "2-dcd7b9d456e205d3e9d859fdc2c6a688",
"label": "joni surname",
"text": "joni#example.org"
}
},
"type": "app_user",
"profile": {
"callname": "",
"fullname": "joni surname"
}
}
}
However, when I try to get the document via _suggest, elasticsearch API somehow breaks the JSON object payload:
curl -XGET "http://localhost:9200/myindex/_suggest" -d '{
"all-suggest": {
"text": "joni",
"completion": {
"field": "suggest"
}
}
}'
results in
{"_shards":{"total":5,"successful":5,"failed":0},"all-suggest":[{"text":"joni","offset":0,"length":5,"options":[{"text":"joni surname - joni#example.org","score":1.0,"payload"::)
�_id`<someid>�typeIapp_user�_reva2-dcd7b9d456e205d3e9d859fdc2c6a688�label�joni surname�textSjoni#example.org�}]}]}
which is definitely not valid JSON. Any ideas?
This is actually a bug in elasticsearch. It was reported and acknowledged here and should be fixed shortly.