Elasticsearch exact phrase match on JSON

I am working on exact phrase matching against a JSON field using Elasticsearch. I have tried multiple query types, such as multi_match, query_string and simple_query_string, but they do not return results that exactly match the given phrase.
The query_string syntax I am using:
{
"query":{
"query_string":{
"fields":[
"json.*"
],
"query":"\"legal advisor\"",
"default_operator":"OR"
}
}
}
I also tried filter instead of query, but the filter does not return any results on json. The syntax I used for the filter is:
{
"query": {
"bool": {
"filter": {
"match": {
"json": "legal advisor"
}
}
}
}
}
Now the question is:
Is it possible to perform exact match operation on json using elasticsearch?

You can try using a multi_match query with type phrase:
{
"query": {
"multi_match": {
"query": "legal advisor",
"fields": [
"json.*"
],
"type": "phrase"
}
}
}

Since you have not provided your sample docs and expected results, I am assuming you are looking for a phrase match. Adding a working sample.
Index sample docs, which will also generate the index mapping:
{
"title" : "legal advisor"
}
{
"title" : "legal expert advisor"
}
Now, if you are looking for an exact phrase search for legal advisor, use the query below:
{
"query": {
"match_phrase": {
"title": "legal advisor"
}
}
}
This returns only the first doc:
"hits": [
{
"_index": "64989158",
"_type": "_doc",
"_id": "1",
"_score": 0.5753642,
"_source": {
"title": "legal advisor"
}
}
]
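To see why match_phrase returns only the first document, it can help to model what a phrase query checks: every query term must occur, in the same order, at consecutive positions. The sketch below is a simplified Python model of that rule, not Elasticsearch's implementation. The whitespace tokenizer and the helper names tokenize and phrase_matches are assumptions made for illustration.

```python
def tokenize(text):
    """Very rough stand-in for an analyzer: lowercase and split on whitespace."""
    return text.lower().split()


def phrase_matches(phrase, field_value):
    """Return True if the phrase terms appear consecutively and in order (no slop)."""
    terms = tokenize(phrase)
    tokens = tokenize(field_value)
    if not terms:
        return False
    last_start = len(tokens) - len(terms)
    # Slide a window of len(terms) over the field tokens and compare.
    return any(tokens[i:i + len(terms)] == terms for i in range(last_start + 1))


# The two sample docs from above (hypothetical ids "1" and "2").
docs = {"1": "legal advisor", "2": "legal expert advisor"}
hits = [doc_id for doc_id, title in docs.items() if phrase_matches("legal advisor", title)]
print(hits)  # ['1']
```

The second doc contains both words, but "expert" sits between them, so the consecutive-position check fails.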

Related

How to check in elasticsearch if a JSON object has a key using the DSL?

If I have two documents within an index of the following format, I just want to weed out the ones which have an empty JSON instead of my expected key.
A
{
"search": {
"gold": [1,2,3,4]
}
}
B
{
"search":{}
}
I should get only document A, not document B.
I've tried the exists query to search for "gold", but it just checks for non-null values and returns the list.
Note: The following doesn't do what I want.
GET test/_search
{
"query": {
"bool": {
"must": [
{
"exists": { "field": "search.gold" }}
]
}
}
}
This is a simple question but I'm unable to find a way to do it even after searching through their docs.
If someone can help me do this it would be really great.
The simplified mapping of the index is:
"test": {
"mappings": {
"carts": {
"dynamic": "true",
"_all": {
"enabled": false
},
"properties": {
"line_items": {
"properties": {
"line_items_dyn_arr": {
"type": "nested",
"properties": {
"dynamic_key": {
"type": "keyword"
}
}
}
}
}
}
}
}
}
Are you storing complete json in search field?
If this is not the case then please share the mapping of your index and sample data.
Update: Query for nested field:
{
"query": {
"nested": {
"path": "search",
"query": {
"bool": {
"must": [
{
"exists": {
"field": "search.gold"
}
}
]
}
}
}
}
}
For nested fields we need to specify the path and the query to be executed on the nested fields, since nested objects are indexed as separate hidden documents.
Elastic documentation: Nested Query
UPDATE based on the mapping added in question asked:
{
"query": {
"nested": {
"path": "line_items.line_items_dyn_arr",
"query": {
"exists": {
"field": "line_items.line_items_dyn_arr"
}
}
}
}
}
Notice that we used "path": "line_items.line_items_dyn_arr". We have to provide the full path because the nested field line_items_dyn_arr is itself under the line_items object. Had line_items_dyn_arr been a top-level property of the mapping rather than a property of an object or nested field, the previous query would have worked fine.
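As a rough illustration of the full-path rule, the Python sketch below builds the request body from two dotted paths. The helper name nested_exists_query is hypothetical, and testing the dynamic_key sub-field (taken from the mapping in the question) rather than the nested field itself is an assumption about what should be checked.

```python
import json


def nested_exists_query(path, field):
    """Build a nested query wrapping an exists query.

    Both arguments are full dotted paths starting at the mapping root.
    """
    return {
        "query": {
            "nested": {
                "path": path,
                "query": {"exists": {"field": field}},
            }
        }
    }


body = nested_exists_query(
    "line_items.line_items_dyn_arr",
    "line_items.line_items_dyn_arr.dynamic_key",  # assumed field to test
)
print(json.dumps(body, indent=2))
```

The printed JSON can be sent as the body of a _search request.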
Nishant's answer is right, but for some reason I could get it working only when both the path and the field are given as full paths.
The following works for me.
{
"nested": {
"path": "search.gold",
"query": {
"exists": {
"field": "search.gold"
}
}
}
}

How to perform partial matching on _id in Elasticsearch

I am trying to perform partial word matching on the _id field in my Elasticsearch instance.
After searching the official documentation, I found that the best way to do this is to create an n-gram analyzer, so using Sense I did this:
PUT /index2
{"settings": {
"number_of_shards": 1,
"analysis": {
"filter": {
"partial_filter": {
"type": "ngram",
"min_gram": 2,
"max_gram": 20
}
},
"analyzer": {
"partial": {
"type": "custom",
"tokenizer": "standard",
"filter": [
"lowercase",
"partial_filter"
]
}
}
}
}}
I tested the analyzer using:
POST /index2/_analyze
{
"analyzer": "partial",
"text": "brown fox"
}
And it worked as expected producing proper combinations.
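For readers who want to see which combinations such an analyzer produces, here is a small Python model of it. It is only an approximation: whitespace splitting stands in for the standard tokenizer, the function names ngrams and partial_analyze are made up for this sketch, and the order of the output is not meant to mirror what _analyze returns.

```python
def ngrams(token, min_gram=2, max_gram=20):
    """Return every substring of token whose length is between min_gram and max_gram."""
    grams = []
    for start in range(len(token)):
        for size in range(min_gram, max_gram + 1):
            end = start + size
            if end > len(token):
                break  # longer sizes cannot fit from this start position
            grams.append(token[start:end])
    return grams


def partial_analyze(text):
    """Rough model of the 'partial' analyzer: split, lowercase, then n-gram each token."""
    grams = []
    for token in text.lower().split():
        grams.extend(ngrams(token))
    return grams


print(partial_analyze("brown fox"))
```

With min_gram 2, "brown" contributes 10 grams (from "br" up to "brown") and "fox" contributes 3 ("fo", "fox", "ox").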
The next step should be to apply the analyzer to the relevant fields, so I tried to do this:
PUT /index2/_mapping/type2
{
"type2": {
"properties": {
"_id": {
"type": "string",
"analyzer": "partial"
}
}
}
}
But I am getting an error:
"reason": "Field [_id] is defined twice in [type2]"
This is probably because the _id field gets created during the creation of index2, along with the analyzer.
So my question is how can I use the partial search on the _id field?
Is there any other way to do this?
Thanks in advance!

How to perform AND condition in elasticsearch query?

I have the following query, where I want to search indexname for the ID "abc_12-def" and only for documents that fall within the date range specified in the range filter.
But the query below also fetches documents with different IDs (e.g. abc_12-edf, abc_12-pgf) and documents that fall outside the date range. Any advice on how I can express an AND condition here? Thanks.
curl -XPOST 'localhost:9200/indexname/status/_search?pretty=1&size=1000000' -d '{
"query": {
"filtered" : {
"filter": [
{ "term": { "ID": "abc_12-def" }},
{ "range": { "Date": { "gte": "2015-10-01T09:12:11", "lte" : "2015-11-18T10:10:13" }}}
]
}
}
}'
You need to use a bool query for an AND (i.e. must) condition:
{
"query": {
"bool": {
"must": [
{
"term": {
"ID": "abc_12-def"
}
},
{
"range": {
"Date": {
"gte": "2015-10-01T09:12:11",
"lte": "2015-11-18T10:10:13"
}
}
}
]
}
}
}
Also, string fields are analyzed with the standard analyzer by default, which means abc_12-def is tokenized as [abc_12, def]. The term query does not analyze its input, so it looks up the literal term abc_12-def, which is not one of the indexed tokens.
If you are looking for an exact match, you should mark the field as not_analyzed. How to map it as not_analyzed is explained here.
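The mismatch between analyzed tokens and an unanalyzed term can be sketched in a few lines of Python. The regex below is only a crude approximation of the standard analyzer (lowercase, split on non-word characters), and the function name rough_standard_tokens is hypothetical; it happens to reproduce the tokens quoted above for this particular ID.

```python
import re


def rough_standard_tokens(value):
    """Crude approximation of the standard analyzer: lowercase, split on non-word characters."""
    return re.findall(r"\w+", value.lower())


indexed_tokens = rough_standard_tokens("abc_12-def")
print(indexed_tokens)  # ['abc_12', 'def']

# A term query looks up its input unchanged, so the full ID is not among the indexed tokens.
print("abc_12-def" in indexed_tokens)  # False

# With a not_analyzed field the whole value is indexed as a single token, so the lookup succeeds.
not_analyzed_tokens = ["abc_12-def"]
print("abc_12-def" in not_analyzed_tokens)  # True
```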

ElasticSearch - Combining query match with wildcard

I'm fairly new to ElasticSearch still, but I'm currently trying to wrap my head around why I am not able to mix a wildcard query with a match as well.
Take this JSON body for example
{
"size":"10",
"from":0,
"index":"example",
"type":"logs",
"body":{
"query":{
"match":{
"account":"1234"
},
"wildcard":{
"_all":"*test*"
}
},
"sort":{
"timestamp":{
"order":"desc"
}
}
}
}
It returns with the error "SearchPhaseExecutionException[Failed to execute phase [query], all shards failed;"
(Full dump: http://pastebin.com/uJJZm8fQ)
However, if I remove either the wildcard or match key from the request body - it returns results as expected.
I've been going through the documentation and I'm not really able to find any relevant content at all.
At first I thought it was to do with the _all parameter, but even if I explicitly specify a key, the same result occurs.
Before I assume that I should be using the 'bool' operator, or something alike to mix my query types, is there any explanation for this?
The exception says that it does not understand the field "index". When querying Elasticsearch you include the index name and type in the URL. There is no wildcard search in a match query. There is a wildcard search in the query_string query.
Your query should be something like this with match:
POST /example/logs/_search
{
"size": 10,
"from": 0,
"query" : {
"match": {
"account": "1234"
}
},
"sort": {
"timestamp" : {
"order": "desc"
}
}
}
Or something like this with query_string:
POST /example/logs/_search
{
"size": 10,
"from": 0,
"query" : {
"query_string": {
"default_field": "account",
"query": "*1234*"
}
},
"sort": {
"timestamp" : {
"order": "desc"
}
}
}
EDIT: Adding an example of a wildcard query:
POST /example/logs/_search
{
"size": 10,
"from": 0,
"query" : {
"wildcard": {
"_all": "*test*"
}
},
"sort": {
"timestamp" : {
"order": "desc"
}
}
}
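On the original question of mixing the two: the "query" object takes a single top-level clause, so match and wildcard cannot sit side by side as in the question. One common way to require both is to wrap them in a bool query. The Python sketch below only builds and prints such a request body; it assumes the same field names as the question (account, _all, timestamp) and has not been run against the asker's index.

```python
import json

# Request body for POST /example/logs/_search (index and type taken from the question).
body = {
    "size": 10,
    "from": 0,
    "query": {
        "bool": {
            # Both clauses must match, i.e. a logical AND.
            "must": [
                {"match": {"account": "1234"}},
                {"wildcard": {"_all": "*test*"}},
            ]
        }
    },
    "sort": {"timestamp": {"order": "desc"}},
}

print(json.dumps(body, indent=2))
```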

ElasticSearch not returning expected results

I am attempting to run an ElasticSearch search using the following query. Please pardon my ignorance, as I'm new to ES, and I've sorta cobbled this together by trial and error trying to follow the documentation. Basically, the only parts that are working as expected are the from, size, sort, and the match on severity. Thank you in advance for the assist!
{
"from":0,
"size":50,
"sort":{"timestamp":{"order":"desc"}},
"query":[
{
"range":{
"timestamp":{"gte":"2013-11-18T05:00:00+00:00","lte":"2013-12-02T05:00:00+00:00"}
}
},
{
"query":{
"match":{"severity":{"query":"medium","operator":"or"}}
}
},
{
"query":{
"constantScore":{
"filter":{
"query":{
"query_string":{"default_field":"_all","query":"10.1.10.22"}
}
}
}
}
}
]
}
I think you need to read more about Query DSL. Here's the correct query based on your input:
{
"query": {
"query_string": {
"default_field": "_all",
"query": "10.1.10.22"
}
},
"filter": {
"bool": {
"must": [
{
"range": {
"timestamp": {
"gte": "2013-11-18T05:00:00+00:00",
"lte": "2013-12-02T05:00:00+00:00"
}
}
},
{
"term": {
"severity": "medium"
}
}
]
}
}
}
The above query can be explained as:
- First filter the data with the bool filter. "must" here can be understood as "AND", so the data is filtered by "timestamp in range..." AND "severity=medium".
- Then search the filtered data using "query_string".
Filtering this way should make your search much faster.
In any case, your query is not formatted correctly. If you want to combine multiple queries you can use the bool query. See the docs: http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl-bool-query.html
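As a hedged sketch of what a bool-based version of the asker's request could look like, the Python snippet below puts the full-text part under must and the two exact conditions under filter. It assumes an Elasticsearch version in which the bool query accepts a filter clause (2.0 and later) and in which the _all field is still available; the values are copied from the question.

```python
import json

body = {
    "from": 0,
    "size": 50,
    "sort": {"timestamp": {"order": "desc"}},
    "query": {
        "bool": {
            # Scored full-text part of the search.
            "must": [
                {"query_string": {"default_field": "_all", "query": "10.1.10.22"}}
            ],
            # Non-scoring conditions, combined with AND.
            "filter": [
                {
                    "range": {
                        "timestamp": {
                            "gte": "2013-11-18T05:00:00+00:00",
                            "lte": "2013-12-02T05:00:00+00:00",
                        }
                    }
                },
                {"term": {"severity": "medium"}},
            ],
        }
    },
}

print(json.dumps(body, indent=2))
```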