Conversion from sql to elastic search query - mysql

I want to convert the following SQL query to an Elasticsearch JSON query:
select count(distinct(fk_id)),city_id from table
where status1 != "xyz" and status2 = "abc" and
cr_date >="date1" and cr_date<="date2" group by city_id
Also, is there any way to write nested queries (subqueries) in Elasticsearch? For example:
select * from table where status in (select status from table2)

The first query can be translated like this in the Elasticsearch query DSL:
curl -XPOST localhost:9200/table/_search -d '{
"size": 0,
"query": {
"filtered": {
"filter": {
"bool": {
"must": [
{
"term": {
"status2": "abc"
}
},
{
"range": {
"cr_date": {
"gte": "date1", <--- don't forget to change the date
"lte": "date2" <--- don't forget to change the date
}
}
}
],
"must_not": [
{
"term": {
"status1": "xyz"
}
}
]
}
}
}
},
"aggs": {
"by_cities": {
"terms": {
"field": "city_id"
},
"aggs": {
"fk_count": {
"cardinality": {
"field": "fk_id"
}
}
}
}
}
}'
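
Note that the filtered query used above was deprecated in Elasticsearch 2.0 and removed in 5.0. On newer versions the same conditions would go into the filter clause of a bool query. Below is a minimal Python sketch that builds such a request body, assuming the field names from the question; the function name build_city_count_query and the sample dates are made up for illustration.

```python
import json


def build_city_count_query(date_from, date_to):
    """Build a search body roughly equivalent to the SQL in the question.

    Hypothetical helper: the field names come from the question, while the
    function name and the date arguments are illustrative only.
    """
    return {
        "size": 0,  # only the aggregation buckets are needed, not the hits
        "query": {
            "bool": {
                # filter clauses do not contribute to the score
                "filter": [
                    {"term": {"status2": "abc"}},
                    # gte/lte are inclusive, matching >= and <= in the SQL
                    {"range": {"cr_date": {"gte": date_from, "lte": date_to}}},
                ],
                "must_not": [{"term": {"status1": "xyz"}}],
            }
        },
        "aggs": {
            "by_cities": {  # GROUP BY city_id
                "terms": {"field": "city_id"},
                "aggs": {
                    # COUNT(DISTINCT fk_id); cardinality is an approximate count
                    "fk_count": {"cardinality": {"field": "fk_id"}}
                },
            }
        },
    }


body = build_city_count_query("2015-01-01", "2015-12-31")
print(json.dumps(body, indent=2))
```

Keep in mind that the terms aggregation returns only the top 10 buckets by default, so a "size" may be needed inside "terms" if there are more cities than that.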

With the SQL API in Elasticsearch, we can write queries in SQL and also translate them into Elasticsearch query DSL:
POST /_sql/translate
{
"query": "SELECT * FROM customer where address.Street='JanaChaitanya Layout' and Name='Pavan Kumar'"
}
The response is:
{
"size" : 1000,
"query" : {
"bool" : {
"must" : [
{
"term" : {
"address.Street.keyword" : {
"value" : "JanaChaitanya Layout",
"boost" : 1.0
}
}
},
{
"term" : {
"Name.keyword" : {
"value" : "Pavan Kumar",
"boost" : 1.0
}
}
}
],
"adjust_pure_negative" : true,
"boost" : 1.0
}
},
"_source" : {
"includes" : [
"Name",
"address.Area",
"address.Street"
],
"excludes" : [ ]
},
"docvalue_fields" : [
{
"field" : "Age"
}
],
"sort" : [
{
"_doc" : {
"order" : "asc"
}
}
]
}
We can now use this result as the body of a search request against Elasticsearch.
For further details, see this article:
https://xyzcoder.github.io/elasticsearch/2019/06/25/making-use-of-sql-rest-api-in-elastic-search-to-write-queries-easily.html
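
As a rough sketch, the translate call could be prepared from Python using only the standard library. The host http://localhost:9200 and the helper name build_translate_request are assumptions for illustration; the request is only built here, not sent.

```python
import json
import urllib.request


def build_translate_request(sql, base_url="http://localhost:9200"):
    """Prepare a POST to the SQL translate endpoint (hypothetical helper).

    base_url is an assumption; point it at your own cluster.
    """
    payload = json.dumps({"query": sql}).encode("utf-8")
    return urllib.request.Request(
        base_url + "/_sql/translate",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_translate_request(
    "SELECT * FROM customer WHERE Name='Pavan Kumar'"
)

# Sending it requires a running cluster, so it is left commented out:
# with urllib.request.urlopen(request) as response:
#     dsl = json.load(response)
```

The decoded response would be the query DSL document shown above, which can then be posted to the index's _search endpoint.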

Elastic Search: Multiple term search in one query

I'm new to Elasticsearch.
I have documents in an index named myindex with the following mapping:
http://host:port/myindex/_mapping
{
"mappings":{
"properties": {
"en_US": {
"type": "keyword"
}
}
}
}
Let's say my 3 documents look like this:
{
"product": "p1",
"subproduct": "p1.1"
}
{
"product": "p1",
"subproduct": "p1.2"
}
{
"product": "p2",
"subproduct": "p2.1"
}
Right now I query for a single subproduct, p1.1, under product p1 as shown below, and it works fine:
POST: http://host:port/myindex/_search
{
"query": {
"bool" : {
"must" : {
"term" : { "product" : "p1" }
},
"filter": {
"term" : { "subproduct" : "p1.1" }
}
}
}
}
My question is:
How can I query for two or more subproducts in one _search query, for example subproducts p1.1 and p1.2 under product p1?
The query should return all documents with subproduct p1.1 or p1.2 and product p1.
Simply change the term-query in your filter-clause to a terms-query and search for multiple terms.
{
"query": {
"bool" : {
"must" : {
"term" : { "product" : "p1" }
},
"filter": {
"terms" : { "subproduct" : ["p1.1", "p1.2"] }
}
}
}
}
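
To illustrate what the terms filter does, here is a small Python sketch that builds the query above and checks it against the three sample documents with a rough local imitation of exact-value matching. The helpers build_subproduct_query and matches are hypothetical and only cover this one query shape; real matching happens inside Elasticsearch.

```python
def build_subproduct_query(product, subproducts):
    """Build the bool query from the answer for any list of subproducts."""
    return {
        "query": {
            "bool": {
                "must": {"term": {"product": product}},
                "filter": {"terms": {"subproduct": list(subproducts)}},
            }
        }
    }


def matches(doc, query):
    """Local stand-in for exact-value term/terms matching (illustrative)."""
    bool_query = query["query"]["bool"]
    (term_field, term_value), = bool_query["must"]["term"].items()
    (terms_field, terms_values), = bool_query["filter"]["terms"].items()
    # terms matches when the field equals ANY of the listed values
    return doc.get(term_field) == term_value and doc.get(terms_field) in terms_values


docs = [
    {"product": "p1", "subproduct": "p1.1"},
    {"product": "p1", "subproduct": "p1.2"},
    {"product": "p2", "subproduct": "p2.1"},
]

query = build_subproduct_query("p1", ["p1.1", "p1.2"])
found = [d["subproduct"] for d in docs if matches(d, query)]
print(found)
```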

How to write Elasticsearch multiple must scripts query?

I want to use a query to compare multiple fields. I have fields 1 to 4. I want to search for data where field 1 is greater than field 2, and the query below works perfectly:
{
"size": 0,
"_source": [
"field1",
"field2",
"field3",
"field4"
],
"sort": [],
"query": {
"bool": {
"filter": [],
"must": {
"script": {
"script": {
"inline": "doc['field1'].value > doc['field2'].value;",
"lang": "painless"
}
}
}
}
}
}
Now I want to search for data where field 1 is greater than field 2 and field 3 is also greater than field 4. According to "Elastic Search: How to write multi statement scripts?" and this link, I just need to separate the statements with a semicolon, so it should look like this:
{
"size": 0,
"_source": [
"field1",
"field2",
"field3",
"field4"
],
"sort": [],
"query": {
"bool": {
"filter": [],
"must": {
"script": {
"script": {
"inline": "doc['field1'].value > doc['field2'].value; doc['field3'].value > doc['field4'].value;",
"lang": "painless"
}
}
}
}
}
}
But that query doesn't work and returns a compile error like this:
{"root_cause":[{"type":"script_exception","reason":"compile
error","script_stack":["doc['field1'].value > doc[' ...","^----
HERE"],"script":"doc['field1'].value > doc['field2'].value;
doc['field1'].value > doc['field2'].value;
","lang":"painless"}],"type":"search_phase_execution_exception","reason":"all
shards
failed","phase":"query","grouped":true,"failed_shards":[{"shard":0,"index":"financials","node":"8SXaM2HcStelpLHvTDSMCQ","reason":{"type":"query_shard_exception","reason":"failed
to create query: {\n \"bool\" : {\n \"must\" : [\n {\n \"script\" :
{\n \"script\" : {\n \"source\" : \"doc['field1'].value >
doc['field2'].value; doc['field1'].value > doc['field2'].value; \",\n
\"lang\" : \"painless\"\n },\n \"boost\" : 1.0\n }\n }\n ],\n
\"adjust_pure_negative\" : true,\n \"boost\" : 1.0\n
}\n}","index_uuid":"hz12cHg1SkGwq712n6BUIA","index":"financials","caused_by":{"type":"script_exception","reason":"compile
error","script_stack":["doc['field1'].value > doc[' ...","^----
HERE"],"script":"doc['field1'].value > doc['field2'].value;
doc['field1'].value > doc['field2'].value;
","lang":"painless","caused_by":{"type":"illegal_argument_exception","reason":"Not
a statement."}}}}]}
You need to combine your two conditions like this:
doc['field1'].value > doc['field2'].value && doc['field3'].value > doc['field4'].value
^
|
replace the semicolon by &&
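
A minimal Python sketch of this, assuming the conditions are plain Painless boolean expressions: join them with && into one script source. The helper name build_script_query is hypothetical. It uses the "source" key, which is the name that appears in the error output above; the question's queries use "inline".

```python
def build_script_query(conditions):
    """Combine several boolean Painless expressions into one script query.

    Hypothetical helper: each condition must be a single boolean expression.
    """
    # A script query must evaluate to ONE boolean, so the expressions are
    # joined with && instead of being separated by semicolons.
    source = " && ".join(conditions)
    return {
        "query": {
            "bool": {
                "must": {
                    "script": {
                        "script": {"source": source, "lang": "painless"}
                    }
                }
            }
        }
    }


query = build_script_query([
    "doc['field1'].value > doc['field2'].value",
    "doc['field3'].value > doc['field4'].value",
])
print(query["query"]["bool"]["must"]["script"]["script"]["source"])
```

If any single condition contained ||, it would need to be wrapped in parentheses before joining.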
To use more than one condition, 'must', 'should' and 'must_not' can also be given as arrays, with each condition as one element, as 'should' is in this example from the Elasticsearch documentation:
"query": {
"bool" : {
"must" : {
"term" : { "user" : "kimchy" }
},
"filter": {
"term" : { "tag" : "tech" }
},
"must_not" : {
"range" : {
"age" : { "gte" : 10, "lte" : 20 }
}
},
"should" : [
{ "term" : { "tag" : "wow" } },
{ "term" : { "tag" : "elasticsearch" } },
{ "term" : { "tag" : "and so on" } }
],
"minimum_should_match" : 1,
"boost" : 1.0
}
}
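
Applied to the question, this could look roughly like the Python sketch below: each comparison becomes its own script clause, and both go into a must array. The helper script_clause is hypothetical, and the "source" key is assumed (the question's queries use "inline").

```python
def script_clause(left, right):
    """One script clause checking that doc[left] > doc[right] (illustrative)."""
    return {
        "script": {
            "script": {
                "source": f"doc['{left}'].value > doc['{right}'].value",
                "lang": "painless",
            }
        }
    }


query = {
    "query": {
        "bool": {
            # every element of the must array has to match
            "must": [
                script_clause("field1", "field2"),
                script_clause("field3", "field4"),
            ]
        }
    }
}

for clause in query["query"]["bool"]["must"]:
    print(clause["script"]["script"]["source"])
```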

Elastic query to show exact match OR other fields if not found

I need some help rewriting my Elasticsearch query.
What I need is:
1- to show a single record if there is an exact match on the two fields verb and sessionid.raw (partial matches are not accepted).
"must": [
{ "match" : { "verb" : "login" } },
{ "term" : { "sessionid.raw" : strSessionID } },
]
OR
2- to show the top 5 records (sorted by _score DESC and #timestamp ASC) that match some other fields, giving a boost if the records are between the specified time range.
"must": [
{ "match" : { "verb" : "login" } },
{ "term" : { "pid" : strPID } },
],
"should": [
{ "match" : { "user.raw" : strUser } },
{ "range" : { "#timestamp" : {
"from" : QueryFrom,
"to" : QueryTo,
"format" : DateFormatElastic,
"time_zone" : "America/Sao_Paulo",
"boost" : 2 }
} },
]
The code below almost does what I want.
Right now it boosts a sessionid.raw match to the top if one is found, but the remaining records are not discarded.
var objQueryy = {
"fields" : [ "#timestamp", "program", "pid", "sessionid.raw", "user", "frontendip", "frontendname", "_score" ],
"size" : ItemsPerPage,
"sort" : [ { "_score" : { "order": "desc" } }, { "#timestamp" : { "order" : "asc" } } ],
"query" : {
"bool": {
"must": [
{ "match" : { "verb" : "login" } },
{ "term" : { "pid" : strPID } },
{ "bool": {
"should": [
{ "match" : { "user.raw" : strUser } },
{ "match" : { "sessionid.raw": { "query": strSessionID, "boost" : 3 } } },
{ "range" : { "#timestamp" : { "from": QueryFrom, "to": QueryTo, "format": DateFormatElastic, "time_zone": "America/Sao_Paulo" } } },
],
}},
],
},
},
}
Elasticsearch cannot "prune" your secondary results for you when an exact match is also found.
You would have to implement this discarding functionality on the client side after all results had been returned.
You may find the cleanest implementation is to execute your two search strategies separately. Your search client would:
1. Run the first (exact match) query.
2. Run the second (expanded) query only if the first returned no results.
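
The two-step strategy could be sketched in Python as follows. Everything here is illustrative: run_search stands for whatever call your client uses to execute a query and return a list of hits, and the stub and the labelled placeholder queries exist only to make the example runnable without a cluster.

```python
def search_with_fallback(run_search, exact_query, expanded_query):
    """Run the exact query first; fall back to the expanded one if empty.

    run_search is a hypothetical callable: query dict in, list of hits out.
    """
    exact_hits = run_search(exact_query)
    if exact_hits:
        # an exact match wins and only a single record is shown
        return exact_hits[:1]
    # no exact match: the expanded query decides (size 5 is set in the query)
    return run_search(expanded_query)


def make_stub(results_by_label):
    """Stand-in for a real client, keyed by a made-up 'label' field."""
    def run_search(query):
        return results_by_label.get(query["label"], [])
    return run_search


exact = {"label": "exact"}        # placeholder for the verb + sessionid.raw query
expanded = {"label": "expanded"}  # placeholder for the verb + pid query

no_exact = make_stub({"exact": [], "expanded": ["hit-a", "hit-b"]})
with_exact = make_stub({"exact": ["hit-x"], "expanded": ["hit-a", "hit-b"]})

print(search_with_fallback(no_exact, exact, expanded))
print(search_with_fallback(with_exact, exact, expanded))
```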

Elastic Search aggregation enhanced filtering for nested query

I have the following objects indexed:
{ "ProjectName" : "Project 1",
"Roles" : [
{ "RoleName" : "Role 1", "AddedAt" : "2015-08-14T17:11:31" },
{ "RoleName" : "Role 2", "AddedAt" : "2015-09-14T17:11:31" } ] }
{ "ProjectName" : "Project 2",
"Roles" : [
{ "RoleName" : "Role 1", "AddedAt" : "2015-10-14T17:11:31" } ] }
{ "ProjectName" : "Project 3",
"Roles" : [
{ "RoleName" : "Role 2", "AddedAt" : "2015-11-14T17:11:31" } ] }
That is, a list of projects with different roles, added at different times.
(The Roles list is a nested field.)
What I need is an aggregation that counts how many projects exist per role, BUT only(!) if the role was added to the project within a certain period.
A classic query (without the date range filtering) looks like this (and works well):
{ // ... my main query here
"aggs" : {
"agg1" : {
"nested" : {
"path" : "Roles"
},
"aggs" : {
"agg2": {
"terms": {
"field" : "Roles.RoleName"
},
"aggs": {
"agg3":{
"reverse_nested": {}
}}}}}}
But this approach does not work for me: if I filter by dates starting from, say, '2015-09-01', both 'Role 1' and 'Role 2' are counted for the first project, because the project matches on the AddedAt date of 'Role 2' and 'Role 1' is then counted along with it.
So I think I need to add the following condition somewhere:
"range": { "Roles.AddedAt": {
"gte": "2015-09-01T00:00:00",
"lte": "2015-12-02T23:59:59"
}}
But I can not find a correct way to do that.
The results of the working query are (kind of) the following:
"aggregations": {
"agg1": {
"doc_count": 17,
"agg2": {
"buckets": [
{
"key": "Role 1",
"doc_count": 2,
"agg3": {
"doc_count": 2
}
},
{
"key": "Role 2",
"doc_count": 2,
"agg3": {
"doc_count": 2
}
},
Try this:
{
"aggs": {
"agg1": {
"nested": {
"path": "Roles"
},
"aggs": {
"NAME": {
"filter": {
"query": {
"range": {
"Roles.AddedAt": {
"gte": "2015-09-01T00:00:00",
"lte": "2015-12-02T23:59:59"
}
}
}
},
"aggs": {
"agg2": {
"terms": {
"field": "Roles.RoleName"
},
"aggs": {
"agg3": {
"reverse_nested": {}
}
}
}
}
}
}
}
}
}
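
To check what this aggregation is meant to count, here is a plain-Python imitation run against the three sample documents from the question. It is not how Elasticsearch computes the result; it only mirrors the logic (filter the nested roles by date first, then count distinct parent projects per role). The function name projects_per_role is made up.

```python
from collections import defaultdict

projects = [
    {"ProjectName": "Project 1", "Roles": [
        {"RoleName": "Role 1", "AddedAt": "2015-08-14T17:11:31"},
        {"RoleName": "Role 2", "AddedAt": "2015-09-14T17:11:31"}]},
    {"ProjectName": "Project 2", "Roles": [
        {"RoleName": "Role 1", "AddedAt": "2015-10-14T17:11:31"}]},
    {"ProjectName": "Project 3", "Roles": [
        {"RoleName": "Role 2", "AddedAt": "2015-11-14T17:11:31"}]},
]


def projects_per_role(projects, start, end):
    """Count projects per role, considering only roles added in [start, end]."""
    seen = defaultdict(set)
    for project in projects:
        for role in project["Roles"]:  # like the nested aggregation
            # ISO timestamps in the same format compare correctly as strings
            if start <= role["AddedAt"] <= end:  # like the range filter
                # like terms + reverse_nested: distinct parent projects
                seen[role["RoleName"]].add(project["ProjectName"])
    return {role: len(names) for role, names in seen.items()}


counts = projects_per_role(projects, "2015-09-01T00:00:00", "2015-12-02T23:59:59")
print(counts)
```

With the sample data, 'Role 1' of Project 1 (added in August) falls outside the range, so 'Role 1' is counted once and 'Role 2' twice.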

elasticsearch request an element in an array

I have a document indexed in my elastic search like:
{
...
purchase:{
zones: ["FR", "GB"]
...
}
...
}
I use this kind of query to find, for example, documents whose purchase zones include GB:
{
"query": {
"filtered": {
"query": {
"match_all": {}
},
"filter": {
"term": {
"purchase.zones": "GB"
}
}
}
}
}
But with it I get no results...
I would like to perform a query like PHP's in_array("GB", purchase.zones).
Any help would be appreciated.
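If your "purchase" field is of nested type, you have to use a nested query to access "zones". The example below, from the documentation, uses a field named obj1; adapt the path and field names to your mapping: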
{
"nested" : {
"path" : "obj1",
"score_mode" : "avg",
"query" : {
"bool" : {
"must" : [
{
"match" : {"obj1.name" : "blue"}
},
{
"range" : {"obj1.count" : {"gt" : 5}}
}
]
}
}
}
}
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl-nested-query.html
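
Adapted to the question, the nested query might look like the Python sketch below. This only applies if "purchase" really is mapped as nested, which the question does not confirm; the helper name build_zone_query is hypothetical.

```python
import json


def build_zone_query(zone):
    """Nested query for one zone (assumes 'purchase' is mapped as nested)."""
    return {
        "query": {
            "nested": {
                "path": "purchase",  # the nested object holding the array
                "query": {
                    # matches when ANY element of purchase.zones equals zone
                    "term": {"purchase.zones": zone}
                },
            }
        }
    }


query = build_zone_query("GB")
print(json.dumps(query, indent=2))
```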