django-elasticsearch-dsl-drf with JSONfield - json

I'm using the django-elasticsearch-dsl-drf package and I have a Postgres JSONField that I want to index. I tried using a NestedField in the document without any properties, since the JSON content is arbitrary, but I'm not able to search on that field, and I don't see anything about this in the package documentation.
Any idea how I can achieve this?
Mapping:
{
"mappings": {
"_doc": {
"properties": {
"jsondata": {
"type": "nested",
"properties": {
"timestamp": {
"type": "date"
},
"gender": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"group_id": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
}, ...
I want to search on that field, for example jsondata.gender = x.

Query for jsondata.gender = x
GET <index>/_search
{
"query": {
"nested": {
"path": "jsondata",
"query": {
"term": {
"jsondata.gender.keyword": {
"value": "x"
}
}
}
}
}
}
NOTE: The query has not been verified using Kibana Dev Tools.
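If the query is assembled in Python before being handed to a search client, the same body can be built as a plain dictionary. This is only a minimal sketch: the helper name `nested_term_query` is hypothetical, and the field names assume the `jsondata` mapping shown in the question, where each text field has a `keyword` sub-field.

```python
import json


def nested_term_query(path, field, value):
    """Build a nested term query against the keyword sub-field of `path.field`."""
    return {
        "query": {
            "nested": {
                "path": path,
                "query": {
                    "term": {
                        # Exact match on the un-analyzed keyword sub-field.
                        f"{path}.{field}.keyword": {"value": value}
                    }
                },
            }
        }
    }


body = nested_term_query("jsondata", "gender", "x")
print(json.dumps(body, indent=2))
```

The printed JSON has the same shape as the query above and can be pasted into Kibana Dev Tools to check it against the index.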

Related

must match URL address returning a lot of documents - Elasticsearch

I'm simply trying to check how many documents have the same link value. There is something weird going on.
Let's say one or more documents has this link value: https://twitter.com/someUser/status/1288024417990144000
I search for it using this JSON query:
/theIndex/_doc/_search
{
"query": {
"bool": {
"must": [
{
"match": {
"link": "https://twitter.com/someUser/status/1288024417990144000"
}
}
]
}
}
}
It returns 522 of 546 documents, with the correct one first. It acts more like a query_string than a must match.
If I search another more unique field like sha256sum:
{
"query": {
"bool": {
"must": [
{
"match": {
"sha256sum": "dad06b7a0a68a0eb879eaea6e4024ac7f97e38e6ac2b191afa7c363948270303"
}
}
]
}
}
}
It returns 1 document like it should.
I've tried a must term query as well, but it returns 0 documents.
Mapping
{
"images": {
"aliases": {},
"mappings": {
"properties": {
"sha256sum": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"link": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
}
},
"settings": {
"index": {
"number_of_shards": "1",
"provided_name": "images",
"creation_date": "1593711063075",
"analysis": {
"filter": {
"synonym": {
"ignore_case": "true",
"type": "synonym",
"synonyms_path": "synonyms.txt"
}
},
"analyzer": {
"synonym": {
"filter": [
"synonym"
],
"tokenizer": "keyword"
}
}
},
"number_of_replicas": "1",
"uuid": "a5zMwAYCQuW6U4R8POiaDw",
"version": {
"created": "7050199"
}
}
}
}
}
I wouldn't think such a simple issue would be so hard to fix. Am I just missing something right in front of my eyes?
Does anyone know what might be going on here?
According to your mapping, link is a text field, and text fields are analyzed: the URL is split into several tokens, and a match query returns every document that shares at least one of them. If you want an exact match, you need to match on the link.keyword field, and it will behave as you expect:
{
"query": {
"bool": {
"must": [
{
"match": {
"link.keyword": "https://twitter.com/someUser/status/1288024417990144000"
}
}
]
}
}
}
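The difference between the analyzed field and the keyword sub-field can be illustrated with a small Python sketch. It is not how Elasticsearch tokenizes internally: `rough_tokens` is a deliberately crude stand-in for an analyzer (lowercase, split on non-alphanumeric characters), and the second URL is a made-up example document.

```python
import re


def rough_tokens(text):
    """Crude stand-in for an analyzer: lowercase, then split on non-alphanumerics.

    The real standard analyzer tokenizes differently; this only illustrates the idea.
    """
    return {t for t in re.split(r"[^a-z0-9]+", text.lower()) if t}


def analyzed_match(query, stored):
    """Mimic a match query with the default OR operator: any shared token is a hit."""
    return bool(rough_tokens(query) & rough_tokens(stored))


def keyword_match(query, stored):
    """Mimic matching on a keyword sub-field: the whole string must be identical."""
    return query == stored


wanted = "https://twitter.com/someUser/status/1288024417990144000"
other = "https://twitter.com/anotherUser/status/1"  # hypothetical second document

print(analyzed_match(wanted, other))  # True: both share tokens such as "https" and "status"
print(keyword_match(wanted, other))   # False: the full strings differ
print(keyword_match(wanted, wanted))  # True
```

This also suggests why the sha256sum search behaves: a hash has no separators, so it stays a single token and only the identical value matches.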

What is wrong with this elastic json query, mapping?

I am trying to use nested JSON to query DB records. Here is my query -
"query": {
"nested": {
"path": "metadata.technical",
"query": {
"bool": {
"must": [
{
"term": {
"metadata.technical.key": "techcolor"
}
},
{
"term": {
"metadata.technical.value": "red"
}
}
]
}
}
}
}
Here is this part in my mapping.json -
"metadata": {
"include_in_parent": true,
"properties": {
"technical": {
"type": "nested",
"properties": {
"key": {
"type": "string"
},
"value": {
"type": "string"
}
}
}
}
}
And I have a table with a column 'value' whose content is:
{"technical":
{
"techname22": "test",
"techcolor":"red",
"techlocation": "usa"
}
}
Why can't I get any results with this? FYI, I am using ES 1.7. Thanks for any help.
To respect the mapping you've defined, your sample document should look like this:
{
"technical": [
{
"key": "techname22",
"value": "test"
},
{
"key": "techcolor",
"value": "red"
},
{
"key": "techlocation",
"value": "usa"
}
]
}
Changing your document with the above structure would make your query work as it is.
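If the rows come out of the database in the flat form shown in the question, they can be reshaped before indexing. A minimal sketch in Python, assuming the column content has already been parsed into a dict; the helper name `to_key_value_list` is hypothetical:

```python
def to_key_value_list(flat):
    """Turn {"techcolor": "red"} style dicts into [{"key": ..., "value": ...}] entries."""
    return [{"key": key, "value": value} for key, value in flat.items()]


# Column content from the question, already parsed from JSON.
row = {"technical": {"techname22": "test", "techcolor": "red", "techlocation": "usa"}}

# Document shaped the way the nested key/value mapping expects.
document = {"technical": to_key_value_list(row["technical"])}
print(document)
```

Each entry then becomes its own nested object, so the two term clauses on key and value are evaluated against the same pair.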
The real mapping of this document:
{
"technical": {
"techname22": "test",
"techcolor": "red",
"techlocation": "usa"
}
}
is more like this:
{
"include_in_parent": true,
"properties": {
"technical": {
"type": "nested",
"properties": {
"techname22": {
"type": "string"
},
"techcolor": {
"type": "string"
},
"techlocation": {
"type": "string"
}
}
}
}
}
If all your keys are dynamic and not known in advance, you can also configure your mapping to be dynamic as well, i.e. don't define any fields in the nested type and new fields will be added if not already present in the mapping:
{
"include_in_parent": true,
"properties": {
"technical": {
"type": "nested",
"properties": {
}
}
}
}

How to use function_score on nested geo_point field

I've been trying to use function_score on a nested geo_point field, but Elasticsearch always returns documents with _score = 1.
My mapping:
{
"myindex": {
"mappings": {
"offers": {
"properties": {
"country_id": {
"type": "long"
},
"created_at": {
"type": "date",
"format": "yyyy-MM-dd HH:mm:ss"
},
"locations": {
"type": "nested",
"properties": {
"city": {
"type": "string",
"store": true,
"fields": {
"city_original": {
"type": "string",
"analyzer": "keyword_analyzer"
}
}
},
"coordinates": {
"type": "geo_point"
}
}
}
}
}
}
}
}
The problem is that some documents can have empty coordinates. My query:
GET /myindex/offers/_search?explain
{
"query": {
"nested": {
"path": "locations",
"query": {
"function_score": {
"functions": [
{
"filter": {
"exists": {
"field": "locations.coordinates"
}
},
"gauss": {
"locations.coordinates": {
"origin": {
"lat": 49.1,
"lon": 17.03333
},
"scale": "10km",
"offset": "20km"
}
}
}
]
}
}
}
}
}
As I said, Elasticsearch returns documents with _score = 1 and the explanation: Score based on child doc range from 21714 to 21714
What am I doing wrong? My version of Elasticsearch is 1.7.

filter '_index' same way as '_type' in search across multiple index query elastic search

I have two indexes, index1 and index2, and both have two types with the same names, type1 and type2. (Please assume that we have a valid business reason for this.)
I would like to search index1/type1 and index2/type2.
Here is my query:
POST _search
{
"query": {
"indices": {
"indices": ["index1","index2"],
"query": {
"filtered":{
"query":{
"multi_match": {
"query": "test",
"type": "cross_fields",
"fields": ["_all"]
}
},
"filter":{
"or":{
"filters":[
{
"terms":{
"_index":["index1"], // how can i make this work?
"_type": ["type1"]
}
},
{
"terms":{
"_index":["index2"], // how can i make this work?
"_type": ["type2"]
}
}
]
}
}
}
},
"no_match_query":"none"
}
}
}
You can use indices and type filters inside a bool filter to filter on both index and type.
The query would look something along these lines:
POST index1,index2/_search
{
"query": {
"filtered": {
"query": {
"multi_match": {
"query": "test",
"type": "cross_fields",
"fields": [
"_all"
]
}
},
"filter": {
"bool": {
"should": [
{
"indices": {
"index": "index1",
"filter": {
"type": {
"value": "type1"
}
},
"no_match_filter": "none"
}
},
{
"indices": {
"index": "index2",
"filter": {
"type": {
"value": "type2"
}
},
"no_match_filter": "none"
}
}
]
}
}
}
}
}
Passing the index names in the URL (for example index1,index2/_search) is good practice; otherwise you risk executing the query across all indices in the cluster.
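With more than two index/type pairs, writing the should clauses by hand gets repetitive. The sketch below generates the same filter and URL path from a list of pairs. It keeps the Elasticsearch 1.x syntax used in this answer, and the helper names `index_type_filter` and `search_path` are hypothetical.

```python
import json


def index_type_filter(pairs):
    """Build the bool/should filter: one `indices` clause per (index, type) pair."""
    return {
        "bool": {
            "should": [
                {
                    "indices": {
                        "index": index,
                        "filter": {"type": {"value": doc_type}},
                        # Documents from other indices never match this clause.
                        "no_match_filter": "none",
                    }
                }
                for index, doc_type in pairs
            ]
        }
    }


def search_path(pairs):
    """Restrict the request to the listed indices instead of the whole cluster."""
    return ",".join(index for index, _ in pairs) + "/_search"


pairs = [("index1", "type1"), ("index2", "type2")]
print(search_path(pairs))  # index1,index2/_search
print(json.dumps(index_type_filter(pairs), indent=2))
```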

Elastic Search mapping bad mapping

I have the following mapping for an Elasticsearch index. I am sending it (via "PUT") to http://abc.com/test/article/_mapping.
{
"article": {
"settings": {
"analysis": {
"analyzer": {
"stem": {
"tokenizer": "standard",
"filter": [
"standard",
"lowercase",
"stop",
"porter_stem"
]
}
}
}
},
"mappings": {
"properties": {
"DocumentID": {
"type": "string"
},
"ContentSource": {
"type": "integer"
},
"ContentType": {
"type": "integer"
},
"PageTitle": {
"type": "string",
"analyzer": "stem"
},
"ContentBody": {
"type": "string",
"analyzer": "stem"
},
"URL": {
"type": "string"
}
}
}
}
}
I get an OK message from Elasticsearch, but when I go to http://abc.com/test/article/_mapping, I don't see the settings of the mapping. All I see is this:
{ "article" : { "properties" : { } }}
I had this working before I added the settings portion for the analyzer. Any help is appreciated!
I figured it out. The first "article" string needs to be deleted. And the "PUT" should happen against http://abc.com/index
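As a rough illustration of that fix, the wrapper can be stripped programmatically before the body is sent. This is only a sketch: the helper name `unwrap_index_body` is hypothetical, the example body is abbreviated from the one in the question, and the code only reshapes the dictionary without making any HTTP request.

```python
import json


def unwrap_index_body(wrapped, wrapper_key):
    """Drop the outer wrapper so `settings` and `mappings` sit at the top level."""
    inner = wrapped[wrapper_key]
    missing = {"settings", "mappings"} - inner.keys()
    if missing:
        raise ValueError(f"body is missing sections: {sorted(missing)}")
    return inner


# Abbreviated version of the body from the question.
wrapped = {
    "article": {
        "settings": {"analysis": {"analyzer": {"stem": {"tokenizer": "standard"}}}},
        "mappings": {"properties": {"PageTitle": {"type": "string", "analyzer": "stem"}}},
    }
}

body = unwrap_index_body(wrapped, "article")
print(json.dumps(body, indent=2))
```

The resulting body is what would be sent with PUT to the index URL rather than to the _mapping endpoint.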