What is wrong with this Elasticsearch JSON query and mapping?

I am trying to use a nested JSON query to search DB records. Here is my query:
"query": {
"nested": {
"path": "metadata.technical",
"query": {
"bool": {
"must": [
{
"term": {
"metadata.technical.key": "techcolor"
}
},
{
"term": {
"metadata.technical.value": "red"
}
}
]
}
}
}
}
Here is the relevant part of my mapping.json:
"metadata": {
"include_in_parent": true,
"properties": {
"technical": {
"type": "nested",
"properties": {
"key": {
"type": "string"
},
"value": {
"type": "string"
}
}
}
}
}
And I have a table with a column 'value' whose content is:
{"technical":
{
"techname22": "test",
"techcolor":"red",
"techlocation": "usa"
}
}
Why can't I get any results with this? FYI I am using ES 1.7. Thanks for any help.

To respect the mapping you've defined, your sample document should look like this:
{
"technical": [
{
"key": "techname22",
"value": "test"
},
{
"key": "techcolor",
"value": "red"
},
{
"key": "techlocation",
"value": "usa"
}
]
}
Changing your document to the above structure would make your query work as-is.
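If the source data arrives as a plain object, the restructuring into key/value pairs can be done at index time. A minimal Python sketch (assuming you control the indexing pipeline; the field names come from the mapping above):

```python
def to_key_value_pairs(technical):
    """Flatten an arbitrary dict into the key/value list shape
    that the nested mapping above expects."""
    return [{"key": k, "value": v} for k, v in technical.items()]

# The document shape stored in the 'value' column
doc = {"techname22": "test", "techcolor": "red", "techlocation": "usa"}
nested_doc = {"technical": to_key_value_pairs(doc)}
```

With this shape indexed, the original nested term query on `metadata.technical.key` / `metadata.technical.value` matches as expected.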
The real mapping of this document:
{
"technical": {
"techname22": "test",
"techcolor": "red",
"techlocation": "usa"
}
}
Is more like this:
{
"include_in_parent": true,
"properties": {
"technical": {
"type": "nested",
"properties": {
"techname22": {
"type": "string"
},
"techcolor": {
"type": "string"
},
"techlocation": {
"type": "string"
}
}
}
}
}
If all your keys are dynamic and not known in advance, you can also make your mapping dynamic: don't define any fields in the nested type, and new fields will be added to the mapping as they are encountered:
{
"include_in_parent": true,
"properties": {
"technical": {
"type": "nested",
"properties": {
}
}
}
}


Why is the json object valid against this conditional schema?

Here is the json object.
{
"payment": {
"account": [
{
"type": "ACCOUNT_INFORMATION",
"identification": "2451114"
},
{
"type": "XXX",
"identification": "2451114"
}
]
}
}
And this is the schema.
{
"if": {
"properties": {
"payment": {
"properties": {
"account": {
"items": {
"properties": {
"type": {
"const": "ACCOUNT_INFORMATION"
}
}
}
}
}
}
}
},
"then": {
"properties": {
"payment": {
"properties": {
"account": {
"items": {
"properties": {
"identification": {
"maxLength": 8,
"minLength": 8
}
}
}
}
}
}
}
}
}
If I remove the second account item as follows, the schema gives an error.
{
"payment": {
"account": [
{
"type": "ACCOUNT_INFORMATION",
"identification": "2451114"
}
]
}
}
Is this because the conditional schema cannot be applied to an embedded array?
Validation was done using https://www.jsonschemavalidator.net/
The first JSON object returns no error while the second one returns an error for violating the minLength constraint.
Should both return an error?
To see what's happening, let's break down the schema to focus on the critical part of the if schema.
"items": {
"properties": {
"type": { "const": "ACCOUNT_INFORMATION" }
}
}
Given this schema, the following instance is not valid because not all "type" properties have the value "ACCOUNT_INFORMATION".
[
{
"type": "ACCOUNT_INFORMATION",
"identification": "2451114"
},
{
"type": "XXX",
"identification": "2451114"
}
]
And the following value is valid because all "type" properties have the value "ACCOUNT_INFORMATION".
[
{
"type": "ACCOUNT_INFORMATION",
"identification": "2451114"
}
]
That difference in validation result is the reason these two values behave differently in your schema. The then schema is applied only when the if schema evaluates to true, which is what you get with the second example and not the first. The then schema is applied on the second example and the minLength constraint causes validation to fail.
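The "all items must match" behaviour of the if branch can be sketched in plain Python. This is only an illustration of the draft-07 semantics, not the validator's actual code:

```python
def if_matches(accounts):
    # "items" applies its subschema to every element, so the "if"
    # branch is true only when ALL items have the required type
    return all(a.get("type") == "ACCOUNT_INFORMATION" for a in accounts)

def valid(accounts):
    if if_matches(accounts):
        # "then" branch: identification must be exactly 8 characters
        return all(len(a["identification"]) == 8 for a in accounts)
    return True  # "if" failed, so "then" is never applied

mixed = [
    {"type": "ACCOUNT_INFORMATION", "identification": "2451114"},
    {"type": "XXX", "identification": "2451114"},
]
single = [{"type": "ACCOUNT_INFORMATION", "identification": "2451114"}]
```

Here `valid(mixed)` is True (the "if" fails, so minLength is never checked) while `valid(single)` is False ("2451114" is only 7 characters), mirroring the validator's results.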
It seems like your conditional only applies to the items schema, so you can solve this by moving your conditional into that object only.
{
"properties": {
"payment": {
"properties": {
"account": {
"items": {
"if": {
"properties": {
"type": {
"const": "ACCOUNT_INFORMATION"
}
},
"required": ["type"]
},
"then": {
"properties": {
"identification": {
"maxLength": 8,
"minLength": 8
}
}
}
}
}
}
}
}
}
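With the conditional moved inside items, the check becomes per-item. A plain-Python equivalent of that per-item logic (again just an illustration of the semantics):

```python
def item_valid(item):
    # per-item conditional: the length constraint applies only to
    # items whose type is ACCOUNT_INFORMATION
    if item.get("type") == "ACCOUNT_INFORMATION":
        return len(item.get("identification", "")) == 8
    return True  # other types are unconstrained

accounts = [
    {"type": "ACCOUNT_INFORMATION", "identification": "2451114"},
    {"type": "XXX", "identification": "2451114"},
]
results = [item_valid(a) for a in accounts]
```

Now the ACCOUNT_INFORMATION item fails (7-character identification) regardless of what the other items contain, which is the behaviour the question expected.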

django-elasticsearch-dsl-drf with JSONfield

I'm using the django-elasticsearch-dsl-drf package and I have a Postgres JSONField which I want to index. I tried to use NestedField in the document, but without any properties, since the JSON field is arbitrary. However, I'm not able to search on that field, and I don't see anything related to this in their documentation.
Any idea how can I achieve this?
Mapping:
{
"mappings": {
"_doc": {
"properties": {
"jsondata": {
"type": "nested",
"properties": {
"timestamp": {
"type": "date"
},
"gender": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"group_id": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
}, ...
I want to search on that field like jsondata.gender = x
Query for jsondata.gender = x (the path must match the mapping's field name):
GET <index>/_search
{
"query": {
"nested": {
"path": "jsondata",
"query": {
"term": {
"jsondata.gender.keyword": {
"value": "x"
}
}
}
}
}
}
NOTE: The query has not been verified using Kibana Dev Tools.
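If you are assembling the request body from Python before handing it to the client, the nested term query can be built as a plain dict. A small sketch (the helper name is illustrative, not part of the package):

```python
def nested_keyword_term(path, field, value):
    """Build a nested term query on the .keyword sub-field,
    matching the index mapping shown in the question."""
    return {
        "query": {
            "nested": {
                "path": path,
                "query": {
                    "term": {f"{path}.{field}.keyword": {"value": value}}
                },
            }
        }
    }

body = nested_keyword_term("jsondata", "gender", "x")
```

The resulting dict can be passed as the request body of a search call.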

Add pattern validation in json schema when property is present

Below is my schema definition, and I would like to add a pattern that depends on the environment property name (env1, env2 or env3). Each env should have a different pattern. For instance, when env1 is present the url should match a different pattern than when env2 is present, etc.
{
"environments": {
"env1": {
"defaultAccess": {
"url": [
"something-staging"
]
}
}
}
}
My current schema definition for that example:
{
"$schema": "https://json-schema.org/draft-07/schema#",
"definitions": {
"envType": {
"type": "object",
"properties": {
"defaultAccess": {
"type": "object",
"properties": {
"url": {
"type": "string",
"pattern": "^[a-zA-Z0-9- \/]*$"
}
},
"required": [
"url"
]
}
}
},
"environmentTypes": {
"type": "object",
"properties": {
"env1": {
"$ref": "#/definitions/envType"
},
"env2": {
"$ref": "#/definitions/envType"
},
"env3": {
"$ref": "#/definitions/envType"
}
}
},
"type": "object",
"properties": {
"environments": {
"$ref": "#/definitions/environmentTypes"
}
}
}
}
In my head I have something like this, but I do not know how to apply it to the schema properly.
{
"if": {
"properties": {
"environments": {
"env1" : {}
}
}
},
"then":{
"properties": {
"environments-env1-defaultAccess-url" : { "pattern": "^((?!-env2).)*$" }
}
}
}
etc..
If I understand correctly what you're trying to do, you shouldn't need conditionals for this kind of thing.
You have an error in your schema that might be tripping you up: your main schema is inside the definitions keyword. If you run this through a validator, you should get an error saying that the value at /definitions/type must be an object.
Aside from that, schema composition using allOf should do the trick. Below, I've shown an example at /definitions/env1Type.
It looks like you were hoping for a less verbose way to specify a schema deep in an object structure. Unfortunately, there's no way around chaining the properties keyword all the way down, as I've demonstrated at /definitions/env1Type.
{
"$schema": "https://json-schema.org/draft-07/schema#",
"type": "object",
"properties": {
"environments": { "$ref": "#/definitions/environmentTypes" }
},
"definitions": {
"environmentTypes": {
"type": "object",
"properties": {
"env1": { "$ref": "#/definitions/env1Type" },
"env2": { "$ref": "#/definitions/env2Type" },
"env3": { "$ref": "#/definitions/env3Type" }
}
},
"envType": { ... },
"env1Type": {
"allOf": [{ "$ref": "#/definitions/envType" }],
"properties": {
"defaultAccess": {
"properties": {
"url": { "pattern": "^((?!-env1).)*$" }
}
}
}
},
"env2Type": { ... },
"env3Type": { ... }
}
}
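The negative-lookahead pattern used in env1Type can be checked quickly with Python's re module (the sample strings below are made up for illustration):

```python
import re

# matches any string that does NOT contain the substring "-env1":
# at every position, the lookahead rejects a match if "-env1" starts there
no_env1 = re.compile(r"^((?!-env1).)*$")

ok = bool(no_env1.match("something-staging"))
bad = bool(no_env1.match("service-env1-url"))
```

`ok` is True and `bad` is False; swapping `-env1` for `-env2`/`-env3` gives the per-environment variants.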

How to avoid Mongodb from creating "members" fields in Json?

I am storing JSON schemas/objects in a MongoDB to query them from a REST client. For accessing the database, I use Spring Boot with MongoTemplate. My problem is that MongoDB seems to create several fields like "members", "_class" and "_id" automatically. I was already able to remove the _class and _id fields from the output, but the "members" fields are quite persistent; I found no information about how to remove them (or at least suppress them in the output). I also can't simply strip them from the objects after querying, because then all the other information (title, definitions, ...) is lost, too.
Anyone out there who had this problem as well and can help? I would appreciate :)
Regards
Object(s) after querying out of mongodb:
[
{
"members": {
"__ID__": {
"value": "c89bae58-8911-45a4-80b0-843278ac72c8"
}
}
},
{
"members": {
"title": {
"value": "LogicalNodes"
},
"__ID__": {
"value": "a09ffc24-d25f-467a-bcd7-9eaed5fbc44e"
}
}
},
{
"members": {
"title": {
"value": "LogicalNodes"
},
"definitions": {
"members": {
"mv": {
"members": {
"type": {
"value": "object"
},
"properties": {
"members": {
"i": {
"members": {
"type": {
"value": "number"
}
}
}
}
}
}
}
}
},
"__ID__": {
"value": "eb054bd3-2c50-43eb-9ee9-4c8e54a8236d"
}
}
},
{
"members": {
"title": {
"value": "TestObject"
},
"definitions": {
"members": {
"ab": {
"members": {
"type": {
"value": "object"
},
"properties": {
"members": {
"q": {
"members": {
"type": {
"value": "number"
}
}
}
}
}
}
}
}
},
"__ID__": {
"value": "4b5a5813-5596-4c88-9a24-e19ae08b7548"
}
}
},
{
"members": {
"$schema": {
"value": "schema"
},
"title": {
"value": "TestObject"
},
"definitions": {
"members": {
"ab": {
"members": {
"type": {
"value": "object"
},
"properties": {
"members": {
"q": {
"members": {
"type": {
"value": "number"
}
}
}
}
}
}
}
}
},
"__ID__": {
"value": "e2a82f73-709d-410d-a79c-1e997f9fc5a4"
}
}
},
{
"members": {
"$schema": {
"value": "schema"
},
"title": {
"value": "TestObject"
},
"name": {
"value": "test"
},
"__ID__": {
"value": "26089639-8b99-4f47-8696-d8f3e50694b7"
}
}
}
]
NOTE: I have added the "ID" fields myself. Everything works, but there shouldn't be any "members" fields: they produce extra overhead, make the output less readable, and make the queries more complex.

How to use function_score on nested geo_point field

I've been trying to use function_score on a nested geo_point field, but Elasticsearch always returns documents with _score = 1.
My mapping
{
"myindex": {
"mappings": {
"offers": {
"properties": {
"country_id": {
"type": "long"
},
"created_at": {
"type": "date",
"format": "yyyy-MM-dd HH:mm:ss"
},
"locations": {
"type": "nested",
"properties": {
"city": {
"type": "string",
"store": true,
"fields": {
"city_original": {
"type": "string",
"analyzer": "keyword_analyzer"
}
}
},
"coordinates": {
"type": "geo_point"
}
}
}
}
}
}
}
}
The problem is that some documents can have empty coordinates. My query:
GET /myindex/offers/_search?explain
{
"query": {
"nested": {
"path": "locations",
"query": {
"function_score": {
"functions": [
{
"filter": {
"exists": {
"field": "locations.coordinates"
}
},
"gauss": {
"locations.coordinates": {
"origin": {
"lat": 49.1,
"lon": 17.03333
},
"scale": "10km",
"offset": "20km"
}
}
}
]
}
}
}
}
}
As I said, Elasticsearch returns documents with _score = 1 and the explanation: Score based on child doc range from 21714 to 21714
What am I doing wrong? My version of Elasticsearch is 1.7.