Cloudant Selector Query - json

I would like to query a Cloudant database using a selector. For example, given the document shown below, the user would like to find loans borrowed whose amount exceeds a given number. How do I access the array in a Cloudant selector to find a specific record?
{
  "_id": "65c5e4c917781f7365f4d814f6e1665f",
  "_rev": "2-73615006996721fef9507c2d1dacd184",
  "userprofile": {
    "name": "tom",
    "age": 30,
    "employer": "Microsoft"
  },
  "loansBorrowed": [
    {
      "loanamount": 5000,
      "loandate": "01/01/2001",
      "repaymentdate": "01/01/2001",
      "rateofinterest": 5.6,
      "activeStatus": true,
      "penalty": {
        "penalty-amount": 500,
        "reasonforPenalty": "Exceeded the date by 10 days"
      }
    },
    {
      "loanamount": 3000,
      "loandate": "01/01/2001",
      "repaymentdate": "01/01/2001",
      "rateofinterest": 5.6,
      "activeStatus": true,
      "penalty": {
        "penalty-amount": 400,
        "reasonforPenalty": "Exceeded the date by 10 days"
      }
    },
    {
      "loanamount": 2000,
      "loandate": "01/01/2001",
      "repaymentdate": "01/01/2001",
      "rateofinterest": 5.6,
      "activeStatus": true,
      "penalty": {
        "penalty-amount": 500,
        "reasonforPenalty": "Exceeded the date by 10 days"
      }
    }
  ]
}

If you use the default Cloudant Query index (type text, index everything):
{
  "index": {},
  "type": "text"
}
Then the following query selector should work to find e.g. all documents with a loanamount > 1000:
"loansBorrowed": { "$elemMatch": { "loanamount": { "$gt": 1000 } } }
I'm not sure that you can coax Cloudant Query to only index nested fields within an array so, if you don't need the flexibility of the "index everything" approach, you're probably better off creating a Cloudant Search index which indexes just the specific fields you need.
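To make the Cloudant Search suggestion concrete, the index lives in a design document whose index function decides exactly which fields get indexed. A sketch of such a design document, with illustrative names (_design/loans, byAmount):

{
  "_id": "_design/loans",
  "indexes": {
    "byAmount": {
      "index": "function (doc) { if (doc.loansBorrowed) { doc.loansBorrowed.forEach(function (loan) { index('loanamount', loan.loanamount); }); } }"
    }
  }
}

You could then query it with a Lucene range query, e.g. GET /<db>/_design/loans/_search/byAmount?q=loanamount:[4001 TO Infinity], indexing only the one field you care about.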

While Will's answer works, I wanted to let you know that you have other indexing options with Cloudant Query for handling arrays. This blog has the details on various tradeoffs (https://cloudant.com/blog/mango-json-vs-text-indexes/), but long story short, I think this might be the best indexing option for you:
{
  "index": {
    "fields": [
      { "name": "loansBorrowed.[].loanamount", "type": "number" }
    ]
  },
  "type": "text"
}
Unlike Will's index-everything approach, here you're only indexing a specific field, and if the field contains an array, you're also indexing every element of that array. Particularly for "type": "text" indexes on large datasets, specifying a field to index will save you index-build time and storage space. Note that text indexes that specify a field must use the form {"name": "fieldname", "type": "..."} in the "fields": array, where "type" is one of "boolean", "number", or "string".
So then the corresponding Cloudant Query "selector": statement would be this:
{
  "selector": {
    "loansBorrowed": { "$elemMatch": { "loanamount": { "$gt": 4000 } } }
  },
  "fields": [
    "_id",
    "userprofile.name",
    "loansBorrowed"
  ]
}
Also note that "fields": is not required; I included it here to project only certain parts of the JSON. If you omit it, the entire document is returned.
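If the $elemMatch semantics are unclear, here is a minimal Python sketch of what the selector above does conceptually (plain Python applied to the sample document, not Cloudant client code):

```python
# Sketch of Cloudant Query's {"$elemMatch": {"loanamount": {"$gt": N}}}:
# the document matches if ANY element of the array satisfies the condition.

def elem_match_gt(doc, array_field, field, threshold):
    """True if any element of doc[array_field] has element[field] > threshold."""
    return any(
        elem.get(field, float("-inf")) > threshold
        for elem in doc.get(array_field, [])
    )

doc = {
    "_id": "65c5e4c917781f7365f4d814f6e1665f",
    "userprofile": {"name": "tom", "age": 30},
    "loansBorrowed": [
        {"loanamount": 5000},
        {"loanamount": 3000},
        {"loanamount": 2000},
    ],
}

print(elem_match_gt(doc, "loansBorrowed", "loanamount", 4000))  # True: 5000 qualifies
print(elem_match_gt(doc, "loansBorrowed", "loanamount", 6000))  # False: no loan exceeds 6000
```

So with "$gt": 4000 the sample document is returned because of its 5000 loan, even though its other loans are smaller.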

Elastic Search - Nested aggregation

I would like to form a nested aggregation query in Elasticsearch. The nested aggregation goes four levels deep:
groupId.keyword
  direction
    billingCallType
      durationCallAnswered
example:
"aggregations": {
"avgCallDuration": {
"terms": {
"field": "groupId.keyword",
"size": 10000,
"min_doc_count": 1,
"shard_min_doc_count": 0,
"show_term_doc_count_error": false,
"order": [
{
"_count": "desc"
},
{
"_key": "asc"
}
]
},
"aggregations": {
"call_direction": {
"terms" : {
"field": "direction"
},
"aggregations": {
"call_type" : {
"terms": {
"field": "billingCallType"
},
"aggregations": {
"avg_value": {
"terms": {
"field": "durationCallAnswered"
}
}
}
}
}
}
}
}
}
This is part of a larger query. While running it, I get the following error:
"type": "illegal_argument_exception",
"reason": "Text fields are not optimised for operations that require per-document field data like aggregations and sorting, so these operations are disabled by default. Please use a keyword field instead. Alternatively, set fielddata=true on [direction] in order to load field data by uninverting the inverted index. Note that this can use significant memory."
Can anyone throw light on this?
TL;DR
As the error states, you are performing an aggregation on a text field, direction.
Aggregations are not supported by default on text fields, as they are very expensive (CPU- and memory-wise).
There are three solutions to your issue:
Change the mapping from text to keyword (requires reindexing; the most efficient way to query the data)
Add fielddata: true to this field's mapping (flexible, but not optimised)
Don't do the aggregation on this field :)
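For the first option, a common pattern is to give the text field a keyword sub-field and aggregate on that. A sketch of the mapping, assuming ES 7+ and an illustrative index name calls:

PUT /calls
{
  "mappings": {
    "properties": {
      "direction": {
        "type": "text",
        "fields": {
          "raw": { "type": "keyword" }
        }
      }
    }
  }
}

You would then aggregate on direction.raw instead of direction. For the second option, the mapping would instead be {"type": "text", "fielddata": true}, which avoids reindexing but loads field data into heap memory at query time.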

How do I properly use deleteMany() with an $and query in the MongoDB shell?

I am trying to delete all documents in my collection infrastructure that have a type.primary property of "pipelines" and a type.secondary property of "oil".
I'm trying to use the following query:
db.infrastructure.deleteMany({$and: [{"properties.type.primary": "pipelines"}, {"properties.type.secondary": "oil"}] }),
That returns: { acknowledged: true, deletedCount: 0 }
I expect my query to work because in MongoDB Compass, I can retrieve 182 documents that match the query {$and: [{"properties.type.primary": "pipelines"}, {"properties.type.secondary": "oil"}] }
My documents appear with the following structure (relevant section only):
properties": {
"optional": {
"description": ""
},
"original": {
"Opername": "ENBRIDGE",
"Pipename": "Lakehead",
"Shape_Leng": 604328.294581,
"Source": "EIA"
},
"required": {
"unit": null,
"viz_dim": null,
"years": []
},
"type": {
"primary": "pipelines",
"secondary": "oil"
}
...
My understanding is that I just need to pass a filter to deleteMany() and that $and expects an array of objects. For some reason the two combined isn't working here.
I realized the simplest answer was the correct one -- I spelled my database name incorrectly.
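As a side note, the explicit $and is unnecessary here: putting both conditions in a single filter document is an implicit AND in MongoDB, so (with the database name spelled correctly) this equivalent form also works:

db.infrastructure.deleteMany({
  "properties.type.primary": "pipelines",
  "properties.type.secondary": "oil"
})

$and is only required when you need to repeat the same field in multiple conditions.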

Elastic search filter for distinct categories

I made a simple mapping with three fields. I am analyzing one field, which is text type; the other two fields are keyword type.
Example fields: Category_one, Category_two, Category_three.
Now I am searching the documents.
Get _search/cat
{
"size": 4,
"query": {
"match": {
"Category_one.ngrams": {
"query": "Nice food place in XYZ location",
"analyzer": "standard"
}
},
"aggs":{
"distincr_values":{
"terms": {
"fields" : "Category_two"
}
}
}
}
}
It's showing this error
{
  "error": {
    "root_cause": [
      {
        "type": "parsing_exception",
        "reason": "[match] malformed query, expected [END_OBJECT] but found [FIELD_NAME]",
        "line": 10,
        "col": 5
      }
    ],
    "type": "parsing_exception",
    "reason": "[match] malformed query, expected [END_OBJECT] but found [FIELD_NAME]",
    "line": 10,
    "col": 5
  },
  "status": 400
}
Kindly help me with this error. My main goal is to find the distinct values of the Category_two field for matching documents.
Any help would be appreciated.
I believe you're getting this error because of your query structure.
Your aggregations keyword must be outside (at the same level as) the query. At the moment, your aggs is wrapped up inside the query.
Following this structure:
GET /cat/_search
{
  "size": 4,
  "query": {
    'query goes here'
  },
  "aggs": {
    'aggregations go here'
  }
}
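Applying that structure to the original query would look something like the sketch below. Note that the original was also missing the closing brace of the match clause, and the terms aggregation option is "field", not "fields":

GET /cat/_search
{
  "size": 4,
  "query": {
    "match": {
      "Category_one.ngrams": {
        "query": "Nice food place in XYZ location",
        "analyzer": "standard"
      }
    }
  },
  "aggs": {
    "distinct_values": {
      "terms": { "field": "Category_two" }
    }
  }
}

Since Category_two is mapped as keyword, the terms aggregation on it should work without any mapping changes.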

Querying Microsoft Academic graph by fields of study of references

I've been playing around with the Microsoft Academic API, trying to do graph queries using JSON formatted queries. I'm at the point where I think I can produce results, but for some reason I don't get the full set of results.
The query I am attempting to perform will retrieve all papers that reference a paper that has a FieldOfStudy that is one of the ones I'm looking for. Essentially, I'm trying to find out how well cited a field of study is.
I think the query should look something like this:
{
  "path": "/paper/ReferenceIDs/reference/FieldOfStudyIDs/field",
  "paper": {
    "type": "Paper",
    "match": {
      "PublishYear": 2017
    },
    "select": ["DOI", "OriginalTitle", "PublishYear"]
  },
  "reference": {
    "type": "Paper",
    "select": "OriginalTitle"
  },
  "field": {
    "type": "FieldOfStudy",
    "select": ["Name"],
    "return": { "id": [106686826, 204641814] }
  }
}
Unfortunately, I get only an incomplete subset of results. Funnily enough, though, if I further restrict the initial node by matching on a title, I get another set of results (disjoint from the first query's result set):
{
  "path": "/paper/ReferenceIDs/reference/FieldOfStudyIDs/field",
  "paper": {
    "type": "Paper",
    "match": {
      "OriginalTitle": "cancer",
      "PublishYear": 2017
    },
    "select": ["DOI", "OriginalTitle", "PublishYear"]
  },
  "reference": {
    "type": "Paper",
    "select": "OriginalTitle"
  },
  "field": {
    "type": "FieldOfStudy",
    "select": ["Name"],
    "return": { "id": [106686826, 204641814] }
  }
}
So, what could be going on here? Is the query giving up because the very first node it hits on the broader search doesn't match the path? Is it even possible to query all papers published in a year like this?

MySQL regexp search JSON array

I am storing JSON data in one of the fields of a table, and I am having trouble using REGEXP to return the correct entries.
Basically, it matches other attributes in the JSON object that it should not.
Sample JSON
{
  "data": {
    "en": {
      "containers": [
        {
          "id": 1441530944931,
          "template": "12",
          "columns": {
            "column1": [
              "144",
              "145",
              "148"
            ],
            "column2": [
              "135",
              "148",
              "234"
            ]
          }
        }
      ],
      "left": "152",
      "right": "151"
    }
  }
}
Now, I would like to search the columns arrays for a specific value (i.e. 148).
Right now I have the below query
WHERE w.`_attrs` REGEXP '"column[0-9]":.*\\[.*"148".*\\]'
which works just fine
However, if I change the value from 148 to 152 or 151, it also matches.
For some reason the query matches the left and right attributes as well, which is not desirable.
Any help?
Thanks
Or... Switch to MariaDB 10 and index the components of the JSON.
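Alternatively, if you are on MySQL 5.7+ (or a MariaDB version with the JSON functions), you can avoid the regex entirely and restrict the search to the columns arrays with a JSON path. A sketch, using a hypothetical table name widgets for the question's alias w:

-- JSON_SEARCH only matches string scalars, which fits the quoted values here.
SELECT w.*
FROM widgets w
WHERE JSON_SEARCH(
        w.`_attrs`,
        'one',                                   -- stop at the first match
        '148',                                   -- value to find
        NULL,                                    -- no escape character
        '$.data.en.containers[*].columns.*[*]'   -- look only inside the columns arrays
      ) IS NOT NULL;

Because the path never descends into left or right, values like "152" and "151" can no longer produce false matches.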