JSON contains with exclusion or wildcard - json

Is there a way to either wildcard or exclude fields in a json_contains statement (Postgres w/ SQLAlchemy)?
For example, let's say one of the rows of my database has a field called MyField which has a typical JSON value of ...
MyField : {Store: "HomeDepot", Location: "New York"}
Now, I am doing a json contains on that with a larger json variable called larger_json...
larger_json : {Store: "HomeDepot", Location: "New York", Customer: "Bob" ... }
In SQLAlchemy, I could use MyTable.MyField.comparator.contained_by(larger_json), and in this case that would work fine. But what if, for example, I later removed Location as a field in my variable, so I still have the value in my database, but it no longer exists in larger_json:
MyField : {Store: "HomeDepot", Location: "New York"}
larger_json : {Store: "HomeDepot", Customer: "Bob" ... }
Assume that I know when this happens, i.e. I know that the database has Location but the larger_json does not. Is there a way for me to either wildcard Location, i.e. something like this...
{Store: "HomeDepot", Location: "*", Customer: "Bob" ... }
or to exclude it from the json value? Something like this?
MyTable.MyField.exclude_fields().comparator.contained_by(larger_json)
Or is there another recommended approach for dealing with this?

Not sure if that's what you need, but you could remove Location as a key from the values you search:
... WHERE (tab.myfield - 'Location') <@ larger_json
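In SQLAlchemy the same idea can be expressed with the generic op() escape hatch. A minimal sketch, assuming MyField is a JSONB column and a 1.4+ style session (the names MyTable, MyField and larger_json come from the question; everything else is an assumption):

from sqlalchemy import select, cast
from sqlalchemy.dialects.postgresql import JSONB

# Sketch only: strip the 'Location' key from the stored value, then test
# containment against larger_json.
stmt = select(MyTable).where(
    MyTable.MyField.op('-')('Location')          # myfield - 'Location'
    .op('<@')(cast(larger_json, JSONB))          # ... <@ larger_json
)
rows = session.execute(stmt).scalars().all()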

Related

How do I search for a specific string in a JSON Postgres data type column?

I have a column named params in a table named reports which contains JSON.
I need to find which rows contain the text 'authVar' anywhere in the JSON array. I don't know the path or level in which the text could appear.
I want to just search through the JSON with a standard LIKE operator.
Something like:
SELECT * FROM reports
WHERE params LIKE '%authVar%'
I have searched and googled and read the Postgres docs. I don't understand the JSON data type very well, and figure I am missing something easy.
The JSON looks something like this.
[
  {
    "tileId": 18811,
    "Params": {
      "data": [
        {
          "name": "Week Ending",
          "color": "#27B5E1",
          "report": "report1",
          "locations": {
            "c1": 0,
            "c2": 0,
            "r1": "authVar",
            "r2": 66
          }
        }
      ]
    }
  }
]
In Postgres 11 or earlier it is possible to recursively walk through an unknown json structure, but it would be rather complex and costly. I would propose the brute force method which should work well:
select *
from reports
where params::text like '%authVar%';
-- or
-- where params::text like '%"authVar"%';
-- if you are looking for the exact value
The query is simple and fast, but it may return unexpected extra rows when the searched string is part of one of the keys rather than a value (e.g. a key named "authVarSetting" would also match).
In Postgres 12+, recursive searching in JSONB is much more convenient thanks to the new jsonpath support.
Find a string value containing authVar:
select *
from reports
where jsonb_path_exists(params, '$.** ? (@.type() == "string" && @ like_regex "authVar")')
The jsonpath:
$.** find any value at any level (recursive processing)
? where
@.type() == "string" value is string
&& and
@ like_regex "authVar" value contains 'authVar'
Or find the exact value:
select *
from reports
where jsonb_path_exists(params, '$.** ? (@ == "authVar")')
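From application code you can also pass the searched value in as a jsonpath variable instead of splicing it into the path string (the like_regex pattern has to be written as a literal inside the path, so only this exact-value variant is parameterized here). A minimal Python sketch, assuming psycopg2, the connection settings, and that params is a jsonb column:

import psycopg2

conn = psycopg2.connect("dbname=mydb")  # assumed connection settings
with conn, conn.cursor() as cur:
    cur.execute(
        """
        select *
        from reports
        where jsonb_path_exists(params, '$.** ? (@ == $val)',
                                jsonb_build_object('val', %s::text))
        """,
        ("authVar",),
    )
    rows = cur.fetchall()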
Read in the documentation:
The SQL/JSON Path Language
jsonpath Type

Select JSON array's fields from SQL view to create a column

I have an SQL table in which one of the columns contains a JSON array in the following format:
[
  {
    "id": "1",
    "translation": "something here",
    "value": "value of something here"
  },
  {
    "id": "2",
    "translation": "something else here",
    "value": "value of something else here"
  },
  ...
]
Is there any way to use an SQL query to retrieve columns with the id as the header and the "value" as the value of the column, instead of returning only one column with the JSON array?
For example, if I run:
SELECT column_with_json FROM myTable
it will return the above array, whereas I want it to return:
1,2
value of something here, value of something else here
You can't use SQL to retrieve columns from the JSON stored inside the table: to the database engine the JSON is just unstructured text saved in a text field.
Some relational databases, like PostgreSQL, have a JSON type and functions to support querying JSON. If this is your case, you should be able to perform the query you want.
Check this for an example of how it works with PostgreSQL:
http://clarkdave.net/2013/06/what-can-you-do-with-postgresql-and-json/
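For the PostgreSQL case, here is a minimal Python sketch of the general approach (psycopg2 is assumed; myTable and column_with_json come from the question, and the column is cast to jsonb). It unnests the array into one row per element; pivoting those rows into columns headed by id would still be a separate step:

import psycopg2

conn = psycopg2.connect("dbname=mydb")  # assumed connection settings
with conn, conn.cursor() as cur:
    cur.execute(
        """
        select elem->>'id'    as id,
               elem->>'value' as value
        from myTable,
             jsonb_array_elements(column_with_json::jsonb) as elem
        """
    )
    for id_, value in cur.fetchall():
        print(id_, value)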

Is mongodb (or other nosql dbs) the best solution for the following scenario?

Considering the following data structures, which would be better for querying the data once it is stored in a database system (RDBMS or NoSQL)? The fields within the metadata field are user-defined and will differ from user to user. Possible values are strings, numbers, "dates" or even arrays.
var file1 = {
    id: 123, name: "mypicture", owner: 1,
    metadata: {
        people: ["Ben", "Tom"],
        created: 2013/01/01,
        license: "free",
        rating: 4
        ...
    },
    tags: ["tag1", "tag2", "tag3", "tag4"]
}
var file2 = {
    id: 155, name: "otherpicture", owner: 1,
    metadata: {
        people: ["Tom", "Carla"],
        created: 2013/02/02,
        license: "free",
        rating: 4
        ...
    },
    tags: ["tag4", "tag5"]
}
var file1OtherUser = {
    id: 345, name: "mydocument", owner: 2,
    metadata: {
        authors: ["Mike"],
        published: 2013/02/02,
        ...
    },
    tags: ["othertag"]
}
Our users should have the ability to search/filter their files:
User 1: Show all files where "Tom" is in "people" array
User 1: Show all files "created" between 2013/01/01 and 2013/02/01
User 1: Show all files having "license" "free" and "rating" greater 2
User 2: Show all files "published" in "2012" and tagged with "important"
...
Results should be filtered in a way similar to smart folders in OS X. The individual metadata fields are defined before files are uploaded/stored, but they may also change after that, e.g. User 1 may rename the metadata field "people" to "cast".
As @WiredPrairie said, the fields within the metadata field look variable, possibly dependent upon what the user enters, which is supported by:
User 1 may rename the metadata field "people" to "cast".
MongoDB cannot create variable indexes whereby you just say that every new field in metadata gets added to a compound index; however, you could use a key-value type structure like so:
var file1 = {
    id: 123, name: "mypicture", owner: 1,
    metadata: [
        {k: "people", v: ["Ben", "Tom"]},
        {k: "created", v: 2013/01/01},
    ],
    tags: ["tag1", "tag2", "tag3", "tag4"]
}
That is one method of doing this, allowing you to index on both k and v dynamically within the metadata field. You would then query by this like so:
db.col.find({metadata: {$elemMatch: {k: "people", v: ["Ben"]}}})
However, this does introduce another problem: $elemMatch works on top-level values, not nested elements. Imagine you wanted to find all files where "Ben" was one of the people; you can't use $elemMatch here, so you would have to do:
db.col.find({"metadata.k": "people", "metadata.v": "Ben"})
The immediate problem with this query lies in the way MongoDB evaluates it. When it queries the metadata field it effectively says: where some element's "k" equals "people" and some element's "v" equals "Ben".
Since this is a multi-value field, you can run into the problem where, even though "Ben" is not in the people list, he exists in another element of metadata, and so you actually pick out the wrong documents; i.e. this query would pick up:
var file1 = {
    id: 123, name: "mypicture", owner: 1,
    metadata: [
        {k: "people", v: ["Tom"]},
        {k: "created", v: 2013/01/01},
        {k: "person", v: "Ben"}
    ],
    tags: ["tag1", "tag2", "tag3", "tag4"]
}
The only real way to solve this is to factor off the dynamic fields to another collection where you don't have this problem.
This creates a new problem though: you can no longer get a full file in a single round trip, nor can you aggregate both the file row and its user-defined fields in one go. So, all in all, you lose a lot of capability by doing this.
That being said, you can still perform quite a few queries, e.g.:
User 1: Show all files where "Tom" is in "people" array
User 1: Show all files "created" between 2013/01/01 and 2013/02/01
User 1: Show all files having "license" "free" and "rating" greater 2
User 2: Show all files "published" in "2012" and tagged with "important"
All of those would still be possible with this schema.
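For instance, here is a minimal pymongo sketch of the first query against the factored-off layout (the collection names, the file_id/owner fields on the metadata documents, and the use of pymongo are assumptions for illustration, not part of the original answer):

from pymongo import MongoClient

db = MongoClient()["mydb"]               # assumed database name
files = db["files"]
file_metadata = db["file_metadata"]      # one document per (file_id, k, v) entry

# User 1: all files where "Tom" is in the "people" metadata entry.
# First round trip: find the ids of the matching files...
file_ids = file_metadata.distinct(
    "file_id", {"owner": 1, "k": "people", "v": "Tom"})
# ...second round trip: fetch the file documents themselves.
matching_files = list(files.find({"id": {"$in": file_ids}}))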
As for which is better, RDBMS or NoSQL: it is difficult to say here. I would say both could be quite good at querying this structure, if done right.

mongodb find() order is different from schema order

db.blog.save({ title : "My First Post", author: {name : "Jane", id : 1}})
What should the query below return, given that the key order does not match?
db.blog.find({"author" : {"id" : 1, "name" : "Jane"}})
EDIT:
Based on the official MongoDB documentation, the key order must match (at least for findOne()). It won't return the matching object when using db.blog.findOne({"author" : {"id" : 1, "name" : "Jane"}}).
The order of the keys in your query selector is irrelevant. It doesn't need to match the order of the keys you used when adding the document you're searching for.
UPDATE
If you're just looking for an order-independent way to query based on an embedded document, you need to use dot notation:
db.blog.find({"author.id" : 1, "author.name" : "Jane"})
Normally, as @JohnnyHK states, the order of the query keys does not matter, except in the example you have shown:
db.blog.find({"author" : {"id" : 1, "name" : "Jane"}})
This query will only return results that match exactly. Using the query he shows:
db.blog.find({"author.id" : 1, "author.name" : "Jane"})
will be key-order independent. The reason for this difference is that in the first query you are searching by a whole object, so the query engine looks for exactly that object (in the simplest terms). The same applies to indexes created on a field which contains sub-documents: the order does matter.
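A small pymongo sketch makes the difference visible (database/collection names are assumed; Python dicts preserve insertion order, so the exact sub-document query is order-sensitive):

from pymongo import MongoClient

blog = MongoClient()["test"]["blog"]     # assumed database/collection
blog.insert_one({"title": "My First Post",
                 "author": {"name": "Jane", "id": 1}})

# Exact sub-document match: key order must match the stored document,
# so this finds nothing (the stored order is name, then id).
print(blog.find_one({"author": {"id": 1, "name": "Jane"}}))    # None

# Dot notation: key-order independent, matches the document.
print(blog.find_one({"author.id": 1, "author.name": "Jane"}))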
According to the JSON definition, the key order doesn't matter.
An object is an unordered collection of zero or more name/value pairs
I don't know anything about MongoDB, but I assume it follows the normal rules of JSON, at which point it should return the "My First Post" entry.

MongoDB : Update Modifier semantics of "$unset"

In MongoDB, the $unset update modifier works as follows:
Consider a MongoDB database db with a collection users. users contains a document of the following format:
//Document for a user with username: joe
{
    "_id" : ObjectId("4df5b9cf9f9a92b1584fff16"),
    "relationships" : {
        "enemies" : 2,
        "friends" : 33,
        "terminated" : "many"
    },
    "username" : "joe"
}
If I want to remove the terminated key, I have to specify the $unset update modifier as follows:
>db.users.update({"username":"joe"},{"$unset":{"relationships.terminated": "many"}});
My question is: why do I have to specify the ENTIRE KEY-VALUE PAIR for $unset to work, instead of simply specifying:
>db.users.update({"username":"joe"},{"$unset":{"relationships.terminated"}});
Mon Jun 13 13:25:57 SyntaxError: missing : after property id (shell):1
Why not?
EDIT:
If the way to $unset is to specify the entire key-value pair (in accordance with the JSON specification), or to add "1" as the value in the statement, why can't the shell do the "1" substitution itself? Why isn't such a feature provided? Are there any pitfalls to providing such support?
The short answer is because {"relationships.terminated"} is not a valid JSON/BSON object. A JSON object is composed of a key and a value, and {"relationships.terminated"} only has a key (or a value, depending on how you look at it).
Fortunately, to unset a field in Mongo you do not need to supply the actual value of the field you want to remove. You can use any value (1 is commonly used in the Mongo docs), regardless of the actual value of relationships.terminated:
db.users.update({"username":"joe"},{"$unset":{"relationships.terminated" : 1}});