Is there a way to enrich a JSON field in MySQL?

Let's take a simple schema with two tables: one that describes a simple entity, item (id, name),
id | name
------------
1 | foo
2 | bar
and another, let's call it collection, that references an item, but inside a JSON object, something like
{
items: [
{
id: 1,
quantity: 2
}
]
}
I'm looking for a way to enrich this field in the collection with the referenced item (similar to populate in Mongo), so that I retrieve something like
{
...
items: [
{
item: {
id: 1,
name: foo
},
quantity: 2
}
]
...
}
If you have a solution for PostgreSQL, I'll take it as well.

If I understood correctly, you want to expose the JSON data as a MySQL table so that you can keep working with JSON while leveraging the power of SQL.
MySQL 8 introduced the JSON_TABLE function. With it, you can store your JSON directly in a table column and then query its contents as rows, like any other table, which means you can join them against the item table.
It should cover your immediate case, but it means your table schema will have a JSON column instead of traditional MySQL columns. You will need to check whether that suits your purpose.
This is a good tutorial on the subject.
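If doing the join in SQL is not an option, the same "populate" step can be sketched in application code. This is only a minimal sketch and not the JSON_TABLE approach itself; the function name enrich_collection and the in-memory items_by_id lookup are assumptions made for illustration, standing in for rows fetched from the item table.

```python
import json


def enrich_collection(collection, items_by_id):
    """Return a copy of the collection with each item reference expanded.

    `collection` is the parsed JSON document,
    e.g. {"items": [{"id": 1, "quantity": 2}]}.
    `items_by_id` maps an item id to its row from the item table
    (a hypothetical lookup built by the caller).
    """
    enriched = dict(collection)
    enriched["items"] = [
        {
            # Replace the bare id with the full item row ...
            "item": items_by_id[entry["id"]],
            # ... and keep every other key (such as quantity) as-is.
            **{k: v for k, v in entry.items() if k != "id"},
        }
        for entry in collection.get("items", [])
    ]
    return enriched


# Rows as they might come back from `SELECT id, name FROM item`.
items_by_id = {
    1: {"id": 1, "name": "foo"},
    2: {"id": 2, "name": "bar"},
}
collection = {"items": [{"id": 1, "quantity": 2}]}

print(json.dumps(enrich_collection(collection, items_by_id)))
```

The original document is left untouched; an unknown item id raises KeyError, which may or may not be the behaviour you want.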

Related

Get parent and child records from different tables in MySql (one to many)

I have a parent table called orders and a child table called order_items. I need to fetch a customer's orders and the corresponding order items for each order. Below is the schema:
Orders:
Id | Order_date | Order_value
-------------------------------
1234 | 2016-12-12 | 700.00
Order Items:
Id | ProductId | OrderId
----------------------------
8847 | shirt_blue | 1234
8848 | shirt_red | 1234
I need an end result in JSON which looks like this:
{
id: '1234',
order_date: '2016-12-12',
order_value: 700.00,
Items: [
{id: '8847', productid: 'shirt_blue' },
{id: '8848', productid: 'shirt_red' }
]
}
What is the cleanest and fastest way to get these records from MySQL? I am using NodeJS/Node-MySql, if it makes any difference.
Currently I fetch the order records first, loop over them in my app layer, fetch the order item records for each one and append them to the order. This is not a very efficient way to do things.
You can give Sequelize a shot; it is an ORM built for Node.js that supports MySQL, PostgreSQL, SQLite and MSSQL.
With Sequelize, you can create models that reflect database tables. The first step is to instantiate Sequelize itself:
var sequelize = new Sequelize('mysql://user:pass@example.com:3306/dbname');
Then you are able to create models:
var Order = sequelize.define('Order', {
order_date: Sequelize.DATE,
order_value: Sequelize.STRING
});
Order.sync(); // this is run in order to reflect the changes in your database
You can also define relations between models (database tables) like this:
var OrderItem = sequelize.define('OrderItem', {
// attributes etc...
});
OrderItem.belongsTo(Order); // this would add OrderId attribute to OrderItem, which is a Foreign Key to Order table
Order.hasMany(OrderItem); // Likewise
Once the relations are defined, you can fetch the OrderItems of a given Order:
// Note: findById was renamed to findByPk in Sequelize v5
Order.findById(1).then(function(order){
order.getOrderItems().then(function(orderItems){
// here you get all order items assigned to given order
});
});
This is a very convenient solution for creating models (tables) and migrating them. You can also write custom SQL statements and run them directly via the sequelize instance.
Check out the documentation; it contains lots of examples and describes most use cases precisely.
Newer versions of MySQL (5.7.22 and later) can return JSON results directly, using functions such as JSON_OBJECT and JSON_ARRAYAGG.
This lets you return nested child arrays.
The benefit is that you can write nested SQL queries (which the database can optimize) instead of manually fetching all the child elements in Node.js.
See: How do I generate nested JSON objects using MySQL native JSON functions?
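Where the MySQL JSON functions are not available, a single JOIN plus grouping in application code also avoids the per-order queries. Below is a minimal sketch of that grouping step, written in Python for brevity even though the question uses Node.js. The row tuples are assumed to come from a query along the lines of SELECT o.Id, o.Order_date, o.Order_value, i.Id, i.ProductId FROM orders o LEFT JOIN order_items i ON i.OrderId = o.Id, and nest_orders is a hypothetical helper name.

```python
def nest_orders(rows):
    """Group flat (order, item) JOIN rows into one dict per order."""
    orders = {}
    for order_id, order_date, order_value, item_id, product_id in rows:
        # Create the order entry the first time its id is seen.
        order = orders.setdefault(order_id, {
            "id": order_id,
            "order_date": order_date,
            "order_value": order_value,
            "items": [],
        })
        # A LEFT JOIN yields NULL item columns for orders without items.
        if item_id is not None:
            order["items"].append({"id": item_id, "productid": product_id})
    return list(orders.values())


# Sample rows matching the tables in the question.
rows = [
    (1234, "2016-12-12", 700.00, 8847, "shirt_blue"),
    (1234, "2016-12-12", 700.00, 8848, "shirt_red"),
]
result = nest_orders(rows)
```

Each order appears once in the result, with its items collected in insertion order.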

How do I query a complex JSONB field in Django 1.9

I have a table item with a field called data of type JSONB. I would like to query all items that have text that equals 'Super'. I am trying to do this currently by doing this:
Item.objects.filter(Q(data__areas__texts__text='Super'))
Django debug toolbar is reporting the query used for this is:
WHERE "item"."data" #> ARRAY['areas', 'texts', 'text'] = '"Super"'
But I'm not getting back any matching results. How can I query this using Django? If it's not possible in Django, how can I query this in PostgreSQL?
Here's an example of the contents of the data field:
{
"areas": [
{
"texts": [
{
"text": "Super"
}
]
},
{
"texts": [
{
"text": "Duper"
}
]
}
]
}
Try Item.objects.filter(data__areas__0__texts__0__text='Super')
It is not an exact answer, since it only checks the first element of each array, but it illustrates some of the jsonb filter features. Also read the Django docs.
I am not sure what you want to achieve with this structure, but I was only able to get the desired result with a rather strange raw query, which looks like this:
Item.objects.raw("SELECT id, data FROM (SELECT id, data, jsonb_array_elements(\"table_name\".\"data\" #> '{areas}') as areas_data from \"table_name\") foo WHERE areas_data #> '{texts}' @> '[{\"text\": \"Super\"}]'")
Don't forget to change table_name in the query (in your case it should be yourappname_item).
I am not sure you can use this query in a real program, but it may help you find a better solution.
Also, there is a very good intro to the jsonb query syntax.
Hope it helps.
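The original filter most likely finds nothing because a path such as areas, texts, text is followed one step at a time, and an array can only be entered by index, never by key. The sketch below is plain Python that imitates this path-walking rule; it does not talk to PostgreSQL, and lookup_path is a hypothetical helper written only for illustration.

```python
def lookup_path(doc, path):
    """Follow a list of path steps through nested dicts and lists.

    Object keys are matched by name; array elements only by integer index.
    Returns None as soon as a step cannot be followed.
    """
    current = doc
    for step in path:
        if isinstance(current, dict):
            if step not in current:
                return None
            current = current[step]
        elif isinstance(current, list):
            try:
                current = current[int(step)]
            except (ValueError, IndexError):
                # A key name used against an array, or an index out of range.
                return None
        else:
            return None
    return current


data = {
    "areas": [
        {"texts": [{"text": "Super"}]},
        {"texts": [{"text": "Duper"}]},
    ]
}

# The path generated by data__areas__texts__text: stops at the first array.
no_index = lookup_path(data, ["areas", "texts", "text"])
# The path generated by data__areas__0__texts__0__text: reaches the value.
with_index = lookup_path(data, ["areas", "0", "texts", "0", "text"])
```

With explicit indexes only one position is checked. To match at any position you would need a containment test instead; in Django that is probably the contains lookup on the JSON field, which should map to PostgreSQL's @> operator, but verify it against your Django version.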

Active Record: JSON Query

Inside my database model, I've got a json field which has the following structure:
json_field: {"data"=>{"key_1"=>"value1", "key_2"=>"value"} }
Trying to query this using select:
Model.select(:id, "json_field -> 'data'")
Model.select(:id, "json_field -> 'data' as data")
yields the array of objects, but without the json field selected.
#<ActiveRecord::Relation [#<Model id: 1, Model id: 2 ...>]
Thanks for any help.
This:
#<ActiveRecord::Relation [#<Model id: 1, Model id: 2 ...>]
is the result of calling inspect on the query, and inspect only displays columns that the model knows about. The model queries the table for its columns during startup, so it only knows about columns that are actually in the table.
ActiveRecord creates column accessor methods on the fly using method_missing, so it can create methods for things in a query that aren't columns in the actual table.
So your data is there; you just have to ask for it by name, for example:
Model.select(:id, "json_field -> 'data' as data").map(&:data)
will give you the data values.

Using JSON-based Database for unordered data

I am working on a simple app for Android. I am having some trouble using the Firebase database since it uses JSON objects and I am used to relational databases.
My data will consist of two users that share a value. In a relational database this would be represented in a table like this:
uname1 | uname2 | shared_value
In which the usernames are the keys. If I wanted all the values user Bob shares with other users, I could do a simple union statement that would return the rows where:
uname1 == Bob or uname2 == Bob
However, in JSON databases, there seems to be a tree-like hierarchy in the data, which is complicated since I would not be able to search for users at the top level. I am looking for help in how to do this or how to structure my database for best efficiency if my most common search will be one similar to the one above.
In case this is not enough information, I will elaborate: My database would be structured like this:
{
  "Bob": {
    "Alice": {
      "shared_value": 2
    }
  },
  "Cece": {
    "Bob": {
      "shared_value": 4
    }
  }
}
As you can see from the example, Bob is included in two relationships, but looking into Bob's node doesn't show that information. (The relationship is commutative, so who is "first" cannot be predicted.)
The most intuitive way to fix this would be duplicate all data. For example, when we add Bob->Alice->2, also add Alice->Bob->2. In my experience with relational databases, duplication could be a big problem, which is why I haven't done this already. Also, duplication seems like an inefficient fix.
Is there a reason why you don't invert this? How about a collection like:
{ "_id": 2, "usernames":[ "Bob", "Alice"]}
{ "_id": 4, "usernames":[ "Bob", "Cece"]}
If you need all the values for "Bob", then index on "usernames".
EDIT:
If you need the two usernames to be a unique key, then do something like this:
{ "_id": {"uname1":"Bob", "uname2":"Alice"}, "value": 2 }
But this would still permit the creation of:
{ "_id": {"uname1":"Alice", "uname2":"Bob"}, "value": 78 }
(This issue is also present in your as-is relational model, btw. How do you handle it there?)
In general, I think implementing an array by creating multiple columns with names like "attr1", "attr2", "attr3", etc. and then having to search them all for a possible value is an artifact of relational table modeling, which does not support array values. If you are converting to a document-oriented storage, these really should be an embedded list of values, and you should use the document paradigm and model them as such, instead of just reimplementing your table rows as documents.
You can still keep the old structure:
[
{ username: 'Bob', username2: 'Alice', value: 2 },
{ username: 'Cece', username2: 'Bob', value: 4 },
]
You may want to create indexes on 'username' and 'username2' for performance. And then just do the same union.
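As a rough illustration of "the same union" over this flat structure, here is a minimal in-memory sketch. The list stands in for the stored records, and shared_with is a hypothetical helper; a real database would instead run two indexed queries (one per username field) and merge the results.

```python
relationships = [
    {"username": "Bob", "username2": "Alice", "value": 2},
    {"username": "Cece", "username2": "Bob", "value": 4},
]


def shared_with(user, records):
    """Return (other_user, value) pairs for every record involving `user`."""
    pairs = []
    for record in records:
        # The relationship is commutative, so check both sides.
        if record["username"] == user:
            pairs.append((record["username2"], record["value"]))
        elif record["username2"] == user:
            pairs.append((record["username"], record["value"]))
    return pairs


bobs_values = shared_with("Bob", relationships)
```

Both of Bob's relationships are found regardless of which side of the pair he was stored on.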
To create a tree-like structure, the best way is to create an "ancestors" array that stores all the ancestors of a particular entry. That way you can query for either ancestors or descendants and all documents that are related to a particular value in the tree. Using your example, you would be able to search for all descendants of Bob's, or any of his ancestors (and related documents).
The answer above suggests:
{ "_id": {"uname1":"Bob", "uname2":"Alice"}, "value": 2 }
That is correct. But you don't get to see the relationship between Bob and Cece with this design. My suggestion, which is from Mongo, is to store ancestor keys in an ancestor array.
{ "_id": {"uname1":"Bob", "uname2":"Alice"}, "value": 2 , "ancestors": [{uname: "Cece"}]}
With this design you still get duplicates, which is something that you do not want. I would design it like this:
{"username": "Bob", "ancestors": [{"username": "Cece", "shared_value": 4}]}
{"username": "Alice", "ancestors": [{"username": "Bob", "shared_value": 2}, {"username": "Cece"}]}

JSON path parent object, or equivalent MongoDB query

I am selecting nodes in a JSON input but can't find a way to include parent object details for each array entry that I am querying. I am using Pentaho Data Integration to query the data, with a JSON Input step fed from a MongoDB input.
I have also tried to write a MongoDB query to achieve the same thing, but cannot seem to do that either.
Here are the two fields/paths that display the data:
$.size_break_costs[*].size
$.size_break_costs[*].quantity
Here is the json source format:
{
"_id" : ObjectId("4f1f74ecde074f383a00000f"),
"colour" : "RAVEN-SMOKE",
"name" : "Authority",
"size_break_costs" : [
{
"quantity" : NumberLong("80"),
"_id" : ObjectId("518ffc0697eee36ff3000002"),
"size" : "S"
},
{
"quantity" : NumberLong("14"),
"_id" : ObjectId("518ffc0697eee36ff3000003"),
"size" : "M"
},
{
"quantity" : NumberLong("55"),
"_id" : ObjectId("518ffc0697eee36ff3000004"),
"size" : "L"
}
],
"sku" : "SK3579"
}
I currently get the following results:
S,80
M,14
L,55
I would like to get the SKU and Name as well as my source will have multiple products (SKU/Description):
SK3579,Authority,S,80
SK3579,Authority,M,14
SK3579,Authority,L,55
When I try to include $.sku, the process errors out.
The end result I'm after is a report of all products and the available quantities of their various sizes. Possibly there's an alternative MongoDB query that provides this.
EDIT:
It seems the issue may be due to the fact that not all records have the same structure. For example, the one above contains 3 sizes: S, M, L. Some products come in one size: PACK. Others come in multiple sizes: 28, 30, 32, 33, 34, 36, 38, etc.
The error produced is:
The data structure is not the same inside the resource! We found 1 values for json path [$.sku], which is different that the number retourned for path [$.size_break_costs[*].quantity] (7 values). We MUST have the same number of values for all paths.
I have tried the following mongodb query separately which gives the correct results, but the corresponding export of this doesn't work. No values are returned for the Size and Quantity.
Query:
db.product_details.find( {}, {sku: true, "size_break_costs.size": true, "size_break_costs.quantity": true}).pretty();
Export:
mongoexport --db brandscope_production --collection product_details --csv --out Test01.csv --fields sku,"size_break_costs.size","size_break_costs.quantity" --query '{}';
Shortly after I added my own bounty, I figured out the solution. My problem has the same basic structure: a parent identifier and some number N of child key/value pairs for ratings (quality, value, etc.).
First, you'll need a JSON Input step that gets the SKU, Name, and size_break_costs array, all as Strings. The important part is that size_break_costs is a String, which is basically just a stringified JSON array. Under the Content tab of the JSON Input, make sure that "Ignore missing path" is checked, in case you get a record with an empty array or the field is missing for some reason.
For your fields, use:
Name | Path | Type
ProductSKU | $.sku | String
ProductName | $.name | String
SizeBreakCosts | $.size_break_costs | String
I added a "Filter rows" block after this step, with the condition "SizeBreakCosts IS NOT NULL", whose output is then passed to a second JSON Input block. In this second JSON block, you'll need to check "Source is defined in a field?" and set the value of "Get source from field" to "SizeBreakCosts", or whatever you named it in the first JSON Input block.
Again, make sure "Ignore missing path" is checked, as well as "Ignore empty file". From this block, we'll want to get two fields. We'll already have ProductSKU and ProductName with each row that's passed in, and this second JSON Input step will further split it into however many rows are in the SizeBreakCosts input JSON. For fields, use:
Name | Path | Type
Quantity | $.[*].quantity | Integer
Size | $.[*].size | String
As you can see, these paths use "$.[*].FieldName", because the JSON string we passed in has an array as the root item, so we're getting every item in that array, and parsing out its quantity and size.
Now every row should have the SKU and name from the parent object, and the quantity and size from each child object. Dumping this example to a text file, I got:
ProductSKU;ProductName;Size;Quantity
SK3579;Authority;S; 80
SK3579;Authority;M; 14
SK3579;Authority;L; 55
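For readers who want to check the logic outside of Pentaho, the same two-pass idea can be sketched in plain Python. This is only an illustration of the flow described above, not how PDI implements it; flatten_product is a hypothetical helper, and the sample document is a simplified copy of the one in the question (ObjectId and NumberLong wrappers dropped).

```python
import json


def flatten_product(product):
    """Yield one (sku, name, size, quantity) row per size_break_costs entry."""
    # Pass 1: read the parent fields and keep the child array as a JSON
    # string, as the first JSON Input step does.
    sku = product.get("sku")
    name = product.get("name")
    size_break_costs = product.get("size_break_costs")
    raw = json.dumps(size_break_costs) if size_break_costs is not None else None

    # "Filter rows": skip products that have no size_break_costs at all.
    if raw is None:
        return

    # Pass 2: parse the string again and emit one row per array element,
    # repeating the parent fields on every row.
    for entry in json.loads(raw):
        yield (sku, name, entry.get("size"), entry.get("quantity"))


product = {
    "sku": "SK3579",
    "name": "Authority",
    "size_break_costs": [
        {"quantity": 80, "size": "S"},
        {"quantity": 14, "size": "M"},
        {"quantity": 55, "size": "L"},
    ],
}
rows = list(flatten_product(product))
```

Because the parent fields are read once and the children are expanded afterwards, the number of sizes per product no longer has to match anything.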