Convert nested JSON to Cassandra

I ran into a problem when trying to convert JSON data to Cassandra.
The JSON data looks like this:
{
  "A": {
    "A_ID": "1111",
    "field1": "value1",
    "field2": "value2",
    "field3": [
      {
        "id": "id1",
        "name": "name1",
        "segment": [
          {
            "segment_id": "segment_id_1",
            "segment_name": "segment_name_1",
            "segment_value": "segment_value_1"
          },
          {
            "segment_id": "segment_id_2",
            "segment_name": "segment_name_2",
            "segment_value": "segment_value_2"
          },
          ...
        ]
      },
      {
        "id": "id2",
        "name": "name2",
        "segment": [
          {
            "segment_id": "segment_id_3",
            "segment_name": "segment_name_3",
            "segment_value": "segment_value_3"
          },
          {
            "segment_id": "segment_id_4",
            "segment_name": "segment_name_4",
            "segment_value": "segment_value_4"
          },
          ...
        ]
      },
      ...
    ]
  }
}
Thank you very much!
I see a post about composite keys here:
https://pkghosh.wordpress.com/2013/07/14/storing-nested-objects-in-cassandra-composite_columns/
But I am not sure what this post means, because the author does not give a complete solution.

In such cases you can keep the entire JSON string in the database.
For example, you can create a table for the "segment" part of the JSON:

CREATE TABLE segment_by_id (
    id text,
    name text,
    json text,
    PRIMARY KEY (id, name)
) WITH CLUSTERING ORDER BY (name DESC);

This way you can split the JSON part by part. You can also save the entire JSON in one table, with partition and clustering keys suited to your needs.
Cassandra uses LZ4 compression by default, so large text values can be stored efficiently. Go through the compression documentation to choose a compression algorithm for the table.
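The splitting step the answer describes can be sketched in plain Python: walk the nested document and emit one (id, name, json) row per "field3" entry, matching the segment_by_id table above. This is a minimal illustration of the flattening only; actually writing the rows to Cassandra (e.g. with the DataStax driver) is left out.

```python
import json

def segment_rows(doc):
    """Flatten the nested document into (id, name, json) tuples,
    one per entry of "field3", shaped like the segment_by_id table."""
    rows = []
    for item in doc["A"]["field3"]:
        # Store the nested "segment" list as an opaque JSON string.
        rows.append((item["id"], item["name"], json.dumps(item["segment"])))
    return rows

# Trimmed-down version of the document from the question.
doc = {
    "A": {
        "A_ID": "1111",
        "field3": [
            {"id": "id1", "name": "name1",
             "segment": [{"segment_id": "segment_id_1"}]},
            {"id": "id2", "name": "name2",
             "segment": [{"segment_id": "segment_id_3"}]},
        ],
    }
}

for row in segment_rows(doc):
    print(row)
```

Each tuple maps directly onto an INSERT into the table above; the segment list stays queryable by re-parsing the json column on read.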

Related

Retrieve specific value from a JSON blob in MS SQL Server, using a property value?

In my DB I have a column storing JSON. The JSON looks like this:
{
  "views": [
    {
      "id": "1",
      "sections": [
        {
          "id": "1",
          "isToggleActive": false,
          "components": [
            {
              "id": "1",
              "values": [
                "02/24/2021"
              ]
            },
            {
              "id": "2",
              "values": []
            },
            {
              "id": "3",
              "values": [
                "5393",
                "02/26/2021 - Weekly"
              ]
            },
            {
              "id": "5",
              "values": [
                ""
              ]
            }
          ]
        }
      ]
    }
  ]
}
I want to create a migration script that will extract a value from this JSON and store it in its own column.
In the JSON above, in the components array, I want to extract the second value from the component with an ID of "3" (among other things, but this is a good example). So I want to extract the value "02/26/2021 - Weekly" and store it in its own column.
I was looking at the JSON_VALUE docs, but I only see examples that specify fixed indexes for the JSON properties. I can't figure out what kind of JSON path I'd need. Is this even possible with JSON_VALUE?
EDIT: To clarify, the views and sections arrays can use static indexes, so I can use views[0].sections[0] for them. Currently, this is all I have for my SQL query:
SELECT *
FROM OPENJSON(@jsonInfo, '$.views[0].sections[0]')
You need to use OPENJSON to break out the inner array, then filter it with a WHERE clause, and finally select the correct value with JSON_VALUE:
SELECT JSON_VALUE(components.value, '$.values[1]')
FROM OPENJSON(@jsonInfo, '$.views[0].sections[0].components') components
WHERE JSON_VALUE(components.value, '$.id') = '3'
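To see why the filter-then-index approach works, the same lookup can be sketched outside of SQL Server. This hypothetical Python mirror of the query above walks to views[0].sections[0].components, filters on id, and takes values[1]:

```python
import json

def component_value(blob, component_id, index):
    """Mirror of the T-SQL: find the component with the given id
    under views[0].sections[0] and return values[index]."""
    doc = json.loads(blob)
    for comp in doc["views"][0]["sections"][0]["components"]:
        if comp["id"] == component_id:
            return comp["values"][index]
    return None

# Trimmed-down version of the JSON from the question.
blob = """{"views": [{"id": "1", "sections": [{"id": "1", "components": [
  {"id": "1", "values": ["02/24/2021"]},
  {"id": "3", "values": ["5393", "02/26/2021 - Weekly"]}
]}]}]}"""

print(component_value(blob, "3", 1))
```

The key point in both versions is the same: the component's position in the array is unknown, so you filter by the id property rather than hard-coding an array index.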

JSON structure with missing fields - Advice needed - Is it good practice?

I have this JSON structure:
{
  "object1": [
    {
      "field1": {
        "first": null,
        "last": ""
      },
      "array1": [
        { "title": "1" },
        { "title": "2" }
      ]
    },
    {
      "array1": [
        { "title": "4" },
        { "title": "5" }
      ]
    },
    ...
  ]
}
Here field1 is missing from the second object, and I also save it this way in a MongoDB database. The reason for this decision is that I only have, and need, field1 on the first object. Is that okay, or should I add field1 to the other objects and just leave it blank?
It appears as if field1 might be some kind of pointer (or reference) to a first and last document/item. Personally (and without knowing what you need to achieve), I would store field1 in each element, with null values where appropriate, as you suggest.
For instance, what happens if elements can be deleted and the first one gets wiped? What would that mean for your structure?
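Either way the data is stored, the reading side can treat a missing field1 and a null field1 uniformly. A minimal Python sketch (the function name is illustrative, not from the question):

```python
def read_field1(obj):
    """Return (first, last) from obj's "field1", treating a missing
    key, a null value, and an empty object all the same way."""
    field1 = obj.get("field1") or {}
    return field1.get("first"), field1.get("last")

# The two shapes from the question's "object1" array.
with_field = {"field1": {"first": None, "last": ""},
              "array1": [{"title": "1"}, {"title": "2"}]}
without_field = {"array1": [{"title": "4"}, {"title": "5"}]}
```

If consumers always go through an accessor like this, the choice between omitting the field and storing it as null becomes an implementation detail rather than a contract.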

Solr: create a core with JSON objects and arrays of JSON objects, similar to Elasticsearch

I want to create a core whose documents look like this:
{
  "dataSet_s": "MYSQL_NEW_W",
  "ruleType_s": "IF_EQUALS_THEN_EQUALS",
  "enable_s": "true",
  "testCaseId_s": "CASE_2",
  "condition": {
    "conClause_s": "IF",
    "conField_s": "ENGINE_RPM",
    "conOperator1_s": "GREATER",
    "conVal1_s": "5000",
    "conVal2_s": "1000"
  },
  "result": [
    {
      "resClause1": "THEN",
      "resField1": "SPEED",
      "resOperator1": "EQUALS",
      "resVal1": "100",
      "resVal2": "200"
    },
    {
      "resClause1": "THEN",
      "resField1": "SPEED",
      "resOperator1": "GREATER",
      "resVal1": "1000",
      "resVal2": "2000"
    }
  ]
}
In Elasticsearch the document keeps the same structure after inserting, but in Solr, when I insert it after creating a simple core, it looks like this:
{
  "dataSet_s": "MYSQL_NEW_W",
  "ruleType_s": "IF_EQUALS_THEN_EQUALS",
  "enable_s": "true",
  "testCaseId_s": "CASE_2",
  "condition.conClause_s": "IF",
  "condition.conField_s": "ENGINE_RPM",
  "condition.conOperator1_s": "GREATER",
  "condition.conVal1_s": "5000",
  "condition.conVal2_s": "1000",
  "result.resClause1": [
    "THEN",
    "THEN"
  ],
  "result.resField1": [
    "SPEED",
    "SPEED"
  ],
  "result.resOperator1": [
    "EQUALS",
    "GREATER"
  ],
  "result.resVal1": [
    100,
    1000
  ],
  "result.resVal2": [
    200,
    2000
  ],
  "id": "ed0af96b-5127-4686-ae4c-26621b941919",
  "_version_": 1541533921419722800
}
Can we store a JSON object, or an array of JSON objects, in Solr the way Elasticsearch does?
There are two aspects to this:
Storing the original structure
Indexing it for search
Elasticsearch has that built in, but Solr does it through configuration, with some flexibility. It is described in the Reference Guide under "Indexing Custom JSON". Basically, the original structure goes into the _src_ field. The indexing can happen in several different ways, but the default just dumps all content into a catch-all text field, which is effectively what Elasticsearch does.
It is important to remember that both use Lucene underneath, which does not support nested structures. So both engines map between the nested representation and what Lucene can actually do.
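The flattened document shown in the question follows a simple pattern: child objects become dotted field names, and arrays of objects collapse into parallel multivalued fields. A rough Python sketch of that mapping (an illustration of the observed behavior, not Solr's actual implementation):

```python
def solr_flatten(doc):
    """One-level sketch of the flattening seen in the question:
    nested objects -> dotted keys, arrays of objects -> parallel
    multivalued fields keyed by "<array>.<field>"."""
    flat = {}
    for key, value in doc.items():
        if isinstance(value, dict):
            # Child object: prefix each field with the parent key.
            for k, v in value.items():
                flat[f"{key}.{k}"] = v
        elif isinstance(value, list) and value and isinstance(value[0], dict):
            # Array of objects: collect each field into a parallel list.
            for item in value:
                for k, v in item.items():
                    flat.setdefault(f"{key}.{k}", []).append(v)
        else:
            flat[key] = value
    return flat
```

The sketch also shows what gets lost: once "result.resOperator1" is just a list, there is no longer any record of which operator belonged to which result object, which is exactly why block-join nested documents or storing the original JSON in _src_ are needed when that association matters.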

JSON: is it best practice to give each element in an array an id attribute?

Is it best practice in JSON to give objects in an array an id, similar to below? I'm trying to decide on a JSON format for a RESTful service I'm implementing, and whether to include it or not. If the data is to be modified by CRUD operations, is it a good idea?
{
  "tables": [
    {
      "id": 1,
      "tablename": "Table1",
      "columns": [
        {
          "name": "Col1",
          "data": "-5767703747778052096"
        },
        {
          "name": "Col2",
          "data": "-5803732544797016064"
        }
      ]
    },
    {
      "id": 2,
      "tablename": "Table2",
      "columns": [
        {
          "name": "Col1",
          "data": "-333333"
        },
        {
          "name": "Col2",
          "data": "-44444"
        }
      ]
    }
  ]
}
Client-Generated IDs
"A server MAY accept a client-generated ID along with a request to create a resource. An ID MUST be specified with an "id" key, the value of which MUST be a universally unique identifier. The client SHOULD use a properly generated and formatted UUID as described in RFC 4122 [RFC4122]."
- jsonapi.org
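Following that guidance, a client-generated id would be a UUID rather than a small integer like the 1 and 2 in the example. A quick Python illustration:

```python
import uuid

# Per the JSON:API wording quoted above, a client-generated "id"
# should be a properly formatted RFC 4122 UUID, not "1", "2", ...
table = {"id": str(uuid.uuid4()), "tablename": "Table1"}
print(table["id"])
```

Server-generated sequential integers remain fine when the server assigns them; the UUID requirement only applies when the client picks the id itself.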

Talend: parse JSON string to multiple output

I'm aware of this question, but I can't believe there is no solution using standard components. I'm using Talend ESB Studio 5.4.
I have to parse a JSON string from a REST web service into multiple outputs and add them to a database.
Database has two tables:
User (user_id, name, card, card_id, points)
Action (user_id, action_id, description, used_point)
My JSON structure is something like this:
{
  "users": [
    {
      "name": "foo",
      "user_id": 1,
      "card": {
        "card_id": "AAA",
        "points": 10
      },
      "actions": [
        {
          "action_id": 1,
          "description": "buy",
          "used_points": 2
        },
        {
          "action_id": 3,
          "description": "buy",
          "used_points": 1
        }
      ]
    },
    {
      "name": "bar",
      "user_id": 2,
      "card": {
        "card_id": "BBB",
        "points": -1
      },
      "actions": [
        {
          "id": 2,
          "description": "sell",
          "used_point": 5
        }
      ]
    }
  ]
}
I have tried adding a JSON Schema metadata entry, but it is not clear to me how to "flatten" the JSON. I have tried looking at tXMLMap and tExtractJSONFields, but no luck so far.
I also had a look at tJavaRow, but I don't understand how to build a schema for it.
It's a pity, because until now I have been loving Talend! Any advice?
You can save the JSON to a file on disk, then create a new JSON file entry in the metadata of Talend Studio; the wizard retrieves the schema for you. After saving, you can copy that schema into a generic schema in the metadata, and it's done: use the generic schema wherever you want, for example in the tRestClient component.
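Independently of which Talend components end up doing it, the flattening itself is straightforward: each user becomes one User row, and each of their actions becomes one Action row carrying the user's id. A Python sketch of that split (the .get fallbacks are there because the sample JSON is inconsistent, using both "id"/"action_id" and "used_point"/"used_points"):

```python
def split_users(doc):
    """Split the nested payload into rows for the two tables:
    User (user_id, name, card_id, points) and
    Action (user_id, action_id, description, used_points)."""
    users, actions = [], []
    for u in doc["users"]:
        card = u.get("card", {})
        users.append((u["user_id"], u["name"],
                      card.get("card_id"), card.get("points")))
        for a in u.get("actions", []):
            actions.append((
                u["user_id"],
                a.get("action_id", a.get("id")),        # key varies in the sample
                a.get("description"),
                a.get("used_points", a.get("used_point")),  # key varies too
            ))
    return users, actions

# The payload from the question, trimmed to one action per user.
doc = {"users": [
    {"name": "foo", "user_id": 1, "card": {"card_id": "AAA", "points": 10},
     "actions": [{"action_id": 1, "description": "buy", "used_points": 2}]},
    {"name": "bar", "user_id": 2, "card": {"card_id": "BBB", "points": -1},
     "actions": [{"id": 2, "description": "sell", "used_point": 5}]},
]}

users, actions = split_users(doc)
```

In Talend terms this corresponds to extracting two flows from the same input (for example, two tExtractJSONFields reading the users and users.actions loops) and sending each flow to its own database output component.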