How to index multidimensional arrays in couchdb - json

I have a multidimensional array that I want to index with CouchDB (really using Cloudant). I have users which have a list of the teams that they belong to. I want to search to find every member of that team. So, get me all the User objects that have a team object with id 79d25d41d991890350af672e0b76faed. I tried to make a json index on "Teams.id", but it didn't work because it isn't a straight array but a multidimensional array.
User
{
"_id": "683be6c086381d3edc8905dc9e948da8",
"_rev": "238-963e54ab838935f82f54e834f501dd99",
"type": "Feature",
"Kind": "Profile",
"Email": "gc#gmail.com",
"FirstName": "George",
"LastName": "Castanza",
"Teams": [
{
"id": "79d25d41d991890350af672e0b76faed",
"name": "First Team",
"level": "123"
},
{
"id": "e500c1bf691b9cfc99f05634da80b6d1",
"name": "Second Team Name",
"level": ""
},
{
"id": "4645e8a4958421f7d843d9b34c4cd9fe",
"name": "Third Team Name",
"level": "123"
}
],
"LastTeam": "79d25d41d991890350af672e0b76faed"
}

This is a lot like my response at Cloudant Selector Query but here's the deal, applied to your question:
The easiest way to run this query is using "Cloudant Query" (or "Mango", as it's called in the forthcoming CouchDB 2.0 release) -- and not the traditional MapReduce view indexing system in CouchDB. (This blog covers the differences: https://cloudant.com/blog/mango-json-vs-text-indexes/ and this one is an overview: https://developer.ibm.com/clouddataservices/2015/11/24/cloudant-query-json-index-arrays/).
Here's what your CQ index should look like:
{
"index": {
"fields": [
{"name": "Teams.[].id", "type": "string"}
]
},
"type": "text"
}
And what the subsequent query looks like:
{
"selector": {
"Teams": {"$elemMatch": {"id": "79d25d41d991890350af672e0b76faed"}}
},
"fields": [
"_id",
"FirstName",
"LastName"
]
}
You can try it yourself in the "Query" section of the Cloudant dashboard or via curl with something like this:
curl -H "Content-Type: application/json" -X POST -d '{"selector":{"Teams":{"$elemMatch":{"id":"79d25d41d991890350af672e0b76faed"}}},"fields":["_id","FirstName","LastName"]}' https://broberg.cloudant.com/teams_test/_find
That database is world-readable, so you can see the sample documents I created in there here: https://broberg.cloudant.com/teams_test/_all_docs?include_docs=true
Dig the Seinfeld theme :D
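For anyone scripting this rather than using curl, here is a minimal Python sketch of the same request, using only the standard library. The helper names `build_team_query` and `build_find_request` are made up for illustration, and the request is only constructed here, not sent.

```python
import json
import urllib.request


def build_team_query(team_id):
    """Build the Cloudant Query selector for users on the given team."""
    return {
        "selector": {"Teams": {"$elemMatch": {"id": team_id}}},
        "fields": ["_id", "FirstName", "LastName"],
    }


def build_find_request(db_url, team_id):
    """Prepare (but do not send) a POST to the database's _find endpoint."""
    body = json.dumps(build_team_query(team_id)).encode("utf-8")
    return urllib.request.Request(
        db_url.rstrip("/") + "/_find",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_find_request(
    "https://broberg.cloudant.com/teams_test",
    "79d25d41d991890350af672e0b76faed",
)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

Passing `request` to `urllib.request.urlopen` would actually run the query, assuming the database is still reachable.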

You simply need to loop through the Teams array and emit a view entry for each of the teams.
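function (doc) {
  // Only index profile documents that actually have a Teams array.
  if (doc.Kind === "Profile" && doc.Teams) {
    for (var i = 0; i < doc.Teams.length; i++) {
      var team = doc.Teams[i];
      // One view row per team membership, keyed on the team id.
      emit(team.id, [doc.FirstName, doc.LastName]);
    }
  }
}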
You can then query for all profiles with a specific team id by keying on the team id like this
.../view?key="79d25d41d991890350af672e0b76faed"
giving
{"total_rows":7,"offset":2,"rows":[
{"id":"0d15041f43b43ae07e8faa737f00032c","key":"79d25d41d991890350af672e0b76faed","value":["Adam","Alpha"]},
{"id":"68779729be3610fd8b52b22574000ae8","key":"79d25d41d991890350af672e0b76faed","value":["Bob","Bravo"]},
{"id":"9f97f1565f03aebae9ca73e207001ee1","key":"79d25d41d991890350af672e0b76faed","value":["Chuck","Charlie"]}
]}
or you can include the actual profiles in the result by adding &include_docs=true to the query.
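Note that the `key` parameter has to be a JSON-encoded value, which is why the team id is wrapped in double quotes in the URL. A small Python sketch of building such a URL follows; the view location used here (`mydb`, `profiles`, `by_team`) is purely hypothetical, since it depends on where you saved the map function.

```python
import json
from urllib.parse import urlencode


def view_query_url(view_url, team_id, include_docs=False):
    """Return the view URL with a JSON-encoded key parameter."""
    # CouchDB expects the key as JSON, so a string key keeps its double quotes.
    params = {"key": json.dumps(team_id)}
    if include_docs:
        params["include_docs"] = "true"
    return view_url + "?" + urlencode(params)


# Hypothetical view location; substitute your own database, design doc, and view.
VIEW_URL = "http://localhost:5984/mydb/_design/profiles/_view/by_team"
url = view_query_url(VIEW_URL, "79d25d41d991890350af672e0b76faed", include_docs=True)
print(url)
```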

Related

replace "key" name in whole JSON python for bulk data in efficient way

I am pushing data to another system, but before pushing I have to change the key names throughout the JSON. The JSON may contain 200, 10,000, or 250,000 records.
sample JSON:
{
"insert": "table",
"contacts": [
{
"testName": "testname",
"ContactID": 212121
},
{
"testName": "testname",
"ContactID": 2146354564
},
{
"testName": "testname",
"ContactID": 12312
},
{
"testName": "testname",
"ContactID": 211221
},
{
"testName": "testname",
"ContactID": 10218550
}
]
}
I need to change the keys in the contacts array. The contacts may arrive in bulk, so I need to do this efficiently, with minimal complexity.
The above JSON should be converted as below:
{
"insert": "table",
"contacts": [
{
"name": "testname",
"phone": 212121
},
{
"name": "testname",
"phone": 2146354564
},
{
"name": "testname",
"phone": 12312
},
{
"name": "testname",
"phone": 211221
},
{
"name": "testname",
"phone": 10218550
}
]
}
Here is the code I am trying, using a loop:
ini_dict = request.data
contact_data = ini_dict['contacts']
for i in contact_data:
    i['name'] = i.pop('testName')
    i['phone'] = i.pop('ContactID')
print(contact_data)
Please suggest how I can change the key names efficiently for bulk data, i.e. for 50,000 entries in contacts. I am worried that a for loop will cause a performance issue, so please let me know an efficient way to achieve this.
I don't know how fast you need it to be, nor how you are choosing to store your JSON. One simple solution is to keep it as a string and replace every instance of the attribute names.
# Something like this, using a JSON string.
# str.replace returns a new string, so the result must be assigned back.
jsonstring = jsonstring.replace('"testName":', '"name":')
jsonstring = jsonstring.replace('"ContactID":', '"phone":')
If you want to do this in bulk, you may need to create a batch process that fetches multiple existing records and changes them at once. I have done this before with the Java equivalent of https://pypi.org/project/JayDeBeApi/, but that was more for modifying existing records in a database.
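For comparison, here is a sketch of the rename done on the parsed data instead of on the string. It is a single linear pass over the contacts, and unlike string replacement it cannot accidentally touch matching text inside values. `KEY_MAP` and `rename_contact_keys` are illustrative names, not part of any library, and whether this is fast enough for your volume is something you would need to measure.

```python
KEY_MAP = {"testName": "name", "ContactID": "phone"}


def rename_contact_keys(payload, key_map=KEY_MAP):
    """Return a copy of payload whose contacts use the renamed keys."""
    renamed = [
        # Keys missing from key_map are kept unchanged.
        {key_map.get(key, key): value for key, value in contact.items()}
        for contact in payload["contacts"]
    ]
    return {**payload, "contacts": renamed}


sample = {
    "insert": "table",
    "contacts": [
        {"testName": "testname", "ContactID": 212121},
        {"testName": "testname", "ContactID": 2146354564},
    ],
}
result = rename_contact_keys(sample)
print(result)
```

The original `sample` is left untouched, which makes it easy to compare before and after while testing.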

How to query deep nested json array from couchbase?

How do I query a deeply nested JSON array in Couchbase? I have the following documents in a Couchbase bucket. I need a query that lists all apps that have the permission "android.permission.BATTERY_STATS".
How to query to list all apps with permissions from nested json array?
My Json Documents,
Document:1
{
"data": {
"com.facebook.katana": {
"studioId": "Facebook",
"screenshotUrls": [
"https://lh3.googleusercontent.com/JcPdPqplBxgG6dEQuxvuhO4jvE64AzxOCGWe8w55dMMeXU4rZs2MwpfGQTWvv6QR-g",
"https://lh3.googleusercontent.com/w0kSYY7jlPjGDd3KEVgDTpzUf4k67G7rfELOf6qj1SSC7n6Ege44vp8QkeX57ZM6bFU"
],
"primaryCategoryName": "Social",
"studioName": "Facebook",
"description": "Keeping up with friends is faster and easier than ever. Share updates and photos, engage with friends and Pages, and stay connected to communities important to you",
"starRatings": {
"1": 9706642,
"2": 3384344,
"3": 7224416,
"4": 12323358,
"5": 49438051
},
"numDownloads": "1,000,000,000+ downloads",
"price": 0,
"permissions": [
"android.permission.ACCESS_COARSE_LOCATION",
"android.permission.ACCESS_FINE_LOCATION",
"android.permission.ACCESS_NETWORK_STATE",
"android.permission.ACCESS_WIFI_STATE",
"android.permission.AUTHENTICATE_ACCOUNTS",
"android.permission.BATTERY_STATS",
"android.permission.BLUETOOTH",
"android.permission.READ_PHONE_STATE",
"android.permission.READ_PROFILE",
"android.permission.READ_SMS",
"android.permission.READ_SYNC_SETTINGS",
"com.nokia.pushnotifications.permission.RECEIVE",
"com.sec.android.provider.badge.permission.READ",
"com.sec.android.provider.badge.permission.WRITE",
"com.sonyericsson.home.permission.BROADCAST_BADGE"
],
"appId": "com.facebook.katana",
"userRatingCount": 82076811,
"currency": "USD",
"iconUrl": "https://lh3.googleusercontent.com/ZZPdzvlpK9r_Df9C3M7j1rNRi7hhHRvPhlklJ3lfi5jk86Jd1s0Y5wcQ1QgbVaAP5Q=w100",
"releaseDate": "Nov 14, 2018",
"appName": "Facebook",
"studioUrl": "https://www.facebook.com/facebook",
"hasInAppPurchases": 1,
"bundleId": "com.facebook.katana",
"version": "198.0.0.53.101",
"commentCount": 22211936,
"fileSizeBytes": 58044343,
"formattedPrice": "",
"categoryIds": [
"APPLICATION",
"SOCIAL"
],
"tagline": "Find friends, watch live videos, play games & save photos in your social network",
"averageUserRating": 4.0770621299744,
"primaryCategoryId": "SOCIAL",
"videoScreenUrl": "https://lh4.ggpht.com/3RG_Y8JPK0Hcyui9OcapiONP_aDWKTRZ50wqZW_wbyOF0FamAYEYZfMTW9Cs1OT1kA"
}
},
"response_msec": 11,
"status": 200
}
Document:2
{
"data": {
"com.whatsapp": {
"studioId": "WhatsApp Inc.",
"screenshotUrls": [
"https://lh3.googleusercontent.com/MMue08byixTw74ST_VkNQDUUJBgVEbjNHDYLhIuHmYhMIMJIp3KjVlnhhqZQOZUtNt8",
"https://lh3.googleusercontent.com/foFmwvVGIwWWXJIukN7png18lFjFgbw3K7BqIm8G-jsFgSTVtkCa-dDkFApUzbvzIvbe"
],
"primaryCategoryName": "Communication",
"studioName": "WhatsApp Inc.",
"description": "WhatsApp Messenger is a FREE messaging app available for Android and other smartphones.",
"starRatings": {
"1": 4713598,
"2": 1917919,
"3": 4962745,
"4": 11307648,
"5": 55392894
},
"numDownloads": "1,000,000,000+ downloads",
"price": 0,
"permissions": [
"android.permission.ACCESS_COARSE_LOCATION",
"android.permission.ACCESS_FINE_LOCATION",
"android.permission.ACCESS_NETWORK_STATE",
"android.permission.ACCESS_WIFI_STATE",
"android.permission.AUTHENTICATE_ACCOUNTS",
"android.permission.BLUETOOTH",
"android.permission.BROADCAST_STICKY",
"android.permission.CAMERA",
"android.permission.CHANGE_WIFI_STATE",
"android.permission.GET_ACCOUNTS",
"android.permission.GET_TASKS",
"android.permission.INSTALL_SHORTCUT",
"android.permission.INTERNET",
"android.permission.MANAGE_ACCOUNTS",
"com.whatsapp.permission.REGISTRATION",
"com.whatsapp.permission.VOIP_CALL",
"com.whatsapp.sticker.READ"
],
"appId": "com.whatsapp",
"userRatingCount": 78294804,
"currency": "USD",
"iconUrl": "https://lh6.ggpht.com/mp86vbELnqLi2FzvhiKdPX31_oiTRLNyeK8x4IIrbF5eD1D5RdnVwjQP0hwMNR_JdA=w100",
"releaseDate": "Nov 5, 2018",
"appName": "WhatsApp Messenger",
"studioUrl": "http://www.whatsapp.com/",
"bundleId": "com.whatsapp",
"version": "2.18.341",
"commentCount": 19763316,
"fileSizeBytes": 23857699,
"formattedPrice": "",
"categoryIds": [
"APPLICATION",
"COMMUNICATION"
],
"tagline": "Simple. Personal. Secure.",
"averageUserRating": 4.4145045280457,
"primaryCategoryId": "COMMUNICATION",
"videoScreenUrl": "https://lh3.ggpht.com/aZrXAunkovhf0630Ykz1A7h2rzFX_dErd6fRiB7fNKU_DkNtetTquEra1bjc3sR2kLs"
}
},
"response_msec": 15,
"status": 200
}
As I say in the comment, this is a tricky one. I'm going to try to simplify your docs first, and then give an answer that I came up with.
You have two docs, which contain a nested object with a permissions array. Each nested object has a (potentially) different name. So, let's assume we have two simple docs like this:
id: doc1
{
"foo": {
"permissions": [
"android.permission.ACCESS_COARSE_LOCATION",
"android.permission.BATTERY_STATS"
]
}
}
id: doc2
{
"bar": {
"permissions": [
"android.permission.ACCESS_FINE_LOCATION"
]
}
}
The first one has a "foo" nested object, the second has a "bar" nested object, but both nested objects have a "permissions" array. You want to find all the documents that have a permission of "android.permission.BATTERY_STATS".
I checked out the N1QL docs for anything that might be helpful, and I especially checked out the Object Functions section. There's a function called OBJECT_UNWRAP that might do the trick. From the docs: "This function enables you to unwrap an object without knowing the name in the name-value pair."
So, if I simply unwrap the above documents, then I can basically discard the "foo" and the "bar" parts.
SELECT META(b).id, OBJECT_UNWRAP(b).permissions
FROM sstbucket b
You can unwrap a more deeply nested object if necessary, but I'm trying to keep this simple.
The results of that query would be:
[
{
"id": "doc1",
"permissions": [
"android.permission.ACCESS_COARSE_LOCATION",
"android.permission.BATTERY_STATS"
]
},
{
"id": "doc2",
"permissions": [
"android.permission.ACCESS_FINE_LOCATION"
]
}
]
And now, it's a simple ANY/SATISFIES statement to find the document:
SELECT META(b).id
FROM sstbucket b
WHERE ANY p IN OBJECT_UNWRAP(b).permissions SATISFIES p == 'android.permission.BATTERY_STATS' END;
Which would return
[
{
"id": "doc1"
}
]
So, that works. What I don't know for sure is how to create a proper index for this particular query. I created a primary index just to make it work (CREATE PRIMARY INDEX ON sstbucket), but that's not going to perform very well.
You can use OBJECT functions (https://docs.couchbase.com/server/6.0/n1ql/n1ql-language-reference/objectfun.html) and Array indexing.
If you need the document ID or the whole document:
CREATE INDEX ix1 ON default ( DISTINCT ARRAY (DISTINCT ARRAY permision
FOR permision IN app.permissions END)
FOR app IN OBJECT_VALUES(data) END);
SELECT META(d).id FROM default AS d
WHERE ANY app IN OBJECT_VALUES(d.data)
SATISFIES (ANY permision IN app.permissions
SATISFIES permision = "android.permission.BATTERY_STATS"
END)
END;
If you need only the appId, try the following and check whether it uses a covering index:
CREATE INDEX ix2 ON default ( ALL ARRAY (ALL ARRAY [permision, app.appId]
FOR permision IN app.permissions END)
FOR app IN OBJECT_VALUES(data) END);
SELECT [permision, app.appId][1] AS appId FROM default AS d
UNNEST OBJECT_VALUES(d.data) AS app
UNNEST app.permissions AS permision
WHERE [permision, app.appId] >= ["android.permission.BATTERY_STATS"] AND
[permision, app.appId] < [SUCCESSOR("android.permission.BATTERY_STATS")] ;
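As a reading aid only, the nested ANY ... SATISFIES condition above corresponds to nested `any()` checks in Python. The sketch below runs client-side on dictionaries that are already loaded; it says nothing about how Couchbase evaluates the query or uses the index. The document ids and the function name `docs_with_permission` are assumptions made for the example.

```python
def docs_with_permission(docs, permission):
    """Return ids of documents where any app under "data" lists the permission."""
    return [
        doc_id
        for doc_id, doc in docs.items()
        if any(
            permission in app.get("permissions", [])
            for app in doc.get("data", {}).values()
        )
    ]


# Trimmed-down versions of the two documents from the question.
docs = {
    "doc1": {"data": {"com.facebook.katana": {"permissions": [
        "android.permission.ACCESS_COARSE_LOCATION",
        "android.permission.BATTERY_STATS",
    ]}}},
    "doc2": {"data": {"com.whatsapp": {"permissions": [
        "android.permission.ACCESS_FINE_LOCATION",
    ]}}},
}
matches = docs_with_permission(docs, "android.permission.BATTERY_STATS")
print(matches)
```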

JSON is it best practice to give each element in an array an id attribute?

Is it best practice in JSON to give objects in an array an id, as below? I'm trying to decide on a JSON format for a RESTful service I'm implementing, and whether to include it or not. If the data is to be modified by CRUD operations, is it a good idea?
{
"tables": [
{
"id": 1,
"tablename": "Table1",
"columns": [
{
"name": "Col1",
"data": "-5767703747778052096"
},
{
"name": "Col2",
"data": "-5803732544797016064"
}
]
},
{
"id": 2,
"tablename": "Table2",
"columns": [
{
"name": "Col1",
"data": "-333333"
},
{
"name": "Col2",
"data": "-44444"
}
]
}
]
}
Client-Generated IDs
A server MAY accept a client-generated ID along with a request to
create a resource. An ID MUST be specified with an "id" key, the value
of which MUST be a universally unique identifier. The client SHOULD
use a properly generated and formatted UUID as described in RFC 4122
[RFC4122].
jsonapi.org
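If clients do generate ids as the quoted passage allows, Python's standard `uuid` module produces RFC 4122 UUIDs. A minimal sketch, using the field names from the question; `new_table` is a hypothetical helper, not part of any framework:

```python
import uuid


def new_table(tablename, columns):
    """Build a table resource with a client-generated UUID as its id."""
    return {
        "id": str(uuid.uuid4()),  # random (version 4) UUID per RFC 4122
        "tablename": tablename,
        "columns": columns,
    }


table = new_table("Table1", [{"name": "Col1", "data": "-5767703747778052096"}])
print(table["id"])
```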

Talend: parse JSON string to multiple output

I'm aware of this question, but I don't believe there is no solution with standard components. I'm using Talend ESB Studio 5.4.
I have to parse a JSON string from a REST web service into multiple output, and add them to a database.
Database has two tables:
User (user_id, name, card, card_id, points)
Action (user_id, action_id, description, used_point)
My JSON Structure is something like that:
{
"users": [
{
"name": "foo",
"user_id": 1,
"card": {
"card_id": "AAA",
"points": 10
},
"actions": [
{
"action_id": 1,
"description": "buy",
"used_points": 2
},
{
"action_id": 3,
"description": "buy",
"used_points": 1
}
]
},
{
"name": "bar",
"user_id": 2,
"card": {
"card_id": "BBB",
"points": -1
},
"actions": [
{
"id": 2,
"description": "sell",
"used_point": 5
}
]
}
]
}
I have tried to add a JSON Schema Metadata, but it is not clear to me how to flatten the JSON. I have tried looking at tXMLMap and tExtractJSONFields, but no luck so far.
I also had a look at tJavaRow but I don't understand how to make a Schema for that.
It's a pity because till now I'm loving Talend! Any advice?
You can save the JSON to a file on disk, then create a new JSON file entry in the metadata of Talend Studio; the wizard retrieves the schema for you. After saving, you can copy the schema into a generic schema in the metadata, and then use that generic schema wherever you want. This is how to use the generic schema in the tRestClient component:

JSON Design & Query with Jersey

I am trying to design a JSON object that would work with Jersey and Jackson.
I am fairly new to JSON / RESTful programming, so I am wondering if the following is viable.
{
"name": "myservice",
"orders": [
{
"name": "iphone",
"description": "iPhone 5",
"providers": [
{
"name": "a",
"description": "AT&T",
"pricing": ["$40", "$70", "$120"]
},
{
"name": "b",
"description": "Verizon",
"pricing": ["$45", "$60", "$85"]
}
]
},
{
"name": "galaxy3",
"description": "Samsung Galaxy 3",
"providers": [
{
"name": "a",
"description": "AT&T",
"pricing": ["$45", "$60", "$85"]
}
]
}
]
}
Get all information regarding iPhone's Verizon provider:
curl -X GET -H 'Content-Type: application/json' https://mydomain/myservice/iphone/b
would return:
{
"name": "b",
"description": "Verizon",
"pricing": ["$45", "$60", "$85"]
}
Get list of pricing for iPhone's AT&T provider:
curl -X GET -H 'Content-Type: application/json' 'https://mydomain/myservice/iphone/a?pricing'
Would return:
["$40", "$70", "$120"]
Any examples or feedback will be greatly appreciated!
Here is a good discussion about defining a REST API: REST Complex/Composite/Nested Resources
Here is what I would change in your json:
1. orders -> order, because resources are declared as singular nouns
2. providers -> provider, for the same reason
This is how I would call from a client if I know what I need to get (using composite resources):
https://<mydomain>/myservice/order/iphone/provider/b
https://<mydomain>/myservice/order/iphone/provider/a/pricing
In case you need to search for an order, you can define the request like:
https://<mydomain>/myservice/order?name=iphone -> it would return the 1st element in the "order" list
The assumption is that "name" is a key for the respective resources (order and provider).
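To make the composite-resource idea concrete, here is a small Python sketch of how such paths could be resolved against the document, treating "name" as the key at each level. It is not Jersey code; `find_by_name` and `resolve` are illustrative names, and the data is a trimmed copy of the JSON from the question with the singular names suggested above.

```python
SERVICE = {
    "name": "myservice",
    "order": [
        {
            "name": "iphone",
            "description": "iPhone 5",
            "provider": [
                {"name": "a", "description": "AT&T", "pricing": ["$40", "$70", "$120"]},
                {"name": "b", "description": "Verizon", "pricing": ["$45", "$60", "$85"]},
            ],
        },
    ],
}


def find_by_name(items, name):
    """Return the first item whose "name" matches, or None if there is none."""
    return next((item for item in items if item["name"] == name), None)


def resolve(service, order_name, provider_name, field=None):
    """Resolve /order/<order_name>/provider/<provider_name>[/<field>]."""
    order = find_by_name(service["order"], order_name)
    if order is None:
        return None
    provider = find_by_name(order["provider"], provider_name)
    if provider is None or field is None:
        return provider
    return provider.get(field)


print(resolve(SERVICE, "iphone", "b"))
print(resolve(SERVICE, "iphone", "a", "pricing"))
```

In a real service, a `None` result would typically be mapped to a 404 response.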