How to query deep nested json array from couchbase? - json

How to query deep nested json array from couchbase? I have the following documents in the couchbase bucket. I need to query to list all apps who have Permissions "android.permission.BATTERY_STATS"
How to query to list all apps with permissions from nested json array?
My Json Documents,
Document:1
{
"data": {
"com.facebook.katana": {
"studioId": "Facebook",
"screenshotUrls": [
"https://lh3.googleusercontent.com/JcPdPqplBxgG6dEQuxvuhO4jvE64AzxOCGWe8w55dMMeXU4rZs2MwpfGQTWvv6QR-g",
"https://lh3.googleusercontent.com/w0kSYY7jlPjGDd3KEVgDTpzUf4k67G7rfELOf6qj1SSC7n6Ege44vp8QkeX57ZM6bFU"
],
"primaryCategoryName": "Social",
"studioName": "Facebook",
"description": "Keeping up with friends is faster and easier than ever. Share updates and photos, engage with friends and Pages, and stay connected to communities important to you"
"starRatings": {
"1": 9706642,
"2": 3384344,
"3": 7224416,
"4": 12323358,
"5": 49438051
},
"numDownloads": "1,000,000,000+ downloads",
"price": 0,
"permissions": [
"android.permission.ACCESS_COARSE_LOCATION",
"android.permission.ACCESS_FINE_LOCATION",
"android.permission.ACCESS_NETWORK_STATE",
"android.permission.ACCESS_WIFI_STATE",
"android.permission.AUTHENTICATE_ACCOUNTS",
"android.permission.BATTERY_STATS",
"android.permission.BLUETOOTH",
"android.permission.READ_PHONE_STATE",
"android.permission.READ_PROFILE",
"android.permission.READ_SMS",
"android.permission.READ_SYNC_SETTINGS",
"com.nokia.pushnotifications.permission.RECEIVE",
"com.sec.android.provider.badge.permission.READ",
"com.sec.android.provider.badge.permission.WRITE",
"com.sonyericsson.home.permission.BROADCAST_BADGE"
],
"appId": "com.facebook.katana",
"userRatingCount": 82076811,
"currency": "USD",
"iconUrl": "https://lh3.googleusercontent.com/ZZPdzvlpK9r_Df9C3M7j1rNRi7hhHRvPhlklJ3lfi5jk86Jd1s0Y5wcQ1QgbVaAP5Q=w100",
"releaseDate": "Nov 14, 2018",
"appName": "Facebook",
"studioUrl": "https://www.facebook.com/facebook",
"hasInAppPurchases": 1,
"bundleId": "com.facebook.katana",
"version": "198.0.0.53.101",
"commentCount": 22211936,
"fileSizeBytes": 58044343,
"formattedPrice": "",
"categoryIds": [
"APPLICATION",
"SOCIAL"
],
"tagline": "Find friends, watch live videos, play games & save photos in your social network",
"averageUserRating": 4.0770621299744,
"primaryCategoryId": "SOCIAL",
"videoScreenUrl": "https://lh4.ggpht.com/3RG_Y8JPK0Hcyui9OcapiONP_aDWKTRZ50wqZW_wbyOF0FamAYEYZfMTW9Cs1OT1kA"
}
},
"response_msec": 11,
"status": 200
}
Document:2
{
"data": {
"com.whatsapp": {
"studioId": "WhatsApp Inc.",
"screenshotUrls": [
"https://lh3.googleusercontent.com/MMue08byixTw74ST_VkNQDUUJBgVEbjNHDYLhIuHmYhMIMJIp3KjVlnhhqZQOZUtNt8",
"https://lh3.googleusercontent.com/foFmwvVGIwWWXJIukN7png18lFjFgbw3K7BqIm8G-jsFgSTVtkCa-dDkFApUzbvzIvbe"
],
"primaryCategoryName": "Communication",
"studioName": "WhatsApp Inc.",
"description": "WhatsApp Messenger is a FREE messaging app available for Android and other smartphones.
"starRatings": {
"1": 4713598,
"2": 1917919,
"3": 4962745,
"4": 11307648,
"5": 55392894
},
"numDownloads": "1,000,000,000+ downloads",
"price": 0,
"permissions": [
"android.permission.ACCESS_COARSE_LOCATION",
"android.permission.ACCESS_FINE_LOCATION",
"android.permission.ACCESS_NETWORK_STATE",
"android.permission.ACCESS_WIFI_STATE",
"android.permission.AUTHENTICATE_ACCOUNTS",
"android.permission.BLUETOOTH",
"android.permission.BROADCAST_STICKY",
"android.permission.CAMERA",
"android.permission.CHANGE_WIFI_STATE",
"android.permission.GET_ACCOUNTS",
"android.permission.GET_TASKS",
"android.permission.INSTALL_SHORTCUT",
"android.permission.INTERNET",
"android.permission.MANAGE_ACCOUNTS",
"com.whatsapp.permission.REGISTRATION",
"com.whatsapp.permission.VOIP_CALL",
"com.whatsapp.sticker.READ"
],
"appId": "com.whatsapp",
"userRatingCount": 78294804,
"currency": "USD",
"iconUrl": "https://lh6.ggpht.com/mp86vbELnqLi2FzvhiKdPX31_oiTRLNyeK8x4IIrbF5eD1D5RdnVwjQP0hwMNR_JdA=w100",
"releaseDate": "Nov 5, 2018",
"appName": "WhatsApp Messenger",
"studioUrl": "http://www.whatsapp.com/",
"bundleId": "com.whatsapp",
"version": "2.18.341",
"commentCount": 19763316,
"fileSizeBytes": 23857699,
"formattedPrice": "",
"categoryIds": [
"APPLICATION",
"COMMUNICATION"
],
"tagline": "Simple. Personal. Secure.",
"averageUserRating": 4.4145045280457,
"primaryCategoryId": "COMMUNICATION",
"videoScreenUrl": "https://lh3.ggpht.com/aZrXAunkovhf0630Ykz1A7h2rzFX_dErd6fRiB7fNKU_DkNtetTquEra1bjc3sR2kLs"
}
},
"response_msec": 15,
"status": 200
}

As I say in the comment, this is a tricky one. I'm going to try to simplify your docs first, and then give an answer that I came up with.
You have two docs, which contain a nested object with a permissions array. Each nested object has a (potentially) different name. So, let's assume we have two simple docs like this:
id: doc1
{
"foo": {
"permissions": [
"android.permission.ACCESS_COARSE_LOCATION",
"android.permission.BATTERY_STATS"
]
}
}
id: doc2
{
"bar": {
"permissions": [
"android.permission.ACCESS_FINE_LOCATION"
]
}
}
The first one has a "foo" nested object, the second has a "bar" nested object, but both nested objects have a "permissions" array. You want to find all the documents that have a permission of "android.permission.BATTERY_STATS".
I checked out the N1QL docs for anything that might be helpful, and I especially checked out the Object Functions section. There's a function called OBJECT_UNWRAP that might do the trick. From the docs: "This function enables you to unwrap an object without knowing the name in the name-value pair."
So, if I simply unwrap the above documents, then I can basically discard the "foo" and the "bar" parts.
SELECT META(b).id, OBJECT_UNWRAP(b).permissions
FROM sstbucket b
You can put unwrap a deeper nested object if necessary, but I'm trying to keep this simple.
The results of that query would be:
[
{
"id": "doc1",
"permissions": [
"android.permission.ACCESS_COARSE_LOCATION",
"android.permission.BATTERY_STATS"
]
},
{
"id": "doc2",
"permissions": [
"android.permission.ACCESS_FINE_LOCATION"
]
}
]
And now, it's a simple ANY/SATISFIES statement to find the document:
SELECT META(b).id
FROM sstbucket b
WHERE ANY p IN OBJECT_UNWRAP(b).permissions SATISFIES p == 'android.permission.BATTERY_STATS' END;
Which would return
[
{
"id": "doc1"
}
]
So, that works. What I don't know for sure is how to create a proper index for this particular query. I created a primary index just to make it work (CREATE PRIMARY INDEX ON sstbucket), but that's not going to perform very well.

You can use OBJECT functions (https://docs.couchbase.com/server/6.0/n1ql/n1ql-language-reference/objectfun.html) and Array indexing.
If you need document ID or whole document.
CREATE INDEX ix1 ON default ( DISTINCT ARRAY (DISTINCT ARRAY permision
FOR permision IN app.permissions END)
FOR app IN OBJECT_VALUES(data) END);
SELECT META(d).id FROM default AS d
WHERE ANY app IN OBJECT_VALUES(d.data)
SATISFIES (ANY permision IN app.permissions
SATISFIES permision = "android.permission.BATTERY_STATS"
END)
END;
If you need only appId and see if it uses covering index.
CREATE INDEX ix2 ON default ( ALL ARRAY (ALL ARRAY [permision, app.appId]
FOR permision IN app.permissions END)
FOR app IN OBJECT_VALUES(data) END);
SELECT [permision, app.appId][1] AS appId FROM default AS d
UNNEST OBJECT_VALUES(d.data) AS app
UNNEST app.permissions AS permision
WWHERE [permision, app.appId] >= ["android.permission.BATTERY_STATS"] AND
[permision, app.appId] < [SUCCESSOR("android.permission.BATTERY_STATS")] ;

Related

Jq convert an object into an array

I have the following file "Pokemon.json", it's a stripped down list of Pokémon, listing their Pokédex ID, name and an array of Object Types.
[{
"name": "onix",
"id": 95,
"types": [{
"slot": 2,
"type": {
"name": "ground"
}
},
{
"slot": 1,
"type": {
"name": "rock"
}
}
]
}, {
"name": "drowzee",
"id": 96,
"types": [{
"slot": 1,
"type": {
"name": "psychic"
}
}]
}]
The output I'm trying to achieve is, extracting the name value of the type object and inserting it into an array.
I can easily get an array of all the types with
jq -r '.pokemon[].types[].type.name' pokemon.json
But I'm missing the key part to transform the name field into it's own array
[ {
"name": "onix",
"id": 95,
"types": [ "rock", "ground" ]
}, {
"name": "drowzee",
"id": 96,
"types": [ "psychic" ]
} ]
Any help appreciated, thank you!
In the man it states you have an option to use map - which essentially means walking over each result and returning something (in our case, same data, constructed differently.)
This means that for each row you are creating new object, and put some values inside
Pay attention, you do need another iterator within, since we want one object per row.
(we simply need to map the values in different way it is constructed right now.)
So the solution might look like so:
jq -r '.pokemon[]|{name:.name, id:.id, types:.types|map(.type.name)}' pokemon.json

Google json style guide: How to send single item response?

There is an items node in the specifications which says it is for an array of items, like paging items, youtube video list
What if I have GET request on a single item, how should the response be formatted ?
Just to one item in the array?
items:[item]
https://google.github.io/styleguide/jsoncstyleguide.xml
I don't think #tanmay_vijay's answer is correct or nuanced enough as it seems that single item responses are in arrays in the YouTube example in the docs.
{
"apiVersion": "2.0",
"data": {
"updated": "2010-02-04T19:29:54.001Z",
"totalItems": 6741,
"startIndex": 1,
"itemsPerPage": 1,
"items": [
{
"id": "BGODurRfVv4",
"uploaded": "2009-11-17T20:10:06.000Z",
"updated": "2010-02-04T06:25:57.000Z",
"uploader": "docchat",
"category": "Animals",
"title": "From service dog to SURFice dog",
"description": "Surf dog Ricochets inspirational video ...",
"tags": [
"Surf dog",
"dog surfing",
"dog",
"golden retriever",
],
"thumbnail": {
"default": "https://i.ytimg.com/vi/BGODurRfVv4/default.jpg",
"hqDefault": "https://i.ytimg.com/vi/BGODurRfVv4/hqdefault.jpg"
},
"player": {
"default": "https://www.youtube.com/watch?v=BGODurRfVv4&feature=youtube_gdata",
"mobile": "https://m.youtube.com/details?v=BGODurRfVv4"
},
"content": {
"1": "rtsp://v5.cache6.c.youtube.com/CiILENy73wIaGQn-Vl-0uoNjBBMYDSANFEgGUgZ2aWRlb3MM/0/0/0/video.3gp",
"5": "https://www.youtube.com/v/BGODurRfVv4?f=videos&app=youtube_gdata",
"6": "rtsp://v7.cache7.c.youtube.com/CiILENy73wIaGQn-Vl-0uoNjBBMYESARFEgGUgZ2aWRlb3MM/0/0/0/video.3gp"
},
"duration": 315,
"rating": 4.96,
"ratingCount": 2043,
"viewCount": 1781691,
"favoriteCount": 3363,
"commentCount": 1007,
"commentsAllowed": true
}
]
}
}
It could however be that it depends on the resource being targeted from the request. This is the way it is in the competing JSONAPI standard.
From JSONAPI standard:
A logical collection of resources MUST be represented as an array, even if it only contains one item or is empty.
You don't need to have items field for showing single item. If you're sure your API is always going to return single object, you can return it as data itself.
{
"data": {
"kind": "user",
"fields": "author,id",
"id": "bart",
"author": "Bart"
}
}
Fields such as data.kind data.fields data.etag data.id data.lang data.updated data.deleted can still be used here.
Source for snippet docs

How to index multidimensional arrays in couchdb

I have a multidimensional array that I want to index with CouchDB (really using Cloudant). I have users which have a list of the teams that they belong to. I want to search to find every member of that team. So, get me all the User objects that have a team object with id 79d25d41d991890350af672e0b76faed. I tried to make a json index on "Teams.id", but it didn't work because it isn't a straight array but a multidimensional array.
User
{
"_id": "683be6c086381d3edc8905dc9e948da8",
"_rev": "238-963e54ab838935f82f54e834f501dd99",
"type": "Feature",
"Kind": "Profile",
"Email": "gc#gmail.com",
"FirstName": "George",
"LastName": "Castanza",
"Teams": [
{
"id": "79d25d41d991890350af672e0b76faed",
"name": "First Team",
"level": "123"
},
{
"id": "e500c1bf691b9cfc99f05634da80b6d1",
"name": "Second Team Name",
"level": ""
},
{
"id": "4645e8a4958421f7d843d9b34c4cd9fe",
"name": "Third Team Name",
"level": "123"
}
],
"LastTeam": "79d25d41d991890350af672e0b76faed"
}
This is a lot like my response at Cloudant Selector Query but here's the deal, applied to your question:
The easiest way to run this query is using "Cloudant Query" (or "Mango", as it's called in the forthcoming CouchDB 2.0 release) -- and not the traditional MapReduce view indexing system in CouchDB. (This blog covers the differences: https://cloudant.com/blog/mango-json-vs-text-indexes/ and this one is an overview: https://developer.ibm.com/clouddataservices/2015/11/24/cloudant-query-json-index-arrays/).
Here's what your CQ index should look like:
{
"index": {
"fields": [
{"name": "Teams.[].id", "type": "string"}
]
},
"type": "text"
}
And what the subsequent query looks like:
{
"selector": {
"Teams": {"$elemMatch": {"id": "79d25d41d991890350af672e0b76faed"}}
},
"fields": [
"_id",
"FirstName",
"LastName"
]
}
You can try it yourself in the "Query" section of the Cloudant dashboard or via curl with something like this:
curl -H "Content-Type: application/json" -X POST -d '{"selector":{"Teams":{"$elemMatch":{"id":"79d25d41d991890350af672e0b76faed"}}},"fields":["_id","FirstName","LastName"]}' https://broberg.cloudant.com/teams_test/_find
That database is world-readable, so you can see the sample documents I created in there here: https://broberg.cloudant.com/teams_test/_all_docs?include_docs=true
Dig the Seinfeld theme :D
You simply need to loop through the Teams array and emit a view entry for each of the teams.
function (doc) {
if(doc.Kind === "Profile"){
for (var i=0; i<doc.Teams.length; i++) {
var team = doc.Teams[i];
emit(team.id, [doc.FirstName, doc.LastName]);
}
}
}
You can then query for all profiles with a specific team id by keying on the team id like this
.../view?key="79d25d41d991890350af672e0b76faed"
giving
{"total_rows":7,"offset":2,"rows":[
{"id":"0d15041f43b43ae07e8faa737f00032c","key":"79d25d41d991890350af672e0b76faed","value":["Adam","Alpha"]},
{"id":"68779729be3610fd8b52b22574000ae8","key":"79d25d41d991890350af672e0b76faed","value":["Bob","Bravo"]},
{"id":"9f97f1565f03aebae9ca73e207001ee1","key":"79d25d41d991890350af672e0b76faed","value":["Chuck","Charlie"]}
]}
or you can include the actual profiles in the result by adding &include_docs=true to the query.

Talend: parse JSON string to multiple output

I'm aware of this question but I don't believe that there is no solution with standars component. I'm using Talend ESB Studio 5.4.
I have to parse a JSON string from a REST web service into multiple output, and add them to a database.
Database has two tables:
User (user_id, name, card, card_id, points)
Action (user_id, action_id, description, used_point)
My JSON Structure is something like that:
{
"users": [
{
"name": "foo",
"user_id": 1,
"card": {
"card_id": "AAA",
"points": 10
},
"actions": [
{
"action_id": 1,
"description": "buy",
"used_points": 2
},
{
"action_id": 3,
"description": "buy",
"used_points": 1
}
]
},
{
"name": "bar",
"user_id": 2,
"card": {
"card_id": "BBB",
"points": -1
},
"actions": [
{
"id": 2,
"description": "sell",
"used_point": 5
}
]
}
]
}
I have tried to add a JSON Schema Metadata but it is not clear to me how to "flat" the JSON. I have tried to look at tXMLMap, tExtractJSONFields.. but no luck till now.
I also had a look at tJavaRow but I don't understand how to make a Schema for that.
It's a pity because till now I'm loving Talend! Any advice?
You can save a json file in your disk, then create new json file in the metadata of Talend studio, the wizard retrieve the schema for you, after saving, you ca, copie schema in the generic schema of the metadata, and it's done, use that generic schema where you want, this is how to use generic schema in the tRestClient component:

JSON Slurper Offsets

I have a large JSON file that I'm trying to parse with JSON Slurper. The JSON file consists of information about bugs so it has things like issue keys, descriptions, and comments. Not every issue has a comment though. For example, here is a sample of what the JSON input looks like:
{
"projects": [
{
"name": "Test Project",
"key": "TEST",
"issues": [
{
"key": "BUG-1",
"priority": "Major",
"comments": [
{
"author": "a1",
"created": "d1",
"body": "comment 1"
},
{
"author": "a2",
"created": "d2",
"body": "comment 2"
}
]
},
{
"key": "BUG-2",
"priority": "Major"
},
{
"key": "BUG-3",
"priority": "Major",
"comments": [
{
"author": "a3",
"created": "d3",
"body": "comment 3"
}
]
}
]
}
]
}
I have a method that creates Issue objects based on the JSON parse. Everything works well when every issue has at least one comment, but, once an issue comes up that has no comments, the rest of the issues get the wrong comments. I am currently looping through the JSON file based on the total number of issues and then looking for comments using how far along in the number of issues I've gotten. So, for example,
parsedData.issues.comments.body[0][0][0]
returns "comment 1". However,
parsedData.issues.comments.body[0][1][0]
returns "comment 3", which is incorrect. Is there a way I can see if a particular issue has any comments? I'd rather not have to edit the JSON file to add empty comment fields, but would that even help?
You can do this:
parsedData.issues.comments.collect { it?.body ?: [] }
So it checks for a body and if none exists, returns an empty list
UPDATE
Based on the update to the question, you can do:
parsedData.projects.collectMany { it.issues.comments.collect { it?.body ?: [] } }