How to get ordered results from Couchbase

I have in my bucket a document containing a list of IDs (childList).
I would like to query over this list and keep the results in the same order as in my JSON. My query looks like this (using the Java SDK):
String query = new StringBuilder().append("SELECT B.name, META(B).id AS id ")
.append("FROM " + bucket.name() + " A ")
.append("USE KEYS $id ")
.append("JOIN " + bucket.name() + " B ON KEYS ARRAY i FOR i IN A.childList END;").toString();
This query returns rows that I transform into my domain objects and collect into a list like this:
n1qlQueryResult.allRows().forEach(n1qlQueryRow -> (add to return list ) ...);
The problem is that the output order matters: the rows need to come back in the same order as childList.
Any ideas?
Thank you.

Here is a rough idea of a solution without N1QL, provided you always start from a single A document:
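List<JsonDocument> listOfBs = bucket
    .async()
    .get(idOfA)
    // stream the child ids in the order they appear in childList
    .flatMap(doc -> Observable.from(doc.content().getArray("childList")))
    // JsonArray iterates as Object, so cast each id to String;
    // concatMapEager fetches eagerly but emits in source order
    .concatMapEager(id -> bucket.async().get((String) id))
    .toList()
    .toBlocking().first();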
You might want another map before the toList to extract the name and id, or perhaps to perform your domain object transformation.
The steps are:
use the async API
get the A document
extract the list of children and stream these ids
asynchronously fetch each child document and stream them but keeping them in original order
collect all into a List<JsonDocument>
block until the list is ready and return that List.
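The same idea (fetch the children concurrently, but emit them in the order of childList) can be sketched outside the Couchbase SDK. The Python sketch below is only an illustration: FAKE_BUCKET and fetch are hypothetical stand-ins for a real bucket and key-value lookup, not part of any Couchbase API.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical in-memory stand-in for the bucket: key -> document content.
FAKE_BUCKET = {
    "A": {"childList": ["b3", "b1", "b2"]},
    "b1": {"name": "first"},
    "b2": {"name": "second"},
    "b3": {"name": "third"},
}

def fetch(doc_id):
    """Pretend key-value lookup; a real client call would go here."""
    return {"id": doc_id, **FAKE_BUCKET[doc_id]}

def fetch_children_in_order(parent_id):
    # Get the parent document and read its ordered list of child ids.
    child_ids = fetch(parent_id)["childList"]
    with ThreadPoolExecutor(max_workers=4) as pool:
        # Executor.map runs the lookups concurrently but yields the
        # results in the order of the input ids.
        return list(pool.map(fetch, child_ids))

children = fetch_children_in_order("A")
print([child["id"] for child in children])
```

The ordering guarantee comes from Executor.map, which plays the role that concatMapEager plays in the Rx chain above.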

extract rows and columns from dictionary of JSON responses consisting of lists of dictionaries in python

Sorry for the confusing title.
So I'm trying to read a large number of JSON responses using grequests with this loop:
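import grequests

def GetData():
    urlRespDict = {}
    for OrderNo in LookupNumbers['id']:
        # build one URL per parameter id for this order number
        urls1 = []
        for idno in ParameterList:
            urlTemp = url0_const + OrderNo + url1_const + idno + param1_const
            urls1.append(urlTemp)
        # send this order's requests concurrently; responses keep the order of urls1
        urlRespDict[OrderNo] = grequests.map((grequests.get(u) for u in urls1))
    return urlRespDict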
Which is all fine and dandy: my response is a dictionary of 4 keys, each holding a list of 136 responses.
When I read one of the responses with (key and index are random):
d1 = dict_responses['180378'][0].json()
I get a list of dictionaries, each of which has a dictionary inside (see picture below).
Basically, all I want to get out is the content of the 'values' key, which in this case is '137' and '13,80137'. Ideally I want to create a DataFrame whose columns are the keys (in this case '137') and whose rows are the values extracted from d1.
I've tried using apply(pd.Series) on the values dict, but it is very time-consuming, like:
df2 = [(pd.DataFrame.from_records(n))['values'].apply(pd.Series,dtype="string") for n in df1]
just to see the data.
I hope there's another alternative; I am not an experienced coder.
I hope I explained it well enough and that you can help. Thank you so much in advance.

How to iterate over a nested json array and change the node properties based on a condition?

I want to create new properties for a pre-existing node based on a condition. To be more specific, I want to create a new node property when a node property matches a property within the nested JSON array. My JSON structure looks something like this:
{"some tasks":[
{"id":1,"name":"John Doe"},
{"id":2,"name":"Jane Doe"}
],
"some links":[
{"id":1,"type":"cartoon"},
{"id":2,"type":"anime"}
]
}
I already created nodes with properties from tasks. Now I want to iterate over the links part and update the properties of a node when the ids match. I tried using FOREACH like so:
call apoc.load.json("file:///precedence.json")yield value as line
foreach(link in line.link| match (n) where n.id=link.source)
return n
which returns the error:
Neo.ClientError.Statement.SyntaxError: Invalid use of MATCH inside FOREACH (line 2, column 28 (offset: 93))
"foreach(link in line.link| match (n) where n.id=link.source)"
So how do I check this condition inside FOREACH?
You can't use a MATCH inside a FOREACH, it only allows updating clauses.
Instead you can UNWIND the list back into rows (you'll have a row per entry in the list) then you can MATCH and SET as needed.
I also highly recommend using labels in your query, and ensuring you have an index on the label and id property for fast lookups. For this example I'll use :Node as the label (so you would have an index on :Node(id)):
CALL apoc.load.json("file:///precedence.json") yield value as line
UNWIND line.link as link
MATCH (n:Node)
WHERE n.id = link.source
SET n.type = link.type
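To make the row-by-row semantics of UNWIND, MATCH and SET concrete, here is a rough Python analogy. It is not how Neo4j executes the query; the nodes and links data are hypothetical, using the source and type fields from the query above.

```python
# Hypothetical data: nodes already created from the tasks, plus link entries.
nodes = [{"id": 1, "name": "John Doe"}, {"id": 2, "name": "Jane Doe"}]
links = [
    {"source": 1, "type": "cartoon"},
    {"source": 2, "type": "anime"},
    {"source": 9, "type": "manga"},  # no node has id 9
]

# Index nodes by id, playing the role of the index on :Node(id).
nodes_by_id = {node["id"]: node for node in nodes}

for link in links:                          # UNWIND line.link AS link
    node = nodes_by_id.get(link["source"])  # MATCH (n:Node) WHERE n.id = link.source
    if node is None:
        continue                            # no match: this row produces nothing
    node["type"] = link["type"]             # SET n.type = link.type

print(nodes)
```

Each list entry becomes its own row, and rows whose MATCH finds nothing are simply dropped, which is why no explicit condition check is needed.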

Returning MySQL data as an OBJECT rather than an ARRAY (Knex)

Is there a way to get the output of a MySQL query to list rows in the following structure
{
1:{voo:bar,doo:dar},
2:{voo:mar,doo:har}
}
as opposed to
[
{id:1,voo:bar,doo:dar},
{id:2,voo:mar,doo:har}
]
which I then have to loop through to create the desired object?
I should add that within each row I am also concatenating results to form an object, and from what I've experimented with you can't group_concatenate inside a group_concatenation. As follows:
knex('table').select(
'table.id',
'table.name',
knex.raw(
`CONCAT("{", GROUP_CONCAT(DISTINCT
'"',table.voo,'"',':','"',table.doo,'"'),
"}") AS object`
)
)
.groupBy('table.id')
Could GROUP BY be leveraged in any way to achieve this? Generally I'm inexperienced at SQL and don't know what's possible and what's not.
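For reference, the client-side loop mentioned above (turning the array of rows into an object keyed by id) is short. The sketch below uses Python rather than JavaScript and hypothetical row data shaped like the example; it does not answer whether SQL alone can produce this shape.

```python
# Hypothetical rows, shaped like the array form shown above.
rows = [
    {"id": 1, "voo": "bar", "doo": "dar"},
    {"id": 2, "voo": "mar", "doo": "har"},
]

# Key each row by its id and drop the id from the nested object.
by_id = {
    row["id"]: {key: value for key, value in row.items() if key != "id"}
    for row in rows
}

print(by_id)
```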

USql Call data in multidimensional JSON array

I have this JSON file in a data lake that looks like this:
{
"id":"398507",
"contenttype":"POST",
"posttype":"post",
"uri":"http://twitter.com/etc",
"title":null,
"profile":{
"#class":"PublisherV2_0",
"name":"Company",
"id":"2163171",
"profileIcon":"https://pbs.twimg.com/image",
"profileLocation":{
"#class":"DocumentLocation",
"locality":"Toronto",
"adminDistrict":"ON",
"countryRegion":"Canada",
"coordinates":{
"latitude":43.7217,
"longitude":-31.432},
"quadKey":"000000000000000"},
"displayName":"Name",
"externalId":"00000000000"},
"source":{
"name":"blogs",
"id":"18",
"param":"Twitter"},
"content":{
"text":"Description of post"},
"language":{
"name":"English",
"code":"en"},
"abstracttext":"More Text and links",
"score":{}
}
In order to bring the data into my application, I have to turn the JSON into a string using this code:
DECLARE @input string = @"/MSEStream/{*}.json";
REFERENCE ASSEMBLY [Newtonsoft.Json];
REFERENCE ASSEMBLY [Microsoft.Analytics.Samples.Formats];
@allposts =
EXTRACT
jsonString string
FROM @input
USING Extractors.Text(delimiter:'\b', quoting:true);
@extractedrows = SELECT Microsoft.Analytics.Samples.Formats.Json.JsonFunctions.JsonTuple(jsonString) AS er FROM @allposts;
@result =
SELECT er["id"] AS postID,
er["contenttype"] AS contentType,
er["posttype"] AS postType,
er["uri"] AS uri,
er["title"] AS Title,
er["acquisitiondate"] AS acquisitionDate,
er["modificationdate"] AS modificationDate,
er["publicationdate"] AS publicationDate,
er["profile"] AS profile
FROM @extractedrows;
OUTPUT @result
TO "/ProcessedQueries/all_posts.csv"
USING Outputters.Csv();
This outputs the JSON into a readable .csv file, and when I download the file all the data is displayed properly. My problem is getting at the data inside profile. Because the JSON is now a string, I can't seem to extract any of that data and put it into a variable to use. Is there any way to do this, or do I need to look into other options for reading the data?
You can use JsonTuple on the profile column to further extract the specific properties you want. An example of U-SQL code that processes nested JSON is available at https://github.com/Azure/usql/blob/master/Examples/JsonSample/JsonSample/NestedJsonParsing.usql.
For example, use JsonTuple to get all the child nodes of the profile node, then extract specific values the same way you did in your code:
@childnodesofprofile =
SELECT
Microsoft.Analytics.Samples.Formats.Json.JsonFunctions.JsonTuple(profile) AS childnodes_map
FROM @result;
@values =
SELECT
childnodes_map["name"] AS name,
childnodes_map["id"] AS id
FROM @childnodesofprofile;
Alternatively, if you are interested in specific values, you can also pass parameters to the JsonTuple function to get the specific nodes you want. The code below gets the locality node from the recursively nested nodes (as described by the "$..value" construct).
@locality =
SELECT Microsoft.Analytics.Samples.Formats.Json.JsonFunctions.JsonTuple(profile, "$..locality").Values AS locality
FROM @result;
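To illustrate what a recursive path such as "$..locality" selects, here is a small Python sketch. It is not JsonTuple's implementation; find_all is a hypothetical helper, and the profile string is a trimmed copy of the profile object from the question.

```python
import json

def find_all(node, key):
    """Collect every value stored under `key` at any depth of the JSON tree."""
    found = []
    if isinstance(node, dict):
        for name, value in node.items():
            if name == key:
                found.append(value)
            # Keep descending: the key may also appear further down.
            found.extend(find_all(value, key))
    elif isinstance(node, list):
        for item in node:
            found.extend(find_all(item, key))
    return found

# Trimmed copy of the profile object shown in the question.
profile = json.loads(
    '{"name": "Company", "id": "2163171", '
    '"profileLocation": {"locality": "Toronto", "adminDistrict": "ON", '
    '"coordinates": {"latitude": 43.7217}}}'
)

print(find_all(profile, "locality"))
```

The search descends through every nested object, so locality is found even though it sits one level below profile.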
Other constructs supported by JsonTuple:
JsonTuple(json, "id", "name") // field names
JsonTuple(json, "$.address.zip") // nested fields
JsonTuple(json, "$..address") // recursive children
JsonTuple(json, "$[?(@.id > 1)].id") // path expression
JsonTuple(json) // all children
Hope this helps.

List json processing

I am having difficulty processing a list in Scala.
Currently I have a tuple like this:
(List(JString(2437), JString(2445), JString(2428), JString(321)), CompactBuffer((4,1)))
and after processing I would like the result to look like this:
( (2437, CompactBuffer((4,1))), (2445, CompactBuffer((4,1))), (2428, CompactBuffer((4,1))), (321, CompactBuffer((4,1))) )
Can anybody help me with this issue?
Thank you very much.
Try this:
val pair = (List(JString(2437), JString(2445), JString(2428), JString(321)),
CompactBuffer((4,1)))
val result = pair._1.map((_, pair._2))
First, pair._1 gets the list from the tuple. Then, map performs the function on each element of the list. The function (_, pair._2) puts the given element from the list in a new tuple together with the second part of the pair tuple.
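The same pairing step can be sketched in Python for comparison. Plain strings and a plain list stand in for JString and CompactBuffer here, which is an assumption made only for illustration.

```python
# Stand-ins: strings for the JString values, a list for the CompactBuffer.
pair = (["2437", "2445", "2428", "321"], [(4, 1)])

ids, shared = pair
# Pair every id with the same second element, like pair._1.map((_, pair._2)).
result = [(item, shared) for item in ids]

print(result)
```

As in the Scala version, every resulting tuple refers to the same second element rather than a copy of it.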