Concatenate values from non-adjacent objects based on multiple matching criteria - json

I received help on a related question previously on this forum and am wondering if there is a similarly straightforward way to resolve a more complex issue.
Given the following snippet, is there a way to merge the partial sentence (the one which does not end with a "[punctuation mark][white space]" pattern) with its remainder based on the matching TextSize? When I tried to adjust the answer from the related question I quickly ran into issues, but I am basically looking to translate a rule such as if .Text !endswith("[punctuation mark][white space]") then .Text + next .Text where .TextSize matches
{
"Text": "Was it political will that established social democratic policies in the 1930s and ",
"Path": "P",
"TextSize": 9
},
{
"Text": "31 Lawrence Mishel and Jessica Schieder, Economic Policy Institute website, May 24, 2016 at (https://www.epi.org/publication/as-union-membership-has-fallen-the-top-10-percent-have-been-getting-a-larger-share-of-income/). ",
"Path": "Footnote",
"TextSize": 8
},
{
"Text": "Fig. 9.2 Higher union membership has been associated with a higher share of income to lower income brackets (the lower 90%) and a lower share of income to the top 10% of earners. ",
"Path": "P",
"TextSize": 8
},
{
"Text": "1940s, or that undermined them after the 1970s? Or was it abundant and cheap energy resources that enabled social democratic policies to work until the 1970s, and energy constraints that forced a restructuring of policy after the 1970s? ",
"Path": "P",
"TextSize": 9
},
{
"Text": "Recall that my economic modeling discussed in Chap. 6 shows that, even with no change in the assumption related to labor \u201cbargaining power,\u201d you can explain a shift from increasing to declining income equality (higher equality expressed as a higher wage share) by a corresponding shift from a period of rapidly increasing per capita resource consumption to one of constant per capita resource consumption. ",
"Path": "P",
"TextSize": 9
}
The result I'm looking for would be as follows:
{
"Text": "Was it political will that established social democratic policies in the 1930s and 1940s, or that undermined them after the 1970s? Or was it abundant and cheap energy resources that enabled social democratic policies to work until the 1970s, and energy constraints that forced a restructuring of policy after the 1970s? ",
"Path": "P",
"TextSize": 9
},
{
"Text": "31 Lawrence Mishel and Jessica Schieder, Economic Policy Institute website, May 24, 2016 at (https://www.epi.org/publication/as-union-membership-has-fallen-the-top-10-percent-have-been-getting-a-larger-share-of-income/). ",
"Path": "Footnote",
"TextSize": 8
},
{
"Text": "Fig. 9.2 Higher union membership has been associated with a higher share of income to lower income brackets (the lower 90%) and a lower share of income to the top 10% of earners. ",
"Path": "P",
"TextSize": 8
},
{
"Text": "Recall that my economic modeling discussed in Chap. 6 shows that, even with no change in the assumption related to labor \u201cbargaining power,\u201d you can explain a shift from increasing to declining income equality (higher equality expressed as a higher wage share) by a corresponding shift from a period of rapidly increasing per capita resource consumption to one of constant per capita resource consumption. ",
"Path": "P",
"TextSize": 9
}

The following, which assumes the input is a valid JSON array, will merge every .Text with at most one successor, but can easily be modified to merge multiple .Text values together as shown in Part 2 below.
Part 1
# input and output: an array of {Text, Path, TextSize} objects.
# Attempt to merge the .Text of the $i-th object with the .Text of a subsequent compatible object.
# If a merge is successful, the subsequent object is removed.
def attempt_to_merge_next($i):
  .[$i].TextSize as $class
  | first( (range($i+1; length) as $j | select(.[$j].TextSize == $class) | $j) // null) as $j
  | if $j then .[$i].Text += .[$j].Text | del(.[$j])
    else .
    end;

reduce range(0; length) as $i (.;
  if .[$i] == null then .
  elif .[$i].Text|test("[,.?:;]\\s*$")|not
  then attempt_to_merge_next($i)
  else .
  end)
Part 2
Using the above def:
def merge:
  def m($i):
    if $i >= length then .
    elif .[$i].Text|test("[,.?:;]\\s*$")|not
    then attempt_to_merge_next($i) as $x
      | if ($x|length) == length then m($i+1)
        else $x|m($i)
        end
    else m($i+1)
    end ;
  m(0);

merge
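For anyone who wants to prototype the rule outside jq, here is a rough Python sketch of the same idea as Part 2. It is not a translation of the jq program, just the same rule under the same assumptions: the input is a list of {Text, Path, TextSize} objects, and a "complete" text ends in one of , . ? : ; followed by optional whitespace. The function name merge_fragments and the shortened sample data are made up for illustration.

```python
import re

# Same test as the jq filter: one of , . ? : ; then optional whitespace at the end.
COMPLETE = re.compile(r"[,.?:;]\s*$")

def merge_fragments(items):
    """Merge each incomplete Text with later objects that share its TextSize."""
    items = [dict(obj) for obj in items]  # shallow copies, so the input is untouched
    i = 0
    while i < len(items):
        if COMPLETE.search(items[i]["Text"]):
            i += 1
            continue
        # Index of the first later object with a matching TextSize, if any.
        j = next((k for k in range(i + 1, len(items))
                  if items[k]["TextSize"] == items[i]["TextSize"]), None)
        if j is None:
            i += 1  # nothing to merge with, move on
        else:
            items[i]["Text"] += items[j]["Text"]
            del items[j]  # stay on index i and re-check the merged text

    return items

# Shortened, hypothetical stand-in for the data in the question.
sample = [
    {"Text": "policies in the 1930s and ", "Path": "P", "TextSize": 9},
    {"Text": "31 A footnote. ", "Path": "Footnote", "TextSize": 8},
    {"Text": "1940s, or later? ", "Path": "P", "TextSize": 9},
]
merged = merge_fragments(sample)
for obj in merged:
    print(obj)
```

Because the loop stays on the same index after a successful merge, a sentence split across three or more objects is stitched together too, which is the behaviour Part 2 adds over Part 1.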

Related

Comparing values between a previous element and its subsequent element in an array

I'm scrolling through the jq Manual and reading through every command available, but am only about 10% of the way through it. (It's quite long, which is a good thing, except that I have an art project presentation due in six days and I have to finish this JSON analysis first so I can start measuring and cutting 350 meters of tape.)
I have a JSON file with exactly one object. That one object contains an array of 3555 JSON objects, which can be accessed via its index numbered from 0 to 3554. This shows the structure of one of those JSON objects (I've modified the phone numbers and the body/content of the instant message since this comes from a real conversation):
$ cat selected-convo.json | jq '.[3554]'
{
"timestamp": 1589547750278,
"attachments": [],
"source": "+491604444444",
"sourceUuid": "a258be99-b00a-456d-bba6-258d72878b64",
"sourceDevice": 1,
"sent_at": 1589536960941,
"sent_to": [
"+31707777777"
],
"received_at": 1589547750278,
"conversationId": "823c0416-9406-4922-8ee9-f3cf36c4784c",
"type": "outgoing",
"sent": true,
"unidentifiedDeliveries": [
"+31707777777"
],
"expirationStartTimestamp": 1589536960941,
"schemaVersion": 10,
"id": "42e9ed93-ad1e-44fc-912a-dd310c16b52e",
"body": "X xxxx X xxxx X xxx xxxxxxxxx xx xxx.",
"contact": [],
"decrypted_at": 1589547750368,
"errors": [],
"flags": 0,
"hasAttachments": 0,
"isViewOnce": false,
"preview": [],
"requiredProtocolVersion": 0,
"supportedVersionAtReceive": 4,
"quote": null,
"sticker": null,
"recipients": [
"+31707777777"
]
}
I am only interested in measuring the time it took one person to respond to the other person. So the key-value pairs I want are the sent_at timestamp and the type, which says whether the message is incoming or outgoing.
$ cat selected-convo.json | jq '.[] | .sent_at, .type'
gives me the following output (first ten in the array of 3555):
1577640636917
"outgoing"
1577674806478
"incoming"
1577674810527
"incoming"
1578513043504
"outgoing"
1578520666264
"outgoing"
1580600735958
"outgoing"
1580600816040
"outgoing"
1580601327790
"incoming"
1580602829082
"outgoing"
1580602833184
"outgoing"
BUT I only want to see the first outgoing message, followed by the first incoming message, followed by the next outgoing message, followed by the next incoming message, and so on. (If I sent three messages in a row, I want to delete/ignore the second and third messages and only look at the first one. If I received eight messages in a row before I responded, I want to see only the first of those messages and skip the following seven.) So from the list above, I want:
1577640636917
"outgoing"
1577674806478
"incoming"
1578513043504
"outgoing"
1580601327790
"incoming"
1580602829082
"outgoing"
Any ideas?
I'd use foreach for that as the task basically requires a state machine that extracts some values from the current input whenever the state changes.
foreach .[] as {$type, $sent_at} (
  {};
  {prev: .curr, curr: $type};
  if .curr != .prev
  then $sent_at, $type
  else empty end
)
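If it helps to see the same "emit only when the state changes" idea outside jq, here is a small Python sketch using itertools.groupby, which groups consecutive items that share a key. The helper name first_of_each_run is made up, and the sample list is just the first few sent_at/type pairs from the question.

```python
from itertools import groupby

def first_of_each_run(messages):
    """Keep only the first message of every run of consecutive same-type messages."""
    return [next(run) for _, run in groupby(messages, key=lambda m: m["type"])]

# First few sent_at/type pairs from the question.
messages = [
    {"sent_at": 1577640636917, "type": "outgoing"},
    {"sent_at": 1577674806478, "type": "incoming"},
    {"sent_at": 1577674810527, "type": "incoming"},
    {"sent_at": 1578513043504, "type": "outgoing"},
    {"sent_at": 1578520666264, "type": "outgoing"},
]
firsts = first_of_each_run(messages)
for m in firsts:
    print(m["sent_at"], m["type"])
```

With the runs collapsed like this, the response time is just the difference between the sent_at values of neighbouring entries.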

Removing unnecessary sentences in a json file

I am trying to remove the lines that contain [CLS] and [SEP] in the following json file. Is there any way to do this in python? How to remove these lines with the given text?
"Tirukkollampudur Vilvavaneswarar -. Temple  - Shivastalam.txt": {
"context": " may be reproduced or used in any form without permission. This Shivastalam is located 5 km south of Kodavasal and Koradacheri on the Tiruvarur Thanjavur railroad. Koovilamputhur the original name became Kollampudur. Koovilam stands for Vilvam, hence Vilvavanam. This shrine is regarded as a Muktistalam. This shrine is regarded as the 113rd in the series of Tevara Stalams in the Chola Region south of the river Kaveri. Legends The Vilva trees are said to represent splashes of the celestial nectar Amritam, and this stalam is considered on par with Banares. Sundarar is believed to have floated across the river to this temple in a boatmanless raft in a river in spate singing a Patikam . This event is celebrated in a festival in the monsoon month of Libra. The Avimukteswarar temple nearby is also associated with this legend as is the Shivastalam at Kodavasal. Shiva is said to have blessed Durvasa Muni with a vision of the Cosmic Dance here. Legend also has it that Arjuna worshipped Shiva at this shrine. The Temple There are several inscriptions here, and the Cholas have made immense contributions here.to this temple which was built during the time of Kulottunga Chola I. This temple occupies an area of over 2 acres, and its second prakaram has a 5 tiered rajagopuram. The Vinayakar in this temple is also of great Festivals Six worship services are offered each day. Kartikai Deepam, Arudra Darisanam, Sivaratri, Skanda Sashti are some of the festivals celebrated here. ",
"answers": [
[
"5 km south of kodavasal and koradacheri on the tiruvarur thanjavur railroad"
],
[
"5 km south of kodavasal and koradacheri on the tiruvarur thanjavur railroad"
],
[
" "
],
[
" "
],
[
"during the time of kulottunga chola i"
],
[
"[CLS] what are the darshan hours ? [SEP] may be reproduced or used in any form without permission . this shivastalam is located 5 km south of kodavasal and koradacheri on the tiruvarur thanjavur railroad . koovilamputhur the original name became kollampudur . koovilam stands for vilvam , hence vilvavanam . this shrine is regarded as a muktistalam . this shrine is regarded as the 113rd in the series of tevara stalams in the chola region south of the river kaveri . legends the vilva trees are said to represent splashes of the celestial nectar amritam , and this stalam is considered on par with banares . sundarar is believed to have floated across the river to this temple in a boatmanless raft in a river in spate singing a patikam . this event is celebrated in a festival in the monsoon month of libra . the avimukteswarar temple nearby is also associated with this legend as is the shivastalam at kodavasal . shiva is said to have blessed durvasa muni with a vision of the cosmic dance here . legend also has it that arjuna worshipped shiva at this shrine . the temple there are several inscriptions here , and the cholas have made immense contributions here . to this temple which was built during the time of kulottunga chola i . this temple occupies an area of over 2 acres , and its second prakaram has a 5 tiered rajagopuram . the vinayakar in this temple is also of great festivals six worship services are offered each day . kartikai deepam , arudra darisanam"
],
[
"[CLS] what is the average darshan duration ? [SEP]"
],
[
" "
],
[
" "
],
[
" "
],
[
" "
],
[
" "
],
[
" "
]
]
},
You can try the following approach. Since "answers" is a list of sublists, a small recursive helper can walk the sublists and remove every string that contains the given text.
import json

def remove_from_sublists(the_list, to_be_removed):
    # Iterate over a copy so that removing items from the_list is safe.
    for each_item in list(the_list):
        if isinstance(each_item, list):
            remove_from_sublists(each_item, to_be_removed)
        elif to_be_removed in each_item:
            the_list.remove(each_item)
    return the_list

dic = {}
with open('WebTempleCorpus.json') as json_file:
    data = json.load(json_file)
    for (i, v) in data.items():
        sub_dict = v
        if v.get("answers"):
            sub_dict["answers"] = remove_from_sublists(v["answers"], "CLS")
            sub_dict["answers"] = remove_from_sublists(v["answers"], "SEP")
        dic[i] = sub_dict

# Write the cleaned data to a new file.
with open('result.json', 'w') as fp:
    json.dump(dic, fp)

How can I sort JSON object?

I have been looking for a way to do this for a while now. I want to sort the following JSON file:
{
"rss": {
"-version": "2.0",
"channel": {
"title": "pubmed: wonpil im",
"link": "https://www.ncbi.nlm.nih.gov/sites/entrez?cmd=Search&db=PubMed&term=wonpil%20im",
"description": "NCBI: db=pubmed; Term=wonpil im",
"language": "en-us",
"docs": "http://blogs.law.harvard.edu/tech/rss",
"ttl": "1440",
"image": {
"title": "NCBI pubmed",
"url": "https://www.ncbi.nlm.nih.gov/entrez/query/static/gifs/iconsml.gif",
"link": "https://www.ncbi.nlm.nih.gov/sites/entrez",
"description": "PubMed comprises more than millions of citations for biomedical literature from MEDLINE, life science journals, and online books. Citations may include links to full-text content from PubMed Central and publisher web sites."
},
"item": [
{
"title": "Modeling and simulation of bacterial outer membranes and interactions with membrane proteins.",
"link": "https://www.ncbi.nlm.nih.gov/pubmed/28157627?dopt=Abstract",
"description": "
<table border=\"0\" width=\"100%\"><tr><td align=\"left\"><img src=\"//www.ncbi.nlm.nih.gov/corehtml/query/egifs/http:--linkinghub.elsevier.com-ihub-images-PubMedLink.gif\" border=\"0\"/> </td><td align=\"right\">Related Articles</td></tr></table>
<p><b>Modeling and simulation of bacterial outer membranes and interactions with membrane proteins.</b></p>
<p>Curr Opin Struct Biol. 2017 Jan 31;43:131-140</p>
<p>Authors: Patel DS, Qi Y, Im W</p>
<p>Abstract<br/>
The outer membrane (OM) of Gram-negative bacteria is composed of phospholipids in the periplasmic leaflet and lipopolysaccharides (LPS) in the external leaflet, along with β-barrel OM proteins (OMPs) and lipidated periplasmic lipoproteins. As a defensive barrier to toxic compounds, an LPS molecule has high antigenic diversity and unique combination of OM-anchored lipid A with core oligosaccharides and O-antigen polysaccharides, creating dynamic protein-LPS and LPS-LPS interactions. Here, we review recent efforts on modeling and simulation of native-like bacterial OMs to explore structures, dynamics, and interactions of different OM components and their roles in transportation of ions, substrates, and antibiotics across the OM and accessibility of monoclonal antibodies (mAbs) to surface epitopes. Simulation studies attempting to provide insight into the structural basis for LPS transport and OMP insertion in the bacterial OM are also highlighted.<br/>
</p><p>PMID: 28157627 [PubMed - as supplied by publisher]</p>
",
"author": " Patel DS, Qi Y, Im W",
"category": "Curr Opin Struct Biol",
"guid": {
"-isPermaLink": "false",
"#text": "PubMed:28157627"
}
},
{
"title": "Refinement of OprH-LPS Interactions by Molecular Simulations.",
"link": "https://www.ncbi.nlm.nih.gov/pubmed/28122220?dopt=Abstract",
"description": "
<table border=\"0\" width=\"100%\"><tr><td align=\"left\"><img src=\"//www.ncbi.nlm.nih.gov/corehtml/query/egifs/http:--linkinghub.elsevier.com-ihub-images-cellhub.gif\" border=\"0\"/> </td><td align=\"right\">Related Articles</td></tr></table>
<p><b>Refinement of OprH-LPS Interactions by Molecular Simulations.</b></p>
<p>Biophys J. 2017 Jan 24;112(2):346-355</p>
<p>Authors: Lee J, Patel DS, Kucharska I, Tamm LK, Im W</p>
<p>Abstract<br/>
The outer membrane (OM) of Gram-negative bacteria is composed of lipopolysaccharide (LPS) in the outer leaflet and phospholipids in the inner leaflet. The outer membrane protein H (OprH) of Pseudomonas aeruginosa provides an increased stability to the OMs by directly interacting with LPS. Here we report the influence of various P. aeruginosa and, for comparison, Escherichia coli LPS environments on the physical properties of the OMs and OprH using all-atom molecular dynamics simulations. The simulations reveal that although the P. aeruginosa OMs are thinner hydrophobic bilayers than the E. coli OMs, which is expected from the difference in the acyl chain length of their lipid A, this effect is almost imperceptible around OprH due to a dynamically adjusted hydrophobic match between OprH and the OM. The structure and dynamics of the extracellular loops of OprH show distinct behaviors in different LPS environments. Including the O-antigen greatly reduces the flexibility of the OprH loops and increases the interactions between these loops and LPS. Furthermore, our study shows that the interactions between OprH and LPS mainly depend on the secondary structure of OprH and the chemical structure of LPS, resulting in distinctive patterns in different LPS environments.<br/>
</p><p>PMID: 28122220 [PubMed - in process]</p>
",
"author": " Lee J, Patel DS, Kucharska I, Tamm LK, Im W",
"category": "Biophys J",
"guid": {
"-isPermaLink": "false",
"#text": "PubMed:28122220"
}
},
{
"title": "CHARMM-GUI MDFF/xMDFF Utilizer for Molecular Dynamics Flexible Fitting Simulations in Various Environments.",
"link": "https://www.ncbi.nlm.nih.gov/pubmed/27936734?dopt=Abstract",
"description": "
<table border=\"0\" width=\"100%\"><tr><td align=\"left\"><img src=\"//www.ncbi.nlm.nih.gov/corehtml/query/egifs/http:--pubs.acs.org-images-pubmed-acspubs.jpg\" border=\"0\"/> </td><td align=\"right\">Related Articles</td></tr></table>
<p><b>CHARMM-GUI MDFF/xMDFF Utilizer for Molecular Dynamics Flexible Fitting Simulations in Various Environments.</b></p>
<p>J Phys Chem B. 2016 Dec 23;:</p>
<p>Authors: Qi Y, Lee J, Singharoy A, McGreevy R, Schulten K, Im W</p>
<p>Abstract<br/>
X-ray crystallography and cryo-electron microscopy are two popular methods for the structure determination of biological molecules. Atomic structures are derived through the fitting and refinement of an initial model into electron density maps constructed by both experiments. Two computational approaches, MDFF and xMDFF, have been developed to facilitate this process by integrating the experimental data with molecular dynamics simulation. However, the setup of an MDFF/xMDFF simulation requires knowledge of both experimental and computational methods, which is not straightforward for nonexpert users. In addition, sometimes it is desirable to include realistic environments, such as explicit solvent and lipid bilayers during the simulation, which poses another challenge even for expert users. To alleviate these difficulties, we have developed MDFF/xMDFF Utilizer in CHARMM-GUI that helps users to set up an MDFF/xMDFF simulation. The capability of MDFF/xMDFF Utilizer is greatly enhanced by integration with other CHARMM-GUI modules, including protein structure manipulation, a diverse set of lipid types, and all-atom CHARMM and coarse-grained PACE force fields. With this integration, various simulation environments are available for MDFF Utilizer (vacuum, implicit/explicit solvent, and bilayers) and xMDFF Utilizer (vacuum and solution). In this work, three examples are shown to demonstrate the usage of MDFF/xMDFF Utilizer.<br/>
</p><p>PMID: 27936734 [PubMed - as supplied by publisher]</p>
",
"author": " Qi Y, Lee J, Singharoy A, McGreevy R, Schulten K, Im W",
"category": "J Phys Chem B",
"guid": {
"-isPermaLink": "false",
"#text": "PubMed:27936734"
}
},
]
}
}
}
I want the entries in "item" to be sorted according to their "PubMed" numbers, in the following way:
{
"rss": {
"-version": "2.0",
"channel": {
"title": "pubmed: wonpil im",
"link": "https://www.ncbi.nlm.nih.gov/sites/entrez?cmd=Search&db=PubMed&term=wonpil%20im",
"description": "NCBI: db=pubmed; Term=wonpil im",
"language": "en-us",
"docs": "http://blogs.law.harvard.edu/tech/rss",
"ttl": "1440",
"image": {
"title": "NCBI pubmed",
"url": "https://www.ncbi.nlm.nih.gov/entrez/query/static/gifs/iconsml.gif",
"link": "https://www.ncbi.nlm.nih.gov/sites/entrez",
"description": "PubMed comprises more than millions of citations for biomedical literature from MEDLINE, life science journals, and online books. Citations may include links to full-text content from PubMed Central and publisher web sites."
},
"item": [
{
"title": "CHARMM-GUI MDFF/xMDFF Utilizer for Molecular Dynamics Flexible Fitting Simulations in Various Environments.",
"link": "https://www.ncbi.nlm.nih.gov/pubmed/27936734?dopt=Abstract",
"description": "
<table border=\"0\" width=\"100%\"><tr><td align=\"left\"><img src=\"//www.ncbi.nlm.nih.gov/corehtml/query/egifs/http:--pubs.acs.org-images-pubmed-acspubs.jpg\" border=\"0\"/> </td><td align=\"right\">Related Articles</td></tr></table>
<p><b>CHARMM-GUI MDFF/xMDFF Utilizer for Molecular Dynamics Flexible Fitting Simulations in Various Environments.</b></p>
<p>J Phys Chem B. 2016 Dec 23;:</p>
<p>Authors: Qi Y, Lee J, Singharoy A, McGreevy R, Schulten K, Im W</p>
<p>Abstract<br/>
X-ray crystallography and cryo-electron microscopy are two popular methods for the structure determination of biological molecules. Atomic structures are derived through the fitting and refinement of an initial model into electron density maps constructed by both experiments. Two computational approaches, MDFF and xMDFF, have been developed to facilitate this process by integrating the experimental data with molecular dynamics simulation. However, the setup of an MDFF/xMDFF simulation requires knowledge of both experimental and computational methods, which is not straightforward for nonexpert users. In addition, sometimes it is desirable to include realistic environments, such as explicit solvent and lipid bilayers during the simulation, which poses another challenge even for expert users. To alleviate these difficulties, we have developed MDFF/xMDFF Utilizer in CHARMM-GUI that helps users to set up an MDFF/xMDFF simulation. The capability of MDFF/xMDFF Utilizer is greatly enhanced by integration with other CHARMM-GUI modules, including protein structure manipulation, a diverse set of lipid types, and all-atom CHARMM and coarse-grained PACE force fields. With this integration, various simulation environments are available for MDFF Utilizer (vacuum, implicit/explicit solvent, and bilayers) and xMDFF Utilizer (vacuum and solution). In this work, three examples are shown to demonstrate the usage of MDFF/xMDFF Utilizer.<br/>
</p><p>PMID: 27936734 [PubMed - as supplied by publisher]</p>
",
"author": " Qi Y, Lee J, Singharoy A, McGreevy R, Schulten K, Im W",
"category": "J Phys Chem B",
"guid": {
"-isPermaLink": "false",
"#text": "PubMed:27936734"
}
},
{
"title": "Modeling and simulation of bacterial outer membranes and interactions with membrane proteins.",
"link": "https://www.ncbi.nlm.nih.gov/pubmed/28157627?dopt=Abstract",
"description": "
<table border=\"0\" width=\"100%\"><tr><td align=\"left\"><img src=\"//www.ncbi.nlm.nih.gov/corehtml/query/egifs/http:--linkinghub.elsevier.com-ihub-images-PubMedLink.gif\" border=\"0\"/> </td><td align=\"right\">Related Articles</td></tr></table>
<p><b>Modeling and simulation of bacterial outer membranes and interactions with membrane proteins.</b></p>
<p>Curr Opin Struct Biol. 2017 Jan 31;43:131-140</p>
<p>Authors: Patel DS, Qi Y, Im W</p>
<p>Abstract<br/>
The outer membrane (OM) of Gram-negative bacteria is composed of phospholipids in the periplasmic leaflet and lipopolysaccharides (LPS) in the external leaflet, along with β-barrel OM proteins (OMPs) and lipidated periplasmic lipoproteins. As a defensive barrier to toxic compounds, an LPS molecule has high antigenic diversity and unique combination of OM-anchored lipid A with core oligosaccharides and O-antigen polysaccharides, creating dynamic protein-LPS and LPS-LPS interactions. Here, we review recent efforts on modeling and simulation of native-like bacterial OMs to explore structures, dynamics, and interactions of different OM components and their roles in transportation of ions, substrates, and antibiotics across the OM and accessibility of monoclonal antibodies (mAbs) to surface epitopes. Simulation studies attempting to provide insight into the structural basis for LPS transport and OMP insertion in the bacterial OM are also highlighted.<br/>
</p><p>PMID: 28157627 [PubMed - as supplied by publisher]</p>
",
"author": " Patel DS, Qi Y, Im W",
"category": "Curr Opin Struct Biol",
"guid": {
"-isPermaLink": "false",
"#text": "PubMed:28157627"
}
},
{
"title": "Refinement of OprH-LPS Interactions by Molecular Simulations.",
"link": "https://www.ncbi.nlm.nih.gov/pubmed/28122220?dopt=Abstract",
"description": "
<table border=\"0\" width=\"100%\"><tr><td align=\"left\"><img src=\"//www.ncbi.nlm.nih.gov/corehtml/query/egifs/http:--linkinghub.elsevier.com-ihub-images-cellhub.gif\" border=\"0\"/> </td><td align=\"right\">Related Articles</td></tr></table>
<p><b>Refinement of OprH-LPS Interactions by Molecular Simulations.</b></p>
<p>Biophys J. 2017 Jan 24;112(2):346-355</p>
<p>Authors: Lee J, Patel DS, Kucharska I, Tamm LK, Im W</p>
<p>Abstract<br/>
The outer membrane (OM) of Gram-negative bacteria is composed of lipopolysaccharide (LPS) in the outer leaflet and phospholipids in the inner leaflet. The outer membrane protein H (OprH) of Pseudomonas aeruginosa provides an increased stability to the OMs by directly interacting with LPS. Here we report the influence of various P. aeruginosa and, for comparison, Escherichia coli LPS environments on the physical properties of the OMs and OprH using all-atom molecular dynamics simulations. The simulations reveal that although the P. aeruginosa OMs are thinner hydrophobic bilayers than the E. coli OMs, which is expected from the difference in the acyl chain length of their lipid A, this effect is almost imperceptible around OprH due to a dynamically adjusted hydrophobic match between OprH and the OM. The structure and dynamics of the extracellular loops of OprH show distinct behaviors in different LPS environments. Including the O-antigen greatly reduces the flexibility of the OprH loops and increases the interactions between these loops and LPS. Furthermore, our study shows that the interactions between OprH and LPS mainly depend on the secondary structure of OprH and the chemical structure of LPS, resulting in distinctive patterns in different LPS environments.<br/>
</p><p>PMID: 28122220 [PubMed - in process]</p>
",
"author": " Lee J, Patel DS, Kucharska I, Tamm LK, Im W",
"category": "Biophys J",
"guid": {
"-isPermaLink": "false",
"#text": "PubMed:28122220"
}
},
]
}
}
}
That is, the items should simply be sorted by the number in each "#text" value, such as "#text": "PubMed:28122220". I would appreciate it if anyone could help.
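One possible approach, sketched in Python (the question does not name a language, so that choice is an assumption): take the number after "PubMed:" in each guid and use it as the sort key for rss.channel.item. The helper names pubmed_id and sort_items are made up, and the document below is a trimmed stand-in for the real file. Note that the snippet as posted has a trailing comma after the last item, which Python's json module rejects, so the real file would need to be valid JSON before json.load can read it. Ascending order is also an assumption.

```python
import json

def pubmed_id(item):
    """Numeric part of a guid such as 'PubMed:28122220'."""
    return int(item["guid"]["#text"].split(":", 1)[1])

def sort_items(doc):
    """Sort rss.channel.item in place by ascending PubMed number and return doc."""
    doc["rss"]["channel"]["item"].sort(key=pubmed_id)
    return doc

# Trimmed, hypothetical stand-in for the file in the question.
doc = {"rss": {"-version": "2.0", "channel": {"title": "pubmed: wonpil im", "item": [
    {"category": "Curr Opin Struct Biol", "guid": {"#text": "PubMed:28157627"}},
    {"category": "Biophys J", "guid": {"#text": "PubMed:28122220"}},
    {"category": "J Phys Chem B", "guid": {"#text": "PubMed:27936734"}},
]}}}

sorted_doc = sort_items(doc)
print(json.dumps(sorted_doc, indent=2))
```

Converting the id with int matters: sorting the raw strings would put "PubMed:9" after "PubMed:10". Passing reverse=True to sort gives descending order instead.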

Dataframe in R to be converted to sequence of JSON objects

I had asked the same question after editing 2 times of a previous question I had posted. I am sorry for the bad usage of this website. I have flagged it for deletion and I am posting a proper new question on the same here. Please look into this.
I am working on recommender system code. The output has to be converted to a sequence of JSON objects. I have a matrix that serves as a lookup table for every item ID, listing the closest items it is related to and the similarity scores associated with each combination.
Let me explain through an example.
Suppose I have the matrix below. In it, Item 1 is similar to Items 22 and 23 with similarity scores 0.8 and 0.5 respectively, and the remaining rows follow the same structure.
X1 X2 X3 X4 X5
1 22 23 0.8 0.5
34 4 87 0.4 0.4
23 7 92 0.6 0.5
I want a JSON structure for every item (every X1 for every row) along with the recommended items and the similarity scores for each combination as a separate JSON entity and this being done in sequence. I don't want an entire JSON object containing these individual ones.
Assume there is one more entity called "coid" that will be given as input to the code. I assume it is XYZ and it is same for all the rows.
{ "_id" : { "coid" : "XYZ", "iid" : "1"}, "items" : [ { "item" : "22", "score" : 0.8},{ "item": "23", "score" : 0.5}] }
{ "_id" : { "coid" : "XYZ", "iid" : "34"},"items" : [ { "item" : "4", "score" : 0.4},{ "item": "87", "score" : 0.4}] }
{ "_id" : { "coid" : "XYZ", "iid" : "23"},"items" : [ { "item" : "7", "score" : 0.6},{ "item": "92", "score" : 0.5}] }
As in the above, each entity is a valid JSON structure/object but they are not put together into a separate JSON object as a whole.
I appreciate all the help done for the previous question but somehow I feel this new alteration I have here is not related to them because in the end, if you do a toJSON(some entity), then it converts the entire thing to one JSON object. I don't want that.
I want individual ones like these to be written to a file.
I am very sorry for my ignorance and inconvenience. Please help.
Thanks.
library(rjson)
## Your matrix
mat <- matrix(c(1,34,23,
22, 4, 7,
23,87,92,
0.8, 0.4, 0.6,
0.5, 0.4, 0.5), byrow=FALSE, nrow=3)
I use a function (with the not very interesting name makejson) that takes a row of the matrix and returns a JSON object. It builds two list objects, _id and items, and combines them into one JSON object.
makejson <- function(x, coid="ABC") {
  `_id` <- list(coid = coid, iid=x[1])
  nitem <- (length(x) - 1) / 2 # Number of items
  items <- list()
  for(i in seq(1, nitem)) {
    items[[i]] <- list(item = x[i + 1], score = x[i + 1 + nitem])
  }
  toJSON(list(`_id`=`_id`, items=items))
}
Then using apply (or a for loop) I use the function for each row of the matrix.
res <- apply(mat, 1, makejson, coid="XYZ")
cat(res, sep = "\n")
## {"_id":{"coid":"XYZ","iid":1},"items":[{"item":22,"score":0.8},{"item":23,"score":0.5}]}
## {"_id":{"coid":"XYZ","iid":34},"items":[{"item":4,"score":0.4},{"item":87,"score":0.4}]}
## {"_id":{"coid":"XYZ","iid":23},"items":[{"item":7,"score":0.6},{"item":92,"score":0.5}]}
The result can be saved to a file with cat by specifying the file argument.
## cat(res, sep="\n", file="out.json")
There is a small difference between your output and mine: in yours, the numbers are in quotes ("). If you want that, mat has to be a character matrix.
## mat <- matrix(as.character(c(1,34,23, ...
Hope it helps,
alex

How to model boolean expressions in JSON tree structure

I've spent a few hours on google and stack overflow, but I'm yet to come to a conclusion on just how to model nested boolean data.
Let's say I have the following expression:
123 and 321 and (18 or 19 and (20 or 21))
How could I model this in a JSON tree structure so that I could rebuild the expression as you see it above by simply traversing the tree? I don't need to actually evaluate the logic, but simply structure it in such a way that it portrays the logic in tree-form.
Thanks in advance.
For the record, this is the type of system I'm trying to accomplish and how I'm guessing the tree should be structured based on the answer below.
ANY OF THESE:
    13
    14
    ALL OF THESE:
        18
        19
        20

       or
      /  \
    or    13
   /  \
 14    and
      /   \
    and    18
   /   \
 20     19
My ConditionSet in json format :
"FilterCondition": {
  "LogicalOperator": "AND",
  "Conditions": [
    {
      "Field": "age",
      "Operator": ">",
      "Value": "8"
    },
    {
      "LogicalOperator": "OR",
      "Conditions": [
        {
          "Field": "gender",
          "Operator": "=",
          "Value": "female"
        },
        {
          "Field": "occupation",
          "Operator": "IN",
          "Value": ["business","service"]
        }
      ]
    }
  ]
}
Reference : https://zebzhao.github.io/Angular-QueryBuilder/demo/
Think about the order in which a programming language would evaluate the parts of your statement. Depending on the precedence of and and or and on their left or right associativity, some part of the expression is the 'deepest' and must be evaluated first. Its result is then handed to its 'parent' (the nearest operator that binds less tightly) as one of that operator's fully evaluated operands. Once the parent is evaluated, its result goes to its own parent, and so on.
So, you would have a tree where the root is reached after full evaluation, and leaf nodes are the parts of the expression that can be evaluated first (don't rely on any evaluations to come to their result).
As a simple example, 1 and (2 or 3) would be modelled as
  and
 /   \
1     or
     /  \
    2    3
If operators at the same precedence are evaluated left to right and AND has higher precedence than OR (as is the case in C++, for example: http://en.cppreference.com/w/cpp/language/operator_precedence ) then
123 and 321 and (18 or 19 and (20 or 21))
becomes
          and
         /   \
      and     \
     /   \     \
   123   321    \
                 \
                  or
                 /  \
               18    and
                    /   \
                  or     19
                 /  \
               20    21
To evaluate this tree, you would evaluate the deepest nodes first, replacing each node with the result of applying its operator to its left and right children, until only one value is left at the root.
To go from a boolean expression to a boolean expression tree programmatically, you need to write a parser. In Python, for example, you could write one using PLY ( http://www.dabeaz.com/ply/ ); most languages have their own popular third-party parser construction library.
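As a rough sketch of the "rebuild the expression by traversing the tree" part of the question (no parsing, no evaluation), here is one way such a tree could be stored as JSON and turned back into text in Python. The "op" and "args" key names are made up for illustration, and the sketch uses n-ary nodes (123 and 321 and ... is a single and node with three children) rather than the strictly binary tree drawn above. It also assumes and binds tighter than or, as discussed.

```python
import json

# Assumption: "and" binds tighter than "or"; this decides where parentheses go.
PRECEDENCE = {"or": 1, "and": 2}

def to_text(node, parent_prec=0):
    """Rebuild the expression string by walking the tree depth-first."""
    if not isinstance(node, dict):  # leaf: a plain value such as 123
        return str(node)
    prec = PRECEDENCE[node["op"]]
    text = f" {node['op']} ".join(to_text(arg, prec) for arg in node["args"])
    # Parenthesise only when this operator binds less tightly than its parent.
    return f"({text})" if prec < parent_prec else text

tree = {"op": "and", "args": [
    123,
    321,
    {"op": "or", "args": [
        18,
        {"op": "and", "args": [19, {"op": "or", "args": [20, 21]}]},
    ]},
]}

as_json = json.dumps(tree)             # the tree is plain JSON
rebuilt = to_text(json.loads(as_json))
print(rebuilt)  # 123 and 321 and (18 or 19 and (20 or 21))
```

The n-ary form maps directly onto "ANY OF THESE" / "ALL OF THESE" style groups, which is roughly the shape the FilterCondition example uses with its LogicalOperator and Conditions keys.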