JSONPath Syntax when dot in key - json

Please forgive me if I use the incorrect terminology, I am quite the novice.
I have some simple JSON:
{
"properties": {
"footer.navigationLinks": {
"group": "layout"
, "default": [
{
"text": "Link a"
, "href": "#"
}
]
}
}
}
I am trying to pinpoint "footer.navigationLinks" but I am having trouble with the dot in the key name. I am using http://jsonpath.com/ and when I enter
$.properties['footer.navigationLinks']
I get 'No match'. If I change the key to "footernavigationLinks" it works but I cannot control the key names in the JSON file.
Please can someone help me target that key name?

Having a json response:
{
"0": {
"SKU": "somevalue",
"Merchant.Id": 234
}
}
I can target a key with a . (dot) in the name.
jsonPath.getJsonObject("0.\"Merchant.Id\"")
Note the quotes, and the fact that they are escaped.
I'm not sure about other versions, but I'm using
'com.jayway.restassured', name: 'json-path', version: '2.9.0'
A few of the samples/solutions I've seen used single quotes with brackets, but that did not work for me.

For information, jsonpath.com has been patched since the question was asked, and it now works for the example given in the question. I tried these paths successfully:
$.properties['footer.navigationLinks']
$.properties.[footer.navigationLinks]
$.properties.['footer.navigationLinks']
$['properties']['footer.navigationLinks']
$.['properties'].['footer.navigationLinks']
properties.['footer.navigationLinks']
etc.

This issue was reported in 2007 as issue #4 (Member names containing dot fail) and was fixed.
The fix is not present in this online jsonpath.com implementation, but it is fixed in this old archive and probably in most of the forks that have been created since (like here and here).
Details about the bug
A comparison between the buggy and the 2007-corrected versions of the code reveals that the correction was made in the private normalize function.
In the 2007-corrected version it reads:
normalize: function(expr) {
var subx = [];
return expr.replace(/[\['](\??\(.*?\))[\]']|\['(.*?)'\]/g, function($0,$1,$2){
return "[#"+(subx.push($1||$2)-1)+"]";
}) /* http://code.google.com/p/jsonpath/issues/detail?id=4 */
.replace(/'?\.'?|\['?/g, ";")
.replace(/;;;|;;/g, ";..;")
.replace(/;$|'?\]|'$/g, "")
.replace(/#([0-9]+)/g, function($0,$1){
return subx[$1];
});
},
The first and last replace in that sequence make sure the second replace does not interpret a dot in a property name as a property separator.
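The placeholder trick described above can be sketched in Python. This is a simplified illustration, not a port of the library: the names normalize and lookup are made up for this example, only the ['name'] bracket form is handled, and filters, wildcards and recursive descent are ignored.

```python
import re

def normalize(expr):
    """Split a JSONPath-like expression into steps, keeping dots inside ['...'] intact."""
    saved = []

    def stash(match):
        # Remember the quoted name and leave a numbered placeholder in its place.
        saved.append(match.group(1))
        return "[#%d]" % (len(saved) - 1)

    # First replace: hide every ['name'] so that its dots are out of sight.
    expr = re.sub(r"\['(.*?)'\]", stash, expr)
    # Second replace: every remaining dot or opening bracket is a real separator.
    expr = re.sub(r"\.|\[", ";", expr)
    # Drop the closing brackets left over from the placeholders.
    expr = expr.replace("]", "")
    # Last replace: put the saved names back.
    expr = re.sub(r"#(\d+)", lambda m: saved[int(m.group(1))], expr)
    return expr.split(";")

def lookup(doc, expr):
    """Walk a parsed JSON document along the normalized steps (skipping the leading $)."""
    node = doc
    for step in normalize(expr)[1:]:
        node = node[step]
    return node

doc = {"properties": {"footer.navigationLinks": {"group": "layout"}}}
steps = normalize("$.properties['footer.navigationLinks']")
found = lookup(doc, "$.properties['footer.navigationLinks']")
```

Without the first and last substitutions, the second one would split footer.navigationLinks into two steps, which is the behaviour the bug report describes.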
I had a look at the more up-to-date forks that have been made since then, and the code has evolved enormously.
Conclusion:
jsonpath.com is based on an outdated version of JSONPath and is not reliable for previewing what current libraries would provide you with.

You can enclose the key containing dots in single quotes, as below:
response.jsonpath().get("properties.'footer.navigationLinks'")
Or even escape the single quotes as shown:
response.jsonpath().get("properties.\'footer.navigationLinks\'")
Both work fine


How to access the key of a jsoncpp Value

I kind of feel stupid for asking this, but haven't been able to find a way to get the key of a JSON value. I know how to retrieve the key if I have an iterator of the object. I also know of operator[].
In my case the key is not a known value, so can't use get(const char *key) or operator[]. Also can't find a getKey() method.
My JSON looks like this:
{Obj_Array: [{"122":{"Member_Array":["241", "642"]}}]}
For the piece of code to parse {"122":{"Member_Array":["241", "642"]}} I want to use get_key()-like function just to retrieve "122" but seems like I have to use an iterator which to me seems to be overkill.
I might have a fundamental lack of understanding of how jsoncpp is representing a JSON file.
First, what you have won't parse in JsonCPP. Keys must always be enclosed in double quotes:
{"Obj_Array": [{"122":{"Member_Array":["241", "642"]}}]}
Assuming that was just an oversight, if we add whitespace and tag the elements:
{
root-> "Obj_Array" : [
elem0-> {
key0-> "122":
val0-> {
key0.1-> "Member_Array" :
val0.1-> [
elem0.1.0-> "241",
elem0.1.1-> "642" ]
}
}
]
}
Assuming you have managed to read your data into a Json::Value (let's call it root), each of the tagged values can be accessed like this:
elem0 = root["Obj_Array"][0];
val0 = elem0["122"];
val0_1 = val0["Member_Array"];
elem0_1_0 = val0_1[0];
elem0_1_1 = val0_1[1];
You notice that this only retrieves values; the keys were known a priori. This is not unusual; the keys define the schema of the data; you have to know them to directly access the values.
In your question, you state that this is not an option, because the keys are not known. Applying semantic meaning to unknown keys could be challenging, but you already came to the answer. If you want to get the key values, then you do have to iterate over the elements of the enclosing Json::Value.
So, to get to key0, you need something like this (untested):
elem0_members = elem0.getMemberNames();
key0 = elem0_members[0];
This isn't production quality, by any means, but I hope it points in the right direction.
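The same idea, that an unknown key can only be discovered by iterating over the object that contains it, can be illustrated with Python's standard json module. This is not JsonCpp code; it is only a sketch of the access pattern on the corrected document from above.

```python
import json

# The corrected document from the answer above, with the key in double quotes.
doc = json.loads('{"Obj_Array": [{"122": {"Member_Array": ["241", "642"]}}]}')

elem0 = doc["Obj_Array"][0]

# The key is not known in advance, so take the first (and only) member name
# by iterating over the enclosing object.
key0 = next(iter(elem0))
val0 = elem0[key0]
members = val0["Member_Array"]
```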

Retain trailing 's' for table in Postgraphile

Is there a way to disable the 'remove-the-plural-s' feature in Postgraphile?
I have a table OS in my database and am using the very awesome Postgraphile library to create a GraphQL interface for free. Everything is great, but Postgraphile is truncating my table name, thinking it is plural. So I get allOs instead of allOses and createO, updateO, etc...
I tried:
Adding an underscore after the table name, and then it just retains the entire thing with an underscore.
Adding an underscore (O_S) and then the plural has capital-s allOS but the singular is O_
A smart comment specifying E'@name os' but it still drops the s
A smart comment specifying E'@name oss' which then pluralizes correctly allOsses (haha) and keeps both for the singular oss
PS in case you see this Benjie/other contributors, your documentation is incredible and the library will save me months of work.
This change is performed by PostGraphile's inflector; however it doesn't always get it right (e.g. in this case) but fortunately it's possible to override it with a small plugin.
In this case, it's probably best to add specific exceptions to the pluralize and singularize functions; you can do this using makeAddInflectorsPlugin from our inflection system. Be sure to pass true as the second argument so that the system knows you're deliberately overwriting the inflectors.
const { makeAddInflectorsPlugin } = require('graphile-utils');
module.exports = makeAddInflectorsPlugin(oldInflectors => ({
pluralize(str) {
if (str.match(/^os$/i)) {
return str + 'ses';
}
return oldInflectors.pluralize(str);
},
singularize(str) {
if (str.match(/^osses$/i)) {
return str.substr(0, 2);
}
return oldInflectors.singularize(str);
}
}), true);
I'm glad you're enjoying PostGraphile 🤘

What's the difference between the 'originalText' and 'word' keys in a token?

When using CoreNLPParser from NLTK with CoreNLP Server, the resulting tokens contain both an 'originalText' key and a 'word' key.
What's the difference between the two? Is there any documentation about them?
I've only found this issue, which mentioned the originalText key, but it doesn't answer my questions.
from nltk.parse.corenlp import CoreNLPParser
corenlp_parser = CoreNLPParser('http://localhost:9000', encoding='utf8')
text = u'我家没有电脑。'
result = corenlp_parser.api_call(text, {'annotators': 'tokenize,ssplit'})
print(result)
prints
{
"sentences":[
{
"index":0,
"tokens":[
{
"index":1,
"word":"我家",
"originalText":"我家",
"characterOffsetBegin":0,
"characterOffsetEnd":2
},
{
"index":2,
"word":"没有",
"originalText":"没有",
"characterOffsetBegin":2,
"characterOffsetEnd":4
},
{
"index":3,
"word":"电脑",
"originalText":"电脑",
"characterOffsetBegin":4,
"characterOffsetEnd":6
},
{
"index":4,
"word":"。",
"originalText":"。",
"characterOffsetBegin":6,
"characterOffsetEnd":7
}
]
}
]
}
Update:
It seems the Token implements HasWord and HasOriginalText
The word is transformed slightly so that, for example, it can be printed in an S-expression (i.e., a parse tree). So parentheses and other brackets become tokens like -LRB- (left round bracket). In addition, quotes are normalized to backticks (``) and forward ticks (''), along with a few other small changes.
originalText, by contrast, is the literal original text of the token that can be used to reconstruct the original sentence.
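As a rough illustration of that difference, the sketch below rebuilds a sentence from originalText and the character offsets. The helper name reconstruct is made up for this example, the second token list is hand-written to show the -LRB- style normalization described above rather than copied from a real server response, and gaps between tokens are assumed to be plain spaces.

```python
def reconstruct(tokens):
    """Rebuild the source text from originalText, padding gaps between offsets with spaces."""
    pieces = []
    cursor = tokens[0]["characterOffsetBegin"] if tokens else 0
    for tok in tokens:
        # Assumption: anything skipped between two tokens was whitespace.
        pieces.append(" " * (tok["characterOffsetBegin"] - cursor))
        pieces.append(tok["originalText"])
        cursor = tok["characterOffsetEnd"]
    return "".join(pieces)

# Tokens from the response shown in the question (only the relevant keys).
chinese = [
    {"word": "我家", "originalText": "我家", "characterOffsetBegin": 0, "characterOffsetEnd": 2},
    {"word": "没有", "originalText": "没有", "characterOffsetBegin": 2, "characterOffsetEnd": 4},
    {"word": "电脑", "originalText": "电脑", "characterOffsetBegin": 4, "characterOffsetEnd": 6},
    {"word": "。", "originalText": "。", "characterOffsetBegin": 6, "characterOffsetEnd": 7},
]

# Hand-written, hypothetical tokens for the text "(hi)" with normalized words.
bracketed = [
    {"word": "-LRB-", "originalText": "(", "characterOffsetBegin": 0, "characterOffsetEnd": 1},
    {"word": "hi", "originalText": "hi", "characterOffsetBegin": 1, "characterOffsetEnd": 3},
    {"word": "-RRB-", "originalText": ")", "characterOffsetBegin": 3, "characterOffsetEnd": 4},
]

from_original = reconstruct(bracketed)
from_word = "".join(tok["word"] for tok in bracketed)
```

For the Chinese example both keys are identical, so either would do; the two only diverge when a token has been normalized.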

Convert plain text with a specific format into JSON in VIM

All my university notes are in JSON format and when I get a set of practical questions from a pdf it is formatted like this:
1. Download and compile the code. Run the example to get an understanding of how it works. (Note that both
threads write to the standard output, and so there is some mixing up of the two conceptual streams, but this
is an interface issue, not of concern in this course.)
2. Explore the classes SumTask and StringTask as well as the abstract class Task.
3. Modify StringTask.java so that it also writes out “Executing a StringTask task” when the execute() method is
called.
4. Create a new subclass of Task called ProdTask that prints out the product of a small array of int. (You will have
to add another option in TaskGenerationThread.java to allow the user to generate a ProdTask for the queue.)
Note: you might notice strange behaviour with a naïve implementation of this and an array of int that is larger
than 7 items with numbers varying between 0 (inclusive) and 20 (exclusive); see ProdTask.java in the answer
for a discussion.
5. Play with the behaviour of the processing thread so that it polls more frequently and a larger number of times,
but “pop()”s off only the first task in the queue and executes it.
6. Remove the “taskType” member variable definition from the abstract Task class. Then add statements such as
the following to the SumTask class definition:
private static final String taskType = "SumTask";
Investigate what “static” and “final” mean.
7. More challenging: write an interface and modify the SumTask, StringTask and ProdTask classes so that they
implement this interface. Here’s an example interface:
What I would like to do is copy it into vim and execute a find and replace to convert it into this:
"1": {
"Task": "Download and compile the code. Run the example to get an understanding of how it works. (Note that both threads write to the standard output, and so there is some mixing up of the two conceptual streams, but this is an interface issue, not of concern in this course.)",
"Solution": ""
},
"2": {
"Task": "Explore the classes SumTask and StringTask as well as the abstract class Task.",
"Solution": ""
},
"3": {
"Task": "Modify StringTask.java so that it also writes out “Executing a StringTask task” when the execute() method is called.",
"Solution": ""
},
"4": {
"Task": "Create a new subclass of Task called ProdTask that prints out the product of a small array of int. (You will have to add another option in TaskGenerationThread.java to allow the user to generate a ProdTask for the queue.) Note: you might notice strange behaviour with a naïve implementation of this and an array of int that is larger than 7 items with numbers varying between 0 (inclusive) and 20 (exclusive); see ProdTask.java in the answer for a discussion.",
"Solution": ""
},
"5": {
"Task": "Play with the behaviour of the processing thread so that it polls more frequently and a larger number of times, but “pop()”s off only the first task in the queue and executes it.",
"Solution": ""
},
"6": {
"Task": "Remove the “taskType” member variable definition from the abstract Task class. Then add statements such as the following to the SumTask class definition: private static final String taskType = 'SumTask'; Investigate what “static” and “final” mean.",
"Solution": ""
},
"7": {
"Task": "More challenging: write an interface and modify the SumTask, StringTask and ProdTask classes so that they implement this interface. Here’s an example interface:",
"Solution": ""
}
After trying to figure this out during the practical (instead of actually doing the practical) this is the closest I got:
%s/\([1-9][1-9]*\)\. \(\_.\{-}\)--end--/"\1": {\r "Task": "\2",\r"Solution": "" \r},/g
The 3 problems with this are
I have to add --end-- to the end of each question. I would like it to know when the question ends by looking ahead to a line which starts with [1-9][1-9]*. Unfortunately, when I search for that, it also replaces that part.
This keeps all the new lines within the question (which is invalid in JSON). I would like it to remove the new lines.
The last entry should not contain a "," after the input because that would also be invalid JSON (Note I don't mind this very much as it is easy to remove the last "," manually)
Please keep in mind I am very bad at regular expressions and one of the reasons I am doing this is to learn more about regex so please explain any regex you post as a solution.
In two steps:
%s/\n/\ /g
to solve problem 2, and then:
%s/\([1-9][1-9]*\)\. \(\_.\{-}\([1-9][1-9]*\. \|\%$\)\@=\)/"\1": {\r "Task": "\2",\r"Solution": "" \r},\r/g
to solve problem 1.
You can solve problem 3 with another replace round. Also, my solution inserts an unwanted extra space at the end of the task entries. Try to remove it yourself.
Short explanation of what I have added:
\|: or;
\%$: end of file;
\@=: find but don't include in match (a lookahead).
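For comparison, the same lookahead idea can be sketched outside Vim with Python's re module, where (?=...) plays the role of Vim's lookahead and \Z stands in for end of file. The function name tasks_by_lookahead is made up for this example, and it shares the weakness of the Vim version: a wrapped line that happens to start with a number and a dot would be treated as a new item.

```python
import json
import re

# A numbered item runs from "N. " at the start of a line up to (but not
# including) the next "N. " at the start of a line, or the end of the text.
ITEM = re.compile(r"(?ms)^(\d+)\.\s+(.*?)(?=^\d+\.\s|\Z)")

def tasks_by_lookahead(text):
    result = {}
    for match in ITEM.finditer(text):
        number, body = match.groups()
        # Collapse line breaks and runs of spaces into single spaces.
        result[number] = {"Task": " ".join(body.split()), "Solution": ""}
    return result

sample = "1. Download and\ncompile the code.\n2. Explore the\nclasses.\n"
tasks = tasks_by_lookahead(sample)
as_json = json.dumps(tasks, indent=2, ensure_ascii=False)
```

Because the lookahead does not consume the next item's number, that number is still available as the start of the following match.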
If each item sits on a single line, I would transform the text with a macro; it is shorter and more straightforward than the :s:
I"<esc>f.s": {<enter>"Task": "<esc>A",<enter>"Solution": ""<enter>},<esc>+
Record this macro in a register, like q, then you can just replay it like 100@q to do the transformation.
Note that
the result will leave a comma , at the end, just remove it.
You can also add indentation during your macro recording, then your JSON will be "pretty printed". Or you can prettify it later with another tool.
You could probably do this with one large regular expression, but that quickly becomes unmaintainable. I would break the task up into 3 steps instead:
Separate each numbered step into its own paragraph.
Put each paragraph on its own line.
Generate the JSON.
Taken together:
%s/^[0-9]\+\./\r&/
%s/\(\S\)\n\(\S\)/\1 \2/
%s/^\([0-9]\+\)\. *\(.*\)/"\1": {\r "Task": "\2",\r "Solution": ""\r},/
This solution also leaves a comma after the last element. This can be removed with:
$s/,//
Explanation
%s/^[0-9]\+\./\r&/ this matches a line starting with a number followed by a dot, e.g. 1., 8., 13., 131., etc. and replaces it with a newline (\r) followed by the match (&).
%s/\(\S\)\n\(\S\)/\1 \2/ this removes any newline that is flanked by non-white-space characters (\S).
%s/^\([0-9]\+\)\. *\(.*\) ... capture the number and text in \1 and \2.
... /"\1": {\r "Task": "\2",\r "Solution": ""\r},/ format text appropriately.
Alternative way using sed, awk and jq
You can perform steps one and two from above straightforwardly with sed and awk:
sed 's/^[0-9]\+\./\n&/' infile
awk '$1=$1; { print "\n" }' RS= ORS=' '
Using jq for the third step ensures that the output is valid JSON:
jq -R 'match("([0-9]+). *(.*)") | .captures | {(.[0].string): { "Task": (.[1].string), "Solution": "" } }'
Here as one command line:
sed 's/^[0-9]\+\./\n&/' infile |
awk '$1=$1; { print "\n" }' RS= ORS=' ' |
jq -R 'match("([0-9]+). *(.*)") | .captures | {(.[0].string): { "Task": (.[1].string), "Solution": "" } }'
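The same three steps can also be sketched in Python, letting the json module take care of quoting and of the trailing comma. The function name tasks_in_three_steps is made up for this example, and it assumes, like the commands above, that every item starts with a number and a dot at the beginning of a line.

```python
import json
import re

def tasks_in_three_steps(text):
    # Step 1: put a blank line before every numbered item.
    text = re.sub(r"(?m)^\d+\.", r"\n\g<0>", text)
    # Step 2: join each paragraph into a single line.
    paragraphs = [" ".join(p.split()) for p in re.split(r"\n\s*\n", text) if p.strip()]
    # Step 3: capture the number and the text, then build the structure.
    result = {}
    for paragraph in paragraphs:
        match = re.match(r"(\d+)\.\s*(.*)", paragraph)
        if match:
            result[match.group(1)] = {"Task": match.group(2), "Solution": ""}
    return result

notes = "1. Download and\ncompile the code.\n2. Explore the\nclasses.\n"
converted = tasks_in_three_steps(notes)
output = json.dumps(converted, indent=2, ensure_ascii=False)
```

Serializing with json.dumps also escapes any double quotes inside a task, which the plain substitutions do not.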

Difference Between Two Mongo Queries

What is the difference between these two mongo queries?
db.test.find({"field" : "Value"})
db.test.find({field : "Value"})
The mongo shell accepts both.
There is no difference in your example.
The problem happens when your field names contain characters which cannot be part of an identifier in JavaScript (because the query is written in a JavaScript REPL/shell).
For example, user-name, because it contains a hyphen.
Then you would have to query like db.test.find({"user-name" : "Value"})
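A rough way to check whether a field name can be written without quotes is to test it against a simplified identifier pattern. The sketch below is an approximation under stated assumptions: the helper name needs_quotes is made up, only ASCII identifiers are considered, and real JavaScript also allows Unicode letters and unquoted numeric keys.

```python
import re

# Simplified, ASCII-only version of a JavaScript identifier.
IDENTIFIER = re.compile(r"^[A-Za-z_$][A-Za-z0-9_$]*$")

def needs_quotes(field_name):
    """Return True if the field name must be quoted in a shell query object."""
    return IDENTIFIER.match(field_name) is None

checks = {name: needs_quotes(name) for name in ["field", "user-name", "field.value", "your name"]}
```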
For the mongo shell there is no actual difference, but in some other language cases it does matter.
The real issue here is what counts as valid JSON. I try to use JSON in my responses on this forum and others, because JSON is a data format that can easily be "parsed" into native data structures, whereas the alternative "JavaScript" notation may not be translated so easily.
There are certain cases where the quoting is required, as in:
db.test.find({ "field-value": 1 })
or:
db.test.find({ "field.value": 1 })
As the keys would otherwise be "invalid JavaScript".
But the real point here is adhering to the JSON form.
You can understand this with an example: suppose that you have a test collection with two records
{
'_id': ObjectId("5370a826fc55bb23128b4568"),
'name': 'nanhe'
}
{
'_id': ObjectId("5370a75bfc55bb23128b4567"),
'your name': 'nanhe'
}
db.test.find({'your name':'nanhe'});
{ "_id" : ObjectId("5370a75bfc55bb23128b4567"), "your name" : "nanhe" }
db.test.find({your name:'nanhe'});
SyntaxError: Unexpected identifier