OpenEdge ABL reserved keyword as temp-table field name (inferred from JSON data) - json

I am stuck with the following situation:
My method receives a response from external REST API call. The JSON response structure is as below:
{
"members": [
{
"email_address": "random#address.org",
"status": "randomstatus"
},
...etc...
]}
I am reading this to temp-table with READ-JSON (Inferring ABL schema from JSON data) and try to process the temp-table. And this is where I am stuck:
when I am trying to put together a query that contains temp-table field "status", the error is raised.
Example:
hQuery:QUERY-PREPARE('FOR EACH ' + httSubscriber:NAME + ' WHERE ' + hBuffer:BUFFER-FIELD(iStatus):NAME + ' = "randomstatus"').
gives:
**Unable to understand after -- "members WHERE".(247)
I have tried referencing directly by name as well, same result.
Probably the "status" is a reserved keyword in ABL. Might that be the case? And how can I get over this issue to reference that "status" field?
Unfortunately the format and key names of JSON response are not under my control and I have to work with that.

You could use SERIALIZE-NAME in the temp-table definition to internally rename the field in question. Then you would have to refer to the field with another name but in it's serialized form it would still be known as status.
Here's an example where the status-field is renamed to exampleStatus.
DEFINE TEMP-TABLE ttExample NO-UNDO
FIELD exampleStatus AS CHARACTER SERIALIZE-NAME "status".
/* Code to read json goes here... */
/* Access the field */
FOR EACH ttExample:
DISPLAY ttExample.exampleStatus.
END.

I've been known to do silly things like this:
JSONData = replace( JSONData, '"status":', '"xstatus":' ).

Try naming the temp-table (hard-coded or via string appending) + '.' + hBuffer:BUFFER-FIELD(iStatus):NAME (...)
It should help the compiler understand you're talking about the field. Since it's not restricted, this should force its hand and allow you to query.

Related

Redshift JSON Parsing

I have some JSON data in Redshift table of type character varying. An example entry is:
[{"value":["*"], "key":"testData"}, {"value":"["GGG"], key: "differentData"}]
I want to return vales based on keys, how can i do this? I'm attempting to do something like
json_extract_path_text(column, 'value') but unfortunately it errors out. Any ideas?
So the first issue is that your string isn't valid JSON. There are mismatched and missing quotes. I think you mean:
[{"value":["*"], "key":"testData"}, {"value":["GGG"], "key": "differentData"}]
I don't know if this is a data issue or a transcription error but these functions won't work unless the json text is valid.
The next thing to consider is that at the top level this json is an array so you will need to use json_extract_array_element_text() function to pick up an element of the array. For example:
json_extract_array_element_text('json string', 0)
So putting this together we can extract the first "value" with (untested):
json_extract_path_text(
json_extract_array_element_text(
'[{"value":["*"], "key":"testData"}, {"value":["GGG"], "key": "differentData"}]', 0
), 'value'
)
Should return the string ["*"].

How to access the key of a jsoncpp Value

I kind of feel stupid for asking this, but haven't been able to find a way to get the key of a JSON value. I know how to retrieve the key if I have an iterator of the object. I also know of operator[].
In my case the key is not a known value, so can't use get(const char *key) or operator[]. Also can't find a getKey() method.
My JSON looks like this:
{Obj_Array: [{"122":{"Member_Array":["241", "642"]}}]}
For the piece of code to parse {"122":{"Member_Array":["241", "642"]}} I want to use get_key()-like function just to retrieve "122" but seems like I have to use an iterator which to me seems to be overkill.
I might have a fundamental lack of understanding of how jsoncpp is representing a JSON file.
First, what you have won't parse in JsonCPP. Keys must always be enclosed in double quotes:
{"Obj_Array": [{"122":{"Member_Array":["241", "642"]}}]}
Assuming that was just an oversight, if we add whitespace and tag the elements:
{
root-> "Obj_Array" : [
elem0-> {
key0-> "122":
val0-> {
key0.1-> "Member_Array" :
val0.1-> [
elem0.1.0-> "241",
elem0.1.1-> "642" ]
}
}
]
}
Assuming you have managed to read your data into a Json::Value (let's call it root), each of the tagged values can be accessed like this:
elem0 = root[0];
val0 = elem0["122"]
val0_1 = val0["Member_Array"];
elem0_1_0 = val0_1[0];
elem0_1_1 = val0_1[1];
You notice that this only retrieves values; the keys were known a priori. This is not unusual; the keys define the schema of the data; you have to know them to directly access the values.
In your question, you state that this is not an option, because the keys are not known. Applying semantic meaning to unknown keys could be challenging, but you already came to the answer. If you want to get the key values, then you do have to iterate over the elements of the enclosing Json::Value.
So, to get to key0, you need something like this (untested):
elem0_members = elem0.getMemberNames();
key0 = elem0_members[0];
This isn't production quality, by any means, but I hope it points in the right direction.

How do I search for a string in this JSON with Python

My JSON file looks something like:
{
"generator": {
"name": "Xfer Records Serum",
....
},
"generator": {
"name: "Lennar Digital Sylenth1",
....
}
}
I ask the user for search term and the input is searched for in the name key only. All matching results are returned. It means if I input 's' only then also both the above ones would be returned. Also please explain me how to return all the object names which are generators. The more simple method the better it will be for me. I use json library. However if another library is required not a problem.
Before switching to JSON I tried XML but it did not work.
If your goal is just to search all name properties, this will do the trick:
import re
def search_names(term, lines):
name_search = re.compile('\s*"name"\s*:\s*"(.*' + term + '.*)",?$', re.I)
return [x.group(1) for x in [name_search.search(y) for y in lines] if x]
with open('path/to/your.json') as f:
lines = f.readlines()
print(search_names('s', lines))
which would return both names you listed in your example.
The way the search_names() function works is it builds a regular expression that will match any line starting with "name": " (with varying amount of whitespace) followed by your search term with any other characters around it then terminated with " followed by an optional , and the end of string. Then applies that to each line from the file. Finally it filters out any non-matching lines and returns the value of the name property (the capture group contents) for each match.

Regular expression to remove the key:value from hash / json

Suppose I have following json and I want to skip the entry "data_type" from it.
{
"marketing_type":"FIT",
"controllable":"true",
"plannable":"true",
"sbm_qualified":"true",
"marginal_cost":"{:type=>\"float\", :label=>\"Marginal Cost to steer\",:unit=>\"$/MWh\", :default=>100} must be float.",
"data_type": "any_value",
"start_cost":"{:type=>\"float\", :label=>\"Start Cost\", :unit=>\"$\",:default=>0} must be float."
}
Expected output is "data_type" entry should be removed from above.
Instead of using regex and string manipulations, and if you're running at least MySQL 5.7, you can use one of the built-in JSON functions, json_remove:
update table_name set column_name = json_remove(column_name, "$.data_type")

Parse complex Json string contained in Hadoop

I want to parse a string of complex JSON in Pig. Specifically, I want Pig to understand my JSON array as a bag instead of as a single chararray. I found that complex JSON can be parsed by using Twitter's Elephant Bird or Mozilla's Akela library. (I found some additional libraries, but I cannot use 'Loader' based approach since I use HCatalog Loader to load data from Hive.)
But, the problem is the structure of my data; each value of Map structure contains value part of complex JSON. For example,
1. My table looks like (WARNING: type of 'complex_data' is not STRING, a MAP of <STRING, STRING>!)
TABLE temp_table
(
user_id BIGINT COMMENT 'user ID.',
complex_data MAP <STRING, STRING> COMMENT 'complex json data'
)
COMMENT 'temp data.'
PARTITIONED BY(created_date STRING)
STORED AS RCFILE;
2. And 'complex_data' contains (a value that I want to get is marked with two *s, so basically #'d'#'f' from each PARSED_STRING(complex_data#'c') )
{ "a": "[]",
"b": "\"sdf\"",
"**c**":"[{\"**d**\":{\"e\":\"sdfsdf\"
,\"**f**\":\"sdfs\"
,\"g\":\"qweqweqwe\"},
\"c\":[{\"d\":21321,\"e\":\"ewrwer\"},
{\"d\":21321,\"e\":\"ewrwer\"},
{\"d\":21321,\"e\":\"ewrwer\"}]
},
{\"**d**\":{\"e\":\"sdfsdf\"
,\"**f**\":\"sdfs\"
,\"g\":\"qweqweqwe\"},
\"c\":[{\"d\":21321,\"e\":\"ewrwer\"},
{\"d\":21321,\"e\":\"ewrwer\"},
{\"d\":21321,\"e\":\"ewrwer\"}]
},]"
}
3. So, I tried... (same approach for Elephant Bird)
REGISTER '/path/to/akela-0.6-SNAPSHOT.jar';
DEFINE JsonTupleMap com.mozilla.pig.eval.json.JsonTupleMap();
data = LOAD temp_table USING org.apache.hive.hcatalog.pig.HCatLoader();
values_of_map = FOREACH data GENERATE complex_data#'c' AS attr:chararray; -- IT WORKS
-- dump values_of_map shows correct chararray data per each row
-- eg) ([{"d":{"e":"sdfsdf","f":"sdfs","g":"sdf"},... },
{"d":{"e":"sdfsdf","f":"sdfs","g":"sdf"},... },
{"d":{"e":"sdfsdf","f":"sdfs","g":"sdf"},... }])
([{"d":{"e":"sdfsdf","f":"sdfs","g":"sdf"},... },
{"d":{"e":"sdfsdf","f":"sdfs","g":"sdf"},... },
{"d":{"e":"sdfsdf","f":"sdfs","g":"sdf"},... }]) ...
attempt1 = FOREACH data GENERATE JsonTupleMap(complex_data#'c'); -- THIS LINE CAUSE AN ERROR
attempt2 = FOREACH data GENERATE JsonTupleMap(CONCAT(CONCAT('{\\"key\\":', complex_data#'c'), '}'); -- IT ALSO DOSE NOT WORK
I guessed that "attempt1" was failed because the value doesn't contain full JSON. However, when I CONCAT like "attempt2", I generate additional \ mark with. (so each line starts with {\"key\": ) I'm not sure that this additional marks breaks the parsing rule or not. In any case, I want to parse the given JSON string so that Pig can understand. If you have any method or solution, please Feel free to let me know.
I finally solved my problem by using jyson library with jython UDF.
I know that I can solve it by using JAVA or other languages.
But, I think that jython with jyson is the most simplist answer to this issue.