I have a Erlang list of tuples as follows:
[ {{"a"},[2],[{3,"b"},{4,"c"}],[5,"d"],[1,1],{e},["f"]} ,
{{"g"},[3],[{6,"h"},{7,"i"}],[{8,"j"}],[1,1,1],{k},["L"]} ]
I wanted this list of tuples in this form:
<<" [ {{"a"},[2],[{3,"b"},{4,"c"}],[5,"d"],[1,1],{e},["f"]} ,
{{"g"},[3],[{6,"h"},{7,"i"}],[{8,"j"}],[1,1,1],{k},["L"]}] ">>
So I tried using JSON parsing libraries in erlang (both jiffy and jsx )
Here is what I did:
A=[ {{"a"},[2],[{3,"b"},{4,"c"}],[5,"d"],[1,1],{e},["f"]} ,
{{"g"},[3],[{6,"h"},{7,"i"}],[{8,"j"}],[1,1,1],{k},["L"]} ],
B=erlang:iolist_to_binary(io_lib:write(A)),
jsx:encode(B).
and I get the following output(here I have changed the list to binary since jsx accepts binary):
<<"[{{[97]},[2],[{3,[98]},{4,[99]}],[5,[100]],[1,1],{e},[[102]]},{{[103]},
[3],[{6,[104]},{7,[105]}],[{8,[106]}],[1,1,1],{k},[[76]]}]">>
jiffy:encode(B) also gives the same output.
Can anyone help me to get the output as :
<<" [ {{"a"},[2],[{3,"b"},{4,"c"}],[5,"d"],[1,1],{e},["f"]} ,
{{"g"},[3],[{6,"h"},{7,"i"}],[{8,"j"}],[1,1,1],{k},["L"]}] ">>
instead of
<<"[{{[97]},[2],[{3,[98]},{4,[99]}],[5,[100]],[1,1],{e},[[102]]},{{[103]},
[3],[{6,[104]},{7,[105]}],[{8,[106]}],[1,1,1],{k},[[76]]}]">>
Thank you in advance
Instead of io_lib:write(A), use io_lib:format("~p", [A]). It tries to guess which lists are actually meant to be strings. (In Erlang, strings are actually lists of integers. Try it: "A" == [65])
> A=[ {{"a"},[2],[{3,"b"},{4,"c"}],[5,"d"],[1,1],{e},["f"]} ,
{{"g"},[3],[{6,"h"},{7,"i"}],[{8,"j"}],[1,1,1],{k},["L"]} ].
[{{"a"},[2],[{3,"b"},{4,"c"}],[5,"d"],[1,1],{e},["f"]},
{{"g"},[3],[{6,"h"},{7,"i"}],[{8,"j"}],[1,1,1],{k},["L"]}]
> B = erlang:iolist_to_binary(io_lib:format("~p", [A])).
<<"[{{\"a\"},[2],[{3,\"b\"},{4,\"c\"}],[5,\"d\"],[1,1],{e},[\"f\"]},\n {{\"g\"},[3],[{6,\"h\"},{7,\"i\"}],[{8,\"j\"}],[1,1,1],{k},[\"L\"]}]">>
If you don't want to see the backslashes before the double quotes, you can print the string to standard output:
> io:format("~s\n", [B]).
[{{"a"},[2],[{3,"b"},{4,"c"}],[5,"d"],[1,1],{e},["f"]},
{{"g"},[3],[{6,"h"},{7,"i"}],[{8,"j"}],[1,1,1],{k},["L"]}]
<<" [ {{"a"},[2],[{3,"b"},{4,"c"}],[5,"d"],[1,1],{e},["f"]} ,
{{"g"},[3],[{6,"h"},{7,"i"}],[{8,"j"}],[1,1,1],{k},["L"]}] ">>
This ^^ isn't a valid erlang term, but I think what you're getting at is that you want the "listy" strings, like "a" to be printed out like "a" instead of [97]. Unfortunately, I've found this to be a serious shortcoming of Erlang. The problem is that the string literal "a" is only syntactic sugar and is identical to the term [97], so any time you output it, you're subject to the vagaries of "is this thing a string or a list of integers?" The best way I know to get out of that is to use binaries as your strings wherever possible, like <<"a">> instead of "a".
I want to parse a string of complex JSON in Pig. Specifically, I want Pig to understand my JSON array as a bag instead of as a single chararray. I found that complex JSON can be parsed by using Twitter's Elephant Bird or Mozilla's Akela library. (I found some additional libraries, but I cannot use 'Loader' based approach since I use HCatalog Loader to load data from Hive.)
But, the problem is the structure of my data; each value of Map structure contains value part of complex JSON. For example,
1. My table looks like (WARNING: type of 'complex_data' is not STRING, a MAP of <STRING, STRING>!)
TABLE temp_table
(
user_id BIGINT COMMENT 'user ID.',
complex_data MAP <STRING, STRING> COMMENT 'complex json data'
)
COMMENT 'temp data.'
PARTITIONED BY(created_date STRING)
STORED AS RCFILE;
2. And 'complex_data' contains (a value that I want to get is marked with two *s, so basically #'d'#'f' from each PARSED_STRING(complex_data#'c') )
{ "a": "[]",
"b": "\"sdf\"",
"**c**":"[{\"**d**\":{\"e\":\"sdfsdf\"
,\"**f**\":\"sdfs\"
,\"g\":\"qweqweqwe\"},
\"c\":[{\"d\":21321,\"e\":\"ewrwer\"},
{\"d\":21321,\"e\":\"ewrwer\"},
{\"d\":21321,\"e\":\"ewrwer\"}]
},
{\"**d**\":{\"e\":\"sdfsdf\"
,\"**f**\":\"sdfs\"
,\"g\":\"qweqweqwe\"},
\"c\":[{\"d\":21321,\"e\":\"ewrwer\"},
{\"d\":21321,\"e\":\"ewrwer\"},
{\"d\":21321,\"e\":\"ewrwer\"}]
},]"
}
3. So, I tried... (same approach for Elephant Bird)
REGISTER '/path/to/akela-0.6-SNAPSHOT.jar';
DEFINE JsonTupleMap com.mozilla.pig.eval.json.JsonTupleMap();
data = LOAD temp_table USING org.apache.hive.hcatalog.pig.HCatLoader();
values_of_map = FOREACH data GENERATE complex_data#'c' AS attr:chararray; -- IT WORKS
-- dump values_of_map shows correct chararray data per each row
-- eg) ([{"d":{"e":"sdfsdf","f":"sdfs","g":"sdf"},... },
{"d":{"e":"sdfsdf","f":"sdfs","g":"sdf"},... },
{"d":{"e":"sdfsdf","f":"sdfs","g":"sdf"},... }])
([{"d":{"e":"sdfsdf","f":"sdfs","g":"sdf"},... },
{"d":{"e":"sdfsdf","f":"sdfs","g":"sdf"},... },
{"d":{"e":"sdfsdf","f":"sdfs","g":"sdf"},... }]) ...
attempt1 = FOREACH data GENERATE JsonTupleMap(complex_data#'c'); -- THIS LINE CAUSE AN ERROR
attempt2 = FOREACH data GENERATE JsonTupleMap(CONCAT(CONCAT('{\\"key\\":', complex_data#'c'), '}'); -- IT ALSO DOSE NOT WORK
I guessed that "attempt1" was failed because the value doesn't contain full JSON. However, when I CONCAT like "attempt2", I generate additional \ mark with. (so each line starts with {\"key\": ) I'm not sure that this additional marks breaks the parsing rule or not. In any case, I want to parse the given JSON string so that Pig can understand. If you have any method or solution, please Feel free to let me know.
I finally solved my problem by using jyson library with jython UDF.
I know that I can solve it by using JAVA or other languages.
But, I think that jython with jyson is the most simplist answer to this issue.
I've stared at this so long I'm going in circles...
I'm using the rbvmomi gem, and in Pry, when I display an object, it recurses down thru the structure showing me the nested objects - but to_json seems to "dig down" into some objects, but just dump the reference for others> Here's an example:
[24] pry(main)> g
=> [GuestNicInfo(
connected: true,
deviceConfigId: 4000,
dynamicProperty: [],
ipAddress: ["10.102.155.146"],
ipConfig: NetIpConfigInfo(
dynamicProperty: [],
ipAddress: [NetIpConfigInfoIpAddress(
dynamicProperty: [],
ipAddress: "10.102.155.146",
prefixLength: 20,
state: "preferred"
)]
),
macAddress: "00:50:56:a0:56:9d",
network: "F5_Real_VM_IPs"
)]
[25] pry(main)> g.to_json
=> "[\"#<RbVmomi::VIM::GuestNicInfo:0x000000085ecc68>\"]"
Pry apparently just uses a souped-up pp, and while "pp g" gives me close to what I want, I'm kinda steering as hard as I can toward json so that I don't need a custom parser to load up and manipulate the results.
The question is - how can I get the json module to dig down like pp does? And if the answer is "you can't" - any other suggestions for achieving the goal? I'm not married to json - if I can get the data serialized and read it back later (without writing something to parse pp output... which may already exist and I should look for it), then it's all win.
My "real" goal here is to slurp up a bunch of info from our vsphere stuff via rbvmomi so that I can do some network/vm analysis on it, which is why I'd like to get it in a nice machine-parsed format. If I'm doing something stupid here and there's an easier way to go about this - lay it on me, I'm not proud. Thank you all for your time and attention.
Update: Based on Arnie's response, I added this monkeypatch to my script:
class RbVmomi::BasicTypes::DataObject
def to_json(*args)
h = self.props
m = h.merge({ JSON.create_id => self.class.name })
m.to_json(*args)
end
end
and now my to_json recurses down nicely. I'll see about submitting this (or the def, really) to the project.
The .to_json works in a recursive manner, the default behavior is defined as:
Converts this object to a string (calling to_s), converts it to a JSON string, and returns the result. This is a fallback, if no special method to_json was defined for some object.
json library has added some implementation for some common classes (check the left hand side of this documentation), such as Array, Range, DateTime.
For an array, to_json first convert all the elements to json object, concat then together, and then add the array mark [/].
For your case, you need to define your customized to_json method for GuestNicInfo, NetIpConfigInfo and NetIpConfigInfoIpAddress. I don't know your implementation about these three classes, so I wrote a example to demonstrate how to achieve this:
require 'json'
class MyClass
attr_accessor :a, :b
def initialize(a, b)
#a = a
#b = b
end
end
data = [MyClass.new(1, "foobar")]
puts data.to_json
#=> ["#<MyClass:0x007fb6626c7260>"]
class MyClass
def to_json(*args)
{
JSON.create_id => self.class.name,
:a => a,
:b => b
}.to_json(*args)
end
end
puts data.to_json
#=> [{"json_class":"MyClass","a":1,"b":"foobar"}]
I have a list of JSON objects (received from a nosql db) and want to remove or rename some keys. And then I want to return data as a list of JSON objects once again.
This Stackoverflow post provides a good sense of how to use mochijson2. And I'm thinking I could use a list comprehension to go through the list of JSON objects.
The part I am stuck with is how to remove the key for each JSON object (or proplist if mochijson2 is used) within a list comprehension. I can use the delete function of proplists. But I am unsuccessful when trying to do that within a list comprehension.
Here is a bit code for context :
A = <<"[{\"id\": \"0129\", \"name\": \"joe\", \"photo\": \"joe.jpg\" }, {\"id\": \"0759\", \"name\": \"jane\", \"photo\": \"jane.jpg\" }, {\"id\": \"0929\", \"name\": \"john\", \"photo\": \"john.jpg\" }]">>.
Struct = mochijson2:decode(A).
{struct, JsonData} = Struct,
{struct, Id} = proplists:get_value(<<"id">>, JsonData),
Any suggestions illustrated with code much appreciated.
You can use lists:keydelete(Key, N, TupleList) to return a new tuple list with certain tuples removed. So in the list comprehension, for each entry extract the tuple lists (or proplists), and create a new struct with the key removed:
B = [{struct, lists:keydelete(<<"name">>, 1, Props)} || {struct, Props} <- Struct].
gives:
[{struct,[{<<"id">>,<<"0129">>},
{<<"photo">>,<<"joe.jpg">>}]},
{struct,[{<<"id">>,<<"0759">>},
{<<"photo">>,<<"jane.jpg">>}]},
{struct,[{<<"id">>,<<"0929">>},
{<<"photo">>,<<"john.jpg">>}]}]
and
iolist_to_binary(mochijson2:encode(B)).
gives:
<<"[{\"id\":\"0129\",\"photo\":\"joe.jpg\"},{\"id\":\"0759\",\"photo\":\"jane.jpg\"},{\"id\":\"0929\",\"photo\":\"john.jpg\"}]">>
By the way, using the lists/* tuple lists functions are much faster, but sometimes slightly less convenient than the proplists/* functions: http://sergioveiga.com/index.php/2010/05/14/erlang-listskeyfind-vs-proplistsget_value/