I loaded a json file using JsonLoad() and it loads correctly. Now I want to store only few fields of the json object into a file using jsonStorage(). My Pig script is:
data_input = LOAD '$DATA_INPUT' USING JsonLoader(<<schema>>);
x = FOREACH data_input GENERATE (user__id_str);
STORE x INTO '$DATA_OUTPUT' USING JsonStorage();
Expected output:
{"user__id_str":12345}
{"user__id_str":12345}
{"user__id_str":123467}
Output I am getting:
{"user__id_str":null}
{"user__id_str":null}
{"user__id_str":null}
What is wrong?
EDIT: The schema is huge: consists of 306 fields:
user__contributors_enabled:chararray,retweeted_status__user__friends_count:int,quoted_status__extended_entities__media:chararray,retweeted_status__user__profile_background_image_url:chararray,quoted_status__user__is_translation_enabled:chararray,user__geo_enabled:chararray,avl_word_tags_all:chararray,quoted_status__user__profile_background_color:chararray,quoted_status__user__id_str:chararray,retweeted_status__place__bounding_box__coordinates:chararray,retweeted_status__quoted_status__metadata__result_type:chararray,retweeted_status__user__utc_offset:int,retweeted_status__user__contributors_enabled:chararray,retweeted_status__in_reply_to_screen_name:chararray,retweeted_status__place__place_type:chararray,retweeted_status__quoted_status__user__profile_background_image_url_https:chararray,user__utc_offset:int,quoted_status__favorited:chararray,user__entities__description__urls:chararray,place__url:chararray,quoted_status__user__profile_sidebar_border_color:chararray,favorited:chararray,retweeted_status__user__profile_banner_url:chararray,quoted_status__entities__user_mentions:chararray,retweet_count:int,retweeted_status__user__entities__description__urls:chararray,retweeted_status__quoted_status__user__is_translation_enabled:chararray,retweeted_status__entities__media:chararray,place__bounding_box__type:chararray,text_to_syntaxnet:chararray,quoted_status__user__chararrayed_count:int,avl_pos_tags:chararray,retweeted_status__user__statuses_count:int,quoted_status__metadata__iso_language_code:chararray,created_at:chararray,avl_lexicon_text:chararray,retweeted_status__lang:chararray,place__country:chararray,quoted_status__user__verified:chararray,retweeted_status__quoted_status__user__profile_background_tile:chararray,quoted_status__user__utc_offset:int,retweeted_status__quoted_status__user__location:chararray,quoted_status__created_at:chararray,retweeted_status__quoted_status__lang:chararray,place__place_type:chararray,user__profile_image_url:chararray,quoted_status__user__profile_use_background_image:chararray,user__name:chararray,user__notifications:chararray,user__id:int,in_reply_to_status_id:int,retweeted_status__metadata__iso_language_code:chararray,id:int,retweeted_status__user__follow_request_sent:chararray,retweeted_status__quoted_status__user__profile_use_background_image:chararray,retweeted_status__quoted_status__user__statuses_count:int,quoted_status__id_str:chararray,retweeted_status__user__profile_image_url:chararray,user__protected:chararray,user__profile_image_url_https:chararray,retweeted_status__source:chararray,quoted_status__source:chararray,retweeted_status__user__profile_link_color:chararray,retweeted_status__quoted_status__id_str:chararray,user__followers_count:int,retweeted_status__quoted_status__user__notifications:chararray,avl_num_sentences:int,retweeted_status__quoted_status__truncated:chararray,retweeted_status__text:chararray,quoted_status__favorite_count:int,quoted_status__metadata__result_type:chararray,truncated:chararray,metadata__iso_language_code:chararray,user__profile_banner_url:chararray,retweeted_status__quoted_status__user__profile_image_url_https:chararray,retweeted_status__quoted_status__user__utc_offset:int,quoted_status__user__profile_link_color:chararray,quoted_status__user__profile_image_url_https:chararray,retweeted_status__user__screen_name:chararray,retweeted_status__favorited:chararray,avl_lang:chararray,retweeted_status__user__location:chararray,retweeted_status__quoted_status__user__has_extended_profile:chararray,retweeted_status__quoted_status__user__verified:chararray,user__description:chararray,retweeted_status__user__profile_use_background_image:chararray,retweeted_status__quoted_status__user__contributors_enabled:chararray,quoted_status__is_quote_status:chararray,avl_sent:chararray,quoted_status__entities__media:chararray,quoted_status__possibly_sensitive:chararray,quoted_status__user__favourites_count:int,retweeted_status__quoted_status__user__default_profile_image:chararray,avl_num_words:int,quoted_status__user__friends_count:int,id_str:chararray,user__default_profile:chararray,user__profile_text_color:chararray,quoted_status__user__description:chararray,retweeted_status__user__favourites_count:int,retweeted_status__quoted_status__user__friends_count:int,quoted_status__user__name:chararray,retweeted_status__quoted_status__created_at:chararray,user__verified:chararray,quoted_status_id_str:chararray,user__profile_sidebar_border_color:chararray,retweeted_status__quoted_status__user__profile_text_color:chararray,retweeted_status__quoted_status__user__following:chararray,favorite_count:int,retweeted_status__quoted_status__entities__symbols:chararray,source:chararray,quoted_status_id:int,user__profile_use_background_image:chararray,retweeted_status__user__following:chararray,quoted_status__user__location:chararray,coordinates__type:chararray,retweeted_status__user__id:int,retweeted_status__quoted_status__text:chararray,quoted_status__entities__urls:chararray,retweeted_status__in_reply_to_status_id_str:chararray,text:chararray,retweeted_status__quoted_status__is_quote_status:chararray,quoted_status__id:int,user__entities__url__urls:chararray,quoted_status__user__contributors_enabled:chararray,retweeted_status__quoted_status__user__favourites_count:int,retweeted_status__quoted_status__id:int,retweeted_status__retweet_count:int,retweeted_status__favorite_count:int,metadata__result_type:chararray,retweeted_status__user__protected:chararray,retweeted_status__quoted_status__user__name:chararray,possibly_sensitive:chararray,retweeted_status__user__profile_sidebar_fill_color:chararray,retweeted_status__user__profile_image_url_https:chararray,retweeted_status__quoted_status_id:int,place__contained_within:chararray,retweeted_status__user__id_str:chararray,retweeted_status__user__entities__url__urls:chararray,retweeted_status__id_str:chararray,retweeted_status__quoted_status__entities__user_mentions:chararray,in_reply_to_status_id_str:chararray,retweeted_status__user__has_extended_profile:chararray,user__default_profile_image:chararray,user__is_translator:chararray,place__bounding_box__coordinates:chararray,retweeted_status__is_quote_status:chararray,quoted_status__user__entities__description__urls:chararray,entities__urls:chararray,retweeted_status__quoted_status__favorite_count:int,quoted_status__truncated:chararray,retweeted_status__user__default_profile_image:chararray,user__statuses_count:int,retweeted_status__quoted_status__user__entities__description__urls:chararray,retweeted_status__quoted_status__entities__hashtags:chararray,retweeted_status__quoted_status__user__description:chararray,retweeted_status__user__verified:chararray,retweeted_status__user__followers_count:int,avl_syn_1:chararray,quoted_status__user__default_profile:chararray,retweeted_status__place__bounding_box__type:chararray,retweeted_status__id:int,retweeted_status__user__lang:chararray,retweeted_status__quoted_status__user__default_profile:chararray,retweeted_status__quoted_status__user__profile_link_color:chararray,retweeted_status__in_reply_to_user_id:int,retweeted_status__user__is_translation_enabled:chararray,retweeted_status__user__chararrayed_count:int,quoted_status__user__default_profile_image:chararray,quoted_status__retweet_count:int,retweeted_status__user__profile_background_tile:chararray,quoted_status__user__id:int,retweeted_status__quoted_status__user__screen_name:chararray,retweeted_status__user__notifications:chararray,coordinates__coordinates:chararray,avl_brand_1:chararray,retweeted_status__quoted_status__metadata__iso_language_code:chararray,retweeted_status__quoted_status__retweeted:chararray,retweeted_status__quoted_status_id_str:chararray,retweeted_status__user__profile_text_color:chararray,quoted_status__retweeted:chararray,retweeted_status__user__is_translator:chararray,retweeted_status__user__default_profile:chararray,retweeted_status__extended_entities__media:chararray,avl_word_tags:chararray,quoted_status__user__follow_request_sent:chararray,retweeted_status__quoted_status__possibly_sensitive:chararray,user__screen_name:chararray,quoted_status__user__profile_banner_url:chararray,extended_entities__media:chararray,retweeted_status__quoted_status__retweet_count:int,quoted_status__user__profile_background_image_url:chararray,place__name:chararray,user__created_at:chararray,lang:chararray,in_reply_to_screen_name:chararray,retweeted_status__in_reply_to_status_id:int,quoted_status__user__profile_text_color:chararray,user__url:chararray,retweeted_status__user__profile_background_image_url_https:chararray,retweeted_status__truncated:chararray,entities__symbols:chararray,retweeted_status__quoted_status__user__profile_sidebar_border_color:chararray,quoted_status__entities__hashtags:chararray,retweeted_status__created_at:chararray,place__country_code:chararray,quoted_status__user__screen_name:chararray,avl_score:int,quoted_status__user__lang:chararray,avl_source:chararray,place__full_name:chararray,retweeted_status__place__url:chararray,retweeted_status__user__profile_background_color:chararray,quoted_status__user__following:chararray,quoted_status__user__profile_image_url:chararray,quoted_status__text:chararray,user__chararrayed_count:int,retweeted_status__quoted_status__user__protected:chararray,avl_words_not_in_lexicon:chararray,retweeted_status__quoted_status__user__id_str:chararray,quoted_status__user__followers_count:int,retweeted_status__quoted_status__extended_entities__media:chararray,retweeted_status__quoted_status__user__is_translator:chararray,user__time_zone:chararray,retweeted_status__metadata__result_type:chararray,in_reply_to_user_id_str:chararray,quoted_status__user__profile_background_image_url_https:chararray,avl_num_paragraphs:int,retweeted_status__quoted_status__user__profile_background_color:chararray,retweeted_status__quoted_status__user__followers_count:int,quoted_status__user__has_extended_profile:chararray,retweeted_status__user__profile_sidebar_border_color:chararray,avl_brand_all:chararray,retweeted_status__place__country_code:chararray,retweeted_status__user__description:chararray,quoted_status__user__profile_background_tile:chararray,retweeted_status__quoted_status__user__geo_enabled:chararray,quoted_status__user__created_at:chararray,entities__hashtags:chararray,retweeted_status__user__time_zone:chararray,quoted_status__user__geo_enabled:chararray,retweeted_status__possibly_sensitive:chararray,retweeted_status__user__name:chararray,retweeted:chararray,quoted_status__user__entities__url__urls:chararray,user__profile_background_tile:chararray,user__follow_request_sent:chararray,retweeted_status__quoted_status__entities__urls:chararray,quoted_status__user__statuses_count:int,retweeted_status__quoted_status__user__profile_background_image_url:chararray,user__is_translation_enabled:chararray,user__profile_background_image_url_https:chararray,user__friends_count:int,retweeted_status__quoted_status__user__id:int,geo__coordinates:chararray,user__following:chararray,user__favourites_count:int,retweeted_status__place__country:chararray,retweeted_status__quoted_status__user__chararrayed_count:int,user__profile_link_color:chararray,retweeted_status__place__full_name:chararray,quoted_status__user__protected:chararray,quoted_status__user__notifications:chararray,user__lang:chararray,retweeted_status__place__contained_within:chararray,retweeted_status__entities__hashtags:chararray,retweeted_status__entities__urls:chararray,user__profile_background_image_url:chararray,retweeted_status__quoted_status__favorited:chararray,retweeted_status__place__name:chararray,user__profile_background_color:chararray,geo__type:chararray,retweeted_status__entities__symbols:chararray,retweeted_status__place__id:chararray,quoted_status__lang:chararray,retweeted_status__retweeted:chararray,avl_sentences:chararray,avl_global_idx:int,retweeted_status__entities__user_mentions:chararray,retweeted_status__quoted_status__user__time_zone:chararray,user__id_str:chararray,quoted_status__user__profile_sidebar_fill_color:chararray,quoted_status__entities__symbols:chararray,retweeted_status__user__url:chararray,retweeted_status__quoted_status__user__profile_sidebar_fill_color:chararray,quoted_status__user__is_translator:chararray,retweeted_status__quoted_status__user__lang:chararray,user__profile_sidebar_fill_color:chararray,retweeted_status__quoted_status__source:chararray,entities__media:chararray,entities__user_mentions:chararray,retweeted_status__user__created_at:chararray,user__has_extended_profile:chararray,quoted_status__user__time_zone:chararray,is_quote_status:chararray,place__id:chararray,retweeted_status__quoted_status__user__created_at:chararray,user__location:chararray,retweeted_status__quoted_status__user__follow_request_sent:chararray,quoted_status__user__url:chararray,retweeted_status__user__geo_enabled:chararray,in_reply_to_user_id:int,retweeted_status__in_reply_to_user_id_str:chararray,retweeted_status__quoted_status__user__profile_banner_url:chararray,retweeted_status__quoted_status__entities__media:chararray,retweeted_status__quoted_status__user__profile_image_url:chararray
I found the answer: All the json objects in the input file donot have the same schema. So i guess it is not able to load according to the defined schema defined for the JsonLoader().
Used Elephant bird which made life easier.