hadoop - Unable to Read JSON file using Elephant Bird
I am trying to load a JSON file that contains null values using the Elephant Bird JsonLoader.
sample.json
{"created_at": "mon aug 22 10:48:23 +0000 2016","id": 767674772662607873,"id_str": "767674772662607873","text": "kpit image result https:\/\/t.co\/nas2znf1zz... https:\/\/t.co\/9tnelwtivm","source": "\u003ca href=\"http:\/\/twitter.com\" rel=\"nofollow\"\u003etwitter web client\u003c\/a\u003e","truncated": false,"in_reply_to_status_id": 123,"in_reply_to_status_id_str": null,"in_reply_to_user_id": null,"in_reply_to_user_id_str": null,"in_reply_to_screen_name": null,"geo": null,"coordinates": null,"place": null,"contributors": null,"is_quote_status": false,"retweet_count": 0,"favorite_count": 0,"entities": {"hashtags": [],"urls": [{"url": "https:\/\/t.co\/nas2znf1zz","expanded_url": "http:\/\/miltonious.com\/","display_url": "miltonious.com","indices": [24, 47]}],"user_mentions": [],"symbols": []},"favorited": false,"retweeted": false,"possibly_sensitive": false,"filter_level": "low","lang": "en","timestamp_ms": "1471862903167"}
script:
register piggybank.jar;
register json-simple-1.1.1.jar;
register elephant-bird-pig-4.3.jar;
register elephant-bird-core-4.1.jar;
register elephant-bird-hadoop-compat-4.3.jar;
json = load 'sample.json' using JsonLoader('created_at:chararray, id:chararray, id_str:chararray, text:chararray, source:chararray, in_reply_to_status_id:chararray, in_reply_to_status_id_str:chararray, in_reply_to_user_id:chararray, in_reply_to_user_id_str:chararray, in_reply_to_screen_name:chararray, geo:chararray, coordinates:chararray, place:chararray, contributors:chararray, is_quote_status:bytearray, retweet_count:long, favorite_count:chararray, entities:map[], favorited:bytearray, retweeted:bytearray, possibly_sensitive:bytearray, lang:chararray');
describe json;
dump json;
When I run dump json, I get the following output and warning:
(mon aug 22 10:48:23 +0000 2016,767674772662607873,767674772662607873,google image result twitter web client,false,1234,12345,3214,43215,,,,,,,,,,,,,,)
warn org.apache.pig.backend.hadoop.executionengine.mapreducelayer.pighadooplogger - org.apache.pig.builtin.jsonloader(udf_warning_1): bad record, returning null {complete json}
From the warning, I guess the problem is the null values. How can I load JSON that has null values in it?
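One way to check that guess outside Pig is a small Python sketch that parses the record and compares its keys with the field names in the loader schema. The record below is a shortened copy of the sample.json line (same keys, trimmed values), and the names RECORD, SCHEMA_FIELDS and inspect_record are made up for illustration. It only reports which keys are null and which keys the schema does not list; whether either of those is what makes the loader reject the record is an assumption that still has to be verified against the loader itself.

```python
import json

# Shortened copy of the sample.json record: same keys, trimmed values.
RECORD = '''{"created_at": "mon aug 22 10:48:23 +0000 2016", "id": 767674772662607873,
"id_str": "767674772662607873", "text": "kpit image result", "source": "twitter web client",
"truncated": false, "in_reply_to_status_id": 123, "in_reply_to_status_id_str": null,
"in_reply_to_user_id": null, "in_reply_to_user_id_str": null,
"in_reply_to_screen_name": null, "geo": null, "coordinates": null, "place": null,
"contributors": null, "is_quote_status": false, "retweet_count": 0, "favorite_count": 0,
"entities": {"hashtags": [], "urls": [], "user_mentions": [], "symbols": []},
"favorited": false, "retweeted": false, "possibly_sensitive": false,
"filter_level": "low", "lang": "en", "timestamp_ms": "1471862903167"}'''

# Field names from the loader schema in the script, in order (types dropped).
SCHEMA_FIELDS = [
    "created_at", "id", "id_str", "text", "source",
    "in_reply_to_status_id", "in_reply_to_status_id_str",
    "in_reply_to_user_id", "in_reply_to_user_id_str",
    "in_reply_to_screen_name", "geo", "coordinates", "place",
    "contributors", "is_quote_status", "retweet_count", "favorite_count",
    "entities", "favorited", "retweeted", "possibly_sensitive", "lang",
]


def inspect_record(line, schema_fields):
    """Return (null keys, keys not in the schema, schema fields not in the record)."""
    record = json.loads(line)
    null_keys = [key for key, value in record.items() if value is None]
    not_in_schema = [key for key in record if key not in schema_fields]
    not_in_record = [field for field in schema_fields if field not in record]
    return null_keys, not_in_schema, not_in_record


if __name__ == "__main__":
    null_keys, not_in_schema, not_in_record = inspect_record(RECORD, SCHEMA_FIELDS)
    print("null keys:", null_keys)
    print("in record but not in schema:", not_in_schema)
    print("in schema but not in record:", not_in_record)
```

For this record the check reports eight null keys and three keys that the schema does not mention (truncated, filter_level, timestamp_ms), so the nulls are not the only difference between the data and the declared schema.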
I have also tried it this way, with the fully qualified Elephant Bird loader:
json = load 'sample.json' using com.twitter.elephantbird.pig.load.JsonLoader('created_at:chararray, id:chararray, id_str:chararray, text:chararray, source:chararray, in_reply_to_status_id:chararray, in_reply_to_status_id_str:chararray, in_reply_to_user_id:chararray, in_reply_to_user_id_str:chararray, in_reply_to_screen_name:chararray, geo:chararray, coordinates:chararray, place:chararray, contributors:chararray, is_quote_status:bytearray, retweet_count:long, favorite_count:chararray, entities:map[], favorited:bytearray, retweeted:bytearray, possibly_sensitive:bytearray, lang:chararray');
describe json;
output
schema json unknown.
Please suggest a solution.
Thanks.