Starting graphhopper in flexible mode - graphhopper

Although I set "prepare.chWeightings=no" in my properties file, the startup log still seems to prepare CH weightings!
My config-example.properties file:
##### Vehicles #####
# Possible options: car,foot,bike,bike2,mtb,racingbike,motorcycle (comma separated)
# bike2 takes elevation data into account (like up-hill is slower than down-hill) and requires enabling graph.elevation.provider below
graph.flagEncoders=car
# Enable turn restrictions for car or motorcycle.
# Currently you need to additionally set prepare.chWeightings=no before using this (see below and #270)
# graph.flagEncoders=car|turnCosts=true
##### Elevation #####
# To populate your graph with elevation data use SRTM, default is noop (no elevation)
# graph.elevation.provider=srtm
# default location for cache is /tmp/srtm
# graph.elevation.cachedir=./srtmprovider/
# If you have a slow disk or plenty of RAM change the default MMAP to:
# graph.elevation.dataaccess=RAM_STORE
#### Speed-up mode vs. flexibility mode ####
# By default the speed-up mode with the 'fastest' weighting is used. Internally a graph preparation via
# contraction hierarchies (CH) is done to speed routing up. This requires more RAM/disc space for holding the
# graph but less for every request. You can also set up multiple weightings by providing a comma-separated list.
# prepare.chWeightings=fastest
# Disable the speed-up mode. Should be used only together with routing.maxVisitedNodes
prepare.chWeightings=no
# To make preparation faster for multiple flagEncoders you can increase the default threads if you have enough RAM.
# Change this setting only if you know what you are doing, the default worked for you, and you are sure you have enough RAM!
# prepare.threads=1
##### Routing #####
# You can define the maximum visited nodes when routing. This may result in routes not being found if there is no
# connection between two points within the given visited-node limit. The default is Integer.MAX_VALUE. Useful for flexibility mode
routing.maxVisitedNodes = 1000000
# If enabled, allows a user to run flexibility requests even if speed-up mode is enabled. Every request then has to include the hint routing.flexibleMode.force=true.
# Attention: non-CH route calculations take far more time and resources than CH routing.
# A possible attacker might exploit this to slow down your service. Only enable it if you need it, and combine it with routing.maxVisitedNodes
routing.flexibleMode.allowed=true
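As a side note, the comments above say that with routing.flexibleMode.allowed=true every flexible (non-CH) request has to carry the routing.flexibleMode.force=true hint. A sketch of such a request, with made-up coordinates and assuming the server listens on port 8888 as in the startup log:

```shell
# Hypothetical request against a locally running GraphHopper instance.
HOST="http://localhost:8888"
# The routing.flexibleMode.force=true hint asks for a non-CH (flexible) calculation.
URL="$HOST/route?point=52.52,13.405&point=48.1351,11.582&vehicle=car&routing.flexibleMode.force=true"
echo "$URL"
# curl "$URL"   # run this against a live server
```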
##### Web #####
# if you want to support the jsonp response type you need to enable it explicitly here. By default it is disabled for stronger security.
web.jsonpAllowed=true
##### Storage #####
#
# configure the memory access, use RAM_STORE for well equipped servers (default and recommended) or MMAP_STORE_SYNC
graph.dataaccess=RAM_STORE
# if you don't need turn instructions, you can reduce storage size by not storing way names:
# osmreader.instructions=false
# will write way names in the preferred language (language code as defined in ISO 639-1 or ISO 639-2):
# osmreader.preferred-language=en
My Console:
java -jar *.jar jetty.resourcebase=webapp config=config-example.properties osmreader.osm=germany-latest.osm.pbf
[main] INFO com.graphhopper.GraphHopper - version 0.7|2016-04-27T10:00:38Z (4,13,3,2,2,1)
[main] INFO com.graphhopper.GraphHopper - graph CH|car|RAM_STORE|2D|NoExt|,,,,, details:edges:0(0MB), nodes:0(0MB), name:(0MB), geo:0(0MB), bounds:1.7976931348623157E308,-1.7976931348623157E308,1.7976931348623157E308,-1.7976931348623157E308, CHGraph|fastest|car, shortcuts:0, nodesCH:(0MB)
[main] INFO com.graphhopper.GraphHopper - start creating graph from germany-latest.osm.pbf
[main] INFO com.graphhopper.GraphHopper - using CH|car|RAM_STORE|2D|NoExt|,,,,, memory:totalMB:964, usedMB:25
[main] INFO com.graphhopper.reader.OSMReader - 5 000 000 (preprocess), osmIdMap:32 042 694 (381MB) totalMB:6281, usedMB:3934
[main] INFO com.graphhopper.reader.OSMReader - 50 000 (preprocess), osmWayMap:0 totalMB:6465, usedMB:3537
[main] INFO com.graphhopper.reader.OSMReader - 100 000 (preprocess), osmWayMap:0 totalMB:6465, usedMB:4153
[main] INFO com.graphhopper.reader.OSMReader - 150 000 (preprocess), osmWayMap:0 totalMB:6465, usedMB:4837
[main] INFO com.graphhopper.reader.OSMReader - 200 000 (preprocess), osmWayMap:0 totalMB:6465, usedMB:5420
[main] INFO com.graphhopper.reader.OSMReader - 250 000 (preprocess), osmWayMap:0 totalMB:6457, usedMB:1120
[main] INFO com.graphhopper.reader.OSMReader - 300 000 (preprocess), osmWayMap:0 totalMB:6457, usedMB:1677
[main] INFO com.graphhopper.reader.OSMReader - 350 000 (preprocess), osmWayMap:0 totalMB:6457, usedMB:2212
[main] INFO com.graphhopper.reader.OSMReader - 400 000 (preprocess), osmWayMap:0 totalMB:6457, usedMB:2896
[main] INFO com.graphhopper.reader.OSMReader - 450 000 (preprocess), osmWayMap:0 totalMB:6457, usedMB:3479
[main] INFO com.graphhopper.reader.OSMReader - 500 000 (preprocess), osmWayMap:0 totalMB:6457, usedMB:4014
[main] INFO com.graphhopper.reader.OSMReader - creating graph. Found nodes (pillar+tower):36 542 424, totalMB:6457, usedMB:4111
[main] INFO com.graphhopper.reader.OSMReader - 100 000 000, locs:24 711 933 (0) totalMB:6415, usedMB:1203
[main] INFO com.graphhopper.reader.OSMReader - 200 000 000, locs:33 458 643 (0) totalMB:6282, usedMB:2842
[main] INFO com.graphhopper.reader.OSMReader - 241 756 168, now parsing ways
[main] WARN com.graphhopper.routing.util.AbstractFlagEncoder - Unrealistic long duration ignored in way with OSMID=409892450 : Duration tag value=13:15 (=795 minutes)
[main] INFO com.graphhopper.reader.OSMReader - 280 449 722, now parsing relations
[main] INFO com.graphhopper.reader.OSMReader - finished way processing. nodes: 8773936, osmIdMap.size:36674781, osmIdMap:468MB, nodeFlagsMap.size:132357, relFlagsMap.size:0, zeroCounter:131245 totalMB:7276, usedMB:6425
[main] INFO com.graphhopper.reader.OSMReader - time(pass1): 104 pass2: 132 total:236
[main] INFO com.graphhopper.GraphHopper - start finding subnetworks, totalMB:7276, usedMB:6427
[main] INFO com.graphhopper.routing.util.PrepareRoutingSubnetworks - 165929 subnetworks found for car, totalMB:7276, usedMB:6743
[main] INFO com.graphhopper.routing.util.PrepareRoutingSubnetworks - optimize to remove subnetworks (165929), unvisited-dead-end-nodes (0), maxEdges/node (13)
[main] INFO com.graphhopper.GraphHopper - edges: 10586032, nodes 8344328, there were 165929 subnetworks. removed them => 429608 less nodes
[main] INFO com.graphhopper.storage.index.LocationIndexTree - location index created in 7.324438s, size:10 241 180, leafs:2 299 038, precision:300, depth:5, checksum:8344328, entries:[64, 64, 64, 16, 4], entriesPerLeaf:4.4545503
[main] INFO com.graphhopper.routing.ch.CHAlgoFactoryDecorator - 1/1 calling prepare.doWork for fastest|car ... (totalMB:7271, usedMB:3588)
[fastest_car] INFO com.graphhopper.routing.ch.PrepareContractionHierarchies - 0, updates:0, nodes: 8 344 328, shortcuts:0, dijkstras:33 871 012, t(dijk):5.4, t(period):0.0, t(lazy):0.0, t(neighbor):0.0, meanDegree:1, algo:127MB, totalMB:7271, usedMB:4275
[fastest_car] INFO com.graphhopper.routing.ch.PrepareContractionHierarchies - 1 668 860, updates:0, nodes: 6 675 468, shortcuts:675, dijkstras:34 551 628, t(dijk):6.55, t(period):0.0, t(lazy):0.0, t(neighbor):1.73, meanDegree:0, algo:127MB, totalMB:7271, usedMB:4301
[fastest_car] INFO com.graphhopper.routing.ch.PrepareContractionHierarchies - 3 337 720, updates:1, nodes: 5 006 608, shortcuts:1 043 389, dijkstras:65 810 344, t(dijk):22.94, t(period):11.56, t(lazy):0.0, t(neighbor):8.92, meanDegree:1, algo:127MB, totalMB:7271, usedMB:5068
[fastest_car] INFO com.graphhopper.routing.ch.PrepareContractionHierarchies - 5 006 580, updates:2, nodes: 3 337 748, shortcuts:2 164 049, dijkstras:94 377 631, t(dijk):99.06, t(period):72.35, t(lazy):0.0, t(neighbor):19.56, meanDegree:1, algo:127MB, totalMB:7271, usedMB:5784
[fastest_car] INFO com.graphhopper.routing.ch.PrepareContractionHierarchies - 6 675 440, updates:3, nodes: 1 668 888, shortcuts:3 836 152, dijkstras:117 298 010, t(dijk):182.91, t(period):123.25, t(lazy):0.0, t(neighbor):39.4, meanDegree:2, algo:127MB, totalMB:7271, usedMB:6553
[fastest_car] INFO com.graphhopper.routing.ch.PrepareContractionHierarchies - 8 344 300, updates:4, nodes: 254 254, shortcuts:5 744 154, dijkstras:146 679 623, t(dijk):333.88, t(period):171.72, t(lazy):35.62, t(neighbor):79.67, meanDegree:3, algo:127MB, totalMB:7457, usedMB:1791
[fastest_car] INFO com.graphhopper.routing.ch.PrepareContractionHierarchies - took:515, new shortcuts: 6 499 068, prepare|fastest, car, dijkstras:167252328, t(dijk):451.61, t(period):188.89, t(lazy):70.18, t(neighbor):134.1, meanDegree:1, initSize:8344328, periodic:20, lazy:10, neighbor:20, totalMB:7457, usedMB:2019
[main] INFO com.graphhopper.GraphHopper - flushing graph CH|car|RAM_STORE|2D|NoExt|4,13,3,2,2, details:edges:10 586 032(324MB), nodes:8 344 328(96MB), name:(35MB), geo:52 329 514(200MB), bounds:5.863066148677457,25.196558055204704,47.27804356689848,60.22003669783555, CHGraph|fastest|car, shortcuts:6 499 068, nodesCH:(64MB), totalMB:7457, usedMB:2036)
[main] INFO com.graphhopper.http.DefaultModule - loaded graph at:germany-latest.osm-gh, source:germany-latest.osm.pbf, flagEncoders:car, class:edges:10 586 032(324MB), nodes:8 344 328(96MB), name:(35MB), geo:52 329 514(200MB), bounds:5.863066148677457,25.196558055204704,47.27804356689848,60.22003669783555, CHGraph|fastest|car, shortcuts:6 499 068, nodesCH:(64MB)
[main] INFO com.graphhopper.http.GHServer - Started server at HTTP : 8888
So, is my config file correct for flexible mode?
And is it expected that "PrepareContractionHierarchies" still appears in the GraphHopper logs when I want to start it in flexible mode?
Thanks for any answers.

Related

Caused by: org.apache.ignite.IgniteCheckedException: Failed to validate cache configuration. Cache store factory is not serializable. Cache name:

I am trying to set up an Apache Ignite cache store using MySQL as external storage.
I have read all the official documentation about it and examined many other examples, but I can't make it run:
[2022-06-02 16:45:56:551] [INFO] - 55333 - org.apache.ignite.logger.java.JavaLogger.info(JavaLogger.java:285) - Configured failure handler: [hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0, super=AbstractFailureHandler [ignoredFailureTypes=UnmodifiableSet [SYSTEM_WORKER_BLOCKED, SYSTEM_CRITICAL_OPERATION_TIMEOUT]]]]
[2022-06-02 16:45:56:874] [INFO] - 55333 - org.apache.ignite.logger.java.JavaLogger.info(JavaLogger.java:285) - Successfully bound communication NIO server to TCP port [port=47100, locHost=0.0.0.0/0.0.0.0, selectorsCnt=4, selectorSpins=0, pairedConn=false]
[2022-06-02 16:45:56:874] [WARN] - 55333 - org.apache.ignite.logger.java.JavaLogger.warning(JavaLogger.java:295) - Message queue limit is set to 0 which may lead to potential OOMEs when running cache operations in FULL_ASYNC or PRIMARY_SYNC modes due to message queues growth on sender and receiver sides.
[16:45:56] Message queue limit is set to 0 which may lead to potential OOMEs when running cache operations in FULL_ASYNC or PRIMARY_SYNC modes due to message queues growth on sender and receiver sides.
[2022-06-02 16:45:56:898] [WARN] - 55333 - org.apache.ignite.logger.java.JavaLogger.warning(JavaLogger.java:295) - Checkpoints are disabled (to enable configure any GridCheckpointSpi implementation)
[2022-06-02 16:45:56:926] [WARN] - 55333 - org.apache.ignite.logger.java.JavaLogger.warning(JavaLogger.java:295) - Collision resolution is disabled (all jobs will be activated upon arrival).
[16:45:56] Security status [authentication=off, sandbox=off, tls/ssl=off]
[2022-06-02 16:45:56:927] [INFO] - 55333 - org.apache.ignite.logger.java.JavaLogger.info(JavaLogger.java:285) - Security status [authentication=off, sandbox=off, tls/ssl=off]
[2022-06-02 16:45:57:204] [INFO] - 55333 - org.apache.ignite.logger.java.JavaLogger.info(JavaLogger.java:285) - Successfully bound to TCP port [port=47500, localHost=0.0.0.0/0.0.0.0, locNodeId=b397c114-d34d-4245-9645-f78c5d184888]
[2022-06-02 16:45:57:242] [WARN] - 55333 - org.apache.ignite.logger.java.JavaLogger.warning(JavaLogger.java:295) - DataRegionConfiguration.maxWalArchiveSize instead DataRegionConfiguration.walHistorySize would be used for removing old archive wal files
[2022-06-02 16:45:57:253] [INFO] - 55333 - org.apache.ignite.logger.java.JavaLogger.info(JavaLogger.java:285) - Configured data regions initialized successfully [total=4]
[2022-06-02 16:45:57:307] [ERROR] - 55333 - org.apache.ignite.logger.java.JavaLogger.error(JavaLogger.java:310) - Exception during start processors, node will be stopped and close connections
org.apache.ignite.IgniteCheckedException: Failed to start processor: GridProcessorAdapter []
at org.apache.ignite.internal.IgniteKernal.startProcessor(IgniteKernal.java:1989) ~[ignite-core-2.10.0.jar:2.10.0]
Caused by: org.apache.ignite.IgniteCheckedException: Failed to validate cache configuration. Cache store factory is not serializable. Cache name: StockConfigCache
Caused by: org.apache.ignite.IgniteCheckedException: Failed to serialize object: CacheJdbcPojoStoreFactory [batchSize=512, dataSrcBean=null, dialect=org.apache.ignite.cache.store.jdbc.dialect.MySQLDialect@14993306, maxPoolSize=8, maxWrtAttempts=2, parallelLoadCacheMinThreshold=512, hasher=org.apache.ignite.cache.store.jdbc.JdbcTypeDefaultHasher@73ae82da, transformer=org.apache.ignite.cache.store.jdbc.JdbcTypesDefaultTransformer@6866e740, dataSrc=null, dataSrcFactory=com.anyex.ex.memory.model.CacheConfig$$Lambda$310/1421763091@31183ee2, sqlEscapeAll=false]
Caused by: java.io.NotSerializableException: com.anyex.ex.database.DynamicDataSource
Any advice or idea would be appreciated, thank you!
public static CacheConfiguration cacheStockConfigCache(DataSource dataSource, Boolean writeBehind)
{
    CacheConfiguration ccfg = new CacheConfiguration();
    ccfg.setSqlSchema("public");
    ccfg.setName("StockConfigCache");
    ccfg.setCacheMode(CacheMode.REPLICATED);
    ccfg.setAtomicityMode(CacheAtomicityMode.TRANSACTIONAL);
    ccfg.setIndexedTypes(Long.class, StockConfigMem.class);
    CacheJdbcPojoStoreFactory cacheStoreFactory = new CacheJdbcPojoStoreFactory();
    cacheStoreFactory.setDataSourceFactory((Factory<DataSource>) () -> dataSource);
    //cacheStoreFactory.setDialect(new OracleDialect());
    cacheStoreFactory.setDialect(new MySQLDialect());
    cacheStoreFactory.setTypes(JdbcTypes.jdbcTypeStockConfigMem(ccfg.getName(), "StockConfig"));
    ccfg.setCacheStoreFactory(cacheStoreFactory);
    ccfg.setReadFromBackup(false);
    ccfg.setCopyOnRead(true);
    if (writeBehind) {
        ccfg.setWriteThrough(true);
        ccfg.setWriteBehindEnabled(true);
    }
    return ccfg;
}

public static JdbcType jdbcTypeStockConfigMem(String cacheName, String tableName)
{
    JdbcType type = new JdbcType();
    type.setCacheName(cacheName);
    type.setKeyType(Long.class);
    type.setValueType(StockConfigMem.class);
    type.setDatabaseTable(tableName);
    type.setKeyFields(new JdbcTypeField(Types.NUMERIC, "id", Long.class, "id"));
    type.setValueFields(
        new JdbcTypeField(Types.NUMERIC, "id", Long.class, "id"),
        new JdbcTypeField(Types.NUMERIC, "stockinfoId", Long.class, "stockinfoId"),
        new JdbcTypeField(Types.VARCHAR, "remark", String.class, "remark"),
        new JdbcTypeField(Types.TIMESTAMP, "updateTime", Timestamp.class, "updateTime")
    );
    return type;
}

igniteConfiguration.setCacheConfiguration(
    CacheConfig.cacheStockConfigCache(dataSource, igniteProperties.getJdbc().getWriteBehind())
);

@Bean("igniteInstance")
@ConditionalOnProperty(value = "ignite.enable", havingValue = "true", matchIfMissing = true)
public Ignite ignite(IgniteConfiguration igniteConfiguration)
{
    log.info("igniteConfiguration info:{}", igniteConfiguration.toString());
    Ignite ignite = Ignition.start(igniteConfiguration);
    log.info("{} ignite started with discovery type {}", ignite.name(), igniteProperties.getType());
    return ignite;
}
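One observation from the stack trace: the NotSerializableException names com.anyex.ex.database.DynamicDataSource, and the failing object is the lambda `(Factory<DataSource>) () -> dataSource`, which captures that data source. A serializable lambda drags its captured variables into the serialized form. Below is a minimal, Ignite-free sketch of this Java serialization behavior; the class names are stand-ins, not the real ones:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Stand-in for a non-serializable data source (like DynamicDataSource in the trace).
class NonSerializableSource { }

// Analogous to javax.cache.configuration.Factory, which extends Serializable.
interface SourceFactory extends Serializable {
    NonSerializableSource create();
}

public class LambdaCaptureDemo {
    // Returns true if the object survives plain Java serialization.
    static boolean canSerialize(Object o) {
        try (ObjectOutputStream oos = new ObjectOutputStream(new ByteArrayOutputStream())) {
            oos.writeObject(o);
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        NonSerializableSource source = new NonSerializableSource();

        // Like `() -> dataSource` above: the lambda captures the instance,
        // so serializing the factory tries to serialize the source too and fails.
        SourceFactory capturing = () -> source;
        System.out.println("capturing lambda serializable: " + canSerialize(capturing));

        // A factory that constructs (or looks up) the source inside create()
        // captures nothing and serializes cleanly.
        SourceFactory nonCapturing = NonSerializableSource::new;
        System.out.println("non-capturing factory serializable: " + canSerialize(nonCapturing));
    }
}
```

This suggests the direction of a fix: build or look up the DataSource inside the factory (or reference it by bean name) rather than closing over a live non-serializable instance, though the exact wiring depends on your Spring setup.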

ERROR 1066: Unable to open iterator for alias- PIG SCRIPT

I have been facing this issue for a long time. I have tried to solve it but couldn't, and I need some expert advice.
I am trying to load a sample tweets JSON file.
sample.json:-
{"filter_level":"low","retweeted":false,"in_reply_to_screen_name":"FilmFan","truncated":false,"lang":"en","in_reply_to_status_id_str":null,"id":689085590822891521,"in_reply_to_user_id_str":"6048122","timestamp_ms":"1453125782100","in_reply_to_status_id":null,"created_at":"Mon Jan 18 14:03:02 +0000 2016","favorite_count":0,"place":null,"coordinates":null,"text":"#filmfan hey its time for you guys follow #acadgild To #AchieveMore and participate in contest Win Rs.500 worth vouchers","contributors":null,"geo":null,"entities":{"symbols":[],"urls":[],"hashtags":[{"text":"AchieveMore","indices":[56,68]}],"user_mentions":[{"id":6048122,"name":"Tanya","indices":[0,8],"screen_name":"FilmFan","id_str":"6048122"},{"id":2649945906,"name":"ACADGILD","indices":[42,51],"screen_name":"acadgild","id_str":"2649945906"}]},"is_quote_status":false,"source":"<a href=\"https://about.twitter.com/products/tweetdeck\" rel=\"nofollow\">TweetDeck<\/a>","favorited":false,"in_reply_to_user_id":6048122,"retweet_count":0,"id_str":"689085590822891521","user":{"location":"India ","default_profile":false,"profile_background_tile":false,"statuses_count":86548,"lang":"en","profile_link_color":"94D487","profile_banner_url":"https://pbs.twimg.com/profile_banners/197865769/1436198000","id":197865769,"following":null,"protected":false,"favourites_count":1002,"profile_text_color":"000000","verified":false,"description":"Proud Indian, Digital Marketing Consultant,Traveler, Foodie, Adventurer, Data Architect, Movie Lover, Namo Fan","contributors_enabled":false,"profile_sidebar_border_color":"000000","name":"Bahubali","profile_background_color":"000000","created_at":"Sat Oct 02 17:41:02 +0000 
2010","default_profile_image":false,"followers_count":4467,"profile_image_url_https":"https://pbs.twimg.com/profile_images/664486535040000000/GOjDUiuK_normal.jpg","geo_enabled":true,"profile_background_image_url":"http://abs.twimg.com/images/themes/theme1/bg.png","profile_background_image_url_https":"https://abs.twimg.com/images/themes/theme1/bg.png","follow_request_sent":null,"url":null,"utc_offset":19800,"time_zone":"Chennai","notifications":null,"profile_use_background_image":false,"friends_count":810,"profile_sidebar_fill_color":"000000","screen_name":"Ashok_Uppuluri","id_str":"197865769","profile_image_url":"http://pbs.twimg.com/profile_images/664486535040000000/GOjDUiuK_normal.jpg","listed_count":50,"is_translator":false}}
I have tried to load this JSON file using Elephant Bird.
script:-
REGISTER json-simple-1.1.1.jar
REGISTER elephant-bird-2.2.3.jar
REGISTER guava-11.0.2.jar
REGISTER avro-1.7.7.jar
REGISTER piggybank-0.12.0.jar
twitter = LOAD 'sample.json' USING com.twitter.elephantbird.pig.load.JsonLoader();
B = foreach twitter generate (chararray)$0#'created_at' as created_at,(chararray)$0#'id' as id,(chararray)$0#'id_str' as id_str,(chararray)$0#'text' as text,(chararray)$0#'source' as source,com.twitter.elephantbird.pig.piggybank.JsonStringToMap($0#'entities') as entities,(boolean)$0#'favorited' as favorited;
describe B;
OUTPUT:-
B: {created_at: chararray,id: chararray,id_str: chararray,text: chararray,source: chararray,entitis: map[chararray],favorited: boolean}
But when I try to DUMP B, the following error occurs:
ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1066: Unable to open iterator for alias B
I am providing the complete logs here.
2016-09-11 14:07:57,184 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size before optimization: 1
2016-09-11 14:07:57,184 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size after optimization: 1
2016-09-11 14:07:57,194 [main] INFO org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize JVM Metrics with processName=JobTracker, sessionId= - already initialized
2016-09-11 14:07:57,194 [main] INFO org.apache.pig.tools.pigstats.mapreduce.MRScriptState - Pig script settings are added to the job
2016-09-11 14:07:57,194 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
2016-09-11 14:07:57,199 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting up single store job
2016-09-11 14:07:57,199 [main] INFO org.apache.pig.data.SchemaTupleFrontend - Key [pig.schematuple] is false, will not generate code.
2016-09-11 14:07:57,199 [main] INFO org.apache.pig.data.SchemaTupleFrontend - Starting process to move generated code to distributed cache
2016-09-11 14:07:57,199 [main] INFO org.apache.pig.data.SchemaTupleFrontend - Distributed cache not supported or needed in local mode. Setting key [pig.schematuple.local.dir] with code temp directory: /tmp/1473583077199-0
2016-09-11 14:07:57,206 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 1 map-reduce job(s) waiting for submission.
2016-09-11 14:07:57,207 [JobControl] INFO org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize JVM Metrics with processName=JobTracker, sessionId= - already initialized
2016-09-11 14:07:57,208 [JobControl] WARN org.apache.hadoop.mapreduce.JobResourceUploader - No job jar file set. User classes may not be found. See Job or Job#setJar(String).
2016-09-11 14:07:57,211 [JobControl] INFO org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to process : 1
2016-09-11 14:07:57,211 [JobControl] INFO org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths (combined) to process : 1
2016-09-11 14:07:57,212 [JobControl] INFO org.apache.hadoop.mapreduce.JobSubmitter - number of splits:1
2016-09-11 14:07:57,216 [JobControl] INFO org.apache.hadoop.mapreduce.JobSubmitter - Submitting tokens for job: job_local360376249_0009
2016-09-11 14:07:57,267 [JobControl] INFO org.apache.hadoop.mapreduce.Job - The url to track the job: http://localhost:8080/
2016-09-11 14:07:57,267 [Thread-214] INFO org.apache.hadoop.mapred.LocalJobRunner - OutputCommitter set in config null
2016-09-11 14:07:57,270 [Thread-214] INFO org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter - File Output Committer Algorithm version is 1
2016-09-11 14:07:57,270 [Thread-214] INFO org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter - FileOutputCommitter skip cleanup _temporary folders under output directory:false, ignore cleanup failures: false
2016-09-11 14:07:57,270 [Thread-214] INFO org.apache.hadoop.mapred.LocalJobRunner - OutputCommitter is org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputCommitter
2016-09-11 14:07:57,271 [Thread-214] INFO org.apache.hadoop.mapred.LocalJobRunner - Waiting for map tasks
2016-09-11 14:07:57,272 [LocalJobRunner Map Task Executor #0] INFO org.apache.hadoop.mapred.LocalJobRunner - Starting task: attempt_local360376249_0009_m_000000_0
2016-09-11 14:07:57,277 [LocalJobRunner Map Task Executor #0] INFO org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter - File Output Committer Algorithm version is 1
2016-09-11 14:07:57,277 [LocalJobRunner Map Task Executor #0] INFO org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter - FileOutputCommitter skip cleanup _temporary folders under output directory:false, ignore cleanup failures: false
2016-09-11 14:07:57,277 [LocalJobRunner Map Task Executor #0] INFO org.apache.hadoop.mapred.Task - Using ResourceCalculatorProcessTree : [ ]
2016-09-11 14:07:57,278 [LocalJobRunner Map Task Executor #0] INFO org.apache.hadoop.mapred.MapTask - Processing split: Number of splits :1 Total Length = 2416 Input split[0]: Length = 2416 ClassName: org.apache.hadoop.mapreduce.lib.input.FileSplit Locations:
-----------------------
2016-09-11 14:07:57,282 [LocalJobRunner Map Task Executor #0] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigRecordReader - Current split being processed file:/root/PIG/PIG/sample.json:0+2416
2016-09-11 14:07:57,282 [LocalJobRunner Map Task Executor #0] INFO org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter - File Output Committer Algorithm version is 1
2016-09-11 14:07:57,282 [LocalJobRunner Map Task Executor #0] INFO org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter - FileOutputCommitter skip cleanup _temporary folders under output directory:false, ignore cleanup failures: false
2016-09-11 14:07:57,288 [LocalJobRunner Map Task Executor #0] INFO org.apache.pig.data.SchemaTupleBackend - Key [pig.schematuple] was not set... will not generate code.
2016-09-11 14:07:57,290 [LocalJobRunner Map Task Executor #0] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapOnly$Map - Aliases being processed per job phase (AliasName[line,offset]): M: twitter[20,10],B[21,4] C: R:
2016-09-11 14:07:57,291 [Thread-214] INFO org.apache.hadoop.mapred.LocalJobRunner - map task executor complete.
2016-09-11 14:07:57,296 [Thread-214] WARN org.apache.hadoop.mapred.LocalJobRunner - job_local360376249_0009
java.lang.Exception: java.lang.IncompatibleClassChangeError: Found interface org.apache.hadoop.mapreduce.Counter, but class was expected
    at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462)
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:522)
Caused by: java.lang.IncompatibleClassChangeError: Found interface org.apache.hadoop.mapreduce.Counter, but class was expected
    at com.twitter.elephantbird.pig.util.PigCounterHelper.incrCounter(PigCounterHelper.java:55)
    at com.twitter.elephantbird.pig.load.LzoBaseLoadFunc.incrCounter(LzoBaseLoadFunc.java:70)
    at com.twitter.elephantbird.pig.load.JsonLoader.getNext(JsonLoader.java:130)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigRecordReader.nextKeyValue(PigRecordReader.java:204)
    at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:556)
    at org.apache.hadoop.mapreduce.task.MapContextImpl.nextKeyValue(MapContextImpl.java:80)
    at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.nextKeyValue(WrappedMapper.java:91)
    at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
    at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:243)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
2016-09-11 14:07:57,467 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - HadoopJobId: job_local360376249_0009
2016-09-11 14:07:57,467 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Processing aliases B,twitter
2016-09-11 14:07:57,467 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - detailed locations: M: twitter[20,10],B[21,4] C: R:
2016-09-11 14:07:57,468 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 0% complete
2016-09-11 14:07:57,468 [main] WARN org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Ooops! Some job has failed! Specify -stop_on_failure if you want Pig to stop immediately on failure.
2016-09-11 14:07:57,468 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - job job_local360376249_0009 has failed! Stop running all dependent jobs
2016-09-11 14:07:57,468 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 100% complete
2016-09-11 14:07:57,469 [main] INFO org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize JVM Metrics with processName=JobTracker, sessionId= - already initialized
2016-09-11 14:07:57,469 [main] INFO org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize JVM Metrics with processName=JobTracker, sessionId= - already initialized
2016-09-11 14:07:57,469 [main] ERROR org.apache.pig.tools.pigstats.mapreduce.MRPigStatsUtil - 1 map reduce job(s) failed!
2016-09-11 14:07:57,470 [main] INFO org.apache.pig.tools.pigstats.mapreduce.SimplePigStats - Script Statistics:
HadoopVersion    PigVersion         UserId  StartedAt            FinishedAt           Features
2.7.1.2.3.4.7-4  0.15.0.2.3.4.7-4   root    2016-09-11 14:07:57  2016-09-11 14:07:57  UNKNOWN
Failed!
Failed Jobs:
JobId                    Alias      Feature   Message              Outputs
job_local360376249_0009  B,twitter  MAP_ONLY  Message: Job failed! file:/tmp/temp252944192/tmp-470484503,
Input(s): Failed to read data from "file:///root/PIG/PIG/sample.json"
Output(s): Failed to produce result in "file:/tmp/temp252944192/tmp-470484503"
Counters:
Total records written : 0
Total bytes written : 0
Spillable Memory Manager spill count : 0
Total bags proactively spilled: 0
Total records proactively spilled: 0
Job DAG: job_local360376249_0009
Also, please clarify how to use the jar files and which versions to use; I am confused about the versions.
Some say use Elephant Bird, and some say use Avro, but I have tried both and neither of them works.
Please help.
Mohan.V
I got it on my own.
It was a jar version issue.
script:-
REGISTER elephant-bird-core-4.1.jar
REGISTER elephant-bird-pig-4.1.jar
REGISTER elephant-bird-hadoop-compat-4.1.jar
And it worked fine.
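Putting the fix together, the full script with the 4.1 jars would presumably look like the following (same LOAD/FOREACH as before; json-simple is assumed to still be needed by the loader):

```pig
REGISTER json-simple-1.1.1.jar
REGISTER elephant-bird-core-4.1.jar
REGISTER elephant-bird-pig-4.1.jar
REGISTER elephant-bird-hadoop-compat-4.1.jar

twitter = LOAD 'sample.json' USING com.twitter.elephantbird.pig.load.JsonLoader();
B = FOREACH twitter GENERATE
        (chararray)$0#'created_at' AS created_at,
        (chararray)$0#'id' AS id,
        (chararray)$0#'id_str' AS id_str,
        (chararray)$0#'text' AS text,
        (chararray)$0#'source' AS source,
        com.twitter.elephantbird.pig.piggybank.JsonStringToMap($0#'entities') AS entities,
        (boolean)$0#'favorited' AS favorited;
DESCRIBE B;
DUMP B;
```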

Apache Pig error while dumping Json data

I have a JSON file and want to load it using Apache Pig.
I am using the built-in JsonLoader to load the JSON data. Below is the sample data:
cat jsondata1.json
{ "response": { "id": 10123, "thread": "Sloths", "comments": ["Sloths are adorable So chill"] }, "response_time": 0.425 }
{ "response": { "id": 13828, "thread": "Bigfoot", "comments": ["hello world"] } , "response_time": 0.517 }
Here I load the JSON data using the built-in JsonLoader. Loading produces no error, but dumping the data gives the following error:
grunt> a = load '/home/cloudera/jsondata1.json' using JsonLoader('response:tuple (id:int, thread:chararray, comments:bag {tuple(comment:chararray)}), response_time:double');
grunt> dump a;
2016-04-17 01:11:13,286 [pool-4-thread-1] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigRecordReader - Current split being processed file:/home/cloudera/jsondata1.json:0+229
2016-04-17 01:11:13,287 [pool-4-thread-1] WARN org.apache.hadoop.conf.Configuration - dfs.https.address is deprecated. Instead, use dfs.namenode.https-address
2016-04-17 01:11:13,311 [pool-4-thread-1] WARN org.apache.pig.data.SchemaTupleBackend - SchemaTupleBackend has already been initialized
2016-04-17 01:11:13,321 [pool-4-thread-1] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapOnly$Map - Aliases being processed per job phase (AliasName[line,offset]): M: a[5,4] C: R:
2016-04-17 01:11:13,349 [Thread-16] INFO org.apache.hadoop.mapred.LocalJobRunner - Map task executor complete.
2016-04-17 01:11:13,351 [Thread-16] WARN org.apache.hadoop.mapred.LocalJobRunner - job_local801054416_0004
java.lang.Exception: org.codehaus.jackson.JsonParseException: Current token (FIELD_NAME) not numeric, can not use numeric value accessors
at [Source: java.io.ByteArrayInputStream@2484de3c; line: 1, column: 120]
at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:406)
Caused by: org.codehaus.jackson.JsonParseException: Current token (FIELD_NAME) not numeric, can not use numeric value accessors
at [Source: java.io.ByteArrayInputStream@2484de3c; line: 1, column: 120]
at org.codehaus.jackson.JsonParser._constructError(JsonParser.java:1291)
at org.codehaus.jackson.impl.JsonParserMinimalBase._reportError(JsonParserMinimalBase.java:385)
at org.codehaus.jackson.impl.JsonNumericParserBase._parseNumericValue(JsonNumericParserBase.java:399)
at org.codehaus.jackson.impl.JsonNumericParserBase.getDoubleValue(JsonNumericParserBase.java:311)
at org.apache.pig.builtin.JsonLoader.readField(JsonLoader.java:203)
at org.apache.pig.builtin.JsonLoader.getNext(JsonLoader.java:157)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigRecordReader.nextKeyValue(PigRecordReader.java:211)
at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:483)
at org.apache.hadoop.mapreduce.task.MapContextImpl.nextKeyValue(MapContextImpl.java:76)
at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.nextKeyValue(WrappedMapper.java:85)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:139)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:672)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:330)
at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:268)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
at java.util.concurrent.FutureTask.run(FutureTask.java:138)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
2016-04-17 01:11:13,548 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - HadoopJobId: job_local801054416_0004
2016-04-17 01:11:13,548 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Processing aliases a
2016-04-17 01:11:13,548 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - detailed locations: M: a[5,4] C: R:
2016-04-17 01:11:18,059 [main] WARN org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Ooops! Some job has failed! Specify -stop_on_failure if you want Pig to stop immediately on failure.
2016-04-17 01:11:18,059 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - job job_local801054416_0004 has failed! Stop running all dependent jobs
2016-04-17 01:11:18,059 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 100% complete
2016-04-17 01:11:18,059 [main] ERROR org.apache.pig.tools.pigstats.PigStatsUtil - 1 map reduce job(s) failed!
2016-04-17 01:11:18,060 [main] INFO org.apache.pig.tools.pigstats.SimplePigStats - Detected Local mode. Stats reported below may be incomplete
2016-04-17 01:11:18,060 [main] INFO org.apache.pig.tools.pigstats.SimplePigStats - Script Statistics:
HadoopVersion PigVersion UserId StartedAt FinishedAt Features
2.0.0-cdh4.7.0 0.11.0-cdh4.7.0 cloudera 2016-04-17 01:11:12 2016-04-17 01:11:18 UNKNOWN
Failed!
Failed Jobs:
JobId Alias Feature Message Outputs
job_local801054416_0004 a MAP_ONLY Message: Job failed! file:/tmp/temp-1766116741/tmp1151698221,
Input(s):
Failed to read data from "/home/cloudera/jsondata1.json"
Output(s):
Failed to produce result in "file:/tmp/temp-1766116741/tmp1151698221"
Job DAG:
job_local801054416_0004
2016-04-17 01:11:18,060 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Failed!
2016-04-17 01:11:18,061 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1066: Unable to open iterator for alias a
Details at logfile: /home/cloudera/pig_1460877001124.log
I am not able to find the issue. How can I define the correct schema for the above JSON data?
Try this:
comments:{(chararray)}
because this version:
comments:bag {tuple(comment:chararray)}
fits this JSON shape:
"comments": [{"comment": "hello world"}]
whereas your data holds plain string values, not nested documents:
"comments": ["hello world"]

Error processing complex json object of twitter with pig JsonLoader() of elephant-bird Jars

I wanted to process a Twitter JSON object with Pig using the elephant-bird jars, for which I wrote the Pig script below.
REGISTER '/usr/lib/pig/lib/elephant-bird-hadoop-compat-4.1.jar';
REGISTER '/usr/lib/pig/lib/elephant-bird-pig-4.1.jar';
A = LOAD '/user/flume/tweets/data.json' USING com.twitter.elephantbird.pig.load.JsonLoader('-nestedLoad') AS myMap;
B = FOREACH A GENERATE myMap#'id' AS ID,myMap#'created_at' AS createdAT;
DUMP B;
which gave me the error below:
2015-08-25 11:06:34,295 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - HadoopJobId: job_1439883208520_0177
2015-08-25 11:06:34,295 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Processing aliases A,B
2015-08-25 11:06:34,295 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - detailed locations: M: A[3,4],B[4,4] C: R:
2015-08-25 11:06:34,303 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 0% complete
2015-08-25 11:06:34,303 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Running jobs are [job_1439883208520_0177]
2015-08-25 11:07:06,449 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 50% complete
2015-08-25 11:07:06,449 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Running jobs are [job_1439883208520_0177]
2015-08-25 11:07:09,458 [main] WARN org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Ooops! Some job has failed! Specify -stop_on_failure if you want Pig to stop immediately on failure.
2015-08-25 11:07:09,458 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - job job_1439883208520_0177 has failed! Stop running all dependent jobs
2015-08-25 11:07:09,459 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 100% complete
2015-08-25 11:07:09,667 [main] INFO org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl - Timeline service address: http://trinityhadoopmaster.com:8188/ws/v1/timeline/
2015-08-25 11:07:09,668 [main] INFO org.apache.hadoop.yarn.client.RMProxy - Connecting to ResourceManager at trinityhadoopmaster.com/192.168.1.135:8032
2015-08-25 11:07:09,678 [main] INFO org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=FAILED. Redirecting to job history server
2015-08-25 11:07:09,779 [main] ERROR org.apache.pig.tools.pigstats.PigStats - ERROR 0: java.lang.ClassNotFoundException: org.json.simple.parser.ParseException
2015-08-25 11:07:09,779 [main] ERROR org.apache.pig.tools.pigstats.mapreduce.MRPigStatsUtil - 1 map reduce job(s) failed!
2015-08-25 11:07:09,780 [main] INFO org.apache.pig.tools.pigstats.mapreduce.SimplePigStats - Script Statistics:
HadoopVersion PigVersion UserId StartedAt FinishedAt Features
2.6.0 0.14.0 hdfs 2015-08-25 11:06:33 2015-08-25 11:07:09 UNKNOWN
Failed!
Failed Jobs:
JobId Alias Feature Message Outputs
job_1439883208520_0177 A,B MAP_ONLY Message: Job failed! hdfs://trinityhadoopmaster.com:9000/tmp/temp1554332510/tmp835744559,
Input(s):
Failed to read data from "hdfs://trinityhadoopmaster.com:9000/user/flume/tweets/data.json"
Output(s):
Failed to produce result in "hdfs://trinityhadoopmaster.com:9000/tmp/temp1554332510/tmp835744559"
Counters:
Total records written : 0
Total bytes written : 0
Spillable Memory Manager spill count : 0
Total bags proactively spilled: 0
Total records proactively spilled: 0
Job DAG:
job_1439883208520_0177
2015-08-25 11:07:09,780 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Failed!
2015-08-25 11:07:09,787 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1066: Unable to open iterator for alias B. Backend error : java.lang.ClassNotFoundException: org.json.simple.parser.ParseException
Details at logfile: /tmp/pig-err.log
grunt>
I have no clue how to approach this. Can anyone help me?
REGISTER '/tmp/elephant-bird-core-4.1.jar';
REGISTER '/tmp/elephant-bird-pig-4.1.jar';
REGISTER '/tmp/elephant-bird-hadoop-compat-4.1.jar';
REGISTER '/tmp/google-collections-1.0.jar';
REGISTER '/tmp/json-simple-1.1.jar';
It works. The ClassNotFoundException for org.json.simple.parser.ParseException means that class was missing from the job's classpath; registering json-simple-1.1.jar (along with the other elephant-bird dependencies) supplies it.

JsonLoader throws error in pig

I am unable to decode this simple JSON, and I don't know what I am doing wrong.
Please help me with this Pig script.
I have to decode the data below, which is in JSON format.
3.json
{
"id": 6668,
"source_name": "National Stock Exchange of India",
"source_code": "NSE"
}
and my Pig script is:
a = LOAD '3.json' USING org.apache.pig.builtin.JsonLoader ('id:int, source_name:chararray, source_code:chararray');
dump a;
The error I get is given below:
2015-07-23 13:40:08,715 [LocalJobRunner Map Task Executor #0] INFO org.apache.hadoop.mapred.LocalJobRunner - Starting task: attempt_local1664361500_0001_m_000000_0
2015-07-23 13:40:08,775 [LocalJobRunner Map Task Executor #0] INFO org.apache.hadoop.mapred.Task - Using ResourceCalculatorProcessTree : [ ]
2015-07-23 13:40:08,780 [LocalJobRunner Map Task Executor #0] INFO org.apache.hadoop.mapred.MapTask - Processing split: Number of splits :1
Total Length = 88
Input split[0]:
Length = 88
Locations:
-----------------------
2015-07-23 13:40:08,793 [LocalJobRunner Map Task Executor #0] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigRecordReader - Current split being processed file:/home/hariprasad.sudo/3.json:0+88
2015-07-23 13:40:08,844 [LocalJobRunner Map Task Executor #0] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapOnly$Map - Aliases being processed per job phase (AliasName[line,offset]): M: a[1,4] C: R:
2015-07-23 13:40:08,861 [Thread-5] INFO org.apache.hadoop.mapred.LocalJobRunner - map task executor complete.
2015-07-23 13:40:08,867 [Thread-5] WARN org.apache.hadoop.mapred.LocalJobRunner - job_local1664361500_0001
java.lang.Exception: org.codehaus.jackson.JsonParseException: Unexpected end-of-input: expected close marker for OBJECT (from [Source: java.io.ByteArrayInputStream#61a79110; line: 1, column: 0])
at [Source: java.io.ByteArrayInputStream#61a79110; line: 1, column: 3]
at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462)
at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:522)
Caused by: org.codehaus.jackson.JsonParseException: Unexpected end-of-input: expected close marker for OBJECT (from [Source: java.io.ByteArrayInputStream#61a79110; line: 1, column: 0])
at [Source: java.io.ByteArrayInputStream#61a79110; line: 1, column: 3]
at org.codehaus.jackson.JsonParser._constructError(JsonParser.java:1291)
at org.codehaus.jackson.impl.JsonParserMinimalBase._reportError(JsonParserMinimalBase.java:385)
at org.codehaus.jackson.impl.JsonParserMinimalBase._reportInvalidEOF(JsonParserMinimalBase.java:318)
at org.codehaus.jackson.impl.JsonParserBase._handleEOF(JsonParserBase.java:354)
at org.codehaus.jackson.impl.Utf8StreamParser._skipWSOrEnd(Utf8StreamParser.java:1841)
at org.codehaus.jackson.impl.Utf8StreamParser.nextToken(Utf8StreamParser.java:275)
at org.apache.pig.builtin.JsonLoader.readField(JsonLoader.java:180)
at org.apache.pig.builtin.JsonLoader.getNext(JsonLoader.java:164)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigRecordReader.nextKeyValue(PigRecordReader.java:211)
at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:533)
at org.apache.hadoop.mapreduce.task.MapContextImpl.nextKeyValue(MapContextImpl.java:80)
at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.nextKeyValue(WrappedMapper.java:91)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340)
at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:243)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
2015-07-23 13:40:09,179 [main] WARN org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Ooops! Some job has failed! Specify -stop_on_failure if you want Pig to stop immediately on failure.
2015-07-23 13:40:09,179 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - job job_local1664361500_0001 has failed! Stop running all dependent jobs
2015-07-23 13:40:09,179 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 100% complete
2015-07-23 13:40:09,180 [main] ERROR org.apache.pig.tools.pigstats.PigStatsUtil - 1 map reduce job(s) failed!
2015-07-23 13:40:09,180 [main] INFO org.apache.pig.tools.pigstats.SimplePigStats - Detected Local mode. Stats reported below may be incomplete
2015-07-23 13:40:09,181 [main] INFO org.apache.pig.tools.pigstats.SimplePigStats - Script Statistics:
HadoopVersion PigVersion UserId StartedAt FinishedAt Features
2.3.0-cdh5.1.3 0.12.0-cdh5.1.3 hariprasad.sudo 2015-07-23 13:40:07 2015-07-23 13:40:09 UNKNOWN
Failed!
Failed Jobs:
JobId Alias Feature Message Outputs
job_local1664361500_0001 a MAP_ONLY Message: Job failed! file:/tmp/temp-65649055/tmp1240506051,
Input(s):
Failed to read data from "file:///home/hariprasad.sudo/3.json"
Output(s):
Failed to produce result in "file:/tmp/temp-65649055/tmp1240506051"
Job DAG:
job_local1664361500_0001
2015-07-23 13:40:09,181 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Failed!
2015-07-23 13:40:09,186 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1066: Unable to open iterator for alias a
Details at logfile: /home/hariprasad.sudo/pig_1437673203961.log
grunt> 2015-07-23 13:40:14,754 [communication thread] INFO org.apache.hadoop.mapred.LocalJobRunner - map > map
Please help me understand what is wrong.
Thanks,
Hari
Put the compact (single-line) version of the JSON in 3.json. We can use http://www.jsoneditoronline.org for this.
3.json
{"id":6668,"source_name":"National Stock Exchange of India","source_code":"NSE"}
With this, we are able to dump the data:
(6668,National Stock Exchange of India,NSE)
Ref: Error from Json Loader in Pig, where a similar issue is discussed.
Extract from the above ref. link:
Pig doesn't usually like "human readable" JSON. Get rid of the spaces and/or indentations, and you're good.