I am trying to set up an Apache Ignite cache store using MySQL as external storage.
I have read all the official documentation about it and examined many other examples, but I can't make it run:
[2022-06-02 16:45:56:551] [INFO] - 55333 - org.apache.ignite.logger.java.JavaLogger.info(JavaLogger.java:285) - Configured failure handler: [hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0, super=AbstractFailureHandler [ignoredFailureTypes=UnmodifiableSet [SYSTEM_WORKER_BLOCKED, SYSTEM_CRITICAL_OPERATION_TIMEOUT]]]]
[2022-06-02 16:45:56:874] [INFO] - 55333 - org.apache.ignite.logger.java.JavaLogger.info(JavaLogger.java:285) - Successfully bound communication NIO server to TCP port [port=47100, locHost=0.0.0.0/0.0.0.0, selectorsCnt=4, selectorSpins=0, pairedConn=false]
[2022-06-02 16:45:56:874] [WARN] - 55333 - org.apache.ignite.logger.java.JavaLogger.warning(JavaLogger.java:295) - Message queue limit is set to 0 which may lead to potential OOMEs when running cache operations in FULL_ASYNC or PRIMARY_SYNC modes due to message queues growth on sender and receiver sides.
[16:45:56] Message queue limit is set to 0 which may lead to potential OOMEs when running cache operations in FULL_ASYNC or PRIMARY_SYNC modes due to message queues growth on sender and receiver sides.
[2022-06-02 16:45:56:898] [WARN] - 55333 - org.apache.ignite.logger.java.JavaLogger.warning(JavaLogger.java:295) - Checkpoints are disabled (to enable configure any GridCheckpointSpi implementation)
[2022-06-02 16:45:56:926] [WARN] - 55333 - org.apache.ignite.logger.java.JavaLogger.warning(JavaLogger.java:295) - Collision resolution is disabled (all jobs will be activated upon arrival).
[16:45:56] Security status [authentication=off, sandbox=off, tls/ssl=off]
[2022-06-02 16:45:56:927] [INFO] - 55333 - org.apache.ignite.logger.java.JavaLogger.info(JavaLogger.java:285) - Security status [authentication=off, sandbox=off, tls/ssl=off]
[2022-06-02 16:45:57:204] [INFO] - 55333 - org.apache.ignite.logger.java.JavaLogger.info(JavaLogger.java:285) - Successfully bound to TCP port [port=47500, localHost=0.0.0.0/0.0.0.0, locNodeId=b397c114-d34d-4245-9645-f78c5d184888]
[2022-06-02 16:45:57:242] [WARN] - 55333 - org.apache.ignite.logger.java.JavaLogger.warning(JavaLogger.java:295) - DataRegionConfiguration.maxWalArchiveSize instead DataRegionConfiguration.walHistorySize would be used for removing old archive wal files
[2022-06-02 16:45:57:253] [INFO] - 55333 - org.apache.ignite.logger.java.JavaLogger.info(JavaLogger.java:285) - Configured data regions initialized successfully [total=4]
[2022-06-02 16:45:57:307] [ERROR] - 55333 - org.apache.ignite.logger.java.JavaLogger.error(JavaLogger.java:310) - Exception during start processors, node will be stopped and close connections
org.apache.ignite.IgniteCheckedException: Failed to start processor: GridProcessorAdapter []
at org.apache.ignite.internal.IgniteKernal.startProcessor(IgniteKernal.java:1989) ~[ignite-core-2.10.0.jar:2.10.0]
Caused by: org.apache.ignite.IgniteCheckedException: Failed to validate cache configuration. Cache store factory is not serializable. Cache name: StockConfigCache
Caused by: org.apache.ignite.IgniteCheckedException: Failed to serialize object: CacheJdbcPojoStoreFactory [batchSize=512, dataSrcBean=null, dialect=org.apache.ignite.cache.store.jdbc.dialect.MySQLDialect@14993306, maxPoolSize=8, maxWrtAttempts=2, parallelLoadCacheMinThreshold=512, hasher=org.apache.ignite.cache.store.jdbc.JdbcTypeDefaultHasher@73ae82da, transformer=org.apache.ignite.cache.store.jdbc.JdbcTypesDefaultTransformer@6866e740, dataSrc=null, dataSrcFactory=com.anyex.ex.memory.model.CacheConfig$$Lambda$310/1421763091@31183ee2, sqlEscapeAll=false]
Caused by: java.io.NotSerializableException: com.anyex.ex.database.DynamicDataSource
Any advice or idea would be appreciated, thank you!
public static CacheConfiguration cacheStockConfigCache(DataSource dataSource, Boolean writeBehind)
{
CacheConfiguration ccfg = new CacheConfiguration();
ccfg.setSqlSchema("public");
ccfg.setName("StockConfigCache");
ccfg.setCacheMode(CacheMode.REPLICATED);
ccfg.setAtomicityMode(CacheAtomicityMode.TRANSACTIONAL);
ccfg.setIndexedTypes(Long.class, StockConfigMem.class);
CacheJdbcPojoStoreFactory cacheStoreFactory = new CacheJdbcPojoStoreFactory();
cacheStoreFactory.setDataSourceFactory((Factory<DataSource>) () -> dataSource);
//cacheStoreFactory.setDialect(new OracleDialect());
cacheStoreFactory.setDialect(new MySQLDialect());
cacheStoreFactory.setTypes(JdbcTypes.jdbcTypeStockConfigMem(ccfg.getName(), "StockConfig"));
ccfg.setCacheStoreFactory(cacheStoreFactory);
ccfg.setReadFromBackup(false);
ccfg.setCopyOnRead(true);
if(writeBehind){
ccfg.setWriteThrough(true);
ccfg.setWriteBehindEnabled(true);
}
return ccfg;
}

public static JdbcType jdbcTypeStockConfigMem(String cacheName, String tableName)
{
JdbcType type = new JdbcType();
type.setCacheName(cacheName);
type.setKeyType(Long.class);
type.setValueType(StockConfigMem.class);
type.setDatabaseTable(tableName);
type.setKeyFields(new JdbcTypeField(Types.NUMERIC, "id", Long.class, "id"));
type.setValueFields(
new JdbcTypeField(Types.NUMERIC, "id", Long.class, "id"),
new JdbcTypeField(Types.NUMERIC, "stockinfoId", Long.class, "stockinfoId"),
new JdbcTypeField(Types.VARCHAR, "remark", String.class, "remark"),
new JdbcTypeField(Types.TIMESTAMP, "updateTime", Timestamp.class, "updateTime")
);
return type;
}

igniteConfiguration.setCacheConfiguration(
CacheConfig.cacheStockConfigCache(dataSource, igniteProperties.getJdbc().getWriteBehind())
);

@Bean("igniteInstance")
@ConditionalOnProperty(value = "ignite.enable", havingValue = "true", matchIfMissing = true)
public Ignite ignite(IgniteConfiguration igniteConfiguration)
{
log.info("igniteConfiguration info:{}", igniteConfiguration.toString());
Ignite ignite = Ignition.start(igniteConfiguration);
log.info("{} ignite started with discovery type {}", ignite.name(), igniteProperties.getType());
return ignite;
}
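A note on the last `Caused by` line above: `javax.cache.configuration.Factory` extends `Serializable`, so the lambda passed to `setDataSourceFactory` is serialized together with everything it captures, here the `DynamicDataSource` instance. The sketch below reproduces that effect with JDK classes only. `SerializableFactory`, `FakeDataSource` and `LazyDataSourceFactory` are hypothetical stand-ins, not Ignite or application classes. It suggests one possible direction: a factory that holds only serializable settings and builds the data source inside `get()`.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.function.Supplier;

// Hypothetical stand-in for javax.cache.configuration.Factory (which also extends Serializable).
interface SerializableFactory<T> extends Supplier<T>, Serializable {}

// Hypothetical stand-in for a non-serializable data source such as DynamicDataSource.
class FakeDataSource {
    final String url;
    FakeDataSource(String url) { this.url = url; }
}

// Holds only serializable settings; the data source is created wherever get() is called.
class LazyDataSourceFactory implements SerializableFactory<FakeDataSource> {
    private static final long serialVersionUID = 1L;
    private final String url;
    LazyDataSourceFactory(String url) { this.url = url; }
    @Override public FakeDataSource get() { return new FakeDataSource(url); }
}

class FactorySerializationDemo {
    // Returns false when Java serialization rejects the object graph.
    static boolean canSerialize(Object o) {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(o);
            return true;
        } catch (NotSerializableException e) {
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Same shape as the question's lambda: it captures the live data source object.
    static SerializableFactory<FakeDataSource> capturing(FakeDataSource ds) {
        return () -> ds;
    }

    public static void main(String[] args) {
        FakeDataSource ds = new FakeDataSource("jdbc:mysql://localhost:3306/test");
        System.out.println("capturing lambda serializable: " + canSerialize(capturing(ds)));
        System.out.println("lazy factory serializable: " + canSerialize(new LazyDataSourceFactory(ds.url)));
    }
}
```

With the real classes, the equivalent would be a named `Factory<DataSource>` implementation that carries the JDBC settings and creates the pool in `create()`; whether that fits the `DynamicDataSource` setup here is an assumption.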
I configured Debezium for MySQL CDC. It captures DDL changes such as CREATE/DROP TABLE, but it does not capture DML events.
I am using MySQL 8.0.11 and embedded Debezium version 0.8.3.Final.
No additional configuration was done on the MySQL server when creating the table.
The configuration bean is created with the code below:
@Bean
public io.debezium.config.Configuration customerConnector() {
return io.debezium.config.Configuration.create()
.with(EmbeddedEngine.CONNECTOR_CLASS, "io.debezium.connector.mysql.MySqlConnector")
.with(EmbeddedEngine.OFFSET_STORAGE, "org.apache.kafka.connect.storage.FileOffsetBackingStore")
.with(EmbeddedEngine.OFFSET_STORAGE_FILE_FILENAME, "path-to-file")
.with("offset.flush.interval.ms", 60000)
.with(EmbeddedEngine.ENGINE_NAME, "customer-mysql-connector")
.with(MySqlConnectorConfig.SERVER_NAME, databaseServer)
.with(MySqlConnectorConfig.HOSTNAME, databaseServer)
.with(MySqlConnectorConfig.PORT, databasePort)
.with(MySqlConnectorConfig.USER, databaseUser)
.with(MySqlConnectorConfig.PASSWORD, databasePassword)
.with(MySqlConnectorConfig.DATABASE_WHITELIST, databaseSchemaName)
.with(MySqlConnectorConfig.TABLE_WHITELIST, databaseTable)
.with(MySqlConnectorConfig.DATABASE_HISTORY,
MemoryDatabaseHistory.class.getName()).build();
}
Below is the log from starting it as a Spring Boot application:
2020-05-29 21:24:28.028 INFO 5576 --- [pool-1-thread-1] i.d.connector.mysql.MySqlConnectorTask : MySQL has the binlog file 'binlog.000009' required by the connector
2020-05-29 21:24:28.072 INFO 5576 --- [pool-1-thread-1] io.debezium.util.Threads : Requested thread factory for connector MySqlConnector, id = localhost named = binlog-client
2020-05-29 21:24:28.074 INFO 5576 --- [ main] o.s.s.concurrent.ThreadPoolTaskExecutor : Initializing ExecutorService 'applicationTaskExecutor'
2020-05-29 21:24:28.074 INFO 5576 --- [pool-1-thread-1] io.debezium.util.Threads : Creating thread debezium-mysqlconnector-localhost-binlog-client
2020-05-29 21:24:28.090 INFO 5576 --- [-localhost:3306] io.debezium.util.Threads : Creating thread debezium-mysqlconnector-localhost-binlog-client
2020-05-29 21:24:28.121 INFO 5576 --- [-localhost:3306] c.g.shyiko.mysql.binlog.BinaryLogClient : Connected to localhost:3306 at binlog.000009/3786 (sid:6293, cid:36)
2020-05-29 21:24:28.121 INFO 5576 --- [-localhost:3306] i.debezium.connector.mysql.BinlogReader : Connected to MySQL binlog at localhost:3306, starting at binlog file 'binlog.000009', pos=3786, skipping 8 events plus 0 rows
2020-05-29 21:24:28.121 INFO 5576 --- [-localhost:3306] io.debezium.util.Threads : Creating thread debezium-mysqlconnector-localhost-binlog-client
2020-05-29 21:24:28.183 INFO 5576 --- [ main] d.s.w.p.DocumentationPluginsBootstrapper : Context refreshed
2020-05-29 21:24:28.199 INFO 5576 --- [ main] d.s.w.p.DocumentationPluginsBootstrapper : Found 1 custom documentation plugin(s)
2020-05-29 21:24:28.199 INFO 5576 --- [ main] s.d.s.w.s.ApiListingReferenceScanner : Scanning for api listing references
Any clue?
Thanks!
table.whitelist should be set to <schema>.<table>, so in your case source.customer.
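To illustrate why a bare table name does not match: as far as I know, the whitelist entries are regular expressions that are compared against the fully-qualified `database.table` identifier. The sketch below imitates that comparison with plain `java.util.regex`. The `isWhitelisted` helper is hypothetical and is not Debezium's implementation.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

class TableWhitelistSketch {
    // Hypothetical helper: true if any comma-separated regex fully matches "database.table".
    static boolean isWhitelisted(String whitelist, String database, String table) {
        String qualified = database + "." + table;
        return Arrays.stream(whitelist.split(","))
                .map(String::trim)
                .filter(entry -> !entry.isEmpty())
                .anyMatch(regex -> Pattern.matches(regex, qualified));
    }

    public static void main(String[] args) {
        // Bare table name versus the qualified form suggested in the answer.
        System.out.println("customer        -> " + isWhitelisted("customer", "source", "customer"));
        System.out.println("source.customer -> " + isWhitelisted("source.customer", "source", "customer"));
    }
}
```

Because the entries are treated as regular expressions in this sketch, the dot matches any character; escaping it (`source\\.customer`) would make the match stricter.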
WSO2 is installed on Linux with Oracle RAC. I followed all the steps (I think!).
When starting it for the first time, I don't get any errors:
TID: [-1234] [] [2017-04-25 15:28:17,964] INFO {org.wso2.carbon.analytics.spark.core.internal.SparkAnalyticsExecutor} - Started Spark CLIENT in the cluster pointing to MASTER local with the application name : CarbonAnalytics and UI port : 4040 {org.wso2.carbon.analytics.spark.core.internal.SparkAnalyticsExecutor}
TID: [-1234] [] [2017-04-25 15:28:17,987] INFO {org.wso2.carbon.ml.core.internal.MLCoreDS} - H2O Server will start in local mode. {org.wso2.carbon.ml.core.internal.MLCoreDS}
TID: [-1234] [] [2017-04-25 15:28:18,655] INFO {org.wso2.carbon.ml.core.impl.H2OServer} - H2o Server has started. {org.wso2.carbon.ml.core.impl.H2OServer}
TID: [-1234] [] [2017-04-25 15:28:18,659] INFO {org.wso2.carbon.ml.core.internal.MLCoreDS} - Machine Learner Wizard URL : https://172.17.9.67:9443/ml {org.wso2.carbon.ml.core.internal.MLCoreDS}
TID: [-1234] [] [2017-04-25 15:28:18,660] INFO {org.wso2.carbon.ml.core.internal.MLCoreDS} - ML core bundle activated successfully. {org.wso2.carbon.ml.core.internal.MLCoreDS}
TID: [-1234] [] [2017-04-25 15:28:19,229] INFO {org.wso2.carbon.ntask.core.impl.AbstractQuartzTaskManager} - Task scheduled: [-1234][ANALYTICS_SPARK_EVENTING][STORE_EVENT_ROUTER_TASK] {org.wso2.carbon.ntask.core.impl.AbstractQuartzTaskManager}
TID: [-1234] [] [2017-04-25 15:28:19,315] INFO {org.wso2.carbon.core.init.JMXServerManager} - JMX Service URL : service:jmx:rmi://localhost:11111/jndi/rmi://localhost:9999/jmxrmi {org.wso2.carbon.core.init.JMXServerManager}
TID: [-1234] [] [2017-04-25 15:28:19,358] INFO {org.wso2.carbon.core.internal.StartupFinalizerServiceComponent} - Server : WSO2 Data Analytics Server-3.1.0 {org.wso2.carbon.core.internal.StartupFinalizerServiceComponent}
TID: [-1234] [] [2017-04-25 15:28:19,360] INFO {org.wso2.carbon.core.internal.StartupFinalizerServiceComponent} - WSO2 Carbon started in 40 sec {org.wso2.carbon.core.internal.StartupFinalizerServiceComponent}
TID: [-1234] [] [2017-04-25 15:28:19,983] INFO {org.wso2.carbon.ui.internal.CarbonUIServiceComponent} - Mgt Console URL : https://172.17.9.67:9443/carbon/ {org.wso2.carbon.ui.internal.CarbonUIServiceComponent}
TID: [-1] [] [2017-04-25 15:28:45,332] INFO {org.wso2.carbon.event.processor.manager.core.internal.CarbonEventManagementService} - Starting polling event receivers {org.wso2.carbon.event.processor.manager.core.internal.CarbonEventManagementService}
But I'm not able to open the console; nothing is shown when loading the HTTP URL :-(
I also tried 172.17.9.67:9763/carbon/ after uncommenting the AllowHttp setting.
I have been facing this issue for a long time. I tried to solve it but couldn't, so I need some expert advice.
I am trying to load a sample tweets JSON file.
sample.json:
{"filter_level":"low","retweeted":false,"in_reply_to_screen_name":"FilmFan","truncated":false,"lang":"en","in_reply_to_status_id_str":null,"id":689085590822891521,"in_reply_to_user_id_str":"6048122","timestamp_ms":"1453125782100","in_reply_to_status_id":null,"created_at":"Mon Jan 18 14:03:02 +0000 2016","favorite_count":0,"place":null,"coordinates":null,"text":"#filmfan hey its time for you guys follow #acadgild To #AchieveMore and participate in contest Win Rs.500 worth vouchers","contributors":null,"geo":null,"entities":{"symbols":[],"urls":[],"hashtags":[{"text":"AchieveMore","indices":[56,68]}],"user_mentions":[{"id":6048122,"name":"Tanya","indices":[0,8],"screen_name":"FilmFan","id_str":"6048122"},{"id":2649945906,"name":"ACADGILD","indices":[42,51],"screen_name":"acadgild","id_str":"2649945906"}]},"is_quote_status":false,"source":"<a href=\"https://about.twitter.com/products/tweetdeck\" rel=\"nofollow\">TweetDeck<\/a>","favorited":false,"in_reply_to_user_id":6048122,"retweet_count":0,"id_str":"689085590822891521","user":{"location":"India ","default_profile":false,"profile_background_tile":false,"statuses_count":86548,"lang":"en","profile_link_color":"94D487","profile_banner_url":"https://pbs.twimg.com/profile_banners/197865769/1436198000","id":197865769,"following":null,"protected":false,"favourites_count":1002,"profile_text_color":"000000","verified":false,"description":"Proud Indian, Digital Marketing Consultant,Traveler, Foodie, Adventurer, Data Architect, Movie Lover, Namo Fan","contributors_enabled":false,"profile_sidebar_border_color":"000000","name":"Bahubali","profile_background_color":"000000","created_at":"Sat Oct 02 17:41:02 +0000 
2010","default_profile_image":false,"followers_count":4467,"profile_image_url_https":"https://pbs.twimg.com/profile_images/664486535040000000/GOjDUiuK_normal.jpg","geo_enabled":true,"profile_background_image_url":"http://abs.twimg.com/images/themes/theme1/bg.png","profile_background_image_url_https":"https://abs.twimg.com/images/themes/theme1/bg.png","follow_request_sent":null,"url":null,"utc_offset":19800,"time_zone":"Chennai","notifications":null,"profile_use_background_image":false,"friends_count":810,"profile_sidebar_fill_color":"000000","screen_name":"Ashok_Uppuluri","id_str":"197865769","profile_image_url":"http://pbs.twimg.com/profile_images/664486535040000000/GOjDUiuK_normal.jpg","listed_count":50,"is_translator":false}}
I tried to load this JSON file using Elephant Bird.
script:
REGISTER json-simple-1.1.1.jar
REGISTER elephant-bird-2.2.3.jar
REGISTER guava-11.0.2.jar
REGISTER avro-1.7.7.jar
REGISTER piggybank-0.12.0.jar
twitter = LOAD 'sample.json' USING com.twitter.elephantbird.pig.load.JsonLoader();
B = foreach twitter generate (chararray)$0#'created_at' as created_at,(chararray)$0#'id' as id,(chararray)$0#'id_str' as id_str,(chararray)$0#'text' as text,(chararray)$0#'source' as source,com.twitter.elephantbird.pig.piggybank.JsonStringToMap($0#'entities') as entities,(boolean)$0#'favorited' as favorited;
describe B;
Output:
B: {created_at: chararray,id: chararray,id_str: chararray,text: chararray,source: chararray,entities: map[chararray],favorited: boolean}
But when I tried to DUMP B, the following error occurred:
ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1066: Unable to open iterator for alias B
I am providing the complete logs here.
2016-09-11 14:07:57,184 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size before optimization: 1
2016-09-11 14:07:57,184 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size after optimization: 1
2016-09-11 14:07:57,194 [main] INFO org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize JVM Metrics with processName=JobTracker, sessionId= - already initialized
2016-09-11 14:07:57,194 [main] INFO org.apache.pig.tools.pigstats.mapreduce.MRScriptState - Pig script settings are added to the job
2016-09-11 14:07:57,194 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
2016-09-11 14:07:57,199 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting up single store job
2016-09-11 14:07:57,199 [main] INFO org.apache.pig.data.SchemaTupleFrontend - Key [pig.schematuple] is false, will not generate code.
2016-09-11 14:07:57,199 [main] INFO org.apache.pig.data.SchemaTupleFrontend - Starting process to move generated code to distributed cacche
2016-09-11 14:07:57,199 [main] INFO org.apache.pig.data.SchemaTupleFrontend - Distributed cache not supported or needed in local mode. Setting key [pig.schematuple.local.dir] with code temp directory: /tmp/1473583077199-0
2016-09-11 14:07:57,206 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 1 map-reduce job(s) waiting for submission.
2016-09-11 14:07:57,207 [JobControl] INFO org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize JVM Metrics with processName=JobTracker, sessionId= - already initialized
2016-09-11 14:07:57,208 [JobControl] WARN org.apache.hadoop.mapreduce.JobResourceUploader - No job jar file set. User classes may not be found. See Job or Job#setJar(String).
2016-09-11 14:07:57,211 [JobControl] INFO org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to process : 1
2016-09-11 14:07:57,211 [JobControl] INFO org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths (combined) to process : 1
2016-09-11 14:07:57,212 [JobControl] INFO org.apache.hadoop.mapreduce.JobSubmitter - number of splits:1
2016-09-11 14:07:57,216 [JobControl] INFO org.apache.hadoop.mapreduce.JobSubmitter - Submitting tokens for job: job_local360376249_0009
2016-09-11 14:07:57,267 [JobControl] INFO org.apache.hadoop.mapreduce.Job - The url to track the job: http://localhost:8080/
2016-09-11 14:07:57,267 [Thread-214] INFO org.apache.hadoop.mapred.LocalJobRunner - OutputCommitter set in config null
2016-09-11 14:07:57,270 [Thread-214] INFO org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter - File Output Committer Algorithm version is 1
2016-09-11 14:07:57,270 [Thread-214] INFO org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter - FileOutputCommitter skip cleanup _temporary folders under output directory:false, ignore cleanup failures: false
2016-09-11 14:07:57,270 [Thread-214] INFO org.apache.hadoop.mapred.LocalJobRunner - OutputCommitter is org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputCommitter
2016-09-11 14:07:57,271 [Thread-214] INFO org.apache.hadoop.mapred.LocalJobRunner - Waiting for map tasks
2016-09-11 14:07:57,272 [LocalJobRunner Map Task Executor #0] INFO org.apache.hadoop.mapred.LocalJobRunner - Starting task: attempt_local360376249_0009_m_000000_0
2016-09-11 14:07:57,277 [LocalJobRunner Map Task Executor #0] INFO org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter - File Output Committer Algorithm version is 1
2016-09-11 14:07:57,277 [LocalJobRunner Map Task Executor #0] INFO org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter - FileOutputCommitter skip cleanup _temporary folders under output directory:false, ignore cleanup failures: false
2016-09-11 14:07:57,277 [LocalJobRunner Map Task Executor #0] INFO org.apache.hadoop.mapred.Task - Using ResourceCalculatorProcessTree : [ ]
2016-09-11 14:07:57,278 [LocalJobRunner Map Task Executor #0] INFO org.apache.hadoop.mapred.MapTask - Processing split: Number of splits :1
Total Length = 2416
Input split[0]:
Length = 2416
ClassName: org.apache.hadoop.mapreduce.lib.input.FileSplit
Locations:
-----------------------
2016-09-11 14:07:57,282 [LocalJobRunner Map Task Executor #0] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigRecordReader - Current split being processed file:/root/PIG/PIG/sample.json:0+2416
2016-09-11 14:07:57,282 [LocalJobRunner Map Task Executor #0] INFO org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter - File Output Committer Algorithm version is 1
2016-09-11 14:07:57,282 [LocalJobRunner Map Task Executor #0] INFO org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter - FileOutputCommitter skip cleanup _temporary folders under output directory:false, ignore cleanup failures: false
2016-09-11 14:07:57,288 [LocalJobRunner Map Task Executor #0] INFO org.apache.pig.data.SchemaTupleBackend - Key [pig.schematuple] was not set... will not generate code.
2016-09-11 14:07:57,290 [LocalJobRunner Map Task Executor #0] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapOnly$Map - Aliases being processed per job phase (AliasName[line,offset]): M: twitter[20,10],B[21,4] C: R:
2016-09-11 14:07:57,291 [Thread-214] INFO org.apache.hadoop.mapred.LocalJobRunner - map task executor complete.
2016-09-11 14:07:57,296 [Thread-214] WARN org.apache.hadoop.mapred.LocalJobRunner - job_local360376249_0009
java.lang.Exception: java.lang.IncompatibleClassChangeError: Found interface org.apache.hadoop.mapreduce.Counter, but class was expected
at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462)
at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:522)
Caused by: java.lang.IncompatibleClassChangeError: Found interface org.apache.hadoop.mapreduce.Counter, but class was expected
at com.twitter.elephantbird.pig.util.PigCounterHelper.incrCounter(PigCounterHelper.java:55)
at com.twitter.elephantbird.pig.load.LzoBaseLoadFunc.incrCounter(LzoBaseLoadFunc.java:70)
at com.twitter.elephantbird.pig.load.JsonLoader.getNext(JsonLoader.java:130)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigRecordReader.nextKeyValue(PigRecordReader.java:204)
at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:556)
at org.apache.hadoop.mapreduce.task.MapContextImpl.nextKeyValue(MapContextImpl.java:80)
at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.nextKeyValue(WrappedMapper.java:91)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:243)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
2016-09-11 14:07:57,467 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - HadoopJobId: job_local360376249_0009
2016-09-11 14:07:57,467 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Processing aliases B,twitter
2016-09-11 14:07:57,467 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - detailed locations: M: twitter[20,10],B[21,4] C: R:
2016-09-11 14:07:57,468 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 0% complete
2016-09-11 14:07:57,468 [main] WARN org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Ooops! Some job has failed! Specify -stop_on_failure if you want Pig to stop immediately on failure.
2016-09-11 14:07:57,468 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - job job_local360376249_0009 has failed! Stop running all dependent jobs
2016-09-11 14:07:57,468 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 100% complete
2016-09-11 14:07:57,469 [main] INFO org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize JVM Metrics with processName=JobTracker, sessionId= - already initialized
2016-09-11 14:07:57,469 [main] INFO org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize JVM Metrics with processName=JobTracker, sessionId= - already initialized
2016-09-11 14:07:57,469 [main] ERROR org.apache.pig.tools.pigstats.mapreduce.MRPigStatsUtil - 1 map reduce job(s) failed!
2016-09-11 14:07:57,470 [main] INFO org.apache.pig.tools.pigstats.mapreduce.SimplePigStats - Script Statistics:
HadoopVersion PigVersion UserId StartedAt FinishedAt Features
2.7.1.2.3.4.7-4 0.15.0.2.3.4.7-4 root 2016-09-11 14:07:57 2016-09-11 14:07:57 UNKNOWN
Failed!
Failed Jobs:
JobId Alias Feature Message Outputs
job_local360376249_0009 B,twitter MAP_ONLY Message: Job failed! file:/tmp/temp252944192/tmp-470484503,
Input(s):
Failed to read data from "file:///root/PIG/PIG/sample.json"
Output(s):
Failed to produce result in "file:/tmp/temp252944192/tmp-470484503"
Counters:
Total records written : 0
Total bytes written : 0
Spillable Memory Manager spill count : 0
Total bags proactively spilled: 0
Total records proactively spilled: 0
Job DAG:
job_local360376249_0009
Please also clarify which jar files to use and which versions. I am confused about which versions to pick.
Some people say to use Elephant Bird, and others say to use Avro. I have tried both, but neither of them works.
Please help.
Mohan.V
I solved it on my own.
It was a jar version issue.
script:
REGISTER elephant-bird-core-4.1.jar
REGISTER elephant-bird-pig-4.1.jar
REGISTER elephant-bird-hadoop-compat-4.1.jar
And it worked fine.
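For context, the `IncompatibleClassChangeError` in the question's trace means that code compiled against a version where `org.apache.hadoop.mapreduce.Counter` was a class found an interface at run time, which is what typically happens when a jar built for Hadoop 1.x runs on Hadoop 2.x (the log reports Hadoop 2.7.1). A minimal sketch for checking what a given classpath actually provides is below; the `describe` helper is hypothetical, and it only inspects the JVM it runs in, not the Pig backend.

```java
class ClassKindCheck {
    // Hypothetical helper: reports whether the named type is a class or an interface
    // on the current classpath, without initializing it.
    static String describe(String className) {
        try {
            Class<?> type = Class.forName(className, false, ClassKindCheck.class.getClassLoader());
            return type.isInterface() ? "interface" : "class";
        } catch (ClassNotFoundException e) {
            return "not found";
        }
    }

    public static void main(String[] args) {
        // Defaults to the type named in the stack trace; pass another name as an argument.
        String name = args.length > 0 ? args[0] : "org.apache.hadoop.mapreduce.Counter";
        System.out.println(name + " -> " + describe(name));
    }
}
```

Running it with the Hadoop jars on the classpath shows which kind of `Counter` the Elephant Bird jars will meet; without them it simply reports "not found".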
I wanted to process a Twitter JSON object with Pig using the elephant-bird jars, for which I wrote the Pig script below.
REGISTER '/usr/lib/pig/lib/elephant-bird-hadoop-compat-4.1.jar';
REGISTER '/usr/lib/pig/lib/elephant-bird-pig-4.1.jar';
A = LOAD '/user/flume/tweets/data.json' USING com.twitter.elephantbird.pig.load.JsonLoader('-nestedLoad') AS myMap;
B = FOREACH A GENERATE myMap#'id' AS ID,myMap#'created_at' AS createdAT;
DUMP B;
which gave me the error below:
2015-08-25 11:06:34,295 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - HadoopJobId: job_1439883208520_0177
2015-08-25 11:06:34,295 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Processing aliases A,B
2015-08-25 11:06:34,295 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - detailed locations: M: A[3,4],B[4,4] C: R:
2015-08-25 11:06:34,303 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 0% complete
2015-08-25 11:06:34,303 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Running jobs are [job_1439883208520_0177]
2015-08-25 11:07:06,449 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 50% complete
2015-08-25 11:07:06,449 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Running jobs are [job_1439883208520_0177]
2015-08-25 11:07:09,458 [main] WARN org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Ooops! Some job has failed! Specify -stop_on_failure if you want Pig to stop immediately on failure.
2015-08-25 11:07:09,458 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - job job_1439883208520_0177 has failed! Stop running all dependent jobs
2015-08-25 11:07:09,459 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 100% complete
2015-08-25 11:07:09,667 [main] INFO org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl - Timeline service address: http://trinityhadoopmaster.com:8188/ws/v1/timeline/
2015-08-25 11:07:09,668 [main] INFO org.apache.hadoop.yarn.client.RMProxy - Connecting to ResourceManager at trinityhadoopmaster.com/192.168.1.135:8032
2015-08-25 11:07:09,678 [main] INFO org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=FAILED. Redirecting to job history server
2015-08-25 11:07:09,779 [main] ERROR org.apache.pig.tools.pigstats.PigStats - ERROR 0: java.lang.ClassNotFoundException: org.json.simple.parser.ParseException
2015-08-25 11:07:09,779 [main] ERROR org.apache.pig.tools.pigstats.mapreduce.MRPigStatsUtil - 1 map reduce job(s) failed!
2015-08-25 11:07:09,780 [main] INFO org.apache.pig.tools.pigstats.mapreduce.SimplePigStats - Script Statistics:
HadoopVersion PigVersion UserId StartedAt FinishedAt Features
2.6.0 0.14.0 hdfs 2015-08-25 11:06:33 2015-08-25 11:07:09 UNKNOWN
Failed!
Failed Jobs:
JobId Alias Feature Message Outputs
job_1439883208520_0177 A,B MAP_ONLY Message: Job failed! hdfs://trinityhadoopmaster.com:9000/tmp/temp1554332510/tmp835744559,
Input(s):
Failed to read data from "hdfs://trinityhadoopmaster.com:9000/user/flume/tweets/data.json"
Output(s):
Failed to produce result in "hdfs://trinityhadoopmaster.com:9000/tmp/temp1554332510/tmp835744559"
Counters:
Total records written : 0
Total bytes written : 0
Spillable Memory Manager spill count : 0
Total bags proactively spilled: 0
Total records proactively spilled: 0
Job DAG:
job_1439883208520_0177
2015-08-25 11:07:09,780 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Failed!
2015-08-25 11:07:09,787 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1066: Unable to open iterator for alias B. Backend error : java.lang.ClassNotFoundException: org.json.simple.parser.ParseException
Details at logfile: /tmp/pig-err.log
grunt>
I have no clue how to approach this. Can anyone help me with it?
REGISTER '/tmp/elephant-bird-core-4.1.jar';
REGISTER '/tmp/elephant-bird-pig-4.1.jar';
REGISTER '/tmp/elephant-bird-hadoop-compat-4.1.jar';
REGISTER '/tmp/google-collections-1.0.jar';
REGISTER '/tmp/json-simple-1.1.jar';
It works.
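The `ClassNotFoundException` for `org.json.simple.parser.ParseException` in the question points at the json-simple jar not being registered. A small sketch for probing a set of jars before running the script is below. `ClasspathProbe` and `locate` are hypothetical names, and the check only covers the JVM it runs in (for example, started with the jars from /tmp on its classpath), not the cluster nodes.

```java
import java.security.CodeSource;

class ClasspathProbe {
    // Hypothetical helper: says where a class was loaded from, or "MISSING" if it cannot be loaded.
    static String locate(String className) {
        try {
            Class<?> type = Class.forName(className, false, ClasspathProbe.class.getClassLoader());
            CodeSource source = type.getProtectionDomain().getCodeSource();
            // JDK classes usually have no code source.
            return source == null ? "found (JDK)" : "found in " + source.getLocation();
        } catch (ClassNotFoundException | NoClassDefFoundError e) {
            return "MISSING";
        }
    }

    public static void main(String[] args) {
        // Class names taken from the error message and the LOAD statement in the question.
        String[] required = {
            "org.json.simple.parser.ParseException",
            "com.twitter.elephantbird.pig.load.JsonLoader"
        };
        for (String name : required) {
            System.out.println(name + ": " + locate(name));
        }
    }
}
```

Any line reported as MISSING suggests a jar that still needs a REGISTER statement.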