My jobs throw an exception before the map-reduce steps, but the jobs are not killed. How can I configure Hadoop so that a job is killed after an exception?
Invoking Main class now
Heart beat
Heart beat
Invocation of Main class completed
Oozie Launcher ends
stderr logs
org.springframework.jdbc.CannotGetJdbcConnectionException: Could not get JDBC Connection; nested exception is org.apache.commons.dbcp.SQLNestedException: Cannot create PoolableConnectionFactory (Io exception: Unknown host specified )
at org.springframework.jdbc.datasource.DataSourceUtils.getConnection(DataSourceUtils.java:82)
at org.springframework.jdbc.core.JdbcTemplate.execute(JdbcTemplate.java:577)
at org.springframework.jdbc.core.JdbcTemplate.update(JdbcTemplate.java:792)
at org.springframework.jdbc.core.JdbcTemplate.update(JdbcTemplate.java:815)
at com.seven.crcs.export.dao.ReportDAOImpl.recreateReportEntity(ReportDAOImpl.java:151)
at com.seven.crcs.export.dao.ReportDAOImpl.saveActiveUserCount(ReportDAOImpl.java:93)
at com.seven.crcs.export.ReportJdbcExporter.saveActiveUserCount(ReportJdbcExporter.java:55)
at com.seven.dataprocessor.oc.jobs.reports.export.day.ExportDailyUserReducer.exportUserCounts(ExportDailyUserReducer.java:32)
at com.seven.dataprocessor.oc.jobs.reports.export.ExportActiveUser
org.springframework.jdbc.CannotGetJdbcConnectionException: Could not get JDBC Connection; nested exception is org.apache.commons.dbcp.SQLNestedException: Cannot create PoolableConnectionFactory (Io exception: Unknown host specified )
And
2013-02-28 06:06:46,487 INFO org.apache.hadoop.mapred.JobClient: Task Id : attempt_201302270945_0181_r_000000_0, Status : FAILED
2013-02-28 06:07:00,600 INFO org.apache.hadoop.mapred.JobClient: Task Id : attempt_201302270945_0181_r_000000_1, Status : FAILED
2013-02-28 06:07:16,650 INFO org.apache.hadoop.mapred.JobClient: Task Id : attempt_201302270945_0181_r_000000_2, Status : FAILED
2013-02-28 06:07:31,731 INFO org.apache.hadoop.mapred.JobClient: Job complete: job_201302270945_0181
But the jobs complete with status SUCCEEDED.
Your job was actually terminated, but only after 3 failed attempts of the reduce task (note the _r_ in each ID), as the task attempt IDs show:
attempt_201302270945_0181_r_000000_0
attempt_201302270945_0181_r_000000_1
attempt_201302270945_0181_r_000000_2
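As a rough sketch of how these IDs can be read: a classic (MR1) attempt ID has the form attempt_<jobtracker start>_<job number>_<m|r>_<task number>_<attempt number>, so the trailing 0, 1 and 2 are the three attempts of the same reduce task. The AttemptId class below is a hypothetical helper written for illustration, not part of Hadoop.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper (not a Hadoop class): splits a classic task attempt ID
// of the form attempt_<jobtracker start>_<job number>_<m|r>_<task>_<attempt>.
final class AttemptId {
    private static final Pattern FORMAT =
            Pattern.compile("attempt_(\\d+)_(\\d+)_([mr])_(\\d+)_(\\d+)");

    final String jobId;       // e.g. job_201302270945_0181
    final boolean reduce;     // true for "r", false for "m"
    final int taskNumber;     // index of the task within the job
    final int attemptNumber;  // 0 for the first attempt, 1 for the first retry, ...

    private AttemptId(String jobId, boolean reduce, int taskNumber, int attemptNumber) {
        this.jobId = jobId;
        this.reduce = reduce;
        this.taskNumber = taskNumber;
        this.attemptNumber = attemptNumber;
    }

    static AttemptId parse(String id) {
        Matcher m = FORMAT.matcher(id);
        if (!m.matches()) {
            throw new IllegalArgumentException("Not a task attempt ID: " + id);
        }
        return new AttemptId(
                "job_" + m.group(1) + "_" + m.group(2),
                m.group(3).equals("r"),
                Integer.parseInt(m.group(4)),
                Integer.parseInt(m.group(5)));
    }

    public static void main(String[] args) {
        AttemptId a = parse("attempt_201302270945_0181_r_000000_2");
        System.out.println(a.jobId + " reduce=" + a.reduce + " attempt=" + a.attemptNumber);
    }
}
```

Parsing the last ID from the log gives the same job ID that appears in the "Job complete" line, a reduce task, and attempt number 2, i.e. the third try.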
You can limit the maximum number of attempts for each task either by setting the parameter mapred.reduce.max.attempts to 1 (mapred.map.max.attempts for map tasks) or by calling JobConf#setMaxReduceAttempts(int) (JobConf#setMaxMapAttempts(int) for map tasks).
This will cause the task to fail on the first exception and thus terminate your job a little faster.
When I try to decrypt the encrypted files with the TF PGP task in SSIS, I get the message below from the SQL Agent job: Source: TF PGP Task Description: The Execute method on the task returned error code 0x80131509 (No encryption methods specified). The Execute method must succeed, and indicate the result using an "out" parameter. End Error DTExec: The package execution returned DTSER_FAILURE (1). Started: 1:08:55 PM Finished: 1:09:02 PM Elapsed: 6.609 seconds. The package execution failed. The step failed.
When I execute the same task from the SSIS package, I get the error message below.
I'm getting this error, and I'm trying to find out why it suddenly happened
and, more importantly, how to debug such an error.
What does this line mean:
Error The read operation failed, see inner exception.
Where is this inner exception?
2020-09-30T18:47:22.0199830Z ##[section]Starting: Initialize job
2020-09-30T18:47:22.0201330Z Agent name: 'Hosted Agent'
2020-09-30T18:47:22.0201750Z Agent machine name: 'Mac-1601490664598'
2020-09-30T18:47:22.0202040Z Current agent version: '2.175.2'
2020-09-30T18:47:22.0219900Z Current image version: '20200904.1'
2020-09-30T18:47:22.0229850Z Agent running as: 'runner'
2020-09-30T18:47:22.0293150Z Prepare build directory.
2020-09-30T18:47:22.0595770Z Set build variables.
2020-09-30T18:47:22.0631220Z Download all required tasks.
2020-09-30T18:47:22.0751440Z Downloading task: CmdLine (2.164.2)
2020-09-30T18:48:02.2372880Z Downloading task: UseRubyVersion (0.165.2)
2020-09-30T18:48:48.2651220Z Downloading task: DownloadBuildArtifacts (0.167.2)
2020-09-30T18:51:03.2405560Z ##[warning]Failed to download task 'DownloadBuildArtifacts'. Error The read operation failed, see inner exception.
2020-09-30T18:51:03.2423990Z ##[warning]Inner Exception: {ex.InnerException.Message}
2020-09-30T18:51:03.2428450Z ##[warning]Back off 23.799 seconds before retry.
2020-09-30T18:53:07.4698560Z ##[warning]Failed to download task 'DownloadBuildArtifacts'. Error The read operation failed, see inner exception.
2020-09-30T18:53:07.4701220Z ##[warning]Inner Exception: {ex.InnerException.Message}
2020-09-30T18:53:07.4704340Z ##[warning]Back off 13.329 seconds before retry.
2020-09-30T18:57:08.7191850Z ##[error]The read operation failed, see inner exception.
2020-09-30T18:57:08.7198800Z ##[section]Finishing: Initialize job
You are not the only one who encountered this interruption; see this post.
I reviewed our internal service telemetry log, and the issue you encountered was caused by a service event on our side. https://status.dev.azure.com/_history
Exceptions occurred on our backend starting from 15:23:27 CST, which caused the pipeline interruption you encountered.
how to debug such an error
Normally, it is hard for users to check the inner exception when using the hosted pool. The detailed exception messages are recorded in our backend telemetry log. If you are blocked again in the future and would like to know the detailed message, you can contact our team by clicking the Report outage button mentioned below:
Since the event has now been mitigated, I'm sure your pipelines will work fine if you re-run them.
I have 4 Elastic Beanstalk deployments: 3 are Corretto 8 and the other one is Corretto 11.
On the Corretto 8 deployments, I can set new configuration without issue. On the Corretto 11 instance, however, any attempt to set a new configuration fails and causes a rollback.
The Corretto versions might not be the problem, but they are the only difference I can see. All 4 apps are Spring Boot apps that run as web servers (i.e., embedded Tomcat with exposed web ports). I am trying to set the exact same configuration name and value, and it only fails on the one instance.
The configuration I'm trying to set is pretty simple:
VALIDATE_RENEWALS = true
Even just trying to set DEBUG = true causes a failure and rollback.
I don't see a lot of information from the console about what's failing. Here is the event log:
2020-03-16 13:55:17 UTC-0600 INFO The environment was reverted to the previous configuration setting.
2020-03-16 13:54:45 UTC-0600 ERROR During an aborted deployment, some instances may have deployed the new application version. To ensure all instances are running the same version, re-deploy the appropriate application version.
2020-03-16 13:54:45 UTC-0600 ERROR Failed to deploy configuration.
2020-03-16 13:54:45 UTC-0600 ERROR Unsuccessful command execution on instance id(s) 'i-00553f4ac36afd327'. Aborting the operation.
2020-03-16 13:54:45 UTC-0600 INFO Command execution completed on all instances. Summary: [Successful: 0, Failed: 1].
2020-03-16 13:54:45 UTC-0600 ERROR [Instance: i-00553f4ac36afd327] Command failed on instance. An unexpected error has occurred [ErrorCode: 0000000001].
2020-03-16 13:54:20 UTC-0600 INFO Updating environment XXX's configuration settings.
2020-03-16 13:54:15 UTC-0600 INFO Environment update is starting.
I've also downloaded the full set of logs for the instance and don't see anything obvious. The app stdout doesn't have any errors or exceptions, it just starts normally and then gets terminated. None of the other log files have messages around the times above, so I'm really not sure what else I can look at.
Edit
The times don't line up, but I do see this in the eb-engine.log file:
2020/03/16 17:54:38.508634 [INFO] checking whether command is applicable to this instance...
2020/03/16 17:54:38.508658 [INFO] this command is applicable to the instance, thus instance should execute command
2020/03/16 17:54:38.508665 [INFO] check whether this is an enhanced env...
2020/03/16 17:54:38.508794 [INFO] Executing instruction: StageJavaApplication
2020/03/16 17:54:38.508858 [ERROR] GetArchivedFileType with file /opt/elasticbeanstalk/deployment/app_source_bundle failed with error open /opt/elasticbeanstalk/deployment/app_source_bundle: no such file or directory
2020/03/16 17:54:38.508868 [ERROR] An error occurred during execution of command [config-deploy] - [StageJavaApplication]. Stop running the command. Error: staging java app failed with error GetArchivedFileType with file /opt/elasticbeanstalk/deployment/app_source_bundle failed with error open /opt/elasticbeanstalk/deployment/app_source_bundle: no such file or directory
I am trying to configure a MySQL 8.0.3 database for Liferay 7. I am getting this error while configuring the database.
08:10:26,259 ERROR [http-nio-8080-exec-3][PoolBase:429] HikariPool-5 - Failed to execute isValid() for connection, configure connection test query (null).
24-Oct-2017 08:10:26.260 SEVERE [http-nio-8080-exec-3] org.apache.catalina.core.StandardWrapperValve.invoke Servlet.service() for servlet [Main Servlet] in context with path [] threw exception [Servlet execution threw an exception] with root cause
java.lang.AbstractMethodError
at com.zaxxer.hikari.pool.PoolBase.checkDriverSupport(PoolBase.java:422)
at com.zaxxer.hikari.pool.PoolBase.setupConnection(PoolBase.java:393)
at com.zaxxer.hikari.pool.PoolBase.newConnection(PoolBase.java:351)
at com.zaxxer.hikari.pool.PoolBase.newPoolEntry(PoolBase.java:196)
at com.zaxxer.hikari.pool.HikariPool.createPoolEntry(HikariPool.java:442)
at com.zaxxer.hikari.pool.HikariPool.checkFailFast(HikariPool.java:505)
at com.zaxxer.hikari.pool.HikariPool.<init>(HikariPool.java:113)
This is not the complete stack trace; please let me know if the full stack trace is required.
We are using JBoss 5.1 w/ MDB backed by ActiveMQ RAR.
When a message consumed from a queue performs database operations that result in a deadlock, the deadlock essentially hoses the entire JBoss instance until it is restarted. By hosed, I mean that all subsequent messages consumed on that queue fail with the following exception:
Caused by: javax.persistence.PersistenceException: org.hibernate.exception.GenericJDBCException: Cannot open connection
The deadlock exception never references my code, which makes it very difficult for me to catch and handle.
For example, here is a deadlock exception:
2012-06-18 18:52:19,848 WARN [JDBCExceptionReporter] : SQL Error: 1213, SQLState: 40001
2012-06-18 18:52:19,848 ERROR [JDBCExceptionReporter] : Deadlock found when trying to get lock; try restarting transaction
2012-06-18 18:52:19,850 ERROR [AbstractFlushingEventListener] : Could not synchronize database state with session
org.hibernate.exception.LockAcquisitionException: Could not execute JDBC batch update
at org.hibernate.exception.SQLStateConverter.convert(SQLStateConverter.java:105)
at org.hibernate.exception.JDBCExceptionHelper.convert(JDBCExceptionHelper.java:66)
at org.hibernate.jdbc.AbstractBatcher.executeBatch(AbstractBatcher.java:275)
at org.hibernate.engine.ActionQueue.executeActions(ActionQueue.java:266)
at org.hibernate.engine.ActionQueue.executeActions(ActionQueue.java:168)
at org.hibernate.event.def.AbstractFlushingEventListener.performExecutions(AbstractFlushingEventListener.java:321)
at org.hibernate.event.def.DefaultFlushEventListener.onFlush(DefaultFlushEventListener.java:50)
at org.hibernate.impl.SessionImpl.flush(SessionImpl.java:1027)
at org.hibernate.impl.SessionImpl.managedFlush(SessionImpl.java:365)
at org.hibernate.ejb.AbstractEntityManagerImpl$1.beforeCompletion(AbstractEntityManagerImpl.java:504)
at com.arjuna.ats.internal.jta.resources.arjunacore.SynchronizationImple.beforeCompletion(SynchronizationImple.java:101)
at com.arjuna.ats.arjuna.coordinator.TwoPhaseCoordinator.beforeCompletion(TwoPhaseCoordinator.java:269)
at com.arjuna.ats.arjuna.coordinator.TwoPhaseCoordinator.end(TwoPhaseCoordinator.java:89)
at com.arjuna.ats.arjuna.AtomicAction.commit(AtomicAction.java:177)
at com.arjuna.ats.internal.jta.transaction.arjunacore.TransactionImple.commitAndDisassociate(TransactionImple.java:1423)
at com.arjuna.ats.internal.jta.transaction.arjunacore.BaseTransaction.commit(BaseTransaction.java:137)
at com.arjuna.ats.jbossatx.BaseTransactionManagerDelegate.commit(BaseTransactionManagerDelegate.java:75)
at org.jboss.ejb3.mdb.inflow.MessageInflowLocalProxy.endTransaction(MessageInflowLocalProxy.java:435)
at org.jboss.ejb3.mdb.inflow.MessageInflowLocalProxy.finish(MessageInflowLocalProxy.java:314)
at org.jboss.ejb3.mdb.inflow.MessageInflowLocalProxy.after(MessageInflowLocalProxy.java:230)
at org.jboss.ejb3.mdb.inflow.MessageInflowLocalProxy.invoke(MessageInflowLocalProxy.java:136)
at $Proxy677.afterDelivery(Unknown Source)
at org.apache.activemq.ra.MessageEndpointProxy$MessageEndpointAlive.afterDelivery(MessageEndpointProxy.java:128)
at org.apache.activemq.ra.MessageEndpointProxy.afterDelivery(MessageEndpointProxy.java:69)
at org.apache.activemq.ra.ServerSessionImpl.afterDelivery(ServerSessionImpl.java:224)
at org.apache.activemq.ActiveMQSession.run(ActiveMQSession.java:897)
at org.apache.activemq.ra.ServerSessionImpl.run(ServerSessionImpl.java:169)
at org.jboss.resource.work.WorkWrapper.execute(WorkWrapper.java:205)
at org.jboss.util.threadpool.BasicTaskWrapper.run(BasicTaskWrapper.java:260)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:619)
Caused by: java.sql.BatchUpdateException: Deadlock found when trying to get lock; try restarting transaction
at com.mysql.jdbc.PreparedStatement.executeBatchSerially(PreparedStatement.java:2013)
at com.mysql.jdbc.PreparedStatement.executeBatch(PreparedStatement.java:1449)
at com.mysql.jdbc.jdbc2.optional.StatementWrapper.executeBatch(StatementWrapper.java:721)
at org.jboss.resource.adapter.jdbc.WrappedStatement.executeBatch(WrappedStatement.java:774)
at org.hibernate.jdbc.BatchingBatcher.doExecuteBatch(BatchingBatcher.java:70)
at org.hibernate.jdbc.AbstractBatcher.executeBatch(AbstractBatcher.java:268)
... 29 more
Caused by: com.mysql.jdbc.exceptions.jdbc4.MySQLTransactionRollbackException: Deadlock found when trying to get lock; try restarting transaction
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
at com.mysql.jdbc.Util.handleNewInstance(Util.java:407)
at com.mysql.jdbc.Util.getInstance(Util.java:382)
at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:1064)
at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3603)
at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3535)
at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:1989)
at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2150)
at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2626)
at com.mysql.jdbc.PreparedStatement.executeInternal(PreparedStatement.java:2119)
at com.mysql.jdbc.PreparedStatement.executeUpdate(PreparedStatement.java:2415)
at com.mysql.jdbc.PreparedStatement.executeBatchSerially(PreparedStatement.java:1976)
... 34 more
2012-06-18 18:52:19,851 WARN [arjLoggerI18N] : [com.arjuna.ats.arjuna.coordinator.TwoPhaseCoordinator_2] TwoPhaseCoordinator.beforeCompletion - failed for com.arjuna.ats.internal.jta.resources.arjunacore.SynchronizationImple@480671ab
javax.persistence.PersistenceException: org.hibernate.exception.LockAcquisitionException: Could not execute JDBC batch update
at org.hibernate.ejb.AbstractEntityManagerImpl.throwPersistenceException(AbstractEntityManagerImpl.java:614)
at org.hibernate.ejb.AbstractEntityManagerImpl$1.beforeCompletion(AbstractEntityManagerImpl.java:513)
at com.arjuna.ats.internal.jta.resources.arjunacore.SynchronizationImple.beforeCompletion(SynchronizationImple.java:101)
at com.arjuna.ats.arjuna.coordinator.TwoPhaseCoordinator.beforeCompletion(TwoPhaseCoordinator.java:269)
at com.arjuna.ats.arjuna.coordinator.TwoPhaseCoordinator.end(TwoPhaseCoordinator.java:89)
at com.arjuna.ats.arjuna.AtomicAction.commit(AtomicAction.java:177)
at com.arjuna.ats.internal.jta.transaction.arjunacore.TransactionImple.commitAndDisassociate(TransactionImple.java:1423)
at com.arjuna.ats.internal.jta.transaction.arjunacore.BaseTransaction.commit(BaseTransaction.java:137)
at com.arjuna.ats.jbossatx.BaseTransactionManagerDelegate.commit(BaseTransactionManagerDelegate.java:75)
at org.jboss.ejb3.mdb.inflow.MessageInflowLocalProxy.endTransaction(MessageInflowLocalProxy.java:435)
at org.jboss.ejb3.mdb.inflow.MessageInflowLocalProxy.finish(MessageInflowLocalProxy.java:314)
at org.jboss.ejb3.mdb.inflow.MessageInflowLocalProxy.after(MessageInflowLocalProxy.java:230)
at org.jboss.ejb3.mdb.inflow.MessageInflowLocalProxy.invoke(MessageInflowLocalProxy.java:136)
at $Proxy677.afterDelivery(Unknown Source)
at org.apache.activemq.ra.MessageEndpointProxy$MessageEndpointAlive.afterDelivery(MessageEndpointProxy.java:128)
at org.apache.activemq.ra.MessageEndpointProxy.afterDelivery(MessageEndpointProxy.java:69)
at org.apache.activemq.ra.ServerSessionImpl.afterDelivery(ServerSessionImpl.java:224)
at org.apache.activemq.ActiveMQSession.run(ActiveMQSession.java:897)
at org.apache.activemq.ra.ServerSessionImpl.run(ServerSessionImpl.java:169)
at org.jboss.resource.work.WorkWrapper.execute(WorkWrapper.java:205)
at org.jboss.util.threadpool.BasicTaskWrapper.run(BasicTaskWrapper.java:260)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:619)
Caused by: org.hibernate.exception.LockAcquisitionException: Could not execute JDBC batch update
at org.hibernate.exception.SQLStateConverter.convert(SQLStateConverter.java:105)
at org.hibernate.exception.JDBCExceptionHelper.convert(JDBCExceptionHelper.java:66)
at org.hibernate.jdbc.AbstractBatcher.executeBatch(AbstractBatcher.java:275)
at org.hibernate.engine.ActionQueue.executeActions(ActionQueue.java:266)
at org.hibernate.engine.ActionQueue.executeActions(ActionQueue.java:168)
at org.hibernate.event.def.AbstractFlushingEventListener.performExecutions(AbstractFlushingEventListener.java:321)
at org.hibernate.event.def.DefaultFlushEventListener.onFlush(DefaultFlushEventListener.java:50)
at org.hibernate.impl.SessionImpl.flush(SessionImpl.java:1027)
at org.hibernate.impl.SessionImpl.managedFlush(SessionImpl.java:365)
at org.hibernate.ejb.AbstractEntityManagerImpl$1.beforeCompletion(AbstractEntityManagerImpl.java:504)
... 22 more
Caused by: java.sql.BatchUpdateException: Deadlock found when trying to get lock; try restarting transaction
at com.mysql.jdbc.PreparedStatement.executeBatchSerially(PreparedStatement.java:2013)
at com.mysql.jdbc.PreparedStatement.executeBatch(PreparedStatement.java:1449)
at com.mysql.jdbc.jdbc2.optional.StatementWrapper.executeBatch(StatementWrapper.java:721)
at org.jboss.resource.adapter.jdbc.WrappedStatement.executeBatch(WrappedStatement.java:774)
at org.hibernate.jdbc.BatchingBatcher.doExecuteBatch(BatchingBatcher.java:70)
at org.hibernate.jdbc.AbstractBatcher.executeBatch(AbstractBatcher.java:268)
... 29 more
Caused by: com.mysql.jdbc.exceptions.jdbc4.MySQLTransactionRollbackException: Deadlock found when trying to get lock; try restarting transaction
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
at com.mysql.jdbc.Util.handleNewInstance(Util.java:407)
at com.mysql.jdbc.Util.getInstance(Util.java:382)
at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:1064)
at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3603)
at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3535)
at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:1989)
at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2150)
at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2626)
at com.mysql.jdbc.PreparedStatement.executeInternal(PreparedStatement.java:2119)
at com.mysql.jdbc.PreparedStatement.executeUpdate(PreparedStatement.java:2415)
at com.mysql.jdbc.PreparedStatement.executeBatchSerially(PreparedStatement.java:1976)
... 34 more
2012-06-18 18:52:19,912 WARN [TxConnectionManager] : Connection error occured: org.jboss.resource.connectionmanager.TxConnectionManager$TxConnectionEventListener@6acc2da9[state=NORMAL mc=org.jboss.resource.adapter.jdbc.xa.XAManagedConnection@2c9e906 handles=0 lastUse=1340059939649 permit=true trackByTx=true mcp=org.jboss.resource.connectionmanager.JBossManagedConnectionPool$OnePool@10015060 context=org.jboss.resource.connectionmanager.InternalManagedConnectionPool@4643d6d5 xaResource=org.jboss.resource.adapter.jdbc.xa.XAManagedConnection@2c9e906 txSync=null]
com.mysql.jdbc.jdbc2.optional.MysqlXAException: XA_RBDEADLOCK: Transaction branch was rolled back: deadlock was detected
at com.mysql.jdbc.jdbc2.optional.MysqlXAConnection.mapXAExceptionFromSQLException(MysqlXAConnection.java:605)
at com.mysql.jdbc.jdbc2.optional.MysqlXAConnection.dispatchCommand(MysqlXAConnection.java:584)
at com.mysql.jdbc.jdbc2.optional.MysqlXAConnection.end(MysqlXAConnection.java:479)
at org.jboss.resource.adapter.jdbc.xa.XAManagedConnection.end(XAManagedConnection.java:246)
at com.arjuna.ats.internal.jta.resources.arjunacore.XAResourceRecord.topLevelAbort(XAResourceRecord.java:396)
at com.arjuna.ats.arjuna.coordinator.BasicAction.doAbort(BasicAction.java:3270)
at com.arjuna.ats.arjuna.coordinator.BasicAction.doAbort(BasicAction.java:3248)
at com.arjuna.ats.arjuna.coordinator.BasicAction.Abort(BasicAction.java:1933)
at com.arjuna.ats.arjuna.coordinator.TwoPhaseCoordinator.end(TwoPhaseCoordinator.java:97)
at com.arjuna.ats.arjuna.AtomicAction.commit(AtomicAction.java:177)
at com.arjuna.ats.internal.jta.transaction.arjunacore.TransactionImple.commitAndDisassociate(TransactionImple.java:1423)
at com.arjuna.ats.internal.jta.transaction.arjunacore.BaseTransaction.commit(BaseTransaction.java:137)
at com.arjuna.ats.jbossatx.BaseTransactionManagerDelegate.commit(BaseTransactionManagerDelegate.java:75)
at org.jboss.ejb3.mdb.inflow.MessageInflowLocalProxy.endTransaction(MessageInflowLocalProxy.java:435)
at org.jboss.ejb3.mdb.inflow.MessageInflowLocalProxy.finish(MessageInflowLocalProxy.java:314)
at org.jboss.ejb3.mdb.inflow.MessageInflowLocalProxy.after(MessageInflowLocalProxy.java:230)
at org.jboss.ejb3.mdb.inflow.MessageInflowLocalProxy.invoke(MessageInflowLocalProxy.java:136)
at $Proxy677.afterDelivery(Unknown Source)
at org.apache.activemq.ra.MessageEndpointProxy$MessageEndpointAlive.afterDelivery(MessageEndpointProxy.java:128)
at org.apache.activemq.ra.MessageEndpointProxy.afterDelivery(MessageEndpointProxy.java:69)
at org.apache.activemq.ra.ServerSessionImpl.afterDelivery(ServerSessionImpl.java:224)
at org.apache.activemq.ActiveMQSession.run(ActiveMQSession.java:897)
at org.apache.activemq.ra.ServerSessionImpl.run(ServerSessionImpl.java:169)
at org.jboss.resource.work.WorkWrapper.execute(WorkWrapper.java:205)
at org.jboss.util.threadpool.BasicTaskWrapper.run(BasicTaskWrapper.java:260)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:619)
2012-06-18 18:52:19,914 INFO [ServerSessionImpl:153] : Endpoint failed to process message. Reason: Endpoint after delivery notification failure
I can catch the subsequent errors (the errors on subsequent messages to the queue):
Caused by: javax.persistence.PersistenceException: org.hibernate.exception.GenericJDBCException: Cannot open connection
But I'm not even sure what to do with them. Maybe I could get a new EntityManager that isn't hosed, but I'm getting it via injection to begin with... The only way I know to fix this error is to restart.
I assume that the initial deadlock happens as part of ending the transaction in the queue, which is why it doesn't happen in my code, but does anyone have an idea of how I can handle this gracefully?
Update:
All DataSources are MySQL XA
In transaction-jboss-beans.xml we have transactionTimeout set to 300
There is no way to handle a container transaction exception gracefully.
The container decides to roll back, whatever the reason is, and you have no control over it. A complex option for getting notified about such exceptions and rollback events may be to write a JCA connector that enlists fake resources in the distributed transaction manager for each MDB transaction.
I would recommend that you investigate what happens and fix the trouble at its source.
Here is a deadlock scenario; it always involves multiple threads: if two pieces of code use the same entities but in a different order, a first thread locks resource A (a row, a table or a Java monitor) and waits for B while a second thread has already locked resource B and now waits for A.
So a distributed deadlock is possible between Java monitors and database resources, and that case is complex to diagnose. The whole system is stuck until the transaction times out at both ends; Java thread dumps and database sessions and locks must be examined.
In your case, as MySQL detects the deadlock itself, it means only database resources are involved.
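The A/B scenario above can be sketched with plain Java locks. This is only an illustration of the general remedy (always acquire shared resources in one fixed order), not code from the application in question; the OrderedLocks and Resource names and the numeric id used for ordering are assumptions made for the example. The same idea applies to database rows: touching them in a consistent order, for instance sorted by primary key, removes the circular wait.

```java
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical example: two threads use the same two resources in opposite
// directions, but the locks are always taken in a fixed global order (by id).
final class OrderedLocks {
    static final class Resource {
        final int id;                                 // defines the global lock order
        int value;
        final ReentrantLock lock = new ReentrantLock();

        Resource(int id, int value) {
            this.id = id;
            this.value = value;
        }
    }

    // Moves amount between resources; the lower id is locked first
    // regardless of the direction of the move, so no circular wait can form.
    static void move(Resource from, Resource to, int amount) {
        Resource first = from.id < to.id ? from : to;
        Resource second = (first == from) ? to : from;
        first.lock.lock();
        try {
            second.lock.lock();
            try {
                from.value -= amount;
                to.value += amount;
            } finally {
                second.lock.unlock();
            }
        } finally {
            first.lock.unlock();
        }
    }

    // Runs one thread moving A -> B and one moving B -> A, then returns both values.
    static int[] run() {
        Resource a = new Resource(1, 1000);
        Resource b = new Resource(2, 1000);
        Thread t1 = new Thread(() -> { for (int i = 0; i < 10000; i++) move(a, b, 1); });
        Thread t2 = new Thread(() -> { for (int i = 0; i < 10000; i++) move(b, a, 1); });
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        return new int[] { a.value, b.value };
    }

    public static void main(String[] args) {
        int[] result = run();
        System.out.println("A=" + result[0] + " B=" + result[1]);
    }
}
```

If move() instead locked from first and to second, the two threads could each hold one lock and wait forever for the other, which is the situation described above.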
To help, I would suggest reducing the number of SQL statements run in the batch update for one JMS message, i.e. one transaction. When concurrent messages are consumed at the same time, each in a transaction touching a large number of rows, a deadlock is more likely to occur.
But it is strange that your JBoss server is stuck after the rollback because of a DB-only deadlock:
either you have exhausted the JDBC connections in the DataSource pool,
or there are no more MySQL server-side connections available,
or all message beans are in a deadlock situation too,
or the transaction timeout is too high to get rollbacks in a reasonable time frame.
So either reduce MDB concurrency by downsizing the corresponding pool, or reduce the number of updates per transaction, maybe even to as few as one update per JMS message.
I was able to resolve this by isolating each transaction using the following method:
Annotate the MDB with:
@TransactionManagement(value = TransactionManagementType.BEAN)
This tells the container that the bean itself will manage the transaction, instead of the container doing it outside of my code. Next, inject the necessary resources:
@Resource
MessageDrivenContext mc;
UserTransaction tx;
Then create and manage my own transactions:
tx = mc.getUserTransaction();
tx.begin();
//Do work
tx.commit();
The above step can be repeated for each block of code.
Using very granular transactions I was able to trace where in my code the lock / race condition was happening. Once I solved that I was able to back out some of the more granular transaction management.
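Once each unit of work runs in its own bean-managed transaction, the hint in the MySQL message ("try restarting transaction") can also be followed literally. The sketch below is an assumption-laden illustration, not what was done above: DeadlockRetry is a hypothetical helper, it assumes the unit of work begins, commits and (on failure) rolls back its own transaction, and it assumes the failure surfaces as an unchecked exception with a java.sql.SQLException carrying SQLState 40001 (the value shown in the log) somewhere in its cause chain.

```java
import java.sql.SQLException;
import java.util.function.Supplier;

// Hypothetical helper: re-runs a unit of work when it fails because the
// database chose it as a deadlock victim (SQLState 40001, as in the log).
final class DeadlockRetry {
    static final String DEADLOCK_SQL_STATE = "40001";

    // Walks the cause chain looking for a SQLException with the deadlock SQLState.
    static boolean isDeadlock(Throwable t) {
        for (Throwable c = t; c != null; c = c.getCause()) {
            if (c instanceof SQLException
                    && DEADLOCK_SQL_STATE.equals(((SQLException) c).getSQLState())) {
                return true;
            }
        }
        return false;
    }

    // Calls unitOfWork up to maxAttempts times; only deadlock failures are retried,
    // anything else (or the last deadlock) is rethrown to the caller.
    static <T> T run(Supplier<T> unitOfWork, int maxAttempts) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        for (int attempt = 1; ; attempt++) {
            try {
                return unitOfWork.get();
            } catch (RuntimeException e) {
                if (attempt >= maxAttempts || !isDeadlock(e)) {
                    throw e;
                }
                // Deadlock victim: the work is assumed to have been rolled back, so try again.
            }
        }
    }
}
```

A caller would wrap the begin / work / commit block in the Supplier and pick a small attempt limit; retrying only hides the symptom, so it complements rather than replaces finding the conflicting statements.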