How do I debug HikariCP losing connections?

I use HikariCP with Play 2.6.10. The application will run fine for days, and then all of our connections will leak. I have leakDetectionThreshold turned on, so we get stack traces for leaked connections like:
2018-01-24 06:29:00,857 - [WARN] - from com.zaxxer.hikari.pool.ProxyLeakTask in HikariPool-2 housekeeper
Connection leak detection triggered for com.mysql.jdbc.JDBC4Connection@65cd084 on thread pool-1-thread-1, stack trace follows
java.lang.Exception: Apparent connection leak detected
at com.zaxxer.hikari.HikariDataSource.getConnection(HikariDataSource.java:85)
at play.api.db.DefaultDatabase.getConnection(Databases.scala:142)
at play.api.db.DefaultDatabase.withConnection(Databases.scala:152)
at play.api.db.DefaultDatabase.withConnection(Databases.scala:148)
at models.summaries.ActionSummary$.listByStripe(ActionSummary.scala:137)
I only use connections through Play's withConnection, so they should be returned to the pool automatically. A thread dump taken while the application is in the broken state shows that all threads inside a withConnection block are stuck on...
"application-akka.mysql-context-122" #32142 prio=5 os_prio=0 tid=0x00007fca7812a
000 nid=0x28e6 waiting on condition [0x00007fca7541c000]
java.lang.Thread.State: TIMED_WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for (a java.util.concurrent.Sync
hronousQueue$TransferQueue)
at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215
)
at java.util.concurrent.SynchronousQueue$TransferQueue.awaitFulfill(Sync
hronousQueue.java:764)
at java.util.concurrent.SynchronousQueue$TransferQueue.transfer(Synchron
ousQueue.java:695)
at java.util.concurrent.SynchronousQueue.poll(SynchronousQueue.java:941)
at com.zaxxer.hikari.util.ConcurrentBag.borrow(ConcurrentBag.java:157)
at com.zaxxer.hikari.pool.HikariPool.getConnection(HikariPool.java:165)
at com.zaxxer.hikari.pool.HikariPool.getConnection(HikariPool.java:147)
at com.zaxxer.hikari.HikariDataSource.getConnection(HikariDataSource.java:85)
at play.api.db.DefaultDatabase.getConnection(Databases.scala:142)
at play.api.db.DefaultDatabase.withConnection(Databases.scala:152)
at play.api.db.DefaultDatabase.withConnection(Databases.scala:148)
...waiting for a connection to become available, which should mean that no thread is currently holding a connection. I have no idea how any connection could possibly be leaked, but apparently all of them have. We see logging lines like:
2018-01-24 06:19:21,297 - [DEBUG] - from com.zaxxer.hikari.pool.HikariPool in application-akka.mysql-context-129
HikariPool-2 - Timeout failure stats (total=10, active=10, idle=0, waiting=15)
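For reference, here is roughly how the relevant pool settings map onto a raw HikariConfig (we configure them through Play's application.conf, but the underlying properties are the same; the URL and values here are illustrative):

import com.zaxxer.hikari.HikariConfig;
import com.zaxxer.hikari.HikariDataSource;

public class PoolSetup {
    public static HikariDataSource build() {
        HikariConfig cfg = new HikariConfig();
        cfg.setJdbcUrl("jdbc:mysql://db.example.com:3306/app"); // hypothetical URL
        cfg.setMaximumPoolSize(10);             // matches "total=10" in the stats line above
        cfg.setLeakDetectionThreshold(60_000L); // warn when a connection is held longer than 60 s
        return new HikariDataSource(cfg);
    }
}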
The only unusual thing we are doing is calling setNetworkTimeout on each connection we obtain, sometimes with a timeout as low as 10 seconds. This is done to ensure that queries fail fast if we lose connection to the DB.
I'm not sure what to do next to debug this. It looks like a potential issue between Hikari and Play, or something broken in MySQL's setNetworkTimeout handling.
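One thing I'm considering is saving and restoring the driver's original network timeout around each unit of work, so the connection goes back to the pool in the state HikariCP expects. A rough sketch against plain JDBC (the wrapper class and its functional interface are mine, not Play's API):

import java.sql.Connection;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import javax.sql.DataSource;

public final class NetworkTimeoutGuard {
    // One shared executor for the driver's timeout-enforcement callbacks.
    private static final ExecutorService TIMEOUT_EXEC = Executors.newCachedThreadPool();

    @FunctionalInterface
    public interface SqlFunction<T> { T apply(Connection c) throws Exception; }

    public static <T> T withShortTimeout(DataSource ds, int timeoutMs, SqlFunction<T> body) throws Exception {
        try (Connection c = ds.getConnection()) { // try-with-resources: always returned to the pool
            int original = c.getNetworkTimeout();
            c.setNetworkTimeout(TIMEOUT_EXEC, timeoutMs);
            try {
                return body.apply(c);
            } finally {
                // restore, so the pool's own validation/close calls are not cut short
                c.setNetworkTimeout(TIMEOUT_EXEC, original);
            }
        }
    }
}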

Related

Gatling Runtime Exception from Jenkins: java.io.IOException: An existing connection was forcibly closed by the remote host

I get the below exception while I run the Gatling (version 2.2.5) test through a Jenkins job at 5 TPS - rampUsersPerSec(0.1) to 5 during(5 minutes), constantUsersPerSec(5) during(15 minutes); the exception starts after 15 minutes of the test run.
However, I do not see this exception while the test runs at 0.2 TPS - rampUsersPerSec(0.1) to 0.2 during(5 minutes), constantUsersPerSec(0.2) during(15 minutes).
Jenkins (version 2.66) runs on Windows 7 (64-bit OS) with 4 GB RAM, on Tomcat.
Could this be the limitation and the cause of this exception?
What's triggering this, and how do I get it resolved?
Is there any Gatling Jenkins host (load generator) specification that I should look at?
Appreciate your inputs!
22:20:28.946 [DEBUG] i.g.h.a.AsyncHandler - Request 'T013_GET_Updates' failed for user 601
java.io.IOException: An existing connection was forcibly closed by the remote host
at sun.nio.ch.SocketDispatcher.read0(Native Method)
at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:43)
at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
at sun.nio.ch.IOUtil.read(IOUtil.java:192)
at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:380)
at io.netty.buffer.PooledUnsafeDirectByteBuf.setBytes(PooledUnsafeDirectByteBuf.java:221)
at io.netty.buffer.AbstractByteBuf.writeBytes(AbstractByteBuf.java:899)
at io.netty.channel.socket.nio.NioSocketChannel.doReadBytes(NioSocketChannel.java:276)
at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:119)
at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:643)
at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:566)
at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:480)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:442)
at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:131)
at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:144)
at java.lang.Thread.run(Thread.java:748)
It seems the port you are using is already occupied. Use a specific port for the host or remote machine.
Cheers,
Peekay

Cassandra down after high traffic

I have a problem with Cassandra when the traffic goes high... Cassandra crashes.
Here's what I got in system.log:
WARN 11:54:35 JNA link failure, one or more native method will be unavailable.
WARN 11:54:35 jemalloc shared library could not be preloaded to speed up memory allocations
WARN 11:54:35 JMX is not enabled to receive remote connections. Please see cassandra-env.sh for more info.
INFO 11:54:35 Initializing SIGAR library
INFO 11:54:35 Checked OS settings and found them configured for optimal performance.
INFO 11:54:35 Initializing system.schema_triggers
ERROR 11:54:36 Exiting due to error while processing commit log during initialization.
java.lang.InternalError: a fault occurred in a recent unsafe memory access operation in compiled Java code
at org.apache.cassandra.db.commitlog.CommitLogDescriptor.writeHeader(CommitLogDescriptor.java:87) ~[apache-cassandra-2.2.4.jar:2.2.4]
at org.apache.cassandra.db.commitlog.CommitLogSegment.<init>(CommitLogSegment.java:153) ~[apache-cassandra-2.2.4.jar:2.2.4]
at org.apache.cassandra.db.commitlog.MemoryMappedSegment.<init>(MemoryMappedSegment.java:47) ~[apache-cassandra-2.2.4.jar:2.2.4]
at org.apache.cassandra.db.commitlog.CommitLogSegment.createSegment(CommitLogSegment.java:121) ~[apache-cassandra-2.2.4.jar:2.2.4]
at org.apache.cassandra.db.commitlog.CommitLogSegmentManager$1.runMayThrow(CommitLogSegmentManager.java:122) ~[apache-cassandra-2.2.4.jar:2.2.4]
at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) [apache-cassandra-2.2.4.jar:2.2.4]
at java.lang.Thread.run(Thread.java:745) [na:1.7.0_80]
and in debug.log
DEBUG [SharedPool-Worker-1] 2017-05-25 12:54:18,586 SliceQueryPager.java:92 - Querying next page of slice query; new filter: SliceQueryFilter [reversed=false, slices=[[, ]], count=5000, toGroup = 2]
WARN [SharedPool-Worker-2] 2017-05-25 12:54:18,658 SliceQueryFilter.java:307 - Read 2129 live and 27677 tombstone cells in RestCommSMSC.SLOT_MESSAGES_TABLE_2017_05_25 for key: 549031460 (see tombstone_warn_threshold). 5000 columns were requested, slices=[-]
DEBUG [SharedPool-Worker-1] 2017-05-25 12:54:18,808 AbstractQueryPager.java:95 - Fetched 2129 live rows
DEBUG [SharedPool-Worker-1] 2017-05-25 12:54:18,808 AbstractQueryPager.java:112 - Got result (2129) smaller than page size (5000), considering pager exhausted
DEBUG [SharedPool-Worker-1] 2017-05-25 12:54:18,808 AbstractQueryPager.java:133 - Remaining rows to page: 2147481518
INFO [main] 2017-05-25 12:54:34,826 YamlConfigurationLoader.java:92 - Loading settings from file:/opt/SMGS/apache-cassandra-2.2.4/conf/cassandra.yaml
INFO [main] 2017-05-25 12:54:34,923 YamlConfigurationLoader.java:135 - Node configuration
[authenticator=AllowAllAuthenticator; authorizer=AllowAllAuthorizer;
auto_snapshot=true; batch_size_fail_threshold_in_kb=50;
batch_size_warn_threshold_in_kb=5; batchlog_replay_throttle_in_kb=1024; cas_contention_timeout_in_ms=1000; client_encryption_options=<REDACTED>; cluster_name=Test Cluster; column_index_size_in_kb=64; commit_failure_policy=stop; commitlog_segment_size_in_mb=32; commitlog_sync=periodic; commitlog_sync_period_in_ms=10000; compaction_large_partition_warning_threshold_mb=100; compaction_throughput_mb_per_sec=16; concurrent_counter_writes=32; concurrent_reads=32; concurrent_writes=32; counter_cache_save_period=7200; counter_cache_size_in_mb=null; counter_write_request_timeout_in_ms=5000; cross_node_timeout=false; disk_failure_policy=stop; dynamic_snitch_badness_threshold=0.1; dynamic_snitch_reset_interval_in_ms=600000; dynamic_snitch_update_interval_in_ms=100; enable_user_defined_functions=false; endpoint_snitch=SimpleSnitch; hinted_handoff_enabled=true; hinted_handoff_throttle_in_kb=1024; incremental_backups=false; index_summary_capacity_in_mb=null; index_summary_resize_interval_in_minutes=60; inter_dc_tcp_nodelay=false; internode_compression=all; key_cache_save_period=14400; key_cache_size_in_mb=null; listen_address=localhost; max_hint_window_in_ms=10800000; max_hints_delivery_threads=2; memtable_allocation_type=heap_buffers; native_transport_port=9042; num_tokens=256; partitioner=org.apache.cassandra.dht.Murmur3Partitioner; permissions_validity_in_ms=2000; range_request_timeout_in_ms=50000; read_request_timeout_in_ms=10000; request_scheduler=org.apache.cassandra.scheduler.NoScheduler; request_timeout_in_ms=50000; role_manager=CassandraRoleManager; roles_validity_in_ms=2000; row_cache_save_period=0; row_cache_size_in_mb=0; rpc_address=localhost; rpc_keepalive=true; rpc_port=9160; rpc_server_type=sync; seed_provider=[{class_name=org.apache.cassandra.locator.SimpleSeedProvider, parameters=[{seeds=127.0.0.1}]}]; server_encryption_options<REDACTED>;snapshot_before_compaction=false; ssl_storage_port=7001; sstable_preemptive_open_interval_in_mb=50; start_native_transport=true; start_rpc=true; storage_port=7000; thrift_framed_transport_size_in_mb=15; tombstone_failure_threshold=100000; tombstone_warn_threshold=5000; tracetype_query_ttl=86400; tracetype_repair_ttl=604800; trickle_fsync=false; trickle_fsync_interval_in_kb=10240; truncate_request_timeout_in_ms=60000; windows_timer_interval=1; write_request_timeout_in_ms=2000]
DEBUG [main] 2017-05-25 12:54:34,958 DatabaseDescriptor.java:296 - Syncing log with a period of 10000
INFO [main] 2017-05-25 12:54:34,958 DatabaseDescriptor.java:304 - DiskAccessMode 'auto' determined to be mmap, indexAccessMode is mmap
INFO [main] 2017-05-25 12:54:35,110 DatabaseDescriptor.java:409 - Global memtable on-heap threshold is enabled at 1991MB
INFO [main] 2017-05-25 12:54:35,110 DatabaseDescriptor.java:413 - Global memtable off-heap threshold is enabled at 1991MB
I don't know whether this problem is related to the commit logs; anyway, in cassandra.yaml I'm setting:
commitlog_sync: periodic
commitlog_sync_period_in_ms: 10000
commitlog_segment_size_in_mb: 32
You can start your Cassandra with this command:
cd /ASMSC02/apache-cassandra-2.0.11/
nohup bin/cassandra
regards,
Hafiz

Catching MySQL Deadlocks in a MDB (ActiveMQ) in JBoss 5.1

We are using JBoss 5.1 w/ MDB backed by ActiveMQ RAR.
When a message consumed from a queue performs database operations that result in a deadlock, the deadlock essentially hoses the entire JBoss instance until it is restarted. By "hosed" I mean that any subsequent messages consumed on that queue all fail with the following exception:
Caused by: javax.persistence.PersistenceException: org.hibernate.exception.GenericJDBCException: Cannot open connection
The deadlock exception never references my code, which in turn makes it very difficult for me to catch and handle.
For example, here is a deadlock exception:
2012-06-18 18:52:19,848 WARN [JDBCExceptionReporter] : SQL Error: 1213, SQLState: 40001
2012-06-18 18:52:19,848 ERROR [JDBCExceptionReporter] : Deadlock found when trying to get lock; try restarting transaction
2012-06-18 18:52:19,850 ERROR [AbstractFlushingEventListener] : Could not synchronize database state with session
org.hibernate.exception.LockAcquisitionException: Could not execute JDBC batch update
at org.hibernate.exception.SQLStateConverter.convert(SQLStateConverter.java:105)
at org.hibernate.exception.JDBCExceptionHelper.convert(JDBCExceptionHelper.java:66)
at org.hibernate.jdbc.AbstractBatcher.executeBatch(AbstractBatcher.java:275)
at org.hibernate.engine.ActionQueue.executeActions(ActionQueue.java:266)
at org.hibernate.engine.ActionQueue.executeActions(ActionQueue.java:168)
at org.hibernate.event.def.AbstractFlushingEventListener.performExecutions(AbstractFlushingEventListener.java:321)
at org.hibernate.event.def.DefaultFlushEventListener.onFlush(DefaultFlushEventListener.java:50)
at org.hibernate.impl.SessionImpl.flush(SessionImpl.java:1027)
at org.hibernate.impl.SessionImpl.managedFlush(SessionImpl.java:365)
at org.hibernate.ejb.AbstractEntityManagerImpl$1.beforeCompletion(AbstractEntityManagerImpl.java:504)
at com.arjuna.ats.internal.jta.resources.arjunacore.SynchronizationImple.beforeCompletion(SynchronizationImple.java:101)
at com.arjuna.ats.arjuna.coordinator.TwoPhaseCoordinator.beforeCompletion(TwoPhaseCoordinator.java:269)
at com.arjuna.ats.arjuna.coordinator.TwoPhaseCoordinator.end(TwoPhaseCoordinator.java:89)
at com.arjuna.ats.arjuna.AtomicAction.commit(AtomicAction.java:177)
at com.arjuna.ats.internal.jta.transaction.arjunacore.TransactionImple.commitAndDisassociate(TransactionImple.java:1423)
at com.arjuna.ats.internal.jta.transaction.arjunacore.BaseTransaction.commit(BaseTransaction.java:137)
at com.arjuna.ats.jbossatx.BaseTransactionManagerDelegate.commit(BaseTransactionManagerDelegate.java:75)
at org.jboss.ejb3.mdb.inflow.MessageInflowLocalProxy.endTransaction(MessageInflowLocalProxy.java:435)
at org.jboss.ejb3.mdb.inflow.MessageInflowLocalProxy.finish(MessageInflowLocalProxy.java:314)
at org.jboss.ejb3.mdb.inflow.MessageInflowLocalProxy.after(MessageInflowLocalProxy.java:230)
at org.jboss.ejb3.mdb.inflow.MessageInflowLocalProxy.invoke(MessageInflowLocalProxy.java:136)
at $Proxy677.afterDelivery(Unknown Source)
at org.apache.activemq.ra.MessageEndpointProxy$MessageEndpointAlive.afterDelivery(MessageEndpointProxy.java:128)
at org.apache.activemq.ra.MessageEndpointProxy.afterDelivery(MessageEndpointProxy.java:69)
at org.apache.activemq.ra.ServerSessionImpl.afterDelivery(ServerSessionImpl.java:224)
at org.apache.activemq.ActiveMQSession.run(ActiveMQSession.java:897)
at org.apache.activemq.ra.ServerSessionImpl.run(ServerSessionImpl.java:169)
at org.jboss.resource.work.WorkWrapper.execute(WorkWrapper.java:205)
at org.jboss.util.threadpool.BasicTaskWrapper.run(BasicTaskWrapper.java:260)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:619)
Caused by: java.sql.BatchUpdateException: Deadlock found when trying to get lock; try restarting transaction
at com.mysql.jdbc.PreparedStatement.executeBatchSerially(PreparedStatement.java:2013)
at com.mysql.jdbc.PreparedStatement.executeBatch(PreparedStatement.java:1449)
at com.mysql.jdbc.jdbc2.optional.StatementWrapper.executeBatch(StatementWrapper.java:721)
at org.jboss.resource.adapter.jdbc.WrappedStatement.executeBatch(WrappedStatement.java:774)
at org.hibernate.jdbc.BatchingBatcher.doExecuteBatch(BatchingBatcher.java:70)
at org.hibernate.jdbc.AbstractBatcher.executeBatch(AbstractBatcher.java:268)
... 29 more
Caused by: com.mysql.jdbc.exceptions.jdbc4.MySQLTransactionRollbackException: Deadlock found when trying to get lock; try restarting transaction
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
at com.mysql.jdbc.Util.handleNewInstance(Util.java:407)
at com.mysql.jdbc.Util.getInstance(Util.java:382)
at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:1064)
at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3603)
at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3535)
at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:1989)
at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2150)
at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2626)
at com.mysql.jdbc.PreparedStatement.executeInternal(PreparedStatement.java:2119)
at com.mysql.jdbc.PreparedStatement.executeUpdate(PreparedStatement.java:2415)
at com.mysql.jdbc.PreparedStatement.executeBatchSerially(PreparedStatement.java:1976)
... 34 more
2012-06-18 18:52:19,851 WARN [arjLoggerI18N] : [com.arjuna.ats.arjuna.coordinator.TwoPhaseCoordinator_2] TwoPhaseCoordinator.beforeCompletion - failed for com.arjuna.ats.internal.jta.resources.arjunacore.SynchronizationImple@480671ab
javax.persistence.PersistenceException: org.hibernate.exception.LockAcquisitionException: Could not execute JDBC batch update
at org.hibernate.ejb.AbstractEntityManagerImpl.throwPersistenceException(AbstractEntityManagerImpl.java:614)
at org.hibernate.ejb.AbstractEntityManagerImpl$1.beforeCompletion(AbstractEntityManagerImpl.java:513)
at com.arjuna.ats.internal.jta.resources.arjunacore.SynchronizationImple.beforeCompletion(SynchronizationImple.java:101)
at com.arjuna.ats.arjuna.coordinator.TwoPhaseCoordinator.beforeCompletion(TwoPhaseCoordinator.java:269)
at com.arjuna.ats.arjuna.coordinator.TwoPhaseCoordinator.end(TwoPhaseCoordinator.java:89)
at com.arjuna.ats.arjuna.AtomicAction.commit(AtomicAction.java:177)
at com.arjuna.ats.internal.jta.transaction.arjunacore.TransactionImple.commitAndDisassociate(TransactionImple.java:1423)
at com.arjuna.ats.internal.jta.transaction.arjunacore.BaseTransaction.commit(BaseTransaction.java:137)
at com.arjuna.ats.jbossatx.BaseTransactionManagerDelegate.commit(BaseTransactionManagerDelegate.java:75)
at org.jboss.ejb3.mdb.inflow.MessageInflowLocalProxy.endTransaction(MessageInflowLocalProxy.java:435)
at org.jboss.ejb3.mdb.inflow.MessageInflowLocalProxy.finish(MessageInflowLocalProxy.java:314)
at org.jboss.ejb3.mdb.inflow.MessageInflowLocalProxy.after(MessageInflowLocalProxy.java:230)
at org.jboss.ejb3.mdb.inflow.MessageInflowLocalProxy.invoke(MessageInflowLocalProxy.java:136)
at $Proxy677.afterDelivery(Unknown Source)
at org.apache.activemq.ra.MessageEndpointProxy$MessageEndpointAlive.afterDelivery(MessageEndpointProxy.java:128)
at org.apache.activemq.ra.MessageEndpointProxy.afterDelivery(MessageEndpointProxy.java:69)
at org.apache.activemq.ra.ServerSessionImpl.afterDelivery(ServerSessionImpl.java:224)
at org.apache.activemq.ActiveMQSession.run(ActiveMQSession.java:897)
at org.apache.activemq.ra.ServerSessionImpl.run(ServerSessionImpl.java:169)
at org.jboss.resource.work.WorkWrapper.execute(WorkWrapper.java:205)
at org.jboss.util.threadpool.BasicTaskWrapper.run(BasicTaskWrapper.java:260)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:619)
Caused by: org.hibernate.exception.LockAcquisitionException: Could not execute JDBC batch update
at org.hibernate.exception.SQLStateConverter.convert(SQLStateConverter.java:105)
at org.hibernate.exception.JDBCExceptionHelper.convert(JDBCExceptionHelper.java:66)
at org.hibernate.jdbc.AbstractBatcher.executeBatch(AbstractBatcher.java:275)
at org.hibernate.engine.ActionQueue.executeActions(ActionQueue.java:266)
at org.hibernate.engine.ActionQueue.executeActions(ActionQueue.java:168)
at org.hibernate.event.def.AbstractFlushingEventListener.performExecutions(AbstractFlushingEventListener.java:321)
at org.hibernate.event.def.DefaultFlushEventListener.onFlush(DefaultFlushEventListener.java:50)
at org.hibernate.impl.SessionImpl.flush(SessionImpl.java:1027)
at org.hibernate.impl.SessionImpl.managedFlush(SessionImpl.java:365)
at org.hibernate.ejb.AbstractEntityManagerImpl$1.beforeCompletion(AbstractEntityManagerImpl.java:504)
... 22 more
Caused by: java.sql.BatchUpdateException: Deadlock found when trying to get lock; try restarting transaction
at com.mysql.jdbc.PreparedStatement.executeBatchSerially(PreparedStatement.java:2013)
at com.mysql.jdbc.PreparedStatement.executeBatch(PreparedStatement.java:1449)
at com.mysql.jdbc.jdbc2.optional.StatementWrapper.executeBatch(StatementWrapper.java:721)
at org.jboss.resource.adapter.jdbc.WrappedStatement.executeBatch(WrappedStatement.java:774)
at org.hibernate.jdbc.BatchingBatcher.doExecuteBatch(BatchingBatcher.java:70)
at org.hibernate.jdbc.AbstractBatcher.executeBatch(AbstractBatcher.java:268)
... 29 more
Caused by: com.mysql.jdbc.exceptions.jdbc4.MySQLTransactionRollbackException: Deadlock found when trying to get lock; try restarting transaction
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
at com.mysql.jdbc.Util.handleNewInstance(Util.java:407)
at com.mysql.jdbc.Util.getInstance(Util.java:382)
at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:1064)
at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3603)
at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3535)
at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:1989)
at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2150)
at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2626)
at com.mysql.jdbc.PreparedStatement.executeInternal(PreparedStatement.java:2119)
at com.mysql.jdbc.PreparedStatement.executeUpdate(PreparedStatement.java:2415)
at com.mysql.jdbc.PreparedStatement.executeBatchSerially(PreparedStatement.java:1976)
... 34 more
2012-06-18 18:52:19,912 WARN [TxConnectionManager] : Connection error occured: org.jboss.resource.connectionmanager.TxConnectionManager$TxConnectionEventListener@6acc2da9[state=NORMAL mc=org.jboss.resource.adapter.jdbc.xa.XAManagedConnection@2c9e906 handles=0 lastUse=1340059939649 permit=true trackByTx=true mcp=org.jboss.resource.connectionmanager.JBossManagedConnectionPool$OnePool@10015060 context=org.jboss.resource.connectionmanager.InternalManagedConnectionPool@4643d6d5 xaResource=org.jboss.resource.adapter.jdbc.xa.XAManagedConnection@2c9e906 txSync=null]
com.mysql.jdbc.jdbc2.optional.MysqlXAException: XA_RBDEADLOCK: Transaction branch was rolled back: deadlock was detected
at com.mysql.jdbc.jdbc2.optional.MysqlXAConnection.mapXAExceptionFromSQLException(MysqlXAConnection.java:605)
at com.mysql.jdbc.jdbc2.optional.MysqlXAConnection.dispatchCommand(MysqlXAConnection.java:584)
at com.mysql.jdbc.jdbc2.optional.MysqlXAConnection.end(MysqlXAConnection.java:479)
at org.jboss.resource.adapter.jdbc.xa.XAManagedConnection.end(XAManagedConnection.java:246)
at com.arjuna.ats.internal.jta.resources.arjunacore.XAResourceRecord.topLevelAbort(XAResourceRecord.java:396)
at com.arjuna.ats.arjuna.coordinator.BasicAction.doAbort(BasicAction.java:3270)
at com.arjuna.ats.arjuna.coordinator.BasicAction.doAbort(BasicAction.java:3248)
at com.arjuna.ats.arjuna.coordinator.BasicAction.Abort(BasicAction.java:1933)
at com.arjuna.ats.arjuna.coordinator.TwoPhaseCoordinator.end(TwoPhaseCoordinator.java:97)
at com.arjuna.ats.arjuna.AtomicAction.commit(AtomicAction.java:177)
at com.arjuna.ats.internal.jta.transaction.arjunacore.TransactionImple.commitAndDisassociate(TransactionImple.java:1423)
at com.arjuna.ats.internal.jta.transaction.arjunacore.BaseTransaction.commit(BaseTransaction.java:137)
at com.arjuna.ats.jbossatx.BaseTransactionManagerDelegate.commit(BaseTransactionManagerDelegate.java:75)
at org.jboss.ejb3.mdb.inflow.MessageInflowLocalProxy.endTransaction(MessageInflowLocalProxy.java:435)
at org.jboss.ejb3.mdb.inflow.MessageInflowLocalProxy.finish(MessageInflowLocalProxy.java:314)
at org.jboss.ejb3.mdb.inflow.MessageInflowLocalProxy.after(MessageInflowLocalProxy.java:230)
at org.jboss.ejb3.mdb.inflow.MessageInflowLocalProxy.invoke(MessageInflowLocalProxy.java:136)
at $Proxy677.afterDelivery(Unknown Source)
at org.apache.activemq.ra.MessageEndpointProxy$MessageEndpointAlive.afterDelivery(MessageEndpointProxy.java:128)
at org.apache.activemq.ra.MessageEndpointProxy.afterDelivery(MessageEndpointProxy.java:69)
at org.apache.activemq.ra.ServerSessionImpl.afterDelivery(ServerSessionImpl.java:224)
at org.apache.activemq.ActiveMQSession.run(ActiveMQSession.java:897)
at org.apache.activemq.ra.ServerSessionImpl.run(ServerSessionImpl.java:169)
at org.jboss.resource.work.WorkWrapper.execute(WorkWrapper.java:205)
at org.jboss.util.threadpool.BasicTaskWrapper.run(BasicTaskWrapper.java:260)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:619)
2012-06-18 18:52:19,914 INFO [ServerSessionImpl:153] : Endpoint failed to process message. Reason: Endpoint after delivery notification failure
I can catch the subsequent errors (the errors on subsequent messages to the queue):
Caused by: javax.persistence.PersistenceException: org.hibernate.exception.GenericJDBCException: Cannot open connection
But I'm not even sure what to do with it. Maybe I could get a new EntityManager that isn't hosed, but I'm getting it via injection to begin with... The only way I know to fix this error is to restart.
I assume the initial deadlock is happening as part of ending the transaction on the queue, which is why it's not happening in my code. Does anyone have an idea how I can handle this gracefully?
Update:
All DataSources are MySQL XA
In transaction-jboss-beans.xml we have transactionTimeout set to 300
There is no way to handle a container transaction exception gracefully.
The container decides to roll back, whatever the reason, and you have no control over it. A complex option to get notified about such exceptions and rollback events would be to write a JCA connector that enrolls fake resources into the distributed transaction manager for each MDB transaction.
I would recommend investigating what happens and fixing the trouble at its source.
Here is the deadlock scenario; it always involves multiple threads: if two pieces of code use the same entities but in a different order, the first thread locks resource A (a row, a table, or a Java monitor) and waits for B, while the second thread has already locked resource B and now waits for A.
A distributed deadlock between Java monitors and database resources is therefore possible, and that case is complex to diagnose: the whole system is stuck until the transactions time out at both ends, and Java thread dumps as well as database sessions and locks must be scanned.
In your case, since MySQL detects the deadlock itself, only database resources are involved.
To help, I suggest reducing the number of SQL queries run in your batch update for one JMS message, i.e. one transaction. With concurrent messages consumed at the same time, each working on a transaction that applies a large number of rows, a deadlock is more likely to occur.
But it is strange that your JBoss server stays stuck after a rollback caused by a DB-only deadlock:
either you have exhausted the JDBC connections in the DataSource pool,
or there are no more MySQL server-side connections available,
or all message beans are in a deadlock situation too,
or the transaction timeout is too high to get rollbacks in a reasonable time frame.
So either reduce MDB concurrency by downsizing the corresponding pool (see the sketch below), or reduce the number of updates per transaction - maybe even to as little as one update per JMS message...
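For example, with the ActiveMQ resource adapter, MDB concurrency can be reduced through the maxSessions activation property. A rough sketch (queue name is illustrative; verify the property name against your RAR version):

import javax.ejb.ActivationConfigProperty;
import javax.ejb.MessageDriven;
import javax.jms.Message;
import javax.jms.MessageListener;

@MessageDriven(activationConfig = {
    @ActivationConfigProperty(propertyName = "destination", propertyValue = "queue.orders"), // hypothetical queue
    @ActivationConfigProperty(propertyName = "destinationType", propertyValue = "javax.jms.Queue"),
    @ActivationConfigProperty(propertyName = "maxSessions", propertyValue = "1") // serialize consumption
})
public class OrderMdb implements MessageListener {
    public void onMessage(Message message) {
        // keep the number of updates per message small to shrink deadlock windows
    }
}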
I was able to resolve this by isolating each transaction using the following method:
Annotate the MDB with:
@TransactionManagement(value = TransactionManagementType.BEAN)
This tells the container that the bean itself will manage the transaction, not the container (outside of my code). Next, inject the necessary resources:
@Resource
MessageDrivenContext mc;
UserTransaction tx;
Then create and manage my own transactions:
tx = mc.getUserTransaction();
tx.begin();
//Do work
tx.commit();
The above step can be repeated for each block of code.
Using very granular transactions I was able to trace where in my code the lock / race condition was happening. Once I solved that I was able to back out some of the more granular transaction management.
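Putting it together, here is roughly what the bean-managed version looks like, with a bounded retry when a unit of work fails (class name and retry count are illustrative; production code should inspect the cause for MySQL error 1213 / SQLState 40001 before retrying):

import javax.annotation.Resource;
import javax.ejb.MessageDriven;
import javax.ejb.MessageDrivenContext;
import javax.ejb.TransactionManagement;
import javax.ejb.TransactionManagementType;
import javax.jms.Message;
import javax.jms.MessageListener;
import javax.transaction.UserTransaction;

@MessageDriven
@TransactionManagement(TransactionManagementType.BEAN)
public class GranularTxMdb implements MessageListener {

    @Resource
    private MessageDrivenContext mc;

    public void onMessage(Message message) {
        UserTransaction tx = mc.getUserTransaction();
        for (int attempt = 1; attempt <= 3; attempt++) {
            try {
                tx.begin();
                // one small unit of database work per transaction
                tx.commit();
                return;
            } catch (Exception e) {
                try { tx.rollback(); } catch (Exception ignored) { }
                if (attempt == 3) {
                    throw new RuntimeException("giving up after deadlock retries", e);
                }
                // otherwise loop and retry the whole transaction
            }
        }
    }
}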

c3p0 - hibernate - mysql

hibernate 3.6.8 final
c3p0 jar that came with hibernate 3.6.8 package -> c3p0-0.9.1.jar
1
15
40
0
5
2
The app seems to be working fine; however, I get massive log output with the following stack trace:
org.apache.catalina.loader.WebappClassLoader loadClass
INFO: Illegal access: this web application instance has been stopped already. Could not load com.mysql.jdbc.exceptions.jdbc4.MySQLNonTransientConnectionException. The eventual following stack trace is caused by an error thrown for debugging purposes as well as to attempt to terminate the thread which caused the illegal access, and has no functional impact.
java.lang.IllegalStateException
at org.apache.catalina.loader.WebappClassLoader.loadClass(WebappClassLoader.java:1566)
at org.apache.catalina.loader.WebappClassLoader.loadClass(WebappClassLoader.java:1526)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:169)
at com.mysql.jdbc.Util.getInstance(Util.java:386)
at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:1013)
at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:987)
at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:982)
at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:927)
at com.mysql.jdbc.ConnectionImpl.connectOneTryOnly(ConnectionImpl.java:2411)
at com.mysql.jdbc.ConnectionImpl.createNewIO(ConnectionImpl.java:2153)
at com.mysql.jdbc.ConnectionImpl.<init>(ConnectionImpl.java:792)
at com.mysql.jdbc.JDBC4Connection.<init>(JDBC4Connection.java:47)
at sun.reflect.GeneratedConstructorAccessor38.newInstance(Unknown Source)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
at com.mysql.jdbc.Util.handleNewInstance(Util.java:411)
at com.mysql.jdbc.ConnectionImpl.getInstance(ConnectionImpl.java:381)
at com.mysql.jdbc.NonRegisteringDriver.connect(NonRegisteringDriver.java:305)
at com.mchange.v2.c3p0.DriverManagerDataSource.getConnection(DriverManagerDataSource.java:134)
at com.mchange.v2.c3p0.WrapperConnectionPoolDataSource.getPooledConnection(WrapperConnectionPoolDataSource.java:182)
at com.mchange.v2.c3p0.WrapperConnectionPoolDataSource.getPooledConnection(WrapperConnectionPoolDataSource.java:171)
at com.mchange.v2.c3p0.impl.C3P0PooledConnectionPool$1PooledConnectionResourcePoolManager.acquireResource(C3P0PooledConnectionPool.java:152)
at com.mchange.v2.resourcepool.BasicResourcePool.doAcquire(BasicResourcePool.java:1074)
at com.mchange.v2.resourcepool.BasicResourcePool.doAcquireAndDecrementPendingAcquiresWithinLockOnSuccess(BasicResourcePool.java:1061)
at com.mchange.v2.resourcepool.BasicResourcePool.access$800(BasicResourcePool.java:32)
at com.mchange.v2.resourcepool.BasicResourcePool$ScatteredAcquireTask.run(BasicResourcePool.java:1796)
at com.mchange.v2.async.ThreadPoolAsynchronousRunner$PoolThread.run(ThreadPoolAsynchronousRunner.java:620)
Any information on how to remove that INFO log would be very much appreciated, thanks!
UPDATE: Is this a critical error, or should I just ignore it?
After searching the web for this issue, I found similar issues reported by several people. All of them point to a common problem: threads. Basically, if you start new threads in your application (either in your own code or via a third-party tool like Quartz), you have to make sure all of them are stopped appropriately when the application is undeployed from the server. Here are some quotes from the searches:
Mikolaj Rydzewski wrote:
It looks like after webapp's instance has been undeployed, background quartz thread wants to do something and then exception occurs.
Another (and better explanation) on jspwiki.org:
It is possible that this is caused by Tomcat unsuccessfully reloading the web application. The app is unloaded, but all threads don't get shut down properly. As a result, when the threads try to run, they get clobbered by the fact that Tomcat has shut down its classloader, and an error is logged.
So, in order to solve this issue, you have to make sure all threads started by your application are stopped at application undeployment (or redeployment, which is the same thing). You can do this by registering a ServletContextListener in your application and stopping your threads inside its contextDestroyed(ServletContextEvent) method, as in the sketch below.
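As a rough sketch, such a listener could look like this (the thread pool here stands in for whatever threads your application starts):

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import javax.servlet.ServletContextEvent;
import javax.servlet.ServletContextListener;

public class ThreadCleanupListener implements ServletContextListener {
    // Assumed to be the pool your application starts; adapt to your setup.
    private ExecutorService pool;

    public void contextInitialized(ServletContextEvent sce) {
        pool = Executors.newFixedThreadPool(4);
        sce.getServletContext().setAttribute("appThreadPool", pool);
    }

    public void contextDestroyed(ServletContextEvent sce) {
        pool.shutdown(); // stop accepting new tasks
        try {
            if (!pool.awaitTermination(10, TimeUnit.SECONDS)) {
                pool.shutdownNow(); // force-interrupt stragglers
            }
        } catch (InterruptedException e) {
            pool.shutdownNow();
            Thread.currentThread().interrupt();
        }
    }
}

Register it in web.xml with a <listener> element (or with @WebListener on Servlet 3.0+).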
If you are using log4j, change your logging settings to something like this (the word ERROR replaces INFO):
log4j.rootLogger=ERROR, file, stdout
log4j.logger.org.hibernate=ERROR
They are located in the log4j.properties file in your project.
OK, I switched to BoneCP; c3p0 does not really seem to work with Java 6!

Fatal exception in thread CompactionExecutor during Cassandra compaction

I have a Cassandra cluster of 12 nodes on EC2 running cassandra-0.8.2.
During compaction I got the following exception, which caused the seed node to go down.
Below is the exception stack trace:
ERROR [CompactionExecutor:31] 2011-12-16 08:06:02,308 AbstractCassandraDaemon.java (line 139) Fatal exception in thread Thread[CompactionExecutor:31,1,main]
java.io.IOError: java.io.EOFException: EOF after 430959023 bytes out of 778986868
at org.apache.cassandra.io.sstable.SSTableIdentityIterator.<init>(SSTableIdentityIterator.java:149)
at org.apache.cassandra.io.sstable.SSTableIdentityIterator.<init>(SSTableIdentityIterator.java:90)
at org.apache.cassandra.io.sstable.SSTableIdentityIterator.<init>(SSTableIdentityIterator.java:74)
at org.apache.cassandra.io.sstable.SSTableScanner$KeyScanningIterator.next(SSTableScanner.java:179)
at org.apache.cassandra.io.sstable.SSTableScanner$KeyScanningIterator.next(SSTableScanner.java:144)
at org.apache.cassandra.io.sstable.SSTableScanner.next(SSTableScanner.java:136)
at org.apache.cassandra.io.sstable.SSTableScanner.next(SSTableScanner.java:39)
at org.apache.commons.collections.iterators.CollatingIterator.set(CollatingIterator.java:284)
at org.apache.commons.collections.iterators.CollatingIterator.least(CollatingIterator.java:326)
at org.apache.commons.collections.iterators.CollatingIterator.next(CollatingIterator.java:230)
at org.apache.cassandra.utils.ReducingIterator.computeNext(ReducingIterator.java:69)
at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:140)
at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:135)
at org.apache.commons.collections.iterators.FilterIterator.setNextObject(FilterIterator.java:183)
at org.apache.commons.collections.iterators.FilterIterator.hasNext(FilterIterator.java:94)
at org.apache.cassandra.db.compaction.CompactionManager.doCompactionWithoutSizeEstimation(CompactionManager.java:569)
at org.apache.cassandra.db.compaction.CompactionManager.doCompaction(CompactionManager.java:506)
at org.apache.cassandra.db.compaction.CompactionManager$1.call(CompactionManager.java:141)
at org.apache.cassandra.db.compaction.CompactionManager$1.call(CompactionManager.java:107)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
at java.util.concurrent.FutureTask.run(FutureTask.java:138)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
Caused by: java.io.EOFException: EOF after 430959023 bytes out of 778986868
at org.apache.cassandra.io.util.FileUtils.skipBytesFully(FileUtils.java:229)
at org.apache.cassandra.io.sstable.IndexHelper.skipIndex(IndexHelper.java:63)
at org.apache.cassandra.io.sstable.SSTableIdentityIterator.<init>(SSTableIdentityIterator.java:141)
... 23 more
It says it is Caused by: java.io.EOFException.
Is this because of corrupt sstables?
If it is, how do I remove or repair those sstables?
It looks like this is indeed caused by corrupt sstables (which may indicate a hardware problem). My recommendations:
Upgrade to the latest stable 0.8.x version of Cassandra. This will be a drop-in replacement for 0.8.2.
Run "nodetool scrub" on the machine having problems
Review http://www.datastax.com/docs/1.0/install/cluster_init -- I recommend two seed nodes per data center, but remember that a seed node is only consulted when restarting nodes, so it's not a big deal to have one down during normal operation