KCL stops processing data after throwing error Cancelling subscription, and restarting - aws-sdk

KCL logs the following error:
ShardConsumerSubscriber:131 - shardId-000000000000: Last request was dispatched at 2020-04-28T12:57:25.166Z, but no response as of 2020-04-28T12:58:00.435Z (PT35.269S). Cancelling subscription, and restarting.
But it never actually restarts, and no data is processed after that.
Maven dependency used
<dependency>
    <groupId>software.amazon.kinesis</groupId>
    <artifactId>amazon-kinesis-client</artifactId>
    <version>2.2.2</version>
</dependency>
And the Kinesis configuration
KinesisAsyncClient kinesisClient = KinesisAsyncClient.builder()
        .credentialsProvider(new MyCredentialProvider(configVals))
        .region(region)
        .build();

InitialPositionInStreamExtended initialPositionInStreamExtended = InitialPositionInStreamExtended
        .newInitialPosition(InitialPositionInStream.TRIM_HORIZON);

RetrievalConfig retrievalConfig = configsBuilder.retrievalConfig()
        .retrievalSpecificConfig(new PollingConfig(configVals.getStreamName(), kinesisClient)
                .idleTimeBetweenReadsInMillis(10000)
                .maxRecords(50)
                .kinesisRequestTimeout(Duration.ofSeconds(100)));
retrievalConfig.initialPositionInStreamExtended(initialPositionInStreamExtended);

As stated in the issue "KCL 2.0 stops consuming from some shards" (#463), the main reason can be that the record processor is not able to process the input within 35 seconds. This value is hardcoded in ShardConsumer as the MAX_TIME_BETWEEN_REQUEST_RESPONSE constant and was increased to 1 minute in later versions (I'm using KCL version 2.2.10 and it's already 60 seconds there), so bumping the amazon-kinesis-client dependency above from 2.2.2 to 2.2.10 or later raises that limit.
I've also experienced this issue and additionally had to override the Kinesis client's HTTP client settings to set up higher timeouts - something like this:
val kinesisClient = KinesisClientUtil
  .adjustKinesisClientBuilder(
    KinesisAsyncClient.builder.credentialsProvider(credentialsProvider).region(awsRegion)
  )
  .httpClientBuilder(
    NettyNioAsyncHttpClient.builder
      .maxConcurrency(Integer.MAX_VALUE)
      .maxPendingConnectionAcquires(kinesisHttpMaxPendingConnectionAcquires)
      .maxHttp2Streams(kinesisHttpMaxConnections)
      .connectionAcquisitionTimeout(kinesisHttpConnectionAcquisitionTimeout)
      .connectionTimeout(kinesisHttpConnectionTimeout)
      .readTimeout(kinesisHttpReadTimeout)
      .http2Configuration(
        Http2Configuration.builder
          .initialWindowSize(10 * 1024 * 1024)
          .healthCheckPingPeriod(kinesisHttpHealthCheckPingPeriod)
          .maxStreams(kinesisHttpMaxConnections.longValue())
          .build
      )
      .protocol(Protocol.HTTP2)
  )
  .build()
I've set all the timeouts to 120 seconds and it did the trick.

Related

Should pika's channel.basic_publish timeout sooner if there is no network connection?

I'm in the process of upgrading to pika 1.1.0, and performed some sanity testing.
I have:
placed a breakpoint at the following
disconnected the network
stepped over this command
... and no exception was thrown. Is this expected?
channel.basic_publish(
    exchange=EXCHANGE,
    routing_key=ROUTING_KEY,
    body=message,
    properties=pika.BasicProperties(
        delivery_mode=MQ_TRANSIENT_DELIVERY_MODE,
        headers=headers,
    )
)
The connection is created with:
connection = pika.BlockingConnection(pika.ConnectionParameters(
    host=rabbit_config.host,
    credentials=credentials,
    port=rabbit_config.port,
    connection_attempts=1,
    blocked_connection_timeout=10,
    retry_delay=5,
    socket_timeout=20,
    heartbeat=30,
))
Update:
If I call channel.confirm_delivery() before this, I successfully get an AMQPError.
However, this doesn't happen for 60 seconds (which doesn't match my ConnectionParameters). How can I make it notice the connection loss more quickly?

AWS Aurora Serverless - Communication Link Failure

I'm using a MySQL Aurora Serverless cluster (with the Data API enabled) from my Python code and I am getting a communications link failure exception. This usually occurs when the cluster has been dormant for some time.
But once the cluster is active, I get no error. I have to send 3-4 requests every time before it works fine.
Exception detail:
The last packet sent successfully to the server was 0 milliseconds
ago. The driver has not received any packets from the server. An error
occurred (BadRequestException) when calling the ExecuteStatement
operation: Communications link failure
How can I solve this issue? I am using the standard boto3 library.
Here is the reply from AWS Premium Business Support.
Summary: It is an expected behavior
Detailed Answer:
I can see that you receive this error when your Aurora Serverless instance is inactive, and you stop receiving it once your instance is active and accepting connections. Please note that this is expected behavior. In general, Aurora Serverless works differently than provisioned Aurora: in Aurora Serverless, while the cluster is "dormant" it has no compute resources assigned to it, and when a DB connection is received, compute resources are assigned. Because of this behavior, you will have to "wake up" the cluster, and it may take a few minutes for the first connection to succeed, as you have seen.
In order to avoid that, you may consider increasing the timeout on the client side. Also, if you have enabled pause, you may consider disabling it [2]. After disabling pause, you can also adjust the minimum Aurora capacity unit to a higher value to make sure that your cluster always has enough compute resources to serve new connections [3]. Please note that adjusting the minimum ACU might increase the cost of the service [4].
Also note that Aurora Serverless is only recommended for certain workloads [5]. If your workload is highly predictable and your application needs to access the DB on a regular basis, I would recommend you use a provisioned Aurora cluster/instance to ensure high availability for your business.
[2] How Aurora Serverless Works - Automatic Pause and Resume for Aurora Serverless - https://docs.aws.amazon.com/AmazonRDS/latest/AuroraUserGuide/aurora-serverless.how-it-works.html#aurora-serverless.how-it-works.pause-resume
[3] Setting the Capacity of an Aurora Serverless DB Cluster - https://docs.aws.amazon.com/AmazonRDS/latest/AuroraUserGuide/aurora-serverless.setting-capacity.html
[4] Aurora Serverless Price https://aws.amazon.com/rds/aurora/serverless/
[5] Using Amazon Aurora Serverless - Use Cases for Aurora Serverless - https://docs.aws.amazon.com/AmazonRDS/latest/AuroraUserGuide/aurora-serverless.html#aurora-serverless.use-cases
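For reference, the pause and minimum-capacity settings mentioned in that reply can also be changed programmatically. A minimal sketch with boto3, assuming an Aurora Serverless v1 cluster; the cluster identifier and capacity values are placeholders, not taken from the question:

import boto3

rds = boto3.client("rds")

# Disable automatic pause and raise the minimum capacity so the cluster
# keeps compute resources assigned (values here are only illustrative).
rds.modify_db_cluster(
    DBClusterIdentifier="my-serverless-cluster",  # placeholder name
    ScalingConfiguration={
        "AutoPause": False,  # stop the cluster from going dormant
        "MinCapacity": 2,    # keep at least 2 ACUs warm
        "MaxCapacity": 8,
    },
    ApplyImmediately=True,
)

As the support reply notes, keeping capacity warm trades the pay-per-use savings for availability, so this only makes sense for latency-sensitive workloads.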
In case it is useful to someone, this is how I manage retries while Aurora Serverless wakes up.
The client returns a BadRequestException, so boto3 will not retry even if you change the retry config for the client; see https://boto3.amazonaws.com/v1/documentation/api/latest/guide/retries.html.
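To make that concrete, this is the kind of per-client retry configuration meant above; it is only a sketch, and as stated it does not help here, because BadRequestException is not one of the error codes botocore retries:

import boto3
from botocore.config import Config

# Even with aggressive retry settings, botocore only retries throttling and
# other designated error codes, so the Data API's BadRequestException
# ("Communications link failure") still surfaces on the first attempt.
rds_data = boto3.client(
    "rds-data",
    config=Config(retries={"max_attempts": 10, "mode": "standard"}),
)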
My first option was to try waiters, but RDSData does not have any waiter. I then tried to create a custom waiter with an error matcher, but that only matches the error code and ignores the message, and because a BadRequestException can also be raised by an error in a SQL statement I needed to validate the message too. So I use a kind of waiter function:
import time
from botocore.exceptions import ClientError

# rds_data is the boto3 'rds-data' client; DB_NAME, CLUSTER_ARN, SECRET_ARN
# and logger are defined elsewhere in the module.

def _wait_for_serverless():
    delay = 5
    max_attempts = 10
    attempt = 0
    while attempt < max_attempts:
        attempt += 1
        try:
            rds_data.execute_statement(
                database=DB_NAME,
                resourceArn=CLUSTER_ARN,
                secretArn=SECRET_ARN,
                sql='SELECT * FROM dummy'  # the Data API parameter is 'sql'
            )
            return
        except ClientError as ce:
            error_code = ce.response.get("Error").get('Code')
            error_msg = ce.response.get("Error").get('Message')
            # Aurora Serverless is waking up
            if error_code == 'BadRequestException' and 'Communications link failure' in error_msg:
                logger.info('Sleeping ' + str(delay) + ' secs, waiting RDS connection')
                time.sleep(delay)
            else:
                raise ce
    raise Exception('Waited for RDS Data but still getting error')
and I use it in this way:
def begin_rds_transaction():
    _wait_for_serverless()
    return rds_data.begin_transaction(
        database=DB_NAME,
        resourceArn=CLUSTER_ARN,
        secretArn=SECRET_ARN
    )
I also got this issue, and taking inspiration from the solution used by Arless and the conversation with Jimbo, came up with the following workaround.
I defined a decorator which retries the serverless RDS request until the configurable retry duration expires.
import logging
import functools
import time

from sqlalchemy import exc

logger = logging.getLogger()


def retry_if_db_inactive(max_attempts, initial_interval, backoff_rate):
    """
    Retry the function if the serverless DB is still in the process of 'waking up'.
    The retry configuration follows the same concepts as AWS Step Function retries.

    :param max_attempts: The maximum number of retry attempts
    :param initial_interval: The initial duration to wait (in seconds) when the first
        'Communications link failure' error is encountered
    :param backoff_rate: The factor used to multiply the previous interval duration to get the next interval
    :return:
    """
    def decorate_retry_if_db_inactive(func):
        @functools.wraps(func)
        def wrapper_retry_if_inactive(*args, **kwargs):
            interval_secs = initial_interval
            attempt = 0
            while attempt < max_attempts:
                attempt += 1
                try:
                    return func(*args, **kwargs)
                except exc.StatementError as err:
                    if hasattr(err.orig, 'response'):
                        error_code = err.orig.response["Error"]['Code']
                        error_msg = err.orig.response["Error"]['Message']
                        # Aurora Serverless is waking up
                        if error_code == 'BadRequestException' and 'Communications link failure' in error_msg:
                            logger.info('Sleeping for ' + str(interval_secs) + ' secs, awaiting RDS connection')
                            time.sleep(interval_secs)
                            interval_secs = interval_secs * backoff_rate
                        else:
                            raise err
                    else:
                        raise err
            raise Exception('Waited for RDS Data but still getting error')
        return wrapper_retry_if_inactive
    return decorate_retry_if_db_inactive
which can then be used something like this:
@retry_if_db_inactive(max_attempts=4, initial_interval=10, backoff_rate=2)
def insert_alert_to_db(sqs_alert):
    with db_session_scope() as session:
        # your db code
        session.add(sqs_alert)
    return None
Please note I'm using SQLAlchemy, so the code would need tweaking to suit specific purposes, but hopefully it will be useful as a starter.
This may be a little late, but there is a way to deactivate the DORMANT behavior of the database.
When creating the Cluster from the CDK, you can configure an attribute as follows:
new rds.ServerlessCluster(
  this,
  'id',
  {
    engine: rds.DatabaseClusterEngine.AURORA_MYSQL,
    defaultDatabaseName: 'name',
    vpc,
    scaling: {
      autoPause: Duration.millis(0) // Set to 0 to disable
    }
  }
)
The attribute is autoPause. The default value is 5 minutes (Communication link failure message may appear after 5 minutes of not using the DB). The max value is 24 hours. However, you can set the value to 0 and this disables the automatic shutdown. After this, the database will not go to sleep even if there are no connections.
When looking at the configuration in the AWS console (RDS -> Databases -> 'instance' -> Configuration -> Capacity Settings), you'll notice this attribute shown without a value when it is set to 0.
Finally, if you don't want the database to be ON all the time, set your own autoPause value so that it behaves as expected.

Update ttl for all records in aerospike

I was stuck in a situation where I had initialised a namespace with default-ttl set to 30 days. There were about 5 million records with that (30-day) TTL value. Actually, my requirement is that the TTL should be zero (0), but the 30-day TTL was kept in place without anyone noticing.
So now I want to update the previous (old) 5 million records with the new TTL value (zero).
I've checked/tried "set-disable-eviction true", but it is not working; data is still being removed according to the (old) TTL value.
How do I overcome this? (And can I retrieve the removed data? How?)
Can someone help me?
First, eviction and expiration are two different mechanisms. You can disable evictions in various ways, such as the set-disable-eviction config parameter you've used, but you cannot disable the cleanup of expired records. There's a good knowledge base FAQ: What are Expiration, Eviction and Stop-Writes?. Unfortunately, the expired records that have been cleaned up are gone if their void time is in the past. If those records were merely evicted (i.e. removed before their void time due to crossing the namespace high-water mark for memory or disk), you can cold restart your node, and the records with a future TTL will come back. They won't return if they were durably deleted or if their TTL is in the past (such records get skipped).
As for resetting TTLs, the easiest way would be to do this through a record UDF that is applied to all the records in your namespace using a scan.
The UDF for your situation would be very simple:
ttl.lua
function to_zero_ttl(rec)
  local rec_ttl = record.ttl(rec)
  if rec_ttl > 0 then
    record.set_ttl(rec, -1)
    aerospike:update(rec)
  end
end
In AQL:
$ aql
Aerospike Query Client
Version 3.12.0
C Client Version 4.1.4
Copyright 2012-2017 Aerospike. All rights reserved.
aql> register module './ttl.lua'
OK, 1 module added.
aql> execute ttl.to_zero_ttl() on test.foo
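If you prefer not to use aql, the same register-and-scan step can be driven from the Python client. A rough sketch, assuming a local node and that ttl.lua sits in the working directory; the host address is an assumption, while namespace and set mirror the aql session above:

import aerospike

# Connect to a single local node (address is an assumption).
client = aerospike.client({"hosts": [("127.0.0.1", 3000)]}).connect()

# Register the UDF module, then apply to_zero_ttl to every record in test.foo
# via a background scan.
client.udf_put("ttl.lua")
scan = client.scan("test", "foo")
scan.apply("ttl", "to_zero_ttl")
scan_id = scan.execute_background()
print(f"started background scan {scan_id}")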
Using a Python script would be easier if you have more complex logic, with filters etc.
import time

import aerospike
from aerospike_helpers.operations import operations

# client, namespace and set_name are assumed to be set up elsewhere
zero_ttl_operation = [operations.touch(-1)]
query = client.query(namespace, set_name)
query.add_ops(zero_ttl_operation)
policy = {}
job = query.execute_background(policy)
print(f'executing job {job}')
while True:
    response = client.job_info(job, aerospike.JOB_SCAN, policy={'timeout': 60000})
    print(f'job status: {response}')
    if response['status'] != aerospike.JOB_STATUS_INPROGRESS:
        break
    time.sleep(0.5)
Aerospike v6 and Python SDK v7.
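For the "more complex logic, with filters" case mentioned above, the background query can be narrowed with a filter expression. A hedged sketch that reuses client, namespace and set_name from the snippet above; the bin name and value are made-up examples:

from aerospike_helpers import expressions as exp
from aerospike_helpers.operations import operations

# Only touch records whose 'status' bin equals 1 (bin and value are hypothetical).
only_active = exp.Eq(exp.IntBin("status"), 1).compile()

query = client.query(namespace, set_name)
query.add_ops([operations.touch(-1)])
# The compiled expression goes into the policy passed to execute_background.
job = query.execute_background({"expressions": only_active})
print(f"executing filtered job {job}")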

Neo4j server hangs every 2 hours consistently. Please help me understand if something is wrong with the configuration

We have a Neo4j graph database with around 60 million nodes and about as many relationships.
We have been facing consistent packet drops, delays in processing, and a completely hung server after 2 hours. We have to shut down and restart our servers every time this happens, and we are having trouble understanding where we went wrong with our configuration.
We are seeing the following kinds of exceptions in the console.log file -
java.lang.IllegalStateException: s=DISPATCHED i=true a=null o.e.jetty.server.HttpConnection - HttpConnection#609c1158{FILLING}
java.lang.IllegalStateException: s=DISPATCHED i=true a=null o.e.j.util.thread.QueuedThreadPool
java.lang.IllegalStateException: org.eclipse.jetty.util.SharedBlockingCallback$BlockerTimeoutException
o.e.j.util.thread.QueuedThreadPool - Unexpected thread death: org.eclipse.jetty.util.thread.QueuedThreadPool$3#59d5a975 in
qtp1667455214{STARTED,14<=21<=21,i=0,q=58}
org.eclipse.jetty.server.Response - Committed before 500 org.neo4j.server.rest.repr.OutputFormat$1#39beaadf
o.e.jetty.servlet.ServletHandler - /db/data/cypher java.lang.IllegalStateException: Committed at
org.eclipse.jetty.server.Response.resetBuffer(Response.java:1253)
~[jetty-server-9.2.
org.eclipse.jetty.server.HttpChannel - /db/data/cypher java.lang.IllegalStateException: Committed at
org.eclipse.jetty.server.Response.resetBuffer(Response.java:1253)
~[jetty-server-9.2.
org.eclipse.jetty.server.HttpChannel - Could not send response error 500: java.lang.IllegalStateException: Committed
o.e.jetty.server.ServerConnector - Stopped
o.e.jetty.servlet.ServletHandler - /db/data/cypher org.neo4j.graphdb.TransactionFailureException: Transaction was marked
as successful, but unable to commit transaction so rolled back.
We are using Neo4j Enterprise Edition 2.2.5 server in SINGLE/NON-CLUSTER mode on an Azure D-series machine (8-core CPU, 56 GB RAM, Ubuntu 14.04 LTS) with an attached 500 GB data disk.
Here is a snapshot of the sizes of neostore files
8.5G Oct 2 15:48 neostore.propertystore.db
15G Oct 2 15:48 neostore.relationshipstore.db
2.5G Oct 2 15:48 neostore.nodestore.db
6.9M Oct 2 15:48 neostore.relationshipgroupstore.db
3.7K Oct 2 15:07 neostore.schemastore.db
145 Oct 2 15:07 neostore.labeltokenstore.db
170 Oct 2 15:07 neostore.relationshiptypestore.db
The Neo4j configuration is as follows -
Allocated 30GB to file buffer cache (dbms.pagecache.memory=30G)
Allocated 20GB to JVM heap memory (wrapper.java.initmemory=20480, wrapper.java.maxmemory=20480)
Using the default hpc (high performance) cache type.
Forcing the RULE planner by default (dbms.cypher.planner=RULE)
Maximum threads processing queries is 16 (twice the number of cores) - org.neo4j.server.webserver.maxthreads=16
Transaction timeout of 60 seconds - org.neo4j.server.transaction.timeout=60
Guard timeout if query execution time is greater than 10 seconds - org.neo4j.server.webserver.limit.executiontime=10000
Rest of the settings are default
We actually want to set up a cluster of 3 nodes, but before that we want to be sure our basic configuration is correct. Please help us.
--------------------------------------------------------------------------
EDITED to ADD Query Sample
Typically our cypher query frequency is 18K queries in an hour with an average of roughly 5-6 queries a second. There are also times when there are about 80 queries per second.
Our Typical Queries look like the ones below
match (a:TypeA {param:{param}})-[:RELA]->(d:TypeD) with distinct d,a skip {skip} limit 100 optional match (d)-[:RELF]->(c:TypeC)<-[:RELF]-(b:TypeB)<-[:RELB]-(a) with distinct d,a,collect(distinct b.bid) as bids,collect(distinct c.param3) as param3Coll optional match (d)-[:RELE]->(p:TypeE)<-[:RELE]-(b1:TypeB)<-[:RELB]-(a) with distinct d as distD,bids+collect(distinct b1.bid) as tbids,param3Coll,collect(distinct p.param4) as param4Coll optional match (distD)-[:RELC]->(f:TypeF) return id(distD),distD.param5,exists((distD)<-[:RELG]-()) as param6, tbids,param3Coll,param4Coll,collect(distinct id(f)) as fids
match (a:TypeA {param:{param}})-[:RELB]->(b) return count(distinct b)
MATCH (a:TypeA{param:{param}})-[r:RELD]->(a1)-[:RELH]->(h) where r.param1=true with a,a1,h match (h)-[:RELL]->(d:TypeI) where (d.param2/2)%2=1 optional match (a)-[:RELB]-(b)-[:RELM {param3:true}]->(c) return a1.param,id(a1),collect(b.bid),c.param5
match (a:TypeA {param:{param}}) match (a)-[:RELB]->(b) with distinct b,a skip {skip} limit 100 match (a)-[:RELH]->(h1:TypeH) match (b)-[:RELF|RELE]->(x)<-[:RELF|RELE]-(h2:TypeH)<-[:RELH]-(a1) optional match (a1)<-[rd:RELD]-(a) with distinct a1,a,h1,b,h2,rd.param1 as param2,collect(distinct x.param3) as param3s,collect(distinct x.param4) as param4s optional match (a1)-[:RELB]->(b1) where b1.param7 in [0,1] and exists((b1)-[:RELF|RELE]->()<-[:RELF|RELE]-(h1)) with distinct a1,a,b,h2,param2,param3s,param4s,b1,case when param2 then false else case when ((a1.param5 in [2,3] or length(param3s)>0) or (a1.param5 in [1,3] or length(param4s)>0)) then case when b1.param7=0 then false else true end else false end end as param8 MERGE (a)-[r2:RELD]->(a1) on create set r2.param6=true on match set r2.param6=case when param8=true and r2.param9=false then true else false end MERGE (b)-[r3:RELM]->(h2) SET r2.param9=param8, r3.param9=param8
MATCH (a:TypeA {param:{param}})-[:RELI]->(g:TypeG {type:'type1'}) match (g)<-[r:RELI]-(a1:TypeA)-[:RELJ]->(j)-[:RELK]->(g) return distinct g, collect(j.displayName), collect(r.param1), g.gid, collect(a1.param),collect(id(a1))
match (a:TypeA {param:{param}})-[r:RELD {param2:true}]->(a1:TypeA)-[:RELH]->(b:TypeE) remove r.param2 return id(a1),b.displayName, b.firstName,b.lastName
match (a:TypeA {param:{param}})-[:RELA]->(b:TypeB) return a.param1,count(distinct id(b))
MATCH (a:TypeA {param:{param}}) set a.param1=true;
match (a:TypeE)<-[r:RELE]-(b:TypeB) where a.param4 in {param4s} delete r return count(b);
MATCH (a:TypeA {param:{param}}) return id(a);
Adding a few more strange things I have been noticing...
I have stopped all my webservers, so currently there are no incoming requests to Neo4j. However, I see that there are about 40K open file handles in TCP CLOSE_WAIT state, implying the clients have closed their connections because of timeouts while Neo4j has not yet processed and responded to those requests. I also see (from messages.log) that the Neo4j server is still processing queries, and as it does this, the 40K open file handles are slowly reducing. By the time I write this post there are about 27K open file handles in TCP CLOSE_WAIT state.
I also see that the queries are not continuously processed. Every once in a while I see a pause in messages.log, along with these messages about log rotation because of some out-of-order sequence, as below:
Rotating log version:5630
2015-10-04 05:10:42.712+0000 INFO [o.n.k.LogRotationImpl]: Log Rotation [5630]: Awaiting all transactions closed...
2015-10-04 05:10:42.712+0000 INFO [o.n.k.i.s.StoreFactory]: Waiting for all transactions to close... committed: out-of-order-sequence:95494483 [95494476] committing: 95494483 closed: out-of-order-sequence:95494480 [95494246]
2015-10-04 05:10:43.293+0000 INFO [o.n.k.LogRotationImpl]: Log Rotation [5630]: Starting store flush...
2015-10-04 05:10:44.941+0000 INFO [o.n.k.i.s.StoreFactory]: About to rotate counts store at transaction 95494483 to [/datadrive/graph.db/neostore.counts.db.b], from [/datadrive/graph.db/neostore.counts.db.a].
2015-10-04 05:10:44.944+0000 INFO [o.n.k.i.s.StoreFactory]: Successfully rotated counts store at transaction 95494483 to [/datadrive/graph.db/neostore.counts.db.b], from [/datadrive/graph.db/neostore.counts.db.a].
I also see these messages once in a while
2015-10-04 04:59:59.731+0000 DEBUG [o.n.k.EmbeddedGraphDatabase]: NodeCache array:66890956 purge:93 size:1.3485746GiB misses:0.80978173% collisions:1.9829895% (345785) av.purge waits:13 purge waits:0 avg. purge time:110ms
or
2015-10-04 05:10:20.768+0000 DEBUG [o.n.k.EmbeddedGraphDatabase]: RelationshipCache array:66890956 purge:0 size:257.883MiB misses:10.522135% collisions:11.121769% (5442101) av.purge waits:0 purge waits:0 avg. purge time:N/A
All of this is happening when there are no incoming requests and Neo4j is processing the old backlog of 40K requests, as I mentioned above.
Since it is a dedicated server, shouldn't the server be processing the queries continuously without such a large pending queue? Am I missing something here? Please help me.
I didn't go completely over your queries. You should examine each of the queries you send frequently by prefixing them with PROFILE or EXPLAIN to see the query plan and get an idea of how many database accesses they cause.
E.g. the second MATCH in the following query looks expensive, since the two patterns are not connected with each other:
MATCH (a:TypeA{param:{param}})-[r:RELD]->(a1)-[:RELH]->(h) where r.param1=true with a,a1,h match (m)-[:RELL]->(d:TypeI) where (d.param2/2)%2=1 optional match (a)-[:RELB]-(b)-[:RELM {param3:true}]->(c) return a1.param,id(a1),collect(b.bid),c.bPhoto
Also enable garbage collection logging in neo4j-wrapper.conf and check if you're suffering from long pauses. If so, consider reducing the heap size.
It looks like this issue requires more research on your side, but here are some things from my experience.
TL;DR - I had a similar issue with my own unmanaged extension, where transactions were not properly handled.
Language/connector
What language/connector is used in your application?
You should verify that:
If some popular open-source library is used - make sure your application is using the latest version. There may be a bug in your connector.
If you have your own, hand-written solution that works with the REST API - verify that ALL HTTP requests are closed on the client side (see the sketch just below).
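As an illustration of "closed on the client side", a minimal Python sketch against the legacy /db/data/cypher endpoint seen in the logs above; the host, credentials and query are placeholders:

import requests

# Using a session as a context manager guarantees the underlying HTTP
# connections are released when the block exits, even on errors.
with requests.Session() as session:
    resp = session.post(
        "http://localhost:7474/db/data/cypher",       # placeholder host
        json={"query": "MATCH (n) RETURN count(n)"},  # placeholder query
        auth=("neo4j", "password"),                   # placeholder credentials
        timeout=30,
    )
    resp.raise_for_status()
    print(resp.json())

Leaking unclosed requests or responses on the client is one way to end up with the tens of thousands of CLOSE_WAIT sockets described in the question.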
Extension/plugins
It's quite easy to mess things up if custom-written extensions/plugins are used.
What should be checked:
All transactions are always closed (e.g. try-with-resources is used)
Neo4j settings
Verify your server configuration. For example, if you have a large value for org.neo4j.server.transaction.timeout and you don't handle transactions properly on the client side, you can end up with a lot of running transactions.
Monitoring
You are using the Enterprise version, which means you have access to JMX. It's a good idea to check the information about active locks & transactions.
Another Neo4j version
Maybe you can try another Neo4j version, for example 2.3.0-M03.
This will give answers to questions like:
Is this a Neo4j 2.2.5 bug?
Is this a misconfiguration of the existing Neo4j installation?
Linux configuration
Check your Linux configuration.
What is in your /etc/sysctl.conf? Are there any invalid/unrelated settings?
Another server
You can try to spin up another server (e.g. a VM at DigitalOcean), deploy the database there, and load it with Gatling.
Maybe your server has some invalid configuration?
Try to get rid of everything that could be the cause of the problem, to make it easier to find it.

Scala / Slick, "Timeout after 20000ms of waiting for a connection" error

The block of code below has been throwing an error.
Timeout after 20000ms of waiting for a connection.","stackTrace":[{"file":"BaseHikariPool.java","line":228,"className":"com.zaxxer.hikari.pool.BaseHikariPool","method":"getConnection"
Also, my database accesses seem too slow, with each element of xs.map() taking about 1 second. Below, getFutureItem() calls db.run().
xs.map { x =>
  val item: Future[(List[Sometype], List[Tables.myRow])] = getFutureItem(x)
  Await.valueAfter(item, 100.seconds) match {
    case Some(i) => i
    case None => println("Timeout getting items after 100 seconds")
  }
}
Slick logs this with each iteration of an "x" value:
[akka.actor.default-dispatcher-3] [akka://user/IO-HTTP/listener-0/24] Connection was PeerClosed, awaiting TcpConnection termination...
[akka.actor.default-dispatcher-3] [akka://user/IO-HTTP/listener-0/24] TcpConnection terminated, stopping
[akka.actor.default-dispatcher-3] [akka://system/IO-TCP/selectors/$a/0] New connection accepted
[akka.actor.default-dispatcher-7] [akka://user/IO-HTTP/listener-0/25] Dispatching POST request to http://localhost:8080/progress to handler Actor[akka://system/IO-TCP/selectors/$a/26#-934408297]
My configuration:
"com.zaxxer" % "HikariCP" % "2.3.2"
default_db {
url = ...
user = ...
password = ...
queueSize = -1
numThreads = 16
connectionPool = HikariCP
connectionTimeout = 20000
maxConnections = 40
}
Is there anything obvious that I'm doing wrong that is causing these database accesses to be so slow and throw this error? I can provide more information if needed.
EDIT: I have received one recommendation that the issue could be a classloader error, and that I could resolve it by deploying the project as a single .jar, rather than running it with sbt.
EDIT2: After further inspection, it appears that many connections were being left open, which eventually led to no connections being available. This can likely be resolved by calling db.close() to close the connection at the appropriate time.
EDIT3: Solved. The connections made by slick exceeded the max connections allowed by my mysql config.
OP wrote:
EDIT2: After further inspection, it appears that many connections were being left open, which eventually led to no connections being available. This can likely be resolved by calling db.close() to close the connection at the appropriate time.
EDIT3: Solved. The connections made by slick exceeded the max connections allowed by my mysql config.