akka.net first published message ends up in the dead letter queue, handshake problem

I have an issue with an akka.net message sent via Tell that ends up in the dead letter queue.
I developed a cluster-based application using Akka.Cluster.Tools.PublishSubscribe with two ActorSystems, each running in a console application on the same machine.
I start up one actor system with some actors. Then I start up my second application, and immediately after initializing its actor system I publish the first message, Mediator.Tell(new Publish(Topics.Backend.SomeName, new MyInitialMessage())), to a topic whose receiving actor is hosted in the first application.
This message always ends up in the dead letter queue.
If, instead of sending the message immediately, I add a delay of e.g. 5 seconds, the message is delivered properly.
This looks like a handshake problem to me.
Question: how do I find out when the second actor system is ready to receive messages?
My current workaround: I use the scheduler to send MyInitialMessage every second and wait for the first response message from my second application. Then I know that the second app is ready and the handshake is done.
But this seems to be just a workaround. What would be a proper solution to this issue?
chris

Akka.Cluster.Tools.PublishSubscribe works on top of the cluster, so you need to wait for the cluster to initialize before you can publish any messages. All cluster operations are encapsulated in the Cluster class, which can be obtained from any actor system using Cluster.Get(actorSystem). There are several ways to wait for the cluster to initialize:
You can join the cluster programmatically with await cluster.JoinAsync(address, cancellationToken). This works both for seed nodes (just make the actor system join itself) and for new nodes. It requires leaving seed-nodes empty in your HOCON configuration.
If you're initializing the cluster from configuration (a HOCON config file), you can register a callback with cluster.RegisterOnMemberUp(callback) to postpone the rest of the processing until the local actor system has successfully joined the cluster.
The fastest way (in terms of performance and resource usage) is to subscribe to cluster membership events from within a particular actor. In fact, this is how the other solutions described above are implemented under the hood:
class MyActor : ReceiveActor
{
readonly Cluster cluster = Akka.Cluster.Cluster.Get(Context.System);
public MyActor()
{
Receive<ClusterEvent.MemberUp>(up =>
{
if (up.Member.Address == cluster.SelfAddress)
{
Become(Ready);
}
});
}
protected override void PreStart()
{
cluster.Subscribe(Self, new[]{ typeof(ClusterEvent.IMemberEvent) });
}
protected override void PostStop()
{
// remember to unsubscribe once the actor is stopping
cluster.Unsubscribe(Self);
}
void Ready()
{
// other receiver handlers
}
}

Related

Notify subscribers after new messages have stopped coming in

In an app where users are expected to make several changes in a short period of time, I'd like to use a message queue to collect these events, and only notify listeners when new changes have stopped coming in for some period X.
The expected workflow would be:
User makes an edit -> message added to queue
User makes another edit -> message added to queue
Some time passes
Consumer is notified of all pending changes
I've looked into documentation for several different message queues, but none of them seem to have this kind of message batching out of the box.
I did find some features that might help to roll my own, e.g. Kafka has a producer config called linger that tells it to wait X ms for more messages to add to a batch, but this is clearly intended as a performance improvement. In addition, this option is at the producer side, whereas for my use case it would make more sense on the consumer side.
Is this a use case message queues can support? The lack of results makes me think that I may be trying to use message queues wrong.
Queues are not a good fit for such use cases. I would recommend using Cadence Workflow to implement your logic with minimal effort.
Here is a straw-man design that satisfies your requirements:
Send a signalWithStart request containing the edit information to a user workflow, using the userID as the workflow ID. It either delivers the signal to the existing workflow or first starts the workflow and then delivers the signal to it.
All requests to that workflow are buffered by it. Cadence provides a hard guarantee that only one workflow with a given ID can exist in the open state, so all signals (events) are guaranteed to be buffered in the workflow that belongs to the user.
After a configured timeout, an activity that notifies users about the pending changes is invoked.
The pending changes are applied by the next activity.
The workflow completes.
Here is the workflow code that implements it in Java (Go client is also supported):
public interface BufferedEditsWorkflow {
@WorkflowMethod
void execute(String userId, Duration notifyAfter, Edit firstEdit);
@SignalMethod
void addEdit(Edit edit);
}
public interface BufferedEditsActivities {
void notifyUser(String userId, List<Edit> edits);
void process(String userId, List<Edit> edits);
}
public class BufferedEditsWorkflowImpl implements BufferedEditsWorkflow {
private final List<Edit> edits = new ArrayList<>();
private final BufferedEditsActivities activities = Workflow.newActivityStub(BufferedEditsActivities.class);
@Override
public void execute(String userId, Duration notifyAfter, Edit firstEdit)
{
edits.add(firstEdit);
// Cadence doesn't have limit on sleep duration.
// It can sleep at this line for a year with no problem.
Workflow.sleep(notifyAfter);
activities.notifyUser(userId, edits);
activities.process(userId, edits);
}
@Override
public void addEdit(Edit edit) {
edits.add(edit);
}
}
Code that starts the workflow for the first edit:
private void addFirstEdit(WorkflowClient cadenceClient, Edit edit) {
WorkflowOptions options = new WorkflowOptions.Builder().setWorkflowId(edit.getUserId()).build();
BufferedEditsWorkflow workflow = cadenceClient.newWorkflowStub(BufferedEditsWorkflow.class, options);
workflow.execute(edit.getUserId(), Duration.ofHours(1), edit);
}
Code that adds more edits:
private void addEdit(WorkflowClient cadenceClient, Edit edit) {
WorkflowOptions options = new WorkflowOptions.Builder().setWorkflowId(edit.getUserId()).build();
BufferedEditsWorkflow workflow = cadenceClient.newWorkflowStub(BufferedEditsWorkflow.class, options);
workflow.addEdit(edit);
}
Cadence offers a lot of other advantages over using queues for task processing:
Built-in exponential retries with an unlimited expiration interval.
Failure handling. For example, it allows executing a task that notifies another service if both updates couldn't succeed within a configured interval.
Support for long-running heartbeating operations.
Ability to implement complex task dependencies, for example chaining of calls or compensation logic in case of unrecoverable failures (SAGA).
Complete visibility into the current state of the update. With queues, all you know is whether there are messages in a queue, and you need an additional DB to track the overall progress. With Cadence, every event is recorded.
Ability to cancel an update in flight.
See the presentation that goes over Cadence programming model.
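For comparison, the quiet-period behaviour described in the question (notify only once no new edit has arrived for some period X) can be sketched in plain Java without any workflow engine. This is only an in-memory illustration, not Cadence code and not durable: QuietPeriodBuffer, its constructor parameters, and the listener callback are hypothetical names introduced here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

// Hypothetical helper (not part of Cadence): buffers items in memory and hands
// them to the listener once no new item has arrived for quietMillis.
class QuietPeriodBuffer<T> implements AutoCloseable {
    private final ScheduledExecutorService timer = Executors.newSingleThreadScheduledExecutor();
    private final long quietMillis;
    private final Consumer<List<T>> listener;
    private List<T> pending = new ArrayList<>();
    private long generation = 0; // incremented on every add()

    QuietPeriodBuffer(long quietMillis, Consumer<List<T>> listener) {
        this.quietMillis = quietMillis;
        this.listener = listener;
    }

    // Record an edit and (re)start the quiet-period countdown.
    synchronized void add(T item) {
        pending.add(item);
        final long myGeneration = ++generation;
        timer.schedule(() -> flushIfQuiet(myGeneration), quietMillis, TimeUnit.MILLISECONDS);
    }

    // Flush only if no newer edit arrived since this countdown was started.
    private void flushIfQuiet(long expectedGeneration) {
        List<T> batch;
        synchronized (this) {
            if (expectedGeneration != generation) {
                return; // a newer edit arrived; its own countdown will flush
            }
            batch = pending;
            pending = new ArrayList<>();
        }
        listener.accept(batch);
    }

    @Override
    public void close() {
        timer.shutdownNow();
    }

    public static void main(String[] args) throws Exception {
        try (QuietPeriodBuffer<String> buffer =
                 new QuietPeriodBuffer<>(200, batch -> System.out.println("notify: " + batch))) {
            buffer.add("edit 1");
            buffer.add("edit 2");
            Thread.sleep(500); // longer than the quiet period, so the batch is flushed
        }
    }
}
```

Unlike the workflow above, which sleeps for a fixed duration after the first edit, this sketch restarts the countdown on every edit. It also loses pending edits if the process dies, which is the kind of gap a durable workflow is meant to close.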

How to set up Tomcat for one Database Connection per Request

I have a Sparkjava app which I have deployed on a Tomcat server. It uses SQL2O to interface with the MySQL-database. After some time I start to have trouble connecting to the database. I've tried connecting directly from SQL2O, connecting through HikariCP and connecting through JNDI. They all work for about a day, before I start getting Communications link failure. This app gets hit a handful of times a day at best, so performance is a complete non issue. I want to configure the app to use one database connection per request. How do I go about that?
The app doesn't come online again afterwards until I redeploy it (overwrite ROOT.war again). Restarting tomcat or the entire server does nothing.
Currently every request creates a new Sql2o object and executes the query using withConnection. I'd be highly surprised if I was leaking any connections.
Here's some example code (simplified).
public class UserRepository {
static {
try {
Class.forName("com.mysql.jdbc.Driver");
} catch (ClassNotFoundException e) {
e.printStackTrace();
}
}
protected Sql2o sql2o = new Sql2o("jdbc:mysql://mysql.server.name/dbname?serverTimezone=UTC", "username", "password");
public List<User> getUsers() {
return sql2o.withConnection((c, o) -> {
return c.createQuery(
"SELECT\n" +
" id,\n" +
" name\n" +
"FROM users"
)
.executeAndFetch(User.class);
});
}
}
public class Main {
public static void main(String[] args) {
val gson = new Gson();
port(8080);
get("/users", (req, res) -> {
return new UserRepository().getUsers();
}, gson::toJson);
}
}
If you rely on Tomcat to provide the connection, it comes from a pool. If you don't like that, just go with plain old JDBC and open the connection yourself (and make sure to close it as well).
So much for answering your question to the letter. Now for the spirit: there's nothing wrong with connections coming from a pool. In all cases it's your responsibility to handle the connection properly: get it, and free it up (close it) when you're done with it. It makes no difference whether the connection comes from a pool or has been created manually.
You say performance is not an issue, but note that creating a connection may take some time, so even if the computer is largely idle, creating a new connection per request may have a noticeable effect on performance. Your server won't overheat, but it might add a second or two to the request turnaround time.
Check the configuration of your pool, e.g. validationQuery (to detect communication failures) or limits on use per connection. And make sure that you don't run into those issues because of bugs in your code. You'll need to handle communication errors anyway, and, again, that handling doesn't differ whether you use pools or not.
Edit: And finally: are you extra, extra sure that there isn't a real communication link failure? Like the database or router being unplugged every night to plug in the vacuum cleaner (no pun intended), or a firewall dropping/resetting connections?
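The get-and-close discipline described in this answer can be sketched with try-with-resources. This is a minimal illustration under assumptions: PerRequestConnection, ConnectionSource and Work are hypothetical names, and the source can be anything that hands out a java.sql.Connection, pooled or not.

```java
import java.sql.Connection;
import java.sql.SQLException;

// Hypothetical helper: opens one connection per call and always closes it,
// whether the work succeeds or throws.
class PerRequestConnection {

    // Anything that can hand out a connection, e.g.
    // () -> DriverManager.getConnection(url, user, password) or dataSource::getConnection.
    interface ConnectionSource {
        Connection open() throws SQLException;
    }

    // The work to run against the connection.
    interface Work<T> {
        T apply(Connection connection) throws SQLException;
    }

    static <T> T withConnection(ConnectionSource source, Work<T> work) throws SQLException {
        // try-with-resources calls close() on exit; for a pooled connection
        // close() hands it back to the pool instead of tearing it down.
        try (Connection connection = source.open()) {
            return work.apply(connection);
        }
    }
}
```

A request handler would call withConnection once per request. Because the calling code is identical for a pool and for a hand-opened connection, switching between the two only changes the ConnectionSource.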

wakeLock does not wait for network connectivity

I am using a wakelock for an alarm that updates the app state regularly. The wifi takes a while to connect on Samsung phones. Also, the "keep awake" option for wifi does not work on Samsung phones (nor is Samsung interested in fixing the issue). So when the wakelock is acquired, it should wait for wifi to connect. Do I need to create a listener for wifi connectivity for this to work, or should the wakelock block until wifi connects?
mWakeLock = ((PowerManager) getSystemService(POWER_SERVICE)).newWakeLock(
PowerManager.PARTIAL_WAKE_LOCK, "Taxeeta");
mWakeLock.acquire();
// do some network activity in an AsyncTask
// in the AsyncTask's onPostExecute, release the lock
Edit:
The question is that if the network is not connected, or takes time to come up (3G takes a while), the web service call in the AsyncTask's doInBackground will fail, and I will have to release the lock anyway.
So:
Should I add wifi/data connection listeners, or is there a better way?
I have a similar scenario: I am woken up by an alarm, the alarm's BroadcastReceiver launches a WakefulIntentService, and the service starts a scan for networks. I use a stupid way of holding on to the lock [1], which I intend to replace with a latch. I suggest you replace the AsyncTask with a WakefulIntentService; chances are the AsyncTask is never fired. In the WakefulIntentService you must acquire and hold on to a wifi lock. I would make this a static field of YourWakefulIntentService, though I'm not entirely clear on this as it's a while back. If this does not work, I would use a latch in YourWakefulIntentService:
// register an alarm
Intent i = new Intent(context, YourReceiver.class);
PendingIntent alarmPendingIntent= PendingIntent.getBroadcast(context, 0, i,
PendingIntent.FLAG_UPDATE_CURRENT);
public class YourReceiver extends BroadcastReceiver {
@Override
public void onReceive(Context context, Intent intent) {
WakefulIntentService.sendWakefulWork(context, YourWIS.class);
}
}
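// pseudocode! Assumes serviceLatch is a java.util.concurrent.CountDownLatch
// shared by the service and the scan receiver.
public class YourWIS extends WakefulIntentService { // you must add a constructor!
@Override
protected void doWakefulWork(Intent intent) {
acquireWifiLock();
enableScanReceiver();
startScan();
serviceLatch.await(); // block until the scan receiver counts the latch down
releaseWifiLock();
}
}
// in YourScanReceiver
public void onReceive(Context context, Intent intent) {
if (SCAN_RESULTS.equals(intent.getAction())) {
// do something that does not take time or start another/the same
// WakefulIntentService
serviceLatch.countDown(); // releases the waiting service
}
}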
Try the WakefulIntentService first (I guess you launch the AsyncTask from the alarm receiver). The scan receiver is a receiver registered to receive the scan results (see the WifiManager docs; prefer receivers to listeners for sleep issues).
[1]: this is a working class. I just use a second wakeful intent service to keep the wake locks. I still have to refactor it to use latches, but this approach at least works: the second service (the Gatekeeper) waits on a monitor and holds the wake lock, and the Gatekeeper also holds its CPU lock, so all is fine (and ugly).
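The latch idea mentioned in this answer can be shown in isolation with a plain-Java sketch. It is a stand-in under assumptions, not Android code: ScanLatchDemo, onScanResults and waitForScan are hypothetical names, with a background thread playing the part of the scan receiver. A timeout is added so the waiting side cannot hold its locks forever if no result arrives.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Plain-Java stand-in for the service/receiver pair; no Android classes are used.
class ScanLatchDemo {
    private final CountDownLatch scanFinished = new CountDownLatch(1);

    // Would be called from the scan receiver's onReceive().
    void onScanResults() {
        scanFinished.countDown();
    }

    // Would be called from the service's worker method: blocks until results
    // arrive or the timeout expires. Returns true if results arrived in time.
    boolean waitForScan(long timeout, TimeUnit unit) throws InterruptedException {
        return scanFinished.await(timeout, unit);
    }

    public static void main(String[] args) throws InterruptedException {
        ScanLatchDemo demo = new ScanLatchDemo();
        new Thread(demo::onScanResults).start(); // simulated scan-results broadcast
        boolean arrived = demo.waitForScan(5, TimeUnit.SECONDS);
        System.out.println(arrived ? "scan results received" : "timed out");
    }
}
```

In the service, the wifi lock would be released after waitForScan returns, whether or not the results arrived.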

Is NServiceBus (AsA_Server) without DTC possible?

I am using NServiceBus for the first time and have a small, simple application where a user submits a form, the form fields are then sent to the queue, and the handler collects this data and writes it to the database using linq-to-sql.
Any changes within Component Services are a complete no-no as far as the DBA is concerned, so I'm now looking for an alternative to DTC (which is not enabled on the DB server), while still using AsA_Server so that messages do not get purged.
I have tried removing AsA_Server after IConfigureThisEndpoint and specifying the configuration myself, but this doesn't seem to work (the console appears, page loads but nothing happens, it doesn't even stop at breakpoints.) AsA_Client does work, but as I understand it the messages will be purged at startup which I need to avoid.
Any suggestions?
Thanks,
OMK
EDIT: This has now been resolved by wrapping the call to the database in a suppressed transaction scope, which allows the database work to be done with no ambient transaction to enlist in:
using (TransactionScope sc = new TransactionScope(TransactionScopeOption.Suppress))
{
// code here
sc.Complete();
}
When you use AsA_Server, you are specifying that you want durable queues, and you will need to configure transactional queues.
With transactional send/receive, MSMQ sends, transmits, receives, and processes a message transactionally. However, these stages do not share one transaction; each takes place in its own.
For example, the send transaction is complete when the sender sends a message onto their local MSMQ subsystem (even if the queue address is remote, the sender still sends to a local queue which acts as a kind of proxy to the remote queue).
The transmit transaction is complete when the MSMQ subsystem on the sender's machine successfully transmits the message to the MSMQ subsystem on the receiver's machine.
Even though this may all happen on one machine, I am guessing that your Handle() method is writing to a database on a different machine.
The problem here is that for the receive operation to complete satisfactorily from a transaction perspective, your call to the database must be successful. Only then will the message be de-queued from your input queue. This prevents any chance that the message is lost during processing failure.
However, in order to enforce that across the network you need to involve DTC to coordinate the distributed transaction to the database.
Bottom line, if you want durable queues in a distributed environment then you will need to use MSDTC.
Hope this helps.
There is an alternative. In your connection string you can add the option to not enlist in a distributed transaction and this will have your DB connection ignored in the DTC.
Of course, if this is set in the config then all database transactions for the application are ignored by the DTC rather than just a specific one.
Example:
<add key="DatabaseConnectionString" value="Data Source=SERVERNAME;Initial Catalog=DBNAME;Integrated Security=True;Enlist=False"/>
With NServiceBus 4.0 you can now do the following, which finally worked for me:
Configure.Transactions.Advanced(t =>
{
t.DisableDistributedTransactions();
t.DoNotWrapHandlersExecutionInATransactionScope();
});
When you use the AsA_ role interfaces (AsA_Client, AsA_Server), the role configuration is applied after Init(), so all the settings you make there regarding MsmqTransport and UnicastBus are overridden.
It's possible to override those settings using IWantTheEndpointConfig in an IHandleProfile implementation. You get the configuration after the default roles are applied but before the bus is started.
This way you can change the default profile settings and tailor them to your needs: deactivate transactions, enable impersonation...
Example:
public class DeactivateTransactions : IHandleProfile<Lite>, IWantTheEndpointConfig
{
private IConfigureThisEndpoint configure;
public IConfigureThisEndpoint Config
{
get { return configure; }
set
{
this.configure = value;
Configure.Instance.MsmqTransport()
.PurgeOnStartup(false)
.IsTransactional(false); // Or other changes
}
}
public void ProfileActivated()
{
}
}

NServiceBus: Messages handled multiple times

I am at a complete loss as to why I am experiencing this problem. I am new to NServiceBus and have so far set up a dead simple 'server' which listens for messages sent by a web application. The server asks for custom initialisation (IWantCustomInitialization) and uses a custom builder for Castle Windsor 2.5.1. This custom builder is basically a copy of the one that comes with the NServiceBus source code, with two minor changes to move away from methods deprecated in Windsor 2.5.
Note that my code shares the container instance with NServiceBus.
The problem I experience is that every message sent by the web application is processed five (5) times by the server. The log files have five entries for each attempt, with the fifth attempt looking like this:
2011-03-28 16:04:10,326 [Worker.8] DEBUG NServiceBus.Unicast.UnicastBus [] - Calling 'HandleEndMessage' on NServiceBus.SagaPersisters.NHibernate.NHibernateMessageModule
2011-03-28 16:04:10,327 [Worker.8] DEBUG NServiceBus.Unicast.UnicastBus [] - Calling 'HandleEndMessage' on Server.NHibernateSessionMessageModule
2011-03-28 16:04:10,341 [Worker.8] DEBUG NServiceBus.Unicast.UnicastBus [] - Calling 'HandleError' on NServiceBus.SagaPersisters.NHibernate.NHibernateMessageModule
2011-03-28 16:04:10,342 [Worker.8] DEBUG NServiceBus.Unicast.UnicastBus [] - Calling 'HandleError' on Server.NHibernateSessionMessageModule
2011-03-28 16:04:10,344 [Worker.8] ERROR NServiceBus.Unicast.Transport.Msmq.MsmqTransport [] - Message has failed the maximum number of times allowed, ID=80cffd98-a5bd-43e0-a482-a2d96ca42b22\20677.
I have no indication why the message fails, and I don't know where to dig for more information/output.
The configuration 'endpoint' looks like this:
public void Init()
{
container = Windsor.Container;
NServiceBus.Configure.With().CastleWindsor251Builder(container).XmlSerializer().MsmqTransport().IsolationLevel(System.Transactions.IsolationLevel.Unspecified);
var masterInstaller = new NotificationServerInstaller();
masterInstaller.Install(container, null);
}
The message handler is, at this stage, really contrived, and looks like this:
public class NewUserMessageHandler : IHandleMessages<NotifyNewUserMessage>
{
private readonly IGetUserQuery _getUserQuery;
public NewUserMessageHandler(IGetUserQuery getUserQuery)
{
_getUserQuery = getUserQuery;
}
public void Handle(NotifyNewUserMessage message)
{
var result = _getUserQuery.Invoke(new GetUserRequest { Id = new Guid("C10D0684-D25F-4E5E-A347-16F85DB7BFBF") });
Console.WriteLine("New message received: {0}", message.UserSystemId);
}
}
If the first line in the handler method is commented out, the message is processed only once.
I have found some posts/threads on the web (including StackOverflow) which talk about similar issues, notably http://tech.groups.yahoo.com/group/nservicebus/message/5977 and Anyone using Ninject 2.0 as the nServiceBus ObjectBuilder? - but I haven't had any success in making my problem go away.
I'd be most obliged for any help. I'm a n00b at NServiceBus!
NServiceBus isn't handling the message multiple times. By default it retries 5 times if an exception occurs; you can set this in a config file. Have you got distributed transactions turned on? Because you are committing to a database while a transaction is already open (the queue transaction), opening another transaction will try to upgrade it to a distributed transaction. I think that may be the issue. Have you run it with the console app? You should see some output there.
I would recommend wrapping the body of the Handle method in a try/catch, adding a breakpoint in the catch, and seeing what is wrong.
Once you work it out, remove the try/catch.