Kafkajs - `The group is rebalancing, so a rejoin is needed` error causes message to be consumed more than once

I have an edge case in my KafkaJS consumer where, at times, I get a rebalancing error:
The group is rebalancing, so a rejoin is needed
[Connection] Response Heartbeat(key: 12, version: 3)
The group is rebalancing, so a rejoin is needed
[Runner] The group is rebalancing, re-joining
Then, once the consumer group is rebalanced, the last message that was processed gets processed again, because the commit did not happen due to the error.
Kafka consumer initialization code:
import { Consumer, Kafka } from 'kafkajs';
const kafkaInstance = new Kafka({
  clientId: 'some_client_id',
  brokers: ['brokers list'],
  ssl: true
});
const kafkaConsumer = kafkaInstance.consumer({ groupId: 'some_consumer_group_id' });
await kafkaConsumer.connect();
await kafkaConsumer.subscribe({ topic: 'some_topic', fromBeginning: true });
await kafkaConsumer.run({
  autoCommit: false, // cancel auto commit in order to control committing
  eachMessage: ... // some processing function
});
I increased sessionTimeout and heartbeatInterval to higher values and tried different combinations, but under heavy message load I still get the error.
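For reference, a minimal sketch of where those settings go; the values below are only illustrative, not a recommendation:
// sessionTimeout and heartbeatInterval are KafkaJS consumer options
const kafkaConsumer = kafkaInstance.consumer({
  groupId: 'some_consumer_group_id',
  sessionTimeout: 90000,   // how long the broker waits for a heartbeat before rebalancing (ms)
  heartbeatInterval: 3000, // how often heartbeats are sent (ms); keep well below sessionTimeout
});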
I added a call to the heartbeat function inside the eachMessage function, which seems to resolve the issue.
But I was wondering whether this is considered "good practice", or whether there is something else I can do on the consumer side to prevent such errors?

I added a call to the heartbeat function inside the eachMessage function, which seems to resolve the issue.
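A minimal sketch of that approach, combined with a manual commit since autoCommit is disabled; processMessage is a placeholder for the actual processing logic:
await kafkaConsumer.run({
  autoCommit: false,
  eachMessage: async ({ topic, partition, message, heartbeat }) => {
    await processMessage(message); // placeholder for the real work
    // signal liveness to the broker during/after long-running processing
    await heartbeat();
    // commit manually; the committed offset is the next offset to consume
    await kafkaConsumer.commitOffsets([
      { topic, partition, offset: (Number(message.offset) + 1).toString() },
    ]);
  },
});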

Related

Handling of rabbit messages via NESTJs microservice issue

I'm currently having a problem that I'm unable to solve. It only occurs in extreme cases: when the server, for some reason, goes offline, messages accumulate (100k or more) and then need to be processed all at the same time. Even though I'm planning for this never to happen, I would like to have a backup plan for it and more control over this issue.
I'm running a NestJS microservice against a RabbitMQ broker to get messages that arrive from IoT devices and insert them into a MySQL database.
Every message needs a small conversion/translation operation before the insert. This conversion is based on a single-row query against a table on the same SQL server.
The order is the following:
read message;
select 1 row from database (table has few thousand rows);
insert 1 row into the database;
Now, I'm facing this error:
(node:1129233) UnhandledPromiseRejectionWarning: SequelizeConnectionAcquireTimeoutError: Operation timeout
at ConnectionManager.getConnection (/home/nunovivas/NestJSProjects/integrador/node_modules/sequelize/lib/dialects/abstract/connection-manager.js:288:48)
at runNextTicks (internal/process/task_queues.js:60:5)
at listOnTimeout (internal/timers.js:526:9)
at processTimers (internal/timers.js:500:7)
at /home/nunovivas/NestJSProjects/integrador/node_modules/sequelize/lib/sequelize.js:613:26
at MySQLQueryInterface.select (/home/nunovivas/NestJSProjects/integrador/node_modules/sequelize/lib/dialects/abstract/query-interface.js:953:12)
at Function.findAll (/home/nunovivas/NestJSProjects/integrador/node_modules/sequelize/lib/model.js:1752:21)
at Function.findOne (/home/nunovivas/NestJSProjects/integrador/node_modules/sequelize/lib/model.js:1916:12)
node_modules/source-map-support/source-map-support.js:516
(node:1129233) UnhandledPromiseRejectionWarning: Unhandled promise rejection. This error originated either by throwing inside of an async function without a catch block, or by rejecting a promise which was not handled with .catch(). To terminate the node process on unhandled promise rejection, use the CLI flag `--unhandled-rejections=strict` (see https://nodejs.org/api/cli.html#cli_unhandled_rejections_mode). (rejection id: 1349)
I think that the promise rejection is inside the sequelize module.
This is my sequelize configuration:
useFactory: async (ConfigService: ConfigService) => ({
  dialect: 'mysql',
  host: 'someserver',
  port: 3306,
  username: 'dede',
  password: 'dudu!',
  database: 'dada',
  autoLoadModels: true,
  pool: { max: 5, min: 0, adquire: 1800000, idle: 5000 },
  synchronize: true,
  logQueryParameters: true,
This is part of my message service:
@RabbitRPC({
  exchange: 'BACKEND_MAINEXCHANGE',
  routingKey: 'Facility_DeviceReadings',
  queue: 'Facility_DeviceReadings',
})
public async rpcHandlerDeviceReadings(mensagem: ReadingDevicesPerFacility) {
  const schemavalid = mensagem;
  this.mylogger.log(
    'Received message from BACKEND_MAINEXCHANGE - listening to the queue Facility_DeviceReadings : ' +
      ' was registered locally on ' +
      schemavalid.DateTimeRegistered,
    MessagingService.name,
    'rpcHandlerDeviceReadings',
  );
  if (schemavalid) {
    try {
      let finalschema = new CreateReadingDevicesDto();
      if (element.Slot > 0) {
        const result = this.readingTransService
          .findOneByPlcId(element.deviceId, element.Slot)
          .then((message) => {
            if (!message) {
              throw new NotFoundException('Message with ID not found');
            } else {
              finalschema.deviceId = message.deviceId;
              finalschema.Slot = message.Slot2;
              if (this.isNumeric(element.ReadingValue)) {
                finalschema.ReadingValue = element.ReadingValue;
                finalschema.DateTimeRegistered =
                  schemavalid.DateTimeRegistered;
                this.readingDeviceService
                  .create(finalschema)
                  .then((message) => {
                    this.mylogger.debug(
                      'Saved',
                      MessagingService.name,
                      'rpcHandlerDeviceReadings',
                    );
                    return 42;
                  });
              } else {
                this.mylogger.error(
                  'error',
                  MessagingService.name,
                  'rpcHandlerDeviceReadings',
                );
              }
              return message;
            }
          });
The problem seems to be that this RPC handler keeps going against RabbitMQ and reading/consuming messages (8 per millisecond) before SQL has a chance to reply, forcing Sequelize into a state it can't handle any more and thus throwing the above error.
I have tried tweaking the Sequelize config, but with no good outcome.
Is there any way to force the RPC to just handle the next message after the previous one is processed?
I would love it if someone could steer me in the right direction, since this could eventually become a breaking issue.
Thanks in advance for any input you can give me.
It looks to me like your Sequelize connection pool options need some tweaking.
You have
pool: { max: 5, min: 0, adquire: 1800000, idle: 5000 }
adquire isn't a thing. Maybe acquire? Half an hour (1.8 million milliseconds) is a really long time to wait for a connection. Shorten it? acquire: 300000 will give you five minutes. A big production app such as yours probably should always keep one or two connections open. Increase min to 1 or 2.
A modest maximum number of connections is good as long as each operation grabs a connection from the pool, uses it, and releases it. If your operation grabs a connection and then awaits something external, you'll need more connections.
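Putting those suggestions together, the pool block might look something like this (a sketch; the values are a starting point, not a prescription):
pool: { max: 5, min: 2, acquire: 300000, idle: 5000 }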
If it's possible to get your program to read in a whole bunch of messages (at least 10) at a time, then put them into your database in one go with bulkCreate(), you'll speed things up. A lot. That's because inserts are cheap, but the commit operations after those inserts aren't so cheap. So, doing multiple inserts within a single transaction, then committing them all at once, can make things dramatically faster. Read about autocommit for more information on this.
Writing your service to chow down on a big message backlog quickly will make errors like the one you showed us less likely.
Edit: To use .bulkCreate() you need to accumulate multiple incoming messages. Try something like this.
Create an array of your received CreateReadingDevicesDto messages. let incomingMessages = []
Instead of using .create() to put each new message into your database as you finish receiving and validating it, push it into your array: incomingMessages.push(finalschema).
Set up a JavaScript interval to take the data from the array and put it into your database with .bulkCreate(). The snippet below does that every 500 ms.
setInterval(() => {
  // arrow function so `this` still refers to the surrounding service
  if (incomingMessages.length > 0) {
    /* create all the items in the array */
    this.readingDeviceService.bulkCreate(incomingMessages)
    /* empty out the array */
    incomingMessages = []
  }
}, 500)
At the cost of somewhere between 0 and 500ms extra latency, this batches up your messages and will let you process your backlog faster.
I haven't debugged this, and it's probably a little more crude than you want in production code. But I have used similar techniques to good effect.

NodeJS + mysql - automatically closing pool connections?

I wish to use connection pooling in NodeJS with a MySQL database. According to the docs, there are two ways to do that: either I explicitly get a connection from the pool, use it, and release it:
var pool = require('mysql').createPool(opts);
pool.getConnection(function(err, conn) {
  conn.query('select 1+1', function(err, res) {
    conn.release();
  });
});
Or I can use it like this:
var mysql = require('mysql');
var pool = mysql.createPool({opts});
pool.query('select 1+1', function(err, rows, fields) {
  if (err) throw err;
  console.log('The solution is: ', rows[0].solution);
});
If I use the second option, does that mean that connections are automatically pulled from the pool, used, and released? And if so, is there a reason to use the first approach?
Yes, the second one means that the pool is responsible for getting the next free connection, running the query on it, and then releasing it again. You use this for "one shot" queries that have no dependencies.
You use the first one if you want to do multiple queries that depend on each other. A connection holds certain state, like locks, transactions, encoding, timezone, variables, ... .
Here is an example that changes the timezone used:
pool.getConnection(function(err, conn) {
  function setTimezone() {
    // set the timezone for this connection
    conn.query("SET time_zone='+02:00'", queryData);
  }
  function queryData() {
    // run the actual query, then restore the timezone
    conn.query( /* some query */, restoreTimezoneToUTC);
  }
  function restoreTimezoneToUTC() {
    // restore the timezone to UTC (or whatever you use as default)
    // otherwise this one connection would use +02 for future requests
    // if it is reused by a future `getConnection`
    conn.query("SET time_zone='+00:00'", releaseQuery);
  }
  function releaseQuery() {
    // return the connection back to the pool
    conn.release();
  }
  setTimezone();
});
In case anyone else stumbles upon this:
When you use pool.query you are in fact calling a shortcut which does what the first example does.
From the readme:
This is a shortcut for the pool.getConnection() -> connection.query() -> connection.release() code flow. Using pool.getConnection() is useful to share connection state for subsequent queries. This is because two calls to pool.query() may use two different connections and run in parallel.
So yes, the second one also calls connection.release(); you just don't need to type it.

mysql - node - Row inserted and queryable via server connection lost on DB restart

I ran into an issue testing today that occurred during or after an insert via a connection on my node server. The code where the insert is performed looks something like this:
// ...
username = esc(username);
firstname = esc(firstname);
lastname = esc(lastname);
var values = [username, firstname, lastname].join(',');
var statement = 'INSERT INTO User(Username,FirstName,LastName) VALUES({0});\n'+
'SELECT FirstName, LastName, Username, Id, IsActive FROM User WHERE Username={1};'
statement = merge( statement, [ values, username ] );
conn.query(statement, function(e, rows, fields){
  e ? function() {
    res.status(400);
    var err = new Error;
    err.name = 'Bad request';
    err.message = 'A problem occurred during sign-up.';
    err.details = e;
    res.json(err);
  }() : function(){
    res.json( rows[1] );
  }();
});
A quick note on esc() and merge(): these are simply util functions that help prepare the database statement.
The above code completed successfully, i.e. the response was a 200 with the newly inserted user row in the body. The inserted row was queryable via the same connection throughout the day. I only noticed this afternoon, when running the following generic query as root via the shell, that the row was missing.
SELECT Id, FirstName, LastName FROM User;
So at that point I restarted the database and the node server. Unfortunately, now it would appear the row is gone entirely, along with any reliable path to troubleshoot.
Here are some details of interest about my server setup. As of yet, I have no idea how (if at all) any of these could be suspect.
Uses only a single connection as opposed to a connection pool (for now)
multipleStatements=true in the connection config (obviously above snippet makes use of this)
SET autocommit = 0; START TRANSACTION; COMMIT; used elsewhere in the codebase to control rollback
Using poor man's keep-alive every 30 seconds to avoid connection timing out: SELECT 1;
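For completeness, that keep-alive is nothing more than something like this (a sketch, assuming the single conn from the snippet above):
// issue a trivial query every 30 seconds so the idle connection isn't dropped
setInterval(function () {
  conn.query('SELECT 1;');
}, 30000);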
I've been reading up all evening and am running out of ideas. Any suggestions? Is this likely an issue of uncommitted data? If so, is there a reliable way to debug this? Better yet, is there any way to prevent it? What could be the cause? And finally, if in debugging my server I find data in this state, is there a way to force commit at least so that I don't lose my changes?

ElasticClient an error when checking the connection status

I'm trying to check the connection status, but there is an exception when checking.
var node = new Uri("http://myhost:9200");
var settings = new ConnectionSettings(node);
ElasticClient client = new ElasticClient(settings);
IStatusResponse status = client.Status();
Calling client.Status() throws a Newtonsoft.Json.JsonReaderException:
JSON integer 12500348306 is too large or small for an Int32. Path 'indices.companyindx.index.primary_size_in_bytes', line 1, position 37862.
If I do not make the status call, then everything works fine.
I'm using C# and NEST 1.0.0-beta1.
What could be the reason?
This is actually a bug in NEST, more info here. Should be fixed in the next release if that PR is merged. Good find!

Groovy Gorm catch util.JDBCExceptionReporter error on save()

I have a problem catching util.JDBCExceptionReporter during save().
Code:
membership.withTransaction { status ->
    try {
        def membership = membership.findByUserId(userId)
        if (!membership) {
            membership = new membership(userId: userId, email: userEmail, baseLevel: membership.findByName(membershipName.MEMBER)).save(flush: true)
        }
    } catch (org.springframework.dao.DataIntegrityViolationException e) {
        status.setRollbackOnly()
        return false
    } catch (all) {
        println "Insert Membership Exception.\n User Id: $userId\n " + all
    }
}
When I create two threads to run this code, it throws an error:
2014-05-06 12:53:07,034 [Actor Thread 5] ERROR util.JDBCExceptionReporter - Duplicate entry 'test#gmail.com' for key 'email'
I don't want to show this error every time there are two threads doing the same insert, because the first thread will go through and insert successfully, so I don't really care about the second one.
My question is how to catch util.JDBCExceptionReporter?
Thanks.
Just guessing:
By default Grails doesn't throw exceptions when saving. To throw integrity exceptions you have to use save(failOnError: true).
So in this case, it's just an internal trace (util.JDBCExceptionReporter is not an exception).
In your case, instead of capturing exceptions I'd use validate before saving so you can get the integrity errors before trying to save.
As lsidroGH said, util.JDBCExceptionReporter is not an exception, it's a log message. It logs both SQLExceptions and SQLWarnings. There is no problem with your code, as one thread will have a save() call that returns true and the other thread's save() will get false.
If you don't want this message to show up in your logs, you will need to increase your log level for org.hibernate.util.JDBCExceptionReporter from ERROR to FATAL but this will potentially exclude valid exceptions you would want logged. Your best bet is to ignore it, as your code works.