Infinite retries when using RpcFilter in NestJS microservice setup with Kafka - exception

I am new to Kafka and I am seeing inconsistent behaviour while trying to set up proper error handling on my consumer. In a few instances I see the retry policy in action: Kafka retries my message 5 times (as I configured), then the consumer crashes, recovers, and my group rebalances. In other instances that does not happen: the consumer crashes, recovers, my group rebalances, and the consumer then attempts to consume the same message again and again, infinitely.
Let's say I have a controller method that's subscribed to a Kafka topic:
@EventPattern("cat-topic")
public async createCat(
  @Payload()
  message: CatRequestDto,
  @Ctx() context: IKafkaContext
): Promise<void> {
  try {
    await this.catService.createCat(message);
  } catch (ex) {
    this.logger.error(ex);
    throw new RpcException(
      `Couldn't create a cat`
    );
  }
}
I am using an RpcFilter on this method, like the one from https://docs.nestjs.com/microservices/exception-filters:
import { Catch, RpcExceptionFilter, ArgumentsHost } from '@nestjs/common';
import { Observable, throwError } from 'rxjs';
import { RpcException } from '@nestjs/microservices';
@Catch(RpcException)
export class ExceptionFilter implements RpcExceptionFilter<RpcException> {
  catch(exception: RpcException, host: ArgumentsHost): Observable<any> {
    return throwError(() => exception.getError());
  }
}
I feel like something funky might be happening with committing offsets, or something else. I can't pinpoint it.
Any comments or suggestions are greatly appreciated.
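The thread does not confirm the cause. One common way to keep a single failing message from being redelivered forever is to cap the attempts inside the handler and park the message elsewhere instead of rethrowing. Below is a minimal sketch of that idea, assuming the redelivery happens because the handler keeps throwing for the same message so its offset is never committed. The names handleWithRetryLimit, Handler, and DeadLetterSink are hypothetical and not part of NestJS or kafkajs.

```typescript
// Hypothetical types: a message handler and a sink for messages that keep failing.
type Handler<T> = (message: T) => Promise<void>;
type DeadLetterSink<T> = (message: T, error: unknown) => Promise<void>;

// Tries the handler up to maxAttempts times. If every attempt fails, the
// message is handed to the dead-letter sink and the function returns normally,
// so the caller does not rethrow and the consumer can move past the message.
// Returns the number of attempts that were made.
async function handleWithRetryLimit<T>(
  message: T,
  handler: Handler<T>,
  deadLetter: DeadLetterSink<T>,
  maxAttempts = 5,
): Promise<number> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      await handler(message);
      return attempt; // success on this attempt
    } catch (error) {
      lastError = error; // remember the failure and try again
    }
  }
  await deadLetter(message, lastError);
  return maxAttempts;
}
```

In a setup like the one above, the body of createCat could call such a helper instead of throwing an RpcException after the first failure; whether that fits depends on how the broker-level retry is configured.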

Related

What is the correct type of Exception to throw in a Nestjs service?

So, by reading the NestJS documentation, I get the main idea behind how the filters work with exceptions.
But from all the code I have seen, it seems like all services always throw HttpExceptions.
My question is: Should the services really be throwing HttpExceptions? I mean, shouldn't they be more generic? And, if so, what kind of Error/Exception should I throw and how should I implement the filter to catch it, so I won't need to change it later when my service is not invoked by a Http controller?
Thanks :)
No, they should not. An HttpException should be thrown from within a controller. So yes, your services should expose their own errors in a more generic way.
But "exposing errors" doesn't have to mean "throwing exceptions".
Let's say you have the following project structure:
📁 sample
|_ 📄 sample.controller.ts
|_ 📄 sample.service.ts
When calling one of your SampleService methods, you want your SampleController to know whether or not it should throw an HttpException.
This is where your SampleService comes into play. It is not going to throw anything but it's rather going to return a specific object that will tell your controller what to do.
Consider the two following classes :
export class Error {
constructor(
readonly code: number,
readonly message: string,
) {}
}
export class Result<T> {
constructor(readonly data: T) {}
}
Now take a look at this random SampleService class and how it makes use of them:
@Injectable()
export class SampleService {
  isOddCheck(numberToCheck: number): Error | Result<boolean> {
    // A number is odd when dividing it by 2 leaves a remainder.
    const isOdd = numberToCheck % 2 !== 0;
    if (isOdd) {
      return new Result(isOdd);
    }
    return new Error(
      400,
      `Number ${numberToCheck} is even.`
    );
  }
}
Finally, this is what your SampleController should look like:
@Controller()
export class SampleController {
  constructor(
    private readonly sampleService: SampleService
  ) {}
  @Get()
  sampleGetResponse(): boolean {
    const result = this.sampleService.isOddCheck(13);
    if (result instanceof Result) {
      return result.data;
    }
    throw new HttpException(
      result.message,
      result.code,
    );
  }
}
As you can see nothing gets thrown from your service. It only exposes whether or not an error has occurred. Only your controller gets the responsibility to throw an HttpException when it needs to.
Also notice that I didn't use any exception filter. I didn't have to. But I hope this helps.
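To address the part of the question about not having to change the service when it is no longer called from an HTTP controller, here is a variation on the same idea, offered as a sketch rather than the answer's own method. It assumes the service reports a transport-neutral error kind instead of an HTTP status code, and each entry point maps that kind to its own error shape. All names (ServiceResult, checkOdd, toHttpStatus, toRpcError) are hypothetical.

```typescript
// Transport-neutral error kinds the service is allowed to report.
type ServiceErrorKind = 'invalid_input' | 'not_found';

interface ServiceSuccess<T> { ok: true; data: T; }
interface ServiceFailure { ok: false; kind: ServiceErrorKind; message: string; }
type ServiceResult<T> = ServiceSuccess<T> | ServiceFailure;

// Service-level function: knows nothing about HTTP or RPC.
function checkOdd(numberToCheck: number): ServiceResult<boolean> {
  if (numberToCheck % 2 !== 0) {
    return { ok: true, data: true };
  }
  return { ok: false, kind: 'invalid_input', message: `Number ${numberToCheck} is even.` };
}

// HTTP entry points translate the kind into a status code.
function toHttpStatus(kind: ServiceErrorKind): number {
  switch (kind) {
    case 'invalid_input': return 400;
    case 'not_found': return 404;
  }
}

// Microservice entry points translate the same failure into an RPC-style payload.
function toRpcError(failure: ServiceFailure): { code: string; message: string } {
  return { code: failure.kind.toUpperCase(), message: failure.message };
}
```

An HTTP controller would throw an HttpException built from toHttpStatus(result.kind), while a microservice handler could throw an RpcException built from toRpcError(result); the service itself stays unchanged.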

Gracefully closing connection of DB using TypeORM in NestJs

So, before I go deep into the problem, let me explain the basics of my app.
My app has connections to a DB (TypeORM) and to Kafka (kafkajs).
My app is the consumer of 1 topic, and it:
1. Gets some data in the callback handler and puts that data in one table using a TypeORM entity.
2. Maintains a global map (in a singleton instance of a class) keyed by some id (which I get in the data of point 1).
When the app is shutting down, my task is to:
1. Disconnect all the consumers of the topics (that this service is connected to) from Kafka.
2. Traverse the global map (point 2) and repark the messages in some topic.
3. Disconnect the DB connections using the close method.
Here are some pieces of code that might help you understand how I added the lifecycle events on the server in NestJS.
system.server.life.cycle.events.ts
@Injectable()
export class SystemServerLifeCycleEventsShared implements BeforeApplicationShutdown {
  constructor(@Inject(WINSTON_MODULE_PROVIDER) private readonly logger: Logger, private readonly someService: SomeService) {}
  async beforeApplicationShutdown(signal: string) {
    const [err] = await this.someService.handleAbruptEnding();
    if (err) this.logger.info(`beforeApplicationShutdown, error::: ${JSON.stringify(err)}`);
    this.logger.info(`beforeApplicationShutdown, signal ${signal}`);
  }
}
some.service.ts
export class SomeService {
constructor(private readonly kafkaConnector: KafkaConnector, private readonly postgresConnector: PostgresConnector) {}
public async handleAbruptEnding(): Promise<any> {
await this.kafkaConnector.disconnectAllConsumers();
for(READ_FROM_GLOBAL_STORE) {
await this.kafkaConnector.function.call.to.repark.the.message();
}
await this.postgresConnector.disconnectAllConnections();
return true;
}
}
postgres.connector.ts
export class PostgresConnector {
  private connectionManager: ConnectionManager;
  constructor () {
    this.connectionManager = getConnectionManager();
  }
  public async disconnectAllConnections(): Promise<void[]> {
    // Collect one close() promise per open connection.
    const connectionClosePromises: Promise<void>[] = [];
    this.connectionManager.connections?.forEach((connection) => {
      if (connection.isConnected) connectionClosePromises.push(connection.close());
    });
    return Promise.all(connectionClosePromises);
  }
}
ConnectionManager and getConnectionManager() are imported from the TypeORM module.
Now here are some unusual exceptions / behaviors I am facing:
Disconnecting all connections throws the following error:
ERROR [TypeOrmModule] Cannot execute operation on "default" connection because connection is not yet established.
If the connection is not yet established, how did isConnected come out true inside the if? I can't find any clue as to how this is possible, or how to do a graceful shutdown of the connection in TypeORM.
Do we really need to handle closing the connection in TypeORM, or does it handle that internally?
Even if TypeORM handles closing the connection internally, how could we do it explicitly?
Is there any callback that is triggered once the connection has been closed properly, so that I can be sure the disconnection from the DB actually happened?
Some of the log messages arrive after I press CTRL + C (mimicking an abrupt closure of my server process) and control has returned to the terminal. This means something is still running after the prompt comes back (🤷 no clue how I would handle this, since, as you can see, my handleAbruptEnding is awaited, and I also cross-checked that all the promises are awaited properly).
Some of the things to know:
I properly added my module to create the hooks of server life cycle events.
Injected the objects in almost all the classes properly.
Not getting any DI issue from NEST and server is getting started properly.
Please shed some light and let me know how I can gracefully disconnect from the DB using the TypeORM API inside NestJS in case of an abrupt closure.
Thanks in advance and happy coding :)
A little bit late, but this may help someone.
You are missing the keepConnectionAlive param set to true in TypeOrmModuleOptions; TypeORM doesn't keep connections alive by default. I set keepConnectionAlive to false, and if a transaction keeps the connection open I close the connection myself (TypeORM waits until the transaction or other process finishes before closing the connection). This is my implementation:
import { Logger, Injectable, OnApplicationShutdown } from '@nestjs/common';
import { getConnectionManager } from 'typeorm';
@Injectable()
export class LifecyclesService implements OnApplicationShutdown {
  private readonly logger = new Logger();
  onApplicationShutdown(signal: string) {
    this.logger.warn('SIGTERM: ', signal);
    this.closeDBConnection();
  }
  closeDBConnection() {
    // Look up the default connection and close it only if it is still open.
    const conn = getConnectionManager().get();
    if (conn.isConnected) {
      conn
        .close()
        .then(() => {
          this.logger.log('DB conn closed');
        })
        .catch((err: any) => {
          this.logger.error('Error closing conn to DB, ', err);
        });
    } else {
      this.logger.log('DB conn already closed.');
    }
  }
}
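Note that the hook above starts the close but does not return its promise. Nest waits for shutdown hooks that return a promise, so an awaited variant may help with log lines appearing after the process seems to have ended. The following is a minimal sketch of that variant with no NestJS or TypeORM imports: ClosableConnection is a hypothetical stand-in for the parts of a TypeORM connection used here, and ShutdownHook is a hypothetical class name.

```typescript
// Hypothetical stand-in for the two connection members used in the answer above.
interface ClosableConnection {
  isConnected: boolean;
  close(): Promise<void>;
}

class ShutdownHook {
  private readonly connections: ClosableConnection[];

  constructor(connections: ClosableConnection[]) {
    this.connections = connections;
  }

  // Closes every open connection in order and waits for each close to finish.
  // Errors are collected instead of thrown so one failure does not skip the rest.
  // The returned promise resolves only after all close() calls have settled.
  async onApplicationShutdown(): Promise<string[]> {
    const errors: string[] = [];
    for (const connection of this.connections) {
      if (!connection.isConnected) continue; // already closed, nothing to do
      try {
        await connection.close();
      } catch (err) {
        errors.push(String(err));
      }
    }
    return errors;
  }
}
```

Returning the collected errors also gives the caller something to log before the process exits.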
I found some TypeORM docs saying "Disconnection (closing all connections in the pool) is made when close is called".
Here: https://typeorm.biunav.com/en/connection.html#what-is-connection
I tried export const AppDataSource = new DataSource({ // details }), then importing it and doing:
import { AppDataSource } from "../../src/db/data-source";
function closeConnection() {
  console.log("Closing connection to db");
  // AppDataSource.close(); // said "deprecated - use destroy() instead"
  AppDataSource.destroy(); // hence I did this
}
export default closeConnection;
Maybe this will save someone some time.

How do I prevent Nest.js from logging an exception to the console?

Background
We use ApolloHandler to handle the exceptions in our Nest.js + GraphQL application.
Problem
Although ApolloHandler manages to create a structured GraphQL error response, every exception (plus its stack trace) also generates a console log and a logger entry [ExceptionHandler], polluting the application log with thousands of already managed input errors.
Question
How can I configure Nest.js to suppress those ApolloHandler exceptions? Of course, non-ApolloHandler exceptions should still be logged.
Create your own custom logger to filter out those messages, like:
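import { Logger } from '@nestjs/common';
export class AppLogger extends Logger {
  error(message: string, trace: string, context?: string) {
    // Forward everything except the message that is already handled elsewhere.
    if (message !== 'Validations failed!') {
      super.error(message, trace, context)
    }
  }
}
And use it as:
app.useLogger(new AppLogger())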
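If more than one message needs to be silenced, the equality check above can be generalized to a set of suppressed messages. The sketch below shows that filtering logic in isolation so it can be tested without Nest. BaseLogger is a hypothetical stand-in for Nest's Logger that only records what it receives, and FilteringLogger is a hypothetical name.

```typescript
// Hypothetical stand-in for a framework logger: it just records error entries.
class BaseLogger {
  readonly entries: { message: string; trace?: string; context?: string }[] = [];

  error(message: string, trace?: string, context?: string): void {
    this.entries.push({ message, trace, context });
  }
}

// Drops error messages that are in the suppressed set and forwards the rest.
class FilteringLogger extends BaseLogger {
  private readonly suppressed: Set<string>;

  constructor(suppressedMessages: string[]) {
    super();
    this.suppressed = new Set(suppressedMessages);
  }

  error(message: string, trace?: string, context?: string): void {
    if (this.suppressed.has(message)) {
      return; // already handled elsewhere, keep it out of the log
    }
    super.error(message, trace, context);
  }
}
```

Matching on exact message text is brittle; matching on the context argument may be more robust, assuming the exceptions to silence are logged with a recognizable context.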

Hystrix/Feign to solely react on HTTP status 429

I'm using Feign from the spring-cloud-starter-feign to send requests to a defined backend. I would like to use Hystrix as a circuit-breaker but for only one type of use-case: If the backend responds with a HTTP 429: Too many requests code, my Feign client should wait exactly one hour until it contacts the real backend again. Until then, a fallback method should be executed.
How would I have to configure my Spring Boot (1.5.10) application in order to accomplish that? I see many configuration possibilities but only a few examples, which, in my opinion, are unfortunately not built around concrete use cases.
This can be achieved by defining an ErrorDecoder and taking manual control of the Hystrix Circuit Breaker. You can inspect the response codes from the exceptions and provide your own fallback. In addition, if you wish to retry the request, wrap and throw your exception in a RetryException.
To meet your Retry requirement, also register a Retryer bean with the appropriate configuration. Keep in mind that using a Retryer will tie up a thread for the duration. The default implementation of Retryer does use an exponential backoff policy as well.
Here is an example ErrorDecoder taken from the OpenFeign documentation:
public class StashErrorDecoder implements ErrorDecoder {
    @Override
    public Exception decode(String methodKey, Response response) {
        if (response.status() >= 400 && response.status() <= 499) {
            return new StashClientException(
                response.status(),
                response.reason()
            );
        }
        if (response.status() >= 500 && response.status() <= 599) {
            return new StashServerException(
                response.status(),
                response.reason()
            );
        }
        return errorStatus(methodKey, response);
    }
}
In your case, you would react to 429 as desired.
You can forcibly open the circuit breaker by setting this property at runtime:
hystrix.command.HystrixCommandKey.circuitBreaker.forceOpen
ConfigurationManager.getConfigInstance()
    .setProperty(
        "hystrix.command.HystrixCommandKey.circuitBreaker.forceOpen", true);
Replace HystrixCommandKey with your own command. You will need to restore this circuit breaker back to closed after the desired time.
I could solve it with the following adjustments:
Properties in application.yml:
hystrix.command.commandKey:
  execution.isolation.thread.timeoutInMilliseconds: 10_000
  metrics.rollingStats.timeInMilliseconds: 10_000
  circuitBreaker:
    errorThresholdPercentage: 1
    requestVolumeThreshold: 1
    sleepWindowInMilliseconds: 3_600_000
Code in the respective Java class:
@HystrixCommand(fallbackMethod = "fallbackMethod", commandKey = COMMAND_KEY)
public void doCall(String parameter) {
    try {
        feignClient.doCall(parameter);
    } catch (FeignException e) {
        if (e.status() == 429) {
            throw new TooManyRequestsException(e.getMessage());
        }
    }
}

How to count all HTTP requests sent, retries included?

Some use cases require counting the requests sent by the Apache API. For example, when massively requesting a web API that requires authentication through an API key, and whose TOS limits the number of requests over time for each key.
More specifically, I'm requesting https://domain1/fooNeedNoKey and, depending on the data analyzed from its response, I request https://domain2/fooNeedKeyWithRequestsCountRestrictions. All of those 1-to-2-request sequences are sent through a single org.apache.http.impl.client.FutureRequestExecutionService.
As of now, depending on org.apache.httpcomponents:httpclient:4.3.3, I'm using these API elements:
org.apache.http.impl.client.FutureRequestExecutionService, to perform multi-threaded HTTP requests. It offers time metrics (how long an HTTP thread took until it terminated), but no request-counter metrics.
The following client configuration, which customizes the Apache API retry behavior:
final CloseableHttpClient httpClient = HttpClients.custom()
    // the auto-retry feature of the Apache API will retry up to 5
    // times on failure, being also allowed to send again requests
    // that were already sent if necessary (I don't really understand
    // the purpose of the second parameter below)
    .setRetryHandler(new StandardHttpRequestRetryHandler(5, true))
    // for HTTP 503 'Service unavailable' errors, also retrying up to
    // 5 times, waiting 500ms between each retry. My guess is that those
    // 5 retries are part of the previous "global" 5 retries setting.
    // The below setting, when used alone, would allow to only enable
    // retries for HTTP 503, or to get a greater count of retries for
    // this specific error
    .setServiceUnavailableRetryStrategy(new DefaultServiceUnavailableRetryStrategy(5, 500))
    .build();
Getting back to the topic:
A request counter could be created by extending the Apache API retry-related classes quoted above.
Alternatively, an unrelated Apache API support ticket tends to indicate that this request-counter metric could be made available and forwarded out of the API, into Java NIO.
Edit 1:
It looks like the Apache API won't permit this.
Quote from inside the API (RetryExec cannot be extended through the API's inputs and outputs):
package org.apache.http.impl.execchain;
public class RetryExec implements ClientExecChain {
..
public CloseableHttpResponse execute(
final HttpRoute route,
final HttpRequestWrapper request,
final HttpClientContext context,
final HttpExecutionAware execAware) throws IOException, HttpException {
..
for (int execCount = 1;; execCount++) {
try {
return this.requestExecutor.execute(route, request, context, execAware);
} catch (final IOException ex) {
..
if (retryHandler.retryRequest(ex, execCount, context)) {
..
}
..
}
}
The 'execCount' variable is the needed info, but it can't be accessed since it is only used locally.
One could also extend 'retryHandler' and manually count requests in it, but 'retryHandler.retryRequest(ex, execCount, context)' is not given the 'request' variable, making it impossible to know what we are incrementing a counter for (one may only want to increment the counter for requests sent to a specific domain).
I'm out of Java ideas for this. A third-party alternative: have the Java process poll a file on disk, managed by a shell script that counts the desired requests. That would certainly cause a lot of disk reads and be a hardware killer.
OK, the workaround was easy; the HttpContext class of the API is intended for this:
// optionally, in case your HttpClient is configured for retry
class URIAwareHttpRequestRetryHandler extends StandardHttpRequestRetryHandler {
    public URIAwareHttpRequestRetryHandler(final int retryCount, final boolean requestSentRetryEnabled)
    {
        super(retryCount, requestSentRetryEnabled);
    }
    @Override
    public boolean retryRequest(final IOException exception, final int executionCount, final HttpContext context)
    {
        final boolean ret = super.retryRequest(exception, executionCount, context);
        if (ret) {
            doForEachRequestSentOnURI((String) context.getAttribute("requestURI"));
        }
        return ret;
    }
}
// optionally, in addition to the previous one, in case your HttpClient has specific settings for the 'Service unavailable' errors retries
class URIAwareServiceUnavailableRetryStrategy extends DefaultServiceUnavailableRetryStrategy {
    public URIAwareServiceUnavailableRetryStrategy(final int maxRetries, final int retryInterval)
    {
        super(maxRetries, retryInterval);
    }
    @Override
    public boolean retryRequest(final HttpResponse response, final int executionCount, final HttpContext context)
    {
        final boolean ret = super.retryRequest(response, executionCount, context);
        if (ret) {
            doForEachRequestSentOnURI((String) context.getAttribute("requestURI"));
        }
        return ret;
    }
}
// main HTTP querying code: retain the URI in the HttpContext to make it available in the custom retry-handlers code
httpContext.setAttribute("requestURI", httpGET.getURI().toString());
try {
    httpClient.execute(httpGET, getHTTPResponseHandlerLazy(), httpContext);
    // if the request succeeded with no need of retries, or if it succeeded on the last send: in any case, this is the last query sent to the server and it succeeded
    doForEachRequestSentOnURI(httpGET.getURI().toString());
} catch (final ClientProtocolException e) {
    // if the request definitively failed after retries: it's the last query sent to the server, and it failed
    doForEachRequestSentOnURI(httpGET.getURI().toString());
} catch (final IOException e) {
    // if the request definitively failed after retries: it's the last query sent to the server, and it failed
    doForEachRequestSentOnURI(httpGET.getURI().toString());
} finally {
    // restoring the context as it was initially
    httpContext.removeAttribute("requestURI");
}
Solved.