Logback: Exception when the file grows beyond maxFileSize

I have the rolling policy below in my logback.xml.
The problem is that when the file grows beyond 10 MB, an exception is thrown.
It looks like logback tries to create a new file, but since a file for the same date already exists it cannot, and throws the exception.
For example, if pvExport.2016-05-15.log already exists and pvExport.log grows beyond 10 MB, logback would try to create a file with the same name, pvExport.2016-05-15.log, and hence throw the exception (not sure, though).
<rollingPolicy class="ch.qos.logback.core.rolling.TimeBasedRollingPolicy">
<!-- rollover daily -->
<fileNamePattern>${EXPORT_LOG_HOME}/pvExport.%d{yyyy-MM-dd}.log
</fileNamePattern>
<timeBasedFileNamingAndTriggeringPolicy
class="ch.qos.logback.core.rolling.SizeAndTimeBasedFNATP">
<maxFileSize>10MB</maxFileSize>
</timeBasedFileNamingAndTriggeringPolicy>
</rollingPolicy>

Your fileNamePattern is actually invalid in this case. From the docs:
Note the "%i" conversion token in addition to "%d". Both the %i and %d tokens are mandatory. Each time the current log file reaches maxFileSize before the current time period ends, it will be archived with an increasing index, starting at 0.
Adding the %i conversion token to your pattern should fix this:
<fileNamePattern>${EXPORT_LOG_HOME}/pvExport.%d{yyyy-MM-dd}.%i.log</fileNamePattern>

Related

Kafka Consumer - How to set fetch.max.bytes higher than the default 50mb?

I want my consumers to process large batches, so I aim to have the consumer listener "wake up" on, say, 1800mb of data or every 5min, whichever comes first.
Mine is a kafka-springboot application, the topic has 28 partitions, and this is the configuration I explicitly change:
Parameter                 | Value I set | Default Value | Why I set it this way
fetch.max.bytes           | 1801mb      | 50mb          | fetch.min.bytes + 1mb
fetch.min.bytes           | 1800mb      | 1b            | desired batch size
fetch.max.wait.ms         | 5min        | 500ms         | desired cadence
max.partition.fetch.bytes | 1801mb      | 1mb           | unbalanced partitions
request.timeout.ms        | 5min + 1sec | 30sec         | fetch.max.wait.ms + 1sec
max.poll.records          | 10000       | 500           | 1500 found too low
max.poll.interval.ms      | 5min + 1sec | 5min          | fetch.max.wait.ms + 1sec
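In plain Kafka consumer terms, these settings would look roughly like the sketch below (byte and millisecond values spelled out for illustration only; in a Spring Boot application they would normally be supplied through the spring.kafka.consumer properties instead):

import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerConfig;

// Rough sketch of the table above expressed as plain consumer properties.
public class LargeBatchConsumerConfig {
    public static Properties build() {
        Properties props = new Properties();
        props.put(ConsumerConfig.FETCH_MAX_BYTES_CONFIG, 1801 * 1024 * 1024);           // fetch.min.bytes + 1mb
        props.put(ConsumerConfig.FETCH_MIN_BYTES_CONFIG, 1800 * 1024 * 1024);           // desired batch size
        props.put(ConsumerConfig.FETCH_MAX_WAIT_MS_CONFIG, 5 * 60 * 1000);              // desired cadence: 5min
        props.put(ConsumerConfig.MAX_PARTITION_FETCH_BYTES_CONFIG, 1801 * 1024 * 1024); // unbalanced partitions
        props.put(ConsumerConfig.REQUEST_TIMEOUT_MS_CONFIG, 5 * 60 * 1000 + 1000);      // fetch.max.wait.ms + 1sec
        props.put(ConsumerConfig.MAX_POLL_RECORDS_CONFIG, 10000);                       // 1500 found too low
        props.put(ConsumerConfig.MAX_POLL_INTERVAL_MS_CONFIG, 5 * 60 * 1000 + 1000);    // fetch.max.wait.ms + 1sec
        return props;
    }
}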
Nevertheless, when I produce ~2gb of data to the topic, I see the consumer listener (a Batch Listener) being called many times per second -- way more often than the desired rate.
I logged the serialized-size of the ConsumerRecords<?,?> argument, and found that it is never more than 55mb.
This hints that I was not able to set fetch.max.bytes above the default 50mb.
Any idea how I can troubleshoot this?
Edit:
I found this question: Kafka MSK - a configuration of high fetch.max.wait.ms and fetch.min.bytes is behaving unexpectedly
Is it really impossible as stated?
Finally, I found the cause.
There is a broker-side fetch.max.bytes setting, and it defaults to 55mb. I had only changed the consumer properties, unaware of the broker-side limit.
See also the Kafka KIP and the actual commit.
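If you want to confirm what the broker is actually using, here is a rough sketch with the Java AdminClient (the bootstrap address and broker id "0" are placeholders, and the property only exists on brokers recent enough to include that KIP):

import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.Config;
import org.apache.kafka.common.config.ConfigResource;

// Rough sketch: read the broker-side fetch.max.bytes to see whether it is still at its default.
public class CheckBrokerFetchMaxBytes {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "broker:9092"); // placeholder
        try (AdminClient admin = AdminClient.create(props)) {
            ConfigResource broker = new ConfigResource(ConfigResource.Type.BROKER, "0");
            Config config = admin.describeConfigs(Collections.singleton(broker))
                                 .all().get().get(broker);
            System.out.println("fetch.max.bytes = " + config.get("fetch.max.bytes").value());
        }
    }
}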

JMeter: Capturing Throughput in Command Line Interface Mode

In JMeter v2.13, is there a way to capture Throughput in non-GUI/command-line mode?
I have the jmeter.properties file configured to output via the Summariser and I'm also outputting another [more detailed] .csv results file.
call ..\..\binaries\apache-jmeter-2.13\bin\jmeter -n -t "API Performance.jmx" -l "performanceDetailedResults.csv"
The performanceDetailedResults.csv file provides:
timeStamp
elapsed time
responseCode
responseMessage
threadName
success
failureMessage
bytes sent
grpThreads
allThreads
Latency
However, no amount of tweaking the .properties file or the test itself seems to provide Throughput results like I get via the GUI Summary Report's Save Table Data button.
All articles, postings, and blogs seem to indicate it wasn't possible without manual manipulation in a spreadsheet. But I'm hoping someone out there has figured out a way to do this with no, or minimal, manual manipulation as the client doesn't want to have to manually calculate the Throughput value each time.
Throughput is calculated by JMeter Listeners, so it isn't something you can enable via properties files. The same applies to other computed metrics, such as:
Average response time
50, 90, 95, and 99 percentiles
Standard Deviation
Basically, throughput is calculated simply by dividing the total number of requests by the elapsed time.
Throughput is calculated as requests/unit of time. The time is calculated from the start of the first sample to the end of the last sample. This includes any intervals between samples, as it is supposed to represent the load on the server.
The formula is: Throughput = (number of requests) / (total time)
Hopefully it won't be too hard for you.
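If you do need to automate it, a minimal sketch that computes overall throughput from the detailed CSV, assuming it has a header row and the default column order where timeStamp (epoch millis) is the first field and elapsed (millis) the second:

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;

// Rough sketch: throughput = number of samples / (end of last sample - start of first sample).
public class ThroughputFromCsv {
    public static void main(String[] args) throws IOException {
        List<String> lines = Files.readAllLines(Paths.get("performanceDetailedResults.csv"));
        long firstStart = Long.MAX_VALUE, lastEnd = Long.MIN_VALUE, samples = 0;
        for (String line : lines.subList(1, lines.size())) {   // skip the header row
            String[] cols = line.split(",");
            long start = Long.parseLong(cols[0]);               // timeStamp
            long elapsed = Long.parseLong(cols[1]);             // elapsed
            firstStart = Math.min(firstStart, start);
            lastEnd = Math.max(lastEnd, start + elapsed);
            samples++;
        }
        double seconds = (lastEnd - firstStart) / 1000.0;
        System.out.printf("Throughput: %.2f requests/second%n", samples / seconds);
    }
}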
References:
Glossary #1
Glossary #2
Did you take a look at JMeter-Plugins?
This tool can generate an aggregate report from the command line.
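For example, after installing the plugins, something along these lines should produce an aggregate report (including throughput) from the detailed results file; treat the exact tool name and options as a sketch and check the plugin documentation for your version:

call JMeterPluginsCMD.bat --generate-csv aggregateReport.csv --input-jtl performanceDetailedResults.csv --plugin-type AggregateReport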

Kafka SparkStreaming configuration specify offsets/message list size

I am fairly new to both Kafka and Spark and am trying to write a job (either streaming or batch). I would like to read a predefined number of messages (say x) from Kafka, process the collection through the workers, and only then start working on the next set of x messages. Basically, each message in Kafka is 10 KB and I want to put 2 GB worth of messages in a single S3 file.
So is there any way of specifying the number of messages that the receiver fetches?
I have read that I can specify 'from offset' while creating DStream, but this use case is somewhat different. I need to be able to specify both 'from offset' and 'to offset'.
There's no way to set an ending offset as an initial parameter (as you can for the starting offset), but you can use createDirectStream (the fourth overloaded version in the listing), which gives you the ability to get the offsets of the current micro-batch using HasOffsetRanges (which gives you back OffsetRange objects).
That means you'll have to compare the values you get from OffsetRange with your ending offset in every micro-batch to see where you are and when to stop consuming from Kafka.
You also need to keep in mind that each partition has its own sequential offset. It would probably be easiest to go a bit over 2 GB, as much as it takes to finish the current micro-batch (which could be a couple of kB, depending on the density of your messages), to avoid splitting the last batch into a consumed and an unconsumed part, which might require you to fiddle with the offsets Spark keeps in order to track what has been consumed and what hasn't.
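For illustration, a minimal sketch in Java using the spark-streaming-kafka-0-10 direct stream API (the topic, bootstrap address, group id, and end offset are placeholders; the answer above refers to an older createDirectStream overload, but the idea is the same):

import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicBoolean;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.spark.SparkConf;
import org.apache.spark.streaming.Durations;
import org.apache.spark.streaming.api.java.JavaInputDStream;
import org.apache.spark.streaming.api.java.JavaStreamingContext;
import org.apache.spark.streaming.kafka010.ConsumerStrategies;
import org.apache.spark.streaming.kafka010.HasOffsetRanges;
import org.apache.spark.streaming.kafka010.KafkaUtils;
import org.apache.spark.streaming.kafka010.LocationStrategies;
import org.apache.spark.streaming.kafka010.OffsetRange;

// Rough sketch: inspect the offsets of each micro-batch and stop once a target offset is passed.
public class BoundedKafkaRead {
    public static void main(String[] args) throws InterruptedException {
        SparkConf conf = new SparkConf().setAppName("bounded-kafka-read");
        JavaStreamingContext jssc = new JavaStreamingContext(conf, Durations.seconds(30));

        Map<String, Object> kafkaParams = new HashMap<>();
        kafkaParams.put("bootstrap.servers", "broker:9092");   // placeholder
        kafkaParams.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        kafkaParams.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        kafkaParams.put("group.id", "bounded-reader");          // placeholder

        long endOffset = 200_000L;                               // hypothetical per-partition target
        AtomicBoolean reachedEnd = new AtomicBoolean(false);

        JavaInputDStream<ConsumerRecord<String, String>> stream = KafkaUtils.createDirectStream(
                jssc,
                LocationStrategies.PreferConsistent(),
                ConsumerStrategies.<String, String>Subscribe(Collections.singletonList("my-topic"), kafkaParams));

        stream.foreachRDD(rdd -> {
            OffsetRange[] ranges = ((HasOffsetRanges) rdd.rdd()).offsetRanges();
            // ... process the batch here, e.g. write it to S3 ...
            if (Arrays.stream(ranges).allMatch(r -> r.untilOffset() >= endOffset)) {
                reachedEnd.set(true);
            }
        });

        jssc.start();
        // Stop from the driver thread (not from inside a batch) once the target offset has been passed.
        while (!reachedEnd.get()) {
            jssc.awaitTerminationOrTimeout(10_000);
        }
        jssc.stop(true, true);
    }
}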
Hope this helps.

How to finalize log file just after time is over when using logback SizeAndTimeBasedFNATP?

My goal is size- (10 MB) and time- (daily) based compressed (zip) archiving, so I wrote the config like this:
<appender name="Behavior" class="ch.qos.logback.core.rolling.RollingFileAppender">
<Encoding>UTF-8</Encoding>
<rollingPolicy class="ch.qos.logback.core.rolling.TimeBasedRollingPolicy">
<FileNamePattern>${LOG_HOME}/%d{yyyy-MM-dd}.%i.log.zip</FileNamePattern>
<timeBasedFileNamingAndTriggeringPolicy class="ch.qos.logback.core.rolling.SizeAndTimeBasedFNATP">
<maxFileSize>10MB</maxFileSize>
</timeBasedFileNamingAndTriggeringPolicy>
</rollingPolicy>
<layout class="ch.qos.logback.classic.PatternLayout">
<pattern>%d{yyyy-MM-dd HH:mm:ss.SSS} %logger{50} - %msg%n
</pattern>
</layout>
</appender>
But I have a problem. For example, today is Aug 10th, so logback is writing the log file "2013-08-10.0.log".
However, the log file won't be finalized (i.e. closed and compressed to "2013-08-10.0.log.zip") at Aug 11th 0:00:00. In fact, it won't be finalized until the first record after Aug 10th is written.
This means that if the first record after Aug 10th is written at Aug 11th 16:00:00, I can't get "2013-08-10.0.log.zip" when I scan the directory between Aug 11th 0:00:00 and 16:00:00. I can only get "2013-08-10.0.log", and I can't be sure whether it is finished.
What can I do to finalize the log file as soon as the time period is over?
According to the logback manual, the rollover is triggered by the first log event AFTER the rollover time, not at the time itself:
"For various technical reasons, rollovers are not clock-driven but depend on the arrival of logging events. For example, on 8th of March 2002, assuming the fileNamePattern is set to yyyy-MM-dd (daily rollover), the arrival of the first event after midnight will trigger a rollover. If there are no logging events during, say 23 minutes and 47 seconds after midnight, then rollover will actually occur at 00:23'47 AM on March 9th and not at 0:00 AM. Thus, depending on the arrival rate of events, rollovers might be triggered with some latency." (http://logback.qos.ch/manual/appenders.html)
So there is no configuration-only way to achieve what you intend. If it is that much of an issue, you could implement code in your application that sends a logging event right after midnight to make sure the rollover is triggered in a timely fashion. If you have no access to the main application's code, you could write a small separate application that simply waits and sends that one logging event right after midnight every day, using the same appender.
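A minimal sketch of that workaround, assuming SLF4J on the classpath (the logger name, message text, and the one-minute offset after midnight are arbitrary choices):

import java.time.Duration;
import java.time.LocalDate;
import java.time.LocalDateTime;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

// Rough sketch: emit one log event shortly after midnight so the time-based rollover
// (and the zip compression) is triggered promptly instead of waiting for the next "real" event.
public class MidnightRolloverKicker {
    private static final Logger log = LoggerFactory.getLogger(MidnightRolloverKicker.class);

    public static void start() {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        LocalDateTime firstRun = LocalDate.now().plusDays(1).atStartOfDay().plusMinutes(1);
        long initialDelayMs = Duration.between(LocalDateTime.now(), firstRun).toMillis();
        scheduler.scheduleAtFixedRate(
                () -> log.info("rollover heartbeat"),  // any event reaching the appender will do
                initialDelayMs,
                TimeUnit.DAYS.toMillis(1),
                TimeUnit.MILLISECONDS);
    }
}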

How to limit the number of files per day when using Logback SizeAndTimeBasedFNATP

When using the Logback SizeAndTimeBasedFNATP triggering policy, how can the number of files per day be limited? For example, on any given day, I don't want to have more than 100MB of logs. Given that each log (in the example below) is 20MB, I would want to be able to set a max limit of 5 files per day.
The FixedWindowRollingPolicy provides a maxIndex property, but the TimeBasedRollingPolicy does not have maxIndex. Is there a recommended approach to applying a maxIndex when using the TimeBasedRollingPolicy?
<appender name="some.file" class="ch.qos.logback.core.rolling.RollingFileAppender">
<rollingPolicy class="ch.qos.logback.core.rolling.TimeBasedRollingPolicy">
<fileNamePattern>logs/some_app_%d{yyyyMMdd}.log.%i</fileNamePattern>
<timeBasedFileNamingAndTriggeringPolicy class="ch.qos.logback.core.rolling.SizeAndTimeBasedFNATP">
<maxFileSize>20MB</maxFileSize>
</timeBasedFileNamingAndTriggeringPolicy>
</rollingPolicy>
<encoder>
<pattern>%level %date{yyyy-MM-dd HH:mm:ss:SSS} %msg%n</pattern>
</encoder>
Currently this is not possible. Look at this answer: Logback, set max history files per day.
You cannot limit the number of files per day when using the time- and size-based rolling/triggering policy.
It's possible to limit the total size of the logs with ch.qos.logback.core.rolling.SizeAndTimeBasedRollingPolicy:
<rollingPolicy class="ch.qos.logback.core.rolling.SizeAndTimeBasedRollingPolicy">
<fileNamePattern>log/log.%d{yyyy-MM-dd}.%i.log</fileNamePattern>
<!-- each file should be at most 100MB, keep 60 days worth of history, but at most 20GB -->
<maxFileSize>100MB</maxFileSize>
<maxHistory>60</maxHistory>
<totalSizeCap>20GB</totalSizeCap>
</rollingPolicy>
From the logback docs.