Logback prudent and multiple rollovers - logback

Prudent mode lets multiple JVMs write to a single log file.
If I use a RollingFileAppender in prudent mode, how does Logback know to do the rollover only once, or to synchronize the rollovers? Does it work automagically, or should I tweak my configuration to make sure only one JVM performs rollovers?
I have extended TimeBasedRollingPolicy to upload rolled files to a backup storage, and I want to know the best way to make sure the upload is done by only one JVM. I can do it with a property or other logic, but I wanted to understand whether Logback already has a mechanism to decide "who does the rollover".
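Whether Logback itself elects a single JVM is not settled here. As one possible form of the "other logic" mentioned above, the following is a minimal sketch that uses only the Java standard library and is independent of Logback: every JVM tries to atomically create a marker file for the rolled period, and only the JVM that succeeds goes on to upload. The class name `UploadElection`, the marker file naming and the shared directory are all hypothetical, and atomic creation may not be reliable on every network file system.

```java
import java.io.IOException;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;

public class UploadElection {

    /**
     * Returns true for exactly one caller per (directory, period) pair.
     * Files.createFile fails if the file already exists, so only the
     * first JVM to create the marker should perform the upload.
     */
    public static boolean tryClaim(Path sharedDir, String period) throws IOException {
        Path marker = sharedDir.resolve("upload-" + period + ".claimed"); // hypothetical naming
        try {
            Files.createFile(marker);
            return true;
        } catch (FileAlreadyExistsException alreadyClaimed) {
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("rollover-demo");
        System.out.println(tryClaim(dir, "2024-01-01")); // first claim for this period
        System.out.println(tryClaim(dir, "2024-01-01")); // second claim for the same period
    }
}
```

A custom rolling policy could call something like `tryClaim` right after its rollover and skip the upload when it returns false.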

Related

Why shouldn't config files be changed line by line with Chef / Puppet?

Why is changing lines in a configuration file considered an anti-pattern in Chef or Puppet?
As I understand it, it is regarded as something of a bad habit. I assume that this file editing is done in some idempotent way and with advanced tools (Augeas, for example).
Why is deploying the whole files, with ERB templates, considered a preferred method?
You can find a lot of examples where dev-ops are suggesting usage of templates instead of file-editing. For example here, here, here, etc.
Actually, a large part of the DevOps community prefers accepting the system/package defaults for config files and modifying only what you need through Augeas. GitHub's DevOps team would be one example (if you happened to catch them at Puppet Conf 2012).
I think a default pattern of always using templates creates too high a maintenance load, and it almost always requires you to lock in specific versions of everything across your stack; otherwise you risk having a template that is incompatible with a newer version of that resource.
There are use cases for both options, but in general I favor the "own as little as possible" practice over the "own everything even if you don't have to" practice.
In terms of setting your system to a known state, deploying whole files is better than editing, because you are sure the file is exactly as intended when you are done.
If you are tinkering around finding potential solutions to a problem and hand edit some configuration file, you don't have to worry about the hand edit you made staying around as an uncontrolled part of your environment. The next time you run chef-client, you know that the state will be exactly as specified in the Chef recipe, and won't include your edit.
Also, it is generally harder and more complicated to robustly edit a file than it is to just generate one. You might write something that is idempotent in the basic case, but if the file contains a syntax error or something invalid, then your editing no longer works.
As always though, sometimes you don't have a choice, and editing is the only way to go.

What are logging libraries for?

This may be a stupid question, as most of my programming consists of one-man scientific computing research prototypes and developing relatively low-level libraries. I've never programmed in the large in an enterprise environment before. I've always wondered, what are the main things that logging libraries make substantially easier than just using good old fashioned print statements or file output, simple programming logic and a few global variables to determine how verbosely things get logged? How do you know when a few print statements or some basic file output ain't gonna cut it and you need a real logging library?
Logging helps debug problems especially when you move to production and problems occur on people's machines you can't control. Best laid plans never survive contact with the enemy, and logging helps you track how that battle went when faced with real world data.
Off-the-shelf logging libraries are easy to plug in and play in less than 5 minutes.
Log libraries allow for various levels of logging per statement (FATAL, ERROR, WARN, INFO, DEBUG, etc).
You can turn logging up or down at runtime to get more or less information.
In highly threaded systems, logging helps sort out which thread was doing what. Log libraries can record information such as thread names and timestamps that ordinary print statements don't capture.
Most allow you to turn on only portions of the logging to get more detail. So one system can log debug information, and another can log only fatal errors.
Logging libraries allow you to configure logging through an external file so it's easy to turn on or off in production without having to recompile, deploy, etc.
3rd party libraries usually log so you can control them just like the other portions of your system.
Most libraries allow you to log portions or all of your statements to one or many files based on criteria. So you can log to both the console AND a log file.
Log libraries allow you to rotate logs, keeping several log files based on many different criteria. For example, rotate to a new file once the log reaches 20MB and keep 10 log files around, so that log data never exceeds about 200MB.
Some log statements can be compiled in or out (language dependent).
Log libraries can be extended to add new features.
You'll want to start using a logging library when you start wanting some of these features. If you find yourself changing your program to get some of these features, you might want to look into a good log library. They are easy to learn, set up, and use, and they are ubiquitous.
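To make two of the features listed above concrete (per-statement levels and size-based rotation), here is a minimal sketch using java.util.logging, chosen only because it ships with the JDK. The class name `RotationDemo`, the logger name and the file name pattern are hypothetical; other libraries offer the same features through their own configuration.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.logging.FileHandler;
import java.util.logging.Level;
import java.util.logging.Logger;
import java.util.logging.SimpleFormatter;

public class RotationDemo {

    /** Builds a logger that writes to size-limited, rotating files in the given directory. */
    public static Logger rotatingLogger(Path dir, int maxBytes, int fileCount) throws IOException {
        Logger logger = Logger.getLogger("demo.rotation"); // hypothetical logger name
        logger.setUseParentHandlers(false);                 // keep this demo off the console
        // %g is replaced by the generation number of each rotated file.
        String pattern = dir.resolve("app.%g.log").toString();
        FileHandler handler = new FileHandler(pattern, maxBytes, fileCount, true);
        handler.setFormatter(new SimpleFormatter());
        logger.addHandler(handler);
        logger.setLevel(Level.INFO);                        // FINE and below are dropped
        return logger;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("logs");
        // Roughly the figures from the text: 20MB per file, 10 files kept.
        Logger log = rotatingLogger(dir, 20 * 1024 * 1024, 10);
        log.info("service started");
        log.fine("not written, because the logger level is INFO");
        log.warning("disk usage is high");
        System.out.println("log files are in " + dir);
    }
}
```

The same limits could be set from a logging.properties file instead, which is what makes them adjustable in production without a recompile.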
They are used in environments where the requirements for logging may change, but the cost of changing or deploying a new executable is high. (Even when you have the source code, adding a one-line logging change to a program can be infeasible because of internal bureaucracy.)
The logging libraries provide a framework that the program will use to emit a wide variety of messages. These can be described by source (e.g. the logger object it is first sent to, often corresponding to the class the event has occurred in), severity, etc.
At runtime, the actual delivery of the messages is controlled using an "easily" edited config file. In normal situations most messages may be suppressed altogether. But if the situation changes, it is a simpler fix to enable more messages, without needing to deploy a new program.
The above describes the ideal logging framework as I understand the intention; in practice I have used them in Java and Python and in neither case have I found them worth the added complexity. :-(
They're for logging things.
Or more seriously, for saving you having to write it yourself, giving you flexible options on where logs are stored (database, event log, text file, CSV, sent to a remote web service, delivered by pixies on a velvet cushion) and on what is logged at runtime, rather than having to redefine a global variable and then recompile.
If you're only writing for yourself then it's unlikely you need one, and it may introduce an external dependency you don't want, but once your libraries start to be used by others then having a logging framework in place may well help your users, and you, track down problems.
I know that a logging library is useful when I have more than one subsystem with "verbose logging," but where I only want to see that verbose data from one of them.
Certainly this can be achieved by having a global log level per subsystem, but for me it's easier to use a "system" of some sort for that.
I generally have a 2D logging environment too; "Info/Warning/Error" (etc) on one axis and "AI/UI/Simulation/Networking" (etc) on the other. With this I can specify the logging level that I care about seeing for each subsystem easily. It's not actually that complicated once it's in place, indeed it's a lot cleaner than having if my_logging_level == DEBUG then print("An error occurred"); Plus, the logging system can stuff file/line info into the messages, and then getting totally fancy you can redirect them to multiple targets pretty easily (file, TTY, debugger, network socket...).
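The two axes described above (severity on one, subsystem on the other) map naturally onto named loggers. The following is a minimal sketch with java.util.logging; the class name `SubsystemLevels` and the logger names are hypothetical, and only two subsystems are shown.

```java
import java.util.logging.ConsoleHandler;
import java.util.logging.Level;
import java.util.logging.Logger;

public class SubsystemLevels {
    // Static fields keep strong references, so configured levels are not lost to garbage collection.
    public static final Logger ROOT = Logger.getLogger("app");
    public static final Logger AI = Logger.getLogger("app.ai");
    public static final Logger NETWORKING = Logger.getLogger("app.networking");

    public static void configure() {
        if (ROOT.getHandlers().length == 0) {
            ConsoleHandler console = new ConsoleHandler();
            console.setLevel(Level.ALL);      // let the loggers, not the handler, do the filtering
            ROOT.addHandler(console);
            ROOT.setUseParentHandlers(false); // avoid duplicate output through the JVM root logger
        }
        AI.setLevel(Level.FINE);              // verbose for the subsystem under investigation
        NETWORKING.setLevel(Level.WARNING);   // only problems from this one
    }

    public static void main(String[] args) {
        configure();
        AI.fine("path search expanded a node");   // passes the AI logger's level
        NETWORKING.fine("packet received");       // dropped by the networking logger's level
        NETWORKING.warning("connection lost");    // passes
    }
}
```

Changing which subsystem is verbose is then a matter of changing one level, not touching the call sites.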

What is a Smalltalk "image"?

What is a Smalltalk "image"? Is it like serializing a Smalltalk run-time?
Most popular programming systems separate program code (in the form of class definitions, functions or procedures) from program state (such as objects or other forms of application data). They load the program code when an application is started, and any previous application state has to be recreated explicitly from configuration files or other data sources. Any settings the application programmer doesn't explicitly save, you have to set back up whenever you restart.
Many Smalltalk systems, however, do not differentiate between application data (objects) and code (classes). In fact, classes are objects themselves. Therefore most Smalltalk systems store the entire application state (including both Class and non-Class objects) in an image file. The image can then be loaded by the Smalltalk virtual machine to restore a Smalltalk-like system to a previous state.
http://en.wikipedia.org/wiki/Smalltalk#Image-based_persistence
The Smalltalk image is a very interesting beast. Look at it as a kind of immortality. Many current Smalltalk systems, Pharo, Squeak, VisualWorks among them, share a common ancestor, that is, a Smalltalk image from Xerox PARC. This common ancestor however is not some remote thing, but actually still alive in those modern systems. The modern variants were produced by sending messages to the objects in that image. Some of those messages actually morphed the current objects. Classes are full-blown objects, and creating new classes is done by sending messages to class objects. Some of the objects in a Smalltalk image may date back to 1972, when the first Smalltalk image bootstrapped! Smalltalk images never die, they just fade into something potentially fundamentally different. You should view your application building as not fundamentally different from creating a new Smalltalk version.
When the Smalltalk VM starts, it loads a saved state of objects (yes: including open file streams, windows, threads and more) from the "image" into its memory and resumes execution where it left off when the image was saved.
At any time during your work, you can "save an image" (aka: a snapshot of the current overall state) into an image file. You can keep multiple images on your disk. Useful if you work on different projects.
Images are often (but not in all Smalltalk systems) portable across architectures; for example, a Squeak image can be loaded into both a Windows and a Mac (and even an Android) Squeak VM. Images are not portable across dialects, and sometimes not across versions within a dialect.
Images usually contain everything - even the debugger, compiler, editors, browsers etc. However, for deployment, it is sometimes useful to "strip" (i.e. remove unused stuff) from an image - either to hide secrets (;-) or to make it smaller (for embedded or mobile devices).
Most Smalltalks cannot live without an image, with the exception of Smalltalk/X and (I think) S#-Smalltalk (but I am on thin ice here...)
To save and transport source code, images are not useful - use either fileout in standard format or in xml or in any other transport format (there are many).
Images are also not useful for marshalling/unmarshalling; use xml, binarystorage, databases, glorb or any other serialization method for that.
It's serialising everything in the whole system, including all development work and all user data. Everything apart from the kernel of the runtime environment.
Smalltalk, like Java, runs on a Virtual Machine running symbolic bytecode, and it contains low-level things like the garbage collector. This makes Smalltalk very portable, and also very write-once-run-anywhere.
Unsurprisingly, this was the inspiration for Java. So the Smalltalk VM (StVM) is the equivalent of the Java Runtime Environment.
In Smalltalk, everything else is stored in RAM. The codebase, which is dynamically compiled on-the-fly for the StVM. All the object data that you've built up by running your vertical and horizontal end-user apps. All the customisation you've done to the windowing environment and its appearance. All the new code you've written. A song you have loaded onto the VM to play in a music player. Any other data, code or objects you're using or have loaded in.
It's all live in the PC's memory.
Periodically, you may want to save the current state of play to disk. When you do that, you freeze the Smalltalk VM momentarily, and it copies everything into a single disk file. That disk file is called the image file, and by default it will have a .image suffix in most distributions on PCs (whether they are running Linux, MacOS, Windows or RiscOS).
It's like the way you save your work-in-progress when you are in a word processor or a spreadsheet on a typical PC. Except that this save includes the latest version of the spreadsheet code that the spreadsheet app itself is made out of.
The Smalltalk system does have other ways of securing your data. If you develop any software, or alter any of the codebase that the Smalltalk system is written in, it logs every change to disk in real-time.
You have the option of writing code, or loading an app, that can save your source code and its associated data structures to distributed source code repositories, or to repositories on your local disc. Or to relational databases. Or to object databases or the newly fashionable NoSQL databases.
Most pre-written apps backup the data to disk(s) or database(s) on-the-fly.
The image is a save of the entire Smalltalk system (apart from the Virtual Machine). The Virtual Machine is equivalent to the Java Runtime Environment. Everything else is stored in the image.
Write a new File System to access the underlying OS's discs? That's in the image. (and all the changes have also been logged to disk automatically by the Smalltalk system).
Enter a whole bunch of data into your Smalltalk image-based object database? That's in the image.
Want to do a factory-reset to your Smalltalk system? Simply go back to using the image file you received when you first installed Smalltalk. Want to save the image every hour on the hour, and then restore back to 4 hours ago? Just load the image file from four hours ago.
The image is a copy of everything that the Smalltalk system has in memory. Except for the small, unchanging, vital proportion of the system which is the Virtual Machine.
I recommend you read Pharo By Example. To quote from its first chapter,
"The current system image is a snapshot of a running Pharo system, frozen in time. It consists of two files: an .image file, which contains the state of all of the objects in the system (including classes and methods, since they are objects too), and a .changes file, which contains a log of all of the changes to the source code of the system. In Figure 1.1, these files are called pharo.image and pharo.changes."
HTH
http://book.seaside.st/book/getting-started/pharo-squeak/what-is-image
"All Smalltalk objects live in something called an image. An image is a snapshot of memory containing all the objects at a given point in time."
Second hit on google.
Put simply, a Smalltalk image is an image of the Smalltalk environment which has been saved at a given point in time. When this image is reloaded into the Smalltalk runtime system, everything is as it was at the time the image was saved.
Images saved by one Smalltalk system cannot, in general, be loaded by a different Smalltalk system.
I find image-based development incredibly empowering. If I get interrupted I can save the image, and when I get back to it I'm right back where I was. Debuggers that were open are still open, waiting to continue. There's very little "got to figure out how to get back where I was" - it's more "OK, let's continue...".
Share and enjoy.
In almost every other language (apart from ABAP as far as I have been told by some senior SAP developers), you have a clear separation:
Code you are working on which defines logic
State of the program you are running
Database: generally, the things you need as input for your code
In Smalltalk, all of this can be - note can - in the image.
In theory, you can deploy a Smalltalk application in a loaded image which brings all the data and logic and runs the application at startup.
In practice, as far as my experience goes, you tend not to do that for reasons.
If you stay inside the image, you have everything available at your disposal, for good or ill.
Classes are objects, methods are objects, so you can actually do things like add a
self halt
in a method, run some code which calls this method, change the method while it is being executed, recompile it and have the code continue.
You can also do wonderful things like passing method names as string to a method and then performing the argument without having this visible in code anywhere.
Both things are very good for learning and trying out. Not so good for production code or maintaining production code.
The thing I was struggling with in the beginning was that for instance for UI creation, you create a window in the Smalltalk image which is then created "a second time" by the OS, with window handle and everything.
Of course, you can save the window with the Smalltalk image, it will also open up again (normally), but what happens internally is that there is a list of windows (i.e. all the UI components) which have been saved with the image in their Smalltalk state.
During Image Startup, there is a process that iterates over this list and asks the OS to recreate all of them.
In theory, you can do this for everything the OS offers:
file handles, resource handles, ports and so on
In practice, you probably don't want to do that.
The companies I worked at had nice beginner tutorials of code to run before saving your image in order not to get into trouble when restarting the next day.
Ideally, you could combine the Smalltalk image concept with persistence and just store all your objects in a real database.
I don't know whether any Smalltalk dialect has done this, however.

Sensible defaults for configuration

I've recently started to play with Ruby on Rails which favours convention over configuration and relies on sensible defaults to tie various aspects of the application together.
I was thinking that if this concept of sensible default configuration were applied to configuration in general, across various frameworks, it might save some development headaches.
For example, in a .NET app I usually want to log an exception to the Windows event log using the Enterprise Library Exception Handling block, but if I don't explicitly state the behaviour I want in a config file, EL will complain. I think that if it can't find custom configuration, it should instead revert to a sensible default configuration, such as logging my exception to the event log.
Would this be a good or bad concept for frameworks to adopt for their configuration?
I work a lot with a framework that does this exact thing. My trouble with this way of working is that:
The framework grew to have an excessive number of configuration keys that are never actually set in a configuration file.
The behavior of the software sometimes becomes implicit; I want to explicitly set the system to behave a certain way instead of having it fall back on some other code path due to a 'default'.
A missed typo in a configuration key may result in a very long diagnostic session before you figure out what is going on.
When I forget to set a configuration value, I'd rather have the software tell me than have it assume some behavior that I might not be after at all.
I'd prefer a 'template' configuration file in which I change what I want and have the unchanged settings serve as the default.
Figuring out which convention the software picked can also waste a lot of time when debugging.
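The trade-off discussed in this thread can be sketched in a few lines: defaults cover whatever the user leaves unset, while a misspelled key is reported instead of silently falling back. This is a minimal sketch using java.util.Properties; the class name `ConfigWithDefaults` and the keys `log.target` and `log.level` are hypothetical.

```java
import java.util.Properties;
import java.util.Set;
import java.util.TreeSet;

public class ConfigWithDefaults {

    /** Defaults shipped with the application; keys and values are hypothetical. */
    static Properties defaults() {
        Properties d = new Properties();
        d.setProperty("log.target", "eventlog");
        d.setProperty("log.level", "ERROR");
        return d;
    }

    /** Layers user settings over the defaults and rejects keys the application does not know. */
    public static Properties load(Properties userSettings) {
        Properties defaults = defaults();
        Set<String> unknown = new TreeSet<>(userSettings.stringPropertyNames());
        unknown.removeAll(defaults.stringPropertyNames());
        if (!unknown.isEmpty()) {
            // A typo such as "log.levle" fails fast instead of silently using the default.
            throw new IllegalArgumentException("Unknown configuration keys: " + unknown);
        }
        Properties effective = new Properties(defaults); // lookups fall back to the defaults
        effective.putAll(userSettings);
        return effective;
    }

    public static void main(String[] args) {
        Properties user = new Properties();
        user.setProperty("log.level", "DEBUG");
        Properties effective = load(user);
        System.out.println(effective.getProperty("log.level"));  // value set by the user
        System.out.println(effective.getProperty("log.target")); // value taken from the defaults
    }
}
```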

What are the best practices to log an error?

Many times I saw logging of errors like these:
System.out.println("Method aMethod with parameters a:"+a+" b: "+b);
print("Error in line 88");
So, what are the best practices to log an error?
EDIT:
This is Java, but it could be C/C++, BASIC, etc.
Logging directly to the console is horrendous and frankly, the mark of an inexperienced developer. The only reason to do this sort of thing is 1) he or she is unaware of other approaches, and/or 2) the developer has not thought one bit about what will happen when his/her code is deployed to a production site, and how the application will be maintained at that point. Dealing with an application that is logging 1GB/day or more of completely unneeded debug logging is maddening.
The generally accepted best practice is to use a Logging framework that has concepts of:
Different log objects - Different classes/modules/etc can log to different loggers, so you can choose to apply different log configurations to different portions of the application.
Different log levels - so you can tweak the logging configuration to only log errors in production, to log all sorts of debug and trace info in a development environment, etc.
Different log outputs - the framework should allow you to configure where the log output is sent to without requiring any changes in the codebase. Some examples of different places you might want to send log output to are files, files that roll over based on date/size, databases, email, remoting sinks, etc.
The log framework should never, never, never throw any exceptions or errors from the logging code. Your application should not fail to load or fail to start because the log framework cannot create its log file or obtain a lock on the file (unless this is a critical requirement, maybe for legal reasons, for your app).
The eventual log framework you will use will of course depend on your platform. Some common options:
Java:
Apache Commons Logging
log4j
logback
Built-in java.util.logging
.NET:
log4net
C++:
log4cxx
Apache Commons Logging is not intended for an application's general logging. It's intended to be used by libraries or APIs that don't want to force a logging implementation on the API's user.
There are also classloading issues with Commons Logging.
Pick one of the [many] logging APIs, the most widely used probably being log4j or the Java Logging API.
If you want implementation independence, you might want to consider SLF4J, by the original author of log4j.
Having picked an implementation, then use the logging levels/severity within that implementation consistently, so that searching/filtering logs is easier.
The easiest way to log errors in a consistent format is to use a logging framework such as Log4j (assuming you're using Java). It is useful to include a logging section in your code standards to make sure all developers know what needs to be logged. The nice thing about most logging frameworks is they have different logging levels so you can control how verbose the logging is between development, test, and production.
A best practice is to use the java.util.logging framework
Then you can log messages in either of these formats
log.warning("..");
log.fine("..");
log.finer("..");
log.finest("..");
Or
log.log(Level.WARNING, "blah blah blah", e);
Then you can use a logging.properties (example below) to switch between levels of logging, and do all sorts of clever stuff like logging to files, with rotation etc.
handlers = java.util.logging.ConsoleHandler
.level = WARNING
java.util.logging.ConsoleHandler.level = ALL
com.example.blah = FINE
com.example.testcomponents = FINEST
In my opinion, frameworks like log4j and others should be avoided; Java already has everything you need.
EDIT
This can apply as a general practice for any programming language. Being able to control all levels of logging from a single property file is often very important in enterprise applications.
Some suggested best-practices
Use a logging framework. This will allow you to:
Easily change the destination of your log messages
Filter log messages based on severity
Support internationalised log messages
If you are using Java, then slf4j is now preferred to Jakarta Commons Logging as the logging facade.
As stated slf4j is a facade, and you have to then pick an underlying implementation. Either log4j, java.util.logging, or 'simple'.
Follow your framework's advice on ensuring that expensive logging operations are not needlessly carried out
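As an illustration of the last point, here is a minimal sketch of two common ways to avoid paying for a message that will never be written, using java.util.logging. The class name `LazyLogging`, the logger name and `expensiveDump()` are hypothetical stand-ins; other frameworks have their own equivalents, such as parameterized messages.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.logging.Level;
import java.util.logging.Logger;

public class LazyLogging {
    public static final Logger LOG = Logger.getLogger("demo.lazy"); // hypothetical logger name
    public static final AtomicInteger CALLS = new AtomicInteger();

    /** Stands in for an expensive operation, such as dumping a large object graph. */
    static String expensiveDump() {
        CALLS.incrementAndGet();
        return "state dump";
    }

    public static void logState() {
        // Supplier overload: expensiveDump() runs only if FINE is enabled for this logger.
        LOG.fine(() -> "Current state: " + expensiveDump());

        // Explicit guard with the same effect, for APIs that lack a lazy overload.
        if (LOG.isLoggable(Level.FINE)) {
            LOG.fine("Current state: " + expensiveDump());
        }
    }

    public static void main(String[] args) {
        LOG.setLevel(Level.WARNING);
        logState();
        System.out.println("expensive calls while FINE is disabled: " + CALLS.get());
    }
}
```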
The Apache Commons Logging API, as mentioned above, is a great resource. Referring back to Java, there is also a standard error output stream (System.err).
Directly from the Java API:
"This stream is already open and ready to accept output data. Typically this stream corresponds to display output or another output destination specified by the host environment or user. By convention, this output stream is used to display error messages or other information that should come to the immediate attention of a user even if the principal output stream, the value of the variable out, has been redirected to a file or other destination that is typically not continuously monitored."
Aside from technical considerations from other answers it is advisable to log a meaningful message and perhaps some steps to avoid the error in the future. Depending on the errors, of course.
You could get more out of an I/O error when the message states something like "Could not read from file X, you don't have the appropriate permission."
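A minimal sketch of that idea in Java follows: the log entry names the file, suggests a likely cause and keeps the original exception for its stack trace. The class name `MeaningfulErrors`, the logger name and the exact wording of the hints are hypothetical.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.AccessDeniedException;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;
import java.util.Optional;
import java.util.logging.Level;
import java.util.logging.Logger;

public class MeaningfulErrors {
    static final Logger LOG = Logger.getLogger("demo.errors"); // hypothetical logger name

    /** Builds a message that names the file and suggests a likely cause. */
    public static String describe(Path file, IOException e) {
        String hint;
        if (e instanceof NoSuchFileException) {
            hint = "the file does not exist";
        } else if (e instanceof AccessDeniedException) {
            hint = "you don't have the appropriate permission";
        } else {
            hint = "an I/O error occurred";
        }
        return "Could not read from file " + file + ": " + hint;
    }

    /** Reads the file, or logs a descriptive error and returns an empty result. */
    public static Optional<String> readText(Path file) {
        try {
            return Optional.of(new String(Files.readAllBytes(file), StandardCharsets.UTF_8));
        } catch (IOException e) {
            // Passing the exception keeps its stack trace in the log entry.
            LOG.log(Level.SEVERE, describe(file, e), e);
            return Optional.empty();
        }
    }
}
```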
See more examples on SO or search the web.
There really is no best practice for logging an error. It basically just needs to follow a consistent pattern (within the software/company/etc) that provides enough information to track the problem down. For example, you might want to keep track of the time, the method, parameters, calling method, etc.
So long as you don't just print "Error in "