So far my log4net writings have focused on specific aspects of the log4net framework.  These next posts are about putting the pieces together.

To that end, I would like to start with one recent example of how logging has saved hide.

The Call

My company phone starting cock-a-doodle-dooing about 3pm one afternoon.  It was the Director of Support calling - I'll refer to him as Bob - he was on-site trying to get a kiosk up and running for a major client, but a security configuration application was reporting a failure.  This application is basically a permissions engine that modifies DACLs on cryptographic keys, files, and registry keys based on a database of access rules.  The client was going live in an hour or so, so time was a critical factor.

"I've tried everything I know - the database is correct, the AD users are there, the security service is running, and I'm running as an administrator," he said nervously, "but it keeps failing at the same point on this one machine.  It worked on all the other machines, but this one kiosk keeps failing."

"Ok, let's open the log file.  It should be at <file path>.  Find it?" I asked.

"Um.... yep.  Ok, it's open."

"Alright.  Jump to the end, find the last line.  Read it to me."  

This is the layout of the log file for the application:

DATETIME LOGGERNAME [LEVEL] (CONTEXT) MESSAGE_STACKTRACE

The last two lines in the log appeared like so:

12:34:56 TaskExecutionContext [DEBUG] (ACLRuleBuilding) Attempting to locate computer domain name using WMI class [Win32_ComputerSystem] property [Domain].

12:34:56 SetCryptoKeyDACLTask [ERROR] (ACLRuleBuilding) An exception was raised while translating the user name to the domain: System.Runtime.InteropServices.COMException at ...

I had never encountered this error before in development, testing, or the two years the application's been used in the field.  However, the data in that one log message was enough for me to help Bob resolve the issue over the phone.  There were four bits of information that I gleaned from Bob's reading of that log message that allowed me to isolate the problem:

  1. The logger name told me that the SetCryptoKeyDACLTask class issued the error.
  2. The logging context told me that the error was occurring while the application was building ACL rules (as opposed to applying them).
  3. The log message told me the failure was occurring when the application was attempting to translate a username to the computer's domain.
  4. The preceding DEBUG log message indicates that the application was trying to find the computer's domain name immediately before the exception was raised.

"Hmmm, haven't seen that one before either," I said, "but it looks like the app is failing when it's trying to determine the computer's domain.  Any idea why this might be happening on this machine as opposed to the others?"

"Well, I had to add the machine to the domain and then ...."  I hear a loud *smack* as Bob hits himself in the head.  "Okay, I think I figured it out."

It turns out that Bob had neglected to reboot the kiosk after adding it to the domain, so when my application started spelunking through WMI to determine the computer's domain, an exception was being raised.  After a few minutes, the kiosk was running perfectly, and Bob and the client enjoyed a smooth go-live.

All things considered, Bob would have resolved this issue on his own eventually by blindly rebooting the machine and starting the security configuration process over again.  The point is that the application log contained enough information to isolate the problem to a very specific cause without resorting to looking at the source code.  COMExceptions are notoriously ambiguous; if that was all we had to go on, we may not have been able to determine the cause of the exception.

Some Recommended Practices

Let's take a look at some of the code that produced the log message:

namespace Log4Net_BestPractices1
{
    public class SetCryptoDACLTask : ITask
    {
        static log4net.ILog Log = log4net.LogManager.GetLogger(
            System.Reflection.MethodBase.GetCurrentMethod().DeclaringType
        );
        
        string userName;
        
        public void Execute( ITaskExecutionContext context )
        {        
            if( Log.IsDebugEnabled )
            {
                Log.DebugFormat( "Executing task under context [{0}]...", ( null == context ? "null" : context.Description ) );
            }
            
            // ...
            
            using( log4net.NDC.Push( "ACLRuleBuilding" ) )
            {
                // ...
                string principalName = null;
                
                try
                {
                    principalName =
                        context.TranslatePrincipalNameToDomain(
                            userName
                        );
                }
                catch( Exception e )                
                {
                    Log.Error(
                        "An exception was raised while translating the user name to the domain",
                        e
                    );
                    
                    throw;
                }
                
                // ...
            }
            
            // ...
            
            Log.Debug( "Task execution has completed." );
        }
    }
}

This code demonstrates some effective logging techniques I recommend you employ in your own applications:

  1. Use a unique logger object for each type in your application (see line 5).  This will automagically tag each log message with the class that wrote it.  Using this technique provides a log that reads like a novel of your application's activity.  
  2. Whenever you catch an exception, log it (see line 30).  Even if you just plan to throw it again, log it.  In addition, log4net knows how to format Exception objects, so don't try and build your own string from the exception data.
  3. Don't be afraid to pepper your application with piddly little log messages (see lines 15 and 47).  Each one becomes a marker of activity and can prove invaluable when trying to work an issue.  While logging does consume resources, that consumption can be controlled and optimized (something I'll talk about in the next post).  In all but the most CPU-intensive applications, the impact of logging won't be noticed when configured properly.
  4. Whenever you use one of the *Format overrides (see line 15), be extra-special-careful about validating the argument list.  It's very easy to forget to check for null references before using a property accessor, for example, because "it's just logging code."  
  5. Always remember that the argument list is evaluated before the logging method is called.  If an argument to the log method is expensive to obtain, be sure to guard your log statement with a check of the appropriate Is*Enabled property (see line 13).
  6. Before an object calls on a shared dependency, consider pushing a tag onto the log context stack (see line 20).  This will provide continuity in the log, allowing you to determine the caller of the shared code that logged a particular message.
  7. Whenever you use a formatted log statement, surround the format argument placeholders (the {}'s) with brackets or parentheses (see line 15).  Doing this will mark the areas in each log message that vary, making the log scan a bit easier.  In addition, it makes empty or null formatting arguments more obvious.  For instance, if the log message in line 15 was formatted with an empty context.Description value, the message would appear as "Executing task under context []..."  Without those brackets, the same message would be written as "Executing task under context ..." The former message makes it obvious that the context failed to provide a description value, the latter does not.

Coming Up

In upcoming posts I'll discuss log4net configuration scenarios and log reader clients.

In the meantime, if you have any additional logging practices you'd like to share, please feel free to comment on this entry!