Friday, 21 April 2017

Log4j Tips : Use MDC or Mapped Dignostic Context to distinguish logging per Client or Request ~ mkniit

The MDC or Mapped Diagnostic Context is a concept or feature of Log4j logging library which can be used to group related log messages together. For example, by using MDC you can stamp a unique identification String like clientId or orderId on each log message and then by using grep command in Linux, you can extract all log messages for a particular client or order to understand exactly what happened to a particular order. This is especially very useful in multi-threaded, concurrent Java applications where multiple threads are simultaneously processing multiple orders from multiple clients.  In such applications, searching for relevant log messages in a big log file where log messages for multiple orders or clients are overlapping is a big task.

The MDC concept is not a new concept and available since log4j 1.2 version. Since most of real world systems e.g. high-frequency trading application process thousands of order simultaneously where different thread runs the same piece of code but with different data, each execution is unique, but without a unique identification it's difficult to find all log messages related to processing a particular order. The MDC or Mapped Diagnostic Context feature of Log4j solves that problem by stamping unique identification string on relevant log messages.


Once you put this information on MDC at the start of your processing cycle, all log messages result with the processing of that order will be stamped with same OrderId, but processing of different order will be having different orderId, this way you can track complete processing of a particular order. You can even find if an order made to your system or not.


Why use Mapped Diagnostic Context or MDC in Log4j?

If you have worked in a multithreaded high volume transaction processing system, then you have faced a situation, where you need to trace processing of a particular order, without MDC, it's nightmare. You need to include Order specific information in every single log messages you print.

With MDC or ThreadContext from Log4j2 onwards, you just need to add that data one time, at the start of your processing, after that it's available in every log messages.

Btw, this issue is not just for multi-threaded core Java application, you will face similar issues on an enterprise or web applications, where multiple client requests are processed simultaneously by different threads. Log messages from processing two different orders may interleave together. MDC allows you to stamp each log messages with a unique identification number e.g. orderId, which is unique for every order, sessionId, which is unique for every session or username, which is unique for every user.

In short, Mapped Diagnostic Context will help you in debugging, troubleshooting, and even in auditing. It's also one of the logging best practices to employ on production systems.


Log4j also provides a similar utility called 
NDC, known as Nested Diagnostic Context, both of which are replaced by ThreadContext class in Log4j 2.

The ThreadContext class provides a Map and a Set to replace MDC and NDC. The Thread Context Map allows any number of items to be added and be identified using key/value pairs, while ThreadContextStack allows one or more items to be pushed on the Stack and then be identified by their order in the Stack or by the data itself. 

Since key/value pairs are more flexible, the Thread Context Map is recommended when data items may be added during the processing of the request or when there are more than one or two items. Btw, if you an experienced Java developer, I suggest you reading a good book e.g. Java Performance Definitive Guide By Scott Oaks to learn more about how logging impacts performance. It's very important for a seasoned developer to know what he is doing. 

Log4j Tips : Use MDC or Mapped Dignostic Context to distinguish logging per Client or Request




How to use MDC to log client specific information in log4j

You can put any information into Mapped Diagnostic Context or MDC by calling the put() method. MDC is a static class i.e. a class with just static methods, which means you can directly call them without creating any instance of MDC.

Remember, this information is stored as thread local variable, which may cause a memory leak in a web application if used incorrectly (see). If you are using MDC to log client or order specific information e.g. orderId in a web application using a filter than you can also remove it once done.

try{
  MDC.put("tradeId", trade.getId());
}finally{
  MDC.remove.remove("tradeId");
}


By the way from log4j2 onwards, MDC and NDC (Nested Diagnostic Context) are merged into a class called ThreadContext So, instead of using MDC, you need to use ThreadContext class to put unique context data.

try{
   ThreadContext.put("tradeId", trade.getId());
}finally {
   ThreadContext.clear();
}


Once available in MDC, this information can be printed in the log file using PatternLayout. You can use %X{tradeId) to print tradeId in log messages, remember tradeId for different trades will be different, which allows you to trace logging messages for individual trades. You can use following Patterns to include MDC contents in log files:
  • Use %X by itself to include the full contents of the Map.
  • Use %X{key} to include the specified key.
  • Use %x to include the full contents of the Stack

Remember MDC is managed on a per thread basis and every child thread automatically inherits a copy of the mapped diagnostic context from its parent. This is achieved by using InheritableThreadLocal class, which is a subclass of the ThreadLocal class.

This class extends ThreadLocal to provide inheritance of values from parent thread to child thread: when a child thread is created, the child receives initial values for all inheritable thread-local variables for which the parent has values.

Here is an example, where both parent and child thread is using the same value from MDC or ThreadContextMap:




Normally the child's values will be identical to the parent's; however, the child's value can be made an arbitrary function of the parent's by overriding the childValue() method


That's all guys, if you have been using Log4j 1.x or Log4j2 and not using MDC, you are missing a very important feature, especially if you often need to know what happened to a particular order, request or user. I have found putting information like OrderId, Username, RquestId, or something which can distinguish each request adds tremendous value during troubleshooting and debugging. It's also easier to trace orders and answer support queries. If you have a unique id to distinguish each message, then always include them into MDC to distinguish logging information from client or order basis.


Other Java Logging Tutorials and Best Practices you may like
  • 5 Java Performance tuning books for experienced Programmers (list)
  • Why use Log4j logging vs System.out.println in Java? (answer)
  • Why use SLF4j over log4j for logging in Java? (answer)
  • How to enable SSL debugging log in Java Virtual Machine? (tips)
  • How to configure Log4j without XML or Properties file in Java? (tutorial)
  • Difference between Functional and Non-Functional Requirement (article)
  • 10 Essential JVM Options for Real world Java Applications (tips)


Thanks for reading this article so far.

No comments:

Post a Comment