Blogs
Introduction
Recently I was asked how to implement MDC and/or NDC in Liferay 7.4, and since I thought it might be interesting to a wider audience, I'm going to blog about it here.
Background
In a multithreaded, multiuser environment such as in a web application like Liferay, different threads in the container will be handling different requests, but likely each request is doing something radically different. I could be changing my Search Blueprint for the next holiday cupcake promotion while you might be reading the blog I wrote about it.
Either way, when either the Liferay code or our custom code is logging messages, the messages just get added into the log in the order they are hit in the code, so basically my log messages and your log messages are all mixed together in the logs. For a handful of concurrent users, this is probably not a big problem, but if you imagine 1,000 concurrent users all doing different things and all generating log messages and those messages being woven together in the log, I'm sure you can see that trying to analyze and diagnose a problem from one user in this kind of log would likely be like trying to find a particular needle in a pile of needles.
To allow for meaningful interpretation and analysis of the logs, major logging frameworks have support for MDC or Mapped Diagnostic Context, and/or NDC or Nested Diagnostic Context.
With a diagnostic context, the code is able to add information on a thread-by-thread basis that can be included in a log message. So I, as the developer, can define a diagnostic context that includes the current user's screen name at the thread level along with a log format to include that detail in every message that is logged.
In each of the diagnostic contexts, information is added at the thread level and will remain with the thread until it is cleared, so if I am adding the current user screen name in the diagnostic context at the start of servicing a request, I have to remember to clear it when done otherwise the value I set might be used in the next thread. I also want to clear them up to avoid leaking memory or sensitive details across reuse of threads by the application server.
Updating Log4j2 Configuration
Liferay 7.4 leverages Log4j2 for the internal logging framework and to get the diagnostic context added to the log messages we'll need to update the log4j2 configuration.
Fortunately for us, Liferay makes this kind of easy. All we need to do is create the $LIFERAY_HOME/tomcat-9.0.56/webapps/ROOT/WEB-INF/classes/META-INF/portal-log4j-ext.xml file with our configuration overrides.
You want to start from Liferay's base file and then make the necessary pattern changes to include either the MDC or NDC you're after. Here's the initial part of the file I am using for NDC:
<?xml version="1.0"?> <Configuration strict="true"> <Appenders> <Appender name="CONSOLE" type="Console"> <Layout pattern="%d{yyyy-MM-dd HH:mm:ss.SSS} %-5p [%t][%c{1}:%L] %x %m%n" type="PatternLayout" /> </Appender> <Appender filePattern="@liferay.home@/logs/liferay.%d{yyyy-MM-dd}.log" ignoreExceptions="false" name="TEXT_FILE" type="RollingFile"> <Layout pattern="%d{yyyy-MM-dd HH:mm:ss.SSS} %-5p [%t][%c{1}:%L] %x %m%n" type="PatternLayout" /> ...
When I'm using the MDC pattern I'd use something like
%X{SN}
and I'll get only the screen name right before the message.
When I'm using the NDC pattern, I'll get all the NDC context before the message.
So now we know the basics of MDC and NDC plus we know how to reconfigure Liferay to use the updated patterns, but what do we do in the code to leverage them?
MDC vs NDC
First let's start with what they share... Both of these connect data to the current thread and both must be cleaned up when finished. They also both leverage the ThreadContext object to add/remove context values.
The difference is really that NDC values are simple strings, while an MDC is a map of key/value pairs. Since NDC is a little simpler, I'll start showing that one before the MDC.
So I might use an NDC like:
ThreadContext.push("SN=" + user.getScreenName());
or the MDC equivalent of:
ThreadContext.put("SN", user.getScreenName());
On the logging side, we use the %x
conversion
pattern for the NDC messages, or we use the
%X{key}
conversion pattern for the MDC messages.
So let's say I want to build a service wrapper around the
RoleLocalService
so I could identify who is adding roles
(kind of like an audit, but I'm just writing to the logs).
I'd use a RoleLocalServiceWrapper
and in my
addRole()
method override I'd have to set up and then
clean up my diagnostic context...
NDC Version
Here's what I would do for my addRole()
method
logging for NDC:
public Role addRole( long userId, String className, long classPK, String name, java.util.Map titleMap, java.util.Map descriptionMap, int type, String subtype, ServiceContext serviceContext) throws com.liferay.portal.kernel.exception.PortalException { // set up the context with the user's screen name. User user = _userLocalService.getUser(userId); ThreadContext.push("SN=" + user.getScreenName()); // let the super add the role Role addedRole = super.addRole( userId, className, classPK, name, titleMap, descriptionMap, type, subtype, serviceContext); _log.info("Added role " + name); // remove the screen name we pushed ThreadContext.pop(); return addedRole; }
With this code in place, any time a user action invoked the addRole() call (and using the correct log4j2 config from the previous section), I'd see a message like:
2022-02-14 06:08:03.663 INFO [main][MyRoleLocalServiceWrapper:134] [SN=omniadmin] Added role NewRole
It starts to get ugly as I might add additional NDC context for current layout, current site, etc, as all of those context strings would be tacked onto my log message; I can't really be selective about what to include for the NDC usage.
MDC Version
One feature of the MDC is that we can be selective about what we
want on a log message. In the log4j2.properties file example above,
the MDC conversion pattern includes %X{SN}
which
basically says to use the value of the SN
in the thread
context map in the log message.
The MDC code that I would use is:
public Role addRole( long userId, String className, long classPK, String name, java.util.Map titleMap, java.util.Map descriptionMap, int type, String subtype, ServiceContext serviceContext) throws com.liferay.portal.kernel.exception.PortalException { // set up the context map with the user's screen name. User user = _userLocalService.getUser(userId); ThreadContext.put("SN", user.getScreenName()); // let the super add the role Role addedRole = super.addRole( userId, className, classPK, name, titleMap, descriptionMap, type, subtype, serviceContext); _log.info("Added role " + name); // remove the screen name we stored earlier ThreadContext.remove("SN"); return addedRole; }
And my log file message will now look like:
2022-02-14 06:08:03.663 INFO [main][MyRoleLocalServiceWrapper:134] [omniadmin] Added role NewRole
Now it doesn't matter what is in the thread context
map, only the SN
key will be included in the log
because that's all the conversion pattern includes.
Manual Version
Some developers out there might be asking themselves, well what the heck is wrong with just changing the code so the log message just reads:
_log.info("[" + user.getScreenName() + "] Added role " + name);
This absolutely creates the same message as we generated for the
MDC example, but we didn't have to change the logging config and we
didn't have to worry about the ThreadContext
setup or cleanup.
The issue is in any messages that might actually be logged by
Liferay's RoleLocalServiceImpl
's addRole()
method. Since you're not changing all logging messages to
look up the user and tack their screen name at the end of every
message, those other messages will not have the screen name on them.
Using the MDC or NDC option, any message logged between the two
ThreadContext
method calls will include the users screen
name in the log, that includes the RoleLocalServiceImpl
class, any ModelListener<Role>
classes that might
be logging, etc.
So sure, it can be more work to add the MDC/NDC handling, but I hope you can see the benefits of including the diagnostic contexts to analyze the flow of control from a single thread through the addRole() call.
So Which One Wins?
Manual is clearly out, but for choosing between the MDC and the NDC I think it depends upon who is in control of what will be logged...
The layout pattern in the log4j2.properties (or XML or YAML or whatever you use) is going to be fixed, right?
If I'm the system admin and I want to control what is going into
the log, I might want to use MDC and the %X{SN}
pattern
to only include the screen name in the logs. If the developers want to
define a new value, say Site
that they want to include,
they'll need to come to me and convince me to add it and maybe my
pattern would become %X{SN} %X{Site}
.
On the other hand, if I don't want to be bothered with changing
the log patterns every time a developer thinks they need some new
value to be added and reported in the log message, I might want to
use NDC and the %x
pattern. The developers will be
responsible if they put too much junk in there, but after all I did
put the responsibility on their shoulders by using NDC. At any time
the developers could do a
ThreadContext.push("Site=blah")
and that value
would automatically start showing up in the logs along with the SN
.
Where to put MDC/NDC Code
In my cheesy example above, I was setting the MDC or NDC around
the addRole()
message. Maybe this is okay because I'm
really only concerned with setting up the context only for messages
logged in this particular method.
More often than not, though, you're probably going to want to
put your ThreadContext
code in places that will offer
the widest coverage that you're interested in. Some prime examples are:
- A PortletFilter implementation - here you could set up the MDC/NDC around interactions with a specific portlet and not have it across everything in the logs.
- A Servlet Filter implementation - here you're looking more at the full request handling scope, so every message generated in a thread while servicing the request would include the diagnostic context. This can be a pretty wide net though as you could end up getting called for not just page rendering and actions, but also for web service calls, internal cluster messaging, etc.
I prefer to look at a ServicePreAction/ServicePostAction pair.
These actions are invoked on the way in to a new service call and on the way out of the service call, so it tends to be a great way to set up the context before anything significant happens and then clean it up after the service completes.
We can define a ServicePreAction in 7.4 as follows:
@Component( immediate=true, property="key=servlet.service.events.pre", service=LifecycleAction.class ) public class NDCSetupServicePreAction extends Action { public void run(HttpServletRequest request, HttpServletResponse response) { try { // if we are paranoid, we might want to add // ThreadContext.clearAll() call here to ensure nothing // from the previous thread is still in the context for this // thread. ThemeDisplay themeDisplay = (ThemeDisplay) request.getAttribute(WebKeys.THEME_DISPLAY); User user = null; if (themeDisplay != null) { user = themeDisplay.getUser(); } if (user == null || user.isDefaultUser()) { ThreadContext.push("Guest"); } else { ThreadContext.push(user.getScreenName()); } } catch (Exception e) { ThreadContext.push("Unknown"); } } }
In this pre-action implementation, we grab the
ThemeDisplay
from the request attributes (should already
be set up by the time this is called), get the User
from
the ThemeDisplay
instance, and if the user is not the
guest user we push their screen name, otherwise we just push the word
Guest onto the ThreadContext
.
This is only invoked on the way into the service handling by the portal, so we can't do the cleanup in this class. Instead, we'll include a ServicePostAction in a similar way but to handle the cleanup:
@Component( immediate=true, property="key=servlet.service.events.post", service=LifecycleAction.class ) public class NDCCleanupServicePostAction extends Action { public void run(HttpServletRequest request, HttpServletResponse response) { ThreadContext.clearAll(); } }
The great part about this implementation for the cleanup, it
really won't matter where else in the code that a developer puts a
ThreadContext
push()
or put()
call, this cleanup post action will always clean up the
thread context before the thread starts to handle a new request.
If you're going to be using either MDC or NDC in your code, I would absolutely recommend including this cleanup service post action to purge the thread context at the end of every service request.
Fish Tagging
Yes, that's right, Fish Tagging. If you've not heard of this before, I'm not talking about catching fish, tagging them, and then throwing them back into the river.
Neil Harrison described this method in the book Pattern Languages of Program Design 3 , and basically it is a technique to lightly stamp a unique identifier on a request initiated from the same client transaction.
We can do the same thing with an updated Service Pre Action:
@Component( immediate=true, property="key=servlet.service.events.pre", service= LifecycleAction.class ) public class FishTaggingServicePreAction extends Action { public void run(HttpServletRequest request, HttpServletResponse response) { ThreadContext.push(PortalUUIDUtil.generate()); } }
This is an extremely lightweight method for putting a unique identifier at the beginning of each log message generated in a service call, allowing you to gather all associated messages and filter out the ones you don't need. It also avoids bleeding PII or other sensitive info in the logs.
You'd still use the NDCCleanupServicePostAction
to
strip out the UUID from the ThreadContext
to prevent
mis-tagging messages.
Cleanup, Cleanup, Cleanup
So I can't really emphasize this enough because it is super
important to clean up the ThreadContext
or else you could
end up blowing out memory.
For example, if I only deployed the module with
the NDCSetupServicePreAction
class and I tested on my
local to see that my screen name was included in the log message, I
would think it was working and not know about a possible problem that exists.
If I were to just start navigating around the portal page, each
navigation would trigger the NDCSetupServicePreAction
and
each time I would end up invoking
ThreadContext.push(user.getScreenName())
adding more and
more values to the ThreadContext
until eventually things
would start failing.
Cleanup is important because it would prevent this kind of potential failure from happening. You don't want to clean up too soon lest you get log messages without the diagnostic context in them, but you absolutely want to do them when you are technically done with them.
Conclusion
So here we are, we've successfully added MDC and/or NDC into our Liferay 7.4 platform. Going forward our log messages will each be stamped with either Guest or the user's screen name. When you're called up to figure out what went wrong with Sue's request, you should be able to find all the log messages stamped with her screen name and filter out the ones that belong to Tom, Dick and Harry.
Not all environments will need MDC and/or NDC support. But for those out there that do need this support, I hope this blog helps you to set it up and use it correctly.