Improve your Liferay and Tomcat logging

I recently started working on integrating Liferay with ElasticSearch + Logstash + Kibana, also known as ELK Stack. Every Liferay developer knows very well where to search for logs and what they exactly mean, but very few had time to look closer at the mechanism as a whole. Therefore, I decided that the first step will be to analyze current logging system in Liferay (bundled with Tomcat, as that is still the most common type of deployment) and make some improvements that will make it work smoother with Logstash - the tool responsible for parsing logs and feeding ElasticSearch with their's JSON representation.

All concepts presented in this article were tested on Liferay Portal 6.2 CE GA5 so they should work without any problems on all 6.2 releases, and with some minor changes on 6.1.

How does it work on fresh installation?

By default, Liferay/Tomcat prints all their log messages into the following files:

  • liferay-portal-6.2-ce-ga5/logs/liferay.2016-01-16.log - Standard Liferay Log4j log, configured in portal-log4j.xml from portal-impl.jar.
  • liferay-portal-6.2-ce-ga5/tomcat-7.0.62/logs/
    • catalina.out - Tomcat's stdout and stderr streams directed to this file by catalina.sh during startup.
    • catalina.2016-01-16.log - Standard Tomcat logs configured in liferay-portal-6.2-ce-ga5/tomcat-7.0.62/conf/logging.properties.
    • host-manager.2016-01-16.log and manager.2016-01-16.log - Tomcat's Manager webapp logs (not present in Liferay/Tomcat bundle), configured via liferay-portal-6.2-ce-ga5/tomcat-7.0.62/conf/logging.properties.
    • localhost.2016-01-16.log - Another Tomcat log configured in liferay-portal-6.2-ce-ga5/tomcat-7.0.62/conf/logging.properties.
    • localhost_access_log.2016-01-16.txt - Tomcat access log configured in liferay-portal-6.2-ce-ga5/tomcat-7.0.62/conf/server.xml.

It turns out that log configuration is stored in four different places! Additionally there are two other logging-related files - log4j.properties and logging.properties in folder liferay-portal-6.2-ce-ga5/tomcat-7.0.62/webapps/ROOT/WEB-INF/classes, but they don't seem to be used anywhere.

What is wrong with default configuration?

The first problem is redundancy - all Tomcat and Liferay logs are both written to their own files and standard output. It means that all entries from catalina.2016-01-16.log and liferay.2016-01-16.log are at the same time written to catalina.out which is not rolled-over! According to what Tomcat documentation says "CATALINA_BASE/logs/catalina.out does not rotate. But it should not be an issue because nothing should be printing to standard output since you are using a logging package, right?".

The other problem is standard log format, which for Tomcat it looks like this:

Dec 08, 2015 8:19:57 PM org.apache.coyote.AbstractProtocol init
INFO: Initializing ProtocolHandler ["http-bio-8080"]
Dec 08, 2015 8:19:57 PM org.apache.coyote.AbstractProtocol init
INFO: Initializing ProtocolHandler ["ajp-bio-8009"]
Dec 08, 2015 8:19:57 PM org.apache.catalina.startup.Catalina load
INFO: Initialization processed in 1387 ms

and for Liferay:

20:21:22,208 INFO  [localhost-startStop-1][DialectDetector:71] Determine dialect for PostgreSQL 9
20:21:22,318 INFO  [localhost-startStop-1][DialectDetector:136] Found dialect org.hibernate.dialect.PostgreSQLDialect
20:27:12,862 INFO  [localhost-startStop-1][StartupAction:97] There are no patches installed
20:27:14,270 INFO  [localhost-startStop-1][BaseDB:481] Database supports case sensitive queries
20:27:15,165 INFO  [localhost-startStop-1][VerifyProcess:65] Verifying com.liferay.portal.verify.VerifyProcessSuite

Both don't provide enough information for Logstash, first will not work when locale are different than 'en' and second does not even print a full date.

How can we improve it?

There are two common improvements that can be applied either to the Tomcat server and to Liferay webapp.

  • Remove logging redundancy - turn off appenders/handlers responsible for writing messages to stdout
  • Update datetime pattern to include full date, time and timezone (if possible)

Other changes are module-specific and have been described in the next two sections.

Tomcat

Tomcat logging is configured in liferay-portal-6.2-ce-ga5/tomcat-7.0.62/conf/logging.properties. In this case, we should take the following steps:

  • Remove java.util.logging.ConsoleHandler responsible for logging to stdout (occurs in several places in configuration).
  • Introduce new SimpleFormatter date pattern. Unfortunately I didn't manage to configure ISO8601 timezone (formatted as +0000 or +00:00), so in this case it wil be necessary to indicate proper timezone later, in Logstash settings.
  • Remove unnecessary log files for standard Tomcat webapps - manager/host-manager.

Updated logging.properties is available below (check out gist revisions to see what changes have been made to original version): 

After applying the changes, you will notice that manager.[date].loghost-manager.[date].log log files are gone and messages within catalina.[date].log, has been supplemented by full date and source sections:

[16.01.2016 12:36:40,116] INFO [org.apache.coyote.AbstractProtocol init] Initializing ProtocolHandler ["http-bio-8080"]
[16.01.2016 12:36:40,138] INFO [org.apache.coyote.AbstractProtocol init] Initializing ProtocolHandler ["ajp-bio-8009"]
[16.01.2016 12:36:40,142] INFO [org.apache.catalina.startup.Catalina load] Initialization processed in 876 ms
[16.01.2016 12:36:40,182] INFO [org.apache.catalina.core.StandardService startInternal] Starting service Catalina
[16.01.2016 12:36:40,183] INFO [org.apache.catalina.core.StandardEngine startInternal] Starting Servlet Engine: Apache Tomcat/7.0.62

Liferay

Liferay original configuration is stored in META-INF/portal-log4j.xml from liferay-portal-6.2-ce-ga5/tomcat-7.0.62/webapps/ROOT/WEB-INF/lib/portal-impl.jar. Thanks to the Liferay Logging architecture (more precisely to those lines coming from com.liferay.util.log4j.Log4JUtil), we are able to update log4j configuration without the need to repackage jar file. In order to do that, we have to create two files in liferay-portal-6.2-ce-ga5/tomcat-7.0.62/webapps/ROOT/WEB-INF/classes/META-INF/:

  • log4j.dtd copied from github or directly from portal-impl.jar/unzip://META-INF/log4j.dtd
  • Empty portal-log4j-ext.xml to which will contain all configuration settings that override these from portal-log4j.xml

As you can see in the following snippet of newly-created portal-log4j-ext, there are only two small sections of original file present:

  • File Appender which allows us to modify timedate pattern.
  • root section with <appender-ref ref="CONSOLE" /> removed.

(Again, check out gist revisions to see the exact changes)

After applying the changes, liferay logs will look as follows:

[16.01.2016 12:37:03,626 +0000] INFO  [localhost-startStop-1][DialectDetector:71] Determine dialect for HSQL Database Engine 2
[16.01.2016 12:37:03,630 +0000] WARN  [localhost-startStop-1][DialectDetector:86] Liferay is configured to use Hypersonic as its database. Do NOT use Hypersonic in production. Hypersonic is an embedded database useful for development and demo'ing purposes. The database settings can be changed in portal-ext.properties.
[16.01.2016 12:37:03,698 +0000] INFO  [localhost-startStop-1][DialectDetector:136] Found dialect org.hibernate.dialect.HSQLDialect
[16.01.2016 12:42:09,041 +0000] INFO  [localhost-startStop-1][StartupAction:97] There are no patches installed
Note that Liferay's JVM timezone  must be set to GMT (via proper parameter in setenv.sh), otherwise datetime calculations may be incorrect. Additionally, it is generally recommended that logs should be stored in GMT and then presented in local time. However, in our case it is not a problem as Kibana - the tool used for browsing logs - uses client's browser timezone.

Summary

In this article I briefly described Liferay+Tomcat logging system and proposed some changes that will make it's integration with ELK stack easier. I focused on regular Tomcat and Liferay logs as they caused the most problems to Logstash. There are also two other files that I ommitted for the following reasons:

  • localhost_access_log.2016-01-16.txt - This log follows the common access-log pattern so I don't think there will be any problem for Logstash to process it correctly.
  • catalina.out - After removing console appenders from tomcat logging and liferay log4j configs, there are only few lines left in catalina.out. Furthermore, they cannot be altered withouht recompiling Liferay sources as they are printed as a result of System.out.println(...) method.

In the foreseeable future I plan to play more with Liferay / ELK integration and if I get any results worth publishing, I will put them in my next blog posts. If you want to do that on your own, I'd recommend to study resources from Eric's Chin session at Liferay North American Symposium 2015, available at his github: https://github.com/ericchin/nas2015.

Blogs
Thank you Krzysztof!
Very well written blog and perfect timing as I was about to play around with the ELK-stack as well!
Hi Marcus, Thanks for your opinion. I'll be working on ELK and Liferay integration this and next week. When I get any satifsying results and have some spare time, I will write about that on my blog and maybe publish Vagrant machine that contain working integration.
Hi.
Very nice and clear explanation, better than any official one. It saved me a lot of headaches.

I'm currently running Liferay 6.2 CE with runit, so every output log is managed by it and I don't need any other log.
My goal is to get rid of the liferay-log files in the root of liferay folder. Is this possible ? It's possible to set a OFF level entirely through portal-log4j-ext.xml ?

Thanks.