Setting up Remote Staging through https

Setting up Remote Staging for a community (or organization) in Liferay is easy: On the staging machine go to "Manage Settings, Staging", select "Remote Live" as staging type and provide the address, port and remote group id (the community/organization that you want to publish to).

On the production machine (where you want to publish to), you need to configure tunnel.servlet.hosts.allowed and axis.servlet.hosts.allowed in portal-ext.properties to allow the staging machine to access web services on the production machine.

Done!

Problems often arise when you also want to check "Use a Secure Network Connection", i.e. use https to publish to your remote system. However, setting up a secure connection is easy once you know a few properties of encrypted connections. I'm going to explain the pitfalls of encrypted server-to-server communication in this post.

What's in this for you?

  • I'm going to discuss the two aspects of https that you need to know to understand the “why” of the following step-by-step instructions

  • I'm giving brief instructions on how to configure tomcat to accept https connections with a self-signed certificate (you'll find more detail in other places on the internet, but as this is a prerequisite for staging over https, it has to be done first)

  • Then I'm explaining the steps for configuring another server to connect to the first one in order to do remote staging over https.

  • Finally I'm giving some debugging info – typical things that can go wrong.

What problems does https solve?

The best known feature of https is that it provides an encrypted channel for the communication - nobody can eavesdrop on the communication. The lesser known but equally important feature is certified identity of the server: The client can be reasonably sure that it's indeed talking to the expected server, not to some Man-In-The-Middle attacker, intercepting the connection before it has been established. Obviously there would not be much value in encrypting the communication when you can't be sure you're sending your data to the expected destination.

This server identity feature can be the reason for some trouble setting up Remote Staging through https, as it involves an implicit or explicit trust relationship between both machines.

(Far more detail about SSL, TLS, https and encryption in general is available all over the web; I've learned the most in the SecurityNow podcast, see episodes 30-37, 181, 183, 195 and probably more)

Trust?

How does this trust work? You probably have heard about certificates and seen your browser protest when it believes that it doesn't trust a certificate. Why is this? Your browser contains a bunch of "trustworthy" certificate authorities that can "sign" server keys. They typically validate that the key-owner is also the owner of the domain. If you have such a trustworthy certificate, you have less work with the setup of the connection (but you're typically paying these companies for their services)

For internal connections, debugging and test setups, it's somewhat common to use self-signed certificates. These are obviously not trusted by your browser, so in a browser you get a severe looking warning. (if you want to try it: My own server at https://www.olafkock.de uses a self-signed certificate - at least at the time I'm writing this post)

With the interaction that a browser inherently provides (typically there's a user involved that can interact), it's easy to manually trust the certificate when the connection is being established.

However, when a server process like your staging server is connecting to a remote-live server that uses a self-signed certificate, there's no user sitting in front of the server process to manually add the trust relationship while the connection is being established. Consequently the connection cannot be established. This is the source of some confusion and what we'll work on in the next section.

The scenario

To set up a scenario for this article, I'm going to use two machines: staging.example.com and production.example.com.

We will be creating content on staging.example.com and publishing it remotely through https to production.example.com. In order to do this, we first need to set up https on production.example.com. Note that this is the only server that needs to be available through https for purposes of this post.

Setting up https on production.example.com

There's a lot of documentation available for setting up https with your webserver of choice. I'm going quickly over this by providing the steps to enable https on a tomcat installation on production.example.com: We typically provide the key and the self-signed certificate in a custom keystore to tomcat. This contains the private key that must be well guarded: Everybody knowing this key is able to decrypt the traffic. The only place where it makes sense outside of your tomcat installation is in your backup.

When we have created the custom keystore we'll make it known to tomcat's connector on production.example.com. For this task we're using java's keytool:

production$ keytool -genkey -alias production.example.com -keyalg RSA -keystore /opt/liferay/production-keystore

Enter keystore password: changeit
Re-enter new password: changeit
What is your first and last name?
  [Unknown]: production.example.com
What is the name of your organizational unit?
  [Unknown]: whatever you like
What is the name of your organization?
  [Unknown]: give the name of your organization here
What is the name of your City or Locality?
  [Unknown]: give the name of your location here
What is the name of your State or Province?
  [Unknown]: provide it here
What is the two-letter country code for this unit?
  [Unknown]: de

With this you have generated a private key that can be used in tomcat. Note that the server name is actually in the field for your “first and last name”. You use this by adding a connector in tomcat's server.xml like this:

<Connector port="8443" protocol="HTTP/1.1" SSLEnabled="true" maxThreads="150" scheme="https" secure="true" clientAuth="false" sslProtocol="TLS"
keystoreFile="/opt/liferay/production-keystore" keystorePass="changeit" />

(You'll find a predefined, commented, entry similar to this in tomcat's default server.xml)
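
If you'd rather script the keystore creation than answer the prompts, something like the following sketch might work. It assumes a keytool that understands -genkeypair and -dname. The function names, the 2048-bit key size, the 365-day validity and the default organization and country values are placeholders of mine, not part of the setup described above. The important bit is the same as in the interactive session: the CN has to be the host name that clients connect to.

```shell
# Build the distinguished name. The CN must equal the host name clients use.
dname_for_host() {
    # $1: host name, $2: organization (hypothetical default), $3: country code
    printf 'CN=%s, O=%s, C=%s\n' "$1" "${2:-Example Org}" "${3:-DE}"
}

# Generate a key pair plus self-signed certificate without any prompts.
make_keystore() {
    # $1: host name, $2: keystore path, $3: keystore password
    keytool -genkeypair -alias "$1" -keyalg RSA -keysize 2048 -validity 365 \
        -dname "$(dname_for_host "$1")" \
        -keystore "$2" -storepass "$3" -keypass "$3"
}

# Usage (not executed here):
#   make_keystore production.example.com /opt/liferay/production-keystore changeit
```

The resulting keystore is referenced from the connector exactly like the interactively created one.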

Start tomcat (liferay) and make sure that it answers on https://production.example.com:8443. When you first access the URL with your browser, you will see that the browser doesn't trust your certificate – it's self-signed, and the browser makes you jump through some hoops to confirm that you indeed trust it.

Note that the keystore now contains the private key – you'd better guard that file and keep it from being accessed by anybody. However, you definitely want to back it up.

While we're still working on production.example.com, make sure to have the portal-ext.properties configuration from the introduction in place:

tunnel.servlet.hosts.allowed=127.0.0.1,SERVER_IP,10.0.8.15
tunnel.servlet.https.required=true
axis.servlet.hosts.allowed=127.0.0.1,SERVER_IP,10.0.8.15
axis.servlet.https.required=true

Remember to replace SERVER_IP with the production server's own IP and 10.0.8.15 with the actual IP of staging.example.com.

Establishing trust of staging.example.com to production.example.com

We will use this setup to create a server-to-server connection from staging.example.com to production.example.com. In order to do this, we need to establish a trust relationship. During the setup of production.example.com, we created a separate keystore and referenced it from tomcat's configuration file server.xml. The keystore contains the private key as well as the (self signed) certificate.

When we want to connect from staging.example.com we only need public information from production.example.com – there's no need to copy over the full keystore. To illustrate that I'm going to use the https connection to access production.example.com and retrieve the certificate this way. The openssl tools can do that (available on every linux distribution and probably also on the other platforms):

staging$ openssl s_client -connect production.example.com:8443
CONNECTED(00000003)
depth=0 /C=DE/ST=Hessen/L=Eschborn/O=Liferay GmbH/OU=Liferay GmbH/CN=production.example.com
verify error:num=18:self signed certificate
verify return:1
depth=0 /C=DE/ST=Hessen/L=Eschborn/O=Liferay GmbH/OU=Liferay GmbH/CN=production.example.com
verify return:1
---
Certificate chain
 0 s:/C=DE/ST=Hessen/L=Eschborn/O=Liferay GmbH/OU=Liferay GmbH/CN=production.example.com
   i:/C=DE/ST=Hessen/L=Eschborn/O=Liferay GmbH/OU=Liferay GmbH/CN=production.example.com
---
Server certificate
-----BEGIN CERTIFICATE-----
MIICeTCCAeKgAwIBAgIETaiUJjANBgkqhkiG9w0BAQUFADCBgDELMAkGA1UEBhMC
REUxDzANBgNVBAgTBkhlc3NlbjERMA8GA1UEBxMIRXNjaGJvcm4xFTATBgNVBAoT
DExpZmVyYXkgR21iSDEVMBMGA1UECxMMTGlmZXJheSBHbWJIMR8wHQYDVQQDExZw
cm9kdWN0aW9uLmV4YW1wbGUuY29tMB4XDTExMDQxNTE4NTMyNloXDTExMDcxNDE4
NTMyNlowgYAxCzAJBgNVBAYTAkRFMQ8wDQYDVQQIEwZIZXNzZW4xETAPBgNVBAcT
CEVzY2hib3JuMRUwEwYDVQQKEwxMaWZlcmF5IEdtYkgxFTATBgNVBAsTDExpZmVy
YXkgR21iSDEfMB0GA1UEAxMWcHJvZHVjdGlvbi5leGFtcGxlLmNvbTCBnzANBgkq
hkiG9w0BAQEFAAOBjQAwgYkCgYEAqeWzhKHuDGI34JSbc0ccrFolTb+THN8SeX1x
CVZirQ8zLqkKayQke0MEH3ZMPNXM6hKslVAOj4NNr+AhHc/BNw8qNvFRNCWFQ07f
aAr+223bPc4tVnii+8xuYFjtc1vAEpagT8W79ebUxv7iRObherQPdtxhzm3MYfR4
/7BI44UCAwEAATANBgkqhkiG9w0BAQUFAAOBgQAaPRMRI9u7cxeiWuo7tiWpPMws
57pxgBumUyyxNpg9ij5QJeSrEBlV9HSsuzPuE+2F/x0Fo7U73tgVi7BfeAOiaVWy
qpnJwdfODWknGni12u1o2+xxAL9i5nVtcDjTu2MjRU8Zlg5tWntZD74kKjMvCY4M
aJAp7Gr+Pyh/qvLAYQ==
-----END CERTIFICATE-----
subject=/C=DE/ST=Hessen/L=Eschborn/O=Liferay GmbH/OU=Liferay GmbH/CN=production.example.com
issuer=/C=DE/ST=Hessen/L=Eschborn/O=Liferay GmbH/OU=Liferay GmbH/CN=production.example.com
---
No client certificate CA names sent
---
SSL handshake has read 1216 bytes and written 253 bytes
---
New, TLSv1/SSLv3, Cipher is EDH-RSA-DES-CBC3-SHA
Server public key is 1024 bit
Secure Renegotiation IS supported
Compression: NONE
Expansion: NONE
SSL-Session:
    Protocol  : TLSv1
    Cipher    : EDH-RSA-DES-CBC3-SHA
    Session-ID: 4DA8B14EA359D0AAE85D021E35942BE622A599032D107E0E8E42BE372C3B8FA4
    Session-ID-ctx: 
    Master-Key: 88AA956B660E031A631AC0A394D04F8516A988CE314F9CC6A5511E8C92879CA2DE2A9BF14E98BAA80778B8BD0CE3750D
    Key-Arg   : None
    Start Time: 1302901070
    Timeout   : 300 (sec)
    Verify return code: 18 (self signed certificate)
---

The interesting part of the above output is the block from -----BEGIN CERTIFICATE----- to -----END CERTIFICATE-----. This is the public certificate that the server provides to clients to prove its identity. This is what we'll have to register as “trusted” on our client machine, staging.example.com. Copy this block (including the BEGIN CERTIFICATE and END CERTIFICATE lines) into a new file. I'll name this publisher-cert.txt.
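
Copying the block by hand works, but it can also be cut out of the openssl output automatically. The following is a minimal sketch, assuming openssl and sed are available on staging; the function names extract_pem and fetch_cert are made up for this example.

```shell
# Print only the certificate block (including its BEGIN/END lines) from stdin.
extract_pem() {
    sed -n '/-----BEGIN CERTIFICATE-----/,/-----END CERTIFICATE-----/p'
}

# Connect to host:port, close stdin right away so s_client terminates,
# and keep just the server certificate.
fetch_cert() {
    # $1: host name, $2: port
    openssl s_client -connect "$1:$2" -servername "$1" </dev/null 2>/dev/null | extract_pem
}

# Usage (not executed here):
#   fetch_cert production.example.com 8443 > publisher-cert.txt
```

Keep in mind that this retrieves whatever certificate the server presents, so compare it with what you created on production before you trust it.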

In order to be trusted by the running server process, this certificate needs to be imported into the keystore of the JRE that actually runs the server on staging.example.com. You'll typically find it in $JRE_HOME/lib/security/cacerts. This time it can't be “just added” like we did above with the keystore that we referenced from tomcat's server.xml. As we're now possibly dealing with a system-wide installation of java, you may need root or administrative permissions to do this:

staging$ sudo keytool -import -alias production.example.com -keystore $JRE_HOME/lib/security/cacerts -file publisher-cert.txt
Enter keystore password:  
Owner: CN=production.example.com, OU=Liferay, O=Liferay GmbH, L=Eschborn, ST=Hessen, C=DE
Issuer: CN=production.example.com, OU=Liferay, O=Liferay GmbH, L=Eschborn, ST=Hessen, C=DE 
Serial number: 4dab0790 
Valid from: Sun Apr 17 17:30:24 CEST 2011 until: Sat Jul 16 17:30:24 CEST 2011 
Certificate fingerprints: 
	 MD5:  1C:CD:8E:31:B1:41:01:CE:DF:DC:69:32:AF:D2:59:41 
	 SHA1: 90:E8:F2:3B:6C:79:58:A0:A3:DA:B2:89:C6:9D:51:16:6E:F3:F9:A9 
	 Signature algorithm name: SHA1withRSA 
	 Version: 3 
Trust this certificate? [no]:  yes 
Certificate was added to keystore 

With this, our JRE on staging.example.com finally trusts the certificate used for production.example.com as legitimate, and we can create a server-to-server connection. This might require a restart, this time of staging.example.com, but then we're where we want to be.
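
To double-check that the imported certificate is really the one in publisher-cert.txt, you can compare fingerprints. openssl and keytool label and format them slightly differently, so this sketch normalizes both to plain upper-case hex. The function name normalize_fp is my own, and "changeit" is only the default cacerts password, which may have been changed on your system.

```shell
# Reduce a fingerprint line to upper-case hex without label and colons.
normalize_fp() {
    sed -e 's/[[:space:]]*$//' -e 's/.*[ =]//' | tr -d ':' | tr 'abcdef' 'ABCDEF'
}

# Usage (not executed here), both commands should print the same value:
#   openssl x509 -in publisher-cert.txt -noout -fingerprint -sha1 | normalize_fp
#   keytool -list -v -alias production.example.com -storepass changeit \
#       -keystore "$JRE_HOME/lib/security/cacerts" | grep 'SHA1:' | normalize_fp
```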

Debugging

If remote publishing still doesn't work, these are the most common things to check:

Allow access to webservices

In portal-ext.properties on production.example.com you need to allow staging.example.com to access the webservice used for publishing (tunnel.servlet.hosts.allowed, axis.servlet.hosts.allowed). You can easily check this by starting a browser on staging and pointing it to https://production.example.com:8443/tunnel-web/axis. If this shows a “403 access denied” message, the server will not be able to connect. If you only have shell access to your server, w3m, lynx or links are text-mode browsers that work well for this test.
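
If curl is installed on staging, the same check can be scripted. This is only a sketch: the function names and the wording of the messages are mine, and -k deliberately skips certificate verification, because this test is about the IP filter and not about trust.

```shell
# Translate the HTTP status of the axis endpoint into a hint.
explain_status() {
    case "$1" in
        403) echo "denied: add the caller's IP to the hosts.allowed properties" ;;
        000) echo "no HTTP response: check DNS, firewall and port" ;;
        *)   echo "not blocked by the IP filter (HTTP $1)" ;;
    esac
}

# Request the endpoint and print only the hint.
check_axis() {
    # $1: base URL of the production server
    code=$(curl -k -s -o /dev/null -w '%{http_code}' "$1/tunnel-web/axis")
    explain_status "$code"
}

# Usage (not executed here):
#   check_axis https://production.example.com:8443
```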

Certificate name mismatch

If you have a certificate for example.com, but your server is available as (and you connect to) production.example.com, the host name and the certificate do not match. They have to match in order to get the trust relationship between the two computers established.

If you want to configure your settings differently in test and production environments, but still be able to use the same or similar host names and install the same trust relationships, you might want to look up wildcard certificates. These certify the identity of *.example.com, so you can reuse them for various setups.
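
To make the matching rule a bit more tangible, here is a simplified sketch of it. It assumes the usual convention that a wildcard covers exactly one label, it compares case-sensitively, and it ignores subject alternative names, which real clients also look at. The function name name_matches is made up.

```shell
# Succeeds if host name $1 is covered by certificate name $2.
name_matches() {
    host=$1
    name=$2
    case "$name" in
        '*.'*)
            suffix=${name#'*'}                   # e.g. ".example.com"
            label=${host%"$suffix"}              # host with the suffix removed
            [ "$label" != "$host" ] || return 1  # suffix did not match
            [ -n "$label" ] || return 1          # nothing in front of the dot
            case "$label" in
                *.*) return 1 ;;                 # wildcard covers one label only
            esac
            return 0
            ;;
        *)
            [ "$host" = "$name" ]
            ;;
    esac
}

# The name in the retrieved certificate can be displayed with:
#   openssl x509 -in publisher-cert.txt -noout -subject
```

With this, name_matches production.example.com example.com fails, while name_matches production.example.com '*.example.com' succeeds.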

IPv4? IPv6?

If this seems to be configured correctly, you might have IPv4 vs. IPv6 issues: If production.example.com resolves to an IPv6 address, you need to allow staging's IPv6 address in portal-ext.properties, not the IPv4 address. Or configure your JVM to prefer the IPv4 stack over the IPv6 one: -Djava.net.preferIPv4Stack=true
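
On tomcat, one place to put this option is bin/setenv.sh, which catalina.sh reads on startup if the file exists. A minimal sketch, assuming that the JVM that opens the connection (staging in this scenario) is the one that needs it and that your setenv.sh uses CATALINA_OPTS:

```shell
# bin/setenv.sh (fragment): append the option to whatever is already configured
CATALINA_OPTS="${CATALINA_OPTS:-} -Djava.net.preferIPv4Stack=true"
export CATALINA_OPTS
```

The option only takes effect after a restart of that tomcat.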

Firewalls

If your servers are in different networks, make sure that the firewall between these networks allows access on the port that you need, especially if you use nonstandard ports like tomcat's 8443. If you followed the procedure described above, you will not have these problems, because you've already connected to the port you need with the openssl tool to obtain the certificate.

SSL Terminators

If you have an apache or some other device handling the https, there are more complexities to take into account: Tomcat needs to know that you're connecting through https – when you just use a reverse proxy to connect Apache to tomcat, this information is typically lost. As far as I know, ajp provides all required information to tomcat while http typically doesn't – but please test for yourself, don't just take my word for it.

NAT or multiple network interfaces

Maybe the staging connection doesn't originate from staging.example.com – or doesn't appear to – e.g. when there's NAT. This is easy to detect: Connect with a browser from staging.example.com to https://production.example.com:8443/tunnel-web/axis. When you get a 403 error message, you'll see which IP is denied access. This is what you need to allow in tunnel.servlet.hosts.allowed and axis.servlet.hosts.allowed.

https setup, trust

Something can go wrong when configuring the trust relationship between the servers involved. As you know, there's no way to accept a certificate as trusted “ad hoc” like you can in your browser. Make sure that the JVM uses the keystore you imported the certificate into. Also, make sure that the host name that you use matches the name in the certificate, and that this is the certificate you trust.
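
A quick way to find out which cacerts file a JVM uses by default is to ask it for its java.home. The sketch below assumes that the java found first on your PATH is the one that runs tomcat, which is not necessarily the case, and that no custom trust store has been configured through -Djavax.net.ssl.trustStore. The function names are made up.

```shell
# Print java.home of the JVM found first on PATH.
java_home_of_path_jvm() {
    java -XshowSettings:properties -version 2>&1 | sed -n 's/^ *java\.home = //p'
}

# Default trust store location relative to java.home.
cacerts_path() {
    # $1: value of java.home
    printf '%s/lib/security/cacerts\n' "$1"
}

# Usage (not executed here):
#   keytool -list -alias production.example.com \
#       -keystore "$(cacerts_path "$(java_home_of_path_jvm)")"
```

If keytool doesn't find the alias there, the certificate most likely went into the keystore of a different java installation.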

Portal Instances

If you're working with portal instances, you need to use an endpoint (e.g. virtual host) within the target instance as the publishing target.

What else can go wrong?

If you find other things that (can) go wrong please post them in the comments in order to make this post a bit better.

Vocabulary Disclaimer

My day-to-day job involves configuring this from time to time – but I'm not a cryptography expert. This means that I have probably used words like “certificate” and “key” in a somewhat imprecise way. If you can point out where I should have been more precise, please comment and I'll change it. (I know that SSL is supposed to be TLS nowadays, but it's an old habit and everybody knows what I'm talking about)

Thanks

The initial settings and tests were done together with Nitharsan Manoharan and Randall Hidajat. Both did the dirty work of configuring and debugging the infrastructure. Thanks for the cooperation.

Comments
Great post Olaf. Thank you for writing that up. Something to add is that when you create the keystore, "your first and last name" should be the address of the website. I have found that you will get a certificate name mismatch if you set that to something else.
This is a really nice and compact "tutorial" for getting such an infrastructure set up quickly. Thanks for sharing! I'm going to use it to set up a Liferay "play" environment.
We are working on setting this up now between QA and PROD LR 6.1 GA2 EE instances. Where we are running into issues is that in front of the PROD server there is a load balancer and an Apache HTTPD instance on a separate server from Liferay. I think to make this work I may need a Rewrite Rule or proxying configuration on the Apache instance for this (separate from what is there for AJP now). Has anyone successfully done remote staging in a context like this?
I can't reproduce this right now (only because of timing/resourcing issues), so here's just a quick workaround: Can you bypass the loadbalancer and publish to one of the cluster machines directly? Or does the loadbalancer already terminate the SSL connection?

If it shows that publishing to a cluster is indeed a problem, another option might be to set up a second virtual host on the loadbalancer (with a trusted SSL certificate) that doesn't have multiple machines in the background: E.g. you're publishing through the loadbalancer, but it only "balances" one machine. Yes, this is lame and not the reason for you to have a load balancer in the first place, but might help gain some more time until you find the correct way. As you mention that you're on EE: Did you contact our support team on this? They will have the time/resources to reproduce (and fix if it is a problem with Liferay) or provide a better workaround.

If you open a ticket, point to this post/comment and I'm happy to assist the support staff in reproduction, time permitting.
We got it working...we originally were using the IP of the NAT of the staging server in the *hosts.allowed property on the target server (since the staging server doesn't have a public or static IP). But ultimately what worked was using the IP of the load balancer (which is a static IP) in those properties.
Be careful: If your server only sees your loadbalancer, you probably haven't set up communication between them properly: AJP would forward the original host address, HTTP doesn't (unless you use "ProxyPreserveHost On" on Apache, try this).

When your loadbalancer is the origin of *all* traffic to your appserver, by setting hosts.allowed to your loadbalancer, you're allowing *all* traffic that comes through your loadbalancer access to the API - probably not what you intended with this operation.