Hiding the /web Friendly URL in Your Liferay SaaS: A Practical Guide

Introduction

You probably read the title of this article and thought, "Finally, my friendly URL problems are solved!" And in part, you're right. The idea here is to share a "cake recipe" for a goal that many SaaS portals built on Liferay have in common: removing or hiding the notorious /web from your sites' URLs. This article is based on a real-world use case from a Liferay customer, where this approach was implemented successfully.

If you have a portal with multiple sites running under a single virtual host and are looking to optimize SEO by making URLs cleaner and more intuitive for your users (from myportal.com/web/site-a to the desired myportal.com/site-a), keep reading. We will explore a practical approach using the power of Nginx as a reverse proxy and intelligent redirect rules.

The Challenge: Friendly URLs and the /web Layer in Liferay

In Liferay, the standard friendly URL of a site page includes the /web context, followed by the friendly name of the site and then the friendly name of the page (for example, myportal.com/web/site-a). Although functional, this structure is not the most user-friendly and, in some cases, can affect how the URLs are perceived for SEO.

For scenarios where you host several sites on a single virtual host, the need for more direct URLs, such as myportal.com/site-a, becomes even more evident. Shorter URLs give users a clearer picture of how the portal is organized and make links easier to share.

The Solution: Nginx as URL Orchestrator

The key to achieving our goal lies in the strategic configuration of Nginx, acting as an intelligent reverse proxy. Through redirect and proxy rules, we can intercept requests and rewrite URLs transparently to the end-user.

The "Cake Recipe": Step-by-Step with Nginx

Here's the approach we will implement, inspired by the configuration present in the *.conf file located in the httpd.conf.d directory:

  1. Main Virtual Host Configuration: Your main virtual host (myportal.com) will continue to be the single entry point for all requests.
  2. Definition of Distinct Domains: To implement this solution, it will be mandatory to configure two distinct domains:

    • Navigation Domain (Public): myportal.com (or the domain you want to display to end-users). This will be the domain with URLs without the /web.
    • Administration/Creation Domain: admin.myportal.com (or another subdomain of your choice). This domain will maintain the standard Liferay structure with /web and will be used for administrative access and content creation.
  3. Liferay Instance Configuration (i18n): Liferay can automatically add the locale to friendly URLs (e.g., /en_US/site-a). To keep these internationalized (i18n) URLs from interfering with the rewrite rules, disable this behavior in your Liferay instance configuration. In the Admin Console, go to Instance Settings > Localization and change the "Locale Prepend Friendly URL Style" property.

    With this property set to "Locale is not automatically prepended to a URL", friendly URLs will not automatically include the locale, allowing the Nginx rewrite rules to function correctly.

  4. Nginx Rewrite Rules (Public Domain): In your virtual host configuration file for myportal.com, you will need to define rules that:

    • Intercept requests for URLs in the format myportal.com/site-a.
    • Internally rewrite these requests to the format myportal.com/web/site-a. The reverse proxy will perform this translation before forwarding the request to the Liferay server.
  5. Reverse Proxy Configuration (Public Domain): Nginx will act as a reverse proxy, forwarding the rewritten requests to the Liferay instance responsible for serving the content.

  6. Administration Virtual Host Configuration: The admin.myportal.com domain will be configured to access Liferay in the traditional way, maintaining the /web context. This ensures that the administration and content creation interface works without the rewrite rules.
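
For step 6, the administration virtual host needs no rewrite logic at all. The following is only a minimal sketch of what such a server block might look like, not the customer's actual configuration. It assumes the same upstream_server backend name that the public-domain configuration in this article uses, and the catch-all location / is an assumption made for illustration.

```nginx
# Hypothetical administration virtual host (illustrative sketch only).
server {
   listen 80;
   server_name admin.myportal.com;

   # No rewrite rules here: /web/<site> URLs reach Liferay unchanged,
   # so the administration and content creation UI behaves as usual.
   location / {
      # upstream_server is assumed to be defined in an upstream block elsewhere.
      proxy_pass http://upstream_server;
      proxy_set_header Host $host;
      proxy_set_header X-Real-IP $remote_addr;
      proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
      proxy_set_header X-Forwarded-Proto $scheme;

      proxy_http_version 1.1;
   }
}
```

Because this block passes every path through untouched, it stays valid when sites are added or renamed; only the public-domain rules need to change.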

Example Configuration (adapted from the original *.conf file of a real-world customer):

server {
   listen 80;
   server_name myportal.com;
   
   # Redirect /web/<site> to /<site> to hide /web/ from users
   location ~ ^/web/(site-a|site-b)(/.*)?$ {
      return 301 https://$host/$1$2$is_args$args;
   }

   # Redirect login requests to the administration domain
   location ~ ^/login(/.*)?$ {
      return 301 https://admin.myportal.com/login;
   }

   # Handle the 'site-a' site
   location ~ ^/site-a(/.*)?$ {
      set $rest_of_uri $1;
      set $new_uri /web/site-a$rest_of_uri;

      proxy_pass http://upstream_server$new_uri$is_args$args;
      proxy_set_header Host $host;
      proxy_set_header X-Real-IP $remote_addr;
      proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
      proxy_set_header X-Forwarded-Proto $scheme;
      proxy_hide_header liferay-portal;
      proxy_redirect off;

      proxy_http_version 1.1;
      proxy_intercept_errors on;
   }

   # Handle the 'site-b' site
   location ~ ^/site-b(/.*)?$ {
      set $rest_of_uri $1;
      set $new_uri /web/site-b$rest_of_uri;

      proxy_pass http://upstream_server$new_uri$is_args$args;
      proxy_set_header Host $host;
      proxy_set_header X-Real-IP $remote_addr;
      proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
      proxy_set_header X-Forwarded-Proto $scheme;
      proxy_hide_header liferay-portal;
      proxy_redirect off;

      proxy_http_version 1.1;
      proxy_intercept_errors on;
   }
}

Explanation of the Nginx code:

This Nginx configuration code defines a server block (virtual host) that listens on port 80 and handles requests for the myportal.com domain. It implements several routing and request manipulation rules. Let's break down each section:

1. server { ... } :

  • Defines a configuration block for a virtual server.

2. listen 80; :

  • Specifies that this virtual server should listen on port 80, which is the default port for unencrypted HTTP traffic.

3. server_name myportal.com; :

  • Sets the server name for this block. Nginx will use this configuration block when it receives an HTTP request whose Host header matches myportal.com.

4. location ~ ^/web/(site-a|site-b)(/.*)?$ { ... } :

  • Defines a location block that matches URIs starting with /web/ followed by either site-a or site-b, and optionally followed by anything else (/.*).
  • return 301 https://$host/$1$2$is_args$args;: This line performs an HTTP 301 (Moved Permanently) redirect to a new URL.
    • $host: Keeps the original host name from the request (in this case, myportal.com).
    • $1: Inserts the value captured by the first group (i.e., site-a or site-b).
    • $2: Inserts the value captured by the second group (the part of the URI after /web/site-a or /web/site-b, if it exists).
    • $is_args: Returns "?" if there are arguments in the query string, or an empty string otherwise.
    • $args: Contains the query string arguments (e.g., param1=value1&param2=value2).

In summary, this section redirects requests like /web/site-a/something?param=value to https://myportal.com/site-a/something?param=value, removing the /web/ segment from the user's visible URL.

5. location ~ ^/login(/.*)?$ { ... } :

  • Defines a location block that matches URIs starting with /login, optionally followed by anything else.
  • return 301 https://admin.myportal.com/login;: Redirects any request to /login (and any path beneath it) to https://admin.myportal.com/login using an HTTP 301 redirect. Because the target is fixed, the rest of the original path and the query string are not carried over.
    • This effectively blocks direct access to the /login route on myportal.com and sends users to the admin.myportal.com subdomain to log in.

This block keeps administrator users from logging in on the public domain, which is reserved exclusively for browsing. Non-administrator users can log in normally, either through SSO or Liferay authentication.

6. location ~ ^/site-a(/.*)?$ { ... } :

  • Defines a location block that matches URIs starting with /site-a, optionally followed by anything else.
  • set $rest_of_uri $1;: Sets a variable named $rest_of_uri to the value captured by the first group (the part of the URI after /site-a, including the leading slash if present).
  • set $new_uri /web/site-a$rest_of_uri;: Sets a variable named $new_uri by concatenating /web/site-a with the value of $rest_of_uri.
  • proxy_pass http://upstream_server$new_uri$is_args$args;: This is the key directive for reverse proxying.
    • http://upstream_server: Sends the request to a backend server defined as upstream_server. This name is typically defined in an upstream block elsewhere in the Nginx configuration.
    • $new_uri: Uses the modified URI that includes /web/site-a.
    • $is_args$args: Appends the original query string to the request sent to the backend server.
    • As a result, requests to /site-a/... are forwarded to http://upstream_server/web/site-a/... The site-b block applies the same logic with /web/site-b.
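
The example does not show where upstream_server comes from. A minimal sketch of such a definition is shown here; the address 127.0.0.1:8080 is purely an assumption for illustration, since the real backend host and port depend on your environment.

```nginx
# Hypothetical backend definition, placed in the http context
# (outside any server block).
upstream upstream_server {
   # Assumed address of the Liferay node; replace with the real host and port.
   server 127.0.0.1:8080;
}
```

Defining the name this way matters because the proxy_pass value contains variables. In that case Nginx first looks for the name among the configured upstream groups and, if it is not found there, has to resolve it at request time, which requires a resolver directive.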

In summary, this Nginx code does the following:

  • Listens on port 80 for the myportal.com domain.
  • Redirects requests to /web/site-a/... and /web/site-b/... to the same URLs without the /web/ prefix, using HTTPS.
  • Redirects all requests to /login to https://admin.myportal.com/login.
  • Forwards (reverse proxies) requests to /site-a/... and /site-b/... to a backend server (upstream_server) at the paths /web/site-a/... and /web/site-b/...
  • Sets the Host, X-Real-IP, X-Forwarded-For, and X-Forwarded-Proto headers on proxied requests so that the backend server has information about the original request, and hides the liferay-portal response header from clients.
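
Since the site-a and site-b blocks differ only in the site name, they could in principle be merged into a single location. The sketch below is one possible way to do that, not the configuration the customer used. The capture names site and rest are hypothetical, chosen here for readability.

```nginx
# Sketch only: one location for every exposed site, using named captures.
# Goes inside the server block for myportal.com, replacing the per-site blocks.
location ~ ^/(?<site>site-a|site-b)(?<rest>/.*)?$ {
   # $site holds the matched site name, $rest the remaining path (may be empty).
   proxy_pass http://upstream_server/web/$site$rest$is_args$args;
   proxy_set_header Host $host;
   proxy_set_header X-Real-IP $remote_addr;
   proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
   proxy_set_header X-Forwarded-Proto $scheme;
   proxy_hide_header liferay-portal;
   proxy_redirect off;

   proxy_http_version 1.1;
   proxy_intercept_errors on;
}
```

With this layout, exposing a new site means adding its name to the alternation here and to the /web/ redirect rule, instead of copying a whole block. Separate blocks remain the better choice if individual sites need different proxy settings.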

Important Points and Considerations:

  • Maintenance: This configuration requires attention for maintenance, especially when adding new sites or changing friendly URLs.
  • Internal Links: Ensure that internal links within your portal are configured correctly to work with the new URL structure on the public domain.
  • SEO: Removing /web can bring benefits to SEO, making URLs cleaner and more relevant to search engines. Monitor your results after implementation.
  • Cache: Configure Nginx caching correctly to ensure optimal performance of your portal.
  • Testing: Perform rigorous testing in a lower (non-production) environment before applying the configurations to production.

Another important point to consider is that, in a SaaS environment, this type of configuration can only be applied by the Cloud Team. The SaaS customer opens a request through Liferay support and provides the necessary configuration rules. The Cloud Team then analyzes and validates these rules, and may or may not deploy them.

Conclusion:

Achieving friendly URLs without /web in a multi-site, single virtual host Liferay SaaS environment is possible through the strategic configuration of Nginx as a reverse proxy. The requirement to use two distinct domains (one for public navigation and one for administration) is a key aspect of this solution.

We hope this "cake recipe" is helpful to you! Share your questions and experiences in the comments below.

Great content Anderson! Congrats! Another positive side effect of this solution is that administrators will have a unique domain for their work. This domain can enforce strict IP rules, enhancing the project's security!

Really a very useful content, thank you for sharing Anderson! This addresses a very common need for Liferay users!