Manage your imports, exports and reports using ETL process manager

I'd like to spend some time to present a portlet that we recently published on the Marketplace: ETL process manager. I'm sure that not everyone knows what an ETL tool is and how poweful this can be when combined with Liferay.

ETL stands for "Extract, Transform and Load". ETL tools are extremely useful when you need to manipulate data: extract, import, export, clean, etc. 

Here is a list of most frequently features asked from our customers :

  • Import users from Excel file
  • Weekly report with the list of users with various information
  • On demand report with all organizations / sites and users with their roles

There are many ways to handle that kind of request. Imports can easily be managed using Liferay webservices but reading Excel sheets is a pain (CSV is generally a good compromise). Exports can be handled by a reporting tool like Jasper or BIRT but it takes time to design a report and the report engine generally requires an access to the database.

We finally ended using Talend Open Studio. Talend DI (Data Integration) is a very powerful ETL and opensource solution. It provides a lot of connectors/tools:

  • it can read and write almost any kind of format (CSV, Excel, txt, etc.)
  • it can retrieve data from anywhere (database, FTP, email, SSH, etc.)
  • it can manipulate data very efficiently (normalize, denormalize, sort, duplicate, ...)

The problem when using an ETL is to maintain the processes and to give to end users the possibility to execute them by themself without having to connect to the server and execute some strange Shell script. This is why we decided to develop an ETL portlet for managing processes.

ETL process manager

The portlet can be download from the Markeplace: LINK

The portlet is useless without ETL processes. Fortunately, we provide a few examples that you can use to see how it works. Examples are available on this page.

1. Import your Talend process

The first step is to import an existing Talend process. You can easily download Talend Open Studio (DI), design your own process and then import them into Liferay. Here, I'm going to import a very simple process, HelloWord (download here), in order to illustrate how it works.

  • Make sure you installed "ETL process manager" from the Marketplace
  • Go on the control panel and click on "Server | ETL Processes"
  • Click on the Add button
  • Upload the HelloWorld_0.1.zip file

  • You should now have a new process available in the list

 

2. Execute process

Now, you can execute the process directly from Liferay.

  • Click on "Actions | Execute now" (notice the message on the top of the list)
  • Go on History tab, you will see one entry in the list of executions

  • Click on the date to see the details of the execution

  • From the new screen, you can now download the Runtime logs (and Error logs if any)
  • Check that you see the message "Hello World!" in the logs

 

3. So what?

OK. It is certainly not so impressive so far. Anyone can easily print a "Hello World" message into a log. 

So let's take another, better example.

 

Report with all active organizations, sites and active users and their roles

This example is amazing and shows some of the portlet capabilities. The new example uses the process "ReportUsersRoles" (download here).

  • The process uses Liferay API to retrieve the following information
    • List of all users with detailed information and the list of roles (per organization and site)
    • List of all organizations (sorted according to the hierarchy) with the list of members and their roles
    • List of all sites with the list of members and their roles
  • The process uses Talend connectors:
    • To generate an Excel sheet (not a CSV!) with one tab for each list
    • To send the final report by email to a provided address (parameter)

 

To test it, follow the instructions below:

  • Upload the process (download here) into Liferay
  • Edit the process and add the following context parameter: MailTo=YOUREMAIL
  • Execute the process
  • Check your emails
  • Download the attachment and enjoy!

Notice that you can also schedule the execution of the process in order to receive that report every week, for example.

If the example doesn't work:

  • make sure that you have configured your email server correctly for your portal
  • make sure that the MailTo parameter is properly configured with your email address
  • check the error logs

3
Blogs
Sven - thank you for sharing this post. We have a similar need to make reports on pages / portlet permissions of specific sites through our portal. This has been very useful in our current 5.2 installation. It looks like this same tooling could be used for this?
Yes. The module simply lets you run an ETL process on Liferay.
Talend is fantastic for agregating data, and producing any kind of output (CSV, Excel, Report, etc.). Running it in Liferay gives it the possibility to access the whole Liferay API.
Hi, I do only get errors when executing...
"java.lang.ClassNotFoundException: talend_liferay_examples.HelloWorld_0_1.HelloWorld"...
What did I do wrong?