Importing Products with the Liferay Batch API

Introduction

In most cases, if you’re going to use Liferay’s Product capabilities, you’ll be importing product data from an external system such as a PIM, an ERP, or even a legacy Commerce platform. In some cases this will be a one-time data load; in others, the remote system remains the source of truth and Liferay’s catalog receives product updates and new products on a regular basis.

For integrations like this, Liferay provides a robust set of Headless APIs that make the process straightforward. Unfortunately, many people I speak with either aren’t aware these APIs exist or don’t know which ones they should be using.

Understanding Liferay Commerce Headless APIs

To get started, let’s take a quick look at the Commerce Headless APIs that are available to us. Make sure you are logged in as an Administrator and navigate to the Liferay API Explorer (e.g. http://localhost:8080/o/api). Review the different applications listed under REST Applications. You’ll notice that for Commerce, the applications are nicely divided between the headless-commerce-admin APIs and the headless-commerce-delivery APIs.

The Headless Commerce Admin APIs are designed for administrative tasks such as catalog or store setup or for integration with external systems. The Headless Commerce Delivery APIs are designed for building end user experiences such as custom widgets or fragments or for creating commerce experiences in external systems such as a native mobile application.

Since importing products is an integration task, we should look at the headless-commerce-admin applications to find a suitable set of endpoints. If we take a look at what’s available, we will see that the Liferay Commerce Admin Catalog API is probably our best bet.

Exploring the Liferay Commerce Admin Catalog API

At first glance, the available endpoints might be a little overwhelming. We typically think of a Product as a single entity, but in reality many entities are involved in a Product, and each of them has its own set of endpoints, which makes it easy to integrate with external systems or to configure Liferay. Thankfully, this granularity is there when we need it but doesn't get in the way when we just want to upload a product.

To create (POST) a Product as a singular entity, Liferay provides us with the /headless-commerce-admin-catalog/v1.0/products endpoint. And thanks to Liferay API Explorer we also have an example of what the JSON payload should look like in order to be successful.


 

Unfortunately, the values are all ‘sample’ values and it’s not clear which are required and which are optional, so I always recommend starting with a GET against that same endpoint to see what a real product in your system looks like as JSON. If you are working with a completely new system and don’t have any products available, I recommend that you either create a realistic example through the UI or start with one of Liferay’s built-in Accelerators.

Once you have some products in the system, then you can use a GET request against the endpoint to analyze the JSON that’s returned.  Now normally, the GET would return a JSON array of products, but if you just want to analyze one without having to look up a Product ID, you can set the pageSize parameter to 1 and you’ll get a single Product to analyze.
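
If you would rather script that request than click through the API Explorer, here is a minimal Python sketch of the same GET. It assumes a local instance at http://localhost:8080, Basic authentication, and that the page is selected with a `page` query parameter alongside `pageSize`; the function names are mine and purely illustrative.

```python
import base64
import json
import urllib.parse
import urllib.request

# Assumption: a local Liferay instance; adjust to your environment.
BASE_URL = "http://localhost:8080/o"
PRODUCTS_PATH = "/headless-commerce-admin-catalog/v1.0/products"


def products_url(page_size=1, page=1):
    """Build the GET URL for one page of products."""
    query = urllib.parse.urlencode({"page": page, "pageSize": page_size})
    return f"{BASE_URL}{PRODUCTS_PATH}?{query}"


def basic_auth_header(user, password):
    """Return the value of an HTTP Basic Authorization header."""
    raw = f"{user}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(raw).decode("ascii")


def fetch_sample_product_page(user, password):
    """GET a one-product page and return the parsed JSON.

    Needs a running instance, so nothing calls it at import time.
    """
    request = urllib.request.Request(
        products_url(page_size=1),
        headers={
            "Accept": "application/json",
            "Authorization": basic_auth_header(user, password),
        },
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Calling `fetch_sample_product_page(...)` with an administrator's credentials should return the same paged structure you see in the API Explorer.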


 

If you’re using the Minium Accelerator, the response might look something like this:

{
  "actions": {},
  "facets": [],
  "items": [
    {
      "actions": {
        "get": {
          "method": "GET",
          "href": "http://localhost:8080/o/headless-commerce-admin-catalog/v1.0/products/33013"
        },
        "update": {
          "method": "PATCH",
          "href": "http://localhost:8080/o/headless-commerce-admin-catalog/v1.0/products/33013"
        },
        "delete": {
          "method": "DELETE",
          "href": "http://localhost:8080/o/headless-commerce-admin-catalog/v1.0/products/33013"
        }
      },
      "active": true,
      "catalogId": 32780,
      "categories": [
        {
          "externalReferenceCode": "af85c045-73c7-9fb6-c839-08e94fb63dd1",
          "id": 32862,
          "name": "Brake System",
          "vocabulary": "minium"
        }
      ],
      "createDate": "2024-06-25T21:46:21Z",
      "customFields": [],
      "description": {
        "en_US": "Product designed and manufactured to accommodate OEM applications. All\nproducts are tested and inspected in an ISO-9000 compliant environment"
      },
      "displayDate": "2023-06-25T21:46:00Z",
      "expando": {},
      "externalReferenceCode": "MIN93015minium-full-initializer",
      "id": 33012,
      "metaDescription": {
        "en_US": ""
      },
      "metaKeyword": {
        "en_US": ""
      },
      "metaTitle": {
        "en_US": ""
      },
      "modifiedDate": "2024-06-25T21:46:21Z",
      "name": {
        "en_US": "ABS Sensor"
      },
      "productAccountGroupFilter": false,
      "productChannelFilter": true,
      "productId": 33013,
      "productStatus": 0,
      "productType": "simple",
      "productTypeI18n": "Simple",
      "shortDescription": {
        "en_US": ""
      },
      "skuFormatted": "MIN93015",
      "tags": [],
      "thumbnail": "/o/commerce-media/accounts/-9223372036854775808/images/33036?download=false",
      "urls": {
        "en_US": "abs-sensor"
      },
      "version": 1,
      "workflowStatusInfo": {
        "code": 0,
        "label": "approved",
        "label_i18n": "Approved"
      }
    }
  ],
  "lastPage": 53,
  "page": 1,
  "pageSize": 1,
  "totalCount": 53
}

Because we’re hitting the /products endpoint, we get an array of products even though we only asked for a pageSize of 1. To understand what a single product looks like, you can ignore the first few and last few lines of the response and focus on the single object in the items array:

{
  "actions": {
    "get": {
      "method": "GET",
      "href": "http://localhost:8080/o/headless-commerce-admin-catalog/v1.0/products/33013"
    },
    "update": {
      "method": "PATCH",
      "href": "http://localhost:8080/o/headless-commerce-admin-catalog/v1.0/products/33013"
    },
    "delete": {
      "method": "DELETE",
      "href": "http://localhost:8080/o/headless-commerce-admin-catalog/v1.0/products/33013"
    }
  },
  "active": true,
  "catalogId": 32780,
  "categories": [
    {
      "externalReferenceCode": "af85c045-73c7-9fb6-c839-08e94fb63dd1",
      "id": 32862,
      "name": "Brake System",
      "vocabulary": "minium"
    }
  ],
  "createDate": "2024-06-25T21:46:21Z",
  "customFields": [],
  "description": {
    "en_US": "Product designed and manufactured to accommodate OEM applications. All\nproducts are tested and inspected in an ISO-9000 compliant environment"
  },
  "displayDate": "2023-06-25T21:46:00Z",
  "expando": {},
  "externalReferenceCode": "MIN93015minium-full-initializer",
  "id": 33012,
  "metaDescription": {
    "en_US": ""
  },
  "metaKeyword": {
    "en_US": ""
  },
  "metaTitle": {
    "en_US": ""
  },
  "modifiedDate": "2024-06-25T21:46:21Z",
  "name": {
    "en_US": "ABS Sensor"
  },
  "productAccountGroupFilter": false,
  "productChannelFilter": true,
  "productId": 33013,
  "productStatus": 0,
  "productType": "simple",
  "productTypeI18n": "Simple",
  "shortDescription": {
    "en_US": ""
  },
  "skuFormatted": "MIN93015",
  "tags": [],
  "thumbnail": "/o/commerce-media/accounts/-9223372036854775808/images/33036?download=false",
  "urls": {
    "en_US": "abs-sensor"
  },
  "version": 1,
  "workflowStatusInfo": {
    "code": 0,
    "label": "approved",
    "label_i18n": "Approved"
  }
}

Hopefully, many of these attributes should look familiar and be self-explanatory. We can safely ignore quite a few of these when we create new products because they are only relevant for existing products or will be generated by Liferay automatically. And by starting with an existing product we can easily grab those values that are going to be specific to our destination system such as the Catalog Id and Category information.
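
To turn an existing product into a starting template, you could pull the first item out of the paged response and drop the fields that only make sense on an existing product. The sketch below does that in Python. The list of fields to drop is my own guess based on the sample response above, not an official schema, so check it against your Liferay version.

```python
# Assumption: these fields are server-generated or only meaningful on an
# existing product. The list is a guess from the sample response, not an
# official list of read-only attributes.
READ_ONLY_FIELDS = {
    "actions",
    "createDate",
    "id",
    "modifiedDate",
    "productId",
    "productTypeI18n",
    "skuFormatted",
    "thumbnail",
    "version",
    "workflowStatusInfo",
}


def first_item(page):
    """Return the first product in a paged response, or None if it is empty."""
    items = page.get("items", [])
    return items[0] if items else None


def to_template(product):
    """Copy a product, leaving out the fields assumed to be read-only."""
    return {
        key: value
        for key, value in product.items()
        if key not in READ_ONLY_FIELDS
    }
```

What remains (catalogId, categories, name, productType, and so on) is a reasonable skeleton for the payloads you will build from your external system.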

The absolute minimum attributes we need to supply for a product would be:

{
  "active": true,
  "catalogId": 32780,
  "name": {
    "en_US": "DEMO Sensor"
  },
  "productType": "simple"
} 

Of course, this wouldn’t be a very exciting product, but it gives you a starting point to build upon.  
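
As a rough illustration, this is how that minimal payload could be POSTed from Python using only the standard library. The local URL and the `authorization` value (for example a Basic auth header) are assumptions, and the helper names are hypothetical.

```python
import json
import urllib.request

# Assumption: a local Liferay instance; adjust to your environment.
PRODUCTS_URL = (
    "http://localhost:8080/o/headless-commerce-admin-catalog/v1.0/products"
)


def minimal_product(name, catalog_id, language_id="en_US"):
    """Build the smallest product payload described above."""
    return {
        "active": True,
        "catalogId": catalog_id,
        "name": {language_id: name},
        "productType": "simple",
    }


def build_create_request(product, authorization):
    """Prepare (but do not send) the POST request for a single product."""
    return urllib.request.Request(
        PRODUCTS_URL,
        data=json.dumps(product).encode("utf-8"),
        method="POST",
        headers={
            "Accept": "application/json",
            "Authorization": authorization,
            "Content-Type": "application/json",
        },
    )


def create_product(product, authorization):
    """Send the request and return the created product as parsed JSON."""
    request = build_create_request(product, authorization)
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Remember that the catalogId has to be one that exists in your destination system, which is why grabbing it from an existing product is handy.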

Now, getting back to our original use case of loading multiple products from an external system: we could just keep POSTing products one at a time against the headless-commerce-admin-catalog/v1.0/products endpoint, and technically that would work, but the performance would be terrible for large data sets. Thankfully, Liferay provides a better way through the Batch framework. If we go back to the Liferay API Explorer and scroll down just a little further, we will find another endpoint, headless-commerce-admin-catalog/v1.0/products/batch.


 

With the Batch endpoint, we have a new optional parameter that we can pass with our request, the callbackURL. It exists because we typically process very large payloads through the Batch endpoint, so the immediate response can’t report the status of the overall request. Instead, it gives us an ID that we can use to check on the status of the import later, or the system can let us know once it has finished by calling the callbackURL.

One other thing to note about the Batch endpoints is that they don’t include an example the way the single-entity endpoints do. This is a bug and there’s a ticket for it (LPD-30065), so hopefully it will be resolved in an upcoming release. But never fear: the pattern for the batch endpoints is very consistent, and they take a JSON array of individual entities. For our overly simplistic product example, this could look like:

[
  {
    "active": true,
    "catalogId": 32780,
    "name": {
      "en_US": "DEMO Sensor One"
    },
    "productType": "simple"
  },
  {
    "active": true,
    "catalogId": 32780,
    "name": {
      "en_US": "DEMO Sensor Two"
    },
    "productType": "simple"
  },
  {
    "active": true,
    "catalogId": 32780,
    "name": {
      "en_US": "DEMO Sensor Three"
    },
    "productType": "simple"
  }  
]
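
In practice you would generate this array from your source system rather than type it by hand. Here is a small Python sketch that builds the array and prepares the batch request, optionally with a callbackURL. It assumes the batch endpoint lives at /products/batch on a local instance and that callbackURL is passed as a query parameter; the helper names and the example callback address are hypothetical.

```python
import json
import urllib.parse
import urllib.request

# Assumption: local instance, batch endpoint at /products/batch.
BATCH_URL = (
    "http://localhost:8080/o/headless-commerce-admin-catalog/v1.0"
    "/products/batch"
)


def simple_products(names, catalog_id, language_id="en_US"):
    """Build one minimal product payload per name."""
    return [
        {
            "active": True,
            "catalogId": catalog_id,
            "name": {language_id: name},
            "productType": "simple",
        }
        for name in names
    ]


def build_batch_request(products, authorization, callback_url=None):
    """Prepare (but do not send) the POST request for a batch import."""
    url = BATCH_URL
    if callback_url:
        # Assumption: callbackURL is sent as a query parameter.
        url += "?" + urllib.parse.urlencode({"callbackURL": callback_url})
    return urllib.request.Request(
        url,
        data=json.dumps(products).encode("utf-8"),
        method="POST",
        headers={
            "Accept": "application/json",
            "Authorization": authorization,
            "Content-Type": "application/json",
        },
    )
```

Sending the prepared request with `urllib.request.urlopen` should return the import task JSON, including the ID you need for checking the status later.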

Once submitted, we’ll see the immediate response to let us know the process has started and to provide us with the ID of the process so we can check on the status later.  

{
  "className": "com.liferay.headless.commerce.admin.catalog.dto.v1_0.Product",
  "contentType": "JSON",
  "errorMessage": "",
  "executeStatus": "STARTED",
  "externalReferenceCode": "b8efd816-aeb0-52f4-c967-3a26dd42dd33",
  "failedItems": [],
  "id": 1,
  "importStrategy": "ON_ERROR_FAIL",
  "operation": "CREATE",
  "processedItemsCount": 0,
  "startTime": "2024-07-01T03:06:00Z",
  "totalItemsCount": 3
}

After a few minutes (or in our case a few seconds) we can check the status of our import using the /headless-batch-engine/v1.0/import-task endpoint and supplying the ID of our job:  


 

If everything worked correctly, we should see something that looks like this:  

{
  "className": "com.liferay.headless.commerce.admin.catalog.dto.v1_0.Product",
  "contentType": "JSON",
  "endTime": "2024-07-01T03:06:01Z",
  "errorMessage": "",
  "executeStatus": "COMPLETED",
  "externalReferenceCode": "b8efd816-aeb0-52f4-c967-3a26dd42dd33",
  "failedItems": [],
  "id": 1,
  "importStrategy": "ON_ERROR_FAIL",
  "operation": "CREATE",
  "processedItemsCount": 3,
  "startTime": "2024-07-01T03:06:00Z",
  "totalItemsCount": 3
}
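
If you don't use a callbackURL, you can poll the import task until it finishes. The Python sketch below is one way to do that. It assumes the task ID is appended to the import-task path on a local instance and that FAILED is the terminal error status; only STARTED and COMPLETED appear in the responses above, so treat FAILED as my assumption. The function names are illustrative.

```python
import json
import time
import urllib.request

# Assumption: local instance; the task ID is appended as a path segment.
IMPORT_TASK_URL = (
    "http://localhost:8080/o/headless-batch-engine/v1.0/import-task"
)

# Assumption: FAILED is the terminal error status.
FINISHED_STATUSES = ("COMPLETED", "FAILED")


def fetch_import_task(task_id, authorization):
    """GET the current state of an import task (needs a running instance)."""
    request = urllib.request.Request(
        f"{IMPORT_TASK_URL}/{task_id}",
        headers={"Accept": "application/json", "Authorization": authorization},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def wait_for_import(task_id, fetch, interval=2.0, max_attempts=30):
    """Call fetch(task_id) until the task finishes, then return its JSON.

    `fetch` is passed in so the loop can be tried without a server.
    """
    for _ in range(max_attempts):
        task = fetch(task_id)
        if task.get("executeStatus") in FINISHED_STATUSES:
            return task
        time.sleep(interval)
    raise TimeoutError(f"Import task {task_id} did not finish in time")
```

A typical call would be `wait_for_import(1, lambda task_id: fetch_import_task(task_id, auth))`, after which you can inspect executeStatus, processedItemsCount, and failedItems on the result.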

Conclusion

So now we've seen how to import a single, very simple product as well as an array of simple products.  We can easily extend this example to include more complex products or much larger data sets. 

If you want to learn more about Liferay's Batch framework, my colleague Dave Nebinger wrote a great post called "Effective Liferay Batch" that covers the topic in more detail. You can also explore the official documentation on Liferay Learn, in particular "Batch Engine API Basics - Importing Data". And please let me know if there are other aspects of managing products that you'd like me to cover in future blog posts.

 


Absolutely great Jeffrey - you nailed it! Thanks for providing the missing pieces in the current documentation. 

"...performance would be terrible for large data sets." Yes: nearly two days in my case. Unfortunately, after switching to batch import today, I could not observe any increase in speed: ~700 ms for each separate POST per product vs. ~700 ms per product with one batch POST.

(Liferay Community Edition Portal 7.4.3.120 CE GA120. Default liferay image in an "up to date" local docker environment). 

Any idea why this is still so slow? How much performance improvement would you usually expect using the batch approach?