Document preview and conversion Unicode issues

Matt Glosson, modified 7 Years ago.

We're using Liferay 6.2. When I upload a Microsoft Word .docx document written in a non-Western script (e.g., Korean), the document preview does not render the non-ASCII characters correctly and replaces them with "unknown character" boxes. This happens not only in the preview: if somebody clicks "PDF" to download the document, the same "character not found" boxes appear there too. Users see the correct text only if they download the original version (in this case, the .docx). Word has an option to embed the fonts in the file; I tried that, but it made no difference. If I save the document as PDF from within Word and upload that, it previews fine.

This looks like either a bug in Liferay's preview handler/document converter or a back-end feature I don't know about that needs to be enabled. Has anybody dealt with this before? The obvious workaround is to save the documents as PDF on my computer from within Word (or to use a PowerShell tool that drives Word to batch-convert them), but I would rather avoid that if I can. Short of that, any advice?
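
For reference, here is a rough sketch of what the batch-conversion workaround could look like. Note that it swaps Word for a local LibreOffice install driven through its `soffice --headless --convert-to pdf` command line, so it is not the PowerShell/Word route mentioned above. The function names and folder paths are made up for illustration, and the output will only be correct if the machine doing the conversion has fonts that cover the script in question.

```python
import subprocess
from pathlib import Path


def build_convert_command(docx_path, out_dir, soffice="soffice"):
    """Build the LibreOffice command line that converts one file to PDF.

    Assumption: `soffice` is on PATH; pass a full path otherwise.
    """
    return [
        soffice,
        "--headless",            # run without opening a window
        "--convert-to", "pdf",   # target format
        "--outdir", str(out_dir),
        str(docx_path),
    ]


def convert_folder(src_dir, out_dir, soffice="soffice"):
    """Convert every .docx directly inside src_dir and return the expected PDF paths."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    converted = []
    for docx_path in sorted(Path(src_dir).glob("*.docx")):
        # check=True raises CalledProcessError if LibreOffice exits with an error
        subprocess.run(build_convert_command(docx_path, out_dir, soffice), check=True)
        converted.append(out_dir / (docx_path.stem + ".pdf"))
    return converted
```

Calling `convert_folder("to_upload", "to_upload/pdf")` (hypothetical folder names) would produce one PDF per .docx, which could then be uploaded in place of the originals.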