RE: Avoid crawl of web pages by search engine using IP and Domain name.
Hi,
I am using Liferay CE 7.1.2 GA3.
I need to prevent the website's pages from being indexed by any search engine.
I have set up a virtual host with my domain name and configured the domain name in Instance Settings.
I also updated the robots.txt content under
Build -> Pages -> Advanced Settings, setting the robots.txt for the pages as below:
User-Agent: *
Disallow:/
When I access http://mydomain.com/robots.txt
I see:
User-Agent: *
Disallow:/
But when I access the same file using the IP address,
http://10.0.0.1/robots.txt
the content I see is:
User-Agent: *
Disallow:
How can I set up the Liferay server so that pages are not indexed by search engines?
The robots.txt setting should apply whether the site is reached by domain name or by IP address; right now Liferay only applies it to one of the two.
Thanks in advance,
-Amit Sharma
I usually do things like that on a reverse proxy in front of Liferay. It's one of the many perks of having a reverse proxy.
If you need to do this in Liferay, you have to write a filter that intercepts the requests and returns a robots.txt that fits your needs depending on the host header.
https://portal.liferay.dev/docs/7-1/tutorials/-/knowledge_base/t/servlet-filters
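As a rough illustration of the filter approach, the host-dependent choice could look something like the sketch below. It is only the decision logic in plain Java, not a complete Liferay filter. The class name RobotsTxtSelector, its constants, and the allow-list of crawlable hosts are all made up for this example. It assumes a default-deny policy, so any host that is not explicitly listed (including a bare IP such as 10.0.0.1) gets "Disallow: /".

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

/** Hypothetical helper: picks a robots.txt body based on the Host header. */
final class RobotsTxtSelector {

    // Blocks all compliant crawlers from every path.
    static final String DISALLOW_ALL = "User-agent: *\nDisallow: /\n";
    // An empty Disallow value permits crawling of everything.
    static final String ALLOW_ALL = "User-agent: *\nDisallow:\n";

    // Hosts that may be crawled; everything else is blocked (assumed policy).
    private final Set<String> crawlableHosts = new HashSet<>();

    RobotsTxtSelector(Set<String> hosts) {
        for (String host : hosts) {
            crawlableHosts.add(normalizeHost(host));
        }
    }

    /** Lowercases the host and strips an optional ":port" suffix. */
    static String normalizeHost(String hostHeader) {
        if (hostHeader == null) {
            return "";
        }
        String host = hostHeader.trim().toLowerCase(Locale.ROOT);
        if (host.startsWith("[")) {
            // Bracketed IPv6 literal, e.g. "[::1]:8080": keep up to the bracket.
            int end = host.indexOf(']');
            return end > 0 ? host.substring(0, end + 1) : host;
        }
        int colon = host.lastIndexOf(':');
        return colon >= 0 ? host.substring(0, colon) : host;
    }

    /** Returns the robots.txt body to serve for the given Host header value. */
    String bodyFor(String hostHeader) {
        return crawlableHosts.contains(normalizeHost(hostHeader))
                ? ALLOW_ALL
                : DISALLOW_ALL;
    }
}
```

Inside a servlet filter's doFilter, one would presumably pass request.getHeader("Host") to bodyFor for requests to /robots.txt, write the result with content type text/plain, and skip the rest of the filter chain. Registering the filter in Liferay is covered by the tutorial linked above and is not shown here. With an empty allow-list, every host gets the blocking file, which matches the goal in the question.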
Thanks. This solved my problem.