<?xml version="1.0" encoding="UTF-8"?>
<feed xmlns="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <title>Problems with new Google news bot</title>
  <link rel="self" href="https://liferay.dev/c/message_boards/find_thread?p_l_id=119785294&amp;threadId=4353963" />
  <subtitle>Problems with new Google news bot</subtitle>
  <id>https://liferay.dev/c/message_boards/find_thread?p_l_id=119785294&amp;threadId=4353963</id>
  <updated>2026-05-11T04:47:18Z</updated>
  <dc:date>2026-05-11T04:47:18Z</dc:date>
  <entry>
    <title>Problems with new Google news bot</title>
    <link rel="alternate" href="https://liferay.dev/c/message_boards/find_message?p_l_id=119785294&amp;messageId=4353962" />
    <author>
      <name>Mike Robins</name>
    </author>
    <id>https://liferay.dev/c/message_boards/find_message?p_l_id=119785294&amp;messageId=4353962</id>
    <updated>2009-12-07T22:19:55Z</updated>
    <published>2009-12-07T22:19:55Z</published>
    <summary type="html">Hello,&lt;br /&gt;&lt;br /&gt;We have just started seeing these requests in our web server log files:&lt;br /&gt;&lt;br /&gt;66.249.65.4 - - [07/Dec/2009:15:49:41 +0000] &amp;#34;GET /somepage/somechildpage;!-1835184563!1087628637!1260200943826 HTTP/1.1&amp;#34; 404 818 &amp;#34;-&amp;#34; &amp;#34;Googlebot-News&amp;#34;&lt;br /&gt;&lt;br /&gt;It seems Google have started using a different bot for crawling news content (http://googlewebmastercentral.blogspot.com/2009/12/new-user-agent-for-news.html). The problem appears to be Liferay sees the &amp;#39;;!-1835...&amp;#39; as part of the page name which obviously doesn&amp;#39;t exist and therefore a 404 is returned.&lt;br /&gt;&lt;br /&gt;How can I tell Liferay to ignore everything from the &amp;#39;;&amp;#39; so it just looks for a page called &amp;#39;/somepage/somechildpage&amp;#39;?&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Many thanks,&lt;br /&gt;&lt;br /&gt;Mike.</summary>
    <dc:creator>Mike Robins</dc:creator>
    <dc:date>2009-12-07T22:19:55Z</dc:date>
  </entry>
</feed>
