b***@mnogosearch.org
2017-01-31 15:57:10 UTC
Author: Julien D.
But you can remove the protocol at search time,
using the search template language functionality.
http://www.mnogosearch.org/doc34/msearch-templates.html#template-functions
Hello Alexander,
Thanks for the answer.
However, the problem occurs during the indexing phase: the crawler tries to index
http://www.example.com/www.example.com/page-b.html (which does not exist)
instead of http://www.example.com/page-b.html.
Can I prevent those 404 errors?
Thanks!
Reply: <http://www.mnogosearch.org/board/message.php?id=21810>
Hello,
I couldn't find any information on this subject.
As people start using HTTPS, I get more and more problems when crawling
links that don't specify a protocol. For example,
<a href="//www.example.com/page-b.html">text</a>
is seen as http://www.example.com/www.example.com/page-b.html,
which of course causes a 404 error.
Any idea how to get the right links?
Thanks.
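For reference, a link of the form //host/path is a network-path reference in RFC 3986 terms: it should inherit only the scheme of the page it appears on, not the page's host and path. Below is a minimal Python sketch of the expected resolution using the standard library's urljoin. The page-a.html base URLs and the resolve_link helper are assumptions made for illustration; this is not mnoGoSearch code and says nothing about how the indexer is implemented.

```python
from urllib.parse import urljoin


def resolve_link(page_url: str, href: str) -> str:
    """Resolve href against the URL of the page that contains it."""
    return urljoin(page_url, href)


# Hypothetical pages containing the link (not taken from the original post).
base_http = "http://www.example.com/page-a.html"
base_https = "https://www.example.com/page-a.html"

# The protocol-relative link from the question.
href = "//www.example.com/page-b.html"

# Only the scheme is taken from the base URL; host and path come from href.
print(resolve_link(base_http, href))
print(resolve_link(base_https, href))
```

A crawler that resolves links this way requests page-b.html on the same scheme as the referring page, so neither the doubled host name nor the 404 appears.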
The crawler stores full URLs in the database.
But you can remove the protocol at search time,
using the search template language functionality.
http://www.mnogosearch.org/doc34/msearch-templates.html#template-functions
http://www.mnogosearch.org/doc33/msearch-templates-oper.html#templates-oper-misc