Discussion:
Webboard: Search of...Indexing on 2 DB
b***@mnogosearch.org
2013-12-07 17:15:14 UTC
Permalink
Author: Laurent
Email:
Message:
Hi Guys,

To improve performance, I split my index database (reindexing from scratch) across 2 different platforms.

Taken separately, search.htm works perfectly, though each search is of course limited to its own index.

I would now like to merge the searches so that results are taken from both SQL servers. I saw the brief explanation in the documentation, but it is a bit confusing to me.
search.htm is PHP, and the explanations are written for the CGI version, which feels riskier to me.

Does someone know the trick for the PHP version of the search script?

Thanks in advance

Reply: <http://www.mnogosearch.org/board/message.php?id=21603>
b***@mnogosearch.org
2013-12-09 13:06:16 UTC
Permalink
Author: Alexander Barkov
Email: ***@mnogosearch.org
Message:
Hi,
Post by b***@mnogosearch.org
Hi Guys,
To improve performance, I split my index database (reindexing from scratch) across 2 different platforms.
Taken separately, search.htm works perfectly, though each search is of course limited to its own index.
I would now like to merge the searches so that results are taken from both SQL servers. I saw the brief explanation in the documentation, but it is a bit confusing to me.
search.htm is PHP, and the explanations are written for the CGI version, which feels riskier to me.
Does someone know the trick for the PHP version of the search script?
You can use Udm_Alloc_Agent_Array() to specify multiple databases:
http://www.php.net/manual/en/function.udm-alloc-agent-array.php
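A minimal sketch of what that looks like, assuming the mnoGoSearch PHP extension is loaded; the two DSNs (host names, database names, credentials) are placeholders you would replace with your own:

```php
<?php
// Allocate one search agent over two databases (DSNs below are examples only).
$agent = udm_alloc_agent_array(array(
    "mysql://user:password@host1/search_main/?dbmode=blob",
    "mysql://user:password@host2/search_extra/?dbmode=blob",
));

// Run the query; results from both databases are merged into one result set.
$res = udm_find($agent, "search words");

// Print the URL of each found document.
$rows = udm_get_res_param($res, UDM_PARAM_NUM_ROWS);
for ($i = 0; $i < $rows; $i++) {
    echo udm_get_res_field($res, $i, UDM_FIELD_URL), "\n";
}

udm_free_res($res);
udm_free_agent($agent);
?>
```

As noted below, the extension still queries the two databases one after the other, so this merges the results but does not parallelize the work.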

However, the PHP module does not support parallel execution.
It queries the databases sequentially.

Note that the CGI version queries the databases in parallel, so it should be faster.

Btw, how many documents do you have?
What is the output from "indexer -S"?
Post by b***@mnogosearch.org
Thanks in advance
Reply: <http://www.mnogosearch.org/board/message.php?id=21604>
b***@mnogosearch.org
2013-12-10 05:59:32 UTC
Permalink
Author: Laurent
Email:
Message:
Hi Alex,

Thanks for your reply.

Currently, I don't have that many documents:
about 300K in the main DB and 100K in the other one.
But the robot is currently frozen due to lack of disk space.

During Xmas, I'll upgrade to 2x600 GB and then let the indexer run again. I expect millions of URLs to be indexed in the end, so I am just anticipating that.
I can already see the difference when building the BLOB index: from 1800 s down to 800 s, just because I split the data into 2 logical groups.

About parallel search: to avoid using the riskier CGI, would it be smart to build (for example) a Perl search front-end that can run the queries in threads and merge the results, and then use PHP only to fetch and display them?

Maybe a major improvement idea to consider :-)

Thx

Reply: <http://www.mnogosearch.org/board/message.php?id=21606>
