Discussion:
blob, single or multi parameters
d***@mapluz.fr
2013-10-03 10:53:09 UTC
Hi,
I have installed mnogosearch 3.3.14 with a MySQL database.
I created the tables with this command:

./indexer -Ecreate -d /usr/local/mnogosearch/etc/indexer.conf

In my indexer.conf I have this:

DBAddr mysql://root:***@localhost/mnogosearchactu/?DBMode=multi

When I try to run a search, I get this message:
Inverted word index not found. Probably you forgot to run 'indexer -Eblob' .
So I ran this command:
./indexer -Eblob -d /usr/local/mnogosearch/etc/indexer.conf

It all ran, but I have two questions:

1 - Why must I run this command: ./indexer -Eblob -d /usr/local/mnogosearch/etc/indexer.conf ?
I do not understand the -Eblob parameter.
2 - Is this crontab line correct for indexing every day at 23:00?
00 23 * * * /usr/local/mnogosearch/sbin/indexer -d /usr/local/mnogosearch/etc/indexer.conf

Thanks a lot for your responses.
Alexander Barkov
2013-10-03 12:13:25 UTC
Hi,
Post by d***@mapluz.fr
Hi
I have installed mnogosearch 3.3.14 with a MySQL database.
./indexer -Ecreate -d /usr/local/mnogosearch/etc/indexer.conf
Inverted word index not found. Probably you forgot to run 'indexer -Eblob'.
./indexer -Eblob -d /usr/local/mnogosearch/etc/indexer.conf
I guess you forgot to fix DBAddr in search.htm to match the one in
indexer.conf.
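
One way to check this is to compare the DBAddr lines of the two files. The following is only a sketch: the function names are made up for illustration, it assumes each file has a DBAddr line of the form "DBAddr <url>", and the location of search.htm shown in the last comment is a guess, not something stated in this thread.

```shell
#!/bin/sh
# Print the database URL from the first DBAddr line of a config file.
dbaddr_of() {
    awk '$1 == "DBAddr" { print $2; exit }' "$1"
}

# Succeed only if both files contain a DBAddr line and the URLs are identical.
same_dbaddr() {
    a=$(dbaddr_of "$1")
    b=$(dbaddr_of "$2")
    [ -n "$a" ] && [ "$a" = "$b" ]
}

# Example call (the search.htm path is an assumption):
# same_dbaddr /usr/local/mnogosearch/etc/indexer.conf \
#             /usr/local/mnogosearch/etc/search.htm || echo "DBAddr differs"
```

If the two URLs differ only in the DBMode parameter, that would explain the "Inverted word index not found" message.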
Post by d***@mapluz.fr
1 - Why must I run this command: ./indexer -Eblob -d /usr/local/mnogosearch/etc/indexer.conf ?
I do not understand the -Eblob parameter.
2 - Is this crontab line correct for indexing every day at 23:00?
00 23 * * * /usr/local/mnogosearch/sbin/indexer -d /usr/local/mnogosearch/etc/indexer.conf
With DBMode=multi, crawling and indexing are done at the same time.
The advantage is that the search index is always up to date with
what the crawler has already downloaded.

With DBMode=blob, crawling and indexing are separated in time.
The advantage of DBMode=blob is that it is much faster at search
time than DBMode=multi.
But it needs an extra step, "indexer --index"
(or "indexer -Eblob"; these commands are synonyms),
to bring the index up to date after the crawler has downloaded
a number of documents with new content
(i.e. both new documents and old documents that have changed
since the last crawl).


The choice between DBMode=multi and DBMode=blob can be made
depending on the database size and the search performance you need.


- If your document collection is rather small and you're
happy with search performance provided by DBMode=multi,
then use this command in both indexer.conf and search.htm:

DBAddr mysql://root:***@localhost/mnogosearchactu/?DBMode=multi

Your crontab command is OK in this case.


- If your document collection is rather big, and/or you prefer faster
search results, then use this DBAddr in both indexer.conf and
search.htm:

DBAddr mysql://root:***@localhost/mnogosearchactu/?DBMode=blob

In this case, the crontab task should do two things, one after the other:

# Crawling
/usr/local/mnogosearch/sbin/indexer -d /usr/local/mnogosearch/etc/indexer.conf
# Indexing
/usr/local/mnogosearch/sbin/indexer --index -d /usr/local/mnogosearch/etc/indexer.conf

It's a good idea to put these two commands into a shell script,
then use it from crontab.
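
Such a wrapper script could look roughly like the sketch below. The function name, the "run" argument and the script name used afterwards are made up for illustration; the indexer and indexer.conf paths are the ones from this thread and can be overridden through the environment. Skipping the index step when the crawl fails is a design choice of this sketch, not something indexer requires.

```shell
#!/bin/sh
# Crawl-then-index wrapper for DBMode=blob (illustrative sketch).
# Default paths are the ones used in this thread; override via environment.
INDEXER="${INDEXER:-/usr/local/mnogosearch/sbin/indexer}"
CONF="${CONF:-/usr/local/mnogosearch/etc/indexer.conf}"

crawl_and_index() {
    # Step 1: crawl. Stop if crawling fails, so a failed run is not indexed.
    "$INDEXER" -d "$CONF" || return 1
    # Step 2: rebuild the search index from what the crawler downloaded.
    "$INDEXER" --index -d "$CONF" || return 1
}

# Run both steps only when the script is called with "run" as its first argument.
if [ "${1:-}" = "run" ]; then
    crawl_and_index
fi
```

Saved as, say, /usr/local/mnogosearch/sbin/reindex.sh (a hypothetical name) and made executable, the crontab line would become:
00 23 * * * /usr/local/mnogosearch/sbin/reindex.sh run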


Now you can try switching search.htm between DBMode=blob
and DBMode=multi and compare performance.



If you decide to stay with DBMode=multi, then just copy DBAddr
from indexer.conf to search.htm.


If you decide to switch to DBMode=blob, then it's a good idea
to start from scratch:

1. Drop the tables in the current database that were created
for DBMode=multi

indexer --drop

2. Edit indexer.conf and search.htm, changing DBMode to blob.

3. Create tables for DBMode=blob

indexer --create

4. Crawl your document collection

indexer

5. Create index

indexer --index

6. Search
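
Steps 1 and 3 to 5 above can be sketched as two shell functions, with the manual edit of step 2 done between them. This is only an illustration: the function names are made up, passing -d together with --drop and --create is assumed to work the same way as with --index earlier in this message, and indexer may ask for confirmation before dropping tables.

```shell
#!/bin/sh
# Illustrative sketch of the switch from DBMode=multi to DBMode=blob.
# Default paths are the ones used in this thread; override via environment.
INDEXER="${INDEXER:-/usr/local/mnogosearch/sbin/indexer}"
CONF="${CONF:-/usr/local/mnogosearch/etc/indexer.conf}"

# Step 1: drop the tables created for DBMode=multi.
# Run this BEFORE editing the configuration files.
drop_multi_tables() {
    "$INDEXER" --drop -d "$CONF"
}

# Steps 3 to 5: run this AFTER DBMode has been changed to blob
# in both indexer.conf and search.htm (step 2, done by hand).
build_blob_index() {
    "$INDEXER" --create -d "$CONF" || return 1   # 3. create tables
    "$INDEXER" -d "$CONF" || return 1            # 4. crawl
    "$INDEXER" --index -d "$CONF" || return 1    # 5. create index
}
```

Keeping the drop and the rebuild in separate functions makes it harder to run the rebuild against a configuration that still says DBMode=multi.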
Post by d***@mapluz.fr
Thanks a lot for your responses.
_______________________________________________
General mailing list
http://lists.mnogosearch.org/listinfo/general
Mapluz Dev
2013-10-11 17:24:32 UTC
Thanks a lot for the information.