Discussion:
Webboard: cpu usage
b***@mnogosearch.org
2013-12-11 18:37:54 UTC
Author: fasfuuiios
Email:
Message:
I have noticed that even if I start indexer with 5, 10, 20 or 40
threads using the CrawlerThreads option in indexer.conf, the top
command never shows more than about 40% CPU, and only very rarely
does it rise to 55%. With more threads it can slightly DDoS some
sites, and they return 503 or even 508 errors. Munin, which I use for
server monitoring, shows rather stable performance without high CPU
or memory usage during indexing. Sometimes indexer hangs, but I check
it with cron every minute and start it again if it is not running.

* * * * * root pgrep indexer > /dev/null || /usr/local/mnogosearch/sbin/indexer -l
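
The crontab check above can also be written as a small script. This is
only a sketch, not something that ships with mnogosearch:
`ensure_running` and its parameters are hypothetical names, and the
indexer path and the `-l` flag are copied from the crontab line. It
uses `pgrep -x` so that only a process named exactly `indexer` counts.
Like the cron line, it only notices an indexer that has exited, not
one that is still alive but hung.

```python
import subprocess

# Path and flag copied from the crontab line; adjust for your install.
INDEXER_CMD = ("/usr/local/mnogosearch/sbin/indexer", "-l")


def ensure_running(name="indexer", start_cmd=INDEXER_CMD,
                   run=subprocess.run, spawn=subprocess.Popen):
    """Start start_cmd unless a process named exactly `name` exists.

    Returns True if a new process was started, False otherwise.
    `run` and `spawn` are injectable so the logic can be tested
    without touching real processes.
    """
    # pgrep -x matches the whole process name; exit status 0 means
    # at least one matching process was found.
    result = run(["pgrep", "-x", name], stdout=subprocess.DEVNULL)
    if result.returncode == 0:
        return False
    spawn(list(start_cmd))
    return True
```

To use it from cron, the script would end with a call to
`ensure_running()` and replace the pgrep one-liner in the crontab.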

Does mnogosearch have some internal performance limit for indexer so
that searches can run in parallel with indexing? Or have I missed
something in the compile options or some special option in
indexer.conf? I have not experimented with more than one indexer
process. Is it possible to sustain 80% CPU usage constantly? If
yes, what is the safest and most stable way to do it when the server
is used only for indexing?

Or maybe it is good practice to limit indexer? I have seen PHP
crawlers that can easily eat 90% of CPU. Of course, their slow
performance cannot be compared with mnogosearch's high speed; it
works very fast. Still, I am interested in how to load the server
completely during indexing.



Reply: <http://www.mnogosearch.org/board/message.php?id=21611>
b***@mnogosearch.org
2013-12-13 12:34:06 UTC
Author: fasfuuiios
Email:
Message:
Regarding these tests, I forgot to add the configuration-specific
details.

I use PostgreSQL, tuned with
http://pgfoundry.org/projects/pgtune/
on each node.

Nodes are simple and old.

1) Pentium(R) Dual-Core CPU T4500 @ 2.30GHz x2
with 4 GB memory
with an ordinary HDD
OS Debian 32bit

2) Intel(R) Celeron(R) CPU E1400 @ 2.00GHz x2
with 1 GB memory
with an ordinary HDD
OS Debian 32bit

3) AMD Athlon(tm) 64 X2 Dual Core Processor 5600+ 2x2800 MHz
with 2 GB memory
with two ordinary HDDs in software RAID 1
OS Debian 64bit

I have noticed that an OpenVZ VPS with an SSD can work much faster.
This is understandable; an SSD is always recommended for such
database workloads.

But in any case, none of these nodes can be pushed to high CPU usage
by indexer with 5/10/20/40/50 threads. I have not tested more threads
because in some cases it becomes a small DDoS attack. CrawlDelay is
not used.
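
A rough back-of-envelope model shows why more threads start to look
like a small attack. This is a sketch under stated assumptions, not a
description of how indexer schedules requests: it assumes every thread
is fetching from the same host, each request takes a fixed response
time, and each thread pauses for a fixed delay between its own
requests. The function name and parameters are hypothetical.

```python
def requests_per_second(threads, response_time_s, delay_s=0.0):
    """Estimate the request rate one host sees from `threads` crawlers.

    Assumes all threads hit the same host, every request takes
    response_time_s seconds, and each thread waits delay_s seconds
    between its own requests.
    """
    per_request = response_time_s + delay_s
    if threads < 0 or per_request <= 0:
        raise ValueError("threads must be >= 0 and time per request > 0")
    return threads / per_request
```

Under these assumptions, 50 threads against a host that answers in
0.5 s produce about 100 requests per second with no delay, and about
25 per second if each thread waits 1.5 s between requests.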

It seems that the hard drive is always the main bottleneck.
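
One way to check the disk hypothesis is to compare how much CPU time
is spent doing work with how much is spent waiting on I/O. The sketch
below assumes a Linux node, where the first line of /proc/stat holds
cumulative counters in the order user, nice, system, idle, iowait,
irq, softirq, steal. The function names are hypothetical. A high
iowait share together with a low busy share would be consistent with
the disk, not the CPU, being the limit.

```python
def parse_cpu_line(line):
    """Return the counters from the aggregate 'cpu' line of /proc/stat."""
    fields = line.split()
    if not fields or fields[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    return [int(x) for x in fields[1:]]


def read_sample(path="/proc/stat"):
    """Read the aggregate CPU counters (Linux only)."""
    with open(path) as f:
        return parse_cpu_line(f.readline())


def cpu_shares(before, after):
    """Percent of time spent busy, in iowait and idle between samples."""
    # Use only the first 8 counters; guest time is already counted
    # inside user and nice, so adding it would double count.
    delta = [b - a for a, b in zip(before[:8], after[:8])]
    total = sum(delta)
    if total <= 0:
        raise ValueError("samples must be taken at different times")
    idle, iowait = delta[3], delta[4]
    busy = total - idle - iowait
    return {
        "busy": 100.0 * busy / total,
        "iowait": 100.0 * iowait / total,
        "idle": 100.0 * idle / total,
    }
```

In practice one would call `read_sample()` twice, a few seconds apart
while indexer is running, and pass both results to `cpu_shares`.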

Currently I think that maybe sysctl.conf must be edited to make it
work faster.

Reply: <http://www.mnogosearch.org/board/message.php?id=21612>
b***@mnogosearch.org
2013-12-15 12:14:07 UTC
Author: fasfuuiios
Email:
Message:
Found this related thread:
http://www.mnogosearch.org/board/message.php?id=19643

I have tried starting 2 instances of indexer.
indexer.conf has
CrawlerThreads 50

I thought that maybe it is related to the number of cores. But it
looks like there is no difference between 1 and 2 indexer instances
when indexer.conf defines
CrawlerThreads 50

Reply: <http://www.mnogosearch.org/board/message.php?id=21614>
