2006-09-21 System Load

What determines system load? EmacsWiki is hosted on a server that is seeing incredible load: usually above 5, sometimes above 10. Oddmuse is fast, but on emacswiki.org it’s slow as a dog because there’s always an initial delay. I have noticed the same delay when connecting via ssh and running a command as simple as ls. When I run Emacs, things are fast. When I do filename completion, there’s that delay again. Is it the disk?


Look at the output of top – there seems to be no %CPU usage. Perhaps some processes are hidden from me. That’s the only explanation I can think of.

top - 00:20:26 up 37 days, 12:26,  3 users,  load average: 11.76, 8.21, 6.38
Tasks:  51 total,   2 running,  49 sleeping,   0 stopped,   0 zombie
Cpu(s): 32.6% us,  4.3% sy,  3.3% ni, 23.3% id, 36.2% wa,  0.0% hi,  0.3% si
Mem:   2059712k total,  2043180k used,    16532k free,   141856k buffers
Swap:  2097144k total,   291732k used,  1805412k free,   801332k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 8007 aschremm  16   0  5364 1640 1392 S  0.3  0.1   7:27.25 top
    1 root      16   0  1532  466 1296 S  0.0  0.1   0:01.18 cron
 1209 root      15   0  3552  760  664 S  0.0  0.0   0:19.33 sshd
 6823 aschremm  15   0 11752 7096  744 S  0.0  0.3   9:56.24 screen
 6824 aschremm  16   0  7748 1388 1384 S  0.0  0.1   0:00.08 zsh
 6842 aschremm  16   0 11396 8272 1728 S  0.0  0.4  11:48.22 run_supybot
 6852 aschremm  16   0  6900 1300 1296 S  0.0  0.1   0:00.05 zsh
 6861 aschremm  15   0 56332  40m 3772 S  0.0  2.0  61:56.18 irssi
 8007 aschremm  16   0  5364 1640 1392 S  0.0  0.1   7:27.24 top
19423 aschremm  16   0  5796 1576 1500 S  0.0  0.1   0:00.02 sqlite3
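One reading of the numbers above: the Cpu(s) line reports 36.2% wa, which is time the CPUs spent idle while waiting on I/O. On Linux the load average also counts tasks blocked in uninterruptible (usually disk) sleep, so load can be high while no visible process shows any %CPU. The following is a minimal sketch, assuming a Linux /proc/stat whose aggregate "cpu" line starts with the fields user, nice, system, idle, iowait, irq, softirq. The function names are made up for illustration, and the five-second sampling interval is arbitrary.

```python
import time

# Order of the first seven counters on the aggregate "cpu" line of /proc/stat.
FIELDS = ("user", "nice", "system", "idle", "iowait", "irq", "softirq")


def parse_cpu_line(line):
    """Parse the aggregate 'cpu' line of /proc/stat into a dict of tick counts."""
    parts = line.split()
    if not parts or parts[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    return dict(zip(FIELDS, (int(p) for p in parts[1:1 + len(FIELDS)])))


def cpu_shares(before, after):
    """Percentage of elapsed ticks spent in each state between two samples."""
    delta = {name: after[name] - before[name] for name in FIELDS}
    total = sum(delta.values())
    if total == 0:
        return {name: 0.0 for name in FIELDS}
    return {name: 100.0 * ticks / total for name, ticks in delta.items()}


def read_cpu_line(path="/proc/stat"):
    """Read the first line of /proc/stat (Linux only)."""
    with open(path) as f:
        return f.readline()


if __name__ == "__main__":
    first = parse_cpu_line(read_cpu_line())
    time.sleep(5)  # arbitrary sampling interval
    second = parse_cpu_line(read_cpu_line())
    for name, pct in cpu_shares(first, second).items():
        print(f"{name:8s} {pct:5.1f}%")
```

A persistently large iowait share would point at the disk rather than at CPU-bound scripts, though it does not say whose processes are issuing the I/O.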

How can I prove that my wiki is not involved in this load problem?

#Unix

Comments

(Please contact me if you want to remove your comment.)

Hm, if a system is getting slow, the HD could be the problem (it is the problem we encounter most on our server farm). Did you check /var/log/messages for “I/O error” or similar? A load of >10 is already quite a lot, depending on the system, the software running, ... It could also be that there’s a lot of IO traffic slowing down the system. Or too little RAM. Does your machine swap? More useful commands which might help: pstree, ps -auxw|grep wiki, netstat, vmstat (but you have probably already checked them...)

– 2ni 2006-09-22 09:41 UTC


---

Well, disk IO was what the admin told me. In a shared hosting environment, however, I’d like to see whether it is my scripts causing the problem or not.
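One way to get at that question, sketched here under several assumptions: on Linux kernels built with task I/O accounting, each process exposes byte counters in /proc/PID/io, and a user can normally read them for their own processes. That interface is newer than many kernels in use today, so it may simply be absent on a given host. The helper names below are hypothetical.

```python
import os


def parse_proc_io(text):
    """Parse the 'key: value' lines of /proc/PID/io into a dict of ints."""
    counters = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            counters[key.strip()] = int(value)
    return counters


def io_by_process(uid=None):
    """Return (read_bytes, write_bytes, pid, name) for processes owned by uid.

    Assumes Linux with /proc/PID/io and /proc/PID/comm available; processes
    that vanish or cannot be read are skipped.
    """
    uid = os.getuid() if uid is None else uid
    results = []
    for entry in os.listdir("/proc"):
        if not entry.isdigit():
            continue
        base = os.path.join("/proc", entry)
        try:
            if os.stat(base).st_uid != uid:
                continue
            with open(os.path.join(base, "io")) as f:
                counters = parse_proc_io(f.read())
            with open(os.path.join(base, "comm")) as f:
                name = f.read().strip()
        except OSError:
            continue
        results.append((counters.get("read_bytes", 0),
                        counters.get("write_bytes", 0),
                        int(entry), name))
    return sorted(results, reverse=True)


if __name__ == "__main__":
    if os.path.isdir("/proc"):
        for read_b, write_b, pid, name in io_by_process()[:10]:
            print(f"{pid:6d} {name:16s} read={read_b} written={write_b}")
```

The counters are cumulative per process, so short-lived CGI scripts would have to be sampled while they run, or logged by the scripts themselves on exit. Comparing one's own totals against the machine-wide disk activity would still need numbers from the admin.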

At the moment load seems to be back under control. I am serving the four most popular RSS feeds from the filesystem (and regenerating them every two hours) instead of generating thousands of them every day. The RSS feed containing full page content is an especially bad hog. I am also serving uploaded files from the filesystem, automatically changing the links to the current version of the image in the wiki so that they point to the static copy of the file.
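The feed caching described here could look roughly like the sketch below. It is not Oddmuse's code (Oddmuse is written in Perl). The function name, the cache path, and the choice to regenerate lazily on the first request after expiry, rather than from a scheduled job, are all assumptions made for illustration.

```python
import os
import tempfile
import time

TWO_HOURS = 2 * 60 * 60  # cache lifetime in seconds


def cached_feed(path, generate, max_age=TWO_HOURS, now=None):
    """Return the feed stored at path, regenerating it when older than max_age.

    generate is a zero-argument callable returning the feed as a string;
    now can be passed explicitly to make the expiry logic testable.
    """
    now = time.time() if now is None else now
    try:
        fresh = now - os.path.getmtime(path) < max_age
    except OSError:
        fresh = False  # no cached copy yet
    if fresh:
        with open(path, encoding="utf-8") as f:
            return f.read()
    content = generate()
    # Write to a temporary file and rename it into place, so that readers
    # never see a half-written feed.
    directory = os.path.dirname(path) or "."
    fd, tmp = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w", encoding="utf-8") as f:
        f.write(content)
    os.replace(tmp, path)
    return content
```

With a cache like this, an expensive feed is rebuilt at most once per two-hour window no matter how many readers poll it. A static copy also lets the web server hand the file out without starting the wiki script at all.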

Load is between 1 and 4, at the moment. I find it very frustrating to not have *proof*, however, that it was in fact my change that solved the problem.

– Alex Schroeder 2006-09-22 22:37 UTC
