2014-12-24 Emacs Wiki Migration

(Continued from yesterday.)


Current status:

Here’s the munin graph that makes me suspect that I’m not getting all of the CPU I need. The gaps happen when load is so high that munin doesn’t get to run. Yikes!

https://alexschroeder.ch/pics/15911905077_b82b4056a0_o.png

I think what happens is that somebody else on this server is taking up CPU and this results in Apache processes piling up.

Here’s from the log files:

Dec 24 09:46:37 localhost monit[29022]: 'apache' trying to restart
Dec 24 09:46:37 localhost monit[29022]: 'apache' stop: /etc/init.d/apache2
Dec 24 09:46:38 localhost sm-mta[27076]: rejecting connections on daemon MTA-v4: load average: 71
...
Dec 24 13:02:55 localhost monit[29022]: 'apache' trying to restart
Dec 24 13:02:55 localhost monit[29022]: 'apache' stop: /etc/init.d/apache2
Dec 24 13:03:01 localhost sm-mta[27076]: rejecting connections on daemon MTA-v4: load average: 37
...
Dec 24 15:55:35 localhost monit[29022]: 'apache' trying to restart
Dec 24 15:55:35 localhost monit[29022]: 'apache' stop: /etc/init.d/apache2
Dec 24 15:55:43 localhost sm-mta[27076]: rejecting connections on daemon MTA-v4: load average: 27
...
Dec 24 16:54:16 localhost monit[29022]: 'apache' trying to restart
Dec 24 16:54:16 localhost monit[29022]: 'apache' stop: /etc/init.d/apache2
Dec 24 16:54:25 localhost sm-mta[27076]: rejecting connections on daemon MTA-v4: load average: 34
...
Dec 24 17:56:52 localhost monit[29022]: 'apache' trying to restart
Dec 24 17:56:52 localhost monit[29022]: 'apache' stop: /etc/init.d/apache2
...
Dec 24 21:56:02 localhost monit[29022]: 'apache' trying to restart
Dec 24 21:56:02 localhost monit[29022]: 'apache' stop: /etc/init.d/apache2
Dec 24 21:56:03 localhost sm-mta[27076]: rejecting connections on daemon MTA-v4: load average: 37

OK, so how do I make Apache more robust against those spikes?

The current settings:

# StartServers: initial number of server processes to start
# MinSpareThreads: minimum number of worker threads which are kept spare
# MaxSpareThreads: maximum number of worker threads which are kept spare
# ThreadLimit: ThreadsPerChild can be changed to this maximum value during a
#              graceful restart. ThreadLimit can only be changed by stopping
#              and starting Apache.
# ThreadsPerChild: constant number of worker threads in each server process
# MaxClients: maximum number of simultaneous client connections
# MaxRequestsPerChild: maximum number of requests a server process serves
# <IfModule mpm_worker_module>
#     StartServers          2
#     MinSpareThreads      25
#     MaxSpareThreads      75
#     ThreadLimit          64
#     ThreadsPerChild      25
#     MaxClients          150
#     MaxRequestsPerChild   0
# </IfModule>
<IfModule mpm_worker_module>
    StartServers          2
    ServerLimit           3
    MinSpareThreads      25
    MaxSpareThreads      75
    ThreadLimit         100
    ThreadsPerChild     100
    MaxClients          300
    MaxRequestsPerChild  10000
</IfModule>

Nic said I should drop those numbers to better match the cores I have available. Each thread that is actually running will also do IO, so that’s an additional issue.

Two cores → no more than 2 servers. When a thread waits for the disk, another thread can run. How many more threads? Not many, because they will also need to use the disk. How about a 90% reduction: 10 threads per child. Looking at this German blog post again.

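Written out, the reasoning above (two cores, so two server processes, with ten threads each) might look something like the following. This is only a sketch: the ServerLimit, ThreadsPerChild and MaxClients values follow from the numbers in this post, while the spare-thread values are my assumption, picked so that MaxSpareThreads is at least MinSpareThreads plus ThreadsPerChild. MaxRequestsPerChild is simply carried over from the settings above.

```apache
<IfModule mpm_worker_module>
    # Two cores, so start and allow at most two server processes.
    StartServers          2
    ServerLimit           2
    # Assumed values: keep a few idle threads around, but not many.
    MinSpareThreads       5
    MaxSpareThreads      15
    # 90% reduction from 100 threads per child.
    ThreadLimit          10
    ThreadsPerChild      10
    # ServerLimit x ThreadsPerChild = 2 x 10
    MaxClients           20
    MaxRequestsPerChild  10000
</IfModule>
```

With these numbers, at most 20 requests are handled at the same time; anything beyond that waits in the listen queue instead of piling up as more busy threads.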

Hours later. It seems to have worked? Load stable, hovering around 2 – this makes sense.

https://alexschroeder.ch/pics/16100535652_84b896b86a_o.png

All is not perfect, unfortunately. Monit says Apache has 6h uptime. That means somebody restarted Apache during the night. Damn.

​#Emacs ​#Wikis ​#Oddmuse ​#mod perl ​#Apache ​#devops ​#Administration