Every now and then I wake up to a website that’s dead. When I try to restart the Toadfarm using Monit, nothing happens. I connect to the box using `ssh` and try to restart it from the command line:
    $ ./farm reload
     * Reloading the process farm
    Can't create listen socket: Address already in use at /home/alex/perl5/perlbrew/perls/perl-5.26.1/lib/site_perl/5.26.1/Mojo/IOLoop.pm line 126.
       ...fail!
So the farm is up and listening on the port, but it’s not actually doing anything. Now what? They’re all asleep!
    top - 07:24:55 up 150 days, 15:34,  2 users,  load average: 0.00, 0.03, 0.00
    Tasks: 124 total,   2 running, 122 sleeping,   0 stopped,   0 zombie
    %Cpu(s):  0.0 us,  0.7 sy,  0.0 ni, 99.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.3 st
    KiB Mem :  3084768 total,   336608 free,   737820 used,  2010340 buff/cache
    KiB Swap:  1022972 total,       16 free,  1022956 used.  2119480 avail Mem

      PID USER      PR  NI    VIRT    RES    SHR S %CPU %MEM     TIME+ COMMAND
    16448 alex      20   0  331468 125888   3760 S  0.0  4.1   0:00.33 /home/alex/farm
    16563 alex      20   0  378652 104660   3668 S  0.0  3.4   0:00.38 /home/alex/farm
     6681 alex      20   0  358828  92952   3940 S  0.0  3.0   0:00.89 /home/alex/farm
      709 alex      20   0  347792  84888   5484 S  0.0  2.8   0:01.06 /home/alex/farm
     6730 alex      20   0  342380  74416   3940 S  0.0  2.4   0:01.13 /home/alex/farm
      710 alex      20   0  335376  65868   5424 S  0.0  2.1   0:00.99 /home/alex/farm
    22714 alex      30  10  118320  45080   7772 S  0.0  1.5   0:00.53 perl
    18640 alex      20   0   85272  25896   6612 S  0.0  0.8   0:00.43 perl
     9579 alex      20   0  343312  25568   3848 S  0.0  0.8   0:00.45 /home/alex/farm
    13534 alex      20   0  331080  25548   3848 S  0.0  0.8   0:00.09 /home/alex/farm
     8731 alex      20   0  362272  25520   3848 S  0.0  0.8   0:00.10 /home/alex/farm
    10399 alex      20   0  330736  25500   3812 S  0.0  0.8   0:00.35 /home/alex/farm
    31772 alex      20   0  327720  25468   3848 S  0.0  0.8   0:01.46 /home/alex/farm
     1931 alex      20   0  328808  25396   3848 S  0.0  0.8   0:00.19 /home/alex/farm
      974 alex      20   0  329248  25380   3848 S  0.0  0.8   0:00.27 /home/alex/farm
And so I’m doing what I usually do, without actually solving the underlying problem:
for pid in $(ps -u alex | grep /farm | cut -c 1-5); do kill $pid; done
Do you have a better idea? What’s going on here?
#Toadfarm #Monit #Administration
⁂
Gah. Had to kill it all, again. Why‽
I fixed my `kill-farm` script:
    #! /bin/bash
    for pid in $(ps -xo pid,command|perl -ne '@s=split; print $s[0] . "\n" if $s[1] eq "/home/alex/farm/farm"'); do
        kill -9 $pid
    done
From `/var/log/syslog`:
Linux sibirocobombus 4.9.0-3-amd64 #1 SMP Debian 4.9.30-2+deb9u5 (2017-09-19) x86_64 The programs included with the Debian GNU/Linux system are free software; the exact distribution terms for each program are described in the individual files in /usr/share/doc/*/copyright. Debian GNU/Linux comes with ABSOLUTELY NO WARRANTY, to the extent permitted by applicable law. Last login: Mon Oct 1 19:40:53 2018 from 2a02:168:4823:0:9c0b:e651:b5d7:2f4f root@sibirocobombus:~# cat /tmp/log Oct 1 16:51:54 sibirocobombus kernel: [15296489.567141] /home/alex/farm: page allocation stalls for 10472ms, order:0, mode:0x24200ca(GFP_HIGHUSER_MOVABLE) Oct 1 16:51:55 sibirocobombus kernel: [15296489.567150] CPU: 0 PID: 13804 Comm: /home/alex/farm Not tainted 4.9.0-3-amd64 #1 Debian 4.9.30-2+deb9u5 Oct 1 16:51:55 sibirocobombus kernel: [15296489.567151] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.7.5-20140531_083030-gandalf 04/01/2014 Oct 1 16:51:55 sibirocobombus kernel: [15296489.567153] 0000000000000000 ffffffffb9f285b4 ffffffffba5febb0 ffffb9ccc10cbb68 Oct 1 16:51:55 sibirocobombus kernel: [15296489.567156] ffffffffb9d84f3a 024200ca00000002 ffffffffba5febb0 ffffb9ccc10cbb08 Oct 1 16:51:55 sibirocobombus kernel: [15296489.567159] ffff8f1a00000010 ffffb9ccc10cbb78 ffffb9ccc10cbb28 98dccf6c149eda9d Oct 1 16:51:55 sibirocobombus kernel: [15296489.567161] Call Trace: Oct 1 16:51:55 sibirocobombus kernel: [15296489.567168] [<ffffffffb9f285b4>] ? dump_stack+0x5c/0x78 Oct 1 16:51:55 sibirocobombus kernel: [15296489.567171] [<ffffffffb9d84f3a>] ? warn_alloc+0x13a/0x160 Oct 1 16:51:55 sibirocobombus kernel: [15296489.567173] [<ffffffffb9d8592d>] ? __alloc_pages_slowpath+0x95d/0xbc0 Oct 1 16:51:55 sibirocobombus kernel: [15296489.567175] [<ffffffffb9d85d8e>] ? __alloc_pages_nodemask+0x1fe/0x260 Oct 1 16:51:55 sibirocobombus kernel: [15296489.567177] [<ffffffffb9dd7c3e>] ? alloc_pages_vma+0xae/0x260 Oct 1 16:51:55 sibirocobombus kernel: [15296489.567181] [<ffffffffb9daf069>] ? 
wp_page_copy+0x89/0x700 Oct 1 16:51:55 sibirocobombus kernel: [15296489.567183] [<ffffffffb9db0361>] ? do_wp_page+0x161/0x7d0 Oct 1 16:51:55 sibirocobombus kernel: [15296489.567186] [<ffffffffb9db3170>] ? handle_mm_fault+0x8d0/0x1350 Oct 1 16:51:55 sibirocobombus kernel: [15296489.567188] [<ffffffffb9c24701>] ? __switch_to+0x2c1/0x6c0 Oct 1 16:51:55 sibirocobombus kernel: [15296489.567191] [<ffffffffb9c5fd84>] ? __do_page_fault+0x2a4/0x510 Oct 1 16:51:55 sibirocobombus kernel: [15296489.567194] [<ffffffffba207788>] ? async_page_fault+0x28/0x30 Oct 1 16:51:55 sibirocobombus kernel: [15296489.567195] Mem-Info: Oct 1 16:51:55 sibirocobombus kernel: [15296489.567199] active_anon:563083 inactive_anon:141043 isolated_anon:813 Oct 1 16:51:55 sibirocobombus kernel: [15296489.567199] active_file:3237 inactive_file:1620 isolated_file:0 Oct 1 16:51:55 sibirocobombus kernel: [15296489.567199] unevictable:0 dirty:21 writeback:7962 unstable:0 Oct 1 16:51:55 sibirocobombus kernel: [15296489.567199] slab_reclaimable:8695 slab_unreclaimable:14080 Oct 1 16:51:55 sibirocobombus kernel: [15296489.567199] mapped:5449 shmem:8956 pagetables:15088 bounce:0 Oct 1 16:51:55 sibirocobombus kernel: [15296489.567199] free:14229 free_pcp:0 free_cma:0 Oct 1 16:51:55 sibirocobombus kernel: [15296489.567209] Node 0 active_anon:2252332kB inactive_anon:564172kB active_file:12948kB inactive_file:6480kB unevictable:0kB isolated(anon):3252kB isolated(file):0kB mapped:21796kB dirty:84kB writeback:31848kB shmem:35824kB shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 0kB writeback_tmp:0kB unstable:0kB pages_scanned:2 all_unreclaimable? 
no Oct 1 16:51:55 sibirocobombus kernel: [15296489.567210] Node 0 DMA free:12124kB min:232kB low:288kB high:344kB active_anon:3492kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB present:15992kB managed:15908kB mlocked:0kB slab_reclaimable:12kB slab_unreclaimable:88kB kernel_stack:0kB pagetables:184kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB Oct 1 16:51:56 sibirocobombus kernel: [15296489.567214] lowmem_reserve[]: 0 2975 2975 2975 2975 Oct 1 16:51:56 sibirocobombus kernel: [15296489.567217] Node 0 DMA32 free:44792kB min:44820kB low:56024kB high:67228kB active_anon:2248840kB inactive_anon:564172kB active_file:12948kB inactive_file:6480kB unevictable:0kB writepending:31932kB present:3129216kB managed:3068860kB mlocked:0kB slab_reclaimable:34768kB slab_unreclaimable:56232kB kernel_stack:5392kB pagetables:60168kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB Oct 1 16:51:56 sibirocobombus kernel: [15296489.567221] lowmem_reserve[]: 0 0 0 0 0 Oct 1 16:51:56 sibirocobombus kernel: [15296489.567224] Node 0 DMA: 5*4kB (MEH) 5*8kB (UME) 4*16kB (ME) 5*32kB (UEH) 9*64kB (UMEH) 6*128kB (UEH) 5*256kB (UEH) 2*512kB (ME) 2*1024kB (UH) 3*2048kB (UME) 0*4096kB = 12124kB Oct 1 16:51:56 sibirocobombus kernel: [15296489.567236] Node 0 DMA32: 2599*4kB (UMEH) 1538*8kB (UMEH) 672*16kB (UMEH) 265*32kB (UME) 23*64kB (MEH) 1*128kB (H) 1*256kB (H) 0*512kB 1*1024kB (H) 0*2048kB 0*4096kB = 44812kB Oct 1 16:51:56 sibirocobombus kernel: [15296489.567248] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB Oct 1 16:51:56 sibirocobombus kernel: [15296489.567249] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB Oct 1 16:51:56 sibirocobombus kernel: [15296489.567249] 26825 total pagecache pages Oct 1 16:51:56 sibirocobombus kernel: [15296489.567251] 13006 pages in swap cache Oct 1 16:51:56 sibirocobombus kernel: [15296489.567252] Swap cache stats: add 3064172, delete 3051166, find 
484392409/484925615 Oct 1 16:51:56 sibirocobombus kernel: [15296489.567253] Free swap = 147028kB Oct 1 16:51:56 sibirocobombus kernel: [15296489.567253] Total swap = 1022972kB Oct 1 16:51:56 sibirocobombus kernel: [15296489.567254] 786302 pages RAM Oct 1 16:51:56 sibirocobombus kernel: [15296489.567255] 0 pages HighMem/MovableOnly Oct 1 16:51:56 sibirocobombus kernel: [15296489.567255] 15110 pages reserved Oct 1 16:51:56 sibirocobombus kernel: [15296489.567256] 0 pages hwpoisoned Oct 1 16:51:56 sibirocobombus kernel: [15296512.538179] /home/alex/farm invoked oom-killer: gfp_mask=0x24200ca(GFP_HIGHUSER_MOVABLE), nodemask=0, order=0, oom_score_adj=0 Oct 1 16:51:56 sibirocobombus kernel: [15296512.538181] /home/alex/farm cpuset=/ mems_allowed=0 Oct 1 16:51:56 sibirocobombus kernel: [15296512.538186] CPU: 0 PID: 13764 Comm: /home/alex/farm Not tainted 4.9.0-3-amd64 #1 Debian 4.9.30-2+deb9u5 Oct 1 16:51:56 sibirocobombus kernel: [15296512.538187] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.7.5-20140531_083030-gandalf 04/01/2014 Oct 1 16:51:56 sibirocobombus kernel: [15296512.538189] 0000000000000000 ffffffffb9f285b4 ffffb9ccc0e9bc20 ffff8f1a4e8f8100 Oct 1 16:51:56 sibirocobombus kernel: [15296512.538193] ffffffffb9dfe020 0000000000000000 0000000000000000 0000000000000001 Oct 1 16:51:56 sibirocobombus kernel: [15296512.538195] ffffffffb9d844e7 0000004252982a80 ffffffffc0454695 0000000000000001 Oct 1 16:51:56 sibirocobombus kernel: [15296512.538197] Call Trace: Oct 1 16:51:56 sibirocobombus kernel: [15296512.538204] [<ffffffffb9f285b4>] ? dump_stack+0x5c/0x78 Oct 1 16:51:56 sibirocobombus kernel: [15296512.538206] [<ffffffffb9dfe020>] ? dump_header+0x78/0x1fd Oct 1 16:51:56 sibirocobombus kernel: [15296512.538209] [<ffffffffb9d844e7>] ? get_page_from_freelist+0x3f7/0xb40 Oct 1 16:51:56 sibirocobombus kernel: [15296512.538214] [<ffffffffc0454695>] ? 
virtballoon_oom_notify+0x25/0x70 [virtio_balloon] Oct 1 16:51:56 sibirocobombus kernel: [15296512.538217] [<ffffffffb9d8047a>] ? oom_kill_process+0x21a/0x3e0 Oct 1 16:51:56 sibirocobombus kernel: [15296512.538220] [<ffffffffb9d800fd>] ? oom_badness+0xed/0x170 Oct 1 16:51:56 sibirocobombus kernel: [15296512.538221] [<ffffffffb9d80911>] ? out_of_memory+0x111/0x470 Oct 1 16:51:56 sibirocobombus kernel: [15296512.538223] [<ffffffffb9d85b4f>] ? __alloc_pages_slowpath+0xb7f/0xbc0 Oct 1 16:51:56 sibirocobombus kernel: [15296512.538225] [<ffffffffb9d85d8e>] ? __alloc_pages_nodemask+0x1fe/0x260 Oct 1 16:51:56 sibirocobombus kernel: [15296512.538227] [<ffffffffb9dd7c3e>] ? alloc_pages_vma+0xae/0x260 Oct 1 16:51:56 sibirocobombus kernel: [15296512.538230] [<ffffffffb9daf069>] ? wp_page_copy+0x89/0x700 Oct 1 16:51:56 sibirocobombus kernel: [15296512.538233] [<ffffffffb9db0361>] ? do_wp_page+0x161/0x7d0 Oct 1 16:51:56 sibirocobombus kernel: [15296512.538235] [<ffffffffb9db3170>] ? handle_mm_fault+0x8d0/0x1350 Oct 1 16:51:56 sibirocobombus kernel: [15296512.538238] [<ffffffffb9c24701>] ? __switch_to+0x2c1/0x6c0 Oct 1 16:51:56 sibirocobombus kernel: [15296512.538240] [<ffffffffb9c5fd84>] ? __do_page_fault+0x2a4/0x510 Oct 1 16:51:56 sibirocobombus kernel: [15296512.538243] [<ffffffffba207788>] ? 
async_page_fault+0x28/0x30 Oct 1 16:51:56 sibirocobombus kernel: [15296512.538244] Mem-Info: Oct 1 16:51:56 sibirocobombus kernel: [15296512.538248] active_anon:571476 inactive_anon:143154 isolated_anon:532 Oct 1 16:51:56 sibirocobombus kernel: [15296512.538248] active_file:204 inactive_file:222 isolated_file:0 Oct 1 16:51:56 sibirocobombus kernel: [15296512.538248] unevictable:0 dirty:0 writeback:3178 unstable:0 Oct 1 16:51:56 sibirocobombus kernel: [15296512.538248] slab_reclaimable:3157 slab_unreclaimable:13740 Oct 1 16:51:56 sibirocobombus kernel: [15296512.538248] mapped:1215 shmem:7913 pagetables:15023 bounce:0 Oct 1 16:51:56 sibirocobombus kernel: [15296512.538248] free:14207 free_pcp:37 free_cma:0 Oct 1 16:51:56 sibirocobombus kernel: [15296512.538252] Node 0 active_anon:2285904kB inactive_anon:572616kB active_file:816kB inactive_file:888kB unevictable:0kB isolated(anon):2128kB isolated(file):0kB mapped:4860kB dirty:0kB writeback:12712kB shmem:31652kB shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 0kB writeback_tmp:0kB unstable:0kB pages_scanned:5391 all_unreclaimable? 
yes Oct 1 16:51:56 sibirocobombus kernel: [15296512.538252] Node 0 DMA free:12124kB min:232kB low:288kB high:344kB active_anon:3484kB inactive_anon:4kB active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB present:15992kB managed:15908kB mlocked:0kB slab_reclaimable:12kB slab_unreclaimable:88kB kernel_stack:0kB pagetables:188kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB Oct 1 16:51:56 sibirocobombus kernel: [15296512.538257] lowmem_reserve[]: 0 2975 2975 2975 2975 Oct 1 16:51:56 sibirocobombus kernel: [15296512.538260] Node 0 DMA32 free:44704kB min:44820kB low:56024kB high:67228kB active_anon:2282420kB inactive_anon:572612kB active_file:816kB inactive_file:888kB unevictable:0kB writepending:12712kB present:3129216kB managed:3068860kB mlocked:0kB slab_reclaimable:12616kB slab_unreclaimable:54872kB kernel_stack:5376kB pagetables:59904kB bounce:0kB free_pcp:148kB local_pcp:148kB free_cma:0kB Oct 1 16:51:56 sibirocobombus kernel: [15296512.538264] lowmem_reserve[]: 0 0 0 0 0 Oct 1 16:51:56 sibirocobombus kernel: [15296512.538266] Node 0 DMA: 4*4kB (UEH) 2*8kB (E) 2*16kB (E) 5*32kB (UMEH) 8*64kB (UMEH) 5*128kB (UEH) 4*256kB (UEH) 1*512kB (M) 3*1024kB (UEH) 3*2048kB (UME) 0*4096kB = 12128kB Oct 1 16:51:56 sibirocobombus kernel: [15296512.538278] Node 0 DMA32: 2108*4kB (UEH) 1402*8kB (UMEH) 746*16kB (UMEH) 286*32kB (UMEH) 40*64kB (MEH) 1*128kB (H) 1*256kB (H) 0*512kB 1*1024kB (H) 0*2048kB 0*4096kB = 44704kB Oct 1 16:51:56 sibirocobombus kernel: [15296512.538289] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB Oct 1 16:51:56 sibirocobombus kernel: [15296512.538291] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB Oct 1 16:51:56 sibirocobombus kernel: [15296512.538291] 18160 total pagecache pages Oct 1 16:51:56 sibirocobombus kernel: [15296512.538293] 9813 pages in swap cache Oct 1 16:51:56 sibirocobombus kernel: [15296512.538294] Swap cache stats: add 3100981, delete 3091168, find 
484392412/484925620 Oct 1 16:51:56 sibirocobombus kernel: [15296512.538294] Free swap = 0kB Oct 1 16:51:56 sibirocobombus kernel: [15296512.538295] Total swap = 1022972kB Oct 1 16:51:56 sibirocobombus kernel: [15296512.538296] 786302 pages RAM Oct 1 16:51:56 sibirocobombus kernel: [15296512.538296] 0 pages HighMem/MovableOnly Oct 1 16:51:56 sibirocobombus kernel: [15296512.538297] 15110 pages reserved Oct 1 16:51:56 sibirocobombus kernel: [15296512.538298] 0 pages hwpoisoned Oct 1 16:51:56 sibirocobombus kernel: [15296512.538298] [ pid ] uid tgid total_vm rss nr_ptes nr_pmds swapents oom_score_adj name Oct 1 16:51:56 sibirocobombus kernel: [15296512.538302] [ 188] 0 188 15502 833 27 3 80 0 systemd-journal Oct 1 16:51:56 sibirocobombus kernel: [15296512.538304] [ 376] 0 376 11625 55 26 3 104 0 systemd-logind Oct 1 16:51:56 sibirocobombus kernel: [15296512.538306] [ 381] 0 381 7416 35 20 3 50 0 cron Oct 1 16:51:56 sibirocobombus kernel: [15296512.538308] [ 387] 0 387 62560 163 30 3 351 0 rsyslogd Oct 1 16:51:56 sibirocobombus kernel: [15296512.538309] [ 388] 0 388 3152 9 12 3 37 0 rsync Oct 1 16:51:56 sibirocobombus kernel: [15296512.538311] [ 393] 0 393 1054 0 8 3 35 0 acpid Oct 1 16:51:56 sibirocobombus kernel: [15296512.538312] [ 394] 105 394 11283 60 27 3 78 -900 dbus-daemon Oct 1 16:51:56 sibirocobombus kernel: [15296512.538314] [ 450] 0 450 3634 0 12 3 38 0 agetty Oct 1 16:51:56 sibirocobombus kernel: [15296512.538316] [ 458] 0 458 28698 198 26 3 93 0 monit Oct 1 16:51:56 sibirocobombus kernel: [15296512.538317] [ 464] 0 464 17486 41 37 3 164 -1000 sshd Oct 1 16:51:56 sibirocobombus kernel: [15296512.538319] [ 614] 0 614 12949 767 30 3 1563 0 munin-node Oct 1 16:51:56 sibirocobombus kernel: [15296512.538329] [ 823] 107 823 27941 23 50 3 365 0 exim4 Oct 1 16:51:56 sibirocobombus kernel: [15296512.538331] [10368] 100 10368 31821 19 32 3 119 0 systemd-timesyn Oct 1 16:51:56 sibirocobombus kernel: [15296512.538333] [10648] 0 10648 11308 12 24 3 91 -1000 
systemd-udevd Oct 1 16:51:56 sibirocobombus kernel: [15296512.538335] [25971] 1000 25971 27465 378 54 3 6922 0 /home/alex/camp Oct 1 16:51:56 sibirocobombus kernel: [15296512.538337] [22812] 0 22812 26416 842 58 3 46 0 apache2 Oct 1 16:51:56 sibirocobombus kernel: [15296512.538338] [ 2794] 0 2794 169401 3422 61 4 455 0 fail2ban-server Oct 1 16:51:56 sibirocobombus kernel: [15296512.538340] [29327] 33 29327 216079 2688 100 4 530 0 apache2 Oct 1 16:51:56 sibirocobombus kernel: [15296512.538342] [23538] 1000 23538 81856 167 143 3 29692 0 /home/alex/farm Oct 1 16:51:56 sibirocobombus kernel: [15296512.538343] [14140] 1000 14140 81929 249 144 3 29760 0 /home/alex/farm Oct 1 16:51:56 sibirocobombus kernel: [15296512.538345] [15321] 1000 15321 85002 249 149 3 31614 0 /home/alex/farm Oct 1 16:51:56 sibirocobombus kernel: [15296512.538346] [15742] 1000 15742 82686 247 144 3 29465 0 /home/alex/farm Oct 1 16:51:56 sibirocobombus kernel: [15296512.538348] [15822] 1000 15822 82672 253 144 3 29467 0 /home/alex/farm Oct 1 16:51:56 sibirocobombus kernel: [15296512.538349] [18583] 1000 18583 85088 250 149 3 31258 0 /home/alex/farm Oct 1 16:51:56 sibirocobombus kernel: [15296512.538351] [16180] 1000 16180 83100 164 148 3 30518 0 /home/alex/farm Oct 1 16:51:56 sibirocobombus kernel: [15296512.538353] [16240] 1000 16240 84600 170 151 3 31838 0 /home/alex/farm Oct 1 16:51:56 sibirocobombus kernel: [15296512.538354] [17135] 1000 17135 84455 169 150 3 31723 0 /home/alex/farm Oct 1 16:51:56 sibirocobombus kernel: [15296512.538356] [18081] 1000 18081 83276 169 147 3 29866 0 /home/alex/farm Oct 1 16:51:56 sibirocobombus kernel: [15296512.538357] [18089] 1000 18089 82757 167 146 3 29665 0 /home/alex/farm Oct 1 16:51:56 sibirocobombus kernel: [15296512.538359] [18097] 1000 18097 83000 167 146 3 29722 0 /home/alex/farm Oct 1 16:51:56 sibirocobombus kernel: [15296512.538360] [28873] 1000 28873 83946 167 148 3 30644 0 /home/alex/farm Oct 1 16:51:56 sibirocobombus kernel: [15296512.538362] 
[29477] 1000 29477 84020 168 148 3 30825 0 /home/alex/farm Oct 1 16:51:56 sibirocobombus kernel: [15296512.538364] [17703] 1000 17703 85534 10436 152 3 22151 0 /home/alex/farm Oct 1 16:51:56 sibirocobombus kernel: [15296512.538366] [25630] 1000 25630 84044 17581 147 3 13484 0 /home/alex/farm Oct 1 16:51:56 sibirocobombus kernel: [15296512.538367] [23799] 1000 23799 83909 16024 149 4 15467 0 /home/alex/farm Oct 1 16:51:56 sibirocobombus kernel: [15296512.538369] [23831] 1000 23831 88924 19813 159 4 16552 0 /home/alex/farm Oct 1 16:51:56 sibirocobombus kernel: [15296512.538370] [17148] 1000 17148 27465 3262 52 3 4112 0 /home/alex/camp Oct 1 16:51:56 sibirocobombus kernel: [15296512.538372] [24680] 65534 24680 12608 2506 27 3 0 0 python Oct 1 16:51:56 sibirocobombus kernel: [15296512.538374] [ 6121] 1000 6121 84608 32664 151 3 0 0 /home/alex/farm Oct 1 16:51:56 sibirocobombus kernel: [15296512.538376] [ 9051] 33 9051 26290 742 54 3 46 0 apache2 Oct 1 16:51:56 sibirocobombus kernel: [15296512.538377] [ 9372] 1000 9372 79677 26765 141 3 0 0 /home/alex/farm Oct 1 16:51:56 sibirocobombus kernel: [15296512.538379] [ 9393] 1000 9393 88087 36094 158 3 0 0 /home/alex/farm Oct 1 16:51:56 sibirocobombus kernel: [15296512.538380] [ 9395] 1000 9395 104793 53630 189 3 0 0 /home/alex/farm Oct 1 16:51:56 sibirocobombus kernel: [15296512.538382] [ 9396] 1000 9396 94890 43125 172 3 0 0 /home/alex/farm Oct 1 16:51:56 sibirocobombus kernel: [15296512.538392] [ 9418] 1000 9418 21312 4988 45 3 0 0 perl Oct 1 16:51:56 sibirocobombus kernel: [15296512.538394] [ 9989] 1000 9989 17541 3956 37 3 0 0 perl Oct 1 16:51:56 sibirocobombus kernel: [15296512.538395] [11782] 33 11782 215922 2843 99 4 45 0 apache2 Oct 1 16:51:56 sibirocobombus kernel: [15296512.538397] [11833] 33 11833 214475 2459 97 4 45 0 apache2 Oct 1 16:51:56 sibirocobombus kernel: [15296512.538398] [26741] 1000 26741 92116 38897 165 3 0 0 /home/alex/farm Oct 1 16:51:56 sibirocobombus kernel: [15296512.538400] [17934] 1000 17934 
87620 35407 157 3 0 0 /home/alex/farm Oct 1 16:51:56 sibirocobombus kernel: [15296512.538402] [18489] 1000 18489 85855 32863 152 3 0 0 /home/alex/farm Oct 1 16:51:56 sibirocobombus kernel: [15296512.538403] [19093] 33 19093 215694 2608 99 4 45 0 apache2 Oct 1 16:51:56 sibirocobombus kernel: [15296512.538405] [13755] 1000 13755 94098 41334 172 3 0 0 /home/alex/farm Oct 1 16:51:56 sibirocobombus kernel: [15296512.538407] [13758] 1000 13758 106413 54375 195 3 0 0 /home/alex/farm Oct 1 16:51:56 sibirocobombus kernel: [15296512.538408] [13759] 1000 13759 94098 41361 172 3 0 0 /home/alex/farm Oct 1 16:51:56 sibirocobombus kernel: [15296512.538410] [13763] 1000 13763 94098 41501 170 3 0 0 /home/alex/farm Oct 1 16:51:56 sibirocobombus kernel: [15296512.538411] [13764] 1000 13764 89787 36814 160 3 0 0 /home/alex/farm Oct 1 16:51:56 sibirocobombus kernel: [15296512.538413] [13765] 1000 13765 93158 38962 166 3 0 0 /home/alex/farm Oct 1 16:51:56 sibirocobombus kernel: [15296512.538414] [13766] 1000 13766 89786 36814 160 3 0 0 /home/alex/farm Oct 1 16:51:56 sibirocobombus kernel: [15296512.538415] [13767] 1000 13767 94098 41334 171 3 0 0 /home/alex/farm Oct 1 16:51:56 sibirocobombus kernel: [15296512.538417] [13768] 1000 13768 106413 54423 193 3 0 0 /home/alex/farm Oct 1 16:51:56 sibirocobombus kernel: [15296512.538419] [13769] 1000 13769 93158 38962 166 3 0 0 /home/alex/farm Oct 1 16:51:56 sibirocobombus kernel: [15296512.538420] [13771] 1000 13771 94098 41501 170 3 0 0 /home/alex/farm Oct 1 16:51:56 sibirocobombus kernel: [15296512.538422] [13772] 1000 13772 106413 54423 193 3 0 0 /home/alex/farm Oct 1 16:51:56 sibirocobombus kernel: [15296512.538423] [13773] 1000 13773 93158 38961 166 3 0 0 /home/alex/farm Oct 1 16:51:56 sibirocobombus kernel: [15296512.538425] [13774] 1000 13774 89786 36814 160 3 0 0 /home/alex/farm Oct 1 16:51:56 sibirocobombus kernel: [15296512.538427] [13775] 33 13775 214080 930 95 4 46 0 apache2 Oct 1 16:51:56 sibirocobombus kernel: [15296512.538428] 
[13804] 1000 13804 94098 41501 170 3 0 0 /home/alex/farm Oct 1 16:51:56 sibirocobombus kernel: [15296512.538430] [13805] 1000 13805 89786 36814 160 3 0 0 /home/alex/farm Oct 1 16:51:56 sibirocobombus kernel: [15296512.538431] [13806] 1000 13806 95932 43127 174 3 0 0 /home/alex/farm Oct 1 16:51:56 sibirocobombus kernel: [15296512.538433] [13807] 1000 13807 106413 54423 193 3 0 0 /home/alex/farm Oct 1 16:51:56 sibirocobombus kernel: [15296512.538435] [13808] 1000 13808 92634 38961 164 3 0 0 /home/alex/farm Oct 1 16:51:56 sibirocobombus kernel: [15296512.538436] [13809] 1000 13809 89786 36815 160 3 0 0 /home/alex/farm Oct 1 16:51:56 sibirocobombus kernel: [15296512.538438] [13810] 1000 13810 95932 43127 174 3 0 0 /home/alex/farm Oct 1 16:51:56 sibirocobombus kernel: [15296512.538439] [13812] 1000 13812 92634 38961 164 3 0 0 /home/alex/farm Oct 1 16:51:56 sibirocobombus kernel: [15296512.538441] [13813] 1000 13813 105709 54206 190 3 0 0 /home/alex/farm Oct 1 16:51:56 sibirocobombus kernel: [15296512.538442] [13814] 1000 13814 95408 43126 172 3 0 0 /home/alex/farm Oct 1 16:51:56 sibirocobombus kernel: [15296512.538444] [13815] 33 13815 213919 868 95 4 46 0 apache2 Oct 1 16:51:56 sibirocobombus kernel: [15296512.538446] [13817] 1000 13817 89164 36681 158 3 0 0 /home/alex/farm Oct 1 16:51:56 sibirocobombus kernel: [15296512.538447] [13844] 1000 13844 92634 38898 164 3 0 0 /home/alex/farm Oct 1 16:51:56 sibirocobombus kernel: [15296512.538449] [13845] 1000 13845 105609 54032 189 3 0 0 /home/alex/farm Oct 1 16:51:56 sibirocobombus kernel: [15296512.538450] [13846] 1000 13846 95408 43126 172 3 0 0 /home/alex/farm Oct 1 16:51:56 sibirocobombus kernel: [15296512.538452] [13847] 1000 13847 105789 54335 190 3 0 0 /home/alex/farm Oct 1 16:51:56 sibirocobombus kernel: [15296512.538453] [13848] 1000 13848 89263 36789 158 3 0 0 /home/alex/farm Oct 1 16:51:56 sibirocobombus kernel: [15296512.538455] [13849] 1000 13849 95408 43126 172 3 0 0 /home/alex/farm Oct 1 16:51:56 
sibirocobombus kernel: [15296512.538456] [13850] 1000 13850 92634 38898 164 3 0 0 /home/alex/farm Oct 1 16:51:56 sibirocobombus kernel: [15296512.538458] [13851] 1000 13851 95408 43126 172 3 0 0 /home/alex/farm Oct 1 16:51:56 sibirocobombus kernel: [15296512.538459] [13852] 1000 13852 105311 53674 189 3 0 0 /home/alex/farm Oct 1 16:51:56 sibirocobombus kernel: [15296512.538461] [13853] 1000 13853 88605 36095 157 3 0 0 /home/alex/farm Oct 1 16:51:56 sibirocobombus kernel: [15296512.538462] [13854] 1000 13854 92634 38940 164 3 0 0 /home/alex/farm Oct 1 16:51:56 sibirocobombus kernel: [15296512.538463] [13855] 1000 13855 88605 36095 157 3 0 0 /home/alex/farm Oct 1 16:51:56 sibirocobombus kernel: [15296512.538465] [13856] 1000 13856 105311 53631 189 3 0 0 /home/alex/farm Oct 1 16:51:56 sibirocobombus kernel: [15296512.538466] [13857] 1000 13857 92634 38898 164 3 0 0 /home/alex/farm Oct 1 16:51:56 sibirocobombus kernel: [15296512.538468] [13858] 1000 13858 95408 43126 172 3 0 0 /home/alex/farm Oct 1 16:51:56 sibirocobombus kernel: [15296512.538469] [13859] 1000 13859 88605 36095 157 3 0 0 /home/alex/farm Oct 1 16:51:56 sibirocobombus kernel: [15296512.538471] [13860] 1000 13860 105311 53631 189 3 0 0 /home/alex/farm Oct 1 16:51:56 sibirocobombus kernel: [15296512.538472] [13861] 1000 13861 92634 38898 164 3 0 0 /home/alex/farm Oct 1 16:51:56 sibirocobombus kernel: [15296512.538474] [13862] 1000 13862 95408 43126 172 3 0 0 /home/alex/farm Oct 1 16:51:56 sibirocobombus kernel: [15296512.538475] [13863] 1000 13863 88605 36095 157 3 0 0 /home/alex/farm Oct 1 16:51:56 sibirocobombus kernel: [15296512.538477] [13864] 1000 13864 105311 53737 189 3 0 0 /home/alex/farm Oct 1 16:51:56 sibirocobombus kernel: [15296512.538478] [13865] 1000 13865 92116 38883 162 3 0 0 /home/alex/farm Oct 1 16:51:56 sibirocobombus kernel: [15296512.538480] [13866] 1000 13866 95408 43126 172 3 0 0 /home/alex/farm Oct 1 16:51:56 sibirocobombus kernel: [15296512.538481] [13867] 1000 13867 88087 36080 
155 3 0 0 /home/alex/farm Oct 1 16:51:56 sibirocobombus kernel: [15296512.538483] [13868] 1000 13868 105311 53631 189 3 0 0 /home/alex/farm Oct 1 16:51:56 sibirocobombus kernel: [15296512.538484] [13869] 1000 13869 94890 43125 171 3 0 0 /home/alex/farm Oct 1 16:51:56 sibirocobombus kernel: [15296512.538486] [13870] 1000 13870 92116 38883 162 3 0 0 /home/alex/farm Oct 1 16:51:56 sibirocobombus kernel: [15296512.538488] [13871] 1000 13871 88087 36080 155 3 0 0 /home/alex/farm Oct 1 16:51:56 sibirocobombus kernel: [15296512.538489] [13872] 1000 13872 104793 53616 188 3 0 0 /home/alex/farm Oct 1 16:51:56 sibirocobombus kernel: [15296512.538491] [13873] 1000 13873 94890 43111 171 3 0 0 /home/alex/farm Oct 1 16:51:56 sibirocobombus kernel: [15296512.538492] [13875] 1000 13875 88087 36070 155 3 0 0 /home/alex/farm Oct 1 16:51:56 sibirocobombus kernel: [15296512.538494] Out of memory: Kill process 13768 (/home/alex/farm) score 53 or sacrifice child Oct 1 16:51:56 sibirocobombus kernel: [15296512.867096] Killed process 13768 (/home/alex/farm) total-vm:425652kB, anon-rss:217024kB, file-rss:668kB, shmem-rss:0kB
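One way to get an overview of a dump like this is to add up the per-process table the kernel prints when the OOM killer is invoked. The following is a minimal sketch, not part of the original setup: the function name `farm_memory_summary` is made up, it assumes the log has one entry per line as in `/var/log/syslog`, and it assumes 4 kB pages (which matches the `4kB` order-0 blocks in the dump).

```shell
#!/bin/bash
# Hypothetical helper: summarize the OOM killer's process table for one command.
# Reads syslog lines on stdin and prints: <process count> <rss in kB> <swap in kB>.
# Assumes 4 kB pages and the column layout shown in the log:
#   [ pid ] uid tgid total_vm rss nr_ptes nr_pmds swapents oom_score_adj name
farm_memory_summary() {
    local name="${1:-/home/alex/farm}"
    awk -v name="$name" '
        # Count fields from the right, because "[  188]" and "[23538]"
        # split into a different number of fields on the left.
        NF >= 6 && $NF == name && $(NF-5) ~ /^[0-9]+$/ && $(NF-2) ~ /^[0-9]+$/ {
            n++
            rss  += $(NF-5)   # resident pages
            swap += $(NF-2)   # swapped-out pages (swapents)
        }
        END { printf "%d %d %d\n", n, rss * 4, swap * 4 }
    '
}
```

Feeding it a whole syslog file would sum over every dump in the file, so it is probably best to pass in the lines of a single OOM report.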
– Alex Schroeder 2018-10-01 17:02 UTC
---
OK, I wrote myself a `cull-farm` script:
    #! /home/alex/perl5/perlbrew/perls/perl-5.26.1/bin/perl
    use Modern::Perl;
    my %data = map { $_->[0] => $_->[1] }
      grep { $_->[2] eq "/home/alex/farm/farm" and $_->[3] ne "1" }
      map { [ split ] } (qx(ps -xo pid,etimes,command,ppid));
    my @ids = sort { $data{$a} <=> $data{$b} } keys %data;
    # say "pids: @ids";
    my @old_ids = @ids[4 .. $#ids];
    if (@old_ids > 0) {
      say "SIGTERM for @old_ids";
      kill 'TERM', @old_ids;
    } else {
      say "The only child processes we have are @ids";
    }
It runs a `ps` which prints the process id, the elapsed time in seconds, the command name (so that I can filter for my processes), and the parent process id. I don’t want to kill the top level process that keeps spawning the children!
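The same selection can be sketched as a small shell filter, which makes it easy to try on saved `ps` output before sending any signals. This is only an illustration under assumptions: the function name `cull_candidates` is hypothetical, and the default path and the number of processes to keep (four) are taken from the script above.

```shell
#!/bin/bash
# Hypothetical helper: print the PIDs of farm children that would get SIGTERM.
# Reads lines of "pid etimes command ppid" on stdin, as printed by
#   ps -xo pid,etimes,command,ppid
# Processes whose parent is PID 1 are skipped, and the youngest $keep are kept.
cull_candidates() {
    local cmd="${1:-/home/alex/farm/farm}"
    local keep="${2:-4}"
    awk -v cmd="$cmd" '$3 == cmd && $4 != 1 { print $2, $1 }' |
        sort -n |                    # youngest (smallest etimes) first
        tail -n +"$((keep + 1))" |   # drop the $keep youngest
        awk '{ print $2 }'           # print only the PID
}
```

Something like `ps -xo pid,etimes,command,ppid | cull_candidates` would then list the PIDs without touching them; piping that into `xargs -r kill -TERM` would send the signal (`-r` is a GNU xargs option).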
– Alex Schroeder 2018-10-07 17:17 UTC
---
I fiddled with that script again because I noticed something. Here’s an example output when sorted by etimes, the elapsed time in seconds:
    alex@sibirocobombus:~$ ps -x -k etimes -o pid,etimes,command,ppid | grep /home/alex/farm/farm
    22634       0 grep /home/alex/farm/farm   16815
    22541      75 /home/alex/farm/farm        32495
    22542      75 /home/alex/farm/farm        32495
    22539      76 /home/alex/farm/farm        32495
    22540      76 /home/alex/farm/farm        32495
    32495   27081 /home/alex/farm/farm            1
     4376  217327 /home/alex/farm/farm            1
     4352  217356 /home/alex/farm/farm            1
     4285  217497 /home/alex/farm/farm            1
     4238  217538 /home/alex/farm/farm            1
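In this situation we want to start killing from the end. Look at all those old processes that are still hanging around with parent process 1. The previous version of the script skipped every process whose parent is 1, so it never touched them. These need to go!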
    #! /home/alex/perl5/perlbrew/perls/perl-5.26.1/bin/perl
    use Modern::Perl;
    if (grep /^(-h|--help)$/, @ARGV) {
      say "Use -d for debugging";
      exit;
    }
    my $debug = 0;
    $debug = 1 if (grep /^(-d|--debug)$/, @ARGV);
    my %data = map { $_->[0] => { etimes => $_->[1], ppid => $_->[3] }}
      grep { $_->[2] eq "/home/alex/farm/farm" }
      map { [ split ] }
      # -o is output; 0: pid; 1: etimes; 2: command; 3: ppid
      # -k is sorting; etimes is elapsed time
      (qx(ps -x -k etimes -o pid,etimes,command,ppid));
    # Example result:
    # 22634       0 grep /home/alex/farm/farm   16815
    # 22541      75 /home/alex/farm/farm        32495
    # 22542      75 /home/alex/farm/farm        32495
    # 22539      76 /home/alex/farm/farm        32495
    # 22540      76 /home/alex/farm/farm        32495
    # 32495   27081 /home/alex/farm/farm            1
    #  4376  217327 /home/alex/farm/farm            1
    #  4352  217356 /home/alex/farm/farm            1
    #  4285  217497 /home/alex/farm/farm            1
    #  4238  217538 /home/alex/farm/farm            1
    # In this situation, the process that we do not want to kill is 32495 and we
    # want to start killing from the back: 4376, 4352, 4285, 4238.
    for my $pid (keys %data) {
      if ($data{$pid} and $data{$data{$pid}->{ppid}}) {
        say "protected pid: $data{$pid}->{ppid}" if $debug;
        delete $data{$data{$pid}->{ppid}};
      }
    }
    # The values of %data are hash references, so compare their etimes fields.
    # Youngest first: the slice below then keeps the four youngest processes.
    my @ids = sort { $data{$a}->{etimes} <=> $data{$b}->{etimes} } keys %data;
    say "pids: @ids" if $debug;
    my @old_ids = @ids[4 .. $#ids];
    if (@old_ids > 0) {
      say "SIGTERM for @old_ids";
      kill 'TERM', @old_ids unless $debug;
    } else {
      say "The only child processes we have are @ids";
    }
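The protection step on its own can be tried out as a shell filter on saved `ps` output. This is a sketch under assumptions, not a replacement for the script: the function name `protected_farm_pids` is hypothetical, and it only reports which farm processes are the parent of another farm process.

```shell
#!/bin/bash
# Hypothetical helper: print the PIDs of farm processes that are the parent of
# at least one other farm process, i.e. the ones that should not be killed.
# Reads lines of "pid etimes command ppid" on stdin.
protected_farm_pids() {
    local cmd="${1:-/home/alex/farm/farm}"
    awk -v cmd="$cmd" '
        $3 == cmd { seen[$1] = 1; parent[$1] = $4 }
        END {
            # A PID is protected if some farm process names it as its parent.
            for (pid in parent)
                if (parent[pid] in seen) protected[parent[pid]] = 1
            for (pid in protected) print pid
        }
    ' | sort -n
}
```

For the example listing, this prints only 32495: the old processes have parent 1, which is not a farm process, so nothing protects them.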
– Alex Schroeder 2018-10-26 12:04 UTC