Not really! It’s Phoebe… 😭
I just noticed that three of my server processes seem to consume a lot of memory. The following three wikis were reporting more than 2G used, each:
Strangely, this site and Campaign Wiki seemed not to be affected. I don’t see any performance problems.
Memory issues started on week 51, 2020
I’m suspecting that something else is going on. I restarted the three processes using Hypnotoad, a hot deployment. But htop is still telling me that we’re using a lot of memory. When I sort by memory, however, I see it’s Phoebe at the very top.
Phoebe is consuming 40% of all memory
So now I’m wondering:
Something to investigate in the upcoming days. For now, I’m going to restart Phoebe.
I fixed the regular expressions for my Munin setup so that all the command line parameters I’m using to start Phoebe don’t trigger matches for other processes such as Community Wiki, Emacs Wiki, or Oddmuse Wiki.
That still leaves the question: What did I install, what changes did I make around December 14? I have no idea...
#Phoebe
(Please contact me if you want to remove your comment.)
⁂
Yep, Phoebe is at fault.
Memory use increasing for Phoebe only
So now I have to find a way to reproduce this result locally. Ideally we’ll find that there’s a single config file at fault... If I’m unlucky it’s because Phoebe 2 switched from Net::Server to Mojo::IOLoop and I’m somehow not closing the connections correctly, or some reference remains that prevents them from closing 100%. Yikes.
– Alex 2021-01-03 09:57 UTC
---
OK, I think I have a test setup that works. Let me know if you see an error in my thinking.
The test starts a server in the background, listening to a random port ($port) like all the other tests. I need its process id so that I can determine the memory it uses:
lsof -i:$port -F p
Given the process id ($pid), I can now ask for the memory size of the process:
ps -q $pid -o size
So what I do is get this size after starting the server, run 100 requests, and get the size again. Running just Phoebe, there’s no problem. If I install the speed bump extension (see 2020-12-25 Defending against crawlers), the size goes up.
2020-12-25 Defending against crawlers
Well, I guess I’m fine with it going up a bit since we need to keep some numbers in memory, right? It’s only truly a problem if the memory keeps going up. Let’s try.
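Roughly, the test does something like the following. This is only a sketch: query_gemini stands in for the request helper of the existing test suite, and $port comes from the test setup; the real t/leak.t differs in the details.

use Test::More;

# Find the pid of the server listening on $port and ask ps for its size.
sub process_size {
  my $port = shift;
  my ($pid) = qx(lsof -i:$port -F p) =~ /^p(\d+)/m;  # lsof prefixes the pid with "p"
  my ($size) = qx(ps -q $pid -o size) =~ /(\d+)/;    # data + stack size in KiB
  return $size;
}

my $before = process_size($port);
query_gemini("gemini://localhost:$port/") for 1 .. 100;
my $after = process_size($port);
is($after, $before, "No memory lost ($after = $before)");
done_testing();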
100 requests:
$ make test TEST_FILES=t/leak.t
PERL_DL_NONLAZY=1 "/home/alex/perl5/perlbrew/perls/perl-5.30.0/bin/perl" "-MExtUtils::Command::MM" "-MTest::Harness" "-e" "undef *Test::Harness::Switches; test_harness(0, 'blib/lib', 'blib/arch')" t/leak.t
t/leak.t .. 1/?
#   Failed test 'No memory lost (42524 = 43256)'
#   at t/leak.t line 33.
#          got: '42524'
#     expected: '43256'
# Looks like you failed 1 test of 1.
t/leak.t .. Dubious, test returned 1 (wstat 256, 0x100)
Failed 1/1 subtests

Test Summary Report
-------------------
t/leak.t (Wstat: 256 Tests: 1 Failed: 1)
  Failed test:  1
  Non-zero exit status: 1
Files=1, Tests=1,  2 wallclock secs ( 0.02 usr  0.00 sys +  0.71 cusr  0.07 csys =  0.80 CPU)
Result: FAIL
Failed 1/1 test programs. 1/1 subtests failed.
make: *** [Makefile:940: test_dynamic] Error 1
1000:
# Failed test 'No memory lost (42528 = 43256)'
10000:
# Failed test 'No memory lost (42524 = 43256)'
OK, clearly it just stays the same. Hm...
Perhaps it’s time to take a look at Devel::MAT. Too bad Devel::MAT::Dumper doesn’t install on the server. Something about xlocale.h missing. Installing libnewlib-dev (which provides /usr/include/newlib/xlocale.h) appears to have no effect:
Building Devel-MAT-Dumper
cc -I/home/alex/perl5/perlbrew/perls/perl-5.26.1/lib/5.26.1/x86_64-linux/CORE -DVERSION="0.42" -DXS_VERSION="0.42" -fPIC -c -fwrapv -fno-strict-aliasing -pipe -fstack-protector-strong -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 -D_FORTIFY_SOURCE=2 -O2 -o lib/Devel/MAT/Dumper.o lib/Devel/MAT/Dumper.c
In file included from lib/Devel/MAT/Dumper.xs:8:
/home/alex/perl5/perlbrew/perls/perl-5.26.1/lib/5.26.1/x86_64-linux/CORE/perl.h:738:13: fatal error: xlocale.h: No such file or directory
 #   include <xlocale.h>
             ^~~~~~~~~~~
compilation terminated.
OK. Before doing anything else, I’m going to use Perlbrew to upgrade Perl 5.26 to 5.32. Then we’ll see about installing Devel::MAT.
– Alex 2021-01-03 17:38 UTC
---
OK, that worked! I wrote a little extension that allows me to write a heap dump to disk on the server.
Install it in your conf.d directory, set the known fingerprints to the fingerprint of your favourite client certificate in your config file (or edit the heap-dump.pl file):
our @known_fingerprints = qw(
  sha256$54c0b95dd56aebac1432a3665107d3aec0d4e28fef905020ed6762db49e84ee1);
Install the Devel::MAT::Dumper package on your server, and Devel::MAT on your development machine. Restart Phoebe.
Finally, visit the path /do/heap-dump on your Phoebe instance. You’ll be prompted for your client certificate. Point your Gemini client to the one whose fingerprint you added to your config above, and now it should say: “Heap Dump Saved” – copy the file it created to your local machine and examine it, using the user guide linked above as your guide.
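I’m not reproducing heap-dump.pl in full here, but the essence is a handler that checks the client certificate fingerprint and calls Devel::MAT::Dumper::dump. Something like this sketch – the way the handler gets wired into Phoebe is simplified and the names are illustrative:

use Devel::MAT::Dumper;
use POSIX qw(strftime);

our @known_fingerprints;  # set to the sha256 fingerprint(s) of your client certificate

# pretend handler for gemini://host/do/heap-dump
sub serve_heap_dump {
  my ($stream, $fingerprint) = @_;
  if (not $fingerprint) {
    $stream->write("60 You need a client certificate to do this\r\n");
  } elsif (not grep { $_ eq $fingerprint } @known_fingerprints) {
    $stream->write("61 Your client certificate is not authorized\r\n");
  } else {
    my $file = strftime("phoebe-%Y-%m-%dT%H:%M.pmat", localtime);
    Devel::MAT::Dumper::dump($file);  # write the heap dump to disk
    $stream->write("20 text/plain\r\n");
    $stream->write("Heap Dump Saved\n");
  }
}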
Here’s the output right after I restarted it:
$ pmat phoebe.pmat
Perl memory dumpfile from perl 5.32.0 non-threaded
Heap contains 217470 objects
pmat> largest
HASH(24948) at 0x558c89875548=strtab: 1.7 MiB
HASH(5672) at 0x558c89b3c450: 261.0 KiB
SCALAR(PV) at 0x558c89beaa18: 215.7 KiB
ARRAY(1,!REAL) at 0x558c89899098: 92.5 KiB
HASH(2231) at 0x558c8aac5240: 84.3 KiB
others: 15.2 MiB
I’m going to let it run over night and we’ll see how it does tomorrow.
– Alex 2021-01-03 21:34 UTC
---
OK, a few hours later...
$ pmat phoebe.pmat
Perl memory dumpfile from perl 5.32.0 non-threaded
Heap contains 313361 objects
pmat> largest
HASH(39010) at 0x558c89875548=strtab: 2.4 MiB
HASH(8940) at 0x558c89b8f2a8: 337.6 KiB
HASH(5672) at 0x558c89b3c450: 261.0 KiB
SCALAR(PV) at 0x558c98eaf6e8: 215.7 KiB
SCALAR(PV) at 0x558c89beaa18: 215.7 KiB
others: 21.0 MiB
That is not very encouraging. There apparently is no single thing that grew in limitless ways. 😢
pmat> count
  Kind       Count (blessed)       Bytes  (blessed)
  ARRAY      11108         7     1.2 MiB  520 bytes
  CODE       10387               1.3 MiB
  GLOB       14796        16     2.1 MiB    2.4 KiB
  HASH       23305      3182     7.4 MiB  849.6 KiB
  INVLIST      655             674.4 KiB
  IO            33        33     5.2 KiB    5.2 KiB
  PAD         7870               1.5 MiB
  REF        41199         1   965.6 KiB   24 bytes
  REGEXP      2503       253   528.0 KiB   53.4 KiB
  SCALAR    200878        12     8.0 MiB 1022 bytes
  STASH        627             801.6 KiB
  -------   ------ ---------   --------- ----------
  (total)   313361      3504    24.5 MiB  912.1 KiB
What I don’t really understand is that the numbers I’m seeing are all around 20 MiB, and yet Munin is telling me that the minimum memory is 97.31 MiB and the maximum is 266.54 MiB (also the current value). So where are the 170 MiB I’m missing?
Hm. Let’s take a look at the two scalar values we saw up in the list of “largest” SVs. Surprisingly, they are exactly the same size. Let’s examine them.
...
SCALAR(PV) at 0x558c98eaf6e8: 215.7 KiB
SCALAR(PV) at 0x558c89beaa18: 215.7 KiB
...
pmat> show 0x558c98eaf6e8
SCALAR(PV) at 0x558c98eaf6e8 with refcount 1
  size 215.7 KiB (220840 bytes)
  PV="// This Source Code Form is subject to the terms of the Mozilla "...
  PVLEN 220798
pmat> show 0x558c89beaa18
SCALAR(PV) at 0x558c89beaa18 with refcount 1
  size 215.7 KiB (220840 bytes)
  PV="// This Source Code Form is subject to the terms of the Mozilla "...
  PVLEN 220798
OK, that is surprising. I don’t recognize this text. Perhaps there are more like it? After all, the count shows 200878 scalars. What’s with the refcount? Who is keeping those references?
pmat> inrefs --all 0x558c98eaf6e8
pmat> inrefs --all 0x558c89beaa18
s  a constant  CODE(PP) at 0x558c89b8f080
OK, nobody is referring to the first one, which is even more surprising. Perhaps it’s about to be garbage collected? The second value gives us a hint:
pmat> show 0x558c89b8f080
CODE(PP) at 0x558c89b8f080 with refcount 1
  size 128 bytes
  named as &IO::Socket::SSL::PublicSuffix::_builtin_data
  no hekname
  stash=STASH(19) at 0x558c89bd0f98
  glob=GLOB(&*) at 0x558c89bea460
  location=/home/alex/perl5/perlbrew/perls/perl-5.32.0/lib/site_perl/5.32.0/IO/Socket/SSL/PublicSuffix.pm line 348
  scope=CODE() at 0x558c89c4f820
  pad[0]=PAD(1) at 0x558c89b8f188
The SSL stuff. 😭 This reminds me of the only test in gemini-diagnostics Phoebe is currently failing: “Server should send a close_notify alert before closing the connection.”
Perhaps that’s my problem. The SSL/TLS connections aren’t closed correctly, so they hang around.
I’ll have to think about this some more.
– Alex 2021-01-04 08:12 UTC
---
Ten hours later, Munin says 418.04 MiB, and the heap dump says:
pmat> largest
HASH(46066) at 0x558c89875548=strtab: 3.3 MiB
HASH(15996) at 0x558c89b8f2a8: 631.0 KiB
SCALAR(PV) at 0x558c8af5d8a0: 305.0 KiB
HASH(5672) at 0x558c89b3c450: 261.0 KiB
SCALAR(PV) at 0x558c98cc4d48: 232.0 KiB
others: 24.5 MiB
pmat> count
  Kind       Count (blessed)       Bytes  (blessed)
  ARRAY      12116         7     1.3 MiB  520 bytes
  CODE       11395               1.4 MiB
  GLOB       14796        16     2.1 MiB    2.4 KiB
  HASH       30361      3182     9.6 MiB  849.6 KiB
  INVLIST      655             674.4 KiB
  IO            33        33     5.2 KiB    5.2 KiB
  PAD         8878               1.8 MiB
  REF        56319         1     1.3 MiB   24 bytes
  REGEXP      2510       253   529.5 KiB   53.4 KiB
  SCALAR    228143        12     9.7 MiB 1022 bytes
  STASH        627             801.6 KiB
  -------   ------ ---------   --------- ----------
  (total)   365833      3504    29.2 MiB  912.1 KiB
– Alex 2021-01-04 17:40 UTC
---
OK, let’s look at the ever-growing hash at 0x558c89b8f2a8:
pmat> values 0x558c89b8f2a8
...
  {ptr_0x558c8d3d5f40} REF() at 0x558c8d29c850 => HASH(1) at 0x558c8d2ab990
  {ptr_0x558c8d3eabc0} REF() at 0x558c8d2ec190 => HASH(1) at 0x558c8d29fc48
  {ptr_0x558c8d3eb470} REF() at 0x558c8d3bdc18 => HASH(1) at 0x558c8d3bdc30
  {ptr_0x558c8d3f6cd0} REF() at 0x558c8d2f3e00 => HASH(1) at 0x558c8d32a720
  ... (15946 more)
pmat [more]> values 0x558c8d32a720
  {"ssleay_verify_callback!!func"} REF() at 0x558c8d3a4758 => CODE(PP,closure) at 0x558c8d32aac8
pmat> values 0x558c8d3bdc30
  {"ssleay_verify_callback!!func"} REF() at 0x558c8d3bdca8 => CODE(PP,closure) at 0x558c8d3bd8e8
pmat> values 0x558c8d29fc48
  {"ssleay_verify_callback!!func"} REF() at 0x558c8d2ac260 => CODE(PP,closure) at 0x558c8d32aac8
pmat> values 0x558c8d2ab990
  {"ssleay_verify_callback!!func"} REF() at 0x558c8d29d078 => CODE(PP,closure) at 0x558c8d32aac8
Ugh, SSL again! Verify callback – that also rings a bell. Phoebe code:
IO::Socket::SSL::set_defaults(
  SSL_verify_mode => SSL_VERIFY_PEER,
  SSL_verify_callback => \&verify_fingerprint);
Hm. So now I’m thinking: a hash with 16,000 SSL sockets?
– Alex 2021-01-04 20:05 UTC
---
I’ve made a small change: as the stream closes, I undefine the $data reference. I figured that something close to this loop (“for every socket?”) must be keeping references around.
Mojo::IOLoop->server({
  address => $address,
  port => $port,
  tls => 1,
  tls_cert => $server->{cert_file},
  tls_key => $server->{key_file},
} => sub {
  my ($loop, $stream) = @_;
  my $data = { buffer => '', handler => \&handle_request };
  $stream->on(read => sub {
    my ($stream, $bytes) = @_;
    $log->debug("Received " . length($bytes) . " bytes");
    $data->{buffer} .= $bytes;
    $data->{handler}->($stream, $data) });
  $stream->on(close => sub {
    my $stream = shift;
    undef($data);
  }) });
Twelve hours later and Munin reports a current RSS of 206 MiB. That’s much better than anticipated! I’ll keep it running for another day or so, see how it develops.
– Alex 2021-01-06 07:55 UTC
---
A few more hours and memory is down to 174 MiB, so memory actually went *down*. Amazing! 😄
pmat> largest
HASH(29739) at 0x55cf4fc7c548=strtab: 2.0 MiB
HASH(5672) at 0x55cf4ff43d60: 261.0 KiB
SCALAR(PV) at 0x55cf4fff2238: 215.7 KiB
HASH(4782) at 0x55cf4ff96aa8: 176.1 KiB
ARRAY(1,!REAL) at 0x55cf4fca01d8: 92.5 KiB
others: 17.2 MiB
pmat> count
  Kind       Count (blessed)       Bytes  (blessed)
  ARRAY      10198         7     1.2 MiB  520 bytes
  CODE        9603               1.2 MiB
  GLOB       14374        16     2.1 MiB    2.4 KiB
  HASH       11712      3181     5.0 MiB  884.5 KiB
  INVLIST      583             652.3 KiB
  IO            33        33     5.2 KiB    5.2 KiB
  PAD         7079               1.3 MiB
  REF        24869         1   582.9 KiB   24 bytes
  REGEXP      2303       253   485.8 KiB   53.4 KiB
  SCALAR    173493        12     6.8 MiB 1022 bytes
  STASH        610             776.3 KiB
  -------   ------ ---------   --------- ----------
  (total)   254857      3503    19.9 MiB  946.9 KiB
– Alex 2021-01-06 12:14 UTC
---
Too bad memory is climbing again. That wasn’t the answer, or not the entire answer, sadly! Back at 325 MiB.
When I look at this, at all the complications, and the memory used, and compare it to the implementations based on Net::Server (“old style”) with its 23 MiB (used for the Gopher and Finger server on ports 70 and 79, as well as the SSL-enabled Gopher on port 7334) – I don’t have happy thoughts.
pmat> count --blessed
  Kind                                           Count (blessed)       Bytes  (blessed)
  ARRAY                                          11506         7     1.3 MiB  520 bytes
    Math::BigInt::Calc                               0         3     0 bytes  216 bytes
    Net::DNS::RR::OPT                                0         2     0 bytes  160 bytes
    Class::Struct::Tie_ISA                           0         1     0 bytes   64 bytes
    Net::DNS::RR                                     0         1     0 bytes   80 bytes
  CODE                                           10786               1.3 MiB
  GLOB                                           14792        16     2.1 MiB    2.4 KiB
    IO::Socket::IP                                   0        15     0 bytes    2.2 KiB
    IO::Socket::SSL                                  0         1     0 bytes  152 bytes
  HASH                                           26086      3182     8.2 MiB  884.6 KiB
    Specio::DeclaredAt                               0      1537     0 bytes  324.2 KiB
    Specio::Constraint::Simple                       0      1428     0 bytes  483.8 KiB
    DateTime::Format::Builder::Parser::Regex         0        67     0 bytes   15.9 KiB
    Specio::Constraint::Parameterizable              0        60     0 bytes   22.1 KiB
    Mojo::IOLoop::Server                             0        15     0 bytes    3.7 KiB
    Specio::Constraint::ObjectIsa                    0        15     0 bytes    5.5 KiB
    Specio::Constraint::Enum                         0        10     0 bytes    3.8 KiB
    Specio::Constraint::Union                        0         7     0 bytes    2.2 KiB
    Specio::Constraint::AnyCan                       0         6     0 bytes    2.2 KiB
    Specio::Constraint::ObjectCan                    0         5     0 bytes    1.8 KiB
    (others)                                         0        32     0 bytes   19.5 KiB
  INVLIST                                          651             662.6 KiB
  IO                                                33        33     5.2 KiB    5.2 KiB
    IO::File                                         0        33     0 bytes    5.2 KiB
  PAD                                             8269               1.6 MiB
  REF                                            47156         1     1.1 MiB   24 bytes
    IO::Socket::SSL::SSL_HANDLE                      0         1     0 bytes   24 bytes
  REGEXP                                          2491       253   525.4 KiB   53.4 KiB
    Regexp                                           0       253     0 bytes   53.4 KiB
  SCALAR                                        211663        12     8.2 MiB 1022 bytes
    Encode::XS                                       0         5     0 bytes  360 bytes
    JSON::PP::Boolean                                0         4     0 bytes  288 bytes
    Cpanel::JSON::XS                                 0         2     0 bytes  292 bytes
    Math::BigInt                                     0         1     0 bytes   82 bytes
  STASH                                            627             803.2 KiB
  --------------------------------------------  ------ ---------   --------- ----------
  (total)                                       334060      3504    25.8 MiB  947.1 KiB
I don’t understand what this means… It looks like everything is OK?
Comparing the two heap dumps I have:
$ pmat-counts phoebe-2021-01-06T*
phoebe-2021-01-06T13:14.pmat
  Kind       Count (blessed)       Bytes  (blessed)
  ARRAY      10198         7     1.2 MiB  520 bytes
  CODE        9603               1.2 MiB
  GLOB       14374        16     2.1 MiB    2.4 KiB
  HASH       11712      3181     5.0 MiB  884.5 KiB
  INVLIST      583             652.3 KiB
  IO            33        33     5.2 KiB    5.2 KiB
  PAD         7079               1.3 MiB
  REF        24869         1   582.9 KiB   24 bytes
  REGEXP      2303       253   485.8 KiB   53.4 KiB
  SCALAR    173493        12     6.8 MiB 1022 bytes
  STASH        610             776.3 KiB
  -------   ------ ---------   --------- ----------
  (total)   254857      3503    19.9 MiB  946.9 KiB
phoebe-2021-01-06T22:35.pmat
  Kind       Count           (blessed)    Bytes                  (blessed)
  ARRAY      11506(+1308)            7    1.3 MiB(+119.4 KiB)    520 bytes
  CODE       10786(+1183)                 1.3 MiB(+147.9 KiB)
  GLOB       14792(+418)            16    2.1 MiB(+62.0 KiB)       2.4 KiB
  HASH       26086(+14374)     3182(+1)   8.2 MiB(+3.2 MiB)      884.6 KiB(+168 bytes)
  INVLIST      651(+68)                 662.6 KiB(+10.3 KiB)
  IO            33                  33    5.2 KiB                  5.2 KiB
  PAD         8269(+1190)                 1.6 MiB(+330.7 KiB)
  REF        47156(+22287)           1    1.1 MiB(+522.4 KiB)     24 bytes
  REGEXP      2491(+188)           253  525.4 KiB(+39.7 KiB)      53.4 KiB
  SCALAR    211663(+38170)          12    8.2 MiB(+1.5 MiB)     1022 bytes
  STASH        627(+17)                 803.2 KiB(+26.8 KiB)
  -------   --------------  -----------  ---------------------  ----------------------
  (total)   334060(+79203)     3504(+1)   25.8 MiB(+5.9 MiB)    947.1 KiB(+168 bytes)
The difference seems minimal. And yet, ps -e rss reports 325 MiB instead of 174 MiB used.
– Alex 2021-01-06 21:30 UTC
---
Well… What do we have? I have three heap dump files for the current process:
23M 2021-01-06 13:14 phoebe-2021-01-06T13:14.pmat
29M 2021-01-06 22:35 phoebe-2021-01-06T22:35.pmat
42M 2021-01-07 21:03 phoebe-2021-01-07T21:03.pmat
RSS reported by ps is 174 MiB, 325 MiB, and 822 MiB. It’s not looking good. What amazes me, however, is that the memory usage as reported by Munin using “/proc/meminfo” doesn’t seem to reflect that. Or maybe it does?
Anyway, I’m experimenting with the pmat tools. The output of “pmat-leakreport” is extremely long. It’s pages and pages of this:
$ pmat-leakreport phoebe-2021-01-0[67]*
…
LEAK[2] HASH(1) at 0x55cf5e4d3e00
LEAK[2] REF() at 0x55cf5e472f08
LEAK[2] SCALAR(PV) at 0x55cf5e2c7318
LEAK[2] SCALAR(UV) at 0x55cf5e4b8b78
LEAK[2] HASH(1) at 0x55cf5ab96ce0
LEAK[2] HASH(1) at 0x55cf5b69f8b8
LEAK[2] HASH(1) at 0x55cf604236c8
LEAK[2] UNDEF() at 0x55cf62d869c0
LEAK[2] GLOB($*) at 0x55cf5e039088
LEAK[2] HASH(1) at 0x55cf60a59478
LEAK[2] REF() at 0x55cf62887cf0
LEAK[2] SCALAR(UV) at 0x55cf5e49a430
LEAK[2] REF() at 0x55cf63108b28
LEAK[2] UNDEF() at 0x55cf5d57cfb8
LEAK[2] SCALAR(UV) at 0x55cf5e401578
LEAK[2] UNDEF() at 0x55cf6086c7a0
LEAK[2] REF() at 0x55cf5ecf80c0
LEAK[2] REF() at 0x55cf5e369e00
LEAK[2] UNDEF() at 0x55cf5fa091e8
LEAK[2] HASH(1) at 0x55cf5bc07e48
LEAK[2] UNDEF() at 0x55cf62a23f60
LEAK[2] UNDEF() at 0x55cf5c78a818
LEAK[2] SCALAR(UV) at 0x55cf5e48d5c0
LEAK[2] REF() at 0x55cf6137ebf8
I don’t know what to make of this. The documentation says:
A leaking SV is one that appears and then never gets touched again. To detect them, we need to look for SVs that appear between two files, and then don’t disappear again. Any that do disappear were simply temporaries and can be ignored.
So, all of these are new appearances, and they don’t disappear again. 😟
Let’s look at them, starting at the end:
pmat> identify 0x55cf6137ebf8
REF() at 0x55cf6137ebf8 is:
└─value {"ssleay_verify_callback!!func"} of HASH(1) at 0x55cf6137ec58, which is:
  └─(via RV) value {ptr_0x55cf614d66c0} of HASH(35071) at 0x55cf4ff96aa8, which is:
    └─not found
pmat> identify 0x55cf5e48d5c0
SCALAR(UV) at 0x55cf5e48d5c0 is:
└─value {"\\0"} of HASH(1) at 0x55cf5e48d5a8, which is:
  └─(via RV) value {rakkestad} of HASH(725) at 0x55cf5e44f090, which is:
    └─(via RV) value {no} of HASH(1526) at 0x55cf4ffdf998, which is:
      └─(via RV) value {tree} of HASH(2)=IO::Socket::SSL::PublicSuffix at 0x55cf5e546aa0, which is:
        └─(via RV) value {"1"} of HASH(1) at 0x55cf4ffd88a8, which is:
          └─the lexical %default at depth 1 of CODE(PP) at 0x55cf4ffd8440, which is:
            └─the symbol '&IO::Socket::SSL::PublicSuffix::default'
pmat> identify 0x55cf5c78a818
UNDEF() at 0x55cf5c78a818 is:
└─pad temporary 29 at depth 1 of CODE(PP,closure) at 0x55cf5c719058, which is:
  ├─(via RV) value {"ssleay_verify_callback!!func"} of HASH(1) at 0x55cf5c681250, which is:
  │ └─(via RV) value {ptr_0x55cf5c8999e0} of HASH(35071) at 0x55cf4ff96aa8, which is (*A):
  │   └─not found
  ├─(via RV) value {"ssleay_verify_callback!!func"} of HASH(1) at 0x55cf5c718b00, which is:
  │ └─(via RV) value {ptr_0x55cf5c891a30} of HASH(35071) at 0x55cf4ff96aa8, which is:
  │   └─not found
  ├─(via RV) value {"ssleay_verify_callback!!func"} of HASH(1) at 0x55cf5c86dea8, which is:
  │ └─(via RV) value {ptr_0x55cf5c88cff0} of HASH(35071) at 0x55cf4ff96aa8, which is:
  │   └─not found
  ├─(via RV) value {"ssleay_verify_callback!!func"} of HASH(1) at 0x55cf5c86def0, which is:
  │ └─(via RV) value {ptr_0x55cf5c89a180} of HASH(35071) at 0x55cf4ff96aa8, which is:
  │   └─not found
  ├─(via RV) value {"ssleay_verify_callback!!func"} of HASH(1) at 0x55cf5c7fc768, which is:
  │ └─(via RV) value {ptr_0x55cf5c871bd0} of HASH(35071) at 0x55cf4ff96aa8, which is:
  │   └─not found
  ├─(via RV) value {"ssleay_verify_callback!!func"} of HASH(1) at 0x55cf5c78a7d0, which is:
  │ └─(via RV) value {ptr_0x55cf5c8991c0} of HASH(35071) at 0x55cf4ff96aa8, which is:
  │   └─not found
  └─(via RV) value {"ssleay_verify_callback!!func"} of HASH(1) at 0x55cf5c7fc7f8, which is:
    └─(via RV) value {ptr_0x55cf5c8995d0} of HASH(35071) at 0x55cf4ff96aa8, which is:
      └─not found
pmat> identify 0x55cf62a23f60
UNDEF() at 0x55cf62a23f60 is:
└─the lexical $ctx_store at depth 1 of CODE(PP,closure) at 0x55cf62a24770, which is:
  ├─(via RV) value {"ssleay_verify_callback!!func"} of HASH(1) at 0x55cf62abd3d0, which is:
  │ └─(via RV) value {ptr_0x55cf62ba4960} of HASH(35071) at 0x55cf4ff96aa8, which is (*A):
  │   └─not found
  ├─(via RV) value {"ssleay_verify_callback!!func"} of HASH(1) at 0x55cf62991a00, which is:
  │ └─(via RV) value {ptr_0x55cf62b68370} of HASH(35071) at 0x55cf4ff96aa8, which is:
  │   └─not found
  ├─(via RV) value {"ssleay_verify_callback!!func"} of HASH(1) at 0x55cf62667430, which is:
  │ └─(via RV) value {ptr_0x55cf62b628d0} of HASH(35071) at 0x55cf4ff96aa8, which is:
  │   └─not found
  ├─(via RV) value {"ssleay_verify_callback!!func"} of HASH(1) at 0x55cf62667f70, which is:
  │ └─(via RV) value {ptr_0x55cf62ba5100} of HASH(35071) at 0x55cf4ff96aa8, which is:
  │   └─not found
  ├─(via RV) value {"ssleay_verify_callback!!func"} of HASH(1) at 0x55cf62a24b48, which is:
  │ └─(via RV) value {ptr_0x55cf62b72e10} of HASH(35071) at 0x55cf4ff96aa8, which is:
  │   └─not found
  ├─(via RV) value {"ssleay_verify_callback!!func"} of HASH(1) at 0x55cf62a24398, which is:
  │ └─(via RV) value {ptr_0x55cf62ba4550} of HASH(35071) at 0x55cf4ff96aa8, which is:
  │   └─not found
  └─(via RV) value {"ssleay_verify_callback!!func"} of HASH(1) at 0x55cf62abd160, which is:
    └─(via RV) value {ptr_0x55cf62ba4140} of HASH(35071) at 0x55cf4ff96aa8, which is:
      └─not found
pmat> values 0x55cf5bc07e48
  {"ssleay_verify_callback!!func"} REF() at 0x55cf5bb9e678 => CODE(PP,closure) at 0x55cf5bd3d8d8
OK, I’d say it’s pretty clear by now: the “ssleay_verify_callback” stuff is bonkers!
Verify callback – remember the Phoebe code? Also note how “%default” is mentioned above.
IO::Socket::SSL::set_defaults(
  SSL_verify_mode => SSL_VERIFY_PEER,
  SSL_verify_callback => \&verify_fingerprint);
But I just read the IO::Socket::SSL man page. It says: “If you have a server and it looks like you have a memory leak you might check the size of your session cache. Default for Net::SSLeay seems to be 20480, see the example for SSL_create_ctx_callback for how to limit it.”
OK. I’m going to try that instead. The previous code had no effect. I’m limiting the session cache to 128, like in the example provided. Restarted the server. Memory is down below 100 MiB again:
# pgrep -f -l "phoebe"
29000 phoebe
# ps -q 29000 -eo rss,size
  RSS  SIZE
73392 62360
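For the record, limiting the session cache follows the SSL_create_ctx_callback example in the IO::Socket::SSL documentation; roughly like this (a sketch, not necessarily the exact code now in Phoebe):

use IO::Socket::SSL;
use Net::SSLeay;

IO::Socket::SSL::set_defaults(
  SSL_create_ctx_callback => sub {
    my $ctx = shift;
    # keep at most 128 sessions in the server-side session cache
    Net::SSLeay::CTX_sess_set_cache_size($ctx, 128);
  });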
– Alex
---
I’m surprised. This seems to have helped!
Ignoring the spike of Emacs Wiki and just looking at Phoebe: we seem to be stable at 250 MiB.
Cool!
Hm… Then again, at the same time, I changed the “phoebe.service” file for systemd:
MemoryMax=250M
MemoryHigh=200M
That is… uncannily close! So here’s what I’m doing now: I decreased the SSL session cache from 128 to 64, and I didn’t change the “phoebe.service” file. Let’s see if memory use is halved (i.e. the cache is relevant) or not (i.e. the memory limits are relevant).
– Alex 2021-01-09 15:00 UTC
---
OK, reducing the cache size didn’t help. We’re back at 250 MiB. This *is* a memory leak of sorts, but the resource limit I have in the service file is working.
I think I’m going to reduce those limits:
MemoryMax=100M
MemoryHigh=90M
Let’s see whether it works well enough!
– Alex 2021-01-10 09:54 UTC
---
It works well enough. I added this to the documentation and stopped trying to understand where Mojo::IOLoop::TLS, IO::Socket::SSL, Net::SSLeay, or OpenSSL are failing.
– Alex 2021-01-11 11:26 UTC