💾 Archived View for thrig.me › blog › 2023 › 06 › 26 › tcp-escoteria.gmi captured on 2023-09-28 at 16:14:08. Gemini links have been rewritten to link to archived content

TCP Escoteria

We now resume a dialog already in progress.

Josephus.— See here, there is a "connected" method to check whether a TCP socket is connected.

Aloysius.— The peer could close the connection the very next second.

Josephus.— But I want to know whether the socket is connected right now.

Aloysius.— This is also not possible; the peer may have already closed the socket, and one cannot tell the difference between the ESTABLISHED and CLOSE-WAIT states.

Josephus.— Other students speak of SO_LINGER, can that be used?

Aloysius.— Certainly; if the peer has set SO_LINGER with a zero timeout, then when the peer closes the socket the local state will transition immediately to CLOSED. However, SO_LINGER can lead to data corruption; a server should instead use the SO_REUSEADDR option. There may be cases where SO_LINGER is suitable, though one must use it with care and consideration.

Josephus.— Teach me that I may learn.

The Quick Answer

Use an asynchronous I/O library, and then you mostly don't need to care about the connection state. Said libraries also generally know how to read X bytes from the network, something that can be pretty tricky to get right. Your single read(2) call will get you all the bytes you want, right?
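
Why a single read is not enough can be sketched as follows. This is a minimal example, not what any particular library does; the helper name read_exactly is made up for illustration. A read on a stream may return fewer bytes than asked for, so the caller has to loop until it has them all or the peer goes away.

```perl
use strict;
use warnings;

# Read exactly $want bytes from $fh, looping over short reads.
# Returns the data, or undef on error or if EOF arrives too early.
# (read_exactly is a hypothetical helper name.)
sub read_exactly {
    my ( $fh, $want ) = @_;
    my $buf = '';
    while ( length($buf) < $want ) {
        # append to $buf at its current end
        my $got = sysread $fh, $buf, $want - length($buf), length($buf);
        if ( !defined $got ) {
            next if $!{EINTR};    # interrupted by a signal, try again
            return undef;         # some other error, see $!
        }
        return undef if $got == 0;    # EOF before all the bytes showed up
    }
    return $buf;
}
```

The caller still has to decide what a short message means, and a blocking loop like this can wait forever on a peer that has gone quiet.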

In Somewhat More Detail

The problem is that the peer could close the TCP connection at any moment, or may already have closed it, so do you really need to know whether the socket is up? A more direct approach is to read or write as necessary and then handle any errors, assuming that they are detectable. For instance a TCP keepalive cannot tell the difference between mostly dead and all dead:

https://www.youtube.com/watch?v=xbE8E1ez97M
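
The "write, then handle the error" approach might look something like the following sketch. The helper name write_or_gone is made up for illustration, and a Unix domain socketpair stands in for a TCP connection so that the example needs no network. On a real TCP socket the first write after the peer closes will often succeed, as the bytes only go into the local send buffer, and the error surfaces on some later write.

```perl
use strict;
use warnings;
use Socket qw(AF_UNIX SOCK_STREAM PF_UNSPEC);

# Otherwise a write to a closed peer may kill the process with SIGPIPE.
$SIG{PIPE} = 'IGNORE';

# Write all of $data. Returns 1 on success, 0 if the peer has gone
# away, and dies on any other error. (Hypothetical helper name.)
sub write_or_gone {
    my ( $sock, $data ) = @_;
    my $off = 0;
    while ( $off < length $data ) {
        my $n = syswrite $sock, $data, length($data) - $off, $off;
        if ( !defined $n ) {
            next if $!{EINTR};
            return 0 if $!{EPIPE} or $!{ECONNRESET};
            die "write failed: $!\n";
        }
        $off += $n;    # partial writes are possible
    }
    return 1;
}

socketpair( my $here, my $there, AF_UNIX, SOCK_STREAM, PF_UNSPEC )
  or die "socketpair failed: $!\n";
print "first write ok? ", write_or_gone( $here, "PING" ), "\n";
close $there;
print "second write ok? ", write_or_gone( $here, "PING" ), "\n";
```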

Meanwhile, the Perl IO::Socket module does have a "connected" method,

   connected
           my $peer_addr = $sock->connected();
           if ($peer_addr) {
               say "We're connected to $peer_addr";
           }

       If the socket is in a connected state, the peer address is
       returned. If the socket is not in a connected state, "undef"
       is returned.

which actually is a wrapper for getpeername(2)

    $ perldoc -m IO::Socket | perl -00 -nle 'print if /sub connected/'
    sub connected {
        @_ == 1 or croak 'usage: $sock->connected()';
        my($sock) = @_;
        getpeername($sock);
    }

and assumes that any error means the socket is not connected. A failure does mean that the socket is not in good shape, though the error may not be because the connection went away.
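
If the reason for the failure matters, the error from getpeername can be inspected instead of thrown away. The following is a sketch only; peer_status and its return strings are invented for this example.

```perl
use strict;
use warnings;
use Socket qw(AF_INET SOCK_STREAM);

# Returns 'connected', 'not connected', or "error: ..." for some
# other failure. (peer_status is a hypothetical helper name.)
sub peer_status {
    my ($sock) = @_;
    return 'connected'     if defined getpeername($sock);
    return 'not connected' if $!{ENOTCONN};
    return "error: $!";
}

# a TCP socket that has never been connected to anything
socket( my $sock, AF_INET, SOCK_STREAM, 0 ) or die "socket failed: $!\n";
print peer_status($sock), "\n";
```

Something that is not a socket at all should end up in the "error" branch rather than being reported as merely disconnected.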

On success, is the socket actually connected? Maybe! Again, the other side may have already called close and now we're waiting for that state to surface on our side. Here SO_LINGER is an option, if you are okay with an RST being sent instead of the usual "aww man we gotta wait??" sequence, among other potential downsides.
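
A small experiment can show the "Maybe!" in action. This sketch assumes the loopback address 127.0.0.1 is usable, and the results may vary by operating system. The peer closes its end, the local read sees end-of-file, and yet getpeername (by way of the connected method) will typically still succeed, as the local socket is sitting in CLOSE-WAIT.

```perl
use strict;
use warnings;
use IO::Socket::INET;

my $server = IO::Socket::INET->new(
    Listen    => 1,
    LocalAddr => '127.0.0.1',    # assumption: loopback is available
    LocalPort => 0,              # get a random port
    Proto     => 'tcp',
) or die "server failed: $!\n";

my $client = IO::Socket::INET->new(
    PeerAddr => '127.0.0.1',
    PeerPort => $server->sockport,
    Proto    => 'tcp',
) or die "client failed: $!\n";

my $accepted = $server->accept or die "accept failed: $!\n";
close $accepted;    # the peer sends a FIN

# blocks until the FIN arrives, then returns 0 for end-of-file
my $got = sysread $client, my $buf, 1;
die "read failed: $!\n" unless defined $got;

my $still = $client->connected ? 1 : 0;
print "read returned $got, connected? $still\n";
```

The read returning zero is the more useful signal here, though only for the direction the peer has closed.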

A duplicate packet did once show up, and then AWS fell over. Just because your system hasn't fallen over from playing fast and loose with TCP states does not mean that it will not fail in the future. Customizations here should be well documented and tested as to why your special case is necessary, how any data being lost or arriving two or more times is handled, etc. These edge cases can be tricky to test, which means they probably will not be tested until the network tests it for you, in production.

With that disclaimer in mind, the following test code shows how SO_LINGER can be enabled with a zero timeout. This should influence whether the client sees the connection as connected or not. Or at least it does for me, YMMV.

  #!/usr/bin/env perl
  # SO_LINGER test - there are things to TWEAK
  use strict;
  use warnings;
  use feature 'say';
  use IO::Socket
    qw(AF_INET SOCK_STREAM SHUT_RD SHUT_RDWR SHUT_WR SOL_SOCKET SO_LINGER);
  
  my $host = 'localhost';    # TWEAK
  
  my $server = IO::Socket->new(
      Domain    => AF_INET,
      Listen    => 1,
      LocalHost => $host,
      LocalPort => 0,             # get a random port
      Proto     => 'tcp',
      Reuse     => 1,
      Type      => SOCK_STREAM,
  ) or die "server failed $!\n";
  my $port = $server->sockport;
  
  # child sends a PING to the server, ignores response
  my $pid = fork // die "fork failed: $!\n";
  unless ($pid) {
      my $client = IO::Socket->new(
          Domain   => AF_INET,
          PeerHost => $host,
          PeerPort => $port,
          Type     => SOCK_STREAM,
          Proto    => 'tcp',
      ) or die "client failed $!\n";
      $client->send("PING");
      # KLUGE TWEAK better would be to communicate via a pipe
      sleep 3;
      my $state = $client->connected ? 1 : 0;
      say "client connected? $state";
      $client->close;
      exit;
  }
  
  # server responds to client
  my $client = $server->accept // die "server accept $!\n";
  
  my $opt = $client->getsockopt( SOL_SOCKET, SO_LINGER );
  printf "server linger %vx (%s)\n", $opt, $opt;
  # TWEAK uncomment this next line to enable SO_LINGER, no timeout
  #$client->setsockopt( SOL_SOCKET, SO_LINGER, pack 'I*', 1, 0 );
  $opt = $client->getsockopt( SOL_SOCKET, SO_LINGER );
  printf "server linger %vx (%s)\n", $opt, $opt;
  
  my $state = $client->connected ? 1 : 0;
  $client->read( my $data, 4 );
  say "server <$data> connected? $state";
  $client->send("PONG");
  $client->shutdown(SHUT_WR); # try out other shutdowns!
  $state = $client->connected ? 1 : 0;
  say "server shut_wr connected? $state";
  close $client;
  sleep 5;
  
  close $server;

connected.pl

The weird pack is because SO_LINGER takes a two-integer struct, as seen in /usr/include/sys/socket.h on OpenBSD:

    struct  linger {
        int l_onoff;        /* option on/off */
        int l_linger;       /* linger time */
    };
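
The raw option value can be built and decoded along these lines. This is a sketch that assumes the struct is exactly two native ints with no padding, as in the definition above; the 'ii' template is picked here to match the signed ints, and for small non-negative values it packs the same bytes as 'I*'. The helper names are made up.

```perl
use strict;
use warnings;

# Build and decode a struct linger, assuming two native signed ints.
# (pack_linger and unpack_linger are hypothetical helper names.)
sub pack_linger {
    my ( $onoff, $seconds ) = @_;
    return pack 'ii', $onoff, $seconds;
}

sub unpack_linger {
    my ($raw) = @_;
    return unpack 'ii', $raw;
}

my $raw = pack_linger( 1, 0 );    # linger on, zero timeout
printf "%d bytes: %s\n", length($raw), unpack( 'H*', $raw );
my ( $onoff, $seconds ) = unpack_linger($raw);
print "l_onoff=$onoff l_linger=$seconds\n";
```

The same unpack can be applied to whatever getsockopt returns for SO_LINGER, which is easier to read than the raw bytes.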

SO_LINGER with a positive timeout has various concerns with non-blocking I/O. But that is a different can of worms. Maybe use an asynchronous I/O library?

Analysis

A packet capture tool should ideally be used to log the connection. This will help confirm that the code is doing what you think it should be doing. For example tcpdump(8) might show the following for a typical connection:

    0.000000 localhost.22142 > localhost.1742: S 922081975:922081975(0) win 16384 <mss 32728,nop,nop,sackOK,nop,wscale 6,nop,nop,timestamp 1098699120 0> (DF)
    0.000091 localhost.1742 > localhost.22142: S 3692366156:3692366156(0) ack 922081976 win 16384 <mss 32728,nop,nop,sackOK,nop,wscale 6,nop,nop,timestamp 2183310540 1098699120> (DF)
    0.000226 localhost.22142 > localhost.1742: . ack 1 win 256 <nop,nop,timestamp 1098699120 2183310540> (DF)
    0.003023 localhost.22142 > localhost.1742: P 1:5(4) ack 1 win 256 <nop,nop,timestamp 1098699120 2183310540> (DF)
    0.003137 localhost.1742 > localhost.22142: . ack 5 win 255 <nop,nop,timestamp 2183310540 1098699120> (DF)
    0.004340 localhost.1742 > localhost.22142: P 1:5(4) ack 5 win 256 <nop,nop,timestamp 2183310540 1098699120> (DF)
    0.004531 localhost.22142 > localhost.1742: . ack 5 win 255 <nop,nop,timestamp 1098699120 2183310540> (DF)
    0.004562 localhost.1742 > localhost.22142: F 5:5(0) ack 5 win 256 <nop,nop,timestamp 2183310540 1098699120> (DF)
    0.004662 localhost.22142 > localhost.1742: . ack 6 win 255 <nop,nop,timestamp 1098699120 2183310540> (DF)
    3.006468 localhost.22142 > localhost.1742: F 5:5(0) ack 6 win 256 <nop,nop,timestamp 1098702130 2183310540> (DF)
    3.006580 localhost.1742 > localhost.22142: . ack 6 win 256 <nop,nop,timestamp 2183313550 1098702130> (DF)

while the following instead shows an RST with SO_LINGER set:

    0.000000 localhost.2285 > localhost.17616: S 3169430727:3169430727(0) win 16384 <mss 32728,nop,nop,sackOK,nop,wscale 6,nop,nop,timestamp 3206043957 0> (DF)
    0.000057 localhost.17616 > localhost.2285: S 3442066523:3442066523(0) ack 3169430728 win 16384 <mss 32728,nop,nop,sackOK,nop,wscale 6,nop,nop,timestamp 2748919114 3206043957> (DF)
    0.000151 localhost.2285 > localhost.17616: . ack 1 win 256 <nop,nop,timestamp 3206043957 2748919114> (DF)
    0.000823 localhost.2285 > localhost.17616: P 1:5(4) ack 1 win 256 <nop,nop,timestamp 3206043957 2748919114> (DF)
    0.000870 localhost.17616 > localhost.2285: . ack 5 win 255 <nop,nop,timestamp 2748919114 3206043957> (DF)
    0.001916 localhost.17616 > localhost.2285: P 1:5(4) ack 5 win 256 <nop,nop,timestamp 2748919124 3206043957> (DF)
    0.001999 localhost.2285 > localhost.17616: . ack 5 win 255 <nop,nop,timestamp 3206043967 2748919124> (DF)
    0.002086 localhost.17616 > localhost.2285: F 5:5(0) ack 5 win 256 <nop,nop,timestamp 2748919124 3206043967> (DF)
    0.002140 localhost.2285 > localhost.17616: . ack 6 win 255 <nop,nop,timestamp 3206043967 2748919124> (DF)
    0.002304 localhost.17616 > localhost.2285: R 6:6(0) ack 5 win 0 (DF)

A notable difference is that the RST happened pretty quickly, while in the first case the client slept for three seconds before getting around to completing the usual FIN+ACK shutdown by calling close. What happened to the response from the server on the client? The above code ignores it.
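
Telling the two endings apart can be automated, roughly. The following sketch assumes tcpdump text in the same layout as the captures above (timestamp, source, ">", destination, then the flags); other tcpdump versions print the flags differently. The function name and the labels it returns are invented for this example.

```perl
use strict;
use warnings;

# Classify how a captured TCP conversation ended by counting FIN and
# RST flags. (tcp_ending is a hypothetical helper name.)
sub tcp_ending {
    my (@lines) = @_;
    my %seen = ( F => 0, R => 0 );
    for my $line (@lines) {
        # timestamp, source, ">", destination with a colon, flags
        next unless $line =~ /^\s*\S+\s+\S+\s+>\s+\S+:\s+(\S+)/;
        my $flags = $1;
        $seen{R}++ if $flags =~ /R/;
        $seen{F}++ if $flags =~ /F/;
    }
    return 'reset'       if $seen{R};
    return 'orderly'     if $seen{F} >= 2;    # a FIN from each side
    return 'half-closed' if $seen{F};
    return 'open';
}
```

Feeding it the saved lines of a capture, for example with tcp_ending(<STDIN>), should distinguish the first capture from the second.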

netstat(1) will show the TCP state a program is in, though this may be difficult to use on ephemeral programs. Other tools may provide better tracking of TCP state changes of a port or associated with some process.

tags #networking #perl