💾 Archived View for thrig.me › blog › 2023 › 01 › 01 › yet-more-testing.gmi captured on 2023-05-24 at 18:26:02. Gemini links have been rewritten to link to archived content

Yet More Testing

Test::UnixCmdWrap, mostly written for testing my unix scripts, is used below to test whether /bin/echo, when given "foo", emits something that matches "foo". (A better test might be qr/^foo$/, or a check that there is only a single line that contains only "foo"; a bad echo could emit "foofoo" or "foo\nfoo" and pass the following test, but that's not relevant here.)

    use Test::UnixCmdWrap;
    my $echo = Test::UnixCmdWrap->new( cmd => '/bin/echo' );
    $echo->run( args => 'foo', stdout => qr/foo/ );

Under the new Test2::Tools::Command the above becomes

    use Test2::Tools::Command;
    @Test2::Tools::Command::command = ( '/bin/echo' );
    command { args =>  [ 'foo' ], stdout => qr/foo/ };

using Test2::Suite in place of the Test::Cmd and Test::More modules.
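
The parenthetical point about stricter matching can be sketched without either test module. The outputs below are made up for illustration (a correct echo and two hypothetical broken ones), and the anchored pattern is one possible choice rather than what either module uses.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Output a correct echo would produce, plus two hypothetical broken ones.
my %output = (
    good    => "foo\n",
    doubled => "foofoo\n",
    twice   => "foo\nfoo\n",
);

my $loose  = qr/foo/;          # matches "foo" anywhere in the output
my $strict = qr/\Afoo\n\z/;    # exactly one line containing only "foo"

# Report which outputs each pattern accepts.
for my $name ( sort keys %output ) {
    printf "%-8s loose=%s strict=%s\n", $name,
      ( $output{$name} =~ $loose  ? 'pass' : 'fail' ),
      ( $output{$name} =~ $strict ? 'pass' : 'fail' );
}
```

All three outputs pass the loose pattern, while only the good one passes the strict pattern.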

CPU Waste?

A benchmark might indicate how much CPU is wasted, though Perl isn't exactly at the uses-little-CPU end of the spectrum. I guess I could write (or look for?) something, probably in Go, that performs similar tests using less CPU (at the cost of more programmer time).

    #!/usr/bin/env perl

    # obviously OO will have more overhead than an array assignment...
    use Test::UnixCmdWrap;
    my $echo = Test::UnixCmdWrap->new( cmd => '/bin/echo' );

    use Test2::Tools::Command;
    @Test2::Tools::Command::command = '/bin/echo';

    use Benchmark qw(cmpthese);
    cmpthese -10, {
      old => sub {
        $echo->run( args => 'foo', stdout => qr/foo/ );
      },
      new => sub {
        command { args => ['foo'], stdout => qr/foo/ };
      }
    };

And it turns out the modules are roughly the same speed. Most of the overhead is probably the fork/exec; maybe a longer benchmark run would winnow out that noise? My expectation was that Test2::Tools::Command would be consistently faster. A takeaway is that it is good to run some (hopefully sound) benchmarks now and then to ensure that your mental model matches what is actually going on on the system.

I once benchmarked an Ansible "ping" to a single host against a script that would copy some C code to the same host, compile that code, run that code, collect the output of that code, and display the ping reply. Ansible was six times slower. So I am generally dubious when folks trot out claims like "Python can wait just as fast as any other language". Ansible is bloated, but is all the slowness due to Ansible?

Less CPU Waste

Turns out I was using Test2 wrongly; Test2::Tools::Compare has a B<DEPRECATED> flag on it and it's easier to call ->pass or ->fail depending on how some pretty specific checks turn out. And, this is faster.

         Rate  old  new
    old 115/s   -- -45%
    new 208/s  81%   --

If something isn't as fast as expected (or, as in this case, is also horribly buggy) it's good to look into things... and how were the horrible bugs detected? By trying to use the new test module to test some new code: all the I/O tests passed when they should not have. Which leads us to

Who Tests The Tests?

This ideally requires that the test framework is able to test itself, which Test2::Suite does. Tests that fail, which would normally cause undesirable failures in the test system itself, can be intercepted, and more tests written to confirm that they failed as expected. This revealed a few more bugs in the code.

    # did we fail the implicit exit code, stdout, and stderr tests?
    # this also gives us code coverage of the failure branches
    my $events = intercept {
        command { args => ['say "out";warn "err\n";exit 23'] }
    };
    is $events->state->{failed}, 3;

So if programmers say they do not need tests (or eventually, formal verification of some sort?) I trust them not, given the rate at which I write bugs that tests catch and the rate of security vulnerabilities appearing in the CVE database.

Wart Correction

A particular wart of Test::Cmd is that the argument 0 must be protected via something like q{'0'} as otherwise the bare 0 is false and that argument is not appended:

    $cmd = $cmd." ".$args{'args'} if $args{'args'}; # false if 0
    $cmd =~ s/\$work/$self->{'workdir'}/g;
    $cmd = "|$cmd 1>$stdout_file 2>$stderr_file";
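
The truthiness problem can be reproduced in isolation. In this minimal sketch, build_truthy mimics the check quoted above, while build_defined is a hypothetical alternative of mine and not code from Test::Cmd.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Mimics the quoted check: the argument is appended only if it is true.
sub build_truthy {
    my ( $cmd, $args ) = @_;
    $cmd = $cmd . " " . $args if $args;    # 0 is false, so it is dropped
    return $cmd;
}

# Hypothetical alternative: append any defined, non-empty argument.
sub build_defined {
    my ( $cmd, $args ) = @_;
    $cmd = $cmd . " " . $args if defined $args and length $args;
    return $cmd;
}

print build_truthy( '/bin/echo', 0 ), "\n";     # prints "/bin/echo"
print build_defined( '/bin/echo', 0 ), "\n";    # prints "/bin/echo 0"
```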

Also note that Test::Cmd always runs commands through the shell; I consider this to be a bad design. Test2::Tools::Command instead takes a list and does not need to string together a command--with hopefully no unexpected security exploits or shell warts along for the ride. Basically I favor using execv(3) where possible over throwing strings at system(3), though some high level interfaces have made this difficult or impossible to achieve.
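
The difference between a list and a shell string can be shown with core Perl alone. This is a sketch of the general idea and not how either module is implemented; run_list, run_shell, and the argument are made up for illustration, and a POSIX /bin/sh and /bin/echo are assumed.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# An argument containing shell metacharacters (purely illustrative).
my $arg = 'foo; echo injected';

# Run a command given as a list and return its standard output. The
# list form of a piped open does a fork and exec with no shell involved.
sub run_list {
    my @command = @_;
    open my $fh, '-|', @command or die "could not run $command[0]: $!";
    my $out = do { local $/; <$fh> };
    close $fh;
    return $out;
}

# Hand a single string to the shell, roughly what system(3) does.
sub run_shell {
    my ($string) = @_;
    return run_list( '/bin/sh', '-c', $string );
}

print run_list( '/bin/echo', $arg );     # one line: "foo; echo injected"
print run_shell("/bin/echo $arg");       # two lines: "foo" then "injected"
```

With the list form the argument arrives at echo untouched; with the string form the shell splits it at the semicolon and runs a second command.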

See Also

http://man.openbsd.org/man3/execv.3

http://man.openbsd.org/man3/system.3

https://metacpan.org/pod/Test2::Tools::Command

https://metacpan.org/pod/Test::UnixCmdWrap

tags #testing #perl
